Bayesian inference

Bayesian inference asks how probable different model parameters are, given the observed data, by combining the probability each parameter value gives to the data with prior information.

Thomas Bayes (c. 1701-1761). Bayesian methods were invented in the 18th century, but their application in phylogenetics dates from 1996.

Bayes’ theorem

Bayes’ theorem links a conditional probability to its inverse:

  Prob(H|D) = Prob(H) Prob(D|H) / Σ_H Prob(H) Prob(D|H)

In the case of two alternative hypotheses, the theorem can be written as:

  Prob(H1|D) = Prob(H1) Prob(D|H1) / [Prob(H1) Prob(D|H1) + Prob(H2) Prob(D|H2)]

Bayes for smarties

Five smarties are drawn from a bag: four orange and one blue (the data D).
  H1 = D came from a mainly orange bag (3/4 orange, 1/4 blue)
  H2 = D came from a mainly blue bag (1/4 orange, 3/4 blue)

  Prob(D|H1) = (3/4 · 3/4 · 3/4 · 3/4 · 1/4) · 5 = 405/1024
  Prob(D|H2) = (1/4 · 1/4 · 1/4 · 1/4 · 3/4) · 5 = 15/1024
  Prob(H1) = Prob(H2) = 1/2

(the factor 5 is the number of possible positions of the single blue smartie among the five draws)

  Prob(H1|D) = Prob(H1) Prob(D|H1) / [Prob(H1) Prob(D|H1) + Prob(H2) Prob(D|H2)]
             = (1/2 · 405/1024) / (1/2 · 405/1024 + 1/2 · 15/1024) = 0.964

a-priori knowledge can affect one’s conclusions

A diagnostic test has the following characteristics:

             positive test result      negative test result
  ill        99%  (true positive)      1%    (false negative)
  healthy    0.1% (false positive)     99.9% (true negative)

Using these data only (implicitly treating “ill” and “healthy” as equally probable a priori), P(ill | positive test result) ≈ 0.99.

Now add the a-priori knowledge that 0.1% of the population (n = 100 000) is ill:

                     positive test result   negative test result
  ill (100)                  99                       1
  healthy (99 900)          100                  99 800

With this a-priori knowledge, only 99 of the 199 persons with a positive test result are ill, so P(ill | positive result) ≈ 50%.

The Monty Hall problem is another example. A car is hidden behind one of three doors and goats behind the other two; the player selects a door, after which the host opens one of the remaining doors, always revealing a goat, and offers the player the chance to switch.

  Behind door 1   Behind door 2   Behind door 3   Result if staying at door 1   Result if switching to the door offered
  Car             Goat            Goat            Car                           Goat
  Goat            Car             Goat            Goat                          Car
  Goat            Goat            Car             Goat                          Car

Let C = number of the door hiding the car, S = number of the door selected by the player, and H = number of the door opened by the host. The probability of finding the car behind door c, after the original selection and the host’s opening of one door, is

  P(C=c | H=h, S=s) = P(H=h | C=c, S=s) · P(C=c | S=s) / P(H=h | S=s)

where the normalizing constant sums over the three possible positions of the car:

  P(H=h | S=s) = Σ_{c=1..3} P(H=h | C=c, S=s) · P(C=c | S=s)

The host’s behaviour depends on the candidate’s selection and on where the car is. For example, if the player selects door 1 and the host opens door 3:

  P(C=2 | H=3, S=1) = (1 · 1/3) / (1/2 · 1/3 + 1 · 1/3 + 0 · 1/3) = 2/3

Bayes’ theorem is used to combine a prior probability with the likelihood to produce a posterior probability.
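These calculations are easy to check numerically. Below is a minimal Python sketch (not from the original slides; the function name `posterior` is just illustrative) that applies the two-hypothesis form of Bayes’ theorem to the smartie draw and to the diagnostic test:

```python
from math import comb

def posterior(priors, likelihoods):
    """Bayes' theorem for a discrete set of hypotheses:
    posterior_i = prior_i * likelihood_i / sum_j(prior_j * likelihood_j)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    norm = sum(joint)                                # the normalizing constant
    return [j / norm for j in joint]

# Smarties: 5 draws, 4 orange and 1 blue.
# H1 = mainly orange bag (3/4 orange), H2 = mainly blue bag (1/4 orange).
lik_h1 = comb(5, 1) * (3/4)**4 * (1/4)               # 405/1024
lik_h2 = comb(5, 1) * (1/4)**4 * (3/4)               # 15/1024
print(posterior([1/2, 1/2], [lik_h1, lik_h2]))       # [0.964..., 0.035...]

# Diagnostic test: P(positive|ill) = 0.99, P(positive|healthy) = 0.001,
# with the a-priori knowledge that 0.1% of the population is ill.
print(posterior([0.001, 0.999], [0.99, 0.001]))      # P(ill|positive) ~ 0.50
```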
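The Monty Hall posterior can be checked the same way. This sketch (again illustrative, not from the slides) encodes the host’s behaviour described above and normalizes over the three possible car positions:

```python
def monty_hall_posterior(s=1, h=3):
    """P(C=c | H=h, S=s) via Bayes' theorem.
    The host never opens the selected door or the door hiding the car."""
    doors = (1, 2, 3)
    prior = {c: 1/3 for c in doors}                  # P(C=c | S=s)

    def p_host(h, c, s):                             # P(H=h | C=c, S=s)
        if h == s or h == c:
            return 0.0
        return 0.5 if c == s else 1.0                # host chooses at random if two goat doors are free

    joint = {c: p_host(h, c, s) * prior[c] for c in doors}
    norm = sum(joint.values())                       # P(H=h | S=s)
    return {c: joint[c] / norm for c in doors}

print(monty_hall_posterior())   # {1: 0.333..., 2: 0.666..., 3: 0.0} -> switching doubles the chance
```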
Bayes’ theorem

In

  Prob(H|D) = Prob(H) Prob(D|H) / Σ_H Prob(H) Prob(D|H)

Prob(H) is the prior probability, Prob(D|H) is the likelihood, Prob(H|D) is the posterior probability, and the denominator Σ_H Prob(H) Prob(D|H) is the normalizing constant.

Bayesian inference of trees

In Bayesian inference of trees, the players are the tree topology and branch lengths, the evolutionary model, and the (sequence) data. The posterior probability of a tree is calculated from its prior probability and the likelihood:

  Prob(Tree | Data) = Prob(Tree) · Prob(Data | Tree) / Prob(Data)

where evaluating the likelihood Prob(Data | Tree) requires summation over all possible branch lengths and model parameter values.

The prior probability of a tree is often not known, and therefore all trees are considered equally probable. For five taxa (A-E) there are 15 possible unrooted trees, so each tree i receives the prior probability Prob(Tree i) = 1/15; combining this prior with the likelihood Prob(Data | Tree i) gives the posterior probability Prob(Tree i | Data).

But prior knowledge of taxonomy could suggest other prior probabilities. With the clade (C,D,E) constrained, for example, only the 3 trees compatible with that clade receive a prior probability (1/3 each); the other 12 trees receive a prior of 0.

BI requires summation over all possible trees, which is impossible to do analytically (and, for every tree, summation over all possible branch lengths and model parameter values).
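To make the role of the tree prior concrete, here is a small illustrative sketch: the tree labels and likelihood values are placeholders, not real data, and in practice Prob(Data | Tree) is obtained by integrating over branch lengths and model parameters. It combines a uniform or a constrained prior over the 15 candidate topologies with likelihoods and normalizes, exactly as in the formula above. Enumerating trees like this is only feasible for very small problems, which is why the approximation described next is needed.

```python
# Illustrative only: tree names and likelihood values are placeholders.
trees = [f"tree_{i}" for i in range(1, 16)]              # the 15 unrooted 5-taxon trees
likelihood = {t: 1e-90 for t in trees}                    # Prob(Data | Tree), hypothetical
likelihood["tree_2"] = 5e-90                              # suppose tree_2 explains the data best

def tree_posterior(prior):
    """Prob(Tree i | Data) = Prob(Tree i) * Prob(Data | Tree i) / Prob(Data)."""
    joint = {t: prior[t] * likelihood[t] for t in trees}
    norm = sum(joint.values())                            # Prob(Data)
    return {t: joint[t] / norm for t in trees}

uniform_prior = {t: 1 / 15 for t in trees}                # all trees equally probable a priori
# A taxonomic constraint such as a monophyletic (C,D,E) leaves only 3 allowed trees;
# which three depends on the labelling, so the names below are arbitrary.
allowed = {"tree_1", "tree_2", "tree_3"}
constrained_prior = {t: (1 / 3 if t in allowed else 0.0) for t in trees}

print(tree_posterior(uniform_prior)["tree_2"])            # raised above 1/15 by its likelihood
print(tree_posterior(constrained_prior)["tree_2"])        # disallowed trees get posterior 0
```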
But Markov chain Monte Carlo (MCMC) allows approximating the posterior probability. The chain wanders over the posterior probability density of the parameter space (with regions corresponding, say, to tree 1, tree 2 and tree 3):

1. Start at a random point.
2. Make a small random move.
3. Calculate the posterior density ratio r = new state / old state.
4. If r > 1, always accept the move; if r < 1, accept the move with probability r, so a move slightly downhill is perhaps accepted, while a move far downhill in posterior density is rarely accepted.
5. Go to step 2.

The proportion of time that the MCMC chain spends in a particular parameter region is an estimate of that region’s posterior probability: if the chain spends 20% of its time in the region of tree 1, 48% in the region of tree 2 and 32% in the region of tree 3, those fractions estimate the posterior probabilities of the three trees.

Metropolis-coupled Markov chain Monte Carlo speeds up the search. Besides the cold chain, which samples P(tree | data), one or more heated chains sample P(tree | data)^b with 0 < b < 1; raising the posterior to a power b flattens the surface, so the hot, hotter and hottest chains move more easily between peaks. A hot chain acts like a scout signalling a better spot (“Hey! Over here!”) to the cold scout stuck on a local optimum; the chains occasionally swap states.
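A minimal Metropolis sampler following the five steps above can be sketched as follows; a toy one-dimensional “posterior” stands in for tree space, and all names and values are illustrative:

```python
import math
import random

def metropolis(log_post, x0, step=0.5, n=100_000):
    """Minimal Metropolis sampler: propose a small random move and accept it
    with probability min(1, r), where r is the posterior density ratio new/old."""
    x, samples = x0, []
    for _ in range(n):
        y = x + random.gauss(0.0, step)                        # 2. small random move
        log_r = log_post(y) - log_post(x)                      # 3. log of the ratio r = new/old
        if log_r >= 0 or random.random() < math.exp(log_r):    # 4. accept uphill always, downhill sometimes
            x = y
        samples.append(x)                                      # 5. back to step 2
    return samples

# Toy 1-D posterior: two peaks of weight 0.7 and 0.3, standing in for two "tree islands".
def log_post(x):
    return math.log(0.7 * math.exp(-0.5 * (x - 0.0) ** 2) +
                    0.3 * math.exp(-0.5 * (x - 4.0) ** 2))

samples = metropolis(log_post, x0=2.0)
# The fraction of time spent near each peak estimates its posterior probability (~0.7 and ~0.3).
print(sum(1 for s in samples if s < 2.0) / len(samples))
```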
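And a sketch of the chain-coupling idea: several chains run in parallel, each sampling the posterior raised to a power beta (the cold chain has beta = 1, the heated chains 0 < beta < 1), and neighbouring chains occasionally propose to swap states so that a hot chain can hand a better region over to the cold chain. The swap rule and parameter values here are illustrative, not a description of any particular program.

```python
def mc3(log_post, x0, betas=(1.0, 0.5, 0.2), step=0.5, n=100_000, swap_every=10):
    """Metropolis-coupled MCMC sketch: chain k samples posterior**betas[k];
    only the cold chain (betas[0] == 1.0) is used for inference."""
    xs = [x0] * len(betas)
    cold_samples = []
    for i in range(n):
        for k, b in enumerate(betas):                          # one Metropolis update per chain
            y = xs[k] + random.gauss(0.0, step)
            log_r = b * (log_post(y) - log_post(xs[k]))        # heated chains see a flattened surface
            if log_r >= 0 or random.random() < math.exp(log_r):
                xs[k] = y
        if i % swap_every == 0:                                # propose swapping two neighbouring chains
            k = random.randrange(len(betas) - 1)
            log_r = (betas[k] - betas[k + 1]) * (log_post(xs[k + 1]) - log_post(xs[k]))
            if log_r >= 0 or random.random() < math.exp(log_r):
                xs[k], xs[k + 1] = xs[k + 1], xs[k]
        cold_samples.append(xs[0])
    return cold_samples

# Reuses the imports and the toy log_post from the Metropolis sketch above.
cold = mc3(log_post, x0=2.0)
print(sum(1 for s in cold if s < 2.0) / len(cold))             # again roughly 0.7
```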