Bayesian versus Frequentist statistical analyses: what's the difference?
Jeff Presneill, Intensive Care Unit, Royal Brisbane and Women's Hospital
18th Annual Meeting on Clinical Trials in Intensive Care, ANZICS Clinical Trials Group, Noosa, 8th March 2016

Acknowledgements
• Dobson AJ, Barnett AG. An Introduction to Generalized Linear Models. 3rd ed. CRC Press, 2008.
• Gelman A, et al. Bayesian Data Analysis. 3rd ed. CRC Press, 2014.
• Ghahramani S. Fundamentals of Probability. 2nd ed. Prentice-Hall, 2000.
• Tijms H. Understanding Probability. 3rd ed. Cambridge University Press, 2012.
• Berry DA. Bayesian clinical trials. Nature Reviews Drug Discovery 2006;5:27-36.
• https://commons.wikimedia.org
• Amanda Martin, Administration Manager and PA to the Director, ANZIC-RC

This presentation contains
• Formulae and references to:
  – Mathematics
  – Statistics
  – Religion
  – History
• Any resemblance of any image in this presentation to any world-class researcher in the CTG is purely a chance event (you decide the probability!)

Frequentist (= classical) interpretation of probability
• The current standard statistical approach, although Bayesian methods are rising thanks to advanced computer simulation options
• An experiment is viewed as one of an infinite sequence of possible repetitions of the same experiment, each producing statistically independent results
• Statistical hypothesis testing (a hypothesis is true or false)
• Confidence intervals
• Unknown parameters have fixed but unknown values

Bayes
• Reverend Thomas Bayes (c. 1702–61)
• Fellow of the Royal Society from 1742
• Only two known publications, only one of them mathematical
• Under Bayesian inference, probability is assigned to a hypothesis, whereas under frequentist inference a hypothesis is typically tested without being assigned a probability
• Theorem published posthumously in 1763
[Image: icon representing Bayesian statistics, by Mikhail Ryazanov, https://commons.wikimedia.org/w/index.php?curid=17116987]

Bayesian vs Frequentist
• Bayes' theorem = Bayes' rule = conditional probability
• Combines new evidence with prior belief
• Contrast with frequentist inference, which relies only on the evidence as a whole, with no reference to prior beliefs
• "Bayesian updating" – Bayes' rule can be applied iteratively:
  – Observe some evidence; the resulting posterior probability can then be treated as a prior probability, and a new posterior probability computed from new evidence
  – Evidence may be viewed all at once, or accumulated over time

Rise of Bayesian statistics
• From the 1980s, dramatic growth in research and applications of Bayesian methods
• Driven by Markov chain Monte Carlo (MCMC) methods, which removed many of the computational problems
• Most undergraduate teaching is still based on frequentist statistics
• Bayesian methods are now widely accepted and used

Do Bayesian and Frequentist methods agree?
• The posterior is a compromise between the data and the prior information
• Under a standard noninformative prior distribution, Bayesian estimates and standard errors coincide with classical frequentist regression results

Fundamental concept of Bayes' theorem
• Calculation of P(B | A) in terms of P(A | B):
  P(B_i | A) = P(A | B_i) P(B_i) / Σ_{j=1..n} P(A | B_j) P(B_j)
• A subtle use of conditional probabilities
• The theorem in this general form is due to Laplace
• Gives the posterior probability of B_i after the occurrence of A
• The posterior is proportional to prior × likelihood

Pierre-Simon Laplace (1749–1827)
• Introduced a general version of the theorem
• Used it to approach problems in celestial mechanics, medical statistics, reliability, and jurisprudence

Bayes' rule appears in different forms:

Bayes' rule – discrete parameter
  posterior = likelihood × prior / probability of the data
  P(θ_i | y) = P(y | θ_i) P(θ_i) / Σ_{j=1..n} P(y | θ_j) P(θ_j)
  where the sum in the denominator runs over all possible parameter values (the parameter space)

Bayes' rule – continuous parameter
  p(θ | data) = p(data | θ) p(θ) / p(data)
  where p(data) = ∫ p(data | θ) p(θ) dθ, integrating over all possible values of a continuous θ (the parameter space)
• Markov chain Monte Carlo (MCMC) methods = algorithms for sampling from a probability distribution
• The key step: they make it feasible to compute large models that require integration over hundreds or even thousands of unknown parameters

EXAMPLE: Bayesian calculations, discrete probability
Estimating a probability from binomial data
• See the prior probability, the likelihood, and the posterior probability
• The Bayesian "prior" must be specified = the expert's belief about all possible values of the parameter θ
• The prior is specified before seeing the data, and can take any form
• The data then update the prior information

Steering Committee = S. Finfer (Chair), …, R. Bellomo, …, J. Cooper, …, J. Myburgh…
"Prior" knowledge available = 50% believe John, Simon and James are CTG Saints

The CTG Saints Study*
• Very low budget: can only afford a total sample size of 10 subjects
• How to proceed to analysis – frequentist or Bayesian?
* Completely unregisterable at any legitimate trials database!

Frequentist approach – binomial random variable
• 10 randomly selected CTG attendees; six report a belief in Saint Rinaldo

Bayesian calculations, discrete probability: "The CTG Saint Study"
• Goal is inference about the proportion of ANZICS-CTG attendees who believe in selected "CTG Saints"
• Random sample of 10 CTG attendees
• Data (y) = the number of the 10 who believe in a particular "CTG Saint"
• θ = probability that an individual believes
• H0 = CTG authors are NOT saints (θ ≤ 0.5)
• H1 = CTG authors ARE saints (θ > 0.5)

• Model = binomial random variable; the probability of observing a particular set of data is the likelihood:
  p(y | θ) = (10 choose y) θ^y (1 − θ)^(10−y)
• Evaluated for θ = 0, 0.05, 0.1, …, 0.95, 1
• To find the posterior probability, use Bayes' rule:
  p(θ | data) = p(data | θ) p(θ) / p(data)

"Prior" knowledge available = 50% believe John, Simon and James are CTG Saints
• Prior distribution p(θ):
  – Prior probability of being a CTG Saint = 50%, i.e. P(y = 1) = P(y = 0) = 0.5
  – Prior mass uniformly distributed across the closed intervals θ = [0.0, 0.5] and [0.55, 1.0]
• Data to update the prior information: random sample from the CTG conference, n = 10

• Prior probability = 0.5
[Figure: Bayesian prior 0.5, observed 6/10 – prior, likelihood and posterior p(θ) plotted over θ = 0 to 1]

BUT, what if the "Prior" = only 10% believe John,
Simon and James are CTG Saints

Bayesian calculations, discrete probability: "The CTG Saint Study"
• Prior probability = 0.1
[Figure: Bayesian prior 0.1, observed 6/10 – prior, likelihood and posterior p(θ) plotted over θ = 0 to 1]

• Sceptical informative prior (prior probability = 0.1)
[Figure: beta(20.63, 196.4) prior, B(10, 6) data, beta(26.63, 200.4) posterior – prior, likelihood and posterior densities over θ]
Using R for Bayesian Statistics: http://a-little-book-of-r-for-bayesian-statistics.readthedocs.org/en/latest/src/bayesianstats.html

• Enthusiastic informative prior (prior probability = 0.9)
[Figure: beta(196.4, 20.63) prior, B(10, 6) data, beta(202.4, 24.63) posterior – prior, likelihood and posterior densities over θ]

• Conservative informative prior (prior probability = 0.5)
[Figure: beta(9.2, 9.2) prior, B(10, 6) data, beta(15.2, 13.2) posterior – prior, likelihood and posterior densities over θ]

• Uninformative uniform prior on [0, 1]
[Figure: beta(1, 1) prior, B(10, 6) data, beta(7, 5) posterior – prior, likelihood and posterior densities over θ]
• With an uninformative prior, the likelihood = the posterior, so the Bayesian result = the frequentist result

General features of Bayesian inference
• The posterior distribution is centred at a point of compromise between the prior information and the data, weighted in proportion to their precision
• The influence of the data rises with increasing sample size, so the prior has less influence
• The posterior variance is usually less than the prior variance

Bayesian credible intervals vs frequentist confidence intervals
• The central 95% probability interval of the posterior distribution = "credible interval" = "posterior interval" for the parameter
• A slightly different summary is also encountered, the "highest posterior density" interval
• Credible intervals sometimes have similar properties to confidence intervals

Fundamental concept of Bayes' theorem
• Calculation of P(B | A) in terms of P(A | B):
  P(B_i | A) = P(A | B_i) P(B_i) / Σ_{j=1..n} P(A | B_j) P(B_j)
  = the posterior probability of B_i after the occurrence of A
• State a hypothesis = a prior probability distribution
• Collect and summarize the relevant data
• The revised distribution of the unknown parameter (which is a random variable) is the posterior probability distribution.
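As a concrete illustration (in Python rather than the R scripts referenced above), both summaries can be read off the Beta(7, 5) posterior from the uniform-prior analysis: the central 95% credible interval and the direct posterior probability P(θ > 0.5 | data). The crude grid approximation below is only a sketch that avoids any external libraries; a real analysis would use exact beta quantiles.

```python
# Sketch: central 95% credible interval and P(theta > 0.5 | data) for the
# Beta(7, 5) posterior (uniform Beta(1, 1) prior, 6 believers out of 10),
# via a fine grid approximation to the posterior density.
a, b = 7, 5
grid = [i / 10_000 for i in range(1, 10_000)]
dens = [t ** (a - 1) * (1 - t) ** (b - 1) for t in grid]   # unnormalised density
total = sum(dens)

# Build the cumulative distribution on the grid.
cdf, cum = [], 0.0
for d in dens:
    cum += d / total
    cdf.append(cum)

# Central 95% credible interval = 2.5% and 97.5% posterior quantiles.
lower = next(t for t, c in zip(grid, cdf) if c >= 0.025)
upper = next(t for t, c in zip(grid, cdf) if c >= 0.975)

# Direct probability statement about the hypothesis H1: theta > 0.5.
p_gt_half = sum(d for t, d in zip(grid, dens) if t > 0.5) / total

print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
print(f"P(theta > 0.5 | data) = {p_gt_half:.3f}")
```

Note the contrast with a frequentist confidence interval: here the parameter is the random quantity, so "95% probability that θ lies in the interval" is a legitimate direct statement.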
• This allows direct statements about the probability that a hypothesis is true

Bayesian analyses can be sequential
• New information is incorporated into the analysis as soon as it becomes available = continuous updating
• Uses the posterior obtained after the previous trial as the new prior
• Legitimate intermediate conclusions can be drawn from partial results of an ongoing experiment
• The future course of the experiment can be modified in light of these conclusions
• Stop the trial not at a fixed size, but when adding more patients will not appreciably change the conclusions

Bayes' rule – continuous parameter
  p(θ | data) = p(data | θ) p(θ) / p(data)
  where p(data) = ∫ p(data | θ) p(θ) dθ, integrating over all possible values of a continuous θ (the parameter space)
• Markov chain Monte Carlo (MCMC) methods = algorithms for sampling from a probability distribution
• The key step: they make it feasible to compute large models that require integration over hundreds or even thousands of unknown parameters

[Figure: convergence of the Metropolis–Hastings algorithm – MCMC attempts to approximate the blue distribution with the orange distribution. By Chdrappi, using R (FOSS statistical software), CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=25674906]

The three steps of Bayesian data analysis
1. Set up a full probability model
   – A joint probability distribution for all quantities
   – Consistent with scientific knowledge
2. Condition on the observed data
   – Conditional probabilities of the quantities of interest, given the observed data
3. Evaluate the model fit
   – Fit of the model to the data
   – Sensitivity of the results to the modelling assumptions

Fundamental concepts: Frequentist vs Bayesian
• Population parameters – different ways of expressing uncertainty:
  – Frequentist = a fixed unknown constant
  – Bayesian = a random variable, subject to change as additional data arise.
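The sequential logic described above – the posterior after each interim look becomes the prior for the next – can be sketched with the conjugate beta-binomial model used in the CTG Saints example. The interim batch sizes below are hypothetical, chosen only so the totals match the talk's 6 believers out of 10.

```python
# Sketch of a sequential Bayesian analysis with the conjugate
# beta-binomial model: Beta(a, b) prior + binomial data
# -> Beta(a + successes, b + failures) posterior.
def beta_update(a, b, successes, n):
    return a + successes, b + (n - successes)

a, b = 1, 1                                      # uninformative Beta(1, 1) prior
for successes, n in [(2, 3), (3, 4), (1, 3)]:    # three hypothetical interim looks
    a, b = beta_update(a, b, successes, n)       # posterior becomes the new prior
    print(f"interim posterior: Beta({a}, {b})")

# Final posterior is Beta(7, 5), identical to analysing all 10 subjects
# (6 believers) in a single batch: the updating order does not matter.
print((a, b))   # -> (7, 5)
```

This order-invariance is what makes legitimate intermediate conclusions possible: stopping the trial early simply means using the current posterior.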
Assign probability distributions to them before seeing the data = prior distributions
• The revised distribution of the unknown parameter (which is a random variable) is the posterior probability distribution
• Bayesian = direct statements about the probability that a hypothesis is true
• Frequentist = the probability of obtaining a result at least as extreme as that observed, assuming the hypothesis is true = a tail probability = the p value

Bayesian calculations, discrete probability: "The CTG Saint Study"
• Prior probability = 0.1
[Figure: Bayesian prior 0.1, observed 6/10 – prior, likelihood and posterior p(θ) plotted over θ = 0 to 1]

• Parameters θi are assigned prior distributions:
  p(θ | data) = p(data | θ) p(θ) / p(data)
• Integrate over all possible values of the θi using an MCMC approach
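The MCMC integration step can be illustrated with a minimal random-walk Metropolis sampler (one member of the Metropolis–Hastings family shown earlier) drawing from the Beta(7, 5) posterior of the CTG Saints example. This is only a sketch: real analyses use tuned software with burn-in and convergence diagnostics, and the proposal scale and chain length here are arbitrary choices.

```python
import math
import random

random.seed(42)

def log_post(theta, a=7, b=5):
    """Unnormalised log-density of the Beta(a, b) posterior."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return (a - 1) * math.log(theta) + (b - 1) * math.log(1.0 - theta)

samples = []
theta = 0.5                                        # arbitrary starting value
for _ in range(50_000):
    proposal = theta + random.gauss(0.0, 0.1)      # random-walk proposal
    log_alpha = log_post(proposal) - log_post(theta)
    if random.random() < math.exp(min(0.0, log_alpha)):
        theta = proposal                           # accept the move
    samples.append(theta)                          # else keep the current value

kept = samples[5_000:]                             # discard burn-in
post_mean = sum(kept) / len(kept)
print(round(post_mean, 3))                         # near the exact mean 7/12 ≈ 0.583
```

The point of the method is that only the unnormalised posterior is needed: the intractable normalising integral p(data) cancels in the accept/reject ratio, which is exactly what makes models with hundreds of parameters computable.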