Intro to Bayes

Slide 1: Rev. Thomas Bayes
• Studied at the University of Edinburgh in 1719
• Elected Fellow of the Royal Society in 1742
• Famous for “An Essay towards solving a Problem in the Doctrine of Chances”
  - Published posthumously in 1763

Slide 2: Bayes Theorem
• He proposed a theorem that now bears his name:

    p(B|A) = p(A|B) p(B) / [ p(A|B) p(B) + p(A|B^C) p(B^C) ]

• A and B are two events
• Common example:
  - A is the event: positive test
  - B is the event: have disease
  - B^C is the event: disease free
  - You know p(A|B); the patient cares about p(B|A)

Slide 3: Bayes Theorem
• We would rewrite his theorem as

    p(x|y) = p(y|x) p(x) / p(y) = p(y|x) p(x) / Σ_x p(y|x) p(x)

• If x and y are continuous we have

    p(x|y) = p(y|x) p(x) / p(y) = p(y|x) p(x) / ∫ p(y|x) p(x) dx

• Bayes rule is all about conditional probability
• We want to find p(x|y) from p(y|x)

Slide 4: Bayes Theorem: Statistics
• When using Bayes rule in statistics:

    f(θ|y) = f(y|θ) f(θ) / f(y) = f(y|θ) f(θ) / ∫ f(y|θ) f(θ) dθ

• Often see this expressed as:

    f(θ|y) ∝ f(y|θ) f(θ)
    Posterior ∝ Likelihood × Prior

  - f(y|θ): data model (or likelihood)
  - f(θ): prior distribution
  - f(θ|y): posterior distribution

Slide 5: What is Bayesian statistics?
• We start with prior beliefs about quantities of interest
  - Before the data are collected
  - These beliefs are often ‘vague’
  - Expressed as a pmf/pdf
• Use the data to update our beliefs
  - Obtain the posterior distribution
• The posterior distribution contains all of our knowledge about the quantity of interest
  - In the form of a pmf/pdf
• ‘Bayesian yardstick’: degree of knowledge about anything unknown can be described through probability.
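The diagnostic-test example on the Bayes Theorem slide can be checked numerically. The sensitivity, false-positive rate, and prevalence below are illustrative assumptions (the slides give no numbers), chosen only to show how different p(B|A) can be from p(A|B):

```python
# Bayes' theorem for the diagnostic-test example:
#   p(B|A) = p(A|B) p(B) / (p(A|B) p(B) + p(A|B^C) p(B^C))
# A = positive test, B = have disease, B^C = disease free.

p_A_given_B = 0.90    # sensitivity p(A|B): assumed value for illustration
p_A_given_Bc = 0.05   # false-positive rate p(A|B^C): assumed value
p_B = 0.01            # prevalence p(B): assumed value

numerator = p_A_given_B * p_B
p_A = numerator + p_A_given_Bc * (1 - p_B)  # law of total probability
p_B_given_A = numerator / p_A

print(round(p_B_given_A, 3))  # -> 0.154
```

Even with a fairly accurate test, a positive result here gives only about a 15% chance of disease, because the disease is rare: exactly the gap between p(A|B) and p(B|A) the slide highlights.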
Slide 6: Likelihood
• We need to define the information the data contain about the quantity of interest
  - Specify a likelihood or data model
  - Commonly known as a statistical model
• We have seen these already
  - These are the pmfs/pdfs we looked at earlier
    – We might assume our data are normally distributed
    – We might assume our data are binomially distributed
  - These describe the uncertainty of the data given parameter values
  - We will talk more about these soon

Slide 7: The posterior distribution
• The posterior distribution
  - Contains all our knowledge about θ given the data
  - Inference is based on this distribution
• Use it to find point estimates for θ:
  - Mean or median of the posterior distribution
• Use it to find interval estimates for θ:
  - Quantiles of the posterior distribution

Slide 8: In theory
• Once we have a statistical model and a prior distribution:
  - Find the posterior distribution
  - Use the posterior to make inference about θ

Slide 9: In practice
• In practice we are often unable to calculate the posterior directly
  - It is too difficult to find the required integrals
• We can instead generate samples from the posterior distribution
  - Markov chain Monte Carlo (MCMC)
• Use these samples to summarize the distribution
  – Visualize the distribution
  – Find point estimates
  – Find interval estimates
  – ...

Slide 10: Summary of Bayesian statistics
• Investment required to understand the basic concepts
  - Easier to extend to realistic and complex models
    – e.g. hierarchical models (outside the scope of this course)
• Not constrained by ‘recipe book’ style procedures
  - Easier to make a mistake:
    – Software: ‘WARNING: MCMC can be dangerous’
• MCMC can largely be automated (e.g. JAGS)
  - It can take a long time to sample from the posterior
  - MCMC can struggle (particularly as models get more complex)
• Need to specify priors
  - A natural way to incorporate additional information
  - Offers flexibility in modeling
  - No longer have to choose an estimator
  - Can make probability statements about a parameter

Slide 11: Example: binomial data
• I want each of you to answer the following question:
  - Does U of O need to do more to help students refrain from academic misconduct?
• On a piece of paper write:
  1. Answer (yes or no)
  2. Teaching load (0, 1, 2.5, etc.)

Slide 12: Example: binomial data
• Now we have data
• Assume the workshop is a representative sample of U of O staff
• Assume the data are binomially distributed
  - n = 15 trials
  - Each with probability π of answering yes
  - Observed y = ? saying yes
• Goal is to estimate π
  - Find the posterior distribution p(π|y)

Slide 13: Knowledge as pdf
• When using Bayesian inference we represent our knowledge using pdfs
  - Go over various choices of prior distribution
    – Show the corresponding posterior when y = 9
• Then, we will fit the model in JAGS

Slide 14: Priors for probabilities
• We need a pdf for the parameter π to describe our prior belief
  - Probabilities must be between 0 and 1
• We can use a beta distribution: Be(α, β)
• Two ‘vague’ priors often used are:
  - Be(0.5, 0.5)
  - Be(1, 1)
• Also look at:
  - Be(1, 9)
  - Be(5, 5)
  - Be(9, 1)

Slide 15: Priors: vague
[Figure: densities of the Be(1,1) and Be(0.5,0.5) priors over π ∈ [0, 1]]

Slide 16: Priors: informative
[Figure: densities of the Be(1,9), Be(9,1), and Be(5,5) priors over π ∈ [0, 1]]

Slide 17: Priors and posteriors: vague
[Figure: posterior densities of π under the Be(1,1) and Be(0.5,0.5) priors]

Slide 18: Priors and posteriors: informative
[Figure: posterior densities of π under the Be(1,9), Be(9,1), and Be(5,5) priors]

Slide 19: Example: binomial data
• Fit the model
  using our data in JAGS

Slide 20: Example: two sample binomial data
• Consider two populations:
  - University staff/students who teach
  - University staff/students who do not teach
• Is there a difference between these two populations?
• Minimal changes needed to the JAGS code

Slide 21: Example: normal data
• Consider paired data
  - Weight change (in lbs) for 29 young female anorexia patients undergoing cognitive behavioural treatment [1]
    – File anorex1.txt, or use the data below:

    ycbt = c(1.7, 0.7, -0.1, -0.7, -3.5, 14.9, 3.5, 17.1, -7.6, 1.6, 11.7, 6.1,
             1.1, -4, 20.9, -9.1, 2.1, -1.4, 1.4, -0.3, -3.7, -0.8, 2.4, 12.6,
             1.9, 3.9, 0.1, 15.4, -0.7)

[1] The full data set includes another treatment and a control. Hopefully we will explore those data later.

Slide 22: Example: normal data
• Statistical model
  - We assume each observation is normally distributed
    – Unknown mean µ
    – Unknown standard deviation σ (or precision τ)
  - We need priors for µ and σ (or τ)

Slide 23: Priors
• Prior for µ:
  - We often use a normal prior with large variance for µ
    – Lets the data speak
• The variance/precision is trickier
  - The precision must be positive
  - It is common to use a gamma prior for τ
  - A well-motivated alternative is a half-t prior for σ
    – Weakly informative prior
    – Regularizes the model
  - E.g. we are unlikely to see weight changes of hundreds of lbs
    – Unlikely to see σ exceed 50
    – Use a half-t(0, 25², 3)
    – See next slide

Slide 24: Half-t priors
[Figure: density of the half-t(0, 25², 3) prior for σ over 0 to 100]

Slide 25: Half-t prior
• Advantages:
  - Most of the mass is between 0 and 50
  - Heavy tails (we have used 3 degrees of freedom)
    – There is still considerable mass above 50 if we are wrong
• Disadvantage?
  - Requires some knowledge about the data

Slide 26: Fitting the model
• In JAGS
• Trick:
  - JAGS uses precisions, not standard deviations

Slide 27: Example: two sample normal data
• Anorexia data: increase in body weight
  - Treatment (modeled previously)
  - Control
• Assume the data are normally distributed
  - Each group has a different mean
  - Each group has the same variance
    – Straightforward to allow the variance to differ between samples
• Data in anorex2.txt
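The beta priors used for the binomial example earlier are conjugate: with y yes-answers in n trials, a Be(α, β) prior gives a Be(α + y, β + n − y) posterior in closed form. A minimal sketch (in Python rather than the course's JAGS) of the posteriors plotted on the earlier slides, using n = 15 and y = 9 as stated there:

```python
# Conjugate beta-binomial update: prior Be(a, b) plus y successes in n trials
# yields posterior Be(a + y, b + n - y).
n, y = 15, 9  # workshop example: 15 respondents, 9 saying yes

priors = {"Be(1,1)": (1, 1), "Be(0.5,0.5)": (0.5, 0.5),
          "Be(1,9)": (1, 9), "Be(5,5)": (5, 5), "Be(9,1)": (9, 1)}

for name, (a, b) in priors.items():
    a_post, b_post = a + y, b + n - y
    mean = a_post / (a_post + b_post)  # posterior mean of pi
    print(f"prior {name}: posterior Be({a_post}, {b_post}), mean {mean:.3f}")
```

The vague priors give posterior means near the sample proportion 9/15 = 0.6 (e.g. Be(1,1) yields Be(10, 7), mean 10/17 ≈ 0.588), while the informative priors pull the posterior toward their own mass, as the "Priors and posteriors" figures show.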
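The course fits these models in JAGS, but the MCMC idea from the "In practice" slide can be sketched by hand. Below is a minimal random-walk Metropolis sampler for π in the binomial example (n = 15, y = 9, flat Be(1,1) prior); this is an illustrative sketch, not the JAGS implementation used in the course, and the proposal scale 0.1 is an arbitrary tuning choice:

```python
import math
import random

# Random-walk Metropolis targeting p(pi | y) with y ~ Binomial(n, pi)
# and a flat Be(1,1) prior. The exact posterior is Be(10, 7), so the
# sample mean can be checked against 10/17.
n, y = 15, 9

def log_post(pi):
    if not 0 < pi < 1:
        return -math.inf                     # zero density outside (0, 1)
    return y * math.log(pi) + (n - y) * math.log(1 - pi)

random.seed(1)
pi, samples = 0.5, []
for _ in range(20000):
    prop = pi + random.gauss(0, 0.1)         # symmetric proposal
    if math.log(random.random()) < log_post(prop) - log_post(pi):
        pi = prop                            # accept; otherwise keep current pi
    samples.append(pi)

burned = samples[2000:]                      # discard burn-in
print(round(sum(burned) / len(burned), 3))   # close to exact mean 10/17 = 0.588
```

Summarizing the retained draws (mean, quantiles, histogram) is exactly the "use these samples to summarize the distribution" step from the slides; JAGS automates the sampler construction and tuning that are hand-rolled here.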