Set 14 - Posterior model probability

STAT 401 (Engineering) - Iowa State University
March 8, 2017
Posterior model probabilities
Hypothesis test decisions
A p-value can loosely be interpreted as "the probability of observing this data if the null hypothesis is true," i.e. p(y|H0). But what we really want is "the probability the null hypothesis is true, given that we observed this data," i.e.

p(H0|y) = p(y|H0) p(H0) / p(y).
If there are only two hypotheses (say H0 and HA), then we have

p(H0|y) = p(y|H0) p(H0) / [p(y|H0) p(H0) + p(y|HA) p(HA)]
        = 1 / [1 + p(y|HA) p(HA) / (p(y|H0) p(H0))].
Point null hypotheses
If H0 : θ = θ0 and HA : θ ≠ θ0, then

p(y|H0) = p(y|θ0)
p(y|HA) = ∫ p(y, θ|HA) dθ = ∫ p(y|θ, HA) p(θ|HA) dθ = ∫ p(y|θ) p(θ|HA) dθ
where p(θ|HA ) is the distribution of the parameter θ when the alternative
hypothesis is true.
Example
If Y ∼ N(µ, 1) and we have the hypotheses H0 : µ = 0 vs HA : µ ≠ 0 with µ|HA ∼ N(0, 1), then

y|H0 ∼ N(0, 1)
y|HA ∼ N(0, 2).
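These marginal densities make the posterior probability of H0 directly computable. A minimal sketch (Python; the function names are ours, not from the slides):

```python
import math

def normal_pdf(y, mean, var):
    # density of N(mean, var) evaluated at y
    return math.exp(-(y - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def posterior_h0(y, prior_h0=0.5):
    # p(H0|y) with p(y|H0) = N(y; 0, 1) and p(y|HA) = N(y; 0, 2)
    m0 = normal_pdf(y, 0.0, 1.0)   # p(y|H0)
    ma = normal_pdf(y, 0.0, 2.0)   # p(y|HA), with mu integrated out
    prior_ha = 1.0 - prior_h0
    return 1.0 / (1.0 + (ma * prior_ha) / (m0 * prior_h0))
```

With equal prior probabilities, posterior_h0(0.0) ≈ 0.59 (data right at the null value favors the point null), while posterior_h0(3.0) ≈ 0.13 favors the alternative.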
Relative frequency interpretation
Suppose you have a model p(y|θ), hypotheses H0 : θ = θ0 and HA : θ ≠ θ0, and you observe a p-value equal to 0.05. Now you want to understand what that means in terms of whether the null hypothesis is true or not. That is, you want

p(H0 | p-value = 0.05) = [1 + p(p-value = 0.05|HA) p(HA) / (p(p-value = 0.05|H0) p(H0))]^{-1}.
If we are using a relative frequency interpretation of probability, then the answer depends on
- the relative frequency of the null hypothesis being true, p(H0) = 1 − p(HA), and
- the ratio of the relative frequency of seeing p-value = 0.05 under the null and under the alternative, which depends on the distribution for θ under the alternative because

p(p-value = 0.05|HA) = ∫ p(p-value = 0.05|θ) p(θ|HA) dθ.
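One way to build intuition for p(p-value = 0.05|HA) is Monte Carlo. A sketch (Python; it assumes θ|HA ∼ N(0, 1) and the one-observation normal model from the earlier example, purely as an illustration):

```python
import math
import random

def two_sided_pvalue(y):
    # p-value for H0: theta = 0 when y ~ N(theta, 1): 2(1 - Phi(|y|))
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(y) / math.sqrt(2.0))))

random.seed(0)
draws = 200_000
pvals = []
for _ in range(draws):
    theta = random.gauss(0.0, 1.0)   # theta ~ p(theta|HA) = N(0, 1)
    y = random.gauss(theta, 1.0)     # y|theta ~ N(theta, 1)
    pvals.append(two_sided_pvalue(y))

# histogram estimate of the p-value density near 0.05 under HA;
# under H0 the p-value is Uniform(0, 1), so its density there is exactly 1
width = 0.02
density = sum(0.04 <= p <= 0.06 for p in pvals) / (draws * width)
```

Here the estimate lands well above 1, i.e. a p-value near 0.05 is appreciably more likely under this particular alternative than under the null.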
Bayesian hypothesis tests
To conduct a Bayesian hypothesis test, you need to specify
- p(Hj) and
- p(θ|Hj)
for every hypothesis j = 1, . . . , J. Then, you can calculate

p(Hj|y) = p(y|Hj) p(Hj) / Σ_{k=1}^J p(y|Hk) p(Hk)
        = [1 + Σ_{k≠j} p(y|Hk) p(Hk) / (p(y|Hj) p(Hj))]^{-1}

where

BF(Hk : Hj) = p(y|Hk) / p(y|Hj)

is the Bayes factor for hypothesis Hk compared to hypothesis Hj and

p(y|Hj) = ∫ p(y|θ) p(θ|Hj) dθ

for all j.
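The algebra above reduces to a few lines of code once the marginal likelihoods p(y|Hj) are available. A sketch (Python; the function names and the example marginal likelihood values are hypothetical):

```python
def posterior_model_probs(marginals, priors):
    # p(Hj|y) = p(y|Hj) p(Hj) / sum_k p(y|Hk) p(Hk)
    weights = [m * p for m, p in zip(marginals, priors)]
    total = sum(weights)
    return [w / total for w in weights]

def bayes_factor(marginals, k, j):
    # BF(Hk : Hj) = p(y|Hk) / p(y|Hj)
    return marginals[k] / marginals[j]

# hypothetical marginal likelihoods p(y|Hj) for J = 3 hypotheses
marginals = [0.20, 0.05, 0.01]
priors = [1 / 3, 1 / 3, 1 / 3]
probs = posterior_model_probs(marginals, priors)
```

With equal prior probabilities the posterior model probabilities are simply proportional to the marginal likelihoods, so probs[0] = 0.20/0.26 ≈ 0.77 here.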
Normal example
Let Y ∼ N (µ, 1) and consider the hypotheses H0 : µ = 0 and HA : µ 6= 0
with µ|HA ∼ N (0, C) and, for simplicity, p(H0 ) = p(HA ) = 0.5. Then the
two hypotheses are really
Y ∼ N (0, 1) and
Y ∼ N (0, 1 + C).
Thus

p(H0|y) = [1 + p(y|HA) / p(y|H0)]^{-1} = [1 + N(y; 0, 1 + C) / N(y; 0, 1)]^{-1}

where N(y; µ, σ²) is the probability density function for a normal distribution with mean µ and variance σ² evaluated at the value y.
Normal example
[Figure: p(H0|y) as a function of C ∈ (0, 100), one curve for each observed value y = 0, 1, 2, 3, 4, 5; the vertical axis p(H0|y) runs from 0 to 1.]
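The curves in the figure can be regenerated from the closed form above. A sketch (Python, assuming equal prior model probabilities as in the example):

```python
import math

def normal_pdf(y, var):
    # density of N(0, var) evaluated at y
    return math.exp(-y * y / (2 * var)) / math.sqrt(2 * math.pi * var)

def posterior_h0(y, C):
    # p(H0|y) = [1 + N(y; 0, 1 + C) / N(y; 0, 1)]^{-1}
    return 1.0 / (1.0 + normal_pdf(y, 1.0 + C) / normal_pdf(y, 1.0))

# one curve per observed y, as in the figure
curves = {y: [posterior_h0(y, C) for C in range(1, 101)] for y in range(6)}
```

For y = 0 the posterior climbs toward 1 as C grows, because a more diffuse prior under HA makes the alternative predict the data poorly, while for y = 4 or 5 the posterior stays near 0 over this range of C.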
Do p-values and posterior probabilities agree?
Suppose Y ∼ Bin(n, θ) and we have the hypotheses H0 : θ = 0.5 and HA : θ ≠ 0.5. We observe n = 10,000 and y = 4,900 and find the p-value is

p-value ≈ 2P(Y ≤ 4900) = 0.0466,

so we would reject H0 at the 0.05 level.

The posterior probability of H0 if we assume θ|HA ∼ Unif(0, 1) and p(H0) = p(HA) = 0.5 is

p(H0|y) ≈ 1 / (1 + 1/10.8) ≈ 0.92,

so the probability of H0 being true is about 92%.
It appears the Bayesian posterior probability and the p-value completely disagree!
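These numbers can be checked with the standard library alone. A sketch (Python; the exact binomial tail sum may differ slightly from the normal-approximation value quoted above):

```python
import math

n, y = 10_000, 4_900

def log_binom_pmf(k, n, p):
    # log of the Binomial(n, p) pmf at k
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

# two-sided p-value: 2 P(Y <= 4900) under theta = 0.5
pvalue = 2 * sum(math.exp(log_binom_pmf(k, n, 0.5)) for k in range(y + 1))

# marginal likelihoods: p(y|H0) = Bin(y; n, 0.5) and, for theta|HA ~ Unif(0, 1),
# p(y|HA) = integral of Bin(y; n, theta) over theta = 1 / (n + 1)
m0 = math.exp(log_binom_pmf(y, n, 0.5))
mA = 1.0 / (n + 1)
post_h0 = m0 / (m0 + mA)   # with p(H0) = p(HA) = 0.5
```

The Bayes factor m0/mA comes out near 10.8 and post_h0 near 0.92, while the p-value falls below 0.05.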
Jeffreys-Lindley Paradox
Definition
The Jeffreys-Lindley Paradox concerns a situation in which, when comparing two hypotheses H0 and H1 given data y, we find
- a frequentist test result is significant, leading to rejection of H0, but
- the posterior probability of H0 is high.
This can happen when
- the effect size is small,
- n is large,
- H0 is relatively precise,
- H1 is relatively diffuse, and
- the prior model odds are ≈ 1.
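The conditions above can be demonstrated by fixing the frequentist evidence and letting n grow. A sketch (Python; it assumes the sample-mean model ȳ|µ ∼ N(µ, 1/n) with µ|H1 ∼ N(0, 1) and equal prior model odds, an illustration rather than anything from the slides):

```python
import math

def normal_pdf(x, var):
    # density of N(0, var) evaluated at x
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def posterior_h0_fixed_z(n, z=1.96):
    # observe ybar = z / sqrt(n), so the two-sided p-value is ~0.05 for every n;
    # marginally, ybar|H0 ~ N(0, 1/n) and ybar|H1 ~ N(0, 1 + 1/n)
    ybar = z / math.sqrt(n)
    m0 = normal_pdf(ybar, 1.0 / n)
    m1 = normal_pdf(ybar, 1.0 + 1.0 / n)
    return m0 / (m0 + m1)

probs = {n: posterior_h0_fixed_z(n) for n in (10, 100, 10_000, 1_000_000)}
```

The p-value says "reject H0" at every n, yet the posterior probability of H0 climbs from roughly 0.37 at n = 10 to above 0.99 at n = 10^6: a small effect, a large n, a precise H0, and a diffuse H1 together produce the paradox.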
No real paradox
P-values:
- P-values measure how incompatible your data are with the null hypothesis.
- The smaller the p-value, the more incompatible.
- But they say nothing about how likely the alternative is.
Posterior model probabilities:
- Bayesian posterior probabilities measure how likely the data are under the predictive distribution for each hypothesis.
- The larger the posterior probability, the more predictive that hypothesis was compared to the other hypotheses.
- But this requires you to have at least two well-thought-out models, i.e. no vague priors.

Thus, these two statistics provide completely different measures of model adequacy.