
Chapter 8: Optimal Tests of Hypotheses
1. Most Powerful Tests
For 𝐻0 : 𝜃 ∈ 𝜔0 versus 𝐻1 : 𝜃 ∈ 𝜔1 (*note: 𝐻1 is also commonly
denoted as 𝐻𝑎 )
Let the random sample be denoted by 𝑿 = (𝑋1 , 𝑋2 , ⋯ , 𝑋𝑛 )′ .
Suppose the rejection region is 𝑪 such that: (*note: the
rejection region can be defined in terms of the sample points
directly instead of the test statistic values)
We reject 𝐻0 if 𝑿 ∈ 𝑪
We fail to reject 𝐻0 if 𝑿 ∈ 𝑪𝐶
Definition: The size or significance level of the test is defined
as:
\[
\alpha \;=\; \max_{\theta \in \omega_0} P_\theta(\mathbf{X} \in \mathbf{C}) \;=\; \max_{\theta \in \omega_0} P_\theta(\text{Reject } H_0 \mid \theta \in \omega_0).
\]
Definition: The power of a test is given by
\[
1 - \beta \;=\; \gamma(\theta) \;=\; P_\theta(\text{Reject } H_0 \mid \theta \in \omega_1).
\]
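As a concrete illustration (a minimal Python sketch; the test, cutoff, and sample size below are hypothetical choices, not from the notes), suppose X1, …, Xn are iid N(θ, 1) and we reject H0: θ ≤ 0 when the sample mean exceeds a cutoff c. The size is the rejection probability maximized over θ ≤ 0 (attained at θ = 0), and the power is the rejection probability at an alternative θ ∈ ω1:

```python
# Illustrative sketch (not from the notes): size and power of the test that rejects
# H0: theta <= 0 when the sample mean exceeds a cutoff c, for X1,...,Xn iid N(theta, 1).
from scipy.stats import norm
import numpy as np

n, c = 25, 0.33                      # sample size and cutoff (hypothetical values)
se = 1 / np.sqrt(n)                  # standard deviation of the sample mean

size = norm.sf(c, loc=0.0, scale=se)    # alpha = max over theta <= 0 of P_theta(Xbar > c), attained at theta = 0
power = norm.sf(c, loc=0.5, scale=se)   # gamma(0.5) = P_{theta = 0.5}(Xbar > c)
print(f"size = {size:.3f}, power at theta = 0.5: {power:.3f}")
```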
Definition: A most powerful test corresponds to the best
rejection region 𝑪 such that in testing a simple null hypothesis
𝐻0 : 𝜃 = 𝜃 ′ versus a simple alternative hypothesis 𝐻𝑎 : 𝜃 = 𝜃 ′′ ,
this test has the largest possible test power (𝟏 − 𝜷) among all
tests of size 𝜶.
Theorem (Neyman-Pearson Lemma). The likelihood ratio test
for a simple null hypothesis 𝐻0 : 𝜃 = 𝜃 ′ versus a simple
alternative hypothesis 𝐻𝑎 : 𝜃 = 𝜃 ′′ , that is, the test that rejects 𝐻0
when 𝑓(𝒙; 𝜃 ′ )/𝑓(𝒙; 𝜃 ′′ ) ≤ 𝑘, with 𝑘 chosen so that the test has
size 𝛼, is a most powerful test.
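The optimality claim can be illustrated numerically (an illustrative Python sketch, not part of the original notes; the hypotheses, n, and α are arbitrary choices). For X1, …, Xn iid N(θ, 1) with H0: θ = 0 versus H1: θ = 1, the likelihood ratio test rejects for large X̄; any other size-α test, such as one that looks only at X1, has no greater power, as the lemma guarantees:

```python
# Illustrative check of the Neyman-Pearson Lemma for H0: theta = 0 vs H1: theta = 1,
# with X1,...,Xn iid N(theta, 1).  The likelihood-ratio test rejects for large Xbar;
# a competing size-alpha test that looks only at X1 is less powerful.
from scipy.stats import norm
import numpy as np

n, alpha = 10, 0.05

# Likelihood-ratio (most powerful) test: reject when Xbar >= z_{1-alpha} / sqrt(n)
c_lr = norm.ppf(1 - alpha) / np.sqrt(n)
power_lr = norm.sf(c_lr, loc=1.0, scale=1 / np.sqrt(n))

# A valid but wasteful size-alpha test: reject when X1 >= z_{1-alpha}
c_bad = norm.ppf(1 - alpha)
power_bad = norm.sf(c_bad, loc=1.0, scale=1.0)

print(f"power of LR test: {power_lr:.3f}, power of X1-only test: {power_bad:.3f}")
```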
Egon Sharpe Pearson (August 11, 1895 – June 12, 1980) was the
son of Karl Pearson (March 27, 1857 – April 27, 1936) and Maria
Pearson (née Sharpe); Sigrid Letitia Sharpe Pearson was his sister.
In 1901, with Weldon and Galton, Karl Pearson founded the
journal Biometrika whose object was the development of
statistical theory. He edited this journal until his death.
In 1911, Karl Pearson founded the world's first university
statistics department at University College London.
His only son, Egon Pearson, became an eminent statistician
himself, establishing the Neyman-Pearson lemma. He
succeeded his father as head of the Applied Statistics
Department at University College.
http://en.wikipedia.org/wiki/Karl_Pearson
Jerzy Neyman (April 16, 1894 – August 5, 1981) is best known
for the Neyman-Pearson Lemma. He also developed the
concept of the confidence interval (1937) and contributed
significantly to sampling theory. He published many books
dealing with experiments and statistics, and devised the way in
which the FDA tests medicines today.
Jerzy Neyman was also the founder of the Department of
Statistics at the University of California, Berkeley, in 1955.
http://en.wikipedia.org/wiki/Jerzy_Neyman
2. Uniformly Most Powerful Tests
Definition: A uniformly most powerful test in testing a simple
null hypothesis 𝐻0 : 𝜃 = 𝜃 ′ versus a composite alternative
hypothesis 𝐻𝑎 : 𝜃 ∈ 𝜔1 , or in testing a composite null
hypothesis 𝐻0 : 𝜃 ∈ 𝜔0 versus a composite alternative 𝐻𝑎 : 𝜃 ∈
𝜔1 , is a size 𝜶 test such that this test has the largest possible
test power 𝜸(𝜃 ′′ ) for every simple alternative hypothesis:
𝐻𝑎 : 𝜃 = 𝜃 ′′ ∈ 𝜔1 , among all such tests of size 𝜶.
Example of a uniformly most powerful test:
Let $X_1, \ldots, X_n$ be iid $N(\mu, 1)$ and suppose we want to test
$H_0: \mu = \mu_0$ versus $H_1: \mu > \mu_0$. For each $\mu_1 > \mu_0$, the most
powerful size $\alpha$ test of $H_0: \mu = \mu_0$ versus $H_1: \mu = \mu_1$ rejects
the null hypothesis for
\[
\frac{\sum_{i=1}^{n}(X_i - \mu_0)}{\sqrt{n}} \;\ge\; \Phi^{-1}(1 - \alpha).
\]
Since this same test function is most powerful for each $\mu_1 > \mu_0$, this test
function is UMP.

But suppose we consider the alternative hypothesis $H_1: \mu \ne \mu_0$.
Then there is no UMP test. The most powerful test for each
$H_1: \mu = \mu_1$, where $\mu_1 > \mu_0$, rejects the null hypothesis for
\[
\frac{\sum_{i=1}^{n}(X_i - \mu_0)}{\sqrt{n}} \;\ge\; \Phi^{-1}(1 - \alpha),
\]
but the most powerful test for each $H_1: \mu = \mu_1$, where $\mu_1 < \mu_0$, rejects for
\[
\frac{\sum_{i=1}^{n}(X_i - \mu_0)}{\sqrt{n}} \;\le\; \Phi^{-1}(\alpha).
\]
Note that the test that rejects the null hypothesis for
$\sum_{i=1}^{n}(X_i - \mu_0)/\sqrt{n} \ge \Phi^{-1}(1-\alpha)$ cannot be most
powerful for an alternative $\mu_1 < \mu_0$ by part (c) (necessity) of
the Neyman-Pearson Lemma, since it is not a likelihood ratio
test for $\mu_1 < \mu_0$.

It is rare for UMP tests to exist. However, for one-sided
alternatives, they exist in some problems.
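A quick numerical look at this example (an illustrative Python sketch added here; μ0, n, α, and the grid of alternatives are arbitrary choices) evaluates the power function of the one-sided size-α test. Its power exceeds α for every μ > μ0, consistent with it being UMP for the one-sided alternative, while its power falls below α for μ < μ0, which is why it cannot be most powerful against alternatives on the other side:

```python
# Power function of the one-sided size-alpha test of H0: mu = mu0 vs H1: mu > mu0
# that rejects when sum(X_i - mu0)/sqrt(n) >= Phi^{-1}(1 - alpha), for X_i iid N(mu, 1).
# (Illustrative sketch; mu0, n, and the grid of alternatives are arbitrary choices.)
from scipy.stats import norm
import numpy as np

mu0, n, alpha = 0.0, 25, 0.05
z_crit = norm.ppf(1 - alpha)

def power(mu):
    # P_mu( sum(X_i - mu0)/sqrt(n) >= z_crit ), where the statistic is N(sqrt(n)*(mu - mu0), 1)
    return norm.sf(z_crit - np.sqrt(n) * (mu - mu0))

for mu in [-0.4, -0.2, 0.0, 0.2, 0.4]:
    print(f"mu = {mu:+.1f}: power = {power(mu):.3f}")
# The power equals alpha at mu0, increases for mu > mu0, and falls below alpha for mu < mu0.
```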
Note: The one-sided tests that we derived in the normal
population model (for µ with 𝜎 2 known, for µ with 𝜎 2
unknown, and for 𝜎 2 with µ unknown) are all uniformly most
powerful. On the other hand, none of the two-sided tests is
uniformly most powerful.
A condition under which UMP tests exist is that the family of
distributions being considered possesses a property called
monotone likelihood ratio.
Definition: Let 𝑓(𝒙|𝜃) be the joint pdf of the sample 𝑿 =
(𝑋1 , 𝑋2 , ⋯ , 𝑋𝑛 )′ . Then 𝑓(𝒙|𝜃) is said to have a monotone
likelihood ratio in the statistic 𝑇(𝑿) if for any choice 𝜃1 < 𝜃2
of parameter values, the likelihood ratio 𝑓(𝒙|𝜃2 )/𝑓(𝒙|𝜃1 )
depends on values of the data X only through the value of
statistic 𝑇(𝑿) and, in addition, this ratio is a monotone
function of 𝑇(𝑿).
Theorem (Karlin-Rubin). Suppose that 𝑓(𝒙|𝜃) has an
increasing monotone likelihood ratio for the statistic 𝑇(𝑿).
Let 𝑘𝛼 be chosen so that 𝛼 = 𝑃𝜃0 (𝑇(𝑿) ≥ 𝑘𝛼 ). Then
𝑪 = {𝒙; 𝑇(𝒙) ≥ 𝒌𝜶 }
is the rejection region (*also called ‘critical region’) for a UMP
test for the one-sided tests of: 𝐻0 : 𝜃 ≤ 𝜃0 versus 𝐻1 : 𝜃 > 𝜃0 .
Theorem. For a Regular Exponential Family with only one
unknown parameter θ, and a population p.d.f.:
𝑓(𝑥; 𝜃) = 𝑐(𝜃) ⋅ 𝑔(𝑥) ⋅ exp[𝑡(𝑥)𝑤(𝜃)]
If 𝑤(𝜃) is an increasing function of 𝜃, then we have a
monotone likelihood ratio in the statistic 𝑇(𝑿) = ∑𝑛𝑖=1 𝑡(𝑋𝑖 ).
(*Recall that 𝑇(𝑿) = ∑𝑛𝑖=1 𝑡(𝑋𝑖 ) is also complete and sufficient for 𝜃.)
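For instance (a verification added here for illustration), the N(μ, 1) family from the UMP example above fits this form:
\[
f(x; \mu) = \frac{1}{\sqrt{2\pi}} e^{-(x-\mu)^2/2}
= \underbrace{e^{-\mu^2/2}}_{c(\mu)} \cdot \underbrace{\tfrac{1}{\sqrt{2\pi}} e^{-x^2/2}}_{g(x)} \cdot \exp[\underbrace{x}_{t(x)} \cdot \underbrace{\mu}_{w(\mu)}],
\]
so w(μ) = μ is increasing in μ, and the family has a monotone likelihood ratio in T(𝑿) = ∑𝑛𝑖=1 𝑋𝑖, consistent with the one-sided UMP test derived earlier.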
Examples of a family with monotone likelihood ratio:
For $X_1, \ldots, X_n$ iid Exponential($\lambda$), and for $\lambda_1 < \lambda_0$,
\[
\frac{p(\mathbf{x} \mid \lambda_1)}{p(\mathbf{x} \mid \lambda_0)}
= \frac{\prod_{i=1}^{n} \lambda_1 e^{-\lambda_1 X_i}}{\prod_{i=1}^{n} \lambda_0 e^{-\lambda_0 X_i}}
= \left(\frac{\lambda_1}{\lambda_0}\right)^{n} \exp\!\left[-(\lambda_1 - \lambda_0) \sum_{i=1}^{n} X_i\right]
\]
is an increasing function of $\sum_{i=1}^{n} X_i$, so the family has a monotone
likelihood ratio in $T(\mathbf{x}) = \sum_{i=1}^{n} X_i$.
[Photos: Karl Pearson (left) and Egon Pearson (right); Karl Pearson (left) and R. A. Fisher (right).]
Example. Inference on two population means, when both populations
are normal and we have two independent samples. Furthermore, the
population variances are unknown but equal ($\sigma_1^2 = \sigma_2^2 = \sigma^2$)
$\Rightarrow$ pooled-variance t-test.

Data: $X_1, \ldots, X_{n_1} \overset{iid}{\sim} N(\mu_1, \sigma_1^2)$ and $Y_1, \ldots, Y_{n_2} \overset{iid}{\sim} N(\mu_2, \sigma_2^2)$, with the two samples independent.

Goal: Compare $\mu_1$ and $\mu_2$.

Pivotal Quantity Approach:

1) Point estimator:
\[
\hat{\mu}_1 - \hat{\mu}_2 = \bar{X} - \bar{Y} \sim N\!\left(\mu_1 - \mu_2, \; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right) = N\!\left(\mu_1 - \mu_2, \; \left(\frac{1}{n_1} + \frac{1}{n_2}\right)\sigma^2\right)
\]

2) Pivotal quantity:
\[
Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim N(0, 1)
\]
is not the PQ, since we don't know $\sigma^2$. However,
\[
\frac{(n_1 - 1)S_1^2}{\sigma^2} \sim \chi^2_{n_1 - 1}, \qquad \frac{(n_2 - 1)S_2^2}{\sigma^2} \sim \chi^2_{n_2 - 1},
\]
and they are independent ($S_1^2$ and $S_2^2$ are independent because the
two samples are independent of each other). Therefore
\[
W = \frac{(n_1 - 1)S_1^2}{\sigma^2} + \frac{(n_2 - 1)S_2^2}{\sigma^2} \sim \chi^2_{n_1 + n_2 - 2}.
\]

Definition: If $W = Z_1^2 + Z_2^2 + \cdots + Z_k^2$, where $Z_i \overset{iid}{\sim} N(0, 1)$, then $W \sim \chi^2_k$.

Definition (t-distribution): If $Z \sim N(0,1)$, $W \sim \chi^2_k$, and $Z$ and $W$ are
independent, then
\[
T = \frac{Z}{\sqrt{W/k}} \sim t_k.
\]

Therefore,
\[
T = \frac{Z}{\sqrt{\dfrac{W}{n_1 + n_2 - 2}}}
  = \frac{\dfrac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sigma\sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}}}
         {\sqrt{\dfrac{\tfrac{(n_1 - 1)S_1^2}{\sigma^2} + \tfrac{(n_2 - 1)S_2^2}{\sigma^2}}{n_1 + n_2 - 2}}}
  = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1 + n_2 - 2},
\]
where
\[
S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}
\]
is the pooled variance.

This is the PQ for the inference on the parameter of interest
($\mu_1 - \mu_2$).
3) Confidence Interval for ($\mu_1 - \mu_2$):
\begin{align*}
1 - \alpha &= P\!\left(-t_{n_1+n_2-2,\,\alpha/2} \le T \le t_{n_1+n_2-2,\,\alpha/2}\right) \\
&= P\!\left(-t_{n_1+n_2-2,\,\alpha/2} \le \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_p\sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}} \le t_{n_1+n_2-2,\,\alpha/2}\right) \\
&= P\!\left(-t_{n_1+n_2-2,\,\alpha/2}\, S_p\sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}} \le (\bar{X} - \bar{Y}) - (\mu_1 - \mu_2) \le t_{n_1+n_2-2,\,\alpha/2}\, S_p\sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}\right) \\
&= P\!\left(\bar{X} - \bar{Y} - t_{n_1+n_2-2,\,\alpha/2}\, S_p\sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{X} - \bar{Y} + t_{n_1+n_2-2,\,\alpha/2}\, S_p\sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}\right)
\end{align*}
$\Rightarrow$ The $100(1-\alpha)\%$ C.I. for ($\mu_1 - \mu_2$) is
\[
\bar{X} - \bar{Y} \;\pm\; t_{n_1+n_2-2,\,\alpha/2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}.
\]
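Numerically, the interval can be computed from summary statistics as in this sketch (illustrative; the summary values are placeholders, not from the notes):

```python
# 100(1 - alpha)% pooled-variance CI for mu1 - mu2 from summary statistics.
# Illustrative sketch; the summary numbers are placeholders.
import numpy as np
from scipy.stats import t

xbar, ybar, s1, s2, n1, n2, alpha = 10.3, 9.1, 1.8, 2.1, 12, 15, 0.05
sp = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))   # pooled SD
margin = t.ppf(1 - alpha / 2, df=n1 + n2 - 2) * sp * np.sqrt(1 / n1 + 1 / n2)
print(f"CI for mu1 - mu2: ({xbar - ybar - margin:.3f}, {xbar - ybar + margin:.3f})")
```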
4) Test:
Test statistic:
\[
T_0 = \frac{(\bar{X} - \bar{Y}) - c_0}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \;\overset{H_0}{\sim}\; t_{n_1 + n_2 - 2}
\]

a) $H_0: \mu_1 - \mu_2 = c_0$ versus $H_a: \mu_1 - \mu_2 > c_0$ (the most common
situation is $c_0 = 0$, i.e., $H_0: \mu_1 = \mu_2$).
At the significance level $\alpha$, we reject $H_0$ in favor of $H_a$ iff $T_0 \ge t_{n_1+n_2-2,\,\alpha}$.
Equivalently, if P-value $= P(T_0 \ge t_0 \mid H_0) \le \alpha$, reject $H_0$.

b) $H_0: \mu_1 - \mu_2 = c_0$ versus $H_a: \mu_1 - \mu_2 < c_0$.
At the significance level $\alpha$, we reject $H_0$ in favor of $H_a$ iff $T_0 \le -t_{n_1+n_2-2,\,\alpha}$.
Equivalently, if P-value $= P(T_0 \le t_0 \mid H_0) \le \alpha$, reject $H_0$.

c) $H_0: \mu_1 - \mu_2 = c_0$ versus $H_a: \mu_1 - \mu_2 \ne c_0$.
At the significance level $\alpha$, we reject $H_0$ in favor of $H_a$ iff $|T_0| \ge t_{n_1+n_2-2,\,\alpha/2}$.
Equivalently, if P-value $= 2P(T_0 \ge |t_0| \mid H_0) \le \alpha$, reject $H_0$.
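Given an observed t0 and its degrees of freedom, the three P-values can be evaluated as in the following sketch (illustrative; the values of t0 and the degrees of freedom are placeholders):

```python
# P-values for the three alternatives of the pooled-variance t-test.
# Illustrative sketch; t0 and df below are placeholders.
from scipy.stats import t

t0, df = 2.10, 25
p_greater   = t.sf(t0, df)            # a) Ha: mu1 - mu2 > c0
p_less      = t.cdf(t0, df)           # b) Ha: mu1 - mu2 < c0
p_two_sided = 2 * t.sf(abs(t0), df)   # c) Ha: mu1 - mu2 != c0
print(p_greater, p_less, p_two_sided)
```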
Data Analysis Example. An experiment was conducted to compare the
mean number of tapeworms in the stomachs of sheep that had been
treated for worms against the mean number in those that were
untreated. A sample of 14 worm-infected lambs was randomly divided
into 2 groups. Seven were injected with the drug and the remainder
were left untreated. After a 6-month period, the lambs were
slaughtered and the following worm counts were recorded:
Drug-treated sheep: 18 43 28 50 16 32 13
Untreated sheep: 40 54 26 63 21 37 39
Assume that both populations are normal and that the population
variances are equal. Test at α = 0.05 whether the treatment is
effective or not.
SOLUTION:
Inference on two population means. Two small and
independent samples. Both populations are normal, and the population
variances are unknown but equal ($\sigma_1^2 = \sigma_2^2$).
We perform the pooled-variance t-test with hypotheses
\[
H_0: \mu_1 - \mu_2 = 0 \quad \text{versus} \quad H_a: \mu_1 - \mu_2 < 0,
\]
where population 1 is the drug-treated group and population 2 is the
untreated group (the treatment is effective if it lowers the mean worm count).
\[
t_0 = \frac{\bar{X}_1 - \bar{X}_2 - 0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
    = \frac{28.57 - 40.0 - 0}{14.39\sqrt{\dfrac{1}{7} + \dfrac{1}{7}}} \approx -1.49
\]
Since $t_0 = -1.49$ is greater than $-t_{12,\,0.05} = -1.782$, we cannot
reject H0. We have insufficient evidence to reject the hypothesis
that there is no difference in the mean number of worms in treated
and untreated lambs.
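For reference, the same conclusion can be reproduced in Python (an illustrative check; scipy.stats.ttest_ind with equal_var=True performs the pooled-variance t-test, and the alternative= keyword assumes SciPy 1.6 or later):

```python
# Pooled-variance t-test on the tapeworm data, Ha: mu_treated < mu_untreated.
# Check of the hand calculation; requires SciPy >= 1.6 for the alternative= keyword.
from scipy.stats import ttest_ind

treated   = [18, 43, 28, 50, 16, 32, 13]
untreated = [40, 54, 26, 63, 21, 37, 39]

res = ttest_ind(treated, untreated, equal_var=True, alternative="less")
print(f"t0 = {res.statistic:.2f}, one-sided P-value = {res.pvalue:.3f}")
# t0 is about -1.49 and the P-value exceeds 0.05, so H0 is not rejected,
# matching the conclusion above.
```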
LRT Derivation of the Pooled-Variance T-Test
Suppose we have two independent random samples from two
normal populations with equal but unknown variances. We now
derive the likelihood ratio test for:
\[
H_0: \mu_1 = \mu_2 \quad \text{vs} \quad H_a: \mu_1 \ne \mu_2
\]
Let $\mu_1 = \mu_2 = \mu$ under $H_0$. Then
\[
\omega = \{-\infty < \mu_1 = \mu_2 = \mu < +\infty, \; 0 < \sigma^2 < +\infty\}, \qquad
\Omega = \{-\infty < \mu_1, \mu_2 < +\infty, \; 0 < \sigma^2 < +\infty\}.
\]
Under $\omega$,
\[
L(\omega) = L(\mu, \sigma^2) = (2\pi\sigma^2)^{-\frac{n_1+n_2}{2}} \exp\!\left[-\frac{1}{2\sigma^2}\left(\sum_{i=1}^{n_1}(x_i - \mu)^2 + \sum_{j=1}^{n_2}(y_j - \mu)^2\right)\right],
\]
and there are two parameters.
\[
\ln L(\omega) = -\frac{n_1+n_2}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\left(\sum_{i=1}^{n_1}(x_i - \mu)^2 + \sum_{j=1}^{n_2}(y_j - \mu)^2\right).
\]
Since it contains two parameters, we take the partial derivatives with
respect to $\mu$ and $\sigma^2$, set them equal to 0, and solve:
\[
\hat{\mu} = \frac{\sum_{i=1}^{n_1} x_i + \sum_{j=1}^{n_2} y_j}{n_1 + n_2} = \frac{n_1\bar{x} + n_2\bar{y}}{n_1 + n_2}, \qquad
\hat{\sigma}^2_\omega = \frac{1}{n_1 + n_2}\left[\sum_{i=1}^{n_1}(x_i - \hat{\mu})^2 + \sum_{j=1}^{n_2}(y_j - \hat{\mu})^2\right].
\]
Under $\Omega$,
\[
L(\Omega) = L(\mu_1, \mu_2, \sigma^2) = (2\pi\sigma^2)^{-\frac{n_1+n_2}{2}} \exp\!\left[-\frac{1}{2\sigma^2}\left(\sum_{i=1}^{n_1}(x_i - \mu_1)^2 + \sum_{j=1}^{n_2}(y_j - \mu_2)^2\right)\right],
\]
and there are three parameters.
\[
\ln L(\Omega) = -\frac{n_1+n_2}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\left(\sum_{i=1}^{n_1}(x_i - \mu_1)^2 + \sum_{j=1}^{n_2}(y_j - \mu_2)^2\right).
\]
We take the partial derivatives with respect to $\mu_1$, $\mu_2$, and $\sigma^2$ and
set them all equal to 0. The solutions are:
\[
\hat{\mu}_1 = \bar{x}, \qquad \hat{\mu}_2 = \bar{y}, \qquad
\hat{\sigma}^2_\Omega = \frac{1}{n_1 + n_2}\left[\sum_{i=1}^{n_1}(x_i - \bar{x})^2 + \sum_{j=1}^{n_2}(y_j - \bar{y})^2\right].
\]
Next we take the likelihood ratio. After some simplification (the factors
$e^{-\frac{n_1+n_2}{2}}$ in the maximized likelihoods cancel), we have:
\[
\lambda = \frac{L(\hat{\omega})}{L(\hat{\Omega})}
= \frac{\left(2\pi\hat{\sigma}^2_\omega\right)^{-\frac{n_1+n_2}{2}} e^{-\frac{n_1+n_2}{2}}}{\left(2\pi\hat{\sigma}^2_\Omega\right)^{-\frac{n_1+n_2}{2}} e^{-\frac{n_1+n_2}{2}}}
= \left[\frac{\hat{\sigma}^2_\Omega}{\hat{\sigma}^2_\omega}\right]^{\frac{n_1+n_2}{2}}
\]
\[
= \left[\frac{\sum_{i=1}^{n_1}(x_i - \bar{x})^2 + \sum_{j=1}^{n_2}(y_j - \bar{y})^2}
            {\sum_{i=1}^{n_1}\left(x_i - \frac{n_1\bar{x} + n_2\bar{y}}{n_1+n_2}\right)^2 + \sum_{j=1}^{n_2}\left(y_j - \frac{n_1\bar{x} + n_2\bar{y}}{n_1+n_2}\right)^2}\right]^{\frac{n_1+n_2}{2}}
= \left[1 + \frac{t_0^2}{n_1 + n_2 - 2}\right]^{-\frac{n_1+n_2}{2}},
\]
where $t_0$ is the test statistic in the pooled-variance t-test.

Therefore, $\lambda \le \lambda^*$ is equivalent to $|t_0| \ge c$. Thus at the
significance level $\alpha$, we reject the null hypothesis in favor of the
alternative when $|t_0| \ge c = t_{n_1+n_2-2,\,\alpha/2}$.
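The identity between λ and t0 can be checked numerically (an illustrative Python sketch using the tapeworm data from the example above):

```python
# Numerical check that lambda = (sigma2_Omega_hat / sigma2_omega_hat)^((n1+n2)/2)
#                            = (1 + t0^2/(n1+n2-2))^(-(n1+n2)/2)
# using the tapeworm data from the earlier example.
import numpy as np

x = np.array([18, 43, 28, 50, 16, 32, 13], float)   # treated
y = np.array([40, 54, 26, 63, 21, 37, 39], float)   # untreated
n1, n2 = len(x), len(y)

mu_hat = (x.sum() + y.sum()) / (n1 + n2)                                    # common mean under H0
s2_omega = (((x - mu_hat)**2).sum() + ((y - mu_hat)**2).sum()) / (n1 + n2)
s2_Omega = (((x - x.mean())**2).sum() + ((y - y.mean())**2).sum()) / (n1 + n2)
lam = (s2_Omega / s2_omega) ** ((n1 + n2) / 2)

sp2 = (((x - x.mean())**2).sum() + ((y - y.mean())**2).sum()) / (n1 + n2 - 2)
t0 = (x.mean() - y.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
lam_from_t = (1 + t0**2 / (n1 + n2 - 2)) ** (-(n1 + n2) / 2)

print(f"lambda = {lam:.4f}, from t0: {lam_from_t:.4f}")   # the two agree (about 0.31)
```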
In general, for the LR test, we know that as the sample
size goes to infinity, under $H_0$,
\[
-2\ln\lambda \;\overset{d}{\longrightarrow}\; \chi^2_k,
\]
where k is the difference between the number of free
parameters under the alternative (full) model and under the null
hypothesis.
Note: In the pooled-variance t-test derivation above,
k = 3 – 2 = 1.
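As a rough illustration only (the χ² approximation is asymptotic, and the tapeworm samples are small, so the exact t-based test above is preferred), the approximate P-value is the χ²₁ tail probability of −2 ln λ:

```python
# Asymptotic chi-square approximation for the LR test: -2 ln(lambda) ~ chi2 with k = 1.
# Illustrative only; with n1 = n2 = 7 the exact t-based test above is preferred.
import numpy as np
from scipy.stats import chi2

lam = 0.306                       # likelihood ratio from the numerical check above
stat = -2 * np.log(lam)
print(f"-2 ln(lambda) = {stat:.3f}, approximate P-value = {chi2.sf(stat, df=1):.3f}")
```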