February 19

Hotelling’s $T^2$
Testing a Hypothesis about the Mean
Consider as usual a random sample $x_1, x_2, \dots, x_N$ from $N_p(\mu, \Sigma)$.
Recall: if $\Sigma$ is known, we can test the null hypothesis
\[ H_0 : \mu = \mu_0 \]
with the size-$\alpha$ critical region
\[ N (\bar{x} - \mu_0)' \Sigma^{-1} (\bar{x} - \mu_0) > \chi^2_p(\alpha). \]
If $\Sigma$ is unknown, the natural statistic to use is
\[ T^2 = N (\bar{x} - \mu_0)' S^{-1} (\bar{x} - \mu_0). \]
We need to justify its use, and find its null distribution.
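As a concrete illustration, the statistic can be computed directly from an $N \times p$ data matrix. The following is a minimal sketch assuming NumPy; the function name `hotelling_t2` and the simulated data are illustrative choices, not part of the lecture.

```python
import numpy as np

def hotelling_t2(x, mu0):
    """Return T^2 = N (xbar - mu0)' S^{-1} (xbar - mu0) for an N x p data matrix x."""
    x = np.asarray(x, dtype=float)
    n_obs = x.shape[0]
    xbar = x.mean(axis=0)
    s = np.cov(x, rowvar=False)              # sample covariance, divisor N - 1
    d = xbar - np.asarray(mu0, dtype=float)
    # Solve S w = d instead of forming S^{-1} explicitly.
    return float(n_obs * d @ np.linalg.solve(s, d))

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))                 # simulated sample, N = 20, p = 3
t2 = hotelling_t2(x, np.zeros(3))
```

Large values of `t2` are evidence against $H_0$; the critical value comes from the null distribution derived later.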
NC STATE UNIVERSITY
Statistics 784
Multivariate Analysis
Likelihood Ratio Test
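The likelihood function is
\[ L(\mu, \Sigma) = \frac{1}{\left(\sqrt{\det 2\pi\Sigma}\right)^{N}} \exp\left[ -\frac{1}{2} \sum_{\alpha=1}^{N} (x_\alpha - \mu)' \Sigma^{-1} (x_\alpha - \mu) \right]. \]
The generalized likelihood ratio statistic is
\[ \lambda = \frac{\max_{\Sigma} L(\mu_0, \Sigma)}{\max_{\mu, \Sigma} L(\mu, \Sigma)}. \]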
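Write $\hat\Sigma_\omega$ for the MLE of $\Sigma$ under $H_0$, when $\mu = \mu_0$, and $\hat\Sigma_\Omega$ for the MLE of $\Sigma$ under the alternative hypothesis, when $\mu$ is unconstrained.
Then
\[ \hat\Sigma_\omega = \frac{1}{N} \sum_{\alpha=1}^{N} (x_\alpha - \mu_0)(x_\alpha - \mu_0)' \]
and
\[ L(\mu_0, \hat\Sigma_\omega) = \max_{\Sigma} L(\mu_0, \Sigma) = \frac{1}{\left(\sqrt{\det 2\pi\hat\Sigma_\omega}\right)^{N}} \times e^{-\frac{1}{2}pN}. \]
Similarly
\[ \hat\Sigma_\Omega = \frac{1}{N} \sum_{\alpha=1}^{N} (x_\alpha - \bar{x})(x_\alpha - \bar{x})' \]
and
\[ L(\bar{x}, \hat\Sigma_\Omega) = \max_{\mu, \Sigma} L(\mu, \Sigma) = \frac{1}{\left(\sqrt{\det 2\pi\hat\Sigma_\Omega}\right)^{N}} \times e^{-\frac{1}{2}pN}. \]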
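The closed form for the maximized likelihood can be checked numerically on the log scale. This is a sketch assuming NumPy; the helper `log_lik` and the names `sigma_h0` and `sigma_alt` (standing for $\hat\Sigma_\omega$ and $\hat\Sigma_\Omega$) are introduced here for illustration only.

```python
import numpy as np

def log_lik(x, mu, sigma):
    """Log of L(mu, Sigma) for an N x p data matrix x."""
    n_obs = x.shape[0]
    r = x - mu
    _, logdet = np.linalg.slogdet(2 * np.pi * sigma)
    # Sum over alpha of (x_alpha - mu)' Sigma^{-1} (x_alpha - mu).
    quad = np.einsum("ai,ij,aj->", r, np.linalg.inv(sigma), r)
    return -0.5 * n_obs * logdet - 0.5 * quad

rng = np.random.default_rng(1)
x = rng.normal(size=(15, 2))                 # simulated data, N = 15, p = 2
n_obs, p = x.shape
mu0 = np.zeros(p)
xbar = x.mean(axis=0)

sigma_h0 = (x - mu0).T @ (x - mu0) / n_obs       # MLE of Sigma when mu = mu0
sigma_alt = (x - xbar).T @ (x - xbar) / n_obs    # MLE of Sigma, mu unconstrained

def closed_form(sigma_hat):
    """log of (sqrt(det 2 pi Sigma_hat))^{-N} * exp(-pN/2)."""
    return -0.5 * n_obs * np.linalg.slogdet(2 * np.pi * sigma_hat)[1] - 0.5 * p * n_obs

max_h0 = log_lik(x, mu0, sigma_h0)
max_alt = log_lik(x, xbar, sigma_alt)
```

Here `max_h0` and `max_alt` should agree with `closed_form(sigma_h0)` and `closed_form(sigma_alt)` up to rounding, and `max_alt` is never smaller than `max_h0`.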
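So the generalized likelihood ratio statistic is
\[ \lambda = \frac{\left(\sqrt{\det 2\pi\hat\Sigma_\Omega}\right)^{N}}{\left(\sqrt{\det 2\pi\hat\Sigma_\omega}\right)^{N}} = \left\{ \frac{\det\left[ \sum_{\alpha=1}^{N} (x_\alpha - \bar{x})(x_\alpha - \bar{x})' \right]}{\det\left[ \sum_{\alpha=1}^{N} (x_\alpha - \mu_0)(x_\alpha - \mu_0)' \right]} \right\}^{\frac{1}{2}N}. \]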
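Now
\[ \sum_{\alpha=1}^{N} (x_\alpha - \mu_0)(x_\alpha - \mu_0)' = \sum_{\alpha=1}^{N} (x_\alpha - \bar{x})(x_\alpha - \bar{x})' + N(\bar{x} - \mu_0)(\bar{x} - \mu_0)' = (N - 1)S + N(\bar{x} - \mu_0)(\bar{x} - \mu_0)', \]
whence
\[ \det\left[ \sum_{\alpha=1}^{N} (x_\alpha - \mu_0)(x_\alpha - \mu_0)' \right] = \det[(N - 1)S] \left\{ 1 + N(\bar{x} - \mu_0)' [(N - 1)S]^{-1} (\bar{x} - \mu_0) \right\}. \]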
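The determinant step is an instance of the identity $\det(A + uu') = \det(A)(1 + u'A^{-1}u)$ with $A = (N-1)S$ and $u = \sqrt{N}(\bar{x} - \mu_0)$. A quick numerical check, sketched with NumPy on simulated data (all variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=(12, 3))                 # simulated data, N = 12, p = 3
n_obs = x.shape[0]
mu0 = np.array([0.5, -0.2, 0.1])
xbar = x.mean(axis=0)
d = xbar - mu0

a = (x - xbar).T @ (x - xbar)                # (N - 1) S
a0 = (x - mu0).T @ (x - mu0)                 # sum of squares and products about mu0

lhs = np.linalg.det(a0)
rhs = np.linalg.det(a) * (1.0 + n_obs * d @ np.linalg.solve(a, d))
```

Both the decomposition `a0 = a + N d d'` and the equality of `lhs` and `rhs` hold up to floating-point rounding.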
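So
\[ \lambda^{\frac{2}{N}} = \frac{1}{1 + N(\bar{x} - \mu_0)' [(N - 1)S]^{-1} (\bar{x} - \mu_0)} = \frac{1}{1 + \dfrac{T^2}{N - 1}}. \]
Since $\lambda$ is a monotone decreasing function of $T^2$, they define equivalent tests: rejecting for small $\lambda$ is the same as rejecting for large $T^2$.
That is, the test based on $T^2$, known as Hotelling’s $T^2$, is equivalent to the generalized likelihood ratio test.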
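The relation between $\lambda$ and $T^2$ can be confirmed numerically. The sketch below assumes NumPy and simulated data; `sigma_h0` and `sigma_alt` are illustrative names for the two covariance MLEs.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=0.3, size=(25, 4))        # simulated data, N = 25, p = 4
n_obs, p = x.shape
mu0 = np.zeros(p)
xbar = x.mean(axis=0)

sigma_alt = (x - xbar).T @ (x - xbar) / n_obs    # MLE with mu unconstrained
sigma_h0 = (x - mu0).T @ (x - mu0) / n_obs       # MLE under H0

# lambda^(2/N) is the ratio of the two determinants.
lam_2_over_n = np.linalg.det(sigma_alt) / np.linalg.det(sigma_h0)

s = np.cov(x, rowvar=False)                  # divisor N - 1
t2 = n_obs * (xbar - mu0) @ np.linalg.solve(s, xbar - mu0)
from_t2 = 1.0 / (1.0 + t2 / (n_obs - 1))
```

The two quantities `lam_2_over_n` and `from_t2` agree up to rounding, and both lie in $(0, 1]$.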
Invariance
If $X \sim N_p(\mu, \Sigma)$ and $X^* = CX$ for a nonsingular $C$, then $X^* \sim N_p(\mu^*, \Sigma^*)$, where
\[ \mu^* = C\mu, \qquad \Sigma^* = C\Sigma C'. \]
The null hypothesis $H_0 : \mu = \mu_0$ is equivalent to the null hypothesis $H_0^* : \mu^* = C\mu_0$.
If $T^2$ is the statistic for testing $H_0$, based on $x_1, x_2, \dots, x_N$, and $T^{*2}$ is the statistic for testing $H_0^*$, based on $x_1^*, x_2^*, \dots, x_N^*$, then $T^{*2} = T^2$.
That is, the statistic $T^2$, and the test based on it, are invariant to the mapping $X \mapsto X^*$, $\mu \mapsto \mu^*$.
The test based on Hotelling’s $T^2$ is the uniformly most powerful test with this invariance property.
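The invariance $T^{*2} = T^2$ is easy to see numerically. A minimal sketch, assuming NumPy; the helper `t2_stat` and the particular matrix `c` are illustrative.

```python
import numpy as np

def t2_stat(x, mu0):
    """Hotelling's T^2 for an N x p data matrix x and hypothesized mean mu0."""
    n_obs = x.shape[0]
    d = x.mean(axis=0) - mu0
    s = np.cov(x, rowvar=False)
    return float(n_obs * d @ np.linalg.solve(s, d))

rng = np.random.default_rng(4)
x = rng.normal(size=(18, 3))                     # simulated data
mu0 = np.array([0.1, 0.0, -0.1])
c = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)    # an arbitrary nonsingular C (assumed example)

t2 = t2_stat(x, mu0)
t2_star = t2_stat(x @ c.T, c @ mu0)              # each row becomes (C x_alpha)'
```

The sample mean maps to $C\bar{x}$ and $S$ to $CSC'$, so the factors of $C$ cancel in the quadratic form and `t2_star` equals `t2` up to rounding.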
Distribution of $T^2$
$T^2$ may be written as $Y' S^{-1} Y$, where $Y \sim N_p(\nu, \Sigma)$, and $S$ is of the form
\[ nS = \sum_{\alpha=1}^{n} Z_\alpha Z_\alpha' \]
where $Z_1, Z_2, \dots, Z_n$ are iid $N_p(0, \Sigma)$, independent of $Y$. Here $Y = \sqrt{N}(\bar{x} - \mu_0)$, $\nu = \sqrt{N}(\mu - \mu_0)$, and $n = N - 1$.
Without loss of generality, we can take $\Sigma = I_p$ (see above, “Invariance”).
Construct a $p \times p$ orthogonal matrix $Q$ with first row $Y' / \sqrt{Y'Y}$, and the remaining rows arbitrary.
Then
\[ QY = \begin{pmatrix} \sqrt{Y'Y} \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \]
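One explicit way to build such a $Q$ is a Householder reflection; this is just one convenient choice, since the remaining rows are arbitrary. The sketch assumes NumPy, and the function name `orthogonal_with_first_row` is hypothetical.

```python
import numpy as np

def orthogonal_with_first_row(y):
    """Return an orthogonal Q whose first row is y'/||y|| (Householder reflection)."""
    y = np.asarray(y, dtype=float)
    p = y.size
    e1 = np.zeros(p)
    e1[0] = 1.0
    v = y / np.linalg.norm(y) - e1
    if np.allclose(v, 0.0):
        return np.eye(p)                     # y already points along the first axis
    return np.eye(p) - 2.0 * np.outer(v, v) / (v @ v)

y = np.array([1.0, -2.0, 2.0])               # example vector with ||y|| = 3
q = orthogonal_with_first_row(y)
qy = q @ y                                   # approximately (3, 0, 0)
```

The reflection is symmetric and orthogonal, and it maps $Y / \sqrt{Y'Y}$ to the first unit vector, which is exactly the property used in the derivation.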
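Hence
\[ T^2 = Y' S^{-1} Y = (QY)' (QSQ')^{-1} (QY) = (Y'Y) \, (QSQ')^{1,1} \]
where $A^{i,j}$ denotes the $(i, j)$ entry of $A^{-1}$.
Write
\[ nQSQ' = B = \begin{pmatrix} b_{1,1} & b_{(1)}' \\ b_{(1)} & B_{2,2} \end{pmatrix} \]
and note that
\[ nQSQ' = B = \sum_{\alpha=1}^{n} (QZ_\alpha)(QZ_\alpha)'. \]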
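The reduction of the quadratic form to a single entry of an inverse matrix can be checked on simulated draws. This sketch assumes NumPy and uses a Householder reflection as one possible choice of $Q$; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
p, n = 3, 12
y = rng.normal(size=p)                       # plays the role of Y
z = rng.normal(size=(n, p))                  # rows are Z_alpha', with Sigma = I_p
s = z.T @ z / n                              # so that n S = sum of Z_alpha Z_alpha'

# One orthogonal Q with first row y'/||y||.
e1 = np.eye(p)[0]
v = y / np.linalg.norm(y) - e1
q = np.eye(p) - 2.0 * np.outer(v, v) / (v @ v)

t2_direct = y @ np.linalg.solve(s, y)                        # Y' S^{-1} Y
t2_rotated = (y @ y) * np.linalg.inv(q @ s @ q.T)[0, 0]      # (Y'Y) (QSQ')^{1,1}

b = n * q @ s @ q.T
b_from_rotated_z = (z @ q.T).T @ (z @ q.T)   # sum of (Q Z_alpha)(Q Z_alpha)'
```

Both pairs (`t2_direct`, `t2_rotated`) and (`b`, `b_from_rotated_z`) agree up to rounding.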
$Q$ is random, but conditionally on $Q$, $QZ_\alpha$ has the same distribution as $Z_\alpha$, so $B$ has the same distribution as $nS$.
Since this conditional distribution does not depend on $Q$, it is also the unconditional distribution, and $B$ is independent of $Q$, and hence of $Y$.
Also
\[ \frac{1}{b^{1,1}} = b_{1,1} - b_{(1)}' B_{2,2}^{-1} b_{(1)} = b_{1,1 \cdot 2, \dots, p}, \]
a sample partial (or residual) variance, which is therefore distributed as $\chi^2$ with $n - (p - 1)$ degrees of freedom.
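The identity for $1/b^{1,1}$ is the Schur complement formula, and the same number is the residual sum of squares from regressing the first coordinate on the other $p - 1$ (without intercept), which is where the $n - (p - 1)$ degrees of freedom come from. A numerical sketch assuming NumPy, with illustrative variable names:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 10, 4
z = rng.normal(size=(n, p))                  # rows are Z_alpha', with Sigma = I_p
b = z.T @ z                                  # B = sum of Z_alpha Z_alpha'

b11 = b[0, 0]
b1 = b[1:, 0]                                # b_(1)
b22 = b[1:, 1:]                              # B_{2,2}
schur = b11 - b1 @ np.linalg.solve(b22, b1)  # b_{1,1 . 2,...,p}
inv_entry = np.linalg.inv(b)[0, 0]           # b^{1,1}

# Residual sum of squares: regress column 1 on columns 2..p, no intercept.
coef, *_ = np.linalg.lstsq(z[:, 1:], z[:, 0], rcond=None)
rss = np.sum((z[:, 0] - z[:, 1:] @ coef) ** 2)
```

All three of `1 / inv_entry`, `schur`, and `rss` coincide up to rounding.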
So finally,
\[ \frac{T^2}{n} = \frac{Y'Y}{b_{1,1 \cdot 2, \dots, p}} \]
is the ratio of a (possibly non-central) $\chi^2$ random variable with $p$ degrees of freedom to an independent central $\chi^2$ random variable with $n - (p - 1)$ degrees of freedom.
Consequently,
\[ \frac{n - (p - 1)}{p} \times \frac{T^2}{n} = \frac{Y'Y / p}{b_{1,1 \cdot 2, \dots, p} / [n - (p - 1)]} \]
has the (possibly non-central) $F$ distribution with $p$ and $n - (p - 1)$ degrees of freedom.
Under the null hypothesis $\mu = \mu_0$, $\nu = 0$, and the distribution is central.
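A small Monte Carlo experiment gives a sanity check on the null distribution. The sketch below assumes NumPy, simulates under $H_0$ with $\Sigma = I_p$, and compares the average of the scaled statistic with the mean $m/(m-2)$ of a central $F(p, m)$ variable, where $m = n - (p - 1)$; the sample sizes and replication count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(6)
n_obs, p, reps = 20, 3, 20000
n = n_obs - 1                                # degrees of freedom of S
m = n - (p - 1)                              # denominator degrees of freedom

f_stats = np.empty(reps)
for i in range(reps):
    x = rng.normal(size=(n_obs, p))          # H0 true: mu = mu0 = 0, Sigma = I_p
    xbar = x.mean(axis=0)
    s = np.cov(x, rowvar=False)
    t2 = n_obs * xbar @ np.linalg.solve(s, xbar)
    f_stats[i] = (m / p) * t2 / n            # [n - (p - 1)] / p  x  T^2 / n

mc_mean = f_stats.mean()
theory_mean = m / (m - 2)                    # mean of central F(p, m), valid for m > 2
```

With these settings `theory_mean` is $17/15$, and `mc_mean` should be close to it up to simulation error. Matching one moment is only a rough check, not a proof of the distributional result.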