POWER FOR A GENERALIZATION OF THE GLMM
WITH FIXED AND RANDOM PREDICTORS
by
Deborah Helen Glueck
Department of Biostatistics
University of North Carolina
Institute of Statistics
Mimeo Series No. 2158T
May 1996
Power for a Generalization of the GLMM
with Fixed and Random Predictors
by
Deborah Helen Glueck
A dissertation submitted to the faculty of the University of North Carolina at Chapel
Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy
in the Department of Biostatistics, School of Public Health
Chapel Hill
1995
© 1996
Deborah Helen Glueck
ALL RIGHTS RESERVED
DEBORAH HELEN GLUECK.
Power for a generalization of the GLMM with fixed
and random predictors. (Under the direction of Dr. Keith E. Muller.)
ABSTRACT
Multivariate models with random predictors are often considered in public health.
Prospective power analysis for such models must take into account all possible stochastic realizations of the predictors. An extended definition is given for General Linear
Multivariate Models with both random and fixed predictors. A classification scheme
for hypotheses is introduced. Those that concern both fixed and Gaussian predictors
are called GLH(F, G), those about Gaussian predictors only are called GLH(G), and
those about fixed predictors are called GLH(F).
Although there are some asymptotic power approximations for tests of the independence of sets of Gaussian random variables (a special case of a GLMM(G), GLH(G)), power for more complicated models has not been widely studied. In the
GLMM(F), the power for some multivariate tests and for the univariate approach to
repeated measures (UNIREP) can be approximated by using noncentral F statistics
(Muller et al., 1992). These results are conditional powers since they depend on the
observed values of the predictors. Unconditional power is the expected value of conditional power with respect to the density of the random noncentrality matrix. The
law of total probability has been used similarly to calculate unconditional power for
general linear univariate models with random Gaussian predictors (Sampson, 1974;
Gatsonis and Sampson, 1989).
For a GLMM(F, G) with one random predictor and a GLH(F, G) with one degree of freedom, small sample unconditional power results are derived for the Hotelling-Lawley and Pillai-Bartlett traces, Wilks' Lambda and the UNIREP tests. For the
GLMM(F, G) and the GLH(G), in the linear case, small sample unconditional power
results are derived for the tests just listed. When the noncentrality has more than one
non-zero eigenvalue, results are given for the Hotelling-Lawley and UNIREP tests.
Asymptotic limits are examined under Pitman local alternatives. Numerical algorithms are developed for some unconditional power results. Comparisons with simulations demonstrate that the unconditional power results are very accurate in many cases, and that the deviations from expected values are due to approximations for conditional power. Unconditional and conditional power results are compared in simulations, and by application to a clinical trial on bone density.
Acknowledgement
I thank my doctoral committee members Drs. G. G. Koch, L. M. LaVange, P.
W. Stewart and D. Ransohoff for their comments and suggestions. I am grateful to
my advisor Dr. Keith E. Muller for his patience, humor, encouragement, advice and
mentoring. My parents and siblings deserve special gratitude for their unswerving
moral support. My wonderful daughter, Sam, spent nine months in utero and her
entire first year raising my spirits as I worked on this dissertation. My husband, Joe
Hoffman, has earned a trip to Jamaica for his help, without which I would never have
made it this far.
CONTENTS

LIST OF TABLES  x

LIST OF FIGURES  xi

1 INTRODUCTION AND LITERATURE REVIEW  1
  1.1 Describing Variables in Models  1
  1.2 Motivation  2
  1.3 Notation of Models and Hypotheses  3
      1.3.1 Notation  3
      1.3.2 Definition of the GLMM(F, G, D)  5
      1.3.3 General Linear Hypotheses: GLH(F), GLH(G) and GLH(F, G)  7
      1.3.4 Definitions  8
      1.3.5 Multivariate Test Statistics  10
      1.3.6 Univariate Approach to Repeated Measures  11
  1.4 Literature Review  12
      1.4.1 Conditional Power Approximations  12
      1.4.2 Unconditional Power in the GLUM with Random Predictors  15
      1.4.3 Power for Tests of Independence  17
      1.4.4 Monte Carlo Evaluations of Power  19
      1.4.5 Power for Mixed Model-like Approaches  19
      1.4.6 Summary  20
  1.5 Overview  21

2 A SPECIAL CASE OF THE GLH(F, G) AND GLMM(F, G)  23
  2.1 Unconditional Power for Multivariate Tests: GLMM(F, G), GLH(F, G)  27
  2.2 Univariate Approach to Repeated Measures  30

3 THE GLH(G) AND GLMM(F, G)  33
  3.1 Introduction  33
  3.2 Lemmas  34
  3.3 Univariate Approach to Repeated Measures  39
      3.3.1 Approximate Unconditional Power: UNIREP  40
  3.4 Hotelling-Lawley Trace  41
      3.4.1 Distribution of the Noncentrality Statistic  41
      3.4.2 Power for the Linear Case  42
      3.4.3 Power for non-Linear Cases  42
  3.5 Wilks' Lambda  43
      3.5.1 Distribution of the Noncentrality Statistic  43
      3.5.2 Power  45
  3.6 Pillai-Bartlett Trace  45
      3.6.1 Distribution of the Noncentrality Statistic  45
      3.6.2 Power  47

4 ASYMPTOTIC POWER FOR PITMAN LOCAL ALTERNATIVES  48
  4.1 Infinite Sample Size and Fixed Design Matrices  48
      4.1.1 GLMM(F)  49
      4.1.2 GLMM(F, G)  50
  4.2 Local Alternatives  51
  4.3 Asymptotic Limits of Conditional Power Approximations  51
      4.3.1 Univariate Approach to Repeated Measures  53
      4.3.2 Hotelling-Lawley Trace  56
      4.3.3 Pillai-Bartlett  56
      4.3.4 Wilks'  57
      4.3.5 Summary and Implications  59
  4.4 Asymptotic Limits of the Unconditional Power Approximations  59
      4.4.1 Discussion  64

5 NUMERICAL EVALUATIONS  65
  5.1 Transformations, Monotonicity and Convergence  65
      5.1.1 GLMM(F, G), GLH(F, G)  66
      5.1.2 GLMM(F, G), GLH(G)  67
  5.2 Analytic Results and Simulations  68
      5.2.1 GLMM(F, G), GLH(F, G)  68
      5.2.2 GLMM(F, G), GLH(G), s = 1, s* = 1  70
      5.2.3 GLMM(F, G), GLH(G), s > 1, s* = 1  71
  5.3 Impact of Approximating Conditional Power  72
  5.4 Comparison of Asymptotic Results to Small Sample Analytic Results Under Pitman Local Alternatives  72
      5.4.1 GLMM(F, G), GLH(G), s = 1, s* = 1  73
      5.4.2 GLMM(F, G), GLH(G), s > 1, s* = 1  73
  5.5 Conditional and Unconditional Power  73

6 POWER ANALYSIS EXAMPLE: BONE DENSITY DATA  99

7 SUMMARY AND DISCUSSION  106
  7.1 Conclusions  106
  7.2 Plans for Future Research  108

A APPENDIX  111
  A.1 Theorem on Quadratic Forms  111
  A.2 Note on Positive Definite Matrices  111
  A.3 Note on Magnitude of the Trace  112
  A.4 Transformations for Wilks' Lambda  113
  A.5 Density of the H-L Scalar Noncentrality Parameter: Special Case  115

Bibliography  116
LIST OF TABLES

1.1 Univariate Approach to Repeated Measures Test Statistics  14
1.2 Multivariate Test Statistics  15
1.3 Literature Review Summary for the GLUM  20
1.4 Literature Review Summary for the GLMM  20

4.1 Limits of Unconditional Power  63

5.1 Hotelling-Lawley. GLMM(F, G), GLH(F, G), Special Case. Theoretical and Empirical Unconditional Power  75
5.2 Geisser-Greenhouse. GLMM(F, G), GLH(F, G), Special Case. Theoretical and Empirical Unconditional Power  76
5.3 Huynh-Feldt. GLMM(F, G), GLH(F, G), Special Case. Theoretical and Empirical Unconditional Power  77
5.4 Conservative. GLMM(F, G), GLH(F, G), Special Case. Theoretical and Empirical Unconditional Power  78
5.5 Hotelling-Lawley. GLMM(F, G), GLH(G), s = 1, s* = 1. Theoretical and Empirical Unconditional Power  79
5.6 Geisser-Greenhouse. GLMM(F, G), GLH(G), s = 1, s* = 1. Theoretical and Empirical Unconditional Power  80
5.7 Huynh-Feldt. GLMM(F, G), GLH(G), s = 1, s* = 1. Theoretical and Empirical Unconditional Power  81
5.8 Conservative. GLMM(F, G), GLH(G), s = 1, s* = 1. Theoretical and Empirical Unconditional Power  82
5.9 Hotelling-Lawley. GLMM(F, G), GLH(G), s > 1, s* = 1. Theoretical and Empirical Unconditional Power  83
5.10 Geisser-Greenhouse. GLMM(F, G), GLH(G), s > 1, s* = 1. Theoretical and Empirical Unconditional Power  84
5.11 Huynh-Feldt. GLMM(F, G), GLH(G), s > 1, s* = 1. Theoretical and Empirical Unconditional Power  85
5.12 Conservative. GLMM(F, G), GLH(G), s > 1, s* = 1. Theoretical and Empirical Unconditional Power  86
5.13 Hotelling-Lawley. GLMM(F, G), GLH(G), s = 1, s* = 1. Asymptotic and Small Sample Unconditional Power. Pitman Local Alternatives  87
5.14 Conservative. GLMM(F, G), GLH(G), s = 1, s* = 1. Asymptotic and Small Sample Unconditional Power. Pitman Local Alternatives  88
5.15 Geisser-Greenhouse. GLMM(F, G), GLH(G), s = 1, s* = 1. Asymptotic and Small Sample Unconditional Power. Pitman Local Alternatives  89
5.16 Huynh-Feldt. GLMM(F, G), GLH(G), s = 1, s* = 1. Asymptotic and Small Sample Unconditional Power. Pitman Local Alternatives  90
5.17 Hotelling-Lawley. GLMM(F, G), GLH(G), s > 1, s* = 1. Asymptotic and Small Sample Unconditional Power. Pitman Local Alternatives  91
5.18 Conservative. GLMM(F, G), GLH(G), s > 1, s* = 1. Asymptotic and Small Sample Unconditional Power. Pitman Local Alternatives  92
5.19 Geisser-Greenhouse. GLMM(F, G), GLH(G), s > 1, s* = 1. Asymptotic and Small Sample Unconditional Power. Pitman Local Alternatives  93
5.20 Huynh-Feldt. GLMM(F, G), GLH(G), s > 1, s* = 1. Asymptotic and Small Sample Unconditional Power. Pitman Local Alternatives  94

7.1 Summary for the GLUM  107
7.2 Summary for the GLMM  107
LIST OF FIGURES

5.1 Muller-Peterson and Empirical Conditional Power. s = 1, s* = 1, N = 10  95
5.2 GLMM(F, G), GLH(G). s = 1, s* = 1. Hotelling-Lawley. N = 10. Unconditional and Conditional Power Curves  96
5.3 GLMM(F, G), GLH(G). s = 1, s* = 1. Hotelling-Lawley. N = 50. Unconditional and Conditional Power Curves  97
5.4 GLMM(F, G), GLH(G). s = 1, s* = 1. Hotelling-Lawley. N = 90. Unconditional and Conditional Power Curves  98
6.1 GLH(F) and GLMM(F, G). Bone density data. Power curves for the effect of gender-treatment interaction. Geisser-Greenhouse statistic. N = 39  104
6.2 GLH(G) and GLMM(F, G). Bone density data. Power curves for the effect of baseline BMD. Geisser-Greenhouse statistic  105
CHAPTER 1
INTRODUCTION AND
LITERATURE REVIEW
1.1 Describing Variables in Models
Variables can be either fixed or random. Random variables take on different values
with specified probabilities. Categorical random variables take on a finite number of
values. For example, blood type is a categorical random variable, which only takes
on the values A, B, AB, and O. Continuous random variables can assume an infinite
number of values. Examples include weight, blood pressure or cholesterol level.
In statistical modeling, variables are classified not only by whether they are fixed
or random, but also by whether they are considered to be outcomes or predictors.
Outcomes are the variables of interest. Predictors are variables that are chosen because they are highly correlated with outcome variables, are believed to influence or
even cause the outcomes of interest, or because they enable one to control for extraneous variation that may modify the outcome. Outcomes in general linear multivariate
models are always considered to be random. Predictors can be fixed or random.
Deciding whether predictors are fixed or random requires careful consideration
of the experimental design. In experiments with random predictors, the subjects
are selected at random, and the values of both the predictors and the outcomes are
discovered. The scientist observes the effect of natural variation. For example, a
scientist might examine the effect of race on blood pressure. In a random sample of
the population, the number of African-American people would be a binomial random
variable. Race would be a random categorical predictor. An epidemiologist wishing to
describe the association between height and weight would measure both in a sample
of subjects. In such a study, weight is a continuous random predictor.
In an experiment with fixed predictors, the scientist decides what the values will
be before the experiment is conducted. An example is a clinical trial, in which a
specified number of people are assigned to control or treatment. Treatment serves as
a fixed categorical predictor. Another example is a dose response study, in which dose
is an increasing percentage of body weight, and the range and number of different
dose levels is fixed before the experiment. Dose is a continuous predictor that assumes
only fixed levels.
Complicated experimental designs can make it more difficult to determine what
is fixed and what is random. For example, in order to ensure adequate data for
subgroup analyses, a sample stratified on race-gender groups may be chosen. In this
case, by design, one must include the fixed variables race and gender as predictors.
1.2 Motivation
Experiments with random predictors are often considered in public health, in both
epidemiological studies and clinical trials. For example, Sambrook et al. (1994) conducted an observational study of bone mineral density (BMD) in patients who were
about to undergo heart transplantation. Patients who undergo transplantation typically lose bone mass due to massive doses of corticosteroids and bed rest. BMD
was measured before and six months after surgery. Serum osteocalcin was used as a
predictor for lumbar spine BMD loss at six months. Serum osteocalcin is a random
predictor. Ontjes et al. (1994) planned a clinical trial in which treatment, gender,
their interaction, and baseline BMD levels will be used to predict BMD levels at six
months and twelve months. Treatment, gender and the interaction are fixed predictors, while baseline BMD is a random predictor.
In analyzing such studies, one can estimate parameters, test hypotheses and
conduct retrospective power analyses by considering all the predictors to be fixed.
They are indeed fixed, once the experiment has been observed. When designing the
size of a study, if the random predictors are considered to be fixed, a problem can
arise. This in fact is an assumption that underlies the power calculations typically
done for the General Linear Multivariate Model (GLMM) and the General Linear
Univariate Model (GLUM). This difference between assumption and truth reflects
the current lack of theory for calculating power for models with random predictors.
Methods with incorrect assumptions may result in sample sizes that are too big and
wasteful, or too small to allow for adequate power. They do not account for the extra
variability in the experiment due to the stochastic variation of the predictors.
In planning an experiment, the interest should lie not in the power calculated for
certain realizations of the predictors, but in the power expected over all the stochastic
realizations of a particular experimental design (Dozier and Muller, 1993). The power
calculated after observing a particular set of predictor values in an experiment will be
called the conditional power. The power averaged over all possible realizations will
be called the unconditional power.
Using the explicit assumption that predictors can be random, and the concept of
unconditional power, this research develops the theory needed to calculate power and
sample size for models with random predictors. This involves extending the definition
of the GLMM to allow for random predictors, examining the properties of the new
model, and deriving small sample and asymptotic unconditional power results. The
new theory will make power calculations more accurate because the assumptions
match the design better. These methods will have immediate applications in selecting
sample sizes for real studies in public health.
1.3 Notation of Models and Hypotheses

1.3.1 Notation
The notation used does not differentiate between the random variable X and a particular realization of it. Usually, when dealing with scalar random variables, one denotes
a random variable X with a capital letter, and a particular realization of it with the
lower case letter x. This conflicts with the matrix convention, in which x is used to denote a column vector and X is used to denote a matrix. Since this research centers
on matrices, it seems better to follow the matrix convention and ignore the random
variable convention. It will be clearly stated in the text whenever the matrix being
considered is a particular realization of a random matrix. This compromise follows
the example of Rao (p.157, 1985).
Define the rc × 1 vector

vec(M) = [m_1' m_2' ⋯ m_c']',

where M is r × c and m_i is the i-th column of M. Define the "left" Kronecker product, so that A ⊗ B = {a_ij B}. Let V(a) be the covariance matrix of any column vector a.
Several distributions are used throughout the text. The matrix normal distribution is defined in Arnold (p. 310, 1981). His notation is adopted here because of its brevity. The matrix normal is defined in terms of the usual vector normal.
Definition 1 A matrix Y (N × p) has the matrix normal distribution, written Y = N_{N,p}(M, S, Σ), with M (N × p), S (N × N), and Σ (p × p), iff vec(Y) = N_{Np}(vec(M), Σ ⊗ S). If Σ ⊗ S is singular, Y has a singular matrix normal distribution, written Y = (S)N_{N,p}(M, S, Σ).

The definition implies that V(Y_ij) = S_ii Σ_jj, and Cov(Y_ij, Y_i'j') = S_ii' Σ_jj'. When either matrix S or Σ is singular, so is the Kronecker product.
One can define a matrix moment generating function for an n × p matrix Y by M_Y(T) = E[exp(tr(Y'T))], with T also n × p. The vector moment generating function is defined by M_vec(Y)(t) = E{exp[(vec(Y))' t]}.
The next lemma demonstrates the equality of the moment generating functions of the vector normal and matrix normal distributions. It shows that the matrix normal is simply a notational convenience.

Lemma 1.1 If vec(Y) = N_{Np}(vec(M), Σ ⊗ S), then Y = N_{N,p}(M, S, Σ).

Proof: Let T (N × p) = [t_1 t_2 ⋯ t_p] be a matrix, and define t (Np × 1) = vec(T) = [t_1' t_2' ⋯ t_p']'. Then

M_vec(Y)(t) = exp{[vec(M)]' t + (1/2) t'(Σ ⊗ S) t}
            = exp{[vec(M)]' vec(T) + (1/2) [vec(T)]'(Σ ⊗ S)[vec(T)]}.

Then by Theorems 2 and 3 in Searle (1982), p. 333,

M_vec(Y)(t) = exp{tr(M'T) + (1/2) tr(T'STΣ)} = M_Y(T),

which is the moment generating function of the matrix normal, as desired. □
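The two identities from Searle invoked in this proof can be checked numerically. The sketch below (not part of the original text; all matrices are arbitrary illustrations) uses NumPy, where `np.kron` has blocks {a_ij B} and so matches the "left" Kronecker product above when vec stacks columns:

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 4, 3

def vec(A):
    # Stack the columns of A into a single vector (column-major order).
    return A.reshape(-1, order="F")

M = rng.standard_normal((N, p))
T = rng.standard_normal((N, p))

# S (N x N) and Sigma (p x p): symmetric positive definite covariance factors.
A = rng.standard_normal((N, N)); S = A @ A.T + N * np.eye(N)
B = rng.standard_normal((p, p)); Sigma = B @ B.T + p * np.eye(p)

# (i)  [vec(M)]' vec(T) = tr(M'T)
lhs1 = vec(M) @ vec(T)
rhs1 = np.trace(M.T @ T)

# (ii) [vec(T)]' (Sigma kron S) vec(T) = tr(T' S T Sigma)
lhs2 = vec(T) @ np.kron(Sigma, S) @ vec(T)
rhs2 = np.trace(T.T @ S @ T @ Sigma)

print(np.isclose(lhs1, rhs1), np.isclose(lhs2, rhs2))
```

The pairing of column-stacking vec with this Kronecker ordering is exactly what makes V(vec(Y)) = Σ ⊗ S in Definition 1.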
The Wishart distribution can be defined conveniently using the matrix normal notation.

Definition 2 Consider W (p × p), where W = Z'Z, and Z = N_{N,p}(M, I_N, Σ). W is said to have a Wishart distribution if Σ is full rank and N ≥ p, a pseudo-Wishart distribution if Σ is full rank and N < p, a singular Wishart distribution if Σ is less than full rank and N ≥ p, and a singular pseudo-Wishart distribution if Σ is less than full rank and N < p. This is written W = W_p(N, Σ, D), where D = M'M.
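Definition 2 can be illustrated by simulation. The sketch below (the matrices M and Σ are invented for illustration) forms W = Z'Z from matrix normal draws and checks the standard first moment E(W) = NΣ + M'M, which is not stated in the text but follows from the row-wise independence of Z:

```python
import numpy as np

rng = np.random.default_rng(1)
N, p, reps = 5, 2, 20000

# Hypothetical mean M and full-rank covariance Sigma.
M = np.array([[1.0, 0.0]] * N)               # N x p mean matrix
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])   # p x p
L = np.linalg.cholesky(Sigma)

# Draw Z = N_{N,p}(M, I_N, Sigma) as Z = M + E L', with E an N x p matrix of
# independent standard normals, then average W = Z'Z over many replications.
W_bar = np.zeros((p, p))
for _ in range(reps):
    Z = M + rng.standard_normal((N, p)) @ L.T
    W_bar += Z.T @ Z / reps

print(np.round(W_bar, 2))
print(N * Sigma + M.T @ M)   # E(W) = N * Sigma + M'M
```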
The C.D.F. of a general random variable M at a point x, with parameters θ_1, θ_2, ..., θ_p, will be written F_M(x; θ_1, θ_2, ..., θ_p). The (100p)th percentile will be written F_M^{-1}(p; θ_1, θ_2, ..., θ_p). The cumulative distribution function (CDF) of a central F with a and b degrees of freedom, evaluated at a point x, will be notated F_F(x; a, b). The CDF of a noncentral F with noncentrality parameter ω, evaluated at a point x, is given by F_F(x; a, b, ω). An infinite series for the noncentral F distribution function and density function appears in Johnson and Kotz (1970, p. 192). The formulas in Abramowitz and Stegun (1972, p. 946) have errors.
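In modern libraries the noncentral F CDF is available directly, so the infinite series need not be coded by hand. A sketch of a conditional power calculation in this notation, with invented degrees of freedom and test size:

```python
from scipy.stats import f, ncf

# Reject when the statistic exceeds the central-F critical value, and evaluate
# the probability of that event under a noncentral F with noncentrality w.
a, b, alpha = 3, 20, 0.05          # hypothetical degrees of freedom and size
fcrit = f.ppf(1 - alpha, a, b)     # F_F^{-1}(1 - alpha; a, b)

def power(w):
    # 1 - F_F(fcrit; a, b, w): upper tail of the noncentral F
    return 1 - ncf.cdf(fcrit, a, b, w)

# With w = 0 the noncentral F reduces to the central F, so power equals alpha.
print(round(power(0.0), 4))
print([round(power(w), 3) for w in (1.0, 5.0, 10.0)])
```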
1.3.2 Definition of the GLMM(F, G, D)
An extension of the usual GLMM, the GLMM(F, G, D), is defined in this section.
F indicates the presence of fixed predictors, G indicates Gaussian predictors, and
D indicates discrete random predictors. The notation closely follows that of Muller
et al. (1992), with suitable extensions made to accommodate the more complicated
predictor matrix.
The most general model considered here, the GLMM(F, G, D), is of the form

Y = X B + ε = F B_F + G B_G + D B_D + ε,   (1.1)

where Y and ε are N × p, X = [F G D] is N × q with F (N × q_F), G (N × q_G) and D (N × q_D), and B_F (q_F × p), B_G (q_G × p) and B_D (q_D × p) stack to form B (q × p). Here q_F, q_G and q_D are the numbers of columns of fixed, multivariate Gaussian, and indicators of discrete random variables respectively, with q_F + q_G + q_D = q.
Without loss of generality, one can sort the predictors in this manner so that they are in groups with a common distribution. The columns of Y are a set of p random response variables. The N rows of Y, X and ε correspond to independent sampling units. For convenience, the word subjects will be used rather than sampling units. This does not affect the generality of the theory. The design matrix, X, has columns of fixed and random predictors. We shall assume that X is of full column rank q, and that each of its submatrices, F, G, and D, is also of full rank. B is assumed to be a fixed and unknown parameter matrix. The columns of B correspond to the columns of Y. The columns of ε, the error matrix, are a set of unobserved random variables.
Let K be the number of different groups of subjects in F. Assume that K is finite. Moreover, assume that the maximum entry in F is finite. Let N_k be the number of subjects in each group in F. Assume that N_k is finite for all k ∈ {1, 2, ..., K}. Both Y and X are assumed to be observed without any appreciable error in measurement. In addition, the actual variables of interest are assumed to be available, and no surrogate variables are used. None of the data are permitted to be missing.
Depending on the choice of predictors and distributional assumptions, there are several different simplifications of the form of the full model. The one most widely recognized is the classical GLMM. In our notation, this is the GLMM(F), and is

Y = F B_F + ε.   (1.2)

In a situation with fixed and multivariate Gaussian predictors only, the model is the GLMM(F, G):

Y = F B_F + G B_G + ε.   (1.3)

A model with fixed and discrete random predictors, the GLMM(F, D), is given by

Y = F B_F + D B_D + ε.   (1.4)

After observing an experiment, conditional on the values of the predictors, all of the models reduce to model 1.2.
As in the usual GLMM, certain assumptions are made about expectations and variances. Because X is random, it is necessary to make assumptions about its moments, in addition to the usual assumptions about Y and ε. Throughout, X and ε are assumed to have finite second moments. This implies that for a given i, where Y = {Y_ij}, i ∈ {1, 2, ..., N}, j ∈ {1, 2, ..., p}, the Y_ij are a set of random variables with finite second moments. Also assume that the expected value of Y is E(Y) = E(X)B, which is equivalent to assuming that E(ε) = 0 and X ⊥ ε. That is, X is statistically independent of ε. Each row of Y and X is assumed to be statistically independent of all other rows. The variance of each row of ε is assumed to be the same. That is, V{[row_i(ε)]'} = Σ for all i ∈ {1, ..., N}. The variance of each row of G is assumed to be the same. That is, for all i ∈ {1, ..., N}, V{[row_i(G)]'} = Σ_G. In addition, assume E(G) = 0. Both Σ and Σ_G are assumed to be positive definite. As a consequence, with probability one, each realization of either ε or G will be of full rank.
Certain distributional assumptions are made. To allow for convenient hypothesis testing, ε is assumed to be distributed N_{N,p}(0, I_N, Σ). The usual assumption about the predictors is that they are all fixed. In contrast, here we assume that G = N_{N,q_G}(0, I_N, Σ_G). The most important reason for this assumption is that many predictors of interest will have an approximately Gaussian distribution in reality. The assumption allows the convenient derivation of exact small sample distributional results for unconditional power.
D is a matrix of indicator variables for the outcomes of a discrete random variable. Consider a predictor that is a discrete random variable that can take on q_D values, say {d_1, d_2, ..., d_{q_D}}, with probabilities {π_1, π_2, ..., π_{q_D}}, where Σ_{k=1}^{q_D} π_k = 1. Without loss of generality, one can order the sample by the outcome of the discrete random variable, so that every subject in the first group has outcome d_1, and so on. This allows one to represent D, arbitrarily using a cell mean coding, as the block diagonal matrix

D = diag(1_{N_1}, 1_{N_2}, ..., 1_{N_{q_D}}).

Here 1_{N_1} is a N_1 × 1 vector with every entry equal to 1, and B_D is the associated q_D × p matrix of coefficients. Then Σ_{k=1}^{q_D} N_k = N, and the vector [N_1 N_2 ⋯ N_{q_D}]' has a multinomial distribution with probability vector [π_1 π_2 ⋯ π_{q_D}]'.
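The cell mean coding of D and the multinomial group sizes can be sketched as follows (the probabilities and sample size below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical discrete predictor with q_D = 3 levels and cell probabilities pi.
pi = np.array([0.2, 0.3, 0.5])
N = 10

# Group sizes [N_1, ..., N_qD] are multinomial(N, pi); sorting subjects by
# outcome lets D be written as a block-diagonal matrix of ones vectors.
Nk = rng.multinomial(N, pi)
D = np.zeros((N, len(pi)))
row = 0
for k, nk in enumerate(Nk):
    D[row:row + nk, k] = 1.0   # indicator (cell mean) coding for group k
    row += nk

print(Nk, D.sum(axis=0), D.sum(axis=1))
```

Each row of D has exactly one 1, and the column sums recover the random group sizes N_k.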
The general model, including discrete random predictors, was introduced because
the GLUM(D) and GLUM(F, D) have been studied previously. (Articles on discrete
random predictors are reviewed in Section 1.4.2.) A general taxonomic notation is
useful in elucidating what has been done, and what remains to be done.
1.3.3 General Linear Hypotheses: GLH(F), GLH(G) and GLH(F, G)
In this section, notation is defined for the various sorts of general linear hypotheses possible with the GLMM(F, G). In a model with both fixed and random predictors, one can test hypotheses that involve only the random, only the fixed, or both the fixed and random predictors, depending on the choice of the C matrix. C (a × q) is the contrast matrix, which creates linear combinations of the coefficients of the predictors. Each kind of general linear hypothesis can be stated as

H_0: Θ = Θ_0,

where Θ = C B U (a × b), and Θ_0 is a known, fixed set of constants. U (p × b) creates linear combinations of the columns of B, which correspond to different dependent variables.
For the GLMM(F, G) (equation 1.3), the GLH(F), a test of fixed predictors only, is of the form

C_1 = [C_F  0],   (1.5)

with C_F (a × q_F), and 0 (a × q_G) a matrix whose entries are all zeros. The GLH(G), a test of random predictors only, is of the form

C_2 = [0  C_G].   (1.6)

A test of both random and fixed predictors, the GLH(F, G), is of the form

C_3 = [C_F  C_G].   (1.7)

Here C_1, C_2 and C_3 are assumed to be of full row rank. Hypotheses can be defined in a parallel fashion for the GLMM(F, D). The GLH(F) and the GLH(G) are special cases of the GLH(F, G). If C_F = 0, and C_G is of full row rank, C_3 reduces to C_2. One can similarly reduce C_3 to C_1.
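The three hypothesis classes differ only in which blocks of C are zero. A small illustration (all matrices invented) of forming C_1, C_2, C_3 and Θ = CBU:

```python
import numpy as np

# Hypothetical design: q_F = 2 fixed predictors, q_G = 1 Gaussian predictor,
# p = 3 repeated measures; all numbers are illustrative only.
qF, qG, p = 2, 1, 3
B = np.arange(1.0, (qF + qG) * p + 1).reshape(qF + qG, p)  # q x p slopes

CF = np.array([[1.0, -1.0]])          # a x qF contrast among fixed effects
CG = np.array([[1.0]])                # a x qG contrast on the Gaussian slope

C1 = np.hstack([CF, np.zeros((1, qG))])   # GLH(F):    [C_F  0]
C2 = np.hstack([np.zeros((1, qF)), CG])   # GLH(G):    [0  C_G]
C3 = np.hstack([CF, CG])                  # GLH(F, G): [C_F  C_G]

U = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])  # p x b within-subject contrasts

# Theta = C B U for each hypothesis class
for C in (C1, C2, C3):
    print(C @ B @ U)
```

Since C_3 = C_1 + C_2 here, the GLH(F, G) parameter is the sum of the other two, which makes the nesting of the hypothesis classes concrete.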
1.3.4 Definitions
With the GLMM(F, G), it is necessary to distinguish carefully between population
parameters, random functions of X, and conditional estimates. Conditional estimates
can be calculated once the values of the predictors are fixed, either by design or by
the actual observation of an experiment.
In data analysis, X is an observed predictor matrix. In the equations that follow this paragraph, X is a fixed set of numbers, not a random matrix. The slopes can be estimated by

B̂ = (X'X)^{-1} X'Y.   (1.8)

The estimate of the variance is given by

Σ̂ = (Y − XB̂)'(Y − XB̂)/(N − q).   (1.9)

C, U, and Θ_0 are fixed matrices defined for hypothesis testing. The estimate of Θ is given by

Θ̂ = C B̂ U.   (1.10)
Other matrices needed to define the estimates and test statistics are given by

M = C(X'X)^{-1}C',   (1.11)
H = (Θ̂ − Θ_0)' M^{-1} (Θ̂ − Θ_0),   (1.12)
E = U'Σ̂U(N − q) = Σ̂_*(N − q),   (1.13)
T = H + E,   (1.14)

and

Ω̂ = H E^{-1}(N − q).   (1.15)
Notice that

Ω̂ = (Θ̂ − Θ_0)' M^{-1} (Θ̂ − Θ_0) (U'Σ̂U)^{-1}.   (1.16)

H is the estimated matrix of sums of squares for the hypothesis. E is the estimated matrix of sums of squares for the error. T is the estimated matrix of sums of squares for the total. Ω̂ is the estimated noncentrality matrix. Although X is considered to be fixed, Y is random. H, E, T and Ω̂ are also random, as they are functions of Y.
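Equations 1.8 through 1.15 can be traced on a toy realization of a GLMM(F, G). Everything below (design, slopes, contrasts) is invented for illustration; with X observed, the computations are ordinary fixed-predictor ones:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative realization: N subjects, two cell-mean coded groups, one
# Gaussian predictor, p = 3 responses.
N, p = 12, 3
F = np.kron(np.eye(2), np.ones((N // 2, 1)))     # group indicators
G = rng.standard_normal((N, 1))
X = np.hstack([F, G]); q = X.shape[1]
B = np.array([[1.0, 1.0, 1.0], [1.0, 2.0, 3.0], [0.5, 0.5, 0.5]])
Y = X @ B + rng.standard_normal((N, p))

C = np.array([[1.0, -1.0, 0.0]])                 # GLH(F): group difference
U = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
Theta0 = np.zeros((1, 2))

Bhat = np.linalg.solve(X.T @ X, X.T @ Y)                            # (1.8)
Sighat = (Y - X @ Bhat).T @ (Y - X @ Bhat) / (N - q)                # (1.9)
Thetahat = C @ Bhat @ U                                             # (1.10)
M = C @ np.linalg.inv(X.T @ X) @ C.T                                # (1.11)
H = (Thetahat - Theta0).T @ np.linalg.inv(M) @ (Thetahat - Theta0)  # (1.12)
E = U.T @ Sighat @ U * (N - q)                                      # (1.13)
Omegahat = H @ np.linalg.inv(E) * (N - q)                           # (1.15)
print(np.round(Omegahat, 3))
```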
For unconditional power analysis, X is considered to be random. Thus, in the equations that follow, it is a random variable, not an observed set of numbers. The population matrices B (the slopes), Σ_G (the variance of the Gaussian portion of X), and Σ (the variance of the error) are assumed to be known, for the purpose of calculating the power. It would perhaps be clearer to denote the latter matrix as Σ_E, to make it similar in appearance to Σ_G, but the desire was to conform to the notation usually used in the GLMM(F). Other matrices needed to define the test statistics and distributions for power analysis are given by

M = C(X'X)^{-1}C'   (a × a),   (1.17)
H = (Θ − Θ_0)' M^{-1} (Θ − Θ_0)   (b × b),   (1.18)
E = U'ΣU(N) = Σ_*(N)   (b × b),   (1.19)
T = H + E   (b × b),   (1.20)
Ω = H E^{-1}(N)   (b × b).   (1.21)

Here, Σ_* = U'ΣU. Also, since E and Ω are population parameters, not estimates, N instead of (N − q) was used in their definition. If any component of X is random, then so are M, H, T and Ω. No matter how X is defined, for power analysis, E is fixed, as it is the product of fixed matrices.
If Ω has only one non-zero eigenvalue, then the situation is described as the linear case. It is of special importance in the literature, since many problems can be solved in exact terms when this occurs. Let s = min(a, b) and rank(Ω) = s*, and note that s* ≤ s.
1.3.5 Multivariate Test Statistics
Power is a phenomenon specific to the choice of a test statistic. Many parametric and nonparametric tests have been considered for the GLMM. The four most common multivariate tests are Roy's largest root, Wilks' likelihood ratio statistic (W), the Hotelling-Lawley trace (HLT) and the Pillai-Bartlett trace (PB). Roy's largest root statistic is the largest eigenvalue of HT^{-1}. Wilks' likelihood ratio test statistic is |ET^{-1}|. The Pillai-Bartlett trace statistic is tr(HT^{-1}). The Hotelling-Lawley trace is tr(HE^{-1}). Notice that the test statistics, like the parameter estimates, are conditional, and use the values from an observed realization of the experiment.

Although all of the multivariate statistics provide an exact size α test when the assumptions are met, there is a uniformly most powerful test only if s = min(a, b) = 1, or under restrictions on E. The choice of a test in each application is dictated by aesthetic preference, or by considerations of power and robustness. Olson (1974, 1976) studied power and robustness by conducting many simulations and reviewing the literature. He differentiated between two situations in considering power. The noncentrality is said to be concentrated when only one of the eigenvalues of the noncentrality matrix is large. It is diffuse when many of the eigenvalues are large. Olson showed that for concentrated noncentrality, the order of the statistics from best to worst power was Roy's largest root, the Hotelling-Lawley trace, Wilks' Lambda and the Pillai-Bartlett trace. For diffuse noncentrality, the order was the Pillai-Bartlett trace, Wilks' Lambda, the Hotelling-Lawley trace, and Roy's largest root. Asymptotically, all four test statistics are equivalent. Studies of various sorts of deviations from normality revealed that Roy's largest root was not very robust, in the sense that it did not achieve the claimed type I error rate. The Pillai-Bartlett trace was the most robust, followed closely by Wilks' Lambda and the Hotelling-Lawley trace. For these reasons Olson suggested using the Pillai-Bartlett trace as the default test statistic.
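All four statistics are functions of the eigenvalues λ_i of HE^{-1}: the Hotelling-Lawley trace is Σ λ_i, the Pillai-Bartlett trace is Σ λ_i/(1 + λ_i), Wilks' Lambda is Π 1/(1 + λ_i), and Roy's statistic is λ_max/(1 + λ_max). These standard identities, which are not derived in the text, can be verified on an invented H and E:

```python
import numpy as np

# Toy H (symmetric) and E (symmetric positive definite), illustrative only.
H = np.array([[4.0, 1.0], [1.0, 2.0]])
E = np.array([[3.0, 0.5], [0.5, 1.5]])
T = H + E

lam = np.linalg.eigvals(H @ np.linalg.inv(E)).real   # eigenvalues of HE^{-1}

hlt = np.trace(H @ np.linalg.inv(E))                 # Hotelling-Lawley trace
pb = np.trace(H @ np.linalg.inv(T))                  # Pillai-Bartlett trace
w = np.linalg.det(E @ np.linalg.inv(T))              # Wilks' Lambda
roy = np.max(np.linalg.eigvals(H @ np.linalg.inv(T)).real)  # Roy's largest root

print(np.isclose(hlt, lam.sum()),
      np.isclose(pb, (lam / (1 + lam)).sum()),
      np.isclose(w, (1 / (1 + lam)).prod()),
      np.isclose(roy, lam.max() / (1 + lam.max())))
```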
Stevens (1979) contested this view. He said that the Hotelling-Lawley trace,
the Pillai-Bartlett trace and Wilks' Lambda were equally good for the concentrated
noncentrality likely to occur in practical situations" He agreed with Olson that the
Pillai-Bartlett trace was the most appropriate for diifuse non-centrality. Olson (1979)
reanalyzed his simulations, and again recommended the Pillai-Bartlett trace. He
argued that one should not base the choice of the test statistic on the basis of the
outcome to be tested, and hence always should choose the same test. The PillaiBartlett test will sometimes be more powerful than the other tests and sometimes
less powerful, but consistently more robust.
Some authors have sought a mathematical reason for choosing a test. For example, if Σ is known, then the uniformly most powerful, invariant, size α test rejects
the hypothesis for improbably large values of tr(HΣ⁻¹) (Arnold, pp. 363-364).
John (1971) suggested that Pillai's trace is the best one, as it is the test that maximizes the directional derivative with respect to the population parameters. Roy's
largest root is an example of the union-intersection principle, and Wilks' Lambda is
the likelihood ratio test. In this work, we will only consider the Pillai-Bartlett trace,
the Hotelling-Lawley trace, and Wilks' Lambda.
1.3.6 Univariate Approach to Repeated Measures
In practice, any study design that may be analyzed with the multivariate approach
to repeated measures may also be analyzed with the univariate approach to repeated
measures. The validity of the analysis depends on stringent assumptions about the
covariance matrix. They include sphericity of Σ* and a completely balanced design
in the within-subject dimension. The X matrix may be like that of the GLMM(F, D),
the GLMM(F, G), or the GLMM(F, G, D). The outcome matrix is Y = {y_ij}, with
i ∈ {1, 2, ..., N} indexing subjects, and j ∈ {1, 2, ..., p} indexing the repeated
measurements made on one subject.
A summary of the univariate approach to repeated measures for the analysis of
the GLMM(F) was given by Muller and Barton (1989), and is paraphrased here. The
univariate test statistic is defined by

F_u = [tr(H)/(ab)] / {tr(E)/[b(N − q)]}.   (1.22)

Under the null hypothesis, F_u ~ F[ab, b(N − q)] if Σ* = σ²I (sphericity). Compound
symmetry of Σ and the following conditions are sufficient:
1. U is assumed to be proportional to an orthonormal matrix,
2. b < p, and
3. if b > 1 then U1 = 0. (Here 1 is a b × 1 vector whose entries are all 1.)
If sphericity does not hold, the univariate statistic is no longer exactly distributed
as an F. This observation led to the creation of three other approximate F statistics:
the conservative, the Geisser-Greenhouse (GG), and the Huynh-Feldt (HF). These
statistics do not require such stringent assumptions about the covariance matrix. The
three tests do not provide exact size α tests and are all biased to varying degrees.
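As a minimal numerical sketch of Equation 1.22, the statistic and its p value under the null reference distribution F[ab, b(N − q)] can be computed directly. The matrices H and E and the dimensions below are illustrative values, not taken from the text.

```python
# Sketch of the univariate (UNIREP) statistic of Equation 1.22:
# F_u = [tr(H)/(ab)] / {tr(E)/[b(N - q)]}, compared to F(ab, b(N - q)).
# H, E, a, b, N, and q are illustrative, not from the text.
import numpy as np
from scipy import stats

a, b, N, q = 1, 3, 40, 2                      # hypothesis rows, contrast cols, N, rank of X
rng = np.random.default_rng(0)
H = np.diag(rng.uniform(1.0, 2.0, size=b))    # hypothesis SSCP matrix (illustrative)
E = np.eye(b) * (N - q)                       # error SSCP under sphericity (illustrative)

F_u = (np.trace(H) / (a * b)) / (np.trace(E) / (b * (N - q)))
p_value = stats.f.sf(F_u, a * b, b * (N - q))
print(F_u, p_value)
```

With these illustrative inputs the error term is scaled so that the denominator of F_u equals one, which makes the arithmetic easy to follow by hand.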
It is recognized that the univariate approach to repeated measures has been
replaced in some applications by the use of stratified survey analysis techniques like
those implemented in SUDAAN, or by the use of GEE models (La Vange et al., 1994;
Shah et al., 1991; Zeger and Liang, 1986a, 1986b). However, the robustness and error
rate of the methods are not well studied in small samples. In fact, both methods rely
on large sample approximations, like Taylor series, for variance estimates. Because
of this, we will concentrate on theory for the conservative, Geisser-Greenhouse and
Huynh-Feldt statistics only.
1.4 Literature Review

1.4.1 Conditional Power Approximations
A complete review of power approximations for the GLMM with fixed predictors was
given in Muller et al. (1992). A brief review of the most easily calculated approximations is given here. Many authors considered only Wilks' likelihood ratio statistic,
the Hotelling-Lawley trace and the Pillai-Bartlett trace, which will be referred to as
the three most popular tests.

Local alternatives, i.e., Ω/N rather than Ω, are widely used to demonstrate the
validity of asymptotic approximations for the non-null distributions of the test statistics. Lee (1971b) provided an asymptotically correct approximation to the non-null
distribution of Pillai's trace, expressed in terms of non-central chi-squared distributions. Muirhead (1972a) gave an asymptotically correct expansion of Hotelling's T²
expressed in terms of incomplete Bessel functions. Note that Muirhead considered
only large error degrees of freedom. Fujikoshi (1975) considered cases in which both
the error and hypothesis degrees of freedom are very large. He gave asymptotic results
for the three most popular tests in terms of the Gaussian distribution and its derivatives. Boik (1981) examined the univariate approach to repeated measures under
non-sphericity. He gave power approximations for contrasts with 1 degree of freedom. Kulp and Nagarsenker (1984a) used incomplete beta distributions to express
the asymptotic non-null distribution of Wilks' statistic. Although the approximation
reduces correctly to the exact distribution for p = 1, there seem to be typographical errors in the extended formulas that prohibit actual use of the approximation.
Betz (1987) studied the power of the Hotelling-Lawley trace. He used a method
of moments central Wishart approximation for a noncentral Wishart, and a further
approximation by a central F distribution.
O'Brien and Shieh (unpublished ASA presentation, 1994) gave new noncentral F
approximations for the power of the three most popular tests, and showed that their
power approximations converged in distribution to the correct noncentral F. They
also demonstrated that the Muller and Peterson (1984) power approximations for the
three most popular tests converge to the correct limit. Simulation studies show that
both approximations work well in practice, except in certain noncentrality configurations in very small samples. O'Brien and Shieh use N in the scalar noncentrality
functions; Muller and Peterson use N - r, where r is the rank of X. Thus the power
estimate of O'Brien and Shieh is always slightly larger than the one of Muller and
Peterson.
A more detailed review is given for the approximations summarized in Muller
et al. (1992), since the Muller approximations will be the basis of the unconditional
power approximations derived in later chapters. It is possible that better estimators
of the unconditional power would be obtained by using some of the other conditional
power approximations. However, the Muller et al. approximations provide acceptable
accuracy, make computations convenient through the use of readily available power
software, and permit thinking about all power calculations from a unified viewpoint.
In data analysis, to calculate a p value for the univariate test statistic, one
compares the value of F_u to a central F distribution with ab·ε̂_T numerator degrees of
freedom and b(N − q)·ε̂_T denominator degrees of freedom, where ε̂_T is the estimate of
ε appropriate to test T (see Equations 1.23, 1.26 and 1.27 for definitions). To calculate
the conditional power for the Geisser-Greenhouse and the Huynh-Feldt tests, Muller
and Barton (1989) suggested the following algorithm.
1. Specify a value for Σ*, perhaps by using results from other studies in the field.
From it, calculate ε, where

ε = tr²(Σ*)/[b tr(Σ*²)].   (1.23)

2. Calculate the noncentrality. Here

ω_u = ab F_a ε,   (1.24)

where

F_a = [tr(H)/(ab)] / [tr(E)/(bN)].   (1.25)

3. Calculate the critical value, using the formulas in Table 1.1.

4. Compute the power for test T as the probability that a noncentral F with
noncentrality ω_u and the degrees of freedom shown in Table 1.1 exceeds the
critical value.
A different critical value is used for each statistic, as shown in Table 1.1.

Table 1.1: Univariate Approach to Repeated Measures Test Statistics

Test name      Estimate of ε   Approx H0 distribution   Critical value
Exact          1               F[ab, b(N − q)]          F⁻¹[1 − α; ab, b(N − q)]
Conservative   1/b             F[a, N − q]              F⁻¹[1 − α; a, N − q]
GG             ε̂              F[ab ε̂, b(N − q) ε̂]     F⁻¹[1 − α; ab E(ε̂), b(N − q) E(ε̂)]
HF             ε̃              F[ab ε̃, b(N − q) ε̃]     F⁻¹[1 − α; ab E(ε̃), b(N − q) E(ε̃)]
In Table 1.1,

ε̂ = tr²(Σ̂*)/[b tr(Σ̂*²)],   (1.26)

where Σ̂* is defined in Equation 1.16, and

ε̃ = min{ [N b ε̂ − 2] / [b(N − q − b ε̂)], 1 }.   (1.27)
Approximations for E(ε̂) and E(ε̃) are given in Muller and Barton (1989). Note that
Muller and Barton did not explicitly discuss approximating power for the conservative
test. The estimate that appears in the table above is similar to their suggestions for
the other tests.
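The Muller and Barton algorithm above can be sketched numerically. As an assumption for illustration, the sketch substitutes ε itself for the expectations E(ε̂) in the critical-value degrees of freedom (Muller and Barton give approximations for those expectations), and Σ*, H, E, and the dimensions are illustrative values, not taken from the text.

```python
# Sketch of the Muller-Barton UNIREP power steps (Equations 1.23-1.25),
# with a GG-style critical value. Population epsilon is used in place of
# E(eps-hat), an illustrative simplification. All inputs are illustrative.
import numpy as np
from scipy import stats

a, b, N, q, alpha = 1, 4, 30, 2, 0.05
Sigma_star = np.diag([4.0, 2.0, 1.0, 1.0])                 # conjectured Sigma*
eps = np.trace(Sigma_star)**2 / (b * np.trace(Sigma_star @ Sigma_star))  # Eq 1.23

H = np.eye(b) * 5.0                                        # conjectured hypothesis SSCP
E = Sigma_star * (N - q)                                   # conjectured error SSCP
F_a = (np.trace(H) / (a * b)) / (np.trace(E) / (b * N))    # Eq 1.25
omega_u = a * b * F_a * eps                                # Eq 1.24

df1, df2 = a * b * eps, b * (N - q) * eps                  # GG-style degrees of freedom
f_crit = stats.f.ppf(1 - alpha, df1, df2)                  # step 3: critical value
power = stats.ncf.sf(f_crit, df1, df2, omega_u)            # step 4: noncentral F power
print(eps, omega_u, power)
```

Note that ε always lies in [1/b, 1], so the computed value provides a quick sanity check on the Σ* specification.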
To approximate power for the multivariate tests, Muller and Peterson (1984)
suggested the following approach, with η and df defined in Table 1.2. The specification
of the noncentrality appears correctly in the POWERLIB software (Keyes, 1992), but
incorrectly in the Muller et al. paper (1992). In the latter, the authors failed to treat
the a = 1 case separately.
1. For test T compute the noncentrality as

ω_T = N η_T / (1 − η_T)   (1.28)

for a = 1, and as

ω_T = df(T) η_T / (1 − η_T)   (1.29)

otherwise.
2. For test T compute the critical value as

f_crit(T) = F⁻¹[1 − α; ab, df(T)].   (1.30)

3. Compute the power for test T as

Power(T) = 1 − F_F[f_crit(T); ab, df(T), ω_T].   (1.31)
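The steps above can be sketched as a single function, taking the association η_T and the degrees of freedom df(T) from Table 1.2 as inputs; the numerical values in the example are illustrative, not from the text.

```python
# Sketch of the Muller-Peterson conditional power steps (Equations 1.28-1.31).
# eta, a, b, N, and df_T are inputs; the example values are illustrative.
from scipy import stats

def mp_power(eta, a, b, N, df_T, alpha=0.05):
    """Noncentral-F power approximation for a multivariate test T."""
    # Equation 1.28 (a = 1) or 1.29 (a > 1): the scalar noncentrality.
    omega = (N if a == 1 else df_T) * eta / (1.0 - eta)
    # Equation 1.30: critical value from the central F.
    f_crit = stats.f.ppf(1 - alpha, a * b, df_T)
    # Equation 1.31: power from the noncentral F.
    return stats.ncf.sf(f_crit, a * b, df_T, omega)

print(mp_power(eta=0.15, a=1, b=3, N=40, df_T=36))
```

As expected, power is monotonically increasing in the association η_T when all other inputs are held fixed.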
Table 1.2: Multivariate Test Statistics

Statistic          Association (η_T)        df(T)
Wilks' Lambda      1 − W^(1/g)              g[(N − q) − (b − a + 1)/2] − (ab − 2)/2
Pillai-Bartlett    PB/s                     s[(N − q) − b + s]
Hotelling-Lawley   (HLT/s)/(1 + HLT/s)      s[(N − q) − b − 1] + 2
W, PB, and HLT are defined in Section 1.3.5. In both Muller and Peterson
(1984) and Muller et al. (1992), for all values of a and b, the quantity g is defined by
g = √[(a²b² − 4)/(a² + b² − 5)]. However, as defined by Rao (c.f. p. 555), it is given by

g = 1 if max(a, b) < 2, and g = √[(a²b² − 4)/(a² + b² − 5)] otherwise.   (1.32)

Rao's definition is the one used here.
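A small sketch of the piecewise definition of g in Equation 1.32, implemented as stated above; the test values are illustrative.

```python
# Sketch of Rao's g (Equation 1.32): g = 1 when max(a, b) < 2, and the
# square-root formula otherwise. Example arguments are illustrative.
import math

def g_rao(a, b):
    if max(a, b) < 2:
        return 1.0
    return math.sqrt((a**2 * b**2 - 4) / (a**2 + b**2 - 5))

print(g_rao(1, 1), g_rao(2, 3), g_rao(1, 4))
```

Note that for a = 1 the square-root formula itself reduces to 1 (when defined), which is consistent with the proof of Lemma 2.5 below taking g = 1.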
1.4.2 Unconditional Power in the GLUM with Random Predictors
Univariate linear models of Gaussian responses and discrete predictors have been
popular in the context of genetics (Jayakar, 1970). Such predictors take on a limited
number of fixed values with known probabilities. Soller and Genizi (1978) discussed
the design of genetic experiments using such models, and suggested large sample
normal approximations to the power. Genizi and Soller (1979) examined similar
models, with and without an intercept. In our notation, they considered the GLUM(D)
and the GLUM(F, D). They derived the distribution of a test for no difference between the discrete values under the null, and proposed a moment based Laguerre
series to approximate the power.
Data analysis methods for univariate models with Gaussian predictors have been
widely studied. Neter, Wasserman and Kutner (1985) gave an overview of the relationships between correlation theory and regression models, and discussed tests for
correlation coefficients. Sampson (1974) provided a more thorough treatment of the
same topic. While in the regression model, the GLUM(F), the predictors are assumed
to be fixed, in the GLUM(G) the predictors are assumed to be multivariate Gaussian.
He compared the GLUM(F) and the GLUM(G), and showed that the maximum likelihood estimates are the same, while the maximum likelihood estimators are different.
Once an experiment has been observed, conditional on the value of the predictors,
the test statistics coincide, as do their distributions under the null hypothesis.
Sampson (1974) derived distinct power functions for the two models. He noted
that with random predictors, the noncentrality parameter is random, and derived its
distribution. He then derived the exact unconditional power of the univariate test in
terms of a ratio of chi-squared random variables in which the numerator has random
degrees of freedom. Although Sampson (1974) suggested extending his result to the
multivariate case, he only provided results for the general linear univariate model.
Anderson (1984, p. 143) gave the result in terms of an exact infinite series representation, by calculating the integral of the product of the density of the noncentrality
and the conditional power function. Moser, Stevens and Watts (1989) used a similar
unconditioning argument in their discussion of the power of the Satterthwaite T test.
Gatsonis and Sampson (1989) also considered the GLUM(G). They used Sampson's earlier results and a series expansion for the multiple correlation coefficient to
compute tables of unconditional power and sample size estimates. They offered a
public access computer program to compute more. They also gave results for the
GLUM(F, G), with C = [0 C_G], the GLH(G). Like Sampson's results, they are
in terms of a ratio of two independent chi-squared random variables, in which the numerator has random degrees of freedom. They suggested that the sample size tables
of Cohen (1977) might be used as approximations to the unconditional power, but
would provide underestimates in small sample size cases.
1.4.3 Power for Tests of Independence
No papers have been found on the GLMM(F, G, D). All articles on multivariate
models with random predictors center on the test of independence between two sets
of multivariate normal random variates. The corresponding model is the GLMM(G),
with the hypothesis given by C = I, and U = I. As in the GLUM with random
predictors, all power results are asymptotic, and given in terms of local alternatives.
Many authors have proposed test statistics for independence and derived their
null and non-null distributions. A brief review was given in Anderson (pp. 376-402,
1984). Many of the results were based on those of Constantine (1963), who gave the
distribution of the canonical correlation coefficients in terms of zonal polynomials,
infinite sums, and multiple products.
Exact results for a few special cases were given in three papers by Pillai and
Jayachandran. The results in these papers should serve as a benchmark for checking
the accuracy of approximate results. However, they did not specify X, B, Θ or Σ,
and their programs were unpublished. In 1967, for p = 2, they gave some exact power
results for the three most popular multivariate tests. From an unpublished report of
Pillai's, they summarized exact power results for Roy's largest root. They considered
p = 2 and p = 3 in the linear case, for small deviations from the null. Numerical
tables were provided that allowed the comparison of the four tests. In 1970, they gave
the exact distribution of Pillai's trace under the null hypothesis for s = 3 and s = 4,
where s is the rank of the noncentrality matrix. The moment generating function
was given in terms of incomplete gamma functions, and the c.d.f. was described in
terms of incomplete beta functions.
Some authors used asymptotic approximations to the distributions of the three
most popular multivariate tests. Lee (1971) gave approximations for the noncentral
distribution of Wilks' likelihood ratio and for both the trace statistics. The expansions are weighted sums of non-central chi-squared distributions, and correspond to a
mixture distribution. When compared numerically to the exact results of Pillai and
Jayachandran, they were accurate to approximately the third decimal place, with
sample sizes of 63 and 83. Unfortunately, the highest power value calculated was
approximately 0.5. It would be interesting to see if the accuracy was as good for power
values useful in experimental design. Fujikoshi (1988a, 1988b) employed a method of
perturbation to derive power approximations.
Some authors concentrated on the likelihood ratio test statistic. Sugiura and
Fujikoshi (1969) suggested a Gaussian approximation, which behaved well only when
the correlations were large. Nagao (1972) examined the power of the likelihood ratio
test for the independence of many sets of multivariate normal vectors. Muirhead
(1972b) gave an expression for the power in terms of mixtures of noncentral chi-squared distribution functions, and provided tables for small values of p and q. This
approximation worked better when the population correlation coefficients were small.
Fujikoshi (1973a) derived another term for the Sugiura and Fujikoshi approximation.
Kulp and Nagarsenker (1984b) suggested complicated approximations to both the
null and non-null distributions of the likelihood ratio criterion in terms of incomplete
non-central beta distributions. Their results reduced to the exact results in the null
case for p = 1 and 2.
Some authors gave asymptotically correct approximations for the power of Roy's
largest root for the test of independence. In 1967, Pillai and Jayachandran gave zonal
polynomial approximations for p ∈ {2, 3, 4} in the linear case. The approximations
are only useful for small deviations from the null. They later extended them to
allow large deviations from the null (Pillai and Jayachandran, 1968). Sugiyama and
Ushizawa (1992) used zonal polynomials to calculate the non-central distribution for
p = 2. They provided tables of their results.
Other authors suggested using different tests. Pillai (1955) developed three new
test criteria based on the harmonic means of the eigenvalues of Hand E, and gave
their null, but not non-null distributions. Pillai and Dotson (1969) gave expansions in
terms of the incomplete non-central beta distribution for the largest root, the smallest
root, and the median root of the noncentrality matrix. The results were tabulated
for p = 2 and 3, and the powers of the various roots were compared. Bagai (1970)
suggested using a test statistic that is the product of the nonzero roots of the matrix
(HT)⁻¹. Williams (1970) examined the test of independence in terms of the multiple
correlation coefficient, and appeared to suggest a method similar to Sampson's for
calculating the unconditional power. He gave results only in the special case of the
GLUM(G).
Most tests of independence between two sets of multivariate Gaussian random
variables share certain properties. Roy and Mikhail (1961) claimed that the power
of Roy's largest root test was a monotonically increasing function of the canonical
correlation statistics. Anderson and Das Gupta (1964) said that Roy and Mikhail
only showed unbiasedness. They went on to demonstrate that the power functions
of the likelihood ratio test and Roy's largest root test are monotonically increasing
functions of the canonical correlation coefficients. Because the tests of independence
depend on the observations only through the squared sample canonical correlation
coefficients, John (1971) was able to describe the set of transformations under which
the tests are invariant. Perlman and Olkin (1980) showed that any test that has a
monotone acceptance region, including any test of independence, is unbiased.
1.4.4 Monte Carlo Evaluations of Power
The most direct way to examine power is through simulations. Olson (1974) considered power results for MANOVA with fixed predictors under various violations of
normality in a massive simulation study. His results were reviewed in Section 1.3.5. Berger
(1986) used Monte Carlo simulations to evaluate the power of Roy's largest root for
MANOVA of longitudinal data with fixed predictors. Habib and Harwell (1989) compared the power of three nonparametric procedures and one normal-theory test for
the independence of two sets of variables. They examined the Puri and Sen L statistic, with either mixed or pure ranks, the Iman and Conover rank transform statistic,
and Rao's asymptotic F approximation for the Wilks' likelihood ratio test. The F
approximation assumes normality, and is calculated on the data, not the ranks. The
simulations used random predictor and error matrices, with varying distributions.
Their results suggested that the normal theory test did not perform well with large
departures from normality, even with large sample size. They advocated use of the
rank transform statistics.
1.4.5 Power for Mixed Model-like Approaches
A formal definition of the mixed model appears in Laird and Ware (1985), and Ware
(1985). It is also discussed by Jennrich and Schluchter (1986). Any GLMM can be
represented as a special case of a Mixed Model with a restricted covariance matrix.
The GLMM with random X is no exception, at least conditionally. The test statistics
usually used for the Mixed Model are the asymptotic Wald tests (Schluchter and
Elashoff, 1990). Because these tests are different from those used for the GLMM,
it is hard to use power results for the mixed model in the GLMM setting. Roebuck
(1982a, 1982b) suggested some GLUM-like exact F tests for a restricted mixed model,
and discussed power issues. Hawkins and Han (1986) examined the power of GLUM-like tests in a random effects covariance model. McCarroll (1987) gave a power
approximation for a stacked data F test for the GLMM, which she then extended to
mixed models with linear covariance structure.
1.4.6 Summary
The literature review is brief because this topic has not been studied extensively.
Tables 1.3 and 1.4 make clear what power results have been derived, and what remains
unknown.
Table 1.3: Literature Review Summary for the GLUM

Model            C             Authors                        Exact or App.
GLUM(F)          All           Multiple                       Exact
GLUM(G)          All           Sampson (1974)                 Exact
GLUM(D)          All           Genizi and Soller (1979)       App.
GLUM(F, G)       C = [0 C_G]   Gatsonis and Sampson (1989)    Exact
GLUM(F, G)       C = …         Not done
GLUM(F, D)       C = …         Genizi and Soller (1979)       App.
GLUM(F, D)       C = …         Not done
GLUM(F, G, D)    All           Not done
Table 1.4: Literature Review Summa.ry for the GLMM
Model
GLMM(F)
GLMM(G)
GLMM(D)
GLMM(F, G)
GLMM(F, G)
GLMM(F, G)
GLMM(F, D)
GLMM(F, G, D)
C
All
C=1
All
C=[O C G
]
Some C = [ C F
Some C = [ C F
All
All
0]
CG
]
Authors
Multiple
Multiple
Not done
This work
This work
This work
Not done
Not done
Exact or App.
Some exact, some app.
App.
Some exact, some app.
Some exact, some app.
The literature on the GLMM in particular is very sparse. There are no results
for the GLMM(F, G), the case that will be examined in this dissertation.
1.5 Overview
Chapter 2 concerns unconditional power for a special case of the GLMM(F, G), with
only one column of random predictors and a special case of the GLH(F, G) with
only one row. Linear models with one column of random predictors often occur in
clinical trials with repeated measures, in which the baseline is used as a covariate.
In such trials, one often tests hypotheses both about treatment, a fixed predictor,
and baseline, a random predictor. These hypotheses can be classified as GLH(F,
G). The distribution of M- 1 is derived, using a theorem about general quadratic
forms. A corollary gives the distribution for a fixed hypothesis. The distribution of
the scalar noncentrality parameter for the Hotelling-Lawley trace is described, and
an expression for the unconditional power is given. Finally, the power for all three
multivariate tests is shown to coincide in this case.
In Chapter 3, exact and approximate analytic small sample power results are
specified for the GLMM(F, G), with C = [0 C_G]. First, a series of lemmas
is proven. The first allows decomposing Wishart matrices into their component
normal distributions. The second reviews scaling and translation of matrix normal
distributions. The next gives the distribution of the trace of noncentral true, pseudo,
singular, or pseudo-singular Wishart matrices. The distribution of M⁻¹ is derived
for a GLH(G). This allows one to specify the distribution of H.
These lemmas are used to examine the power of the multivariate tests, and the
tests for the univariate approach to repeated measures. All of the proofs are similar.
First, the distribution of the scalar noncentrality function is deduced. Then, the
unconditional power is derived.
Chapter 4 covers the asymptotic limits of the F approximations for multivariate
power, both conditional and unconditional. Pitman local alternatives are used to
allow a meaningful comparison of the tests. For the conditional power, the limit of
the F cumulative distribution function is found. The asymptotic limit of H is shown
to be a constant under local alternatives. Then the asymptotic limits of the scalar
noncentrality functions are given for the univariate approach to repeated measures,
and the multivariate tests. The results are shown to agree with Rothenberg's (1971)
expressions, except for a term of order N⁻¹. A similar approach is used with the
unconditional power. Under local alternatives, the random matrix H converges in
probability to a constant. A limit theorem about Lebesgue integrals is applied to the
F approximations and the scalar noncentrality functions to derive the unconditional
powers. These results are shown to agree with those of Lee (1977), up to a term of
order N⁻¹.
In Chapter 5, numerical algorithms are given for some of the unconditional power
results. In addition, this chapter provides details about numerical integrations and
contrasts analytic results to simulations. Chapter 5 also accounts for error due to
the approximation of conditional power, compares asymptotic results to small sample
results under local alternatives, and examines conditional and unconditional power
results. The intent is to evaluate the performance of the algorithms for computing unconditional power, and check their accuracy. Tables and graphs are used to illustrate
the results.
In Chapter 6, several power analyses are performed for a clinical trial on bone
density in cystic fibrosis patients who have had lung transplants. Both a GLH(F)
for the effect of treatment-gender interaction and a GLH(G) for the effect of baseline
bone density are considered. Three analyses are demonstrated. An unconditional
power analysis uses the methods developed in Chapters 2 and 3. A conditional power
analysis uses the values of the random predictor matrix that were actually observed.
A second conditional power analysis uses the observed predictors, but adjusts the
error variance to account for the random predictor. The results are compared.
In Chapter 7, the results are summarized and conclusions drawn. The tentative
titles and abstracts for four planned papers are given. In addition, a long list of topics
for future research is given.
CHAPTER 2
A SPECIAL CASE OF THE GLH(F, G) AND GLMM(F, G)
This chapter is devoted entirely to deriving formulas for power for a special case of
the GLMM(F, G), and a special case of the GLH(F, G). First, the distribution of
M⁻¹ is derived. This is the first step to deriving the distribution of the noncentrality
parameter, and hence the unconditional power.
In the model considered here, G is assumed to be an N × 1 matrix. To conform
to usual notation, we will write it as g in this chapter. Usually, we write V[row_i(G)′] =
Σ_g, but since g is a vector, we will write V(g) = σ_g²I_N. The hypothesis considered
is a special case of the GLH(F, G), with a = 1, and thus only one row in Θ. These
assumptions are reflected in the structure of C = [C_F C_G], in which C_F is 1 × q_F
and C_G is 1 × 1, a scalar. In addition, C_F is assumed to be non-zero. C_G is not
similarly restricted. If C_G is zero, the GLH(F, G) reduces to the GLH(F), and the
hypothesis then concerns only fixed variables. If C_G is nonzero, the hypothesis is still
a GLH(F, G).
This result is different from any of the other ones in the dissertation because it
deals with a special case of the GLH(F, G). More general cases have proven intractable
to this point because the expressions involve multiple integrations of the product of
Wishart and multivariate Gaussian densities. However, this special case is valuable
because it allows the calculation of unconditional power results in GLMM(F, G)
with single covariates. These cases are often treated in clinical trials with repeated
measures, with the baseline used as a covariate. This in fact is the example considered
in Chapter 6, which motivated this derivation.
Lemma 2.2 Consider a GLMM(F, G), with g an N × 1 matrix. For a GLH(F, G),
with C a 1 × q matrix defined by C = [C_F C_G], C_F ≠ 0, M is scalar, and

M⁻¹ = [C_F(F′F)⁻¹C_F′]⁻¹ · 1/[1 + (N − q_F)⁻¹ F(1, N − q_F, ω₁)],   (2.1)

with ω₁ = C_G C_G′/[C_F(F′F)⁻¹C_F′ σ_g²].
Proof: With

X = [F  g],   (2.2)

one can write

X′X = [ F′F   F′g
        g′F   g′g ].

Then

(X′X)⁻¹ = [ B₁₁  B₁₂
            B₂₁  B₂₂ ],

where

B₁₁ = (F′F)⁻¹ + (F′F)⁻¹F′g B₂₂ g′F(F′F)⁻¹
B₁₂ = −(F′F)⁻¹F′g B₂₂
B₂₁ = −B₂₂ g′F(F′F)⁻¹
B₂₂ = [g′g − g′F(F′F)⁻¹F′g]⁻¹.

Since M = C(X′X)⁻¹C′, we can rewrite it as

M = [C_F  C_G] [ B₁₁  B₁₂
                 B₂₁  B₂₂ ] [ C_F′
                              C_G′ ].
To simplify the result, define the 1 × N matrix P by

P = C_F(F′F)⁻¹F′.

Then

M = PP′ + B₂₂[Pg g′P′ − C_G g′P′ − Pg C_G′ + C_G C_G′].

Since Pg and C_G are scalar, we can rearrange terms to obtain an easily factorable
form:

M = PP′ + B₂₂[g′P′Pg − g′P′C_G − C_G′Pg + C_G′C_G]
  = PP′ + B₂₂[(g′P′ − C_G′)(Pg − C_G)].

Define

Q₁ = (g′P′ − C_G′)(Pg − C_G) / (PP′σ_g²).

Notice that one can rewrite B₂₂⁻¹ as a quadratic form in g:

g′g − g′F(F′F)⁻¹F′g = g′[I_N − F(F′F)⁻¹F′]g,

and define

Q₂ = g′[I_N − F(F′F)⁻¹F′]g / σ_g².

Then

M = PP′(1 + Q₁/Q₂).
To find the distribution of M, it remains to consider the joint and marginal
distributions of the two generalized quadratic forms Q₁ and Q₂. Note that the matrix
P contains only constants. A generalized quadratic form, Q_j, is defined by

Q = X′AX + ½(LX + X′L′) + C,

where X is a random variable (Mathai and Provost, 1992). It is called a generalized
quadratic form because it is an extension of the usual quadratic form, defined by

Q = X′BX.

A theorem in Mathai and Provost (1992, p. 286; see the Appendix, Section A.1) allows
determining whether two generalized quadratic forms are independent.

Defining various constants to check the conditions of the theorem, let

A₁ = P′P        L₁ = −2C_G′P            C₁ = C_G′C_G
A₂ = [I − F(F′F)⁻¹F′]   L₂ = 0          C₂ = 0.

The symmetry of A₁, A₂, C₁ and C₂ is obvious by inspection. Then Q_j is
symmetric for all X, since Q_j = Q_j′ iff A_j and C_j are symmetric. By definition, A_j,
L_j, and C_j are real matrices of constants. Since (F′F)⁻¹F′[I − F(F′F)⁻¹F′] = 0,
A₁A₂ = 0 and L₁A₂ = 0. Also, since L₂ = 0, L₂A₁ = 0 and L₁L₂′ = 0. Thus the
quadratic forms are independently distributed.
But what are their marginal distributions? Since [I_N − F(F′F)⁻¹F′] is symmetric, idempotent, and of rank N − q_F,

Q₂ ~ χ²(N − q_F)

(Searle, p. 60, Section 2.5). Now it remains to find the distribution of the other
quadratic form, Q₁. Since

Pg ~ N(0, PP′σ_g²),

it follows that

(Pg − C_G)/(PP′σ_g²)^(1/2) ~ N[−C_G/(PP′σ_g²)^(1/2), 1],

and hence

Q₁ ~ χ²(1, ω₁).

Using these facts, one can derive the distribution of M⁻¹. Since

M = PP′(1 + Q₁/Q₂)  and  Q₁/Q₂ = (N − q_F)⁻¹ F(1, N − q_F, ω₁),

the result follows. □
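Lemma 2.2 can be checked by simulation. The sketch below assumes, as in the derivation above, that g has mean zero; the design F, the contrast C, and σ_g are illustrative choices, not values from the text. It verifies that (N − q_F)[M/(C_F(F′F)⁻¹C_F′) − 1] behaves as an F(1, N − q_F, ω₁) variate by comparing the simulated mean with the noncentral F mean ν(1 + ω₁)/(ν − 2).

```python
# Monte Carlo sketch of Lemma 2.2: with X = [F g], g ~ N(0, sigma_g^2 I),
# the quantity (N - q_F)*(M / PP' - 1) should follow F(1, N - q_F, omega_1).
# F, C, and sigma_g are illustrative, not from the text.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
N, qF, sigma_g = 50, 2, 1.5
F = np.column_stack([np.ones(N), rng.normal(size=N)])   # fixed design, rank q_F
CF, CG = np.array([[0.0, 1.0]]), 0.8                    # C = [C_F  C_G], C_F != 0

PPt = float(CF @ np.linalg.inv(F.T @ F) @ CF.T)         # PP' = C_F (F'F)^{-1} C_F'
omega1 = CG**2 / (PPt * sigma_g**2)

reps = 4000
stat = np.empty(reps)
C = np.append(CF, CG)                                   # full 1 x q contrast row
for r in range(reps):
    g = rng.normal(scale=sigma_g, size=N)               # stochastic predictor column
    X = np.column_stack([F, g])
    M = float(C @ np.linalg.inv(X.T @ X) @ C)           # M = C (X'X)^{-1} C'
    stat[r] = (N - qF) * (M / PPt - 1.0)

nu = N - qF
print(stat.mean(), nu * (1 + omega1) / (nu - 2))        # simulated vs. theoretical mean
```

Since M = PP′ plus a nonnegative term, the simulated statistic is always nonnegative, matching the support of the F distribution.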
This lemma leads directly to a corollary for a fixed hypothesis, for the same
model. If C_G = 0, the GLH(F, G) reduces to the GLH(F), and the noncentral F
reduces to a central F. This agrees with Yang's result (Yang, 1995).
Corollary 1 Consider a GLMM(F, G), with g an N × 1 matrix. For a GLH(F),
with C a 1 × q matrix defined by C = [C_F 0], C_F ≠ 0, M is scalar and

M⁻¹ = [C_F(F′F)⁻¹C_F′]⁻¹ · 1/[1 + (N − q_F)⁻¹ F(1, N − q_F)].   (2.3)
2.1 Unconditional Power for Multivariate Tests: GLMM(F, G), GLH(F, G)
To derive the unconditional power, it is necessary to derive the distribution of the
scalar noncentrality parameter. For a = 1, in the conditional model, the tests coincide, and all the multivariate test statistics are equivalent (Muller et al., 1992).
The unconditional power is a weighted average of the conditional power values. The
F values for the conditional power are the same for all of the multivariate tests.
Thus the unconditional power for all the multivariate tests will coincide iff one can
show that the distribution of the scalar noncentrality parameter is the same for the
Hotelling-Lawley trace, Wilks' Lambda and the Pillai-Bartlett trace.

First, we shall consider the distribution of the scalar noncentrality parameter for
the Hotelling-Lawley trace, ω_HLT, and then use it to derive the unconditional power.
Then we will show that the conditional and unconditional power of all the tests are
equivalent for these conditions.
Lemma 2.3 Consider a GLMM(F, G), with g an N × 1 matrix. If one considers a
GLH(F, G), with C a 1 × q matrix defined by C = [C_F C_G], C_F ≠ 0, then

ω_HLT = K[1 + (N − q_F)⁻¹ F(1, N − q_F, ω₁)]⁻¹,

where

K = N tr[(Θ − Θ₀)E⁻¹(Θ − Θ₀)′] / [C_F(F′F)⁻¹C_F′].   (2.4)
Proof: For the Hotelling-Lawley trace, since a = 1, the scalar noncentrality is given
by

ω_HLT = tr(HE⁻¹) · N.   (2.5)
In the special case of the GLH(F, G), GLMM(F, G), H is a fixed matrix multiplied
by a scalar random variable, and E is fixed. Since M is a scalar random variable,

tr(HE⁻¹) = tr[(Θ − Θ₀)′M⁻¹(Θ − Θ₀)E⁻¹]
         = M⁻¹ tr[(Θ − Θ₀)E⁻¹(Θ − Θ₀)′].

Now, since tr[(Θ − Θ₀)E⁻¹(Θ − Θ₀)′] is constant, and M⁻¹ has a distribution given
by Equation 2.1, the result follows. □
Corollary 2 For the conditions in Lemma 2.3, if the hypothesis only concerns the
fixed variables and C_G = 0, then ω₁ = 0, the noncentral F becomes central, and

ω_HLT = K[1 + (N − q_F)⁻¹ F(1, N − q_F)]⁻¹,

where K is defined in Equation 2.4.
Lemma 2.4 For the conditions in Lemma 2.3, the density of the scalar noncentrality
parameter is given by

f(ω | 1, N − q_F, ω₁) = Σ_{j=0}^∞ [e^(−ω₁/2)(ω₁/2)^j / j!] [β(j + 1/2, (N − q_F)/2)]⁻¹ ×
                        (K − ω)^(j−1/2) ω^((N−q_F)/2 − 1) K^(−[j + (N−q_F−1)/2])   (2.6)

for 0 < ω < K. In 2.6, the shorthand notation ω = ω_HLT has been used for clarity.
Equivalently,

f(ω_HLT | 1, N − q_F, ω₁) = [K(N − q_F)/ω_HLT²] f_F[(K/ω_HLT − 1)(N − q_F); 1, N − q_F, ω₁],

where f_F is the noncentral F density.

Proof: The density is derived in Appendix A.5. □

For a = 1, the F approximation gives the exact power, and the distribution just
derived for the scalar noncentrality parameter is also exact.
Theorem 2.1 For the conditions in Lemma 2.3, the unconditional power for the Hotelling-Lawley trace is given by

    P_u = 1 − ∫_0^K F_F(f_crit; b, df(HLT), ω_HLT) f(ω_HLT) dω_HLT   (2.7)

where f_crit and df(HLT) are defined in Table 1.2, Section 1.4.1.

Proof: The expression follows from the definition of unconditional power and the density given in Lemma 2.4. For finite alternatives and a fixed sample size, the integral in Equation 2.7 is bounded, since F_F(f_crit; b, df(HLT), ω_HLT) < 1 and f(ω_HLT) is bounded, as it is a density. □
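Equation 2.7 is a one-dimensional integral and can be evaluated by numerical quadrature. A sketch, assuming SciPy is available; all numeric inputs (K, b, N, q_F, ω_1) and the df(HLT) form are illustrative placeholders rather than quantities from the text:

```python
# Numerical evaluation of the unconditional power in Equation 2.7 for
# the Hotelling-Lawley trace with a = 1.  The density of w_HLT follows
# from Lemma 2.3 by a change of variables from F(1, N - q_F, omega_1).
import numpy as np
from scipy import integrate, stats

K, b, N, q_F, omega_1 = 5.0, 3, 40, 4, 2.0
q = q_F + 1                        # one Gaussian predictor assumed
df2 = N - q - b + 1                # df2(HLT) = N - r - b + 1 with r = q (assumed form)
alpha = 0.05
f_crit = stats.f.ppf(1 - alpha, b, df2)

def dens_w(w):
    """Density of w = K / [1 + F/(N - q_F)], F ~ F(1, N - q_F, omega_1)."""
    fval = (K / w - 1.0) * (N - q_F)
    jac = K * (N - q_F) / w**2     # |dF/dw|
    return stats.ncf.pdf(fval, 1, N - q_F, omega_1) * jac

def cond_power(w):
    return 1.0 - stats.ncf.cdf(f_crit, b, df2, w)

# Unconditional power: expected conditional power over w in (0, K).
p_u, _ = integrate.quad(lambda w: cond_power(w) * dens_w(w), 0.0, K)
total, _ = integrate.quad(dens_w, 0.0, K)   # sanity: density integrates to ~1
```

The density is supported only on (0, K), so the quadrature is over a bounded interval, matching the boundedness argument in the proof.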
Now, it remains to show that the distribution of the noncentrality parameter for the Pillai-Bartlett trace and Wilks' Lambda is the same.
Lemma 2.5 For the conditions in Lemma 2.3, ω_HLT = ω_PB = ω_W, and thus they have the same distribution.
Proof: One can prove the lemma by demonstrating that the scalar noncentrality
parameters for all three tests are the same in this case. From Table 1 in Muller and
Peterson (1984), if {c_k} is the set of eigenvalues of HE^{-1}, then each test statistic is a function of those eigenvalues. In the linear case, H has only one non-zero eigenvalue, c, and hence so does HE^{-1}. Then

    HLT = c,   W = (1 + c)^{-1},   PB = c(1 + c)^{-1}.
Now calculating the degrees of freedom: since a = 1, then g = 1 by definition, and

    s = min(a, b) = 1.   (2.8)

Then

    df_2(W) = df_2(HLT) = df_2(PB) = N − r − b + 1.
The scalar noncentralities are defined by ω_i = ab · F_{A,i}, where F_{A,i} is computed from η_i, the measure of association for test i, and the degrees of freedom defined above. Calculating them for each test, one obtains:

1. Pillai-Bartlett: η = PB/s = PB. Then

    F_A = (PB/b) / [(1 − PB)/(N − r − b + 1)] = c[(N − r − b + 1)/b].

2. Hotelling-Lawley: η = (HLT/s)/(1 + HLT/s) = c/(1 + c). Then

    F_A = c[(N − r − b + 1)/b].

3. Wilks': η = 1 − W^{1/g} = 1 − W = c/(1 + c). Then

    F_A = [(1 − W)/W][(N − r − b + 1)/b] = c[(N − r − b + 1)/b].

Thus in the linear case, for all three tests, F_A coincides, the degrees of freedom coincide, and so do the scalar noncentrality parameters. Then the distribution of the scalar noncentrality parameters must be the same. □
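The coincidence of the three F_A values in the linear case is a simple algebraic identity in the single eigenvalue c, and is easy to verify numerically; N, r, b, and c below are illustrative values:

```python
# Numeric check that, in the linear case (single nonzero eigenvalue c
# of H E^{-1}, s = 1, g = 1), the three association measures give the
# same F_A value, c * (N - r - b + 1) / b.
N, r, b, c = 25, 4, 3, 0.7
df2 = N - r - b + 1

PB = c / (1 + c)            # Pillai-Bartlett trace
HLT = c                     # Hotelling-Lawley trace
W = 1 / (1 + c)             # Wilks' Lambda

FA_PB = (PB / b) / ((1 - PB) / df2)
FA_HLT = HLT * df2 / b
FA_W = ((1 - W) / W) * (df2 / b)
```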
Theorem 2.2 For the conditions in Lemma 2.3 the unconditional power of the three
multivariate tests is the same.
Proof: This follows directly from Lemma 2.5.
2.2
Univariate Approach to Repeated Measures
Lemma 2.6 For the conditions in Lemma 2.3,

    ω_u = K_u [1 + (N − q_F)^{-1} F(1, N − q_F, ω_1)]^{-1},

where ω_1 = C_G Σ_G C_G' / [C_F (F'F)^{-1} C_F' σ_b], and

    K_u = tr[(Θ − Θ_0)'(Θ − Θ_0)] [C_F(F'F)^{-1}C_F']^{-1} ε b(N − q) [tr(E)]^{-1}.   (2.9)
Proof: The scalar noncentrality for the Muller-Barton approximation for the power of the univariate approach to repeated measures statistics is given by

    ω_u = ε tr(H) b(N − q) [tr(E)]^{-1}.

In this expression, ε and E are fixed, and H is random. Since

    H = (Θ − Θ_0)' M^{-1} (Θ − Θ_0),

and a = 1,

    tr(H) = M^{-1} tr[(Θ − Θ_0)'(Θ − Θ_0)].

By Corollary 1, the distribution of M^{-1} follows, where k = C_G Σ_G C_G' / [C_F (F'F)^{-1} C_F' σ_b]. Then the result follows. □
Corollary 3 For the conditions in Lemma 2.3, if the hypothesis only concerns the fixed variables and C_G = 0, then ω_1 = 0, the noncentral F becomes central, and

    ω_u = K_u [1 + (N − q_F)^{-1} F(1, N − q_F)]^{-1},

where K_u is defined in Equation 2.9.
Lemma 2.7 For the conditions in Lemma 2.3, the density of the scalar noncentrality parameter is

    f(ω_u | 1, N − q_F, ω_1) = K_u(N − q_F) ω_u^{-2} f_F[(K_u/ω_u − 1)(N − q_F); 1, N − q_F, ω_1]   (2.10)

for 0 < ω_u < K_u, where f_F is the noncentral F density with degrees of freedom 1 and N − q_F and noncentrality parameter ω_1. The density has the same form as that given in Equation 2.6, with K replaced by K_u.

Proof: The derivation is exactly parallel to the one in Appendix A.5. □
Theorem 2.3 For the conditions in Lemma 2.3, the approximate unconditional power for the univariate approach to repeated measures statistics is given by

    P_u(T) = 1 − ∫_0^{K_u} F_F(f_crit,T; abε, b(N − q)ε, ω_u) f_{ω_u}(ω_u) dω_u   (2.11)

where T indexes the tests, the critical values are given in Table 1.3, f_{ω_u}(ω_u) is given in Equation 2.10, and K_u is given in Equation 2.9.

Proof: The expression follows from the definition of unconditional power and the density given in Equation 2.10. For finite alternatives and a fixed sample size, this is a bounded integral, since F_F(f_crit,T; abε, b(N − q)ε, ω_u) < 1 and f_{ω_u}(ω_u) is bounded, as it is a density. □
CHAPTER 3
THE GLH(G) AND GLMM(F, G)
3.1
Introduction
An extensive discussion of the power approximations of Muller and Barton (1991),
and Muller et al. (1992) was given in the literature review. One can consider their
results to be power calculated conditionally on the observed values of the predictors.
For an experiment with random predictors, the unconditional power corresponds to
the expected value of the conditional power statistics over all possible stochastic realizations of the predictors. Some of the predictors are assumed to have a multivariate
Gaussian distribution. To calculate the power, one can use the law of total probability, and integrate the conditional power with respect to the density of the random
noncentrality matrix. This approach closely follows that of Sampson (1978) and Gatsonis and Sampson (1979) in the univariate setting. It also resembles the derivation
of the exact power of the Satterthwaite T test, by Moser, Stevens and Watts (1989).
The first step in this process is to note that the conditional power approximations
depend on X only through scalar functions of the eigenvalues of the noncentrality
matrix, n. Specifying the distribution of X allows one to obtain the density of these
scalar noncentrality functions by deriving the distribution of the noncentrality matrix.
In this chapter, this strategy of proof is used for the univariate approach to
repeated measures tests, the Hotelling-Lawley trace, the Pillai-Bartlett trace and
Wilks' Lambda. A new theorem is proven that shows that the trace of a singular or
non-singular, true or pseudo-Wishart is exactly that of a weighted sum of independent
noncentral chi-squared random variables. This allows the derivations of the exact
distributions of the scalar noncentrality functions for the noncentral F approximations
for the Hotelling-Lawley trace and the univariate approach to repeated measures tests
for general s*. Only the linear case, with s*
= 1, is considered for the Pillai-Bartlett
trace and Wilks' Lambda.
To complete the derivation of the unconditional power, it remains only to integrate the noncentral F approximations with respect to the density of the scalar
noncentrality functions. The power approximations reduce to exact results for the
GLUM and, more generally, for the GLMM with s = 1. In those cases, the combination of exact results for the conditional power and exact results for the distribution
of the scalar noncentrality function leads to exact unconditional power results for the
Hotelling-Lawley trace and the GG and HF statistics.
For the GLMM(F, G) with s > 1, the exact distribution of the scalar noncentrality function and the conditional power approximations leads to approximate
unconditional power estimates. In all cases, numerical integration in one dimension
will provide convenient calculations.
3.2
Lemmas
This lemma gives a result about the decomposition of a Wishart into its component
normal distributions.
Lemma 3.8 If S = W_p(n, Σ, Γ), with Σ positive definite and Γ = M'M, then there exists an n × p matrix J = N_{n,p}(M, I_n, Σ) such that J'J = S.

Proof: By construction. Let H be an n × p matrix whose entries h_ij are independent, with h_ij ~ N(0, 1). Then H = N_{n,p}(0, I_n, I_p). Since Σ is positive definite, there exists F_Σ such that F_Σ'F_Σ = Σ. Let J = HF_Σ + M. Then, by Theorem 17.2d of Arnold (1981, p. 312), J = N_{n,p}(M, I_n, Σ), and hence J'J = W_p(n, Σ, Γ), as desired. □

Lemma 3.9 If Y = SN_{n,p}(M, Ψ, Σ), then AYB = SN_{a,b}(AMB, AΨA', B'ΣB).

Proof: This is a well-known result that appears in Arnold (1981) without proof. The proof is given here for completeness. The moment generating function of Y is M_Y(t) = exp[tr(M't)] exp[(1/2)tr(t'ΨtΣ)]. Let Z = AYB. Then

    M_Z(t) = exp[tr(B'M'A't)] exp[(1/2)tr(t'AΨA'tB'ΣB)],

which implies that Z = SN_{a,b}(AMB, AΨA', B'ΣB). □
The next lemma is an extension of a result proved in Muller and Barton (1989). They considered the non-singular case only. Arnold (1981, Theorem 17.14, p. 322) gave a special case of this result, for some non-singular true Wisharts. This lemma gives the distribution of the trace of a noncentral true, pseudo, singular, or pseudo-singular Wishart matrix.
Lemma 3.10 Let S = Y'AY, with A idempotent, rank(A) = ν > 0, and Y = (S)N_{n,b}(M, I_n, Σ), with rank(Σ) = b* ≤ b. Then

    tr(S) = Σ_{k=1}^{b*} λ_{Σk} χ²(ν, λ_{Σk}^{-1} M_{*k}'AM_{*k}) + Σ_{k=b*+1}^{b} M_{*k}'AM_{*k},

where the chi-squared random variables are independent.

Proof: Since Σ is symmetric, we can write the spectral decomposition as Σ = V_Σ Dg(λ_Σ) V_Σ', where V_Σ = [v_{Σ1} v_{Σ2} ... v_{Σb}] is a matrix whose columns are eigenvectors, and {λ_{Σ1}, λ_{Σ2}, ..., λ_{Σb}} are the eigenvalues of Σ. Without loss of generality, we can order the eigenvalues and eigenvectors and write Dg(λ_Σ) with the b* positive eigenvalues in the leading b* × b* block and zeros in the remaining b* × (b − b*), (b − b*) × b*, and (b − b*) × (b − b*) blocks. Note that for V_Σ,

    V_Σ Dg(λ_Σ) V_Σ' = Σ   and   V_Σ V_Σ' = I_b.

Let

    Y_* = Y V_Σ = (S)N_{n,b}[M V_Σ, I_n, V_Σ' Σ V_Σ] = (S)N_{n,b}[M_*, I_n, Dg(λ_Σ)].

Let Y_{*k} be the kth column of Y_* and let M_{*k} be the kth column of M_*. Then, applying the lemma from the appendix of Muller and Barton (1989),

    tr(Y'AY) = tr(V_Σ'Y'AYV_Σ) = tr(Y_*'AY_*) = Σ_{k=1}^{b} Y_{*k}'AY_{*k}.

Because the column covariance matrix Dg(λ_Σ) is diagonal, each term in the sum is independent of the others. Thus, to find the distribution of the sum, it suffices to consider the distribution of the summands.

For k ≤ b*, the Y_{*k} are independent and distributed N_n(M_{*k}, λ_{Σk}I_n), and hence

    Y_{*k}'AY_{*k} = λ_{Σk} χ²(ν, λ_{Σk}^{-1} M_{*k}'AM_{*k}).

For k > b*, Var(Y_{*k}) = 0, which implies that Y_{*k} = M_{*k}, and hence Y_{*k}'AY_{*k} = M_{*k}'AM_{*k}. Hence

    tr(Y'AY) = Σ_{k=1}^{b*} λ_{Σk} χ²[ν, λ_{Σk}^{-1} M_{*k}'AM_{*k}] + Σ_{k=b*+1}^{b} M_{*k}'AM_{*k},

where the χ²[ν, ·] are independent noncentral chi-squared random variables. The second sum is a sum of constants and hence constant.

When b = b*, the Wishart or pseudo-Wishart is non-singular, and the result reduces to that obtained by Muller and Barton in Theorem 1. In that case, the constant term is zero. For M = 0, the chi-squared random variables are central, and tr(Y'AY) = Σ_{k=1}^{b*} λ_{Σk} χ²(ν). □

Note that the algorithms of Davies (1980) may be used for exact calculations. Alternately, Muller and Barton (1989) suggested approximating the weighted sum of independent noncentral chi-squared random variables by a single scaled noncentral chi-squared random variable. Their approximation would need to be generalized slightly to allow for an additive constant.
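The representation in Lemma 3.10 can be checked by simulation, at least in its first moment. The sketch below compares the Monte Carlo mean of tr(Y'AY), for a rank-deficient column covariance, with the mean implied by the weighted-chi-squared-plus-constant form; all numeric inputs are illustrative, and A = I_n is used as a convenient idempotent matrix of rank ν = n:

```python
# Monte Carlo check of the first moment of Lemma 3.10: the trace of
# Y'AY for Y matrix normal with a rank-deficient column covariance
# matches a weighted sum of independent noncentral chi-squares plus an
# additive constant from the degenerate columns.
import numpy as np

rng = np.random.default_rng(0)
n, b, reps = 12, 3, 100_000
nu = n                                       # rank of A = I_n

lam = np.array([2.0, 0.5, 0.0])              # eigenvalues of Sigma; b* = 2
M = rng.normal(size=(n, b))                  # mean matrix

# Direct simulation of tr(Y'AY) with A = I and Sigma = diag(lam).
Y = M[None, :, :] + rng.normal(size=(reps, n, b)) * np.sqrt(lam)
tr_direct = np.einsum('rij,rij->r', Y, Y)

# Mean implied by the lemma: E[lam_k * chi2(nu, delta_k/lam_k)] for
# lam_k > 0, plus the constant M_k'AM_k for degenerate columns.
mean_rep = 0.0
for k in range(b):
    delta = M[:, k] @ M[:, k]
    mean_rep += (lam[k] * nu + delta) if lam[k] > 0 else delta

rel_err = abs(tr_direct.mean() - mean_rep) / mean_rep
```

Checking the full distribution, rather than the mean, would require Davies' algorithm or an empirical-CDF comparison.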
Lemma 3.11 Assume X = [F  G] is N × q, with F fixed and N × q_F, G = N_{N,q_G}(0, I_N, Σ_G) of dimension N × q_G, rank(G) = q_G, rank(F) = q_F, C = [0  C_G] of dimension a × q, and rank(C_G) = a ≤ q_G. Then

    M^{-1} = [C(X'X)^{-1}C']^{-1} = W_a[N − q_F − (q_G − a), (C_GΣ_G^{-1}C_G')^{-1}].
Proof: Here

    X'X = [F'F  F'G; G'F  G'G].

Then

    C(X'X)^{-1}C' = C_G {G'[I − F(F'F)^{-1}F']G}^{-1} C_G'.
Since the following four things are true:

i) [row_i(G)]' = N_{q_G}(0, Σ_G);
ii) [I − F(F'F)^{-1}F'][I − F(F'F)^{-1}F']' = [I − F(F'F)^{-1}F'];
iii) [I − F(F'F)^{-1}F']² = [I − F(F'F)^{-1}F']; and
iv) rank[I − F(F'F)^{-1}F'] = N − q_F;

then by Theorem 17.7, p. 315, of Arnold (1981),

    G'[I − F(F'F)^{-1}F']G = W_{q_G}(N − q_F, Σ_G).

By definition of the inverse Wishart,

    {G'[I − F(F'F)^{-1}F']G}^{-1} = W_{q_G}^{-1}(N − q_F, Σ_G).

Since C_G (a × q_G) is of full rank, there exists a nonsingular q_G × q_G matrix

    C_G* = [C_G; L_G],

with L_G a (q_G − a) × q_G matrix. Then by Lemma 1 of Sampson (1974), C_G*{G'[I − F(F'F)^{-1}F']G}^{-1}C_G*' is also inverse Wishart. By Lemma 2 of Sampson (1974),

    C_G {G'[I − F(F'F)^{-1}F']G}^{-1} C_G' = W_a^{-1}[N − q_F − (q_G − a), C_GΣ_G^{-1}C_G'].

Then

    (C_G {G'[I − F(F'F)^{-1}F']G}^{-1} C_G')^{-1} = W_a[N − q_F − (q_G − a), (C_GΣ_G^{-1}C_G')^{-1}],

and

    M^{-1} = W_a[N − q_F − (q_G − a), (C_GΣ_G^{-1}C_G')^{-1}]

by the definition of Wisharts. □
Lemma 3.12 Consider a GLMM(F, G) with a GLH(G). Define the spectral decomposition of E as E = V_E Dg(λ_E) V_E' = F_E F_E'. Define H_* = F_E^{-1} H F_E^{-T}. Then H = W_b(ν, S), with ν = N − q_F − q_G + a and S = (Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0), and H_* = W_b(ν, F_E^{-1}SF_E^{-T}). Both Wishart distributions are singular if S is, and pseudo if ν < b.

Proof: By Lemma 3.11,

    M^{-1} = W_a[N − q_F − (q_G − a), (C_GΣ_G^{-1}C_G')^{-1}].

By Lemma 3.8, one can decompose M^{-1} into its constituent matrix normal distributions, so M^{-1} = J'J, for J = N_{ν,a}[0, I_ν, (C_GΣ_G^{-1}C_G')^{-1}]. Then H = (Θ − Θ_0)'J'J(Θ − Θ_0). By Lemma 3.9,

    J(Θ − Θ_0) = N_{ν,b}{0, I_ν, (Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0)},

so H = W_b(ν, S); this matrix normal distribution is singular if rank(Θ − Θ_0) < b. Since

    J(Θ − Θ_0)F_E^{-T} = N_{ν,b}{0, I_ν, F_E^{-1}(Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0)F_E^{-T}},

it follows that

    H_* = W_b[ν, F_E^{-1}(Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0)F_E^{-T}]. □
Lemma 3.13 Let Σ_** = F_*'(Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0)F_*. Then rank(Σ_**) = s_*.

Proof: rank(Σ_**) = rank{(Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0)E^{-1}}, by circular permutation. Then rank(Σ_**) = rank(Ω) iff rank(M^{-1}) = rank{[C_GΣ_G^{-1}C_G']^{-1}}; but this holds, since both matrices are of rank a, being non-singular. □
3.3
Univariate Approach to Repeated Measures
Theorem 3.4 If ω is the scalar noncentrality statistic of the noncentral F approximation to the Geisser-Greenhouse and Huynh-Feldt statistics, then ω has the distribution of a linear combination of independent chi-squared random variables.

Proof: Using the noncentral F approximation for the power of the Geisser-Greenhouse and the Huynh-Feldt tests (Muller and Barton, 1989), the noncentrality parameter is given by

    ω_u = ε b(N) tr(H) [tr(E)]^{-1}.

Note that a, b, ε, and the degrees of freedom suggested for calculating the critical value for the Geisser-Greenhouse and Huynh-Feldt tests are all fixed. The degrees of freedom involve only the expected value of the estimate of ε. For a GLMM(F, G), the noncentrality parameter can be written as the product of fixed constants and random variables,

    ω_u = {ε b(N) [tr(E)]^{-1}} tr(H).

The term in brackets is constant; the other term is random. Thus, to find the distribution of ω_u, it suffices to find the distribution of tr(H).
By Lemma 3.11, for C = [0  C_G],

    M^{-1} = [C(X'X)^{-1}C']^{-1} = W_a(N − q_F − (q_G − a), [C_GΣ_G^{-1}C_G']^{-1}).

Let ν_3 = N − q_F − q_G + a. By Lemma 3.8, we can write M^{-1} = J'J, with J = N_{ν_3,a}(0, I_{ν_3}, [C_GΣ_G^{-1}C_G']^{-1}). By Lemma 3.9,

    J(Θ − Θ_0) = (S)N_{ν_3,b}(0, I_{ν_3}, (Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0)) = (S)N_{ν_3,b}(0, I_{ν_3}, D).

Recognize that H = (Θ − Θ_0)'J'J(Θ − Θ_0), and note that rank(D) = s_*. Decompose D as in the proof of Lemma 3.10. Then, taking A = I_{ν_3} and applying Lemma 3.10,

    tr(H) = Σ_{k=1}^{s_*} λ_{Dk} χ²(ν_3).

Then

    ω_u = ε b(N) [tr(E)]^{-1} Σ_{k=1}^{s_*} λ_{Dk} χ²(ν_3),

as desired. □
3.3.1 Approximate Unconditional Power: UNIREP

When s_* = 1, the density of the scalar noncentrality reduces to f_{ω_u}(ω_u) = κ^{-1} f_{χ²}(ω_u/κ; ν_3), where κ = ε b(N)[tr(E)]^{-1} λ, and λ is the nonzero eigenvalue of (Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0). Thus

    P_u = 1 − ∫_0^∞ F_F(f_crit; abε, b(N − q)ε, ω_u) f_{ω_u}(ω_u) dω_u.   (3.1)

For s_* > 1, because the density of a linear combination of independent χ² random variables is not known in closed form, it is not possible to write down an explicit expression for f_{ω_u}(ω_u). However, Davies' (1980) algorithm can be used to generate the CDF, and a numerical derivative can then be used to calculate the density. This will be used to develop an algorithm to calculate the integral

    P_u(T) = 1 − ∫_0^∞ F_F(f_crit,T; abε, b(N − q)ε, ω_u) f_{ω_u}(ω_u) dω_u,

where T indexes the tests, and the critical values are given in Table 1.1.
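Davies' (1980) algorithm works by numerically inverting the characteristic function of the weighted sum. A bare-bones Imhof-type inversion for central chi-squares can stand in for it in a sketch (Davies' own algorithm adds rigorous truncation-error control); the weights and evaluation points below are illustrative:

```python
# Imhof-type numerical inversion of the characteristic function to
# obtain the CDF of Q = sum_k lam_k * chi2(nu), central case.  A
# minimal stand-in for Davies' (1980) algorithm.
import numpy as np
from scipy import integrate, stats

def wsumchi2_cdf(x, lams, nu):
    """P(Q <= x) for Q = sum_k lams[k] * chi2(nu), via Imhof's formula."""
    lams = np.asarray(lams, dtype=float)

    def theta(u):
        return 0.5 * np.sum(nu * np.arctan(lams * u)) - 0.5 * x * u

    def rho(u):
        return np.prod((1.0 + (lams * u) ** 2) ** (nu / 4.0))

    val, _ = integrate.quad(lambda u: np.sin(theta(u)) / (u * rho(u)),
                            0.0, np.inf, limit=200)
    return 0.5 - val / np.pi

# Single-weight sanity check: Q = 1.5 * chi2(4) has a known CDF.
p_imhof = wsumchi2_cdf(6.0, [1.5], 4)
p_exact = stats.chi2.cdf(6.0 / 1.5, 4)
```

A numerical derivative of this CDF, as suggested in the text, then provides the density needed in the power integral.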
3.4
Hotelling-Lawley Trace
3.4.1
Distribution of the Noncentrality Statistic
Theorem 3.5 If ω is the scalar noncentrality parameter for the noncentral F approximation for the Hotelling-Lawley trace, then ω has the distribution of a linear combination of independent chi-squared random variables.

Proof: Using the noncentral F approximation for the power of the Hotelling-Lawley trace (Muller et al., 1992), the noncentrality parameter is given by ω_HLT = tr(HE^{-1}) · N for a = 1, and

    ω_HLT = tr(HE^{-1}) {s[(N − q) − b − 1] + 2}/s

for a > 1. Under the model with random X, H is a random variable, and E is fixed. Define the spectral decomposition of E^{-1} = F_*F_*', with F_* a b × b matrix. Then, using this spectral decomposition and circularly permuting the trace,

    tr(HE^{-1}) = tr(F_*'HF_*) = tr[F_*'(Θ − Θ_0)'M^{-1}(Θ − Θ_0)F_*].

By Lemma 3.11,

    M^{-1} = [C(X'X)^{-1}C']^{-1} = W_a(N − q_F − (q_G − a), [C_GΣ_G^{-1}C_G']^{-1}).

Let ν_3 = N − q_F − q_G + a. By Lemma 3.8, we can write M^{-1} = J'J, with J = N_{ν_3,a}(0, I_{ν_3}, [C_GΣ_G^{-1}C_G']^{-1}). By Lemma 3.9,

    J(Θ − Θ_0)F_* = (S)N_{ν_3,b}{0, I_{ν_3}, F_*'(Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0)F_*} = (S)N_{ν_3,b}(0, I_{ν_3}, Σ_**).

Then we can apply Lemma 3.10, with A = I_{ν_3}. By Lemma 3.13, rank(Σ_**) = s_*. Decompose Σ_** as in the proof of Lemma 3.10. Then

    tr[F_*'(Θ − Θ_0)'M^{-1}(Θ − Θ_0)F_*] = tr[F_*'(Θ − Θ_0)'J'J(Θ − Θ_0)F_*] = Σ_{k=1}^{s_*} λ_{**k} χ²(ν_3).

Thus

    ω_HLT = [df(HLT)/s] Σ_{k=1}^{s_*} λ_{**k} χ²(ν_3),

as desired, with df(HLT) given in Table 1.4. □
3.4.2 Power for the Linear Case

In the linear case, s_* = 1, and

    ω_HLT = [df(HLT)/s] λ_** χ²(ν_3)   (3.2)
          = κ χ²(ν_3),   (3.3)

where λ_** is the single non-zero eigenvalue of Σ_** and κ = [df(HLT)/s] λ_**. The conditional power is given by

    P_c = 1 − F_F[f_crit(HLT); ab, df(HLT), ω_HLT].

Here, df(HLT) and f_crit(HLT) are given in Table 1.4. Then the unconditional power can be written as

    P_u = 1 − ∫_0^∞ F_F[f_crit(HLT); ab, df(HLT), ω_HLT] κ^{-1} f_{χ²}(ω_HLT/κ; ν_3) dω_HLT.   (3.4)

Alternately, substituting x = ω_HLT/κ,

    P_u = 1 − ∫_0^∞ F_F[f_crit(HLT); ab, df(HLT), κx] f_{χ²}(x; ν_3) dx.   (3.5)

3.4.3 Power for Non-Linear Cases

These may be calculated in a manner exactly parallel to the univariate approach to repeated measures tests in Section 3.3.1. Davies' algorithm can be used to examine the CDF of the sum of scaled independent chi-squared random variables.
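For the linear case, the integral in Equations 3.4 and 3.5 runs against a scaled chi-squared density and is straightforward to evaluate numerically. A sketch, with κ, the degrees of freedom, and α as illustrative placeholders:

```python
# Sketch of Equations 3.4-3.5: exact unconditional power for the
# Hotelling-Lawley trace when s* = 1, so that w_HLT = kappa * chi2(nu3).
import numpy as np
from scipy import integrate, stats

kappa, ab, df2, nu3, alpha = 0.4, 3, 33, 34, 0.05
f_crit = stats.f.ppf(1 - alpha, ab, df2)

def integrand(x):
    # Equation 3.5 form: substitute x = w / kappa, so the weight is a
    # central chi-squared density and the noncentrality is kappa * x.
    return stats.ncf.cdf(f_crit, ab, df2, kappa * x) * stats.chi2.pdf(x, nu3)

type_2, _ = integrate.quad(integrand, 0.0, np.inf)
p_u = 1.0 - type_2
```

Because the conditional power never falls below the size of the test for a nonnegative noncentrality, the computed unconditional power should exceed α.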
3.5
Wilks' Lambda
3.5.1
Distribution of the Noncentrality Statistic
Theorem 3.6 Consider a GLMM(F, G) with a GLH(G). Suppose s_* = rank(Ω) = 1. Then

    ω_W = df(W)[(1 + γχ²(ν_3))^{1/g} − 1],

and its density is given in Equation 3.7.
Proof: As noted in Equation 1.29, the scalar noncentrality function for the power approximation for Wilks' Lambda can be written as

    ω_W = df(W) η_W / (1 − η_W).

For Wilks' Lambda, η_W = 1 − W^{1/g}, with W = |ET^{-1}|. Then

    ω_W = df(W)(1 − W^{1/g})/W^{1/g} = df(W)(W^{-1/g} − 1).

Thus the scalar noncentrality function can be expressed as a bijective function of W. To find the distribution of the scalar noncentrality function, the first step is to find the distribution of W. Then it is straightforward to derive the distribution of ω_W from it.
First, we will derive the distribution of W^{-1}. Since W = |ET^{-1}| = |E|/|T|, W^{-1} = |T|/|E|. E = U'ΣU(N − q) is assumed to be fixed and known. However, H is a random variable, as is T = H + E. E is symmetric. Write the spectral decomposition of E as E = U_E Λ_E U_E' = F_E F_E', where U_E U_E' = I and Λ_E = Diag[λ_{E1}, λ_{E2}, ..., λ_{Eb}]. Then

    W^{-1} = |T|/|E| = |T||E^{-1}| = |F_E^{-1}||H + E||F_E^{-T}| = |H_* + I_b|.

Now let λ_i be the eigenvalues of (H_* + I). Then

    |H_* + I| = Π_{i=1}^{b} λ_i.
These eigenvalues are defined by the equation

    |(H_* + I) − λI| = 0,

which can be rewritten as

    |H_* − (λ − 1)I| = 0.   (3.6)

Define λ* = λ − 1. Then we can rewrite Equation 3.6 as |H_* − λ*I| = 0, and realize that {λ*} are the eigenvalues of H_*. Here, rank(H_*) = rank(HE^{-1}) = rank(Ω) = s_*. This means that H_* has s_* non-zero eigenvalues. Then

    W^{-1} = |H_* + I| = Π_{i=1}^{b} λ_i = Π_{i=1}^{s_*} (λ_i* + 1).
Define ν_3 = N − q_F − q_G + 1. When s_* = 1,

    W^{-1} = λ* + 1 = tr(H_*) + 1.

By Lemma 3.12, H_* has a Wishart distribution. Then if γ is the nonzero eigenvalue of F_E^{-1}(Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0)F_E^{-T}, by Lemma 3.10,

    λ* = γχ²(ν_3).

This means that W^{-1} = 1 + γχ²(ν_3). Then the probability density of the scalar noncentrality function ω_W is given by

    f_{ω_W}(ω_W) = g(ω_W/δ + 1)^{g−1} [(ω_W/δ + 1)^g − 1]^{ν_3/2−1} exp{−[(ω_W/δ + 1)^g − 1]/(2γ)} / [δ Γ(ν_3/2) (2γ)^{ν_3/2}]   (3.7)

for 0 < ω_W < ∞, and is 0 elsewhere. Here g is defined in Equation 1.32 and δ = df(W) = g[(N − q) − (b − a + 1)/2] − (ab − 2)/2. The derivations are shown in the Appendix, Section A.4. □
3.5.2
Power
The unconditional power of Wilks' Lambda for the GLMM(F, G), GLH(G), and s_* = 1 is given by

    P_u = 1 − ∫_0^∞ F_F[f_crit(W); df_1(W), df_2(W), ω_W] f_{ω_W}(ω_W) dω_W,

where the critical value and the degrees of freedom are defined in Equations 1.30 and 1.31 and Table 1.2, and f_{ω_W}(ω_W) is given in Equation 3.7. This integral does not have an obvious analytic solution. However, since F_F is a cumulative distribution function and hence less than 1, and since f_{ω_W}(ω_W) is a probability density function, ∫ F_F f_{ω_W} < ∫ f_{ω_W} ≤ 1. Thus, the integral is bounded. A one-dimensional numerical integration should provide sufficient accuracy.
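Instead of integrating the density directly, one can exploit the representation W^{-1} = 1 + γχ²(ν₃) derived above and average the conditional power over simulated chi-squared draws. A Monte Carlo sketch, with γ, g, df(W), the degrees of freedom, and α all illustrative placeholders:

```python
# Monte Carlo sketch of the unconditional power of Wilks' Lambda for
# s* = 1, using w_W = df_W * ((1 + gamma*X)**(1/g) - 1), X ~ chi2(nu3).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
gamma, g, df_W, ab, nu3, alpha = 0.05, 1.6, 30.0, 4, 28, 0.05
reps = 20_000

f_crit = stats.f.ppf(1 - alpha, ab, df_W)
x = rng.chisquare(nu3, size=reps)                 # chi2(nu3) draws
w = df_W * ((1.0 + gamma * x) ** (1.0 / g) - 1.0)  # scalar noncentrality
p_u = float(np.mean(1.0 - stats.ncf.cdf(f_crit, ab, df_W, w)))
```

Monte Carlo averaging and one-dimensional quadrature of Equation 3.7 should agree to within simulation error; the quadrature route is the one the text recommends.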
3.6 Pillai-Bartlett Trace

3.6.1 Distribution of the Noncentrality Statistic
Theorem 3.7 Consider a GLMM(F, G) with a GLH(G). Suppose s_* = rank(Ω) = 1. Then the density of ω_PB is given in Equation 3.8, below.
Proof: As noted in Equation 1.29, the scalar noncentrality function for the power approximation for the Pillai-Bartlett trace can be written as ω_PB = df(PB) η_PB/(1 − η_PB). For the Pillai-Bartlett trace, η_PB = tr(HT^{-1})/s. Then

    ω_PB = df(PB) tr(HT^{-1}) / [s − tr(HT^{-1})],

and

    Pr(ω_PB < a) = Pr{df(PB) tr(HT^{-1}) / [s − tr(HT^{-1})] < a}.

By Note 2 in the Appendix, s − tr(HT^{-1}) > 0. Then

    Pr[ω_PB < a] = Pr[df(PB) tr(HT^{-1}) < as − a tr(HT^{-1})].
Simplifying this expression,

    Pr[ω_PB < a] = Pr{tr(HT^{-1}) < as/[df(PB) + a]}.

Thus, it suffices to consider the distribution of tr(HT^{-1}) to derive the distribution of the scalar noncentrality function. Note that

    tr(HT^{-1}) = Σ_{i=1}^{b} λ_i = Σ_{i=1}^{s_*} λ_i,

with λ_i the eigenvalues of HT^{-1} and s_* its rank. We need to express these eigenvalues of HT^{-1} in terms of a known set of random variables. Recall that the eigenvalues are numbers such that |HT^{-1} − λI| = 0. Since T^{-1} exists, |T| ≠ 0. This allows writing

    0 = |HT^{-1} − λI||T| = |H − λT| = |H − λ(H + E)| = |(1 − λ)H − λE|.

Let E = F_E F_E' be the spectral decomposition of E. Since E is nonsingular, F_E is nonsingular, F_E^{-1} exists, and |F_E^{-1}| ≠ 0. Then

    0 = |F_E^{-1}||(1 − λ)H − λE||F_E^{-T}| = |(1 − λ)H_* − λI|.

Note that λ ≠ 1, since at λ = 1 the determinant equals |−I| ≠ 0. Let λ* = λ/(1 − λ). Then

    |H_* − λ*I| = 0,

where λ* are the eigenvalues of H_*. For s_* = 1, tr(HT^{-1}) = λ = λ*/(1 + λ*). By Lemma 3.12, H_* has a Wishart distribution, and by Lemma 3.10, λ* = γχ²(ν_3), where γ is the nonzero eigenvalue of F_E^{-1}(Θ − Θ_0)'[C_GΣ_G^{-1}C_G']^{-1}(Θ − Θ_0)F_E^{-T}. Then one can calculate the probability distribution function of the scalar noncentrality function by noting that

    Pr(ω_PB < a) = Pr{tr(HT^{-1}) < as/[df(PB) + a]}
                 = Pr{λ < as/[df(PB) + a]}
                 = Pr{λ* < as/[df(PB) + a − sa]}
                 = Pr{χ²(ν_3) < as/[γ(df(PB) + a − sa)]}.

The probability density function of ω_PB can also be found. We can write

    ω_PB = df(PB) tr(HT^{-1}) / [s − tr(HT^{-1})] = df(PB) λ*/[s − (1 − s)λ*] = df(PB) z / [s/γ − (1 − s)z].

Here z = λ*/γ has a χ²(ν_3) distribution. Then, writing ω for ω_PB and δ for df(PB),

    f_{ω_PB}(ω) = f_{χ²}{sω/[γ(δ + (1 − s)ω)]; ν_3} · sδ/[γ(δ + (1 − s)ω)²]   (3.8)

for 0 < ω < −δ/(1 − s), and 0 elsewhere. □

3.6.2 Power
The unconditional power is given by

    P_u = 1 − ∫ F_F[f_crit(PB); df_1(PB), df_2(PB), ω_PB] f_{ω_PB}(ω_PB) dω_PB,   (3.9)

with the critical value and the degrees of freedom defined in Equations 1.30 and 1.31 and Table 1.2, and f_{ω_PB}(ω_PB) given in Equation 3.8. This integral does not have an obvious analytic solution. However, since F_F is a cumulative distribution function and hence less than 1, and since f_{ω_PB}(ω_PB) is a probability density function, ∫ F_F f_{ω_PB} < ∫ f_{ω_PB} ≤ 1. Thus, to calculate the integral in Equation 3.9, a one-dimensional numerical integration should provide sufficient accuracy.
CHAPTER 4
ASYMPTOTIC POWER FOR
PITMAN LOCAL
ALTERNATIVES
This chapter is concerned with the asymptotic limits of the F approximations for
multivariate power, both conditional and unconditional. The apparent contradiction
between infinite sample size and fixed designs is resolved, and the asymptotic limits
of the conditional power approximations are derived, assuming Pitman local alternatives. These limits are shown to be identical to the first term of the power functions
derived by Rothenberg (1977), and presented on page 331 of Anderson (1984). Next,
the asymptotic limits of the unconditional power approximations for the GLMM(G)
are derived, again under Pitman local alternatives. These are shown to coincide with
the first term given by Lee (1971). The implications of the derivations are examined.
4.1
Infinite Sample Size and Fixed Design Matrices
The discussion that follows arises from two sources. Helms (1988) popularized the
concept of the essence X matrix. O'Brien and Shieh (1993) discussed the asymptotic
behavior of X matrices, and introduced a weight matrix, the W matrix. However,
as defined, their weight matrix is in fact dependent on sample size. The definitions
that follow make W explicitly independent of the sample size.
4.1.1
GLMM(F)
Simultaneously considering infinite sample sizes and fixed design matrices seems to
be mutually contradictory. In a purely fixed design, the GLMM(F), the experimenter
decides a priori to select a certain number of subjects from each of a series of groups.
Membership in a group is defined by possessing certain characteristics. For example,
one group might consist of white males. A design with K groups can be represented
by
    F = [ f_1' ⊗ 1_{N_1} ; f_2' ⊗ 1_{N_2} ; ... ; f_K' ⊗ 1_{N_K} ]   (4.1)

where f_k', a 1 × q_F matrix, is the predictor row shared by the subjects in group k, and group k has N_k subjects in it.
It is not required that N_1 = N_2 = ... = N_K. If that occurs, the design is balanced. Let N_e = Σ_{k=1}^{K} N_k be the sample size of the planned experiment, and let N(m) = mN_e = Σ_{k=1}^{K} mN_k, where m is an integer. As m → ∞, N(m) → ∞. Thus, one can think of the sample size becoming infinite in a series of quantized steps, each of size N_e. We will describe this sort of limit as a quantized limit.
The series of quantized steps used in the limits here are the same in spirit as the
limits used by authors like Anderson (1984). For a design with random predictors
only, a complete replicate of the design consists of one observation. For a design with
fixed predictors, replicating the design once corresponds to doubling the sample size.
This leads to quantized limits.
For any fixed design, one can form the essence matrix of F, Fe' by deleting
duplicates, and leaving only the unique rows of the design matrix. Helms (1988)
uses the notation Es(F). O'Brien and Shieh (1993) call the matrix the "exemplary
matrix", and used the notation Fe. Here, we adopt Helms' name and O'Brien's
notation.
With F shown in Equation 4.1, the essence matrix is given by

    F_e = [ f_1' ; f_2' ; ... ; f_K' ].

Note that F is an N_e × q_F matrix, and F_e is a K × q_F matrix, where K is the number of distinct groups or classes of subjects. Let W be a diagonal K × K matrix, whose diagonal is Diag(W) = [N_1, N_2, ..., N_K]. Then F'F = F_e'WF_e. Let F(m) = F ⊗ 1_m be the matrix that has m times as many rows as the original design. Then F(m)'F(m) = mF_e'WF_e. Note the definitions of N_e, W, and F_e are invariant with respect to sample size.
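The identities F'F = F_e'WF_e and F(m)'F(m) = mF_e'WF_e are easy to verify numerically for a small design; the three-group design below is illustrative:

```python
# Numeric check of the essence-matrix identities F'F = Fe' W Fe and
# F(m)'F(m) = m * Fe' W Fe, with Fe the unique rows of F and W the
# diagonal matrix of group counts.
import numpy as np

Fe = np.array([[1.0, 0.0],                 # K = 3 groups, q_F = 2
               [1.0, 1.0],
               [1.0, 2.0]])
Nk = np.array([2, 3, 4])                   # group sizes; balance not required
W = np.diag(Nk.astype(float))

F = np.repeat(Fe, Nk, axis=0)              # full N_e x q_F design, N_e = 9
m = 5
Fm = np.kron(F, np.ones((m, 1)))           # F(m) = F replicated m times

lhs1, rhs1 = F.T @ F, Fe.T @ W @ Fe
lhs2, rhs2 = Fm.T @ Fm, m * (Fe.T @ W @ Fe)
```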
4.1.2
GLMM(F, G)
Again, we consider the sample size becoming large in a series of quantized steps, each of size N_e. Just as the size of F_m increases by N_e rows at each step, so does G_m. We assume G_m = N_{N(m),q_G}(0, I_{N(m)}, Σ_G).

These assumptions about quantized growth are in addition to the ones assumed in Section 1.3.2, and are needed to ensure sensible asymptotic behavior. In fact, these assumptions imply the Huber condition. This condition ensures that the influence of any one observation on its estimated mean is small (Arnold, 1981, p. 143). For A an i × j matrix, define max(A) = max_{i,j} |a_{i,j}|. Then the Huber condition is that

    max[F(m)(F(m)'F(m))^{-1}F(m)'] → 0 as m → ∞.
Lemma 4.14 Consider a GLMM(F) or a GLMM(F, G). Consider the sample size
becoming infinite in a series of quantized steps, each of size N e. Then the Huber
condition holds for F(m) (F(m)'F(m))-l F(m)'.
Proof:

1. First define a series of terms. Let A = (F_e'WF_e)^{-1}. Then (F(m)'F(m))^{-1} = m^{-1}A. Let f_{i,j} be the i,jth entry of F(m), for i ∈ {1, 2, ..., N(m)} and j ∈ {1, 2, ..., q_F}, and let a_{k,l} be the k,lth entry of A, for k ∈ {1, 2, ..., q_F} and l ∈ {1, 2, ..., q_F}.

2. Next examine what is fixed and what diminishes in size as the sample size increases. Notice that max(F_e) < ∞ by the assumptions in Section 1.3.2, and hence max(F) < ∞, as is max(F(m)). Similarly, max(W) < ∞, since N_i < ∞ by assumption. This implies that max(A) < ∞.

3. The conclusion follows, as each entry of F(m)(F(m)'F(m))^{-1}F(m)' = F(m)m^{-1}AF(m)' has the form m^{-1} Σ_{q=1}^{q_F} Σ_{p=1}^{q_F} f_{k,q} a_{q,p} f_{l,p} for some k, l. This approaches 0 as m → ∞, and hence the Huber condition holds. □
4.2
Local Alternatives
For all of the multivariate tests, the conditional power approaches 1 as N → ∞ (Anderson, 1984, p. 330). This means that the unconditional power would also approach 1, since it is defined as an integral with respect to the conditional power. This occurs because N^{-1}E converges to Σ with probability 1, and hence the noncentrality parameter increases without bound. To allow comparing tests, one usually chooses a sequence of alternatives for which the powers of the tests differ. The use of local alternatives is common in the literature: see, e.g., Anderson (1984, p. 330) or Sen and Singer (1993, p. 238). The local Pitman-type alternatives are of the form

    Θ_N = Θ_0 + (Θ − Θ_0)/√N.

For the multivariate tests of interest, under an assumption of quantized limits, this corresponds to considering E = Σ(N(m) − q), (Θ − Θ_0)/√N(m), β/√N(m), and H/N(m) = H_LA, and forming the test statistics from those matrices. Given the interest in power analysis, not data analysis, Θ is assumed to be known, and is not estimated.
4.3
Asymptotic Limits of Conditional Power Approximations
Muller and Peterson (1984) and Muller and Barton (1989) suggested F approximations for conditional power. In this section, the asymptotic limits of the F approximations for the conditional power are derived. The same proof strategy is used for all three tests. First, a series of lemmas is proven. Lemma 4.15 demonstrates that under a sequence of Pitman local alternatives, H_LA approaches a constant. Lemma 4.16 describes the limit of a noncentral F as the denominator degrees of freedom becomes large, and the noncentrality parameter and point of evaluation approach limits. Lemma 4.17 concerns the limits of subsequences. This allows one to consider asymptotic proofs with quantized limits. Using these lemmas, the limit of the scalar noncentrality parameter for each test is derived, and from it, the limit of the F approximation.
Lemma 4.15 For a GLMM(F), under a sequence of Pitman local alternatives indexed by m, lim_{N(m)→∞} H_LA = Q, where Q is constant.

Proof: Here H_LA = H/N(m). Then

    H_LA = (mN_e)^{-1} {Θ − Θ_0}' [C(F(m)'F(m))^{-1}C']^{-1} {Θ − Θ_0}
         = N_e^{-1} {Θ − Θ_0}' [C(F_e'WF_e)^{-1}C']^{-1} {Θ − Θ_0}
         = Q,

which is constant with respect to sample size. □
Lemma 4.16 Let f_crit,N = F_F^{-1}(1 − α; u_1, u_2,N), where f_crit,N and u_2,N both depend on N. Assume 0 < u_1 < ∞, 0 < ν_1 < ∞, and α ∈ (0, 1). Suppose lim_{N→∞} λ_N = λ < ∞, lim_{N→∞} u_2,N = ∞, and lim_{N→∞} ν_2,N = ∞. Let c_crit = F_{χ²}^{-1}(1 − α; u_1). Then

    lim_{N→∞} F_F(f_crit,N; ν_1, ν_2,N, λ_N) = F_{χ²}(ν_1 u_1^{-1} c_crit; ν_1, λ).
Proof: This is equivalent to showing that ∀ ε > 0 ∃ M such that for N > M,

    |F_F(f_crit,N; ν_1, ν_2,N, λ_N) − F_{χ²}(ν_1 u_1^{-1} c_crit; ν_1, λ)| < ε.

By the generalized triangle inequality, the left-hand side is at most

    |F_F(f_crit,N; ν_1, ν_2,N, λ_N) − F_F(f_crit,N; ν_1, ν_2,N, λ)|
    + |F_F(f_crit,N; ν_1, ν_2,N, λ) − F_{χ²}(ν_1 f_crit,N; ν_1, λ)|
    + |F_{χ²}(ν_1 f_crit,N; ν_1, λ) − F_{χ²}(ν_1 u_1^{-1} c_crit; ν_1, λ)|.

1. Consider the first absolute value. The F c.d.f. has an infinite sum representation (Abramowitz and Stegun, 1972, p. 947). Differentiation with respect to the noncentrality parameter produces an infinite sum. This infinite sum expression for the derivative converges to a finite limit (Clark, 1987). Thus, the noncentral F c.d.f. is continuous with respect to the noncentrality parameter. By assumption, λ_N → λ as N → ∞. Thus, ∀ ε/3 > 0, ∃ δ > 0 and ∃ M_1 such that for N > M_1, |λ − λ_N| < δ, and by the definition of continuity, the first absolute value is less than ε/3.

2. Now consider the second absolute value. As N and ν_2,N approach ∞, F(ν_1, ν_2,N, λ) converges in distribution to χ²(ν_1, λ)/ν_1 (Abramowitz and Stegun, 1972, p. 948). By a theorem of Pólya, if F_N converges weakly to a continuous F, the pointwise convergence holds uniformly (Serfling, 1980, p. 18). Thus, ∀ ε/3 > 0 ∃ M_2 such that for N > M_2, the second absolute value is less than ε/3.

3. Finally, consider the third absolute value. As N and u_2,N approach ∞, F(u_1, u_2,N) converges in distribution to u_1^{-1}χ²(u_1) (Abramowitz and Stegun, 1972, p. 948). Thus, ∀ ε/3, ∃ M_3 such that for N > M_3 and δ/ν_1 > 0, |f_crit,N − u_1^{-1}c_crit| < ν_1^{-1}δ, which implies that |ν_1 f_crit,N − ν_1 u_1^{-1} c_crit| < δ. Notice F_{χ²}(x; ν_1, λ) is differentiable with respect to x and hence continuous in x. Since the χ² distribution function is continuous, and the pointwise convergence holds uniformly, the last absolute value is less than ε/3.

Then choose M > max(M_1, M_2, M_3), and the result follows. □
= Fi l (1 - a; VI, V2,N) where f crit,N and 1I2,N both depend on
N. Suppose limN-+oo AN = A < 00, and limN-+oo V2,N = 00. Let Cerlt = F;;l(1- a; vd,
and define fcrit,N = FF(l - a; Vb V2,N). Then
Corollary 4 Let f crit,N
Proof: Let
UI
= VI, U2,N =
V2,N and apply Lemma 4.16. 0
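Corollary 4 can be illustrated numerically: as ν_2,N grows, the noncentral F probability at the F critical value approaches the noncentral chi-squared probability at the chi-squared critical value. A sketch with illustrative ν_1, λ, and α:

```python
# Numeric illustration of Corollary 4: the gap between the noncentral F
# probability at the F critical value and its chi-squared limit shrinks
# as the denominator degrees of freedom grow.
from scipy import stats

nu1, lam, alpha = 3, 4.0, 0.05
c_crit = stats.chi2.ppf(1 - alpha, nu1)
limit = stats.ncx2.cdf(c_crit, nu1, lam)

gaps = []
for nu2 in (20, 200, 2000, 20000):
    f_crit = stats.f.ppf(1 - alpha, nu1, nu2)
    gaps.append(abs(stats.ncf.cdf(f_crit, nu1, nu2, lam) - limit))
```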
Lemma 4.17 Let {X_i} be a sequence of k × s matrices, and let A be a k × s constant matrix. If X_i converges in probability to A, then so does any subsequence of {X_i}. Here, the convergence in probability of a matrix is taken to mean elementwise convergence in probability.

Proof: Every subsequence of a convergent sequence of real numbers converges to the same limit (Apostol, 1967); applying this elementwise gives the result. □
4.3.1 Univariate Approach to Repeated Measures
The exact test is only used under an assumption of compound symmetry. If one
assumes compound symmetry, stacks the data, and applies an orthonormal transform
corresponding to the eigenvectors of the compound symmetric matrix, one obtains
two separate GLUM models. For these models, one can use the GLUM test, which
is a special case of the multivariate tests. Thus, the asymptotic results for the exact
test can be thought of as special cases of the asymptotic results for the multivariate
tests.
53
For the conservative, Geisser-Greenhouse and Huynh-Feldt tests, consider taking the limit of the Muller-Barton power approximation for the univariate approach to repeated measures. The approximate conditional power is given by
$$P_c(T) = 1 - F_F\left\{f_{crit,T,N(m)};\ ab\epsilon,\ b[N(m) - q]\epsilon,\ \omega_{U,N(m)}\right\},$$
where the test is indexed by $T$. As defined in Section 1.2.3, $\epsilon$ is constant with respect to $N(m)$. Thus $b[N(m) - q]\epsilon \to \infty$ as $N(m) \to \infty$. To derive the limit of the power approximation, it suffices to consider the limits of the noncentrality function and the critical values, and then apply Lemma 4.16.

Under a sequence of Pitman local alternatives indexed by $m$, the scalar noncentrality function for the univariate approach to repeated measures, $\omega_{U,N(m)}$, converges to $b\epsilon\,\mathrm{tr}(Q)/\mathrm{tr}(\Sigma_*)$.
Using Lemma 4.16 one can derive the critical values. For the conservative test, the small sample critical value is $F_F^{-1}(1 - \alpha; a, N(m) - q)$, and the limiting critical value is $(b\epsilon)c_{c,crit}$, where $c_{c,crit}$ is chosen so that $F_{\chi^2}(c_{c,crit}; a) = 1 - \alpha$. For the Geisser-Greenhouse test, Muller and Barton (1989) applied a theorem of Fujikoshi (1978) and demonstrated that $\mathcal{E}(\hat\epsilon) \doteq \epsilon + k/[N(m) - q]$, where $k$ is a constant with respect to $N(m)$ and is defined in the appendix of Muller and Barton (1989). Hence, asymptotically, $\mathcal{E}(\hat\epsilon) = \epsilon$. The usual critical value is $F_F^{-1}\left\{1 - \alpha;\ ab\mathcal{E}(\hat\epsilon),\ b[N(m) - q]\mathcal{E}(\hat\epsilon)\right\}$. The asymptotic critical value is $c^*_{crit}$, where $c^*_{crit}$ is chosen so that $F_{\chi^2}(c^*_{crit}; ab\epsilon) = 1 - \alpha$.

For the Huynh-Feldt test,
$$\tilde\epsilon = \min\left\{\frac{N(m)b\hat\epsilon - 2}{b[N(m) - q - b\hat\epsilon]},\ 1\right\}.$$
This definition ensures that the statistic provides a proper estimate. Here, we consider instead the asymptotic performance of an unbounded statistic defined simply by $\tilde\epsilon_u = [N(m)b\hat\epsilon - 2]/\{b[N(m) - q - b\hat\epsilon]\}$. Notice that
$$\mathcal{E}(\tilde\epsilon) = \mathcal{E}(\tilde\epsilon_u \mid \tilde\epsilon_u < 1)\cdot\Pr(\tilde\epsilon_u < 1) + \Pr(\tilde\epsilon_u > 1)\cdot 1.$$
The usual and unbounded Huynh-Feldt statistics have the same expectation asymptotically if $\lim_{N(m)\to\infty}\Pr(\tilde\epsilon_u > 1) = 0$. Fujikoshi's results show that
$$\mathcal{E}(\tilde\epsilon_u) \doteq \frac{N(m)b\epsilon - 2}{b[N(m) - q - b\epsilon]} + k[N(m) - q]^{-1}.$$
Since
$$\lim_{N(m)\to\infty}\frac{N(m)b\epsilon - 2}{b[N(m) - q - b\epsilon]} = \epsilon$$
and
$$\lim_{N(m)\to\infty} k/[N(m) - q] = 0,$$
it follows that
$$\lim_{N(m)\to\infty}\mathcal{E}(\tilde\epsilon) = \epsilon$$
and
$$\lim_{N(m)\to\infty} F_F^{-1}\left\{1 - \alpha;\ ab\mathcal{E}(\tilde\epsilon),\ b[N(m) - q]\mathcal{E}(\tilde\epsilon)\right\} = F_{\chi^2}^{-1}(1 - \alpha; ab\epsilon),$$
since $b[N(m) - q] \to \infty$ and $0 < \epsilon < 1$.
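The two scalar limits used in this argument are easy to confirm numerically. A minimal pure-Python check follows; the values of $\epsilon$, $b$, $q$, and $k$ are hypothetical, chosen only for illustration.

```python
# Numerical check of the two limits in the Huynh-Feldt argument:
# [N*b*eps - 2] / {b[N - q - b*eps]} -> eps, and k/[N - q] -> 0, as N grows.
# The values of eps, b, q, and k are hypothetical.
eps, b, q, k = 0.7, 4, 3, 2.5

def eps_u(N):
    return (N * b * eps - 2) / (b * (N - q - b * eps))

def bias_term(N):
    return k / (N - q)

gap_large_N = abs(eps_u(10**6) - eps)
gap_small_N = abs(eps_u(10**2) - eps)
bias = bias_term(10**6)
```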
Thus, by Lemma 4.16, under a sequence of Pitman local alternatives indexed by $m$, the limit of the Muller-Barton F approximation for the conservative test for the univariate approach to repeated measures is
$$P_{C,cons,MB} = 1 - F_{\chi^2}\left[b\epsilon c_{c,crit};\ ab\epsilon,\ \frac{b\epsilon\,\mathrm{tr}(Q)}{\mathrm{tr}(\Sigma_*)}\right],$$
where $c_{c,crit}$ is chosen so that $F_{\chi^2}(c_{c,crit}; a) = 1 - \alpha$.
Similarly, by Lemma 4.16, under a sequence of Pitman local alternatives indexed by $m$, the limit of the Muller-Barton F approximation for the Geisser-Greenhouse test for the univariate approach to repeated measures is
$$P_{C,GG,MB} = 1 - F_{\chi^2}\left[c^*_{crit};\ ab\epsilon,\ \frac{b\epsilon\,\mathrm{tr}(Q)}{\mathrm{tr}(\Sigma_*)}\right],$$
where $c^*_{crit}$ is chosen so that $F_{\chi^2}(c^*_{crit}; ab\epsilon) = 1 - \alpha$.
Similarly, by Lemma 4.16, under a sequence of Pitman local alternatives indexed by $m$, the limit of the Muller-Barton F approximation for the unbounded Huynh-Feldt test for the univariate approach to repeated measures is
$$P_{C,HF,MB} = 1 - F_{\chi^2}\left[c^*_{crit};\ ab\epsilon,\ \frac{b\epsilon\,\mathrm{tr}(Q)}{\mathrm{tr}(\Sigma_*)}\right],$$
where $c^*_{crit}$ is chosen so that $F_{\chi^2}(c^*_{crit}; ab\epsilon) = 1 - \alpha$.
4.3.2 Hotelling-Lawley Trace
Consider taking the limit of the Muller-Peterson scalar noncentrality function under a sequence of Pitman local alternatives indexed by $m$. Under a sequence of local alternatives, the scalar noncentrality function for the Hotelling-Lawley trace is
$$\omega_{HLT,LA} = \mathrm{tr}(H_{LA}E^{-1})\left\{s[\mathrm{df}(HLT)] + 2\right\}/s,$$
where $\mathrm{df}(HLT)$ is given in Table 1.4. Notice for both $a = 1$ and $a > 1$, $\lim_{N\to\infty}\mathrm{df}(HLT)/N = 1$. Then
$$\lim_{N(m)\to\infty}\omega_{HLT,LA} = \lim_{N(m)\to\infty}\frac{\mathrm{tr}(H_{LA}\Sigma_*^{-1})\left\{s[(N(m) - q) - b - 1] + 2\right\}}{s\,N(m)} = \mathrm{tr}(Q\Sigma_*^{-1}).$$
Then by Lemma 4.16, under a sequence of Pitman local alternatives indexed by $m$, the limit of the Muller-Peterson F approximation to the conditional power of the Hotelling-Lawley trace statistic is
$$P_{C,HLT,MP} = 1 - F_{\chi^2}\left[c_{crit};\ ab,\ \mathrm{tr}(Q\Sigma_*^{-1})\right],$$
where $c_{crit}$ is chosen so that $F_{\chi^2}(c_{crit}; ab) = 1 - \alpha$.

4.3.3 Pillai-Bartlett
Consider taking the limit of the Muller-Peterson scalar noncentrality function under a sequence of Pitman local alternatives indexed by $m$. Under a sequence of local alternatives, the Pillai-Bartlett trace becomes $PB_{LA} = \mathrm{tr}[H_{LA}(H_{LA} + E)^{-1}]$. In turn, examining the limit of the scalar noncentrality function for the Pillai-Bartlett trace,
$$\begin{aligned}
\lim_{N(m)\to\infty}\omega_{PB,LA}
&= \lim_{N(m)\to\infty}\frac{s[(N(m) - q) - b + s]\,PB_{LA}}{s - PB_{LA}} \\
&= \lim_{N(m)\to\infty}\frac{s[(N(m) - q) - b + s]\,\mathrm{tr}\left\{H_{LA}\left[H_{LA} + \Sigma_*(N(m) - q)\right]^{-1}\right\}}{s - \mathrm{tr}\left\{H_{LA}\left[H_{LA} + \Sigma_*(N(m) - q)\right]^{-1}\right\}} \\
&= \lim_{N(m)\to\infty}\frac{s[(N(m) - q) - b + s]\,\mathrm{tr}\left\{H_{LA}(N(m) - q)^{-1}\left[H_{LA}(N(m) - q)^{-1} + \Sigma_*\right]^{-1}\right\}}{s - \mathrm{tr}\left\{H_{LA}(N(m) - q)^{-1}\left[H_{LA}(N(m) - q)^{-1} + \Sigma_*\right]^{-1}\right\}} \\
&= \lim_{N(m)\to\infty}\frac{s[(N(m) - q) - b + s](N(m) - q)^{-1}\,\mathrm{tr}\left\{H_{LA}\left[H_{LA}(N(m) - q)^{-1} + \Sigma_*\right]^{-1}\right\}}{s - \mathrm{tr}\left\{H_{LA}(N(m) - q)^{-1}\left[H_{LA}(N(m) - q)^{-1} + \Sigma_*\right]^{-1}\right\}} \\
&= \mathrm{tr}(Q\Sigma_*^{-1}).
\end{aligned}$$
Then by Lemma 4.16, under a sequence of Pitman local alternatives indexed by $m$, the limit of the Muller-Peterson F approximation to the conditional power of the Pillai-Bartlett trace statistic is
$$P_{C,PB,MP} = 1 - F_{\chi^2}\left[c_{crit};\ ab,\ \mathrm{tr}(Q\Sigma_*^{-1})\right],$$
where $c_{crit}$ is chosen so that $F_{\chi^2}(c_{crit}; ab) = 1 - \alpha$.
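The chain of limits for the Pillai-Bartlett noncentrality can be checked numerically. The sketch below (Python with NumPy; the matrices playing the roles of $H_{LA}$ and $\Sigma_*$ and the constants $s$, $q$, $b$ are small hypothetical examples, not values from this dissertation) evaluates the finite-sample expression at a large $N$ and compares it with $\mathrm{tr}(H\Sigma_*^{-1})$.

```python
# Numerical check that the Pillai-Bartlett scalar noncentrality under
# local alternatives approaches tr(H Sigma*^{-1}) as N grows.
# H and Sigma below are small hypothetical positive definite matrices.
import numpy as np

H = np.array([[2.0, 0.5], [0.5, 1.0]])      # plays the role of H_LA -> Q
Sigma = np.array([[1.0, 0.2], [0.2, 1.5]])  # plays the role of Sigma*
s, q, b = 2, 3, 2                           # hypothetical constants

def omega_pb(N):
    """Finite-sample Pillai-Bartlett noncentrality with E = (N - q)*Sigma."""
    PB = np.trace(H @ np.linalg.inv(H + (N - q) * Sigma))
    return s * ((N - q) - b + s) * PB / (s - PB)

target = np.trace(H @ np.linalg.inv(Sigma))
gap = abs(omega_pb(10**6) - target)
```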
4.3.4 Wilks' Lambda
Consider taking the limit of the Muller-Peterson scalar noncentrality function under a sequence of Pitman local alternatives indexed by $m$. Under a sequence of local alternatives, the scalar noncentrality function for Wilks' Lambda is
$$\omega_{W,LA} = \mathrm{df}(W)\,\frac{1 - W_{LA}^{1/g}}{W_{LA}^{1/g}} = \mathrm{df}(W)\left[W_{LA}^{-1/g} - 1\right],$$
where $\mathrm{df}(W) = g[(N(m) - q) - (b - a + 1)/2] - (ab - 2)/2$. Here,
$$W_{LA} = \frac{\left|\Sigma_*(N(m) - q)\right|}{\left|H_{LA} + \Sigma_*(N(m) - q)\right|} = \frac{1}{\left|H_{LA}\Sigma_*^{-1}(N(m) - q)^{-1} + I_b\right|}.$$
Then
$$\lim_{N(m)\to\infty}\omega_{W,LA} = \lim_{N(m)\to\infty}\mathrm{df}(W)\left[\left|H_{LA}\Sigma_*^{-1}[N(m) - q]^{-1} + I_b\right|^{1/g} - 1\right].$$
Let $\lambda_i$ be the eigenvalues of $H_{LA}\Sigma_*^{-1}$. Notice that there are $s_*$ such non-zero eigenvalues. Then $\lambda_i[N(m) - q]^{-1}$ are the eigenvalues of $H_{LA}\Sigma_*^{-1}[N(m) - q]^{-1}$.
Adding and subtracting an identity matrix yields
$$\left|H_{LA}\Sigma_*^{-1}[N(m) - q]^{-1} + I - \left(\frac{\lambda_i}{N(m) - q} + 1\right)I\right| = 0.$$
In turn,
$$\left|H_{LA}\Sigma_*^{-1}[N(m) - q]^{-1} + I\right| = \prod_{i=1}^{s_*}\left(\frac{\lambda_i}{N(m) - q} + 1\right),$$
and
$$\omega_{W,LA} = \left\{g[N(m) - q] + \frac{2 - ab - g(b - a + 1)}{2}\right\}\left[\prod_{i=1}^{s_*}\left(\frac{\lambda_i}{N(m) - q} + 1\right)^{1/g} - 1\right].$$
As $N(m) \to \infty$, the first factor grows without bound while the second approaches zero, so the product is an indeterminate form. Let $m^* = [N(m) - q]^{-1}$ and $f(m^*) = \prod_{i=1}^{s_*}(\lambda_i m^* + 1)^{1/g}$. Then
$$\lim_{N(m)\to\infty}\omega_{W,LA} = \lim_{m^*\to 0}\frac{g}{m^*}\left[\prod_{i=1}^{s_*}(\lambda_i m^* + 1)^{1/g} - 1\right] + 0 = g\lim_{m^*\to 0}\frac{f(m^*) - 1}{m^*}.$$
Notice that
$$\log f(m^*) = g^{-1}\sum_{i=1}^{s_*}\log(\lambda_i m^* + 1),$$
and hence
$$\frac{df(m^*)}{dm^*} = f(m^*)\,g^{-1}\sum_{i=1}^{s_*}\frac{\lambda_i}{\lambda_i m^* + 1}.$$
Applying l'Hopital's rule,
$$\lim_{N(m)\to\infty}\omega_{W,LA} = g\lim_{m^*\to 0}\frac{d[f(m^*) - 1]/dm^*}{dm^*/dm^*} = g\lim_{m^*\to 0}\frac{1}{g}\sum_{i=1}^{s_*}\frac{\lambda_i}{1 + m^*\lambda_i}\prod_{i=1}^{s_*}(1 + m^*\lambda_i)^{1/g} = \sum_{i=1}^{s_*}\lambda_i = \mathrm{tr}(Q\Sigma_*^{-1}).$$
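The l'Hopital step can be verified with a few lines of pure Python. In the sketch below, the eigenvalues $\lambda_i$ and the value of $g$ are hypothetical; the ratio $g[f(m^*) - 1]/m^*$ is evaluated near zero and compared with $\sum_i \lambda_i$.

```python
# Numerical check of the l'Hopital limit:
# g*[f(m*) - 1]/m* -> sum(lambda_i) as m* -> 0,
# where f(m*) = prod_i (1 + lambda_i * m*)**(1/g).
# The eigenvalues and g below are hypothetical.
lams = [2.0, 0.7, 0.3]   # stand-ins for the eigenvalues of H_LA Sigma*^{-1}
g = 3.0

def f(mstar):
    prod = 1.0
    for lam in lams:
        prod *= (1.0 + lam * mstar) ** (1.0 / g)
    return prod

def ratio(mstar):
    return g * (f(mstar) - 1.0) / mstar

target = sum(lams)
gap = abs(ratio(1e-6) - target)
```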
Then by Lemma 4.16, under a sequence of Pitman local alternatives indexed by $m$, the limit of the Muller-Peterson F approximation to the conditional power of the Wilks' statistic is
$$P_{C,W,MP} = 1 - F_{\chi^2}\left[c_{crit};\ ab,\ \mathrm{tr}(Q\Sigma_*^{-1})\right],$$
where $c_{crit}$ is chosen so that $F_{\chi^2}(c_{crit}; ab) = 1 - \alpha$.
4.3.5 Summary and Implications
The asymptotic limits of the F approximations (Muller and Peterson, 1989) for the conditional power are shown to be the same for the Pillai-Bartlett trace, the Hotelling-Lawley trace and Wilks' Lambda. This common limit is the first term of the power function derived by Rothenberg (1977). His other terms are of order $1/N$ or smaller.

The asymptotic limit of the F approximations to the power of the various univariate approach to repeated measures tests was also derived, and shown to be a function of a noncentral $\chi^2$ random variable. Muller and Barton (1989) had also examined this question, and showed that the asymptotic limit was related to the noncentral F. However, they did not consider local alternatives.
4.4 Asymptotic Limits of the Unconditional Power Approximations
In this section, the asymptotic limits of the approximations to the unconditional power of the multivariate tests are derived. As in the conditional case, the derivations rely on the fact that under local alternatives, $H_{LA}$ is asymptotically constant. This is demonstrated in Lemma 4.18. Theorem 4.9 describes the asymptotic behavior of the limits of Lebesgue integrals. This allows one to derive the limit of the F approximations to the unconditional power by deriving the limit of the noncentrality parameter.
Lemma 4.18 For a GLMM(F, G), and a GLH(G), under Pitman local alternatives, $H_{LA} \stackrel{p}{\to} Q_G$, where $Q_G = (\Theta - \Theta_0)'\left[C_G\Sigma_G^{-1}C_G'\right]^{-1}(\Theta - \Theta_0)$ and is constant with respect to sample size.

Proof: Notice that, as in the proof of Lemma 3.11,
$$M^{-1} = \left\{C_G\left[G'G - G'F(F'F)^{-1}F'G\right]^{-1}C_G'\right\}^{-1}.$$
Considering local alternatives,
$$N(m)^{-1}M^{-1} = \left\{C_G\left[N(m)^{-1}G'G - N(m)^{-1}G'F(F'F)^{-1}F'G\right]^{-1}C_G'\right\}^{-1}.$$
By Lemma 19.9 of Arnold (1981, p.365), as $N \to \infty$, $N^{-1}G'G \stackrel{p}{\to} \Sigma_G$. Since by Lemma 4.17 every infinite subsequence of a convergent sequence is also convergent, as $m \to \infty$, $N(m)^{-1}G'G \stackrel{p}{\to} \Sigma_G$. By Theorem 19.7 of Arnold (1981, p.380), as $N(m) \to \infty$, $G'F(F'F)^{-1}F'G \stackrel{d}{\to} W_{q_G}(q_F, \Sigma_G)$, and hence $N(m)^{-1}G'F(F'F)^{-1}F'G \stackrel{p}{\to} 0$. Again, the infinite subsequence indexed by $m$ converges by Lemma 4.17. Then the result follows, since everything else in the formula for $H$ is constant with respect to $N(m)$. $\Box$
Consider the following theorem from Serfling (1980, p.16).

Theorem 4.8 Let the distribution functions $F, F_1, F_2, \ldots$ possess respective characteristic functions $\phi, \phi_1, \phi_2, \ldots$. The following statements are equivalent:
1. $F_n \Rightarrow F$;
2. $\lim_n \phi_n(t) = \phi(t)$, for each real $t$;
3. $\lim_n \int g\,dF_n = \int g\,dF$, for each bounded continuous function $g$.
All integrals in Theorems 4.8 and 4.9 below are Lebesgue-Stieltjes integrals. Theorem 4.9 is original and does not appear in Serfling.

Theorem 4.9 Consider a set of distribution functions $F, F_1, F_2, \ldots$ possessing respective characteristic functions $\phi, \phi_1, \phi_2, \ldots$, and suppose $F_n$ converges in distribution to $F$. Let $\{g_n\}$ be a set of absolutely continuous bounded functions such that $g_n$ converges uniformly to $g$. Assume $\int g_n\,dF_n < \infty$ and $\int g\,dF < \infty$. Then
$$\lim_{n\to\infty}\int g_n\,dF_n = \int g\,dF. \tag{4.2}$$
Proof: It is equivalent to demonstrate that $\forall\,\epsilon > 0$, $\exists\,M > 0$ such that for $n > M$, $\left|\int g_n\,dF_n - \int g\,dF\right| < \epsilon$.

Step 1: By assumption, $g_n$ converges uniformly to $g$. Following the definition of Courant and John (1965), this means that $\forall\,\epsilon > 0$, there exists a corresponding number $M_1$, possibly dependent on $\epsilon$, such that for all $n > M_1$ and for all $x$ in the support of $g$, $|g_n(x) - g(x)| < \epsilon/2$. Thus, for $n > M_1$,
$$\left|\int g_n\,dF_n - \int g\,dF_n\right| = \left|\int (g_n - g)\,dF_n\right| \le \int |g_n - g|\,dF_n < \int \frac{\epsilon}{2}\,dF_n = \frac{\epsilon}{2}.$$
The last step follows because $F_n$ is a cumulative probability distribution function, and hence $\forall\,n$, $\int dF_n = 1$.

Step 2: By part 3 of Theorem 4.8, $\forall\,\epsilon > 0$, we may conclude that $\exists\,M_2 > 0$ such that for $n > M_2$,
$$\left|\int g\,dF_n - \int g\,dF\right| < \epsilon/2.$$
Now, $\forall\,\epsilon > 0$ and for all values $x$ in the domain of $g$, choose $n > \max(M_1, M_2)$. Then
$$\left|\int g_n\,dF_n - \int g\,dF\right| = \left|\int g_n\,dF_n - \int g\,dF_n + \int g\,dF_n - \int g\,dF\right| \le \left|\int g_n\,dF_n - \int g\,dF_n\right| + \left|\int g\,dF_n - \int g\,dF\right| < \epsilon/2 + \epsilon/2 = \epsilon,$$
where the first inequality follows from the triangle inequality, and the second from Steps 1 and 2. $\Box$
A detailed derivation of the asymptotic limit of the unconditional power approximations will be given for the Hotelling-Lawley trace. The proofs for the other cases are exactly parallel, so only the results will be stated.
Theorem 4.10 Consider taking the limit of the unconditional power approximation for the Hotelling-Lawley trace, under a sequence of Pitman local alternatives indexed by $m$. Then the limit of the unconditional power approximation is
$$\lim_{N(m)\to\infty} P_u(HLT) = 1 - F_{\chi^2}\left[c_{crit};\ ab,\ \mathrm{tr}(Q_G\Sigma_*^{-1})\right],$$
where $c_{crit}$ is chosen so that $F_{\chi^2}(c_{crit}; ab) = 1 - \alpha$.
Proof: The approximate unconditional power is defined as
$$P_u(HLT) = 1 - \int_0^\infty F_F\left(f_{crit,N(m)};\ ab,\ \nu_{2,N(m)},\ \omega_{HLT,N(m)}\right) f\left(\omega_{HLT,N(m)}\right) d\omega_{HLT,N(m)}, \tag{4.3}$$
where $\nu_{2,N(m)} = N$ if $a = 1$, or $\nu_{2,N(m)} = \{s[(N(m) - q) - b - 1] + 2\}/s$ for $a > 1$. The problem with the Riemann integral in this context is that the domain of $f(\omega_{HLT,N(m)})$ converges to a point as $\omega_{HLT,N(m)}$ approaches its asymptotic limit. Riemann integrals are not well defined on sets of measure zero. Instead, recast Equation 4.3 in terms of Lebesgue-Stieltjes integrals. Let $\Omega$ be the domain of $f(\omega_{HLT,N(m)})$. Let $dF_{N(m)} = f(\omega_{HLT,N(m)})\,d\omega_{HLT,N(m)}$, to allow taking the integral with respect to the probability measure. Define $g_{N(m)} = F_F(f_{crit,N(m)}; ab, \nu_{2,N(m)}, \omega_{HLT,N(m)})$. Then one can write the formula for the approximate asymptotic unconditional power as
$$\lim_{N(m)\to\infty} P_u(HLT) = 1 - \lim_{N(m)\to\infty}\int_{\omega\in\Omega} g_{N(m)}\,dF_{N(m)}.$$
By Lemma 4.18, $H_{LA} \stackrel{p}{\to} Q_G$. By a proof completely parallel to the one in Section 4.3.2, $\lim_{N(m)\to\infty}\omega_{HLT,N(m)} = \mathrm{tr}(Q_G\Sigma_*^{-1})$. Thus $F_{N(m)}$ becomes a point mass at $\mathrm{tr}(Q_G\Sigma_*^{-1})$. By Lemma 4.16, $g_{N(m)}$ converges to $F_{\chi^2}[c_{crit}; ab, \mathrm{tr}(Q_G\Sigma_*^{-1})]$. Then, by Theorem 4.9,
$$\lim_{N(m)\to\infty} P_u(HLT) = 1 - \lim_{N(m)\to\infty}\int_{\omega\in\Omega} g_{N(m)}\,dF_{N(m)} = 1 - \int_{\omega\in\Omega} g\,dF = 1 - F_{\chi^2}\left[c_{crit};\ ab,\ \mathrm{tr}(Q_G\Sigma_*^{-1})\right],$$
as desired. $\Box$
Similar proofs can be applied to the univariate approach to repeated measures tests, the Pillai-Bartlett trace, and Wilks' Lambda. The asymptotic limits of the unconditional power approximations for all the tests, under a sequence of local alternatives indexed by $m$, are summarized in Table 4.1. In the table, $c_{c,crit}$ is chosen so that $F_{\chi^2}(c_{c,crit}; a) = 1 - \alpha$, $c^*_{crit}$ is chosen so that $F_{\chi^2}(c^*_{crit}; ab\epsilon) = 1 - \alpha$, and $c_{crit}$ is chosen so that $F_{\chi^2}(c_{crit}; ab) = 1 - \alpha$.
Table 4.1: Limits of Unconditional Power

Test                        Unconditional Power Approximation
UNIREP
  conservative              $1 - F_{\chi^2}\left[b\epsilon c_{c,crit};\ ab\epsilon,\ b\epsilon\,\mathrm{tr}(Q_G)/\mathrm{tr}(\Sigma_*)\right]$
  Geisser-Greenhouse        $1 - F_{\chi^2}\left[c^*_{crit};\ ab\epsilon,\ b\epsilon\,\mathrm{tr}(Q_G)/\mathrm{tr}(\Sigma_*)\right]$
  Huynh-Feldt (unbounded)   $1 - F_{\chi^2}\left[c^*_{crit};\ ab\epsilon,\ b\epsilon\,\mathrm{tr}(Q_G)/\mathrm{tr}(\Sigma_*)\right]$
MULTIVARIATE
  Hotelling-Lawley          $1 - F_{\chi^2}\left[c_{crit};\ ab,\ \mathrm{tr}(Q_G\Sigma_*^{-1})\right]$
  Pillai-Bartlett           $1 - F_{\chi^2}\left[c_{crit};\ ab,\ \mathrm{tr}(Q_G\Sigma_*^{-1})\right]$
  Wilks' Lambda             $1 - F_{\chi^2}\left[c_{crit};\ ab,\ \mathrm{tr}(Q_G\Sigma_*^{-1})\right]$
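The limiting formulas in Table 4.1 can be evaluated directly. The sketch below (Python with SciPy) plugs in hypothetical values for $a$, $b$, $\epsilon$ and the trace quantities; none of these numbers come from the dissertation. It also shows that the Geisser-Greenhouse and unbounded Huynh-Feldt limits coincide, as the table indicates.

```python
# Evaluate the limiting unconditional power formulas in Table 4.1 for
# hypothetical inputs.  a, b, eps, and the trace quantities are made up
# for illustration only.
from scipy.stats import chi2, ncx2

alpha = 0.05
a, b, eps = 2, 4, 0.8
trQ_over_trS = 1.5   # stand-in for tr(Q_G)/tr(Sigma*)
trQSinv = 3.0        # stand-in for tr(Q_G Sigma*^{-1})

# UNIREP limits: noncentrality b*eps*tr(Q_G)/tr(Sigma*).
nc_u = b * eps * trQ_over_trS
cc = chi2.ppf(1 - alpha, a)               # c_{c,crit}
cstar = chi2.ppf(1 - alpha, a * b * eps)  # c*_{crit} (non-integer df is fine)
p_cons = 1 - ncx2.cdf(b * eps * cc, a * b * eps, nc_u)
p_gg = 1 - ncx2.cdf(cstar, a * b * eps, nc_u)
p_hf = 1 - ncx2.cdf(cstar, a * b * eps, nc_u)

# Multivariate limit: one expression shared by all three tests.
ccrit = chi2.ppf(1 - alpha, a * b)
p_mult = 1 - ncx2.cdf(ccrit, a * b, trQSinv)
```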
4.4.1 Discussion
The results presented here are for the GLMM(F, G), GLH(G). Neither the derivation nor the limits depend on F in any way. Thus, one can immediately conclude that the asymptotic limits of the F approximation for the unconditional power of the multivariate tests of the GLH(G) in the GLMM(F, G) and the GLMM(G) coincide.

Earlier results by other authors give some indication of the accuracy of the results presented here. The test of independence is a special case of the GLMM(F, G) and the GLH(G). Lee (1971) derived asymptotic approximations for the test of independence for Wilks' Lambda, the Pillai-Bartlett trace, and the Hotelling-Lawley trace. The asymptotic limits of the approximations to the unconditional power displayed in Table 4.1 are the same as the first term of his results. The other terms are of order $N^{-1}$ or smaller.

In Chapter 3, small sample results are derived only for the linear case for the Pillai-Bartlett trace and Wilks' Lambda. However, the asymptotic results from this chapter are for general $s_*$.
CHAPTER 5
NUMERICAL EVALUATIONS
This chapter provides details about numerical integrations, contrasts analytic results
to simulations, accounts for error due to the approximation of conditional power,
compares asymptotic results to small sample results under local alternatives, and
examines conditional and unconditional power results. The intent is to evaluate the
performance of the algorithms for computing unconditional power, and check their
accuracy. Tables and graphs are used to demonstrate the results.
This chapter does not present algorithms for all the theoretical results derived in this dissertation. For the GLMM(F, G) and GLH(G), the linear case of Wilks' Lambda and the Pillai-Bartlett trace are left for future research. The Hotelling-Lawley trace and the UNIREP statistics are considered only in the linear case. The case in which $\Omega$ has more than one non-zero eigenvalue is left for future research. Numerical results are given for all the tests for the special case of the GLMM(F, G) and the GLH(F, G) that is discussed in Chapter 2.
5.1 Transformations, Monotonicity and Convergence
Transformations are employed to produce finite integrals, bounded integrands and
finite limits. This allows the computation of the integrals numerically by using Simpson's rule. The transformations also produce monotone integrands, which allows
one to specify the accuracy of the numerical integration. The Hotelling-Lawley and
UNIREP statistics are examined for the special case of the GLMM(F, G), GLH(F,
G), and for the linear case of the GLMM(F, G), GLH(G).
5.1.1 GLMM(F, G), GLH(F, G)
First, the unconditional power of the Hotelling-Lawley trace and the UNIREP statistics are considered. The model is the special case of the GLMM(F, G) and the hypothesis is the special case of the GLH(F, G) considered in Chapter 2. Equation 2.7 gives the unconditional power of the Hotelling-Lawley trace as
$$P_u = 1 - \int_0^K F_F(f_{crit}; \nu_1, \nu_2, \omega)\,K(N - q_F)\,\omega^{-2}\,f_F\left[g(\omega); 1, N - q_F, \delta\right]d\omega,$$
where $g(\omega) = (K\omega^{-1} - 1)(N - q_F)$.

Transform 1: Let $y = g(\omega)$. Then $\omega = [K(N - q_F)](N - q_F + y)^{-1}$. When $\omega = K$, $y = 0$, and when $\omega = 0$, $y = \infty$. Then
$$P_u = 1 - \int_0^\infty F_F\left\{f_{crit};\ \nu_1,\ \nu_2,\ \frac{K(N - q_F)}{N - q_F + y}\right\} f_F(y; 1, N - q_F, \delta)\,dy.$$

Transform 2: Let $y = F_F^{-1}(p; 1, N - q_F, \delta)$. Then $p = F_F(y; 1, N - q_F, \delta)$. When $y = \infty$, $p = 1$, and when $y = 0$, $p = 0$. Then
$$P_u = 1 - \int_0^1 F_F\left[f_{crit};\ \nu_1,\ \nu_2,\ \frac{K(N - q_F)}{N - q_F + F_F^{-1}(p; 1, N - q_F, \delta)}\right]dp. \tag{5.1}$$

Notice that for $p = 1$, $F_F^{-1}(p; 1, N - q_F, \delta) = \infty$, which means that the integrand in Equation 5.1 reduces to the central F, $F_F(f_{crit}; \nu_1, \nu_2, 0)$.
In fact, the integrand in Equation 5.1 is monotone increasing and the integral is bounded. Distribution functions of absolutely continuous random variables are bijective and monotone increasing, which means that their inverses are also. For $K > 0$, $C > 0$, and $x > 0$, functions of the form $K/(C + x)$ are monotone decreasing in $x$, which means that the noncentrality parameter is monotone decreasing in $p$. Finally, as the noncentrality parameter of the noncentral F decreases monotonely, the noncentral F c.d.f. increases monotonely (Johnson and Kotz, 1970, p.142). This guarantees that the integral in Equation 5.1 is bounded, since
$$1 - P_u = \int_0^1 F_F\left[f_{crit};\ \nu_1,\ \nu_2,\ \frac{K(N - q_F)}{N - q_F + F_F^{-1}(p; 1, N - q_F, \delta)}\right]dp \le \int_0^1 \max_{p\in(0,1)} F_F\left[f_{crit};\ \nu_1,\ \nu_2,\ \frac{K(N - q_F)}{N - q_F + F_F^{-1}(p; 1, N - q_F, \delta)}\right]dp = \int_0^1 F_F\left[f_{crit}; \nu_1, \nu_2, 0\right]dp = F_F(f_{crit}; \nu_1, \nu_2, 0) < \infty.$$
Since the integral in Equation 5.1 converges and the integrand is monotone increasing, there are upper and lower bounds for a Simpson's rule numerical integration (Thisted, 1988, p.293). The distance between the upper and lower bounds provides an upper bound for the error of the integration. This error bound can be made arbitrarily small by adjusting the step size of the numerical integration.

The derivations in this section were for the Hotelling-Lawley trace. Because the expression for the unconditional power of the UNIREP statistics is of exactly the same form as Equation 2.7 except for constants, an exactly parallel argument was used to derive algorithms for the unconditional power of the UNIREP statistics.
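The algorithm just described can be sketched concretely. The Python function below (assuming SciPy; the inputs $K$, $\delta$, $N$, $q_F$, $\nu_1$, $\nu_2$ passed in the examples are hypothetical, not the dissertation's) evaluates Equation 5.1 with composite Simpson's rule, handling the $p = 1$ endpoint through the central F reduction noted above.

```python
# Sketch of the Section 5.1.1 algorithm: unconditional power via
# Equation 5.1 and composite Simpson's rule.  Parameter values used in
# any call are hypothetical stand-ins.
from scipy.stats import f, ncf

def unconditional_power(K, delta, N, qF, v1, v2, alpha=0.05, n=200):
    fcrit = f.ppf(1 - alpha, v1, v2)

    def integrand(p):
        if p >= 1.0:                      # F^{-1}(1) = inf => noncentrality 0
            return f.cdf(fcrit, v1, v2)
        y = ncf.ppf(p, 1, N - qF, delta)  # inner noncentral F quantile
        w = K * (N - qF) / (N - qF + y)   # test noncentrality, decreasing in p
        return ncf.cdf(fcrit, v1, v2, w) if w > 0 else f.cdf(fcrit, v1, v2)

    # Composite Simpson's rule on [0, 1] with n (even) subintervals.
    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * integrand(j * h)
    return 1 - (h / 3) * total
```

Setting $K = 0$ forces the noncentrality to zero everywhere, so the computed power should collapse to the Type I error rate $\alpha$; this gives a simple sanity check on the implementation.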
5.1.2 GLMM(F, G), GLH(G)
Next, the unconditional power of the Hotelling-Lawley trace and the UNIREP statistics is considered for the GLMM(F, G) and the GLH(G) (Chapter 3). Equation 3.5 gives an exact form for the unconditional power of the Hotelling-Lawley trace in the linear case in terms of an infinite sum. Equation 3.1 gives an approximation for the
unconditional power of the UNIREP statistics in terms of an infinite sum. While infinite sums can be approximated by adding a finite number of terms, finding the error
of these approximations can be tricky. Instead, a series of transformations is used to
recast the integral in a simple form. The integral is then calculated numerically using
Simpson's rule.
Since $\omega_{HLT} = K\chi^2(\nu_3)$, if $C$ is distributed $\chi^2(\nu_3)$,
$$\Pr\{\omega_{HLT} \le h\} = \Pr\{KC \le h\} = \Pr\{C \le h/K\} = F_{\chi^2}(h/K; \nu_3).$$
Then $f_{\omega_{HLT}}(\omega_{HLT}) = (1/K)f_{\chi^2}(\omega_{HLT}/K; \nu_3)$. Thus, one can express the unconditional power as
$$P_u = 1 - \int_0^\infty F_F(f_{crit}; \nu_1, \nu_2, \omega_{HLT})\,K^{-1}f_{\chi^2}(K^{-1}\omega_{HLT}; \nu_3)\,d\omega_{HLT}.$$

Transform 1: Let $h = \omega_{HLT}/K$. Then $d\omega_{HLT} = K\,dh$. When $\omega_{HLT} = 0$, $h = 0$, and when $\omega_{HLT} = \infty$, $h = \infty$. Then
$$P_u = 1 - \int_0^\infty F_F(f_{crit}; \nu_1, \nu_2, Kh)\,f_{\chi^2}(h; \nu_3)\,dh.$$

Transform 2: Let $h = F_{\chi^2}^{-1}(p; \nu_3)$. Then $p = F_{\chi^2}(h; \nu_3)$ and $dp = f_{\chi^2}(h; \nu_3)\,dh$. When $h = \infty$, $p = 1$, and when $h = 0$, $p = 0$. Then
$$P_u = 1 - \int_0^1 F_F\left[f_{crit};\ \nu_1,\ \nu_2,\ K F_{\chi^2}^{-1}(p; \nu_3)\right]dp. \tag{5.2}$$
Notice that if $p = 1$, $F_F(f_{crit}; \nu_1, \nu_2, \infty) = 0$, and if $p = 0$, the integrand in Equation 5.2 reduces to the central F, $F_F(f_{crit}; \nu_1, \nu_2, 0)$.

In fact, by a parallel argument to the one in Section 5.1.1, one can show that in Equation 5.2 the integral is bounded and the integrand is monotone. Distribution functions for absolutely continuous variables are bijective and monotone increasing, which means that their inverses are also. Multiplying by a positive constant again produces a bijective, monotone increasing function. Finally, the noncentral F c.d.f. is monotone decreasing with respect to the noncentrality parameter (Johnson and Kotz, 1970, p.293). This means that the integrand is monotone decreasing.
This also guarantees that the integral in Equation 5.2 converges, since
$$\int_0^1 F_F\left[f_{crit}; \nu_1, \nu_2, K F_{\chi^2}^{-1}(p; \nu_3)\right]dp \le \int_0^1 \max_{p\in[0,1]} F_F\left[f_{crit}; \nu_1, \nu_2, K F_{\chi^2}^{-1}(p; \nu_3)\right]dp = \int_0^1 F_F\left[f_{crit}; \nu_1, \nu_2, 0\right]dp = F_F(f_{crit}; \nu_1, \nu_2, 0) = 1 - \alpha < \infty.$$
Since in Equation 5.2 the integral converges and the integrand is monotone decreasing, there are upper and lower bounds for a Simpson's rule numerical integration (Thisted, 1988, p.293). The distance between the upper and lower bounds provides an estimate of the error of the integration. This error can be made arbitrarily small by adjusting the maximum step size of the numerical integration. A similar transformation and integration is used for the UNIREP statistics.
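The bracketing property for a monotone integrand can be illustrated directly. For a monotone decreasing integrand, the right-endpoint Riemann sum is a lower bound for the integral and the left-endpoint sum an upper bound, and the gap shrinks with the step size. The Python sketch below applies this to the Equation 5.2 integrand; the inputs $K$, $\nu_1$, $\nu_2$, $\nu_3$ are hypothetical.

```python
# Bracketing a monotone decreasing integrand (the Equation 5.2 integrand):
# right Riemann sum <= integral <= left Riemann sum.  Hypothetical inputs.
import math
from scipy.stats import f, ncf, chi2

K, v1, v2, v3, alpha = 2.0, 2, 30, 4, 0.05
fcrit = f.ppf(1 - alpha, v1, v2)

def integrand(p):
    nc = K * chi2.ppf(p, v3)
    if math.isinf(nc):               # p = 1: F_F(.; infinity) = 0
        return 0.0
    return ncf.cdf(fcrit, v1, v2, nc) if nc > 0 else f.cdf(fcrit, v1, v2)

def bracket(n):
    """Return (lower, upper) Riemann bounds with n equal subintervals."""
    h = 1.0 / n
    upper = h * sum(integrand(j * h) for j in range(n))        # left sum
    lower = h * sum(integrand((j + 1) * h) for j in range(n))  # right sum
    return lower, upper

lo_coarse, hi_coarse = bracket(50)
lo_fine, hi_fine = bracket(400)
```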
5.2 Analytic Results and Simulations

5.2.1 GLMM(F, G), GLH(F, G)
Different sets of parameter values were chosen for each sort of model and hypothesis. For the special case of the GLMM(F, G) and GLH(F, G) considered in Chapter 2, a set of initial conditions was selected. The model chosen was
$$\underset{N\times 4}{Y} = \left[\,\underset{N\times 2}{F}\ \ \underset{N\times 1}{G}\,\right]B + E. \tag{5.3}$$
Here
$$F = \begin{bmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & \mathbf{1} \end{bmatrix},$$
with all vectors in $F$ having $N/2$ rows. The Type I error rate was set at .05. $V\{[\mathrm{row}_i(E)]'\} = I(4)$, the identity matrix with four rows and columns. $V\{[\mathrm{row}_i(G)]'\} = 1$, the identity matrix with one row; there is only one random predictor in this special case. For hypothesis testing, let $C = [1\ \ {-1}\ \ 1]$, $U = I(4)$, $B_0 = \underset{1\times 4}{\mathbf{0}}$, and
$$B = \begin{bmatrix} B_F \\ B_G \end{bmatrix}.$$
The value of $\delta$ in $B_G$ controls the correlation between the predictors and the single independent variable. $\mathrm{Rank}(B) = 1$, which ensures that $s_* = 1$. In addition, $a = 1$, which means that $s = 1$ and the Muller-Peterson conditional power approximations reduce to exact conditional power results. All of the multivariate tests coincide.
For each entry in the tables that follow, $N$ and $\delta$ were fixed. From the fixed values of $N$, $B$, $\Sigma$, and $\Sigma_G$, 10,000 sets of data were simulated. The SAS NORMAL (SAS, 1990) random number generator was used to produce pseudo-random spherical Gaussian data, which were then linearly transformed to obtain realizations of $G$ and $E$. From these, values of $Y$ were generated. For each sample of data, LINMOD (Christiansen et al., 1995) was used to calculate the values of the Hotelling-Lawley trace, and the Geisser-Greenhouse and uncorrected UNIREP statistics. An additional module was used to calculate the conservative and Huynh-Feldt tests. The p-values were accumulated. The empirical power for each test was calculated as the total number of times that the null hypothesis was rejected, divided by the total number of replicates (10,000). The theoretical values were calculated using Simpson's rule for numerical integration. Details are given in Section 5.1.1.
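The simulation bookkeeping can be sketched on a much smaller scale. The Python code below uses NumPy and SciPy in place of the SAS generator and LINMOD, and a deliberately simplified stand-in model (one Gaussian random predictor, a univariate response, data generated under the null; the sample size, replicate count, and seed are arbitrary). Under the null, the F test is exact conditionally on the realized predictor, so the empirical rejection rate should match the nominal Type I error rate.

```python
# Miniature version of the simulation scheme: simulate a model with a
# random Gaussian predictor, test its coefficient, and accumulate the
# rejection rate.  A simplified univariate stand-in for the dissertation's
# design, used only to show the structure of the simulation loop.
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(2023)
N, reps, alpha = 40, 2000, 0.05
fcrit = f_dist.ppf(1 - alpha, 1, N - 2)
rejections = 0

for _ in range(reps):
    g = rng.standard_normal(N)            # one realization of the random predictor
    y = rng.standard_normal(N)            # response generated under the null
    X = np.column_stack([np.ones(N), g])  # intercept plus random predictor
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse_full = resid @ resid
    yc = y - y.mean()                     # reduced model: intercept only
    sse_red = yc @ yc
    F = (sse_red - sse_full) / (sse_full / (N - 2))
    rejections += F > fcrit

empirical_size = rejections / reps
```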
All of the simulations used 10,000 replications. The approximate half-width for a 95% confidence interval for a power estimated by one of the simulations is $1.96[P(1 - P)/10{,}000]^{0.5}$. This half-width takes on its maximum value, .0098, when $P = .5$. The maximum error in estimating the numerical integrals was set at .0002.
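The half-width arithmetic $1.96[P(1 - P)/10{,}000]^{0.5}$ is easy to reproduce in a few lines of pure Python:

```python
# Half-width of an approximate 95% confidence interval for a simulated
# power P based on 10,000 replicates: 1.96 * sqrt(P*(1 - P)/10000).
import math

def half_width(P, reps=10_000):
    return 1.96 * math.sqrt(P * (1 - P) / reps)

hw_max = half_width(0.5)   # the half-width is maximized at P = 0.5
```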
Table 5.1, (page 75), compares the theoretical prediction for the unconditional
power of the multivariate tests to the simulated values. The maximum difference
between the theoretical prediction and the simulated value was less than .009. Each
difference fell within the 95% confidence interval.
Tables 5.2, 5.3 and 5.4, (pages 76, 77 and 78), compare the theoretical prediction
for the unconditional power of the univariate tests to the simulated values. The
maximum error was less than .1 and in general was the same order of magnitude
as the error in the conditional power approximations. For the Geisser-Greenhouse
statistic, the difference fell inside the 95% confidence interval for all the results for
N=90. The approximation for the unconditional power for the Huynh-Feldt statistic
was less accurate, with the difference falling within the 95% confidence interval for
only 2 of the 12 values examined. All of the differences fell into the 95% confidence
intervals for the conservative test.
5.2.2 GLMM(F, G), GLH(G), s = 1, s* = 1
For the GLMM(F, G), GLH(G), two sets of initial conditions were chosen. First, the unconditional power of the UNIREP and Hotelling-Lawley trace statistics was examined for $s = 1$, $s_* = 1$. The model chosen was
$$\underset{N\times 4}{Y} = \left[\,\underset{N\times 2}{F}\ \ \underset{N\times 2}{G}\,\right]B + E. \tag{5.4}$$
Here
$$F = \begin{bmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & \mathbf{1} \end{bmatrix},$$
with each vector in $F$ of length $N/2$. The Type I error rate was set at .05. $V\{[\mathrm{row}_i(E)]'\} = I(4)$, the identity matrix with four rows and columns. $V\{[\mathrm{row}_i(G)]'\} = I(2)$, the identity matrix with two rows and columns. For hypothesis testing, let $C = [0\ \ 0\ \ 0\ \ 1]$, $U = I(4)$, $B_0 = \underset{1\times 4}{\mathbf{0}}$, and
$$B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ \delta & \delta & \delta & \delta \\ \delta & \delta & \delta & \delta \end{bmatrix}.$$
The values of $\delta$ in $B_G$ control the correlation between the predictors and the independent variables. $\mathrm{Rank}(B) = 1$, which ensures that $s_* = 1$. In addition, $a = 1$, which means that $s = 1$ and the Muller-Peterson conditional power approximations reduce to exact conditional power results.
Table 5.5, (page 79), compares the theoretical prediction for the unconditional power of the Hotelling-Lawley trace to simulated values. The largest difference was of order .01. For 11 out of 12 of the values examined, the differences fell within the 95% confidence intervals.
Tables 5.6, 5.7 and 5.8, (pages 80, 81 and 82), compare the empirical estimates and theoretical predictions for the unconditional power of the conservative, Geisser-Greenhouse, and Huynh-Feldt statistics. The largest difference was of order .01. For the conservative test, all of the differences fell within the 95% confidence intervals. None of the differences for the Huynh-Feldt test did, and only 2 of the 12 power values for the Geisser-Greenhouse test fell within the 95% confidence interval. This reflects the inaccuracy of the conditional power approximations.
5.2.3 GLMM(F, G), GLH(G), s > 1, s* = 1
Another set of initial conditions was used to examine the performance of the UNIREP and Hotelling-Lawley trace statistics for $s > 1$, $s_* = 1$. The model was the same as in Equation 5.4. Alpha was set at .05. $V\{[\mathrm{row}_i(E)]'\} = I(4)$, the identity matrix with four rows and columns. $V\{[\mathrm{row}_i(G)]'\} = I(2)$, the identity matrix with two rows and columns. For hypothesis testing,
$$C = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad U = I(4), \quad B_0 = \underset{2\times 4}{\mathbf{0}},$$
and
$$B = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ 2 & 2 & 2 & 2 \\ 3 & 3 & 3 & 3 \end{bmatrix}.$$
$\mathrm{Rank}(B) = 1$, which ensures that $s_* = 1$. In this case, $a = 2$ and $b = 4$, which means that $s > 1$ and the Muller-Peterson conditional power results are approximate.
Table 5.9, (page 83) compares the empirical estimates and theoretical predictions for the unconditional power of the Hotelling-Lawley trace. The values are very
inaccurate for $N = 10$, but the accuracy improves as $N$ becomes larger. None of the values fell within the 95% confidence intervals; this is expected, as the conditional power is approximate.
Tables 5.10, 5.11, 5.12, (pages 84, 85 and 86), compare the empirical estimates and theoretical predictions for the unconditional power of the conservative, Geisser-Greenhouse, and Huynh-Feldt statistics. The differences are of order .01. Few values
fell within the 95% confidence limits, as expected, since the conditional powers are
approximate.
5.3 Impact of Approximating Conditional Power
As one can see in Table 5.5, (page 79), when the density for the noncentrality is exact, and the conditional power is exact, the analytic results are exact, and the algorithm used for calculating them matches the simulated values very well. However, in Table 5.9, (page 83), although the density for the noncentrality is exact, the conditional power value used is only approximate, and thus the unconditional power is approximate. Thus, the algorithm's results do not match the simulated values very well for $N = 10$. They are much closer at $N = 50$, and provide very good estimates for $N = 90$.
Much of the error is in fact due to the conditional power approximation. To evaluate the extent of this error, simulations were conducted for $s > 1$, $s_* = 1$, and $N = 10, 90$. A GLMM(F) was simulated. An effort was made to match the conditions of the GLMM(F, G), GLH(G) shown in Table 5.9, (page 83), as closely as possible. The initial conditions are mostly the same as in Section 5.2.3. The only difference is that there are no random predictors in $X$, and
$$X = \begin{bmatrix} I(4) \\ \underset{(N-4)\times 4}{\mathbf{0}} \end{bmatrix}.$$
The GLMM(F) was simulated as described in Section 5.2.1. The Muller-Peterson conditional power estimates were generated by using the initial conditions as inputs for POWERLIB (Keyes and Muller, 1992).
Figure 5.1, (page 95), shows power curves corresponding to the simulated conditional power and to the Muller-Peterson approximate conditional power (Muller and Peterson, 1984). One can see that for $N = 10$, the Muller-Peterson approximation consistently underestimates the power. The extent of the error is similar to that observed in Table 5.9, (page 83). The accuracy of the unconditional power approximation strongly depends on the accuracy of the conditional power approximation.
5.4 Comparison of Asymptotic Results to Small Sample Analytic Results Under Pitman Local Alternatives
Chapter 4 contains an analytic derivation of asymptotic results for unconditional
power under local alternatives. The model considered is the GLMM(F, G), with
the GLH(G). If the algorithms used to calculate small sample unconditional power
in Section 5.2 are correct, under local alternatives, the small sample unconditional
power estimates should approach the limits predicted by the asymptotic theory. This convergence is what is examined in this section.
5.4.1 GLMM(F, G), GLH(G), s = 1, s* = 1
The model and choice of parameters were the same as in Section 5.2.2. From these
parameters, the asymptotic limit for the multivariate tests, the conservative test, the
Huynh-Feldt test and the Geisser-Greenhouse test were calculated using the formulas
in Table 4.1. Local alternatives were produced by considering alternatives of the form $\Theta/\sqrt{n}$, as specified by Sen and Singer (1993, p.238). The small sample unconditional power results were calculated using the local alternatives and the same set of parameters.
Table 5.13, (page 87) has results for the Hotelling-Lawley trace. Tables 5.14,
5.15, and 5.16, (pages 88, 89 and 90) have results for the UNIREP tests. Under
local alternatives, the small sample unconditional power results approach the results
predicted by the asymptotic theory very well.
5.4.2 GLMM(F, G), GLH(G), s > 1, s* = 1
The model and choice of parameters were the same as in Section 5.2.3. From these
parameters, the asymptotic limit for the multivariate tests, the conservative test and
the Geisser-Greenhouse test were calculated using the formulas in Table 4.1.
Local alternatives were produced by considering alternatives of the form $\Theta/\sqrt{n}$, as specified by Sen and Singer (1993, p.238). The small sample unconditional power results were calculated using the local alternatives and the same set of parameters.
Table 5.17, (page 91), has results for the Hotelling-Lawley trace. Tables 5.18,
5.19 and 5.20, (pages 92, 93 and 94), have results for the UNIREP statistics. These tables show that under local alternatives, the small sample unconditional power results approach the results predicted by the asymptotic theory very well.
5.5 Conditional and Unconditional Power
Recall that conditional power is calculated for observed, and thus fixed, values of
X. Unconditional power is the average power that one would expect to see over
all possible realizations of an experiment with random predictors. Each distinct
realization of the predictors produces a distinct value of ω.
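The definition can be sketched as a Monte Carlo average. The example below uses a hypothetical simple linear regression with one Gaussian predictor and known error variance (an illustration, not the dissertation's algorithm): each realization of the predictor yields a conditional noncentrality and hence a conditional power, and unconditional power is the mean over realizations.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N, beta, sigma, alpha = 10, 0.5, 1.0, 0.05   # illustrative design and effect
f_crit = stats.f.ppf(1 - alpha, 1, N - 2)

powers = []
for _ in range(2000):
    x = rng.standard_normal(N)               # one stochastic realization of the predictor
    omega = beta**2 * np.sum((x - x.mean())**2) / sigma**2  # conditional noncentrality
    powers.append(stats.ncf.sf(f_crit, 1, N - 2, omega))    # conditional power

print("unconditional power ~", float(np.mean(powers)))
```

Averaging the conditional powers estimates the expectation over all possible predictor realizations.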
Because the actual density of the scalar noncentrality for the Hotelling-Lawley
trace is given in Chapter 3, one can derive the noncentrality values that would be
produced by extreme values of observed X. In the linear case, ω_HLT = Kχ²(ν₃)
(Equation 3.3). Abbreviate ω_HLT by ω, and let q = Pr{ω ≤ ω_q}. Then

    ω_q = K F_χ²⁻¹(q; ν₃),    equivalently    q = F_χ²(K⁻¹ω_q; ν₃).

One would have observed a noncentrality parameter that small or smaller in only
100q% of all experiments with this set of random predictors. Thus, one can calculate
the conditional power one would have obtained if one had observed a noncentrality
parameter of ω_q,

    P_c = F_F(f_crit,HLT; ab, s[(N − r) − b − 1] + 2, ω_q).    (5.5)
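A numeric sketch of this quantile construction (K, ν₃, and the hypothesis dimensions below are illustrative placeholders, not values from any example in this chapter): a chi-squared quantile gives ω_q, and a noncentral F tail probability gives the corresponding conditional power.

```python
from scipy import stats

# Illustrative placeholders: scale K and chi-squared df nu3 for omega = K * chi2(nu3),
# plus hypothesis dimensions a, b, s, design rank r, and sample size N.
K, nu3 = 2.0, 4
a, b, s, r, N = 1, 2, 1, 3, 40
alpha = 0.05

df1 = a * b
df2 = s * ((N - r) - b - 1) + 2
f_crit = stats.f.ppf(1 - alpha, df1, df2)

for p in (0.05, 0.95):
    omega_q = K * stats.chi2.ppf(p, nu3)          # omega_q = K * F_chi2^{-1}(p; nu3)
    cond_power = stats.ncf.sf(f_crit, df1, df2, omega_q)
    print(p, round(float(omega_q), 4), round(float(cond_power), 4))
```

The 5th-percentile noncentrality yields the lower conditional power curve, the 95th percentile the upper one.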
Figures 5.2, 5.3, and 5.4 (pages 96, 97, and 98) show the unconditional and two
conditional power curves for the Hotelling-Lawley trace for various sample sizes. In
each figure, the horizontal axis is a function of the correlation between the independent and dependent variables. For each fixed correlation between the dependent and
independent random variables, two values of the conditional power are shown. The
bottom curve is the conditional power one would have obtained if one had observed a
noncentrality parameter so small that for 95% of all experiments with these random
predictors, the scalar noncentrality would have been at least that large. The top
curve is the conditional power one would have obtained if one had observed a
noncentrality parameter so large that for only 5% of all experiments with these
predictors, the scalar noncentrality would have been at least that large. The curves
are not symmetric around the unconditional power curve, since the underlying χ²
distribution is not symmetric, and power is bounded.
This graph demonstrates the extent to which one can over- or underestimate
power by failing to account for the randomness of the predictors. This effect depends
on sample size.
The choice of conditional or unconditional power can have a dramatic effect
on estimating sample size. For the example shown in Figures 5.2, 5.3 and 5.4, an
additional analysis was run to calculate how large the sample should be to achieve at
least 80% power for an effect size of .25. Since one needs to choose whole numbers for
sample size, one typically rounds up from the nearest fractional estimate, or chooses
power that is as large or larger than the target power. For the Hotelling-Lawley trace,
and this example of the GLMM(F, G), the unconditional sample size estimate was
58. If one had used the conditional power that corresponded to the 95th percentile
of the noncentrality, one would have chosen a sample size of 42. If one had used the
conditional power that corresponded to the 5th percentile of the noncentrality, one
would have chosen a sample size of 73.
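The round-up rule amounts to scanning whole-number sample sizes until the target power is first met. A sketch with a stand-in power function (the noncentrality model below is invented for illustration and is not the Hotelling-Lawley calculation behind the 58/42/73 figures):

```python
from scipy import stats

def power(N, effect=0.25, df1=2, alpha=0.05):
    # Stand-in power model: noncentrality grows linearly with N (illustrative only).
    df2 = N - 4
    omega = N * effect
    f_crit = stats.f.ppf(1 - alpha, df1, df2)
    return stats.ncf.sf(f_crit, df1, df2, omega)

def smallest_n(target=0.80, start=6):
    # Choose the first whole sample size whose power meets or exceeds the target.
    N = start
    while power(N) < target:
        N += 1
    return N

print(smallest_n())
```

The same scan could be run with the unconditional, conditional-at-ω.95, or conditional-at-ω.05 power function, producing the three different sample size recommendations described above.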
Table 5.1:
Hotelling-Lawley
GLMM(F, G), GLH(F, G), Special Case
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.25     0.0594      0.0564      -.0030      0.0046
 10   0.50     0.0894      0.0889      -.0005      0.0056
 10   0.75     0.1435      0.1426      -.0009      0.0069
 10   1.00     0.2232      0.2266       0.0034     0.0082
 50   0.10     0.0682      0.0640      -.0042      0.0049
 50   0.15     0.0932      0.0917      -.0015      0.0057
 50   0.20     0.1319      0.1324       0.0005     0.0066
 50   0.25     0.1864      0.1880       0.0016     0.0076
 90   0.10     0.0860      0.0830      -.0030      0.0055
 90   0.15     0.1385      0.1390       0.0005     0.0068
 90   0.20     0.2217      0.2221       0.0004     0.0081
 90   0.25     0.3363      0.3396       0.0033     0.0093
Table 5.2:
Geisser-Greenhouse
GLMM(F, G), GLH(F, G), Special Case
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.25     0.0502      0.0348      -.0154      0.0043
 10   0.50     0.1071      0.0789      -.0282      0.0061
 10   0.75     0.2271      0.1829      -.0442      0.0082
 10   1.00     0.4097      0.3518      -.0579      0.0096
 50   0.10     0.0672      0.0616      -.0056      0.0049
 50   0.15     0.0939      0.0853      -.0086      0.0057
 50   0.20     0.1355      0.1269      -.0086      0.0067
 50   0.25     0.1946      0.1887      -.0059      0.0078
 90   0.10     0.0861      0.0821      -.0040      0.0055
 90   0.15     0.1407      0.1340      -.0067      0.0068
 90   0.20     0.2273      0.2269      -.0004      0.0082
 90   0.25     0.3466      0.3465      -.0001      0.0093
Table 5.3:
Huynh-Feldt
GLMM(F, G), GLH(F, G), Special Case
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.25     0.1118      0.0659      -.0459      0.0062
 10   0.50     0.2046      0.1383      -.0663      0.0079
 10   0.75     0.3678      0.2780      -.0898      0.0095
 10   1.00     0.5696      0.4668      -.1028      0.0097
 50   0.10     0.0783      0.0674      -.0109      0.0053
 50   0.15     0.1078      0.0929      -.0149      0.0061
 50   0.20     0.1531      0.1382      -.0149      0.0071
 50   0.25     0.2163      0.2016      -.0147      0.0081
 90   0.10     0.0931      0.0865      -.0066      0.0057
 90   0.15     0.1502      0.1405      -.0097      0.0070
 90   0.20     0.2399      0.2343      -.0056      0.0084
 90   0.25     0.3617      0.3554      -.0063      0.0094
Table 5.4:
Conservative
GLMM(F, G), GLH(F, G), Special Case
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.25     0.0034      0.0027      -.0007      0.0011
 10   0.50     0.0109      0.0103      -.0006      0.0020
 10   0.75     0.0359      0.0353      -.0006      0.0036
 10   1.00     0.1003      0.0996      -.0007      0.0059
 50   0.10     0.0062      0.0051      -.0011      0.0015
 50   0.15     0.0107      0.0091      -.0016      0.0020
 50   0.20     0.0193      0.0184      -.0009      0.0027
 50   0.25     0.0346      0.0344      -.0002      0.0036
 90   0.10     0.0095      0.0088      -.0007      0.0019
 90   0.15     0.0210      0.0207      -.0003      0.0028
 90   0.20     0.0456      0.0460       0.0004     0.0041
 90   0.25     0.0926      0.0909      -.0017      0.0057
Table 5.5:
Hotelling-Lawley
GLMM(F, G), GLH(G)
s = 1, s* = 1
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.20     0.0693      0.0728       0.0035     0.0050
 10   0.80     0.3611      0.3667       0.0056     0.0094
 10   1.40     0.7217      0.7191      -.0026      0.0088
 10   2.00     0.9014      0.8956      -.0058      0.0058
 50   0.10     0.1506      0.1451      -.0055      0.0070
 50   0.20     0.5178      0.5147      -.0031      0.0098
 50   0.30     0.8760      0.8758      -.0002      0.0065
 50   0.40     0.9977      0.9873      -.0104      0.0009
 90   0.05     0.0949      0.0942      -.0007      0.0057
 90   0.10     0.2663      0.2587      -.0076      0.0087
 90   0.15     0.5629      0.5618      -.0011      0.0097
 90   0.20     0.8318      0.8292      -.0026      0.0073
Table 5.6:
Geisser-Greenhouse
GLMM(F, G), GLH(G)
s = 1, s* = 1
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.10     0.0407      0.0259      -.0148      0.0039
 10   0.40     0.2354      0.1827      -.0527      0.0083
 10   0.70     0.6300      0.5665      -.0635      0.0095
 10   1.00     0.8697      0.8343      -.0354      0.0066
 50   0.05     0.0713      0.0655      -.0058      0.0050
 50   0.10     0.1559      0.1470      -.0089      0.0071
 50   0.20     0.5486      0.5321      -.0165      0.0098
 50   0.25     0.7601      0.7498      -.0103      0.0084
 90   0.05     0.0953      0.0894      -.0059      0.0058
 90   0.10     0.2738      0.2610      -.0128      0.0087
 90   0.15     0.5797      0.5733      -.0064      0.0097
 90   0.20     0.8467      0.8370      -.0097      0.0071
Table 5.7:
Huynh-Feldt
GLMM(F, G), GLH(G)
s = 1, s* = 1
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.10     0.0718      0.0589      -.0129      0.0051
 10   0.40     0.3217      0.2903      -.0314      0.0092
 10   0.70     0.7130      0.6866      -.0264      0.0089
 10   1.00     0.9084      0.8983      -.0101      0.0057
 50   0.05     0.0849      0.0725      -.0124      0.0055
 50   0.10     0.1783      0.1584      -.0199      0.0075
 50   0.20     0.5819      0.5491      -.0328      0.0097
 50   0.25     0.7849      0.7627      -.0222      0.0081
 90   0.05     0.1040      0.0945      -.0095      0.0060
 90   0.10     0.2898      0.2701      -.0197      0.0089
 90   0.15     0.5975      0.5836      -.0139      0.0096
 90   0.20     0.8570      0.8442      -.0128      0.0069
Table 5.8:
Conservative
GLMM(F, G), GLH(G)
s = 1, s* = 1
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.10     0.0026      0.0017      -.0009      0.0010
 10   0.40     0.0420      0.0384      -.0036      0.0039
 10   0.70     0.2791      0.2724      -.0067      0.0088
 10   1.00     0.6172      0.6139      -.0033      0.0095
 50   0.05     0.0070      0.0064      -.0006      0.0016
 50   0.10     0.0245      0.0239      -.0006      0.0030
 50   0.20     0.2149      0.2121      -.0028      0.0081
 50   0.25     0.4263      0.4231      -.0032      0.0097
 90   0.05     0.0114      0.0118       0.0004     0.0021
 90   0.10     0.0627      0.0594      -.0033      0.0048
 90   0.15     0.2403      0.2353      -.0050      0.0084
 90   0.20     0.5540      0.5524      -.0016      0.0097
Table 5.9:
Hotelling-Lawley
GLMM(F, G), GLH(G)
s > 1, s* = 1
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.60     0.4343      0.8355       0.4012     0.0097
 10   1.20     0.8873      0.9917       0.1044     0.0062
 10   1.90     0.9869      1.0000       0.0131     0.0022
 50   0.06     0.4102      0.4756       0.0654     0.0096
 50   0.12     0.9683      0.9824       0.0141     0.0034
 50   0.20     1.0000      1.0000       0.0000     0.0001
 90   0.04     0.3789      0.3998       0.0209     0.0095
 90   0.07     0.8960      0.9167       0.0207     0.0060
 90   0.10     0.9974      0.9986       0.0012     0.0010
Table 5.10:
Geisser-Greenhouse
GLMM(F, G), GLH(G)
s > 1, s* = 1
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.10     0.1376      0.0976      -.0400      0.0068
 10   0.20     0.5590      0.4776      -.0814      0.0097
 10   0.30     0.8661      0.8261      -.0400      0.0067
 50   0.02     0.0817      0.0748      -.0069      0.0054
 50   0.04     0.2227      0.2129      -.0098      0.0082
 50   0.08     0.7858      0.7726      -.0132      0.0080
 90   0.02     0.1191      0.1109      -.0082      0.0063
 90   0.04     0.4219      0.4020      -.0199      0.0097
 90   0.05     0.6377      0.6229      -.0148      0.0094
Table 5.11:
Huynh-Feldt
GLMM(F, G), GLH(G)
s > 1, s* = 1
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.10     0.2162      0.1892      -.0270      0.0081
 10   0.20     0.6638      0.6320      -.0318      0.0093
 10   0.30     0.9117      0.8988      -.0129      0.0056
 50   0.02     0.0985      0.0851      -.0134      0.0058
 50   0.04     0.2532      0.2311      -.0221      0.0085
 50   0.08     0.8116      0.7867      -.0249      0.0077
 90   0.02     0.1304      0.1155      -.0149      0.0066
 90   0.04     0.4422      0.4162      -.0260      0.0097
 90   0.05     0.6565      0.6328      -.0237      0.0093
Table 5.12:
Conservative
GLMM(F, G), GLH(G)
s > 1, s* = 1
Theoretical and Empirical Unconditional Power

 N    δ      Theoretical  Empirical  Difference  Half Width
 10   0.10     0.0097      0.0072      -.0025      0.0019
 10   0.20     0.1576      0.1550      -.0026      0.0071
 10   0.30     0.5280      0.5214      -.0066      0.0098
 50   0.02     0.0051      0.0043      -.0008      0.0014
 50   0.04     0.0292      0.0289      -.0003      0.0033
 50   0.08     0.3961      0.3931      -.0030      0.0096
 90   0.02     0.0101      0.0094      -.0007      0.0020
 90   0.04     0.0988      0.0962      -.0026      0.0058
 90   0.05     0.2363      0.2301      -.0062      0.0083
Table 5.13:
Hotelling-Lawley
GLMM(F, G), GLH(G)
s = 1, s* = 1
Asymptotic and Small Sample Unconditional Power
Pitman Local Alternatives

 δ       N = 10     N = 100    N = 1000   N = 10000      ∞
 0.10    0.050547   0.051894   0.052032   0.052046   0.051970
 0.40    0.057669   0.081386   0.083894   0.084145   0.084101
 0.70    0.073754   0.158457   0.167701   0.168629   0.168674
Table 5.14:
Conservative
GLMM(F, G), GLH(G)
s = 1, s* = 1
Asymptotic and Small Sample Unconditional Power
Pitman Local Alternatives

 δ       N = 10     N = 100    N = 1000   N = 10000      ∞
 0.10    0.001879   0.004125   0.004334   0.004354   0.004276
 0.40    0.003106   0.008851   0.009500   0.009565   0.009492
 0.70    0.007383   0.026389   0.028734   0.028971   0.028921
Table 5.15:
Geisser-Greenhouse
GLMM(F, G), GLH(G)
s = 1, s* = 1
Asymptotic and Small Sample Unconditional Power
Pitman Local Alternatives

 δ       N = 10     N = 100    N = 1000   N = 10000      ∞
 0.10    0.032961   0.050960   0.051943   0.052037   0.051970
 0.40    0.046218   0.081295   0.083890   0.084145   0.084101
 0.70    0.081545   0.161115   0.167976   0.168657   0.168674
Table 5.16:
Huynh-Feldt
GLMM(F, G), GLH(G)
s = 1, s* = 1
Asymptotic and Small Sample Unconditional Power
Pitman Local Alternatives

 δ       N = 10     N = 100    N = 1000   N = 10000      ∞
 0.10    0.059859   0.056008   0.052423   0.052085   0.051970
 0.40    0.080216   0.088250   0.084560   0.084212   0.084101
 0.70    0.130900   0.171900   0.169028   0.168762   0.168674
Table 5.17:
Hotelling-Lawley
GLMM(F, G), GLH(G)
s > 1, s* = 1
Asymptotic and Small Sample Unconditional Power
Pitman Local Alternatives

 δ       N = 10     N = 100    N = 1000   N = 10000      ∞
 0.20    0.053593   0.121805   0.133555   0.134783   0.134905
 0.40    0.064759   0.428273   0.488327   0.494454   0.495135
 0.60    0.084611   0.825085   0.887158   0.892634   0.893248
 0.80    0.114356   0.980167   0.993964   0.994774   0.994876
Table 5.18:
Conservative
GLMM(F, G), GLH(G)
s > 1, s* = 1
Asymptotic and Small Sample Unconditional Power
Pitman Local Alternatives

 δ       N = 10     N = 100    N = 1000   N = 10000      ∞
 0.20    0.002860   0.011746   0.012971   0.013097   0.012948
 0.40    0.023990   0.123450   0.137109   0.138506   0.138541
 0.60    0.128684   0.525358   0.572287   0.577033   0.577586
 0.80    0.346441   0.899233   0.934025   0.937160   0.937650
Table 5.19:
Geisser-Greenhouse
GLMM(F, G), GLH(G)
s > 1, s* = 1
Asymptotic and Small Sample Unconditional Power
Pitman Local Alternatives

 δ       N = 10     N = 100    N = 1000   N = 10000      ∞
 0.20    0.064519   0.129218   0.134461   0.134978   0.134905
 0.40    0.226264   0.470868   0.492727   0.494904   0.495135
 0.60    0.512613   0.866157   0.890527   0.892866   0.893248
 0.80    0.756694   0.988752   0.994247   0.994674   0.994876
Table 5.20:
Huynh-Feldt
GLMM(F, G), GLH(G)
s > 1, s* = 1
Asymptotic and Small Sample Unconditional Power
Pitman Local Alternatives

 δ       N = 10     N = 100    N = 1000   N = 10000      ∞
 0.20    0.114129   0.139791   0.135486   0.135080   0.134905
 0.40    0.325357   0.489272   0.494516   0.495082   0.495135
 0.60    0.621506   0.875475   0.891339   0.892946   0.893248
 0.80    0.829674   0.989937   0.994315   0.994680   0.994876
Figure 5.1:
GLMM(F, G), GLH(G)
Muller-Peterson and Empirical Conditional Power
s = 1, s* = 1, N = 10

[Plot: power (0.0 to 1.0) versus effect size (0 to 3); solid curve = empirical power, dotted curve = Muller-Peterson approximation.]
Figure 5.2:
GLMM(F, G), GLH(G)
s = 1, s* = 1
Hotelling-Lawley
N = 10
Unconditional and Conditional Power Curves

[Plot: power (0.0 to 1.0) versus effect size (0.0 to 2.0); solid curve = unconditional power, with conditional power curves at noncentrality quantiles ω.95 (dotted) and ω.05 (dashed).]
Figure 5.3:
GLMM(F, G), GLH(G)
s = 1, s* = 1
Hotelling-Lawley
N = 50
Unconditional and Conditional Power Curves

[Plot: power (0.0 to 1.0) versus effect size (0.00 to 0.25); solid curve = unconditional power, with conditional power curves at noncentrality quantiles ω.95 (dotted) and ω.05 (dashed).]
Figure 5.4:
GLMM(F, G), GLH(G)
s = 1, s* = 1
Hotelling-Lawley
N = 90
Unconditional and Conditional Power Curves

[Plot: power (0.0 to 1.0) versus effect size (0.00 to 0.25); solid curve = unconditional power, with conditional power curves at noncentrality quantiles ω.95 (dotted) and ω.05 (dashed).]
CHAPTER 6
POWER ANALYSIS EXAMPLE:
BONE DENSITY DATA
It is always interesting to apply theoretical results to a real example, in order to
demonstrate how the new results would be used in practice. In this chapter, both
conditional and unconditional power techniques are used to examine the power of a
small clinical trial. The trial is designed to examine bone mineral density (BMD) in
cystic fibrosis patients who have had lung transplants.
Cystic fibrosis typically decreases BMD, which can lead to bone fractures.
Ontjes et al. (1994) planned a clinical trial in which spine BMD will be measured at
baseline, six months, and one year. All patients are being treated with calcium plus
vitamin D. Stratified by gender, patients were assigned to treatment with a drug
designed to increase bone mass.
The model planned for the study is a repeated measures analysis of variance.
The densities of the spine at six and twelve months are the outcomes of interest, and
gender, treatment, their interaction, and baseline spine BMD are used as predictors.
Seven men and six women are currently enrolled in the study. Five of the men and
five of the women were assigned to drug treatment. Because the number of people
in each gender/treatment group was fixed by design before the study began, gender,
treatment, and their interaction are fixed predictors. Baseline BMD is a random
continuous predictor, whose distribution is approximately Gaussian. The mix of
random and fixed predictors means that this is a GLMM(F, G).
The study is designed to answer the following questions. Is baseline BMD a
good predictor of BMD? Is there an interaction of treatment with gender?
The
first question is a GLH(G), since baseline BMD is a random predictor. The second
question is a GLH(F), since treatment and gender are fixed.
Because the study
involves repeated measures, the Geisser-Greenhouse test for the univariate approach
to repeated measures was selected as the test statistic.
Because this trial involves both random and fixed predictors, and was designed
to be analyzed with a GLMM, it is an excellent example of the sort of study where
unconditional power methods can be used. BMD was measured by a very accurate
technique, dual energy X-ray absorptiometry, so the assumption made in this dissertation that the variables are measured without any error is reasonable. Because the
subjects are members of a very rare group, accurate power calculations are essential.
The model is given by

    [ Y_6monthBMD  Y_12monthBMD ] = [ Z1  Z2  Z3  Z4  Z5 ] B + ℰ,

where each row of the design matrix takes one of the forms

    ( 1  0  0  0  Z_mp )    (male, placebo)
    ( 1  0  1  0  Z_mt )    (male, treated)
    ( 1  1  0  0  Z_fp )    (female, placebo)
    ( 1  1  1  1  Z_ft )    (female, treated)

and

    B = [ β_{6,0}  β_{12,0}
          β_{6,1}  β_{12,1}
          β_{6,2}  β_{12,2}
          β_{6,3}  β_{12,3}
          β_{6,4}  β_{12,4} ],
where Z1 is the intercept, Z2 is an indicator variable for female gender that is 1 if the
subject is female, and 0 otherwise, Z3 is an indicator variable for treatment that is 1 if
the subject is receiving drug, and 0 otherwise, Z4 represents the interaction between
female gender and treatment, and Z5 is the measurement of BMD at baseline. For
the measurements at six months, β_{6,0} is the intercept for males on placebo, β_{6,1} is
the effect of being female, β_{6,2} is the effect of being treated, β_{6,3} is the effect of a
treatment-gender interaction, and β_{6,4} is the slope of the baseline. β_{12,0}, β_{12,1}, β_{12,2},
β_{12,3} and β_{12,4} are the analogous coefficients for the twelve month measurements.
One can express the hypotheses of interest in terms of this model. The hypothesis about the interaction is a GLH(F). Stating it in words, H01: No effect of the
interaction between treatment and gender on the outcomes, vs. HA1: Some effect of
the interaction between treatment and gender on the outcomes. Define

    C1 = [ 0  0  0  1  0 ].

Then the first hypothesis can be written as

    H01 : C1 B U = 0    vs.    HA1 : C1 B U ≠ 0.    (6.1)
vs.
The second hypothesis of interest is a GLH(G). In words, H02: No effect of
baseline on outcomes, vs. HA2: Some effect of baseline on outcomes. To express the
same thing mathematically, define

    C2 = [ 0  0  0  0  1 ].

Then the hypothesis can be restated as

    H02 : C2 B U = 0    vs.    HA2 : C2 B U ≠ 0.    (6.2)
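The two contrast rows simply pick off the interaction row and the baseline-slope row of B. A small numpy check (the entries of B are made up for illustration):

```python
import numpy as np

# Rows of B: intercept, female, treatment, interaction, baseline slope;
# columns: 6-month and 12-month BMD (illustrative values, not the study's).
B = np.array([[1.00, 1.02],
              [0.05, 0.04],
              [0.10, 0.12],
              [0.02, 0.03],
              [0.80, 0.75]])

C1 = np.array([[0, 0, 0, 1, 0]])  # GLH(F): gender-by-treatment interaction
C2 = np.array([[0, 0, 0, 0, 1]])  # GLH(G): baseline BMD slope

print(C1 @ B)  # the interaction coefficients tested by H01
print(C2 @ B)  # the baseline-slope coefficients tested by H02
```

Each product extracts the pair of six- and twelve-month coefficients that the corresponding hypothesis sets to zero.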
Before the results in this dissertation were derived, statisticians conducting a
power analysis for a GLMM with random predictors had two choices. The first is the
conditional power method. In this approach, one conditions on a particular realization
of the predictors, and considers the model

    Y = [ F  G_obs ] B + ℰ,

where G_obs is a fixed matrix of observed or predicted values. Since the predictors are
thought of as fixed, the Muller-Peterson or Muller-Barton F approximations can be
used to estimate the power for either a GLH(G) or a GLH(F). If one actually observed
those values, this method would provide valid conditional power estimation. However,
one never knows before the experiment is run what the values of the predictors will be.
The second option was to do a conditional power analysis, but assume that the use
of the random covariates reduced the error variance. Why would that be? The logic
went as follows. By assumption, Y = F B_F + G B_G + ℰ. Then

    V{[row_i(ℰ)]′} = Σ − B_G′ Σ_G B_G = Σ − Σ_GY′ Σ_G⁻¹ Σ_GY,

where Σ_GY is the covariance between [row_i(G)]′ and [row_i(Y)]′. Thus, adding a
random covariate should reduce the error variance. This is the adjusted variance
method. This method could only be used for a GLH(F).
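The claimed reduction is the usual conditional-covariance identity for Gaussian variables. A numpy sketch verifying that the two forms agree, i.e. that B_G′Σ_G B_G = Σ_GY′Σ_G⁻¹Σ_GY when B_G = Σ_G⁻¹Σ_GY (the covariance matrices below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
g, p = 2, 3                              # random predictors and outcomes (illustrative)

A = rng.standard_normal((g, g))
Sigma_G = A @ A.T + g * np.eye(g)        # positive definite predictor covariance
Sigma_GY = rng.standard_normal((g, p))   # cross-covariance between predictors and Y

B_G = np.linalg.solve(Sigma_G, Sigma_GY)             # B_G = Sigma_G^{-1} Sigma_GY
lhs = B_G.T @ Sigma_G @ B_G                          # variance reduction, slope form
rhs = Sigma_GY.T @ np.linalg.solve(Sigma_G, Sigma_GY)  # variance reduction, covariance form

print(np.allclose(lhs, rhs))
```

Substituting B_G = Σ_G⁻¹Σ_GY into the slope form collapses it to the covariance form, which is why the check prints True.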
For the purposes of comparison, for the GLH(F), a power analysis was conducted
using the unconditional power method of Chapter 2, the conditional power method,
and the adjusted variance method. For the GLH(G), the unconditional power method
of Chapter 3 and the conditional power method were used. The conditional power
method and the unconditional power method used the observed values of the baseline
covariate for the thirteen people currently in the study. For all the power analyses,
B, Σ and Σ_G were assumed to be known. Reasonable population values for B_G,
Σ and Σ_G were selected based on pilot data given in the grant proposal (Ontjes et
al., 1994). It was assumed that the variances of the baseline, six and twelve month
BMDs were equal, and that the correlation between any pair of measurements was
equal. Thus, the variance structure was compound symmetric. The sample size was
set to be 39, and the effect size was varied to examine the effect on power.
Figure 6.1 (page 104) shows the power curves for the GLH(F) under the unconditional, conditional, and adjusted variance methods. Note that the difference in power
equals the vertical distance between the curves for a fixed effect size. The fact that
the curves do not all coincide demonstrates that one must consider the random nature
of the predictors when doing power analysis for a GLMM(F, G).
Figure 6.2 (page 105) shows the power curves for the GLH(G) under the unconditional and conditional methods. The wide discrepancy between the two curves is possible with random predictors. Extreme observed values of X can lead to conditional
power predictions that are very different from the unconditional power estimates. In
this case, the conditional power predictions are much lower than the unconditional
ones, and the power is much higher than the investigators had hoped for. The study
is currently under way, so the sample size has already been fixed. In the future, sample size calculations for such studies should be made with the unconditional power
results presented here.
We are planning to write and distribute software that will allow statisticians
to use these methods for calculating unconditional power. This software will be an
extension of the widely used package distributed by Keyes et al. (1992). The program
user will need to decide which predictors are fixed and which are random, using the
definitions from the first page of the dissertation. Then, they will need to specify
B, Σ, Σ_G, C, U, Θ0, α, and F. These are, respectively, the matrix of slopes and intercepts, the
error covariance matrix, the covariance matrix of the random predictors, the matrices
that specify linear combinations of the slopes and intercepts for the hypotheses, the
null hypothesis matrix, the type one error rate and the matrix of fixed predictors.
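The list of required inputs can be collected in a single structure; the sketch below is a hypothetical interface written for illustration, not the planned program's actual design (all field names and example values are invented).

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class UnconditionalPowerInputs:
    """Hypothetical bundle of inputs for an unconditional power calculation."""
    B: np.ndarray        # matrix of slopes and intercepts
    Sigma: np.ndarray    # error covariance matrix
    Sigma_G: np.ndarray  # covariance matrix of the random predictors
    C: np.ndarray        # between-subject contrast matrix
    U: np.ndarray        # within-subject contrast matrix
    Theta0: np.ndarray   # null hypothesis matrix
    alpha: float         # type one error rate
    F: np.ndarray        # matrix of fixed predictors

# Example instantiation with placeholder shapes matching the bone density model.
inputs = UnconditionalPowerInputs(
    B=np.zeros((5, 2)), Sigma=np.eye(2), Sigma_G=np.eye(1),
    C=np.array([[0, 0, 0, 0, 1]]), U=np.eye(2),
    Theta0=np.zeros((1, 2)), alpha=0.05, F=np.ones((13, 4)))
print(inputs.alpha)
```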
Figure 6.1:
GLH(F) and GLMM(F, G)
Bone density data
Power curves for the effect of gender-treatment interaction
Geisser-Greenhouse statistic
N = 39

[Plot: power (0.0 to 1.0) versus effect size (0.0 to 0.5); curves for the adjusted variance (dotted), conditional (solid), and unconditional (dashed) methods.]
Figure 6.2:
GLH(G) and GLMM(F, G)
Bone density data
Power curves for the effect of baseline BMD
Geisser-Greenhouse statistic
N = 39

[Plot: power (0.0 to 1.0) versus effect size (0.1 to 1.3); curves for the conditional (dotted) and unconditional (solid) methods.]
CHAPTER 7
SUMMARY AND DISCUSSION
This dissertation established a general framework for classifying general linear multivariate models with fixed and random predictors. For special cases of the
GLMM(F, G) and GLH(F, G), exact unconditional power results were derived for
the multivariate tests, and approximate ones for the UNIREP statistics. For the
GLMM(F, G), GLH(G), exact and approximate small sample results were derived for
the Hotelling-Lawley trace and the UNIREP statistics for all noncentralities, and for
Wilks' Lambda and the Pillai-Bartlett trace in the linear case. Pitman local alternatives were used to derive asymptotic unconditional power results for the GLMM(F,
G), and conditional power results for the GLMM(F) for the multivariate tests and
the UNIREP statistics. Simulations and a power analysis for a clinical trial were used
to examine the performance of numerical algorithms for the analytic results.
The tables below summarize what was already published in the literature (√),
what was derived in this dissertation (•), and what is planned as future research (○).
Items where both ○ and • appear are areas of research where some special cases have
been considered in this dissertation, but the entire question has not been answered.
7.1 Conclusions
The results from Chapters 5 and 6 show that considering conditional rather than unconditional power in general linear multivariate models with random predictors can be
very misleading. Even conditional power calculations that account for random covariates by adjusting the variance estimation can be in error. Conditional power analysis
can lead to either very conservative or wildly optimistic sample size calculations. The
Table 7.1: Summary for the GLUM

 Model            GLH               Status
 GLUM(F)          GLH(F)            √
 GLUM(G)          GLH(G)            √
 GLUM(D)          GLH(D)            √
 GLUM(F, G)       GLH(G)            √
 GLUM(F, G)       GLH(F)            • ○
 GLUM(F, D)       GLH(D)            √
 GLUM(F, D)       GLH(F)            ○
 GLUM(F, G, D)    GLH(F, G, D)      ○
Table 7.2: Summary for the GLMM

 Model             GLH                Status
 GLMM(F)           GLH(F)             √
 GLMM(G)           GLH(G), C = I      √
 GLMM(D)           GLH(D)             ○
 GLMM(F, G)        GLH(G)             • ○
 GLMM(F, G)        GLH(F)             • ○
 GLMM(F, G)        GLH(F, G)          • ○
 GLMM(F, D)        GLH(F, D)          ○
 GLMM(F, G, D)     GLH(F, G, D)       ○
contribution of this dissertation is to provide easily calculated unconditional power
results for a variety of multivariate situations. Unconditional rather than conditional
power results are strongly recommended for studies with random predictors. Understanding that random predictors have a strong effect on power should
encourage the use of unconditional power analysis. Because the small sample results
are either exact or very good approximations in most cases, the asymptotic results
should probably not be used unless the sample sizes are larger than 1000.
The nature of the application will determine how big a difference in sample
size is important. A difference in sample size between ten and thirteen rats seems
insignificant to most scientists. However, if the experiment involves killing animals,
more weight has to be put on the sample size decision. This reflects the dependence
of sample size selection not only on power considerations, but on scientific and ethical
concerns. One must consider the severity of the disease being studied, the number
of people it affects, the per-subject expense of doing the study, the time and effort
required to process each candidate, the severity of the side effects caused by the
techniques or drugs used in the study, and whether the study is designed with a
stopping rule or not. More accurate computation of power will lead to better informed
investigators and superior decisions both in terms of ethics and costs.
7.2 Plans for Future Research
It is our hope to publish a variety of papers from this dissertation. What follows is
a list of tentative titles and abstracts for the papers that will be written from results
already derived. The journal to which each will be submitted is given in parentheses.
A list of topics for future research follows.
On the Distribution of the Trace of Noncentral or Central Pseudo or True,
Singular or Full Rank Wisharts (Journal of Multivariate Analysis)
Definitions, characteristic functions and examples are given for a variety of members of the Wishart family. A derivation based on spectral decomposition and eigenanalysis allows specification of the distribution of the trace of any sort of Wishart
matrix. The distribution is given in terms of scaled sums of noncentral chi-squared
random variables and constants. Using Davies' algorithm allows the computation of
probability density functions, moments and distribution functions.
Power in a Special Case of the General Linear Multivariate Model with
Fixed and Random Predictors: A Special Case of a General Linear Hypothesis about both Fixed and Random Predictors (Biometrics)
Power calculations for general linear multivariate models were previously available only for purely fixed predictors, purely random predictors, or special univariate
models with both fixed and random predictors. Multivariate models with fixed and
random predictors are often considered in public health studies. For these studies, the
law of total probability is used to calculate unconditional power, which corresponds to
taking the expected value of the usual power approximation over all possible stochastic realizations of the predictors. A clinical trial of a treatment for increasing bone
density is used as a driving example. We consider hypotheses about baseline bone
density, a Gaussian predictor, and treatment, which is fixed. An unconditional power
approximation for the Geisser-Greenhouse statistic for the univariate approach for
repeated measures is derived and compared to the existing conditional one.
Unconditional Power for a GLMM with Fixed and Random Predictors and
a GLH about Random Predictors (JASA)
Multivariate models with fixed and random predictors are often considered in
public health studies. Often, the interest is in testing hypotheses about random
predictors. Unconditional power results are derived for the univariate approach to
repeated measures tests and for the Hotelling-Lawley trace, Wilks' Lambda and the
Pillai-Bartlett trace. The results are exact when the number of rows in the multivariate hypothesis is one, and approximate for other cases. Results for the univariate
statistics and for the Hotelling-Lawley trace are given for a variety of cases, while
results for the Pillai-Bartlett trace and Wilks' lambda are given only in the linear
case. Asymptotic results are given both for the unconditional power results and for
the conditional power approximations suggested by Muller and Barton (1991) and
Muller and Peterson (1984). Simulations and an example power analysis are used to
illustrate the analytic work.
Bias in General Linear Multivariate Model Power and Sample Size Calculation Due to Estimating Variance (Communications in Statistics)
Conditional power approximations for the univariate approaches to repeated
measures and the multivariate tests were proposed by Muller and Barton (1989)
and Muller and Peterson (1984). Power approximations of this sort rely on the assumption that the multivariate covariance matrix and the matrix of slope parameters
are known. In practice, they are often estimated from information from a previous
study. Additional complexity arises whenever the estimate has been censored. Left
censoring occurs when only significant results lead to a second study, while right censoring occurs when only non-significant tests lead to a second study. Because the
estimation process and censoring processes introduce variability, the resulting power
estimates are random variables. We propose confidence intervals for noncentrality,
power and sample size for a useful range of cases. Simulations are used to demonstrate the accuracy of the methods and examine the potential for bias in power and
sample size calculations.
After completing these papers, and attacking the areas of research listed in Tables
7.1 and 7.2, there is a wide choice of topics to consider. Some of them are listed here.
1. Development of a new taxonomy for models with interaction of fixed and random
predictors, and examination of unconditional power in these models.
2. Computationally efficient approximations.
3. GLMM(F, G) where the distribution of G depends on F.
4. Logistic, survival, Poisson models with random predictors.
5. Comparison of unconditional power with optimal power fixed designs for the
GLMM(F, D).
6. Examination of power issues in models where the randomness arises from imputation of missing data (Little, 1987).
7. Examination of rate of convergence of limits.
8. Development of numerical algorithms for the unconditional Pillai-Bartlett trace
and Wilks' lambda for s* = 1.
9. Derivation of unconditional power results for the GLMM(F, G), GLH(F, G) for
the Pillai-Bartlett trace and Wilks' lambda for s* > 1.
10. Development of numerical algorithms for the Hotelling-Lawley trace and UNIREP
statistics for s* > 1.
11. Asymptotic results for the special case of the GLMM(F, G), GLH(F, G) considered in Chapter 2.
APPENDIX A
A.1 Theorem on Quadratic Forms
Theorem 6.6a.2 from Mathai and Provost, 1992, p. 286 is reproduced here exactly.
Let X′ = (X₁, ..., X_N) with X_j ~ N_p(μ_j, V), V > 0, j = 1, ..., n,
and the X_j's mutually independently distributed. Let Q_j = X′A_jX +
½(L_j′X + X′L_j) + c_j, with A_j = A_j′, Q_j symmetric for all X,
and A_j, L_j, and c_j real matrices of constants, j = 1, 2. Then Q₁ and
Q₂ are independently distributed if and only if

    A₁VA₂ = 0,    A₁VL₂ = 0,    A₂VL₁ = 0,    and    L₁′VL₂ = 0.
A.2 Note on Positive Definite Matrices

The following definitions are quoted from Graybill (1983), p. 396.

Definition 3 An n × n matrix A is defined to be positive semi-definite if
and only if

1. A = A′
2. z′Az ≥ 0 for each and every vector z in ℝⁿ, and the equality holds
   for at least one vector z such that z ≠ 0.

Definition 4 An n × n matrix A is defined to be positive definite if and
only if

1. A = A′
2. z′Az > 0 for each and every vector z in ℝⁿ such that z ≠ 0.
Note 1 For a GLMM(F, G) with a GLH(G), T = H + E is positive definite.

Proof: By assumption, E, with E(E) = U'ΣU(N − r), is positive definite. Since the
contrast matrices must be of full row or column rank if the hypothesis is to be testable,
and M^{-1} is non-singular, with probability one H is either positive semi-definite or
positive definite. If H is positive definite, then so is T, by Theorem 12.2.18, p. 415,
Graybill (1983).

Now consider the case in which H is positive semi-definite. We need to show that
z'Tz > 0 for all z ≠ 0. Partition the nonzero vectors z into two disjoint sets: the set
of z such that z'Hz = 0, and the set of z such that z'Hz > 0. For the first set,
z'Tz = z'Ez, which is strictly greater than zero since E is positive definite. For the
second set, z'Tz = z'Hz + z'Ez > z'Ez, which again is strictly greater than zero.

Then for all z ≠ 0, z'Tz > 0, and thus T is positive definite. □
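Note 1 can be checked numerically; a small sketch with an arbitrary rank-deficient positive semi-definite H and an arbitrary positive definite E (all values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

Z = rng.standard_normal((6, 2))
H = Z @ Z.T                         # 6x6 of rank 2: positive semi-definite
W = rng.standard_normal((6, 6))
E = W @ W.T + 6.0 * np.eye(6)       # positive definite

T = H + E
# z'Hz = 0 on a 4-dimensional subspace, yet T = H + E stays positive definite.
print(np.linalg.eigvalsh(H).min())  # ~0 (semi-definite)
print(np.linalg.eigvalsh(T).min())  # strictly positive (definite)
```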
A.3 Note on Magnitude of the Trace

Note 2 The eigenvalues {λ_i} of HT^{-1} are random variables such that 0 ≤ λ_i < 1,
except on a set of measure zero. Thus tr(HT^{-1}) < s.
Proof: Let λ_i be the eigenvalues of HT^{-1}. Then they satisfy the equation

    | H T^{-1} − λ I | = 0.                                        (1.1)

Consider a spectral decomposition of E, so that

    E = F_E F_E',

where F_E is full rank since E is. Write F_E^{-T} = (F_E')^{-1}. Then, since
T = H + E, one can rewrite equation 1.1 so that

    | H (F_E F_E^{-1} H F_E^{-T} F_E' + F_E F_E')^{-1} − λ I | = 0
    | H [F_E (F_E^{-1} H F_E^{-T} + I) F_E']^{-1} − λ I | = 0
    | H F_E^{-T} [F_E^{-1} H F_E^{-T} + I]^{-1} F_E^{-1} − λ I | = 0.

Since eigenvalues are invariant with respect to circular permutations, this is equivalent
to

    | F_E^{-1} H F_E^{-T} [F_E^{-1} H F_E^{-T} + I]^{-1} − λ I | = 0.    (1.2)

Since I is positive definite, and F_E^{-1} H F_E^{-T} is positive semidefinite,
F_E^{-1} H F_E^{-T} + I is positive definite, by Note 1. Then [F_E^{-1} H F_E^{-T} + I]
is non-singular, and its determinant is not equal to zero. Then multiplying both sides
of equation 1.2 by | F_E^{-1} H F_E^{-T} + I |,

    | F_E^{-1} H F_E^{-T} − λ (F_E^{-1} H F_E^{-T} + I) | = 0
    | F_E^{-1} H F_E^{-T} (1 − λ) − λ I | = 0.                      (1.3)

Now if λ = 1, equation 1.3 implies that | −I | = 0, which is a contradiction. Then
λ ≠ 1. Dividing equation 1.3 through by (1 − λ), we obtain

    | F_E^{-1} H F_E^{-T} − [λ/(1 − λ)] I | = 0.

Since eigenvalues are invariant under circular permutation, the eigenvalues of
F_E^{-1} H F_E^{-T} are the eigenvalues of H F_E^{-T} F_E^{-1} = H E^{-1}. Let ρ_i be
the eigenvalues of H E^{-1}. Then ρ = λ/(1 − λ), and thus

    λ = ρ/(1 + ρ).

Since H E^{-1} is the product of two non-negative definite matrices, its eigenvalues are
greater than or equal to zero. For ρ ≥ 0, 0 ≤ λ < 1. Since tr(HT^{-1}) = Σ_{i=1}^{s} λ_i,
tr(HT^{-1}) < s. □
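The eigenvalue bound and the relation λ = ρ/(1 + ρ) can be verified numerically. A sketch with arbitrary matrices; the dimensions and seed are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)

Z = rng.standard_normal((5, 3))
H = Z @ Z.T                        # positive semi-definite, rank 3
W = rng.standard_normal((5, 5))
E = W @ W.T + 5.0 * np.eye(5)      # positive definite
T = H + E

lam = np.sort(np.linalg.eigvals(H @ np.linalg.inv(T)).real)
rho = np.sort(np.linalg.eigvals(H @ np.linalg.inv(E)).real)

print(np.all((lam > -1e-9) & (lam < 1)))      # eigenvalues of HT^{-1} in [0, 1)
print(np.allclose(lam, rho / (1 + rho)))      # lambda_i = rho_i / (1 + rho_i)
print(lam.sum() < np.linalg.matrix_rank(H))   # tr(HT^{-1}) < s
```

Since λ = ρ/(1 + ρ) is increasing in ρ, sorting both eigenvalue sets aligns corresponding pairs.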
A.4 Transformations for Wilks' Lambda

This section provides details about the transformations needed to determine the
distribution of the noncentrality parameter for Wilks' Lambda.

1. If X ~ χ²(ν), then

    f_X(x) = x^{ν/2 − 1} e^{−x/2} / [Γ(ν/2) 2^{ν/2}]   for 0 < x < ∞
           = 0   elsewhere.

2. Let Y = γX. Here γ is assumed to be the non-zero eigenvalue of F_E^{-1} S F_E^{-T}.
Since that matrix is positive semi-definite, γ > 0. The Jacobian is given by
|dx/dy| = 1/γ, and

    f_Y(y) = y^{ν/2 − 1} e^{−y/(2γ)} / [Γ(ν/2) 2^{ν/2} γ^{ν/2}]   for 0 < y < ∞
           = 0   elsewhere.

3. Let Z = Y + 1. Then the Jacobian is given by |dy/dz| = 1, and

    f_Z(z) = (z − 1)^{ν/2 − 1} e^{−(z−1)/(2γ)} / [Γ(ν/2) 2^{ν/2} γ^{ν/2}]   for 1 < z < ∞
           = 0   elsewhere.

4. Let Z = Q^g, so that Q = Z^{1/g}. Then the Jacobian is given by

    |dz/dq| = g q^{g−1},

where g = [a²b² − 4]^{1/2} [a² + b² − 5]^{−1/2}. Since a and b are dimensions of matrices,
they must be greater than or equal to 1. This guarantees that g ≥ 1.

5. Let R = δQ − δ, where δ = g[(N − q) − (b − a + 1)/2] − (ab − 2)/2. Then the
Jacobian is given by |dr/dq| = δ, and

    f_R(r) = g (r/δ + 1)^{g−1} [(r/δ + 1)^g − 1]^{ν/2 − 1} e^{−[(r/δ + 1)^g − 1]/(2γ)}
             / [δ Γ(ν/2) 2^{ν/2} γ^{ν/2}]   for 0 < r < ∞
           = 0   elsewhere.
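As a sanity check on the chained transformations, the density from step 5 can be evaluated numerically and compared against direct simulation of R = δ[(γX + 1)^{1/g} − 1] with X ~ χ²(ν). All numeric values below (ν, γ, a, b, N, q) are hypothetical choices for illustration, not quantities from the text:

```python
import numpy as np
from math import gamma as gamma_fn

# Hypothetical illustrative values.
nu, gam = 4, 1.5                 # chi-square df and the eigenvalue from step 2
a, b, N, q = 3, 2, 20, 4         # matrix dimensions and sample size

g = ((a**2 * b**2 - 4) / (a**2 + b**2 - 5)) ** 0.5
delta = g * ((N - q) - (b - a + 1) / 2) - (a * b - 2) / 2

def f_r(r):
    """Density from step 5, obtained by chaining the Jacobians of steps 2-5."""
    qq = r / delta + 1.0
    z = qq ** g
    fz = ((z - 1) ** (nu / 2 - 1) * np.exp(-(z - 1) / (2 * gam))
          / (gamma_fn(nu / 2) * 2 ** (nu / 2) * gam ** (nu / 2)))
    return fz * g * qq ** (g - 1) / delta

r = np.linspace(1e-9, 400.0, 400_001)
f = f_r(r)
dr = r[1] - r[0]
total = ((f[:-1] + f[1:]) * dr / 2).sum()      # trapezoid rule: should be ~1

rng = np.random.default_rng(3)
samp = delta * ((gam * rng.chisquare(nu, 200_000) + 1) ** (1 / g) - 1)
cdf50 = ((f[:-1] + f[1:]) * dr / 2)[r[1:] <= 50.0].sum()
print(total)                                   # ~1.0
print(cdf50, (samp <= 50.0).mean())            # numeric vs empirical CDF at 50
```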
A.5 Density of the H-L Scalar Noncentrality Parameter: Special Case

Note 1 The density of the scalar noncentrality parameter for the Hotelling-Lawley
trace is given by

    f[w_HLT; 1, (N − q_F), u] = Σ_{j=0}^∞ { e^{−u/2} (u/2)^j / j! }
        × (N − q_F)^{(N−q_F)/2} / β[(1 + 2j)/2, (N − q_F)/2]
        × [ (K − w) / ((N − q_F)^{−1} w) ]^{(2j−1)/2}
        × [ N − q_F + (K − w) / ((N − q_F)^{−1} w) ]^{−(1+2j+N−q_F)/2}
        × K / [(N − q_F)^{−1} w²]

for 0 < w < K.

Proof: Let u be the noncentrality parameter, and let K be the bounding constant for
w_HLT. Let t be a random variable that is distributed F(1, N − q_F, u). Then the
density of t is

    f(t) = Σ_{j=0}^∞ { e^{−u/2} (u/2)^j / j! } (N − q_F)^{(N−q_F)/2}
           / β[(1 + 2j)/2, (N − q_F)/2] × t^{(2j−1)/2} [N − q_F + t]^{−(1+2j+N−q_F)/2}

for 0 < t < ∞. Let w_HLT = K / [1 + (N − q_F)^{−1} t]. Then

    t = (K − w) / [(N − q_F)^{−1} w],

and

    dt/dw = −K / [(N − q_F)^{−1} w²].

Now K > 0 since [C_F (F'F)^{−1} C_F']^{−1} > 0, tr[(Θ − Θ_0) Σ^{−1} (Θ − Θ_0)'] > 0,
and (N − q − b + 1) > 0. Then the Jacobian is given by

    |dt/dw| = K / [(N − q_F)^{−1} w²],

and the density of the noncentrality parameter follows. □
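The series density and the change of variables can be checked numerically. A sketch with hypothetical values of N − q_F, u, and K (the series is truncated, and t is simulated from its noncentral-F construction as a noncentral chi-square over an independent scaled chi-square):

```python
import numpy as np
from math import gamma, exp, factorial

nu2, u, K = 12, 3.0, 5.0          # hypothetical illustrative values

def beta_fn(p, q):
    return gamma(p) * gamma(q) / gamma(p + q)

def f_t(t, terms=60):
    """Truncated series density of t ~ F(1, nu2, u) in the form used above."""
    t = np.asarray(t, dtype=float)
    s = np.zeros_like(t)
    for j in range(terms):
        coef = exp(-u / 2) * (u / 2) ** j / factorial(j)
        s += (coef * nu2 ** (nu2 / 2) / beta_fn((1 + 2 * j) / 2, nu2 / 2)
              * t ** ((2 * j - 1) / 2) * (nu2 + t) ** (-(1 + 2 * j + nu2) / 2))
    return s

# Integrate with the substitution t = x^2 to tame the t^{-1/2} term at zero.
x = np.linspace(1e-8, 40.0, 200_001)
vals = f_t(x ** 2) * 2 * x
dx = x[1] - x[0]
total = ((vals[:-1] + vals[1:]) * dx / 2).sum()      # should be ~1

rng = np.random.default_rng(4)
num = (rng.standard_normal(300_000) + u ** 0.5) ** 2   # noncentral chi-square(1, u)
den = rng.chisquare(nu2, 300_000) / nu2
t_samp = num / den
w_samp = K / (1 + t_samp / nu2)                      # the transformed variable
print(total)                                         # ~1.0
print((w_samp > 0).all() and (w_samp < K).all())     # w_HLT lies in (0, K)
```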
BIBLIOGRAPHY

Abramowitz, M. and Stegun, I. (1972). Handbook of Mathematical Functions with
Formulas, Graphs, and Mathematical Tables. Dover Publications, Incorporated.
New York.

Anderson, T. (1984). An Introduction to Multivariate Statistical Analysis, Second
Edition. John Wiley and Sons. New York.

Anderson, T. and Das Gupta, S. (1964). Monotonicity of the power functions of some
tests of independence between two sets of variates. The Annals of Mathematical
Statistics 35, 206-208.

Arnold, S. F. (1981). The Theory of Linear Models and Multivariate Analysis. John
Wiley and Sons, Inc. New York.

Bagai, O. P. (1972). On the exact distribution of a test statistic in multivariate
analysis of variance. Sankhya, Series A 34, 187-190.

Berger, M. P. (1986). A Monte Carlo study of the power of alternative tests under
the generalized MANOVA model. Communications in Statistics: Theory and
Methods 15(4), 1251-1283.

Betz, M. A. (1987). An approximation for the Hotelling-Lawley trace in the noncentral
case. Communications in Statistics: Theory and Methods 16(11), 3169-3183.

Boik, R. J. (1981). A priori tests in repeated measures designs: Effects of
nonsphericity. Psychometrika 46(3), 241-255.

Christiansen, D. H., Hosking, J., Helms, R. W., Muller, K. and Hunter, K. (1995).
Linmod (3.1) Language Reference Manual. University of North Carolina at Chapel
Hill. Chapel Hill, N.C.

Clark, A.-M. (1987). Approximate confidence bounds for estimated power in testing
the general linear hypothesis. Master's thesis. University of North Carolina at
Chapel Hill, Department of Biostatistics. Chapel Hill, North Carolina.

Cohen, J. (1987). Statistical Power Analysis for the Behavioral Sciences, Revised
Edition. Lawrence Erlbaum Associates, Inc., Publishers. Hillsdale, New Jersey.

Constantine, A. (1963). Some non-central distribution problems in multivariate
analysis. Annals of Mathematical Statistics 34, 1270-1285.

Courant, R. and John, F. (1965). Introduction to Calculus and Analysis, Volume One.
Interscience Publishers. New York.

Davies, R. B. (1980). Algorithm AS 155: The distribution of a linear combination of
non-central chi-squared random variables. Applied Statistics 29(3), 323-333.

Dozier, W. G. and Muller, K. E. (1993). Small-sample power of uncorrected and
Satterthwaite corrected T tests for comparing binomial proportions.
Communications in Statistics: Simulations 22(1), 245-264.

Fujikoshi, Y. (1973). Asymptotic formulas for the distributions of three statistics for
multivariate linear hypothesis. Annals of the Institute of Statistical Mathematics
25, 423-437.

Fujikoshi, Y. (1975). Asymptotic formulas for the non-null distributions of three
statistics for multivariate linear hypothesis. Annals of the Institute of Statistics
22, 99-108.

Fujikoshi, Y. (1988a). Asymptotic power comparison of some tests in MANOVA and
canonical correlation models. In Statistical Theory and Data Analysis II,
K. Matsushita (ed.). Elsevier Science Publishers B.V. (North Holland). pp. 327-336.

Fujikoshi, Y. (1988b). Comparison of powers of a class of tests for multivariate linear
hypothesis and independence. Journal of Multivariate Analysis 26, 48-58.

Gatsonis, C. and Sampson, A. R. (1989). Multiple correlation: Exact power and
sample size calculations. Psychological Bulletin 106(3), 516-524.

Genizi, A. and Soller, M. (1979). Power derivation in an ANOVA model which is
intermediate between the "Fixed-Effects" and the "Random-Effects" models.
Journal of Statistical Planning and Inference 3, 127-134.

Graybill, F. A. (1983). Matrices with Applications in Statistics, Second Edition.
Wadsworth International Group. Belmont, California.

Habib, A. R. and Harwell, M. R. (1989). An empirical study of the type I error
rate and power for some selected normal-theory and nonparametric tests of the
independence of two sets of variables. Communications in Statistics: Simulations
18(2), 793-826.

Hawkins, D. and Han, C.-P. (1986). A power comparison of three tests for design
effects in a random effects covariance model. Communications in Statistics: Theory
and Methods 15(11), 3401-3418.

Helms, R. W. (1988). Comparisons of parameter and hypothesis definitions in a
general linear model. Communications in Statistics, Series A 17, 2725-2753.

Jayakar, A. (1970). On the detection and estimation of linkage between a locus
influencing a quantitative character and a marker locus. Biometrics 26, 451-464.

Jennrich, R. I. and Schluchter, M. D. (1986). Unbalanced repeated-measures models
with structured covariance matrices. Biometrics 42, 805-820.

John, S. (1971). Some optimal multivariate tests. Biometrika 58(1), 123-127.

Johnson, N. L. and Kotz, S. (1970). Continuous Univariate Distributions-2. Houghton
Mifflin. Boston.

Keyes, L. L. and Muller, K. (1992). IML Power Program User's Guide. University of
North Carolina at Chapel Hill. Chapel Hill, North Carolina.

Kulp, R. and Nagarsenker, B. N. (1984a). An asymptotic expansion of the non-null
distribution of Wilks' criterion for testing the multivariate linear hypothesis. The
Annals of Statistics 12(4), 1576-1583.

Kulp, R. W. and Nagarsenker, B. N. (1984b). Test of independence between two sets
of variates. Communications in Statistics: Theory and Methods 13(6), 685-698.

Laird, N. M. and Ware, J. H. (1982). Random-effects models for longitudinal data.
Biometrics 38, 963-974.

LaVange, L. M., Keyes, L. L., Koch, G. G. and Margolis, P. A. (1994). Application
of sample survey methods for modelling ratios to incidence densities. Statistics
in Medicine 13, 343-355.

Lee, Y.-S. (1971). Distribution of the canonical correlations and asymptotic
expansions for distributions of certain independence test statistics. The Annals of
Mathematical Statistics 42(2), 526-537.

Leithold, L. (1968). The Calculus with Analytic Geometry. Harper and Row,
Publishers. New York.

Little, R. J. A. (1992). Regression with missing X's: A review. Journal of the
American Statistical Association 87(420), 1227-1237.

Mathai, A. M. and Provost, S. B. (1992). Quadratic Forms in Random Variables:
Theory and Applications. M. Dekker. New York.

McCarroll, K. and Helms, R. (October, 1987). An evaluation of some approximate
F statistics and their small sample distributions for the mixed model with linear
covariance structure. Technical Report 1838T. Department of Biostatistics,
University of North Carolina at Chapel Hill. Institute of Statistics Mimeo Series.

Moser, B. K., Stevens, G. R. and Watts, C. L. (1989). The two-sample T test versus
Satterthwaite's approximate F test. Communications in Statistics: Theory and
Methods 18(11), 3963-3975.

Muirhead, R. J. (1972a). The asymptotic noncentral distribution of Hotelling's
generalized T_0^2. The Annals of Mathematical Statistics 43(5), 1671-1677.

Muirhead, R. J. (1972b). On the test of independence between two sets of variates.
The Annals of Mathematical Statistics 43(5), 1491-1497.

Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory. John Wiley and
Sons, Inc. New York.

Muller, K. E. and Barton, C. N. (1989). Approximate power for repeated measures
ANOVA lacking sphericity. Journal of the American Statistical Association
84(406), 549-555.

Muller, K. E. and Barton, C. N. (1991). Correction to "Approximate power for
repeated measures ANOVA lacking sphericity". Journal of the American Statistical
Association 86, 255-256.

Muller, K. E. and Peterson, B. L. (1984). Practical methods for computing power in
testing the multivariate general linear hypothesis. Computational Statistics and
Data Analysis 2, 143-158.

Muller, K. E. and Pieper, K. S. (1994). A computing based strategy for controlling
Type I and Type II error rates in interim analysis. Unpublished paper: in
submission.

Muller, K. E., LaVange, L. M., Ramey, S. L. and Ramey, C. T. (1992). Power
calculations for general linear multivariate models including repeated measures
applications. Journal of the American Statistical Association 87(420), 1209-1226.

Nagao, H. (1972). Non-null distributions of the likelihood ratio criteria for
independence and equality of mean vectors and covariance matrices. Annals of the
Institute of Statistical Mathematics 24, 67-79.

Neter, J., Wasserman, W. and Kutner, M. H. (1985). Applied Linear Statistical
Models. Richard D. Irwin, Inc. Homewood, Illinois.

O'Brien, R. G. and Shieh, G. (1992). Pragmatic, unifying algorithm gives power
probabilities for common F tests of the multivariate general linear hypothesis.
Unpublished document from the University of Florida.

Olson, C. L. (1974). Comparative robustness of six tests in multivariate analysis of
variance. Journal of the American Statistical Association 69(342), 894-908.

Olson, C. L. (1976). On choosing a test statistic in multivariate analysis of variance.
Psychological Bulletin 83(4), 579-586.

Olson, C. L. (1979). Practical considerations in choosing a MANOVA test statistic:
A rejoinder to Stevens. Psychological Bulletin 86(6), 1350-1352.

Ontjes, D. (1994). Draft proposal: clinical trials for cystic fibrosis lung transplant
patients. Unpublished proposal.

Perlman, M. D. and Olkin, I. (1980). Unbiasedness of invariant tests for MANOVA
and other multivariate problems. The Annals of Statistics 8(6), 1326-1341.

Pillai, K. C. S. (1955). Some new test criteria in multivariate analysis. Annals of
Mathematical Statistics 26, 117-121.

Pillai, K. C. S. and Dotson, C. O. (1969). Power comparisons of tests of two
multivariate hypotheses based on individual characteristic roots. Annals of the
Institute of Statistical Mathematics 21(1), 49-66.

Pillai, K. C. S. and Jayachandran, K. (1967). Power comparisons of tests of two
multivariate hypotheses based on four criteria. Biometrika 54(1 and 2), 195-210.

Pillai, K. C. S. and Jayachandran, K. (1968). Power comparisons of tests of equality
of two covariance matrices based on four criteria. Biometrika 55(2), 335-342.

Pillai, K. C. S. and Jayachandran, K. (1970). On the exact distribution of Pillai's V(s)
criterion. Journal of the American Statistical Association 65(329), 447-454.

Roebuck, P. (1982a). Canonical forms and tests of hypotheses. Part I: The general
univariate mixed model. Statistica Neerlandica 36(2), 63-74.

Roebuck, P. (1982b). Canonical forms and tests of hypotheses. Part II: Multivariate
mixed models. Statistica Neerlandica 36(2), 75-80.

Roy, S. and Mikhail, W. (1961). On the monotonic character of the power functions
of two multivariate tests. The Annals of Mathematical Statistics 32, 1145-1151.

Sambrook, P. N., Kelly, P. J., Keogh, A. M., Macdonald, P., Spratt, P., Freund, J.
and Eisman, J. A. (1994). Bone loss after heart transplantation: a prospective
study. Journal of Heart and Lung Transplantation 13, 116-121.

Sampson, A. R. (1974). A tale of two regressions. Journal of the American Statistical
Association 69(347), 682-689.

SAS Institute (1992). SAS Software: Version 6. SAS Institute. Cary, North Carolina.

Schluchter, M. D. and Elashoff, J. D. (1990). Small-sample adjustments to tests with
unbalanced repeated measures assuming several covariance structures. Journal
of Statistical Computation and Simulation 37, 69-87.

Searle, S. R. (1982). Matrix Algebra Useful for Statistics. John Wiley and Sons. New
York.

Self, S. G., Mauritsen, R. H. and Ohara, J. (March 1992). Power calculations for
likelihood ratio tests in generalized linear models. Biometrics 48, 31-39.

Sen, P. K. and Singer, J. M. (1993). Large Sample Methods in Statistics: An
Introduction with Applications. Chapman and Hall. New York.

Serfling, R. J. (1980). Approximation Theorems of Mathematical Statistics. John
Wiley and Sons. New York. Wiley Series in Probability and Mathematical Statistics.

Shah, B. V., Barnwell, B. G., Hunt, P. N. and LaVange, L. M. (1991). SUDAAN
User's Manual, Release 5.50. Research Triangle Institute. Research Triangle
Park, North Carolina.

Soller, M. and Genizi, A. (1978). The efficiency of experimental designs for the
detection of linkage between a marker locus and a locus affecting a quantitative
trait in segregating populations. Biometrics 34, 37-55.

Stevens, J. (1979). Comment on Olson: Choosing a test statistic in multivariate
analysis of variance. Psychological Bulletin 86(2), 355-360.

Sugiura, N. and Fujikoshi, Y. (1969). Asymptotic expansions of the non-null
distributions of the likelihood ratio criteria for multivariate linear hypothesis and
independence. The Annals of Mathematical Statistics 40(3), 942-952.

Sugiyama, T. and Ushizawa, K. (1992). Power of largest root on canonical correlation.
Communications in Statistics: Simulations 21(4), 947-960.

Thisted, R. A. (1988). Elements of Statistical Computing: Numerical Computation.
Chapman and Hall. New York.

Ware, J. H. (1985). Linear models for the analysis of longitudinal studies. The
American Statistician 39(2), 95-101.

Williams, J. S. (1970). The choice and use of tests for the independence of two sets
of variates. Biometrics 26(4), 613-624.

Yang, Y.-H. (1995). Interim Analysis for Continuous Repeated Measurements. PhD
thesis. University of North Carolina at Chapel Hill, Department of Biostatistics.

Zeger, S. L. and Liang, K.-Y. (1986). Longitudinal data analysis for discrete and
continuous outcomes. Biometrics 42, 121-130.

Zeger, S. L. and Liang, K.-Y. (1986). Longitudinal data analysis using generalized
linear models. Biometrika 73, 13-22.