Stewart, Paul Wilder (1982). "Application of a Bivariate t Distribution to Hypothesis Testing in Crossover Experiments."

APPLICATION OF A BIVARIATE t DISTRIBUTION
TO HYPOTHESIS TESTING IN CROSSOVER EXPERIMENTS
by
Paul Wilder Stewart
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1401
May 1982
APPLICATION OF A BIVARIATE t DISTRIBUTION
TO HYPOTHESIS TESTING IN CROSSOVER EXPERIMENTS
by
Paul Wilder Stewart
A Thesis submitted to the faculty of The
University of North Carolina at Chapel
Hill in partial fulfillment of the
requirements for the degree of Doctor
of Philosophy in the Department of
Biostatistics
Chapel Hill
1981
ABSTRACT
PAUL WILDER STEWART. Application of a Bivariate t Distribution
to Hypothesis Testing in Crossover Designs. Under the direction
of RONALD H. HELMS, Ph.D.
The joint distribution of a pair of Student-t variates,
(t1, t2), is investigated and methods of computing Pr{t1 ∈ set 1
and t2 ∈ set 2} are developed. The resulting algorithms are used
to evaluate the power and actual significance level of two-sided
tests which are based on the test statistic max|ti| and the
Bonferroni inequality. It is assumed that the numerators of t1 and
t2 are dependent, that the denominators are dependent, and that
numerators are independent of denominators.
This research is motivated by an alternative 'by-period'
analysis for crossover experiments recently suggested by Helms,
Kempthorne and Shen (1980). Conceptual advantages of the by-period
analysis over the traditional analysis are discussed and a
comparison is made with respect to the power of tests against
alternative hypotheses.
An original crossover experiment is reported and analyzed
according to the by-period method. The power of the treatment
effects test is computed. A messy-data subset of the complete
data is then re-analyzed and power is computed. The power of the
analogous test in the traditional analysis is also computed and
comparisons are made between the two analyses. It is found that
the by-period testing procedure is more powerful than that in the
traditional analysis for this problem.
CONTENTS

ACKNOWLEDGEMENTS ............................................ iv
INTRODUCTION AND SUMMARY .................................... v
1. Review of the Literature ................................. 1
   1.1 Summary .............................................. 1
   1.2 Multivariate t Distributions ......................... 1
   1.3 Crossover Designs: Theory and Applications ........... 10
2. Bivariate-t Distribution Theory and Power Computations:
   The Case of Identical Design Matrices in the Linear
   Models for Periods 1 and 2 ............................... 16
   2.1 Summary .............................................. 16
   2.2 Formulation of the Power Function P(t) ............... 16
   2.3 A Maclaurin Series Expansion ......................... 26
   2.4 Two Algorithms for Computing Power ................... 45
   2.5 Comparison of the Two Algorithms ..................... 53
3. Bivariate-t Distribution Theory for Power Computations:
   The Case of Different Design Matrices in the Linear
   Models for Periods 1 and 2 ............................... 55
   3.1 Summary .............................................. 55
   3.2 The Structure of the Data ............................ 56
   3.3 Transformation of the Data to Canonical Form ......... 56
   3.4 An Algorithm for Computing Power ..................... 71
4. Examples of Data Analyses with Power Calculations ........ 73
   4.1 Summary .............................................. 73
   4.2 A Bean Sprout Experiment ............................. 74
   4.3 By-Period Analysis for the Complete Set of Data ...... 75
   4.4 Messy Data Analyses .................................. 84
       4.4.1 By-Period Analysis ............................. 85
       4.4.2 Power Calculations for the Traditional Model ... 91
       4.4.3 Comparisons of Power ........................... 96
5. Recommendations for Further Research ..................... 98
Appendix .................................................... 101
   A.1 Bean-Sprout Experiment Protocol ...................... 101
   A.2 Algorithm: Power via Maclaurin Polynomials ........... 108
   A.3 Algorithm: Power via Monte Carlo Integration ......... 116
   A.4 Algorithm: Power and Significance Level via Monte
       Carlo Simulation ..................................... 120
References .................................................. 124
ACKNOWLEDGEMENTS
I wish to thank Dr. Ronald W. Helms for the guidance and
cheerful encouragement that gave this study the potential of
its success.
I also wish to thank the other members of my
Doctoral Committee for their individual contributions:
Dr. N.L.
Johnson, Dr. Dana Quade, Dr. Dennis B. Gillings and Dr. Seth A.
Rudnick.
I am grateful to Mrs. Beryl Glover who so carefully
typed the final draft.
Finally, I wish to express appreciation
to special friends, Dr. Patrick W. Crockett and Senorita Susan
G. Navey, for their support and encouragement.
INTRODUCTION AND SUMMARY
The joint distribution of a pair of Student-t variates,
(t1, t2), is investigated and methods of computing Pr{t1 ∈ set 1
and t2 ∈ set 2} are developed. The results are used to evaluate the
power and actual significance level of two-sided tests which are
based on the test statistic max|ti| and the Bonferroni inequality.
It is shown that some of the results can be used to evaluate
power and significance of one-sided tests as well.
In this research it is assumed that t1 and t2 are defined as
follows:

   t_i = C_i [X_i' X_i]^{-1} X_i' Y_i / [ Y_i' P_i Y_i / (n_i - r_i) ]^{1/2} ,   i = 1, 2,

where

   P_i (n_i x n_i) = I - X_i [X_i' X_i]^{-1} X_i' ,

C_i (1 x r_i) is a matrix of constants, and

   Y_i (n_i x 1) ~ N_{n_i}( X_i β_i , σ_i² I ) .

X_i is the design matrix of the general linear univariate model,
denoted GLUM(Y_i; X_i β_i, σ_i² I_{n_i}). It is assumed that the numerators
of t1 and t2 are dependent, that the denominators of t1 and t2 are
dependent, and that numerators are independent of denominators.
This assumption is true when the only dependence between Y_1 and Y_2
occurs as dependence between some (Y_{1i}, Y_{2i}) pairs. It follows that
t1 and t2 are dependent and follow different distributions.
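As a concrete illustration of this definition, the following minimal Python sketch (an addition for this edition, not part of the thesis) computes t_i for one period directly from the formula above; the simulated data, design matrix and contrast are arbitrary assumptions made only for the example.

```python
import numpy as np

def glum_t_statistic(Y, X, C):
    """t-statistic for the scalar contrast C*beta in Y ~ N(X*beta, sigma^2 I),
    following t_i = C (X'X)^{-1} X'Y / sqrt(Y'PY / (n - r))."""
    n, r = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ Y              # least-squares estimate
    P = np.eye(n) - X @ XtX_inv @ X.T         # residual projection, rank n - r
    sse = float(Y.T @ P @ Y)                  # error sum of squares
    numerator = float(C @ beta_hat)
    denominator = np.sqrt(sse / (n - r))
    return numerator / denominator

# illustrative use with arbitrary simulated data (not from the thesis)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), rng.normal(size=20)])
Y = X @ np.array([1.0, 0.5]) + rng.normal(size=20)
C = np.array([0.0, 1.0])
# scale C so that C (X'X)^{-1} C' = 1, matching the convention M_i M_i' = 1 used later
C = C / np.sqrt(C @ np.linalg.inv(X.T @ X) @ C)
print(glum_t_statistic(Y, X, C))
```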
This research is motivated by an alternative 'by-period'
strategy for crossover experiments recently suggested by Helms,
Kempthorne and Shen (1980). We will focus on two-period, four-
sequence, two-treatment crossover designs (denoted CO(2,4,2)).
The by-period analysis avoids some difficulties associated with
traditional analyses but requires the straightforward use of the
Bonferroni inequality and max|ti| for certain two-sided tests.
It
is therefore of interest to know how the power of tests by this
method of analysis compares with the power of analogous tests by
traditional methods.
Both analytic and numerical methods are used to construct
algorithms for computing power and actual significance level for
the by-period method.
The algorithms are then used as tools in
making a comparison between traditional analysis and by-period
analysis of crossover data.
For the case of X_1 = X_2 = X (n x r), an algorithm based on a
Taylor series is developed in Chapter 2 along with an algorithm
using Monte Carlo integration of a fourfold integral. The Taylor
series expression is algebraically complicated and its unknown
interval of convergence is a function of several parameters. The
Taylor algorithm and the Monte Carlo integration algorithm seem to
perform equally well whenever the Taylor series converges.
For the more general case of X_1 (n_1 x r_1) ≠ X_2 (n_2 x r_2), a
Monte Carlo simulation algorithm is developed in Chapter 3. The
algorithm is simple in design and efficient in practice.
Since
only about two significant digits of accuracy are required for
purposes of this research, the algorithm is inexpensive.
Further-
more it is sufficiently general to handle power and significance
level calculations for both two-sided and one-sided tests.
All three algorithms developed are generally useful for
certain probability calculations and are not restricted to use
with crossover data.
Finally, an original crossover experiment, conducted for
the purpose of illustrating the theoretical results of this
study, is reported and analyzed in Chapter 4.
In addition, a
subset of the data is taken as an example of messy data.
The
by-period and traditional analyses of the messy data are compared.
It is shown that the by-period strategy is in some ways more advantageous and that it has superior power for testing treatment
effects in the particular messy data analyzed.
CHAPTER 1
REVIEW OF THE LITERATURE
1.1
Summary
The best known of the many possible multivariate t distri-
butions are defined and discussed.
It appears that the multi-
variate theory needed has not yet been reported in the literature.
Important contributions to the literature on crossover designs are
discussed and outstanding problems concerning analysis are noted.
1.2
Multivariate t Distributions
Generalization of the univariate t distribution can take
many forms.
Several multivariate t distributions arise naturally
in practice.
The development of such distributions has a short
history and much of the work is incomplete.
Although multivariate
theory research was gaining momentum at the turn of the century
when 'Student' published a description of the t distribution, much
of the theory could not be applied until the coming of the electronic computer.
The development of multivariate t distributions
began circa 1953.
Johnson and Kotz (1972) collate and summarize the literature
on a variety of multivariate t distributions. Most t random
vectors (r.v.s.) are of the form

   t (m x 1) = S^{-1} (m x m) * Z (m x 1) ,

where the non-zero elements of S*S' are gamma variates and Z
follows an independent multinormal distribution.
For this reason the study
of a given t distribution often involves concomitant study of
such distributions as the Wishart and multivariate chi-square.
In this research the t-distribution of interest can be charac-
terized by

   t (2x1) = [ diag(Q_1, Q_2) ]^{-1/2} * Z (2x1) = S^{-1} * Z ,

where Z (2x1) ~ N_2(μ, Σ) independent of Q_1, Q_2. The diagonal
matrix S*S' = diag(Q_1, Q_2) is such that Q_i ν_i / σ_i² is a
chi-square r.v. with ν_i degrees of freedom. Furthermore (Q_1, Q_2)
forms the diagonal of a two-dimensional Wishart matrix.
Johnson and Kotz (1972) summarize the still incomplete study
of the many multivariate extensions of the gamma distributions.
That study did not begin until about the time Wishart derived his
well-known distribution in 1928.
1.2.1 Multivariate t with Common Denominator

The most completely developed t-distribution has the form

   t (m x 1) = (1/S) Z (m x 1) ,

where Z ~ N(μ, σ²R) is independent of S² ~ σ²χ²_ν/ν, and R is a
correlation matrix. This distribution is often referred to as "the
multivariate t distribution." It is not particularly useful for appli-
cation to the analysis of crossover (CO) designs since it is not
usually appropriate to assume that Σ = σ²R.
Most contributions to the literature on t distributions deal
specifically with this common-denominator form. Johnson and Kotz
(1972) trace the development of this distribution beginning with
fundamental results attributed to Cornish (1954) and Dunnett and
Sobel (1954). Several applications, algorithms and tabulations have
been contributed to the literature since 1971. Freeman and Kuzman
(1972) tabulate the distribution of the maximum of the component t
variates. Dutt (1972, 1975) discusses rapidly convergent algorithms
for evaluating probability integrals of t. Wynn (1975) and Bohrer
(1973) evaluate probability integrals and provide tables for
Pr[t ∈ A_r(c)] where the A_r(c) define polyhedral cones. Nagarsenkar
(1975) obtains the distribution of certain quadratic forms in t.
Ghosh (1975) obtains the distribution of (t_1 - t_2). Zellner (1976)
exemplifies the application by studying regressions with t error
terms. Dutt, et al. (1976) suggest an approximation to the
distribution of max_{i=1,2,3} |t_i|. Bishop, et al. (1978) describe
and compare four methods for approximating the percentage points of
quadratic forms in independent t variates. Amos (1978) demonstrates
the evaluation of c.d.f.'s of t by numerical quadrature. Chen (1979)
tabulates percentage points for t where ρ = 0.
1.2.2 Matric-t Distributions

Ando and Kaufman (1965) obtained an equivalent representation
for t = (1/S) Z. They showed that if

   t (m x 1) = [ χ²_ν / ν ]^{-1/2} y (m x 1)   with   y (m x 1) ~ N(0, R),

then t = J^{-1} Z where Z (m x 1) ~ N(0, I_m) and J J' ~ W_m(ν+m-1, R^{-1}).
Dickey (1967) generalized this result by computing the joint p.d.f. of

   T (m x q) = J^{-1} X (m x q) ,

where the columns X_{.j} ~ N(0, Q), j = 1, 2, ..., q, are independent,
J J' ~ W_m(ν+m-1, R), and R and Q are positive definite. This
distribution is called the matric-t distribution. Tan (1969) obtained
the p.d.f. of a special case given by

   T (p x m) = Y * S^{-1/2} (m x m) ,

where S ~ W_m(ν, V) independent of Y (p x m) ~ N(0, Σ_0 ⊗ V), singular,
subject to r x p linear restrictions of the form A Y = 0 (r x m) for a
fixed matrix A. Dickey (1976) reports a new representation of t and
matric-variate t as functions of independent t or T variates. Juritz
and Troskie (1976) discuss noncentral T distributions that can be
applied to 2-stage least squares estimators.
1.2.3 Multivariate Behrens-Fisher Distribution

Dickey (1966) investigated

   T (m x 1) = Σ_{j=1}^{k} B_j (m x m) t_j (m x 1) ,

where t_j = (1/S_j) Z_j, |B_j| ≠ 0 for all j, the Z_j (m x 1) ~ N(0, I_m)
are i.i.d., and S_j² ~ (1/ν_j) χ²_{ν_j}. Cornish (1966) provides an
example of application. Walker and Saw (1978) discuss the distribution
for the case of m = 1.
1.2.4 2-Step t Distribution

Bulgren, et al. (1974) suggest the use of a 2-step test for
normally distributed data, with one t-variate computed from an
initial sample of m observations and a second t-variate computed
after n additional observations are obtained. The joint p.d.f. of
the resulting pair (t_1, t_2) is derived, and as the sample sizes grow
the pair converges in law to a limiting bivariate form. Spurrier and
Hewett (1975) provide tables for this distribution. Spurrier and
Hewett (1976) extend the distribution to the two-population problem.
These distributions are reminiscent of the stepwise t distributions
discussed by Steffens (1969), Derflinger and Stappler (1976), and
Siotani (1976) for stepwise regression analysis.
1.2.5 Non-Student Multivariate t Distribution

Johnson and Kotz (1972) discuss

   t_i = X̄_i / [ S / (n(n-1)) ]^{1/2} ,   i = 1, 2, ..., m,

where X_i (n x 1) ~ N(0, I), X̄_i is the mean of the elements of X_i,
S = (1/m) Σ_i S_i, and S_i = X_i' [ I - (1/n) 1 1' ] X_i. The
distribution of mS is a mixture of χ² distributions and the p.d.f. of
t has a relatively simple form. The marginals of this distribution
are mixtures of Student-t distributions.
1.2.6 General Multivariate t Distribution

The t distribution of interest in this research is presented by

   t (m x 1) = [ diag(Q_1, ..., Q_m) ]^{-1/2} * Z (m x 1) ,

with Z independent of [Q_1, ..., Q_m], where Q_i ν_i / σ_i² ~ χ²_{ν_i}
and the Q_i are not necessarily mutually independent. Johnson and
Kotz (1972) refer to this form as type (III). They refer to the
special case of Q_i independent of Q_j for all i ≠ j as type (II).

The marginal p.d.f.s for type (III) distributions are defined
as follows. If Y ~ N(δ, 1) independent of Q ~ χ²_ν, then T = Y/(Q/ν)^{1/2}
is distributed as a noncentral t with ν degrees of freedom and non-
centrality parameter δ. The p.d.f. of this ratio is

   f(t) = [ ν^{ν/2} exp(-δ²/2) / ( √π Γ(ν/2) (ν+t²)^{(ν+1)/2} ) ]
          * Σ_{k=0}^{∞} [ Γ((ν+k+1)/2) / k! ] (δt)^k [ 2/(ν+t²) ]^{k/2} .
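As a sanity check on this marginal form, the short Python sketch below (an addition, not from the thesis) compares the series above, truncated after a modest number of terms, with SciPy's noncentral-t density; the parameter values are arbitrary.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import nct

def noncentral_t_pdf_series(t, nu, delta, terms=60):
    """Series form of the noncentral-t p.d.f. given above, truncated after `terms` terms."""
    k = np.arange(terms)
    log_coef = gammaln((nu + k + 1) / 2.0) - gammaln(k + 1)
    base = (nu ** (nu / 2.0) * np.exp(-delta ** 2 / 2.0)
            / (np.sqrt(np.pi) * np.exp(gammaln(nu / 2.0))
               * (nu + t ** 2) ** ((nu + 1) / 2.0)))
    series = np.sum(np.exp(log_coef) * (delta * t) ** k
                    * (2.0 / (nu + t ** 2)) ** (k / 2.0))
    return base * series

nu, delta = 10, 1.5
for t in (-1.0, 0.5, 2.0):
    print(t, noncentral_t_pdf_series(t, nu, delta), nct.pdf(t, nu, delta))
```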
Halperin (1967) provides some inequalities for type (II) of the
form t_{ν;α1} ≤ t^{(m)}_{ν;α} ≤ t_{ν;α2} for probabilities involving |t_ν|.
Miller (1968) discusses a class of distributions which can be con-
structed as

   t_i = Y_i / [ Z_i' Z_i / ν_i ]^{1/2} ,   i = 1, ..., m,

where Y ~ N(μ, B) and the Z_i (ν_i x 1) are multinormal, not
necessarily independent with respect to i.
Siddiqui (1967) derives the joint density of (t_1, t_2) of type (III)
for the bivariate case when Σ (2x2) is a correlation matrix (i.e.,
σ_1² = σ_2² = 1). He computes the joint density of t_1, t_2 and R (the
sample correlation) and then integrates out R to reach an approxi-
mation of the p.d.f. of t (2x1) in terms of the hypergeometric function.
Šidák (1971) proves the following corollary for t probabili-
ties. If (1) the rows of Y (N x p) are i.i.d. N(μ, Σ), (2) the cor-
relation matrix of [Y_1, ..., Y_p] is R with r_ij = b_i b_j (i ≠ j),
(3) Ȳ_i = (1/N) Σ_n Y_ni, and (4) S_1², ..., S_p² are the corresponding
sample variances, then

   Pr[ |Ȳ_i - μ_i| / (S_i/√N) ≤ c_i ,  i = 1, ..., p ]
      ≥  Π_{i=1}^{p} Pr[ |Ȳ_i - μ_i| / (S_i/√N) ≤ c_i ] .
Krishnan (1972) derives the central and singly-noncentral
p.d.f. and c.d.f. of t (2x1) of type (III) for the case of
Σ (2x2) = σ² R (2x2), where R is a correlation matrix. He expresses
the p.d.f. in terms of the integrated coerror function, i^(b)erfc(a),
and expresses the c.d.f. in terms of a double sum of incomplete
beta or hypergeometric functions.
1.2.7 Related Multivariate Gamma Distributions

Jensen (1970) discusses the joint distribution of Q_i = Z_i' Z_i
(i = 1, 2) where [Z_1', Z_2']' is normally distributed with zero mean
and a given variance-covariance matrix (when ν_2 ≤ ν_1). He derives a
series expression for the bivariate p.d.f. via the characteristic
function of (Q_1, Q_2). (These results are discussed more fully in
Chapter 3.)

Lukacs and Laha (1964) and others have shown that the joint
characteristic function of [S_1², ..., S_m²] is

   φ(t) = | I - 2i B diag(t_1, ..., t_m) |^{-1/2}

whenever [S_1, ..., S_m] ~ N_m(0, B).

In addition to the literature on multivariate gamma and
Wishart distributions summarized by Johnson and Kotz (1972), the
following more recent contribution is also of interest. Krishnan
(1976) derives the moment generating function and the p.d.f. for
quadratic forms S_i² = X_i' A X_i, where the X_i (p x 1) are i.i.d.
1.2.8 Other Publications of Interest
Kerridge and Cook (1976) compare historic algorithms
for computing normal probability integrals and suggest an alternative.
If the p.d.f., f(x), is expressed as a Taylor series and
then integrated, then the Hermite polynomials involved can be
operated on recursively to yield a fast computer algorithm.
Numerical results indicate that this algorithm can be sufficiently
accurate and advantageous when speed is important.
Duff (1976) gives an integral representation technique for
calculating general multivariate probabilities. He applies it to
the multivariate t with common denominator. The algorithm re-
quires knowledge of the characteristic functions of the distribution
in question and its marginals.
Tan and Wong (1978) provide a finite series approximation in
terms of Laguerre polynomials for central and non-central multivariate gamma p.d.f.s.
Use of the approximation requires knowledge
of all the mixed moments.
1.3 Crossover Designs:
Theory and Applications
All crossover designs are repeated-measure designs: treat-
ments are applied sequentially to each study unit. A repeated-
measure design is a CO design if at least one treatment sequence
contains two or more distinct treatments.
In CO designs each study
unit receives one treatment during each time period.
The consecu-
tive periods are usually separated by washout time in order to
minimize treatment carryover from one period to the next.
Accord-
ing to the selected design, each study unit receives one of the
possible treatment sequences.
Often the selected design includes
only a subset of the possible sequences.
Conventionally, CO (v,s,p)
denotes a crossover design with v treatments, s sequences and p
periods.
For the CO(2,s,2) design there are four possible
sequences of treatments: A then B, B then A, A then A again, and
B then B again. The CO(2,2,2) design with sequences (A,B) and (B,A)
can be equated to a second-order latin square design.
In order to
avoid introducing sequence effects into the experiment, the units
of study are usually assigned at random to the sequences.
Those
assigned to the jth sequence form a group of nj units of study.
Balance of the design (i.e., n_1 = n_2 = ...) will not be required here.
It will be assumed that treatment carryover may affect responses
in spite of washout time between periods.
Such residual effects
are characterized by the nature of their duration and decay.
Crossover designs can be applied in a great variety of situations.
For example, experiments in business, marketing, en-
gineering, psychology, and nutrition can be appropriate.
The
earliest applications were made in agriculture; e.g., crop rota-
tion, livestock feeding, etc. These applications continue with
success.
The most controversial and perhaps most popularly made
application has been in clinical trials.
In clinical trials it is often difficult to organize and control subjects and experimentation sufficiently to meet all the assumptions of the usual analysis of CO experiments.
Therefore it
can be difficult to know in advance whether the most efficient
design will be a CO or a completely randomized (CR) design, for
example.
Frequently the number of study units in the trial is
minimal.
This can happen because of budget restriction and/or be-
cause only a few eligible subjects are available.
It may happen
that the investigator cannot randomize the subjects to the treatment sequences.
Sequence effects must then be accounted for.
Kershner (1979) has compiled an extensive bibliography on
crossover designs.
He also surveys in detail the large family of
CO designs and their analyses.
Currently there is some contro-
versy over the proper application and analysis of CO(2,s,p)
designs.
Kershner (1979) and Kershner and Federer (1981) examine
the difficulties and propose a full-rank general linear univariate
model as a solution for a large class of designs.
They present
comparisons of estimators, some of which are inappropriate because of imprecise interpretations of parameters in similar but
distinct models.
They also discuss three-period two-treatment
designs, CO(2,s,3).
Crossover designs came into use soon after the introduction
of latin squares by Fisher in 1925. Brandt (1938) introduced
three-period designs. Cochran, Autrey and Cannon (1941) demon-
strated the first CO experiment with carryover effects. Williams
(1949) extended their work by constructing balanced designs.
Patterson (1951) noted that some CO designs can be viewed as
factorial experiments, and that repeating the second period al-
tered efficiencies of estimators of effects. Patterson and Lucas
(1959, 1962) were first to discuss the serious estimability
problems that can arise in CO(2,s,2) designs. Grizzle (1965)
discusses the conventional analysis of the CO(2,2,2) design and
its associated estimability problems.
Some difficulties with
CO(2,s,2) designs remain unresolved.
Brown (1980) has urged that CO(2,s,2) designs (as analyzed conventionally) should never be used if carryover effects exist in
the experiment.
Helms, Kempthorne and Shen (1980) have exposed
the nature of the confounding in CO(2,s,2) and CO(2,s,3) designs.
They point out that analysis is complicated by several difficulties:
the analysis often must account for confounding; cus-
tomary analyses by less-than-full-rank ANOVA models add to the
complexity; in unbalanced designs it is especially important to
give considerable attention to the meaning of model parameters in
order to properly interpret results.
They also note that the con-
ventional model is a mixed model with uncertainties about proper
analysis.
They demonstrate a new analysis which is based on the
idea that it is advantageous to perform a separate full-rank linear
model analysis on each period and then combine results.
By this
approach confounding is reduced and interpretation of results is
simplified.
It is, however, necessary to conduct the experiment
according to the CO(2,4,2) design.
Fortunately balance is not
required so that the numbers of study units assigned to sequences
(A,A) and (B,B) may be small.
The advantages of the "by-period" strategy proposed by Helms,
Kempthorne and Shen (1980) are several:
Confounding is reduced,
interpretation of results is simplified and improved, messy data can
be analyzed without discarding any observations, period*treatment
interaction is allowed in the model, and carryover effects do not
affect the validity of the analysis since they too are allowed.
The by-period analysis involves separate tests of the form

   H_{0i}: τ_i = 0   vs   H_{ai}: τ_i ≠ 0   (i = 1, 2).

In order to test these hypotheses the Bonferroni inequality can be
used as follows. If at least one of the separate two-sided t-tests
is significant at the α/2 level then reject H_0. The test statistic
is then max|t_i|, where t_1 and t_2 are the individual t-statistics.
For this test, the true probability of type I error is α*, which is
less than α. The difference is

   α - α* = Pr[ |t_1| ≤ t(ν_1, α/4)  and  |t_2| ≤ t(ν_2, α/4) ] - (1 - α) ,

where t(ν, α/4) is defined by

   1 - α/4 = Pr[ t_ν < t(ν, α/4) ] ,

with t_ν a Student-t variate with ν degrees of freedom. The power
of this test procedure depends on the noncentral distribution of
the two t-statistics, (t_1, t_2):

   1 - Power = ∫_{-b}^{b} ∫_{-a}^{a} f(t_1, t_2) dt_1 dt_2 ,

where a = t(ν_1, α/4), b = t(ν_2, α/4), and f(·,·) is the bivariate
p.d.f. for the case (τ_1 ≠ 0) ∪ (τ_2 ≠ 0).
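To make the testing rule concrete, the following small Python sketch (added here, not from the thesis) computes the per-test critical values t(ν_i, α/4) with SciPy and, given any routine that evaluates the joint rectangle probability Pr[|t_1| ≤ a and |t_2| ≤ b], returns the actual level α* and the power; the joint-probability routines themselves are the subject of Chapters 2 and 3 and are only stubbed here.

```python
from scipy import stats

def bonferroni_critical_values(alpha, nu1, nu2):
    # t(nu, alpha/4): each two-sided test is run at level alpha/2
    a = stats.t.ppf(1.0 - alpha / 4.0, df=nu1)
    b = stats.t.ppf(1.0 - alpha / 4.0, df=nu2)
    return a, b

def actual_level_and_power(alpha, nu1, nu2, rect_prob_null, rect_prob_alt):
    """rect_prob_*(a, b) -> Pr[|t1| <= a and |t2| <= b] under the null /
    alternative bivariate-t distribution (the quantities computed by the
    algorithms of Chapters 2 and 3)."""
    a, b = bonferroni_critical_values(alpha, nu1, nu2)
    alpha_star = 1.0 - rect_prob_null(a, b)    # true type I error, <= alpha
    power = 1.0 - rect_prob_alt(a, b)          # 1 - Pr[accept both]
    return alpha_star, power
```

If the two t-statistics happened to be independent, the null rectangle probability would factor into (1 - α/2)², giving the familiar α* = 1 - (1 - α/2)² ≤ α.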
The distribution of (t_1, t_2) depends on the parameters of
the GLUMs for the two periods. It is assumed that

   E[ Y_i (N_i x 1) ] = X_i β_i ,   (i = 1, 2)

and

   V[ Y_i ] = σ_i² I_{N_i} .

Some (Y_{1j}, Y_{2j}) pairs are assumed dependent and have covariance
σ_1 σ_2 ρ. The exact definition of X_i can take many forms. Here it
is sufficient to know that X_i is a full-rank matrix. In the case of
N_1 = N_2 = N with all N study units observed in both periods, it
happens that X_1 and X_2 may be identical. In the analysis of
complete data it will often happen that this is true and is no
restriction. It is never required that β_1 = β_2. More generally,
it may happen that there is missing data or that covariable
adjustment is desired in one or both periods. Then X_1 and X_2 may
have different dimensions and entries.

In each period, secondary parameters θ_i = C_i β_i will be defined
and the hypothesis H_0: θ_i = 0 will be tested against a one- or two-
sided alternative. The estimate θ̂_i = M_i Y_i is defined by
M_i = C_i [X_i' X_i]^{-1} X_i'. Without loss of generality we may assume
that M_i is scaled so that M_i M_i' = 1. The numerator of the
t-statistic is θ̂_i. The denominator is the square root of a sum of
squares for error. The distribution of (θ̂_1, θ̂_2) is bivariate
normal. The distribution of the pair of sums of squares is bivariate
gamma. Relaxation of the restriction that X_1 = X_2 complicates the
derivation of the bivariate gamma distribution.
CHAPTER 2
BIVARIATE-t DISTRIBUTION THEORY FOR POWER COMPUTATIONS:
THE CASE OF IDENTICAL DESIGN MATRICES IN THE
LINEAR MODELS FOR PERIODS 1 AND 2
2.1 Summary

In this chapter distribution theory results are obtained
which are necessary for power calculation for the hypothesis tests
made in the by-period analysis of CO(2,4,2) designs where design
matrix X_1 equals X_2. The theoretical results culminate in two
algorithms for computing Pr[|t_1| ≤ t and |t_2| ≤ t], where t_i (i = 1, 2)
are t-statistics which have dependent numerators and dependent
denominators. These algorithms are used in Chapter 4 as tools for
comparing analyses on the basis of power against alternative hypotheses.
2.2 Formulation of the Power Function P(t)

In order to evaluate the power of general linear hypothesis
tests based on two t-statistics and the Bonferroni inequality it
is necessary to express the power function in a form suitable for
numerical computation. For two-sided tests this means computing

   P(t) = Pr[ |t_1| ≤ t  and  |t_2| ≤ t ] ,

where t_1 and t_2 are the t-statistics. The t-statistics are assumed
to be constructed from data which follow a bivariate linear model
with mean [(X β_1)', (X β_2)']'. The resulting joint distribution of
[t_1, t_2] in its general form is quite complicated. It is, however,
possible to derive the joint distribution of [t_1, t_2, x_1, x_2],
where x_i is proportional to the denominator of t_i (i = 1, 2).
Integration with respect to x (2x1) over its range and integration
with respect to t (2x1) over the square region t_i ∈ [-t, t], i = 1, 2,
yields P(t). An expression for P(t) is given in Theorem 2.1. The
proof of Theorem 2.1 is constructed from a series of results given
by Lemmas 2.1-2.3.
Lemma 2.1 sets forth the assumed distribution of the data and
the resulting distributions of the r.v.s. which form the numerators
and denominators of the t-statistics in the most general case.
In Lemma 2.2 it is additionally assumed that the two design
matrices X_1 and X_2 are identical and concomitantly that there is no
missing data, i.e., all the observations occur in pairs.
The joint
distribution of the error sums of squares [S_1², S_2²] is shown to be
a bivariate gamma with known p.d.f.
Lemma 2.3 gives the resulting joint p.d.f. of [t_1, t_2, x_1, x_2],
where x_i = [ S_i² / (σ_i²(1-ρ²)) ]^{1/2}.
The combined results form the proof of Theorem 2.1.
The
resulting expression for P(t) is useful and can be evaluated
numerically by several methods which are discussed later.
Lemma 2.1

Let

   [Y_1', Y_2']' ~ N_{n_1+n_2}( [(X_1 β_1)', (X_2 β_2)']' , Σ ),

with X_i (n_i by r_i) having rank r_i, with diagonal blocks of Σ equal
to σ_1² I and σ_2² I, and with off-diagonal block σ_1 σ_2 ρ D, where
D = [I_q, 0]' [I_q, 0] and q ∈ {0, 1, ..., min(n_1, n_2)}. Define

   P_i (n_i by n_i) = I - X_i [X_i' X_i]^{-1} X_i' , which has rank ν_i = n_i - r_i ;

   M_i (1 by n_i) = C_i [X_i' X_i]^{-1} X_i' , scaled such that M_i M_i' = 1 ;

   θ̂ (2x1) = [ M_1, 0 ; 0, M_2 ] [Y_1', Y_2']' ;   S_i² = Y_i' P_i Y_i .

Then θ̂ is bivariate normal,

   S_i² ~ σ_i² χ²_{ν_i} ,   i = 1, 2,

and the particular bivariate chi-square distribution of [S_1², S_2²]
is independent of θ̂.

Proof of Lemma 2.1:

The application of well-known multinormal distribution theory
yields the distributions of θ̂ and of S_1² and S_2². Since S_i² is a
function of the data which is orthogonal to both θ̂_1 and θ̂_2, the
two bivariate distributions must be independent.
Lemma 2.2

With the conditions of Lemma 2.1, if X_1 = X_2 = X (n by p), then
ν = ν_1 = ν_2 = n - p = trace( I - X[X'X]^{-1}X' ), M = C_1 [X'X]^{-1} C_2',
and

1.)  S_i² ~ σ_i² χ²_ν , i = 1, 2;  E[S_i²] = σ_i² ν;  V[S_i²] = 2 σ_i⁴ ν;

2.)  Cov[S_1², S_2²] = 2 σ_1² σ_2² ρ² ν;

3.)  the joint p.d.f. of S_1² and S_2² is

     f(S_1², S_2²) = [ σ_1² σ_2² (1-ρ²)² ]^{-1} Σ_{k=0}^{∞} C_k
                     f_{χ²_{ν+2k}}( S_1² / (σ_1²(1-ρ²)) )
                     f_{χ²_{ν+2k}}( S_2² / (σ_2²(1-ρ²)) ) ,

     where  C_k = Γ(ν/2 + k) ρ^{2k} (1-ρ²)^{ν/2} / [ Γ(ν/2) k! ] ,  k = 0, 1, ...,

     are the terms of a negative binomial expansion and sum to unity.

Proof of Lemma 2.2:

The outline for this proof is found in Johnson and Kotz (1972),
p. 221. The joint distribution is a mixture of joint distributions
since the sum of the weights, {C_k}, is unity.

The covariance of S_1² and S_2² can be derived using the procedures
developed by Evans (1969) (see Searle (1971), p. 65):

   Cov( Y_1' A Y_1 , Y_2' B Y_2 ) = 2·trace( A C B C' ) + 4 μ_1' A C B μ_2 ,

where C = E[ (Y_1 - μ_1)(Y_2 - μ_2)' ], applied to the case of S_1² and
S_2². The coefficient of correlation follows since the moments of the
marginal (χ²_ν) distributions are known; in particular,
E[S_i²] = σ_i² ν and V[S_i²] = 2 σ_i⁴ ν.

The joint p.d.f. of [S_1², S_2²] is of the form

   f(S_1², S_2²) = Σ_{k=0}^{∞} a_k f_{1,k}(S_1²) f_{2,k}(S_2²) ,
       with  Σ_{k} a_k = 1 ,  a_k > 0 ,

and f_{i,k}(·) a gamma density function. Hence the joint distribution
of [S_1², S_2²] is a mixture of bivariate distributions within each of
which the marginal distributions are independent. A sample from the
joint distribution of [S_1², S_2²] can be obtained by sampling from the
distribution with density f_{1,k}(S_1²)·f_{2,k}(S_2²) with probability a_k.
It is also interesting to note that the mixed moments, E[S_1^{2a} S_2^{2b}],
are easily computed as a weighted sum of moment products.
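Following the sampling remark above, here is a minimal Python sketch (added for illustration, not from the thesis) that draws from the mixture in part 3.): the mixing index k has the negative binomial weights C_k, and conditional on k the two error sums of squares are independent scaled chi-squares.

```python
import numpy as np

def sample_bivariate_error_ss(n_draws, nu, rho, sigma1, sigma2, rng=None):
    """Draw (S1^2, S2^2) pairs from the mixture representation of Lemma 2.2.

    k has weights C_k = Gamma(nu/2+k) rho^(2k) (1-rho^2)^(nu/2) / (Gamma(nu/2) k!),
    i.e. a negative binomial with nu/2 'successes' and success probability 1 - rho^2;
    given k, S_i^2 = sigma_i^2 (1 - rho^2) * chi^2_{nu+2k}, independently for i = 1, 2.
    """
    rng = np.random.default_rng(rng)
    k = rng.negative_binomial(nu / 2.0, 1.0 - rho ** 2, size=n_draws)
    s1 = sigma1 ** 2 * (1.0 - rho ** 2) * rng.chisquare(nu + 2 * k)
    s2 = sigma2 ** 2 * (1.0 - rho ** 2) * rng.chisquare(nu + 2 * k)
    return s1, s2

# quick check of the moments given in parts 1.) and 2.)
s1, s2 = sample_bivariate_error_ss(200_000, nu=10, rho=0.5, sigma1=1.0, sigma2=2.0, rng=0)
print(s1.mean(), 10.0)                               # E[S1^2] = sigma1^2 * nu
print(np.cov(s1, s2)[0, 1], 2 * 1 * 4 * 0.25 * 10)    # Cov = 2 sigma1^2 sigma2^2 rho^2 nu
```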
Lemma 2.3

With the conditions of Lemma 2.2, if x_i = [ S_i² / (σ_i²(1-ρ²)) ]^{1/2}
and t_i = θ̂_i / (S_i²/ν)^{1/2}, then the joint p.d.f. f(t_1, t_2, x_1, x_2)
can be written explicitly. The parameters of the joint distribution
are θ_1/σ_1, θ_2/σ_2, ν, ρ, and M.
Proof of Lemma 2.3:

By independence the joint p.d.f. of (θ̂_1, θ̂_2, S_1², S_2²) can be
written as the product of the bivariate normal p.d.f. of (θ̂_1, θ̂_2)
and the bivariate gamma p.d.f. of (S_1², S_2²). The joint p.d.f. of the
t-distributed r.v.'s and their denominators is obtained and simplified
using the following transformation:

   t_1 = θ̂_1 / (S_1²/ν)^{1/2}       so   θ̂_1 = x_1 t_1 σ_1 / [ν/(1-ρ²)]^{1/2} ;
   t_2 = θ̂_2 / (S_2²/ν)^{1/2}       so   θ̂_2 = x_2 t_2 σ_2 / [ν/(1-ρ²)]^{1/2} ;
   x_1 = [S_1²/(σ_1²(1-ρ²))]^{1/2}   so   S_1² = x_1² σ_1² (1-ρ²) ;
   x_2 = [S_2²/(σ_2²(1-ρ²))]^{1/2}   so   S_2² = x_2² σ_2² (1-ρ²) .

The Jacobian of this transformation is

   J = 4 x_1² x_2² σ_1³ σ_2³ (1-ρ²)³ / ν .

Define θ*' = [θ_1/σ_1, θ_2/σ_2] [ν/(1-ρ²)]^{1/2}. Substituting the
transformation and the mixture form of the bivariate gamma p.d.f.
from Lemma 2.2 gives f(t_1, t_2, x_1, x_2) as an infinite series whose
k-th term involves C_k, the χ²_{ν+2k} densities evaluated at x_1² and
x_2², and the bivariate normal density of (θ̂_1, θ̂_2) evaluated at the
transformed arguments.

It is easy to show that the series converges. The partial sums
S_n = Σ_{k=0}^{n} u_k are non-decreasing and bounded above. The limit
of the ratio u_{k+1}/u_k is zero, so by the ratio test {S_n} is a
convergent sequence. It follows that the series converges.

For the special case of θ = 0 and ρ = 0, we have
f(t_1, t_2, x_1, x_2) = f(t_1, x_1) · f(t_2, x_2), and integration over
the x_i verifies that

   f(t_i) = (1 + t_i²/ν)^{-(ν+1)/2} / [ √ν B(1/2, ν/2) ] .
Theorem 2.1

With the conditions of Lemma 2.2, the probability
P(t) = Pr[ |t_1| ≤ t and |t_2| ≤ t ] is given by the fourfold integral
of the joint p.d.f. f(t_1, t_2, x_1, x_2) of Lemma 2.3 over the region
t_i ∈ [-t, t], x_i ∈ [0, ∞), i = 1, 2.
Proof of Theorem 2.1:

Lemma 2.3 gives f(t_1, t_2, x_1, x_2) as the joint p.d.f. for
t_i ∈ (-∞, ∞) and x_j ∈ [0, ∞). The bivariate density f(t_1, t_2) is
obtained by integrating f(t, x) with respect to x over the region
defined by x_1 ≥ 0, x_2 ≥ 0. P(t) is then obtained by integrating
f(t) over the square region |t_i| ≤ t, i = 1, 2. Since f(t, x) is a
p.d.f., the order of integration may be changed. P(t) is a function
of t having parameters θ_1/σ_1, θ_2/σ_2, ν, ρ, and M.

Alternative expressions for the joint p.d.f. f(t, x) do not yield
a more useful result. For example, f(t, x) can be expressed in terms
of bivariate normal p.d.f.s with respect to t or with respect to x.
If f*(·) is the p.d.f. of a N_2(Q*θ, Σ*) distribution, with Q, θ and
Σ* certain matrices involving x, then the p.d.f. of t can be written
as a weighted series of such bivariate normal densities. Thus the
joint p.d.f. of [t_1, t_2] can be expressed in terms of incomplete
moments of the N_2(Q·θ, Σ*) distribution (or absolute moments if ν
is even). A useful recurrence relationship for
such moments is not available.
It is possible to express the incomplete moments as an iterated
integral. The equation

   M(k) := [2πa²]^{-1/2} ∫_0^∞ x^k exp( -(x-ξ)²/(2a²) ) dx

can be solved recursively to obtain M(k) = a_k + b_k M(0).
Unfortunately a_k and b_k are known but algebraically involved
functions of a and ξ. This renders the iterated integral approach
cumbersome. Use of the Hh_k(·) function is equally cumbersome.
2.3 A Maclaurin Series Expansion

Theorem 2.1 provides an expression for P(t) in terms of a
fourfold integral. By expressing the inner double integral g(t) as a
Maclaurin polynomial in t it is possible to reduce P(t) to terms of
the form

   ∫_0^∞ ∫_0^∞ x_1^r x_2^s e^{-½(x_1² + x_2²)} dx_1 dx_2 ,

which can be easily integrated. P(t) is then reduced to an
algebraically involved but discrete summation of terms. Theorem 2.2
gives an expression for the coefficients of the Maclaurin polynomial
in t. The proof of Theorem 2.2 is constructed from Lemmas 2.4-2.10.

Lemma 2.4 gives the kth derivative of ∫_0^t F(t,u) du with
respect to t when F is integrable. Lemma 2.5 gives the kth
derivative in the case of F(t,u) = ∫_0^t f(v,u) dv. Lemma 2.6 gives
the kth derivative of g(t) = ∫_{-t}^{t} ∫_{-t}^{t} f(v,u) dv du.
Lemma 2.7 provides a recurrence equation needed for Lemma 2.8.
Lemma 2.8 gives the kth derivative with respect to t of
f(t) = exp(g(t)). Lemma 2.9 adds the assumption that f(v,u) is of
the form exp[q] with q a quadratic form in u and v. The resulting
derivatives are used in Lemma 2.10 to obtain the derivatives needed
to compute the coefficients of the Maclaurin polynomial in t.
Theorem 2.2 uses the results of Lemmas 2.4-2.10 in order to
compute the coefficients.

Finally a corollary to Theorem 2.2 shows that the end
result can be generalized from integration over a square region
to integration over a rectangular region.
Lemma 2.4

   ∂^{k-1}/∂t^{k-1} ∫_0^t F(t,u) du
      = Σ_{i=0}^{k-2} ∂^{k-2-i}/∂t^{k-2-i} [ ∂^i/∂t^i F(t,u) ]|_{u=t}
        + ∫_0^t ∂^{k-1}/∂t^{k-1} F(t,u) du ,

where ∂^0/∂t^0 F(t,u) = F(t,u) by definition.

Proof of Lemma 2.4:

The equation is true for k = 2; i.e.,

   (1.1.1)   ∂/∂t ∫_0^t F(t,u) du = [F(t,u)]|_{u=t} + ∫_0^t ∂/∂t F(t,u) du .

Equation (1.1.1) can be used to compute ∂²/∂t² ∫_0^t F(t,u) du for
k = 3. Suppose the lemma is true for k = n. Then

   (1.1.2)   ∂^n/∂t^n ∫_0^t F(t,u) du
                = ∂/∂t { Σ_{i=0}^{n-2} ∂^{n-2-i}/∂t^{n-2-i} [ ∂^i/∂t^i F(t,u) ]|_{u=t}
                         + ∫_0^t ∂^{n-1}/∂t^{n-1} F(t,u) du } ,

and application of equation (1.1.1) gives

   (1.1.3)   ∂/∂t ∫_0^t ∂^{n-1}/∂t^{n-1} F(t,u) du
                = [ ∂^{n-1}/∂t^{n-1} F(t,u) ]|_{u=t} + ∫_0^t ∂^n/∂t^n F(t,u) du .

Combining (1.1.2) and (1.1.3) shows that the lemma is true for
k = n+1. Thus, by mathematical induction, we conclude that the lemma
is true for k = 2, 3, ..., n, n+1, ... .

Lemma 2.5

If F(t,u) = ∫_0^t f(v,u) dv, then

   (a.)  ∂/∂t ∫_0^t F(t,u) du = ∫_0^t f(t,u) du + ∫_0^t f(v,t) dv ;

   (b.)  ∂^k/∂t^k ∫_0^t F(t,u) du
            = ∫_0^t ∂^{k-1}/∂t^{k-1} f(t,u) du + ∫_0^t ∂^{k-1}/∂t^{k-1} f(v,t) dv
              + Σ_{i=0}^{k-2} ∂^{k-2-i}/∂t^{k-2-i} [ ∂^i/∂t^i ( f(t,u) + f(v,t) ) ]|_{u=v=t} .

Of course, ∫_0^t F(t,u) du = ∫_0^t ∫_0^t f(v,u) dv du.

Proof of Lemma 2.5:

Part (a.) follows as a direct corollary of equation (1.1.1) of
Lemma 2.4. Part (b.) gives the (k-1)th derivative of
∫_0^t f(t,u) du + ∫_0^t f(v,t) dv and is therefore a direct application
of Lemma 2.4.
Lemma 2.6

If g(t) = ∫_{-t}^{t} ∫_{-t}^{t} f(v,u) dv du and k = 2, 3, 4, ...,
then

   (a.)  g^{(1)}(t) = ∫_0^t [ f(t,u) + f(-t,u) + f(t,-u) + f(-t,-u) ] du
                      + ∫_0^t [ f(v,t) + f(-v,t) + f(v,-t) + f(-v,-t) ] dv ;

   (b.)  g^{(k)}(t) = ∫_0^t ∂^{k-1}/∂t^{k-1} [ f(t,u) + ... + f(-t,-u) ] du
                      + ∫_0^t ∂^{k-1}/∂t^{k-1} [ f(v,t) + ... + f(-v,-t) ] dv
                      + Σ_{i=0}^{k-2} ∂^{k-2-i}/∂t^{k-2-i} [ ∂^i/∂t^i ( f(t,u) + ... + f(-t,-u)
                            + f(v,t) + ... + f(-v,-t) ) ]|_{u=v=t} .

Proof of Lemma 2.6:

The double integral can be evaluated as an iterated integral
(see Purcell (1976), p. 777). By changing variables,

   (1.3.1)   g(t) = ∫_0^t ∫_0^t [ f(v,u) + f(-v,u) + f(v,-u) + f(-v,-u) ] dv du .

Direct application of Lemma 2.5 to equation (1.3.1) gives the above
results.
Lemma 2.7

If

   C_{k,i} = k! / [ (k-2i)! i! 2^i ]   if 0 ≤ i ≤ k/2 ,   and   C_{k,i} = 0   otherwise,

then C_{k+1,i} = C_{k,i} + k·C_{k-1,i-1}, and C_{k+1,i+1} = C_{k,i} if
i = (k-1)/2 and k is odd.

Proof of Lemma 2.7:

This result is easily verified in several ways.

Lemma 2.8

If f(t) = exp[g(t)], t ∈ (0,∞), and

   g^{(k)}(t) = { finite  if k = 1, 2 ;   0  if k ≥ 3 } ,

then

   (1.)  f^{(k)}(t) = f^{(k-1)}(t)·g^{(1)}(t) + (k-1)·f^{(k-2)}(t)·g^{(2)}(t) ;

   (2.)  f^{(k)}(t) = f(t) · Σ_{i=0}^{[k/2]} C_{k,i} [g^{(1)}(t)]^{k-2i} [g^{(2)}(t)]^i .

Note: If f(t) = exp[-½t²] then f^{(n)}(t)/f(t) = (-1)^n·H_n(t), the
Hermite polynomial.

Proof of Lemma 2.8:

Begin with f^{(1)}(t) = f(t)·g^{(1)}(t). Leibniz' formula is

   d^k/dt^k [ h_1(t)·h_2(t) ] = Σ_{i=0}^{k} (k choose i) h_1^{(i)}(t)·h_2^{(k-i)}(t)

and can be applied to f^{(k)}(t) = d^{k-1}/dt^{k-1} [ f(t)·g^{(1)}(t) ]. Since
g^{(i+1)}(t) = 0 if i ≥ 2, only the first two terms survive, and part
(1.) is proven.

Part (2.) is proved inductively by using part (1.). Let
S = {integers k : part (2.) is true}. Then 1 ∈ S and 2 ∈ S because

   f^{(1)}(t) = f(t)·g^{(1)}(t) ,
   f^{(2)}(t) = f(t)·[ [g^{(1)}(t)]² + g^{(2)}(t) ] .

Assume that n ∈ S and (n-1) ∈ S. Then, by the recurrence relation
(part (1.)) already proved,

   f^{(n+1)}(t) = f(t) { Σ_{i} C_{n,i} [g^{(1)}(t)]^{n-2i+1} [g^{(2)}(t)]^i
                        + n Σ_{j} C_{n-1,j} [g^{(1)}(t)]^{n-2j-1} [g^{(2)}(t)]^{j+1} } .

By considering the two cases "n is odd" and "n is even", f^{(n+1)}(t)
can be expressed as

   f^{(n+1)}(t) = f(t) · Σ_{i=0}^{[(n+1)/2]} D_{n+1,i} [g^{(1)}(t)]^{n+1-2i} [g^{(2)}(t)]^i ,

where D_{n+1,0} = C_{n,0}, D_{n+1,i} = C_{n,i} + n·C_{n-1,i-1} for the
intermediate values of i, and D_{n+1,i} = n·C_{n-1,i-1} at the final
value of i. By Lemma 2.7, D_{n+1,i} = C_{n+1,i}. Thus (n+1) ∈ S
whenever n ∈ S and (n-1) ∈ S. By the axiom of mathematical induction,
S = {1, 2, 3, ...}.
Lemma 2.9

Assume

   f(cv, bu) = exp{ -½ ( [cvx_1, bux_2]' - a )' A ( [cvx_1, bux_2]' - a ) } ,

where c = ±1 and b = ±1, and define

   Q_1 = [cx_1, 0] A [cx_1, 0]' ,      Q_2 = [cx_1, 0] A a ,
   Q_3 = [cx_1, 0] A [cx_1, bx_2]' ,   Q_4 = [cx_1, bx_2] A a ,
   Q_5 = [cx_1, bx_2] A [cx_1, bx_2]' ,   m = k-2-i ,   i = 0, 1, ..., k-2 .

Then

   (1.)  ∂^m/∂t^m [ ∂^i/∂t^i f(ct, bu) ]|_{u=t}
            = Σ_{j=0}^{min[m,i]} (m choose j) [ ∂^{m-j}/∂t^{m-j} f(ct, bt) ]
              Σ_{l=0}^{[(i-j)/2]} C_{i,l} [(i-2l)!/(i-2l-j)!]
              [-Q_1]^l [Q_2 - tQ_3]^{i-2l-j} [-Q_3]^j ,                     (1.6.1)

         and ∂^m/∂t^m [ ∂^i/∂t^i f(cv, bt) ]|_{v=t} is obtained from (1.6.1)
         by replacing [cx_1, 0] with [0, bx_2] in the definitions of Q_1,
         Q_2 and Q_3 ;

   (2.)  if i = 0 then

            ∂^m/∂t^m f(ct, bt) = f(ct, bt) Σ_{n=0}^{[m/2]} C_{m,n}
                                 [Q_4 - tQ_5]^{m-2n} [-Q_5]^n ;              (1.6.2)

   (3.)  [ ∂^m/∂t^m [ ∂^i/∂t^i f(-ct, -bu) ]|_{u=t} ]|_{t=0}
            = (-1)^k [ ∂^m/∂t^m [ ∂^i/∂t^i f(ct, bu) ]|_{u=t} ]|_{t=0} ,     (1.6.3)

         and similarly for f(cv, bt).
Proof of Lemma 2.9:

By Lemma 2.8, part (2.),

   ∂^i/∂t^i f(ct, bu) = f(ct, bu) · Σ_{l=0}^{[i/2]} C_{i,l}
        [ ∂/∂t log f(ct, bu) ]^{i-2l} [ ∂²/∂t² log f(ct, bu) ]^l ,

where ∂/∂t log f(ct, bu) = [cx_1, 0] A a - t[cx_1, 0] A [cx_1, 0]'
- u[cx_1, 0] A [0, bx_2]' and ∂²/∂t² log f(ct, bu) = -[cx_1, 0] A [cx_1, 0]'.
Evaluation at u = t gives

   [ ∂^i/∂t^i f(ct, bu) ]|_{u=t} = f(ct, bt) · Σ_{l=0}^{[i/2]} C_{i,l}
        [Q_2 - tQ_3]^{i-2l} [-Q_1]^l .

It follows that ∂^m/∂t^m [ ∂^i/∂t^i f(ct, bu) ]|_{u=t} is given by Leibniz'
formula (see the proof of Lemma 2.8). Taking the differential operation
inside the summation indexed by l, and then noting that some of the
terms have jth derivatives which are zero, yields equation (1.6.1),
which was to be proved.

For ∂^m/∂t^m [ ∂^i/∂t^i f(cv, bt) ]|_{v=t} the above computation is
essentially repeated: here ∂/∂t log f(cv, bt) = [0, bx_2] A a
- v[cx_1, 0] A [0, bx_2]' - t[0, bx_2] A [0, bx_2]' and
∂²/∂t² log f(cv, bt) = -[0, bx_2] A [0, bx_2]', so that the result for
f(cv, bt) can be obtained from equation (1.6.1) by replacing [cx_1, 0]
with [0, bx_2] in the definitions of Q_1, Q_2, and Q_3. Thus, part (1.)
of the lemma is proved.

The proof of part (2.) is trivial: equation (1.6.2) is a result of
the direct application of Lemma 2.8, part (2.).

The proof of part (3.) is as follows. By part (2.),

   [ ∂^{m-j}/∂t^{m-j} f(ct, bt) ]|_{t=0}
      = f(0,0) Σ_{n=0}^{[(m-j)/2]} C_{m-j,n} [Q_4]^{m-j-2n} [-Q_5]^n

and

   [ ∂^{m-j}/∂t^{m-j} f(-ct, -bt) ]|_{t=0}
      = f(0,0) Σ_{n=0}^{[(m-j)/2]} C_{m-j,n} [-Q_4]^{m-j-2n} [-Q_5]^n .

By part (1.), the two quantities in (1.6.3) are sums over
j = 0, ..., min[m,i] of such terms multiplied by factors which change
sign as (-1)^{i-j} when (c, b) is replaced by (-c, -b), since Q_2 and
Q_4 are linear in (c, b) while Q_1, Q_3 and Q_5 are quadratic.
Collecting the signs, each term of the left-hand side equals
(-1)^{m+i} times the corresponding term on the right-hand side; since
m = k-2-i, (-1)^{m+i} = (-1)^k and equation (1.6.3) is proved.

Note: If Q_4 = 0 then [ ∂^{m-j}/∂t^{m-j} f(ct, bt) ]|_{t=0} is zero if m-j
is odd and not necessarily zero if m-j is even. If Q_2 = 0 then the
corresponding inner sum over l is zero if i-j is odd and not
necessarily zero if i-j is even.
Lemma 2.10

If m = k-2-i and i = 0, 1, ..., k-2, and f(v,u) is as given in
Lemma 2.9, then

   [ ∂^m/∂t^m [ ∂^i/∂t^i ( f(t,u) + f(-t,u) + f(v,t) + f(v,-t) ) ]|_{u=v=t} ]|_{t=0}

equals f(0,0) times a finite sum of terms indexed by
j = 0, ..., min[m,i] and by the indices l, n, p, q, r, s of the
expansions in Lemma 2.9. Each term is a product of binomial
coefficients, the coefficients C_{i,l} and C_{m-j,n} of Lemma 2.7,
factorial ratios, sign factors of the form [1 + (-1)^{i+n-p+q-r}], and
powers of x_1, x_2, the elements A_11, A_12, A_22 of A, and the linear
combinations A_11 a_1 + A_12 a_2 and A_12 a_1 + A_22 a_2.

Proof of Lemma 2.10:

Write (J, I) ∈ {(1,2), (2,1)} and c = ±1. With w_J = ct and
w_I = ±t, the quantities of Lemma 2.9 become

   Q_1(cx_J) = x_J² A_JJ ,
   Q_2(cx_J) = c x_J (A_J1 a_1 + A_J2 a_2) ,
   Q_3(cx_J) = x_J² A_JJ + c x_J x_I A_12 ,
   Q_4(cx_J) = c x_J (A_J1 a_1 + A_J2 a_2) + x_I (A_I1 a_1 + A_I2 a_2) ,
   Q_5(cx_J) = x_J² A_JJ + x_I² A_II + 2 c x_J x_I A_12 .

Substituting these into equations (1.6.1) and (1.6.2), expanding the
powers of Q_2 - tQ_3 and Q_4 - tQ_5 by the binomial theorem, evaluating
at t = 0, and summing over the four sign combinations,

   [ ∂^m/∂t^m [ ∂^i/∂t^i ( f(t,u) + f(-t,u) + f(v,t) + f(v,-t) ) ]|_{u=v=t} ]|_{t=0}
      = Σ_{J=1,2} Σ_{c=±1} [ ∂^m/∂t^m [ ∂^i/∂t^i f(w_1, w_2) ]|_{w_J=ct, w_I=t} ]|_{t=0} ,

gives the result.

Note: If Q_2 = 0 = Q_4 then the term indexed by j is zero when i-j is
odd and not necessarily zero when i-j is even. Thus, for fixed i, the
sum indexed by j may be computed over only the even values of j when i
is even, and over only the odd values of j when i is odd.
Theorem 2.2

If

   g(t) = ∫_{-t}^{t} ∫_{-t}^{t} f(v,u) dv du ,

where

   f(v,u) = exp[ -½ ( diag(v,u) x - a )' A ( diag(v,u) x - a ) ] ,

x' = [x_1, x_2] has non-negative entries, A (2 by 2) is symmetric, and
t ∈ (0,∞), then g(t) can be expressed as a Maclaurin series:

   g(t) = Σ_{k=0}^{N} [ g^{(k)}(0) / k! ] t^k + R_N(t) .

The coefficients of this polynomial in t are given by

   g^{(k)}(0) = [ ∂^k/∂t^k g(t) ]|_{t=0}
             = 2 Σ_{i=0}^{k-2} D_i    if k = 2, 4, 6, ... ,
             = 0                       if k = 0, 1, 3, 5, ... ,

where

   D_i = [ ∂^{k-2-i}/∂t^{k-2-i} [ ∂^i/∂t^i ( f(t,u) + f(-t,u) + f(v,t) + f(v,-t) ) ]|_{u=v=t} ]|_{t=0} ,

m = k-2-i, and D_i is evaluated explicitly, via Lemma 2.10, as f(0,0)
times a finite multiple sum in the indices j, l, n, p, q, r, s whose
terms are products of binomial coefficients, the coefficients C_{i,l}
and C_{m-j,n}, factorial ratios, powers of x_1 and x_2, and powers of
the elements of A and of A a.
The proof of Theorem 2.2 is easily shown by using the results
of Lemmas 2.4-2.10. The integral g(t) can be differentiated any
number of times at all points of the range of t, [0,∞). Lemma 2.6,
which is derived from Lemmas 2.4 and 2.5, gives the form of the kth
derivative of g(t) as

   g^{(1)}(t) = ∫_0^t [ f(t,u) + f(-t,u) + f(t,-u) + f(-t,-u) ] du
               + ∫_0^t [ f(v,t) + f(-v,t) + f(v,-t) + f(-v,-t) ] dv ,

   g^{(k)}(t) = ∫_0^t ∂^{k-1}/∂t^{k-1} [ f(t,u) + ... + f(-t,-u) ] du
               + ∫_0^t ∂^{k-1}/∂t^{k-1} [ f(v,t) + ... + f(-v,-t) ] dv
               + Σ_{i=0}^{k-2} ∂^{k-2-i}/∂t^{k-2-i} [ ∂^i/∂t^i ( f(t,u) + ... + f(-t,-u)
                     + f(v,t) + ... + f(-v,-t) ) ]|_{u=v=t} ,

where k = 2, 3, 4, ... and ∂^0/∂t^0 f = f.

Lemma 2.9, which is based on Lemmas 2.7 and 2.8, derives the form
of ∂^m/∂t^m [ ∂^i/∂t^i f(ct,bu) ]|_{u=t} and ∂^m/∂t^m [ ∂^i/∂t^i f(cv,bt) ]|_{v=t},
where c = ±1, b = ±1, and m = k-2-i. Lemma 2.9 also shows that

   [ ∂^m/∂t^m [ ∂^i/∂t^i f(ct,bu) ]|_{u=t} ]|_{t=0}
      = (-1)^k [ ∂^m/∂t^m [ ∂^i/∂t^i f(-ct,-bu) ]|_{u=t} ]|_{t=0}

(and similarly for f(cv,bt)). It follows that if k is odd then

   D_i = [ ∂^{k-2-i}/∂t^{k-2-i} [ ∂^i/∂t^i ( f(t,u) + f(-t,u) + f(t,-u) + f(-t,-u)
           + f(v,t) + f(-v,t) + f(v,-t) + f(-v,-t) ) ]|_{u=v=t} ]|_{t=0}

is zero for all valid values of i. Thus g^{(k)}(0) = 0 if k is odd.
When k is even then

   D_i = [ ∂^{k-2-i}/∂t^{k-2-i} [ ∂^i/∂t^i ( f(t,u) + f(-t,u) + f(v,t) + f(v,-t) ) ]|_{u=v=t} ]|_{t=0} .

Finally, Lemma 2.10 gives an expression for D_i in terms of x_1 and x_2.

Since g(t) is continuous together with all its derivatives on
the interval [0,∞), g(t) can be represented as a Taylor polynomial
in t. For practical applications t will be small (e.g., t ∈ (0,10))
and therefore a Taylor series expansion of g(t) about t = 0 is most
appropriate. Thus Theorem 2.2 gives the explicit form of g(t) as
a Maclaurin series.

The remainder term for the Maclaurin series is R_N(t); Lagrange's
form of the remainder is

   R_N(t) = g^{(N+1)}(c) t^{N+1} / (N+1)! ,

where c is an unknown value only restricted by c ∈ [0,t]. An
explicit expression for R_N(t) in terms of the underlying parameters
θ_1/σ_1, θ_2/σ_2, ρ, ν, M and t is so algebraically involved that no
usefully sharp bounds on R_N(t) could be derived. Preliminary
computer trials were run in order to determine empirically whether
or not the series converges numerically. It was discovered that
the series is an alternating series and that whenever it does
converge it does so quickly and accurately. Unfortunately the
interval of convergence is an unknown function of the above
parameters.
Corollary to Theorem 2.2

If

   g(t) = ∫_{-Ut}^{Ut} ∫_{-Vt}^{Vt} f(v,u) dv du ,   where U > 0 and V > 0,

and z_1 = V x_1 and z_2 = U x_2, then g(t) can be expressed as a
Maclaurin series:

   g(t) = UV Σ_{k=0}^{N} [ g^{(k)}(0) / k! ] t^k + R_N'(t) ,

where the definition of g^{(k)}(0) is as in Theorem 2.2 but with
[x_1, x_2] replaced by [z_1, z_2].

Proof of the corollary:

By a simple change of variables,

   g(t) = UV ∫_{-t}^{t} ∫_{-t}^{t} f(Vv, Uu) dv du .

Since f(v,u) = exp( -½ ( diag(v,u) x - a )' A ( diag(v,u) x - a ) ),
f(Vv, Uu) may be written as

   f(Vv, Uu) = exp{ -½ ( diag(v,u) z - a )' A ( diag(v,u) z - a ) } .

As in Theorem 2.2, the remainder term R_N'(t) cannot be adequately
expressed in terms of underlying parameters to yield useful
information about the interval of convergence.
2.4 Two Algorithms for Computing Power

Theorem 2.2 can be used to provide a computational algorithm
for P(t) as represented in Theorem 2.1. Theorem 2.2 gives a
polynomial approximation for g(t):

   g(t) ≈ Σ_{h=1}^{N} [ g^{(2h)}(0) / (2h)! ] t^{2h} ,

where N may be taken to be as large as desired. Thus P(t) can be
approximated by a truncated double series, P(t; N_1, N_2), obtained by
truncating both infinite series (the mixture series of Lemma 2.2 and
the Maclaurin series) at finitely many terms. The integral in
P(t; N_1, N_2) can be written as a sum, over the mixture index k and
the Maclaurin index h, of terms of the form

   [ t^{2h} / (2h)! ] ∫_0^∞ ∫_0^∞ (powers of x_1 and x_2) · g^{(2h)}(0) · exp( -½ x' x ) dx_1 dx_2 ,

where g^{(2h)}(0) is a polynomial in x_1 and x_2. Therefore, the
integrals in the resulting sum are of the form

   c · ∫_0^∞ ∫_0^∞ x_1^r x_2^s exp( -½ (x_1² + x_2²) ) dx_1 dx_2 ,

which can be written as the product of two integrals equal to

   c · 2^{(r+s-2)/2} Γ((r+1)/2) Γ((s+1)/2) .
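The product-of-gammas evaluation above is easy to check numerically; the following short sketch (added for illustration) compares a quadrature of the double integral with the closed form for a few arbitrary (r, s).

```python
import numpy as np
from scipy import integrate
from scipy.special import gamma

def closed_form(r, s):
    # 2^{(r+s-2)/2} * Gamma((r+1)/2) * Gamma((s+1)/2)
    return 2.0 ** ((r + s - 2) / 2.0) * gamma((r + 1) / 2.0) * gamma((s + 1) / 2.0)

def by_quadrature(r, s):
    v1, _ = integrate.quad(lambda x: x ** r * np.exp(-0.5 * x ** 2), 0, np.inf)
    v2, _ = integrate.quad(lambda x: x ** s * np.exp(-0.5 * x ** 2), 0, np.inf)
    return v1 * v2

for r, s in [(0, 0), (3, 2), (5, 7)]:
    print(r, s, closed_form(r, s), by_quadrature(r, s))
```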
It has been observed numerically that g^{(2h)}(0) alternates in
sign: {g^{(2)}(0), g^{(6)}(0), g^{(10)}(0), ...} are positive while
{g^{(4)}(0), g^{(8)}(0), ...} are negative.

Substituting the explicit expression for g^{(2h)}(0) from Theorem 2.2
and evaluating each of the resulting integrals by the product-of-gammas
formula above reduces P(t; N_1, N_2) to a finite, algebraically involved
sum over the mixture index k, the Maclaurin index h, and the indices
i, j, l, n, p, q, r, s of Lemma 2.10. Each term is a product of powers
of t, ρ, M, θ_1/σ_1 and θ_2/σ_2, binomial coefficients, factorial
ratios, and gamma functions.

Aside: Notice that the sum indexed by k can be written as

   Σ_{k} [ρ²]^k Γ(k+a) Γ(k+b) / [ Γ(k+c) k! ]
      = [ Γ(a) Γ(b) / Γ(c) ] F(a, b; c, ρ²)  <  ∞ ,

where F(a, b; c, ρ²) is the hypergeometric function, with the Euler
integral representation

   F(a, b; c, z) ∝ ∫_0^1 t^{b-1} (1-t)^{c-b-1} (1-tz)^{-a} dt ,

and where a and b exceed ν/2 by amounts determined through
e = (i+n+p+q-r+2s)/2 ≤ h-1, with c of the same form. By using
identities such as

   Γ(a+b)/Γ(b) = Π_{u=1}^{a} (b-1+u)     (a a positive integer),

the gamma-function ratios reduce to finite products, so that
P(t; N_1, N_2) can be written entirely in terms of elementary
operations. This expression is suitable for translation to FORTRAN.
The expression for P(t) in Theorem 2.1 can be evaluated
by methods other than that of the Maclaurin polynomial. Since

   P(t) = ∫_{-t}^{t} ∫_{-t}^{t} ∫_0^∞ ∫_0^∞ f_{T,X}(t_1, t_2, x_1, x_2) dx dt

is a 4-fold integral with known integrand, 4-fold numerical
integration is possible by Monte Carlo integration. This method is
straightforward and inexpensive when only a few (e.g., 2) significant
digits of accuracy are necessary. As a function of the number of
significant digits desired, cost grows rapidly. Fortunately only one
or two digits are required for the purposes of this research.

The probability integral is prepared for Monte Carlo computation
by the following change of variables:

   y_1 = x_1/(1+x_1) ,   y_2 = x_2/(1+x_2) ,
   y_3 = ½(1 + t_1/t) ,  y_4 = ½(1 + t_2/t) .

The Jacobian of this transformation is J = [ 2t / ((1-y_1)(1-y_2)) ]²,
and so

   P(t) = ∫_0^1 ∫_0^1 ∫_0^1 ∫_0^1 f(t; y_1, y_2, y_3, y_4) dy_1 dy_2 dy_3 dy_4 ,

where

   f(t; y) = f_{T,X}( t(2y_3 - 1), t(2y_4 - 1), y_1/(1-y_1), y_2/(1-y_2) ) · J .

P(t) can be approximated by

   P_N(t) = (V/N) Σ_{i=1}^{N} f(t; y_i) ,

where V = ∫_0^1 ∫_0^1 ∫_0^1 ∫_0^1 1 dy = 1 is the volume of the region
of integration and {y_i} are N randomly selected points uniformly
distributed in the region of integration. The random variables y_ij
and y_kl are independent for all (i,j) ≠ (k,l), with uniform (0,1)
marginal distributions. It follows that P_N(t) is a random variable
with mean P(t) and variance ( E_Y[f²(t; Y)] - [P(t)]² ) / N, where

   E_Y[ f²(t; Y) ] = ∫_0^1 ∫_0^1 ∫_0^1 ∫_0^1 f²(t; y) dy .

For numerical evaluation, N 4-tuples of pseudo-random numbers,
{y_i}_{i=1}^{N}, are generated and f(t; y_i) must be evaluated for each
of the N 4-tuples.
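A generic implementation of this estimator is short; the sketch below (added for illustration, not the thesis's FORTRAN routine) performs plain Monte Carlo integration of an arbitrary integrand over the unit 4-cube, which is exactly the form P_N(t) takes once the integrand f(t; y) above has been coded.

```python
import numpy as np

def monte_carlo_unit_cube(integrand, n_points, dim=4, rng=None):
    """Plain Monte Carlo estimate of an integral over [0,1]^dim,
    returning the estimate and its standard error."""
    rng = np.random.default_rng(rng)
    y = rng.uniform(size=(n_points, dim))
    values = np.apply_along_axis(integrand, 1, y)
    estimate = values.mean()                        # P_N(t)
    std_error = values.std(ddof=1) / np.sqrt(n_points)
    return estimate, std_error

# toy check: an integrand identically 1 integrates to the cube's volume, 1
est, se = monte_carlo_unit_cube(lambda y: 1.0, 10_000, rng=0)
print(est, se)
```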
2.5 Comparison of the Two Algorithms

In order to debug the algorithms and determine how well
they each perform, preliminary trial computations were attempted.
In the first trial, the following arbitrary parameter values were
used: θ_1/σ_1 = θ_2/σ_2 = 1.0, ρ = 0.5, ν = 10, M = 0.5 and t = 1.
Using 24 seconds of c.p.u. time, the Monte Carlo algorithm quickly
converged to P(t) = .27. The Maclaurin algorithm converged slowly
as an alternating series: after 12 seconds P̂(t) = .24, and after
another 3 minutes and 15 seconds P̂(t) = .25. It appears that the
alternating series would eventually converge to a value near .27;
however, this cannot be proven since no usefully sharp bounds
could be derived for the residual term of the Maclaurin expansion
used.

For the special case of ρ = 0 the correct value of P(t) can
be computed from tables of the student-t distribution. Several
examples were tried. The results for the Maclaurin algorithm
are as follows:
RUN   θ_1/σ_1  θ_2/σ_2   ρ     M     ν     t      P(t)    P̂(t)   Converged?          Seconds
 1.      0        0      0    0.0   10   1.093   .4900   .4920   apparently            12
 2.      0        0      0    0.5   10   0.879   .3600   .3601   apparently            12
 3.      0        0      0    0.5   10   1.900   .8100    >1     rapidly diverging     11
 4.      0        0      0    0.5    1   6.314   .8100    >1     rapidly diverging     11
 5.      0        0      0    0.5    6   2.000   .9025    >1     rapidly diverging     12
 6.      0        0      0    0.5   10   2.228   .9025    >1     rapidly diverging     12
The Monte Carlo algorithm was also run for these examples
and performed as well as or better than the Maclaurin algorithm
in all cases.
While the Maclaurin algorithm does compute an
analytic formulation of P(t), it must perform many arithmetic
operations to do so.
The Monte Carlo integration method only
estimates P(t) as the mean of a distribution; however, to do so
it needs relatively few operations.
CHAPTER 3
BIVARIATE-t DISTRIBUTION THEORY FOR POWER COMPUTATIONS:
THE CASE OF DIFFERENT DESIGN MATRICES IN THE
LINEAR MODELS FOR PERIODS 1 AND 2
3.1
Summary
As in Chapter 2, theoretical results are developed in order
to construct an algorithm for computing Pr[|t_1| ≤ t and |t_2| ≤ t].
In this chapter the theory is additionally complicated by the
removal of the assumption that the period 1 and period 2 linear
models in the by-period analysis have identical design matrices.
Hence algorithms for direct integration or numerical integration
or Monte Carlo integration are not derived. Instead a method
of Monte Carlo simulation is presented which handles the most
general case of messy data with different design matrices. Fur-
thermore this algorithm can be used to compute Pr[t_1 ∈ R_1 and t_2 ∈ R_2],
where R_1 and R_2 are intervals on the real line or sets of
real numbers. This algorithm is therefore useful for the power
calculations and for calculation of the difference, (α - α*), be-
tween the selected level of significance and the actual level
when using the Bonferroni inequality. The algorithm is useful
for both two-sided and one-sided tests. Of course this
algorithm can also be used to perform the computations of the
Chapter 2 algorithms as special cases. The theoretical results of
this chapter hinge on the use of canonical variates, without loss
of generality.
3.2  The Structure of the Data

In Chapter 2 it is assumed that the two t-statistics are constructed from data which follow a bivariate general linear model (GLM) having mean

$$E[\,\underline{Y}_i\,] \;=\; \underline{X}\,\underline{\beta}_i\,,\qquad i=1,2\,.$$

In this chapter the requirement of identical design matrices ($\underline{X}$) is relaxed to handle the most general case:

$$E[\,\underline{Y}_i\,] \;=\; \underline{X}_i\,\underline{\beta}_i\,,\qquad i=1,2\,,$$

where $\underline{X}_i$ is ($n_i$ by $r_i$) and $\underline{\beta}_i$ is ($r_i$ by 1). Here $\underline{Y}_1$ and $\underline{Y}_2$ may have different dimensions as well as different elements. It is assumed that some (or all) of the observations are pairwise correlated; i.e., $Y_{1i}$ and $Y_{2i}$ are dependent for some $i$. This model is applicable to experiments which suffer dropouts and/or admit new study subjects between periods 1 and 2.
3.3  Transformation of the Data to Canonical Form

As in Chapter 2, the computation of the probability P(t) = Pr[|T1| ≤ t and |T2| ≤ t] is desired. Here Ti (i=1,2) are t-statistics based on period 1 and period 2 data respectively. The joint distribution of (T1,T2) in its general form is quite complicated. It is, however, possible to derive a useful expression for the joint distribution of [T1,T2,Q1,Q2], where Qi is proportional to the square of the denominator of Ti (i=1,2). (Integration with respect to Q (2 by 1) over R^2 and integration with respect to T (2 by 1) over the square region Ti ∈ [−t,t] (i=1,2) yields P(t).)

As a first step in deriving the distribution of [T,Q] it is quite beneficial to transform the data to canonical form. The result is simplified distribution theory without loss of generality.
The t-statistic is the ratio of a normal r.v. to the square root of an error sum of squares, and may be written

$$T_i \;=\; \frac{M_i\,\underline{Y}_i}{\bigl[\underline{Y}_i{}'P_i\underline{Y}_i/\nu_i\bigr]^{1/2}}
      \;=\; \frac{\hat\theta_i/\sigma_i}{\bigl[\underline{Y}_i{}'P_i\underline{Y}_i/(\sigma_i^2\nu_i)\bigr]^{1/2}}
      \;=\; \frac{W_i}{\bigl[\underline{Z}_i{}'\underline{Z}_i/\nu_i\bigr]^{1/2}}\,,$$

where $M_i$ is (1 by $n_i$), $P_i$ is ($n_i$ by $n_i$), $W_i \sim N(\mu_i,1)$, and $\underline{Z}_i$ ($\nu_i$ by 1) $\sim N(\underline{0},I)$ independent of $W_i$ (i=1,2). The $\underline{Z}_i$ are canonical variates computed from the values of $\underline{Y}_i$ and $\sigma_i$. The simplifying transformations from $\underline{Y}_i$ to $W_i$ and from $\underline{Y}_i$ to $\underline{Z}_i$ can be performed simultaneously: the joint distribution of $[W_1,W_2]$ is normal and independent of the distribution of $[Q_1,Q_2]$, where $Q_i := \underline{Y}_i{}'P_i\underline{Y}_i$, so that $\underline{Z}_i{}'\underline{Z}_i = Q_i/\sigma_i^2$.
The bivariate chi-square distribution of [Q1,Q2] is complicated, but useful expressions for the characteristic function and for the p.d.f. can be derived from the joint distribution of the canonical variates $\underline{Z}_1$ ($\nu_1$ by 1) and $\underline{Z}_2$ ($\nu_2$ by 1). The canonical variates $\underline{Z}_i = [Z_{1i},Z_{2i},\ldots,Z_{\nu_i i}]'$ are such that the only dependencies are pairwise:

$$[Z_{j1},Z_{j2}]' \;\sim\; N_2\!\left(\underline{0},\;\begin{bmatrix}1 & \rho d_j\\ \rho d_j & 1\end{bmatrix}\right)
\qquad\text{for all } j\in\{1,2,\ldots,\nu_2\}\,.$$

The canonical correlations, $d_j$, are defined in Lemma 3.3. For $j\in\{\nu_2+1,\ldots,\nu_1\}$, $Z_{j1}$ is independent of all other $Z_{ji}$. It is unfortunate but in general true that the $[Z_{j1},Z_{j2}]$ pairs do not share a common covariance. Some of the $d_j$ may be zero, and $d_j = 0$ implies $Z_{j1}$ and $Z_{j2}$ are independent.
Lemma 2.1 describes the structure and assumed distribution of the data (Y1,Y2) in general. Lemma 3.3 uses the results of Lemmas 3.1 and 3.2 to construct the linear transformation from the response variables [Y1',Y2']' to the canonical variables [Z1',Z2']' and to give the distribution of [Z1',Z2']'. Lemma 3.4 shows that dividing the numerator and denominator of Ti by σi poses no distributional problems when performed simultaneously for i=1,2. Lemma 3.5 proves that [T1*,T2*] constructed from [W1,W2,Z1',Z2'] has the same distribution as [T1,T2] constructed directly from [Y1',Y2']. It also shows that the distribution of [T1,T2] depends on θ1, θ2, σ1, and σ2 only through θ1/σ1 and θ2/σ2.

Lemma 3.6 provides a result useful for computing P(t) by Monte Carlo simulation. Pseudo-random variables having the joint distribution of [W1,W2,Q1,Q2] can easily be generated from standard normal variable values. The method is most efficient when ν1 + ν2 is small.

Lemma 3.7 derives the bivariate chi-square distribution of [Q1,Q2] following the work of Jensen (1970). Lemma 3.8 gives an expression for the p.d.f. of the four-variate distribution of [T1,T2,Q1,Q2]. Theorem 3.1 presents a formula for P(t) which can easily be numerically evaluated using Monte Carlo integration or by using the Maclaurin series method described in Chapter 2.
Lemma 3.1

If $P = [\,I_n - X(X'X)^{-1}X'\,]$, where $X$ ($n$ by $r$) has rank $r$, so that $P$ has rank $\nu = n - r$,

Then there exists $M$ ($\nu$ by $n$) such that $M'M = P$, $MM' = I_\nu$, and $MX = \underline{0}$.

Proof of Lemma 3.1:

The spectral decomposition of $P$ is $P = V\Delta V'$, where $\Delta$ is a diagonal matrix of eigenvalues and $V$ is an orthonormal matrix of eigenvectors. Since $P$ is idempotent the eigenvalues are zeros and ones, and since rank($P$) = $\nu$, only $\nu$ of the eigenvalues are nonzero. Define $\tilde\Delta$ ($\nu$ by $n$) = ($\Delta$ with null rows deleted), so that $P = V\Delta V' = M'M$, where $M = \tilde\Delta V'$ is a subset of the rows of $V'$. Then $\tilde\Delta\tilde\Delta' = I_\nu$, and $MM' = \tilde\Delta V'V\tilde\Delta' = I_\nu$. Finally, $PX = \underline{0}$ implies $M'MX = \underline{0}$, which implies $X'M'\!\cdot\!MX = (MX)'(MX) = \underline{0}$, which implies $MX = \underline{0}$.
Lemma 3.2

If $P_i = I_{n_i} - X_i(X_i'X_i)^{-1}X_i'$, where $X_i$ ($n_i$ by $r_i$) has rank $r_i$ (i=1,2), so that $P_i$ has rank $\nu_i = n_i - r_i$, with $\nu_1 \ge \nu_2$, and $B$ ($n_1$ by $n_2$) has rank $q \in \{1,2,\ldots,\nu_2\}$,

Then there exist $A_1$ ($\nu_1$ by $n_1$) and $A_2$ ($\nu_2$ by $n_2$) such that $A_i'A_i = P_i$, $A_iA_i' = I_{\nu_i}$, $A_iX_i = \underline{0}$, and

$$A_1\,B\,A_2' \;=\; \begin{bmatrix} D \\ 0 \end{bmatrix},$$

where $D$ ($\nu_2$ by $\nu_2$) is a diagonal matrix having rank $\le \min\{q,\nu_2\}$.

Proof of Lemma 3.2:

By Lemma 3.1 there exist $M_i$ ($\nu_i$ by $n_i$) s.t. $M_i'M_i = P_i$ and $M_iM_i' = I_{\nu_i}$. The singular value decomposition of $M_0 := M_1BM_2'$ is

$$M_0 \;=\; Q_1'\begin{bmatrix} D \\ 0\end{bmatrix} Q_2\,,$$

where $Q_i$ ($\nu_i$ by $\nu_i$) is orthonormal and $D$ ($\nu_2$ by $\nu_2$) is diagonal. Define $A_i = Q_iM_i$. Then

$$A_i'A_i = M_i'Q_i'Q_iM_i = M_i'M_i = P_i\,,\qquad
  A_iA_i' = Q_iM_iM_i'Q_i' = Q_iQ_i' = I_{\nu_i}\,,$$
$$A_iX_i = Q_iM_iX_i = Q_i\,\underline{0} = \underline{0}\,,\qquad
  A_1BA_2' = Q_1M_1BM_2'Q_2' = Q_1Q_1'\begin{bmatrix} D\\ 0\end{bmatrix}Q_2Q_2' = \begin{bmatrix} D\\ 0\end{bmatrix}.$$

Since $B$ has rank $q$, the rank of $D$ must be $\le \min\{q,\nu_2\}$.
Lemma 3.3

If the data $[\underline{Y}_1,\underline{Y}_2]$ follow the bivariate model of Section 3.2,

Then there exists a transformation to canonical variates

$$\begin{bmatrix}\underline{Z}_1\\ \underline{Z}_2\end{bmatrix} \;=\;
\begin{bmatrix} A_1 & 0\\ 0 & A_2\end{bmatrix}
\begin{bmatrix}\underline{Y}_1\\ \underline{Y}_2\end{bmatrix},$$

where $A_1$ and $A_2$ are the matrix functions of the design matrices given by Lemma 3.2, $D$ is diagonal with elements $[d_1,\ldots,d_{\nu_2}]$ on the diagonal, and

$$\underline{Z}_i{}'\underline{Z}_i \;=\; \underline{Y}_i{}'A_i'A_i\underline{Y}_i \;=\; \underline{Y}_i{}'P_i\underline{Y}_i \qquad (i=1,2)\,.$$
Proof of Lemma 3.3:

By Lemma 3.2 there exist $A_1$ and $A_2$ such that $A_iA_i' = I_{\nu_i}$, $A_i'A_i = P_i$, $A_iX_i = \underline{0}$, and $A_1BA_2' = [D',0']'$. The joint distribution of $\underline{Z}_1$ and $\underline{Z}_2$ is obtained immediately by applying well-known multivariate normal theory.
In the special case of $X_1 = X_2$ (the case treated in Chapter 2) it happens that $D = I_\nu$.

Lemma 3.4

If $\underline{\hat\theta} \sim N_2(\underline{\theta},\Sigma)$, where $\Sigma$ has diagonal elements $\sigma_1^2$ and $\sigma_2^2$, and $[Q_1,Q_2]$ are non-negative variates which are independent of $\underline{\hat\theta}$, and for i=1,2

$$T_i \;=\; \hat\theta_i\big/\bigl(Q_i/\nu_i\bigr)^{1/2}
\qquad\text{and}\qquad
T_i^{*} \;=\; (\hat\theta_i/\sigma_i)\big/\bigl(Q_i/\sigma_i^2\nu_i\bigr)^{1/2},$$

Then the joint distributions of $[T_1,T_2,Q_1,Q_2]$ and of $[T_1^{*},T_2^{*},Q_1,Q_2]$ are identical, and those of $[T_1,T_2]$ and $[T_1^{*},T_2^{*}]$ are identical.
Proof of Lemma 3.4:

Assume that the p.d.f. of $[T_1,T_2,Q_1,Q_2]$ exists. Then it is $f(\underline{T},\underline{Q}) = f(\underline{T}|\underline{Q})\cdot f(\underline{Q})$, where the conditional distribution of [$\underline{T}$ given $\underline{Q}$] is bivariate normal. It is only necessary to prove that $f(\underline{T}|\underline{Q}) = f(\underline{T}^{*}|\underline{Q})$. Given $\underline{Q}$, the transformation from $\underline{\hat\theta}$ to $\underline{T}$ is of the form $\underline{T} = \mathrm{diag}(a_1,a_2)\,\underline{\hat\theta}$ with $a_i = (Q_i/\nu_i)^{-1/2}$, and the transformation from $\underline{\hat\theta}$ to $\underline{T}^{*}$ is of the form $\underline{T}^{*} = \mathrm{diag}(b_1,b_2)\,\underline{\hat\theta}$ with $b_i = \sigma_i^{-1}(Q_i/\sigma_i^2\nu_i)^{-1/2} = (Q_i/\nu_i)^{-1/2}$. Since $a_i = b_i$, the transformations are identical. Integration with respect to $Q_1$ and $Q_2$ yields the distributions of $\underline{T}$ and $\underline{T}^{*}$, which it follows must be identical. The usefulness of this lemma is that it shows that $\underline{T}$ can be constructed from underlying variates which may have a simpler variance-covariance structure.
Lemma 3.5

If the conditions of Lemma 2.1 are assumed and

$$T_i \;=\; \hat\theta_i\big/\bigl[Q_i/\nu_i\bigr]^{1/2},\qquad i=1,2\,,$$

and

$$\begin{bmatrix}W_1\\ W_2\end{bmatrix} \sim N_2\!\left(\begin{bmatrix}\mu_1\\ \mu_2\end{bmatrix},
\begin{bmatrix}1 & \rho m\\ \rho m & 1\end{bmatrix}\right)
\ \text{ independent of }\
\begin{bmatrix}\underline{Z}_1\\ \underline{Z}_2\end{bmatrix} \sim N_{\nu_1+\nu_2}\!\left(\underline{0},
\begin{bmatrix}I_{\nu_1} & \rho[D,0]'\\ \rho[D,0] & I_{\nu_2}\end{bmatrix}\right),$$

and

$$T_i^{*} \;=\; W_i\big/\bigl[\underline{Z}_i{}'\underline{Z}_i/\nu_i\bigr]^{1/2},\qquad i=1,2\,,$$

Then $\underline{T}$ and $\underline{T}^{*}$ have the same bivariate distribution, and $[\underline{T},\underline{Q}]$ and $[\underline{T}^{*},\underline{Q}]$ have the same 4-variate distribution.

Proof of Lemma 3.5:

By Lemma 3.3, $\underline{Z}_i{}'\underline{Z}_i \sim Q_i/\sigma_i^2$ and $W_i \sim \hat\theta_i/\sigma_i$. Then by Lemma 3.4, $[\underline{T},\underline{Q}] \overset{d}{=} [\underline{T}^{*},\underline{Q}]$.

This lemma shows that a sample from the distribution of $\underline{T}$ can be obtained from the distribution of the $\nu_1+\nu_2+2$ variates $\underline{Z}_1$, $\underline{Z}_2$, $W_1$, and $W_2$. The parameters of the $\underline{T}$ distribution must depend only on $\rho m$, $\rho D$, $\theta_i/\sigma_i$, and $\nu_i$ (i=1,2).
Lemma 3.6

If $W_1$, $W_2$, $\underline{Z}_1$, and $\underline{Z}_2$ have the normal distributions given in Lemma 3.5, and $W_1^{*}$, $W_2^{*}$ and the elements of $\underline{Z}_1^{*}$ and $\underline{Z}_2^{*}$ are mutually independent standard-normal variates,

Then

$$\begin{bmatrix}W_1\\ W_2\end{bmatrix} \overset{d}{=}
\begin{bmatrix}1 & 0\\ \rho m & [1-\rho^2m^2]^{1/2}\end{bmatrix}
\begin{bmatrix}W_1^{*}\\ W_2^{*}\end{bmatrix} +
\begin{bmatrix}\theta_1/\sigma_1\\ \theta_2/\sigma_2\end{bmatrix}
\qquad\text{and}\qquad
\begin{bmatrix}\underline{Z}_1\\ \underline{Z}_2\end{bmatrix} \overset{d}{=}
\begin{bmatrix}I_{\nu_1} & 0\\ \rho[D,0] & [I-\rho^2D^2]^{1/2}\end{bmatrix}
\begin{bmatrix}\underline{Z}_1^{*}\\ \underline{Z}_2^{*}\end{bmatrix},$$

and thus

$$Q_1 = \underline{Z}_1{}'\underline{Z}_1 \overset{d}{=} \underline{Z}_1^{*}{}'\underline{Z}_1^{*}$$

and

$$Q_2 = \underline{Z}_2{}'\underline{Z}_2 \overset{d}{=}
\rho^2\,\underline{Z}_1^{*}{}'[D,0]'[D,0]\,\underline{Z}_1^{*}
\;+\; 2\rho\,\underline{Z}_1^{*}{}'[D,0]'[I-\rho^2D^2]^{1/2}\underline{Z}_2^{*}
\;+\; \underline{Z}_2^{*}{}'[I-\rho^2D^2]\,\underline{Z}_2^{*}\,.$$

Proof of Lemma 3.6:

Matrix multiplication yields

$$\begin{bmatrix}1 & 0\\ \rho m & [1-\rho^2m^2]^{1/2}\end{bmatrix}
\begin{bmatrix}1 & 0\\ \rho m & [1-\rho^2m^2]^{1/2}\end{bmatrix}' =
\begin{bmatrix}1 & \rho m\\ \rho m & 1\end{bmatrix}$$

and

$$\begin{bmatrix}I_{\nu_1} & 0\\ \rho[D,0] & [I-\rho^2D^2]^{1/2}\end{bmatrix}
\begin{bmatrix}I_{\nu_1} & 0\\ \rho[D,0] & [I-\rho^2D^2]^{1/2}\end{bmatrix}' =
\begin{bmatrix}I_{\nu_1} & \rho[D,0]'\\ \rho[D,0] & I_{\nu_2}\end{bmatrix}.$$

By applying well-known multivariate normal theory it is easily shown that the linear transformations of $[W_1^{*},W_2^{*}]$ and of $[\underline{Z}_1^{*}{}',\underline{Z}_2^{*}{}']$ implied in the lemma yield normally distributed variates which are distributed as $[W_1,W_2]$ and as $[\underline{Z}_1{}',\underline{Z}_2{}']$. Since identical functions of identically distributed variates are distributed identically, the remaining results of the lemma follow.
Lemma 3.7

If $\underline{Z}_1$ and $\underline{Z}_2$ have the normal distribution given in Lemma 3.5 and $Q_i = \underline{Z}_i{}'\underline{Z}_i$ (i=1,2),

Then the joint distribution of $Q_1$ and $Q_2$ has characteristic function

$$\phi(t_1,t_2) \;=\; [1-2it_1]^{-\nu_1/2}\,[1-2it_2]^{-\nu_2/2}
\prod_{j=1}^{\nu_2}\bigl[1-c\,\rho^2d_j^2\bigr]^{-1/2},$$

where $c = (2it_1)(2it_2)/[(1-2it_1)(1-2it_2)]$ and $D = \mathrm{diag}(d_1,\ldots,d_{\nu_2})$ as in Lemma 3.5.

The bivariate p.d.f. of $Q_1$ and $Q_2$ can be written (a.e.) as

$$f_Q(Q_1,Q_2) \;=\; W\!\Bigl(\tfrac{Q_1}{2};\tfrac{\nu_1}{2}\Bigr)\,W\!\Bigl(\tfrac{Q_2}{2};\tfrac{\nu_2}{2}\Bigr)
\sum_{\kappa=0}^{\infty} g_\kappa\,G_\kappa\,
L_\kappa^{(\nu_1-2)/2}\!\Bigl(\tfrac{Q_1}{2}\Bigr)\,
L_\kappa^{(\nu_2-2)/2}\!\Bigl(\tfrac{Q_2}{2}\Bigr),$$

where

$$W\!\Bigl(\tfrac{Q_i}{2};\tfrac{\nu_i}{2}\Bigr) \;=\;
\Bigl[\tfrac{Q_i}{2}\Bigr]^{\nu_i/2-1} e^{-Q_i/2}\Big/\bigl[2\,\Gamma(\nu_i/2)\bigr]
\quad\text{is the marginal p.d.f. of } Q_i\,,$$

$$L_\kappa^{a}(u) \;=\; \sum_{i=0}^{\kappa}\binom{\kappa+a}{\kappa-i}[-u]^{i}/\,i!
\quad\text{is a Laguerre polynomial,}$$

$$g_\kappa \;=\; \kappa!\,\kappa!\,\Gamma\!\Bigl(\tfrac{\nu_1}{2}\Bigr)\Gamma\!\Bigl(\tfrac{\nu_2}{2}\Bigr)\Big/
\Bigl[\Gamma\!\Bigl(\tfrac{\nu_1+2\kappa}{2}\Bigr)\Gamma\!\Bigl(\tfrac{\nu_2+2\kappa}{2}\Bigr)\Bigr],$$

and

$$G_\kappa \;=\; \sum_{j_1+j_2+\cdots+j_{\nu_2}=\kappa}
a_{j_1}(\rho d_1)\cdots a_{j_{\nu_2}}(\rho d_{\nu_2}),
\qquad a_j(p) \;=\; p^{2j}\,\Gamma(j+\tfrac12)\big/\bigl[\sqrt{\pi}\,\Gamma(j+1)\bigr].$$

An equivalent double-series form is obtained by expanding each Laguerre polynomial, giving terms in $W(Q_1/2;(\nu_1+2m)/2)\,W(Q_2/2;(\nu_2+2n)/2)$ with coefficients $\binom{\kappa}{m}\binom{\kappa}{n}(-1)^{m+n}$.
Proof of Lemma 3.7:

The results of this lemma are given by Jensen (1970). An outline proof of these results is as follows. For each $j\in\{1,2,\ldots,\nu_2\}$ the pair $[Z_{1j},Z_{2j}]$ has a standard bivariate normal distribution with correlation coefficient $\rho d_j$. For each $j\in\{\nu_2+1,\ldots,\nu_1\}$, $Z_{1j}$ is a standard normal variate. Except for the pairwise $\rho d_j$ correlations all the $Z_{ij}$'s are independent. It has been shown that the joint characteristic function of $[Z_{1j}^2/2,\,Z_{2j}^2/2]$ is

$$\phi(t_{1j},t_{2j}) \;=\; \bigl[(1-it_{1j})(1-it_{2j}) - \rho^2d_j^2(it_{1j})(it_{2j})\bigr]^{-1/2}.$$

For $j \ge \nu_2+1$ the characteristic function of $Z_{1j}^2/2$ is $\phi(t_{1j}) = [1-it_{1j}]^{-1/2}$. By independence the joint c.f. of $[\{Z_{1j}^2/2\},\{Z_{2j}^2/2\}]$ is

$$\prod_{j=1}^{\nu_2}\phi(t_{1j},t_{2j})\;\prod_{\kappa=\nu_2+1}^{\nu_1}\phi(t_{1\kappa}),$$

which can be written as a product in terms of $c_j = (it_{1j})(it_{2j})/[(1-it_{1j})(1-it_{2j})]$. Since $Q_1/2$ is a convolution of the $Z_{1j}^2/2$ (and similarly $Q_2/2$), the joint c.f. of $[Q_1,Q_2]$ is as given in the lemma.

In order to obtain an expression for the p.d.f. based on the c.f., Jensen (1970) uses the fact that

$$\bigl(1-c\,\rho^2d_j^2\bigr)^{-1/2} \;=\; \sum_{\kappa=0}^{\infty}\bigl[c\,\rho^2d_j^2\bigr]^{\kappa}\,
\Gamma(\kappa+\tfrac12)\big/\bigl[\Gamma(\kappa+1)\Gamma(\tfrac12)\bigr].$$

Collecting terms of the product in $\phi(t_1,t_2)$ gives

$$\phi(t_1,t_2) \;=\; \sum_{\kappa=0}^{\infty} G_\kappa\,
(1-it_1)^{-\nu_1/2}\bigl[it_1/(1-it_1)\bigr]^{\kappa}\,
(1-it_2)^{-\nu_2/2}\bigl[it_2/(1-it_2)\bigr]^{\kappa},$$

where $G_\kappa = \sum a_{j_1}(\rho d_1)\cdots a_{j_{\nu_2}}(\rho d_{\nu_2})$ is a sum taken over all integer partitions $j_1+j_2+\cdots+j_{\nu_2}=\kappa$. It has been shown that $(1-it)^{-g}[it/(1-it)]^{h}$ has as its inverse Fourier transform

$$L_h^{(g-1)}(x)\;W(x;g)\;\Gamma(g)\Gamma(h+1)\big/\Gamma(g+h).$$

Hence the p.d.f. of $(Q_1,Q_2)$ is as given in the lemma. This series expression for the bivariate p.d.f. uses an orthogonal system of functions (the Laguerre polynomials) which have the marginal p.d.f.s as their weight functions. Jensen (1970) shows that the series is absolutely convergent almost everywhere (a.e.).

For $\kappa=0$, $G_0 = 1$. For $\kappa=1$, $G_1 = \tfrac12\rho^2\sum_{i=1}^{\nu_2} d_i^2$.
Lemma 3.8

If

$$\begin{bmatrix}W_1\\ W_2\end{bmatrix} \sim N_2\!\left(\begin{bmatrix}\mu_1\\ \mu_2\end{bmatrix},
\begin{bmatrix}1 & \rho m\\ \rho m & 1\end{bmatrix}\right),$$

$\underline{Z}_1$ and $\underline{Z}_2$ are as in Lemma 3.5, $T_i = W_i/(Q_i/\nu_i)^{1/2}$ where $Q_i = \underline{Z}_i{}'\underline{Z}_i$, and $f_{\chi^2_p}(\cdot)$ denotes the chi-square p.d.f. with $p$ degrees of freedom,

Then, almost everywhere,

$$f_{T,Q}(\underline{T},\underline{Q}) \;=\; f_{T|Q}(\underline{T}\,|\,\underline{Q})\cdot f_Q(\underline{Q})\,,$$

where $f_Q$ is the bivariate p.d.f. of Lemma 3.7 and $f_{T|Q}$ is the bivariate normal p.d.f.

$$f_{T|Q}(\underline{T}\,|\,\underline{Q}) \;=\;
\frac{\sqrt{Q_1Q_2/(\nu_1\nu_2)}}{2\pi\,[1-\rho^2m^2]^{1/2}}\;
\exp\{-\tfrac12\,q\},$$

with

$$q \;=\; [1-\rho^2m^2]^{-1}\Bigl\{\bigl[T_1\sqrt{Q_1/\nu_1}-\mu_1\bigr]^2
-2\rho m\bigl[T_1\sqrt{Q_1/\nu_1}-\mu_1\bigr]\bigl[T_2\sqrt{Q_2/\nu_2}-\mu_2\bigr]
+\bigl[T_2\sqrt{Q_2/\nu_2}-\mu_2\bigr]^2\Bigr\}.$$

Proof of Lemma 3.8:

The conditional p.d.f. $f_{T|Q}$ is a bivariate normal p.d.f. easily computed using well-known normal distribution theory. The bivariate p.d.f. $f_Q$ is derived in Lemma 3.7.
Theorem 3.1

If the conditions of Lemma 3.8 are assumed,

Then the probability

$$P(t) \;=\; \int_{-t}^{t}\!\int_{-t}^{t} f_T(T_1,T_2)\,dT_1\,dT_2$$

can be computed as

$$P(t) \;=\; \int_0^{\infty}\!\!\int_0^{\infty}\int_{-t}^{t}\!\int_{-t}^{t}
f_{T,Q}(\underline{T},\underline{Q})\,d\underline{T}\,d\underline{Q}
\;=\; \int_0^{\infty}\!\!\int_0^{\infty} g_Q(t)\,f_Q(Q_1,Q_2)\,dQ_1\,dQ_2\,,$$

where $g_Q(t) = \int_{-t}^{t}\!\int_{-t}^{t} f_{T|Q}(\underline{T}\,|\,\underline{Q})\,d\underline{T}$ and $G_\kappa$ is defined in Lemma 3.7. The function $g_Q(t)$ can be written as a Maclaurin series such that

$$g_Q(t) \;=\; \sum_{\kappa=0}^{N} g_Q^{(2\kappa)}(0)\,\frac{t^{2\kappa}}{(2\kappa)!}\,,$$

where $g_Q^{(2\kappa)}(0)$ is a polynomial in terms of the form $Q_1^{\,r}Q_2^{\,s}$.

Proof of Theorem 3.1:

The series expression for $f_{T,Q}$ is given in Lemma 3.8 and (as stated in Lemma 3.7) is convergent a.e. By Theorem 2.2, $g_Q(t)$ can be expressed as a Maclaurin polynomial in $t$ with known coefficients.

If expressed as a Maclaurin series, P(t) involves no integrals but is algebraically involved, with the summation of many terms. As a 4-fold integral, P(t) can be evaluated by crude Monte Carlo integration or other numerical methods.
3.4  An Algorithm for Computing Power

Lemma 3.6 shows that a sample from the joint distribution of [W1,W2,Q1,Q2] can be obtained by transforming samples from the N(ν1+ν2+2)(0,I) distribution of [W1*,W2*,Z1*',Z2*']. Thus, one pseudo-random pair [T1,T2] from the distribution of [T1,T2] can be obtained from (ν1+ν2+2) independent N(0,1) values.

The probability p = Pr[T1 ∈ R1 and T2 ∈ R2], where Ri (i=1,2) is a real interval or a set of real numbers, can be estimated from a sample of [T1,T2] by counting the number of pseudo-random values which fall into the region designated by {(t1,t2): t1 ∈ R1 and t2 ∈ R2}. Let Xn be the number of pairs which fall into the region out of a sample of n pairs. Then Xn has a binomial distribution with mean np and variance np(1−p). The estimator p̂ = Xn/n has mean p and variance p(1−p)/n. Since p(1−p) ∈ [0, .25], the standard error of p̂ = Xn/n is less than or equal to .5/√n. For n = 500 the standard error is ≤ .0224. If (ν1+ν2+2) is not too large, computation of p̂ is efficient as well as straightforward.

This algorithm has the advantage of being able to compute an estimate not only of power = 1 − Pr[|t1*| ≤ t1 and |t2*| ≤ t2] but also of (α−α*) = Pr[|t1*| > t1 and |t2*| > t2]. Furthermore, the algorithm can handle similar probability calculations for one-sided tests, for which power = 1 − Pr[t1* ≤ t1 and t2* ≤ t2] and (α−α*) = Pr[t1* > t1 and t2* > t2].
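The following sketch (in modern Fortran, not the thesis program of Appendix A) illustrates the two-sided simulation just described, using the Lemma 3.6 transformation. The values of ν1, ν2, ρ, m, D and the critical values echo the messy-data example of Section 4.4.1.3; the numerator means mu1 and mu2 (= θi/σi) are set to zero here, so 1 − P̂(t) estimates the actual significance level, and nonzero means would give power.

      ! Monte Carlo simulation of P(t) = Pr[|T1| <= t1 and |T2| <= t2], Section 3.4.
      ! A minimal sketch only; parameter values echo Section 4.4.1.3.
      program simulate_bivariate_t
        implicit none
        integer, parameter :: nu1 = 38, nu2 = 21, nrep = 500
        real(8), parameter :: rho = 0.892d0, xm = 0.0684d0
        real(8), parameter :: t1crit = 2.33372d0, t2crit = 2.41385d0
        real(8), parameter :: mu1 = 0.0d0, mu2 = 0.0d0   ! theta_i/sigma_i (null case)
        real(8) :: d(nu2), z1(nu1), z2star(nu2), z2(nu2)
        real(8) :: w1star, w2star, w1, w2, q1, q2, t1, t2
        integer :: i, j, nin
        d = 0.0d0
        d(1:17) = 1.0d0                    ! D = diag[1 (17 times), 0 (4 times)]
        nin = 0
        do i = 1, nrep
          w1star = gauss()
          w2star = gauss()
          do j = 1, nu1
            z1(j) = gauss()
          end do
          do j = 1, nu2
            z2star(j) = gauss()
          end do
          ! Lemma 3.6 transformation to variates distributed as [W1,W2,Z1,Z2]
          w1 = w1star + mu1
          w2 = rho*xm*w1star + sqrt(1.0d0 - (rho*xm)**2)*w2star + mu2
          do j = 1, nu2
            z2(j) = rho*d(j)*z1(j) + sqrt(1.0d0 - (rho*d(j))**2)*z2star(j)
          end do
          q1 = dot_product(z1, z1)
          q2 = dot_product(z2, z2)
          t1 = w1 / sqrt(q1/dble(nu1))     ! T1* of Lemma 3.5
          t2 = w2 / sqrt(q2/dble(nu2))
          if (abs(t1) <= t1crit .and. abs(t2) <= t2crit) nin = nin + 1
        end do
        print *, 'P(t) estimate =', dble(nin)/dble(nrep)
      contains
        real(8) function gauss()           ! Box-Muller standard normal deviate
          real(8) :: u1, u2
          call random_number(u1)
          call random_number(u2)
          gauss = sqrt(-2.0d0*log(1.0d0-u1)) * cos(8.0d0*atan(1.0d0)*u2)
        end function gauss
      end program simulate_bivariate_t

Counting the proportion of pairs that fall outside the rectangle, rather than inside it, gives the estimate of (α−α*) described above.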
CHAPTER 4
EXAMPLES OF DATA ANALYSES WITH POWER CALCULATIONS
4.1  Summary

Data from a CO(2,4,2) experiment are analyzed according to the by-period method. In this case the linear model design matrices, X1 and X2, are identical. The hypothesis that the treatment effects are zero in both periods is tested using the Bonferroni inequality. The power of this test is computed.

Having analyzed the complete set of data, a randomly selected subset of the data is presented as a messy-data problem. The subset is analyzed according to the by-period method. In this case the design matrices, X1 and X2, have different dimensions and entries. The power of the two-sided test for treatment effects is again computed. Analysis of the messy-data subset according to the model explicated by Grizzle (1965) is discussed and the power of the recommended test for treatment effects is computed. Since many of the messy-data observations are unusable in the usual model, power is lost. It is shown that in this example the test in the by-period analysis is more powerful than the analogous test in the traditional analysis.
The particular set of data analyzed is illustrative of situations in which the traditional model is not appropriate: carryover is prominent, there is a period*treatment interaction, treatment effects in the two periods are not the same, and the two periods do not share a common variance. Other analyses of the same experiment are also possible and some were performed. For example, analysis of rate of growth (rather than cumulative growth) is very different from the analyses presented. The data analyzed in this chapter provide the best example of the need for an alternative analysis and strategy such as that of the by-period approach. Furthermore, application of the theoretical methods developed in Chapters 2 and 3 is best demonstrated for the analyses presented below.
4.2  A Bean Sprout Experiment

Details of the bean-sprout experiment involving the herbicide 2,4-D are given in the appendix. The protocol is elaborated, the data are listed, and simple descriptive statistics are presented. The data are arrayed as 44 observations on 3 variables: Group, Y1 and Y2. The unit of study is a triplet of sprouts and the analysis variables are within-triplet averages of the measurements made on sprout lengths. Y1 is a measure of length in mm at the end of period 1. Y2 is a measure of length at the end of period 2.

Figure 4.2.1 illustrates the group means plotted as growth curves. The effects of the different treatment sequences are dramatic with respect to absolute sprout length. There appears to be carryover from period 1, and treatment given in period 2 appears to have minimal effect.

The correlations between Y1 and Y2 for the four treatment groups are given below. The treatment sequences associated with the groups refer to the active reagent, "A", and the placebo, "P".

    Group   Seq.   r(Y1,Y2)   (p-value)
     1.     A,A      0.86      (.0006)
     2.     A,P      0.81      (.0028)
     3.     P,A      0.96      (.0001)
     4.     P,P      0.85      (.0009)

These coefficients indicate that sprout lengths as measured at the ends of the periods are highly correlated.
4.3  By-Period Analysis for the Complete Set of Data

The two length variables, Y1 and Y2, are measurements made at the ends of the two periods. We define $\mu_{p,\tau_1,\tau_2} = E[Y_{p,\tau_1,\tau_2,k}]$ as the mean of the observation on the kth study unit in period p; this unit received treatment sequence $(\tau_1,\tau_2)$. We assume $\mathrm{Cov}[Y_{p,\tau_1,\tau_2,k},\,Y_{p',\tau_1',\tau_2',k'}]$ is zero if $k \ne k'$, and is $\sigma_{pp'}$ if $k = k'$.
Figure 4.2.1  MEAN SPROUT LENGTHS

Means are plotted for each of the treatment groups. "A" denotes the active reagent treatment; "P" denotes the placebo treatment. [Growth-curve plot: mean sprout length in mm (vertical axis) against hours (horizontal axis) across periods 1 and 2, with one curve for each treatment sequence (A,A), (A,P), (P,A), (P,P).]
4.3.1  Period 1 Analysis

The primary parameters are defined as follows:

    μ1                          --  the overall period 1 average,
    τ1 = μ(1,P,·) − μ(1,A,·)    --  the period 1 treatment effect,
    S2(1) = μ(1,·,P) − μ(1,·,A) --  a sequence effect.

Because of the randomized treatment design, we assume all sequence effects are zero. The sequence effect S2(1) is included in the period 1 model so that the design matrices X1 and X2 will be identical and thus the case of X1 = X2 will be illustrated.

The period 1 response vector Y1 (44 by 1) has expected value X1 β1, where X1 consists of the rows

    [ 1  −1  −1 ]   (A,A)
    [ 1  −1   1 ]   (A,P)
    [ 1   1  −1 ]   (P,A)
    [ 1   1   1 ]   (P,P)

each repeated 11 times (once per sprout triplet in the group), and β1 = [μ1, τ1, S2(1)]'.

For this general linear univariate model (GLUM), the estimates of the model parameters based on N = 44 sprout triplets are:

    Parameter    Estimate    Standard Error    t-test p-value
    σ11            95.13
    √σ11            9.75
    μ1             59.77          1.47             0.000
    τ1              9.71          1.47             0.000
    S2(1)          −0.94          1.47             0.526
78
The expected val ues of length in mm at the end of period 1 are
estimated as follows:
Group Sequence Mean Length
_~_~·o
__
Estimate
Standard Error
~·
l.
A,A
11 1-T 1-$2
2.
A,P
111- l
3.
4.
(1)
49.00mm
2.55
+S (1)
2
47.12mm
2.55
P,A
\1 +1' .. S ( 1)
112
68.42mm
2.55
P,P
\11+Tl+$2
66.54mm
2.55
'l'
A
(1)
A separate analysis of the residuals indicates that the
assumptions of the model are well founded.
The normality
assumption is supported by the Kolmogorov-Smirnov goodness-of-fit
test
(p~value >
.20) as well as by a probability plot of the
residuals.
As may be seen from Figure 4.2.1 and from the p-value for
the test of Ho: Tl=O, the treatment applied clearly has a
substantial effect in period 1.
As expected, the sequence effect
(included in the model solely to illustrate the case
~1=~2)
is
not s i gnifi cant.
4.3.2  Period 2 Analysis

The primary parameters of interest are defined as follows:

    μ2    --  the overall period 2 mean,
    ρ12   --  the difference in treatment P carryover and treatment A carryover,
    τ2    --  the period 2 treatment effect.

Because of the experimental design, we assume all sequence effects are zero. The response vector Y2 (44 by 1) has mean X2 β2, where X2 consists of the rows

    [ 1  −1  −1 ]   (A,A)
    [ 1  −1   1 ]   (A,P)
    [ 1   1  −1 ]   (P,A)
    [ 1   1   1 ]   (P,P)

each repeated 11 times, and β2 = [μ2, ρ12, τ2]'. In this model X1 = X2. Also notice that τ2 is proportional to the average of the group1−group2 difference and the group3−group4 difference. Since both differences are valid comparisons, their average is also meaningful.

Model parameters (based on N = 44 triplets) are estimated as follows:

    Parameter    Estimate    Standard Error    t-test p-value
    σ22           237.61
    √σ22           15.41
    μ2             94.31          2.32             0.000
    ρ12            23.01          2.32             0.000
    τ2              0.11          2.32             0.961
The expected values of length in mm at the end of period 2 are estimated as follows:

    Group   Sequence   Mean Length          Estimate    Standard Error
     1.       A,A      μ2 − ρ12 − τ2          71.19          4.02
     2.       A,P      μ2 − ρ12 + τ2          71.41          4.02
     3.       P,A      μ2 + ρ12 − τ2         117.21          4.02
     4.       P,P      μ2 + ρ12 + τ2         117.43          4.02

A separate analysis of the residuals indicates that the model assumptions are well founded. The Kolmogorov-Smirnov normality goodness-of-fit test has a p-value greater than .20. A probability plot of the residuals also supports the normality assumption.

The treatment effect in period 2 is negligible, whereas the treatment effect in period 1 was found to be quite large. The carryover effect from period 1 into period 2 is clearly substantial. Also note the increase in the estimated variance from 95 mm^2 in period 1 to 238 mm^2 in period 2.
4.3.3  Combined-over-Periods Analysis

The period effect, defined as π = μ2 − μ1, can be estimated by π̂ = μ̂2 − μ̂1. The standard error of π̂ and a test of Ho: π = 0 vs. Ha: π ≠ 0 are both obtained by fitting the GLUM E[Y2 − Y1] = X1 β21, where π is the row 1, column 1 element of β21 (3 by 1). The computed value of π̂ is 36.54 with a standard error of 1.21. The two-sided t-test has p-value ≈ .000 and therefore the period effect is substantial.

The computational results for periods 1 and 2 and for the combined analysis can all be obtained by fitting the bivariate model

$$E[\,Y_1,Y_2\,] \;=\; X_1\,[\,\beta_1,\beta_2\,]\,,\qquad
V[\,Y_1,Y_2\,] \;=\; I_{44}\otimes\Sigma\,,$$

where σ11 = σ1², σ22 = σ2², and σ12 = σ1σ2ρ. An estimate of the covariance σ1σ2ρ is also obtained, as 134.14. Hence ρ̂ = .892. The computed R-square for the bivariate model is 0.98.
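As a quick arithmetic check, using the variance estimates reported above,

$$\hat\rho \;=\; \frac{134.14}{\sqrt{(95.13)(237.61)}} \;=\; \frac{134.14}{150.3} \;\approx\; 0.892\,.$$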
The hypothesis Ho: τ1=0 and τ2=0 vs. Ha: τ1≠0 and/or τ2≠0 can be tested using the Bonferroni inequality and the two individual t-statistics: if at least one of the individual null hypotheses is rejected at the α/2 level then the joint null hypothesis is rejected. Since the test of Ho: τ1=0 is significant at any level, the joint test is also significant (i.e., Ho is rejected).
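In symbols, the decision rule just described rejects the joint null hypothesis at nominal level α whenever

$$\max\bigl(|t_1|,|t_2|\bigr) \;>\; t(\nu;\alpha/4)\,,$$

where t(ν;α/4) is the 100(1−α/4) percentage point of the Student-t distribution with ν degrees of freedom; by the Bonferroni inequality the probability of a type I error under this rule is at most α.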
The hypothesis Ho: τ1 = τ2 vs. Ha: τ1 ≠ τ2 is not readily testable. However, the estimates τ̂1 = 9.71 ± 2(1.47) and τ̂2 = 0.11 ± 2(2.32) indicate that τ1 and τ2 are probably unequal. This means that the difference between treatments A and P in period 1 is not equal to the difference between A and P in period 2; i.e., there is a period*treatment interaction. An estimate of this interaction is τ̂2 − τ̂1 = −9.60, but the standard error is not available.

In summary, it appears that under the conditions of the experiment 2,4-D has a dramatic detrimental effect on the sprouts in period 1 (relative to the effect of placebo) which carries over into period 2 as damage sustained. Then in period 2, 2,4-D does little damage if any, regardless of the period 1 treatment. This interpretation implies that 2,4-D and time period affect the sprouts interactively; i.e., the effect of 2,4-D depends on when it is administered.

The significant carryover effect and the difference in period 1 and period 2 treatment effects (i.e., the treatment*period interaction) raise serious questions about the validity of the traditional analysis for this experiment. The increase in the estimated variance from period 1 to period 2 is another violation of the assumptions of the usual analysis. By comparison, none of the assumptions of the by-period analysis are violated.
The power of the Bonferroni test of Ho: τ1=0 and τ2=0 is equal to the probability of rejecting the null hypothesis when τ1 ≠ 0 and/or τ2 ≠ 0; i.e.,

$$\text{Power} \;=\; \Pr\bigl[\,|t_1^{*}|>t(\nu;\alpha/4)\ \text{and/or}\ |t_2^{*}|>t(\nu;\alpha/4)\,\bigr]
\;=\; 1-\Pr\bigl[\,\max_i|t_i^{*}|\le t(\nu;\alpha/4)\,\bigr]
\;=\; 1-P(t)\,,$$

where t1* and t2* are the individual t-statistics, t(ν;α/4) is the 100(1−α/4) percentage point of the Student-t distribution with ν degrees of freedom, and ν = 41. For α = .05, t(ν;α/4) = 2.3267. Other parameters of P(t) are (in the notation of Lemma 2.1)

    θ1 = c1 β1 = √N τ1,  where c1 = √N · [0, 1, 0] and N = 44,
    θ2 = c2 β2 = √N τ2,  where c2 = √N · [0, 0, 1],
    σ1² = σ11,
    σ2² = σ22,
    ρ = σ12/(σ11 σ22)^(1/2),
    ν = N − rank(X1) = 41,
    m = c1[X1'X1]⁻¹X1'X2[X2'X2]⁻¹c2' = 0, since X1 = X2 and the columns of X1 are
        orthogonal, so that c1[X1'X1]⁻¹c2' = 0.

The function P(t) depends on θi and σi only through θi/σi.
If we assume some fixed values for τ1 and τ2 and assume σ1² = 95.13, σ2² = 237.61, and ρ = .892 (as estimated), then P(t) can be computed as a function of τ1 and τ2. The figure below presents power = 1 − P(t) as a function of τ1 and τ2 in the neighborhood of τ̂1 and τ̂2.

Figure 4.3.3.1  Power as a function of τ1 and τ2. Computation by Monte Carlo simulation is based on 500 pseudo-random bivariate-t pairs. [Grid of estimated power values over τ1 (horizontal, up to 10) and τ2 (vertical); the entries range from about .06 for small |τ1| and |τ2| to 1.00 for large τ1.]
If the experiment were analyzed using the model explicated by Grizzle (1965), a test for the effect of treatment would be performed and the power of that test could be computed. The power of that test is discussed below in Section 4.4.2.

Using the Bonferroni inequality with the individual t-statistics assures that the true probability of type I error is α* < α, where α has a value chosen by the analyst. The difference (α−α*) is equal to Pr[|t1*| > t(ν;α/4) and |t2*| > t(ν;α/4)] evaluated at τ1 = 0 and τ2 = 0. Computation of (α−α*) therefore involves a probability integral of a central bivariate-t distribution. For α = .05 and parameter values as in Figure 4.3.3.1 (with τ1 = 0 = τ2), Monte Carlo simulation based on 500 pseudo-random bivariate-t pairs indicates that (α−α*) is approximately zero. This is reasonable when we consider that m = 0 (page 4.12) implies that the numerators of the two t-statistics are independent. Because the denominators are dependent, we still need the Bonferroni inequality, however.

4.4  Messy-Data Analyses

A subset of the bean-sprout data was selected at random to provide an example of analyses involving missing data and different design matrices. Figure 4.4.1 lists the data available. This set of data is analyzed by period and the test of treatment effect is compared with the analogous test in the traditional analysis with respect to power.
4.4.1  By-Period Analysis

The same linear model used in Section 4.3 applies to the analysis of the messy data except that S2(1) is omitted from the model. All of the data (listed in Figure 4.4.1) are usable.

4.4.1.1  Period 1 Analysis

The estimates of the model parameters based on the N1 = 40 available sprout triplets are as follows.

    Parameter    Estimate    Standard Error    t-test p-value
    σ11            90.39
    μ1             57.42          1.50             0.000
    τ1              9.47          1.50             0.000

For this univariate model R-square is 0.98. A separate analysis of the residuals also supports the assumptions of the model.

The expected values of length in mm at the end of period 1 for each group are estimated as follows:

    Group   Sequence   N1i   Mean Length    Estimate    Standard Error
    i=1       A,A       10    μ1 − τ1         47.95          2.12
    i=2       A,P       10    μ1 − τ1         47.95          2.12
    i=3       P,A       11    μ1 + τ1         66.89          2.12
    i=4       P,P        9    μ1 + τ1         66.89          2.12
86
e
Figure 4.4.1
Observations on Sprout Triples
Period 1
OBS
1
2
3
4
5
6
7
8
9
10
Yl
53.66
45.66
54.33
44.00
46.00
49.33
38.00
50.33
48.33
42.66
11
12
13
14
15
16
17
18
19
20
21
22
5'1.00
55.66
44.00
41.66
39.33
54.33
52.33
48.66
53.00
46.66
23
66.00
81.00
62.66
89.33
58.00
70.33
82.00
52.33
80.33
74.66
54.33
24
25
26
27
28
29
30
31
32
33
Y2
Period 2
Group
Sequence
1
A,A
1
1
1
1
1
72 .66
58.00
70.00
69.00
58.00
68.00
1
1
1
A,P
55.00
84.00
67.33
77.33
87.33
62.00
78.00
2
2
2
2
2
2
2
2
2
2
2
3
3
P,A
1
1
3
135.33
84.00
133.33
120.33
94.00
3
3
3
3
3
3
3
3
e
87
Figure 4.4.1
(conti nued)
34
35
36
37
38
39
40
41
42
43
44
69.33
74.33
66.66
77.00
58.00
53.00
68.33
61.33
38.66
85.00
120.00
103.66
82.33
143.00
129.66
4
4
4
4
4
4
4
4
4
4
4
P,P
4.4.1.2  Period 2 Analysis

Only N2 = 24 observations are available in period 2. The estimates of the model parameters are as follows.

    Parameter    Estimate    Standard Error    t-test p-value
    σ22           303.02
    μ2             90.70          3.58             0.000
    ρ12            21.06          3.57             0.000
    τ2              1.27          3.57             0.724

The expected values of length in mm at the end of period 2 are estimated as follows:

    Group   Sequence   N2i   Mean Length         Estimate    Standard Error
    i=1       A,A        6    μ2 − ρ12 − τ2        68.37          6.20
    i=2       A,P        7    μ2 − ρ12 + τ2        70.91          6.20
    i=3       P,A        5    μ2 + ρ12 − τ2       110.49          6.20
    i=4       P,P        6    μ2 + ρ12 + τ2       113.03          6.20

The R-square value for this model is 0.97. Residual analysis supports the distributional assumptions of the model.
4.4.1.3  Combined-over-Periods Analysis

The period effect π = μ2 − μ1 can be estimated by fitting the linear model for E[Y2 − Y1]. However, only N12 = 20 observations on Y2 − Y1 are available. Based on these data π̂ = 32.6 with a standard error of 1.7. The two-sided test has p-value ≈ .000 and represents a substantial period effect. Complete residual analysis supports the assumptions of this linear model.

The Bonferroni method of testing Ho: τ1=0 and τ2=0 vs. Ha: τ1≠0 and/or τ2≠0 leads to the rejection of Ho since τ̂1 is significantly different from zero at any level. With estimates τ̂1 = 9.47 ± 2(1.50) and τ̂2 = 1.27 ± 2(3.57), τ1 and τ2 are probably unequal and hence there is a period*treatment interaction. An estimate of this interaction is (τ̂2 − τ̂1) = −8.20 mm.
The power of the Bonferroni test of Ho: τ1=0 and τ2=0 was computed using the following parameter values:

    N1 = 40  = the number of observations in period 1
    ν1 = 38  = the degrees of freedom in period 1
    N2 = 24  = the number of observations in period 2
    ν2 = 21  = the degrees of freedom in period 2
    α  = .05 = the selected level of significance
    t(ν1;α/4) = 2.33372 = the 100(1−α/4) percentage point of the t(ν1) distribution
    t(ν2;α/4) = 2.41385 = (similarly)
    c1 = [0, 1]·([0, 1][X1'X1]⁻¹[0, 1]')^(−1/2) = [0, 1]·6.32,  so that θ1 = c1β1 = 6.32 τ1
    c2 = [0, 0, 1]·([0, 0, 1][X2'X2]⁻¹[0, 0, 1]')^(−1/2) = [0, 0, 1]·4.88,  so that θ2 = c2β2 = 4.88 τ2
    m  = .0684
    D (21 by 21) = diag[ 1 (1 by 17), 0 (1 by 4) ]
The figure below presents power as a function of τ1 and τ2 for the above parameter values.

Power as a function of τ1 and τ2. Computation by Monte Carlo simulation is based on 500 pseudo-random bivariate-t pairs. [Grid of estimated power values for τ1 = 1,...,10 (horizontal) and τ2 = 0,...,10 (vertical); the entries range from about .08 near (τ1,τ2) = (1,0) to 1.00 for large τ1.]

The power of the test against the alternative hypothesis for α = 0.05 is approximately 1.00 when τ1 = τ̂1 and τ2 = τ̂2, due primarily to the large first-period treatment effect.
4.4.2  Power Calculations for the Traditional Model

The messy data could be analyzed using the model explicated by Grizzle (1965). In this section, we compute the power of the treatment-effects test for that model. To do so it is only necessary to estimate σ² and Δ, which are defined below. The following model is under consideration (see Grizzle (1965)):

    Y(pijk) = μ        --  a general mean
            + ξ(ij)    --  effect of subject j within sequence i, distributed N(0, σs²),
                           with i = 2,3 and j ∈ {1,...,ni}; also assume ξ(i·) = 0
            + π(p)     --  period effect, p = 1,2, and assume π1 + π2 = 0
            + φ(k)     --  treatment effect, k = 1,2, and assume φ1 + φ2 = 0
            + λ(k)     --  carryover effect; λ(k) is zero unless p = 2
            + e(ijp)   --  random variation distributed as N(0, σe²), independent of ξ(ij).

    V[Y(pijk)] = σe² + σs² ≡ σ²,    Cov[Y(pijk), Y(p'ijk)] = σs²  for p ≠ p'.

Note that this model would utilize only the data in the (A,P) and (P,A) sequence groups, and within those groups only those experimental units with data from both periods.
92
When carryover effects (A t<. ) are included in this model, the
period contrast (TI2-TI1) is not estimable.
Grizzle (1965) recommends
that the contrasts (¢2-<Pl) and (A2-Al) be estimated and then if
Ho: A2"'Al is not rejected at a moderate level (e.g., a=.lO), the
carryover effects (A ) a re de leted from the model.
K
Unlike the by-period analysis, this model assumes that the
Kth treatment has the same effect in period 2 that it had in
period 1; i.e., period*treatrnent interaction is assumed niL
For
the messy data at hand it is obvious (from preceding analysis and
data plots) that this assumption is not true.
The non··centra 1i ty pa rame ter of the t-di s tri buti on of the
test statistic for Ho: ¢1=<P2 vs. Ha: <Plt¢2 depends on whether
Al=A2; i.e., whether there is equal carryover from the two
treatments, A and P.
It is obvious from previous analyses that
AltA2 and so the non-centrality parameter is
6
=:
UP2-epl )/(0. [1/n 2
+ 1/n 3]
~)
where ni is the number of sequence i observations (See Grizzle
(1965)). The degrees of freedom for the t-statistic are
(n 2+n 3-2)
=
9 since the number of observation pairs available
on study units which received sequence A,P is n = 6 while for
2
P,A n =: 5. Grizzle (1965) provides a method of computing the
3
amount of experirrentation necessary to obtain 0.95 power against
the alternative hypothesis, Ha: <P2-¢ltO, for a 0.05 significance
(<P 2-¢1 )/012 and uses the graph
in section 2.3 (p. 39) of Owen (1962). Estimates of 0 2 and
level.
The method defines
6* =
93
(~2-~1)
for the messy data are given by:
$$2(\hat\phi_2-\hat\phi_1) \;=\; (\bar Y_{12}-\bar Y_{22}) - (\bar Y_{13}-\bar Y_{23})$$

and

$$(n_2+n_3-2)\,\hat\sigma^2 \;=\;
\underline{Y}_{12}{}'\bigl[I-\underline{1}(\underline{1}'\underline{1})^{-1}\underline{1}'\bigr]\underline{Y}_{12}
\;+\;
\underline{Y}_{13}{}'\bigl[I-\underline{1}(\underline{1}'\underline{1})^{-1}\underline{1}'\bigr]\underline{Y}_{13}\,,$$

where $\underline{Y}_{ji}$ ($n_i$ by 1) are the observations in period j on group i, $\bar Y_{ji}$ is their mean, and $\underline{1}$ is a conformable vector of ones. Computation yields σ̂² = 96.56 and φ̂2 − φ̂1 = 4.28. The noncentrality is Δ̂ = 0.72. Notice that the estimate of φ2 − φ1 uses period 1 and period 2 data. The estimate of σ² is based only on period 1 data, as recommended by Grizzle.
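As a worked check of this value, using the estimates just given,

$$\hat\Delta \;=\; \frac{\hat\phi_2-\hat\phi_1}{\hat\sigma\,[1/n_2+1/n_3]^{1/2}}
\;=\; \frac{4.28}{\sqrt{96.56}\,\sqrt{1/6+1/5}}
\;=\; \frac{4.28}{(9.826)(0.6055)} \;\approx\; 0.72\,.$$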
The following table of approximate sample sizes needed is calculated from Owen's graph. (See Grizzle (1965).)

    σ² = 96.56 assumed

    (φ2−φ1)                          5    6    7    8    9   10   11   12   13   14   15   16   17   18   19
    n2+n3 needed for power 0.95    110   76   54   40   33   28   23   20   17   15   13   12   11   10    9

Since (n2+n3) = 11, 0.95 power is anticipated for any φ2−φ1 > 16. However, the actual estimate of φ2−φ1 is 4.28.

Power calculation based on Δ̂ is actually conservative. Δ̂ is the true non-centrality only when the assumption of constant variance across periods is valid. The traditional model assumes that σ² is common to both periods. But from previous analyses we know that the variance is much larger in period 2 than in period 1. Thus the period 1 estimate of σ² leads to an overestimate of the non-centrality, which in turn gives a "conservative" calculation of power: power is even smaller than the above table implies.
For the (n2+n3) = 11 observation pairs actually available, the power of the test can be computed as

$$\Pr[\text{Ho rejected given }\Delta\neq 0] \;=\; 1-\Pr\bigl[\,|t^{*}(\Delta)|\le t(\nu;\alpha/2)\,\bigr],$$

where t* is the test statistic with ν = 9 degrees of freedom and non-centrality parameter Δ. Computation of the probability yields the following table. (Remember: since Δ was overestimated, so is power.)

    σ² = 96.56 and n2 = 6, n3 = 5 assumed

    (φ2−φ1)                      1      4      7     10     13     16     19     22
    non-centrality parameter   0.17   0.67   1.18   1.68   2.18   2.69   3.19   3.70
    Power                      .053   .093   .184   .324   .496   .669   .811   .907
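Entries of this kind can be checked by direct simulation of the noncentral t statistic t*(Δ) = (Z + Δ)/√(χ²₉/9). The following sketch (again in Fortran, and not part of the thesis programs) estimates one such power value; the two-sided 5% critical value t(9;α/2) = 2.262 and the noncentrality 0.72 are taken as illustrative inputs.

      ! Simulation check of Pr[|t*(delta)| > t(9; alpha/2)] -- a sketch only.
      program nct_power_check
        implicit none
        integer, parameter :: nrep = 100000, nu = 9
        real(8), parameter :: tcrit = 2.262d0, delta = 0.72d0
        integer :: i, j, nrej
        real(8) :: z, chi2, tstar
        nrej = 0
        do i = 1, nrep
          z = gauss() + delta                  ! numerator: N(delta, 1)
          chi2 = 0.0d0
          do j = 1, nu                         ! denominator: chi-square with nu d.f.
            chi2 = chi2 + gauss()**2
          end do
          tstar = z / sqrt(chi2/dble(nu))
          if (abs(tstar) > tcrit) nrej = nrej + 1
        end do
        print *, 'estimated power =', dble(nrej)/dble(nrep)
      contains
        real(8) function gauss()               ! Box-Muller standard normal deviate
          real(8) :: u1, u2
          call random_number(u1)
          call random_number(u2)
          gauss = sqrt(-2.0d0*log(1.0d0-u1)) * cos(8.0d0*atan(1.0d0)*u2)
        end function gauss
      end program nct_power_check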
In the by-period analysis the power of the test was boosted by the usefulness of the Group 1 and Group 4 data. If we had given Group 1 the treatment sequence (A,P) and had given Group 4 the treatment sequence (P,A), then the same missing-value pattern would have yielded n2 = 11 and n3 = 9. Then ν = 18 and the following table would be obtained.

    σ² = 96.56 and n2 = 11, n3 = 9 assumed

    (φ2−φ1)                      1      4      7     10     13     16     19     22
    non-centrality parameter   0.23   0.91   1.58   2.26   2.94   3.62   4.30   4.98
    Power                      .054   .138   .324   .573   .795   .928   .982   .997

For (φ2−φ1) = 4.28 power is again small: power improved with the hypothetical alteration in the experimental design, but not enough to equal the power of the by-period analysis test procedure. Furthermore, power was overestimated because of the use of σ̂².

Generally, the CO(2,2,2) design is expected to decrease the amount of necessary experimentation whenever carryover effects can be omitted and σs²/σ² > 0. When σs²/σ² ≈ 0 a completely randomized design (1 period) is equally efficient. Here, even though σs²/σ² > 0, so many observations are unusable in the traditional model that substantial power is lost.

If there were no missing observations and only the (A,P) and (P,A) sequences had been applied, then we would have n2 = 22, n3 = 22, ν = 42, and Δ still approximately equal to 0.72 under the model's assumptions. It is only necessary to check the results from Owen's graph to see that 0.95 power will be obtained only if (φ2−φ1) is substantially larger than its estimate of 4.28.
Yet another analysis alternative is the use of only period 1 data (see Brown (1980)). This is the same as using only the period 1 model in the by-period analysis. This method does not allow an investigation of period*treatment interaction or of the period 2 treatment effect. We cannot say that this method has any advantage over the by-period analysis with respect to power, since (1) it is a part of the by-period analysis, and (2) a comparison of power would concern Ho: (τ1=0) vs Ha: (τ1≠0) and Ho: (τ1=0)∧(τ2=0) vs Ha: (τ1≠0)∧(τ2≠0).

4.4.3  Comparisons of Power

It is possible to examine power as a function of (τ1,τ2) in the by-period analysis and power as a function of Δ in the traditional analysis. The two analyses can be compared on the basis of the power of the "treatment effect(s)" test by assuming that Δ is in the neighborhood of 0.72 and (τ1,τ2) is in the neighborhood of (τ̂1,τ̂2) = (9.47, 1.27). For these values of τ1, τ2, and Δ, power(τ1,τ2) ≈ 1.00 and power(Δ) ≈ .10 based on the available messy data. If we hypothetically consider the case of Group 1 and Group 2 receiving sequence (A,P) and Group 3 and Group 4 receiving sequence (P,A) while the pattern of missing values remains unchanged, then more data would be usable in the traditional analysis. Then we would have power(Δ) ≈ .15 while power(τ1,τ2) ≈ 1.00 as before.

Thus for these particular data, which involve missing values and an unbalanced design, the power of the "treatment effects" test in the by-period analysis is greater than the power of the analogous test in the traditional analysis.

For the complete set of data with no values missing it is also easy to see from the results of Owen's graph that there are not nearly enough degrees of freedom to give 0.95 power against the alternative hypothesis in the traditional model. By comparison, power in the by-period analysis is well above 0.95 for this particular experiment.
CHAPTER 5
RECOMMENDATIONS FOR FURTHER RESEARCH
Several outstanding problems are readily apparent. These will require both analytic and numerical methods for further study.

In Chapter 2 the Maclaurin expansion was seen to converge for some parameter values but not others. Convergence is an unknown function of the parameters. For completeness, some usefully sharp bounds on the residual term would be desirable. This, however, is not an important problem, since the Monte Carlo integration algorithm performs as well as the Maclaurin algorithm when only a few significant digits of accuracy are required.
A larger contribution would be the development of an analytic expression for the power function, P(t), which could be translated into an efficient numerical algorithm. Another contribution to be made is the extension of the various results to the trivariate case (i.e., three-period CO designs). The trivariate distribution of the denominators of t1, t2, and t3 is not readily available, although Johnson and Kotz (1972) have suggested an approximation of the p.d.f. Extensions to the trivariate case could also be made in Chapter 3, and probably with more success than in Chapter 2.

The distribution of the test statistic max|ti| is not known explicitly. Further study could yield results on that distribution or could produce a new test statistic having superior qualities and characterization.
In Chapter 3 further study using the Monte Carlo simulation algorithm as a tool could take many forms. The algorithm is sufficiently general to make a wide variety of probability calculations involving the distribution of (t1,t2). Further work regarding the distribution theory in Chapter 3 should also provide rewards.

In Chapter 4 we have not addressed the question of how to perform and analyze four-sequence experiments with respect to the number of study units per group. How do balanced designs compare with unbalanced designs? Some of the theoretical tools developed in Chapters 2 and 3 could be used to study the relative merits of different CO(2,4,2) designs.
It should be noted that other alternative by-period models could be fitted to (possibly messy) data. For example, since the design matrices may be different, the following kind of model offers no new problems with respect to estimation, testing, power calculation, and significance-level calculation:

$$E[\,\underline{Y}_i\,] \;=\; [\,\underline{X}_i,\ \underline{Z}_i\,]\;\underline{B}_i\,,$$

where $\underline{Z}_i$ contains covariates and $\underline{B}_i$ is a conformable coefficient vector. This model gives an example of covariate adjustment: the end-of-period measurement is conditional on the experimental design and on the covariates given by $\underline{Z}_1$ and $\underline{Z}_2$. Further investigation into the usefulness of this idea is recommended.
APPENDIX

A.1  Bean-Sprout Experiment Protocol

The objective of this crossover study was to determine the effect of 2,4-dichlorophenoxyacetic acid (2,4-D) on the growth of 132 mung bean sprouts. Because of their young and fast-growing tissues, sprouts have been given an important role in biological and biomedical research. 2,4-D is a hormone-like reagent which can dramatically influence growth in young plant tissues. Like many other growth-control agents, low doses of 2,4-D have positive effects on growth while high doses are detrimental. 2,4-D is popularly used as a herbicide. In this study sprouts were exposed to a moderately detrimental dose. The response observed was the length of the root sprouted by each bean.

The study was conducted as a two-period, two-treatment crossover with four sequences. Each period was 28 hours long. The two treatments were tap-water (treatment P) as a placebo, and a 10^(-4) molar solution of 2,4-D (treatment A). In order to keep the 2,4-D in solution, the solution was .06% ethanol by volume. The solution had the same appearance as tap-water. The four treatment sequences (A,A), (A,P), (P,A), and (P,P) were applied to groups 1, 2, 3 and 4 respectively. Each group consisted of 33 sprouts which were randomly assigned.

An arbitrary collection of more than 200 dry mung beans was placed in a bowl of tap-water located within a small insulated chest.
Figure A.1.1  Tracing of the array of holes on the stainless-steel screen used for growing sprouts (actual size). [Diagram: a roughly circular array of 37 holes.]
The beans were left to soak at room temperature and in darkness for 24 hours. At the end of the 24th hour (T=24) the germinated seeds had produced tiny root sprouts approximately four mm long. At that point any bean which showed no evidence of germination was rejected as a possible study subject. The beans were then assigned at random to the four groups. The first bean picked up was assigned to group 1, the second picked up was assigned to group 2, etc., until each group had 33 germinated beans.

Each group was placed on one of four identical growing-jars. Each jar was constructed by placing a stainless-steel screen disk over the mouth of a 1-pint, wide-mouth, clear-glass Mason jar. Each screen was perforated with an array of 37 holes which were 5 mm in diameter. The beans were placed sprout-downward over the holes and the jars were filled to the brim with tap water. Thus within each group every bean could be identified by its array location. The four growing-jars with sprouted beans in place were placed together in the darkness of the room-temperature chest mentioned above. The sprouts were left to lengthen and were disturbed only to ascertain that the sprouted roots were all growing downward into the water through the holes in the screens. This assured that the roots would be as straight as possible for easy handling and measurement.

At T=45 the shedding husks were removed. At T=49 a random sample of 24 sprouts (6 per group) was selected for measurement of length.
At T=72 (3:00 pm, 15 July 1981) the water in the four jars was replaced by treatments A or P according to design. It was possible to conduct the experiment as a "double-blind" study by enlisting the help of an assistant. Period 1 lasted 28 hours (T=72 until T=100). Sprout length was measured at T=100. Since the sprouts were growing nearly one millimeter per hour, it was necessary to randomize the sequence of measurements of the sprouts. Measurement of length was made by removing the sprout from its array location and laying it on a flat surface alongside a rule. Length was accurately measured to the nearest millimeter. If the sprout was bent or curly then the sprout was laid on the flat surface in such a way as to give the maximum measurement, with the restriction that it would be straightened only by its own weight. No attempt was made to otherwise straighten crooked sprouts. When not being measured or inspected the four groups were stored in the darkness of the chest.

At T=100 the four groups were simultaneously rinsed with tap-water to thoroughly remove the period 1 treatment solutions. The four jars were then refilled with tap-water and replaced in the chest for the remainder of the wash-out time period. At T=121 period 2 was begun. Treatments were applied and the methods mentioned above were used again. At T=140 period 2 ended and measurements were made.
Throughout the study the following methods were practiced: (1) all tap-water used was at room temperature; (2) the four groups were stored together in darkness when not being observed; (3) the four groups were kept in very close proximity to one another at all times; (4) root length was defined as the length of an imaginary line segment beginning where the root emerges from the bean and ending at the root tip.

At the end of period 1 it was patently obvious which treatment each group had received. Treatment A caused the sprouts to have thick, short roots and slightly discolored and wilted leaves. (During the wash-out time period every bean produced the beginnings of a stem and two leaves.)

For analysis purposes the unit of study is taken to be a triplet of sprouts rather than the individual sprout. In each of the four treatment groups there are eleven triplets; and thus, N = 44 units of study in all. The variables to be analyzed are triplet-averages of sprout length. The data can therefore be arrayed as 44 observations on the variables Y1, Y2 and GROUP. Figure A.1.3 lists the values of this array.

Figure A.1.4 presents simple descriptive statistics for the 11 triplets in each group. Within each group the Kolmogorov-Smirnov goodness-of-fit test for normality was performed on Y1 and Y2. The p-values for these two tests were both greater than .20. It will henceforth be assumed that these variables are approximately normally distributed within each group.
Figure A.1.4

                 N     Mean    Standard     Standard    Minimum    Maximum
                                Deviation     Error
    Group 1  Y1  11    47.33      4.78        1.44       38.00      54.33
             Y2  11    69.03      7.74        2.33       58.00      84.33
    Group 2  Y1  11    48.39      5.31        1.60       39.33      55.67
             Y2  11    73.58     10.24        3.09       55.00      87.33
    Group 3  Y1  11    70.09     12.40        3.74       52.33      89.33
             Y2  11   119.36     20.24        6.10       84.00     156.00
    Group 4  Y1  11    64.88     13.15        3.96       38.67      88.00
             Y2  11   115.27     19.48        5.87       82.33     143.00
A.2 Algorithm: Power via Maclaurin Polynomials
IlpmlER 1 JOB UNCoB.S665VgSTEWART,WASTE=YEs,M~1q
DEST=UNCPHEVU,
II
PRTY==OIl
II
PAGES:::25,
II
1'==(2$000)
II
I*PW=PS
IISTEP 1 EXEC FTGCG,REGION=JOOK
IISYSIN DD l!<
C
C
c**** DECLARATION STATEMENTS •••
DOUBLE PRECISION RESULT,
RHO,
TN
RHOXM e RHOE2,
1'S2 9 TERM1,
TERM, TS2E2, EPSLNK,
5
SIHtK,
SUML
SUIiN,
6 RXXXH, TERMH. TERM!.
1
RXXXR, TERMQ$O TERMR,
8 LASTK, LASTL, LASrN,
RXXXQ1, RXXXQ2,
9
c**** FUNCTION STATEMENTS
C**** DATA STATEMENTS .q.
C**** INPUT VALUES •••
THETA1 == 23.065125
THETA2 -- THETl\1
1
2
3
4
Q
10 .. 6216750
= 8.8300622
RHO
= -.111
NU
= 133
T
= 2~ 3
Xli
:; 0.0
N1
.- 100
N2
= 5
EPSLNK = (10.0)**(-6.0)
EPSL NH = (10.0)**(-6.0)
ISUFFH = 2
SD 1
SD2
C
::
THETA1 g THETA2,
1'1'2,
Xlii,
HALFNU,
Flu
TERM2, TERM3,
EPSLNH,
SU~HI,
SUMP,
SUl~Q,
TERMJ. TERML g
TERMS, TERMK"
LASTP, LASTQ"
RXXXQ3 q RXXXQ4,
SDl a
SD2"
XME2,
RHXME2 ..
CONST u
TERli4,
SUMl,
TS1 11
TS 1 E2,
SUl'1J,
SUMR.,
SUI'1SjI
TERMN g
I.ASTH I1
LAS'l'R 4'
TERMP g
LASTJ,
LASTS ..
RXXXS,
RXXX£{
e
109
e
C**** INITIALIZE VALUES •••
P!
::: 3.1415926535898
TSl
::: THETA1/SDl
::: THETA2/SD2
TS2
XME2
::: Ul*XM
RHOE2 ::: RHO*RHO
RHOXM ::: RHO*XM
RHXME2 :: RHOXM*RHOXM
TS1E2 = TS1*TS1
TS2E2 :: TS2*TS2
TT2
== T*T*2.
TERM1 ::: T51 - RHOXM*TS2
TERM2 = T52 - RHOXM*TSl
TERM) = ldO
RHOE2
TERI14 = 1.0
RHXME2
HALFNU ::: 0.5*FLOAT(NU)
c**** PRINT THE INPUT PARAMETERS •••
PRINT 1
INPUT PARANETERS FOR P ('f;PARMS) __,
1 fORMAT (lOX,,'
PRINT 2" SD1, SD2
= ',D18.10,5X,'SD2
= '9D18.10)
2 FORMAT (10X,'501
PRINT 3, RHO, NU
= ',D18.10,5X,'NU
- ',117
~
3 POUMAT (10X,'RHO
PRINT 4, T , 1M
= ·,D18.10,5X,'XM
-- ' .. D18.10)
4 FORMAT (10X,'T
PRINT 5, THETA1, THETA2
5 PORMAT (10X,'THETAl = ·,D18.10,5X,'THETA2 - ',D18.10)
PRINT 6, Nl , N2
:: ',111
,5X,'N2
- ',117
)
6 FORMAT (10X,'Nl
PRINT 7, T51, TS2
:: ',D18.10,5X,tTS2
9"D18 .. 10)
7 FORMAT (10X,'TSl
PRINT 8, TERM1, TERM2
:::: ·,D18 .. 10,SX,'TER!-12 := ~~D18 .. 10)
8 FORMA'£ (10Xq'~'ERr11
PRINT 9 w TERM3, TERM4
'"D18.10)
9 FORMAT (10I,tTERM3 :::: ·,D18.10,5X,oTERM4
PRINT 10, EPSLNK, ISUFFK
)
°10 FORMAT (10X,'EPSLNK :::: n,D18~10.5X,tISUfFK "" fttJI11
PRINT 11, EPSLNH, ISUFFH
'eD18.10,5X.'ISDFPH - '~I 11
11 FORMAT (10X,'EPSLNU
.~
.=-.>
=
C
COMPUTE THE LOG OF TUE OVERALL CONSTANT •••
CONST = DEXP« TERM1*T51 + TERM2*TS2) I ((-2.0)*TERM4»)
caNsT = CONST*(TERH3 •• HALFNU)* (TERMQ •• l.5J/PI
CaNST = DLOG(CONST) = 2.. 0*DLGAMA(HALFNU)
PRINT 12, CONST
12 PORi1AT ('lOX,9LOG(OVERALL CONSTANT) ""
s@D18",10}
C'i<~~*
.)
110
c
S11ALL
::: DLOG (EPSLNH)
SMAI.L
13 FORMAT (10X I1 °CONVEHGE H: LOG( TERM**2)
PRI NT 13 q
c
<
SMALL~t8D18 ..
10)
c
c
C.*** COMPUTE THE NESTED SUMS •••
c
c****
BEGIN LOOP SUMH .....
L.
C
o.
5UMB
'-'
RXXXH
leUMH
=
TT2*TERM3 I
::: 0
LASTH
-:;:
(FLOAT! NU )*(TERM4.*2»
1..
DO 100 IH=1"N2
TERMH :::: LASTH*BXXXH I FLOAT (
c
c**** II.
BEGIN LOOP SUMI
2*IH*12*IH -
1)
.....
C
SU111
::: 0 ..
MAXI
:::: 2*(18 -
1)
+
1
DO 9900 III=1,MAXI
II
::: III - 1
c
c**** III..
BEGIN LOOP SUMJ
C
SUli.]
::: 0 ..
IXXXJ = 2*{IH - 1) -II
MAXJ
=:: III
IF
(II .. GE .. In)
MAXJ
TERM.] '" 1,.
DO 8900 IIJ=1 g MAIJ
IJ
8100
IIJ
~.
IXXXJ
+
1
1
GO TO 8100
(IJ • EQ .. 0)
TERMJ
= FLO AT { ( IXXXJ + 1 ~ 1.1) * (II + 1 TERMJ :::: TERMJ * LASTJ * TERM4 I FLOAT! IJ )
CONTINUE
IF
C
=::
::
r;n
j
"Ill
c****
BEGIN LOOP SUML
IV~
ee~
C
= O.
SUML
IXXXL ::: II - IJ
MAXL
::-; ( IXXXL I 2)
TERML ::: 1.
DO 1900 IIL=1,MAXL
IL
:::
ITMPQL
~
IF
7100
IIT~
-
+
1
"
IXxxt - 2*11
(IL oBQ. 0)
GO TO 7100
+
'1'ERML
:: FLOAT{
TERML
TERML
TERML
:: TERML I FLOAT( It )
:::: TERML*(-.S)*TERM4
- TERML*LASTL
(ITMPQl.
1)
*
(I'l'MPQI.t 2)
}
CON'rINUE
C
C****
BEGIN LOOP SUMN
V~
C
::: 00
SUMN
=
IXXXJ - IJ
MAXM
== ( IXXXN I 2)
+ 1
TERMN :: 1.
DO 6900 IIN==1,MAXN
IN
::: rIN - 1
IF
(IN eEQ. 0)
GO TO 6100
TERMN - FLOAT{( IXXXN + 1 - 2*IN)*{ IXXXN
TERMN :: TERHN I FLOAT (IN)
TERMN ::: TERMN * TERM4 * LASTN
6100
CONTINUE
IXXXN
C
BEGIN LOOP SUMP
c**** VI ..
+
2*(1 - IN)
.. ....
C
SUMP
:::: O.
MINP
=:
MAXi?
:; IIJ
IF
1
(RHOXM .BQ. 0.)
TERMP
::
MINP -
MAli?
10
DO 5900 IIP=MINP.MAIP
IF
::: lIP - 1
IF
1
IF'
IF
1
1
IJ
= LASTP *
) .. AND .. (Ii? .. NE ..
0
n
FLOAT( IF - IJ - 1 ) I
TERMP
(IP .. EQ .. IJ)
TERMP =: LO
«IP .. EQ .. IJ i .. AND .. (IP "NE .. 0 ))
TERMP
IF
C
({IP ,. NE.
TERMP
LASTP =:
= (
-1 .. 0 )**{ IJ )
«IP .. NE.. IJ ) .. ANDe (RHOXM .. NEe 0 .. ))
TERMP = TERMP * (RHOXM*$( IJ - IP »
FLOAT (IP)
)
112
BEGIN LOOP SUMQ •••
C*"''I'''' VII.
C
SUMQ
= O.
C***. ITMPQL :::: II - IJ - 2*IL
ITMPQN = IIXXN - 2*IN
=
INCQ
ALREADY DEFINED ABOVE
      IF ( ((TERM1 .EQ. 0.) .OR. (TERM2 .EQ. 0.))
     1     .AND. (ITMPQN .GE. 1) )   INCQ = ITMPQN
      IMAXQ  = ITMPQN + 1
      DO 4900 IIQ=1,IMAXQ,INCQ
      IQ     = IIQ - 1
      IQL    = ITMPQL + IQ
      IQN    = ITMPQN - IQ
      TERMQ  = 1.
      IF ( (IQ .EQ. 0) .OR. (IQ .EQ. ITMPQN) )       GO TO 4100
      IF ( (TERM1 .EQ. 0.) .OR. (TERM2 .EQ. 0.) )    GO TO 4100
      TERMQ  = LASTQ * FLOAT(IQN + 1) / FLOAT(IQ)
 4100 CONTINUE
C
      LASTQ  = TERMQ
      RXXXQ1 = 0.
      IF (IQL .EQ. 0)                                RXXXQ1 = 1.
      IF ((IQL .NE. 0) .AND. (TERM1 .NE. 0.))        RXXXQ1 = TERM1**IQL
      RXXXQ2 = 0.
      IF (IQN .EQ. 0)                                RXXXQ2 = 1.
      IF ((IQN .NE. 0) .AND. (TERM2 .NE. 0.))        RXXXQ2 = TERM2**IQN
      RXXXQ3 = 0.
      IF (IQL .EQ. 0)                                RXXXQ3 = 1.
      IF ((IQL .NE. 0) .AND. (TERM2 .NE. 0.))        RXXXQ3 = TERM2**IQL
      RXXXQ4 = 0.
      IF (IQN .EQ. 0)                                RXXXQ4 = 1.
      IF ((IQN .NE. 0) .AND. (TERM1 .NE. 0.))        RXXXQ4 = TERM1**IQN
C
      TERMQ  = TERMQ * (RXXXQ1*RXXXQ2 + RXXXQ3*RXXXQ4)
      IF (TERMQ .EQ. 0.)  GO TO 4200
C
C**** VIII. BEGIN LOOP SUMR ...
C
      SUMR   = 0.
      MINR   = 1
      MAXR   = IIN
      IF (RHOXM .EQ. 0.)  MINR = MAXR
      IXXXR  = II + IN + IP + IQ
      DO 3900 IIR=MINR,MAXR
      IR     = IIR - 1
      TERMR  = 1.
      IF ( (IN .NE. 0) .AND. (IR .NE. 0) )
     1   TERMR = LASTR * (-.5) * FLOAT(MAXR - IR) / FLOAT(IR)
      LASTR  = TERMR
      IXXX   = IXXXR - IR
      IF ( IXXX .NE. (2*(IXXX/2)) )  GO TO 3200
      RXXXR  = DLGAMA( FLOAT(IH) + HALFNU - .5*FLOAT(IXXX + 1) )
      RXXXR  = RXXXR + DLGAMA( HALFNU + .5*FLOAT(IXXX + 1) )
      RXXXR  = 2.*DEXP( RXXXR + CONST )
      IF ( (IN .NE. IR) .AND. (RHOXM .NE. 0.) )
     1   RXXXR = RXXXR * (RHOXM**(IN - IR))
      RXXXR  = RXXXR * ((-.5)**IN)
      TERMR  = TERMR * RXXXR
C
C**** IX. BEGIN LOOP SUMS ...
C
      SUMS   = 0.
      MAXS   = IIR
      RXXXS  = .5 * FLOAT( IXXXR - IR )
      TERMS  = 1.
      DO 2900 IIS=1,MAXS
      IS     = IIS - 1
      IF (IS .EQ. 0)  GO TO 2100
      TERMS  = ( HALFNU - .5 + RXXXS + FLOAT(IS) ) / FLOAT( IS )
      TERMS  = TERMS / ( HALFNU - .5 - RXXXS + FLOAT(IH - IS) )
      TERMS  = TERMS * LASTS * FLOAT( MAXS - IS )
 2100 CONTINUE
C
C**** X. BEGIN LOOP SUMK ...
C
      SUMK   = 1.
      MAXIK  = 0
      IF (RHO .EQ. 0.)  GO TO 1400
      LASTK  = 1.
      RXXXK  = RXXXS + FLOAT(IS)
      DO 1200 IK=1,N1
      TERMK  = HALFNU - .5 + RXXXK + FLOAT(IK)
      TERMK  = TERMK * ( HALFNU - 1.5 - RXXXK + FLOAT(IH + IK) )
      TERMK  = TERMK * RHOE2 / FLOAT(IK)
      TERMK  = TERMK / ( HALFNU + FLOAT(IK - 1) )
      TERMK  = TERMK * LASTK
      SUMK   = SUMK + TERMK
      IF (TERMK .LT. EPSLNK)  GO TO 1300
      LASTK  = TERMK
 1200 CONTINUE
      PRINT 1201, LASTK, TERMK, SUMK
 1201 FORMAT (5X,'LOOP K NOT CONVERGED: LASTK=',D18.10,
     1        ' TERMK=',D18.10,' SUMK=',D18.10)
 1300 CONTINUE
      IF (MAXIK .LT. IK)  MAXIK = IK
 1400 CONTINUE
C
C**** (X. END LOOP K ) ...
C
      SUMS   = SUMS + TERMS*SUMK
      LASTS  = TERMS
 2900 CONTINUE
C
C**** (IX. END LOOP S ) ...
C
      SUMR   = SUMR + ( TERMR * (SUMS) )
 3200 CONTINUE
 3900 CONTINUE
C
C**** (VIII. END LOOP R ) ...
C
      SUMQ   = SUMQ + TERMQ*SUMR
 4200 CONTINUE
 4900 CONTINUE
C
C**** (VII. END LOOP Q ) ...
C
      SUMP   = SUMP + TERMP*SUMQ
 5900 CONTINUE
C
C**** (VI. END LOOP P ) ...
C
      SUMN   = SUMN + TERMN*SUMP
      LASTN  = TERMN
 6900 CONTINUE
C
C**** (V. END LOOP N ) ...
C
      SUML   = SUML + TERML*SUMN
      LASTL  = TERML
 7900 CONTINUE
C
C**** (IV. END LOOP L ) ...
C
      SUMJ   = SUMJ + TERMJ*SUML
      LASTJ  = TERMJ
 8900 CONTINUE
C
C**** (III. END LOOP J ) ...
C
      SUMI   = SUMI + TERMI*SUMJ
 9900 CONTINUE
C
C**** (II. END LOOP I ) ...
C
      SUMH   = SUMH + TERMH*SUMI
      TERM   = TERMH*SUMI
      IF ( (DLOG(TERM**2)) .LT. SMALL )  ICUMH = ICUMH + 1
      IF ( (DLOG(TERM**2)) .GE. SMALL )  ICUMH = 0
      IF ( (IH .GT. 3) .AND. (ICUMH .EQ. 0) )  GO TO 194
      PRINT 193, IH, MAXIK, TERM
  193 FORMAT (9X,'IH=',I9,' MAXIK=',I9,' TERM=',D18.10)
  194 CONTINUE
      IF ( ICUMH .GE. ISTOPH )  GO TO 190
      LASTH  = TERMH
  100 CONTINUE
  190 CONTINUE
C
      PRINT 191, ICUMH, SUMH
  191 FORMAT (10X,'LOOP H ENDED: ICUMH=',I17,', SUMH=',D18.10)
      PRINT 192, IH, LASTH, TERMH
  192 FORMAT (10X,'IH=',I9,', LASTH=',D18.10,', TERMH=',D18.10)
      PRINT 195, MAXIK
  195 FORMAT (10X,'MAXIK=',I9)
C
C**** (I. END LOOP H ) ...
C
C**** COMPUTE THE END RESULT P(T) ...
      RESULT = SUMH
      PRINT 9999, RESULT
 9999 FORMAT (10X,'RESULT:  P(T) = P(T;PARMS) = ',D18.10)
C**** END THE PROGRAM ...
      END
/*
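The stopping rules in the listing above (loops K and H) truncate an infinite series as soon as the current term, or a run of successive terms, is negligibly small relative to the tolerance derived from EPSLN. The short free-form Fortran sketch below illustrates this truncation pattern on a simple convergent series; the series chosen (the Maclaurin series for exp(x)) and the names TOL and MAXTRM are assumptions made only for this illustration and are not part of the thesis program.

! Illustrative sketch only: term-by-term summation of a convergent
! series with the same style of stopping rule as loops K and H above.
! The series (Maclaurin series for exp(x)) and the names TOL and
! MAXTRM are assumptions for this example, not part of the listing.
program truncate_series
   implicit none
   double precision :: x, term, total, tol
   integer :: k, maxtrm

   x      = 0.5d0          ! argument of the series
   tol    = 1.0d-10        ! analogue of SMALL = (EPSLN/CONST)**2
   maxtrm = 100            ! analogue of the fixed loop limit N1
   term   = 1.0d0          ! the k = 0 term
   total  = term

   do k = 1, maxtrm
      term  = term * x / dble(k)      ! recursion TERM(k) = TERM(k-1)*x/k
      total = total + term
      if (term**2 .le. tol) exit      ! same test as (TERMK**2 .LE. SMALL)
   end do
   if (k .gt. maxtrm) print *, 'LOOP K NOT CONVERGED'

   print *, 'sum of series  =', total
   print *, 'exp(x) (exact) =', exp(x)
end program truncate_series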
A.3 Algorithm: Power via Monte Carlo Integration
//POWER3  JOB  UNC.B.E51.U.STEWART,WASTE=YES,M=1,
//         PRTY=0,
//         PAGES=100,
//         T=(0,30)
//STEP1   EXEC FTGCG,REGION=600K
//SYSIN   DD *
C
C
C
C**** DECLARATION STATEMENTS ...
      REAL R(40000)
      DOUBLE PRECISION RESULT, THETA1, THETA2, SD1, SD2,
     1   RHO, T, TT4, XM, XME2, RHXME2,
     2   RHOXM, RHOE2, HALFNU, TS1, PI, CONST,
     3   TS2, TERM1, TERM2, TERM3, TERM4, TS1E2,
     4   TS2E2, EPSLN, SUMJ, LASTJ, DSEED, DIFF,
     5   RTERM1, RTERM2, SMALL, TEMPQ, FCN,
     6   X1, X2, Y2, T1, Y1, T2,
     7   RSUMK, TERMK, LASTK, SERIES, EPSLNJ
C**** FUNCTION STATEMENTS ...
C**** DATA STATEMENTS ...
C**** INPUT VALUES FOR MONTE CARLO METHOD ...
      DSEED  = 123457.D0
      N      = 40000
      ICHECK = 2500
      EPSLNJ = (100.0)**(-2)
C**** INPUT PARAMETER VALUES ...
      THETA1 = 39.799497
      THETA2 = 6.63324
      SD1    = 9.7534609
      SD2    = 15.414603
      RHO    = 0.892
      NU     = 41
      T      = 2.32672
      XM     = 0.0
      N1     = 7
      EPSLN  = (10.0)**(-2)
C
C**** INITIALIZE VALUES ...
C
      PI     = 3.1415926535898
      TS1    = THETA1/SD1
      TS2    = THETA2/SD2
      XME2   = XM**2
      RHOE2  = RHO**2
      RHOXM  = RHO*XM
      RHXME2 = RHOXM**2
      TS1E2  = TS1**2
      TS2E2  = TS2**2
      TT4    = T*T*4.
      TERM1  = TS1 - RHOXM*TS2
      TERM2  = TS2 - RHOXM*TS1
      TERM3  = 1.0 - RHOE2
      TERM4  = 1.0 - RHXME2
      HALFNU = 0.5*FLOAT(NU)
      RTERM1 = (-.5) * TERM3 / ( FLOAT(NU) * TERM4 )
      RTERM2 = ( (TERM3 / FLOAT(NU)) ** (.5) ) / TERM4
C**** PRINT THE INPUT PARAMETERS ...
      PRINT 1
    1 FORMAT (10X,'   INPUT PARAMETERS FOR P(T;PARMS)   ')
      PRINT 2, SD1, SD2
    2 FORMAT (10X,'SD1    = ',D18.10,5X,'SD2    = ',D18.10)
      PRINT 3, RHO, NU
    3 FORMAT (10X,'RHO    = ',D18.10,5X,'NU     = ',I11)
      PRINT 4, T, XM
    4 FORMAT (10X,'T      = ',D18.10,5X,'XM     = ',D18.10)
      PRINT 5, THETA1, THETA2
    5 FORMAT (10X,'THETA1 = ',D18.10,5X,'THETA2 = ',D18.10)
      PRINT 6, N1, N
    6 FORMAT (10X,'N1     = ',I17,5X,'N      = ',I17)
      PRINT 7, TS1, TS2
    7 FORMAT (10X,'TS1    = ',D18.10,5X,'TS2    = ',D18.10)
      PRINT 8, TERM1, TERM2
    8 FORMAT (10X,'TERM1  = ',D18.10,5X,'TERM2  = ',D18.10)
      PRINT 9, TERM3, TERM4
    9 FORMAT (10X,'TERM3  = ',D18.10,5X,'TERM4  = ',D18.10)
      PRINT 10, RTERM1, RTERM2
   10 FORMAT (10X,'RTERM1 = ',D18.10,5X,'RTERM2 = ',D18.10)
      PRINT 11, EPSLN
   11 FORMAT (10X,'EPSLN  = ',D18.10)
C
C**** COMPUTE THE OVERALL CONSTANT ...
      CONST  = DEXP( (TERM1*TS1 + TERM2*TS2) / ((-2.0)*TERM4) )
      CONST  = CONST * (TERM3 ** (HALFNU + 1)) * (TERM4 ** (-.5))
      CONST  = CONST / ( FLOAT(NU) * PI * (2.0**(NU-1)) )
      CONST  = CONST / ( (DGAMMA(HALFNU))**2 )
      CONST  = CONST * TT4
      PRINT 12, CONST
   12 FORMAT (10X,'THE OVERALL CONSTANT IS ',D18.10)
C
      SMALL  = (EPSLN / CONST)**2
      PRINT 13, SMALL
   13 FORMAT (10X,'CONVERGENCE:  TERMK .LT. SMALL = ',D18.10)
      CALL GGUBS (DSEED,N,R)
C
      IMAX   = N1
      IF ( RHO .EQ. 0. )  IMAX = 1
      SERIES = 0.0
      JJMAX  = N/4
      LASTK  = 1.0
      DO 7000 IIK=1,IMAX
      IK     = IIK - 1
      TERMK  = 1.0
      IF (IK .EQ. 0)  GO TO 50
      TERMK  = LASTK * ( ((.5)*RHO)**2 )
      TERMK  = TERMK / ( FLOAT(IK) * ( HALFNU + FLOAT(IK-1) ) )
   50 CONTINUE
      LASTK  = TERMK
C
C**** COMPUTE THE INTEGRAL FOR X**IK ...
C
      NU2IK  = NU + 2*IK
      SUMJ   = 0.
      KOUNT  = 0
      LASTJ  = 0.
      DO 1000 JJ=1,JJMAX
      J      = 4*JJ
      Y1     = R(J-3)
      Y2     = R(J-2)
      X1     = Y1/(1.0 - Y1)
      X2     = Y2/(1.0 - Y2)
      T1     = T*(2.0*R(J-1) - 1.0)
      T2     = T*(2.0*R(J  ) - 1.0)
C
C**** COMPUTE THE INTEGRAND FCN ...
      FCN    = 0.
      TEMPQ  = (X1*T1)**2 + (X2*T2)**2 - 2.*X1*X2*T1*T2*RHOXM
      TEMPQ  = RTERM1*TEMPQ
      TEMPQ  = TEMPQ + RTERM2*( X1*T1*TERM1 + X2*T2*TERM2 )
      TEMPQ  = TEMPQ + (-.5)*( X1**2 + X2**2 )
      TEMPQ  = TEMPQ + FLOAT(NU2IK)*DLOG((X1*X2))
      TEMPQ  = TEMPQ + ( 2.0 )*DLOG(((1.+ X1)*(1.+ X2)))
      IF ( TEMPQ .LT. (-180.218) )  GO TO 75
      FCN    = DEXP( TEMPQ )
      KOUNT  = KOUNT + 1
   75 CONTINUE
      SUMJ   = SUMJ + FCN
      IF ( JJ .LT. 5000 )  GO TO 100
      IF ( (ICHECK*(JJ/ICHECK)) .NE. JJ )  GO TO 100
      DIFF   = (SUMJ/FLOAT(JJ)) - (LASTJ/FLOAT(JJ-1))
      IF ( (DIFF**2) .LE. EPSLNJ )  GO TO 2000
      LASTJ  = SUMJ
  100 CONTINUE
 1000 CONTINUE
 2000 CONTINUE
C
C**** COMPUTE FINAL RIEMANN SUM AS RSUMK ...
      RSUMK  = SUMJ / FLOAT(JJ)
C**** USE THE INTEGRAL VALUE RSUMK ...
      TERMK  = TERMK * RSUMK
      SERIES = SERIES + TERMK
      PRINT 3000, IK, TERMK, JJ, KOUNT
 3000 FORMAT (10X,'K = ',I3,' TERMK=',D18.10,' JJ=',I10,' #OK=',I10)
      IF ( (TERMK**2) .LE. SMALL )  GO TO 8000
 7000 CONTINUE
 8000 CONTINUE
C
C**** COMPUTE THE END RESULT P(T) ...
      RESULT = CONST * SERIES
      PRINT 9001, RESULT
 9001 FORMAT (10X,'THE RESULT IS P(T;PARMS) = ',D18.10)
C**** END THE PROGRAM ...
      END
/*
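The essential structure of the program above is a change of variables x = y/(1 - y), which carries the uniform deviates produced by GGUBS from (0,1) onto (0, infinity), followed by a sample mean of the transformed integrand; the term 2*log((1 + X1)*(1 + X2)) in the log-integrand is the Jacobian of that transformation. The free-form Fortran sketch below illustrates the same device on a simple two-dimensional integrand whose exact value is pi/2. The integrand, the sample size NSAMP, and the use of the intrinsic RANDOM_NUMBER in place of GGUBS are assumptions made only for this illustration, not part of the thesis program.

! Illustrative sketch only: Monte Carlo estimate of a two-dimensional
! integral over (0,inf) x (0,inf) using the same change of variables
! as the listing above, x = y/(1-y) with Jacobian (1+x)**2 per
! coordinate.  The integrand f and the sample size NSAMP are
! assumptions for this example; the exact answer here is pi/2.
program mc_integral
   implicit none
   integer, parameter :: nsamp = 100000
   double precision :: y(2), x1, x2, fcn, sumj
   integer :: j

   call random_seed()
   sumj = 0.0d0
   do j = 1, nsamp
      call random_number(y)               ! two uniform (0,1) deviates
      x1 = y(1) / (1.0d0 - y(1))          ! map (0,1) onto (0,infinity)
      x2 = y(2) / (1.0d0 - y(2))
      ! integrand times the Jacobian (1+x1)**2 * (1+x2)**2
      fcn = exp(-0.5d0*(x1**2 + x2**2)) * (1.0d0 + x1)**2 * (1.0d0 + x2)**2
      sumj = sumj + fcn
   end do
   print *, 'Monte Carlo estimate =', sumj / dble(nsamp)
   print *, 'exact value (pi/2)   =', 2.0d0*atan(1.0d0)
end program mc_integral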
A.4 Algorithm: Power and Significance Level via Monte Carlo Simulation
//POWER4  JOB  UNC.B.E51.U.STEWART,WASTE=YES,M=1,
//         PRTY=0,
//         PAGES=100,
//         T=(0,30)
/*PW=PAPER
//STEP1   EXEC FTGCG,REGION=300K
//SYSIN   DD *
C
C
C
C**** DECLARATION STATEMENTS ...
      REAL R(4200)
      INTEGER D(41)
      DOUBLE PRECISION
     1   RHO, T1MIN, T1MAX, T2MIN, T2MAX, THETA1, THETA2,
     2   RHOXM, PERCNT, TS1, TS2, SD1, SD2, XM,
     3   Z1K, Z2K, TERM4, TERM5, DSEED, DL, DLRHO,
     4   Q1, Q2, W1, W2, T1, T2
C**** FUNCTION STATEMENTS ...
C**** DATA STATEMENTS ...
      DATA D/24*1/
C**** INPUT VALUES FOR MONTE CARLO METHOD ...
      J1MAX  = 50
      J2MAX  = 10
      DSEED  = 123457.D0
      ICHECK = 5
      MINJ1  = 10
C**** INPUT PARAMETER VALUES ...
      TAU1   = 7.000
      TAU2   = 1.27000
      THETA1 = TAU1 * 6.3245554
      THETA2 = TAU2 * 4.8818199
      SD1    = 9.5231664
      SD2    = 17.407383
      RHO    = 0.892
      XM     = 0.0684005
      NU1    = 38
      NU2    = 21
      DO 111 ID=1,14
      D(ID)  = 0
  111 CONTINUE
      T1MIN  = -2.33372
      T1MAX  =  2.33372
      T2MIN  = -2.41385
      T2MAX  =  2.41385
C
C**** INITIALIZE VALUES ...
C
      PI     = 3.1415926535898
      NU     = NU1 + NU2 + 2
      TS1    = THETA1/SD1
      TS2    = THETA2/SD2
      RHOXM  = RHO*XM
      TERM4  = 1.0 - RHOXM**2
      N      = J1MAX * NU
      CALL GGNML (DSEED,N,R)
C
C**** PRINT THE INPUT PARAMETERS ...
      PRINT 1
    1 FORMAT (10X,'   INPUT PARAMETERS FOR P(T;PARMS)   ')
      PRINT 2, SD1, SD2
    2 FORMAT (10X,'SD1    = ',D18.10,5X,'SD2    = ',D18.10)
      PRINT 3, NU1, NU2
    3 FORMAT (10X,'NU1    = ',I5,5X,'NU2    = ',I5)
      PRINT 4, RHO, XM
    4 FORMAT (10X,'RHO    = ',D18.10,5X,'XM     = ',D18.10)
      PRINT 5, TAU1, TAU2
    5 FORMAT (10X,'TAU1   = ',D18.10,5X,'TAU2   = ',D18.10)
      PRINT 7, TS1, TS2
    7 FORMAT (10X,'TS1    = ',D18.10,5X,'TS2    = ',D18.10)
      PRINT 8, T1MIN, T1MAX
    8 FORMAT (10X,'T1MIN  = ',D18.10,5X,'T1MAX  = ',D18.10)
      PRINT 9, T2MIN, T2MAX
    9 FORMAT (10X,'T2MIN  = ',D18.10,5X,'T2MAX  = ',D18.10)
C
      KOUNT1 = 0
      KOBS   = 0
      DO 2000 J2=1,J2MAX
      IF (J2 .EQ. 1)  GO TO 101
      DSEED  = DSEED - 1.D0
      CALL GGNML (DSEED,N,R)
  101 CONTINUE
      DO 1000 J1=1,J1MAX
      K1MIN  = NU*(J1-1) + 1
      W1     = R(K1MIN)
      W2     = R(K1MIN + 1)
      W2     = RHOXM*W1 + DSQRT(TERM4)*W2
      W1     = W1 + TS1
      W2     = W2 + TS2
      K1MIN  = K1MIN + 2
      K1MAX  = K1MIN + NU1 - 1
      K2MIN  = K1MAX + 1
      K2MAX  = K2MIN + NU2 - 1
      Q1     = 0.0
      DO 100 K1=K1MIN,K1MAX
      Q1     = Q1 + ( R(K1) )**2
  100 CONTINUE
      Q2     = 0.0
      L1     = 1
      DO 200 K2=K2MIN,K2MAX
      Z1K    = R( K2-NU1 )
      Z2K    = R( K2 )
      DL     = D( L1 )
      DLRHO  = DL * RHO
      TERM5  = 1.0 - DLRHO**2
      Q2     = Q2 + (Z1K*DLRHO)**2 + Z2K*Z2K*TERM5
     1            + Z1K*Z2K*2.0*DLRHO*DSQRT(TERM5)
      L1     = L1 + 1
  200 CONTINUE
      T1     = W1 / DSQRT( Q1/FLOAT(NU1) )
      T2     = W2 / DSQRT( Q2/FLOAT(NU2) )
C*************
      IF ( T1 .LT. T1MIN )  GO TO 300
      IF ( T1 .GT. T1MAX )  GO TO 300
      IF ( T2 .LT. T2MIN )  GO TO 300
      IF ( T2 .GT. T2MAX )  GO TO 300
      KOUNT1 = KOUNT1 + 1
C
C     IF ((T1.GE.T1MIN) .AND. (T1.LE.T1MAX))  GO TO 350
C     IF ((T2.GE.T2MIN) .AND. (T2.LE.T2MAX))  GO TO 350
C     KOUNT1 = KOUNT1 + 1
C 350 CONTINUE
C
  300 CONTINUE
      KOBS   = KOBS + 1
 1000 CONTINUE
 2000 CONTINUE
C
      PRINT 400, J1, J2, KOBS
  400 FORMAT (10X,'J1=',I5,' J2=',I5,' NO.OBS=',I5)
      PRINT 440, KOUNT1
  440 FORMAT (10X,'NO. IN REGION=',I5)
C**** END THE PROGRAM ...
      END
/*
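Each pass of the inner loop above builds one realization of the pair (t1, t2): correlated standard normal numerators W1 and W2, sums of squares Q1 and Q2 for the denominators, and a count of how often both statistics fall in the rectangular region [T1MIN, T1MAX] x [T2MIN, T2MAX]. The free-form Fortran sketch below reproduces that structure in simplified form. Unlike the listing, it treats the two chi-square denominators as independent (the thesis program makes them dependent through the D(.) weights), and the parameter values, the Box-Muller generator, and all names are assumptions made only for this illustration.

! Illustrative sketch only: Monte Carlo estimate of
! Pr{ T1MIN <= t1 <= T1MAX  and  T2MIN <= t2 <= T2MAX } for a pair of
! t statistics with correlated numerators.  Unlike the listing, the
! chi-square denominators here are independent; the parameter values
! and names are assumptions for this example.
program mc_power
   implicit none
   integer, parameter :: nrep = 50000, nu1 = 38, nu2 = 21
   double precision, parameter :: rhoxm = 0.061d0, ts1 = 4.65d0, ts2 = 0.36d0
   double precision, parameter :: t1max = 2.33372d0, t2max = 2.41385d0
   double precision :: w1, w2, q1, q2, t1, t2, z(2)
   integer :: i, k, kount

   call random_seed()
   kount = 0
   do i = 1, nrep
      call gauss2(z)                      ! two independent N(0,1) deviates
      w1 = z(1) + ts1                     ! numerator 1
      w2 = rhoxm*z(1) + sqrt(1.0d0 - rhoxm**2)*z(2) + ts2   ! correlated numerator 2
      q1 = 0.0d0
      do k = 1, nu1
         call gauss2(z)
         q1 = q1 + z(1)**2                ! chi-square(nu1) sum of squares
      end do
      q2 = 0.0d0
      do k = 1, nu2
         call gauss2(z)
         q2 = q2 + z(1)**2                ! chi-square(nu2), independent here
      end do
      t1 = w1 / sqrt(q1/dble(nu1))
      t2 = w2 / sqrt(q2/dble(nu2))
      if (abs(t1) .le. t1max .and. abs(t2) .le. t2max) kount = kount + 1
   end do
   print *, 'estimated probability =', dble(kount)/dble(nrep)

contains
   subroutine gauss2(z)                   ! Box-Muller pair of N(0,1) deviates
      double precision, intent(out) :: z(2)
      double precision :: u(2), r, pi
      pi = 4.0d0*atan(1.0d0)
      call random_number(u)
      u(1) = max(u(1), 1.0d-12)           ! guard against log(0)
      r = sqrt(-2.0d0*log(u(1)))
      z(1) = r*cos(2.0d0*pi*u(2))
      z(2) = r*sin(2.0d0*pi*u(2))
   end subroutine gauss2
end program mc_power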
REFERENCES
Amos, D.E. (1978). Evaluation of some cumulative distribution functions by numerical quadrature, SIAM Review, 20, 778-800.

Ando, A. & Kaufman, G.M. (1965). Bayesian analysis of the independent multinormal process - neither mean nor precision known, JASA, 60, 347-358.

Bohrer, R. (1973). A multivariate t probability integral, Biometrika, 60, 647-654.

Bishop, A., Dudewicz, E.J., Juritz, J.M., & Stevens, M.A. (1978). Percentage points of a quadratic form in Student t variates, Biometrika, 65, 435-440.

Bulgren, W.G., Dykstra, R.L., & Hewett, J.E. (1974). A bivariate t distribution with applications, JASA, 69, 525-532.

Brown, B.W. (1980). The crossover experiment for clinical trials, Biometrics, 36, 69-79.

Brandt, A.E. (1938). Tests of significance in reversal or switchback trials, Iowa Agr. Exp. Sta. Res. Bull. No. 234.

Cornish, E.A. (1954). The multivariate t-distribution associated with a set of normal sample deviates, Australian Journal of Physics, 7, 531-542.

Cornish, E.A. (1966). A multiple Behrens-Fisher distribution, in Multivariate Analysis, P.R. Krishnaiah, Ed., N.Y.: Academic Press, pp. 203-207.

Chen, H.J. (1979). Percentage points of the multivariate t distribution with zero correlations, Biometrical Journal, 21, 347-360.

Cochran, W.G. et al. (1941). A double change-over design for dairy cattle feeding experiments, Journal of Dairy Science, 24, 937-951.

Dunnett, C.W. & Sobel, M. (1954). A bivariate generalization of Student's t-distribution, with tables for certain special cases, Biometrika, 41, 153-169.

Dutt, J.E. (1972). A new evaluation of the probability integral of a multivariate t distribution, I.M.S. Bull., 1, 131.

Dutt, J.E. (1975). On computing the probability integral of a general multivariate t, Biometrika, 62, 201-205.

Dutt, J.E. (1976). An integral representation technique for calculating general multivariate probabilities with an application to multivariate chi-squared, Comm. Statist. A, 5, 377-388.

Dutt, J.E., Mattes, K.D., Sommes, A.P., & Tao, L.C. (1976). An approximation to the maximum modulus of a trivariate-t, Biometrics, 32, 465-469.

Dickey, J.M. (1966). On a multivariate generalization of the Behrens-Fisher distribution (abstract), Ann. Math. Statist., 37, 763.

Dickey, J.M. (1967). Matricvariate generalizations of the multivariate t-distribution, Ann. Math. Statist., 38, 511-518.

Dickey, J.M. (1976). A new representation of Student's t as a function of independent t's with a generalization to the matrix-t, J. Multiv. Anal., 6, 343-346.

Derflinger, G. & Stappler, H. (1976). A correct test for the stepwise regression analysis, COMPSTAT, 131-138.

Freeman, H. & Kuzmack, A. (1972). Tables of multivariate t in six and more dimensions, Biometrika, 59, 217-219.

Grizzle, J.E. (1965). The two period change-over design and its use in clinical trials, Biometrics, 21, 467-480.

Ghosh, B.K. (1975). On the distribution of the difference of two t-variables, JASA, 70, 463-467.

Halperin, M. (1967). An inequality on a bivariate Student's "t" distribution, JASA, 62, 603-606.

Helms, R.H., Kempthorne, W.J., & Shen, C.D. (1980). Augmenting two-period crossover designs to reduce confounding and improve interpretation, unpublished manuscript.

Johnson, N.L. & Kotz, S. (1972). Distributions in Statistics: Continuous Multivariate Distributions, New York: John Wiley and Sons, Inc.

Jensen, D.R. (1970). The joint distribution of quadratic forms and related distributions, Austral. J. Statist., 12, 13-22.

Krishnan, M. (1972). Series representation of a bivariate singly non-central t distribution, JASA, 67, 228-231.

Kerridge, D.F. & Cook, G.W. (1976). Yet another series for the normal integral, Biometrika, 63, 401-413.

Kershner, R.P. (1979). On the Theory of Crossover Designs with Residual Effects, Ph.D. Dissertation, Cornell University.

Kershner, R.P. & Federer, W.T. (1981). Two-treatment crossover designs for estimating a variety of effects, JASA, 76, 612-619.

Miller, K.S. (1968). Some multivariate t-distributions, Ann. Math. Statist., 39, 1605-1609.

Nagarsenkar, B.N. (1975). Some distribution problems connected with multivariate t-distributions, Metron, 33, 66-74.

Steffens, F.E. (1969). A stepwise multivariate t-distribution, S. Afr. Statist. J., 3, 17-26.

Tan, W.Y. (1969). The Restricted Matric-t Distribution, Tech. Report No. 205, Department of Statistics, University of Wisconsin, Madison.

Tan, W.Y. & Wong, S.P. (1978). On approximating the central and non-central multivariate gamma distributions, Comm. Statist. B, 7, 227-242.

Williams, E.J. (1949). Experimental designs balanced for the estimation of residual effects of treatments, Australian Journal of Scientific Research A, 2, 149-168.

Wynn, H.P. (1975). Integrals for one-sided confidence bands: a general result, Biometrika, 62, 393-396.

Zellner, A. (1976). Regression models with Student-t error terms, JASA, 71, 400-405.