El-Moalem, Habib Elias (1995). "Nonparametric Methodology for Incorporation of Surrogate in Clinical Trials."

NONPARAMETRIC METHODOLOGY FOR
INCORPORATION OF SURROGATE
IN CLINICAL TRIALS
by
Habib Elias El-Moalem
Department of Biostatistics
University of North Carolina
Institute of Statistics
Mimeo Series No. 2149
July 1995
NONPARAMETRIC METHODOLOGY FOR
INCORPORATION OF SURROGATES
IN CLINICAL TRIALS
by
Habib Elias El-Moalem
A dissertation submitted to the faculty of the University of North Carolina at Chapel
Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy
in the Department of Biostatistics.
Chapel Hill
1995
HABIB ELIAS EL-MOALEM.
Nonparametric methodology for incorporation of
surrogates in clinical trials. (Under the direction of Dr. Pranab Kumar Sen.)
ABSTRACT
In clinical trials, as well as in medical research, it is often the case that the
variables of interest, known as true endpoints, are either hard to measure or are too
costly, or simply require considerable time for completion. Hence we search for endpoints that are correlated with the true endpoints and that are easier and less costly
to measure or perhaps can be measured at an earlier time. These endpoints are called
surrogate endpoints. There is a controversy over the definition of a surrogate endpoint
due to the lack of formal methodology for making inference regarding a true endpoint,
Y, when a surrogate, S, is used instead.
Prentice (1989) defined a valid surrogate as an outcome variable that would
"yield a valid test of the null hypothesis of no association between treatment and
true response". Thus, the surrogate should be informative about the primary endpoint, and fully capture the effect of treatment on the true endpoint.
A surrogate is used in this work to mean a substitute for a true response. It
differs from a surrogate as a substitute for a true covariate dealt with in measurement
error or in latent-class models. Sen (1992) considers design and analysis of a general
class of incomplete multiresponse designs (imd) that take into account use of surrogate variables. He formulates appropriate rank based procedures to test the null
hypothesis of no treatment effect in a robust manner using only one true endpoint
and one surrogate.
The purpose of this study is to generalize Sen's approach into a multi-response,
multi-surrogate setup. A vector of primary response variates is partitioned into various subsets on which measurements are obtained for different numbers of experimental
units in each subset. A vector of surrogate response variates is partitioned similarly.
The set of the validation samples consists of measurements on a subset of primary
as well as surrogate responses taken along with concomitant variates. On the other
hand, the set of surrogate samples consists of measurements on a subset of the surrogate variates along with covariates. Hence, we have an incomplete multiresponse
design scenario. A rank-based test is developed to detect differential treatment effects
on primary variates while incorporating information from surrogates also. The test is
developed in two design settings: the randomized block design, and the balanced incomplete block design. In the latter, recovery of interblock information is considered
in detail.
Acknowledgement
I thank God for giving me this wonderful opportunity to study at the hands of masters
of statistics. I thank my advisor Dr. Pranab Kumar Sen for inspiring me and for his
constant encouragement. I thank all my committee members Drs. Quade, Kupper,
Helms and Vine for their helpful comments and suggestions. Also, I am indebted
to Dr. Bahjat Qaqish for all the training he gave me during my employment as a
graduate research assistant at the Lineberger Cancer Center. The experience I got
by working on consulting projects is invaluable. Many thanks go to the staff in the
Department of Biostatistics for their technical support; in particular I thank Betty
Pounders and Betty Owens for the administrative help and Bruce Walter and Corey
McEntyre for their computing support. I also thank my fellow students, particularly
my past and present office mates: Ralph Demasi, Joseph Galanko, Antonio Pedroso,
and all the other wonderful fellow students; they have been a constant source of encouragement and given me positive feedback throughout my stay here. I am thankful
to my brother Basim and my sister Rose and my brother-in-law Basil for helping me
financially and emotionally.
Contents
1 General Introduction
  1.1 Introduction
  1.2 Literature Review
    1.2.1 Surrogate Endpoints: Uses and Abuses
    1.2.2 Prentice Criteria for Surrogacy
    1.2.3 Designs of Clinical Trials with Surrogate Endpoints
    1.2.4 Parametric and Semi-parametric Approach
          Advantages and Disadvantages of the Parametric and Semi-Parametric Approaches
    1.2.5 Nonparametric Approach
          Hypothesis Testing
          Estimation
  1.3 Synopsis of The Work Done
2 Methodology In Randomized Block Design (RBD)
  2.1 Introduction
  2.2 Randomized Block Design
    2.2.1 RBD For The Surrogate Set I
    2.2.2 RBD For The Validation Set I*
    2.2.3 RBD For The Set I°
    2.2.4 Construction of the Test Statistics
    2.2.5 Asymptotic Non-null Distribution of LO·
3 Nonparametric Intra-Block Inference
  3.1 Balanced Incomplete Block Designs
    3.1.1 BIBD for Surrogate Set I
    3.1.2 BIBD For The Validation Set I*
    3.1.3 BIBD For The Set I°
  3.2 Construction of the Test Statistic
  3.3 Asymptotic Non-null Distribution of LO·
4 Recovery of Inter-Block Information (RIBI)
  4.1 RIBI for Surrogate Set I
    4.1.1 RIBI For The Validation Set I*
    4.1.2 Construction of The Test Statistics
    4.1.3 Asymptotic Non-null Distribution of LO·
5 General Case of a Vector of Primary Variates
  5.1 Intra And Inter-Block inference
  5.2 Construction of The Test Statistic
  5.3 Asymptotic Non-Null Distribution of LO·
6 Properties of The Test Statistics And an Illustration
  6.1 Introduction
  6.2 ARE for the Complete Block Case
  6.3 Example: The Effect of Zidovudine on Survival in Patients with AIDS
  6.4 Further Research
Chapter 1
General Introduction
1.1 Introduction
In clinical trials, as well as in medical research, it is often the case that the variables
of interest, known as true endpoints, are either hard to measure, are too costly,
or simply require considerable time for completion. Hence a search for endpoints that
are correlated with the true endpoints and that are easier and less costly to measure,
or perhaps can be measured at an earlier time, results in what are called surrogate
endpoints. There is controversy over the definition of a surrogate endpoint due to the
lack of formal methodology for making inferences regarding a true endpoint, Y, when
a surrogate, S, is used instead.
Prentice (1989) has defined surrogacy and established some operational criteria
with which to choose a surrogate outcome variate in the context of clinical trials.
Basically Prentice (1989) defined a valid surrogate as an outcome variable that would
"yield a valid test of the null hypothesis of no association between treatment and
true response". This simply means that for a surrogate to be a valid substitute for
a true endpoint, the treatments being studied should affect the true endpoint only
through the surrogate. In other words "the surrogate endpoint must have precisely
the same relationship to the true endpoint under each of the treatment strategies
being compared".
Pepe (1992) considered settings where, for a random subsample of the subjects
being studied, some validation data are available to study the association between
one true endpoint and one surrogate variable. She considers a completely randomized
design (treatment wise). Let $P_\beta(Y \mid Z)$ denote the regression model relating the true
endpoint Y to the covariate Z. Pepe showed that the maximum likelihood estimate
of $\beta$ is nonrobust to misspecification of the conditional distribution of the surrogate
X given the unobserved true outcome Y and the covariate Z, $P(X \mid Y, Z)$. Moreover,
she considers as an alternative a semi-parametric model where she lays no structure
on $P(X \mid Y, Z)$. This partial likelihood approach leads to an estimate of $\beta$ which
behaves much better than the fully parametric estimate, yet remains sensitive to any
departure from the structure imposed on $P_\beta(Y \mid Z)$ which would affect the validity
and efficiency of the statistical analysis.
Sen (1994) considers design and analysis of a general class of incomplete multiresponse designs (IMD) that take into account the use of surrogate variables. The Pepe
(1992) setup is a special case of an IMD with one primary response variate and one
surrogate. The model that Sen considers here is a nonparametric model that avoids
the stringent conditions needed in a parametric or semi-parametric model. He formulates appropriate rank based procedures to test the null hypothesis of no treatment
effect in a robust manner using only one true endpoint and one surrogate. Moreover, he tackles the problem of estimating the conditional regression quantiles of the
conditional distributions (given concomitant variates) of the primary and surrogate
variables.
The purpose of this study is to generalize Sen's approach into a multiresponse,
multisurrogate setup and to study the properties of the proposed nonparametric procedures in both finite and infinite samples. Although the focus is on use of surrogate
variables for primary response variables, covariates may also need surrogates. There
has been much work published recently on the so-called measurement error models
for continuous as well as discrete covariates. For instance, see Kupper (1984), Fuller
(1987), Carroll (1989), Carroll and Ruppert (1988), Pepe and Fleming (1991), Satten
and Kupper (1993) among others.
1.2 Literature Review
1.2.1 Surrogate Endpoints: Uses and Abuses
The term surrogate comes from the Latin 'surrogare' meaning 'to substitute'.
Thus a surrogate endpoint simply means a substitute measure for some other variable.
Ellenberg and Hamilton (1989) suggest that "investigators use a surrogate when the
endpoint of interest is too difficult and/or expensive to measure routinely and when
they can define some other, more readily measurable endpoint that is well correlated
with the first to justify its use as a substitute". They present some surrogate endpoints
for survival time in cancer studies such as tumor response, time to progression, or
time to reappearance of disease. However, Piedbois, et al. (1992) warn that tumor
response should not be considered a valid surrogate endpoint for survival in patients
with advanced colorectal cancer.
Wittes, Lakatos and Probstfield (1989) define a surrogate endpoint in cardiovascular clinical trials as an "endpoint measured in lieu of some other so-called 'true'
endpoint." Ejection fraction is given as an example of a surrogate for total mortality. Hillis and Seigel (1989) define a surrogate as "an observed variable that relates
in some way to the variable of primary interest, which we cannot conveniently observe directly." In clinical trials that study ophthalmologic disorders, they examined
the use of the "status of one eye as a surrogate for the (unobservable) status of the
opposite eye in the same individual."
Machado, Gail and Ellenberg (1990) mention the use of CD4+ lymphocyte levels
as surrogates for the development of Acquired Immune Deficiency Syndrome (AIDS)
or death. However, Choi, Lagakos, Schooley and Volberding (1993) conclude that
CD4+ lymphocytes are an incomplete surrogate marker for progression to AIDS in
asymptomatic HIV infected persons taking Zidovudine. Gruttola, Wulfsohn, Fischl
and Tsiatis (1993) observe that although CD4 lymphocyte counts are associated with
improved survival in patients with AIDS and AIDS-Related Complex, "they account
for only a small portion of the survival benefit of Zidovudine". Lagakos and Hoth
(1992) echo the same conclusion. Baccheti, Moss, Andrews and Jacobson (1992)
found that CD8 counts and changes in hemoglobin and WBC during therapy could
add independent predictive power to the CD4 count and thus would produce more
valid and useful surrogates than the CD4 count alone.
The urgency for developing cures for terminal illnesses, such as cancer and AIDS,
coupled with the long period of patient follow up necessary to obtain valid data for
estimating survival time has led scientists to consider surrogate endpoints that could
be useful for several reasons. Clinical trials that use a surrogate endpoint require
a much shorter follow up time than a true endpoint. Surrogates often require less
invasive techniques of measurement than true endpoints, and hence are easier to
measure. Sometimes a disease might be rare, which makes study of the true endpoint
very difficult.
These benefits should not overshadow the possible pitfalls that abound in using
surrogate variables. Fleming (1992) writes that "one rarely can establish that surrogate endpoints are valid". DeMets, in his commentary on Fleming (1992), refers
to the reliance on surrogate markers as "the most disturbing, even threatening, issue
today in clinical trials". Prentice (1989), after generalizing his operational criteria,
mentions that "in spite of the hope for such extension I am somewhat pessimistic
concerning the potential of the surrogate endpoint concept, as it is interpreted in this
paper." Ellenberg (1991) emphasizes that there should be a "strong biological rationale" for a valid use of surrogate markers in AIDS trials, added to a strong predictive
ability of survival time at any given point in time.
The fact that treatment might affect the 'true' clinical outcome through pathways
or biological processes other than the surrogate marker raises a lot of doubt
as to the markers' validity in predicting the true outcome. In such a situation, false
positive conclusions are misleading. A recent example often cited in the literature is
provided by the Cardiac Arrhythmia Suppression Trial (CAST, 1989). Arrhythmias
were accepted as a surrogate for sudden death since it was believed that suppression
of arrhythmias would reduce sudden deaths. The FDA approved the antiarrhythmic
drugs, encainide and flecainide, and these were widely used. Later on the CAST trial
established that these drugs nearly tripled the death rate relative to placebo.
False negative conclusions might also arise when treatment has no effect on
the surrogate marker but is beneficial with respect to the true clinical endpoint.
Thus reliance on the surrogate marker in this case would deprive patients of effective
treatments. Fleming (1992) cites the Chronic Granulomatous Disease (CGD) clinical
trial. Gamma interferon led to noticeable reduction in the rate of serious recurrent
infections in CGD patients, while there was no detectable effect on either superoxide
production or bacterial killing, both of which were used as surrogates for the life-threatening infections.
Thus, Machado et al. (1990) say that understanding the mechanism of action of a
new treatment is an important a priori condition for selection of a surrogate endpoint,
and whenever such mechanisms are unclear, data on the true endpoint seems to be
the essential source of information. However, they add that "assuming that other
agents in the same class have similar mechanisms of action, it may be possible to rely
on surrogate endpoints in later studies to evaluate these agents".
1.2.2 Prentice Criteria for Surrogacy
In light of the above complications of choosing a valid surrogate, formal statistical
methods to aid in this process are urgently needed. Unfortunately, they are also
scarce. Prentice was the first to adopt criteria for valid surrogacy. He defines a
surrogate endpoint as "a response variable for which a test of the null hypothesis
of no relationship to the treatment groups under comparison is also a valid test of
the corresponding null hypothesis based on the true endpoint." The term valid here
means that both hypotheses are parallel in the sense they would either both reject
the null hypothesis or both would accept it. Note that the surrogate endpoint in this
definition depends on the treatments being compared, and cannot be assumed to be a
universal substitute for the true endpoint. In symbols, this means that if t denotes the
time of enrollment in a clinical trial, and if Y denotes a true time-to-failure endpoint,
and if we let $z = (z_1, \ldots, z_p)$ be indicators for p of the p + 1 treatments to be compared
with respect to the corresponding hazard functions $\lambda_Y(t \mid z)$, then with X(t) as the
candidate surrogate we should have
$$P(X(t) \mid z, F(t)) \equiv P(X(t) \mid F(t)) \iff \lambda_Y(t \mid z) \equiv \lambda_Y(t) \tag{1.1}$$
where F(t) consists of the failure and censoring histories prior to t for the true endpoint, and P(.) stands for the probability distribution. This means that for a surrogate to be valid, a test of the null hypothesis of no treatment effect on the surrogate
should be parallel to the test of no treatment effect on the true endpoint. In other
words, the two tests would either both reject or both accept the null hypothesis.
Prentice established two criteria to check the validity of (1.1), namely
$$\lambda_Y(t \mid X(t), z) = \lambda_Y(t \mid X(t)) \tag{1.2}$$
which states that the surrogate fully captures the effect of the treatment z on Y, and
$$\lambda_Y(t \mid X(t)) \neq \lambda_Y(t) \tag{1.3}$$
that is, the surrogate is informative about Y. In order to establish ($\Rightarrow$) in (1.1), it is
necessary in addition to condition (1.3), to restrict the class of alternatives to the null
hypothesis of no treatment effect on the surrogate endpoint to "those for which the
treatment effects on the surrogate response distribution have some impact on average
true endpoint risk".
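The two criteria can be illustrated in a much simpler setting than the hazard-based one above. The following is a minimal sketch, assuming a binary treatment z, a binary surrogate x, and a binary true response y with a known joint distribution; the function names and the numerical probabilities are hypothetical and chosen only for illustration. The joint distribution is built so that treatment acts on y only through x, and the code checks discrete analogues of (1.2) and (1.3).

```python
from itertools import product


def conditional_y(joint, given):
    """Return P(y = 1 | given) from a joint pmf over (z, x, y).

    `joint` maps (z, x, y) -> probability; `given` may fix 'z' and/or 'x'.
    """
    num = den = 0.0
    for (z, x, y), prob in joint.items():
        if "z" in given and z != given["z"]:
            continue
        if "x" in given and x != given["x"]:
            continue
        den += prob
        if y == 1:
            num += prob
    return num / den


def build_joint(p_x_given_z, p_y_given_x):
    """Toy joint pmf (assumption): z is randomized 1:1 and affects y only through x."""
    joint = {}
    for z, x, y in product((0, 1), repeat=3):
        px = p_x_given_z[z] if x == 1 else 1 - p_x_given_z[z]
        py = p_y_given_x[x] if y == 1 else 1 - p_y_given_x[x]
        joint[(z, x, y)] = 0.5 * px * py
    return joint


joint = build_joint(p_x_given_z={0: 0.3, 1: 0.7}, p_y_given_x={0: 0.2, 1: 0.6})

# Analogue of (1.2): given x, the treatment z carries no further information on y.
full_capture = all(
    abs(conditional_y(joint, {"x": x, "z": 0})
        - conditional_y(joint, {"x": x, "z": 1})) < 1e-12
    for x in (0, 1)
)

# Analogue of (1.3): x is informative about y.
informative = abs(conditional_y(joint, {"x": 1}) - conditional_y(joint, {})) > 1e-12
```

In this toy distribution both checks succeed by construction. Adding a direct dependence of y on z would make the first check fail, which corresponds to a treatment acting through a pathway other than the surrogate.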
Although time to response was the true endpoint here, Prentice (1989) considered
generalizations where the true response was a stochastic process $\{Y(t), t \ge 0\}$, where
t is the time from beginning of follow-up, and $Y(t) = \{Y_1(u), Y_2(u), \ldots; 0 \le u < t\}$.
Moreover, instead of considering comparison of two or more treatments in a clinical
trial setup, a surrogate response is sought in order to replace a true response in
"respect to its dependence on some general exposure or covariate history". Thus if
we denote the possibly time-dependent covariate by $W(t) = \{W_1(u), W_2(u), \ldots; 0 \le u < t\}$, then condition (1.2) can be extended to
$$\Pr\{Y(t) \mid X(t), W(t), F(t)\} = \Pr\{Y(t) \mid X(t), F(t)\}. \tag{1.4}$$
It is well known that in actual practice, condition (1.2) rarely is known to hold
since it is hard to verify or simply is not true. Moreover, measurement error in the
surrogate variable may invalidate criteria (1.2 or 1.4) and (1.3). It is possible also to
use a later response as a surrogate for an earlier true endpoint if the latter is hard to
measure, and the same criteria above can be employed to validate surrogacy.
1.2.3 Designs of Clinical Trials with Surrogate Endpoints
The information gathered in a clinical trial usually involves a number of responses, some of which may be more important than others, and these are what Sen
(1994) calls primary variates. It is essential that the statistical relationship between
the primary variates and the surrogate endpoints be studied before any inference is
made about the treatment effects on the primary endpoints. But then it is often
difficult and/or costly to record data on all the primary variates (as well as surrogate variables and concomitant variables) for all experimental units, thus creating
the need for simultaneous measurements of the primary and surrogate variables for
different subsets of experimental units. This is the genesis of incomplete multiresponse designs (IMD) in clinical trials, as Sen (1994) has so elaborately explained.
In situations where the primary variates can be ordered in terms of their importance
or relevance, a hierarchical design (HD) can be adopted as follows.
Assume that $Y = (Y_1, \ldots, Y_p)$ is a vector of response variates such that $Y_1$ is
the most important primary variate, and the other responses have a decreasing order of importance. Assume also that $Y_0$ is a q-vector of surrogate variates that
may be recorded for all subjects with relative ease. If $S = S_0$ denotes the set of all
experimental units, then a possible hierarchical scheme would be:
$$S = S_0 \supset S_p \supseteq \cdots \supseteq S_1, \tag{1.5}$$
where on $S_1$, all the p + q responses are recorded, on $S_2 \setminus S_1$ ($Y_2, \ldots, Y_p$ and $Y_0$)
are recorded, and so on; on $S_p \setminus S_{p-1}$, $Y_p$ and $Y_0$ are recorded, and on $S_0 \setminus S_p$ only
$Y_0$ is recorded. Here $S_p \setminus S_{p-1}$ means all elements of $S_p$ that are not in $S_{p-1}$. Data
pertaining to treatment as well as design variates are also recorded in all of the above
subsets. Now for the specific subsets,
(1.6)
efficient designs {D} like those discussed in chapter 9 of Roy, Gnanadesikan and
Srivastava (1971) may be adopted to draw statistical conclusions.
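The recording rule implied by the hierarchical scheme (1.5) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the unit labels, and the representation of the nested sets as a Python dictionary are hypothetical, and the surrogate vector is represented by the single label "Y0".

```python
def recorded_responses(nested, p):
    """Map each experimental unit to the list of responses recorded on it.

    nested: dict with keys 1..p and 0, where nested[k] is the set S_k of unit
            labels, assumed nested as S_1 <= S_2 <= ... <= S_p <= S_0.
    p:      number of primary variates.
    """
    chain = [nested[k] for k in range(1, p + 1)] + [nested[0]]
    for smaller, larger in zip(chain, chain[1:]):
        if not smaller <= larger:
            raise ValueError("sets are not nested as in the hierarchical scheme")
    plan = {}
    for unit in nested[0]:
        # Smallest k with unit in S_k; units outside S_p get the surrogate only.
        first = next((k for k in range(1, p + 1) if unit in nested[k]), None)
        primaries = [] if first is None else [f"Y{j}" for j in range(first, p + 1)]
        plan[unit] = primaries + ["Y0"]
    return plan


# Hypothetical example with p = 2 primary variates and four units.
sets = {1: {"a"}, 2: {"a", "b"}, 0: {"a", "b", "c", "d"}}
plan = recorded_responses(sets, p=2)
```

Here unit "a" lies in the innermost set and has every response recorded, unit "b" lies in the second set only and loses the most important primary variate, and units "c" and "d" contribute surrogate data alone.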
However, sometimes all the primary variates can be equally important and no
ordering may apply. Here the hierarchical design loses its appeal and other responsewise
incomplete blocking may be applied to Y. To see this, Sen (1994) formulates
a general class of IMDs in the following manner. Let $P$ denote $\{1, \ldots, p\}$ and
consider the totality of $2^p$ subsets of $P$, defined by $i_r = \{i_1, \ldots, i_r\}$, for all possible
$1 \le i_1 < \cdots < i_r \le p$ and $r = 0, 1, \ldots, p$; $i_0 = 0$. Consider a proper subset $P_0$ of $P$,
determined by clinical and other factors, such that:
$$P_0 = \{i_r : (r, i_r) \in I_0\}, \tag{1.7}$$
where $I_0$ is the set corresponding to $P_0$. Then the set of all experimental units is
partitioned into a system
$$S(P_0) = \{S_{i_r} : (r, i_r) \in P_0\}, \tag{1.8}$$
where for the subset $S_{i_r}$, the primary variate(s) $Y_{i_1}, \ldots, Y_{i_r}$ along with the surrogate
variates $Y_0$ and design and concomitant variates are recorded. For the subset $S_{i_r}$, an
appropriate design $D_{i_r}$ (for the treatment as well as design variates), can be adopted
leading to the design sets
$$D(P_0) = \{D_{i_r} : (r, i_r) \in P_0\}. \tag{1.9}$$
Thus, an incomplete multiresponse design that incorporates surrogate endpoints can
be formulated in terms of the dual design sets, responsewise design set and treatmentwise design set, namely
$$\{S(P_0), D(P_0)\}. \tag{1.10}$$
Sen (1994) stresses that the choice of optimal designs of the type (1.10) may depend
heavily on the cost and/or difficulty of measuring the responses, whereby a cost-benefit approach will be needed.
In setups where there is more than one surrogate endpoint, and data on all of
them for all experimental units cannot be recorded easily, Sen (1994) suggests a more
general class of IMD designs. Here the set $P$ will be replaced by
$$P^* = \{1, \ldots, p; 1, \ldots, q\}, \tag{1.11}$$
and $i_r$ by $(i_r, i'_s) = \{i_1, \ldots, i_r; i'_1, \ldots, i'_s\}$, for all possible $1 \le i_1 < \cdots < i_r \le p$,
$1 \le i'_1 < \cdots < i'_s \le q$; $1 \le r \le p$, $1 \le s \le q$, and include $i_0$ or $i'_0$ in this system. Also
$P_0$ is replaced by $P_0^* = \{(i_r, i'_s) : (r, s, i_r, i'_s) \in I_0\}$. Then (1.8) is extended to
$$S(P_0^*) = \{S_{(i_r, i'_s)} : (r, s, i_r, i'_s) \in I_0\}, \tag{1.12}$$
where in the subset $S_{(i_r, i'_s)}$, the primary response variates $Y_{i_1}, \ldots, Y_{i_r}$, and the surrogate
response variates $Y_{i'_1}, \ldots, Y_{i'_s}$ are to be recorded on the experimental units.
However, Sen (1994) does not foresee usage of such general designs since it requires
large cardinality of S, and a sophisticated statistical analysis that may not be appealing to the clinical scientist.
1.2.4 Parametric and Semi-parametric Approach
Prentice (1989) used his criteria to draw inferences based solely on surrogate
data. The design that he considered is a special case of an IMD with $p = 1$, $q = 1$;
$P_0$ refers to the set of no primary index and $S(P_0) = S_0$ is the entire set of experimental units where only the surrogate variate is recorded. Pepe (1992) considers settings
where some validation data on both the true endpoint Y and the surrogate variable
S are available. The design here is also a special case of an IMD with $p = 1$, $q = 1$
and $i_0 = \{0\}$, $i_1 = \{1\}$. Thus one subset contains data only on the surrogate variate,
while the other subset contains both the primary and surrogate variates (validation
sample).
Assume that $P_\beta(Y \mid Z)$ is the regression model that relates the true response
to the vector of covariates Z, and let $P_{\beta,\theta}(X \mid Y, Z)$ be the model that relates the
surrogate X to Y and Z. A validation subset V of experimental units on which Y,
X, and Z are recorded is considered so that the strength of the relationship between
X and Y could be studied. Pepe (1992) discusses two approaches in order to draw
inferences about $\beta$.
In the first approach, inference for the true outcome is based on maximum likelihood theory using a parametric model for $P(X \mid Y, Z)$. Maximum likelihood estimates
of $\beta$ and $\theta$ are based on the likelihood
$$L(\beta, \theta) = \prod_{i \in V} P_\beta(Y_i \mid Z_i) P_{\beta,\theta}(X_i \mid Y_i, Z_i) \prod_{j \in \bar{V}} P_{\beta,\theta}(X_j \mid Z_j),$$
where $\bar{V}$ indicates the nonvalidation set of observations on which only the surrogate
and concomitant variables (as well as design variates) have been measured, and where
$$P_{\beta,\theta}(X \mid Z) = \int P_\beta(y \mid Z) P_{\beta,\theta}(X \mid y, Z) \, dy.$$
Pepe (1992) demonstrates by an example that the maximum likelihood estimate of $\beta$
is very sensitive and nonrobust to misspecification of $P_{\beta,\theta}(X \mid Y, Z)$.
In the second approach, no structure on $P(X \mid Y, Z)$ is imposed while $P(Y \mid Z)$
is still parametric in nature, hence the method is termed semi-parametric. Since
$P(X \mid Y, Z)$ is nonparametric in nature it can be assumed to be independent of $\beta$.
If $Z^S$ denotes the components of Z which are thought to be informative with respect
to the association between X and Y, then an empirical estimate of $P(X \mid Y, Z^S)$ is
found using the validation sample:
$$\hat{P}(X \mid Y, Z^S) = \frac{\hat{P}(X, Y, Z^S)}{\hat{P}(Y, Z^S)},$$
where in the discrete case
$$\hat{P}(X = x, Y = y, Z^S = z^S) = \frac{1}{n^V} \sum_{i \in V} I[X_i = x, Y_i = y, Z_i^S = z^S],$$
$$\hat{P}(Y = y, Z^S = z^S) = \frac{1}{n^V} \sum_{i \in V} I[Y_i = y, Z_i^S = z^S].$$
Here $I[\cdot]$ is the indicator function and $n^V$ is the number of subjects in the validation
sample. If X, Y, or $Z^S$ is continuous, then suitable smooth kernel type estimators of
the probability densities $P(X, Y, Z^S)$ and $P(Y, Z^S)$ are used for those components.
Define $\hat{P}_\beta(X \mid Z) = \int P_\beta(y \mid Z) \hat{P}(X \mid y, Z^S) \, dy$, or the corresponding sum if Y
is discrete. The inference about $\beta$ is based on the estimated likelihood
$$\hat{L}(\beta) = \prod_{i \in V} P_\beta(Y_i \mid Z_i) \prod_{j \in \bar{V}} \hat{P}_\beta(X_j \mid Z_j),$$
which is an estimate of
$$L(\beta) = \prod_{i \in V} P_\beta(Y_i \mid Z_i) \prod_{j \in \bar{V}} P_\beta(X_j \mid Z_j) \tag{1.13}$$
had $P(X \mid Y, Z)$ been known exactly. Pepe (1992) showed that, provided the validation sample fraction $n^V/n$ has a nonzero, positive limit $\rho^V$, and the usual regularity
conditions of Cox and Hinkley (1974) hold for both $P_\beta(Y \mid Z)$ and $P_\beta(X \mid Z)$, the
maximum estimated likelihood estimate $\hat{\beta}$ converges in distribution to a normal random variable with zero mean and a variance made of two components. The first,
$I^{-1}(\beta)/n$, is the variance of the maximum likelihood estimate based on (1.13), and
the second, $K(\beta)$, is the penalty induced by estimating (1.13) which is a decreasing
function of the validation sample fraction $\rho^V$.
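For discrete X, Y, and informative covariate components, the empirical estimate used in the second approach reduces to a ratio of counts in the validation sample. The sketch below is an illustration only, not Pepe's implementation: the function names, the tuple layout of the validation data, and the toy numbers are assumptions, and the model for the true response given the covariates is passed in as a plain dictionary of probabilities.

```python
from collections import Counter


def empirical_conditional(validation):
    """Build an estimate of P(X = x | Y = y, Z^S = zs) from validation data.

    validation: list of (x, y, zs) tuples observed on the validation sample.
    Returns a function p_hat(x, y, zs). The common factor 1/n^V cancels in
    the ratio, so raw counts suffice.
    """
    joint = Counter(validation)
    marginal = Counter((y, zs) for _, y, zs in validation)

    def p_hat(x, y, zs):
        denom = marginal[(y, zs)]
        if denom == 0:
            raise ValueError("no validation observations with this (y, zs)")
        return joint[(x, y, zs)] / denom

    return p_hat


def estimated_surrogate_prob(x, zs, p_y_given_z, p_hat):
    """Sum over y of P_beta(y | Z) * p_hat(x | y, zs), for discrete Y.

    p_y_given_z: dict y -> model probability of y for this subject's Z
                 (hypothetical stand-in for the parametric model).
    """
    return sum(prob * p_hat(x, y, zs) for y, prob in p_y_given_z.items())


# Hypothetical validation sample of six subjects with a single covariate level.
validation = [(1, 1, 0), (1, 1, 0), (0, 1, 0), (0, 0, 0), (0, 0, 0), (1, 0, 0)]
p_hat = empirical_conditional(validation)
contribution = estimated_surrogate_prob(1, 0, {0: 0.5, 1: 0.5}, p_hat)
```

Each nonvalidation subject contributes one such term to the estimated likelihood, while validation subjects contribute the model probability of their observed true response directly.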
Although it is true that in large samples there is no loss in using uninformative
nonvalidation set data in the analysis of validation set only, the use of informative
surrogates does increase the efficiency over estimates based solely on true outcome
data. The estimated likelihood method is fully efficient if the surrogate is perfect,
that is if $P(Y \mid X, Z) = 1$. In other words, if X is perfect then the variance of the
maximum estimated likelihood estimate is equal to the variance of the maximum likelihood estimate based on true outcome data for all subjects and hence the surrogate
would be as informative about $\beta$ as the true endpoint. Pepe (1992) showed that when
$\beta$ is a scalar parameter, the maximum estimated likelihood estimate is more efficient
than the maximum likelihood estimate based on the validation set alone if and only
if the validation fraction $n^V/n$ is greater than
!.
Fleming (1992) discusses use of biological markers as auxiliary variables rather
than surrogates in order to strengthen the clinical efficacy analyses and to avoid
the risk of making false conclusions when surrogate endpoints are used. Three approaches are reviewed to deal with auxiliary variables: "variance reduction", "augmented scores", and "estimated likelihood". The variance reduction approach was
explored by Kosorok (1991), and the latter two approaches by Fleming (1992). In
all three approaches, two conditions are needed for efficient analyses employing auxiliary variables: in the first the auxiliary variable and the true endpoint should be
highly correlated; and in the second "one pool of patients having longer follow-up,
and another pool of patients with auxiliary information but with relatively short-term
follow-up on the clinical endpoint".
In their commentary on Fleming (1992), Farewell and Cook discuss their method,
which estimates the correlation $\rho$ between the surrogate and the true endpoint from
the asymptotic covariance matrix of Lin (1991) and uses it for appropriate weights of
the test statistics. To see this, let $V_1(t)$ and $V_2(t)$ be the standard log-rank statistics
at time t for the treatment effect on the surrogate endpoint and the true endpoint,
respectively. A global test statistic is defined as
$$R(t) = p_1(t) V_1(t) + p_2(t) V_2(t),$$
where $p_1(t)$ and $p_2(t)$ are possible data-dependent weights that may be of the form
$$p_1(t) = j_1 + \rho(t), \qquad p_2(t) = j_2 - \rho(t),$$
where $j_1$ and $j_2$ are chosen to reflect the relative weighting of the marginal test
statistics assuming zero correlation. Their simulation studies found that even with
such weights, possible discordant treatment effects hamper the validation of surrogacy.
Also, $\rho$ was found to be insensitive for assessing a surrogate candidate, whereupon
the weights simplified to a proper choice of $j_1$ and $j_2$.
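The global statistic is a weighted sum of the two marginal statistics at a fixed time point, which can be sketched as follows. The function names are hypothetical, the weights are taken as given numbers rather than derived from data, and the standardization step is an added assumption (each marginal statistic scaled to unit variance under the null hypothesis, with correlation rho between them) rather than part of the method described above.

```python
import math


def global_statistic(v1, v2, p1, p2):
    """Weighted combination R = p1 * V1 + p2 * V2 at one time point."""
    return p1 * v1 + p2 * v2


def standardized_global_statistic(v1, v2, p1, p2, rho):
    """Divide R by its null standard deviation.

    Assumes (for illustration) that V1 and V2 each have unit variance under
    the null hypothesis and correlation rho, so that
    Var(R) = p1^2 + p2^2 + 2 * p1 * p2 * rho.
    """
    variance = p1 ** 2 + p2 ** 2 + 2.0 * p1 * p2 * rho
    if variance <= 0.0:
        raise ValueError("weights and correlation give a nonpositive variance")
    return global_statistic(v1, v2, p1, p2) / math.sqrt(variance)


# Equal weights on a surrogate statistic of 1.0 and a true-endpoint statistic of 1.0.
uncorrelated = standardized_global_statistic(1.0, 1.0, 1.0, 1.0, rho=0.0)
perfectly_correlated = standardized_global_statistic(1.0, 1.0, 1.0, 1.0, rho=1.0)
```

With equal weights, the standardized value shrinks as the correlation grows, since a highly correlated surrogate adds little independent evidence to the true-endpoint statistic.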
Louis, in his commentary on Fleming (1992), postulates a parametric model that
relates a surrogate (X) to treatment (Z) and a true endpoint (Y) as follows
$$P_\theta(Z, X, Y) = P(Z) P_\theta(X \mid Z) P_\theta(Y \mid Z, X).$$
Now, condition in two different orders and use the Fisher information decomposition
in Louis (1982) to get
Equating the right-hand sides and solving for
$I_{(T \mid Z)}(\theta)$
The term in square brackets determines the cost/benefit of using a surrogate. If it
is negative, the surrogate will be more efficient than the true endpoint in drawing
inferences about $\theta$. Otherwise, if it is positive and one still insists on using the
surrogate, a larger sample will be needed.
Advantages and Disadvantages of the Parametric and Semi-Parametric Approaches
A sketch of the general theory and statistical analysis of IMDs is found in Monahan
(1961), Srivastava (1966), and Srivastava (1968). In multiresponse situations the
multivariate general linear model (MGLM) is often adopted when there are no surrogate
variables. Two assumptions in an MGLM are questionable here. The response
variables as well as the error vectors are assumed to be multivariate normal. These
are risky assumptions that cannot be verified in practice. Usually, one checks to make
sure that the marginal distributions are univariate normal. If they are found to be
skewed, appropriate transformations are made to bring about symmetry. But then to
claim that the transformed variables are multivariate normal would not be reasonable.
In addition, sometimes some of the response variables may be categorical in nature
and they would then require other treatment, for example logistic regression. In such
a case the MGLM cannot be applied for all responses. The MGLM is valuable if all
the responses are continuous and the multinormality and other assumptions are valid,
and if our interest lies in the covariance structure of the responses, or in a secondary
parameter and/or hypothesis that involve more than one dependent variable in an
essential way, the two main reasons for employing an MGLM.
Generalized linear models (GLM) of McCullagh and Nelder (1989) can be adopted
in clinical trials when only one response is analysed. But here also there are drawbacks
that arise due to non-identifiability of such GLM's, high dimension of the "asymptotic
covariance" matrix, and large sample sizes.
The use of Cox (1972) partial likelihood by Fleming (1992) should be viewed
with caution. In clinical trials, censoring (Type I, II or random) often occurs, and
with moderate to high censoring, the partial likelihood will not have enough information for a powerful test. Moreover, if there are many covariates with a scatter that
is not very localized then any departure from the proportional hazards assumption
would invalidate the inference about the covariate parameters by introducing possible
bias and nonrobust standard errors.
In summary, the parametric setup should be "viewed with caution" as Sen (1993)
elaborates primarily because "any departure from the assumed functional forms may
cause considerable damage to the validity and efficacy of statistical analysis based
on the assumed model." Although the semi-parametric models offer more flexibility
by allowing some arbitrariness of the distribution functions (for example the baseline
hazard in Cox's proportional hazards model is nonparametric in nature), yet they are
not robust to departures from the assumed models. For these reasons it seems
safer to assume a broader model that is less restrictive, and that will preserve consistency of the estimates while providing reasonably efficient test statistics. Hence the
need for "nonparametric models" which are discussed in detail in the next section.
1.2.5
Nonparametric Approach
Nonparametric formulations are more flexible but also more complex. Sen (1994)
states that the reason for this is that "any reduction of the statistical information
through only a few summaritative measures merits a much more careful consideration,
and often, a finite number of such measures may not suffice the purpose." Sen (1994)
lays down the foundation of nonparametric inference using surrogate variables by
considering a uniresponse model with a true outcome, Y, a set of covariates denoted
by Z, and a surrogate variable, X.
For the majority of experimental units, data are recorded only on $(X_i, Z_i)$, $i \in I$.
The set $I$ is called the surrogate set. On another subset of experimental units, $I^*$,
disjoint from $I$, measurements are made on $Y$, $X$, and $Z$ so as to allow assessment of
the relation between the true outcome and the surrogate. Thus, the set of all experimental units would be $S_0 = I \cup I^*$ and $S_1 = I^*$, which is a hierarchical design as
in (1.2.3). An IMD can also be considered by including a third subset, $I^0$, with data
recorded on $(Y, Z)$, but not on $X$.
Denote the conditional distribution function (d.f.) of $Y$, given $Z = z$, by $F(y \mid z)$,
and the corresponding conditional survival function (s.f.) by $\bar F(y \mid z)$ $[= 1 - F(y \mid z)]$.
Also, let the corresponding functions of $X$, given $Z = z$, be denoted by $G(x \mid z)$ and
$\bar G(x \mid z)$ respectively. Let $H(y, x \mid z)$ be the conditional joint d.f. of $(Y, X)$.
Two main goals of the analysis of an IMD are discussed by Sen (1992). The
first is when a test of the null hypothesis of no treatment effect is desired, and the
second relates to estimation of the treatment effects.
Hypothesis Testing
2.5.1.1 Surrogate Sample Analysis
For the sake of simplicity, Sen (1992) deals first with the surrogate set, I, alone,
and then extends the method to the validation set, $I^*$. The Prentice (1989) criteria
for a surrogate translate here as concordance of $\bar F(\cdot \mid z)$ and $\bar G(\cdot \mid z)$, viewed as
functions of the concomitant variate $z$. The term concordance is used in the usual
sense of concordance between two random variables, and the term concomitant variate
is used interchangeably throughout this work with the term covariates. The method
proposed makes use of nonparametric analysis of covariance (ANOCOVA) tests that
were reported in Puri and Sen (1971) and Puri and Sen (1985). For extensions of
ANOCOVA tests to survival analysis see Chapter 11 of Sen (1981).
Let the vector of concomitant variates $Z$ be expressed as $Z_i = (c_i', \xi_i')'$, $i \ge 1$,
where the $c_i$ are non-stochastic $r\,(\ge 1)$-vectors mostly relating to the design variables,
and the $\xi_i$ are stochastic covariates. Note that random assignment of treatments to
subjects ensures the independent and identically distributed nature of the $\xi_i$'s, which
is the basic assumption of ANOCOVA (in the parametric as well as the nonparametric
setups). Now, let $\Pi(\xi)$ be the marginal distribution of $\xi_i$, and let $G_i(\cdot \mid \xi)$ denote
the conditional d.f. of $X_i$ given $\xi_i = \xi$, $i \in I$.
The null hypothesis of no treatment effect is formulated in a nonparametric way as
\[
  H_0 : G_i(\cdot \mid \xi) = G(\cdot \mid \xi), \quad i \in I, \tag{1.14}
\]
where $G$ is a suitable yet unknown d.f. which is assumed to be continuous everywhere.
Let the $\xi_i$ be $p \times 1$ vectors, and define the $(p+1) \times 1$ vectors $\mathbf{X}_i = (X_i, \xi_i')'$, $i \in I$. If
the cardinality of $I$ is $n$, then the $\mathbf{X}_i$ lead to a $(p+1) \times n$ matrix. Arrange the elements
in order of magnitude within each row of this matrix, and denote the corresponding
ranks by $R_{ji}$, $j = 0, \ldots, p$; $i = 1, \ldots, n$. This $(p+1) \times n$ matrix of ranks is called the rank
collection matrix. Define the $r$-row vectors of linear rank statistics (for each $j = 0, \ldots, p$
separately) as
\[
  T_{nj} = \sum_{i \in I} (c_i - \bar c)\, a_{nj}(R_{ji}), \quad j = 0, \ldots, p, \tag{1.15}
\]
where $\bar c = (\sum_{i \in I} c_i)/n$, and the $a_{nj}(k)$, $k = 1, \ldots, n$, are suitable scores (for example
Wilcoxon scores: $a_{nj}(k) = k/(n+1)$, $k = 1, \ldots, n$). Let $V_n$ be the $(p+1) \times (p+1)$
matrix with elements $v_{njj'}$ given by
\[
  v_{njj'} = n^{-1} \sum_{i \in I} a_{nj}(R_{ji})\, a_{nj'}(R_{j'i}) - \bar a_{nj}\, \bar a_{nj'}, \tag{1.16}
\]
for $j, j' = 0, \ldots, p$, and $\bar a_{nj} = n^{-1} \sum_{k=1}^{n} a_{nj}(k)$, $j = 0, \ldots, p$. Also define
\[
  C_n = \sum_{i=1}^{n} (c_i - \bar c)(c_i - \bar c)', \tag{1.17}
\]
and assume that $\mathrm{Rank}(C_n) = r\,(\ge 1)$, and as $n$ increases $n^{-1} C_n$ converges to a
positive definite matrix $C$. Let $T_n^0 = (T_{n1}', \ldots, T_{np}')'$ be a $p \times r$ matrix, and write
$((v_{njj'}))_{j,j'=1,\ldots,p}$ as $V_{n00}$, and let $v_{n0} = (v_{n01}, \ldots, v_{n0p})$. Then proceeding as on page
365 of Sen (1981), fit a linear regression of the surrogate variate rank statistics on the
concomitant part in order to eliminate the effects of the concomitant variates. Define
the residual rank statistics-vector
\[
  T^*_{n0} = T_{n0} - v_{n0} (V_{n00})^{-1} T_n^0, \tag{1.18}
\]
and let
\[
  v^*_{n00} = v_{n00} - v_{n0} (V_{n00})^{-1} v_{n0}'. \tag{1.19}
\]
Finally, let
\[
  L_{n0} = (v^*_{n00})^{-1}\, T^*_{n0}\, C_n^{-1}\, T^{*\prime}_{n0}. \tag{1.20}
\]
Sen (1994) proposes $L_{n0}$ as a test statistic for testing the null hypothesis in
(1.14). Moreover, under (1.14), the test based on $L_{n0}$ is permutationally (conditionally) distribution-free, and hence for small $n$, one may use an exact test, whereas for large $n$, the distribution of $L_{n0}$
can be approximated by the central chi-squared distribution with $r$ degrees of freedom under $H_0$. When $r = 1$, one can perform a one-sided test, if desired, based
on $T^*_{n0}/(v^*_{n00})^{1/2}$.
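The construction in (1.15) through (1.20) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: Wilcoxon scores are used, ties are assumed absent, the residual statistic is taken to be the surrogate rank statistic minus its linear regression on the covariate rank statistics, and the quadratic form is normalized by the adjusted rank variance and by $C_n$. The function name `rank_anocova_statistic` and its argument layout are hypothetical and do not come from Sen (1994).

```python
import numpy as np


def rank_anocova_statistic(x, xi, c):
    """Covariate-adjusted rank statistic (illustrative sketch only).

    x  : (n,) surrogate responses
    xi : (n, p) stochastic covariates
    c  : (n, r) non-stochastic design vectors (e.g. a treatment indicator)
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    xi = np.asarray(xi, dtype=float).reshape(n, -1)
    c = np.asarray(c, dtype=float).reshape(n, -1)

    # Rank each variable separately (column 0 is the surrogate); no ties assumed.
    data = np.column_stack([x, xi])
    ranks = data.argsort(axis=0).argsort(axis=0) + 1
    scores = ranks / (n + 1.0)                 # Wilcoxon scores k / (n + 1)

    c_centered = c - c.mean(axis=0)
    s_centered = scores - scores.mean(axis=0)

    T = c_centered.T @ scores                  # column j holds T_nj, cf. (1.15)
    V = s_centered.T @ s_centered / n          # rank covariance matrix, cf. (1.16)
    C = c_centered.T @ c_centered              # design matrix C_n, cf. (1.17)

    # Regress the surrogate rank statistic on the covariate rank statistics.
    coef = np.linalg.solve(V[1:, 1:], V[0, 1:])
    T_star = T[:, 0] - T[:, 1:] @ coef         # residual rank statistic
    v_star = V[0, 0] - V[0, 1:] @ coef         # adjusted rank variance

    # Quadratic form in the residual statistic.
    L = float(T_star @ np.linalg.solve(C, T_star) / v_star)
    return L, T_star, v_star
```

Because only ranks enter the computation, the statistic is unchanged by any strictly increasing transformation of the surrogate. For large $n$ the value of `L` would be referred to a chi-squared distribution with $r$ degrees of freedom; for small $n$ one could instead recompute it over permutations of the rows of `c`.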
2.5.1.2 Censoring
A small adjustment is needed in the above discussion if the design incorporates censoring. In particular, if there is Type II censoring, i.e., if only data on the $k_n$ (out
of $n$) smallest ordered values of the $X_i$ variable are available, where $n^{-1} k_n$ is close
to some pre-fixed $\alpha$ $(0 < \alpha < 1)$, then as in Chapter 11 of Sen (1981), the censored
version of (1.15) should be considered by replacing $a_{n0}(R_{0i})$ in (1.15) by $b_{n0,k_n}(R_{0i})$,
where
\[
  b_{n0,k_n}(j) =
  \begin{cases}
    a_{n0}(j), & j = 1, \ldots, k_n, \\
    (n - k_n)^{-1} \sum_{l = k_n + 1}^{n} a_{n0}(l), & \text{for } j > k_n.
  \end{cases}
  \tag{1.21}
\]
The censored version of $v_{njj'}$ also has to be considered, and proceeding as in (1.18)-(1.20), one can define the statistics $T^*_{n0}(k_n)$, $v^*_{n00}(k_n)$ and $L_{n0}(k_n)$, for every $k_n$ $(1 \le k_n \le n)$. When $k_n$ is large, the exact (permutational) conditional distribution of
$L_{n0}(k_n)$ can be approximated by the chi-squared distribution with $r$ degrees of freedom
under $H_0$.
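A minimal Python sketch of Type II censored scores follows. It assumes the form commonly used for censored linear rank statistics, in which the scores of the $k_n$ smallest ranks are kept and every censored rank receives the average of the remaining scores; the function name `censored_scores` is hypothetical.

```python
import numpy as np


def censored_scores(scores, k):
    """Type II censored scores (assumed form, for illustration).

    scores : the n scores a(1), ..., a(n), ordered by rank
    k      : number of uncensored (smallest) order statistics, 1 <= k <= n
    """
    a = np.asarray(scores, dtype=float)
    n = a.size
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    b = a.copy()
    if k < n:
        # Every censored rank shares the average of the unobserved scores.
        b[k:] = a[k:].mean()
    return b
```

With this choice the total of the scores is unchanged, so the centering of the rank statistic is not affected by censoring.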
When there is Type I censoring (i.e., truncation at a prefixed timepoint $T$), then
if $k_n(T) = \sum_{i \in I} I(X_i \le T)$, where $I(\cdot)$ is the indicator function, then under $H_0$, as
$n \to \infty$,
\[
  n^{-1} k_n(T) \xrightarrow{\text{a.s.}} \alpha_T, \quad 0 < \alpha_T < 1 \tag{1.22}
\]
(see Chapter 11 of Sen (1981)). The test based on $L_{n0}(k_n(T))$ is also a conditionally
(given $k_n(T) = k_n$) distribution-free test under $H_0$ that has a large sample chi-squared
approximation as well. Moreover, if the Prentice (1989) criteria for a surrogate are
fulfilled, then to circumvent the loss in efficiency brought about by censoring, time-sequential tests (such as the progressive censoring scheme (PCS)) of Chapter 11 of
Sen (1981) could be employed. Group sequential testing (GST) in the context of
repeated significance tests could be adopted also. However, if the Prentice (1989)
criteria cannot be justified then validation sample analysis will be urgently needed if
valid inference about differential treatment effects on the primary endpoint is desired.
2.5.1.3 Validation Sample Analysis
Let the set of experimental units on which we have data for the primary, surrogate,
and concomitant (as well as design) variates be denoted by $I_v$ with cardinality $n_v$.
If $C$ stands for the concordance between $Y$ and $X$, then one way to incorporate the
validation set into the analysis is to perform first a preliminary test of
\[
  H_0 : \Pr(C) \ge 1/2. \tag{1.23}
\]
Now, corresponding to $\mathbf{X}_i$, we have a $(p+2)$-variate observation $(Y_i, X_i, \xi_i')'$, $i \in I_v$,
and $T_{n0}$ becomes here $T_{n_v0}$ which is a $2 \times r$ matrix with $r$ elements for each of $Y$ and
$X$. Also, $V_n$ is now $V_{n_v}$, a $(p+2) \times (p+2)$ matrix that can be partitioned as
\[
  V_{n_v} =
  \begin{pmatrix}
    V_{n_v00} & V_{n_v0+} \\
    V_{n_v0+}' & V_{n_v++}
  \end{pmatrix},
\]
where $V_{n_v00}$ is a $2 \times 2$ matrix, $V_{n_v0+}$ is $2 \times p$ and $V_{n_v++}$ is a $p \times p$ matrix. Then,
proceeding as in (1.15) through (1.19), let
\[
  T^*_{n_v0} = T_{n_v0} - V_{n_v0+} (V_{n_v++})^{-1} T^0_{n_v}, \tag{1.24}
\]
\[
  V^*_{n_v00} = V_{n_v00} - V_{n_v0+} (V_{n_v++})^{-1} V_{n_v0+}'. \tag{1.25}
\]
The permutational covariance matrix of $T^*_{n_v0}$ is given by $C_{n_v} \otimes V^*_{n_v00}$, where $C_{n_v}$ is
defined as in (1.17) with $n$ replaced by $n_v$ and $I$ by $I_v$. Then (1.23) can be reframed as
\[
  H_0 : \nu^*_{00,XY} \ge 0 \quad \text{versus} \quad H_1 : \nu^*_{00,XY} < 0, \tag{1.26}
\]
where $\nu^*_{00,XY}$ is the population counterpart of the off-diagonal element of $V^*_{n_v00}$. An asymptotically normal test,
$L^0_{n_v}$, of (1.26) can be performed using the theory developed in Chapter 8 of Puri
and Sen (1971). Now, if $z_\eta$ is the critical level of $L^0_{n_v}$ with $\eta$ $(0 < \eta < 1)$ being
the level of significance for this test, then if $L^0_{n_v} < z_\eta$, we reject the null hypothesis
of concordance, and we do not proceed to test the original hypothesis of treatment
effects based on the surrogate set only. Alternative designs (for example, adaptive
designs) will then be needed to test the basic hypothesis in a valid and reliable manner. In such a situation, Sen (1994) considers another test statistic, $L^*_{n_v}$, of a parallel
hypothesis constructed from $I_v$ such that the component of $T_{n_v0}$ corresponding to $Y$
is the dependent variable, and the other component corresponding to $X$, along with
$T^0_{n_v}$, serves as the covariate rank vector. A simple linear regression is done as before
and $L^*_{n_v}$ will be permutationally (conditionally) distribution-free with a chi-squared
large sample approximation with $r$ degrees of freedom.
If, on the other hand, the null hypothesis of concordance is accepted, that is if
$L^0_{n_v} \ge z_\eta$, then Sen (1994) suggests combining statistical evidence from both $I$ and $I_v$
to get a test statistic to test the original hypothesis of no treatment effect. One such
combination can be $L^{**}_{n_v} = L_{n0} + L^*_{n_v}$, and this test is also permutationally (conditionally) distribution-free with a large sample central chi-squared distribution with
$2r$ degrees of freedom under $H_0$. However, the fact that the degrees of freedom are
$2r$ instead of $r$ causes concern regarding the efficiency of $L^{**}_{n_v}$ in terms of asymptotic
power. Hence, Sen (1994) suggests combining the covariate adjusted rank statistics from $I$ and $I_v$ before constructing the quadratic forms, and then constructing a
quadratic form in such a way that under $H_0$, the resulting test is permutationally
(conditionally) distribution-free with a large sample chi-squared approximation with $r$
degrees of freedom.
Before outlining the details of such a test, it is worth mentioning at this stage
that if the Prentice (1989) assumption holds (based upon clinical considerations),
performing a preliminary test actually complicates matters due to the multiple tests
involved. Sen (1994) illustrates this point by pointing out that two tests are performed, the preliminary test at significance level $\eta$, and the second stage test at level
$\alpha_2$, say. Then the overall level, say $\alpha$, is given by
(1.27)
where $l^*_{\alpha_2}$ and $l^{**}_{\alpha_2}$ are the critical levels of $L^*_{n_v}$ and $L^{**}_{n_v}$ respectively. The basic problem here is that the pair $(L^0_{n_v}, L^*_{n_v})$, and the pair $(L^0_{n_v}, L^{**}_{n_v})$ may not be stochastically
independent, hence evaluation of (1.27) will be quite complicated (even in the asymptotic setup). That is why performing a preliminary test of concordance may not be
very appealing.
Sen (1994) considers analogues of the standard parametric procedures discussed
in Roy et al. (1971), though in a nonparametric setup. In the normal theory setup,
specifying the mean and covariance structure completely specifies the distribution of
the test statistic, whereas here, the covariate adjusted linear rank statistics, although
asymptotically multivariate normal, include some unknown functionals in the mean
vector and covariance matrix that need to be known to be able to construct the
test. One option then would be to use adaptive procedures, but the large sample size
required eliminates such an option. Hence, Sen (1994) prescribes the unique combination procedures which we shall describe in detail.
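First, partition the residual vector $T^*_{n_v0}$ in (1.24) and the dispersion matrix in
(1.25) as follows
\[
  T^*_{n_v0} =
  \begin{pmatrix}
    T^*_{n_v0(Y)} \\
    T^*_{n_v0(X)}
  \end{pmatrix},
  \qquad
  V^*_{n_v00} =
  \begin{pmatrix}
    v^*_{n_vYY} & v^*_{n_vYX} \\
    v^*_{n_vXY} & v^*_{n_vXX}
  \end{pmatrix}.
  \tag{1.28}
\]
Also let
\[
  T^*_{n_v0(Y:X)} = T^*_{n_v0(Y)} - v^*_{n_vYX} (v^*_{n_vXX})^{-1} T^*_{n_v0(X)},
  \qquad
  v^*_{n_v(Y:X)} = v^*_{n_vYY} - v^*_{n_vYX} (v^*_{n_vXX})^{-1} v^*_{n_vXY}.
  \tag{1.29}
\]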
The statistics $T^*_{n_v0(X)}$ and $T^*_{n_v0(Y:X)}$ are statistically independent under the permutational model for large $n$, and permutationally uncorrelated for finite $n$. Moreover,
each of them, when normalized, is asymptotically multinormal. This prompted Sen
(1994) to consider the combination
\[
  T^{**}_{n_v0} = \left[ (v^*_{n_vXX})^{-1} + (v^*_{n_v(Y:X)})^{-1} \right]^{-1}
  \left\{ (v^*_{n_vXX})^{-1} T^*_{n_v0(X)} + (v^*_{n_v(Y:X)})^{-1} T^*_{n_v0(Y:X)} \right\}.
  \tag{1.30}
\]
The permutational mean of $T^{**}_{n_v0}$ is $0$ and its permutational dispersion matrix is
\[
  \left[ (v^*_{n_vXX})^{-1} + (v^*_{n_v(Y:X)})^{-1} \right]^{-1} C_{n_v}. \tag{1.31}
\]
The final step is to combine $T^{**}_{n_v0}$ and $T^*_{n0}$ in (1.18) to get the suggested test statistic.
Denote the $r \times r$ matrix in (1.31) by $A_1$, and let $A_2 = v^*_{n00} \cdot C_n$ where $C_n$ and $v^*_{n00}$
are defined in (1.17) and (1.19) respectively. Let
\[
  W_i = \left[ A_1^{-1} + A_2^{-1} \right]^{-1} A_i^{-1}, \quad i = 1, 2, \tag{1.32}
\]
and let
\[
  T^{0*} = W_1 T^{**\prime}_{n_v0} + W_2 T^{*\prime}_{n0}. \tag{1.33}
\]
$T^{0*}$ is conditionally distribution-free, under the permutational model, with $0$ mean
and covariance matrix $[A_1^{-1} + A_2^{-1}]^{-1}$, and the permutational multivariate central
limit theorem of Sen (1983) applies here. Hence the motivation for the overall test
statistic
\[
  L^{0*} = T^{0*\prime} \left[ A_1^{-1} + A_2^{-1} \right] T^{0*}. \tag{1.34}
\]
The exact permutational distribution of $L^{0*}$ can be obtained by enumeration if $n$
and $n_v$ are small to moderate. But for large $n$ and $n_v$, the distribution of $L^{0*}$ is
asymptotically chi-squared with $r$ degrees of freedom. For local (i.e., Pitman-type)
alternatives, the asymptotic distribution of $L^{0*}$ is a noncentral chi-squared with $r$
degrees of freedom and noncentrality parameter $\Delta_L$ which depends upon the alternative and the score functions. The test based on $L^{0*}$ has, at least asymptotically,
better power than the two stage test in (1.27) if the Prentice (1989) assumption holds,
whereas it is not as favorable if there is reason to doubt the concordance between the
surrogate and the primary variate.
Moreover, both Type I and Type II censoring can be incorporated into these
tests just as before in (1.21) and (1.22). The V matrices would be modified and we
proceed as in (1.28) through (1.34). The extension to repeated significance testing is
also conceivable albeit more complex.
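The weighting idea behind (1.32) through (1.34) can be sketched in Python as follows. This is a generic inverse-covariance combination of uncorrelated $r$-vectors, written under the assumption that each component statistic has zero mean under the null hypothesis and a known (permutational) covariance matrix; the function name `combine_statistics` is hypothetical.

```python
import numpy as np


def combine_statistics(T_list, A_list):
    """Inverse-covariance weighting of uncorrelated r-vectors (sketch).

    T_list : list of (r,) statistics, each with null mean zero
    A_list : list of (r, r) positive definite covariance matrices
    """
    precisions = [np.linalg.inv(np.asarray(A, dtype=float)) for A in A_list]
    total_precision = sum(precisions)            # sum of the inverse covariances
    combined_cov = np.linalg.inv(total_precision)

    # Weights W_i = [sum_k A_k^{-1}]^{-1} A_i^{-1}; they add up to the identity.
    weights = [combined_cov @ P for P in precisions]
    T0 = sum(W @ np.asarray(T, dtype=float) for W, T in zip(weights, T_list))

    # Quadratic form of the combined statistic in its own inverse covariance.
    L = float(T0 @ total_precision @ T0)
    return T0, L, weights
```

The same function accepts three components, which is the shape the combination takes once a third subset of units is brought in. The chi-squared reference distribution with $r$ degrees of freedom is appropriate only when the weights are the inverse-covariance weights computed here.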
In simple designs (e.g., two-sample or multi-sample models), it is possible to
construct functionals of the distribution functions $F(\cdot \mid z)$ and $G(\cdot \mid z)$ that help in
drawing statistical inferences about the treatment effects. Consider the functional
$\theta(z) = \theta(F(\cdot \mid z))$, and the functional $\xi(z) = \xi(G(\cdot \mid z))$. Examples of such functionals
could be the nonparametric regression quantiles (e.g., the conditional median), or the
conditional mean, although the latter is not preferred due to its sensitivity to the
tails of $F(\cdot \mid z)$ and will not be considered here.
The dependence between the surrogate and the primary variates is reflected in
some suitable nonparametric functional, $\psi(\cdot)$, where
\[
  \theta(z) = \psi(\xi(z)), \quad z \in \mathcal{Z}. \tag{1.35}
\]
Note that in a hierarchical design, $\xi(z)$ may be estimated from the set $I$, and $\psi(\cdot)$ from
the set $I^*$. In an IMD, $\theta(z)$ may be estimated from $I^0$, $\xi(z)$ from $I$, and $\psi(\cdot)$
from $I^*$. Let $\theta(\cdot)$ and $\xi(\cdot)$ be conditional quantiles and consider the simplest model
of placebo versus treatment. The Prentice (1989) assumption may be tested here as
follows. Partition the concomitant vector $Z$ as $(Z^{(1)}, Z^{(2)})$, with $Z^{(1)}$ containing the
treatment and design variates, and $Z^{(2)}$ containing the other possibly random concomitant
variates. In the placebo vs. treatment case, $Z^{(1)}$ takes only the values 0 for placebo,
and 1 for treatment.
The null hypothesis of no treatment relationship to the true outcome can be
formulated as
(1.36)
for all $Z^{(1)}$ and $Z^{(2)}$. Similarly, the null hypothesis of no relationship of the treatment
to the surrogate can be formulated as
(1.37)
for all $Z^{(1)}$ and $Z^{(2)}$. Testing the validity of the Prentice (1989) assumption is equivalent here to requiring (1.36) to be testable through (1.37). Thus if $\theta(Z^{(1)}, Z^{(2)})$
depends on $Z^{(1)}$, then the concordance condition would have to be verified before
making any conclusions. If $z_0$ and $z_1$ are two distinct values of $Z^{(1)}$, then we may
require that for every $Z^{(2)}$
(1.38)
This brings us to the domain of estimation of the functionals $\theta(\cdot)$, $\xi(\cdot)$ and $\psi(\cdot)$ in a
nonparametric way which we shall deal with in detail in the next subsection.
Estimation
Consider the simplest case of placebo versus treatment when $Z^{(1)} = 0$ or 1,
and when there is no concomitant variate, i.e., when $Z^{(2)} = 0$. For the set $I$, we
have two subsets corresponding to the placebo and treatment groups, from which
we estimate the quantiles $\xi(0)$ and $\xi(1)$ as the sample quantiles of the respective
distributions. For the validation subset, $I^*$, Sen (1994) suggests considering the two
bivariate distributions of $(X, Y)$ for the placebo and treatment groups, and then
estimating the sample quantiles from the marginal distributions for each sample alone.
Thus, from the validation subset, $I^*$, we have the estimators
\[
  (\hat\xi_v(0), \hat\theta_v(0)) \quad \text{and} \quad (\hat\xi_v(1), \hat\theta_v(1)). \tag{1.39}
\]
Also, from the surrogate set, $I$, we have the estimators $\hat\xi_s(0)$, $\hat\xi_s(1)$. By standard
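multivariate nonparametric methods (see Chapter 5 of Puri and Sen (1971)), the
asymptotic distribution of the two bivariate points $(\hat\xi_v(0), \hat\theta_v(0))$ and $(\hat\xi_v(1), \hat\theta_v(1))$ is
\[
  \sqrt{n_{v0}}
  \begin{pmatrix}
    \hat\xi_v(0) - \xi(0) \\
    \hat\theta_v(0) - \theta(0)
  \end{pmatrix}
  \sim \mathcal{N}_2(0, \Gamma_0), \tag{1.40}
\]
where $n_{v0}$ is the cardinality of the subset of $I^*$ for which $Z = 0$. Also, if $n_{v1}$ stands
for the cardinality of the subset of $I^*$ for which $Z = 1$, we have
\[
  \sqrt{n_{v1}}
  \begin{pmatrix}
    \hat\xi_v(1) - \xi(1) \\
    \hat\theta_v(1) - \theta(1)
  \end{pmatrix}
  \sim \mathcal{N}_2(0, \Gamma_1), \tag{1.41}
\]
where $\Gamma_0$ and $\Gamma_1$ are to be consistently estimated from the validation subsets, though
with a better rate of convergence if we assume that $\Gamma_0 = \Gamma_1 = \Gamma$. Similarly, if $n_{s0}$
and $n_{s1}$ are the sizes of the subsets of $I$ for which $Z = 0$ or $Z = 1$, then
\[
  \sqrt{n_{s0}}\, (\hat\xi_s(0) - \xi(0)) \sim \mathcal{N}(0, \gamma_{s0}), \tag{1.42}
\]
\[
  \sqrt{n_{s1}}\, (\hat\xi_s(1) - \xi(1)) \sim \mathcal{N}(0, \gamma_{s1}), \tag{1.43}
\]
where $\gamma_{s0}$ and $\gamma_{s1}$ are consistently estimated from the respective samples, also with
a better estimate if we let $\gamma_{s0} = \gamma_{s1} = \gamma_s$. By (1.40) and (1.41), $E(\hat\theta_v(0) \mid \hat\xi_v(0))$
and $E(\hat\theta_v(1) \mid \hat\xi_v(1))$ are both linear in $\hat\xi_v(0)$ and $\hat\xi_v(1)$ respectively, and hence the
motivation for the following estimators of $\theta(0)$ and $\theta(1)$
(1.44)
(1.45)
The regression estimates, $\hat\beta_0$ and $\hat\beta_1$, are to be obtained from classical linear inference
procedures by using estimates of $\Gamma_0$, $\Gamma_1$, $\gamma_{s0}$ and $\gamma_{s1}$. This procedure applies also
when $Z^{(1)}$ takes more than two values.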
Now, with the introduction of the concomitant variate, $Z^{(2)}$, we may have to
deal with three possible situations. The first arises when $Z^{(2)}$ is categorical in nature.
The same method of estimation discussed will still apply but based on the subsets
corresponding to the possible values of $Z^{(2)}$. So, if these values are denoted by $a_r$, $r = 1, \ldots, K$, then the functionals to estimate are
\[
  \theta(Z^{(1)}, a_r), \quad \xi(Z^{(1)}, a_r), \quad r = 1, \ldots, K. \tag{1.46}
\]
The second case may arise when $Z^{(2)}$ is a continuous random variable. Consider a
simple model when $Z^{(2)}$ is a scalar random variable with a continuous distribution.
To estimate $\theta(Z^{(1)}, x_0)$ for a given $x_0$, we let
\[
  W_i = |Z_i^{(2)} - x_0|, \quad i \in I \ (\text{and } i \in I^*).
\]
Now, within each subset of $I$ (and $I^*$) we consider a subset of observations for which
the $W_i$ have the smallest $k$ values, where $k$ is not small but $k/n_{v0}$ (or $k/n_{v1}$, etc.)
is small. Based upon these $k$ observations we proceed as in (1.40) through (1.45)
and estimate $\theta(Z^{(1)}, x_0)$. The theory for this is explained in Gangopadhyay and Sen
(1992a) and Gangopadhyay and Sen (1992b). It should be noted that the rate of
convergence in (1.40)-(1.43) was $\sqrt{n}$, whereas here it is of the order $n^a$ for some
$a < 2/5$.
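The nearest-neighbour step described above can be sketched in Python. The sketch covers only the selection of the $k$ observations closest to $x_0$ within one treatment group and the sample quantile computed from them; it does not include the regression adjustment of (1.44)-(1.45). The function name `knn_conditional_quantile` and the default of the median are assumptions made for illustration.

```python
import numpy as np


def knn_conditional_quantile(y, z2, x0, k, q=0.5):
    """k-nearest-neighbour estimate of a conditional quantile (sketch).

    y  : (n,) responses within one treatment group
    z2 : (n,) scalar continuous covariate
    x0 : point at which the conditional quantile is wanted
    k  : number of nearest neighbours, 1 <= k <= n
    q  : quantile level (0.5 gives the conditional median)
    """
    y = np.asarray(y, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    if not 1 <= k <= y.size:
        raise ValueError("k must satisfy 1 <= k <= n")

    w = np.abs(z2 - x0)                         # distances W_i = |Z_i - x0|
    nearest = np.argsort(w, kind="stable")[:k]  # indices of the k smallest W_i
    return float(np.quantile(y[nearest], q))
```

In practice $k$ would be chosen to grow with the group size while $k/n$ stays small, which is the trade-off responsible for the slower rate of convergence noted above.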
The third case is when some covariates are discrete and others are continuous. A
combination of the methods for the first and second cases will be a good prescription.
2.5.2.1 Concomitance Assumption
Up till now we have been assuming that the $(X_i, Z_i)$, $i \in I$, has the same distribution
as $(X_i, Z_i)$, $i \in I^*$, so that $\theta(z)$ and $\xi(z)$ could be estimated consistently. But this may
not be tenable in practice due to planned censoring of the surrogate or concomitant
variates that may damage the homogeneity assumption of the covariates made in
(1.14), and hence, all the analysis made so far may be invalid. Sen (1994) gives
us some hope here by introducing some modifications that will extend the method
discussed to such situations.
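Consider the case when $I = \{i : X_i \le x_0\}$ for some real $x_0$. Here $G(\cdot \mid z)$ will be
right truncated at $x_0$. Define
\[
  G_R(x \mid z) =
  \begin{cases}
    G(x \mid z)/G(x_0 \mid z), & x \le x_0, \\
    1, & x > x_0,
  \end{cases}
\]
and consider the functional $\xi_R(z) = \xi(G_R(\cdot \mid z))$,
which is different from $\xi(z)$. On the other hand, variates in $I^*$ have values that
correspond to values of $X_i$ exceeding $x_0$. Thus, $H(x, y \mid z)$, for this upper tail, will
be truncated from the left at $x_0$ with respect to $X_i$. Define
\[
  H_L(x, y \mid z) = \frac{H(x, y \mid z) - H(x_0, y \mid z)}{1 - H(x_0, \infty \mid z)},
  \quad \text{for } x > x_0,\ y \in R^+.
\]
Then the corresponding marginal distribution of $X$ is
\[
  G_L(x \mid z) = \frac{H(x, \infty \mid z) - H(x_0, \infty \mid z)}{1 - H(x_0, \infty \mid z)},
  \quad x > x_0,
\]
and parallel to $\xi_R(z)$, let $\xi_L(z) = \xi(G_L(\cdot \mid z))$, with $\theta_L(z)$ defined analogously for the true endpoint.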
Thus, $n_s$ and $n_v$ are nonnegative integer random variables that add up to a
known $n$, and such that
\[
  \frac{n_s}{n} \xrightarrow{P} \Pr(X \le x_0) \quad \text{and} \quad
  \frac{n_v}{n} \xrightarrow{P} \Pr(X > x_0), \quad \text{as } n \to \infty.
\]
The asymptotic normality results of (1.40)-(1.41) on the estimates of $\theta_L(z)$ and
$\xi_L(z)$, and $\xi_R(z)$ [as in (1.42)-(1.43)] will not lead to better estimates of $\theta(z)$ [as in (1.44)-(1.45)], unless $\xi_L(z)$, $\xi_R(z)$, $\theta_L(z)$, and $\theta(z)$ are related by some estimable functional
forms. Sen (1994) contends that although it is tempting to consider semiparametric models (such as the proportional hazards model of Cox (1972)) to enforce such
estimability conditions, these models should be avoided here on account of lack of
robustness. Sen (1994) suggests, as a partial solution, extension of the validation subset, $I^*$, by including additional observations for which $X_i \le x_0$. This will permit the
nonparametric estimation of the relationship between $\xi_L(z)$ and $\xi_R(z)$, and also the
relationship between $\theta(z)$ and $\xi_L(z)$.
1.3
Synopsis of The Work Done
There are two main issues that need to be addressed when incorporating surrogate
variables in clinical trials. The first issue is economic in nature, while the second is
statistical. A reduction in the cost of a clinical trial, although desired, should not
be at the expense of unreliable statistical inference. This could happen, for example,
if one abuses surrogate endpoints by using them without considering carefully their
association with the true endpoint. Thus, in the design stage of the clinical trial, one
has to try and balance the practicality of primary and surrogate response measurements (in terms of cost and difficulty), with the minimization to the extent possible
of the bias of the estimates and inefficiency of the tests involved. Hence, a proper
choice of the validation subset, $I^*$, seems to be essential.
Although this study will not focus on design issues of clinical trials that handle
surrogate responses, we will consider statistical analysis in incomplete multiresponse
design (IMD) settings of which the hierarchical design is a special case. The reason
for this is that it allows investigators greater flexibility than if we consider hierarchical
designs only. The contribution of this work consists in generalizing Sen's method to
the multivariate case in a randomized block setting (complete as well as incomplete),
and in combining inter- and intra-block information in a nonparametric way. This has
not been done before in the literature. Above all, the introduction of surrogates in this
setting is also new.
There are two scenarios that we will study in detail which I shall now describe.
As we have seen in (1.2.1), one surrogate may not suffice to draw valid inference
about the treatment effect on the primary variate. One major concern of scientists
that puts the validity of the surrogate endpoints in doubt is the possibility that treatment may affect the true endpoint via other pathways than the surrogate. Thus, if
the investigators think that this is the case with a certain surrogate, they can use
measurements of other surrogates to account for these other pathways of treatment
effects on the primary endpoint. In this way we will have more information about differential
treatment effects on the true endpoint and our inference would be valid. For
example, we mentioned in (1.2.1) that Baccheti et al. (1992) found that CD8 counts
as well as changes in hemoglobin and WBC during therapy add more information to
the effect of treatment on AIDS or death than CD4 counts alone. Thus, in many
cases use of more than one surrogate to predict treatment effect on a true outcome
is better than relying on only one surrogate.
The first scenario can thus be formulated as follows. A primary variate, Y, is
considered along with a q-vector of surrogates, X, and measurements on the concomitant and design variates, Z, are obtained as well. An IMD here pertains to designing
three subsets of experimental units; the first subset, I, has measurements on X and
Z, and is termed the surrogate set. The second subset, called the validation subset,
$I^*$, has measurements on Y, X, and Z. The third subset, $I^0$, contains measurements
on Y and Z.
The second scenario that we will focus on deals with a vector of primary variates, Y, a vector of surrogates X, and accompanying concomitant variates, Z. Here
surrogate and validation subsets will be considered also.
The methodology proposed here will be an extension of the methods in (1.2.5).
The tests proposed by Sen in (1.2.5) dealt with hierarchical designs only, and they
accommodate use of one surrogate for a primary variate. Keeping in mind that hierarchical designs are a special case of an IMD, this research generalizes these tests
to the IMD setting. As pointed out in (1.2.5), statements were made regarding test
statistics and their properties without any detailed proofs. Supplying proofs in the
multivariate setting of the two scenarios above will be one of the tasks to be carried
out during this study.
The second chapter will focus on analysis of designs of the first scenario type.
Since it would take too much space here we will not go into the technical details, but
rather we content ourselves with only a motivation. The null hypothesis of interest is
the same as (1.14) when we consider the surrogate sample alone with the difference
that $G(x \mid \xi)$ is now a multivariate distribution function. The aim is to formulate
test statistics, based on linear rank statistics in (1.15), in each of the subsets $I$, $I^*$,
and $I^0$, and then to come up with a unified test that combines information from all
these subsets similar to the test suggested in (1.34).
To be more specific, if we assume that all the surrogate variates can be recorded
with relative ease, then if we have reason to believe that the Prentice (1989) criteria
hold true, we will proceed with the surrogate sample analysis as in (1.2.5). Thus, we
will have now
\[
  \mathbf{X}_i = (X_i', \xi_i')', \quad i \in I,
\]
which leads to a $(p+q) \times n$ matrix. The linear rank statistics are constructed as in
(1.15), but now $T_{n0}$ will be a $qr$-row vector with $r$ elements for each of the $q$ surrogate
variables. $V_n$ is now a $(p+q) \times (p+q)$ matrix. We fit a multiple linear regression with
the $q$ surrogates as the dependent variables and the $p$ concomitant variables as the
independent variables. An appropriate residual rank statistic, say $T^*_{n0}$, corresponding to (1.18) will then be constructed. Censoring can also be incorporated here by
substituting the censored version of the score functions as in (1.21).
Similar modifications have to be made when constructing a residual rank statistic
for the validation subset, $I^*$. $T_{n_v0}$ is now a $(q+1)r$-row vector with $r$ elements for $Y$,
and $r$ elements for each of the $q$ surrogate variables. $V_{n_v}$ is now a $(p+q+1) \times (p+q+1)$
matrix, where $V_{n_v00}$ is $(q+1) \times (q+1)$, $V_{n_v0+}$ is $(q+1) \times p$ and $V_{n_v++}$ is a $p \times p$
matrix. We regress $Y$ and $X$ on the $p$ concomitant variates, and then we consider
the residual rank statistic, say $T^*_{n_v0}$, corresponding to (1.24). We then partition $T^*_{n_v0}$
and the corresponding covariance matrix $V^*_{n_v00}$ as before. $V^*_{n_v00}$ now becomes
\[
  V^*_{n_v00} =
  \begin{pmatrix}
    v^*_{n_vYY} & v^*_{n_vYX} \\
    v^{*\prime}_{n_vYX} & v^*_{n_vXX}
  \end{pmatrix},
\]
where $v^*_{n_vYY}$ is the variance of $Y$, $v^*_{n_vYX}$ is a $1 \times q$ vector, and $v^*_{n_vXX}$ is a $q \times q$ matrix
of the covariance of $X$. We then construct a statistic similar to $T^{**}_{n_v0}$ in (1.30).
An analogous treatment of the subset $I^0$ would yield a similar residual rank
statistic, say $T^*_{n_00}$, where $n_0$ is the cardinality of $I^0$. Here we would regress $Y$ on
the $p$ covariates to obtain $T^*_{n_00}$. We shall use $T^*_{n_00}$ along with the previously defined
residual rank statistics for $I$ and $I^*$, namely $T^*_{n0}$ and $T^{**}_{n_v0}$ respectively, to construct
an overall test statistic similar to (1.34). Thus, if we let $T^{0*}$, as in (1.33), be
\[
  T^{0*} = W_1 T^{**\prime}_{n_v0} + W_2 T^{*\prime}_{n0} + W_3 T^{*\prime}_{n_00}, \tag{1.47}
\]
where the weights $W_i$, $i = 1, \ldots, 3$ are to be defined as in (1.32), then the final step
consists of constructing the quadratic form similar to (1.34) that will be permutationally distribution-free, and will have a large sample chi-squared approximation.
The third chapter will be devoted to studying the second scenario. Here we will
make use of the nonparametric multivariate techniques in Puri and Sen (1971) in
order to come up with the combined test. We consider residual rank statistics based
on regressing the q surrogate variates on the p covariates in the surrogate sample.
Moreover, if $Y$ is an $s \times 1$ vector, then we regress the $q + s$ variates on the $p$ covariates to get the residual rank statistic in the validation subset, $I^*$. In the subset
$I^0$, we regress the $s$ primary response variates on the $p$ concomitant variates to get
the corresponding residual rank statistic. Then we construct the combined test by
constructing the appropriate quadratic form as in (1.34).
It may be noted in (1.47) that if, based on other considerations, the weights
are prespecified, then although $n^{1/2} T^{0*}$ will still be asymptotically multinormal, the
quadratic form, $L^{0*}$, based on $T^{0*}$ and the prior weights may not converge to a chi-squared distribution because the discriminant of such a form may not be equal to the
covariance matrix of the combined test-statistics, and thus Cochran's (1934) Theorem
may not apply. Here it would be appropriate to consider resampling methods, like the
bootstrap, when dealing with the distribution theory in the multivariate case. Further
note that a hierarchical design is a special case of an incomplete multiresponse design
so that the methods developed in this work will apply to such designs as well. Hence,
they will not be considered in detail.
Finally, the last chapter will study the power function of the test developed and
its asymptotic relative efficiency. An example will be based on a completed double-blind placebo-controlled trial conducted by Burroughs-Wellcome, which treated 281
patients with advanced HIV disease. Of these, 137 patients were randomized to receive
placebo and 144 patients were randomized to receive a 250-mg dose of Zidovudine
(ZDV) every four hours. In this study CD4 counts were determined prior to treatment
and approximately every four weeks during therapy. The median duration of follow-up was 120 - 127 days, at which point the study was stopped due to the superior
results of the ZDV arm in decreasing mortality.
Chapter 2
Methodology In Randomized
Block Design (RBD)
2.1
Introduction
This chapter will be devoted to studying the first scenario mentioned in (1.3) in
which a primary variate Y, a q-vector of surrogates X, and a p-vector of concomitant
variates, Z, are considered. As mentioned earlier the new treatment may affect the
primary endpoint through possibly more than one path. This model allows us to
account for these different paths.
An IMD here pertains to designing three subsets of experimental units: the first is the surrogate set $I$, which has measurements on X and Z; the second is the validation set, $I^*$, which has measurements on Y, X, and Z; and the third subset, $I^0$, contains measurements on Y and Z.
Although this study will not focus on design issues of clinical trials that handle surrogate responses, statistical analysis is considered in two design settings: the randomized block design (RBD), and the balanced incomplete block design (BIBD). The main reason for considering factorial designs is to reduce the variability of the estimates by eliminating the effect of one or more nuisance variables. Each of the above designs will be considered separately for the subsets $I$, $I^*$ and $I^0$. The next section will deal with the simplest design, namely the randomized block layout in each of the subsets $I$, $I^*$ and $I^0$.
2.2 Randomized Block Design
In classical linear models theory, the analysis of data from a randomized block setup
makes the following assumptions when estimating parameters:
i) The block and treatment effects are additive.
ii) The errors are independent and homoscedastic.
iii) Blocks and treatments do not interact unless each cell contains two or more
observations.
Moreover, the errors are assumed to be normally distributed in case confidence intervals or significance tests for the parameters are desired. Here, in the nonparametric setup, these assumptions are relaxed for the most part. We do not assume additivity of blocks and treatments; we drop the homoscedasticity assumption of the errors, and instead assume that the errors are independent random vectors, each of which has a continuous distribution that is symmetric in its arguments. The normality assumption is completely relaxed.
2.2.1 RBD For The Surrogate Set I
Consider a two-way layout with s blocks of r plots each where r different treatments are randomly assigned to the plots. Assume there are no replicates for simplicity. Thus the cardinality of $I$ is $N (= rs)$. The response in the $i$th block receiving the $j$th treatment is a $(p+q)$-vector
$$U_{ij} = (X_{ij}^{(1)},\dots,X_{ij}^{(q)},Z_{ij}^{(1)},\dots,Z_{ij}^{(p)})' = (X_{ij}', Z_{ij}')'$$
of measurements corresponding to the p covariates and the q surrogates. Consider the model
$$U_{ij} = \mu + \alpha_i + \tau_j + \epsilon_{ij}, \qquad j = 1,\dots,r, \quad i = 1,\dots,s, \tag{2.1}$$
where $\mu$ is the mean effect, $\alpha_1,\dots,\alpha_s$ are the block effects, $\tau_1,\dots,\tau_r$ are the treatment effects, and $\epsilon_{11},\dots,\epsilon_{sr}$ are all $(p+q)$-vectors. The following assumptions are made:
a) $U_i = (U_{i1},\dots,U_{ir})$ has a continuous $r(p+q)$-variate c.d.f. $G_i(u)$, $u \in R^{r(p+q)}$, $i = 1,\dots,s$.
b) The joint c.d.f. of $Z_i = (Z_{i1},\dots,Z_{ir})$ is symmetric in its r p-vectors. This is the concomitance assumption of the covariate distribution in the analysis of covariance.
We wish to test the null hypothesis,
$$H_0:\ \tau_j = 0, \qquad j = 1,\dots,r,$$
while the set of alternatives relates to shifts in location due to treatment effects.
The linear rank statistics discussed in (1.2.5) will be based upon the method of ranking after alignment described in detail in Chapter 7 of Puri and Sen (1971). The alignment procedure eliminates the block effect by subtracting from each observation in a block a translation invariant symmetric function of the observations like the block average, block median, the Winsorized or trimmed mean, etc. Let $\tilde U_{i\cdot}$ be such a function. Define the aligned observations as
$$U^*_{ij} = U_{ij} - \tilde U_{i\cdot}, \qquad j = 1,\dots,r, \quad i = 1,\dots,s. \tag{2.2}$$
For the $k$th variate, rank the observations $U_{11}^{(k)*},\dots,U_{sr}^{(k)*}$ in ascending order of magnitude and denote by $R_{ij}^{(k)}$ the rank of $U_{ij}^{(k)*}$ in this set, for $i = 1,\dots,s$, $j = 1,\dots,r$, and $k = 1,\dots,p+q$. Thus, there is a rank vector $R_{ij} = (R_{ij}^{(1)},\dots,R_{ij}^{(p+q)})'$, corresponding to $U^*_{ij} = (U_{ij}^{(1)*},\dots,U_{ij}^{(p+q)*})'$, $i = 1,\dots,s$, $j = 1,\dots,r$.
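The alignment in (2.2) followed by the joint ranking can be sketched in a few lines of Python. This is only an illustration for a single variate under the assumption of no ties; the function names `align_blocks` and `overall_ranks` and the toy data are hypothetical and do not come from the original text.

```python
from statistics import mean, median

def align_blocks(blocks, center=mean):
    """Subtract a block-level location summary from every observation.

    blocks: list of s blocks, each a list of r responses on one variate.
    center: symmetric location function of a block (mean, median, ...).
    """
    return [[y - center(block) for y in block] for block in blocks]

def overall_ranks(aligned):
    """Rank all N = r*s aligned values jointly (1 = smallest); assumes no ties."""
    ordered = sorted(v for block in aligned for v in block)
    position = {v: idx + 1 for idx, v in enumerate(ordered)}
    return [[position[v] for v in block] for block in aligned]

# Toy layout: s = 2 blocks, r = 3 treatments (illustrative values only).
blocks = [[10.0, 12.5, 11.0], [20.0, 23.5, 22.0]]
aligned = align_blocks(blocks)          # block means removed
ranks = overall_ranks(aligned)          # ranks among all 6 aligned values
```

Passing `center=median` gives alignment on the block median instead; the ranking step is unchanged.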
For each $N (= rs)$ and each $k = 1,\dots,p+q$, define suitable rank scores $a_N^{(k)} = (a_{N,1}^{(k)},\dots,a_{N,N}^{(k)})$, where $a_{N,j}^{(k)} = J_N^{(k)}(j/(N+1))$, $1 \le j \le N$. Moreover, $J_N^{(k)}(u)$ is defined in accordance with the Chernoff-Savage convention, that is $J_N^{(k)}(u)$ satisfies the following conditions:
(a) $\lim_{N\to\infty} J_N^{(k)}(u) = J^{(k)}(u)$ exists for $u \in (0,1)$ and is not constant,
(b)
where
$$G_{N[j]}^{(k)}(x) = s^{-1}\,[\text{number of } U_{ij}^{(k)*} \le x], \qquad k = 1,\dots,p+q, \quad j = 1,\dots,r,$$
and
$$H_N^{(k)}(x) = r^{-1}\sum_{j=1}^{r} G_{N[j]}^{(k)}(x), \qquad k = 1,\dots,p+q.$$
Define
$$G_{N[j,l]}^{(k,k')}(x,y) = s^{-1}\,[\text{number of } (U_{ij}^{(k)*}, U_{il}^{(k')*}) \le (x,y)],$$
for $k,k' = 1,\dots,p+q$, $j,l = 1,\dots,r$ with either $j \ne l$ or $k \ne k'$ or both. Let $\bar a_N^{(k)} = N^{-1}\sum_{\alpha=1}^{N} a_{N,\alpha}^{(k)}$. Also let $C_{N\alpha}^{(j)} = 1$ if the $\alpha$th smallest observation among the N values of $U_{ij}^{(k)*}$ belongs to the $j$th treatment, and let $C_{N\alpha}^{(j)} = 0$ otherwise, for $\alpha = 1,\dots,N$, $j = 1,\dots,r$. Now construct the linear rank order statistic for the $k$th variate and the $j$th treatment, $j = 1,\dots,r$, $k = 1,\dots,p+q$:
$$T_{N,j}^{(k)} = s^{-1}\sum_{\alpha=1}^{N} C_{N\alpha}^{(j)}\, a_{N,\alpha}^{(k)}. \tag{2.3}$$
This leads to the $r \times (p+q)$ matrix
$$T_N = ((T_{N,j}^{(k)}))_{j=1,\dots,r;\ k=1,\dots,p+q}. \tag{2.4}$$
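As a small numerical illustration of (2.3), the statistic for treatment j is the average over blocks of the score attached to the rank of that treatment's aligned observation. The sketch below assumes the score function $J(u) = u$ (Wilcoxon-type scores), which is only one admissible choice; the helper names and the rank table are hypothetical.

```python
from fractions import Fraction

def wilcoxon_scores(N):
    """Scores a_{N,alpha} = J(alpha / (N + 1)) with the assumed choice J(u) = u."""
    return [Fraction(alpha, N + 1) for alpha in range(1, N + 1)]

def linear_rank_statistics(ranks, scores):
    """T_{N,j} = s^{-1} * sum over blocks i of a_{N, R_ij}, for each treatment j."""
    s, r = len(ranks), len(ranks[0])
    return [sum(scores[ranks[i][j] - 1] for i in range(s)) / s for j in range(r)]

# Ranks after alignment for s = 2 blocks and r = 3 treatments (illustrative).
ranks = [[2, 5, 3], [1, 6, 4]]
scores = wilcoxon_scores(6)
T = linear_rank_statistics(ranks, scores)   # one entry per treatment
a_bar = sum(scores) / len(scores)           # overall mean score
```

With several variates the same computation is repeated column by column, giving the $r \times (p+q)$ matrix in (2.4).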
Define the rank collection matrix of order $(p+q) \times N$ by $R^*_N = (R_{11},\dots,R_{sr})$. Partition $R^*_N$ into s submatrices of order $(p+q) \times r$ each:
$$R^*_N = (R_1^{(p+q)\times r},\dots,R_s^{(p+q)\times r}), \qquad R_i = (R_{i1},\dots,R_{ir}), \quad i = 1,\dots,s.$$
Under the null hypothesis the distribution of $U^*_{i1},\dots,U^*_{ir}$ is symmetric in the r vectors and hence remains invariant under any permutation of the r vectors. Thus the joint distribution of
$$U^*_N = (U^*_{11},\dots,U^*_{1r},\dots,U^*_{s1},\dots,U^*_{sr})$$
remains invariant under the finite group $\mathcal G_s$ of transformations $\{g_s\}$ which maps the sample space onto itself. The cardinality of $\mathcal G_s$ is equal to $(r!)^s$. Typically a $g_s$ is such that
$$g_s U^*_N = U^0_N = (U^0_{11},\dots,U^0_{1r},\dots,U^0_{s1},\dots,U^0_{sr}),$$
where $(U^0_{i1},\dots,U^0_{ir})$ is a permutation of $U^*_{i1},\dots,U^*_{ir}$, $i = 1,\dots,s$. Let $R^0_N$ denote the rank collection matrix corresponding to $U^0_N$. Note that for every $g_s \in \mathcal G_s$, there exists an $R^0_N = g_s R^*_N$ which is permutationally equivalent to $R^*_N$.
The distribution of $R^*_N$ over its $(N!)^{(p+q)}$ possible realizations will depend on the unknown c.d.f. $G_i$, even when the null hypothesis holds. However, under $H_0$, $U^0_N$ has the same distribution as $U^*_N$ for all $g_s \in \mathcal G_s$, and hence, the conditional distribution of $U^*_N$ over $\{U^0_N = g_s U^*_N;\ g_s \in \mathcal G_s\}$ will be uniform, each realization having the common conditional probability $(r!)^{-s}$. This leads to the probability law, $\mathcal P_s$:
Under $H_0$, the conditional distribution of $R^*_N$ over the $(r!)^s$ realizations $\{R^0_N = g_s R^*_N;\ g_s \in \mathcal G_s\}$ is uniform, each realization having the conditional probability $(r!)^{-s}$.
Since $\mathcal P_s$ is completely specified, the existence of conditionally distribution-free tests for $H_0$ is thus established.
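For small designs the conditional law can be enumerated directly. The following sketch lists all $(r!)^s$ within-block permutations of a score table and computes an exact conditional p-value. The statistic used here (the sum of squared deviations of the treatment averages from their mean) is an assumed stand-in chosen for simplicity, not the quadratic form developed later; all names and data are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

def statistic(score_blocks):
    """Sum over treatments of (T_j - mean of T)^2, with T_j the block-average score."""
    s, r = len(score_blocks), len(score_blocks[0])
    T = [sum(block[j] for block in score_blocks) / s for j in range(r)]
    centre = sum(T) / r
    return sum((t - centre) ** 2 for t in T)

def permutation_p_value(score_blocks):
    """Exact conditional p-value over all within-block permutations."""
    observed = statistic(score_blocks)
    count = total = 0
    for realization in product(*(permutations(b) for b in score_blocks)):
        total += 1
        count += statistic(realization) >= observed
    return Fraction(count, total), total

# Scores a_{N,R_ij} for s = 2 blocks and r = 3 treatments (illustrative).
score_blocks = [[Fraction(2, 7), Fraction(5, 7), Fraction(3, 7)],
                [Fraction(1, 7), Fraction(6, 7), Fraction(4, 7)]]
p_value, n_realizations = permutation_p_value(score_blocks)
```

Each realization carries probability $(r!)^{-s}$, so the p-value is simply the proportion of realizations at least as extreme as the observed one.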
Let
$$\bar a^{(k)}_{N,R^{(k)}_{i\cdot}} = r^{-1}\sum_{j=1}^{r} a^{(k)}_{N,R^{(k)}_{ij}}$$
be the intrablock averages for the $k$th variate, $k = 1,\dots,p+q$.
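Theorem 2.1 Let $V_N$ be the $(p+q) \times (p+q)$ matrix with elements $v_{kk'}$ given by
$$v_{kk'} = \frac{1}{s(r-1)}\sum_{l=1}^{s}\sum_{t=1}^{r}\Big(a^{(k)}_{N,R^{(k)}_{lt}} - \bar a^{(k)}_{N,R^{(k)}_{l\cdot}}\Big)\Big(a^{(k')}_{N,R^{(k')}_{lt}} - \bar a^{(k')}_{N,R^{(k')}_{l\cdot}}\Big), \tag{2.5}$$
$k,k' = 1,\dots,p+q$. Moreover, let $C_N$ be the $r \times r$ matrix with elements $c_{jj'}$ given by
$$c_{jj'} = \frac{1}{sr}\,(\delta_{jj'}\,r - 1), \qquad j,j' = 1,\dots,r, \tag{2.6}$$
where $\delta_{jj'}$ is the usual Kronecker delta. Then,
$$E(T_N) = (\bar a_N^{(1)},\dots,\bar a_N^{(p+q)}) \otimes J, \qquad \mathrm{Var}[\mathrm{Vec}(T_N)] = V_N \otimes C_N,$$
where J is an $r \times 1$ vector of ones and $\mathrm{Var}[\mathrm{Vec}(T_N)]$ is an $r(p+q) \times r(p+q)$ dispersion matrix.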
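The covariance structure in Theorem 2.1 can be checked numerically on a tiny layout by enumerating the within-block permutations. The sketch below is an illustration under the assumption that the conditional law permutes the r score vectors uniformly within each block; it compares the enumerated covariance of $T^{(k)}_{N,j}$ and $T^{(k')}_{N,j'}$ with $c_{jj'} v_{kk'}$ from (2.5) and (2.6). The score tables and function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

def v_element(a_k, a_k2):
    """Eq. (2.5): pooled within-block cross-product of scores, divisor s(r - 1)."""
    s, r = len(a_k), len(a_k[0])
    total = Fraction(0)
    for l in range(s):
        m1, m2 = sum(a_k[l]) / r, sum(a_k2[l]) / r
        total += sum((a_k[l][t] - m1) * (a_k2[l][t] - m2) for t in range(r))
    return total / (s * (r - 1))

def c_element(j, j2, s, r):
    """Eq. (2.6): c_jj' = (delta_jj' * r - 1) / (s r)."""
    return Fraction((r if j == j2 else 0) - 1, s * r)

def exact_cov(a_k, a_k2, j, j2):
    """Covariance of T_j^(k) and T_j'^(k') over all (r!)^s within-block permutations."""
    s, r = len(a_k), len(a_k[0])
    e1 = e2 = e12 = Fraction(0)
    n = 0
    for perms in product(permutations(range(r)), repeat=s):
        t1 = sum(a_k[i][perms[i][j]] for i in range(s)) / s
        t2 = sum(a_k2[i][perms[i][j2]] for i in range(s)) / s
        e1, e2, e12, n = e1 + t1, e2 + t2, e12 + t1 * t2, n + 1
    return e12 / n - (e1 / n) * (e2 / n)

F = Fraction
a_x = [[F(2, 7), F(5, 7), F(3, 7)], [F(1, 7), F(6, 7), F(4, 7)]]   # scores, variate k
a_z = [[F(1, 7), F(4, 7), F(6, 7)], [F(3, 7), F(2, 7), F(5, 7)]]   # scores, variate k'
agree = all(exact_cov(a_x, a_z, j, j2) == c_element(j, j2, 2, 3) * v_element(a_x, a_z)
            for j in range(3) for j2 in range(3))
```

Exact rational arithmetic is used so that the comparison is an equality check rather than a tolerance check.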
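Proof: Note that
$$E_{\mathcal P_s}\Big(a^{(k)}_{N,R^{(k)}_{ij}}\Big) = \frac{1}{N}\sum_{\alpha=1}^{N} a^{(k)}_{N,\alpha} = \bar a_N^{(k)}$$
and
$$E_{\mathcal P_s}\Big(a^{(k)}_{N,R^{(k)}_{ij}}\Big)^2 = \frac{1}{N}\sum_{\alpha=1}^{N}\big(a^{(k)}_{N,\alpha}\big)^2.$$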
Also, for all $i,i' = 1,\dots,s$, $j,l = 1,\dots,r$ and $k,k' = 1,\dots,p+q$
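To justify the first equality above note that $l$ can take any of $1,\dots,s$ with probability $1/s$, $m$ can take any of the remaining $s-1$ numbers with probability $1/(s-1)$, $t,u$ can take any of $1,\dots,r$ with probability $1/r$, and under $\mathcal P_s$ the blocks are independent. On the other hand if $i = i'$, $j \ne j'$, then
$$E_{\mathcal P_s}\Big(a^{(k)}_{N,R^{(k)}_{ij}}\,a^{(k)}_{N,R^{(k)}_{ij'}}\Big) = \frac{1}{sr(r-1)}\sum_{l=1}^{s}\sum_{t\ne u=1}^{r} a^{(k)}_{N,R^{(k)}_{lt}}\,a^{(k)}_{N,R^{(k)}_{lu}}$$
$$= \frac{1}{sr(r-1)}\Big\{\sum_{l=1}^{s}\Big(\sum_{t=1}^{r} a^{(k)}_{N,R^{(k)}_{lt}}\Big)^2 - \sum_{l=1}^{s}\sum_{t=1}^{r}\Big(a^{(k)}_{N,R^{(k)}_{lt}}\Big)^2\Big\}$$
$$= \frac{1}{sr(r-1)}\Big\{\sum_{l=1}^{s} r^2\Big(\bar a^{(k)}_{N,R^{(k)}_{l\cdot}}\Big)^2 - \sum_{\alpha=1}^{N}\big(a^{(k)}_{N,\alpha}\big)^2\Big\}$$
and
$$E_{\mathcal P_s}\Big(a^{(k)}_{N,R^{(k)}_{ij}}\,a^{(k')}_{N,R^{(k')}_{ij'}}\Big) = \frac{1}{sr(r-1)}\sum_{l=1}^{s}\sum_{t\ne u=1}^{r} a^{(k)}_{N,R^{(k)}_{lt}}\,a^{(k')}_{N,R^{(k')}_{lu}} \quad \text{if } k \ne k',\ j \ne j',$$
and
$$E_{\mathcal P_s}\Big(a^{(k)}_{N,R^{(k)}_{ij}}\,a^{(k')}_{N,R^{(k')}_{ij'}}\Big) = \frac{1}{sr}\sum_{l=1}^{s}\sum_{t=1}^{r} a^{(k)}_{N,R^{(k)}_{lt}}\,a^{(k')}_{N,R^{(k')}_{lt}} \quad \text{if } k \ne k',\ j = j'.$$
Thus, we have
$$E_{\mathcal P_s}\big(T^{(k)}_{N,j}\big) = \bar a_N^{(k)}$$
and
$$\mathrm{Var}_{\mathcal P_s}\big(T^{(k)}_{N,j}\big) = c_{jj}\,v_{kk}.$$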
Moreover,
(2.7)
The expected value in the first term in (2.7) above contains products of scores in the same block and same treatment. Such a product occurs with probability $1/r$ under $\mathcal P_s$. The expected value in the second term consists of products of scores in different blocks but same treatment. Keep in mind that the blocks are independent under $\mathcal P_s$.
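Also,
$$E_{\mathcal P_s}\Big[\Big(s^{-1}\sum_{\alpha=1}^{N} C^{(j)}_{N\alpha}\,a^{(k)}_{N,\alpha}\Big)\Big(s^{-1}\sum_{\alpha=1}^{N} C^{(j')}_{N\alpha}\,a^{(k')}_{N,\alpha}\Big)\Big]$$
$$= s^{-2}\Big[\sum_{\alpha=1}^{N} C^{(j)}_{N\alpha}C^{(j')}_{N\alpha}\,E_{\mathcal P_s}\big(a^{(k)}_{N,\alpha}a^{(k')}_{N,\alpha}\big) + \sum_{\alpha\ne\beta=1}^{N} C^{(j)}_{N\alpha}C^{(j')}_{N\beta}\,E_{\mathcal P_s}\big(a^{(k)}_{N,\alpha}a^{(k')}_{N,\beta}\big)\Big] \qquad (2.8)$$
$$= s^{-2}\Big[\frac{1}{r(r-1)}\sum_{l=1}^{s}\sum_{t\ne u=1}^{r} a^{(k)}_{N,R^{(k)}_{lt}}\,a^{(k')}_{N,R^{(k')}_{lu}} + \sum_{l\ne m=1}^{s} E_{\mathcal P_s}\Big(a^{(k)}_{N,R^{(k)}_{lj}}\,a^{(k')}_{N,R^{(k')}_{mj'}}\Big)\Big]$$
$$= s^{-2}\Big[\frac{1}{r(r-1)}\sum_{l=1}^{s}\sum_{t\ne u=1}^{r} a^{(k)}_{N,R^{(k)}_{lt}}\,a^{(k')}_{N,R^{(k')}_{lu}} + \sum_{l\ne m=1}^{s}\frac{1}{r^2}\sum_{t=1}^{r} a^{(k)}_{N,R^{(k)}_{lt}}\sum_{u=1}^{r} a^{(k')}_{N,R^{(k')}_{mu}}\Big]$$
$$= s^{-2}\Big[\frac{1}{r(r-1)}\sum_{l=1}^{s}\Big(\sum_{t=1}^{r}\sum_{u=1}^{r} a^{(k)}_{N,R^{(k)}_{lt}}\,a^{(k')}_{N,R^{(k')}_{lu}} - \sum_{t=1}^{r} a^{(k)}_{N,R^{(k)}_{lt}}\,a^{(k')}_{N,R^{(k')}_{lt}}\Big) + \sum_{l\ne m=1}^{s}\frac{1}{r^2}\sum_{t=1}^{r} a^{(k)}_{N,R^{(k)}_{lt}}\sum_{u=1}^{r} a^{(k')}_{N,R^{(k')}_{mu}}\Big]$$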
The expected value in the first term in (2.8) above contains products of scores in the same block but different treatments. Such a product occurs with probability $1/(r(r-1))$ under $\mathcal P_s$. The expected value in the second term consists of products of scores in different blocks and different treatments. Hence,
$$\mathrm{Cov}_{\mathcal P_s}\big(T^{(k)}_{N,j}, T^{(k')}_{N,j}\big) = E_{\mathcal P_s}\big(T^{(k)}_{N,j}\,T^{(k')}_{N,j}\big) - \bar a_N^{(k)}\,\bar a_N^{(k')}.$$
Also, for $j \ne j'$ we have
•
Denote the marginal c.d.f. of $U_{ij}^{(k)*}$ and of $(U_{ij}^{(k)*}, U_{il}^{(k')*})$ by $G^{(k)}_{i[j]}(x)$ and $G^{(k,k')}_{i[j,l]}(x,y)$ respectively, for $j,l = 1,\dots,r$, $k,k' = 1,\dots,p+q$, with at least one of $j \ne l$, $k \ne k'$ being true, and let
$$\bar H^{(k)}_N(x) = \frac{1}{sr}\sum_{i=1}^{s}\sum_{j=1}^{r} G^{(k)}_{i[j]}(x), \qquad \text{for } k = 1,\dots,p+q.$$
Define the monotone transformation
W tJ" -bkk l .J'I'
,t -
(W(l)
W(p+q)),·· i j ' ... ' i j
,J -
k , k' -- "
1 ... p + q,J' - 1"
... ,
r'
' ))
E(W(k)W(k
ij
iJ
'
-1"
.L..J b
1,,"', r, Z. -- 1, " ' , s,.
S
(s) bkk'.jl-S
kk'.jl,i
k , k' -- 1,"',p+q, J. -- 1,"',r,.
i=l
$$\nu_{kk',N} = \frac{1}{r}\sum_{j=1}^{r} b^{(s)}_{kk'.jj} - \frac{1}{r^2}\sum_{j=1}^{r}\sum_{l=1}^{r} b^{(s)}_{kk'.jl}, \qquad k,k' = 1,\dots,p+q; \tag{2.9}$$
Finally let
$$\nu_N = ((\nu_{kk',N}))_{k,k'=1,\dots,p+q}. \tag{2.10}$$
Now by Lemma 7.3.10 of Puri and Sen (1971), $V_N \stackrel{p}{\sim} \nu_N$, and $V_N$ is positive definite in probability when $\nu_N$ is positive definite. Now, using this fact and conditions (a), (b), and (c) we arrive at the following theorem.
Theorem 2.2 For each $j = 1,\dots,r$, the $1 \times (p+q)$ row-vector of $T_N$ in (2.4) is asymptotically normal with mean equal to $(\bar a_N^{(1)},\dots,\bar a_N^{(p+q)})$, and dispersion matrix $c_{jj}\,\nu_N$, where $c_{jj}$ is the diagonal element of $C_N$.
For proof see Theorem 7.3.3 of Puri and Sen (1971). For each j separately, $j = 1,\dots,r$, let us partition the corresponding $1 \times (p+q)$ row vector of $T_N$ in (2.4) into two components after subtracting the mean; the first component, $T_{N0,j}$, is a $q \times 1$ vector of the centered linear rank statistics corresponding to the surrogate variables, that is
$$T_{N0,j} = \big(T^{(1)}_{N,j} - \bar a_N^{(1)},\dots,T^{(q)}_{N,j} - \bar a_N^{(q)}\big)',$$
and the second component, $T^0_{N,j}$, is a $p \times 1$ vector of centered linear rank statistics corresponding to the concomitant variates, that is
$$T^0_{N,j} = \big(T^{(q+1)}_{N,j} - \bar a_N^{(q+1)},\dots,T^{(p+q)}_{N,j} - \bar a_N^{(p+q)}\big)'.$$
Similarly, we partition the matrix $V_N$ as follows
$$V_N = \begin{pmatrix} V_{N00} & V_{N0+} \\ V'_{N0+} & V_{N++} \end{pmatrix} \tag{2.11}$$
where $V_{N00}$ is a $q \times q$ matrix, $V_{N0+}$ is $q \times p$ and $V_{N++}$ is a $p \times p$ matrix. Then, from
the classical normal theory, (see Theorem 2.5.1 of Anderson (1984)), for large N we
have
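Define the $q \times 1$ residual rank-statistics and their dispersion matrix
$$T^*_{N0,j} = T_{N0,j} - V_{N0+}V_{N++}^{-1}T^0_{N,j}, \qquad V^*_{N00} = V_{N00} - V_{N0+}V_{N++}^{-1}V'_{N0+}. \tag{2.12}$$
Now, for each $j = 1,\dots,r$, obtain similarly a residual vector, and consider the $q \times r$ residual matrix corresponding to the different treatments
$$T^*_{N0} = (T^*_{N0,1},\dots,T^*_{N0,r}) \tag{2.13}$$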
where
and
Moreover, the dispersion matrix of $s^{1/2}\mathrm{Vec}(T^*_{N0})$ is the $qr \times qr$ matrix $C_N \otimes V^*_{N00}$. It may be noted that censoring can also be incorporated in the design and we would then use the censored version of the score functions as in (1.21).
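The covariate adjustment that produces the residual rank statistics has the usual regression form: subtract from the response block its linear projection on the covariate block, and reduce the dispersion accordingly. The sketch below illustrates this for one treatment with exact rational arithmetic; the helper `solve`, the function name `residualize`, and all numerical values are hypothetical and serve only to show the algebra.

```python
from fractions import Fraction

def solve(A, B):
    """Solve A X = B for a nonsingular square A by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(x) for x in list(A[i]) + list(B[i])] for i in range(n)]
    for c in range(n):
        piv = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for i in range(n):
            if i != c:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[c])]
    return [row[n:] for row in M]

def residualize(T0, Tp, V00, V0p, Vpp):
    """Return T0 - V0p Vpp^{-1} Tp and V00 - V0p Vpp^{-1} V0p'."""
    q, p = len(T0), len(Tp)
    w = solve(Vpp, [[t] for t in Tp])                                   # p x 1
    G = solve(Vpp, [[V0p[a][b] for a in range(q)] for b in range(p)])   # p x q
    T_star = [T0[a] - sum(V0p[a][b] * w[b][0] for b in range(p)) for a in range(q)]
    V_star = [[V00[a][c] - sum(V0p[a][b] * G[b][c] for b in range(p))
               for c in range(q)] for a in range(q)]
    return T_star, V_star

# One surrogate (q = 1) adjusted for two covariates (p = 2); values are illustrative.
T_star, V_star = residualize(T0=[3], Tp=[1, 2],
                             V00=[[2]], V0p=[[1, 1]], Vpp=[[2, 1], [1, 2]])
```

Solving the linear system instead of forming an explicit inverse keeps the sketch short; the result is the same as applying the inverse of the covariate block.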
2.2.2 RBD For The Validation Set I*
Consider a two-way layout with $n^*$ blocks of r plots each where r different treatments are randomly assigned to the plots. Since there are no replicates the cardinality of $I^*$ is $N_v (= n^* r)$, say, thus allowing the number of subjects in the validation set to be different from the number in the surrogate set. The response in the $i$th block receiving the $j$th treatment is a $(p+q+1)$-vector
$$U_{ij} = (Y_{ij}, X_{ij}^{(1)},\dots,X_{ij}^{(q)}, Z_{ij}^{(1)},\dots,Z_{ij}^{(p)})'$$
of measurements corresponding to the primary variate $Y_{ij}$, the p covariates and the q surrogates.
Consider the same model and assumptions as in (2.1) with the only difference that now $\epsilon_{11},\dots,\epsilon_{n^* r}$ are all $(p+q+1)$-vectors. That is to say we regress the primary variate and surrogates on the concomitant variates. The method of ranking after alignment will also be used here. Proceeding as in the proof of Theorem 2.1 with $N$ replaced by $N_v$, and $s$ by $n^*$, so that $k$ ranges now from 1 to $p+q+1$ and hence $V_{N_v}$ is a $(p+q+1) \times (p+q+1)$ matrix, then, corresponding to (2.11) we have
$$V_{N_v} = \begin{pmatrix} V_{N_v00} & V_{N_v0+} \\ V'_{N_v0+} & V_{N_v++} \end{pmatrix} \tag{2.14}$$
where $V_{N_v00}$ is a $(q+1) \times (q+1)$ matrix, $V_{N_v0+}$ is $(q+1) \times p$ and $V_{N_v++}$ is a $p \times p$ matrix. Moreover, corresponding to (2.13) let the $(q+1) \times 1$ residual rank-statistics and dispersion matrix be
$$T^*_{N_v0,j} = T_{N_v0,j} - V_{N_v0+}V_{N_v++}^{-1}T^0_{N_v,j}, \qquad V^*_{N_v00} = V_{N_v00} - V_{N_v0+}V_{N_v++}^{-1}V'_{N_v0+}. \tag{2.15}$$
Now, for each $j = 1,\dots,r$, we obtain similarly a residual vector. Consider the $(q+1) \times r$ residual matrix corresponding to the different treatments
$$T^*_{N_v0} = (T^*_{N_v0,1},\dots,T^*_{N_v0,r}). \tag{2.16}$$
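Partition the residual matrix in (2.16) and the $(q+1) \times (q+1)$ dispersion matrix in (2.15) as follows:
$$T^*_{N_v0} = \begin{pmatrix} T^*_{N_vY} \\ T^*_{N_vX} \end{pmatrix}, \qquad V^*_{N_v00} = \begin{pmatrix} v_{N_vYY} & v_{N_vYX} \\ v'_{N_vYX} & V_{N_vXX} \end{pmatrix},$$
where $v_{N_vYY}$ is $1 \times 1$, $v_{N_vYX}$ is $1 \times q$, and $V_{N_vXX}$ is $q \times q$. Also let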
(2.17)
(2.18)
Note that the dispersion matrix of the $1 \times r$ vector $n^{*1/2}\,T^*_{N_v0(Y:X)}$ is given by $v^*_{N_v(Y:X)}\,C_{N_v}$, where the elements of $C_{N_v}$ are given by (2.6) with $N$ replaced by $N_v$.
2.2.3 RBD For The Set $I^0$
Consider a two-way layout with $n^0$ blocks of r plots each where r different treatments are randomly assigned to the plots. Since there are no replicates the cardinality of $I^0$ is $N^0 (= n^0 r)$. The response in the $i$th block receiving the $j$th treatment is a $(p+1)$-vector $U_{ij} = (Y_{ij}, Z_{ij}^{(1)},\dots,Z_{ij}^{(p)})'$ of measurements corresponding to the primary variate $Y_{ij}$, and the p covariates.
Consider the same model and assumptions as in (2.1) with the only difference that now $\epsilon_{11},\dots,\epsilon_{n^0 r}$ are all $(p+1)$-vectors. The method of ranking after alignment will also be used here. Proceeding as in the proof of Theorem 2.1 with $N$ replaced by $N^0$, and $s$ by $n^0$, also with $k$ ranging from 1 to $p+1$, $V_{N^0}$ is a $(p+1) \times (p+1)$ matrix which is partitioned as we partitioned $V_N$. Next partition $V_{N^0}$ as
$$V_{N^0} = \begin{pmatrix} v_{N^0 00} & v_{N^0 0+} \\ v'_{N^0 0+} & V_{N^0 ++} \end{pmatrix}$$
where $v_{N^0 00}$ is $1 \times 1$, $v_{N^0 0+}$ is $1 \times p$, and $V_{N^0 ++}$ is $p \times p$. Moreover, corresponding to (2.12) let the (p + 1) x 1 residual rank-statistics and its dispersion matrix be
(2.19)
Now, for each $j = 1,\dots,r$, obtain similarly a residual rank statistic. Finally, let the $1 \times r$ residual vector corresponding to the different treatments be
(2.20)
Denote by $v^*_{N^0 00}\,C_{N^0}$ the dispersion matrix of $n^{0\,1/2}\,T^*_{N^0 0}$, where $C_{N^0}$ is the $r \times r$ matrix whose elements are given by (2.6) with $N$ replaced by $N^0$.
2.2.4 Construction of the Test Statistics
The next step consists of combining information from all subsets. Since the tests
developed in the different subsets are independent, we will take a weighted linear
combination of these tests with the weights being the inverses of the corresponding
dispersion matrices. Note that for such a combination to make sense the dimensions
of the tests should be the same.
One possible way to get around this difficulty would be to reduce the order of the $q \times r$ matrix in (2.13) into a $1 \times r$ vector by taking a linear combination of this matrix, say $aT^*_{N0}$, where $a$ is of order $1 \times q$, subject to the two restrictions, i) $aa' = 1$, and ii) the variance given in (2.12) is minimum. Although by doing this we are losing information, the minimum variance condition maximizes the noncentrality parameter of the chi-square distribution of the quadratic form based on the transformed residual vector, which in turn maximizes the power of this test. Moreover, $a$ can be ordered if there is inherent ordering in the surrogate vector. Note that
$$T^*_{N0(X)} = aT^*_{N0} = (aT^*_{N0,1},\dots,aT^*_{N0,r})$$
with dispersion $v_{NX}\,C_N$, where $v_{NX} = aV^*_{N00}a'$. To see this,
$$\mathrm{Cov}(aT^*_{N0,j}, aT^*_{N0,j'}) = a\,\mathrm{Cov}(T^*_{N0,j}, T^*_{N0,j'})\,a' = c_{jj'}\,aV^*_{N00}a',$$
where $c_{jj'}$ is the corresponding element of $C_N$. Let $A_1 = v_{NX}\,C_N$, $A_2 = v^*_{N_v(Y:X)}\,C_{N_v}$, and $A_3 = v^*_{N^0 00}\,C_{N^0}$. Let
$$W_i = [A_1^{-1} + A_2^{-1} + A_3^{-1}]^{-1}A_i^{-1}, \qquad i = 1, 2, 3. \tag{2.21}$$
Now, noting that the mean of each of the three statistics $T^*_{N0(X)}$, $T^*_{N_v0(Y:X)}$, and $T^*_{N^0 0}$ is 0, following the Gauss-Markov Theorem we consider the linear combination
$$T^{0*} = W_1T^{*\prime}_{N0(X)} + W_2T^{*\prime}_{N_v0(Y:X)} + W_3T^{*\prime}_{N^0 0}. \tag{2.22}$$
Assume that $n^*/s \to \rho_1$ and $n^0/s \to \rho_2$ for positive and finite real $\rho_1$ and $\rho_2$. Then $s^{1/2}T^{0*}$ is conditionally distribution-free, under the permutational model, with 0 mean and
covariance matrix
and the permutational multivariate central limit theorem of Sen (1983) applies here.
Thus, for large $N$, $N_v$, and $N^0$, $s^{1/2}T^{0*} \sim N_r(0, A)$. Finally consider the overall test statistic
(2.23)
The exact permutational distribution of $L^{0*}$ can be obtained by enumeration if $N$, $N_v$, and $N^0$ are small to moderate. It may be noted that $\mathrm{rank}(C_N) = \mathrm{rank}(C_{N_v}) = \mathrm{rank}(C_{N^0}) = r - 1$. Thus for large $N$, $N_v$, and $N^0$ the null distribution of $L^{0*}$ by Cochran (1934) is asymptotically a chi-squared with $r - 1$ degrees of freedom since it is a quadratic form of asymptotically normal variates with discriminant of rank $r - 1$.
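The weighting idea behind (2.21) and (2.22) can be illustrated in a simplified setting. Each $A_i$ is a scalar multiple of a common singular matrix of the form $rI - J$, so the sketch below assumes that the inverses are understood as generalized inverses, in which case the matrix weights reduce to scalar inverse-variance weights. That reduction, the function names, and the numbers are assumptions made for illustration and are not taken from the original derivation.

```python
from fractions import Fraction

def inverse_variance_weights(scales):
    """Scalar weights w_i = (1/c_i) / sum_m (1/c_m), assuming A_i = c_i * C0."""
    inv = [1 / Fraction(c) for c in scales]
    total = sum(inv)
    return [x / total for x in inv]

def combine(statistics, weights):
    """Weighted linear combination of equal-length statistic vectors."""
    r = len(statistics[0])
    return [sum(w * T[j] for w, T in zip(weights, statistics)) for j in range(r)]

# Hypothetical scale factors for the three subsets and three 1 x r statistics.
weights = inverse_variance_weights([1, 2, 4])
combined = combine([[1, -1, 0], [2, -2, 0], [4, -4, 0]], weights)
```

The statistic with the smallest dispersion receives the largest weight, which is the sense in which the combination follows the Gauss-Markov argument.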
2.2.5 Asymptotic Non-null Distribution of $L^{0*}$
For the study of the non-null distribution of $L^{0*}$ we will consider local alternatives only. Fixed alternatives lead to consistent tests in the sense that the power goes to one in large samples, which makes it hard to compare them. Thus they will not be considered further. Let us first work with the surrogate set $I$. Write $G_i(u) = G_i(x, z)$, $x \in R^{rq}$, $z \in R^{rp}$, and consider the following sequence of alternative hypotheses:
(2.24)
where $\lambda = ((\lambda_j^{(k)}))$ stands for an $r \times q$ matrix of treatment effects. Let $G^{(k)}_{s[j]} = s^{-1}\sum_{i=1}^{s} G^{(k)}_{i[j]}$, $1 \le j \le r$, $1 \le k \le q$, and let
Thus, under {HN }
j=l,···,r, k=l,···,q;
Assume that $G^{(k)}_{i[j]}(x)$, $j = 1,\dots,r$, $k = 1,\dots,q$, $i = 1,\dots,s$ are all absolutely continuous. Moreover if each $G_i^{(k)} \to G^{(k)}$ then $\lim_{s\to\infty} s^{-1}\sum_{i=1}^{s} G_i^{(k)}(x) = G^{(k)}(x)$. Also, asymptotically $\bar H^{(k)}_N(x)$ is equal to $G^{(k)}(x)$, which is the limit of $G^{(k)}_{s[j]}(x)$. Let $g^{(k)}(x)$ be the density function corresponding to $G^{(k)}(x)$.
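Now expand $G^{(k)}(x - N^{-1/2}\lambda_j^{(k)})$ in a Taylor series around $x$ for a fixed $x$. We have for large $N$
$$\mu^{(k)}_{N,j} = \int_{-\infty}^{\infty} J^{(k)}[\bar H^{(k)}_N(x)]\,dG^{(k)}_{s[j]}(x)$$
$$= \int_{-\infty}^{\infty} J^{(k)}[G^{(k)}(x)]\,dG^{(k)}(x - N^{-1/2}\lambda_j^{(k)}) + o(N^{-1/2})$$
$$= \int_{-\infty}^{\infty} J^{(k)}[G^{(k)}(x)]\,dG^{(k)}(x) + \int_{-\infty}^{\infty} J^{(k)}[G^{(k)}(x)]\,d\big[G^{(k)}(x - N^{-1/2}\lambda_j^{(k)}) - G^{(k)}(x)\big] + o(N^{-1/2}).$$
Integrate by parts the second integral in the last term:
$$\mu^{(k)}_{N,j} = \int_0^1 J^{(k)}(u)\,du - \int_{-\infty}^{\infty}\big[G^{(k)}(x - N^{-1/2}\lambda_j^{(k)}) - G^{(k)}(x)\big]\frac{d}{dx}J^{(k)}(G^{(k)}(x))\,dx + o(N^{-1/2})$$
$$= \int_0^1 J^{(k)}(u)\,du + N^{-1/2}\lambda_j^{(k)}\int_{-\infty}^{\infty}\frac{d}{dx}J^{(k)}(G^{(k)}(x))\,g^{(k)}(x)\,dx + o(N^{-1/2})$$
$$= \int_0^1 J^{(k)}(u)\,du + N^{-1/2}\lambda_j^{(k)}\int_{-\infty}^{\infty}\frac{d}{dx}J^{(k)}(G^{(k)}(x))\,dG^{(k)}(x) + o(N^{-1/2}),$$
where the last two steps resulted from a Taylor series expansion of $G^{(k)}(x - N^{-1/2}\lambda_j^{(k)})$ around $x$ for a fixed $x$. Thus, if we let
$$B(G^{(k)}) = \int_{-\infty}^{\infty}\frac{d}{dx}J^{(k)}(G^{(k)}(x))\,dG^{(k)}(x), \tag{2.25}$$
then from the above we have
$$N^{1/2}\Big[\mu^{(k)}_{N,j} - \int_0^1 J^{(k)}(u)\,du\Big] \to \lambda_j^{(k)}B(G^{(k)}), \qquad j = 1,\dots,r, \quad \text{as } N \to \infty.$$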
By Theorem (7.3.12) of Puri and Sen (1971), $[N^{1/2}(T^{(k)}_{N,j} - \mu^{(k)}_{N,j}),\ j = 1,\dots,r,\ k = 1,\dots,p+q]$ has asymptotically a multinormal distribution with null mean and dispersion matrix that converges to the same dispersion matrix under the null hypothesis, namely $\nu_N$. Also by condition (a) of (2.2.1) we have
Thus in the light of the above discussion
(2.26)
Now, by Lemma (7.3.10) and by the convergence model in Section (7.2.4) of Puri and Sen (1971), $\nu_N \sim \nu$ defined in (2.10) with $b_{kk'.jj'}$ all defined for the limiting distributions of the average c.d.f.'s. Let $\nu_I$ be the version of $\nu$ in the surrogate set. Partition $\nu_I$ as in (2.11) and corresponding to (2.12) and (2.13) let
$$T^*_{N0} = T_{N0} - \nu_{I0+}\,\nu_{I++}^{-1}\,T^0_N.$$
Thus, let $\eta_I = ((\eta_j^{(k)}))$ where $\eta_j^{(k)} = \lambda_j^{(k)}B(G^{(k)})$, $j = 1,\dots,r$, $k = 1,\dots,q$. Partition $\eta_I$ into $\eta_{I0}$ corresponding to the surrogates, and $\eta_I^0$ corresponding to the covariates. Let $T^*_{N0(X)} = aT^*_{N0}$, then for large $n$
$$E\big(s^{1/2}T^*_{N0(X)} \mid H_N\big) \to a\big(\eta_{I0} - \nu_{I0+}\,\nu_{I++}^{-1}\,\eta_I^0\big)' = \mu^*_{I0}$$
and its dispersion matrix is $\nu^*_{IX}\,C_N$ where $\nu^*_{IX} = a\,\nu^*_{I00}\,a'$.
= avjooa'.
In a similar fashion, define {HN} for the validation set 1* such that ((A)k))) is
now an r x (q
+ 1)
matrix. Also, 111. is defined in the same way as 111 with the
49
difference that k
validation set
= 1,· .. ,q + 1. V N v
]*.
rv
VI*
which is the version of
VI
defined in the
Thus, corresponding to (2.14) we have
Partition the r x (q + 1) expected value matrix "71* and the (q + 1) x (q + 1) dispersion
matrix
V 1*
as follows
*
v I • OO
where
vj*yy
is 1 x 1,
=
(
vj*yy
*'
vI*YX
is 1 x q, and
vj.yX
· matrIX
. 0f
an d t he d·
IsperSlOn
n
vj.xx
*1/2 T *
NvO(Y:X)
.
IS
is q x q. Hence, for large
*
C
vI*(Y:X) 1*
h
were
Finally, along the same lines, we define $\{H_N\}$ for the subset $I^0$ such that $\lambda$ is now an $r \times 1$ vector. Also, $\eta_{I^0}$ is defined in the same way as $\eta_I$. $V_{N^0} \sim \nu_{I^0}$, which is the version of $\nu_I$ defined in the subset $I^0$. Thus, for large $n$ the expected value of $n^{0\,1/2}\,T^*_{N^0 0}$ under $\{H_N\}$ converges to $\mu^*_{I^0 0} = \eta_{I^0}$ with dispersion matrix $\nu^*_{I^0 00}\,C_{I^0}$.
Consider the weights $W_i$ defined in (2.21) where now we have $A_1 = \nu^*_{IX}\,C_N$, $A_2 = \nu^*_{I^*(Y:X)}\,C_{N_v}$, and $A_3 = \nu^*_{I^0 00}\,C_{N^0}$. Thus, under $\{H_N\}$, the mean of $n^{1/2}T^{0*}$ in (2.22) is for large $n$
Therefore, from Theorem (7.3.12), Lemma (7.3.10), and Theorem (2.8.2) of Puri and Sen (1971) and under $\{H_N\}$, $L^{0*}$ defined in (2.23) has asymptotically a noncentral chi-squared distribution with $r - 1$ degrees of freedom and the noncentrality parameter
Chapter 3
Nonparametric Intra-Block Inference
Blocking in experimental design is done in order to use experimental units that are as nearly homogeneous as possible. Complete block designs lose their efficiency due to the failure to eliminate heterogeneity among units when the number of treatments to be compared is not small. In such cases, incomplete block designs are used where experimental units are divided into blocks containing fewer units than the number of treatments to be compared. Comparisons of treatments with equal accuracy may require an equal number of replicates and other constraints, leading to balanced incomplete block designs (BIBD).
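The balance constraints can be checked mechanically for a proposed layout: count in how many blocks each treatment occurs and in how many blocks each pair of treatments occurs together. The sketch below does this for a small hypothetical design with three treatments in blocks of size two; the function name and the design are illustrative assumptions.

```python
from itertools import combinations

def bibd_counts(blocks, v):
    """Return replication counts per treatment and concurrence counts per pair."""
    reps = {j: sum(j in b for b in blocks) for j in range(1, v + 1)}
    conc = {pair: sum(set(pair).issubset(b) for b in blocks)
            for pair in combinations(range(1, v + 1), 2)}
    return reps, conc

# Hypothetical design: 3 blocks of size 2 on v = 3 treatments.
blocks = [(1, 2), (1, 3), (2, 3)]
reps, conc = bibd_counts(blocks, v=3)

# Balanced when every treatment is equally replicated and every pair
# of treatments occurs together in the same number of blocks.
is_balanced = len(set(reps.values())) == 1 and len(set(conc.values())) == 1
```

Here each treatment appears in two blocks and each pair shares exactly one block, so the layout satisfies both constraints.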
3.1 Balanced Incomplete Block Designs
3.1.1 BIBD for Surrogate Set I
Consider n replications of a BIBD consisting of s blocks of constant size $r (\ge 2)$ to which v treatments are applied such that:
(i) No treatment occurs more than once in any block,
(ii) The $j$th treatment occurs in $r_j (\le s)$ blocks, $j = 1,\dots,v$, and
(iii) The $(j, j')$th treatments occur together in $r_{jj'} (> 0)$ blocks $(j \ne j' = 1,\dots,v)$.
Let $S_i$ stand for the set of treatments occurring in the $i$th block, $i = 1,\dots,s$. For the
$\alpha$th replicate, the response in the $i$th block receiving the $j$th treatment is a stochastic $(p+q)$-vector
$$U_{\alpha ij} = (X_{\alpha ij}^{(1)},\dots,X_{\alpha ij}^{(q)}, Z_{\alpha ij}^{(1)},\dots,Z_{\alpha ij}^{(p)})' = (X'_{\alpha ij}, Z'_{\alpha ij})'$$
of measurements corresponding to the p covariates and the q surrogates. Consider the model
$$U_{\alpha ij} = \mu_\alpha + \beta_{\alpha i} + \tau_j + \epsilon_{\alpha ij}, \qquad j \in S_i, \quad i = 1,\dots,s, \quad \alpha = 1,\dots,n, \tag{3.1}$$
where $\mu_\alpha$ is the replicate effect, $\beta_{\alpha i}$ is the block effect, $\tau_1,\dots,\tau_v$ are the treatment effects, and $\epsilon_{\alpha ij}$ are the error vectors. The following assumptions are made:
a) $U_{\alpha i} = (U_{\alpha ij}, j \in S_i)$ has a continuous $r(p+q)$-variate c.d.f. $G_i(u)$, $u \in R^{r(p+q)}$, $i = 1,\dots,s$.
b) The joint c.d.f. of $[Z_{\alpha ij}, j \in S_i]$ is symmetric in its r p-vectors. This is the concomitance assumption in ANOCOVA.
c) $[\epsilon_{\alpha ij}, j \in S_i]$ have a jointly continuous c.d.f. $G(z_1,\dots,z_r)$ which is symmetric in its r vectors. This includes the i.i.d. assumption of the distribution of $[\epsilon_{\alpha ij}, j \in S_i, i = 1,\dots,s]$ as a special case.
d) The joint distribution of $[Z_{\alpha ij}, \epsilon_{\alpha ij}, j \in S_i]$ is exchangeable and is the same for all $i = 1,\dots,s$.
Note that $\tau_j$ is a $(p+q)$-vector with two components: $\tau_{j1}$ is a q-vector of treatment effects corresponding to the surrogates, and $\tau_{j2}$ a p-null vector of treatment effects for the covariates, since we assume no interaction between treatment and concomitant variates. We wish to test the null hypothesis, $H_{0w}$,
$$H_{0w}:\ \tau_{j1} = 0, \qquad j = 1,\dots,v,$$
while the set of alternatives relates to shifts in location due to treatment effects. To adopt the method of ranking after alignment [see Chapter 7 of Puri and Sen (1971)], define the aligned observations by
$$U^*_{\alpha ij} = U_{\alpha ij} - r^{-1}\sum_{l \in S_i} U_{\alpha il}, \qquad j \in S_i, \quad i = 1,\dots,s, \quad \alpha = 1,\dots,n.$$
Let
$$\tau_{j,i} = \tau_j - r^{-1}\sum_{l \in S_i}\tau_l, \qquad j \in S_i, \quad i = 1,\dots,s,$$
$$e_{\alpha ij} = \epsilon_{\alpha ij} - r^{-1}\sum_{l \in S_i}\epsilon_{\alpha il}, \qquad j \in S_i, \quad i = 1,\dots,s, \quad \alpha = 1,\dots,n.$$
Then we have
$$U^*_{\alpha ij} = \tau_{j,i} + e_{\alpha ij}, \qquad j \in S_i, \quad i = 1,\dots,s, \quad \alpha = 1,\dots,n.$$
Thus if $F(z_1,\dots,z_r)$ is the c.d.f. of $(e_{\alpha ij}, j \in S_i)$, and if $\mathcal F_{r(p+q)}$ stands for the class of all $G(z_1,\dots,z_r)$ which are symmetric in their r $(p+q)$-vectors, then
$$F \in \mathcal F_{r(p+q)}. \tag{3.2}$$
Moreover, the joint distribution of $(Z^*_{\alpha ij}, e_{\alpha ij}, j \in S_i)$ is also exchangeable. For the $k$th variate, rank the observations $U^{(k)*}_{111},\dots,U^{(k)*}_{nsr}$ in ascending order of magnitude and denote by $R^{(k)}_{\alpha ij}$ the rank of $U^{(k)*}_{\alpha ij}$ in this set, for $i = 1,\dots,s$, $j \in S_i$, $\alpha = 1,\dots,n$, and $k = 1,\dots,p+q$. Thus, corresponding to $U^*_{\alpha ij} = (U^{(1)*}_{\alpha ij},\dots,U^{(p+q)*}_{\alpha ij})'$, we have a rank vector $R_{\alpha ij} = (R^{(1)}_{\alpha ij},\dots,R^{(p+q)}_{\alpha ij})'$, $i = 1,\dots,s$, $j \in S_i$, and $\alpha = 1,\dots,n$.
For each $N (= nsr)$ and each $k = 1,\dots,p+q$, we define suitable rank scores $a_N^{(k)} = (a^{(k)}_{N,1},\dots,a^{(k)}_{N,N})$, where $a^{(k)}_{N,j} = J_N^{(k)}(j/(N+1))$, $1 \le j \le N$. Moreover, $J_N^{(k)}(u)$ is defined in accordance with the Chernoff-Savage convention, i.e., we assume that $J_N^{(k)}(u)$ satisfies the conditions a, b, c of Section 2.2.1. For notational simplicity, let
$$\eta^{(k)}_{\alpha ij} = a^{(k)}_{N,R^{(k)}_{\alpha ij}}, \qquad j \in S_i, \quad i = 1,\dots,s, \quad \alpha = 1,\dots,n,$$
$$\eta^{(k)}_{\alpha i\cdot} = r^{-1}\sum_{j \in S_i}\eta^{(k)}_{\alpha ij}, \qquad \eta^{(k)}_{\alpha\cdot\cdot} = s^{-1}\sum_{i=1}^{s}\eta^{(k)}_{\alpha i\cdot}, \qquad \text{and} \qquad \eta^{(k)}_{\cdot\cdot\cdot} = n^{-1}\sum_{\alpha=1}^{n}\eta^{(k)}_{\alpha\cdot\cdot}$$
for $k = 1,\dots,p+q$. The proposed test is based on the statistics
$$T^{(k)}_{N,j} = \frac{1}{n}\sum_{\alpha=1}^{n}\sum_{i \in P_j}\eta^{(k)}_{\alpha ij}, \qquad j = 1,\dots,v, \quad k = 1,\dots,p+q,$$
where $P_j = \{i : j \in S_i\}$, $j = 1,\dots,v$. Let
$$T_N = ((T^{(k)}_{N,j}))_{j=1,\dots,v;\ k=1,\dots,p+q}.$$
Tests based on aligned ranks are only conditionally distribution free. Define $U^*_{\alpha i} = (U^*_{\alpha ij}, j \in S_i)$, $U^*_\alpha = (U^*_{\alpha 1},\dots,U^*_{\alpha s})$, $\alpha = 1,\dots,n$, and $U^*_N = (U^*_1,\dots,U^*_n)$. Hence, by (3.2) we have the following three arms of the permutational law:
(i) The joint distribution of $U^*_{\alpha i}$ remains invariant under any permutation of the r vectors among themselves, there being $r!$ such permutations, $i = 1,\dots,s$,
(ii) As $U^*_{\alpha i}$, $i = 1,\dots,s$ are i.i.d., the joint distribution of $U^*_\alpha$ remains invariant under any permutation of the s sets $U^*_{\alpha i}$, $i = 1,\dots,s$, among themselves, there being $s!$ such permutations,
(iii) As replicates are independent, the joint distribution of $U^*_N$ remains invariant under any permutation of the n sets $U^*_\alpha$, $\alpha = 1,\dots,n$ among themselves.
Thus, if we define $\mathcal G_n$ to be the compound group of transformations $\{g_n\}$ by
$$g_n U^*_N = U^0_N = [g_n^{(1)}U^*_1,\dots,g_n^{(n)}U^*_n], \qquad g_n^{(\alpha)} \in \mathcal G_n, \quad \alpha = 1,\dots,n,$$
then $\mathcal G_n$ contains $[s!(r!)^s]^n$ transformations, and under $H_{0w}$ it leaves the joint distribution of $U^*_N$ invariant. Let $S(U^*_N) = \{U^0_N = g_nU^*_N : g_n \in \mathcal G_n\}$. Then, it follows from the above discussion that
$$P[U^*_N = u^0 \mid S(U^*_N), H_{0w}] = 1/N^*; \qquad N^* = [s!(r!)^s]^n,$$
for all $u^0 \in S(U^*_N)$, whenever $G \in \mathcal F_{r(p+q)}$. Let us denote this conditional (permutational) probability measure by $\mathcal P_n$. As $\mathcal P_n$ is completely specified, the existence of conditionally distribution-free tests for $H_{0w}$ is thus established.
"
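Theorem 3.1 Let
$$v^{(1)}_{N,kk'} = \frac{1}{nsr}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j \in S_i}\big[\eta^{(k)}_{\alpha ij} - \eta^{(k)}_{\alpha i\cdot}\big]\big[\eta^{(k')}_{\alpha ij} - \eta^{(k')}_{\alpha i\cdot}\big], \qquad k,k' = 1,\dots,p+q,$$
and
$$v^{(2)}_{N,kk'} = \frac{1}{ns}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\big[\eta^{(k)}_{\alpha i\cdot} - \eta^{(k)}_{\alpha\cdot\cdot}\big]\big[\eta^{(k')}_{\alpha i\cdot} - \eta^{(k')}_{\alpha\cdot\cdot}\big], \qquad k,k' = 1,\dots,p+q,$$
$V_N^{(1)} = ((v^{(1)}_{N,kk'}))$ and $V_N^{(2)} = ((v^{(2)}_{N,kk'}))$. Also, let $A^{(1)} = ((a^{(1)}_{jj'}))$ and $A^{(2)} = ((a^{(2)}_{jj'}))$, where
$$a^{(1)}_{jj'} = \delta_{jj'}\,r_j - (1 - \delta_{jj'})\,\frac{r_{jj'}}{r-1}, \qquad a^{(2)}_{jj'} = \frac{s\,r_{jj'} - r_jr_{j'}}{s-1}, \qquad j,j' = 1,\dots,v,$$
with the convention $r_{jj} = r_j$,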
where $\delta_{jj'}$ is the usual Kronecker delta. Finally, let
B N is a (p + q)v x (p + q)v matrix. Then
Proof: Note that for a given
0:',0:'
= 1," .
,n, under rn the probability that i
any of 1" .. ,s is ~, and the probability that j E Si is ~, hence
1
~
"" E pn ( "lOlij
(k»)
;:; L.J L.J
. 01=1 iePj
H; i~ Ur t, j~, ~~~l)
r J'/...
.71(k) i = 1 .. , , s , J' = 1" ... v , k = 1, ... ,p + q.
will be
= Hi, i')
¥oreover, let A
of A is
: i E Pj, i' E Pj" i =I- i'}, and note that then the cardinality
1) if j = j'). Then
rjrj' - rjA= rj(rj -
nEPn
[r(k)
N,) -
E
(r(k))]
[r(k')
N,)
N,) -
1
(r(k'))]
N,)
[( 1~"(
(1
/
~"
X - LL- ((k')
1]{3i'j - 1]{(k
3.. )))]
n 01=1 iEPj
n {3=1 i'EPj
n
I: I: E 1]~:} -1]i~~) (1]~:; - 1]i~:))]
nEpn
n-
E
-
L- L-
(k) 1]OIij
(k)))
1]01..
pn [(
01=1 iEP}
n
+n- 1 "L-
"
L-
E Pn [(1](~).
at)
_1](k))
01..
(1](~;)
_1](k ' ))]
01' )
01 ..
01=1 (i,i')EA
n
+n
-1
""
L-
L- E pn
[(
(k) - 1]01..
(k)) ((k')
1]OIij
1]{3ij - 'fJ{3(k'))]
..
0I#{3=1 iEPj
+n -1""
LLn
E pn
[(
(k) - 'fJ (k))
((k')
'fJOIij
'fJ{3i'j 0I ..
1]{(k'))]
3..
0I#{3=1 (i,i')EA
Note that the last two terms are null because the replicates are independent and $E_{\mathcal P_n}\big(\eta^{(k)}_{\alpha ij} - \eta^{(k)}_{\alpha\cdot\cdot}\big) = 0$. The first term yields
n
n
-1 "
L- "L- E pn
[(
(k) - 'fJ (k))
((k')
(k'))]
'fJOIij
'fJOIij - 1]01
..
0I ..
01=1 iEP)
.
The second term gives
n
n
-1 "'"
"'"
~ ~
01=1 (i,i/)EA
E
-1
n
-
/
(k)) ((k )
TJ OI ..
TJOIilj -
[( (k)
TJOIij -
Pn
n-
~
"'"
~ ~
01=1 (i,i/)EA
E
_ rj(rj -
-1
TJOI,)TJOI,I)
1
S( S -
1) ~ (
~
n
((~) (~/).) _
t
1) i#i ' =1
"'"
1
[
~
_ rj(rj n
S(S _
1)
n8 (S -
_ rj(rj n
-
(~2: (k)) (~L (~/))]
k jEBi TJm)
'
i=1 i ' =1
1) ~ (
TJOI,,)
~
1) ~
i=1
(k) (k / ))
OI
TJ .. TJOI ..
_
~ ~ (~)
(k / )]
~ ~ TJOI'. TJOI'.
01=1 i=1
( (k) (k / ))
TJ OI .. TJ OI ..
~
1) ~~ ( (k) (k / )
TJOI .. TJOI..
1) ~
=1
_
~ ~ (~) (~/))]
~ TJOI .. TJm.
8 i=1
1) ~ ~ ( (k) (k / ))
(8 _ 1) ~ ~ TJOIi. TJOIi.
rj(rj n8
k jEBi,
/
(~~
(k) (k )
~ (k) (k l ))]
~ ~ TJOIi. TJOIi . - ~ TJOIi. TJOIi.
1) [~8 2 (k) (k /)
~
TJOI.. TJOI..
1) 01=1
rj(rj -
-
(k) (k / ))
TJOI .. TJOI ..
(k) (k / ))
TJOI.. TJOI ..
01=1 (i,i/)EA
n (8
~
01=1
01=1
~
~
rj(rj -
1) ~ (
rj(rj n
2: [
t
1
01=1 (i,i/)EA
n
Pn
/
(k ))]
TJOI ..
01=1 i=1
-
1) {~ ~ ~ [ (~) _
~ ~ TJw.
1)
n8 01=1 i=1
r·(r·
) ) -1) v (2)
- (8 - 1) N,kk '
_ rj(rj -
(8
(k) (k / )
8TJOI .. TJOI ..
.
(~/)
(k)] [
TJOI..
TJOI'.
_
(kl)]}
TJOI ..
Hence, the sum of the two terms gives
$$n\,\mathrm{Cov}_{\mathcal P_n}\big(T^{(k)}_{N,j}, T^{(k')}_{N,j}\big) = r_j\,v^{(1)}_{N,kk'} + r_j\,v^{(2)}_{N,kk'} - \frac{r_j(r_j-1)}{s-1}\,v^{(2)}_{N,kk'} = a^{(1)}_{jj}\,v^{(1)}_{N,kk'} + a^{(2)}_{jj}\,v^{(2)}_{N,kk'}.$$
On the other hand, let $B = \{(i,i') : i \in P_j,\ i' \in P_{j'},\ i = i'\}$. The cardinality of B is $r_{jj'}$. Thus
nEPn [T(k).
N,) -
E (T(k))]
[T(k'),
N,)
N,) -
_
-
nE pn
E
[(1~ "'"
;:;
~.~
(T(k"~)]
N,)
((k)
TJOIij -
01=1 'EP)
(k)))
TJOI..
x;:; ~
~ .,~
[1 "'"
{3=1 , EPj'
/
) ((k
TJ{3i l j'
/
TJ{3(k.. )))]
n
n
-1 " "
L-
0:=1 (i,i/)EB
n
1
+n0:=1 (i,i')EA
L L
+n
n
""
-1
l
E Pn [( 'rJo:ij
(k)
""
L-
E pn
(k)
L- E Pn [( 'rJo:ij
""
L-
7]i~~»)]
-
.
.
(k)
L- E Pn [( 7]o:ij
""
'rJ0: ..
'rJ~:} - 'rJi~!) (7]~::},
[(
l
l
)
(k») ((k
7]f3i ' jl -
-
'rJ0:..
-
)
(k») ((k
7]0/..
7]f3i ' jl
-
o:f:.f3=l (i,i/)EB
n
+n -1
(k »)]
'rJ0:..
""
L-
l
)
(k») ((k
'rJo:i'jl -
-
.
(k »)]
7]13 ..
l
o:f:.f3=l (i,i/)EA
.
l
(k.. »)]
7]13
Note that the first term in this last equality is the sum of products in blocks where
pairs of treatments (j,j') occur together, while the second term consists of the sum
of products in blocks where treatments (j, j') occur in different blocks. The last two
terms are null due to the independence of replicates. The first term then yields
n
n
-1 " "
L-
""
L-
E Pn [( 7]o:ij
(k)
l
(k») ( 7]o:i
(k'l jl
) 'rJ0:..
-
(k.. »)]
7]0:
0/=1 (i,i / ) EB
n
n-1""
n
E pn ((~),
(k:~/)
L- L7]0:1 J 'rJ0:1 J
_ rjjl " "
""
L-
0/=1 (i,i/)EB
~
rjjl
n
n
~
1
[
L-
0:=1
~~
1
r'
~~
r _ 1 nsr L- ~
[""
L-
0/=1 1=1
JESi
'I
JJ
[~
~ (~
(~) (~/)
ns L~ 7]0:1. 'rJm.
.
0/=1
rjjl
(1)
- - - V N "I
r- 1
And the second
t~rm
n
L-
""
L-
E Pn [( 7]o:ij
(k)
(k)
_
1=1
s
(k) (k')
'rJ0: .. 7]0: ..
n
(k ' )
L-
l
(k) (k )]
L
7]O/ij'rJo:ij
JESi
(k) (k')
'rJ0: .. 7]0: ..
0:=1
(k) (k ' )
r'rJo:i. 'rJo:i.
l
(k) (k
7]0: .. 7]0:..
»)] _
+ r7]o:i.7]o:i.
(k) (~/)]
rjjl V(l) I
_ 1 N,jj
r
(2)
+ r'J'IvN
"I
J
JJ
•
l
-
(k' )'
n
rjjl " "
7]O/ij7]o:ij -
simplifies into
n
-1 " "
JJ
(k)
7]o:ij7]o:ij' -
JES; J/ES;
I
~
L-
n
0/=1
["" ""
ns
" " " " r 2 (~) (~') _
L- L- 'rJm. 'rJ0:1.
1 nsr 0:=1 i=l
_
rjjl
_
7]0:1)7]0:1J
1) L- L- L- ,L-
0/=11=1
n
_ r jj' " " (k) (k ' )
n ~ 7]0/ .. 'rJ0: ..
r -
(k) (k ),] _ rjjl
1=1 Jf:.J/ES;
rjjl
1
---;:;: sr(r -
rjjl
l
""
1) ~ ,L-
sr(r _
(k) (k ' )
'rJ0/ .. 'rJ0: ..
0:=1
)
(k») ((k
7]0:..
'rJO/iljl
-
l
(k.. »)]
'rJ0:
0:=1 (i,i/)EA
n
,
n-l'"
L...J
a=l
'"
L...J
(i,i/)EA
E
l"n
n
TjT; - Tjjl ~ [
1
L...J s(s _
n
a=l
-
TjT; - Tjjl ~
ns(s - 1) L...J
a=l
_ (TjT; - Tjj/)
-
S -
1
n
[S2
.
(~~ (~:~/) _ TjTj - Tjjl ' " (k) (k /)
"lellt) "lat )
L...J "la .. "la ..
~
a=l
(~) (~/)]
1) L...J "l0it. "l0it.
_ TjT; - Tjjl
i:;t:i'=l
n
~ (k) (k /)
L...J "lau "la..
. a=l
(k) (k /) _ ~ (~) (~/)] _ TjT; - Tjjl ~ (k) (k/)
"lau "la..
~ "lat. "lat.
n
L...J "la.. "la..
t=l
a=l
[~ ~ (~ (~) (~/)
ns L...J ~ 1]OH. "'Ctz.
a=l
_
(k) (k/»)]
TJa .. TJa ..
t=l
(TjT; - Tjj/) (2)
s -1
VN,kk '
Therefore, the sum of the two terms gives
•
Hence the theorem.
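Define the following
\[
H^{(k,k')}_{N,1}(x,y) = \frac{1}{N}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j\in S_i} I\left[\left(U^{(k)*}_{\alpha ij},\,U^{(k')*}_{\alpha ij}\right)\le (x,y)\right],
\]
\[
H^{(k,k')}_{N,2}(x,y) = \frac{1}{nsr(r-1)}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j\ne j'\in S_i} I\left[\left(U^{(k)*}_{\alpha ij},\,U^{(k')*}_{\alpha ij'}\right)\le (x,y)\right].
\]
Moreover, denote the marginal c.d.f. of $U^{(k)*}_{\alpha ij}$ by $F^{(k)}_{ij}(x)$ and note that under $H_{0w}$ it is equal to $F^{(k)}(x)$, which is independent of $i$ and $j$. Also, for every $i = 1,\ldots,s$, denote by $F^{(k,k')}_{ij}(x,y)$ ($= F^{(k,k')}_{1}(x,y)$ under $H_{0w}$) the marginal c.d.f. of $(U^{(k)*}_{\alpha ij}, U^{(k')*}_{\alpha ij})$. Finally, for $j \ne j' = 1,\ldots,v$, let $F^{(k,k')}_{ijj'}(x,y)$ ($= F^{(k,k')}_{2}(x,y)$ under $H_{0w}$) be the marginal c.d.f. of $(U^{(k)*}_{\alpha ij}, U^{(k')*}_{\alpha ij'})$. Define
\[
v_i^{(kk')} = \iint_{R^2} J^{(k)}\left[F^{(k)}(x)\right] J^{(k')}\left[F^{(k')}(y)\right] dF^{(k,k')}_{i}(x,y) - \mu_k\mu_{k'},
\]
$i = 1,2$, for $k,k' = 1,\ldots,p+q$, where $\mu_k = \int_0^1 J^{(k)}(u)\,du$. Note that $F^{(k,k)}_{1}(x,y)$ reduces to the univariate c.d.f. $F^{(k)}(x)$ as $x = y$ almost everywhere. Further let
\[
V_i = \left(\left(v_i^{(kk')}\right)\right)_{k,k'=1,\ldots,p+q}, \quad i = 1,2.
\]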
[(r ~ 1)] + [(s:. 1)]
B = [(r ~ 1)]
[(s -ll~r - 1)]
=
A
A(I)
A(2);
A(I) _
{} = A ® VI
-
,
A(2);
..
B ® V2
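Theorem 3.2 Under $H_{0w}$, $B_N$ as defined in Theorem 3.1 converges in probability (as $n \to \infty$) to $\Omega$ defined above.
Proof: Note that $R^{(k)}_{\alpha ij} = N H^{(k)}_N\left(U^{(k)*}_{\alpha ij}\right)$. Hence, $\eta^{(k)}_{\alpha ij} = J^{(k)}_N\left[\frac{N}{N+1} H^{(k)}_N\left(U^{(k)*}_{\alpha ij}\right)\right]$. It
follows that

Let us rewrite $v^{(1)}_{N,kk'}$ as follows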
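\[
\begin{aligned}
v^{(1)}_{N,kk'} &= \frac{1}{nsr}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j\in S_i}\left[\eta^{(k)}_{\alpha ij}-\bar\eta^{(k)}_{\alpha i\cdot}\right]\left[\eta^{(k')}_{\alpha ij}-\bar\eta^{(k')}_{\alpha i\cdot}\right]\\
&= \frac{1}{nsr}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j\in S_i}\eta^{(k)}_{\alpha ij}\eta^{(k')}_{\alpha ij} - \frac{r}{nsr}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\left[\frac{1}{r}\sum_{j\in S_i}\eta^{(k)}_{\alpha ij}\right]\left[\frac{1}{r}\sum_{j'\in S_i}\eta^{(k')}_{\alpha ij'}\right]\\
&= \frac{r-1}{r}\left[\frac{1}{nsr}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j\in S_i}\eta^{(k)}_{\alpha ij}\eta^{(k')}_{\alpha ij}\right] - \frac{r-1}{r}\left[\frac{1}{nsr(r-1)}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j\ne j'\in S_i}\eta^{(k)}_{\alpha ij}\eta^{(k')}_{\alpha ij'}\right]\\
&= \frac{r-1}{r}\iint_{R^2} J^{(k)}_N\left[\tfrac{N}{N+1}H^{(k)}_N(x)\right] J^{(k')}_N\left[\tfrac{N}{N+1}H^{(k')}_N(y)\right] dH^{(k,k')}_{N,1}(x,y)\\
&\quad - \frac{r-1}{r}\iint_{R^2} J^{(k)}_N\left[\tfrac{N}{N+1}H^{(k)}_N(x)\right] J^{(k')}_N\left[\tfrac{N}{N+1}H^{(k')}_N(y)\right] dH^{(k,k')}_{N,2}(x,y)\\
&\xrightarrow{P} \frac{r-1}{r}\left[v_1^{(kk')} - v_2^{(kk')}\right],
\end{aligned}
\]
where $R^2$ is the Euclidean plane, and the last step follows from Theorem 5.4.2 of Puri and Sen (1971). Note that $H^{(k)}_N \to H^{(k)}$ ($= F^{(k)}$ under $H_{0w}$), where $H^{(k)}$ is the population combined c.d.f. Hence we have
\[
V^{(1)}_N \xrightarrow{P} \frac{r-1}{r}\left[V_1 - V_2\right].
\]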
Let us rewrite $v^{(2)}_{N,kk'}$ as
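The first term in the last equality gives
\[
\begin{aligned}
&\frac{1}{ns}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\left[\left(\bar\eta^{(k)}_{\alpha i\cdot}-\bar\eta^{(k)}_{\cdots}\right)\left(\bar\eta^{(k')}_{\alpha i\cdot}-\bar\eta^{(k')}_{\cdots}\right)\right]\\
&\quad= \frac{1}{ns}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\bar\eta^{(k)}_{\alpha i\cdot}\bar\eta^{(k')}_{\alpha i\cdot} - \bar\eta^{(k)}_{\cdots}\bar\eta^{(k')}_{\cdots}\\
&\quad= \frac{1}{ns}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\left[\frac{1}{r}\sum_{j\in S_i}\eta^{(k)}_{\alpha ij}\right]\left[\frac{1}{r}\sum_{j'\in S_i}\eta^{(k')}_{\alpha ij'}\right] - \bar\eta^{(k)}_{\cdots}\bar\eta^{(k')}_{\cdots}\\
&\quad= \frac{1}{r}\left[\frac{1}{nsr}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j\in S_i}\eta^{(k)}_{\alpha ij}\eta^{(k')}_{\alpha ij}\right] + \frac{r-1}{r}\left[\frac{1}{nsr(r-1)}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j\ne j'\in S_i}\eta^{(k)}_{\alpha ij}\eta^{(k')}_{\alpha ij'}\right] - \bar\eta^{(k)}_{\cdots}\bar\eta^{(k')}_{\cdots}\\
&\quad= \frac{1}{r}\iint_{R^2} J^{(k)}_N\left[\tfrac{N}{N+1}H^{(k)}_N(x)\right] J^{(k')}_N\left[\tfrac{N}{N+1}H^{(k')}_N(y)\right] dH^{(k,k')}_{N,1}(x,y)\\
&\qquad + \frac{r-1}{r}\iint_{R^2} J^{(k)}_N\left[\tfrac{N}{N+1}H^{(k)}_N(x)\right] J^{(k')}_N\left[\tfrac{N}{N+1}H^{(k')}_N(y)\right] dH^{(k,k')}_{N,2}(x,y) - \bar\eta^{(k)}_{\cdots}\bar\eta^{(k')}_{\cdots}\\
&\quad\xrightarrow{P} \frac{1}{r}v_1^{(kk')} + \frac{r-1}{r}v_2^{(kk')},
\end{aligned}
\]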
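where the last step follows from Theorem 5.4.2 of Puri and Sen (1971). The second
term gives
\[
\begin{aligned}
&-\frac{1}{sr}\left\{\frac{1}{nsr}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j\in S_i}\left[\left(\eta^{(k)}_{\alpha ij}-\bar\eta^{(k)}_{\cdots}\right)\left(\eta^{(k')}_{\alpha ij}-\bar\eta^{(k')}_{\cdots}\right)\right]\right\}\\
&\quad-\frac{r-1}{sr}\left\{\frac{1}{nsr(r-1)}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j\ne j'\in S_i}\left[\left(\eta^{(k)}_{\alpha ij}-\bar\eta^{(k)}_{\cdots}\right)\left(\eta^{(k')}_{\alpha ij'}-\bar\eta^{(k')}_{\cdots}\right)\right]\right\}\\
&\quad-\frac{s-1}{s}\left\{\frac{1}{ns(s-1)}\sum_{\alpha=1}^{n}\sum_{i\ne i'=1}^{s}\left[\left(\bar\eta^{(k)}_{\alpha i\cdot}-\bar\eta^{(k)}_{\cdots}\right)\left(\bar\eta^{(k')}_{\alpha i'\cdot}-\bar\eta^{(k')}_{\cdots}\right)\right]\right\}\\
&\xrightarrow{P} -\frac{1}{sr}v_1^{(kk')} - \frac{r-1}{sr}v_2^{(kk')}.
\end{aligned}
\]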
Note that the last term in the last equality converges to zero in probability. To
see this, remember that {'rJ~~~} are independent for every
Moreover, if we let FnQi(x) = ~ 'LjESi
( 'rJ(~)
w.
'rJ(k»)
...
= joo
J(k) [
N
-00
~
Thus, if we let ('rJ~~~ - 'rJ~.~)) =
I(Ui7J* $
1,"', n, i = 1,"', s.
Q -
x), then
N
N
+ 1 H(k)(x)]
N
dFnQi(x) _ 'rJ(k)
...
fal J(k)( u )du - J.l(k) =
o.
Mi7) , then E Mi7) ~ 0, and
.
pn
1
~ ~ [('rJQi.
(k)
(k)) (le')
(k'))]
_ 1) L.J L.J
- 'rJ...
'rJQi'. - 'rJ ...
(
ns s
0'=1 i::f:.i'=1
(k'))
(k)
E pn ( M Qi
M Qi
,.
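$\xrightarrow{P} 0.$
Therefore,
\[
v^{(2)}_{N,kk'} \xrightarrow{P} \frac{s-1}{sr}\,v_1^{(kk')} + \frac{(s-1)(r-1)}{sr}\,v_2^{(kk')}.
\]
Hence
\[
V^{(2)}_N \xrightarrow{P} \frac{s-1}{sr}\left[V_1 + (r-1)V_2\right].
\]
Thus it is clear now how, under $H_{0w}$, $B_N \xrightarrow{P} \Omega$.
•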
Theorem 3.3 Let $\mathrm{Vec}(T_N)$, a $(p+q)v \times 1$ vector, denote the rolled-out form of $T_N$.
Then $n^{1/2}(\mathrm{Vec}(T_N) - \eta)$ converges in distribution to a multivariate normal distribution with null mean vector and dispersion matrix $B_N$.
The proof is found in Theorem 4.1 of Sen (1969). Now construct the residual rank statistics as we did in the complete block design. Partition $[\mathrm{Vec}(T_N) - \eta]$ into two components: the first component, $T_{N0}$, is a $qv \times 1$ vector of the centered linear rank statistics corresponding to the surrogate variables, and the second component, $T_{N+}$, is a $pv \times 1$ vector of centered linear rank statistics corresponding to the concomitant variates. Similarly, partition the matrix $B_N$ as we partitioned $V_N$ in (2.11):
\[
B_N = \begin{pmatrix} B_{N00} & B_{N0+} \\ B'_{N0+} & B_{N++} \end{pmatrix} \tag{3.3}
\]
where $B_{N00}$ is a $qv \times qv$ matrix, $B_{N0+}$ is $qv \times pv$ and $B_{N++}$ is a $pv \times pv$ matrix.
Then, from the classical normal theory (see Theorem 2.5.1 of Anderson (1984)), for
large $N$ we have
Define the $qv \times 1$ residual rank statistics $T^*_{N0}$ and $B^*_{N00}$, the covariance matrix of $n^{1/2}T^*_{N0}$, as follows
\[
T^*_{N0} = T_{N0} - B_{N0+}B^{(-1)}_{N++}T_{N+}, \qquad
B^*_{N00} = B_{N00} - B_{N0+}B^{(-1)}_{N++}B'_{N0+}. \tag{3.4}
\]
3.1.2
BIBD For The Validation Set I*
Consider $n^*$ replications of a BIBD consisting of $s^*$ blocks of constant size $r^*\ (\ge 2)$ to
which $v$ treatments are applied such that the conditions i, ii, and iii in Section 3.1.1
are satisfied.
Let $S_i$ stand for the set of treatments occurring in the $i$th block, $i = 1,\ldots,s^*$. For
the $\alpha$th replicate, the response in the $i$th block receiving the $j$th treatment is a stochastic $(p+q+1)$-vector
\[
U_{\alpha ij} = \left(Y_{\alpha ij}, X^{(1)}_{\alpha ij},\ldots,X^{(q)}_{\alpha ij}, Z^{(1)}_{\alpha ij},\ldots,Z^{(p)}_{\alpha ij}\right)' = \left(Y_{\alpha ij}, X'_{\alpha ij}, Z'_{\alpha ij}\right)'
\]
of measurements corresponding to the primary variate, the $q$ surrogates and the $p$ covariates.
Consider the same model in (3.1) and the assumptions following it, with
the difference that here we regress the primary variate and the surrogates on the
concomitant variates. We basically repeat what we have done in the surrogate set with
$N$ replaced by $N_v\ (= n^*s^*r^*)$. Thus $B_N$ in Theorem 3 is now a $(p+q+1)v \times (p+q+1)v$
matrix, $B_{N_v}$, which we partition as in (3.3). Then, corresponding to (3.4) we have
\[
B^*_{N_v00} = B_{N_v00} - B_{N_v0+}B^{(-1)}_{N_v++}B'_{N_v0+}. \tag{3.5}
\]
Partition the $(q+1)v \times 1$ vector $T^*_{N_v0}$ and the $(q+1)v \times (q+1)v$ matrix $B^*_{N_v00}$ as
follows
\[
T^*_{N_v0} = \left(T^{*\prime}_{N_v0(Y)},\, T^{*\prime}_{N_v0(X)}\right)', \qquad
B^*_{N_v00} = \begin{pmatrix} B^*_{N_vYY} & B^*_{N_vYX} \\ B^{*\prime}_{N_vYX} & B^*_{N_vXX} \end{pmatrix},
\]
where $B^*_{N_vYY}$ is $v \times v$, $B^*_{N_vYX}$ is $v \times qv$, and $B^*_{N_vXX}$ is $qv \times qv$. Also let
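\[
T^*_{N_v0(Y:X)} = T^*_{N_v0(Y)} - B^*_{N_vYX}B^{*(-1)}_{N_vXX}T^*_{N_v0(X)}, \tag{3.6}
\]
\[
B^*_{N_v(Y:X)} = B^*_{N_vYY} - B^*_{N_vYX}B^{*(-1)}_{N_vXX}B^{*\prime}_{N_vYX}. \tag{3.7}
\]
Note that the dispersion matrix of the $v \times 1$ vector $n^{*1/2}T^*_{N_v0(Y:X)}$ is given by the $v \times v$
matrix $B^*_{N_v(Y:X)}$.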
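The adjustment in (3.6) and (3.7) has the form of a regression residual and its Schur-complement covariance. The following is a minimal sketch of that computation with exact rational arithmetic; the function names and the toy input values are hypothetical, and the block $B^*_{N_vXX}$ is assumed to be nonsingular.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def inverse(m):
    """Gauss-Jordan inverse of a nonsingular square matrix (exact fractions)."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def adjust_for_surrogates(t_y, t_x, b_yy, b_yx, b_xx):
    """Return (T_Y - B_YX B_XX^{-1} T_X, B_YY - B_YX B_XX^{-1} B_YX')."""
    gain = matmul(b_yx, inverse(b_xx))
    t_adj = subtract(t_y, matmul(gain, t_x))
    b_adj = subtract(b_yy, matmul(gain, transpose(b_yx)))
    return t_adj, b_adj


# Toy numbers (hypothetical): one primary component, two surrogate components.
t_adj, b_adj = adjust_for_surrogates(
    t_y=[[3]], t_x=[[1], [2]],
    b_yy=[[4]], b_yx=[[1, 1]], b_xx=[[2, 0], [0, 1]],
)
```

In this toy case the adjusted dispersion is smaller than the unadjusted one, which is the reason for conditioning the primary-variate statistic on the surrogate statistics.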
3.1.3
BIBD For The Set $I^0$
Consider $n^0$ replications of a BIBD consisting of $s^0$ blocks of constant size
$r^0\ (\ge 2)$ to which $v$ treatments are applied such that the conditions i, ii, and iii in
(3.1.1) are satisfied.
Let $S_i$ stand for the set of treatments occurring in the $i$th block, $i = 1,\ldots,s^0$.
For the $\alpha$th replicate, the response in the $i$th block receiving the $j$th treatment is a
stochastic $(p+1)$-vector
\[
U_{\alpha ij} = \left(Y_{\alpha ij}, Z^{(1)}_{\alpha ij},\ldots,Z^{(p)}_{\alpha ij}\right)' = \left(Y_{\alpha ij}, Z'_{\alpha ij}\right)'
\]
of measurements corresponding to the primary variate and the $p$ covariates.
Consider the same model in (3.1) and the assumptions following it, with the
difference that here we regress the primary variate on the concomitant variates. We
basically repeat what we have done in the surrogate set with $N$ replaced by $N^0\ (= n^0s^0r^0)$.
Thus $B_N$ in Theorem 3 is now a $(p+1)v \times (p+1)v$ matrix, $B_{N^0}$, which is
partitioned as in (3.3), where $B_{N^0 00}$ is $v \times v$, $B_{N^0 0+}$ is $v \times pv$, and $B_{N^0 ++}$ is $pv \times pv$. Thus, corresponding
to (3.5) we now have the $v \times 1$ residual rank statistics $T^*_{N^0 0}$ and the $v \times v$ dispersion
matrix $B^*_{N^0 00}$ of $n^{0\,1/2}T^*_{N^0 0}$ as follows
\[
B^*_{N^0 00} = B_{N^0 00} - B_{N^0 0+}B^{(-1)}_{N^0 ++}B'_{N^0 0+}. \tag{3.8}
\]
3.2
Construction of the Test Statistic
The next step consists of combining information from all subsets. Since the tests
developed in the different subsets are independent, we will take a weighted linear
combination of these tests with the weights being the inverses of the corresponding
dispersion matrices. Note that for such a combination to make sense the dimensions
of the tests should be the same.
We follow a similar approach as in the complete block case. Write the $qv \times 1$ vector $T^*_{N0}$, obtained from the surrogate set I, as a $q \times v$ matrix. Then take a linear
combination of this matrix, say $aT^*_{N0}$, where $a$ is of order $1 \times q$, subject to the two
restrictions, i) $a'a = 1$, and ii) the variance given in (3.4) is minimum. Although by
doing this we are losing information, the minimum variance condition maximizes the
noncentrality parameter of the chi-squared distribution of the quadratic form based
on the transformed residual vector, which in turn maximizes the power of this test.
Moreover, $a$ can be ordered if there is inherent ordering in the surrogate vector.
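Note that if we partition $V^{(1)}_N$ and $V^{(2)}_N$ as we did for $B_N$ in (3.3), then we get
$V^{(1)}_{N00}$ and $V^{(2)}_{N00}$, each of which is a $q \times q$ matrix. Now write
\[
T^*_{N0(X)} = aT^*_{N0} = \left(aT^*_{N0,1},\ldots,aT^*_{N0,v}\right)
\]
where $\{T^*_{N0,j},\ j = 1,\ldots,v\}$ are the $q \times 1$ columns of the $q \times v$ matrix $T^*_{N0}$.
Moreover,
\[
\mathrm{Cov}\left(aT_{N0,j},\, aT_{N0,j'}\right) = a\,\mathrm{Cov}\left(T_{N0,j},\, T_{N0,j'}\right)a' = a\left[a^{(1)}_{jj'}V^{(1)}_{N00} + a^{(2)}_{jj'}V^{(2)}_{N00}\right]a'.
\]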
Thus, denote the dispersion matrix of $n^{1/2}T^*_{N0(X)}$ by $B^*_{NX}$, which is a $v \times v$ matrix.
Let $A_1 = B^*_{NX}$, $A_2 = B^*_{N_v(Y:X)}$, and $A_3 = B^*_{N^0 00}$. Let
\[
W_i = \left[A_1^{-1} + A_2^{-1} + A_3^{-1}\right]^{-1}A_i^{-1}, \quad i = 1,2,3. \tag{3.9}
\]
Now, noting that the mean of each of the three statistics $T^*_{N0(X)}$, $T^*_{N_v0(Y:X)}$, and $T^*_{N^0 0}$
is 0, following the Gauss-Markov theorem we consider the linear combination
(3.10)
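The weights in (3.9) are inverse-dispersion weights, so they sum to the identity matrix. The sketch below illustrates this with exact arithmetic. It assumes that the three dispersion matrices are nonsingular and that the combination is the weighted sum $\sum_i W_i T_i$; the function names and the example matrices are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def inverse(m):
    """Gauss-Jordan inverse of a nonsingular square matrix (exact fractions)."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def inverse_dispersion_weights(dispersions):
    """W_i = [sum_j A_j^{-1}]^{-1} A_i^{-1} for each dispersion matrix A_i."""
    inverses = [inverse(a) for a in dispersions]
    total = inverses[0]
    for inv in inverses[1:]:
        total = add(total, inv)
    pooled = inverse(total)
    return [matmul(pooled, inv) for inv in inverses]


def combine(weights, statistics):
    """Weighted sum of column-vector statistics (assumed form of the combination)."""
    terms = [matmul(w, t) for w, t in zip(weights, statistics)]
    out = terms[0]
    for term in terms[1:]:
        out = add(out, term)
    return out


# Hypothetical 2 x 2 dispersion matrices standing in for A_1, A_2, A_3.
a_mats = [[[2, 1], [1, 2]], [[1, 0], [0, 4]], [[3, 1], [1, 1]]]
weights = inverse_dispersion_weights(a_mats)
weight_sum = add(add(weights[0], weights[1]), weights[2])
combined = combine(weights, [[[1], [0]], [[1], [0]], [[1], [0]]])
```

Because the weights sum to the identity, combining three identical statistics returns that statistic unchanged, which the last line checks on a toy vector.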
Assume that $n/n^* \to \rho_1$ and $n/n^0 \to \rho_2$ for positive and finite real $\rho_1$ and $\rho_2$. Then $n^{1/2}T^{0*}$
is conditionally distribution-free, under the permutational model, with 0 mean and
covariance matrix

and the permutational multivariate central limit theorem of Sen (1983) applies here.
Thus, for large $N$, $N_v$, and $N^0$, $n^{1/2}T^{0*} \sim N_v(0, A)$. Finally consider the overall test
statistic
(3.11)
The exact permutational distribution of $L^{0*}$ can be obtained by enumeration
if $N$, $N_v$, and $N^0$ are small to moderate. For large $N$, $N_v$, and $N^0$ the null
distribution of $L^{0*}$ is, by Cochran (1934), asymptotically chi-squared with $v - 1$
degrees of freedom, since it is a quadratic form of asymptotically normal variates with
discriminant of rank $v - 1$.
3.3
Asymptotic Non-null Distribution of $L^{0*}$
For the study of the non-null distribution of $L^{0*}$ we will consider local alternatives
only. Let us first work with the surrogate set I. Write $G_i(u) = G_i(x, z)$, $x \in R^{rq}$, $z \in R^{rp}$, and consider the following sequence of alternative hypotheses:
(3.12)
where $\Lambda = ((\lambda_j^{(k)}))$ stands for a $v \times q$ matrix of treatment effects. Thus, under $\{H_N\}$
\[
F^{(k)}_{ij}(x) = F^{(k)}\left[x - n^{-1/2}\left(\lambda_j^{(k)} - \frac{1}{r}\sum_{l\in S_i}\lambda_l^{(k)}\right)\right], \quad j \in S_i,\ k = 1,\ldots,q.
\]
Assume that all $F^{(k)}_{ij}(x)$ are absolutely continuous. Let $f^{(k)}(x)$ be the density function
corresponding to $F^{(k)}(x)$. Let
(3.13)
where
\[
H^{(k)}(x) = \frac{1}{sr}\sum_{i=1}^{s}\sum_{j\in S_i}F^{(k)}_{ij}(x), \quad k = 1,\ldots,q.
\]
Now expand $F^{(k)}\left[x - n^{-1/2}\left(\lambda_j^{(k)} - \frac{1}{r}\sum_{l\in S_i}\lambda_l^{(k)}\right)\right]$ in a Taylor series around $x$ for a
fixed $x$. We will have

Thus, for large $n$ we have
I:
i:
I:
J(')
[F(')(
x)] dF(') [x - n- 1/' (Aj') - ~ I~ Ai'))]
J(k) [F(k)(x)] dF(k)(x) +
J(')
[F(') (x)] d {F(') [x - n -II' ( Aj')
-
~ I~ Aj')) ] - F(')(x) }
+o(n- 1 / 2 )
Now integrate by parts the second integral in the last term
(k)
I-ln,ij
r J(k)(u)du _jOO {F(k) [x _ nJo
1
1/ 2
().)k) _
+ o(n-
~ L ).~k))]
_ F(k)(x)}
r IESi
-00
~J(k)(F(k)(X))
dx
1 2
/ )
r J(k)(u)du + (nJo
1/ 2 )
r J(k)(u)du + (nJo
1/ 2 )
1
1
().)k) _
~L
r
().)k) _
IESi
~L
r
).}k))
lESi
JOO ~J(k)(F(k)(X))f(k)(x)dx + o(n- 1 / 2 )
-00
).}k))
dx
JOO ~J(k)(F(k)(X))dF(k)(X) + o(n- 1 / 2 )
-00
dx
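where the last two steps resulted from a Taylor series expansion of $F^{(k)}\left[x - n^{-1/2}\left(\lambda_j^{(k)} - \frac{1}{r}\sum_{l\in S_i}\lambda_l^{(k)}\right)\right]$
around $x$ for a fixed $x$. Thus, if we let

then from the above we have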
Now, let $Q^{(k)}_{n,ij} = \frac{1}{n}\sum_{\alpha=1}^{n}\eta^{(k)}_{\alpha ij}$, $j \in S_i$, $i = 1,\ldots,s$, $k = 1,\ldots,p+q$. Note that
$T^{(k)}_{N,j} = \sum_{i\in P_j}Q^{(k)}_{n,ij}$.
Then by Theorem 5.1 of Sen (1969), $\left[n^{1/2}\left(Q^{(k)}_{n,ij} - \mu^{(k)}_{n,ij}\right),\ j \in S_i,\ i = 1,\ldots,s,\ k = 1,\ldots,p+q\right]$ has asymptotically a multinormal distribution with null
mean and dispersion matrix whose entries for $i \ne i'$ are zero, and for $j = j'$
converge to $v_1^{(kk')}$, and for $j \ne j'$ converge to $v_2^{(kk')}$.
Thus, by condition (a) of Section 2.2.1 and
Theorem 5.1 of Sen (1969) we have
lim E
n-+oo
[n (T(k) - r'''l(k l ) I HN]
1/2
N,J
J
lim
n--+oo
...
E[n 1/ 2 (Tjp - "L..J /1(k).) I HN] + lim E[n 1/ 2 (L...J
" (/1(k). n,tJ
,J
n--+oo
n,tJ
iEPj
lim E
n--+oo
[n
'1l(k»)) I HN]
'I ...
iEPj
_ rn;"J
I/(k).)
L..J (Q(k).
n,tJ
1/2 "
I HN] +
iEPj
n~~ E [nIl' (~(I'~:!j - 1.' J(kJ(U)dU)) I HN]
- J~ E[n 1/2(.L("l.~~l _fa1 J(kl(U)dU)) I HN]
~EP}
E
H~ Jim n'l' (I'~:!j - /.' jlkJ(U)dU)] I HN}
B(F(k l )
.L
[>.Yl -
~EP}
B(F(k l ) "
LJ
iEP)
~ L >.}kl]
lESi
[>.(kJ l _ ~>.(kl
" >.(kI l]
rJ _ ~rLJ
I#jESi
B(F(kl ) {[rj(r
r
Let
(J =
((eY»)) j=l, ... ,v,k=l,...
,q ,
t
-1)] >.;k l _
rjjl >.;~l}
j':f.j=l
r
where
Then, by Theorem 5.2 of Sen (1969),
$\left\{n^{1/2}\left(T^{(k)}_{N,j} - r_j\bar\eta^{(k)}_{\cdots}\right),\ k = 1,\ldots,p+q,\ j = 1,\ldots,v\right\}$
have jointly asymptotically a multinormal distribution with mean $\theta$ and dispersion
matrix $\Omega$. Denote by $\Omega_I$ the version of $\Omega$ pertaining to the surrogate set, I, and by
$\theta_I$ the version of $\theta$ in I. Partition $\theta_I$ as $(\theta_{I0}, \theta_{I+})$. Also partition $\Omega_I$ as we partitioned
$B_N$ in (3.3) and corresponding to (3.4) let
\[
\Omega^*_{I00} = \Omega_{I00} - \Omega_{I0+}\Omega^{(-1)}_{I++}\Omega'_{I0+}.
\]
Then for large $n$

and its dispersion matrix is $\Omega^*_{IX}$,
which is the population version of $B^*_{NX}$.
In a similar fashion, define $\{H_N\}$ for the validation set $I^*$ such that $((\lambda_j^{(k)}))$ is
now a $v \times (q+1)$ matrix. Also, $\theta_{I^*}$ is defined in the same way as $\theta_I$ with the
difference that $k = 1,\ldots,q+1$. $B_{N_v} \sim \Omega_{I^*}$, which is the version of $\Omega$ defined in the
validation set $I^*$. Thus, corresponding to (4.5) and (4.6) we have
*
T NvO(Y) -
TN-vo(y:X)
n*
HiI*YY -
OJ,(y:X)
Hence for large
£")*
and the dispersion matrix of
---+
(J
['(Y) -
n*1/2TN-vO(Y:X)
£")*
£")*(-1) (J'
U['YXU[,XX ['(X)
under {HN } is
Finally, along the same lines, define {HN
an v x 1 vector. Also,
1 2
/
£")*(-1) £")*'
"~['YX"~['XX"~['YX'
n*
E[n *1/2T*NvO(Y:X) I H]
N
value of n0
£")*(-1) T*
['YX U ['XX NvO(X) ,
£")*
U
(J [0
}
by
/Ljoo,
nj,(y:X)'
for the subset 1° such that ~ is now
is defined in the same way as
TN- oo under {HN }
*
= J..t['O(Y:X)
(J [ .
Denote the expected
and its dispersion matrix is
njoo'
Now, we consider the weights $W_i$ defined in (3.9) where now we have $A_1 = \Omega^*_{IX}$,
$A_2 = \Omega^*_{I^*(Y:X)}$, and $A_3 = \Omega^*_{I^0 00}$.
Thus, under $\{H_N\}$, the mean of $T^{0*}$ in (3.10) is

Therefore, from Theorem 5.3 of Sen (1969) and under $\{H_N\}$, $L^{0*}$ defined in (3.11)
has asymptotically a noncentral chi-square distribution with $v - 1$ degrees of freedom
and the noncentrality parameter

•
Chapter 4
Recovery of Inter-Block
Information (RIBI)
The model introduced in (3.1) and the analysis following it extract intrablock information. However, when block comparisons are taken into account, the estimates of the varietal
(treatment) effects become more accurate; see Rao (1947). This chapter deals with
combined inference based on intra- and interblock information. A full treatment of
the parametric case is also found in Scheffé (1959).
The need for the recovery of inter-block information was first felt by Yates (1940)
in the context of incomplete block designs. Since the treatments are randomly allocated to incomplete blocks, it is reasonable to assume that the block effects are
random variables instead of fixed. Moreover, Yates (1940) noted that if the experimental material is fairly heterogeneous, treating block effects as fixed will result in
loss of information contained in the block totals.
4.1
RIBI for Surrogate Set I
The model considered in (3.1) assumes that the block effects $\beta_{\alpha i}$, $\alpha = 1,\ldots,n$, $i = 1,\ldots,s$, are fixed.
In some instances this may not be viable. It may be more rational
to assume that block effects are random variables. For example, consider blocks made
up of litters of cats where each litter, and hence each block, comes from a different
mother. The mothers are assumed to be a random sample from an infinite population.
Consider the same setup as in (3). Denote the block totals by $h_{\alpha i}$, $\alpha = 1,\ldots,n$, $i = 1,\ldots,s$. Then, (3.1) gives
\[
\begin{aligned}
h_{\alpha i} &= \sum_{j\in S_i}X_{\alpha ij} = \sum_{j\in S_i}\left[\mu_\alpha + \beta_{\alpha i} + \tau_j + \epsilon_{\alpha ij}\right]\\
&= r\mu_\alpha + \sum_{j\in S_i}\tau_j + f_{\alpha i} = r\mu_\alpha + \tau_{\alpha i} + f_{\alpha i},
\end{aligned} \tag{4.1}
\]
where
\[
f_{\alpha i} = r\beta_{\alpha i} + \sum_{j\in S_i}\epsilon_{\alpha ij}, \quad i = 1,\ldots,s,\ \alpha = 1,\ldots,n.
\]
For each $\alpha = 1,\ldots,n$, the block effects $\beta_{\alpha i}$, $i = 1,\ldots,s$, are assumed to be random
variables with a joint distribution that is symmetric in its $s$ arguments. Moreover,
in addition to the assumptions following (3.1), $\beta_{\alpha i}$ and $\sum_{j\in S_i}\epsilon_{\alpha ij}$ are assumed to
be independent. $\tau_{\alpha i}$ is a $(p+q)$-vector with two components: $\tau_{\alpha i1}$, a $q \times 1$ vector of
treatment effects corresponding to the surrogates, and $\tau_{\alpha i2}$, a $p \times 1$ zero vector of
treatment effects for the covariates, since we assume no interaction between treatment
and concomitant variables. We wish to test the null hypothesis
\[
H_{0B}:\ \tau_{\alpha i1} = 0, \quad i = 1,\ldots,s,
\]
while the set of alternatives relates to shifts in location due to treatment effects. Let
$h_\alpha = (h_{\alpha 1},\ldots,h_{\alpha s})$. Also, let $h_N = (h_1,\ldots,h_n)$. Under $H_{0B}$, we have the following:
(i) The joint distribution of $h_\alpha$ remains invariant under any permutation of the $s$
vectors among themselves, there being $s!$ such permutations.
(ii) As replicates are independent, the joint distribution of $h_N$ remains invariant
under any permutation of the $n$ sets $h_\alpha$, $\alpha = 1,\ldots,n$.
Thus, if we define $\mathcal{G}_n$ to be the compound group of transformations $\{g_n\}$ by
\[
g_n h_N = h^*_N = \left[g_n^{(1)}h_1,\ldots,g_n^{(n)}h_n\right],
\]
where each $g_n^{(\alpha)}$, $\alpha = 1,\ldots,n$, is a permutation of the $s$ vectors in $h_\alpha$, then
$\mathcal{G}_n$ contains $[s!]^n$ transformations, and under $H_{0B}$ it leaves the joint distribution
of $h_N$ invariant. Let $S(h_N) = \{h^*_N = g_n h_N : g_n \in \mathcal{G}_n\}$. Then, it follows from the
above discussion that
\[
P\left\{h_N = h^*_N \mid S(h_N), H_{0B}\right\} = (s!)^{-n}
\]
for all $h^*_N \in S(h_N)$. Denote this conditional (permutational) probability measure by
$P^*_n$. As $P^*_n$ is completely specified under $H_{0B}$, the existence of conditionally distribution-free tests
is thus established.
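For small $n$ and $s$ the permutational measure can be tabulated by brute force. The sketch below enumerates the $(s!)^n$ points obtained by permuting the block totals independently within each replicate and tabulates the exact null distribution of a statistic. The data, the scalar block totals, and the statistic (the sum of the totals landing in the first block position) are hypothetical choices made only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product


def permutation_orbit(h):
    """All points g_n h_N: each replicate's s block totals permuted independently."""
    per_replicate = [list(permutations(row)) for row in h]
    return [list(choice) for choice in product(*per_replicate)]


def null_distribution(h, statistic):
    """Exact permutational distribution: every orbit point has the same probability."""
    orbit = permutation_orbit(h)
    counts = Counter(statistic(point) for point in orbit)
    return {value: Fraction(count, len(orbit)) for value, count in counts.items()}


def first_block_sum(point):
    """Toy statistic: sum over replicates of the total in block position 0."""
    return sum(row[0] for row in point)


# n = 2 replicates, s = 3 blocks, one variate (hypothetical block totals).
h_obs = [[1, 2, 3], [4, 5, 6]]
orbit = permutation_orbit(h_obs)
dist = null_distribution(h_obs, first_block_sum)
```

Here the orbit has $(3!)^2 = 36$ points. The cost grows as $(s!)^n$, so full enumeration is only practical for small designs; otherwise the asymptotic approximation is the natural substitute.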
For the $k$th variate, we rank the observations $h^{(k)}_{11},\ldots,h^{(k)}_{ns}$ in ascending order of
magnitude and denote by $S^{(k)}_{\alpha i}$ the rank of $h^{(k)}_{\alpha i}$ in this set, for $i = 1,\ldots,s$, $\alpha = 1,\ldots,n$,
and $k = 1,\ldots,p+q$. Thus, corresponding to $h_{\alpha i} = (h^{(1)}_{\alpha i},\ldots,h^{(p+q)}_{\alpha i})'$ we have a rank
vector $S_{\alpha i} = (S^{(1)}_{\alpha i},\ldots,S^{(p+q)}_{\alpha i})'$, $i = 1,\ldots,s$, $\alpha = 1,\ldots,n$.
For each $N'\ (= ns)$ and each $k = 1,\ldots,p+q$, we define suitable rank scores
$b^{(k)}_{N'} = (b^{(k)}_{N',1},\ldots,b^{(k)}_{N',N'})'$ where $b^{(k)}_{N',j} = J^{(k)}_{N'}(j/(N'+1))$, $1 \le j \le N'$. Moreover,
$J^{(k)}_{N'}(u)$ is defined in accordance with the Chernoff-Savage convention, i.e., we assume
that $J^{(k)}_{N'}(u)$ satisfies the conditions a, b, c of Section 2.2.1. For notational simplicity,
we let
\[
\xi^{(k)}_{\alpha i} = b^{(k)}_{N',S^{(k)}_{\alpha i}}, \quad i = 1,\ldots,s,\ \alpha = 1,\ldots,n; \qquad
\bar\xi^{(k)}_{\alpha\cdot} = s^{-1}\sum_{i=1}^{s}\xi^{(k)}_{\alpha i}, \qquad
\bar\xi^{(k)}_{\cdot\cdot} = n^{-1}\sum_{\alpha=1}^{n}\bar\xi^{(k)}_{\alpha\cdot},
\]
for $k = 1,\ldots,p+q$. The proposed test is based on the statistics
\[
T^{(k)}_{N',j} = \frac{1}{nr_j}\sum_{\alpha=1}^{n}\sum_{i\in P_j}\xi^{(k)}_{\alpha i}, \quad j = 1,\ldots,v,\ k = 1,\ldots,p+q,
\]
where $P_j = \{i : j \in S_i\}$, $j = 1,\ldots,v$.
Let $T_{N'} = ((T^{(k)}_{N',j}))_{j=1,\ldots,v,\ k=1,\ldots,p+q}$.
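The construction above can be sketched for a single variate as follows. The function name, the toy design, and the choice of Wilcoxon-type scores $J(u) = u$ are assumptions made for illustration only, and ties among block totals are assumed absent.

```python
def rank_score_statistics(h, blocks, v, score=lambda u: u):
    """Interblock rank-score statistics T_j for one variate.

    h[a][i]   : total of block i in replicate a
    blocks[i] : set of treatments appearing in block i
    v         : number of treatments, labelled 0..v-1
    score     : score function J(u); the default J(u) = u is an assumed choice
    """
    n, s = len(h), len(blocks)
    n_prime = n * s
    # Rank all N' = n*s block totals jointly (no ties assumed).
    ordered = sorted((h[a][i], a, i) for a in range(n) for i in range(s))
    xi = [[0.0] * s for _ in range(n)]
    for rank, (_, a, i) in enumerate(ordered, start=1):
        xi[a][i] = score(rank / (n_prime + 1))
    stats = []
    for j in range(v):
        p_j = [i for i in range(s) if j in blocks[i]]  # blocks containing treatment j
        r_j = len(p_j)
        stats.append(sum(xi[a][i] for a in range(n) for i in p_j) / (n * r_j))
    return stats


# Toy BIBD: v = 3 treatments, s = 3 blocks of size 2, one replicate.
toy_blocks = [{0, 1}, {0, 2}, {1, 2}]
toy_totals = [[5.0, 1.0, 3.0]]
t_stats = rank_score_statistics(toy_totals, toy_blocks, v=3)
```

With these totals the blocks receive ranks 3, 1, 2 and scores 0.75, 0.25, 0.5, so each $T_j$ is the average score over the two blocks containing treatment $j$.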
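Theorem 4.1 Let
\[
w^{(1)}_{N',kk'} = \frac{1}{ns}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\left[\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\alpha\cdot}\right]\left[\xi^{(k')}_{\alpha i} - \bar\xi^{(k')}_{\alpha\cdot}\right], \quad k,k' = 1,\ldots,p+q,
\]
and
\[
w^{(2)}_{N',kk'} = \frac{1}{n}\sum_{\alpha=1}^{n}\left[\bar\xi^{(k)}_{\alpha\cdot} - \bar\xi^{(k)}_{\cdot\cdot}\right]\left[\bar\xi^{(k')}_{\alpha\cdot} - \bar\xi^{(k')}_{\cdot\cdot}\right], \quad k,k' = 1,\ldots,p+q,
\]
and $W^{(1)}_{N'} = ((w^{(1)}_{N',kk'}))$ and $W^{(2)}_{N'} = ((w^{(2)}_{N',kk'}))$.
Also, let $D^{(i)} = ((d^{(i)}_{jj'}))_{j,j'=1,\ldots,v}$, $i = 1,2$,
where
\[
d^{(1)}_{jj'} = \frac{sr_{jj'} - r_jr_{j'}}{r_j^2(s-1)}, \qquad d^{(2)}_{jj'} = \frac{r_jr_{j'}}{r_j^2}, \quad j,j' = 1,\ldots,v.
\]
Finally, let
\[
C_{N'} = D^{(1)} \otimes W^{(1)}_{N'} + D^{(2)} \otimes W^{(2)}_{N'};
\]
$C_{N'}$ is a $(p+q)v \times (p+q)v$ matrix. Then

where $J$ is a $v \times 1$ vector of ones.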
Proof: Note that for a given $\alpha$, $\alpha = 1,\ldots,n$, the probability under $P^*_n$ that $i$ will be
any of $1,\ldots,s$ is $1/s$; hence

Moreover, let $A = \{(i,i') : i \in P_j,\ i' \in P_{j'},\ i \ne i'\}$, and note that then the cardinality
of $A$ is $r_jr_{j'} - r_{jj'}$ ($= r_j(r_j - 1)$ if $j = j'$).
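The counting argument can be checked on a small design. The sketch below builds the index sets for a hypothetical BIBD (all pairs of four treatments), reading $A$ as pairs of distinct blocks with $i \in P_j$ and $i' \in P_{j'}$, and its companion set $B$ as the blocks in which both treatments occur together; the function name is an assumption.

```python
from itertools import combinations


def pair_sets(blocks, j, j_prime):
    """Return (A, B) for treatments j and j_prime.

    A: ordered pairs of distinct blocks (i, i') with j in block i and j_prime in block i'.
    B: pairs (i, i) where block i contains both treatments.
    """
    p_j = {i for i, block in enumerate(blocks) if j in block}
    p_jp = {i for i, block in enumerate(blocks) if j_prime in block}
    a = {(i, ip) for i in p_j for ip in p_jp if i != ip}
    b = {(i, i) for i in p_j & p_jp}
    return a, b


# Hypothetical BIBD: v = 4 treatments, every pair forms a block (s = 6, block size 2),
# so each treatment occurs in r_j = 3 blocks and each pair shares r_jj' = 1 block.
toy_blocks = [set(pair) for pair in combinations(range(4), 2)]
a01, b01 = pair_sets(toy_blocks, 0, 1)
a00, b00 = pair_sets(toy_blocks, 0, 0)
```

For $j \ne j'$ the counts are $r_jr_{j'} - r_{jj'} = 8$ and $r_{jj'} = 1$; for $j = j'$ they reduce to $r_j(r_j - 1) = 6$ and $r_j = 3$.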
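Then
\[
\begin{aligned}
&nE_{P^*_n}\left[T^{(k)}_{N',j} - E\left(T^{(k)}_{N',j}\right)\right]\left[T^{(k')}_{N',j} - E\left(T^{(k')}_{N',j}\right)\right]\\
&\quad= nE_{P^*_n}\left[\left(\frac{1}{nr_j}\sum_{\alpha=1}^{n}\sum_{i\in P_j}\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\right)\times\left(\frac{1}{nr_j}\sum_{\beta=1}^{n}\sum_{i'\in P_j}\left(\xi^{(k')}_{\beta i'} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right)\right]\\
&\quad= \frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{i\in P_j}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\alpha i} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right]\\
&\qquad+ \frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{(i,i')\in A}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\alpha i'} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right]\\
&\qquad+ \frac{1}{nr_j^2}\sum_{\alpha\ne\beta=1}^{n}\sum_{i\in P_j}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\beta i} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right]\\
&\qquad+ \frac{1}{nr_j^2}\sum_{\alpha\ne\beta=1}^{n}\sum_{(i,i')\in A}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\beta i'} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right].
\end{aligned}
\]
Note that the last two terms are null because the replicates are independent and
$E_{P^*_n}\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right) = 0$. The first term yields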
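\[
\begin{aligned}
&\frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{i\in P_j}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\alpha i} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right]\\
&\quad= \frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{i\in P_j}E_{P^*_n}\left(\xi^{(k)}_{\alpha i}\xi^{(k')}_{\alpha i}\right) - \frac{1}{r_j}\left(\bar\xi^{(k)}_{\cdot\cdot}\bar\xi^{(k')}_{\cdot\cdot}\right)\\
&\quad= \frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{i\in P_j}\left[\frac{1}{s}\sum_{i=1}^{s}\xi^{(k)}_{\alpha i}\xi^{(k')}_{\alpha i}\right] - \frac{1}{r_j}\left(\bar\xi^{(k)}_{\cdot\cdot}\bar\xi^{(k')}_{\cdot\cdot}\right)\\
&\quad= \frac{1}{nr_j}\sum_{\alpha=1}^{n}\left[\frac{1}{s}\sum_{i=1}^{s}\xi^{(k)}_{\alpha i}\xi^{(k')}_{\alpha i} - \bar\xi^{(k)}_{\alpha\cdot}\bar\xi^{(k')}_{\alpha\cdot} + \bar\xi^{(k)}_{\alpha\cdot}\bar\xi^{(k')}_{\alpha\cdot}\right] - \frac{1}{r_j}\left(\bar\xi^{(k)}_{\cdot\cdot}\bar\xi^{(k')}_{\cdot\cdot}\right)\\
&\quad= \frac{1}{nsr_j}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\left[\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\alpha\cdot}\right]\left[\xi^{(k')}_{\alpha i} - \bar\xi^{(k')}_{\alpha\cdot}\right] + \frac{1}{nr_j}\sum_{\alpha=1}^{n}\bar\xi^{(k)}_{\alpha\cdot}\bar\xi^{(k')}_{\alpha\cdot} - \frac{1}{r_j}\left(\bar\xi^{(k)}_{\cdot\cdot}\bar\xi^{(k')}_{\cdot\cdot}\right)\\
&\quad= \frac{1}{nsr_j}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\left[\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\alpha\cdot}\right]\left[\xi^{(k')}_{\alpha i} - \bar\xi^{(k')}_{\alpha\cdot}\right] + \frac{1}{nr_j}\sum_{\alpha=1}^{n}\left[\bar\xi^{(k)}_{\alpha\cdot} - \bar\xi^{(k)}_{\cdot\cdot}\right]\left[\bar\xi^{(k')}_{\alpha\cdot} - \bar\xi^{(k')}_{\cdot\cdot}\right]\\
&\quad= r_j^{-1}\left[w^{(1)}_{N',kk'} + w^{(2)}_{N',kk'}\right].
\end{aligned}
\]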
The second term gives

Hence, the sum of the two terms gives
\[
n\,\mathrm{Cov}_{P^*_n}\left(T^{(k)}_{N',j},\, T^{(k')}_{N',j}\right) = \frac{s - r_j}{r_j(s-1)}\,w^{(1)}_{N',kk'} + w^{(2)}_{N',kk'}.
\]
On the other hand, let $B = \{(i,i') : i \in P_j,\ i' \in P_{j'},\ i = i'\}$. The cardinality of $B$ is
$r_{jj'}$.
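Thus
\[
\begin{aligned}
&nE_{P^*_n}\left[T^{(k)}_{N',j} - E\left(T^{(k)}_{N',j}\right)\right]\left[T^{(k')}_{N',j'} - E\left(T^{(k')}_{N',j'}\right)\right]\\
&\quad= nE_{P^*_n}\left[\left(\frac{1}{nr_j}\sum_{\alpha=1}^{n}\sum_{i\in P_j}\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\right)\times\left(\frac{1}{nr_j}\sum_{\beta=1}^{n}\sum_{i'\in P_{j'}}\left(\xi^{(k')}_{\beta i'} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right)\right]\\
&\quad= \frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{(i,i')\in B}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\alpha i} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right]\\
&\qquad+ \frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{(i,i')\in A}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\alpha i'} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right]\\
&\qquad+ \frac{1}{nr_j^2}\sum_{\alpha\ne\beta=1}^{n}\sum_{(i,i')\in B}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\beta i} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right]\\
&\qquad+ \frac{1}{nr_j^2}\sum_{\alpha\ne\beta=1}^{n}\sum_{(i,i')\in A}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\beta i'} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right].
\end{aligned}
\]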
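The first term then yields
\[
\begin{aligned}
&\frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{(i,i')\in B}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\alpha i} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right]\\
&\quad= \frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{(i,i')\in B}E_{P^*_n}\left(\xi^{(k)}_{\alpha i}\xi^{(k')}_{\alpha i}\right) - \frac{r_{jj'}}{r_j^2}\left(\bar\xi^{(k)}_{\cdot\cdot}\bar\xi^{(k')}_{\cdot\cdot}\right)\\
&\quad= \frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{(i,i')\in B}\left[\frac{1}{s}\sum_{i=1}^{s}\xi^{(k)}_{\alpha i}\xi^{(k')}_{\alpha i}\right] - \frac{r_{jj'}}{r_j^2}\left(\bar\xi^{(k)}_{\cdot\cdot}\bar\xi^{(k')}_{\cdot\cdot}\right)\\
&\quad= \frac{r_{jj'}}{nr_j^2}\sum_{\alpha=1}^{n}\left[\frac{1}{s}\sum_{i=1}^{s}\xi^{(k)}_{\alpha i}\xi^{(k')}_{\alpha i} - \bar\xi^{(k)}_{\alpha\cdot}\bar\xi^{(k')}_{\alpha\cdot} + \bar\xi^{(k)}_{\alpha\cdot}\bar\xi^{(k')}_{\alpha\cdot}\right] - \frac{r_{jj'}}{r_j^2}\left(\bar\xi^{(k)}_{\cdot\cdot}\bar\xi^{(k')}_{\cdot\cdot}\right)\\
&\quad= \frac{r_{jj'}}{r_j^2}\left[\frac{1}{ns}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\left[\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\alpha\cdot}\right]\left[\xi^{(k')}_{\alpha i} - \bar\xi^{(k')}_{\alpha\cdot}\right]\right] + \frac{r_{jj'}}{r_j^2}\left[\frac{1}{n}\sum_{\alpha=1}^{n}\bar\xi^{(k)}_{\alpha\cdot}\bar\xi^{(k')}_{\alpha\cdot} - \bar\xi^{(k)}_{\cdot\cdot}\bar\xi^{(k')}_{\cdot\cdot}\right]\\
&\quad= \frac{r_{jj'}}{r_j^2}\left[\frac{1}{ns}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\left[\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\alpha\cdot}\right]\left[\xi^{(k')}_{\alpha i} - \bar\xi^{(k')}_{\alpha\cdot}\right]\right] + \frac{r_{jj'}}{r_j^2}\left[\frac{1}{n}\sum_{\alpha=1}^{n}\left[\bar\xi^{(k)}_{\alpha\cdot} - \bar\xi^{(k)}_{\cdot\cdot}\right]\left[\bar\xi^{(k')}_{\alpha\cdot} - \bar\xi^{(k')}_{\cdot\cdot}\right]\right]\\
&\quad= \frac{r_{jj'}}{r_j^2}\left[w^{(1)}_{N',kk'} + w^{(2)}_{N',kk'}\right].
\end{aligned}
\]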
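The second term gives
\[
\frac{1}{nr_j^2}\sum_{\alpha=1}^{n}\sum_{(i,i')\in A}E_{P^*_n}\left[\left(\xi^{(k)}_{\alpha i} - \bar\xi^{(k)}_{\cdot\cdot}\right)\left(\xi^{(k')}_{\alpha i'} - \bar\xi^{(k')}_{\cdot\cdot}\right)\right].
\]
The details were skipped because they were done earlier. Therefore, the sum of the
two terms gives
\[
n\,\mathrm{Cov}_{P^*_n}\left(T^{(k)}_{N',j},\, T^{(k')}_{N',j'}\right) = d^{(1)}_{jj'}w^{(1)}_{N',kk'} + d^{(2)}_{jj'}w^{(2)}_{N',kk'}.
\]
•
Hence the theorem.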
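Define the following
\[
H^{(k,k')}_{N',1}(x,y) = \frac{1}{ns}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}I\left[\left(h^{(k)*}_{\alpha i},\, h^{(k')*}_{\alpha i}\right) \le (x,y)\right],
\]
\[
H^{(k,k')}_{N',2}(x,y) = \frac{1}{ns(s-1)}\sum_{\alpha=1}^{n}\sum_{i\ne i'=1}^{s}I\left[\left(h^{(k)*}_{\alpha i},\, h^{(k')*}_{\alpha i'}\right) \le (x,y)\right].
\]
Moreover, denote the marginal c.d.f. of $h^{(k)*}_{\alpha i}$ by $F^{(k)}_i(x)$ and note that under $H_{0B}$ it
is equal to $F^{(k)}(x)$, which is independent of $i$. Also, for every $i = 1,\ldots,s$, denote by
$F^{(k,k')}_{i}(x,y)$ ($= F^{(k,k')}_{1}(x,y)$ under $H_{0B}$) the marginal c.d.f. of $(h^{(k)*}_{\alpha i}, h^{(k')*}_{\alpha i})$. Finally, for
$i \ne i' = 1,\ldots,s$, let $F^{(k,k')}_{ii'}(x,y)$ ($= F^{(k,k')}_{2}(x,y)$ under $H_{0B}$) be the marginal c.d.f. of
$(h^{(k)*}_{\alpha i}, h^{(k')*}_{\alpha i'})$. Define
\[
w_i^{(kk')} = \iint_{R^2} J^{(k)}\left[F^{(k)}(x)\right]J^{(k')}\left[F^{(k')}(y)\right]dF^{(k,k')}_{i}(x,y) - \mu_k\mu_{k'},
\]
$i = 1,2$, for $k,k' = 1,\ldots,p+q$, where $\mu_k = \int_0^1 J^{(k)}(u)\,du$. Note that $F^{(k,k)}_{1}(x,y)$
reduces to the univariate c.d.f. $F^{(k)}(x)$ as $x = y$ almost everywhere. Further let
\[
W_i = \left(\left(w_i^{(kk')}\right)\right)_{k,k'=1,\ldots,p+q}, \quad i = 1,2;
\]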
E = (8 - 1) D(1)
8
_
~D(2).,
8
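Theorem 4.2 Under $H_{0B}$, $C_{N'}$ defined in Theorem 4.1 converges in probability (as
$n \to \infty$) to $\Psi$ defined above.
Proof:
Note that $S^{(k)}_{\alpha i} = N'H^{(k)}_{N'}\left(h^{(k)}_{\alpha i}\right)$. Hence, $\xi^{(k)}_{\alpha i} = J^{(k)}_{N'}\left[\frac{N'}{N'+1}H^{(k)}_{N'}\left(h^{(k)}_{\alpha i}\right)\right]$. It
follows that

Rewrite $w^{(1)}_{N',kk'}$ as follows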
where $R^2$ is the Euclidean plane, and the last term follows from Theorem 5.4.2 of
Puri and Sen (1971). Note that $H^{(k)}_{N'} \to H^{(k)}$ ($= F^{(k)}$ under $H_{0B}$), where $H^{(k)}$ is the
population combined c.d.f. Hence we have
\[
W^{(1)}_{N'} \xrightarrow{P} \frac{s-1}{s}\left[W_1 - W_2\right].
\]
Let us rewrite $w^{(2)}_{N',kk'}$ as
(2)
WN',kk'
~
f
-
[~~~) ~~k)] [~~~') _~~.k')]
0'=1
~
ns
~
ns
t t t [(~~~) -~~.k») - (~~~:) _~~.k'»)]
0'=1 i=1 i'=1
t t [(~~:) - ~~.k») _ (~~~') _ ~~.k'»)]
0'=1 i=1
f
+_1 ~ ~ [(~(~) _ ~(k»)
ns 2 LJ LJ
at..
_ (~(~:)
_ ~(k'»)]
at
..
0'=1 i=1
~ ~ [w~k' + w;k']
Hence we have
•
Thus the theorem.
Theorem 4.3 $n^{1/2}\left[\mathrm{Vec}(T_{N'}) - \bar\xi \otimes J\right]$ converges in distribution to a multivariate normal
distribution with null mean vector and dispersion matrix $C_{N'}$.
The proof is found in Theorem 4.1 of Sen (1969). Now, construct the residual rank
statistics as we did in the intra-block case. Partition $\left[\mathrm{Vec}(T_{N'}) - \bar\xi \otimes J\right]$ into two components: the first component, $T_{N'0}$, is a $qv \times 1$ vector of the centered linear rank
statistics corresponding to the surrogate variables, and the second component, $T_{N'+}$,
is a $pv \times 1$ vector of centered linear rank statistics corresponding to the concomitant
variates. Similarly, partition the matrix $C_{N'}$ as follows
\[
C_{N'} = \begin{pmatrix} C_{N'00} & C_{N'0+} \\ C'_{N'0+} & C_{N'++} \end{pmatrix} \tag{4.2}
\]
where $C_{N'00}$ is a $qv \times qv$ matrix, $C_{N'0+}$ is $qv \times pv$ and $C_{N'++}$ is a $pv \times pv$ matrix.
Then, from the classical normal theory (see Theorem 2.5.1 of Anderson (1984)), for
large $n$ we have

Define the $qv \times 1$ residual rank statistics $T^*_{N'0}$, and the covariance matrix of $n^{1/2}T^*_{N'0}$,
$C^*_{N'00}$, as follows
\[
C^*_{N'00} = C_{N'00} - C_{N'0+}C^{(-1)}_{N'++}C'_{N'0+}. \tag{4.3}
\]
4.1.1
RIBI For The Validation Set I*
Consider $n^*$ replications of a BIBD consisting of $s^*$ blocks of constant size $r^*\ (\ge 2)$ to
which $v$ treatments are applied such that all the assumptions made in the surrogate
set I are satisfied.
Let $S_i$ stand for the set of treatments occurring in the $i$th block, $i = 1,\ldots,s^*$. For the
$\alpha$th replicate, the response in the $i$th block receiving the $j$th treatment is a stochastic
$(p+q+1)$-vector
\[
U_{\alpha ij} = \left(Y_{\alpha ij}, X^{(1)}_{\alpha ij},\ldots,X^{(q)}_{\alpha ij}, Z^{(1)}_{\alpha ij},\ldots,Z^{(p)}_{\alpha ij}\right)' = \left(Y_{\alpha ij}, X'_{\alpha ij}, Z'_{\alpha ij}\right)'
\]
of measurements corresponding to the primary variate, the $q$ surrogates and the $p$ covariates.
Consider the same model in (4.1) and the assumptions following it, with the difference that here we regress the primary variate and the surrogates on the concomitant
variates. We basically repeat what we have done in the surrogate set with $N'$ replaced
by $N'_v\ (= n^*s^*)$. Thus $C_{N'}$ in theorem 6 is now a $(p+q+1)v \times (p+q+1)v$ matrix,
$C_{N'_v}$, which we partition as in (4.2). Then, corresponding to (4.3) we have
\[
C^*_{N'_v00} = C_{N'_v00} - C_{N'_v0+}C^{(-1)}_{N'_v++}C'_{N'_v0+}. \tag{4.4}
\]
Partition the $(q+1)v \times 1$ vector $T^*_{N'_v0}$ and the $(q+1)v \times (q+1)v$ matrix $C^*_{N'_v00}$ as
follows
\[
T^*_{N'_v0} = \left(T^{*\prime}_{N'_v0(Y)},\, T^{*\prime}_{N'_v0(X)}\right)', \qquad
C^*_{N'_v00} = \begin{pmatrix} C^*_{N'_vYY} & C^*_{N'_vYX} \\ C^{*\prime}_{N'_vYX} & C^*_{N'_vXX} \end{pmatrix},
\]
where $C^*_{N'_vYY}$ is $v \times v$, $C^*_{N'_vYX}$ is $v \times qv$, and $C^*_{N'_vXX}$ is $qv \times qv$. Also let
\[
T^*_{N'_v0(Y:X)} = T^*_{N'_v0(Y)} - C^*_{N'_vYX}C^{*(-1)}_{N'_vXX}T^*_{N'_v0(X)}, \tag{4.5}
\]
\[
C^*_{N'_v(Y:X)} = C^*_{N'_vYY} - C^*_{N'_vYX}C^{*(-1)}_{N'_vXX}C^{*\prime}_{N'_vYX}. \tag{4.6}
\]
Note that the dispersion matrix of the $v \times 1$ vector $n^{*1/2}T^*_{N'_v0(Y:X)}$ is given by the $v \times v$
matrix $C^*_{N'_v(Y:X)}$.
4.1.2
Construction of The Test Statistics
For the reasons mentioned in the beginning of Section 3.2 we work with $T^*_{N'0(X)} = bT^*_{N'0}$ instead of $T^*_{N'0}$, where $b$ is a $1 \times q$ vector subject to the two restrictions, i)
$b'b = 1$, and ii) the variance given in (4.3) is minimum. Denote the $v \times v$ dispersion
matrix of $n^{1/2}T^*_{N'0(X)}$ by $C^*_{N'X}$.
The next step consists of combining the interblock and intrablock information.
Since the tests developed in Chapters 3 and 4 are independent, we will take a weighted
linear combination of the four tests where the weights are inverses of the corresponding
dispersion matrices. Recall that we let $A_1 = B^*_{NX}$ and $A_2 = B^*_{N_v(Y:X)}$, developed in
Chapter 3. Let $A_3 = C^*_{N'X}$, and let $A_4 = C^*_{N'_v(Y:X)}$, developed in this chapter. Then
let the weights be defined by
\[
W_i = \left[A_1^{-1} + A_2^{-1} + A_3^{-1} + A_4^{-1}\right]^{-1}A_i^{-1}, \quad i = 1,2,3,4. \tag{4.7}
\]
Now, noting that the mean of each of the four statistics $T^*_{N0(X)}$, $T^*_{N_v0(Y:X)}$, $T^*_{N'0(X)}$, and
$T^*_{N'_v0(Y:X)}$ is 0, consider the linear combination

Assume that $n/n^* \to \rho$ for positive and finite real $\rho$. Then $n^{1/2}T^{0*}$ is conditionally
distribution-free, under the permutational model, with 0 mean and covariance matrix

and the permutational multivariate central limit theorem of Sen (1983) applies here.
Thus, for large $N$, $N_v$, $N'$, and $N'_v$, $n^{1/2}T^{0*} \sim N_v(0, A)$. Finally consider the overall
test statistic
(4.9)
The exact permutational distribution of $L^{0*}$ can be obtained by enumeration if $N$,
$N_v$, $N'$, and $N'_v$ are small to moderate. For large $N$, $N_v$, $N'$, and $N'_v$ the null
distribution of $L^{0*}$ is, by Cochran (1934), asymptotically chi-squared with $v - 1$
degrees of freedom.
4.1.3
Asymptotic Non-null Distribution of $L^{0*}$
For the study of the non-null distribution of $L^{0*}$ we will consider local alternatives
only. Recall that we have found in Section 3.3 that the covariance matrices of the
linear rank statistics in both sets I and I* under $H_N$ converge to the same dispersion
matrices as under $H_{0w}$. On the other hand, the test statistics had a different mean under
$H_N$ than under $H_{0w}$. The same results hold for the tests using block totals. To see this
let us first work with the surrogate set I. Write $G_i(u) = G_i(x, z)$, $x \in R^q$, $z \in R^p$, and
consider the following sequence of alternative hypotheses:
(4.10)
where $\Lambda = ((\lambda_j^{(k)}))$ stands for a $v \times q$ matrix of treatment effects. Thus, under $\{H_N\}$
\[
F^{(k)}_i(x) = F^{(k)}\left[x - n^{-1/2}\lambda_j^{(k)}\right], \quad k = 1,\ldots,q.
\]
We assume that $F^{(k)}_i(x)$ are all absolutely continuous. Let $f^{(k)}(x)$ be the density
function corresponding to $F^{(k)}(x)$. Let
(4.11)
where
\[
H^{(k)}(x) = \frac{1}{s}\sum_{i=1}^{s}F^{(k)}_i(x), \quad k = 1,\ldots,q.
\]
Now expand $F^{(k)}\left[x - n^{-1/2}\lambda_j^{(k)}\right]$ in a Taylor series around $x$ for a fixed $x$. To first order, we will
have
\[
F^{(k)}\left[x - n^{-1/2}\lambda_j^{(k)}\right] = F^{(k)}(x) - n^{-1/2}\lambda_j^{(k)}f^{(k)}(x) + o(n^{-1/2}).
\]
J1S~l
1.:
1:
1:
1:
J(k) [H(k)(x)] dF?)(x)
J(k) [F(k)(x)] dF(k) [x - n- 1 / 2 )..;k)]
J(k) [F(k)(x)] dF(k)(x)
+
J(k) [F(k)(x)] d{F(k) [x - n- 1 / 2 >.;k)] - F(k)(x)}
+o(n- 1 / 2 )
Now integrate by parts the second integral in the last term
(k)
[1
f-ln,i
k
+
J(k)(u)du
-1
00
{F(k) [x _ n- 1 / 2 A)k)] _ F(k)(x)}
-00
~J(k)(F(k)(x))
dx
o(n- 1/ 2)
[1 J(k)(u)du
Jo
+ (n- 1/ 2)A)k)
[1 J(k)(u)du
+ (n- 1/ 2)A(k)
Jo
1 ~J(k)(F(k)(X))f(k)(x)dx+
00
-00
J
dx
1 ~J(k)(F(k)(X))dF(k)(x)+
00
-00
·dx
o(n-1/2)
o(n-1/2)
where the last two steps resulted from a Taylor series expansion of F(k) [x - n- 1 / 2 A)k)]
around x for a fixed x. Thus, if we let
then from the above we have
Now, let $Q^{(k)}_{n,i} = \frac{1}{n}\sum_{\alpha=1}^{n}\xi^{(k)}_{\alpha i}$, $i = 1,\ldots,s$, $k = 1,\ldots,p+q$. Note that
$T^{(k)}_{N',j} = \frac{1}{r_j}\sum_{i\in P_j}Q^{(k)}_{n,i}$.
Then along the same lines of Theorem 5.1 of Sen (1969), $\left[n^{1/2}\left(Q^{(k)}_{n,i} - \mu^{(k)}_{n,i}\right),\ i = 1,\ldots,s,\ k = 1,\ldots,p+q\right]$ has asymptotically a multinormal distribution
with null mean and dispersion matrix whose entries for $i = i'$ converge to $w_1^{(kk')}$, and
for $i \ne i'$ converge to $w_2^{(kk')}$. Thus, by condition (a) of Section 2.2.1 and Theorem 5.1
of Sen (1969) we have
lim E
n-+oo
[n (T(~). - ~(k)) I H
1 2
/
N ,J
..
N]
-
12
2:
J~~
n / (Ji~:l- fal ](k)(U)dU)] I HN}
te~
0
E {[:..
J
B(F(k)) :.
J
2: )..}k)
iePj
B(F(k)))..}k)
=
= (( ,Y))) j=l,...,v,k=l,...,q , where IY) = B(F(k)))..}k) Then, by Theorem 5.2 of Sen
(1969), {n 1 / 2 (T;;'~j - ~~k)), k = 1,·,· ,p + q, j = 1,·'·, v} have jointly asymptot-
Let I
ically a multinormal distribution with mean I and dispersion matrix W. Denote by
WI the version of W pertaining to the surrogate set, I, and by II the version of I in
I. Partition WI as we partitioned CN' in (4.2) and corresponding to (4.3) let
$$\Psi^*_{I00} = \Psi_{I00} - \Psi_{I0+}\Psi_{I++}^{(-1)}\Psi'_{I0+}.$$
Then for large $n$
and its dispersion matrix is $\Psi^*_{IX}$, which is the population version of $C^*_{NX}$.
In a similar fashion, we define $\{H_N\}$ for the validation set $I^*$ such that $((\lambda_j^{(k)}))$ is now a $v \times (q+1)$ matrix. Also, $\gamma_{I^*}$ is defined in the same way as $\gamma_I$ with the difference that $k = 1, \dots, q+1$. $C_{N_v} \sim \Psi_{I^*}$, which is the version of $\Psi$ defined in the validation set $I^*$. Thus, corresponding to (4.7) and (5.1) we have
$$T^*_{N_v0(Y:X)} = T^*_{N_v0(Y)} - \Psi^*_{I^*YX}\Psi^{*(-1)}_{I^*XX}T^*_{N_v0(X)},$$
$$\Psi^*_{I^*(Y:X)} = \Psi^*_{I^*YY} - \Psi^*_{I^*YX}\Psi^{*(-1)}_{I^*XX}\Psi^{*\prime}_{I^*YX}.$$
Hence for large $n^*$
and the dispersion matrix of $n^{*1/2}T^*_{N_v0(Y:X)}$ is $\Psi^*_{I^*(Y:X)}$.
For now, focus only on the surrogate and validation subsets. Consider the weights
$W_i$ defined in (3.9) where now we have $A_1 = \Omega^*_{IX}$, $A_2 = \Omega^*_{I^*(Y:X)}$, $A_3 = \Psi^*_{IX}$, and $A_4 = \Psi^*_{I^*(Y:X)}$. Thus, under $\{H_N\}$, the mean of $n^{1/2}T^{0*}$ in (4.9) is
Therefore, from Theorem 5.3 of Sen (1969) and under $\{H_N\}$, $L^{0*}$ has asymptotically a noncentral chi-squared distribution with $v - 1$ degrees of freedom and the noncentrality parameter
Chapter 5
General Case of a Vector of
Primary Variates
It is hard to find an ideal surrogate that fully captures the treatment effect on the
true endpoint, because a new treatment may affect the true endpoint through other
potential surrogate variates. Hence, there is a need to consider all possible surrogates
so as to avoid losing information about treatment effects on the primary variates.
Moreover, it is often difficult/costly to record data on all primary variates (as
well as surrogates and covariates) for all experimental units.
Consider a vector
of primary response variates $(Y^{(1)}, \dots, Y^{(m)})'$, and a vector of surrogate responses
$(X^{(1)}, \dots, X^{(q)})'$ that we partition into various subsets on which measurements are
obtained for different numbers of experimental units. Such a partition is done based
upon practical consideration of medical as well as economic factors. This is the genesis
of incomplete multiresponse designs (IMD).
Sen (1994) formulates a general class of IMDs in the following manner. Let
$P$ denote $\{1, \dots, q+m\}$ and consider the totality of $2^{q+m}$ subsets of $P$, defined by
$(i_r, i^*_s) = \{i_1, \dots, i_r;\ i^*_1, \dots, i^*_s\}$, for all possible $1 \le i_1 < \dots < i_r \le m$, $1 \le i^*_1 < \dots < i^*_s \le q$; $1 \le r \le m$, $1 \le s \le q$, and include the null sets $i_0$ or $i^*_0$ in this system.
Consider a proper subset $P_0$ of $P$, determined by clinical and other factors, such that
$P_0 = \{(i_r, i^*_s) : (r, s, i_r, i^*_s) \in I_0\}$, where $I_0$ is the set corresponding to $P_0$. Then the
set of all experimental units is partitioned into the system
where in the subset $S(i_r, i^*_s)$ the primary response variates $Y^{(i_1)}, \dots, Y^{(i_r)}$ and the
surrogate response variates $X^{(i^*_1)}, \dots, X^{(i^*_s)}$ are to be recorded on the experimental
units.
For the subset $S(i_r, i^*_s)$, an efficient design $D(i_r, i^*_s)$ (for the treatment as well as design
variates) may be adopted, leading to the design sets
Thus, an incomplete multiresponse design that incorporates surrogate endpoints
can be formulated in terms of the dual design sets, responsewise design set and
treatmentwise design set, namely {S(Po),D(Po)}.
Assume that there are $k_1$ surrogate subsets $I_1, \dots, I_{k_1}$ of sizes $n_1, \dots, n_{k_1}$ that contain measurements on surrogate and concomitant variates, and $k_2$ validation subsets $I^*_1, \dots, I^*_{k_2}$ of sizes $n^*_1, \dots, n^*_{k_2}$, with measurements on both primary and surrogate endpoints along with covariates.
Assume also that a $p$-vector of covariates is easily
measured on all subjects in the respective subsets.
5.1
Intra- and Inter-Block Inference
Consider $n^*$ replications of a BIBD consisting of $s$ blocks of constant size $r\ (\ge 2)$ to
which $v$ treatments are applied in subset $I_1$, such that all the previous assumptions
discussed in Chapter 3 apply. For the $\alpha$th replicate, the response in the $i$th block receiving the $j$th treatment is a stochastic vector
$$U_{\alpha ij} = \left(X^{(i^*_1)}_{\alpha ij}, \dots, X^{(i^*_s)}_{\alpha ij}, Z^{(1)}_{\alpha ij}, \dots, Z^{(p)}_{\alpha ij}\right)' = \left(X'_{\alpha ij}, Z'_{\alpha ij}\right)'$$
of measurements corresponding to the $p$ covariates and the surrogate responses in $\{S(i_0, i^*_s) : (r, s, i_0, i^*_s) \in P_0\}$.
The focus here is on hypothesis testing. Moreover, since a BIBD is a
connected design, testability of hypotheses is ensured (see Bose (1947)).
Apply the same method developed in Chapter 3 to create a test statistic in each
of the surrogate subsets. Similarly, obtain a test statistic in each of the validation
sets. Moreover, obtain a test statistic in each of the surrogate as well as validation
subsets by modelling the block totals as explained in Chapter 4.
Thus, for within block inference we conceive of a vector of test statistics $(T^0_{I_1,W}, \dots, T^0_{I_{k_1},W})$
in the surrogate sets, and another vector of test statistics,
$(T^0_{I^*_1,W}, \dots, T^0_{I^*_{k_2},W})$, in the validation sets.
Also, for between block inference we
have a vector of tests $(T^0_{I_1,B}, \dots, T^0_{I_{k_1},B})$ in the surrogate sets, and a vector of tests
$(T^0_{I^*_1,B}, \dots, T^0_{I^*_{k_2},B})$ in the validation subsets.
Let $n = n_1 + \dots + n_{k_1} + n^*_1 + \dots + n^*_{k_2}$.
5.2
Construction of The Test Statistic
The next step consists of combining information from all subsets. Since the tests
developed in the different subsets are independent, we will take a weighted linear
combination of these tests with the weights being the inverses of the corresponding
dispersion matrices. Note that for such a combination to make sense the dimensions
of the tests should be the same. Thus, each test is appropriately transformed so as
to allow a meaningful linear combination.
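As a minimal sketch of the combination step, assume the univariate case, in which every (already transformed) subset statistic and its dispersion is a scalar. The function name `combine_inverse_variance` and the numbers are hypothetical and are not taken from the data; the matrix case replaces reciprocals by matrix inverses.

```python
def combine_inverse_variance(stats, variances):
    """Combine independent statistics with weights proportional to 1/variance.

    Returns (combined statistic, its variance, normalized weights).
    """
    if len(stats) != len(variances) or not stats:
        raise ValueError("need one variance per statistic")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    weights = [p / total for p in precisions]
    combined = sum(w * t for w, t in zip(weights, stats))
    # By independence, Var(sum w_i T_i) = sum w_i^2 var_i = 1 / total.
    combined_var = 1.0 / total
    return combined, combined_var, weights


# Hypothetical statistics from one surrogate subset and one validation subset.
t0, var0, w = combine_inverse_variance([0.30, 0.10], [0.04, 0.01])
```

The subset with the smaller dispersion receives the larger weight, which is the sense in which the inverses of the dispersion matrices act as weights.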
For simplicity of notation replace the within block tests in the surrogate and
validation subsets by their transformed versions $(T_{1,W}, \dots, T_{n,W})$, and the between
block tests in all subsets by $(T_{1,B}, \dots, T_{n,B})$. Also replace their corresponding dispersion matrices by $(A_{1,W}, \dots, A_{n,W})$ and $(A_{1,B}, \dots, A_{n,B})$.
Define
for $i = 1, \dots, n$, $j \in \{W, B\}$. Consider the linear combination
Assume that $n_j/n_1 \to \rho_j$ for $j = 2, \dots, k_1$, and $n^*_l/n_1 \to \rho^*_l$ for $l = 1, \dots, k_2$, where the
$\rho$'s are all positive and finite real numbers. Then $n_1^{1/2}T^{0*}$ is conditionally distribution-free
under the permutational model, and the permutational multivariate central limit
theorem of Sen (1983) applies here. Hence $n_1^{1/2}T^{0*}$ is asymptotically multinormal with 0 mean and
covariance matrix
Finally consider the overall test statistic
(5.1 )
The exact permutational distribution of $L^{0*}$ can be obtained by enumeration if $n$
is small to moderate, while for large $n$, the null distribution of $L^{0*}$ is, by the Cochran
(1934) theorem, asymptotically chi-squared with $v - 1$ degrees of freedom.
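Enumeration of a permutational null distribution can be sketched in the simplest setting: blocks of size two and a scalar statistic. This is an illustration under those assumptions, not the statistic $L^{0*}$ itself; the function name `exact_sign_flip_pvalue` and the scores are hypothetical.

```python
from itertools import product


def exact_sign_flip_pvalue(differences):
    """Exact two-sided permutation p-value for blocks of size two.

    `differences` holds, for each block, the treatment-1 score minus the
    treatment-2 score. Under the null hypothesis, swapping the two labels
    within a block flips the sign of its difference, so all 2**s sign
    patterns are equally likely.
    """
    s = len(differences)
    observed = abs(sum(differences))
    count = 0
    for signs in product((1, -1), repeat=s):
        value = abs(sum(sign * d for sign, d in zip(signs, differences)))
        if value >= observed - 1e-12:  # tolerance for floating-point sums
            count += 1
    return count / 2 ** s


p_exact = exact_sign_flip_pvalue([0.4, 0.1, 0.3, 0.2, 0.5])
```

The number of patterns doubles with every added block, which is why enumeration is practical only for small to moderate sample sizes.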
5.3
Asymptotic Non-Null Distribution of $L^{0*}$
The dispersion matrices of the tests developed in the different subsets converge, under
local alternatives, to the same limits as under the null. The means of the test statistics,
however, are shifted under local alternatives. The distribution of $L^{0*}$ under local alternatives is a noncentral chi-squared. The details follow in the same way as has been
done in Chapters 4 and 5.
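The effect of a noncentrality parameter on power can be sketched by simulation, using the fact that a noncentral chi-squared variate is a sum of squared unit-variance normals whose mean vector has squared length equal to the noncentrality. The function names, the noncentrality value 4.0, and the number of draws are assumptions chosen only for illustration.

```python
import random


def simulate_noncentral_chi2(df, noncentrality, draws, seed=0):
    """Draw noncentral chi-squared variates as sums of squared shifted normals."""
    rng = random.Random(seed)
    # Put the whole shift on the first coordinate; only its squared length matters.
    shift = noncentrality ** 0.5
    samples = []
    for _ in range(draws):
        total = (rng.gauss(0.0, 1.0) + shift) ** 2
        total += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df - 1))
        samples.append(total)
    return samples


def empirical_power(df, noncentrality, critical_value, draws=20000, seed=0):
    """Share of simulated statistics exceeding a central chi-squared critical value."""
    samples = simulate_noncentral_chi2(df, noncentrality, draws, seed)
    return sum(x > critical_value for x in samples) / draws


# 3.841 is the upper 5% point of the central chi-squared with 1 degree of freedom.
power = empirical_power(df=1, noncentrality=4.0, critical_value=3.841)
```

A larger noncentrality shifts the distribution to the right and so raises the rejection rate at a fixed critical value.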
Assume that the mean of the respective tests under $H_N$ defined in Chapter 4 is
a $2n \times 1$ vector, $(\mu_{1,W}, \dots, \mu_{n,W}, \mu_{1,B}, \dots, \mu_{n,B})'$.
Define the weights $W_i$ using the population versions of the covariance matrices.
Thus, under $\{H_N\}$, the mean of $n_1^{1/2}T^{0*}$ is
Therefore, from Theorem 5.3 of Sen (1969) and under $\{H_N\}$, $L^{0*}$ has asymptotically a
noncentral chi-squared distribution with $v - 1$ degrees of freedom and the noncentrality
Chapter 6
Properties of The Test Statistics
And an Illustration
6.1
Introduction
This chapter studies the asymptotic relative efficiency (ARE) of the test
statistics $L^{0*}$ developed in Chapters 2, 3, and 4 with respect to the parametric version
constructed in exactly the same way using the least squares estimates (lse's) of the
treatment effects instead of the linear rank statistics $T_{N,j}^{(k)}$.
Also, calculations for the complete block case will be done on the Burroughs-Wellcome data mentioned at the end of Chapter 1 to illustrate the method developed.
6.2
ARE for the Complete Block Case
Consider the same setup in the surrogate set as in Section 2.2.1.
The model in (2.1) now assumes the errors to be normally and independently distributed.
This means that $\epsilon_{ij}$ now has a $(p+q)$-variate multinormal distribution with null mean and
covariance $\Sigma$, independently for all $i = 1, \dots, s$, $j = 1, \dots, r$.
The least squares estimates of the treatment effects are defined by
which is the average of the aligned variable $U_{ij}^{(k)}$ over blocks. The alignment is done
by subtracting the block average from $U_{ij}^{(k)}$. Moreover, $V_N$, the $(p+q) \times (p+q)$
matrix defined in Theorem 2.1, now has elements $v_{kk'}$ given by
$k, k' = 1, \dots, p+q$, and
where $C_N$ is the matrix of coefficients defined in Theorem 2.1, and $T_N$ is the $r \times
(p+q)$ matrix of least squares estimates corresponding to the different treatments and
variates.
Repeat the same method developed in the nonparametric setup. Partition $V_N$
as in (2.11), then obtain $T^*_{N0}$, a $q \times r$ matrix of residual lse's in the surrogate set.
Similarly, obtain $T^*_{N_v0(Y:X)}$, a $1 \times r$ vector of residual lse's in the validation set. Build
the weighted linear combination $T^{0*}$ as in (2.22). Finally construct the quadratic
form $L^{0*}$ using $T^{0*}$ and its covariance matrix.
Whenever the errors have a finite dispersion matrix ($\Sigma$), the lse's are asymptotically normal, and hence the null distribution of $L^{0*}$ is asymptotically chi-squared
with $r - 1$ degrees of freedom. However, under the sequence of alternative hypotheses $\{H_N\}$ in (2.24), $\tau = n^{-1/2}\lambda$, and
thus the mean of $T$ undergoes a location shift
leading to the following shift in the mean of $T$
(6.1)
Let $A_{ls}$ be the version similar to $A$, the covariance of the weighted linear combination of the tests $T^{0*}$ under $\{H_N\}$. Note that the limit of the expected value of $V_N$
for large $s$ is $\frac{r-1}{r}\Sigma$ under both $H_0$ and $H_N$. Thus, construct $A$ as was done in Section
2.2.5 but using $\frac{r-1}{r}C_N \otimes \Sigma$ instead of $\nu$. Let $\mu^{0*}_{ls}$ be the mean of $T^{0*}$ under $\{H_N\}$.
Then $L^{0*}$ has asymptotically a noncentral chi-squared distribution with $r - 1$ degrees
of freedom and noncentrality
$$\Delta_{L^{0*}_{ls}} = \left(\mu^{0*}_{ls}\right)' A_{ls}^{(-1)} \left(\mu^{0*}_{ls}\right).$$
Thus, the ARE is given by
$$\mathrm{ARE} = \frac{\Delta_{L^{0*}}}{\Delta_{L^{0*}_{ls}}}.$$
Rewrite $\mu^{0*} = M_1\lambda^*$ and $\mu^{0*}_{ls} = M_2\lambda^*$, where $M_1$ and $M_2$ are both $1 \times (2q+1)$, and
$\lambda^*$ is the $(2q+1) \times r$ matrix of the shifts in treatment effects in the surrogate set stacked
over the $(q+1) \times r$ matrix in the validation set. Moreover, let $B_1^0 = M_1' A^{(-1)} M_1$
and let $B_2^0 = M_2' A_{ls}^{(-1)} M_2$. Then the ARE becomes
In the univariate case $\lambda^*$, $B_1^0$, and $B_2^0$ are all scalars. The ratio depends only on
the scalars $B_1^0$ and $B_2^0$.
In the multivariate case, however, the ARE depends upon $\lambda^*$, $B_1^0$, and $B_2^0$, and a
unique solution may not exist. No common value can be assigned for all possible directions.
Therefore, one considers the maximum and minimum
possible values over the variation of $\lambda^*$.
If $c_m(G,J)$ and $c_M(G,J)$ are respectively the minimum and maximum characteristic
roots of $B_1^0 B_2^{0(-1)}$, then by virtue of the well-known Courant theorem on the
extremum of the ratio of two non-negative definite quadratic forms we have
$$c_m(G,J) \le e(\lambda^*, B_1^0, B_2^0) \le c_M(G,J), \quad \forall \lambda^*.$$
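The bounds can be checked numerically in a small case. The sketch below assumes two hypothetical symmetric 2 x 2 matrices standing in for the two matrices of the quadratic forms, with the denominator matrix positive definite; the characteristic roots are obtained as the roots of $\det(B_1 - cB_2) = 0$, which coincide with the characteristic roots of $B_1 B_2^{-1}$.

```python
import math
import random


def generalized_roots_2x2(b1, b2):
    """Roots c of det(B1 - c*B2) = 0 for symmetric 2x2 B1 and positive definite B2."""
    (a11, a12), (_, a22) = b1
    (d11, d12), (_, d22) = b2
    # det(B1 - c*B2) = qa*c**2 + qb*c + qc
    qa = d11 * d22 - d12 * d12
    qb = -(a11 * d22 + a22 * d11 - 2.0 * a12 * d12)
    qc = a11 * a22 - a12 * a12
    disc = math.sqrt(max(qb * qb - 4.0 * qa * qc, 0.0))
    return (-qb - disc) / (2.0 * qa), (-qb + disc) / (2.0 * qa)


def quadratic_form(m, x):
    """x' M x for a symmetric 2x2 matrix M."""
    return m[0][0] * x[0] ** 2 + 2.0 * m[0][1] * x[0] * x[1] + m[1][1] * x[1] ** 2


b1 = [[2.0, 0.5], [0.5, 1.0]]  # hypothetical numerator matrix
b2 = [[1.0, 0.2], [0.2, 1.5]]  # hypothetical denominator matrix
c_min, c_max = generalized_roots_2x2(b1, b2)

# Ratio of the two quadratic forms along 1000 random directions.
rng = random.Random(1)
ratios = []
for _ in range(1000):
    x = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    ratios.append(quadratic_form(b1, x) / quadratic_form(b2, x))
```

Every simulated ratio falls between `c_min` and `c_max`, and the bounds are attained only along the corresponding characteristic directions.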
It may be noted that if the normal scores are used for $J$, and if $G$ is normal, then
$B(G^{(k)}) = \sigma_{kk}^{-1/2}$,
where $\sigma_{kk}$ is the $(k,k)$th entry of $\Sigma$ corresponding to the $k$th variate, and
$\nu$ defined in (2.10) is the same as $\Sigma$. These facts imply that $B_1^0 B_2^{0(-1)} = I$, and hence
$c_m(G,J) = c_M(G,J) = 1$, so that the normal scores $L^{0*}$ and $L^{0*}_{ls}$ are asymptotically
power equivalent.
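The value of $B(G^{(k)})$ for normal scores can be verified directly. The derivation below assumes the usual definition $B(F) = \int \frac{d}{dx}J[F(x)]\,dF(x)$ and writes $\mu_k$ for the mean of the $k$th variate, a symbol introduced here only for this check.

```latex
J(u) = \Phi^{-1}(u), \qquad
G^{(k)}(x) = \Phi\!\left(\frac{x - \mu_k}{\sigma_{kk}^{1/2}}\right)
\;\Longrightarrow\;
J\!\left[G^{(k)}(x)\right] = \frac{x - \mu_k}{\sigma_{kk}^{1/2}},
\qquad
\frac{d}{dx} J\!\left[G^{(k)}(x)\right] = \sigma_{kk}^{-1/2},

B\!\left(G^{(k)}\right)
= \int_{-\infty}^{\infty} \sigma_{kk}^{-1/2}\, dG^{(k)}(x)
= \sigma_{kk}^{-1/2}.
```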
It may be noted that in such a multivariate setup there is no generally uniformly
most powerful test of the null hypothesis. The least squares test developed here is
asymptotically equivalent to the likelihood ratio test when the distribution of the
errors is normal. But any departure from normality makes it nonrobust. Moreover,
the least squares test is not invariant to monotone transformation of the data, e.g.,
the log transformation. This is not the case with the rank procedure developed in
this work which remains invariant under coordinate-wise strictly monotone transformation, although it is not affine invariant. Therefore, there may exist some situations
when rank statistics fare much better, especially with heavy tails when the distribution
is strictly non-normal.
The same mechanism can be applied to the balanced incomplete block design to
find the ARE and the bounds on it. Consider the same model in (3.1). Assume that
the errors $\epsilon_{\alpha ij}$ have a $(p+q)$-variate multinormal distribution with null mean and covariance
$\Sigma$, which is assumed to be nonnegative definite.
The intra- as well as the inter-block tests developed in Chapters 4 and 5 are replaced here with the least squares versions of these tests. Thus, for intra-block inference,
the $v \times (p+q)$ matrix of least squares estimates, $T_{N,W}$, has entries given by
which is the average of the aligned variable $U_{\alpha ij}^{(k)}$ over blocks containing the $j$th treatment
and over all replicates. The alignment is done by subtracting the block average
from $U_{\alpha ij}^{(k)}$. Moreover, $V_N^{(1)}$ defined in Theorem 3.1 has now entries defined by
$$v^{(1)}_{N,kk'} = \frac{1}{nsr}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\sum_{j \in S_i}\left[U^{(k)*}_{\alpha ij} - U^{(k)*}_{\alpha i\cdot}\right]\left[U^{(k')*}_{\alpha ij} - U^{(k')*}_{\alpha i\cdot}\right], \quad k, k' = 1, \dots, p+q,$$
and $V_N^{(2)}$ defined in Theorem 3.1 has entries defined by
$$v^{(2)}_{N,kk'} = \frac{1}{ns}\sum_{\alpha=1}^{n}\sum_{i=1}^{s}\left[U^{(k)*}_{\alpha i\cdot} - U^{(k)*}_{\alpha\cdot\cdot}\right]\left[U^{(k')*}_{\alpha i\cdot} - U^{(k')*}_{\alpha\cdot\cdot}\right], \quad k, k' = 1, \dots, p+q,$$
where the dots in the subscripts indicate averages over the dotted subscript. Then,
with $A^{(1)}$ and $A^{(2)}$ defined as in Theorem 3.1 we have
$B_N$ is a $(p+q)v \times (p+q)v$ matrix, and
$$nV[\mathrm{Vec}(T_{N,W})] = B_N.$$
The expected value of $V_N^{(1)}$ for large $n$ converges to $\frac{r-1}{r}\Sigma$. Similarly, the
limit of the mean of $V_N^{(2)}$ for large $n$ is $\frac{s-1}{s}\cdot\frac{1}{r}\Sigma$. Thus if $A = \frac{r-1}{r}A^{(1)}$ and
$B = \frac{1}{r}\left(1 - \frac{1}{s}\right)A^{(2)}$, then the limit of the mean of $B_N$ for large $n$ is $(A + B) \otimes \Sigma$.
This limit can now be used to build the noncentrality parameter as was done in
Section 3.3 or in Section 4.1.3.
On the other hand, for inter-block inference, consider the model in (4.1) and
note that $\sum_{j \in S_i}\epsilon_{\alpha ij}$ has covariance $r\Sigma$. Let $T_{N,B}$ be the $v \times (p+q)$ matrix of least
squares estimates with entries given by
$$\hat{T}^{(k)}_{N,j} = \frac{1}{n r_j}\sum_{\alpha=1}^{n}\sum_{i \in P_j} h^{(k)}_{\alpha i}, \quad j = 1, \dots, v, \; k = 1, \dots, p+q.$$
The covariance matrix of $\hat{T}^{(k)}_{N,j}$ is the same as that defined in Theorem 4.1 with
the $\xi^{(k)}_{\alpha i}$ replaced by $h^{(k)}_{\alpha i}$. In such a case the mean of the matrix $W_N^{(1)}$
defined in Theorem 4.1 converges for large $n$ to $\frac{s-1}{s}(r\Sigma)$, and the
expected value of $W_N^{(2)}$ converges for large $n$ to $\frac{n-1}{n}\left(\frac{r}{s}\Sigma\right)$.
Define $D = \frac{r(s-1)}{s}D^{(1)}$ and $E = \frac{r(n-1)}{ns}D^{(2)}$, where $D^{(1)}$ and $D^{(2)}$ are defined in
Theorem 4.1. The expected value of the least squares version of $C_N$
defined in Theorem 4.1 converges for large $n$ to $(D + E) \otimes \Sigma$.
This limit can be used
to compute the noncentrality parameter under $H_N$ as done in Section 4.1.3 using the
least squares estimates. The ARE will be the ratio of the noncentrality parameters
as before, and the same conclusions arrived at in the complete case apply here too.
6.3
Example: The Effect of Zidovudine on Survival in Patients with AIDS
The example uses data from a double blind placebo-controlled clinical trial (study
02) conducted by Burroughs-Wellcome in the year 1986 to evaluate the effect of
Zidovudine (ZDV) on survival in patients with acquired immuno-deficiency syndrome
(AIDS). A total of 281 patients were enrolled in the study; 144 patients were assigned
to the ZDV group and 137 patients were assigned to the placebo group (See Tsiatis,
Gruttola and Wulfsohn (1995)).
The primary response is the length of survival, but because there is a need to
evaluate new therapies in a shorter period of time, surrogate outcomes are of interest.
Entry into the study was staggered over a period of five months between 19 February
1986 and 7 July 1986. The study was stopped after seven months because of a
significant decrease in mortality in patients treated with ZDV.
At that time Burroughs Wellcome conducted a second study (study 08) where
patients receiving placebo were then offered ZDV and followed for clinical outcomes.
Entry into study 08 was staggered over time between 16 September 1986 and 6 April
1987.
The survival time for this illustration was calculated differently for patients on
placebo and those on ZDV (Tsiatis et al. (1995)). Thanks to Dr. Michael Wulfsohn
for sending me the data sets and the SAS program that calculates the survival time.
The SAS code that calculates the survival time is attached in the Appendix. A
description of the SAS code that calculates the survival time follows.
Since most of the placebo patients were dying of AIDS after approximately 80
weeks of entry into study 02, their sample size became too small to allow a powerful
comparison with the ZDV group. Hence the survival time was truncated at 80 weeks
of entry into study 02. Let a denote the time of entry into study 02 plus 80 weeks.
If the death date, the date of study termination, and the last date the patient
was known to be alive were missing then the survival time was calculated as the
difference between 16 September 1986 and entry into study 02 irrespective of the
treatment group.
Let m stand for the minimum of entry into study 08 and 16 October 1986 where
the minimum is taken over nonmissing values of entry into study 08. By October
16 nearly 90% of placebo patients switched to ZDV. Let M denote the maximum of
the death date, the date off-study, the last date the patient was known to be alive,
and entry into study 08, where the maximum is taken over the nonmissing values of
these variables. The date off-study for a patient is the date when that patient was
no longer followed because of AIDS complications. Patients died shortly afterwards.
For placebo patients whose death date was not missing and was less than or
equal to m, the survival time is the time from entry into study 02 to the death date.
Otherwise, if either the death date was missing or was greater than m, then the
survival time was calculated as the time from entry into study 02 to the minimum
of entry into study 08, M, and 16 October 1986. Placebo patients were censored if
either their death date was missing or if it exceeded entry into study 08 or 16 October
1986.
For patients receiving ZDV whose death date was not missing and was less than
a, the survival time is the time from entry into study 02 to the death date. Otherwise,
if either their death date was missing or was greater than a, then the survival time
is the time from entry into study 02 to the minimum of M and a. Patients on ZDV
were censored if either their death date was missing or if it exceeded a.
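The rules just described can be sketched in Python as follows. This is a translation of the verbal description, not of the SAS program in the Appendix; the function and argument names are hypothetical, missing dates are passed as `None`, and a ZDV death falling exactly on $a$ is counted as a death, which is an assumption about the boundary case.

```python
from datetime import date, timedelta

CUTOFF_PLACEBO = date(1986, 10, 16)   # 16 October 1986
FALLBACK_DATE = date(1986, 9, 16)     # 16 September 1986
TRUNCATION = timedelta(weeks=80)


def survival_time(zdv, entry02, entry08=None, death=None, off_study=None, last_alive=None):
    """Return (days, event) following the rules described in the text.

    `event` is True when the death is counted and False when the patient
    is censored.
    """
    if death is None and off_study is None and last_alive is None:
        return (FALLBACK_DATE - entry02).days, False

    a = entry02 + TRUNCATION
    known = [d for d in (death, off_study, last_alive, entry08) if d is not None]
    big_m = max(known)                 # M in the text

    if zdv:
        if death is not None and death <= a:
            return (death - entry02).days, True
        return (min(big_m, a) - entry02).days, False

    m = min(d for d in (entry08, CUTOFF_PLACEBO) if d is not None)
    if death is not None and death <= m:
        return (death - entry02).days, True
    # min(entry into study 08, M, 16 October 1986) over nonmissing values
    # equals min(m, M).
    return (min(m, big_m) - entry02).days, False
```

For example, `survival_time(False, date(1986, 3, 1), death=date(1986, 6, 1))` gives `(92, True)`, a placebo death counted before the switch to ZDV.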
As mentioned in Section 1.2.1 both CD4 and CD8 cell counts (counts/liter) have
been proposed as surrogates for survival in AIDS patients. The CD4-lymphocyte
count has been proposed as a potential surrogate for human immunodeficiency virus
(HIV), the cause of AIDS, in many studies because of its observed correlation with
the clinical outcome, survival. CD4 counts decrease soon after infection with HIV
because the virus attacks the immune system. They continue to decline during the
asymptomatic stage of the disease until full-blown AIDS develops and terminates its
victims.
Patients with advanced disease have lower CD4 counts than those in the early
stage of infection. In this study CD4 counts were determined prior to treatment and
approximately every four weeks during therapy. The initial CD4 measurement was
used in this example. Moreover, as mentioned in Section 1.2.1, Baccheti et al.
(1992) found that CD8 counts add predictive power to the CD4 counts and thus would
produce more valid and useful surrogates than the CD4 counts alone. The CD8 cells
act both as deterrents of virus multiplication by a mechanism that is not very well
understood, and as accelerators of AIDS by expressing CD38, CD57, or HLA-DR.
(See Romagnani (1994) for a review of CD4 and CD8 cells in the progression of AIDS.)
Thus, CD8 count is considered as another surrogate in this example, and again only
the initial measurement was used.
At the time of this analysis covariate data were not available. Patients were
assigned random values for age generated as pseudo-random numbers independently
and uniformly distributed between 18 and 55.
To illustrate the theory, artificial responsewise and treatmentwise designs
were imposed. With regard to the responsewise design, only those patients who were
not censored were included in the validation set. The validation set data consist of
the survival time, CD4 count, CD8 count, and age. There were 24 patients on placebo
who were not censored. These were matched randomly with 24 out of 55 patients
who were not censored and were on ZDV. Thus a randomized block treatmentwise
design with 24 blocks was formed for the validation set.
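The random matching into blocks can be sketched as below. The patient identifiers, the seed, and the function name `match_into_blocks` are hypothetical; the actual random assignment used for the example is not reproduced.

```python
import random


def match_into_blocks(placebo_ids, zdv_ids, seed=0):
    """Randomly pair each placebo patient with a distinct ZDV patient.

    Returns a list of (placebo_id, zdv_id) blocks; surplus ZDV patients
    are left out of the design.
    """
    if len(zdv_ids) < len(placebo_ids):
        raise ValueError("need at least as many ZDV patients as placebo patients")
    rng = random.Random(seed)
    chosen = rng.sample(list(zdv_ids), len(placebo_ids))
    shuffled_placebo = list(placebo_ids)
    rng.shuffle(shuffled_placebo)
    return list(zip(shuffled_placebo, chosen))


# Hypothetical identifiers: 24 uncensored placebo and 55 uncensored ZDV patients.
blocks = match_into_blocks([f"P{i}" for i in range(24)], [f"Z{i}" for i in range(55)])
```

Each block then holds one patient per treatment, which gives the randomized block layout with 24 blocks.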
The surrogate set data consisted of the variables CD4 count, CD8 count, and
age. There were 113 patients on placebo and 89 patients on ZDV who were censored.
Twenty-four out of 55 patients who were not censored and were on ZDV were added to
the 89 patients, resulting in 113 patients on ZDV who were censored. These were randomly matched with the 113 patients on placebo who were censored. A randomized
block treatmentwise design with 113 blocks was formed for the surrogate set.
Descriptive statistics for the overall data, surrogate set, and validation set are
given in Table 1 in the Appendix. In the validation set, the overall correlation between
the survival time and the surrogate responses is given in Table 2 in the Appendix.
Moreover, the correlation between survival and the surrogates in each treatment group
in the validation set is given in Tables 3 and 4 in the Appendix. In Tables 3 and 4 the
negative correlation is due to chance fluctuations. The Spearman rank correlation
did not give negative values.
The mean of the survival time for placebo group is 109 days, whereas for the
ZDV group it is 360 days. A two sample t-test to test the equality of means of the
survival time of the placebo group and that of the ZDV group in the validation set
gave a p-value of 0.0001. The normal approximation for the Wilcoxon rank sum
test gave a p-value of 0.0001. This shows that the data support a more beneficial
treatment effect over placebo.
In the surrogate subset, CD4 count, CD8 count, and age were aligned by subtracting their block averages, then ranked, and Wilcoxon scores were assigned to the
ranks. A 2 x 3 matrix of linear rank statistics was constructed by taking the mean
over blocks of the scores belonging to the $j$th treatment. Each of these statistics
was centered by subtracting its expected value under the null hypothesis and the
permutational model.
The scores were aligned by subtracting the block score average. The estimated
covariance matrix VN defined in Theorem 2.1 was computed yielding the following
3 x 3 matrix of sums of squares and cross products of the aligned scores summed
over all blocks and treatments and divided by 113(= s( r - 1)), where s = 113 is the
number of blocks, and r = 2 is the number of treatments.
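The alignment, ranking, and scoring steps can be sketched for a single variate as follows. The sketch assumes Wilcoxon scores of the form rank/(N + 1) with midranks for ties, and it returns only the diagonal (variance) entry of the covariance matrix; the function name and the toy data are hypothetical.

```python
def aligned_rank_statistics(data):
    """Aligned Wilcoxon-score statistics for one variate in a complete block design.

    `data[i][j]` is the response in block i under treatment j.
    Returns (t, v): t[j] is the centered mean score of treatment j, and v is the
    sum of squared block-aligned scores divided by s * (r - 1).
    """
    s, r = len(data), len(data[0])
    # Align: subtract each block's average from its observations.
    aligned = [[x - sum(block) / r for x in block] for block in data]
    # Rank all s*r aligned values together, using midranks for ties.
    flat = sorted((value, i, j) for i, row in enumerate(aligned) for j, value in enumerate(row))
    n = s * r
    scores = [[0.0] * r for _ in range(s)]
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and flat[end + 1][0] == flat[pos][0]:
            end += 1
        midrank = (pos + end) / 2.0 + 1.0
        for _, i, j in flat[pos:end + 1]:
            scores[i][j] = midrank / (n + 1)  # Wilcoxon score
        pos = end + 1
    # Center by the overall mean score, the null expectation of each treatment mean.
    mean_score = sum(map(sum, scores)) / n
    t = [sum(scores[i][j] for i in range(s)) / s - mean_score for j in range(r)]
    # Align the scores by their block averages and pool the squares.
    block_means = [sum(row) / r for row in scores]
    v = sum((scores[i][j] - block_means[i]) ** 2
            for i in range(s) for j in range(r)) / (s * (r - 1))
    return t, v


t, v = aligned_rank_statistics([[1.0, 3.0], [2.0, 5.0], [4.0, 4.5]])
```

With two treatments the two centered statistics are equal in magnitude and opposite in sign, the same pattern seen in the residual statistics reported for the surrogate set.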
The matrix of coefficients $C_N$ for the surrogate set defined in Theorem 2.1 is given by
$$C_N = \begin{pmatrix} 0.004424779 & -0.004424779 \\ -0.004424779 & 0.004424779 \end{pmatrix},$$
the entries being ±1/226, where sr = 226 is the cardinality of the surrogate set. The
estimated covariance matrix of the surrogates and age scores is
$$V_N = \begin{pmatrix} 0.16519 & 0.076239 & 0.013159 \\ 0.076239 & 0.16520 & 0.010146 \\ 0.013159 & 0.010146 & 0.16502 \end{pmatrix},$$
where the columns and rows correspond to CD4, CD8, and age respectively. Next,
partition $V_N$ and calculate the residual statistics, $T^*_{N0}$, defined in (2.13):
$$T^*_{N0} = \begin{pmatrix} -0.023765 & 0.0237655 \\ -0.002536 & 0.0025356 \end{pmatrix},$$
where the rows correspond to the CD4 and CD8 rank statistics, respectively, and the
columns correspond to the treatments, placebo and ZDV.
A similar manipulation was done in the validation set. Thus a 2 x 4 matrix of
linear rank statistics was formed. In addition, residuals were calculated by regressing
the survival time on the surrogates after regressing both the survival time and surrogates on age as described in Section 2.2.2. This resulted in a 1 x 2 vector of linear
rank statistics of survival time corresponding to placebo and ZDV.
Weights were calculated as in (2.21), the weighted linear combination of the
test statistics, $T^{0*}$, was constructed as in (2.22), and the quadratic form $L^{0*}$ in (2.23)
was computed. The following table gives the values of $L^{0*}$ for both the parametric and
nonparametric versions with and without surrogates.

                       With Surrogates    Without Surrogates
Least Squares Test           70                  346
Rank Based Test             125                  388

The value of the rank based $L^{0*}$ was 125 when surrogates were incorporated, which
leads to rejection of the null hypothesis of no treatment effect on the primary
endpoint, survival, when compared with the chi-squared distribution with 1 degree
of freedom.
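The comparison with the chi-squared distribution with 1 degree of freedom can be sketched with the standard library alone, using the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$. The function name is hypothetical; the two inputs are the rank-based values reported in the table.

```python
import math


def chi2_1df_pvalue(statistic):
    """Upper-tail probability of a chi-squared variate with 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("statistic must be non-negative")
    # If Z is standard normal, P(Z**2 > x) = 2 * P(Z > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


p_with = chi2_1df_pvalue(125.0)      # rank-based test with surrogates
p_without = chi2_1df_pvalue(388.0)   # rank-based test without surrogates
```

Both tail probabilities are far below any conventional significance level, and the one for the analysis without surrogates is the smaller of the two.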
Separate analyses were done using the validation set but ignoring surrogates in
order to compare the with and without surrogate analyses. The value of the quadratic
form was 388 which also leads to rejection of $H_0: \tau_j = 0$, $j = 1, \dots, r$. But the
significance level is smaller when surrogates are ignored. Thus, if significance level is
used as a criterion to see what was gained by using surrogates, this example shows
that one gets a more conservative significance level by using surrogates thus guarding
against faulty conclusions.
Parametric analyses were done using the least square estimates for comparison
with the nonparametric analyses. The same surrogate and validation sets were used.
It may be noted that the Shapiro-Wilk test of normality for the survival time, CD4
count, and CD8 count led to rejection of the null hypothesis that the distribution of
each of these variables is normal (p-value = 0.0001). Even after taking the logarithm
of these variables, the Shapiro-Wilk test gave a p-value less than 0.0001. Analyses
were carried out on the logged variables. The value of the quadratic form in (2.23) for
the with surrogate analysis was 70.15, and for the without surrogate analysis it was
345.57.
Note that both the parametric and nonparametric methods led to very high
values of the quadratic form when surrogates were ignored. The small significance
level in such a case may lead researchers to optimistic conclusions that may be faulty.
6.4
Further Research
The method developed in this work handles only continuous data. However, it is
common for researchers to be interested in using discrete response variates. If the
sample size is fairly large, one may repeat the same analyses described in this work for
each category and compare categories accordingly. However, if there isn't sufficient
data in each category this approach may not work. Further research is needed to
incorporate discrete data in the methodology expounded in this study.
The linear models considered in (2.1), (3.1), and (4.1) do not accommodate interaction between blocks and treatments. The method discussed here can be extended
to cover such a possibility.
Moreover, the replicates may not be independent in the balanced incomplete
block case. This may happen if the replicates are measured on the same person
over time. Appropriate changes in the covariance matrices should be made to allow
dependent replicates.
A more involved problem is that of estimation of conditional quantiles of both the
surrogate and primary variates (conditional on the covariates). Sen (1994) discussed
such an estimation problem. It is more complex than the hypothesis testing problem.
This has been touched upon in the introductory Chapter. More work in that direction
is needed.
This procedure does not handle missing data. It would be worthwhile to incorporate missing cells in such a methodology. Rubin (1976) formulates three conditions
on the process that causes missing data which enable the statistician to ignore this
process when making inferences about the distribution of the data. If $\theta$ is the parameter of the data, and $\phi$ is the parameter of the conditional distribution of the missing
data indicator given the data, then the missing data mechanism is ignorable if the
following conditions hold:
a) The data are missing at random if missingness depends on the data only through
observed values.
b) The observed data are observed at random if for each possible value of the missing
data and the parameter $\phi$, the conditional probability of the observed pattern
of missing data, given the missing data and the observed data, is the same for
all possible values of the observed data.
c) The parameters $\theta$ and $\phi$ are distinct in the sense that their joint parameter space
factorizes into a $\phi$-space and a $\theta$-space.
Moreover, Rubin (1976) defines data to be missing completely at random if the
indicator for missing data is independent of the observed data. However, sometimes
the process that caused missing data cannot be ignored, for example when the missing
pattern depends upon the observed data.
A nice account of the parametric approach to missing data is found in Rubin
(1976), Little and Rubin (1987), Little (1993), and Little (1994) among others. Their
approach is based on maximum likelihood techniques and the assumption of normality
is essential in the models they select. However, as the example shows, even when
data are transformed, normality may not hold. This leads the researcher to investigate
other, more widely applicable nonparametric methods.
The nonparametric approach to missing variables in multi-sample rank permutation tests for MANOVA and MANOCOVA is developed in Servy and Sen (1987).
The multi-sample rank permutation tests for the complete data case are extended to
random missing patterns. The scores of the observed variables are defined as
in the complete case (see Puri and Sen (1971)), disregarding the blanks, and
any blank is filled with the mean of the scores of the non-missing values of the
variable it belongs to. They show that under a random missing scheme, the rank-permutation principle holds and yields conditionally distribution-free test statistics.
They study in detail the asymptotic relative efficiency of the proposed test statistics.
An extension of their methodology to the block case is planned to make this study
gain wide applicability.
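The fill-in rule for blanks can be sketched for one variable as follows. Wilcoxon scores of the form rank/(m + 1) over the m non-missing values are assumed here for illustration; the score definitions of Servy and Sen (1987) are not reproduced, and the function name and data are hypothetical.

```python
def scores_with_mean_fill(values):
    """Wilcoxon scores for one variable, with blanks (None) filled by the mean score.

    Ranks are computed among the non-missing values only (midranks for ties).
    """
    observed = sorted((v, idx) for idx, v in enumerate(values) if v is not None)
    m = len(observed)
    if m == 0:
        raise ValueError("at least one observed value is required")
    scores = [None] * len(values)
    pos = 0
    while pos < m:
        end = pos
        while end + 1 < m and observed[end + 1][0] == observed[pos][0]:
            end += 1
        midrank = (pos + end) / 2.0 + 1.0
        for _, idx in observed[pos:end + 1]:
            scores[idx] = midrank / (m + 1)
        pos = end + 1
    mean_score = sum(s for s in scores if s is not None) / m
    return [mean_score if s is None else s for s in scores]


filled = scores_with_mean_fill([3.2, None, 1.5, 4.8, None])
```

Filling blanks with the mean score leaves the average score of the variable unchanged, so a missing cell contributes nothing to a centered statistic.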
* ENTRY02 = DATE OF ENTRY INTO STUDY 02 (MMDDYY)
* ENTRY08 = DATE OF ENTRY INTO STUDY 08. ALL PATIENTS ENTERING THIS
*           STUDY WERE PUT ON RETROVIR THERAPY. (MMDDYY)
* DDEATH  = DATE OF DEATH (MMDDYY)
* DTERM   = DATE OFF STUDY (MMDDYY)
* DALIVE  = DATE PATIENT WAS LAST KNOWN TO BE ALIVE (MMDDYY)
**********************************************************************;
data D;
  set D;
  last_al='16OCT86'd;
  last_azt=entry02+7*80;
  if ddeath ne . then death=1; else death=0;
  *placebo;
  if trt=0 then do;
    cens=(death=1 and .<ddeath<=min(entry08,last_al));
    if cens=1 then survt=ddeath-entry02;
    else survt=min(entry08,max(dterm,dalive,ddeath,entry08),last_al)-entry02;
  end;
  if trt=0 and ddeath=. and dalive=. and dterm=. then do;
    survt='16Sep86'd-entry02;
  end;
  *AZT;
  if trt=1 then do;
    cens=(death=1) and ddeath<=last_azt;
    if cens=1 then survt=ddeath-entry02;
    else survt=min(max(dterm,dalive,entry08,ddeath),last_azt)-entry02;
  end;
  if trt=1 and ddeath=. and dalive=. and dterm=. then do;
    survt='16Sep86'd-entry02;
  end;
Table 1
Means and Standard Deviations

                    Surrogate     Validation    Overall
                    (N=226)       (N=48)        (N=274)
Survival (days)     276 (189)     235 (164)     268 (185)
CD4 (cells/L)       130 (129)      75 (99)      120 (126)
CD8 (cells/L)       556 (351)     492 (409)     545 (362)
Age (years)          37 (11)       38 (11)       37 (11)
Table 2
Overall Correlation Between
Survival Time and Surrogates
in The Validation Set (N=48)
Pearson Correlation
Spearman Correlation
(p-value)
(p-value)
CD4
Survival
CD8
0.0146
0.634
(0.9216)
CD8
CD4
(0.0001)
CD8
0.193
Survival
CD8
0.0137
0.5868
(0.9259)
(0.0001)
0.2609
(0.0732)
(0.1885)
Table 3
Correlation Between Survival
Time and Surrogates in The
Validation Set For the Placebo Group
(N=24)
Pearson Correlation
Spearman Correlation
(p-value)
(p-value)
Survival
CD4
-0.2688
(0.2041)
Survival
CD8
0.3562
CD4
(0.0876)
CD8
0.0461
0.7149
(0.8306)
(0.0001)
CD8
0.2925
CD8
(0.1654)
0.2301
(0.2795)
Table 4
Correlation Between Survival Time and Surrogates in the Validation Set for the ZDV Group (N=24)
Pearson Correlation
Spearman Correlation
(p-value)
CD4
CD8
(p-value)
Survival
CD8
-0.0207
0.7435
(0.9237)
(0.0001)
Survival
CD4
CD8
0.0619
(0.7735)
0.1818
0.5127
(0.3952)
(0.0104)
0.2057
(0.3349)
CD8
Bibliography
Anderson, T. (1984). An Introduction to Multivariate Statistical Analysis. second edn.
John Wiley, New York.
Bacchetti, P., Moss, A. R., Andrews, J. C. and Jacobson, M. (1992). Early predictors of survival in symptomatic HIV-infected persons treated with high-dose
zidovudine. Journal of Acquired Immune Deficiency Syndromes 5, 732-736.
Bose, R. C. (1947). Proceedings of the 34th Indian Science Congress 11, 1-25.
Carroll, R. (1989). Covariance analysis in Generalized Linear Measurement Error
Models. Statistics in Medicine 8, 1075-1093.
Carroll, R. and Ruppert, D. (1988). Transformation and Weighting in Regression.
London: Chapman and Hall.
Choi, S., Lagakos, S. W., Schooley, R. T. and Volberding, P. (1993). CD4+ lymphocytes are an incomplete surrogate marker for clinical progression in persons
with asymptomatic HIV infection taking zidovudine. Annals of Internal Medicine
118, 674-680.
Cochran, W. G. (1934). The distribution of quadratic forms in a normal system,
with application to the analysis of covariance. Proceedings of the Cambridge
Philosophical Society 30, 178-191.
Cox, D. R. (1972). Regression models and life tables (with discussion). Journal of the
Royal Statistical Society Ser. B 34, 187-220.
Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. London: Chapman and
Hall.
Ellenberg, S. S. (1991). Surrogate endpoints in clinical trials. British Medical Journal
302, 63-64.
Ellenberg, S. S. and Hamilton, J. M. (1989). Surrogate endpoints in clinical trials:
Cancer. Statistics in Medicine 8, 405-413.
Fleming, T. R. (1992). Evaluating therapeutic interventions: Some issues and experiences. Statistical Science 7, 428-456.
Fuller, W. (1987). Measurement Error Models. John Wiley, New York.
Gangopadhyay, A. K. and Sen, P. (1992a). Nonparametric statistics and related topics.
A.K.M.E. Saleh, North Holland, Amsterdam.
Gangopadhyay, A. K. and Sen, P. (1992b). Nonparametric statistics and related topics.
A.K.M.E. Saleh, North Holland, Amsterdam.
Gruttola, V., Wulfsohn, M., Fischl, M. and Tsiatis, A. (1993). Modeling the relationship between survival and CD4 lymphocytes in patients with AIDS and AIDS-Related Complex. Journal of Acquired Immune Deficiency Syndromes 6, 359-365.
Hillis, A. and Seigel, D. (1989). Surrogate endpoints in clinical trials: Ophthalmologic
disorders. Statistics in Medicine 8, 427-430.
Kosorok, M. (1991). A variance reduction method for combining multivariate failure times in order to increase power in clinical trials. PhD thesis. University of
Washington.
Kupper, L. (1984). Effects of the use of unreliable surrogate variables on the validity of
epidemiologic research studies. American Journal of Epidemiology 120, 643-648.
Lagakos, S. and Hoth, D. (1992). Surrogate markers in AIDS: where are we? where
are we going? Annals of Internal Medicine 116, 599-601.
Lin, D. (1991). Nonparametric sequential testing in clinical trials with incomplete
multivariate observations. Biometrika 78, 123-131.
Little, R. J. A. (1993). Pattern-mixture models for multivariate incomplete data.
Journal of the American Statistical Association 88, 125-34.
Little, R. J. A. (1994). A class of pattern-mixture models for normal incomplete data.
Biometrika 81, 471-83.
Little, R. J. and Rubin, D. B. (1987). Statistical analysis and missing data. John
Wiley, New York.
Louis, T. A. (1982). Finding the observed information using the EM algorithm. Journal of the Royal Statistical Society Ser. B pp. 226-233.
Machado, S. G., Gail, M. H. and Ellenberg, S. S. (1990). On the use of laboratory
markers as surrogates for clinical endpoints in the evaluation of treatment for
HIV infection. Journal of Acquired Immune Deficiency Syndromes 3, 1065-1073.
McCullagh, P. and Nelder, J. (1989). Generalized Linear Models. London: Chapman
and Hall.
Monahan, I. P. (1961). Incomplete-variable designs in multivariate experiments. PhD
thesis. Virginia Polytechnic Institute, Blacksburg, Virginia.
Pepe, M. S. (1992). Inference using surrogate outcome data and a validation sample.
Biometrika 79, 355-365.
Pepe, M. S. and Fleming, T. (1991). A nonparametric method for dealing with
mismeasured covariate data. Journal of the American Statistical Association
86, 108-113.
Piedbois, et al. (1992). Modulation of fluorouracil by leucovorin in patients with advanced colorectal cancer: Evidence in terms of response rate. Journal of Clinical
Oncology 10, 896-903.
Prentice, R. L. (1989). Surrogate endpoints in clinical trials: Definition and operational criteria. Statistics in Medicine 8, 431-440.
Puri, M. L. and Sen, P. (1971). Nonparametric Methods in Multivariate Analysis.
John Wiley, New York.
Puri, M. L. and Sen, P. K. (1985). Nonparametric Methods in General Linear Models.
John Wiley, New York.
Rao, C. R. (1947). General methods of analysis for incomplete block designs. Journal
of the American Statistical Association 42, 541-561.
Romagnani, S. (1994). Immunologic and clinical aspects of human immunodeficiency
virus infection. Allergy 49, 685-695.
Roy, S., Gnanadesikan, R. and Srivastava, J. (1971). Analysis and Design of Certain
Quantitative Multiresponse Experiments. Pergamon Press, New York.
Rubin, D. B. (1976). Inference and missing data. Biometrika 63, 581-592.
Satten, G. and Kupper, L. (1993). Inferences about exposure-disease associations using
probability of exposure values. Journal of the American Statistical Association
88, 200-208.
Scheffe, H. (1959). The Analysis of Variance. John Wiley, New York.
Sen, P. K. (1969). On a class of aligned rank order tests for multiresponse experiments
in some incomplete block designs; See also Annals of Mathematical Statistics,
42 1971. Mimeo Series 607. Department of Biostatistics, University of North
Carolina.
Sen, P. K. (1981). Sequential Nonparametrics. John Wiley, New York.
Sen, P. K. (1983). On permutational central limit theorems for general multivariate
linear rank statistics. Sankhya A 45, 141-149.
Sen, P. K. (1992). Incomplete multiresponse designs and surrogate endpoints in clinical
trials. Mimeo Series 2106. Department of Biostatistics, University of North
Carolina.
Sen, P. K. (1994). Incomplete multiresponse designs and surrogate endpoints in clinical trials. Journal of Statistical Planning and Inference 42, 161-186.
Servy, E. C. and Sen, P. K. (1987). Missing variables in multi-sample rank permutation tests for manova and manocova. Sankhya A 49, 78-95.
Srivastava, J. N. (1966). Incomplete multiresponse designs. Sankhya A 28, 377-388.
Srivastava, J. N. (1968). On a general class of designs for multiresponse experiments.
Annals of Mathematical Statistics 39, 1825-1843.
Tsiatis, A., Gruttola, V. and Wulfsohn, M. (1995). Modeling the relationship of survival to longitudinal data measured with error: Applications to survival and CD4
counts in patients with AIDS. Journal of the American Statistical Association
90, 27-37.
Wittes, J., Lakatos, E. and Probstfield, J. (1989). Surrogate endpoints in clinical
trials: Cardiovascular Diseases. Statistics in Medicine 8, 415-425.
Yates, F. (1940). The recovery of interblock information in balanced incomplete block
designs. Annals of Eugenics 10, 317-325.