A Jackknifed Chi-Squared Test for Complex Samples

Author(s): Robert E. Fay
Source: Journal of the American Statistical Association, Vol. 80, No. 389 (Mar., 1985), pp. 148-157
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2288065 .
ROBERT E. FAY*
Complex sample designs typically invalidate the direct application of the familiar Pearson or likelihood-ratio chi-squared statistics for testing the fit of a model to a cross-classified table of counts. This article discusses the adjustment of these statistics through a jackknifing approach. The technique may generally be applied whenever a standard replication method, such as the jackknife, bootstrap, or repeated half-samples, provides a consistent estimate of the covariance matrix of the sample estimates. Properties of the limiting distribution of new test statistics, $X_J$ and $G_J$, are described. The new statistics may be used to test goodness of fit and to compare nested models.
KEY WORDS: Contingency tables; Log-linear models; Logit; Replication methods; Sample surveys.
1. INTRODUCTION
Standard textbooks in the analysis of categorical data through log-linear models (such as Bishop et al. 1975; Goodman 1978; Haberman 1974a, 1978, 1979; and Fienberg 1980) discuss these models in terms of classical sampling distributions, namely the Poisson, multinomial, and product multinomial (with occasional consideration of the hypergeometric). In these contexts, chi-squared tests of the overall fit and of the contribution made to the fit by specific groups of parameters are a principal tool in the selection and assessment of models for cross-classified data. The Pearson and likelihood-ratio chi-squared statistics represent the most commonly applied tests, but alternatives include the Freeman and Tukey (1950) chi-square and the Wald statistic proposed by Grizzle et al. (1969). Fienberg (1979) reviewed the literature and properties of these statistics.
By now it is well documented in the literature, although not always acknowledged in practice, that these simple test statistics may give extremely erroneous results when applied to data arising from a complex sample design. Research on alternative procedures to address this problem has taken a variety of directions, which, at the risk of oversimplification, may be divided into three major strategies:
1. Direct estimation of the covariance matrix of the cell observations and application of this covariance matrix in the estimation of the model and in the testing of the parameters and overall fit.
2. Use of estimation procedures motivated by simple random sampling, such as maximum likelihood, but computation of a test statistic explicitly dependent on an estimate of the covariance matrix of the cell observations.
3. As in 2, use of estimation procedures motivated by simple random sampling and modification, typically by a scale factor, of the usual Pearson or likelihood-ratio test statistics.
* Robert E. Fay is a Staff Assistant, Statistical Methods Division, U.S. Bureau of the Census, Washington, DC 20233. The author is indebted to Charles D. Cowan, Stephen E. Fienberg, Ronald N. Forthofer, Myron J. Katzoff, J. N. K. Rao, Donald B. Rubin, Frederick J. Scheuren, and especially Shelby J. Haberman, for helpful comments and discussion. Jeffrey C. Moore collaborated on the analysis of the example. Thoughtful suggestions from an associate editor and two referees assisted in preparing this article.
The first of these three categories may be identified with the generalized least squares method. Koch et al. (1975) gave an expression for the Wald statistic that is appropriate for sample designs admitting consistent estimates of the covariance matrix of the sample frequencies. Generally, this approach represents a complete asymptotic solution to the problem, but it often yields erratic results for complex samples under all but the most favorable of conditions. When the sample is known to be multinomial, the weighted least squares methodology appropriate for this distribution can behave satisfactorily even for moderately large tables (say, 100 cells) when the number of observations per cell is sufficient (say, a minimum of 50 per cell). Typical estimators of the covariance of the observed cross-classification for complex designs, however, yield far less precision than the multinomial analogues, and this reduced precision has a serious effect on the inversion required in the computation of the Wald statistic. This instability in the estimated inverse in turn inflates the rate of rejection under the null hypothesis, often enough to make the test unusable.
Representing the second strategy, a number of authors, including Chapman (1966), McCarthy (1969), Nathan (1973), and Fellegi (1980), have proposed alternative test statistics. Some tests require estimation of the full covariance matrix of the cell estimates, followed by an inversion producing the same potential difficulties as those encountered with the Wald statistic. Others, such as McCarthy's method, require additional assumptions. Fellegi (1980) compared many of these alternatives, including the behavior of a statistic ($t'$) of this form, for the test of independence in a two-way classification.
The last approach is based on the properties of the chi-squared statistic computed from the sample estimates. Cohen (1976), Altham (1976), and Brier (1980) examined models for the interrelationship of the sample design and the population, which led to a simple correction to the chi-squared statistic. The disadvantage of these methods, however, is the need for additional assumptions on the covariance of the estimates for the procedures to work acceptably well. Fellegi (1980) proposed a statistic ($t''$) that attempts to correct the chi-squared statistic under more general assumptions. Rao and Scott (1981) reviewed these methods and examined the properties of the chi-squared statistic for two-way tables under several common complex sample designs. These two authors, along with their colleagues (Holt et al. 1980; Scott and Rao 1981; Hidiroglou and Rao 1981, 1983; Rao and Scott 1984), studied the distribution of the chi-squared tests under more general log-linear models and developed test statistics that can be computed for some models from the estimated cross-classification and the
variances of the cells and marginal tables. A recent paper by Bedrick (1983) is also based on this general strategy.
This article describes a different solution, based on a modified jackknife procedure applied to the Pearson or likelihood-ratio test statistics themselves, that is closely related to the work of Rao and Scott, especially to their proposed statistic $X^2/\hat{\delta}_{\cdot}$. The next section summarizes the results of these authors and shows how one component in the computation of the jackknifed chi-squared tests presented here could be used to construct a test asymptotically equivalent to $X^2/\hat{\delta}_{\cdot}$. Under some asymptotic conditions, $X^2/\hat{\delta}_{\cdot}$ rejects under the null hypothesis at a rate significantly higher than the nominal level; the jackknifed tests incorporate an additional adjustment to avoid this property. (Another statistic proposed by Rao and Scott, $X^2_S$, adjusts $X^2/\hat{\delta}_{\cdot}$ by a different means.) The third section describes the actual computation of the tests by first developing a general notation for several common replication strategies. In some circumstances, the new tests are asymptotically conservative, and the fourth section discusses properties of the limiting distribution. The fifth section collects a number of general comments about the range of application of these new statistics, which is followed by an illustration. The Appendix states the asymptotic results.
In the Public Domain. Journal of the American Statistical Association, March 1985, Vol. 80, No. 389, Theory and Methods.
2. RELATIONSHIP OF THE NEW TESTS TO THOSE OF RAO AND SCOTT
The most general results presented by Rao and Scott (1984) consider a fixed vector, $\pi$, of cell proportions estimated by a consistent estimator $p^*$, where cells are indexed by $i = 1, \ldots, T$. They assume that $n^{1/2}(p^* - \pi)$ converges in distribution to $N(0, C)$ as $n$ increases, where $n$ is the sample size. These assumptions suitably characterize in many instances the cross-classified data obtained through complex designs in large sample surveys. They consider an estimator $\pi^*(p^*)$ of $\pi$ under a log-linear model, derived through maximum-likelihood estimation for multinomial sampling applied directly to $p^*$ (or some asymptotically equivalent estimator). In this setting, the corresponding versions of the usual Pearson and likelihood-ratio chi-squared tests are
$$X^2(p^*) = n \sum_i (p_i^* - \pi_i^*(p^*))^2 / \pi_i^*(p^*) \tag{2.1}$$
and
$$G^2(p^*) = 2n \sum_i p_i^* \ln(p_i^* / \pi_i^*(p^*)). \tag{2.2}$$
If for a specific model, the tests (2.1) and (2.2) have $k$ degrees of freedom under the null hypothesis for multinomial sampling, Rao and Scott showed that for the complex sample, the limiting distribution is given by (2.3) as the weighted sum of independent chi-squared variates $\chi^2_{(j)}$, $j = 1, \ldots, k$, each on a single degree of freedom,
$$\sum_{j=1}^{k} \delta_j \chi^2_{(j)}, \tag{2.3}$$
where the $\delta_j$'s are nonnegative. In the case of multinomial sampling, the $\delta_j$'s will be identically 1; but in the general case, they depend on the limiting covariance matrix $C$, the true values $\pi$, and the specific model being investigated.
More generally, their paper also considered the case of the comparison of two nested models, $M_1$ and $M_2$, in the sense that $M_2$ is a special case of $M_1$. In the case of multinomial sampling, the practice is to consider the difference of the chi-squared tests under the two models. (If both models fit a constant term, so the observed and fitted totals over all cells agree, the difference of likelihood-ratio chi-squares gives the likelihood-ratio chi-squared test for comparison of the models, under multinomial sampling.) If the difference in degrees of freedom is $k$, then the limiting distribution under the null hypothesis for the difference of the chi-squared tests is chi-square on $k$ degrees of freedom in the multinomial case and that of (2.3) in the general case, where the $\delta_j$'s depend on $C$, $\pi$, and the two models in question.
Rao and Scott (1981, 1984) proposed two test statistics: $X^2_S$, based on application of the Satterthwaite (1946) approximation to the distribution of (2.3), and the simpler $X^2/\hat{\delta}_{\cdot}$, where
$$\delta_{\cdot} = \sum_{j=1}^{k} \delta_j / k. \tag{2.4}$$
The adjusted statistic, $X^2/\hat{\delta}_{\cdot}$, is then compared to the chi-squared distribution on $k$ degrees of freedom. They gave several situations in which $\delta_{\cdot}$ may be estimated directly from the observed cross-classification and estimated variances for the cells of the cross-classification and for specific marginal totals, when the log-linear model admits a closed-form solution.
The jackknifed statistics presented here include a quantity $K^+$ based on variability in $X^2$ or $G^2$ over a set of replications. Under the null hypothesis, $K^+/k$ is a consistent estimator of $\delta_{\cdot}$; so $X^2/(K^+/k)$ could be used as an estimate of $X^2/\delta_{\cdot}$. The quantity $K^+$ is readily computed even when no closed-form expression is available for the estimates under the log-linear model.
One of the disadvantages of $X^2/\hat{\delta}_{\cdot}$, however, is its rejection under the null hypothesis at a rate much greater than the nominal level when the corresponding $\delta_j$'s vary substantially from one another. The jackknifed tests incorporate a correction to avoid this problem; Section 4 makes explicit the nature of this correction. The test $X^2_S$ based on the Satterthwaite approximation compensates for variation in the $\delta_j$'s by a different means; Section 5 comments on the relative merits of $X^2_S$ and the jackknifed tests.
3. CALCULATION OF THE JACKKNIFED TEST STATISTICS
3.1 Representation of the Replication Methods
Suppose $Y$ represents an observed cross-classification, possibly in the form of estimated population totals for a finite population derived from a complex sample survey. The class of replication methods to be considered in this article will be based on (pseudo-) replicates $Y + W^{(i,j)}$, $i = 1, \ldots, I$, $j = 1, \ldots, J_i$, typically based on the same data as $Y$. The asymptotic theory for the jackknifed tests requires that
$$\sum_j W^{(i,j)} = 0 \tag{3.1}$$
for each $i$. An estimate, $\mathrm{cov}^*(Y)$, of the covariance of $Y$ should be given by
$$\mathrm{cov}^*(Y) = \sum_i b_i \sum_j W^{(i,j)} \otimes W^{(i,j)}, \tag{3.2}$$
where $W^{(i,j)} \otimes W^{(i,j)}$ represents the outer product of $W^{(i,j)}$ with itself (the standard cross-product matrix) and the $b_i$ are a fixed set of constants appropriate for the problem.
An additional condition is required for the asymptotic theory to apply: that the $W^{(i,j)}$'s be uniformly small relative to the sampling variance in $Y$ [estimated by (3.2)]. This condition, plus a condition on the accuracy of (3.2), will be made clearer by application of this notation to common replication methods in this section; Section 4 provides an explicit statement. Although some modification is necessary, the (simple) jackknife, a version of the jackknife to reflect stratification, and the half-sample and bootstrap methods each may be recast into this notation.
The standard jackknife may be applied when $Y$ can be represented as the sum of $n$ iid random variables, $Z^{(j)}$. The standard leave-one-out replicates, $Y^{(-j)} = Y - Z^{(j)}$, may be reweighted to the same expected total by the factor $n/(n - 1)$ and written as
$$nY^{(-j)}/(n - 1) = Y + (Y - nZ^{(j)})/(n - 1). \tag{3.3}$$
The second term on the right of (3.3) assumes the role of $W^{(i,j)}$ and satisfies (3.1). (Here, the subscript $i$ is fixed at 1.) The value $(n - 1)/n$ represents the usual choice for $b_i$.
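
The recasting of the simple jackknife in (3.1) through (3.3) can be checked numerically. The following Python sketch is illustrative only: the function names and the five-unit toy data are assumptions introduced here, not part of the article. It builds the deviations $W^{(1,j)} = (Y - nZ^{(j)})/(n - 1)$ from unit-level contributions, confirms that they sum to zero as (3.1) requires, and evaluates the covariance estimate (3.2) with $b_1 = (n - 1)/n$.

```python
def jackknife_deviations(Z):
    """Return Y = sum of the unit vectors Z[j] and the deviations
    W^(1,j) = (Y - n * Z[j]) / (n - 1), the second term of (3.3)."""
    n = len(Z)
    Y = [sum(col) for col in zip(*Z)]
    W = [[(y - n * z) / (n - 1) for y, z in zip(Y, zj)] for zj in Z]
    return Y, W


def replication_cov(W_groups, b):
    """cov*(Y) = sum_i b_i sum_j W^(i,j) (outer product) W^(i,j), as in (3.2).
    W_groups[i] is the list of deviation vectors for group i; b[i] is b_i."""
    dim = len(W_groups[0][0])
    cov = [[0.0] * dim for _ in range(dim)]
    for b_i, group in zip(b, W_groups):
        for w in group:
            for r in range(dim):
                for c in range(dim):
                    cov[r][c] += b_i * w[r] * w[c]
    return cov


# Hypothetical toy data: five units, each falling in one of two cells.
Z = [[1, 0], [0, 1], [1, 0], [1, 0], [0, 1]]
n = len(Z)
Y, W = jackknife_deviations(Z)

# Condition (3.1): the deviations sum to zero in every cell.
column_sums = [sum(col) for col in zip(*W)]

# Estimate (3.2) with a single group (i fixed at 1) and b_1 = (n - 1) / n.
cov = replication_cov([W], [(n - 1) / n])
print(Y, column_sums, cov)
```

For these toy data the diagonal entries of `cov` equal $n$ times the usual sample variance of the unit contributions, which is the familiar estimate of the variance of a total of $n$ iid terms.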
Stratification constitutes one of the key techniques in complex sample design and is discussed in any standard textbook in the subject. In the preceding notation, the universe may be considered to be divided into $I$ strata. The jackknife may be adapted to this problem if the samples are selected independently from each stratum and if $Y$ may be represented as
$$Y = \sum_i \sum_j Z^{(i,j)}, \tag{3.4}$$
where the $Z^{(i,j)}$ are $n_i$ iid random variables within stratum $i$. (These variables are not, however, assumed to be identically distributed across strata.) For each stratum $i$,
$$Y + W^{(i,j)} = Y + \Bigl(\Bigl(\sum_{j'} Z^{(i,j')}\Bigr) - n_i Z^{(i,j)}\Bigr)/(n_i - 1) \tag{3.5}$$
has the same expected value as $Y$ and defines $W^{(i,j)}$, satisfying (3.1). The corresponding choice for $b_i$ is $(n_i - 1)/n_i$.
An alternative approach to variance estimation from complex samples is to form estimates based on half of the sample, selected to represent all sources of variability in the design. For example, in designs with two selections in each of a number of strata, half-samples may be formed by picking one from each of the pairs of selections. If $Z^{(i,1)}$, $i = 1, \ldots, I$, represent estimates of the totals, each based on half of the sample (reweighted, by a factor of 2 if necessary, to the original sample total), each may be reexpressed as
$$Y + (Z^{(i,1)} - Y). \tag{3.6}$$
Although not required in the usual application of half-sample methods, (3.1) forces inclusion of the complementary half-sample, $W^{(i,2)} = Y - Z^{(i,1)}$.
An additional modification is necessary, however, since the asymptotic theory requires $W^{(i,j)}$ to be uniformly small relative to the variation of $Y$, and the half-sample estimates lack this property. Instead, the representation
$$W^{(i,1)} = d(Z^{(i,1)} - Y) \tag{3.7}$$
$$W^{(i,2)} = -W^{(i,1)} \tag{3.8}$$
$$b_i = (2d^2 I)^{-1} \tag{3.9}$$
generalizes the notion of half-sample replication. The asymptotic conditions may be met by allowing $d$ to converge to 0 in a suitable manner; in practice, $d = .05$ appears to be satisfactory for most applications. The sequence of half-samples may be based on either independent selections or balanced repeated replication (McCarthy 1969).
Bootstrap replication methods (Efron 1979, 1982) may also be considered in some applications; for purposes of computation of the jackknifed tests here, bootstrap samples may be treated as the equivalent of half-sample replicates reweighted to the population total.
3.2 Computation of the Statistics
The jackknifed values of the test statistics require refitting the given log-linear model to the replicates, $Y + W^{(i,j)}$, and recomputing the test statistics, $X^2(Y + W^{(i,j)})$ or $G^2(Y + W^{(i,j)})$, for these new tables. [In this and subsequent sections, the usual formulas for $X^2$ and $G^2$ are applied directly to the cross-classification, $Y$, whenever $Y$ is based on weighted estimates. In other words, for weighted $Y$ (or $Y + W^{(i,j)}$) the sum of the cell estimates of $Y$ (or $Y + W^{(i,j)}$) replaces $n$ in (2.1) and (2.2).] Using the $b_i$ introduced in (3.2), the jackknifed test statistic, $X_J$, is defined by
$$X_J = [(X^2(Y))^{1/2} - (K^+)^{1/2}] / \{V/(8X^2(Y))\}^{1/2}, \tag{3.10}$$
where
$$P_{ij} = X^2(Y + W^{(i,j)}) - X^2(Y), \tag{3.11}$$
$$K = \sum_i b_i \sum_j P_{ij}, \tag{3.12}$$
$$V = \sum_i b_i \sum_j P_{ij}^2, \tag{3.13}$$
and $K^+$ takes the value $K$ for positive $K$ and is zero otherwise.
A test of the difference of two chi-squared tests under nested models $M_1$ and $M_2$ is given by
$$G_J = \frac{(G^2_{(2)}(Y) - G^2_{(1)}(Y))^{1/2} - (K^+)^{1/2}}{\{V/(8G^2_{(2)}(Y) - 8G^2_{(1)}(Y))\}^{1/2}}, \tag{3.14}$$
where
$$P_{ij} = G^2_{(2)}(Y + W^{(i,j)}) - G^2_{(1)}(Y + W^{(i,j)}) - (G^2_{(2)}(Y) - G^2_{(1)}(Y)), \tag{3.15}$$
$K$ and $V$ are defined by (3.12) and (3.13) applied to (3.15), and $K^+$ again takes the value $K$ for positive $K$ and is zero otherwise. When $M_1$ is the saturated model, giving $G^2_{(1)}$ of zero in each case, (3.14) and (3.15) are direct analogues of (3.10) and (3.11) formed by replacing $X^2$ by $G^2$ throughout. Similarly, expressions analogous to (3.14) and (3.15) formed by replacing $G^2$ by $X^2$ throughout define $X_J$ for the difference of two Pearson tests, although $X_J$ should be set to 0 if the difference $X^2_{(2)}(Y) - X^2_{(1)}(Y)$ becomes negative. (This problem occurs readily for $X^2$ in sparse tables. In these instances, $G_J$ is preferable to $X_J$.)
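
The computation in (3.10) through (3.13) can be sketched for the simplest case, a test of independence in a two-way table with simple jackknife replicates. The Python below is a minimal sketch under stated assumptions, not the CPLX implementation: the function names and the eight cluster tables are hypothetical, the independence model is used only because its fitted values have a closed form, and the replicates follow (3.3) with $b_1 = (n - 1)/n$.

```python
import math


def pearson_independence(table):
    """Pearson X^2 for independence, applied directly to the (possibly
    weighted) cell estimates, so the table total plays the role of n in (2.1)."""
    total = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    x2 = 0.0
    for r, row in enumerate(table):
        for c, y in enumerate(row):
            fitted = row_tot[r] * col_tot[c] / total
            x2 += (y - fitted) ** 2 / fitted
    return x2


def jackknife_replicates(cluster_tables):
    """Y and the reweighted leave-one-out tables
    Y + W^(1,j) = n * (Y - Z^(j)) / (n - 1), as in (3.3)."""
    n = len(cluster_tables)
    rows, cols = len(cluster_tables[0]), len(cluster_tables[0][0])
    Y = [[sum(t[r][c] for t in cluster_tables) for c in range(cols)]
         for r in range(rows)]
    reps = [[[n * (Y[r][c] - t[r][c]) / (n - 1) for c in range(cols)]
             for r in range(rows)] for t in cluster_tables]
    return Y, reps


def jackknifed_x(table, replicate_groups, b, stat=pearson_independence):
    """Return (X_J, X^2, K, V). replicate_groups[i] lists the tables
    Y + W^(i,j) and b[i] is b_i. Assumes X^2(Y) > 0 and V > 0."""
    x2 = stat(table)
    K = 0.0
    V = 0.0
    for b_i, group in zip(b, replicate_groups):
        P = [stat(rep) - x2 for rep in group]   # (3.11)
        K += b_i * sum(P)                       # (3.12)
        V += b_i * sum(p * p for p in P)        # (3.13)
    k_plus = max(K, 0.0)                        # K+ is K if positive, else 0
    x_j = (math.sqrt(x2) - math.sqrt(k_plus)) / math.sqrt(V / (8.0 * x2))  # (3.10)
    return x_j, x2, K, V


# Hypothetical data: eight sampled clusters, each contributing a 2 x 2 table.
clusters = [
    [[12, 3], [4, 11]], [[2, 9], [10, 3]], [[8, 5], [6, 9]], [[5, 7], [9, 4]],
    [[11, 2], [3, 12]], [[4, 8], [7, 6]], [[9, 4], [5, 10]], [[6, 6], [8, 5]],
]
Y, reps = jackknife_replicates(clusters)
n = len(clusters)
x_j, x2, K, V = jackknifed_x(Y, [reps], [(n - 1) / n])
print(round(x2, 3), round(x_j, 3))
```

The resulting `x_j` would be compared with the critical value for $k = 1$ in Table 1. A likelihood-ratio version, or a comparison of nested models as in (3.14) and (3.15), would follow by passing a different `stat` function.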
4. PROPERTIES OF THE LIMITING DISTRIBUTION
4.1 Limiting Distribution Under the Null Hypothesis in the General Case
The Appendix considers the following asymptotic situation, with $n \to \infty$:
1. The population proportions $\pi$ are fixed and satisfy the log-linear model(s) in question. The model or models include a parameter for the sample total, so the model is an assertion only about population proportions. (This condition is satisfied by the vast majority of models fitted in practical situations.)
2. The sample proportions estimated by $Y_n$ are consistent for the population proportions $\pi$, and there are constants $g_n$ and $h_n$, depending only on $n$, such that the quantity $h_n(Y_n - g_n\pi)$ converges in distribution to a multivariate normal, $N(0, C)$, with a nonzero covariance, $C$.
3. The ${}_nW^{(i,j)}$'s introduced in (3.1) are such that as $n \to \infty$
$$\max_{i,j} h_n \|{}_nW^{(i,j)}\| \to 0. \tag{4.1}$$
4. The covariance estimator (3.2), when multiplied by $h_n^2$, converges in probability to $C$.
5. Condition (3.1) is strictly satisfied for each $n$.
Under these conditions, $X_J$ and $G_J$ are shown to have as their limiting distribution the distribution of
$$\Bigl\{\Bigl(\sum_{i=1}^{k} \delta_i \chi_i^2\Bigr)^{1/2} - \Bigl(\sum_{i=1}^{k} \delta_i\Bigr)^{1/2}\Bigr\} \Big/ \Bigl\{\sum_{i=1}^{k} \delta_i^2 \chi_i^2 \Big/ \Bigl(2 \sum_{i=1}^{k} \delta_i \chi_i^2\Bigr)\Bigr\}^{1/2}, \tag{4.2}$$
where $\chi_i^2$, $i = 1, \ldots, k$, are a set of independent chi-squared variates, each on a single degree of freedom, and the $\delta_i$ are a set of nonnegative weights depending on $C$ and the model(s) in question. The value of $k$ in (4.2) is the degrees of freedom of the test for multinomial sampling.
4.2 Definition of the Test
In the important special case of multinomial sampling, the $\delta_i$ are identically 1. More generally, when the $\delta_i$'s in (4.2) are equal to any one positive constant, (4.2) reduces to a simple monotonic transformation of the chi-squared distribution
$$2^{1/2}\{(\chi_k^2)^{1/2} - (k)^{1/2}\}, \tag{4.3}$$
where $\chi_k^2$ is distributed as chi-square on $k$ degrees of freedom. The expression (4.3) gives an approximate standardization of $(\chi_k^2)^{1/2}$. Table 1 provides critical values at the .05 and .01 levels for (4.3). Except for small $k$, the values for a given level depend only slightly on $k$. As $k$ increases, (4.3) approaches the $N(0, 1)$ distribution. Note that for $\alpha = .05$, the critical values increase monotonically with $k$, but for $\alpha = .01$, they rise to a maximum value of 2.35 for $k$ at 20 (approximately), compared with the limiting value 2.33.
Table 1. Critical Values of the $2^{1/2}((\chi_k^2)^{1/2} - (k)^{1/2})$ Distribution
Degrees of Freedom, k     .05     .01
1                         1.36    2.23
2                         1.46    2.29
3                         1.50    2.30
5                         1.55    2.33
10                        1.58    2.34
20                        1.60    2.35
40                        1.62    2.34
∞                         1.65    2.33
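
Critical values for (4.3) at other levels or degrees of freedom follow from chi-squared quantiles: the upper-$\alpha$ point is $2^{1/2}\{(\chi^2_{k,1-\alpha})^{1/2} - k^{1/2}\}$. The Python sketch below is an illustration under assumptions, not the method used to produce Table 1: the function names are hypothetical, and the chi-squared quantile is obtained from a series for the incomplete gamma function plus bisection so that only the standard library is needed. Small second-decimal differences from the printed table are possible.

```python
import math


def chi2_cdf(x, k):
    """P(chi-square on k df <= x), using the power series for the
    regularized lower incomplete gamma function P(k/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_quantile(p, k):
    """Invert chi2_cdf by bisection on a bracket wide enough for p <= .999."""
    lo, hi = 0.0, k + 20.0 * math.sqrt(2.0 * k) + 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def critical_value(alpha, k):
    """Upper-alpha critical value of 2^(1/2) * (sqrt(chi2_k) - sqrt(k)), i.e. (4.3)."""
    return math.sqrt(2.0) * (math.sqrt(chi2_quantile(1.0 - alpha, k)) - math.sqrt(k))


for k in (1, 2, 10, 40):
    print(k, round(critical_value(0.05, k), 2), round(critical_value(0.01, k), 2))
```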
The test based on $X_J$ or $G_J$ consists of rejecting for large values of the statistic compared with critical values in Table 1 and, more generally, for other critical values determined by the distribution (4.3). Thus the test procedure is based on the approximation of the general limiting distribution, (4.2), by a simplification, (4.3).
When $Y$ has a multinomial distribution, $X^2$, $X_J$, $G^2$, and $G_J$ are asymptotically equivalent tests under the preceding conditions. (The asymptotic equivalence of $X^2$ and $G^2$ under these assumptions was previously established; e.g., see Haberman 1974a.)
If $C$ is assumed to be a scalar multiple of the covariance matrix for the multinomial distribution, each of the $\delta_i$'s in (4.2) assumes the value of this scalar multiplier. Consequently the limiting distribution is again (4.3), and the jackknifed tests are asymptotically exact under the null hypothesis. This situation corresponds to models studied by Cohen (1976), Altham (1976), and Brier (1980). If (3.2) is used as the estimated covariance matrix in the computation of $X^2/\hat{\delta}_{\cdot}$ and $X^2_S$ proposed by Rao and Scott, the tests will be asymptotically equivalent to each other and to $X_J$.
The more general relationship between $X^2/\hat{\delta}_{\cdot}$ and $X_J$ may be seen by expressing
$$X_J = A^*\{2^{1/2}\{(X^2(Y)/(K^+/k))^{1/2} - (k)^{1/2}\}\}, \tag{4.4}$$
where
$$A^* = ((4X^2(Y)K^+)/(kV))^{1/2}. \tag{4.5}$$
For equal $\delta_i$'s, $A^*$ converges in probability to 1. In the general case, comparison of $X_J/A^*$ to the distribution of (4.3) gives a test asymptotically equivalent to comparison of $X^2/\hat{\delta}_{\cdot}$ to the chi-squared distribution on $k$ degrees of freedom. Thus if $X_J/A^*$ were used as a test statistic in place of $X_J$, it would consequently share the tendency of $X^2/\hat{\delta}_{\cdot}$ to reject under the null hypothesis at an inflated rate when the $\delta_i$'s vary.
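
That (4.4) and (4.5) merely re-express the definition (3.10) can be verified by direct substitution. The short derivation below is a worked check added for the reader; it assumes $K^+ > 0$ and abbreviates $X^2(Y)$ as $X^2$.

```latex
\begin{align*}
A^*\, 2^{1/2}\Bigl\{\bigl(X^2/(K^+/k)\bigr)^{1/2} - k^{1/2}\Bigr\}
  &= \Bigl(\frac{4X^2K^+}{kV}\Bigr)^{1/2} (2k)^{1/2}\,
     \frac{(X^2)^{1/2} - (K^+)^{1/2}}{(K^+)^{1/2}} \\
  &= \Bigl(\frac{8X^2}{V}\Bigr)^{1/2}\bigl\{(X^2)^{1/2} - (K^+)^{1/2}\bigr\} \\
  &= \frac{(X^2)^{1/2} - (K^+)^{1/2}}{\{V/(8X^2)\}^{1/2}} = X_J .
\end{align*}
```

The factor $A^*$ is therefore the only difference between $X_J$ and a monotonic transformation of $X^2/(K^+/k)$.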
4.3 Properties of the Test in the General Case
The testing procedure based on approximating the distribution of (4.2) by (4.3) is not asymptotically exact except when the $\delta_i$'s in (4.2) are equal. Computer simulation indicates that the proposed tests are otherwise asymptotically conservative at the nominal levels of .05 and .01 shown in Table 1 (although at the .01 level, 2.35 must be used in place of 2.34 or 2.33 for $k$ above 20 to make the statement strictly true).
A heuristic argument offers an explanation for this observed conservatism. When one of the $\delta_i$'s, say $\delta_1$, is substantially larger than the others, the numerator and denominator of (4.2) will be positively correlated: when $\chi_1^2$ is large, both terms will tend to be larger than their respective expectations. In turn, a positive correlation between the numerator and denominator of (4.2) could have the effect of reducing the probability of extreme values.
In computer simulation of the distribution (4.2), actual rejection rates for the test based on (4.3) fell below the nominal level by modest amounts for all choices. Experimentation with $\delta_i$'s indicated a minimum of the actual rejection rates under the null hypothesis when all $\delta_i$'s are fixed at 1, except one, say $\delta_1$, set to a large value. For $k = 6$, the minimum rejection rate at the nominal .05 level is about .032 with $\delta_1$ set to around 10. (Increasing $\delta_1$ further actually raises the power again toward .05.) For $k = 10$, the minimum of about .027 occurs for $\delta_1$ around 15. Similarly, $\delta_1 \approx 10$ yields a minimum of .025 for $k = 16$; $\delta_1 \approx 20$ yields the minimum of .022 for $k = 40$. (Although the minimum values cited are fairly precisely determined, the indication of the associated choice of $\delta_1$ giving the true minimum is far more coarse.)
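
A simulation of this kind is easy to sketch. The Python below is a rough illustration, not the author's simulation program: it assumes that the limiting variate (4.2) has the form $\{(\sum_i \delta_i\chi_i^2)^{1/2} - (\sum_i \delta_i)^{1/2}\} / \{\sum_i \delta_i^2\chi_i^2 / (2\sum_i \delta_i\chi_i^2)\}^{1/2}$, which is the form consistent with (3.10) through (3.13) and (4.3); the function names, seed, and number of draws are arbitrary choices. It estimates the rejection rate at the Table 1 critical value 1.58 for $k = 10$, once with equal weights and once with $\delta_1 = 15$.

```python
import math
import random


def limiting_statistic(deltas, rng):
    """One draw from the assumed form of the limiting distribution (4.2)."""
    chi = [rng.gauss(0.0, 1.0) ** 2 for _ in deltas]   # independent chi-square(1)
    s1 = sum(d * c for d, c in zip(deltas, chi))       # sum of delta_i * chi_i^2
    s2 = sum(d * d * c for d, c in zip(deltas, chi))   # sum of delta_i^2 * chi_i^2
    numerator = math.sqrt(s1) - math.sqrt(sum(deltas))
    denominator = math.sqrt(s2 / (2.0 * s1))
    return numerator / denominator


def rejection_rate(deltas, critical_value, draws=40000, seed=1985):
    """Monte Carlo estimate of P(statistic > critical_value) under the null."""
    rng = random.Random(seed)
    hits = sum(limiting_statistic(deltas, rng) > critical_value
               for _ in range(draws))
    return hits / draws


equal = rejection_rate([1.0] * 10, 1.58)              # all delta_i equal
adverse = rejection_rate([15.0] + [1.0] * 9, 1.58)    # delta_1 = 15, others 1
print(equal, adverse)
```

With equal weights the estimate should sit near the nominal .05, and with one dominant weight it should fall below it, in line with the conservatism described above. The exact figures depend on the seed and the number of draws.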
Further analytic approximation indicates that for this case of $\delta_1$ set considerably higher than the others, it is possible to construct a sequence of $\delta_i$'s with increasing $k$ such that the actual rejection rate under the null hypothesis for tests at the nominal .05 level converges to zero. (The sequence has $\delta_1$ increasing slightly faster than $k^{1/2}$, with the other $\delta_i$'s fixed at 1.) Fortunately, the rate at which the minimum rejection rate approaches zero is gentle: computer simulation of the analytic approximation gives a minimum rejection rate of .015 for $k = 160$ and .013 for $k = 640$.
It should be emphasized that these somewhat low rejection rates are obtained under the most adverse combination of the $\delta_i$'s, and the actual asymptotic rejection rate should usually be far closer to .05. Moreover, although rejection rates lower than the nominal level under the null hypothesis suggest some loss of power against near alternative hypotheses, the Appendix details how the test remains consistent against any fixed alternative. Thus sufficient data would always lead to the rejection of false hypotheses, regardless of the values of the underlying $\delta_i$'s.
It should also be emphasized that the range of actual asymptotic rejection rates offered by $X_J$ seems far preferable to that given by $X^2/\hat{\delta}_{\cdot}$. For example, at $k = 10$ the range for $X_J$ at the nominal .05 level is .027 to .050, whereas $X^2/\hat{\delta}_{\cdot}$ rejects asymptotically at rates varying from .050 to .177.
5. COMMENTS ON THE APPLICATION OF THE TESTS
The jackknifed tests have been incorporated in the computer program CPLX (Contingency Table Analysis for Complex Sample Designs; described in Fay 1982). The documentation for this program (Fay 1983a) includes comments on strategies to formulate replication approaches for common complex designs, provides two example applications of these strategies to specific designs (one of them for the Census Bureau's Public-Use Files from the Current Population Survey), and reports on an extensive set of Monte Carlo experiments to evaluate the performance of these test statistics. (Both the program and documentation are in the public domain.) This section will attempt to summarize the range of application of the jackknifed tests, with implicit citation to this documentation unless otherwise noted.
* The Monte Carlo results confirmed earlier findings that when the data arise from a multinomial distribution, the Pearson test is superior to the likelihood-ratio chi-square (Larntz 1978) for tests of goodness of fit, and the likelihood-ratio test is superior to the difference of Pearson chi-squares (Haberman 1977) for tests of parameters in most situations, such as tests of parameters in a logistic model. Not too surprisingly, the Monte Carlo results favored the same choices for the jackknifed tests. For multinomial samples, the jackknifed tests do not realize any advantage over the preferred original test.
* Almost any form of complex design (except the benign forms of stratification discussed by Rao and Scott 1981) has a potentially severe effect on the chi-squared tests. Although the jackknifed tests require somewhat larger sample sizes for the asymptotic theory to approximate actual performance (discussed in detail in the program documentation), they otherwise offer protection against the effects of a complex design.
* The number of replicates required is well within computational capabilities, given current computer resources. The Monte Carlo results indicated only slight improvement in the use of 50 fractions over 20 for purposes of constructing the simple jackknife, or of 50 half-sample replicates over 20. The number of replicates for the stratified jackknife should be necessarily larger, depending on the number of strata. (Effectively, a degree of freedom in the estimated variance is lost for each stratum.) This range of 20-50 for practical application appears satisfactory regardless of table size (even when the number of cells is many times the number of replicates).
* For tests of overall fit, the asymptotic theory breaks down when the number of cells with no observations becomes more than a small fraction of the degrees of freedom. This happens sooner with half-sample methods than with the simple jackknife.
* For tests of the contribution of specific sets of parameters to a model (testing the relative fit of two nested models), the jackknifed likelihood-ratio test essentially requires only that the corresponding marginal tables be reasonably filled out. Thus there is a wide potential range of application to logistic and other causal modeling even when the table itself is sparse. Comparison of nested models is often more essential to the development of models for large cross-classifications than is testing of goodness of fit.
* When a single parameter of a model is to be tested with data from a complex sample design, the researcher is faced with a choice between the jackknifed likelihood-ratio test and a standardized value of the parameter based on an estimated standard error derived by standard replication theory. Not only are these alternatives asymptotically equivalent, however, but the Monte Carlo findings indicate extremely high agreement (both tests rejecting or both tests not rejecting), for example, on the order of 99%, for realistic sample sizes (e.g., only 100 observations) and numbers of replicates (e.g., 20 jackknife replicates). Thus the practical implications of this choice are generally minimal and would be virtually nil in applications involving a reasonably large number of observations.
* Although not emphasized in this article, the asymptotic theory for the jackknifed tests applies to a wider range of problems, such as latent structure models described by Goodman (1974) and Haberman (1974b), log-linear models with structural zeros, and the non-log-linear models proposed by Haberman (1974c), Goodman (1979, 1981), and Clogg (1982) for tables that include variables with ordered categories.
* Replication methods can be adapted to a number of complex sample designs, including consideration of multistage sampling, complex estimation procedures, and finite population corrections.
* One desirable feature of $X^2/\hat{\delta}_{\cdot}$ is its ability to be computed in some cases without access to the original survey data. Whenever $X_J$ can be computed, however, it offers greater protection against rejection of the null hypothesis at a rate appreciably higher than the nominal level.
Like $X_J$, the test $X^2_S$ proposed by Rao and Scott essentially requires access to the original data. (The statistic can be computed from an estimated covariance matrix for the cells of the complete cross-classification, but such a matrix is rarely, if ever, available in practice for large tables without specific effort for this purpose.) Asymptotically, $X^2_S$ probably gives actual rejection rates over a range both narrower and closer to the nominal level than does the generally conservative $X_J$. It can be shown, however, that under some practical conditions, the actual (as opposed to asymptotic) performance of $X^2_S$ would be more conservative than $X_J$. (Further comments on this point are given in Fay 1983b.) Thus neither test is uniformly superior to the other, but both appear to be more complete solutions to the general problem than are previous alternatives.
The formulation of $X^2_S$ also extends to other models beyond the log-linear model, as does $X_J$. For specialized applications, however, researchers may find $X_J$ easier to implement than $X^2_S$, since the former only requires algorithms to compute estimates under the model.
6. AN EXAMPLE
The U.S. Bureau of the Census contracted with Damans and
Associates to conduct the Knowledge, Attitude, and Practice
Survey (KAP) to measure the impact of the publicity campaign
for the 1980 Census and public knowledge of the census. (This
example is discussed in more detail by Fay 1983a, but it is
summarized here because it illustrates some important features
of complex designs and the behavior of the jackknifed tests.)
This survey included interviews at two distinct times or phases:
one in early 1980, before the campaign had substantially begun,
and the second in mid-March, a few weeks before Census Day
(April 1, 1980), when the campaign had reached essentially
full intensity. A basic specification for the design was that the
survey obtain samples of approximately equal size for the black,
Spanish, and remaining ("other") population to provide reliable
estimates for these three groups.
The design of this survey was similar in many respects to
other national household samples. The first step or stage of the
design was to draw a sample of counties. Counties (townships
in New England) were grouped into primary sampling units
(PSU's) composed of one or more contiguous counties. To
accommodate the objective of samples of equal size for the
three groups, PSU's were assigned scores based on a linear
combination of their black, Spanish, and other populations,
with coefficients inversely proportional to their respective
relative national population sizes. Because of their large scores,
the PSU's that covered the cities of New York, Philadelphia,
Chicago, Detroit, Miami, Houston, Washington, San Antonio,
Los Angeles, and San Francisco were included with certainty,
that is, without any random selection at the first stage. From
all remaining PSU's in the country, a stratified sample of 40
non-self-representing PSU's were selected with probabilities
proportional to their scores.
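The scoring rule can be illustrated numerically. The Python sketch below is only an illustration: the PSU populations and national totals are hypothetical (none are KAP figures), the coefficient for each group is assumed to be the reciprocal of its national population, and the name `psu_score` is invented for the example.

```python
# Hypothetical national population totals by group (illustrative only).
NATIONAL = {"black": 26_000_000, "spanish": 14_000_000, "other": 186_000_000}


def psu_score(psu_pop, national=NATIONAL):
    """Linear combination of a PSU's group populations, with coefficients
    inversely proportional to the national size of each group."""
    return sum(psu_pop[g] / national[g] for g in national)


# Three hypothetical non-self-representing PSUs in one stratum.
stratum = {
    "PSU-A": {"black": 40_000, "spanish": 5_000, "other": 300_000},
    "PSU-B": {"black": 2_000, "spanish": 60_000, "other": 150_000},
    "PSU-C": {"black": 1_000, "spanish": 1_000, "other": 400_000},
}

scores = {name: psu_score(pop) for name, pop in stratum.items()}
total = sum(scores.values())
# Selection probabilities proportional to score, assuming one PSU per stratum.
selection_prob = {name: s / total for name, s in scores.items()}
```

In this made-up stratum, PSU-C has the largest total population but the smallest score, because the scoring favors PSU's with many black or Spanish residents.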
In both the 10 self-representing (certainty) and 40 non-self-representing
PSU's, the households were selected through additional
stages of sampling, including stratification and sample
allocation according to estimates of the black and Spanish
populations for census tracts within the sampled PSU's, again
with the primary objective of selecting approximately equal
samples for the three demographic groups. The ultimate units
of the sample were clusters (segments) of neighboring households.
By design, segments for the second phase of interviewing
just before Census Day were physically adjacent to the
segments interviewed for the first phase. (Thus the sample for
each phase was a probability sample, but the samples in the
two phases cannot be considered statistically independent.)
Because of the complexity of the design, the unconditional
probabilities of inclusion in the sample for interviewed households
varied substantially, and the reciprocals of these probabilities,
modified by an adjustment for noninterviewed households,
were employed to weight the responses for purposes of
estimation.
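The weighting step can be sketched as follows. The article does not give the form of the noninterview adjustment, so this Python sketch assumes a simple weighting-class adjustment (an assumption, not the survey's documented procedure). All probabilities and function names are hypothetical.

```python
def base_weight(inclusion_prob):
    """Reciprocal of the unconditional inclusion probability."""
    return 1.0 / inclusion_prob


def noninterview_factor(sampled_weights, interviewed_weights):
    """Assumed weighting-class adjustment: total base weight of all sampled
    households divided by the total base weight of those interviewed."""
    return sum(sampled_weights) / sum(interviewed_weights)


# Hypothetical adjustment cell: four sampled households, one not interviewed.
probs = [0.002, 0.002, 0.0005, 0.0005]
responded = [True, True, True, False]

base = [base_weight(p) for p in probs]
factor = noninterview_factor(base, [w for w, r in zip(base, responded) if r])
# Final weights for interviewed households only.
final = [w * factor for w, r in zip(base, responded) if r]
```

Under this assumed adjustment, the interviewed households carry the full base weight of the cell, so the weighted total is unchanged by the noninterview.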
In this application, the stratified jackknife was chosen to
represent the effect of variability from the design. Non-self-representing
PSU's were usually grouped by pairs into strata
for the computation of the jackknife, but in a few instances,
groupings of three or four were more convenient and consequently
were used. Each of the self-representing PSU's was
treated as a separate stratum, and the segments of sampled
households were grouped into three approximately equivalent
and independent subsamples, balanced as closely as possible
according to the original stratification of segments within the
PSU. Segments interviewed in the second phase were assigned
to the same groups as the respective neighboring segments in
the first phase, thus reflecting in the formation of the replicates
this potentially important source of covariance.
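A minimal Python sketch of how replicate perturbations might be formed from such groupings is given below. It assumes the usual delete-one-group form of the stratified jackknife: dropping group j of a stratum with J groups inflates the remaining groups by J/(J - 1), and the stratum factor is (J - 1)/J. These formulas are not stated in this section, so they, the function name, and the toy data are assumptions.

```python
import numpy as np


def jackknife_perturbations(groups):
    """groups: one array per stratum, of shape (J_i, n_cells), holding the
    weighted cell totals contributed by each of the J_i groups.
    Returns the full-sample table Y, the perturbations W[i][j] (replicate
    estimate minus Y), and the stratum factors b[i] = (J_i - 1) / J_i."""
    Y = sum(g.sum(axis=0) for g in groups)
    W, b = [], []
    for g in groups:
        J = g.shape[0]
        stratum_total = g.sum(axis=0)
        # Dropping group j and inflating the rest by J / (J - 1) changes
        # the stratum total by (stratum_total - J * g[j]) / (J - 1).
        W.append((stratum_total - J * g) / (J - 1))
        b.append((J - 1) / J)
    return Y, W, b


groups = [
    np.array([[10.0, 4.0], [14.0, 6.0]]),            # a pair of PSU's
    np.array([[5.0, 2.0], [7.0, 1.0], [6.0, 3.0]]),  # three subsamples
]
Y, W, b = jackknife_perturbations(groups)
# Replication estimate of the covariance matrix of the cell estimates.
cov = sum(b_i * W_i.T @ W_i for b_i, W_i in zip(b, W))
```

The perturbations within each stratum sum to zero by construction. Second-phase segments would enter the same group as their first-phase neighbors, so any covariance between phases is carried into the replicates.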
This illustrative analysis will focus on the effect of race/
ethnicity (R), income (I), and phase of interviewing (P) on
whether respondents had recently heard of the census (H) (also
termed "exposure" here). Table 2 shows the weighted survey
estimates, in thousands. As with many estimates from national
surveys, the entries shown are rounded to thousands for display
as a matter of convenience and ease of interpretation; the
calculations described here were also carried out in thousands but
without rounding. (Multiplication of all of the survey estimates
by a uniform factor has no effect on the values of the jackknifed
tests.)
Table 2. Whether Respondents Had Recently Heard of the Census: By Race/Ethnicity,
Household Income, and Phase, From the KAP Survey (weighted estimates in thousands)

                                          Recently Heard of Census
                                      Jan./Feb. 1980        March 1980
Race/Ethnicity  Household Income ($)     Yes       No       Yes       No
Black           22,000 or more           441      209     1,008       95
                12,000-21,999          1,459    1,572     1,876      554
                0-11,999               1,738    3,682     2,483    2,057
Spanish         22,000 or more           169       58       362        0
                12,000-21,999            141      595       730      125
                0-11,999                 196    1,012       727      339
Other           22,000 or more         9,485    6,422    16,014    2,032
                12,000-21,999          8,066   10,486    13,691    6,278
                0-11,999               7,023   15,266    15,106    4,099

NOTE: Cases with incomplete data are excluded from the analysis.
Table 2 shows a general increase in exposure to the census
(H) over time. A more subtle question is whether this change
in exposure was uniform (on a logistic scale) for these groups
or instead varied by group, possibly because of differences in
the relative efficacy of the publicity campaign. An analysis of
how exposure changed over time could consider the sequence
of models in Table 3. The inclusion of the three-way interaction
of income, race, and phase, [IRP], in each model makes each
equivalent to a logistic model for H. The three-way interaction
of exposure (H), income, and race, [HIR], in each model
incorporates all interaction of income and race upon exposure,
although without regard to phase. Thus all remaining parameters
pertain to change in exposure over time. (In Table 3, the
chi-squared tests for these models are reported in thousands
simply for convenience. Extraordinarily large chi-squared tests
are the rule when surveys weighted to national totals are analyzed.)
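The general increase in exposure can be checked directly from the weighted estimates in Table 2. The Python sketch below collapses the table over income and compares the share reporting recent exposure in the two phases, on both the proportion and the logit scale. This is a descriptive summary only, not one of the log-linear models of Table 3, and the helper names are invented for the example.

```python
import math

# Weighted estimates in thousands from Table 2, as (yes, no) pairs listed by
# household income from "22,000 or more" down to "0-11,999".
TABLE2 = {
    "Black": {
        "Jan./Feb.": [(441, 209), (1459, 1572), (1738, 3682)],
        "March": [(1008, 95), (1876, 554), (2483, 2057)],
    },
    "Spanish": {
        "Jan./Feb.": [(169, 58), (141, 595), (196, 1012)],
        "March": [(362, 0), (730, 125), (727, 339)],
    },
    "Other": {
        "Jan./Feb.": [(9485, 6422), (8066, 10486), (7023, 15266)],
        "March": [(16014, 2032), (13691, 6278), (15106, 4099)],
    },
}


def share_heard(cells):
    """Share answering yes, pooling the income classes."""
    yes = sum(y for y, _ in cells)
    no = sum(n for _, n in cells)
    return yes / (yes + no)


def logit(p):
    return math.log(p / (1.0 - p))


summary = {}
for race, phases in TABLE2.items():
    before = share_heard(phases["Jan./Feb."])
    after = share_heard(phases["March"])
    summary[race] = (before, after, logit(after) - logit(before))
    print(f"{race:8s} {before:.3f} -> {after:.3f}  logit change {summary[race][2]:.2f}")
```

On these marginal figures the share rises in all three groups, with the largest change on the logit scale for the Spanish group. Whether such differences exceed sampling variability is the question the tests in Table 3 address.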
The preferred jackknifed test of overall fit, XJ (but also GJ),
indicates that model 1, which posits an equal (logistic) change
in exposure over time, apparently fits well. Further fitting,
however, shows that allowance for a differential effect of race/
ethnicity over time, [HRP], appears to improve the fit, whereas
allowance for a differential effect of income, [HIP], instead
leaves a significant lack of fit. Note, however, that the
conclusions drawn by XJ about models 2 and 3 (each with 6 df)
are opposite to the order implied by the magnitudes of X^2. Thus
any technique to evaluate the chi-squares of different models
by an overall correction factor independent of the choice of
model will disagree with one or both results from XJ. The
statistic $X^2/\hat{\delta}_{\cdot}$ of Rao and Scott (1981), which allows different
adjustments depending on the choice of model (indeed, these
researchers emphasize the importance of this consideration),
cannot be readily applied here, since none of the three models
admits closed-form solutions (Haberman 1974a; Goodman 1970).

Table 3. Tests of Hypotheses for the KAP Example

                                     X^2 (in      G^2 (in
Model  Parameters              df   thousands)   thousands)     XJ     GJ

Tests of Goodness of Fit
1      [IRP],[HIR],[HP]         8    1,666.6      1,630.2      .30    .33
2      [IRP],[HIR],[HRP]        6    1,217.3      1,212.8      .03    .03
3      [IRP],[HIR],[HIP]        6      946.9        925.7     1.63   1.78

Tests of Parameters
1-2    [HRP]                    2      449.3        417.4     1.56   1.69
1-3    [HIP]                    2      719.8        704.5     -.23   -.23

NOTE: Analysis of cross-classification in Table 2 of whether respondents had recently heard
of the census (H) by race/ethnicity (R), income (I), and phase (P).
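The entries of Table 3 can be cross-checked with a few lines of Python. Each test of parameters compares model 1 with a model that adds one term, so its X^2 and G^2 should equal the difference between the two goodness-of-fit statistics, up to rounding of the published figures. The variable names are chosen for the example.

```python
# Goodness-of-fit statistics from Table 3 (in thousands), keyed by model number.
X2 = {1: 1666.6, 2: 1217.3, 3: 946.9}
G2 = {1: 1630.2, 2: 1212.8, 3: 925.7}
XJ = {1: 0.30, 2: 0.03, 3: 1.63}

# Reported tests of parameters (in thousands): model 1 versus models 2 and 3.
reported = {
    (1, 2): {"X2": 449.3, "G2": 417.4},
    (1, 3): {"X2": 719.8, "G2": 704.5},
}

differences = {}
for (small, large), stats in reported.items():
    dx = X2[small] - X2[large]
    dg = G2[small] - G2[large]
    differences[(small, large)] = (dx, dg)
    # Allow for rounding of the published figures to one decimal place.
    assert abs(dx - stats["X2"]) < 0.15 and abs(dg - stats["G2"]) < 0.15
    print(f"models {small}-{large}: X2 difference {dx:.1f}, G2 difference {dg:.1f}")

# Model 3 has the smaller X2 but the larger XJ, the reversal noted in the text.
reversal = X2[3] < X2[2] and XJ[3] > XJ[2]
```

The same figures show the reversal discussed above: ranking models 2 and 3 by unadjusted X^2 gives the opposite order from ranking them by XJ.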
The preferred test of the contribution of specific sets of
parameters, GJ (but XJ as well), is consistent with the tests of
overall fit, showing that the [HRP], but not the [HIP], parameters
make a significant contribution to the model. Again, this
is the opposite of the order suggested by the magnitude of the
usual chi-squared tests. Application of replication methods to
the parameters of the model (Fay 1983a) indicates that the
significance of the [HRP] parameters is driven by the contrast
of Spanish with the other two groups (z = 2.95, showing
greater increase in exposure for Spanish), whereas the income
parameters are not significant, confirming the results from GJ.
This reversal in the relative magnitudes of the jackknifed tests
and the original chi-squares in testing these two hypotheses is
not surprising, however, since the black and Spanish populations
were disproportionately sampled, thus giving comparisons
by race/ethnicity (such as [HRP]) greater relative reliability
than comparisons by income (such as [HIP]) compared
to simple random sampling of the total population. Hence
replication methods recognize this distinction and provide a sound
analysis of the data (based on frequentist theory).
This example was selected from a large number of applications
of these methods for the distinct reversal of the conclusions
from the jackknifed tests compared to what might be
inferred from the magnitude of the original chi-squares. The
reversal in this instance may be attributed to the large variation
in the survey weights. In the author's experience, for samples
with equal weights, those hypotheses with relatively larger
values for the simple chi-squared tests will be the ones rejected
by the jackknifed tests as well. Thus alternative methods based
on an overall adjustment to chi-square not depending on the
specific model may be acceptable in some applications,
particularly when access to the original data is impossible.
Nonetheless the ability to perform correctly, even in relatively
extreme situations, and the avoidance of unnecessary assumptions
would seem to recommend the jackknifed tests over such
alternatives in situations in which access to the original data
permits the computation.
APPENDIX: ASYMPTOTIC RESULTS
This appendix summarizes the results and methods of proof
given elsewhere (Fay 1980) for the asymptotic results cited in
this article.
The class of estimators for which this theory holds is general,
but all useful applications known to the author are to maximum
likelihood estimators (or asymptotic equivalents under
multinomial sampling) for a class of parametric models studied
earlier by Rao (1965). This class comprises parametric
representations for the expected value, $\pi(\theta)$, of a multinomial
proportion satisfying the following (numbered according to Rao
1965):
Assumption 1.1. Given a $\delta > 0$, there is an $\varepsilon > 0$ with
\[ \inf_{|\theta - \theta^0| > \delta} [(\pi(\theta^0), \ln \pi(\theta^0)) - (\pi(\theta^0), \ln \pi(\theta))] > \varepsilon, \tag{A.1} \]
where $(\cdot, \cdot)$ denotes the standard inner product and $\ln$ operates
on the components of a vector by taking their natural logarithms.
Assumption 2.3 (modified). The function $\pi$ is in $C^{(2)}$ in a
neighborhood of $\theta^0$, where $C^{(2)}$ is the space of twice
differentiable functions with a continuous second-order differential
in a neighborhood of $\theta^0$.
Assumption 3. The information matrix $(i_{rs})$ is nonsingular
at $\theta^0$, where
\[ i_{rs} = \sum_i \frac{1}{\pi_i} \frac{\partial \pi_i}{\partial \theta_r} \frac{\partial \pi_i}{\partial \theta_s}. \tag{A.2} \]
Under weaker conditions, Rao showed the (asymptotic) existence
and asymptotic efficiency of the maximum likelihood
estimator.
The results discussed here assume the three previous conditions
and the following:
Assumption 4. $\pi(\theta^0)$ has a strictly positive expected value
in each component.
Assumption 5. $\theta^0$ is an interior point of a parameter space
in $q$-dimensional Euclidean space.
The class of models satisfying assumptions 1.1, 2.3, 3, 4,
and 5 includes, of course, the log-linear models studied as
examples in this article, where in each model a parameter was
included to represent the overall sample total.
Haberman (1974b) discussed a related class of models arising
from the observation of only a subset of the variables in a
cross-classification whose expected values satisfy a log-linear
model for the complete table. Goodman (1974) illustrated the
application of these methods to latent structure analysis.
Generally, these models also satisfy the preceding assumptions,
except for a subset of the parameter space, where both
assumptions 1.1 and 3 may fail. Section 5 mentions other classes
of models that also satisfy the required conditions.
The statement of the following theorems encompasses both
tests of fit and tests of specific parameters. (The case of overall
fit is represented by a "saturated" model for $\pi$.) The first
theorem considers the asymptotic distribution under the null
hypothesis.
Theorem 1. Let $\pi(\theta)$ and $\pi'(\theta')$ be functions of $q$-dimensional
$\theta$ and $q'$-dimensional $\theta'$ satisfying assumptions 1.1, 2.3,
3, 4, and 5. Suppose that the true $\theta^0$ and $\theta^{0\prime}$ are such that
$\pi(\theta^0) = \pi'(\theta^{0\prime})$, and that in a neighborhood of $\theta^{0\prime}$, the image
of $\pi'$ is contained in the image of $\pi$ in a neighborhood
of $\theta^0$. (For nontrivial results, $q' < q$ is assumed.) Suppose
that for each $n$, there are random variables $Y_n$ and ${}_nW^{(i,j)}$,
$i = 1, \ldots, I_n$, $j = 1, \ldots, {}_nJ_i$, satisfying the following
three restrictions:
Condition 1. There exist constants $g_n$ and $h_n$ such that as
$n \to \infty$,
\[ h_n (Y_n - g_n \pi(\theta^0)) \xrightarrow{L} N(0, C), \tag{A.3} \]
for some covariance operator $C$, and
[Display equation (A.4) is not recoverable from the source.]
[Typically, $E((e, Y_n)) = g_n$, where $e$ represents the vector
whose elements are all 1.]
Condition 2. The ${}_nW^{(i,j)}$'s satisfy
\[ \sum_j {}_nW^{(i,j)} = 0 \tag{A.5} \]
for each $i$ and $n$, and there are constants ${}_nf_i$ such that as
$n \to \infty$,
[Display equation (A.6) is not recoverable from the source.]
for the covariance operator, $C$, in (A.4), where $\otimes$ denotes the
outer product.
Condition 3. As $n \to \infty$,
\[ h_n \sup_i \sup_j \| {}_nW^{(i,j)} \| \to 0. \tag{A.7} \]
Let $X^2$ denote the Pearson chi-squared test for the maximum
likelihood estimator $\pi(\theta^*)$ of $\pi(\theta^0)$, and let $X^{2\prime}$ be the
analogous test for the maximum likelihood estimator $\pi'(\theta^{*\prime})$ of
$\pi'(\theta^{0\prime})$ (or asymptotically equivalent estimators). Define
\[ {}_nP_{ij} = X^{2\prime}(Y_n + {}_nW^{(i,j)}) - X^2(Y_n + {}_nW^{(i,j)}) - (X^{2\prime}(Y_n) - X^2(Y_n)), \tag{A.8} \]
\[ K_n = \sum_i {}_nf_i \sum_j {}_nP_{ij}, \tag{A.9} \]
\[ V_n = \sum_i {}_nf_i \sum_j {}_nP_{ij}^2. \tag{A.10} \]
Then as $n \to \infty$,
\[ g_n K_n \to \sum_{m=1}^{q-q'} \delta_m, \tag{A.11} \]
and the joint distribution of $X^{2\prime}(Y_n) - X^2(Y_n)$ and of $V_n$ is
characterized by
\[ g_n h_n^2 (X^{2\prime}(Y_n) - X^2(Y_n)) \xrightarrow{L} \sum_{m=1}^{q-q'} \delta_m \chi^2_{(m)} \tag{A.12} \]
and
\[ g_n^2 h_n^2 V_n \xrightarrow{L} 4 \sum_{m=1}^{q-q'} \delta_m^2 \chi^2_{(m)} \tag{A.13} \]
for independent chi-squared variates, $\chi^2_{(m)}$ ($m = 1, \ldots, q - q'$),
and constants $\delta_m$ ($m = 1, \ldots, q - q'$), given by
\[ \delta_m = ((v^{(m)}, D^{-1}(\pi(\theta^0)) C' v^{(m)})), \tag{A.14} \]
where $((\cdot, \cdot))$ is the inner product defined by
\[ ((x, y)) = (x, D^{-1}(\pi(\theta^0)) y) \tag{A.15} \]
for vectors $x$ and $y$, $D^{-1}(\pi(\theta^0))$ is the operator that transforms
the elements of a vector $y = \{y_i\}$ by
\[ (D^{-1}(\pi(\theta^0)) y)_i = (\pi_i(\theta^0))^{-1} y_i, \tag{A.16} \]
and $C'$ is the covariance operator
\[ C' = C + (e, Ce)\, \pi \otimes \pi - Ce \otimes \pi - \pi \otimes Ce. \tag{A.17} \]
The vectors $v^{(m)}$, $m = 1, \ldots, q - q'$, are defined as an
orthonormal basis with respect to the inner product (A.15)
for $T(\pi(\theta^0)) - T'(\pi'(\theta^{0\prime}))$, the orthogonal complement of
$T'(\pi'(\theta^{0\prime}))$ in $T(\pi(\theta^0))$ with respect to (A.15), where $T(\pi(\theta^0))$
is the tangent plane to $\pi$ at $\theta^0$ and $T'(\pi'(\theta^{0\prime}))$ is the tangent
plane to $\pi'$ at $\theta^{0\prime}$. The $v^{(m)}$ are further restricted to satisfy
\[ ((v^{(m)}, D^{-1}(\pi(\theta^0)) C' v^{(t)})) = 0 \tag{A.18} \]
for $m \neq t$.
The same results apply to the statistic GJ formed by substitution
of $G^2$ for $X^2$ in (A.8) and (A.12).
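To make definitions (A.8)-(A.10) concrete, the Python sketch below evaluates the P_ij, K_n, and V_n terms for a test of overall fit of the independence model in a two-way table. In that case the larger model is saturated, so its Pearson statistic is identically zero and only $X^{2\prime}$ contributes. The function names, the toy table, the perturbations, and the constants `f` are all hypothetical. The sketch illustrates the definitions only and does not compute XJ.

```python
import numpy as np


def pearson_independence(table):
    """Pearson X^2 for independence in a two-way table of (weighted) totals,
    using the closed-form fitted values row_total * col_total / grand_total."""
    table = np.asarray(table, dtype=float)
    fitted = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    return float(((table - fitted) ** 2 / fitted).sum())


def jackknife_terms(Y, W, f, stat=pearson_independence):
    """Return (P, K, V) following (A.8)-(A.10) when the larger model is
    saturated, so that P_ij = stat(Y + W_ij) - stat(Y)."""
    base = stat(Y)
    P = [np.array([stat(Y + w) - base for w in W_i]) for W_i in W]
    K = sum(f_i * P_i.sum() for f_i, P_i in zip(f, P))
    V = sum(f_i * (P_i ** 2).sum() for f_i, P_i in zip(f, P))
    return P, K, V


Y = np.array([[30.0, 20.0], [20.0, 30.0]])
# Two hypothetical strata with two replicates each; the perturbations
# within a stratum sum to zero.
W = [
    [np.array([[2.0, -1.0], [-1.0, 2.0]]), np.array([[-2.0, 1.0], [1.0, -2.0]])],
    [np.array([[1.0, 1.0], [-1.0, -1.0]]), np.array([[-1.0, -1.0], [1.0, 1.0]])],
]
f = [0.5, 0.5]
P, K, V = jackknife_terms(Y, W, f)
```

Passing a different `stat` function, for example one that returns the difference between the statistics of two nested models, would give the terms for a test of specific parameters.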
Comment on Proof. The development is through standard
asymptotic methods. Analytic approximations for the first- and
second-order differentials of $X^2$ and $G^2$ are first developed.
The conditions of the theorem are shown to give (A.11) as a
consequence of the second-order term in the Taylor series
expansion for $X^2$ or $G^2$ [the contribution of the first-order term
to (A.9) is removed by (A.5)]. The existence of $v^{(m)}$, satisfying
(A.18), is a consequence of a standard theorem (Rao 1965).
These vectors are used in proving the joint asymptotic
distribution (A.12)-(A.13). The first-order term expansion of $X^2$ or
$G^2$ about the observed $Y$ dominates the analytic approximation
of (A.10). Although the $v^{(m)}$ are not necessarily unique, the
relationship (A.14) uniquely determines the $\delta_m$, up to
permutations of their order.
Simple transformation of the results of this theorem leads
to the characterization of XJ and GJ given in this article.
A second theorem establishes the consistency of XJ or GJ
against a fixed alternative hypothesis.
Theorem 2. Suppose that the conditions of Theorem 1 are
satisfied with the following modifications: The true expected
proportions are given by $\pi(\theta^0)$, but there is no longer any $\theta^{0\prime}$
with $\pi'(\theta^{0\prime}) = \pi(\theta^0)$; and the maximum likelihood estimator
$\pi'(\theta^{*\prime})$ of $\theta^{0\prime}$ at $\pi(\theta^0)$ has strictly positive elements. Then
\[ g_n^{-1} \{ X^{2\prime}(Y_n) - X^2(Y_n) \} \to \sum_i (\pi_i(\theta^0) - \pi_i'(\theta^{*\prime}))^2 / \pi_i'(\theta^{*\prime}) \tag{A.19} \]
and $g_n K_n$ and $V_n$ are both bounded in probability. The same
results hold for $G^2$.
Comment on Proof. The method of proof is similar to that
for Theorem 1. A more complete statement of the theorem
(Fay 1980) gives explicit limits for $g_n K_n$ and $V_n$.
Manipulation of the results of the theorem shows the divergence
of XJ or GJ under the conditions of this second theorem.
The material developed in the proofs may also be used to
determine the conditions under which the tests are asymptotically
efficient for nearby alternatives.
[Received February 1980. Revised June 1984.]
REFERENCES
Altham, P. A. E. (1976), "Discrete Variable Analysis for Individuals Grouped Into Families," Biometrika, 63, 263-269.
Bedrick, Edward J. (1983), "Adjusted Chi-Square Tests for Cross-Classified Tables of Survey Data," Biometrika, 70, 591-595.
Bishop, Yvonne M. M., Fienberg, Stephen E., and Holland, Paul W. (1975), Discrete Multivariate Analysis, Cambridge, MA: MIT Press.
Brier, Stephen S. (1980), "Analysis of Contingency Tables Under Cluster Sampling," Biometrika, 67, 591-596.
Chapman, D. W. (1966), "An Approximate Test of Independence Based on Replications of a Complex Survey Design," unpublished master's thesis, Cornell University, Dept. of Statistics.
Clogg, Clifford C. (1982), "Some Models for the Analysis of Association in Multiway Cross-Classifications Having Ordered Categories," Journal of the American Statistical Association, 77, 803-815.
Cohen, J. E. (1976), "The Distribution of the Chi-Square Statistic Under Cluster Sampling From Contingency Tables," Journal of the American Statistical Association, 71, 665-670.
Efron, Bradley (1979), "Bootstrap Methods: Another Look at the Jackknife," Annals of Statistics, 7, 1-26.
Efron, Bradley (1982), The Jackknife, the Bootstrap, and Other Resampling Plans, Philadelphia, PA: Society for Industrial and Applied Mathematics.
Fay, Robert E. (1980), "On Jackknifing Chi-Square Tests Statistics-Part II: Asymptotic Theory," unpublished manuscript, U.S. Bureau of the Census.
Fay, Robert E. (1982), "Contingency Table Analysis for Complex Designs: CPLX," Proceedings of the Section on Survey Research Methods, American Statistical Association, pp. 44-53.
Fay, Robert E. (1983a), "CPLX-Contingency Table Analysis for Complex Sample Designs, Program Documentation," unpublished report, U.S. Bureau of the Census.
Fay, Robert E. (1983b), "Replication Approaches to the Log-Linear Analysis of Data From Complex Samples," paper presented at the seminar "Recent Developments in the Analysis of Large Scale Data Sets," Statistical Office of the European Communities, Luxembourg, November.
Fellegi, Ivan P. (1980), "Approximate Tests of Independence and Goodness of Fit Based on Stratified Multistage Samples," Journal of the American Statistical Association, 75, 261-268.
Fienberg, Stephen E. (1979), "The Use of Chi-Square Statistics for Categorical Data Problems," Journal of the Royal Statistical Society, Ser. B, 41, 54-64.
Fienberg, Stephen E. (1980), The Analysis of Cross-Classified Data, Cambridge, MA: MIT Press.
Freeman, M. F., and Tukey, J. W. (1950), "Transformations Related to the Angular and the Square Root," Annals of Mathematical Statistics, 21, 607-611.
Goodman, Leo A. (1970), "The Multivariate Analysis of Qualitative Data: Interactions Among Multiple Classifications," Journal of the American Statistical Association, 65, 226-256.
Goodman, Leo A. (1974), "The Analysis of Systems of Qualitative Variables When Some of the Variables Are Unobservable. Part I: A Modified Latent Structure Approach," American Journal of Sociology, 79, 1179-1259.
Goodman, Leo A. (1978), Analyzing Qualitative/Categorical Data, Cambridge, MA: Abt Associates.
Goodman, Leo A. (1979), "Simple Models for the Analysis of Association in Cross-Classifications Having Ordered Categories," Journal of the American Statistical Association, 74, 537-552.
Goodman, Leo A. (1981), "Association Models and Canonical Correlation in the Analysis of Cross-Classifications Having Ordered Categories," Journal of the American Statistical Association, 76, 320-334.
Grizzle, J. E., Starmer, C. F., and Koch, G. G. (1969), "Analysis of Categorical Data by Linear Models," Biometrics, 25, 489-504.
Haberman, Shelby J. (1974a), The Analysis of Frequency Data, Chicago: University of Chicago Press.
Haberman, Shelby J. (1974b), "Log-Linear Models for Frequency Tables Derived by Indirect Observation: Maximum Likelihood Equations," Annals of Statistics, 2, 911-924.
Haberman, Shelby J. (1974c), "Log-Linear Models for Frequency Tables With Ordered Classifications," Biometrics, 30, 589-600.
Haberman, Shelby J. (1977), "Log-Linear Models and Frequency Tables With Small Expected Cell Counts," Annals of Statistics, 5, 1148-1169.
Haberman, Shelby J. (1978), Analysis of Qualitative Data: Vol. 1: Introductory Topics, New York: Academic Press.
Haberman, Shelby J. (1979), Analysis of Qualitative Data: Vol. 2: New Developments, New York: Academic Press.
Hidiroglou, M. A., and Rao, J. N. K. (1981), "Chisquare Tests for the Analysis of Categorical Data From the Canada Health Survey," paper presented at the International Statistical Institute Meetings, Buenos Aires, December 5.
Hidiroglou, M. A., and Rao, J. N. K. (1983), "Chi-Square Tests for the Analysis of Three-Way Contingency Tables From the Canada Health Survey," unpublished Technical Report, Statistics Canada.
Holt, D., Scott, A. J., and Ewings, P. D. (1980), "Chi-Squared Tests With Survey Data," Journal of the Royal Statistical Society, Ser. A, 143, 302-320.
Koch, G. G., Freeman, D. H., Jr., and Freeman, J. L. (1975), "Strategies in the Multivariate Analysis of Data From Complex Surveys," International Statistical Review, 43, 59-78.
Larntz, Kinley (1978), "Small-Sample Comparisons of Exact Levels for Chi-Squared Goodness-of-Fit Statistics," Journal of the American Statistical Association, 73, 253-263.
McCarthy, Philip J. (1969), "Pseudo-Replication: Half-Samples," Review of the International Statistical Institute, 37, 239-264.
Nathan, Gad (1973), "Approximate Tests of Independence in Contingency Tables From Complex Stratified Sampling," in Vital and Health Statistics, Ser. 2, No. 53, Washington, DC: National Center for Health Statistics.
Rao, C. R. (1965), Linear Statistical Inference and Its Applications, New York: John Wiley.
Rao, J. N. K., and Scott, A. J. (1981), "The Analysis of Categorical Data From Complex Sample Surveys: Chi-Squared Tests for Goodness of Fit and Independence in Two-Way Tables," Journal of the American Statistical Association, 76, 221-230.
Rao, J. N. K., and Scott, A. J. (1984), "On Chi-Squared Tests for Multiway Contingency Tables With Cell Proportions Estimated From Survey Data," Annals of Statistics, 12, 46-60.
Satterthwaite, F. E. (1946), "An Approximate Distribution of Estimates of Variance Components," Biometrics, 2, 110-114.
Scott, A. J., and Rao, J. N. K. (1981), "Chi-Squared Tests for Contingency Tables With Proportions Estimated From Survey Data," in Current Topics in Survey Sampling, eds. D. Krewski, R. Platek, and J. N. K. Rao, New York: Academic Press.