A Jackknifed Chi-Squared Test for Complex Samples

ROBERT E. FAY*

Source: Journal of the American Statistical Association, Vol. 80, No. 389 (Mar., 1985), pp. 148-157
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2288065

Complex sample designs typically invalidate the direct application of the familiar Pearson or likelihood-ratio chi-squared statistics for testing the fit of a model to a cross-classified table of counts. This article discusses the adjustment of these statistics through a jackknifing approach. The technique may generally be applied whenever a standard replication method, such as the jackknife, bootstrap, or repeated half-samples, provides a consistent estimate of the covariance matrix of the sample estimates. Properties of the limiting distribution of new test statistics, $X_J$ and $G_J$, are described. The new statistics may be used to test goodness of fit and to compare nested models.

KEY WORDS: Contingency tables; Log-linear models; Logit; Replication methods; Sample surveys.

* Robert E. Fay is a Staff Assistant, Statistical Methods Division, U.S. Bureau of the Census, Washington, DC 20233. The author is indebted to Charles D. Cowan, Stephen E. Fienberg, Ronald N. Forthofer, Myron J. Katzoff, J. N. K. Rao, Donald B. Rubin, Frederick J. Scheuren, and especially Shelby J. Haberman, for helpful comments and discussion. Jeffrey C. Moore collaborated on the analysis of the example. Thoughtful suggestions from an associate editor and two referees assisted in preparing this article.

1. INTRODUCTION

Standard textbooks in the analysis of categorical data through log-linear models (such as Bishop et al. 1975; Goodman 1978; Haberman 1974a, 1978, 1979; and Fienberg 1980) discuss these models in terms of classical sampling distributions, namely the Poisson, multinomial, and product multinomial (with occasional consideration of the hypergeometric). In these contexts, chi-squared tests of the overall fit and of the contribution made to the fit by specific groups of parameters are a principal tool in the selection and assessment of models for cross-classified data. The Pearson and likelihood-ratio chi-squared statistics represent the most commonly applied tests, but alternatives include the Freeman and Tukey (1950) chi-square and the Wald statistic proposed by Grizzle et al. (1969). Fienberg (1979) reviewed the literature and properties of these statistics.

By now it is well documented in the literature, although not always acknowledged in practice, that these simple test statistics may give extremely erroneous results when applied to data arising from a complex sample design. Research on alternative procedures to address this problem has taken a variety of directions, which, at the risk of oversimplification, may be divided into three major strategies:

1. Direct estimation of the covariance matrix of the cell observations and application of this covariance matrix in the estimation of the model and in the testing of the parameters and overall fit.
2. Use of estimation procedures motivated by simple random sampling, such as maximum likelihood, but computation of a test statistic explicitly dependent on an estimate of the covariance matrix of the cell observations.
3. As in 2, use of estimation procedures motivated by simple random sampling and modification, typically by a scale factor, of the usual Pearson or likelihood-ratio test statistics.

The first of these three categories may be identified with the generalized least squares method. Koch et al. (1975) gave an expression for the Wald statistic that is appropriate for sample designs admitting consistent estimates of the covariance matrix of the sample frequencies. Generally, this approach represents a complete asymptotic solution to the problem, but it often yields erratic results for complex samples under all but the most favorable of conditions. When the sample is known to be multinomial, the weighted least squares methodology appropriate for this distribution can behave satisfactorily even for moderately large tables (say, 100 cells) when the number of observations per cell is sufficient (say, a minimum of 50 per cell). Typical estimators of the covariance of the observed cross-classification for complex designs, however, yield far less precision than the multinomial analogues, and this reduced precision has a serious effect on the inversion required in the computation of the Wald statistic. This instability in the estimated inverse in turn inflates the rate of rejection under the null hypothesis, often enough to make the test unusable.

Representing the second strategy, a number of authors, including Chapman (1966), McCarthy (1969), Nathan (1973), and Fellegi (1980), have proposed alternative test statistics. Some tests require estimation of the full covariance matrix of the cell estimates, followed by an inversion producing the same potential difficulties as those encountered with the Wald statistic. Others, such as McCarthy's method, require additional assumptions.
Fellegi (1980) compared many of these alternatives, including the behavior of a statistic ($t'$) of this form, for the test of independence in a two-way classification.

The last approach is based on the properties of the chi-squared statistic computed from the sample estimates. Cohen (1976), Altham (1976), and Brier (1980) examined models for the interrelationship of the sample design and the population, which led to a simple correction to the chi-squared statistic. The disadvantage of these methods, however, is the need for additional assumptions on the covariance of the estimates for the procedures to work acceptably well. Fellegi (1980) proposed a statistic ($t''$) that attempts to correct the chi-squared statistic under more general assumptions. Rao and Scott (1981) reviewed these methods and examined the properties of the chi-squared statistic for two-way tables under several common complex sample designs. These two authors, along with their colleagues (Holt et al. 1980; Scott and Rao 1981; Hidiroglou and Rao 1981, 1983; Rao and Scott 1984), studied the distribution of the chi-squared tests under more general log-linear models and developed test statistics that can be computed for some models from the estimated cross-classification and the variances of the cells and marginal tables. A recent paper by Bedrick (1983) is also based on this general strategy.

This article describes a different solution, based on a modified jackknife procedure applied to the Pearson or likelihood-ratio test statistics themselves, that is closely related to the work of Rao and Scott, especially to their proposed statistic $X^2/\delta_\cdot$. The next section summarizes the results of these authors and shows how one component in the computation of the jackknifed chi-squared tests presented here could be used to construct a test asymptotically equivalent to $X^2/\delta_\cdot$. Under some asymptotic conditions, $X^2/\delta_\cdot$ rejects under the null hypothesis at a rate significantly higher than the nominal level; the jackknifed tests incorporate an additional adjustment to avoid this property. (Another statistic proposed by Rao and Scott, $X_S^2$, adjusts $X^2/\delta_\cdot$ by a different means.) The third section describes the actual computation of the tests by first developing a general notation for several common replication strategies. In some circumstances, the new tests are asymptotically conservative, and the fourth section discusses properties of the limiting distribution. The fifth section collects a number of general comments about the range of application of these new statistics, which is followed by an illustration. The Appendix states the asymptotic results.

In the Public Domain. Journal of the American Statistical Association, March 1985, Vol. 80, No. 389, Theory and Methods.

2. RELATIONSHIP OF THE NEW TESTS TO THOSE OF RAO AND SCOTT

The most general results presented by Rao and Scott (1984) consider a fixed vector, $\pi$, of cell proportions estimated by a consistent estimator $p^*$, where cells are indexed by $i = 1, \ldots, T$. They assume that $n^{1/2}(p^* - \pi)$ converges in distribution to $N(0, C)$ as $n$ increases, where $n$ is the sample size. These assumptions suitably characterize in many instances the cross-classified data obtained through complex designs in large sample surveys. They consider an estimator $\pi^*(p^*)$ of $\pi$ under a log-linear model, derived through maximum-likelihood estimation for multinomial sampling applied directly to $p^*$ (or some asymptotically equivalent estimator). In this setting, the corresponding versions of the usual Pearson and likelihood-ratio chi-squared tests are

$$X^2(p^*) = n \sum_i (p_i^* - \pi_i^*(p^*))^2 / \pi_i^*(p^*) \qquad (2.1)$$

and

$$G^2(p^*) = 2n \sum_i p_i^* \ln(p_i^* / \pi_i^*(p^*)). \qquad (2.2)$$

If for a specific model, the tests (2.1) and (2.2) have $k$ degrees of freedom under the null hypothesis for multinomial sampling, Rao and Scott showed that for the complex sample, the limiting distribution is given by (2.3) as the weighted sum of independent chi-squared variates $\chi_{(j)}^2$, $j = 1, \ldots, k$, each on a single degree of freedom,

$$\sum_{j=1}^{k} \delta_j \chi_{(j)}^2, \qquad (2.3)$$

where the $\delta_j$'s are nonnegative. In the case of multinomial sampling, the $\delta_j$'s will be identically 1; but in the general case, they depend on the limiting covariance matrix $C$, the true values $\pi$, and the specific model being investigated.

More generally, their paper also considered the case of the comparison of two nested models, $M_1$ and $M_2$, in the sense that $M_2$ is a special case of $M_1$. In the case of multinomial sampling, the practice is to consider the difference of the chi-squared tests under the two models. (If both models fit a constant term, so the observed and fitted totals over all cells agree, the difference of likelihood-ratio chi-squares gives the likelihood-ratio chi-squared test for comparison of the models, under multinomial sampling.) If the difference in degrees of freedom is $k$, then the limiting distribution under the null hypothesis for the difference of the chi-squared tests is chi-square on $k$ degrees of freedom in the multinomial case and that of (2.3) in the general case, where the $\delta_j$'s depend on $C$, $\pi$, and the two models in question.

Rao and Scott (1981, 1984) proposed two test statistics: $X_S^2$, based on application of the Satterthwaite (1946) approximation to the distribution of (2.3), and the simpler $X^2/\delta_\cdot$, where

$$\delta_\cdot = \sum_{j=1}^{k} \delta_j / k. \qquad (2.4)$$

The adjusted statistic, $X^2/\delta_\cdot$, is then compared to the chi-squared distribution on $k$ degrees of freedom. They gave several situations in which $\delta_\cdot$ may be estimated directly from the observed cross-classification and estimated variances for the cells of the cross-classification and for specific marginal totals, when the log-linear model admits a closed-form solution.

The jackknifed statistics presented here include a quantity $K^+$ based on variability in $X^2$ or $G^2$ over a set of replications. Under the null hypothesis, $K^+/k$ is a consistent estimator of $\delta_\cdot$; so $X^2/(K^+/k)$ could be used as an estimate of $X^2/\delta_\cdot$. The quantity $K^+$ is readily computed even when no closed-form expression is available for the estimates under the log-linear model.

One of the disadvantages of $X^2/\delta_\cdot$, however, is its rejection under the null hypothesis at a rate much greater than the nominal level when the corresponding $\delta_j$'s vary substantially from one another. The jackknifed tests incorporate a correction to avoid this problem; Section 4 makes explicit the nature of this correction. The test $X_S^2$ based on the Satterthwaite approximation compensates for variation in the $\delta_j$'s by a different means; Section 5 comments on the relative merits of $X_S^2$ and the jackknifed tests.

3. CALCULATION OF THE JACKKNIFED TEST STATISTICS

3.1 Representation of the Replication Methods

Suppose $Y$ represents an observed cross-classification, possibly in the form of estimated population totals for a finite population derived from a complex sample survey. The class of replication methods to be considered in this article will be based on (pseudo-) replicates $Y + W^{(i,j)}$, $i = 1, \ldots, I$, $j = 1, \ldots, J_i$, typically based on the same data as $Y$. The asymptotic theory for the jackknifed tests requires that

$$\sum_j W^{(i,j)} = 0 \qquad (3.1)$$

for each $i$. An estimate, $\mathrm{cov}^*(Y)$, of the covariance of $Y$ should be given by

$$\mathrm{cov}^*(Y) = \sum_i b_i \sum_j W^{(i,j)} \otimes W^{(i,j)}, \qquad (3.2)$$

where $W^{(i,j)} \otimes W^{(i,j)}$ represents the outer product of $W^{(i,j)}$ with itself (the standard cross-product matrix) and the $b_i$ are a fixed set of constants appropriate for the problem.

An additional condition is required for the asymptotic theory to apply: that the $W^{(i,j)}$'s be uniformly small relative to the sampling variance in $Y$ [estimated by (3.2)].
This condition, plus a condition on the accuracy of (3.2), will be made clearer by application of this notation to common replication methods in this section; Section 4 provides an explicit statement. Although some modification is necessary, the (simple) jackknife, a version of the jackknife to reflect stratification, and the half-sample and bootstrap methods each may be recast into this notation.

The standard jackknife may be applied when $Y$ can be represented as the sum of $n$ iid random variables, $Z^{(j)}$. The standard leave-one-out replicates, $Y^{(-j)} = Y - Z^{(j)}$, may be reweighted to the same expected total by the factor $n/(n - 1)$ and written as

$$n Y^{(-j)}/(n - 1) = Y + (Y - n Z^{(j)})/(n - 1). \qquad (3.3)$$

The second term on the right of (3.3) assumes the role of $W^{(i,j)}$ and satisfies (3.1). (Here, the subscript $i$ is fixed at 1.) The value $(n - 1)/n$ represents the usual choice for $b_i$.

Stratification constitutes one of the key techniques in complex sample design and is discussed in any standard textbook in the subject. In the preceding notation, the universe may be considered to be divided into $I$ strata. The jackknife may be adapted to this problem if the samples are selected independently from each stratum and if $Y$ may be represented as

$$Y = \sum_i \sum_j Z^{(i,j)}, \qquad (3.4)$$

where the $Z^{(i,j)}$ are $n_i$ iid random variables within stratum $i$. (These variables are not, however, assumed to be identically distributed across strata.) For each stratum $i$,

$$Y + W^{(i,j)} = Y + \Big(\Big(\sum_{j'} Z^{(i,j')}\Big) - n_i Z^{(i,j)}\Big)/(n_i - 1) \qquad (3.5)$$

has the same expected value as $Y$ and defines $W^{(i,j)}$, satisfying (3.1). The corresponding choice for $b_i$ is $(n_i - 1)/n_i$.

An alternative approach to variance estimation from complex samples is to form estimates based on half of the sample, selected to represent all sources of variability in the design. For example, in designs with two selections in each of a number of strata, half-samples may be formed by picking one from each of the pairs of selections. If $Z^{(i,1)}$, $i = 1, \ldots, I$, represent estimates of the totals, each based on half of the sample (reweighted, by a factor of 2 if necessary, to the original sample total), each may be reexpressed as

$$Z^{(i,1)} = Y + (Z^{(i,1)} - Y). \qquad (3.6)$$

Although not required in the usual application of half-sample methods, (3.1) forces inclusion of the complementary half-sample $W^{(i,2)} = Y - Z^{(i,1)}$. An additional modification is necessary, however, since the asymptotic theory requires $W^{(i,j)}$ to be uniformly small relative to the variation of $Y$, and the half-sample estimates lack this property. Instead, the representation

$$W^{(i,1)} = d(Z^{(i,1)} - Y), \qquad (3.7)$$
$$W^{(i,2)} = -W^{(i,1)}, \qquad (3.8)$$
$$b_i = 1/(2 d^2 I) \qquad (3.9)$$

generalizes the notion of half-sample replication. The asymptotic conditions may be met by allowing $d$ to converge to 0 in a suitable manner; in practice, $d = .05$ appears to be satisfactory for most applications. The sequence of half-samples may be based on either independent selections or balanced repeated replication (McCarthy 1969).

Bootstrap replication methods (Efron 1979, 1982) may also be considered in some applications; for purposes of computation of the jackknifed tests here, bootstrap samples may be treated as the equivalent of half-sample replicates reweighted to the population total.

3.2 Computation of the Statistics

The jackknifed values of the test statistics require refitting the given log-linear model to the replicates, $Y + W^{(i,j)}$, and recomputing the test statistics, $X^2(Y + W^{(i,j)})$ or $G^2(Y + W^{(i,j)})$, for these new tables. [In this and subsequent sections, the usual formulas for $X^2$ and $G^2$ are applied directly to the cross-classification, $Y$, whenever $Y$ is based on weighted estimates. In other words, for weighted $Y$ (or $Y + W^{(i,j)}$) the sum of the cell estimates of $Y$ (or $Y + W^{(i,j)}$) replaces $n$ in (2.1) and (2.2).]

Using the $b_i$ introduced in (3.2), the jackknifed test statistic, $X_J$, is defined by

$$X_J = [(X^2(Y))^{1/2} - (K^+)^{1/2}] / \{V/(8 X^2(Y))\}^{1/2}, \qquad (3.10)$$

where

$$P_{ij} = X^2(Y + W^{(i,j)}) - X^2(Y), \qquad (3.11)$$
$$K = \sum_i b_i \sum_j P_{ij}, \qquad (3.12)$$
$$V = \sum_i b_i \sum_j P_{ij}^2, \qquad (3.13)$$

and $K^+$ takes the value $K$ for positive $K$ and is zero otherwise.

A test of the difference of two chi-squared tests under nested models $M_1$ and $M_2$ is given by

$$G_J = \frac{(G_{(2)}^2(Y) - G_{(1)}^2(Y))^{1/2} - (K^+)^{1/2}}{\{V/(8 G_{(2)}^2(Y) - 8 G_{(1)}^2(Y))\}^{1/2}}, \qquad (3.14)$$

where

$$P_{ij} = G_{(2)}^2(Y + W^{(i,j)}) - G_{(1)}^2(Y + W^{(i,j)}) - (G_{(2)}^2(Y) - G_{(1)}^2(Y)), \qquad (3.15)$$

$K$ and $V$ are defined by (3.12) and (3.13) applied to (3.15), and $K^+$ again takes the value $K$ for positive $K$ and is zero otherwise. When $M_1$ is the saturated model, giving $G_{(1)}^2$ of zero in each case, (3.14) and (3.15) are direct analogues of (3.10) and (3.11) formed by replacing $X^2$ by $G^2$ throughout. Similarly, expressions analogous to (3.14) and (3.15) formed by replacing $G^2$ by $X^2$ throughout define $X_J$ for the difference of two Pearson tests, although $X_J$ should be set to 0 if the difference $X_{(2)}^2(Y) - X_{(1)}^2(Y)$ becomes negative. (This problem occurs readily for $X^2$ in sparse tables. In these instances, $G_J$ is preferable to $X_J$.)

4. PROPERTIES OF THE LIMITING DISTRIBUTION

4.1 Limiting Distribution Under the Null Hypothesis in the General Case

The Appendix considers the following asymptotic situation, with $n \to \infty$:

1. The population proportions $\pi$ are fixed and satisfy the log-linear model(s) in question. The model or models include a parameter for the sample total, so the model is an assertion only about population proportions. (This condition is satisfied by the vast majority of models fitted in practical situations.)
2. The sample proportions estimated by $Y_n$ are consistent for the population proportions $\pi$, and there are constants $g_n$ and $h_n$, depending only on $n$, such that the quantity $h_n(Y_n - g_n \pi)$ converges in distribution to a multivariate normal, $N(0, C)$, with a nonzero covariance, $C$.
3. The $W_n^{(i,j)}$'s introduced in (3.1) are such that as $n \to \infty$,

$$\max_{i,j} h_n \|W_n^{(i,j)}\| \to 0. \qquad (4.1)$$

4. The covariance estimator (3.2), when multiplied by $h_n^2$, converges in probability to $C$.
5. Condition (3.1) is strictly satisfied for each $n$.

Under these conditions, $X_J$ and $G_J$ are shown to have as their limiting distribution the distribution of

$$\frac{\big(\sum_{i=1}^{k} \delta_i \chi_i^2\big)^{1/2} - \big(\sum_{i=1}^{k} \delta_i\big)^{1/2}}{\big\{\sum_{i=1}^{k} \delta_i^2 \chi_i^2 \big/ \big(2 \sum_{i=1}^{k} \delta_i \chi_i^2\big)\big\}^{1/2}}, \qquad (4.2)$$

where $\chi_i^2$, $i = 1, \ldots, k$, are a set of independent chi-squared variates, each on a single degree of freedom, and the $\delta_i$ are a set of nonnegative weights depending on $C$ and the model(s) in question. The value of $k$ in (4.2) is the degrees of freedom of the test for multinomial sampling.

4.2 Definition of the Test

In the important special case of multinomial sampling, the $\delta_i$ are identically 1. More generally, when the $\delta_i$'s in (4.2) are equal to any one positive constant, (4.2) reduces to a simple monotonic transformation of the chi-squared distribution,

$$2^{1/2}\{(\chi_k^2)^{1/2} - k^{1/2}\}, \qquad (4.3)$$

where $\chi_k^2$ is distributed as chi-square on $k$ degrees of freedom. The expression (4.3) gives an approximate standardization of $(\chi_k^2)^{1/2}$. Table 1 provides critical values at the .05 and .01 levels for (4.3). Except for small $k$, the values for a given level depend only slightly on $k$. As $k$ increases, (4.3) approaches the $N(0, 1)$ distribution. Note that for $\alpha = .05$, the critical values increase monotonically with $k$, but for $\alpha = .01$, they rise to a maximum value of 2.35 for $k$ at 20 (approximately), compared with the limiting value 2.33.

Table 1. Critical Values of the $2^{1/2}((\chi_k^2)^{1/2} - k^{1/2})$ Distribution

Degrees of Freedom, k    .05     .01
1                        1.36    2.23
2                        1.46    2.29
3                        1.50    2.30
5                        1.55    2.33
10                       1.58    2.34
20                       1.60    2.35
40                       1.62    2.34
∞                        1.65    2.33

The test based on $X_J$ or $G_J$ consists of rejecting for large values of the statistic compared with critical values in Table 1 and, more generally, for other critical values determined by the distribution (4.3). Thus the test procedure is based on the approximation of the general limiting distribution, (4.2), by a simplification, (4.3).
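As a minimal sketch of how the statistic $X_J$ and the Table 1 critical values fit together, the following Python code computes $X_J$ for the test of independence in a 2 x 2 table using the simple jackknife, with $b_i = (n - 1)/n$ and replicate deviations $(Y - nZ^{(j)})/(n - 1)$. The cluster counts, the function names, and the restriction to the independence model (chosen only because its fitted values have a closed form) are illustrative assumptions; this is not the CPLX implementation.

```python
import math

def pearson_independence(table):
    """Pearson X^2 for independence in a 2 x 2 table given as [a, b, c, d].

    The sum of the cells plays the role of n, as described in Section 3.2.
    """
    a, b, c, d = table
    total = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    x2 = 0.0
    for cell, (r, s) in zip(table, ((0, 0), (0, 1), (1, 0), (1, 1))):
        fitted = rows[r] * cols[s] / total
        x2 += (cell - fitted) ** 2 / fitted
    return x2

def jackknifed_x2(units):
    """Return (X_J, X^2(Y), K, V) for the simple jackknife over `units`."""
    n = len(units)
    y = [sum(u[c] for u in units) for c in range(4)]
    b = (n - 1) / n                        # usual b_i for the simple jackknife
    # Replicate deviations W^(1,j) = (Y - n Z^(j)) / (n - 1)
    w = [[(y[c] - n * u[c]) / (n - 1) for c in range(4)] for u in units]
    x2_y = pearson_independence(y)
    # P_j = X^2(Y + W^(1,j)) - X^2(Y)
    p = [pearson_independence([y[c] + wj[c] for c in range(4)]) - x2_y
         for wj in w]
    k_stat = b * sum(p)                    # K
    v_stat = b * sum(pj ** 2 for pj in p)  # V
    if v_stat == 0.0:
        raise ValueError("V is zero; the replicates carry no information")
    k_plus = max(k_stat, 0.0)              # K+ is K when positive, else zero
    # X_J written as a product so that X^2(Y) = 0 does not divide by zero
    x_j = (math.sqrt(x2_y) - math.sqrt(k_plus)) * math.sqrt(8.0 * x2_y / v_stat)
    return x_j, x2_y, k_stat, v_stat

# Hypothetical cluster-level 2 x 2 counts [a, b, c, d]; eight clusters.
clusters = [
    [6, 2, 2, 6], [2, 6, 6, 2], [5, 3, 3, 5], [3, 5, 5, 3],
    [7, 1, 2, 6], [4, 4, 4, 4], [6, 2, 3, 5], [3, 5, 4, 4],
]
x_j, x2_y, k_stat, v_stat = jackknifed_x2(clusters)
CRITICAL_05_K1 = 1.36                      # Table 1, k = 1, .05 level
reject = x_j > CRITICAL_05_K1
```

With these made-up counts the statistic is negative, so independence would not be rejected at the .05 level. For a model without closed-form fitted values, `pearson_independence` would have to be replaced by an iterative fit applied to each replicate.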
When $Y$ has a multinomial distribution, $X^2$, $X_J$, $G^2$, and $G_J$ are asymptotically equivalent tests under the preceding conditions. (The asymptotic equivalence of $X^2$ and $G^2$ under these assumptions was previously established; e.g., see Haberman 1974a.)

If $C$ is assumed to be a scalar multiple of the covariance matrix for the multinomial distribution, each of the $\delta_i$'s in (4.2) assumes the value of this scalar multiplier. Consequently the limiting distribution is again (4.3), and the jackknifed tests are asymptotically exact under the null hypothesis. This situation corresponds to models studied by Cohen (1976), Altham (1976), and Brier (1980).

If (3.2) is used as the estimated covariance matrix in the computation of $X^2/\delta_\cdot$ and $X_S^2$ proposed by Rao and Scott, the tests will be asymptotically equivalent to each other and to $X_J$. The more general relationship between $X^2/\delta_\cdot$ and $X_J$ may be seen by expressing

$$X_J = A^* \{2^{1/2}\{(X^2(Y)/(K^+/k))^{1/2} - k^{1/2}\}\}, \qquad (4.4)$$

where

$$A^* = ((4 X^2(Y) K^+)/(k V))^{1/2}. \qquad (4.5)$$

For equal $\delta_i$'s, $A^*$ converges in probability to 1. In the general case, comparison of $X_J/A^*$ to the distribution of (4.3) gives a test asymptotically equivalent to comparison of $X^2/\delta_\cdot$ to the chi-squared distribution on $k$ degrees of freedom. Thus if $X_J/A^*$ were used as a test statistic in place of $X_J$, it would consequently share the tendency of $X^2/\delta_\cdot$ to reject under the null hypothesis at an inflated rate when the $\delta_i$'s vary.

4.3 Properties of the Test in the General Case

The testing procedure based on approximating the distribution of (4.2) by (4.3) is not asymptotically exact except when the $\delta_i$'s in (4.2) are equal.
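The effect of unequal weights can be explored by simulating the limiting distribution (4.2) directly, taking it to be the ratio of $(\sum \delta_i \chi_i^2)^{1/2} - (\sum \delta_i)^{1/2}$ to $\{\sum \delta_i^2 \chi_i^2 / (2 \sum \delta_i \chi_i^2)\}^{1/2}$. The sketch below is not the simulation code used for the article; the function name, the seed, the number of draws, and the use of the tabulated .95 chi-squared quantile 12.592 for $k = 6$ to form the critical value from (4.3) are assumptions made for illustration.

```python
import math
import random

def rejection_rate(deltas, critical_value, draws=100_000, seed=1985):
    """Monte Carlo estimate of P{(4.2) > critical_value} for given weights."""
    rng = random.Random(seed)
    sum_delta = sum(deltas)
    rejections = 0
    for _ in range(draws):
        # Independent chi-squared variates, each on one degree of freedom
        chi = [rng.gauss(0.0, 1.0) ** 2 for _ in deltas]
        s1 = sum(d * c for d, c in zip(deltas, chi))       # sum delta_i chi_i^2
        s2 = sum(d * d * c for d, c in zip(deltas, chi))   # sum delta_i^2 chi_i^2
        numerator = math.sqrt(s1) - math.sqrt(sum_delta)
        denominator = math.sqrt(s2 / (2.0 * s1))
        if numerator / denominator > critical_value:
            rejections += 1
    return rejections / draws

# Critical value from (4.3) for k = 6 at the .05 level (about 1.554),
# assuming 12.592 as the .95 quantile of chi-square on 6 degrees of freedom.
CRIT_05_K6 = math.sqrt(2.0) * (math.sqrt(12.592) - math.sqrt(6.0))

equal_rate = rejection_rate([1.0] * 6, CRIT_05_K6)              # all weights equal
unequal_rate = rejection_rate([10.0] + [1.0] * 5, CRIT_05_K6)   # one large weight
```

Under these assumptions the equal-weight call should reproduce the nominal .05 rate up to simulation error, while the second call corresponds to the configuration ($k = 6$, one weight near 10) for which the article reports a rate of about .032.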
Computer simulation indicates that the proposed tests are otherwise asymptotically conservative at the nominal levels of .05 and .01 shown in Table 1 (although at the .01 level, 2.35 must be used in place of 2.34 or 2.33 for $k$ above 20 to make the statement strictly true).

A heuristic argument offers an explanation for this observed conservatism. When one of the $\delta_i$'s, say $\delta_1$, is substantially larger than the others, the numerator and denominator of (4.2) will be positively correlated: when $\chi_1^2$ is large, both terms will tend to be larger than their respective expectations. In turn, a positive correlation between the numerator and denominator of (4.2) could have the effect of reducing the probability of extreme values.

In computer simulation of the distribution (4.2), actual rejection rates for the test based on (4.3) fell below the nominal level by modest amounts for all choices. Experimentation with $\delta_i$'s indicated a minimum of the actual rejection rates under the null hypothesis when all $\delta_i$'s are fixed at 1, except one, say $\delta_1$, set to a large value. For $k = 6$, the minimum rejection rate at the nominal .05 level is about .032 with $\delta_1$ set to around 10. (Increasing $\delta_1$ further actually raises the rejection rate again toward .05.) For $k = 10$, the minimum of about .027 occurs for $\delta_1$ around 15. Similarly, $\delta_1 \approx 10$ yields a minimum of .025 for $k = 16$; $\delta_1 \approx 20$ yields the minimum of .022 for $k = 40$. (Although the minimum values cited are fairly precisely determined, the indication of the associated choice of $\delta_1$ giving the true minimum is far more coarse.)

Further analytic approximation indicates that for this case of $\delta_1$ set considerably higher than the others, it is possible to construct a sequence of $\delta_i$'s with increasing $k$ such that the actual rejection rate under the null hypothesis for tests at the nominal .05 level converges to zero.
(The sequence has $\delta_1$ increasing slightly faster than $k^{1/2}$, with the other $\delta_i$'s fixed at 1.) Fortunately, the rate at which the minimum rejection rate approaches zero is gentle: computer simulation of the analytic approximation gives a minimum rejection rate of .015 for $k = 160$ and .013 for $k = 640$.

It should be emphasized that these somewhat low rejection rates are obtained under the most adverse combination of the $\delta_i$'s, and the actual asymptotic rejection rate should usually be far closer to .05. Moreover, although rejection rates lower than the nominal level under the null hypothesis suggest some loss of power against near alternative hypotheses, the Appendix details how the test remains consistent against any fixed alternative. Thus sufficient data would always lead to the rejection of false hypotheses, regardless of the values of the underlying $\delta_i$'s.

It should also be emphasized that the range of actual asymptotic rejection rates offered by $X_J$ seems far preferable to that given by $X^2/\delta_\cdot$. For example, at $k = 10$ the range for $X_J$ at the nominal .05 level is .027 to .050, whereas $X^2/\delta_\cdot$ rejects asymptotically at rates varying from .050 to .177.

5. COMMENTS ON THE APPLICATION OF THE TESTS

The jackknifed tests have been incorporated in the computer program CPLX (Contingency Table Analysis for Complex Sample Designs; described in Fay 1982). The documentation for this program (Fay 1983a) includes comments on strategies to formulate replication approaches for common complex designs, provides two example applications of these strategies to specific designs (one of them for the Census Bureau's Public-Use Files from the Current Population Survey), and reports on an extensive set of Monte Carlo experiments to evaluate the performance of these test statistics. (Both the program and documentation are in the public domain.) This section will attempt to summarize the range of application of the jackknifed tests, with implicit citation to this documentation unless otherwise noted.

* The Monte Carlo results confirmed earlier findings that when the data arise from a multinomial distribution, the Pearson test is superior to the likelihood-ratio chi-square (Larntz 1978) for tests of goodness of fit, and the likelihood-ratio test is superior to the difference of Pearson chi-squares (Haberman 1977) for tests of parameters in most situations, such as tests of parameters in a logistic model. Not too surprisingly, the Monte Carlo results favored the same choices for the jackknifed tests. For multinomial samples, the jackknifed tests do not realize any advantage over the preferred original test.

* Almost any form of complex design (except the benign forms of stratification discussed by Rao and Scott 1981) has a potentially severe effect on the chi-squared tests. Although the jackknifed tests require somewhat larger sample sizes for the asymptotic theory to approximate actual performance (discussed in detail in the program documentation), they otherwise offer protection against the effects of a complex design.

* The number of replicates required is well within computational capabilities, given current computer resources. The Monte Carlo results indicated only slight improvement in the use of 50 fractions over 20 for purposes of constructing the simple jackknife, or of 50 half-sample replicates over 20. The number of replicates for the stratified jackknife should be necessarily larger, depending on the number of strata. (Effectively, a degree of freedom in the estimated variance is lost for each stratum.) This range of 20-50 for practical application appears satisfactory regardless of table size (even when the number of cells is many times the number of replicates).

* For tests of overall fit, the asymptotic theory breaks down when the number of cells with no observations becomes more than a small fraction of the degrees of freedom. This happens sooner with half-sample methods than with the simple jackknife.

* For tests of the contribution of specific sets of parameters to a model (testing the relative fit of two nested models), the jackknifed likelihood-ratio test essentially requires only that the corresponding marginal tables be reasonably filled out. Thus there is a wide potential range of application to logistic and other causal modeling even when the table itself is sparse. Comparison of nested models is often more essential to the development of models for large cross-classifications than is testing of goodness of fit.

* When a single parameter of a model is to be tested with data from a complex sample design, the researcher is faced with a choice between the jackknifed likelihood-ratio test and a standardized value of the parameter based on an estimated standard error derived by standard replication theory. Not only are these alternatives asymptotically equivalent, however, but the Monte Carlo findings indicate extremely high agreement (both tests rejecting or both tests not rejecting), for example, on the order of 99%, for realistic sample sizes (e.g., only 100 observations) and numbers of replicates (e.g., 20 jackknife replicates). Thus the practical implications of this choice are generally minimal and would be virtually nil in applications involving a reasonably large number of observations.

* Although not emphasized in this article, the asymptotic theory for the jackknifed tests applies to a wider range of problems, such as latent structure models described by Goodman (1974) and Haberman (1974b), log-linear models with structural zeros, and the non-log-linear models proposed by Haberman (1974c), Goodman (1979, 1981), and Clogg (1982) for tables that include variables with ordered categories.
* Replication methods can be adapted to a number of complex sample designs, including consideration of multistage sampling, complex estimation procedures, and finite population corrections.

* One desirable feature of $X^2/\delta_\cdot$ is its ability to be computed in some cases without access to the original survey data. Whenever $X_J$ can be computed, however, it offers greater protection against rejection of the null hypothesis at a rate appreciably higher than the nominal level.

Like $X_J$, the test $X_S^2$ proposed by Rao and Scott essentially requires access to the original data. (The statistic can be computed from an estimated covariance matrix for the cells of the complete cross-classification, but such a matrix is rarely, if ever, available in practice for large tables without specific effort for this purpose.) Asymptotically, $X_S^2$ probably gives actual rejection rates over a range both narrower and closer to the nominal level than does the generally conservative $X_J$. It can be shown, however, that under some practical conditions, the actual (as opposed to asymptotic) performance of $X_S^2$ would be more conservative than $X_J$. (Further comments on this point are given in Fay 1983b.) Thus neither test is uniformly superior to the other, but both appear to be more complete solutions to the general problem than are previous alternatives. The formulation of $X_S^2$ also extends to other models beyond the log-linear model, as does $X_J$. For specialized applications, however, researchers may find $X_J$ easier to implement than $X_S^2$, since the former only requires algorithms to compute estimates under the model.

6. AN EXAMPLE

The U.S. Bureau of the Census contracted with Damans and Associates to conduct the Knowledge, Attitude, and Practice Survey (KAP) to measure the impact of the publicity campaign for the 1980 Census and public knowledge of the census. (This example is discussed in more detail by Fay 1983a, but it is summarized here because it illustrates some important features of complex designs and the behavior of the jackknifed tests.)
This survey included interviews at two distinct times or phases: one in early 1980, before the campaign had substantially begun, and the second in mid-March, a few weeks before Census Day (April 1, 1980), when the campaign had reached essentially full intensity. A basic specification for the design was that the survey obtain samples of approximately equal size for the black, Spanish, and remaining population ("other") to provide reliable estimates for these three groups.

The design of this survey was similar in many respects to other national household samples. The first step or stage of the design was to draw a sample of counties. Counties (townships in New England) were grouped into primary sampling units (PSU's) composed of one or more contiguous counties. To accommodate the objective of samples of equal size for the three groups, PSU's were assigned scores based on a linear combination of their black, Spanish, and other populations, with coefficients inversely proportional to their respective relative national population sizes. Because of their large scores, the PSU's that covered the cities of New York, Philadelphia, Chicago, Detroit, Miami, Houston, Washington, San Antonio, Los Angeles, and San Francisco were included with certainty, that is, without any random selection at the first stage. From all remaining PSU's in the country, a stratified sample of 40 non-self-representing PSU's was selected with probabilities proportional to their scores.

In both the 10 self-representing (certainty) and 40 non-self-representing PSU's, the households were selected through additional stages of sampling, including stratification and sample allocation according to estimates of the black and Spanish populations for census tracts within the sampled PSU's, again with the primary objective of selecting approximately equal samples for the three demographic groups. The ultimate units of the sample were clusters (segments) of neighboring households.
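The first-stage scoring described above can be sketched in a few lines of Python. The sketch scores each PSU as a linear combination of its three group populations, with coefficients inversely proportional to the national totals, and then assigns inclusion probabilities proportional to score, taking with certainty any unit whose probability would reach 1. All population figures, the names `psu_scores` and `inclusion_probabilities`, and the iterative certainty rule are assumptions made for illustration; the article does not give the exact coefficients or the selection algorithm used for the KAP survey.

```python
def psu_scores(psus, national_totals):
    """Score each PSU as sum over groups of (PSU population / national population).

    The coefficients 1 / national_totals[g] are inversely proportional to the
    national group sizes, so a small group counts as heavily as a large one.
    """
    return {
        name: sum(pop[g] / national_totals[g] for g in national_totals)
        for name, pop in psus.items()
    }


def inclusion_probabilities(scores, n_sample):
    """Probability-proportional-to-score inclusion probabilities.

    Units whose probability would be >= 1 are moved to a certainty set and
    the remaining sample size is reallocated (an assumed, common convention).
    Assumes n_sample is smaller than the number of PSUs.
    """
    certainty = set()
    while True:
        remaining = {k: s for k, s in scores.items() if k not in certainty}
        n_left = n_sample - len(certainty)
        total = sum(remaining.values())
        newly_certain = {k for k, s in remaining.items() if n_left * s / total >= 1.0}
        if not newly_certain:
            break
        certainty |= newly_certain
    probs = {k: 1.0 for k in certainty}
    probs.update({k: n_left * s / total for k, s in remaining.items()})
    return probs


# Hypothetical populations in thousands (not survey data).
national = {"black": 26000, "spanish": 14000, "other": 186000}
psus = {
    "PSU_A": {"black": 2000, "spanish": 1500, "other": 5000},
    "PSU_B": {"black": 100, "spanish": 50, "other": 3000},
    "PSU_C": {"black": 300, "spanish": 20, "other": 2000},
    "PSU_D": {"black": 50, "spanish": 400, "other": 1000},
}
scores = psu_scores(psus, national)
probs = inclusion_probabilities(scores, n_sample=2)
```

In this toy setting the large, diverse PSU_A is self-representing, and the remaining sample size is spread over the other three units in proportion to their scores, so the inclusion probabilities sum to the first-stage sample size.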
By design, segments for the second phase of interviewing just before Census Day were physically adjacent to the segments interviewed for the first phase. (Thus the sample for each phase was a probability sample, but the samples in the two phases cannot be considered statistically independent.) Because of the complexity of the design, the unconditional probabilities of inclusion in the sample for interviewed households varied substantially, and the reciprocals of these probabilities, modified by an adjustment for noninterviewed households, were employed to weight the responses for purposes of estimation.

In this application, the stratified jackknife was chosen to represent the effect of variability from the design. Non-self-representing PSU's were usually grouped by pairs into strata for the computation of the jackknife, but in a few instances, groupings of three or four were more convenient and consequently were used. Each of the self-representing PSU's was treated as a separate stratum, and the segments of sampled households were grouped into three approximately equivalent and independent subsamples, balanced as closely as possible according to the original stratification of segments within the PSU. Segments interviewed in the second phase were assigned to the same groups as the respective neighboring segments in the first phase, thus reflecting in the formation of the replicates this potentially important source of covariance.

This illustrative analysis will focus on the effect of race/ethnicity (R), income (I), and phase of interviewing (P) on whether respondents had recently heard of the census (H) (also termed "exposure" here).
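The replicate formation described above (pairs of PSU's as strata, three subsamples within each self-representing PSU) can be illustrated with a minimal Python sketch. It assumes the usual stratified jackknife convention: deleting group j from a stratum of J groups and reweighting the remaining groups by J/(J - 1), with a factor (J - 1)/J applied when squared perturbations are summed. The function names, the array layout, and the toy numbers are hypothetical, and the conventions shown are an assumption rather than a transcription of the CPLX implementation.

```python
import numpy as np


def jackknife_perturbations(group_totals):
    """Build full-sample estimates and jackknife perturbations.

    group_totals: list over strata; each element is an array of shape
    (J_i, n_cells) holding the weighted cell totals contributed by each
    replicate group of stratum i.
    Returns (Y, W, b) where W[i][j] is the change in the estimated table when
    group j of stratum i is dropped and the rest are reweighted by J/(J-1).
    """
    Y = sum(g.sum(axis=0) for g in group_totals)
    W, b = [], []
    for g in group_totals:
        J = g.shape[0]
        # (J/(J-1)) * (T - g_j) - T simplifies to (mean - g_j) * J/(J-1)
        W.append((g.mean(axis=0) - g) * J / (J - 1))
        b.append((J - 1) / J)
    return Y, W, b


def jackknife_covariance(W, b):
    """Covariance estimate: sum_i b_i * sum_j W_ij (outer) W_ij."""
    return sum(bi * Wi.T @ Wi for Wi, bi in zip(W, b))


# Toy data for a two-cell table: one stratum of paired PSU's and one
# self-representing PSU split into three subsamples (hypothetical numbers).
paired = np.array([[10.0, 4.0], [14.0, 6.0]])
self_rep = np.array([[5.0, 1.0], [7.0, 3.0], [6.0, 2.0]])

Y, W, b = jackknife_perturbations([paired, self_rep])
cov = jackknife_covariance(W, b)
```

Within each stratum the perturbations sum to zero, which is the property required of the replicates in Condition 2 of the Appendix. With these conventions the diagonal of `cov` equals J times the sample variance of the group totals, summed over strata.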
Table 2 shows the weighted survey estimates, in thousands. As with many estimates from national surveys, the entries shown are rounded to thousands for display as a matter of convenience and ease of interpretation; the calculations described here were also carried out in thousands but without rounding. (Multiplication of all of the survey estimates by a uniform factor has no effect on the values of the jackknifed tests.)

Table 2. Whether Respondents Had Recently Heard of the Census: By Race/Ethnicity, Household Income, and Phase, From the KAP Survey (weighted estimates in thousands)

                                          Recently Heard of Census
                                        Jan./Feb. 1980       March 1980
Race/Ethnicity   Household Income ($)    Yes       No        Yes       No
Black            22,000 or more          441      209      1,008       95
                 12,000-21,999         1,459    1,572      1,876      554
                 0-11,999              1,738    3,682      2,483    2,057
Spanish          22,000 or more          169       58        362        0
                 12,000-21,999           141      595        730      125
                 0-11,999                196    1,012        727      339
Other            22,000 or more        9,485    6,422     16,014    2,032
                 12,000-21,999         8,066   10,486     13,691    6,278
                 0-11,999              7,023   15,266     15,106    4,099

NOTE: Cases with incomplete data are excluded from the analysis.

Table 2 shows a general increase in exposure to the census (H) over time. A more subtle question is whether this change in exposure was uniform (on a logistic scale) for these groups or instead varied by group, possibly because of differences in the relative efficacy of the publicity campaign. An analysis of how exposure changed over time could consider the sequence of models in Table 3. The inclusion of the three-way interaction of income, race, and phase, [IRP], in each model makes each equivalent to a logistic model for H. The three-way interaction of exposure (H), income, and race, [HIR], in each model incorporates all interaction of income and race upon exposure, although without regard to phase.
Thus all remaining parameters pertain to change in exposure over time. (In Table 3, the chi-squared tests for these models are reported in thousands simply for convenience. Extraordinarily large chi-squared tests are the rule when surveys weighted to national totals are analyzed.)

The preferred jackknifed test of overall fit, $X_J$ (but also $G_J$), indicates that model 1, which posits an equal (logistic) change in exposure over time, apparently fits well. Further fitting, however, shows that allowance for a differential effect of race/ethnicity over time, [HRP], appears to improve the fit, whereas allowance for a differential effect of income, [HIP], instead leaves a significant lack of fit. Note, however, that the conclusions drawn by $X_J$ about models 2 and 3 (each with 6 df) are opposite to the order implied by the magnitudes of $X^2$. Thus any technique to evaluate the chi-squares of different models by an overall correction factor independent of the choice of model will disagree with one or both results from $X_J$. The statistic $X^2/\delta_{\cdot}$ of Rao and Scott (1981), which allows different adjustments depending on the choice of model (indeed, these researchers emphasize the importance of this consideration), cannot be readily applied here, since all three models do not admit closed-form solutions (Haberman 1974a; Goodman 1970).

Table 3. Tests of Hypotheses for the KAP Example

Model   Parameters             df   X^2 (in thousands)   G^2 (in thousands)    X_J     G_J
Tests of Goodness of Fit
1       [IRP], [HIR], [HP]      8        1,666.6              1,630.2           .30     .33
2       [IRP], [HIR], [HRP]     6        1,217.3              1,212.8           .03     .03
3       [IRP], [HIR], [HIP]     6          946.9                925.7          1.63    1.78
Tests of Parameters
1-2     [HRP]                   2          449.3                417.4          1.56    1.69
1-3     [HIP]                   2          719.8                704.5          -.23    -.23

NOTE: Analysis of cross-classification in Table 2 of whether respondents had recently heard of the census (H) by race/ethnicity (R), income (I), and phase (P).

The preferred test of the contribution of specific sets of parameters, $G_J$ (but $X_J$ as well), is consistent with the tests of overall fit, showing that the [HRP], but not the [HIP], parameters make a significant contribution to the model. Again, this is the opposite of the order suggested by the magnitude of the usual chi-squared tests.
Application of replication methods to the parameters of the model (Fay 1983a) indicates that the significance of the [HRP] parameters is driven by the contrast of Spanish with the other two groups (z = 2.95, showing greater increase in exposure for Spanish), whereas the income parameters are not significant, confirming the results from $G_J$.

This reversal in the relative magnitudes of the jackknifed tests and the original chi-squares in testing these two hypotheses is not surprising, however, since the black and Spanish populations were disproportionately sampled, thus giving comparisons by race/ethnicity (such as [HRP]) greater relative reliability than comparisons by income (such as [HIP]) compared to simple random sampling of the total population. Hence replication methods recognize this distinction and provide a sound analysis of the data (based on frequentist theory).

This example was selected from a large number of applications of these methods for the distinct reversal of the conclusions from the jackknifed tests compared to what might be inferred from the magnitude of the original chi-squares. The reversal in this instance may be attributed to the large variation in the survey weights. In the author's experience, for samples with equal weights, those hypotheses with relatively larger values for the simple chi-squared tests will be the ones rejected by the jackknifed tests as well. Thus alternative methods based on an overall adjustment to chi-square not depending on the specific model may be acceptable in some applications, particularly when access to the original data is impossible. Nonetheless the ability to perform correctly, even in relatively extreme situations, and the avoidance of unnecessary assumptions would seem to recommend the jackknifed tests over such alternatives in situations in which access to the original data permits the computation.
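Two of the numerical statements in this example can be checked directly from the published tables. The Python sketch below collapses the weighted estimates of Table 2 over income to give, for each race/ethnicity group, the share exposed in each phase and the change in the log-odds of exposure, and it recomputes the $G^2$ tests of parameters in Table 3 as differences between nested models. The collapsed log-odds are only a crude marginal summary and are not the fitted [HRP] parameters; the variable and function names are hypothetical.

```python
from math import log

# Weighted estimates in thousands, copied from Table 2. Each row is
# (race/ethnicity, income, Jan./Feb. yes, Jan./Feb. no, March yes, March no).
TABLE2 = [
    ("Black", "22,000 or more", 441, 209, 1008, 95),
    ("Black", "12,000-21,999", 1459, 1572, 1876, 554),
    ("Black", "0-11,999", 1738, 3682, 2483, 2057),
    ("Spanish", "22,000 or more", 169, 58, 362, 0),
    ("Spanish", "12,000-21,999", 141, 595, 730, 125),
    ("Spanish", "0-11,999", 196, 1012, 727, 339),
    ("Other", "22,000 or more", 9485, 6422, 16014, 2032),
    ("Other", "12,000-21,999", 8066, 10486, 13691, 6278),
    ("Other", "0-11,999", 7023, 15266, 15106, 4099),
]


def marginal_summary(race):
    """Collapse over income and summarize exposure by phase for one group."""
    rows = [r for r in TABLE2 if r[0] == race]
    yes1, no1 = sum(r[2] for r in rows), sum(r[3] for r in rows)
    yes2, no2 = sum(r[4] for r in rows), sum(r[5] for r in rows)
    return {
        "share_janfeb": yes1 / (yes1 + no1),
        "share_march": yes2 / (yes2 + no2),
        # Change in the log-odds of exposure between the two phases.
        "logit_change": log(yes2 / no2) - log(yes1 / no1),
    }


summary = {race: marginal_summary(race) for race in ("Black", "Spanish", "Other")}

# G^2 (in thousands) for models 1-3 of Table 3; the tests of parameters are
# differences between nested models.
G2 = {1: 1630.2, 2: 1212.8, 3: 925.7}
g2_hrp = round(G2[1] - G2[2], 1)  # model 1 versus model 2, [HRP]
g2_hip = round(G2[1] - G2[3], 1)  # model 1 versus model 3, [HIP]
```

The collapsed shares rise from roughly .40 to .66 for the black group, .23 to .80 for the Spanish group, and .43 to .78 for the other group, so the marginal change is largest for the Spanish group, in line with the direction of the contrast reported above. The two differences reproduce the $G^2$ entries for the tests of parameters in Table 3.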
APPENDIX: ASYMPTOTIC RESULTS

This appendix summarizes the results and methods of proof given elsewhere (Fay 1980) for the asymptotic results cited in this article. The class of estimators for which this theory holds is general, but all useful applications known to the author are to maximum likelihood estimators (or asymptotic equivalents under multinomial sampling) for a class of parametric models studied earlier by Rao (1965). This class comprises parametric representations for the expected value, $\pi(\theta)$, of a multinomial proportion satisfying the following (numbered according to Rao 1965):

Assumption 1.1. Given a $\delta > 0$, there is an $\epsilon > 0$ with

$$\inf_{\lvert \theta - \theta^0 \rvert > \delta} \left[ (\pi(\theta^0), \ln \pi(\theta^0)) - (\pi(\theta^0), \ln \pi(\theta)) \right] > \epsilon, \tag{A.1}$$

where $(\cdot, \cdot)$ denotes the standard inner product and ln operates on the components of a vector by taking their natural logarithms.

Assumption 2.3 (modified). The function $\pi$ is in $C^{(2)}$ in a neighborhood of $\theta^0$, where $C^{(2)}$ is the space of twice differentiable functions with a continuous second-order differential in a neighborhood of $\theta^0$.

Assumption 3. The information matrix $(i_{rs})$ is nonsingular at $\theta^0$, where

$$i_{rs} = \sum_i \frac{1}{\pi_i} \frac{\partial \pi_i}{\partial \theta_r} \frac{\partial \pi_i}{\partial \theta_s}. \tag{A.2}$$

Under weaker conditions, Rao showed the (asymptotic) existence and asymptotic efficiency of the maximum likelihood estimator. The results discussed here assume the three previous conditions and the following:

Assumption 4. $\pi(\theta^0)$ has a strictly positive expected value in each component.

Assumption 5. $\theta^0$ is an interior point of a parameter space in $q$-dimensional Euclidean space.

The class of models satisfying assumptions 1.1, 2.3, 3, 4, and 5 includes, of course, the log-linear models studied as examples in this article, where in each model a parameter was included to represent the overall sample total.

Haberman (1974b) discussed a related class of models arising from the observation of only a subset of the variables in a cross-classification whose expected values satisfy a log-linear model for the complete table. Goodman (1974) illustrated the application of these methods to latent structure analysis. Generally, these models also satisfy the preceding assumptions, except for a subset of the parameter space, where both assumptions 1.1 and 3 may fail. Section 5 mentions other classes of models that also satisfy the required conditions.

The statement of the following theorems encompasses both tests of fit and tests of specific parameters. (The case of overall fit is represented by a "saturated" model for $\pi$.) The first theorem considers the asymptotic distribution under the null hypothesis.

Theorem 1. Let $\pi(\theta)$ and $\pi'(\theta')$ be functions of $q$-dimensional $\theta$ and $q'$-dimensional $\theta'$ satisfying assumptions 1.1, 2.3, 3, 4, and 5. Suppose that the true $\theta^0$ and $\theta^{0\prime}$ are such that $\pi(\theta^0) = \pi'(\theta^{0\prime})$, and that in a neighborhood of $\theta^{0\prime}$, the image of $\pi'$ is contained in the image of $\pi$ in a neighborhood of $\theta^0$. (For nontrivial results, $q' < q$ is assumed.) Suppose that for each $n$, there are random variables $Y_n$ and ${}_nW^{(ij)}$, $i = 1, \ldots, I_n$, $j = 1, \ldots, {}_nJ_i$, satisfying the following three restrictions:

Condition 1. There exist constants $g_n$ and $h_n$ such that as $n \to \infty$,

$$h_n (Y_n - g_n \pi(\theta^0)) \xrightarrow{L} N(0, C), \tag{A.3}$$

for some covariance operator $C$, and

$$[\ldots] \tag{A.4}$$

[Typically, $E((e, Y_n)) = g_n$, where $e$ represents the vector whose elements are all 1.]

Condition 2. The ${}_nW^{(ij)}$'s satisfy

$$\sum_j {}_nW^{(ij)} = 0 \tag{A.5}$$

for each $i$ and $n$, and there are constants ${}_nf_i$ such that as $n \to \infty$,

$$\sum_i {}_nf_i \sum_j {}_nW^{(ij)} \otimes {}_nW^{(ij)} \to C \tag{A.6}$$

for the covariance operator, $C$, in (A.4), where $\otimes$ denotes the outer product.

Condition 3. As $n \to \infty$,

$$h_n \sup_i \sup_j \lVert {}_nW^{(ij)} \rVert \to 0. \tag{A.7}$$

Let $X^2$ denote the Pearson chi-squared test for the maximum likelihood estimator $\pi(\theta^*)$ of $\pi(\theta^0)$, and let $X^{2\prime}$ be the analogous test for the maximum likelihood estimator $\pi'(\theta^{*\prime})$ of $\pi'(\theta^{0\prime})$ (or asymptotically equivalent estimators). Define

$${}_nP_{ij} = X^{2\prime}(Y_n + {}_nW^{(ij)}) - X^2(Y_n + {}_nW^{(ij)}) - (X^{2\prime}(Y_n) - X^2(Y_n)), \tag{A.8}$$

$$K_n = \sum_i {}_nf_i \sum_j {}_nP_{ij}, \tag{A.9}$$

$$V_n = \sum_i {}_nf_i \sum_j {}_nP_{ij}^2. \tag{A.10}$$

Then as $n \to \infty$,

$$g_n K_n \to \sum_{m=1}^{q-q'} \delta_m, \tag{A.11}$$

and the joint distribution of $X^{2\prime}(Y_n) - X^2(Y_n)$ and of $V_n$ is characterized by

$$g_n h_n^2 (X^{2\prime}(Y_n) - X^2(Y_n)) \xrightarrow{L} \sum_{m=1}^{q-q'} \delta_m \chi^2_{(m)} \tag{A.12}$$

and

$$g_n^2 h_n^2 V_n \xrightarrow{L} 4 \sum_{m=1}^{q-q'} \delta_m^2 \chi^2_{(m)} \tag{A.13}$$

for independent chi-squared variates, $\chi^2_{(m)}$ ($m = 1, \ldots, q - q'$), and constants $\delta_m$ ($m = 1, \ldots, q - q'$), given by

$$\delta_m = ((v^{(m)}, D^{-1}(\pi(\theta^0)) C' v^{(m)})), \tag{A.14}$$

where $((\cdot, \cdot))$ is the inner product defined by

$$((x, y)) = (x, D^{-1}(\pi(\theta^0)) y) \tag{A.15}$$

for vectors $x$ and $y$, $D^{-1}(\pi(\theta^0))$ is the operator that transforms the elements of a vector $y = \{y_i\}$ by

$$(D^{-1}(\pi(\theta^0)) y)_i = (\pi_i(\theta^0))^{-1} y_i, \tag{A.16}$$

and $C'$ is the covariance operator

$$C' = C + (e, Ce)\, \pi \otimes \pi - Ce \otimes \pi - \pi \otimes Ce. \tag{A.17}$$

The vectors $v^{(m)}$, $m = 1, \ldots, q - q'$, are defined as an orthonormal basis with respect to the inner product (A.15) for $T(\pi(\theta^0)) - T'(\pi'(\theta^{0\prime}))$, the orthogonal complement of $T'(\pi'(\theta^{0\prime}))$ in $T(\pi(\theta^0))$ with respect to (A.15), where $T(\pi(\theta^0))$ is the tangent plane to $\pi$ at $\theta^0$ and $T'(\pi'(\theta^{0\prime}))$ is the tangent plane to $\pi'$ at $\theta^{0\prime}$. The $v^{(m)}$ are further restricted to satisfy

$$((v^{(m)}, D^{-1}(\pi(\theta^0)) C' v^{(t)})) = 0 \tag{A.18}$$

for $m \ne t$. The same results apply to the statistic $G_J$ formed by substitution of $G^2$ for $X^2$ in (A.8) and (A.12).

Comment on Proof. The development is through standard asymptotic methods. Analytic approximations for the first- and second-order differentials of $X^2$ and $G^2$ are first developed. The conditions of the theorem are shown to give (A.11) as a consequence of the second-order term in the Taylor series expansion for $X^2$ or $G^2$ [the contribution of the first-order term to (A.9) is removed by (A.5)]. The existence of $v^{(m)}$, satisfying (A.18), is a consequence of a standard theorem (Rao 1965). These vectors are used in proving the joint asymptotic distribution (A.12)-(A.13). The first-order term expansion of $X^2$ or $G^2$ about the observed $Y$ dominates the analytic approximation of (A.10). Although the $v^{(m)}$ are not necessarily unique, the relationship (A.14) uniquely determines the $\delta_m$, up to permutations of their order. Simple transformation of the results of this theorem leads to the characterization of $X_J$ and $G_J$ given in this article.

A second theorem establishes the consistency of $X_J$ or $G_J$ against a fixed alternative hypothesis.

Theorem 2. Suppose that the conditions of Theorem 1 are satisfied with the following modifications: The true expected proportions are given by $\pi(\theta^0)$, but there is no longer any $\theta^{0\prime}$ with $\pi'(\theta^{0\prime}) = \pi(\theta^0)$; and the maximum likelihood estimator $\pi'(\theta^{*\prime})$ of $\theta^{0\prime}$ at $\pi(\theta^0)$ has strictly positive elements. Then

$$g_n^{-1} \{X^{2\prime}(Y_n) - X^2(Y_n)\} \to \sum_i (\pi_i(\theta^0) - \pi'_i(\theta^{*\prime}))^2 / \pi'_i(\theta^{*\prime}) \tag{A.19}$$

and $g_n K_n$ and $V_n$ are both bounded in probability. The same results hold for $G^2$.

Comment on Proof. The method of proof is similar to that for Theorem 1. A more complete statement of the theorem (Fay 1980) gives explicit limits for $g_n K_n$ and $V_n$. Manipulation of the results of the theorem shows the divergence of $X_J$ or $G_J$ under the conditions of this second theorem. The material developed in the proofs may also be used to determine the conditions under which the tests are asymptotically efficient for nearby alternatives.

[Received February 1980. Revised June 1984.]

REFERENCES

Altham, P. A. E. (1976), "Discrete Variable Analysis for Individuals Grouped Into Families," Biometrika, 63, 263-269.
Bedrick, Edward J. (1983), "Adjusted Chi-Square Tests for Cross-Classified Tables of Survey Data," Biometrika, 70, 591-595.
Bishop, Yvonne M. M., Fienberg, Stephen E., and Holland, Paul W. (1975), Discrete Multivariate Analysis, Cambridge, MA: MIT Press.
Brier, Stephen S. (1980), "Analysis of Contingency Tables Under Cluster Sampling," Biometrika, 67, 591-596.
Chapman, D.
W. (1966), "An Approximate Test of Independence Based on Replications of a Complex Survey Design," unpublished master's thesis, Cornell University, Dept. of Statistics.
Clogg, Clifford C. (1982), "Some Models for the Analysis of Association in Multiway Cross-Classifications Having Ordered Categories," Journal of the American Statistical Association, 77, 803-815.
Cohen, J. E. (1976), "The Distribution of the Chi-Square Statistic Under Cluster Sampling From Contingency Tables," Journal of the American Statistical Association, 71, 665-670.
Efron, Bradley (1979), "Bootstrap Methods: Another Look at the Jackknife," Annals of Statistics, 7, 1-26.
Efron, Bradley (1982), The Jackknife, the Bootstrap, and Other Resampling Plans, Philadelphia, PA: Society for Industrial and Applied Mathematics.
Fay, Robert E. (1980), "On Jackknifing Chi-Square Tests Statistics-Part II: Asymptotic Theory," unpublished manuscript, U.S. Bureau of the Census.
Fay, Robert E. (1982), "Contingency Table Analysis for Complex Designs: CPLX," Proceedings of the Section on Survey Research Methods, American Statistical Association, pp. 44-53.
Fay, Robert E. (1983a), "CPLX-Contingency Table Analysis for Complex Sample Designs, Program Documentation," unpublished report, U.S. Bureau of the Census.
Fay, Robert E. (1983b), "Replication Approaches to the Log-Linear Analysis of Data From Complex Samples," paper presented at the seminar "Recent Developments in the Analysis of Large Scale Data Sets," Statistical Office of the European Communities, Luxembourg, November.
Fellegi, Ivan P. (1980), "Approximate Tests of Independence and Goodness of Fit Based on Stratified Multistage Samples," Journal of the American Statistical Association, 75, 261-268.
Fienberg, Stephen E. (1979), "The Use of Chi-Square Statistics for Categorical Data Problems," Journal of the Royal Statistical Society, Ser. B, 41, 54-64.
Fienberg, Stephen E. (1980), The Analysis of Cross-Classified Data, Cambridge, MA: MIT Press.
Freeman, M. F., and Tukey, J. W. (1950), "Transformations Related to the Angular and the Square Root," Annals of Mathematical Statistics, 21, 607-611.
Goodman, Leo A. (1970), "The Multivariate Analysis of Qualitative Data: Interactions Among Multiple Classifications," Journal of the American Statistical Association, 65, 226-256.
Goodman, Leo A. (1974), "The Analysis of Systems of Qualitative Variables When Some of the Variables Are Unobservable. Part I: A Modified Latent Structure Approach," American Journal of Sociology, 79, 1179-1259.
Goodman, Leo A. (1978), Analyzing Qualitative/Categorical Data, Cambridge, MA: Abt Associates.
Goodman, Leo A. (1979), "Simple Models for the Analysis of Association in Cross-Classifications Having Ordered Categories," Journal of the American Statistical Association, 74, 537-552.
Goodman, Leo A. (1981), "Association Models and Canonical Correlation in the Analysis of Cross-Classifications Having Ordered Categories," Journal of the American Statistical Association, 76, 320-334.
Grizzle, J. E., Starmer, C. F., and Koch, G. G. (1969), "Analysis of Categorical Data by Linear Models," Biometrics, 25, 489-504.
Haberman, Shelby J. (1974a), The Analysis of Frequency Data, Chicago: University of Chicago Press.
Haberman, Shelby J. (1974b), "Log-Linear Models for Frequency Tables Derived by Indirect Observation: Maximum Likelihood Equations," Annals of Statistics, 2, 911-924.
Haberman, Shelby J. (1974c), "Log-Linear Models for Frequency Tables With Ordered Classifications," Biometrics, 30, 589-600.
Haberman, Shelby J. (1977), "Log-Linear Models and Frequency Tables With Small Expected Cell Counts," Annals of Statistics, 5, 1148-1169.
Haberman, Shelby J. (1978), Analysis of Qualitative Data: Vol. 1: Introductory Topics, New York: Academic Press.
Haberman, Shelby J. (1979), Analysis of Qualitative Data: Vol. 2: New Developments, New York: Academic Press.
Hidiroglou, M. A., and Rao, J. N. K. (1981), "Chisquare Tests for the Analysis of Categorical Data From the Canada Health Survey," paper presented at the International Statistical Institute Meetings, Buenos Aires, December 5.
Hidiroglou, M. A., and Rao, J. N. K. (1983), "Chi-Square Tests for the Analysis of Three-Way Contingency Tables From the Canada Health Survey," unpublished Technical Report, Statistics Canada.
Holt, D., Scott, A. J., and Ewings, P. O. (1980), "Chi-Squared Tests With Survey Data," Journal of the Royal Statistical Society, Ser. A, 143, 302-320.
Koch, G. G., Freeman, D. H., Jr., and Freeman, J. L. (1975), "Strategies in the Multivariate Analysis of Data From Complex Surveys," International Statistical Review, 43, 59-78.
Larntz, Kinley (1978), "Small-Sample Comparisons of Exact Levels for Chi-Squared Goodness-of-Fit Statistics," Journal of the American Statistical Association, 73, 253-263.
McCarthy, Philip J. (1969), "Pseudo-Replication: Half-Samples," Review of the International Statistical Institute, 37, 239-264.
Nathan, Gad (1973), "Approximate Tests of Independence in Contingency Tables From Complex Stratified Sampling," in Vital and Health Statistics, Ser. 2, No. 53, Washington, DC: National Center for Health Statistics.
Rao, C. R. (1965), Linear Statistical Inference and Its Applications, New York: John Wiley.
Rao, J. N. K., and Scott, A. J. (1981), "The Analysis of Categorical Data From Complex Sample Surveys: Chi-Squared Tests for Goodness of Fit and Independence in Two-Way Tables," Journal of the American Statistical Association, 76, 221-230.
Rao, J. N. K., and Scott, A. J. (1984), "On Chi-Squared Tests for Multiway Contingency Tables With Cell Proportions Estimated From Survey Data," Annals of Statistics, 12, 46-60.
Satterthwaite, F. E. (1946), "An Approximate Distribution of Estimates of Variance Components," Biometrics, 2, 110-114.
Scott, A. J., and Rao, J. N. K. (1981), "Chi-Squared Tests for Contingency Tables With Proportions Estimated From Survey Data," in Current Topics in Survey Sampling, eds. D. Krewski, R. Platek, and J. N. K. Rao, New York: Academic Press.