Analysis of structural equation models based on a mixture of continuous and ordinal random variables in the case of complex survey data

Stephen du Toit
Scientific Software International

Summary

This paper deals with the use of design weights to fit structural equation models (SEMs) to a mixture of continuous and ordinal manifest variables, with or without missing values, and with optional specification of stratum and/or cluster variables. It also deals with robust standard error estimation and the adjustment of the chi-square goodness-of-fit statistic. Results based on real data are presented.

Research was supported by SBIR grant 5R44AA014999-03 from NIAAA to Scientific Software International.

1 Introduction

We assume that the population from which the sample data are obtained can be stratified into $H$ strata. Within each stratum $h$, $n_h$ clusters or primary sampling units (PSUs) are drawn, and within the $k$-th cluster of the $h$-th stratum, $n_{hk}$ ultimate sampling units (USUs) are drawn with design weights $w_{hkl}$, where $l$ denotes the $l$-th USU within the $k$-th cluster, which in turn is nested within stratum $h$.

In the subsequent sections we briefly discuss parameter estimation and the Taylor linearization method employed in LISREL to produce robust standard error estimates under single stage sampling.
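2 A reparameterization of the general LISREL model

\[
y_i = \upsilon + \Lambda(\alpha + u_i) + \varepsilon_i
\]

where

\[
\upsilon = \begin{pmatrix} \tau_y \\ \tau_x \end{pmatrix}, \qquad
\Lambda = \begin{pmatrix} \Lambda_y (I - B)^{-1} & 0 \\ 0 & \Lambda_x \end{pmatrix}, \qquad
\alpha = \begin{pmatrix} \alpha_y + \Gamma\kappa \\ \kappa \end{pmatrix}, \qquad
u_i = \begin{pmatrix} w_i + \Gamma v_i \\ v_i \end{pmatrix},
\]

\[
\mathrm{Cov}(u_i) = \Psi = \begin{pmatrix} \Psi_y + \Gamma\Phi\Gamma' & \Gamma\Phi \\ \Phi\Gamma' & \Phi \end{pmatrix}, \qquad
\varepsilon_i = \begin{pmatrix} \varepsilon_y \\ \varepsilon_\delta \end{pmatrix}, \qquad
\mathrm{Cov}(\varepsilon_i) = \Theta = \begin{pmatrix} \Theta_\varepsilon & \Theta_{\varepsilon\delta} \\ \Theta_{\delta\varepsilon} & \Theta_\delta \end{pmatrix}.
\]

From the definitions above it follows that:

• $\Lambda_y$ of the general LISREL model is part of the first $p \times m$ submatrix of $\Lambda$
• $\Lambda_x$ of the general LISREL model is the final $q \times n$ submatrix of $\Lambda$
• $B$ of the general LISREL model is an $m \times m$ matrix with zero diagonal elements and is part of the first $p \times m$ submatrix of $\Lambda$
• $\Gamma$ of the general LISREL model is an $m \times n$ matrix that enters both $\alpha$ and $\Psi$
• $\tau_y$ of the general LISREL model consists of the first $p$ elements of $\upsilon$
• $\tau_x$ of the general LISREL model consists of the final $q$ elements of $\upsilon$
• $\alpha_y$ of the general LISREL model is part of the first $m$ elements of $\alpha$
• $\kappa$ of the general LISREL model consists of the last $n$ elements of $\alpha$
• $\Psi_y$ of the general LISREL model is part of the first $m \times m$ submatrix of $\Psi$
• $\Phi$ of the general LISREL model is the final $n \times n$ submatrix of $\Psi$
• $\Theta_\varepsilon$ of the general LISREL model is the first $p \times p$ submatrix of $\Theta$
• $\Theta_\delta$ of the general LISREL model is the final $q \times q$ submatrix of $\Theta$
• $\Theta_{\varepsilon\delta} = \Theta_{\delta\varepsilon}'$ of the general LISREL model is the off-diagonal $p \times q$ submatrix of $\Theta$

3 Mixture of ordinal and continuous dependent variables

Let

\[
y_i = \begin{pmatrix} y_{0i} : p_i \times 1 \\ y_{Ni} : q_i \times 1 \end{pmatrix}, \qquad i = 1, 2, \ldots, N,
\]

where $N$ denotes the number of cases and where the $(p_i + q_i) \times 1$ vector $y_i$ of manifest variables is partitioned into a $p_i \times 1$ vector $y_{0i}$ of ordinal and a $q_i \times 1$ vector $y_{Ni}$ of continuous manifest variables. It is further assumed that the SEM has $m$ latent variables $u_i$, where $u_1, u_2, \ldots, u_N$ are i.i.d. $N(0, \Psi)$.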
The likelihood function for case $i$ is evaluated as

\[
f(y_i) = \int_{u_i} f(y_{0i}, y_{Ni}, u_i)\, du_i = \int_{u_i} f(y_{0i}, y_{Ni} \mid u_i)\, g(u_i)\, du_i .
\]

Under the assumption of conditional independence, it follows that

\[
f(y_i) = \int_{u_i} \prod_{j=1}^{p_i} f(y_{0ij} \mid u_i) \cdot \prod_{j=1}^{q_i} f(y_{Nij} \mid u_i)\, g(u_i)\, du_i .
\]

Hence

\[
f(y_i) = \int_{u_i} \exp\left\{ \sum_{j=1}^{p_i} \ln f(y_{0ij} \mid u_i) + \sum_{j=1}^{q_i} \ln f(y_{Nij} \mid u_i) + \ln g(u_i) \right\} du_i .
\]

In general, a closed-form solution to this integral does not exist. To evaluate integrals of the type described above, we use a direct implementation of Gauss-Hermite quadrature. With this rule, an integral of the form

\[
I = \int f(t) \exp\left[-t^2\right] dt
\]

is approximated by the sum

\[
I \approx \sum_{u=1}^{Q} w_u f(z_u),
\]

where $w_u$ and $z_u$ are weights and nodes of the Hermite polynomial of degree $Q$.

Adaptive quadrature generally requires fewer points and weights than non-adaptive quadrature to yield equally accurate estimates of the model parameters and standard errors. The reason is that the adaptive procedure uses the empirical Bayes means and covariances, updated at each iteration, to shift and scale the quadrature locations of each case (subject) so that they are placed under the peak of the corresponding integrand.

$f(y_{0ij} \mid u_i)$: Ordinal variables

Suppose that $y_{0ij}$ has $C$ categories. Then

\[
P(y_{0ij} = c) = P(y_{0ij} \le c) - P(y_{0ij} \le c - 1), \qquad c = 1, 2, \ldots, C - 1,
\]

where $P(y_{0ij} \le 0) = 0$ and $P(y_{0ij} = C) = 1 - P(y_{0ij} \le C - 1)$.

In LISREL the logit, probit, log-log and complementary log-log link functions are available. For the logistic link function, for example,

\[
P(y_{0ij} \le c) = \frac{1}{1 + \exp(-\eta_{ij})},
\]

where $\eta_{ij} = \tau_{ic} + \lambda_{ij}'(\alpha + u_i)$.

In the case of ordinal variables it is assumed that the corresponding elements of $\upsilon$ and $\Theta$ are fixed so that the means are zero and the variances are one. The parameter $\tau_{ic}$ is the so-called threshold parameter.
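The quadrature rule and the logistic link above can be illustrated with a small numerical sketch. It is not LISREL's implementation: it assumes a single ordinal item, one latent variable $u \sim N(0, \psi)$, and plain (non-adaptive) Gauss-Hermite quadrature. The function names `category_probs` and `marginal_probs` and all numerical values are hypothetical, chosen only for illustration.

```python
import numpy as np

def category_probs(u, taus, lam, alpha=0.0):
    """P(y = c | u), c = 1..C, under the logistic link (illustrative helper).

    taus holds the C - 1 increasing thresholds; the cumulative probabilities
    are padded with P(y <= 0) = 0 and P(y <= C) = 1 before differencing.
    """
    eta = np.asarray(taus, dtype=float) + lam * (alpha + u)
    cum = 1.0 / (1.0 + np.exp(-eta))
    cum = np.concatenate(([0.0], cum, [1.0]))
    return np.diff(cum)

def marginal_probs(taus, lam, psi, alpha=0.0, n_points=12):
    """Integrate P(y = c | u) over u ~ N(0, psi) with Gauss-Hermite quadrature."""
    # Nodes z and weights w for integrals of the form  f(t) exp(-t^2) dt.
    z, w = np.polynomial.hermite.hermgauss(n_points)
    # Change of variable u = sqrt(2 psi) t maps the normal density onto exp(-t^2).
    u = np.sqrt(2.0 * psi) * z
    probs = np.array([category_probs(ui, taus, lam, alpha) for ui in u])
    return (w[:, None] * probs).sum(axis=0) / np.sqrt(np.pi)

# Hypothetical item with C = 4 categories (three thresholds).
p = marginal_probs(taus=[-1.0, 0.5, 2.0], lam=1.2, psi=1.0)
```

Because the quadrature weights sum to $\sqrt{\pi}$, the marginal category probabilities in `p` sum to one. An adaptive variant would recentre and rescale the nodes `u` for each case; that step is omitted here.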
$f(y_{Nij} \mid u_i)$: Normal variables

\[
f(y_{Nij} \mid u_i) = (2\pi\theta_{jj})^{-1/2} \exp\left\{ -\frac{1}{2\theta_{jj}} (y_{Nij} - \mu_{ij})^2 \right\},
\]

where $\mu_{ij} = \upsilon_j + \lambda_{ij}'(\alpha + u_i)$. The parameter $\upsilon_j$ is the so-called intercept parameter and $\theta_{jj}$ the residual variance.

Special case

$y_{0i}$ depends only on $u_{0i} : m_0 \times 1$
$y_{Ni}$ depends only on $u_{Ni} : m_N \times 1$

where

\[
u_i : (m \times 1) = \begin{pmatrix} u_{0i} \\ u_{Ni} \end{pmatrix} \quad \text{and} \quad m = m_0 + m_N .
\]

\[
f(y_{0i}, y_{Ni}) = \int_{u_0} \int_{u_N} f(y_{0i}, y_{Ni}, u_{0i}, u_{Ni})\, du_{Ni}\, du_{0i}
= \int_{u_0} \left\{ \int_{u_N} \prod_{j=1}^{q_i} f(y_{Nij} \mid u_N) \cdot g(u_N \mid u_0)\, du_N \right\} \times \prod_{j=1}^{p_i} f(y_{0ij} \mid u_0) \cdot g(u_0)\, du_0 .
\]

Hence

\[
f(y_i) = \int_{u_0} \exp\left\{ \ln f(y_{Ni} \mid u_0) + \sum_{j=1}^{p_i} \ln f(y_{0ij} \mid u_0) + \ln g(u_0) \right\} du_0 .
\]

$\ln f(y_{Ni} \mid u_0)$

Since $y_{Ni} \sim N(\mu_i, \Sigma_i)$, where $y_{Ni} = \upsilon + \Lambda_{Ni}\alpha + \Lambda_{Ni} u_{Ni} + \varepsilon_{Ni}$, hence $\mu_i = \upsilon + \Lambda_{Ni}\alpha$, $\Sigma_i = \Lambda_{Ni}\Psi_{22}\Lambda_{Ni}' + D_\theta$, and $u_{0i} \sim N(\alpha, \Psi_{11})$, it follows from well-known results for normal conditional distributions that

\[
y_{Ni} \mid u_{0i} \sim N(\mu_{y \cdot 0}, \Sigma_{y \cdot 0}),
\]

where

\[
\mu_{y \cdot 0} = \upsilon + \Lambda_{Ni}\alpha + \Lambda_{Ni}\Psi_{21}\Psi_{11}^{-1}(u_{0i} - \alpha),
\]

\[
\Sigma_{y \cdot 0} = \Sigma_i - \Lambda_{Ni}\Psi_{21}\Psi_{11}^{-1}\Psi_{12}\Lambda_{Ni}' .
\]

Therefore

\[
\ln f(y_{Ni} \mid u_0) = -\frac{q_i}{2} \ln 2\pi - \frac{1}{2} \ln \left| \Sigma_{y \cdot 0} \right| - \frac{1}{2} (y_{Ni} - \mu_{y \cdot 0})' \Sigma_{y \cdot 0}^{-1} (y_{Ni} - \mu_{y \cdot 0}) .
\]

4 Parameter estimation

Parameter estimation is relatively straightforward and can be summarized by the following two steps.

Step 1
Calculate the natural logarithm of the likelihood function, $\ln L$, where

\[
\ln L = \sum_{h=1}^{H} \sum_{k=1}^{n_h} \sum_{l=1}^{n_{hk}} w_{hkl} \ln f(y_{hkl} \mid \gamma) \qquad (1)
\]

Step 2
Obtain an estimate $\hat{\gamma}$ of $\gamma$ by solving the set of simultaneous equations

\[
\left. \frac{\partial \ln L}{\partial \gamma} \right|_{\gamma = \hat{\gamma}} = 0 \qquad (2)
\]

In general, no closed-form solution to the set of equations (2) exists, and therefore parameter estimates are obtained iteratively using the Fisher scoring algorithm:

\[
\hat{\gamma}^{(t+1)} = \hat{\gamma}^{(t)} + I_n^{-1}\left(\hat{\gamma}^{(t)}\right) g\left(\hat{\gamma}^{(t)}\right) \qquad (3)
\]

where $\hat{\gamma}^{(t)}$ denotes the parameter values at iteration $t$, $t = 1, 2, \ldots$, $g(\cdot)$ denotes the gradient vector, and $I_n(\cdot)$ denotes the information matrix.
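The two steps and update (3) can be sketched on a much simpler model than the SEM likelihood. The example below is an assumption-laden stand-in: it replaces $f(y \mid \gamma)$ with a binary logistic regression, for which the weighted gradient and information matrix have closed forms, and it uses simulated data and weights. The function name `fisher_scoring_logit` is hypothetical.

```python
import numpy as np

def fisher_scoring_logit(X, y, w, tol=1e-6, max_iter=50):
    """Maximize the weighted log-likelihood sum_i w_i ln f(y_i | gamma).

    Stand-in model (an assumption, not the SEM likelihood): binary logit
    with design weights w_i entering as in equation (1).
    """
    gamma = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ gamma))
        g = X.T @ (w * (y - p))                          # gradient of weighted ln L
        info = X.T @ (X * (w * p * (1.0 - p))[:, None])  # information matrix
        step = np.linalg.solve(info, g)                  # I_n^{-1} g, as in (3)
        gamma = gamma + step
        if np.max(np.abs(step)) < tol:                   # stop when updates are tiny
            break
    return gamma

# Simulated data and weights, for illustration only.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(0.3 + 0.8 * X[:, 1])))).astype(float)
w = rng.uniform(0.5, 2.0, size=n)
gamma_hat = fisher_scoring_logit(X, y, w)
```

At convergence the weighted gradient evaluated at `gamma_hat` is numerically zero, which is the condition stated in (2). For the SEM itself the gradient and information matrix would additionally involve the quadrature approximation of each case's likelihood.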
Let $C = \partial \gamma^{*\prime} / \partial \gamma$, where $\gamma$ collects the parameters of the general LISREL model and $\gamma^*$ those of the reparameterized model. In symbolic form,

\[
C =
\begin{array}{l|ccccc}
\gamma \backslash \gamma^{*\prime} & [\mathrm{vec}\,\Lambda]' & \alpha' & [\mathrm{vecs}\,\Psi]' & [\mathrm{vecs}\,\Theta]' & \upsilon' \\
\hline
\mathrm{vec}\,\Lambda_y & C_{11} & 0 & 0 & 0 & 0 \\
\mathrm{vec}\,\Lambda_x & C_{21} & 0 & 0 & 0 & 0 \\
\mathrm{vec}\,B & C_{31} & 0 & 0 & 0 & 0 \\
\alpha_y & 0 & C_{42} & 0 & 0 & 0 \\
\kappa & 0 & C_{52} & 0 & 0 & 0 \\
\mathrm{vec}(\Gamma) & 0 & C_{62} & C_{63} & 0 & 0 \\
\mathrm{vecs}(\Psi_y) & 0 & 0 & C_{73} & 0 & 0 \\
\mathrm{vecs}(\Phi) & 0 & 0 & C_{83} & 0 & 0 \\
\mathrm{vecs}(\Theta_\varepsilon) & 0 & 0 & 0 & C_{94} & 0 \\
\mathrm{vecs}(\Theta_\delta) & 0 & 0 & 0 & C_{10,4} & 0 \\
\tau_y & 0 & 0 & 0 & 0 & C_{11,5} \\
\tau_x & 0 & 0 & 0 & 0 & C_{12,5}
\end{array}
\]

Using the chain rule for matrix differentiation, it follows that

\[
g(\gamma) = C \frac{\partial \ln L}{\partial \gamma^*} \qquad (4)
\]

and

\[
I_n(\gamma) = -E\left[ C \frac{\partial^2 \ln L}{\partial \gamma^* \partial \gamma^{*\prime}} C' \right] \qquad (5)
\]

Iterations are continued until

\[
\left| \hat{\gamma}_i^{(t+1)} - \hat{\gamma}_i^{(t)} \right| < \varepsilon \quad \forall\, i = 1, 2, \ldots, \mathit{nfree},
\]

where $\varepsilon$ is a small scalar value, e.g. $10^{-6}$.

5 Approximate covariance matrix of estimators

An approximate expression for the asymptotic covariance matrix of $\hat{\gamma}$ is given by

\[
\mathrm{Cov}(\hat{\gamma}) \approx I_n^{-1}(\gamma)\, G\, I_n^{-1}(\gamma) .
\]

Using results derived by Binder (1983) and Fuller (1975), it follows that, under single stage sampling with replacement (WR) or without replacement (WOR),

\[
G = \sum_{h=1}^{H} \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} (t_{hi\cdot} - t_{h\cdot\cdot})(t_{hi\cdot} - t_{h\cdot\cdot})'
\]

where

• $n_h = \sum_{j=1}^{n_{hi}} m_{hij}$, with $m_{hij}$ the number of cases within stratum $h$, cluster $i$, and USU $j$.
• $f_h = n_h / N_h$, the sampling rate for stratum $h$.
• $t_{hij} = g_{hij}(\hat{\gamma})$, where $g_{hij}(\hat{\gamma})$ is the $hij$-th contribution to the gradient vector $g(\gamma)$.
• $t_{hi\cdot} = \sum_{j=1}^{m_{hi}} t_{hij}$
• $t_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} t_{hi\cdot}$

In practice, we assume a zero contribution to $G$ for strata that contain a single PSU (cluster). Additionally, if there is no variable to define clusters, the observations within each stratum are treated as the primary sampling units.

6 Adjustment to the chi-square goodness of fit statistics

Simulation studies indicated that the $\chi^2_{LR}$-statistic, based on the difference between two deviance statistics, in general yields a rejection rate that is too high.
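The robust covariance matrix of Section 5, on which the adjustment in this section relies, can be sketched as follows. This is a minimal illustration under stated assumptions, not LISREL's code: each row of `scores` is taken to be an already weighted case-level contribution to the gradient, cluster labels are assumed unique within a stratum, and the names `linearization_G` and `sandwich_cov` as well as the toy numbers are hypothetical.

```python
import numpy as np

def linearization_G(scores, strata, clusters, f=None):
    """G = sum_h n_h (1 - f_h) / (n_h - 1) * sum_i (t_hi. - t_h..)(t_hi. - t_h..)'.

    scores   : (cases x parameters) gradient contributions (assumed weighted)
    strata   : stratum label per case
    clusters : PSU label per case
    f        : optional dict of sampling rates f_h; None means f_h = 0 (WR)
    """
    scores = np.asarray(scores, dtype=float)
    strata = np.asarray(strata)
    clusters = np.asarray(clusters)
    G = np.zeros((scores.shape[1], scores.shape[1]))
    for h in np.unique(strata):
        in_h = strata == h
        psus = np.unique(clusters[in_h])
        n_h = len(psus)
        if n_h < 2:
            continue  # a stratum with a single PSU contributes zero
        # PSU totals t_hi. and their deviations from the stratum mean t_h..
        totals = np.array([scores[in_h & (clusters == c)].sum(axis=0) for c in psus])
        dev = totals - totals.mean(axis=0)
        f_h = 0.0 if f is None else f[h]
        G += (n_h * (1.0 - f_h) / (n_h - 1)) * (dev.T @ dev)
    return G

def sandwich_cov(info, G):
    """Cov(gamma_hat) ~ I_n^{-1} G I_n^{-1}."""
    info_inv = np.linalg.inv(info)
    return info_inv @ G @ info_inv

# Toy data: stratum "A" has two PSUs, stratum "B" has one (so it is skipped).
scores = [[1.0], [2.0], [-1.0], [5.0]]
strata = ["A", "A", "A", "B"]
clusters = [1, 1, 2, 3]
G = linearization_G(scores, strata, clusters)
cov = sandwich_cov(np.array([[4.0]]), G)
```

In the toy data the PSU totals in stratum "A" are 3 and -1, so the deviations from their mean are 2 and -2 and the stratum contributes 2 × 8 = 16 to `G`. When no cluster variable is available, passing a unique label per case as `clusters` treats each observation as its own PSU, as described in Section 5.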
Let

\[
d_1 = \mathrm{tr}\left( I_n(\hat{\gamma}_1)\, \mathrm{Cov}(\hat{\gamma}_1) \right), \qquad
d_2 = \mathrm{tr}\left( I_n(\hat{\gamma}_2)\, \mathrm{Cov}(\hat{\gamma}_2) \right),
\]

where $I_n(\hat{\gamma}_s)$, $s = 1, 2$, denotes the information matrix under $H_1$ and $H_2$ respectively. A correction to the $\chi^2$-statistic for testing the difference in two deviance statistics is given by

\[
\chi^2_{robust} = c \times \chi^2_{LR},
\]

where, for $\mathit{nfree}_2 > \mathit{nfree}_1$,

\[
c = \frac{\mathit{nfree}_2 - \mathit{nfree}_1}{\left| d_1 - d_2 \right|},
\]

and where $\mathit{nfree}_1$ and $\mathit{nfree}_2$ respectively denote the total number of parameters to be estimated under the $H_1$ and $H_2$ models.

7 Example: Confirmatory Factor Analysis Model

The data set forms part of the data library of the Alcohol and Drug Services Study (ADSS). The ADSS is a national study of substance abuse treatment facilities and clients. Background data and data on the substance abuse of a sample of 1752 clients were obtained. The sample was stratified by census region, and within each of the four census regions a sample was obtained for each of three facility treatment types.

The following variables included in the PSF were selected from the survey data:

o CENREG: This variable indicates the census region and has four categories, these being "Northeast", "Midwest", "South", and "West" respectively.
o FACTYPE: The facility treatment type has four categories, too, representing facilities with "residential treatment", "outpatient methadone treatment", "outpatient non-methadone treatment", and "more than one type of treatment" respectively.
o COCEU: An indicator variable with value "1" if the respondent has ever used cocaine, and "0" otherwise.
o MAREU: An indicator variable with value "1" if the respondent has ever used marijuana, and "0" otherwise.
o DEPR: This indicator variable is coded "1" if the respondent is depressed, and "0" otherwise.
o EDU: A categorical variable representing the respondent's level of education at admission. It has 5 categories, these being (from 1 to 5) "less than 8 years", "8 – 11 years or less than High School graduate", "High School graduate / GED", "some college", and "college graduate / postgraduate".
o JAILR: This indicator variable indicates whether the respondent had a prison or jail record prior to admission.
o NUMTE: A count variable, indicating the total number of treatment episodes prior to admission.

From the main menu bar, select the Data, Survey Design … option. Add the survey design variables to the appropriate text boxes.

The SIMPLIS syntax file is shown below. The only difference between this and the usual SIMPLIS syntax is the addition of the paragraph

    $ADAPQ(12) CLL

Other options are PROBIT, LOGLOG and LOGIT.

A portion of the LISREL output is shown next.

Fit statistics and threshold values are shown below.

8 Example: Confirmatory Factor Analysis Model with latent variable relationship and latent variable means

Mean Model

Alternative Parameterization

9 Example: Confirmatory Factor Analysis Model with a mixture of ordinal and continuous variables

Further Reading

An, A.B. (2003). Performing Logistic Regression on Survey Data with the New SURVEYLOGISTIC. Paper 258-27 presented at SUGI 27, held on April 14-17, 2003, Orlando, Florida.
Binder, D.A. (1983). On the Variances of Asymptotically Normal Estimators from Complex Surveys. International Statistical Review, 51, 279-292.
Binder, D.A., & Patak, Z. (1994). Use of estimating functions for individual estimation from complex surveys. Journal of the American Statistical Association, 83, 1035-1043.
Chambers, R., & Skinner, C.J. (Eds.) (2003). Analysis of Survey Data. New York: Wiley.
Feder, M., Nathan, G., & Pfefferman, D. (2000). Multilevel Modeling of Complex Survey Longitudinal Data with Time Varying Random Effects. Survey Methodology, 26(1), 53-65, Statistics Canada.
Fuller, W.A. (1975). Regression Analysis for Sample Survey. Sankhya, Series C, 37, 117-132.
Graubard & Korn (1993). Hypothesis testing with complex survey data. Journal of the American Statistical Association, 88, 629-641.
Horvitz, D.G., & Thompson, D.J. (1952). A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47, 663-685.
Jöreskog, K.G., & Sörbom, D. (2004). LISREL 8.70 for Windows [Computer Software]. Lincolnwood, IL: Scientific Software International, Inc.
Jöreskog, K.G., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36(3), 347-387.
Kish, L. (1965). Survey Sampling. New York: John Wiley.
Kish, L., & Frankel, M.R. (1974). Inference from Complex Samples. Journal of Royal Statistical Society, B(36), 1-37.
Muthen, B.O. (1984). A general structural equation model with dichotomous, ordinal, categorical, and continuous latent variable indicators. Psychometrika, 49, 115-132.
Muthen, B.O., & Satorra, A. (1995). Complex sample data in structural equation modeling. In P. Marsden (Ed.), Sociological Methodology, 216-316.
Nathan, G. (1988). Inference Based on Data from Complex Sample Designs. In P.R. Krishnaiah & C.R. Rao (Eds.), Handbook of Statistics, 247-266. Amsterdam: Elsevier.
Rao, J.N.K. (1975). Unbiased variance estimation for multistage designs. Sankhya, C(37), 133-139.
Rao, J.N.K. (1994). Estimation of totals and distributing functions using auxiliary information at the estimation stage. J. Official Statist., 10, 153-165.
Rao, J.N.K., & Scott, A.J. (1981). The Analysis of Categorical Data from Complex Sample Surveys: Chi-Squared Tests for Goodness of Fit and Independence in Two-Way Tables. Journal of the American Statistical Association, 76, 221-230.
Rao, J.N.K., Scott, A.J., & Skinner, C.J. (1998). Quasi-score test with survey data. Statistica Sinica, 8, 1059-1070.
Sarndal, C.E., Swensson, B., & Wretman, J. (1992). Model assisted survey sampling. New York: Springer.
SAS Institute, Inc. (2004). SAS/STAT®: User's Guide. Cary, NC: SAS Institute, Inc.
Satorra, A., & Bentler, P.M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C.C. Clogg (Eds.), Latent variable Analysis in Developmental Research, 285-305. Thousand Oaks, CA: Sage Publications.
Shapiro, G.M., & Bateman, D.V. (1978). A better alternative to the collapsed stratum variance estimate. Proceedings of the Survey Research Methods, American Statistical Association, 451-456.
Skinner, C.J., Holt, D., & Smith, T.M.F. (1989). Analysis of Complex Surveys. Chichester: Wiley.
Skinner, C.J. (1989). Domain means, regression and multivariate analysis. In C.J. Skinner, D. Holt & T.M.F. Smith (Eds.), Analysis of Complex Surveys, 59-87. Wiley.
Smith, T., & Holmes, D. (1989). Multivariate Analysis. In C.J. Skinner, D. Holt & T.M.F. Smith (Eds.), Analysis of Complex Surveys, 165-190. Wiley.
SUDAAN User Manual Release 8.0, Second Edition. (2002). Research Triangle Institute.
Traat, I., Meister, K., & Sostra, K. (2001). Statistical inference in sampling theory. Theory of stochastic processes, 7(23), no. 1-2, 301-316.
Wolter, K.M. (1985). Introduction to Variance Estimation. New York: Springer-Verlag.
Yates, F., & Grundy, P.M. (1953). Selection without replacement from within strata with probability proportional to size. Journal of the Royal Statistical Society, B(15), 253-261.