Studying Measurement Invariance Using Confirmatory Factor Analysis

Measurement Invariance

Suppose we have a scale that is intended to measure a specified trait. The scale is made up of multiple items (or subscales, or multi-item parcels). In factor analytic terms, the items serve as indicators of the trait or factor in a common factor model. Suppose furthermore that we make use of this scale in samples from two or more distinct populations (e.g., genders, cultural groups, racial groups, age groups, occupational groups, etc.). We may wish to analyze resulting measures in a variety of ways. For instance, we may wish to test the difference between group means, or to analyze the relationship of the scale to other variables in each group. For any such use of scale scores, there is a critical assumption that the scale is measuring the same trait in all of the groups. If that assumption holds, then comparisons and analyses of those scores are acceptable and yield meaningful interpretations. But if that assumption is not true, then such comparisons and analyses do not yield meaningful results. This is the general issue of measurement invariance (MI).

Issues of MI are also often relevant in longitudinal research. When a scale is administered over repeated occasions to the same sample of people, the question of MI involves the issue of whether the scale is measuring the same construct at different occasions. This type of MI will be addressed later; for now, we examine MI over multiple groups.

Studying MI Using Confirmatory Factor Analysis

A central principle of MI is that measures across groups are considered to be on the same scale if relationships between the indicators and the trait are the same across groups. This statement can be translated into factor analytic terms: Given multiple items that make up a scale, if the loadings for those items on the single underlying factor are the same across groups, then measurement invariance is supported.
When framed in these factor analytic terms, this property is called factorial invariance, and represents one approach to the study of MI. As will be seen, invariance of factor loadings is only one aspect of MI. The various aspects of MI can be investigated using confirmatory factor analysis models. These models can be supplemented with a model for structured means so as to allow for the study of group differences in means on latent variables (LVs).

Review of Multi-Sample CFA Model with Structured Means

For a factor analysis model with nonzero means of measured variables (MVs) and LVs, we first specify the following data model:

$$x = \tau_x + \Lambda_x \xi + \delta$$

where $\tau_x$ represents a vector of intercept terms for the $x$ variables, $\Lambda_x$ is the factor loading matrix, $\xi$ is the vector of scores on the LVs, and $\delta$ is the vector of error of measurement terms corresponding to the $x$ variables. We also define $\kappa$ as the vector of means on the LVs. From this data model, we can derive the following covariance structure:

$$\Sigma_{xx} = \Lambda_x \Phi \Lambda_x' + \Theta_\delta$$

which expresses the population variances and covariances of the MVs as a function of the parameters in $\Lambda_x$, $\Phi$, and $\Theta_\delta$. We can also derive the following mean structure:

$$\mu_x = \tau_x + \Lambda_x \kappa$$

This model expresses the population means of the $x$ variables as a function of the parameters in $\tau_x$, $\Lambda_x$, and $\kappa$. The full model for covariances and means thus has 5 parameter matrices: $\Lambda_x$, $\Phi$, $\Theta_\delta$, $\tau_x$, and $\kappa$. A model is specified by designating fixed, free, and constrained parameters in these 5 matrices. This model can be fit to data (sample covariance matrix, $S$, and sample mean vector, $\bar{x}$) by obtaining estimates of the parameters such that the resulting implied population covariance matrix and mean vector ($\hat{\Sigma}$ and $\hat{\mu}$) are as similar as possible to their sample counterparts. The model can also be generalized to the case of multiple samples.
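As a small illustration of the covariance and mean structures above, the following Python sketch computes the implied covariance matrix and mean vector for a one-factor model. The function name `implied_moments` and all parameter values are hypothetical and chosen only for illustration; the sketch also assumes a single factor and a diagonal matrix of unique variances.

```python
def implied_moments(loadings, factor_var, unique_vars, intercepts, factor_mean):
    """Return (sigma, mu) implied by a one-factor model.

    sigma = lambda * phi * lambda' + theta   (theta assumed diagonal)
    mu    = tau + lambda * kappa
    """
    p = len(loadings)
    sigma = [
        [
            loadings[i] * factor_var * loadings[j]
            + (unique_vars[i] if i == j else 0.0)
            for j in range(p)
        ]
        for i in range(p)
    ]
    mu = [intercepts[i] + loadings[i] * factor_mean for i in range(p)]
    return sigma, mu


# Hypothetical parameter values for three indicators of one factor.
sigma, mu = implied_moments(
    loadings=[1.0, 0.8, 0.6],      # lambda (first loading fixed at 1 to set the scale)
    factor_var=0.5,                # phi
    unique_vars=[0.3, 0.4, 0.5],   # diagonal of theta
    intercepts=[2.0, 1.5, 1.0],    # tau
    factor_mean=0.3,               # kappa
)

for row in sigma:
    print([round(value, 3) for value in row])
print([round(value, 3) for value in mu])
```

Fitting the model amounts to choosing parameter values so that these implied moments come as close as possible to the sample covariance matrix and sample mean vector.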
The covariance structure is generalized to

$$\Sigma_{xx}^{(g)} = \Lambda_x^{(g)} \Phi^{(g)} \Lambda_x^{(g)\prime} + \Theta_\delta^{(g)}$$

and the mean structure is generalized to

$$\mu_x^{(g)} = \tau_x^{(g)} + \Lambda_x^{(g)} \kappa^{(g)}$$

where $g$ denotes the $g$th of K populations. The model is specified in terms of the parameter matrices for each group, possibly including equality constraints on selected parameters across groups. The model is fit to the multiple samples simultaneously.

Model Specification for Investigating MI

The notion of MI usually is raised with reference to a single scale and the question of whether it measures the same trait in different groups. This question can be studied using the multi-sample CFA model with structured means. In the simple MI case, there would be only one factor, and the indicators of that factor would be the scale items (or subscales, parcels, etc.). In the following developments, it should be kept in mind that all of these methods are equally applicable to the case where there is more than one factor and the question is whether the same LVs (plural) are being measured across groups.

The multi-sample CFA model with structured means can be used to investigate MI and to test for group differences in LV means. This is achieved by testing a sequence of models, beginning with an unconstrained model and progressively introducing equality constraints on parameters. It is possible to define an a priori sequence of models to achieve this objective. To see this, consider the case where the indicators do measure the same LV across groups. What does this case imply with regard to invariance in model parameters across groups? First, it implies that the loadings in $\Lambda_x$ will be invariant, since those loadings represent the relationships of the indicators to the underlying trait. Does this level of invariance imply that any other parameters must be invariant across groups? The answer to this question is "no."
To understand this, consider the data model:

$$x = \tau_x + \Lambda_x \xi + \delta$$

If this model holds for every individual in each population, then the factor loadings in $\Lambda_x$ are the same for every individual. In this case, we have simple MI and $\Lambda_x$ will be invariant across groups. But $\tau_x$ need not be invariant, since those intercepts do not involve the property of MI. Considering the covariance structure given by

$$\Sigma_{xx}^{(g)} = \Lambda_x^{(g)} \Phi^{(g)} \Lambda_x^{(g)\prime} + \Theta_\delta^{(g)}$$

it would not be necessary for $\Phi$ or for $\Theta_\delta$ to be invariant across groups. These matrices contain variances and covariances of exogenous LVs and of errors of measurement, respectively. Even if we have MI, meaning we are measuring the same LV in each group, these variances and covariances could be different from group to group.

Thus, a simple form of MI requires factor loadings to be invariant across groups, but allows all other parameters to vary from group to group. Without constraints on factor loadings, we do not achieve this level of invariance. With additional constraints, we achieve stronger forms of invariance. We can specify a sequence of models accordingly, to represent different degrees of MI, beginning with the least constrained and progressing to the most constrained model. The sequence of models is defined so as to provide substantively useful information. At each step in the sequence we are interested in (a) the fit of that model, and (b) the degree of decrement in fit compared to the previous model. At any step, if the model fits well and the decrement in fit is not significant, then the constraints imposed at that step are deemed plausible in the population. However, if at any step the decrement in fit caused by the constraints introduced at that step is significant, then those constraints are deemed implausible.

The material below is adapted from several sources, including Reise, Widaman, and Pugh (1993), Meredith (1993), Widaman and Reise (1997), and Vandenberg and Lance (2000).
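The test of the decrement in fit between two nested models in the sequence can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the standard library has no chi-square distribution, so the upper-tail probability is built from a standard recurrence for integer degrees of freedom, and the numbers plugged in are the configural and weak invariance fit statistics reported in the example later in this document.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting values: df = 1 for odd df, df = 2 for even df.
    if df % 2 == 1:
        nu, q = 1, math.erfc(math.sqrt(half))
    else:
        nu, q = 2, math.exp(-half)
    # Recurrence: Q(x | nu + 2) = Q(x | nu) + (x/2)^(nu/2) * exp(-x/2) / Gamma(nu/2 + 1)
    while nu < df:
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q


def chi2_difference(chi2_restricted, df_restricted, chi2_baseline, df_baseline):
    """Return (difference in chi-square, difference in df, p-value)."""
    d_chi2 = chi2_restricted - chi2_baseline
    d_df = df_restricted - df_baseline
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)


# Weak invariance model compared with the configural baseline.
d_chi2, d_df, p = chi2_difference(90.10, 14, 75.30, 10)
print(f"chi-square difference = {d_chi2:.2f}, df = {d_df}, p = {p:.4f}")
```

A small p-value indicates that the constraints added at that step significantly worsen fit; a large one indicates that they are plausible.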
Invariance of Population Covariance Matrices

It is useful as a first step to test the null hypothesis that the population covariance matrices are equal. Failure to reject this model implies that equality of population covariance matrices is plausible, which in turn implies that equality of parameters of the covariance structure model (factor loadings, unique variances, and factor variances and covariances) is plausible. In such a case, there is no need to carry out further tests of invariance of these parameters. On the other hand, if invariance of population covariance matrices is rejected, then one can begin a series of tests to determine whether some, but not all, of the model parameters may be invariant.

Configural Invariance (also called pattern invariance)

Configural invariance is defined as the same pattern of fixed and free factor loadings (and other parameters) across groups, but no equality constraints. This model implies that similar, but not identical, LVs are present in each of the groups. This model is represented by the following set of equations:

Covariance Structure: $\Sigma_{xx}^{(g)} = \Lambda_x^{(g)} \Phi^{(g)} \Lambda_x^{(g)\prime} + \Theta_\delta^{(g)}$

Mean Structure: $\mu_x^{(g)} = \tau_x^{(g)} + \Lambda_x^{(g)} \kappa^{(g)}$

According to this structure, group differences in variances, covariances, and means may be attributable to group differences in elements of all 5 parameter matrices. In practice it is not possible to fit this model to data with all elements of $\tau_x^{(g)}$ and $\kappa^{(g)}$ free in each group, because these parameters would not all be identified. The usual way around this problem in assessing configural invariance is to omit the mean structure from the model and test the model using only the covariance structure. There is no real loss in taking this approach, since mean differences on LVs across groups would not be of interest without invariance in factor loadings. In such a specification, one would need to set a scale for the LVs in the usual way. Beyond that, the matrices $\Lambda_x^{(g)}$, $\Phi^{(g)}$, and $\Theta_\delta^{(g)}$ would be specified so as to have the same pattern of fixed and free parameters across groups, but no equality constraints would be imposed.

The model of configural invariance serves as a useful baseline model to which we can compare more restrictive models.

Weak Factorial Invariance

Under weak factorial invariance, the factor loadings are constrained to be equal across groups, but no other equality constraints are imposed. This model implies that the same LVs are being measured across groups. The model can be represented as follows:

Covariance Structure: $\Sigma_{xx}^{(g)} = \Lambda_x \Phi^{(g)} \Lambda_x' + \Theta_\delta^{(g)}$

Mean Structure: $\mu_x^{(g)} = \tau_x^{(g)} + \Lambda_x \kappa^{(g)}$

Note that the superscript $g$ has been dropped from $\Lambda_x$, indicating that $\Lambda_x$ does not vary across groups. Again, the parameters in $\tau_x^{(g)}$ and $\kappa^{(g)}$ cannot all be estimated without further constraints. As a result, the weak factorial invariance model is typically evaluated using only the covariance structure. Considering that covariance structure, it can be seen that under weak factorial invariance, differences in covariances among MVs (off-diagonal elements in $\Sigma_{xx}^{(g)}$) are attributable to differences among covariances of LVs (off-diagonal elements in $\Phi^{(g)}$). That is, differences across groups in relationships among MVs are attributable to differences across groups in relationships among LVs.

Strong Factorial Invariance

Strong factorial invariance is tested by specifying factor loadings in $\Lambda_x^{(g)}$ to be invariant across groups, and also specifying intercepts in $\tau_x^{(g)}$ to be invariant across groups. This condition is also called "scalar invariance." The resulting model is then as follows:

Covariance Structure: $\Sigma_{xx}^{(g)} = \Lambda_x \Phi^{(g)} \Lambda_x' + \Theta_\delta^{(g)}$

Mean Structure: $\mu_x^{(g)} = \tau_x + \Lambda_x \kappa^{(g)}$

As with weak factorial invariance, this specification implies that the measurement of the LVs is the same across groups.
Furthermore, the invariance of the intercepts in the mean structure allows us to evaluate mean differences in LVs across groups. The form of the mean structure implies that any differences in means on the MVs are attributable to differences in means on the LVs. Although it is not possible to estimate factor means in $\kappa^{(g)}$ in each group, the differences among those means will be identified. Thus, to assess mean differences, we must simply fix the vector $\kappa^{(g)}$ in one group, and allow elements of $\kappa^{(g)}$ to be free in the remaining groups. This is usually achieved by setting elements of $\kappa^{(g)}$ to zero in one group. Resulting parameter estimates for $\kappa$ in the remaining groups then reflect differences between factor means in that group and the reference group.

Thus, under strong factorial invariance, group differences in covariances among MVs and in means of MVs are attributable to group differences in covariances and means on LVs.

Strict Factorial Invariance

Strict factorial invariance extends the previous model by invoking the additional constraint that unique variances in $\Theta_\delta^{(g)}$ are invariant across groups. The resulting model is as follows:

Covariance Structure: $\Sigma_{xx}^{(g)} = \Lambda_x \Phi^{(g)} \Lambda_x' + \Theta_\delta$

Mean Structure: $\mu_x^{(g)} = \tau_x + \Lambda_x \kappa^{(g)}$

The key difference between strict and strong factorial invariance involves how the variances of the MVs (diagonal elements of $\Sigma_{xx}^{(g)}$) are accounted for. Under strong factorial invariance, group differences in those variances could be attributable to group differences in variances of LVs (in diagonals of $\Phi^{(g)}$) as well as to group differences in error variances (in diagonals of $\Theta_\delta^{(g)}$). Under strict factorial invariance, group differences in variances of MVs are attributable only to group differences in variances of LVs, since error variances are invariant across groups.

Strict factorial invariance is a highly constrained model and may often not hold in practice.
In fact, there is reason to expect that it would not hold, even if strong factorial invariance does hold. Even if all populations come from a common parent population with given error variances, it would be expected that error variances would vary from one subpopulation to another.

Additional Models

Two additional sets of constraints could be of interest:

1) Invariant variances and covariances of LVs. In this model, strict factorial invariance would be extended by constraining $\Phi$ to be invariant across groups. A test of the difference between this model and the strict model would indicate whether such a constraint is plausible. If so, the implication is that the entire covariance structure is invariant, meaning that the population covariance matrix is invariant across groups and has the same parametric structure.

2) Invariant means of LVs. This model is an extension of strict factorial invariance and is specified by constraining $\kappa$ to be invariant across groups. A test of the difference between this model and the strict model would indicate whether this constraint is plausible. If so, the implication is that the entire mean structure is invariant, meaning that population means of the MVs and LVs are invariant across groups. However, if the difference between this model and the strict model is significant, then that is evidence that the population means on the LVs are significantly different across groups.

What To Do When Factorial Invariance Is Not Supported

If an investigator carries out a study of factorial invariance and finds no support for this property, what should he or she do?

If configural invariance is not supported, this is a serious problem. In that case, the evidence argues against even similar factor patterns across groups. One must either accept the fact that different LVs are being measured in different groups, or attempt to identify and rectify the reason for the lack of configural invariance. One approach to addressing the problem would be to back up and reconsider the selection of indicators. One can examine results and carry out more detailed analyses to determine whether configural invariance would hold for a subset or different set of indicators.

If configural invariance is supported, but weak factorial invariance is not supported, one can again either accept that fact and attempt to assess its implications, or try to address it. One way to address it is to identify and delete problematic indicators, so that weak factorial invariance would be supported using a subset of the indicators. This approach is problematic, however, because it can alter the nature of a standardized scale and is also susceptible to capitalization on chance characteristics of the observed samples. Another approach is to make use of the notion of "partial measurement invariance," wherein equality constraints are imposed on some but not all of the factor loadings (see Byrne, Shavelson, and Muthen, 1989). If partial measurement invariance is supported, however, it is highly debatable whether subsequent comparisons of LV means across groups are meaningful. In addition, a finding of partial measurement invariance could be highly dependent on chance characteristics of the observed samples.

Example

The study of factorial invariance is illustrated using an example from Reise, Widaman, and Pugh (1993). This example involves a 5-item scale of negative affect administered to samples of American and Chinese students. The items are adjectives: nervous, worried, jittery, tense, distressed. Subjects responded to each item on a 5-point Likert scale. Sample sizes were N=540 American students and N=598 Chinese students. Sample data are provided in Table 1 of the Reise et al. paper. (Note: There are two errors in this table: First, the entries listed as correlations are actually covariances.
Second, the entry for the relationship between worried and tense for the Minnesota sample should be .77 rather than .74.)

Following is the data file rwp1.dat for fitting a covariance structure (no mean structure):

    NERV WORRD JITTRY TENSE DSTRSD
    .865
    .29 1.232
    .30 .25 1.020
    .39 .31 .20 1.082
    .40 .51 .42 .52 1.254
    NERV WORRD JITTRY TENSE DSTRSD
    1.254
    .79 1.488
    .66 .55 1.188
    .73 .77 .63 1.416
    .66 .84 .57 .92 1.563

Following are command files for fitting various models of measurement invariance.

Command file for test of invariant population covariance matrices:

    LISREL MODEL FOR INVARIANT SIGMA -- NANJING SAMPLE
    DA NG=2 NI=5 NO=598
    LA FI=RWP1.DAT
    CM FI=RWP1.DAT
    MO NX=5 NK=5 LX=ID TD=ZE PH=SY,FR
    OU
    LISREL MODEL FOR INVARIANT SIGMA -- MINNESOTA SAMPLE
    DA NI=5 NO=540
    LA FI=RWP1.DAT
    CM FI=RWP1.DAT
    MO PH=IN
    OU

Command file rwpm1.spl for configural invariance:

    LISREL MODEL FOR CONFIGURAL INVARIANCE -- NANJING SAMPLE
    DA NG=2 NI=5 NO=598
    LA FI=RWP1.DAT
    CM FI=RWP1.DAT
    MO NX=5 NK=1 LX=FR
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM
    LISREL MODEL FOR CONFIGURAL INVARIANCE -- MINNESOTA SAMPLE
    DA NI=5 NO=540
    LA FI=RWP1.DAT
    CM FI=RWP1.DAT
    MO
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM

Command file rwpm2.spl for weak factorial invariance:

    LISREL MODEL FOR WEAK FACTORIAL INVARIANCE -- NANJING SAMPLE
    DA NG=2 NI=5 NO=598
    LA FI=RWP1.DAT
    CM FI=RWP1.DAT
    MO NX=5 NK=1 LX=FR
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM
    LISREL MODEL FOR WEAK FACTORIAL INVARIANCE -- MINNESOTA SAMPLE
    DA NI=5 NO=540
    LA FI=RWP1.DAT
    CM FI=RWP1.DAT
    MO LX=IN
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM

Data file rwp2.dat for fitting both covariance and mean structures:

    NERV WORRD JITTRY TENSE DSTRSD
    .865
    .29 1.232
    .30 .25 1.020
    .39 .31 .20 1.082
    .40 .51 .42 .52 1.254
    1.89 2.09 1.60 2.15 1.93
    NERV WORRD JITTRY TENSE DSTRSD
    1.254
    .79 1.488
    .66 .55 1.188
    .73 .77 .63 1.416
    .66 .84 .57 .92 1.563
    2.17 2.52 2.01 2.35 2.29

Command file rwpm3.spl for strong factorial invariance:

    LISREL MODEL FOR STRONG FACTORIAL INVARIANCE -- NANJING SAMPLE
    DA NG=2 NI=5 NO=598
    LA FI=RWP2.DAT
    CM FI=RWP2.DAT
    ME FI=RWP2.DAT
    MO NX=5 NK=1 LX=FR TX=FR KA=FI
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM
    LISREL MODEL FOR STRONG FACTORIAL INVARIANCE -- MINNESOTA SAMPLE
    DA NI=5 NO=540
    LA FI=RWP2.DAT
    CM FI=RWP2.DAT
    ME FI=RWP2.DAT
    MO LX=IN TX=IN KA=FR
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM

Command file rwpm4.spl for strict factorial invariance:

    LISREL MODEL FOR STRICT FACTORIAL INVARIANCE -- NANJING SAMPLE
    DA NG=2 NI=5 NO=598
    LA FI=RWP2.DAT
    CM FI=RWP2.DAT
    ME FI=RWP2.DAT
    MO NX=5 NK=1 LX=FR TX=FR KA=FI
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM
    LISREL MODEL FOR STRICT FACTORIAL INVARIANCE -- MINNESOTA SAMPLE
    DA NI=5 NO=540
    LA FI=RWP2.DAT
    CM FI=RWP2.DAT
    ME FI=RWP2.DAT
    MO LX=IN TX=IN KA=FR TD=IN
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM

Command file rwpm5.spl for strict factorial invariance as well as invariant factor variance:

    LISREL MODEL FOR STRICT FACTORIAL INVARIANCE -- NANJING SAMPLE
    FACTOR VARIANCE INVARIANT
    DA NG=2 NI=5 NO=598
    LA FI=RWP2.DAT
    CM FI=RWP2.DAT
    ME FI=RWP2.DAT
    MO NX=5 NK=1 LX=FR TX=FR KA=FI
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM
    LISREL MODEL FOR STRICT FACTORIAL INVARIANCE -- MINNESOTA SAMPLE
    FACTOR VARIANCE INVARIANT
    DA NI=5 NO=540
    LA FI=RWP2.DAT
    CM FI=RWP2.DAT
    ME FI=RWP2.DAT
    MO LX=IN TX=IN KA=FR TD=IN PH=IN
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM

Command file rwpm6.spl for strict factorial invariance as well as invariant factor mean:

    LISREL MODEL FOR STRICT FACTORIAL INVARIANCE -- NANJING SAMPLE
    FACTOR MEAN INVARIANT
    DA NG=2 NI=5 NO=598
    LA FI=RWP2.DAT
    CM FI=RWP2.DAT
    ME FI=RWP2.DAT
    MO NX=5 NK=1 LX=FR TX=FR KA=FI
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM
    LISREL MODEL FOR STRICT FACTORIAL INVARIANCE -- MINNESOTA SAMPLE
    FACTOR MEAN INVARIANT
    DA NI=5 NO=540
    LA FI=RWP2.DAT
    CM FI=RWP2.DAT
    ME FI=RWP2.DAT
    MO LX=IN TX=IN TD=IN KA=IN
    LK
    NEGAFF
    FI LX 1
    VA 1 LX 1
    OU XM

Summary of Results

Invariant population covariance matrices: $\chi^2 = 87.28$, $df = 15$, RMSEA = .091

This model fits very poorly, indicating that population covariance matrices are almost certainly not invariant, which
implies that at least some of the covariance structure parameters are different between groups. We then proceed to investigate hypotheses about different types of measurement invariance.

Configural invariance: $\chi^2 = 75.30$, $df = 10$, RMSEA = .11, NNFI = .91

For purposes of example, consider configural invariance to be plausible and proceed to test weak factorial invariance: $\chi^2 = 90.10$, $df = 14$, RMSEA = .095, NNFI = .93

Fit of this model is marginal. Comparison to the configural invariance model yields $\chi^2 = 14.80$, $df = 4$, $p < .01$. But NNFI is better than for the previous model.

Proceed to test of strong factorial invariance: $\chi^2 = 111.21$, $df = 18$, RMSEA = .093, NNFI = .93

Under this model we can compare groups with respect to factor variances and means. For variances, $\hat{\phi}_{1} = .33$ (.03) for the Chinese sample and .64 (.06) for the American sample, indicating higher variance on the negative affect LV for the American sample. For means, $\hat{\kappa} = .00$ for the Chinese sample (fixed), and .31 (.05) for the American sample, indicating a higher mean on the negative affect LV for the American sample.

Proceed to test of strict factorial invariance: $\chi^2 = 122.51$, $df = 23$, RMSEA = .086, NNFI = .94

Fit of this model is marginal. The decrement in fit from the previous model is tested by $\chi^2 = 11.30$, $df = 5$, $p < .05$, indicating statistical rejection of the constraint that error variances in the MVs are invariant across groups.

For the sake of the example, test two further models. In the present case, these were tested by introducing additional constraints to the model of strict invariance, but they could have also been specified by introducing these constraints to the model of strong invariance.

Strict invariance, plus invariant factor variances: $\chi^2 = 158.92$, $df = 24$, RMSEA = .098, NNFI = .93

This constraint causes a significant decrease in fit ($\chi^2 = 36.41$, $df = 1$, $p < .001$), meaning that this constraint is not plausible in the population. This finding is not surprising given our earlier observation of different factor variances in the two groups.

Strict invariance, plus invariant factor means: $\chi^2 = 164.48$, $df = 24$, RMSEA = .100, NNFI = .92

This constraint also causes a significant decrease in fit ($\chi^2 = 41.97$, $df = 1$, $p < .001$), meaning that this constraint is not plausible in the population. This finding is also not surprising given our earlier observation of different factor means in the two groups.

Measurement Invariance over Time

Suppose that a multi-item scale intended to measure a particular construct is administered to the same sample of individuals on repeated occasions. The investigator may be interested in assessing aspects of change on the construct, or in relationships of the construct to other variables at various occasions. Obtaining valid information about such questions requires that the same construct is being measured at each occasion. This is MI defined across occasions rather than across different groups of people.

To investigate MI over time, one must conceive of modeling a population covariance matrix that is of order $pT \times pT$, where $p$ is the number of measured variables (items) and $T$ is the number of occasions. This covariance matrix $\Sigma$ can be seen as being organized into blocks of order $p \times p$. Diagonal blocks $\Sigma_{11}, \Sigma_{22}, \ldots, \Sigma_{TT}$ represent within-occasion covariance matrices, and off-diagonal blocks $\Sigma_{21}$, $\Sigma_{31}$, etc., represent between-occasion covariance matrices. Questions of MI refer to the factor structure of the diagonal blocks, but the model represents the structure of the full matrix.

MI models can be defined and tested for this context much as in the multigroup context. Analyses, however, are single-sample analyses, with constraints applied across occasions rather than across groups. A typical sequence of models in this context might proceed through tests of configural, weak, strong, and strict invariance.
The question of invariance of factor means is often of central interest.

References

Byrne, B. M., Shavelson, R. J., & Muthen, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105, 456-466.

Meredith, W. (1993). Measurement invariance, factor analysis, and factorial invariance. Psychometrika, 58, 525-543.

Reise, S. P., Widaman, K. F., & Pugh, R. H. (1993). Confirmatory factor analysis and item response theory: Two approaches for exploring measurement invariance. Psychological Bulletin, 114, 552-566.

Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3, 4-70.

Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of psychological instruments: Applications in the substance use domain. In K. J. Bryant, M. Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 281-324). Washington, DC: APA.