Editorial

Factor analysis: A primer for asthma researchers

Kristin A. Riekert, PhD, and Michelle Eakin, PhD
Baltimore, Md

In the past few years, factor analysis (FA) has been increasingly used to examine the relationship between variables known to be associated with asthma health status, including spirometry, atopy, inflammation, symptom frequency, quality of life, β-agonist use, and exacerbations. FA is a multistep analytic process that seeks to define the underlying latent constructs among a larger set of observed variables. Variables that correlate with one another but are distinct from other variables are combined into factors that are assumed to reflect the same underlying disease process. Thus, a primary advantage of FA is that it reduces a large number of disease features to a smaller, more manageable number of independent and analyzable factors. FA does not have any a priori hypotheses about the data structure. As such, the specific variables included in the model strongly affect the output of the FA and the interpretation of which latent constructs are represented. Moreover, there are multiple valid options for conducting FA, and there are no tests of significance. As a result, FA depends on the researchers' judgment about (1) the consistency of results using different statistical approaches and rules of thumb, (2) the interpretability of the output, and (3) comparability with previous research. Herein lies the challenge of FA.

Because FA is based on the correlation between variables, one of the most important decisions is selecting which variables to include in the model. Ideally the researcher has access to the universe of possible variables from which to choose those that are interrelated and make sense on the basis of known empirical data or hypothesized mechanisms.
Unfortunately, for most asthma studies, the FA is a secondary analysis of an existing dataset, and the variables available may not include relevant domains, or domains present may not be measured thoroughly. For example, Holt et al1 did not have quality-of-life data, and their symptom data were based on diary cards. This contrasts with the survey approach used by Schatz et al,2 who in turn did not have the measures of atopy or inflammation that were included to a varying degree in other studies (Table I). This is important because failure to measure an important factor may distort the apparent relationships among the measured factors.6

The complexity of the variables included in the analysis can also affect the results. In asthma, this is a significant limitation because many of the constructs of interest (eg, spirometry, atopy, patient-reported outcomes) are confounded by measurement metrics (eg, mL vs a 5-point scale) and levels of patient burden (1-time blood test vs daily diary completion). Thus, subtle differences in factors between published studies may merely reflect differences in variable selection and measures and not inconsistencies in the true underlying constructs.

From the Division of Pulmonary and Critical Care Medicine, Johns Hopkins School of Medicine.
Supported in part by HL075344 from the National Heart, Lung, and Blood Institute.
Disclosure of potential conflict of interest: The authors have declared that they have no conflict of interest.
Received for publication March 24, 2008; accepted for publication March 25, 2008.
Reprint requests: Kristin Riekert, PhD, Assistant Professor, Division of Pulmonary and Critical Care Medicine, Johns Hopkins University, 5501 Hopkins Bayview Circle, JHAAC Rm 4B.74, Baltimore, MD 21224. E-mail: [email protected].
J Allergy Clin Immunol 2008;121:1181-3.
0091-6749/$34.00
© 2008 American Academy of Allergy, Asthma & Immunology
doi:10.1016/j.jaci.2008.03.023
The 2 most common methods for extracting factors in FA are principal FA (PFA) and principal components analysis (PCA). The only difference between these methods is the type of variance included in the analysis. PCA uses the total variance of all the variables, whereas PFA uses only the variance that each variable shares with other variables and excludes each variable's unique and error variance. Holt et al1 conducted a PFA, whereas most other asthma studies have used PCA (Table I). PCA and PFA typically yield equivalent results, particularly if the sample size is large and there are numerous variables. However, PFA can result in factors with negative eigenvalues, which are artifacts of the methodology that can affect how many factors are retained and should not be included in further analyses or interpretation.

After factors have been extracted, researchers must use their judgment to determine how many factors should be retained. There are several criteria available, and typically multiple methods are examined to determine whether the most scientifically meaningful factors are being retained. The 2 most common methods for selecting the number of factors are the Kaiser criterion (ie, only factors that have eigenvalues greater than 1 are retained) and the scree test (which identifies the number of factors by plotting the latent roots against the number of factors in order of extraction [ie, a scree plot] and examining the slope).6 Holt et al1 used an unconventional method for factor selection that uses the mean of the eigenvalues as the cutoff point for factor retention. They appear to have erroneously included negative eigenvalues in the mean calculation, which lowers the cutoff and thereby inflates the number of factors retained. Using the more traditional approaches to factor selection, the data provided by the authors indicate that 3 factors would be retained by using the Kaiser criterion and 3 to 5 factors by using the scree test.
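To make the retention rules above concrete, the following is a minimal sketch in Python. The eigenvalues are hypothetical and are not the values reported by Holt et al; the function names are likewise illustrative. It compares the Kaiser criterion with a mean-eigenvalue cutoff computed with and without the negative eigenvalues that PFA can produce.

```python
from statistics import mean


def kaiser_retained(eigenvalues, cutoff=1.0):
    """Count factors whose eigenvalue exceeds the Kaiser cutoff (default 1)."""
    return sum(1 for ev in eigenvalues if ev > cutoff)


def mean_rule_retained(eigenvalues, drop_negative):
    """Count factors whose eigenvalue exceeds the mean eigenvalue.

    If drop_negative is True, negative eigenvalues are excluded
    before the mean (the cutoff) is computed.
    """
    if drop_negative:
        pool = [ev for ev in eigenvalues if ev > 0]
    else:
        pool = list(eigenvalues)
    threshold = mean(pool)
    return sum(1 for ev in eigenvalues if ev > threshold)


# Hypothetical PFA eigenvalues (assumption for illustration only).
eigenvalues = [2.9, 1.6, 1.1, 0.7, 0.5, 0.2, -0.1, -0.3, -0.4, -0.6]

n_kaiser = kaiser_retained(eigenvalues)
n_mean_with_negatives = mean_rule_retained(eigenvalues, drop_negative=False)
n_mean_without_negatives = mean_rule_retained(eigenvalues, drop_negative=True)

print("Kaiser criterion:", n_kaiser)
print("Mean cutoff, negatives included:", n_mean_with_negatives)
print("Mean cutoff, negatives excluded:", n_mean_without_negatives)
```

With these made-up values, including the negative eigenvalues pulls the mean down, so more factors clear the cutoff than when the negatives are excluded. The size of the effect in any real analysis depends on the actual eigenvalues.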
Holt et al1 did not comment on whether they also examined the interpretability of a 3-factor or 4-factor solution. Scientific judgment of the practical significance and interpretability of the results is one of the most important steps in determining which factors to retain.

After completing the PFA or PCA, the results are likely to be difficult to interpret. Rotation is therefore used to improve the interpretability of the solution; it does not change the quality of the mathematical fit. There are 2 types of rotation: orthogonal (which results in uncorrelated factors) and oblique (which allows factors to be correlated). Orthogonal rotation provides factors that are easier to interpret and describe succinctly in a manuscript, but it is extremely rare for factors to be truly independent and uncorrelated. In practice, both methods tend to result in equivalent patterns. Researchers often conduct both types of rotation and, if the results are similar, present the orthogonal results, as did Holt et al.1

TABLE I. Factor analytic procedures and criteria used in studies evaluating the components of asthma health status

| Article | Constructs included in FA | Extraction strategy | Rotation | Selection of no. of factors | Criterion used for factor loading | Test of robustness |
| --- | --- | --- | --- | --- | --- | --- |
| Holt et al1 (this issue) | Atopy (biological); β-Agonist use (diary); Exacerbations (diary); Inflammation (biological); Spirometry; Symptoms: asthma (diary) | PFA | Orthogonal; Oblique | EV above mean of all EVs | .40 | Time |
| Pillai et al (2008)3 | Atopy (biological and survey); Spirometry; Symptoms: asthma (survey); Symptoms: rhinitis (survey) | NR | Oblique | Kaiser criterion; Scree test | .45 | Age; Sex; Race; Siblings |
| Schatz et al (2005)2 | Exacerbations (survey); Quality of life (survey); Symptoms: asthma (survey) | PCA | Orthogonal | Kaiser criterion | .40 | Education; Income; Nonwhite; Smokers |
| Juniper et al (2004)4 | β-Agonist use (diary); Quality of life (survey); Spirometry; Symptoms: asthma (diary) | PCA | Orthogonal | Kaiser criterion | NR | Active treatment group; Multiple trials; Time |
| Leung et al (2004)5 | Spirometry; Exhaled NO; Atopy (biological); Inflammation (biological) | PCA | Orthogonal | Kaiser criterion | .45 | None |

EV, Eigenvalue; NR, not reported or unclear.

The next step is to determine which variables load on the factors. Factor loadings are the correlations between the observed variables and the factor. Because there is no statistical test, the rule of thumb is to examine the magnitude of the factor loadings. Loadings with an absolute value greater than 0.30 are considered to meet the minimal level, whereas loadings of ±0.40 or greater are considered more important to that factor.6 Ideally each variable loads on only 1 factor; in practice, however, some variables load greater than 0.40 on more than 1 factor, and others may not load on any factor. If this occurs, a researcher may decide to delete those variables and rerun the model to obtain a new factor solution. In contrast with previous asthma studies,2-5 Holt et al1 did not have any variable load on multiple factors.

Secondary analyses often include testing the robustness of the factors identified. This is done by repeating the FA in a second sample, in a subsample, or over time. Previous asthma studies have found little difference when replicating the results over time,1,4 using multiple samples,4 or stratifying on demographics such as age, sex, race, or socioeconomic variables,2,3 suggesting that asthma status factors are robust.

The final step is to interpret the scientific meaning of the pattern of factor loadings and offer a label for each factor. The label or name of the factor is not assigned by the analysis but is developed by the researcher to best describe the pattern of factor loadings and can be symbolic, descriptive, or causal.
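The loading rule of thumb described above can be sketched in a few lines of Python. The rotated loadings and variable names below are hypothetical and are not taken from any of the cited studies. The sketch flags variables that load above the cutoff on more than 1 factor, and variables that do not load on any factor, which are the candidates a researcher might consider removing before rerunning the model.

```python
def salient_factors(loadings, cutoff=0.40):
    """Map each variable to the factor indices where |loading| > cutoff."""
    return {
        variable: [j for j, value in enumerate(row) if abs(value) > cutoff]
        for variable, row in loadings.items()
    }


def flag_problem_variables(loadings, cutoff=0.40):
    """Return (cross_loading, non_loading) variable names, each sorted."""
    salient = salient_factors(loadings, cutoff)
    cross_loading = sorted(v for v, f in salient.items() if len(f) > 1)
    non_loading = sorted(v for v, f in salient.items() if not f)
    return cross_loading, non_loading


# Hypothetical rotated loadings on 3 factors (assumption for illustration).
loadings = {
    "fev1_pct_predicted": [0.82, 0.05, -0.10],
    "fev1_fvc_ratio": [0.77, -0.12, 0.08],
    "symptom_days": [0.10, 0.71, 0.15],
    "beta_agonist_use": [0.08, 0.64, 0.45],
    "total_ige": [-0.03, 0.11, 0.69],
    "exhaled_no": [0.21, 0.28, 0.33],
}

cross_loading, non_loading = flag_problem_variables(loadings)
print("Loads on more than 1 factor:", cross_loading)
print("Loads on no factor:", non_loading)
```

The 0.40 cutoff is only the conventional default; studies in Table I used .40 or .45, so the cutoff is left as a parameter.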
It is important to note that factors may represent the method of measurement, such as self-report versus laboratory values, rather than a biological or mechanistic relationship.

A common error is overinterpretation of FA by ascribing greater meaning to the factors than can be presumed from the FA results alone. In the article by Holt et al,1 for example, there is an explicit assumption that the results of the FA suggest that the factors abstracted are meaningfully related to the assessment of a patient's asthma status. Although it is true that there are previous empirical data to support that the variables included in the FA are important for the assessment of asthma status, the FA itself cannot test the hypothesis that a multicomponent assessment has clinical utility. FA also does not support the recommendation by Holt et al1 that the selection of 1 variable from each factor is sufficient for a comprehensive asthma assessment. What is clear from this article and others, however, is that variables known to be associated with asthma outcomes are not a unidimensional construct.

A critical next step is to test the relevance of these constructs to the prediction of important health outcomes. For example, what is the relative importance of each factor identified by Holt et al1 to specialist ratings of asthma severity, response to treatment, or the occurrence of future exacerbations? Is the strength of the association between factors equal across all outcomes, or are some factors more sensitive to change? A benefit of FA is that factor scores can be calculated (weighting each variable proportionally to its loading on the factor) and used as variables in subsequent analyses. Factor scores are influenced by measurement error, so they are not perfect measures of the underlying construct; however, they can be considered weighted composite scores.
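As an illustration of factor scores as weighted composites, the sketch below standardizes each variable and sums the standardized values weighted by their loadings on a single factor. This is a simple loading-weighted composite chosen for clarity, not the regression-based scoring that statistical packages often use; the data, loadings, and variable names are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert raw observations to z scores (sample standard deviation)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def weighted_factor_scores(data, loadings):
    """Loading-weighted composite score per patient for one factor.

    data:     dict of variable name -> list of observations (one per patient)
    loadings: dict of variable name -> loading on the factor of interest
    """
    z = {var: standardize(obs) for var, obs in data.items() if var in loadings}
    n_patients = len(next(iter(z.values())))
    return [
        sum(loadings[var] * z[var][i] for var in z)
        for i in range(n_patients)
    ]


# Hypothetical data for 5 patients (assumption for illustration only).
data = {
    "symptom_days": [0, 2, 4, 6, 8],
    "beta_agonist_puffs": [1, 1, 3, 5, 5],
}
# Hypothetical loadings of each variable on a single "symptom" factor.
loadings = {"symptom_days": 0.71, "beta_agonist_puffs": 0.64}

scores = weighted_factor_scores(data, loadings)
for patient, score in enumerate(scores, start=1):
    print(f"patient {patient}: {score:.2f}")
```

Scores computed this way could then be entered as a predictor or outcome in a later model, which is the kind of follow-up analysis the text calls for.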
This empirical approach to data reduction may increase statistical precision, thereby reducing the sample size required.7 To date, no asthma study has evaluated the clinical utility or predictive ability of factor scores.

The article by Holt et al1 in this issue supports the asthma guidelines8 and a growing body of literature suggesting that asthma status is not unidimensional. The minor differences between studies conducting FAs more likely reflect the selection of variables and varying analytic approaches than substantial differences in the underlying constructs. To advance our scientific knowledge of which constructs are essential to assess in order to properly assign treatment and minimize negative health outcomes, future studies need to go beyond merely evaluating factor structure to testing the relative importance of each factor to relevant asthma health outcomes.

REFERENCES

1. Holt EW, Cook EF, Covar RA, Spahn J, Fuhlbrigge AL. Identifying the components of asthma health status in children with mild to moderate asthma. J Allergy Clin Immunol 2008;121:1175-80.
2. Schatz M, Zeiger RS, Vollmer WM, Mosen D, Apter AJ, Stibolt TB, et al. Validation of a beta-agonist long-term asthma control scale derived from computerized pharmacy data. J Allergy Clin Immunol 2006;117:995-1000.
3. Pillai SG, Tang Y, van den Oord E, Klotsman M, Barnes K, Carlsen K, et al. Factor analysis in the Genetics of Asthma International Network family study identifies five major quantitative asthma phenotypes. Clin Exp Allergy 2008;38:421-9.
4. Juniper EF, Wisniewski ME, Cox FM, Emmett AH, Nielsen KE, O'Byrne PM. Relationship between quality of life and clinical status in asthma: a factor analysis. Eur Respir J 2004;23:287-91.
5. Leung TF, Wong GW, Ko FW, Lam CW, Fok TF. Clinical and atopic parameters and airway inflammatory markers in childhood asthma: a factor analysis. Thorax 2005;60:822-6.
6. Tabachnick BG, Fidell LS. Using multivariate statistics. 4th ed. New York: Allyn and Bacon; 2001.
7. Freemantle N, Calvert M, Wood J, Eastaugh J, Griffin C. Composite outcomes in randomized trials: greater precision but with greater uncertainty? JAMA 2003;289:2554-9.
8. National Asthma Education and Prevention Program Expert Panel Report 3: guidelines for the diagnosis and management of asthma. Bethesda (MD): National Institutes of Health; 2007. Pub no. 08-4051.