Functional Ecology 1996 10,421-43 1 ESSAY REVIEW Inferring multiple causality: the limitations of path analysis P. S. PETRAITIS," A. E. DUNHAM* and P. H. NIEWIAROWSKI? *Department of Biology, University of Pennsylvania, Philadelphia, PA 19104-6018 and ?-Universityof Akron, Department of Biology, Akron, OH 44325-3908, USA Summary 1. Review of published ecological studies using path analysis suggests that path analysis is often misused and collinearity is a moderate to severe problem in a majority of the analyses. 2. Multiple regression is shown to be related to path analysis, and thus collinearity, which is a well-known problem in regression analysis, can also be a problem in path analysis. Collinearity occurs when independent variables are highly correlated and causes estimates of standardized partial regression coefficients and of path coefficients to be less precise, less accurate and prone to rounding error. 3. A review of 24 path analyses in 12 papers revealed incorrectly calculated path coefficients in 13 cases and problems with collinearity in 15 cases. Estimates of path coefficients are biased and lack precision if there is collinearity. 4. An additional 40 path analyses could not be evaluated because of incomplete information. Most studies lack sufficient sample size to justify the use of path analysis. Problems associated with the use of categorical variables, which were used in six cases, are unappreciated. 5. These problems suggest that path coefficients may often mislead ecologists about the relative importance of ecological processes. Key-words: Collinearity, multiple regression, relative importance, statistical analysis Functional Ecology (1996) 10,42143 1 Introduction 0 1996 British Ecological Society The dynamics of many important ecological and evolutionary processes are often influenced by multiple interacting factors (Quinn & Dunham 1983), and attempting to understand the dynamics of such systems presents difficult technical and philosophical problems. Evolutionary ecologists have recently turned to path analysis, multiple regression and related techniques to analyse systems of multiple causality. While these techniques are often seen as competitors, they are, in fact, related, and as a result the problems of these techniques are similar. The use of path analysis as a means of analysing systems involving multiple causality has become increasingly prevalent in evolutionary ecology. Path analysis was developed in the 1920s by Wright (1920, 1934) to analyse systems of multiple causality, but in spite of its appeal, it was not widely used until recently (see Power 1972 and Arnold 1972 for early examples of the use of path analysis in ecology). With the advent of fast and powerful desk-top computers, path analysis and related methods are rapidly becom- ing favoured techniques for the inference of multiple causal interrelationships among variables in ecology and evolutionary biology (Mitchell-Olds & Shaw 1987; Lynch 1988; Lynch &Arnold 1988; Schemske & Horvitz 1988; Crespi & Bookstein 1989; Crespi 1990; Farris & Lechowicz 1990; Johnson, Huggins & DeNoyelles 1991; Kingsolver & Schemske 1991; Mitchell 1992). There are, however, a number of important limitations and pitfalls of path analysis and related methods that seem largely unappreciated (but see Mitchell-Olds & Shaw 1987 and James & McCulloch 1990). We point out some of these dangers and suggest that path analysis cannot adequately answer many of the questions that evolutionary ecologists wish to ask. Our review begins by placing path analysis in the context of multiple regression techniques and least-squares estimation. We do not discuss the more commonly cited difficulties, such as the requirements of linearity and homogeneity of variances or the use of predictor variables that are measured with errors. These issues are well known and extensively discussed in the statistical literature (Myers 1990 422 provides one of the clearest introductions to these problems). We are more concerned with explaining several unappreciated pitfalls that arise in the study of complex systems in which the causal mechanisms are poorly known and in which the suspected causes may be highly intercorrelated. Such conceptual shortcomings often blur the distinction between hypothesis testing and exploratory analysis in the minds of many ecologists, and so we discuss how path analysis cannot overcome a lack of understanding about specific mechanisms or replace explicit experimental tests of alternative hypotheses about these mechanisms. Intercorrelated factors also present analytical problems, which come under the rubric of collinearity. Collinearity is not an unknown problem to ecologists (see Mitchell-Olds & Shaw 1987 and James & McCulloch 1990 for recent critiques in evolutionary ecology), yet diagnostics for detecting collinearity are not widely used in evolutionary ecology. Thus the third section discusses the problems and detection of collinearity. Finally, we review some of the better-known studies in which path analyses have been used and demonstrate that collinearity compromises the biological interpretations in many cases. In reviewing the literature, we were surprised by misuse of categorical data, presence of pseudoreplication and problems with inadequate sample size, and so, in passing, we also comment on these problems. P. S. Petraitis et al. Links between multiple regression and path analysis Multiple regression models are used to fit the best functional relationship between a single criterion variable (Y) and two or more predictor variables (X,, X2,...). The 'functional relationship' between predictors and criterion is assumed to be linear (Y= PIXl + P2X2 + ...). Furthermore it is assumed that the ~ sthe , errors associated with observing the Xis, are distributed normally and are not correlated. If these assumptions are met, then ordinary least-squares methods will then give the 'best' estimates of the linear relationships (i.e. the estimators will be unbiased and have minimum variance, Mood & Graybill 1963, Myers 1990). If the predictor and criterion variables are centred by subtracting their means and scaled by dividing through with their standard deviations, then the best estimates of linear relationships are standardized partial regression coefficients. For three predictor variables, XI, X2, and X3, and a criterion variable Y, the functional relationship between the standardized variables is then y = blxl . .+ b2x2 - - + blx3 . .+ E , O 1996British Ecological Society, Functional ~ ~ 10,421431 w, eqn 1 where x and y are the standardized variables and the bis~are the lstandardized partial regression ~ ~ ~ , coefficients (beta weights). The least-squares estimate of the beta weights can be found by solving the set of normal equations, which describes the relationship between all pairwise correlations and the standardized regression coefficients. For our three standardized predictor variables, xl, x2, and x,, and one standardized criterion variable y, the normal equations are: eqn 2 The correlation structure can profoundly affect the estimates of the beta weights because the estimates are adjusted for the correlations among the predictor variables, even though these correlations do not appear in the regression equation (i.e. equation 1). This can be easily seen by plotting the size of the beta weights of two predictors against the strength of the correlation between the predictors (Fig. 1). Suppose we start with the situation where x, and x2 are uncorrelated and r,, = -r2, and thus the regression coefficients b, and b2 equal r,, and -r2,, respectively. The coefficients, however, change as the correlation between x, and x2 changes (Fig. 1). The dependence of regression and path coefficients on the correlation structure is often glossed over. Yet this dependence means that inferences about the strengths of path coefficients are conditional upon the correlational structure among the predictor variables. If the correlational structure of the sample does not match the correlational structure of the population then regression and path coefficients will not estimate relative strengths accurately. This can occur if sampling is not random or if analysis is based on data from experiments with factorial designs, which break the correlational structure among the predictor Fig. 1. The effect of correlational structure on the magnitude of beta weights. In this example, it is assumed that the beta weights are equal in magnitude but opposite in sign. When x, and x2 are uncorrelated, then b, = r,, and b2= r2,. The correlation r,, is set equal to 0.3 when rI2= 0. 423 Limitations of path analysis variables (e.g. Campbell & Halama 1993; EnglishLoeb, Karban & Hougen-Eitzman 1993; Wootton 1994). One additional point is worth mentioning. The number o f paired observations affects the estimate o f the standardized partial regression coefficients o f categorical variables because the variance is standardized to 1. Standardization o f a two-state categorical variable (scored 0 or 1 ) means that displacement o f the scores 0 and 1, in standard deviation units, varies with the number o f observations. For a regression done in the context o f an analysis o f variance, the number o f observations equals the number o f treatments (see Sokal & Rohlf 1981). The difference between two treatments, such as the presence and absence o f a predator, is [ ( n - l ) l n ~ "standard ~ deviation units where n = 2 for the two treatments. Changes in the number o f treatments, which is often less than five, will thus dramatically alter beta weights. As in multiple linear regression, conventional path analysis starts with a defined structure o f hypothesized causal relationships (paths), usually represented as causal diagrams, and then uses the correlations among all the variables to estimate the strength o f the paths. The variables are standardized as in standardized multiple regression. Thus in Fig. 2(a), there are the three predictor variables and one criterion variable, and the observed correlation between x1 and y is due not only to the direct effect o f x , on y (path p,,) but also the indirect effect o f x, via x, and x, because the three predictors are correlated. For example, the correlation betweenx, and y in terms o f path coefficients,is: '-1, Fig. 2. Comparison of beta weights and path coefficients and their standard errors. The correlations are from Box 16.1 in Sokal & Rohlf (1981), which we used because data are readily available. Fig. 2(a) is the path diagram for the multiple regression model. Double-headed arrows represent the correlations among the predictor variables and single-headed arrows are the standardized partial regression coefficients. (b) is the path diagram in which there is no causal link between x, and x,. (c) gives the path diagram of the multiple regression model in which x3 has been dropped to reduce collinearity. Sample size is 41. All paths significant except eachp,, in (a) and (b). = P,, + py2r12 + py,r1,. eqn 3 Equation 3 is identical to the first o f the three normal equations except that pyi s replace the bis. I f the path diagram posits all possible pairwise correlations among the predictors, then path coefficients are merely standardized partial regression coefficients. Note that multiple regression is simply a path analysis with a particular structure and, per force, the same restrictions apply to both analyses. Calling one analysis a path analysis and another a regression analysis does not alter the conditions required for parameter estimation. Figure 2(a) is, in fact, the path diagram for a multiple regression. Now suppose we believed apriori that the path diagram does not include all the causal and correlational links found in a multiple regression. For example, suppose we assume that no link exists between x, and X , either directly between x1 and x, or indirectly via x, (Fig.2b). This structural constraint implies rl, equals zero, and the path diagram, or model, is called overidentified because there are fewer paths to be estimated than observed correlations. To solve for the paths, r13in the normal equations is set to zero regardless o f the observed value for TI,. This is a structural zero forced by our hypothesis o f no interaction between x1 and x,. Predefined constraints, which are determined prior to calculating the path coefficients, cause the path coefficients to differ from the beta weights o f a multiple regression. This can be clearly seen by comparing Fig. 2(a) and Fig. 2(b). It has been often suggested that path coefficients can be calculated by carrying out repeated multiple regression analyses on appropriate subsets o f variables (e.g. Dillon & Goldstein 1984; Fanis & Lechowicz 1990; Mitchell 1992). This is often true, but the example in Fig. 2(b)cannot be analysed in this 424 P. S. Petraitis et al. manner. In this example, the correlation between xl and x3 is assumed to be zero based on the structure of the path diagram, and this correlation must be set equal to zero in the normal equations. Carrying out multiple regressions for subsets of variables using the raw data is not appropriate, and Fig. 2(b) can only be analysed by defining the correct correlatioh matrix if regression procedures in conventional statistical packages such as SAS or SYSTAT are used (see Appendix for SAS and SYSTAT codes for running the examples in Fig 2a and b) . Hypothesis testing vs exploratory analysis O 1996 British Ecological Society, Funcfional ~ ~ 10,421431 Several recent studies have presented reviews demonstrating and advocating the use of path analysis to study systems of multiple causality in ecology and evolution (Schemske & Horvitz 1988; Kingsolver & Schemske 1991; Mitchell 1992, 1993; Wootton 1994). Overall, these reviews present fairly thorough treatments of the major assumptions of the technique as well as the potential strengths of the approach compared with multiple regression. But our review of studies actually employing path analysis reveals substantial confusion about the rationale for using this approach and the limitations on the inferences about causality that may be drawn. We suggest that the main confusion associated with path analysis is that inferences about 'causal' pathways are misleading. Demonstration of causation must be made external to the statistical process of path analysis. In other words, finding a significant fit of a path model to a data set does not demonstrate that relationships among variables are causal (Mitchell 1993). Rather, the path model becomes an hypothesis subject to experimental verification. Kingsolver & Schemske (1991) and Mitchell (1992, 1993) emphasize two main applications of path analysis: exploratory data analysis and formal hypothesis testing (statistical adequacy of a proposed causal model). We view these two uses as alternatives and suggest that inappropriate application of path analysis has arisen frequently from a lack of distinction between them. The feature that distinguishes formal hypothesis testing is the presentation of a formal path model that is not derived from a data set that is itself the object of the path analysis. In other words, a model of causal and correlational pathways is presented a priori. It is separate from the data that will be used to estimate strengths of paths in the model. The model must be tested by collecting new data. This approach, however, poses a dilemma because one must choose between observed correlations based on new data and inferences about causal links that were made before the data were collected. For example, suppose that we assumed a model as in Fig. 2(b) where to be ~ r13is l assumed ~ ~ zero,~but new, data showed that r,,, even though small, was significantly different from zero. There are three alternatives. First, accept the data as providing a good test of the model and reject the model. We believe this is the correct procedure. Second, use the widely accepted 'arm-waving' approach, ignore the new data and hold on to the model in spite of its falsification. Third, add or delete paths until a better fit is found. At this point, we slip into an a posteriori approach to path analysis that has the same difficulties as stepwise regression (see comments by James & McCulloch 1990). In contrast to testing an a priori path model, a path diagram can be created aposteriori by adding or dropping variables until a fit that maximizes the proportion of variation explained is found. This is an exercise in formalizing, in a quantitative sense, hypotheses about the pattern of interactions among a set of correlated variables. Good hypotheses may spring from imagination, a posteriori path analyses, or various other Muses. Use of path analysis in this context, like our use of the Muses, must be carefully distinguished from hypothesis testing. Hallmarks of the a posteriori approach include lack of presentation and justification of an explicit path model before analysis (e.g. Fjeld & Rognerud 1993). More subtly, studies that present only vague descriptions and justifications of a path model as well as those that present path diagrams based solely on previously collected correlative data (e.g. Eriksson & Bremer 1993) have more in common with exploratory path analysis than with hypothesis testing. Under the a posteriori approach, the dependence of path coefficients on the correlational structure becomes a crucial issue. Recall that Fig. 1 shows how the magnitude of the coefficient changes with the correlation among the predictor variables. Thus our choice of causal links will determine the magnitude of the path coefficients. A firmly grounded causal model prior to the collection of the data is a must if we are to have any confidence in the results of a path analysis. Yet post hoe models seem to be the norm in ecology (several recent examples are Johnson et al. 1991; Wesser & Armbruster 1991; Walker et al. 1994; Wootton 1994). Such studies, which mix a posteriori exploratory analysis and hypothesis testing, commit an error that has little to do with the specific assumptions of path analysis, but instead reveal a misunderstanding about the role of statistics in inferring causal relationships. Collinearity: the Medusa of regression analysis? While a variety of methodological problems plague both multiple regression and path analysis, collinearity is especially troublesome for evolutionary ecologists who wish to place some confidence in the values assigned to path coefficients. Collinearity occurs when there are large correlations among some of the predictor variables. However, examination of all pairwise correlations will not ensure that collinearity is 425 Limitations of path analysis Fig.3. Effect of collinearity on confidence intervals of partial regression coefficients (P). A thousand X,, X2 pairs were generated with correlation coefficients of 0.47, 0.86, 0.96, respectively. From each sample of 1000 doublets, four sets of 50 pairs were chosen at random and used to calculate Y according to the equation: Y = 0,3X, + 0.4X2 + E where & is a random error term drawn from a uniform distribution bounded between -100 and +100. The partial regression coefficients for X, and X2 were estimated using ordinary least squares. Solid lines show the true parametric values of p2 and p,. Symbols and bars show mean values from the four samples and the mean 95% confidence intervals, respectively. O 1996 British Ecological Society, FunctionalEcology, 10,421431 detected, and successful detection requires the use of a variety of diagnostics (Myers 1990), which are related to the effects of collinearity on the coefficients. Correlated predictors have two distinct effects on the partial regression coefficients (and per force path coefficients). First, collinearity among predictor variables increases the standard errors of estimated partial regression coefficients, resulting in a lack of stability of path coefficients between samples (Asher 1983; Fig.3). Large standard errors of path coefficients mean that estimated strengths of effects between variables may vary substantially between samples. This means less precision. One result is that, as collinearity increases, the ability to detect a significant effect (statistically non-zero path coefficient) is reduced (compare 95% confidence intervals around regression coefficients with r12= 0.96 vs 0.47 in Fig. 3). This is an especially important result when individual t-tests are used to test the significance of individual path coefficients. In other words, inability to reject the hypothesis that a path coefficient is nonzero may result from very low power (large standard error), rather than the unimportance of an effect. Therefore, even though increased confidence intervals around partial regression estimates increase with collinearity, protecting against a type I error, incorrect inferences (i.e. a particular path effect is zero) about the importance of a particular effect may still be drawn. The inflation of the standard error can be used to screen for collinearity. If the predictor variables are not correlated, then the variance of each regression coefficient will equal the error variance (i.e. Var(b,) = 02, Myers 1990). Correlations among predictors inflate the variances of the coefficients, and variance inflation factors (VIFs), which equal var(b1)lo2, are used as indexes of the loss in precision. The VIF of the it, predictor is calculated by regressing the ith predictor against the remaining predictors. VIF equals ll(1-R;) where R: is the coefficient of multiple determination of this regression. There are no hard and fast rules for how large VIFs can be before collinearity becomes a problem. Myers (1990) suggests all VIFs should be less than 10, but Freund & Littell (1986) suggest all R:S should be smaller than the R2 of the complete model. For our artificial example in Fig. 2(a), the VIFs are 1.25 forb,, 14.28 for b2 and 13.70 for 6,. Thus the variances of b, and b2 are roughly 11 times the expected error variance. As expected, the standard errors for b1 and b2 are large (0.380 and 0.374, respectively), and there is a loss in precision. The second effect of collinearity is an increase in the absolute magnitude of the partial regression coefficients (Myers 1990). This means less accuracy. A bias in the beta weight is often the most noticeable. For example, a multiple regression, as in Fig. 2A, in which one or more of the beta weights are greater than one in absolute value may be due to collinearity. Large values for beta weights, however, do not necessarily imply collinearity. The estimation bias depends on the magnitude of the eigenvalues of the predictor variable correlation matrix such that the relationship between the expected estimates and the parametric values is: E(Ci bI2)=Xi + 02C1ll&, eqn 4 where bi = the estimated regression coefficients, pi = their parametric values, o2= the model error variance and ki = the eigenvalues (Myers 1990). The bias can be considerable. For example, two of the predictors used in the multiple regression in Fig. 2(a) are highly correlated and two of the three beta weights are greater than one in absolute value. The eigenvalues of the predictor correlation matrix are 1.9878, 0.9761 and 0.0361. Thus Xi b2 overestimates C, by 29.2302! Normally Ci b2 overestimates Xi Pi2 by 302 if all three predictors are not correlated because eigenvalues for matrices of uncorrelated predictors equal one. This bias can be used to detect collinearity by calculating a condition index, which is hl/Ai, for each predictor variable. Myers (1990) suggests collinearity will be a problem if the index is over 1000. Note that SAS and SYSTAT use the square root of this ratio, which is also called the condition index, and suggest using 30 as the cut-off point. Based on Mason & Perreault's (1991) simulations, we suspect that the 426 P. S. Petraitis et al. cut-off is set far too high for ecological analyses. They found serious problems with collinearity with condition indexes as low as 3.4 when the sample size was less that 100 and R2 was less than 0.75, which are typical values for ecological studies. In Fig. 2(a), for example, the largest condition index is only 7.42 (square root of 1.9878/0.0361), which is well below 30, but the beta weights appear to be unusually large. The problems with collinearity, in particular unusually large beta weights, are often resolved by dropping one or more of the predictors. This seems to make sense to many researchers (see comments of MitchellOlds & Shaw 1987). Clearly if two variables are highly correlated, then either should serve equally well to predict change in the criterion variable. In Fig. 2(c) for example, we dropped x3 because of its strong correlation with x, but weak correlation with x,. Dropping the third variable has a small effect on the amount of the variation that is explained: the coefficient of multiple determination, R2,equals 0.6125 if all three predictor variables are considered and drops to 0.5 161 if only x, and x, are included. There is, however, quite a change in the beta weights, and b, goes from +1.7111 to +0.5835, which underscores the difficulties of inferring the relative strengths of several predictors when they are highly intercorrelated. Dropping predictor variables is not only intellectually dishonest (Philippi 1993), it also contaminates the remaining predictors (Box 1966). Suppose we believe that the true causal relationship involves all three predictors, but we drop x3 to limit the effects of collinearity. In spite of this, the estimates of the regression coefficients b, and b, remain contaminated by the eliminated predictor. For example, suppose the true model and expected value of y[E(y)] were eqn 5 but with x, and x3 highly correlated as is the case in Fig. 2(c). If we drop x,, our expected values for b, and b, become: P3 Note that in equation 6 the quantities by which are multiplied are partial correlations and are the standardized partial regression coefficients for x, and x, if we regressed x3 on x, and x,. Review of path analyses in ecology It is our impression that path analysis has been widely misused by ecologists. We reviewed over 64 path analyses in 26 papers (Tables 1 and 2). Our review is by no means complete. We relied on citations in published papers and listings in Current Contents for references. Since our main concern was the use of path O 1996 British analysis in ~ ~ Society, ~ l ~ ~ i ~ecology, ~ lwe did not include papers dealing with unmeasured variables (e.g. the measurement of Functional~cology, 10,421431 selection, see Mitchell-Olds & Shaw 1987; Crespi & Table 1. Papers with path analyses that were examined but not included in review of collinearity. Analyses were rejected for review because they lacked either a table of correlations or an explicit diagram with path coefficients Source 1 2 3 Campbell & Halama (1993) English-Loeb et al. (1993) Fams & Lechowicz (1990) Herbers (1990) Herrera (1993) Johnson et al. (1991) Jordan (1992) Jordan (1993) King (1993) Maddox & Antonovics (1983) Mitchell (1993) Mitchell-Olds (1987) Sinervo (1990) Stanton et al. (1991) Winn & Werner (1987) Wootton (1994) 1. Number of path analyses in paper: note that Winn & Werner (1987) do not provide sufficient data to determine the number of analyses done. 2. Categorical variables or variables defined by treatment levels included in analysis? N, no, Y, yes. 3. Sample sizetnumber of predicted paths. ?, sample size missing or sample size or number of paths not explicit for one or more analyses. Bookstein 1989; Crespi 1990) or articles outside the mainstream of ecology (e.g. use of path analysis in an agricultural context). Authors analysed path diagrams by hand (two papers) or by using a LISREL type of program (six papers) or a multiple regression package (18 papers). LISREL, which stands for linear structural relations (Joreskog & Sorbom 1988), is a statistical package that can implement a maximum-likelihood procedure to estimate the strengths of direct and indirect effects (what we have been calling beta weights or path coefficients up to now). The procedure is quite versatile, and the model can include predictor and criterion variables that are unmeasured. The equivalent procedure in SAS is called PROC CALIS (SAS Institute Inc. 1990). We were surprised by the low sample size in nearly all cases. Sample size needs to be large to ensure stable parameter estimates. With a small sample size, successive observations are likely to change parameter estimates merely as a result of random sampling from the underlying 'true' population variancecovariance structure. As a general rule of thumb, sample size should be at least five to 20 times larger than the number of estimated paths to ensure reliable results (SAS Institute Inc. 1990). We found the ratio of sample size to number of paths ranged from below 2 to about 90 (see Tables 1 and 2). The ratio appeared to be < 5 for the majority of analyses. In many cases, the sample size was not explicitly stated, and in the 427 Limitations of path analysis Table 2. Survey of collinearity and other problems in published path analyses for which correlations and path diagrams were provided Source 1 2 3 4 5 6 7 8 Arnold (1972) T I , F2 Eriksson & Bremer (1993) T2, F1 Fjeld & Rognerud (1993) T2, F4 Johnson (1975) T3, F4 Mitchell (1992) T I , F1A Mitchell (1992) T I , FIB Mitchell (1993) T10.1, F10.3 Power (1972) T2, F2 Roach (1986) T3, F2 Schemske & Horvitz (1988) A, F3 Schemske & Horvitz (1988) A, F4 Walker et al. (1994) Ala, F2 Walker et al. (1994) Alb, F2 Walker et al. (1994) Alc, F2 Walker et al. (1994) Ald, F2 Walker et al. (1994) Ale, F2 Wesser & Armbruster (1991) A2, T2 Wesser & Armbmster (1991) A2, T3a Wesser & Armbruster (1991) A2, T3b Wootton (1994) T3, F4A Wootton (1994) T3, F4B Wootton (1994) T3, F4E Wootton (1994) T3, F4D Wootton (1994) T3, F4C Source: Author, year, correlation table, path coeffiecients. T, table, F, figure, A, appendix. 1. Included categegorical variables or variables defined by treatment levels? N, no, Y, yes. 2. Sample sizelnumber of predicted paths to the nearest integer. 3. Number of miscalculated pathsltotal number of paths. ?, problems with estimates. 4. Number of regressions incorrectly doneltotal number of regressions. 5. Largest condition index. 6. Largest VIF. 7. Number of VIFs > model VIFItotal number of paths from multiple regressions. 8. Number of multiple regressions for which at least one VIF > model VIF/total number of multiple regressions. O 1996 British Ecological Society, ~ ~ ~ 10,421-431 paper by Jordan (1993), we could not find any mention of sample size. We also disregarded pseudoreplication in our tally of sample sizes and used samples sizes as stated by the authors. Although we were not looking for it, we found two clear cases of what Hurlbert (1984) calls temporal pseudoreplication. Johnson et al. (1991) stated they had 64 observations, but in fact used eight mesocosms and sampled each one eight times. Wootton (1994, his Table 2) gives his sample size as 20, but his experiment involved 10 cages, each of which was sampled in 2 different years. Categorical variables or results from experiments with fixed treatment levels were used in nine papers, and none of the authors seems to have understood how these types of variables fail to account for the range of natural variation. For example, consider a two-state categorical variable. The scoring is arbitrary because the variance is standardized to unity. Thus experimental manipulation of the absence and presence of a predator is usually scored as 0 and 1 (e.g. Wootton 1994) but could just as well be scored as -100 and +I00 with no change in the beta weight or path coefficient.~ This means when techniques t j manipulative ~ ~ ~ are used to control treatment levels, we have no way of knowing if the range of treatments spans 1, 2 or 10 standard deviations of the normal variation seen in the field. We suspect that use of categorical variables and fixed treatment levels generally inflate estimates of path coefficients because the ranges of categories and treatments (e.g. the control of the presence or absence of a predator through the use of cages) are much larger than the implied range of one standard deviation (see Petraitis 1996). We were able to evaluate 24 path analyses in 12 papers for collinearity (Table 2). Tests for collinearity required us to re-run the path analyses, and to do this we needed a clear path diagram and the correlations among the variables. We found over 40 additional path analyses, but these lacked either a clear path diagram or the required correlations (Table l). Using the correlation matrix for each path diagram, we calculated condition indexes, variance inflation factors and path coefficients. We used regression procedures in SAS and SYSTAT to analyse the data. This involved calculating 226 paths spread over 9 1 regressions. We found incorrect path coefficients in 13 of the dis-~ ~24 path ~analyses ~(Table 2). ~ This does ~ not include ~ crepancies in estimates, which were often greater ~ 428 P. S. Petraitis et al. O 1996 British Ecological Society, Functional ~ ~ 10,421431 than 20% and caused by the effects of collinearity on accumulated rounding errors. In seven cases (one in Power 1972, one in Johnson 1975 and five in Walker et al. 1994), the authors used an overall multiple regression even though the path diagram showed no links between two or more of the independent variables. Usually the path diagram was missing correlational paths among variables, but the path coefficients suggested these correlations were included in the path analysis. We were able to duplicate Power's (1972) and Johnson's (1975) estimates by including the correlations missing from the path diagrams. We were unable to duplicate the estimates in Walker et al. (1994) or in three other analyses (one from Eriksson & Bremer 1993 and two from Wootton 1994). We assume typographical errors, either in the correlation table or in the path diagram, are the most likely cause in these cases. Finally, we could not identify the source of the discrepancies in the three remaining cases, which are from Wesser & Armbruster (1991). One of their analyses has serious collinearity, and rounding errors may have prevented us from recovering the published estimates in our re-analysis. However, the three path analyses contain six subanalyses that are shownas simple regressions, and thus the path coefficients should be the simple correlations. Five of the six path coefficients do not match the reported correlations. Collinearity appears to be a moderate to serious problem in 15 of 23 analyses and affects 22% of the path coefficients (Table 2, the number of VIFs is greater than model R ~ )When . collinearity was present, we found our estimates of path coefficients would have the same rank order as the published values but were often 20% greater than and sometimes double the published values. The condition indexes and VIFs were well above the recommended cut-off points in two papers (Wesser & Armbruster 1991; Wootton 1994). Ironically, these authors checked for collinearity. Wootton (1994) found it, but then dropped variables. Wesser & Armbruster (1991) failed to detect it. This suggests collinearity is a moderately common problem. Since it undermines the precision and accuracy of estimates of path coefficients, we believe most published path analyses in ecology should be viewed with extreme caution. Finally we would like to note the three good examples of path analysis we found and suggest they serve as guides to those who wish to do path analysis. Roach (1986) and Mitchell (1992, 1993) have sufficient sample size, appear to have done the analysis correctly, and have little or no problem with collineasity (see Table2). Sample sizes are clearly stated. Correlation tables, path diagrams, and path coefficients are provided. The data are not from experimental manipulations variables, and ~ and ldo not~ include~ categorical ~ , thus these problems are avoided. Discussion Evolutionary ecologists hope to estimate the relative strengths of competing causes of biological phenomena. Path analysis and multiple regression are useful tools for analysing multiple causality, but they cannot overcome the limitations imposed by highly correlated observations, inappropriate data and poorly conceptualized models. Collinearity causes beta weights to be less accurate, less precise and more sensitive to small errors in measurement and rounding. Use of categorical variables, non-random sampling, and use of data from manipulative experiments prevents the v a t ance-covasiance structure of the sample from matching the variance-covariance structure of the population. Path and regression estimates from such samples may be accurate and precise, but cannot be generalized to the population as a whole. Using path analysis does not improve the situation because multiple regression is simply a particular type of path analysis. Our review suggests there is cause for concern. Some (Kingsolver & Schemske 1991; Mitchell 1992) have suggested that path models, and in particular models with unmeasured variables, could be better analysed using approaches such as LISREL. We doubt that this procedure will be of much help to evolutionary ecology. Implementation of LISREL requires us to specify eight matrices in total (Dillon & Goldstein 1984). These matrices identify the links between all the variables (i.e. the paths) and the variance-covasiance structures of the measured variables and of their errors of measurement. Paths (i.e. non-zero entries in the matrices) can be fixed, constrained to equal a combination of other parameters, or left unconstrained. If all paths are left unconstrained, LISREL assumes certain constraints on some paths and then estimates the remaining ones. LISREL solves for the set of 'path coefficients' that best reproduces the observed variance-covariance matrix of the measured variables. Testing the fit of the observed variance-covariance matrix to the predicted variance-covariance matrix is carried out by a X2 likelihood ratio (Dillon & Goldstein 1984). The test assumes a large sample of independent observations, no missing or categorical data, and multinomality of all variables. The models must be linear and additive. All of the cases that we examined that used LISREL or a likelihood ratio test failed to meet one or more of these assumptions. It seems clear that, in most cases, the requisite underlying theory is not available to specify the mechanisms that link variables a priori nor do the data meet the assumptions of the model and its test. The problem of model definition is serious: failure to d e f i e a sufficiently restrictive model can lead to several sets of 'path coefficients' giving the same test value. One could 'adjust' the model by adding paths suggested by the observed correlation structure or by deleting less 'interesting' paths (e.g. a sequence of paths that connects two variables that 429 Limitations of path analysis are weakly correlated) until an unique solution can be found. Once this is done, data, not theory, drive the model, and the analysis becomes nothing more than 'a limitless exercise in data snooping contributing little, if anything, to scientific progress' (Dillon & Goldstein 1984). Given the potential for problems, we believe that LISREL does not offer a widely applicable solution for addressing multiple causality in evolutionary ecology, even though LISREL may be able to overcome some of the problems of collinearity. Several biased estimation techniques have been proposed as a way to combat collinearity (Myers 1990). These are controversial methods. Ridge regression, for example, biases the regression coefficients in order to improve the model's ability to predict the criterion variable. The estimates of the individual regression coefficients may be very biased even though when taken together they provide a very good prediction of the criterion variable. Other methods, such as the ridge trace method and the df trace method, stabilize the individual coefficients with minimal bias but at the expense of prediction. It seems to us the latter set of methods is more appropriate for ecological data because ecologists are usually interested in the relative strengths of coefficients. Ultimately, our understanding of relative strengths of ecological processes depends on clearly defined models of causality. Path analysis and LISREL will correctly estimate the relative strengths of competing factors only for the model under consideration. Dropping highly correlated predictors or deleting paths contaminates the remaining estimates, and thus throws in to question the meaning of 'relative strengths'. While path analysis and LISREL do offer structured approaches to finding patterns that may serve as the basis for hypotheses, these methods cannot by themselves confirm or disprove the existence of causal links. Acknowledgements We thank Joe Bernardo, Johan Bring, Evan Cooch, Peter Fairweather, Randy Mitchell, Steve Heard and Joe Travis for their comments and insights. References O 1996 British Ecological Society, Functional Ecology, 10,421431 Arnold, S.J. (1972) Species densities of predators and their prey. American Naturalist 106,220-236. Asher, H.B. (1983) Causal Modeling. Sage Publications, Newbury Park, CA. Box, G.E.P. (1966) Use and abuse of regression. Technometrics 8,625-629. Campbell, D.R. & Halama, K.J. (1993) Resource and pollen limitations to lifetime seed production in a natural plant population. Ecology 74, 1043-105 1. Crespi, B.J. (1990) Measuring the effect of natural selection on phenotypic interaction systems. American Naturalist 135,3247, Crespi, B.J. & Bookstein, F.L. (1989) A path-analytic model for the measurement of selection on morphology. Evolution 43,18-28. Dillon, W.R. & Goldstein, M. (1984) Multivariate Analysis. Wiley & Sons, New York, NY, USA. English-Loeb, G.M., Karban, R. & Hougen-Eitzman, D. (1993) Direct and indirect competition between spider mites feeding on grapes. Ecological Applications 3, 699-307. Eriksson, 0. & Bremer, B. (1993) Genet dynamics of a clonal plant Rubus saxatilis. Journal of Ecology 81, 533-542. Farris, M.A. & Lechowicz, M.J. (1990) Functional interactions among traits that determine reproductive success in a native annual plant. Ecology 7,548-557. Fjeld, E. & Rognerud, S. (1993) Use of path analysis to investigate mercury accumulation in brown trout (Salmo trutta) in Norway and the influence of environmental factors. Canadian Journal of Fisheries and Aquatic Science 50,1158-1167. Freund, R.J. & Littell, R.C. (1986) SAS System Linear Regression, 1986 edition. SAS Institute Inc., Cary, NC, USA. Herbers, J.M. (1990) Reproductive investment and allocation ratios for the ant Leptothorax longispinosus: sorting out the variation. American Naturalist 136, 178-208. Herrera, C.M. (1993) Selection on floral morphology and environmental determinants of fecundity in a hawk mothpollinated violet. Ecological Monographs 63,25 1-275. Hurlbert, S.H. (1984) Pseudoreplication and the design of ecological field experiments. Ecological Monographs 54, 187-211. James, F.C. & McCulloch, C.E. (1990) Multivariate analysis in ecology and systematics: Panacea or Pandora's box? Annual Review of Ecology and Systematics 21, 129-166. Johnson, M.L., Huggins, D.G. & DeNoyelles, F., Jr. (1991) Ecosystem modeling with LISREL: a new approach for measuring direct and indirect effects. Ecological Applications 1, 383-398. Johnson, N.K. (1975) Controls of number of bird species on montane islands in the great basin. Evolution 29, 545-567. Jordan, N. (1992) Path analysis of local adaptation in two ecotypes of the annual plant Dioda teres Walt. (Rubiaceae). American Naturalist 139, 149-165. Jordan, N. (1993) Prospects for weed control through crop interference. Ecological Applications 3, 84-9 1. Joreskog, K.G. & Sorbom, D. (1988) LISREL 7. A Guide to the Program and Applications. Scientific Software, Inc., Mooresville, IN, USA. King, R.B. (1993) Determinants of offspring number and size in the brown snake, Storeria dekayi. Journal of Herpetology 27, 175-185. Kingsolver, J.G. & Schemske, D.W. (1991) Path analyses of selection. Trends in Ecology and Evolution 6,276-280. Lynch, M . (1988) Path analysis of ontogenetic data. Sizestructured Populations - Ecology and Evolution (eds B. Ebenman & L. Persson), pp. 2 9 4 6 . Springer-Verlag, Berlin. Lynch, M. & Arnold, S.J. (1988) The measurement of selection on size and growth. Size-structured Populations Ecology and Evolution (eds B. Ebenman and L. Persson), pp. 47-59. Springer-Verlag, Berlin. Maddox, G.D. & Antonovics, J. (1983) Experimental ecological genetics in Plantago: a structural equation approach to fitness components in P. aristata and P patagonica. Ecology 64, 1092-1099. Mason, C.H. & Perreault, W.D. Jr., (1991) Collinearity, power, and interpretation of multiple regression analysis. Journal of Marketing Research 28,268-280. Mitchell, R.J. (1992) Testing evolutionary and ecological hypotheses using path analysis and structural equation modeling. Functional Ecology 6, 123-129. 430 P. S. Petraitis et al. 0 1996 British Ecological Society, Functional Ecology, 10,421431 Mitchell, R.J. (1993) Path analysis: pollination. Design and Analysis of Ecological Experiments (eds S.M. Scheiner and J. Gurevitch), pp. 211-231. Chapman and Hall, New York. Mitchell-Olds, T. (1987) Analysis of local variation in plant size. Ecology 68, 82-87. Mitchell-Olds, T. & Shaw, R. (1987) Regression analysis of natural selection: statistical inference and biological interpretation. Evolution 41, 1149-1 161. Mood, A.M. & Graybill, F.A. (1963) Introduction to the Theory of Statistics, 2nd edition. McGraw-Hill, New York. Myers, R.H. (1990) Classical and Modern Regrewion with Applications, 2nd edition. PWS Kent, Boston, MA. Petraitis, P.S. (1996) How can we compare the importance of ecological processes if we never ask, 'Compared to what?' Issues and Perspectives in Experimental Ecology (eds W. Resitarits and J. Bernardo). Oxford University Press, in press. Philippi, T.E. (1993) Multiple regression: herbivory. Design and Analysis of Ecological Experiments (eds S.M. Scheiner and J. Gurevitch), pp. 183-210. Chapman and Hall, New York. Power, D.M. (1972) Numbers of bird species on the California islands. Evolution 26,45 1 4 6 3 . Quinn, J.F. & Dunham, A.E. (1983). On hypothesis testing in ecology and evolution. American Naturalist 122,602417, Roach, D.A. (1986) Timing of seed production and dispersal in Geranium carolinianum: effects on fitness. Ecology 67,572-576. SAS Institute, lnc. (1990) SAS User's Guide: Statistics. SAS Institute, Inc., Cary, NC. Schemske, D.W. & Horvitz, C.C. (1988) Plant-animal interactions and fruit production in a neotropical herb: a path analysis. Ecology 69, 1128-1 137. Sinervo, B. (1990) The evolution of maternal investment in lizards: an experimental and comparative analysis of egg size and its effects on offspring performance. Evolution 44,279-294. Sokal, R.R. & Rohlf, F.J. (1981) Biometry, 2nd edition. Freeman, San Francisco, CA. Stanton, M., Young, H.J., Ellstrand, N.C. & Clegg, J.M. (1991) Consequences of floral variation for male and female reproduction in experimental populations of wild radish, Raphanus sativus L. Evolution 45,268-280. Walker, M.D., Webber, P.J., Arnold, E.H. & Ebert-May, D. (1994) Effects of interannual climate variation on aboveground phytomass in alpine vegetation. Ecology 75, 393408. Wesser, S.D. & Armbmster, W.S. (1991) Species distribution controls across a forest-steppe transition: a causal model and experimental test. Ecological Monographs 61, 323-342. Winn, A.A. & Werner, P.A. (1987) Regulation of seed yield within and among populations of Prunella vulgaris. Ecology 68, 1224-1233. Wootton, J.T. (1994) Predicting direct and indirect effects: an integrated approach using experiments and path analysis. Ecology 75, 151-165. Wright, S. (1920) The relative importance of heredity and environment in determining the piebald pattern of guinea pigs. Proceedings of the National Academy of Science, USA 6,320-332. Wright, S. (1934) The method of path coefficients. Annals of Mathematics and Statistics 5. 161-215. Received 13 June 1995: revised 21 November 1995; accepted 30 November 1995 431 Limitations of path analysis Appendix SAS and SYSTAT code to run path analyses in Fig. 2a and b. Examples from Sokal and Rohlf's (1981) Box 16.1 and Table 16.I SAS CODE TO RUN FIG. 2(A) SYSTAT CODE TO RUN FIG. 2(A) DATA C.FIG2A (TYPE=CORR); INPUT ( Y X 1 X2 X3 N) (8.4); -TYPE- = ' CORR ' ; LENGTH -NAME- $ 8 ; IF -N- = 1 THEN -NAME- = 'Y'; IF -N-=2 THEN -NAME-= 'XI'; IF -N- = 3 THEN -NAME- = 'X2' ; IF -N-=4 THEN -NAMl-= 'X3'; IF -N_=5 THEN -NAMI-= 'N'; IF -N_=5 THEN -TYPE-= 'N'; CARDS ; 0.6448 0.4938 -0.4336 1.0000 -0.0627 1.0000 -0.1900 -0.4336 1.0000 0.9553 0.6448 -0.1900 0.9553 1.0000 0.4938 -0.0627 41.0000 41.0000 41.0000 41.0000 c :>data SAVE FIG2A VARIABLES Y X1 X2 X3 TYPE=CORRELATION RUN 1.0000 1~0000 -0.4336 0.6448 -0.1900 1.0000 0.4938 -0.0627 0.9553 RUN USE FIG2A PRINT RUN QUIT c:>mglh USE FIG2A PRINT=LONG FORMAT=6 MODEL Y=Xl+X2+X3 / N=41 ESTIMATE QUIT RUN; PROC PRINT; RUN; PROC REG DATA=C.FIG2A; MODEL Y =X1 X2 X3 / NOINT VIF TOL COLLIN; RUN; SAS CODE TO RUN FIG. 2(B) DATA C .FIG2B (TYPE=CORR); INPUT (Y X1 X2 X3 N) (8.4); -TYPE- = ' CORR ' ; LENGTH -NAME- $ 8 ; IF -N-=l THEN -NAME-= 'Y'; IF -N-=2 THEN -NAME-= 'Xl'; IF -N-=3 THEN -NAME-= 'X2'; IF -N-=4 THEN -NAME-= 'X3'; IF -N-=5 THEN -NAME-= 'N'; IF -N- = 5 THEN -TYPE- = 'N'; CARDS ; -0.4336 0.6448 1.0000 1.0000 -0.1900 -0.4336 1.0000 0.6448 -0.1900 0.9553 0.4938 0.0000 41.0000 41.0000 41.0000 0 1996 British Ecological Society, Functional Ecology, 10,421431 RUN; PROC PRINT; RUN; PROC REG DATA=C.FIG2B; MODEL Y = X 1 X2 X3 / NOINT VIF TOL COLLIN; RUN; 1.0000 SYSTAT CODE FOR FIG. 2(B). c :>data SAVE FIG2B VARIABLES Y X1 X2 X3 TYPE=CORRELATION RUN 1.0000 1.0000 -0.4336 0.6448 -0.1900 1.0000 0.4938 0.0000 0.9553 RUN USE FIG2B PRINT RUN QUIT c : >mglh USE FIG2B PRINT=LONG FORMAT=6 MODEL Y=Xl+X2+X3 / N=41 ESTIMATE QUIT 1.0000
© Copyright 2026 Paperzz