Inferring multiple causality: the limitations of path

Functional
Ecology 1996
10,421-43 1
ESSAY REVIEW
Inferring multiple causality: the limitations of path
analysis
P. S. PETRAITIS," A. E. DUNHAM* and P. H. NIEWIAROWSKI?
*Department of Biology, University of Pennsylvania, Philadelphia, PA 19104-6018 and ?-Universityof Akron,
Department of Biology, Akron, OH 44325-3908, USA
Summary
1. Review of published ecological studies using path analysis suggests that path analysis is often misused and collinearity is a moderate to severe problem in a majority of
the analyses.
2. Multiple regression is shown to be related to path analysis, and thus collinearity,
which is a well-known problem in regression analysis, can also be a problem in path
analysis. Collinearity occurs when independent variables are highly correlated and
causes estimates of standardized partial regression coefficients and of path coefficients
to be less precise, less accurate and prone to rounding error.
3. A review of 24 path analyses in 12 papers revealed incorrectly calculated path coefficients in 13 cases and problems with collinearity in 15 cases. Estimates of path coefficients are biased and lack precision if there is collinearity.
4. An additional 40 path analyses could not be evaluated because of incomplete information. Most studies lack sufficient sample size to justify the use of path analysis.
Problems associated with the use of categorical variables, which were used in six
cases, are unappreciated.
5. These problems suggest that path coefficients may often mislead ecologists about
the relative importance of ecological processes.
Key-words: Collinearity, multiple regression, relative importance, statistical analysis
Functional Ecology (1996) 10,42143 1
Introduction
0 1996 British
Ecological Society
The dynamics of many important ecological and evolutionary processes are often influenced by multiple
interacting factors (Quinn & Dunham 1983), and
attempting to understand the dynamics of such systems presents difficult technical and philosophical
problems. Evolutionary ecologists have recently
turned to path analysis, multiple regression and
related techniques to analyse systems of multiple
causality. While these techniques are often seen as
competitors, they are, in fact, related, and as a result
the problems of these techniques are similar.
The use of path analysis as a means of analysing
systems involving multiple causality has become
increasingly prevalent in evolutionary ecology. Path
analysis was developed in the 1920s by Wright (1920,
1934) to analyse systems of multiple causality, but in
spite of its appeal, it was not widely used until
recently (see Power 1972 and Arnold 1972 for early
examples of the use of path analysis in ecology). With
the advent of fast and powerful desk-top computers,
path analysis and related methods are rapidly becom-
ing favoured techniques for the inference of multiple
causal interrelationships among variables in ecology
and evolutionary biology (Mitchell-Olds & Shaw
1987; Lynch 1988; Lynch &Arnold 1988; Schemske
& Horvitz 1988; Crespi & Bookstein 1989; Crespi
1990; Farris & Lechowicz 1990; Johnson, Huggins &
DeNoyelles 1991; Kingsolver & Schemske 1991;
Mitchell 1992). There are, however, a number of
important limitations and pitfalls of path analysis and
related methods that seem largely unappreciated (but
see Mitchell-Olds & Shaw 1987 and James &
McCulloch 1990).
We point out some of these dangers and suggest
that path analysis cannot adequately answer many of
the questions that evolutionary ecologists wish to
ask. Our review begins by placing path analysis in
the context of multiple regression techniques and
least-squares estimation. We do not discuss the more
commonly cited difficulties, such as the requirements of linearity and homogeneity of variances or
the use of predictor variables that are measured with
errors. These issues are well known and extensively
discussed in the statistical literature (Myers 1990
422
provides one of the clearest introductions to these
problems). We are more concerned with explaining
several unappreciated pitfalls that arise in the study
of complex systems in which the causal mechanisms
are poorly known and in which the suspected causes
may be highly intercorrelated. Such conceptual
shortcomings often blur the distinction between
hypothesis testing and exploratory analysis in the
minds of many ecologists, and so we discuss how
path analysis cannot overcome a lack of understanding about specific mechanisms or replace explicit
experimental tests of alternative hypotheses about
these mechanisms.
Intercorrelated factors also present analytical problems, which come under the rubric of collinearity.
Collinearity is not an unknown problem to ecologists
(see Mitchell-Olds & Shaw 1987 and James &
McCulloch 1990 for recent critiques in evolutionary
ecology), yet diagnostics for detecting collinearity are
not widely used in evolutionary ecology. Thus the
third section discusses the problems and detection of
collinearity.
Finally, we review some of the better-known
studies in which path analyses have been used and
demonstrate that collinearity compromises the biological interpretations in many cases. In reviewing the
literature, we were surprised by misuse of categorical
data, presence of pseudoreplication and problems
with inadequate sample size, and so, in passing, we
also comment on these problems.
P. S. Petraitis
et al.
Links between multiple regression and path
analysis
Multiple regression models are used to fit the best
functional relationship between a single criterion variable (Y) and two or more predictor variables (X,,
X2,...). The 'functional relationship' between predictors and criterion is assumed to be linear (Y= PIXl +
P2X2 + ...). Furthermore it is assumed that the ~ sthe
,
errors associated with observing the Xis, are distributed normally and are not correlated. If these
assumptions are met, then ordinary least-squares
methods will then give the 'best' estimates of the
linear relationships (i.e. the estimators will be unbiased and have minimum variance, Mood & Graybill
1963, Myers 1990). If the predictor and criterion variables are centred by subtracting their means and
scaled by dividing through with their standard deviations, then the best estimates of linear relationships
are standardized partial regression coefficients. For
three predictor variables, XI, X2, and X3, and a criterion variable Y, the functional relationship between
the standardized variables is then
y = blxl
. .+ b2x2
- - + blx3
. .+ E ,
O 1996British
Ecological Society,
Functional ~
~
10,421431
w,
eqn 1
where x and y are the standardized variables and the
bis~are the lstandardized
partial
regression
~
~
~
, coefficients
(beta weights). The least-squares estimate of the beta
weights can be found by solving the set of normal
equations, which describes the relationship between
all pairwise correlations and the standardized regression coefficients. For our three standardized predictor
variables, xl, x2, and x,, and one standardized criterion
variable y, the normal equations are:
eqn 2
The correlation structure can profoundly affect the
estimates of the beta weights because the estimates
are adjusted for the correlations among the predictor
variables, even though these correlations do not
appear in the regression equation (i.e. equation 1).
This can be easily seen by plotting the size of the beta
weights of two predictors against the strength of the
correlation between the predictors (Fig. 1). Suppose
we start with the situation where x, and x2 are uncorrelated and r,, = -r2, and thus the regression coefficients b, and b2 equal r,, and -r2,, respectively. The
coefficients, however, change as the correlation
between x, and x2 changes (Fig. 1).
The dependence of regression and path coefficients
on the correlation structure is often glossed over. Yet
this dependence means that inferences about the
strengths of path coefficients are conditional upon the
correlational structure among the predictor variables.
If the correlational structure of the sample does not
match the correlational structure of the population then
regression and path coefficients will not estimate
relative strengths accurately. This can occur if
sampling is not random or if analysis is based on data
from experiments with factorial designs, which break
the correlational structure among the predictor
Fig. 1. The effect of correlational structure on the magnitude
of beta weights. In this example, it is assumed that the beta
weights are equal in magnitude but opposite in sign. When
x, and x2 are uncorrelated, then b, = r,, and b2= r2,. The correlation r,, is set equal to 0.3 when rI2= 0.
423
Limitations of
path analysis
variables (e.g. Campbell & Halama 1993; EnglishLoeb, Karban & Hougen-Eitzman 1993; Wootton
1994).
One additional point is worth mentioning. The
number o f paired observations affects the estimate o f
the standardized partial regression coefficients o f categorical variables because the variance is standardized to 1. Standardization o f a two-state categorical
variable (scored 0 or 1 ) means that displacement o f
the scores 0 and 1, in standard deviation units, varies
with the number o f observations. For a regression
done in the context o f an analysis o f variance, the
number o f observations equals the number o f treatments (see Sokal & Rohlf 1981). The difference
between two treatments, such as the presence and
absence o f a predator, is [ ( n - l ) l n ~ "standard
~
deviation units where n = 2 for the two treatments. Changes
in the number o f treatments, which is often less than
five, will thus dramatically alter beta weights.
As in multiple linear regression, conventional path
analysis starts with a defined structure o f hypothesized
causal relationships (paths), usually represented as
causal diagrams, and then uses the correlations among
all the variables to estimate the strength o f the paths.
The variables are standardized as in standardized
multiple regression. Thus in Fig. 2(a), there are the
three predictor variables and one criterion variable,
and the observed correlation between x1 and y is due
not only to the direct effect o f x , on y (path p,,) but
also the indirect effect o f x, via x, and x, because the
three predictors are correlated. For example, the correlation betweenx, and y in terms o f path coefficients,is:
'-1,
Fig. 2. Comparison of beta weights and path coefficients and their standard errors. The
correlations are from Box 16.1 in Sokal & Rohlf (1981), which we used because data
are readily available. Fig. 2(a) is the path diagram for the multiple regression model.
Double-headed arrows represent the correlations among the predictor variables and
single-headed arrows are the standardized partial regression coefficients. (b) is the path
diagram in which there is no causal link between x, and x,. (c) gives the path diagram
of the multiple regression model in which x3 has been dropped to reduce collinearity.
Sample size is 41. All paths significant except eachp,, in (a) and (b).
= P,, + py2r12 + py,r1,.
eqn 3
Equation 3 is identical to the first o f the three normal
equations except that pyi s replace the bis. I f the path
diagram posits all possible pairwise correlations
among the predictors, then path coefficients are
merely standardized partial regression coefficients.
Note that multiple regression is simply a path analysis
with a particular structure and, per force, the same
restrictions apply to both analyses. Calling one analysis a path analysis and another a regression analysis
does not alter the conditions required for parameter
estimation. Figure 2(a) is, in fact, the path diagram for
a multiple regression.
Now suppose we believed apriori that the path diagram does not include all the causal and correlational
links found in a multiple regression. For example,
suppose we assume that no link exists between x, and
X , either directly between x1 and x, or indirectly via x,
(Fig.2b). This structural constraint implies rl, equals
zero, and the path diagram, or model, is called overidentified because there are fewer paths to be estimated than observed correlations. To solve for the
paths, r13in the normal equations is set to zero regardless o f the observed value for TI,. This is a structural
zero forced by our hypothesis o f no interaction
between x1 and x,. Predefined constraints, which are
determined prior to calculating the path coefficients,
cause the path coefficients to differ from the beta
weights o f a multiple regression. This can be clearly
seen by comparing Fig. 2(a) and Fig. 2(b).
It has been often suggested that path coefficients
can be calculated by carrying out repeated multiple
regression analyses on appropriate subsets o f
variables (e.g. Dillon & Goldstein 1984; Fanis &
Lechowicz 1990; Mitchell 1992). This is often true,
but the example in Fig. 2(b)cannot be analysed in this
424
P. S. Petraitis
et al.
manner. In this example, the correlation between xl
and x3 is assumed to be zero based on the structure of
the path diagram, and this correlation must be set
equal to zero in the normal equations. Carrying out
multiple regressions for subsets of variables using the
raw data is not appropriate, and Fig. 2(b) can only be
analysed by defining the correct correlatioh matrix if
regression procedures in conventional statistical packages such as SAS or SYSTAT are used (see Appendix
for SAS and SYSTAT codes for running the examples
in Fig 2a and b) .
Hypothesis testing vs exploratory analysis
O 1996 British
Ecological Society,
Funcfional ~
~
10,421431
Several recent studies have presented reviews demonstrating and advocating the use of path analysis to
study systems of multiple causality in ecology and
evolution (Schemske & Horvitz 1988; Kingsolver &
Schemske 1991; Mitchell 1992, 1993; Wootton
1994). Overall, these reviews present fairly thorough
treatments of the major assumptions of the technique
as well as the potential strengths of the approach compared with multiple regression. But our review of
studies actually employing path analysis reveals substantial confusion about the rationale for using this
approach and the limitations on the inferences about
causality that may be drawn.
We suggest that the main confusion associated with
path analysis is that inferences about 'causal' pathways are misleading. Demonstration of causation
must be made external to the statistical process of path
analysis. In other words, finding a significant fit of a
path model to a data set does not demonstrate that
relationships among variables are causal (Mitchell
1993). Rather, the path model becomes an hypothesis
subject to experimental verification.
Kingsolver & Schemske (1991) and Mitchell
(1992, 1993) emphasize two main applications of path
analysis: exploratory data analysis and formal hypothesis testing (statistical adequacy of a proposed causal
model). We view these two uses as alternatives and
suggest that inappropriate application of path analysis
has arisen frequently from a lack of distinction
between them.
The feature that distinguishes formal hypothesis
testing is the presentation of a formal path model that
is not derived from a data set that is itself the object of
the path analysis. In other words, a model of causal
and correlational pathways is presented a priori. It is
separate from the data that will be used to estimate
strengths of paths in the model. The model must be
tested by collecting new data.
This approach, however, poses a dilemma because
one must choose between observed correlations based
on new data and inferences about causal links that
were made before the data were collected. For example, suppose that we assumed a model as in Fig. 2(b)
where
to be
~ r13is
l assumed
~
~ zero,~but new, data showed
that r,,, even though small, was significantly different
from zero. There are three alternatives. First, accept
the data as providing a good test of the model and
reject the model. We believe this is the correct procedure. Second, use the widely accepted 'arm-waving'
approach, ignore the new data and hold on to the
model in spite of its falsification. Third, add or delete
paths until a better fit is found. At this point, we slip
into an a posteriori approach to path analysis that has
the same difficulties as stepwise regression (see comments by James & McCulloch 1990).
In contrast to testing an a priori path model, a path
diagram can be created aposteriori by adding or dropping variables until a fit that maximizes the proportion
of variation explained is found. This is an exercise in
formalizing, in a quantitative sense, hypotheses about
the pattern of interactions among a set of correlated
variables. Good hypotheses may spring from imagination, a posteriori path analyses, or various other
Muses. Use of path analysis in this context, like our
use of the Muses, must be carefully distinguished
from hypothesis testing. Hallmarks of the a posteriori
approach include lack of presentation and justification
of an explicit path model before analysis (e.g. Fjeld &
Rognerud 1993). More subtly, studies that present
only vague descriptions and justifications of a path
model as well as those that present path diagrams
based solely on previously collected correlative data
(e.g. Eriksson & Bremer 1993) have more in common
with exploratory path analysis than with hypothesis
testing.
Under the a posteriori approach, the dependence of
path coefficients on the correlational structure
becomes a crucial issue. Recall that Fig. 1 shows how
the magnitude of the coefficient changes with the correlation among the predictor variables. Thus our
choice of causal links will determine the magnitude of
the path coefficients. A firmly grounded causal model
prior to the collection of the data is a must if we are to
have any confidence in the results of a path analysis.
Yet post hoe models seem to be the norm in ecology
(several recent examples are Johnson et al. 1991;
Wesser & Armbruster 1991; Walker et al. 1994;
Wootton 1994). Such studies, which mix a posteriori
exploratory analysis and hypothesis testing, commit
an error that has little to do with the specific assumptions of path analysis, but instead reveal a misunderstanding about the role of statistics in inferring causal
relationships.
Collinearity: the Medusa of regression analysis?
While a variety of methodological problems plague
both multiple regression and path analysis, collinearity is especially troublesome for evolutionary ecologists who wish to place some confidence in the values
assigned to path coefficients. Collinearity occurs
when there are large correlations among some of the
predictor variables. However, examination of all pairwise correlations will not ensure that collinearity is
425
Limitations of
path analysis
Fig.3. Effect of collinearity on confidence intervals of
partial regression coefficients (P). A thousand X,, X2 pairs
were generated with correlation coefficients of 0.47, 0.86,
0.96, respectively. From each sample of 1000 doublets, four
sets of 50 pairs were chosen at random and used to calculate
Y according to the equation: Y = 0,3X, + 0.4X2 + E where & is
a random error term drawn from a uniform distribution
bounded between -100 and +100. The partial regression
coefficients for X, and X2 were estimated using ordinary
least squares. Solid lines show the true parametric values of
p2 and p,. Symbols and bars show mean values from the
four samples and the mean 95% confidence intervals,
respectively.
O 1996 British
Ecological Society,
FunctionalEcology,
10,421431
detected, and successful detection requires the use of a
variety of diagnostics (Myers 1990), which are related
to the effects of collinearity on the coefficients.
Correlated predictors have two distinct effects on
the partial regression coefficients (and per force path
coefficients). First, collinearity among predictor variables increases the standard errors of estimated partial
regression coefficients, resulting in a lack of stability
of path coefficients between samples (Asher 1983;
Fig.3). Large standard errors of path coefficients
mean that estimated strengths of effects between
variables may vary substantially between samples.
This means less precision. One result is that, as
collinearity increases, the ability to detect a significant effect (statistically non-zero path coefficient) is
reduced (compare 95% confidence intervals around
regression coefficients with r12= 0.96 vs 0.47 in
Fig. 3). This is an especially important result when
individual t-tests are used to test the significance of
individual path coefficients. In other words, inability
to reject the hypothesis that a path coefficient is nonzero may result from very low power (large standard
error), rather than the unimportance of an effect.
Therefore, even though increased confidence intervals
around partial regression estimates increase with
collinearity, protecting against a type I error, incorrect
inferences (i.e. a particular path effect is zero) about
the importance of a particular effect may still be
drawn.
The inflation of the standard error can be used to
screen for collinearity. If the predictor variables are
not correlated, then the variance of each regression
coefficient will equal the error variance (i.e. Var(b,) =
02, Myers 1990). Correlations among predictors
inflate the variances of the coefficients, and variance
inflation factors (VIFs), which equal var(b1)lo2, are
used as indexes of the loss in precision. The VIF of the
it, predictor is calculated by regressing the ith predictor against the remaining predictors. VIF equals
ll(1-R;) where R: is the coefficient of multiple determination of this regression.
There are no hard and fast rules for how large VIFs
can be before collinearity becomes a problem. Myers
(1990) suggests all VIFs should be less than 10, but
Freund & Littell (1986) suggest all R:S should be
smaller than the R2 of the complete model. For our
artificial example in Fig. 2(a), the VIFs are 1.25 forb,,
14.28 for b2 and 13.70 for 6,. Thus the variances of b,
and b2 are roughly 11 times the expected error variance. As expected, the standard errors for b1 and b2
are large (0.380 and 0.374, respectively), and there is
a loss in precision.
The second effect of collinearity is an increase in
the absolute magnitude of the partial regression coefficients (Myers 1990). This means less accuracy. A
bias in the beta weight is often the most noticeable.
For example, a multiple regression, as in Fig. 2A, in
which one or more of the beta weights are greater than
one in absolute value may be due to collinearity.
Large values for beta weights, however, do not necessarily imply collinearity. The estimation bias depends
on the magnitude of the eigenvalues of the predictor
variable correlation matrix such that the relationship
between the expected estimates and the parametric
values is:
E(Ci bI2)=Xi
+ 02C1ll&,
eqn 4
where bi = the estimated regression coefficients, pi =
their parametric values, o2= the model error variance
and ki = the eigenvalues (Myers 1990).
The bias can be considerable. For example, two of
the predictors used in the multiple regression in
Fig. 2(a) are highly correlated and two of the three
beta weights are greater than one in absolute value.
The eigenvalues of the predictor correlation matrix
are 1.9878, 0.9761 and 0.0361. Thus Xi b2 overestimates C,
by 29.2302! Normally Ci b2 overestimates
Xi Pi2 by 302 if all three predictors are not correlated
because eigenvalues for matrices of uncorrelated predictors equal one.
This bias can be used to detect collinearity by calculating a condition index, which is hl/Ai, for each predictor variable. Myers (1990) suggests collinearity
will be a problem if the index is over 1000. Note that
SAS and SYSTAT use the square root of this ratio,
which is also called the condition index, and suggest
using 30 as the cut-off point. Based on Mason &
Perreault's (1991) simulations, we suspect that the
426
P. S. Petraitis
et al.
cut-off is set far too high for ecological analyses. They
found serious problems with collinearity with condition indexes as low as 3.4 when the sample size was
less that 100 and R2 was less than 0.75, which are typical values for ecological studies. In Fig. 2(a), for
example, the largest condition index is only 7.42
(square root of 1.9878/0.0361), which is well below
30, but the beta weights appear to be unusually large.
The problems with collinearity, in particular unusually large beta weights, are often resolved by dropping
one or more of the predictors. This seems to make
sense to many researchers (see comments of MitchellOlds & Shaw 1987). Clearly if two variables are highly
correlated, then either should serve equally well to predict change in the criterion variable. In Fig. 2(c) for
example, we dropped x3 because of its strong correlation with x, but weak correlation with x,. Dropping the
third variable has a small effect on the amount of the
variation that is explained: the coefficient of multiple
determination, R2,equals 0.6125 if all three predictor
variables are considered and drops to 0.5 161 if only x,
and x, are included. There is, however, quite a change
in the beta weights, and b, goes from +1.7111 to
+0.5835, which underscores the difficulties of inferring the relative strengths of several predictors when
they are highly intercorrelated.
Dropping predictor variables is not only intellectually dishonest (Philippi 1993), it also contaminates the
remaining predictors (Box 1966). Suppose we believe
that the true causal relationship involves all three predictors, but we drop x3 to limit the effects of collinearity. In spite of this, the estimates of the regression
coefficients b, and b, remain contaminated by the
eliminated predictor. For example, suppose the true
model and expected value of y[E(y)] were
eqn 5
but with x, and x3 highly correlated as is the case in
Fig. 2(c). If we drop x,, our expected values for b, and
b, become:
P3
Note that in equation 6 the quantities by which are
multiplied are partial correlations and are the standardized partial regression coefficients for x, and x, if
we regressed x3 on x, and x,.
Review of path analyses in ecology
It is our impression that path analysis has been widely
misused by ecologists. We reviewed over 64 path
analyses in 26 papers (Tables 1 and 2). Our review is
by no means complete. We relied on citations in published papers and listings in Current Contents for references.
Since our main concern was the use of path
O 1996 British
analysis
in
~
~
Society,
~
l
~
~
i
~ecology,
~
lwe did not include papers dealing
with unmeasured variables (e.g. the measurement of
Functional~cology,
10,421431
selection, see Mitchell-Olds & Shaw 1987; Crespi &
Table 1. Papers with path analyses that were examined but
not included in review of collinearity. Analyses were
rejected for review because they lacked either a table of
correlations or an explicit diagram with path coefficients
Source
1
2
3
Campbell & Halama (1993)
English-Loeb et al. (1993)
Fams & Lechowicz (1990)
Herbers (1990)
Herrera (1993)
Johnson et al. (1991)
Jordan (1992)
Jordan (1993)
King (1993)
Maddox & Antonovics (1983)
Mitchell (1993)
Mitchell-Olds (1987)
Sinervo (1990)
Stanton et al. (1991)
Winn & Werner (1987)
Wootton (1994)
1. Number of path analyses in paper: note that Winn &
Werner (1987) do not provide sufficient data to determine
the number of analyses done.
2. Categorical variables or variables defined by treatment
levels included in analysis? N, no, Y, yes.
3. Sample sizetnumber of predicted paths. ?, sample size
missing or sample size or number of paths not explicit for
one or more analyses.
Bookstein 1989; Crespi 1990) or articles outside the
mainstream of ecology (e.g. use of path analysis in an
agricultural context).
Authors analysed path diagrams by hand (two
papers) or by using a LISREL type of program (six
papers) or a multiple regression package (18 papers).
LISREL, which stands for linear structural relations
(Joreskog & Sorbom 1988), is a statistical package
that can implement a maximum-likelihood procedure
to estimate the strengths of direct and indirect effects
(what we have been calling beta weights or path coefficients up to now). The procedure is quite versatile,
and the model can include predictor and criterion variables that are unmeasured. The equivalent procedure
in SAS is called PROC CALIS (SAS Institute Inc.
1990).
We were surprised by the low sample size in nearly
all cases. Sample size needs to be large to ensure
stable parameter estimates. With a small sample size,
successive observations are likely to change parameter estimates merely as a result of random sampling
from the underlying 'true' population variancecovariance structure. As a general rule of thumb, sample size should be at least five to 20 times larger than
the number of estimated paths to ensure reliable
results (SAS Institute Inc. 1990). We found the ratio
of sample size to number of paths ranged from below
2 to about 90 (see Tables 1 and 2). The ratio appeared
to be < 5 for the majority of analyses. In many cases,
the sample size was not explicitly stated, and in the
427
Limitations of
path analysis
Table 2. Survey of collinearity and other problems in published path analyses for which correlations and path diagrams were
provided
Source
1
2
3
4
5
6
7
8
Arnold (1972) T I , F2
Eriksson & Bremer (1993) T2, F1
Fjeld & Rognerud (1993) T2, F4
Johnson (1975) T3, F4
Mitchell (1992) T I , F1A
Mitchell (1992) T I , FIB
Mitchell (1993) T10.1, F10.3
Power (1972) T2, F2
Roach (1986) T3, F2
Schemske & Horvitz (1988) A, F3
Schemske & Horvitz (1988) A, F4
Walker et al. (1994) Ala, F2
Walker et al. (1994) Alb, F2
Walker et al. (1994) Alc, F2
Walker et al. (1994) Ald, F2
Walker et al. (1994) Ale, F2
Wesser & Armbruster (1991) A2, T2
Wesser & Armbmster (1991) A2, T3a
Wesser & Armbruster (1991) A2, T3b
Wootton (1994) T3, F4A
Wootton (1994) T3, F4B
Wootton (1994) T3, F4E
Wootton (1994) T3, F4D
Wootton (1994) T3, F4C
Source: Author, year, correlation table, path coeffiecients. T, table, F, figure, A, appendix.
1. Included categegorical variables or variables defined by treatment levels? N, no, Y, yes.
2. Sample sizelnumber of predicted paths to the nearest integer.
3. Number of miscalculated pathsltotal number of paths. ?, problems with estimates.
4. Number of regressions incorrectly doneltotal number of regressions.
5. Largest condition index.
6. Largest VIF.
7. Number of VIFs > model VIFItotal number of paths from multiple regressions.
8. Number of multiple regressions for which at least one VIF > model VIF/total number of multiple regressions.
O 1996 British
Ecological Society,
~
~
~
10,421-431
paper by Jordan (1993), we could not find any mention of sample size. We also disregarded pseudoreplication in our tally of sample sizes and used samples
sizes as stated by the authors. Although we were not
looking for it, we found two clear cases of what
Hurlbert (1984) calls temporal pseudoreplication.
Johnson et al. (1991) stated they had 64 observations,
but in fact used eight mesocosms and sampled each
one eight times. Wootton (1994, his Table 2) gives his
sample size as 20, but his experiment involved 10
cages, each of which was sampled in 2 different years.
Categorical variables or results from experiments
with fixed treatment levels were used in nine papers,
and none of the authors seems to have understood how
these types of variables fail to account for the range of
natural variation. For example, consider a two-state
categorical variable. The scoring is arbitrary because
the variance is standardized to unity. Thus experimental manipulation of the absence and presence of a
predator is usually scored as 0 and 1 (e.g. Wootton
1994) but could just as well be scored as -100 and
+I00 with no change in the beta weight or path coefficient.~ This means
when
techniques
t
j manipulative
~
~
~ are
used to control treatment levels, we have no way of
knowing if the range of treatments spans 1, 2 or 10
standard deviations of the normal variation seen in the
field. We suspect that use of categorical variables and
fixed treatment levels generally inflate estimates of
path coefficients because the ranges of categories and
treatments (e.g. the control of the presence or absence
of a predator through the use of cages) are much larger
than the implied range of one standard deviation (see
Petraitis 1996).
We were able to evaluate 24 path analyses in 12
papers for collinearity (Table 2). Tests for collinearity
required us to re-run the path analyses, and to do this
we needed a clear path diagram and the correlations
among the variables. We found over 40 additional
path analyses, but these lacked either a clear path diagram or the required correlations (Table l). Using the
correlation matrix for each path diagram, we calculated condition indexes, variance inflation factors and
path coefficients. We used regression procedures in
SAS and SYSTAT to analyse the data. This involved
calculating 226 paths spread over 9 1 regressions.
We found incorrect path coefficients in 13 of the
dis-~
~24 path ~analyses ~(Table 2).
~ This does
~ not include
~
crepancies in estimates, which were often greater
~
428
P. S. Petraitis
et al.
O 1996 British
Ecological Society,
Functional ~
~
10,421431
than 20% and caused by the effects of collinearity
on accumulated rounding errors. In seven cases
(one in Power 1972, one in Johnson 1975 and five
in Walker et al. 1994), the authors used an overall
multiple regression even though the path diagram
showed no links between two or more of the independent variables. Usually the path diagram was
missing correlational paths among variables, but
the path coefficients suggested these correlations
were included in the path analysis. We were able to
duplicate Power's (1972) and Johnson's (1975)
estimates by including the correlations missing
from the path diagrams. We were unable to duplicate the estimates in Walker et al. (1994) or in three
other analyses (one from Eriksson & Bremer 1993
and two from Wootton 1994). We assume typographical errors, either in the correlation table or in
the path diagram, are the most likely cause in these
cases. Finally, we could not identify the source of
the discrepancies in the three remaining cases,
which are from Wesser & Armbruster (1991). One
of their analyses has serious collinearity, and
rounding errors may have prevented us from recovering the published estimates in our re-analysis.
However, the three path analyses contain six subanalyses that are shownas simple regressions, and
thus the path coefficients should be the simple correlations. Five of the six path coefficients do not
match the reported correlations.
Collinearity appears to be a moderate to serious
problem in 15 of 23 analyses and affects 22% of the
path coefficients (Table 2, the number of VIFs is
greater than model R ~ )When
.
collinearity was present, we found our estimates of path coefficients
would have the same rank order as the published
values but were often 20% greater than and sometimes double the published values. The condition
indexes and VIFs were well above the recommended cut-off points in two papers (Wesser &
Armbruster 1991; Wootton 1994). Ironically, these
authors checked for collinearity. Wootton (1994)
found it, but then dropped variables. Wesser &
Armbruster (1991) failed to detect it. This suggests
collinearity is a moderately common problem. Since
it undermines the precision and accuracy of estimates of path coefficients, we believe most published path analyses in ecology should be viewed
with extreme caution.
Finally we would like to note the three good examples of path analysis we found and suggest they serve
as guides to those who wish to do path analysis. Roach
(1986) and Mitchell (1992, 1993) have sufficient sample size, appear to have done the analysis correctly,
and have little or no problem with collineasity (see
Table2). Sample sizes are clearly stated. Correlation
tables, path diagrams, and path coefficients are provided. The data are not from experimental manipulations
variables,
and
~ and ldo not~ include~ categorical
~
,
thus these problems are avoided.
Discussion
Evolutionary ecologists hope to estimate the relative
strengths of competing causes of biological phenomena. Path analysis and multiple regression are useful
tools for analysing multiple causality, but they cannot
overcome the limitations imposed by highly correlated observations, inappropriate data and poorly conceptualized models. Collinearity causes beta weights
to be less accurate, less precise and more sensitive to
small errors in measurement and rounding. Use of categorical variables, non-random sampling, and use of
data from manipulative experiments prevents the v a t ance-covasiance structure of the sample from matching the variance-covariance structure of the population. Path and regression estimates from such samples
may be accurate and precise, but cannot be generalized to the population as a whole. Using path analysis
does not improve the situation because multiple
regression is simply a particular type of path analysis.
Our review suggests there is cause for concern.
Some (Kingsolver & Schemske 1991; Mitchell
1992) have suggested that path models, and in particular models with unmeasured variables, could be better
analysed using approaches such as LISREL. We
doubt that this procedure will be of much help to
evolutionary ecology. Implementation of LISREL
requires us to specify eight matrices in total (Dillon &
Goldstein 1984). These matrices identify the links
between all the variables (i.e. the paths) and the
variance-covasiance structures of the measured
variables and of their errors of measurement. Paths
(i.e. non-zero entries in the matrices) can be fixed,
constrained to equal a combination of other parameters, or left unconstrained. If all paths are left unconstrained, LISREL assumes certain constraints on
some paths and then estimates the remaining ones.
LISREL solves for the set of 'path coefficients' that
best reproduces the observed variance-covariance
matrix of the measured variables.
Testing the fit of the observed variance-covariance
matrix to the predicted variance-covariance matrix is
carried out by a X2 likelihood ratio (Dillon &
Goldstein 1984). The test assumes a large sample of
independent observations, no missing or categorical
data, and multinomality of all variables. The models
must be linear and additive. All of the cases that we
examined that used LISREL or a likelihood ratio test
failed to meet one or more of these assumptions.
It seems clear that, in most cases, the requisite underlying theory is not available to specify the mechanisms
that link variables a priori nor do the data meet the
assumptions of the model and its test. The problem of
model definition is serious: failure to d e f i e a sufficiently
restrictive model can lead to several sets of 'path coefficients' giving the same test value. One could 'adjust' the
model by adding paths suggested by the observed correlation structure or by deleting less 'interesting' paths
(e.g. a sequence of paths that connects two variables that
429
Limitations of
path analysis
are weakly correlated) until an unique solution can be
found. Once this is done, data, not theory, drive the
model, and the analysis becomes nothing more than 'a
limitless exercise in data snooping contributing little, if
anything, to scientific progress' (Dillon & Goldstein
1984). Given the potential for problems, we believe that
LISREL does not offer a widely applicable solution for
addressing multiple causality in evolutionary ecology,
even though LISREL may be able to overcome some of
the problems of collinearity.
Several biased estimation techniques have been
proposed as a way to combat collinearity (Myers
1990). These are controversial methods. Ridge regression, for example, biases the regression coefficients in
order to improve the model's ability to predict the criterion variable. The estimates of the individual regression coefficients may be very biased even though
when taken together they provide a very good prediction of the criterion variable. Other methods, such as
the ridge trace method and the df trace method, stabilize the individual coefficients with minimal bias but
at the expense of prediction. It seems to us the latter
set of methods is more appropriate for ecological data
because ecologists are usually interested in the relative strengths of coefficients.
Ultimately, our understanding of relative strengths of
ecological processes depends on clearly defined models of causality. Path analysis and LISREL will correctly estimate the relative strengths of competing factors only for the model under consideration. Dropping
highly correlated predictors or deleting paths contaminates the remaining estimates, and thus throws in to
question the meaning of 'relative strengths'. While path
analysis and LISREL do offer structured approaches to
finding patterns that may serve as the basis for hypotheses, these methods cannot by themselves confirm or
disprove the existence of causal links.
Acknowledgements
We thank Joe Bernardo, Johan Bring, Evan Cooch,
Peter Fairweather, Randy Mitchell, Steve Heard and
Joe Travis for their comments and insights.
References
O 1996 British
Ecological Society,
Functional Ecology,
10,421431
Arnold, S.J. (1972) Species densities of predators and their
prey. American Naturalist 106,220-236.
Asher, H.B. (1983) Causal Modeling. Sage Publications,
Newbury Park, CA.
Box, G.E.P. (1966) Use and abuse of regression.
Technometrics 8,625-629.
Campbell, D.R. & Halama, K.J. (1993) Resource and pollen
limitations to lifetime seed production in a natural plant
population. Ecology 74, 1043-105 1.
Crespi, B.J. (1990) Measuring the effect of natural selection
on phenotypic interaction systems. American Naturalist
135,3247,
Crespi, B.J. & Bookstein, F.L. (1989) A path-analytic model
for the measurement of selection on morphology.
Evolution 43,18-28.
Dillon, W.R. & Goldstein, M. (1984) Multivariate Analysis.
Wiley & Sons, New York, NY, USA.
English-Loeb, G.M., Karban, R. & Hougen-Eitzman, D.
(1993) Direct and indirect competition between spider
mites feeding on grapes. Ecological Applications 3,
699-307.
Eriksson, 0. & Bremer, B. (1993) Genet dynamics of a
clonal plant Rubus saxatilis. Journal of Ecology 81,
533-542.
Farris, M.A. & Lechowicz, M.J. (1990) Functional interactions among traits that determine reproductive success in
a native annual plant. Ecology 7,548-557.
Fjeld, E. & Rognerud, S. (1993) Use of path analysis to
investigate mercury accumulation in brown trout (Salmo
trutta) in Norway and the influence of environmental factors. Canadian Journal of Fisheries and Aquatic Science
50,1158-1167.
Freund, R.J. & Littell, R.C. (1986) SAS System Linear
Regression, 1986 edition. SAS Institute Inc., Cary, NC,
USA.
Herbers, J.M. (1990) Reproductive investment and allocation ratios for the ant Leptothorax longispinosus: sorting
out the variation. American Naturalist 136, 178-208.
Herrera, C.M. (1993) Selection on floral morphology and
environmental determinants of fecundity in a hawk mothpollinated violet. Ecological Monographs 63,25 1-275.
Hurlbert, S.H. (1984) Pseudoreplication and the design of
ecological field experiments. Ecological Monographs 54,
187-211.
James, F.C. & McCulloch, C.E. (1990) Multivariate analysis
in ecology and systematics: Panacea or Pandora's box?
Annual Review of Ecology and Systematics 21, 129-166.
Johnson, M.L., Huggins, D.G. & DeNoyelles, F., Jr. (1991)
Ecosystem modeling with LISREL: a new approach for
measuring direct and indirect effects. Ecological
Applications 1, 383-398.
Johnson, N.K. (1975) Controls of number of bird species on
montane islands in the great basin. Evolution 29,
545-567.
Jordan, N. (1992) Path analysis of local adaptation in two
ecotypes of the annual plant Dioda teres Walt.
(Rubiaceae). American Naturalist 139, 149-165.
Jordan, N. (1993) Prospects for weed control through crop
interference. Ecological Applications 3, 84-9 1.
Joreskog, K.G. & Sorbom, D. (1988) LISREL 7. A Guide to
the Program and Applications. Scientific Software, Inc.,
Mooresville, IN, USA.
King, R.B. (1993) Determinants of offspring number and
size in the brown snake, Storeria dekayi. Journal of
Herpetology 27, 175-185.
Kingsolver, J.G. & Schemske, D.W. (1991) Path analyses of
selection. Trends in Ecology and Evolution 6,276-280.
Lynch, M . (1988) Path analysis of ontogenetic data. Sizestructured Populations - Ecology and Evolution (eds B.
Ebenman & L. Persson), pp. 2 9 4 6 . Springer-Verlag,
Berlin.
Lynch, M. & Arnold, S.J. (1988) The measurement of selection on size and growth. Size-structured Populations Ecology and Evolution (eds B. Ebenman and L. Persson),
pp. 47-59. Springer-Verlag, Berlin.
Maddox, G.D. & Antonovics, J. (1983) Experimental ecological genetics in Plantago: a structural equation
approach to fitness components in P. aristata and P
patagonica. Ecology 64, 1092-1099.
Mason, C.H. & Perreault, W.D. Jr., (1991) Collinearity,
power, and interpretation of multiple regression analysis.
Journal of Marketing Research 28,268-280.
Mitchell, R.J. (1992) Testing evolutionary and ecological
hypotheses using path analysis and structural equation
modeling. Functional Ecology 6, 123-129.
430
P. S. Petraitis
et al.
0 1996 British
Ecological Society,
Functional Ecology,
10,421431
Mitchell, R.J. (1993) Path analysis: pollination. Design and
Analysis of Ecological Experiments (eds S.M. Scheiner
and J. Gurevitch), pp. 211-231. Chapman and Hall, New
York.
Mitchell-Olds, T. (1987) Analysis of local variation in plant
size. Ecology 68, 82-87.
Mitchell-Olds, T. & Shaw, R. (1987) Regression analysis of
natural selection: statistical inference and biological interpretation. Evolution 41, 1149-1 161.
Mood, A.M. & Graybill, F.A. (1963) Introduction to the
Theory of Statistics, 2nd edition. McGraw-Hill, New York.
Myers, R.H. (1990) Classical and Modern Regrewion with
Applications, 2nd edition. PWS Kent, Boston, MA.
Petraitis, P.S. (1996) How can we compare the importance
of ecological processes if we never ask, 'Compared to
what?' Issues and Perspectives in Experimental Ecology
(eds W. Resitarits and J. Bernardo). Oxford University
Press, in press.
Philippi, T.E. (1993) Multiple regression: herbivory. Design
and Analysis of Ecological Experiments (eds S.M.
Scheiner and J. Gurevitch), pp. 183-210. Chapman and
Hall, New York.
Power, D.M. (1972) Numbers of bird species on the
California islands. Evolution 26,45 1 4 6 3 .
Quinn, J.F. & Dunham, A.E. (1983). On hypothesis testing in
ecology and evolution. American Naturalist 122,602417,
Roach, D.A. (1986) Timing of seed production and dispersal
in Geranium carolinianum: effects on fitness. Ecology
67,572-576.
SAS Institute, lnc. (1990) SAS User's Guide: Statistics. SAS
Institute, Inc., Cary, NC.
Schemske, D.W. & Horvitz, C.C. (1988) Plant-animal interactions and fruit production in a neotropical herb: a path
analysis. Ecology 69, 1128-1 137.
Sinervo, B. (1990) The evolution of maternal investment in
lizards: an experimental and comparative analysis of egg
size and its effects on offspring performance. Evolution
44,279-294.
Sokal, R.R. & Rohlf, F.J. (1981) Biometry, 2nd edition.
Freeman, San Francisco, CA.
Stanton, M., Young, H.J., Ellstrand, N.C. & Clegg, J.M.
(1991) Consequences of floral variation for male and
female reproduction in experimental populations of wild
radish, Raphanus sativus L. Evolution 45,268-280.
Walker, M.D., Webber, P.J., Arnold, E.H. & Ebert-May, D.
(1994) Effects of interannual climate variation on aboveground phytomass in alpine vegetation. Ecology 75,
393408.
Wesser, S.D. & Armbmster, W.S. (1991) Species distribution controls across a forest-steppe transition: a causal
model and experimental test. Ecological Monographs 61,
323-342.
Winn, A.A. & Werner, P.A. (1987) Regulation of seed yield
within and among populations of Prunella vulgaris.
Ecology 68, 1224-1233.
Wootton, J.T. (1994) Predicting direct and indirect effects:
an integrated approach using experiments and path analysis. Ecology 75, 151-165.
Wright, S. (1920) The relative importance of heredity and
environment in determining the piebald pattern of guinea
pigs. Proceedings of the National Academy of Science,
USA 6,320-332.
Wright, S. (1934) The method of path coefficients. Annals of
Mathematics and Statistics 5. 161-215.
Received 13 June 1995: revised 21 November 1995;
accepted 30 November 1995
431
Limitations of
path analysis
Appendix
SAS and SYSTAT code to run path analyses in Fig. 2a
and b. Examples from Sokal and Rohlf's (1981) Box
16.1 and Table 16.I
SAS CODE TO RUN FIG. 2(A)
SYSTAT CODE TO RUN FIG. 2(A)
DATA C.FIG2A (TYPE=CORR);
INPUT ( Y X 1 X2 X3 N) (8.4);
-TYPE- = ' CORR ' ;
LENGTH -NAME- $ 8 ;
IF -N- = 1 THEN -NAME- = 'Y';
IF -N-=2 THEN -NAME-= 'XI';
IF -N- = 3 THEN -NAME- = 'X2' ;
IF -N-=4 THEN -NAMl-= 'X3';
IF -N_=5 THEN -NAMI-= 'N';
IF -N_=5 THEN -TYPE-= 'N';
CARDS ;
0.6448
0.4938
-0.4336
1.0000
-0.0627
1.0000
-0.1900
-0.4336
1.0000
0.9553
0.6448
-0.1900
0.9553
1.0000
0.4938
-0.0627
41.0000
41.0000
41.0000
41.0000
c :>data
SAVE FIG2A
VARIABLES Y X1 X2 X3
TYPE=CORRELATION
RUN
1.0000
1~0000
-0.4336
0.6448
-0.1900
1.0000
0.4938
-0.0627
0.9553
RUN
USE FIG2A
PRINT
RUN
QUIT
c:>mglh
USE FIG2A
PRINT=LONG
FORMAT=6
MODEL Y=Xl+X2+X3 / N=41
ESTIMATE
QUIT
RUN;
PROC PRINT;
RUN;
PROC REG DATA=C.FIG2A;
MODEL Y =X1 X2 X3 / NOINT VIF TOL COLLIN;
RUN;
SAS CODE TO RUN FIG. 2(B)
DATA C .FIG2B (TYPE=CORR);
INPUT (Y X1 X2 X3 N) (8.4);
-TYPE- = ' CORR ' ;
LENGTH -NAME- $ 8 ;
IF -N-=l THEN -NAME-= 'Y';
IF -N-=2 THEN -NAME-= 'Xl';
IF -N-=3 THEN -NAME-= 'X2';
IF -N-=4 THEN -NAME-= 'X3';
IF -N-=5 THEN -NAME-= 'N';
IF -N- = 5 THEN -TYPE- = 'N';
CARDS ;
-0.4336
0.6448
1.0000
1.0000
-0.1900
-0.4336
1.0000
0.6448
-0.1900
0.9553
0.4938
0.0000
41.0000
41.0000
41.0000
0 1996 British
Ecological Society,
Functional Ecology,
10,421431
RUN;
PROC PRINT;
RUN;
PROC REG DATA=C.FIG2B;
MODEL Y = X 1 X2 X3 / NOINT VIF TOL COLLIN;
RUN;
1.0000
SYSTAT CODE FOR FIG. 2(B).
c :>data
SAVE FIG2B
VARIABLES Y X1 X2 X3
TYPE=CORRELATION
RUN
1.0000
1.0000
-0.4336
0.6448
-0.1900
1.0000
0.4938
0.0000
0.9553
RUN
USE FIG2B
PRINT
RUN
QUIT
c : >mglh
USE FIG2B
PRINT=LONG
FORMAT=6
MODEL Y=Xl+X2+X3 / N=41
ESTIMATE
QUIT
1.0000