
Faculty of Business and Law
School of Accounting, Economics and Finance
ECONOMICS SERIES
SWP 2013/1
March 2014
Neither Fixed nor Random: Weighted Least
Squares Meta-Analysis
T.D. Stanley and Hristos Doucouliagos
The working papers are a series of manuscripts in their draft form. Please do not
quote without obtaining the author’s consent as these works are in their draft form.
The views expressed in this paper are those of the author and not necessarily
endorsed by the School or IBISWorld Pty Ltd.
Neither Fixed nor Random: Weighted Least Squares Meta-Analysis
by
T.D. Stanley* and Hristos Doucouliagos**
Abstract
We show how and explain why an unrestricted weighted least squares estimator is
superior to conventional random-effects meta-analysis when there is publication (or
small-sample) bias and better than fixed-effect weighted average if there is heterogeneity.
Ironically, the advantage of this weighted least squares meta-regression is largest in those
exact conditions for which random effects are designed—large additive heterogeneity.
Keywords: meta-analysis, meta-regression, weighted least squares, fixed effect, random
effects
* Professor of Economics, Hendrix College, 1600 Washington St., Conway, AR, 72032
USA. Email: [email protected]. Phone: 1-501-450-1276; Fax: 1-501-450-1400.
** Professor of Economics, School of Accounting, Economics and Finance and Alfred
Deakin Research Institute, Deakin University, 221 Burwood Highway, Burwood, 3125,
Victoria, Australia. Email: [email protected]. Phone: 61-3-9244-6531.
1. INTRODUCTION
Nearly all meta-analyses report a ‘fixed-effect’ or a ‘random-effects’ weighted average,
often both [1, 2]. However, it is widely known that the fixed-effect estimator produces
confidence intervals with poor coverage when applied to unconditional inference; that is,
to populations that may not be entirely identical to the one sampled [2, 3, 4]. Random
effects, on the other hand, are highly sensitive to the accuracy of the estimate of the
between-study variance, $\tau^2$ [1], and conventional estimates of $\tau^2$ are biased [3]. When
there is publication (or small-sample) bias, random effects have larger biases than fixed
effect [4, 5, 6, 7, 8]. In this paper, we propose the routine use of a simple unrestricted
weighted least squares meta-regression that offers the best of both. We show how this
unrestricted weighted least squares estimator corrects the poor coverage of the
fixed-effect estimator. Further, when there is either publication selection or small-sample bias,
our simulations demonstrate that the unrestricted weighted least squares dominate
random-effects meta-analysis, whether the reviewer is synthesizing RCTs (randomized
controlled trials) or regression estimates.
In practice, our approach addresses the same problems of conventional meta-analysis
as do Henmi and Copas [4]. Their hybrid confidence interval is centered on
the fixed-effect estimate, as is our weighted least squares estimator, but Henmi and Copas
calculate its width from the random effects setting, further taking into account the
uncertainty of estimating $\tau^2$ [4]. We believe that our unrestricted weighted least squares
approach is simpler and more elegant. Weighted least squares have a long history with
well-established statistical properties rooted in the Gauss-Markov Theorem [9-12].
Weighted least squares confidence intervals are easily and automatically calculated by
regression routines found in all standard statistical software.
Weighted least squares have been used by many meta-analysts in different
contexts [1,13-22].
We fully recognize that weighted least squares are an integral
component of all of these methods, including fixed and random effects. However, the
key difference among these methods lies in exactly how each implements weighted least
squares, and these differences matter.
To our knowledge, no one has suggested that an unrestricted weighted least
squares (WLS) should replace random-effects meta-analysis.
Nor has anyone
demonstrated the superiority of an unrestricted weighted least squares over conventional
meta-analysis. However, particle physicists have long used a similar weighted least
squares approach for experimental measurements of the mass and charge of fundamental
particles (e.g., bosons, leptons, quarks) without ever referring to meta-analysis [23].
Rather than embrace these methods, meta-analysts have thus far denied their relevance.
“Our model-based analysis shows that the conventional additive random-effects model
appears to fit the data better than the multiplicative model, so our suggestion is that here
the (particle physicists) might consider changing their practice” [23, p. 120].
We
demonstrate just the opposite through realistic simulations of meta-analyses of both
RCTs and regression coefficients. Our simulations show that weighted least squares
estimates are often superior to random effects even when we are confined to the
conventional additive random-effects model.
Thus, meta-analysts would do well to
report this unrestricted weighted least squares estimate routinely as a summary estimate.
2. SIMPLE WEIGHTED AVERAGES
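As is widely known, the fixed-effect estimator assumes that the individual reported effects,
$y_i$, are a random draw from a fixed normal population. Or,
$$y_i = \mu + \varepsilon_i \quad \text{and} \quad \varepsilon_i \sim N(0, \sigma_i^2) \quad \text{for } i = 1, 2, \ldots, L. \tag{1}$$
Random effects allow individual means to vary randomly around $\mu$. Or,
$$y_i = \mu + \theta_i + \varepsilon_i; \quad \theta_i \sim N(0, \tau^2) \quad \text{and} \quad \varepsilon_i \sim N(0, \sigma_i^2) \quad \text{for } i = 1, 2, \ldots, L. \tag{2}$$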
All three estimators (fixed effect, random effects and unrestricted weighted least squares)
may be modeled compactly as:
$$y_i \sim N(\mu, v_i) \tag{3}$$
with different assumptions about the individual variances, $v_i$. Random effects assume
that variances are additive: $v_i = \sigma_i^2 + \tau^2$, where $\tau^2$ is the usual between-study or
heterogeneity variance. Fixed effect assumes that there is no excess heterogeneity, or
$\tau^2 = 0$. The unrestricted weighted least squares can also be modeled by equation (1);
however, it assumes only that the variances can be estimated up to some unknown
multiplicative constant, $\phi$, or that $v_i = \phi\sigma_i^2$.
The Gauss-Markov theorem proves that, as long as $v_i$ is known up to some
proportional constant, $\phi$, the conventional weighted least squares estimator provides the
best (minimum variance) linear unbiased estimator (or BLUE) [10, 11]. With consistent
estimates of $\sigma_i^2$ (such as each study’s squared standard error), weighted least squares
provide consistent, asymptotically efficient and asymptotically normal estimates [12].
All three estimators can be written in a common compact form:
$$\hat{\mu} = \frac{\sum w_i y_i}{\sum w_i} \tag{4}$$
However, each employs different weights, $w_i$, and thereby has different variances. Fixed
effect uses weights, $w_i = 1/\sigma_i^2$, and has variance, $1/\sum w_i$. Random effects has weights,
$w_i' = 1/(\sigma_i^2 + \tau^2)$, with variance, $1/\sum w_i'$. Lastly, the unrestricted weighted least squares’
weights are $w_i^* = 1/(\phi\sigma_i^2)$ with variance $1/\sum w_i^* = \phi/\sum(1/\sigma_i^2)$.
Thus, the fixed-effect, $\hat{\mu}_F$, and the unrestricted weighted least squares, $\hat{\mu}_W$, estimators
are identical. Substituting $1/\sigma_i^2$ for $w_i$ into equation (4) implies that:
$$\hat{\mu}_F = \frac{\sum (1/\sigma_i^2)\, y_i}{\sum (1/\sigma_i^2)} = \frac{(1/\phi)\sum (1/\sigma_i^2)\, y_i}{(1/\phi)\sum (1/\sigma_i^2)} = \hat{\mu}_W \quad \text{for all } \phi \neq 0. \tag{5}$$
However, $\hat{\mu}_F$ and $\hat{\mu}_W$ have different variances. The variance of $\hat{\mu}_W$ is $\phi$ times the
variance of $\hat{\mu}_F$.
Sample estimates are easy to obtain for all of the above parameters from the
conventional information collected in a systematic review. First, the standard error of the
individual reported estimate, $SE_i$, may be used in the place of $\sigma_i$. Second, $\phi$ is
automatically estimated from the meta-sample by conventional weighted least squares
statistical software (e.g., STATA). Ordinary least squares will also correctly calculate
$\hat{\mu}_W$ and its confidence interval. To do so, run a simple meta-regression of the
standardized effect size, $t_i = y_i/SE_i$, with precision, $1/SE_i$, as the independent variable
and no intercept [14]. The mean squared error of this simple regression,
$$H^2 = \frac{\sum (t_i - \hat{\mu}_W/SE_i)^2}{L - 1}, \tag{6}$$
serves as an estimate of $\phi$ and is automatically employed to help calculate the standard
error and confidence interval of $\hat{\mu}_W$. Both $H$ and $I^2 = (H^2 - 1)/H^2$ are used to measure
heterogeneity in systematic reviews [22]. Lastly, $\tau^2$ is routinely calculated by a separate
algorithm, often the method of moments or the DerSimonian-Laird method [3, 24].
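The three weighted averages of this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `meta_summaries`, the dictionary keys and the five example effects are hypothetical, and the DerSimonian-Laird step uses the usual method-of-moments formula truncated at zero, which is assumed here rather than taken from the text.

```python
def meta_summaries(y, se):
    """Fixed-effect, unrestricted WLS and DerSimonian-Laird random-effects summaries."""
    L = len(y)
    w = [1.0 / s ** 2 for s in se]                        # w_i = 1 / SE_i^2
    sum_w = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sum_w  # equation (4)
    var_fe = 1.0 / sum_w                                  # fixed-effect variance

    # Equation (6): H^2 estimates phi; the WLS point estimate equals mu_fe.
    h2 = sum((yi / s - mu_fe / s) ** 2 for yi, s in zip(y, se)) / (L - 1)
    var_wls = h2 * var_fe                                 # phi-hat times the FE variance

    # DerSimonian-Laird moment estimate of tau^2 (assumed formula), truncated at zero.
    q = (L - 1) * h2                                      # Cochran's Q
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (L - 1)) / c)
    w_re = [1.0 / (s ** 2 + tau2) for s in se]            # w'_i = 1 / (SE_i^2 + tau^2)
    sum_w_re = sum(w_re)
    mu_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum_w_re
    var_re = 1.0 / sum_w_re

    i2 = max(0.0, (h2 - 1.0) / h2) if h2 > 0 else 0.0     # I^2, truncated at zero
    return {"mu_fe": mu_fe, "var_fe": var_fe, "mu_wls": mu_fe, "var_wls": var_wls,
            "mu_re": mu_re, "var_re": var_re, "h2": h2, "i2": i2, "tau2": tau2}


# Hypothetical effects and standard errors, for illustration only.
example = meta_summaries([0.10, 0.30, 0.35, 0.65, 0.45], [0.05, 0.10, 0.15, 0.20, 0.25])
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

The sketch makes the point of equation (5) concrete: `mu_wls` is the same number as `mu_fe`, and only the variance differs, by the factor `h2`.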
In the next section, we offer realistic simulations of these three estimators using
both standardized mean differences from randomized controlled trials (RCTs) and
estimated regression coefficients. In these simulations, excess heterogeneity is always
introduced as an additive term; that is, as assumed by the random-effects model (2). We
take it for granted that the additive model, equation (2), is more realistic in applications
than the multiplicative variance structure upon which unrestricted weighted least squares
are derived.
Nonetheless, the unrestricted weighted least squares is shown to have
comparable or superior statistical properties.
3. SIMULATIONS
First, we consider estimated regression coefficients as the object of research synthesis.
Regression is the most commonly used statistical technique in the social sciences, and it
encompasses many other statistical tests, including: ANOVA, t-tests, and quasi-experimental designs (regression discontinuity, instrumental variables, difference-in-difference) [5, 25]. To ensure that regression estimates do not have unique properties, we
also simulate nearly a million meta-analyses of standardized mean differences from
randomized controlled trials (RCTs) in Section 3.2.
3.1 Regression Estimates
Our simulations first generate data randomly and then estimate a target regression
coefficient, $\alpha_1$, from:
$$Y_i = 100 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + u_i \tag{7}$$
where $u_i \sim N(0, 100^2)$ and $X_{1i} \sim \text{Uniform}(100, 300)$. The true effect, $\alpha_1$, is assumed to be
either 0 or 1. When $\alpha_1 = 1$, the correlation between $Y$ and $X_1$ is 0.27, which represents a
small effect size by conventional guidelines [26]. A wide range of sample sizes are
assumed to be used to estimate $\alpha_1$ in the primary literature, n = {62, 125, 250, 500, 1000},
and similarly for the meta-analysis sample sizes = {5, 10, 20, 40, 80}.
$X_2$ is generated in a manner that makes it correlated with $X_1$. $X_{2i}$ is equal to $X_{1i}$
plus a $N(0, 50^2)$ disturbance. When a relevant variable, like $X_2$, is omitted from a
regression but is correlated with the included independent variable, like $X_1$, the estimated
regression coefficient ($\hat{\alpha}_{1i}$) will be biased. This omitted-variable bias is $\alpha_2 \cdot \alpha_{12}$, where
$\alpha_{12} = 1$ is the slope coefficient of a regression of $X_{2i}$ on $X_{1i}$. In these simulations, $\alpha_2$ is
generated randomly for each study, $\alpha_{2i} \sim N(0, \sigma_h^2)$. That is, empirical effects are assigned
random additive heterogeneity just as assumed by random effects with variance = $\sigma_h^2$.
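The data-generating process for a single primary study can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `simulate_primary_study` and its arguments are hypothetical, the study regresses Y on X1 only (so X2 is the omitted variable), and the standard error is the ordinary least squares one; the original simulation code is not reproduced in the paper.

```python
import math
import random


def simulate_primary_study(n, alpha1, sigma_h, rng):
    """Generate one study from equation (7) and estimate alpha1 with X2 omitted."""
    alpha2 = rng.gauss(0.0, sigma_h)                  # study-specific alpha_2i ~ N(0, sigma_h^2)
    x1 = [rng.uniform(100.0, 300.0) for _ in range(n)]
    x2 = [a + rng.gauss(0.0, 50.0) for a in x1]       # X2 = X1 + N(0, 50^2) disturbance
    y = [100.0 + alpha1 * a + alpha2 * b + rng.gauss(0.0, 100.0)
         for a, b in zip(x1, x2)]

    # Simple OLS of Y on X1 (with intercept), leaving X2 out.
    mean_x = sum(x1) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x1)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x1, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x1, y))
    se = math.sqrt(rss / (n - 2) / sxx)               # conventional OLS standard error
    return slope, se, alpha2


rng = random.Random(2014)                             # arbitrary seed, for reproducibility
estimate, std_err, alpha2 = simulate_primary_study(250, 1.0, 1.0, rng)
print(f"estimate={estimate:.3f}  SE={std_err:.3f}  alpha_2i={alpha2:.3f}")
```

Because $X_{2i} = X_{1i}$ plus noise, the short regression estimates $\alpha_1 + \alpha_{2i}$, and its error term has variance $100^2 + 50^2\alpha_{2i}^2$, so the reported standard error grows with the squared omitted-variable bias.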
Prior research has established the importance of the relative size of the
unexplained heterogeneity, $\sigma_h^2$ [5, 6, 16]. Hence, we simulate a wide range of random
unexplained heterogeneity through a random omitted-variable bias, from no excess
heterogeneity to quite large levels. Values of random heterogeneity, $\sigma_h$, were selected to
encompass the heterogeneity found in past meta-analyses, as measured by $I^2$ and reported
in Table 1, below [22]. For example, among minimum wage elasticities, $I^2$ is 90%
[27]; it is 93% among estimates of the value of statistical life [28] and 97% among the
partial correlations of CEO pay and corporate performance [29].
3.1.1 Results
Table 1 reports the percentage of unexplained random heterogeneity found among
the estimated effects, relative to the total variation in observed effects, $I^2$ [22]. The
reported values of $I^2$ are calculated ‘empirically’ for each replication of these simulations
and averaged. When $\sigma_h^2 = 0$, the ‘true’ $I^2$ would also be zero; however, the conventional
truncation of $I^2$ at zero imparts a small upward bias. 95% confidence intervals are
constructed for each replication using the formulas and methods reported in Section 2,
above. Lastly, the coverage percentages from 10,000 replications are reported in the last
four columns of Table 1 for $\alpha_1 = 0$. When 10,000 replications are used, the coverage
proportions vary by 0.003, or less, from one simulation of 10,000 replications to the next
simulation of 10,000 replications.
Insert Table 1 about here
As expected, the coverage probabilities are very poor for the conventional fixed-effect
meta-analysis (FEMA) when there is excess heterogeneity. One might excuse fixed
effect in these cases, because it is not designed for unconditional inference; that is, for
populations that differ in any way from the one sampled [3, 30]. However, the central
finding is that the unrestricted weighted least squares (WLS) variances make an
acceptable allowance for the actual uncertainty of the fixed-effect estimate, regardless of
the level of excess heterogeneity. On average, weighted least squares coverage is within
3.67 percent of the nominal level, while random effects (REMA) are off by 7.07%. The
Knapp-Hartung correction for random effects reduces this discrepancy to 4.52% [31]. In
any case, the unrestricted weighted least squares’ coverage is comparable to that of
random effects.
Insert Table 2 about here
Table 2 reports the coverage for the same simulations and estimators when the
true regression coefficient is 1.0; that is, when the correlation of interest is 0.27. Table 2
reflects virtually the same coverage rates and the exact same overall properties for these
interval estimators as found in Table 1 and discussed above. We do not report bias and
MSE results for these alternative meta-analysis estimators because the properties of the
conventional meta-analysis estimators are already well known. Weighted least squares’
point estimate is identical to conventional fixed-effect meta-analysis; thus, they must
have the same bias and MSE. Weighted least squares differ from the conventional fixed
effect only in their variances. With heterogeneity, weighted least squares will thereby
have wider confidence intervals and larger p-values than fixed effect.
In previous studies, fixed effect has been shown to be less biased than random
effects when there is publication selection for statistical significance (or small-sample
bias) [4-8]. Thus, the real advantage of the weighted least squares approach over random
effects will be seen when there is publication selection (or small-sample) bias.
Insert Table 3 about here
Table 3 reports the bias and mean square errors (MSE) for weighted least squares
and random effects when half of the studies selectively report significantly positive
results. For the other half, the first estimate that is randomly generated is reported. The
simulation design for Table 3 is identical to what is described above and used to generate
Tables 1 and 2. For the selected 50%, everything is randomly generated as before, except
all of the random generating processes are repeated over and over again, until an
estimated effect is statistically positive. Weighted least squares are much less biased than
random effects when there is publication selection (or small-sample) bias. On average,
our simulations find that random effects bias is 77% larger than weighted least squares
meta-regression, and random effects’ MSE is just under three times larger than weighted
least squares’ MSE. As other studies have shown, when there is publication selection or
small-sample bias, random effects give an unacceptable summary of research findings.
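The 50% selection mechanism can be illustrated with a short sketch. It is deliberately simplified and rests on assumptions not in the text: each estimate is drawn directly as a normal variate around $\mu + \theta_i$ rather than from a full regression, the standard error is held fixed across redraws, and 'statistically positive' is taken to mean t > 1.96. All names (`draw_estimate`, `simulate_reported`, `max_draws`) are hypothetical.

```python
import random


def draw_estimate(mu, sigma_h, se, rng):
    """Simplified stand-in for one primary study with additive heterogeneity."""
    theta = rng.gauss(0.0, sigma_h)               # theta_i ~ N(0, sigma_h^2)
    return rng.gauss(mu + theta, se)              # sampling error around mu + theta_i


def simulate_reported(n_studies, mu, sigma_h, se_values, rng,
                      t_crit=1.96, max_draws=100_000):
    """Return (estimate, se, selected) triples; every other study is selected."""
    reported = []
    for i in range(n_studies):
        se = rng.choice(se_values)
        selected = (i % 2 == 0)
        estimate = draw_estimate(mu, sigma_h, se, rng)
        if selected:
            draws = 1
            # Redraw until the estimate is significantly positive (t > t_crit).
            while estimate / se <= t_crit and draws < max_draws:
                estimate = draw_estimate(mu, sigma_h, se, rng)
                draws += 1
        reported.append((estimate, se, selected))
    return reported


rng = random.Random(7)                            # arbitrary seed
studies = simulate_reported(20, 0.0, 0.5, [0.1, 0.2, 0.4], rng)
chosen = [e for e, _, s in studies if s]
others = [e for e, _, s in studies if not s]
print("mean of selected estimates:  ", sum(chosen) / len(chosen))
print("mean of unselected estimates:", sum(others) / len(others))
```

Feeding the reported estimates and standard errors into the weighted averages of Section 2 then allows the bias of each summary to be compared against the known true effect.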
3.1.2 Discussion
Surprisingly, weighted least squares outperforms random effects when there is large,
additive heterogeneity; that is, in those exact cases for which random effects are
designed.
The differences between these two approaches to accommodate excess
heterogeneity are greatest at the higher levels of heterogeneity. This is surprising because
these simulations induce an additive random heterogeneity just as the random-effects
model assumes and seemingly contrary to weighted least squares’ multiplicative
variance. Nonetheless, the performance of random effects relative to weighted least
squares is worse when there is large heterogeneity.
What explains the success of the unrestricted weighted least squares meta-regression
approach? Certainly, the fact that the unrestricted weighted least squares’
weights, $1/SE_i^2$, give relatively more weight to the most precise estimates than do
random effects’ weights, $1/(SE_i^2 + \hat{\tau}^2)$, helps to explain the superior statistical performance of
weighted least squares when there is both publication selection bias and excess
heterogeneity (i.e., $\tau^2 > 0$). Nonetheless, random effects should outperform weighted least
squares when there is excess heterogeneity but no publication selection bias, because we
induce random, additive heterogeneity in our simulations just as random effects assume.
To explain this puzzle, consider the expected value and variance of the estimated
effects in the presence of omitted variable bias. In our simulations, random omitted
variable bias is introduced for each study, $\alpha_{2i} \sim N(0, \sigma_h^2)$. As a result, the estimate’s
variance will contain a new term that depends directly on the square of the random omitted
variable bias, $\alpha_{2i}^2$ [32]. Furthermore, this random heterogeneity can dominate
conventional sampling error variance when excess heterogeneity is sufficiently large. As
heterogeneity increases, the squared omitted-variable bias term gradually dominates the
usual sources of estimation error. Eventually, the estimate’s variance will be roughly
proportional to this excess heterogeneity. Thus, for large levels of heterogeneity, the
multiplicative model of these variances assumed by unrestricted weighted least squares
becomes approximately correct. This explanation is further corroborated by the superior
performance of weighted least squares in the tables of simulation results (Tables 1-3) for
the highest level of heterogeneity, (σh=4).
Insert Figure 1 about here
Figure 1 graphs 1,000 random primary study standard errors squared (the
estimates’ variances) against the square of the random heterogeneity, $\theta_i$ in equation (2),
from our simulations’ largest heterogeneity condition ($\sigma_h = 4$). In our simulations, we can
directly observe $\theta_i$. At such high levels of heterogeneity, excess heterogeneity dominates
conventional sampling errors, and the estimate’s reported variance (‘SE-squared’) will be
correlated with $\theta_i^2$ ($r = 0.5$). Figure 1 reveals a fan-shaped scatter and an approximate
proportionality. Thus, for large heterogeneity, $SE_i^2$ is roughly proportional to excess
heterogeneity variance, and weighted least squares’ multiplicative variance-covariance
structure will be approximately correct.
In practice, the differences between random effects and weighted least squares
(or, equivalently, fixed effect) can be quite large [33]; thus, these differences can have
important practical consequences. For example, among the hedonic wage estimates of the
value of a statistical life, which have strong evidence of publication bias, random effects
is over three times larger ($5.7 million) than weighted least squares ($1.8 million) [28,
33]. Similarly, adopting a common currency (e.g., the euro) is estimated to increase trade
by 34% using weighted least squares and 90% with random effects [34].
3.2 Standardized Mean Differences from Randomized Controlled Trials
To ensure that our simulation results are not an aberration of regression estimation, we
also simulate standardized mean differences (Cohen’s d) from randomized controlled
trials. For the control group, outcomes are:
$$Y_{ci} = X_{ci} + u_i \tag{8}$$
where $u_i \sim N(0, 2500)$ and $X_{ci} \sim N(300, 7500)$. In the experimental group, there is an
added treatment effect:
$$T_e = \mu + \theta_i \tag{9}$$
where $\theta_i \sim N(0, \sigma_h^2)$. As before, we assume there is either no effect or a small one, $\mu$
= {0, 20}. When $\mu = 20$, Cohen’s d is 0.2. Each group is assumed to have either 32, 64,
125, 250, or 500 subjects. As with regression estimates, we investigate a full range of
heterogeneity by varying $\sigma_h^2$; see Tables 4 and 5.
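One simulated trial from equations (8) and (9) can be sketched as below. Several choices are assumptions rather than details given in the text: the second arguments of N(0, 2500) and N(300, 7500) are read as variances (which makes the outcome standard deviation 100, so that $\mu = 20$ corresponds to d = 0.2), $\theta_i$ is drawn once per study, and the variance of d uses the standard large-sample formula $(n_1+n_2)/(n_1 n_2) + d^2/(2(n_1+n_2))$. The function name `simulate_rct` is hypothetical.

```python
import math
import random
import statistics


def simulate_rct(n_per_group, mu, sigma_h, rng):
    """Simulate one two-arm trial and return Cohen's d with its standard error."""
    theta = rng.gauss(0.0, sigma_h)                   # study-level heterogeneity (assumed)
    sd_x = math.sqrt(7500.0)                          # X_ci ~ N(300, 7500), variance form
    sd_u = math.sqrt(2500.0)                          # u_i ~ N(0, 2500), variance form

    def outcome(treatment_effect):
        return rng.gauss(300.0, sd_x) + rng.gauss(0.0, sd_u) + treatment_effect

    control = [outcome(0.0) for _ in range(n_per_group)]
    treated = [outcome(mu + theta) for _ in range(n_per_group)]

    # Pooled standard deviation for two equal-sized groups.
    pooled_sd = math.sqrt((statistics.variance(control) + statistics.variance(treated)) / 2.0)
    d = (statistics.mean(treated) - statistics.mean(control)) / pooled_sd

    # Large-sample variance of d with n1 = n2 = n: 2/n + d^2 / (4n).
    var_d = 2.0 / n_per_group + d ** 2 / (4.0 * n_per_group)
    return d, math.sqrt(var_d)


rng = random.Random(9)                                # arbitrary seed
d, se_d = simulate_rct(125, 20.0, 0.0, rng)
print(f"Cohen's d = {d:.3f}, SE = {se_d:.3f}")
```

The second term of `var_d` is what ties the reported standard error to the size of d, and hence to the excess heterogeneity.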
Insert Tables 4 and 5 about here
3.2.1 Results
Tables 4 and 5 display the coverage rates for the standardized mean differences as
measured by Cohen’s d. Hedges’ g was also simulated, but the differences were
inconsequential. As with regression coefficients, these simulations reveal that weighted
least squares produce adequate confidence intervals when random effects are used as the
basis of comparison. Overall, weighted least squares’ coverage rates depart from the
nominal level of 95% by 5.00%, whereas random effects are off by 5.21%.
Insert Table 6 about here
When half of the reported findings were selected to be statistically significant,
weighted least squares entirely dominate random effects—see Table 6. In every case,
weighted least squares have both smaller bias and lower MSE than does random effects.
With 50% publication selection, random effects more than doubles the small empirical
effect (d=0.2). On average, random effects has a 33% larger bias and a 63% higher MSE
than weighted least squares.
3.2.2 Discussion
For experimental syntheses, weighted least squares provide confidence intervals
comparable to random effects. However, when there is publication (or small-sample)
bias, weighted least squares are clearly superior.
As before, these simulations of
experimental results reveal an unexpected phenomenon. The advantage of weighted least
squares is greatest in the very circumstances for which random effects were designed—
large, additive heterogeneity.
The explanation of this unexpected phenomenon is much the same as discussed above
with regard to regression estimates. Unrestricted weighted least squares’ weights,
$1/SE_i^2$, are more differentiating than random effects’ weights, $1/(SE_i^2 + \hat{\tau}^2)$. As
$\hat{\tau}^2$ increases, random effects move away from weighted least squares and approach the
simple, unweighted mean, which is highly biased when there is small-sample or
publication bias. Also, similar to regression estimates, the standard error of Cohen’s d
moves with excess heterogeneity. Recall that the variance of d contains a second term
that depends on $d^2$. With greater heterogeneity, larger values of Cohen’s d will be
observed, which in turn increases the variance of d. Figure 2 displays an approximate
proportionality between the variance of d and the excess heterogeneity variance from
1,000 random repetitions from our experimental simulation for the highest level of
heterogeneity ($\sigma_h = 200$).
5. CONCLUSIONS
Publication bias is common in medical research [35]. In economics, there is
evidence of substantial or severe publication bias in the large majority of empirical areas
of research [36]. Unfortunately, tests for publication or small-sample bias are well known
to have low power [16, 37]. Thus, it is prudent for meta-analysts to assume that there is
publication (or small-sample) bias, regardless of what their tests might indicate. With
unrestricted weighted least squares so clearly dominating random effects when there is
publication (or small-sample) bias and given that weighted least squares’ confidence
intervals are comparable to random effects’ when there is no publication bias, we see no
practical reason why unrestricted weighted least squares should not be reported in all
meta-analyses. We have shown that conventional fixed-effect and random-effects meta-analysis will produce misleading results in many practical applications.
Weighted least squares are well grounded in statistical theory (the Gauss-Markov
theorem) and are very simple to implement. One need merely run a simple ordinary
least squares regression of the estimate’s standardized value ($\text{effect}_i/SE_i$) on its precision
($1/SE_i$) with no intercept. All regression software will correctly calculate this simple
substitute for random effects, its standard error, its t-test, and its confidence interval.
Nothing further is required.
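The regression recipe can be written out directly, without any statistics package, as a rough sketch. The function name `wls_by_ols` and the example data are hypothetical, and the critical value of the t-distribution with L - 1 degrees of freedom is passed in by the caller (2.776 for L = 5 at the 95% level) rather than computed.

```python
import math


def wls_by_ols(y, se, t_crit):
    """No-intercept OLS of t_i = y_i/SE_i on precision 1/SE_i."""
    L = len(y)
    t = [yi / s for yi, s in zip(y, se)]              # standardized effects
    x = [1.0 / s for s in se]                         # precisions
    sxx = sum(xi * xi for xi in x)
    slope = sum(xi * ti for xi, ti in zip(x, t)) / sxx     # the WLS summary estimate
    mse = sum((ti - slope * xi) ** 2 for xi, ti in zip(x, t)) / (L - 1)
    se_slope = math.sqrt(mse / sxx)                   # standard error of the slope
    t_stat = slope / se_slope
    ci = (slope - t_crit * se_slope, slope + t_crit * se_slope)
    return slope, se_slope, t_stat, ci


# Hypothetical effects and standard errors, for illustration only.
effects = [0.10, 0.30, 0.35, 0.65, 0.45]
std_errors = [0.05, 0.10, 0.15, 0.20, 0.25]
estimate, std_err, t_stat, interval = wls_by_ols(effects, std_errors, t_crit=2.776)
print(f"estimate={estimate:.4f}  SE={std_err:.4f}  t={t_stat:.2f}")
print(f"95% CI: ({interval[0]:.4f}, {interval[1]:.4f})")
```

The slope is algebraically the inverse-variance weighted average, so the point estimate coincides with the fixed-effect one while the standard error reflects the regression's mean squared error.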
The contribution of this paper is quite modest, but potentially far reaching. From
one perspective, we merely offer a correction for the standard errors of conventional
fixed-effect meta-analysis when applied to unconditional inferences and, in the process,
show how this weighted least squares estimator is often superior to conventional random-effects meta-analysis. Furthermore, this same weighted least squares model is easily
extended to multiple meta-regression that models heterogeneity where it again dominates
both fixed- and random-effects meta-regression in much the same ways as revealed here
[38].
REFERENCES
1. Raudenbush, S.W. Random effects models, in H. Cooper and L.V. Hedges (eds.) The Handbook of Research Synthesis. Russell Sage: New York, 1994; 301-321.
2. Hedges, L.V. Fixed effects models, in H. Cooper and L.V. Hedges (eds.) The Handbook of Research Synthesis. Russell Sage: New York, 1994; 285-299.
3. Hedges, L.V. and Vevea, J.L. Fixed- and random-effects models in meta-analysis. Psychological Methods 1998; 3: 486-504.
4. Henmi, M. and Copas, J.B. Confidence intervals for random effects meta-analysis and robustness to publication bias. Statistics in Medicine 2010; 29: 2969-2983.
5. Stanley, T.D., Jarrell, S.B. and Doucouliagos, H(C). Could it be better to discard 90% of the data? A statistical paradox. The American Statistician 2010; 64: 70-77.
6. Stanley, T.D. and Doucouliagos, C.(H.) Meta-regression approximations to reduce publication selection bias. Research Synthesis Methods 2013.
7. Poole, C. and Greenland, S. Random-effects meta-analyses are not always conservative. American Journal of Epidemiology 1999; 150: 469-475.
8. Sutton, A.J., Song, F., Gilbody, S.M. and Abrams, K.R. Modelling publication bias in meta-analysis: a review. Statistical Methods in Medical Research 2000; 9: 421-445.
9. Aitken, A.C. On least squares and linear combinations of observations. Proceedings of the Royal Society of Edinburgh 1935; 55: 42-48.
10. Davidson, R. and MacKinnon, J.G. Econometric Theory and Methods. Oxford University Press: Oxford, 2004.
11. Greene, W.H. Econometric Analysis. Macmillan: New York, 1990.
12. Wooldridge, J.M. Econometric Analysis of Cross Section and Panel Data. MIT Press: Cambridge, 2002.
13. Stanley, T.D. and Jarrell, S.B. Meta-regression analysis: A quantitative method of literature surveys. Journal of Economic Surveys 1989; 3: 161-170.
14. Thompson, S.G. and Sharp, S.J. Explaining heterogeneity in meta-analysis: A comparison of methods. Statistics in Medicine 1999; 18: 2693-2708.
15. Fazel, S., Khosla, V., Doll, H. and Geddes, J. The prevalence of mental disorders among the homeless in Western countries: Systematic review and meta-regression analysis. PLOS Medicine 2008; 5(12): 1670-1681.
16. Stanley, T.D. Meta-regression methods for detecting and estimating empirical effect in the presence of publication selection. Oxford Bulletin of Economics and Statistics 2008; 70: 103-127.
17. Baker, W.L., White, C.M., Cappelleri, J.C., Kluger, J. and Coleman, C.I. Understanding heterogeneity in meta-analysis: the role of meta-regression. The International Journal of Clinical Practice 2009; 63(10): 1426-1434.
18. Copas, J.B. and Lozada, C. The radial plot in meta analysis: Approximations and applications. Journal of the Royal Statistical Society: Series C (Applied Statistics) 2009; 58: 329-344.
19. Moreno, S.G., Sutton, A.J., Ades, A., Stanley, T.D., Abrams, K.R., Peters, J.L. and Cooper, N.J. Assessment of regression-based methods to adjust for publication bias through a comprehensive simulation study. BMC Medical Research Methodology 2009; 9: 2, http://www.biomedcentral.com/1471-2288/9/2.
20. Karkos, C.D., Sutton, A.J., Bown, M.J. and Sayers, R.D. A meta-analysis and metaregression analysis of factors influencing mortality after endovascular repair of ruptured abdominal aortic aneurysms. European Journal of Endovascular Surgery 2011; 42: 775-786.
21. Drewes, H.W., Steuten, L.M.G., Lemmens, L.C., Baan, C.A., Boshuizen, H.C., Elissen, A.M.J., Lemmens, K.M.M., Meeuwissen, J.A.C. and Vrijhoef, H.J.M. The effectiveness of chronic care management for heart failure: meta-regression analyses to explain the heterogeneity in outcomes. HSR: Health Services Research 2012; 47(5): 1926-1959.
22. Higgins, J.P.T. and Thompson, S.G. Quantifying heterogeneity in meta-analysis. Statistics in Medicine 2002; 21: 1539-1558.
23. Baker, R.D. and Jackson, D. Meta-analysis inside and outside of particle physics: Two traditions that should converge? Research Synthesis Methods 2013; 4: 109-124.
24. DerSimonian, R. and Laird, N. Meta-analysis in clinical trials. Controlled Clinical Trials 1986; 7: 177-188.
25. Rockers, P.C., Røttingen, J.A., Shemilt, I., Tugwell, P. and Bärnighausen, T. Inclusion of quasi-experimental studies in systematic reviews of health systems research, mimeo, 2014.
26. Cohen, J. Statistical Power Analysis for the Behavioral Sciences, 2nd ed. Erlbaum: Hillsdale, 1988.
27. Doucouliagos, C.(H.) and Stanley, T.D. Publication selection bias in minimum-wage research? A meta-regression analysis. British Journal of Industrial Relations 2009; 47: 406-429.
28. Doucouliagos, C.(H.), Stanley, T.D. and Giles, M. Are Estimates of the Value of a Statistical Life Exaggerated? Journal of Health Economics 2012; 31: 197-206.
29. Doucouliagos, C.(H.), Haman, J. and Stanley, T.D. Pay for performance and corporate governance reform. Industrial Relations 2012; 51: 670-703.
30. Hedges, L.V. Statistical considerations, in H. Cooper and L.V. Hedges (eds.) The Handbook of Research Synthesis. Russell Sage: New York, 1994; 29-38.
31. Knapp, G. and Hartung, J. Improved tests for a random effects meta-regression with a single covariate. Statistics in Medicine 2003; 22: 2693-2710.
32. Kmenta, J. Elements of Econometrics. Macmillan: New York, 1971.
33. Stanley, T.D. and Doucouliagos, C. Meta-Regression Analysis in Economics and Business. Routledge: London, 2012.
34. Rose, A.K. and Stanley, T.D. A meta-analysis of the effect of common currencies on international trade. Journal of Economic Surveys 2005; 19: 347-365.
35. Hopewell, S., Loudon, K., Clarke, M.J., Oxman, A.D. and Dickersin, K. Publication bias in clinical trials due to statistical significance or direction of trial result. Cochrane Review 2009; 1. Available at http://www.thecochranelibrary.com
36. Doucouliagos, C.(H.) and Stanley, T.D. Theory competition and selectivity: Are all economic facts greatly exaggerated? Journal of Economic Surveys 2013; 27: 316-339.
37. Egger, M., Smith, G.D., Schneider, M. and Minder, C. Bias in meta-analysis detected by a simple, graphical test. BMJ 1997; 315: 629-634.
38. Stanley, T.D. and Doucouliagos, C.(H.) Better than random: Weighted least squares meta-regression analysis. SWP, Economics Series 2013-2, Deakin University.
Table 1: Coverage of Meta-Analysis Weighted Averages (True effect, α₁ = 0)

MRA Sample Size   Random Heterogeneity (σh)*   I²†     FEMA    REMA    WLS
5                 0                            .1411   .9536   .9654   .9534
5                 .125                         .2395   .8449   .9019   .9245
5                 .25                          .4508   .6490   .8507   .9080
5                 .50                          .7263   .4047   .8218   .8893
5                 1.0                          .8969   .2367   .7941   .8890
5                 2.0                          .9589   .1538   .7610   .9061
5                 4.0                          .9791   .1269   .7106   .9223
10                0                            .1176   .9515   .9633   .9528
10                .125                         .2669   .8421   .9152   .9186
10                .25                          .5408   .6505   .8926   .8965
10                .50                          .8176   .4185   .8823   .8753
10                1.0                          .9357   .2663   .8674   .8886
10                2.0                          .9728   .1935   .8208   .9065
10                4.0                          .9845   .1677   .7473   .9197
20                0                            .0940   .9504   .9621   .9485
20                .125                         .2874   .8454   .9276   .9219
20                .25                          .6031   .6464   .9201   .8952
20                .50                          .8505   .4213   .9208   .8865
20                1.0                          .9463   .2647   .8958   .8885
20                2.0                          .9762   .2043   .8480   .9172
20                4.0                          .9859   .1669   .7647   .9381
40                0                            .0765   .9464   .9574   .9502
40                .125                         .3016   .8562   .9376   .9245
40                .25                          .6335   .6523   .9348   .8946
40                .50                          .8630   .4254   .9278   .8806
40                1.0                          .9503   .2792   .9141   .8959
40                2.0                          .9772   .2081   .8642   .9275
40                4.0                          .9861   .1662   .7696   .9370
80                0                            .0587   .9470   .9565   .9498
80                .125                         .3196   .8469   .9429   .9224
80                .25                          .6465   .6549   .9422   .8935
80                .50                          .8688   .4269   .9367   .8852
80                1.0                          .9518   .2845   .9155   .9002
80                2.0                          .9776   .2044   .8638   .9158
80                4.0                          .9863   .1710   .7778   .9431
Average                                                .4980   .8793   .9133

* σh is the standard deviation of the random heterogeneity.
† I² is the proportion of the total variation among the empirical effects that is attributable to
heterogeneity. FEMA and REMA denote the fixed-effect and random-effects meta-analysis
averages, respectively.
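As a rough illustration of the three weighted averages whose coverage is tabulated above, the following is a minimal sketch, not the simulation code behind these tables. The function name `meta_averages` and the example numbers are hypothetical. The between-study variance is estimated with the DerSimonian-Laird moment estimator, which is an assumption on our part, and I² is computed as (Q − df)/Q truncated at zero. The unrestricted WLS average shares its point estimate with FEMA; its standard error is the fixed-effect standard error multiplied by sqrt(Q/(k − 1)), which is what a no-intercept regression of the t-values on precision (1/SE) would return.

```python
import math


def meta_averages(effects, std_errors):
    """Return FEMA, REMA and unrestricted-WLS averages with standard errors.

    Hypothetical helper for illustration only. tau^2 uses the
    DerSimonian-Laird moment estimator (an assumption, not the paper's code).
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two estimates with matching standard errors")

    k = len(effects)
    df = k - 1

    # Fixed-effect (inverse-variance) weights and weighted average.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fema = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    se_fema = math.sqrt(1.0 / sum_w)

    # Cochran's Q and I^2 (share of variation attributed to heterogeneity).
    q = sum(wi * (yi - fema) ** 2 for wi, yi in zip(w, effects))
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # DerSimonian-Laird estimate of the additive heterogeneity variance.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each sampling variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_re = sum(w_re)
    rema = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    se_rema = math.sqrt(1.0 / sum_w_re)

    # Unrestricted WLS: same point estimate as FEMA, but the error variance
    # is estimated (Q / df) rather than restricted to one.
    wls = fema
    se_wls = se_fema * math.sqrt(q / df)

    return {
        "FEMA": (fema, se_fema),
        "REMA": (rema, se_rema),
        "WLS": (wls, se_wls),
        "I2": i2,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Made-up estimates and standard errors, purely for demonstration.
    result = meta_averages([0.1, 0.3, 0.5], [0.1, 0.1, 0.1])
    for name, value in result.items():
        print(name, value)
```

The WLS and FEMA point estimates coincide by construction; only their standard errors, and hence their confidence intervals, differ.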
Table 2: Coverage of Meta-Analysis Weighted Averages (True effect, α₁ = 1)

MRA Sample Size   Random Heterogeneity (σh)*   I²†     FEMA    REMA    WLS
5                 0                            .1375   .9501   .9630   .9501
5                 .125                         .2475   .8443   .9086   .9297
5                 .25                          .4432   .6457   .8488   .9021
5                 .50                          .7253   .4097   .8215   .8892
5                 1.0                          .8955   .2334   .7891   .8850
5                 2.0                          .9593   .1590   .7579   .9046
5                 4.0                          .9783   .1266   .7077   .9178
10                0                            .1185   .9501   .9613   .9529
10                .125                         .2669   .8524   .9243   .9277
10                .25                          .5451   .6581   .8998   .9015
10                .50                          .8166   .4147   .8820   .8742
10                1.0                          .9355   .2664   .8674   .8827
10                2.0                          .9731   .1929   .8237   .9059
10                4.0                          .9846   .1676   .7492   .9262
20                0                            .0968   .9507   .9620   .9508
20                .125                         .2826   .8496   .9253   .9202
20                .25                          .6029   .6550   .9232   .8965
20                .50                          .8499   .4260   .9173   .8797
20                1.0                          .9464   .2685   .8959   .8963
20                2.0                          .9761   .2017   .8488   .9188
20                4.0                          .9858   .1585   .7646   .9364
40                0                            .0747   .9513   .9609   .9516
40                .125                         .3043   .8436   .9361   .9198
40                .25                          .6315   .6535   .9388   .8980
40                .50                          .8631   .4143   .9304   .8817
40                1.0                          .9503   .2752   .9095   .9000
40                2.0                          .9773   .2046   .8623   .9214
40                4.0                          .9862   .1706   .7749   .9392
80                0                            .0585   .9501   .9582   .9513
80                .125                         .3201   .8489   .9459   .9247
80                .25                          .6471   .6533   .9414   .8977
80                .50                          .8686   .4191   .9379   .8855
80                1.0                          .9519   .2774   .9193   .9032
80                2.0                          .9776   .2142   .8637   .9222
80                4.0                          .9863   .1678   .7652   .9395
Average                                                .4979   .8796   .9138

* σh is the standard deviation of the random heterogeneity.
† I² is the proportion of the total variation among the empirical effects that is attributable to
heterogeneity. FEMA and REMA denote the fixed-effect and random-effects meta-analysis
averages, respectively.
Table 3: Bias and MSE of Meta-Analysis with 50% Publication Selection

MRA Sample Size   Random Heterogeneity (σh)*   REMA Bias   WLS Bias   REMA MSE   WLS MSE
10                0                            .0086       .0066      .0032      .0031
10                .125                         .0131       .0085      .0057      .0059
10                .25                          .0255       .0124      .0115      .0138
10                .50                          .0823       .0498      .0324      .0379
10                1.0                          .2546       .1783      .1415      .1257
10                2.0                          .6186       .4143      .6400      .3879
10                4.0                          1.3651      .7448      2.8433     1.0700
20                0                            .0080       .0067      .0016      .0016
20                .125                         .0118       .0075      .0030      .0030
20                .25                          .0251       .0120      .0059      .0068
20                .50                          .0858       .0522      .0200      .0206
20                1.0                          .2578       .1756      .1050      .0761
20                2.0                          .6206       .3940      .5160      .2561
20                4.0                          1.3615      .7048      2.3395     .7234
40                0                            .0071       .0063      .0008      .0008
40                .125                         .0114       .0074      .0015      .0015
40                .25                          .0263       .0126      .0033      .0035
40                .50                          .0852       .0501      .0135      .0113
40                1.0                          .2571       .1733      .0854      .0519
40                2.0                          .6209       .3856      .4523      .1983
40                4.0                          1.3629      .6745      2.1020     .5596
80                0                            .0072       .0067      .0004      .0004
80                .125                         .0119       .0079      .0008      .0008
80                .25                          .0264       .0124      .0020      .0018
80                .50                          .0861       .0503      .0105      .0070
80                1.0                          .2582       .1746      .0761      .0415
80                2.0                          .6237       .3809      .4221      .1689
80                4.0                          1.3692      .6653      1.9960     .4936
Average Bias or MSE                            .3390       .1920      .4227      .1526
Table 4: Coverage of Experimental Results (d=0)

MRA Sample Size   Random Heterogeneity (σh)*   I²†     FEMA    REMA    WLS
5                 0                            .1341   .9514   .9621   .9464
5                 6.25                         .2159   .8644   .9158   .9324
5                 12.5                         .3978   .6928   .8581   .9098
5                 25                           .6811   .4361   .8191   .8901
5                 50                           .8867   .2335   .7980   .8781
5                 100                          .9657   .1305   .7795   .8852
5                 200                          .9891   .0740   .7524   .9001
10                0                            .1121   .9503   .9647   .9509
10                6.25                         .2286   .8691   .9261   .9288
10                12.5                         .4800   .6967   .8951   .8974
10                25                           .7863   .4497   .8916   .8752
10                50                           .9351   .2463   .8812   .8651
10                100                          .9807   .1412   .8689   .8804
10                200                          .9931   .0930   .8245   .8954
20                0                            .0891   .9509   .9613   .9497
20                6.25                         .2375   .8710   .9299   .9266
20                12.5                         .5363   .7012   .9226   .8971
20                25                           .8257   .4465   .9249   .8759
20                50                           .9476   .2498   .9105   .8654
20                100                          .9843   .1426   .8972   .8876
20                200                          .9941   .0975   .8607   .9062
40                0                            .0678   .9460   .9564   .9462
40                6.25                         .2459   .8732   .9369   .9283
40                12.5                         .5683   .6910   .9332   .8933
40                25                           .8417   .4444   .9348   .8682
40                50                           .9526   .2422   .9274   .8674
40                100                          .9855   .1474   .9104   .8825
40                200                          .9944   .0975   .8731   .9069
80                0                            .0545   .9511   .9582   .9531
80                6.25                         .2572   .8710   .9435   .9302
80                12.5                         .5862   .6926   .9421   .8929
80                25                           .8493   .4449   .9426   .8752
80                50                           .9551   .2435   .9392   .8789
80                100                          .9860   .1434   .9190   .8867
80                200                          .9946   .0936   .8779   .9086
Average                                                .4906   .9011   .9018

* σh is the standard deviation of the random heterogeneity.
† I² is the proportion of the total variation among the empirical effects that is attributable to
heterogeneity. FEMA and REMA denote the fixed-effect and random-effects meta-analysis
averages, respectively.
Table 5: Coverage of Experimental Results (d=0.2)

MRA Sample Size   Random Heterogeneity (σh)*   I²†     FEMA    REMA    WLS
5                 0                            .1360   .9529   .9646   .9527
5                 6.25                         .2162   .8696   .9193   .9334
5                 12.5                         .3965   .7064   .8638   .9085
5                 25                           .6821   .4385   .8188   .8873
5                 50                           .8851   .2429   .7948   .8727
5                 100                          .9672   .1340   .7894   .8855
5                 200                          .9888   .0776   .7571   .8941
10                0                            .1151   .9497   .9622   .9480
10                6.25                         .2317   .8693   .9272   .9277
10                12.5                         .4766   .6944   .8925   .8993
10                25                           .7822   .4443   .8930   .8807
10                50                           .9335   .2396   .8779   .8700
10                100                          .9805   .1353   .8590   .8696
10                200                          .9931   .0900   .8281   .8929
20                0                            .0904   .9497   .9604   .9475
20                6.25                         .2353   .8718   .9344   .9298
20                12.5                         .5316   .7036   .9164   .8948
20                25                           .8232   .4402   .9171   .8686
20                50                           .9473   .2499   .9123   .8665
20                100                          .9843   .1440   .8970   .8809
20                200                          .9941   .0980   .8598   .9035
40                0                            .0699   .9476   .9597   .9489
40                6.25                         .2479   .8720   .9355   .9248
40                12.5                         .5679   .6941   .9314   .8922
40                25                           .8403   .4313   .9385   .8717
40                50                           .9525   .2462   .9236   .8568
40                100                          .9854   .1394   .9105   .8780
40                200                          .9945   .0954   .8715   .8992
80                0                            .0546   .9513   .9595   .9520
80                6.25                         .2556   .8634   .9385   .9198
80                12.5                         .5839   .7003   .9415   .8972
80                25                           .8478   .4402   .9424   .8689
80                50                           .9548   .2472   .9338   .8644
80                100                          .9861   .1420   .9244   .8752
80                200                          .9946   .0941   .8791   .8918
Average                                                .4905   .9010   .8987

* σh is the standard deviation of the random heterogeneity.
† I² is the proportion of the total variation among the empirical effects that is attributable to
heterogeneity. FEMA and REMA denote the fixed-effect and random-effects meta-analysis
averages, respectively.
Table 6: Bias and MSE of Experimental Results with 50% Publication Bias

MRA Sample Size   Random Heterogeneity (σh)    REMA Bias   WLS Bias   REMA MSE   WLS MSE
10                0                            .0411       .0344      .0025      .0020
10                6.25                         .0515       .0414      .0038      .0030
10                12.5                         .0746       .0580      .0077      .0061
10                25                           .1275       .0998      .0218      .0184
10                50                           .2275       .1843      .0695      .0628
10                100                          .4221       .3374      .2465      .2065
10                200                          .8283       .5990      .9498      .6214
20                0                            .0400       .0341      .0020      .0016
20                6.25                         .0512       .0412      .0032      .0024
20                12.5                         .0755       .0578      .0068      .0047
20                25                           .1285       .0988      .0192      .0140
20                50                           .2302       .1856      .0619      .0485
20                100                          .4285       .3406      .2180      .1619
20                200                          .8217       .5820      .8109      .4614
40                0                            .0397       .0342      .0018      .0014
40                6.25                         .0515       .0413      .0029      .0020
40                12.5                         .0765       .0577      .0064      .0040
40                25                           .1293       .0991      .0181      .0120
40                50                           .2291       .1828      .0570      .0405
40                100                          .4277       .3361      .1999      .1352
40                200                          .8202       .5677      .7402      .3807
80                0                            .0396       .0343      .0017      .0013
80                6.25                         .0517       .0412      .0028      .0019
80                12.5                         .0769       .0577      .0062      .0037
80                25                           .1299       .0991      .0176      .0109
80                50                           .2299       .1838      .0551      .0373
80                100                          .4277       .3349      .1916      .1232
80                200                          .8224       .5669      .7098      .3504
Average Bias or MSE                            .2536       .1904      .1584      .0971
Figure 1: Plot of Estimated Regression Coefficient Variances (SE_i²) vs. Heterogeneity
Variances (θ_i²), σh=4
[Scatter plot. Vertical axis: SE-squared (0 to 9). Horizontal axis: Heterogeneity Variance (0 to 250).]
Figure 2: Plot of Estimated RCT Variances (SE_i²) vs. Heterogeneity Variances (θ_i²),
σh=200
[Scatter plot. Vertical axis: SE-squared (0 to 35). Horizontal axis: Heterogeneity Variance (0 to 40).]