Appendix S7. Trait Medusa approach

Methodological description
We used the ‘trait medusa’ algorithm implemented in the R package motmot (Thomas &
Freckleton 2012) to detect rate shifts, and selected the optimal number of shifts using an AICc
criterion.
Trait medusa is an expansion of the medusa [Modelling Evolutionary Diversification Using
Stepwise AIC; Alfaro et al. (2009)] framework, which identifies the position and magnitude of
shifts in the rate of lineage diversification. The algorithm we used here for trait evolution (the
‘tm1’ algorithm in the motmot R package) works as follows:
(1) Compute the likelihood of a single-rate Brownian motion (BM) model.
(2) Fit a second rate of evolution at each node of the phylogeny in turn (where the fitted rate is
applied to all branches descending from the node) and compute the likelihood of each two-rate model.
(3) Select the best-fitting two-rate model.
(4) Fit rate-heterogeneous models with two rate shifts, where one of the shifts must occur at the
node identified in step 3.
(5) Continue this procedure until the AICc no longer improves.
(6) The preferred model is the one with the lowest AICc.
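The six steps above can be sketched as a greedy forward search. This is a minimal illustration in Python, not the motmot implementation: the function names, the user-supplied `log_lik` callable (which should return the maximised log-likelihood for a given set of shift nodes) and the parameter counts (`base_params`, `params_per_shift`) are all assumptions made for the example.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def aicc(log_lik: float, k: int, n: int) -> float:
    """Small-sample corrected AIC for k free parameters and n observations."""
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def forward_shift_search(
    nodes: Iterable[int],
    log_lik: Callable[[FrozenSet[int]], float],
    n_tips: int,
    base_params: int = 2,       # assumption: root state + one BM rate
    params_per_shift: int = 1,  # assumption: one extra rate per shift
) -> Tuple[FrozenSet[int], float]:
    """Greedy forward selection of rate-shift nodes, stopped by AICc."""
    nodes = list(nodes)
    shifts: FrozenSet[int] = frozenset()
    # Step 1: single-rate model.
    best = aicc(log_lik(shifts), base_params, n_tips)
    while True:
        k = base_params + params_per_shift * (len(shifts) + 1)
        if n_tips - k - 1 <= 0:
            break  # AICc is undefined for this many parameters
        # Steps 2 and 4: keep earlier shifts, try one extra shift at each node.
        candidates = [shifts | {v} for v in nodes if v not in shifts]
        if not candidates:
            break
        # Step 3: best-fitting model with one additional shift.
        top = max(candidates, key=log_lik)
        score = aicc(log_lik(top), k, n_tips)
        # Step 5: stop as soon as AICc no longer improves.
        if score >= best:
            break
        shifts, best = top, score
    # Step 6: the retained model has the lowest AICc found.
    return shifts, best
```

In practice `log_lik` would wrap a phylogenetic likelihood routine; some implementations also count the shift location as a parameter, which would correspond to a larger `params_per_shift`.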
Results
Method one
We used this preferred model fit to modify the variance-covariance (VCV) matrix of the PGLS
(PGLSAICc_Multi-σ2). This procedure showed high type I error rates, sometimes higher than those of
a simple OLS (see Appendix S7: Fig. S1). This approach fails because the AICc criterion tends to
overestimate the number of shifts (Boettiger et al. 2012; Thomas & Freckleton 2012; Appendix S7:
Fig. S2) and thus misspecifies the VCV matrix, leading to increased type I error rates for PGLS.
Method two
We implemented the bootstrap procedure proposed by Boettiger et al. (2012) to correct for the
overfitting inherent to the medusa algorithm (Boettiger et al. 2012; Thomas & Freckleton 2012).
Using this method, we sequentially compared models with increasing numbers of fitted rates (i.e.
the one-rate model M1 versus the two-rate model M2, M2 versus the three-rate model M3, and so on).
We computed an observed likelihood ratio statistic between each pair of models Mn and Mn+1
(δobs_n/n+1; Eq. 4):
δn/n+1 = −2 (log Ln − log Ln+1) (Eq. 4)
where Ln is the likelihood of model Mn and larger values of δn/n+1 indicate more support for
model Mn+1. To compare models M1 and M2, we produced a null distribution of δ1/2 (δnull_1/2) under
the simpler model, M1, by simulating 120 traits evolving under this model (the number of simulated
traits was limited by computational constraints), fitting a heterogeneous BM model to each with
the ‘trait medusa’ algorithm, and computing the corresponding δ1/2 for each simulation. Finally, we
calculated the proportion of δnull_1/2 values that were less than or equal to δobs_1/2. This
proportion is the quantile of δobs_1/2 in the null distribution; one minus this proportion estimates
the probability of observing a likelihood ratio at least as large as δobs_1/2 given that the trait
evolved under the simpler model M1. If this proportion was less
than or equal to 95%, we retained the simpler model; otherwise we tested for a more complex model
by repeating the procedure, comparing M2 vs M3, M3 vs M4, and so on, until we could no longer reject
the simpler model of the pair (forward model selection procedure). Once we had identified the best
model, we used it to modify the VCV structure of the PGLS (PGLSBootstrap_Multi-σ2), as described
above. This method showed a reduced type I error rate compared with the AICc criterion and
PGLSglobal_λ from method one; however, type I error rates remained higher than 5% (see
Appendix S7: Fig. S1).
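One step of this bootstrap comparison can be sketched as follows. This is an illustrative Python outline, not the code used for the analyses: `simulate` (draws one trait under the simpler model) and `fit_pair` (returns the log-likelihoods of the simpler and the more complex model for a trait) are hypothetical placeholders that would wrap the actual simulation and trait medusa fits.

```python
import random
from typing import Callable, List, Sequence, Tuple


def likelihood_ratio(log_lik_simple: float, log_lik_complex: float) -> float:
    """Eq. 4: delta = -2 * (log L_n - log L_{n+1})."""
    return -2.0 * (log_lik_simple - log_lik_complex)


def bootstrap_null(
    simulate: Callable[[random.Random], object],          # hypothetical simulator
    fit_pair: Callable[[object], Tuple[float, float]],    # hypothetical model fits
    n_sim: int = 120,
    seed: int = 0,
) -> List[float]:
    """Null distribution of delta from traits simulated under the simpler model."""
    rng = random.Random(seed)
    deltas = []
    for _ in range(n_sim):
        trait = simulate(rng)
        ll_simple, ll_complex = fit_pair(trait)
        deltas.append(likelihood_ratio(ll_simple, ll_complex))
    return deltas


def null_quantile(delta_obs: float, delta_null: Sequence[float]) -> float:
    """Proportion of null deltas that are less than or equal to the observed delta."""
    return sum(d <= delta_obs for d in delta_null) / len(delta_null)


def reject_simpler(delta_obs: float, delta_null: Sequence[float],
                   level: float = 0.95) -> bool:
    """Reject the simpler model only if the observed delta exceeds the chosen quantile."""
    return null_quantile(delta_obs, delta_null) > level
```

The forward procedure would call `reject_simpler` for M1 vs M2, then M2 vs M3, and so on, stopping at the first pair for which it returns `False`.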
Method three (method for significance testing)
The trait medusa procedure may be expected to have a slightly inflated type I error rate because it
is a two-step procedure and each step has its own, independent error. Step one identifies the
evolutionary model of the OLS residuals, and step two fits the transformed VCV matrix
(PGLSBootstrap_Multi-σ2) in the PGLS procedure. Assuming correct type I error rates (5%) at each
step, we select the true model of evolution in 95% of cases, and PGLS does not detect a significant
correlation between X and Y in 95% of cases when B = 0.
PGLSBootstrap_Multi-σ2 will thus have a correct type I error rate in the 95% of cases where
the true model of evolution is detected but an inflated type I error rate in the remaining
5% of cases (Appendix S7: Figs S2-S3).
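The argument can be written as a weighted mean. The symbols below are introduced only for this illustration: p is the probability that step one selects the true model, and α_true and α_false are the type I error rates of the PGLS when the model is correctly and incorrectly identified, respectively.

```latex
\alpha_{\text{overall}}
  = p\,\alpha_{\text{true}} + (1 - p)\,\alpha_{\text{false}}
  = 0.95 \times 0.05 + 0.05\,\alpha_{\text{false}}
  = 0.0475 + 0.05\,\alpha_{\text{false}}
```

Under these assumptions the overall rate exceeds 0.05 whenever α_false > 0.05. For example, a purely illustrative value of α_false = 0.5 would give an overall rate of 0.0725.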
The issue of inflated type I error when using a two-step procedure (i.e., incorrect family-wise type
I error) was recognised by ter Braak et al. (2012) within an ecological context. Our third approach,
following ter Braak et al. (2012), considered the relationship between Y and X (PGLScombination)
and retained the higher of the two p-values obtained from PGLSglobal_λ and PGLSBootstrap_Multi-σ2.
This approach produced correct type I error rates in all cases (Appendix S7: Fig. S1).
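The combination rule amounts to taking the maximum of the two p-values. A minimal Python sketch follows; the function and argument names are hypothetical, and treating "significant" as a p-value strictly below the threshold is an assumption of the example.

```python
def combined_p_value(p_global_lambda: float, p_bootstrap_multi_sigma2: float) -> float:
    """Return the higher (more conservative) of the two PGLS p-values."""
    for p in (p_global_lambda, p_bootstrap_multi_sigma2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    return max(p_global_lambda, p_bootstrap_multi_sigma2)


def is_significant(p_global_lambda: float, p_bootstrap_multi_sigma2: float,
                   alpha: float = 0.05) -> bool:
    """The correlation is declared significant only if both tests are."""
    return combined_p_value(p_global_lambda, p_bootstrap_multi_sigma2) < alpha
```

Taking the maximum means that a correlation is reported only when both PGLS variants agree that it is significant.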
[Figure S1 image. Y-axis: type I error (0.0 to 0.6). X-axis: ratio of BM rates between clades, log(σ2 Clade 1 / σ2 Clade 2). Legend (Method): OLS, PGLSglobal_λ, PGLSTrueVCV, PGLSAICc_Multi-σ2, PGLSBootstrap_Multi-σ2, PGLScombination. Transformed trees below the X-axis are labelled Clade 1 and Clade 2.]
Figure S1. Type I error of the different trait medusa procedures. Comparison of type I error
rates for classical (OLS, PGLSglobal_λ and PGLSTrueVCV) and modified (PGLSAICc_Multi-σ2,
PGLSBootstrap_Multi-σ2 and PGLScombination) comparative methods as a function of evolutionary rate
heterogeneity between clades. We show here the results for the simplest model of rate
heterogeneity (i.e. a single rate shift) and plot below the X-axis the corresponding
transformed trees for a homogeneous rate (σ2 [Clade 1] = σ2 [Clade 2] = 1) and a heterogeneous
rate (σ2 [Clade 1] = 1; σ2 [Clade 2] = 0.01).
[Figure S2 image. Y-axis: % of simulations with a given number of fitted shifts. X-axis: ratio of σ2. Bars are grouped by method; the legend gives the number of fitted shifts (0 to 10); the number of true shifts is shown above the bars.]
Figure S2. Fitted numbers of rate shifts. The figure depicts the outputs of the trait medusa
algorithm fitted to the residuals of the OLS. We consider the case where X and Y follow an
identical but heterogeneous rate of trait evolution (σ2). Only a single shift is simulated here: traits
follow a BM model of evolution with two σ2 in the two descending clades (see Appendix S2). We
explored different ratios of σ2 between the two clades (from 0.001 to 1), which correspond to the
seven pairs of barplots. For each scenario we plot the percentage of simulations with a given number of
fitted rates (see colours in the legend; the true number of simulated shifts is given above the
barplots). We compare two methods: (1) stopping the trait medusa algorithm with the AICc
criterion (‘Best AICc’) or (2) bootstrapping the ‘Best AICc’ method (‘Bootstrapped AICc’), as
described in the methods.
[Figure S3 image. X-axis: ratio of BM rates between clades, log(σ2 Clade 1 / σ2 Clade 2). Transformed trees below the X-axis are labelled Clade 1 and Clade 2.]
Figure S3. Analysing the type I error rate of the bootstrapped version of PGLS. The figure
presents the type I error rate of different methods for the correlation between two traits
showing identical heterogeneous rates of trait evolution (σ2) as a function of the strength of
the heterogeneity. Here a single shift is simulated with separate σ2 in the two descending
clades (A and B, see phylogenetic trees below the X-axis). We plot below the X-axis the
corresponding transformed trees for a homogeneous signal (σ2 [Clade A] = σ2 [Clade B] = 1)
and a heterogeneous signal (σ2 [Clade A] = 1; σ2 [Clade B] = 0.01). We used an unbalanced
tree of 128 species, fitted the trait medusa algorithm to the OLS residuals and then used
the output of this model to build the VCV of a classical PGLS (λ fixed to one). We plot the type I
error rate of the bootstrapped model (‘PGLSBootstrap_Multi-σ2’; red curve). We separated the
simulations used for computing this red curve into two sets: (1) simulations that correctly
assigned the number of simulated shifts (‘PGLSBootstrapTRUE_Multi-σ2’) and (2) those that did not
(‘PGLSBootstrapFALSE_Multi-σ2’). We then plotted the corresponding type I error rates of these two
sets (blue curves). The overall type I error rate (red curve) is simply the weighted mean of the
rates of the two sets. The type I error rate is the percentage of simulations that detected a
significant correlation (at the 5% level) between the two traits, which is expected to be 5% for
a valid method (black horizontal line).
References
Alfaro, M.E., Santini, F., Brock, C., Alamillo, H., Dornburg, A., Rabosky, D.L., Carnevale,
G. & Harmon, L.J. (2009). Nine exceptional radiations plus high turnover explain
species diversity in jawed vertebrates. Proceedings of the National Academy of Sciences
of the United States of America, 106, 13410–13414.
Boettiger, C., Coop, G. & Ralph, P. (2012). Is your phylogeny informative? Measuring the
power of comparative methods. Evolution, 66, 2240–2251.
Ter Braak, C., Cormont, A. & Dray, S. (2012). Improved testing of species traits-environment
relationships in the fourth-corner problem. Ecology, 93, 1525–1526.
Thomas, G.H. & Freckleton, R.P. (2012). MOTMOT: models of trait macroevolution on
trees. Methods in Ecology and Evolution, 3, 145–151.