New Phytologist Supporting Information

Article title: Decomposing leaf mass into photosynthetic and structural components explains divergent
patterns of trait variation within and among plant species

Authors: Masatoshi Katabuchi, Kaoru Kitajima, S. Joseph Wright, Sunshine A. Van Bael, Jeanne L. D.
Osnas and Jeremy W. Lichstein

Article acceptance date: Click here to enter a date.

The following Supporting Information is available for this article:

Notes S1 Defining and understanding leaf trait mass- vs. area-dependence.
Notes S2 Implications of photosynthetic and structural leaf mass for trait mass- vs. area-dependence.
Notes S3 Alternative leaf lifespan models and realized net photosynthesis.
Notes S4 Model tests with simulated data.
Fig. S1 Boxplots comparing posterior means of the latent variable f (the fraction of total LMA comprised
by LMAp) across deciduous (D) and evergreen (E) leaves in GLOPNET and Panama, and across sites (wet
and dry) and canopy strata (sun and shade) in Panama.
Fig. S2 Boxplots comparing leaf mass per area (LMA), photosynthetic leaf mass per area (LMAp;
posterior means), and structural leaf mass per area (LMAs; posterior means) across sites (wet and dry)
and canopy strata (sun and shade) in all leaves in the Panama datasets.
Fig. S3 Measured traits related to photosynthesis and metabolism (nitrogen and phosphorus per-unit
leaf area; Narea and Parea) are positively correlated with LMA and with estimates (posterior means) of the
photosynthetic and structural LMA components (LMAp and LMAs, respectively) in the GLOPNET dataset.
Fig. S4 Boxplots comparing cellulose content (percent of total leaf mass) across sites (wet and dry) and
canopy strata (sun and shade) in Panama.
Fig. S5 Posterior means of LMAp vs. LMAs in the (a) GLOPNET and (b) Panama datasets.
Table S1 Summary of results for Potential LL Model (Eq. 5) and Optimal LL Models (Notes S3).
Table S2 Correlations between LMAp or LMAs and other traits derived from the Potential LL model (Eq.
5) fit to GLOPNET data.
Table S3 Correlations between LMAp or LMAs and other traits derived from the three alternative LL
models (Notes S3) fit to Panama data.
Table S4 Correlations between observed and predicted LL for the Potential LL Model (Eq. 5) and the
Optimal LL Model (Notes S3) fit to Panama data.

Notes S1 Defining and understanding leaf trait mass- vs. area-dependence

Let X represent the amount of a trait in an entire leaf. For example, X could be the photosynthetic
capacity of the entire leaf (units = moles CO2 fixed per unit time), the amount of nitrogen in the entire
leaf (units = grams of nitrogen), etc. X could depend on leaf mass (in which case leaves of greater mass
would have higher values of X than leaves of smaller mass), leaf area (in which case leaves of greater
area would have higher values of X than leaves of smaller area), or both mass and area. We define a trait
as being purely mass-dependent if X depends on leaf mass but not area, and we define a trait as being
purely area-dependent if X depends on leaf area but not mass. We expect most traits to depend on both
mass and area to some degree, but the extreme cases (pure mass- and area-dependence, which we
explain in more detail below) are useful for illustrating and understanding the key concepts related to
mass- and area-dependence.

Rather than X representing the amount of a trait in an entire leaf, we could instead define X as the
amount of a trait in a leaf sample of known mass and area. For example, we could define a trait as being
purely mass-dependent if the amount of the trait in a sample depended only on the mass of the sample. In
what follows, we refer to ‘leaves,’ but the concepts are equally applicable to leaf samples, whether they
be samples composed of multiple leaves, or partial-leaf samples that are representative of entire leaves.

First, consider a purely mass-dependent trait. In this case, if we were to compare leaves of equal
mass that varied in area, there would be zero correlation between X and area across leaves. More
generally, if we consider the case of pure mass-dependence with leaves of variable mass and area, and
we assume that X is proportional to mass, we have:

$X_i = C_m \times \mathrm{Mass}_i \times \varepsilon_i$ (Equation S1.1)
$X_i / \mathrm{Area}_i = C_m \times \mathrm{LMA}_i \times \varepsilon_i$ (Equation S1.2)
$X_i / \mathrm{Mass}_i = C_m \times \varepsilon_i$ (Equation S1.3)

where the subscript i refers to leaf i, $C_m$ is a constant, and $\varepsilon_i$ is a lognormally distributed error term,
which becomes an additive error term if we take the logarithm of both sides of the equations. The key
point from Eqs. S1.1-S1.3 is that if X is purely mass-dependent, then the area-normalized trait (X/area)
increases with LMA, and the mass-normalized trait (X/mass) is statistically independent of LMA (see
Osnas et al., 2013 for further explanation and discussion). The assumption of mass-proportionality (i.e.,
the Mass term in Eq. S1.1 has an exponent of one) leads to clean forms for Eqs. S1.2-S1.3, but our
conclusions regarding the dependence of X/area and X/mass on LMA would be unchanged if this
assumption were relaxed. That is, if Mass were raised to any positive exponent in Eq. S1.1, it would still
be the case that X/area increases with LMA, and X/mass is independent of LMA.

Next, consider a purely area-dependent trait. In this case, if we were to compare leaves of equal
area that varied in mass, there would be zero correlation between X and mass across leaves. More
generally, if we consider the case of pure area-dependence with leaves of variable mass and area, and
we assume that X is proportional to area, we have:

$X_i = C_a \times \mathrm{Area}_i \times \varepsilon_i$ (Equation S1.4)
$X_i / \mathrm{Area}_i = C_a \times \varepsilon_i$ (Equation S1.5)
$X_i / \mathrm{Mass}_i = C_a \times \mathrm{LMA}_i^{-1} \times \varepsilon_i$ (Equation S1.6)

where $C_a$ is a constant, and other details are as above. The key point from Eqs. S1.4-S1.6 is that if X is
purely area-dependent, then the area-normalized trait (X/area) is statistically independent of LMA, and
the mass-normalized trait (X/mass) decreases with LMA (Osnas et al., 2013). As explained above for
mass-dependent traits, the qualitative conclusions do not depend on the proportionality assumption.
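
The two limiting cases in Eqs. S1.1-S1.6 can be checked numerically. The sketch below is illustrative only: the function name, the sample size, and all distribution parameters are assumptions that do not come from the article. It draws leaf mass and area independently, builds X under pure mass- or pure area-proportionality, and regresses the log-transformed normalized traits on log(LMA).

```python
import numpy as np

def normalized_slopes(dependence, n=5000, seed=0):
    """Return OLS slopes of log(X/area) and log(X/mass) against log(LMA).

    dependence: "mass" for Eq. S1.1 or "area" for Eq. S1.4 (with C = 1).
    All distribution parameters below are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    log_mass = rng.normal(0.0, 0.5, n)  # log leaf mass
    log_area = rng.normal(0.0, 0.5, n)  # log leaf area, drawn independently
    log_eps = rng.normal(0.0, 0.1, n)   # log of the lognormal error term
    log_x = (log_mass if dependence == "mass" else log_area) + log_eps
    log_lma = log_mass - log_area       # LMA = mass / area
    slope_area = np.polyfit(log_lma, log_x - log_area, 1)[0]
    slope_mass = np.polyfit(log_lma, log_x - log_mass, 1)[0]
    return slope_area, slope_mass

mass_case = normalized_slopes("mass")
area_case = normalized_slopes("area")
print("mass-dependent: X/area slope %.2f, X/mass slope %.2f" % mass_case)
print("area-dependent: X/area slope %.2f, X/mass slope %.2f" % area_case)
```

Under these assumptions, the mass-dependent case should give slopes close to 1 (X/area) and 0 (X/mass), and the area-dependent case slopes close to 0 (X/area) and -1 (X/mass), matching the qualitative statements above.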

Notes S2 Implications of photosynthetic and structural leaf mass for trait mass- vs. area-dependence

Here, we use simulated data to explore how variation in photosynthetic and structural LMA components
(LMAp and LMAs, respectively) affects trait mass- and area-dependence (as defined in Notes S1). We
show that variation among leaves in LMAp leads to mass-dependence of photosynthetic capacity (Amax)
and related traits (e.g., Rdark and concentrations of N and P), whereas variation in LMAs leads to area-
dependence of these same traits. We assume that traits associated with photosynthetic capacity, when
expressed per-unit leaf area, increase with LMAp but are unaffected by LMAs. This is equivalent to
assuming that the total amount of a photosynthetic trait in a leaf (represented by the symbol X in Notes
S1) increases with the mass of photosynthetic tissue in a leaf but is unaffected by the mass of structural
tissue. Our analysis of simulated data focuses on a single trait, photosynthetic capacity (Amax), since this
trait is closely related to LMAp in our model (see main text).

2.1 Simulated datasets

We performed tests with simulated data to explore how the degree of trait mass-dependence is
affected by the variances of LMAp and LMAs, and the covariance between LMAp and LMAs. In the
simulated datasets, we used estimates from our analysis of the GLOPNET dataset (see Methods and
Results in the main text) to assign the following: Aarea = 0.23 × LMAp × exp(a), where a is normally
distributed with mean 0 and standard deviation 0.5; median value of LMAp = 70 g m⁻²; and median value
of LMAs = 98 g m⁻². To evaluate how variance and covariance of LMAp and LMAs affect estimates of
mass-dependence, we considered all combinations of the following: standard deviations of log-scale
LMAp and LMAs from 0.1 to 1.0 in increments of 0.1 (which determine how variable LMAp and LMAs are
among leaves in the simulated sample), and correlation coefficients between log-scale LMAp and LMAs
from –0.4 to 0.4 in increments of 0.1 (which determine how LMAp and LMAs covary among leaves in
the simulated sample). In total, we estimated mass-dependence for 900 parameter combinations (10
values of LMAp variance × 10 values of LMAs variance × 9 correlation coefficients). For each combination
of parameters, we simulated 100 replicates. Each replicate had a sample size of 100 leaves, where the
values of LMAp and LMAs for each leaf were random draws from the distributions with the variances
and covariances described above.
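
One way to generate a single replicate of this design is sketched below. It is not the authors' code. It assumes that log(LMAp) and log(LMAs) follow a bivariate normal distribution (so the log-scale means equal the logs of the stated medians) and that total LMA is the sum of the two components; the function and variable names are hypothetical.

```python
import itertools
import numpy as np

MEDIAN_LMAP = 70.0  # g m^-2, from the text
MEDIAN_LMAS = 98.0  # g m^-2, from the text

def simulate_dataset(sd_p, sd_s, rho, n=100, seed=0):
    """Simulate n leaves; returns (LMAp, LMAs, LMA, Aarea) arrays."""
    rng = np.random.default_rng(seed)
    # Assumption: log(LMAp) and log(LMAs) are bivariate normal, so the
    # log-scale means equal the logs of the stated medians.
    mean = [np.log(MEDIAN_LMAP), np.log(MEDIAN_LMAS)]
    cov = [[sd_p ** 2, rho * sd_p * sd_s],
           [rho * sd_p * sd_s, sd_s ** 2]]
    log_p, log_s = rng.multivariate_normal(mean, cov, size=n).T
    lmap, lmas = np.exp(log_p), np.exp(log_s)
    a = rng.normal(0.0, 0.5, n)      # log-scale noise in Aarea
    aarea = 0.23 * lmap * np.exp(a)  # Aarea = 0.23 x LMAp x exp(a)
    return lmap, lmas, lmap + lmas, aarea

sds = np.linspace(0.1, 1.0, 10)    # log-scale SDs of LMAp and LMAs
rhos = np.linspace(-0.4, 0.4, 9)   # log-scale correlation coefficients
grid = list(itertools.product(sds, sds, rhos))  # 10 x 10 x 9 combinations
lmap, lmas, lma, aarea = simulate_dataset(sd_p=0.5, sd_s=0.5, rho=0.2)
```

Looping `simulate_dataset` over `grid` with 100 different seeds per combination would reproduce the size of the design described above (900 combinations × 100 replicates).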

Of the 90,000 simulated datasets, 7839 (~8.7%) showed negative covariance between LMA and
either LMAp or LMAs. We excluded these simulated datasets from our subsequent analysis because one
goal of our analysis was to attribute total LMA variance to variance in LMAp vs. LMAs (see main text
Eq. 15), which is not straightforward when one of the covariances is negative. Furthermore, our
empirical results based on the GLOPNET and Panama datasets suggested that covariances between LMA
and both LMAp and LMAs are in reality positive, so excluding negative covariances from our analysis
should not bias our inferences with respect to real datasets.
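
The reason a negative covariance complicates the attribution can be seen from a standard identity. The lines below assume only that total LMA is the sum of LMAp and LMAs; they are a generic decomposition and are not claimed to reproduce the exact form of main-text Eq. 15.

```latex
\begin{align}
\mathrm{Var}(\mathrm{LMA})
  &= \mathrm{Cov}(\mathrm{LMA},\ \mathrm{LMAp} + \mathrm{LMAs}) \\
  &= \mathrm{Cov}(\mathrm{LMA}, \mathrm{LMAp}) + \mathrm{Cov}(\mathrm{LMA}, \mathrm{LMAs}), \\
\mathrm{Cov}(\mathrm{LMA}, \mathrm{LMAp})
  &= \mathrm{Var}(\mathrm{LMAp}) + \mathrm{Cov}(\mathrm{LMAp}, \mathrm{LMAs}).
\end{align}
```

When both covariance terms are non-negative, each can be read as a share of total LMA variance between 0% and 100%. If Cov(LMAp, LMAs) is sufficiently negative, one term becomes negative and the other exceeds Var(LMA), so the shares no longer have that interpretation.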

2.2 Quantifying trait mass-dependence

For each of the simulated datasets described above, we quantified mass-dependence of Amax using the
following equation, equivalent to “Model-LN” in Osnas et al. (2013; see the “Models” section of their
Supplementary Materials):

$\log(A_{\mathrm{area}\,i}) = a + b \times \log(\mathrm{LMA}_i) + \varepsilon_i$ (Equation S2.1)

where Aarea i is area-normalized Amax for leaf i, LMAi is total LMA for leaf i, εi is a normally distributed
error term, and the parameter b quantifies mass- vs. area-dependence. Parameter b equals one for a
purely mass-proportional trait (Eq. S1.2) and zero for a purely area-proportional trait (Eq. S1.5). More
generally, b is close to zero for traits that are mostly area-dependent, and b increases with increasing
mass-dependence. Empirical estimates of b tend to range between 0 and 1 (Osnas et al. in review), and
we interpret values of b > 0.5 as indicating greater mass- than area-dependence. We used ordinary least
squares regression to fit Eq. S2.1 to each simulated dataset. Osnas et al. (2013) showed that this method
yields results similar to those of other approaches to quantifying mass-dependence.
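
Fitting Eq. S2.1 by ordinary least squares amounts to a straight-line regression on log-transformed values. The sketch below uses hypothetical names (`estimate_b`) and a noise-free check dataset constructed so that the true slope is known; it is not the authors' implementation.

```python
import numpy as np

def estimate_b(aarea, lma):
    """Fit log(Aarea) = a + b * log(LMA) by OLS; return (a, b)."""
    b, a = np.polyfit(np.log(lma), np.log(aarea), 1)
    return a, b

# Hypothetical noise-free check: Aarea = 2 * LMA^0.7 should recover b = 0.7.
lma = np.array([40.0, 80.0, 120.0, 200.0, 350.0])
aarea = 2.0 * lma ** 0.7
a_hat, b_hat = estimate_b(aarea, lma)
print(round(b_hat, 3))
```

The choice of logarithm base changes the intercept a but not the slope b, so the interpretation of b as a measure of mass-dependence does not depend on it.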

2.3 Results

Variation among leaves in LMAp led to mass-dependence of Amax (b > 0.5), whereas variation in LMAs led
to area-dependence of Amax (b near 0; Fig. S2.1a). That is, as total LMA variance became increasingly
dominated by LMAs (i.e., moving to the right along the x-axis of Fig. S2.1a), mass-dependence of Amax
decreased (or, equivalently, area-dependence increased). Intuitively, mass-dependence of Amax
decreases (and area-dependence increases) with LMAs variance because LMAs has no effect on area-
normalized Amax (Aarea). Thus, as LMAs variance increases, the photosynthetic capacity of entire leaves
becomes increasingly dependent on leaf area (rather than leaf mass), and Aarea becomes increasingly
independent of LMA. As expected, the percent variation of total LMA that is due to LMAs variance
increased with LMAs variance (Fig. S2.1b). Thus, weak mass-dependence corresponds to cases where
LMAs variance dominates total LMA variance (right side of x-axes in Figs. S2.1a and S2.1b), and strong
mass-dependence corresponds to cases where LMAp variance dominates total LMA variance (left side of
x-axes in Figs. S2.1a and S2.1b).

Covariance between LMAp and LMAs (which was positive but weak in our analyses of GLOPNET
and Panama data; see Fig. S6) also affected mass- vs. area-dependence. Empirical estimates of mass-
dependence (b) tend to range from 0 to 1 (Osnas et al. in review). Within this typical range (0 < b < 1),
positive covariance between LMAp and LMAs increases mass-dependence, whereas negative covariance
between LMAp and LMAs (which is inconsistent with our results for the GLOPNET and Panama data)
increases area-dependence (Fig. S2.1a). Our simulations show that if LMAp has greater variance than
LMAs, then b tends to be greater than 1, although such values are larger than most empirical estimates
of b, which tend to range between 0 (pure area-proportionality) and 1 (pure mass-proportionality;
Osnas et al. in review).

Fig. S2.1 Effects of the LMAs:LMAp variance ratio in simulated datasets on (a) the mass-dependence of
photosynthetic capacity (Amax) and (b) the percent of variance in leaf mass per area (LMA) explained by
LMAs variance, where LMAs and LMAp are structural and photosynthetic components, respectively, of
LMA (see Notes S2 for details). Mass-dependence (y-axis in the left panel) is estimated as parameter b in
Eq. S2.1; thus mass-dependence of Amax decreases (and area-dependence increases) as the LMAs:LMAp
variance ratio increases. Methods for generating simulated data and for estimating mass-dependence
are described in Notes S2. Methods to quantify the percent variation of LMA due to LMAs (right panel)
are described in the main text (see Eq. 15). Note that empirical estimates of mass-dependence (y-axis in
the left panel) tend to range between 0 (pure area-proportionality; Eq. S1.5) and 1 (pure mass-
proportionality; Eq. S1.2). Thus, we interpret values less than 0.5 as indicating primarily area-dependent
traits, and values greater than 0.5 as indicating primarily mass-dependent traits.

Notes S3 Alternative leaf lifespan models and realized net photosynthesis

Here, we describe three alternative LL models, and how we derive an approximation for the realized rate of
annualized (or time-averaged) net photosynthesis (which is required for the Optimal LL Models) based
on measured Amax and Rdark.

3.1 Alternative models of LL

The first model, which we refer to as the “Potential LL Model”, assumes that LL is proportional to the amount
of LMAs:

$\mathrm{E}[\mathrm{LL}_i] = \beta \, \mathrm{LMAs}_i$ (Potential LL Model)

where E[] is the expected value of the variable in brackets, LLi is the leaf lifespan of leaf i, and β is a
constant. We fitted the Potential LL Model to both the GLOPNET and Panama datasets.

The second model is based on optimal LL theory (Kikuzawa, 1991). Details on the Kikuzawa model
are described in the main text. According to optimal LL theory, LL should decrease with the realized rate
of net photosynthesis, because frequently replacing old leaves with new leaves ensures high carbon gain
per unit time for leaves that have a high initial net photosynthetic rate (Kikuzawa, 1991). We call this
second model the “Optimal LL Model”. We scaled the maximum net photosynthetic rate to approximate the
realized net photosynthetic rate:

$\mathrm{E}[\mathrm{LL}_i] = \beta \sqrt{\mathrm{LMAs}_i \, \mathrm{LMA}_i / (\theta_L A_{\mathrm{area}\,i} - R_{\mathrm{area}\,i})}$ (Optimal LL Model)

where θL is the scaling parameter that describes the realized net photosynthetic rate
(Kikuzawa et al., 2004), with θL = 1 for sun leaves and 0 < θL < 1 for shade leaves, Aarea i is the net
photosynthetic rate per unit area of leaf i, and Rarea i is the dark respiration rate per unit area of leaf i. In
this model, the quantity θL Aarea i – Rarea i is proportional to the mean daily net photosynthetic rate. We
fitted the Optimal LL Model to the Panama dataset.

The third model includes an additional parameter to account for site effects (φ). Because we
combined data from wet and dry sites in a single analysis, it is possible that site differences in LL could
confound our results.

$\mathrm{E}[\mathrm{LL}_i] = \beta \varphi \sqrt{\mathrm{LMAs}_i \, \mathrm{LMA}_i / (\theta_L A_{\mathrm{area}\,i} - R_{\mathrm{area}\,i})}$ (Optimal LL Model with site effect)

We report the Optimal LL Model in the main text because it had a better WAIC (Watanabe, 2010;
Gelman et al., 2014) value than the Potential LL Model. Although the model with site effects had the best WAIC
value, we report it only in the Supporting Information because it includes non-mechanistic
processes relevant to variance in LL (i.e., site effects). See Table S1 for the full results.
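
The expected-value functions of the three LL models can be written compactly as below. This is a sketch under stated assumptions: the square root is taken over the whole ratio LMAs × LMA / (θL Aarea − Rarea), which is how the extracted equations are read here; the site effect φ is treated as a simple multiplier; and the function names and numerical inputs are hypothetical.

```python
import math

def potential_ll(lmas, beta):
    """Potential LL Model: E[LL] = beta * LMAs."""
    return beta * lmas

def optimal_ll(lmas, lma, aarea, rarea, beta, theta_l=1.0, phi=1.0):
    """Optimal LL Model; phi != 1 gives the variant with a site effect.

    Assumption: the square root covers the whole ratio
    LMAs * LMA / (theta_l * Aarea - Rarea).
    """
    net = theta_l * aarea - rarea  # proportional to realized net photosynthesis
    if net <= 0:
        raise ValueError("realized net photosynthesis must be positive")
    return beta * phi * math.sqrt(lmas * lma / net)

# Hypothetical leaf: same traits evaluated as a sun leaf and as a shade leaf.
sun = optimal_ll(lmas=50.0, lma=100.0, aarea=12.0, rarea=2.0, beta=1.0)
shade = optimal_ll(lmas=50.0, lma=100.0, aarea=12.0, rarea=2.0, beta=1.0,
                   theta_l=0.5)
```

With identical traits, the shade leaf (θL < 1) has lower realized net photosynthesis and therefore a longer expected LL than the sun leaf, consistent with the statement that LL decreases with the realized rate of net photosynthesis.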

3.2 Realized net photosynthesis

The realized rate of net photosynthesis reflects multiple factors, including down-regulation of
photosynthetic capacity and respiration of chronically shaded leaves (Chen et al., 2014), as well as the
amount of light incident upon a leaf at a given time (Kikuzawa et al., 2004). The former (down-
regulation) is already reflected in observed values of Amax (which is measured under saturating light
conditions), but the latter (realized light availability) is not. Therefore, we introduced a scaling
parameter (θL; with 0 < θL < 1) to account for shading effects when approximating realized net
photosynthesis in the Optimal LL Models. We show that realized net photosynthesis per unit time is
approximately proportional to θL Aarea – Rarea, where Aarea and Rarea are net photosynthetic capacity (Amax)
and dark respiration (Rdark) per-unit leaf area per hour, respectively. For simplicity, we assume that (1)
day and night are each 12 h long (which is approximately correct for the tropical forest sites in Panama
where we applied the Optimal LL Models), (2) net photosynthetic rate is constant during the day, and (3)
respiration rate is constant during the night. Then, we have:

Net photosynthesis during 12-hour daytime = (12 h) × θL Aarea
Respiration during 12-hour nighttime = (12 h) × Rarea

The scaling parameter θL accounts for the effects of light availability on net photosynthesis, with θL = 1
for sun leaves, and θL < 1 for shade leaves. Subtracting nighttime respiration from daytime net
photosynthesis yields the approximation that 24-hour net photosynthesis is proportional to θL Aarea – Rarea.
Above, we express Aarea and Rarea per hour, but these are easily translated to other time periods.
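
Written out, the 24-hour balance under the three simplifying assumptions is as follows, where the symbol $P_{24\,\mathrm{h}}$ is introduced here only for illustration and does not appear in the article:

```latex
\begin{align}
P_{24\,\mathrm{h}}
  &= (12\ \mathrm{h}) \times \theta_L A_{\mathrm{area}} - (12\ \mathrm{h}) \times R_{\mathrm{area}} \\
  &= (12\ \mathrm{h}) \times (\theta_L A_{\mathrm{area}} - R_{\mathrm{area}})
  \;\propto\; \theta_L A_{\mathrm{area}} - R_{\mathrm{area}}.
\end{align}
```

Because day and night are assumed to be of equal length, the 12 h factor is common to both terms and is absorbed into the proportionality constant.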

Although Rdark may not accurately measure nighttime respiration because of
complicating processes such as nighttime phloem loading (Azcón-Bieto & Osmond, 1983; Azcón-Bieto
et al., 1983), it is not easy to implement scaling parameters for nighttime respiration (RN). Models
that include both θL and RN are difficult to fit because of convergence problems. Additionally, to implement RN in the
models, RN would have to be either the same for sun and shade leaves, or fixed at 1 for one canopy stratum (sun or
shade) and estimated (RN > 0) for the other, but there is no clear biological justification for either choice. In contrast, θL has a clear
biological motivation, as explained above, and thus we include only θL in our models.

Notes S4 Model tests with simulated data

We performed tests with simulated data to evaluate the performance of our modeling approach with
different prior assumptions for the latent variables that determine LMAp and LMAs (fi in Eqs. 2-3). If the
modeling approach (with a given set of priors) is robust, then posterior parameter estimates
(including those for latent variables) should closely match the assumed (“true”) parameter values used
to create the simulated datasets. Below, we describe (i) the different parameter combinations used to
generate the simulated datasets, (ii) alternative priors that we explored for the latent variables fi, and
(iii) results of the tests with simulated data, including randomized versions of the simulated data
(analogous to the LMA-randomization tests presented in the main text). In general, these tests with
simulated datasets suggest that the results presented in the main text are robust. In particular, the
simulation tests suggest that our main results depend primarily on underlying patterns in the GLOPNET
and Panama datasets that we analyzed, as opposed to being pre-determined by our model assumptions.

4.1 Methods

4.1.1 Simulated datasets

Four distinct combinations of parameter values were used to create simulated datasets with different
properties (Figs. S4.1-S4.4, Tables S4.1-S4.2):

• Simulated datasets GL1 and GL2, based on GLOPNET data. We used parameter values estimated
from the GLOPNET analysis described in the main text (Potential LL Model, Eq. 5) to create two simulated
datasets (GL1 and GL2) with different LMA distributions for evergreen and deciduous leaves. The
correlation between LL and LMA is strong while the correlation between LMA and Aarea is weak (Figs.
S4.1-S4.2). GL1 has the same LMA distribution as the observed GLOPNET data (high LMA for
evergreen leaves, and low LMA for deciduous leaves; Fig. S4.1 and Table S4.2), whereas GL2 assigns
the observed evergreen LMA distribution to deciduous leaves, and vice versa (Fig. S4.2 and Table
S4.2). We analyzed GL1 and GL2 using the same methods applied to the GLOPNET data in the main
text, which allowed us to assess if inferred differences between evergreen and deciduous leaves (Fig.
4) merely reflect model assumptions (in which case GL1 and GL2 should yield inferences similar to
those shown in Fig. 4), or if inferred differences reflect meaningful patterns in the data (in which case GL1,
but not GL2, should yield results similar to Fig. 4).
• Simulated dataset PA, based on Panama data. We used parameter values estimated from the
Panama analysis with the lowest WAIC (Optimal LL Model with site effects, Notes S3) to create a
simulated dataset with properties similar to the observed Panama data. But in the case of the
simulated data, the true values of the parameters (including LMAp and LMAs for each leaf) are
known, which allows us to test the modeling approach. In contrast to the GLOPNET dataset, the
correlation between LMA and Aarea is strong while the correlation between LMA and LL is weak in this
simulated dataset (Fig. S4.3), as in the observed Panama data (Figs. 2a and 2g).
• Simulated dataset WC (“weak correlation”). We created one simulated dataset with weak
correlations among LMAp, LMAs and other traits (Fig. S4.4). This dataset allows us to assess if our
modeling approach is prone to inducing strong correlations (e.g., between LMAp and Aarea, or
between LMAs and LL) when the true correlations are weak.

For each of the above four parameter combinations, we analyzed a single simulated dataset, as well as
nine additional datasets with randomized LMA values (see details below). A more rigorous approach
would involve analyzing many replicate datasets (e.g., 1000), but this would be very computationally
expensive because each of our analyses requires fitting a complex Bayesian model with many latent
variables. Although our tests with simulated data are either non-replicated (in the case of the original,
non-randomized, simulated data) or have limited replication (in the case of the LMA-randomized
simulated data), we consider the test results presented below to be sufficiently clear to provide
meaningful guidance on the interpretation of the results presented in the main text.

4.1.2 Model forms applied to simulated data

We fit each simulated dataset with the model form used to generate the data; i.e., we applied the
Potential LL Model (Eq. 1) to the simulated GL1, GL2, and WC datasets, and the Optimal LL Model with
site effects (Notes S3) to the simulated PA dataset.

4.1.3 Priors

For leaf i of group j (where j is evergreen or deciduous in the GL1 and GL2 datasets, and j is sun/dry-site,
shade/dry-site, sun/wet-site, or shade/wet-site in the PA dataset), we considered two different sets of
priors for the latent variables fi:

(i) $f_i \sim \mathrm{Uniform}(0, 1)$ (Non-hierarchical model)
(ii) $\mathrm{logit}^{-1}(f_i) \sim \mathrm{Normal}(\mu_0 + r_j, \sigma)$ (Hierarchical model)

where μ0 is constant across all leaves, rj is a group effect (a vector with constant value within each group
j), and σ is the variance of fi. In the hierarchical model, μ0, rj, and σ are free parameters. The first set of
priors (uniform, non-hierarchical model) is used in the results reported in the main text and in Figs. S5.1-
S5.6 because analyses of simulated data (see below) suggested that these priors yielded more robust
inferences compared to the hierarchical model.
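
To make the difference between the two prior structures concrete, the sketch below draws latent fractions from each. It rests on assumptions that go beyond the text: the hierarchical prior is taken to place a normal distribution on a transformed (logit) scale of fi, with the draws mapped back to the interval (0, 1), and σ is treated as a standard deviation. All function names and parameter values are hypothetical.

```python
import numpy as np

def sample_f_uniform(n, rng):
    """Non-hierarchical prior (i): f_i ~ Uniform(0, 1)."""
    return rng.uniform(0.0, 1.0, n)

def sample_f_hierarchical(group_index, mu0, r, sigma, rng):
    """Hierarchical prior (ii), under the assumptions stated above."""
    group_index = np.asarray(group_index)
    # Group-specific mean mu0 + r_j on the transformed (logit) scale.
    z = rng.normal(mu0 + np.asarray(r)[group_index], sigma)
    return 1.0 / (1.0 + np.exp(-z))  # map back to the interval (0, 1)

rng = np.random.default_rng(1)
groups = np.array([0, 0, 1, 1, 1])  # e.g. 0 = evergreen, 1 = deciduous
f_flat = sample_f_uniform(len(groups), rng)
f_hier = sample_f_hierarchical(groups, mu0=0.0, r=[0.5, -0.5], sigma=1.0,
                               rng=rng)
```

Under the uniform prior, every leaf's fi is free to take any value in (0, 1) independently of its group, whereas the hierarchical prior pulls the fi values of leaves in the same group toward a shared group-level mean.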

4.1.4 Parameter estimation

Models were fit and convergence of posterior distributions was assessed as described in the main text.

4.1.5 Randomized LMA datasets

As with the analyses described in the main text, we created randomized LMA datasets corresponding to
each of the four simulated datasets described above, and we fit the models to both the original (non-
randomized) simulated datasets, as well as the randomized datasets (see details below). The purpose of
this analysis was to assess the validity of the randomization approach in the main text. For example,
analyses of non-randomized data should yield correlations between LMAp and Aarea (or between LMAs
and LL) that are similar to the true correlations in the original simulated data, whereas analyses of
randomized data should yield weaker correlations than in the original simulated data.

We generated nine randomized LMA datasets for each of the four simulated datasets, and we fit
the same models (Potential LL Model or Optimal LL Model with site effects) to the randomized and non-
randomized data. For the simulated GL1, GL2, and WC datasets, we randomized LMA values across
leaves, while maintaining the original (non-randomized) data for Aarea, Rarea and LL. For the simulated PA
dataset, LMA values were shuffled within sites (wet and dry) and across canopy strata (sun and shade).
Thus, when randomizing the PA dataset, we removed the effect of strata (sun vs. shade) on LMA, while
maintaining site differences in LMA as well as the observed covariances among Aarea, Rarea and LL. To
compare model fit between the non-randomized dataset and the randomized LMA datasets, we calculated a
standardized effect size (SES) for the correlation between focal variables and translated it to a P-value as
described in the main text.
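
The shuffling step and the SES calculation can be sketched as follows. The SES definition used here, (observed − mean of null values) / standard deviation of null values, is the conventional one and is an assumption; the exact definition and the translation to a P-value are given in the main text and are not reproduced. Function names and numbers are hypothetical.

```python
import numpy as np

def shuffle_within_groups(values, groups, rng):
    """Permute values among leaves that share the same group label."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = values.copy()
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        out[idx] = rng.permutation(values[idx])
    return out

def standardized_effect_size(observed, null_values):
    """SES = (observed - mean(null)) / sd(null); an assumed definition."""
    null_values = np.asarray(null_values, dtype=float)
    return (observed - null_values.mean()) / null_values.std(ddof=1)

rng = np.random.default_rng(0)
lma = np.array([60.0, 75.0, 90.0, 120.0, 150.0, 180.0])
site = np.array(["wet", "wet", "wet", "dry", "dry", "dry"])
# PA-style randomization: shuffle within sites, ignoring canopy strata.
lma_shuffled = shuffle_within_groups(lma, site, rng)
# Hypothetical correlations: one observed value and three null values.
ses = standardized_effect_size(0.8, [0.1, 0.2, 0.3])
```

For the GL1, GL2, and WC datasets, passing a single shared group label for all leaves reduces `shuffle_within_groups` to an unrestricted shuffle across leaves.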

4.2 Results

Non-informative flat priors for latent variables (non-hierarchical models) performed better than the
hierarchical models. True values of model structural parameters (i.e., all parameters except for latent
variables) were usually located within the central part of their posterior distributions for models with
either non-hierarchical or hierarchical priors on latent variables (Figs. S4.5-S4.7). However, the match
between posterior estimates and true values tended to be better for the non-hierarchical models (Figs.
S4.5-S4.6), and in one case (WC dataset), the hierarchical model did not converge.

For latent variables (which are relevant to many results and conclusions presented in the main
text), the non-hierarchical model yielded posterior means that were unbiased and strongly correlated
with the true values for the GL1 and PA datasets, whereas the hierarchical models had weaker
relationships with the true values and/or systematic bias (Fig. S4.8). For the simulated WC dataset, the
non-hierarchical model performed better than the hierarchical model, although neither set of priors
yielded posterior means that were well-correlated with the true values of latent variables (Fig. S4.8).

LMA randomization is an effective way to identify meaningful correlations between LMAp and LMAs
and other traits. When the true correlations between LMAp and LMAs and other traits are strong (as in
the simulated GL and PA datasets), the estimated correlations (derived from posterior distributions of
the latent variables fi; Eqs. 2-3) are very similar to the true correlations and significantly different from
correlations derived from datasets in which LMA was randomized among leaves (see GL1 and PA in Fig.
S4.9). In contrast, when the true correlations between LMAp and LMAs and other traits are weak (as in
the simulated WC dataset), the estimated correlations and randomization-based correlations are very
similar to each other and significantly greater than the true correlations (see WC in Fig. S4.9). These
results suggest that (i) if the true correlations between LMAp and LMAs and other traits are weak, then
model assumptions (e.g., the degrees of freedom provided by latent variables) can lead to over-
estimates of these correlations; but (ii) such artefactual results can be diagnosed using the LMA-
randomization tests presented here and in the main text. Specifically, if results obtained from the
original data (non-randomized) are significantly different from results obtained from LMA-randomized
data, then the tests presented here (Fig. S4.9) suggest that the estimated correlations are similar to the
true correlations.

Inferred differences in LMAp and LMAs among groups (e.g., evergreen vs. deciduous leaves) and
inferred contributions of LMAp and LMAs to total LMA variance depend more on the data than on
model assumptions. Estimated distributions of LMAp and LMAs for evergreen and deciduous leaves
were very similar to the true distributions for simulated GL1 data (which has similar properties to the
original GLOPNET dataset), as well as for simulated GL2 data (where LMA distributions for evergreen
and deciduous leaves were swapped) (Figs. S4.10-S4.11). The estimated variance contributions of LMAp
and LMAs to total LMA variation were very similar to the true values for GL1 and roughly similar to the
true values for GL2 (Fig. S4.11). Despite the quantitative mismatch between estimated and true values
for GL2 variance partitioning (Fig. S4.12), the estimates are qualitatively correct in identifying LMAp as
the dominant variance component in GL2. These test results suggest that, using our modeling approach,
estimated differences in LMAp and LMAs between groups of leaves and estimates of LMA variance
components are at least qualitatively robust, and depend more strongly on the properties of the dataset
than on model assumptions.

Fig. S4.1 Simulated GL1 dataset used to test the Potential LL Model analysis of GLOPNET data. The GL1
dataset has similar properties to the original GLOPNET dataset. But for GL1 (unlike GLOPNET), the values
of all model parameters (including the latent variables fi that determine LMAp and LMAs for each leaf)
are known.

Fig. S4.2 Simulated GL2 dataset used to test the Potential LL Model analysis of GLOPNET data. GL2 is
similar to GL1, except that in GL2, the LMA distributions for evergreen and deciduous leaves are
swapped (Table S4.2). These non-realistic LMA distributions were used to test the ability of our
modeling approach to correctly estimate differences in LMAp and LMAs among groups of leaves.

Fig. S4.3 Simulated PA dataset used to test the Optimal LL Model analysis of Panama data. The PA
dataset has similar properties to the original Panama dataset. But for PA (unlike Panama data), the
values of all model parameters (including the latent variables fi that determine LMAp and LMAs for each
leaf) are known.

Fig. S4.4 The weak correlation (WC) dataset used to test the Potential LL Model. Relationships among
traits are much weaker in the WC dataset compared to other simulated datasets (Figs. S4.1-S4.3). The
WC dataset was designed to evaluate the performance of our modeling approach in cases where the
hypothesized relationships between LMAp and LMAs and other traits were weak.

Fig. S4.5 Posterior distributions for model parameters in the simulated GL1 dataset. Vertical red lines
indicate the assumed (“true”) values used to create the simulated data. The non-hierarchical model
assumes flat (uniform) priors for each latent variable fi (Eq. 5), whereas the hierarchical model assumes
that each latent variable comes from a group-specific normal distribution, where groups are either
evergreen, deciduous, or unclassified leaves.

Fig. S4.6 Posterior distributions for model parameters in the simulated PA dataset. Vertical red lines
indicate the assumed (“true”) values used to create the simulated data. The non-hierarchical model
assumes flat (uniform) priors for each latent variable fi (Eq. 5), whereas the hierarchical model assumes
that each latent variable comes from a group-specific normal distribution, where groups are the four
combinations of two sites (wet and dry) and two canopy strata (sun and shade).

Fig. S4.7 Posterior distributions for model parameters in the simulated WC dataset. Vertical red lines
indicate the assumed (“true”) values used to create the simulated data. Posteriors for the model with
hierarchical priors did not converge, so only results for the non-hierarchical model (with uniform priors
for each latent variable) are shown.

Fig. S4.8 Posterior means for the latent variables fi (which determine LMAp and LMAs for each leaf; Eq.
5) plotted against their assumed (“true”) values in three simulated datasets. The dashed line indicates
the 1:1 relationship. The two rows correspond to results for two different sets of prior assumptions:
hierarchical priors (see legends for Figs. S4.5-S4.7 for details) and non-hierarchical uniform priors. The
hierarchical model for the simulated WC dataset did not converge, so no results are shown.

Fig. S4.9 Comparison of correlation coefficients estimated from non-randomized simulated data (‘Est’),
those obtained from analyses of LMA-randomized data (‘Rand’), and the assumed (‘True’) correlations
used to generate the simulated data. Correlations are shown between LMAp and Aarea, between LMAp
and Rarea, and between LMAs and LL. Red solid lines show the true correlations. Blue dashed lines show
the posterior means of the estimated (‘Est’) correlations obtained from non-randomized simulated data.
Black dashed lines show the posterior means of the correlations obtained from LMA-randomized
(‘Rand’) simulated data. Posterior distributions (estimated from 1000 random posterior samples) are
also shown for each correlation. P < 0.05 indicates a significantly stronger correlation in models fit to
non-randomized data than in models fit to randomized data (P-values based on the standardized effect
size, SES; see Methods in the main text for details).
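
The exact SES calculation is given in the main text Methods. As a rough sketch of the general idea only, the example below assumes that SES is the observed correlation minus the mean of the randomized correlations, divided by their standard deviation, and that a one-sided P-value is read from a standard normal distribution. Both assumptions, the function name, and the numerical values are hypothetical.

```python
from statistics import NormalDist, fmean, stdev


def ses_p_value(r_obs, r_rand):
    """Standardized effect size and one-sided P-value (illustrative).

    Assumes SES = (r_obs - mean(r_rand)) / sd(r_rand) and a standard
    normal reference distribution; the study's definition may differ.
    """
    ses = (r_obs - fmean(r_rand)) / stdev(r_rand)
    p_value = 1.0 - NormalDist().cdf(ses)
    return ses, p_value


# Hypothetical correlations from ten LMA-randomized datasets.
r_rand = [0.48, 0.50, 0.47, 0.49, 0.51, 0.46, 0.50, 0.48, 0.49, 0.47]
ses, p_value = ses_p_value(0.67, r_rand)
print(f"SES = {ses:.1f}, significant at 0.05: {p_value < 0.05}")
```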
Fig. S4.10 Boxplots of assumed (“true”) values of LMAp and LMAs in the simulated GL1 dataset, and the
corresponding posterior means for LMAp and LMAs estimated from the Potential LL Model applied to
the simulated data. The results show that for the GL1 dataset (which has similar properties to the
original GLOPNET data), the model accurately recovers the assumed distribution of LMAp and LMAs for
deciduous and evergreen leaves.

Fig. S4.11 Boxplots of assumed (“true”) values of LMAp and LMAs in the simulated GL2 dataset, and the
corresponding posterior means for LMAp and LMAs estimated from the Potential LL Model applied to
the simulated data. The results show that for the GL2 dataset (in which LMA distributions for deciduous
and evergreen leaves were swapped relative to the original GLOPNET data), the model accurately
recovers the assumed distribution of LMAp and LMAs for deciduous and evergreen leaves.

Fig. S4.12 True and estimated proportions of total LMA variance contributed by variances in LMAp and
LMAs for the simulated GL1 and GL2 datasets. ‘True’ values refer to the assumed values in the simulated
datasets, whereas ‘Estimated’ values are posterior means from the Potential LL Model applied to the
simulated datasets. The results show that the modeling approach yields qualitatively correct inferences
(e.g., LMAs variance is dominant in GL1, whereas LMAp variance is dominant in GL2), although
quantitative estimates may not be robust (e.g., mismatch between True and Estimated values for GL2).

Table S4.1 Parameter values used to generate simulated datasets. The GL1 and GL2 parameter values were estimated from the Potential LL
Model (Eq. 5) applied to the GLOPNET global dataset, as described in the main text. GL1 and GL2 share the same parameter values, but differ in
the assumed LMA distributions (see Table S4.2). The PA parameter values were estimated from the Optimal LL Model with site effects (Notes
S3) applied to the Panama data, as described in the main text. The WC (weak correlation) dataset was designed to have weak correlations
between LMAp and LMAs and other traits. Parameter values shown here were combined with LMA values (Table S4.2) to create the simulated
data shown in Figs. S4.1-S4.4. The three central columns give the parameter value in each simulated dataset.

| Parameter | GL1, GL2 | PA | WC | Description |
|---|---|---|---|---|
| α | 0.23 | 0.28 | 1 | Slope coefficient relating LMAp to Aarea and Rarea |
| β | 0.23 | 0.60 | 1 | Slope coefficient relating LMAs to LL |
| rp | 0.02 | 0.02 | 0.02 | Slope coefficient relating LMAp to Rarea |
| rs | 0.001 | 0.001 | 0.02 | Slope coefficient relating LMAs to Rarea |
| θL | NA | 0.5 | NA | Scaling parameter that accounts for shading and thus affects the realized net photosynthetic rate |
| θ | NA | 0.7 | NA | Parameter to account for site effects |
| ρ12 | 0 | 0.8 | 0 | Correlation coefficient in the covariance matrix in Eq. 11 in the main text |
| ρ13 | 0.7 | 0.03 | 0 | Correlation coefficient in the covariance matrix in Eq. 11 in the main text |
| ρ23 | 0 | 0.1 | 0 | Correlation coefficient in the covariance matrix in Eq. 11 in the main text |
| σ1 | 0.5 | 0.2 | 1 | Standard deviation in the covariance matrix in Eq. 11 in the main text |
| σ2 | 0.5 | 0.5 | 1 | Standard deviation in the covariance matrix in Eq. 11 in the main text |
| σ3 | 0.5 | 0.7 | 1 | Standard deviation in the covariance matrix in Eq. 11 in the main text |

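Table S4.2 LMA values in the simulated datasets. The latent variables fi determine LMAp and LMAs for
each leaf (Eq. 5).

| Simulated dataset | Leaf group | Median of LMA | SD of LMA | Mean of fi |
|---|---|---|---|---|
| Simulated GL1 (evergreen has higher LMA) | Deciduous | 80 | 0.3 | 0.6 |
| Simulated GL1 (evergreen has higher LMA) | Evergreen | 170 | 0.5 | 0.4 |
| Simulated GL2 (evergreen has lower LMA) | Deciduous | 170 | 0.3 | 0.6 |
| Simulated GL2 (evergreen has lower LMA) | Evergreen | 80 | 0.5 | 0.4 |
| Simulated PA | Sun Dry | 90 | 0.3 | 0.6 |
| Simulated PA | Shade Dry | 70 | 0.3 | 0.7 |
| Simulated PA | Sun Wet | 40 | 0.2 | 0.3 |
| Simulated PA | Shade Wet | 30 | 0.4 | 0.6 |
| Weak Correlation (WC) | All leaves | 120 | 0.3 | 0.4 |
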
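
To make the role of the values in Table S4.2 concrete, the sketch below draws a GL1-like set of leaves and splits each LMA value into LMAp = f × LMA and LMAs = (1 − f) × LMA. It is a minimal illustration, not the simulation code used for Figs. S4.1-S4.4. It assumes that LMA is lognormal with the tabulated SD on the log scale, and that fi is normal around the group mean with a hypothetical SD of 0.1, clipped to the interval (0, 1).

```python
import math
import random

# (median of LMA, SD of LMA, mean of f) for GL1, taken from Table S4.2.
GROUPS = {"Deciduous": (80.0, 0.3, 0.6), "Evergreen": (170.0, 0.5, 0.4)}
F_SD = 0.1  # hypothetical spread of f within a group; not from the source


def simulate_leaves(n_per_group, seed=1):
    """Draw LMA and f for each leaf and split LMA into LMAp and LMAs."""
    rng = random.Random(seed)
    leaves = []
    for group, (median_lma, sd_log, mean_f) in GROUPS.items():
        for _ in range(n_per_group):
            # Assumption: lognormal LMA, so the median equals exp(mu).
            lma = rng.lognormvariate(math.log(median_lma), sd_log)
            # Assumption: normal f, clipped so that 0 < f < 1.
            f = min(max(rng.gauss(mean_f, F_SD), 0.01), 0.99)
            leaves.append({"group": group, "LMA": lma, "f": f,
                           "LMAp": f * lma, "LMAs": (1.0 - f) * lma})
    return leaves


leaves = simulate_leaves(100)
print(len(leaves), "simulated leaves")
```

Swapping the two median LMA values in `GROUPS` would give a GL2-like dataset under the same assumptions.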
Fig. S1 Boxplots comparing posterior means of the latent variable f (the fraction of total LMA comprised
by LMAp) across deciduous (D) and evergreen (E) leaves in GLOPNET and Panama, and across sites (wet
and dry) and canopy strata (sun and shade) in Panama. Note that LMAp = f × LMA, and LMAs = (1 – f) ×
LMA. (a) Deciduous and evergreen leaves in the GLOPNET dataset; (b) deciduous and evergreen leaves
for Panama species for which both sun and shade leaves were available; (c) leaves for Panama species
for which both sun and shade leaves were available; and (d) all leaves for Panama. At the dry Panama
site, the increase in LMA from shade to sun (Fig. 5) is due to roughly equal proportional increases in
LMAp and LMAs (because f is similar between sun and shade), whereas at the wet Panama site, the
increase in LMA from shade to sun (Fig. 5) is due primarily to increased LMAp (because f is greater in sun
than shade). GLOPNET results are for the Potential LL Model (Eq. 5), and Panama results are for the
Optimal LL Model with site effects (Eq. 13 and Notes S3). The center line in each box indicates the
median, upper and lower box edges indicate the interquartile range, whiskers show 1.5 times the
interquartile range, and points are outliers. Groups sharing the same letters are not significantly
different (P > 0.05; t-tests).
Fig. S2 Boxplots comparing leaf mass per area (LMA), photosynthetic leaf mass per area (LMAp;
posterior means), and structural leaf mass per area (LMAs; posterior means) across sites (wet and dry)
and canopy strata (sun and shade) in Panama. The results shown here include all leaves in the Panama
dataset, whereas Fig. 5 in the main text only includes Panama species for which both sun and shade
leaves were available. Boxplot symbols as in Fig. S1. Groups sharing the same letters are not significantly
different (P > 0.05; t-tests).
Fig. S3 Measured traits related to photosynthesis and metabolism (nitrogen and phosphorus per-unit
leaf area; Narea and Parea) are positively correlated with LMA and with estimates (posterior means) of the
photosynthetic and structural LMA components (LMAp and LMAs, respectively) in the GLOPNET dataset.
LMAp yields stronger correlations and more consistent relationships compared to LMA and LMAs; e.g.,
evergreen and deciduous leaves align along a single relationship in panel b, but not in panels a or c. Gray
symbols show model results from one of ten randomized datasets in which LMA was randomized among
all leaves in GLOPNET. Pearson correlation coefficients are for observed LMA (left column) and posterior
means of LMAp (middle column) and LMAs (right column). P-values (*** P < 0.001) for LMA test the null
hypothesis of zero correlation, and for LMAp and LMAs test the null hypothesis of equal correlation in
observed and randomized datasets (see ‘Randomized LMA Datasets’ in Methods for details).
Fig. S4 Boxplots comparing cellulose content (percent of total leaf mass) across sites (wet and dry) and
canopy strata (sun and shade) in Panama. (a) All leaves in the Panama dataset; and (b) leaves for
Panama species for which both sun and shade leaves were available. Cellulose content (percent mass) is
similar for shade and sun at the dry site, but higher for shade than sun at the wet site. This pattern is
consistent with estimates of LMAp and LMAs fractions (Fig. S1c-d); i.e., at the wet site, LMAs comprises
a larger fraction of total LMA in shade than sun (because LMAp comprises a larger fraction in sun than
shade; Fig. S1c-d). Boxplot symbols as in Fig. S1. Groups sharing the same letters are not significantly
different (P > 0.05; t-tests).
Fig. S5 Posterior means of LMAp vs LMAs in the (a) GLOPNET and (b) Panama datasets. Correlations
between LMAp and LMAs are significantly positive, but the small r² values indicate that a single axis
could not accurately represent the two-dimensional space. Symbols as in Main Text Figs. 1-2.
Table S1 Summary of results for Potential LL Model (Eq. 5) and Optimal LL Models (Eq. 13 and Notes S3).
Column definitions:
Data = GLOPNET or Panama.
LL Model = Potential, Optimal or Optimal with site effects.
n.obs = number of samples analyzed.
n.par = number of free structural parameters (excluding latent variables) in models (see notes below table for details).
WAIC = Watanabe-Akaike information criterion; also known as widely applicable information criterion. WAIC is a predictive information
criterion for Bayesian models, with lower values indicating a more parsimonious model.
α, β, rp, rs, θL and θ = posterior mean and 95% credible interval for each parameter.

| Data | LL Model | n.obs | n.par | WAIC | α | β | rp | rs | θL | θ |
|---|---|---|---|---|---|---|---|---|---|---|
| GLOPNET | Potential | 198 | 10 | 863 | 0.234 [0.208, 0.263] | 0.23 [0.203, 0.264] | 0.023 [0.02, 0.026] | 0.002 [0.001, 0.003] | NA [NA, NA] | NA [NA, NA] |
| Panama | Potential | 132 | 10 | 516 | 0.298 [0.272, 0.329] | 0.499 [0.402, 0.636] | 0.021 [0.016, 0.027] | 0.002 [-0.001, 0.007] | NA [NA, NA] | NA [NA, NA] |
| Panama | Optimal | 132 | 11 | 378 | 0.282 [0.256, 0.316] | 0.611 [0.542, 0.697] | 0.021 [0.018, 0.026] | 0 [-0.003, 0.003] | 0.279 [0.253, 0.335] | NA [NA, NA] |
| Panama | Optimal with site effects | 132 | 12 | 347 | 0.283 [0.254, 0.319] | 0.691 [0.605, 0.799] | 0.022 [0.018, 0.026] | -0.001 [-0.004, 0.002] | 0.28 [0.253, 0.332] | 0.698 [0.564, 0.863] |

Note: Free structural parameters for the Potential LL Model: α, β, rp, rs, ρ12, ρ13, ρ23, σ1, σ2, and σ3. Free structural parameters for the
Optimal LL Model: α, β, rp, rs, ρ12, ρ13, ρ23, σ1, σ2, σ3 and θL. The model with site effects includes one additional parameter (θ; see Notes
S3).
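
For readers unfamiliar with the WAIC column of Table S1, the sketch below computes WAIC from a matrix of pointwise log-likelihoods, following the general definition in Gelman et al. (2014): the log pointwise predictive density minus a penalty equal to the summed posterior variances of the log-likelihood, multiplied by −2. The function name and the toy matrix are hypothetical, and this is not the code used to produce Table S1.

```python
import math
from statistics import variance


def waic(log_lik):
    """WAIC on the deviance scale from pointwise log-likelihoods.

    log_lik[s][i] is the log-likelihood of observation i under
    posterior draw s (at least two draws are required).
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        # Log of the mean likelihood, computed stably (log-sum-exp).
        top = max(column)
        lppd += top + math.log(sum(math.exp(v - top) for v in column) / n_draws)
        # Posterior sample variance of the log-likelihood.
        p_waic += variance(column)
    return -2.0 * (lppd - p_waic)


# Hypothetical toy matrix: three posterior draws, three observations.
toy = [[-1.2, -0.8, -2.0], [-1.0, -0.9, -1.7], [-1.1, -1.0, -2.2]]
print(round(waic(toy), 2))
```

With real model output, `log_lik` would hold one row per retained posterior sample and one column per leaf.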
Table S2 Correlations between LMAp or LMAs and other traits (on logarithmic scales) derived from the
Potential LL model (Eq. 5) fit to GLOPNET data. Pearson correlations are given for models fit to the
observed datasets (robs) and randomized datasets (rrand). For randomized datasets, rrand is the mean
correlation for 10 datasets in which LMA was randomized across all leaves. P-values test the null
hypothesis of equal correlation in observed and randomized datasets (see ‘Randomized LMA Datasets’
in Methods for details).

| LL model | Trait | LMA component | robs | rrand (LMA randomized across all leaves) | P-value |
|---|---|---|---|---|---|
| Potential | Aarea | LMAp | 0.672 | 0.484 | < 0.001 |
| Potential | Aarea | LMAs | -0.234 | -0.401 | < 0.001 |
| Potential | Rarea | LMAp | 0.59 | 0.369 | < 0.001 |
| Potential | Rarea | LMAs | 0.124 | -0.237 | < 0.001 |
| Potential | LL | LMAp | 0.108 | -0.562 | < 0.001 |
| Potential | LL | LMAs | 0.916 | 0.542 | < 0.001 |
| Potential | Narea | LMAp | 0.645 | 0.074 | < 0.001 |
| Potential | Narea | LMAs | 0.477 | -0.027 | < 0.001 |
| Potential | Parea | LMAp | 0.57 | 0.234 | < 0.001 |
| Potential | Parea | LMAs | 0.304 | -0.137 | < 0.001 |
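
The correlations reported in this table are Pearson correlations on logarithmic scales. A minimal sketch of that calculation is given below; the function name and input values are hypothetical, and the result does not depend on the base of the logarithm.

```python
import math


def pearson_log(x, y):
    """Pearson correlation between log10(x) and log10(y).

    Inputs must be strictly positive. The correlation is invariant to
    the choice of logarithm base.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    syy = sum((b - my) ** 2 for b in ly)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical posterior means of LMAp and matching Aarea values.
lma_p = [30.0, 45.0, 60.0, 90.0, 120.0]
a_area = [6.0, 8.5, 13.0, 17.0, 26.0]
print(round(pearson_log(lma_p, a_area), 3))
```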
Table S3 Correlations between LMAp or LMAs and other traits (on logarithmic scales) derived from the
three alternative LL models fit to Panama data: Potential LL Model (Eq. 5), Optimal LL Model without site
effect (Eq. 13), and Optimal LL Model with site effect (Notes S3). Pearson correlations are given for
models fit to the observed datasets (robs) and randomized datasets (rrand). For randomized datasets, rrand
is the mean correlation for 10 datasets in which LMA was randomized across all leaves, and 10 datasets
in which LMA was randomized within sites (wet and dry) across canopy strata (sun and shade). P-values
test the null hypothesis of equal correlation in observed and randomized datasets (see ‘Randomized
LMA Datasets’ in Methods for details).

| LL model | Trait | LMA component | robs | rrand (LMA randomized across all leaves) | P-value | rrand (LMA randomized within sites across strata) | P-value |
|---|---|---|---|---|---|---|---|
| Potential | Aarea | LMAp | 0.970 | 0.646 | < 0.001 | 0.668 | < 0.001 |
| Potential | Aarea | LMAs | 0.057 | -0.463 | < 0.001 | -0.428 | < 0.001 |
| Potential | Rarea | LMAp | 0.618 | 0.528 | 0.157 | 0.496 | 0.063 |
| Potential | Rarea | LMAs | 0.167 | -0.381 | < 0.001 | -0.355 | < 0.001 |
| Potential | LL | LMAp | -0.432 | -0.613 | 0.254 | -0.526 | < 0.001 |
| Potential | LL | LMAs | 0.368 | 0.491 | < 0.001 | 0.642 | < 0.001 |
| Potential | Narea | LMAp | 0.854 | 0.575 | < 0.001 | 0.592 | < 0.001 |
| Potential | Narea | LMAs | 0.289 | -0.376 | < 0.001 | -0.344 | < 0.001 |
| Potential | Parea | LMAp | 0.768 | 0.571 | < 0.001 | 0.550 | < 0.001 |
| Potential | Parea | LMAs | 0.176 | -0.383 | < 0.001 | -0.393 | < 0.001 |
| Potential | CLarea | LMAp | 0.600 | 0.252 | < 0.001 | 0.309 | < 0.001 |
| Potential | CLarea | LMAs | 0.651 | -0.141 | < 0.001 | -0.042 | < 0.001 |
| Optimal | Aarea | LMAp | 0.883 | 0.497 | < 0.001 | 0.531 | < 0.001 |
| Optimal | Aarea | LMAs | 0.17 | -0.366 | < 0.001 | -0.312 | < 0.001 |
| Optimal | Rarea | LMAp | 0.676 | 0.392 | < 0.01 | 0.382 | < 0.01 |
| Optimal | Rarea | LMAs | 0.095 | -0.308 | < 0.001 | -0.282 | < 0.001 |
| Optimal | LL | LMAp | -0.549 | -0.654 | < 0.001 | -0.573 | 0.307 |
| Optimal | LL | LMAs | 0.556 | 0.600 | 0.317 | 0.726 | < 0.01 |
| Optimal | Narea | LMAp | 0.823 | 0.427 | < 0.001 | 0.459 | < 0.001 |
| Optimal | Narea | LMAs | 0.326 | -0.284 | < 0.001 | -0.242 | < 0.001 |
| Optimal | Parea | LMAp | 0.766 | 0.448 | < 0.001 | 0.441 | < 0.001 |
| Optimal | Parea | LMAs | 0.196 | -0.31 | < 0.001 | -0.308 | < 0.001 |
| Optimal | CLarea | LMAp | 0.581 | 0.113 | < 0.001 | 0.183 | < 0.001 |
| Optimal | CLarea | LMAs | 0.695 | -0.021 | < 0.001 | 0.080 | < 0.001 |
| Optimal with site effect | Aarea | LMAp | 0.846 | 0.454 | < 0.001 | 0.502 | < 0.001 |
| Optimal with site effect | Aarea | LMAs | 0.208 | -0.337 | < 0.001 | -0.327 | < 0.001 |
| Optimal with site effect | Rarea | LMAp | 0.690 | 0.439 | < 0.001 | 0.400 | < 0.01 |
| Optimal with site effect | Rarea | LMAs | 0.063 | -0.354 | < 0.001 | -0.331 | < 0.001 |
| Optimal with site effect | LL | LMAp | -0.503 | -0.502 | < 0.01 | -0.451 | 0.229 |
| Optimal with site effect | LL | LMAs | 0.583 | 0.533 | < 0.05 | 0.718 | < 0.01 |
| Optimal with site effect | Narea | LMAp | 0.808 | 0.409 | < 0.001 | 0.444 | < 0.001 |
| Optimal with site effect | Narea | LMAs | 0.332 | -0.300 | < 0.001 | -0.279 | < 0.001 |
| Optimal with site effect | Parea | LMAp | 0.740 | 0.384 | < 0.001 | 0.391 | < 0.001 |
| Optimal with site effect | Parea | LMAs | 0.235 | -0.249 | < 0.001 | -0.292 | < 0.001 |
| Optimal with site effect | CLarea | LMAp | 0.616 | 0.205 | < 0.001 | 0.252 | < 0.001 |
| Optimal with site effect | CLarea | LMAs | 0.704 | -0.132 | < 0.001 | 0.001 | < 0.001 |
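
The two randomization schemes named in the table caption can be sketched as follows: LMA values are either permuted among all leaves, or permuted only among leaves from the same site, which mixes sun and shade leaves while keeping each site's LMA values within that site. This is a minimal illustration with hypothetical field names and values; the procedure actually used is described under ‘Randomized LMA Datasets’ in Methods.

```python
import random
from collections import defaultdict


def randomize_lma(leaves, within=None, seed=0):
    """Return a copy of `leaves` with LMA values permuted.

    If `within` is None, LMA is shuffled across all leaves. Otherwise
    it is shuffled only among leaves sharing the same value of the
    `within` field (e.g. "site"), so other traits keep their original
    leaf while LMA is reassigned inside each group.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for index, leaf in enumerate(leaves):
        key = leaf[within] if within is not None else None
        groups[key].append(index)
    shuffled = [dict(leaf) for leaf in leaves]
    for indices in groups.values():
        values = [leaves[i]["LMA"] for i in indices]
        rng.shuffle(values)
        for i, value in zip(indices, values):
            shuffled[i]["LMA"] = value
    return shuffled


# Hypothetical leaves; only the fields needed for the example are shown.
leaves = [
    {"site": "wet", "stratum": "sun", "LMA": 95.0},
    {"site": "wet", "stratum": "shade", "LMA": 60.0},
    {"site": "dry", "stratum": "sun", "LMA": 110.0},
    {"site": "dry", "stratum": "shade", "LMA": 75.0},
]
across_all = randomize_lma(leaves, seed=1)
within_sites = randomize_lma(leaves, within="site", seed=1)
print(sorted(leaf["LMA"] for leaf in within_sites if leaf["site"] == "wet"))
```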
Table S4 Correlations between observed and predicted LL (on logarithmic scales) for the Potential LL
Model (Eq. 5) and the Optimal LL Model (Eq. 13 and Notes S3) fit to Panama data. Pearson correlations
are given for models fit to the observed datasets (robs) and randomized datasets (rrand). For randomized
datasets, rrand is the mean correlation for 10 datasets in which LMA was randomized across all leaves,
and 10 datasets in which LMA was randomized within sites (wet and dry) across canopy strata (sun and
shade). P-values test the null hypothesis of equal correlation in observed and randomized datasets (see
‘Randomized LMA Datasets’ in Methods for details).

| Model | robs | rrand (LMA randomized across all leaves) | P-value | rrand (LMA randomized within sites across strata) | P-value |
|---|---|---|---|---|---|
| Potential | 0.382 | 0.502 | < 0.001 | 0.652 | < 0.001 |
| Optimal | 0.759 | 0.518 | < 0.001 | 0.637 | < 0.001 |
| Optimal with site effect | 0.813 | 0.664 | < 0.001 | 0.705 | < 0.001 |

References
Azcón-Bieto J, Osmond CB. 1983. Relationship between Photosynthesis and Respiration: The Effect of
Carbohydrate Status on the Rate of CO2 Production by Respiration in Darkened and Illuminated Wheat
Leaves. Plant Physiology 71: 574–581.
Azcón-Bieto J, Lambers H, Day DA. 1983. Effect of photosynthesis and carbohydrate status on
respiratory rates and the involvement of the alternative pathway in leaf respiration. Plant Physiology 72:
598–603.
Chen A, Lichstein JW, Osnas JLD, Pacala SW. 2014. Species-independent down-regulation of leaf
photosynthesis and respiration in response to shading: evidence from six temperate tree species. PLoS
ONE 9: e91798.
Gelman A, Hwang J, Vehtari A. 2014. Understanding predictive information criteria for Bayesian models.
Statistics and Computing 24: 997–1016.
Kikuzawa K. 1991. A cost-benefit analysis of leaf habit and leaf longevity of trees and their geographical
pattern. The American Naturalist: 1250–1263.
Kikuzawa K, Shirakawa H, Suzuki M, Umeki K. 2004. Mean labor time of a leaf. Ecological Research 19:
365–374.
Osnas JLD, Lichstein JW, Reich PB, Pacala SW. 2013. Global Leaf Trait Relationships: Mass, Area, and the
Leaf Economics Spectrum. Science 340: 741–744.
Watanabe S. 2010. Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information
Criterion in Singular Learning Theory. The Journal of Machine Learning Research 11: 3571–3594.