New Phytologist Supporting Information

Article title: Decomposing leaf mass into photosynthetic and structural components explains divergent patterns of trait variation within and among plant species

Authors: Masatoshi Katabuchi, Kaoru Kitajima, S. Joseph Wright, Sunshine A. Van Bael, Jeanne L. D. Osnas and Jeremy W. Lichstein

The following Supporting Information is available for this article:

Notes S1 Defining and understanding leaf trait mass- vs. area-dependence.
Notes S2 Implications of photosynthetic and structural leaf mass for trait mass- vs. area-dependence.
Notes S3 Alternative leaf lifespan models and realized net photosynthesis.
Notes S4 Model tests with simulated data.
Fig. S1 Boxplots comparing posterior means of the latent variable f (the fraction of total LMA comprised by LMAp) across deciduous (D) and evergreen (E) leaves in GLOPNET and Panama, and across sites (wet and dry) and canopy strata (sun and shade) in Panama.
Fig. S2 Boxplots comparing leaf mass per area (LMA), photosynthetic leaf mass per area (LMAp; posterior means), and structural leaf mass per area (LMAs; posterior means) across sites (wet and dry) and canopy strata (sun and shade) in all leaves in the Panama datasets.
Fig. S3 Measured traits related to photosynthesis and metabolism (nitrogen and phosphorus per-unit leaf area; Narea and Parea) are positively correlated with LMA and with estimates (posterior means) of the photosynthetic and structural LMA components (LMAp and LMAs, respectively) in the GLOPNET dataset.
Fig. S4 Boxplots comparing cellulose content (percent of total leaf mass) across sites (wet and dry) and canopy strata (sun and shade) in Panama.
Fig. S5 Posterior means of LMAp vs LMAs in the (a) GLOPNET and (b) Panama datasets.
Table S1 Summary of results for Potential LL Model (Eq. 5) and Optimal LL Models (Notes S3).
Table S2 Correlations between LMAp or LMAs and other traits derived from the Potential LL model (Eq. 5) fit to GLOPNET data.
Table S3 Correlations between LMAp or LMAs and other traits derived from the three alternative LL models (Notes S3) fit to Panama data.
Table S4 Correlations between observed and predicted LL for the Potential LL Model (Eq. 5) and the Optimal LL Model (Notes S3) fit to Panama data.

Notes S1. Defining and understanding leaf trait mass- vs. area-dependence

Let X represent the amount of a trait in an entire leaf. For example, X could be the photosynthetic capacity of the entire leaf (units = moles CO2 fixed per-unit time), the amount of nitrogen in the entire leaf (units = grams of nitrogen), etc. X could depend on leaf mass (in which case leaves of greater mass would have higher values of X than leaves of smaller mass), leaf area (in which case leaves of greater area would have higher values of X than leaves of smaller area), or both mass and area. We define a trait as being purely mass-dependent if X depends on leaf mass but not area, and we define a trait as being purely area-dependent if X depends on leaf area but not mass. We expect most traits to depend on both mass and area to some degree, but the extreme cases (pure mass- and area-dependence, which we explain in more detail below) are useful for illustrating and understanding the key concepts related to mass- and area-dependence.

Rather than X representing the amount of a trait in an entire leaf, we could instead define X as the amount of the trait in a leaf sample of known mass and area. For example, we could define a trait as being purely mass-dependent if the amount of the trait in a sample depended only on the mass of the sample.
In what follows, we refer to ‘leaves,’ but the concepts are equally applicable to leaf samples, whether they be samples composed of multiple leaves, or partial-leaf samples that are representative of entire leaves.

First, consider a purely mass-dependent trait. In this case, if we were to compare leaves of equal mass that varied in area, there would be zero correlation between X and area across leaves. More generally, if we consider the case of pure mass-dependence with leaves of variable mass and area, and we assume that X is proportional to mass, we have:

(Equation S1.1)  $X_i = C_m \times \mathrm{Mass}_i \times \varepsilon_i$
(Equation S1.2)  $X_i / \mathrm{Area}_i = C_m \times \mathrm{LMA}_i \times \varepsilon_i$
(Equation S1.3)  $X_i / \mathrm{Mass}_i = C_m \times \varepsilon_i$

where the subscript i refers to leaf i, Cm is a constant, and εi is a lognormally-distributed error term, which becomes an additive error term if we take the logarithm of both sides of the equations. The key point from Eqs. S1.1-S1.3 is that if X is purely mass-dependent, then the area-normalized trait (X/area) increases with LMA, and the mass-normalized trait (X/mass) is statistically independent of LMA (see Osnas et al. 2013 for further explanation and discussion). The assumption of mass-proportionality (i.e., the Mass term in Eq. S1.1 has an exponent of one) leads to clean forms for Eqs. S1.2-S1.3, but our conclusions regarding the dependence of X/area and X/mass on LMA would be unchanged if this assumption were relaxed. That is, if Mass were raised to any positive exponent in Eq. S1.1, it would still be the case that X/area increases with LMA, and X/mass is independent of LMA.

Next, consider a purely area-dependent trait. In this case, if we were to compare leaves of equal area that varied in mass, there would be zero correlation between X and mass across leaves.
More generally, if we consider the case of pure area-dependence with leaves of variable mass and area, and we assume that X is proportional to area, we have:

(Equation S1.4)  $X_i = C_a \times \mathrm{Area}_i \times \varepsilon_i$
(Equation S1.5)  $X_i / \mathrm{Area}_i = C_a \times \varepsilon_i$
(Equation S1.6)  $X_i / \mathrm{Mass}_i = C_a \times \mathrm{LMA}_i^{-1} \times \varepsilon_i$

where Ca is a constant, and other details are as above. The key point from Eqs. S1.4-S1.6 is that if X is purely area-dependent, then the area-normalized trait (X/area) is statistically independent of LMA, and the mass-normalized trait (X/mass) decreases with LMA (Osnas et al., 2013). As explained above for mass-dependent traits, the qualitative conclusions do not depend on the proportionality assumption.

Notes S2 Implications of photosynthetic and structural leaf mass for trait mass- vs. area-dependence

Here, we use simulated data to explore how variation in photosynthetic and structural LMA components (LMAp and LMAs, respectively) affects trait mass- and area-dependence (as defined in Notes S1). We show that variation among leaves in LMAp leads to mass-dependence of photosynthetic capacity (Amax) and related traits (e.g., Rdark and concentrations of N and P), whereas variation in LMAs leads to area-dependence of these same traits. We assume that traits associated with photosynthetic capacity, when expressed per-unit leaf area, increase with LMAp but are unaffected by LMAs. This is equivalent to assuming that the total amount of a photosynthetic trait in a leaf (represented by the symbol X in Notes S1) increases with the mass of photosynthetic tissue in a leaf but is unaffected by the mass of structural tissue. Our analysis of simulated data focuses on a single trait, photosynthetic capacity (Amax), since this trait is closely related to LMAp in our model (see main text).
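
As an illustration of the claim above, the following minimal sketch simulates leaves under the stated assumption (Aarea depends on LMAp but not on LMAs, with total LMA = LMAp + LMAs) and estimates the log-log slope of Aarea on total LMA. It is not the authors' simulation code: the function name `mass_dependence_slope`, the bivariate normal draw on the log scale, the sample size, and the seed are assumptions made for illustration. The constants (0.23, medians of 70 and 98 g m–2, error standard deviation of 0.5) are those listed in Section 2.1.

```python
import numpy as np

def mass_dependence_slope(sd_p, sd_s, rho=0.0, n=100, seed=0):
    """Slope b of log(Aarea) regressed on log(LMA) for one simulated sample.

    sd_p, sd_s: standard deviations of log-scale LMAp and LMAs.
    rho: correlation between log-scale LMAp and LMAs.
    """
    rng = np.random.default_rng(seed)
    # Log-scale means chosen so the medians are 70 and 98 g m-2.
    mean = [np.log(70.0), np.log(98.0)]
    cov = [[sd_p**2, rho * sd_p * sd_s],
           [rho * sd_p * sd_s, sd_s**2]]
    log_p, log_s = rng.multivariate_normal(mean, cov, size=n).T
    lma_p, lma_s = np.exp(log_p), np.exp(log_s)
    # Aarea depends on LMAp only, with lognormal error (sd 0.5 on the log scale).
    a_area = 0.23 * lma_p * np.exp(rng.normal(0.0, 0.5, n))
    lma = lma_p + lma_s
    # Ordinary least squares fit of log(Aarea) = a + b * log(LMA).
    b, a = np.polyfit(np.log(lma), np.log(a_area), 1)
    return b

# LMAp-dominated vs. LMAs-dominated variation in total LMA.
b_lmap_varies = mass_dependence_slope(sd_p=1.0, sd_s=0.1, n=2000, seed=1)
b_lmas_varies = mass_dependence_slope(sd_p=0.1, sd_s=1.0, n=2000, seed=1)
print(b_lmap_varies, b_lmas_varies)
```

In this sketch the slope is large when LMAp variation dominates and close to zero when LMAs variation dominates, which is the qualitative pattern described above.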

2.1 Simulated datasets
We performed tests with simulated data to explore how the degree of trait mass-dependence is affected by the variances of LMAp and LMAs, and the covariance between LMAp and LMAs. In the simulated datasets, we used estimates from our analysis of the GLOPNET dataset (see Methods and Results in the main text) to assign the following: Aarea = 0.23 × LMAp × exp(a), where a is normally-distributed with mean 0 and standard deviation 0.5; median value of LMAp = 70 g m–2; and median value of LMAs = 98 g m–2. To evaluate how variance and covariance of LMAp and LMAs affect estimates of mass-dependence, we considered all combinations of the following: standard deviations of log-scale LMAp and LMAs from 0.1 to 1.0 in increments of 0.1 (which determine how variable LMAp and LMAs are among leaves in the simulated sample), and correlation coefficients between log-scale LMAp and LMAs from –0.4 to 0.4 in increments of 0.1 (which determine how LMAp and LMAs covary among leaves in the simulated sample). In total, we estimated mass-dependence for 900 parameter combinations (10 values of LMAp variance × 10 values of LMAs variance × 9 correlation coefficients). For each combination of parameters, we simulated 100 replicates. Each replicate had a sample size of 100 leaves, where the values of LMAp and LMAs for each leaf were random draws from the distributions with the variances and covariances described above.

Of the 90,000 simulated datasets, 7839 (~8.7%) showed negative covariance between LMA and either LMAp or LMAs. We excluded these simulated datasets from our subsequent analysis because one goal of our analysis was to attribute total LMA variance to variance in LMAp vs. LMAs (see main text Eq. 15), which is not straightforward when one of the covariances is negative.
Furthermore, our empirical results based on the GLOPNET and Panama datasets suggested that covariances between LMA and both LMAp and LMAs are in reality positive, so excluding negative covariances from our analysis should not bias our inferences with respect to real datasets.

2.2 Quantifying trait mass-dependence
For each of the simulated datasets described above, we quantified mass-dependence of Amax using the following equation, equivalent to “Model-LN” in Osnas et al. (2013; see the “Models” section of their Supplementary Materials):

(Equation S2.1)  $\log(A_{\mathrm{area},i}) = a + b \times \log(\mathrm{LMA}_i) + \varepsilon_i$

where Aareai is area-normalized Amax for leaf i, LMAi is total LMA for leaf i, εi is a normally-distributed error term, and the parameter b quantifies mass- vs. area-dependence. Parameter b equals one for a purely mass-proportional trait (Eq. S1.2) and zero for a purely area-proportional trait (Eq. S1.5). More generally, b is close to zero for traits that are mostly area-dependent, and b increases with increasing mass-dependence. Empirical estimates of b tend to range between 0 and 1 (Osnas et al. in review), and we interpret values of b > 0.5 as indicating greater mass- than area-dependence. We used ordinary least squares regression to fit Eq. S2.1 to each simulated dataset. Osnas et al. (2013) showed that this method yields results similar to those of other approaches to quantifying mass-dependence.

2.3 Results
Variation among leaves in LMAp led to mass-dependence of Amax (b > 0.5), whereas variation in LMAs led to area-dependence of Amax (b near 0; Fig. S2.1a). That is, as total LMA variance became increasingly dominated by LMAs (i.e., moving to the right along the x-axis of Fig. S2.1a), mass-dependence of Amax decreased (or, equivalently, area-dependence increased).
Intuitively, mass-dependence of Amax decreases (and area-dependence increases) with LMAs variance because LMAs has no effect on area-normalized Amax (Aarea). Thus, as LMAs variance increases, the photosynthetic capacity of entire leaves becomes increasingly dependent on leaf area (rather than leaf mass), and Aarea becomes increasingly independent of LMA. As expected, the percent variation of total LMA that is due to LMAs variance increased with LMAs variance (Fig. S2.1b). Thus, weak mass-dependence corresponds to cases where LMAs variance dominates total LMA variance (right side of x-axes in Figs. S2.1a and S2.1b), and strong mass-dependence corresponds to cases where LMAp variance dominates total LMA variance (left side of x-axes in Figs. S2.1a and S2.1b).

Covariance between LMAp and LMAs (which was positive but weak in our analyses of GLOPNET and Panama data; see Fig. S6) also affected mass- vs. area-dependence. Empirical estimates of mass-dependence (b) tend to range from 0 to 1 (Osnas et al. in review). Within this typical range (0 < b < 1), positive covariance between LMAp and LMAs increases mass-dependence, whereas negative covariance between LMAp and LMAs (which is inconsistent with our results for GLOPNET and Panama data) increases area-dependence (Fig. S2.1a). Our simulations show that if LMAs has greater variance than LMAp, then b tends to be greater than 1, although such values are larger than most empirical estimates of b, which tend to range between 0 (pure area-proportionality) and 1 (pure mass-proportionality; Osnas et al. in review).

Fig. S2.1 Effects of the LMAs:LMAp variance ratio in simulated datasets on (a) the mass-dependence of photosynthetic capacity (Amax) and (b) the percent of variance in leaf mass per area (LMA) explained by LMAs variance, where LMAs and LMAp are structural and photosynthetic components, respectively, of LMA (see Notes S2 for details). Mass-dependence (y-axis in the left panel) is estimated as parameter b in Eq. S2.1; thus mass-dependence of Amax decreases (and area-dependence increases) as the LMAs:LMAp variance ratio increases. Methods for generating simulated data and for estimating mass-dependence are described in Notes S2. Methods to quantify the percent variation of LMA due to LMAs (right panel) are described in the main text (see Eq. 15). Note that empirical estimates of mass-dependence (y-axis in the left panel) tend to range between 0 (pure area-proportionality; Eq. S1.5) and 1 (pure mass-proportionality; Eq. S1.2). Thus, we interpret values less than 0.5 as indicating primarily area-dependent traits, and values greater than 0.5 as indicating primarily mass-dependent traits.

Notes S3 Alternative leaf lifespan models and realized net photosynthesis

Here, we describe three alternative LL models, and how we derive an approximation for the realized rate of annualized (or time-averaged) net photosynthesis (which is required for the Optimal LL Models) based on measured Amax and Rdark.

3.1 Alternative models of LL
The first model, which we refer to as the “Potential LL Model”, assumes that LL is proportional to the amount of LMAs:

$E[\mathrm{LL}_i] = \beta \, \mathrm{LMAs}_i$  (Potential LL Model)

where E[ ] is the expected value of the variable in brackets, LLi is the leaf lifespan of leaf i, and β is a constant. We fitted the Potential LL Model to both the GLOPNET and Panama datasets.

The second model assumes optimal LL theory (Kikuzawa, 1991).
Details on the Kikuzawa model are described in the main text. According to optimal LL theory, LL should decrease with the realized rate of net photosynthesis, because frequently replacing old leaves with new leaves ensures high carbon gain per unit time for leaves that have a high initial net photosynthesis rate (Kikuzawa, 1991). We call this second model the “Optimal LL Model”. We scaled the maximum net photosynthetic rate to approximate the realized net photosynthetic rate:

$E[\mathrm{LL}_i] = \beta \sqrt{\mathrm{LMAs}_i \, \mathrm{LMA}_i / (\theta_L A_{\mathrm{area},i} - R_{\mathrm{area},i})}$  (Optimal LL Model)

where θL (0 < θL < 1) is the scaling parameter for shade leaves to describe realized net photosynthetic rate (Kikuzawa et al., 2004), with θL = 1 for sun leaves and θL < 1 for shade leaves, Aarea i is the net photosynthetic rate per unit area of leaf i, and Rarea i is the dark respiration rate per unit area of leaf i. In this model, the quantity θL Aarea i – Rarea i is proportional to the mean daily net photosynthetic rate. We fitted the Optimal LL Model to the Panama dataset.

The third model includes an additional parameter to account for site effects (φ). Because we combined data from wet and dry sites in a single analysis, it is possible that site differences in LL could confound our results.

$E[\mathrm{LL}_i] = \beta \varphi \sqrt{\mathrm{LMAs}_i \, \mathrm{LMA}_i / (\theta_L A_{\mathrm{area},i} - R_{\mathrm{area},i})}$  (Optimal LL Model with site effect)

We report the Optimal LL Model in the main text because it had a better WAIC value (Watanabe, 2010; Gelman et al., 2014) than the Potential LL Model. Although the site effect model had the best WAIC value, we report it only in the supplementary materials, because this model includes non-mechanistic processes relevant to variance in LL (i.e., site effects). See Table S1 for the full results.
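
The three expected-LL expressions above can be written as small functions. This is a sketch for illustration only, not the fitted Bayesian models: the function names and all numeric inputs are hypothetical, and the square root is read as spanning the whole ratio LMAs × LMA / (θL Aarea – Rarea), which is an assumption about the typeset equation.

```python
import math

def potential_ll(lma_s, beta):
    """Expected LL under the Potential LL Model: beta * LMAs."""
    return beta * lma_s

def optimal_ll(lma_s, lma, a_area, r_area, beta, theta_l=1.0, phi=1.0):
    """Expected LL under the Optimal LL Model.

    theta_l: light scaling (1 for sun leaves, < 1 for shade leaves).
    phi: site effect; phi = 1 gives the model without a site effect.
    """
    net = theta_l * a_area - r_area  # proportional to mean daily net photosynthesis
    if net <= 0:
        raise ValueError("theta_l * a_area - r_area must be positive")
    return beta * phi * math.sqrt(lma_s * lma / net)

# Hypothetical leaf: same traits evaluated as a sun leaf and as a shade leaf.
sun = optimal_ll(lma_s=60.0, lma=100.0, a_area=10.0, r_area=1.0, beta=1.0)
shade = optimal_ll(lma_s=60.0, lma=100.0, a_area=10.0, r_area=1.0, beta=1.0, theta_l=0.5)
print(sun, shade)
```

With identical traits, lowering θL reduces the realized net photosynthetic rate and therefore raises the expected LL, consistent with the statement that LL should decrease with realized net photosynthesis.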

3.2 Realized net photosynthesis
The realized rate of net photosynthesis reflects multiple factors, including down-regulation of photosynthetic capacity and respiration of chronically shaded leaves (Chen et al., 2014), as well as the amount of light incident upon a leaf at a given time (Kikuzawa et al., 2004). The former (down-regulation) is already reflected in observed values of Amax (which is measured under saturating light conditions), but the latter (realized light availability) is not. Therefore, we introduced a scaling parameter (θL; with 0 < θL < 1) to account for shading effects when approximating realized net photosynthesis in the Optimal LL Models. We show that realized net photosynthesis per unit time is approximately proportional to θL Aarea – Rarea, where Aarea and Rarea are net photosynthetic capacity (Amax) and dark respiration (Rdark) per-unit leaf area per hour, respectively. For simplicity, we assume that (1) day and night are each 12 h long (which is approximately correct for the tropical forest sites in Panama where we applied the Optimal LL Models), (2) net photosynthetic rate is constant during the day, and (3) respiration rate is constant during the night. Then, we have:

Net photosynthesis during 12-hour daytime = (12 hr) × θL Aarea
Respiration during 12-hour nighttime = (12 hr) × Rarea

The scaling parameter θL accounts for the effects of light availability on net photosynthesis, with θL = 1 for sun leaves, and θL < 1 for shade leaves. Subtracting nighttime respiration from daytime net photosynthesis yields the approximation that 24-hour net photosynthesis is proportional to θL Aarea – Rarea. Above, we express Aarea and Rarea per hour, but these are easily translated to other time periods.
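
Under the three simplifying assumptions above, the daily balance can be written out explicitly. The symbol $P_{24}$ for 24-hour net photosynthesis is introduced here only for illustration and does not appear elsewhere in these Notes.

```latex
\begin{align}
P_{24} &= (12\,\mathrm{h}) \times \theta_L A_{\mathrm{area}} - (12\,\mathrm{h}) \times R_{\mathrm{area}} \\
       &= (12\,\mathrm{h}) \times \left(\theta_L A_{\mathrm{area}} - R_{\mathrm{area}}\right)
       \;\propto\; \theta_L A_{\mathrm{area}} - R_{\mathrm{area}}
\end{align}
```

For example, with hypothetical values Aarea = 10 and Rarea = 1 (in the same units per hour) and θL = 0.5, the daily total is 12 × (0.5 × 10 – 1) = 48 in those units.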

Although Rdark may not accurately measure nighttime respiration because of processes such as nighttime phloem loading (Azcón-Bieto & Osmond, 1983; Azcón-Bieto et al., 1983), it is not easy to implement scaling parameters for nighttime respiration (RN). Including both θL and RN at the same time makes model convergence difficult. Additionally, to implement RN in the models, RN should be the same for both sun and shade leaves, or RN = 1 for sun or shade leaves and RN > 0 for shade or sun leaves, but there is no clear biological justification for setting RN. In contrast, θL has a clear biological motivation, as explained above, and thus we focus only on θL in our models.

Notes S4 Model tests with simulated data

We performed tests with simulated data to evaluate the performance of our modeling approach with different prior assumptions for the latent variables that determine LMAp and LMAs (fi in Eqs. 2-3). If the modeling approach (with a given set of priors) is robust, then posterior parameter estimates (including those for latent variables) should closely match the assumed (“true”) parameter values used to create the simulated datasets. Below, we describe (i) the different parameter combinations used to generate the simulated datasets, (ii) alternative priors that we explored for the latent variables fi, and (iii) results of the tests with simulated data, including randomized versions of the simulated data (analogous to the LMA-randomization tests presented in the main text). In general, these tests with simulated datasets suggest that the results presented in the main text are robust. In particular, the simulation tests suggest that our main results depend primarily on underlying patterns in the GLOPNET and Panama datasets that we analyzed, as opposed to being pre-determined by our model assumptions.

4.1 Methods
4.1.1 Simulated datasets
Four distinct combinations of parameter values were used to create simulated datasets with different properties (Figs. S4.1-S4.4, Tables S4.1-S4.2):

Simulated datasets GL1 and GL2, based on GLOPNET data. We used parameter values estimated from the GLOPNET analysis described in the main text (Potential LL Model, Eq. 5) to create two simulated datasets (GL1 and GL2) with different LMA distributions for evergreen and deciduous leaves. The correlation between LL and LMA is strong while the correlation between LMA and Aarea is weak (Figs. S4.1-S4.2). GL1 has the same LMA distribution as the observed GLOPNET data (high LMA for evergreen leaves, and low LMA for deciduous leaves; Fig. S4.1 and Table S4.2), whereas GL2 assigns the observed evergreen LMA distribution to deciduous leaves, and vice versa (Fig. S4.2 and Table S4.2). We analyzed GL1 and GL2 using the same methods applied to the GLOPNET data in the main text, which allowed us to assess if inferred differences between evergreen and deciduous leaves (Fig. 4) merely reflect model assumptions (in which case GL1 and GL2 should yield inferences similar to those shown in Fig. 4), or if inferred differences reflect meaningful patterns in the data (in which case GL1, but not GL2, should yield results similar to Fig. 4).

Simulated dataset PA, based on Panama data. We used parameter values estimated from the Panama analysis with the lowest WAIC (Optimal LL Model with site effects, Notes S3) to create a simulated dataset with properties similar to the observed Panama data. But in the case of the simulated data, the true values of the parameters (including LMAp and LMAs for each leaf) are known, which allows us to test the modeling approach.
In contrast to the GLOPNET dataset, the correlation between LMA and Aarea is strong while the correlation between LMA and LL is weak in this simulated dataset (Fig. S4.3), as in the observed Panama data (Figs. 2a and 2g).

Simulated dataset WC (“weak correlation”). We created one simulated dataset with weak correlations among LMAp, LMAs and other traits (Fig. S4.4). This dataset allows us to assess if our modeling approach is prone to inducing strong correlations (e.g., between LMAp and Aarea, or between LMAs and LL) when the true correlations are weak.

For each of the above four parameter combinations, we analyzed a single simulated dataset, as well as nine additional datasets with randomized LMA values (see details below). A more rigorous approach would involve analyzing many replicate datasets (e.g., 1000), but this would be very computationally expensive because each of our analyses requires fitting a complex Bayesian model with many latent variables. Although our tests with simulated data are either non-replicated (in the case of the original, non-randomized, simulated data) or have limited replication (in the case of the LMA-randomized simulated data), we consider the test results presented below to be sufficiently clear to provide meaningful guidance on the interpretation of the results presented in the main text.

4.1.2 Model forms applied to simulated data
We fit each simulated dataset with the model form used to generate the data; i.e., we applied the Potential LL Model (Eq. 1) to the simulated GL1, GL2, and WC datasets, and the Optimal LL Model with site effects (Notes S3) to the simulated PA dataset.
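
The logic of these tests (generate data from known latent values, then check whether a fit recovers them) starts from a data-generating step that can be sketched as follows. The sketch assumes that LMAp = f × LMA and LMAs = (1 – f) × LMA, consistent with f being the fraction of total LMA contributed by LMAp, and uses the Potential LL Model for expected LL. The lognormal error, the LMA distribution, all numeric values, and the function name `simulate_potential_ll` are hypothetical choices for illustration, not the settings reported in Table S4.1.

```python
import numpy as np

def simulate_potential_ll(n, beta, sigma, seed=0):
    """Simulate n leaves with known latent fractions f (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Hypothetical lognormal LMA distribution (median 100 g m-2).
    lma = np.exp(rng.normal(np.log(100.0), 0.5, n))
    # "True" latent fraction of LMA that is photosynthetic mass.
    f = rng.uniform(0.0, 1.0, n)
    lma_p = f * lma
    lma_s = (1.0 - f) * lma
    # Multiplicative lognormal error with mean 1, so that E[LL] = beta * LMAs.
    noise = np.exp(rng.normal(-0.5 * sigma**2, sigma, n))
    ll = beta * lma_s * noise
    return {"LMA": lma, "f": f, "LMAp": lma_p, "LMAs": lma_s, "LL": ll}

data = simulate_potential_ll(n=200, beta=0.1, sigma=0.3, seed=42)
print(np.corrcoef(np.log(data["LMAs"]), np.log(data["LL"]))[0, 1])
```

Because the values of f are stored alongside the simulated traits, posterior estimates from a fitted model can later be compared directly with them.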

4.1.3 Priors
For leaf i of group j (where j is evergreen or deciduous in the GL1 and GL2 datasets, and j is sun/dry-site, shade/dry-site, sun/wet-site, or shade/wet-site in the PA dataset), we considered two different sets of priors for the latent variables fi:

(i) fi ~ Uniform(0, 1)  (Non-hierarchical model)
(ii) logit(fi) ~ Normal(μ0 + rj, σ)  (Hierarchical model)

where μ0 is constant across all leaves, rj is a group effect (a vector with constant value within each group j), and σ is the variance of fi on the logit scale. In the hierarchical model, μ0, rj, and σ are free parameters. The first set of priors (uniform, non-hierarchical model) is used in the results reported in the main text and in Figs. S5.1-S5.6 because analyses of simulated data (see below) suggested that these priors yielded more robust inferences compared to the hierarchical model.

4.1.4 Parameter estimation
Models were fit and convergence of posterior distributions was assessed as described in the main text.

4.1.5 Randomized LMA datasets
As with the analyses described in the main text, we created randomized LMA datasets corresponding to each of the four simulated datasets described above, and we fit the models to both the original (non-randomized) simulated datasets, as well as the randomized datasets (see details below). The purpose of this analysis was to assess the validity of the randomization approach in the main text. For example, analyses of non-randomized data should yield correlations between LMAp and Aarea (or between LMAs and LL) that are similar to the true correlations in the original simulated data, whereas analyses of randomized data should yield weaker correlations than in the original simulated data.
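
The randomization comparison can be sketched as below. The helper names are hypothetical, and the SES formula shown (observed value minus the mean of the randomized values, divided by their standard deviation), together with a one-sided normal P-value, is a common convention assumed here for illustration; the main text gives the exact definition used in the analyses. The model-fitting step that would produce each correlation is omitted.

```python
import math
import numpy as np

def shuffle_lma(lma, rng, groups=None):
    """Shuffle LMA across leaves, optionally only within groups (e.g., sites)."""
    lma = np.asarray(lma, dtype=float)
    if groups is None:
        return rng.permutation(lma)
    groups = np.asarray(groups)
    out = lma.copy()
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        out[idx] = rng.permutation(lma[idx])
    return out

def ses_and_p(observed, randomized):
    """Standardized effect size and one-sided (upper-tail) normal P-value."""
    randomized = np.asarray(randomized, dtype=float)
    ses = (observed - randomized.mean()) / randomized.std(ddof=1)
    p = 0.5 * math.erfc(ses / math.sqrt(2.0))
    return ses, p

rng = np.random.default_rng(0)
lma = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
site = ["wet", "wet", "wet", "dry", "dry", "dry"]
print(shuffle_lma(lma, rng, groups=site))

# Hypothetical correlations: one from non-randomized data, nine from randomized data.
print(ses_and_p(0.8, [0.1, 0.2, 0.15, 0.05, 0.12, 0.18, 0.09, 0.11, 0.14]))
```

Shuffling within groups keeps group-level differences in LMA intact while breaking the leaf-level association between LMA and the other traits.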

We generated 9 randomized LMA datasets for each of the four simulated datasets, and we fit the same models (Potential LL Model or Optimal LL Model with site effects) to the randomized and non-randomized data. For the simulated GL1, GL2, and WC datasets, we randomized LMA values across leaves, while maintaining the original (non-randomized) data for Aarea, Rarea and LL. For the simulated PA dataset, LMA values were shuffled within sites (wet and dry) and across canopy strata (sun and shade). Thus, when randomizing the PA dataset, we removed the effect of strata (sun vs. shade) on LMA, while maintaining site differences in LMA as well as the observed covariances among Aarea, Rarea and LL. To compare model fit between the non-randomized dataset and the randomized LMA datasets, we calculated a standardized effect size (SES) for the correlation between focal variables and translated it to a P-value as described in the main text.

4.2 Results
Non-informative flat priors for latent variables (non-hierarchical models) performed better than the hierarchical models. True values of model structural parameters (i.e., all parameters except for latent variables) were usually located within the central part of their posterior distributions for models with either non-hierarchical or hierarchical priors on latent variables (Figs. S4.5-S4.7). However, the match between posterior estimates and true values tended to be better for the non-hierarchical models (Figs. S4.5-S4.6), and in one case (WC dataset), the hierarchical model did not converge.

For latent variables (which are relevant to many results and conclusions presented in the main text), the non-hierarchical model yielded posterior means that were unbiased and strongly correlated with the true values for the GL1 and PA datasets, whereas the hierarchical models had weaker relationships with the true values and/or systematic bias (Fig. S4.8).
For the simulated WC dataset, the non-hierarchical model performed better than the hierarchical model, although neither set of priors yielded posterior means that were well-correlated with the true values of latent variables (Fig. S4.8).

LMA randomization is an effective way to identify meaningful correlations between LMAp and LMAs and other traits. When the true correlations between LMAp and LMAs and other traits are strong (as in the simulated GL and PA datasets), the estimated correlations (derived from posterior distributions of the latent variables fi; Eqs. 2-3) are very similar to the true correlations and significantly different from correlations derived from datasets in which LMA was randomized among leaves (see GL1 and PA in Fig. S4.9). In contrast, when the true correlations between LMAp and LMAs and other traits are weak (as in the simulated WC dataset), the estimated correlations and randomization-based correlations are very similar to each other and significantly greater than the true correlations (see WC in Fig. S4.9). These results suggest that (i) if the true correlations between LMAp and LMAs and other traits are weak, then model assumptions (e.g., the degrees of freedom provided by latent variables) can lead to over-estimates of these correlations; but (ii) such artefactual results can be diagnosed using the LMA-randomization tests presented here and in the main text. Specifically, if results obtained from the original data (non-randomized) are significantly different from results obtained from LMA-randomized data, then the tests presented here (Fig. S4.9) suggest that the estimated correlations are similar to the true correlations.

Inferred differences in LMAp and LMAs among groups (e.g., evergreen vs. deciduous leaves) and inferred contributions of LMAp and LMAs to total LMA variance depend more on the data than on model assumptions.
Estimated distributions of LMAp and LMAs for evergreen and deciduous leaves were very similar to the true distributions for simulated GL1 data (which has similar properties to the original GLOPNET dataset), as well as for simulated GL2 data (where LMA distributions for evergreen and deciduous leaves were swapped) (Figs. S4.10-S4.11). The estimated variance contributions of LMAp and LMAs to total LMA variation were very similar to the true values for GL1 and roughly similar to the true values for GL2 (Fig. S4.12). Despite the quantitative mismatch between estimated and true values for GL2 variance partitioning (Fig. S4.12), the estimates are qualitatively correct in identifying LMAp as the dominant variance component in GL2. These test results suggest that, using our modeling approach, estimated differences in LMAp and LMAs between groups of leaves and estimates of LMA variance components are at least qualitatively robust, and depend more strongly on the properties of the dataset than on model assumptions.

Fig. S4.1 Simulated GL1 dataset used to test the Potential LL Model analysis of GLOPNET data. The GL1 dataset has properties similar to those of the original GLOPNET dataset. But for GL1 (unlike GLOPNET), the values of all model parameters (including the latent variables fi that determine LMAp and LMAs for each leaf) are known.

Fig. S4.2 Simulated GL2 dataset used to test the Potential LL Model analysis of GLOPNET data. GL2 is similar to GL1, except that in GL2, the LMA distributions for evergreen and deciduous leaves are swapped (Table S4.2). These non-realistic LMA distributions were used to test the ability of our modeling approach to correctly estimate differences in LMAp and LMAs among groups of leaves.

Fig. S4.3 Simulated PA dataset used to test the Optimal LL Model analysis of Panama data.
The PA dataset has properties similar to those of the original Panama dataset. But for PA (unlike Panama data), the values of all model parameters (including the latent variables fi that determine LMAp and LMAs for each leaf) are known.

Fig. S4.4 The weak correlation (WC) dataset used to test the Potential LL Model. Relationships among traits are much weaker in the WC dataset compared to other simulated datasets (Figs. S4.1-S4.3). The WC dataset was designed to evaluate the performance of our modeling approach in cases where the hypothesized relationships between LMAp and LMAs and other traits were weak.

Fig. S4.5 Posterior distributions for model parameters in the simulated GL1 dataset. Vertical red lines indicate the assumed (“true”) values used to create the simulated data. The non-hierarchical model assumes flat (uniform) priors for each latent variable fi (Eq. 5), whereas the hierarchical model assumes that each latent variable comes from a group-specific normal distribution, where groups are either evergreen, deciduous, or unclassified leaves.

Fig. S4.6 Posterior distributions for model parameters in the simulated PA dataset. Vertical red lines indicate the assumed (“true”) values used to create the simulated data. The non-hierarchical model assumes flat (uniform) priors for each latent variable fi (Eq. 5), whereas the hierarchical model assumes that each latent variable comes from a group-specific normal distribution, where groups are the four combinations of two sites (wet and dry) and two canopy strata (sun and shade).

Fig. S4.7 Posterior distributions for model parameters in the simulated WC dataset. Vertical red lines indicate the assumed (“true”) values used to create the simulated data.
Posteriors for the model with hierarchical priors did not converge, so only results for the non-hierarchical model (with uniform priors for each latent variable) are shown.

Fig. S4.8 Posterior means for the latent variables fi (which determine LMAp and LMAs for each leaf; Eq. 5) plotted against their assumed (“true”) values in three simulated datasets. The dashed line indicates the 1:1 relationship. The two rows correspond to results for two different sets of prior assumptions: hierarchical priors (see legends for Figs. S4.5-S4.7 for details) and non-hierarchical uniform priors. The hierarchical model for the simulated WC dataset did not converge, so no results are shown.

Fig. S4.9 Comparison of correlation coefficients estimated from non-randomized simulated data (‘Est’), those obtained from analyses of LMA-randomized data (‘Rand’), and the assumed (‘True’) correlations used to generate the simulated data. Correlations are shown between LMAp and Aarea, between LMAp and Rarea, and between LMAs and LL. Red solid lines show the true correlations. Blue dashed lines show the posterior means of the estimated (‘Est’) correlations obtained from non-randomized simulated data. Black dashed lines show the posterior means of the correlations obtained from LMA-randomized (‘Rand’) simulated data. Posterior distributions (estimated from 1000 random posterior samples) are also shown for each correlation. P < 0.05 indicates a significantly stronger correlation in models fit to non-randomized data than in models fit to randomized data (P-values based on the standardized effect size, SES; see Methods in the main text for details).

Fig.
S4.10 Boxplots of assumed (“true”) values of LMAp and LMAs in the simulated GL1 dataset, and the corresponding posterior means for LMAp and LMAs estimated from the Potential LL Model applied to the simulated data. The results show that for the GL1 dataset (which has properties similar to the original GLOPNET data), the model accurately recovers the assumed distributions of LMAp and LMAs for deciduous and evergreen leaves.

Fig. S4.11 Boxplots of assumed (“true”) values of LMAp and LMAs in the simulated GL2 dataset, and the corresponding posterior means for LMAp and LMAs estimated from the Potential LL Model applied to the simulated data. The results show that for the GL2 dataset (in which LMA distributions for deciduous and evergreen leaves were swapped relative to the original GLOPNET data), the model accurately recovers the assumed distributions of LMAp and LMAs for deciduous and evergreen leaves.

Fig. S4.12 True and estimated proportions of total LMA variance contributed by variances in LMAp and LMAs for the simulated GL1 and GL2 datasets. ‘True’ values refer to the assumed values in the simulated datasets, whereas ‘Estimated’ values are posterior means from the Potential LL Model applied to the simulated datasets. The results show that the modeling approach yields qualitatively correct inferences (e.g., LMAs variance is dominant in GL1, whereas LMAp variance is dominant in GL2), although quantitative estimates may not be robust (e.g., mismatch between True and Estimated values for GL2).

Table S4.1 Parameter values used to generate simulated datasets. The GL1 and GL2 parameter values were estimated from the Potential LL Model (Eq. 5) applied to the GLOPNET global dataset, as described in the main text. GL1 and GL2 share the same parameter values, but differ in the assumed LMA distributions (see Table S4.2).
The PA parameter values were estimated from the Optimal LL Model with site effects (Notes S3) applied to the Panama data, as described in the main text. The WC (weak correlation) dataset was designed to have weak correlations between LMAp and LMAs and other traits. The parameter values shown here were combined with LMA values (Table S4.2) to create the simulated data shown in Figs. S4.1-S4.4.

Parameter value in each simulated dataset
GL1, GL2   PA      WC     Parameter description
0.23       0.28    1      Slope coefficient relating LMAp to Aarea and Rarea
0.23       0.60    1      Slope coefficient relating LMAs to LL
0.02       0.02    0.02   rp: slope coefficient relating LMAp to Rarea
0.001      0.001   0.02   rs: slope coefficient relating LMAs to Rarea
NA         0.5     NA     Scaling parameter that accounts for shading and thus affects the realized net photosynthetic rate
NA         0.7     NA     Parameter to account for site effects
0          0.8     0      Correlation coefficient ρ12 in the covariance matrix in Eq. 11 in the main text
0.7        0.03    0      Correlation coefficient ρ13 in the covariance matrix in Eq. 11 in the main text
0          0.1     0      Correlation coefficient ρ23 in the covariance matrix in Eq. 11 in the main text
0.5        0.2     1      Standard deviation σ1 in the covariance matrix in Eq. 11 in the main text
0.5        0.5     1      Standard deviation σ2 in the covariance matrix in Eq. 11 in the main text
0.5        0.7     1      Standard deviation σ3 in the covariance matrix in Eq. 11 in the main text

Table S4.2 LMA values in the simulated datasets. The latent variables fi determine LMAp and LMAs for each leaf (Eq. 5).

Simulated dataset and leaf group                       Median of LMA   SD of LMA   Mean of fi
Simulated GL1 (Evergreen has higher LMA): Deciduous    80              0.3         0.6
Simulated GL1 (Evergreen has higher LMA): Evergreen    170             0.5         0.4
Simulated GL2 (Evergreen has lower LMA): Deciduous     170             0.3         0.6
Simulated GL2 (Evergreen has lower LMA): Evergreen     80              0.5         0.4
Simulated PA: Sun Dry                                  90              0.3         0.6
Simulated PA: Shade Dry                                70              0.3         0.7
Simulated PA: Sun Wet                                  40              0.2         0.3
Simulated PA: Shade Wet                                30              0.4         0.6
Weak Correlation (WC)                                  120             0.3         0.4

Fig. S1 Boxplots comparing posterior means of the latent variable f (the fraction of total LMA comprised by LMAp) across deciduous (D) and evergreen (E) leaves in GLOPNET and Panama, and across sites (wet and dry) and canopy strata (sun and shade) in Panama. Note that LMAp = f × LMA, and LMAs = (1 – f) × LMA.
(a) Deciduous and evergreen leaves in the GLOPNET dataset; (b) deciduous and evergreen leaves for Panama species for which both sun and shade leaves were available; (c) leaves for Panama species for which both sun and shade leaves were available; and (d) all leaves for Panama. At the dry Panama site, the increase in LMA from shade to sun (Fig. 5) is due to roughly equal proportional increases in LMAp and LMAs (because f is similar between sun and shade), whereas at the wet Panama site, the increase in LMA from shade to sun (Fig. 5) is due primarily to increased LMAp (because f is greater in sun than shade). GLOPNET results are for the Potential LL Model (Eq. 5), and Panama results are for the Optimal LL Model with site effects (Eq. 13 and Notes S3). The center line in each box indicates the median, upper and lower box edges indicate the interquartile range, whiskers show 1.5 times the interquartile range, and points are outliers. Groups sharing the same letters are not significantly different (P > 0.05; t-tests).

Fig. S2 Boxplots comparing leaf mass per area (LMA), photosynthetic leaf mass per area (LMAp; posterior means), and structural leaf mass per area (LMAs; posterior means) across sites (wet and dry) and canopy strata (sun and shade) in Panama. The results shown here include all leaves in the Panama dataset, whereas Fig. 5 in the main text only includes Panama species for which both sun and shade leaves were available. Boxplot symbols as in Fig. S1. Groups sharing the same letters are not significantly different (P > 0.05; t-tests).

Fig. S3 Measured traits related to photosynthesis and metabolism (nitrogen and phosphorus per-unit leaf area; Narea and Parea) are positively correlated with LMA and with estimates (posterior means) of the photosynthetic and structural LMA components (LMAp and LMAs, respectively) in the GLOPNET dataset.
LMAp yields stronger correlations and more consistent relationships than LMA and LMAs; e.g., evergreen and deciduous leaves align along a single relationship in panel b, but not in panels a or c. Gray symbols show model results from one of ten randomized datasets in which LMA was randomized among all leaves in GLOPNET. Pearson correlation coefficients are for observed LMA (left column) and posterior means of LMAp (middle column) and LMAs (right column). P-values (*** P < 0.001) for LMA test the null hypothesis of zero correlation, and for LMAp and LMAs test the null hypothesis of equal correlation in observed and randomized datasets (see ‘Randomized LMA Datasets’ in Methods for details).

Fig. S4 Boxplots comparing cellulose content (percent of total leaf mass) across sites (wet and dry) and canopy strata (sun and shade) in Panama. (a) All leaves in the Panama dataset; and (b) leaves for Panama species for which both sun and shade leaves were available. Cellulose content (percent mass) is similar for shade and sun at the dry site, but higher for shade than sun at the wet site. This pattern is consistent with estimates of LMAp and LMAs fractions (Fig. S1c-d); i.e., at the wet site, LMAs comprises a larger fraction of total LMA in shade than sun (because LMAp comprises a larger fraction in sun than shade; Fig. S1c-d). Boxplot symbols as in Fig. S1. Groups sharing the same letters are not significantly different (P > 0.05; t-tests).

Fig. S5 Posterior means of LMAp vs LMAs in the (a) GLOPNET and (b) Panama datasets. Correlations between LMAp and LMAs are significantly positive, but the small r2 values indicate that a single axis could not accurately represent the two-dimensional space. Symbols as in Main Text Figs. 1-2.

Table S1 Summary of results for Potential LL Model (Eq. 5) and Optimal LL Models (Eq. 13 and Notes S3). Column definitions: Data = GLOPNET or Panama. Model = Potential, Optimal or Optimal with site effects. n.obs = number of samples analyzed.
n.par = number of free structural parameters (excluding latent variables) in models (see notes below table for details). WAIC = Watanabe-Akaike information criterion, also known as the widely applicable information criterion. WAIC is a predictive information criterion for Bayesian models, with lower values indicating a more parsimonious model. The remaining columns give the posterior mean and 95% credible interval for each parameter: the slope coefficient relating LMAp to Aarea, the slope coefficient relating LMAs to LL, rp, rs, the scaling parameter that accounts for shading, and the site-effect parameter (parameter descriptions as in Table S4.1).

Data      LL Model                    n.obs   n.par   WAIC   Slope for LMAp         Slope for LMAs
GLOPNET   Potential                   198     10      863    0.234 [0.208, 0.263]   0.23 [0.203, 0.264]
Panama    Potential                   132     10      516    0.298 [0.272, 0.329]   0.499 [0.402, 0.636]
Panama    Optimal                     132     11      378    0.282 [0.256, 0.316]   0.611 [0.542, 0.697]
Panama    Optimal with site effects   132     12      347    0.283 [0.254, 0.319]   0.691 [0.605, 0.799]

Data      LL Model                    rp                     rs                       Shading scaling parameter   Site-effect parameter
GLOPNET   Potential                   0.023 [0.02, 0.026]    0.002 [0.001, 0.003]     NA [NA, NA]                 NA [NA, NA]
Panama    Potential                   0.021 [0.016, 0.027]   0.002 [-0.001, 0.007]    NA [NA, NA]                 NA [NA, NA]
Panama    Optimal                     0.021 [0.018, 0.026]   0 [-0.003, 0.003]        0.279 [0.253, 0.335]        NA [NA, NA]
Panama    Optimal with site effects   0.022 [0.018, 0.026]   -0.001 [-0.004, 0.002]   0.28 [0.253, 0.332]         0.698 [0.564, 0.863]

Note: Free structural parameters for the Potential LL Model: the two slope coefficients (relating LMAp to Aarea and LMAs to LL), rp, rs, the correlation coefficients ρ12, ρ13 and ρ23, and the standard deviations σ1, σ2 and σ3. Free structural parameters for the Optimal LL Model: the same parameters plus the scaling parameter that accounts for shading. The model with site effects includes one additional parameter, the site-effect parameter (see Notes S3).

Table S2 Correlations between LMAp or LMAs and other traits (on logarithmic scales) derived from the Potential LL model (Eq. 5) fit to GLOPNET data. Pearson correlations are given for models fit to the observed datasets (robs) and randomized datasets (rrand). For randomized datasets, rrand is the mean correlation for 10 datasets in which LMA was randomized across all leaves. P-values test the null hypothesis of equal correlation in observed and randomized datasets (see ‘Randomized LMA Datasets’ in Methods for details).
LL model   Trait   LMA component  robs    rrand (LMA randomized across all leaves)  P-value
Potential  Aarea   LMAp           0.672   0.484                                     < 0.001
Potential  Aarea   LMAs           -0.234  -0.401                                    < 0.001
Potential  Rarea   LMAp           0.59    0.369                                     < 0.001
Potential  Rarea   LMAs           0.124   -0.237                                    < 0.001
Potential  LL      LMAp           0.108   -0.562                                    < 0.001
Potential  LL      LMAs           0.916   0.542                                     < 0.001
Potential  Narea   LMAp           0.645   0.074                                     < 0.001
Potential  Narea   LMAs           0.477   -0.027                                    < 0.001
Potential  Parea   LMAp           0.57    0.234                                     < 0.001
Potential  Parea   LMAs           0.304   -0.137                                    < 0.001

Table S3 Correlations between LMAp or LMAs and other traits (on logarithmic scales) derived from the three alternative LL models fit to Panama data: Potential LL Model (Eq. 5), Optimal LL Model without site effect (Eq. 13), and Optimal LL Model with site effect (Notes S3). Pearson correlations are given for models fit to the observed datasets (robs) and randomized datasets (rrand). For randomized datasets, rrand is the mean correlation for 10 datasets in which LMA was randomized across all leaves, and 10 datasets in which LMA was randomized within sites (wet and dry) across canopy strata (sun and shade). P-values test the null hypothesis of equal correlation in observed and randomized datasets (see ‘Randomized LMA Datasets’ in Methods for details).
In the column headers, "all leaves" refers to LMA randomized across all leaves, and "within sites" refers to LMA randomized within sites across strata.

LL model                  Trait   LMA component  robs    rrand (all leaves)  P-value (all leaves)  rrand (within sites)  P-value (within sites)
Potential                 Aarea   LMAp           0.970   0.646               < 0.001               0.668                 < 0.001
Potential                 Aarea   LMAs           0.057   -0.463              < 0.001               -0.428                < 0.001
Potential                 Rarea   LMAp           0.618   0.528               0.157                 0.496                 0.063
Potential                 Rarea   LMAs           0.167   -0.381              < 0.001               -0.355                < 0.001
Potential                 LL      LMAp           -0.432  -0.613              0.254                 -0.526                < 0.001
Potential                 LL      LMAs           0.368   0.491               < 0.001               0.642                 < 0.001
Potential                 Narea   LMAp           0.854   0.575               < 0.001               0.592                 < 0.001
Potential                 Narea   LMAs           0.289   -0.376              < 0.001               -0.344                < 0.001
Potential                 Parea   LMAp           0.768   0.571               < 0.001               0.550                 < 0.001
Potential                 Parea   LMAs           0.176   -0.383              < 0.001               -0.393                < 0.001
Potential                 CLarea  LMAp           0.600   0.252               < 0.001               0.309                 < 0.001
Potential                 CLarea  LMAs           0.651   -0.141              < 0.001               -0.042                < 0.001
Optimal                   Aarea   LMAp           0.883   0.497               < 0.001               0.531                 < 0.001
Optimal                   Aarea   LMAs           0.17    -0.366              < 0.001               -0.312                < 0.001
Optimal                   Rarea   LMAp           0.676   0.392               < 0.01                0.382                 < 0.01
Optimal                   Rarea   LMAs           0.095   -0.308              < 0.001               -0.282                < 0.001
Optimal                   LL      LMAp           -0.549  -0.654              < 0.001               -0.573                0.307
Optimal                   LL      LMAs           0.556   0.600               0.317                 0.726                 < 0.01
Optimal                   Narea   LMAp           0.823   0.427               < 0.001               0.459                 < 0.001
Optimal                   Narea   LMAs           0.326   -0.284              < 0.001               -0.242                < 0.001
Optimal                   Parea   LMAp           0.766   0.448               < 0.001               0.441                 < 0.001
Optimal                   Parea   LMAs           0.196   -0.31               < 0.001               -0.308                < 0.001
Optimal                   CLarea  LMAp           0.581   0.113               < 0.001               0.183                 < 0.001
Optimal                   CLarea  LMAs           0.695   -0.021              < 0.001               0.080                 < 0.001
Optimal with site effect  Aarea   LMAp           0.846   0.454               < 0.001               0.502                 < 0.001
Optimal with site effect  Aarea   LMAs           0.208   -0.337              < 0.001               -0.327                < 0.001
Optimal with site effect  Rarea   LMAp           0.690   0.439               < 0.001               0.400                 < 0.01
Optimal with site effect  Rarea   LMAs           0.063   -0.354              < 0.001               -0.331                < 0.001
Optimal with site effect  LL      LMAp           -0.503  -0.502              < 0.01                -0.451                0.229
Optimal with site effect  LL      LMAs           0.583   0.533               < 0.05                0.718                 < 0.01
Optimal with site effect  Narea   LMAp           0.808   0.409               < 0.001               0.444                 < 0.001
Optimal with site effect  Narea   LMAs           0.332   -0.300              < 0.001               -0.279                < 0.001
Optimal with site effect  Parea   LMAp           0.740   0.384               < 0.001               0.391                 < 0.001
Optimal with site effect  Parea   LMAs           0.235   -0.249              < 0.001               -0.292                < 0.001
Optimal with site effect  CLarea  LMAp           0.616   0.205               < 0.001               0.252                 < 0.001
Optimal with site effect  CLarea  LMAs           0.704   -0.132              < 0.001               0.001                 < 0.001

Table S4 Correlations between observed and predicted LL (on logarithmic scales) for the Potential LL Model (Eq. 5) and the Optimal LL Model (Eq. 13 and Notes S3) fit to Panama data.
Pearson correlations are given for models fit to the observed datasets (robs) and randomized datasets (rrand). For randomized datasets, rrand is the mean correlation for 10 datasets in which LMA was randomized across all leaves, and 10 datasets in which LMA was randomized within sites (wet and dry) across canopy strata (sun and shade). P-values test the null hypothesis of equal correlation in observed and randomized datasets (see ‘Randomized LMA Datasets’ in Methods for details). In the column headers, "all leaves" refers to LMA randomized across all leaves, and "within sites" refers to LMA randomized within sites across strata.

Model                     robs    rrand (all leaves)  P-value (all leaves)  rrand (within sites)  P-value (within sites)
Potential                 0.382   0.502               < 0.001               0.652                 < 0.001
Optimal                   0.759   0.518               < 0.001               0.637                 < 0.001
Optimal with site effect  0.813   0.664               < 0.001               0.705                 < 0.001

References

Azcón-Bieto J, Osmond CB. 1983. Relationship between photosynthesis and respiration: the effect of carbohydrate status on the rate of CO2 production by respiration in darkened and illuminated wheat leaves. Plant Physiology 71: 574–581.

Azcón-Bieto J, Lambers H, Day DA. 1983. Effect of photosynthesis and carbohydrate status on respiratory rates and the involvement of the alternative pathway in leaf respiration. Plant Physiology 72: 598–603.

Chen A, Lichstein JW, Osnas JLD, Pacala SW. 2014. Species-independent down-regulation of leaf photosynthesis and respiration in response to shading: evidence from six temperate tree species. PLoS ONE 9: e91798.

Gelman A, Hwang J, Vehtari A. 2014. Understanding predictive information criteria for Bayesian models. Statistics and Computing 24: 997–1016.

Kikuzawa K. 1991. A cost-benefit analysis of leaf habit and leaf longevity of trees and their geographical pattern. The American Naturalist: 1250–1263.

Kikuzawa K, Shirakawa H, Suzuki M, Umeki K. 2004. Mean labor time of a leaf. Ecological Research 19: 365–374.

Osnas JLD, Lichstein JW, Reich PB, Pacala SW. 2013. Global leaf trait relationships: mass, area, and the leaf economics spectrum. Science 340: 741–744.

Watanabe S. 2010.
Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. The Journal of Machine Learning Research 11: 3571–3594.
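
As a rough illustration of the WAIC values reported in Table S1 (Watanabe, 2010; Gelman et al., 2014), the sketch below computes WAIC from a matrix of pointwise log-likelihoods, using the variance-based penalty described by Gelman et al. (2014). It is not the code used for the analyses in this article: the function name `waic`, the input layout (posterior draws in rows, observations in columns) and the toy numbers are assumptions made for illustration only.

```python
import math
from statistics import variance


def waic(log_lik):
    """Return (WAIC, lppd, p_waic) on the deviance scale.

    log_lik[s][i] is the log-likelihood of observation i under posterior
    draw s (hypothetical layout; at least two draws are required).
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters (variance-based penalty)
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        m = max(column)
        # log of the posterior-mean likelihood, computed stably (log-sum-exp)
        lppd += m + math.log(sum(math.exp(v - m) for v in column) / n_draws)
        # sample variance of the log-likelihood across posterior draws
        p_waic += variance(column)
    return -2.0 * (lppd - p_waic), lppd, p_waic


# Toy input: 3 posterior draws (rows) x 3 observations (columns); made-up values.
toy_log_lik = [
    [-1.0, -2.0, -0.5],
    [-1.2, -1.8, -0.7],
    [-0.9, -2.1, -0.6],
]
waic_value, lppd, p_waic = waic(toy_log_lik)
print(f"WAIC = {waic_value:.3f}, lppd = {lppd:.3f}, p_waic = {p_waic:.3f}")
```

Under this definition, a model with a lower WAIC is preferred, which is how the WAIC column of Table S1 is read when comparing the Potential and Optimal LL Models fit to the same data.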