Supplemental Figures

Appendix

Identifying large-effect loci through univariate linear mixed models

We used GEMMA (Zhou et al. 2013) to fit univariate linear mixed models to identify large-effect loci for the fitness-related phenotypic traits measured herein. To control for population structure, we used the kinship matrix estimated for BSLMM. Allelic associations were judged significant by converting the Wald’s p-values to q-values (Storey et al. 2015; v2.2.2). Additionally, we used GEMMA to obtain maximum likelihood estimates (MLE) of PVE from our IDS SNP dataset for each phenotype.

We found little evidence of large-effect loci within our dataset, as none of the q-values were below the threshold (Table S2). We did relax the threshold for inclusion by isolating loci with -ln(Wald’s p) ≥ 10 (discussed in main text). Thus, few to no loci were confidently associated with phenotype using univariate LMM, suggesting the absence of large-effect loci within our dataset.
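As a rough illustration of the significance calls described above, the following Python sketch converts p-values to q-values and applies the two -ln(p) thresholds. It is not the procedure used in the analysis. The q-values here are Benjamini-Hochberg adjusted p-values, which match Storey q-values only under the assumption that the proportion of true null hypotheses is 1. All function and variable names are hypothetical. The SNP count (116,231) and the conservative threshold, -ln(0.05/116231), are taken from the caption of Figure S10.

```python
import math

N_SNPS = 116231  # SNP count reported in the Figure S10 caption


def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (assumes pi0 = 1, a simplification)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the minimum of
    # p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q


def neg_ln(p):
    """Negative natural log of a p-value, the scale used for the thresholds."""
    return -math.log(p)


conservative_line = neg_ln(0.05 / N_SNPS)  # most conservative threshold
relaxed_line = 10.0                        # relaxed threshold, -ln(p) >= 10


def exceeds(p, line):
    """True if a p-value reaches the given -ln(p) threshold."""
    return neg_ln(p) >= line
```

A p-value of 1e-5, for example, clears the relaxed threshold but not the conservative one.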
Estimation of $h^2$ and $Q_{ST}$

Assuming that the seedlings collected from each maternal tree were half-siblings, we estimated the mean heritability ($h^2$) across populations and differentiation ($Q_{ST}$) for each phenotypic trait as:

$$h^2 = \frac{4\sigma^2_{fam(pop)}}{\sigma^2_{fam(pop)} + \sigma^2_{residual}}$$

$$Q_{ST} = \frac{\sigma^2_{pop}}{\sigma^2_{pop} + 8\sigma^2_{fam(pop)}},$$

where $\sigma^2_{fam(pop)}$ is the variance component attributed to the random effect of family nested in population, $\sigma^2_{pop}$ is the variance component attributed to the random effect of population, and $\sigma^2_{residual}$ is the variance component attributed to residual effects. Confidence intervals around these point estimates were constructed using parametric bootstrapping (n = 1000 replicates) as carried out using the simulate function in the stats package of R (see Maloney et al. in review for more details).
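The two estimators above can be written as a short Python sketch. It assumes the three variance components have already been estimated elsewhere (for example, from a mixed model with family nested in population). The function names and the numeric inputs are hypothetical and serve only to show the arithmetic.

```python
def heritability(var_fam, var_resid):
    """h^2 under the half-sibling assumption: 4 * family variance over
    the sum of the family and residual variance components."""
    return 4.0 * var_fam / (var_fam + var_resid)


def q_st(var_pop, var_fam):
    """Q_ST: population variance over population variance plus
    8 * family(population) variance."""
    return var_pop / (var_pop + 8.0 * var_fam)


# Illustrative (made-up) variance components, not values from this study.
example_h2 = heritability(var_fam=0.1, var_resid=0.9)
example_qst = q_st(var_pop=0.2, var_fam=0.1)
```

Applying the same two functions to each bootstrap replicate of the variance components would give the replicate distribution from which an interval can be read.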
Supplemental Figures
Figure S1 Distributions of expected heterozygosity across bayenv2 focal loci for 9 of the 18 environments. (A) AWS0-25, (B) AWS0-50, (C) Annual precipitation, (D) CEC, (E) Clay, (F) Elevation, (G) GDD-Aug, (H) GDD-May, (I) Latitude.
Figure S2 Distributions of expected heterozygosity across bayenv2 focal loci for 9 of the 18 environments. (A) Longitude, (B) Maximum solar radiation input, (C) Rock coverage, (D) Sand, (E) Silt, (F) Tmax-July, (G) Tmin-Jan, (H) WC-15bar, (I) WC-⅓bar.
Figure S3 Single-locus $F_{ST}$ for all SNPs (N = 116,231) calculated with hierfstat. 95% CI: -0.0289, 0.0428.
Figure S4 Distributions of the harmonic mean posterior inclusion probability $\overline{PIP}$ ($\bar{\gamma}$) for loci identified by the 99.9th or the 99.8th percentile of $\overline{PIP}$ estimated from BSLMM.
Figure S5 Effect size distributions of main effect ($\bar{\beta}$) for loci identified by the 99.9th or the 99.8th percentile of $\overline{PIP}$ estimated from BSLMM. Legend as in Figure S4.
Figure S6 Effect size distributions of sparse effect ($\bar{\alpha}$) for loci identified by the 99.9th or the 99.8th percentile of $\overline{PIP}$ estimated from BSLMM. Legend as in Figure S4.
Figure S7 Effect size distributions of total effect ($\hat{b} = \bar{\alpha} + \bar{\beta} \cdot \overline{PIP}$) for loci of the 99.9th or the 99.8th percentile of $\overline{PIP}$ estimated from BSLMM. Legend as in Figure S4.
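The total effect in the caption of Figure S7, and the selection of loci by percentile of $\overline{PIP}$, can be sketched in Python as follows. This is a minimal illustration, not the code used for the figures. The inputs are assumed to be per-locus lists of $\bar{\alpha}$, $\bar{\beta}$ and $\overline{PIP}$ already averaged across BSLMM runs. The function names are hypothetical, and the nearest-rank percentile definition is an assumption, since the percentile convention is not stated here.

```python
import math


def total_effects(alpha, beta, pip):
    """Per-locus total effect: alpha + beta * PIP."""
    return [a + b * g for a, b, g in zip(alpha, beta, pip)]


def top_percentile_indices(values, pct):
    """Indices of values at or above the pct-th percentile.

    Uses the nearest-rank definition of a percentile (an assumption).
    """
    ranked = sorted(values)
    k = max(1, math.ceil(pct / 100 * len(ranked)))
    threshold = ranked[k - 1]
    return [i for i, v in enumerate(values) if v >= threshold]
```

Calling `top_percentile_indices(pip, 99.8)` on the per-locus PIP list would return the loci treated as focal under the 99.8th percentile cutoff, whose total effects can then be looked up in the output of `total_effects`.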
Figure S8 Histograms of multilocus $F_{ST}$ (blue bars) as calculated with hierfstat for all SNPs (N = 116,231). Vertical lines mark focal SNPs identified by bayenv2: red lines are SNPs below the 95th percentile of $F_{ST}$, purple lines are SNPs between the 95th and the 99.9th percentiles of $F_{ST}$, and blue lines are SNPs with $F_{ST}$ greater than the 99.9th percentile. (A) AWS0-25 = 95 SNPs, (B) AWS0-50 = 147 SNPs, (C) Ann-ppt = 49 SNPs, (D) CEC = 14 SNPs, (E) Clay = 22 SNPs, (F) Elevation = 143 SNPs, (G) GDD-Aug = 157 SNPs, (H) GDD-May = 80 SNPs, (I) Latitude = 199 SNPs.
Figure S9 Histograms of multilocus $F_{ST}$ (blue bars) as calculated with hierfstat for all SNPs (N = 116,231). Vertical lines mark focal SNPs identified by bayenv2: red lines are SNPs below the 95th percentile of $F_{ST}$, purple lines are SNPs between the 95th and the 99.9th percentiles of $F_{ST}$, and blue lines are SNPs with $F_{ST}$ greater than the 99.9th percentile. (A) longitude = 67 SNPs, (B) percent maximum radiation input = 144 SNPs, (C) percent rock coverage = 143 SNPs, (D) percent sand = 111 SNPs, (E) silt = 140 SNPs, (F) maximum July temperature = 50 SNPs, (G) minimum January temperature = 116 SNPs, (H) WC-15bar = 86 SNPs, (I) WC-⅓bar = 97 SNPs.
Figure S10 P-values from Wald’s tests used in single-locus phenotypic association implemented through univariate LMM using the GEMMA software package. Dashed lines indicate -ln(0.05/116231), the most conservative threshold for inclusion. Dotted lines indicate a relaxed threshold, $-\ln(p_{Wald}) \geq 10$, to investigate overlap with focal SNPs identified from BSLMM, OutFLANK, and bayenv2. (A) bud flush, (B) 13C, (C) height, (D) 15N, (E) root:shoot biomass. Order of markers does not reflect physical distance.
Figure S11 Principal component analysis of allele frequencies for the empirical dataset imputed with Beagle. Percent variance explained for each PC is given in the axis labels. SNPs across the 6 populations used for phenotypic association show a similar pattern (data not shown).
Figure S12 Violin plots for main effects ($\bar{\alpha}$), sparse effects ($\bar{\beta}$), the posterior inclusion probability ($\overline{PIP}$), and model averaged effects ($\hat{b} = \bar{\alpha}_i + \bar{\beta}_i \overline{PIP}_i$) estimated in BSLMM.
Figure S13 Expected heterozygosity across all loci in the empirical set of SNPs (n = 116,231). SNPs were binned according to expected heterozygosity, with bins of 0.01 from 0 to 0.50. SNPs across the 6 populations used for phenotypic association show a similar pattern (data not shown).
Figure S14 Expected heterozygosity across focal SNPs from OutFLANK (n = 110). SNPs were binned according to expected heterozygosity, with bins of 0.01 from 0 to 0.50.
Figure S15 Mean allele frequency difference (AFD) among 8 populations for focal loci associated with environment (red line) by bayenv2 and from 1000 sets of random SNPs chosen by $H_E$ (black distributions). Number of loci associated with various environments is given in Table 3. (A) AWS0-25 = 95 SNPs, (B) AWS0-50 = 147 SNPs, (C) Ann-ppt = 49 SNPs, (D) CEC = 14 SNPs, (E) Clay = 22 SNPs, (F) Elevation = 143 SNPs, (G) GDD-Aug = 157 SNPs, (H) GDD-May = 80 SNPs, (I) Latitude = 199 SNPs.
Figure S16 Mean allele frequency difference (AFD) among 8 populations for focal loci associated with environment (red line) by bayenv2 and from 1000 sets of random SNPs chosen by $H_E$ (black distributions). Number of loci associated with various environments is given in Table 3. (A) Longitude, (B) Maximum solar radiation input, (C) Rock coverage, (D) Sand, (E) Silt, (F) Tmax-July, (G) Tmin-Jan, (H) WC-15bar, (I) WC-⅓bar.
Figure S17 Mean allele frequency difference (AFD) among 6 populations for focal loci associated with phenotype (red line) by BSLMM (≥ 99.8th percentile of $\overline{PIP}$) and from 1000 sets of random SNPs chosen by $H_E$ (black distributions). Number of loci associated with various environments is given in Table 4. (A) bud flush, (B) 13C, (C) height, (D) 15N, (E) root:shoot biomass.
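The comparison in Figures S15 to S17, an observed mean AFD set against 1000 sets of random SNPs chosen by $H_E$, can be sketched in Python under several assumptions that are not stated in the captions. AFD is taken here as the mean absolute allele frequency difference over all pairs of populations. "Chosen by $H_E$" is read as drawing, for each focal SNP, a random SNP from the same 0.01-wide $H_E$ bin. All function names and the input layout are hypothetical.

```python
import random
from itertools import combinations


def he_bin(he, width=0.01):
    """Index of the H_E bin; the top edge (0.50) is folded into the last bin."""
    n_bins = int(round(0.5 / width))
    return min(int(he / width + 1e-9), n_bins - 1)


def mean_afd(freqs):
    """Mean absolute allele frequency difference over all population pairs."""
    pairs = list(combinations(freqs, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def null_distribution(focal, background, n_sets=1000, seed=1):
    """Mean AFD for n_sets random SNP sets matched to the focal SNPs by H_E bin.

    focal, background: lists of (H_E, [per-population allele frequencies]).
    Each null set swaps every focal SNP for a random background SNP from the
    same H_E bin, then averages mean_afd over the set.
    """
    rng = random.Random(seed)
    by_bin = {}
    for he, freqs in background:
        by_bin.setdefault(he_bin(he), []).append(freqs)
    null = []
    for _ in range(n_sets):
        draws = [rng.choice(by_bin[he_bin(he)]) for he, _ in focal]
        null.append(sum(mean_afd(f) for f in draws) / len(draws))
    return null
```

The sketch raises a `KeyError` if a focal SNP falls in a bin with no background SNPs, so in practice the background set would need to cover every occupied bin.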
Figure S18 Expected heterozygosity across focal (≥ 99.9th percentile of $\overline{PIP}$) loci identified by BSLMM. (A) bud flush, (B) 13C, (C) height, (D) 15N, (E) root:shoot biomass.
Figure S19 Expected heterozygosity across focal (≥ 99.8th percentile of $\overline{PIP}$) loci identified by BSLMM. (A) bud flush, (B) 13C, (C) height, (D) 15N, (E) root:shoot biomass.
Supplemental Tables
Table S1
Bin (% missing data)   n       Fraction of N
≤10%                   6       3.755e-05
>10% and ≤20%          6476    0.0405
>20% and ≤30%          29890   0.1871
>30% and ≤40%          40201   0.2516
>40% and ≤50%          39658   0.2482
>50%                   0       0.00000
TABLE S1 Degree of missing data across SNPs in dataset. Count (n) and fraction of all loci (N = 161,231) by bin.
Table S2
Phenotype    N SNPs   PVE (se)             Min q-value
Bud Flush    0        1.075e-06 (na)       0.9946
13C          0        1.081e-06 (0.4066)   0.2049
Height       0        1.081e-06 (0.7217)   0.6120
15N          0        1.081e-06 (na)       0.9999
Root:Shoot   0        0.3526 (0.2529)      0.7336
TABLE S2 Results of univariate linear mixed models (LMM) as implemented in GEMMA. N SNPs is the number of SNPs with a significant effect (q-value ≤ 0.05). PVE = maximum likelihood estimate of the percent phenotypic variance explained across individual SNPs with large effect. The final column gives the minimum q-value across SNPs for each phenotype.
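Table S3
Comparison                   Group 1        Group 2        Overlap  Large-effect loci present
99.8th PIPs and bayenv2      Bud flush      GDD-May        1        0
99.8th PIPs and bayenv2      Bud flush      Rock-cov       1        0
99.8th PIPs and bayenv2      13C            Elevation      1        0
99.8th PIPs and bayenv2      13C            Longitude      2        0
99.8th PIPs and bayenv2      13C            Tmin-Jan       1        0
99.8th PIPs and bayenv2      13C            WC-⅓bar        1        0
99.8th PIPs and bayenv2      Height         Clay           1        0
99.8th PIPs and bayenv2      Height         Elevation      1        0
99.8th PIPs and bayenv2      Height         Max-rad-input  2        0
99.8th PIPs and bayenv2      Height         Tmax-July      1        0
99.8th PIPs and bayenv2      15N            AWS0-50        1        0
99.8th PIPs and bayenv2      15N            Sand           1        0
99.8th PIPs and bayenv2      Root:shoot     Elevation      1        0
99.8th PIPs and 99.8th PIPs  Root:shoot     Bud flush      3        0
99.8th PIPs and 99.8th PIPs  15N            Height         1        0
99.8th PIPs and 99.8th PIPs  13C            15N            1        0
99.8th PIPs and LMM loci     Rock coverage  Rock coverage  1        1
99.8th PIPs and LMM loci     Bud flush      Bud flush      1        1
99.8th PIPs and LMM loci     15N            15N            1        1
99.8th PIPs and LMM loci     13C            13C            3        3
99.8th PIPs and LMM loci     Height         Height         3        3
bayenv2 and LMM loci         Rock coverage  n/a            1        1
OutFLANK and LMM loci        n/a            n/a            0        0
OutFLANK and bayenv2         AWS0-25        n/a            1        0
OutFLANK and bayenv2         Ann-ppt        n/a            1        0
OutFLANK and bayenv2         Elevation      n/a            1        0
OutFLANK and bayenv2         GDD-Aug        n/a            2        0
OutFLANK and bayenv2         Max-rad input  n/a            1        0
OutFLANK and bayenv2         Rock coverage  n/a            3        0
OutFLANK and bayenv2         Sand           n/a            1        0
OutFLANK and bayenv2         Tmin-Jan       n/a            3        0
OutFLANK and bayenv2         WC-15bar       n/a            1        0
OutFLANK and bayenv2         WC-⅓bar        n/a            3        0
bayenv2 and bayenv2          AWS0-25        AWS0-50        75       0
bayenv2 and bayenv2          Silt           AWS0-50        73       0
bayenv2 and bayenv2          Silt           Sand           63       0
bayenv2 and bayenv2          WC-15bar       AWS0-50        49       0
bayenv2 and bayenv2          WC-⅓bar        WC-15bar       46       0
bayenv2 and bayenv2          Silt           AWS0-25        43       0
bayenv2 and bayenv2          GDD-Aug        Elevation      43       0
bayenv2 and bayenv2          WC-15bar       Silt           42       0
bayenv2 and bayenv2          WC-⅓bar        AWS0-50        35       0
bayenv2 and bayenv2          Sand           AWS0-50        32       0
bayenv2 and bayenv2          WC-15bar       AWS0-25        30       0
bayenv2 and bayenv2          WC-⅓bar        Silt           27       0
bayenv2 and bayenv2          Latitude       Elevation      26       0
bayenv2 and bayenv2          Sand           AWS0-35        23       0
bayenv2 and bayenv2          Tmin-Jan       Rock-cov       22       0
bayenv2 and bayenv2          Tmin-Jan       GDD-Aug        22       0
bayenv2 and bayenv2          Tmin-Jan       Ann-ppt        20       0
bayenv2 and bayenv2          Rock-cov       Longitude      19       0
bayenv2 and bayenv2          Longitude      Ann-ppt        18       0
bayenv2 and bayenv2          Rock-cov       Ann-ppt        17       0
bayenv2 and bayenv2          WC-15bar       Sand           16       0
bayenv2 and bayenv2          Tmin-Jan       Longitude      15       0
bayenv2 and bayenv2          WC-⅓bar        AWS0-25        13       0
bayenv2 and bayenv2          Silt           Max rad input  13       0
bayenv2 and bayenv2          GDD-May        Elevation      12       0
bayenv2 and bayenv2          Tmin-Jan       Elevation      10       0
bayenv2 and bayenv2          Latitude       GDD-Aug        10       0
bayenv2 and bayenv2          Tmax-July      AWS0-50        9        0
bayenv2 and bayenv2          Latitude       GDD-May        9        0
bayenv2 and bayenv2          Max rad input  AWS0-50        8        0
bayenv2 and bayenv2          Rock cov       GDD-May        8        0
bayenv2 and bayenv2          Sand           Max rad input  7        0
bayenv2 and bayenv2          GDD-May        GDD-Aug        6        0
bayenv2 and bayenv2          WC-15bar       Max rad input  7        0
bayenv2 and bayenv2          Tmax-July      AWS0-25        5        0
bayenv2 and bayenv2          WC-⅓bar        Tmax-July      5        0
bayenv2 and bayenv2          WC-⅓bar        Max rad input  4        0
bayenv2 and bayenv2          Sand           GDD-May        4        0
bayenv2 and bayenv2          Max rad input  AWS0-25        4        0
bayenv2 and bayenv2          Rock cov       Max rad input  4        0
bayenv2 and bayenv2          Tmax-July      GDD-May        4        0
bayenv2 and bayenv2          WC-⅓bar        Rock cov       3        0
bayenv2 and bayenv2          Tmax-July      Silt           3        0
bayenv2 and bayenv2          WC-15bar       Tmax-July      3        0
bayenv2 and bayenv2          Tmax-July      Latitude       3        0
bayenv2 and bayenv2          WC-⅓bar        Sand           3        0
bayenv2 and bayenv2          WC-⅓bar        GDD-Aug        2        0
bayenv2 and bayenv2          Silt           GDD-Aug        2        0
bayenv2 and bayenv2          Tmin-Jan       Silt           2        0
bayenv2 and bayenv2          Rock cov       CEC            2        0
bayenv2 and bayenv2          Elevation      Clay           2        0
bayenv2 and bayenv2          Elevation      AWS0-50        2        0
bayenv2 and bayenv2          Elevation      AWS-25         2        0
bayenv2 and bayenv2          Rock cov       GDD-Aug        2        0
bayenv2 and bayenv2          Rock cov       Elevation      2        0
bayenv2 and bayenv2          WC-⅓bar        Latitude       2        0
bayenv2 and bayenv2          Sand           GDD-Aug        2        0
bayenv2 and bayenv2          WC-⅓bar        Longitude      2        0
bayenv2 and bayenv2          Tmax-July      Elevation      2        0
bayenv2 and bayenv2          WC-⅓bar        GDD-Aug        1        0
bayenv2 and bayenv2          Lat            AWS0-25        1        0
bayenv2 and bayenv2          Tmax-July      Max rad input  1        0
bayenv2 and bayenv2          WC-⅓bar        Elevation      1        0
bayenv2 and bayenv2          GDD-Aug        Ann-ppt        1        0
bayenv2 and bayenv2          Sand           Elevation      1        0
bayenv2 and bayenv2          WC-15bar       Elevation      1        0
bayenv2 and bayenv2          Elevation      Ann-ppt        1        0
bayenv2 and bayenv2          Lat            Clay           1        0
bayenv2 and bayenv2          Longitude      GDD-May        1        0
bayenv2 and bayenv2          Max rad input  Longitude      1        0
bayenv2 and bayenv2          Tmax-July      Longitude      1        0
bayenv2 and bayenv2          Latitude       AWS0-50        1        0
bayenv2 and bayenv2          Max rad input  GDD-Aug        1        0
bayenv2 and bayenv2          Silt           GDD-May        1        0
bayenv2 and bayenv2          Max rad input  GDD-May        1        0
bayenv2 and bayenv2          Tmin-Jan       Sand           1        0
bayenv2 and bayenv2          Longitude      AWS0-50        1        0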
TABLE S3 Intersection of SNPs among methods and the number of large-effect SNPs within the intersection. Large-effect SNPs from univariate LMM were identified from a reduced threshold, $-\ln(p_{Wald}) \geq 10$.
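The overlap counts in Table S3 amount to set intersections between lists of focal SNPs. A minimal Python sketch of that bookkeeping follows. It assumes each method, trait or environment has been reduced to a set of SNP identifiers. The function name, the label format and the identifiers used with it are hypothetical.

```python
from itertools import combinations


def pairwise_overlap(snp_sets, large_effect=frozenset()):
    """Count shared SNPs for every pair of labelled SNP sets.

    snp_sets: dict mapping a label (method, trait or environment) to a set
    of SNP identifiers.
    large_effect: set of SNP identifiers flagged as large-effect loci.
    Returns {(label_a, label_b): (n_shared, n_shared_large_effect)}.
    """
    result = {}
    for a, b in combinations(sorted(snp_sets), 2):
        shared = snp_sets[a] & snp_sets[b]
        result[(a, b)] = (len(shared), len(shared & large_effect))
    return result
```

Each value in the returned dictionary corresponds to one row of the table: the first number is the overlap and the second is the number of large-effect loci within it.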
Table S4
Trait        $h^2$                     $Q_{ST}$                  PVE                       $N_{SNP}$
Bud Flush    0.0156 (0.0000-0.0634)*   0.3089 (0.1857-0.4603)    0.2565 (0.0193, 0.6541)   192 (112, 293)
13C          0.0427 (0.0001-0.1452)*   0.7787 (0.3873-1.0000)    0.2013 (0.0174, 0.5218)   190 (112, 293)
Height       0.0418 (0.0000-0.2376)*   0.0608 (0.0075-0.1171)    0.1750 (0.0156, 0.4701)   191 (112, 293)
15N          0.0191 (0.0000-0.2984)    0.3525 (0.0036-0.6838)    0.1379 (0.0138, 0.3951)   193 (112, 293)
Root:Shoot   0.0110 (0.0000-0.0736)    0.3240 (0.1219-0.5404)    0.3701 (0.0433, 0.7206)   194 (112, 293)
TABLE S4 Parameter estimates of the mean (95% credible intervals) from GEMMA, except for $h^2$ and $Q_{ST}$ (mean and 95% confidence interval, estimated in Maloney et al. in review). PVE = percent phenotypic variance explained by individual SNPs included in the multilocus model; $N_{SNP}$ = the number of SNPs underlying the trait.