
Inter-species transcriptomics of seagrasses
Franssen et al., Supplemental Material:
Contents:
Supplemental figures S1 – S9
Supplemental tables S1 – S4
Additional information on the transcriptome assembly for N. noltii
References
Fig. S1: Annual water temperature profiles in proximity to the northern and southern sampling
locations. Water temperatures were measured in the shallow subtidal at one meter depth. Temperatures ≥
25°C are marked with red shading. A) Hals, Kattegat/Baltic, Denmark. Surface water temperatures were
measured at Hals (Hals harbor, 56° 59.45’ N, 10° 18.49’ E; data obtained from Jens Sund Larsen). B)
Gabicce Mare, Adriatic, Italy. Surface water temperatures were measured every 2 weeks at Stn19
(Cattolica: 43° 58.29’ N, 12° 44.46’ E).
Fig. S2: Experimental design of the common-stress garden experiment. The Aquatron consisted of 12
experimental units (mesocosms). Half of the mesocosms were supplied with water of 26°C during the heat
wave (red mesocosms), while the other half was kept at 19°C (blue mesocosms). The mesocosms were
supplied with water from two storage tanks, which maintained the two water temperatures and were
interconnected to allow for water exchange between the two temperature circuits. Each mesocosm
held eight boxes, into which the seagrasses were planted, with two boxes for each population.
Boxes with plants from Z. marina populations (northern or southern population)
contained ~15 – 25 shoots from ~9 – 12 different clones per box. Boxes with plants from either N. noltii
population (northern or southern population), which have much smaller shoots, contained ~30 – 70 shoots
per box.
Fig. S3: Temperature profile of the heat wave simulation. Blue indicates the temperature of the control
treatment and red that of the heat stress treatment. Vertical lines mark the time points for RNA sampling
(time point: heat stress) and for the assessment of shoot abundance (time points: heat stress and recovery).
Fig. S4: Bioinformatics & expression analysis workflow of RNA-seq data.
Fig. S5: Heat map of expression profiles of 267 up-regulated genes during heat-stress in both Z. marina
populations. X-axis: columns display the four cDNA libraries from the various treatments for Z. marina;
y-axis: each row displays the expression strength of the respective A. thaliana ortholog for the four libraries.
Rows as well as columns are clustered by expression similarities with average linkage clustering.
Expression strength was scaled for each gene across libraries via z-scores. Values are color-coded
(white: highest expression strength; red: lowest expression strength). Populations: northern (N), southern
(S); heat treatment (H), control treatment (C).
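The per-gene scaling described in this caption (and reused for Fig. S7) can be illustrated with a minimal sketch. The function name and the example values are hypothetical, and the population standard deviation is an assumption; the original analysis may have used the sample standard deviation instead.

```python
from statistics import mean, pstdev

def zscore_rows(matrix):
    """Scale each row (gene) to mean 0 and unit standard deviation across columns (libraries)."""
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = pstdev(row)  # assumption: population SD; a sample SD would rescale the values
        # A gene with identical values in all libraries has no variation; map it to zeros.
        scaled.append([0.0 if sd == 0 else (x - mu) / sd for x in row])
    return scaled

# Hypothetical expression values for two genes in four libraries (N_C, N_H, S_C, S_H).
example = [[10.0, 30.0, 10.0, 30.0], [5.0, 5.0, 5.0, 5.0]]
scaled = zscore_rows(example)
```

After this scaling, colors in the heat map compare libraries within a gene rather than absolute expression between genes.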
Fig. S6: Multivariate grouping of the expression profiles of the 78 genes annotated with the functional term
“stress.abiotic.heat” in A) N. noltii, B) Z. marina using multidimensional scaling (MDS). MDS is based on
the pairwise distances between the libraries using the biological coefficient of variation (Robinson et al.
2010). Only the most variable 75% of the genes have been used for MDS analysis. Species: N. noltii (Nn),
Z. marina (Zm); populations: northern (N), southern (S); heat treatment (H), control treatment (C); max,
min: maximal, minimal expression for each of the 78 genes observed for the four respective libraries. B)
Groupings are indicated by color and supported by ANOSIM analysis (R=0.9545; P=0.018498).
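The ANOSIM R statistic quoted for panel B can be sketched as follows, assuming its usual definition: the difference between the mean rank of between-group and within-group distances, divided by n(n-1)/4. The distance matrix and group labels are hypothetical, and the permutation test that yields the P value is omitted; this is an illustration, not the analysis code used for the figure.

```python
from itertools import combinations

def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def anosim_r(dist, groups):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (n(n-1)/4)."""
    n = len(groups)
    pairs = list(combinations(range(n), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    r_w = sum(within) / len(within)
    r_b = sum(between) / len(between)
    return (r_b - r_w) / (n * (n - 1) / 4)

# Hypothetical distances between four libraries; the first two and last two form groups.
dist = [[0, 1, 5, 6], [1, 0, 7, 8], [5, 7, 0, 2], [6, 8, 2, 0]]
groups = ["C", "C", "H", "H"]
r_stat = anosim_r(dist, groups)
```

Values of R close to 1 indicate that distances between groups are consistently larger than distances within groups.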
Fig. S7: Heat map of expression profiles of 28 up-regulated genes during heat in the northern N. noltii
population. X-axis: columns display the four cDNA libraries from the various treatments for N. noltii; y-axis:
each row displays the expression strength of the respective A. thaliana ortholog for the four libraries.
Rows as well as columns are clustered by expression similarities with average linkage clustering.
Expression strength was scaled for each gene across libraries via z-scores. Values are color-coded
(white: highest expression strength; red: lowest expression strength). Populations: northern (N), southern
(S); heat treatment (H), control treatment (C).
Fig. S8: Change in shoot abundance over the time course of the experiment for A) Z. marina and B) N.
noltii. Changes in shoot abundance were measured as the percent difference relative to the start of
the experiment. Blue: (C) control temperature, red: (H) heat treatment. “heat wave”: time point in the
middle of the heat wave, “recovery”: one week after the end of the heat wave (see Fig. S3). “north”:
northern population, “south”: southern population.
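As a minimal sketch of the quantity plotted here, the percent change relative to the start of the experiment can be computed as below. The function name and the shoot counts are hypothetical.

```python
def percent_change(count_start, count_now):
    """Percent change in shoot count relative to the count at the start of the experiment."""
    return (count_now - count_start) / count_start * 100.0

# Hypothetical box that started with 20 shoots and held 25 at a later time point.
gain = percent_change(20, 25)
# Hypothetical box that started with 40 shoots and held 30 at a later time point.
loss = percent_change(40, 30)
```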
Fig. S9: Venn diagrams displaying overlaps of gene sets up-regulated in Z. marina during a heat wave
simulation, from two analogous experiments and two different analysis methods (FDR α < 0.05). A) The
complete set of differentially expressed genes. B) All genes annotated with the Mapman
category “stress.abiotic.heat”. Results based on expression data from Franssen et al. (2011) are displayed in
red and green; results from this study are displayed in blue. Gene identification via indicator analysis,
as used in Franssen et al. (2011), is displayed in red; the differential expression analysis using
populations as biological replicates, as used in this study (see methods), is displayed in green and blue.
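The regions of such a three-set Venn diagram can be derived from the gene sets with plain set operations. The sketch below uses placeholder gene labels rather than real identifiers; the variable names follow the three analyses named in the caption, and their contents are hypothetical.

```python
def venn_regions(a, b, c):
    """Return the seven mutually exclusive regions of a three-set Venn diagram."""
    return {
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
        "a_b": (a & b) - c,
        "a_c": (a & c) - b,
        "b_c": (b & c) - a,
        "a_b_c": a & b & c,
    }

# Hypothetical gene sets (placeholder labels, not results from the study).
indicator_2008 = {"g1", "g2", "g3"}
edger_2008 = {"g2", "g3", "g4"}
edger_2009 = {"g3", "g5"}
regions = venn_regions(indicator_2008, edger_2008, edger_2009)
```

Because the regions are mutually exclusive, their sizes sum to the size of the union of the three sets.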
Table S1:
Overview of the RNA-seq reads, assembly and annotation statistics. A) De novo assembly of the N. noltii
transcriptome based on reads sequenced with the Genome sequencer FLX, Titanium series (Roche, 454)
(Gu et al. 2012b). B) Read and mapping statistics of the 8 RNA-seq libraries sequenced with the Genome
Analyzer II (Illumina) for the 8 different experimental conditions.
A) Assembly

Nanozostera noltii | # reads | # contigs assembled | median length of contigs [bp] | reference proteome | # annotated contigs *1 | annotated contigs *1 [%] | # unique genes found *1 | average # of contigs mapping to unique reference gene *1
--- | --- | --- | --- | --- | --- | --- | --- | ---
Assembly | 950,784 | 55,891 | 518 | A. thaliana | 36,729 | 65.72 | 11,914 | 3.1
 | as above | as above | as above | O. sativa | 36,949 | 66.11 | 12,177 | 3.0

*1 Orthologous genes were identified via Blastx (e ≤ 0.0001) against the respective reference proteome.

B) Library (Illumina sequencing)

library | raw reads # | mapped reads to transcriptome assembly *2 # | % | mapped reads to A. thaliana *3 # | % | mapped reads to O. sativa *3 # | % | mapped reads to A. thaliana, to 8977 genes expressed in both species *2 # | %
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Zm_N_C | 10,178,959 | 8,271,470 | 81.3 | 7,455,500 | 73.2 | 7,460,419 | 73.3 | 7,065,344 | 69.4
Zm_N_H | 7,790,381 | 6,029,852 | 77.4 | 5,427,240 | 69.7 | 5,450,886 | 70.0 | 5,145,392 | 66.0
Zm_S_C | 10,824,307 | 8,970,083 | 82.9 | 7,992,766 | 73.8 | 8,030,046 | 74.2 | 7,559,174 | 69.8
Zm_S_H | 12,769,832 | 10,825,888 | 84.8 | 9,538,158 | 74.7 | 9,694,951 | 75.9 | 9,055,996 | 70.9
Nn_N_C | 8,841,109 | 6,485,421 | 73.4 | 5,229,704 | 59.2 | 5,237,390 | 59.2 | 4,857,479 | 54.9
Nn_N_H | 12,881,904 | 9,856,279 | 76.5 | 8,172,148 | 63.4 | 8,195,490 | 63.6 | 7,826,334 | 60.8
Nn_S_C | 13,159,520 | 9,751,223 | 74.1 | 7,608,659 | 57.8 | 7,627,629 | 58.0 | 7,273,175 | 55.3
Nn_S_H | 11,871,096 | 9,081,939 | 76.5 | 7,374,264 | 62.1 | 7,399,228 | 62.3 | 7,078,692 | 59.6
average Zm | 10,390,870 | 8,524,323 | 81.6 | 7,603,416 | 73.2 | 7,659,076 | 73.7 | 7,206,477 | 69.4
average Nn | 11,688,407 | 8,793,716 | 75.1 | 7,096,194 | 60.7 | 7,114,934 | 60.9 | 6,758,920 | 57.8
sum | 88,317,108 | 69,272,155 | 78.4 | 58,798,439 | 66.6 | 59,096,039 | 66.9 | 55,861,586 | 63.3

*3 Reads were mapped via two-stage mapping: 1st stage: reads to the respective transcriptome assembly *2; 2nd stage: contigs to the reference proteome.
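The derived columns of Table S1 follow from simple ratios of the counts. The sketch below recomputes three of them from values given in the table (library Zm_N_C and the A. thaliana annotation row); the helper name `percent` is hypothetical.

```python
def percent(part, total, digits=1):
    """Return part as a percentage of total, rounded to the given number of digits."""
    return round(part / total * 100.0, digits)

# Counts copied from Table S1.
raw_reads = 10_178_959          # Zm_N_C raw reads
mapped_to_assembly = 8_271_470  # Zm_N_C reads mapped to the transcriptome assembly
contigs_assembled = 55_891      # N. noltii contigs
annotated_contigs = 36_729      # contigs annotated against A. thaliana
unique_genes = 11_914           # unique A. thaliana genes found

mapped_pct = percent(mapped_to_assembly, raw_reads)               # share of raw reads mapped
annotated_pct = percent(annotated_contigs, contigs_assembled, 2)  # share of contigs annotated
contigs_per_gene = round(annotated_contigs / unique_genes, 1)     # contigs per unique gene
```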
Table S2:
List of all 8977 investigated genes including their functional annotation and the information whether they
are differentially expressed with respect to a certain factor (FDR α < 0.05).
Table S2 is provided as a separate Excel file because it is too large to include here.
Table S3:
Change in shoot abundance for Z. marina and N. noltii. The normalized change in shoot abundance (with
respect to T0) was fit to a generalized linear model with the additive effects: treatment, population and
time point. Treatment levels: control temperature (C) and heat treatment (H). Populations: northern (N)
and southern (S) population. Time points: acute heat in the middle of the heat wave (T3) and post-stress,
~1.5 weeks after the end of the heat wave (T6). Percentages given in brackets are the corresponding
effects in percent change compared to time point T0. (Note: Intercept estimates cannot be interpreted at
face value because data were transformed to yield positive values (subtraction of the smallest value) prior
to statistical testing.)
Z. marina
Coefficients:
              Estimate             Std. Error   z value    Pr(>|z|)
(Intercept)    2.30108 (8.9%)      0.06426       35.808    < 2e-16 ***
treatmentH    -0.1359 (-0.5%)      0.06286       -2.162    0.0306 *
populationS    0.43753 (1.7%)      0.06422        6.813    9.57e-12 ***
timepointT6   -0.25106 (-1.0%)     0.06321       -3.972    7.13e-05 ***

N. noltii
Coefficients:
              Estimate             Std. Error   z value    Pr(>|z|)
(Intercept)    3.64095 (4.1%)      0.03554      102.445    < 2e-16 ***
treatmentH    -0.16025 (-0.2%)     0.03834       -4.18     2.91e-05 ***
populationS   -0.05039 (-0.1%)     0.03823       -1.318    0.187
timepointT6   -0.42009 (-0.5%)     0.03906      -10.755    < 2e-16 ***
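The z values and P values in Table S3 are related to the estimates and standard errors as in a standard Wald test. The sketch below reproduces the Z. marina treatmentH row under that assumption; the function name is hypothetical, and the two-sided P value is computed from the standard normal distribution.

```python
import math

def wald_z_test(estimate, std_error):
    """Return the Wald z value and its two-sided P value under a standard normal."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Estimate and standard error of treatmentH for Z. marina, copied from Table S3.
z, p = wald_z_test(-0.1359, 0.06286)
```

Rounded to the precision of the table, this gives the reported z value of -2.162 and a P value of about 0.0306.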
Table S4:
Indicator genes for heat stress expression in Z. marina. Overlaps of gene sets up-regulated in Z. marina
during a heat wave simulation from two analogous experiments (2008, 2009: heat wave experiments
described in Franssen et al. 2011 and this study, respectively) and different methods applied (indicator:
indicator analysis described in Franssen et al. 2011, edgeR: differential expression analysis applied in
this study) along with their functional annotation. (1 indicates that the respective gene was
identified as up-regulated during the heat wave, while 0 indicates that it was not
identified as such.)
Fields per gene: ortholog gene id; functional category (Mapman); description; 2008, indicator

at3g30775
  functional category (Mapman): amino acid metabolism.degradation.glutamate family.proline
  description: Symbols: ERD5, PRODH, AT-POX, ATPOX, ATPDH, PRO1 | ERD5 (EARLY RESPONSIVE TO DEHYDRATION 5); proline dehydrogenase | chr3:12448636-12451248 REVERSE
  2008, indicator: 1
at5g02500
  functional category (Mapman): stress.abiotic.heat / protein.folding
  description: Symbols: HSC70-1, HSP70-1, AT-HSC70-1, HSC70 | HSC70-1 (HEAT SHOCK COGNATE PROTEIN 70-1); ATP binding | chr5:553745-556442 REVERSE
  2008, indicator: 1
at4g28480
  functional category (Mapman): stress.abiotic.heat
  description: DNAJ heat shock family protein | chr4:14073042-14075271 FORWARD
  2008, indicator: 1
at5g56030
  functional category (Mapman): stress.abiotic.heat
  description: Symbols: HSP81-2, ERD8, HSP90.2 | HSP81-2 (HEAT SHOCK PROTEIN 81-2); ATP binding | chr5:22686802-22689650 FORWARD
  2008, indicator: 1
at1g08450
  functional category (Mapman): signalling.calcium
  description: Symbols: CRT3 | CRT3 (CALRETICULIN 3); calcium ion binding / unfolded protein binding | chr1:2667825-2671832 REVERSE
  2008, indicator: 1
at5g54190
  functional category (Mapman): tetrapyrrole synthesis.protochlorophyllide reductase
  description: Symbols: PORA | PORA; oxidoreductase/ protochlorophyllide reductase | chr5:21990999-21992812 REVERSE
  2008, indicator: 0
at5g28540
  functional category (Mapman): stress.abiotic.heat
  description: Symbols: BIP1 | BIP1; ATP binding | chr5:10540460-10543375 REVERSE
  2008, indicator: 0
at4g24280
  functional category (Mapman): stress.abiotic.heat
  description: Symbols: cpHsc70-1 | cpHsc70-1 (chloroplast heat shock protein 70-1); ATP binding | chr4:12589988-12593630 FORWARD
  2008, indicator: 0
at2g42220
  functional category (Mapman): misc.rhodanese
  description: rhodanese-like domain-containing protein | chr2:17592038-17593500 FORWARD
  2008, indicator: 0
at4g24190
  functional category (Mapman): stress.abiotic.heat
  description: Symbols: SHD, HSP90.7 | SHD (SHEPHERD); ATP binding / unfolded protein binding | chr4:12551717-12555909 REVERSE
  2008, indicator: 1
at4g16660
  functional category (Mapman): stress.abiotic.heat
  description: heat shock protein 70, putative / HSP70, putative | chr4:9376737-9381507 FORWARD
  2008, indicator: 1
at1g77670
  functional category (Mapman): secondary metabolism.phenylpropanoids
  description: aminotransferase class I and II family protein | chr1:29188901-29190975 REVERSE
  2008, indicator: 1

2008, edgeR; 2009, edgeR
1
0
1
1
0
0
1
0
1
0
1
1
1
1
1
1
1
1
0
1
0
1
1
1
De novo transcriptome assembly of N. noltii
The de novo transcriptome assembly was originally reported by Gu et al. (2012b). Details on the
sequenced libraries and the assembly are given below:
Sequencing
To establish a reference de novo transcriptome assembly for N. noltii, two sequencing libraries
were constructed by pooling genotypes of the northern and southern populations sampled from several
time points during the heat stress experiment (7.2 and 14 μg total RNA, respectively). Total RNA was
DNase-treated and first-strand cDNA synthesis was performed using oligo(dT) priming followed by eight
PCR cycles. cDNA normalization was performed to reduce highly expressed transcripts, followed by eight
PCR amplification cycles. Libraries were tagged and sequenced on one slide with the 454 Genome
Sequencer FLX using Titanium chemistry (Roche / 454 Life Sciences). All cDNA library construction and
sequencing were performed by GATC Biotech (Konstanz, Germany).
Data preprocessing & de novo assembly
Adapter and primer contaminations in the reads sequenced by 454 Titanium sequencing were identified
and removed using CROSSMATCH (http://www.phrap.org/). Cleaned reads were used for de novo assembly
of a N. noltii reference transcriptome using MIRA v.3.2.0 (Chevreux et al. 2004).
Characteristics of the transcriptome assembly
The reference transcriptome for N. noltii was assembled from a total of 850,359 reads obtained by 454
sequencing and yielded 55,891 contigs (Gu et al. 2012a). Annotation of the contigs based on similarity to
the proteomes of the reference plant species, Arabidopsis thaliana and Oryza sativa, identified 11,914 and
12,144 orthologs, respectively (Table S1A). This corresponds to an ortholog identification of 43.5% of all
A. thaliana genes (total 27,379 protein-coding genes, TAIR9; Swarbreck et al. 2008) and 21.4% of all O.
sativa genes (total 56,797 protein-coding genes, Rice Genome Annotation Project; Ouyang et al. 2007). A
previous underrepresentation analysis of Gene Ontology terms (Ashburner et al. 2000) of the assembled
transcripts suggested that sufficient gene coverage was accomplished (Gu et al. 2012a).
References:
Ashburner M, Ball CA, Blake JA et al. (2000) Gene ontology: tool for the unification of biology. The Gene
Ontology Consortium. Nat Genet, 25, 25–29.
Chevreux B, Pfisterer T, Drescher B et al. (2004) Using the miraEST assembler for reliable and automated
mRNA transcript assembly and SNP detection in sequenced ESTs. Genome Res, 14, 1147–1159.
Franssen SU, Gu J, Bergmann N et al. (2011) Transcriptomic resilience to global warming in the seagrass
Zostera marina, a marine foundation species. Proc Natl Acad Sci U S A, 108, 19276–19281.
Gu J, Weber K, Klemp E et al. (2012a) Identifying core features of adaptive metabolic mechanisms for
chronic heat stress attenuation contributing to systems robustness. Integr. Biol.
Gu J, Weber K, Klemp E et al. (2012b) Identifying core features of adaptive metabolic mechanisms for
chronic heat stress attenuation contributing to systems robustness. Integrative biology:
quantitative biosciences from nano to macro, 4, 480–493.
Ouyang S, Zhu W, Hamilton J et al. (2007) The TIGR Rice Genome Annotation Resource: improvements
and new features. Nucleic Acids Research, 35, D883–887.
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression
analysis of digital gene expression data. Bioinformatics, 26, 139–140.
Swarbreck D, Wilks C, Lamesch P et al. (2008) The Arabidopsis Information Resource (TAIR): gene
structure and function annotation. Nucleic Acids Res, 36, 1009–1014.