Plant Physiology Preview. Published on April 28, 2017, as DOI:10.1104/pp.16.01983

Short title: Concerted divergence of duplicated genes

Concerted divergence after gene duplication in Polycomb Repressor complexes

Yichun Qiu1, Shao-Lun Liu2, and Keith L. Adams1

1Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
2Department of Life Science, Tunghai University, Taichung, Taiwan

Corresponding author: Keith L. Adams, E-mail: [email protected]

Summary: FIS2 and MEA have diverged in concert after simultaneous gene duplication,
resulting in functional divergence of the PRC2-complexes in Brassicaceae, which is a novel fate
for duplicated genes whose products act in complexes.

Author contributions: Y.Q. and K.L.A. designed the research. Y.Q. and S.-L.L. performed the experiments and analyzed the data. Y.Q. and K.L.A. wrote the manuscript.

Funding information: This work was supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (to KLA), a Postgraduate Fellowship from NSERC (to YQ), and grants from the Ministry of Science and Technology, Taiwan (to SLL).

Downloaded from on June 18, 2017 - Published by www.plantphysiol.org
Copyright © 2017 American Society of Plant Biologists. All rights reserved.
Copyright 2017 by the American Society of Plant Biologists
Abstract
Duplicated genes are a major contributor to genome evolution and phenotypic novelty. There are multiple possible evolutionary fates of duplicated genes. Here we provide an example of concerted divergence of simultaneously duplicated genes whose products function in the same complex. We studied PRC2 (Polycomb Repressive Complex 2) in Brassicaceae. The VRN-PRC2 complex contains VRN2 and SWN, and both genes were duplicated during a whole genome duplication to generate FIS2 and MEA, which function in the Brassicaceae-specific FIS-PRC2 complex that regulates seed development. We examined expression of FIS2, MEA, and their paralogs, compared their cytosine and histone methylation patterns, and analyzed the sequence evolution of the genes. We found that FIS2 and MEA have reproductive-specific expression patterns that are correlated and derived from the broadly expressed VRN2 and SWN in outgroup species. In vegetative tissues of Arabidopsis, repressive methylation marks are enriched in FIS2 and MEA, whereas active marks are associated with their paralogs. We detected comparable accelerated amino acid substitution rates in FIS2 and MEA but not in their paralogs. We also show divergence patterns of the PRC2-associated VEL2 that are similar to those of FIS2 and MEA. These lines of evidence indicate that FIS2 and MEA have diverged in concert, resulting in functional divergence of the PRC2 complexes in Brassicaceae. This type of concerted divergence is a previously unreported fate of duplicated genes. In addition, the Brassicaceae-specific FIS-PRC2 complex modified the regulatory pathways in female gametophyte and seed development.

Introduction
Duplicated genes are continuously formed during evolution by various types of gene duplication events in eukaryotes, and they can affect morphological and physiological evolution (reviewed in Van de Peer et al., 2009; Soltis and Soltis, 2016). Gene duplication can happen at small scales, such as tandem duplication, segmental duplication, and duplicative retroposition. The largest scale of gene duplication is whole genome duplication (WGD), which gives rise to thousands of duplicated gene pairs. The genetic model plant, Arabidopsis thaliana, has experienced five rounds of WGD in the evolutionary history of seed plants (Jiao et al., 2011; Li et al., 2015). The most recent polyploidy event, the alpha WGD, is specific to the Brassicaceae family and took place after its divergence from the closest sister family, Cleomaceae (Schranz and Mitchell-Olds, 2006). About 2500 pairs of duplicated genes retained from this WGD remain in the Arabidopsis thaliana genome (Blanc et al., 2003; Bowers et al., 2003).

Fates of duplicated genes vary during evolutionary history. One duplicate may eventually be lost or become a pseudogene, so that the once duplicated pair returns to single-copy status. Several mechanisms drive the retention of both copies. Duplicated pairs could preserve similar functions to maintain dosage balance (Birchler et al., 2005; Coate et al., 2016). Duplicated pairs can also diverge through subfunctionalization or neofunctionalization, in which the two duplicated genes divide the ancestral function or gain a novel function, respectively (Force et al., 1999; Moore and Purugganan, 2005). These types of divergence can also be inferred from expression patterns. For example, regulatory subfunctionalization refers to the case in which the two duplicates together make up the pre-duplication expression profile, and regulatory neofunctionalization indicates that one or both copies have gained a new expression pattern (Duarte et al., 2006; Liu et al., 2011). Sometimes these processes are difficult to distinguish, and there can be a combination of different mechanisms, such as sub-neofunctionalization (He and Zhang, 2005).

There are many protein complexes whose members are encoded by different gene families. If multiple components in a complex are duplicated simultaneously, such as in a whole genome duplication, the doubled components could redundantly cross-interact, or go on to experience subsequent divergence (Capra et al., 2012; Aarke et al., 2015). Thus a type of co-evolution between the interacting gene products is hypothetically possible, but has not been described in the plant kingdom. Extending the concept of concerted divergence, which has been discussed in the context of co-expression patterns of duplicated genes in the same metabolic or regulatory pathways (Blanc and Wolfe, 2004), we here propose an evolutionary scenario in which simultaneous duplication of two genes whose products function together in a complex, followed by parallel evolution and divergence of each derived gene, can lead to functional divergence of the complexes.

In this study we focus on genes in PRC2 (Polycomb Repressive Complex 2) complexes in Brassicaceae species as a potential example of the proposed scenario. These complexes are histone modifiers that regulate gene expression primarily by tri-methylation of lysine 27 on histone H3 (H3K27me3) associated with target genes, which leads to transcriptional repression (Hennig and Derkacheva, 2009; Mozgova et al., 2015). One type of PRC2, the VRN-complex, which is present across all rosids, regulates vegetative tissue differentiation and, more importantly, the vernalization process that controls flowering time in Arabidopsis (Chen et al., 2009; Hennig and Derkacheva, 2009; Mozgova et al., 2015). The complex also represses autonomous seed coat development (Roszak and Kohler, 2011). The VRN-complex consists of four subunits: REDUCED VERNALIZATION RESPONSE 2 (VERNALIZATION2, VRN2), SET DOMAIN-CONTAINING PROTEIN 10 (SWINGER, SWN), and two WD-40 repeat proteins that act as the scaffold of the complex, FERTILIZATION-INDEPENDENT ENDOSPERM (FIE) and MULTICOPY SUPPRESSOR OF IRA1 (MSI1). In Brassicaceae, the alpha WGD gave rise to a duplication of VRN2 to create its paralog FERTILIZATION INDEPENDENT SEED 2 (FIS2) and a duplication of SWN to create its paralog SET DOMAIN-CONTAINING PROTEIN 5 (MEDEA, MEA) (Fig. 1; Luo et al., 2009; Spillane et al., 2007). Substituting for their paralogous proteins, FIS2 and MEA, together with FIE and MSI1 (the alpha WGD paralogs of these two genes were lost), make up a new Brassicaceae-specific PRC2, referred to as the FIS-complex (Fig. 1). The FIS-complex functions in gametophyte and seed development, preventing female gamete proliferation before fertilization, and facilitating endosperm cellularization after fertilization (Hennig and Derkacheva, 2009). A typical fis phenotype, caused by loss-of-function mutations in FIS2, MEA (also known as FIS1) or FIE (also known as FIS3), shows fertilization-independent embryogenesis, and other types of mutants have abnormal seed development or even no seeds at all (Hennig and Derkacheva, 2009).

The observed divergence in the functions of the two kinds of PRC2 complexes leads to the hypothesis that FIS2 and MEA have diverged in a concerted way to give rise to the FIS-complex. This study aimed to evaluate this hypothesis by examining expression patterns, DNA and histone methylation, and rates of sequence evolution in both genes compared with their paralogs. We found evidence for parallel divergence of FIS2 and MEA from their paralogs in multiple ways that have accompanied functional divergence of the two complexes. This study supports a model of concerted divergence of simultaneously duplicated genes whose products function in a complex. This is a previously unreported fate of duplicated genes.

Results
FIS2 and MEA have specific and similar expression patterns in reproductive organs
FIS2 and MEA were formed by the alpha whole genome duplication, which is specific to the Brassicaceae family and occurred after the divergence of the Brassicaceae lineage from the Caricaceae lineage. After gene duplication, duplicated genes may experience expression divergence. We analyzed microarray data from Arabidopsis thaliana to compare the expression profiles of the paralogous interacting gene pairs FIS2/VRN2 and MEA/SWN. We obtained two sets of ATH1 microarray data and analyzed them separately: 63 different organ types and developmental stages (Schmid et al., 2005), referred to hereafter as the ADA (Arabidopsis developmental atlas) dataset, and 42 different tissue types during seed developmental stages (Le et al., 2010), referred to as the ASA (Arabidopsis seed atlas) dataset. We first calculated the expression specificity (τ) of the four genes as defined by Yang and Gaut (2011). VRN2 and SWN have expression specificity values of 0.19 and 0.17, respectively, indicating that both genes are broadly expressed in nearly all organ types included in the ADA dataset. In contrast, FIS2 and MEA have expression specificity values of 0.70 and 0.63, respectively, indicating organ-specific expression patterns. We observed that the expression of FIS2 and MEA is restricted to flowers and siliques, and the absence of vegetative expression explains the high expression specificity. Yang and Gaut (2011) analyzed the ADA dataset and found that recent whole-genome duplicates have a median τ close to 0.2. Thus the values we observed for FIS2 and MEA are quite high, and those for VRN2 and SWN are about average.

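As an illustration of how an expression specificity index of this kind can be computed, the following minimal sketch assumes the commonly used form of τ, in which each tissue's expression is scaled by the gene's maximum expression: τ = Σ(1 − x_i/x_max)/(n − 1). Whether this matches every detail of the definition in Yang and Gaut (2011), such as any transformation of the microarray intensities, is an assumption, and the two profiles are hypothetical values rather than data from this study.

```python
def tau(expression):
    """Expression specificity: 0 for uniform expression, 1 for a single tissue.

    Assumed form: tau = sum(1 - x_i / x_max) / (n - 1).
    """
    values = [max(float(v), 0.0) for v in expression]  # clip negative signals
    if len(values) < 2:
        raise ValueError("need at least two tissues")
    peak = max(values)
    if peak == 0.0:
        raise ValueError("gene is not expressed in any tissue")
    return sum(1.0 - v / peak for v in values) / (len(values) - 1)


# Hypothetical profiles across five organ types (not data from the study).
broad = [9.0, 10.0, 9.5, 10.0, 9.0]        # expressed everywhere
restricted = [0.0, 0.0, 0.0, 8.0, 10.0]    # expressed in two organs only

print(round(tau(broad), 4))
print(round(tau(restricted), 4))
```

A broadly expressed profile yields a value near 0 and a restricted profile a value near 1, which is the contrast described above for VRN2/SWN versus FIS2/MEA.
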
We also analyzed expression specificity (τ) in the ASA dataset (Fig. 2A). Similarly, the τ of FIS2 is 0.48 and that of MEA is 0.56, while VRN2 has a τ of 0.21 and SWN has 0.22. FIS2 and MEA thus also show more tissue-specific expression within seed tissues. Breaking down the ASA data, we observed that FIS2 and MEA tend to be expressed in the triploid endosperm rather than in the diploid embryo or the maternally derived seed coat. A 1000-replicate permutation test indicated that the differences in expression specificity in the FIS2-MEA and VRN2-SWN comparisons are not significant (Fig. S1), consistent with concerted divergence in their expression profiles. In contrast, tissue specificity differs significantly within the two duplicated pairs, VRN2-FIS2 and SWN-MEA, indicative of their regulatory divergence.

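A permutation test of the difference in τ between two genes could be sketched as follows. The permutation scheme is not described in this section, so the one used here, randomly swapping the two genes' values within each tissue, is an assumption, as are the function names, the fixed random seed, and the convention that an all-zero profile has τ = 0.

```python
import random


def tau(values):
    """Assumed specificity index: sum(1 - x_i / x_max) / (n - 1)."""
    peak = max(values)
    if peak == 0:
        return 0.0  # convention chosen here for an all-zero shuffled profile
    return sum(1.0 - v / peak for v in values) / (len(values) - 1)


def permutation_p_value(profile_a, profile_b, replicates=1000, seed=0):
    """P-value for |tau_a - tau_b| under random within-tissue label swaps."""
    rng = random.Random(seed)
    observed = abs(tau(profile_a) - tau(profile_b))
    at_least_as_extreme = 0
    for _ in range(replicates):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(profile_a, profile_b):
            if rng.random() < 0.5:  # swap the gene labels in this tissue
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(tau(shuffled_a) - tau(shuffled_b)) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (replicates + 1)


# Hypothetical profiles (not data from the study).
gene_x = [0.0, 0.0, 0.0, 8.0, 10.0]
gene_y = [9.0, 10.0, 9.5, 10.0, 9.0]
print(permutation_p_value(gene_x, gene_y))
```

A large p-value means the observed difference in τ is no greater than expected when the gene labels are exchangeable, which is how a non-significant FIS2-MEA or VRN2-SWN comparison would be read.
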
In addition to analyzing the expression index for each gene individually, we performed a correlation test to examine the association among the expression profiles of the four genes, as their products function in a complex (Fig. 2B). We found that the expression patterns of FIS2 and MEA are positively correlated in both the ADA and ASA datasets, and that the broadly expressed VRN2 and SWN are likewise co-expressed. However, the expression of both FIS2 and MEA is negatively correlated with the expression of VRN2 and SWN. The negative coefficients are around -0.5 (Fig. 2B), which falls in the lowest 1% of all alpha whole-genome duplicate pairs analysed by Blanc and Wolfe (2004). Overall, the FIS2-MEA expression patterns indicate parallel divergence from the VRN2-SWN expression patterns in a concerted manner.

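The correlation between two expression profiles can be illustrated with a plain Pearson coefficient. The choice of Pearson correlation is an assumption, since the coefficient used is not named in this section, and the three profiles below are hypothetical values chosen only to show a positively and a negatively correlated pair.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical profiles over the same five organ types (not study data).
fis2_like = [0.0, 0.0, 0.0, 8.0, 10.0]   # reproductive organs only
mea_like = [0.0, 0.0, 1.0, 7.0, 9.0]     # similar restricted pattern
vrn2_like = [10.0, 9.0, 10.0, 6.0, 5.0]  # broad, lower in reproductive organs

print(round(pearson_r(fis2_like, mea_like), 3))
print(round(pearson_r(fis2_like, vrn2_like), 3))
```
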
FIS2 and MEA acquired new expression patterns
As the microarray data from the Arabidopsis thaliana developmental expression atlas indicated, FIS2 and MEA both have an expression pattern that is restricted to reproductive organs, such as flowers and siliques, and is absent from vegetative organs, including roots, stems, and leaves. We confirmed this result with RT-PCR (Fig. 3). In contrast, their paralogs, VRN2 and SWN, have a broad expression pattern in both vegetative and reproductive organs, and they were expressed in all organ types examined in our RT-PCR assays (Fig. 3). To infer the ancestral expression pattern of the two gene pairs, we assayed the expression pattern of orthologs in Tarenaya hassleriana (formerly known as Cleome spinosa), Carica papaya and Vitis vinifera. Among species with sequenced genomes, Tarenaya belongs to Cleomaceae, the sister group most closely related to Brassicaceae. Although Tarenaya has undergone its own genome triplication after the divergence between Cleomaceae and Brassicaceae (Cheng et al., 2013), only a single copy each of the orthologous VRN2 and SWN has been retained. Carica is also in the order Brassicales. Vitis was chosen because its lineage has not experienced any whole genome duplication events since the gamma WGD during early eudicot evolution; the same applies to Carica, and thus genes are frequently single copy in these taxa. These single-copy orthologs can facilitate the inference of the ancestral expression pattern. We confirmed that these sequences are true orthologs of FIS2/VRN2 and MEA/SWN by phylogenetic analysis of the gene families.

For both the FIS2/VRN2 and MEA/SWN pairs, the orthologs in Tarenaya, Carica and Vitis are expressed in all examined organ types, as are VRN2 and SWN in Arabidopsis (Fig. 3). The absence of expression in vegetative organs is observed only for FIS2 and MEA. Collectively, we inferred that the pre-duplication expression state is likely to have been a broad expression pattern, which is reflected by VRN2 and SWN. The Brassicaceae FIS2 and MEA both lost expression in vegetative organs and became specifically expressed in reproductive organs.

FIS2 and MEA acquired novel epigenetic modifications
The epigenetic features of cytosine methylation and histone methylation are often associated with expression or silencing of genes. To examine the patterns of cytosine and histone methylation in organ types where expression of FIS2 and MEA was lost, we investigated the epigenetic variation among these genes in vegetative tissues, including leaves, roots and seedlings, of Arabidopsis thaliana (see Methods for details). For DNA methylation, we found that cytosine methylation at CpG sites is enriched in the promoter region (defined as 1500 bp upstream of the transcription start site) of the FIS2 genomic sequence, but not in the gene body (Fig. 4). The opposite is found for VRN2, in which the promoter region is unmarked but the gene body is highly methylated (Fig. 4). The same divergence of DNA methylation was found for MEA and SWN (Fig. 4): cytosine methylation is enriched in the promoter region of MEA but only in the gene body of SWN. EMF2 and CLF, the more distant paralogs of VRN2 and SWN, respectively, also show gene body enrichment of DNA methylation, the same as VRN2 and SWN, suggesting that the pattern of DNA methylation for FIS2 and MEA changed after duplication. As promoter cytosine methylation is associated with transcriptional repression, and gene body methylation is indicative of active expression (Suzuki and Bird, 2008), this finding is consistent with the expression data. We did not examine methylation patterns in whole endosperm because in the ASA dataset FIS2 and MEA showed variable expression patterns in different parts of the endosperm and at different developmental stages.

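The promoter versus gene-body comparison can be sketched as a simple partition of per-site methylation levels around the transcription start site. Only the 1500 bp promoter definition comes from the text; the input format (position, methylation fraction), the definition of the gene body as the interval from the transcription start to the transcript end, the function name, and the example coordinates are all assumptions made for illustration.

```python
def region_methylation(sites, tss, tes, strand="+", promoter_length=1500):
    """Return (promoter_mean, gene_body_mean) of CpG methylation levels.

    sites: (position, level) pairs with level in [0, 1].
    tss, tes: transcription start and transcript end coordinates.
    A region containing no sites yields None.
    """
    sites = list(sites)  # allow any iterable, since it is scanned twice
    if strand == "+":
        promoter = (tss - promoter_length, tss - 1)
        body = (tss, tes)
    elif strand == "-":
        promoter = (tss + 1, tss + promoter_length)  # upstream is to the right
        body = (tes, tss)
    else:
        raise ValueError("strand must be '+' or '-'")

    def mean_level(lo, hi):
        levels = [level for pos, level in sites if lo <= pos <= hi]
        return sum(levels) / len(levels) if levels else None

    return mean_level(*promoter), mean_level(*body)


# Hypothetical CpG sites for a plus-strand gene spanning 1000..2000.
example_sites = [(900, 0.9), (950, 0.8), (1200, 0.0), (1500, 0.1)]
print(region_methylation(example_sites, tss=1000, tes=2000))
```

In this toy example the promoter mean is high and the gene-body mean is low, the pattern reported above for FIS2 and MEA; the reverse would correspond to VRN2 and SWN.
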
We also examined histone methylation in the region of these genes in seedlings of Arabidopsis thaliana based on the data generated by Roudier et al. (2011). Similar to the DNA methylation results, we found that VRN2, SWN, EMF2 and CLF have the same types of histone methylation, which differ from those of FIS2 and MEA (Table 1). FIS2 and MEA lost H3K4me3, which is shared by all the other genes, and instead gained a novel H3K27me3 mark. H3K4me3 is an activating mark, while its antagonistic mark H3K27me3 is repressive. This could help explain the expression of VRN2, SWN, EMF2 and CLF in vegetative tissue and the lack of expression of FIS2 and MEA. It is also notable that in the fie mutant, where PRC2 function is expected to be abolished, FIS2 and MEA lost their H3K27me3, whereas VRN2, SWN, EMF2 and CLF became marked by H3K27me3 (Bouyer et al., 2011). As H3K27me3 is regulated by PRC2 complexes, this finding suggests self- and cross-regulation among these genes. In both the DNA and histone modification comparative analyses, we observed convergent evolution of epigenetic features in FIS2 and MEA, divergent from their pre- and post-duplication paralogs.

Gene structural changes in FIS2 and MEA
FIS2 formed from VRN2 by duplication, and MEA from SWN, during the alpha whole genome duplication. Compared to VRN2, FIS2 in Arabidopsis thaliana lost three exons, called the E15-17 region (named for the 15th to 17th exons of Arabidopsis thaliana EMF2, not of VRN2) (Fig. S2A; Chen et al., 2009). FIS2 has a large serine-rich domain that is not shared with any other VEF genes in any species, indicating gain of the domain in Brassicaceae (Fig. S2A; Chen et al., 2009). Our sequence analysis showed that the serine-rich domain is highly variable among FIS2 sequences from different Brassicaceae species (Fig. S2A). The lost E15-17 region and the gained serine-rich domain both neighbour the VEF domain that interacts with the C5 domain in MEA.

MEA is about 150 aa shorter than SWN. The deleted region lies just downstream of the C5 domain, which interacts with the VEF domain in FIS2, and results from a large contraction of a single exon (the 9th in Arabidopsis thaliana MEA and SWN) in a region where Brassicaceae SWN and orthologous SWN-like sequences are not conserved (Fig. S2B). How the structural changes affect the physical interaction of FIS2 and MEA remains to be tested. In addition to the rearrangement of functional domains, the shared domains show different levels of amino acid sequence divergence. In contrast, VRN2/EMF2-like sequences and SWN/CLF-like sequences show relative conservation across all flowering plants in amino acid sequences and functional domains (Chen et al., 2009; Qian et al., 2014).

FIS2 and MEA show accelerated amino acid substitution rates and evidence for positive selection
Duplicated genes diverge not only in expression pattern but also in sequence. We first performed Ka/Ks analysis on the full-length coding regions of the FIS2, VRN2, MEA, and SWN genes (Fig. S3). The Brassicaceae FIS2 clade had a much higher average Ka/Ks than the VRN2 lineages: 3.5-fold greater than the paralogous Brassicaceae VRN2 clade, and 10-fold greater than the orthologous pre-duplication VRN2 sequences. Similarly, the Brassicaceae MEA clade had a high average Ka/Ks comparable to that of the FIS2 clade, 3.5-fold greater than the paralogous Brassicaceae SWN clade and 4.5-fold greater than the orthologous pre-duplication SWN sequences. We implemented different models assuming similar vs. different Ka/Ks ratios in these clades, as described in the Methods section, and likelihood ratio tests indicated that the differences in Ka/Ks among clades are significant (Table S1). These analyses indicate that while the paralogous Brassicaceae VRN2 and SWN lineages are under stronger purifying selection, along with the orthologous genes in outgroup species, FIS2 and MEA in the Brassicaceae have experienced relaxation of purifying selection. Asymmetric Ka/Ks ratios are seen in a minority of duplicated gene pairs in Arabidopsis thaliana; for example, Gossmann and Schmid (2011) estimated that 7% of the duplicated pairs they analyzed have asymmetric Ka/Ks ratios.

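A likelihood ratio test between two nested Ka/Ks models, such as one shared ratio versus separate ratios for two clades, compares twice the difference in log-likelihood with a chi-square distribution. The sketch below is only illustrative: the log-likelihood values are hypothetical, the function name is invented, and the degrees of freedom (the number of extra free parameters in the richer model) must be taken from the actual models, which are described in the Methods rather than here. Only the closed-form cases of one and two degrees of freedom are covered.

```python
import math


def likelihood_ratio_test(lnl_null, lnl_alt, df=1):
    """Return (statistic, p_value) for nested models.

    statistic = 2 * (lnL_alt - lnL_null), compared with chi-square(df).
    Only df = 1 and df = 2 are supported, using closed-form tail areas.
    """
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)  # guard against rounding
    if df == 1:
        p_value = math.erfc(math.sqrt(statistic / 2.0))
    elif df == 2:
        p_value = math.exp(-statistic / 2.0)
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return statistic, p_value


# Hypothetical log-likelihoods (not values from Table S1).
one_ratio_lnl = -5230.4    # a single Ka/Ks ratio for both clades
two_ratio_lnl = -5221.9    # separate Ka/Ks ratios for the two clades
stat, p = likelihood_ratio_test(one_ratio_lnl, two_ratio_lnl, df=1)
print(round(stat, 2), p)
```

A small p-value favours the model with separate Ka/Ks ratios, which is the sense in which the divergence in rates between clades is called significant.
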
Additionally, among the branch-wise Ka/Ks values of specific FIS2 and MEA sequences, we detected possible positive selection, indicated by Ka/Ks greater than one, acting on the sequences from certain lineages (Fig. S3). To distinguish amino acid sites evolving under positive selection from relaxed purifying selection, we also applied a branch-site model, which suggested that the branches leading to Arabidopsis FIS2 (P<0.0001) and MEA (P=0.007) both have positively selected amino acid sites across different functional domains (Fig. S4).

We therefore further studied the sequence evolution of characterized functional domains of the FIS2/VRN2 and MEA/SWN genes, including the VEF and C2H2 domains in the FIS2 and VRN2 genes, and the C5, SET, SANT and CXC domains in the MEA and SWN genes (Fig. 5, Fig. S3). The trend of accelerated sequence evolution in FIS2 and MEA, and of evolutionary constraint resulting in the conservation of VRN2 and SWN, was reflected in all the functional domains we analyzed individually. The VEF domain in FIS2/VRN2 genes and the C5 domain in MEA/SWN genes physically interact with each other; thus the comparison between the two sets of Ka/Ks ratios best describes the co-evolution between FIS2 and MEA at the coding sequence level from a protein-protein interaction perspective (Fig. 5). Consistent with the full-length gene analyses, the VEF domain in the FIS2 lineages and the C5 domain in the MEA lineages both have accelerated amino acid substitution rates, with evidence (Ka/Ks > 1) suggesting positive selection on a few branches (Fig. S3; Table S1). Similar results were found in the DNA binding related domains, C2H2 in FIS2/VRN2 genes and CXC and SANT in MEA/SWN genes (Fig. S3), suggesting that the PRC2 complexes with FIS2 and MEA may have affinity for specific DNA regions and regulate a novel network of gene expression. The SET domain acts as the methyltransferase in the PRC2 complex, and is usually highly conserved across eukaryotes (Baumbusch et al., 2001). This is reflected by the low Ka/Ks ratios detected in the SWN SET domains (Fig. S3). In contrast, the SET domain in the Brassicaceae MEA shows evidence for positive selection (Fig. S4). The rapid amino acid substitution rates in the PRC2 functional domains together likely relate to the functional divergence of the PRC2 complexes containing FIS2 and MEA.

VEL2 and VEL1, which interact with PRC2 complexes, show corresponding divergence patterns to FIS2/VRN2 and MEA/SWN
A family of five PHD finger proteins is necessary for the core PRC2 complex to maintain the repressed status of chromatin (Kim and Sung, 2010; Kim and Sung, 2013). Among them, VERNALIZATION5/VIN3-LIKE 1 (VEL1) and VEL2 are a pair of alpha whole genome duplicates. VEL2 is a maternally expressed imprinted gene (Wolff et al., 2011). We analyzed their expression profiles in the ADA and ASA microarray datasets and found that VEL1 is co-expressed with VRN2 and SWN, similar to the broadly expressed VEL homologs, whereas VEL2 has an expression pattern similar to that of FIS2 and MEA due to loss of vegetative expression (Fig. 6A; Qiu et al., 2014). VEL2 has a higher expression specificity than its paralog VEL1 (Fig. 6B). Thus the observed concerted divergence in expression pattern in the FIS-complex is not limited to the core complex, but also includes other associated proteins.

For cytosine methylation in vegetative tissue, VEL1 is marked throughout the coding exons but not in the promoter region, whereas VEL2 has cytosine methylation enriched in the upstream promoter region and the first two introns located between the 5' UTR exons (Schmitz et al., 2013; Stroud et al., 2013; Zemach et al., 2013). For histone methylation in vegetative tissue, VEL1 carries activating marks, including H3K4me3, H3K36me3 and H3K4me2 (Roudier et al., 2011). VEL2 has lost H3K4me3 and H3K36me3, and has instead gained the repressive mark H3K27me3. These epigenetic features not only correspond to the vegetative expression level, but are also consistent with the divergence of the core PRC2 components FIS2 and MEA (Table 1).

We further analyzed the sequence evolution of the VEL genes. The VEL2 sequences have an elevated average Ka/Ks ratio compared to the Brassicaceae VEL1 and orthologous VEL genes (Fig. 6B). While VEL1 and orthologous sequences have a low Ka/Ks ratio close to zero, indicating strong purifying selection, the three-fold higher ratio in VEL2 sequences suggests relaxation of purifying selection. This coincides with the accelerated amino acid substitution rates of FIS2 and MEA.

Discussion
Concerted divergence of FIS2 and MEA in the FIS-PRC2 Complex
Immediately after gene duplication, the two duplicates are hypothetically identical in function, and also in expression pattern if the cis-elements are entirely duplicated as well. Many proteins function through interactions with other proteins, whether in a regulatory or metabolic pathway, through direct protein-protein interaction, or as part of an integral complex. In such cases either the duplicates are redundant, or, if they have diverged, both duplicates could integrate into either complex and affect its function. A shift in expression pattern would be one way to avoid potentially disadvantageous crosstalk between interacting members (Aarke et al., 2015). Blanc and Wolfe (2004) described a process of concerted divergence of gene expression in Arabidopsis thaliana, in which pairs of duplicates whose protein products interact diverge in expression pattern in a parallel manner. However, as FIS2 and VRN2 were not identified as alpha WGD duplicates by the genome-wide study (Blanc et al., 2003), their concerted divergence in expression pattern with MEA and SWN was not included.

Here we show that FIS2 and MEA diverged in expression pattern in a concerted manner, modified from the co-expressed VRN2 and SWN, whose expression pattern resembles the ancestral state. In addition, we show that cytosine methylation and histone methylation patterns in FIS2 and MEA also diverged in a concerted manner. It is possible that the methylation changes contributed to the changes in expression patterns, although mutations in regulatory elements may also have played a role. FIS2 and MEA are marked by H3K27me3 in vegetative tissue, suggesting that they both became targets of a vegetative PRC2 complex after their formation by gene duplication (Bouyer et al., 2011). In addition to the vegetative epigenetic divergence, FIS2 and MEA are well known as imprinted genes during seed development, both being maternally expressed (Berger and Chaudhury, 2009). Based on the genome-wide datasets from Hsieh et al. (2011) and Gehring et al. (2011), we determined that VRN2 and SWN are not imprinted, and the more distant relatives in their gene families, EMF2 and CLF, also lack evidence of imprinting. Thus we infer that FIS2 and MEA became imprinted genes after their divergence from VRN2 and SWN. This concerted change in the regulation of both genes ensures dosage balance between the interacting proteins. The concerted divergence of FIS2 and MEA from their paralogs is also reflected in the elevated Ka/Ks ratios of the coding sequences at comparable levels, suggesting that similarly relaxed purifying selection is acting on the two genes. Altogether these changes indicate that FIS2 and MEA have been diverging in concert in multiple ways, which likely contributed to the divergence in function between the FIS-PRC2 complex and the VRN-PRC2 complex.

Functional divergence in the FIS-PRC2 Complex
350
VRN2, SWN/CLF, FIE, and MSI1 form the VRN-complex, which regulates vernalization to
351
control flowering time in Arabidopsis (Fig. 1; Hennig and Derkacheva, 2009). The complex also
352
represses autonomous seed coat development (Roszak and Kohler, 2011). The FIS-complex
353
contains FIS2, MEA, FIE, and MSI1. The FIS-complex is important in gametophyte and seed
354
development and it has two major functions. A pre-fertilization role for the FIS-complex is that it
355
prevents proliferation of the central cell of the female gametophyte until after fertilization so that
356
seed development does not start until after fertilization (Hennig and Derkacheva, 2009). The
357
FIS-complex also acts post-fertilization. It is needed for regulating endosperm cellularization
358
during seed development (Hehenberger et al., 2012). FIS2 mutants show a phenotype of
359
abnormal female gametophyte development into embryos and are defective in controlling central
360
cell proliferation in the female gametophyte, suggesting that FIS2 is not redundant with VRN2 in
361
the pre-fertilization function (Roszak and Kohler, 2011). Thus the FIS-complex function in the
362
female gametophyte is specific to the FIS-complex and not the VRN-complex. MEA was also
363
shown to not be redundant with SWN (Roszak and Kohler, 2011). Unlike mutants of the key components
of the FIS-complex, a SWN mutant showed neither autonomous seed development in the absence of
fertilization nor seed abortion with embryo and endosperm overgrowth (Luo et al., 1999); thus it
is possible that MEA is functionally specialized for the pre-fertilization function of the FIS-complex
and cannot be complemented by SWN. As for the post-fertilization function, SWN was
shown not to be essential for seed development (Spillane et al., 2007). Thus it was proposed that
369
MEA underwent neofunctionalization to gain a post-fertilization role in regulating seed
370
development after its duplication from SWN (Spillane et al., 2007). Taking the two parts of the
371
FIS-complex functions together, it appears that the novel PRC2 made up by FIS2 and MEA
372
created a Brassicaceae-specific complex for preventing seed development prior to fertilization
373
and facilitating seed development after fertilization in Brassicaceae. This functional divergence
374
complements the concerted divergence of FIS2 and MEA in other ways that we show in this
375
study. The FIS-complex also plays an important role in establishing imprinted expression of
376
many genes in the endosperm, especially paternally expressed imprinted genes, as the
377
differentially methylated paternal or maternal allele can affect the targeting by this complex
378
(Wolff et al., 2011; Kohler et al., 2012). The concerted divergence of FIS2 and MEA in
379
expression patterns, methylation patterns, and accelerated sequence evolution may have
380
contributed to functional diversification or potentially neofunctionalization of the FIS-PRC2
381
complex. An alternative to neofunctionalization of the FIS-PRC2 complex is
382
subfunctionalization after the formation of FIS2 and MEA from their paralogs. Without
383
knowledge of the ancestral function of the PRC2 complex in plants closely related to the
384
Brassicaceae, discussed below, we cannot say for sure if there has been neofunctionalization or
385
subfunctionalization. We show in this study that there has been regulatory neofunctionalization
386
of FIS2 and MEA, which leads us to favor the possibility of neofunctionalization of the complex.
387
Nonetheless, under a scenario of subfunctionalization, FIS2 and MEA still show concerted
388
divergence in their expression patterns, cytosine and histone methylation, and accelerated
389
sequence evolution. In order to distinguish the two possible hypotheses, more research on VRN-
390
complexes in rosid species will provide valuable information to infer the function of the ancestral
391
rosid PRC2 complex.
392
How are the FIS-complex functions performed in other angiosperms outside of
393
Brassicaceae? Some clues come from studies of FIE, which is a member of the FIS2 complex, in
394
Hieracium piloselloides (Asteraceae). The central cell proliferation phenotype of Arabidopsis fie
395
mutants is not seen in sexual Hieracium FIE RNAi lines; thus a PRC2 complex does not regulate
396
central cell proliferation in the female gametophyte of Hieracium, in contrast to Arabidopsis
397
(Rodrigues et al., 2008). This might indicate that part of the pre-fertilization function of FIS-PRC2
in Brassicaceae is an evolutionary innovation; at the same time, it is possible that the unknown
399
mechanism repressing central cell proliferation is specific to the Hieracium lineage. FIE down-
400
regulation in Hieracium leads to seed abortion (Rodrigues et al., 2008) and thus FIE is important
401
for seed development, presumably as part of a PRC2 complex. Asterids do not contain FIS2,
402
VRN2, or MEA. Thus, if there is a PRC2 complex regulating seed development in asterids, it
403
probably contains the products of lineage-specific Polycomb genes and represents a mechanism that
evolved independently from that in Brassicaceae. In maize and rice there has been duplication of FIE
405
(Luo et al., 2009; Li et al., 2014). Thus the grasses may have PRC2 complexes that are
406
divergent from the ancestral state. The requirement of H3K27me3 in rice and maize endosperm
407
for establishment of imprinting suggests the functional conservation or convergence of a PRC2
408
complex in Brassicaceae and Poaceae (Makarevitch et al., 2013; Zhang et al., 2014).
409
410
Evolution of protein complexes after the duplication of components
411
We propose the model of simultaneous gene duplication and concerted divergence of one copy
412
of each duplicated pair (Fig. 7). Following formation by duplication, two genes whose products
413
function together in a complex diverge in similar ways and the complex diverges in function.
414
This divergence pattern is not limited to neo- / sub-functionalization, but includes some other
415
modifications of these scenarios such as escape from adaptive conflict. The PRC2 complexes in
416
Brassicaceae we examined in this study provide the first example of this type of divergence of
417
duplicated genes. We contrast this scenario with single-gene-duplication and divergence, where
418
one component in the complex undergoes gene duplication and one paralog then diverges, so that the
two complexes containing either paralog diverge in function as a result. Many previously described
functionally divergent paralogs may contribute to this type of divergence of their protein
421
complexes. One example is the centromere-defining histone variants CENH3 in the histone core
422
octamers that show duplication specific to the genus Mimulus and sequence divergence, whereas
423
other components in the histone core octamers do not show duplications specific to Mimulus
424
(Finseth et al., 2015). Another case is the telomere-associated proteins POT1a and POT1b in the
425
telomerase RNP complexes in Brassicaceae, where POT1a experienced positive selection that
426
enhanced its affinity with interacting proteins (Beilstein et al., 2015). A variation on this model
427
is when there is a subsequent gene duplication at a later time of another gene whose product
428
functions in the complex, followed by divergence. An example is the plant-specific RNA
429
polymerase IV and V where rounds of independent lineage-specific duplications and subsequent
430
divergence of varying kinds of subunits have increased RNA polymerase complexity and
431
specificity among different plant groups (Wang et al., 2015).
432
433
Concerted divergence of the functionally associated VELs and some PRC2 targets
434
The VEL genes, VEL1 and VEL2, which are required to maintain and facilitate polycomb
435
transcriptional repression, interact with the PRC2 complex but are not part of the complex itself.
436
Our expression, methylation, and sequence analysis results indicate that VEL2 has similar
437
patterns to FIS2 and MEA, whereas VEL1 has similar patterns to VRN2 and SWN. Thus VEL2
438
appears to be diverging in concert with FIS2 and MEA. VEL2 is also a maternally expressed
439
gene and regulated by the FIS-complex in the endosperm (Wolff et al., 2011), and VEL2 works
440
together with the FIS core complex to impose maternal regulation in seed development similar to
441
FIS2 and MEA.
442
Several PRC2 targets duplicated through the alpha WGD show similar patterns of
443
divergence as well. PKR2 and JMJ15 are FIS-PRC2 regulated imprinted genes (Hsieh et al.,
444
2011; Wolff et al., 2011), whereas their paralogs, PKL and JMJ18 show broad expression, are
445
not imprinted, and are associated with a vegetative PRC2 complex (Aichinger et al., 2011; Yang
446
et al., 2012; Zhang et al., 2012). Out of 46 imprinted genes regulated by FIS2 (Wolff et al.,
447
2011), we identified 41 Brassicaceae-specific duplicated genes. Some of those genes have roles
448
in seed development, such as PHERES1 (Kohler et al., 2003; Villar et al., 2009) and ADMETOS
449
(Kradolfer et al., 2013). Thus, there are new Brassicaceae-specific genes involved in seed
450
development that are regulated by the FIS-PRC2 complex. The functional innovation of the FIS-
451
complex appears to have rewired, to some extent, the regulatory pathway of seed development
452
specific to Brassicaceae.
453
Simultaneous gene duplication events, such as polyploidy, give rise to pairs of duplicated
454
genes that can then co-diverge (Shan et al., 2009). Many of the genes that are PRC2 targets,
455
included in the previous paragraph, were derived by the alpha whole genome duplication. FIS2,
456
MEA, and VEL2 also were derived from that WGD. Thus this study illustrates the potential of
457
concerted divergence after simultaneous gene duplication to affect functions as well as regulation
458
of other genes.
459
Materials and Methods
460
461
Comparing expression specificity and detecting co-expression using microarray data
462
analyses
463
Two sets of ATH1 microarray data from Arabidopsis thaliana were obtained: the Arabidopsis
464
development atlas (ADA) from the TAIR website (http://www.Arabidopsis.org/), which included
465
63 different organ types and developmental stages (Schmid et al., 2005), and the Arabidopsis
466
seed development atlas (ASA) from the Goldberg Lab Arabidopsis thaliana Gene Chip Database
467
(http://estdb.biology.ucla.edu/genechip/), which included 42 different tissue types from seed
468
developmental stages (Le et al., 2010). The data were GC-RMA normalized using the gcrma
469
package in R. We used the expression specificity (τ) defined by Yang and Gaut (2011) to
470
describe the expression patterns of FIS2, VRN2, MEA and SWN:
$$\tau = \frac{\sum_{j=1}^{n}\left[1 - S(i,j)/S(i,\max)\right]}{n-1},$$
where n is the total number of samples (63 or 42), S(i,j) is the log2 transformed expression
value of gene i in sample j, and S(i,max) is the highest log2 transformed
expression value for gene i across the n organ types. High values of expression specificity
indicate genes with expression limited to few organ or tissue types or developmental stages,
while low values of expression specificity indicate broad expression of genes with similar
expression levels in most of the organ or tissue types and developmental stages. To test if there is
any significant difference of expression specificity between any two of the four genes, we
477
applied a Monte Carlo randomization test with 1000 randomizations to each two-gene comparison. For the Monte
Carlo randomization test, we computed the following statistic: DIF = |τ_GENE1 − τ_GENE2|, where DIF
indicates the absolute difference of expression specificity between two genes. Then, we
compared the observed value (DIF_obs) against the null distribution of simulated DIF values
(DIF_sim) from 1000 randomized datasets. If the null hypothesis is rejected, the expression specificity
of the two compared genes is significantly different. The significance cutoff for the P value was
set to 0.05.
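
As an illustration only, the expression specificity index and the randomization test described above can be sketched in Python. The function names are hypothetical, and the randomization scheme is an assumption: the text does not state how the 1000 randomized datasets were generated, so this sketch swaps the two genes' values within each sample with probability 0.5.

```python
import random


def tau(expr):
    """Expression specificity: sum_j [1 - S(i,j)/S(i,max)] / (n - 1).

    `expr` holds the log2 transformed expression values of one gene
    across n samples.
    """
    n = len(expr)
    s_max = max(expr)
    if n < 2 or s_max <= 0:
        raise ValueError("need at least two samples and a positive maximum")
    return sum(1.0 - s / s_max for s in expr) / (n - 1)


def randomization_test(expr1, expr2, n_rand=1000, seed=0):
    """Return (DIF_obs, P) for the statistic DIF = |tau1 - tau2|.

    ASSUMPTION: the null distribution is built by swapping the two genes'
    values within each sample with probability 0.5. The source does not
    specify its randomization scheme.
    """
    rng = random.Random(seed)
    dif_obs = abs(tau(expr1) - tau(expr2))
    hits = 0
    for _ in range(n_rand):
        a, b = [], []
        for x, y in zip(expr1, expr2):
            if rng.random() < 0.5:
                x, y = y, x
            a.append(x)
            b.append(y)
        # Count randomized datasets at least as extreme as the observation.
        if abs(tau(a) - tau(b)) >= dif_obs:
            hits += 1
    # Add-one correction keeps the estimated P value above zero.
    p_value = (hits + 1) / (n_rand + 1)
    return dif_obs, p_value
```

A gene expressed in a single sample gives τ = 1, and a gene expressed equally in all samples gives τ = 0. Under the cutoff used here, two genes would be called different in specificity when the returned P value is below 0.05.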
484
In addition to the comparison of expression specificity among gene pairs, we applied the
485
Pearson correlation analysis to determine if the expression profile between any two genes
486
showed any evidence of co-expression (i.e. correlated expression across different organ types or
487
tissue types). Co-expression is determined when the Pearson correlation coefficient (r) is
488
significantly positive, and vice versa.
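
The co-expression call can be sketched in the same way. The Pearson coefficient below follows the standard definition; the significance step is an assumption, since the text does not say how the P value for r was obtained. This sketch substitutes a two-sided permutation test for a parametric test, and the function names and labels are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("profiles must have equal length of at least 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def coexpression_call(x, y, n_perm=1000, alpha=0.05, seed=0):
    """Return (r, P, label) for one gene pair.

    ASSUMPTION: significance is estimated by permuting the sample order of
    one profile (two-sided); the source may have used a parametric test.
    """
    rng = random.Random(seed)
    r_obs = pearson_r(x, y)
    y_perm = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson_r(x, y_perm)) >= abs(r_obs):
            extreme += 1
    p_value = (extreme + 1) / (n_perm + 1)
    if p_value >= alpha:
        label = "not significant"
    elif r_obs > 0:
        label = "co-expressed"
    else:
        label = "negatively correlated"
    return r_obs, p_value, label
```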
489
490
Inferring the ancestral expression states using RT-PCR
491
Total RNA samples of Arabidopsis thaliana, Tarenaya hassleriana (formerly known as Cleome
492
spinosa), Carica papaya, and Vitis vinifera were extracted from liquid N2 frozen tissue of five
493
organ types: root, stem, leaf (rosette leaves in Arabidopsis thaliana), flower, and seed (whole
494
siliques in Arabidopsis thaliana and Tarenaya hassleriana). A modified CTAB method was used
495
for RNA extraction (Zhou et al., 2011). The quality of each RNA sample was checked on 2%
496
agarose gels by electrophoresis, and the amount of each RNA sample was determined by a
497
Nanodrop spectrophotometer. After DNaseI (Invitrogen) treatment to remove residual DNA, M-
498
MLV reverse transcriptase (Invitrogen) was applied to the RNA samples to generate cDNA,
499
according to the manufacturer’s instructions. PCR was performed with cDNA templates to detect
500
the organ-specific expression of Arabidopsis FIS2/VRN2 and MEA/SWN paralogous pairs, as
501
well as orthologous genes in outgroup species for inference of the ancestral, pre-duplication,
502
expression states. Gene-specific primers were designed to amplify 250-1000 bp of the cDNA of
503
targeted genes (Table S2). For PCR reactions, the cycling programs were: preheating at 94℃ for
504
3 minutes; 30-35 cycles of denaturing at 94℃ for 30 seconds, annealing at 53-56℃ for 30
505
seconds, elongation at 72℃ for 30 seconds or 1 minute, and a final elongation at 72℃ for 7
506
minutes. PCR products were checked on 1% agarose gels, and sequenced to confirm identity.
507
508
Identifying epigenetic marks associated with the studied genes
509
We investigated the epigenetic modifications around the genomic regions of Arabidopsis FIS2,
510
VRN2, MEA and SWN. We also used EMF2 and CLF, which are members of the FIS2/VRN2 and
511
MEA/SWN families, respectively, to help assess the ancestral state. For DNA methylation, we
512
obtained data from Schmitz et al. (2013), Stroud et al. (2013) and Zemach et al. (2013) from
513
CoGe (https://genomevolution.org/CoGe/), visualized by JBrowse in Araport
514
(https://www.araport.org/). Analyzed data included assayed genomic DNAs from leaves in
515
Schmitz et al. (2013) and Stroud et al. (2013), and assayed genomic DNAs from seedlings and
516
roots in Zemach et al. (2013), which were all vegetative organs. Cytosine methylation at CpG
517
sites was analyzed along the genomic region of a target gene. For histone methylation, we
518
extracted tiling-array data from seedlings from Roudier et al. (2011) and ChIP-on-chip data from
519
wild type and fie mutant seedlings from Bouyer et al. (2011). Four histone marks were analyzed:
520
tri-methylation of lysine 27 on histone H3 (H3K27me3), tri-methylation of lysine 4 on histone
521
H3 (H3K4me3), di-methylation of lysine 4 on histone H3 (H3K4me2) and tri-methylation of
522
lysine 36 on histone H3 (H3K36me3). The epigenetic features in Arabidopsis seedlings were
523
compared among the paralogous genes in a family and between the two interacting gene families.
524
525
Detecting accelerated sequence evolution and positive selection by Ka/Ks analyses
526
To analyse the selection acting on the gene pairs FIS2/VRN2 and MEA/SWN, several rate
527
analyses were performed using Codeml in the PAML package (Yang, 2007). We obtained the
528
sequences of the four genes from Arabidopsis thaliana, as well as some other Brassicaceae
529
species, including Arabidopsis lyrata, Arabidopsis halleri, Capsella rubella, Brassica rapa,
530
Brassica oleracea, Eutrema salsugineum (formerly known as Thellungiella halophila), and
531
Schrenkiella parvula (formerly known as Thellungiella parvula). We also identified orthologous
532
sequences, by reciprocal best BLAST hits, from species outside of the Brassicaceae including
533
Tarenaya hassleriana (formerly known as Cleome spinosa), Carica papaya, Gossypium
534
raimondii, Theobroma cacao, Citrus sinensis, Populus trichocarpa, Ricinus communis,
Manihot esculenta, and Vitis vinifera from PLAZA v3.0 Dicots
536
(http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/; Proost et al., 2015),
537
Phytozome v10 (http://phytozome.jgi.doe.gov/pz/portal.html; Goodstein et al., 2012), BRAD
538
database (http://brassicadb.org/brad/; Cheng et al., 2011) and NCBI’s GenBank. Gene orthology
539
was later confirmed by comparing the topology of the gene phylogeny to the species tree.
540
Alignments of amino acid sequences were generated using MUSCLE under default parameters
541
(Edgar, 2004), and then reverse translated into codon alignments using a custom Perl script.
542
We generated the alignments for the full length of the two gene families, as well as some
543
documented functional domains, including the VEF and C2H2 domains in the FIS2 and VRN2
544
genes, and the C5, SET, SANT and CXC domains in the MEA and SWN genes. Phylogenies of
545
the two gene families were analyzed by RAxML v.7.0.3 with GTR as the substitution matrix
546
(Stamatakis, 2006). ML trees of the two gene families were generated based on codon
547
alignments.
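
The reverse-translation step can be illustrated with a short sketch. The study used a custom Perl script that is not shown, so the Python function below is only a hypothetical equivalent: it threads an unaligned coding sequence onto its aligned protein sequence, writing three gap characters for each alignment gap. It assumes the coding sequence starts in frame at the first aligned residue, and it does not check that each codon actually encodes the aligned residue.

```python
def protein_to_codon_alignment(aligned_protein, cds, gap="-"):
    """Map an unaligned CDS onto one row of a protein alignment.

    ASSUMPTIONS: `cds` starts in frame at the first residue of
    `aligned_protein`; any trailing codons (for example a stop codon)
    are ignored; codon-to-residue agreement is not verified.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * n_residues:
        raise ValueError("CDS is shorter than the protein it should encode")
    out = []
    pos = 0
    for aa in aligned_protein:
        if aa == gap:
            out.append(gap * 3)           # one protein gap = three nucleotide gaps
        else:
            out.append(cds[pos:pos + 3])  # next codon in the unaligned CDS
            pos += 3
    return "".join(out)
```

For example, the aligned protein row `M-K` with the coding sequence `ATGAAATAA` yields the codon row `ATG---AAA`.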
548
We first used a phylogeny-based free-ratio test to estimate branch-wise Ka/Ks ratios
549
along the phylogenetic tree branches. For the full-length FIS2/VRN2 genes we implemented four
550
different models to test if the Ka/Ks ratios of the Brassicaceae FIS2 clade and the Brassicaceae
551
VRN2 clade display an asymmetric pattern, and how conserved they are compared to the
552
orthologous genes. The first model (Model I: one-ratio model) assumes that all the genes have
553
the same Ka/Ks ratio, bearing the hypothesis that all genes are under the same level of selection.
554
The second model (Model II: two-ratio model-1) assumes that the Brassicaceae VRN2 clade and
555
the orthologous genes have the same Ka/Ks ratio, but the Brassicaceae FIS2 clade can have a
556
different one, suggesting that the Brassicaceae VRN2 clade reflects the ancestral selection but
557
FIS2 evolved in a different manner. The third model (Model III: two-ratio model-2) assumes the
558
duplicated FIS2 and VRN2 clades in Brassicaceae have the same Ka/Ks ratio, while the orthologs
559
can have a different ratio, which is a hypothesis that the two Brassicaceae copies evolved at the
560
same rate. The fourth model (Model IV: three-ratio model) assumes that the two Brassicaceae
561
branches have different Ka/Ks ratios, and thus the two genes evolved at different rates, with the
562
third Ka/Ks ratio for the orthologous branches. A set of likelihood ratio tests were applied, where
563
twice the difference in log-likelihood values was calculated and compared against a chi-square
564
distribution with the degree of freedom (df) set at one: comparison between Model II and Model
565
IV can tell if the selection on the Brassicaceae VRN2 is significantly different from the
566
orthologous genes; and comparisons between Model I and Model II, as well as between Model
567
III and Model IV, show whether the selection on the Brassicaceae FIS2 is different from the
568
Brassicaceae VRN2 and/or the orthologous genes. When Model II fits better than Model I, and
569
Model IV fits better than Model III with statistical support, the
duplicated pair in Brassicaceae is considered to evolve asymmetrically. The same analyses were
571
performed on the functional domains of the FIS2/VRN2 genes, and the full-length MEA/SWN
572
genes and their functional domains (Table S1). We also applied a branch-site model to detect
573
positively selected sites along FIS2 as well as MEA. Test 2 of Model A with the Bayes Empirical
574
Bayes analysis was applied to identify amino acid sites with a high posterior probability of
575
positive selection (Zhang et al. 2005).
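
The likelihood ratio tests among the four branch models can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the log-likelihood values in the usage note are invented placeholders rather than results from this study. For one degree of freedom, the chi-square upper-tail probability of a statistic x equals erfc(sqrt(x/2)), which avoids any dependency outside the Python standard library.

```python
import math


def lrt_df1(lnl_null, lnl_alt):
    """Likelihood ratio test with one degree of freedom.

    Returns (2 * delta lnL, P). For df = 1 the chi-square survival
    function is erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def evolves_asymmetrically(lnl, alpha=0.05):
    """Apply the rule from the text: Model II must fit better than Model I
    and Model IV better than Model III.

    `lnl` maps the model labels "I", "II", "III", "IV" to log-likelihoods.
    """
    _, p_1_vs_2 = lrt_df1(lnl["I"], lnl["II"])
    _, p_3_vs_4 = lrt_df1(lnl["III"], lnl["IV"])
    return p_1_vs_2 < alpha and p_3_vs_4 < alpha
```

With placeholder log-likelihoods such as `{"I": -1010.0, "II": -1005.0, "III": -1008.0, "IV": -1003.0}`, both comparisons give 2ΔlnL = 10 and the pair would be called asymmetric; real values would come from the codeml output for each model.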
Table(s)
582
Table 1. Histone methylation of studied genes.
583
H3K27me3
VEF
genes
SET
genes
VEL
genes
FIS2
VRN2
EMF2
x
MEA
SWN
CLF
x
VEL2
VEL1
x
H3K4me3
H3K4me2
H3K36me3
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
x
584
585
x’s indicate presence of a particular type of histone methylation
SUPPLEMENTAL DATA
590
Fig. S1. Permutation test for microarray data to detect the difference in expression profile for all sets of
591
comparisons of gene pairs in the ADA and ASA datasets.
592
Fig. S2. Structures of FIS2 and VRN2, along with MEA and SWN in Brassicaceae and other eurosids.
593
Fig. S3. Ka/Ks ratios of full-length FIS2/VRN2 and MEA/SWN genes and functional domains.
594
Fig. S4. Positive selection on specific sites of MEA and FIS2 genes.
595
Table S1. Ka/Ks ratios under different branch models for full-length FIS2/VRN2 and MEA/SWN genes
596
and functional domains.
597
Table S2. Gene-specific primers used in this study.
598
599
600
Figure Legends
601
Figure 1. Two PRC2 complexes in Brassicaceae, the VRN-complex and the Brassicaceae-
602
specific FIS-complex, arose by the alpha whole genome duplication where VRN2 duplicated to
603
form FIS2, and SWN duplicated to form MEA.
604
605
Figure 2. A. Organ/tissue-specific expression indices based on two sets of microarray data. A
606
large value indicates expression is restricted to fewer organ or tissue types while a low value
607
indicates broad expression. B. Correlation of expression profile of each gene pair. Left: ADA set
608
(63 organ types and developmental stages); right: ASA set (42 seed tissue types and
609
developmental stages). Black arrows indicate a positive correlation and grey arrows indicate a
610
negative correlation. The thickness of arrows indicates the level of the correlation coefficient.
611
The correlation coefficient and p-value of expression profile of each gene pair are labeled along
612
the arrows. Bold values indicate positive correlation.
613
614
Figure 3. RT-PCR assays indicate that FIS2 and MEA have lost the ancestral vegetative
615
expression pattern after duplication. Plus signs indicate reactions with reverse transcriptase and
616
minus signs indicate controls with no reverse transcriptase. Species abbreviations include: At -
617
Arabidopsis thaliana, Th - Tarenaya hassleriana, Cp - Carica papaya, and Vv - Vitis vinifera.
618
619
Figure 4. DNA methylation at the genomic region of the VEF-domain genes and SET-domain
620
genes. CLF and EMF2 are ancient paralogs of SWN and VRN2, respectively. For each gene, four
621
rows represent four replicates, and the dashed line separates 1500 bp upstream of the
622
transcription start site. Vertical bars in each row represent the level of methylation.
623
624
Figure 5. Ka/Ks values of the interacting domains: VEF domain in FIS2/VRN2 and C5 domain
625
in MEA/SWN. Estimated average Ka/Ks ratio of each clade is shown between the two trees. The
626
values above branches are Ka/Ks ratios (where no value suggested the lack of power to detect the
627
accurate Ka/Ks ratio in the PAML analysis). The black dots indicate the alpha WGD at the base
628
of the Brassicaceae. The scale bars indicate 0.1 substitution per codon. Species abbreviations
629
include: At - Arabidopsis thaliana, Al - Arabidopsis lyrata, Cr - Capsella rubella, Sp -
630
Schrenkiella parvula, Es - Eutrema salsugineum, Br - Brassica rapa, Bo - Brassica oleracea, Th
631
- Tarenaya hassleriana, Cp - Carica papaya, Gr – Gossypium raimondii, Tc - Theobroma cacao,
632
Pt - Populus trichocarpa, Rc - Ricinus communis, Me - Manihot esculenta and Vv - Vitis
633
vinifera.
634
635
Figure 6. VEL2 and VEL1 expression and sequence evolution. A. Organ/tissue specificity of
636
VEL genes. B. Correlation of expression profile between VEL genes and PRC2 core components.
637
Left: ADA set (63 organ types and developmental stages); right: ASA set (42 seed tissue types
638
and developmental stages). Black arrows indicate positive correlation, and grey arrows indicate
639
negative correlation. The thickness of arrows indicates the level of the correlation coefficient.
640
The correlation coefficient and p-value of expression profile of each gene pair are labeled along
641
the arrows. Bold values indicate positive correlations. C. DNA methylation at the genomic
642
region of VEL genes (as in Fig.4). D. Ka/Ks values of the VEL genes. Average Ka/Ks ratio of
643
each clade is shown. The black dot at the node indicates gene duplication events.
644
645
Figure 7. Schematic diagrams illustrating models of protein complex divergence. Colors indicate
646
conservation vs. divergence (could be neofunctionalization, subfunctionalization, loss of partial
647
function, and other types of divergence). A. Single-gene-duplication and divergence: a single
648
gene (dark blue) in a complex is duplicated. After duplication there is subsequent divergence
649
(light blue vs. red) of the ancestral gene (dark blue) to give rise to divergent protein complexes.
650
B. Simultaneous-gene-duplication and concerted divergence: two (or more) genes (dark green +
651
dark blue) were duplicated simultaneously. After duplication there is parallel divergence (light
652
green + light blue vs. yellow + red) to give rise to divergent protein complexes. C. The PRC2
653
complexes in this study are an example of simultaneous-gene-duplication and concerted
654
divergence.
Parsed Citations
Aakre CD, Herrou J, Phung TN, Perchuk BS, Crosson S, Laub MT. 2015. Evolving new protein-protein interaction specificity
through promiscuous intermediates. Cell 163: 594-606.
Aichinger E, Villar CB, Di Mambro R, Sabatini S, Köhler C. 2011. The CHD3 chromatin remodeler PICKLE and polycomb group
proteins antagonistically regulate meristem activity in the Arabidopsis root. Plant Cell 23: 1047-1060.
Baumbusch LO, Thorstensen T, Krauss V, Fischer A, Naumann K, Assalkhou R, Schulz I, Reuter G, Aalen RB. 2001. The
Arabidopsis thaliana genome contains at least 29 active genes encoding SET domain proteins that can be assigned to four
evolutionarily conserved classes. Nucleic Acids Research 29: 4319-4333.
Beilstein MA, Renfrew KB, Song X, Shakirov EV, Zanis MJ, Shippen DE. 2015. Evolution of the Telomere-Associated Protein
POT1a in Arabidopsis thaliana is characterized by positive selection to reinforce protein-protein interaction. Molecular Biology
and Evolution 32: 1329-1341.
Birchler JA, Riddle NC, Auger DL, Veitia RA. 2005. Dosage balance in gene regulation: biological implications. Trends in Genetics
21: 219-226.
Blanc G, Hokamp K, Wolfe KH. 2003. A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis
genome. Genome Research 13: 137-144.
Blanc G, Wolfe KH. 2004. Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell
16: 1679-1691.
Berger F, Chaudhury A. 2009. Parental memories shape seeds. Trends in Plant Science 14: 550-556.
Bouyer D, Roudier F, Heese M, Andersen ED, Gey D, Nowack MK, Goodrich J, Renou JP, Grini PE, Colot V, Schnittger A. 2011.
Polycomb repressive complex 2 controls the embryo-to-seedling phase transition. PLoS Genetics 7: e1002014.
Bowers JE, Chapman BA, Rong J, Paterson AH. 2003. Unraveling angiosperm genome evolution by phylogenetic analysis of
chromosomal duplication events. Nature 422: 433-438.
Capra EJ, Perchuk BS, Skerker JM, Laub MT. 2012. Adaptive mutations that prevent crosstalk enable the expansion of paralogous
signaling protein families. Cell 150: 222-232.
Casneuf T, De Bodt S, Raes J, Maere S, Van de Peer Y. 2006. Nonrandom divergence of gene expression following gene and
genome duplications in the flowering plant Arabidopsis thaliana. Genome Biology 7: R13.
Chen LJ, Diao ZY, Specht C, Sung ZR. 2009. Molecular evolution of VEF-domain-containing PcG genes in plants. Molecular Plant 2:
738-754.
Cheng F, Liu S, Wu J, Fang L, Sun S, Liu B, Li P, Hua W, Wang X. 2011. BRAD, the genetics and genomics database for Brassica
plants. BMC Plant Biology 11: 136.
Cheng S, van den Bergh E, Zeng P, Zhong X, Xu J, Liu X, Hofberger J, de Bruijn S, Bhide AS, Kuelahoglu C, Bian C, Chen J, Fan G,
Kaufmann K, Hall JC, Becker A, Bräutigam A, Weber AP, Shi C, Zheng Z, Li W, Lv M, Tao Y, Wang J, Zou H, Quan Z, Hibberd JM,
Zhang G, Zhu XG, Xu X, Schranz ME. 2013. The Tarenaya hassleriana genome provides insight into reproductive trait and genome
evolution of crucifers. Plant Cell 25: 2813-2830.
Coate JE, Song MJ, Bombarely A, Doyle JJ. 2016. Expression-level support for gene dosage sensitivity in three Glycine subgenus
Glycine polyploids and their diploid progenitors. New Phytologist doi: 10.1111/nph.14090.
Duarte JM, Cui L, Wall PK, Zhang Q, Zhang X, Leebens-Mack J, Ma H, Altman N, dePamphilis CW. 2006. Expression pattern shifts
following duplication indicative of subfunctionalization and neofunctionalization in regulatory genes of Arabidopsis. Molecular
Biology and Evolution 23: 469-478.
Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32: 1792-1797.
Finseth FR, Dong Y, Saunders A, Fishman L. 2015. Duplication and adaptive evolution of a key centromeric protein in Mimulus, a
genus with female meiotic drive. Molecular Biology and Evolution 32: 2694-2706.
Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. 1999. Preservation of duplicate genes by complementary,
degenerative mutations. Genetics 151: 1531-1545.
Gehring M, Missirian V, Henikoff S. 2011. Genomic analysis of parent-of-origin allelic expression in Arabidopsis thaliana seeds.
PLoS One 6: e23687.
Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, Fazo J, Mitros T, Dirks W, Hellsten U, Putnam N, Rokhsar DS. 2012.
Phytozome: a comparative platform for green plant genomics. Nucleic Acids Research 40 (Database issue): D1178-1186.
Gossmann TI, Schmid KJ. 2011. Selection-driven divergence after gene duplication in Arabidopsis thaliana. Journal of Molecular
Evolution 73: 153-165.
He X, Zhang J. 2005. Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene
evolution. Genetics 169: 1157-1164.
Hehenberger E, Kradolfer D, Köhler C. 2012. Endosperm cellularization defines an important developmental transition for embryo
development. Development 139: 2031-2039.
Hennig L, Derkacheva M. 2009. Diversity of Polycomb group complexes in plants: same rules, different players? Trends in
Genetics 25: 414-423.
Hsieh TF, Shin JY, Uzawa R, Silva P, Cohen S, Bauer MJ, Hashimoto M, Kirkbride RC, Harada JJ, Zilberman D, Fischer RL. 2011.
Regulation of imprinted gene expression in Arabidopsis endosperm. Proceedings of the National Academy of Sciences of the USA
108: 1755-1762.
Jiao Y, Wickett NJ, Ayyampalayam S, Chanderbali AS, Landherr L, Ralph PE, Tomsho LP, Hu Y, Liang H, Soltis PS, Soltis DE, Clifton
SW, Schlarbaum SE, Schuster SC, Ma H, Leebens-Mack J, dePamphilis CW. 2011. Ancestral polyploidy in seed plants and
angiosperms. Nature 473: 97-100.
Kim DH, Sung S. 2010. The Plant Homeo Domain finger protein, VIN3-LIKE 2, is necessary for photoperiod-mediated epigenetic
regulation of the floral repressor, MAF5. Proceedings of the National Academy of Sciences of the USA 107: 17029-17034.
Kim DH, Sung S. 2013. Coordination of the vernalization response through a VIN3 and FLC gene family regulatory network in
Arabidopsis. Plant Cell 25: 454-469.
Kradolfer D, Wolff P, Jiang H, Siretskiy A, Köhler C. 2013. An imprinted gene underlies postzygotic reproductive isolation in
Arabidopsis thaliana. Developmental Cell 26: 525-535.
Köhler C, Hennig L, Spillane C, Pien S, Gruissem W, Grossniklaus U. 2003. The Polycomb-group protein MEDEA regulates seed
development by controlling expression of the MADS-box gene PHERES1. Genes and Development 17: 1540-1553.
Köhler C, Wolff P, Spillane C. 2012. Epigenetic mechanisms underlying genomic imprinting in plants. Annual Review of Plant
Biology 63: 331-352.
Le BH, Cheng C, Bui AQ, Wagmaister JA, Henry KF, Pelletier J, Kwong L, Belmonte M, Kirkbride R, Horvath S, Drews GN, Fischer
RL, Okamuro JK, Harada JJ, Goldberg RB. 2010. Global analysis of gene activity during Arabidopsis seed development and
identification of seed-specific transcription factors. Proceedings of the National Academy of Sciences of the USA 107: 8063-8070.
Li S, Zhou B, Peng X, Kuang Q, Huang X, Yao J, Du B, Sun MX. 2014. OsFIE2 plays an essential role in the regulation of rice
vegetative and reproductive development. New Phytologist 201: 66-79.
Li Z, Baniaga AE, Sessa EB, Scascitelli M, Graham SW, Rieseberg LH, Barker MS. 2015. Early genome duplications in conifers and
other seed plants. Science Advances 1: e1501084.
Liu SL, Baute GJ, Adams KL. 2011. Organ and cell type-specific complementary expression patterns and regulatory
neofunctionalization between duplicated genes in Arabidopsis thaliana. Genome Biology and Evolution 3: 1419-1436.
Luo M, Bilodeau P, Koltunow A, Dennis ES, Peacock WJ, Chaudhury AM. 1999. Genes controlling fertilization-independent seed
development in Arabidopsis thaliana. Proceedings of the National Academy of Sciences of the USA 96: 296-301.
Luo M, Platten D, Chaudhury A, Peacock WJ, Dennis ES. 2009. Expression, imprinting, and evolution of rice homologs of the
polycomb group genes. Molecular Plant 2: 711-723.
Makarevitch I, Eichten SR, Briskine R, Waters AJ, Danilevskaya ON, Meeley RB, Myers CL, Vaughn MW, Springer NM. 2013.
Genomic distribution of maize facultative heterochromatin marked by trimethylation of H3K27. Plant Cell 25: 780-793.
Moore RC, Purugganan MD. 2005. The evolutionary dynamics of plant duplicate genes. Current Opinion in Plant Biology 8: 122-128.
Mozgova I, Köhler C, Hennig L. 2015. Keeping the gate closed: functions of the polycomb repressive complex PRC2 in
development. Plant Journal 83: 121-132.
Proost S, Van Bel M, Vaneechoutte D, Van de Peer Y, Inzé D, Mueller-Roeber B, Vandepoele K. 2015. PLAZA 3.0: an access point
for plant comparative genomics. Nucleic Acids Research 43 (Database issue): D974-981.
Qian Y, Xi Y, Cheng B, Zhu S, Kan X. 2014. Identification and characterization of the SET domain gene family in maize. Molecular
Biology Reports 41: 1341-1354.
Qiu Y, Liu SL, Adams KL. 2014. Frequent changes in expression profile and accelerated sequence evolution of duplicated
imprinted genes in Arabidopsis. Genome Biology and Evolution 6: 1830-1842.
Rodrigues JC, Tucker MR, Johnson SD, Hrmova M, Koltunow AM. 2008. Sexual and apomictic seed formation in Hieracium
requires the plant polycomb-group gene FERTILIZATION INDEPENDENT ENDOSPERM. Plant Cell 20: 2372-2386.
Roszak P, Köhler C. 2011. Polycomb group proteins are required to couple seed coat initiation to fertilization. Proceedings of the
National Academy of Sciences of the USA 108: 20826-20831.
Roudier F, Ahmed I, Bérard C, Sarazin A, Mary-Huard T, Cortijo S, Bouyer D, Caillieux E, Duvernois-Berthet E, Al-Shikhley L, Giraut
L, Després B, Drevensek S, Barneche F, Dèrozier S, Brunaud V, Aubourg S, Schnittger A, Bowler C, Martin-Magniette ML, Robin
S, Caboche M, Colot V. 2011. Integrative epigenomic mapping defines four main chromatin states in Arabidopsis. The EMBO
Journal 30: 1928-1938.
Schmid M, Davison TS, Henz SR, Pape UJ, Demar M, Vingron M, Schölkopf B, Weigel D, Lohmann JU. 2005. A gene expression
map of Arabidopsis thaliana development. Nature Genetics 37: 501-506.
Schmitz RJ, Schultz MD, Urich MA, Nery JR, Pelizzola M, Libiger O, Alix A, McCosh RB, Chen H, Schork NJ, Ecker JR. 2013.
Patterns of population epigenomic diversity. Nature 495: 193-198.
Schranz M, Mitchell-Olds T. 2006. Independent ancient polyploidy events in the sister families Brassicaceae and Cleomaceae.
Plant Cell 18: 1152-1165.
Shan H, Zahn L, Guindon S, Wall PK, Kong H, Ma H, DePamphilis CW, Leebens-Mack J. 2009. Evolution of plant MADS box
transcription factors: evidence for shifts in selection associated with early angiosperm diversification and concerted gene
duplications. Molecular Biology and Evolution 26: 2229-2244.
Soltis PS, Soltis DE. 2016. Ancient WGD events as drivers of key innovations in angiosperms. Current Opinion in Plant Biology 30:
159-165.
Spillane C, Schmid KJ, Laoueille-Duprat S, Pien S, Escobar-Restrepo J-M, Baroux C, Gagliardini V, Page DR, Wolfe KH,
Grossniklaus U. 2007. Positive darwinian selection at the imprinted MEDEA locus in plants. Nature 448: 349-352.
Stamatakis A. 2006. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models.
Bioinformatics 22: 2688-2690.
Stroud H, Greenberg MV, Feng S, Bernatavichute YV, Jacobsen SE. 2013. Comprehensive analysis of silencing mutants reveals
complex regulation of the Arabidopsis methylome. Cell 152: 352-364.
Suzuki MM, Bird A. 2008. DNA methylation landscapes: provocative insights from epigenomics. Nature Reviews Genetics 9: 465-476.
Van de Peer Y, Maere S, Meyer A. 2009. The evolutionary significance of ancient genome duplications. Nature Reviews Genetics
10: 725-732.
Villar CB, Erilova A, Makarevich G, Trösch R, Köhler C. 2009. Control of PHERES1 imprinting in Arabidopsis by direct tandem
repeats. Molecular Plant 2: 654-660.
Wang J, Tao F, Marowsky NC, Fan C. 2016. Evolutionary Fates and Dynamic Functionalization of Young Duplicate Genes in
Arabidopsis Genomes. Plant Physiology 172: 427-440.
Wang Y, Ma H. 2015. Step-wise and lineage-specific diversification of plant RNA polymerase genes and origin of the largest plant-specific subunits. New Phytologist 207: 1198-1212.
Wolff P, Weinhofer I, Seguin J, Roszak P, Beisel C, Donoghue MT, Spillane C, Nordborg M, Rehmsmeier M, Köhler C. 2011. High-resolution analysis of parent-of-origin allelic expression in the Arabidopsis endosperm. PLoS Genetics 7: e1002126.
Yang H, Han Z, Cao Y, Fan D, Li H, Mo H, Feng Y, Liu L, Wang Z, Yue Y, Cui S, Chen S, Chai J, Ma L. 2012. A companion cell-dominant and developmentally regulated H3K4 demethylase controls flowering time in Arabidopsis via the repression of FLC
expression. PLoS Genetics 8: e1002664.
Yang L, Gaut BS. 2011. Factors that contribute to variation in evolutionary rate among Arabidopsis genes. Molecular Biology and
Evolution 28: 2359-2369.
Yang Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution 24: 1586-1591.
Zemach A, Kim MY, Hsieh PH, Coleman-Derr D, Eshed-Williams L, Thao K, Harmer SL, Zilberman D. 2013. The Arabidopsis
nucleosome remodeler DDM1 allows DNA methyltransferases to access H1-containing heterochromatin. Cell 153: 193-205.
Zhang H, Bishop B, Ringenberg W, Muir WM, Ogas J. 2012. The CHD3 remodeler PICKLE associates with genes enriched for
trimethylation of histone H3 lysine 27. Plant Physiology 159: 418-432.
Zhang JZ, Nielsen R, Yang ZH. 2005. Evaluation of an improved branch-site likelihood method for detecting positive selection at
the molecular level. Molecular Biology and Evolution 22: 2472-2479.
Zhang M, Xie S, Dong X, Zhao X, Zeng B, Chen J, Li H, Yang W, Zhao H, Wang G, Chen Z, Sun S, Hauck A, Jin W, Lai J. 2014.
Genome-wide high resolution parental-specific DNA and histone methylation maps uncover patterns of imprinting regulation in
maize. Genome Research 24: 167-176.
Zhou R, Moshgabadi N, Adams KL. 2011. Extensive changes to alternative splicing patterns following allopolyploidy in natural and
resynthesized polyploids. Proceedings of the National Academy of Sciences of the USA 108: 16122-16127.