Plant Physiology Preview. Published on April 28, 2017, as DOI:10.1104/pp.16.01983

Short title: Concerted divergence of duplicated genes

Concerted divergence after gene duplication in Polycomb Repressor complexes

Yichun Qiu1, Shao-Lun Liu2, and Keith L. Adams1

1Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
2Department of Life Science, Tunghai University, Taichung, Taiwan
Corresponding author: Keith L. Adams, E-mail: [email protected]

Summary: FIS2 and MEA have diverged in concert after simultaneous gene duplication, resulting in functional divergence of the PRC2-complexes in Brassicaceae, which is a novel fate for duplicated genes whose products act in complexes.

Author contributions: Y.Q. and K.L.A. designed the research. Y.Q. and S.-L.L. performed the experiments and analyzed the data. Y.Q. and K.L.A. wrote the manuscript.

Funding information: This work was supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (to KLA), a Postgraduate Fellowship from NSERC (to YQ), and grants from the Ministry of Science and Technology, Taiwan (to SLL).

Downloaded from on June 18, 2017 - Published by www.plantphysiol.org
Copyright © 2017 American Society of Plant Biologists. All rights reserved.

Abstract

Duplicated genes are a major contributor to genome evolution and phenotypic novelty. There are multiple possible evolutionary fates of duplicated genes. Here we provide an example of concerted divergence of simultaneously duplicated genes whose products function in the same complex. We studied PRC2 (Polycomb Repressive Complex 2) in Brassicaceae.
The VRN-PRC2 complex contains VRN2 and SWN, and both genes were duplicated during a whole genome duplication to generate FIS2 and MEA, which function in the Brassicaceae-specific FIS-PRC2 complex that regulates seed development. We examined expression of FIS2, MEA, and their paralogs, compared their cytosine and histone methylation patterns, and analyzed the sequence evolution of the genes. We found that FIS2 and MEA have reproductive-specific expression patterns that are correlated and derived from the broadly expressed VRN2 and SWN in outgroup species. In vegetative tissues of Arabidopsis, repressive methylation marks are enriched in FIS2 and MEA, whereas active marks are associated with their paralogs. We detected comparable accelerated amino acid substitution rates in FIS2 and MEA but not in their paralogs. We also show divergence patterns of the PRC2-associated VEL2 that are similar to FIS2 and MEA. These lines of evidence indicate that FIS2 and MEA have diverged in concert, resulting in functional divergence of the PRC2-complexes in Brassicaceae. This type of concerted divergence is a previously unreported fate of duplicated genes. In addition, the Brassicaceae-specific FIS-PRC2 complex modified the regulatory pathways in female gametophyte and seed development.

Introduction

Duplicated genes are continuously formed during evolution by various types of gene duplication events in eukaryotes and they can have effects on morphological and physiological evolution (reviewed in Van de Peer et al., 2009; Soltis and Soltis 2016). Gene duplication can happen at small scales, such as tandem duplication, segmental duplication, and duplicative retroposition.
The largest scale of gene duplication is whole genome duplication (WGD), which gives rise to thousands of duplicated gene pairs. The genetic model plant, Arabidopsis thaliana, has experienced five rounds of WGD events in the evolutionary history of seed plants (Jiao et al., 2011; Li et al., 2015). The most recent polyploidy event, the alpha WGD, is specific to the Brassicaceae family and took place after the divergence of the closest sister family, Cleomaceae (Schranz and Mitchell-Olds, 2006). There are about 2500 pairs of duplicated genes retained from this WGD in the Arabidopsis thaliana genome (Blanc et al., 2003; Bowers et al., 2003).

Fates of duplicated genes vary during evolutionary history. One duplicate may eventually be lost or become a pseudogene, so that the once duplicated pair returns to a single-copy status. Several mechanisms drive the retention of both copies. Duplicated pairs could preserve similar functions to maintain dosage balance (Birchler et al., 2005; Coate et al., 2016). Duplicated pairs can also diverge through subfunctionalization or neofunctionalization, where two duplicated genes divide the ancestral function or gain a novel function, respectively (Force et al., 1999; Moore and Purugganan, 2005). These types of divergence can also be inferred from expression patterns. For example, the case in which two duplicates together make up the pre-duplicate expression profile is referred to as regulatory subfunctionalization, and regulatory neofunctionalization indicates that one or both copies gain a new expression pattern (Duarte et al., 2006; Liu et al., 2011). Sometimes these processes are difficult to distinguish, and there can be a combination of different mechanisms such as sub-neofunctionalization (He and Zhang, 2005).

There are many protein complexes whose members are encoded by different gene families.
If multiple components in a complex are duplicated simultaneously, such as in a whole genome duplication, the doubled components could redundantly cross-interact, or go on to experience subsequent divergence (Capra et al., 2012; Aarke et al., 2015). Thus a type of co-evolution between the interacting gene products is hypothetically possible, but has not been described in the plant kingdom. Extending the concept of concerted divergence, which is discussed in the context of co-expression patterns of duplicated genes in the same metabolic or regulatory pathways (Blanc and Wolfe, 2004), we here propose the evolutionary scenario that simultaneous duplication of two genes whose products function together in a complex, followed by parallel evolution and divergence of each derived gene, can lead to functional divergence of the complexes.

In this study we focus on genes in PRC2 complexes (Polycomb Repressive Complex 2) in Brassicaceae species as a potential example to demonstrate the proposed scenario. Those complexes are histone modifiers, and regulate gene expression primarily by tri-methylation of lysine 27 on histone H3 (H3K27me3) associated with target genes, which leads to transcriptional repression (Hennig and Derkacheva, 2009; Mozgova et al., 2015). One type of PRC2, the VRN-complex that is present across all rosids, regulates vegetative tissue differentiation and, more importantly, the vernalization process that controls flowering time in Arabidopsis (Chen et al., 2009; Hennig and Derkacheva, 2009; Mozgova et al., 2015). The complex also represses autonomous seed coat development (Roszak and Kohler, 2011).
The VRN2 complex consists of four subunits: REDUCED VERNALIZATION RESPONSE 2 (VERNALIZATION2, VRN2), SET DOMAIN-CONTAINING PROTEIN 10 (SWINGER, SWN), and two WD-40 repeat proteins that act as the scaffold of the complex assemblies, FERTILIZATION-INDEPENDENT ENDOSPERM (FIE) and MULTICOPY SUPRESSOR OF IRA1 (MSI1). In Brassicaceae, the alpha WGD gave rise to a duplication of VRN2 to create its paralog FERTILIZATION INDEPENDENT SEED 2 (FIS2) and a duplication of SWN to create its paralog SET DOMAIN-CONTAINING PROTEIN 5 (MEDEA, MEA) (Fig. 1; Luo et al., 2009; Spillane et al., 2007). Substituting for their paralogous proteins, FIS2 and MEA, together with FIE and MSI1 (the alpha WGD paralogs of these two genes were lost), make up a new Brassicaceae-specific PRC2, referred to as the FIS-complex (Fig. 1). The FIS-complex functions in gametophyte and seed development, preventing female gamete proliferation before fertilization, and facilitating endosperm cellularization after fertilization (Hennig and Derkacheva, 2009). A typical fis phenotype, caused by a loss-of-function mutation in FIS2, MEA (also known as FIS1) or FIE (also known as FIS3), shows fertilization-independent embryogenesis, and other types of mutants have abnormal seed development, or even abolished seeds (Hennig and Derkacheva, 2009).

The observed divergence in the functions of the two kinds of PRC2 complexes leads to the hypothesis that FIS2 and MEA have undergone divergence in a concerted way to give rise to the FIS-complex. This study aimed to evaluate this hypothesis by examining expression patterns, DNA and histone methylation, and rates of sequence evolution in both genes compared with their paralogs.
We found evidence for parallel divergence of FIS2 and MEA from their paralogs in multiple ways that has accompanied functional divergence of the two complexes. This study supports a model of concerted divergence of simultaneously duplicated genes whose products function in a complex. This is a previously unreported fate of duplicated genes.

Results

FIS2 and MEA have specific and similar expression patterns in reproductive organs

FIS2 and MEA were formed by the alpha whole genome duplication, which is specific to the Brassicaceae family and occurred after the divergence of the Brassicaceae lineage from the Caricaceae lineage. After gene duplication, duplicated genes may experience expression divergence. We analyzed microarray data in Arabidopsis thaliana to compare the expression profiles of paralogous interacting gene pairs FIS2/VRN2 and MEA/SWN. We obtained two sets of ATH1 microarray data and analyzed them separately: 63 different organ types and developmental stages (Schmid et al., 2005), referred to as the ADA (Arabidopsis developmental atlas) dataset hereafter, and 42 different tissue types during seed developmental stages (Le et al., 2010), referred to as the ASA (Arabidopsis seed atlas) dataset. We first calculated the expression specificity (τ) of the four genes as defined by Yang and Gaut (2011). VRN2 and SWN have expression specificity values of 0.19 and 0.17 respectively, indicating that both genes have relatively broad expression in nearly all organ types included in the ADA dataset. In contrast, FIS2 and MEA have expression specificity values of 0.70 and 0.63 respectively, indicating an organ-specific expression pattern.
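As an illustration of how an expression specificity index of this kind can be computed, the following is a minimal sketch. It assumes the widely used form τ = Σ(1 − x_i/x_max)/(n − 1) over n samples; the exact normalization applied by Yang and Gaut (2011) or in this study (for example, log transformation of the signal) may differ, and the function name and the example profiles are hypothetical, not data from the study.

```python
from typing import Sequence


def expression_specificity(expression: Sequence[float]) -> float:
    """Return a tau-style specificity index for one gene.

    Assumed formula (not confirmed by the source text):
        tau = sum(1 - x_i / x_max) / (n - 1)
    0 means uniform expression across samples; 1 means expression
    in a single sample only.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need at least two samples")
    if min(expression) < 0:
        raise ValueError("expression values must be non-negative")
    peak = max(expression)
    if peak == 0:
        raise ValueError("gene is not expressed in any sample")
    return sum(1.0 - x / peak for x in expression) / (n - 1)


# Hypothetical profiles for illustration only (not values from the ADA dataset).
broad_profile = [8.0, 7.5, 8.2, 7.9, 8.1]        # similar level in every organ
restricted_profile = [0.1, 0.0, 0.2, 9.0, 7.5]   # high in two organs only

print(round(expression_specificity(broad_profile), 2))
print(round(expression_specificity(restricted_profile), 2))
```

Under this assumed formula, a broadly expressed gene gives a value near 0 and a gene restricted to a few organs gives a value closer to 1, which is the direction of the contrast reported for VRN2/SWN versus FIS2/MEA.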
We observed that the expression of FIS2 and MEA is restricted to flowers and siliques, and the absence of vegetative expression explains the high expression specificity. Yang and Gaut (2011) analyzed the ADA dataset and found that the recent WG duplicates have a median τ close to 0.2. Thus the values we observed for FIS2 and MEA are quite high, and those we observed for VRN2 and SWN are about average.

We also analyzed expression specificity (τ) in the ASA dataset (Fig. 2A). Similarly, the τ of FIS2 is 0.48 and that of MEA is 0.56, while VRN2 has a τ of 0.21 and SWN has 0.22. FIS2 and MEA thus show more tissue-specific expression in seed tissues. We broke down the ASA data and observed that FIS2 and MEA tend to be expressed in the triploid endosperm rather than in the diploid embryo or maternally derived seed coat. A 1000-replicate permutation test showed that the expression specificity differences in the FIS2-MEA and VRN2-SWN comparisons are not significant (Fig. S1), which is indicative of concerted divergence in their expression profiles. In contrast, the tissue specificity of expression is significantly different within the two duplicated pairs, VRN2-FIS2 and SWN-MEA, which is indicative of their regulatory divergence.

In addition to analyzing the expression index for those genes individually, we performed a correlation test to examine the association of the expression profiles of the four genes, as their products function in a complex (Fig. 2B). We found that the expression patterns of FIS2 and MEA are positively correlated in both the ADA and ASA datasets, while broadly expressed VRN2 and SWN are co-expressed. However, the expression of both FIS2 and MEA is negatively correlated to the expression of VRN2 and SWN.
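The co-expression comparison can be sketched as follows. This is a minimal example that assumes a Pearson correlation coefficient computed across samples; the text does not state which correlation statistic was used, and the function name and profiles below are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two expression profiles of equal length."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same samples")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical profiles over the same five samples (illustration only).
gene_a = [0.2, 0.1, 0.3, 6.0, 5.0]   # reproductive-specific
gene_b = [0.1, 0.3, 0.2, 5.5, 6.5]   # reproductive-specific
gene_c = [7.0, 7.5, 6.8, 3.0, 3.5]   # higher in vegetative samples

print(pearson_r(gene_a, gene_b) > 0)  # co-expressed pair
print(pearson_r(gene_a, gene_c) < 0)  # anti-correlated pair
```

A positive coefficient for a pair such as FIS2-MEA together with a negative coefficient against their paralogs is the pattern described in the text.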
The negative coefficients are around -0.5 (Fig. 2B), which is below 1% of the total alpha WG pairs analysed by Blanc and Wolfe (2004). Overall the FIS2-MEA expression patterns indicate parallel divergence from VRN2-SWN expression patterns in a concerted manner.

FIS2 and MEA acquired new expression patterns

As the microarray data from the Arabidopsis thaliana developmental expression atlas indicated, FIS2 and MEA both have an expression pattern that is restricted to reproductive organs, such as flowers and siliques, but not vegetative organs, including roots, stems, and leaves. We confirmed this result with RT-PCR (Fig. 3). In contrast, their paralogs, VRN2 and SWN, have a broad expression pattern in both vegetative and reproductive organs, and they are ubiquitously expressed in all examined organ types in our RT-PCR results (Fig. 3). To infer the ancestral expression pattern of the two gene pairs, we assayed the expression pattern of orthologs in Tarenaya hassleriana (formerly known as Cleome spinosa), Carica papaya and Vitis vinifera. Among those species with sequenced genomes, Tarenaya belongs to Cleomaceae, the most closely related sister group to Brassicaceae. Although Tarenaya has its own genome triplication after the divergence between Cleomaceae and Brassicaceae (Cheng et al., 2013), only a single copy each of the orthologous VRN2 and SWN has been retained. Carica is also in the order Brassicales. Vitis was chosen because its lineage has not experienced any whole genome duplication events since the gamma WGD during early eudicot evolution, which applies to Carica as well, and thus genes are frequently single copy in these taxa. These single-copy orthologs can facilitate the inference of ancestral expression pattern.
We confirmed that these sequences are true orthologs of FIS2/VRN2 and MEA/SWN by phylogenetic analysis of the gene families.

For both the FIS2/VRN2 and MEA/SWN pairs, their orthologs in Tarenaya, Carica and Vitis are widely expressed in all examined organ types, which is the same as VRN2 and SWN in Arabidopsis (Fig. 3). The absence of expression in vegetative organs is only observed in FIS2 and MEA. Collectively we inferred that the pre-duplication expression state is likely to be a broad expression pattern, which is reflected by VRN2 and SWN. The Brassicaceae FIS2 and MEA both lost expression in vegetative organs to become specifically expressed in reproductive organs.

FIS2 and MEA acquired novel epigenetic modifications

The epigenetic features of cytosine methylation and histone methylation are often associated with expression or silencing of genes. To examine the patterns of cytosine and histone methylation in organ types where expression of FIS2 and MEA was lost, we investigated the epigenetic variation among these genes in vegetative tissues including leaves, roots and seedlings of Arabidopsis thaliana (see methods for details). For DNA methylation, we found that cytosine methylation at CpG sites is enriched in the promoter region (defined as 1500 bp upstream of the transcription start site) of the FIS2 genomic sequence, but not in the gene body (Fig. 4). The opposite is found for VRN2, with the promoter region unmarked but the gene body highly methylated (Fig. 4). The same divergence of DNA methylation was found for MEA and SWN (Fig. 4). Cytosine methylation is enriched in the promoter region of MEA but only in the gene body of SWN.
The DNA methylation patterns in EMF2 and CLF, the more distant paralogs of VRN2 and SWN, respectively, also show gene body enrichment, the same as VRN2 and SWN, suggesting that the pattern of DNA methylation for FIS2 and MEA has changed after duplication. As promoter cytosine methylation is associated with transcriptional repression, and gene body methylation is indicative of expression activation (Suzuki and Bird 2008), this finding is consistent with the expression data. We did not examine methylation patterns in whole endosperm because in the ASA seed atlas dataset FIS2 and MEA showed variable expression patterns in different parts of the endosperm and different developmental stages.

We also examined histone methylation in the region of these genes in the seedlings of Arabidopsis thaliana based on the data generated by Roudier et al. (2011). Similar to DNA methylation, we found that VRN2, SWN, EMF2 and CLF have the same types of histone methylation, which are different from FIS2 and MEA (Table 1). We noticed that FIS2 and MEA lost H3K4me3, which is shared by all the other genes. Instead they gained a novel mark of H3K27me3. H3K4me3 is an activating mark, while its antagonistic mark H3K27me3 is repressive. This could help explain the expression of VRN2, SWN, EMF2 and CLF in the vegetative tissue, and the lack of expression of FIS2 and MEA. It is also notable that in the fie mutant, where the PRC2 function was supposed to be abolished, FIS2 and MEA lost their H3K27me3, but instead VRN2, SWN, EMF2 and CLF were marked by H3K27me3 (Bouyer et al., 2011). As H3K27me3 is regulated by PRC2 complexes, this finding suggests self- and cross-regulation among these genes.
With both DNA and histone modification comparative analyses we observed the convergent evolution of epigenetic features in FIS2 and MEA, divergent from their pre- and post-duplication paralogs.

Gene structural changes in FIS2 and MEA

FIS2 formed from VRN2 by duplication, and MEA duplicated from SWN, during the alpha whole genome duplication. Compared to VRN2, FIS2 in Arabidopsis thaliana lost three exons, called the E15-17 region (named after the 15th to 17th exons of Arabidopsis thaliana EMF2, not after VRN2) (Fig. S2A; Chen et al., 2009). FIS2 has a large serine-rich domain that is not shared with any other VEF genes in any species, indicating gain of the domain in Brassicaceae (Fig. S2A; Chen et al., 2009). Our sequence analysis showed that the serine-rich domain is highly variable among FIS2 sequences from different Brassicaceae species (Fig. S2A). The lost E15-17 domain and the gained serine-rich domain both neighbour the VEF domain that interacts with the C5 domain in MEA.

MEA is about 150 aa shorter than SWN because of a large shrinkage in a single exon (the 9th in Arabidopsis thaliana MEA and SWN), in a region where Brassicaceae SWN and orthologous SWN-like sequences are not conserved (Fig. S2B). The deleted region is just downstream of the C5 domain, which interacts with the VEF domain in FIS2. How the structural changes affect the physical interaction of FIS2 and MEA remains to be tested. In addition to the rearrangement of functional domains, the shared domains show different levels of amino acid sequence divergence.
In contrast, VRN2/EMF2-like sequences and SWN/CLF-like sequences show relative conservation across all flowering plants in amino acid sequences and functional domains (Chen et al., 2009; Qian et al., 2014).

FIS2 and MEA show accelerated amino acid substitution rates and evidence for positive selection

Duplicated genes not only diverge in expression pattern but also in their sequences. We first performed Ka/Ks analysis of the full-length coding regions of the FIS2, VRN2, MEA, and SWN genes (Fig. S3). The Brassicaceae FIS2 clade had a much higher average Ka/Ks than the VRN2 lineages: 3.5-fold greater than the paralogous Brassicaceae VRN2 clade, and 10-fold greater than the orthologous pre-duplication VRN2 sequences. Similarly, the Brassicaceae MEA clade had a high average Ka/Ks comparable to the FIS2 clade, which is 3.5-fold greater than the paralogous Brassicaceae SWN clade, and 4.5-fold greater than the orthologous pre-duplication SWN sequences. We implemented different models assuming similar vs. different Ka/Ks ratios in these clades, described in the methods section, and the likelihood ratio tests indicated that the divergence in sequence rate is significant (Table S1). These analyses indicate that while the paralogous Brassicaceae VRN2 and SWN lineages are under stronger purifying selection along with the orthologous genes in outgroup species, FIS2 and MEA in the Brassicaceae have experienced relaxation of purifying selection. Asymmetric Ka/Ks ratios are seen in a minority of duplicated gene pairs in Arabidopsis thaliana; for example, Gossmann and Schmid (2011) estimated that 7% of the duplicated pairs they analyzed have asymmetric Ka/Ks ratios.
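To make the model comparison concrete, the following minimal sketch shows how a likelihood ratio test between two nested models (for example, one Ka/Ks ratio shared by all clades versus a separate ratio for one clade) is typically evaluated. It assumes the test statistic 2(lnL_alt − lnL_null) is compared against a chi-square distribution whose degrees of freedom equal the number of extra parameters. The log-likelihood values are hypothetical and are not taken from Table S1, and the helper names are illustrative.

```python
import math


def chi2_survival(stat: float, df: int) -> float:
    """Upper-tail chi-square probability, closed forms for df = 1 and df = 2 only."""
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise NotImplementedError("this sketch only handles df = 1 or df = 2")


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (statistic, p-value) for nested models.

    lnl_null: log-likelihood of the simpler model (e.g. one shared Ka/Ks ratio)
    lnl_alt:  log-likelihood of the richer model (e.g. a separate ratio per clade)
    df:       number of additional free parameters in the richer model
    """
    # The richer model cannot fit worse in theory; clamp small numerical negatives.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_survival(stat, df)


# Hypothetical log-likelihoods for illustration only.
stat, p_value = likelihood_ratio_test(lnl_null=-5230.4, lnl_alt=-5221.9, df=1)
print(f"2*delta(lnL) = {stat:.2f}, p = {p_value:.2e}")
```

A small p-value in such a comparison would favour the model in which the clades have different Ka/Ks ratios, which is the kind of support the text reports for the FIS2 and MEA clades.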
Additionally, among the branch-wise Ka/Ks of specific FIS2 and MEA sequences, we detected possible positive selection, indicated by Ka/Ks greater than one, acting on the sequences from certain lineages (Fig. S3). In order to distinguish certain amino acid sites evolving under positive selection from relaxed purifying selection, we also applied a branch-site model, which suggested that both branches leading to Arabidopsis FIS2 (P<0.0001) and MEA (P=0.007) have positively selected amino acid sites across different functional domains (Fig. S4).

Thus we further studied the sequence evolution of characterized functional domains of FIS2/VRN2 and MEA/SWN genes, including the VEF and C2H2 domains in the FIS2 and VRN2 genes, and the C5, SET, SANT and CXC domains in the MEA and SWN genes (Fig. 5, Fig. S3). We observed that the trend of acceleration in sequence evolution of FIS2 and MEA, and evolutionary constraint resulting in the conservation of VRN2 and SWN, was reflected by all the functional domains we analyzed individually. The VEF domain in FIS2/VRN2 genes and the C5 domain in MEA/SWN genes physically interact with each other, thus the comparison between the two sets of Ka/Ks ratios best describes the co-evolution between FIS2 and MEA at the coding sequence level from a protein-protein interaction perspective (Fig. 5). Consistent with the full-length gene analyses, the VEF domain in the FIS2 lineages and the C5 domain in the MEA lineages both have accelerated amino acid substitution rates, with evidence (Ka/Ks > 1) suggesting positive selection on a few branches (Fig. S3; Table S1). Similar results were found in the DNA binding related domains, C2H2 in FIS2/VRN2, CXC and SANT in MEA/SWN genes (Fig. S3), indicating that the PRC2 complexes with FIS2 and MEA may have affinity to specific DNA regions, regulating a novel network of gene expression.
The SET domain plays the role of methyltransferase in the PRC2 complex, and is usually highly conserved across eukaryotes (Baumbusch et al., 2001). This is reflected by the low Ka/Ks ratios detected in the SWN SET domains (Fig. S3). Instead, the SET domain in the Brassicaceae MEA shows evidence for positive selection (Fig. S4). The rapid amino acid substitution rates in the PRC2 functional domains together likely relate to the functional divergence of the PRC2 complexes containing FIS2 and MEA.

VEL2 and VEL1, which interact with PRC2 complexes, show corresponding divergence patterns to FIS2/VRN2 and MEA/SWN

A family of five PHD finger proteins is necessary for the core PRC2 complex to maintain the repressed status of chromatin (Kim and Sung, 2010; Kim and Sung, 2013). Among them, VERNALIZATION5/VIN3-LIKE 1 (VEL1) and VEL2 are a pair of alpha whole genome duplicates. VEL2 is a maternally expressed imprinted gene (Wolff et al. 2011). We analyzed their expression profile in the ADA and ASA microarray datasets, and detected that VEL1 shows a co-expression pattern with VRN2 and SWN, which is similar to the broadly expressed VEL homologs, whereas VEL2 has a similar expression pattern to FIS2 and MEA due to loss of vegetative expression (Fig. 6A; Qiu et al., 2014). VEL2 has a higher specificity than its paralog VEL1 (Fig. 6B). Thus the observed concerted divergence in expression pattern in the FIS-complex is not limited to the core complex, but also includes other associated proteins.
For cytosine methylation in the vegetative tissue, VEL1 is marked through the coding exons but not the promoter region, whereas VEL2 has cytosine methylation enriched in the upstream promoter region and the first two introns located between the 5’UTR exons (Schmitz et al., 2013; Stroud et al., 2013; Zemach et al., 2013). For histone methylation in the vegetative tissue, VEL1 is marked by activating marks, including H3K4me3, H3K36me3 and H3K4me2 (Roudier et al., 2011). VEL2 has lost H3K4me3 and H3K36me3, but instead gained the repressive mark H3K27me3. Those epigenetic features not only correspond to the vegetative expression level, but also are consistent with the divergence of the core PRC2 components FIS2 and MEA (Table 1).

We further analyzed the sequence evolution of the VEL genes. The VEL2 sequences have an elevated average Ka/Ks ratio compared to the Brassicaceae VEL1 and orthologous VEL genes (Fig. 6B). While VEL1 and orthologous sequences have a low Ka/Ks ratio close to zero indicating strong purifying selection, a three-fold change in VEL2 sequences suggests the relaxation of purifying selection. This coincides with the accelerated amino acid substitution rates of FIS2 and MEA.

Discussion

Concerted divergence of FIS2 and MEA in the FIS-PRC2 Complex

Upon gene duplication, hypothetically two duplicates are identical in function, as well as expression pattern if the cis-elements also are entirely duplicated.
Many proteins function through interactions with other proteins, whether in a regulatory or metabolic pathway, through protein-protein interaction, or as part of an integral complex. In such cases, either the duplicates are redundant or, if they have diverged, both duplicates could integrate into either complex and affect the function of the complex. A shift in expression pattern would be one way to avoid potentially disadvantageous crosstalk between interacting members (Aarke et al., 2015). Blanc and Wolfe (2004) described a process of concerted divergence of gene expression in Arabidopsis thaliana, in which pairs of duplicates whose protein products interact diverge in a parallel manner in expression pattern. However, as FIS2 and VRN2 were not identified as alpha WG duplicates by the genome-wide study (Blanc et al., 2003), their concerted divergence in expression pattern with MEA and SWN was not included.

Here we show that FIS2 and MEA diverged in expression pattern in a concerted manner, modified from co-expressed VRN2 and SWN whose expression pattern resembles the ancestral status. In addition, we show that cytosine methylation and histone methylation patterns in FIS2 and MEA also diverged in a concerted manner. It is possible that the methylation change contributed to the changes in expression patterns, although mutations in regulatory elements may also have played a role in the expression pattern changes. FIS2 and MEA are marked by H3K27me3 in the vegetative tissue, suggesting they both became the targets of a vegetative PRC2 complex after formation by gene duplication (Bouyer et al., 2011). In addition to the vegetative epigenetic divergence, FIS2 and MEA are well known as imprinted genes during seed development, both of which are maternally expressed genes (Berger and Chaudhury, 2009). Based on the genome-wide datasets from Hsieh et al. (2011) and Gehring et al.
(2011), we determined that VRN2 and SWN are not imprinted, while the more distant relatives in their gene families, EMF2 and CLF, also lack evidence for imprinting. Thus we infer that FIS2 and MEA became imprinted genes after their divergence from VRN2 and SWN. This concerted change in regulation of both genes ensures the dosage balance between the interacting proteins. The concerted divergence of FIS2 and MEA from their paralogs is also reflected by the elevated Ka/Ks ratios in the coding sequences at comparable levels, suggesting similar relaxed purifying selection is acting on the two genes. Altogether these changes indicate that FIS2 and MEA have been diverging in concert in multiple ways, which likely contributed to the divergence in functions between the FIS2-PRC2-complex and the VRN2-PRC2-complex.

Functional divergence in the FIS-PRC2 Complex

VRN2, SWN/CLF, FIE, and MSI1 form the VRN-complex, which regulates vernalization to control flowering time in Arabidopsis (Fig. 1; Hennig and Derkacheva, 2009). The complex also represses autonomous seed coat development (Roszak and Kohler, 2011). The FIS-complex contains FIS2, MEA, FIE, and MSI1. The FIS-complex is important in gametophyte and seed development and it has two major functions. Before fertilization, the FIS-complex prevents proliferation of the central cell of the female gametophyte, so that seed development does not start until after fertilization (Hennig and Derkacheva, 2009). The FIS-complex also acts post-fertilization. It is needed for regulating endosperm cellularization during seed development (Hehenberger et al., 2012).
FIS2 mutants show a phenotype of abnormal female gametophyte development into embryos and are defective in controlling central cell proliferation in the female gametophyte, suggesting that FIS2 is not redundant with VRN2 in the pre-fertilization function (Roszak and Kohler, 2011). Thus this function in the female gametophyte is specific to the FIS-complex and is not shared with the VRN-complex. MEA was also shown to not be redundant with SWN (Roszak and Kohler, 2011). Unlike mutants of the key components of the FIS-complex, a SWN mutant showed neither autonomous seed development in the absence of fertilization nor seed abortion with embryo and endosperm overgrowth (Luo et al., 1999); thus it is possible that MEA is functionally specialized for the pre-fertilization function of the FIS-complex and cannot be complemented by SWN. As for the post-fertilization function, SWN was shown to be not essential in seed development (Spillane et al., 2007). Thus it was proposed that MEA underwent neofunctionalization to gain a post-fertilization role in regulating seed development after its duplication from SWN (Spillane et al., 2007). Taking the two parts of the FIS-complex functions together, it appears that the novel PRC2 made up by FIS2 and MEA created a Brassicaceae-specific complex for preventing seed development prior to fertilization and facilitating seed development after fertilization. This functional divergence complements the concerted divergence of FIS2 and MEA in other ways that we show in this study.
The FIS-complex also plays an important role in establishing imprinted expression of many genes in the endosperm, especially paternally expressed imprinted genes, as the differentially methylated paternal or maternal allele can affect the targeting by this complex (Wolff et al., 2011; Kohler et al., 2012). The concerted divergence of FIS2 and MEA in expression patterns, methylation patterns, and accelerated sequence evolution may have contributed to functional diversification or potentially neofunctionalization of the FIS-PRC2 complex. An alternative to neofunctionalization of the FIS-PRC2 complex is subfunctionalization after the formation of FIS2 and MEA from their paralogs. Without knowledge of the ancestral function of the PRC2 complex in plants closely related to the Brassicaceae, discussed below, we cannot determine whether there has been neofunctionalization or subfunctionalization. We show in this study that there has been regulatory neofunctionalization of FIS2 and MEA, which leads us to favor the possibility of neofunctionalization of the complex. Nonetheless, under a scenario of subfunctionalization, FIS2 and MEA still show concerted divergence in their expression patterns, cytosine and histone methylation, and accelerated sequence evolution. To distinguish between the two hypotheses, more research on VRN-complexes in rosid species is needed to infer the function of the ancestral rosid PRC2 complex.
How are the FIS-complex functions performed in other angiosperms outside of Brassicaceae? Some clues come from studies of FIE, which is a member of the FIS-complex, in Hieracium piloselloides (Asteraceae).
The central cell proliferation phenotype of Arabidopsis fie mutants is not seen in sexual Hieracium FIE RNAi lines; thus a PRC2 complex does not regulate central cell proliferation in the female gametophyte of Hieracium, in contrast to Arabidopsis (Rodrigues et al., 2008). This might indicate that part of the pre-fertilization function of FIS-PRC2 in Brassicaceae is an evolutionary innovation; alternatively, the unknown mechanism repressing central cell proliferation may be specific to the Hieracium lineage. FIE down-regulation in Hieracium leads to seed abortion (Rodrigues et al., 2008) and thus FIE is important for seed development, presumably as part of a PRC2 complex. Asterids do not contain FIS2, VRN2, or MEA. Thus, if there is a PRC2 complex regulating seed development in asterids, it probably contains the products of lineage-specific polycomb genes and uses a mechanism that evolved independently of that in Brassicaceae. In maize and rice there has been duplication of FIE (Luo et al., 2009; Li et al., 2014). Thus the grasses may have PRC2 complexes that are divergent from the ancestral state. The requirement of H3K27me3 in rice and maize endosperm for establishment of imprinting suggests the functional conservation or convergence of a PRC2 complex in Brassicaceae and Poaceae (Makarevitch et al., 2013; Zhang et al., 2014).

Evolution of protein complexes after the duplication of components
We propose the model of simultaneous gene duplication and concerted divergence of one copy of each duplicated pair (Fig. 7). Following formation by duplication, two genes whose products function together in a complex diverge in similar ways and the complex diverges in function.
This divergence pattern is not limited to neofunctionalization or subfunctionalization, but includes other modifications of these scenarios such as escape from adaptive conflict. The PRC2 complexes in Brassicaceae we examined in this study provide the first example of this type of divergence of duplicated genes. We contrast this scenario with single-gene-duplication and divergence, in which one component of the complex undergoes gene duplication and the paralogs then diverge, so that the two complexes containing either paralog diverge in function as a result. Intuitively, many described functionally divergent paralogs may contribute to this type of divergence of their protein complexes. One example is the centromere-defining histone variant CENH3 in the histone core octamers, which shows duplication specific to the genus Mimulus and sequence divergence, whereas other components in the histone core octamers do not show duplications specific to Mimulus (Finseth et al., 2015). Another case is the telomere-associated proteins POT1a and POT1b in the telomerase RNP complexes in Brassicaceae, where POT1a experienced positive selection that enhanced its affinity with interacting proteins (Beilstein et al., 2015). A variation on this model is when there is a subsequent gene duplication at a later time of another gene whose product functions in the complex, followed by divergence. An example is the plant-specific RNA polymerases IV and V, where rounds of independent lineage-specific duplications and subsequent divergence of varying kinds of subunits have increased RNA polymerase complexity and specificity among different plant groups (Wang et al., 2015).

Concerted divergence of the functionally associated VELs and some PRC2 targets
The VEL genes, VEL1 and VEL2, which are required to maintain and facilitate polycomb transcriptional repression, interact with the PRC2 complex but are not part of the complex itself. Our expression, methylation, and sequence analysis results indicate that VEL2 has similar patterns to FIS2 and MEA, whereas VEL1 has similar patterns to VRN2 and SWN. Thus VEL2 appears to be diverging in concert with FIS2 and MEA. VEL2 is also a maternally expressed gene and regulated by the FIS-complex in the endosperm (Wolff et al., 2011), and VEL2 works together with the FIS core complex to impose maternal regulation in seed development similar to FIS2 and MEA.
Several PRC2 targets duplicated through the alpha WGD show similar patterns of divergence as well. PKR2 and JMJ15 are FIS-PRC2 regulated imprinted genes (Hsieh et al., 2011; Wolff et al., 2011), whereas their paralogs, PKL and JMJ18, show broad expression, are not imprinted, and are associated with a vegetative PRC2 complex (Aichinger et al., 2011; Yang et al., 2012; Zhang et al., 2012). Out of 46 imprinted genes regulated by FIS2 (Wolff et al., 2011), we identified 41 Brassicaceae-specific duplicated genes. Some of those genes have roles in seed development, such as PHERES1 (Kohler et al., 2003; Villar et al., 2009) and ADMETOS (Kradolfer et al., 2013). Thus, there are new Brassicaceae-specific genes involved in seed development that are regulated by the FIS-PRC2 complex. The functional innovation of the FIS-complex appears to have rewired, to some extent, the regulatory pathway of seed development specific to Brassicaceae.
Simultaneous gene duplication events, such as polyploidy, give rise to pairs of duplicated genes that can then co-diverge (Shan et al., 2009).
Many of the PRC2 target genes discussed in the previous paragraph were derived from the alpha whole genome duplication. FIS2, MEA, and VEL2 also were derived from that WGD. Thus this study illustrates the potential of concerted divergence after simultaneous gene duplication to affect functions as well as regulation of other genes.

Materials and Methods

Comparing expression specificity and detecting co-expression using microarray data analyses
Two sets of ATH1 microarray data from Arabidopsis thaliana were obtained: the Arabidopsis development atlas (ADA) from the TAIR website (http://www.Arabidopsis.org/), which included 63 different organ types and developmental stages (Schmid et al., 2005), and the Arabidopsis seed development atlas (ASA) from the Goldberg Lab Arabidopsis thaliana Gene Chip Database (http://estdb.biology.ucla.edu/genechip/), which included 42 different tissue types from seed developmental stages (Le et al., 2010). The data were GC-RMA normalized using the gcrma package in R. We used the expression specificity (τ) defined by Yang and Gaut (2011) to describe the expression patterns of FIS2, VRN2, MEA and SWN:

$$\tau = \frac{\sum_{j=1}^{n}\left[1 - S(i,j)/S(i,\max)\right]}{n-1},$$

where n is the total number of samples (63 or 42), S(i,j) is the log2-transformed expression value of gene i in sample j, and S(i,max) is the highest log2-transformed expression value of gene i across the n organ types. High values of expression specificity indicate genes with expression limited to few organ or tissue types or developmental stages, while low values of expression specificity indicate broad expression of genes with similar expression levels in most of the organ or tissue types and developmental stages. To test if there is any significant difference of expression specificity between any two of the four genes, we
applied a Monte Carlo randomization test with 1000 randomizations to each two-gene comparison. For the Monte Carlo randomization test, we computed the following statistic: DIF = |τ_GENE1 - τ_GENE2|, where DIF indicates the absolute difference of expression specificity between two genes. Then, we compared the observed value (DIFobs) against the null distribution of simulated DIF values (DIFsim) from 1000 randomized data sets. If the null hypothesis is rejected, the expression specificity of the two compared genes is significantly different. The cutoff of the significant P value was set to 0.05.
In addition to the comparison of expression specificity among gene pairs, we applied the Pearson correlation analysis to determine if the expression profiles of any two genes showed any evidence of co-expression (i.e. correlated expression across different organ types or tissue types). Two genes are considered co-expressed when the Pearson correlation coefficient (r) is significantly positive; a significantly negative r indicates negatively correlated expression.

Inferring the ancestral expression states using RT-PCR
Total RNA samples of Arabidopsis thaliana, Tarenaya hassleriana (formerly known as Cleome spinosa), Carica papaya, and Vitis vinifera were extracted from tissue frozen in liquid N2 from five organ types: root, stem, leaf (rosette leaves in Arabidopsis thaliana), flower, and seed (whole siliques in Arabidopsis thaliana and Tarenaya hassleriana). A modified CTAB method was used for RNA extraction (Zhou et al., 2011). The quality of each RNA sample was checked on 2% agarose gels by electrophoresis, and the amount of each RNA sample was determined by a Nanodrop spectrophotometer. After DNaseI (Invitrogen) treatment to remove residual DNA, M-MLV reverse transcriptase (Invitrogen) was applied to the RNA samples to generate cDNA, according to the manufacturer’s instructions.
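For the microarray analysis described earlier in this section, the following Python sketch shows one way the expression specificity index, the DIF randomization test, and the Pearson correlation could be computed. It is illustrative only and is not the authors' code. The function names are hypothetical, the randomization scheme (randomly swapping the two genes' values within each sample) is an assumption because the text does not state how the randomized data were generated, and expression values are assumed to be positive log2 intensities.

```python
import math
import random


def tau(expr):
    """Expression specificity: sum over samples of (1 - S_j / S_max), divided by (n - 1)."""
    n = len(expr)
    s_max = max(expr)  # assumed > 0 (log2 intensities)
    return sum(1.0 - s / s_max for s in expr) / (n - 1)


def dif_randomization_test(expr1, expr2, n_iter=1000, seed=0):
    """Return (DIF_obs, P), where P is the fraction of randomized DIF values >= DIF_obs."""
    rng = random.Random(seed)
    dif_obs = abs(tau(expr1) - tau(expr2))
    extreme = 0
    for _ in range(n_iter):
        sim1, sim2 = [], []
        for a, b in zip(expr1, expr2):
            # Assumed null model: gene labels are exchangeable within each sample.
            if rng.random() < 0.5:
                a, b = b, a
            sim1.append(a)
            sim2.append(b)
        if abs(tau(sim1) - tau(sim2)) >= dif_obs:
            extreme += 1
    return dif_obs, extreme / n_iter


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy profiles (made-up numbers, not data from the study).
broad = [8.0, 7.5, 8.2, 7.9, 8.1]
narrow = [2.0, 2.1, 2.0, 9.0, 2.2]
print(tau(broad), tau(narrow))
print(dif_randomization_test(broad, narrow))
print(pearson_r(broad, narrow))
```

The sketch reports only the correlation coefficient; assessing whether r is significantly positive or negative would require an additional test that is not shown here.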
PCR was performed with cDNA templates to detect the organ-specific expression of Arabidopsis FIS2/VRN2 and MEA/SWN paralogous pairs, as well as orthologous genes in outgroup species for inference of the ancestral, pre-duplication expression states. Gene-specific primers were designed to amplify 250-1000 bp of the cDNA of targeted genes (Table S2). For PCR reactions, the cycling programs were: preheating at 94℃ for 3 minutes; 30-35 cycles of denaturing at 94℃ for 30 seconds, annealing at 53-56℃ for 30 seconds, and elongation at 72℃ for 30 seconds or 1 minute; and a final elongation at 72℃ for 7 minutes. PCR products were checked on 1% agarose gels, and sequenced to confirm identity.

Identifying epigenetic marks associated with the studied genes
We investigated the epigenetic modifications around the genomic regions of Arabidopsis FIS2, VRN2, MEA and SWN. We also used EMF2 and CLF, which are members of the FIS2/VRN2 and MEA/SWN families, respectively, to help assess the ancestral state. For DNA methylation, we obtained data from Schmitz et al. (2013), Stroud et al. (2013) and Zemach et al. (2013) from CoGe (https://genomevolution.org/CoGe/), visualized by JBrowse in Araport (https://www.araport.org/). Analyzed data included assayed genomic DNAs from leaves in Schmitz et al. (2013) and Stroud et al. (2013), and assayed genomic DNAs from seedlings and roots in Zemach et al. (2013), which were all vegetative organs. Cytosine methylation at CpG sites was analyzed along the genomic region of a target gene. For histone methylation, we extracted tiling-array data from seedlings from Roudier et al. (2011) and ChIP-on-chip data from wild type and fie mutant seedlings from Bouyer et al. (2011).
Four histone marks were analyzed: tri-methylation of lysine 27 on histone H3 (H3K27me3), tri-methylation of lysine 4 on histone H3 (H3K4me3), di-methylation of lysine 4 on histone H3 (H3K4me2) and tri-methylation of lysine 36 on histone H3 (H3K36me3). The epigenetic features in Arabidopsis seedlings were compared among the paralogous genes in a family and between the two interacting gene families.

Detecting accelerated sequence evolution and positive selection by Ka/Ks analyses
To analyze the selection acting on the gene pairs FIS2/VRN2 and MEA/SWN, several rate analyses were performed using Codeml in the PAML package (Yang, 2007). We obtained the sequences of the four genes from Arabidopsis thaliana, as well as some other Brassicaceae species, including Arabidopsis lyrata, Arabidopsis halleri, Capsella rubella, Brassica rapa, Brassica oleracea, Eutrema salsugineum (formerly known as Thellungiella halophila), and Schrenkiella parvula (formerly known as Thellungiella parvula). We also identified orthologous sequences, by reciprocal best BLAST hits, from species outside of the Brassicaceae, including Tarenaya hassleriana (formerly known as Cleome spinosa), Carica papaya, Gossypium raimondii, Theobroma cacao, Citrus sinensis, Populus trichocarpa, Ricinus communis, Manihot esculenta, and Vitis vinifera, from PLAZA v3.0 Dicots (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/; Proost et al., 2015), Phytozome v10 (http://phytozome.jgi.doe.gov/pz/portal.html; Goodstein et al., 2012), the BRAD database (http://brassicadb.org/brad/; Cheng et al., 2011) and NCBI’s GenBank. Gene orthology was later confirmed by comparing the topology of the gene phylogeny to the species tree.
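The reciprocal best BLAST hit criterion mentioned above can be illustrated with a short Python sketch. It assumes the BLAST results have already been parsed into (query, subject, bit score) tuples; the function names, gene identifiers, and scores are hypothetical, and the sketch is not the authors' pipeline.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {q: s for q, (s, _) in best.items()}


def reciprocal_best_hits(hits_a_vs_b, hits_b_vs_a):
    """Return sorted (a, b) pairs in which each gene is the other's best hit."""
    a_to_b = best_hits(hits_a_vs_b)
    b_to_a = best_hits(hits_b_vs_a)
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a)


# Hypothetical identifiers and scores, for illustration only.
a_vs_b = [("geneA1", "geneB1", 520.0), ("geneA1", "geneB2", 310.0),
          ("geneA2", "geneB1", 400.0)]
b_vs_a = [("geneB1", "geneA1", 515.0), ("geneB2", "geneA1", 305.0)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy input only geneA1 and geneB1 select each other, so geneA2 is not paired. A reciprocal best hit is only a candidate ortholog, which is why the candidates were then checked against the species tree as described above.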
Alignments of amino acid sequences were generated using MUSCLE under default parameters (Edgar, 2004), and then reverse translated into codon alignments using a custom Perl script. We generated the alignments for the full length of the two gene families, as well as some documented functional domains, including the VEF and C2H2 domains in the FIS2 and VRN2 genes, and the C5, SET, SANT and CXC domains in the MEA and SWN genes. Phylogenies of the two gene families were analyzed by RAxML v.7.0.3 with GTR as the substitution matrix (Stamatakis, 2006). ML trees of the two gene families were generated based on codon alignments.
We first used a phylogeny-based free-ratio test to estimate branch-wise Ka/Ks ratios along the phylogenetic tree branches. For the full-length FIS2/VRN2 genes we implemented four different models to test if the Ka/Ks ratios of the Brassicaceae FIS2 clade and the Brassicaceae VRN2 clade display an asymmetric pattern, and how conserved they are compared to the orthologous genes. The first model (Model I: one-ratio model) assumes that all the genes have the same Ka/Ks ratio, corresponding to the hypothesis that all genes are under the same level of selection. The second model (Model II: two-ratio model-1) assumes that the Brassicaceae VRN2 clade and the orthologous genes have the same Ka/Ks ratio, but the Brassicaceae FIS2 clade can have a different one, suggesting that the Brassicaceae VRN2 clade reflects the ancestral selection but FIS2 evolved in a different manner. The third model (Model III: two-ratio model-2) assumes the duplicated FIS2 and VRN2 clades in Brassicaceae have the same Ka/Ks ratio, while the orthologs can have a different ratio, which corresponds to the hypothesis that the two Brassicaceae copies evolved at the same rate.
The fourth model (Model IV: three-ratio model) assumes that the two Brassicaceae branches have different Ka/Ks ratios, and thus the two genes evolved at different rates, with the third Ka/Ks ratio for the orthologous branches. A set of likelihood ratio tests was applied, in which twice the difference in log-likelihood values was calculated and compared against a chi-square distribution with the degree of freedom (df) set at one. The comparison between Model II and Model IV tests whether the selection on the Brassicaceae VRN2 is significantly different from that on the orthologous genes; the comparisons between Model I and Model II, and between Model III and Model IV, test whether the selection on the Brassicaceae FIS2 is different from that on the Brassicaceae VRN2 and/or the orthologous genes. When Model II fits better than Model I, and Model IV fits better than Model III with statistical support, the duplicated pair in Brassicaceae is considered to have evolved asymmetrically. The same analyses were performed on the functional domains of the FIS2/VRN2 genes, and the full-length MEA/SWN genes and their functional domains (Table S1). We also applied a branch-site model to detect positively selected sites along FIS2 as well as MEA. Test 2 of Model A with the Bayes Empirical Bayes analysis was applied to identify amino acid sites with a high posterior probability of positive selection (Zhang et al., 2005).

Table(s)
Table 1. Histone methylation of studied genes.
H3K27me3 VEF genes SET genes VEL genes FIS2 VRN2 EMF2 x MEA SWN CLF x VEL2 VEL1 x H3K4me3 H3K4me2 H3K36me3 x x x x x x x x x x x x x x x x x
x’s indicate presence of a particular type of histone methylation

SUPPLEMENTAL DATA
Fig. S1. Permutation test for microarray data to detect the difference in expression profile for all sets of comparisons of gene pairs in the ADA and ASA datasets.
Fig. S2. Structures of FIS2 and VRN2, along with MEA and SWN in Brassicaceae and other eurosids.
Fig. S3. Ka/Ks ratios of full-length FIS2/VRN2 and MEA/SWN genes and functional domains.
Fig. S4. Positive selection on specific sites of MEA and FIS2 genes.
Table S1. Ka/Ks ratios under different branch models for full-length FIS2/VRN2 and MEA/SWN genes and functional domains.
Table S2. Gene-specific primers used in this study.

Figure Legends
Figure 1. Two PRC2 complexes in Brassicaceae, the VRN-complex and the Brassicaceae-specific FIS-complex, arose by the alpha whole genome duplication where VRN2 duplicated to form FIS2, and SWN duplicated to form MEA.

Figure 2. A. Organ/tissue-specific expression indices based on two sets of microarray data. A large value indicates expression is restricted to fewer organ or tissue types while a low value indicates broad expression. B. Correlation of expression profile of each gene pair. Left: ADA set (63 organ types and developmental stages); right: ASA set (42 seed tissue types and developmental stages). Black arrows indicate a positive correlation and grey arrows indicate a negative correlation. The thickness of arrows indicates the level of the correlation coefficient. The correlation coefficient and p-value of expression profile of each gene pair are labeled along the arrows.
Bold values indicate positive correlation.

Figure 3. RT-PCR assays indicate that FIS2 and MEA have lost the ancestral vegetative expression pattern after duplication. Plus signs indicate reactions with reverse transcriptase and minus signs indicate controls with no reverse transcriptase. Species abbreviations include: At - Arabidopsis thaliana, Th - Tarenaya hassleriana, Cp - Carica papaya, and Vv - Vitis vinifera.

Figure 4. DNA methylation at the genomic region of the VEF-domain genes and SET-domain genes. CLF and EMF2 are ancient paralogs of SWN and VRN2, respectively. For each gene, four rows represent four replicates, and the dashed line separates 1500 bp upstream of the transcription start site. Vertical bars in each row represent the level of methylation.

Figure 5. Ka/Ks values of the interacting domains: VEF domain in FIS2/VRN2 and C5 domain in MEA/SWN. Estimated average Ka/Ks ratio of each clade is shown between the two trees. The values above branches are Ka/Ks ratios (the absence of a value indicates a lack of power to estimate the Ka/Ks ratio accurately in the PAML analysis). The black dots indicate the alpha WGD at the base of the Brassicaceae. The scale bars indicate 0.1 substitutions per codon. Species abbreviations include: At - Arabidopsis thaliana, Al - Arabidopsis lyrata, Cr - Capsella rubella, Sp - Schrenkiella parvula, Es - Eutrema salsugineum, Br - Brassica rapa, Bo - Brassica oleracea, Th - Tarenaya hassleriana, Cp - Carica papaya, Gr - Gossypium raimondii, Tc - Theobroma cacao, Pt - Populus trichocarpa, Rc - Ricinus communis, Me - Manihot esculenta and Vv - Vitis vinifera.

Figure 6. VEL2 and VEL1 expression and sequence evolution. A. Organ/tissue specificity of VEL genes. B.
Correlation of expression profile between VEL genes and PRC2 core components. Left: ADA set (63 organ types and developmental stages); right: ASA set (42 seed tissue types and developmental stages). Black arrows indicate positive correlation, and grey arrows indicate negative correlation. The thickness of arrows indicates the level of the correlation coefficient. The correlation coefficient and p-value of expression profile of each gene pair are labeled along the arrows. Bold values indicate positive correlations. C. DNA methylation at the genomic region of VEL genes (as in Fig. 4). D. Ka/Ks values of the VEL genes. Average Ka/Ks ratio of each clade is shown. The black dot at the node indicates a gene duplication event.

Figure 7. Schematic diagrams illustrating models of protein complex divergence. Colors indicate conservation vs. divergence (could be neofunctionalization, subfunctionalization, loss of partial function, and other types of divergence). A. Single-gene-duplication and divergence: a single gene (dark blue) in a complex is duplicated. After duplication there is subsequent divergence (light blue vs. red) of the ancestral gene (dark blue) to give rise to divergent protein complexes. B. Simultaneous-gene-duplication and concerted divergence: two (or more) genes (dark green + dark blue) were duplicated simultaneously. After duplication there is parallel divergence (light green + light blue vs. yellow + red) to give rise to divergent protein complexes. C. The PRC2 complexes in this study are an example of simultaneous-gene-duplication and concerted divergence.
Parsed Citations
Aakre CD, Herrou J, Phung TN, Perchuk BS, Crosson S, Laub MT. 2015. Evolving new protein-protein interaction specificity through promiscuous intermediates. Cell 163: 594-606.
Aichinger E, Villar CB, Di Mambro R, Sabatini S, Köhler C. 2011. The CHD3 chromatin remodeler PICKLE and polycomb group proteins antagonistically regulate meristem activity in the Arabidopsis root. Plant Cell 23: 1047-1060.
Baumbusch LO, Thorstensen T, Krauss V, Fischer A, Naumann K, Assalkhou R, Schulz I, Reuter G, Aalen RB. 2001. The Arabidopsis thaliana genome contains at least 29 active genes encoding SET domain proteins that can be assigned to four evolutionarily conserved classes. Nucleic Acids Research 29: 4319-4333.
Beilstein MA, Renfrew KB, Song X, Shakirov EV, Zanis MJ, Shippen DE. 2015. Evolution of the Telomere-Associated Protein POT1a in Arabidopsis thaliana is characterized by positive selection to reinforce protein-protein interaction. Molecular Biology and Evolution 32: 1329-1341.
Birchler JA, Riddle NC, Auger DL, Veitia RA. 2005. Dosage balance in gene regulation: biological implications. Trends in Genetics 21: 219-226.
Blanc G, Hokamp K, Wolfe KH. 2003. A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome. Genome Research 13: 137-144.
Blanc G, Wolfe KH. 2004. Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell 16: 1679-1691.
Berger F, Chaudhury A. 2009. Parental memories shape seeds. Trends in Plant Science 14: 550-556.
Bouyer D, Roudier F, Heese M, Andersen ED, Gey D, Nowack MK, Goodrich J, Renou JP, Grini PE, Colot V, Schnittger A. 2011. Polycomb repressive complex 2 controls the embryo-to-seedling phase transition. PLoS Genetics 7: e1002014.
Bowers JE, Chapman BA, Rong J, Paterson AH. 2003. Unraveling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422: 433-438.
Capra EJ, Perchuk BS, Skerker JM, Laub MT. 2012. Adaptive mutations that prevent crosstalk enable the expansion of paralogous signaling protein families. Cell 150: 222-232.
Casneuf T, De Bodt S, Raes J, Maere S, Van de Peer Y. 2006. Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana. Genome Biology 7: R13.
Chen LJ, Diao ZY, Specht C, Sung ZR. 2009. Molecular evolution of VEF-domain-containing PcG genes in plants. Molecular Plant 2: 738-754.
Cheng F, Liu S, Wu J, Fang L, Sun S, Liu B, Li P, Hua W, Wang X. 2011. BRAD, the genetics and genomics database for Brassica plants. BMC Plant Biology 11: 136.
Cheng S, van den Bergh E, Zeng P, Zhong X, Xu J, Liu X, Hofberger J, de Bruijn S, Bhide AS, Kuelahoglu C, Bian C, Chen J, Fan G, Kaufmann K, Hall JC, Becker A, Bräutigam A, Weber AP, Shi C, Zheng Z, Li W, Lv M, Tao Y, Wang J, Zou H, Quan Z, Hibberd JM, Zhang G, Zhu XG, Xu X, Schranz ME. 2013. The Tarenaya hassleriana genome provides insight into reproductive trait and genome evolution of crucifers. Plant Cell 25: 2813-2830.
Coate JE, Song MJ, Bombarely A, Doyle JJ. 2016. Expression-level support for gene dosage sensitivity in three Glycine subgenus Glycine polyploids and their diploid progenitors. New Phytologist doi: 10.1111/nph.14090.
Duarte JM, Cui L, Wall PK, Zhang Q, Zhang X, Leebens-Mack J, Ma H, Altman N, dePamphilis CW. 2006. Expression pattern shifts following duplication indicative of subfunctionalization and neofunctionalization in regulatory genes of Arabidopsis. Molecular Biology and Evolution 23: 469-478.
Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32: 1792-1797.
Finseth FR, Dong Y, Saunders A, Fishman L. 2015. Duplication and adaptive evolution of a key centromeric protein in Mimulus, a genus with female meiotic drive. Molecular Biology and Evolution 32: 2694-2706.
Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151: 1531-1545.
Gehring M, Missirian V, Henikoff S. 2011. Genomic analysis of parent-of-origin allelic expression in Arabidopsis thaliana seeds. PLoS One 6: e23687.
Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, Fazo J, Mitros T, Dirks W, Hellsten U, Putnam N, Rokhsar DS. 2012. Phytozome: a comparative platform for green plant genomics. Nucleic Acids Research 40 (Database issue): D1178-1186.
Gossmann TI, Schmid KJ. 2011. Selection-driven divergence after gene duplication in Arabidopsis thaliana. Journal of Molecular Evolution 73: 153-165.
He X, Zhang J. 2005. Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution. Genetics 169: 1157-1164.
Hehenberger E, Kradolfer D, Köhler C. 2012. Endosperm cellularization defines an important developmental transition for embryo development. Development 139: 2031-2039.
Hennig L, Derkacheva M. 2009. Diversity of Polycomb group complexes in plants: same rules, different players? Trends in Genetics 25: 414-423.
Hsieh TF, Shin JY, Uzawa R, Silva P, Cohen S, Bauer MJ, Hashimoto M, Kirkbride RC, Harada JJ, Zilberman D, Fischer RL. 2011. Regulation of imprinted gene expression in Arabidopsis endosperm. Proceedings of the National Academy of Sciences of the USA 108: 1755-1762.
Jiao Y, Wickett NJ, Ayyampalayam S, Chanderbali AS, Landherr L, Ralph PE, Tomsho LP, Hu Y, Liang H, Soltis PS, Soltis DE, Clifton SW, Schlarbaum SE, Schuster SC, Ma H, Leebens-Mack J, dePamphilis CW. 2011. Ancestral polyploidy in seed plants and angiosperms. Nature 473: 97-100.
Kim DH, Sung S. 2010. The Plant Homeo Domain finger protein, VIN3-LIKE 2, is necessary for photoperiod-mediated epigenetic regulation of the floral repressor, MAF5. Proceedings of the National Academy of Sciences of the USA 107: 17029-17034.
Kim DH, Sung S. 2013. Coordination of the vernalization response through a VIN3 and FLC gene family regulatory network in Arabidopsis. Plant Cell 25: 454-469.
Kradolfer D, Wolff P, Jiang H, Siretskiy A, Köhler C. 2013. An imprinted gene underlies postzygotic reproductive isolation in Arabidopsis thaliana. Developmental Cell 26: 525-535.
Köhler C, Hennig L, Spillane C, Pien S, Gruissem W, Grossniklaus U. 2003. The Polycomb-group protein MEDEA regulates seed development by controlling expression of the MADS-box gene PHERES1. Genes and Development 17: 1540-1553.
Köhler C, Wolff P, Spillane C. 2012. Epigenetic mechanisms underlying genomic imprinting in plants. Annual Review of Plant Biology 63: 331-352.
Le BH, Cheng C, Bui AQ, Wagmaister JA, Henry KF, Pelletier J, Kwong L, Belmonte M, Kirkbride R, Horvath S, Drews GN, Fischer RL, Okamuro JK, Harada JJ, Goldberg RB. 2010. Global analysis of gene activity during Arabidopsis seed development and identification of seed-specific transcription factors. Proceedings of the National Academy of Sciences of the USA 107: 8063-8070.
Li S, Zhou B, Peng X, Kuang Q, Huang X, Yao J, Du B, Sun MX. 2014. OsFIE2 plays an essential role in the regulation of rice vegetative and reproductive development. New Phytologist 201: 66-79.
Li Z, Baniaga AE, Sessa EB, Scascitelli M, Graham SW, Rieseberg LH, Barker MS. 2015. Early genome duplications in conifers and other seed plants. Science Advances 1: e1501084.
Liu SL, Baute GJ, Adams KL. 2011. Organ and cell type-specific complementary expression patterns and regulatory neofunctionalization between duplicated genes in Arabidopsis thaliana. Genome Biology and Evolution 3: 1419-1436.
Luo M, Bilodeau P, Koltunow A, Dennis ES, Peacock WJ, Chaudhury AM. 1999. Genes controlling fertilization-independent seed development in Arabidopsis thaliana. Proceedings of the National Academy of Sciences of the USA 96: 296-301.
Luo M, Platten D, Chaudhury A, Peacock WJ, Dennis ES. 2009. Expression, imprinting, and evolution of rice homologs of the polycomb group genes. Molecular Plant 2: 711-723.
Makarevitch I, Eichten SR, Briskine R, Waters AJ, Danilevskaya ON, Meeley RB, Myers CL, Vaughn MW, Springer NM. 2013. Genomic distribution of maize facultative heterochromatin marked by trimethylation of H3K27. Plant Cell 25: 780-793.
Moore RC, Purugganan MD. 2005. The evolutionary dynamics of plant duplicate genes. Current Opinion in Plant Biology 8: 122-128.
Mozgova I, Köhler C, Hennig L. 2015. Keeping the gate closed: functions of the polycomb repressive complex PRC2 in development. Plant Journal 83: 121-132.
Proost S, Van Bel M, Vaneechoutte D, Van de Peer Y, Inzé D, Mueller-Roeber B, Vandepoele K. 2015. PLAZA 3.0: an access point for plant comparative genomics. Nucleic Acids Research 43 (Database issue): D974-981.
Qian Y, Xi Y, Cheng B, Zhu S, Kan X. 2014. Identification and characterization of the SET domain gene family in maize. Molecular Biology Reports 41: 1341-1354.
Qiu Y, Liu SL, Adams KL. 2014. Frequent changes in expression profile and accelerated sequence evolution of duplicated imprinted genes in Arabidopsis. Genome Biology and Evolution 6: 1830-1842.
Rodrigues JC, Tucker MR, Johnson SD, Hrmova M, Koltunow AM. 2008. Sexual and apomictic seed formation in Hieracium requires the plant polycomb-group gene FERTILIZATION INDEPENDENT ENDOSPERM. Plant Cell 20: 2372-2386.
Roszak P, Köhler C. 2011. Polycomb group proteins are required to couple seed coat initiation to fertilization. Proceedings of the National Academy of Sciences of the USA 108: 20826-20831.
Roudier F, Ahmed I, Bérard C, Sarazin A, Mary-Huard T, Cortijo S, Bouyer D, Caillieux E, Duvernois-Berthet E, Al-Shikhley L, Giraut L, Després B, Drevensek S, Barneche F, Dèrozier S, Brunaud V, Aubourg S, Schnittger A, Bowler C, Martin-Magniette ML, Robin S, Caboche M, Colot V. 2011. Integrative epigenomic mapping defines four main chromatin states in Arabidopsis. The EMBO Journal 30: 1928-1938.
Schmid M, Davison TS, Henz SR, Pape UJ, Demar M, Vingron M, Schölkopf B, Weigel D, Lohmann JU. 2005. A gene expression map of Arabidopsis thaliana development. Nature Genetics 37: 501-506.
Schmitz RJ, Schultz MD, Urich MA, Nery JR, Pelizzola M, Libiger O, Alix A, McCosh RB, Chen H, Schork NJ, Ecker JR. 2013. Patterns of population epigenomic diversity. Nature 495: 193-198.
Schranz M, Mitchell-Olds T. 2006. Independent ancient polyploidy events in the sister families Brassicaceae and Cleomaceae. Plant Cell 18: 1152-1165.
Shan H, Zahn L, Guindon S, Wall PK, Kong H, Ma H, DePamphilis CW, Leebens-Mack J. 2009. Evolution of plant MADS box transcription factors: evidence for shifts in selection associated with early angiosperm diversification and concerted gene duplications. Molecular Biology and Evolution 26: 2229-2244.
Soltis PS, Soltis DE. 2016. Ancient WGD events as drivers of key innovations in angiosperms. Current Opinion in Plant Biology 30: 159-165.
Spillane C, Schmid KJ, Laoueille-Duprat S, Pien S, Escobar-Restrepo J-M, Baroux C, Gagliardini V, Page DR, Wolfe KH, Grossniklaus U. 2007. Positive darwinian selection at the imprinted MEDEA locus in plants. Nature 448: 349-352.
Stamatakis A. 2006. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688-2690.
Stroud H, Greenberg MV, Feng S, Bernatavichute YV, Jacobsen SE. 2013. Comprehensive analysis of silencing mutants reveals complex regulation of the Arabidopsis methylome. Cell 152: 352-364.
Suzuki MM, Bird A. 2008. DNA methylation landscapes: provocative insights from epigenomics. Nature Reviews Genetics 9: 465-476.
Van de Peer Y, Maere S, Meyer A. 2009. The evolutionary significance of ancient genome duplications. Nature Reviews Genetics 10: 725-732.
Villar CB, Erilova A, Makarevich G, Trösch R, Köhler C. 2009. Control of PHERES1 imprinting in Arabidopsis by direct tandem repeats. Molecular Plant 2: 654-660.
Wang J, Tao F, Marowsky NC, Fan C. 2016. Evolutionary Fates and Dynamic Functionalization of Young Duplicate Genes in Arabidopsis Genomes. Plant Physiology 172: 427-440.
Wang Y, Ma H. 2015. Step-wise and lineage-specific diversification of plant RNA polymerase genes and origin of the largest plant-specific subunits. New Phytologist 207: 1198-1212.
Wolff P, Weinhofer I, Seguin J, Roszak P, Beisel C, Donoghue MT, Spillane C, Nordborg M, Rehmsmeier M, Köhler C. 2011. High-resolution analysis of parent-of-origin allelic expression in the Arabidopsis Endosperm. PLoS Genetics 7: e1002126.
Yang H, Han Z, Cao Y, Fan D, Li H, Mo H, Feng Y, Liu L, Wang Z, Yue Y, Cui S, Chen S, Chai J, Ma L. 2012. A companion cell-dominant and developmentally regulated H3K4 demethylase controls flowering time in Arabidopsis via the repression of FLC expression. PLoS Genetics 8: e1002664.
Yang L, Gaut BS. 2011. Factors that contribute to variation in evolutionary rate among Arabidopsis genes. Molecular Biology and Evolution 28: 2359-2369.
Yang Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution 24: 1586-1591.
Zemach A, Kim MY, Hsieh PH, Coleman-Derr D, Eshed-Williams L, Thao K, Harmer SL, Zilberman D. 2013. The Arabidopsis nucleosome remodeler DDM1 allows DNA methyltransferases to access H1-containing heterochromatin. Cell 153: 193-205.
Zhang H, Bishop B, Ringenberg W, Muir WM, Ogas J. 2012. The CHD3 remodeler PICKLE associates with genes enriched for trimethylation of histone H3 lysine 27. Plant Physiology 159: 418-432.
Zhang JZ, Nielsen R, Yang ZH. 2005. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Molecular Biology and Evolution 22: 2472-2479.
Zhang M, Xie S, Dong X, Zhao X, Zeng B, Chen J, Li H, Yang W, Zhao H, Wang G, Chen Z, Sun S, Hauck A, Jin W, Lai J. 2014. Genome-wide high resolution parental-specific DNA and histone methylation maps uncover patterns of imprinting regulation in maize. Genome Research 24: 167-176.
Zhou R, Moshgabadi N, Adams KL. 2011. Extensive changes to alternative splicing patterns following allopolyploidy in natural and resynthesized polyploids. Proceedings of the National Academy of Sciences of the USA 108: 16122-16127.