Extensive Divergence in Alternative Splicing Patterns after Gene and Genome Duplication During the Evolutionary History of Arabidopsis

Peter G. Zhang,† Suzanne Z. Huang,‡ Anne-Laure Pin,§ and Keith L. Adams*

Research article

UBC Botanical Garden and Centre for Plant Research, Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
†Present address: Centre for Molecular Medicine and Therapeutics, Vancouver, British Columbia, Canada
‡Present address: Shalhevet High School, Vancouver, British Columbia, Canada
§Present address: Centre de recherche en cancérologie de l’Université Laval (Laval University), L’Hôtel-Dieu de Québec, Québec, Canada
*Corresponding author: E-mail: [email protected].
Associate editor: Aoife McLysaght

Abstract
Gene duplication at various scales, from single gene duplication to whole-genome (WG) duplication, has occurred throughout eukaryotic evolution and contributed greatly to the large number of duplicated genes in the genomes of many eukaryotes. Previous studies have shown divergence in expression patterns of many duplicated genes at various evolutionary time scales and cases of gain of a new function or expression pattern by one duplicate or partitioning of functions or expression patterns between duplicates. Alternative splicing (AS) is a fundamental aspect of the expression of many genes that can increase gene product diversity and affect gene regulation. However, the evolution of AS patterns of genes duplicated by polyploidy, as well as in a sizable number of duplicated gene pairs in plants, has not been examined. Here, we have characterized conservation and divergence in AS patterns in genes duplicated by a polyploidy event during the evolutionary history of Arabidopsis thaliana. We used reverse transcription–polymerase chain reaction to assay 104 WG duplicates in six organ types and in plants grown under three abiotic stress treatments to detect organ- and stress-specific patterns of AS. 
Differences in splicing patterns in one or more organs, or under stress conditions, were found between the genes in a large majority of the duplicated pairs. In a few cases, AS patterns were the same between duplicates only under one or more abiotic stress treatments and not under normal growing conditions or vice versa. We also examined AS in 42 tandem duplicates and we found patterns of AS roughly comparable with the genes duplicated by polyploidy. The alternatively spliced forms in some of the genes created premature stop codons that would result in missing or partial functional domains if the transcripts are translated, which could affect gene function and cause functional divergence between duplicates. Our results indicate that AS patterns have diverged considerably after gene and genome duplication during the evolutionary history of the Arabidopsis lineage, sometimes in an organ- or stress-specific manner. AS divergence between duplicated genes may have contributed to gene functional evolution and led to preservation of some duplicated genes. Key words: gene duplication, polyploidy, alternative splicing, gene expression. Introduction Genome duplication (polyploidy) has been a common phenomenon among eukaryotes of various groups, including plants, animals, and some single-celled eukaryotes. Two rounds of polyploidy are inferred to have occurred during the early evolutionary history of vertebrates (reviewed in Van de Peer et al. 2009), and certain lineages, such as ray-finned fish, have experienced an ancient whole-genome (WG) duplication during their evolutionary history (reviewed in Meyer and Van de Peer 2009). Among flowering plants, multiple ancient polyploidy events have occurred during angiosperm evolution and most lineages have experienced at least one round of polyploidy (e.g., Blanc and Wolfe 2004b; Cui et al. 2006). 
Polyploidy is an ongoing process in plants, and many plant species are cytologically polyploids, having experienced an evolutionarily recent polyploidy event. Segmental duplication of regions of a chromosome, tandem gene duplication, and duplicative retroposition also have contributed to the large number of duplicated genes in plant genomes. Over time, many duplicated genes are lost and some of those that are retained gain new functions and/or expression patterns (neofunctionalization) or subdivide their functions and/or expression patterns between them (subfunctionalization).

Expression studies of genes duplicated by ancient WG duplication events, including those during the evolutionary history of the Brassicaceae and the Poaceae families, as well as other types of duplicated genes in plants, have revealed considerable divergence in expression patterns among different organ types, developmental stages, and in response to abiotic and biotic stress conditions (Haberer et al. 2004; Blanc and Wolfe 2004a; Casneuf et al. 2006; Duarte et al. 2006; Ganko et al. 2007; Ha et al. 2007; Yim et al. 2009; Zou et al. 2009). The divergence in expression patterns, along with asymmetric rates of sequence evolution between some pairs, have been interpreted as evidence of functional divergence (e.g., Blanc and Wolfe 2004a).

© The Author 2010. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]
Mol. Biol. Evol. 27(7):1686–1697. 2010 doi:10.1093/molbev/msq054 Advance Access publication February 25, 2010

A fundamental aspect of the expression of many genes is alternative splicing (AS). AS creates multiple forms of mature messenger RNAs (mRNAs) from a single precursor mRNA by using different 5′ and/or 3′ splice sites. 
There are multiple types of AS, including exon skipping where an exon is excluded from the mature mRNA, intron retention in which a complete intron remains in the transcripts, and AS at the 5′ end of an intron (alternative donor) or the 3′ end (alternative acceptor) (reviewed in Reddy 2007; Barbazuk et al. 2008). More than one intron in a transcript can be affected by AS to create multiple mature mRNAs that are differently spliced.

Despite the importance of AS to gene expression, the conservation of AS patterns after WG duplication remains unknown, and AS events for a sizable set of duplicated gene pairs have not been examined and compared in plants. In this study, we investigated the evolutionary conservation and divergence of AS patterns in genes duplicated by polyploidy during the evolutionary history of the Arabidopsis lineage approximately 35 million years ago, after the divergence of the Brassicaceae from the Cleomaceae (Simillion et al. 2002; Blanc et al. 2003; Schranz and Mitchell-Olds 2006). WG duplicates provide a good system to investigate changes in AS patterns of duplicated genes because of their simultaneous formation. We compared the presence/absence of AS events between genes in a WG duplicate pair by reverse transcription–polymerase chain reaction (RT-PCR) for 104 genes using multiple organ types and three abiotic stress treatments to detect organ- and stress-specific AS patterns. We also studied 42 genes duplicated in tandem. Considerable divergence has occurred in AS events, some of which is organ or stress specific, following gene and genome duplication in the Arabidopsis thaliana lineage. Some of the AS changes likely cause alterations in function.

Materials and Methods

Plant Growth, Stress Treatments, and Nucleic Acid Extraction
Arabidopsis thaliana (Columbia ecotype) plants were grown in soil with a photoperiod of 16-h light and 8-h dark at room temperature (22 ± 3°C). 
Cauline leaves, bolt stems, rosette leaves, roots, siliques, open flowers, and flower buds were collected at about the same time of day to minimize circadian effects, and samples were frozen in liquid nitrogen. Three sets of tissue from different plants were collected as biological replicates with several plants used per replicate. Three stress treatments were performed: cold stress was done at 4°C for 72 h; heat stress was done at 38°C for 6 h, as in Palusa et al. (2007); and drought stress included no water for 1 week and then drying in a fume hood for 24 h. Organ harvesting was done at the location of stress treatment, and tissue was frozen with liquid nitrogen.

Gene Choice and Primer Design
Gene pairs were selected from the Blanc data set of approximately 2,500 alpha WG duplicate gene pairs (Blanc et al. 2003) from a list provided in Blanc and Wolfe (2004a). At the time this project was started, it was estimated that about 22% of genes in A. thaliana undergo AS of one or more introns in one or more organ types or growth conditions (Wang and Brendel 2006). To avoid potentially assaying a large number of gene pairs with no AS that would be uninformative in terms of AS conservation between duplicates, we selected gene pairs in which at least one gene in the pair showed evidence for AS in available complementary DNA (cDNA) and expressed sequence tag (EST) data. The Alternative Splicing in Plants database (Wang and Brendel 2006; 30 June 2007 update) and AceView (Thierry-Mieg and Thierry-Mieg 2006; October 2008 update) were used to find introns in the genes that are alternatively spliced. Those databases contain full-length cDNA sequences and partial ESTs from different types of libraries, but neither is comprehensive with respect to AS forms. In addition, a few gene pairs were included because AS in one gene of the pair had been assayed in Ner-Gaon et al. (2004) using transcripts associated with ribosomes. 
Gene pairs were selected without knowledge of the AS status of the other duplicate, except when the first duplicate in a pair did not show AS among the available cDNA sequences (although it could show AS that was not detected in the EST collections). We also had no prior knowledge of the organ- and stress-specific AS of the gene pairs. We disregarded any ESTs in which all introns were retained because they might represent unspliced transcripts. The number of ESTs showing a particular AS event compared with the total number of ESTs in that region is shown in supplementary table S1, Supplementary Material online. Some of the assayed gene pairs did not show AS in the organ types or conditions used in this study, presumably because the AS was present in a different organ or growth condition (such as hormone-treated callus), and those genes were discarded because they were uninformative about AS. Primers were designed to be specific to each gene in a pair, and they were searched against the genome of A. thaliana to determine if they would likely amplify other paralogs. Primer sequences are listed in supplementary table S2, Supplementary Material online.

Nucleic Acid Extraction, RT, and Gene Amplification and Sequencing
Total RNA was extracted with Trizol (Invitrogen) for most organs and hot borate extraction (Wan and Wilkins 1994) for siliques. RNA quality and integrity were checked on agarose gels, and the concentration was determined with a spectrophotometer. Genomic DNA was extracted using the Qiagen DNeasy kit. RNA was treated with DNase I (New England Biolabs) according to the manufacturer’s protocol to remove genomic DNA. The DNase-treated RNA was used in RT reactions with M-MLV reverse transcriptase (Invitrogen) according to the manufacturer’s instructions with oligo dT used as a primer to prime on poly(A) tails of transcripts. Negative controls were made for each sample with no reverse transcriptase to check for contaminating genomic DNA. PCRs were set up as in Liu and Adams (2008) and 30–35 cycles of PCR were carried out with the following conditions in each cycle: 96°C for 10 s, annealing at the optimal annealing temperature for 30 s, and 72°C for 30 s. The annealing temperature for each gene was optimized using a gradient beforehand. A negative control with water instead of template was used to ensure all reagents were free of DNA contamination. PCR products were run on 1.5% agarose gels for band separation and stained with ethidium bromide for visualization. Gels were scored based on presence or absence of bands of the expected sizes. Some of the RT-PCR products that showed only one band on the gel were directly sequenced to confirm the presence of a single PCR amplicon. RT-PCR products with less than 20 bp difference between splice forms were resolved by direct sequencing to determine if one or two splice products were present. Some bands representing putative AS products, as well as unexpected bands, were cut out of the gels, and a gel extraction kit (Qiagen) was used to elute the DNA. PCR products were sequenced using BigDye Version 3.1 (Applied Biosystems) and run on an ABI 3730 DNA sequencer.

Sequence Alignment and Phylogenetic Analysis
TransAlign (Bininda-Emonds 2005), an amino acid–based alignment for coding DNA sequence, was used to align the sequences of the sulfate transporter genes. Maximum parsimony in PHYLIP was used for phylogenetic analysis. Bootstrapping of 100 replicates was performed using PHYLIP bootstrap with the neighbor-joining algorithm.

Domain and Gene Ontology Analyses
Protein domain analysis was performed using the Conserved Domains Database at National Center for Biotechnology Information (NCBI) (Marchler-Bauer and Bryant 2004; http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml), Pfam (Finn et al. 
2008; http://pfam.sanger.ac.uk/), PROSITE (Hulo et al. 2006; http://ca.expasy.org/prosite/), and SMART (Letunic et al. 2009; http://smart.embl-heidelberg.de/). Gene ontology (GO) analysis was done using GOEAST (Zheng and Wang 2008; http://omicslab.genetics.ac.cn/GOEAST/).

Results

Comparisons of AS Events between WG Duplicates in Multiple Organs
We used RT-PCR to individually assay conservation of AS patterns in a set of 52 WG duplicate pairs identified in Blanc et al. (2003). The vast majority of AS events assayed were intron retentions, which is the most common type of AS in plants (Wang and Brendel 2006). Some of the alternatively spliced introns are flanked by exons in the coding region, whereas others are located in the 5′ untranslated region (UTR) or 3′ UTR. Primers for RT-PCR were designed to amplify a region containing one or more putative AS event(s). An AS event is a single case of AS, such as retention of a single intron. The primers were designed to not cross-amplify other duplicates in the genome. RT-PCR was used to evaluate the AS events in six organ types because AS in plants can be organ specific (e.g., Palusa et al. 2007; Simpson et al. 2008). Rosette leaves, roots, bolt stems, cauline leaves, whole flowers, and green siliques were examined, with two replicates of each organ type. In most cases, the presence or absence of splice forms was identical between replicates; those few that were not identical were examined in a third replicate. Examples of RT-PCR gels showing AS events are shown in figure 1. Some of the RT-PCR products with a single splice form on the agarose gels were directly sequenced to verify the presence of only a single amplicon (e.g., genes shown in fig. 2). 
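The presence/absence scoring described above can be turned into a pair-level call in a mechanical way. The following is a minimal sketch, not the authors' scoring procedure: it assumes that, for one homologous AS event, each gene in a pair is summarized by the set of organs in which the alternative splice form was detected. The names `Pattern` and `classify_pair` and the organ labels are hypothetical and introduced only for illustration.

```python
from enum import Enum

# Organ labels are illustrative; they mirror the six organ types assayed.
ORGANS = ("rosette leaf", "root", "bolt stem", "cauline leaf", "flower", "silique")


class Pattern(Enum):
    NEITHER = "no AS event in either duplicate"
    ONE_ONLY = "AS event in only one duplicate"
    BOTH_SAME = "AS event in both duplicates, same organs"
    BOTH_DIFFERENT = "AS event in both duplicates, different organs"


def classify_pair(gene1_organs, gene2_organs):
    """Classify one homologous AS event in a duplicate pair.

    Each argument is an iterable of organs in which the alternative
    splice form was detected for that gene (empty if never detected).
    """
    g1, g2 = frozenset(gene1_organs), frozenset(gene2_organs)
    if not g1 and not g2:
        return Pattern.NEITHER
    if not g1 or not g2:
        return Pattern.ONE_ONLY
    return Pattern.BOTH_SAME if g1 == g2 else Pattern.BOTH_DIFFERENT


# Hypothetical pair: gene A shows the retained intron in all six organs,
# gene B only in the four vegetative organs.
print(classify_pair(ORGANS, ORGANS[:4]).value)
```

Comparing organ sets rather than a single yes/no flag is what separates an event that is present in both duplicates but regulated differently from one that is conserved in every organ examined.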
The same AS event (e.g., retention of the same intron) was found in only one of the two WG duplicates for 33 gene pairs; both WG duplicates showed the same AS event in 21 gene pairs; both conserved and nonconserved AS events were seen in 4 gene pairs; and neither WG duplicate showed AS in the region examined for 1 gene pair (although there was AS after stress treatment, below) (tables 1 and 2). Among the 21 gene pairs with the same AS event in both WG duplicates, the two transcript isoforms were coexpressed in different organs in the two genes for 11 gene pairs. Thus, although the splice form is conserved when comparing some organs, regulation of the usage of that splice form is different between the two genes. Complete conservation of the assayed AS event(s) in all organs examined in both WG duplicates was found for 10 gene pairs and 11 AS events (i.e., retention of a single homologous intron in both WG duplicates), whereas divergence between the gene pairs was found for 48 AS events (table 1). Conservation of AS events between WG duplicates was organ specific in some cases (tables 1 and 2 and supplementary table S3, Supplementary Material online). For example, a pair of protein kinase genes (At1g18160 and At1g73660) showed retention of intron 2 in both genes in vegetative organs, but in flowers and siliques, only At1g18160 showed retention of the intron. A second example is a pair of genes for nucleic acid–binding proteins (At1g48920 and At3g18610) where both genes showed retention of intron 1 in siliques and bolt stems but only At3g18610 showed retention of the intron in the other examined organs. Complete partitioning of AS was discovered in a pair of ADP-ribosyltransferase genes where only At1g32230 retained intron 3 and only At2g35510 showed skipping of exon 4. Many of the AS events in the studied genes introduce premature stop codons that would disrupt the reading frame if the transcripts are translated. 
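To make concrete how a retained intron can introduce a premature stop codon, here is a small sketch on a made-up sequence. Nothing in it comes from the genes assayed in this study: the exon and intron strings are toy sequences, and the helper names `first_stop` and `stop_in_retained_intron` are hypothetical. It assumes translation starts at the first nucleotide and uses the standard stop codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    seq = seq.upper()
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def stop_in_retained_intron(exon_up, intron, exon_down):
    """True if the first in-frame stop of the intron-retained transcript
    lies inside the retained intron (a premature stop codon)."""
    retained = exon_up + intron + exon_down
    stop = first_stop(retained)
    intron_start = len(exon_up)
    intron_end = intron_start + len(intron)
    return stop is not None and intron_start <= stop < intron_end


# Toy sequences (assumed for illustration only).
exon_up = "ATGGCTGAA"       # ATG GCT GAA
intron = "GTATGATTGCAG"     # starts with GT, ends with AG, contains TGA in frame
exon_down = "CTTTGGTGA"     # CTT TGG TGA (normal stop)

print(first_stop(exon_up + exon_down))                      # stop in the spliced form
print(first_stop(exon_up + intron + exon_down))             # stop in the retained form
print(stop_in_retained_intron(exon_up, intron, exon_down))
```

In the toy example the spliced transcript reads through to the normal stop codon, whereas the intron-retained transcript terminates inside the intron, so any domain encoded downstream would be missing if that transcript were translated.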
In some of the duplicated gene pairs, an AS event in one of the genes creates a premature stop codon that results in a truncated protein that would be missing one or more functional domains or portions of functional domains, if the transcripts are translated (fig. 3). For example, gene At1g32230, an ADP-ribosyltransferase, has a skipped exon form that alters the reading frame and has a premature stop codon that causes loss of part of the ADP-ribosyl superfamily domain; the skipped exon form was not found in its duplicate At2g35510. Gene At3g56400 (WRKY70), a DNA-binding protein, has an intron-retained form that causes loss of the WRKY superfamily domain, presumably resulting in loss of DNA-binding function; in contrast, the intron-retained form was not detected in its duplicate At2g40750. In other cases, a premature stop codon does not result in the loss of any currently recognized functional domains; however, there could be loss of functional domains that have not yet been characterized in plants.

FIG. 1. RT-PCR gels from WG duplicates and tandem duplicates showing AS events. Genes, types, and effects of AS are (A) At1g28330 and At2g33830, retained intron (IR) divergence; (B) At4g29160 and At2g19830, IR divergence; (C) At1g79650 and At1g16190, alternative position divergence; (D) At4g12040 and At4g22820, IR conservation; (E) At4g30690 (top) and At2g24060 (bottom), IR conservation under stress but not normal conditions; (F) At3g48440 (top) and At5g63260 (bottom), IR conservation, although only the non-IR form is present under drought stress in stems and roots; (G) At1g70100 (top) and At1g24160 (bottom), IR conservation under normal growth conditions but not under drought stress; (H) At4g26710 (top) and At5g55290 (bottom), reciprocal IR between heat and normal conditions in flowers and stems; (I) At4g30660 (top) and At4g30650 (bottom), IR divergence under all conditions; (J) At1g78380 (top) and At1g78370 (bottom), IR only under heat stress in At1g78380. An example of RT-PCR controls is shown in supplementary figure S1, Supplementary Material online.

FIG. 2. Chromatograms from direct sequencing of RT-PCR products. Exon–exon junctions are shown from examples of genes that showed only one splice form on the RT-PCR gels. Arrows indicate exon–exon junctions. No evidence for intron presence was seen in any of these genes. Panels (B) and (D) show the reverse strand and the chromatograms were inverted.

AS in WG Duplicates after Abiotic Stress Treatments
Abiotic stresses are known to cause changes in AS patterns in plants (e.g., Iida et al. 2004; Palusa et al. 2007). It is possible that pairs of WG duplicates that did not show conservation of AS events under regular growing conditions could show the same AS events under abiotic stress conditions or vice versa. To evaluate that possibility, we assayed AS patterns of the 52 gene pairs in plants treated under three abiotic stresses: cold, heat, and drought. Some splicing variants that were present under regular growing conditions were not present after one or more of the stress treatments, whereas in other cases, a gene produced a new splicing variant in the stress-treated plants in one or more organs (fig. 
1 and table 1 and supplementary table S4, Supplementary Material online). In a few cases, there was a shift from one splicing variant to another. In a pair of ATP synthase genes, At5g55290 but not At4g26710 showed a retained intron form under regular growing conditions but At4g26710 and not At5g55290 showed the retained intron form after each of the three stress treatments (fig. 1 and table 1 and supplementary table S4, Supplementary Material online). Another example is an RAD23 DNA repair gene (At1g16190) that showed only a retained intron form in rosette leaves subjected to heat and drought stress but only the non-intron–retained form under regular growing conditions, whereas both forms were present in leaves of cold-stressed plants.

Table 1. Conservation and Divergence of AS Patterns in WG and Tandem Duplicates.
AS event in only one WG duplicate | 33 pairs | 36 events
AS event in both WG duplicates | 21 pairs | 23 events
    AS in the same organs | 10 pairs | 11 events
    AS in different organs | 11 pairs | 12 events
AS event conserved only after one or more abiotic stress treatment | 8 pairs | 8 events
AS event in one tandem duplicate | 11 pairs | 12 events
AS event in both tandem duplicates | 9 pairs | 9 events
    AS in same organs | 5 pairs | 5 events
    AS in different organs | 4 pairs | 4 events
AS event conserved only after one or more abiotic stress treatments | 4 pairs | 4 events
NOTE: AS event refers to a single type of AS in a single intron, for example, retention of a homologous intron.

There were seven gene pairs that did not show conservation of an AS event (e.g., a retained intron form in transcripts from one gene but not the other) under regular growing conditions but did show the same AS event (i.e., the transcripts from both genes retained the intron) after one or more stress conditions. 
In a pair of IF-3 translation initiation factor genes, At4g30690 retained intron 6 under all conditions, whereas At2g24060 retained intron 6 under all three stresses in most organs but not under normal growing conditions (fig. 1). Between a pair of lipoamide dehydrogenases At1g48030 and At3g17240, At1g48030 showed retention of intron 2 under all conditions but At3g17240 showed retention of intron 2 only under the three stress conditions in some organs. Thus, AS events were conserved between WG duplicates only after abiotic stress treatments in the two cases previously mentioned. Conversely, a pair of genes of unknown function (At1g70100 and At1g24160) showed AS conservation between WG duplicates only under normal conditions and not under any of the three stress conditions. In addition, one gene pair showed AS in both duplicates under stress conditions but in neither duplicate under normal conditions. In a few WG duplicate gene pairs, the two genes showed opposite splice forms under stress. Examples include a pair of genes with unknown function (At1g16840 and At1g78890) in cold-stressed plants, where At1g78890 showed only a retained intron form and At1g16840 showed only the nonretained intron form, and a pair of genes for RAD23 DNA repair proteins (At1g79650 and At1g16190) under heat and drought stresses in rosette leaves, where At1g16190 showed only an intron-retained form and At1g79650 showed only the non-intron–retained form (supplementary table S4, Supplementary Material online). All the above-mentioned examples indicate how the apparent conservation and divergence of AS events in WG duplicate gene pairs can vary by growing conditions.

AS Divergence in Tandem Duplicates
In addition to the WG duplicates, we assayed 21 pairs of tandem duplicates, identified in Haberer et al. (2004). 
RT-PCR was performed using RNA from six organs (rosette leaves, roots, bolt stems, cauline leaves, unopened flower buds, and mature opened flowers) and from plants subjected to heat and drought stress treatments. Of 21 tandem pairs, 11 showed an AS event in only one gene in the region analyzed and 9 showed AS in both genes (table 1). Among the gene pairs with the AS event in both genes, the AS patterns were the same in all six organs in five pairs and showed different organ specificity in four pairs (tables 1 and 3 and supplementary table S5, Supplementary Material online). Thus, like the WG duplicates, there has been considerable divergence in AS events between the tandem duplicate genes that were assayed.

We also examined AS in plants grown under two abiotic stress treatments to determine if any genes with diverged AS patterns between duplicates under normal growing conditions had the same AS patterns between duplicates after stress treatment or vice versa. Thirteen tandem duplicates showed the same AS event in both duplicates under one or both stresses with seven having AS in the same organs and six having AS in different organs. In contrast, nine gene pairs showed an AS event only in one duplicate. Two pairs, At1g78380 and At1g78370 coding for glutathione transferases along with At3g22231 and At3g22240 that have unknown functions, showed an AS event in one gene only after one or both stress treatments. The abiotic stress treatments affected the perceived AS conservation between duplicates in four pairs of duplicates (table 3). In three cases, only one gene in the pair had the AS form under normal growing conditions but both genes had the AS form after one or both stresses.

Several of the AS events in the tandem duplicates introduce premature stop codons that would disrupt the reading frame if the transcripts are translated. 
In some of the duplicated gene pairs, an AS event in one of the genes creates a premature stop codon that results in a truncated protein that is missing one or more functional domains, or portions of functional domains, if the transcripts are translated (figs. 3 and 4). For example, gene At5g06860, which codes for a polygalacturonase-inhibiting protein, contains a retained intron that results in premature stop codon formation that would result in loss of some of the leucine-rich repeats. Gene At1g78000, coding for a sulfate transporter, has a retained intron form that results in loss of part of the STAS (Sulfate Transporter and AntiSigma factor antagonist) functional domain, whereas its duplicate At1g77990 does not have that form (fig. 4).

Finally, we examined AS in two subgroups of the sulfate transporter family that contain both tandem duplicates and WG duplicates that were part of this study to look at the evolutionary dynamics of AS in the family. At1g22150 and At1g78000 are WG duplicates, and genes At1g77990 and At1g78000 are tandem duplicates; At1g78000 is both a WG duplicate and a tandem duplicate. Of the five genes, only At1g78000 retains the final intron (fig. 4A), suggesting gain of the AS event in this gene after WG duplication.

Table 2. Genes Studied and Summary of RT-PCR Results for WG Duplicates. 
Gene Number (Gene 1, Gene 2) | Function or Putative Function
At5g07370, At5g61760 | Inositol phosphate kinase
At4g30690, At2g24060 | Translation initiation factor
At2g43010, At3g59060 | PIF transcription factor
At5g55550, At4g26650 | RNA recognition motif
At1g32230, At2g35510 | ADP-ribosyltransferase
At5g18620, At3g06400 | Chromatin remodeling factor
At1g50630, At3g20300 | Extracellular ion channel
At3g56400, At2g40750 | WRKY transcription factor
At3g23830, At4g13850 | Glycine-rich RNA binding
At2g19620, At5g56750 | Ndr family
At3g48440, At5g63260 | Nucleic acid binding
At1g76140, At1g20380 | Prolyl oligopeptidase
At2g25850, At4g32850 | Nucleotidyltransferase
At3g05640, At5g27930 | Protein phosphatase
At1g03457, At4g03110 | RNA-binding protein
At1g16840, At1g78890 | Unknown protein
At1g73650, At1g18180 | Oxidoreductase
At1g48030, At3g17240 | Lipoamide dehydrogenase
At2g21940, At4g39540 | Shikimate kinase
At3g15980, At1g52360 | Coatomer protein complex
At2g21660, At4g39260 | Glycine-rich RNA binding
At1g19000, At1g74840 | myb transcription factor
At3g45240, At5g60550 | Kinase
At4g22590, At4g12430 | Phosphatase
At3g02900, At5g16660 | Unknown protein
At2g01180, At1g15080 | Phosphatidate phosphatase
At1g15960, At1g80830 | Metal ion transporter
At1g80910, At1g16020 | Unknown protein
At1g79650, At1g16190 | RAD23 DNA repair
At1g19400, At1g75180 | Unknown protein
At1g78000, At1g22150 | Sulfate transporter
At1g70100, At1g24160 | Unknown protein
At2g17320, At4g35360 | Pantothenate kinase
At4g26710, At5g55290 | ATP synthase subunit H
At2g45170, At3g60640 | Microtubule binding
At3g01500, At5g14740 | Carbonic anhydrase
At1g15920, At1g80780 | Transcription complex
At1g48920, At3g18610 | Nucleic acid binding
At1g18160, At1g73660 | Protein kinase
At5g47080, At4g17640 | Casein kinase II beta chain
At2g37340, At3g53500 | RSZ splicing factor
At5g16800, At3g02980 | N-acetyltransferase
At4g12040, At4g22820 | AN-1–like zinc finger
At4g14410, At3g23210 | bHLH family
At1g52250, At3g16120 | Dynein light chain
At3g46130, At5g59780 | myb transcription factor
At1g55310, At3g13570 | SCL splicing factor
At2g17640, At4g35640 | Serine acetyltransferase
At4g29160, At2g19830 | SNF7 family
At2g20590, At4g28430 | Reticulon family

Remaining columns for the gene pairs above, listed in the table's row order:
AS: I I I I I E I I I A I I I I1 I2 E D A I I I I I I D D I I D I I A I I I1 I2 I P I I I I I I I I I I I I A I1 I2 I I I A I I D I E I I I
Normal: +− +− +− +− +− +− −+ +− ++s +− +− ++d ++s ++d ++s ++s −+ +− +− −− −+ +− +− +− +− −− ++d +− ++d +− −− −− +− −− −+ +− +− −− +− −− ++s +− ++s +− −+ +− +− −+ ++d ++d ++d ++s −− −− ++s ++s ++d +− ++d ++d ++d +− ++s ++s +− ++d
Stress: +− +− ++d −− +− +− −+ −− +− +− +− ++d ++d ++d ++s ++s −+ +x +− +− +−* +− ++d +− +− +− ++s ++s +− −− −+ +− +− +− ++d +− +− −+ −− −+ +− +− +− +− +− ++d +− ++s ++d ++d ++s ++s ++d +− ++s ++s ++s +− ++d ++s ++d ++d ++s ++s +− ++d
Effect of AS: 5′UTR PTC; domain loss PTC 5′UTR PTC PTC; domain loss New stop New stop PTC; domain loss 5′UTR PTC PTC; domain loss PTC; domain loss New stop New stop New stop 5′UTR Longer ORF PTC New stop New stop 3′UTR PTC 3′UTR New stop Longer ORF Longer ORF 3′UTR 5′UTR 5′UTR New stop Longer ORF PTC 5′UTR PTC; domain disrupt PTC; domain disrupt New stop Longer ORF PTC; domain loss 5′UTR New stop New stop PTC 5′UTR 5′UTR New stop 5′UTR New start PTC; domain loss New stop PTC; domain loss PTC; domain loss PTC; domain loss New stop 5′UTR New start PTC PTC; domain disrupt New start New start PTC; domain disrupt PTC; domain disrupt PTC 5′UTR PTC

Table 2. Continued.
Gene Number (Gene 1, Gene 2) | Function or Putative Function
At4g39140, At2g21500 | Protein and zinc ion binding
At2g39250, At3g54990 | Transcription factor
AS: A I1 I2
Normal: +− +− +− +−
Stress: +− +− +− +−
Effect of AS: 5′UTR PTC; domain disrupt PTC; domain disrupt

NOTE: Types of AS: I, intron retention; A, alternative acceptor; D, alternative donor; P, alternative position; E, skipped exon. Data for normal and stress conditions: Gene 1 data are listed first followed by gene 2 data. Plus signs indicate an AS event in one or more organs; minus signs indicate no AS event. s indicates that the same AS event is present in the same organs in both WG duplicates, d indicates that AS is present in different organs in each WG duplicate, x indicates no expression, and an asterisk indicates that only the retained intron form is present. Bold indicates genes that show an AS event that is present in both WG duplicates under one or more stress conditions but not under normal conditions or vice versa. PTC indicates premature stop codon, new start indicates a new start codon is formed, new stop indicates a new stop codon is formed in the final intron, and longer ORF indicates a longer open reading frame. See supplementary tables S3 and S4, Supplementary Material online for complete organ and stress data.

Discussion

Extensive Divergence in AS between Duplicated Genes in Arabidopsis thaliana
The results of our RT-PCR experiments indicate that considerable divergence in AS patterns has occurred between many of the genes duplicated by the most recent polyploidy event during the evolutionary history of the Arabidopsis lineage, as well as between many of the tandem duplicates that were examined in this study. Thus, changes in AS patterns are an important aspect of the evolution of WG duplicate and tandem duplicate pairs, in addition to the previously documented changes in expression patterns (as reviewed in the Introduction). Several different types of AS events have been gained or lost, including retained introns, alternative donors and acceptors, and skipped exons. In some cases, the perceived conservation or lack of conservation of a splice form between duplicates varied among organ types or in response to abiotic stress treatments, indicating that AS regulation has changed between the duplicated genes. 
Particularly interesting in this regard are genes in which an AS event is conserved between WG or tandem duplicates only under one or more abiotic stress treatments and not under normal growing conditions, showing that regulation and apparent conservation of AS between duplicated genes can be affected by stress treatments. Those genes have a variety of functions, including a transcription initiation factor, a lipoamide dehydrogenase, a phosphatidate phosphatase, and a polygalacturonase-inhibiting protein, among others. The opposite effect, AS conservation under normal growing conditions but not under the tested stress conditions, was also observed. A few of the WG and tandem duplicate pairs have conserved AS events in all organs and stress conditions examined (tables 2 and 3). It is possible that no changes in regulation of AS have occurred in those genes; it is also possible that there is divergence in AS in organs, developmental stages, or abiotic or biotic stress conditions not examined in this study.

We examined a small fraction of the genes duplicated by the alpha polyploidy event and of the tandem duplicates, in several organ types and under two to three different abiotic stress conditions, using a sensitive RT-PCR approach. An advantage of this approach is the precision with which AS patterns are assayed compared with analyzing EST databases, which are incomplete in regard to AS and are highly heterogeneous and unequally sampled in terms of tissues, organ types, and growth conditions. A disadvantage is that our data set may or may not be completely representative of the two classes of duplicates as a whole. The WG duplicate genes we assayed were relatively evenly sampled in terms of gene classes as defined by GO categories (supplementary tables S6 and S7, Supplementary Material online). That is, the top 12 GO categories in our data set were mostly among the top 12 in the entire WG duplicate data set.
In contrast, our tandem data set appears to be less representative of the GO categories, with only some of the top categories also being among the top in the entire data set (supplementary tables S8 and S9, Supplementary Material online). There has not been a systematic study of the amount of AS in each GO category in A. thaliana, and thus it is unknown if some GO categories have more AS than others. We have no indication that an incompletely representative sample of genes, from a GO category perspective, would bias our inferences about conservation of AS patterns between duplicates in a pair. We think that the general patterns of divergence between duplicated genes, as well as the general patterns in different organs and under the abiotic stress conditions, are likely to be broadly applicable across the WG duplicates, but future studies of all alpha WG duplicates and tandem duplicates that assay several organ types and stress conditions using sensitive detection methods will be needed for a full characterization. Future studies could also examine the abundance of transcripts with particular AS forms by quantifying each form relative to the completely spliced transcripts. That information may be helpful in inferring which splicing events represent true AS and which might represent incomplete splicing that could be present at low levels. Analysis of ribosome-associated transcripts would also be helpful in that regard. We included some tandem duplicates in this study to determine if extensive divergence in AS patterns between duplicates is unique to the WG duplicates. Our data indicate that it is not. Our sample size of 42 tandem duplicate genes is likely too small to make accurate comparisons of the frequency of AS conservation in WG duplicates compared with tandem duplicates in general, except to say that the patterns we saw are roughly comparable, as seen in table 1.
It would take a much larger sample size, preferably all tandem and WG duplicates in the genome, to make thorough comparisons and to determine if the amount of AS divergence is proportional to the time since duplication. Future analyses of transcriptome data generated by second-generation sequencing may provide the opportunity to examine AS conservation and divergence between most or all duplicate gene pairs in the genome of A. thaliana and other plants.

FIG. 3. Domain losses and disruptions resulting from AS. Shown are protein diagrams for 16 genes with domains labeled. Arrows indicate locations in the protein that correspond to the beginning of an AS intron in the corresponding gene sequence; regions to the right of the arrows would be missing in the proteins if they are translated. Proteins are drawn to scale and are shown with the N-terminus on the left and the C-terminus on the right. Gene numbers and domain identifiers are (A) At3g56400 (IF-3 translation initiation factor) pfam00707 IF-3 conserved C-terminal domain and pfam05198 IF-3 conserved N-terminal domain. (B) At2g35510 (ADP-ribosyltransferase) cl02729 WWE domain and cl00283 ADP-ribosyl superfamily domain. (C) At3g56400 (WRKY transcription factor) cl03892 WRKY DNA-binding domain. (D) At3g48440 (nucleic acid binding) cl11592 zinc finger domain. (E) At1g20380 (prolyl oligopeptidase) peptidase S9 domain. (F) At1g15960 (metal ion transporter) cl00836 Nramp domain. (G) At1g16190 (RAD23 DNA repair) cd01805 RAD23 N-terminal domain, pfam09280 XPC-binding domain, and cl00153 UBA ubiquitin-associated domain. (H) At1g18160 (protein kinase) cd00192 PTKc, catalytic domain of protein tyrosine kinases. (I) At2g37340 (RSZ splicing factor) cd00590 RNA recognition motif and pfam00098 zinc knuckle. (J) At1g52250 (dynein light chain) cl03131. (K) At1g55310 (SCL 33 splicing factor) cd00590 RNA recognition motif. (L) At2g39250 (transcription factor) cd00018 AP2 DNA-binding domain. (M) At4g28420 (aspartate aminotransferase) cd00609 aspartate aminotransferase family. (N) At5g06860 (polygalacturonase inhibiting) pfam08263 leucine-rich repeat N-terminal domain and COG4886 leucine-rich repeat region. (O) At1g78380 (glutathione transferase) cd03185 glutathione transferase conserved N-terminal domain and cd03058 glutathione transferase conserved C-terminal domain. (P) At5g01040 (laccase family) cl06664 Cu-oxidase domains.

Table 3. Genes Studied and Summary of RT-PCR Data from Tandem Duplicates.
Gene Number (Gene 1, Gene 2): Function or Putative Function
At1g27370, At1g27360: DNA-binding protein
At1g78000, At1g77990: Sulfate transporter
At2g29930, At2g29910: F-box family protein
At2g46280, At2g46290: TGF-B receptor interacting
At3g19000, At3g19010: Oxidoreductase
At4g01450, At4g01440: Nodulin MtN21 family
At4g27610, At4g27620: Unknown protein
At4g27870, At4g27860: Integral membrane family
At4g28420, At4g28410: Aminotransferase
At4g30660, At4g30650: Hydrophobic protein
At4g36195, At4g36190: Serine-type peptidase
At5g06860, At5g06870: Polygalacturonase inhibiting
At5g52300, At5g52310: Unknown protein
At1g78380, At1g78370: Glutathione transferase
At1g23510, At1g23520: Unknown protein
At5g01040, At5g01050: Laccase family
At4g16770, At4g16765: Oxidoreductase
At2g17320, At2g17340: Pantothenate kinase related
At1g10360, At1g10370: Glutathione transferase
At4g25690, At4g25670: Unknown protein
At3g22231, At3g22240: Unknown protein
AS: A I I A I I I I E I I I I I I I I I I I I I I I
Normal: 12 22 12 12 11d 21 12 11s 21 12 12 11d 11d 21 21 12 1 2* 22 11s 11d 11s 11s 11s 12 22
Stress: 12 11d 12 12 11d 22 1x 11s 21 11d 12 11s 11d 21 21 11d 1 2* 12 11s 11d 11s 11s 11s 11d 12
Effect of AS: 5′UTR 5′UTR New stop codon New start codon 5′UTR New stop codon New stop codon 5′UTR 5′UTR PTC PTC; domain disrupt PTC New stop codon PTC; domain disrupt New stop codon PTC; domain disrupt New start codon PTC; domain loss PTC PTC PTC 5′UTR PTC
NOTE: Types of AS: I, intron retention; A, alternative acceptor; D, alternative donor; P, alternative position; E, skipped exon. Data for normal and stress conditions: Gene 1 data are listed first followed by gene 2 data. Plus signs indicate an AS event in one or more organs; minus signs indicate no AS event. s indicates that the same AS event is present in the same organs in both duplicates, d indicates that AS is present in different organs in each duplicate, x indicates no expression, and an asterisk indicates that only the retained intron form is present. Bold indicates genes that show an AS event that is present in both duplicates under one or more stress conditions but not under normal conditions, or vice versa. PTC indicates premature stop codon, new start indicates a new start codon is formed, and new stop indicates a new stop codon is formed in the final intron. See supplementary table S5, Supplementary Material online for complete organ and stress data.

Effects of AS Divergence on Gene Regulation and Function

Many of the AS events that differ between duplicates among the genes we assayed create a premature stop codon within the transcripts. In several cases, the protein would be lacking one or more functional domains if the transcripts are translated, probably resulting in nonfunctional proteins or proteins with altered functions (fig. 3). For example, the retained intron form of At3g56400, a WRKY transcription factor, would lack the WRKY domain. More detailed functional information is available for the sulfate transporter genes Sultr 1;2 and Sultr 2;2 (tandem pair At1g78000 and At1g77990) along with Sultr 1;3 and Sultr 1;2 (WG duplicates At1g22150 and At1g78000). The retained intron in Sultr 1;2 is the last intron before the normal stop codon, and retention of the intron creates a stop codon that would result in the loss of about 39 amino acids at the C-terminus within the STAS domain (fig. 4). Rouached et al. (2005) showed that deletion of the last 12 amino acids at the C-terminus of Sultr 1;2 resulted in a 100% reduction of sulfate transport when expressed and assayed in yeast. Thus, it appears that the 39 amino acid deletion caused by the intron retention and premature stop codon in Sultr 1;2 results in a nonfunctional protein. Sultr 2;2 and Sultr 1;3 do not have the retained intron form in the organ types and stress conditions examined here; thus, the divergence in AS patterns between duplicates has implications for functional divergence in these gene pairs.

Truncated proteins created by translation of alternatively spliced transcripts that contain premature stop codons can have important functions.
For example, two transcript forms are produced from the N gene in tobacco, which is involved in conferring resistance to tobacco mosaic virus: a full-length form and a truncated form produced from an alternatively spliced transcript that contains a premature stop codon. Transgenic experiments showed that the full-length form by itself does not confer complete resistance to the virus, unlike when both forms are present; thus, the truncated form plays a role in the resistance (Dinesh-Kumar and Baker 2000). A second example is the disease resistance gene RPS4 in Arabidopsis, which has two alternatively spliced forms, each of which results in a premature stop codon and a shorter protein. Zhang and Gassmann (2003) showed that the alternatively spliced forms are necessary, in addition to the full-length form, for RPS4 function. Considering that the alternatively spliced and truncated forms of some gene products are important for function, it is likely that some of the truncated forms created by AS in the genes in this study have as-yet-unknown functions. Thus, some duplicate gene pairs may have experienced functional divergence by gaining or losing an alternatively spliced form in one copy that creates a truncated protein.

FIG. 4. AS in a sulfate transporter family. (A) Phylogeny of five sulfate transporter genes in two subfamilies and AS data from RT-PCR. Presence (+) or absence (−) of the last intron in the mature mRNA is indicated. Numbers indicate bootstrap values from 100 replicates. The sequence alignment used to generate the phylogenetic tree is shown in supplementary figure S2, Supplementary Material online. Genes from the third subfamily are shown as an outgroup. (B) Sequence alignment of the 3′ end and C-terminus of five sulfate transporter genes showing the retained intron in Sultr 1;2 and the conceptual translation of the C-terminus of the retained intron form of Sultr 1;2 compared with the other sequences. Bullet indicates a stop codon in the retained intron form of Sultr 1;2 (abbreviated "IR"). Retention of the intron causes the loss of 39 amino acids at the C-terminus of Sultr 1;2. Locations of alpha helices are indicated. (C) Diagram of the amino acid sequence of Sultr 1;2 showing conserved domains and the location of intron 12, from NCBI's Conserved Domains Database. (D) Predicted tertiary structure of the STAS domain of Sultr 1;2 using SPICE (Prlić et al. 2005). The structure is similar to that reported in Rouached et al. (2005). Locations of alpha helices are shown and correspond with locations in panel (B).

Another consequence of a premature stop codon created by AS is degradation of transcripts by nonsense-mediated RNA decay (NMD). Wang and Brendel (2006) found that about 43% of AS events in A. thaliana produce candidates for NMD, as defined by a premature stop codon >50 bp upstream of the 3′-most exon–exon junction. Thus, many AS forms probably function to downregulate gene expression by transcript degradation instead of being translated into proteins. A recent study of serine/arginine-rich (SR) splicing factors in A. thaliana showed that some AS forms were present at higher levels in a mutant for one of the genes involved in NMD, indicating that some of the alternatively spliced transcripts are degraded (Palusa and Reddy 2009). Downregulation of gene expression by AS can have important functional consequences. For example, the RNA-binding proteins AtGRP7 and AtGRP8 in A. thaliana autoregulate and cross-regulate their own expression by AS and NMD (Staiger et al. 2003; Schöning et al. 2008). A second example is the flowering time gene FCA in A. thaliana that controls the transition from the vegetative to the reproductive phase.
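The NMD-candidate criterion attributed above to Wang and Brendel (2006), a premature stop codon more than 50 bp upstream of the 3′-most exon–exon junction, can be illustrated with a minimal Python sketch. The function name, the 1-based transcript coordinates, and the choice to measure from the last nucleotide of the stop codon are assumptions made here for illustration; they are not taken from Wang and Brendel (2006) or from the present study.

```python
from typing import Sequence


def is_nmd_candidate(exon_lengths: Sequence[int],
                     stop_codon_end: int,
                     min_distance: int = 50) -> bool:
    """Hypothetical helper: flag a spliced transcript as an NMD candidate.

    exon_lengths   -- exon lengths (nt) in 5'-to-3' order in the mature mRNA
    stop_codon_end -- 1-based transcript coordinate of the last nucleotide
                      of the stop codon (an assumed convention)
    min_distance   -- threshold in nt; 50 follows the criterion in the text
    """
    if len(exon_lengths) < 2:
        # A single-exon transcript has no exon-exon junction.
        return False
    # Coordinate of the last nucleotide before the 3'-most junction.
    last_junction = sum(exon_lengths[:-1])
    # Candidate only if the stop codon lies MORE than min_distance nt upstream.
    return (last_junction - stop_codon_end) > min_distance


# Three exons of 200, 150, and 300 nt: the last junction follows nt 350.
print(is_nmd_candidate([200, 150, 300], stop_codon_end=250))  # stop 100 nt upstream
print(is_nmd_candidate([200, 150, 300], stop_codon_end=320))  # stop 30 nt upstream
```

Under these assumptions the first call flags the transcript and the second does not; a stop codon in the final exon gives a negative distance and is never flagged.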
There are three AS forms of FCA mRNAs produced, none of which encodes a full-length protein, and AS of FCA limits the amount of FCA protein, both spatially and temporally, to prevent precocious flowering (Macknight et al. 2002). Some of the AS forms of the genes in this study may result in downregulation of gene expression that has important functional consequences that have not yet been studied; the functions of only a small number of AS forms have been characterized in plants. Thus, some duplicate gene pairs in this study may have experienced functional divergence by differential regulation of transcript abundance by AS.

Evolution of AS after Gene Duplication

This is the first study of AS events in a large number of duplicated gene pairs in plants. Previous studies have examined AS only in one or two pairs of duplicates or duplicates within a small gene family (Palusa et al. 2007). The relation between gene family size and AS was explored in a bioinformatics study that compared the AS frequency in single-copy genes versus gene families in A. thaliana and rice (Lin et al. 2008). A higher percentage of AS was found in gene families than in single-copy genes, contrary to computational analyses of gene families in mammals and Caenorhabditis elegans, which showed an inverse relationship between AS and gene family size (Kopelman et al. 2005; Su et al. 2006; Hughes and Friedman 2007; Irimia et al. 2008). Thus, the evolution of AS in gene families may differ between plants and animals, or the differing number of cDNA sequences available from each organism and the number of AS events detected might affect the trends. Future studies using RNA-seq data from A. thaliana and other plants may help to resolve the issue.

After a gene with two splice forms is duplicated, each copy could retain one of the splice forms, which is a type of subfunctionalization involving splice forms.
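The partitioning scenario just described can be made concrete with a small Python sketch that compares the splice forms of two duplicates against an assumed ancestral set. The function name, the category labels, and the premise that the ancestral splice forms are known (for example, from an unduplicated ortholog) are assumptions introduced here for illustration; the present study did not infer ancestral states in this way.

```python
def classify_splice_form_fate(ancestral, copy1, copy2):
    """Hypothetical classifier for the fate of splice forms after duplication.

    Each argument is an iterable of splice-form labels, e.g. {"full", "IR"}.
    """
    ancestral, copy1, copy2 = set(ancestral), set(copy1), set(copy2)
    if (copy1 | copy2) - ancestral:
        # At least one copy has a form absent from the ancestor.
        return "gain"
    if copy1 == ancestral and copy2 == ancestral:
        return "conserved"
    if (copy1 | copy2) == ancestral and copy1 < ancestral and copy2 < ancestral:
        # Each copy lost something, yet together they keep every ancestral
        # form: subfunctionalization involving splice forms.
        return "partitioned"
    return "loss"


# Ancestor with a fully spliced form and a retained-intron (IR) form;
# each duplicate keeps only one of them.
print(classify_splice_form_fate({"full", "IR"}, {"full"}, {"IR"}))
```

In this sketch a pair is labeled "partitioned" only when the two copies jointly retain every ancestral form while each has individually lost at least one.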
In maize, two genes coding for the small subunit of ADP-glucose pyrophosphorylase without AS correspond to a single gene with AS in other grass species (Rosti and Denyer 2007), indicating that duplication in the maize lineage was followed by loss of one AS form in each duplicate. In a second case, the ribosomal protein L32 and superoxide dismutase genes are present as a fusion gene and co-expressed by AS in Burma mangrove (Bruguiera gymnorrhiza), whereas in Populus, the two genes were separated after gene duplication (Cusack and Wolfe 2007). A putative case of partitioning of AS forms was revealed in this study: a pair of ADP-ribosyltransferase genes where a retained intron form was found only in At1g32230 and a skipped exon form was found only in At2g35510. In addition to partitioning of ancestral splice forms, a duplicated gene could gain a new splice form that could result in a new function (neofunctionalization) or changes in gene regulation. New splice forms that are created soon after duplication could be a way of testing AS forms in one of the duplicates without affecting regulation or function of the other.

What causes AS divergence between duplicated genes? One likely factor is mutations in the gene sequence at bases important for splicing, both within the intron and the surrounding exons. Those regions include exonic splicing enhancers, exonic splicing silencers, intronic splicing enhancers, and intronic splicing suppressors (Reddy 2007). Such mutations in each gene could lead to differential binding of AS factors, including SR proteins, and differential inclusion or exclusion of certain introns and exons, along with alternative donor or acceptor sites.

Conclusions

In this study, we have assayed AS patterns in a set of WG duplicates and tandem duplicates in several organ types and under multiple abiotic stress conditions.
Our results illustrate the patterns by which AS events can diverge between duplicates, including organ- and stress-specific differences. In some cases, the AS divergence affects inclusion of functional domains, which may result in functional divergence between the duplicates. Divergence in AS patterns between duplicates is another way in which the expression of duplicated genes can change in plants, in addition to the previously reported changes in transcript levels in different organs, developmental stages, and in response to stress conditions. AS divergence between duplicated genes may contribute to gene regulatory and functional evolution and potentially lead to preservation of some duplicated genes. Future studies of AS in duplicated genes at different evolutionary time scales, including evolutionarily recent polyploids, as well as whole-transcriptome studies of AS in duplicate genes will reveal further insights into how duplicate genes diverge in AS patterns.

Supplementary Material

Supplementary tables S1–S9 and Supplementary figures S1–S2 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

We thank Jonathan Wendel and the Adams laboratory for comments on the manuscript. This research was supported by a grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada, and by infrastructure funds from the Canada Foundation for Innovation. S.Z.H. was supported in part by an Undergraduate Student Research Award from NSERC.

References

Barbazuk WB, Fu Y, McGinnis KM. 2008. Genome-wide analyses of alternative splicing in plants: opportunities and challenges. Genome Res. 18:1381–1392.
Bininda-Emonds ORP. 2005. TransAlign: using amino acids to facilitate the multiple alignment of protein-coding DNA sequences. BMC Bioinformatics. 6:156.
Blanc G, Hokamp K, Wolfe KH. 2003. A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome. Genome Res. 13:137–144.
Blanc G, Wolfe KH. 2004a. Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell. 16:1679–1691.
Blanc G, Wolfe KH. 2004b. Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell. 16:1667–1678.
Casneuf T, De Bodt S, Raes J, Maere S, Van de Peer Y. 2006. Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana. Genome Biol. 7:R13.
Cui L, Wall PK, Leebens-Mack JH, et al. (13 co-authors). 2006. Widespread genome duplications throughout the history of flowering plants. Genome Res. 16:738–749.
Cusack BP, Wolfe KH. 2007. Not born equal: increased rate asymmetry in relocated and retrotransposed rodent gene duplicates. Mol Biol Evol. 24:679–686.
Dinesh-Kumar SP, Baker BJ. 2000. Alternatively spliced N resistance gene transcripts: their possible role in tobacco mosaic virus resistance. Proc Natl Acad Sci USA. 97:1908–1913.
Duarte JM, Cui L, Wall PK, Zhang Q, Zhang X, Leebens-Mack J, Ma H, Altman N, dePamphilis CW. 2006. Expression pattern shifts following duplication indicative of subfunctionalization and neofunctionalization in regulatory genes of Arabidopsis. Mol Biol Evol. 23:469–478.
Finn RD, Tate J, Mistry J, et al. (11 co-authors). 2008. The Pfam protein families database. Nucleic Acids Res. 36:D281–D288.
Ganko EW, Meyers BC, Vision TJ. 2007. Divergence in expression between duplicated genes in Arabidopsis. Mol Biol Evol. 24:2298–2309.
Ha M, Li W-H, Chen Z. 2007. External factors accelerate expression divergence between duplicate genes. Trends Genet. 23:162–166.
Haberer G, Hindemitt T, Meyers B, Mayer KFX. 2004. Transcriptional similarities, dissimilarities, and conservation of cis-elements in duplicated genes of Arabidopsis. Plant Physiol. 136:3009–3022.
Hughes AL, Friedman R. 2007. Likelihood-ratio tests for positive selection of human and mouse duplicate genes reveal nonconservative and anomalous properties of widely used methods. Mol Phylogenet Evol. 42:388–393.
Hulo N, Bairoch A, Bulliard V, Cerutti L, De Castro E, Langendijk-Genevaux PS, Pagni M, Sigrist CJ. 2006. The PROSITE database. Nucleic Acids Res. 34:D227–D230.
Iida K, Seki M, Sakurai T, Satou M, Akiyama K, Toyoda T, Konagaya A, Shinozaki K. 2004. Genome-wide analysis of alternative pre-mRNA splicing in Arabidopsis thaliana based on full-length cDNA sequences. Nucleic Acids Res. 32:5096–5103.
Irimia M, Rukov JL, Roy SW, Vinther J, Garcia-Fernandez J. 2008. Widespread evolutionary conservation of alternatively spliced exons in Caenorhabditis. Mol Biol Evol. 25:375–382.
Kopelman NM, Lancet D, Yanai I. 2005. Alternative splicing and gene duplication are inversely correlated evolutionary mechanisms. Nat Genet. 37:588–589.
Letunic I, Doerks T, Bork P. 2009. SMART 6: recent updates and new developments. Nucleic Acids Res. 37:D229–D232.
Lin H, Ouyang S, Egan A, Nobuta K, Haas BJ, Zhu W, Gu X, Silva JC, Meyers BC, Buell CR. 2008. Characterization of paralogous protein families in rice. BMC Plant Biol. 8:18.
Liu SL, Adams K. 2008. Molecular adaptation and expression evolution following duplication of genes for organellar ribosomal protein S13 in rosids. BMC Evol Biol. 8:25.
Macknight R, Duroux M, Laurie R, Dijkwel P, Simpson G, Dean C. 2002. Functional significance of the alternative transcript processing of the Arabidopsis floral promoter FCA. Plant Cell. 14:877–888.
Marchler-Bauer A, Bryant SH. 2004. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 32:W327–W331.
Meyer A, Van de Peer Y. 2009. From 2R to 3R: evidence for a fish-specific genome duplication (FSGD). Bioessays. 27:937–945.
Ner-Gaon H, Halachmi R, Savaldi-Goldstein S, Rubin E, Ophir R, Fluhr R. 2004. Intron retention is a major phenomenon in alternative splicing in Arabidopsis. Plant J. 39:877–885.
Palusa SG, Ali GS, Reddy ASN. 2007. Alternative splicing of pre-mRNAs of Arabidopsis serine/arginine-rich proteins: regulation by hormones and stresses. Plant J. 49:1091–1107.
Palusa SG, Reddy ASN. 2009. Extensive coupling of alternative splicing of pre-mRNAs of serine/arginine (SR) genes with nonsense-mediated decay. New Phytol. 185:83–89.
Prlić A, Down TA, Hubbard TJ. 2005. Adding some SPICE to DAS. Bioinformatics. 21(2 Suppl):ii40–ii41.
Reddy ASN. 2007. Alternative splicing of pre-messenger RNAs in plants in the genomic era. Annu Rev Plant Biol. 58:267–294.
Rosti S, Denyer K. 2007. Two paralogous genes encoding small subunits of ADP-glucose pyrophosphorylase in maize, bt2 and l2, replace the single alternatively spliced gene found in other cereal species. J Mol Evol. 65:316–327.
Rouached H, Berthomieu P, Kassis EE, Cathala N, Catherinot V, Labesse G, Davidian J-C, Fourcroy P. 2005. Structural and functional analysis of the C-terminal STAS (sulfate transporter and anti-sigma antagonist) domain of the Arabidopsis thaliana sulfate transporter Sultr1.2. J Biol Chem. 280:15976–15983.
Schöning JC, Streitner C, Meyer IM, Gao Y, Staiger D. 2008. Reciprocal regulation of glycine-rich RNA-binding proteins via an interlocked feedback loop coupling alternative splicing to nonsense-mediated decay in Arabidopsis. Nucleic Acids Res. 36:6977–6987.
Schranz ME, Mitchell-Olds T. 2006. Independent ancient polyploidy events in the sister families Brassicaceae and Cleomaceae. Plant Cell. 18:1152–1165.
Simillion C, Vandepoele K, Van Montagu MCE, Zabeau M, Van de Peer Y. 2002. The hidden duplication past of Arabidopsis thaliana. Proc Natl Acad Sci USA. 99:13627–13632.
Simpson CG, Fuller J, Maronova M, Kalyna M, Davidson D, McNicol J, Barta A, Brown JWS. 2008. Monitoring changes in alternative precursor messenger RNA splicing in multiple gene transcripts. Plant J. 53:1035–1048.
Staiger D, Zecca L, Kirk DAW, Apel K, Eckstein L. 2003. The circadian clock regulated RNA-binding protein AtGRP7 autoregulates its expression by influencing alternative splicing of its own pre-mRNA. Plant J. 33:361–371.
Su Z, Wang J, Yu J, Huang X, Gu X. 2006. Evolution of alternative splicing after gene duplication. Genome Res. 16:182–189.
Thierry-Mieg D, Thierry-Mieg J. 2006. AceView: a comprehensive cDNA-supported gene and transcript annotation. Genome Biol. 7:S12.
Van de Peer Y, Maere S, Meyer A. 2009. The evolutionary significance of ancient genome duplications. Nat Rev Genet. 10:725–732.
Wan CY, Wilkins TA. 1994. A modified hot borate method significantly enhances the yield of high-quality RNA from cotton (Gossypium hirsutum L.). Anal Biochem. 223:7–12.
Wang BB, Brendel V. 2006. Genomewide comparative analysis of alternative splicing in plants. Proc Natl Acad Sci USA. 103:7175–7180.
Yim WC, Lee B-M, Jang CS. 2009. Expression diversity and evolutionary dynamics of rice duplicate genes. Mol Genet Genomics. 281:483–493.
Zhang XC, Gassmann W. 2003. RPS4-mediated disease resistance requires the combined presence of RPS4 transcripts with full-length and truncated open reading frames. Plant Cell. 15:2333–2342.
Zheng Q, Wang XJ. 2008. GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis. Nucleic Acids Res. 36:W358–W363.
Zou C, Lehti-Shiu MD, Thomashow M, Shiu SH. 2009. Evolution of stress-regulated gene expression in duplicate genes of Arabidopsis thaliana. PLoS Genet. 5:e1000581.