Extensive Divergence in Alternative Splicing Patterns after
Gene and Genome Duplication During the Evolutionary
History of Arabidopsis
Peter G. Zhang,† Suzanne Z. Huang,‡ Anne-Laure Pin,§ and Keith L. Adams*
Research article
UBC Botanical Garden and Centre for Plant Research, Department of Botany, University of British Columbia, Vancouver,
British Columbia, Canada
†Present address: Centre for Molecular Medicine and Therapeutics, Vancouver, British Columbia, Canada
‡Present address: Shalhevet High School, Vancouver, British Columbia, Canada
§Present address: Centre de recherche en cancérologie de l’Université Laval (Laval University), L’Hôtel-Dieu de Québec,
Québec, Canada
*Corresponding author: E-mail: [email protected].
Associate editor: Aoife McLysaght
Abstract
Gene duplication at various scales, from single gene duplication to whole-genome (WG) duplication, has occurred
throughout eukaryotic evolution and contributed greatly to the large number of duplicated genes in the genomes of many
eukaryotes. Previous studies have shown divergence in expression patterns of many duplicated genes at various
evolutionary time scales and cases of gain of a new function or expression pattern by one duplicate or partitioning of
functions or expression patterns between duplicates. Alternative splicing (AS) is a fundamental aspect of the expression
of many genes that can increase gene product diversity and affect gene regulation. However, the evolution of AS patterns
of genes duplicated by polyploidy, as well as in a sizable number of duplicated gene pairs in plants, has not been examined.
Here, we have characterized conservation and divergence in AS patterns in genes duplicated by a polyploidy event during the
evolutionary history of Arabidopsis thaliana. We used reverse transcription–polymerase chain reaction to assay 104 WG
duplicates in six organ types and in plants grown under three abiotic stress treatments to detect organ- and stress-specific
patterns of AS. Differences in splicing patterns in one or more organs, or under stress conditions, were found between the
genes in a large majority of the duplicated pairs. In a few cases, AS patterns were the same between duplicates only under one
or more abiotic stress treatments and not under normal growing conditions or vice versa. We also examined AS in 42 tandem
duplicates and we found patterns of AS roughly comparable with the genes duplicated by polyploidy. The alternatively spliced
forms in some of the genes created premature stop codons that would result in missing or partial functional domains if the
transcripts are translated, which could affect gene function and cause functional divergence between duplicates. Our results
indicate that AS patterns have diverged considerably after gene and genome duplication during the evolutionary history of the
Arabidopsis lineage, sometimes in an organ- or stress-specific manner. AS divergence between duplicated genes may have
contributed to gene functional evolution and led to preservation of some duplicated genes.
Key words: gene duplication, polyploidy, alternative splicing, gene expression.
Introduction
Genome duplication (polyploidy) has been a common phenomenon among eukaryotes of various groups, including
plants, animals, and some single-celled eukaryotes. Two
rounds of polyploidy are inferred to have occurred during
the early evolutionary history of vertebrates (reviewed in
Van de Peer et al. 2009), and certain lineages, such as
ray-finned fish, have experienced an ancient whole-genome
(WG) duplication during their evolutionary history (reviewed in Meyer and Van de Peer 2009). Among flowering
plants, multiple ancient polyploidy events have occurred
during angiosperm evolution and most lineages have experienced at least one round of polyploidy (e.g., Blanc and
Wolfe 2004b; Cui et al. 2006). Polyploidy is an ongoing
process in plants, and many plant species are cytologically
polyploids, having experienced an evolutionarily recent
polyploidy event. Segmental duplication of regions of
a chromosome, tandem gene duplication, and duplicative
retroposition also have contributed to the large number of
duplicated genes in plant genomes. Over time, many duplicated genes are lost and some of those that are retained
gain new functions and/or expression patterns (neofunctionalization) or subdivide their functions and/or expression patterns between them (subfunctionalization).
Expression studies of genes duplicated by ancient WG
duplication events, including those during the evolutionary
history of the Brassicaceae and the Poaceae families, as well
as other types of duplicated genes in plants, have revealed
considerable divergence in expression patterns among different organ types, developmental stages, and in response
to abiotic and biotic stress conditions (Haberer et al. 2004;
Blanc and Wolfe 2004a; Casneuf et al. 2006; Duarte et al.
2006; Ganko et al. 2007; Ha et al. 2007; Yim et al. 2009; Zou
et al. 2009). The divergence in expression patterns, along
with asymmetric rates of sequence evolution between
some pairs, has been interpreted as evidence of functional
divergence (e.g., Blanc and Wolfe 2004a).
© The Author 2010. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please
e-mail: [email protected]
Mol. Biol. Evol. 27(7):1686–1697. 2010 doi:10.1093/molbev/msq054
Advance Access publication February 25, 2010
A fundamental aspect of the expression of many genes is
alternative splicing (AS). AS creates multiple forms of
mature messenger RNAs (mRNAs) from a single precursor
mRNA by using different 5′ and/or 3′ splice sites. There are
multiple types of AS, including exon skipping where an
exon is excluded from the mature mRNA, intron retention
in which a complete intron remains in the transcripts, and
AS at the 5′ end of an intron (alternative donor) or the
3′ end (alternative acceptor) (reviewed in Reddy 2007;
Barbazuk et al. 2008). More than one intron in a transcript
can be affected by AS to create multiple mature mRNAs
that are differently spliced. Despite the importance of
AS to gene expression, the conservation of AS patterns
after WG duplication remains unknown, and AS events
for a sizable set of duplicated gene pairs have not been
examined and compared in plants.
In this study, we investigated the evolutionary conservation and divergence of AS patterns in genes duplicated by
polyploidy during the evolutionary history of the Arabidopsis lineage approximately 35 million years ago, after the divergence of the Brassicaceae from the Cleomaceae (Simillion
et al. 2002; Blanc et al. 2003; Schranz and Mitchell-Olds
2006). WG duplicates provide a good system to investigate
changes in AS patterns of duplicated genes because of their
simultaneous formation. We compared the presence/
absence of AS events between genes in a WG duplicate pair
by reverse transcription–polymerase chain reaction
(RT-PCR) for 104 genes using multiple organ types and three
abiotic stress treatments to detect organ- and stress-specific
AS patterns. We also studied 42 genes duplicated in tandem.
Considerable divergence has occurred in AS events, some of
which is organ or stress specific, following gene and genome
duplication in the Arabidopsis thaliana lineage. Some of the
AS changes likely cause alterations in function.
Materials and Methods
Plant Growth, Stress Treatments, and Nucleic
Acid Extraction
Arabidopsis thaliana (Columbia ecotype) plants were grown in
soil with a photoperiod of 16-h light and 8-h dark at room
temperature (22 ± 3°C). Cauline leaves, bolt stems, rosette
leaves, roots, siliques, open flowers, and flower buds were
collected at about the same time of day to minimize circadian effects, and samples were frozen in liquid nitrogen.
Three sets of tissue from different plants were collected as
biological replicates with several plants used per replicate.
Three stress treatments were performed: Cold stress was
done at 4°C for 72 h; heat stress was done at 38°C for
6 h, as in Palusa et al. (2007); and drought stress consisted of
withholding water for 1 week and then drying in a fume hood for
24 h. Organ harvesting was done at the location of stress
treatment, and tissue was frozen with liquid nitrogen.
Gene Choice and Primer Design
Gene pairs were selected from the Blanc data set of approximately 2,500 alpha WG duplicate gene pairs (Blanc et al.
2003) from a list provided in Blanc and Wolfe (2004a). At
the time this project was started, it was estimated that
about 22% of genes in A. thaliana undergo AS of one or
more introns in one or more organ types or growth conditions
(Wang and Brendel 2006). To avoid potentially assaying
a large number of gene pairs with no AS that would be
uninformative in terms of AS conservation between duplicates, we selected gene pairs in which at least one gene in
the pair showed evidence for AS in available complementary DNA (cDNA) and expressed sequence tag (EST) data.
The Alternative Splicing in Plants database (Wang and
Brendel 2006; 30 June 2007 update) and AceView
(Thierry-Mieg and Thierry-Mieg 2006; October 2008 update) were used to find introns in the genes that are alternatively spliced. Those databases contain full-length cDNA
sequences and partial ESTs from different types of libraries,
but neither is comprehensive with respect to AS forms. In
addition, a few gene pairs were included because AS in one gene
of the pair had been assayed in Ner-Gaon et al. (2004) using transcripts associated with ribosomes. Gene pairs were selected
without knowledge of the AS status of the other duplicate,
except when the first duplicate in a pair did not show AS
among the available cDNA sequences (although it could
show AS that was not detected in the EST collections). We
also had no prior knowledge of the organ- and stress-specific
AS of the gene pairs. We disregarded any ESTs in which all
introns were retained because they might represent
unspliced transcripts. The number of ESTs showing a particular AS event compared with the total number of ESTs in
that region is shown in supplementary table S1, Supplementary Material online. Some of the assayed gene pairs
did not show AS in the organ types or conditions used
in this study, presumably because the AS was present in
a different organ or growth condition (such as hormone-treated callus), and those genes were discarded because
they were uninformative about AS.
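The selection rules described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure actually used to query the Alternative Splicing in Plants database or AceView: the `Est` record, its field names, and the restriction to intron retention are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Est:
    """One EST/cDNA aligned to a gene (hypothetical record layout)."""
    gene: str
    introns_spanned: int   # introns covered by the alignment
    introns_retained: int  # how many of those introns are retained


def shows_as(ests):
    """Return True if any usable EST supports an intron-retention event."""
    for est in ests:
        if est.introns_spanned == 0:
            continue
        # ESTs in which all introns are retained may be unspliced
        # transcripts, so they are disregarded.
        if est.introns_retained == est.introns_spanned:
            continue
        if est.introns_retained > 0:
            return True
    return False


def select_pairs(pairs, ests_by_gene):
    """Keep duplicate pairs in which at least one gene shows evidence of AS."""
    return [
        (g1, g2)
        for g1, g2 in pairs
        if shows_as(ests_by_gene.get(g1, [])) or shows_as(ests_by_gene.get(g2, []))
    ]
```

With hypothetical gene names, a pair is kept when one gene has an EST retaining one of three spanned introns, and dropped when the only evidence is an EST retaining every intron it spans.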
Primers were designed to be specific to each gene in
a pair, and they were searched against the genome of
A. thaliana to determine if they would likely amplify other
paralogs. Primer sequences are listed in supplementary
table S2, Supplementary Material online.
Nucleic Acid Extraction, RT, and Gene
Amplification and Sequencing
Total RNA was extracted with Trizol (Invitrogen) for most
organs and hot borate extraction (Wan and Wilkins 1994)
for siliques. RNA quality and integrity were checked on agarose
gels, and the concentration was determined with a spectrophotometer. Genomic DNA was extracted using the Qiagen
DNeasy kit. RNA was treated with DNase I (New England
Biolabs) according to the manufacturer’s protocol to remove
genomic DNA. The DNase-treated RNA was used in RT
reactions with M-MLV reverse transcriptase (Invitrogen)
according to the manufacturer’s instructions with oligo
dT used as a primer to prime on poly(A) tails of transcripts.
Negative controls were made for each sample with no reverse
transcriptase to check for contaminating genomic DNA.
PCRs were set up as in Liu and Adams (2008) and
30–35 cycles of PCR were carried out with the following
conditions in each cycle: 96°C for 10 s, annealing at the
optimal annealing temperature for 30 s, and 72°C for
30 s. The annealing temperature for each gene was optimized using a gradient beforehand. A negative control with
water instead of template was used to ensure all reagents
were free of DNA contamination. PCR products were run
on 1.5% agarose gels for band separation and stained with
ethidium bromide for visualization. Gels were scored based
on presence or absence of bands of the expected sizes.
Some of the RT-PCR products that showed only one band
on the gel were directly sequenced to confirm the presence
of a single PCR amplicon. RT-PCR products with less than
20 bp difference between splice forms were resolved by
direct sequencing to determine if one or two splice
products were present. Some bands representing putative
AS products, as well as unexpected bands, were cut out
of the gels, and a gel extraction kit (Qiagen) was used
to elute the DNA. PCR products were sequenced using
BigDye Version 3.1 (Applied Biosystems) and run on an
ABI 3730 DNA sequencer.
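The gel-scoring logic described above can be illustrated with a short Python sketch. It assumes the simplest case, retention of a single intron within the amplified region; the function names and the `tolerance_bp` value are hypothetical and are not taken from the study. Only the 20-bp threshold for direct sequencing comes from the text.

```python
def expected_sizes(spliced_bp, intron_bp):
    """Expected amplicon sizes (bp) for the spliced and intron-retained forms."""
    return {"spliced": spliced_bp, "intron_retained": spliced_bp + intron_bp}


def resolution_method(size_a, size_b, min_gel_difference=20):
    """Forms differing by less than 20 bp are resolved by direct sequencing."""
    if abs(size_a - size_b) >= min_gel_difference:
        return "gel"
    return "direct sequencing"


def score_bands(observed_bp, expected, tolerance_bp=10):
    """Score presence/absence of each expected form among observed band sizes.

    tolerance_bp is an illustrative assumption, not a value from the study.
    """
    return {
        form: any(abs(band - size) <= tolerance_bp for band in observed_bp)
        for form, size in expected.items()
    }
```

For a 300-bp spliced product and an 85-bp intron, the two forms differ enough to be scored on a gel, whereas a 12-bp difference would be sent for direct sequencing.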
Sequence Alignment and Phylogenetic Analysis
TransAlign (Bininda-Emonds 2005), an amino acid–based
alignment for coding DNA sequence, was used to align
the sequences of the sulfate transporter genes. Maximum
parsimony in PHYLIP was used for phylogenetic analysis.
Bootstrapping of 100 replicates was performed using
PHYLIP bootstrap with the neighbor-joining algorithm.
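For readers unfamiliar with bootstrapping of alignments, the resampling step can be sketched as follows. This is a generic illustration of column resampling with replacement, not PHYLIP's implementation; the function names and the fixed seed are assumptions, and tree building on each replicate is omitted.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n=100, seed=0):
    """Generate n pseudo-replicate alignments (100 replicates, as in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n)]
```

Each replicate has the same dimensions as the original alignment, and every resampled column is a column of the original, so homologous positions stay together across sequences.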
Domain and Gene Ontology Analyses
Protein domain analysis was performed using the Conserved
Domains Database at National Center for Biotechnology
Information (NCBI) (Marchler-Bauer and Bryant 2004;
http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml),
Pfam (Finn et al. 2008; http://pfam.sanger.ac.uk/), PROSITE
(Hulo et al. 2006; http://ca.expasy.org/prosite/),
and SMART (Letunic et al. 2009; http://smart.embl-heidelberg.de/). Gene ontology (GO) analysis was done
using GOEAST (Zheng and Wang 2008; http://omicslab.genetics.ac.cn/GOEAST/).
Results
Comparisons of AS Events between WG
Duplicates in Multiple Organs
We used RT-PCR to individually assay conservation of AS
patterns in a set of 52 WG duplicate pairs identified in
Blanc et al. (2003). The vast majority of AS events assayed
were intron retentions, which is the most common type of
AS in plants (Wang and Brendel 2006). Some of the alternatively spliced introns are flanked by exons in the coding
region, whereas others are located in the 5′ untranslated
region (UTR) or 3′ UTR. Primers for RT-PCR were designed
to amplify a region containing one or more putative AS
event(s). An AS event is a single case of AS, such as retention
of a single intron. The primers were designed to not cross-amplify other duplicates in the genome.
RT-PCR was used to evaluate the AS events in six organ
types because AS in plants can be organ specific (e.g., Palusa
et al. 2007; Simpson et al. 2008). Rosette leaves, roots, bolt
stems, cauline leaves, whole flowers, and green siliques were
examined, with two replicates of each organ type. In most
cases, the presence or absence of splice forms was identical
between replicates; those few that were not identical were
examined in a third replicate. Examples of RT-PCR gels
showing AS events are shown in figure 1. Some of the
RT-PCR products with a single splice form on the agarose
gels were directly sequenced to verify the presence of only
a single amplicon (e.g., genes shown in fig. 2).
The same AS event (e.g., retention of the same intron)
was found in only one of the two WG duplicates for 33 gene
pairs; both WG duplicates showed the same AS event in
21 gene pairs; both conserved and nonconserved AS events
were seen in 4 gene pairs; and neither WG duplicate
showed AS in the region examined for 1 gene pair
(although there was AS after stress treatment, below)
(tables 1 and 2). Among the 21 gene pairs with the same
AS event in both WG duplicates, coexpression of both transcript isoforms in different organs between the two genes
was found for 11 gene pairs. Thus, although the splice form
is conserved when comparing some organs, regulation of
the usage of that splice form is different between the
two genes. Complete conservation of the assayed AS
event(s) in all organs examined in both WG duplicates
was found for 10 gene pairs and 11 AS events (i.e., retention
of a single homologous intron in both WG duplicates),
whereas divergence between the gene pairs was found
for 48 AS events (table 1).
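The categories used above (AS event in only one duplicate, in both duplicates in the same organs, or in both duplicates in different organs) can be expressed as a small classification function. This is a sketch under the assumption that each gene's RT-PCR result is recorded as a presence/absence value per organ; the data layout and function name are hypothetical.

```python
ORGANS = ("rosette leaf", "root", "bolt stem", "cauline leaf", "flower", "silique")


def classify_event(gene1, gene2):
    """Classify one homologous AS event in a duplicate pair.

    gene1, gene2: dict mapping organ -> True if the AS form was detected.
    Organs missing from a dict are treated as "not detected".
    """
    organs1 = {organ for organ in ORGANS if gene1.get(organ, False)}
    organs2 = {organ for organ in ORGANS if gene2.get(organ, False)}
    if not organs1 and not organs2:
        return "neither"
    if not organs1 or not organs2:
        return "one duplicate only"
    if organs1 == organs2:
        return "both, same organs"
    return "both, different organs"
```

Under this scheme, a pair in which one gene retains the intron in all six organs and the other retains it only in vegetative organs is classified as "both, different organs".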
Conservation of AS events between WG duplicates was
organ specific in some cases (tables 1 and 2 and supplementary table S3, Supplementary Material online). For example, a pair of protein kinase genes (At1g18160 and
At1g73660) showed retention of intron 2 in both genes
in vegetative organs, but in flowers and siliques, only
At1g18160 showed retention of the intron. A second example is a pair of genes for nucleic acid–binding proteins
(At1g48920 and At3g18610) where both genes showed
retention of intron 1 in siliques and bolt stems but only
At3g18610 showed retention of the intron in the other
examined organs. Complete partitioning of AS was discovered in a pair of ADP-ribosyltransferase genes where only
At1g32230 retained intron 3 and only At2g35510 showed
skipping of exon 4.
Many of the AS events in the studied genes introduce
premature stop codons that would disrupt the reading
frame if the transcripts are translated. In some of the
duplicated gene pairs, an AS event in one of the genes creates a premature stop codon that results in a truncated
protein that would be missing one or more functional domains or portions of functional domains, if the transcripts
are translated (fig. 3). For example, gene At1g32230, an
ADP-ribosyltransferase, has a skipped exon form that alters
the reading frame and has a premature stop codon that
causes loss of part of the ADP-ribosyl superfamily domain;
the skipped exon form was not found in its duplicate
At2g35510. Gene At3g56400 (WRKY70), a DNA-binding
protein, has an intron-retained form that causes loss of
the WRKY superfamily domain, presumably resulting
in loss of DNA-binding function; in contrast, the intron-retained
form was not detected in its duplicate
At2g40750. In other cases, a premature stop codon does
not result in the loss of any currently recognized functional
domains; however, there could be loss of functional
domains that have not yet been characterized in plants.
FIG. 1. RT-PCR gels from WG duplicates and tandem duplicates
showing AS events. Genes, types, and effects of AS are
(A) At1g28330 and At2g33830, retained intron (IR) divergence;
(B) At4g29160 and At2g19830, IR divergence; (C) At1g79650 and
At1g16190, alternative position divergence; (D) At4g12040 and
At4g22820, IR conservation; (E) At4g30690 (top) and At2g24060
(bottom), IR conservation under stress but not normal conditions;
(F) At3g48440 (top) and At5g63260 (bottom), IR conservation,
although only the non-IR form is present under drought stress in
stems and roots; (G) At1g70100 (top) and At1g24160 (bottom),
IR conservation under normal growth conditions but not under
drought stress; (H) At4g26710 (top) and At5g55290 (bottom),
reciprocal IR between heat and normal conditions in flowers and
stems; (I) At4g30660 (top) and At4g30650 (bottom), IR divergence
under all conditions; (J) At1g78380 (top) and At1g78370 (bottom),
IR only under heat stress in At1g78380. An example of RT-PCR
controls is shown in supplementary figure S1, Supplementary
Material online.
FIG. 2. Chromatograms from direct sequencing of RT-PCR products.
Exon–exon junctions are shown from examples of genes, which
showed only one splice form on the RT-PCR gels. Arrows indicate
exon–exon junctions. No evidence for intron presence was seen in
any of these genes. Panels (B) and (D) show the reverse strand and
the chromatograms were inverted.
AS in WG Duplicates after Abiotic Stress
Treatments
Abiotic stresses are known to cause changes in AS patterns
in plants (e.g., Iida et al. 2004; Palusa et al. 2007). It is
possible that pairs of WG duplicates that did not show conservation of AS events under regular growing conditions
could show the same AS events under abiotic stress conditions or vice versa. To evaluate that possibility, we
assayed AS patterns of the 52 gene pairs in plants treated
under three abiotic stresses—cold, heat, and drought.
Some splicing variants that were present under regular
growing conditions were not present after one or more
of the stress treatments, whereas in other cases, a gene produced a new splicing variant in the stress-treated plants in
one or more organs (fig. 1 and table 1 and supplementary
table S4, Supplementary Material online). In a few cases,
there was a shift from one splicing variant to another.
In a pair of ATP synthase genes, At5g55290 but not
At4g26710 showed a retained intron form under regular
growing conditions but At4g26710 and not At5g55290
showed the retained intron form after each of the three
stress treatments (fig. 1 and table 1 and supplementary
table S4, Supplementary Material online). Another example
is an RAD23 DNA repair gene (At1g16190) that showed
only a retained intron form in rosette leaves subjected
to heat and drought stress but only the non-intron–
retained form under regular growing conditions, whereas
both forms were present in leaves of cold-stressed plants.
Table 1. Conservation and Divergence of AS Patterns in WG and
Tandem Duplicates.
Category                                                               Pairs   Events
AS event in only one WG duplicate                                      33      36
AS event in both WG duplicates                                         21      23
    AS in the same organs                                              10      11
    AS in different organs                                             11      12
AS event conserved only after one or more abiotic stress treatments    8       8
AS event in one tandem duplicate                                       11      12
AS event in both tandem duplicates                                     9       9
    AS in same organs                                                  5       5
    AS in different organs                                             4       4
AS event conserved only after one or more abiotic stress treatments    4       4
NOTE.—AS event refers to a single type of AS in a single intron, for example,
retention of a homologous intron.
There were seven gene pairs that did not show conservation of an AS event (e.g., a retained intron form in transcripts from one gene but not the other) under regular
growing conditions but did show the same AS event (i.e.,
the transcripts from both genes retained the intron) after
one or more stress conditions. In a pair of IF-3 translation
initiation factor genes, At4g30690 retained intron 6 under
all conditions, whereas At2g24060 retained intron 6 under all three stresses in most organs but not under normal
growing conditions (fig. 1). Between a pair of lipoamide
dehydrogenases At1g48030 and At3g17240, At1g48030
showed retention of intron 2 under all conditions but
At3g17240 showed retention of intron 2 only under
the three stress conditions in some organs. Thus, AS
events were conserved between WG duplicates only after
abiotic stress treatments in the two cases previously mentioned. Conversely, a pair of genes of unknown function
(At1g70100 and At1g24160) showed AS conservation
between WG duplicates only under normal conditions
and not under any of the three stress conditions. In
addition, one gene pair showed AS in both duplicates
under stress conditions but in neither duplicate
under normal conditions. A few WG duplicate gene
pairs showed opposite splice forms in the two genes under
stress. Examples of those
genes include a pair of genes with unknown function
(At1g16840 and At1g78890) in cold-stressed plants where
At1g78890 showed only a retained intron form and
At1g16840 showed only the nonretained intron form
and a pair of genes for RAD23 DNA repair proteins
(At1g79650 and At1g16190) under heat and drought
stresses in rosette leaves where At1g16190 showed only
an intron-retained form and At1g79650 showed only the
non-intron–retained form (supplementary table S4, Supplementary Material online). All the above-mentioned
examples indicate how the apparent conservation and divergence of AS events in WG duplicate gene pairs can
vary by growing conditions.
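The dependence of apparent conservation on growing conditions can be made explicit with a small helper. The sketch below assumes that, for one AS event, presence or absence of the AS form in each duplicate is summarized separately for normal conditions and for the stress treatments; the function name and return labels are hypothetical.

```python
def conservation_by_condition(normal, stress):
    """Compare conservation of one AS event under normal and stress conditions.

    normal, stress: (gene1_has_as, gene2_has_as) tuples of booleans.
    An event counts as conserved when both duplicates show the AS form.
    """
    conserved_normal = normal[0] and normal[1]
    conserved_stress = stress[0] and stress[1]
    if conserved_normal and conserved_stress:
        return "conserved under both"
    if conserved_stress:
        return "conserved only under stress"
    if conserved_normal:
        return "conserved only under normal conditions"
    return "not conserved"
```

For the IF-3 translation initiation factor pair described above, At4g30690 retained intron 6 under all conditions and At2g24060 only under stress, so `conservation_by_condition((True, False), (True, True))` gives "conserved only under stress".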
AS Divergence in Tandem Duplicates
In addition to the WG duplicates, we assayed 21 pairs of
tandem duplicates, identified in Haberer et al. (2004).
RT-PCR was performed using RNA from six organs (rosette
leaves, roots, bolt stems, cauline leaves, unopened flower
buds, and mature opened flowers) and in response to heat
and drought stress treatments. Of 21 tandem pairs, 11
showed an AS event in only one gene in the region analyzed
and 9 showed AS in both genes (table 1). Among the gene
pairs with the AS event in both genes, the AS patterns were
the same in all six organs in five pairs and showed different
organ specificity in four pairs (tables 1 and 3 and supplementary table S5, Supplementary Material online). Thus,
like the WG duplicates, there has been considerable divergence in AS events between the tandem duplicate genes
that were assayed.
We also examined AS in plants grown under two abiotic
stress treatments to determine if any genes with diverged
AS patterns between duplicates under normal growing
conditions had the same AS patterns between duplicates
after stress treatment or vice versa. Thirteen tandem duplicate pairs
showed the same AS event in both duplicates under one or
both stresses with seven having AS in the same organs and
six having AS in different organs. In contrast, nine gene
pairs showed an AS event only in one duplicate. Two pairs,
At1g78380 and At1g78370 coding for glutathione transferases along with At3g22231 and At3g22240 that have
unknown functions, showed an AS event in one gene only
after one or both stress treatments. The abiotic stress treatments affected the perceived AS conservation between duplicates in four pairs of duplicates (table 3). In three cases,
only one gene in the pair had the AS form under normal
growing conditions but both genes had the AS form after
one or both stresses.
Several of the AS events in the tandem duplicates introduce premature stop codons that would disrupt the
reading frame if the transcripts are translated. In some of
the duplicated gene pairs, an AS event in one of the genes
creates a premature stop codon that results in a truncated protein that is missing one or more functional
domains, or portions of functional domains, if the transcripts are translated (figs. 3 and 4). For example, gene
At5g06860, which codes for a polygalacturonase-inhibiting protein, contains a retained intron that results
in premature stop codon formation that would result in
loss of some of the leucine-rich repeats. Gene At1g78000,
coding for a sulfate transporter, has a retained intron
form that results in loss of part of the STAS (Sulfate
Transporter and AntiSigma factor antagonist) functional
domain, whereas its duplicate At1g77990 does not have
that form (fig. 4).
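How a retained intron can introduce a premature stop codon can be shown with a toy example. The sequences below are invented for illustration and are not from any gene in this study; the function names are hypothetical, and the sketch assumes the coding sequence starts at the first base and ends with its normal stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(cds):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3].upper() in STOP_CODONS:
            return i // 3
    return None


def retain_intron(exons, introns, retained_index):
    """Join exons, keeping introns[retained_index] after exons[retained_index]."""
    parts = []
    for k, exon in enumerate(exons):
        parts.append(exon)
        if k == retained_index:
            parts.append(introns[k])
    return "".join(parts)


def has_premature_stop(cds):
    """True if the first in-frame stop lies upstream of the final codon."""
    stop = first_stop(cds)
    return stop is not None and 3 * (stop + 1) < len(cds)


# Toy two-exon gene with one 12-nt intron (invented sequences).
exons = ["ATGGCTGCT", "GGAGGAGGATAA"]
introns = ["GTAAGTTAGCAG"]
spliced = "".join(exons)
intron_retained = retain_intron(exons, introns, 0)
```

In this toy case the spliced transcript reads through to its normal stop codon, whereas the intron-retained transcript meets an in-frame TAG inside the intron, so translation would end before the second exon.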
Finally, we examined AS in two subgroups of the
sulfate transporter family that contain both tandem
duplicates and WG duplicates that were part of this
study to look at the evolutionary dynamics of AS in
the family. At1g22150 and At1g78000 are WG duplicates, and genes At1g77990 and At1g78000 are tandem
duplicates; At1g78000 is both a WG duplicate and a tandem duplicate. Of the five genes only At1g78000 retains
the final intron (fig. 4A), suggesting gain of the AS event
in this gene after WG duplication.
Table 2. Genes Studied and Summary of RT-PCR Results for WG Duplicates.
Gene Number (Gene 1, Gene 2)    Function or Putative Function
At5g07370, At5g61760            Inositol phosphate kinase
At4g30690, At2g24060            Translation initiation factor
At2g43010, At3g59060            PIF transcription factor
At5g55550, At4g26650            RNA recognition motif
At1g32230, At2g35510            ADP-ribosyltransferase
At5g18620, At3g06400            Chromatin remodeling factor
At1g50630, At3g20300            Extracellular ion channel
At3g56400, At2g40750            WRKY transcription factor
At3g23830, At4g13850            Glycine-rich RNA binding
At2g19620, At5g56750            Ndr family
At3g48440, At5g63260            Nucleic acid binding
At1g76140, At1g20380            Prolyl oligopeptidase
At2g25850, At4g32850            Nucleotidyltransferase
At3g05640, At5g27930            Protein phosphatase
At1g03457, At4g03110            RNA-binding protein
At1g16840, At1g78890            Unknown protein
At1g73650, At1g18180            Oxidoreductase
At1g48030, At3g17240            Lipoamide dehydrogenase
At2g21940, At4g39540            Shikimate kinase
At3g15980, At1g52360            Coatomer protein complex
At2g21660, At4g39260            Glycine-rich RNA binding
At1g19000, At1g74840            myb transcription factor
At3g45240, At5g60550            Kinase
At4g22590, At4g12430            Phosphatase
At3g02900, At5g16660            Unknown protein
At2g01180, At1g15080            Phosphatidate phosphatase
At1g15960, At1g80830            Metal ion transporter
At1g80910, At1g16020            Unknown protein
At1g79650, At1g16190            RAD23 DNA repair
At1g19400, At1g75180            Unknown protein
At1g78000, At1g22150            Sulfate transporter
At1g70100, At1g24160            Unknown protein
At2g17320, At4g35360            Pantothenate kinase
At4g26710, At5g55290            ATP synthase subunit H
At2g45170, At3g60640            Microtubule binding
At3g01500, At5g14740            Carbonic anhydrase
At1g15920, At1g80780            Transcription complex
At1g48920, At3g18610            Nucleic acid binding
At1g18160, At1g73660            Protein kinase
At5g47080, At4g17640            Casein kinase II beta chain
At2g37340, At3g53500            RSZ splicing factor
At5g16800, At3g02980            N-acetyltransferase
At4g12040, At4g22820            AN-1–like zinc finger
At4g14410, At3g23210            bHLH family
At1g52250, At3g16120            Dynein light chain
At3g46130, At5g59780            myb transcription factor
At1g55310, At3g13570            SCL splicing factor
At2g17640, At4g35640            Serine acetyltransferase
At4g29160, At2g19830            SNF7 family
At2g20590, At4g28430            Reticulon family
At4g39140, At2g21500            Protein and zinc ion binding
At2g39250, At3g54990            Transcription factor
AS events, listed in table row order
AS    Normal    Stress    Effect of AS
I     + −       + −       5′ UTR
I     + −       + + d     PTC; domain loss
I     + −       − −       PTC
I     + −       + −       5′ UTR
I     + −       + −       PTC
E     − +       − +       PTC; domain loss
I     + −       − −       New stop
I     + + s     + −       New stop
I     + −       + −       PTC; domain loss
A     + −       + −       5′ UTR
I     + + d     + + d     PTC
I     + + s     + + d     PTC; domain loss
I     + + d     + + d     PTC; domain loss
I1    + + s     + + s     New stop
I2    + + s     + + s     New stop
E     − +       − +       New stop
D     + −       + x       5′ UTR
A     + −       + −       Longer ORF
I     − −       + −       PTC
I     − +       + −*      New stop
I     + −       + −       New stop
I     + −       + + d     3′ UTR
I     + −       + −       PTC
I     + −       + −       3′ UTR
D     − −       + −       New stop
D     + + d     + + s     Longer ORF
I     + −       + + s     Longer ORF
I     + + d     + −       3′ UTR
D     + −       − −       5′ UTR
I     − −       − +       5′ UTR
I     − −       + −       New stop
A     + −       + −       Longer ORF
I     − −       + −       PTC
I     − +       + + d     5′ UTR
I1    + −       + −       PTC; domain disrupt
I2    + −       + −       PTC; domain disrupt
I     − −       − +       New stop
P     + −       − −       Longer ORF
I     − −       − +       PTC; domain loss
I     + + s     + −       5′ UTR
I     + −       + −       New stop
I     + + s     + −       New stop
I     + −       + −       PTC
I     − +       + −       5′ UTR
I     + −       + + d     5′ UTR
I     + −       + −       New stop
I     − +       + + s     5′ UTR
I     + + d     + + d     New start
I     + + d     + + d     PTC; domain loss
I     + + d     + + s     New stop
A     + + s     + + s     PTC; domain loss
I1    − −       + + d     PTC; domain loss
I2    − −       + −       PTC; domain loss
I     + + s     + + s     New stop
I     + + s     + + s     5′ UTR
I     + + d     + + s     New start
A     + −       + −       PTC
I     + + d     + + d     PTC; domain disrupt
I     + + d     + + s     New start
D     + + d     + + d     New start
I     + −       + + d     PTC; domain disrupt
E     + + s     + + s     PTC; domain disrupt
I     + + s     + + s     PTC
I     + −       + −       5′ UTR
I     + + d     + + d     PTC
A     + −       + −       5′ UTR
I1    + −       + −       PTC; domain disrupt
I2    + −       + −       PTC; domain disrupt
NOTE.—Types of AS: I, intron retention; A, alternative acceptor; D, alternative donor; P, alternative position; E, skipped exon. Data for normal and stress conditions: Gene 1
data are listed first, followed by gene 2 data. Plus signs indicate an AS event in one or more organs; minus signs indicate no AS event. s indicates that the same AS event is
present in the same organs in both WG duplicates, d indicates that AS is present in different organs in each WG duplicate, x indicates no expression, and an asterisk indicates
that only the retained intron form is present. Bold indicates genes that show an AS event that is present in both WG duplicates under one or more stress conditions but not
under normal conditions, or vice versa. PTC indicates a premature stop codon, new start indicates that a new start codon is formed, new stop indicates that a new stop codon is
formed in the final intron, and longer ORF indicates a longer open reading frame. See supplementary tables S3 and S4, Supplementary Material online, for complete organ
and stress data.
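The coding scheme described in the note can be read mechanically. The following is a minimal sketch, assuming each Normal or Stress cell is a short string whose first two symbols refer to gene 1 and gene 2; the function name and the returned labels are illustrative and are not taken from the study.

```python
def classify_cell(code: str) -> str:
    """Interpret one Normal/Stress cell using the table note's scheme.

    First symbol = gene 1, second symbol = gene 2:
    '+' AS event detected, '-' no AS event, 'x' no expression.
    Optional suffix: 's' same organs, 'd' different organs.
    """
    # Normalize the typographic minus sign; drop spaces and the asterisk flag.
    cleaned = code.replace("\u2212", "-").replace(" ", "").replace("*", "")
    if len(cleaned) < 2:
        raise ValueError(f"unrecognized table code: {code!r}")
    gene1, gene2, suffix = cleaned[0], cleaned[1], cleaned[2:]
    if "x" in (gene1, gene2):
        return "not comparable (no expression)"
    if gene1 == "+" and gene2 == "+":
        if suffix == "s":
            return "AS in both duplicates, same organs"
        if suffix == "d":
            return "AS in both duplicates, different organs"
        return "AS in both duplicates"
    if gene1 == "-" and gene2 == "-":
        return "no AS in either duplicate"
    return "AS in one duplicate only"
```

Comparing the label obtained for the Normal cell with the label for the Stress cell of the same row gives a quick way to spot events whose apparent conservation depends on growth conditions.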
Discussion
Extensive Divergence in AS between Duplicated
Genes in Arabidopsis thaliana
The results of our RT-PCR experiments indicate that considerable divergence in AS patterns has occurred between
many of the genes duplicated by the most recent polyploidy
event during the evolutionary history of the Arabidopsis lineage, as well as between many of the tandem duplicates that
were examined in this study. Thus, changes in AS patterns
are an important aspect of the evolution of WG duplicate
and tandem duplicate pairs, in addition to the previously
documented changes in expression patterns (as reviewed
in the Introduction). Several different types of AS events
have been gained or lost, including retained introns, alternative donors and acceptors, and skipped exons. In some
cases, the perceived conservation or lack of conservation
of a splice form between duplicates varied among organ
types or in response to abiotic stress treatments, indicating
that AS regulation has changed between the duplicated
genes. Particularly interesting in this regard are genes where
an AS event is conserved between WG or tandem duplicates
only under one or more abiotic stress treatments and not
under normal growing conditions, showing that regulation
and apparent conservation of AS between duplicated genes
can be affected by stress treatments. Those genes have a variety of functions, including a transcription initiation factor,
a lipoamide dehydrogenase, a phosphatidate phosphatase,
and a polygalacturonase-inhibiting protein, among others.
The opposite effect, AS conservation under normal growing
conditions but not under the tested stress conditions, also
was observed. A few of the WG and tandem duplicate pairs
have conserved AS events in all organs and stress conditions
examined (tables 2 and 3). It is possible that AS regulation has not changed in those genes;
alternatively, AS may have diverged in organs, developmental stages, or abiotic or biotic stress conditions
not examined in this study.
We examined a small fraction of genes duplicated by
the alpha polyploidy event and the tandem duplicates
in several organ types and under two to three different abiotic stress conditions using a sensitive RT-PCR approach.
An advantage of this approach is the precision by which
AS patterns are assayed compared with analyzing EST databases that are incomplete in regard to AS and highly
heterogeneous and unequally sampled in terms of tissues,
organ types, and growth conditions. A disadvantage is that
our data set may or may not be completely representative
of the two classes of duplicates as a whole. The WG duplicate genes we assayed were relatively evenly sampled in
terms of gene classes as defined by GO categories (supplementary tables S6 and S7, Supplementary Material online).
That is, the top 12 GO categories in our data set were
mostly among the top 12 in the entire WG duplicate data
set. In contrast, our tandem data set appears to be less representative of the GO categories, with only some of the top
categories also being among the top in the entire data set
(supplementary tables S8 and S9, Supplementary Material
online). There has not been a systematic study of the
amount of AS in each GO category in A. thaliana and thus
it is unknown if some GO categories have more AS than
others. There are no indications that not having a completely representative sample of genes, from a GO category
perspective, would bias our inferences about conservation
of AS patterns between duplicates in a pair. We think that
the general patterns of divergence between duplicated
genes, as well as the general patterns in different organs
and under the abiotic stress conditions, are likely to be relatively applicable across the WG duplicates, but future
studies of all alpha WG duplicates and tandem duplicates
that assay several organ types and stress conditions using
sensitive detection methods will be needed for a full characterization. Future studies also could examine the
amounts of transcripts with particular AS forms by quantifying the amounts of each form relative to the completely
spliced transcripts. That information may be helpful in inferring which splicing events represent true AS and
which might represent incomplete splicing that could be
present at low levels. Analysis of ribosome-associated
transcripts would also be helpful in that regard.
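The quantification proposed above, the amount of each AS form relative to the completely spliced transcript, can be sketched as a simple ratio. This is an illustrative calculation only: the input values stand for any comparable abundance measure (for example, band intensities or read counts), and the 5% cutoff for flagging low-level forms is a hypothetical threshold, not one used in this study.

```python
def as_form_fraction(as_form_signal: float, fully_spliced_signal: float) -> float:
    """Fraction of total transcript signal contributed by the AS form."""
    total = as_form_signal + fully_spliced_signal
    if as_form_signal < 0 or fully_spliced_signal < 0 or total == 0:
        raise ValueError("signals must be non-negative and not both zero")
    return as_form_signal / total


def is_low_level(fraction: float, threshold: float = 0.05) -> bool:
    """Flag forms below a chosen cutoff (hypothetical default of 5%).

    Low-level forms are candidates for incomplete splicing rather than
    regulated AS, and would need further evidence either way.
    """
    return fraction < threshold
```

For example, an AS form with signal 25 alongside a fully spliced form with signal 75 has a fraction of 0.25 and would not be flagged under the default cutoff.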
We included some tandem duplicates in this study to
determine if extensive divergence in AS patterns between
duplicates is unique to the WG duplicates. Our data indicate that it is not. Our sample of 42 tandem duplicate genes (21 pairs) is
likely too small to make accurate comparisons of the
frequency of AS conservation in WG duplicates compared
with tandem duplicates in general, except to say that the
patterns we saw are roughly comparable, as seen in table 1.
It would take a much larger sample size, preferably all tandem and WG duplicates in the genome, to make thorough
comparisons and to determine if the amount of AS divergence is proportional to the time since duplication. Future
analyses of transcriptome data generated by second-generation sequencing may provide the opportunity to
examine AS conservation and divergence between most
or all duplicate gene pairs in the genome of A. thaliana
and other plants.
FIG. 3. Domain losses and disruptions resulting from AS. Shown are protein diagrams for 16 genes with domains labeled. Arrows indicate
locations in the protein that correspond to the beginning of an AS intron in the corresponding gene sequence; regions to the right of the
arrows would be missing in the proteins if they are translated. Proteins are drawn to scale and are shown with the N-terminus on the left and
the C-terminus on the right. Gene numbers and domain identifiers are (A) At3g56400 (IF-3 translation initiation factor) pfam00707 IF-3
conserved C-terminal domain and pfam05198 IF-3 conserved N-terminal domain. (B) At2g35510 (ADP-ribosyltransferase) cl02729 WWE
domain and cl00283 ADP-ribosyl superfamily domain. (C) At3g56400 (WRKY transcription factor) cl03892 WRKY DNA-binding domain. (D)
At3g48440 (nucleic acid binding) cl11592 zinc finger domain. (E) At1g20380 (prolyl oligopeptidase) peptidase S9 domain. (F) At1g15960 (metal
ion transporter) cl00836 Nramp domain. (G) At1g16190 (RAD23 DNA repair) cd01805, RAD23 N-terminal domain, pfam09280, XPC-binding
domain, cl00153, UBA ubiquitin-associated domain. (H) At1g18160 (protein kinase) cd00192 PTKc, catalytic domain of protein tyrosine
kinases. (I) At2g37340 (RSZ splicing factor) cd00590 RNA recognition motif, pfam00098 zinc knuckle. (J) At1g52250 (dynein light chain)
cl03131. (K) At1g55310 (SCL 33 splicing factor) cd00590 RNA recognition motif. (L) At2g39250 (transcription factor) cd00018 AP2 DNA-binding domain. (M) At4g28420 (aspartate aminotransferase) cd00609 aspartate aminotransferase family. (N) At5g06860 (polygalacturonase
inhibiting) pfam08263 leucine-rich repeat N-terminal domain and COG4886 leucine-rich repeat region. (O) At1g78380 (glutathione transferase)
cd03185 glutathione transferase conserved N-terminal domain and cd03058 glutathione transferase conserved C-terminal domain. (P)
At5g01040 (laccase family) cl06664 Cu-oxidase domains.
Effects of AS Divergence on Gene Regulation and
Function
Many of the AS events that differ between duplicates among
the genes we assayed create a premature stop codon within
the transcripts. In several cases, the protein would be lacking
one or more functional domains if the transcripts are translated, probably resulting in nonfunctional proteins or
Table 3. Genes Studied and Summary of RT-PCR Data from Tandem Duplicates.
Gene Number (Gene 1, Gene 2): Function or Putative Function
At1g27370, At1g27360: DNA-binding protein
At1g78000, At1g77990: Sulfate transporter
At2g29930, At2g29910: F-box family protein
At2g46280, At2g46290: TGF-B receptor interacting
At3g19000, At3g19010: Oxidoreductase
At4g01450, At4g01440: Nodulin MtN21 family
At4g27610, At4g27620: Unknown protein
At4g27870, At4g27860: Integral membrane family
At4g28420, At4g28410: Aminotransferase
At4g30660, At4g30650: Hydrophobic protein
At4g36195, At4g36190: Serine-type peptidase
At5g06860, At5g06870: Polygalacturonase inhibiting
At5g52300, At5g52310: Unknown protein
At1g78380, At1g78370: Glutathione transferase
At1g23510, At1g23520: Unknown protein
At5g01040, At5g01050: Laccase family
At4g16770, At4g16765: Oxidoreductase
At2g17320, At2g17340: Pantothenate kinase related
At1g10360, At1g10370: Glutathione transferase
At4g25690, At4g25670: Unknown protein
At3g22231, At3g22240: Unknown protein
AS
A
I
I
A
I
I
I
I
E
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
Normal
+−
−−
+−
+−
++d
−+
+−
++s
−+
+−
+−
++d
++d
−+
−+
+−
+ −*
−−
++s
++d
++s
++s
++s
+−
−−
Stress
+−
++d
+−
+−
++d
−−
+x
++s
−+
++d
+−
++s
++d
−+
−+
++d
+ −*
+−
++s
++d
++s
++s
++s
++d
+−
Effect of AS
5′UTR
5′UTR
New stop codon
New start codon
5′UTR
New stop codon
New stop codon
5′UTR
5′UTR
PTC
PTC; domain disrupt
PTC
New stop codon
PTC; domain disrupt
New stop codon
PTC; domain disrupt
New start codon
PTC; domain loss
PTC
PTC
PTC
5′UTR
PTC
NOTE.—Types of AS: I, intron retention; A, alternative acceptor; D, alternative donor; P, alternative position; E, skipped exon. Data for normal and stress conditions: Gene 1
data are listed first, followed by gene 2 data. Plus signs indicate an AS event in one or more organs; minus signs indicate no AS event. s indicates that the same AS event is
present in the same organs in both duplicates, d indicates that AS is present in different organs in each duplicate, x indicates no expression, and an asterisk indicates
that only the retained intron form is present. Bold indicates genes that show an AS event that is present in both duplicates under one or more stress conditions but not
under normal conditions, or vice versa. PTC indicates a premature stop codon, new start indicates that a new start codon is formed, and new stop indicates that a new stop codon is
formed in the final intron. See supplementary table S5, Supplementary Material online, for complete organ and stress data.
proteins with altered functions (fig. 3). For example, the retained intron form of At3g56400, a WRKY transcription factor, would lack the WRKY domain. More detailed functional
information is available for the sulfate transporter genes Sultr
1;2 and Sultr 2;2 (tandem pair At1g78000 and At1g77990)
along with Sultr 1;3 and Sultr 1;2 (WG duplicates At1g22150 and
At1g78000). The retained intron in Sultr 1;2 is
the last intron before the normal stop codon, and its retention
creates a stop codon that would result in the
loss of about 39 amino acids at the C-terminus within the
STAS domain (fig. 4). Rouached et al. (2005) showed that
deletion of the last 12 amino acids at the C-terminus of Sultr
1;2 resulted in a 100% reduction of sulfate transport when
expressed and assayed in yeast. Thus, it appears that the
39 amino acid deletion caused by the intron retention
and premature stop codon in Sultr 1;2 would result in a nonfunctional protein. Sultr 2;2 and Sultr 1;3 do not have the
retained intron form in the organ types and stress conditions
examined here; thus, the divergence in AS patterns between
duplicates has implications for functional divergence in
these gene pairs.
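How retention of a final intron can introduce an earlier in-frame stop codon, as described for Sultr 1;2, can be illustrated with a toy example. The sketch below uses short invented sequences, not the Sultr 1;2 sequence, and assumes that the coding sequence starts at position 0 and that the intron lies between complete codons; all names are hypothetical.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(cds: str) -> Optional[int]:
    """Return the codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def retained_intron_effect(upstream: str, intron: str, last_exon: str) -> dict:
    """Compare the spliced transcript with the intron-retaining transcript.

    Assumes the reading frame starts at position 0 of `upstream` and that
    `upstream` has a length divisible by 3 (an illustrative simplification).
    """
    normal = first_stop(upstream + last_exon)
    retained = first_stop(upstream + intron + last_exon)
    if normal is None or retained is None:
        raise ValueError("no in-frame stop codon found")
    stop_position = retained * 3
    return {
        "normal_protein_length": normal,
        "retained_protein_length": retained,
        "stop_in_intron": len(upstream) <= stop_position < len(upstream) + len(intron),
    }


# Toy sequences (invented for illustration only).
toy = retained_intron_effect(
    upstream="ATGGCTGCTGGA",   # ATG GCT GCT GGA
    intron="GTAAGTTGACAG",     # GTA AGT TGA CAG, in-frame TGA
    last_exon="CTTGAAGGTTAA",  # CTT GAA GGT TAA, normal stop
)
```

In this toy case the stop codon inside the retained intron ends translation before the residues encoded by the last exon, which is the same kind of C-terminal loss inferred for the retained intron form of Sultr 1;2.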
Truncated proteins created by translation of alternatively
spliced transcripts that contain premature stop codons can
have important functions. For example, two transcript
forms are produced from the N gene in tobacco, involved
in conferring resistance to tobacco mosaic virus: a full-length form and a truncated form produced from an alternatively spliced transcript that contains a premature stop
codon. Transgenic experiments showed that the full-length
form by itself does not show complete resistance to the virus, unlike when both forms are present; thus, the truncated
form is playing a role in the resistance (Dinesh-Kumar and
Baker 2000). A second example is the disease resistance gene
RPS4 in Arabidopsis that has two alternatively spliced forms,
each of which results in a premature stop codon and a shorter protein. Zhang and Gassmann (2003) showed that the
alternatively spliced forms are necessary, in addition to
the full-length form, for RPS4 function. Considering that
the alternatively spliced and truncated forms of some gene
products are important for function, it is likely that some of
the truncated forms created by AS in the genes in this study
have as-yet-unknown functions. Thus, some duplicate gene
pairs may have experienced functional divergence by gaining, or losing, an alternatively spliced form in one copy that
creates a truncated protein.
FIG. 4. AS in a sulfate transporter family. (A) Phylogeny of five sulfate transporter genes in two subfamilies and AS data from RT-PCR. Presence
(+) or absence (−) of the last intron in the mature mRNA is indicated. Numbers indicate bootstrap values from 100 replicates. The sequence
alignment used to generate the phylogenetic tree is shown in supplemental figure S2, Supplementary Material online. Genes from the third
subfamily are shown as an outgroup. (B) Sequence alignment of the 3′ end and C-terminus of five sulfate transporter genes showing the
retained intron in Sultr 1;2 and the conceptual translation of the C-terminus of the retained intron form of Sultr 1;2 compared with the other
sequences. Bullet indicates a stop codon in the retained intron form of Sultr 1;2 (abbreviated "IR"). Retention of the intron causes the loss of
39 amino acids at the C-terminus of Sultr 1;2. Locations of alpha helices are indicated. (C) Diagram of the amino acid sequence of Sultr
1;2 showing conserved domains and the location of intron 12, from NCBI's Conserved Domains Database. (D) Predicted tertiary structure of the
STAS domain of Sultr 1;2 using SPICE (Prlić et al. 2005). The structure is similar to that reported in Rouached et al. (2005). Locations of alpha
helices are shown and correspond with locations in panel (B).
Another consequence of a premature stop codon
created by AS is degradation of transcripts by nonsense-mediated RNA decay (NMD). Wang and Brendel (2006)
found that about 43% of AS events in A. thaliana produce
candidates for NMD, as defined by a premature stop codon
>50 bp upstream of the 3′-most exon–exon junction.
Thus, many AS forms probably function to downregulate
gene expression by transcript degradation instead of being
translated into proteins. A recent study of serine/arginine-rich (SR) splicing factors in A. thaliana showed that some
AS forms were present at higher levels in a mutant for one
of the genes involved in NMD, indicating that some of the
alternatively spliced transcripts are degraded (Palusa and
Reddy 2009). Downregulation of gene expression by AS
can have important functional consequences. For example,
the RNA-binding proteins AtGRP7 and AtGRP8 in A. thaliana autoregulate and cross-regulate their own expression
by AS and NMD (Staiger et al. 2003; Schöning et al. 2008). A
second example is the flowering time gene FCA in A. thaliana that controls the transition from the vegetative to the
reproductive phase. There are three AS forms of FCA
mRNAs produced, none of which encode a full-length protein, and AS of FCA limits the amount of FCA protein, both
spatially and temporally, to prevent precocious flowering
(Macknight et al. 2002). Some of the AS forms of the genes
in this study may result in downregulation of gene expression that has important functional consequences that have
not yet been studied; the functions of only a small number
of AS forms have been characterized in plants. Thus, some
duplicate gene pairs in this study may have experienced
functional divergence by differential regulation of transcript abundance by AS.
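The NMD candidate criterion attributed above to Wang and Brendel (2006), a premature stop codon more than 50 bp upstream of the 3′-most exon–exon junction, can be expressed as a short check. This is a sketch under stated assumptions: transcript coordinates are 0-based, the distance is measured from the end of the stop codon to the last junction, and the function and parameter names are hypothetical rather than taken from that study.

```python
from typing import Sequence


def is_nmd_candidate(exon_lengths: Sequence[int],
                     stop_codon_end: int,
                     min_distance: int = 50) -> bool:
    """Apply the '>50 bp upstream of the last exon-exon junction' rule.

    exon_lengths   -- lengths of the exons in the mature transcript, 5' to 3'
    stop_codon_end -- transcript coordinate just past the stop codon
                      (assumed convention; other conventions shift by 3 bp)
    """
    if len(exon_lengths) < 2:
        # A single-exon transcript has no exon-exon junction.
        return False
    last_junction = sum(exon_lengths[:-1])
    return last_junction - stop_codon_end > min_distance
```

With exons of 200, 150, and 100 bp, the last junction lies at position 350, so a stop codon ending at position 250 satisfies the rule, whereas one ending at position 320 or anywhere in the final exon does not.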
Evolution of AS after Gene Duplication
This is the first study of AS events in a large number of
duplicated gene pairs in plants. Previous studies have examined AS only in one or two pairs of duplicates or duplicates within a small gene family (Palusa et al. 2007). The
relation between gene family size and AS was explored
in a bioinformatics study that compared the AS frequency
in single-copy genes versus gene families in A. thaliana and
rice (Lin et al. 2008). A higher percentage of AS was found
in gene families than in single-copy genes, contrary to
computational analyses of gene families in mammals and
Caenorhabditis elegans, which showed an inverse relationship between AS and gene family size (Kopelman et al.
2005; Su et al. 2006; Hughes and Friedman 2007; Irimia
et al. 2008). Thus, the evolution of AS in gene families
may differ between plants and animals or the differing
number of cDNA sequences available from each organism
and the number of AS events detected might affect the
trends. Future studies using RNA-seq data from A. thaliana
and other plants may help to resolve the issue.
After a gene with two splice forms is duplicated, each
copy could retain one of the splice forms, which is
a type of subfunctionalization involving splice forms. In
maize, two genes coding for the small subunit of
ADP-glucose pyrophosphorylase without AS correspond
to a single gene with AS in other grass species (Rosti
and Denyer 2007), indicating that duplication in the maize
lineage was followed by loss of one AS form in each duplicate. In a second case, the ribosomal protein L32 and
superoxide dismutase genes are present as a fusion gene
and co-expressed by AS in Burma mangrove (Bruguiera
gymnorrhiza), whereas in Populus, the two genes were
separated after gene duplication (Cusack and Wolfe
2007). A putative case of partitioning of AS forms was revealed in this study: a pair of ADP-ribosyltransferase genes
where a retained intron form was found only in At1g32230
and a skipped exon form was found only in At2g35510. In
addition to partitioning of ancestral splice forms, a duplicated gene could gain a new splice form that could result in
a new function (neofunctionalization) or changes in gene
regulation. New splice forms that arise soon after
duplication could allow AS forms to be tested in
one of the duplicates without affecting regulation or function of the other.
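The distinction drawn above between partitioning of ancestral splice forms and gain of a new form can be sketched with set operations on the AS forms detected in each copy. This is an illustrative classification only: the labels are hypothetical, the ancestral set of forms is usually unknown and would have to be inferred (for example, from an outgroup), and without it the function reports only the observed pattern.

```python
from typing import Optional, Set


def classify_divergence(copy1: Set[str],
                        copy2: Set[str],
                        ancestral: Optional[Set[str]] = None) -> str:
    """Classify how the AS forms of two duplicates differ.

    copy1, copy2 -- AS forms detected in each duplicate
    ancestral    -- AS forms inferred for the pre-duplication gene, if known
    """
    only1, only2 = copy1 - copy2, copy2 - copy1
    if not only1 and not only2:
        return "conserved"
    if ancestral is None:
        # Without the ancestral state, gain and loss cannot be told apart.
        return "reciprocal difference" if only1 and only2 else "one-sided difference"
    if (copy1 | copy2) - ancestral:
        return "gain of a new form"
    if only1 and only2:
        return "partitioning of ancestral forms"
    return "loss in one copy"


# The ADP-ribosyltransferase pair described above, ancestral state unknown.
pattern = classify_divergence({"retained intron"}, {"skipped exon"})
```

For that pair the function returns a reciprocal difference, which is consistent with partitioning but, as noted in the text, remains a putative case until the ancestral forms are known.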
What causes AS divergence between duplicated genes?
One likely factor is mutations in the gene sequence at bases
important for splicing, both within the intron and the
surrounding exons. Those regions include exonic splicing
enhancers, exonic splicing silencers, intronic splicing enhancers, and intronic splicing suppressors (Reddy 2007).
Such mutations in each gene could lead to differential
binding of AS factors, including SR proteins, and differential
inclusion or exclusion of certain introns and exons, along
with alternative donor or acceptor sites.
Conclusions
In this study, we have assayed AS patterns in a set of WG
duplicates and tandem duplicates in several organ types
and under multiple abiotic stress conditions. Our results
illustrate the patterns by which AS events can diverge
between duplicates, including organ- and stress-specific
differences. In some cases, the AS divergence affects inclusion of functional domains that may result in functional
divergence between the duplicates. Divergence in AS patterns between duplicates is another way in which the
expression of duplicated genes can change in plants, in
addition to the previously reported changes in transcript
levels in different organs, developmental stages, and in
response to stress conditions. AS divergence between
duplicated genes may contribute to gene regulatory
and functional evolution and potentially lead to preservation of some duplicated genes. Future studies of AS in
duplicated genes at different evolutionary time scales,
including evolutionarily recent polyploids, as well as
whole-transcriptome studies of AS in duplicate genes will
reveal further insights into how duplicate genes diverge in
AS patterns.
Supplementary Material
Supplementary tables S1–S9 and Supplementary figures
S1–S2 are available at Molecular Biology and Evolution
online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Jonathan Wendel and the Adams laboratory for
comments on the manuscript. This research was supported
by a grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada and by infrastructure
funds from the Canada Foundation for Innovation.
S.Z.H. was supported in part by an Undergraduate Student
Research Award from NSERC.
References
Barbazuk WB, Fu Y, McGinnis KM. 2008. Genome-wide analyses of
alternative splicing in plants: opportunities and challenges.
Genome Res. 18:1381–1392.
Bininda-Emonds ORP. 2005. TransAlign: using amino acids to
facilitate the multiple alignment of protein-coding DNA
sequences. BMC Bioinformatics. 6:156.
Blanc G, Hokamp K, Wolfe KH. 2003. A recent polyploidy
superimposed on older large-scale duplications in the Arabidopsis genome. Genome Res. 13:137–144.
Blanc G, Wolfe KH. 2004a. Functional divergence of duplicated
genes formed by polyploidy during Arabidopsis evolution. Plant
Cell. 16:1679–1691.
Blanc G, Wolfe KH. 2004b. Widespread paleopolyploidy in model
plant species inferred from age distributions of duplicate genes.
Plant Cell. 16:1667–1678.
Casneuf T, De Bodt S, Raes J, Maere S, Van de Peer Y. 2006.
Nonrandom divergence of gene expression following gene and
genome duplications in the flowering plant Arabidopsis
thaliana. Genome Biol. 7:R13.
Cui L, Wall PK, Leebens-Mack JH, et al. (13 co-authors). 2006.
Widespread genome duplications throughout the history of
flowering plants. Genome Res. 16:738–749.
Cusack BP, Wolfe KH. 2007. Not born equal: increased rate
asymmetry in relocated and retrotransposed rodent gene
duplicates. Mol Biol Evol. 24:679–686.
Dinesh-Kumar SP, Baker BJ. 2000. Alternatively spliced N resistance
gene transcripts: their possible role in tobacco mosaic virus
resistance. Proc Natl Acad Sci U S A. 97:1908–1913.
Duarte JM, Cui L, Wall PK, Zhang Q, Zhang X, Leebens-Mack J,
Ma H, Altman N, dePamphilis CW. 2006. Expression pattern
shifts following duplication indicative of subfunctionalization
and neofunctionalization in regulatory genes of Arabidopsis. Mol
Biol Evol. 23:469–478.
Finn RD, Tate J, Mistry J, et al. (11 co-authors). 2008. The Pfam
protein families database. Nucleic Acids Res. 36:D281–D288.
Ganko EW, Meyers BC, Vision TJ. 2007. Divergence in expression
between duplicated genes in Arabidopsis. Mol Biol Evol.
24:2298–2309.
Ha M, Li W-H, Chen Z. 2007. External factors accelerate expression
divergence between duplicate genes. Trends Genet. 23:162–166.
Haberer G, Hindemitt T, Meyers B, Mayer KFX. 2004. Transcriptional
similarities, dissimilarities, and conservation of cis-elements in
duplicated genes of Arabidopsis. Plant Physiol. 136:3009–3022.
Hughes AL, Friedman R. 2007. Likelihood-ratio tests for positive
selection of human and mouse duplicate genes reveal nonconservative and anomalous properties of widely used methods.
Mol Phylogenet Evol. 42:388–393.
Hulo N, Bairoch A, Bulliard V, Cerutti L, De Castro E, Langendijk-Genevaux PS, Pagni M, Sigrist CJ. 2006. The PROSITE database.
Nucleic Acids Res. 34:D227–D230.
Iida K, Seki M, Sakurai T, Satou M, Akiyama K, Toyoda T,
Konagaya A, Shinozaki K. 2004. Genome-wide analysis of
alternative pre-mRNA splicing in Arabidopsis thaliana based
on full-length cDNA sequences. Nucleic Acids Res. 32:5096–5103.
Irimia M, Rukov JL, Roy SW, Vinther J, Garcia-Fernandez J. 2008.
Widespread evolutionary conservation of alternatively spliced
exons in Caenorhabditis. Mol Biol Evol. 25:375–382.
Kopelman NM, Lancet D, Yanai I. 2005. Alternative splicing and
gene duplication are inversely correlated evolutionary mechanisms. Nat Genet. 37:588–589.
Letunic I, Doerks T, Bork P. 2009. SMART 6: recent updates and new
developments. Nucleic Acids Res. 37:D229–D232.
Lin H, Ouyang S, Egan A, Nobuta K, Haas BJ, Zhu W, Gu X, Silva JC,
Meyers BC, Buell CR. 2008. Characterization of paralogous
protein families in rice. BMC Plant Biol. 8:18.
Liu SL, Adams K. 2008. Molecular adaptation and expression
evolution following duplication of genes for organellar ribosomal
protein S13 in rosids. BMC Evol Biol. 8:25.
Macknight R, Duroux M, Laurie R, Dijkwel P, Simpson G, Dean C. 2002.
Functional significance of the alternative transcript processing of
the Arabidopsis floral promoter FCA. Plant Cell. 14:877–888.
Marchler-Bauer A, Bryant SH. 2004. CD-Search: protein domain
annotations on the fly. Nucleic Acids Res. 32:W327–W331.
Meyer A, Van de Peer Y. 2009. From 2R to 3R: evidence for a fish-specific genome duplication (FSGD). Bioessays 27:937–945.
Ner-Gaon H, Halachmi R, Savaldi-Goldstein S, Rubin E, Ophir R,
Fluhr R. 2004. Intron retention is a major phenomenon in
alternative splicing in Arabidopsis. Plant J. 39:877–885.
Palusa SG, Ali GS, Reddy ASN. 2007. Alternative splicing of pre-mRNAs of Arabidopsis serine/arginine-rich proteins: regulation
by hormones and stresses. Plant J. 49:1091–1107.
Palusa SG, Reddy ASN. 2009. Extensive coupling of alternative
splicing of pre-mRNAs of serine/arginine (SR) genes with
nonsense-mediated decay. New Phytol. 185:83–89.
Prlić A, Down TA, Hubbard TJ. 2005. Adding some SPICE to DAS.
Bioinformatics 21(2 Suppl):ii40–ii41.
Reddy ASN. 2007. Alternative splicing of pre-messenger RNAs in
plants in the genomic era. Annu Rev Plant Biol. 58:267–294.
Rosti S, Denyer K. 2007. Two paralogous genes encoding small
subunits of ADP-glucose pyrophosphorylase in maize, bt2 and
l2, replace the single alternatively spliced gene found in other
cereal species. J Mol Evol. 65:316–327.
Rouached H, Berthomieu P, Kassis EE, Cathala N, Catherinot V,
Labesse G, Davidian J-C, Fourcroy P. 2005. Structural and
functional analysis of the C-terminal STAS (sulfate transporter
and anti-sigma antagonist) domain of the Arabidopsis thaliana
sulfate transporter Sultr1.2. J Biol Chem. 280:15976–15983.
Schöning JC, Streitner C, Meyer IM, Gao Y, Staiger D. 2008.
Reciprocal regulation of glycine-rich RNA-binding proteins via
an interlocked feedback loop coupling alternative splicing to
nonsense-mediated decay in Arabidopsis. Nucleic Acids Res.
36:6977–6987.
Schranz ME, Mitchell-Olds T. 2006. Independent ancient polyploidy
events in the sister families Brassicaceae and Cleomaceae. Plant
Cell. 18:1152–1165.
Simillion C, Vandepoele K, Van Montagu MCE, Zabeau M, Van de
Peer Y. 2002. The hidden duplication past of Arabidopsis
thaliana. Proc Natl Acad Sci USA. 99:13627–13632.
Simpson CG, Fuller J, Maronova M, Kalyna M, Davidson D,
McNicol J, Barta A, Brown JWS. 2008. Monitoring changes in
alternative precursor messenger RNA splicing in multiple gene
transcripts. Plant J. 53:1035–1048.
Staiger D, Zecca L, Kirk DAW, Apel K, Eckstein L. 2003. The circadian
clock regulated RNA-binding protein atgrp7 autoregulates
its expression by influencing alternative splicing of its own
pre-mRNA. Plant J. 33:361–371.
Su Z, Wang J, Yu J, Huang X, Gu X. 2006. Evolution of alternative
splicing after gene duplication. Genome Res. 16:182–189.
Thierry-Mieg D, Thierry-Mieg J. 2006. AceView: a comprehensive
cDNA-supported gene and transcript annotation. Genome Biol.
7:S12.
Van de Peer Y, Maere S, Meyer A. 2009. The evolutionary
significance of ancient genome duplications. Nat Rev Genet.
10:725–732.
Wan CY, Wilkins TA. 1994. A modified hot borate method
significantly enhances the yield of high-quality RNA from cotton
(Gossypium hirsutum L.). Anal Biochem. 223:7–12.
Wang BB, Brendel V. 2006. Genomewide comparative analysis of
alternative splicing in plants. Proc Natl Acad Sci USA.
103:7175–7180.
Yim WC, Lee B-M, Jang CS. 2009. Expression diversity and
evolutionary dynamics of rice duplicate genes. Mol Genet
Genomics. 281:483–493.
Zhang XC, Gassmann W. 2003. RPS4-mediated disease resistance
requires the combined presence of RPS4 transcripts with full-length and truncated open reading frames. Plant Cell.
15:2333–2342.
Zheng Q, Wang XJ. 2008. GOEAST: a web-based software toolkit for
Gene Ontology enrichment analysis. Nucleic Acids Res.
36:W358–W363.
Zou C, Lehti-Shiu MD, Thomashow M, Shiu SH. 2009. Evolution of
stress-regulated gene expression in duplicate genes of Arabidopsis thaliana. PLoS Genet. 5:e1000581.