Genome-Wide Patterns of Expression in Drosophila Pure Species and
Hybrid Males. II. Examination of Multiple-Species Hybridizations,
Platforms, and Life Cycle Stages
Amanda J. Moehring, Katherine C. Teeter, and Mohamed A. F. Noor
DCMB Group/Biology, Duke University
Species often produce sterile hybrids early in their evolutionary divergence, and some evidence suggests that hybrid sterility may be associated with deviations or disruptions in gene expression. In support of this idea, many studies have shown
that a high proportion of male-biased genes are underexpressed, compared with non–sex-biased genes, in sterile F1 male
hybrids of Drosophila species. In this study, we examined and compared patterns of misexpression in sterile F1 male
hybrids of Drosophila simulans and 2 of its sibling species, Drosophila mauritiana and Drosophila sechellia, at both
the larval and adult life stages. We analyzed hybrids using both commercial Drosophila melanogaster microarrays
and arrays we developed from reverse transcriptase–polymerase chain reactions of spermatogenesis and reproduction-related transcripts from these species (sperm array). Although the majority of misexpressed transcripts were underexpressed, a disproportionate number of the overexpressed transcripts were located on the X chromosome. We detected
a high overlap in the genes misexpressed between the 2 species pairs, and our sperm array was better at detecting such
misexpression than the D. melanogaster array, suggesting possible weaknesses in the use of an array designed from another
species. We found only minimal misexpression in the larval samples with the sperm array, suggesting that disruptions in
spermatogenesis occur after this life stage. Further study of these misexpressed loci may allow us to identify precisely
where disruptions in the spermatogenesis pathway occur.
Introduction
dysregulation of gene networks in hybrids provides
a unique view of how those networks evolve in the first
place. (Clark 2006)
Many empirical studies have established that natural
populations harbor considerable variation in gene expression (see reviews in White 2001; Wray et al. 2003; Ranz
and Machado 2006; Whitehead and Crawford 2006). Because of this high intraspecific variation, differences in gene
expression are also often observed between closely related
species, resulting from sequence divergence in pathways at
trans- and/or cis-acting sites. One unexpected finding from
the last decade is that component parts of regulatory pathways sometimes diverge through time between species or
isolated populations with no accompanying change in the
final phenotype, a process termed ‘‘developmental systems
drift’’ (True and Haag 2001). Hence, although 2 species
may bear a consistent phenotype, the underlying regulatory
architecture may be distinct.
One consequence of developmental systems drift
would be the disruption of pathways in species hybrids
(see review in Ortiz-Barrientos et al. 2006). If 2 divergent
regulatory architectures are brought together in a single individual, the resultant phenotype could be anomalous. Species often produce sterile hybrids early in their evolutionary
divergence (see review in Coyne and Orr 2004), and there is
some evidence that hybrid sterility may be associated with
divergence or disruptions in ‘‘normal’’ gene expression. For
example, the only cloned hybrid male sterility gene, Odysseus, bears a homeodomain region (Ting et al. 1998), and its
transcript is differentially localized between sterile and fertile individuals (Sun et al. 2004). Theoretical models have
also shown how the evolution of binding strength of regulatory proteins with promoter regions can provide biologically plausible and powerful postmating reproductive
isolation, such as hybrid sterility (Johnson and Porter
2000; Porter and Johnson 2002).
Key words: gene regulation, Drosophila simulans, Drosophila
mauritiana, Drosophila sechellia, hybrid male sterility, speciation.
E-mail: [email protected].
Mol. Biol. Evol. 24(1):137–145. 2007
doi:10.1093/molbev/msl142
Advance Access publication October 10, 2006
© The Author 2006. Published by Oxford University Press on behalf of
the Society for Molecular Biology and Evolution. All rights reserved.
For permissions, please e-mail: [email protected]
Disruptions in gene expression in species hybrids have
been studied extensively in the Drosophila melanogaster
group (Ranz et al. 2004) and particularly between Drosophila simulans and Drosophila mauritiana (Michalak and
Noor 2003, 2004; Haerty and Singh 2006). Using microarrays derived from D. melanogaster, Michalak and Noor
(2003) examined patterns of expression in D. simulans,
D. mauritiana, and their sterile male F1 hybrids. They found
many genes severely underexpressed in the hybrids relative
to the pure species, and these were disproportionately associated with spermatogenesis or other male-specific phenotypes. This finding was confirmed using testes of male
hybrids of D. simulans and both D. melanogaster and Drosophila sechellia (Haerty and Singh 2006). In a subsequent
study, Michalak and Noor (2004) found that sterility and
misexpression of 5 transcripts were strongly correlated even
in fifth generation backcross hybrids. Thus, it is possible
(though far from certain) that misexpression of essential
genes involved in spermatogenesis caused sterility in these
hybrids.
Phylogenetic studies suggest that D. simulans may be
slightly more closely related to D. mauritiana than either is
to D. sechellia (Kliman et al. 2000; Ting et al. 2000), but
more loci appear to contribute to hybrid sterility between
the more closely related pair (Palopoli et al. 1996). Male
F1 hybrids of D. simulans to its sister species do not produce
motile sperm, yet their testes appear normal, spermatogenesis is initiated, and meiosis is completed (Lachaise et al.
1986; Wu et al. 1992; Maside et al. 1998). Additionally,
2D gel electrophoresis performed on testes in one study
found that sterile males had remarkably similar protein profiles to fertile males, with only a few specific changes,
showing that there is no general breakdown in protein synthesis (Zeng and Singh 1993). Sterility in these hybrids is
therefore believed to be associated with spermiogenic (postmeiotic) failures rather than large-scale disruptions in reproductive tissue or organ development (Wu et al. 1992).
Our study builds on earlier work in 2 significant directions. First, we use a novel microarray platform wherein we
printed reverse transcriptase–polymerase chain reactions
(RT–PCRs) of 48 genes known or thought to be involved
in spermatogenesis or reproduction (hereafter, ‘‘sperm array’’). These RT–PCRs were performed on the strains of
D. simulans, D. mauritiana, and D. sechellia studied here,
so sequence divergence between the array and the RNA
samples does not confound the analysis. We also use the
D. melanogaster Drosophila Genomics Resource Center
(DGRC) array-1 (hereafter ‘‘genome array’’) for comparison. Second, we examine expression both at the adult stage
and at the third instar larval stage, potentially localizing when during development the disruptions occur.
Materials and Methods
Drosophila Strains
Drosophila simulans Florida City, D. mauritiana Synthetic, and D. sechellia Robertson stocks were maintained
on standard sugar medium at 20 °C on a 12-h light–dark cycle.
Virgin D. simulans females were crossed to either D. mauritiana or D. sechellia males. Virgin pure species and F1
males were collected, aged 4 days, then immediately frozen
at −80 °C between 1 and 2 hours after ‘‘lights on’’ on the
fourth day. Third instar larvae were sexed based on the presence of visible testes in males. Three trial collections were
made of 4–16 male larvae per vial and were allowed to
eclose to check the accuracy of the larvae-sexing procedure.
After we determined that the sexing was accurate, larvae
were collected at the third instar stage and frozen immediately at −80 °C.
Creation of Custom Sperm Array
All 3 alleles (D. simulans, D. mauritiana, and D. sechellia) of 48 genes known or suspected to be involved in
spermatogenesis or reproduction (White-Cooper et al.
2000; Michalak and Noor 2003; Perezgasga et al. 2004;
White-Cooper 2004; Grumbling et al. 2006), 2 ‘‘housekeeping’’ genes (18SrRNA and Actin5C), and 2 genes of interest
for another study (CG9775 and 5-HT2) were included on
the custom sperm array. Note that, although our goal was
to have every gene known or suspected to be involved
in spermatogenesis on the array, several genes could not
be included because of either a lack of available sequence or
difficulty in amplification in our samples. The procedure for
the creation of the array is described in the Supplementary
Material online. The primers and amplified sequence for the
array are listed in supplementary table 1, Supplementary
Material online.
RNA Extraction and Hybridization to Custom and
Genome-Wide Arrays
RNA extractions were performed for D. simulans, D.
mauritiana, D. sechellia, and the 2 hybrids, D. simulans–D.
mauritiana (sim–mau) and D. simulans–D. sechellia (sim–
sec). RNA was extracted from 35 to 50 flies or 4 to 30 larvae
using the Qiagen (Valencia, CA) RNeasy Mini Kit, yielding
a total of approximately 10 µg of RNA for hybridization to
the arrays.
FIG. 1.—Microarray hybridization schematic (samples: D. simulans, D. sechellia, D. mauritiana, F1 sim–mau, F1 sim–sec). Each solid arrow
represents 2 replicates (4 comparisons for each pair) and dashed circles
represent 1 replicate of a self–self hybridization. Arrowheads indicate
samples labeled with Cy3; arrow bases Cy5.
Two array platforms were used: the custom sperm array described above and a genome-wide D. melanogaster
chip (DGRC-1) purchased from the DGRC (Bloomington,
Indiana), hereafter called ‘‘genome array.’’ The genome array contains long amplicons (100–600 bp) of 88% of the
genome (12,012 genes) on Corning Gaps II printed slides.
The long amplicons increased the likelihood of the slightly
divergent sequence of D. melanogaster’s sibling species
hybridizing. Hybridizations were performed at the Duke
University Microarray Core Facility (Durham, NC) using
Cy5 and Cy3 fluorescent labels. Six to 10 samples of each
type (parental or F1) were hybridized in the scheme shown
in figure 1. Larval male RNA was hybridized to the sperm
array; adult male RNA was hybridized to either the sperm
array or the genome array.
Data Analysis
Background-subtracted expression values were log2
transformed and normalized on the median for the genome
array. Expression values were normalized based on Actin5C expression on the custom sperm array because we
expect a shift in the median expression value for hybrids
in spermatogenesis genes, which are the primary constituent of this array. For comparison, the values from the genome array were analyzed for a second time using only the
genes also present on the sperm array and normalizing by
Actin5C. The data were analyzed with a mixed-model analysis of variance (ANOVA) (Wolfinger et al. 2001; Cui and
Churchill 2003) using PROC MIXED with SAS 9.1 software (SAS Institute, Cary, NC). The mixed-model ANOVA
takes into account both fixed and random effects of variance; in this experimental model, dye and species are fixed
effects, whereas array and spot are random effects. The
mixed model provides a single statistical framework that
includes the expression levels of a set of clones over all arrays and an analysis of whether these clones differ significantly between treatments.
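As a rough illustration of the transformation and normalization steps described above (not the authors' SAS code), the following Python sketch log2-transforms background-subtracted intensities and then centers them either on the array median or on a single reference gene such as Actin5C. It assumes that "normalizing" means subtracting the chosen center on the log scale; the function names, the dictionary layout, and the intensity values are hypothetical.

```python
import math
from statistics import median

def log2_transform(intensities):
    """Log2-transform background-subtracted intensities (all must be > 0)."""
    return {gene: math.log2(value) for gene, value in intensities.items()}

def normalize_to_median(log_values):
    """Center log2 values on the array median (as for the genome array)."""
    center = median(log_values.values())
    return {gene: v - center for gene, v in log_values.items()}

def normalize_to_reference(log_values, reference="Actin5C"):
    """Center log2 values on one reference gene (as for the sperm array)."""
    center = log_values[reference]
    return {gene: v - center for gene, v in log_values.items()}

# Hypothetical intensities for one array; these are not data from the study.
spots = {"Actin5C": 1024.0, "dj": 256.0, "Mst84Dc": 4096.0}
logged = log2_transform(spots)
by_median = normalize_to_median(logged)
by_actin = normalize_to_reference(logged)
```

Centering on a reference gene rather than the median avoids absorbing a genuine shift in the bulk of the array, which is the concern the text raises for an array composed mostly of spermatogenesis genes.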
Table 1
Number of Misexpressed Genes in Hybrids (based on a mixed-model ANOVA), Separated by Chromosome and Whether They Were Underexpressed (under) or Overexpressed (over) Relative to the Parental Species. Only Those for Which the Chromosomal Location Is Known Are Included in This Table
                Total Array   sim–mau Under   sim–mau Over   sim–sec Under   sim–sec Over
Total           11,896        502             61             190             30
X chromosome    1,992         23              12             9               11
Autosomes       9,904         479             49             181             19
Chromosome 2    4,510         209             17             87              8
Chromosome 3    5,324         270             32             93              11
Chromosome 4    70            0               0              1               0
A locus was considered significantly misexpressed in the hybrids if its P value was below 0.05 except where specified otherwise. Although this significance threshold may allow for some false positives, we nonetheless wanted to fully explore the range of genes highlighted, particularly in comparisons between the 2 hybrid classes. Misexpression was considered significant only when the hybrid expression differed from both parents; a hybrid expression value that differs significantly from only one parent would simply demonstrate dominance, not misexpression. Additionally, the locus had to be significantly over- or underexpressed relative to both parents, because a hybrid value between the 2 parental values demonstrates additivity, not misexpression. PANTHER software (Mi et al. 2005) was used to evaluate the number of loci in each functional category for both the entire genome and the subset of misexpressed genes. Significant loci for both array platforms (genome and sperm) and both life stages (adult and larvae), as well as the representative functional classes, are presented in supplementary table 2A–C, Supplementary Material online.
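The decision rule for calling a locus misexpressed can be sketched as a small Python function. This is an illustrative restatement of the criteria in the text, not the authors' implementation: the function name and arguments are hypothetical, the two boolean flags stand in for the significance of the hybrid-versus-parent contrasts from the mixed model, and the "no significant difference" label for a hybrid that differs from neither parent is an assumption added for completeness.

```python
def classify_hybrid(hybrid, parent_a, parent_b, differs_from_a, differs_from_b):
    """Classify hybrid expression relative to the two parental species.

    hybrid, parent_a, parent_b: normalized expression values.
    differs_from_a, differs_from_b: True if the hybrid differs significantly
    from that parent (for example, P < 0.05 in the corresponding contrast).
    """
    if not (differs_from_a or differs_from_b):
        return "no significant difference"  # assumed label, not in the source
    if differs_from_a != differs_from_b:
        return "dominance"  # significantly different from only one parent
    low, high = min(parent_a, parent_b), max(parent_a, parent_b)
    if hybrid > high:
        return "overexpressed"
    if hybrid < low:
        return "underexpressed"
    return "additive"  # differs from both parents but lies between them
```

For example, a hybrid value of 2.0 against parental values of 5.0 and 6.0, significantly different from both, would be classified as underexpressed.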
To examine general patterns in the microarray data and
to compare different hybridizations, the expression levels of
the hybrids were compared with the mean expression levels
of the parental species. The log10 of the ratio of the normalized gene expression level of the hybrids to the mean expression level of the 2 parental species was calculated. This
value provided a measure of the deviation in expression
level of the hybrid from both parents. Regressions (α =
0.05) of this hybrid–midparent value were performed in Microsoft Excel to compare different hybridizations. Comparisons of 1) hybridizations to the D. melanogaster probes on
the genome array versus D. simulans probes on the sperm
array and 2) hybridizations to heterospecific (D. mauritiana
or D. sechellia) probes versus conspecific (D. simulans)
probes on the sperm array were done to identify effects
of sequence divergence on array hybridization between
these closely related species. Comparisons were also done
for hybridizations of larval RNA versus hybridizations of
adult RNA. We excluded both the control genes (Actin5C
and 18S rRNA), which were used for normalization, and
gonadal, which was the only gene to give contradictory results between the platforms, and therefore sampled the 47
remaining genes with both platforms.
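The hybrid–midparent value defined above is a simple log ratio. A minimal Python sketch follows, assuming that the three inputs are positive expression levels on a linear scale; the function name is hypothetical.

```python
import math

def midparent_deviation(hybrid, parent_a, parent_b):
    """log10 of hybrid expression divided by the mean of the two parents.

    Negative values mean the hybrid is below the midparent, positive values
    mean it is above, and 0 means it equals the midparent.
    """
    midparent = (parent_a + parent_b) / 2.0
    return math.log10(hybrid / midparent)
```

Under this definition a hybrid expressed at one tenth of the midparent level scores about -1, and one expressed at ten times the midparent level scores about 1.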
Results
Misexpressed Loci in Hybrids
We identified 30 genes as overexpressed and 190 genes as underexpressed in D. simulans–D. sechellia (sim–sec) hybrids, and 63 genes as overexpressed and 505 genes as underexpressed in D. simulans–D. mauritiana (sim–mau) hybrids, based on hybridizations to the D. melanogaster genome array (supplementary table 2C, Supplementary Material
online). Of these loci, 128 were misexpressed in both sets
of hybrids, and all but 4 of these transcripts were
underexpressed compared with the parental species.
There was an overall trend for misexpressed genes to be expressed at lower levels in hybrids than
in the parental species (see also Michalak and Noor
2003; Haerty and Singh 2006). Additionally, in the sim–sec hybrids, 18.9% of
the genome showed strong dominance in expression,
with nearly equal numbers of loci showing D. simulans–like expression
(1,127) and D. sechellia–like expression (1,140). In the
sim–mau hybrids, 2,852 transcripts appeared to show dominance, with more showing D. simulans–like
expression (1,771) than D. mauritiana–like expression
(1,081).
Although 16.7% of the genes on the genome array are
located on the X chromosome, only 9.1% of the sim–sec and
6.2% of the sim–mau misexpressed genes were on the X
chromosome (table 1). There is a disproportionately small
number of misexpressed loci on the X, even when we make
the conservative comparison to genes that have testis-specific expression, of which 10.6% are located on the X
in D. melanogaster (Parisi et al. 2004). Of greater interest,
however, is that a disproportionate number of genes exhibited misexpression via overexpression on the X. For
those genes located on the X chromosome, 55% of the genes
that were significantly misexpressed in sim–sec and 34.3%
of those in sim–mau were overexpressed, whereas only
9.5% (sim–sec) and 9.3% (sim–mau) of misexpressed genes
on the autosomes were overexpressed.
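The percentages in this paragraph follow directly from the counts in table 1. The short Python check below recomputes them; the dictionary layout and function names are illustrative only, and the counts are those reported in table 1 for genes with known chromosomal location.

```python
# Counts taken from table 1 (X chromosome versus autosomes).
ARRAY = {"X": 1992, "autosomes": 9904}
MISEXPRESSED = {
    "sim-mau": {"under": {"X": 23, "autosomes": 479},
                "over": {"X": 12, "autosomes": 49}},
    "sim-sec": {"under": {"X": 9, "autosomes": 181},
                "over": {"X": 11, "autosomes": 19}},
}

def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

def x_summary(cross):
    """Summarize X-linkage of misexpressed genes for one hybrid cross."""
    under = MISEXPRESSED[cross]["under"]
    over = MISEXPRESSED[cross]["over"]
    x_total = under["X"] + over["X"]
    auto_total = under["autosomes"] + over["autosomes"]
    return {
        "pct_misexpressed_on_X": percent(x_total, x_total + auto_total),
        "pct_X_overexpressed": percent(over["X"], x_total),
        "pct_autosomal_overexpressed": percent(over["autosomes"], auto_total),
    }

array_pct_on_x = percent(ARRAY["X"], ARRAY["X"] + ARRAY["autosomes"])
sim_sec = x_summary("sim-sec")
sim_mau = x_summary("sim-mau")
```

Rounded to one decimal place, these values reproduce the figures quoted in the text: 16.7% of arrayed genes on the X, 9.1% and 6.2% of misexpressed genes on the X, and overexpression among 55% and 34.3% of X-linked misexpressed genes versus 9.5% and 9.3% of autosomal ones.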
Of the genes that were misexpressed in hybrids, genes
involved in transcription were underrepresented (6.9%
genome vs. 2.2% sim–sec, 1.6% sim–mau), as were genes
involved in DNA binding (6.4% vs. 1.4%, 2.4%) (supplementary table 2C, Supplementary Material online). Functional categories that were more prevalent than expected
include chaperones (4.3× more for sim–mau misexpressed
genes) and protein folding (4.5×, sim–mau), mitochondrial
transport (5.2×, sim–sec; 5.1×, sim–mau), calcium binders
(3.1×, sim–sec; 3.1×, sim–mau), microtubule-associated proteins (3.6×, sim–sec; 3.8×, sim–mau), epimerase/racemase
(3.1×, sim–sec), and apoptosis (3×, sim–sec).
When we compare functional categories of misexpressed loci with those prevalent for testis-specific loci
in D. melanogaster (Parisi et al. 2004), the same trends generally hold, with 2 exceptions: there is no longer an underrepresentation of loci involved in transcription, as this
category is less prevalent for testis-specific genes, and loci
for DNA binding are overrepresented as misexpressed in
hybrids. However, this comparison to D. melanogaster is
somewhat contrived, as some misexpression may be unrelated to the testes or male fertility or both.
We also detected misexpression in adult hybrid males using the D. simulans probe of our sperm array (supplementary table 2B, Supplementary Material online). Based on results from these probes, we found 1 gene (tra2) to be significantly overexpressed and 7 genes significantly underexpressed in sim–mau hybrids. In sim–sec hybrids, we found 2 genes to be significantly overexpressed and 11 genes underexpressed. Assignments of misexpression were relatively robust to significance threshold: even at the conservative Bonferroni-adjusted P < 0.000005, 4 genes were still significantly misexpressed in sim–mau hybrids and 4 genes in sim–sec hybrids (dj was still significantly underexpressed in both), as assayed with these probes on the sperm array.
Table 2
Overlap in Adult Hybrid Male Misexpressed Transcripts between Drosophila simulans–Drosophila mauritiana and D. simulans–Drosophila sechellia Hybridizations, after Eliminating Genes Expressed in One of the Pure Species at a Level More than Twofold below the Mean Hybridization Intensity across the Array
Underexpressed transcripts
                            D. simulans–D. mauritiana
D. simulans–D. sechellia    No        Yes
No                          9214      366
Yes                         65        124
P < 0.0001
Overexpressed transcripts
                            D. simulans–D. mauritiana
D. simulans–D. sechellia    No        Yes
No                          9693      53
Yes                         21        2
P = 0.0073
Commonalities in Misexpression between Species
Hybridizations
We sought to determine whether there was disproportionate overlap in the sets of misexpressed genes between
the 2 species hybridizations. Such overlap would indicate
commonalities in the regulatory networks disrupted in species hybrids (but see Discussion). As our sets of misexpressed genes from the sperm array were very small, we
performed this analysis on the results from the genome array. We found a large and statistically significant overlap in
the genes disrupted in hybrids from the 2 crosses. This overlap was still significant when we excluded all genes
expressed in one or more of the pure species at a level more
than twofold below the mean hybridization intensity across
the array (see table 2), hence accounting for possible artifacts resulting from genes not truly expressed in adult
males. The overlap was particularly large for the sets of
underexpressed genes: 25–65% of genes significantly
underexpressed in 1 hybridization were also underexpressed in the other (the percentage depends on which direction the comparison is made). Furthermore, among the
65 genes significantly underexpressed in only the sim–sec
hybridization, 43 were expressed at levels below both parental species in the sim–mau hybridization (but not significantly so). Similarly, among the 366 genes significantly
underexpressed only in the sim–mau hybridization, 282
were expressed at levels below both parental species in
the sim–sec hybridization (but not significantly so). Therefore, our analyses may greatly underestimate the extent of
overlap in misexpression between these 2 hybridizations.
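The source does not name the test behind the P values in table 2. One standard way to ask whether two gene sets overlap more than expected by chance is a one-sided hypergeometric (Fisher-type) test, sketched below in Python with the counts from table 2. The function name and argument order are hypothetical, and the choice of test is an assumption rather than the authors' stated method.

```python
from math import comb

def overlap_p_value(both, only_a, only_b, neither):
    """One-sided hypergeometric P value for an overlap at least this large.

    both: genes misexpressed in both crosses
    only_a, only_b: genes misexpressed in just one cross
    neither: genes misexpressed in neither cross
    """
    n_a = both + only_a
    n_b = both + only_b
    total = both + only_a + only_b + neither
    denominator = comb(total, n_b)
    # Sum the probabilities of every overlap from the observed value upward.
    tail = sum(
        comb(n_a, k) * comb(total - n_a, n_b - k)
        for k in range(both, min(n_a, n_b) + 1)
    )
    return tail / denominator

# Counts from table 2 (A = sim-mau, B = sim-sec).
p_under = overlap_p_value(both=124, only_a=366, only_b=65, neither=9214)
p_over = overlap_p_value(both=2, only_a=53, only_b=21, neither=9693)

# Share of each cross's underexpressed set that is also found in the other.
shared_of_sim_mau = 100.0 * 124 / (124 + 366)
shared_of_sim_sec = 100.0 * 124 / (124 + 65)
```

The last two lines correspond to the 25–65% range quoted in the text, which depends on which cross is used as the denominator.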
Effects of Developmental Stage
Using our sperm array, we detected much less misexpression in hybrid male third instar larvae than in the adult
hybrid males. Based on hybridization to the D. simulans
probe, the only gene significantly misexpressed in hybrids
relative to pure species was Mst84Dc, which was underexpressed in sim–mau hybrids. A few other genes were misexpressed, when all probes on the sperm array were considered:
OdsH in sim–mau, rox1 and Mst84Dc in sim–sec.
Given the small number of significantly misexpressed
genes above, we also examined the overall general trend in
hybrid expression in larvae of transcripts shown to be
underexpressed in adult hybrids. In the D. simulans–D.
mauritiana hybridization, 3 of the 7 transcripts significantly
underexpressed in adult hybrids were underexpressed (nonsignificantly) in larvae (see supplementary table 2A, Supplementary Material online). This representation is greater
than expected by chance (P = 0.01). However, in the D.
simulans–D. sechellia hybridization, only 2 of 11 significantly underexpressed transcripts were nonsignificantly
underexpressed in larvae, which was the proportion
expected by chance.
We also compared in adults and larvae the overall estimated levels of expression in hybrids relative to the midparent of the 2 parental species using the D. simulans probes
for each gene. This was done to assess whether there is a correlation between larval and adult relative expression values
in hybrids. In the D. simulans–D. mauritiana hybridization,
deviation from midparent expression in adult hybrids was
significantly correlated with that in larval hybrids (r =
0.615; P < 0.0001; see supplementary fig. 1, Supplementary Material online). However, in the D. simulans–D. sechellia hybridization, deviation from midparent expression
was not significantly correlated between adult and larval hybrids (r = 0.214; P = 0.128; see supplementary fig. 1, Supplementary Material online).
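The correlation coefficients reported here compare, gene by gene, the deviation from midparent expression in adults with that in larvae. A minimal Python sketch of the Pearson coefficient is given below; the function name is hypothetical, and the two short lists are made-up illustrative values, not data from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical hybrid-midparent deviations (log10 scale) for five genes.
adult = [-0.8, -0.5, -0.1, 0.0, 0.2]
larval = [-0.3, -0.2, 0.0, -0.1, 0.1]
r = pearson_r(adult, larval)
```

A positive r indicates that genes deviating from the midparent in adults tend to deviate in the same direction in larvae, which is the pattern the text reports for the D. simulans–D. mauritiana hybridization.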
Technical Comparison of Microarray Platforms
and Probes
We compared the success of the D. simulans probe in
detecting misexpression with that of the D. melanogaster
probe on the genome array to identify effects of sequence
divergence on array hybridization. We acknowledge that
this comparison is imperfect because of the use of different
platforms. In both hybridizations, more genes were suggested to be misexpressed based on the sperm array than
based on the D. melanogaster genome array (table 3). This
difference was statistically significant in the D. simulans–
D. sechellia hybridization (P = 0.0005) but not in the D.
simulans–D. mauritiana hybridization. Further, in both
cases, no transcript was found to be significantly misexpressed by the D. melanogaster genome array that was
not also found to be misexpressed using the sperm array.
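The source does not state which test was used to compare the two platforms. One simple possibility for paired detection data is an exact one-sided sign test on the discordant genes, that is, genes called misexpressed by one probe only. The Python sketch below assumes that test and uses discordant counts as read from table 3 (sperm array only versus genome array only: 11 versus 0 for sim–sec and 4 versus 0 for sim–mau); both the choice of test and the function name are assumptions.

```python
from math import comb

def sign_test_p(only_first, only_second):
    """One-sided exact sign test on discordant counts.

    Returns the probability of seeing at least `only_first` of the
    discordant genes in the first class if each discordant gene were
    equally likely to fall in either class.
    """
    n = only_first + only_second
    return sum(comb(n, k) for k in range(only_first, n + 1)) / 2 ** n

p_sim_sec = sign_test_p(11, 0)
p_sim_mau = sign_test_p(4, 0)
```

With these counts the sketch gives a value below 0.05 for sim–sec and above 0.05 for sim–mau, in line with the significance pattern described in the text.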
Overall expression levels in hybrids relative to the midparent (mean expression of the 2 parent species) were strongly correlated between the 2 platforms in the D. simulans–D. mauritiana hybridization (r = 0.595, P < 0.0001) but not in the D. simulans–D. sechellia hybridization (r = 0.181, P = 0.23), with the genome array detecting much less underexpression than the sperm array (see supplementary fig. 2, Supplementary Material online).
Table 3
Number of Genes Showing Statistically Significant Misexpression via a Mixed-Model ANOVA in F1 Adult Male Hybrids Compared with Parents for the Different Microarray Platforms and Probes
Hybridization                                                        sim–mau   sim–sec
Genome array versus sperm array
  No misexpression detected                                          38        35
  Drosophila melanogaster probe misexpression only                   0         0
  Drosophila simulans probe misexpression only                       4         11
  Both probes misexpression                                          4         0
Probes within sperm array
  No misexpression detected                                          34        30
  ‘‘Foreign’’ D. simulans group species probe misexpression only     4         5
  D. simulans probe misexpression only                               3         5
  Both probes misexpression                                          5         6
We then compared the relative hybridization intensity on our sperm array for pure species RNA to conspecific versus heterospecific probes on the sperm array (e.g., D. sechellia and D. mauritiana probes were studied for D. simulans). This was done to identify effects of sequence divergence in the parentals on array hybridizations between these closely related species. If there were effects of sequence divergence on hybridization of RNA to the chip, we would expect to generally observe more genes for which the hybridization signals are stronger for the conspecific than the heterospecific spots. We did observe a trend toward this pattern in 5 of 6 comparisons (see table 4), but the overall trend was nonsignificant.
Table 4
Relative Hybridization Intensity in Heterospecific:Conspecific Spots on Sperm Array. The First Numbers Indicate Genes for Which a Higher Hybridization Intensity Was Noted for the Heterospecific Spot, and the Second Numbers Indicate Genes for Which the Higher Intensity Was Noted for the Conspecific Spot
RNA Source
Foreign Probe
D. simulans
D. mauritiana
D. sechellia
Drosophila simulans
Drosophila mauritiana
Drosophila sechellia
—, 16:34, 23:27, 22:28, —, 29:21, 19:31, 21:29, —
We also estimated the overall levels of expression in hybrids relative to the midparent of the 2 parental species using the D. simulans probes and the ‘‘foreign’’ probes in our sperm array. This is the same comparison as above but looking at hybrid expression values. Expression of hybrids relative to the midparent was strongly correlated between the D. simulans and foreign probes in the D. simulans–D. sechellia hybridization (r = 0.717, P < 0.0001) and in the D. simulans–D. mauritiana hybridization (r = 0.745, P < 0.0001) (see supplementary fig. 3, Supplementary Material online).
Finally, we compared detection of misexpression using the probe from the third species on the sperm array (e.g., D. sechellia probes were studied in hybrids of D. simulans and D. mauritiana) to the D. simulans probe. This was done to identify effects of sequence divergence on our ability to detect significant misexpression in the hybrids. Unlike the results with the D. melanogaster probe comparison, we did not observe a difference in sensitivity of the D. simulans probe when compared with probes from its closely related sibling species (table 3). Although transcripts for which the misexpression was not statistically significant using the D. simulans probe (which would always match at least 1 of 2 haploid genomes) were sometimes suggested to be misexpressed by the foreign probe, the hybrid expression values for the D. simulans probe tended to be in the same direction (i.e., over- or underexpressed) in these cases, even though it was not statistically significant.
Discussion
We have approached the genetic basis of hybrid dysfunction by identifying genes and pathways that are misexpressed in hybrids relative to pure species, using the D.
simulans species complex as a model system. Consistent
with the findings of previous studies in this species group
(Michalak and Noor 2003; Haerty and Singh 2006), we observed that genes that were misexpressed in hybrids of these
Drosophila species tended to be underexpressed relative to
the parental species. If F1 hybrids bore shrunken testes relative to parental species, then finding a preponderance of
underexpressed loci could reflect the smaller proportion
of mRNA derived from testes relative to the rest of the
body. However, based on our dissections and multiple published studies (e.g., Coyne 1984; Thomas and Singh 1992),
testes do not differ in size or shape between these pure species and their F1 hybrids. Further, some testes-specific transcripts were expressed in hybrids at levels virtually
identical to one or both pure species in our whole body assays (e.g., Crtp and CG7813; Boutanaev et al. 2002).
We also observed that, although a smaller number of X
chromosomal loci than expected were misexpressed, a disproportionate number of overexpressed loci were located
on this chromosome. Although we have no compelling explanations for the cause of this pattern, we note that the one
characterized hybrid male sterility gene, OdysseusH, is an X
chromosome locus believed to cause disruptions in spermatogenesis via a gain of expression in the testes (Sun et al. 2004).
To elucidate at which precise stage of spermatogenesis
disruptions occur, we have overlaid the misexpression
results onto part of the known spermatogenesis pathway
for D. melanogaster (fig. 2). Although this is a preliminary
attempt at identifying points within the developmental network at which misexpression occurs, we note the large
number of late-stage downstream loci exhibiting misexpression, whereas relatively few early-stage loci are misexpressed. This is an exciting first step toward identifying the
causal stage of disruptions in spermatogenesis and it also
supports our suggestion that the disruptions in spermatogenesis occur postmeiotically and are not due to an overarching failure of spermatogenesis or organ development.
Few loci involved in transcription or DNA binding
were misexpressed in hybrids, suggesting either a constraint
142 Moehring et al.
Table 5
Comparison of F1 Hybrid Expression of CG5762 Relative to
That in the Parental Species Using Real-Time Quantitative
PCR on Testes-Derived Total RNA
Drosophila simulans
Drosophila mauritiana
F1 sim–mau
FIG. 2.—The spermatogenesis pathway (adapted from Fuller 1998;
White-Cooper et al. 1998, 2000). Color-coded arrows represent genes significantly misexpressed in hybrid male third instar larvae using the sperm
array (purple), 4-day-old adults when assayed using the sperm array
(green), or genome-wide array (orange). Lighter shaded arrows on the left
are for sim–sec hybrids; darker shaded arrows on the right are for sim–mau
hybrids. Underexpressed loci are shown as standard arrows, whereas loci
that are overexpressed have an arrow beginning with a circle.
on the evolution of these loci or a robustness of the expression of these loci in spite of perturbations. Conversely, a
larger number of misexpressed loci than expected were involved in protein folding, chaperone function, mitochondrial transport, calcium binding, microtubule function,
epimerase/racemase function, and apoptosis. These overrepresented categories could reflect divergent evolution
of function, which directly or indirectly affects hybrid sterility, or a shift in expression with no bearing on the sterile
hybrid phenotype. They could also be downstream effects of
sterility (e.g., a reduction in sperm production could cause
a reduction in expression of certain types of genes), and it is
quite possible (indeed likely) that the underexpression of
downstream factors in spermatogenesis is due to failures
at earlier steps in the spermatogenesis pathway. Therefore,
at this stage one can only speculate as to the significance of
the presence or absence of specific functional categories.
Comparison between Species Hybridizations
Although differences in gene expression between
species have been studied extensively (see review in OrtizBarrientos et al. 2006), including several studies in Drosophila (e.g., Ranz et al. 2003; Rifkin et al. 2003), only
one other study (Haerty and Singh 2006) has compared misexpression of a broad range of transcripts in hybrids across
multiple-species hybridizations. In this study, we compared
genome-wide patterns of expression in 2 species hybridizations also examined by Haerty and Singh (2006): D.
simulans–D. mauritiana and D. simulans–D. sechellia.
PCR on Testes-Derived Total RNA
                         Mean     Variance    Standard Deviation
Drosophila simulans      1.065    0.064       0.254
Drosophila mauritiana    0.982    0.039       0.197
F1 sim–mau               0.666    2.41E-5     0.005
We detected significantly more misexpressed genes
(both over- and underexpressed) relative to the parental species in hybrid males of the D. simulans–D. mauritiana cross
than in hybrids of the D. simulans–D. sechellia cross. This
result is consistent with genetic studies of hybrid sterility in
these species groups (Palopoli et al. 1996). However, the
tendency for genes significantly underexpressed in hybrids
of D. simulans–D. mauritiana to be at least nonsignificantly
underexpressed in hybrids of D. simulans–D. sechellia suggests that we may be underestimating the number of genes
misexpressed in the latter hybridization.
In contrast to the results of Haerty and Singh (2006), we
found extensive overlap (128 loci) in the sets of genes significantly misexpressed in hybrids of the species pairs. Our
finding suggests some commonalities in the disrupted regulatory pathways between the 2 hybridizations. There are at
least 3 possible explanations for the discrepancy between
our results and the previous study. First, we assayed whole
bodies rather than testes alone, and much of the misexpression we detected may be specific to some tissue besides the
testes. However, a disproportionate number of misexpressed
genes in these hybridizations are male specific, with many
involved in spermatogenesis (Michalak and Noor 2003,
2004; Haerty and Singh 2006), so this explanation seems
unlikely. Second, it is possible that the inconsistency is
due to the different array platforms; however, both platforms
contained long cDNAs, so length of binding substrate
should not be a factor. Lastly, many recent studies have
shown that a minimum of 5 replicates is necessary for statistically analyzing microarray results (e.g., Pavlidis et al.
2003; Allison et al. 2006), and therefore our study with
a larger sample size may have been better able to detect overlapping misexpressed genes. Consistent with this hypothesis, our real-time PCR assays of one gene, CG5762,
surveyed by both studies found it to be underexpressed in
hybrids relative to pure species when surveyed with RNA
derived from both whole bodies and testes alone (whole
bodies: see Michalak and Noor 2003, 2004; testes: sim–
mau hybrids express this gene at about 66% pure species
level; Teeter K, Noor M, unpublished data). Our microarray
assays also found this gene to be underexpressed (see table
5), whereas the analyses of Haerty and Singh (2006) failed to
identify significant underexpression in this gene.
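As a rough illustration of how a relative expression level such as the one quoted above can be derived from normalized real-time PCR values, the following Python sketch uses the summary statistics reported for testes-derived total RNA. The assignment of values to D. simulans, D. mauritiana, and F1 sim–mau is assumed from the row order of the table, the use of the midparent mean as the pure-species reference is likewise an assumption (the text does not state how the percentage was calculated), and the function name relative_expression is hypothetical.

```python
import math

# Summary statistics (mean, variance) from real-time PCR on testes-derived
# total RNA. The mapping of values to genotypes is ASSUMED from row order.
pcr = {
    "D. simulans":   {"mean": 1.065, "var": 0.064},
    "D. mauritiana": {"mean": 0.982, "var": 0.039},
    "F1 sim-mau":    {"mean": 0.666, "var": 2.41e-5},
}

def relative_expression(hybrid_mean, parent_means):
    """Hybrid expression as a fraction of the mean of the parental means.

    Using the midparent value as the reference is an assumption made for
    illustration only.
    """
    midparent = sum(parent_means) / len(parent_means)
    return hybrid_mean / midparent

parents = [pcr["D. simulans"]["mean"], pcr["D. mauritiana"]["mean"]]
ratio = relative_expression(pcr["F1 sim-mau"]["mean"], parents)

# The standard deviation is the square root of the variance.
sds = {name: math.sqrt(stats["var"]) for name, stats in pcr.items()}

print(f"hybrid / midparent = {ratio:.2f}")
for name, sd in sds.items():
    print(f"{name}: SD = {sd:.3f}")
```

Under these assumptions, the hybrid value falls at roughly two-thirds of the pure-species level, in line with the figure quoted in the text.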
Our results suggest that there are indeed significant
commonalities in patterns of expression in hybrids relative
to pure species between these 2 species groups, at least
when whole bodies are examined. That said, commonality
in patterns of misexpression does not suggest that the same
genes were involved in speciation. Expression studies such
as these may be sampling genes near the end of long
regulatory networks that are disrupted at different points in
the 2 hybridizations. There is no reason to assume that the
panel of misexpressed genes reflects the specific failed
interactions causing misexpression and possibly sterility
(Noor and Feder 2006; Ortiz-Barrientos et al. 2006), but the
overlap does suggest that similar pathways may be involved.
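One way to judge whether an overlap such as the 128 shared loci reported above exceeds what chance alone would produce is a hypergeometric tail probability. The Python sketch below is illustrative only: the total number of genes assayed and the sizes of the two misexpressed sets are hypothetical placeholders (only the overlap of 128 comes from the text), the function name overlap_p_value is hypothetical, and this is not the test used in the study.

```python
from math import comb

def overlap_p_value(n_total, n_a, n_b, n_overlap):
    """Probability that two gene sets of sizes n_a and n_b, drawn
    independently from n_total genes, share at least n_overlap genes
    (upper tail of the hypergeometric distribution)."""
    denom = comb(n_total, n_b)
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom

# HYPOTHETICAL set sizes for illustration; only the overlap is from the text.
n_total = 5000   # genes scored on the array (placeholder)
n_sim_mau = 400  # misexpressed in sim-mau hybrids (placeholder)
n_sim_sec = 200  # misexpressed in sim-sec hybrids (placeholder)
n_shared = 128   # overlap reported in the text

expected = n_sim_mau * n_sim_sec / n_total
p = overlap_p_value(n_total, n_sim_mau, n_sim_sec, n_shared)

print(f"expected overlap by chance: {expected:.1f}")
print(f"P(overlap >= {n_shared}) = {p:.3g}")
```

A small tail probability would indicate only that the two lists overlap more than expected by chance; as noted above, it would not show that the same genes underlie sterility in both hybridizations.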
Comparison between Life Cycle Stages
Changes in gene expression occur throughout the life
cycle of all species, and such changes have been studied
extensively in D. melanogaster (e.g., Arbeitman et al.
2002). Differences in expression between species are also
apparent within life cycle stages (e.g., Rifkin et al. 2003;
Goodisman et al. 2005). Given this variation, some hybrid
expression disruptions may only be observed at particular
life cycle stages. Further, examining multiple stages may
suggest when in development particular disruptions first occurred, possibly giving hints as to the underlying causes.
We found very little misexpression in third instar hybrid male larvae from either species cross. Underexpression of Mst84Dc is evident this early in development,
but most other disruptions are not apparent until the adult
stage. We further noted a disconnection between relative
expression levels in adult and larval hybrids in the D.
simulans–D. sechellia hybridization (see supplementary
fig. 1, Supplementary Material online). This disconnection
may suggest that, especially for sim–sec hybrids, most misexpression appears later in development, but other explanations are also possible.
Based on cytological studies, some disruptions should
be apparent in hybrid third instar larvae. Many postmeiotic
gametes are apparent in the testes of D. simulans third instar
larvae (Maside et al. 1998). However, only about half of
third instar larval hybrids between D. simulans and D.
mauritiana have produced sperm passing meiosis (Maside
et al. 1998), suggesting that delays or defects in spermatogenesis resulting in hybrid sterility are already apparent at
this life cycle stage. Transcription of genes scored on the
sperm array has also been detected in pure species at this stage
(Arbeitman et al. 2002), and for a few of the transcripts surveyed, even at the embryonic stage (Mukai et al. 2006), and so
the opportunity for misexpression was certainly present.
Comparison between Microarray Platforms
The vast majority of studies comparing gene expression between species have been performed on single-species arrays, so that it has been impossible to distinguish
differential hybridization due to sequence mismatches from
underlying expression differences. Recently, Gilad et al.
(2005) used a ‘‘multispecies’’ primate cDNA array using
probes from human, chimpanzee, orangutan, and rhesus.
They found a large effect of sequence divergence on hybridization signal, even in the most closely related species
pair, humans and chimpanzees. They concluded that ‘‘naive
use of single-species arrays in direct interspecies comparisons can yield spurious results.’’
To eliminate this concern from our experiments, we
used both the DGRC-1 genome-wide D. melanogaster microarray and a sperm array that we constructed using
RT–PCRs from the same lines of D. simulans, D. mauritiana, and D. sechellia from which RNA was isolated. As
a result, for every transcript on the sperm array, at least 1 of the 3 probe types had
no sequence divergence from the hybridized RNA.
When the same normalization procedures were applied to the data from both arrays, no genes were identified
as misexpressed using the genome array that were not identified as such with the sperm array, whereas the converse
was not true. Several of the transcripts suggested to be misexpressed by the sperm array data but not by the genome
array have been confirmed previously via real-time PCR to
be misexpressed in these hybrids (e.g., Mst84Dc and
Mst98Cb: see Michalak and Noor 2003), lending support
for the underexpression documented by the sperm array.
Taken together, these results suggest that our sperm array was generally more sensitive to expression differences.
It is tempting to speculate that this difference results in part
from effects of sequence divergence from D. melanogaster,
but this conclusion is not altogether fair because the platforms were constructed separately and may bear other subtle differences affecting sensitivity.
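The asymmetry between platforms described above amounts to a nesting of call sets: every transcript called misexpressed on the genome array was also called on the sperm array, but not the reverse. The Python sketch below shows that comparison, assuming it is restricted to transcripts present on both arrays. Mst84Dc and Mst98Cb are named in the text as detected by the sperm array but not the genome array; geneA and geneB are hypothetical placeholders.

```python
# HYPOTHETICAL call sets for transcripts present on both platforms.
# Mst84Dc and Mst98Cb come from the text; geneA and geneB are placeholders.
sperm_hits = {"Mst84Dc", "Mst98Cb", "geneA", "geneB"}
genome_hits = {"geneA", "geneB"}

# Transcripts called on one platform but not the other.
genome_only = genome_hits - sperm_hits
sperm_only = sperm_hits - genome_hits

# A proper subset means the genome-array calls are nested within the
# sperm-array calls, with at least one sperm-array call missing from
# the genome array.
nested = genome_hits < sperm_hits

print("genome-only calls:", sorted(genome_only))
print("sperm-only calls:", sorted(sperm_only))
print("genome calls nested within sperm calls:", nested)
```

An empty genome-only set together with a non-empty sperm-only set is the pattern reported here, which is what motivates the inference that the sperm array was the more sensitive platform.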
We did observe a nonsignificant trend for RNAs to
hybridize better to conspecific spots than heterospecific
spots among the probes on the sperm array, suggesting that
despite the close relationships of the 3 species represented,
there may be some slight effect of sequence divergence on
hybridization. However, we did not detect substantial differences in detection of misexpression among the probes
within our sperm array (see tables 3 and 4), though this negative result may have been affected by variation in the quality of the PCR products used or their spotting onto the array
or the fact that these 3 species are much more closely related
to each other than they are to D. melanogaster. Unlike the
comparison to the genome array, significant misexpression
was sometimes detected using data from ‘‘foreign’’ probes,
which was not detected from the D. simulans probe. It is
possible that analysis from our sperm array still underestimated the extent of misexpression in these hybrids and that
the different probes were only sampling from a larger pool
of affected transcripts. Indeed, at least 1 transcript was
shown previously by real-time PCR to be underexpressed
in sim–mau hybrids but was not detected as significantly
underexpressed using the probes from any of our platforms
(aly: see Noor 2005).
Conclusion
Several lines of evidence suggest that failures in transcriptional regulation may be associated with hybrid incompatibilities, such as hybrid sterility. In most Drosophila
species hybrids, transcription of genes involved in spermatogenesis is premeiotic (Fuller 1998), whereas the cytological disruptions associated with hybrid male sterility
appear postmeiotically (Wu et al. 1992; Coyne and Orr
2004). This suggests that the opportunity for transcription
was present in sterile hybrids, and the observed underexpression of many transcripts may not be a simple artifact
of sterility. Here, we have shown extensive commonalities in
patterns of misexpression across 2 species hybridizations.
We have also shown that much of the hybrid misexpression
is not apparent until late in development, though a few transcripts are underexpressed in hybrid larvae. Further examination of expression differences between sterile and fertile
later-generation hybrids may help to elucidate which (if any)
of the misexpressed transcripts may be directly or indirectly
associated with hybrid male sterility and what upstream regulatory factors are causing misexpression (e.g., Michalak
and Noor 2004).
Supplementary Material
Supplementary figures 1–3, 2 Microsoft Excel files
(supplementary tables 1 and 2), and description of construction of sperm array are available at Molecular Biology and
Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Greg Bourgeois for molecular assistance
and Ian Dworkin for statistical support. We thank A.
Chang, K. Tamura, and 2 anonymous reviewers for helpful
comments on the manuscript. This work was supported by
National Science Foundation grants 0314552 and 0509780
to M.A.F.N., a National Institutes of Health National
Research Service Award fellowship to A.J.M., and National
Science Foundation grant EPS-0346411 to the Louisiana
Board of Regents. The D. melanogaster DGRC-1 microarray slides were purchased from the DGRC, Bloomington,
Indiana.
Literature Cited
Allison DB, Cui X, Page GP, Sabripour M. 2006. Microarray data
analysis: from disarray to consolidation and consensus. Nat
Rev Genet. 7:55–65.
Arbeitman MN, Furlong EEM, Imam F, Johnson E, Null BH,
Baker BS, Krasnow MA, Scott MP, Davis RW, White KP.
2002. Gene expression during the life cycle of Drosophila melanogaster. Science. 297:2270–2275.
Boutanaev AM, Kalmykova AI, Shevelyov YY, Nurminsky DI.
2002. Large clusters of co-expressed genes in the Drosophila
genome. Nature. 420:666–669.
Clark AG. 2006. Genomics of the evolutionary process. Trends
Ecol Evol. 21:316–321.
Coyne JA. 1984. Genetic basis of male sterility in hybrids between
two closely related species of Drosophila. Proc Natl Acad Sci
USA. 81:4444–4447.
Coyne JA, Orr HA. 2004. Speciation. Sunderland (MA): Sinauer
Associates, Inc.
Cui X, Churchill GA. 2003. Statistical tests for differential expression in cDNA microarray experiments. Genome Biol. 4:210.
Fuller MT. 1998. Genetic control of cell proliferation and differentiation in Drosophila spermatogenesis. Semin Cell Dev Biol.
9:433–444.
Gilad Y, Rifkin SA, Bertone P, Gerstein M, White KP. 2005.
Multi-species microarrays reveal the effect of sequence divergence on gene expression profiles. Genome Res. 15:674–680.
Goodisman MA, Isoe J, Wheeler DE, Wells MA. 2005. Evolution
of insect metamorphosis: a microarray-based study of larval
and adult gene expression in the ant Camponotus festinatus.
Evolution. 59:858–870.
Grumbling G, Strelets V, Consortium TF. 2006. FlyBase: anatomical
data, images and queries. Nucleic Acids Res. 34:D484–D488.
Haerty W, Singh RS. 2006. Gene regulation divergence is a major
contributor to the evolution of Dobzhansky-Muller incompatibilities
between species of Drosophila. Mol Biol Evol.
23:1707–1714.
Johnson NA, Porter AH. 2000. Rapid speciation via parallel,
directional selection on regulatory genetic pathways. J Theor
Biol. 205:527–542.
Kliman RM, Andolfatto P, Coyne JA, Depaulis F, Kreitman M,
Berry AJ, McCarter J, Wakeley J, Hey J. 2000. The population
genetics of the origin and divergence of the Drosophila simulans complex species. Genetics. 156:1913–1931.
Lachaise D, David JR, Lemeunier F, Tsacas L, Ashburner M.
1986. The reproductive relationships of Drosophila sechellia
with D. mauritiana, D. simulans, and D. melanogaster from
the Afrotropical region. Evolution. 40:262–271.
Maside XR, Barral JP, Naveira HF. 1998. Hidden effects of X
chromosome introgressions on spermatogenesis in Drosophila
simulans × D. mauritiana hybrids unveiled by interactions
among minor genetic factors. Genetics. 150:745–754.
Mi H, Lazareva-Ulitsky B, Loo R, et al. (12 co-authors). 2005. The
PANTHER database of protein families, subfamilies, functions
and pathways. Nucleic Acids Res. 33:D284–D288.
Michalak P, Noor MAF. 2003. Genome-wide patterns of expression in Drosophila pure-species and hybrid males. Mol Biol
Evol. 20:1070–1076.
Michalak P, Noor MAF. 2004. Association of misexpression with
sterility in hybrids of Drosophila simulans and D. mauritiana.
J Mol Evol. 59:277–282.
Mukai M, Kitadate Y, Arita K, Shigenobu S, Kobayashi S. 2006.
Expression of meiotic genes in the germline progenitors of
Drosophila embryos. Gene Expr Patterns. 6:256–266.
Noor MA. 2005. Patterns of evolution of genes disrupted in expression in Drosophila species hybrids. Genet Res. 85:119–125.
Noor MAF, Feder JL. Forthcoming 2006. Speciation genetics:
evolving approaches. Nat Rev Genet. 7:851–861.
Ortiz-Barrientos D, Counterman B, Noor MAF. Forthcoming
2006. Gene expression divergence and the origin of hybrid
dysfunctions. Genetica.
Palopoli MF, Davis AW, Wu C-I. 1996. Discord between the phylogenies inferred from molecular versus functional data: uneven rates of functional evolution or low levels of gene
flow? Genetics. 144:1321–1328.
Parisi M, Nuttall R, Edwards P, et al. (12 co-authors). 2004. A
survey of ovary-, testis-, and soma-biased gene expression
in Drosophila melanogaster adults. Genome Biol. 5:R40.
Pavlidis P, Li Q, Noble WS. 2003. The effect of replication on
gene expression microarray experiments. Bioinformatics.
19:1620–1627.
Perezgasga L, Jiang J, Bolival B Jr, Hiller M, Benson E, Fuller
MT, White-Cooper H. 2004. Regulation of transcription of
meiotic cell cycle and terminal differentiation genes by the
testis-specific Zn-finger protein matotopetli. Development.
131:1691–1702.
Porter AH, Johnson NA. 2002. Speciation despite gene flow when
developmental pathways evolve. Evolution. 56:2103–2111.
Ranz JM, Castillo-Davis CI, Meiklejohn CD, Hartl DL. 2003.
Sex-dependent gene expression and evolution of the Drosophila transcriptome. Science. 300:1742–1745.
Ranz JM, Machado CA. 2006. Uncovering evolutionary patterns
of gene expression using microarrays. Trends Ecol Evol.
21:29–37.
Ranz JM, Namgyal K, Gibson G, Hartl DL. 2004. Anomalies in
the expression profile of interspecific hybrids of Drosophila
melanogaster and D. simulans. Genome Res. 14:373–379.
Rifkin SA, Kim J, White KP. 2003. Evolution of gene expression in
the Drosophila melanogaster subgroup. Nat Genet. 33:138–144.
Sun S, Ting C-T, Wu C-I. 2004. The normal function of a speciation gene, Odysseus, and its hybrid sterility effect. Science.
305:81–83.
Thomas S, Singh RS. 1992. A comprehensive study of genic
variation in natural populations of Drosophila melanogaster.
VII. Varying rates of genic divergence as revealed by two-dimensional electrophoresis. Mol Biol Evol. 9:507–525.
Ting C-T, Tsaur S-C, Wu C-I. 2000. The phylogeny of closely
related species as revealed by the genealogy of a speciation
gene, Odysseus. Proc Natl Acad Sci USA. 97:5313–5316.
Ting C-T, Tsaur S-C, Wu M-L, Wu C-I. 1998. A rapidly evolving
homeobox at the site of a hybrid sterility gene. Science.
282:1501–1504.
True JR, Haag ES. 2001. Developmental system drift and flexibility in evolutionary trajectories. Evol Dev. 3:109–119.
White KP. 2001. Functional genomics and the study of development, variation and evolution. Nat Rev Genet. 2:528–537.
White-Cooper H. 2004. Spermatogenesis: analysis of meiosis and
morphogenesis. In: Henderson DS, editor. Drosophila cytogenetics protocols. Totowa (NJ): Humana Press. p. 45–75.
White-Cooper H, Leroy D, MacQueen A, Fuller MT. 2000. Transcription of meiotic cell cycle and terminal differentiation
genes depends on a conserved chromatin associated protein,
whose nuclear localisation is regulated. Development. 127:
5463–5473.
White-Cooper H, Schäfer MA, Alphey LS, Fuller MT. 1998.
Transcriptional and post-transcriptional control mechanisms
coordinate the onset of spermatid differentiation with meiosis
I in Drosophila. Development. 125:125–134.
Whitehead A, Crawford DL. 2006. Variation within and among
species in gene expression: raw material for evolution. Mol
Ecol. 15:1197–1211.
Wolfinger RD, Gibson G, Wolfinger ED, Bennett L, Hamadeh H,
Bushel P, Afshari C, Paules RS. 2001. Assessing gene significance from cDNA microarray expression data via mixed models. J Comput Biol. 8:625–637.
Wray GA, Hahn MW, Abouheif E, Balhoff JP, Pizer M, Rockman
MV, Romano LA. 2003. The evolution of transcriptional regulation in eukaryotes. Mol Biol Evol. 20:1377–1419.
Wu C-I, Perez DE, Davis AW, Johnson NA, Cabot EL, Palopoli
MF, Wu M-L. 1992. Molecular genetic studies of postmating
reproductive isolation in Drosophila. In: Takahata N, Clark
AG, editors. Molecular paleo-population biology. Berlin:
Springer-Verlag. p. 191–212.
Zeng L-W, Singh RS. 1993. A combined classical genetic and
high resolution two-dimensional electrophoretic approach to
the assessment of the number of genes affecting hybrid male
sterility in Drosophila simulans and Drosophila sechellia.
Genetics. 135:135–147.
Koichiro Tamura, Associate Editor
Accepted September 29, 2006