Transcription profiles of the bacterium

6306–6320 Nucleic Acids Research, 2003, Vol. 31, No. 21
DOI: 10.1093/nar/gkg841
Transcription profiles of the bacterium Mycoplasma
pneumoniae grown at different temperatures
J. Weiner III, C.-U. Zimmerman, H. W. H. Göhlmann and R. Herrmann*
Zentrum für Molekulare Biologie Heidelberg, Universität Heidelberg, 69120 Heidelberg, Germany
Received June 25, 2003; Revised and Accepted September 12, 2003
ABSTRACT
Applying microarray technology, we have investigated the transcriptome of the small bacterium
Mycoplasma pneumoniae grown at three different
temperature conditions: 32, 37 and 32°C followed by
a heat shock for 15 min at 43°C, before isolating the
RNA. From 688 proposed open-reading frames, 676
were investigated and 564 were found to be
expressed (P < 0.001; 606 with P < 0.01) and at least
33 (P < 0.001; 77 at P < 0.01) regulated. By quantitative real-time PCR of selected mRNA species, the
expression data could be linked to absolute
molecule numbers. We found M.pneumoniae to be
regulated at the transcriptional level. Forty-seven
genes were found to be significantly up-regulated
after heat shock (P < 0.01). Among those were the
conserved heat shock genes like dnaK, lonA and
clpB, but also several genes coding for ribosomal
proteins and 10 genes of unassigned functions. In
addition, 30 genes were found to be down-regulated
under the applied heat shock conditions. Furthermore, we have compared different methods of cDNA
synthesis (random hexamer versus gene-specific
primers, different RNA concentrations) and various
normalization strategies of the raw microarray data.
INTRODUCTION
Until now, 96 bacterial and 16 archaeal genomes have
been completely sequenced (http://www.ncbi.nlm.nih.gov/PubMed/).
The annotations of these sequences provide a
fairly good description of the functional capacities of these
organisms, but a thorough understanding of their
biology requires more than the mere
knowledge of gene functions. For instance, comprehending how
an organism adapts to changing environmental conditions demands knowledge of the regulation
of genes, operons and regulons, and of the coordination
and interaction of numerous gene products. A prominent
example of such regulation is the bacterial response to
high temperature, known as heat shock. Genes that are
up-regulated during heat shock are also
frequently involved in the bacterial response to other environmental stress conditions, such as UV exposure, osmotic shock
or starvation (1).
Only limited information on gene expression and regulation
can be extracted directly from genome sequences.
Nevertheless, such sequences can be exploited for working
out new strategies and methods to monitor simultaneously and
quantitatively the expression of the gene pool of an organism
at the transcriptional (transcriptome) (2) and translational
level (proteome) (3).
The most popular method for the detection and quantification of all mRNA species from an organism is the array-based
technology. Derived from a known genome sequence, gene-specific probes can be synthesized, immobilized on a
supportive surface and used in hybridization experiments
either with labeled total RNA or with cDNA. The acquired
signal intensities give a rough estimate of gene expression on
the transcriptional level (4–10). However, this technology is
prone to several errors and requires a careful examination of
the collected raw data (11–13). Moreover, it is not yet clear
which statistical techniques are most suitable for the normalization and evaluation of microarray data.
The equivalent to a transcriptome analysis on the translational level is the proteome analysis. Aiming directly at the
proteins, the proteome enables identification and quantification of single proteins, and provides an insight into the
regulation of translation. At present, a frequently used method
for protein analysis is two-dimensional (2D) gel electrophoresis with immobilized pH gradients (14), which allows the
separation of complex protein mixtures into individual
components, which can be characterized by mass spectrometry (15–17). Since the separation of proteins is the
bottleneck of this method, alternative approaches are explored
to supplement 2D gel electrophoresis, e.g. multi-dimensional
liquid chromatography (18). The main advantage of the
mRNA analysis over the protein analysis is the direct and fast
identification of all transcripts. Also, with specific probes, one
can target and quantify individual RNA species within a crude
mixture of RNAs, which is convenient when RNA samples are
contaminated with RNAs from different organisms. Such a
precise analysis is not yet feasible in the proteome analysis,
which is limited by the power of resolution of the 2D gel
electrophoresis, the sensitivity of mass spectral analyses and
the lack of protein-specific probes which could be applied to a
large number of proteins simultaneously.
*To whom correspondence should be addressed. Tel: +49 6221 54 68 27; Fax: +49 6221 54 58 93; Email: [email protected]
Present address:
H. W. H. Göhlmann, Johnson & Johnson Pharmaceutical Research and Development, 2340 Beerse, Belgium
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors
Nucleic Acids Research, Vol. 31 No. 21 © Oxford University Press 2003; all rights reserved
Regulation of gene expression is known to take place at the
level of transcription and translation. Therefore, to understand
the regulation of gene expression of an organism, it is
necessary to study both processes. However, the complexity of
such an analysis grows considerably with the number of
expressed genes. For this reason, we have chosen the simple
bacterium Mycoplasma pneumoniae as a model organism. The
genome of M.pneumoniae has been completely sequenced
(19). It has a small genome size of only 816 kb and was
originally proposed to code for 677 open-reading frames
(ORFs). In the course of the recent re-annotation, the number
of ORFs increased from 677 to 688, one ORF being dismissed
and 12 added (20).
Mycoplasma pneumoniae is a human pathogen causing an
atypical pneumonia (21,22). It is considered to be a parasite of
the respiratory tract, colonizing the surface of the epithelial
cells. Recently, several reports were published indicating that
M.pneumoniae may also enter the host cells, although growth
of these bacteria inside the cell has not been well documented
(23). Mycoplasma pneumoniae is host dependent in nature, but
can be grown in the laboratory without the presence of host
cells in rich medium including horse or pig serum.
There is a lot yet to be learned about gene expression
regulation in M.pneumoniae. In both annotations of the
genome, only one sigma factor could be identified (19,20).
Bornberg-Bauer and Weiner (24) found a weak homology in a
further protein (MPN626) to sigma D factor of Bacillus
subtilis, but its putative role remains unclear. Mycoplasma
pneumoniae possesses only a limited number of the transcriptional regulation elements common in the bacterial world.
Therefore, one of the primary questions asked was whether
transcriptional regulation exists at all in this organism. The
answer obtained with the applied microarray technology was
clear: numerous genes were identified which
were up- or down-regulated under heat shock conditions.
Thus, we laid the foundation for the identification of regulated
genes at altered growth conditions and for further investigation of regulatory elements in M.pneumoniae.
MATERIALS AND METHODS
Organism, growth conditions and RNA isolation
Mycoplasma pneumoniae M129 was grown at 37°C in cell
culture flasks (150 cm²) containing 100 ml of modified
Hayflick medium (25) supplemented with 20% horse serum.
After 96 h, surface-attached cells were washed twice with
phosphate-buffered saline (PBS; 0.15 M NaCl, 10 mM sodium
phosphate, pH 7.4) and immediately lysed in the cultivation
flask by adding RLT buffer from the RNeasy Midi RNA extraction system (Qiagen). This isolation method was
chosen because it removes most RNAs smaller
than 200 bases, thus preventing the synthesis of cDNA from
tRNAs. For cell lysis, 2 ml of RLT buffer in the presence of
20 µl of β-mercaptoethanol was used per cultivation flask. The
lysate was sonicated twice for 5 s to shear genomic DNA and
diluted with an equal volume of RLT buffer. The diluted lysate
was mixed with an equal volume of 70% ethanol and was
immediately loaded onto an RNeasy Midi column (Qiagen).
The column was washed twice with 3 ml of RW-1 wash buffer
and twice with 3 ml of RPE wash buffer. RNA was eluted from
the column with 1200 µl of the elution buffer supplied with the
kit. All centrifugation steps were carried out at 4000 r.p.m.
(Hettich Rotana/R centrifuge) at room temperature. The eluate
was ethanol precipitated and the RNA was resuspended in
150 µl of H2O/DEPC. The RNA solution was mixed with 5 µl
of 10× DNAse buffer (1 M NaAc, 50 mM MgSO4) and 30 U
of RNAse-free DNAse I (Roche) and incubated for 30 min at
25°C, followed by a phenol/chloroform extraction and an
ethanol precipitation. Finally, the RNA was resuspended
in 50 µl aliquots and stored at −70°C. For quantitative PCR,
the total RNA was isolated using the guanidinium isothiocyanate/phenol extraction method. The reagents were
obtained from Roth (Roti-Quick® kit, catalog no. A979.1)
and preparation was done following the manufacturer's
instructions. For a single 100 ml cell-culture flask of
M.pneumoniae, 2 ml of solution A, 2.6 ml of solution B and
2 ml of solution C were used. The RNA was washed
thoroughly with 70% ethanol, dried, suspended in DEPC-treated
water and stored at −70°C.
Heat shock
Mycoplasma pneumoniae was grown at 32°C in the same
medium as described above. After 144 h, 50 ml of the 100 ml
medium was removed and either heated to 45°C or left at
32°C. The heated medium was mixed with the remaining 50 ml
of medium in the culture flask and placed in a 43°C water bath
for 15 min. Adherent cells were washed twice with 1× PBS
(32°C for cells grown at 32°C and 37°C for heat shock-treated
cells or cells grown at 37°C) and RNA was extracted as
described above.
Primer database and synthesis of ORF-specific PCR
products
A database was designed which contains approximately 7000
M.pneumoniae-specific primers. All primers, which were used
for the synthesis of ORF-specific PCR products as well as for
cDNA synthesis, were between 17 and 22 nt in length. The
length of the PCR products varied from 250 to 450 bp;
exceptions were ORFs shorter than 250 bp. The PCR products
started >45 bp downstream of the start codon of the postulated
ORF unless either repetitive DNA sequences were located
within this region or the primers had more than one binding
site within a region. The primer database can be found at
http://mol2.zmbh.uni-heidelberg.de/mycodb/.
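The design rules stated above can be expressed as a simple check. The following is a minimal sketch, not the procedure used to build the database: the `Amplicon` record and the `design_problems` function are hypothetical names, and the exceptions for repetitive DNA and multiple primer binding sites are not modeled.

```python
from dataclasses import dataclass

@dataclass
class Amplicon:
    orf_length: int          # ORF length in bp
    start_offset: int        # distance of product start from the start codon, bp
    product_length: int      # PCR product length in bp
    primer_lengths: tuple    # (forward, reverse) primer lengths in nt

def design_problems(amplicon):
    """Return a list of violated design rules (empty if none)."""
    problems = []
    if any(not 17 <= n <= 22 for n in amplicon.primer_lengths):
        problems.append("primer length outside 17-22 nt")
    # ORFs shorter than 250 bp are exempt from the product-length window.
    if amplicon.orf_length >= 250 and not 250 <= amplicon.product_length <= 450:
        problems.append("product length outside 250-450 bp")
    if amplicon.start_offset <= 45:
        problems.append("product starts 45 bp or less downstream of the start codon")
    return problems
```

For a hypothetical 900 bp ORF with a 300 bp product starting 60 bp downstream of the start codon and primers of 20 and 19 nt, the function returns an empty list.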
PCR products were evaluated by visual inspection of
ethidium bromide stained DNA after gel electrophoresis in a
2% agarose gel. A successfully synthesized PCR product had
to fulfill the following criteria: (i) the PCR should produce
only one DNA fragment, (ii) the fragment had to be of the
expected size and (iii) the concentration of the PCR products
should be similar.
Membrane preparation
The PCR products were diluted 1:5 with TE buffer (10 mM
Tris–HCl, pH 8, 0.1 mM EDTA) stained with
bromophenol blue and transferred into 384-well plates.
Positively charged nylon membranes (Amersham; 7.3 ×
11.5 cm) were equilibrated for 5 min in a denaturing solution
(0.5 M NaOH, 1.5 M NaCl) and then placed onto a precut
3MM Whatman paper soaked with the same solution.
The PCR products were transferred onto the membrane with
a BioGrid robot (BioRobotics Ltd). Each PCR product was
spotted in a diagonally positioned doublet for an internal
control. Every spot was loaded 10 times with ~0.05 µl per
loading. After the spotting, the membrane was briefly placed
on a 3MM Whatman paper soaked with a neutralizing solution
(0.5 M Tris–HCl, pH 7.2, 1.5 M NaCl, 1 mM EDTA) before it
was air dried. DNA was immobilized by automatic UV
crosslinking (Stratalinker; 120 mJ).
cDNA synthesis for microarray probes
For gene-specific cDNA synthesis used in microarray analysis,
10 µg of total RNA and 676 ORF-specific primers (0.5 pmol
per primer) were annealed in a programmable thermocycler.
The primers used for cDNA synthesis were the reverse primers
of each ORF used in the PCR synthesis of ORF-specific
probes.
The samples were kept at 75°C for 1 min and then cooled
down over a period of 8 min to 43°C. Then, 20 mmol each of
dCTP, dGTP and dTTP, 120 pmol of dATP, 100 µCi of [α-33P]dATP (Amersham-Pharmacia), 50 U of RNAse inhibitor,
MMuLV RT buffer and 60 U of MMuLV reverse transcriptase
were added. Reverse transcription took place in a final volume
of 100 µl at 37°C for 1 h. RNA was hydrolyzed for 30 min at
68°C in the presence of 4.5 µl of 2 N NaOH, 1 µl of 10% SDS
and 1 µl of 500 mM EDTA pH 8.0. The probe was neutralized
with 3 µl of 2 N HCl and 10 µl of 1 M Tris–HCl pH 7.4.
Labeled cDNA was separated from the unincorporated
nucleotides on a small Sephadex G100 column. The incorporation of 33P-labeled dATP was measured in scintillation
enhancer fluid (Quicksafe A, Zinsser Analytic).
Reverse transcriptase (RT)–PCR
Total RNA was isolated using the Qiagen RNeasy Mini Kit
(see Organism, growth conditions and RNA isolation). For
cDNA synthesis, 5 µg of RNA was mixed with a primer mix
containing 688 ORF-specific primers (new primer mix
generated for experiments which include the additional 12
genes) (1 pmol/primer) at a volume of 29 µl. This mix was
denatured for 1 min at 75°C, followed by a gradual cooling
from 75 to 42°C in 6 s steps. Twenty-one microliters of
enzyme mix was added when the temperature reached 42°C.
The enzyme mix contained 5 µl of DTT, 10 µl of SSII 5×
buffer, 1 µl of dNTP (10 mM), 1 µl of RNase inhibitor (40 U),
1 µl (200 U) of SuperScript™ II (Life Technologies) and 3 µl of
ddH2O. Reverse transcription was carried out at 42°C for
50 min.
PCR protocol. Each reaction was done in a final volume of
50 µl containing 0.1 µl of cDNA, 5 µl of Taq 10× buffer, 1 µl
of dNTP (10 mM), 30 pmol/primer and 0.5 µl of Taq polymerase
(2.5 U). The following cycles were applied to all PCRs: 2 min
at 92°C; 24 cycles of: 20 s at 92°C; 30 s at 56°C; 45 s at
72°C. Five microliters of each PCR mixture were loaded on a 1.5%
agarose gel.
Hybridization procedure
Before the first hybridization, the membrane was washed
twice with a boiling 0.1% SDS solution. Hybridization was
carried out in 100 ml hybridization bottles (Mini
Hybridization Oven; Appligene). Membranes were prehybridized twice for 2 h in 3.0 ml of hybridization buffer
(5× SSC, 1% SDS) at 60°C. Hybridization was carried out
overnight in 3–5 ml of buffer and complemented with 2–3 ×
10^6 c.p.m. of labeled cDNA. After hybridization, the membrane was washed three times for 20 min at 60°C with the
following buffers: 2× SSC, 0.1% SDS; 1× SSC, 0.1% SDS;
and 0.1× SSC, 0.1% SDS, respectively. The membrane was
placed under a PhosphorImager screen for 3–5 days and then
scanned at 88u (high pixel resolution) on a PhosphorImager
(Molecular Dynamics). Immediately after scanning, the
membrane was stripped two times with a boiling solution of
0.1% SDS and rehybridized or dried for storage.
Image analysis, quantification and data analysis
For data acquisition, the imaging program VisualGrid was
used (VisualGrid®, implemented by Markus Kietzmann with
contributions from David Bancroft and Igor Ivanov.
Copyrighted and licensed by GPC Biotech AG 1998–2000).
The software measured both pixel density and
background signals. A standard deviation of each duplicated
signal was calculated. If the standard deviation exceeded
30%, it was checked whether an increased signal was due to
high radioactive background within the measured field. If such
was the case, the higher signal was substituted by the lower
signal. An average of the duplicates was calculated and the
20% quantile of the background calculated by VisualGrid was
subtracted. Furthermore, an average of 41 signals derived
from unspotted fields was subtracted from the calculated
signal.
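The treatment of duplicate spots described above can be sketched as follows. This is an illustrative reading, not the VisualGrid implementation: the function name `spot_signal` is hypothetical, the 30% threshold is assumed to be relative to the mean of the duplicates, and the manual check for high local background is represented by a boolean flag.

```python
from statistics import mean, stdev

def spot_signal(dup_a, dup_b, background_q20, blank_mean,
                high_background=False, max_rel_sd=0.30):
    """Combine a duplicate pair into one background-corrected signal.

    background_q20: 20% quantile of the background (assumed to be supplied
                    by the imaging software).
    blank_mean:     average signal of the unspotted fields.
    high_background: outcome of the manual inspection (assumption: a flag).
    """
    pair = [dup_a, dup_b]
    centre = mean(pair)
    rel_sd = stdev(pair) / centre if centre else 0.0
    # If the duplicates disagree and the higher one sits in a high-background
    # field, the higher signal is replaced by the lower one.
    if rel_sd > max_rel_sd and high_background:
        pair = [min(pair), min(pair)]
    return mean(pair) - background_q20 - blank_mean
```

With hypothetical duplicate readings of 100 and 300, a background quantile of 5 and a blank average of 3, the function returns 92 when the discrepancy is attributed to background and 192 otherwise.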
Data normalization
Two normalization procedures were applied to the collected
raw data. The results from the different normalization methods
were compared with each other. In the first procedure, each
data point from a hybridization experiment was divided by the
sum of all signals from that membrane. In the second
normalization procedure, the mean background was subtracted from the data points and the result was log-transformed.
From each derived data point the mean signal from the
membrane was subtracted and the result was divided by the
standard error of the signals.
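The two normalization procedures can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, the natural logarithm is assumed because the base is not given, and "standard error" is taken literally as the standard deviation divided by the square root of the number of signals.

```python
import math
from statistics import mean, stdev

def normalize_by_total(signals):
    """Procedure 1: express each signal as a fraction of the membrane total."""
    total = sum(signals)
    return [s / total for s in signals]

def normalize_log_standardized(signals, mean_background):
    """Procedure 2: background-subtract, log-transform, centre and scale.

    Every signal must exceed the mean background, otherwise the logarithm
    is undefined (such spots would need separate handling).
    """
    logged = [math.log(s - mean_background) for s in signals]
    centre = mean(logged)
    std_err = stdev(logged) / math.sqrt(len(logged))  # literal standard error
    return [(x - centre) / std_err for x in logged]
```

The first procedure yields values that sum to 1 for each membrane, which makes membranes with different overall labeling intensity comparable; the second centres each membrane at zero on a logarithmic scale.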
Normalization against genomic DNA
Genomic DNA of M.pneumoniae was highly purified by
CsCl density-gradient centrifugation and ethanol precipitation.
The DNA was then quantified and mechanically further
disrupted by pipetting and vortexing. A standard amount of
10 µg of DNA was used for a nick-translation (Roche) using
33P-labeled dATP and dCTP and the labeled DNA was further
disrupted by sonication. A standard hybridization was done as
described before. This experiment aimed to gain an estimate
of the hybridization ability of the PCR product.
Quantitative PCR
Quantitative PCR (Q-PCR) allows a precise quantification of
the amount of DNA present in a given sample. We applied this
method to determine the absolute copy number of mRNA
molecules in Mycoplasma cells and to correlate the numbers with
the signal strength derived from microarrays.
For Q-PCR, the 5700 series PCR detection unit and SYBR
green master mix (Applied Biosystems) were used. For each
experimental set-up, a standard curve was prepared, consisting
of five dilutions of the PCR products, ranging from ~1 × 10^6
to 100 copies of DNA. From each of the cDNA samples and the
negative controls, 1 µl was amplified in a 50 µl reaction. The
results were evaluated using the software provided by the
manufacturer of the instrument (Applied Biosystems).
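How a standard curve of this kind converts a measurement into a copy number can be sketched as follows. This is a generic least-squares illustration, not the algorithm of the instrument software: the function names are hypothetical, the tenfold dilution steps are an assumption, and the threshold-cycle (Ct) values in the usage note are invented for illustration only.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Fit Ct = slope * log10(copies) + intercept by ordinary least squares."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the copy number of a sample."""
    return 10 ** ((ct - intercept) / slope)
```

For example, with five hypothetical standards of 10^6, 10^5, 10^4, 10^3 and 10^2 copies and Ct values of 15.0, 18.3, 21.6, 24.9 and 28.2, the fitted slope is about -3.3 cycles per tenfold dilution, and a sample with a Ct of 21.6 is estimated at about 10^4 copies.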
cDNA synthesis for Q-PCR
For Q-PCR, the following procedure was applied: each
reaction was carried out in 25 µl with 150 ng of total RNA,
50 U of AMV reverse transcriptase (RT) (Roche),
1 µl of AMV RT buffer, 1.4 mM dNTP, 1 µl of random
hexanucleotide primer set (Roche) and 50 U of RNAse
inhibitor (Roche). Before adding the AMV RT, the mixture
was heated for 10 min at 60°C, then cooled on ice. Next, AMV
RT was added and the mixture was transferred to 42°C for
45 min. Enzymes were inactivated by heating for 10 min
at 95°C, the reaction volume was set to 200 µl by adding
DEPC-treated water, and the cDNA was stored at −20°C. For
each of the samples, negative controls were prepared as
described, with the omission of the AMV RT.
Determination of cell density
Two methods were applied for the determination of cell
density (color changing test and Q-PCR). For the color
changing test, serial 1:10 dilutions of M.pneumoniae cultures
were done, beginning at the time of inoculation. Samples were
taken in 24 h intervals and cell density was determined based
on the last dilution in which color change of the medium, due
to cell growth, could be observed (CCU).
Secondly, genome equivalents were determined by Q-PCR
in three different PCRs. Mycoplasma pneumoniae cells taken
from one culture were washed with 1× PBS and suspended in
1 ml of 1× PBS. The suspension was subsequently sonicated
three times for 15 s and diluted 1:100. From this dilution, 1 µl
was taken for Q-PCRs corresponding to genome positions
located within the gene MPN434.
Mycoplasma pneumoniae transcriptome web site
All results and additional data related to the M.pneumoniae
transcriptome can be retrieved from the web address
http://www.zmbh.uni-heidelberg.de/M_pneumoniae/transcriptome/.
RESULTS
Experimental set-up and standard transcription profile
The aims of our study were: (i) to monitor simultaneously and
semi-quantitatively all mRNA species synthesized in
M.pneumoniae cells grown under standard laboratory conditions, i.e. rich medium and a growth temperature of 37°C; and
(ii) to investigate regulation of gene expression at the
transcriptional level during heat shock. Microarray technology
served as the basis for these investigations and the results of
selected candidates were confirmed by Q-PCR and RT–PCR.
Furthermore, our approach involved determining and quantifying cell numbers as well as copy numbers of RNA species.
Although M.pneumoniae grows at 37°C under standard
laboratory conditions and probably also in its natural
environment, we chose 32°C as a reference temperature for
comparison with heat shock conditions, because pilot experiments indicated that 37°C seems already to be a kind of heat
shock for certain genes. Therefore, to get clearer differences of
temperature-induced expression signals, we used total RNA
from cells grown at 32°C as a reference for the effect of heat
shock. A temperature shift from 37 to 43°C gave similar but
less diverging results (data not shown).
Figure 1. DNA array of the entire set of 677 postulated ORFs of
M.pneumoniae hybridized with radioactively labeled cDNA probes
generated from RNA which was extracted from cells grown at 37°C for
96 h. Each ORF was spotted in doublets.
First, we established the transcription profiles from each of
the 677 ORFs proposed in the original publication (Fig. 1).
Genes coding for RNA only, like the abundant ribosomal
RNA or tRNA, were excluded from the analysis for technical
reasons. The cDNA was synthesized from total M.pneumoniae
RNA by priming with 676 gene-specific oligonucleotides.
Since one of the originally proposed 677 ORFs was dismissed
(20), we used only 676 primers. The individual signals
obtained after exposing the nylon membrane to a
PhosphorImager were inspected and expressed as a fraction
of the total hybridization signal. Figure 2 presents the data
from a selected array of ORFs as the average of various
independent gene expression measurements for three growth
conditions. Pooling the results from the four independently
established transcription profiles of cells grown at 37°C,
signals from more than 600 genes were significantly above
background (606 for P < 0.01, 564 for P < 0.001; see Table 1).
Among the genes that showed no significant expression
(transcription) according to our rules, 31 (43%) were without
assigned function. For 21 (30%) of the genes that showed no
significant transcriptional expression, proteins were identified
in the proteome analysis of M.pneumoniae (36).
The additional 12 genes (MPN069, MPN242, MPN254,
MPN270, MPN272, MPN296, MPN377, MPN388, MPN418,
MPN482, MPN495, MPN605) (20), which were identified
after the original publication appeared, were tested on separate
nylon membranes. Since the genes were spotted on a new
array set, the data for these genes could not be evaluated
together with the data discussed in this paper. However,
we found a positive transcription, based on signal above
background and controls, for 11 of the 12 new genes. The
transcription signal for MPN418 (hypothetical protein) fell
below the background signal. MPN069 (50S ribosomal protein
L33) and MPN377 (hypothetical protein) showed high
expression signals.
Figure 2. A series of 25 ORFs (MPN507–MPN531) and their hybridization signals. The signals were derived from three temperature experiments where
M.pneumoniae was incubated for 96 h at 37°C, 144 h at 32°C or exposed to a heat shock of 43°C for 15 min after incubation for 144 h at 32°C, respectively.
Each signal represents a percentage of the sum of all 677 ORF signals. A total of 15 experiments were included in the evaluation (four for
incubation at 37°C, six for incubation at 32°C and five for heat shock). For each profile, the sum of all signals was calculated and each ORF signal was
divided by the sum. An average of all corresponding ORF data was calculated and is presented in the histogram. Error bars represent the standard error within
the selected experiments.
For comparative studies and to show regulation of gene
expression in M.pneumoniae by heat shock, transcription
profiles were established from cells grown at 32°C and
compared with those derived from cells grown at 32°C but
exposed for 15 min to a temperature of 43°C before RNA
isolation. Altogether, 47 genes were up-regulated after
exposure of the bacterium to 43°C (Figs 3 and 4 and
Table 1). Among the up-regulated candidates were ubiquitous
conserved genes coding for heat shock proteins like DnaK,
GroEL and LonA. In addition, unexpected genes, which were
not known to code for products involved in a heat shock
response, like those coding for various ribosomal proteins,
proteases and functionally unassigned lipoproteins, were also
heat shock induced. Besides up-regulation, we also
observed the down-regulation of 30 genes upon a temperature
shift to 43°C (Fig. 3 and Table 1). Down-regulation was
also observed in genes of unassigned function. A further
comparison of the genes with the highest expression level at
each of the three selected temperatures revealed that certain
genes like MPN053 (phosphocarrier protein HPr), MPN624
(ribosomal protein L28), MPN665 (elongation factor Tu),
MPN024 (DNA-directed RNA polymerase delta subunit) or
MPN331 (trigger factor) were constitutively highly expressed
at all three temperatures (Table 2). The results of the
comparisons among the various experimental conditions
were very similar for the two statistical tests applied. With
the t-test, 133 genes showed a significant
difference in expression in the heat shock profile at P < 0.05
and 19 at P < 0.001. With the Mann–Whitney U-test, the
numbers were 135 and 33, respectively (see Table 1 for
details). A transcriptome map of M.pneumoniae based on the
results described above is shown in Figure 4. The ORF that
was removed from the original annotation was negative in the
transcriptome.
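For readers unfamiliar with the rank-based test named above, the following minimal sketch computes a Mann–Whitney U statistic and an exact two-sided P value by enumerating all group assignments, which is practical only for small samples. The source does not state which implementation or approximation was used, so the function names and the exact-enumeration approach are assumptions, and the numbers in the usage note are hypothetical.

```python
from itertools import combinations

def rank_values(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def mann_whitney_exact(a, b):
    """Return (U for sample a, exact two-sided P value)."""
    ranks = rank_values(list(a) + list(b))
    n1, n2 = len(a), len(b)

    def u_from(rank_sum):
        return rank_sum - n1 * (n1 + 1) / 2

    u_obs = u_from(sum(ranks[:n1]))
    centre = n1 * n2 / 2          # expected U under the null hypothesis
    extreme = total = 0
    # Enumerate every way of assigning n1 of the pooled ranks to sample a.
    for idx in combinations(range(n1 + n2), n1):
        total += 1
        if abs(u_from(sum(ranks[i] for i in idx)) - centre) >= abs(u_obs - centre):
            extreme += 1
    return u_obs, extreme / total
```

For two hypothetical groups of four normalized signals each with no overlap between the groups, U is 0 and the exact two-sided P value is 2/70.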
Control experiments
In all experiments, the transcription profiles proved to be
reproducible, independent of total RNA or filter preparations.
However, to minimize artifacts or misinterpretations, a
number of control experiments were done. One of the
most critical parameters in a transcriptome analysis concerns
synthesis of cDNA. A poor selection of sequence-specific primers may cause meager cDNA synthesis
due to low hybridization to an mRNA template. For this
reason, cDNA synthesis was alternatively done with random
hexamers. Although the specific activity of the random
hexamer-primed cDNA was higher (10^7 c.p.m./10 µg of
total RNA) than the activity of 18–20mer gene-specific primed
cDNA (10^5–10^6 c.p.m./10 µg of total RNA), fewer positive
signals were counted after filter hybridization with random
hexamer-primed cDNA, and the background was consistently
higher. The reason for the high activity is probably the reverse
transcription of the very redundant tRNA and rRNA species
with random hexamer primers (see Discussion). Except for
MPN123 [topoisomerase IV subunit A (parC)], MPN356
[cysteinyl-tRNA synthetase (cysS)], MPN349 (conserved
hypothetical), MPN345 [type 1 restriction enzyme (hsdR)
homolog] and MPN448 (conserved hypothetical), which
showed a higher signal intensity when cDNA was synthesized
with random hexamers, all probes that gave positive signals
with hexamer-primed cDNA were also detected in profiles
derived with gene-specific primed cDNA. The array signals of
the cDNA synthesized with the gene-specific
primer mix were much stronger than those of hexamer-primed
cDNA. Evidently, not all mRNA species are reverse transcribed with a random hexamer primer mix as efficiently as
with sequence-specific primers. For this reason, the signal
pattern of the hexamer-derived array profiles differs from that derived with
gene-specific primers. The more precise way of transcribing
the majority of each mRNA species is to use gene-specific
primers in an excess concentration. This is the approach that
we took in our experiments.
The even spotting pattern of the probes on the membranes
was tested by hybridization with nick-translated DNA. All
M.pneumoniae-specific probes on the filter reacted positively
under the selected conditions, although the signal strength was
not homogeneous. When comparing this control profile with
the standard transcription profile using the same filter, we
found no correlation between them, as expected. From this
comparison, we concluded that the loading of the probes onto
the filter worked well and that the distribution of the signal
intensities in the transcription profiles reflects a true difference
in the number of transcripts and labeled genomic DNA,
respectively, in the hybridization mixture. To confirm the
signals and signal differences obtained with 33P-labeled cDNA
and to eliminate any bias introduced by the labeling procedure, RT–PCRs were done with selected genes and gene
(mRNA)-specific primers. The selected candidates fell into the
three categories of up-, down- and non-regulated genes, based
on the evaluated transcription profiles. The intention was to
compare transcription of a given gene from cells grown at
different temperatures rather than comparing the expression
differences among different genes. The results confirm the
temperature-dependent signal differences obtained in
the transcriptome analysis for many of the genes (Fig. 5). In
the RT–PCR, however, more genes showed regulation.
Numerous genes that showed no statistically significant
regulation in the array experiment showed an up- or down-regulation in the RT–PCR, and some genes that proved to
be regulated in the array experiment showed no regulation in
the RT–PCR. An interesting result from the RT–PCR
experiment was, for instance, the heat shock-induced up-regulation
of several genes involved in the purine and
pyrimidine salvage pathway. Five of these genes, MPN065
cytidine deaminase (cdd), MPN064 thymidine phosphorylase
(deoA), MPN063 deoxyribose-phosphate aldolase (deoC),
MPN062 purine-nucleoside phosphorylase (deoD) and
MPN061 signal recognition particle protein (ffh), are clustered
in the same orientation on the genome. Of these, only
MPN065 was shown to be significantly up-regulated on the
microarray. Of course, one has to keep in mind that the array
results are a statistical set of data derived from an array of
experimental results. RT–PCR, on the other hand, was done
only once and the results were, therefore, not averaged. The
RT–PCR results, however, are not contradictory to the results
obtained from the microarray experiments, as we only
observed discrepancies between significantly expressed and
not significantly expressed signals. A larger discrepancy,
where ORFs showed regulation in one direction (either up- or
down-regulated) in the microarray experiment and regulation
in the opposite direction in the RT–PCR experiment, was not
observed.
Determination of cell numbers; Q-PCR
To interpret the microarray signals in terms of the copy number of
specific mRNA molecules in a total RNA preparation, or more
precisely in the bacterium, at least three variables must be
known: the number of M.pneumoniae cells used
for a total RNA preparation, the total amount of RNA isolated
from a given cell number and the concentration of individual
mRNA species within the total RNA preparation. Counting
M.pneumoniae cells in a preparation is not trivial, since the
bacteria are too small to be seen in a light microscope. In
addition, the cells are sticky and tend to form
clumps, which rules out counting colonies on agar
plates as a reliable method for quantification. Therefore, we
decided to determine the number of genomes in a bacterial
suspension by Q-PCR and to use the result as a reproducible
measure (genome equivalent) for cell density. The mean
number of genome equivalents in an M.pneumoniae suspension
grown under standard conditions in one cell culture flask (150 cm^2)
(see Materials and Methods) was 2.4 ± 0.8 × 10^11. In the
color-changing test, the highest 1:10 dilution of an
M.pneumoniae suspension that still showed growth indicated
10^10-10^12 cells per flask.
From such a standard bacterial suspension, we routinely
isolated 200 µg of total RNA. This means that 200 µg of total
RNA corresponds to 2.4 × 10^11 genome equivalents or to
1.2 × 10^11 bacteria, assuming that one cell, growing under
standard conditions, contains two genomes.
To determine the actual copy numbers of individual mRNA
species in total RNA preparations, 0.15 µg of RNA was used
for cDNA synthesis, priming with a random hexamer primer
mix. Table 3 shows the copy numbers of individual
mRNA species and of the 16S rRNA. We chose these
particular mRNA species because they gave signals ranging
from low to high in the microarray analysis.
Approximately 4 × 10^10 16S rRNA copies were measured
per 1 µg of RNA, i.e. per 1.2 × 10^9 genome equivalents. The
mRNAs that gave the strongest signals were approximately
100-fold less abundant (Tables 2 and 3).
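The conversion between RNA mass, genome equivalents and cells can be retraced with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted above; the variable and function names are illustrative and not part of the original analysis.

```python
# Figures quoted in the text for one standard culture flask.
TOTAL_RNA_UG = 200.0          # total RNA isolated (micrograms)
GENOME_EQUIVALENTS = 2.4e11   # genome equivalents measured by Q-PCR
GENOMES_PER_CELL = 2          # assumption stated in the text

# Derived quantities.
cells_per_flask = GENOME_EQUIVALENTS / GENOMES_PER_CELL
ge_per_ug = GENOME_EQUIVALENTS / TOTAL_RNA_UG     # genome equivalents per microgram RNA
cells_per_ug = cells_per_flask / TOTAL_RNA_UG     # cells per microgram RNA


def copies_per_cell(copies_per_ug):
    """Convert a copy number per microgram of total RNA into copies per cell."""
    return copies_per_ug / cells_per_ug


# Approximate 16S rRNA copy number per microgram of RNA quoted in the text.
rrna_16s_per_cell = copies_per_cell(4e10)

print(f"genome equivalents per ug RNA: {ge_per_ug:.2e}")
print(f"cells per ug RNA:              {cells_per_ug:.2e}")
print(f"16S rRNA copies per cell:      {rrna_16s_per_cell:.0f}")
```

With these inputs, 1 µg of total RNA corresponds to 1.2 × 10^9 genome equivalents or 6 × 10^8 cells, so the rounded 16S rRNA figure translates into some tens of copies per cell.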
Table 1. Genes with significant differences (P < 0.01) in expression between the 32°C and heat shock profiles
ORF (a) | Annotation (b) | Heat shock mean (c) | Heat shock SD | 32°C mean (c) | 32°C SD | Difference | P (d)
Genes up-regulated during heat shock
531 | ATP-dependent protease binding subunit (clpB) homolog | 2.799 | 0.27 | 0.765 | 0.28 | 2.03 | 0.001
481 | GTP binding, era like | 0.056 | 0.24 | -1.586 | 0.90 | 1.64 | 0.001
507 | Type I restriction enzyme ecokI specificity protein (hsdS) homolog | 0.136 | 0.49 | -1.161 | 0.44 | 1.30 | 0.003
480 | Valyl-tRNA synthetase (valS) | -0.230 | 0.35 | -1.444 | 0.82 | 1.21 | 0.008
333 | Hypothetical | 0.953 | 0.18 | -0.193 | 0.41 | 1.15 | 0.001
332 | ATP-dependent protease (lon) | 1.322 | 0.39 | 0.202 | 0.22 | 1.12 | 0.001
614 | Conserved hypothetical | 0.066 | 0.12 | -0.934 | 0.42 | 1.00 | 0.001
434 | Heat shock protein DnaK | 1.641 | 0.17 | 0.673 | 0.34 | 0.97 | 0.001
168 | Ribosomal protein L2 (rpL2) | 0.815 | 0.31 | -0.153 | 0.40 | 0.97 | 0.008
218 | Oligopeptide transport ATP-binding protein (oppF) | 0.121 | 0.39 | -0.767 | 0.31 | 0.89 | 0.001
165 | Ribosomal protein L3 (rpL3) | 1.095 | 0.25 | 0.227 | 0.41 | 0.87 | 0.008
249 | Conserved hypothetical | -0.234 | 0.28 | -1.047 | 0.27 | 0.81 | 0.001
166 | Ribosomal protein L4 (rpL4) | 0.187 | 0.32 | -0.619 | 0.27 | 0.81 | 0.003
199 | Conserved hypothetical | 0.940 | 0.29 | 0.134 | 0.49 | 0.81 | 0.003
352 | Sigma-70 factor family | 1.303 | 0.41 | 0.509 | 0.17 | 0.79 | 0.001
170 | Ribosomal protein L22 (rpL22) | 1.275 | 0.27 | 0.500 | 0.36 | 0.78 | 0.003
200 | Conserved hypothetical | 0.961 | 0.08 | 0.195 | 0.37 | 0.77 | 0.001
621 | Similar to metallohydrolase | 0.438 | 0.33 | -0.326 | 0.34 | 0.76 | 0.008
164 | Ribosomal protein S10 (rpS10) | -0.395 | 0.23 | -1.137 | 0.26 | 0.74 | 0.001
172 | Ribosomal protein L16 (rpL16) | 0.057 | 0.48 | -0.669 | 0.33 | 0.73 | 0.008
65 | Cytidine deaminase (cdd) | 1.810 | 0.22 | 1.102 | 0.20 | 0.71 | 0.001
171 | Ribosomal protein S3 (rpS3) | 0.455 | 0.25 | -0.219 | 0.40 | 0.67 | 0.003
2 | Similar to j-domain of DnaJ | 0.872 | 0.16 | 0.199 | 0.15 | 0.67 | 0.001
161 | Conserved hypothetical | 0.453 | 0.34 | -0.204 | 0.32 | 0.66 | 0.008
334 | ABC transporter ATP-binding protein | 0.325 | 0.20 | -0.311 | 0.27 | 0.64 | 0.003
315 | SAM-dependent methyltransferase | 1.557 | 0.13 | 0.922 | 0.50 | 0.64 | 0.008
173 | Ribosomal protein L29 (rpL29) | -0.357 | 0.26 | -0.985 | 0.40 | 0.63 | 0.008
541 | Ribosomal protein S20 (rpsT) | 0.579 | 0.23 | -0.045 | 0.39 | 0.62 | 0.008
350 | Conserved hypothetical | 0.223 | 0.13 | -0.398 | 0.16 | 0.62 | 0.001
198 | Adenine-specific methyltransferase EcoRI (mte1) | -1.311 | 0.28 | -1.930 | 0.34 | 0.62 | 0.008
574 | Heat shock protein GroES | 2.196 | 0.11 | 1.598 | 0.38 | 0.60 | 0.003
1 | DNA polymerase III beta subunit (dnaN) | 1.637 | 0.14 | 1.039 | 0.20 | 0.60 | 0.001
572 | Similar to cytosol aminopeptidase (leucine aminopeptidase) (lap) | 1.226 | 0.13 | 0.643 | 0.34 | 0.58 | 0.001
268 | Similar to PTS system: EIIB and N-terminal part of EIIC; bidomainal | 0.119 | 0.10 | -0.425 | 0.19 | 0.54 | 0.001
266 | Similar to (oxido/arsenate) reductase | 1.087 | 0.18 | 0.556 | 0.18 | 0.53 | 0.003
220 | Ribosomal protein L1 (rpL1) | -0.155 | 0.22 | -0.667 | 0.25 | 0.51 | 0.008
346 | Type I restriction enzyme hsdR (fragment) | 0.773 | 0.17 | 0.290 | 0.32 | 0.48 | 0.003
160 | Conserved hypothetical | -0.317 | 0.16 | -0.781 | 0.26 | 0.46 | 0.008
250 | Phosphoglucose isomerase B (pgiB) | -0.673 | 0.17 | -1.136 | 0.37 | 0.46 | 0.008
66 | Phosphomannomutase or phosphoglucomutase | 0.241 | 0.15 | -0.194 | 0.27 | 0.43 | 0.003
267 | Conserved hypothetical | -0.132 | 0.15 | -0.554 | 0.32 | 0.42 | 0.008
557 | NADH-binding oxidoreductase GidA | 0.526 | 0.11 | 0.153 | 0.21 | 0.37 | 0.003
343 | Type I restriction enzyme ecokI specificity protein (hsdS) homolog | 1.154 | 0.14 | 0.784 | 0.22 | 0.37 | 0.001
13 | Hypothetical | 1.014 | 0.15 | 0.653 | 0.20 | 0.36 | 0.003
245 | Membrane-associated guanylate kinase homolog | 1.192 | 0.17 | 0.835 | 0.17 | 0.36 | 0.001
636 | Ribosome releasing factor (frr) | 0.909 | 0.15 | 0.559 | 0.15 | 0.35 | 0.003
30 | Conserved hypothetical | 0.815 | 0.13 | 0.476 | 0.12 | 0.34 | 0.003
Genes down-regulated during heat shock
39 | Conserved hypothetical | -1.704 | 0.93 | -0.475 | 0.54 | -1.23 | 0.003
213 | Conserved hypothetical | -0.866 | 0.40 | 0.091 | 0.27 | -0.96 | 0.001
146 | Conserved hypothetical | 1.653 | 0.17 | 2.591 | 0.29 | -0.94 | 0.001
509 | Membrane export protein family | 0.407 | 0.33 | 1.318 | 0.36 | -0.91 | 0.001
591 | Conserved hypothetical | 0.562 | 0.16 | 1.416 | 0.36 | -0.85 | 0.001
40 | Conserved hypothetical | -0.902 | 0.25 | -0.060 | 0.36 | -0.84 | 0.001
304 | Arginine deiminase (arcA) (N-terminal, fragment) | 0.057 | 0.25 | 0.884 | 0.26 | -0.83 | 0.001
154 | N-utilization substance protein A homolog (nusA) | -1.245 | 0.23 | -0.428 | 0.43 | -0.82 | 0.003
583 | Conserved hypothetical | -1.015 | 0.24 | -0.290 | 0.38 | -0.73 | 0.001
493 | 3-Hexulose-6-phosphate synthase | -1.161 | 0.22 | -0.444 | 0.25 | -0.72 | 0.001
57 | Spermidine/putrescine transport system permease (potI) | -0.045 | 0.28 | 0.654 | 0.44 | -0.70 | 0.008
667 | UDP-glucose pyrophosphorylase (gtaB) | 0.123 | 0.29 | 0.815 | 0.35 | -0.69 | 0.008
685 | Sulfate transport ATP-binding protein (cysA) | 1.137 | 0.32 | 1.807 | 0.33 | -0.67 | 0.008
26 | Similar to GTPases | -1.029 | 0.11 | -0.364 | 0.41 | -0.67 | 0.001
131 | Adhesin P1 precursor homolog | -0.102 | 0.46 | 0.489 | 0.23 | -0.59 | 0.008
284 | Conserved hypothetical | 0.128 | 0.15 | 0.634 | 0.34 | -0.51 | 0.001
256 | Conserved hypothetical | 0.139 | 0.23 | 0.624 | 0.31 | -0.48 | 0.008
455 | Conserved hypothetical | 0.318 | 0.19 | 0.801 | 0.25 | -0.48 | 0.003
77 | Hypothetical | 1.004 | 0.19 | 1.473 | 0.19 | -0.47 | 0.003
264 | 150-End, similar to phosphate hydrolyzing | 0.370 | 0.17 | 0.805 | 0.27 | -0.43 | 0.008
586 | Conserved hypothetical | 1.108 | 0.17 | 1.540 | 0.20 | -0.43 | 0.003
7 | Similar to DNA-polymerase subunits | -0.299 | 0.23 | 0.128 | 0.15 | -0.43 | 0.001
464 | Involved in cytadherence | -0.469 | 0.09 | -0.054 | 0.16 | -0.41 | 0.001
413 | Hypothetical | 0.120 | 0.18 | 0.525 | 0.15 | -0.41 | 0.008
211 | Excinuclease ABC subunit B (uvrB) | -0.336 | 0.21 | 0.067 | 0.20 | -0.40 | 0.003
5 | Seryl-tRNA synthetase (serS) | -0.657 | 0.12 | -0.276 | 0.13 | -0.38 | 0.001
420 | Glycerophosphoryl diester phosphodiesterase (glpQ) | 1.037 | 0.11 | 1.379 | 0.16 | -0.34 | 0.001
414 | Involved in cytadherence | -0.485 | 0.12 | -0.148 | 0.13 | -0.34 | 0.003
372 | Similarity to pertussis toxin subunit s1 | -0.922 | 0.14 | -0.650 | 0.21 | -0.27 | 0.008
498 | L-Ribulose-5-phosphate 4-epimerase (araD) | 0.354 | 0.09 | 0.608 | 0.29 | -0.25 | 0.008
(a) ORF numbers according to Dandekar et al. (20).
(b) Annotation according to Dandekar et al. (20).
(c) Mean of the percentage values as described in the text.
(d) Probability of type I error in the Mann-Whitney U-test for unpaired samples.
The mRNA copy numbers were scaled accordingly and the
linear regression coefficients with copy numbers as the
independent variable and percentage mRNA microarray signal
as dependent variable were calculated (Fig. 6).
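As an illustration of such a regression, the sketch below fits an ordinary least-squares line to the heat shock values of four genes that appear in both Table 1 (microarray signal) and Table 3 (copy number, in thousands per 1 µg of RNA). The text does not spell out how the copy numbers were scaled, so the log10 transform used here is an assumption, and the function name is illustrative.

```python
import math


def linear_regression(x, y):
    """Ordinary least squares for y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Heat shock values: MPN number -> (copies in thousands per ug RNA, microarray signal).
heat_shock = {
    "MPN531": (201478, 2.799),  # clpB
    "MPN434": (71090, 1.641),   # dnaK
    "MPN333": (8297, 0.953),    # hypothetical
    "MPN352": (2966, 1.303),    # sigma-70 factor family
}

# Assumption: copy numbers are compared on a log10 scale.
log_copies = [math.log10(copies * 1000) for copies, _ in heat_shock.values()]
signals = [signal for _, signal in heat_shock.values()]

slope, intercept, r = linear_regression(log_copies, signals)
print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.2f}")
```

Four points are far too few for a meaningful fit; the sketch only shows the direction of the calculation, with copy number as the independent and microarray signal as the dependent variable.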
DISCUSSION
Regulation in heat shock conditions
Mycoplasma pneumoniae lives in a relatively constant environment, the human respiratory tract, where growth conditions
do not seem to change dramatically. For instance, the window
for temperature change is very small and should not exceed
42°C as the upper limit.
Nevertheless, with the applied microarray analysis we
found 47 genes to be significantly up-regulated in heat shock
(43°C) conditions at P < 0.01 (Table 1). These results show
that gene regulation takes place in M.pneumoniae, as in other
organisms, in response to environmental changes. The
up-regulated genes can be classified according to function. The
classical heat shock proteins with chaperone or protease
activity are represented by DnaK, LonA, GroES, a protein
with the j-domain of DnaJ (MPN002) and ClpB. We assume
that, as shown for Escherichia coli and other bacteria, the
DnaK system prevents aggregation of thermo-labile proteins
(26) and together with ClpB solubilizes heat-induced protein
aggregates (27). An additional attractive possibility and useful
function for a cell could be the observed cooperative action of
ClpB and the DnaK system in activating the initiation of
replication of the plasmid RK2 (28) by converting an inactive
protein dimer into an active monomer form. The fact that clpB
(MPN531) was the most significantly up-regulated gene and is
the only conserved member of the clp family in M.pneumoniae
stresses the importance of these functions, even for a cell
living in a constant environment.
The significance of the other up-regulated genes is less easy
to interpret. They belong to the following functional groups:
translation and ribosomal proteins, transport and binding
proteins, cell envelope proteins, proteins involved in
restriction and modification, purine and pyrimidine metabolism,
DNA replication, transcription and energy metabolism.
Eleven of the 47 up-regulated genes are not assigned a
function at all.
Interestingly, many mRNAs encoding different ribosomal
proteins were found to be strongly up- or down-regulated. This
might be connected to the role of heat shock proteins in the
biosynthesis of ribosomes (29) and should be investigated
further. Hansen et al. (30) found that the rRNA concentrations
are subject to heat shock regulation. We did not find a
significant regulation of the 16S rRNA, but it is likely that a
significant change in rRNA concentration will not occur after
only 15 min of heat shock.
The large number of up-regulated genes coding
for ribosomal proteins is conspicuous. Eight of these 10 genes
are part of the S10 operon, which consists of 35 genes
(MPN164-MPN198), most of them coding for ribosomal proteins.
All eight genes are located at the beginning of the operon,
giving the impression that their transcription is regulated
differently from that of the remaining genes of this
operon. So far, there is no experimental evidence as to whether
these ribosomal proteins have additional functions under heat
shock conditions beyond ribosome formation, which could
explain this observation. Orthologs of the genes up-regulated
in M.pneumoniae were also found in transcriptome
analyses of other bacterial species (31-35), for instance
the RNA polymerase sigma-70 factor (sigA, MPN352),
transport systems, individual ribosomal proteins, tRNA
synthetases, surface-exposed proteins (lipoproteins) or
restriction-modification enzymes. The increase in transcription of
genes involved in transcription, translation, transport or energy
metabolism could be the response to the increasing demands
of a heat-stressed bacterium, while the increase in
transcription of genes connected with restriction-modification has been
interpreted as a defense strategy against intruding DNA (34)
and the synthesis of lipoproteins as a possibility to modify the
cell surface (35). Of course, this is highly speculative and
needs experimental support from knock-out
mutants in the genes of interest or from gene modifications that
allow regulated gene expression.
Since no experiments have been done so far with
M.pneumoniae to understand the regulation of gene expression after heat shock, one depends on the experimental results
of other bacteria. Several global transcription analyses in
response to growth temperature variation from various
bacterial species exist, e.g. E.coli (31), group A
Streptococcus (32), B.subtilis (33), Campylobacter jejuni
Figure 3. Correlation between expression at 32°C and expression in heat shock. Gray circles, genes that are not significantly expressed in either of the tested
experimental conditions; black circles, genes that are significantly expressed, but do not show significant differences between the two profiles as tested with
the U-test. Genes up-regulated during heat shock: light-red circles (P < 0.01), dark-red circles (P < 0.001); genes significantly down-regulated during heat
shock: light-blue circles (P < 0.01), dark-blue circles (P < 0.001).
(34) or Borrelia burgdorferi (35). Because of the close
phylogenetic relationship and the extensive studies done,
B.subtilis seems to be the best choice for comparison. Heat
shock genes in this bacterium are assigned to four different
classes depending on their mode of transcriptional regulation.
Class I genes possess the CIRCE element (1,33), a palindromic
sequence located between the start codon of the gene and the
promoter sequence, which is regulated by the repressor HrcA.
Class II genes depend on the alternative sigma factor σB and
class III genes are mainly controlled by the repressor CtsR. All
other heat shock genes, which do not belong to any of the
described classes, were assigned to class IV or class U,
since their mechanisms of transcriptional regulation are
unknown.
The heat shock-regulated genes found in M.pneumoniae
only fit into either class I or class IV (U). In M.pneumoniae,
the genes MPN021 (dnaJ), MPN332 (lonA), MPN434 (dnaK)
and MPN531 (clpB) possess the conserved CIRCE element,
and since the repressor HrcA (MPN124) is also present, we
can assume that these genes are regulated accordingly.
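A search for a CIRCE-like inverted repeat can be sketched as a simple pattern match. The consensus used below (TTAGCACTC, a 9-nucleotide spacer, GAGTGCTAA) is the one commonly given in the literature for Gram-positive bacteria and is not quoted in this paper; the toy sequence is invented for illustration, and real elements often deviate from the consensus, so exact matching is a deliberate simplification.

```python
import re

# Assumption: commonly cited CIRCE consensus, left arm plus 9-nt spacer plus right arm.
CIRCE_LEFT = "TTAGCACTC"
SPACER_LENGTH = 9


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


# The right arm is the reverse complement of the left arm (inverted repeat).
CIRCE_RIGHT = reverse_complement(CIRCE_LEFT)
CIRCE_PATTERN = re.compile(
    CIRCE_LEFT + "[ACGT]{%d}" % SPACER_LENGTH + CIRCE_RIGHT
)


def find_circe(sequence):
    """Return 0-based start positions of exact consensus matches."""
    return [m.start() for m in CIRCE_PATTERN.finditer(sequence.upper())]


# Hypothetical upstream region, constructed to contain one consensus element.
toy_upstream = "ATATAT" + "TTAGCACTC" + "AAATTTGGG" + "GAGTGCTAA" + "GGAGGTATA"
print(find_circe(toy_upstream))
```

A realistic scan would tolerate mismatches in both arms and restrict the search to the region between promoter and start codon, as described above for class I genes.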
Based on similarity searches, neither the alternative sigma
factor σB nor the repressor CtsR could be found in
M.pneumoniae, indicating that the genes are not regulated
according to the class II and class III rules. Therefore, we have
no explanation for the regulation of transcription of the
remaining heat shock-induced genes in M.pneumoniae. Analysis
of promoters with statistical methods did not reveal any heat
shock gene-specific sequences, and the only other candidate
gene for an alternative sigma factor besides σA, MPN626,
which shows some similarity with σD from B.subtilis (24), has
not been analyzed experimentally. However, we did not find a
transcriptional induction of this gene under the applied heat
shock conditions, making a role in heat shock regulation
unlikely.
The complete data set can be retrieved at
http://www.zmbh.uni.heidelberg.de/M_pneumoniae/transcriptome/.
Statistical evaluation of the data
Two approaches (Student's t-test and Mann-Whitney U-test)
were applied to test for significant differences in gene
expression between the two profiles (32°C versus heat shock).
Student's t-test is a parametric test assuming both the
normality of the data and a constant variance for a given
gene for each microarray. Since both of these assumptions are
questionable in the case of microarrays (11), we decided to use
the non-parametric alternative, which does not make such
assumptions. A common problem with non-parametric
alternatives is the large number of type II errors, i.e. false
negatives. These tests, while not making any assumptions
about the distribution of the samples, are weaker than their
parametric counterparts. An extremely simple (and also very
weak) non-parametric test used in evaluating microarray data
is the min/max separation (12), where two sample groups are
assumed to differ significantly if the lowest value in one group is
larger than the highest value in the other group. We have
applied the stronger Mann-Whitney U-test for unpaired
samples, which gives statistical significance for all the
genes that also show a positive min/max separation.
Finally, one has to apply a low significance threshold,
otherwise a high rate of type I errors (false positives) will be
Table 2. Absolute expression of genes in different expression profiles
Expression during heat shock
MPNa Eb
Annotationc
Expression at 32°C
MPNa Eb
Annotationc
Expression at 37°C
MPNa Eb
Annotationc
531
2.80
053
3.08
376
2.75
Hypothetical
099
053
2.73
2.62
099
146
2.99
2.59
099
024
2.72
2.56
624
2.35
024
2.53
144
376
574
024
2.27
2.22
2.20
2.11
144
376
624
665
665
2.06
Adhesin P1 precursor homolog
Hypothetical
Heat shock protein GroES
DNA-directed RNA polymerase
delta subunit (rpoE)
Elongation factor TU (tuf)
688
1.95
331
573
1.95
1.93
ParA family of ATPases involved
in chromosome partition
Trigger factor (tig)
Heat shock protein GroEL
169
1.91
132
065
ATP-dependent protease
binding subunit (clpB) homolog
Adhesin P1 precursor homolog
Phosphocarrier protein HPr
(ptsH)
Ribosomal protein L28 (rpL28)
Phosphocarrier protein HPr
(ptsH)
Adhesin P1 precursor homolog
Conserved hypothetical
053
2.53
2.45
2.37
2.28
2.12
DNA-directed RNA polymerase
delta subunit (rpoE)
Adhesin P1 precursor homolog
Hypothetical
Ribosomal protein L28 (rpL28)
Elongation factor TU (tuf)
146
665
144
564
2.49
2.29
2.28
2.12
132
2.06
Adhesin P1 precursor homolog
392
2.08
491
2.05
Membrane nuclease
303
2.04
Adhesin P1 precursor homolog
DNA-directed RNA polymerase
delta subunit (rpoE)
Phosphocarrier protein HPr
(ptsH)
Conserved hypothetical
Elongation factor TU (tuf)
Adhesin P1 precursor homolog
Similar to NADP-dependent
alcohol dehydrogenase
Pyruvate dehydrogenase E1-beta
subunit (pdhB)
Pyruvate kinase (pyk)
370
392
1.98
1.92
574
331
2.03
2.02
Heat shock protein GroES
Trigger factor (tig)
Ribosomal protein S19 (rpS19)
688
1.89
624
1.98
Ribosomal protein L28 (rpL28)
1.90
1.81
Adhesin P1 precursor homolog
Cytidine deaminase (cdd)
600
685
1.83
1.81
491
386
1.95
1.95
370
219
392
1.79
1.77
1.75
412
126
281
1.77
1.71
1.70
471
302
501
1.94
1.92
1.91
Membrane nuclease
Deoxyguanosine kinase/
deoxyadenosine kinase
Ribosomal protein L33 (rpL33)
6-phosphofructokinase (pfk)
Hypothetical
491
412
1.74
1.72
Adhesin P1 precursor homolog
Ribosomal protein L11 (rpl11)
Pyruvate dehydrogenase E1-beta
subunit (pdhB)
Membrane nuclease
Adhesin P1 precursor homolog
Adhesin P1 precursor homolog
Pyruvate dehydrogenase E1-beta
subunit (pdhB)
ParA family of ATPases involved
in chromosome partition
ATP synthase alpha chain (atpA)
Sulfate transport ATP-binding
protein (cysA)
Adhesin P1 precursor homolog
Conserved hypothetical
Conserved hypothetical
303
181
1.68
1.67
Pyruvate kinase (pyk)
Ribosomal protein L18 (rpL18)
606
688
1.91
1.87
Enolase (eno)
ParA family of ATPases
involved in chromosome
partition
(a) The MPN ORF number according to Dandekar et al. (20).
(b) Normalized gene expression as measured with microarrays.
(c) The annotation of the ORF according to Dandekar et al. (20).
expected. In spite of the differences in approach and type of
statistics used, both tests (Student's t-test and Mann-Whitney
U-test) gave strikingly similar results, which confirms our
findings.
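The relation between the min/max separation and the Mann-Whitney U-test described above can be made concrete with a small sketch. It computes the U statistic and an exact two-sided P value by enumerating all possible group assignments, which is feasible only for small samples. The replicate values and the group size of five are hypothetical and chosen for illustration; the function names are not from the original analysis.

```python
from itertools import combinations


def u_statistic(a, b):
    """Number of pairs (x in a, y in b) with x > y; ties count one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def exact_p_two_sided(a, b):
    """Exact two-sided permutation P value for the Mann-Whitney U statistic."""
    pooled = list(a) + list(b)
    n, m = len(a), len(b)
    u_obs = u_statistic(a, b)
    extreme_obs = min(u_obs, n * m - u_obs)
    as_extreme = 0
    total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(n + m) if i not in chosen]
        u = u_statistic(group_a, group_b)
        if min(u, n * m - u) <= extreme_obs:
            as_extreme += 1
        total += 1
    return as_extreme / total


def minmax_separated(a, b):
    """True if every value of one group exceeds every value of the other."""
    return min(a) > max(b) or min(b) > max(a)


# Hypothetical normalized signals, five replicates per condition.
heat_shock = [2.6, 2.9, 3.1, 2.8, 2.7]
control = [0.5, 0.9, 0.8, 1.0, 0.6]
overlap_a = [1.2, 0.7, 1.5, 0.9, 1.1]
overlap_b = [1.0, 0.8, 1.3, 0.6, 1.4]

p_separated = exact_p_two_sided(heat_shock, control)
p_overlap = exact_p_two_sided(overlap_a, overlap_b)
print(minmax_separated(heat_shock, control), p_separated)
print(minmax_separated(overlap_a, overlap_b), p_overlap)
```

When two groups are min/max separated, U takes its most extreme value, so the U-test returns the smallest P value attainable for the given group sizes. Whether that value passes a chosen threshold therefore depends on the number of replicates.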
Reliability of the data
When evaluating quantitative data from RNA analysis, one
has to keep in mind that there is no direct method for specific
quantification of the minute amounts of mRNAs present in
bacterial cells, and that any method used to achieve this aim is
inevitably burdened with different sources of error (10). Both
microarray technology and Q-PCR quantification rely on an
exact and homogeneous cDNA synthesis. However, cDNA
synthesis itself depends on conditions which might, or
might not, be gene specific.
Radioactively labeled cDNA for microarray analysis was
synthesized using gene-specific primers. In theory, this should
yield exact data, provided that there is no interaction among
the 676 primers added to the reaction, and that the primers
show equivalent hybridization kinetics. When these
conditions are met, similar amounts of cDNA will be synthesized
per copy of mRNA, independently of the gene.
When primers cross-hybridize, however, a series of unspecific
signals will be detected in the array experiment. To determine
the severity of this problem in our experimental set-up, we
have done control microarray experiments using a limited
number of gene-specific primers. We have synthesized cDNA
with a set of 41 gene-specific primers to detect any trend for
significant cross-hybridization (data not shown). Aside from
the expected 41 ORFs, at least 87 different ORFs showed
significant signals. The unspecifically primed cDNA
might be the result of at least two factors. Both the low
temperature of 37°C during reverse transcription and the
unfractionated primer mix might contribute to this undesired
effect. The low reaction temperature supports unspecific
hybridization of primers, and the n - 1 fragments in the primer
mix serve as alternate primers, binding randomly all over the
RNA templates. Furthermore, genes with significant sequence
similarities, like those which contain repetitive DNA
sequences (19), might cause problems due to false positive
signals (cross-hybridization). In these cases, a protein analysis
by mass spectrometry with partial amino acid sequences will
provide more reliable data (36).
It is worth mentioning that a number of genes did not give
significant transcription signals, but their translation products
could be identified by mass spectrometry (36).
The overall trends (i.e. high or low expression) seen in the
data derived from microarrays were con®rmed by the more
variable data derived from the real-time PCR analysis and the
regulation seen in the case of microarray-derived data could be
Figure 5. RT-PCR results. Total RNA from M.pneumoniae grown at 32°C and from heat shock-treated cells was reverse transcribed with a mix of 688
gene-specific primers. The cDNA was used in a 24 cycle PCR. The synthesized DNA was loaded onto an agarose gel and stained with ethidium bromide.
PCR results are shown in pairs (left, 32°C; right, heat shock) for each selected gene. The results were divided into three categories based on human judgment
of the ethidium bromide staining intensity: (a) genes which showed a heat shock-induced up-regulation in the RT-PCR, (b) genes which showed no regulation
in the RT-PCR and (c) genes which showed a heat shock-induced down-regulation in the RT-PCR. The results were compared with the statistically evaluated
microarray signals. The microarray data are represented as: ns, genes which showed no statistically significant regulation in the array analysis; -, genes which
showed statistically significant down-regulation in the array analysis.
Figure 4. Expression map of M.pneumoniae. Color scale represents the normalized gene expression. Black boxes represent non-coding RNA genes (tRNAs,
ribosomal RNAs, etc.). Genes that were not subject to the microarray analysis are white. White squares above the genome position scale represent the
repetitive regions.
Table 3. Copy numbers of mRNA and rRNA in a 1 µg total RNA preparation (thousands) as acquired by the Q-PCR analysis
MPN | Annotation (a) | Heat shock mean | Heat shock SD | 32°C mean | 32°C SD | 37°C mean | 37°C SD
140 | DHH family phosphoesterases | 395 | 144 | 496 | 100 | 720 | 635
288 | Conserved hypothetical | 650 | 229 | 4780 | 3818 | 1868 | 1716
449 | Conserved hypothetical | 157 | 70 | <1 | 68 | 160 | 26
449 | Conserved hypothetical | 354 | 73 | 701 | 257 | 328 | 108
376 | Hypothetical | 42 064 | 17 783 | 28 032 | 6767 | 34 726 | 27 561
434 | DnaK | 71 090 | 9058 | 13 392 | 4911 | 21 397 | 4983
333 | Hypothetical | 8297 | 8772 | 1385 | 301 | 2837 | 1879
352 | Sigma-70 factor family | 2966 | 1378 | 5978 | 3339 | 1228 | 785
531 | ATP-dependent protease binding subunit (clpB) homolog | 201 478 | 155 193 | 7569 | 2185 | 34 286 | 17 080
- | 16S RNA | 3.04E+07 | 1.15E+07 | 4.82E+07 | 1.50E+07 | 3.76E+07 | 5.45E+06
(a) Annotation according to Dandekar et al. (20).
confirmed. It is probable that the microarray technology is less
affected by the varying efficiency of the cDNA reaction and
by the overall amount of cDNA used, because only relative, and
not absolute, amounts of cDNA are measured.
Microarray signal and mRNA concentration
The results of a microarray analysis are always relative to the
total acquired signal. Therefore, they give no information
about the absolute number of mRNA molecules found in a cell
or per 1 µg of total RNA. Ribosomal RNA is sometimes used
for normalizing the data, but it has been shown that rRNA in
E.coli is also regulated during heat shock conditions (30).
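Why a signal that is relative to the total acquired signal carries no absolute information can be seen in a small sketch. Expressing each spot as a percentage of the total is used here as a simplified stand-in for the normalization actually applied; the gene names and raw intensities are hypothetical.

```python
def percent_of_total(signals):
    """Express each raw signal as a percentage of the summed signal."""
    total = sum(signals.values())
    return {gene: 100.0 * value / total for gene, value in signals.items()}


# Hypothetical raw spot intensities from one hybridization.
raw = {"geneA": 1200.0, "geneB": 300.0, "geneC": 1500.0}

# The same sample with twice as much labeled cDNA on the array.
doubled = {gene: 2.0 * value for gene, value in raw.items()}

relative_raw = percent_of_total(raw)
relative_doubled = percent_of_total(doubled)
print(relative_raw)
print(relative_doubled)
```

Both inputs give identical percentages, so the relative profile is insensitive to the overall amount of cDNA, while the absolute number of molecules cannot be recovered from it without an external calibration such as Q-PCR.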
Thus, to give the qualitative microarray data absolute
values, it is necessary to apply different techniques, such as
Q-PCR. Although this method did not yield very precise
results, it was possible to find a correlation between the
measured copy number of mRNA molecules and microarray
signal strength. From these data, we were able to calculate
copy numbers of individual mRNA species or 16S rRNA in a
total RNA preparation and to correlate these numbers with
genome equivalents and cell numbers. The main problems
with these quantitative analyses concern the low numbers of
individual mRNA or 16S rRNA species per bacterium
(Table 3). Assuming that the 200 µg of total RNA, which
we isolate from a standard growth culture, consisted only
of the three ribosomal RNA species, approximately
800 molecules of each rRNA species would be calculated for a
single cell. In E.coli, the preponderance (80%) of total RNA is
ribosomal RNA, while mRNA amounts to only 4% of the total
RNA (37). Adapting these data for M.pneumoniae, it is
obvious that the experimentally derived number of
approximately 80 16S rRNA molecules per cell is too low by a factor
of 10. The experimental result for the clpB mRNA is even
lower, as we find only 0.1 copy per cell. These data represent
the concentrations of the mRNA species which are present at a
given time in the cell population. There are several potential
sources of error (10), like the calculation of the cell number in our
preparation, loss of RNA during isolation, inefficient cDNA
synthesis and the Q-PCR itself. In addition, the cells in our
culture most probably differ in growth phase. Besides
not being synchronized, M.pneumoniae grows in clumps
attached to the plastic surface, providing different
microenvironments for individual cells. The contribution of errors
from the different sources is not equal, meaning that one might
account for a greater error than another. We think that the
estimation of cell numbers is conservative, although the CCU
test, which we used, is not precise as M.pneumoniae grows in
clumps. For this reason, the number of cells in the last dilution
could deviate from the real number by at least a factor of 10.
Loss of RNA during isolation in the range of 90% seems
unlikely, since M.pneumoniae is surrounded by a cytoplasmic
membrane only. This facilitates a quick lysis of the cell and
inactivation of cellular RNase activities. Nevertheless, some
loss of RNA during an RNA extraction is inevitable, though
difficult to quantify. Therefore, we conclude that cDNA
synthesis is the most crucial step of all, and it is difficult to
optimize for all mRNA species in a total RNA preparation.
Several parameters in cDNA synthesis are difficult to control,
like the efficiency of priming, as well as the dissociation of
secondary structures within mRNA species. These two
parameters are linked. To standardize the parameters, we
applied two primer types in the microarray experiments
(random hexamers and ORF-specific primers) and compared
the results.
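The per-cell estimates discussed above can be retraced as an order-of-magnitude check. The sketch below asks how many complete rRNA sets per cell would be present if all 200 µg of RNA were rRNA, and compares this with the 16S rRNA copy number measured at 32°C (Table 3). The average nucleotide mass and the approximate rRNA lengths are assumptions that are not taken from the paper, so the result should only be read as a rough figure.

```python
AVOGADRO = 6.022e23          # molecules per mole
TOTAL_RNA_G = 200e-6         # 200 micrograms of total RNA per flask, in grams
TOTAL_RNA_UG = 200.0         # the same amount in micrograms
CELLS_PER_FLASK = 1.2e11     # from 2.4e11 genome equivalents, two genomes per cell

# Assumptions (not from the paper): average mass per nucleotide and
# approximate lengths of the three rRNA species in nucleotides.
NT_MOLAR_MASS = 330.0        # g/mol per nucleotide
RRNA_LENGTHS_NT = {"16S": 1500, "23S": 2900, "5S": 120}


def rrna_sets_per_cell():
    """Upper bound: rRNA sets per cell if total RNA were rRNA only."""
    set_molar_mass = sum(RRNA_LENGTHS_NT.values()) * NT_MOLAR_MASS
    sets_per_flask = TOTAL_RNA_G / set_molar_mass * AVOGADRO
    return sets_per_flask / CELLS_PER_FLASK


def measured_copies_per_cell(copies_per_ug):
    """Convert a measured copy number per microgram RNA into copies per cell."""
    return copies_per_ug * TOTAL_RNA_UG / CELLS_PER_FLASK


upper_bound = rrna_sets_per_cell()
measured_16s = measured_copies_per_cell(4.82e10)  # Table 3, 32°C, 4.82E+07 thousand

print(f"rRNA sets per cell if all RNA were rRNA: {upper_bound:.0f}")
print(f"measured 16S rRNA copies per cell:       {measured_16s:.0f}")
```

With these assumed inputs the upper bound comes out at several hundred sets per cell, the same order of magnitude as the approximately 800 quoted in the text, while the measured value is close to 80, which illustrates the roughly tenfold gap discussed above.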
Our finding with the random primers was that, although the
yield of transcribed cDNA was higher, the specificity of
reverse transcription of gene probes, i.e. mRNA species, was
lower than with gene-specifically primed cDNA. This
led to weak signals and a high background on the microarrays.
The ORF-specific primers eliminated these technical problems.
The type of primers used in cDNA synthesis and the
experience with optimal results vary among laboratories, and
the issue seems controversial in the literature. Some authors favor
hexamers, others longer gene-specific primers. Arfin et al. (13)
presented a comparison between hexamer and oligo priming.
They identified more mRNA species with hexamer priming
and recommended the use of such primers. In our experiments,
we tested both types of primers and the results with hexamers
were less successful. Since hexamer priming also labels the
rRNA, which comprises ~90% of the total RNA, we thought
this might be the reason for our high background problem. It
was also important for us to use gene-specific oligonucleotides
with respect to experiments aimed at studying the interaction of
M.pneumoniae with human lung epithelial cells. In such an
experimental set-up, we would also have to deal with large
amounts of RNA from the epithelial cells aside from the
ribosomal RNA.
All the ORF-specific primers were used for the synthesis of
PCR products from DNA; therefore, we assume that they also
prime during cDNA synthesis. However, although all the
primers have a similar melting temperature, it is impossible to
provide conditions which are optimal for the cDNA synthesis
of all mRNA species. Therefore, comparing signals from
mRNA of different genes of cells grown in identical
conditions could be misleading about the actual concentration of
these mRNAs, since a similar signal strength does not
necessarily indicate a similar number of mRNA molecules.
In contrast, gene-specific signals derived from cells grown
under different conditions can be compared with confidence,
since the conditions for all the parameters, except the mRNA
copy numbers, are kept constant.
In summary, for a precise determination of mRNA copy
numbers it would be important to develop methods which
allow measuring the mRNA directly without the additional
step of cDNA synthesis.
Figure 6. Results of Q-PCR analysis of nine different M.pneumoniae genes.
(a) Correlation between the absolute copy number per 1 µg of a total RNA
isolation, and the microarray signal (percentage). Different symbols correspond to the three experimental conditions used. (b) Correlation of the differences in copy number and differences in microarray signal strength between
the three different conditions. Correlation coefficients are given in the
legend. Different symbols correspond to the three comparisons possible
among three different experimental conditions, as explained in the box.
ACKNOWLEDGEMENTS
We thank E. Pirkl for excellent technical assistance and
J. Hoheisel and N. Hauser for help with the preparation of
nylon membranes. This research was supported by grants from
the Deutsche Forschungsgemeinschaft, the Graduiertenkolleg
'Pathogene Mikroorganismen: Molekulare Mechanismen und
Genome' and by the Fonds der Chemischen Industrie.
REFERENCES
1. Hecker,M., Schumann,W. and Volker,U. (1996) Heat-shock and general
stress response in Bacillus subtilis. Mol. Microbiol., 19, 417-428.
2. Velculescu,V.E., Zhang,L., Vogelstein,B. and Kinzler,K.W. (1995)
Serial analysis of gene expression. Science, 270, 484-487.
3. Williams,K.L. and Hochstrasser,D.F. (1997) Introduction to the
Proteome. In Wilkins,M., Williams,K., Appel,R. and Hochstrasser,D.
(eds), Proteome Research: New Frontiers in Functional Genomics.
Springer Verlag, Heidelberg, Germany, pp. 1-11.
4. Chee,M., Yang,R., Hubbell,E., Berno,A., Huang,X.C., Stern,D.,
Winkler,J., Lockhart,D.J., Morris,M.S. and Fodor,S.P. (1996) Accessing
genetic information with high-density DNA arrays. Science, 274,
610-614.
5. Cho,R.J., Fromont Racine,M., Wodicka,L., Feierbach,B., Stearns,T.,
Legrain,P., Lockhart,D.J. and Davis,R.W. (1998) Parallel analysis of
genetic selections using whole genome oligonucleotide arrays. Proc. Natl
Acad. Sci. USA, 95, 3752-3757.
6. Devaux,F., Marc,P. and Jacq,C. (2001) Transcriptomes, transcription
activators and microarrays. FEBS Lett., 498, 140-144.
7. Hauser,N.C., Vingron,M., Scheideler,M., Krems,B., Hellmuth,K.,
Entian,K.D. and Hoheisel,J.D. (1998) Transcriptional profiling on all
open reading frames of Saccharomyces cerevisiae. Yeast, 14, 1209-1221.
8. Velculescu,V.E., Zhang,L., Zhou,W., Vogelstein,J., Basrai,M.A.,
Bassett,D.E.,Jr, Hieter,P., Vogelstein,B. and Kinzler,K.W. (1997)
Characterization of the yeast transcriptome. Cell, 88, 243-251.
9. Conway,T. and Schoolnik,G.K. (2003) Microarray expression profiling:
capturing a genome-wide portrait of the transcriptome. Mol. Microbiol.,
47, 879-889.
10. Hatfield,G.W., Hung,S.P. and Baldi,P. (2003) Differential analysis of
DNA microarray gene expression data. Mol. Microbiol., 47, 871-877.
11. Sekowska,A., Robin,S., Daudin,J.J., Henaut,A. and Danchin,A. (2001)
Extracting biological information from DNA arrays: an unexpected link
between arginine and methionine metabolism in Bacillus subtilis.
Genome Biol., 2, RESEARCH0019.
12. Beissbarth,T., Fellenberg,K., Brors,B., Arribas-Prat,R., Boer,J.,
Hauser,N.C., Scheideler,M., Hoheisel,J.D., Schutz,G., Poustka,A. and
Vingron,M. (2000) Processing and quality control of DNA array
hybridization data. Bioinformatics, 16, 1014-1022.
13. Arfin,S.M., Long,A.D., Ito,E.T., Tolleri,L., Riehle,M.M., Paegle,E.S. and
Hatfield,G.W. (2000) Global gene expression profiling in Escherichia
coli K12. The effects of integration host factor. J. Biol. Chem., 275,
29672-29684.
14. Görg,A., Obermaier,C., Boguth,G., Harder,A., Scheibe,B., Wildgruber,R.
and Weiss,W. (2000) The current state of two-dimensional
electrophoresis with immobilized pH gradients. Electrophoresis, 21,
1037-1053.
15. Siuzdak,G. (1994) The emergence of mass spectrometry in biochemical
research. Proc. Natl Acad. Sci. USA, 91, 11290-11297.
16. Shevchenko,A., Jensen,O.N., Podtelejnikov,A.V., Sagliocco,F.,
Wilm,M., Vorm,O., Mortensen,P., Boucherie,H. and Mann,M. (1996)
Linking genome and proteome by mass spectrometry: large-scale
identification of yeast proteins from two dimensional gels. Proc. Natl
Acad. Sci. USA, 93, 14440-14445.
17. Wilm,M. (2000) Mass spectrometric analysis of proteins. Adv. Protein
Chem., 54, 1-30.
18. Peng,J., Elias,J.E., Thoreen,C.C., Licklider,L.J. and Gygi,S.P. (2003)
Evaluation of multidimensional chromatography coupled with tandem
mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the
yeast proteome. J. Proteome Res., 2, 43±50.
19. Himmelreich,R., Hilbert,H., Plagens,H., Pirkl,E., Li,B.C. and
Herrmann,R. (1996) Complete sequence analysis of the genome of the
bacterium Mycoplasma pneumoniae. Nucleic Acids Res., 24, 4420±4449.
6320
Nucleic Acids Research, 2003, Vol. 31, No. 21
20. Dandekar,T., Huynen,M., Regula,J.T., Ueberle,B., Zimmermann,C.U.,
Andrade,M., Doerks,T., Sanchez-Pulido,L., Snel,B., Suyama,M.,
Yuan,Y.P., Herrmann,R. and Bork,P. (2000) Re-annotating the
Mycoplasma pneumoniae genome sequence: adding value, function and
reading frames. Nucleic Acids Res., 28, 3278±3288.
21. Razin,S. and Jacobs,E. (1992) Mycoplasma adhesion. J. Gen. Microbiol.,
138, 407±422.
22. Razin,S., Yogev,D. and Naot,Y. (1998) Molecular biology and
pathogenicity of mycoplasmas. Microbiol. Mol. Biol. Rev., 62,
1094±1156.
23. Baseman,J.B., Lange,M., Criscimagna,N.L., Giron,J.A. and Thomas,C.A.
(1995) Interplay between mycoplasmas and host target cells. Microb.
Pathog., 19, 105±116.
24. Bornberg-Bauer,E. and Weiner,J.,III (2002) A putative transcription
factor inducing mobility in Mycoplasma pneumoniae. Microbiology, 148,
3764±3765.
25. Hay¯ick,L. (1965) Tissue cultures and mycoplasmas. Tex. Rep. Biol.
Med., 23, Suppl 1, 285+.
26. Gragerov,A., Nudler,E., Komissarova,N., Gaitanaris,G.A.,
Gottesman,M.E. and Nikiforov,V. (1992) Cooperation of GroEL/GroES
and DnaK/DnaJ heat shock proteins in preventing protein misfolding in
Escherichia coli. Proc. Natl Acad. Sci. USA, 89, 10341±10344.
27. Mogk,A., Tomoyasu,T., Goloubinoff,P., Rudiger,S., Roder,D.,
Langen,H. and Bukau,B. (1999) Identi®cation of thermolabile
Escherichia coli proteins: prevention and reversion of aggregation by
DnaK and ClpB. EMBO J., 18, 6934±6949.
28. Konieczny,I. and Liberek,K. (2002) Cooperative action of Escherichia
coli ClpB protein and DnaK chaperone in the activation of a replication
initiation protein. J. Biol. Chem., 277, 18483±18488.
29. El Hage,A., Sbai,M. and Alix,J.H. (2001) The chaperonin GroEL and
other heat-shock proteins, besides DnaK, participate in ribosome
biogenesis in Escherichia coli. Mol. Gen. Genet., 264, 796±808.
30. Hansen,M.C., Nielsen,A.K., Molin,S., Hammer,K. and Kilstrup,M.
(2001) Changes in rRNA levels during stress invalidates results from
mRNA blotting: ¯uorescence in situ rRNA hybridization permits
renormalization for estimation of cellular mRNA levels. J. Bacteriol.,
183, 4747±4751.
31. Richmond,C.S., Glasner,J.D., Mau,R., Jin,H. and Blattner,F.R. (1999)
Genome-wide expression pro®ling in Escherichia coli K-12. Nucleic
Acids Res., 27, 3821±3835.
32. Smoot,L.M., Smoot,J.C., Graham,M.R., Somerville,G.A.,
Sturdevant,D.E., Migliaccio,C.A., Sylva,G.L. and Musser,J.M. (2001)
Global differential gene expression in response to growth temperature
alteration in group A Streptococcus. Proc. Natl Acad. Sci. USA, 98,
10416±10421.
33. Helmann,J.D., Wu,M.F., Kobel,P.A., Gamo,F.J., Wilson,M.,
Morshedi,M.M., Navre,M. and Paddon,C. (2001) Global transcriptional
response of Bacillus subtilis to heat shock. J. Bacteriol., 183, 7318±7328.
34. Stintzi,A. (2003) Gene expression pro®le of Campylobacter jejuni in
response to growth temperature variation. J. Bacteriol., 185, 2009±2016.
35. Ojaimi,C., Brooks,C., Casjens,S., Rosa,P., Elias,A., Barbour,A.,
Jasinskas,A., Benach,J., Katona,L., Radolf,J., Caimano,M., Skare,J.,
Swingle,K., Akins,D. and Schwartz,I. (2003) Pro®ling of temperatureinduced changes in Borrelia burgdorferi gene expression by using whole
genome arrays. Infect. Immun., 71, 1689±1705.
36. Ueberle,B., Frank,R. and Herrmann,R. (2002) The proteome of the
bacterium Mycoplasma pneumoniae: comparing predicted open reading
frames to identi®ed gene products. Proteomics, 2, 754±764.
37. Neidhardt,F., Ingraham,J.L. and Schaechter,M. (1990) Physiology of the
Bacterial Cell. Sinauer Associates, Inc., Sunderland, MA.