Microbiology Comment

IVET experiments in Pseudomonas fluorescens reveal cryptic promoters at loci associated with recognizable overlapping genes

Despite the vast amount of useful data that
has come from the use of bioinformatic
and genomic technologies, there are
intrinsic limitations to the type of
information that can be obtained simply
by standard computational analysis of
DNA sequences. Bioinformatic tools must
be ‘trained’ to look for sequence features
associated with genes or specific motifs.
Thus, computational approaches may fail
to find expressed sequences that differ
significantly from those characterized
previously. For example, Wong et al. (2000)
found a gene, specifying a 41 residue
protein that affected cell-wall-synthesis
inhibition, within a 602 bp region of the
Escherichia coli genome that had been
annotated as being an ‘intergenic’ sequence.
Genome arrays designed to include all
annotated ORFs of a particular genome
sequence will fail to identify novel expressed
sequences that have escaped the annotation
process. Examples include the growing
number of regulatory RNA molecules that
are being identified in prokaryotes. These
limitations do not seriously diminish the
value of genomic approaches, but do
indicate that genomics should be just
one of a number of tools used to study
complex systems.
A complementary approach to identify
and understand functional genes in
bacteria – even in complex environments –
is the promoter trapping strategy IVET
(in vivo expression technology) (Mahan
et al., 1993; Osbourn et al., 1987).
Promoters active in the wild, but inactive
under laboratory conditions, can be
isolated on a genome-wide scale on the
basis of their ability to drive expression
of a gene that is essential for growth. IVET
(and various derivatives) has been widely
used to examine the genes induced in
pathogens during infection of hosts and
has also been used to identify genes
induced during colonization of plants,
of plant-pathogenic fungi (Lee & Cooksey,
2000), during infection of Arabidopsis
thaliana (Boch et al., 2002) and during
Rhizobium–legume symbiosis (Oke &
Long, 1999).
We have used IVET to identify
Pseudomonas fluorescens promoters (and
the genes they control) induced in complex
soil-based environments. Screening IVET
libraries constructed from genomes of
both SBW25 and Pf0-1 has identified more
than 50 P. fluorescens promoters (Gal et al.,
2003; Rainey, 1999; M. W. Silby & S. B.
Levy, unpublished). Analysis of the
trapped DNA sequences shows that most
contain a recognizable promoter oriented
in the correct direction with respect to
transcription (Fig. 1a). However, 10 out
of 22 fusions in Pf0-1 lack a discernible
promoter and are organized such that the
captured DNA has a known gene oriented
opposite to that necessary for transcription
of the promoterless reporter (Fig. 1b).
That these fusion strains do indeed possess
‘trapped’ promoters and are not merely
false-positive isolates is demonstrated by
showing increased survival of such strains
in the wild relative to a negative control.
Various groups have reported IVET
fusions that lie in the opposite
orientation to annotated gene(s) (Camilli
& Mekalanos, 1995; Mahan et al., 1993;
Rainey, 1999; Wang et al., 1996).
Although only one such fusion was
reported for SBW25 in a screen for
rhizosphere-activated promoters (Rainey,
1999), approximately 20 % of fusions
recovered in that screen fell into the
‘opposite orientation’ class but were not
published (P. B. Rainey, unpublished). We
term these ‘cryptic fusions’ to reflect the
fact that the active promoters and the
sequences under their transcriptional
control (oriented correctly with respect
to the reporter genes) have not been
recognized.
There are two possible explanations
for the activity of the cryptic fusions.
First, the result is an artefact of the system
and the promoter-like sequences that are
trapped and driving expression of reporters
in the IVET constructs do not normally
function in their natural genomic context.
Second, the cryptic fusions reveal
previously unrecognized ‘cryptic’
promoters which have a purpose. While
the first possibility may be true for some
constructs, the frequency at which cryptic
fusions are isolated argues strongly in
favour of a genuine function. The existence
of these cryptic fusions with a recognizable
gene on the opposite strand suggests that,
while seldom predicted in genome
annotation, expressed sequences that
overlap extensively on opposite DNA
strands of prokaryotic genomes may be
more common than currently thought
(Fig. 1).
Functional cryptic promoters can
conceivably drive transcription with one
of two resulting outcomes – production
of a non-coding RNA molecule that is not
translated, or transcription of mRNA that
will subsequently direct production of a
protein. In the first case, the cryptic
promoter might normally be responsible
for the transcription of a regulatory RNA
molecule, which would be antisense to
the transcript of any oppositely oriented,
overlapping gene (Fig. 1b) and so control
its expression. There are an increasing
number of regulatory RNAs described
among the prokaryotes, adding to the
complexity of the current view of gene
regulation (Johansson & Cossart, 2003;
Masse et al., 2003). Given the emerging
view that regulatory RNA molecules are
important for environmental adaptation
(Repoila et al., 2003), it is conceivable that
the ‘cryptic’ promoter drives transcription
of an environment-specific regulatory
molecule to allow appropriate modulation
of activity of the target gene in a given
environment, in a metabolically affordable
manner. An antisense molecule would be
the simplest possibility, but the diversity
of regulatory RNAs means that a more
broadly active regulator cannot be ruled
out. That the unknown sequences might be
involved in antisense regulation has been
suggested (Osorio & Camilli, 2003).

Fig. 1. Organization of IVET fusions recovered in screens for P. fluorescens genes
induced in soil and rhizosphere environments. (a) The genomic insert contains the promoter (and partial ORF) of a ‘known’ gene induced in natural environments, fused to
promoterless IVET reporter genes. Interrogation of DNA sequence databases reveals
similar sequences or similarity to predicted protein-coding regions. All similar sequences
read from left to right, which is the arrangement required to drive expression of the
reporters. (b) Gene structure of ‘cryptic’ IVET fusions. Our data indicate that the IVET
reporters are expressed, confirming the existence of previously unrecognized ‘cryptic’
promoters. The sequences whose expression is driven by the ‘cryptic’ promoters overlap
those of previously characterized (‘known’) genes. Analysis of genome sequence data
shows the presence of an ORF, suggesting overlapping protein-coding sequences. The
overlapping ORFs show varied arrangements. Some cryptic fusion ORFs start and finish
within the opposite ORF, some start within but finish beyond the start of the opposite
ORF, while others start downstream of the opposing ORF and terminate within it. The
broken box at the 5′ end of the ‘known gene’ indicates that the fusion constructs do
not necessarily all contain the 5′ end of the known genes. (c) A specific example of an
iiv (induced in vivo, i.e. soil induced) ORF overlapping a gene annotated in the Pf0-1
genome sequence, as it appears in the Pf0-1 genome. ‘CP’ indicates the cryptic promoter upstream of the ORF. The asterisk represents the fusion point between this
sequence and the IVET reporter genes in the original fusion construct. The Pflu1358
sequence (NCBI annotation of the Pf0-1 genome sequence; COG1012: predicted
NAD-dependent aldehyde dehydrogenase) begins 393 bp beyond the end of the iiv5
ORF. The iiv5 ORF begins 6 bp upstream of the Pflu1358 termination codon. Thus,
there is 1032 bp of overlapping sequence that potentially specifies two proteins. Using
SoftBerry software (http://www.softberry.com/berry.phtml), −35 and −10 boxes and a
transcriptional start site were predicted 84, 60 and 44 bp upstream of the iiv5 ORF,
respectively. A putative rho-independent terminator is predicted 180 bp downstream of
the iiv5 ORF.
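The overlap arrangements described for the cryptic fusions in Fig. 1(b) can be expressed as a simple coordinate check. The following Python sketch is illustrative only: the `Orf` class, the function names, the category labels and all coordinates are hypothetical and are not taken from the Pf0-1 annotation. It assumes each ORF is described by the genome coordinates of its first and last coding base, so that an ORF on the reverse strand has a start greater than its end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    """Hypothetical ORF record: start is the 5' end, end is the 3' end."""
    start: int
    end: int

    @property
    def low(self) -> int:
        return min(self.start, self.end)

    @property
    def high(self) -> int:
        return max(self.start, self.end)

    @property
    def strand(self) -> str:
        # Forward-strand ORFs read towards higher coordinates.
        return "+" if self.start <= self.end else "-"


def overlap_length(a: Orf, b: Orf) -> int:
    """Number of base pairs shared by the two ORFs (0 if they are disjoint)."""
    return max(0, min(a.high, b.high) - max(a.low, b.low) + 1)


def classify(known: Orf, cryptic: Orf) -> str:
    """Name the arrangement of a cryptic ORF relative to an opposing known ORF."""
    if known.strand == cryptic.strand:
        raise ValueError("expected ORFs on opposite strands")
    if overlap_length(known, cryptic) == 0:
        return "no overlap"
    starts_inside = known.low <= cryptic.start <= known.high
    ends_inside = known.low <= cryptic.end <= known.high
    if starts_inside and ends_inside:
        return "starts and finishes within"
    if starts_inside:
        return "starts within, finishes beyond the start"
    if ends_inside:
        return "starts downstream, finishes within"
    return "spans the whole ORF"


# Hypothetical coordinates, for illustration only.
known_gene = Orf(100, 1000)
print(classify(known_gene, Orf(900, 300)), overlap_length(known_gene, Orf(900, 300)))
```

The three non-trivial labels correspond to the three arrangements named in the legend. The final "spans the whole ORF" case is not described there and is included only so that the function returns something for every input.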
If the cryptic promoters drive production
of mRNA, the findings would indicate the
existence of a number of overlapping
protein-coding genes that run in opposite
directions. Although overlapping genes
of this nature are thought to be rare in
prokaryote genomes (Rogozin et al., 2002),
examination of sequences available from
the draft genome of Pf0-1 reveals the
presence of an ORF in the same orientation
as the reporter gene in each ‘cryptic’ fusion
construct, and on the opposite strand to a
gene that gives a clear hit in BLAST searches
(see Fig. 1c, for example). A closer
examination shows that these ORFs range
in size from 213 to 1908 bp and the
predicted translation products do not
match any known or hypothetical protein
in the public databases. In terms of codon
usage, the mean differences when compared
to the P. fluorescens codon-usage table
(http://www.kazusa.or.jp/codon/cgi-bin/showcodon.cgi?species=Pseudomonas+fluorescens+[gbbct])
range from 14.81 to 27.47 %, with the
shorter ORFs tending
to have the greater mean difference.
While the codon usage may appear to be
considerably different from that expected
for P. fluorescens, the fact that known
genes from Pf0-1 also show high mean
differences suggests that such differences
are not significant. For example, RecA,
DapB, GlnA and FlgB have mean
differences of 14.33, 11.84, 14.69 and
16.8 %, respectively. What has led to the
retention of such gene arrangements
through evolution is unknown, but the
potentially large numbers of such loci
could indicate a functional significance.
For example, there may be a competition
between the two genes for optimal
transcription, the outcome of which is
determined by environmental factors.
An alternative speculation is that there is
some special relationship between the
protein products of the overlapping
genes. Regardless, the correlation of
overlapping ORFs with the ‘cryptic’ fusions
is striking. When considered alongside the
fact that the IVET reporter is expressed in
these strains, this arrangement suggests
hitherto unrecognized genes that are
active only under natural environmental
conditions (or repressed under laboratory
conditions).
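The codon-usage comparison mentioned above reports a mean difference between an ORF and the P. fluorescens reference table, but the text does not state how that figure was calculated. The Python sketch below therefore rests on one assumed definition, which may differ from the metric actually used: each codon is scored as a percentage of the most frequent synonymous codon for its amino acid, and the absolute differences between ORF and reference are averaged over the 61 sense codons. The function names and the toy sequence are hypothetical.

```python
from collections import Counter

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def codon_counts(seq: str) -> Counter:
    """Count in-frame codons of a coding sequence (trailing bases are ignored)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def relative_adaptiveness(counts) -> dict:
    """Score each sense codon as a % of the most used synonymous codon."""
    scores = {}
    for aa in sorted(set(AMINO) - {"*"}):
        family = [c for c in CODONS if CODON_TABLE[c] == aa]
        top = max(counts.get(c, 0) for c in family)
        for codon in family:
            scores[codon] = 100.0 * counts.get(codon, 0) / top if top else 0.0
    return scores


def mean_difference(orf_counts, ref_counts) -> float:
    """Assumed metric: mean absolute difference in relative adaptiveness (%)."""
    orf = relative_adaptiveness(orf_counts)
    ref = relative_adaptiveness(ref_counts)
    return sum(abs(orf[c] - ref[c]) for c in orf) / len(orf)


# Toy input for illustration; a real comparison would use a full reference table.
toy_orf = codon_counts("ATGGCTGCTGCC")
toy_ref = Counter({"ATG": 1, "GCT": 1, "GCC": 1})
print(round(mean_difference(toy_orf, toy_ref), 3))
```

Under this assumed definition, codons that a short sequence never uses score zero, so short ORFs will tend to give larger values by construction. That is a property of the sketch and should not be read as an explanation of the trend reported in the text.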
Boch, J., Joardar, V., Gao, L., Robertson, T. L.,
Lim, M. & Kunkel, B. N. (2002). Identification of
Pseudomonas syringae pv. tomato genes induced
during infection of Arabidopsis thaliana.
Mol Microbiol 44, 73–88.
affecting intrinsic susceptibility to certain
inhibitors of peptidoglycan synthesis. Mol
Microbiol 37, 364–370.
DOI 10.1099/mic.0.26871-0
Camilli, A. & Mekalanos, J. J. (1995). Use of
recombinase gene fusions to identify Vibrio
cholerae genes induced during infection. Mol
Microbiol 18, 671–683.
Gal, M., Preston, G. M., Massey, R. C.,
Spiers, A. J. & Rainey, P. B. (2003). Genes
encoding a cellulosic polymer contribute
toward the ecological success of Pseudomonas
fluorescens SBW25 on plant surfaces. Mol Ecol
12, 3109–3121.
Johansson, J. & Cossart, P. (2003).
While considerable effort is still
required to fully understand and appreciate
the significance of these findings, we are
moved to communicate the results for two
reasons. First, to alert other researchers
using similar screens not to simply discard
such sequences in the belief that they
represent false-positive findings. Second,
and perhaps more importantly, these
findings serve as a timely reminder of the
ongoing potential for discovery using
genetic approaches. Genetics and genomics
should progress side by side to maximize
the potential of both.
Acknowledgements
We are grateful to Laura McMurry
for comments on the manuscript. This
work was supported by a grant from
the Department of Energy to S. B. L.
(DE-FG02-97ER62493) and a BBSRC
(UK) Research Fellowship to P. B. R.
Mark W. Silby¹, Paul B. Rainey²,³ and Stuart B. Levy¹,⁴

¹ Center for Adaptation Genetics and Drug Resistance, Department of Molecular Biology and Microbiology, Tufts University School of Medicine, Boston, MA 02111, USA
² Department of Plant Sciences, University of Oxford, South Parks Road, Oxford OX1 3RB, UK
³ School of Biological Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand
⁴ Department of Medicine, Tufts University School of Medicine, Boston, MA 02111, USA

Correspondence: Stuart B. Levy ([email protected])
RNA-mediated control of virulence gene
expression in bacterial pathogens. Trends
Microbiol 11, 280–285.
Lee, S. W. & Cooksey, D. A. (2000). Genes
expressed in Pseudomonas putida during
colonization of a plant-pathogenic fungus.
Appl Environ Microbiol 66, 2764–2772.
Mahan, M. J., Slauch, J. M. & Mekalanos, J. J.
(1993). Selection of bacterial virulence genes
that are specifically induced in host tissues.
Science 259, 686–688.
Masse, E., Majdalani, N. & Gottesman, S.
(2003). Regulatory roles for small RNAs in
bacteria. Curr Opin Microbiol 6, 120–124.
Oke, V. & Long, S. R. (1999). Bacterial
genes induced within the nodule during the
Rhizobium-legume symbiosis. Mol Microbiol
32, 837–849.
Osbourn, A. E., Barber, C. E. & Daniels, M. J.
(1987). Identification of plant-induced genes of
the bacterial pathogen Xanthomonas campestris
pathovar campestris using a promoter-probe
plasmid. EMBO J 6, 23–28.
Osorio, G. & Camilli, A. (2003). Hidden
dimensions of Vibrio cholerae pathogenesis.
ASM News 69, 396–401.
Rainey, P. B. (1999). Adaptation of
Pseudomonas fluorescens to the plant
rhizosphere. Environ Microbiol 1, 243–257.
Repoila, F., Majdalani, N. & Gottesman, S.
(2003). Small non-coding RNAs, co-ordinators
of adaptation processes in Escherichia coli: the
RpoS paradigm. Mol Microbiol 48, 855–861.
Rogozin, I. B., Spiridonov, A. N., Sorokin, A. V.,
Wolf, Y. I., Jordan, I. K., Tatusov, R. L. & Koonin,
E. V. (2002). Purifying and directional selection
in overlapping prokaryotic genes. Trends Genet
18, 228–232.
Wang, J., Mushegian, A., Lory, S. & Jin, S.
(1996). Large-scale isolation of candidate
virulence genes of Pseudomonas aeruginosa by
in vivo selection. Proc Natl Acad Sci U S A 93,
10434–10439.
Wong, R. S., McMurry, L. M. & Levy, S. B.
(2000). ‘Intergenic’ blr gene in Escherichia
coli encodes a 41-residue membrane protein
Microbiology 150