Prokaryotic genome regulation: multifactor promoters, multitarget

REVIEW ARTICLE
Prokaryotic genome regulation: multifactor promoters, multitarget
regulators and hierarchic networks
Akira Ishihama
Department of Frontier Bioscience, Hosei University, Tokyo, Japan
Correspondence: Akira Ishihama,
Department of Frontier Bioscience, Hosei
University, Koganei, Tokyo 184-8584, Japan.
Tel./fax: 181 42 387 6231; e-mail:
[email protected]
Received 13 February 2010; revised 8 April
2010; accepted 9 April 2010.
Final version published online 18 May 2010.
DOI:10.1111/j.1574-6976.2010.00227.x
Editor: Juan L. Ramos
MICROBIOLOGY REVIEWS
Keywords
transcription factor; multifactor promoter;
global regulator; regulation network; Genomic
SELEX (systematic evolution of ligands by
exponential enrichment); Escherichia coli.
Abstract
The vast majority of experimental data have been accumulated on the transcription regulation of individual genes within a single model prokaryote, Escherichia
coli, which form the well-established on–off switch model of transcription by
DNA-binding regulatory proteins. After the development of modern highthroughput experimental systems such as microarray analysis of whole genome
transcription and the Genomic SELEX search for the whole set of regulation
targets by transcription factors, a number of E. coli promoters are now recognized
to be under the control of multiple transcription factors, as in the case of
eukaryotes. The number of regulation targets of a single transcription factor has
also been found to be more than hitherto recognized, ranging up to hundreds of
promoters, genes or operons for several global regulators. The multifactor
promoters and the multitarget transcription factors can be assembled into
complex networks of transcription regulation, forming hierarchical networks.
Introduction
The RNA polymerase holoenzyme of Escherichia coli is
composed of a multisubunit core enzyme with subunit
composition a2bb 0 o, and one of seven species of promoter
recognition subunit s (Helmann & Chamberlin, 1988;
Gross et al., 1992, 1998; Ishihama, 2000) (Fig. 1). The
growing E. coli cells contain about 2000 molecules of the
core enzyme per genome equivalent of DNA (Ishihama,
2000), which is less than the total number of about 4450
genes on the E. coli K-12 genome, and a combined number
of 2000–3000 molecules of all seven s subunits in various
ratios depending on environmental conditions (Ishihama,
2000). In addition, approximately 270 species of DNAbinding regulatory proteins, herein referred to as transcription factors, and 20–30 species of RNA polymerase-binding
transcription factors are involved in the control of RNA
polymerase functions (Ishihama, 2009). Altogether, about
8% of the E. coli genes are involved in transcription and
regulation (Table 1) (Ishihama, 2009). To function, the
DNA-associated transcription factors interact with DNAbound RNA polymerase subunits, and thus RNA polymerase and transcription factors together form the transcription
apparatus on the template DNA. Some transcription factors,
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
however, influence transcription by modulating the local
conformation of DNA, such as induction of DNA curvature.
The distribution of RNA polymerase on the genome is
determined after interaction with DNA-associated transcription factors (Ishihama, 2000). As the intracellular
concentration and activity of each transcription factor
change depending on external conditions and internal
metabolic states, the localization pattern of transcription
factors changes, modulating the distribution of transcription apparatus on the genome. The transcription pattern of
the genome is therefore controlled in response to environmental conditions (Ishihama, 2000, 2009; Perez-Rueda &
Collado-Vides, 2000).
With the advances in genome sequencing technology, the
complete genome sequence has been determined for
4 1000 bacterial strains. The complete genome sequence of
E. coli has been determined for 4 10 different strains. From
the complete genome sequence, the whole set of proteincoding sequences on the E. coli genome has been predicted
(Hayashi et al., 2006; Riley et al., 2006), but the molecular
functions of gene products remain unidentified for about
half of the genes, even for the best-characterized model
prokaryote E. coli. No short cut for theoretical procedure is
FEMS Microbiol Rev 34 (2010) 628–645
629
Prokaryotic genome regulation
Fig. 1. Functional differentiation of Escherichia coli RNA polymerase.
RNA polymerase core enzyme with RNA synthesis activity is assembled
sequentially in the order of a1a ! a2 ! a2b ! a2bb 0 (Ishihama,
1990, 1999). Subunit o is a molecular chaperon and plays a role in
folding of the largest subunit b 0 (Ghosh et al., 2001). One of seven
species of the s subunit binds to the core enzyme, leading to the
formation of seven species of holoenzyme with promoter recognition
and transcription activities. About 300 species of transcription factors
modulate the promoter selectivity of RNA polymerase. When bound to
DNA targets near promoters, most of the DNA-binding transcription
factors interact with either a or s subunits to function (class-I and class-II
factors) (Ishihama, 1992, 1993; Ebright, 1993; Busby & Ebright, 1999).
Some factors directly associate with either b or b 0 subunits in the absence
of DNA (class-III and class-IV factors).
Table 1. Proteins involved in genome functions
Cell machinery
Genome architecture
Nucleoid
Replication machinery
Recombination-repair machinery
Transcription apparatus
RNA synthesis machinery
Promoter recognition factors
DNA-binding transcription factors
RNA polymerase-associated regulators
Total
No. of components
8
32
36
4
14
261
25
380
available to identify functions of uncharacterized individual
genes and proteins. In parallel with the whole genome
sequencing, a variety of high-throughput techniques have
been developed recently and employed to reveal the expression of the whole set of genes on the genome under given
culture conditions. For instance, high-throughput microarray analysis has made a breakthrough in the provision of
transcription patterns (or the transcriptomes) for a large
number of bacterial strains and under a variety of conditions
(Steinmetz & Davis, 2004). On a proteomic level, the highresolution two-dimensional PAGE system has also elucidated the genome expression patterns at protein level (the
proteomes). In addition, in the case of model organism E.
coli, large amounts of data on the regulation of individual
genes have been accumulated, which are being collected in
FEMS Microbiol Rev 34 (2010) 628–645
databases (Baumbach et al., 2006; Salgado et al., 2006).
These data have been integrated, together with the transcriptome, proteome and metabolome data obtained by
modern high-throughput techniques, to produce comprehensive models of the transcriptional regulation networks in
E. coli (Babu et al., 2004; Barabasi & Oltvai, 2004; GamaCastro et al., 2008).
Using the high-throughput procedures, the genome expression pattern has been analyzed for wild type and various
mutant E. coli strains growing under various conditions.
However, the mechanism of how the genome expression
pattern is determined still remains unresolved. One possible
approach is to identify the location of the transcription
apparatus within the genome and to monitor the dynamic
changes in its distribution. For this purpose, chromatin
immunoprecipitation combined with microarray hybridization (ChIP-chip) was developed and has been widely used to
identify the location of DNA-binding proteins on the
eukaryotic genomes (Ren et al., 2000; Lee et al., 2002;
Harbison et al., 2004; Bulyk, 2006). To a certain extent, the
ChIP-chip method has been found to be useful for prokaryotes (Grainger et al., 2005), although there remains a
problem with its successful application for prokaryotes.
One of the goals for postgenomic research, often referred
to as systems biology, is to understand the organization of all
the regulatory networks in the transcription of the genome
in a single organism. The target of modern molecular
genetics has thus shifted from regulation of individual genes
to genome regulation as a whole. This review provides an
overview of the recent progress on the regulation of genome
transcription, focusing on the regulatory roles and networks
of all transcription factors in a single model E. coli organism.
Genome expression pattern
Bacteria constantly monitor extracellular physicochemical
conditions so that they can respond by modifying their
genome expression pattern. In bacteria, transcription initiation is a major step of regulation of the genome expression,
although mRNA abundance is also regulated at the step of
transcription elongation and termination. mRNA degradation
is also subject to control and, furthermore, increasing data
indicate the involvement of translational control of mRNA in
E. coli through various mechanisms, including the interference
of mRNA translation by regulatory RNAs and proteins.
The genome-wide expression pattern has been analyzed
for various E. coli strains growing under various conditions,
such as alterations in nutrients (Khodursky et al., 2000; Oh
et al., 2002; Kao et al., 2005; Overton et al., 2006), increased
or decreased temperature (Phadtare & Inouye, 2004), exposure to oxidative stress (Zheng et al., 2001), addition of
polyamines (Terui et al., 2007) and metals (Lee et al., 2005),
under anaerobic (Salmon et al., 2003; Constantinidou et al.,
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
630
2006; Overton et al., 2006) or acidic conditions (Masuda &
Church, 2003), and within biofilms (Ren et al., 2004). The
pattern of genome transcription is altered through modulation of the gene selectivity of RNA polymerase after interactions with transcription factors (Ishihama, 2000, 2009). Two
groups of the regulatory protein are involved in the modulation of gene selectivity of E. coli RNA polymerase: seven
species of the s factor (the promoter recognition subunit)
and about 300 species of the transcription factor (Ishihama,
2000, 2009) (Fig. 1). The s factors are essential for RNA
polymerase to recognize promoters and transcription initiation. The gene selectivity of RNA polymerase is mainly
determined by the s factors (Helmann & Chamberlin,
1988; Ishihama, 1997, 2000, 2009; Gruber & Gross, 2003;
Paget & Helmann, 2003; Gourse et al., 2006; Typas et al.,
2007). Within a group of the promoters recognized by one s
factor, the order of transcription level is determined primarily by the strength of the promoter. The promoter strength
is, however, subject to modulation by the species of transcription factor, which associates with the target DNA,
usually located near promoters, and modulates transcription from the promoters. The knowledge of the regulatory
mode of each promoter by these transcription factors is
needed to understand the molecular basis of the genome
regulation in the whole cell. At present, however, we have
only fragmentary knowledge even of the best-characterized
model prokaryote E. coli. The regulatory function is not
known at all for approximately a third of about 300
transcription factors from E. coli.
Combining the microarray-based high-throughput technology with ordinary genetic analysis systems of E. coli
permits the identification of whole sets of genes whose
expression depends on the functions of each of the transcription factors (Bulyk, 2006). For instance, microarray
analyses have been performed for a number of E. coli
mutants, each lacking a specific transcription factor such as
ArcA (Liu & de Wulf, 2004), CRP (Zheng et al., 2004), EvgA
(Masuda & Church, 2002; Eguchi et al., 2003), Fim (Schembri et al., 2002), Fis (Bradley et al., 2007), FNR (Salmon
et al., 2003; Overton et al., 2006), IHF (Arfin et al., 2000),
LexA (Fernandez de Henestrisa et al., 2000), LrhA (Lehnen
et al., 2002), Lrp (Hung et al., 2002), ModE (Tao et al.,
2005), NalL/NalP (Overton et al., 2006), NsrR (Bodenmiller
& Spiro, 2006) and SdiA (Wei et al., 2001), or overproducing
a specific transcription factor such as MarA/SoxS/Rob
(Martin & Rosner, 2002), EvgA (Masuda & Church, 2002)
and NsrR (Rankin et al., 2008). Microarray technology
produces gene expression data of E. coli on a genome scale
for an endless variety of conditions. However, the gene set
affected by depletion of one specific regulator gene or after
overproduction of one specific transcription factor, does not
represent the regulation targets under the direct control of
the test transcription factor but instead includes large
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
A. Ishihama
numbers of genes, which are affected indirectly due to the
change in the expression level of direct target genes. Generally, the direct targets represent only minor fractions of the
genes detected by microarray analysis, because often the
genes for transcription factors are under the control of other
transcription factors, together forming cascades of the
transcription factor network (see below).
Transcription factors
The pattern of genome transcription is determined by
controlling the distribution of a limited number of RNA
polymerase among about 4450 genes on the E. coli genome
(Ishihama, 2000, 2009). As noted above, two groups of
regulatory proteins associate with the RNA polymerase core
enzyme to modulate its gene selectivity, i.e. the s factors at
the first step and the transcription factors at the second step,
together forming the transcription apparatus (Ishihama,
1999, 2000, 2009). Complete genomic sequencing has
allowed the prediction of the full repertoire of transcription
factors in E. coli using standard computational sequence
analysis techniques. About 300 species of transcription
factor have been identified in E. coli (Perez-Rueda &
Collado-Vides, 2000; Babu & Teichmann, 2003; Ishihama,
2009). Approximately 270–280 species of the transcription
factor are sequence-specific DNA-binding proteins. When
bound to target DNA sites, the DNA-binding transcription
factors interact directly with promoter-associated RNA
polymerase to function (Ishihama, 1992, 1993; Ebright,
1993; Busby & Ebright, 1999). To facilitate the frequent and
quick exchange of RNA polymerase-interacting regulators,
the affinity of protein–protein interaction between RNA
polymerase and transcription factors must be weak. The
binding of transcription factors at specific target sites near
promoters is therefore necessary for effective protein–protein interaction by increasing the local concentration of
pairing proteins on DNA. After initiation of transcription,
the transcription factors are dissociated from the transcription apparatus. However, 20–30 species of transcription
factor associate directly with the RNA polymerase in the
absence of DNA (Ishihama, 2009). This group of transcription factors generally associates with b or b 0 subunits, even
during elongation, and controls attenuation, elongation and
termination (class-III and class-IV factors in Fig. 1).
Generally, transcription factors are composed of two
domains, one functioning as the sensor for external and
internal signals and the other as interacting DNA targets. In
prokaryotes, the helix-turn-helix (H-T-H) motif is the most
common element in the DNA-binding domain. Based on
the common motifs, E. coli transcription factors can be
classified into 54 families (Ishihama, 2009). One group of
transcription factors, known as negative regulators or repressors, acts by binding to target operators in the absence of
FEMS Microbiol Rev 34 (2010) 628–645
631
Prokaryotic genome regulation
low-molecular weight effectors, known as inducers. Promoters under the control of repressors are inactive in the
presence of repressors, but become active once the repressors are dissociated from target DNA after association with
the inducers. In contrast, another group of transcription
factors, known as positive regulators or activators, require
interaction with effector ligands to function. Repressors and
activators are interconvertible depending on the position of
DNA binding relative to promoters (Ishihama, 2000, 2009).
Generally, repressor-type transcription factors bind upon or
downstream of promoters to interfere with the binding of
RNA polymerase to promoter or its elongation along
template; however, in several cases, upstream-bound transcription factors repress transcription initiation by interfering with promoter escape due to strong protein–protein
contact with RNA polymerase (Yamamoto & Ishihama,
2003). Conversely, activators bind upstream and, in a few
specific cases, downstream of promoters, supporting stable
association of RNA polymerase to promoters (class-I transcription factors) or of promoter DNA opening (class-II
transcription factors) (Fig. 1).
The active conformation of transcription factors is generally a homodimer or homomultimeric oligomer (Wolberger, 1999). In concert with the symmetrical conformation of
transcription factors, their binding sites on DNA often
include palindromic sequences (see below). The regulation
of target promoter by a transcription factor depends on the
intracellular concentration of the transcription factor and its
affinity to the target DNA site. The affinity of protein–protein association increases upon binding to DNA. Cooperative binding to DNA targets reduces the noise that arises
from the binding of nonspecific proteins and increases the
sensitivity to regulation (Brent & Ptashne, 1981; Bundschuh
et al., 2003). The DNA-binding activity of transcription
factors is controlled by either interaction with effector
ligands or covalent modification, such as protein phosphorylation. The environmental conditions and/or cellular metabolic states influence both the activity of transcription
factors through these two pathways and the intracellular
concentration of transcription factors. In the case of prokaryotic transcription, transcription factors themselves sense
changes in extracellular environmental conditions and/or
intracellular metabolic states. For a small set of regulatory
systems, two functions are mediated by two different
proteins, sensors and response regulators. Of 300 species of
transcription factors in E. coli, about 10% are involved in
this kind of two-component system (TCS) (Oshima et al.,
2002; Yamamoto et al., 2005). The sensor kinases monitor
environmental conditions and autophosphorylate at their
conserved His resides, whereas the receiver domain of the
response regulators are phosphorylated at their conserved
Asp residues by the sensor kinase to function as transcription factors. Overall, the link between changes in environFEMS Microbiol Rev 34 (2010) 628–645
mental conditions and genome transcription involves
signal-transduction pathways through the generation of
effectors for modulation of the transcription factor activities
or a cascade of protein phosphorylation of transcription
factors.
As in the case of s factors (Jishage & Ishihama, 1997;
Maeda et al., 2000), the intracellular concentrations of
transcription factors are also subject to growth conditionor growth phase-dependent control. Using specific antibodies and quantitative immune-blot analysis, the intracellular
concentrations have been determined for 4 150 species of
transcription factors in E. coli (A. Kori & A. Ishihama,
unpublished data). Except for about 10 species of the global
regulator and the bifunctional nucleoid proteins with both
architectural and regulatory functions, the levels of transcription factors are o 100 molecules per cell under steadystate cell growth under laboratory culture conditions.
Search for regulation targets of
transcription factors: in vitro search for
transcription factor-binding sequences
The regulation targets under the direct control of a test
transcription factor cannot be identified simply by comparison of transcriptomes or proteomes between wild type and
mutants lacking the test transcription factor because the
majority of genes thus detected represent the set of genes
indirectly affected. One short cut for the identification of the
promoters, genes and operons under the direct control of a
test transcription factor is to determine the binding sites of
the test transcription factor on the genome. Identification of
the connections between transcription factors and DNAbinding sites represents a major bottleneck for modeling
transcriptional regulatory networks. Thus, the first step of a
bottom-up approach toward understanding the regulatory
network is to make the connection list of all the transcription factors and their DNA recognition motifs in the
genome. At present, only about two-thirds of the estimated
300 transcription factors in E. coli have been linked to at
least one regulation target gene in the genome. Surprisingly,
the regulatory roles have not been identified for about 100
species of the transcription factor, even for the best-characterized model organism E. coli (Table 2). Furthermore, for
about 200 species of the known transcription factor, only a
single or a fraction of regulation targets have been identified
and analyzed, but the whole sets of regulation targets have
not been identified for these transcription factors.
For a quick search of DNA sequences that are recognized
by DNA-binding proteins, the elegant systematic evolution
of ligands by exponential enrichment (SELEX) system was
developed, in which DNA–protein complexes were isolated
from mixtures of a test DNA-binding protein and synthetic
oligonucleotides of all possible sequences, followed by
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
632
A. Ishihama
Table 2. DNA-binding transcription factors in Escherichia coli
Family
No.
AlpA
AraC
1
28
ArgR
ArsR
AsnC
BirA
CadC
CaiF
CriR
Crl
Crp
DeoR
DicC
DnaA
DtxR
Fis
FlhC
FlhD
Fur
GntR
1
2
3
1
3
1
3
1
3
14
1
1
1
1
1
1
2
21
GutM
IclR
IleR
LacI
LexA
LysR
1
7
1
14
1
46
LytR
MalT
MarR
MerR
MetJ
ModE
MtlR
NadR
NagC
Nlp
NarL
NtrC
OgrK
OmpR
OraA
PhaN
PutA
RfaH
RpiR
RtcR
SorC
TdcR
TetR
TrpR
TyrR
Xre
Total
2
11
3
6
1
1
2
1
3
1
9
4
1
14
1
1
1
1
4
1
2
1
13
1
8
8
261
Characterized transcription factor
AlpA
Ada,AdiY,AppY,AraC,CelD,EnvY,FeaR,GadW,GadX,MarA,MelR,RhaR,
RhaS,Rob,SoxS,XylR,[YdeO,YfeG(EutR)]
ArgR
ArsR,[YgaV]
AsnC,Lrp
BirA
CadC
CaiF
CitB,CriR,DcuR
Crl
Crp,Fnr,YeiL
AgaR,DeoR,FrvR,FucR,GatR,GlpR,SgcR,SrlR,[YafY,YciT(DeoT),YjfQ(UlaR)]
DicC
DnaA
MntR
Fis
FlhC
FlhD
Fur,Zur
ExuR,FadR,FarR,GlcC,LctR,NanR,PdhR,PhnF,UxuR,[YgaE(GabC),YncC(McbR)]
GutM
IclR,KdgR,MhpR,[YiaJ]
AsnG,Cra,CytR,EbgR,GalR,GalS,GntR,IdnR,LacI,MalI,PurR,RbsR,TreR,
LexA
AllR,AllS,Cbl,CynR,CysB,DsdC,GcvA,HcaR,IciA,IlvY,LeuO,LrhA,LysR,
MetR,Nac,NhaR,OxyR,PerR,PssR,TdcA,XapR,[YeiE,YgiP(TtdR),YhcS(AaeR),YidZ]
BglJ,CsgD,MalT,RcsA,SciiA,[YhiE(GadE),YhiF,YjjQ,YqeH]
EmrR,MarR,SlyA
CueR,SoxR,ZntR,[YcgE]
MetJ
ModE
MtlR
NadR
Mlc,NagC
Nlp
EvgA,FimZ,NarL,NarP,RcsB,UhpA,UvrY
AtoC,GlnG,HydG,[YfhA(GlrR)]
OgrK
ArcA,BaeR,BasR,CpxR,CreB,CusR,KdpE,OmpR,PhoB,PhoP,QseB,RstA,TorR
OraA
PaaX
PutA
RfaH
RpiR,[YfeT(MurR)]
RtcR
YbcM,YdiP,YeaM,YfiE,YidL,YijO,
YkgA,YkgD,YpdC,YqhC
YbaO
YqeH,YqeI
YdjF,YfjR,YihW
YdcR,YdfH,YegW,YgbI,YhfR,
YidP,YieP,YidW,YihL,YjiM,YjiR
YagI,YfaX,YjhI
YjfA
YejW
YafC,YahB,YbbO,YbeF,YbhD,YcaN,
YcjZ,YdaK,YcdI,YdhB,YeaT,YeeY,YfeR,
YgfI,YhaJ,YhjC,YiaU,YjiE,YneL,YnfJ,YnfL
YehT,YpdB
YahA,YkgK
YcfQ,YehV
YggD
YphH
YgeK,YhjB
YedW
YebK,YfhH
YdeW,YjgS
TdcR
AcrR,BetI,EnvR,FadR,GusR,Ttk,[YcdC(RutR),YdhM(NemR)]
TrpR
DhaR,FhlA,HyfR,NorR,PrpR,PspF,TyrR
DicA,HipB,[YfgA(RodZ),YgiT]
159124 = 183
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
Uncharacterized transcription factor
YbiH,YbjK,YcfQ,YjdC,YjgJ
YgeV
YcjC,YdcN,YgjM,YiaG
102 24 = 78
FEMS Microbiol Rev 34 (2010) 628–645
633
Prokaryotic genome regulation
sequencing of protein-bound DNA fragments (Ellington &
Szostak, 1990; Tuerk & Gold, 1990). Typically, the starting
DNA library used for screening contained 4n different
sequences, where n represents the length of nucleotide
residues of the DNA probes. With an increasing the chain
length, however, it becomes difficult to solve the long probes
at the effective concentration needed for protein binding. To
overcome the solubility problem, mixtures of genome DNA
fragments can be used in place of synthetic oligonucleotide
mixtures because the binding sites of test transcription
factors are located on the E. coli genome (Singer et al.,
1997; Shimada et al., 2005). To search for regulation targets
of hitherto uncharacterized putative transcription factors as
well as to identify the entire set of targets of known
transcription factors, we have developed an improved
method of ‘Genomic SELEX’ (Fig. 2) and initiated a
systematic search for DNA sequences recognized by all 300
species of the DNA-binding transcription factor from E. coli.
For determination of the sequences of protein-bound SELEX DNA fragments, two procedures are employed, SELEXclos (cloning and sequencing) and SELEX-chip (mapping by
tiling array consisting of 22 000 species of 60-b-long oligonucleotide probe aligned at 160-bp intervals along the E. coli
genome; Fig. 2) (Roth et al., 2004). To date, the newly
developed ‘Genomic SELEX’ has been successfully employed
for identification of the recognition and binding sequences
by Cra (Shimada et al., 2005), RstA (Ogasawara et al.,
2007a), PdhR (Ogasawara et al., 2007b), RutR (renamed
from YcdC; Shimada et al., 2007), TyrR (Yang et al., 2007),
NemR (renamed from YdhM; Umezawa et al., 2008), AllR
(Hasegawa et al., 2008), CitB (Yamamoto et al., 2008), LeuO
(Shimada et al., 2009), AscG (Ishida et al., 2009) and Dan
(renamed from YgiP; Teramoto et al., 2010). After repetition
of Genomic SELEX, DNA sequences with high affinity to
test transcription factors are enriched and thus, in the
SELEX-clos method, the proportion of plasmid clones
carrying SELEX sequences with high affinity to the test
transcription factor increases, providing a list of the affinity
order for the test factors. Alternatively, the whole set of
factor-binding sequences can be obtained using the SELEXchip method (Fig. 3a). As the low-level peaks are unreliable,
the number of factor-binding peaks changes, depending on
the setting of the cut-off level of background pattern without
protein addition. Combination of the SELEX-clos and
SELEX-chip patterns provides not only a more reliable set
of regulation targets of the test transcription factor but also
the order of binding affinity between the predicted targets.
For Genomic SELEX to work in the search of regulation
targets by the hitherto uncharacterized regulators, the conditions under which the test transcription factors are active
need to be determined, as most of the uncharacterized
putative transcription factors are considered to be necessary
for expression of the genes for response to as yet unidentified environmental stresses in nature. In the absence of
required effector ligands such as inducers and corepressors
or specific reaction conditions, Genomic SELEX screening
yields mixtures of nonspecific sequences. In these cases, one
possible approach to identify specific effectors or conditions
for activation of transcription factors is the phenotype
microarray, in which the growth of E. coli mutants lacking
the genes for test transcription factors can be examined
under up to 2000 different conditions to monitor the
Fig. 2. Genomic SELEX search for recognition sequences by transcription factors. Starting from mixtures of oligonucleotides with all possible
sequences, SELEX was developed for quick screening of DNA sequences recognized by DNA-binding proteins (Ellington & Szostak, 1990; Tuerk & Gold,
1990). As the binding sites of Escherichia coli transcription factors are located on the E. coli genome, mixtures of sonicated genome DNA fragments of
200–300 bp in length were used in place of synthetic oligonucleotide mixtures for construction of Genomic SELEX system (Shimada et al., 2005). DNA
fragment mixtures isolated by Genomic SELEX screening are subjected to sequence analysis by SELEX-clos (cloning-sequencing) or SELEX-chip (DNA
tiling array) systems.
FEMS Microbiol Rev 34 (2010) 628–645
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
634
A. Ishihama
Fig. 3. Systematic search for RutR-binding sites on
the Escherichia coli genome. RutR is a global
transcription factor that regulates a number of
genes involved in the synthesis of pyrimidine and
arginine, the degradation of purines and
pyrimidines and the transport of glutamate
(Shimada et al., 2007). (a) The RutR-binding
sequences have been isolated from a mixture of E.
coli genome fragments using the Genomic SELEX
screening system, and the resulting RutR-associated
DNA fragment mixture was analyzed with DNA
tiling array (T. Shimada & A. Ishihama, unpublished
data). The list of RutR-binding sites agreed well with
that identified by the SELEX-clos
(cloning–sequencing) method. (b) The location of
RutR-binding sites in vivo on the genome of
growing E. coli cells was also analyzed after
DNA–protein cross-linking according to the
ChIP-chip procedure (Shimada et al., 2005). The
major RutR-binding sequences identified both in
vitro and in vivo are indicated.
utilization of various C, N, P and S sources, survival at
different pH ranges or different osmolality, and the sensitivity to various drugs and chemicals (Bochner, 2009).
Search for regulation targets of
transcription factors: in vivo search for
transcription factor-binding sites on the
genome
Search for regulation targets of
transcription factors: in silico search of
transcription factor-binding sites on the
whole genome
Traditional methods in molecular genetics have only been
successfully employed to identify a fraction of the transcription regulatory interactions (Martinez-Antonio & ColladoVides, 2003). Modern high-throughput methods such as
ChIP-chip have been developed to rapidly associate a number
of transcription factors with their cognate-binding sites in
the yeast genome (Ren et al., 2000), providing the genomescale interaction necessary to model the regulatory network.
As in the case of Genomic SELEX with uncharacterized
transcription factors, the growth conditions under which
the transcription factors are active need to be known before
ChIP-chip are conducted. Another requirement is the knowledge of the intracellular concentrations of test transcription
factors. The expression level of some transcription factors
was found to be too low for reliable detection with ChIP-chip
analysis (A. Kori & A. Ishihama, unpublished data).
Initial efforts to apply ChIP-chip to prokaryotes have
been made for identification of the localization on the E. coli
genome of individual components of the transcription
apparatus, such as RNA polymerase (Herring et al., 2005),
Fis, IHF and H-NS (Grainger et al., 2006), Fis (Cho et al.,
2006), NsrR (Rankin et al., 2008), RutR (Shimada et al.,
2008) and Lrp (Cho et al., 2008). Genomic SELEX screening
allows identification of the whole set of potential binding
sites for one specific transcription factor (Fig. 3a), while the
actual binding sites of the test transcription factor under
given culture conditions can be identified by ChIP-chip
analysis (Fig. 3b). For ChIP-chip analysis, cells must be
In silico recognition for transcription regulatory signals in
bacterial genomes is still a difficult problem of bioinformatics because of the lack of algorithms capable of making
reliable predictions. The initial computer analysis of transcription factor-binding sequences produces a huge number
of false positives. However, once the recognition and association sequences of transcription factors are established
after Genomic SELEX, the consensus sequence can be
deduced, which can then be used for the in silico search for
additional targets using the whole genome sequence (Fig. 4).
Comparative analysis of multiple genomes is one approach
for confirmation of transcription factor–DNA binding site
interactions (Tan et al., 2001, 2007). The comparative
approach is based on the assumption that sets of coregulated
genes are conserved in related bacteria. Computational
methods of phylogenetic footprinting have been applied to
the E. coli genome, allowing the discovery of many novel
transcription factor-binding sites (McCue et al., 2001, 2002).
Clustering of phylogenetic footprints has generated DNA
motif models for both unknown transcription factors and
many previously characterized transcription factors, yielding the sets of regulons (van Nimwegen et al., 2002; Qin
et al., 2003).
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
FEMS Microbiol Rev 34 (2010) 628–645
635
Prokaryotic genome regulation
treatment for application to prokaryotes (T. Shimada & A.
Ishihama, unpublished data).
Location of transcription factor-binding
sites on the genome
Fig. 4. The consensus sequences recognized by Escherichia coli transcription factors. The most simplified core sequences recognize and are
bound by some well-characterized DNA-binding transcription factors
from E. coli. The consensus sequences often consist of palindromic
sequences. For details see text.
treated with a reagent, typically formaldehyde, which creates
covalent cross-links between proteins and genome DNA. An
antibody specific for a protein of interest is then used to
immunoprecipitate protein-bound DNA fragments, which
are subsequently labeled in an amplification reaction and
hybridized to DNA microarrays to map the protein-bound
DNA fragments. Initially, the ChIP-chip system was developed with yeast and animal cultured cells and formaldehyde
treatment was performed for 15–20 min. Formaldehyde, a
highly toxic carbonyl compound, reacts like an electrophile
with the side-chains of arginine and lysine, resulting in the
formation of glycation end products, and causes protein–
protein and protein–DNA cross-links in vivo (Orlando et al.,
1997). As a stress response to formaldehyde treatment, the
distribution of transcription factors changes even during
formaldehyde treatment (Huyen et al., 2009). Attempts are
therefore being made to improve the ChIP-chip system to
minimize the time and concentration of formaldehyde
FEMS Microbiol Rev 34 (2010) 628–645
In sharp contrast to the eukaryotic genomes, noncoding
sections are limited in the prokaryotic genomes. In the case
of E. coli genome, for instance, 4 90% DNA sequence is used
for coding, noncoding sequences occupying o 10% (Hayashi
et al., 2006; Riley et al., 2006). Transcription factors analyzed
so far tend to bind to the noncoding intergenic regions. Even
for the bifunctional nucleoid proteins such as IHF and Fis,
approximately 50% are bound in vivo within intergenic
regions, as detected by ChIP-chip analysis (Grainger et al.,
2006). After extensive Genomic SELEX search of the binding
sites of 4 150 species of E. coli transcription factors so far
examined, the binding preference for noncoding spacer regions has been identified only for a specific set of transcription
factors (Shimada et al., 2008; A. Ishihama, unpublished data),
implying coding sequence-associated transcription factors
may play as yet unidentified regulatory role(s).
The spacing between transcription and translation start
sites in the E. coli genome mostly ranges up to 50 nucleotides, but a small number of E. coli genes carry longer
untranslated flanking sequences ranging up to about 300
nucleotides upstream from the translation start codon.
Among the transcription factors that bind to noncoding
intergenic regions, functional binding sites for a transcription factor are present both upstream and downstream of
transcription initiation sites. Gene expression data normally
allow clear separation of genes into those that are activated
and those that are repressed by a regulatory protein. One
reliable but simple criterion for this classification of transcription factors into activated and repressed subsets is the
location of their binding sites relative to that of the RNA
polymerase-binding site (or promoter) (Collado-Vides
et al., 1991; Ishihama, 2009). The determination of transcription factor-binding sites relative to promoters contributes to a better understanding of transcription regulation
of the respective promoters. Generally, positive factors bind
upstream from promoter –10, whereas negative factors bind
downstream from promoter –35.
For determination of the regulatory signals associated
with each E. coli promoter, a collection of about 2000
promoter assay vectors has been constructed, in which an
approximately 500-bp-long DNA fragment upstream of the
translation initiation codon was isolated from each gene and
inserted into GRP promoter assay vector carrying two
fluorescent protein reporters (Shimada et al., 2004). The
initiation codon of the promoter fragment was sealed to the
initiation codon of green fluorescent protein (GFP)-coding
sequence, and another red fluorescent protein RFP was
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
636
A. Ishihama
fused to a reference promoter lacUV5 in the same vector.
The involvement of test transcription factors in the regulation of the target promoters can be easily confirmed by
measuring the GFP/RFP ratio in mutants lacking the factor
gene or after overexpression of the test factor (e.g. see
Hasegawa et al., 2008; Umezawa et al., 2008).
Transcription factors are generally functional when
bound at either orientation relative to the RNA polymerase-binding site (Lobell & Schleif, 1991), possibly because
transcription factors form symmetric oligomers or induce
DNA looping to make contact with either the flexible a or s
subunits of the RNA polymerase (Teichmann & Babu, 2002;
Barnard et al., 2004; Harari et al., 2008; Ishihama, 2009).
tion factor. The binding sites of all these multiple factors are
located in a single and the same promoter (Fig. 5b). The
most typical examples of the multifactor promoter system
are the promoters for the genes encoding the master
regulator FlhCD for flagella formation and the master
regulator CsgD for biofilm formation. The complexity of
these two multifactor promoters reflects the two opposite
behaviors of bacterial survival, i.e. planktonic growth as
single cells and biofilm formation as a bacterial community,
in stressful conditions in nature.
Multifactor promoters: involvement of
multiple transcription factors in the
regulation of single promoters
A small number of transcription factors, termed ‘global’
regulators, influence the expression of a large number
of transcription units that belong to different metabolic pathways,
thereby
exhibiting
pleiotropic
phenotypes
(Gottesman, 1984; Perez-Rueda & Collado-Vides, 2000; Gutierrez-Rois et al., 2003). On the other hand, each of a large
number of ‘specific’ or ‘local’ transcription factors affects the
expression of one specific gene or a small number of transcription units (Martinez-Antonio & Collado-Vides, 2003;
Wei et al., 2004). A set of the genes or operons in E. coli are,
however, controlled by one and the same transcription factor,
together forming the regulon. This concept generated criticism of the current simple classification of transcription
factors into specific (local) and global regulators. The role of
both specific and global regulators is to mediate precise
activation or repression by sensing changes in environmental
conditions or metabolic states (Martinez-Antonio & ColladoVides, 2003). After Genomic SELEX screening of both known
and unknown transcription factors, including the well-characterized global regulators, more regulation targets were found
than hitherto recognized (A. Ishihama, unpublished data).
The factors regulating a single and the same specific promoter
are integrated into the complex regulatory network, promoting coherent transcriptional responses to environmental
changes. The systematic search for the binding sites of many
transcription factors also indicated that there are more genes
or operons organized in a single regulon under the control of a
common transcription factor than hitherto identified (A.
Ishihama, unpublished data). Regulons under the control of
global regulators (or multigene regulators) include a large
number of genes or operons, in which the genes for other
transcription factors are often included, together forming
hierarchic networks of transcription factors.
The number of genes or operons discovered with multiple
transcription initiation sites (and thus multiple promoters) is
increasing following detailed analysis of transcription regulation of the stress-response genes (Ishihama, 2009) and in silico
analysis of E. coli genome with newly developed programs to
search for promoters (Mendoza-Vargas et al., 2009). Often,
each promoter of the same gene or operon is recognized by a
different s factor, and thus it is difficult to detect all potential
promoters under a single culture conditions (Ishihama, 1999,
2000). If experiments for mRNA detection are carried out
under various stressful conditions, multiple promoters could
be identified in a single gene or operon of E. coli.
Among the set of promoters under the control of a single
and the same s factor, the level of transcription varies
depending on the culture conditions or the growth phase.
Multiple species of the transcription factor are involved in
this control of the promoter strength recognized by the same
s factor. The current promoter databases indicate that
approximately 50% of the E. coli promoters are under the
control of one specific regulator, whereas the other 50% of
genes are regulated by more than two transcription factors
(Saldago et al., 2004, 2006; Gama-Castro et al., 2008). After
Genomic SELEX search, however, we found that most of the
E. coli promoters carry the binding sites for multiple
transcription factors (A. Ishihama, unpublished data), each
factor monitoring a different environmental condition or
metabolic state (Fig. 5). The involvement of multiple
transcription factors fine-tunes genome transcription. For
instance, the expression of genes encoding metabolic enzymes is controlled by metabolites in the metabolic cycle in
which the enzymes participate, each metabolite being monitored by a specific transcription factor (Fig. 5a). Likewise,
the promoters for the genes involved in the construction of
cell structures are controlled by environmental conditions
and factors, each in turn monitored by a different transcrip2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
Global regulators: multitarget
transcription factors
Global regulators for carbon metabolism:
cAMP receptor protein (CRP)
CRP, also called catabolite gene activator protein CAP, was
the first purified transcription activator (Zubay et al., 1970)
FEMS Microbiol Rev 34 (2010) 628–645
637
Prokaryotic genome regulation
Fig. 5. Multifactor promoters. After extensive search for regulation targets by recently developed high-throughput procedures such as microarray
assays and Genomic SELEX screening, the number of transcription factors involved in regulation of single Escherichia coli promoters has been increasing
markedly. (a) Multifactor promoters involved in the genes for metabolism. (b) Multifactor promoters for the genes involved in cell structure formation.
FEMS Microbiol Rev 34 (2010) 628–645
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
638
and is the best-characterized global regulator from E. coli
(Kolb et al., 1993; Harman, 2001; Krueger et al., 2003). The
total number of promoters that have been identified to be
under the control of cAMP-CRP is now reaching to 100
(Gama-Castro et al., 2008), acting as activators or repressors
depending on the position of CRP binding (Kolb et al.,
1993). After Genomic SELEX searches, 4 200 potential
target genes have been identified (T. Shimada and A.
Ishihama, unpublished data). The CRP regulon includes a
large number of the genes encoding enzymes and transport
systems of sugars. In the absence of glucose, cAMP is
synthesized, which associates with CRP for its conversion
into the active regulator in the transcription of a number of
genes for transport and utilization of sugars other than
glucose.
The functional CRP protomer is composed of two
molecules of CRP, each being associated with cAMP. Binding of cAMP to its N-terminal domain leads to the activation of the C-terminal DNA-binding domain (Kim et al.,
1992; Passner & Steitz, 1997), of which the characteristic
H-T-H motif is responsible for interaction with a CRP-box
consisting of a palindromic TGTGAnnnnnnTCACA sequence associated with target promoters (Fig. 4) (Berg &
von Hippel, 1988). When CRP binds DNA, it induces DNA
bending of about 871 (Parkinson et al., 1996; Pyles & Lee,
1998; Lin & Lee, 2003). The DNA-bound CRP is the first
transcription factor that was identified to interact directly
with the promoter-bound RNA polymerase for function
(Ishihama, 1992, 1993; Busby & Ebright, 1999).
Global regulators for carbon metabolism:
catabolite repressor activator (Cra)
Carbon availability in the environment influences the expression pattern of a number of genes in E. coli in various ways.
The best-characterized mechanism involves cAMP-CRP. In
addition, a number of the genes for both glycolysis and
gluconeogenesis are under the control of Cra, initially characterized as FruR (fructose repressor) (Geerse et al., 1989).
Cra, a member of the GalR-LacI family, consists of two
functional domains, an N-terminal DNA-binding domain
with H-T-H motif and a C-terminal inducer-binding and
subunit–subunit contact domain. Cra controls transcription
of the genes in major pathways of carbon and energy
metabolism (Ramseier, 1996; Saier & Ramseier, 1996) by
playing a key role in the modulation of the direction of carbon
flow through the different metabolic pathways of energy
metabolism, but independently of cAMP-CRP. After Genomic
SELEX screening, we found more regulation targets of Cra
than previously identified, which act as activators of most of
the genes encoding enzymes for gluconeogenesis, tricarboxylic
acid cycle, and glyoxylate shunt pathway, and as a repressor of
the genes encoding the Entner–Doudoroff pathway and
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
A. Ishihama
glycolysis (Shimada et al., 2005). Derepression of the glycolysis
genes takes place when the repressor Cra is inactivated after
interaction with inducers such as D-fructose-1-phosphate and
D-fructose-1,6-bisphosphate. In the absence of these inducers,
Cra recognizes and binds to a Cra box consisting of the
TGAAACGTTTCA palindromic sequence (Fig. 4) (Negre
et al., 1996; Shimada et al., 2005). In the presence of glucose,
the intracellular concentration of the inducers increases,
which interacts with Cra to prevent its binding to the target
operons. On the other hand, the genes activated by Cra are
subject to regulation through the control of the Cra level.
Both CRP and Cra regulate the genes for carbon metabolism. Genomic SELEX screening revealed that a set of genes
are controlled by both CRP and Cra (T. Shimada and
A. Ishihama, unpublished data). Which regulator operates
under given conditions is determined by the intracellular
concentrations of the respective effectors, cAMP and phosphorylated fructose.
Global regulators for nitrogen
metabolism: regulator of pyrimidine
utilization (RutR)
RutR was originally identified as a repressor of the rut
operon encoding a set of enzymes for pyrimidine degradation for its reutilization as a nitrogen source (Loh et al.,
2006). In addition to the rut operon, we identified a number
of regulation targets of RutR after Genomic SELEX screening (Shimada et al., 2007), including the genes for purine
degradation, pyrimidine synthesis, and supply of glutamate
from the environment (see Fig. 2). RutR regulates the carAB
genes encoding the enzyme for the synthesis of carbamoylphosphate, the key substrate for the synthesis of pyrimidine
and arginine from glutamine. The carAB operon carries two
promoters: the downstream promoter responds to arginine
and is regulated by the arginine repressor ArgR, whereas the
upstream promoter responds to pyrimidine and is under the
control of IHF, Fis, PepA, PurR and RutR (Fig. 4a). In good
agreement with the key role of RutR in synthesis and
degradation of pyrimidines, uracil and thymine were found
to act as the effectors that inactivate RutR regulator for shutoff of the de novo synthesis pathway of pyrimidines; instead
the salvage pathway uses free pyrimidines for the synthesis
of pyrimidine nucleotides (Shimada et al., 2007). In
addition to the control of pyrimidine degradation, RutR
also plays a role, together with AllR (Hasegawa et al., 2008),
in degradation of purines at the steps downstream of
allantoin.
Coupling with glutamate transport, RutR also controls
the gadBC and gadAX operons, which play major roles in
transport of glutamic acid and synthesis of glutamine from
glutamate for de novo synthesis of pyrimidines. The gad
system is involved in glutamate-dependent acid resistance
FEMS Microbiol Rev 34 (2010) 628–645
639
Prokaryotic genome regulation
for maintenance of pH homeostasis and survival under
acidic conditions (Ma et al., 2002).
Global regulators for energy metabolism:
fumarate nitrate reduction (FNR)
FNR, initially named for the mutant defect in ‘fumarate and
nitrate reduction’, is another global transcription factor of
the CRP/FNR superfamily. FNR plays a key role in the
metabolic transition from aerobic to anaerobic growth
through regulation of a number of genes (Spiro & Guest,
1990; Iuchi & Lin, 1993). As in the case of CRP, FNR has an
N-terminal sensory domain, an internal dimerization domain, and a C-terminal H-T-H DNA-binding domain.
Generally, FNR activates the genes involved in anaerobic
metabolism, but it also regulates transcription of a number
of genes with other functions, such as acid resistance,
chemotaxis and cell structure. The intracellular concentration of FNR remains constant under both anaerobic and
aerobic growth, but its activity is regulated directly by
oxygen (O2). The sensory domain of FNR contains five Cys
residues, four of which are essential for linking the [4Fe–4S]
cluster (Kiley & Beinert, 2003). Under anaerobiosis, FNR is
activated by the formation of a [4Fe–4S] cluster that causes a
conformational change and dimerization of the protein, but
upon exposure to O2, FNR is inactivated via oxidation of
[4Fe–4S] cluster into [2Fe–2S]. The activated FNR conformation is able to bind the FNR-box sequence consisting of a
palindromic TTGATNNNNATCAA sequence (Fig. 4).
Nucleoid proteins as global regulators:
integration host factor (IHF)
In the E. coli nucleoid, two groups of the nucleoid protein
exist: universal nucleoid proteins (UNPs), which always stay
in the nucleoid, and growth condition- and growth phasespecific nucleoid proteins (GNPs), which appear only at
specific growth phases or under specific growth conditions
(Ishihama, 1999, 2009). IHF, a UNP, was originally found to
be required for the site-specific recombination of phage l
with the E. coli genome (Nash & Robertson, 1981). IHF is a
heterodimer consisting of two subunits, IhfA (HimA) and
IhfB (HimD, Hip), which share about 25% amino acid
identity. IHF plays a dual role, an architectural role for
DNA supercoiling and DNA duplex destabilization, and a
regulatory role of genome functions that control processes
such as DNA replication, recombination, and the expression
of a number of genes (Ishihama, 2009). IHF is highly
abundant during all the growth phases, hence its classification as a UNP (Azam et al., 1999). The intracellular
concentration of IHF ranges from 6000 dimers per cell in
the log phase to 3000 dimers in the stationary phase.
FEMS Microbiol Rev 34 (2010) 628–645
IHF binds tightly to DNA regions of about 40 bp carrying
the 13-bp consensus sequence with A/T-rich elements upstream of the core consensus sequence (Goodrich et al.,
1990; Ishihama, 2009). The structure of IHF bound to DNA
has been solved, revealing that IHF makes only a few
contacts with the minor groove (Swinger & Rice, 2007).
Thus, the DNA recognition specificity is due to the sequence-dependent structural parameters of the DNA, where
A/T-rich regions play an important role. The bend angle
induced by IHF is approximately 1601 (Sugimura &
Crothers, 2006). In transcription regulation, IHF acts to
facilitate the formation of the loop around promoter for
conversion into the active conformation. The binding to
low-affinity sites and introduction of sharp bends in the
promoter DNA promote the formation of the initiation
complex for transcription.
Nucleoid proteins as global regulators:
factor for inversion stimulation (Fis)
Fis is a GNP associated with the growing cell nucleoid (Ball
et al., 1992; Azam et al., 1999; Ishihama, 2009) as Dps in
stationary-phase cells (Almiron et al., 1992; Azam et al.,
1999; Ishihama, 2009) and Dan in cells growing under
anaerobic conditions (Teramoto et al., 2010). Under optimal
growth conditions, Fis is the dominant nucleoid protein,
reaching concentrations as high as 60 000 copies in a single
log-phase cell. In stationary-phase cells, Fis decreases to a
nearly imperceptible level. Expression of fis is regulated by
several systems and at different levels. At the transcription
level, Fis is autoregulated, induced by high supercoiling
levels, and regulated by both growth rate-dependent and
stringent control systems (Ishihama, 1999; Travers et al.,
2001). Transcription of fis is also regulated by the availability
of the nucleotide triphosphate CTP, the initiation nucleotide
of fis RNA synthesis (Walker et al., 2004). DksA, an RNA
polymerase-interacting transcription factor, inhibits transcription of fis by increasing the sensitivity to ppGpp, another
RNA polymerase-interacting nucleotide transcription factor
(Mallik et al., 2006). The GNP group of the nucleoid proteins
carries dual functions, playing an architectural role, folding
the genome DNA into the nucleoid structure and maintaining it, and a regulatory role in genome functions such as
transcription, replication, DNA inversion and transposition,
and phage integration–excision. As a transcriptional regulator, Fis regulates the expression of a number of genes involved
in translation (rRNA, tRNA and r-protein genes), virulence,
biofilm formation, energy metabolism, stress response, central intermediary metabolism, amino acid biosynthesis, transport, cell structure, carbon compound metabolism, amino
acid metabolism, nucleotide metabolism, motility and chemotaxis (Ishihama, 2009). Accordingly, microarray analysis
indicated that transcription of approximately 21% of genes is
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
640
modulated directly or indirectly by Fis, and ChIP-chip
analysis indicated that Fis binds to 894 DNA regions in the
genome (Cho et al., 2008). These regions include both
intergenic spacer and coding regions.
A degenerate consensus sequence has been proposed for
Fis sites, with only a subset of common nucleotides. A corebinding site of 15 bp with partial dyad symmetry commonly
presents an AT-rich sequence. Once bound to DNA, Fis
bends the DNA between 401 and 901. This bending stabilizes
the DNA looping to regulate transcription and to promote
DNA compaction (Ishihama, 2009).
Nucleoid proteins as global regulators:
DNA-binding protein under anaerobic
growth conditions (Dan)
As noted above, Fis, one of GNPs, is synthesized preferentially in growing cells and plays an essential role for maintenance of the nucleoid competent for transcription of the
growth-related genes. On the other hand, Dps is the major
nucleoid protein produced only in starved stationary-phase
cells and plays a role in protecting resting E. coli cells from
environmental stresses such as high levels of toxic iron (Nair
& Finkel, 2004). A novel nucleoid protein Dan was identified
in E. coli cells grown under anaerobic conditions (Teramoto
et al., 2010). The intracellular concentration is very low in
growing E. coli cells under aerobic conditions, but increases
4 100-fold to a level as high as the major nucleoid proteins,
H-NS, HU and IHF, under hypoxic or anaerobic culture
conditions. After a systematic search for DNA-binding
sequences by Dan, 4 700 Dan-binding loci were identified
on the E. coli genome. In silico search of the consensus Danbox GTTnATT sequence indicated the presence of as many
as 1900 Dan-binding sites. At high protein concentrations,
Dan covers the entire DNA surface, as observed by atomic
force microscopy (AFM). An E. coli mutant lacking dan
showed retarded growth under anaerobic conditions.
A. Ishihama
biofilm formation is under the control of a complex network
of transcription factors. More than 10 transcription factors
were found to be directly involved in regulation of the
promoter for csgD encoding the master regulator of biofilm
formation (Fig. 6) (Ogasawara et al., 2010). The expression
of these primary transcription factors that directly regulate
the csgD promoter are under the control of secondary
transcription factors. Various environmental factors and
conditions such as nutrients, pH, osmolarity and temperature affect the expression of the genes encoding these
primary and secondary transcription factors (Fig. 6). Biofilms tend to develop on surfaces of plastic materials in
nature or on tissues in host animals. The process includes an
ordered sequence of developmental steps, i.e. reversible
attachment, irreversible attachment, maturation and dispersion (Sauer et al., 2002).
Transcription factors and their regulation target genes
and operons are generally located near each other in the
genome. Such distance constraints are considered to have
arisen from horizontal gene transfer (Mironov et al., 1999;
Tan et al., 2001). The Genomic SELEX search supports the
prediction that transcription factors and their regulated
genes tend to evolve concurrently. The regulator–target sets
Transcription factor networks: formation
of hierarchic networks
After extensive Genomic SELEX screening of the whole
set of DNA sequences recognized by E. coli transcription
factors, we have realized that the increasing number of
promoters are under the control of multiple factors (see
Fig. 5). Genomic SELEX data can be assembled to form
complex transcription factor networks. One unexpected
feature is that, under the control of one transcription factor,
the genes encoding other transcription factors are organized, forming a hierarchy of transcription factors. The
expression of genes under the control of a downstream
transcription factor of this hierarchy must be controlled by
various factors and conditions. For instance, the pathway of
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
Fig. 6. Transcription factors involved in regulation of csgD, the master
regulator gene of biofilm formation. (a) A number of primary transcription factors, indicated in the inner circle, participate directly in regulation
of the csgD promoter. (b) The genes coding for these transcription
factors are organized under the control of secondary transcription
factors. The primary and secondary transcription factors together monitor various environmental signals and stresses.
FEMS Microbiol Rev 34 (2010) 628–645
641
Prokaryotic genome regulation
were then interconnected through cross-talk between regulators and targets. The transcription factor network involved in regulation of single promoters can be connected to
yield the interaction network consisting of a number of
signaling pathways (Shen-Orr et al., 2002). These interacting
pathways construct an intricate network. This network
integrates diverse extracellular and intracellular signals to
ensure the regulated expression of appropriate genes in the
genome at the proper time and proper level. The signals in
one pathway are often transferred to another pathway.
The cross-talk in signal transduction among various
signaling pathways has been recognized, particularly among
TCSs, i.e. sensor His kinase and response regulator (Oshima
et al., 2002; Yamamoto et al., 2005). Escherichia coli harbor
about 36 pairs of TCS. Sensor kinases monitor external
factors and conditions, self-phosphorylate His residues in
their receiver domains, and then transfer phosphoryl residues to Asp residues of response regulators to function. A
single response regulator is often transphosphorylated by
sensor kinases organized in different TCS pathways (Yamamoto et al., 2005). Cross-talk takes place not only through
the sharing of the same targets between different transcription factors but also in signal-transduction pathways such as
recognition of the same external signals by two different
sensors. Comprehensive microarray analysis of a set of 36
TCS mutants also indicated a high correlation for gene
expression among deletion mutants (Oshima et al., 2002).
Deletion of one TCS mutant often influences the transcription pattern under the control of other TCSs, implying the
sharing of the same regulation targets between two TCSs.
Conclusions
Regulatory interactions in E. coli can now be recognized to
be more complicated than previously understood and
probably as complex as those in eukaryotes, involving
multifactor promoters and multitarget regulators, together
forming hierarchic regulation networks.
Acknowledgements
The author acknowledges Tomohiro Shimada, Jun Teramoto and Hiroshi Ogasawara for experimental support and
discussion, and Ayako Kori and Kayoko Yamada for technical and secretarial assistance. The research was supported by
Grants-in-Aid for Scientific Research Priority Area
17076016 and Scientific Research 18310133 and 21241047
from the Ministry of Education, Culture, Sports, Science
and Technology of Japan.
FEMS Microbiol Rev 34 (2010) 628–645
References
Almiron M, Link AJ, Furlog D & Kolter D (1992) A novel DNAbinding protein with regulatory and protective roles in starved
Escherichia coli. Gene Dev 6: 2624–2654.
Arfin SM, Long AD, Ito ET, Tolleri L, Riehle MM, Paegle ES &
Hatfield GW (2000) Global gene expression profiling in
Escherichia coli K12: the effects of integration host factor. J Biol
Chem 275: 29672–29684.
Azam TA, Iwata A, Nishimura A, Ueda S & Ishihama A (1999)
Growth phase-dependent variation in protein composition of
the Escherichia coli nucleoid. J Bacteriol 181: 6361–6370.
Babu MM & Teichmann SA (2003) Evolution of transcription
factors and the gene regulatory network in Escherichia coli.
Nucleic Acids Res 31: 1234–1244.
Babu MM, Luscombe NM, Aravind L, Gerstein M & Teichmann
SA (2004) Structure and evolution of transcriptional
regulatory networks. Curr Opin Struc Biol 14: 283–291.
Ball CA, Osuna R, Ferguson KC & Johnson RC (1992) Dramatic
changes in Fis levels upon nutrient upshift in Escherichia coli.
J Bacteriol 174: 8042–8056.
Barabasi AL & Oltvai ZN (2004) Network biology: understanding
the cell’s functional organization. Nat Rev Genet 5: 101–113.
Barnard A, Wolfe A & Busby S (2004) Regulation at complex
bacterial promoters: how bacteria use different promoter
organizations to produce different regulatory outcomes. Curr
Opin Microbiol 7: 102–108.
Baumbach J, Wittkop T, Rademacher K, Rahmann S, Brinkrolf K
& Tauch A (2006) CoryneRegNet 3.0: an interactive systems
biology platform for the analysis of gene regulatory networks
in corynebacteria and Escherichia coli. J Biotechnol 129:
279–289.
Berg OG & von Hippel PH (1988) Selection of DNA binding sites
by regulatory proteins. II. The binding specificity of cyclic
AMP receptor protein to recognition sites. J Mol Biol 200:
709–723.
Bochner BR (2009) Global phenotypic characterization of
bacteria. FEMS Microbiol Rev 33: 191–205.
Bodenmiller DM & Spiro S (2006) The yjeB (nsrR) gene of
Escherichia coli encodes a nitric oxide-sensitive transcriptional
regulator. J Bacteriol 188: 874–881.
Bradley MD, Beach MB, de Koning PJ, Pratt TS & Osuna R (2007)
Effects of Fis on Escherichia coli gene expression during
different growth stages. Microbiology 153: 2922–2940.
Brent R & Ptashne M (1981) Mechanism of action of the lexA
gene product. P Natl Acad Sci USA 78: 4202–4208.
Bulyk ML (2006) DNA microarray technologies for measuring
protein–DNA interactions. Curr Opin Biotech 17: 422–430.
Bundschuh R, Hayot F & Jayaprakash C (2003) The role of
dimerization in noise reduction of simple genetic networks.
J Theor Biol 220: 261–269.
Busby S & Ebright RH (1999) Transcription activation by
catabolite activator protein. J Mol Biol 293: 199–213.
Cho BK, Knight EM, Barrett CL & Palsson BO (2006) Genomewide analysis of Fis-binding in Escherichia coli indicates a
causative role for A-/AT-tracts. Genome Res 18: 900–910.
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
642
Cho BK, Barrett CL, Knight EM, Park YS & Palsson BO (2008)
Genome-scale reconstruction of the Lrp regulatory network in
Escherichia coli. P Natl Acad Sci USA 105: 19462–19467.
Collado-Vides J, Magasanik B & Gralla JD (1991) Control site
location and transcriptional regulation in Escherichia coli.
Microbiol Rev 55: 371–394.
Constantinidou C, Hobman JL, Griffiths L, Patel MD, Penn CW,
Cole JA & Overton TW (2006) A reassessment of the FNR
regulon and transcriptomic analysis of the effects of nitrate,
nitrite, NarXL, and NarQP as Escherichia coli K12 adapts from
aerobic to anaerobic growth. J Biol Chem 281: 4802–4815.
Ebright RH (1993) Transcription activation at class I CAPdependent promoters. Mol Microbiol 8: 797–802.
Eguchi Y, Oshima T, Mori H, Aono R, Yamamoto K, Ishihama A
& Ustumi R (2003) Transcriptional regulation of drug efflux
genes by EvgAS, a two-component system in Escherichia coli.
Microbiology 149: 2819–2828.
Ellington AD & Szostak JW (1990) In vitro selection of DNA
molecules that bind specific ligands. Nature 346: 818–822.
Fernandez de Henestrisa AR, Ogi T & Aoyagi S (2000)
Identification of additional genes belonging to the LexA
regulon in Escherichia coli. Mol Microbiol 35: 1560–1572.
Gama-Castro S, Jimenez-Jacinto V, Peralta-Gil M et al. (2008)
RegulonDB (version 6.0): gene regulation model of Escherichia
coli K-12 beyond transcription, active (experimental)
annotated promoters and Textpresso navigation. Nucleic Acids
Res 36: D120–D124.
Geerse RH, van der Pluijm J & Postma PW (1989) The repressor
of the PEP-fructose phosphotransferase system is required for
the transcription of the pps gene of Escherichia coli. Mol Gen
Genet 218: 348–352.
Ghosh P, Ishihama A & Chatterji D (2001) Escherichia coli RNA
polymerase subunit omega and its N-terminal domain bind
full-length b 0 to facilitate incorporation into the a2b
subassembly. Eur J Biochem 268: 4621–4627.
Goodrich JA, Schwartz ML & McClure WR (1990) Searching for
and predicting the activity of sites for DNA binding proteins;
compilation and analysis of the binding sites for Escherichia
coli integration host factor (IHF). Nucleic Acids Res 18:
4993–5000.
Gottesman S (1984) Bacterial regulation: global regulatory
networks. Annu Rev Genet 18: 415–441.
Gourse RL, Ross W & Rutherford ST (2006) General pathway for
turning on promoters transcribed by RNA polymerase
containing alternative sigma subunits. Mol Microbiol 63:
1296–1306.
Grainger DC, Hurd D, Harrison M, Holdstock J & Busby SJW
(2005) Studies of the distribution of Escherichia coli cAMPreceptor protein and RNA polymerase on the E. coli
chromosome. P Natl Acad Sci USA 102: 496–510.
Grainger DC, Hurd D, Goldberg MD & Busby SJW (2006)
Association of nucleoid proteins with coding and non-coding
segments of the Escherichia coli genome. Nucleic Acids Res 34:
4642–4650.
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
A. Ishihama
Gross CA, Lonetto M & Losick R (1992) Bacterial sigma factors.
Transcriptional Regulation (McKnight SL & Yamamoto KR,
eds), pp. 129–176. Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY.
Gross CA, Chan C, Dombroski A, Gruber T, Sharp T, Tupy J &
Young B (1998) The function and regulatory roles of sigma
factors in transcription. Cold Spring Harbor Symp Quant Biol
63: 141–155.
Gruber TM & Gross CA (2003) Multiple sigma subunits and the
partitioning of bacterial transcription space. Annu Rev
Microbiol 57: 441–466.
Gutierrez-Rois RM, Rosenblueth DA, Loza JA, Huerta AM,
Glasner JD, Blattner FR & Collado-Vides J (2003) Regulatory
network of Escherichia coli: consistency between literature
knowledge and microarray profiles. Genome Res 13:
2435–2443.
Harari O, del Val C, Romeo-Zaliz R, Shin D, Hung H, Groisman
EA & Zwir I (2008) Identifying promoter features of coregulated genes with similar network motifs. BMC
Bioinformatics 10 (suppl 4): S1.
Harbison CT, Gordon DB, Lee TI et al. (2004) Transcriptional
regulatory code of a eukaryotic genome. Nature 431: 99–104.
Harman JG (2001) Allosteric regulation of the cAMP receptor
protein. Biochim Biophys Acta 1547: 1–17.
Hasegawa A, Ogasawara H, Kori A, Teramoto J & Ishihama A
(2008) The transcription regulator AllR senses both allantoin
and glyoxylate and controls a set of genes for degradation and
utilization of purines. Microbiology 154: 3366–3378.
Hayashi K, Morooka N, Yamamoto Yet al. (2006) Highly accurate
genome sequences of Escherichia coli K-12 strains MG1655 and
W3110. Mol Syst Biol 2: 2006.0007.
Helmann J & Chamberlin M (1988) Structure and function of
bacterial sigma factors. Annu Rev Biochem 57: 839–872.
Herring CD, Raffaelle M, Allen TE, Kanin EL, Landick R, Ansari
AZ & Palsson BO (2005) Immobilization of Escherichia coli
RNA polymerase and location of binding sites by use of
chromatin immunoprecipitation and microarrays. J Bacteriol
187: 6166–6174.
Hung S-P, Baldi P & Hatfield GW (2002) Global gene expression
profiling in Escherichia coli K12: the effects of leucine-response
regulatory protein. J Biol Chem 277: 40309–40323.
Huyen NTT, Eiamphungporn W, Maeder U, LIebeke M, Lalk M,
Hecker M, Helmann JD & Antelmann H (2009) Genome-wide
responses to carbonyl electrophophiles in Bacillus subtilis:
control of the thiol-dependent formaldehyde dehydrogenase
AdhA and cysteine proteinase YraA by the MerR-family
regulator YraB (AdhR). Mol Microbiol 71: 876–894.
Ishida U, Kori A & Ishihama A (2009) Participation of regulator
AscG of the b-glucoside utilization operon in regulation of the
propionate catabolism operon. J Bacteriol 191: 6136–6144.
Ishihama A (1990) Molecular assembly and functional
modulation of Escherichia coli RNA polymerase. Adv Biophys
26: 19–31.
Ishihama A (1992) Role of the RNA polymerase alpha subunit in
transcription activation. Mol Microbiol 6: 3283–3288.
FEMS Microbiol Rev 34 (2010) 628–645
643
Prokaryotic genome regulation
Ishihama A (1993) Protein–protein communication within the
transcription apparatus. J Bacteriol 175: 2483–2489.
Ishihama A (1997) Adaptation of gene expression in stationary
phase bacteria. Curr Opin Genet Dev 7: 582–588.
Ishihama A (1999) Modulation of the nucleoid, the transcription
apparatus, and the translation machinery in bacteria for
stationary phase survival. Genes Cells 4: 136–143.
Ishihama A (2000) Functional modulation of Escherichia coli
RNA polymerase. Annu Rev Microbiol 54: 499–518.
Ishihama A (2009) The nucleoid: an overview. EcoSal –
Escherichia coli and Salmonella: Cellular and Molecular Biology
(Boek A, Curtiss R III, Kaper JB, Karp PD, Neidhardt FC,
Nystrom T, Slauch JM, Squires CL & Ussery D, eds). ASM
Press, Washington, DC.
Iuchi S & Lin EC (1993) Adaptation of Escherichia coli to redox
environments by gene expression. Mol Microbiol 9: 9–15.
Jishage M & Ishihama A (1997) Variation in RNA polymerase
sigma subunit synthesis in Escherichia coli: intracellular levels
of four species of sigma subunit under various growth
conditions. J Bacteriol 178: 5447–5451.
Kao KC, Tran LM & Liao JC (2005) A global regulatory role of
gluconeogenic genes in Escherichia coli revealed by
transcription network analysis. J Biol Chem 280: 36079–36087.
Khodursky AB, Peter BJ, Cozzarelli NR, Botstein D, Brown PO &
Yanofsky C (2000) DNA microarray analysis of gene
expression in response to physiological and genetic changes
that affect tryptophan metabolism in Escherichia coli. P Natl
Acad Sci USA 97: 12170–12175.
Kiley PJ & Beinert H (2003) The role of Fe–S proteins in sensing
and regulation in bacteria. Curr Opin Microbiol 6: 181–185.
Kim J, Adhya S & Garges S (1992) Allosteric changes in the cAMP
receptor protein of Escherichia coli: hinge reorientation. P Natl
Acad Sci USA 89: 1770–1773.
Kolb A, Busby S, Buc H, Garges S & Adhya S (1993)
Transcriptional regulation by cAMP receptor protein of
Escherichia coli. Annu Rev Biochem 62: 749–795.
Krueger S, Gregurick S, Shi Y, Wang S, Wladkowski BD &
Schwarz FP (2003) Entropic nature of the interaction between
promoter and bound CRP mutants and RNA polymerase.
Biochemistry 42: 1958–1968.
Lee LJ, Barrett JA & Poole RK (2005) Genome-wide
transcriptional response of chemostat-cultured Escherichia coli
to zinc. J Bacteriol 187: 1124–1134.
Lee TI, Rinaldi NJ, Robert F et al. (2002) Transcriptional
regulatory networks in Saccharomyces cerevisiae. Science 298:
799–804.
Lehnen D, Blumer C, Polen T, Wackwitz B, Wendisch VF &
Unden G (2002) LrhA as a new transcriptional key regulator of
flagella, motility and chemotaxis genes in Escherichia coli. Mol
Microbiol 45: 521–532.
Lin SH & Lee JC (2003) Determinants of DNA bending in the
DNA-cyclic AMP receptor protein complexes in Escherichia
coli. Biochemistry 42: 4809–4818.
FEMS Microbiol Rev 34 (2010) 628–645
Liu X & de Wulf P (2004) Probing the ArcA-P modulon of
Escherichia coli by whole genome transcriptional analysis and
sequence recognition profiling. J Biol Chem 279: 12588–12597.
Lobell RB & Schleif RF (1991) AraC-DNA looping: orientation
and distance-dependent loop breaking by the cyclic AMP
receptor protein. J Mol Biol 218: 45–54.
Loh KD, Gyaneshwar P, Papadimitriou EM, Fong R, Kim KS,
Pareles R, Zhou Z, Inwood W & Kustu S (2006) A previously
undescribed pathway for pyrimidine catabolism. P Natl Acad
Sci USA 103: 5114–5119.
Ma Z, Gong S, Richard H, Tucker DL, Conway T & Foster JW
(2002) Collaborative regulation of Escherichia coli glutamatedependent acid resistance by two AraC like regulators, GadX
and GadW. J Bacteriol 184: 7001–7012.
Maeda H, Jishage M, Nomura T, Fujita N & Ishihama A (2000)
Two extracytoplasmic function sigma subunits, sigma-E and
sigma-FecI, of Escherichia coli: promoter selectivity and
intracellular levels. J Bacteriol 182: 1181–1184.
Mallik P, Paul BJ, Rutherford ST, Gourse RL & Osuna R (2006)
DksA is required for growth phase-dependent regulation,
growth rate-dependent control, and stringent control of fis
expression in Escherichia coli. J Bacteriol 188: 5775–5782.
Martin RG & Rosner JL (2002) Genomics of the marA/soxS/rob
regulon of Escherichia coli: identification of directly activated
promoters by application of molecular genetics and
informatics to microarray data. Mol Microbiol 44: 1611–1624.
Martinez-Antonio A & Collado-Vides J (2003) Identifying global
regulators in transcriptional regulatory networks in bacteria.
Curr Opin Microbiol 6: 482–489.
Masuda N & Church GM (2002) Escherichia coli gene expression
responsive to levels of the response regulator EvgA. J Bacteriol
184: 6225–6234.
Masuda N & Church GM (2003) Regulatory network of acid
resistance in Escherichia coli. Mol Microbiol 48: 699–712.
McCue L, Thompson W, Carmack C, Ryan MP, Liu JS,
Derbyshire V & Lawrence CE (2001) Phylogenetic footprinting
of transcription factor binding sites in proteobacterial
genomes. Nucleic Acids Res 29: 774–782.
McCue LA, Thompson W, Carmack CS & Lawrence CE (2002)
Factors influencing the identification of transcription factor
binding sites by cross-species comparison. Genome Res 12:
1523–1532.
Mendoza-Vargas A, Olvera L, Olvera M et al. (2009) Genomewide identification of transcription start sites, promoters and
transcription factor binding sites in E. coli. PLoS One 4: e7526.
Mironov VA, Koonin EV, Royberg MA & Gelfand MS (1999)
computer analysis of transcription regulatory patterns in
completely sequenced bacterial genomes. Nucleic Acids Res 27:
2981–2989.
Nair S & Finkel SE (2004) Dps protects cells against multiple
stresses during stationary phase. J Bacteriol 186: 4192–4198.
Nash HA & Robertson CA (1981) Purification and properties of
the Escherichia coli protein factor required for lambda
integrative replication. J Biol Chem 256: 9246–9253.
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
644
Negre D, Bonod-Bidaud C, Geourjon G, Deleage G, Cozzone AJ
& Cortay JC (1996) Definition of a consensus DNA-binding
site for the Escherichia coli pleiotropic regulatory protein FruR.
Mol Microbiol 21: 257–266.
Ogasawara H, Hasegawa A, Kanda E, Miki T, Yamamoto K &
Ishihama A (2007a) Genomic SELEX search for target
promoters under the control of the PhoQP-RstBA signal
cascade. J Bacteriol 187: 4791–4799.
Ogasawara H, Ishida Y, Yamada K, Yamamoto K & Ishihama A
(2007b) PdhR (pyruvate dehydrogenase complex regulator)
controls the respiratory electron transport system in
Escherichia coli. J Bacteriol 189: 5534–5541.
Ogasawara H, Yamada K, Kovi A, Yamamoto K & Ishihama A
(2010) Regulation of the E. coli csgD promotor: interplay
between five transcription factors. Microbiology, in press.
Oh M-K, Rohlin L, Kao KC & Liao JC (2002) Global expression
profiling of acetate-grown Escherichia coli. J Biol Chem 277:
13175–13183.
Orlando V, Strutt H & Paro RO (1997) Analysis of chromatin
structure by in vivo formaldehyde cross-linking. Methods 11:
205–214.
Oshima T, Aiba H, Masuda Y, Kanaya S, Sugiura M, Wanner BL,
Mori H & Mizuno T (2002) Transcriptome analysis of
two-component response regulatory system mutants of
Escherichia coli K-12. Mol Microbiol 46: 281–291.
Overton TW, Griffiths L, Patel MD, Hobman JL, Penn CW, Cole
JA & Constantinidou C (2006) Microarray analysis of gene
regulation by oxygen, nitrate, nitrite, FNR, NarL and NarP
during anaerobic growth of Escherichia coli: new insights into
microbial physiology. Biochem Soc T 34: 104–107.
Paget MS & Helmann JD (2003) The sigma 70 family of sigma
factors. Genome Biol 4: 203.
Parkinson G, Wilson C, Gunasekera A, Ebright YW, Ebright RE &
Berman HM (1996) Structure of the CAP–DNA complex at
2.5 Å resolution: a complete picture of the protein–DNA
interface. J Mol Biol 304: 847–859.
Passner JM & Steitz TA (1997) The structure of a CAP–DNA
complex having two cAMP molecules bound to each
monomer. P Natl Acad Sci USA 94: 2843–2847.
Perez-Rueda E & Collado-Vides J (2000) The repertoires of DNAbinding transcription regulators in Escherichia coli K-12.
Nucleic Acids Res 28: 1838–1847.
Phadtare S & Inouye M (2004) Genome-wide transcriptional
analysis of the cold shock response in wild-type and coldsensitive, quadruple-csp-deletion strains of Escherichia coli. J
Bacteriol 186: 7007–7014.
Pyles EA & Lee JC (1998) Escherichia coli cAMP receptor
protein–DNA complexes. 2. Structural asymmetry of DNA
bending. Biochemistry 37: 5201–5210.
Qin ZS, McCue LA, Thompson W, Mayerhofer L, Mayerhofer L,
Lawrence CE & Liu JS (2003) Identification of co-regulated
genes through Bayesian clustering of predicted regulatory
binding sites. Nat Biotechnol 21: 435–443.
Ramseier TM (1996) Cra and the control of carbon flux via
metabolic pathways. Res Microbiol 147: 489–493.
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c
A. Ishihama
Rankin LD, Bodenhiller DM, Partridge JD, Nishino SF, Spain JC
& Spiro S (2008) Escherichia coli NsrR regulates a pathway for
the oxidation of 3-nitrotyramine to 4-hydroxyl-3nitrophenylacetate. J Bacteriol 190: 6170–6177.
Ren B, Robert F, Wyrick JJ et al. (2000) Genome-wide location
and function of DNA-binding proteins. Science 290:
2306–2309.
Ren D, Bedzyk LA, Thomas SM, Ye RW & Wood TK (2004) Gene
expression in Escherichia coli microfilms. Appl Microbiol Biot
64: 515–524.
Riley M, Abe T, Arnad MB et al. (2006) Escherichia coli K-12: a
cooperatively developed annotation snapshot – 2005. Nucleic
Acids Res 34: 1–9.
Roth MF, Feng L, McConnell KJ, Schaffer PJ & Guerra CE (2004)
Expression profiling using a hexamer-based universal
microarray. Nat Biotechnol 22: 416–426.
Saier MH & Ramseier TM (1996) The catabolite repressor/
activator (Cra) protein of enteric bacteria. J Bacteriol 176:
3411–3417.
Saldago H, Gama-Cstro S, Martinez-Antonia A, Diaz-Peredo E,
Sanchez-Solano F, Perez-Rueda E, Bonavides-Martinez C &
Collado-Vides J (2004) RegulonDB 4.0: transcriptional
regulation, operon organization and growth conditions in
Escherichia coli. Nucleic Acids Res 32: D303–D306.
Salgado H, Gama-Castro S, Peralta-Gil M et al. (2006)
RegulonDB (version 5.0); Escherichia coli K-12 transcriptional
regulatory network, operon organization, and growth
conditions. Nucleic Acids Res 34: D394–D397.
Salmon K, Hung S-P, Mekjian K, Baldi P, Hatfield GW &
Gunsalus RP (2003) Global gene expression profiling in
Escherichia coli K12: the effects of oxygen availability and FNR.
J Biol Chem 278: 29837–29855.
Sauer K, Camper AK, Ehrich GD, Costerton JW & Davies DG
(2002) Pseudomonas aeruginosa displays multiple phenotypes
during development as a biofilm. J Bacteriol 184: 1140–1154.
Schembri MA, Ussery DW, Workman C, Hasman H & Klemm P
(2002) DNA microarray analysis of fim mutations in
Escherichia coli. Mol Genet Genomics 267: 721–729.
Shen-Orr SS, Milo R, Mangan S & Alon U (2002) Network motifs
in the transcriptional regulation network of Escherichia coli.
Nat Genet 312: 64–68.
Shimada T, Makinoshima H, Ogawa Y, Miki T, Maeda M &
Ishihama A (2004) Classification and strength measurement of
stationary-phase promoters by use of a newly developed
promoter cloning vector. J Bacteriol 186: 7112–7122.
Shimada T, Fujita N, Maeda M & Ishihama A (2005) Systematic
search for the Cra-binding promoters using genomic SELEX
systems. Genes Cells 10: 907–918.
Shimada T, Hirao K, Kori A, Yamamoto K & Ishihama A (2007)
RutR is the uracil/thymine-sensing master regulator of a set of
genes for synthesis and degradation of pyrimidines. Mol
Microbiol 66: 744–779.
Shimada T, Ishihama A, Busby SJ & Grainer DC (2008) The
Escherichia coli RutR transcription factor binds at targets
FEMS Microbiol Rev 34 (2010) 628–645
645
Prokaryotic genome regulation
within genes as well as intergenic regions. Nucleic Acids Res 36:
3950–3955.
Shimada T, Yamamoto K & Ishihama A (2009) Involvement of
leucine-response transcription factor LeuO in regulation of
the genes for sulfa-drug efflux. J Bacteriol 191: 4562–4571.
Singer BS, Shtatland T, Brown D & Gold L (1997) Libraries for
genomic SELEX. Nucleic Acids Res 25: 781–786.
Spiro S & Guest JR (1990) FNR and its role in oxygen-regulated
gene expression in Escherichia coli. FEMS Microbiol Rev 6:
399–428.
Steinmetz LM & Davis RW (2004) Maximizing the potential of
functional genomics. Nat Rev Genet 5: 190–201.
Sugimura S & Crothers DM (2006) Stepwise binding and bending
of DNA by Escherichia coli integration host factor. P Natl Acad
Sci USA 103: 18510–18514.
Swinger KK & Rice PA (2007) Structure-based analysis of HUDNA binding. J Mol Biol 365: 1005–1016.
Tan K, Moreno-Hagelsieb G, Colado-Vides J & Stormo GD
(2001) A comparative genomics approach to prediction of new
members of regulons. Genome Res 11: 566–584.
Tan K, McCue LA & Stormo GD (2007) Making connections
between novel transcription factors and their DNA motifs.
Genome Res 15: 312–320.
Tao H, Hasona A, Do PM, Ingram LO & Shanmuga KT (2005)
Global gene expression analysis revealed an unsuspected deo
operon under the control of molybdate sensor, ModE, protein
in Escherichia coli. Arch Microbiol 184: 225–233.
Teichmann SA & Babu MM (2002) Conservation of gene coregulation in prokaryotes and eukaryotes. Trends Biotechnol
20: 407–410.
Teramoto J, Yoshimura SH, Takeyasu K & Ishihama A (2010) A
novel nucleoid protein of Escherichia coli induced under
anaerobic growth conditions. Nucleic Acids Res, DOI: 10.1093/
nar/gkq077.
Terui Y, Higashi K, Taniguchi S, Shigemasa A, Nishimura K,
Yamamoto K, Kashiwagi K, Ishihama A & Igarashi K (2007)
Enhancement of the synthesis of RpoN, Cra and H-NS by
polyamines at the level of translation in Escherichia coli
cultured with glucose and glutamine. J Bacteriol 189:
2359–2368.
Travers A, Schneider R & Muskhelishhvilli G (2001) DNA
supercoiling and transcription in Escherichia coli: the FIS
connection. Biochimie 83: 213–217.
Tuerk C & Gold L (1990) Systematic evolution of ligands by
exponential enrichment: RNA ligands to bacteriophage T4
DNA polymerase. Science 249: 505–510.
Typas A, Becker G & Hengge R (2007) The molecular basis of
selective promoter activation by the sigma S subunit of RNA
polymerase. Mol Microbiol 63: 1296–1306.
FEMS Microbiol Rev 34 (2010) 628–645
Umezawa Y, Shimada T, Kori A, Yamada K & Ishihama A (2008)
The uncharacterized transcription factor YdhM is the
regulator of the nemA gene encoding N-ethylmaleimide
reductase. J Bacteriol 190: 5890–5897.
Van Nimwegen E, Zavolan M, Rajewsky N & Siggia ED (2002)
Probabilistic clustering of sequences: inferring new bacterial
regulons by comparative genomics. P Natl Acad Sci USA 99:
7323–7328.
Walker KA, Mallik P, Pratt TS & Osuna R (2004) The Escherichia
coli Fis promoter is regulated by changes in the levels of its
transcription initiation nucleotide CTP. J Biol Chem 279:
50818–50828.
Wei GH, Liu DP & Liang CC (2004) Charting gene regulatory
networks: strategies, challenges and perspectives. Biochem J
381: 1–12.
Wei Y, Lee J-M, Smulski DR & Larossa RA (2001) Global impact
of sdiA amplification revealed by comprehensive gene
expression profiling of Escherichia coli. J Bacteriol 183:
2265–2272.
Wolberger C (1999) Multiprotein–DNA complexes in
transcriptional regulation. Annu Rev Bioph Biom 28:
29–56.
Yamamoto K & Ishihama A (2003) Two different modes of
transcription repression of the Escherichia coli acetate operon
by IclR. Mol Microbiol 47: 183–194.
Yamamoto K, Hirao K, Oshima T, Aiba H, Utsumi R & Ishihama
A (2005) Functional characterization in vitro of all twocomponent signal transduction systems from Escherichia coli. J
Biol Chem 280: 1448–1456.
Yamamoto K, Matsumoto F, Oshima T, Fujita N, Ogasawara N &
Ishihama A (2008) Anaerobic regulation of citrate
fermentation by CitAB in Escherichia coli. Biosci Biotech Bioch
72: 3011–3014.
Yang J, Ogawa Y, Camkaris H, Shimada T, Ishihama A & Pitard AJ
(2007) folA, a new member of the TyrR regulon in Escherichia
coli K-12. J Bacteriol 189: 6080–6084.
Zheng D, Constantinidou C, Hobman JL & Minchin SD
(2004) Identification of the CRP regulon using in vitro
and in vivo transcriptional profiling. Nucleic Acids Res 32:
5874–5893.
Zheng M, Wang X, Templeton LJ, Smulski DR, LaRossa RA &
Storz G (2001) DNA microarray-mediated transcriptional
profiling of the Escherichia coli response to hydrogen peroxide.
J Bacteriol 183: 4562–4570.
Zubay G, Schwartz D & Beckwith J (1970) Mechanism of
activation of catabolite-sensitive genes: a positive control
system. P Natl Acad Sci USA 66: 104–110.
2010 Federation of European Microbiological Societies
Published by Blackwell Publishing Ltd. All rights reserved
c