The Regulatory RNAs of Bacillus subtilis
Mars, Ruben

University of Groningen
Document Version
Publisher's PDF, also known as Version of record
Publication date:
2014
Citation for published version (APA):
Mars, R. (2014). The Regulatory RNAs of Bacillus subtilis [S.l.]: [S.n.]
Chapter 4
In silico Target Profiling Reveals Small
Regulatory RNA Functions in
Bacillus subtilis
Ruben A. T. Mars, Pierre Nicolas, Gerhild Wachlin, Michael Hecker, Emma L. Denham,
and Jan Maarten van Dijl
To be submitted
In silico target profiling
Abstract
Small regulatory RNAs (srRNAs) are bacterial post-transcriptional regulators that act by
complementary base-pairing to modulate mRNA stability or translation. Studies in Gram-negative bacteria have been greatly facilitated by focusing mainly on Hfq-interacting RNAs.
However, the RNA chaperone Hfq is apparently not required for srRNA regulation in Gram-positive bacteria, where many putative srRNAs have also been identified. The present study was
aimed at providing new leads for the functional analysis of 63 selected putative srRNAs from
the Gram-positive soil bacterium Bacillus subtilis. This involved extensive target predictions,
evolutionary conservation analyses of both srRNAs and their predicted targets, target
enrichment analyses on these predictions, two expression correlation analyses computed over
a 104-condition expression space, and a selection of those srRNA-mRNA pairs that are co-expressed. The validity of the various predictions was tested with experimental data on two
known srRNAs, namely FsrA/S512 and RsaE/S415. We were able to retrieve the established role
of FsrA/S512 in iron metabolism through our predictions and to suggest additional iron-related
targets of the FsrA/S512 regulon. In addition, we experimentally show that FsrA/S512 also has
a regulatory role in cells grown on iron-proficient LB medium. The involvement of RsaE/S415 in
the regulation of B. subtilis central metabolism was shown by the deregulated expression of the
2-oxoglutarate dehydrogenase OdhA. Furthermore, conserved predicted targets of RsaE/S415
suggest that this srRNA regulates the expression of genes from the functional categories lipid
utilization and biosynthesis of cofactors in organisms ranging from Staphylococcus aureus to B.
subtilis. We conclude that our present data can serve as valuable leads for further functional
studies on the srRNAs of B. subtilis.
Introduction
Small regulatory RNAs (srRNAs) are regulators of a wide variety of cellular processes in
bacteria. These include stress adaptation, central carbon metabolism and bacterial virulence (1,
2, 3). The srRNAs act in trans by short, imperfect, complementary base-pairing with their target
messenger RNA (mRNA) molecules. Regulation via srRNAs can be faster than the regulation
via protein transcription factors (4), which makes the srRNA-mediated regulation ideally
suited for stress responses. However, srRNAs can also have more subtle fine-tuning functions
in the regulation of gene expression (5). These theoretical considerations imply that there is a
clear niche for gene regulation at the RNA level, and this might explain why srRNA-mediated
regulation is a universally occurring biological phenomenon (3, 6, 7).
The phenotypes of srRNA mutants are consequences of the deregulation of their
target mRNAs. While the number of functional targets for some srRNAs may be small, others
have emerged as important post-transcriptional regulatory hubs in Gram-negative bacteria,
with more than 20 verified targets for GcvB in Salmonella typhimurium (8). The srRNA target
identification in Gram-negative bacteria has been greatly facilitated by focusing on those RNAs
that interact with the RNA chaperone Hfq. Because of this, discussions regarding regulatory
RNAs in bacteria have become tightly linked to the function of Hfq. However, genes for Hfq
homologues are only found in approximately half of the sequenced genomes (9). For example,
the Gram-positive soil bacterium Bacillus subtilis contains an Hfq homologue (previously named
YmaH), but to date this protein has proven to be dispensable for the established srRNA-mRNA
interactions in this organism (10, 11, 12, 13). Nevertheless, a recent comparative transcriptome
analysis revealed altered abundances for mRNAs belonging to the ResD-ResE, GerE and ComK
regulons in hfq mutant cells and, more importantly for our present study, the abundance of six
predicted putative srRNAs was changed in cells lacking Hfq (13). These findings suggest that
Hfq is only critically involved in a small subset of the srRNA-mRNA interactions in B. subtilis.
Since many putative srRNAs have been identified in B. subtilis (14, 15, 16, 17) and other Gram-positive bacteria (18, 19, 20), this raises the question of how best to study these potentially Hfq-independent srRNAs.
Experimentally characterizing srRNA targets can be complicated and laborious. After
identifying the srRNA, either by a bioinformatics (21) or experimental approach (tiling array or
RNA-sequencing) (16, 17), experimental work starts with the verification of srRNA expression.
Subsequently, transcriptome and/or proteome analyses can be performed, either on an srRNA
mutant, or a strain in which the srRNA is overexpressed for a brief period (22). These analyses
are likely to result in a number of mRNAs and/or proteins that show differential expression.
The respective differentially expressed genes can be translationally coupled to a reporter gene to
establish their mRNAs as direct targets of the srRNA (22). In such a system, the introduction of
compensatory mutations in the srRNA-mRNA target interaction site usually provides the final
proof for direct srRNA-mRNA regulation. A possible shortcoming of this approach is that true
srRNA targets might not be expressed under the selected experimental conditions. This implies
that a comprehensive view of srRNA regulation has to rely to some extent on target predictions,
as was also suggested by Sharma et al. (8).
Predicting srRNA targets can be very helpful in predicting the function of an srRNA.
Accordingly, several srRNA target prediction algorithms have been reported (23, 24, 25, 26) and
successfully applied (e.g. (8, 27)). However, results of srRNA target predictions should be treated
with caution, because of the small numbers of true positive targets that are usually predicted.
The limited success in predicting srRNA targets relates to the fact that a small degree of sequence
complementarity can already suffice for srRNA-target regulation. Consequently, the number
of predicted targets is generally too large to justify direct target verification experiments. This
is especially problematic for the analysis of srRNA-mediated regulation in organisms that are
genetically poorly accessible. Additional in silico approaches can therefore be highly useful to
pinpoint the most likely candidate targets from the initially large lists of predicted targets. For
instance, such additional approaches may address the evolutionary conservation of predicted
srRNA–target interactions (26), or they may take into account expression data for the investigated
srRNAs and their predicted targets (28).
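To illustrate why sequence complementarity alone tends to yield long candidate lists, the following minimal Python sketch reports the longest ungapped antiparallel stretch of base pairs (Watson-Crick plus G-U wobble) between an srRNA and an mRNA region. It is not the algorithm of any of the cited prediction tools (23, 24, 25, 26); the function name `longest_pairing_run` and the example sequences are hypothetical and serve illustration only.

```python
# Allowed base pairs: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def longest_pairing_run(srna: str, mrna: str) -> int:
    """Longest ungapped antiparallel stretch of base pairs between two RNAs."""
    srna = srna.upper().replace("T", "U")
    # Reverse the mRNA so that an index-wise comparison is antiparallel.
    mrna_rev = mrna.upper().replace("T", "U")[::-1]
    best = 0
    # Slide the two sequences past each other at every possible offset.
    for offset in range(-len(mrna_rev) + 1, len(srna)):
        run = 0
        for i in range(len(srna)):
            j = i - offset
            if 0 <= j < len(mrna_rev) and (srna[i], mrna_rev[j]) in PAIRS:
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best


# Hypothetical example: the mRNA region is the exact reverse complement.
print(longest_pairing_run("AAGGAGG", "CCUCCUU"))  # 7
```

Because runs of only a few base pairs arise readily between unrelated sequences, a score of this kind cannot by itself separate true targets from chance matches, which is the motivation for the additional filters discussed in the text.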
A recent large-scale transcriptome study by Nicolas et al. identified 1583 potential
regulatory RNAs in the B. subtilis prototype strain 168, of which ~150 are independently
expressed (17). These RNAs were identified by analyzing the transcriptome of B. subtilis 168
across 104 different growth conditions with high-resolution tiling arrays. Analysis of this
extensive dataset revealed that the number of genes that are not expressed under any condition
is very low (4.4%), while 85% of all coding sequences (CDS) are highly expressed under one or
more experimental conditions. Only 3% of the CDS were highly expressed under every tested
condition. This remarkably high expression plasticity suggests that the study by Nicolas et al.
(17) had, for the first time, covered almost all transcriptionally active regions of an organism,
including its regulatory RNAs. The 1583 identified RNA features were divided into multiple
categories, based on their location with respect to the nearest CDS. One of these categories
consists of RNA segments with their own promoter and terminator signals, which were
therefore termed All-independent (All-Indep) segments (Chapter 3 of this thesis). These All-Indep segments are transcriptionally related to srRNAs, and have the potential to function as
such.
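Proportions such as those quoted above can be derived from an expression matrix by simple thresholding. The sketch below is a hypothetical illustration only: the thresholds `low` and `high`, the function name `expression_plasticity` and the toy values are assumptions, and the actual cut-offs applied by Nicolas et al. (17) are not restated here.

```python
def expression_plasticity(expr: dict, low: float, high: float) -> dict:
    """Fractions of genes that are never expressed, highly expressed under at
    least one condition, or highly expressed under every condition.

    expr maps a gene name to its expression values across all conditions.
    """
    n = len(expr)
    never = sum(all(v < low for v in values) for values in expr.values())
    high_some = sum(any(v >= high for v in values) for values in expr.values())
    high_all = sum(all(v >= high for v in values) for values in expr.values())
    return {
        "never_expressed": never / n,
        "high_in_some": high_some / n,
        "high_in_all": high_all / n,
    }


# Invented toy matrix with four genes and three conditions.
toy = {"a": [1, 1, 1], "b": [12, 3, 1], "c": [11, 12, 13], "d": [6, 7, 6]}
print(expression_plasticity(toy, low=5, high=10))
```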
The aim of the present study was to determine whether the wealth of B. subtilis gene
expression data reported by Nicolas et al. (17) can be used to identify true targets of putative
srRNAs amongst a large set of predicted targets. For this purpose, we performed extensive
target predictions for a selected set of putative srRNAs, analyzed the evolutionary conservation
of these srRNAs and their predicted targets, computed enriched functional categories on the
target predictions, performed two expression correlation analyses computed over the 104
condition expression space, and selected those srRNA-mRNA pairs that are co-expressed.
The validity of results obtained through this prediction pipeline was tested by focusing on two
known srRNAs of B. subtilis, namely FsrA/S512 and RsaE/S415 (10, 18). Indeed, the established
role of FsrA/S512 in iron metabolism was pinpointed in our predictions, and additional iron-related candidate members of the FsrA/S512 regulon are suggested. In contrast to the previously
reported data (10), our experiments strongly suggest that FsrA/S512 also has a regulatory role
during the exponential growth phase on the iron-proficient Lysogeny Broth (LB) medium. The
predicted involvement of RsaE/S415 in the regulation of B. subtilis central carbon metabolism
was evidenced through the observed deregulation of the 2-oxoglutarate dehydrogenase OdhA
in an RsaE/S415 mutant. Conserved target predictions on RsaE/S415 also suggest that the
regulation of genes from the functional categories lipid utilization and biosynthesis of cofactors is a
conserved function of RsaE/S415 in a range of Gram-positive bacteria, including Staphylococcus
aureus. Altogether, the procedures and data presented in this study will most likely facilitate
future functional studies on the putative srRNAs of B. subtilis.
Results and Discussion
General description of in silico srRNA target profiling
The conceptual outline of our srRNA target prediction and analysis pipeline is shown in Figure
1A. There are two aspects integrated in our approach. The first addresses the srRNA functions
and the second one addresses the srRNA targets. To assess srRNA functions, we gathered data
on the respective promoters, evolutionary conservation in different genomes, presence of open
reading frames (ORFs), putative secondary structures, lengths and GC contents of a set of 63
putative srRNAs from B. subtilis. A summary of these data is provided in Table 1. The target-focused analyses were initiated with extensive srRNA target predictions, the results of which are
presented in the Supplementary data file predictions and Table S1. Subsequently, five separate
analyses were performed with the goal of enriching for the most likely srRNA targets, and a positive
hit in any one of these five analyses was used to flag the target, as exemplified in Table 2 (marked
with Y). The first flag (marked ‘Conserved’) is representative of the evolutionary conservation
of the srRNA target interaction and this already strongly reduced the number of considered
targets. The second flag (marked ‘Enriched’) indicates the presence of the target in an enriched
functional category of the complete set of predicted targets. Specifically, this analysis can identify
srRNA regulons, as is illustrated for FsrA/S512 (see Query S512 in Table 2). The third flag for
B-cluster enrichment (17) (marked ‘BclusterFlag’) represents an expression correlation-related
analysis based on the presence of multiple genes that share an expression profile within the set of
target predictions for one srRNA (Figure 1B). This shared expression profile might thus reflect
regulation by a particular putative srRNA. The fourth flag (marked ‘PeaksFlag’) represents the
expression correlation with the target under an experimental condition where the putative
srRNA is highly expressed compared to its baseline expression level (Figure 1B). The fifth and
final flag (marked ‘ConditionalFlag’) represents the result of a co-expression analysis that selects
predicted srRNA-target interactions based on the co-expression of srRNAs and their predicted
targets at a certain cut-off under at least one experimental condition. The proportion of highly
significant targets that received a flag in one of the five analyses and the distribution of the
number of flags per target are indicated in Figure 2.
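The way in which the five flags are combined per predicted target can be sketched as follows. This is a minimal illustration rather than the code used in this study: the flag names follow the text, whereas the record layout, the function names and the toy records are hypothetical.

```python
from collections import Counter

# Flag names as given in the text.
FLAGS = ("Conserved", "Enriched", "BclusterFlag", "PeaksFlag", "ConditionalFlag")


def count_flags(target: dict) -> int:
    """Number of analyses in which a predicted target was marked with 'Y'."""
    return sum(1 for flag in FLAGS if target.get(flag) == "Y")


def flag_histogram(targets: list) -> Counter:
    """Distribution of the number of flags per predicted target."""
    return Counter(count_flags(t) for t in targets)


def select_by_flags(targets: list, minimum: int) -> list:
    """Keep only the targets that carry at least `minimum` flags."""
    return [t for t in targets if count_flags(t) >= minimum]


# Invented toy records; the flag assignments are for illustration only.
toy_targets = [
    {"Query": "sX", "Name": "geneA", "Conserved": "Y", "Enriched": "Y",
     "BclusterFlag": "Y", "PeaksFlag": "Y"},
    {"Query": "sX", "Name": "geneB", "Conserved": "Y"},
    {"Query": "sY", "Name": "geneC"},
]
print(flag_histogram(toy_targets))
print([t["Name"] for t in select_by_flags(toy_targets, 3)])
```

Applying a minimum of three or four flags corresponds to the kind of selection that underlies Table S2 and Table 2, respectively.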
The following paragraphs will give a general description of our srRNA target
predictions and the subsequent in silico analyses. These are followed by descriptions of some
srRNA functions predicted by our approach in the case studies section.
General analysis of independently expressed RNA segments
Functional srRNAs are often highly structured, and this RNA structure is essential for their
regulatory function. In general, the secondary structure elements in an srRNA molecule protect
it from being degraded by RNases (29, 30). Strong secondary structure of regulatory RNAs can
therefore be an indication of a function. Nevertheless, it should be noted that the region of an
srRNA that interacts with its target mRNA is on average less structured than the rest of the
molecule, as has been reported for srRNAs in E. coli (31). As detailed in Chapter 3 of this thesis,
the All-Indep RNA segments of B. subtilis that were identified by Nicolas et al. (17) display the
strongest degree of predicted secondary structure and species-level evolutionary conservation.
This might therefore point to an important regulatory function of these RNA segments, possibly
as srRNAs.
Besides the prediction of secondary structure and the analysis of evolutionary
conservation, we wondered whether we could identify possibly subtle roles for Hfq in the
expression of the All-Indep segments of B. subtilis. As indicated above, Hfq is the central
RNA chaperone in Gram-negative bacteria, but it seems to have only a very limited effect on
srRNA regulation in B. subtilis (9, 10, 12, 13). To assess a possible effect of the expression level
of hfq on that of all identified RNA segments, we computed the expression correlation of hfq
with the identified RNA segments in the 104 condition space. When these correlations were
inspected globally per (pooled) category, the All-Indep category, which contains the putative
srRNAs, displayed a significantly increased hfq correlation compared to the other categories of
RNA segments (Figure 3). This seems to suggest that Hfq could somehow be involved in srRNA
regulation in B. subtilis to a larger extent than thus far believed, but only in a very subtle manner.
Interestingly, hfq correlations with segments from the 3’UTR category were also significantly
higher than those for the 5’UTRs and the intergenic RNAs (Figure 3). Notably, 3’UTRs appear
to be frequent sources of srRNAs in Gram-negative bacteria (32), and it is therefore tempting
to speculate that the observed expression correlation between hfq and 3’UTRs is suggestive of
3’UTR-derived srRNAs in B. subtilis as well. Altogether, the observed expression correlations
of All-Indep segments and hfq may point to a general (but very subtle) role of B. subtilis Hfq in
stabilizing srRNA molecules, analogous to what has been observed in Gram-negative bacteria
(33).
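The per-category hfq correlation analysis described above can be sketched as follows, assuming that each RNA segment is represented by a category label and an expression profile with the same condition order as the hfq profile. The data layout, function names and toy profiles are hypothetical, and the subsequent significance testing (Anova with TukeyHSD) is not included.

```python
from collections import defaultdict
from math import sqrt


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Constant profiles (zero variance) are not handled in this sketch.
    return sxy / sqrt(sxx * syy)


def correlations_by_category(hfq_profile: list, segments: dict) -> dict:
    """Group the hfq correlation of every segment by its category.

    segments maps a segment name to a (category, expression profile) pair.
    """
    grouped = defaultdict(list)
    for _name, (category, profile) in segments.items():
        grouped[category].append(pearson(hfq_profile, profile))
    return dict(grouped)


# Invented toy profiles over four conditions (the real data span 104).
hfq = [1.0, 2.0, 3.0, 4.0]
toy_segments = {
    "segA": ("indep", [2.0, 4.0, 6.0, 8.0]),
    "segB": ("5'UTR", [4.0, 3.0, 2.0, 1.0]),
}
print(correlations_by_category(hfq, toy_segments))
```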
Selection and description of putative srRNAs
Having discovered that All-Indep segments (i.e. putative srRNAs) have the highest level of
predicted secondary structure, the largest degree of evolutionary conservation, and the strongest
positive correlations with hfq, we aimed at refining this set to include only the most likely srRNAs.
To do this, we first excluded the All-Indep segments that were annotated as antisense RNAs
(asRNA) of protein-encoding genes. For this purpose, we followed the definition of Nicolas et
al., where overlaps of ≥100 nucleotides or overlaps of ≥50% of the sequence length were used as
criteria to identify asRNAs (17). A first motivation for eliminating the asRNAs was that these
RNAs probably function in cis at the genomic location where they are transcribed. However, it
should be noted that we cannot fully exclude the possibility that some of these segments might
additionally have a function in trans. The second motivation for eliminating asRNAs from the
subset of putative srRNAs was that the sequence of an asRNA is de facto complementary to that of its
sense mRNA. This could lead to an undesired bias in our sequence-based target predictions.
Furthermore, we excluded two segments related to type I toxin-antitoxin modules, namely
BrsH/S978 (34) and as-BrsH/S977 (16), and a known protein-encoding gene. The latter gene
encodes the small basic protein FbpC, but it was annotated as independent segment S834. FbpC
has been implicated in FsrA/S512-mediated srRNA regulation, possibly as an RNA chaperone
(10, 11, 35). Lastly, we added two known B. subtilis srRNAs to the remaining segments. These are
SR1 (12, 36, 37, 38, 39) (referred to by the name of the ORF ykzW in (17)) and CsfG/S547 (40)
(annotated as antisense in (17)). Altogether, this selection of All-Indep RNAs resulted in a set of
63 segments that are hereafter referred to as putative srRNAs (Table 1).
Figure 1. Overview of the srRNA function and target prediction approach
A) Overview of the approach outlined in this manuscript. The ultimate goal of the present studies
was to identify potential new srRNA functions in B. subtilis. This can be approached via a focus on
srRNA functions by studying srRNA deletion phenotypes, or the direct search for true srRNA targets.
In practice these two approaches often overlap as is illustrated in the main text. B) Example plot
of the condition-dependent expression of one srRNA, namely CsfG/S547. This srRNA was arbitrarily
selected for illustration purposes. The B-cluster expression for the segment is included as the
average of the 51 segments of this cluster. Three conditions that could qualify as peak expression
conditions are indicated by the grey zones labelled peak 1 – 3. Note however that the actual peaks
expression analysis only considers two peak expression conditions at maximum. The data and x-axis
are derived from Nicolas et al. (17).
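The asRNA exclusion criterion described above (an overlap of ≥100 nucleotides, or of ≥50% of the sequence length, with a protein-encoding gene) can be sketched as follows. This is a minimal illustration under stated assumptions: coordinates are 1-based and inclusive, the 50% threshold is taken relative to the length of the RNA segment, and only genes on the opposite strand are considered. The record layout and function names are hypothetical.

```python
def overlap_length(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of positions shared by two inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def is_antisense(segment: dict, genes: list,
                 min_nt: int = 100, min_fraction: float = 0.5) -> bool:
    """True if the segment overlaps an opposite-strand gene by at least
    `min_nt` nucleotides or by at least `min_fraction` of its own length."""
    seg_len = segment["end"] - segment["start"] + 1
    for gene in genes:
        if gene["strand"] == segment["strand"]:
            continue  # only opposite-strand genes can be antisense partners
        shared = overlap_length(segment["start"], segment["end"],
                                gene["start"], gene["end"])
        if shared >= min_nt or shared >= min_fraction * seg_len:
            return True
    return False


# Invented coordinates: a 200-nt segment overlapping a gene by 100 nt.
segment = {"start": 1000, "end": 1199, "strand": "+"}
gene = {"start": 1100, "end": 2000, "strand": "-"}
print(is_antisense(segment, [gene]))  # True
```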
Figure 2. Graphical summary of flag attribution to predicted srRNA targets
The top panel illustrates the proportion of all predicted targets (11419 with p-value ≤0.01) that
received a flag in the five different analyses performed in the present study (Conditional, Peaks,
Conserved, B-cluster, Enriched cat.). The lower panel is a histogram of the number of flags per
predicted target (x-axis: number of flags, 0 to 4; y-axis: number of targets). All these targets with
extra information are listed in Table S1. Those targets with three or more flags are presented in
Table S2, and the targets with four flags are listed in Table 2.
Figure 3. Independent segments correlate most strongly with hfq/ymaH
The 104-condition expression data from Nicolas et al. (17) was used to compute the Pearson
correlation between the expression pattern of all the new RNA segments and that of hfq/ymaH
(y-axis: hfq correlation, all conditions; categories: 3', 5', indep, inter). Significance was tested
with Anova with TukeyHSD at 99% confidence. One star (*) indicates significance with p-value
≤ 0.05, two stars (**) indicate significance with p-value ≤ 0.01.
Table 1. Selection of putative srRNAs
Name    #genomes  Z-score  MFEoptim  Length  GC%
ykzW
S2      22        -4.43    -73.9     274     36.5
S72     10        -1.86    -45.3     234     26.1
S111    11        0.16     -28.3     149     39.6
S140    23        -1.22    -97.5     367     40.6
S144    9         -2.57    -19.8     112     28.6
S145    9         -6.78    -70.4     216     37.0
S181    17        -0.74    -115.63   546     31.7
S198    48        -5.09    -58.5     198     32.8
S249    9         -1.24    -200.7    561     52.4
S254    5         0.12     -31.37    176     30.1
S275    21        -2.63    -39.6     157     45.9
S289    14        -2.4     -67.4     269     36.4
S309    20        -2.29    -68       330     27.9
S313    19        -1.73    -41.17    176     40.3
S326    10        -5.27    -32.19    111     30.6
S345    19        0.16     -34.1     154     40.9
S348    22        -3.39    -77.71    263     36.5
S357    19        -6.37    -90.1     297     36.7
S415    57        -2.62    -29.3     126     37.3
S423    19        -1.91    -34.34    187     38.0
S444    9         -3.07    -136.8    502     34.3
S458    9         -1.33    -53       204     42.7
S462    9         0.68     -88.36    367     41.4
S499    8         -0.06    -48.2     212     38.7
S503    9         -1.01    -52.2     219     35.6
S512    22        -4.75    -25.3     102     41.2
S547    61        -4.72    -41.4     124     47.6
S612    12        -0.79    -32       134     38.1
S641    16        0.31     -44.43    286     26.2
S645    8         1.12     -19.9     161     33.5
S653    11        -0.72    -21.3     132     40.9
S659    10        -3.52    -55       179     39.7
S665    19        -1.67    -26.9     110     37.3
S708    18        -0.67    -143.9    628     34.4
S717    17        -4.89    -55.4     154     43.5
S718    16        -2.61    -40.2     135     44.4
S728    21        -1.78    -37.1     111     48.7
S731    9         -0.25    -140.1    499     40.9
S796    6         0.13     -24.4     153     23.5
S797    10        -2.42    -80.18    505     34.9
S809    15        -3.55    -69.4     234     36.8
S849    10        -0.82    -10.71    75      45.3
S857    19        -1.55    -27.5     132     34.1
S863    19        -0.59    -50.4     200     43.0
S877    19        -0.21    -17.96    108     31.5
S903    15        -1.27    -39.2     201     34.8
S907    10        -0.42    -33.84    154     35.1
S912    9         2.18     -27       171     36.8
S968    9         -0.18    -36.1     148     41.2
S1009   13        -0.84    -29.6     152     36.2
S1022   19        -5.22    -44.72    126     42.1
S1024   10        -3.81    -74.2     217     44.2
S1027   22        -1.37    -60.9     220     44.1
S1029   21        -3.01    -43.8     154     35.7
S1052   9         -0.87    -49.9     198     44.4
S1136   24        -3.53    -69.8     220     48.2
S1227   9         1.41     -3.35     107     30.8
S1251   18        0.24     -87.86    388     35.8
S1455   43        -0.35    -68.6     263     42.6
S1495   9         -0.08    -30.4     223     37.2
S1534   2         -2.62    -128.4    425     41.4
S1583   19        -0.84    -29.47    198     33.3
Alt. names and references: SR1 (ykzW; Licht et al, 2005), RsaE (S415; Geissmann et al, 2009),
FsrA (S512; Gaballa et al, 2008), CsfG (S547; Marchais et al, 2011), BsrE and BsrF (Saito et al, 2009),
RnaC (Schmalisch et al, 2010). The per-segment entries of the Predicted sigma factor, Sporulation
and ORF columns are not shown.
We next compared the extent of predicted secondary structure of the selected putative
srRNAs (reflected by Z-scores) to that of all sense RNA segments and all of the independently
expressed RNA segments. From this comparison it followed that the selected putative srRNA
segments display significantly higher levels of secondary structure (i.e. lower Z-scores) than the
other sense RNAs (Figure 4A). There is also a statistically non-significant trend towards higher
secondary structure of the selected segments when they are compared to the other independent
segments (which are mostly asRNAs) (Figure 4B). The fraction of conserved sequences amongst
the selected putative srRNAs was visualized in a heatmap with automatic reordering of rows
(genomes) and columns (RNA segments) (Figure 5). Similar to what was observed for such
a plot of all sense RNA segments (Chapter 3), three clusters of putative srRNAs with different
conservation levels were distinguishable. A first cluster consists of four RNA segments, namely
S1455, S198, RsaE/S415 and CsfG/S547, which are conserved in almost all considered genomes.
Briefly, S1455 represents an un-annotated T-box regulatory mechanism (upstream of the
threonyl-tRNA synthetase gene). S198 is the 5’ leader region of the vmlR gene encoding an
ABC transporter. The S198 RNA segment was nevertheless included here since leader regions
and riboswitches might also function as trans-regulatory RNAs (41). RsaE/S415 is reported as a
regulator of central carbon metabolism in S. aureus (19) and also seems to function as an srRNA
in B. subtilis (18). CsfG/S547 is an srRNA, which is highly conserved within endospore-forming
bacteria (40). A second cluster is composed of putative srRNAs that are only conserved in the
genomes of the 9 included B. subtilis subspecies. The third cluster consists of putative srRNAs
that display an intermediate conservation level, as they are mostly present in B. subtilis sp., B.
amyloliquefaciens sp. and B. atrophaeus (~20 genomes in total) (Figure 5).
It was conceivable that the more structured srRNAs would also be more conserved.
If so, this could be an indication of the importance of the level of secondary structure for
regulation. To test this idea, we plotted the conservation levels of all All-Indep segments against
their predicted secondary structures (Figure 6). The resulting plot illustrates that there are many
highly conserved All-Indep segments (>20 genomes) with intermediate Z-scores (between -2
and 0), but these are mostly asRNAs, which explains their sequence-level conservation.
The most structured All-Indep segments either belong to the selected set of putative srRNAs, or
to the above-mentioned type I toxin-antitoxin systems. Furthermore, Figure 6 shows that there
is no significant relationship between conservation and Z-score for the All-Indep segments
in general. However, a weak but statistically significant relationship does exist for the selected
putative srRNAs (R² of fitted linear model = 0.084, p-value = 0.01).
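The kind of linear fit referred to above can be sketched with ordinary least squares. This is a generic illustration and not the code used for Figure 6: the function name `fit_line` and the numbers are hypothetical, and the p-value computation is omitted.

```python
def fit_line(x: list, y: list) -> tuple:
    """Ordinary least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Invented values standing in for one predictor and one response variable.
slope, intercept, r_squared = fit_line([1, 2, 3, 4], [2, 4, 5, 4])
print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

An R² of 0.084 indicates that only about 8% of the variance is explained by the fitted line, so the relationship, although statistically significant, is weak.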
The selection of putative srRNAs contains eight previously investigated RNA segments.
The first three of these - SR1, FsrA/S512, and RsaE/S415 - will be discussed in more detail in the
case studies later in this manuscript. Briefly, SR1 was identified in a computational approach,
and it represented the first reported srRNA in B. subtilis (42). Following this, a series of papers
by the same authors described multiple functions of this srRNA (12, 36, 37, 38, 39). The second
reported srRNA, FsrA/S512, was identified through a computational prediction of Fur regulatory
regions followed by a closer inspection of the respective downstream regions (10). FsrA/S512
was subsequently shown to be a regulator of the iron-sparing response (10, 11). The third
srRNA, RsaE/S415, was first identified in an expression screen of intergenic regions of S. aureus,
and it was then found to be conserved in the Bacillaceae (18). The remaining five previously
reported srRNAs are less well characterized. The first of these is CsfG/S547 (40). The sequence
Table 1. Selection of putative srRNAs (legend)
Name, name of segment in (17). Alt. name, alternative name for previously reported segments. Predicted sigma factor,
sigma factor regulation predicted by (17). #genomes, number of Bacillus genomes for which a significant Blast hit
was obtained in this study (maximum 62). Sporulation, “Y” for yes serves as an indication for segment expression
exclusively under conditions of sporulation. ORF, “Y” for yes indicates that an ORF was identified in (17). Z-score,
the secondary structure Z-score of the segment compared to shuffled sequences with the same length and nucleotide
composition (a lower Z-score indicates stronger secondary structure). MFEoptim, RNAfold minimum free energy
of the optimal structure used for the computation of secondary structure Z-scores. Length, length of the segment
according to (17). GC%, GC percentage of the segment. Reference, reference marks the publication in which a segment
was first identified and named; if no reference is indicated, the segment was first identified in (17).
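The secondary structure Z-score defined in this legend can be sketched as follows. The sketch assumes a simple mononucleotide shuffle, which preserves length and nucleotide composition, and takes the folding-energy function as a parameter so that a minimum free energy calculation such as that of RNAfold could be supplied. The function `toy_energy` is a hypothetical stand-in that only counts mirrored base pairs; it is not a thermodynamic model, and the number of shuffles is an assumption.

```python
import random
from statistics import mean, stdev

PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def toy_energy(seq: str) -> float:
    """Hypothetical stand-in for a folding energy: -1 per base pair in a
    single hairpin that pairs position i with its mirror position."""
    half = len(seq) // 2
    return -float(sum((seq[i], seq[-1 - i]) in PAIRS for i in range(half)))


def structure_z_score(seq: str, fold_energy, n_shuffles: int = 100,
                      seed: int = 0) -> float:
    """Z-score of the native energy relative to shuffled sequences with the
    same length and composition (lower means more structured)."""
    rng = random.Random(seed)
    native = fold_energy(seq)
    shuffled = []
    for _ in range(n_shuffles):
        letters = list(seq)
        rng.shuffle(letters)
        shuffled.append(fold_energy("".join(letters)))
    return (native - mean(shuffled)) / stdev(shuffled)


# Invented hairpin-like sequence; its shuffles pair less well on average.
print(structure_z_score("GGGGGGAAAACCCCCC", toy_energy) < 0)  # True
```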
83
lag gs
t
rt
p
d8
ar stop
n
alF
lag lag
la
d
r
rF
ve
sta _sto _st
on erF tatio
_
r
i
e
F
e
e
t
_
e
t
t
i
e
s
h
A
A
e
b
e
u
s
s
A
A
ry
k
s
d
k
c
r
l
i
N
N
g
u
u
e
n
m
n
n
m
r
a
no
N
N
o
a
l
l
r
a
R
R
u
a
a
o
o
u
c
v
c
n
c
e
n
t
R
R
o
Q
R
N
L
S
P
s
s
m
m
c
B
C
E
B
P
C
N
A
S111
10 veg
BSU00440 -98 0.000 15 107 -73 42 0.14 B26
Y
Y
Y
Y
4 Biofilm formation
S140
82 mecB BSU22970 -87 0.000 68 121 -34 14 0.12 B474 Y
Y
Y
Y
4 Proteolysis
S140
91 htrA BSU12900 -86 0.000 213 254 -10 34 -0.23 B328 Y
Y
Y
Y
4 Coping with stress - heatshock
S140
115 ftsH BSU00690 -84 0.010 8
26 26 44 -0.09 B36
Y
Y
Y
Y
4 Coping with stress - heatshock
S181
110 purT BSU02230 -91 0.010 62 116 -75 -28 -0.29 B9
Y
Y
Y
Y
4 Biosynthesis / acquisition of nucleotides
S254
138 yxeB BSU39610 -74 0.010 4
43 -29 13 -0.27 B49
Y
Y
Y
Y
4 Acquisition of iron
S275
38 yvaE BSU33570 -72 0.000 105 156 -16 37 -0.28 B523 Y
Y
Y
Y
4 resistance against toxins / antibiotics / based on similarity
S275
63 ykuC BSU14030 -69 0.010 57 104 -4
44 0.20 B274 Y
Y
Y
Y
4 resistance against toxins / antibiotics / based on similarity
S309
294 scoC BSU09990 -85 0.010 154 205 -70 -18 -0.05 B28
Y
Y
Y
Y
4 Regulation of gene expression - transition state regulator
S313
92 efeM BSU38270 -80 0.000 87 155 -73 -3 -0.42 B49
Y
Y
Y
Y
4 Acquisition of iron
S357
58 mtnE BSU13580 -91 0.000 107 194 -68 15 -0.23 B98 Y Y Y Y 4 Biosynthesis / acquisition of amino acids
S357 169 tcyA BSU03610 -82 0.010 196 246 -75 -21 0.05 B16 Y Y Y Y 4 Biosynthesis / acquisition of amino acids
S357 176 dapA BSU16770 -82 0.010 146 209 -9 48 0.17 B41 Y Y Y Y 4 Biosynthesis / acquisition of amino acids - sporulation - essential
S415 89 ndhF BSU01830 -75 0.000 16 55 -1 36 -0.01 B56 Y Y Y Y 4 Electron transport and ATP synthesis
S512 1 yxeB BSU39610 -96 0.000 2 30 -74 -44 0.54 B49 Y Y Y Y 4 Acquisition of iron
S512 3 yfiY BSU08440 -83 0.000 3 29 -51 -24 0.45 B49 Y Y Y Y 4 Acquisition of iron
S512 5 feuA BSU01630 -82 0.000 1 25 -52 -26 0.49 B49 Y Y Y Y 4 Acquisition of iron
S512 22 fhuD BSU33320 -73 0.000 1 21 -61 -39 0.52 B49 Y Y Y Y 4 Acquisition of iron
S512 47 efeU BSU38280 -69 0.000 1 42 -64 -22 0.23 B49 Y Y Y Y 4 Acquisition of iron
S512 84 feuC BSU01610 -64 0.010 3 29 -38 -11 0.40 B49 Y Y Y Y 4 Acquisition of iron
S547 5 frlD BSU32570 -85 0.000 9 91 -31 36 0.14 B593 Y Y Y Y 4 Utilization of nitrogen sources other than amino acids
S547 16 glmS BSU01780 -78 0.000 36 70 -14 24 0.06 B53 Y Y Y Y 4 Biosynthesis of cell wall components - essential
S641 125 murB BSU15230 -95 0.000 164 227 -28 46 -0.17 B215 Y Y Y Y 4 Cell envelope stress proteins - essential
S641 164 ykrA BSU14550 -92 0.000 181 257 -65 17 0.07 B46 Y Y Y Y 4 Unknown function
S641 221 salA BSU01540 -89 0.000 160 232 -35 50 0.10 B46 Y Y Y Y 4 Regulation of gene expression - transition state regulator
S645 198 phrK BSU18920 -79 0.000 21 103 -40 38 -0.09 B28 Y Y Y Y 4 Genetic competence
S659 24 glmS BSU01780 -91 0.000 31 100 -16 50 0.06 B53 Y Y Y Y 4 Biosynthesis of cell wall components - essential
S659 154 pbpD BSU31490 -77 0.010 46 114 -34 36 -0.18 B39 Y Y Y Y 4 Cell wall synthesis
S659 193 yqgS BSU24840 -75 0.010 76 124 -35 17 -0.03 B502 Y Y Y Y 4 Biosynthesis of cell wall components
S718 66 yclP BSU03820 -72 0.000 11 38 -37 -10 -0.52 B49 Y Y Y Y 4 Acquisition of iron
S797 122 lspA BSU15450 -91 0.000 334 399 -53 15 0.26 B5 Y Y Y Y 4 Protein synthesis, modification, and degradation
S797 322 ftsY BSU15950 -83 0.010 358 409 -4 41 -0.24 B5 Y Y Y Y 4 Protein synthesis, modification, and degradation - sporulation
S809 76 ykuN BSU14150 -84 0.010 12 111 -64 43 -0.12 B49 Y Y Y Y 4 Electron transport and ATP synthesis
S863 9 yfmC BSU07520 -83 0.000 75 132 -65 -13 -0.24 B49 Y Y Y Y 4 Acquisition of iron
S912 243 yyzM BSU40939 -75 0.010 111 167 -70 -11 -0.18 B12 Y Y Y Y 4 Unknown function
S968 17 pckA BSU30560 -94 0.000 47 96 -42 3 0.09 B54 Y Y Y Y 4 Carbon core metabolism
S1009 57 yusE BSU32770 -83 0.000 50 92 -24 11 0.07 B598 Y Y Y Y 4 Electron transport and ATP synthesis
S1022 121 abrB BSU00370 -70 0.010 51 82 -10 19 0.12 B22 Y Y Y Y 4 Regulation of gene expression - transition state regulator
S1022 135 adcA BSU02850 -69 0.010 37 83 -57 -7 -0.27 B94 Y Y Y Y 4 Trace metal homeostasis
S1024 60 exoA BSU22010 -75 0.010 41 70 14 44 -0.34 B387 Y Y Y Y 4 DNA replication / based on similarity
S1227 82 bacB BSU37730 -77 0.000 13 49 -26 12 -0.20 B170 Y Y Y Y 4 Biosynthesis of antibacterial compounds
S1455 1 coaE BSU29060 -108 0.000 40 79 -30 16 0.12 B41 Y Y Y Y 4 Biosynthesis of cofactors - essential
S1583 86 gapB BSU29020 -84 0.000 3 110 -75 15 0.11 B54 Y Y Y Y 4 Carbon core metabolism
Table 2. Predicted srRNA targets with four flags (facing page)
Query, putative srRNA name. Rank, rank in the TargetRNA predictions for the Query. Name, name of the predicted
target. Ltag, unique B. subtilis 168 locus tag of the predicted target. Score, TargetRNA_v1 (23) prediction significance
score. Pvalue, TargetRNA_v1 (23) prediction p-value. sRNA_start, start coordinate of putative srRNA in the predicted
target interaction. sRNA_stop, end coordinate of putative srRNA in the predicted target interaction. mRNA_start, start
coordinate of the predicted target in the predicted target interaction relative to start codon. mRNA_stop, end coordinate
of the predicted target in the predicted target interaction relative to start codon. Cor, Pearson correlation between
putative srRNA and predicted target computed over the complete condition space from (17). Bcluster, B-cluster of the
target from (17). Conserved8, flag with “Y” for yes indicating whether the predicted target interaction is conserved in
8 genomes (including B. subtilis 168) or more. Enriched, flag with “Y” for yes indicating whether the predicted target
is part of an enrichment category from Table 3. BclusterFlag, flag with “Y” for yes indicating whether the predicted
target is part of an enriched B-cluster from Table S3. PeaksFlag, flag with “Y” for yes indicating whether the predicted
target is a Peaks expression target from Table S4. NumberFlags, the sum of the number of flags. Annotation, shortened
description of the annotation category of the predicted target. All 11419 predicted targets can be found in Table S1 and
the 746 targets with three or more flags in Table S2.
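The NumberFlags column described in this legend is the count of "Y" entries across the four flag columns. The following is a minimal sketch of that bookkeeping, assuming each table row is available as a dictionary keyed by the column names of the legend; the helper names `number_flags` and `select_by_flags` are hypothetical and not part of the original analysis.

```python
# Column names are taken from the Table 2 legend; the row layout is an assumption.
FLAG_COLUMNS = ("Conserved8", "Enriched", "BclusterFlag", "PeaksFlag")


def number_flags(row):
    """Count how many of the four flag columns carry a 'Y'."""
    return sum(1 for column in FLAG_COLUMNS if row.get(column) == "Y")


def select_by_flags(rows, minimum=3):
    """Keep the rows with at least `minimum` flags (3 or more gives a Table S2-like subset)."""
    return [row for row in rows if number_flags(row) >= minimum]


# Example row transcribed from Table 2 (FsrA/S512 and yxeB).
yxeB = {"Query": "S512", "Name": "yxeB", "Conserved8": "Y", "Enriched": "Y",
        "BclusterFlag": "Y", "PeaksFlag": "Y"}
# Illustrative row with a single flag (not taken from the data).
toy = {"Query": "S1", "Name": "geneX", "Conserved8": "Y", "Enriched": "",
       "BclusterFlag": "", "PeaksFlag": ""}

print(number_flags(yxeB), number_flags(toy))
print([row["Name"] for row in select_by_flags([yxeB, toy])])
```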
[Figure 4 box plots. y-axis in both panels: secondary structure Z-score (-8 to 2). Panel A compares 'All sense' with 'Selection'; panel B compares 'Indep' with 'Selection'. P-values printed in the panels: 0.001 and 0.158.]
Figure 4. Selected segments display stronger predicted secondary structure than other sense RNAs
A) Box plots of the secondary structure Z-score of all sense RNAs compared to the selection of putative srRNAs from
Table 1. A lower Z-score indicates stronger predicted secondary structure. The selected putative srRNA segments
are significantly more structured than the other sense RNAs. B) Same as in A, but the selected putative srRNAs are
compared to all ‘All-Indep’ segments. There is a trend to more secondary structure in the selected putative srRNAs.
P-values shown are from a Welch Two Sample t-test.
[Figure 5 heatmap. Color key with density plot: Value from 0 to 1. One axis is titled 'Selection independent segments' (the individual segment labels are not recoverable from the extracted text). The other axis carries the following genome labels:]
Bacillus_subtilis_subsp._subtilis_str._BSP1
Bacillus_subtilis_BSn5
Bacillus_subtilis_subsp._natto_BEST195
Bacillus_subtilis_QB928
Bacillus_subtilis_subsp._subtilis_str._168
Bacillus_subtilis_subsp._subtilis_str._RO.NN.1
Bacillus_subtilis_subsp._spizizenii_TU.B.10
Bacillus_sp._JS
Bacillus_subtilis_subsp._spizizenii_str._W23
Bacillus_amyloliquefaciens_subsp._plantarum_YAU_B9601.Y2
Bacillus_amyloliquefaciens_Y2
Bacillus_amyloliquefaciens_FZB42
Bacillus_amyloliquefaciens_subsp._plantarum_AS43.3
Bacillus_amyloliquefaciens_subsp._plantarum_CAU_B946
Bacillus_amyloliquefaciens_XH7
Bacillus_amyloliquefaciens_TA208
Bacillus_amyloliquefaciens_LL3
Bacillus_amyloliquefaciens_DSM_7
Bacillus_atrophaeus_1942
Bacillus_anthracis_str._Ames_Ancestor
Bacillus_cereus_Q1
Bacillus_anthracis_str._A0248
Bacillus_anthracis_str._H9401
Bacillus_anthracis_str._Ames
Bacillus_anthracis_str._Sterne
Bacillus_cereus_E33L
Bacillus_thuringiensis_BMB171
Bacillus_cereus_ATCC_10987
Bacillus_cereus_biovar_anthracis_str._CI
Bacillus_cereus_03BB102
Bacillus_cereus_AH187
Bacillus_cereus_NC7401
Bacillus_cereus_AH820
Bacillus_cereus_F837.76
Bacillus_thuringiensis_str._Al_Hakam
Bacillus_cereus_B4264
Bacillus_anthracis_str._CDC_684
Bacillus_thuringiensis_serovar_finitimus_YBT.020
Bacillus_cereus_FRI.35
Bacillus_thuringiensis_serovar_chinensis_CT.43
Bacillus_thuringiensis_Bt407
Bacillus_cereus_G9842
Bacillus_thuringiensis_HD.771
Bacillus_weihenstephanensis_KBAB4
Bacillus_thuringiensis_HD.789
Bacillus_cereus_ATCC_14579
Bacillus_thuringiensis_serovar_konkukian_str._97.27
Bacillus_thuringiensis_MC28
Bacillus_cytotoxicus_NVH_391.98
Bacillus_megaterium_QM_B1551
Bacillus_megaterium_DSM_319
Bacillus_coagulans_36D1
Bacillus_megaterium_WSH.002
Bacillus_coagulans_2.6
Bacillus_cellulosilyticus_DSM_2522
Bacillus_pseudofirmus_OF4
Bacillus_halodurans_C.125
Bacillus_selenitireducens_MLS10
Bacillus_clausii_KSM.K16
Bacillus_licheniformis_ATCC_14580
Bacillus_licheniformis_DSM_13_._ATCC_14580
Bacillus_pumilus_SAFR.032
Figure 5. Conservation heatmap of selected putative srRNAs
The proportion of sequence conservation of selected putative srRNAs in different bacilli is indicated in color code.
Rows and columns were reordered automatically to illustrate the relationship between the conservation level of the
putative srRNAs and the similarity between the included genomes.
Figure 6. Selected putative srRNAs display stronger secondary structure with higher conservation.
The secondary structure Z-scores of putative srRNAs are plotted against the respective species-level conservation.
There is a significant linear relationship between these two parameters for the selected putative srRNAs (R² = 0.084,
p-value = 0.01).
[Figure 6 scatter plot. x-axis: Secondary structure Z-score (-8 to 2); y-axis: Number of genomes conserved (0 to 60); legend: all indep, selection.]
on the opposite strand CsfG/S547 was first annotated by Barrick et al. as a cis-acting regulatory
element termed the “ylbH leader” (43). However, Marchais et al. (40) have convincingly shown
that the region on the opposite strand likely encodes an srRNA, and they also noted its
high conservation in endospore-forming bacteria. The sporulation-specific expression of CsfG/S547
is due to regulation by SigG and SigF (40). The second segment is SurA/S653, which was
identified using microarrays bearing intergenic regions of B. subtilis (44). SurA/S653 is induced at
the onset of sporulation under indirect control of Spo0A. However, this is not the only condition
under which SurA/S653 is induced (17), suggesting additional regulatory functions in processes
other than sporulation. Two other selected putative srRNAs, BsrE/S718 and BsrF/S728, were
identified in a Northern blot analysis of the transcription of 123 intergenic regions in cells grown
in LB medium (34). Lastly, RnaC/S1022 was identified – again using microarrays of intergenic
regions – as a SigD-dependent srRNA expressed during exponential growth on LB medium
(45).
Intriguingly, regulatory RNAs are known to be involved in various differentiation
processes in eukaryotes (6). Since the main developmental pathway of B. subtilis is sporulation,
we wondered whether some of the selected putative srRNAs would be predominantly expressed
under sporulation-inducing conditions. To identify such putative srRNAs, we inspected the
104-condition expression profiles of all segments for those that have a low
baseline expression (≤~9 on a log2 scale ranging from 7 to 16) under most conditions and a higher
expression under sporulation-inducing conditions. This expression pattern was identified for
11 out of the 63 selected RNA segments (Table 1, “Sporulation” column). Future studies will be
necessary to further define the predicted role(s) of these putative srRNAs in spore development.
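The selection rule described above (low baseline expression under most conditions, higher expression under sporulation-inducing conditions) can be written down as a simple filter. The text describes an inspection of the expression profiles and does not quantify "most conditions" or "higher expression", so the function below is only one possible formalization: the parameters `min_quiet_fraction` and the use of the median, as well as the condition names in the example, are assumptions made for illustration.

```python
from statistics import median


def is_sporulation_specific(profile, sporulation_conditions,
                            baseline_max=9.0, min_quiet_fraction=0.8):
    """Flag a segment with low baseline and elevated sporulation expression.

    profile: dict mapping condition name -> log2 expression value.
    baseline_max: the ~9 log2 baseline threshold mentioned in the text.
    min_quiet_fraction: assumed meaning of "most conditions" (not from the text).
    """
    spo = [v for c, v in profile.items() if c in sporulation_conditions]
    other = [v for c, v in profile.items() if c not in sporulation_conditions]
    if not spo or not other:
        return False
    quiet_fraction = sum(v <= baseline_max for v in other) / len(other)
    return quiet_fraction >= min_quiet_fraction and median(spo) > baseline_max


# Toy profiles with invented condition names and values.
sporulation = {"spo_t4", "spo_t6"}
induced = {"LB_exp": 8.1, "LB_stat": 8.4, "M9_exp": 7.9, "heat": 8.8,
           "spo_t4": 12.6, "spo_t6": 13.2}
constitutive = {"LB_exp": 12.0, "LB_stat": 12.5, "M9_exp": 11.8, "heat": 12.1,
                "spo_t4": 12.6, "spo_t6": 13.2}

print(is_sporulation_specific(induced, sporulation))
print(is_sporulation_specific(constitutive, sporulation))
```

A stricter rule could additionally require a minimum fold difference between sporulation and baseline conditions; no such value is given in the text.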
It was previously shown that some srRNAs can exert (part of) their regulatory
functions via the translation of a small ORF within their sequence. Examples of this are SgrS in
E. coli and SR1 in B. subtilis. SgrS regulates the ptsG mRNA via an srRNA-mRNA interaction and
the peptide encoded by SgrS (SgrT) is important for the recovery of E. coli cells from glucose-phosphate
stress (46). SR1 of B. subtilis binds to the mRNA of ahrC via an srRNA-mRNA
interaction while the SR1P peptide regulates the gapA mRNA via binding of SR1P to GapA
(37). For this reason, Nicolas et al. (17) have assessed the presence of ORFs in all identified RNA
segments of B. subtilis. From this analysis it followed that 15 out of the 63 putative srRNAs include
an open reading frame (Table 1), which may be important for their (presumed) regulatory roles.
Target predictions and functional gene enrichment
We predicted targets for all independent RNA segments using the source code of TargetRNA_v1
(23, 27) with expanded settings around the 5’ end of all mRNAs and new RNA segments. The
included target region was set from 75 nucleotides upstream to 50 nucleotides downstream
of the annotated AUG translation start codon. These expanded settings were based on reported
srRNA–target interactions in which srRNAs were shown to bind at more distant locations than
those covered by TargetRNA_v1’s default settings. This choice is supported by the observation
that the default target region settings in TargetRNA_v2 were expanded to the region between
-80 nucleotides and +20 nucleotides relative to the start codon (47). We selected TargetRNA
for our predictions, because it has previously been successfully employed for srRNA target
predictions (8, 27), and because it is solely based on sequence information. We preferred a
solely sequence-based algorithm rather than a target prediction algorithm that takes structural
and conservation elements into account, because the latter algorithms have so far only been
developed and benchmarked for Gram-negative bacteria (24). To what extent srRNA-based
regulation in Gram-positive and Gram-negative bacteria can be compared is unclear at this
point, as is illustrated by the lower importance of Hfq in Gram-positive bacteria. To facilitate the
comparisons, we performed exhaustive target predictions for every segment. This means that
we predicted all possible targets up to p-value 1. The default p-value threshold for TargetRNA_
v1 target predictions is ≤0.01. In total 216248 targets with p-value ≤1 and 11419 targets with
p-value ≤0.01 were predicted for the 63 selected putative srRNAs (Table S1 and Supplementary
data file predictions).
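The expanded target region described above (75 nucleotides upstream to 50 nucleotides downstream of the start codon) can be illustrated with a small sequence-extraction helper. This is a sketch and not the TargetRNA_v1 code: the function name `target_window`, the 0-based coordinate convention, and the choice to count the first base of the start codon as the first downstream position are all assumptions made for the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (upper-case A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def target_window(genome, start_pos, strand="+", upstream=75, downstream=50):
    """Extract the region searched for srRNA interactions around a start codon.

    genome: chromosome sequence as a string.
    start_pos: 0-based genomic index of the first base of the start codon.
    The window is clipped at the sequence ends and returned in mRNA sense.
    """
    if strand == "+":
        begin = max(0, start_pos - upstream)
        end = min(len(genome), start_pos + downstream)
        return genome[begin:end]
    # Minus strand: upstream lies at higher genomic coordinates.
    begin = max(0, start_pos - downstream + 1)
    end = min(len(genome), start_pos + upstream + 1)
    return reverse_complement(genome[begin:end])


# Toy sequence: 80 G's, a start codon, 60 C's.
sense = "G" * 80 + "ATG" + "C" * 60
window = target_window(sense, 80)
print(len(window), window[75:78])

# The same gene placed on the minus strand gives the same window.
antisense_genome = reverse_complement(sense)
minus_pos = len(sense) - 1 - 80
print(target_window(antisense_genome, minus_pos, strand="-") == window)
```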
It is known from previously published studies that a functional srRNA-target interaction
region can be very short (7). However, we noted that predicted interaction regions with prediction
p-values ≤0.01 are occasionally very large, involving an average of ~70 nucleotides of the selected
putative srRNAs. We therefore wondered how the length of the interaction region is related to
the TargetRNA p-value. To inspect this, we plotted the predicted interaction length for multiple
cut-offs of the TargetRNA p-value. As expected, the predicted interaction length is dependent on
the p-value cut-off (Figure 7A). It thus seems that the average predicted interaction length with
a p-value ≤0.01 is large. This is consistent with the notion that Gram-positive bacteria might on
average have longer srRNA interaction regions than Gram-negative bacteria (9).
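The comparison behind Figure 7A, predicted interaction length as a function of the p-value cut-off, amounts to averaging interaction lengths over the predictions that pass each cut-off. A minimal sketch follows, assuming each prediction is available as a `(p_value, sRNA_start, sRNA_stop)` tuple and that the interaction length on the srRNA is `sRNA_stop - sRNA_start + 1`; both the data layout and the inclusive length definition are assumptions, and the example values are invented.

```python
def mean_interaction_length(predictions, cutoffs):
    """Mean srRNA-side interaction length of predictions passing each cut-off.

    predictions: iterable of (p_value, srna_start, srna_stop) tuples.
    Returns a dict cut-off -> mean length (None if nothing passes).
    """
    predictions = list(predictions)
    result = {}
    for cutoff in cutoffs:
        lengths = [stop - start + 1
                   for p, start, stop in predictions if p <= cutoff]
        result[cutoff] = sum(lengths) / len(lengths) if lengths else None
    return result


# Invented toy predictions: low p-values paired with long interactions.
toy_predictions = [(0.001, 10, 79), (0.01, 5, 64), (0.2, 30, 49), (0.9, 40, 51)]
summary = mean_interaction_length(toy_predictions, [0.01, 1.0])
print(summary)
```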
Besides the length of predicted interactions, we also inspected the number of targets
predicted per putative srRNA. Since the number of predicted targets is expected to increase
with the length of the srRNA, we plotted the number of predicted targets against the length of
the sequence. Indeed, this showed that more targets are predicted for longer sequences (R² = 0.159
with p-value <0.001 on all All-Indep points), but this relationship was not detectable
when it was tested on the 63 selected putative srRNAs only (p-value 0.468) (Figure 7B). For RNA
segments of around 150 nucleotides, the number of predicted targets ranges from around 20
to around 500. This wide range is interesting, as it shows that factors other than
sequence length contribute to the number of predicted targets. Thus, it may be that the number
of predicted targets per srRNA reflects the function of the segment. However, the wide
range of predicted targets could also be an artefact caused by the presence of certain frequently
occurring sequence motifs in those RNA segments with many predicted targets.
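The R² values reported for the relationship between sequence length and number of predicted targets come from a simple linear regression. The sketch below shows how slope, intercept and R² follow from the sums of squares; the function name `linear_fit` is hypothetical, the example data are invented, and the p-value of the fit is left out because it requires a t-distribution that the Python standard library does not provide.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-predictor regression, R squared is the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Invented example: segment lengths (nt) and numbers of predicted targets.
lengths = [60, 100, 150, 150, 220, 300]
n_targets = [15, 40, 20, 500, 180, 260]
slope, intercept, r2 = linear_fit(lengths, n_targets)
print(round(slope, 3), round(r2, 3))
```

A low R² with a wide spread at a fixed length, as for the two 150-nucleotide toy segments here, is the pattern the paragraph above describes for the selected putative srRNAs.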
Figure 7. General description of target predictions
A) Predicted interaction length at different TargetRNA p-value cut-offs. B) Relationship between the length of an
srRNA query sequence and the number of predicted targets with p-value ≤0.01. This relationship is absent for the
selection of putative srRNAs.
It may be that srRNAs regulate the expression of multiple genes involved in the
same functional process. Such an srRNA regulon was for instance identified for FsrA/S512,
which targets transcripts from genes involved in iron metabolism (11). In fact, the presence
of srRNA regulons is a common aspect of srRNA regulation in Gram-negative bacteria (26,
28, 48). The likelihood of the non-random prediction of multiple genes involved in the same
functional process can be computed and expressed as a binomial p-value. To do this, we used
the most recent B. subtilis gene annotation from SubtiWiki (http://subtiwiki.uni-goettingen.
de) (49), selected only those predicted targets with p-values ≤0.01, and only considered those
functional categories with a binomial p-value of enrichment of ≤0.05. Enrichment of minimally
one functional category was observed for 46 out of the 63 putative srRNAs analysed (with a
maximum of 4 categories for FsrA/S512). The enriched functional categories are listed in Table
3. Examples from this Table are FsrA/S512, S462, and RsaE/S415, which will be discussed in
detail below. Similarly, we tested whether more genes from a specific regulon than expected
by chance were present in the target predictions. For this purpose, we again computed the binomial
p-values on target predictions with p-values ≤0.01, now using the B. subtilis regulon annotation
from SubtiWiki (49), and only considering regulons with a binomial p-value of enrichment
of ≤0.05. For 47 out of the 63 selected putative srRNAs, the predictions show enrichment of
minimally one regulon (with a maximum number of 4 regulons for FsrA/S512). These enriched
regulons are listed in Table 4. Examples from this Table are FsrA/S512, SR1, and RsaE/S415,
which are to be discussed below.
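The binomial p-value of enrichment used for functional categories and regulons can be sketched as an upper-tail binomial probability. The text does not spell out the exact formulation, so the version below is an assumption: the number of predicted targets of an srRNA is treated as the number of trials, the genome-wide fraction of genes in the category (or regulon) as the success probability, and the p-value as the probability of observing at least the number of category members actually found among the targets. The function name and the example numbers are hypothetical.

```python
from math import comb


def binomial_enrichment_p(hits, n_targets, category_size, genome_size):
    """Upper-tail binomial p-value P(X >= hits).

    hits: predicted targets that belong to the category or regulon.
    n_targets: all predicted targets of the srRNA (p-value <= 0.01).
    category_size / genome_size: background probability of drawing a
    category member by chance (assumed background model).
    """
    p = category_size / genome_size
    return sum(comb(n_targets, i) * p ** i * (1 - p) ** (n_targets - i)
               for i in range(hits, n_targets + 1))


# Invented example: 6 of 180 predicted targets fall in a category that
# holds 40 of 4200 genes (about 1.7 hits would be expected by chance).
p_value = binomial_enrichment_p(6, 180, 40, 4200)
print(p_value <= 0.05)
```

With this definition, a category would be reported as enriched when the returned value is ≤0.05, matching the threshold stated in the text.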
We also wondered whether there might be any groups of genes that are predicted
to be regulated by multiple putative srRNAs. We therefore computed the binomial p-values
of functional enrichment on all 11419 predicted targets of the selected putative srRNAs with
p-values ≤0.01. This led to two functional categories with binomial p-values ≤0.05, namely
proteins of unknown function (p-value 0.03) and SPβ prophage (p-value 0.05). Clearly, the
proteins of unknown function category is not informative. However, the observed enrichment
of SPβ genes as targets for regulation by multiple srRNAs is intriguing, since it is presently
unclear how organisms can specifically silence genomically integrated viruses. Repressing gene
expression from integrated viruses by srRNAs is a plausible mechanism, especially since srRNAs
can rapidly evolve to take on new functions. Analogous to this hypothesis, it was noted from
work on Salmonella that bacterial pathogens use their large number of srRNAs to integrate
horizontally acquired genes into existing posttranscriptional regulatory networks (50). This is
reminiscent of transcription factors that are recruited to tame foreign genes at the DNA level
(50), such as Rok in B. subtilis (51). We thus suggest that srRNAs may play (or have played) a role
in silencing the expression of genes from SPβ, and we propose this as a topic worth exploring.
Evolutionary target conservation as a criterion to identify candidate srRNA targets
The conservation of a predicted srRNA-target interaction might be indicative of the importance
of this interaction over evolutionary time. Thus, testing whether the regulatory interaction is
conserved could be useful when trying to identify the most important targets of a particular
srRNA. During the course of the present analyses, such a comparative genomic approach was
reported to be successful for improving srRNA target predictions (26). In parallel to the reported
approach, we established a bioinformatics pipeline to identify conserved predicted srRNA targets
in B. subtilis. This pipeline predicts srRNA targets in genomes in which the sequence of the
putative srRNA is conserved. Since we were mainly interested in finding actual srRNA targets in
B. subtilis 168, we only considered targets also predicted in B. subtilis 168. Overall, this analysis
reduced the average number of considered targets per srRNA from 181 to 29. For the selected
putative srRNAs, we plotted the number of predicted targets before and after evolutionary target
profiling in Figure 8. S254 and S499 were the only putative srRNAs for which no conserved targets were identified, and this
was linked to their low level of conservation (Table 1, Figure 5). Target interactions that were
predicted in 8 or more genomes (including B. subtilis 168) were indicated with a ‘Y’ (short for yes) flag in the column
‘Conserved8’ in the Table with all TargetRNA predictions with p-values ≤0.01 (Table S1; see
Table 2 for examples). To illustrate these evolutionarily conserved predicted targets, we plotted
them as a network (Figure S1). This network also visualizes the shared targets, sizes of predicted
Query Enriched category Binomial P-value
S2 Information.processing...RNA.synthesis.and.degradation...RNases 0.044
S72 Information.processing...genetics...DNA.replication..based.on.similarity 0.039
S72 Information.processing...genetics...DNA.restriction..modification 0.012
S111 Metabolism...lipid.metabolism...utilization.of.lipids 0.022
S140 Information.processing...protein.synthesis..modification.and.degradation...proteolysis 0.037
S144 Cellular.processes...cell.envelope.and.cell.division...cell.division 0.041
S145 Metabolism...lipid.metabolism...biosynthesis.of.lipids 0.043
S145 Information.processing...genetics...DNA.replication..based.on.similarity 0.023
S181 Cellular.processes...transporters...transporters..other 0.006
S181 Cellular.processes...homeostasis...metal.ion.homeostasis..K..Na..Ca..Mg. 0.030
S181 Metabolism...nucleotide.metabolism...biosynthesis..acquisition.of.nucleotides 0.030
S254 Cellular.processes...homeostasis...acquisition.of.iron 0.048
S254 Information.processing...protein.synthesis..modification.and.degradation...chaperones..protein.folding 0.002
S275 Cellular.processes...transporters...transporters..other 0.018
S275 Cellular.processes...homeostasis...metal.ion.homeostasis..K..Na..Ca..Mg. 0.031
S289 Cellular.processes...homeostasis...acquisition.of.iron..based.on.similarity 0.031
S289 Metabolism...lipid.metabolism...utilization.of.lipids 0.024
S313 Cellular.processes...cell.envelope.and.cell.division...cell.wall..other..based.on.similarity 0.015
S313 Cellular.processes...transporters...transporters..other 0.038
S313 Metabolism...electron.transport.and.ATP.synthesis...electron.transport..other..based.on.similarity 0.004
S326 Metabolism...electron.transport.and.ATP.synthesis...electron.transport..other..based.on.similarity 0.049
S345 Metabolism...amino.acid..nitrogen.metabolism...biosynthesis..acquisition.of.amino.acids 0.027
S345 Metabolism...nucleotide.metabolism...nucleotide.metabolism..other 0.021
S357 Metabolism...amino.acid..nitrogen.metabolism...biosynthesis..acquisition.of.amino.acids 0.011
S415 Cellular.processes...homeostasis...trace.metal.homeostasis..Cu..Zn..Ni..Mn..Mo. 0.037
S415 Metabolism...electron.transport.and.ATP.synthesis...respiration 0.000
S423 Metabolism...additional.metabolic.pathways...iron.metabolism 0.026
S444 Cellular.processes...transporters...transporters..other 0.004
S444 Cellular.processes...homeostasis...pH.homeostasis 0.037
S444 Metabolism...electron.transport.and.ATP.synthesis...electron.transport..other..based.on.similarity 0.002
S458 Metabolism...electron.transport.and.ATP.synthesis...regulators.of.electron.transport 0.038
S458 Metabolism...nucleotide.metabolism...nucleotide.metabolism..other 0.038
S462 Cellular.processes...cell.envelope.and.cell.division...capsule.biosynthesis.and.degradation..based.on.similarity 0.031
S462 Metabolism...nucleotide.metabolism...biosynthesis..acquisition.of.nucleotides 0.023
S503 Cellular.processes...cell.envelope.and.cell.division...cell.wall.synthesis 0.033
S512 Cellular.processes...transporters...ABC.transporters 0.032
S512 Cellular.processes...homeostasis...acquisition.of.iron 0.001
S512 Metabolism...additional.metabolic.pathways...iron.metabolism 0.000
S512 Information.processing...genetics...genetic.competence 0.038
S547 Cellular.processes...transporters...ABC.transporters 0.042
S547 Metabolism...amino.acid..nitrogen.metabolism...utilization.of.nitrogen.sources.other.than.amino.acids 0.042
S547 Information.processing...genetics...genetic.competence 0.018
S641 Cellular.processes...cell.envelope.and.cell.division...cell.wall.synthesis 0.006
S641 Metabolism...additional.metabolic.pathways...biosynthesis.of.cell.wall.components 0.006
S645 Cellular.processes...cell.envelope.and.cell.division...cell.wall.synthesis 0.008
S645 Metabolism...additional.metabolic.pathways...biosynthesis.of.cell.wall.components 0.002
S645 Information.processing...genetics...genetic.competence 0.027
S653 Metabolism...lipid.metabolism...utilization.of.lipids 0.009
S653 Information.processing...RNA.synthesis.and.degradation...transcription 0.010
S659 Cellular.processes...cell.envelope.and.cell.division...cell.wall.synthesis 0.023
S659 Metabolism...additional.metabolic.pathways...biosynthesis.of.cell.wall.components 0.018
S708 Metabolism...additional.metabolic.pathways...biosynthesis.of.cell.wall.components 0.038
S708 Information.processing...genetics...DNA.restriction..modification 0.007
S717 Metabolism...carbon.metabolism...carbon.core.metabolism 0.026
S717 Metabolism...lipid.metabolism...utilization.of.lipids 0.044
S718 Cellular.processes...homeostasis...acquisition.of.iron 0.049
S718 Metabolism...additional.metabolic.pathways...iron.metabolism 0.022
S718 Information.processing...protein.synthesis..modification.and.degradation...protein.secretion 0.038
S731 Metabolism...additional.metabolic.pathways...miscellaneous.metabolic.pathways 0.033
S731 Information.processing...protein.synthesis..modification.and.degradation...proteolysis 0.042
S796 Information.processing...protein.synthesis..modification.and.degradation...chaperones..protein.folding 0.008
S797 Metabolism...nucleotide.metabolism...nucleotide.metabolism..other 0.043
S797 Information.processing...protein.synthesis..modification.and.degradation...protein.secretion 0.004
S809 Metabolism...electron.transport.and.ATP.synthesis...electron.transport..other 0.040
S849 Metabolism...additional.metabolic.pathways...phosphate.metabolism 0.042
S857 Information.processing...genetics...DNA.restriction..modification 0.027
S863 Cellular.processes...homeostasis...acquisition.of.iron..based.on.similarity 0.025
S863 Metabolism...additional.metabolic.pathways...iron.metabolism 0.047
S877 Information.processing...protein.synthesis..modification.and.degradation...protein.modification 0.013
S912 Cellular.processes...cell.envelope.and.cell.division...cell.wall.degradation..turnover 0.040
S912 Information.processing...genetics...DNA.repair..recombination..based.on.similarity 0.023
S968 Cellular.processes...transporters...transporters..other 0.003
S968 Metabolism...amino.acid..nitrogen.metabolism...putative.amino.acid.transporter 0.013
S968 Information.processing...RNA.synthesis.and.degradation...DEAD.box.RNA.helicases 0.008
S1009 Cellular.processes...cell.envelope.and.cell.division...cell.wall..other..based.on.similarity 0.012
S1009 Metabolism...electron.transport.and.ATP.synthesis...electron.transport..other..based.on.similarity 0.020
S1022 Cellular.processes...cell.envelope.and.cell.division...cell.wall..other 0.015
S1022 Cellular.processes...transporters...ABC.transporters 0.019
S1024 Cellular.processes...homeostasis...trace.metal.homeostasis..Cu..Zn..Ni..Mn..Mo. 0.034
S1024 Information.processing...genetics...DNA.replication..based.on.similarity 0.047
S1136 Information.processing...protein.synthesis..modification.and.degradation...chaperone..protein.folding..based.on.similarity 0.048
S1136 Information.processing...protein.synthesis..modification.and.degradation...protein.secretion..based.on.similarity 0.029
S1227 Metabolism...additional.metabolic.pathways...miscellaneous.metabolic.pathways 0.031
S1227 Information.processing...protein.synthesis..modification.and.degradation...chaperone..protein.folding..based.on.similarity 0.030
S1455 Metabolism...additional.metabolic.pathways...biosynthesis.of.cofactors 0.031
S1455 Metabolism...additional.metabolic.pathways...miscellaneous.metabolic.pathways 0.014
S1583 Metabolism...carbon.metabolism...carbon.core.metabolism 0.024
S1583 Information.processing...genetics...DNA.repair..recombination 0.048
S198 No enrichment category NA
S249 No enrichment category NA
S309 No enrichment category NA
S348 No enrichment category NA
S499 No enrichment category NA
ykzW No enrichment category NA
S612 No enrichment category NA
S665 No enrichment category NA
S728 No enrichment category NA
S903 No enrichment category NA
S907 No enrichment category NA
S1027 No enrichment category NA
S1029 No enrichment category NA
S1052 No enrichment category NA
S1251 No enrichment category NA
S1495 No enrichment category NA
S1534 No enrichment category NA
Table 3. Enriched functional categories from target predictions on the selected putative srRNAs
Query, putative srRNA name. Enriched category, enriched category of the putative srRNA computed on all B. subtilis 168 target predictions. Binomial P-value, p-value indicating the significance of the enrichment.
Query Enriched regulon Binomial P-value
S2 DnaA.regulon 0.022
S72 AdaA.regulon 0.017
S72 CssR.regulon 0.029
S111 FadR.regulon 0.005
S140 CodY.regulon 0.011
S144 CcpA.regulon 0.025
S145 Btr.regulon 0.026
S145 CcpC.regulon 0.039
S145 CitB.regulon 0.026
S181 A.box 0.023
S181 AbrB.regulon 0.025
S198 AbrB.regulon 0.034
S249 AseR.regulon 0.041
S249 DnaA.regulon 0.041
S254 DegU.regulon 0.004
S254 FadR.regulon 0.023
S254 Fur.regulon 0.039
S289 AbrB.regulon 0.014
S289 FadR.regulon 0.023
S309 AbrB.regulon 0.001
S326 A.box 0.048
S326 AbrB.regulon 0.004
S326 DegU.regulon 0.021
S345 AhrC.regulon 0.046
S345 FruR.regulon 0.009
S357 A.box 0.026
S357 AraR.regulon 0.047
S415 CcpA.regulon 0.031
S458 A.box 0.012
S499 A.box 0.012
S512 Btr.regulon 0.005
S512 CitB.regulon 0.005
S512 FsrA.regulon 0.007
S512 Fur.regulon 0.000
ykzW CysL.regulon 0.018
S547 glmS.ribozyme 0.031
S612 FruR.regulon 0.044
S641 AbrB.regulon 0.001
S641 DegU.regulon 0.011
S659 glmS.ribozyme 0.035
S708 AhrC.regulon 0.046
S708 BltR.regulon 0.042
S717 BsdA.regulon 0.032
S718 Fur.regulon 0.043
S728 AseR.regulon 0.012
S728 CcpA.regulon 0.024
S728 DeoR.regulon 0.017
S731 FruR.regulon 0.032
S796 Abh.regulon 0.014
S796 CcpC.regulon 0.044
S796 FadR.regulon 0.018
S797 ArsR.regulon 0.021
S797 CtsR.regulon 0.030
S809 CzrA.regulon 0.035
S857 AzlB.regulon 0.018
S877 A.box 0.018
S877 CatR.regulon 0.036
S903 CysL.regulon 0.019
S907 FadR.regulon 0.025
S912 AraR.regulon 0.006
S1009 DnaA.regulon 0.039
S1022 CcpC.regulon 0.011
S1024 ComN.regulon 0.040
S1027 CsoR.regulon 0.038
S1052 AdaA.regulon 0.031
S1052 G.box 0.012
S1227 CymR.regulon 0.004
S1227 FadR.regulon 0.035
S1251 DegU.regulon 0.041
S1495 A.box 0.029
S1495 FMN.box 0.019
S1534 ArsR.regulon 0.049
S1534 GlcT.regulon 0.037
S1583 Abh.regulon 0.023
S1583 AbrB.regulon 0.023
S1583 CcpN.regulon 0.018
S275 No enriched regulon NA
S313 No enriched regulon NA
S348 No enriched regulon NA
S423 No enriched regulon NA
S444 No enriched regulon NA
S462 No enriched regulon NA
S503 No enriched regulon NA
S645 No enriched regulon NA
S653 No enriched regulon NA
S665 No enriched regulon NA
S849 No enriched regulon NA
S863 No enriched regulon NA
S968 No enriched regulon NA
S1029 No enriched regulon NA
S1136 No enriched regulon NA
S1455 No enriched regulon NA
Table 4. Enriched regulons from target predictions on the selected putative srRNAs
Query, putative srRNA name. Enriched regulon, enriched regulon of the putative srRNA computed on all B. subtilis 168 target predictions. Binomial P-value, p-value indicating the significance of the enrichment.
srRNA regulons and the possible relationships between several putative srRNAs (Figure S1).
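The Conserved8 flag can be illustrated with a small counting routine. The sketch below is not the pipeline used in this study. It assumes that the predictions for each genome are already available as sets of (srRNA, target) pairs, with targets mapped to their B. subtilis 168 orthologues, and it uses the "8 genomes or more, including B. subtilis 168" definition from the Table 2 legend. The function name, the data layout and the toy genomes are hypothetical.

```python
def conserved_flags(predictions_by_genome, reference, min_genomes=8):
    """Flag reference-genome predictions that recur in enough genomes.

    predictions_by_genome: dict genome name -> set of (srna, target) pairs
        predicted in that genome (targets given as reference locus names).
    reference: genome whose predictions are being flagged; it counts
        towards the total, as in the Table 2 legend.
    Returns a dict (srna, target) -> 'Y' or ''.
    """
    flags = {}
    for pair in predictions_by_genome[reference]:
        n_genomes = sum(pair in predicted
                        for predicted in predictions_by_genome.values())
        flags[pair] = "Y" if n_genomes >= min_genomes else ""
    return flags


# Toy example with three invented genomes and a lowered threshold.
toy = {
    "ref": {("S512", "yxeB"), ("S512", "geneX")},
    "genome_a": {("S512", "yxeB")},
    "genome_b": {("S512", "yxeB"), ("S512", "geneY")},
}
flags = conserved_flags(toy, "ref", min_genomes=3)
print(flags[("S512", "yxeB")], repr(flags[("S512", "geneX")]))
```

Only pairs predicted in the reference genome are returned, which mirrors the restriction to targets that are also predicted in B. subtilis 168.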
For every putative srRNA, we computed the binomial p-value of functional
enrichment on the evolutionarily conserved targets and only considered functional groups with a
binomial p-value of ≤0.05. Enrichment of at least one functional category was observed for 45
of the 63 putative srRNAs, with a maximum of 4 functional categories for S659 and RnaC/S1022
(Table 5). In some cases, the enriched categories were identical to those
obtained for B. subtilis 168 only (compare Tables 3 and 5). We hypothesize that this
provides additional information to predict the function and most likely targets of the respective
srRNAs. Examples from Table 5 will be discussed in detail for FsrA/S512, S462, SR1, and RsaE/S415
in the case studies below.
Altogether, we believe that true and important srRNA-target interactions are likely
evolutionarily conserved and can be identified through the analyses described in the previous
paragraphs. However, the detection of such conserved potential srRNA-target interactions does
not automatically make them true interactions. It is conceivable that there may be other reasons,
independent of srRNA regulation, for the conservation of both interacting sequences. Such
phylogenetic inertia - the influence of the ancestor on the descendant - makes it impossible to
compute a statistical likelihood for the relevance of the predicted conservation of srRNA-target
interactions.
Co-expression analyses for improved prediction of potential srRNA-target interactions
SrRNAs can regulate their targets in a wide variety of ways, most simply divided into directly
triggering degradation of srRNA–mRNA duplexes, or inhibiting the translation of the mRNA
(48). It is generally believed that even when an srRNA solely inhibits translation, this will lead to
some degradation of the mRNA (52), mainly because this srRNA-bound mRNA is not protected
by elongating ribosomes. While there are known exceptions to this rule (30), one can take the
degradation of RNA duplexes as a premise and, accordingly, it can be anticipated that for the
majority of srRNA targets (small) differences in mRNA abundance will be apparent, which
correlate with changes in the abundance of the respective srRNA. Thus, in transcriptome data
across many different growth conditions, one might expect to see a correlation in the expression
[Figure 8 plot: number of targets (y-axis, 0 to 600) for each putative srRNA (x-axis); legend: all targets, conserved targets]
Figure 8. Number of predicted targets and conserved predicted targets of putative srRNAs
For every putative srRNA in the selection the total number of predicted targets was plotted, together with the number
of these targets that are also predicted in 8 or more Bacillus genomes (i.e. obtained a Conserved flag). For many RNA
segments, especially the ones with large numbers of initially predicted targets, the conservation requirement removes
the majority of considered targets.
Table 5. Enriched functional categories from evolutionary conserved predicted targets
Query, putative srRNA name. Enriched category, enriched category of the putative srRNA computed on the predicted target interactions that are conserved in 8 species (and including B. subtilis 168) or more. Binomial P-value, p-value indicating the significance of the enrichment.
Query | Enriched category | Binomial P-value
S2 | Metabolism...amino.acid..nitrogen.metabolism...biosynthesis..acquisition.of.amino.acids | 0.026
S2 | Information.processing...RNA.synthesis.and.degradation...RNases | 0.012
S140 | Cellular.processes...cell.envelope.and.cell.division...cell.division | 0.013
S144 | Metabolism...nucleotide.metabolism...metabolism.of.signalling.nucleotides | 0.023
S181 | Metabolism...electron.transport.and.ATP.synthesis...electron.transport..other..based.on.similarity | 0.039
S198 | Metabolism...lipid.metabolism...biosynthesis.of.lipids | 0.022
S198 | Information.processing...RNA.synthesis.and.degradation...transcription | 0.011
S249 | Information.processing...genetics...DNA.condensation..segregation | 0.013
S289 | Cellular.processes...cell.envelope.and.cell.division...cell.wall.synthesis | 0.037
S309 | Cellular.processes...cell.envelope.and.cell.division...cell.division | 0.044
S309 | Metabolism...additional.metabolic.pathways...iron.metabolism | 0.030
S309 | Information.processing...genetics...DNA.replication..based.on.similarity | 0.048
S326 | Metabolism...electron.transport.and.ATP.synthesis...electron.transport..other..based.on.similarity | 0.031
S345 | Metabolism...amino.acid..nitrogen.metabolism...biosynthesis..acquisition.of.amino.acids | 0.039
S345 | Metabolism...nucleotide.metabolism...utilization.of.nucleotides | 0.042
S357 | Metabolism...amino.acid..nitrogen.metabolism...biosynthesis..acquisition.of.amino.acids | 0.037
S415 | Metabolism...electron.transport.and.ATP.synthesis...respiration | 0.001
S423 | Metabolism...electron.transport.and.ATP.synthesis...respiration | 0.026
S458 | Metabolism...electron.transport.and.ATP.synthesis...regulators.of.electron.transport | 0.013
S458 | Metabolism...nucleotide.metabolism...nucleotide.metabolism..other | 0.013
S462 | Cellular.processes...cell.envelope.and.cell.division...cell.shape | 0.026
S503 | Information.processing...protein.synthesis..modification.and.degradation...protein.secretion | 0.047
S512 | Cellular.processes...transporters...ABC.transporters | 0.004
S512 | Cellular.processes...homeostasis...acquisition.of.iron | 0.000
S512 | Metabolism...additional.metabolic.pathways...iron.metabolism | 0.000
ykzW | Metabolism...nucleotide.metabolism...utilization.of.nucleotides | 0.040
ykzW | Metabolism...additional.metabolic.pathways...sulfur.metabolism | 0.029
S547 | Cellular.processes...transporters...ABC.transporters | 0.013
S641 | Cellular.processes...cell.envelope.and.cell.division...cell.wall.synthesis | 0.038
S641 | Metabolism...electron.transport.and.ATP.synthesis...electron.transport..other | 0.012
S641 | Metabolism...additional.metabolic.pathways...biosynthesis.of.cell.wall.components | 0.018
S645 | Cellular.processes...cell.envelope.and.cell.division...cell.wall.synthesis | 0.012
S645 | Cellular.processes...cell.envelope.and.cell.division...cell.wall..other | 0.036
S645 | Metabolism...additional.metabolic.pathways...biosynthesis.of.cell.wall.components | 0.007
S653 | Cellular.processes...transporters...transporters..other | 0.038
S659 | Cellular.processes...cell.envelope.and.cell.division...cell.wall.synthesis | 0.020
S659 | Cellular.processes...transporters...phosphotransferase.systems | 0.048
S659 | Metabolism...amino.acid..nitrogen.metabolism...biosynthesis..acquisition.of.amino.acids | 0.013
S659 | Metabolism...additional.metabolic.pathways...biosynthesis.of.cell.wall.components | 0.041
S717 | Metabolism...carbon.metabolism...carbon.core.metabolism | 0.003
S728 | Metabolism...carbon.metabolism...utilization.of.specific.carbon.sources | 0.040
S731 | Information.processing...protein.synthesis..modification.and.degradation...translation | 0.005
S796 | Metabolism...nucleotide.metabolism...biosynthesis..acquisition.of.nucleotides | 0.012
S797 | Metabolism...lipid.metabolism...biosynthesis.of.lipids | 0.016
S809 | Information.processing...genetics...DNA.condensation..segregation | 0.015
S849 | Information.processing...protein.synthesis..modification.and.degradation...chaperone..protein.folding..based.on.similarity | 0.044
S857 | Cellular.processes...cell.envelope.and.cell.division...cell.division | 0.048
S863 | Cellular.processes...transporters...ABC.transporters | 0.036
S863 | Cellular.processes...homeostasis...acquisition.of.iron..based.on.similarity | 0.014
S863 | Metabolism...additional.metabolic.pathways...iron.metabolism | 0.015
S907 | Cellular.processes...transporters...transporters..other | 0.027
S912 | Cellular.processes...cell.envelope.and.cell.division...cell.wall.degradation..turnover | 0.036
S912 | Information.processing...RNA.synthesis.and.degradation...DEAD.box.RNA.helicases | 0.031
S968 | Cellular.processes...transporters...transporters..other | 0.049
S968 | Metabolism...carbon.metabolism...carbon.core.metabolism | 0.024
S968 | Metabolism...amino.acid..nitrogen.metabolism...putative.amino.acid.transporter | 0.049
S1022 | Cellular.processes...cell.envelope.and.cell.division...cell.wall..other | 0.001
S1022 | Cellular.processes...cell.envelope.and.cell.division...cell.division | 0.037
S1022 | Information.processing...genetics...DNA.replication | 0.030
S1022 | Information.processing...genetics...DNA.repair..recombination | 0.028
S1024 | Information.processing...genetics...DNA.replication..based.on.similarity | 0.019
S1052 | Information.processing...genetics...DNA.condensation..segregation | 0.045
S1136 | Cellular.processes...cell.envelope.and.cell.division...cell.shape | 0.036
S1227 | Cellular.processes...homeostasis...metal.ion.homeostasis..K..Na..Ca..Mg. | 0.021
S1227 | Metabolism...additional.metabolic.pathways...miscellaneous.metabolic.pathways | 0.032
S1227 | Information.processing...protein.synthesis..modification.and.degradation...chaperone..protein.folding..based.on.similarity | 0.042
S1455 | Metabolism...additional.metabolic.pathways...biosynthesis.of.cofactors | 0.020
S1495 | Cellular.processes...cell.envelope.and.cell.division...cell.wall..other..based.on.similarity | 0.047
S1495 | Metabolism...nucleotide.metabolism...nucleotide.metabolism..other | 0.034
S1495 | Metabolism...additional.metabolic.pathways...biosynthesis.of.cofactors | 0.039
S1583 | Metabolism...carbon.metabolism...carbon.core.metabolism | 0.037
S1583 | Information.processing...genetics...DNA.replication..based.on.similarity | 0.047
S1583 | Information.processing...RNA.synthesis.and.degradation...RNases | 0.029
S72 | No enrichment category | NA
S111 | No enrichment category | NA
S145 | No enrichment category | NA
S275 | No enrichment category | NA
S313 | No enrichment category | NA
S348 | No enrichment category | NA
S444 | No enrichment category | NA
S612 | No enrichment category | NA
S665 | No enrichment category | NA
S708 | No enrichment category | NA
S718 | No enrichment category | NA
S877 | No enrichment category | NA
S903 | No enrichment category | NA
S1009 | No enrichment category | NA
S1027 | No enrichment category | NA
S1029 | No enrichment category | NA
S1251 | No enrichment category | NA
S1534 | No enrichment category | NA
level of an srRNA and its actual target(s). To test this idea, we computed the Pearson correlation
between all predicted srRNA-target pairs across the 104 conditions analysed by Nicolas et
al. (17). However, no significant difference was detectable when the pair-wise expression
correlations of potential srRNA-target pairs with predicted TargetRNA p-values of ≤0.01 were
compared to those of pairs with non-significant prediction p-values (>0.50) (Figure 9). Strong
pair-wise correlations thus occur both between significantly predicted targets and between
virtually random srRNA-mRNA pairs. It therefore seems that no reliable indication of
srRNA-mediated regulation can be obtained by inspecting the pair-wise correlations in
expression alone. To find other ways of further improving the predicted srRNA-target
interactions, we decided to apply two other expression-based methods for identifying more
likely srRNA targets, namely B-cluster enrichment and Peaks expression correlation.
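The comparison of pair-wise correlations described above can be sketched as follows. This is a minimal illustration and not the code used in the study: the function names, the dictionary layout of the expression data, and the toy values are assumptions. Only the two TargetRNA p-value groups (≤0.01 and >0.50) are taken from the text.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles.

    Profiles must not be constant, otherwise the denominator is zero.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlations_by_group(expression, pairs, significant=0.01, control=0.50):
    """Split srRNA-target correlations into a significant and a control group.

    expression: name -> list of expression levels, one per condition
    pairs: iterable of (srRNA, target, TargetRNA p-value)
    """
    groups = {"significant": [], "control": []}
    for srna, target, p_value in pairs:
        r = pearson(expression[srna], expression[target])
        if p_value <= significant:
            groups["significant"].append(r)
        elif p_value > control:
            groups["control"].append(r)
    return groups


# Toy data: names and values are hypothetical.
expression = {
    "srRNA_X": [8.0, 9.0, 12.0, 14.0],
    "geneA": [13.0, 12.0, 9.0, 7.0],
    "geneB": [9.0, 9.5, 12.5, 13.0],
}
pairs = [("srRNA_X", "geneA", 0.004), ("srRNA_X", "geneB", 0.80)]
groups = correlations_by_group(expression, pairs)
```

Comparing the two lists in `groups` (for instance by a kernel density estimate, as in Figure 9) shows whether predicted pairs are more strongly correlated than the control pairs.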
For the B-cluster enrichment analysis, we exploited a correlation analysis that was
previously performed by Nicolas et al. (17). In this analysis all expressed genes and RNA segments
were clustered into three types of clusters based on the pair-wise correlations of their expression
levels. The most strongly co-expressed segments (with Pearson pairwise correlation coefficient
ρ≥0.8) were assigned to A-clusters, and the weakly co-expressed segments (ρ=0.4) were assigned
to C-clusters. Genes with an intermediate level of correlation (ρ=0.6) were clustered together in
B-clusters (for an example plot of a B-cluster profile see Figure 1B). To achieve a proper
trade-off between sensitivity and selectivity, we decided to use these intermediate B-clusters
for our analysis. Notably, when an srRNA has an effect on the expression level of a predicted
target, this
target may be found in another B-cluster. Likewise, if an srRNA regulates multiple targets, it
may have a similar effect on the expression level of these targets. These predicted targets are thus
more likely to end up in the same B-cluster. The applied B-cluster analysis therefore tests whether
particular srRNAs have a more than randomly expected number of predicted targets from one
B-cluster amongst their predicted targets. This can be assessed with an enrichment analysis,
as was done before for the enrichment of functional categories. Accordingly, we computed
B-cluster enrichment on the set of predicted targets for every putative srRNA and used a p-value
cut-off for enrichments of ≤0.05. For 50 of the 63 selected putative srRNAs, this resulted in
an enrichment of one or more B-clusters. The enriched B-clusters for every putative srRNA are listed in
Table S3 in the Supplementary data, together with the respective enrichment p-values. Notably,
Figure 9. Pearson correlation between putative srRNAs and their predicted targets at different p-value cut-offs
The expression correlations of the indicated group of potential srRNA-target pairs were computed. For each of these distributions a kernel density estimate is shown. Expression correlations between significant srRNA-target pairs are only very slightly different from control srRNA-target pairs with an interaction p-value of >0.5.
the number of false-positive B-clusters is expected to increase strongly with increasing p-value,
and especially those B-clusters with very high enrichment (i.e. the B-clusters with the lowest
p-values) seem relevant. All predicted targets of a particular srRNA that are part of an enriched
B-cluster received a B-cluster Flag (Table S1; examples are shown in Table 2). Specific examples
of targets with a B-cluster flag will be discussed for FsrA/S512, SR1, and RsaE/S415 in the case
studies.
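The B-cluster flagging described above can be sketched as follows. The sketch assumes a one-sided binomial test against the background frequency of each B-cluster. The function names, the gene-to-cluster mapping, and the toy data are hypothetical, and the actual computation may differ in detail from what is shown here.

```python
from collections import Counter
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def bcluster_flags(targets, cluster_of, alpha=0.05):
    """Return the predicted targets that lie in an enriched B-cluster.

    targets: predicted targets of one putative srRNA
    cluster_of: gene -> B-cluster identifier for all clustered genes (background)
    """
    background = Counter(cluster_of.values())
    total = len(cluster_of)
    clustered = [t for t in targets if t in cluster_of]
    observed = Counter(cluster_of[t] for t in clustered)
    enriched = {
        cluster
        for cluster, k in observed.items()
        if upper_tail(k, len(clustered), background[cluster] / total) <= alpha
    }
    return {t for t in clustered if cluster_of[t] in enriched}


# Toy background: 100 genes, 10 of them in cluster "B1" (hypothetical data).
cluster_of = {f"g{i}": ("B1" if i < 10 else "B2") for i in range(100)}
targets = ["g0", "g1", "g2", "g3", "g50"]
flagged = bcluster_flags(targets, cluster_of)
```

In this toy case four of the five targets fall into a cluster that holds only 10% of all genes, so these four targets would receive a B-cluster flag.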
As pointed out above, the Pearson expression correlations over the complete condition
space are not indicative of true srRNA-target pairs (Figure 9). On the other hand, the B-cluster
analysis identified predicted srRNA targets that are significantly co-expressed in the complete
condition space, perhaps due to regulation by the respective srRNA. Notably, however, neither of
these approaches focused on the specific expression correlation between a particular srRNA
molecule and its predicted target under conditions where the expression level of the putative
srRNA changes. If an srRNA is, for instance, specifically induced under stress conditions,
further inspection of this condition would be relevant to distinguish negatively or positively
correlated predicted targets. For such high expression conditions, we designed a dedicated
approach with the goal of identifying correlated targets that are specific for at least one of these
conditions. To this end, we used similarities in all 269 tiling array hybridizations (corresponding
to 104 expression conditions) that were computed previously by Nicolas et al. (17). The
expression profiles of all segments were plotted against an x-axis that was grouped according
to the computed level of similarity. The resulting plots facilitated the selection of environmental
conditions that are most similar to the peak conditions, but differ in the levels of putative
srRNA expression (Figure 1B). This peak expression analysis was built up as follows. Firstly,
high expression conditions for every putative srRNA were extracted from the compendium
of the 269 hybridizations. Such high expression conditions were designated as peak srRNA
expression conditions. As a control for each peak srRNA expression condition, the six tiling array
hybridizations most related to the peak srRNA expression condition (i.e. the closest conditions
on the x-axis of the plot in Figure 1B), where the expression of the putative srRNA was below a
cut-off, were extracted. These were then called control for peak srRNA expression. Secondly, the
correlation between the putative srRNAs with their possible targets was computed for the peak
srRNA expression condition with respect to the control for peak srRNA expression (see Materials
and Methods for the details of these computations). Thirdly, to identify relevant predictions,
the resulting data were filtered in three ways: 1) the p-values of peak correlations had to be
significant (≤0.05), 2) the TargetRNA prediction p-value was set at its default (≤0.01), and 3)
only those srRNA-target pairs where the peak correlation was larger (or smaller) than the overall
correlation were retained. The latter pairs were defined as those pairs where the absolute ratio
of (peak correlation - overall correlation) / overall correlation was ≥1.5. This target selection
approach resulted in the identification of 4305 targets in total for the 63 putative srRNAs.
As was done above, these targets obtained a flag, called the Peaks flag (Table S1; for specific
examples see Table 2). The obtained results suggest that this analysis can help to determine the
conditional dependency of srRNA-target regulation. For example, if the correlation between a
putative srRNA and its target is close to 0 when calculated over all 104 conditions but is -0.4
in the Peaks analysis, the putative srRNA could be responsible for this difference. Based on such
observations, detailed experiments in the relevant conditions can be designed. Examples of
targets with a Peaks flag will be discussed for SR1, S462, and RsaE/S415 in the case studies below.
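The selection of control hybridizations and the three Peaks filters can be sketched as below. This is an illustration under stated assumptions: the function names and toy values are hypothetical, the similarity-ordered x-axis is represented simply by list position, and the handling of an overall correlation of exactly zero is a choice made for this sketch. The thresholds (six controls, ≤0.05, ≤0.01, ratio ≥1.5) are those given in the text.

```python
def control_indices(srna_expr, peak_idx, cutoff, n_controls=6):
    """Pick the hybridizations closest to the peak with srRNA expression below cutoff.

    srna_expr is assumed to be ordered along the similarity-grouped x-axis,
    so distance in list position stands in for similarity of conditions.
    """
    candidates = [
        i for i, level in enumerate(srna_expr) if i != peak_idx and level < cutoff
    ]
    candidates.sort(key=lambda i: (abs(i - peak_idx), i))
    return sorted(candidates[:n_controls])


def peaks_flag(peak_r, peak_p, overall_r, targetrna_p,
               max_peak_p=0.05, max_targetrna_p=0.01, min_ratio=1.5):
    """Apply the three filters: peak p-value, TargetRNA p-value, correlation ratio."""
    if peak_p > max_peak_p or targetrna_p > max_targetrna_p:
        return False
    if overall_r == 0:
        # Ratio is undefined; assumption: any non-zero peak correlation counts.
        return peak_r != 0
    return abs((peak_r - overall_r) / overall_r) >= min_ratio


# Toy srRNA profile over ten hybridizations (hypothetical values), peak at index 2.
srna_expr = [8, 9, 14, 8, 13, 9, 8, 8, 9, 8]
controls = control_indices(srna_expr, peak_idx=2, cutoff=12)

# Overall correlation near zero, peak correlation of -0.4: the pair is retained.
retained = peaks_flag(peak_r=-0.4, peak_p=0.01, overall_r=0.05, targetrna_p=0.005)
```

Because the ratio divides by the overall correlation, pairs with an overall correlation near zero pass the third filter easily, which matches the example given in the text.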
Previous studies on srRNA regulation have shown that the expression level of an srRNA
is crucial for its mode of regulation (5, 53), and that an srRNA can have a (small) effect already
when it is expressed at a similar or lower level than the target (5). However, it is believed that for
strong target regulation, the srRNA should be expressed to a greater extent than the target. In
the foregoing paragraphs, we addressed multiple ways of selecting more likely srRNA targets,
but did not yet take into account this simple requirement of srRNA regulation. By specifically
assessing the expression of both the putative srRNA and its target in any experimental condition,
or their co-expression, we aimed to exclude those targets that are not co-expressed at a
considerable level in one of the 104 conditions. To achieve this, we first inspected the number of
targets that remained with different cut-offs of expression levels for both the putative srRNAs and
their targets (Figure 10). This was done by selecting all expression conditions in which the
expression level of the predicted target exceeded a cut-off that was varied from 8 to 16.
Similarly, all expression conditions in which the expression level of the putative srRNA
exceeded a cut-off that was varied from 10 to 16 were selected. Subsequently, it was
tested for every combination of expression cut-offs whether the putative srRNA and its predicted
target were still both present in at least one expression condition. As pointed out before, the
complete set of predicted targets with p-values ≤0.01 comprised 11419 pairs (Table S1). Figure 10 shows
that increasing the stringency of the target expression level reduced the number of predicted
targets faster than doing the same for the putative srRNAs. This shows that srRNAs are generally
expressed at higher levels than their predicted targets. To be selective, but at the same time not
remove too many targets, we selected those srRNA-target pairs with a minimum expression
level for the srRNA of 12 and a minimum expression level of the target of 11. This eliminates
3026 predicted srRNA-target pairs, representing a decrease of 26.5% (Figure 10). The remaining
73.5% of the predicted srRNA-target pairs that met this threshold obtained a Conditional flag
(Table S1; for specific examples see Table 2). It should however be noted here that, as with the
other presented selection analyses, one cannot exclude the possibility that removed predicted
srRNA-target pairs may eventually turn out to be real.
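The co-expression requirement can be sketched as follows. The function names and toy profiles are hypothetical, and treating the cut-offs as inclusive minima (≥) is an assumption of this sketch. The cut-offs of 12 and 11 and the pair counts are taken from the text.

```python
def conditional_flag(srna_expr, target_expr, min_srna=12.0, min_target=11.0):
    """True if srRNA and target both reach their cut-off in at least one condition.

    Both lists hold expression levels for the same conditions in the same order.
    """
    return any(
        s >= min_srna and t >= min_target for s, t in zip(srna_expr, target_expr)
    )


def remaining_pairs(pairs, expression, srna_cutoffs, target_cutoffs):
    """Count the pairs that pass for every combination of cut-offs (cf. Figure 10)."""
    return {
        (sc, tc): sum(
            conditional_flag(expression[s], expression[t], sc, tc) for s, t in pairs
        )
        for sc in srna_cutoffs
        for tc in target_cutoffs
    }


# Toy profiles over three conditions (hypothetical values).
expression = {
    "srRNA_X": [10.0, 13.0, 9.0],
    "geneA": [12.0, 11.5, 8.0],
    "geneB": [12.5, 10.0, 12.0],
}
pairs = [("srRNA_X", "geneA"), ("srRNA_X", "geneB")]
grid = remaining_pairs(pairs, expression, srna_cutoffs=[10, 12], target_cutoffs=[8, 11])

# Arithmetic check of the numbers reported in the text.
total_pairs = 11419
removed_pairs = 3026
kept_pairs = total_pairs - removed_pairs
removed_percent = round(100 * removed_pairs / total_pairs, 1)
```

With the reported numbers, 8393 pairs remain, and the removed fraction rounds to 26.5%.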
[Figure 10 plot: "Remaining targets conditional subsetting. In grey n=11419"; x-axis: sRNA expression level (10 to 16); y-axis: target expression level (8 to 16)]
Figure 10. Remaining targets of putative srRNAs in co-expression analysis
The fraction of remaining targets for every combination of putative srRNA expression levels and predicted target expression levels (black circles) is plotted. The surface of the grey circles represents the total number of predicted targets of the selection of putative srRNAs. It follows that sRNA expression is generally higher than that of its predicted targets.
Identification of srRNA seed regions
Seed regions are conserved and accessible regions of srRNAs that directly mediate srRNA-target
interaction(s). Other regions of the srRNA molecule could for instance only be important for
stability. Seed regions in srRNAs from Gram-negative bacteria are often present at the 5’ end of
the srRNA molecule, and this is reminiscent of eukaryotic microRNAs, which select multiple
targets by short Watson–Crick pairing of a 5’ located conserved “seed” (54). The concept of seed
regions is now established as a way of improving target predictions (55), and conferring srRNA
regulation to heterologous targets by Hfq-binding srRNAs (54, 56). Seed regions can help to
improve target predictions since the length of the query sequence is strongly reduced and,
thereby, also the number of expected false-positive predictions is minimized (55). Notably, it
has been shown that srRNA regulation can even be conferred to heterologous targets by placing
only the seed region at the 5’ end of a synthetic RNA or srRNA scaffold (54, 56).
Seed regions have not yet been reported in studies on B. subtilis srRNAs. However, it
has been reported that FsrA/S512 particularly uses a CCCCUCU sequence for target regulation,
and this could thus form the core of (one of its) seed region(s) (11). Likewise, the srRNA RsaE
of S. aureus (RsaE/S415 in B. subtilis) was reported to be a member of a class of srRNAs that
contain a conserved UCCC sequence motif (18). Since this motif was present in many predicted
srRNA-target interactions (18) it may also be part of a seed region.
It has been suggested that seed regions are the generally conserved and unstructured
regions of an srRNA molecule (54). Such regions could be identified from our analysis, since
we have investigated the conservation of all putative srRNAs in the afore-mentioned set of 62
Bacillus genomes. These sequences are formatted such that sequence alignments can be easily
made. The FASTA file with all BLAST results for the selected putative srRNAs is part of our
Supplementary data file predictions. Sequence alignments can for instance be made with
LocARNA (57), since LocARNA computes the conserved secondary structure and presents
an alignment of this computed secondary structure (see the case studies below for examples).
Another option to identify seed regions is not to take the srRNA sequence as a starting point,
but its predicted targets. In this case, the seed region may be that particular region where many
targets are predicted to bind. To facilitate the seed region analysis, we plotted the predicted
interaction region for all targets on a single plot for every selected srRNA (these plots are part
of Supplementary data file predictions; see Figure 11 where FsrA/S512 is used as an example).
In these plots the accessible regions (as predicted by RNAfold) were highlighted in grey and
the predicted targets were colored based on their number of flags. These plots can be visually
inspected to see whether there are any preferentially predicted interacting regions. However,
very few of such seed regions seem apparent (Supplementary data file predictions). The plots
also provide visual clues as to the number of predicted targets and the length of the sequence.
The example of this plot for FsrA/S512 will be further discussed below. Notably, a detailed
analysis of possible seed regions is beyond the scope of this study. Nevertheless, the presented
analyses might be helpful for assessment of predicted seed region targets. Please note also that
seed regions seem especially valuable tools to find new targets once the mode of action of an
srRNA is known. In this respect one has to bear in mind that it is presently not clear whether
Hfq-independent srRNAs will function as predictably as the Hfq-binding srRNAs for which
seed regions have been established (54, 55).
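The two ingredients of the target region plots, exposed regions from a dot-bracket structure and the number of predicted targets covering each srRNA position, can be sketched as below. This is an illustration only: the function names, the 0-based half-open interval convention, and the toy structure are assumptions. The minimum loop length of five bases follows the legend of Figure 11, and every run of unpaired bases is treated as an exposed region here.

```python
def exposed_regions(dot_bracket, min_len=5):
    """Return (start, end) intervals of unpaired runs of at least min_len bases.

    Intervals are 0-based and half-open; '.' marks an unpaired base.
    """
    regions = []
    start = None
    # A trailing sentinel closes a run that reaches the end of the structure.
    for i, symbol in enumerate(dot_bracket + "("):
        if symbol == ".":
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                regions.append((start, i))
            start = None
    return regions


def coverage(length, interactions):
    """Count, for each srRNA position, how many predicted interactions cover it."""
    counts = [0] * length
    for start, end in interactions:
        for position in range(start, end):
            counts[position] += 1
    return counts


# Toy structure and predicted interaction regions (hypothetical).
structure = ".....((...))"
regions = exposed_regions(structure)
counts = coverage(len(structure), [(0, 4), (1, 5), (2, 9)])
```

Positions with high coverage that also fall inside an exposed region would be the first candidates to inspect as a possible seed region.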
Prediction case studies
Enrichment of functional categories in target predictions captures the established FsrA/S512
function in iron metabolism
To investigate the relevance of our target predictions and the functional gene enrichment
analyses, we compared our data for FsrA/S512 with data reported on this srRNA by Gaballa et
al. (10). This breakthrough paper elucidated the molecular players of the Fur-mediated
iron-sparing response. It was shown that the iron-sparing response involves the conjoint action of
the FsrA/S512 srRNA and three small basic proteins named FbpABC (10). It was furthermore
reported that FsrA/S512 srRNA directly represses translation of the succinate dehydrogenase
operon, sdhCAB. In addition, a two-dimensional (2D) gel-based proteomics analysis was used
to identify other proteins of which the expression is modulated by FsrA/S512. Importantly,
nine of the twelve proteins that were differentially expressed in FsrA/S512 mutant cells were
independently identified through our target predictions addressing the 5’UTRs of all
protein-encoding genes (via seven predicted interactions) (Figure 12A). Nevertheless, only four of these,
namely SdhCAB and CitB, were so far reported as members of the FsrA/S512 regulon (via two
interactions) (10, 11, 35). The fact that these targets were missed in previous target predictions
may relate to the extended settings around the translational start site used in our present
predictions. Indeed all of the previously missed target genes of FsrA/S512 have a predicted
interaction region for FsrA/S512 outside of the region addressed with the default prediction
settings of TargetRNA. We therefore believe that this finding justifies our decision to expand
the prediction criteria in our present target predictions. Another factor that has influenced the
presently predicted interactions is that the S512 sequence used in our study starts 19 nucleotides
earlier than the sequence considered in the published studies on FsrA. These nucleotides are
unstructured and this would be consistent with the known properties of seed regions at the 5’
end of srRNA molecules (54) (Figure 13A). As far as we are aware, the previously used FsrA
sequence start site was inferred from the distance to the Fur repressor binding site and has
not been mapped experimentally. FsrA/S512 was unfortunately not identified in the RNA-seq
analysis from Irnov et al. (16, 17) so this data cannot provide more information on its precise
start site. Our observations therefore warrant a detailed (and possibly condition-dependent)
mapping of the 5’ end of FsrA/S512. Until this has been done, we suggest that the transcripts
of fhuD, feuA, yfmC, yxiB and yfiY should be considered as possible additional direct targets of
FsrA/S512. Notably, the products of these five genes are all involved in iron metabolism, and
thus it is well conceivable that the respective mRNAs are genuine members of the FsrA regulon.
In accordance with the published data and our present observations, a functional gene
enrichment analysis of all predicted FsrA/S512 targets shows a highly significant enrichment for
the functional categories iron metabolism (p-value <0.001) and acquisition of iron (p-value 0.001)
(Table 3). Furthermore, the regulon enrichment analysis for FsrA/S512 shows enrichment of the
Fur, FsrA, CitB and Btr regulons (Table 4), where it has to be noted that Btr and CitB are involved
in the regulation of the feuABC-ybbA operon. All these observations imply that our de novo
target prediction strategy predicts FsrA/S512 to be involved in iron uptake and iron
metabolism-related processes. This shows that functional enrichment of target genes is a potentially useful
approach for identifying srRNA functions and regulons. Interestingly, this enrichment also holds
when only conserved targets are considered (iron metabolism and acquisition of iron with p-value
<0.001) (Table 5). This strongly suggests that the FsrA/S512 function is conserved in multiple
Bacillus species. Six FsrA/S512 targets are also part of the list of targets with four flags (Table
2), and all of these FsrA/S512 targets are related to iron handling. These flags are also visualized
in the target region plot for FsrA/S512 in Figure 11. Specifically, all putative targets with 4 flags
are predicted to interact with the beginning of the FsrA/S512 sequence. Predicted targets with
three flags potentially interact with the first, second, and third exposed (loop) regions of FsrA/
S512 (Figure 11; loop regions are shaded in grey). However, many of the predicted targets share
such a large amount of sequence complementarity with FsrA/S512 that they span multiple loop
regions (Figure 11). No targets are predicted for the 3’ end of the FsrA/S512 sequence, but this is
due to the default TargetRNA settings that remove potential terminators. However, when FsrA/
S512 targets were predicted without this default setting, they still seemed to preferentially bind
to the first half of the FsrA/S512 sequence (data not shown).
Altogether, our FsrA/S512 case study shows that (evolutionary) target predictions
combined with a functional enrichment analysis can predict the functional process in which
a particular srRNA might be involved. Based on such a predicted functional process, srRNA
mutagenesis experiments can be designed to verify whether the respective srRNA mutant cells
show relevant phenotypes related to the enrichment category (i.e. iron limitation in the case of
FsrA/S512).
Additional predicted FsrA/S512 functions
As was discussed above, the role of FsrA/S512 in the iron-sparing response is well established.
Our predictions suggest that this is the main function of FsrA/S512. The conserved enriched
functional process that, at least by name, is not directly iron-related is termed ABC transporters
(Table 5). However, inspection of the conserved targets responsible for this enrichment shows
that all of these ABC transporters are involved in iron metabolism. Since none of the previously
published studies identified ABC transporter genes as targets for FsrA/S512 (11), this suggests
Figure 11. Predicted targets of FsrA/S512 plotted on the sequence and predicted secondary structure of this srRNA
The region of the srRNA predicted to take part in the predicted target interactions is represented by a line above the
nucleotide sequence of the srRNA (lower line). The colors of these lines represent the number of flags the predicted
target received in the five reported analyses. The predicted RNAfold secondary structure of the srRNA is indicated in
dot-bracket notation above the nucleotide sequence. Exposed (loop) regions from this predicted secondary structure of
five bases or larger were indicated with a grey zone. The plot shown here is for FsrA/S512 and plots for all other selected
putative srRNAs can be found in supplementary data file predictions.
that the FsrA/S512 regulon is much larger than previously thought. However, inspection of
the expression conditions of FsrA/S512 reveals a discrepancy with its presumed
exclusive role in the iron-sparing response. In fact, the baseline expression of FsrA/S512 is
quite high (10 on a log2 scale of 7-16) and it is highly expressed in many conditions that are not
intuitively related to a shortage of iron (17). These for instance include growth in the exponential
phase on minimal medium with pyruvate, and growth under conditions of high osmolarity.
It thus may be that there are additional functions of FsrA/S512 that are condition-dependent.
Condition-dependency of an srRNA regulator means that it can regulate one process under
one environmental condition and another process under another environmental condition.
Furthermore, the extent of target regulation by an srRNA can be dependent on the expression
level of the putative target in the environmental condition, the expression level of the srRNA in
the environmental condition, and possibly the condition-specific expression of srRNA-mRNA
chaperones.
According to the data presented by Nicolas et al. (17), FsrA/S512 is also expressed in
LB medium. To verify this observation, we fused the FsrA/S512 promoter with gfp in single copy
at its native locus by chromosomal integration of the GFP reporter plasmid pBaSysBioII (58).
Next, we deleted the gene for the iron-activated transcriptional repressor Fur, which is known
to repress FsrA/S512 expression (10). As shown by GFP expression measurements, FsrA/S512
expression was clearly de-repressed in the fur mutant background on LB medium (Figure 12, B
and C). However, the FsrA/S512 promoter was also active, albeit at a much lower level, in the
parental strain (Figure 12, C and D). In the presence of Fur, the FsrA/S512 promoter activity was
most prominent in the early exponential growth phase, but it remained active at a lower level up
to the late stationary phase. In addition, we determined the FsrA/S512 promoter activity by flow
cytometry, which showed that FsrA/S512 was homogeneously expressed throughout growth on
LB (Figure 12B).
Supported by the analysis of FsrA/S512 expression in cells growing on LB, we compared
the protein patterns of FsrA/S512 mutant and wild-type cells harvested during several phases
of growth. This suggested differential expression of several proteins in the FsrA/S512 mutant,
especially in the late exponential growth phase and during the transition from exponential to
post-exponential growth (data not shown). Based on these observations, we performed a 2D gel-based
proteomics analysis of samples collected during different growth phases and, in parallel,
we isolated RNA, which was analyzed by tiling arrays as previously described (17). Notably, these
are the first reports of experiments performed directly on an FsrA/S512 mutant, since previously
reported analyses were performed on a strain in which both fur and FsrA/S512 were deleted.
In the 2D PAGE analysis all protein spots that showed at least a 2-fold increase or decrease
in intensity compared to the parental strain were picked and analyzed by matrix-assisted laser
desorption/ionisation time-of-flight (MALDI-TOF) mass spectrometry (MS) (Figure S3 and
S4). In total, 67 significantly differentially expressed protein spots were identified (Figure S5). We
next compared these differentially expressed proteins with our target predictions. This resulted
in one match, namely the aconitase CitB. CitB has been reported as a highly likely FsrA/S512
target (10, 11). However, this was reported under the premise that CitB would only be
regulated under low-iron conditions and not in LB medium. Notably, CitB is not expressed
in the early exponential phase, when FsrA/S512 expression is most prominent on LB medium.
This suggests that there could be other FsrA/S512 targets regulated in this growth phase. The
other differentially expressed proteins may be the result of indirect effects of the FsrA/S512
deletion or the deregulation of one of its targets. It is also feasible that some of these differentially
expressed proteins are actually FsrA/S512 targets, but that these were missed in our TargetRNA
predictions.
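The selection and comparison steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spot intensities, the placeholder protein names ProtX, ProtY, ProtZ and ProtQ, and the function name changed_spots are hypothetical. Only the 2-fold cut-off and the CitB match are taken from the text.

```python
from math import log2


def changed_spots(mutant, parental, min_fold=2.0):
    """Return {protein: log2 ratio} for spots changed at least min_fold in either direction."""
    threshold = log2(min_fold)
    changed = {}
    for protein, parental_intensity in parental.items():
        ratio = log2(mutant[protein] / parental_intensity)
        if abs(ratio) >= threshold:
            changed[protein] = ratio
    return changed


# Hypothetical normalized spot intensities (illustration only, not measured data).
parental = {"CitB": 100.0, "ProtX": 80.0, "ProtY": 50.0, "ProtZ": 40.0}
mutant = {"CitB": 260.0, "ProtX": 85.0, "ProtY": 20.0, "ProtZ": 44.0}

# Hypothetical set of predicted targets; only CitB is named in the text.
predicted_targets = {"CitB", "ProtQ"}

changed = changed_spots(mutant, parental)
# Proteins that are both differentially expressed and predicted as targets.
matches = sorted(set(changed) & predicted_targets)
print(changed)
print(matches)
```

Working on a log2 scale treats a 2-fold increase and a 2-fold decrease symmetrically, which is why a single threshold suffices for both directions.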
Transcriptome analyses can provide a genome-wide view of the effects of an FsrA/
S512 deletion on transcript levels. We tested for significantly differentially expressed genes in the
tiling array data obtained with an FsrA/S512 mutant using a linear model and found 8.1% (468
genes) of all genes significantly changed in abundance with p-values ≤0.05, and 0.4% (24 genes)
with p-values ≤0.01 (Supplementary data file transcript profiling). Thirteen of the differentially
expressed genes with p-values ≤0.05 are indeed predicted FsrA/S512 targets (with TargetRNA
p-values ≤0.01), but most of these are of unknown function. On average, more transcripts
were expressed at a higher level in the FsrA/S512 mutant than in the parental strain (Figure 13B).
This would be in line with the presumed dominant role of srRNAs in repressing their targets.
However, it seems highly unlikely that the differential expression of all these genes is a direct
result of the absence of FsrA/S512. Instead, it seems more likely that many of the observed
differences result indirectly from the deregulation of a limited number of FsrA/S512 targets. For
instance, a large part of the sulfur metabolism genes were found to be derepressed in the FsrA/
S512 mutant (Figure 13B). The expression changes of these sulfur metabolism genes suggest that
there is an increased demand for cysteine in the FsrA/S512 mutant (Figure S6). In an attempt to
explain this phenotype, we inspected the conserved predicted targets of FsrA/S512 for sulfur-related
genes. A highly conserved and significant putative target is yrvO/iscS (Figure 13, B and
C; Table 6). This gene encodes the essential cysteine desulfurase, which is involved in tRNA
thiolation. The enzyme activity of YrvO/IscS leads to cysteine consumption, and this suggests
that elevated levels of this protein could lead to the observed derepression of sulfur metabolism
genes. YrvO/IscS was not identified in the 2D PAGE analysis, but the yrvO/iscS transcript levels
are significantly changed in the FsrA/S512 mutant. However, this change is only slight, with a
0.67 log fold change decrease in the FsrA/S512 mutant compared to the parental strain (p-value
< 0.04). In the case of relieved translation inhibition, transcript levels are expected to slightly
increase, and not decrease as was observed in the present experiment. Therefore, it remains to
be seen whether elevated YrvO/IscS levels are indeed responsible for the derepression of sulfur
metabolism genes. It should be noted that an increase in the sulfur metabolism genes could also
be related to an oxidative stress response, as B. subtilis uses cysteine for thiol protection after
oxidative stress (59). However, no clear evidence for an oxidative stress response in cells lacking
FsrA/S512 was obtained.

Figure 12. FsrA/S512 regulon members related to iron uptake and metabolism, and FsrA/S512 promoter activity
A) Overlap between FsrA-specific differentially expressed proteins identified in the 2D PAGE analysis by Gaballa et
al. (10) and our present target predictions. The sdhC and citB genes are known members of the FsrA/S512 regulon.
Our predictions suggest five additional targets for FsrA/S512 with the indicated TargetRNA_v1 significance scores
and coordinates of the predicted interaction. The yfmC interaction is predicted with a p-value of 0.02, the other
predicted interactions have p-values ≤0.01. B) Flow cytometry histograms shown for the FsrA/S512 promoter fusion
in the parental strain and the same promoter fusion in the Δfur background. Autofluorescence of the parental strain is
indicated in grey. Representative data is shown. C) Promoter activity plots of the same strains used in panel B grown
in a 96-well plate. Growth for both strains was identical and one growth curve is plotted. Representative data is shown.
D) Same plot as in C, but without the Δfur mutation to clarify the induction pattern of the FsrA/S512 promoter in the
parental background.

ClpC is another conserved predicted target of FsrA/S512 (Figure 13C, Table 6).
While a change in the ClpC level was not detected in the 2D PAGE analysis, the clpC mRNA
level was significantly elevated in the FsrA/S512 mutant (log2 fold change 0.88). This elevated
clpC transcript level could be the result of increased protection by elongating ribosomes due
to the alleviation of translational repression. Notably, ClpC is the ATPase subunit of the ATP-dependent
ClpC-ClpP protease, which is involved in a wide variety of processes (60). Because
of the central role of ClpC in proteolysis and differentiation, the deregulation of a protein like
ClpC is expected to have a large number of consequences (61). This could thus at least partly
explain the large number of changes observed in the FsrA/S512 mutant. Lastly, while the present
proteomic and transcriptomic analyses underscore the general importance of FsrA/S512, the
identification of such a large number of most likely indirect effects may argue against the use of
‘omics’ analyses on srRNA mutant strains, at least when the specific goal of these analyses is to
identify the direct srRNA targets.

Figure 13. Conserved structure of FsrA/S512 and identification of two additional putative FsrA/S512 targets by
tiling array analysis
A) Conserved structure of S512 as predicted with the IntaRNA algorithm (http://rna.informatik.uni-freiburg.de/
IntaRNA/Input.jsp) (24). The corresponding sequence alignment is shown in Figure S2. Red color indicates high
conservation of the predicted base-pairing interactions. The top loop, which represents the 5’ end of the S512 sequence,
may correspond to the seed region identified in this manuscript. B) Scatter plot of the transcript profiling data obtained
with tiling arrays for an FsrA/S512 deletion mutant and the parental strain 168. Averages from both hybridizations for the
FsrA/S512 mutant and the parental strain are shown. The additional labels “clpC” and “yrvO” show possible FsrA/S512
targets as discussed in the text. “S512” indicates the hybridization of FsrA/S512, which is absent from the FsrA/S512
mutant. The downstream segments “S513, S514” are expressed at a higher level in the FsrA/S512 mutant due to
read-through transcription from the phleomycin deletion cassette (76) used to replace FsrA/S512. C) Predicted interaction
of FsrA/S512 with clpC and yrvO/iscS. Predicted loop regions of FsrA/S512 are indicated in red. The coordinates of the
predicted interaction of the srRNA and mRNA are shown. Notably, clpC and yrvO/iscS represent the fourth and second
gene, respectively, of their cognate operons.

Queryname  Length  Org                                            FracId  Rank  Name         Score  Pvalue  sRNA_start  sRNA_stop  mRNA_start  mRNA_stop  BDBH
S512       102     Bacillus_subtilis_168_uid57675                 1       31    clpC         -69    0.00    6           39         -54         -22        BSU00860
S512       102     Bacillus_subtilis_BSn5_uid62463                1       35    BSn5_12000   -69    0.00    6           39         -54         -22        BSU00860
S512       102     Bacillus_subtilis_BSP1_uid184010               1       29    A7A1_0120    -69    0.00    6           39         -54         -22        BSU00860
S512       102     Bacillus_subtilis_natto_BEST195_uid183001      0.99    30    clpC         -69    0.00    6           39         -54         -22        BSU00860
S512       102     Bacillus_subtilis_QB928_uid173926              1       33    clpC         -69    0.00    6           39         -54         -22        BSU00860
S512       102     Bacillus_subtilis_RO_NN_1_uid158879            1       32    clpC         -69    0.00    6           39         -54         -22        BSU00860
S512       102     Bacillus_subtilis_spizizenii_TU_B_10_uid73967  0.99    40    clpC         -69    0.00    6           39         -54         -22        BSU00860
S512       102     Bacillus_subtilis_spizizenii_W23_uid51879      0.99    124   clpC         -58    0.02    6           39         -54         -22        BSU00860

Queryname  Length  Org                                                          FracId  Rank  Name         Score  Pvalue  sRNA_start  sRNA_stop  mRNA_start  mRNA_stop  BDBH
S512       102     Bacillus_subtilis_spizizenii_W23_uid51879                    0.99    2     iscSA        -86    0.00    11          69         -73         -17        BSU27510
S512       102     Bacillus_JS_uid162189                                        1       15    MY9_2731     -75    0.00    11          69         -73         -17        BSU27510
S512       102     Bacillus_subtilis_168_uid57675                               1       13    iscS         -75    0.00    11          69         -73         -17        BSU27510
S512       102     Bacillus_subtilis_BSn5_uid62463                              1       12    BSn5_04550   -75    0.00    11          69         -73         -17        BSU27510
S512       102     Bacillus_subtilis_BSP1_uid184010                             1       12    A7A1_0393    -75    0.00    11          69         -73         -17        BSU27510
S512       102     Bacillus_subtilis_QB928_uid173926                            1       16    iscSA        -75    0.00    11          69         -73         -17        BSU27510
S512       102     Bacillus_subtilis_RO_NN_1_uid158879                          1       17    I33_2796     -75    0.00    11          69         -73         -17        BSU27510
S512       102     Bacillus_subtilis_spizizenii_TU_B_10_uid73967                0.99    16    GYO_2990     -75    0.00    11          69         -73         -17        BSU27510
S512       102     Bacillus_subtilis_natto_BEST195_uid183001                    0.99    70    yrvO         -64    0.01    11          45         -51         -17        BSU27510
S512       102     Bacillus_amyloliquefaciens_plantarum_AS43_3_uid183682        0.98    226   B938_12725   -53    0.05    12          45         -51         -18        BSU27510
S512       102     Bacillus_amyloliquefaciens_plantarum_CAU_B946_uid84215       0.97    228   yrvO         -53    0.06    12          45         -51         -18        BSU27510
S512       102     Bacillus_amyloliquefaciens_plantarum_YAU_B9601_Y2_uid159001  0.98    247   yrvO         -53    0.06    12          45         -51         -18        BSU27510
S512       102     Bacillus_amyloliquefaciens_Y2_uid165195                      0.98    247   yrvO         -53    0.06    12          45         -51         -18        BSU27510

Table 6. Selection of conserved predicted FsrA/S512 targets
Queryname, putative srRNA name. Length, Length of the srRNA in B. subtilis. Org, name of the bacterium in
which the conserved interaction was predicted. FracId, fraction of sequence identity of the srRNA query. Rank,
rank in the TargetRNA predictions for the query in the respective genome. Name, name of the predicted target
in the respective genome. Score, TargetRNA_v1 (23) prediction significance score. Pvalue, TargetRNA_v1 (23)
prediction p-value. sRNA_start, start coordinate of putative srRNA in the predicted target interaction. sRNA_
stop, end coordinate of putative srRNA in the predicted target interaction. mRNA_start, start coordinate of the
predicted target in the predicted target interaction relative to start codon. mRNA_stop, end coordinate of the
predicted target in the predicted target interaction relative to start codon. BDBH, unique B. subtilis 168 locus tag
of the nearest protein Blast hit of the predicted target for B. subtilis 168.
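As a reading aid for Table 6, the sketch below stores three of its clpC rows in a small record type and checks whether the predicted interaction coordinates agree across genomes. The class name, the function name and the p-value cut-off of 0.05 are assumptions made for this illustration. They are not part of TargetRNA_v1 or of the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """One row of a conserved-target table (subset of the columns in the legend)."""
    org: str
    name: str
    score: int
    pvalue: float
    srna_start: int
    srna_stop: int
    mrna_start: int
    mrna_stop: int


# Three clpC rows as listed in Table 6 (query S512, BDBH BSU00860).
rows = [
    Prediction("Bacillus_subtilis_168_uid57675", "clpC", -69, 0.00, 6, 39, -54, -22),
    Prediction("Bacillus_subtilis_BSn5_uid62463", "BSn5_12000", -69, 0.00, 6, 39, -54, -22),
    Prediction("Bacillus_subtilis_spizizenii_W23_uid51879", "clpC", -58, 0.02, 6, 39, -54, -22),
]


def conserved_site(predictions, max_pvalue=0.05):
    """Return the shared coordinates if all rows passing the cut-off agree, else None.

    The cut-off of 0.05 is an assumption for this sketch.
    """
    sites = {
        (p.srna_start, p.srna_stop, p.mrna_start, p.mrna_stop)
        for p in predictions
        if p.pvalue <= max_pvalue
    }
    return sites.pop() if len(sites) == 1 else None


site = conserved_site(rows)
print(site)
```

A single shared coordinate tuple indicates that the same srRNA region is predicted to pair with the same region upstream of the start codon in every genome considered.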
Correlations and target predictions suggest additional targets for SR1
In the preceding sections, we have argued that the expression level of an srRNA is crucial for
its regulation, and an srRNA is therefore expected to be upregulated (or downregulated) in its
functionally relevant condition. Such induction of srRNA expression in a specific growth phase
was also observed for the B. subtilis srRNA SR1. SR1 is highly expressed during conditions of
gluconeogenesis when CcpA and CcpN repression of the SR1 promoter is alleviated (42). SR1 is
also strongly induced when cells enter the stationary phase after growth on rich medium (42).
SR1 was the first srRNA for which a function and target were reported in B. subtilis (12). A 2D
PAGE analysis of SR1 mutant B. subtilis cells revealed that three proteins under the control of
AhrC, the transcriptional regulator of arginine metabolic genes, were differentially expressed.
Direct pairing of SR1 and ahrC mRNA was subsequently confirmed (12). SR1 was later found
to encode a 39-amino acid peptide that regulates the stability of the gapA operon, making it the
only established dual-function srRNA in B. subtilis (37). Most recently, it was reported that these
two SR1 functions – regulation of ahrC mRNA via srRNA-mRNA interaction and regulation of
the gapA operon mediated by the SR1 peptide – are conserved in many Bacillus species (36).
Upon close inspection of our target prediction dataset for SR1, four findings seemed
of particular interest. Firstly, no functional category appears to be enriched among the predicted SR1
targets (Table 3). However, there are two enriched categories in the conserved predictions,
namely sulfur metabolism and utilization of nucleotides (Table 5). A single target, adeC, is
responsible for the latter enrichment. AdeC is an adenine deaminase involved in purine salvage
and interconversion, and the utilization of adenine as a nitrogen source. In the target prediction
Tables, SR1-adeC has three flags: Conserved8, ConditionalFlag, and BclusterFlag (Tables S1
and S2). The latter flag is an indication that SR1 may have an effect on the expression level of
adeC. We note that the coordinates of the predicted adeC targets are remarkably conserved (Table
7). The predicted nrdE gene target (which is not a conserved predicted target) is also identified
in the same B-cluster enrichment and is involved in nucleotide metabolism. NrdE is an essential
protein of B. subtilis, which is involved in the synthesis of deoxyribonucleoside triphosphates.
Furthermore, additional regulation of adeC and nrdE would fit with SR1's established role in
the regulation of nitrogen metabolism (via AhrC regulation (12)). Secondly, the extracellular
neutral protease NprE is a predicted conserved target of SR1 (Table 7). The nprE target also has
three flags: Conserved8, ConditionalFlag, and PeaksFlag. The latter relates to the high SR1
expression level during the induction of sporulation (S4 condition) (17). This condition is among those
used for SR1 in the correlation analysis restricted to high-expression conditions (Peaks expression). The correlation
between SR1 and nprE is 0.01 over all conditions, but -0.48 in this peaks expression condition.
This suggests that the nprE mRNA might be destabilized by SR1 under this condition. However,
the observation could also be coincidental if SR1 and nprE are both strongly regulated
under this growth condition. Interestingly, NprE is again involved in nitrogen metabolism
of stationary phase cells. Thirdly, in a microarray study, arginine metabolism was previously
linked to methionine metabolism, suggesting a functional link between nitrogen and sulfur
metabolism (62). SR1 might be part of this link, since it functions in nitrogen metabolism and
sulfur metabolism is enriched in the conserved target enrichment analysis (p-value 0.03; Table
5) via the putatively conserved regulation of CysJ (Table 7). Fourthly, the sigma factor SigL is a
regulator of stationary phase processes, including arginine metabolism. We noted that expression
of SR1 and SigL is highly correlated (0.71) over all 104 expression conditions analysed by Nicolas
et al. (17). Interestingly, this correlation was almost complete (0.98) in a glucose run-out time
series (17). Both SR1 and sigL are induced when glucose runs out, but SR1 to a higher extent
(17). Based on these observations, we propose that SR1 is relevant for the fine-tuning of nitrogen
metabolism in the transition and stationary growth phases.
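The contrast drawn above between a correlation over all conditions and a correlation restricted to conditions of high srRNA expression can be illustrated with a minimal Python sketch. The expression values are hypothetical, and the rule used here to pick the peak conditions (the top fraction of conditions ranked by srRNA level) is an assumption. The exact definition behind the PeaksFlag is not given in this section.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def peak_correlation(srna, target, top_fraction=0.25):
    """Correlate only over the conditions with the highest srRNA expression.

    The selection rule (top_fraction, at least three conditions) is a
    hypothetical stand-in for the peak-condition definition.
    """
    n_top = max(3, round(len(srna) * top_fraction))
    ranked = sorted(range(len(srna)), key=lambda i: srna[i], reverse=True)[:n_top]
    return pearson([srna[i] for i in ranked], [target[i] for i in ranked])


# Hypothetical log2 expression values across eight conditions (not measured data).
srna = [8.0, 8.5, 9.0, 9.5, 13.0, 14.0, 15.0, 16.0]
target = [10.0, 12.0, 9.0, 11.0, 12.0, 11.5, 10.5, 10.0]

overall = pearson(srna, target)
peak = peak_correlation(srna, target, top_fraction=0.5)
print(round(overall, 2), round(peak, 2))
```

In this toy data set the overall correlation is close to zero while the correlation within the high-expression conditions is strongly negative, which mirrors the pattern reported for SR1 and nprE.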
Queryname  Length  Org                                            FracId  Rank  Name         Score  Pvalue  sRNA_start  sRNA_stop  mRNA_start  mRNA_stop  BDBH
ykzW       120     Bacillus_subtilis_168_uid57675                 1       26    adeC         -74    0.01    69          114        -67         -20        BSU14520
ykzW       120     Bacillus_subtilis_QB928_uid173926              1       25    adeC         -74    0.01    69          114        -67         -20        BSU14520
ykzW       120     Bacillus_subtilis_BSn5_uid62463                0.98    42    BSn5_19330   -72    0.01    69          114        -68         -21        BSU14520
ykzW       120     Bacillus_subtilis_BSP1_uid184010               0.98    31    A7A1_0576    -72    0.01    69          114        -68         -21        BSU14520
ykzW       120     Bacillus_subtilis_natto_BEST195_uid183001      0.98    40    adeC         -72    0.01    69          114        -67         -20        BSU14520
ykzW       120     Bacillus_subtilis_RO_NN_1_uid158879            0.98    39    ade          -72    0.01    69          114        -68         -21        BSU14520
ykzW       120     Bacillus_subtilis_spizizenii_W23_uid51879      0.96    53    adeC         -71    0.02    69          114        -68         -21        BSU14520
ykzW       120     Bacillus_subtilis_spizizenii_TU_B_10_uid73967  0.97    84    ade          -67    0.02    69          114        -68         -21        BSU14520

Queryname  Length  Org                                                     FracId  Rank  Name         Score  Pvalue  sRNA_start  sRNA_stop  mRNA_start  mRNA_stop  BDBH
ykzW       120     Bacillus_subtilis_spizizenii_W23_uid51879               0.96    3     nprE         -89    0       7           80         -53         22         BSU14700
ykzW       120     Bacillus_subtilis_168_uid57675                          1       13    nprE         -82    0       7           75         -39         22         BSU14700
ykzW       120     Bacillus_subtilis_QB928_uid173926                       1       12    nprE         -82    0       7           75         -39         22         BSU14700
ykzW       120     Bacillus_subtilis_BSn5_uid62463                         0.98    16    BSn5_19435   -77    0       7           74         -38         22         BSU14700
ykzW       120     Bacillus_subtilis_BSP1_uid184010                        0.98    15    A7A1_0557    -77    0       7           74         -38         22         BSU14700
ykzW       120     Bacillus_subtilis_natto_BEST195_uid183001               0.98    13    nprE         -77    0       7           74         -38         22         BSU14700
ykzW       120     Bacillus_subtilis_RO_NN_1_uid158879                     0.98    14    I33_1652     -77    0       7           74         -38         22         BSU14700
ykzW       120     Bacillus_amyloliquefaciens_FZB42_uid58271               0.87    34    nprE         -67    0.02    7           80         -51         22         BSU14700
ykzW       120     Bacillus_amyloliquefaciens_plantarum_AS43_3_uid183682   0.87    38    B938_07575   -67    0.02    7           80         -51         22         BSU14700
ykzW       120     Bacillus_amyloliquefaciens_plantarum_CAU_B946_uid84215  0.87    34    npr          -67    0.02    7           80         -51         22         BSU14700
ykzW       120     Bacillus_JS_uid162189                                   0.94    67    MY9_1610     -67    0.02    7           80         -54         22         BSU14700
ykzW       120     Bacillus_subtilis_spizizenii_TU_B_10_uid73967           0.97    96    GYO_1812     -66    0.03    7           60         -30         22         BSU14700

Queryname  Length  Org                                            FracId  Rank  Name         Score  Pvalue  sRNA_start  sRNA_stop  mRNA_start  mRNA_stop  BDBH
ykzW       120     Bacillus_subtilis_168_uid57675                 1       11    cysJ         -84    0       10          82         -54         17         BSU33440
ykzW       120     Bacillus_subtilis_QB928_uid173926              1       10    cysJ         -84    0       10          82         -45         26         BSU33440
ykzW       120     Bacillus_JS_uid162189                          0.94    15    MY9_3390     -77    0       12          96         -55         30         BSU33440
ykzW       120     Bacillus_subtilis_spizizenii_W23_uid51879      0.96    24    cysJ         -77    0.01    12          74         -15         45         BSU33440
ykzW       120     Bacillus_amyloliquefaciens_DSM_7_uid53535      0.87    16    cysJ         -72    0.01    25          73         -41         5          BSU33440
ykzW       120     Bacillus_amyloliquefaciens_LL3_uid158133       0.87    18    cysJ         -72    0.01    25          73         -41         5          BSU33440
ykzW       120     Bacillus_amyloliquefaciens_TA208_uid158701     0.87    15    cysJ         -72    0.01    25          73         -41         5          BSU33440
ykzW       120     Bacillus_amyloliquefaciens_XH7_uid158881       0.87    21    yvgR         -72    0.01    25          73         -41         5          BSU33440
ykzW       120     Bacillus_subtilis_BSn5_uid62463                0.98    40    BSn5_07640   -72    0.01    12          74         -15         45         BSU33440
ykzW       120     Bacillus_subtilis_BSP1_uid184010               0.98    36    A7A1_2561    -72    0.01    12          74         -15         45         BSU33440
ykzW       120     Bacillus_subtilis_natto_BEST195_uid183001      0.98    45    yvgR         -72    0.01    12          74         -15         45         BSU33440
ykzW       120     Bacillus_subtilis_RO_NN_1_uid158879            0.98    44    I33_3462     -72    0.01    12          74         -15         45         BSU33440
ykzW       120     Bacillus_subtilis_spizizenii_TU_B_10_uid73967  0.97    89    GYO_3656     -67    0.02    12          55         4           45         BSU33440
ykzW       120     Bacillus_halodurans_C_125_uid57791             0.9     64    BH0609       -52    0.02    1           28         -64         -31        BSU33440

Table 7. Selection of conserved predicted SR1 targets
Same legend as for Table 6.
SigW-dependent expression of S462 is consistent with its predicted functions
S462 was first reported by Nicolas et al., who proposed this srRNA to be regulated by the three
alternative sigma factors SigWXY (17). Because of the similarity in the sequence motifs recognized
by these three sigma factors, the applied promoter cluster analysis was unable to distinguish the
respective promoters. However, a subsequent expression analysis that specifically focused on
unravelling the SigW regulon showed that S462 expression is most likely dependent on
SigW (63). The SigW regulon is expressed at a low basal level during exponential growth, but is
induced in response to cell envelope stress. Such stresses can be provoked by antibiotics, alkaline
shock and salt shock (63). To confirm the SigW-dependent regulation of S462, we fused the start
of its sequence with gfp in single copy at its native locus using the chromosomal integration
plasmid pBaSysBioII (58). As shown by GFP expression analysis, the S462 promoter was indeed
expressed at low levels during exponential growth on LB medium, and GFP expression from this
promoter was no longer detectable when sigW was deleted (Figure 14A). Interestingly, the S462
gene is situated next to that encoding HtrA, an important quality control protease under CssR
control, and it is known that the CssR and SigW responses are linked (64).
Target predictions on S462 seem consistent with a role of S462 in the SigW-dependent
cell envelope stress response. In all B. subtilis target predictions the categories cell envelope and
cell division - capsule biosynthesis and degradation and acquisition of nucleotides were found to
be enriched (Table 3). The first category is clearly linked to the function of SigW and a related
category is also enriched in the conserved target predictions (cell envelope and cell division –
cell shape) (Table 5). The conserved predicted targets that are related to cell envelope processes
include the essential gene for MraY (2 flags) involved in peptidoglycan precursor biosynthesis,
the cell envelope stress protein YceH (2 flags) and the cell shape determinant MreD (2 flags)
(Table 8). The second enriched conserved category, acquisition of nucleotides, may also be related
to cell envelope stress since Yu et al. reported a strong induction of both the SigW regulon
and nucleotide metabolism genes upon exposure of B. subtilis to the cell wall-acting antibiotic
fusaricidin (65). The nucleotide metabolism genes predicted to be targeted by S462 encode
the xanthine permease PbuX (2 flags), the uracil permease PyrP (3 flags), and the essential
phosphoribosylpyrophosphate synthetase Prs (3 flags). Our target predictions thus suggest
that S462 could be partly responsible for the link between the SigW regulon and nucleotide
metabolism (65) by acting as an srRNA on one or more targets.
Figure 14. Promoter activity, structure, and predicted ORF of S462
A) Promoter activity plots of the S462 promoter-gfp fusion and the same fusion in combination with a sigW deletion
in cells grown on LB in a 96-well plate. Promoter activity was determined by GFP fluorescence readings as described
in the methods section. S462 is expressed at a relatively low level (e.g. compare with Figure 12) and S462 expression is
absent in a sigW deletion mutant. The plot shows representative data. B) Conserved structure of S462 as predicted with
IntaRNA (24). The corresponding sequence alignment is shown in Figure S7. Red color indicates highly conserved
predicted base-pairing. Note that the structure contains many base-pairs that are not conserved. C) Predicted ORF
in the S462 sequence. The indicated ORF is 59 or 61 amino acids in length depending on the ATG start codon that is
used. The second start codon seems to have a better spacing with respect to the upstream putative GGAGG ribosome
binding site. D) TMHMM prediction of transmembrane segments (66) in the S462 encoded peptide S462-P. Two
transmembrane helices that are separated by a small interior loop are predicted with high probability.

Queryname  Length  Org                                            FracId  Rank  Name         Score  Pvalue  sRNA_start  sRNA_stop  mRNA_start  mRNA_stop  BDBH
S462       367     Bacillus_subtilis_BSn5_uid62463                0.98    4     mraY         -111   0.00    74          172        -51         36         BSU15190
S462       367     Bacillus_subtilis_BSP1_uid184010               0.99    2     A7A1_0507    -111   0.00    74          172        -51         36         BSU15190
S462       367     Bacillus_subtilis_natto_BEST195_uid183001      0.98    1     mraY         -111   0.00    74          172        -51         36         BSU15190
S462       367     Bacillus_subtilis_RO_NN_1_uid158879            0.97    2     mraY         -111   0.00    74          172        -51         36         BSU15190
S462       367     Bacillus_subtilis_168_uid57675                 1       3     mraY         -105   0.00    74          172        -51         36         BSU15190
S462       367     Bacillus_subtilis_QB928_uid173926              1       3     mraY         -105   0.00    74          172        -51         36         BSU15190
S462       367     Bacillus_subtilis_spizizenii_TU_B_10_uid73967  0.92    3     mraY         -104   0.00    71          158        -41         36         BSU15190
S462       367     Bacillus_JS_uid162189                          0.88    14    MY9_1663     -89    0.00    77          154        -35         45         BSU15190

Queryname  Length  Org                                            FracId  Rank  Name         Score  Pvalue  sRNA_start  sRNA_stop  mRNA_start  mRNA_stop  BDBH
S462       367     Bacillus_subtilis_spizizenii_TU_B_10_uid73967  0.92    2     yceH         -111   0.00    86          192        -56         36         BSU02940
S462       367     Bacillus_JS_uid162189                          0.88    20    MY9_0300     -87    0.00    90          143        -14         36         BSU02940
S462       367     Bacillus_subtilis_168_uid57675                 1       44    yceH         -84    0.01    108         142        -14         23         BSU02940
S462       367     Bacillus_subtilis_BSP1_uid184010               0.99    47    A7A1_1904    -84    0.01    108         142        -14         23         BSU02940
S462       367     Bacillus_subtilis_natto_BEST195_uid183001      0.98    33    yceH         -84    0.01    108         142        -14         23         BSU02940
S462       367     Bacillus_subtilis_QB928_uid173926              1       45    B657_02940   -84    0.01    108         142        -14         23         BSU02940
S462       367     Bacillus_subtilis_RO_NN_1_uid158879            0.97    48    yceH         -84    0.01    108         142        -14         23         BSU02940
S462       367     Bacillus_subtilis_BSn5_uid62463                0.98    235   BSn5_13050   -73    0.04    108         142        -14         23         BSU02940

Queryname  Length  Org                                            FracId  Rank  Name         Score  Pvalue  sRNA_start  sRNA_stop  mRNA_start  mRNA_stop  BDBH
S462       367     Bacillus_JS_uid162189                          0.88    5     MY9_2785     -96    0.00    122         161        -33         6          BSU28010
S462       367     Bacillus_subtilis_spizizenii_W23_uid51879      0.89    22    mreD         -84    0.00    67          96         -33         -5         BSU28010
S462       367     Bacillus_subtilis_168_uid57675                 1       66    mreD         -82    0.01    121         159        -34         6          BSU28010
S462       367     Bacillus_subtilis_BSP1_uid184010               0.99    58    A7A1_0442    -82    0.01    121         159        -34         6          BSU28010
S462       367     Bacillus_subtilis_natto_BEST195_uid183001      0.98    52    mreD         -82    0.01    121         159        -34         6          BSU28010
S462       367     Bacillus_subtilis_QB928_uid173926              1       65    mreD         -82    0.01    121         159        -34         6          BSU28010
S462       367     Bacillus_subtilis_RO_NN_1_uid158879            0.97    63    mreD         -82    0.01    121         159        -34         6          BSU28010
S462       367     Bacillus_subtilis_BSn5_uid62463                0.98    142   BSn5_04820   -76    0.02    121         159        -34         6          BSU28010

Table 8. Selection of conserved predicted S462 targets
Same legend as for Table 6.

The predicted secondary structure of S462 is weak compared to other possible structures
with the same nucleotide composition and length (positive Z-score of 0.68). In addition, the
predicted S462 structure is not highly conserved, since it contains a large number of base-pairs
with low evolutionary conservation (Figure 14B). The lower level of secondary structure might
be related to the peptide-encoding potential of S462. The respective ORF, S462-P (with P for
peptide), is predicted to be either 59 or 61 amino acids in length due to the presence of two
potential start codons, and it contains a GGAGG ribosome binding site upstream of the first
possible start codon (Figure 14C). As was discussed above, S462 is under control of the cell
envelope stress sigma factor SigW and most of its predicted srRNA functions also seem cell-envelope
related. We therefore wondered whether S462-P might also be linked to cell envelope
processes. To test this, we used the TMHMM webserver (http://www.cbs.dtu.dk/services/TMHMM/)
for prediction of transmembrane domains (66) in the S462-P sequence. Indeed,
S462-P contains two significantly predicted transmembrane domains, suggesting that it is a
small integral membrane protein (Figure 14D). Future studies will have to reveal whether S462
is indeed a dual-function srRNA involved in the cell-envelope stress response of B. subtilis.
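The structure Z-score quoted for S462 compares the folding energy of the native sequence with that of sequences of the same composition and length. A minimal sketch of such a calculation is given below, assuming the common definition in which the minimum free energy (MFE) of the native sequence is standardized against the MFEs of shuffled sequences. The energies are hypothetical placeholders; in practice they would come from an RNA folding program, which is not shown here, and the function name is an assumption of this sketch.

```python
from statistics import mean, stdev


def structure_zscore(native_mfe, shuffled_mfes):
    """Standardize the native MFE against the MFEs of shuffled sequences.

    A positive value means the native structure is less stable (less negative
    MFE) than the average shuffled sequence of the same composition.
    """
    return (native_mfe - mean(shuffled_mfes)) / stdev(shuffled_mfes)


# Hypothetical folding energies in kcal/mol (illustration only).
shuffled = [-92.0, -88.0, -95.0, -90.0, -85.0]
native = -87.5

z = structure_zscore(native, shuffled)
print(round(z, 2))
```

Under this convention, strongly structured RNAs give clearly negative Z-scores, so a positive value such as the one reported for S462 points to a comparatively weak predicted structure.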
Conserved RsaE/S415 functions
The evolutionarily most conserved known srRNA in B. subtilis is RsaE/S415. RsaE/S415 was first
identified in S. aureus through a bioinformatics screen of this organism’s intergenic regions, and
the expression of RsaE/S415 was subsequently confirmed by Northern blotting (18). The authors
of this study also noted its strong conservation in the Bacillaceae (18). In a later investigation,
RsaE was found to downregulate (genes for) numerous metabolic enzymes (19). In both studies,
direct RsaE-target interactions were tested by gel retardation analysis. However, the results
remained inconclusive for many of the putative targets. It was therefore suggested that in vivo
there may be an unknown RNA chaperone required for these interactions (19). In our present
study, the RsaE/S415 sequence was confirmed to be highly conserved in the included Bacillus
genomes, with the core of the sequence displaying the highest conservation (Figure S8). More
generally, the presence of RsaE/S415 in organisms ranging from S. aureus to B. subtilis opens up
the possibility for comparative studies on the conservation of RsaE/S415 functions.
Expression data from Nicolas et al. suggest high RsaE/S415 expression under many
experimental conditions, being on average most prominent in the exponential growth phase
(17). However, the concordance between triplicates in these expression data was sometimes low
for RsaE/S415. Since the genome-wide data were highly concordant (17), this may be a functionally
relevant aspect of RsaE/S415 expression. Beyond this, RsaE in S. aureus was shown to be highly
induced in the transition between the exponential and stationary growth phases (18), and we
therefore wondered whether the expression pattern of B. subtilis RsaE/S415 could be similar to
the pattern of expression in S. aureus. To test this, we again employed the pBaSysBioII plasmid
to construct a chromosomally integrated single-copy promoter-gfp fusion. Analysis of this strain
grown on LB medium showed a highly homogeneous promoter activity for RsaE/S415 (Figure
15, A and B). The RsaE/S415 promoter activity remained prominent throughout the exponential
growth phase and dropped to levels below the detection limit in the transition and stationary
growth phases (Figure 15A). The latter is in contrast to what was observed in S. aureus (18). This
shows that the pattern of expression of RsaE/S415 is probably not conserved from S. aureus to B.
subtilis. Yet, we cannot exclude the possibility that the observed difference in expression patterns
is caused by differences in the applied experimental conditions.
Target predictions for RsaE/S415 showed enrichment of the functional categories
electron transport and ATP synthesis - respiration (p-value <0.001) and trace metal homeostasis
(p-value <0.04) (Table 3). Especially the significance of the first category is striking. This relates
to the predicted target genes ctaC, qcrC, ndhF, scuA/ypmQ, ctaE, nasE and qcrB, all of which
encode components of electron transport chains for oxidative phosphorylation or are required
for cytochrome maturation. The NADH dehydrogenase gene ndhF has four flags (Table 2, Table
S1). This is due to the conservation of the predicted target interaction (Table 9) and presence
in the enriched respiration-specific B-cluster B56 (Table 2). The latter suggests that RsaE/S415
has an effect on the expression level of ndhF. This effect would, for instance, take place under
conditions of heat stress or nitrate respiration, where ndhF is specifically expressed (17).
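To give a feeling for how an enrichment p-value of this kind can arise, the sketch below applies a one-sided hypergeometric test, which is one common way to score category enrichment. Whether this is the test behind Table 3 is not stated in this section, so the choice of test is an assumption. The count of seven predicted targets in the respiration category is taken from the gene list above, whereas the total number of predicted targets, the category size and the genome size are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    of which K belong to the category of interest."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


k = 7       # predicted targets in the category (seven genes listed in the text)
n = 100     # hypothetical total number of predicted targets
K = 80      # hypothetical number of genes in the category
N = 4200    # hypothetical number of genes in the genome

p = hypergeom_pvalue(k, n, K, N)
print(p)
```

With these placeholder numbers, about two category members would be expected among the predicted targets by chance, so observing seven gives a small p-value.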
Figure 15. RsaE/S415 promoter activity and nisin-sensitivity of an RsaE/S415 mutant
A) Promoter activity plot for an RsaE/S415-gfp gene fusion in cells grown in LB medium in a 96-well plate. RsaE/
S415 expression is high (compare to Figure 12) and prominent throughout the exponential phase. B) Flow cytometry
histogram shown for the GFP production in cells carrying the RsaE/S415-gfp promoter fusion. Autofluorescence of the
parental strain is indicated in grey, and representative data is shown. C) Increased sensitivity of an RsaE/S415 deletion
mutant to incubation with nisin. The survival of cells challenged with nisin was assessed by live/dead staining and flow
cytometry analyses. Representative flow cytometry data indicating a shift in color spectrum upon live/dead staining is
shown. A shift toward the left implies an increase in the number of cells with permeabilized membranes.
Although the expression pattern of RsaE/S415 may not be conserved from S. aureus
to B. subtilis, it is conceivable that there are functional processes that have been maintained in
evolution. To investigate whether this is the case, we first examined the results of an expression
analysis previously reported for an RsaE mutant in S. aureus (18). In this analysis 86 differentially
expressed mRNAs were identified. These mRNAs belonged to multiple categories, including genes
related to lipid metabolism, cofactor metabolism, energy transport, and cell envelope biogenesis
(18). Interestingly, the categories cofactor metabolism and energy transport are highly related
to the enriched conserved target category electron transport and ATP synthesis - respiration of
B. subtilis 168 (Table 5). To look for functional categories that are enriched beyond B. subtilis
168, we analyzed these evolutionary target predictions again without the requirement of the
target also being predicted in B. subtilis 168. As such, this would give an overview of predicted
targets in the whole clade of Bacillus genomes. Remarkably, the conserved predicted targets thus
retrieved are enriched for the functional categories lipid utilization (p-value <0.01), biosynthesis
of cofactors (p-value = 0.01), electron transport and ATP synthesis - respiration (p-value = 0.01),
and coping with hypo-osmotic stress (p-value = 0.04). It thus seems likely that the function of
RsaE/S415 in lipid metabolism and cofactor metabolism is conserved from S. aureus to B.
subtilis. We next inspected these predictions further at the level of
predicted targets. For the process of lipid metabolism, this led to the observation that genes of the
fad operon involved in acetyl-CoA metabolism were deregulated in S. aureus (fadABE operon),
and that some of these genes are also conserved predicted targets in Bacillus species. In B. subtilis
the specific predicted targets are fadH, fadE and fadN. The possible involvement of RsaE/S415
in acetyl-CoA metabolism is interesting, since it could perhaps partly explain the global changes
in metabolism observed in an S. aureus mutant defective in RsaE/S415 (19). In addition, acetyl-CoA
is a main link between central carbon metabolism and lipid metabolism.
The implication of RsaE/S415 in the regulation of lipid metabolism and the predicted
regulation of many membrane proteins, for instance those involved in respiration, suggested
possible cell envelope changes in a Bacillus mutant of RsaE/S415. However, it is not clear
whether these changes would also take place in B. subtilis 168. We therefore scrutinized our
target predictions for RsaE/S415 in B. subtilis to check whether there are any possible links to cell
envelope processes. Indeed, there are three predicted targets linked to cell envelope processes,
namely dacB, yrpC and gcaD. DacB is a D-alanyl-D-alanine carboxypeptidase and YrpC is a
glutamate racemase involved in peptidoglycan precursor biosynthesis. Both predicted targets
have two flags, one for co-expression and one for peaks correlation. The third predicted target
gcaD is an essential cell wall metabolism gene and contains an additional conserved flag (three
flags) (Table S1, Table S2). We attempted to detect changes in the state of the cell envelope of
an RsaE/S415 mutant by exposing it to nisin. Nisin is a post-translationally modified 34-amino
acid polycyclic antibacterial peptide, which targets the essential Lipid II in the cell membrane to
form pores in this membrane (67). A change in Lipid II abundance in the membrane will thus
lead to a change in nisin sensitivity. We analyzed nisin sensitivity by a live-dead staining assay.
This live-dead stain relies on the penetration of a fluorescent dye into cells with a compromised
cell membrane. Using this assay, we indeed observed an increased sensitivity to nisin of the
RsaE/S415 mutant compared to its parental strain (Figure 15C). Using the same assay, we also
identified an increased ethanol sensitivity of the RsaE/S415 mutant (data not shown). These
observations suggest that there are indeed changes in the cell envelope architecture that are
dependent on RsaE/S415. More targeted experiments are, however, required to directly link the
observed phenotypes to deregulation of the predicted mRNA targets of RsaE/S415.
S. aureus RsaE is probably involved in stationary phase adaptation, which is aimed
at reducing enzymes from central metabolism and increasing the amino acid pool (19). The
latter publication reported that in this adaptation many of the RsaE-modulated genes are also
dependent on CcpA (19). CcpA is the master regulator of carbon catabolite repression in
many Gram-positive bacteria, including S. aureus and B. subtilis. We therefore examined our
predictions to see whether such a link with CcpA might also be present in B. subtilis, which would
suggest a role for RsaE/S415 in the central carbon metabolism of B. subtilis. We did this
despite the fact that RsaE/S415 is expressed in the exponential growth phase (Figure 15A), and
is therefore unlikely to share the stationary phase function of S. aureus RsaE. The general target
predictions for RsaE/S415 indeed show an enrichment of targets from the CcpA regulon (Table
4). This enrichment is due to predicted interactions with rsbV, citM, ccpC, ctaC, ylbP, odhA, levF,
cstA, araQ, araA, acuB and licB (Table S1). The citM gene encodes an Mg2+-citrate transporter
and the protein product of ccpC represses citB and citZ. This suggests a role for B. subtilis RsaE/
S415 in citrate metabolism, as was also found for S. aureus. We additionally looked for other
targets involved in core carbon metabolism. Four core carbon metabolism genes are predicted
targets for RsaE/S415. These genes encode the 2-oxoglutarate dehydrogenase OdhA, succinate
dehydrogenase SdhC, the repressor of citB and citZ CcpC, and PdhD, which is a subunit of
both the pyruvate dehydrogenase and the 2-oxoglutarate dehydrogenase complexes. These four
potential targets all have two flags due to conditional expression and conservation, except for
ccpC, which is not conserved. Interestingly, OdhA and PdhD can be part of the same complex,
and ccpC and odhA are also part of the CcpA regulon. We therefore decided to construct a
translational GFP reporter, consisting of an in-frame fusion between the first 80 amino acids of
OdhA and GFP. The respective gene fusion was then placed under control of the native odhA
promoter on plasmid pRM3. GFP reporter activity of this construct was identified solely in the
exponential growth phase on LB medium (Figure 16A). Deregulated OdhA-GFP expression
compared to the parental strain was observed when GFP activity was assayed in an RsaE/S415
mutant background. This deregulation was characterized by a >2 fold increase in maximal GFP
reporter activity (Figure 16A). In addition, the reporter was active for a slightly longer period
of time in the RsaE/S415 deletion background. This deregulation in GFP reporter expression
was complemented by ectopic expression of RsaE/S415 under control of its native promoter
from the amyE locus (Figure 16A). In fact, the RsaE/S415-complemented strain exhibited an
earlier decrease in OdhA-GFP reporter activity than the parental strain (Figure 16A). Since this
is the opposite of what we observed for the RsaE/S415 mutant, this observation suggests that
RsaE/S415 is expressed at a (slightly) higher level from the amyE locus compared to its native
genomic locus. This has also been observed for complementation of another srRNA (data not
shown). We also integrated the RsaE/S415 complementation construct in the parental strain 168
to create a strain with an additional copy of RsaE/S415. This extra RsaE/S415 copy did not affect
the OdhA-GFP reporter expression compared to that of the parental strain, which may suggest
that repression by RsaE/S415 is already at its maximum in the parental strain. Despite this, these
observations suggest that odhA is a direct target of RsaE/S415.
To verify whether odhA could indeed be a direct RsaE/S415 target, we further inspected
the predicted srRNA-mRNA interaction region. The predicted odhA interaction is part of the
most conserved region of the RsaE/S415 molecule (Figure S8; Figure 16B). In B. subtilis 168,
the predicted interaction spans from the first codon of odhA (+3) until 57 bp after the start of
the ORF. This predicted interaction region is highly conserved in related bacteria (Table 9).
Notably, it has been reported that loop-exposed bases of srRNAs are more often responsible
for regulation than bases in stems (31). Six loop regions of RsaE/S415 are part of the predicted
interaction with odhA. The third of these loops contains the UCCC motif identified for RsaE/
S415 by Geissmann et al. (18). We checked whether target predictions could help to suggest a
seed region around this motif, but did not identify a preferential interaction region for RsaE/
S415 (Figure S9). Nevertheless, as an effect of the three consecutive G-C base-pairs, this third
loop contributes most strongly to the low free energy of the predicted interaction. With the
goal of disrupting the putative interaction, we therefore introduced a point mutation in RsaE/
S415 by a C to G substitution in the middle of the UCCC loop. The resulting mutant RNA was
designated RsaE/S415*. In addition, we constructed a compensatory mutation in the plasmid-borne
odhA-gfp reporter construct, designated odhA-gfp*. The subsequent OdhA-GFP(*)
expression analyses showed that the odhA-gfp* construct is still regulated by the wild-type
RsaE/S415 (data not shown), suggesting that the interaction is too strong to be disrupted by
a single base substitution. Furthermore, the mutated RsaE/S415* construct affected neither
the expression of the wild-type odhA-gfp reporter nor that of the mutated odhA-gfp* reporter
(data not shown). This suggests that the srRNA may either be destabilized by the introduced
point mutation or that the UCCC loop is not critically involved in an interaction between RsaE/
S415 and odhA. Further experimental analyses are required to unravel the molecular basis for
Figure 16. OdhA is deregulated in an RsaE/S415 mutant
A) GFP reporter activity (top panel) and growth (lower panel) in cells of the indicated strains grown on LB medium in
96-well plates. All cells express a translational fusion between OdhA and GFP. Expression of the OdhA-GFP reporter
is >2 fold higher in the RsaE/S415 deletion strain than in the parental strain (WT), and can be reversed by ectopic
expression of RsaE/S415 under control of its native promoter. B) Conserved structure of RsaE/S415 as predicted with
IntaRNA (24). The corresponding sequence alignment is shown in Figure S8. Red color indicates high conservation of
secondary structure. The core structure of RsaE/S415 contains many conserved base-pairs, but the top of the structure
is not conserved in all species. The blue arrow indicates the cytosine base that was changed into a guanine, and that
possibly destabilized the RsaE/S415. C) Predicted TargetRNA_v1 interaction between RsaE/S415 and odhA. Predicted
RNAfold loop regions are indicated in red. The mutated base pair, indicated with the arrow in panel B, is marked in
blue.
Table 9. Selection of conserved predicted RsaE/S415 targets
Same legend as for Table 6. QueryName is S415 and Length is 126 for all rows.

Predicted target ndhF (BDBH BSU01830)
Org                                            FracId  Rank  Name            Score  Pvalue  sRNA_start  sRNA_stop  mRNA_start  mRNA_stop  BDBH
Bacillus_JS_uid162189                          1       73    MY9_0188        -75    0.00    16          55         -1          36         BSU01830
Bacillus_subtilis_168_uid57675                 1       72    ndhF            -75    0.00    16          55         -1          36         BSU01830
Bacillus_subtilis_BSn5_uid62463                1       75    BSn5_12510      -75    0.00    16          55         -1          36         BSU01830
Bacillus_subtilis_BSP1_uid184010               1       80    A7A1_3299       -75    0.00    16          55         -1          36         BSU01830
Bacillus_subtilis_natto_BEST195_uid183001      1       78    ndhF            -75    0.00    16          55         -1          36         BSU01830
Bacillus_subtilis_QB928_uid173926              1       61    ndhF            -75    0.00    16          55         -1          36         BSU01830
Bacillus_subtilis_RO_NN_1_uid158879            1       66    I33_0230        -75    0.00    16          55         -1          36         BSU01830
Bacillus_subtilis_spizizenii_W23_uid51879      1       67    ndhF            -75    0.00    14          53         -1          36         BSU01830
Bacillus_atrophaeus_1942_uid59887              0.98    155   BATR1942_19550  -70    0.01    13          53         -1          40         BSU01830
Bacillus_subtilis_spizizenii_TU_B_10_uid73967  1       235   GYO_0376        -66    0.02    14          53         -1          36         BSU01830

Predicted target odhA (BDBH BSU19370)
Org                                                          FracId  Rank  Name            Score  Pvalue  sRNA_start  sRNA_stop  mRNA_start  mRNA_stop  BDBH
Bacillus_JS_uid162189                                        1       52    MY9_2119        -78    0.00    70          119        3           50         BSU19370
Bacillus_subtilis_168_uid57675                               1       52    sucA            -78    0.00    70          119        3           50         BSU19370
Bacillus_subtilis_BSn5_uid62463                              1       49    sucA            -78    0.00    70          119        3           50         BSU19370
Bacillus_subtilis_natto_BEST195_uid183001                    1       53    kgd             -78    0.00    70          119        3           50         BSU19370
Bacillus_subtilis_QB928_uid173926                            1       45    odhA            -78    0.00    70          119        3           50         BSU19370
Bacillus_subtilis_RO_NN_1_uid158879                          1       48    sucA            -78    0.00    70          119        3           50         BSU19370
Bacillus_subtilis_spizizenii_TU_B_10_uid73967                1       47    sucA            -78    0.00    68          117        3           50         BSU19370
Bacillus_subtilis_spizizenii_W23_uid51879                    1       47    odhA            -78    0.00    68          117        3           50         BSU19370
Bacillus_amyloliquefaciens_DSM_7_uid53535                    0.98    66    odhA            -73    0.00    57          104        -5          50         BSU19370
Bacillus_amyloliquefaciens_FZB42_uid58271                    0.98    62    sucA            -73    0.00    57          104        -5          50         BSU19370
Bacillus_amyloliquefaciens_LL3_uid158133                     0.98    73    odhA            -73    0.00    57          104        -5          50         BSU19370
Bacillus_amyloliquefaciens_plantarum_AS43_3_uid183682        0.98    65    sucA            -73    0.00    57          104        -5          50         BSU19370
Bacillus_amyloliquefaciens_plantarum_CAU_B946_uid84215       0.98    56    sucA            -73    0.00    57          104        -5          50         BSU19370
Bacillus_amyloliquefaciens_plantarum_YAU_B9601_Y2_uid159001  0.98    64    odhA            -73    0.00    57          104        -5          50         BSU19370
Bacillus_amyloliquefaciens_TA208_uid158701                   0.98    67    odhA            -73    0.00    57          104        -5          50         BSU19370
Bacillus_amyloliquefaciens_XH7_uid158881                     0.98    70    sucA            -73    0.00    57          104        -5          50         BSU19370
Bacillus_amyloliquefaciens_Y2_uid165195                      0.98    68    odhA            -73    0.00    57          104        -5          50         BSU19370
Bacillus_cytotoxicus_NVH_391_98_uid58317                     0.95    48    sucA            -71    0.00    12          39         -45         -20        BSU19370
Bacillus_atrophaeus_1942_uid59887                            0.98    249   sucA            -65    0.02    74          115        -5          46         BSU19370
Bacillus_coagulans_2_6_uid68053                              0.97    95    BCO26_1275      -63    0.01    42          69         17          44         BSU19370
Bacillus_coagulans_36D1_uid54335                             0.97    118   Bcoa_3252       -63    0.01    42          69         17          44         BSU19370
these observations and to validate odhA as a direct target of RsaE/S415. Notably, the observed
deregulation of OdhA by the RsaE/S415 deletion may also be due to indirect effects, for instance
other changes in core carbon metabolism or a disturbed acetyl-CoA metabolism. However, the
strong and conserved predicted interaction between RsaE/S415 and odhA seems to argue for a
direct srRNA-mRNA interaction.
Conclusion
A major aim of the studies presented in this chapter was to establish a bioinformatics pipeline
for the prediction of srRNA regulatory functions in B. subtilis. As was exemplified with the
srRNAs FsrA/S512, SR1, S462 and RsaE/S415, such predicted regulatory functions can indeed
be extracted from various elements in the presented predictions, either by inspecting these
elements separately or in combination. Based on the present results, we encourage a further
exploration of the predicted srRNA functions in B. subtilis as this may lead to a much deeper
understanding of the mechanisms underlying srRNA regulation in B. subtilis in particular and
in Gram-positive bacteria in general. For this purpose, we have provided all data files, the R code
used for the data analysis, and instructions on how to browse through these predictions in the
supplementary material.
It is becoming increasingly clear that srRNAs may have many origins. They can be
transcribed from independent promoters situated in intergenic genomic regions, but they can
also originate from RNA processing events and complex control of transcriptional-termination
of operons or even 3’UTRs (32, 41). It also cannot be excluded that there are asRNAs that have
a function in trans. This means that there are, most likely, many more putative srRNAs than
the 63 that were included in our present selection. Once these are identified, the
considerations for the study of putative srRNAs outlined in this chapter will also be of relevance
for these srRNAs.
There are currently multiple algorithms for srRNA target predictions available. Some
of these have been compared for different Gram-negative bacteria (26). We used TargetRNA_
v1 to provide a set of unbiased srRNA target predictions for the Gram-positive bacterium B.
subtilis. This set of predictions was further evaluated with the reported analyses to identify the
most likely srRNA targets. Such analyses can also be applied to other sets of target predictions,
including those from other target prediction algorithms. Furthermore, it would also be useful
to integrate precise knowledge on mRNA start sites in the target predictions to refine the range
of predictions. This seems feasible, for instance by implementation of RNAseq data (16), but the
incorporation of such data was beyond the scope of the present studies.
Lastly, we conclude from the present analyses that focusing on the identification of
srRNA phenotypes to subsequently link these to the deregulation of a particular target can only
be successful when the srRNA in question is sufficiently important to lead to a phenotype when
deleted. The fine-tuning nature of srRNA regulation, the abundance of potentially functionally
redundant srRNAs, and the relatively low sensitivity of phenotypic assays make it very
unlikely that clear phenotypes can be observed for every srRNA mutant. For example, the S462
RNA is under control of SigW, but a SigW mutant exhibits no particular phenotype; it does so
only in combination with deletions of other alternative sigma factors (68, 69). It is therefore not
expected that a strong S462 phenotype will be identified. Since only mRNA levels will change
upon srRNA deletion, the phenotypes of srRNA mutant strains are also expected to be less
prominent than phenotypes of mutants that lack, for instance, transcription factors. Instead of
focusing only on srRNA mutant phenotypes, studies on the function of putative srRNAs can also
include a pulsed overexpression approach of the respective srRNAs. The differentially expressed
genes that are identified upon pulse-induced srRNA overexpression can then be compared to
target predictions, for instance the ones reported here. Such a combination of bioinformatics
predictions and targeted experiments is expected to greatly advance our understanding of
srRNA regulation in B. subtilis.
Materials & Methods
Growth conditions and strain construction
LB medium was used for all experiments and cloning. When required, E. coli media were
supplemented with ampicillin (100 μg ml-1) or chloramphenicol (10 μg ml-1). Media for B.
subtilis were supplemented with phleomycin (4 μg ml-1), kanamycin (20 μg ml-1), tetracycline
(5 μg ml-1), chloramphenicol (10 μg ml-1) or spectinomycin (100 μg ml-1), or combinations
thereof.
E. coli and B. subtilis strains and plasmids used in this study are listed in Table S5A and
oligonucleotides in Table S5B. E. coli TG1 was used for all cloning procedures. All B. subtilis
strains were based on the trpC2-proficient parental strain 168 (70). B. subtilis transformations
were performed as described previously (71). The isogenic FsrA/S512 and RsaE/S415 mutants
were constructed according to the method described by Tanaka et al. (72). Promoter fusions for
S462, FsrA/S512, and RsaE/S415 were constructed using the integrative Ligation Independent
Cloning (LIC) plasmid pBaSysBioII (58). For this purpose, the start of the respective RNA
segment was fused with GFP. A minimum of three clones were checked to exclude possible
multi-copy integration of the pBaSysBioII plasmid. These promoter fusions were combined with
deletion mutants by transformation of the respective genomic DNA.
The translational OdhA-GFP reporter fusion was constructed via overlap PCR followed
by gel-purification of the respective amplified DNA fragment and LIC in the plasmid pRM3 (73)
(Chapter 7). RsaE/S415 was complemented in trans by integration of the complete RsaE/S415
sequence under control of its native promoter in the amyE locus. For this, RsaE/S415 was first
LIC-cloned into pRMC (Mars et al.; Chapter 5).
Assaying reporter strains
GFP fluorescence and growth (600 nm) were monitored at 10 min intervals for cells grown in
96-well plates in a Biotek® Synergy 2 plate reader as previously described (58). Autofluorescence
of the parental strain was subtracted and promoter activity was computed by subtracting the
fluorescence of the previous time-point from that of the measured time-point (as in Botella et
al. (58)). Moving average filtering (filter function in R with filter=rep(1/10, 10)) was used for
smoothing of the promoter activity plots. For flow cytometry, cultures were grown in shake flasks,
sampled in the indicated growth phase and directly analyzed in an Accuri C6 flow cytometer.
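The promoter activity computation described above can be sketched in Python as follows. This is an illustrative translation and not the R code used in the study: the function names are hypothetical, the autofluorescence is assumed to be available per time point, and the window alignment (four points before and five after each position for a window of 10) is an assumption about how R's filter function centres an even-length filter.

```python
def promoter_activity(fluorescence, autofluorescence):
    """Subtract parental-strain autofluorescence, then difference
    consecutive time points (current minus previous)."""
    corrected = [f - a for f, a in zip(fluorescence, autofluorescence)]
    return [later - earlier for earlier, later in zip(corrected, corrected[1:])]


def moving_average(values, window=10):
    """Moving average with equal weights 1/window.

    Positions where the window does not fit are returned as None
    (R would return NA there). The alignment is an assumption.
    """
    forward = window // 2            # points taken after position i
    backward = window - 1 - forward  # points taken before position i
    smoothed = []
    for i in range(len(values)):
        lo, hi = i - backward, i + forward
        if lo < 0 or hi >= len(values):
            smoothed.append(None)
        else:
            smoothed.append(sum(values[lo:hi + 1]) / window)
    return smoothed
```

With measurements taken at 10 min intervals, a window of 10 points corresponds to smoothing over roughly 100 min of growth.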
Transcriptomics and 2D PAGE
Cultures for RNA isolations from cells grown on LB were sampled in the late exponential /
transition phase (OD600nm 3.2). The cells were directly harvested in killing buffer and processed
as described previously (17). RNA samples were analyzed with high-density tiling arrays and
analyzed as was described previously (17). For 2D PAGE analyses, the cells were harvested by
centrifugation, resuspended in TE-Buffer (10 mM Tris, 1 mM EDTA, pH 7.5), and mechanically
disrupted using a Precellys 24 homogenizer (PeqLab, Germany; 3 x 30 s at 6.5 m s-1). Protein
concentrations of extracts were determined using a ninhydrin-based assay (74, 75). Three
biological and two technical replicates were analyzed.
2D PAGE was performed as previously described (76). 100 μg protein was loaded onto
18 cm IPG strips (pH 4-7, GE-Healthcare). After 2D PAGE, gels were fixed with 40 % (v/v)
ethanol and 10 % (v/v) acetic acid for 1 to 2 h and subsequently stained with Flamingo™.
Stained gels were scanned (Typhoon 9400, GE-Healthcare) and the resulting images were
analyzed and quantified employing Delta2D 4.2 software (Decodon GmbH, Germany). For all
spots detected on the gel, the spot volume was assigned to proteins, exported from the software
and subsequently used for calibration of 2D gels as described previously (77). Spot quantities
were calculated as % volume of each spot on one gel compared to all detected spots on the gel.
The spot volumes were used to discover spots with significantly changed abundance (Student's
t-test). Significantly changed spots were cut from the gel, protein digestion was performed in
an Ettan Spot Handling Workstation (GE Healthcare), and samples were analyzed by
MALDI-TOF-MS/MS using the Proteome Analyzer 4800 (Applied Biosystems) as described in (77).
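As a small worked illustration of the spot quantification, the relative spot quantity (% volume) can be computed as sketched below. The function name and the input layout (a mapping from spot identifier to raw spot volume for one gel) are assumptions made for this example; the actual quantification was performed in the Delta2D software.

```python
def percent_volumes(spot_volumes):
    """Express each spot volume as a percentage of the summed
    volume of all detected spots on the same gel."""
    total = sum(spot_volumes.values())
    return {spot: 100.0 * volume / total for spot, volume in spot_volumes.items()}
```

Normalizing per gel in this way makes spot quantities comparable between gels that differ in overall staining intensity.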
Nisin stress assay
Overnight cultures of cells grown in LB with the appropriate antibiotics were used to inoculate
fresh LB broth at approximately a 1:100 dilution. When this pre-culture reached exponential
phase, the samples were diluted to an OD600 of 0.05 in 100 ml bottles at a final volume of 5 ml
LB. Growth was continued for approximately 70 min to an OD600 of 0.4–0.5. At this point nisin
was added to a final concentration of 1 mM. The nisin was purchased from Biochemika (Fluka,
Sigma Aldrich). After 10 min incubation with vigorous shaking, 1 ml of cells were pelleted by
centrifugation for 1 min at maximal speed. The supernatant was discarded and cell pellets were
gently re-suspended in 0.5 ml 0.85% NaCl prior to the addition of 1 ml live/dead stain (1:1 SYTO
9:propidium iodide; LIVE/DEAD BacLight Bacterial Viability and Counting Kit; Invitrogen).
The membrane integrity of nisin-stressed cells was then determined by flow cytometry in an
Accuri C6 Flow Cytometer as previously described (78).
Expression data and annotation of RNA segments
The tiling array expression data and annotation of RNA segments are available in the online
supplement of Nicolas et al. (17). The re-classification of RNA segments into four groups, global
conservation analyses and secondary structure predictions have been described in Chapter 3 of
this thesis.
Target predictions and computation of functional enrichment
Putative srRNA targets were predicted with the program TargetRNA_v1 (23) on the B. subtilis
168 genome (Genbank: AL009126-3) near the 5’ region. The search region was defined as -75
bp; +50 bp around the start codon of the CDS or around the 5’-end of the other RNA segments
reported by Nicolas et al. (17). The additional command line arguments “-z 10000 -y 2 -l 6” were
used to specify relaxed search and output criteria in order to obtain a set of predictions that is as
complete as possible: a maximum of 10000 hits per query in the output (effectively
unlimited; default 100); no p-value cut-off (default 0.01); and a minimum of 6 consecutive base
pairs in the interaction (default 9).
Enrichment of functional categories was expressed as a binomial p-value computed on
the genes following the SubtiWiki annotation (specifically column “FuncName3” corresponding
to the third level of the functional classification) (49). Enrichment of B-clusters within the
predicted targets for a putative srRNA was computed similarly, based on the B-cluster annotation
by Nicolas et al. (17). The B-clusters group genes or RNA segments with substantial similarity in
expression profiles as determined by the Pearson correlation coefficient (the average pair-wise
coefficient within these B-clusters is 0.6).
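A binomial enrichment p-value of this kind can be sketched as follows. The text does not specify the exact form of the test, so this example assumes a one-sided upper-tail test in which each predicted target falls into the category with a probability equal to the fraction of genes in the genome that belong to the category. The function and parameter names are hypothetical.

```python
from math import comb


def binomial_enrichment_pvalue(hits, n_targets, category_size, genome_size):
    """P(X >= hits) for X ~ Binomial(n_targets, category_size / genome_size).

    hits          -- predicted targets that belong to the category
    n_targets     -- total number of predicted targets for the srRNA
    category_size -- number of genes annotated with the category
    genome_size   -- number of annotated genes considered
    """
    p = category_size / genome_size
    return sum(
        comb(n_targets, i) * p**i * (1 - p) ** (n_targets - i)
        for i in range(hits, n_targets + 1)
    )
```

The same function would apply to B-cluster enrichment by using the B-cluster size in place of the category size.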
Evolutionarily conserved target analysis
In order to identify putative srRNA – predicted target pairs that are evolutionarily conserved,
we used the 62 Bacillus genomes available in Genbank (as of January 31, 2013). On each of these
genomes a BLAST search (Blastn v2.2.26 with default parameters) was conducted with the B.
subtilis 168 sequence of the putative srRNA (in practice the expression segments classified as
All-Indep in (17)). Genomes in which a homologous segment (E-value < 0.001) was found were
then subjected to a TargetRNA_v1 search with extended settings around the 5’UTR (-75 bp;
+50 bp around the start codon and additional command line arguments “-z 250 -y 2 -l 6”) using
as a query the sequence of the first high-scoring-pair of the first BLAST hit in that particular
genome. A bidirectional best hit criterion (based on Blastp v2.2.26 with default parameters and
E-value cut-off 0.001) was used to compare the predicted targets in the reference B. subtilis 168
genome (Genbank: AL009126-3) with the predicted targets in the other genomes. The data was
tabulated and subsetted to only include genes that were predicted in 8 or more genomes, and
were predicted in B. subtilis 168 with p-value ≤0.01. To inspect the complete predicted conserved
RsaE regulon, the data for this srRNA was additionally analyzed without the latter criterion.
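The final tabulation and subsetting step can be sketched as below. The input layout is an assumption made for illustration: each prediction is a tuple of genome name, the B. subtilis 168 locus tag of the bidirectional best hit (BDBH) of the predicted target, and the TargetRNA_v1 p-value. The BLAST and TargetRNA_v1 steps themselves are not reproduced here.

```python
def conserved_targets(predictions, reference="Bacillus_subtilis_168_uid57675",
                      min_genomes=8, max_pvalue=0.01, require_reference=True):
    """Return BDBH locus tags predicted in at least `min_genomes` genomes.

    With require_reference=True the target must also be predicted in the
    reference genome with a p-value <= max_pvalue; setting it to False
    mirrors the additional analysis described for RsaE.
    """
    genomes_per_target = {}
    reference_pvalue = {}
    for genome, bdbh, pvalue in predictions:
        genomes_per_target.setdefault(bdbh, set()).add(genome)
        if genome == reference:
            best = reference_pvalue.get(bdbh, float("inf"))
            reference_pvalue[bdbh] = min(best, pvalue)

    selected = []
    for bdbh, genomes in genomes_per_target.items():
        if len(genomes) < min_genomes:
            continue
        if require_reference and reference_pvalue.get(bdbh, float("inf")) > max_pvalue:
            continue
        selected.append(bdbh)
    return sorted(selected)
```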
Target correlations under high sRNA expression (peaks expression)
We investigated the pair-wise expression correlation across biological conditions of the
independent RNA segments and their predicted targets in B. subtilis 168 with two different
analyses. In the first analysis, we simply computed the pair-wise Pearson correlation across
conditions between a putative srRNA and a predicted target across the 269 hybridizations
performed by Nicolas et al. (17). In the second analysis the Pearson correlation coefficient was
computed only for the hybridizations around a peak of expression of the putative srRNA in the
condition space. The goal of this was to address the problem that the correlation between srRNA
and target might be strong only in a subset of conditions near the induction of the srRNA. In
practice we defined the center of the peak as the hybridization in which the highest expression
level was measured for the putative srRNA. We then proceeded with the step-wise aggregation
of the other hybridizations starting with those whose global transcriptome is the most similar
(correlated) to the hybridization selected as the center of the peak. We stopped this aggregation
process after the inclusion of 6 hybridizations where the expression level of the putative srRNA
is below a cut-off (quantile 75% of the distribution of the expression levels for this putative
srRNA). For each srRNA segment, a second expression peak was also investigated when the first
peak did not encompass all the hybridizations where the expression of the putative srRNA is in
its upper 10% and not less than 4x its global maximum. The pair-wise Pearson correlations were
then computed for the subset of hybridizations included in each peak. For each putative srRNA
and in each analysis both the ranking and p-values of the pair-wise correlations were plotted.
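The step-wise peak aggregation can be sketched in Python as follows. This is a simplified illustration under several assumptions: only the first peak is handled, the hybridizations with low srRNA expression that trigger the stop are themselves kept in the peak, the 75% quantile uses linear interpolation, and all names are hypothetical. It is not the R code used in the study.

```python
from math import sqrt
from statistics import quantiles


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def peak_hybridizations(srna, profiles, max_low=6):
    """Indices of the hybridizations forming the first expression peak.

    srna     -- srRNA expression level per hybridization
    profiles -- global transcriptome (list of values) per hybridization
    max_low  -- stop after this many low-expression hybridizations were added
    """
    center = max(range(len(srna)), key=srna.__getitem__)
    cutoff = quantiles(srna, n=4, method="inclusive")[2]  # 75% quantile
    # Other hybridizations, most similar to the peak center first.
    others = sorted(
        (i for i in range(len(srna)) if i != center),
        key=lambda i: pearson(profiles[i], profiles[center]),
        reverse=True,
    )
    peak, low = [center], 0
    for i in others:
        peak.append(i)
        if srna[i] < cutoff:
            low += 1
            if low == max_low:
                break
    return peak


def peak_correlation(srna, target, peak):
    """Pearson correlation of srRNA and target restricted to the peak."""
    return pearson([srna[i] for i in peak], [target[i] for i in peak])
```

Restricting the correlation to the peak in this way can expose a relationship that is diluted when all 269 hybridizations are used.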
Acknowledgements
This work was supported by the Commission of the European Union (projects LSHG-CT-2006-037469
and 244093), and the transnational Systems Biology of Microorganisms
(SysMO) organization (project BACELL SysMO2) through the Research Council for Earth and
Life Sciences of the Netherlands Organization for Scientific Research.
Supplementary Material (available on request; email [email protected])
Figures S1 - S9
Tables S1 - S5
Supplementary data file predictions
Compressed file containing the following files (additional required files from SubtiWiki (49) or Nicolas et al. (17) are
included but not listed here):
• R code for analysis and browsing through these files (R code prediction manuscript Chapter 4).
• All target predictions up to TargetRNA_v1 p-value 1 (HUGEallresTargetRNA_20120412.csv). Only open in
R (too large for Excel / Open Office).
• All results peaks correlation analysis (allrespeakcor_20140902.tsv). Only open in R (too large for Excel /
Open Office).
• All information on conserved targets that received a Conserved flag
(ConservedTargetsSelectionCutoff8SpeciesOnlySubtilis.csv). Only open in R (too large for Excel / Open
Office).
• Selection of putative srRNAs (new selection sRNAs update.csv).
• Table S1 (tp0.01subsetWithFlagsCount.csv).
• Folder containing plots like in Figure 11 for all putative srRNAs from the selection (Regregion plots folder).
• Text file with all blast results of the selection of putative srRNAs. Can be used to make alignments
(lastSelectionBlastSepForAlignment.fasta).
Supplementary data file for the transcript profiling analysis of an S512 mutant using tiling arrays
Compressed file containing the following files (in addition to Figures S4, S5, and S6):
• Normalized tiling array data S512 mutant and parental strain (tilingQnormS512_and_WT.csv).
• Outcome of analysis with limma package from R (all analyzed S512.csv).
• Differentially expressed genes in the S512 mutant strain with p-value ≤ 0.01 (significantly changed targets
S512 0.01.csv).
• Differentially expressed genes in the S512 mutant strain with p-value ≤ 0.05 (significantly changed targets
S512 0.05.csv).
• Overlap between predicted targets (with p-value ≤ 0.01) and significantly changed targets in the expression
analysis (≤ 0.05) (tilingS512only0.01_0.05.csv).
Descriptions, legends and selected Supplementary Figures
Table S1. All predicted targets with additional information
For legend see Table 2 in main text.
Table S2. Predicted targets from Table S1 with three or more flags
For legend see Table 2 in main text.
Table S3. Enriched B-clusters in predictions
Query, putative srRNA name. Bcluster, enriched B-cluster of the putative srRNAs computed on all B. subtilis 168 target
predictions. pval, binomial p-value indicating the significance of the enrichment.
Table S4. Peaks targets
sRNA, putative srRNA name. Name, name of the predicted target. Ltag, unique B. subtilis 168 locus tag of the predicted
target. GlobalCor, Pearson correlation between the expression level of the predicted target and the srRNA in all
conditions. GlobalCorPvalue, p-value of the GlobalCor. GlobalCorRankNeg, 1 for the gene that is most negatively
correlated across all hybridizations. GlobalCorRankPos, 1 for the gene that is most positively correlated across all
hybridizations. iPeak, peak number (maximum two). PeakCenter, unique tag of the relevant hybridization from (17).
PeakHeight, peak expression level. PeakSize, number of hybridizations with this peak value. Cor, peak correlation
between the expression level of the predicted target and the srRNA. CorPvalue, p-value of Cor. CorRankNeg, 1 for the
gene that is the most negatively correlated in the peak conditions. CorRankPos, 1 for the gene that is the most positively
correlated in the peak conditions. PredPval, TargetRNA_v1 (23) prediction p-value. This table contains predicted targets
with p-values <0.01. PeakCorScore, the absolute difference between the global correlation (GlobalCor) and the peak
correlation (Cor). This table contains pairs with a PeakCorScore of >1.5.
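The relation between the correlation columns of Table S4 and the PeakCorScore can be sketched as follows. This is a minimal illustration, assuming that the peak hybridizations are already known and passed in as indices; how peaks are detected is not covered here, and all names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def peak_cor_score(srna, target, peak_idx):
    """Absolute difference between the all-condition and the peak-only correlation."""
    global_cor = pearson(srna, target)
    peak_cor = pearson([srna[i] for i in peak_idx], [target[i] for i in peak_idx])
    return abs(global_cor - peak_cor)


# Made-up expression levels across six hybridizations; the last three are the peak.
srna = [1, 2, 3, 4, 5, 6]
target = [1, 2, 3, 6, 5, 4]
peak_idx = [3, 4, 5]

print(peak_cor_score(srna, target, peak_idx))
```

Because a Pearson correlation lies between -1 and 1, the score lies between 0 and 2, so a value above 1.5 requires the global and peak correlations to have opposite signs.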
Table S5. Strains, plasmids, and oligonucleotides used in this study
[Figure S1: network plot image; node labels (putative srRNA and target gene names) not reproduced.]
Figure S1. Conserved target network
Only targets with a conservation flag were extracted from the Table of predictions (Table S1) and formatted for making
a network plot with the open source software Gephi (http://gephi.github.io).
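The extraction and formatting step could look roughly like the sketch below, which keeps only predictions carrying a conservation flag and writes them as an edge list with `Source` and `Target` columns, a layout that Gephi's spreadsheet import accepts for edge tables. The row keys (`srna`, `target`, `conserved`) and the example values are hypothetical and do not reflect the actual columns of Table S1.

```python
import csv
import io


def conserved_edge_list(rows):
    """Build CSV text with Source,Target columns from rows flagged as conserved."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    for row in rows:
        if row["conserved"]:
            writer.writerow([row["srna"], row["target"]])
    return buffer.getvalue()


# Made-up rows standing in for entries of the prediction table.
rows = [
    {"srna": "sRNA_1", "target": "geneA", "conserved": True},
    {"srna": "sRNA_1", "target": "geneB", "conserved": False},
    {"srna": "sRNA_2", "target": "geneC", "conserved": True},
]

edge_csv = conserved_edge_list(rows)
print(edge_csv)
```

Writing to an in-memory buffer keeps the sketch free of file paths; in practice the text would be saved to a .csv file before import.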
Figure S2. Structural LocARNA alignment of the FsrA/S512 sequence
Sequence alignment of FsrA/S512 using local nBLAST results for the set of Bacillus genomes processed for alignment
and secondary structure prediction using the LocARNA algorithm
(http://rna.informatik.uni-freiburg.de/LocARNA/Input.jsp) (57).
Figure S3. Representative 2D PAGE analysis of proteins in an FsrA/S512 mutant and its parental strain.
Protein names of significantly changed spots identified with MALDI-TOF MS are indicated.
Figure S4. Hierarchical tree of changes observed by 2D PAGE analysis in the FsrA/S512 mutant strain compared
to its parental strain 168
Tree of changes in protein abundance identified by replicate 2D PAGE analyses of an FsrA/S512 mutant strain compared
to its parental strain 168.
Figure S5. Significant changes in the abundance of proteins identified by 2D PAGE analysis of an FsrA/S512
mutant compared to its parental strain
Overview of significant changes in protein abundance using data obtained through the analysis shown in Figures S3
and S4.
Figure S6. Deregulation of sulfur metabolism in an FsrA/S512 mutant
Fold changes in the expression of genes for sulfur metabolism upon deletion of FsrA/S512 are indicated on the sulfur
metabolism network. Positive (mutant / parental) fold changes are plotted in red and negative fold changes in dark
green. The expression of genes responsible for cysteine biosynthesis is increased. Adapted from (79).
Figure S7. Structural LocARNA alignment of the S462 sequence
Sequence alignment of S462 using local nBLAST results for the set of Bacillus genomes processed for alignment and
secondary structure prediction using the LocARNA algorithm (57).
Figure S8. Structural LocARNA alignment of the RsaE/S415 sequence
Sequence alignment of RsaE/S415 using local nBLAST results for the set of Bacillus genomes processed for alignment
and secondary structure prediction using the LocARNA algorithm (57).
.((((.((((.....)))).(((((.(((((.((((((..((((...((((((.........))))))...))))..))))))...))))).......((((.....)))))))))......))))
AAAGTCGACATCTTTTGTTATCATAAGGATGTGAAATTGATCACAAACAAACATTACCCCTTTGTTTGACCGTGAAAAATTTCTCCCATCCCCTTTGTTGTCGTTAAGACATATGAAACCGCGCTT
Figure S9. Predicted targets of RsaE/S415 plotted on the RsaE/S415 sequence.
For legend see Figure 11 in main text.
References
1. Gorke,B. and Vogel,J. (2008) Noncoding RNA control of the making and breaking of sugars. Genes Dev., 22, 2914-2925.
2. Liu,J.M. and Camilli,A. (2010) A broadening world of bacterial small RNAs. Curr. Opin. Microbiol., 13, 18-23.
3. Storz,G., Vogel,J. and Wassarman,K.M. (2011) Regulation by small RNAs in bacteria: Expanding frontiers. Mol. Cell,
43, 880-891.
4. Shimoni,Y., Friedlander,G., Hetzroni,G., Niv,G., Altuvia,S., Biham,O. and Margalit,H. (2007) Regulation of gene
expression by small non-coding RNAs: A quantitative view. Mol. Syst. Biol., 3, 138.
5. Jost,D., Nowojewski,A. and Levine,E. (2013) Regulating the many to benefit the few: Role of weak small RNA targets.
Biophys. J., 104, 1773-1782.
6. Cech,T.R. and Steitz,J.A. (2014) The noncoding RNA revolution-trashing old rules to forge new ones. Cell, 157, 77-94.
7. Waters,L.S. and Storz,G. (2009) Regulatory RNAs in bacteria. Cell, 136, 615-628.
8. Sharma,C.M., Papenfort,K., Pernitzsch,S.R., Mollenkopf,H.J., Hinton,J.C. and Vogel,J. (2011) Pervasive
post-transcriptional control of genes involved in amino acid metabolism by the Hfq-dependent GcvB small RNA. Mol.
Microbiol., 81, 1144-1165.
9. Jousselin,A., Metzinger,L. and Felden,B. (2009) On the facultative requirement of the bacterial RNA chaperone, Hfq.
Trends Microbiol., 17, 399-405.
10. Gaballa,A., Antelmann,H., Aguilar,C., Khakh,S.K., Song,K.B., Smaldone,G.T. and Helmann,J.D. (2008) The Bacillus
subtilis iron-sparing response is mediated by a Fur-regulated small RNA and three small, basic proteins. Proc. Natl.
Acad. Sci. U. S. A., 105, 11927-11932.
11. Smaldone,G.T., Revelles,O., Gaballa,A., Sauer,U., Antelmann,H. and Helmann,J.D. (2012) A global investigation of
the Bacillus subtilis iron-sparing response identifies major changes in metabolism. J. Bacteriol., 194, 2594-2605.
12. Heidrich,N., Chinali,A., Gerth,U. and Brantl,S. (2006) The small untranslated RNA SR1 from the Bacillus subtilis
genome is involved in the regulation of arginine catabolism. Mol. Microbiol., 62, 520-536.
13. Hammerle,H., Amman,F., Vecerek,B., Stulke,J., Hofacker,I. and Blasi,U. (2014) Impact of hfq on the Bacillus subtilis
transcriptome. PLoS One, 9, e98661.
14. Saito,S., Kakeshita,H. and Nakamura,K. (2009) Novel small RNA-encoding genes in the intergenic regions of
Bacillus subtilis. Gene, 428, 2-8.
15. Schmalisch,M., Maiques,E., Nikolov,L., Camp,A.H., Chevreux,B., Muffler,A., Rodriguez,S., Perkins,J. and Losick,R.
(2010) Small genes under sporulation control in the Bacillus subtilis genome. J. Bacteriol., 192, 5402-5412.
16. Irnov,I., Sharma,C.M., Vogel,J. and Winkler,W.C. (2010) Identification of regulatory RNAs in Bacillus subtilis.
Nucleic Acids Res., 38, 6637-6651.
17. Nicolas,P., Mader,U., Dervyn,E., Rochat,T., Leduc,A., Pigeonneau,N., Bidnenko,E., Marchadier,E., Hoebeke,M.,
Aymerich,S., et al. (2012) Condition-dependent transcriptome reveals high-level regulatory architecture in Bacillus
subtilis. Science, 335, 1103-1106.
18. Geissmann,T., Chevalier,C., Cros,M.J., Boisset,S., Fechter,P., Noirot,C., Schrenzel,J., Francois,P., Vandenesch,F.,
Gaspin,C., et al. (2009) A search for small noncoding RNAs in Staphylococcus aureus reveals a conserved sequence
motif for regulation. Nucleic Acids Res., 37, 7239-7257.
19. Bohn,C., Rigoulay,C., Chabelskaya,S., Sharma,C.M., Marchais,A., Skorski,P., Borezee-Durant,E., Barbet,R.,
Jacquet,E., Jacq,A., et al. (2010) Experimental discovery of small RNAs in Staphylococcus aureus reveals a riboregulator
of central metabolism. Nucleic Acids Res., 38, 6620-6636.
20. Chabelskaya,S., Gaillot,O. and Felden,B. (2010) A Staphylococcus aureus small RNA is required for bacterial
virulence and regulates the expression of an immune-evasion molecule. PLoS Pathog., 6, e1000927.
21. Backofen,R. and Hess,W.R. (2010) Computational prediction of sRNAs and their targets in bacteria. RNA Biol., 7,
33-42.
22. Sharma,C.M. and Vogel,J. (2009) Experimental approaches for the discovery and characterization of regulatory
small RNA. Curr. Opin. Microbiol., 12, 536-546.
23. Tjaden,B., Goodwin,S.S., Opdyke,J.A., Guillier,M., Fu,D.X., Gottesman,S. and Storz,G. (2006) Target prediction for
small, noncoding RNAs in bacteria. Nucleic Acids Res., 34, 2791-2802.
24. Busch,A., Richter,A.S. and Backofen,R. (2008) IntaRNA: Efficient prediction of bacterial sRNA targets incorporating
target site accessibility and seed regions. Bioinformatics, 24, 2849-2856.
25. Eggenhofer,F., Tafer,H., Stadler,P.F. and Hofacker,I.L. (2011) RNApredator: Fast accessibility-based prediction of
sRNA targets. Nucleic Acids Res., 39, W149-54.
26. Wright,P.R., Richter,A.S., Papenfort,K., Mann,M., Vogel,J., Hess,W.R., Backofen,R. and Georg,J. (2013) Comparative
genomics boosts target prediction for bacterial small RNAs. Proc. Natl. Acad. Sci. U. S. A., 110, E3487-96.
27. Tjaden,B. (2008) TargetRNA: A tool for predicting targets of small RNA action in bacteria. Nucleic Acids Res., 36,
W109-13.
28. Modi,S.R., Camacho,D.M., Kohanski,M.A., Walker,G.C. and Collins,J.J. (2011) Functional characterization of
bacterial sRNAs using a network biology approach. Proc. Natl. Acad. Sci. U. S. A., 108, 15522-15527.
29. Smolke,C.D. and Keasling,J.D. (2002) Effect of gene location, mRNA secondary structures, and RNase sites on
expression of two genes in an engineered operon. Biotechnol. Bioeng., 80, 762-776.
30. Prevost,K., Desnoyers,G., Jacques,J.F., Lavoie,F. and Masse,E. (2011) Small RNA-induced mRNA degradation
achieved through both translation block and activated cleavage. Genes Dev., 25, 385-396.
31. Peer,A. and Margalit,H. (2011) Accessibility and evolutionary conservation mark bacterial small RNA
target-binding regions. J. Bacteriol., 193, 1690-1701.
32. Chao,Y., Papenfort,K., Reinhardt,R., Sharma,C.M. and Vogel,J. (2012) An atlas of Hfq-bound transcripts reveals 3’
UTRs as a genomic reservoir of regulatory small RNAs. EMBO J., 31, 4005-4019.
33. Vogel,J. and Luisi,B.F. (2011) Hfq and its constellation of RNA. Nat. Rev. Microbiol., 9, 578-589.
34. Saito,S., Kakeshita,H. and Nakamura,K. (2009) Novel small RNA-encoding genes in the intergenic regions of
Bacillus subtilis. Gene, 428, 2-8.
35. Smaldone,G.T., Antelmann,H., Gaballa,A. and Helmann,J.D. (2012) The FsrA sRNA and FbpB protein mediate the
iron-dependent induction of the Bacillus subtilis LutABC iron-sulfur containing oxidases. J. Bacteriol., 194, 2586-2593.
36. Gimpel,M., Preis,H., Barth,E., Gramzow,L. and Brantl,S. (2012) SR1--a small RNA with two remarkably conserved
functions. Nucleic Acids Res., 40(22), 11659-11672.
37. Gimpel,M., Heidrich,N., Mader,U., Krugel,H. and Brantl,S. (2010) A dual-function sRNA from B. subtilis: SR1 acts
as a peptide encoding mRNA on the gapA operon. Mol. Microbiol., 76, 990-1009.
38. Preis,H., Eckart,R.A., Gudipati,R.K., Heidrich,N. and Brantl,S. (2009) CodY activates transcription of a small RNA
in Bacillus subtilis. J. Bacteriol., 191, 5446-5457.
39. Heidrich,N., Moll,I. and Brantl,S. (2007) In vitro analysis of the interaction between the small RNA SR1 and its
primary target ahrC mRNA. Nucleic Acids Res., 35, 4331-4346.
40. Marchais,A., Duperrier,S., Durand,S., Gautheret,D. and Stragier,P. (2011) CsfG, a sporulation-specific, small noncoding RNA highly conserved in endospore formers. RNA Biol., 8, 358-364.
41. Loh,E., Dussurget,O., Gripenland,J., Vaitkevicius,K., Tiensuu,T., Mandin,P., Repoila,F., Buchrieser,C., Cossart,P.
and Johansson,J. (2009) A trans-acting riboswitch controls expression of the virulence regulator PrfA in Listeria
monocytogenes. Cell, 139, 770-779.
42. Licht,A., Preis,S. and Brantl,S. (2005) Implication of CcpN in the regulation of a novel untranslated RNA (SR1) in
Bacillus subtilis. Mol. Microbiol., 58, 189-206.
43. Barrick,J.E., Corbino,K.A., Winkler,W.C., Nahvi,A., Mandal,M., Collins,J., Lee,M., Roth,A., Sudarsan,N., Jona,I., et
al. (2004) New RNA motifs suggest an expanded scope for riboswitches in bacterial genetic control. Proc. Natl. Acad.
Sci. U. S. A., 101, 6421-6426.
44. Silvaggi,J.M., Perkins,J.B. and Losick,R. (2006) Genes for small, noncoding RNAs under sporulation control in
Bacillus subtilis. J. Bacteriol., 188, 532-541.
45. Schmalisch,M., Maiques,E., Nikolov,L., Camp,A.H., Chevreux,B., Muffler,A., Rodriguez,S., Perkins,J. and Losick,R.
(2010) Small genes under sporulation control in the Bacillus subtilis genome. J. Bacteriol., 192, 5402-5412.
46. Wadler,C.S. and Vanderpool,C.K. (2007) A dual function for a bacterial small RNA: SgrS performs base
pairing-dependent regulation and encodes a functional polypeptide. Proc. Natl. Acad. Sci. U. S. A., 104, 20454-20459.
47. Kery,M.B., Feldman,M., Livny,J. and Tjaden,B. (2014) TargetRNA2: Identifying targets of small regulatory RNAs in
bacteria. Nucleic Acids Res., 42, W124-9.
48. Beisel,C.L. and Storz,G. (2010) Base pairing small RNAs and their roles in global regulatory networks. FEMS
Microbiol. Rev., 34, 866-882.
49. Mader,U., Schmeisky,A.G., Florez,L.A. and Stulke,J. (2012) SubtiWiki--a comprehensive community resource for
the model organism Bacillus subtilis. Nucleic Acids Res., 40, D1278-87.
50. Papenfort,K., Podkaminski,D., Hinton,J.C. and Vogel,J. (2012) The ancestral SgrS RNA discriminates horizontally
acquired Salmonella mRNAs through a single G-U wobble pair. Proc. Natl. Acad. Sci. U. S. A., 109, E757-64.
51. Smits,W.K. and Grossman,A.D. (2010) The transcriptional regulator Rok binds A+T-rich DNA and is involved in
repression of a mobile genetic element in Bacillus subtilis. PLoS Genet., 6, e1001207.
52. Kaberdin,V.R. and Blasi,U. (2006) Translation initiation and the fate of bacterial mRNAs. FEMS Microbiol. Rev.,
30, 967-979.
53. Levine,E., Zhang,Z., Kuhlman,T. and Hwa,T. (2007) Quantitative characteristics of gene regulation by small RNA.
PLoS Biol., 5, e229.
54. Papenfort,K., Bouvier,M., Mika,F., Sharma,C.M. and Vogel,J. (2010) Evidence for an autonomous 5’ target
recognition domain in an Hfq-associated small RNA. Proc. Natl. Acad. Sci. U. S. A., 107, 20435-20440.
55. Beisel,C.L., Updegrove,T.B., Janson,B.J. and Storz,G. (2012) Multiple factors dictate target selection by Hfq-binding
small RNAs. EMBO J., 31, 1961-1974.
56. Na,D., Yoo,S.M., Chung,H., Park,H., Park,J.H. and Lee,S.Y. (2013) Metabolic engineering of Escherichia coli using
synthetic small regulatory RNAs. Nat. Biotechnol., 31, 170-174.
57. Will,S., Joshi,T., Hofacker,I.L., Stadler,P.F. and Backofen,R. (2012) LocARNA-P: Accurate boundary prediction and
improved detection of structural RNAs. RNA, 18, 900-914.
58. Botella,E., Fogg,M., Jules,M., Piersma,S., Doherty,G., Hansen,A., Denham,E.L., Le Chat,L., Veiga,P., Bailey,K., et al.
(2010) pBaSysBioII: An integrative plasmid generating gfp transcriptional fusions for high-throughput analysis of gene
expression in Bacillus subtilis. Microbiology, 156, 1600-1608.
59. Smits,W.K., Dubois,J.Y., Bron,S., van Dijl,J.M. and Kuipers,O.P. (2005) Tricksy business: Transcriptome analysis
reveals the involvement of thioredoxin A in redox homeostasis, oxidative stress, sulfur metabolism, and cellular
differentiation in Bacillus subtilis. J. Bacteriol., 187, 3921-3930.
60. Frees,D., Savijoki,K., Varmanen,P. and Ingmer,H. (2007) Clp ATPases and ClpP proteolytic complexes regulate vital
biological processes in low GC, gram-positive bacteria. Mol. Microbiol., 63, 1285-1295.
61. Elsholz,A.K., Hempel,K., Michalik,S., Gronau,K., Becher,D., Hecker,M. and Gerth,U. (2011) Activity control of the
ClpC adaptor McsB in Bacillus subtilis. J. Bacteriol., 193, 3887-3893.
62. Sekowska,A., Robin,S., Daudin,J.J., Henaut,A. and Danchin,A. (2001) Extracting biological information from
DNA arrays: An unexpected link between arginine and methionine metabolism in Bacillus subtilis. Genome Biol., 2,
RESEARCH0019.
63. Zweers,J.C., Nicolas,P., Wiegert,T., van Dijl,J.M. and Denham,E.L. (2012) Definition of the sigma(W) regulon of
Bacillus subtilis in the absence of stress. PLoS One, 7, e48471.
64. Zweers,J.C., Wiegert,T. and van Dijl,J.M. (2009) Stress-responsive systems set specific limits to the overproduction
of membrane proteins in Bacillus subtilis. Appl. Environ. Microbiol., 75, 7356-7364.
65. Yu,W.B., Yin,C.Y., Zhou,Y. and Ye,B.C. (2012) Prediction of the mechanism of action of fusaricidin on Bacillus
subtilis. PLoS One, 7, e50003.
66. Krogh,A., Larsson,B., von Heijne,G. and Sonnhammer,E.L. (2001) Predicting transmembrane protein topology
with a hidden Markov model: Application to complete genomes. J. Mol. Biol., 305, 567-580.
67. Breukink,E. and de Kruijff,B. (2006) Lipid II as a target for antibiotics. Nat. Rev. Drug Discov., 5, 321-332.
68. Mascher,T., Hachmann,A.B. and Helmann,J.D. (2007) Regulatory overlap and functional redundancy among
Bacillus subtilis extracytoplasmic function sigma factors. J. Bacteriol., 189, 6919-6927.
69. Kingston,A.W., Liao,X. and Helmann,J.D. (2013) Contributions of the sigma(W) , sigma(M) and sigma(X) regulons
to the lantibiotic resistome of Bacillus subtilis. Mol. Microbiol., 90, 502-518.
70. Buescher,J.M., Liebermeister,W., Jules,M., Uhr,M., Muntel,J., Botella,E., Hessling,B., Kleijn,R.J., Le Chat,L.,
Lecointe,F., et al. (2012) Global network reorganization during dynamic adaptations of Bacillus subtilis metabolism.
Science, 335, 1099-1103.
71. Kunst,F. and Rapoport,G. (1995) Salt stress is an environmental signal affecting degradative enzyme synthesis in
Bacillus subtilis. J. Bacteriol., 177, 2403-2407.
72. Tanaka,K., Henry,C.S., Zinner,J.F., Jolivet,E., Cohoon,M.P., Xia,F., Bidnenko,V., Ehrlich,S.D., Stevens,R.L. and
Noirot,P. (2013) Building the repertoire of dispensable chromosome regions in Bacillus subtilis entails major refinement
of cognate large-scale metabolic model. Nucleic Acids Res., 41, 687-699.
73. Reilman,E., Mars,R.A., van Dijl,J.M. and Denham,E.L. (2014) The multidrug ABC transporter BmrC/BmrD of
Bacillus subtilis is regulated via a ribosome-mediated transcriptional attenuation mechanism. Nucleic Acids Res., 42,
11393-11407.
74. Maass,S., Sievers,S., Zuhlke,D., Kuzinski,J., Sappa,P.K., Muntel,J., Hessling,B., Bernhardt,J., Sietmann,R., Volker,U.,
et al. (2011) Efficient, global-scale quantification of absolute protein amounts by integration of targeted mass
spectrometry and two-dimensional gel-based proteomics. Anal. Chem., 83, 2677-2684.
75. Starcher,B. (2001) A ninhydrin-based assay to quantitate the total protein content of tissue samples. Anal. Biochem.,
292, 125-129.
76. Buttner,K., Bernhardt,J., Scharf,C., Schmid,R., Mader,U., Eymann,C., Antelmann,H., Volker,A., Volker,U. and
Hecker,M. (2001) A comprehensive two-dimensional map of cytosolic proteins of Bacillus subtilis. Electrophoresis, 22,
2908-2935.
77. Wolf,C., Hochgrafe,F., Kusch,H., Albrecht,D., Hecker,M. and Engelmann,S. (2008) Proteomic analysis of antioxidant
strategies of Staphylococcus aureus: Diverse responses to different oxidants. Proteomics, 8, 3139-3153.
78. Goosens,V.J., Mars,R.A., Akeroyd,M., Vente,A., Dreisbach,A., Denham,E.L., Kouwen,T.R., van Rij,T., Olsthoorn,M.
and van Dijl,J.M. (2013) Is proteomics a reliable tool to probe the oxidative folding of bacterial membrane proteins?
Antioxid. Redox Signal., 18, 1159-1164.
79. Even,S., Burguiere,P., Auger,S., Soutourina,O., Danchin,A. and Martin-Verstraete,I. (2006) Global control of
cysteine metabolism by CymR in Bacillus subtilis. J. Bacteriol., 188, 2184-2197.