Methodology for high-throughput production
of soluble recombinant proteins in
Escherichia coli
Katrin Markland
M.Sc.
The Swedish Centre for Bioprocess Technology
School of Biotechnology
Royal Institute of Technology
Stockholm 2006
© Katrin Markland
Stockholm, 2006
ISBN 978-91-7178-561-9
The Swedish Centre for Bioprocess Technology, CBioPT
School of Biotechnology
Royal Institute of Technology
106 91 STOCKHOLM
SWEDEN
Katrin Markland (2006): Methodology for high-throughput production of soluble recombinant
proteins in Escherichia coli. The Swedish Centre for Bioprocess Technology, CBioPT, School of
Biotechnology, Royal Institute of Technology (KTH), Stockholm, Sweden
ABSTRACT
The aim of this work was to investigate and determine central parameters that can be used to
control and increase the solubility, quality and productivity of recombinant proteins. These central
parameters should be applicable under the constraints of high-throughput protein production in
Escherichia coli.
The present investigation shows that alternative methods exist to improve solubility, quality and
productivity of the recombinant protein. The hypothesis is that by reducing the synthesis rate of the
recombinant protein, a higher quality protein should be produced. The feed rate of glucose can be
used to decrease the synthesis rate of the recombinant protein.
The influence of feed rate on solubility and proteolysis was investigated using the lacUV5-promoter
and two model proteins, Zb-MalE and Zb-MalE31. Zb-MalE31 is a mutated form of Zb-MalE that
differs from it in two amino acids. These altered amino acids greatly affect the solubility of the
protein: the soluble fraction is generally twice as high for Zb-MalE as for Zb-MalE31.
A low feed rate, compared with a high one, favours the formation of the full-length soluble protein.
Furthermore, a low feed rate decreases proteolysis. Another factor that
influences the solubility is the amount of inducer used. An increase from 100 µM to 300 µM IPTG
only results in more inclusion bodies being formed; the fraction of soluble protein remains the same.
The quality aspect of protein production was investigated for a secreted version of Zb-MalE using
two different feed rates of glucose and the maltose induced promoter PmalK. It was shown that when
the protein was secreted to the periplasm, the stringent response as well as the accumulation of
acetic acid (even for high feed rates) was reduced. The stringent response and accumulation of
acetic acid are factors that are known to affect the quality and quantity of recombinant proteins.
Transporting the protein to the periplasm results in this case in a lower burden on the cell, which
leads to fewer degradation products being formed.
Since the feed rate is a critical parameter, high-throughput production would benefit from a
variation in the feed rate. However, since the fed-batch technique is technically complicated for
small volumes, another approach is needed. E.coli strains that have been mutated to create an
internal growth limitation that simulates fed-batch were cultivated in batch and compared to
the parent strain. It was shown that the growth rate and acetic acid formation were comparable to
those of the parent strain in fed-batch. Furthermore, it was shown that a higher cell mass was
reached using one of the mutants when the cells were cultivated for as long as possible. The higher
cell mass can be used to reach a higher total productivity.
Key words:
Escherichia coli, high-throughput methodology, recombinant protein production, solubility,
productivity, quality, cultivation technology, parallel reactors
LIST OF PUBLICATIONS
This thesis is based on the following papers, which in the text will be referred to by their Roman
numerals:
I. Sandén A M, Boström M, Markland K, Larsson G (2005). Solubility and proteolysis of the
Zb-MalE and Zb-MalE31 proteins during overproduction in Escherichia coli.
Biotechnology and Bioengineering 90: 239-247
II. Boström M, Markland K, Sandén A M, Hedhammar M, Hober S, Larsson G (2005). Effect
of substrate feed rate on recombinant protein secretion, degradation and inclusion body
formation in Escherichia coli. Applied Microbiology and Biotechnology 68: 82-90
III. Markland K, Johansson E, Pedersen S, Larsson G (2006). Cell engineering of Escherichia
coli allows high cell density accumulation without fed-batch process control. Manuscript
TABLE OF CONTENTS
1. INDUSTRIAL BACKGROUND
1.1 Process technology
1.1.1 Cultivation techniques
Batch technique
Fed-batch technique
High cell density fed-batch cultivation
1.1.2 Bioreactors
Microtiter plates
Multi-parallel reactors
Conclusions on reactor systems
1.2 Cell and vector engineering
1.2.1 Glucose uptake
1.2.2 Quality and solubility
Protein folding
Stress induced proteolysis
How to improve solubility and quality
Summary
1.2.3 Specific productivity
Plasmid copy number
Promoters
Induction level
Translation initiation
Acetic acid formation
Summary
2. PRESENT INVESTIGATION
2.1 Control of solubility and proteolysis in the cytoplasm (I)
Strategy
Results
Conclusions
2.2 Control of stress and proteolysis in the periplasm (II)
Strategy
Results
Conclusions
2.3 A cellular based system for substrate uptake rate control (III)
Strategy
Results
Conclusions
3. CONCLUSIONS
4. ABBREVIATIONS
5. ACKNOWLEDGEMENTS
6. REFERENCES
Methodology for high-throughput production of soluble recombinant proteins in E.coli - Katrin Markland
1. INDUSTRIAL BACKGROUND
Escherichia coli, or E.coli, is a simple organism that is widely used as a host in protein production.
E.coli has several advantages, such as ease of use, low cost and the speed with which a protein
can be produced. These advantages are some of the reasons why pharmaceuticals such as growth
hormones, growth factors and insulin are produced in E.coli (Schmidt 2004).
The development of a new drug is a stepwise procedure, and the finalization of the HUGO-project
and the human genome sequencing has led to a paradigm shift in this chain of development. As a
result, a multitude of new proteins need to be produced in order to determine their structure and
function for further use in this development. For pharmaceutical companies it is important that the
discovery and development of new pharmaceuticals is rapid and thus that the production of e.g.
target proteins does not constitute a bottleneck.
The identification of new protein pharmaceuticals starts with protein production from a cDNA
sequence with little or no information about the protein the sequence codes for. Further, since
proteins differ fundamentally from each other in amino acid sequence, localization,
function and structure, production is still a trial-and-error effort to find the conditions best
suited to each protein.
In order to find the optimal conditions, a high-throughput production (HTPP) methodology is often
used. With this methodology, different production variables are tried simultaneously in order to
find a production hit as rapidly as possible. The larger the number of process parameters varied,
the more likely it is that the protein is produced in adequate quantities. HTPP is generally based
on standard techniques that have previously been developed in successful production processes, and
the HTPP concept serves as a pipeline where the aim is to rapidly clone, produce, purify and
validate a preferably soluble, full-length protein.
A common production system in use today is based on forceful technology, e.g. the use of strong
promoters, high plasmid copy numbers and high concentrations of inducer. In order to increase the
number of process variables, these induction systems are used in combination with a purification tag
which serves both as a purification handle and as a solubility enhancer (Berrow et al. 2006).
This has given good results in several cases, presumably where the protein is relatively simple.
However, the high metabolic load that such systems create has in many cases led to the opposite
result, i.e. a lower production. Furthermore, the production of more difficult proteins, such as
membrane proteins, is more complicated, e.g. due to the hydrophobic nature of the protein, and was
shown to result in cell death when production was induced (Miroux and Walker 1996).
Apart from production at different temperatures, few engineering principles are used in HTPP,
due to limitations in the control of small reactor volumes.
In order to produce difficult proteins, and proteins whose structure and function are as yet
unknown, there is likely a need for a larger variation in cell and vector systems as well as in
production techniques. With a larger number of variations, both in cell and vector systems and in
process parameters, the number of cultivations to be performed increases, which increases the
probability of success. To cope with the larger number of cultivations, a cultivation format is
needed where multiple cultivations can be performed at the same time. To perform as many
cultivations in parallel as possible, the cultivation volume needs to be relatively small so that
the multi-parallel format is practically feasible. However, with a small volume the more advanced
cultivation techniques, such as fed-batch, are difficult to implement due to technical difficulties
of process control.
The aim of this work was to investigate and determine central parameters that can be used to
control and increase the solubility, quality and productivity of recombinant proteins. These central
parameters should be applicable under the constraints of high-throughput production, i.e. the
multi-parallel format at restricted volume.
This implies the investigation of two central areas of technology. The first is reactor and cultivation
technology and the second area is cell and vector engineering.
1.1 Process technology
The first part of a successful high-throughput methodology is knowledge about cultivation
techniques and the multi-parallel bioreactors. These tools will be a first step on the way to a high
cell density. With a high cell density, the concentration of the recombinant protein may be
increased and this will enable further studies on the active and soluble protein.
1.1.1 Cultivation techniques
Batch technique
Screening for a new protein is often done in small scale, and the easiest cultivation method to use
in small scale is the batch technique. All necessary components (salts, trace elements and substrate)
are added in concentrations that make the reaction rate unrestricted. This means that the specific
growth rate (µ) will be at its maximum until, for example, the cells have consumed the added
substrate or exhausted the oxygen, or inhibitory metabolic by-products have formed. The cells
will at first grow exponentially, but when nutrients are depleted or inhibitory by-products are
formed, the growth rate will decrease and the cells will stop growing. The only additions to a batch
process during cultivation are air, a base or acid to control the pH and an antifoam agent if
necessary. Cell growth under ideal conditions can be described by $dX/dt = \mu X$, where $dX/dt$ is
the change in cell mass over time, $\mu$ is the specific growth rate (h⁻¹) and $X$ is the cell mass (g/L).
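The growth equation above has the closed-form solution X(t) = X0·exp(µt), which can be sketched in a few lines of Python. This is only an illustration of the ideal, unrestricted phase; the inoculum density and the value of µ used in the example are assumed, illustrative numbers and do not come from the thesis.

```python
import math


def batch_biomass(x0, mu, t):
    """Biomass X(t) in g/L for ideal batch growth.

    Closed-form solution of dX/dt = mu * X with X(0) = x0,
    valid only while growth is unrestricted.
    """
    return x0 * math.exp(mu * t)


def doubling_time(mu):
    """Time in hours for the biomass to double at specific growth rate mu (1/h)."""
    return math.log(2) / mu


if __name__ == "__main__":
    x0 = 0.05  # g/L, assumed inoculum density (illustrative only)
    mu = 0.7   # 1/h, assumed specific growth rate (illustrative only)
    for t in (0, 2, 4, 6):
        print(f"t = {t} h: X = {batch_biomass(x0, mu, t):.2f} g/L")
    print(f"doubling time = {doubling_time(mu):.2f} h")
```

In a real batch culture the exponential phase ends once substrate or oxygen is exhausted or by-products accumulate, so the sketch overestimates biomass at later times.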
Fed-batch technique
Another cultivation technique is fed-batch, and this is the only possible way to reach high cell
densities. The difference between batch and fed-batch is that a substrate feed is continuously added
to the bioreactor during the cultivation, making it possible to use different feed profiles to
control growth. A fed-batch cultivation can start as a batch cultivation, with the feed started
when the initial substrate is depleted. In industry, however, the feed is often started directly
after inoculation of the bioreactor.
The feed substrate is most often glucose in a high concentration, ≥ 600 g/L, to minimize the volume
increase. The condition for growth rate limitation is $\frac{F}{V} S_i < q_{s,\max} X(t)$, where $F$
is the feed rate of the limiting substrate (L/h), $V$ is the cultivation volume (L), $S_i$ is the
concentration of the substrate in the feed (g/L), $q_{s,\max}$ is the maximum specific substrate
consumption rate (g/(g·h)) and $X(t)$ is the biomass concentration, which changes with time (g/L).
The feed rate can be designed in multiple ways. A constant feed rate results in a decreasing growth
rate, as each cell receives less and less glucose. An exponential feed, on the other hand, results
in a constant growth rate, since the feed is increased in the same exponential pattern in which the
cells are growing. Each cell thus receives the same amount of glucose per unit time throughout the
cultivation. The fed-batch technique has two advantages over batch: the first is the reaction rate
control and the second is the metabolic control.
The reaction rate control limits the growth rate. The bacteria will increase their cell mass at the
same rate as substrate is fed into the reactor. As long as there is enough glucose and the
yield of biomass on consumed substrate, $Y_{X/S}$, is constant, the growth rate will also be constant.
When the reaction rate is controlled, problems with cooling capacity and oxygen transfer can be
avoided. Through the metabolic control, catabolite repression and sugar overflow metabolism can
be avoided. By reducing the overflow metabolism, the accumulation of the by-product acetic acid
may be reduced. However, one disadvantage of fed-batch cultivation is the need for more
technically trained personnel as well as more advanced equipment that can handle e.g. feed
profiles. The difference between batch and fed-batch is illustrated in Figure 1.
Figure 1: The difference between batch and fed-batch cultivation techniques. F = feed rate of
limiting substrate (L/h), Si = concentration of the substrate (g/L). Diagram labels: air in, gas
out, feed (F, Si).
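The limitation condition and the two feed strategies described above can be sketched as follows. The check implements the inequality (F/V)·Si < qs,max·X directly. The exponential feed profile uses a common textbook simplification, F(t) = (µ·X0·V0 / (YX/S·Si))·exp(µt), which neglects maintenance and assumes a constant yield; this formula and all numerical values except the 600 g/L feed concentration are assumptions for illustration and are not taken from the thesis.

```python
import math


def is_growth_limited(feed_rate, volume, s_feed, qs_max, biomass):
    """Return True if (F/V) * Si < qs_max * X, i.e. the feed limits growth.

    feed_rate in L/h, volume in L, s_feed in g/L,
    qs_max in g/(g h), biomass in g/L.
    """
    supply = feed_rate / volume * s_feed  # g substrate per (L h)
    max_uptake = qs_max * biomass         # g substrate per (L h)
    return supply < max_uptake


def exponential_feed(t, mu_set, x0, v0, yield_xs, s_feed):
    """Feed rate (L/h) at time t that keeps the growth rate at mu_set.

    Assumed textbook simplification: constant yield, no maintenance.
    """
    f0 = mu_set * x0 * v0 / (yield_xs * s_feed)
    return f0 * math.exp(mu_set * t)


if __name__ == "__main__":
    # Illustrative values only; qs_max, biomass and yield are assumed.
    print(is_growth_limited(0.01, 1.0, 600.0, 1.5, 10.0))  # low feed rate
    print(is_growth_limited(0.05, 1.0, 600.0, 1.5, 10.0))  # high feed rate
    for t in (0, 2, 4):
        f = exponential_feed(t, 0.2, 10.0, 1.0, 0.5, 600.0)
        print(f"t = {t} h: F = {f * 1000:.1f} mL/h")
```

With these assumed numbers the low feed rate supplies 6 g/(L·h) against a maximum uptake of 15 g/(L·h), so growth is feed-limited, whereas the high feed rate supplies 30 g/(L·h) and is not limiting.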
High cell density fed-batch cultivation
Strategies to improve cultivations have developed gradually since the 1920s, when two patent
applications were submitted in Denmark and Germany. It had been discovered that high initial
substrate and nutrient concentrations could inhibit growth, so that these components could not be
added all at once.
Metabolic control
In order to avoid the growth inhibition caused by high substrate and nutrient concentrations and
also to control the metabolism, medium components are added gradually in a feed. When all
glucose is added at the same time, the cell directs some of the glucose to the formation of acetic
acid. Furthermore, if other medium components such as salts are added in too high concentrations,
growth is also inhibited (Risenberg 1991). On the other hand, nutrient limitation can affect protein
yields, plasmid copy number and plasmid stability. The plasmid copy number increased from 50 to
400 during phosphate limitation (Horn et al. 1990), and this resulted in enhanced plasmid
instability (Jones and Melling 1984). However, it has also been shown that the protein yield
increased two-fold when cells were grown on glucose after the depletion of phosphate (Jensen and
Carlsen 1990). A balance between too high and too low nutrient levels needs to be found to reach
the best conditions for the cells.
Limiting factors
In high cell density cultivation the oxygen demand increases with the growth rate: a higher growth
rate requires more oxygen than a lower one. Different methods are used to increase the
oxygen supply, for example increasing the aeration rate, using O2-enriched air or pure oxygen,
decreasing the temperature or cultivating under pressure (Risenberg 1991). If only the supply of
oxygen is considered, the theoretical maximum cell density is about 350 g/L dry cell weight with
pure oxygen (Risenberg 1991). However, this is not possible in practice, both because the high
viscosity makes the culture difficult to stir and because of the volume occupied by the cells
themselves. Another calculation considers the void volume between the cells and the density of the
cells. In 1 liter of culture the maximum wet weight is 800 g of E.coli cells. Since the dry weight
is approximately 20-25% of the wet weight, this corresponds to about 200 g/L dry weight (Märkl et
al. 1993). This is consistent with the maximum cell density reached in practice, 190 g/L (Fuchs et
al. 2002; Nakano et al. 1997).
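The packing-based estimate above is simple arithmetic and can be reproduced as a short sketch. The wet weight (800 g/L), the dry weight fraction (20-25%) and the reported 190 g/L are taken from the text; the function name is hypothetical.

```python
def max_dry_weight(wet_weight_g_per_l, dry_fraction):
    """Dry cell weight (g/L) for a given wet weight and dry weight fraction."""
    return wet_weight_g_per_l * dry_fraction


if __name__ == "__main__":
    wet = 800.0       # g/L, maximum wet weight stated in the text
    reported = 190.0  # g/L, highest cell density reported in the text
    for fraction in (0.20, 0.25):
        limit = max_dry_weight(wet, fraction)
        print(f"dry fraction {fraction:.2f}: limit = {limit:.0f} g/L, "
              f"reported/limit = {reported / limit:.2f}")
```

The upper end of the dry weight fraction gives the quoted limit of about 200 g/L, while the lower end gives 160 g/L, so the estimate is sensitive to the assumed dry weight fraction.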
1.1.2 Bioreactors
Several types of bioreactors can be used for high-throughput production. Important
aspects that need to be considered for all bioreactors are sterility and the possibility to control
and regulate the cultivation. Parameters that need to be controlled and regulated are, for example,
temperature, pH and oxygen tension. Furthermore, the screening of a protein is often done in a
parallel format where several constructs are tried at the same time, so a reactor in which
parallel cultivations can be performed is preferable.
Microtiter plates
A microtiter plate offers the possibility to perform many identical and parallel reactions in a very
small scale. This is the reason why microtiter plates have been used as miniature bioreactors for
cell cultivation. The cultivation volume is 25 µL to 5 mL. The plates are most often made of plastic
and the number of wells can be 6, 12, 24, 96, 384 and even as many as 1536 or 3456 (Mere et al.
1999). The wells can have a rectangular or a cylindrical shape and can be deep or shallow. Oxygen
transfer and mixing are accomplished by orbital shaking of the entire plate, and rectangular wells
enhance mixing and oxygen transfer through their baffle-like function (Duetz et al. 2000). The
plate is placed on a heated block that at the same time controls the temperature (Betts and Baganz
2006). The oxygen transfer rate can to a certain extent be increased by an increased shaking speed.
With a modified shaker, a maximum KLa of 175 h⁻¹ has been reached. Beyond that, higher
shaking speeds only cause medium to spill into other wells (Hermann et al. 2002). A
solution is to have a cover on the plate to minimize the contamination risk between wells. This
cover must at the same time allow the wells to be aerated through contact with air. To accomplish
this, a membrane can be used (Duetz et al. 2000). The membrane also helps to
avoid medium evaporation, which otherwise may be a problem during slow-growing cultivations
(Kumar et al. 2004). Other reports have shown that KLa values normally are in the range of 100-130
h⁻¹ for microtiter plates (Kumar et al. 2004). These relatively low values are due to the fact that
microtiter plates are shaken systems where surface aeration is the sole source of oxygen.
One difficulty with microtiter plates has been pH and DOT measurement and control. In
recent years, plates with integrated fluorescent sensors for both DOT and pH have been developed.
These plates can be used with a fluorescence reader equipped with a shaking device (John et al.
2003a; John et al. 2003b). Since it is difficult to control the pH, a medium with high buffer
capacity is required.
Multi-parallel reactors
Multi-parallel reactors are similar to conventional fermenters in that temperature, pH and dissolved
oxygen can be monitored and maintained at desired levels. They have a greater potential for
monitoring and control than for example microtiter plates. The reactors are usually made of
borosilicate glass, different types of plastics or stainless steel (Betts and Baganz 2006; Kumar et al.
2004).
A number of research groups are developing different reactor prototypes in a multi-parallel format.
Professor Weuster-Botz at the Technical University at München has developed a set consisting of
48 reactors (volume 8-15 mL) where pH, DOT and OD can be measured, pH can be controlled and
fed-batch cultivation is possible. These reactors have a maximum KLa of 700 h⁻¹, and cell dry
weights of 20 g/L have been achieved (Puskeiler et al. 2005; Weuster-Botz et al. 2005). This
system seems to be highly efficient due to both the cultivation control and the high KLa values
reached. Another system that has been developed is a set of miniaturized bubble columns. By using
a 48-well microplate with porous frits in the bottom of each well to allow an upward flow of gas, a
microplate with 48 bubble columns, each 2 mL in volume, was created. The maximum KLa is 220 h⁻¹
and the dry cell weight has reached 10 g/L in this bubble column (Doig et al. 2005).
A comparison of a set of commercial parallel fermentation systems can be seen in Table 1. The list
is not intended to be complete but shows some examples. Some of these systems are equipped to
handle fed-batch cultivations but then require a larger number of tubes and wires for additions and
measurements. The working volume can be as low as 35 mL in the Cellstation from Fluorometrix
(Massachusetts, US) and up to 1000 mL in the Greta (Figure 2) from Belach Bioteknik (Stockholm,
Sweden) (Hedrén et al. 2006).
System (manufacturer)       Volume (mL)  Stirrer               Measured     Controlled  Cultivation technology  KLa (h⁻¹)  Number of units
Cellstation (Fluorometrix)  ≥ 35         Dual paddle impeller  pH, DOT, OD  pH, DOT     Batch                   NR         12
Fed-batch pro (Dasgip)      50-500       Orbital shaker        pH, DOT      pH, DOT     Batch, Fed-batch        1100       16
Sixfors (Infors)            200-500      Impeller              pH, DOT      pH, DOT     Batch, Fed-batch        300        6
Greta (Belach)              200-1000     Impeller              pH, DOT, OD  pH, DOT     Batch, Fed-batch        800        6-24
Table 1: Overview of some commercial multi-parallel bioreactors. NR = not reported. KLa values were
obtained from the companies (personal communication)
One drawback of these systems is the relatively high cost compared to microtiter plates. But if
high biomass is required, these reactors have the advantage of higher KLa values. Since the working
volume is relatively large, samples can be taken and more information can be gained during
cultivation. However, this is generally more important in process development than in
high-throughput production.
Figure 2: The multi-parallel fermentation system Greta
manufactured by Belach Bioteknik, Sweden
Conclusions on reactor systems
For an efficient high-throughput system, i.e. a system where many parallel cultivations are rapidly
processed at the same time, the microtiter plate is the reactor of choice due to its ability to run
many parallel cultivations. However, the only cultivation technique possible in microtiter plates
is the batch technique. This system is best suited for high-yield proteins, i.e. proteins produced
to a high percentage of the total protein content. For other proteins, bioreactors are preferred:
first for their larger cultivation volumes, second for the possibility to use fed-batch cultivation
and third for their higher KLa values, which enable a higher biomass and hence a higher total
productivity.
1.2 Cell and vector engineering
The goal in high-throughput protein production is to achieve a high total productivity of the
recombinant protein. The total productivity depends on the cultivation volume, the cell density and
the specific productivity, i.e. $Q = V \cdot X \cdot q_p$. To increase the total productivity, a
higher cell density, a larger cultivation volume or a higher specific productivity is required. But
the total productivity is not the only important parameter. Since the recombinant protein is
intended for structure and function determinations, it also needs to be active and full-length.
Finally, since it is also important that the production of new proteins is rapid, a soluble protein
is preferred.
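The relation Q = V·X·qp can be illustrated with a small sketch that compares two cultivation formats. The units assumed here (V in L, X in g/L, qp in g product per g cells per hour, giving Q in g/h) and all numerical values are illustrative assumptions and are not taken from the thesis.

```python
def total_productivity(volume_l, biomass_g_per_l, qp_g_per_g_h):
    """Total productivity Q = V * X * qp, here in g product per hour."""
    return volume_l * biomass_g_per_l * qp_g_per_g_h


if __name__ == "__main__":
    # Hypothetical cases: (volume in L, cell density in g/L, qp in g/(g h)).
    cases = {
        "microtiter well, batch": (0.001, 2.0, 0.01),
        "parallel reactor, fed-batch": (0.5, 20.0, 0.01),
    }
    for name, (v, x, qp) in cases.items():
        q = total_productivity(v, x, qp)
        print(f"{name}: Q = {q:.5f} g/h")
```

Because the three factors multiply, a tenfold gain in cell density is worth as much as a tenfold gain in specific productivity, which is why both cultivation technique and cell and vector engineering are considered.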
The cell is influenced by many different factors or stimuli. These can be external, due to changing
environmental conditions such as altered pH or nutrient depletion. They can also be internal, for
example when the recombinant protein production is induced. In order to produce a
soluble, full-length and active protein with a high total productivity, the external and internal
stimuli need to be controlled so that the cell machinery functions properly and the quality,
solubility and productivity of the recombinant protein can be maintained.
1.2.1 Glucose uptake
The cell uses glucose as a carbon source for cell growth, product formation, respiration and also
by-product formation. In order to utilize the glucose, it needs to be transported from the medium
into the cell. The proteins OmpC and OmpF transport glucose into the periplasm at high
glucose concentrations (Nikaido and Vaara 1985), whereas the protein LamB is responsible during
glucose limitation (Death et al. 1993). When the glucose is present in the periplasm, it is actively
transported into the cytoplasm by the phosphoenolpyruvate:phosphotransferase system (PTS). The PTS
transports and phosphorylates several sugars. The PTS consists of Enzyme I (EI) and a
phosphohistidine carrier protein (HPr), which are both soluble and non-sugar specific. These
proteins transfer a phosphate group from phosphoenolpyruvate to the sugar specific enzymes IIA
and IIB. Furthermore, an integral membrane protein (IIC and sometimes IID) transports the sugars
across the cytoplasmic membrane, where the sugar is phosphorylated by enzyme IIB. Glucose is
transported by the specific glucose enzyme, IIGlc, as well as by the enzyme for mannose, IIMan
(Curtis and Epstein 1975). The specific glucose enzyme, IIGlc, consists of the soluble enzyme IIAGlc
and the membrane permease IICBGlc. These two proteins are encoded by crr and ptsG respectively.
The enzyme IIMan also consists of two parts, IIABMan and IICDMan. IIABMan is a homodimeric enzyme
encoded by the gene manX and IICDMan is a membrane permease encoded by manYZ.
The glucose- and mannose-PTS systems are known from the literature to be regulated by
a repressor protein called Mlc (Kimata et al. 1998; Plumbridge 1998). This
protein binds to the unphosphorylated form of the enzyme IIBGlc when glucose is present in the
medium resulting in the de-repression of for example the genes ptsHI, ptsG and manXYZ (Kimata et
al. 1998; Plumbridge 1998; Plumbridge 1999). During conditions without glucose in the medium,
the enzyme IIBGlc~P is unable to bind the Mlc protein, which results in the repression of the above
mentioned genes (Plumbridge 2002). Furthermore, it has also been shown that the transcription of
ptsG is under the control of cAMP-CRP (Kimata et al. 1997).
1.2.2 Quality and solubility
When producing proteins, quality and solubility are two important factors. In this thesis, high
quality is defined as the production of a full-length protein with the correct amino acid
sequence that is correctly folded and active. The solubility is important for the
time aspect of the production process. The purification of the protein will be more rapid if the
protein is soluble than if inclusion bodies are formed and need to be re-solubilized and
reactivated. A relatively new method to monitor the solubility of a recombinant protein in the
cytoplasm is to fuse it to the green fluorescent protein (GFP). When GFP is properly folded it is
fluorescent and can be used as a solubility marker. This means that when the recombinant protein is
correctly folded, GFP is fluorescent, whereas it is not fluorescent if the recombinant protein has
folded incorrectly (Waldo et al. 1999). However, not all recombinant proteins can be produced in
the cytoplasm. These proteins can benefit from being secreted to the periplasm, where the
environment is completely different.
The protein quality is affected by the amino acid sequence, which can be influenced by the cDNA
quality or by errors in the transcription or translation processes. In the transcription process,
RNA polymerase can sometimes select and insert the wrong ribonucleotide. This can result in the
incorporation of an incorrect amino acid during translation due to the faulty codon. In translation,
missense errors can occur, i.e. the substitution of one amino acid for another. This results
either from a tRNA being charged with the wrong amino acid or from an anticodon-codon mismatch on
the ribosome (Parker 1989). Other errors include the premature termination of translation because
the ribosome is stalled, or the addition of extra amino acids because the termination codon is
not recognized (Parker 1989). Frameshift errors can also occur. If the reading frame shifts, a
termination codon may be formed before the intended termination codon, resulting in a truncated
protein (Parker 1989).
Errors can also arise from post-translational modifications. One example is the production of human
growth hormone. The correct form is 191 amino acids long and the protein contains two intramolecular
disulphide bridges. The hormone can be altered to exist in a methylated form where one or several
amino groups have been lost. It can also receive one or two additional oxygen atoms, giving an
oxidized form (Gellerfors et al. 1990).
The average frequency of translation errors is 10⁻⁴ per codon, which means that for every 10000
codons translated, one error occurs (Parker 1989). For example, the misreading of arginine codons
resulted in incorporation of cysteine in flagellin, the subunit of bacterial flagella (Edelmann and
Gallant 1977). Another report showed that norleucine was incorporated instead of methionine (Tsai
et al. 1988). As much as 19 % of the methionine was replaced by norleucine when interleukin-2 was
overexpressed in minimal medium, whereas the level of norleucine was only 3 % in rich media.
During starvation for certain amino acids, the translational errors increase to the extent that they
are visible on a two-dimensional polyacrylamide gel. After 25 minutes of starvation for histidine,
the misincorporation of amino acids leads to an altered charge of the protein. This charge shift
leads to the formation of
several spots on a 2D-PAGE, with the same mobility in the SDS-PAGE but differing in the
isoelectric focusing (Parker et al. 1978).
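The practical meaning of an error frequency of one per 10000 codons can be illustrated with a short calculation. The sketch below assumes, as a simplification that is not made in the cited work, that errors occur independently and with the same probability at every codon; the function name and the example chain lengths are illustrative only.

```python
def error_free_fraction(n_codons: int, error_rate: float = 1e-4) -> float:
    """Expected fraction of translated chains without any missense error.

    Assumption: errors are independent and equally likely at every codon.
    Real error frequencies vary between codons and growth conditions.
    """
    if n_codons < 0:
        raise ValueError("n_codons must be non-negative")
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie between 0 and 1")
    return (1.0 - error_rate) ** n_codons


# Example chain lengths (191 is the length of human growth hormone).
for length in (191, 400, 1000):
    print(length, round(error_free_fraction(length), 3))
```

Under these assumptions about 98 % of the chains of a 191-residue protein are error-free, whereas for a 1000-residue protein the figure drops to about 90 %, so the impact of a fixed per-codon error frequency grows with protein length.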
The errors are often due to the fact that the codons used in a foreign gene are not the same as the
codons used by the host to synthesize its own highly expressed proteins. Therefore the rarely used
amino acids will be depleted and the ribosome will stall while waiting for the correct amino acid.
This increases the probability that an incorrect amino acid will be inserted instead of the limiting
amino acid (Glick 1995).
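A first check for this kind of codon mismatch is to scan the coding sequence of the foreign gene for codons that the host uses rarely. The sketch below is a minimal illustration; the set of rare codons is an assumption based on codons commonly cited as rare in E.coli and should be replaced by a codon usage table for the actual host strain. The function name and the example sequence are hypothetical.

```python
# Assumed set of rare E. coli codons, for illustration only.
RARE_CODONS_ASSUMED = frozenset({"AGA", "AGG", "ATA", "CTA", "CGA", "CCC"})


def find_rare_codons(cds: str, rare=RARE_CODONS_ASSUMED):
    """Return (codon_index, codon) for each rare codon in an in-frame sequence."""
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA notation
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    hits = []
    for start in range(0, len(seq), 3):
        codon = seq[start:start + 3]
        if codon in rare:
            hits.append((start // 3, codon))
    return hits


# Hypothetical seven-codon sequence: ATG AGA GGT CTA AAA CCC TAA
print(find_rare_codons("ATGAGAGGTCTAAAACCCTAA"))
```

A gene with many hits, or with several hits clustered together, would be a candidate for codon optimisation or for a host strain that supplies additional tRNAs.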
Protein folding
During the last 20 years, much work has been done in the area of protein folding to understand the
mechanisms behind folding and misfolding. An important group of proteins in this process are the
chaperones. They bind to exposed hydrophobic regions on proteins, thereby preventing the proteins
from aggregating into insoluble and nonfunctional inclusion bodies and allowing them instead to
reach their native state (Wickner et al. 1999). Three classes of chaperones are important: folding
chaperones (DnaK, DnaJ, GrpE, GroEL, GroES), holding chaperones (IbpB, Hsp31, Hsp33) and
disaggregating chaperones (ClpB) (Baneyx 2004).
Other important groups of molecules are foldases and proteases. Rate-limiting steps in the folding
pathway are catalyzed by foldases, for example peptidyl-prolyl cis/trans isomerases (PPIases), which
accelerate the trans to cis isomerization of peptidyl-prolyl bonds, and thiol-disulfide
oxidoreductases (Dsbs), which are involved in the formation of disulfide bridges (Baneyx 2004). If a
protein is too damaged to be rescued by any other pathway, it will be degraded by proteases (Baneyx
2004).
When a cytoplasmic nascent polypeptide exits the ribosome, it either folds properly on its own or
needs assistance to fold correctly. If aid is required, the protein can interact either with trigger
factor (TF) or with the chaperone DnaK and its co-chaperone DnaJ (Deuerling et al. 2003). Once the
protein is released from the chaperones by a GrpE-catalyzed ADP/ATP exchange, three possibilities
exist. First, the protein may be in its native form. Second, it may interact with DnaK again until it
folds. Third, it can be transferred to the GroEL-GroES system (Baneyx 1999; Baneyx and Palumbo
2003). During environmental stress, native proteins may be unfolded, and the holding chaperones
(IbpA/B, Hsp31 and Hsp33) then stabilize the protein until DnaK and GroEL can once again fold it
(Veinger et al. 1998; Baneyx and Mujacic 2004). During severe or prolonged stress, the holding
chaperones are unable to stabilize the protein and it will aggregate until the disaggregating
chaperone (ClpB) can solubilize it (Baneyx and Mujacic 2004). Sometimes it is not possible to
solubilize the protein, and the aggregates will remain insoluble as inclusion bodies. See Figure 3
for the folding pathway.
When inclusion bodies do form, they are highly stable and resistant to proteases in vivo (Fahnert et
al. 2004). However, inclusion body formation results from an imbalance between protein
precipitation and cell-mediated refolding and solubilisation, and as a result the inclusion bodies
may be dissolved (Mar Carrio and Villaverde 2001). Proteins that are toxic, unstable or easy to
refold may benefit from being produced as inclusion bodies. But since there is no guarantee that
protein recovered from inclusion bodies will regain activity or that refolding will give a high
yield (Baneyx 1999), production as inclusion bodies is not optimal. For every protein, the specific
refolding conditions have to be optimized with respect to buffer composition, temperature and
protein concentration (Lilie et al. 1998). Furthermore, it is more expensive and time-consuming to
purify and refold inclusion bodies than to purify soluble proteins (Sorensen and Mortensen 2005).
The production of inclusion bodies is thus not an attractive system for high-throughput production,
owing to the additional time these processes require.
Figure 3: A nascent polypeptide is first bound to TF or DnaK-DnaJ. When released, the folding
intermediate may reach a native conformation or it will be transferred to GroEL-GroES for help in
the folding process. Native proteins may also be stabilized by holding chaperones (IbpA/B, Hsp31,
Hsp33) until they either are folded or aggregate. When the stress is reduced, the disaggregating
chaperones refold the aggregated proteins.1
Stress induced proteolysis
In order to obtain an active protein, it is vital that the protein is not degraded by proteases.
During overexpression of proteins, several stress response systems are activated by misfolded
proteins. When activated, these systems produce, among other proteins, proteases that may degrade
the desired protein. The heat-shock response is induced by overproduction of proteins in the
cytoplasm (Gross 1996), and the envelope stress response is induced when misfolded proteins are
detected in the periplasm (Mecsas et al. 1993). It has also been shown that the stringent response
induces expression of heat-shock proteins (Grossman et al. 1985).
The heat-shock response
The heat-shock response is induced by a variety of stress conditions, for example temperature
shifts, harmful substances such as ethanol and the antibiotics nalidixic acid and trimethoprim
(Booth 1999; Laskovska et al. 2003), and unfolded proteins (Allen et al. 1992; Gross 1996). When the
heat-shock response is induced, nearly 50 heat-shock proteins (HSPs) are synthesized. Many of the
HSPs are chaperones and proteases. It is important that these chaperones and proteases have a high
affinity for the appropriate proteins and process them at a high rate. The immediate stress is
followed by an adaptation period, after which the HSP synthesis rates have decreased to their normal
levels (Arsène et al. 2000).
1
Reprinted by permission from Macmillan Publishers Ltd: [Nature Biotechnology], (Francois Baneyx and Mirna
Mujacic, 2004, Recombinant protein folding and misfolding in Escherichia coli, Nature Biotechnology, Vol. 22, No 11,
p 1399-1408), Copyright (2004).
The heat-shock response is controlled at the transcriptional level by the rpoH gene product, σ32
(Straus et al. 1987). When the σ32 levels increase, the heat-shock response is induced (Arsène et
al. 2000). The σ32 level is itself regulated by several mechanisms. The first acts at the level of
rpoH transcription. Four promoters, three σ70-dependent (P1, P4, P5) and one σE-dependent (P3),
transcribe the rpoH gene (Fig 4). The use of each promoter seems to be temperature-dependent
(Arsène et al. 2000). The second mechanism affects translation of rpoH mRNA (Fig 4). At normal
growth temperatures, cis-acting mRNA sites within the 5’ region of the rpoH messenger form secondary
structures that block the ribosome-binding site. These secondary structures are temperature
sensitive. At elevated temperatures, the structures melt, thus enabling more efficient translation
of the rpoH mRNA (Moat et al. 2002).
In Figure 4, the regulatory mechanism of the heat-shock response is illustrated. σ32 binds to
RNA-polymerase and starts the production of heat-shock proteins, for example DnaK, DnaJ and GrpE.
These proteins in turn regulate the heat-shock response. Under non-stressful conditions, when the
amount of misfolded or unfolded proteins in the cell is low, the chaperones bind to σ32. This
binding has a negative effect on the stability and translation of σ32. During temperature up-shifts,
the amount of misfolded proteins increases, thereby titrating the chaperones away from σ32. The
sigma factor is then free to start the production of HSPs. When the stress has decreased and the
amount of misfolded proteins is lower, the chaperones once again bind σ32 and repress its activity
(Gross 1996).
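The titration logic of this feedback loop can be illustrated with a deliberately crude toy model. It assumes, purely for illustration, that one chaperone unit binds either one misfolded protein or one σ32 molecule and that misfolded proteins are always bound first. It ignores the effects on σ32 stability and translation, and the function name and arbitrary units are not taken from the cited work.

```python
def free_sigma32(total_sigma32: float, chaperones: float, misfolded: float) -> float:
    """Amount of sigma32 not sequestered by chaperones (arbitrary units).

    Toy assumption: 1:1 binding, and misfolded proteins outcompete sigma32
    for the available chaperones.
    """
    if min(total_sigma32, chaperones, misfolded) < 0:
        raise ValueError("amounts must be non-negative")
    # Chaperones left over after the misfolded proteins have been bound.
    chaperones_left = max(0.0, chaperones - misfolded)
    # Those remaining chaperones sequester sigma32.
    bound_sigma32 = min(total_sigma32, chaperones_left)
    return total_sigma32 - bound_sigma32


# Increasing load of misfolded protein at fixed sigma32 and chaperone levels.
for misfolded in (0, 5, 8, 10, 20):
    print(misfolded, free_sigma32(total_sigma32=5, chaperones=10, misfolded=misfolded))
```

In this sketch σ32 stays fully sequestered until the misfolded protein load exceeds the spare chaperone capacity, after which free σ32 rises and would switch on HSP synthesis. This is the qualitative behaviour described above, without any claim to quantitative realism.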
Figure 4: The regulatory mechanism of the heat-shock response. The chaperones DnaK, DnaJ and GrpE
bind to σ32 when there are no proteins to fold. During heat-shock the level of misfolded proteins
increases, forcing the chaperones to release σ32 and to help other proteins in their folding
process. σ32 starts the production of HSPs. When the heat-shock is released, the chaperones have no
misfolded proteins to bind; hence they will bind σ32, thereby repressing it. Adapted from (Arsène et
al. 2000; Gross 1996)
[Diagram labels: rpoH promoters P1, P3, P4, P5 (σ70, σE); rpoH mRNA; translation, stability and
activity of σ32; Eσ32 at the promoters of Hsp genes; DnaK, DnaJ, GrpE (50 proteins).]
Some of these 50 heat-shock proteins belong to the σ32 regulon and function as chaperones (DnaK,
DnaJ, ClpB, GroEL, GroES), proteases (Lon, ClpP, ClpX) or nucleotide exchange factors (GrpE). Other
heat-shock proteins belong to the σE regulon, for example DegP, which can function both as a
chaperone and as a protease.
The envelope stress response
The cytoplasm and the periplasm in E.coli are two different environments. The periplasmic space is
more viscous, is devoid of ATP and has oxidizing conditions that favour disulfide bond formation. It
is therefore not surprising that there are specific systems to cope with stress in the periplasm.
E.coli has two different stress response mechanisms in the periplasm, called the σE- and the
Cpx-pathway.
The σE-pathway is induced by heat, ethanol and overexpression of outer membrane proteins
(Grossman et al. 1984; Mecsas et al. 1993; Rouvière et al. 1995). The pathway was discovered
when the gene degP was studied. This gene codes for a periplasmic protease that is essential for
survival at temperatures over 42°C (Lipinska et al. 1989). In response to heat-shock the level of
DegP increases, but this increase does not depend on σ32, the sigma factor for the cytoplasmic
heat-shock response. Instead the degP gene is transcribed by another sigma factor, σE, which gives
rise to a new heat-stable form of RNA-polymerase (Erickson and Gross 1989; Wang and Kaguni 1989).
The operon that codes for σE also codes for the proteins RseA, RseB and RseC, which are the key
players in the σE-pathway (De Las Penas et al. 1997; Missiakas et al. 1997). RseA is a
membrane-spanning protein. The N-terminus of RseA is localized in the cytoplasm, where it interacts
with σE, and the C-terminus faces the periplasm and interacts with RseB, a periplasmic protein (De
Las Penas et al. 1997; Missiakas et al. 1997). As shown in Figure 5, during normal growth conditions
envelope proteins are folded and correctly localized and the negative regulators RseA and RseB
interact with σE (De Las Penas et al. 1997; Missiakas et al. 1997). During stressful conditions,
however, RseA is degraded by DegS, an inner membrane serine-protease, in the periplasm and by YaeL,
a metallo-protease, in the cytoplasm, thus releasing σE, which is then free to bind to
RNA-polymerase (Alba and Gross 2004; Alba et al. 2002; Kanehara et al. 2002). This starts the
transcription of folding factors (SurA, FkpA), chaperones (Skp) and proteases (DegP)
(Dartigalongue et al. 2001).
Figure 5: The σE pathway. During normal growth conditions the negative regulators RseA and RseB
interact with σE. Under stress from heat, ethanol or overexpressed OMPs, RseA is degraded by DegS on
the periplasmic side and by YaeL on the cytoplasmic side of the membrane, thus releasing σE, which
is free to bind to RNA-polymerase and start the transcription of folding factors (SurA, FkpA),
chaperones (Skp) and proteases (DegP). Modified from (Raivio and Silhavy 2001)
[Diagram labels: normal and stressful conditions; EtOH, heat, OMPs; outer membrane (OM) with β-barrel
proteins; misfolded envelope proteins; RseB; RseA (N- and C-terminus) in the inner membrane (IM);
DegS (1) and YaeL (2); σE; RNAP; genes surA, degP, rpoH, rpoE, rseA, rseB, rseC.]
The second envelope stress response is a two-component system consisting of CpxA and CpxR. It
is induced by alkaline pH (Danese and Silhavy 1998), misfolded P pilin subunits (Jones et al.
1997), overproduction of a rare outer membrane lipoprotein, NlpE (Snyder et al. 1995), and major
alterations in membrane phospholipid composition (Mileykovskaya and Dowhan 1996). CpxA is an
integral membrane protein with both a cytoplasmic and a periplasmic domain and functions as a
sensor protein (Weber and Silverman 1988). CpxR, on the other hand, functions as a cytoplasmic
response regulator (Dong et al. 1993).
During non-stressful conditions, the sensing domain of CpxA interacts with a small protein called
CpxP, which results in the inactivation of the Cpx-system, as shown in Figure 6 (Raivio et al.
1999). When the Cpx-pathway is induced, CpxP is removed from CpxA, which in turn is
autophosphorylated. CpxR is then activated by phosphorylation when the phosphate is transferred from
CpxA to CpxR (Raivio et al. 1999). The phosphorylated form of CpxR is then free to bind upstream
of multiple promoters, thereby activating transcription of folding factors, chaperones etc. (Raivio
and Silhavy 2001).
Figure 6: The Cpx-pathway. During non-stressful conditions, CpxA interacts with CpxP which results in the
inactivation of the Cpx-system. When the Cpx-pathway is induced by an alkaline pH, misfolded P pilin subunits or
overproduction of NlpE, CpxP is titrated away from CpxA. CpxA is autophosphorylated and CpxR is activated by
phosphorylation when the phosphate is transferred from CpxA to CpxR. The phosphorylated form of CpxR is then free to
bind upstream of multiple promoters thereby activating transcription of folding factors, chaperones etc. (Raivio and
Silhavy 2001)2
[Diagram labels: outer membrane, periplasm, inner membrane; folded and misfolded protein; CpxP; stress;
CpxA autophosphorylation (ATP, ADP); CpxA kinase and phosphatase activity; CpxR and phosphorylated CpxR; Pi;
genes degP, dsbA, ppiA, cpxP, cpxR, pap, pili.]
2
Reprinted, with permission, from the Annual Review of Microbiology, Volume 55 © 2001 by Annual Reviews,
www.annualreviews.org
The stringent response
The stringent response has been shown to trigger the induction of certain heat-shock proteins such
as DnaK, GroEL and Lon (Grossman et al. 1985). Lon is a protease with broad specificity and
could increase the proteolysis of the recombinant protein. One hypothesis is that during
overproduction of proteins, certain amino acids may be depleted, leading to amino acid limitation,
which induces the stringent response. The stringent response adjusts the cells’ macromolecular
composition to the actual growth rate. During slow growth, the cells are small and have fewer
ribosomes and less DNA, RNA, protein, phospholipid and cell wall material than cells with a higher
growth rate, which are larger and have more ribosomes (Lengeler and Postma 1999). When the
cells suffer from carbon starvation or from a lack of charged tRNAs, protein synthesis is
blocked by stalled ribosomes. In response, the cell produces an alarmone called guanosine
tetraphosphate (ppGpp). The alarmone ppGpp affects DNA replication by inhibiting its initiation.
Translation is affected by the inhibition of tRNA synthesis, rRNA synthesis and protein chain
elongation. On the protein level, ppGpp is responsible for the degradation of ribosomal proteins as
well as the production of universal stress proteins. Furthermore, amino acid biosynthesis and the
induction of the general stress response regulated by σS are stimulated by ppGpp, as can be seen in
Figure 7 (Cashel et al. 1996; Neidhardt et al. 1990).
Figure 7: The effects of stringent response. Both positive and negative effects exist. Negative effects are inhibition of
stable RNA, ribosomal proteins, elongation factors, fatty acid synthesis, lipid synthesis and DNA replication. Positive
effects are induction of universal stress proteins, amino acid biosynthesis and induction of the general stress response
controlled by rpoS, which activates many survival genes.3
[Diagram labels: GTP + ATP; RelA (L11); SpoT; lack of charged tRNA; stress and starvation; pppGpp; GDP + PPi;
ppGpp; RNAP; stable RNA; ribosomal proteins; elongation factors; fatty acid, lipids, cell wall; glycolysis;
DNA replication; rpoS, σS; stasis, oxidative stress and osmotic stress survival genes; universal stress proteins;
amino acid biosynthesis; proteolysis.]
3
Reprinted from Trends in Microbiology, Vol.13 No. 5, Magnusson, Farewell and Nyström, ppGpp: a global regulator
in Escherichia coli, 236-242., Copyright (2005), with permission from Elsevier
Mechanisms of the stringent response
There are two different mechanisms by which a cell can produce the alarmone ppGpp. The first is
the RelA-dependent pathway and the second is the SpoT-dependent pathway, as can also be seen in
Figure 7 (Magnusson et al. 2005). Both pathways use GTP and ATP to form first pppGpp and then ppGpp.
When the RelA pathway is induced during the stringent response, a pool of uncharged tRNAs has been
generated. An uncharged tRNA binds to the ribosomal A-site and stalls the ribosome (Fig 8a);
RelA then detects the stalled ribosome and binds to it (Fig 8b). RelA subsequently converts ATP and
GTP to (p)ppGpp and AMP. RelA, but not the uncharged tRNA in the ribosomal A-site, is then
released (Fig 8c). Once released from the ribosome, RelA hops to the next stalled ribosome to
repeat the ppGpp synthesis (Fig 8d). Once the stress is lowered, a charged tRNA replaces the
uncharged tRNA on the ribosome owing to the higher affinity of the charged tRNA, as can be seen in
Figure 8e (Wendrich et al. 2002).
Figure 8: The mechanism of the RelA pathway.4 (a) Amino acid starvation. (b) RelA detects blocked
ribosome. (c) (p)ppGpp synthesis releases RelA. (d) RelA hops to the next blocked ribosome. (e) Post
stress: aminoacyl-tRNA has higher affinity for A site.
[Diagram labels: ribosomal E, P and A sites; RelA; ATP + GTP; (p)ppGpp + AMP; stringent response.]
To restore its growth, the cell needs amino acids to “unstall” the ribosomes, thus restarting
protein synthesis. When the stringent response has been activated and ppGpp has been formed (Fig
9a), the alarmone plays a crucial role in the balance between the two enzymes polyP kinase (PPK)
and exopolyphosphatase (PPX). PPK transfers a phosphate group from ATP to a short chain of linear
polyphosphate (polyP), thus creating longer polyP chains, whereas PPX degrades polyP to phosphate
groups (Fig 9b). PPX is inhibited by (p)ppGpp, so during the stringent response, when (p)ppGpp
builds up in the cell, the concentration of polyP increases (Fig 9c). PolyP forms a complex with
four molecules of the protease Lon (Fig 9d). This complex binds to free ribosomal proteins and
degrades them to amino acids (Fig 9e). The cell incorporates these amino acids into essential
proteins that are required for survival (Kuroda 2006).
4
Reprinted from Molecular Cell, Vol. 10, Wendrich TM, Blaha G, Wilson DN, Marahiel MA, Nierhaus KH, Dissection
of the mechanism for the stringent factor RelA, p 779-788, Copyright (2002) with permission from Elsevier.
Figure 9: Amino acid formation during stringent response. Modified from reference (Kuroda 2006)
[Diagram labels: (a) stringent response, (p)ppGpp; (b) PPK, PPX, ATP + Pi, inhibition of PPX; (c) PolyP;
(d) Lon, Lon-PolyP complex; (e) ribosome, ribosomal proteins, degradation, amino acids.]
The SpoT-pathway is responsible for the accumulation of ppGpp during most stresses and nutrient
limitations other than amino acid starvation, although little is known about the exact mechanism
(Magnusson et al. 2005).
One way to minimize the effect of the stress response on protein production is to use bacterial
strains that lack the stringent response. It has been shown that the activity of chloramphenicol
acetyltransferase (CAT) increased 5-fold in batch cultivation in LB-medium when a strain lacking
spoT and relA was used (Dedhia et al. 1996).
How to improve solubility and quality
Since solubility and quality are important aspects of protein production, different methods have
been developed that can be used to increase both solubility and quality.
Co-expression of chaperones to increase solubility
It was first shown in 1989 that co-expression of chaperones improved the solubility of a
heterologous protein. Since then chaperone co-expression has been utilized to increase the
solubility of many proteins. Examples of co-expression in E.coli are given in Table 2, which shows
that co-expression of chaperones is not an instant route to success. It is a trial-and-error
effort to find the correct chaperone that interacts with the specific protein of interest. The
optimal level of co-expression also needs to be found, since overexpression of chaperones has
drawbacks such as cell filamentation (Georgiou and Valax 1996).
Furthermore, co-expression of one chaperone may be insufficient to increase solubility since
different types of chaperones act together (Langer et al. 1992). Adding 3% ethanol together with
chaperone co-expression increased the production of preS2-S’-β-galactosidase probably due to
induction of the heat-shock response in combination with the co-expression (Thomas and Baneyx
1997). The periplasmic production of single-chain Fv antibodies was improved markedly by not
only overexpressing DsbC but also by fusing the antibody to DsbG (Zhang et al. 2002).
| Chaperone / Foldase | Protein product | Results | Reference |
|---|---|---|---|
| GroES-GroEL | Xenopus mos protooncogene product | Significant increase in solubility | Thomas et al. 1997 |
| GroES-GroEL | Human p53 tumor suppressor gene product | Slight increase in solubility | Thomas et al. 1997 |
| GroES-GroEL | Adenovirus oncogene product E1A | No effect | Thomas et al. 1997 |
| DnaK | Human growth hormone | Increase the amount of soluble product from 5 to 85% | Georgiou and Valax 1996 |
| DnaK or GroEL-GroES | Chloramphenicol acetyltransferase (CAT) | No significant effect | Thomas et al. 1997 |
| DnaK, DnaJ | Magnesium transporter CorA | Little effect on expression but prevented inclusion body formation (from 3 up to 12 mg soluble protein in membrane) | Chen et al. 2003 |
| GroEL, GroES | Magnesium transporter CorA | Increased from 3 up to 5 mg soluble protein in membrane | Chen et al. 2003 |
| DsbA | α-amylase/trypsin inhibitor | 14-fold increase in soluble product | Georgiou and Valax 1996 |
| DsbC | DsbG-scFv | Soluble portion increased from 50 % to 90 % | Zhang et al. 2002 |

Table 2: Co-expression of various chaperones and foldases in E.coli and their effect on solubility of both cytoplasmic
and periplasmic proteins.
Improved solubility through fusion proteins
Another method to increase the solubility of a protein is to fuse it to a fusion partner at either
end of the protein. Fusion proteins have been used in different applications, for example
immobilization of enzymes or receptors, drug targeting etc. (Uhlén et al. 1992). They were primarily
thought of as an aid in protein purification, but certain fusion partners were seen to greatly
increase the solubility of the target protein. A reason for this could be the efficient and rapid
folding of the fusion partner once it emerges from the ribosome (Baneyx 1999).
In 1999, Kapust and Waugh did a study to compare three commonly used fusion proteins during
production of six diverse proteins that normally form inclusion bodies. The three fusion proteins
were MBP (maltose binding protein), GST (glutathione-S-transferase) and TRX (thioredoxin). In
all cases but one, the fusion with MBP resulted in more than 60 % soluble protein compared to 20%
soluble protein or less when the same proteins were fused to GST or TRX (Kapust and Waugh
1999).
Other fusion partners that have been shown to increase solubility are the N-utilizing substance A
(NusA), the Gb1 domain from Streptococcus protein G and the Z-domain from Staphylococcal
protein A. All these fusion partners were studied with respect to expression of 32 human proteins in
E.coli (Hammarström et al. 2002). According to the study, TRX, Gb1 and MBP were most efficient
in giving a soluble protein. They were followed by ZZ, NusA and GST, and the tag least able to give
a soluble protein was the 6×His-tag, which is not normally used as a solubilizing tag. In fact the
histidine tag, widely used as a purification tag in immobilised metal-affinity chromatography (Terpe
2003), has been shown to have a negative effect on protein solubility regardless of whether it is
placed at the C- or N-terminus (Woestenenk et al. 2004).
One disadvantage with the use of fusion proteins is that the solubility may decrease when the fusion
partner is removed. Furthermore, the cleavage of the target protein from the fusion protein takes
time and it is also expensive.
Protein engineering to decrease proteolysis
If a proteolytically sensitive site is found in a region of a protein that is not important for the
activity, folding or other functional properties, protein engineering can be used to improve the
solubility (Murby et al. 1996). It may be possible to delete or substitute amino acids in the
sensitive site and obtain a more stable product. Even small changes in the primary structure can
have large effects on protein solubility and stability (Schein 1990). Furthermore, many hydrophobic
residues on the surface of the protein can lead to an insoluble protein (Sorensen and Mortensen
2005). In 1989 Hellebust et al. showed that by exchanging a linker region where 2 basic amino acids
contributed to the degradation of a fusion between Staphylococcus protein A (SpA) and E.coli
β-galactosidase, both IgG-binding and β-galactosidase activity were retained (Hellebust et al. 1989).
Another example is the expression of a 101-amino acid fragment of the human respiratory syncytial
virus (RSV) major glycoprotein G, where 4 phenylalanine residues and 2 cysteines were replaced by
serines. This engineering increased the fraction of soluble protein from 27 to 75 % (Murby et al.
1995). It has also been shown that if surface-exposed hydrophobic residues are exchanged for
positively charged residues, the protein solubility can be significantly improved. A designed
protein, 4ANK, was only soluble between pH 3 and 4.25, but when six leucines were replaced by
arginines to increase the net positive charge, the protein was soluble between pH 4 and 7 (Mosavi
and Peng 2003).
Summary
During overproduction of a recombinant protein, the conditions during cultivation act as stimuli on
the cell. The quality of the recombinant protein can be influenced by the different stress response
systems that exist in the cell. The heat shock and envelope stress responses are for example induced
by misfolded proteins and respond with the production of proteases that may degrade the
recombinant product. Furthermore, carbon depletion caused by too little glucose may induce the
stringent response, which puts the cell in a survival mode greatly affecting the production of the
recombinant protein.
1.2.3 Specific productivity
During high-throughput protein production it is important to get a high total product concentration.
Amounts of 5-100 mg soluble protein are necessary for structural studies. A high total productivity
is accomplished by either a scale-up of the cultivation volume, an increased biomass or an
increased specific productivity. Due to the constraints of HTPP, the volume of the production
process is fixed, as is the biomass possible to reach due to the limited oxygen transfer. The only
possible way to increase the total productivity is to increase the specific productivity, that is the
productivity per cell, which can be influenced by the rate of the protein synthesis.
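The relationship between volume, biomass and specific productivity can be written as a simple product. The sketch below assumes that specific productivity is expressed as mg of soluble product per g of cell dry weight at harvest; this unit choice, the function names and the example numbers are illustrative assumptions, not values from this work.

```python
def total_product_mg(volume_l: float, biomass_g_per_l: float,
                     product_mg_per_g_cells: float) -> float:
    """Total soluble product (mg) = volume x biomass x specific product content."""
    return volume_l * biomass_g_per_l * product_mg_per_g_cells


def required_specific_content(target_mg: float, volume_l: float,
                              biomass_g_per_l: float) -> float:
    """Specific product content (mg/g cells) needed to reach a target amount."""
    if volume_l <= 0 or biomass_g_per_l <= 0:
        raise ValueError("volume and biomass must be positive")
    return target_mg / (volume_l * biomass_g_per_l)


# Hypothetical small-scale cultivation: 0.5 L at 2 g cells per litre.
print(total_product_mg(0.5, 2.0, 10.0))
for target_mg in (5.0, 100.0):  # range needed for structural studies
    print(target_mg, required_specific_content(target_mg, 0.5, 2.0))
```

With volume and biomass fixed, the required specific product content scales linearly with the target amount, which is why specific productivity is the remaining variable to optimize under high-throughput constraints.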
Plasmid copy number
Multicopy plasmids have been used for protein overexpression for a long time, and high copy number
plasmids in particular are widely used. They have several advantages: they are small (about 5 kbp),
easy to manipulate and generally result in good expression (Jones et al. 2000). But there are
disadvantages as well: the plasmid may become structurally unstable and it may shift the normal
metabolism of the host, which also increases the risk of instability (Jones et al. 2000). The
plasmid copy number influences the productivity. There is a gene dosage effect, i.e. the amount of
product
increases with copy number but this only seems to be true up to medium copy number (Yansura and
Henner 1990). For examples of some common commercial vectors, see Table 3.
| Plasmid | Copy number | Classification | Reference |
|---|---|---|---|
| pUC | 500-700 | High copy | Yanisch-Perron et al. 1985 |
| pGEM | 300-400 | High copy | Promega Corp. |
| pBR322 | 15-50 | Medium copy | Balbás et al. 1986 |
| pACYC | 10-12 | Low copy | Rose 1988 |

Table 3: Copy number and classification of some common commercial vectors.
A study comparing a low-copy plasmid with a high-copy plasmid showed, in this case, a 29 %
increase in product concentration when the low-copy plasmid was used instead of the high-copy
plasmid (Jones et al. 2000). One other way to increase the productivity might be to stabilize the
plasmid copy number to avoid high expression rates which otherwise may cause an irreversible
increase in plasmid production. This can lead to a metabolic overload on the cell and the system
might collapse (Grabherr et al. 2001).
Promoters
There are many promoters to choose from when producing a protein. Promoters are either inducible or
constitutive, and a promoter can be controlled through positive or negative control. Negative
control consists of a repressor molecule that binds the promoter, thus inactivating it, whereas
positive control is a transcriptional activation. The promoters need to be tightly regulated in
order to control the protein production, and it should be possible to induce them in a simple and
cost-effective manner. Examples of repressible promoters are the trp and phoA promoters, which are
induced when the cells are starved for tryptophan or phosphate, respectively. Other promoters are
inducible, for example the lacUV5- and the tac-promoter (Georgiou 1988). Promoters can be
divided into different classes depending on strength. The difference in strength depends on the
sequence in the consensus region of the promoter. A strong promoter binds RNA polymerase more
strongly and, theoretically, the stronger the promoter is, the more protein should be
produced. Strong promoters often have an “on/off control” (see Figure 10). This means that,
independent of the concentration of inducer, the synthesis rate rapidly increases to its maximal
level. It has been shown that the mRNA levels reached their highest values 5 minutes after induction
(Sandén et al. 2002), which would imply an almost instantaneous response to the inducer. A small
increase in inducer concentration will often lead to a large increase in the production level.
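The difference between a graded and an “on/off” response can be illustrated with a phenomenological dose-response curve. The sketch below uses a Hill-type function, which is an assumed model chosen for illustration and is not taken from the cited studies; the parameter names and values are hypothetical.

```python
def relative_synthesis_rate(inducer: float, half_max: float,
                            hill_coefficient: float) -> float:
    """Hill-type response between 0 (off) and 1 (fully induced).

    half_max is the inducer concentration giving half-maximal synthesis;
    a large hill_coefficient gives a switch-like (on/off) response.
    """
    if inducer < 0 or half_max <= 0 or hill_coefficient <= 0:
        raise ValueError("inducer must be >= 0; other parameters must be > 0")
    x = (inducer / half_max) ** hill_coefficient
    return x / (1.0 + x)


# Compare a graded promoter (n = 1) with a switch-like promoter (n = 8),
# with inducer concentration given relative to half_max.
for inducer in (0.5, 1.0, 2.0):
    graded = relative_synthesis_rate(inducer, half_max=1.0, hill_coefficient=1)
    switch = relative_synthesis_rate(inducer, half_max=1.0, hill_coefficient=8)
    print(inducer, round(graded, 3), round(switch, 3))
```

In the switch-like case, doubling the inducer concentration around the half-maximal point takes the synthesis rate from half-maximal to nearly maximal, which corresponds to the behaviour sketched in Figure 10; in the graded case the same change has a much smaller effect.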
Figure 10: The “on-off control” of promoters. When induced, a small change in inducer has a large effect on productivity. Modified from (Ruhdal Jensen and Hammer 1997). [Plot: productivity versus inducer concentration, switching from “off” to “on” at induction.]
Induction level
In high-throughput production a constitutive production might be a good choice. A constitutive
production system has a promoter that is constantly active which means that no extra additions need
to be made to the bioreactor. Auto-induction is another possibility, where cells, for example, are
growing on a mixture of glucose and lactose. Due to catabolite repression the cells will favour the
uptake of glucose and when the glucose is exhausted, lactose will be utilized and will lead to
induction of the lac-promoter. But a high expression level can cause a metabolic burden on the cell
leading to decreasing growth rates (Glick 1995). Therefore an induced production of the
recombinant protein can be a better option if the system is not overinduced. When using the
lac-promoter, an optimal inducer concentration needs to be chosen to balance the decreasing yields of
cells with increasing levels of target protein. When too high an inducer concentration is used, in this
example 3.4 mM isopropyl-β-D-thiogalactopyranoside (IPTG), the reduction in specific growth rate
may be so severe that it will lead to cells entering stationary phase and even cell death (Bentley et
al. 1991). Even an induction with 0.1 mM IPTG can lead to a 23 % lower biomass than an uninduced
culture (Andersson et al. 1996). Apart from reducing the growth rate of the bacteria, too
high concentrations of IPTG lead to a high production, which causes stress responses
in the cell. Heat-shock proteins such as DnaK, GroEL and GroES were observed 30 minutes after
induction with 0.5 mM IPTG (Kosinski et al. 1992).
One can also take advantage of the effects of IPTG. IPTG causes a high production rate, which in
turn might lead to an increased permeability of the outer membrane, thus releasing 90 % of the
produced periplasmic protein to the medium (Georgiou et al. 1988). This simplifies the purification
process of the product since the medium contains fewer proteins that need to be removed compared
to if the product were still in the cells. The productivity may be increased through a reduction in the
IPTG concentration. When decreased from 1 mM to 0.01 mM, the concentration of functional and
secreted Fab fragment was increased from below 90 ng/mL culture to almost 1700 ng/mL culture
(Shibui and Nagahari 1992).
Translation initiation
The main elements of the translation initiation region (TIR) are a Shine-Dalgarno sequence (SD), an
initiation codon and a downstream region (DR). The SD is located 5-9 bases upstream of the
initiation codon and it is important in the binding of mRNA to the 30S complex of the ribosome.
The SD base-pairs with a region located at the 3’ end of 16S rRNA, which positions the ribosome at the
proper initiation codon. In 91 % of the translation initiation regions in E.coli that have been sequenced, the
initiation codon used is AUG, followed by GUG (8 %) and UUG (1 %) (Gualerzi and Pon 1990).
The downstream region (DR) consists of 5-10 codons where the first codon, the +2 codon, has a
strong influence on gene expression. The expression can vary 15-20 fold depending on which
codon is in this position. Codons that start with the base A generally give a high gene
expression (Stenström et al. 2001b), whereas codons rich in G give low gene expression if located
in the +2 position (Gonzalez de Valdivia and Isaksson 2004). These G-rich codons (AGG, CGG,
UGG, GGG) gave a similarly low expression when placed in positions +3 and +5 in the downstream
region (Gonzalez de Valdivia and Isaksson 2004).
The gene expression is also influenced by the localization of the SD-region. A strong SD sequence
located eight bases upstream from the initiation codon showed a strong positive effect on gene
expression, whereas a negative effect was shown when the strong SD was placed 15 bases
downstream of the initiation codon (Jin et al. 2006). It has also been shown that the SD region
upstream of the initiation codon and the DR that follows the initiation codon act together to
influence gene expression (Stenström et al. 2001a).
Acetic acid formation
Acetic acid is produced during oxygen limitation and during glucose excess in the presence of
oxygen. With a high glucose concentration, the carbon flux into the metabolic pathways is larger
than what is demanded for biosynthesis and the surplus of carbon is used to produce acetic acid.
This may be due to an imbalance in the glycolysis, a saturation of the TCA-cycle or the electron
transport chain (Lee 1996). The formation of acetic acid can be decreased by controlling the growth
rate below 0.2 h⁻¹ or 0.35 h⁻¹ for complex or defined media, respectively (Meyer et al. 1984), as well
as by using glycerol instead of glucose as the carbon source (Lee 1996). It has been shown that acetic
acid has a negative impact on both growth and protein production. At acetic acid concentrations
over 6 g/L, the growth was inhibited (Jensen and Carlsen 1990). However, the specific production
rate of a human growth hormone was decreased already below an acetic acid concentration of 2.4
g/L (Jensen and Carlsen 1990).
Since acetate has a negative impact on productivity, the productivity can likely be increased by
reducing the accumulation of acetic acid. One pathway to acetate production is from
acetyl-coenzyme A by the acetate kinase-phosphotransacetylase pathway. Through mutations in the
genes for these enzymes, acetate accumulation was seven- to nine-fold lower than in the wild type and
the accumulation of the product, Interleukin-2, was improved (Bauer et al. 1990). Acetic
acid accumulation could also be decreased by inserting the als gene from Bacillus subtilis, which
diverts pyruvate to acetoin, a far less toxic compound to the cell than acetate. The specific
activity of the product β-galactosidase was increased by 60 % with the inserted gene (Aristidou
et al. 1995).
Summary
There are many ways in which the specific productivity of a recombinant protein can be influenced.
The choice of plasmid copy number and promoter as well as how the production is induced and to
what levels, affects the specific productivity. Furthermore, it is also influenced by the translation
initiation region and through by-product formation such as acetate, which can inhibit both growth
and product formation.
2. PRESENT INVESTIGATION
E.coli is a host with a large capacity to produce target proteins that are structurally and functionally
diverse. However, we believe that this capacity has not been used to its full potential, specifically
concerning the production of more difficult proteins, i.e. proteins that are toxic to the cells or
proteins with a low solubility. The aim of this thesis is to investigate and determine central
parameters that can be used to increase the solubility, quality and productivity of such recombinant
proteins.
The first problem faced was the product quality. Quality is here defined as a soluble, active,
full-length protein. The hypothesis is that folding and post-translational modifications are the
rate-limiting steps in production of difficult recombinant proteins. Therefore, the focus has been to reduce
the synthesis rate of the nascent polypeptide. This may benefit the production of recombinant
proteins, since the produced proteins often are heterologous to E.coli. This means that the
recombinant protein is produced in an unfamiliar environment and the chaperones in E.coli can
therefore have problems recognizing the recombinant protein. If the protein is produced to high
concentrations, the folding and post-translational machinery will be overloaded and thus become the
rate-limiting step. By decreasing the synthesis rate, the rate-limiting step can be shifted from folding and
post-translational modifications to earlier steps in the process. Is it possible to increase the
solubility and decrease the proteolysis through an altered synthesis rate as accomplished by
different feed rates?
The second problem refers to the nature of the protein. There will be problems with specific groups
of proteins produced in the cytoplasm despite a reduced synthesis rate. The reason can be that they
are toxic to the cell or need a more oxidizing environment than the cytoplasm can provide. The
common solution is to secrete these proteins to the periplasm. The periplasmic space is more
viscous, it is devoid of ATP and it has oxidizing conditions. For instance, proteins containing
disulfide bridges have to be transported unfolded to the periplasm where the oxidizing conditions
favour disulfide bond formation. As for folding and post-translational modifications we believe that
the transport across the cytoplasmic membrane is the rate-limiting step. The transporters in the
membrane have low affinity for these recombinant proteins and will have problems coping with the
transport since these proteins are unknown and in a high concentration. With a reduced synthesis
rate, the transportation rate across the cytoplasmic membrane would be lower thus giving the
transporters more time. Therefore the transportation would be improved by a reduced synthesis rate.
Can stress and degradation be affected by different feed rates for secreted proteins?
In earlier work it has been observed that the feed rate of substrate has a strong and pronounced
impact on the protein production concerning amounts and quality of the protein (Boström and
Larsson 2004; Sandén et al. 2002). The substrate concentration determines the uptake rate of
glucose. A high substrate concentration will result in a higher glucose uptake rate, which means that
the cell will have more carbon and energy that can be used for growth. Furthermore, a high feed
rate results in an increased amount of RNA-polymerase molecules as well as an increased amount
of ribosomes. This affects transcription and translation rates respectively and thus theoretically the
synthesis rate of the protein. From earlier work it was indeed observed that the uptake rate also
affected the synthesis rate of a recombinant protein (Sandén et al. 2002). Given the central role of
the feed, the strategy in this work was to select the feed rate to influence the quality and quantity of
a recombinant protein.
Given the importance of the substrate uptake rate and taking into consideration the constraints of
HTPP with regards to simplicity and low volume, can the substrate uptake rate and the total
productivity be influenced without using the fed-batch technique?
2.1 Control of solubility and proteolysis in the cytoplasm (I)
Is it possible to increase the solubility and decrease the proteolysis through an altered synthesis rate
as accomplished by different feed rates? When producing recombinant proteins in E.coli, the
protein is often heterologous. This can result in folding problems since chaperones and foldases
may have problems recognizing these recombinant proteins. By reducing the synthesis rate of the
recombinant protein, fewer polypeptides need to be folded per unit time, which gives the chaperones
and foldases more time to recognize each polypeptide and aid its folding.
Strategy
We have postulated that proteins naturally subjected to proteolysis or inclusion body formation
would benefit from being produced in E.coli provided that we can find the parameter to control this
quality issue. We have used an isogenic strain producing the proteins MalE and MalE31 for this
evaluation. MalE represents an easy protein and also serves as a control protein. It is a 41 kDa
periplasmic receptor used to transport maltose and maltodextrin. It has been used as a fusion protein
to enhance the solubility of the fused target protein as was discussed above.
A mutated form of the maltose-binding protein, MalE31, was chosen as the difficult model protein.
The mutations consist of two altered consecutive amino acids. Aspartic acid and proline are used in
MalE31 instead of glycine and isoleucine in the wild type MalE. The amino acids are situated in
positions 32 and 33, respectively. This double mutant seems to be less stable than the wild type,
MalE, and it has been suggested that the mutant is more prone to aggregate (Betton and Hofnung
1996).
The lacUV5-promoter is a well-known promoter that is insensitive to catabolite repression (Arditti
et al. 1973). This promoter does not depend on cAMP i.e. the transcription rate is independent of
the glucose concentration and thus independent of the feed rate. The lacUV5-promoter is induced
by the natural inducer allolactose, which is formed from lactose, or by the synthetic analogue
isopropyl β-D-thiogalactopyranoside (IPTG). A drawback of the lac-promoter and its derivatives is
the induction kinetics. A plot of the specific production rate as a function of the IPTG concentration
(qp = f(IPTG)) shows that the induction curve is very steep and small variations in IPTG
concentration give dramatic effects on the production (Ruhdal Jensen and Hammer 1997). The
reason for this is that IPTG also induces its own transport system, the permease LacY. This can be
overcome, and we have constructed a lacY mutant where IPTG diffuses into the cell rather than
being actively transported, which results in a more precisely regulated promoter (Ruhdal
Jensen and Hammer 1997).
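The steepness of such an induction curve can be illustrated with a simple saturation model. The sketch below is only an illustration and is not taken from Ruhdal Jensen and Hammer (1997): it assumes that the relative specific production rate follows a Hill-type function of the IPTG concentration. The function name, the half-saturation concentration and the Hill coefficients are hypothetical values chosen to contrast a steep (“on/off”) response with a graded one.

```python
def relative_qp(iptg_um, k_half_um=100.0, hill_n=4.0):
    """Relative specific production rate (0-1) as a Hill-type function of IPTG.

    k_half_um and hill_n are hypothetical illustration values, not measured data.
    """
    if iptg_um <= 0:
        return 0.0
    return iptg_um ** hill_n / (k_half_um ** hill_n + iptg_um ** hill_n)


if __name__ == "__main__":
    for iptg in (25, 50, 100, 200, 400):
        steep = relative_qp(iptg, hill_n=4.0)   # assumed "on/off" behaviour
        graded = relative_qp(iptg, hill_n=1.0)  # assumed graded behaviour
        print(f"{iptg:>4} uM IPTG  steep: {steep:.2f}  graded: {graded:.2f}")
```

With the steeper coefficient, most of the change in production rate falls within a narrow range of inducer concentrations around the half-saturation value, which is the behaviour that makes the promoter hard to tune.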
The vectors used in this work have a low copy number and the two recombinant proteins are under
the control of the lacUV5-promoter. They are fused to a purification tag that allows for an easy
identification and quantification of the proteins through Western blot using antibodies against the
tag. The tag used was Zbasic or Zb, which is a 7-kDa purification tag originating from the
B-domain of Staphylococcal protein A. Positive charges have been added to the original domain, thus
making it positively charged even at high pH-values, where it can be efficiently used in
ion-exchange chromatography (Gräslund et al. 2000).
The expression system is designed so that the uptake rate of glucose has a large influence on the
synthesis rate instead of other factors that are transcriptionally or translationally controlled. In order
to decrease the synthesis rate of the recombinant protein, a low-copy number vector and the
lacUV5-promoter were used. A low copy number reduces the product accumulation resulting from
increasing copy numbers. Since the lacUV5-promoter is not catabolite repressed, the transcription is not
dependent on the glucose feed rate but can be controlled by the addition of inducer, which is
furthermore more precisely controlled in the lacY strain. It is thus possible that the synthesis rate is
controlled by the translation rate. This was confirmed in a previous study with the same expression
system, where the mRNA concentrations were the same for a high and a low feed rate and
the translation was controlled by the amount of ribosomes (Sandén et al. 2002). This suggests that also
in this case translation controls the accumulation of MalE and MalE31.
Results
In order to use the feed rate as a variable affecting the quality, the fed-batch concept was used. The
feed of substrate was exponential, which allows constant uptake rates of glucose for the duration of
the cultivations (7 - 15 hours). One high feed rate and one low feed rate were chosen. Since the
maximal growth rate is 0.7 h⁻¹, the high feed rate was selected to give a theoretical growth rate of
0.5 h⁻¹, somewhat lower than the maximal growth rate, to gain cultivation stability and control.
The low feed rate on the other hand was selected to give a theoretical growth rate of 0.2 h⁻¹ in order
to avoid problems with maintenance issues that lower feed rates can create. These two feed rates
will lead to different uptake rates of glucose, which in turn results in altered protein synthesis rates.
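An exponential feed profile of this kind can be sketched as follows. The formula is the standard textbook expression for a substrate-limited fed-batch that holds a set specific growth rate and neglects maintenance; it is not given in this work. The function name and the initial cell concentration, volume, yield coefficient and feed concentration are hypothetical values used only for illustration.

```python
import math


def exponential_feed_rate(t_h, mu_set, x0_g_per_l, v0_l, yield_g_per_g, s_feed_g_per_l):
    """Feed flow (L/h) at t_h hours after feed start for a set growth rate mu_set (1/h).

    F(t) = mu_set * X0 * V0 / (Yxs * S_feed) * exp(mu_set * t); maintenance is neglected.
    """
    f0 = mu_set * x0_g_per_l * v0_l / (yield_g_per_g * s_feed_g_per_l)
    return f0 * math.exp(mu_set * t_h)


if __name__ == "__main__":
    # Hypothetical starting conditions: 2 g/L cells in 1 L, yield 0.5 g/g, 500 g/L glucose feed.
    for mu in (0.2, 0.5):  # the low and high set growth rates used in the text
        flows = [exponential_feed_rate(t, mu, 2.0, 1.0, 0.5, 500.0) for t in (0, 2, 4, 6)]
        print(f"mu = {mu} 1/h:", ", ".join(f"{f * 1000:.1f} mL/h" for f in flows))
```

Because the feed grows at the same exponential rate as the biomass, the glucose supplied per cell and hour stays constant, which is why the specific uptake rate is constant during the feed phase.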
The overall performances of the cultivations were evaluated for the two feed rates and the two
proteins. As can be seen in Figure 11, the altered amino acid sequence did not affect the cell mass
concentration or the acetic acid formation. They were the same irrespective of which protein
was produced. As can be expected with a high feed rate (0.5 h⁻¹), acetic acid was produced and
accumulated to 4000 mg/L, which is in accordance with the literature (Meyer et al. 1984). In the
case with the low feed rate (0.2 h⁻¹) the highest concentration of acetic acid, 500 mg/L, was reached
at the end of the batch-phase. Once the feed was started, the acetate was rapidly consumed and it
stayed at a low level for the duration of the cultivation.
Figure 11: The cell mass and acetic acid accumulation during two different exponential glucose feed rates during production of Zb-MalE and Zb-MalE31. Squares: production of Zb-MalE, low feed rate. Triangles: production of Zb-MalE31, low feed rate. Circles: production of Zb-MalE31, high feed rate. Rhombi: production of Zb-MalE, high feed rate. Filled symbols: acetic acid accumulation (HAc). Open symbols: dry weight accumulation (DW). [Plot: dry weight (g/l) and HAc (mg/l) versus time from feed start (h), with the time of induction marked.]
Soluble and insoluble fractions of the proteins were analyzed. The proteins were detected by
Western blots using antibodies against protein A that will bind to the purification tag. As can be
seen in Figure 12 the soluble fractions are generally higher than the insoluble fractions that consist
of inclusion bodies. Both the soluble and the insoluble fractions increase with the inducer
concentration. Using the highest amount of inducer does in most cases not result in more soluble
protein but only results in an increased inclusion body formation. Furthermore, a low feed rate
gives higher product concentrations than the high feed rate. And when comparing the two proteins,
it is the native form (Zb-MalE) that is accumulated to the highest levels of both soluble protein and
inclusion bodies. The lowest amount of soluble protein is achieved by the high feed rate in
combination with the mutated form of the protein and the highest concentration of inducer. The
reason for the higher ratio of insoluble proteins may come from harsh environmental conditions
resulting in stressful conditions for the bacteria. These stressful conditions can render the holding
chaperones unable to stabilize the protein thus causing it to aggregate.
Figure 12: The soluble and insoluble fractions of Zb-MalE and Zb-MalE31 as a function of feed rate and IPTG-concentration. All fractions are compared relative to a standard of Zb-MalE. The samples were taken at a time corresponding to five generations of production at exponential feed. [Bar chart: relative product concentration of the soluble and insoluble fractions versus IPTG concentration (µM), with panels for Zb-MalE and Zb-MalE31 at low and high feed.]
In order to determine the reason for this lower solubility, the proteolytic stability of the proteins was
investigated. This was done by withdrawing a sample from the bioreactor and adding
chloramphenicol to stop the protein synthesis. After the addition of chloramphenicol, total protein
samples were taken every five minutes and analyzed using Western blots. As can be seen in Figure
13a, the relative concentration of the native protein is around 100 % for the duration of the
experiment. This was true for both low and high feed rates as well as for different concentrations of
IPTG. On the Western blots, some degradation bands in the sizes of 30, 33 and 38 kDa were visible
but they could not be characterized further due to low concentrations.
For the mutated form of MalE, the scenario looks completely different as can be seen in Figure 13b.
During low feed rates, the relative concentration of the protein seems to be stable at first. But after
10-15 minutes, the protein is subjected to proteolysis and after 45 minutes, only 40-50 % remains of
the soluble protein. The scenario for the high feed rate is even more severe. The product
concentration is rapidly decreasing from the first minute to the end of the experiment. From initial
values of 100 %, only 10 % soluble protein remains after 45 minutes. In both cases, the
concentration of IPTG does not seem to be important for the proteolysis. As was the case for
Zb-MalE, degradation products could be detected for Zb-MalE31 as well but they could not be further
characterized due to the low amounts.
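As a rough way to compare these degradation profiles, the remaining fractions quoted above can be converted into apparent rate constants and half-lives. The sketch below assumes simple first-order decay from time zero, which is an assumption made here for illustration and not a model used in this work. The low feed rate data show an initial lag of 10-15 minutes, so the model is only a crude approximation there. The function names are hypothetical.

```python
import math


def first_order_rate(fraction_remaining, minutes):
    """Apparent first-order rate constant (1/min) from the fraction left after a given time."""
    return -math.log(fraction_remaining) / minutes


def half_life(rate_per_min):
    """Half-life (min) corresponding to a first-order rate constant."""
    return math.log(2) / rate_per_min


if __name__ == "__main__":
    # Fractions of Zb-MalE31 remaining after 45 min, as quoted in the text.
    cases = {
        "high feed rate (10 % left)": 0.10,
        "low feed rate (45 % left, midpoint of 40-50 %)": 0.45,
    }
    for label, fraction in cases.items():
        k = first_order_rate(fraction, 45.0)
        print(f"{label}: k = {k:.3f} 1/min, half-life = {half_life(k):.1f} min")
```

Under this assumption the apparent half-life of Zb-MalE31 is roughly 13.5 min at the high feed rate and roughly 39 min at the low feed rate, that is, about a threefold difference in stability.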
The experiment with proteolysis was repeated 6 hours after induction to determine if the severe
proteolysis was a result of the induction or consistent during the cultivation. The result for Zb-MalE
was similar to the result three hours after induction (Fig 13a), whereas the result for Zb-MalE31
was somewhat different. The degradation rate was decreased with cultivation time and for the low
feed rate, 60-90 % of the protein remained as full-length after 45 minutes. In the high feed rate,
approximately 40 % of the initial full-length protein remained. The high proteolysis rate three hours
after induction can therefore be a result of the actual induction. In the literature, heat-shock proteins
were detected 30 minutes after IPTG-addition (Kosinski et al. 1992). It is a possibility that
proteases from the heat-shock response are the reason for our higher rate of degradation three hours
after induction compared to six hours after induction.
The differences in proteolysis between the native and the mutated form of MalE can also explain
why the native protein is produced to higher amounts of soluble protein than the mutated version.
The Zb-MalE31 protein is likely to have the same synthesis rate of nascent polypeptide, but since it
is unstable it will suffer from a higher degree of proteolysis hence reducing the soluble fraction.
Through the alteration of only two amino acids, the protein has gone from being stable and
relatively soluble to unstable and inclusion body forming.
Figure 13: Proteolysis of Zb-MalE and Zb-MalE31. In a) the proteolysis of Zb-MalE during low exponential feed is shown. The percentage of the initial concentration of Zb-MalE is plotted against time, circles: sample taken 3 h after induction, squares: sample taken 6 h after induction. In b) the proteolysis of Zb-MalE31 is shown. Filled symbols represent low feed rate, open symbols represent high feed rate, squares represent 300 µM IPTG and circles represent 30 µM IPTG. The percentage of the initial concentration of Zb-MalE31 is plotted against time. [Plots: percentage of initial MalE (%) in a) and of initial MalE31 (%) in b) versus time (min).]
Conclusions
Different methods that have been used to increase solubility are for example co-expression of
chaperones, the use of fusion proteins and protein-engineering. All these methods have been shown
to increase the solubility of recombinant proteins. However, when chaperones are co-expressed the
success rate is closer to 50 % than 100 % (Thomas et al. 1997). Furthermore, the cell is required to
produce one extra protein. This reduces the cell’s capacity to produce the recombinant protein of
interest. Problems can also arise with the compatibility between plasmids when an extra plasmid is
used to express the chaperone. When fusion proteins are used, the cell is also required to produce
one additional protein thereby consuming resources that could have been used for the recombinant
protein. Additionally, the cost of cleaving the fusion protein from the recombinant product can be
high and time consuming. In order to succeed with protein-engineering the structure of the protein
needs to be known. Since the aim is to produce proteins for structure determination, this is not
possible. Furthermore, protein engineering requires an additional step in the protein production
process. This is a considerable drawback since production is to be as rapid as possible. Instead of
using these methods we have shown that simply by using different feed rates of glucose, i.e.
different glucose uptake rates, the solubility can be increased and the proteolysis can be reduced. A
low feed rate results in the highest solubility and the lowest proteolysis. Therefore it is possible to
use an altered synthesis rate as accomplished by different feed rates to increase the solubility and
decrease the proteolysis without using extra and time-consuming steps in the protein production.
2.2 Control of stress and proteolysis in the periplasm (II)
Can stress and degradation be affected by different feed rates for secreted proteins? Several proteins
cannot be produced to a high level in the cytoplasm; instead, the periplasm is the natural
alternative. For these proteins, we believe that the transport across the cytoplasmic membrane is the
rate-limiting step in folding and post-translational modifications. We thus believe that by reducing
the synthesis rate of the protein these problems may be diminished. It is natural to decrease the
synthesis as early as possible in the production chain due to carbon and energy considerations. In
this work we have thus attempted to put the regulation under transcriptional control and we are
using a promoter, PmalK, with different induction kinetics.
Strategy
We have investigated a new promoter for use in recombinant protein production, i.e. the
malK-promoter (Boström and Larsson 2002; Boström and Larsson 2004). MalK is a gene of the maltose
operon and its promoter depends on both activation of MalT and formation of cAMP-CRP. This is
achieved through three separate mechanisms and, depending on how many mechanisms are
activated at the same time, the promoter is induced at different levels through different kinetics.
The first mechanism consists of the formation of the inducer maltotriose from maltose taken up
from the medium. The maltotriose leads to the transcriptional activation through activation of
MalT. The second mechanism builds on the transcription from cAMP-CRP, which is formed when
the glucose concentration in the medium is low and this activates the malK-promoter directly.
Thirdly, maltotriose can also be formed from the degradation of internal glycogen and trehalose.
See Figure 14 for the activation mechanisms.
Figure 14: The activation of the malK-promoter (Boström and Larsson 2002). [Diagram: maltose from the medium and glycogen/trehalose degradation in the cytoplasm give maltotriose, which converts inactive MalT (MalT i) to active MalT (MalT a); low glucose in the medium gives cAMP-CRP formation; both lead to transcription of malK.]
Depending on how many cAMP-CRP and MalT binding sites are occupied, the induction
reaches different levels and kinetics as a result of inducer exclusion and catabolite repression. It has been
suggested that at least three out of four cAMP-CRP binding sites and all five MalT binding sites
should be occupied for full induction. The lowest production rates came from batch cultivations on
glucose (lack of maltotriose) or maltose (lack of cAMP) and to increase the production rate, a
glucose-limited fed-batch is utilized where the formation of cAMP is accomplished. By using this
fed-batch, glycogen is at the same time degraded to maltotriose. If the glucose-limited fed-batch is
followed by an addition of maltose, more maltotriose is formed resulting in the highest production
rates (Boström and Larsson 2004).
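The qualitative ranking of cultivation modes described above can be summarized as a small decision function. This is only a sketch of the ordering stated in the text, not a quantitative model of the promoter; the function name and the level labels are hypothetical.

```python
def malk_induction_level(glucose_limited_fed_batch, maltose_added):
    """Return the qualitative induction level of the malK-promoter described in the text."""
    if not glucose_limited_fed_batch:
        # Batch on glucose lacks maltotriose; batch on maltose lacks cAMP.
        return "lowest"
    if maltose_added:
        # cAMP is formed and additional maltotriose is formed from maltose.
        return "highest"
    # cAMP is formed; maltotriose comes from glycogen degradation only.
    return "intermediate"


if __name__ == "__main__":
    modes = {
        "batch on glucose": (False, False),
        "batch on maltose": (False, True),
        "glucose-limited fed-batch": (True, False),
        "glucose-limited fed-batch followed by maltose": (True, True),
    }
    for name, (fed_batch, maltose) in modes.items():
        print(f"{name}: {malk_induction_level(fed_batch, maltose)}")
```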
The model protein in (I), Zb-MalE, is recognized as a periplasmic receptor protein and was chosen
once again. To transport Zb-MalE to the periplasm, the signal peptide from OmpA was fused to the
N-terminus of the protein. The expression vector was a low copy number plasmid to minimize the
copy number effects.
In this case the synthesis rate of the recombinant protein is controlled by the malK-promoter in
combination with a low copy number plasmid, which reduces the total synthesis rate. Furthermore, the
malK-promoter does not have the same “on-off control” as the lacUV5-promoter, which
makes it easier to control. It should therefore be possible to decrease the protein synthesis rate
through the transcription rate, thus allowing control through medium uptake.
Results
The different feed rates were accomplished by using the fed-batch technique with glucose as the
substrate. The feed rates were exactly the same as in work (I). The malK-promoter was induced by
the addition of maltose after the glucose fed-batch phase. When glucose is replaced with maltose,
the cell is forced to change its uptake system to use the new substrate. This has effects on the cell
mass accumulation as can be seen in Figure 15. Independent of feed rate, both cytoplasmic
production systems suffer from a 2-hour long lag-phase during the substrate change, which is well
known and has been described in the literature since the 1940s. However, this lag-phase is
drastically reduced when Zb-MalE is transported to the periplasm. This shows that the change of
uptake system is influenced by the location of the recombinant product.
Figure 15: Cell mass accumulation during two different exponential feed rates of glucose followed by addition of maltose. Open symbols: cytoplasmic production, closed symbols: secretion to the periplasm. Squares: high feed rate, circles: low feed rate. Plus sign: cell accumulation without plasmid. Vertical lines indicate the addition of maltose. [Plot: dry weight (g/l) versus time from feed start (h) for glucose feed 0.5 and glucose feed 0.2, with the maltose additions marked.]
The recombinant protein suffers from degradation as can be seen in Figure 16. The soluble fraction
of the protein was detected with Western blot using antibodies against protein A. Looking at the
cytoplasmic production, the full-length protein is only visible at the low feed rate and not the high.
This may be due to severe degradation or inclusion body formation before the protein is properly
folded. Degradation bands are visible at both feed rates, the most abundant at 38- and 33-kDa. In
the periplasmic system the high feed rate gave the highest amount of full-length protein but
degradation bands are seen here as well. The reason for this degradation in both the cytoplasmic
and periplasmic production systems may be stress responses in the cell. For the cytoplasmic
production, the overproduction of recombinant proteins induces the heat-shock response whereas
the envelope stress response is induced when proteins are misfolded in the periplasm. Both stress
response systems produce among other proteins proteases, which can degrade the recombinant
protein. The product accumulation was also investigated after the addition of maltose, since it has
been shown to increase the production levels (Boström and Larsson 2004). The product pattern
could not be distinguished from the production during the glucose-feed phase; however, the sample
taken in the maltose phase only covered the transition between glucose and maltose.
Figure 16: The protein quality during cytoplasmic production and during secretion to the periplasm. Western blots showing the accumulation of soluble Zb-MalE and its degradation products during production at two selected glucose feed-profiles in fed-batch cultivation (g) and after the addition of maltose (m). M = molecular marker, R = purified Zb-MalE, Low = low feed rate, High = high feed rate. FL = full-length protein 48 kDa, 38 = a 38 kDa degradation product, 33 = a 33 kDa degradation product. [Blot lanes: M, R, then g and m for low and high feed rate in the cytoplasm and in the periplasm.]
In order to understand what caused the differences in degradation, the accumulation of acetic acid
was investigated, since acetate is known to influence productivity (Jensen and Carlsen 1990).
According to the literature, acetate accumulation depends on the specific growth rate: when the
cells grow at a rate below 0.35 h⁻¹, there should be no accumulation of acetic acid. As expected,
in the low feed systems the acetic acid concentration is less than 20 mg/L at the time of maltose
addition, see Figure 17. After this time the concentration increases to 70-90 mg/L, with only a
slight difference between cytoplasmic production and transport of the protein to the periplasm.
At the high feed rate, acetate accumulation would be expected. During cytoplasmic production at
high feed rate, the concentration was as high as 6500 mg/L at the time of maltose addition. During
the change in the substrate uptake system, the cells consumed some of the acetic acid and the final
concentration was about 4000 mg/L. Secretion of the product showed an altogether different picture:
the acetate concentration never exceeded 800 mg/L.
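The relation between feed rate and growth rate discussed above can be illustrated with the standard exponential feed equation for a substrate-limited fed-batch, F(t) = µ·X0·V0/(YX/S·Si)·exp(µt), written with the symbols from the abbreviation list. The following is only a minimal sketch and not the feed profile used in this work: maintenance is neglected, the function names are illustrative, and the numerical values of X0, V0, YX/S and Si are hypothetical. Only the growth rates 0.2 and 0.5 h⁻¹ and the 0.35 h⁻¹ acetate threshold are taken from the text.

```python
import math

# Growth rate (1/h) below which no acetate accumulation is expected (value from the text).
ACETATE_THRESHOLD_MU = 0.35


def exponential_feed_rate(t_h, mu, x0, v0, y_xs, s_feed):
    """Feed flow rate F (L/h) at time t_h (h) after feed start.

    Standard substrate-limited fed-batch relation, neglecting maintenance:
        F(t) = mu * X0 * V0 / (Y_X/S * Si) * exp(mu * t)
    x0: cell mass at feed start (g/L), v0: volume at feed start (L),
    y_xs: biomass yield (g cells / g substrate), s_feed: feed concentration (g/L).
    """
    return mu * x0 * v0 / (y_xs * s_feed) * math.exp(mu * t_h)


def acetate_expected(mu):
    """True if the set growth rate is at or above the acetate threshold."""
    return mu >= ACETATE_THRESHOLD_MU


if __name__ == "__main__":
    # Hypothetical cultivation values, for illustration only.
    params = dict(x0=1.0, v0=1.0, y_xs=0.5, s_feed=500.0)
    for mu in (0.2, 0.5):
        f_start = exponential_feed_rate(0.0, mu, **params)
        f_5h = exponential_feed_rate(5.0, mu, **params)
        print(f"mu = {mu} 1/h: F(0) = {f_start:.4f} L/h, F(5 h) = {f_5h:.4f} L/h, "
              f"acetate expected: {acetate_expected(mu)}")
```

With these assumptions the feed rate doubles every ln(2)/µ hours, so the high feed profile reaches a given substrate supply much earlier than the low one.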
Figure 17: The accumulation of acetic acid (HAc, mg/L, versus time from feed start, h) during
fed-batch cultivation with high or low feed. Panel labels: glucose fed-batch 0.5 h⁻¹ and glucose
fed-batch 0.2 h⁻¹. The horizontal arrows at the top indicate the duration of the respective glucose
feed. Vertical lines indicate the addition of maltose. Open symbols: HAc accumulation in the
cytoplasmic production, closed symbols: HAc accumulation in the periplasmic system. Squares: high
feed rate, circles: low feed rate. Note the different Y-axis scales for high and low feed rates.
Since carbon starvation is known to induce the stringent response, which might affect the quality of
the recombinant protein, the stringent response was also studied during the change of substrate
from glucose to maltose. At the high feed rate, ppGpp accumulation increased at the time of maltose
addition in both the periplasmic and the cytoplasmic system. As can be seen in Figure 18, however,
the ppGpp levels were slightly lower when Zb-MalE was transported to the periplasm than when it
remained in the cytoplasm. When Zb-MalE is produced in the cytoplasm at a low feed rate, ppGpp is
produced even before the addition of maltose. This indicates that ppGpp formation is not coupled to
the change in uptake system but rather to something else. In the periplasmic system, the increase in
ppGpp comes after the maltose addition and only reaches about 40 % of the cytoplasmic levels. This
indicates that the transport of Zb-MalE to the periplasm results in a reduced stringent response,
which might also explain the decreased amounts of degradation products seen in the periplasmic
system. Since the stringent response has such a profound effect on the cell, inhibiting not only DNA
initiation but also several components of the translation machinery, it is not surprising that a
decreased stringent response results in higher quality and productivity.
Figure 18: The accumulation of ppGpp (nmol/g DW, versus time from feed start, h) during fed-batch
cultivation with high or low feed. The horizontal arrows at the top indicate the duration of the
respective glucose feed. Vertical lines indicate the addition of maltose. Open symbols: ppGpp
accumulation in the cytoplasmic production, closed symbols: ppGpp accumulation in the periplasmic
system. Squares: high feed rate, circles: low feed rate.
Conclusions
Several stress response systems are present in E.coli and they are activated to enable cell survival
under harsh conditions. One of these is the stringent response, which is activated during carbon and
energy starvation as well as when charged tRNAs are lacking. When the stringent response is induced,
the alarmone ppGpp is produced and effectively stops both transcription and translation (Magnusson
et al. 2005). This is detrimental to protein synthesis, and increased levels of ppGpp should
therefore be avoided. A reduced stringent response has been shown to have a positive effect on
protein production (Dedhia et al. 1996); however, since growth is impaired, a better approach is a
strategy in which the formation of ppGpp is avoided.
Furthermore, acetic acid is another component with a negative effect on protein production (Jensen
and Carlsen 1990). Methods other than fed-batch cultivation for decreasing the accumulation of
acetic acid include vector engineering, where enzymatic genes have been modified or even
incorporated from other bacteria (Bauer et al. 1990; Aristidou et al. 1995). Instead of using vector
engineering, we have shown that the feed rate can be used to decrease the accumulation of acetic
acid for a secreted protein. At the high feed rate, the acetate concentration was reduced 8-fold
when the protein was secreted to the periplasm.
When the recombinant protein is transported to the periplasm, both acetic acid accumulation and the
stringent response are reduced. This can lead to a higher productivity, and a high production rate
can be used. Protein production is thus not limited to low growth rates; high growth rates can be
used as well. It seems as if the cell utilizes the carbon and energy for the transport across the
membrane instead of forming acetate. Stress and degradation of secreted proteins can therefore be
influenced by the synthesis rate of the protein, which is set by the feed rate.
2.3 A cellular based system for substrate uptake rate control (III)
Can the substrate uptake rate and the total productivity be influenced without using the fed-batch
technique and without increasing the cultivation volume?
Strategy
The strategy of this work is based on control of the feed rate, i.e. the glucose uptake rate, to
improve quality and quantity. This can be accomplished by mutations in the cell's glucose uptake
system. As described above, glucose is mainly transported into the cell through the glucose- and
mannose-PTS systems. We have used strains with mutations in the substrate-specific enzyme II in
order to limit the substrate uptake rate. The first strain, PTSMan, has a mutation in the gene manX,
which codes for the enzyme IIABMan. The second strain, PTSGlc, has a mutation in the gene ptsG,
coding for the enzyme IICBGlc, and the third strain, PTSGlcMan, has both mutations. Because of these
mutations, the cells are unable to maintain the same glucose uptake rates as wild-type cells
(Picon et al. 2004).
For this work we needed a model protein that was soluble and not subject to proteolysis. The model
protein chosen was β-galactosidase (β-gal). It is a stable protein with a high solubility even at
high concentrations. β-gal is a tetramer of four identical 116-kDa subunits, each 1023 amino acid
residues long. Another advantage is the easily accessible activity analysis. β-gal hydrolyzes
lactose into galactose and glucose, but synthetic substrates also exist. Ortho-nitrophenyl
β-D-galactopyranoside (ONPG), which turns yellow when it is cleaved by β-gal, is therefore used and
quantified by a photometric absorbance reading.
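As an illustration of how such an absorbance reading can be converted into a volumetric activity, the sketch below applies the Beer-Lambert law to the rate of colour formation. It is not the assay protocol of this work. It assumes that one unit (U) corresponds to 1 µmol ONPG cleaved per minute, that absorbance is read at 420 nm, and that the extinction coefficient of the yellow product is supplied by the user; the value 4.5 mM⁻¹ cm⁻¹ in the example is a placeholder that depends on pH and wavelength. The function name and all example numbers are hypothetical.

```python
def beta_gal_activity_u_per_ml(delta_a420_per_min, assay_volume_ml,
                               sample_volume_ml, epsilon_mM_cm, path_cm=1.0):
    """Volumetric activity (U/ml of sample) from an ONPG absorbance slope.

    Assumes 1 U = 1 umol ONPG cleaved per minute (an assumption, not from the source).
    delta_a420_per_min: absorbance increase per minute
    assay_volume_ml:    total reaction volume in the cuvette or well
    sample_volume_ml:   volume of enzyme sample added to the reaction
    epsilon_mM_cm:      extinction coefficient of the product (1/(mM*cm))
    path_cm:            optical path length (cm)
    """
    if min(assay_volume_ml, sample_volume_ml, epsilon_mM_cm, path_cm) <= 0:
        raise ValueError("volumes, extinction coefficient and path length must be positive")
    # Beer-Lambert: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l).
    # 1 mM per minute equals 1 umol per ml per minute.
    rate_umol_per_ml_min = delta_a420_per_min / (epsilon_mM_cm * path_cm)
    units_in_assay = rate_umol_per_ml_min * assay_volume_ml
    return units_in_assay / sample_volume_ml


if __name__ == "__main__":
    # Hypothetical reading: 0.09 absorbance units per minute, 1.0 ml assay, 0.1 ml sample.
    activity = beta_gal_activity_u_per_ml(0.09, 1.0, 0.1, epsilon_mM_cm=4.5)
    print(f"activity = {activity:.3f} U/ml")
```

Dividing the result by the cell dry weight per ml of sample would give a specific activity in U/mg.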
The expression system was essentially the same as in previous work (I) and consisted of a low
copy number plasmid to avoid the negative effects of a high copy number. The product, β-gal, was
under control of the lacUV5-promoter, which is independent of the glucose concentration in the
medium. In combination with the genetically modified strains this should make it possible to use
the batch technique (i.e. high glucose concentration) but still create varied glucose uptake rates.
Results
The evaluation was done in fermenters to enable a higher degree of process control regarding
stirring, pH and oxygen transfer. The first variable compared was the growth rate. As can be seen
from the dry weight curves in Figure 19, all strains grow exponentially but at different growth
rates. The PTSMan mutant and the wild type grow at approximately the same rate, 0.78 h⁻¹ and
0.72 h⁻¹ respectively, when cultivated in batch at their maximal growth rate. This means that a
fed-batch cultivation of the wild type cannot be used for comparison with PTSMan, since the wild
type already in batch grows at a slower rate than PTSMan. The wild type and PTSMan will thus be
compared with each other in batch processes. PTSGlc and PTSGlcMan grow at slower rates, 0.38 h⁻¹ and
0.25 h⁻¹ respectively, and will be compared to fed-batch cultivations of the wild type at the
respective growth rates. As shown in Figure 19, these two strains are, due to the mutations, unable
to maintain the same rapid growth as the wild type despite the unlimited concentration of glucose
in the medium. By inserting these mutations the first aim has been accomplished, i.e. the cells have
an altered substrate uptake rate, which means that a growth limitation has been imposed at the
cellular level to mimic a fed-batch cultivation. These results have been confirmed by another study
in which the uptake rates of glucose were measured (Picon et al. 2004).
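Growth rates such as those quoted above are commonly obtained as the slope of ln(X) against time during exponential growth. The sketch below shows one way of doing this with an ordinary least-squares fit. It is an illustration only, since the source does not state how the growth rates were calculated; the function names are hypothetical and the data are synthetic, generated from the four growth rates given in the text with an arbitrary starting cell mass.

```python
import math


def specific_growth_rate(times_h, dry_weights):
    """Least-squares slope of ln(X) versus t, i.e. the specific growth rate (1/h)."""
    if len(times_h) != len(dry_weights) or len(times_h) < 2:
        raise ValueError("need at least two paired samples")
    logs = [math.log(x) for x in dry_weights]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("sampling times must not all be identical")
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    return sxy / sxx


def doubling_time(mu):
    """Doubling time (h) for a specific growth rate mu (1/h)."""
    return math.log(2) / mu


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 3.0, 4.0]
    # Synthetic exponential curves; X0 = 0.1 g/l is an arbitrary assumption.
    for name, mu_true in [("wild type", 0.72), ("PTSMan", 0.78),
                          ("PTSGlc", 0.38), ("PTSGlcMan", 0.25)]:
        weights = [0.1 * math.exp(mu_true * t) for t in times]
        mu_fit = specific_growth_rate(times, weights)
        print(f"{name}: mu = {mu_fit:.2f} 1/h, doubling time = {doubling_time(mu_fit):.2f} h")
```

With real dry weight data, only samples from the exponential phase should be included in the fit.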
The next variable examined was the acetic acid concentration, which has been shown in the literature
to be important in protein production since it can decrease the productivity (Jensen and Carlsen
1990). As expected at a high growth rate, both PTSMan and the wild type cultivated in batch produce
acetic acid in proportion to the growth rate (data not shown). The two slower growing mutants,
PTSGlc and PTSGlcMan, on the other hand, did not produce acetic acid. This is in accordance with the
literature, where the same differences in acetate production have been observed between another
strain with mutations in the ptsG-gene and its wild type strain (Chou et al. 1994).
Figure 19: The dry weights (cell dry weight, g/l, versus cultivation time, h) of the mutants
compared to the wild type. Circles: wild type, squares: PTSMan, rhombi: PTSGlc, triangles:
PTSGlcMan. Curve annotations: µ = 0.72, µ = 0.78, µ = 0.38 and µ = 0.25.
The production of β-gal was compared in batch processes, as shown in Figure 20. The fastest
growing mutant, PTSMan, shows both the highest activity and the highest specific production rate
for the duration of the cultivation. Both are higher than for the wild type, which reaches less than
half of the activity and specific productivity of PTSMan at the end of the cultivation. The other
two mutants, PTSGlc and PTSGlcMan, have very low activities in minimal medium, and this applies to
the specific productivities as well. However, if the medium is supplemented with yeast extract, both
activity and specific productivity can be increased. For PTSGlcMan the specific production rate is
initially even higher than for PTSMan in minimal medium. One drawback of the yeast extract
supplement is that it also leads to an unwanted increase in growth rate.
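The specific production rate, qp, relates the rate of product formation to the amount of cells present, qp = (1/X)·dP/dt. The sketch below estimates it by finite differences between consecutive samples. The source does not describe its own calculation, so this is only one plausible approach; the function name and the sample values are hypothetical. With activity in U/ml and dry weight in g/l (numerically equal to mg/ml), the result has the unit U/(mg·h) used for qp in this work.

```python
def specific_production_rates(times_h, activity_u_per_ml, dry_weight_g_per_l):
    """Finite-difference estimate of qp = (1/X) * dP/dt between consecutive samples.

    Returns a list of (interval midpoint in h, qp in U/(mg*h)) tuples.
    Dry weight in g/l is numerically equal to mg/ml, so U/ml divided by
    mg/ml and by hours gives U/(mg*h).
    """
    n = len(times_h)
    if n < 2 or len(activity_u_per_ml) != n or len(dry_weight_g_per_l) != n:
        raise ValueError("need at least two samples with equal-length inputs")
    rates = []
    for i in range(1, n):
        dt = times_h[i] - times_h[i - 1]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        dp = activity_u_per_ml[i] - activity_u_per_ml[i - 1]
        # Mean cell mass over the interval is used as the reference biomass.
        x_mean = 0.5 * (dry_weight_g_per_l[i] + dry_weight_g_per_l[i - 1])
        midpoint = 0.5 * (times_h[i] + times_h[i - 1])
        rates.append((midpoint, dp / (dt * x_mean)))
    return rates


if __name__ == "__main__":
    # Hypothetical samples, for illustration only.
    for t_mid, qp in specific_production_rates([4.0, 6.0, 8.0],
                                               [100.0, 300.0, 500.0],
                                               [1.0, 3.0, 5.0]):
        print(f"t = {t_mid:.1f} h: qp = {qp:.1f} U/(mg*h)")
```

Because the estimate divides by the cell mass, a strain can show a high qp early in the cultivation even when its total activity is still low.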
Figure 20: a) β-galactosidase activity (U/ml) and b) specific production rates, qp (U/mg,h), of
β-galactosidase versus cultivation time (h) at different growth rates. Production is constitutive.
Open symbols: minimal medium, closed symbols: minimal medium supplemented with yeast extract.
Circles: wild type, squares: PTSMan, rhombi: PTSGlc, triangles: PTSGlcMan.
However, when production was induced by IPTG, the PTSGlc mutant grown in minimal medium initially
showed the same production pattern as the wild type, see Figure 21. The wild type maintained
production for six hours before it dropped, whereas the production from PTSGlc started to drop
after approximately three hours. Under these conditions, production from PTSGlc was comparable to
the wild type. The total β-galactosidase accumulation at the end of the cultivation was estimated
at 33 % and 13 % of the total protein for the wild type and PTSGlc, respectively.
Figure 21: Cell dry weights (DW, g/l) and specific production rates, qp (U/mg,h), of
β-galactosidase versus time from induction (h) when production is induced by IPTG. Filled symbols:
cell dry weight, open symbols: specific production rate, circles: AF1000, rhombi: PPA652.
It was also vital to see how high a cell density the mutants could reach compared to the wild type,
since a high cell density also results in a higher total productivity. The mutant PTSGlcMan was
chosen in order to get a large difference in glucose uptake rate. At the end of the cultivation,
the wild type had reached a cell density of 27 g/L dry weight whereas PTSGlcMan reached 34 g/L.
The acetic acid concentration differed greatly between the two cultivations. The production of
acetic acid has a major impact on the growth rate of the wild type. Already at 1.5 g/L, when the
cell density is 9 g/L, the acetic acid starts to inhibit growth, as can be seen in Figure 22.
The growth rate drops gradually from 0.8 h⁻¹ to 0.2 h⁻¹ as the acetic acid concentration increases
to 9 g/L.
The oxygen consumption rates of the wild type and PTSGlcMan were also compared. It is important to
see whether the mutations have resulted in a higher respiration that could have a negative effect
on cell mass accumulation and on the total protein production. The wild type reaches the highest
oxygen consumption rate after 5 hours of cultivation, when the cell dry weight is 5 g/L. For
PTSGlcMan, it takes 23 hours and a cell dry weight of 23 g/L before the same oxygen consumption
rate is reached. This is important in protein production since it allows for a longer protein
production period before oxygen depletion limits the cultivation. To conclude, the cultivation of
the wild type was ended due to growth limitation caused by acetate, whereas that of PTSGlcMan was
ended due to oxygen depletion.
Figure 22: The performance of batch cultivations of a) the wild type and b) PTSGlcMan. Filled
symbols: specific growth rate, open symbols: acetic acid concentration, line: Qo2. Axes: µ (1/h)
and Qo2 (g/g,h), HAc (mg/l), cultivation time (h).
Conclusions
A study in the 1970s showed that strains with mutations in the mannose- and glucose-PTS systems
experienced prolonged doubling times (Curtis and Epstein 1975). Other studies have shown that
such mutations result in less acetate being formed (Chou et al. 1994). Furthermore, the
replacement of the glucose-PTS system by the galactose permease also reduces acetate formation
(De Anda et al. 2006), as does overexpression of the Mlc protein (Cho et al. 2005).
In this work we have shown that strains with mutations in the glucose- and mannose-PTS systems
act like their wild-type strain cultivated in fed-batch. Despite being cultivated in batch, these
strains still have limitations that restrict the glucose uptake rate and thus reduce the growth
rate. Instead of being imposed by the process control equipment, the limitation lies at the
cellular level. Growth rates, acetic acid formation and product formation were comparable between
the mutants and the wild type cultivated in fed-batch. It is thus possible to influence the
substrate uptake rate and the total productivity without using the fed-batch technique and without
increasing the cultivation volume. As a result, the simpler batch technique can be used while a
growth restriction still exists, which may benefit the production of more difficult proteins even
in small-scale reactors such as microtiter plates.
3. CONCLUSIONS
The aim of this work was to investigate and determine central parameters that can be used to
control and increase the solubility, quality and productivity of recombinant proteins, and thereby
to increase the number of available process settings. These central parameters should be applicable
under the constraints of high-throughput production, i.e. the multi-parallel format at restricted
volume.
The central parameter in this work is the feed rate, or the glucose uptake rate. The hypothesis was
that if the synthesis rate could be reduced, solubility, quality and productivity could be
increased. The reduced synthesis rate is accomplished by different feed rates of glucose. These
different feed rates result in different uptake rates of glucose, which in turn affect the
recombinant protein synthesis rate. We have shown that by lowering the feed rate, a higher
solubility and a lower proteolysis can be achieved.
However, it is not possible to use the fed-batch technique to control growth at the small scale of
high-throughput production. Therefore, mutant strains with altered glucose uptake rates were
evaluated against the traditional fed-batch technique. These mutants enable fed-batch-like growth
conditions and bring the benefits of fed-batch even to the small-scale batch cultivations that are
used in high-throughput production.
Furthermore, a strong promoter such as the T7-promoter and a high inducer concentration are
traditionally used in high-throughput production. However, this investigation has shown that by
increasing the concentration of inducer, the ratio of soluble to insoluble protein is shifted
towards more insoluble protein. Therefore, it is recommended to use primarily a weaker promoter,
but also lower inducer concentrations, in order to reduce the metabolic burden on the cell and
thereby increase the solubility of the recombinant protein. In order to control the recombinant
protein production, a lacY strain can be used. This results in a more precise regulation of the
promoter, thus reducing the large variations in production caused by small changes in inducer
concentration.
Finally, if problems with solubility exist, an option is to transport the recombinant protein to the
periplasm. This can benefit the quality of the recombinant protein, since fewer degradation
products are formed and the stringent response, which otherwise can have a negative impact on the
productivity, is reduced.
4. ABBREVIATIONS
ATP           adenosine triphosphate
cAMP          cyclic adenosine monophosphate
CAT           chloramphenicol-acetyl-transferase
CRP           catabolite repressor protein
DNA           deoxyribonucleic acid
DOT           dissolved oxygen tension
DR            downstream region
Dsb           thiol-disulfide oxidoreductase
E.coli        Escherichia coli
F             flow rate (L/h)
GDP           guanosine diphosphate
GFP           green fluorescent protein
GTP           guanosine triphosphate
HSP           heat shock proteins
HTPP          high-throughput protein production
IPTG          isopropyl-β-D-thiogalactopyranoside
kDa           kilodalton
KLa           oxygen transfer coefficient (h⁻¹)
MalE          maltose binding protein (=MBP)
MalE31        maltose binding protein with two altered amino acids
mRNA          messenger RNA
OD            optical density
OMP           outer membrane proteins
ppGpp         guanosine tetraphosphate
pppGpp        guanosine pentaphosphate
PlacUV5       lacUV5 promoter
PmalK         malK promoter
PphoA         phosphatase promoter
Ptac          derivative of lac-promoter
Ptrp          tryptophan promoter
PT7           T7-promoter
PPIase        peptidyl-prolyl cis/trans isomerase
PPK           polyP-kinase
PPX           exopolyphosphatase
PTS           phosphoenolpyruvate:phosphotransferase system
qp            specific productivity
Q             total productivity
RNA           ribonucleic acid
RNAP          RNA polymerase
rRNA          ribosomal RNA
SD            Shine Dalgarno sequence
Si            substrate concentration of feed (g/L)
TF            trigger factor
TIR           translation initiation region
tRNA          transfer RNA
V             cultivation volume (L)
X             cell mass (g/L)
YX/S          yield of biomass over substrate (g cells / g substrate)
Zb            Z-basic
β-gal         β-galactosidase
µ             specific growth rate
σ32 = σH      heat-shock sigma factor
σ24 = σE      extra-cytoplasmic stress sigma factor
5. ACKNOWLEDGEMENTS
I would like to start with acknowledging The Swedish Centre for Bioprocess Technology (CBioPT)
and the Royal Institute of Technology (KTH) for funding this project.
Furthermore, I would like to thank all people that have been involved in the making of this thesis
and first of all my supervisor, professor Gen Larsson for having me as a diploma worker and later
on as a PhD-student. Big thanks for all your help making this thesis what it is today!
Thank you Sophia Hober for spending time during your Christmas vacation to review this thesis. I
greatly appreciated it!
Thank you to Olle, for accepting me as a PhD-student and for having time to answer my questions
when I popped my head into your office. Also to the other seniors at the department, Andres for
never-ending enthusiasm in the lab and around the coffee table, Lena for giving a superb overview
of the stringent response at just about the right time!
Thank you to Erik Holmgren at Biovitrum for taking part in my project and trying to teach me how
to clone. It didn’t always go so well but I assure you that it was my fault and not yours!
To Sofia Borg for contributions to paper III.
To Anna Maria and Maria for coping with me during my diploma work, answering all my questions
and also for the cultivations and analyses that followed both at Biovitrum and at KTH. Looking
back at it, I had a lot of fun although it didn’t always feel like it when I was running western blot
number one thousand…. I learned a lot from you guys!
A huge thank you to Emma, for all the things that we have done together. Both PhD-courses in and
out of Sweden, long cultivations from early morning to late evening, boring analyses and especially
for all the chats and our gossip-moments! It hasn’t been easy every day, but it helped to have you
by my side so thank you for being there!
A big hug to former and present PhD-students for all beer nights, bowling, Bambi on ice (i.e.
curling) and the fun crab- and Christmas parties that we have had and hopefully will continue to
have!
Thank you to Tommy for always fixing problems in the lab, to Ela for all the analyses, to Marita
and Gunilla for fixing all the administrative stuff.
A special thank you to my family for never ending support. You have been there from the very start
always helping me out no matter what and I love you for it!
To my wonderful fiancé Anders for always being there to listen to both complaints and fun stuff
that had happened during the days and for trying to cheer me up when I had a bad day. Also for
talking me into buying a house even if it took three years to get me there. And for putting a ring on
my finger … I love you and I can’t wait for our next adventures!
6. REFERENCES
Alba BM, Gross CA (2004) Regulation of the Escherichia coli σE-dependent envelope stress response. Molecular
Microbiology 52:613-619
Alba BM, Leeds JA, Onufryk C, Lu CZ, Gross CA (2002) DegS and YaeL participate sequentially in the cleavage of
RseA to activate the σE-dependent extracytoplasmic stress response. Genes and Development 16:2156-2168
Allen SP, Polazzi JO, Gierse JK, Easton AM (1992) Two novel heat shock genes encoding proteins produced in
response to heterologous protein expression in Escherichia coli. Journal of Bacteriology 174:6938-6947
Andersson L, Yang S, Neubauer P, Enfors S-O (1996) Impact of plasmid presence and induction on cellular responses
in fed-batch cultures of Escherichia coli. Journal of Biotechnology 46:255-263
Arditti R, Grodzicker T, Beckwith J (1973) Cyclic adenosine monophosphate-independent mutants of the lactose
operon of Escherichia coli. Journal of Bacteriology 114:652-655
Aristidou AA, San K-Y, Bennett GN (1995) Metabolic engineering of Escherichia coli to enhance recombinant protein
production through acetate reduction. Biotechnology Progress 11:475-478
Arsène F, Tomoyasu T, Bukau B (2000) The heat shock response of Escherichia coli. International Journal of Food
Microbiology 55:3-9
Balbás P, Soberón X, Merino E, Zurita M, Lomeli H, Valle F, Flores N, Bolivar F (1986) Plasmid vector pBR322 and
its special-purpose derivatives – a review. Gene 50:3-40
Baneyx F (1999) Recombinant protein expression in Escherichia coli. Current Opinion in Microbiology 10:411-421
Baneyx F (2004) Keeping up with protein folding. Microbial Cell Factories 3
Baneyx F, Mujacic M (2004) Recombinant protein folding and misfolding in Escherichia coli. Nature Biotechnology
22:1399-1408
Baneyx F, Palumbo JL (2003) Improving heterologous protein folding via molecular chaperone and foldase
co-expression. Methods in Molecular Biology 205:171-197
Bauer KA, Ben-Bassat A, Dawson M, de la Puente VT, Neway JO (1990) Improved expression of human Interleukin-2
in high-cell-density fermentor cultures of Escherichia coli K-12 by a phosphotransacetylase mutant. Applied
and Environmental Microbiology 56:1296-1302
Bentley WE, Davis RH, Kompala DS (1991) Dynamics of induced CAT expression in E.coli. Biotechnology and
Bioengineering 38:749-760
Berrow NS, Büssow K, Coutard B, Diprose J, Ekberg M, Folkers GE, Levy N, Lieu V, Owens RJ, Peleg Y, Pinaglia C,
Quevillon-Cheruel S, Salim L, Scheich C, Vincentelli R, Busso D (2006) Recombinant protein expression and
solubility screening in Escherichia coli: a comparative study. Acta Crystallographica section D 62:1218-1226
Betton J-M, Hofnung M (1996) Folding of a mutant maltose-binding protein of Escherichia coli which forms inclusion
bodies. The Journal of Biological Chemistry 271:8046-8052
Betts JI, Baganz F (2006) Miniature bioreactors: current practices and future opportunities. Microbial Cell Factories
5:21
Booth I (1999) Adaptation to extreme environments. Biology of the prokaryotes, Edited by Joseph Lengeler, Gerhart
Drews, Hans G. Schlegel, Blackwell Science: 652-673
Boström M, Larsson G (2002) Introduction of the carbohydrate-activated promoter PmalK for recombinant protein
production. Applied Microbiology and Biotechnology 59:231-238
Boström M, Larsson G (2004) Process design for recombinant protein production based on the promoter, PmalK. Applied
Microbiology and Biotechnology 66:200-208
Cashel M, Gentry DR, Hernandez VJ, Vinella D (1996) The Stringent Response. In Escherichia coli and Salmonella:
Cellular and molecular biology (Neidhardt F.C., ed) ASM Press Vol 1:1458-1496
Chen Y, Song J, Sui S-f, Wang D-N (2003) DnaK and DnaJ facilitated the folding process and reduced inclusion body
formation of magnesium transporter CorA overexpressed in Escherichia coli. Protein Expression and
Purification 32:221-231
Cho S, Shin D, Ji GE, Heu S, Ryu S (2005) High-level recombinant protein production by overexpression of Mlc in
Escherichia coli. Journal of Biotechnology 119:197-203
Chou CH, Bennett GN, San KY (1994) Effect of modified glucose uptake using genetic engineering techniques on
high-level recombinant protein production in Escherichia coli dense cultures. Biotechnology and
Bioengineering 44:952-960
Curtis SJ, Epstein W (1975) Phosphorylation of D-glucose in Escherichia coli mutants defective in
glucosephosphotransferase, mannosephosphotransferase, and glucokinase. Journal of Bacteriology 122:1189-1199
Danese PN, Silhavy TJ (1998) CpxP, a stress-combative member of the Cpx regulon. Journal of Bacteriology 180:831-839
Dartigalongue C, Missiakas D, Raina S (2001) Characterization of the Escherichia coli σE regulon. The Journal of
Biological Chemistry 276:20866-20875
De Anda R, Lara AR, Hernández V, Hernández-Montalvo V, Gosset G, Bolivar F, Ramirez OT (2006) Replacement of
the glucose phosphotransferase transport system by galactose permease reduces acetate accumulation and
improves process performance of Escherichia coli for recombinant protein production without impairment of
growth rate. Metabolic Engineering 8:281-290
De Las Penas A, Connolly L, Gross CA (1997) The σE-mediated response to extracytoplasmic stress in Escherichia coli
is transduced by RseA and RseB, two negative regulators of σE. Molecular Microbiology 24:373-385
Death A, Notley L, Ferenci T (1993) Derepression of LamB protein facilitates outer membrane permeation of
carbohydrates into Escherichia coli under conditions of nutrient stress. Journal of Bacteriology 175:1475-1483
Dedhia N, Richins R, Mesina A, Chen W (1996) Improvement in recombinant protein production in ppGpp-deficient
Escherichia coli. Biotechnology and Bioengineering 53:379-386
Deuerling E, Patzelt H, Vorderwülbecke S, Rauch T, Kramer G, Schaffitzel E, Mogk A, Schulze-Specking A, Langen
H, Bukau B (2003) Trigger Factor and DnaK possess overlapping substrate pools and binding specificities.
Molecular Microbiology 47:1317-1328
Doig SD, Diep A, Baganz F (2005) Characterisation of a novel miniaturized bubble column bioreactor for high
throughput cell cultivation. Biochemical Engineering Journal 23:97-105
Dong J, Iuchi S, Kwan H-S, Lu Z, Lin EC (1993) The deduced amino-acid sequence of the cloned cpxR gene suggests
the protein is the cognate regulator for the membrane sensor, CpxA, in a two-component signal transduction
system of Escherichia coli. Gene 136:227-230
Duetz WA, Rüedi L, Hermann R, O'Connor K, Büchs J, Witholt B (2000) Methods for intense aeration, growth,
storage, and replication of bacterial strains in microtiter plates. Applied and Environmental Microbiology
66:2641-2646
Edelmann P, Gallant J (1977) Mistranslation in E.coli. Cell 10:131-137
Erickson JW, Gross CA (1989) Identification of the σE subunit of Escherichia coli RNA polymerase: a second
alternative σ factor involved in high-temperature gene expression. Genes and Development 3:1462-1471
Fahnert B, Lilie H, Neubauer P (2004) Inclusion bodies: Formation and utilization. Advances in Biochemical
Engineering/Biotechnology 89:93-142
Fuchs C, Köster D, Wiebusch S, Mahr K, Eisbrenner G, Märkl H (2002) Scale-up of dialysis fermentation for high cell
density cultivation of Escherichia coli. Journal of Biotechnology 93:243-251
Gellerfors P, Pavlu B, Axelsson K, Nyhlen C, Johansson S (1990) Separation and identification of growth hormone
variants with high performance liquid chromatography techniques. Acta paediatrica Scandinavica 370:93-100
Georgiou G (1988) Optimizing the production of recombinant proteins in microorganisms. AIChE Journal 34:1233-1248
Georgiou G, Shuler M, Wilson D (1988) Release of periplasmic enzymes and other physiological effects of ß-lactamase
overproduction in Escherichia coli. Biotechnology and Bioengineering 32:741-748
Georgiou G, Valax P (1996) Expression of correctly folded proteins in Escherichia coli. Current Opinion in
Biotechnology 7:190-197
Glick BR (1995) Metabolic load and heterologous gene expression. Biotechnology Advances 13:247-261
Gonzalez de Valdivia EI, Isaksson LA (2004) A codon window in mRNA downstream of the initiation codon where
NGG codons give strongly reduced gene expression in Escherichia coli. Nucleic Acids Research 32:5198-5205
Grabherr R, Nilsson E, Striedner G, Bayer K (2001) Stabilizing plasmid copy number to improve recombinant protein
production. Biotechnology and Bioengineering 77:142-147
Gross CA (1996) Function and regulation of the heat shock proteins. In Escherichia coli and Salmonella: Cellular and
molecular biology (Neidhardt F.C., ed) ASM Press 1:1382-1399
Grossman AD, Erickson JW, Gross CA (1984) The htpR gene product of E.coli is a sigma factor for heat-shock
promoters. Cell 38:383-390
Grossman AD, Taylor WE, Burton ZF, Burgess RR, Gross CA (1985) Stringent response in Escherichia coli induces
expression of heat shock proteins. Journal of Molecular Biology 186:357-365
Gräslund T, Lundin G, Uhlén M, Nygren P-Å, Hober S (2000) Charge engineering of a protein domain to allow
efficient ion-exchange recovery. Protein Engineering 13:703-709
Gualerzi CO, Pon CL (1990) Initiation of mRNA translation in prokaryotes. Biochemistry 29:5881-5889
Hammarström M, Hellgren N, van den Berg S, Berglund H, Härd T (2002) Rapid screening for improved solubility of
small human proteins produced as fusion proteins in Escherichia coli. Protein Science 11:313-321
Hedrén M, Ballagi A, Mörtsell L, Rajkai G, Stenmark P, Sturesson C, Nordlund P (2006) GRETA, a new
multifermenter system for structural genomics and process optimization. Acta Crystallographica Section D
62:1227-1231
Hellebust H, Murby M, Abrahmsén L, Uhlén M, Enfors S-O (1989) Different approaches to stabilize a recombinant
fusion protein. BIO/Technology 7:165-168
Hermann R, Lehmann M, Büchs J (2002) Characterization of gas-liquid mass transfer phenomena in micro-titer plates.
Biotechnology and Bioengineering 81:178-186
Horn U, Krug M, Sawistowski J (1990) Effect of high cell density cultivation on plasmid copy number in recombinant
Escherichia coli cells. Biotechnology Letters 12:191-196
Jensen EB, Carlsen S (1990) Production of recombinant human growth hormone in Escherichia coli: expression of
different precursors and physiological effects of glucose, acetate and salts. Biotechnology and Bioengineering
36:1-11
Jin H, Zhao Q, Gonzalez de Valdivia EI, Arcell DH, Stenström M, Isaksson LA (2006) Influences on gene expression
in vivo by a Shine Dalgarno sequence. Molecular Microbiology 60:480-492
John GT, Goelling D, Kliman I, Schneider I, Heinzle E (2003a) pH-sensing 96-well microtitre plates for the
characterization of acid production by dairy starter cultures. Journal of Dairy Research 70:327-333
John GT, Kliman I, Wittmann C, Heinzle E (2003b) Integrated optical sensing of dissolved oxygen in microtitre plates: a
novel tool for microbial cultivation. Biotechnology and Bioengineering 81:829-836
Jones CH, Danese PN, Pinkner JS, Silhavy TJ, Hultgren SJ (1997) The chaperone-assisted membrane release and
folding pathway is sensed by two signal transduction systems. The EMBO Journal 16:6394-6406
Jones KL, Kim S-W, Keasling JD (2000) Low-copy plasmids can perform as well as or better than high-copy plasmids
for metabolic engineering of bacteria. Metabolic Engineering 2:328-338
Jones SA, Melling J (1984) Persistence of pBR322-related plasmids in Escherichia coli grown in chemostat culture.
FEMS Microbiology Letters 22:239-243
Kanehara K, Ito K, Akiyama Y (2002) YaeL (EcfE) activates the σE pathway of stress response through a site-2
cleavage of anti-σE, RseA. Genes and Development 16:2147-2155
Kapust RB, Waugh DS (1999) Escherichia coli maltose-binding protein is uncommonly effective at promoting the
solubility of polypeptides to which it is fused. Protein Science 8:1668-1674
Kimata K, Inada T, Tagami H, Aiba H (1998) A global repressor (Mlc) is involved in glucose induction of the ptsG
gene encoding major glucose transporter in Escherichia coli. Molecular Microbiology 29:1509-1519
Kimata K, Takahashi H, Inada T, Postma P, Aiba H (1997) cAMP receptor protein-cAMP plays a crucial role in
glucose-lactose diauxie by activating the major glucose transporter gene in Escherichia coli. Proceedings of
the National Academy of Sciences USA 94:12914-12919
Kosinski MJ, Rinas U, Bailey JE (1992) Isopropyl-β-D-thiogalacto-pyranoside influences the metabolism of
Escherichia coli. Applied Microbiology and Biotechnology 36:782-784
Kumar S, Wittmann C, Heinzle E (2004) Minibioreactors. Biotechnology Letters 26:1-10
Kuroda A (2006) A polyphosphate-Lon protease complex in the adaptation of Escherichia coli to amino acid starvation.
Bioscience, Biotechnology and Biochemistry 70:325-331
Langer T, Lu C, Echols H, Flanagan J, Hayer MK, Hartl FU (1992) Successive action of DnaK, DnaJ and GroEL along
the pathway of chaperone-mediated protein folding. Nature 356:683-689
Laskowska E, Kuczynska-Wisnik D, Bak M, Lipinska B (2003) Trimethoprim induces heat shock proteins and protein
aggregation in E.coli cells. Current Microbiology 47:286-289
Lee SY (1996) High-cell density culture of Escherichia coli. Trends in Biotechnology 14:98-105
Lengeler JW, Postma PW (1999) Global regulatory networks and signal transduction pathways. Biology of the
prokaryotes, Edited by Joseph Lengeler, Gerhart Drews, Hans G. Schlegel, Blackwell Science: 491-523
Lilie H, Schwarz E, Rudolph R (1998) Advances in refolding of proteins produced in E.coli. Current Opinion in
Biotechnology 9:497-501
Lipinska B, Fayet O, Baird L, Georgopoulos C (1989) Identification, characterization, and mapping of the Escherichia
coli htrA gene, whose product is essential for bacterial growth only at elevated temperatures. Journal of
Bacteriology 171:1574-1584
Magnusson LU, Farewell A, Nyström T (2005) ppGpp: a global regulator in Escherichia coli. Trends in
Microbiology 13:236-242
Mar Carrio M, Villaverde A (2001) Protein aggregation as bacterial inclusion bodies is reversible. FEBS Letters
489:29-33
Mecsas J, Rouviere PE, Erickson JW, Donohue TJ, Gross CA (1993) The activity of σE, an Escherichia coli
heat-inducible σ-factor, is modulated by expression of outer membrane proteins. Genes and Development
7:2618-2628
Mere L, Bennett T, Coassin P, England P, Hamman B, Rink T, Zimmerman S, Negulescu P (1999) Miniaturized FRET
assays and microfluidics: key components for ultra-high-throughput screening. Drug Discovery Today 4:363-369
Meyer H-P, Leist C, Fiechter A (1984) Acetate formation in continuous culture of Escherichia coli K12 D1 on defined
and complex media. Journal of Biotechnology 1:355-358
Mileykovskaya E, Dowhan W (1996) The Cpx two-component signal transduction pathway is activated in Escherichia
coli mutant strains lacking phosphatidylethanolamine. Journal of Bacteriology 179:1029-1034
Miroux B, Walker JE (1996) Over-production of proteins in Escherichia coli: mutant hosts that allow synthesis of some
membrane proteins and globular proteins at high levels. Journal of Molecular Biology 260:289-298
Missiakas D, Mayer MP, Lemaire M, Georgopoulos C, Raina S (1997) Modulation of the Escherichia coli σE (RpoE)
heat-shock transcription-factor activity by the RseA, RseB and RseC proteins. Molecular Microbiology
24:355-371
Moat AG, Foster JW, Spector MP (2002) Microbial Physiology. Wiley-Liss: 162
Mosavi LK, Peng Z (2003) Structure-based substitutions for increased solubility of a designed protein. Protein
Engineering 16:739-745
Murby M, Samuelsson E, Nguyen TN, Mignard L, Power U, Binz H, Uhlén M, Ståhl S (1995) Hydrophobicity
engineering to increase solubility and stability of a recombinant protein from respiratory syncytial virus.
European Journal of Biochemistry 230:38-44
Murby M, Uhlén M, Ståhl S (1996) Upstream strategies to minimize proteolytic degradation upon recombinant
production in Escherichia coli. Protein Expression and Purification 7:129-136
Märkl H, Zenneck C, Dubach ACH, Ogbonna JC (1993) Cultivation of Escherichia coli cells to high cell densities in a
dialysis reactor. Applied Microbiology and Biotechnology 39:48-52
Nakano K, Rischke M, Sato S, Märkl H (1997) Influence of acetic acid on the growth of Escherichia coli K12 during
high-cell-density cultivation in a dialysis reactor. Applied Microbiology and Biotechnology 48:597-601
Neidhardt FC, Ingraham JL, Schaechter M (1990) Physiology of the bacterial cell. Sinauer Associates, Inc: 361-367
Nikaido H, Vaara M (1985) Molecular basis of bacterial outer membrane permeability. Microbiological Reviews 49:1-32
Parker J (1989) Errors and alternatives in reading the universal code. Microbiological Reviews 53:273-298
Parker J, Pollard JW, Friesen JD, Stanners CP (1978) Stuttering: High-level mistranslation in animal and bacterial cells.
Proceedings of the National Academy of Sciences USA 75:1091-1095
Picon A, Teixeira de Mattos MJ, Postma PW (2004) Reducing the glucose uptake rate in Escherichia coli affects
growth rate but not protein production. Biotechnology and Bioengineering 90:191-200
Plumbridge J (1998) Control of the expression of the manXYZ operon in Escherichia coli: Mlc is a negative regulator of
the mannose PTS. Molecular Microbiology 27:369-380
Plumbridge J (1999) Expression of the phosphotransferase system both mediates and is mediated by Mlc regulation in
Escherichia coli. Molecular Microbiology 33:260-273
Plumbridge J (2002) Regulation of gene expression in the PTS in Escherichia coli: the role and interactions of Mlc.
Current Opinion in Microbiology 5:187-193
Puskeiler R, Kaufmann K, Weuster-Botz D (2005) Development, parallelization, and automation of a gas-inducing
milliliter-scale bioreactor for high-throughput bioprocess design (HTBD). Biotechnology and Bioengineering
89:512-523
Raivio TL, Popkin DL, Silhavy TJ (1999) The Cpx envelope stress response is controlled by amplification and
feedback inhibition. Journal of Bacteriology 181:5263-5272
Raivio TL, Silhavy TJ (2001) Periplasmic stress and ECF sigma factors. Annual Review of Microbiology 55:591-624
Riesenberg D (1991) High-cell density cultivation of Escherichia coli. Current Opinion in Biotechnology 2:380-384
Rose RE (1988) The nucleotide sequence of pACYC184. Nucleic Acids Research 16:355
Rouvière PE, De Las Penas A, Mecsas J, Lu CZ, Rudd KE, Gross CA (1995) rpoE, the gene encoding the second
heat-shock sigma factor, σE, in Escherichia coli. The EMBO Journal 14:1032-1042
Ruhdal Jensen P, Hammer K (1997) Artificial promoters for metabolic optimization. Biotechnology and
Bioengineering 58:191-195
Sandén AM, Prytz I, Tubulekas I, Förberg C, Le H, Hektor A, Neubauer P, Pragai Z, Harwood C, Ward A, Picon A,
Teixeira de Mattos J, Postma P, Farewell A, Nyström T, Reeh S, Pedersen S, Larsson G (2002) Limiting
factors in Escherichia coli fed-batch production of recombinant proteins. Biotechnology and Bioengineering
81:158-166
Schein CH (1990) Solubility as a function of protein structure and solvent components. Biotechnology 8:308-317
Schmidt FR (2004) Recombinant expression systems in the pharmaceutical industry. Applied Microbiology and
Biotechnology 65:363-372
Shibui T, Nagahari K (1992) Secretion of a functional Fab fragment in Escherichia coli and the influence of culture
conditions. Applied Microbiology and Biotechnology 37:352-357
Snyder WB, Davis LJB, Danese PN, Cosma CL, Silhavy TJ (1995) Overproduction of NlpE, a new outer membrane
lipoprotein, suppresses the toxicity of periplasmic LacZ by activation of the Cpx signal transduction pathway.
Journal of Bacteriology 177:4216-4223
Sorensen HP, Mortensen KK (2005) Soluble expression of recombinant proteins in the cytoplasm of Escherichia coli.
Microbial Cell Factories 4:1
Stenström CM, Holmgren E, Isaksson LA (2001a) Cooperative effects by the initiation codon and its flanking regions
on translation initiation. Gene 273:259-265
Stenström CM, Jin H, Major LL, Tate WP, Isaksson LA (2001b) Codon bias at the 3’-side of the initiation codon is
correlated with translation initiation efficiency in Escherichia coli. Gene 263:273-284
Straus DB, Walter WA, Gross CA (1987) The heat shock response of E.coli is regulated by changes in the
concentration of σ32. Nature 329:348-351
Terpe K (2003) Overview of tag protein fusions: from molecular and biochemical fundamentals to commercial systems.
Applied Microbiology and Biotechnology 60:523-533
Thomas JG, Ayling A, Baneyx F (1997) Molecular chaperones, folding catalysts, and the recovery of active
recombinant proteins from E.coli. Applied Biochemistry and Biotechnology 66:197-238
Thomas JG, Baneyx F (1997) Divergent effects of chaperone overexpression and ethanol supplementation on inclusion
body formation in recombinant Escherichia coli. Protein Expression and Purification 11:289-296
Tsai LB, Lu HS, Kenny WC, Curless CC, Klein ML, Lai P-H, Fenton DM, Altrock BW, Mann MB (1988) Control of
misincorporation of de novo synthesized norleucine into recombinant Interleukin-2 in E.coli. Biochemical and
Biophysical Research Communications 156:733-739
Uhlén M, Forsberg G, Moks T, Hartmanis M, Nilsson B (1992) Fusion proteins in biotechnology. Current Opinion in
Biotechnology 3:363-369
Waldo GS, Standish BM, Berendzen J, Terwilliger TC (1999) Rapid protein-folding assay using green fluorescent
protein. Nature Biotechnology 17:691-695
Wang Q, Kaguni JM (1989) A novel sigma factor is involved in expression of the rpoH gene of Escherichia coli.
Journal of Bacteriology 171:4248-4253
Weber RF, Silverman PM (1988) The Cpx proteins of Escherichia coli K12: structure of the CpxA polypeptide as an
inner membrane component. Journal of Molecular Biology 203:467-478
Veinger L, Diamant S, Buchner J, Goloubinoff P (1998) The small heat-shock protein IbpB from Escherichia coli
stabilizes stress-denatured proteins for subsequent refolding by a multichaperone network. The Journal of
Biological Chemistry 273:11032-11037
Wendrich TM, Blaha G, Wilson DN, Marahiel MA, Nierhaus KH (2002) Dissection of the mechanism for the stringent
factor RelA. Molecular Cell 10:779-788
Weuster-Botz D, Puskeiler R, Kusterer A, Kaufmann K, John GT, Arnold M (2005) Methods and milliliter scale
devices for high-throughput bioprocess design. Bioprocess and Biosystems Engineering 28:109-119
Wickner S, Maurizi MR, Gottesman S (1999) Posttranslational quality control: folding, refolding and degrading
proteins. Science 286:1888-1893
Woestenenk EA, Hammarström M, van den Berg S, Härd T, Berglund H (2004) His tag effect on solubility of human
proteins produced in Escherichia coli: a comparison between four expression vectors. Journal of Structural and
Functional Genomics 5:217-229
Yanisch-Perron C, Vieira J, Messing J (1985) Improved M13 phage cloning vectors and host strains: nucleotide
sequences of the M13mp18 and pUC19 vectors. Gene 33:103-119
Yansura DG, Henner DJ (1990) Use of Escherichia coli trp promoter for direct expression of proteins. Methods in
Enzymology 185:54-60
Zhang Z, Li Z-H, Wang F, Fang M, Yin C-C, Zhou Z-Y, Lin Q, Huang H-L (2002) Overexpression of DsbC and DsbG
markedly improves soluble and functional expression of single-chain Fv antibodies in Escherichia coli.
Protein Expression and Purification 26:218-228