University of Groningen
Library design and screening strategies for efficient enzyme evolution
van Leeuwen, Johannes Gustaaf Ernst
Document Version
Publisher's PDF, also known as Version of record
Publication date:
2015
Link to publication in University of Groningen/UMCG research database
Citation for published version (APA):
van Leeuwen, J. G. E. (2015). Library design and screening strategies for efficient enzyme evolution
[Groningen]: University of Groningen
Chapter 1
General introduction and outline of the thesis
Figure 1. Classical propene to epichlorohydrin process. The first step involves the
allylic chlorination of propene. 3-Chloropropene is then reacted with hypochlorous
acid, prepared by dissolving chlorine gas in water, yielding a 3:1 mixture of 1,3-dichloropropan-2-ol and 2,3-dichloropropan-1-ol. This mixture is reacted with an
alkaline solution to yield racemic epichlorohydrin. The chlorine atom-efficiency of
this process is only 25% and significant quantities of halogenated side products like
1,2,3-trichloropropane are formed.
Figure 2. Newly developed glycerol to epichlorohydrin process. The initial
hydrochlorination of glycerol with hydrogen chloride is mediated by an organic acid
catalyst (e.g. acetic acid) under mild reaction conditions to give a 30-50:1 mixture of
1,3-dichloropropan-2-ol and 2,3-dichloropropan-1-ol. This intermediate is converted
to racemic epichlorohydrin under alkaline conditions. The glycerol to epichlorohydrin
process produces only one equivalent of waste chloride and virtually no organic
side-products are formed.
[Figure 3 panels: a. Classical: propene to epichlorohydrin; b. New: glycerol to epichlorohydrin. Both panels plot Gibbs free energy against reaction progress from substrate (S) to product (P), with the transition state (‡), the activation energy (ΔG‡) and the overall reaction energy (ΔG) indicated; panel b shows the transition states of the non-catalyzed reaction (‡) and the catalyzed reaction (‡*).]
Figure 3. Comparison of the thermodynamics of the propene to epichlorohydrin and
the glycerol to epichlorohydrin conversion processes. Panel a: the precursors in the
classical propene to epichlorohydrin route are chemically highly reactive, resulting in
a low activation energy (ΔG‡). The overall reaction free energy (ΔG) is strongly negative,
which indicates a highly exergonic reaction. Panel b: the activation energy of the
non-catalyzed glycerol to epichlorohydrin reaction is very high (solid line); the
reaction hardly proceeds under ambient conditions. The presence of an organic acid
catalyst such as acetic acid lowers the rate-limiting free energy of activation (dotted
line) and facilitates high reaction rates under mild conditions.
Introduction
Green chemistry & catalysis for sustainable organic synthesis
The negative effects of growing industrial production and increasing use of
natural resources urge the development of green production processes that
are based on renewable starting materials.[1] Especially the chemical industry
is challenged to produce in a more sustainable manner and to minimize
environmental impact, for example by waste prevention and by reducing
energy use. Traditional chemical processes have already become less
polluting since the 1980s due to measures against the emission of hazardous
compounds, and nowadays cleaner alternative production routes are also
being implemented. A good example of the introduction of cleaner
technology is a newly developed route towards racemic bulk
epichlorohydrin, which is an important intermediate for epoxy resins, paints,
paper products and pharmaceuticals (production approximately 1 Mton/year). The
traditional process runs under harsh conditions and utilizes propene, chlorine
gas and hypochlorous acid as precursors (Figure 1). Propene is commonly
derived from fossil sources and, besides stoichiometric amounts of solid
waste, highly toxic and persistent side products such as 1,2,3-trichloropropane (TCP) and chlorinated ethers are released (chapters 2 and
3).[2] The newly developed process utilizes renewable glycerol and
hydrochloric acid as precursors and virtually no organohalogen side
products are formed (Figure 2).[3] Glycerol and hydrochloric acid are
not chemically reactive under ambient conditions. To facilitate high reaction
rates under mild conditions an organic acid is applied as a catalyst. The
function of a catalyst in organic synthesis is to lower the rate-limiting free
energy of activation. This promotes the formation of the transition-state
complex for the desired reaction product and reduces the necessity to use a
high temperature (Figure 3). Milder reaction conditions translate to a
reduced energy input and less formation of undesired side-products. The
overall reaction energy is not affected by the presence of a catalyst and the
catalyst itself is not consumed in the reaction.
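The effect of a lower activation barrier can be made concrete with transition-state theory, in which the rate constant scales with exp(−ΔG‡/RT). The sketch below is only an illustration: the function name and the barrier heights are hypothetical values chosen for the example, not data for the processes discussed here, and it assumes that the catalyzed and non-catalyzed reactions share the same pre-exponential factor.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def rate_enhancement(dg_uncat_kj, dg_cat_kj, temperature_k=298.15):
    """Ratio k_catalyzed / k_uncatalyzed from transition-state theory.

    Assumes an identical pre-exponential factor for both reactions, so the
    ratio depends only on the difference in activation free energy.
    """
    ddg_j = (dg_uncat_kj - dg_cat_kj) * 1000.0  # kJ/mol -> J/mol
    return math.exp(ddg_j / (R * temperature_k))


# Hypothetical barriers: 100 kJ/mol without and 80 kJ/mol with a catalyst.
print(f"{rate_enhancement(100.0, 80.0):.0f}-fold faster at 25 °C")
```

Under these assumptions a reduction of the barrier by 20 kJ/mol already corresponds to a rate increase of more than three orders of magnitude at room temperature, which is why a catalyst can replace a high reaction temperature.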
Besides large-scale processes for commodity chemicals, production
processes for fine chemicals are also notorious for the large amount of waste
they produce.[4] This is because of the more complex nature of fine chemical
synthesis and the requirement of specialized catalysts that make (enantio-)
selective conversions possible. The high time-to-market pressure is often not
compatible with the time-consuming development of dedicated catalysts, so
stoichiometric activation reactions are often used. However, the growing
availability of highly selective organometallic catalysts and biocatalysts over
the last decades is a major factor that counters this trend.[5-7] Besides
enhancing sustainability, the continuous expansion of the catalytic toolbox
greatly extends the possibilities of synthetic organic chemistry. In this
thesis I will focus on the development of enzymes as biocatalysts for
selective chemical conversions.
Figure 4. Examples of early biocatalytic processes. a. Glucose isomerase catalyzed
conversion of D-glucose to D-fructose (both displayed in the most abundant β-pyranose form, above in open Fischer projection). b. Amino acid acylase catalyzed
kinetic resolution of racemic N-acetyl methionine. The enzyme displays a high
enantiopreference towards the L-form of the substrate; after ~50% conversion L-methionine is the predominant product and D-N-acetyl methionine remains virtually
untouched by the enzyme. c. Cytochrome P450 catalyzed stereo- and regioselective hydroxylation at the 11 position of progesterone. One oxygen atom that is
derived from atmospheric oxygen is incorporated in the product and the second
oxygen atom is reduced to water by electrons from β-nicotinamide adenine
dinucleotide 2′-phosphate (NADPH). The 11α-hydroxyprogesterone product can be
further converted to cortisol (gray).
Enzymes in organic synthesis
The use of enzymes in organic synthesis started more than a century ago. At
that time scientists discovered the possibility of using living cells or extracts
thereof as catalysts for the production of useful (chiral) fine chemicals.
Already in 1856 Louis Pasteur noticed that living cells of the Lactobacillus
genus were responsible for the conversion of glucose into lactic acid, at that
time an unexplained problem in wine making.[8] Twenty-five years later, L-lactic acid was the first natural chiral compound that was produced by
fermentation on an industrial scale.[9] Also the use of isolated enzymes as
catalysts for specific biotransformations was already recognized in the
nineteenth century. In 1833 the French bio-pioneer Anselme Payen
demonstrated the hydrolysis of starch into fermentable sugars by using an
enzyme preparation from malted grains.[10] The first industrial application of
isolated enzymes was more than one hundred years later; during World War
II immobilized invertase was applied at ambient temperature and pH for the
production of invert sugar from sucrose.[11] Sulfuric acid, the preferred
catalyst at that time, was not available due to war activities. Other early
examples of industrial biotransformations are the xylose isomerase catalyzed
conversion of D-glucose into the sweeter tasting sugar D-fructose,[12] the
amino acid acylase catalyzed kinetic resolution of various proteinogenic
amino acids[13] and the cytochrome P450 (whole cells) catalyzed
hydroxylation of progesterone as first step in the production of
corticosteroid hormones (Figure 4).[14]
In these settings enzymes proved to be useful catalysts, and high enantio- and regioselectivity was often obtained. For example, the biocatalytic 11α-hydroxylation of progesterone strongly simplified the original Merck
process for cortisone-acetate, which involved 31 chemical steps. After the
biocatalytic step was implemented the product yield increased and its price
dropped from 200 to 6 dollars per gram.[14] On the other hand the use of
enzymes in chemical processes also has serious limitations. This is not
surprising since enzymes are adjusted, through millions of years of
evolution, to their physiological role in an aqueous environment, which is
often very different from what organic chemists require. Problems with
separating the product from the biocatalyst itself could be largely overcome
by the application of carrier-bound enzymes.[15] This also enabled the reuse
of the enzyme but other limitations such as a narrow substrate scope, poor
catalytic efficiency, low or undesired regio- and stereo-selectivity, poor
operational stability and (product) inhibition could not be easily solved. Also
the limited availability of stable enzymes restricted their use in organic
synthesis.
Tailoring enzymes for chemical processes
A number of scientific breakthroughs, starting with the discovery of the
DNA double helix structure by Watson and Crick in 1953, initiated a new
age of biocatalysis.[16] In the decades after this major discovery genetic
engineering tools were developed that enabled the over-expression of a gene
from a donor organism in a suitable production host such as E. coli. In 1982
human insulin became the first protein drug to be produced recombinantly
in E. coli (Genentech, licensed to Eli Lilly and Company).[17] An early
example of an industrial recombinant enzyme is chymosin from calf, which
was marketed in 1988 by Gist-Brocades (now DSM Food Specialties).[18]
Today fermentation-produced chymosin (FPC) is used for over 80% of the
global cheese production. Examples of other important developments for
biocatalysis are the elucidation of the first protein crystal structure in
1958,[19] the invention of the polymerase chain reaction (PCR) in 1983,[20]
and the advances in DNA sequencing and synthesis technology.[21]
Enzyme optimization by rational design
The greatly improved accessibility of natural enzymes from various species
through recombinant protein production further promoted the discovery and
use of enzymes as catalysts in organic synthesis.[22-25] In an attempt to
overcome some of the limitations of natural enzymes, such as poor
stability or a narrow substrate scope, protein engineering technologies were
developed. In the 1980s and 1990s this was primarily done in a rational way
and guided by a crystal structure of the target enzyme. Rationally designed
protein variants were created by site-directed mutagenesis where specific
amino acid substitutions are obtained via targeted mutations in the coding
DNA. For example, in 1989 Matsumura et al. reported the successful
stabilization of phage T4 lysozyme by engineered disulfide bonds.[26,27] They
introduced pairs of cysteine amino acid residues on the surface of the protein
thereby creating one, two or three disulfide bonds which stabilize the native
globular protein structure. The melting temperature of their best mutant
protein turned out to be 23.4°C higher than that of the wild-type enzyme, which has no
disulfide bonds (Figure 5).
Figure 5. Rational design of a phage T4 lysozyme mutant with higher melting
temperature. Structural model based on pdb 1L35. The polypeptide chain of the 164
amino acids long enzyme is displayed from blue (N-terminus) to red (C-terminus),
sulfur atoms of engineered disulfide bonds are shown as yellow spheres. The three
engineered disulfide bonds hamper the thermal unfolding of the native protein.
In many more cases, rational protein design was successfully applied to
enhance enzyme stability or catalytic properties.[28] Nonetheless, this
approach appeared not to be a robust answer to most engineering challenges.
Limitations were encountered, for example, when biocatalysts had to be
developed for the synthesis of non-natural pharma intermediates or in cases
the enantioselectivity towards a target substrate had to be inverted.[29]
Enzyme variants with a desired specificity could not be well predicted in
most cases. Another major limitation of rational protein design is that it
requires a structure of the template enzyme with atomic resolution. This was
especially a problem in the early days of enzyme engineering when the
number of available protein crystal structures was still very small.
Enzyme optimization by laboratory evolution
In a search to circumvent the limitations of rational enzyme design,
researchers started in the second half of the 1990s to explore
molecular biology methods that mimic Darwinian evolution.[30,31] This
eventually resulted in a collection of methods that has been termed
“directed evolution”.[32,33] Contrary to enzyme optimization by rational
design, directed evolution uses a random approach that is based on iterative
cycles of mutagenesis, starting with a target gene or a set of related genes,
followed by the selection or screening of the resulting protein library for
variants with improved target features. The genes of the best hits are used as
template for the next round of mutagenesis and screening. Mutations can, for
example, be introduced by using a random process such as gene
amplification under error-prone conditions (epPCR). This mutagenesis
approach was extensively explored by Frances Arnold and colleagues in the
1990s.[34] In 1994 Pim Stemmer invented a very different method for
creating genetic diversity which is called “gene shuffling”.[35] In this method
a set of homologous parent genes is fragmented into smaller DNA pieces,
which are randomly recombined to form a library of full-length hybrid
genes, which also carry additional random mutations. The general strategy
for obtaining enzymes with novel properties by directed evolution is
outlined in Figure 6. With this laboratory evolution approach it is possible to
discover enzyme variants with unexpected beneficial amino acid
substitutions through the entire protein sequence without requiring
knowledge of the structure or catalytic mechanism. Whilst the evolution of
an enzyme can take millions of years in the context of a living organism this
can now be reduced to several months or even weeks, making directed
evolution a very powerful approach for the development of biocatalysts with
novel properties.
[Figure 6 scheme: template gene → gene collection → protein library → screening (e.g. in MTP) or selection → new round of mutagenesis and screening with gene of hit.]
Figure 6. Enzyme optimization via directed evolution. The gene of a promising
starting protein is used as template for the creation of a large diversity of gene
variants with (random) mutations in the nucleotide sequence. The resulting proteins
diverge in amino acid sequence and are subjected to a screening, e.g. in microtiter
plate (MTP) format, or a selection assay to identify variants that display enhanced
function. The genes of the best hits can be used as
template for another round of mutagenesis and screening, which can be repeated
until a protein variant is obtained that is fit for the application.
The chance of finding good hits in a directed evolution project is largely
determined by the strategy that is chosen and the design and quality of the
gene libraries that are used. For example, gene libraries that are created with
epPCR can reveal beneficial amino acid substitutions and so-called “hot-spot” positions throughout the entire protein sequence without the use of
structural knowledge.[36,37] A disadvantage of this method is that many
potentially beneficial mutations will be missed due to intrinsic limitations of
the method such as the mutational bias of the employed DNA-polymerase
and the organization of the genetic code.[38] Also synergetic combinations of
mutations are not efficiently explored with epPCR libraries because the
chance that two or more specific amino acid substitutions occur in the same
mutant enzyme is very small.[39]
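The restriction imposed by the organization of the genetic code can be illustrated by counting which amino acids are reachable from a given codon through a single nucleotide substitution, the typical outcome of epPCR at low mutation rates. The following is a minimal sketch that uses the standard genetic code; the function name is chosen for illustration, and polymerase bias is deliberately ignored.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reachable_by_point_mutation(codon):
    """Amino acids reachable from `codon` by one nucleotide substitution."""
    original = CODON_TABLE[codon]
    reachable = set()
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            aa = CODON_TABLE[mutant]
            if aa not in ("*", original):  # skip stop codons and silent changes
                reachable.add(aa)
    return reachable


for codon in ("ATG", "TGG", "GCT"):
    aas = reachable_by_point_mutation(codon)
    print(codon, CODON_TABLE[codon], len(aas), "".join(sorted(aas)))
```

For each of these example codons only five or six of the nineteen possible substitutions are accessible, which is one reason why many potentially beneficial mutations are missed in epPCR libraries.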
Site-saturation mutagenesis (SSM) is an oligonucleotide assisted PCR
technique that allows exploring all amino acid substitutions at predetermined
sites in the protein of interest.[40] Selection of positions for targeting with
SSM can for example be inspired by structural data or by screening and
sequencing results of an earlier round. Directed evolution involving both
epPCR and SSM has been used by DSM to engineer a biocatalyst that can
catalyze the selective hydroxylation of compactin as last step in the
production of pravastatin, a cholesterol-lowering drug.[41,42] A wild-type
cytochrome P450 enzyme from Amycolatopsis orientalis was used as
starting point. This enzyme already has the desired regioselectivity; it
catalyzes the hydroxylation of compactin at the 6-position. However, it
produces the “wrong” epimer of the product (epi-pravastatin), which lacks
the desired biological function (Figure 7, left reaction). The aim of the
directed evolution study was to invert the stereopreference of the template
enzyme. In the first round an epPCR library was created and thoroughly
evaluated (no structure or literature data were available for the template
enzyme). The DNA sequences of the best hits revealed several hot-spot
positions. Amino acid substitutions at some of these positions improved the
stereoselectivity, whereas mutations at other positions enhanced the catalytic
efficiency of the enzyme towards the target substrate. The most promising
positions were further explored with combinatorial SSM to identify
potentially beneficial combinations of mutations. Comprehensive screening
of the second generation SSM library revealed a very active triple mutant
enzyme that is capable of producing the desired form of pravastatin with
an enantiomeric excess of > 95% (Figure 7, right reaction).
Figure 7. Cytochrome P450 from Amycolatopsis orientalis as a catalyst in the
hydroxylation of compactin in the last step of the synthesis of the cholesterol-lowering drug pravastatin. The wild-type enzyme produces the α-variant of
pravastatin, which lacks the desired biological function. A triple mutant enzyme that
was identified in a directed evolution screening campaign produces the desired
epimer of pravastatin with high enantiomeric excess.
Workflows for efficient laboratory enzyme evolution
In the 1990s directed evolution was successfully applied to acquire
improved enzymes for diverse biotechnological applications but the overall
process remained costly and time-consuming.[43,44] Later research was
focused on improving the efficiency of directed evolution. Important
progress was made through the development of better high-throughput
screening assays or with advanced molecular tools for creating genetic
diversity.[44-47] Here I will further elaborate on methods and considerations
for the more effective probing and sampling of sequence diversity.
Considerations for optimal library design and sampling
In 2005 Reetz and colleagues established a directed evolution
workflow for the development of enzymes with novel catalytic properties
that is called “combinatorial active-site saturation test” (CAST).[48] This
method uses structural data to select amino acid positions around the
substrate binding pocket of an enzyme since mutations at these sites will
influence various catalytic functions.[39] In consecutive rounds these first-shell positions (typically around 10 amino acid residues) are subjected in
small sets of two or three sites to saturation mutagenesis and screening. This
structure-based approach appeared to be a robust method and has become a
common way of engineering catalytic parameters such as substrate scope,
inhibition properties, catalytic efficiency and (enantio-) selectivity.[48,49] The
vast expansion of the protein structure database (www.pdb.org)[50] in the
late 1990s and the 2000s supported the use of CASTing.
Despite all successes there are three important limitations of the CASTing
approach, which are also valid for many other directed evolution methods:
A) the evaluation of combinatorial saturation libraries requires a significant
screening effort; B) multiple rounds of laborious mutagenesis and screening
are needed; and C) possible synergetic combinations of two or more
mutations are only partially studied. This is caused by the fact that the
capacity of available screening assays is limited with the consequence that
just a relatively small number of target positions can be efficiently explored
with combinatorial SSM. If, for example, two or three positions are
simultaneously randomized to all twenty proteinogenic amino acids, 400
(20²) or 8000 (20³) enzyme variants have to be assessed, respectively. This is
about the maximum number of variants that can be covered with analytical
tools such as high-performance liquid chromatography (HPLC) and gas-liquid chromatography (GLC), which are often used in the screening for
catalytic function. At some point the screening capacity is always limiting;
even highly efficient selection methods can cover only a diminutive fraction
of the huge sequence space of proteins.[51]
One of the challenges in directed evolution is to make optimal use of the
screening capacity that is available. A common idea is that comprehensive
screening is required to find the best evolved mutant enzyme.[48] However,
full coverage of a random protein library requires oversampling. It is like
throwing a die; most of the time it takes more than six throws to get all six
values, while several values come up more than once. Reaching 95% coverage
of an unbiased SSM library of 8,000 unique variants
requires an average screening effort of approximately 23,965 randomly
picked variants. This translates to almost three-fold library oversampling,
whereas the repetitive testing of identical clones does not contribute to the
discovery of better hits. The relationship between library coverage and
oversampling is visualized in Figure 8. Equation 1a is used to calculate the
library coverage as function of the screening effort.[52] Equation 1b is used to
determine the average number of clones that needs to be screened to achieve
a desired coverage.
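The die analogy can be worked out exactly. The sketch below computes the expected number of throws needed to see all faces of a fair die (the classical coupon collector result) and the probability that the first six throws already show all six faces; the function names are chosen for illustration only.

```python
from fractions import Fraction
from math import factorial


def expected_draws_for_full_coverage(n):
    """Expected number of uniform random draws needed to see all n outcomes."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))


def prob_all_seen_in_n_draws(n):
    """Probability that n draws show each of the n outcomes exactly once."""
    return Fraction(factorial(n), n ** n)


print(float(expected_draws_for_full_coverage(6)))  # throws of a fair die
print(float(prob_all_seen_in_n_draws(6)))
```

For a fair die the expectation is 147/10 throws, and the chance of seeing all six faces in exactly six throws is 5/324, which is the same oversampling effect that applies to the screening of random protein libraries.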
[Figure 8 plot: library coverage (%) and p(unique variant) (%) on the vertical axis, from 0 to 100, against library oversampling on the horizontal axis, from 0× to 5×.]
Figure 8. Coverage of an unbiased random protein library in percent as
function of the screening effort (solid line). The probability of sampling
unique variants decreases when the library coverage progresses (dotted
line). Example: after 1× oversampling about 63.2% library coverage is
achieved; the probability that the 401st randomly picked clone is
unique (not sampled before) for a library with 400 variants is
approximately 37% (100 − 63).
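a. $$LC = \left(1 - \left(1 - \frac{1}{LS}\right)^{SC}\right) \times 100\%$$
b. $$SC = \frac{\ln\left(1 - \frac{LC}{100}\right)}{\ln\left(1 - \frac{1}{LS}\right)}$$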
Equation 1. a. Coverage of an unbiased random protein library as a function of the
screening effort. LC is the percentage of the theoretical library diversity that is
covered, LS is the library size (total number of unique variants that are present in
the library) and SC is the number of clones that are screened. The library coverage
corresponds to the probability of finding a specific variant, which is the same as one
minus the chance that such variant is not picked. The chance that a specific variant
is not picked can be written as (1 − 1/LS)^SC. b. The number of randomly picked clones
that have to be screened (SC) in order to achieve a given degree of library coverage
(LC, of theoretical diversity in percent). Equation 1b is derived from 1a.
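The relations described for Equation 1 can be checked numerically. The sketch below implements them as described in the caption (coverage as one minus the chance that a given variant is never picked); the function names are chosen for illustration and an unbiased library is assumed.

```python
import math


def library_coverage(library_size, clones_screened):
    """Equation 1a: expected coverage (%) of an unbiased random library."""
    return (1.0 - (1.0 - 1.0 / library_size) ** clones_screened) * 100.0


def clones_to_screen(library_size, coverage_percent):
    """Equation 1b: average number of clones needed for a given coverage (%)."""
    return math.log(1.0 - coverage_percent / 100.0) / math.log(1.0 - 1.0 / library_size)


# 1x oversampling of a 400-variant library (two saturated positions).
print(round(library_coverage(400, 400), 1))
# Screening effort for 95% coverage of an 8,000-variant library.
print(math.ceil(clones_to_screen(8000, 95)))
```

Rounding the result of the second call up to whole clones reproduces the screening effort of about 23,965 clones mentioned in the text.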
An important question is to what extent a random protein library should be
screened to make optimal use of the screening capacity. In other words,
what is the optimal balance between library coverage and oversampling? An
absolute answer to this question is hard to give but in case only one
improved variant would be present in a certain library it could make sense to
strive for 95% coverage so that there is only 5% chance of not finding this
hypothetical variant. On the other hand this scenario rarely happens.
Especially in larger combinatorial SSM libraries it is much more likely that
multiple suitable variants are present or that no improved variants are
present at all. In a positive situation where multiple sufficiently improved
variants exist, it requires much less library oversampling to discover at least
one of those improved variants.[52] For example, to have 95% chance of
finding at least one out of five sufficiently improved variants in a random
library of 8,000 variants it is required to achieve just 45.1% library coverage
(Equation 2a). To reach this degree of library coverage it is necessary to
screen about 4793 randomly picked clones (Equation 2b). This example
indicates that comprehensive screening is not always required and that
testing of large numbers of redundant clones can be avoided. Library
screening strategies are further explored in Chapter 3 and also the sampling
of biased libraries is covered in this chapter.
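a. $$LC = \left(1 - \left(1 - \frac{RC}{100}\right)^{1/SIV}\right) \times 100\%$$
b. $$SC = \frac{\ln\left(\left(1 - \frac{RC}{100}\right)^{1/SIV}\right)}{\ln\left(1 - \frac{1}{LS}\right)}$$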
Equation 2. a. Library coverage (LC) required to find at least one of a defined
number of variants that are assigned as sufficiently improved variants (SIV). The
library coverage is indicated with LC, the confidence to find at least one SIV is
indicated with RC (in %). The probability of finding at least one out of multiple SIVs
can be written as 1 − (1 − LC/100)^SIV and is defined as RC/100. Using this equation the
required LC can be easily calculated. b. Number of clones that need to be screened
(SC) to find at least one SIV with certain confidence (RC). The library size (LS) is
the theoretical number of unique variants that are encoded. Equation 2b follows
from Equation 2a and 1b.
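The worked example in the text (five sufficiently improved variants in a library of 8,000, 95% confidence) can be reproduced with a few lines of code. This is a minimal sketch of the relations described for Equation 2, with illustrative function names and the assumption of an unbiased library.

```python
import math


def required_coverage(confidence_percent, n_improved):
    """Equation 2a: coverage (%) needed to find at least one of n_improved variants."""
    miss = (1.0 - confidence_percent / 100.0) ** (1.0 / n_improved)
    return (1.0 - miss) * 100.0


def clones_for_one_hit(library_size, confidence_percent, n_improved):
    """Equation 2b: clones to screen to find at least one improved variant."""
    miss = (1.0 - confidence_percent / 100.0) ** (1.0 / n_improved)
    return math.log(miss) / math.log(1.0 - 1.0 / library_size)


print(round(required_coverage(95, 5), 1))             # required coverage in %
print(math.ceil(clones_for_one_hit(8000, 95, 5)))     # clones to screen
```

With a single improved variant (`n_improved=1`) the second function reduces to Equation 1b, so the two sets of equations stay consistent with each other.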
From completely random, to knowledge-driven library design
To make optimal use of screening capacity it is necessary to create libraries
with a high fraction of genotypically different improved variants. One option
is to suppress uneven distribution of amino acids in SSM libraries, which
arises from the fact that some amino acids are specified by a larger number
of codons than others. However, even in unbiased SSM libraries the
frequency of positives usually remains relatively low. This is because only a
small fraction of the explored amino acid substitutions will be beneficial for
a desired function whereas the vast majority of random mutations are neutral
or even detrimental.[51]
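The uneven distribution of amino acids that results from codon redundancy can be made explicit by counting codons per amino acid. The sketch below does this for a fully randomized codon and, as an assumption for illustration, for the degenerate codon NNK (K = G or T), which is widely used in saturation mutagenesis but is not specified in the text above.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_counts(first=BASES, second=BASES, third=BASES):
    """Count how many codons of a degenerate triplet encode each amino acid."""
    return Counter(
        CODON_TABLE[a + b + c] for a, b, c in product(first, second, third)
    )


nnn = codon_counts()            # fully random codon: 64 triplets
nnk = codon_counts(third="GT")  # NNK (assumed example): 32 triplets

for aa in "LSRMW*":
    print(aa, nnn[aa], nnk[aa])
```

Even with NNK, leucine, serine and arginine are each encoded three times as often as methionine or tryptophan, so rare amino acids require extra sampling unless the bias is suppressed.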
The frequency of beneficial mutations in site-specific mutant libraries can be
significantly enhanced by the application of restricted mutagenic codons that
specify only functional subsets of the twenty proteinogenic amino acids.
With this approach, called site-restricted mutagenesis (SRM),[53] the
randomness in the ensuing libraries is drastically reduced.[54-56] For example,
an SRM library in which four positions are randomized to five different
amino acids plus the wild-type residue (5+1) has a total diversity of 1296
(6⁴) variants. This is 123-fold less than a full saturation library that also
addresses four positions (20⁴ = 160,000). With SRM more target positions
can be simultaneously covered with less screening. However, beneficial
amino acid residues that are not included will be missed. This makes the
definition of functional subsets of amino acids in SRM library design of key
importance. One option that was explored by Reetz and co-workers is to
choose subsets in such a way that libraries encompass large differences in
physical and chemical properties of amino acids without codon bias.[57]
More data-driven options use, for example, phylogenetic and co-evolution
data, following the assumption that individual or combinations of amino
acids that occur in homologous enzymes are likely to be tolerated.[58-61] Also
structural[62,63] and literature[64,65] data can be used. Moreover, computational
methodologies are becoming increasingly important for the definition
of functional subsets of amino acids[66-69] (Chapter 6). Smart library design
via SRM typically applies knowledge from various sources and this strategy
is visualized in Figure 9.
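The gain in screening efficiency from site-restricted mutagenesis can be estimated by combining the library sizes given above with the coverage relation of Equation 1b. The sketch below is a rough illustration with hypothetical function names; it assumes that all variants are equally represented at the amino acid level.

```python
import math


def library_size(residues_per_position, positions):
    """Number of unique protein variants in a combinatorial library."""
    return residues_per_position ** positions


def clones_to_screen(size, coverage_percent=95.0):
    """Clones needed for the given coverage of an unbiased library (Equation 1b)."""
    needed = math.log(1.0 - coverage_percent / 100.0) / math.log(1.0 - 1.0 / size)
    return math.ceil(needed)


ssm = library_size(20, 4)  # full saturation at four positions
srm = library_size(6, 4)   # five selected residues plus wild type at four positions

print(ssm, srm, round(ssm / srm))
print(clones_to_screen(ssm), clones_to_screen(srm))
```

Under these assumptions the restricted library can be covered to 95% with a few thousand clones, whereas the full saturation library would require several hundred thousand.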
[Figure 9 graphic: knowledge-driven library design, shown as seven numbered panels (1-7) arranged around the central label, including a phylogenetic tree of homologous sequences, a sequence logo (weblogo.berkeley.edu) and HotSpot Wizard output.]
Figure 9. Knowledge and data-driven library design. From the left (clockwise): (1)
phylogenetic data can indicate the mutability of individual sites and the amino acid
diversity that is likely to be accepted by the template enzyme; (2) co-evolution
analysis (2D heat map showing co-evolution scores created on 3dm.bio-prodict.nl)
can identify correlated occurrence of amino acids in homologous enzyme
sequences, which is likely important for a certain function; (3) structural inspection
and modeling can reveal promising sites and favorable amino acid substitutions can
be predicted; (4) target sites and amino acid diversity can be inspired by earlier
mutagenesis studies of (related) enzymes; (5,6) various computational routines that
run for example under Yasara (http://www.yasara.org/) or Rosetta
(http://depts.washington.edu/bakerpg/drupal/) can be used to determine functional
diversity for targeting in SRM libraries; (7) HotSpot Wizard
(http://loschmidt.chemi.muni.cz/hotspotwizard/) is a web tool that uses structural and
phylogenetic data to identify residues that may come into contact with ligand
molecules entering or leaving the active site.
Discovery and characterization of haloalkane dehalogenases
The subsequent chapters of this thesis describe tools and strategies for more
efficient laboratory evolution of enzymes. Some approaches focused on
enhancing the biocatalytic potential of haloalkane dehalogenases are
experimentally evaluated. Like many other hydrolytic enzymes such as
lipases, proteases, esterases and epoxide hydrolases, haloalkane
dehalogenases belong to the so-called α/β-hydrolase fold superfamily and
are composed of two domains: a large catalytic α/β-core domain and a
mainly alpha-helical cap-domain that shapes part of the substrate binding
pocket (Figure 10).[70-73]
Figure 10. Cartoon representation of a. haloalkane dehalogenase from
Xanthobacter autotrophicus GJ10 - DhlA (image created from pdb file 1B6G[71]) and
b. haloalkane dehalogenase from Rhodococcus rhodochrous NCIMB 13064 - DhaA
(image created from pdb file 1BN6[72]). α-Helices and β-sheets of the α/β core
domain are shown in blue and red, respectively; α-helices of the cap domain are
displayed in dark blue. Catalytic residues are displayed as yellow sticks.
Figure 11. The hydrolytic cleavage reaction of haloalkanes catalyzed by haloalkane
dehalogenase yields an alcohol product, a halide anion and a proton.
Haloalkane dehalogenases occur in some bacteria and catalyze the
hydrolytic cleavage of carbon-halogen bonds (Figure 11). Whereas most
enzyme substrates are naturally occurring compounds, this is not the case for
haloalkane dehalogenases. Until the second half of the previous century
relatively few haloalkanes were present in nature; most natural halogenated
compounds are formed by marine organisms.[74,75] However, this situation
changed dramatically when haloalkanes started being produced on an
industrial scale for diverse applications such as solvents, gasoline additives,
biocides and synthons in organic synthesis. Today halogenated materials can
be found in, for example, solvents,[76] paint removers,[77] flame
retardants[78] and about 20% of all pharmaceuticals that are on the
market.[79] In just a few decades this new class of chemicals became
widespread in nature due to spillage at production sites, extensive use in
agriculture and contamination after disposal.[80,81]
Unfortunately, most halogenated compounds, and especially haloalkanes
bearing multiple halogen atoms such as 1,2,3-trichloropropane, proved to
be toxic and highly recalcitrant towards biodegradation. Nonetheless, the
selection pressure at contaminated sites and the metabolic versatility and
evolutionary potential of microorganisms have resulted in microbes that can
mineralize halogenated compounds such as 1,2-dichloroethane (DCE),[82]
vinyl chloride[83] and pesticides like γ-hexachlorocyclohexane (lindane).[84]
Since 1978 haloalkane-degrading microbes have been isolated from
contaminated soils[85,86] and the enzymes that are involved in dehalogenation
have received widespread scientific attention.[87-89]
The first isolated haloalkane dehalogenase was obtained from Xanthobacter
autotrophicus GJ10 (DhlA) by Keuning et al. in 1984.[90] This Xanthobacter
strain was isolated from 1,2-dichloroethane contaminated soil and is capable
of utilizing this toxic compound as sole carbon and energy source. The
substrate scope of DhlA is not limited to DCE. In vitro
studies have indicated that it also catalyzes the hydrolytic cleavage of
carbon-halogen bonds in a range of other mono- and dihalogenated short-chain
alkanes such as 1,2-dibromoethane, 1,3-dichloropropane, 1-chloroethane and
1-bromopropane.[90] Site-directed mutagenesis experiments
and the elucidation of a three-dimensional structure of wild-type DhlA by X-ray
crystallography have shed light on the catalytic mechanism and
structure-function relationship of haloalkane dehalogenases.[73,91,92] In DhlA
the residues Asp124, His289 and Asp260 make up a nucleophile-histidine-acid
catalytic triad, which is a typical motif in α/β-hydrolases. The indole
NH groups of Trp125 and Trp175 form hydrogen bonds with the halogen atom of
the substrate, which stabilizes the transition state complex. The
dehalogenation reaction proceeds in two steps. First, a carboxylate oxygen
of Asp124 carries out a nucleophilic attack on the halogen-bearing carbon
atom, thereby producing an alkyl-enzyme intermediate and a halide ion.
In the next step the alkyl-enzyme intermediate is cleaved by a water
molecule that is activated by His289 (Figure 12). The catalytic pentad,
formed by the catalytic triad and the two halide-stabilizing residues,
differs somewhat within the haloalkane dehalogenase family. The catalytic
acid that forms a hydrogen bond with the general base (Asp260 in DhlA)
can also be a glutamate residue and the second halide-stabilizing residue
(Trp175 in DhlA) can be either a tryptophan or an asparagine.[93] The
topological arrangement of the catalytic acid and the second halide-stabilizing
residue also diverges within the family. The second halide-stabilizing
residue can be located either in the cap domain or in the core domain and the
catalytic acid can be positioned in two different loops in the core domain.[93]
Figure 12. The catalytic mechanism of haloalkane dehalogenase from Xanthobacter
autotrophicus. In the first half reaction a covalent alkyl-enzyme intermediate and a
halide ion are formed. In the second half reaction the alkyl-enzyme intermediate is
hydrolytically cleaved producing the alcohol and a proton.
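The two half reactions can be written out as a simple scheme. This is a minimal sketch based only on the description above: E-Asp-COO⁻ stands for the nucleophilic aspartate (Asp124 in DhlA), R-X for the haloalkane substrate, and the notation is an illustrative assumption rather than the scheme drawn in the original figure. Substrate binding and product release steps are omitted.

```latex
\begin{align*}
\text{First half reaction:}\quad
  & \mathrm{E{-}Asp{-}COO^{-}} + \mathrm{R{-}X}
    \longrightarrow \mathrm{E{-}Asp{-}COO{-}R} + \mathrm{X^{-}} \\
\text{Second half reaction:}\quad
  & \mathrm{E{-}Asp{-}COO{-}R} + \mathrm{H_2O}
    \longrightarrow \mathrm{E{-}Asp{-}COO^{-}} + \mathrm{R{-}OH} + \mathrm{H^{+}} \\
\text{Overall:}\quad
  & \mathrm{R{-}X} + \mathrm{H_2O}
    \longrightarrow \mathrm{R{-}OH} + \mathrm{X^{-}} + \mathrm{H^{+}}
\end{align*}
```

Adding the two half reactions cancels the enzyme species and gives the overall hydrolysis, which yields an alcohol, a halide anion and a proton.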
Besides its value for fundamental studies, the structure of DhlA also
suggested ways to engineer its catalytic and physical properties. In the
1990s targeted mutations in the cap domain were reported that altered the
substrate scope of the enzyme[94,95] and in 2002 a DhlA variant with
engineered disulfide bonds was described that displayed higher
thermostability.[96] Haloalkane dehalogenases have received much attention
for applications in bioremediation. In 1995 a bioreactor harboring
Xanthobacter autotrophicus GJ10 cells was used to clean up DCE-polluted
groundwater in Germany,[97] but many other haloalkanes could not be
mineralized with this system, in part due to the limited catalytic activity and
substrate scope of DhlA.
Since the description of DhlA, various other haloalkane dehalogenases have been
discovered and characterized. For example, the haloalkane dehalogenase from
Sphingomonas paucimobilis UT26 (LinB) was reported by Nagata et al. in
1993.[98,99] This enzyme is highly active on bulky compounds such as
1,3,4,6-tetrachloro-1,4-cyclohexadiene, which is an intermediate in the degradation
of the insecticide lindane. In 1997 Kulakova et al. described a haloalkane
dehalogenase (DhaA) from Rhodococcus rhodochrous NCIMB 13064, which
was the first haloalkane dehalogenase known to be involved in the
degradation of several C2-C8 n-haloalkanes.[100] The latter enzyme also
displays better catalytic activity on trihalogenated propanes than
DhlA, but this was still not sufficient to support bacterial growth on TCP.[101]
In 2002 Bosma and colleagues used a directed evolution approach to
enhance the catalytic efficiency of DhaA towards TCP and applied their best
evolved mutant in the construction of an engineered organism that can
utilize TCP as the sole carbon and energy source.[102] In a subsequent
directed evolution study, which focused on the substrate access tunnel,
the catalytic activity of DhaA on TCP was further improved.[103] This
variant, referred to as DhaA31, was also used to create a further improved
TCP-degrading organism.[104]
Biocatalytic potential of haloalkane dehalogenases
The studies cited above indicate that there are several possibilities to tailor
haloalkane dehalogenases for applications in the field of bioremediation, but
the potential of haloalkane dehalogenases in the production of valuable fine
chemicals has been explored only poorly. In particular, the biocatalytic production of
chiral intermediates for the pharmaceutical industry requires highly
enantioselective enzymes. Unfortunately, the best studied haloalkane
dehalogenases so far display only poor to moderate
enantioselectivity.[105-107] An attractive option to expand the biocatalytic
potential of haloalkane dehalogenases is to develop enantioselective variants
for useful target conversions.
Outline of the Thesis
In Chapter 2 the biocatalytic potential of five wild-type and one mutant
haloalkane dehalogenases is explored in the asymmetric conversion of
prochiral polyhalogenated compounds towards chiral haloalcohol building
blocks. To enhance the optical purity of the primary dehalogenase product,
the subsequent kinetic resolution of the haloalcohols towards the diol is
also investigated.
In Chapter 3 a smart library design approach is explored to evolve
haloalkane dehalogenase variants with complementary enantioselectivity
towards the industrial waste product TCP. The anticipated products (R)- and
(S)-2,3-dichloropropanol can be converted into (S)- and (R)-epichlorohydrin
which are valuable chiral building blocks for the pharmaceutical industry.
In Chapter 4 new tools and strategies are explored for designing and
creating site-restricted mutant libraries. The first tool is focused on the
definition of sequence diversity that can be optimally covered with the
screening capacity that is available. The purpose of the second tool is to find
optimal sets of (partly undefined) codons that specify the required subsets of
amino acids.
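To make the two questions above concrete, the following minimal Python sketch shows (a) which amino acids a partly undefined (degenerate) codon specifies and (b) how many clones would need to be screened to cover a given library. It is not the tool developed in Chapter 4. The function names (`expand`, `encoded_amino_acids`, `clones_for_coverage`) are hypothetical, and the coverage estimate assumes that all variants are equally likely and uses the common Poisson approximation F = 1 - exp(-T/V), where T is the number of clones screened and V the number of variants.

```python
from itertools import product
from math import ceil, log

# IUPAC nucleotide ambiguity codes (standard nomenclature).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code in the conventional T, C, A, G ordering;
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expand(degenerate_codon):
    """Return all concrete codons covered by a degenerate codon such as 'NNK'."""
    choices = [IUPAC[base] for base in degenerate_codon.upper()]
    return ["".join(bases) for bases in product(*choices)]


def encoded_amino_acids(degenerate_codon):
    """Map each encoded amino acid (or '*') to the number of codons specifying it."""
    counts = {}
    for codon in expand(degenerate_codon):
        aa = CODON_TABLE[codon]
        counts[aa] = counts.get(aa, 0) + 1
    return counts


def clones_for_coverage(n_variants, coverage=0.95):
    """Clones to screen so that the expected fraction `coverage` of
    `n_variants` equally likely variants is sampled at least once
    (assumption: Poisson approximation, T = -V * ln(1 - F))."""
    return ceil(-n_variants * log(1.0 - coverage))


if __name__ == "__main__":
    for codon in ("NNK", "NDT"):
        counts = encoded_amino_acids(codon)
        n_codons = sum(counts.values())
        n_aa = len([aa for aa in counts if aa != "*"])
        print(codon, n_codons, "codons,", n_aa, "amino acids,",
              counts.get("*", 0), "stop codon(s)")
        print("  clones for 95% coverage of two such positions:",
              clones_for_coverage(n_codons ** 2))
```

In this sketch NNK expands to 32 codons that specify all 20 amino acids plus one stop codon, whereas NDT expands to 12 codons that specify 12 amino acids without a stop codon. A smaller codon set therefore reduces the screening effort at the cost of amino acid diversity, which is the trade-off the two tools address.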
In Chapter 5 a new combinatorial library design strategy is explored. The
purpose is to efficiently explore larger numbers of positions and amino acid
variation in single combinatorial library designs. Exploring protein sequence
space in a more efficient way can speed up the directed evolution process.
In Chapter 6 a computational evolution methodology is explored for the
development of biocatalysts that can convert non-natural target compounds.
The method takes three key requirements for enzyme catalysis into account:
enzyme stability, substrate binding and substrate turnover. The final goal of
this study is to evolve haloalkane dehalogenase variants with enhanced
properties in the kinetic resolution of racemic 3-chloro-2-alkyn towards
enantiopure 3-butyn-2-ol building blocks.
References
1. World Commission on Environment and Development (1987) Our common
future. Oxford University Press, USA.
2. K. Weissermel, H.J. Arpe (1997) Industrial organic chemistry, 3rd Edn., Wiley
VCH, Weinheim, Germany. p 294-299.
3. B.M. Bell, J.R. Briggs, R.M. Campbell, S.M. Chambers, P.D. Gaarenstroom,
J.G. Hippler, B.D. Hook, K. Kearns (2008) Glycerin as a renewable feedstock
for epichlorohydrin production. The GTE Process. CLEAN - Soil, Air, Water
36: 657-661.
4. R.A. Sheldon (2000) Atom efficiency and catalysis in organic synthesis. Pure
Appl. Chem. 72: 1233-1246.
5. R.A. Sheldon, I. Arends, U. Hanefeld (2007) Green chemistry and catalysis.
Wiley VCH, Weinheim, Germany.
6. Dutch National Research School Combination Catalysis Controlled by
Chemical Design (NRSC-Catalysis) (2009) Future perspectives in
catalysis. http://www.nrsc-catalysis.nl/files/media/scientific_reports/Future_perspectives_in_Catalysis.pdf
7. R.H. Garrett, C.M. Grisham (1999) Biochemistry. Saunders college publishing,
Philadelphia, USA p 426-427.
8. L. Pasteur (1857) Mémoire sur la fermentation appelée lactique.
Comptes rendus des séances de l’Académie des Sciences. 45: 913-916.
9. R.A. Sheldon (1993) Chirotechnology - Industrial synthesis of optically active
compounds. CRC press, New York, USA p.105.
10. A. Payen, J.F. Persoz (1833) Mémoire sur la Diastase; les principaux produits
de ses réactions et leurs applications aux arts industriels. J. Ann. Chem.
Phys. 53: 73-92.
11. P.S.J. Cheetham (1995) The application of enzymes in industry, in: Handbook
of Enzyme Biotechnology, 3rd Edn., Ellis Horwood, London. p 420.
12. R.L. Antrim, W. Colilla, B.J. Schnyder (1979) Glucose isomerase production of
high-fructose syrups. In: Appl. Biochem. Bioeng. (vol 2), Enzyme Technology.
Academic Press, New York, USA p 97-207.
13. A.S. Bommarius, K. Drauz, U. Groeger, C. Wandrey (1992) Membrane
bioreactors for the production of enantiomerically pure α-amino acids, Chirality
in Industry. John Wiley & Sons Ltd, New York, USA p 372-397.
14. O.K. Sebek, D. Perlman (1979) Microbial transformation of steroids and
sterols. Microbial Technology (vol. 1), 2nd Edn., Academic Press, New York,
USA p. 484-488.
15. L. Cao (2005) Carrier-bound Immobilized Enzymes: Principles, Application
and Design. Wiley-VCH, Weinheim, Germany.
16. J.D. Watson, F.H.C. Crick (1953) Molecular structure of nucleic acids: a
structure for deoxyribose nucleic acid. Nature 171: 737-738.
17. S.S. Hall (2002) Invisible frontiers: the race to synthesize a human gene.
Oxford University Press, USA.
18. J.A. van den Berg, K.J. van der Laken, A.J. van Ooyen, T.C. Renniers, K.
Rietveld, A. Schaap, A.J. Brake, R.J. Bishop, K. Schultz, D. Moyer, M.
Richman, J.R. Shuster (1990) Kluyveromyces as a host for heterologous gene
expression: expression and secretion of prochymosin. Biotechnology 8: 135-139.
19. J.C. Kendrew, G. Bodo, H.M. Dintzis, R.G. Parrish, H. Wyckoff, D.C. Phillips
(1958) 3-Dimensional model of the myoglobin molecule obtained by X-ray
analysis. Nature 181: 662-666.
20. K. Mullis, F. Faloona, S. Scharf, R. Saiki, G. Horn, H. Erlich (1986) Specific
enzymatic amplification of DNA in vitro - the polymerase chain-reaction. Cold
Spring Harbor Symposia on Quantitative Biology 51: 263-273.
21. E. Pettersson, J. Lundeberg, A. Ahmadian (2009) Generations of sequencing
technologies. Genomics 93: 105-111.
22. W. Kühne (1976) Enzymes: One Hundred Years. FEBS Lett. vol. 62.
23. R. Borriss (1987) Biotechnology of enzymes, in Biotechnology vol 7a, eds.
H.J. Rehm, G. Reed, series Enzyme Technology, ed. J.F. Kennedy, VCH
Verlagsgesellschaft, Weinheim, Germany. p 35-62.
24. W. Gerhartz (1990) Enzymes in industry, VCH Verlagsgesellschaft,
Weinheim, Germany. p 11.
25. J.C. Venter, K. Remington, J.F. Heidelberg, A.L. Halpern, D. Rusch, J.A.
Eisen, D. Wu, I. Paulsen, K.E. Nelson, W. Nelson, D.E. Fouts, S. Levy, A.H.
Knap, M.W. Lomas, K. Nealson, O. White, J. Peterson, J. Hoffman, R.
Parsons, H. Baden-Tillson, C. Pfannkoch, Y.H. Rogers, H.O. Smith (2004)
Environmental genome shotgun sequencing of the Sargasso Sea. Science
304: 66-74.
26. M. Matsumura, W.J. Becktel, M. Levitt, B.W. Matthews (1989) Stabilization of
phage T4 lysozyme by engineered disulfide bonds. Proc. Natl. Acad. Sci. 86:
6562-6566.
27. M. Matsumura, G. Signor, B.W. Matthews (1989) Substantial increase of
protein stability by multiple disulfide bonds. Nature 342: 291-293.
28. U.T. Bornscheuer, M. Pohl (2001) Improved biocatalysts by directed evolution
and rational protein design. Curr. Opin. Chem. Biol. 2: 137-143.
29. R. Chen (2001) Enzyme engineering: rational redesign versus directed
evolution. Trends Biotechnol. 19: 13-14.
30. J.C. Moore, F.H. Arnold (1996) Directed evolution of a para-nitrobenzyl
esterase for aqueous-organic solvents. Nature Biotechnol. 14: 458-467.
31. W.P. Stemmer (1994) DNA shuffling by random fragmentation and
reassembly: in vitro recombination for molecular evolution. Proc. Natl. Acad.
Sci. USA 91: 10747-10751.
32. F.H. Arnold (2001) Combinatorial and computational challenges for biocatalyst
design. Nature 409: 253-257.
33. H.E. Schoemaker, D. Mink, M.G. Wubbolts (2003) Dispelling the myths -
biocatalysis in industrial synthesis. Science 299: 1694-1697.
34. K. Chen, F.H. Arnold (1993) Tuning the activity of an enzyme for unusual
environments: sequential random mutagenesis of subtilisin E for catalysis in
dimethylformamide. Proc. Natl. Acad. Sci. USA 90: 5618-5622.
35. W.P. Stemmer (1994) Rapid evolution of a protein in vitro by DNA shuffling.
Nature 370: 389-391.
36. L. You, F.H. Arnold (1996) Directed evolution of subtilisin E in Bacillus subtilis
to enhance total activity in aqueous dimethylformamide. Protein Eng. 9: 77-83.
37. J.H. Spee, W.M. de Vos, O.P. Kuipers (1993) Efficient random mutagenesis
method with adjustable mutation frequency by use of PCR and dITP. Nucleic
Acids Res. 21: 777-778.
38. T.S. Wong, D. Zhurina, U. Schwaneberg (2006) The diversity challenge in
directed protein evolution. Comb. Chem. High Throughput Screen. 9: 271-288.
39. K.L. Morley, R.J. Kazlauskas (2005) Improving enzyme properties: when are
closer mutations better? Trends Biotechnol. 23: 231-237.
40. H.H. Hogrefe, J. Cline, G.L. Youngblood, R.M. Allen (2002) Creating
randomized amino acid libraries with the QuikChange Multi Site-Directed
Mutagenesis kit. BioTechniques. 33: 1158-1165.
41. P. Klaassen, A.W.H. Vollebregt, M.A. van den Berg, M. Hans, J.M. van der
Laan (2007) Process for preparing pravastatin. European Patent EP2094841.
42. M. Hans, J.M. van der Laan, B. Meijrink, W. van Scheppingen, R. Kerkman,
M. van den Berg, M. Kittelmann, A. Kuhn, A. Riepp, J. Kühnöl, A.
Fredenhagen, L. Oberer, O. Ghisalba, S. Luetz, D.P. Mangan, T.S. Moody, D.
Schmid, A. Osorio-Lozada, F.O. Ütkür, J. Collins, C. Brandenbusch, G.
Sadowski, A. Schmid, B. Bühler, M. Kinne, M. Poraj-Kobielska, R. Ullrich, M.
Hofrichter, G. Grogan, M.L. Thompson (2012) Regio- and stereoselective
hydroxylation. In: Practical Methods for Biocatalysis and Biotransformations.
John Wiley & Sons, Ltd, Chichester, UK.
43. F.H. Arnold, J.C. Moore (1997) Optimizing industrial enzymes by directed
evolution, in: New Enzymes for Organic Synthesis, vol. 58, Adv. Biochem.
Eng. Biotechnol., Springer, Berlin, p 2-14.
44. M.T. Reetz (2004) Controlling the enantioselectivity of enzymes by directed
evolution: practical and theoretical ramifications. Proc. Natl. Acad. Sci. USA
101: 5716-5722.
45. H. Lin, V.W. Cornish (2002) Screening and selection methods for large-scale
analysis of protein function. Angew. Chem. 41: 4402-4425.
46. E.T. Boder, K.D. Wittrup (1997) Yeast surface display for screening
combinatorial polypeptide libraries. Nat. Biotechnol. 15: 553-557.
47. T.S. Wong, D. Roccatano, D. Loakes, K.L. Tee, A. Schenk, B. Hauer, U.
Schwaneberg (2008) Transversion-enriched sequence saturation
mutagenesis (SeSaM-Tv+): a random mutagenesis method with consecutive
nucleotide exchanges that complements the bias of error-prone PCR.
Biotechnol. J. 3: 74-82.
48. M.T. Reetz, M. Bocola, J.D. Carballeira, D.X. Zha, A. Vogel (2005) Expanding
the range of substrate acceptance of enzymes: combinatorial active-site
saturation test. Angew. Chem. 44: 4192-4196.
49. M.T. Reetz, L.W. Wang, M. Bocola (2006) Directed evolution of
enantioselective enzymes: iterative cycles of CASTing for probing
protein-sequence space. Angew. Chem. 45: 1236-1241.
50. H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N.
Shindyalov, P.E. Bourne (2000) The protein data bank. Nucleic Acids
Research 28: 235-242.
51. F.H. Arnold (1998) Design by directed evolution. Acc. Chem. Res. 31: 125-131.
52. Y. Nov (2012) When second best is good enough: another probabilistic look at
saturation mutagenesis. Appl. Environ. Microbiol. 78: 258-262.
53. J.G.E. van Leeuwen, H.J. Wijma, R.J. Floor, J.M. van der Laan, D.B. Janssen
(2012) Directed evolution strategies for enantiocomplementary haloalkane
dehalogenases: from chemical waste to enantiopure building blocks.
Chembiochem. 13: 137-148.
54. R.E. Campbell, O. Tour, A.E. Palmer, P.A. Steinbach, G.S. Baird, D.A.
Zacharias, R.Y. Tsien (2002) A monomeric red fluorescent protein. Proc. Natl.
Acad. Sci. USA 99: 7877-7882.
55. R.J. Hayes, J. Bentzien, M.L. Ary, M.Y. Hwang, J.M. Jacinto, J. Vielmetter, A.
Kundu, B.I. Dahiyat (2002) Combining computational and experimental
screening for rapid optimization of protein properties. Proc. Natl. Acad. Sci.
USA 99: 15926-15931.
56. R. Fox, L.J. Giver, D. Held, D. Hattendorf, T. Choudhary (2010) Reduced
codon mutagenesis. Codexis patent. US2011/0082055 A1.
57. M.T. Reetz, D. Kahakeaw, R. Lohmer (2008) Addressing the numbers
problem in directed evolution. Chembiochem. 9: 1797-1804.
58. C. Jäckel, J.D. Bloom, P. Kast, F.H. Arnold, D. Hilvert (2010) Consensus
protein design without phylogenetic bias. J. Mol. Biol. 399: 541-546.
59. A. Pavelka, E. Chovancova, J. Damborsky (2009) HotSpot Wizard: a web
server for identification of hot spots in protein engineering. Nucleic Acids Res.
37: W376-383.
60. R.K. Kuipers, H.J. Joosten, E. Verwiel, S. Paans, J. Akerboom, J. van der
Oost, N.G. Leferink, W.J. van Berkel, G. Vriend, P.J. Schaap (2009)
Correlated mutation analyses on super-family alignments reveal functionally
important residues. Proteins 76: 608-616.
61. H. Jochens, D. Aerts, U.T. Bornscheuer (2010) Thermostabilization of an
esterase by alignment-guided focussed directed evolution. Protein Eng. Des.
Sel. 23: 903-909.
62. M.T. Reetz, P. Soni, L. Fernandez (2009) Knowledge-guided laboratory
evolution of protein thermolability. Biotechnol. Bioeng. 102: 1712-1717.
63. J.F. Chaparro-Riggers, K.M. Polizzi, A.S. Bommarius (2007) Better library
design: data-driven protein engineering. J. Biotechnol. 2: 180-191.
64. L.G. Otten, F. Hollmann, I.W. Arends (2010) Enzyme engineering for
enantioselectivity: from trial-and-error to rational design? Trends Biotechnol.
28: 46-54.
65. R.K. Kuipers, H.J. Joosten, W.J. van Berkel, N.G. Leferink, E. Rooijen, E.
Ittmann, F. van Zimmeren, H. Jochens, U. Bornscheuer, G. Vriend, V.A. dos
Santos, P.J. Schaap (2010) 3DM: Systematic analysis of heterogeneous
superfamily data to discover protein functionalities. Proteins: Struct. Funct.
Bioinf. 78: 2101-2113.
66. C.A. Smith, T. Kortemme (2011) Predicting the tolerated sequences for
proteins and protein interfaces using Rosetta Backrub flexible backbone
design. PLoS One 6: e20451.
67. M.C. Laboissière, M.M. Young, R.G. Pinho, S. Todd, R.J. Fletterick, I. Kuntz,
C.S. Craik (2002) Computer-assisted mutagenesis of ecotin to engineer its
secondary binding site for urokinase inhibition. J. Biol. Chem. 277: 26623-26631.
68. P. Araujo, B. Sosa, T. Miller, S. Mayo (2012) In Silico screening of
computational enzyme designs. Protein Science 21: 132.
69. S.M. Lippow, T.S. Moon, S. Basu, S.H. Yoon, X. Li, B.A. Chapman, K.
Robison, D. Lipovšek, K.L. Prather (2010) Engineering enzyme specificity
using computational design of a defined-sequence library. Chem. Biol. 17:
1306-1315.
70. J. Marek, J. Vévodová, I.K. Smatanová, Y. Nagata, L.A. Svensson, J.
Newman, M. Takagi, J. Damborský (2000) Crystal structure of the haloalkane
dehalogenase from Sphingomonas paucimobilis UT26. Biochemistry 39:
14082-14086.
71. I.S. Ridder, H.J. Rozeboom, B.W. Dijkstra (1999) Haloalkane dehalogenase
from Xanthobacter autotrophicus GJ10 refined at 1.15 Å resolution. Acta.
Crystallogr. D. Biol. Crystallogr. 55: 1273-1290.
72. J. Newman, T.S. Peat, R. Richard, L. Kan, P.E. Swanson, J.A. Affholter, I.H.
Holmes, J.F. Schindler, C.J. Unkefer, T.C. Terwilliger (1999) Haloalkane
dehalogenases: structure of a Rhodococcus enzyme. Biochemistry 38: 16105-16114.
73. K.H. Verschueren, S.M. Franken, H.J. Rozeboom, K.H. Kalk, B.W. Dijkstra
(1993) Refined X-ray structures of haloalkane dehalogenase at pH 6.2 and pH
8.2 and implications for the reaction mechanism. J. Mol. Biol. 232: 856-872.
74. F. Laturnus, C. Wiencke, H. Klöser (1996) Antarctic macroalgae - Sources of
volatile halogenated organic compounds. Mar. Environ. Res. 41: 169-181.
75. G.W. Gribble (1994) The natural production of chlorinated compounds.
Environ. Sci. Technol. 28: 310A-319A.
76. S.R. Armstrong, L.C. Green (2004) Chlorinated hydrocarbon solvents. Clin.
Occup. Environ. Med. 4: 481-496.
77. R.D. Stewart, C.L. Hake (1979) Paint-remover hazard. JAMA 235: 398-401.
78. M.J. Dagani, H.J. Barda, T.J. Benya, D.C. Sanders (2002) "Bromine
compounds" In: Ullmann's Encyclopedia of Industrial Chemistry. Wiley-VCH,
Weinheim, Germany.
79. L.N. Herrera-Rodriguez, F. Kahn, K.T. Robins, H.P. Meyer (2011)
Perspectives on biotechnological halogenation. Part 1: Halogenated products
and enzymatic halogenation. Chem. Today. 29: n4 (Lonza)
80. E.C. Voldner, Y.F. Li (1995) Global usage of selected persistent
organochlorines. Sci. Total Environ. 160: 201-210.
81. D.W. Connell, G.J. Miller, M.R. Mortimer, G.R. Shaw, S.M. Anderson (1999)
Persistent lipophilic contaminants and other chemical residues in the southern
hemisphere. Crit. Rev. Environ. Sci. Technol. 29: 47-82.
82. D.B. Janssen, A. Scheper, L. Dijkhuizen, B. Witholt (1985) Degradation of
halogenated aliphatic compounds by Xanthobacter autotrophicus GJ10. Appl.
Environ. Microbiol. 49: 673-677.
83. S. Hartmans, J.A. De Bont (1992) Aerobic vinyl chloride metabolism in
Mycobacterium aurum L1. Appl. Environ. Microbiol. 58: 1220-1226.
84. R. Imai, Y. Nagata, K. Senoo, H. Wada, M. Fukuda, M. Takagi, K. Yano (1989)
Dehydrochlorination of γ-hexachlorocyclohexane (γ-BHC) by
γ-BHC-assimilating Pseudomonas paucimobilis. Agric. Biol. Chem. 53: 2015-2017.
85. T. Omori, M. Alexander (1978) Bacterial and spontaneous dehalogenation of
organic compounds. Appl. Environ. Microbiol. 35: 512-516.
86. T. Omori, M. Alexander (1978) Bacterial dehalogenation of halogenated
alkanes and fatty acids. Appl. Environ. Microbiol. 35: 867-871.
87. D.B. Janssen, F. Pries, J.R. van der Ploeg (1994) Genetics and biochemistry
of dehalogenating enzymes. Annu. Rev. Microbiol. 48: 163-191.
88. S. Fetzner (1998) Bacterial dehalogenation. Appl. Microbiol. Biotechnol. 50:
633-657.
89. M.I. Arif, G. Samin, J.G.E. van Leeuwen, J. Oppentocht, D.B. Janssen (2012)
Novel dehalogenase mechanism for 2,3-dichloro-1-propanol utilization in
Pseudomonas putida strain MC4. Appl. Environ. Microbiol. 78: 6128-6136.
90. S. Keuning, D.B. Janssen, B. Witholt (1985) Purification and characterization
of hydrolytic haloalkane dehalogenase from Xanthobacter autotrophicus
GJ10. J. Bacteriol. 163: 635-639.
91. F. Pries, J. Kingma, M. Pentenga, G. van Pouderoyen, C.M.
Jeronimus-Stratingh, A.P. Bruins, D.B. Janssen (1994) Site-directed mutagenesis and
oxygen isotope incorporation studies of the nucleophilic aspartate of
haloalkane dehalogenase. Biochemistry 33: 1242-1247.
92. F. Pries, J. Kingma, G.H. Krooshof, C.M. Jeronimus-Stratingh, A.P. Bruins,
D.B. Janssen (1995) Histidine 289 is essential for hydrolysis of the
alkyl-enzyme intermediate of haloalkane dehalogenase. J. Biol. Chem. 270: 10405-10411.
93. E. Chovancová, J. Kosinski, J.M. Bujnicki, J. Damborský (2007) Phylogenetic
analysis of haloalkane dehalogenases. Proteins. 67: 305-316.
94. P. Holloway, K.L. Knoke, J.T. Trevors, H. Lee (1998) Alteration of the
substrate range of haloalkane dehalogenase by site-directed mutagenesis.
Biotechnol. Bioeng. 59: 520-523.
95. J.P. Schanstra, A. Ridder, J. Kingma, D.B. Janssen (1997) Influence of
mutations of Val226 on the catalytic rate of haloalkane dehalogenase. Prot.
Eng. 10: 53-61.
96. M.G. Pikkemaat, A.B. Linssen, H.J. Berendsen, D.B. Janssen (2002)
Molecular dynamics simulations as a tool for improving protein stability.
Protein Eng. 15: 185-192.
97. G. Stucki, M. Thüer (1995) Experiences of a large-scale application of
1,2-dichloroethane degrading microorganisms for groundwater treatment. Environ.
Sci. Technol. 29: 2339-2345.
98. Y. Nagata, T. Nariya, R. Ohtomo, M. Fukuda, K. Yano, M. Takagi (1993)
Cloning and sequencing of a dehalogenase gene encoding an enzyme with
hydrolase activity involved in the degradation of
gamma-hexachlorocyclohexane in Pseudomonas paucimobilis. J. Bacteriol. 175:
6403-6410.
99. Y. Nagata, K. Miyauchi, J. Damborsky, K. Manova, A. Ansorgova, M. Takagi
(1997) Purification and characterization of a haloalkane dehalogenase of a
new substrate class from a gamma-hexachlorocyclohexane-degrading
bacterium, Sphingomonas paucimobilis UT26. Appl. Environ. Microbiol. 63:
3707-3710.
100. A.M. Kulakova, M.J. Larkin, L.A. Kulakov (1997) The plasmid-located
haloalkane dehalogenase gene from Rhodococcus rhodochrous NCIMB
13064. Microbiology. 143: 109-115.
101. T. Bosma, E. Kruizinga, E.J. de Bruin, G.J. Poelarends, D.B. Janssen (1999)
Utilization of trihalogenated propanes by Agrobacterium radiobacter AD1
through heterologous expression of the haloalkane dehalogenase from
Rhodococcus sp. strain M15-3. Appl. Environ. Microbiol. 65: 4575-4581.
102. T. Bosma, J. Damborský, G. Stucki, D.B. Janssen (2002) Biodegradation of
1,2,3-trichloropropane through directed evolution and heterologous
expression of a haloalkane dehalogenase gene. Appl. Environ. Microbiol. 68:
3582-3587.
103. M. Pavlova, M. Klvana, Z. Prokop, R. Chaloupkova, P. Banas, M. Otyepka,
R.C. Wade, M. Tsuda, Y. Nagata, J. Damborsky (2009) Redesigning
dehalogenase access tunnels as a strategy for degrading an anthropogenic
substrate. Nat. Chem. Biol. 5: 727-733.
104. G. Samin, D.B. Janssen (2012) Transformation and biodegradation of
1,2,3-trichloropropane (TCP). Environ. Sci. Pollut. Res. Int. 19: 3067-3078.
105. R.J. Pieters, J.H. Lutje Spelberg, R.M. Kellogg, D.B. Janssen (2001) The
enantioselectivity of haloalkane dehalogenases. Tetrahedron Letters 42: 469-471.
106. Z. Prokop, Y. Sato, J. Brezovsky, T. Mozga, R. Chaloupkova, T. Koudelakova,
P. Jerabek, V. Stepankova, R. Natsume, J.G.E. van Leeuwen, D.B. Janssen,
J. Florian, Y. Nagata, T. Senda, J. Damborsky (2010) Enantioselectivity of
haloalkane dehalogenases and its modulation by surface loop engineering.
Angew. Chem. 49: 6111-6115.
107. A. Westerbeek, J.G.E. van Leeuwen, W. Szymański, B.L. Feringa, D.B.
Janssen (2012) Haloalkane dehalogenase catalysed desymmetrisation and
tandem kinetic resolution for the preparation of chiral haloalcohols.
Tetrahedron 68: 7645-7650.