PROTEINS: Structure, Function, and Bioinformatics
Crystal structure and substrate-binding
mode of cellulase 12A from Thermotoga maritima
Ya-Shan Cheng,1 Tzu-Ping Ko,2 Tzu-Hui Wu,3 Yanhe Ma,4 Chun-Hsiang Huang,5
Hui-Lin Lai,3 Andrew H.-J. Wang,2,5 Je-Ruei Liu,1,6* and Rey-Ting Guo4*
1 Institute of Biotechnology, National Taiwan University, Taipei 106, Taiwan
2 Institute of Biological Chemistry, Academia Sinica, Taipei 115, Taiwan
3 Genozyme Biotechnology Inc., Taipei 106, Taiwan
4 Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
5 Genomics Research Center, Academia Sinica, Taipei 115, Taiwan
6 Department of Animal Science and Technology, National Taiwan University, Taipei 106, Taiwan
ABSTRACT
Cellulases have been used in many applications to
treat various carbohydrate-containing materials.
Thermotoga maritima cellulase 12A (TmCel12A)
belongs to the GH12 family of glycoside hydrolases. It is a β-1,4-endoglucanase that degrades cellulose molecules into smaller fragments, facilitating further utilization of the carbohydrate.
Because of its hyperthermophilic nature, the
enzyme is especially suitable for industrial applications. Here the crystal structure of TmCel12A
was determined by using an active-site mutant
E134C and its mercury-containing derivatives. It
adopts a β-jellyroll protein fold typical of the
GH12-family enzymes, with two curved β-sheets A
and B and a central active-site cleft. Structural
comparison with other GH12 enzymes shows significant differences, as found in two longer and
highly twisted β-strands B8 and B9 and several
loops. A unique Loop A3-B3 that contains Arg60
and Tyr61 stabilizes the substrate by hydrogen
bonding and stacking, as observed in the complex
crystals with cellotetraose and cellobiose. The
high-resolution structures allow clear elucidation
of the network of interactions between the enzyme
and its substrate. The sugar residues bound to the
enzyme appear to be more ordered in the −2 and
−1 subsites than in the +1, +2 and −3 subsites.
In the E134C crystals the bound −1 sugar at the
cleavage site consistently shows the α-anomeric
configuration, implicating an intermediate-like
structure.
Proteins 2011; 79:1193–1204.
© 2010 Wiley-Liss, Inc.
Key words: hyperthermophile;
endoglucanase;
catalytic intermediate; active site mutant; mercury derivatives; synchrotron radiations; biofuel
industry.
INTRODUCTION
Glucose produced by photosynthesis is a major energy source for
life. The α-1,4-linked glucose stored in starch-rich food such as rice,
wheat, corn, or potato can be readily released by digestive enzymes
such as the amylases. The β-1,4-linked glucose found in cellulose,
which plays a structural role in the plant cell wall, requires microbial
enzymes to make it available. Termites feasting on wood and cattle
browsing in a prairie are hosts of these microbes, which produce
enzymes including xylanases and cellulases that first degrade the
polysaccharide molecules into fragments, and eventually into disaccharides and monosaccharides. Demand for renewable energy
resources has recently increased to a level at which exploitation of plant waste
becomes a potential means to obtain biofuel and other products.
Heterologous cellulase genes have been overexpressed in well-characterized microorganisms such as Escherichia coli to promote their efficiency in biofuel conversion.1 In addition, cellulases are also used in
fruit juice processing and textile production.
Thermotoga maritima cellulase 12A (TmCel12A; 257 amino-acid residues) belongs to the GH12 family of glycoside hydrolases,2–4 which is
closely related to the GH11 sister family of xylanase. These two families
constitute Clan C of glycoside hydrolases according to the classification
by CAZy (www.cazy.org). Both are retaining enzymes for hydrolysis of
the glucans. The crystal structure of a GH12-family cellulase, the endoglucanase CelB2 from the bacterium Streptomyces lividans, was first determined by Sulzenbacher et al.5 It revealed a protein fold similar to that
observed in the GH11-family xylanases, comprising two large antiparallel β-sheets with jellyroll topology that are packed against each other in
a sandwich-like manner. The β-sheets are curved and interconnected
Additional Supporting Information may be found in the online version of this article.
Ya-Shan Cheng and Tzu-Ping Ko contributed equally to this work.
Grant sponsor: National Science Council; Grant number: NSC 98-2313-B002-033-MY3
*Correspondence to: Je-Ruei Liu, Institute of Biotechnology, National Taiwan University, Taipei
106, Taiwan. E-mail: [email protected] and Rey-Ting Guo, Tianjin Institute of Industrial
Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China. E-mail: [email protected]
Received 23 September 2010; Revised 10 November 2010; Accepted 17 November 2010
Published online 30 November 2010 in Wiley Online Library (wileyonlinelibrary.com).
DOI: 10.1002/prot.22953
by associated loops. An extended substrate-binding cleft is
formed across the molecular surface that suggests binding to
at least five glucose units in regions denoted the −3, −2,
−1, +1, and +2 subsites. In this cleft the two catalytic residues Glu120 and Glu203, which correspond to Glu134 and
Glu231 in TmCel12A, are facing each other and separated by
about 7 Å. Sulzenbacher et al. also directly observed the
covalent attachment of the nucleophile Glu120 to a substrate
analogue,6 attesting to the double displacement mechanism
of catalysis.
In addition to SlCelB2, GH12-family enzymes from
seven other organisms, Bacillus licheniformis, Rhodothermus marinus, Streptomyces sp 11AG8, Aspergillus niger,
Humicola grisea, Hypocrea jecorina (or Trichoderma reesei), and Hypocrea schweinitzii, are known for their
three-dimensional structures.7–14 All share a similar
β-sandwich protein fold, but with variations especially in the connecting loops, which may affect the affinity of the enzymes for the substrate and result in different cleavage efficiencies despite the common catalytic
mechanism. Since industrial application is usually carried out at elevated temperatures, thermal stability has
also been studied by extensive structural comparison
and mutagenesis.11,12,15 Here we report the crystal
structures of TmCel12A and its complexes with cellobiose
and cellotetraose, and present a detailed analysis of the
mode of substrate binding.
MATERIALS AND METHODS
Protein expression and purification
The TmCel12A gene was obtained from a Thermotoga maritima MSB8 genomic DNA library (ATCC 43589) and cloned
into the vector pET16b by using XbaI and NdeI. A His6-tag
was added before the N-terminus for purification purposes.
The primers used here were
5′-GCTCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACCATGGGCCACCACCACCACCACCACATGGTACTGATGACAAAA-3′ (forward)
and 5′-GGAATTCCATATGTTATCATTCTCTCACCTCCAGATCAAT-3′ (reverse).
The E134C mutant was prepared
by using the QuikChange site-directed mutagenesis kit (Agilent) with TmCel12A/pET16b as the template and a forward primer of
5′-AAGCATCGATCGGCGATGTTTGCATCATGGTCTGGTTCTATTT-3′.
The constructs were transformed into E. coli BL21 (DE3), where protein expression was induced by adding IPTG. The proteins were then
purified by FPLC using a Ni-NTA column and a DEAE column. The buffer and gradient were 25 mM Tris, pH 7.5,
150 mM NaCl, and 20–250 mM imidazole for the Ni-NTA
column and those for the DEAE column were 25 mM Tris,
pH 7.5, and 0–250 mM NaCl. The proteins were eluted
at about 75 mM imidazole and 125 mM NaCl, respectively.
The purified proteins were finally concentrated to
5 mg mL⁻¹ in 25 mM Tris, 150 mM NaCl, pH 7.5.
Crystallization and data collection
The wild-type TmCel12A protein was first crystallized
by using the Index screen kit (Hampton Research) and the sitting-drop vapor-diffusion method. The reservoir solution
(No.66) contained 0.2M ammonium sulfate, 0.1M Bis-Tris,
pH 5.5 and 25% w/v PEG3350. Better (orthorhombic)
crystals were obtained by optimizing the reservoir composition as 0.1M Bis-Tris, pH 5.5, 10% glycerol, and 15%
PEG3350. The reservoir for crystallizing the E134C mutant
was slightly different; it contained 0.1M ammonium sulfate, 0.1M Bis-Tris, pH 5.5, 5% glycerol, and 18%
PEG3350. Adding 10 mM cellobiose to the protein solution
resulted in a different (monoclinic) crystal form. All crystals were obtained at room temperature. They reached
suitable sizes for X-ray diffraction in 2 days.
Before flash freezing to cryogenic temperatures, the wild-type crystals were soaked for about 1 h in a cryoprotectant solution that contained 0.12M Bis-Tris, pH 5.5, 15% glycerol,
and 20% PEG3350. The wild-type TmCel12A-cellotetraose
complex crystals were obtained by including 10 mM cellotetraose in the soaking solution. The cryoprotectant for the
E134C crystals (both forms) contained 0.15M ammonium
sulfate, 0.15M Bis-Tris, pH 5.5, 15% glycerol, and 25%
PEG3350. To prepare heavy-atom derivatives, the 15 mercury-containing reagents in Heavy Atom Screen Hg (Hampton
Research) were each diluted 10-fold in the cryoprotectant solution (final concentration 2 mM) and used in soaking the
orthorhombic E134C crystals for 1 h. Soaking in the cryoprotectant containing 10 mM cellobiose or cellotetraose resulted
in the orthorhombic E134C-substrate complex crystals.
All X-ray diffraction experiments were carried out at the
National Synchrotron Radiation Research Center (NSRRC)
in Hsinchu, Taiwan. The wavelength used in collecting the
native E134C and the heavy-atom derivative data was either 1.0011 Å (BL 13B1) or 0.9762 Å (BL 13C1). The other
datasets were collected at a wavelength of 1.0008 Å (BL
13B1). The diffraction images were processed by using
HKL2000.16 The mercury datasets were scaled with anomalous signals conserved for phasing purposes. One dataset
of "native" E134C crystal and 12 datasets of the derivatives
were collected (see Table SI in the Supporting Information), in addition to five datasets of the wild-type enzyme
and substrate-complexes used in refinement and analysis.
Structure determination and refinement
Although the MIR (multiple isomorphous replacement)
datasets were not collected at a wavelength chosen to maximize the absorption
by mercury, both fixed wavelengths correspond to a higher energy than the theoretical value of 1.0101 Å, which allowed sufficient anomalous signals to be observed and turned
out to be effective in phase angle calculations. Using
SOLVE and RESOLVE,17–19 combinations of any 4 of the
12 datasets with the native dataset of E134C crystal
resulted in FOM (figure of merit) values from 0.60 to
0.78, Z-scores from 8.5 to 12.0, and up to 460 auto-built
amino-acid residues. The best results were obtained using
the derivatives of mersalyl acid, thimerosal, phenyl-mercury
acetate, and tetrakis(acetoxymercuri)methane. Because there
are two TmCel12A molecules in an asymmetric unit of the
orthorhombic crystal, a model of continuous polypeptide
chain was readily constructed by superimposing one molecule on the other, using the program O.20

Table I
Data Collection and Refinement Statistics for the TmCel12A Crystals

                          Wild-type            Wild-type            E134C                E134C                E134C
                          Native               G4-soak              G2-soak              G4-soak              G2-cocrystal
Data collection
Space group               P2₁2₁2₁              P2₁2₁2₁              P2₁2₁2₁              P2₁2₁2₁              C2
Unit-cell parameters
  a (Å)                   41.7                 42.3                 42.0                 42.1                 230.9
  b (Å)                   74.0                 73.9                 74.1                 73.9                 46.9
  c (Å)                   181.0                181.2                180.3                179.7                116.4
  β (°)                   -                    -                    -                    -                    114.2
Resolution (Å)            25–2.09 (2.16–2.09)  25–1.98 (2.05–1.98)  25–1.47 (1.52–1.47)  25–1.78 (1.84–1.78)  25–1.80 (1.86–1.80)
Unique reflections        33,848 (3312)        37,964 (3308)        94,215 (9191)        54,430 (5323)        104,758 (9837)
Redundancy                5.4 (5.0)            5.9 (5.7)            8.2 (8.1)            5.7 (5.3)            6.8 (5.9)
Completeness (%)          98.7 (99.6)          93.3 (82.9)          97.7 (96.6)          99.5 (98.7)          98.9 (94.1)
Average I/σ(I)            20.2 (6.8)           25.0 (8.1)           40.8 (7.8)           29.2 (4.2)           43.2 (3.8)
Rmerge (%)                10.3 (29.3)          7.4 (28.8)           5.0 (28.3)           5.6 (37.0)           4.3 (35.0)
Refinement
No. of protein chains     2                    2                    2                    2                    4
No. of reflections        32,726 (3076)        36,868 (3062)        92,639 (8504)        52,750 (4756)        100,096 (8268)
Rwork (95% of data)       0.177 (0.200)        0.146 (0.151)        0.172 (0.198)        0.169 (0.205)        0.190 (0.314)
Rfree (5% of data)        0.220 (0.246)        0.189 (0.197)        0.197 (0.232)        0.202 (0.225)        0.230 (0.347)
R.m.s.d. bonds (Å)        0.020                0.020                0.020                0.020                0.020
R.m.s.d. angles (°)       1.9                  2.0                  1.9                  2.0                  2.0
Dihedral angles (%)
  Most favored            90.6                 90.4                 90.7                 89.1                 89.9
  Allowed                 9.0                  9.2                  8.8                  10.4                 9.7
  Disallowed              0.4                  0.4                  0.5                  0.5                  0.4
No. of non-H atoms
  Protein                 4246                 4235                 4179                 4171                 8390
  Water                   296                  543                  764                  536                  1023
  Carbohydrate            -                    90                   92                   114                  128
Average B (Å²)
  Protein                 30.1                 14.7                 15.8                 21.4                 33.4
  Water                   48.2                 34.0                 37.8                 39.9                 49.3
  Carbohydrate            -                    28.1                 20.6                 30.2                 34.5
PDB ID code               3AMH                 3AMM                 3AMN                 3AMP                 3AMQ

G4 = cellotetraose; G2 = cellobiose.
All positive reflections (without sigma-cutoff) were used in the refinement. Values in parentheses are for the outermost resolution shells.
Subsequent refinement was carried out by using CNS,21
which was also employed in solving the structure of the
monoclinic E134C-cellobiose crystal by molecular replacement. When four copies of the A-chain model from the
orthorhombic crystals were correctly placed in the monoclinic unit cell, the initial R-value was 0.316 for the 1.8 Å-resolution data. At the beginning of refinement all substrate
molecules in the four complex crystals were visible in the difference Fourier maps. Water molecules were included according to strong electron densities and reasonable interactions
with the protein model. All atoms were refined using isotropic temperature factors. O was used in model adjustment
and analysis of the protein structure and its interaction with
the substrate. The figures were produced by using PyMOL.22
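The statement above that the two data-collection wavelengths lie at higher energy than the 1.0101 Å value quoted for mercury can be checked with the standard relation E = hc/λ. The following is a minimal sketch rather than part of the original analysis; the function and variable names are hypothetical, and only the wavelengths are taken from the text.

```python
# Minimal sketch: convert X-ray wavelengths (in Å) to photon energies (in keV).
# All names are illustrative assumptions; only the wavelengths come from the text.

HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def wavelength_to_kev(wavelength_angstrom: float) -> float:
    """Return the photon energy in keV for a wavelength given in Å."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths quoted for the native E134C and heavy-atom derivative data.
BEAMLINE_WAVELENGTHS = {"BL 13B1": 1.0011, "BL 13C1": 0.9762}

# Value the text gives as the theoretical wavelength for mercury.
HG_REFERENCE_WAVELENGTH = 1.0101


def is_higher_energy(wavelength: float,
                     reference: float = HG_REFERENCE_WAVELENGTH) -> bool:
    """True if `wavelength` corresponds to a higher photon energy than `reference`."""
    return wavelength_to_kev(wavelength) > wavelength_to_kev(reference)


if __name__ == "__main__":
    for beamline, wavelength in BEAMLINE_WAVELENGTHS.items():
        energy = wavelength_to_kev(wavelength)
        print(f"{beamline}: {wavelength:.4f} Å = {energy:.3f} keV, "
              f"above reference: {is_higher_energy(wavelength)}")
```

Because energy is inversely proportional to wavelength, any wavelength shorter than the reference is necessarily above it in energy; the conversion only makes the size of the margin explicit.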
RESULTS
Structural features of TmCel12A
Because molecular replacement approaches using other
known cellulase structures from the Protein Data Bank
(PDB; www.pdb.org) did not yield a correct solution, the
active-site residue Glu134 was mutated into a cysteine
for efficient binding to mercury compounds. The mutant
protein E134C retained less than 0.2% activity (data not
shown), but crystallized in the same unit cell as did the
wild-type TmCel12A. Twelve mercury-based heavy-atom
derivatives of the mutant crystal were obtained and used
in solving the structure by MIR methods. Five structures
are presented here: (1) wild-type without bound substrate, (2)
wild-type soaked with cellotetraose, (3) E134C soaked with
cellobiose, (4) E134C soaked with cellotetraose, and (5) E134C
cocrystallized with cellobiose in a different unit cell. Data
collection and refinement statistics are listed in Table I.
Each crystal structure contains two protein monomers
as its asymmetric unit, except for the E134C-cellobiose
cocrystal, which contains four monomers. All 12 polypeptide
chains seen in these crystals are continuous from the N- to the C-terminus. Two of them include a four-residue extension from the N-terminus due to the engineered His-tag.
(The models are summarized in Supporting Information
Table SII). The root-mean-square deviations (RMSD) of
Cα-positions between different monomers range from
0.2 Å to 0.4 Å (Supporting Information Table SIII), which
are comparable to but larger than those between the crystallographically equivalent monomers (all less than 0.2 Å).
Whether native or mutant, substrate-bound or free, the
enzyme tends not to undergo significant conformational
changes except for side-chain rotations (Supporting Information Fig. S1). The largest change was a flipping-over of
the Lys216-Asp217 peptide bond by 180°. On the other
hand, in all structures the dihedral angles of Tyr61 (φ =
68.8° ± 3.4° and ψ = −63.9° ± 3.7°, expressed as mean ± standard deviation) fell into the disallowed region for nonglycine residues. This special conformation of γ-turn23 is
unambiguously defined, as can be judged from the clear
electron densities and backbone interactions with its
neighboring residues (Supporting Information Fig. S2).
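The backbone dihedral angles quoted above for Tyr61 are torsion angles defined by four consecutive backbone atoms (φ: C of the previous residue, N, Cα, C; ψ: N, Cα, C, N of the next residue). The sketch below shows one common way to compute such an angle and to summarize several measurements as mean ± standard deviation. It is an illustration only: the function names are hypothetical, no coordinates from the deposited structures are used, and the choice of the sample standard deviation is an assumption, since the text does not say which estimator was applied.

```python
# Minimal sketch: torsion (dihedral) angle from four points, plus mean and SD.
# Function names are hypothetical; no coordinates from the paper are used here.
import math
from statistics import mean, stdev


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle (degrees, -180 to 180) about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def mean_and_sd(angles):
    """Arithmetic mean and sample SD; adequate only for angles far from +/-180."""
    return mean(angles), stdev(angles)


if __name__ == "__main__":
    # Toy geometry: the fourth point is rotated by a quarter turn about the z axis.
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

A plain arithmetic mean is reasonable for tightly clustered values such as those reported for Tyr61; angles that straddle ±180° would need circular statistics instead.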
As shown in Figure 1, the overall jelly-roll fold of
TmCel12A is similar to those of other GH11- and GH12-family enzymes. The two curved β-sheets A and B are
packed against each other and linked by interconnecting
loops. The outer Sheet A of six strands is bent by about 70°.
The inner Sheet B of nine strands is bent by nearly 130° and
it is also highly twisted. The two catalytic residues Glu134
and Glu231 are embedded in the juxtaposed β-strands B5 and B4, respectively. The active-site cleft is formed by this
inner Sheet B and its associated loops, whereas the outer
Sheet A mainly serves a structural role. Two cross-over loops
that connect the β-strand B3 to A5 (residues 70–93) and A6
to B4 (197–225) are significantly longer than the loop (or
"cord") between B6 and B9 (142–149). The Loop A6-B4
also encompasses a three-turn α-helix (198–209; Fig. 1).
Interestingly, the β-ribbon formed by the outermost strands
B8 and B9 is severely twisted by 180° and the C-terminal
region of B9 forms anti-parallel β-sheet interactions with
A6, thus extending the size of the six-stranded outer Sheet A
by two strands. Such extension may contribute to the protein’s stability at high temperatures.
Comparison with other GH12 enzymes
Based on the source organisms, three major categories
of GH12-family cellulases are found in nature: archaeal,
bacterial, and fungal (CAZy; www.cazy.org). When the
eight structures of 2NLR (Streptomyces lividans),6 1H8V
(Hypocrea jecorina),10 1KS5 (Aspergillus niger),14 1OA3
and 1OA4 (Hypocrea schweinitzii and Streptomyces
sp.11AG8),12 1UU6 (Humicola grisea),13 2BWA (Rhodothermus marinus),9 and 2JEN (Bacillus licheniformis)7
from the PDB (www.pdb.org) are superimposed, it is
clear that the bacterial enzymes are more diverse (Supporting Information Table SIV and Fig. S3). The structure of 1OA4 is virtually identical to that of 2NLR
(RMSD = 0.49 Å) because both enzymes are from the
same bacterial genus Streptomyces. These are significantly
different from the other two bacterial enzymes of 2BWA
and 2JEN (RMSD = 1.36–1.51 Å) and the four fungal
enzymes (RMSD = 1.40–1.74 Å). Likewise, the structures
of 1OA3 and 1H8V, both from Hypocrea, are nearly identical (RMSD = 0.34 Å), but they also differ relatively little
from those of 1KS5 and 1UU6 (RMSD = 0.83–
1.12 Å). Superposition of the structure of TmCel12A with
the above eight structures showed RMSDs of 1.53–1.75 Å,
placing it among the bacterial GH12-family enzymes.
The major differences are between the connecting loops,
whereas the β-strands constituting the jelly-roll scaffold
are largely conserved, with RMSDs of about 1.0 Å
for 130–160 Cα-positions (Supporting Information
Table SIV).
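The RMSD values used in this comparison are root-mean-square distances between matched Cα positions after the structures have been superimposed. As a minimal sketch, assuming that the two coordinate lists are already optimally superimposed and listed in matching order, the quantity can be computed as shown below. The superposition step itself is not shown, and the function name is hypothetical rather than taken from any program cited in the text.

```python
# Minimal sketch: RMSD between two matched, already superimposed coordinate sets.
# The function name is hypothetical; the superposition itself is not performed.
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same unit as the input) of paired points."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy example: every point is displaced by exactly 1 Å along z.
    set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(set_a, set_b))
```

Because the value depends on which atoms are paired, RMSDs computed over the conserved β-strands only and over all matched positions are not directly comparable.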
The most prominent structural differences between
these enzymes occur in the Loops B2-A2, A3-B3, B4-A4,
B3-A5, B5-B6, B8-B7, B7-A6, and B8-B9 (Supporting Information Fig. S4). The first three loops overhang the
central cleft of the protein molecule and form a major
part of the substrate-binding site. They are more variable
in the bacterial enzymes than in the fungal enzymes, as
are the overall structures (Supporting Information Fig.
S3). The Streptomyces and Rhodothermus enzymes
(2NLR/1OA4 and 2BWA) appear comparatively more
similar to each other than to the Bacillus enzyme (2JEN),
but the Thermotoga enzyme TmCel12A is the most different. As shown in Figure 2(A), the Loop B2-A2 is longer
and is shifted toward the neighboring Loop A1-B1; the
Loop A3-B3 is much longer and displaces part of Loop
B2-A2 of the other structures; and the Loop B4-A4 is
also longer and displaces part of Loop A3-B3. The crossover Loop B3-A5 and the adjacent Loop B5-B6 also form
part of the distal substrate-binding subsites −3 and −4,
and supposedly account for specificities regarding different substrate lengths at the nonreducing end. Slight
movements in the Loop B8-B7 are dependent on the
interactions, if present, with bound substrates. This loop
is longer in 1KS5 and 2JEN, the latter of which is a xyloglucanase for branched substrates.7 The shift in Loop
B7-A6, as compared with other Cel12 structures, is a
result of interactions with the 180°-twisted β-strands B8
and B9, which extend the outer β-sheet A beyond strand
A6. Such an extension has never been observed before in
this enzyme family.
Like the known structures of Cel12 from the other
species, the interior of the jelly-roll β-sandwich of
TmCel12A is filled by hydrophobic residues that constitute the protein core, including a large fraction of aromatic amino acids, mostly phenylalanine. The central
cleft is also lined with several tryptophan and tyrosine side chains (Trp26, 75, 118, 138, 176, 178, and
Tyr61, 65, 180), which interact with the substrate molecule. There are more aromatic amino-acid residues in the
cleft of the bacterial enzymes when compared with the
fungal enzymes, as reflected by the presence of four to
eight tryptophan residues in the former and two or three
in the latter. Tyr61 is unique to TmCel12A (see below)
whereas Tyr65 is strictly conserved. The equivalent to
Tyr180 is a valine in all others except for the Rhodothermus enzyme, which also has a Tyr163. In TmCel12A,
Gly233 replaces either a tryptophan or a phenylalanine
residue in the other enzymes [e.g., Trp205 in 2NLR and
Phe202 in 1H8V; Fig. 2(B)]. This occurs at the junction
between strand B4 and Loop B4-A4, and makes the peptide chain deviate by 150° from all others. The longer
Loop B4-A4 of TmCel12A also extends its overhang from
the central cleft by about 7 Å. To compensate for the
shifted Loop B4-A4, the side chain of Leu109 is rotated
slightly outward to fill the space but still makes hydrophobic interactions with the core residues. The next residue Pro110, instead of a hydrophilic residue in the other
enzymes, gives some rigidity to the loop structure. On
the other hand, Loop B6-B9, or the "cord," is structurally
more conserved, varying by no more than a single
residue in its length.

Figure 1
The protein fold of TmCel12A. (A) A ribbon diagram of the model is shown in a stereoscopic view. The secondary structural elements and loops
are spectrum-colored from blue (N-terminus) to red (C-terminus) according to their positions in the amino-acid sequence. The β-strands are
organized into two large, mostly anti-parallel sheets A and B that pack against each other. Between strands A6 and B4 lies the only α-helix in the
structure, which is shown in orange. (B) The protein topology is depicted as a schematic diagram. (C) The model is rotated about 90° to show the
active-site cleft and how the twisted strands B8 and B9 (yellow) in one β-sheet associate with the outermost strand A6 (orange) of the other.

Figure 2
Unique loops in TmCel12A. (A) The A-chain model of the mutant E134C crystal soaked with cellobiose is superimposed onto other known
structures of GH12-family enzymes. The colors used here are red for TmCel12A, blue for 1H8V and 1OA3, cyan for 1KS5, green for 1UU6, gray for
2NLR and 1OA4, orange for 2BWA, and pink for 2JEN. The bound cellobiose molecules are shown as stick models with yellow carbons. The view
is approximately orthogonal from that in Figure 2. (B) A close view of the region shows where the Loop B4-A4 of TmCel12A deviates from its
equivalent loops in the other structures. This occurs at Gly233, which replaces an aromatic amino-acid residue (Trp or Phe) in the others. The B4-A4 loop makes a sharp turn here, and results in a very different course. The neighboring Loop A3-B3 shows even larger deviation from its
equivalents in the other enzymes.
The cleft-bound cellotetraose
When the wild-type TmCel12A crystals were soaked
with cellotetraose, each of the two enzyme molecules in
the asymmetric unit bound to one molecule of the substrate in the active-site cleft. By comparing the cellotetraose-bound and the unbound enzyme structures, the
sugar was found to displace eleven water molecules in the
active site of one TmCel12A and nine in the other. Among
these active-site water molecules, eight occupied equivalent
positions. Furthermore, 6 of the 8 conserved waters correspond to 6 of the 14 hydroxyl-group positions in the
bound cellotetraose molecule. Although specific interactions between the enzyme and the substrate are probably
the major determinants for cellulose binding to TmCel12A,
the displaced water molecules can nonetheless increase the
entropy that favors the enzyme-substrate association, especially at an elevated temperature. On the other hand, the
side chain of Arg60 lacked strong electron densities in the
native crystal but became more ordered in the presence of
bound substrate (Supporting Information Fig. S2). The average temperature factors of the two Arg60 residues are
44.1 Å² in the native crystal and 17.3 Å² in the cellotetraose
complex. The guanidine groups have an average of 55.6 Å²
in the former and 19.9 Å² in the latter. When these are
compared with the overall protein temperature factors of
30.1 Å² and 14.7 Å² (Table I), it is evident that Arg60 is significantly stabilized by binding to substrate, especially for
its side chain. The poorly and well defined guanidine
groups in the native and complex crystals differ by a nearly
180° rotation (Supporting Information Fig. S2). This
Arg60 is located in the Loop A3-B3, which is longer than
its equivalents in the other enzymes, protrudes conspicuously over the active-site cleft, and is unique to TmCel12A.
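The comparison of the Arg60 temperature factors with the overall protein averages can be made explicit by dividing each residue value by the protein average for the same crystal, so that crystals with different overall B-factors are put on a common scale. The sketch below does this with the numbers quoted in the text and also converts an isotropic B-factor to a root-mean-square displacement through the standard relation B = 8π²⟨u²⟩. The function and dictionary names are hypothetical, and this normalization is an illustrative choice rather than a procedure described in the paper.

```python
# Minimal sketch: relative B-factors and the displacement implied by B = 8*pi^2*<u^2>.
# Names are hypothetical; the numeric values are those quoted in the text (in Å^2).
import math


def relative_b(residue_b: float, overall_b: float) -> float:
    """Residue B-factor divided by the overall protein B-factor of that crystal."""
    if overall_b <= 0:
        raise ValueError("overall B-factor must be positive")
    return residue_b / overall_b


def rms_displacement(b_factor: float) -> float:
    """Root-mean-square displacement (Å) for an isotropic B-factor (Å^2)."""
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


ARG60_B = {"native": 44.1, "cellotetraose complex": 17.3}
ARG60_GUANIDINE_B = {"native": 55.6, "cellotetraose complex": 19.9}
PROTEIN_B = {"native": 30.1, "cellotetraose complex": 14.7}

if __name__ == "__main__":
    for crystal, overall in PROTEIN_B.items():
        print(f"{crystal}: Arg60 ratio {relative_b(ARG60_B[crystal], overall):.2f}, "
              f"guanidine ratio {relative_b(ARG60_GUANIDINE_B[crystal], overall):.2f}")
```

A ratio that moves toward 1 on substrate binding indicates that the residue has become about as well ordered as the rest of the protein, which is the trend the text describes for Arg60.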
The two bound cellotetraose molecules show an
RMSD of 0.178 Å for the 45 non-hydrogen atoms, indicating an almost identical means of binding. The four β-glucose residues occupy the −2, −1, +1, and +2 subsites of the central cleft. However, the occupancy may
vary significantly, as reflected by the average temperature
factors of these residues, 12.4 Å² (−2), 18.4 Å² (−1),
35.3 Å² (+1), and 44.5 Å² (+2), and also by the corresponding strength of electron density (Supporting Information Fig. S5). Previous studies showed that TmCel12A
was most active at 95°C and pH 5.2,3 Because the wild-type enzyme is supposed to remain active in the crystal,
given the high cellotetraose concentration and length of
incubation, hydrolysis of the substrate molecules should
have occurred during the soaking time. Nevertheless, 14
direct hydrogen bonds between the enzyme and the substrate can be unambiguously identified by analyzing the
refined crystal structure. As shown in Figure 3, the −2
sugar residue at the non-reducing end is sandwiched by
the large aromatic side chains of Trp26 and Trp75, and
apparently stabilized by the strong stacking interactions
with its six-atom sugar ring. In addition, it makes three
direct hydrogen bonds and at least two indirect hydrogen
bonds (not shown), mediated by conserved water molecules that were observed in both complex structures.
Judging by the extensive interactions between the −2 residue and the enzyme, it is not surprising that this most
tightly bound residue had the strongest density and the
lowest temperature factor.
The adjacent −1 residue, which is to be attacked by
the nucleophile Glu134, makes four direct hydrogen
bonds with the enzyme. One carboxyl oxygen atom of
Glu134 makes a bond of 2.8–2.9 Å to the O2 atom of
the −1 residue, and the other carboxyl oxygen is 3.6 Å
from the anomeric carbon C1 of the sugar, ready for
nucleophilic attack (see Fig. 3). This latter oxygen also
makes a short hydrogen bond to the side chain of
Glu116 (not shown), which is to deprotonate the nucleophile in the catalytic reaction. On the other hand, the
side chain of Glu231 is hydrogen bonded to the O6 atom
of the −1 residue and the O3 and O4 atoms of the +1
residue, and one of the carboxyl oxygen atoms is also
2.6–2.7 Å from O5 of the −1 residue. In the first half-reaction, Glu231 stabilizes the negatively charged transition state by providing a proton; in the second half-reaction, it serves as a catalytic base to activate a water molecule, which is supposed to take over the current position
of the O4 atom (see Fig. 4). Of particular note is that
the side-chain guanidine group of Arg60 makes hydrogen
bonds to all three sugar residues −2, −1, and +1, and
its backbone carbonyl group makes a hydrogen bond to
the O6 of residue +1. Besides, the shortest distance
between the side chains of Arg60 and Trp178 is 3.8–3.9
Å. In this way an enclosure is formed around the −1
cleavage site, presumably important for efficient catalysis
by holding the −1 sugar residue in place. Because the
+1 and +2 subsites are more open to the solvent, the
bound sugar residues are less ordered (Supporting Information Fig. S5), despite the six direct hydrogen bonds to
the +1 sugar and the stacking interaction of Tyr61 with
the +2 sugar (see Fig. 3). Interestingly, the O3 atom of
−1 makes a hydrogen bond to the O5 of −2 (not
shown) and so does the O3 of +2 to the O5 of +1,
resulting in a similar conformation of the two disaccharide units.
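Direct hydrogen bonds such as those listed above are usually first shortlisted by a simple distance criterion between polar atoms of the protein and the ligand. The sketch below illustrates that screening step only. The 3.5 Å cutoff, the function name, and the coordinates are assumptions made for illustration; the coordinates are invented so that one pair lies 2.8 Å apart, and a real analysis would also check donor and acceptor identity and geometry.

```python
# Minimal sketch: distance-based shortlist of possible hydrogen-bond partners.
# The cutoff, names and coordinates are illustrative assumptions, not paper data.
import math


def close_contacts(protein_atoms, ligand_atoms, cutoff=3.5):
    """Return (protein atom, ligand atom, distance in Å) for pairs within cutoff.

    Both arguments map an atom label to an (x, y, z) tuple in Å.
    The result is sorted by increasing distance.
    """
    hits = []
    for p_name, p_xyz in protein_atoms.items():
        for l_name, l_xyz in ligand_atoms.items():
            distance = math.dist(p_xyz, l_xyz)
            if distance <= cutoff:
                hits.append((p_name, l_name, round(distance, 2)))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    # Made-up coordinates: one pair placed 2.8 Å apart, the others out of range.
    protein = {"GLU134_OE2": (0.0, 0.0, 0.0), "DUMMY_N": (10.0, 0.0, 0.0)}
    ligand = {"GLC-1_O2": (2.8, 0.0, 0.0), "DUMMY_O": (10.0, 5.0, 0.0)}
    for contact in close_contacts(protein, ligand):
        print(contact)
```

A purely distance-based filter overcounts, because it also reports close polar contacts that cannot form a hydrogen bond, so its output is best treated as a list of candidates for inspection.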
The E134C-sugar complexes
Two cellobiose molecules were bound to each
TmCel12A molecule when the mutant crystals of E134C
were soaked with the disaccharide (Supporting Information Fig. S6). One cellobiose molecule occupied the −2
and −1 subsites and the other occupied the +1 and +2
subsites of the enzyme. The RMSD between the bound
sugars in the two crystallographically independent
E134C-cellobiose complexes is 0.139 Å for 46 non-hydrogen atoms. When the E134C crystal was soaked with cellotetraose, the bound sugar residues occupied the −3
subsite in addition to the −2, −1, +1, and +2 sites
(Supporting Information Fig. S6). At the reducing end of
the binding site, some densities beyond the O1 atom of
the +2 sugar were also seen. Because the mutant enzyme
is inactive, the bound sugars should remain intact. Consequently, the observed densities might represent two
bound cellotetraose molecules, but one was modeled as a
cellotriose and the other as a cellobiose due to disorder
at both ends of the substrate-binding cleft. The RMSD
between the two independent five-sugar-residue models
from −3 to +2 is 0.240 Å for 57 non-hydrogen atoms.

Figure 3
The wild-type TmCel12A-cellotetraose complex. The four units of β-glucose are shown as heavy stick models with gray carbons and labeled from
−2 to +2 according to the subsites. Some surrounding amino-acid residues of TmCel12A are shown as thin stick models with green carbons. The
dashed lines denote the direct hydrogen bonds between the enzyme and the substrate.

In the E134C-cellobiose cocrystal, each of the four protein molecules in the asymmetric unit had its −2
and −1 subsites occupied by a cellobiose molecule. The
RMSD varies from 0.071 to 0.136 Å between the four cellobiose models. Weak densities were also observed in
the region of +1 and +2 subsites in three of the four
protein molecules, but they could only be modeled as a
β-glucose molecule plus a few waters (Supporting Information Fig. S7). The RMSD ranges from 0.094 Å to
0.205 Å. The lower occupancies might be due to the
absence of substrate in the cryoprotectant solution (see
Materials and Methods).
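The numbers of non-hydrogen atoms quoted for the sugar models (45, 46, and 57) follow from the composition of β-1,4-linked glucose chains: a free glucose has 12 non-hydrogen atoms (6 C and 6 O), and each glycosidic bond removes one oxygen as water. The short sketch below reproduces these counts as a consistency check; the function name is hypothetical.

```python
# Minimal sketch: non-hydrogen atom count of a linear cello-oligosaccharide.
# The function name is hypothetical; the chemistry is standard glucose composition.


def heavy_atoms(n_glucose: int) -> int:
    """Non-hydrogen atoms in a linear chain of n glycosidically linked glucoses."""
    if n_glucose < 1:
        raise ValueError("need at least one glucose unit")
    atoms_per_glucose = 12            # C6H12O6: 6 carbons + 6 oxygens
    oxygens_lost = n_glucose - 1      # one O leaves as water per glycosidic bond
    return atoms_per_glucose * n_glucose - oxygens_lost


if __name__ == "__main__":
    print("cellotetraose:", heavy_atoms(4))
    print("two cellobiose molecules:", 2 * heavy_atoms(2))
    print("cellotriose + cellobiose:", heavy_atoms(3) + heavy_atoms(2))
```

The same arithmetic accounts for the carbohydrate atom totals in Table I, for example two cellotetraose molecules giving 90 atoms, and four cellobiose plus three glucose molecules giving 128 atoms in the cocrystal.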
Despite their different lengths, the models of bound
sugars in the E134C mutant crystals superimpose well on
one another, as shown in Figure 5(A). Similar to those
observed in the cellotetraose complex structure of the
wild-type enzyme, the sugar residues bound to the 22
and 21 subsites of the mutant appear to be the most
stable, judging by their individual average temperature
factors (Supporting Information Figs. S6 and S7). The
stability of the bound substrate residues in the subsites
may have the order of 22 > 21 > 11 > 12 > 23,
which also matches well with the corresponding strength
of electron density. Unlike its succeeding sugar units, the
23 residue observed in the E134C-cellotetraose structure
is much exposed to the solvent (Supporting Information
Fig. S8), with its 6-hydroxyl group making a single direct
hydrogen bond to the side chain of Glu76. Interactions
in the other subsites are similar to those in the wild-type
complex, including the sandwiched stacking of the −2 sugar
by Trp26 and Trp75, the four direct hydrogen bonds of
Arg60 to three sugar residues, and the single-sided stacking of Tyr61 with the +2 residue. However, there are a
few exceptions, particularly in the −1 subsite, which will
be detailed below. It is also worth noting that in every
bound glucose residue, the 6-hydroxyl group consistently
forms at least one direct hydrogen bond to the
TmCel12A protein, no matter whether the enzyme is
wild-type or a mutant. Specifically, the O6 atoms of the
glucose residues from −3 to +2 are hydrogen bonded to
Glu76 OE2, Arg60 NH1, Trp26 NE1/Glu231 OE1, Arg60
NH2, and Thr145 N, respectively. These O6 atoms no longer interact
with the O2 atoms of their preceding glucose residues
of the same chain and the O3 atoms of the neighboring
chains as observed in crystalline cellulose.24 The O2
atoms of residues −2, −1, and +1 also form separate
hydrogen bonds to Asn24 ND2, Glu134 OE2, and
Thr145 O of the wild-type enzyme (see Fig. 3).

Figure 4
Catalytic mechanism of TmCel12A. Here the two-step reaction typical of a retaining enzyme is depicted in a schematic diagram. The two glucose
residues correspond to those bound to the −1 and +1 subsites. The acidic side chain of Glu116 adjacent to the nucleophile Glu134 is believed to
maintain a negative charge at low pH values.
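
The ranking of subsite stability above rests on average temperature factors of the individual sugar residues. As a rough illustration of how such averages can be obtained from a coordinate file, the sketch below reads the B-factor field (columns 61 to 66) of fixed-column PDB ATOM/HETATM records and averages it per residue. The helper `toy_record`, the chain and residue numbers, and the B values are all hypothetical and serve only to make the example self-contained; they are not data from the deposited entries.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

ResidueKey = Tuple[str, int, str]  # (chain ID, residue number, residue name)

def mean_b_per_residue(lines: Iterable[str]) -> Dict[ResidueKey, float]:
    """Average the B-factor column over all atoms of each residue."""
    values: Dict[ResidueKey, List[float]] = defaultdict(list)
    for line in lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        # Fixed PDB columns: resName 18-20, chainID 22, resSeq 23-26, B 61-66.
        key = (line[21], int(line[22:26]), line[17:20].strip())
        values[key].append(float(line[60:66]))
    return {key: sum(b) / len(b) for key, b in values.items()}

def toy_record(serial: int, atom: str, res: str, chain: str,
               resseq: int, b: float) -> str:
    """Build a fixed-column HETATM line with dummy coordinates (illustration only)."""
    return (f"HETATM{serial:5d} {atom:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}")

# Hypothetical B-factors, NOT values from the TmCel12A structures.
toy = [
    toy_record(1, "C1", "BGC", "X", 1, 10.0),
    toy_record(2, "O6", "BGC", "X", 1, 14.0),
    toy_record(3, "C1", "BGC", "X", 2, 20.0),
    toy_record(4, "O6", "BGC", "X", 2, 30.0),
]
# Print residues from lowest to highest mean B-factor.
for key, b in sorted(mean_b_per_residue(toy).items(), key=lambda kv: kv[1]):
    print(key, f"{b:.1f}")
```

Sorting by the mean value lists the best-ordered residue first, which is the kind of comparison used in the text to rank the subsites.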
In all TmCel12A-substrate complex structures, the two
glucose residues bound to the −2 and −1 subsites of the
enzyme are the most clearly visible. The RMSD between
the disaccharide models is 0.08 Å for the two wild-type
enzyme complexes and it ranges from 0.07 to 0.20 Å
between the eight E134C mutant complex models. By
contrast, a marked increase in RMSD is seen between the
disaccharides in the wild-type and the mutant structures,
ranging from 0.65 to 0.72 Å (Supporting Information Table SV). While all models of the −2 sugar residue are
nearly identical to one another, those of the −1 residue
show a distinct structural difference at the C1 atom. The
−1 glucose residue has the β-anomeric configuration
in the two wild-type complex structures but has the
α-anomeric configuration in all eight complexes of the
E134C mutant. For a D-glucopyranoside the β-anomer is
more stable than the α-anomer. The two anomers are
interconvertible in aqueous solution, with a β:α ratio of about 2:1.
The E134C mutant probably prefers binding the α-anomer
at the −1 subsite, which corresponds to the substrate
cleavage site. The high substrate concentration in the crystallization solution should have allowed
binding to full occupancy. The O3 and O6 atoms
remained hydrogen bonded to Arg60, Trp26 and Glu231
as in the wild-type structure, but the substitution of
Glu134 by Cys134 abolished the hydrogen bond to the O2
atom. As shown in Figure 5(B), the original position of
Glu134 OE2 is filled in by a water molecule, which forms
three hydrogen bonds to Cys134 SG, Trp173 NE1 and
the sugar’s O2 atom. Interestingly, the sugar ring is
slightly rotated and the C1 atom is thus shifted by about
1 Å toward the original nucleophile residue, now Cys134.
The O1 atom of the sugar, which is an α-anomer, is
about 3.5 Å from Cys134 SG. It is located half-way
between Glu134 OE1 (not OE2) of the wild-type enzyme
and the sugar’s C1 atom, and replaces the Glu134 OE1
atom in forming a hydrogen bond to Glu116. In view of
the double displacement mechanism of a retaining
enzyme,15 the bound substrate models in the E134C
complex structures may mimic the catalytic intermediate
configuration.
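
To put the 2:1 anomer ratio mentioned above in energetic terms, the standard relation ΔG° = −RT ln K can be applied. The sketch below is only a back-of-the-envelope illustration: it reads the 2:1 ratio as K = [β]/[α] and assumes a temperature of 298.15 K, neither of which is specified in the text, and the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)

def delta_g_from_ratio(ratio: float, temperature_k: float = 298.15) -> float:
    """Standard free-energy difference in kJ/mol for an equilibrium ratio K.

    A negative value means the species in the numerator of K is favoured.
    The default temperature is an assumption, not a value from the paper.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return -R * temperature_k * math.log(ratio) / 1000.0

print(f"{delta_g_from_ratio(2.0):.2f} kJ/mol")
```

Under these assumptions the difference is about −1.7 kJ/mol, less than RT at room temperature, so a modest number of specific contacts in the −1 subsite could plausibly be enough to select the less populated α-anomer.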
Figure 5
The E134C-substrate complexes. (A) The enzyme-bound substrate molecules in the E134C crystals are superimposed and shown as thin stick
models. The models with carbon atoms colored blue and green are from the soaked cellobiose and cellotetraose complexes. Those with yellow
carbons are from the cocrystallized cellobiose complex. Note that all of the −1 sugars show the same α-anomeric configuration. (B) The E134C
cellobiose-soaking crystal structure is superposed onto that of the wild-type cellotetraose complex. Carbon atoms in the E134C model are colored
in pink and green for the protein and substrate, and those of the wild-type are in gray and cyan. Conserved hydrogen bonds in both complexes are
shown as black dash lines. New bonds in the mutant structure and a mediating water molecule are colored in magenta.

DISCUSSION
Two approaches are taken in structure determination
of a protein crystal: experimental phasing and molecular
replacement.25 Because the number of structures in the
PDB is increasing rapidly, it becomes more likely to find
a homologous structure for studying a new protein. By
aligning the protein sequences and substituting the side
chains a model can usually be created to yield some information about, for example, active-site environment
and inter-molecular interactions. The accuracy of the
resulting model, however, is dependent on the extent of
sequence identity. Although about two-thirds of structures in the PDB were solved by molecular replacement,26 isomorphous replacement and anomalous dispersion still remain in wide use. One reason is that the
homologous proteins contain significant variations especially in the loop regions, despite their common folds.
Other reasons can be conformational changes, oligomer
formation, or different crystal packing, which may affect
the accuracy of Patterson function search.
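
Since the usefulness of a homologous search model depends on sequence identity, it may help to see how that figure is typically computed from a pairwise alignment. The sketch below counts identical residues over the aligned columns in which neither sequence has a gap. This denominator is one common convention and is an assumption here, because the text does not say which definition was used; the aligned fragments are invented for illustration and are not GH12 sequences.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

# Hypothetical aligned fragments (NOT TmCel12A or any real cellulase).
print(f"{percent_identity('MKV-LTAEG', 'MRVQLT-DG'):.1f}%")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower percentages for gapped alignments, so identity values are only comparable when the same definition is used.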
The amino-acid sequence of TmCel12A has less than
20% identity to those of other bacterial cellulases and
less than 14% to the fungal enzymes (Supporting Information Table SVI). It turns out that the proteins share
no more than a common fold and a few conserved
active-site residues such as Glu134 and Glu231. Significant variations occur in most connecting loops between
the β-strands and also in some β-strands in the two
jelly-roll sheets. Consequently it is not surprising that
our molecular replacement search failed to yield a correct
solution. Because the catalytic nucleophile of TmCel12A
had been identified to be Glu134 by sequence comparison, it was mutated to a cysteine for preparation of heavy
atom derivatives. As expected, a major site was located
adjacent to each Cys134 side chain. In general, the active
site of an enzyme is more easily identified by the amino-acid sequence and also more likely to bind to heavy
atoms because some “active” functional groups must be
present there. In the study of the Aspergillus niger enzyme
(1KS5),14 the enzyme’s activity was inhibited by a palladium ion bound to the side chain of the nucleophile
Glu116 in the active site, which showed clear electron
density in the Fourier map. Another example is seen in
the structure determination of hexaprenyl pyrophosphate
synthase from Sulfolobus solfataricus, which is homologous to other prenyltransferases but also shows significant
variations.27 In that study, Asp81 of the first aspartate-rich motif in the active site was mutated to Cys81
for binding to mercury-containing compounds. Thus, it
is a good alternative to try experimental phasing by
mutating an active-site residue for heavy-atom binding,
in addition to molecular replacement.
The structure of TmCel12A differs significantly from
those of other GH12 enzymes. The two outermost β-strands of the inner sheet B extend and twist to dock
onto the rim of the outer sheet A. When linked by these
two β-strands (B8 and B9), the two β-sheets A and B are
integrated into something like a flattened β-barrel.
Although it appears to be a feature of TmCel12A,
whether the barrel-like formation would make this structure more stable than that of two individual β-sheets at
higher temperatures remains to be investigated. The
loops connecting the β-strands show variations in their
lengths and dispositions, some apparently giving rise to
different substrate affinity and specificity. The unique
Loop A3-B3, which contains Arg60 and Tyr61, protrudes
on one side of the active-site cleft. Arg60 forms direct
hydrogen bonds with three sugar residues in the −2, −1,
and +1 subsites, and Tyr61 provides stacking interaction
with the +2 sugar. In all eight other enzyme structures
of the same family, an aromatic side chain (phenylalanine
or tryptophan) from Loop B4-A4 occupies the equivalent
space of the Tyr61 side chain. The residue corresponds to
Gly233 in TmCel12A [Fig. 2(B)]. Unlike Tyr61, which stacks
with the +2 sugar, the aromatic group is perpendicular
to the sugar ring, making only weak van der Waals contacts. Aromatic side chain stacking is an important
means of binding to sugar, as observed in lectins and
other proteins.28 The −2 sugar, which is sandwiched
between Trp26 and Trp75, appears to be the most tightly
bound residue due to strong stacking interaction on
both sides of the sugar ring. In addition, other direct and
water-mediated hydrogen bonds, presumably taking over
the original intra- and inter-strand hydrogen bonds in a
cellulose fiber, also contribute to the enzyme’s affinity to
an isolated strand of cellulose.
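
The distinction drawn above between a stacked and a perpendicular aromatic side chain can be expressed as the angle between the two ring planes. The sketch below is a simplified illustration that defines each plane by three ring atoms and folds the angle into the 0 to 90 degree range. Treating a pyranose chair as a plane is an approximation, and the function names and coordinates are hypothetical rather than measured from the structures.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]

def plane_normal(p1: Vec, p2: Vec, p3: Vec) -> Vec:
    """Unit normal of the plane through three non-collinear points."""
    ux, uy, uz = p2[0] - p1[0], p2[1] - p1[1], p2[2] - p1[2]
    vx, vy, vz = p3[0] - p1[0], p3[1] - p1[1], p3[2] - p1[2]
    # Cross product u x v is perpendicular to both in-plane vectors.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("points are collinear")
    return (nx / length, ny / length, nz / length)

def interplanar_angle(ring_a: Sequence[Vec], ring_b: Sequence[Vec]) -> float:
    """Angle in degrees (0 to 90) between planes through the first three atoms of each ring."""
    na = plane_normal(*ring_a[:3])
    nb = plane_normal(*ring_b[:3])
    cos_t = abs(na[0] * nb[0] + na[1] * nb[1] + na[2] * nb[2])
    return math.degrees(math.acos(min(1.0, cos_t)))

# Hypothetical geometries (illustration only).
sugar = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
stacked_ring = [(0.0, 0.0, 3.5), (1.0, 0.0, 3.5), (0.0, 1.0, 3.5)]
upright_ring = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
print(interplanar_angle(sugar, stacked_ring))
print(interplanar_angle(sugar, upright_ring))
```

An angle near 0 degrees corresponds to the face-to-face arrangement described for Tyr61 and the +2 sugar, whereas an angle near 90 degrees corresponds to the edge-on arrangement described for the equivalent aromatic residue in the other GH12 enzymes.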
The α-anomer observed in the E134C-cellobiose and
E134C-cellotetraose complexes, obtained either by soaking or by cocrystallization, may mimic the glycosyl-enzyme intermediate.6,15 The shorter chain lengths of
the bound sugars than expected for cellotetraose in the
complex may not be a result of cleavage because the mutant is inactive. Instead, they reflect the number of subsites in TmCel12A and their binding strengths. Judging
from the electron densities and temperature factors, the
bound sugar residues appear to be more stable in the −2
and −1 subsites than those in the +1, +2, and −3 subsites. The sugar residues beyond −3 and +2 were most
likely disordered rather than cleaved. Consequently, the
E134C mutant tends to bind to two molecules of cellotetraose instead of a single cellotetraose as observed in the
wild-type complex. It suggests that the active-site environment
provided by the substrate-binding cleft of
the enzyme may favor a distorted geometry at the cleavage site especially around the anomeric C1-carbon of
the −1 residue. The enzyme also tends to release the
sugar residues on the reducing end from the active site
once the glycosyl-enzyme intermediate is formed, which
should have an α-anomeric configuration in the −1
sugar. The reaction then proceeds by the attack of
water, assisted by Glu231 (see Fig. 4), at the C1-carbon
and yields a product with its C1 reverted to the β-anomeric configuration.
Enzymes from extremophiles have been studied extensively for their special properties. These properties
make such enzymes highly useful in various applications,
since industrial processes such as plant waste treatment
usually involve high temperature and low pH. Thermostability of TmCel12A could be attributed in part to the
longer β-strands B8 and B9, which associate with A6 of
the other β-sheet. Hydrophobic interactions that hold
the two β-sheets together are also important for stability,11,12,15 whose relationship with the protein sequence
awaits further studies. Although the catalytic residues in
the active site are conserved, and so are some interactions
with the substrate such as stacking of Trp26 with the −2
sugar,9 the environments of the substrate-binding cleft in
the GH12-family enzymes differ from one another due to
large variations in the loop structures. These presumably
determine substrate specificity and catalytic efficiency.
The detailed enzyme-substrate interactions presented
here should provide a basis for subsequent mutagenesis
studies and protein-engineering projects.
ACKNOWLEDGMENTS
The authors are grateful to NSRRC for synchrotron
beam-time allocations and data-collection assistance. The
atomic coordinates and structure factors (codes 3AMH,
3AMM, 3AMN, 3AMP, and 3AMQ) have been deposited
in the Protein Data Bank.
REFERENCES
1. Allgaier M, Reddy A, Park JI, Ivanova N, D’haeseleer P, Lowry S, Sapra R, Hazen TC, Simmons BA, VanderGheynst JS, Hugenholtz P. Targeted discovery of glycoside hydrolases from a switchgrass-adapted compost community. PLoS One 2010;5:e8812.
2. Bronnenmeier K, Kern A, Liebl W, Staudenbauer WL. Purification of Thermotoga maritima enzymes for the degradation of cellulosic materials. Appl Environ Microbiol 1995;61:1399–1407.
3. Liebl W, Ruile P, Bronnenmeier K, Riedel K, Lottspeich F, Greif I. Analysis of a Thermotoga maritima DNA fragment encoding two similar thermostable cellulases, CelA and CelB, and characterization of the recombinant enzymes. Microbiology 1996;142:2533–2542.
4. Nelson KE, Clayton RA, Gill SR, Gwinn ML, Dodson RJ, Haft DH, Hickey EK, Peterson JD, Nelson WC, Ketchum KA, McDonald L, Utterback TR, Malek JA, Linher KD, Garrett MM, Stewart AM, Cotton MD, Pratt MS, Phillips CA, Richardson D, Heidelberg J, Sutton GG, Fleischmann RD, Eisen JA, White O, Salzberg SL, Smith HO, Venter JC, Fraser CM. Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature 1999;399:323–329.
5. Sulzenbacher G, Shareck F, Morosoli R, Dupont C, Davies GJ. The Streptomyces lividans family 12 endoglucanase: construction of the catalytic core, expression, and X-ray structure at 1.75 Å resolution. Biochemistry 1997;36:16032–16039.
6. Sulzenbacher G, Mackenzie LF, Wilson KS, Withers SG, Dupont C, Davies GJ. The crystal structure of a 2-fluorocellotriosyl complex of the Streptomyces lividans endoglucanase CelB2 at 1.2 Å resolution. Biochemistry 1999;38:4826–4833.
7. Gloster TM, Ibatullin FM, Macauley K, Eklöf JM, Roberts S, Turkenburg JP, Bjørnvad ME, Jørgensen PL, Danielsen S, Johansen KS, Borchert TV, Wilson KS, Brumer H, Davies GJ. Characterization and three-dimensional structures of two distinct bacterial xyloglucanases from families GH5 and GH12. J Biol Chem 2007;282:19177–19189.
8. Crennell SJ, Hreggvidsson GO, Nordberg Karlsson E. The structure of Rhodothermus marinus endoglucanase Cel12A, a highly thermostable family 12 endoglucanase, at 1.8 Å resolution. J Mol Biol 2002;320:883–897.
9. Crennell SJ, Cook D, Minns A, Svergun D, Andersen RL, Nordberg Karlsson E. Dimerization and an increase in active site aromatic groups as adaptations to high temperatures: X-ray solution scattering and substrate-bound crystal structures of Rhodothermus marinus endoglucanase Cel12A. J Mol Biol 2006;356:57–71.
10. Sandgren M, Shaw A, Ropp TH, Wu S, Bott R, Cameron AD, Ståhlberg J, Mitchinson C, Jones TA. The X-ray crystal structure of the Trichoderma reesei family 12 endoglucanase 3, Cel12A, at 1.9 Å resolution. J Mol Biol 2001;308:295–310.
11. Sandgren M, Gualfetti PJ, Paech C, Paech S, Shaw A, Gross LS, Saldajeno M, Berglund GI, Jones TA, Mitchinson C. The Humicola grisea Cel12A enzyme structure at 1.2 Å resolution and the impact of its free cysteine residues on thermal stability. Protein Sci 2003;12:2782–2793.
12. Sandgren M, Gualfetti PJ, Shaw A, Gross LS, Saldajeno M, Day AG, Jones TA, Mitchinson C. Comparison of family 12 glycoside hydrolases and recruited substitutions important for thermal stability. Protein Sci 2003;12:848–860.
13. Sandgren M, Berglund GI, Shaw A, Ståhlberg J, Kenne L, Desmet T, Mitchinson C. Crystal complex structures reveal how substrate is bound in the −4 to the +2 binding sites of Humicola grisea Cel12A. J Mol Biol 2004;342:1505–1517.
14. Khademi S, Zhang D, Swanson SM, Wartenberg A, Witte K, Meyer EF. Determination of the structure of an endoglucanase from Aspergillus niger and its mode of inhibition by palladium chloride. Acta Crystallogr 2002;D58:660–667.
15. Sandgren M, Ståhlberg J, Mitchinson C. Structural and biochemical studies of GH family 12 cellulases: improved thermal stability, and ligand complexes. Prog Biophys Mol Biol 2005;89:246–291.
16. Otwinowski Z, Minor W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol 1997;276:307–326.
17. Terwilliger TC, Berendzen J. Automated MAD and MIR structure solution. Acta Crystallogr 1999;D55:849–861.
18. Terwilliger TC. Maximum likelihood density modification. Acta Crystallogr 2000;D56:965–972.
19. Terwilliger TC. Automated main-chain model building by template matching and iterative fragment extension. Acta Crystallogr 2003;D59:38–44.
20. Jones TA, Zou JY, Cowan SW, Kjeldgaard M. Improved methods for the building of protein models in electron density maps and the location of errors in these models. Acta Crystallogr 1991;A47:110–119.
21. Brunger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang JS, Kuszewski J, Nilges M, Pannu NS, Read RJ, Rice LM, Simonson T, Warren GL. Crystallography and NMR system (CNS): a new software system for macromolecular structure determination. Acta Crystallogr 1998;D54:905–921.
22. DeLano WL. The PyMOL molecular graphics system. USA: DeLano Scientific LLC; 2008.
23. Richardson JS. The anatomy and taxonomy of protein structure. Adv Protein Chem 1981;34:167–339.
24. Nishiyama Y, Langan P, Chanzy H. Crystal structure and hydrogen-bonding system in cellulose I-beta from synchrotron X-ray and neutron fiber diffraction. J Am Chem Soc 2002;124:9074–9082.
25. Adams PD, Afonine PV, Grosse-Kunstleve RW, Read RJ, Richardson JS, Richardson DC, Terwilliger TC. Recent developments in phasing and structure refinement for macromolecular crystallography. Curr Opin Struct Biol 2009;19:566–572.
26. Long F, Vagin AA, Young P, Murshudov GN. BALBES, a molecular-replacement pipeline. Acta Crystallogr 2008;D64:125–132.
27. Sun HY, Ko TP, Kuo CJ, Guo RT, Chou CC, Liang PH, Wang AHJ. Homodimeric hexaprenyl pyrophosphate synthase from the thermoacidophilic crenarchaeon Sulfolobus solfataricus displays asymmetric subunit structures. J Bacteriol 2005;187:8137–8148.
28. Rudiger H, Gabius HJ. Plant lectins: occurrence, biochemistry, functions and applications. Glycoconj J 2001;18:589–613.