Domain evolution in the GH13 pullulanase subfamily

Biologia 63/6: 1057—1068, 2008
Section Cellular and Molecular Biology
DOI: 10.2478/s11756-008-0162-4
Domain evolution in the GH13 pullulanase subfamily with focus
on the carbohydrate-binding module family 48
Martin Machovič1 & Štefan Janeček1,2*
1 Institute of Molecular Biology, Slovak Academy of Sciences, Dúbravská cesta 21, SK-84551 Bratislava, Slovakia; e-mail: [email protected]
2 Department of Biotechnologies, Faculty of Natural Sciences, University of SS. Cyril and Methodius, Nám. J. Herdu 2, SK-91701 Trnava, Slovakia
Abstract: Glycoside hydrolase (GH) family 13 comprises about 30 different specificities. Four of them have been proposed
to form the GH13 pullulanase subfamily: pullulanase, isoamylase, maltooligosyl trehalohydrolase and branching enzyme
forming the seven CAZy GH13 subfamilies: GH13_8–GH13_14. Recently, a new family of carbohydrate-binding modules
(CBMs), the family CBM48, has been established, containing the putative starch-binding domains from the pullulanase
subfamily, the β-subunit of AMP-activated protein kinase and some other GH13 enzymes with pullulanase and/or α-amylase-pullulanase specificity. Since all of these enzymes are multidomain proteins and the structure for at least one
representative of each enzyme specificity has already been determined, the main goal of the present study was to elucidate
domain evolution within this GH13 pullulanase subfamily (84 real enzymes) focusing on the CBM48 module. With regard
to CBM48 positioning in the amino acid sequence, the N-terminal end of a protein appears to be a predominant position.
This is especially true for isoamylases and maltooligosyl trehalohydrolases. Secondary structure-based alignment of CBM
modules from CBM48, CBM20 and CBM21 revealed that several residues known as consensus for CBM20 and CBM21
could also be identified in CBM48, but only branching enzymes possess the aromatic residues that correspond with the
two tryptophans forming the evolutionary conserved starch-binding site 1 in CBM20. The evolutionary trees constructed
for the individual domains, complete alignment, and the conserved sequence regions of the α-amylase family were found to
be comparable to each other (except for the C-domain tree) with two basic parts: (i) branching enzymes and maltooligosyl
trehalohydrolases; and (ii) pullulanases and isoamylases. Taxonomy was respected only within clusters with pure specificity,
i.e. the evolution of CBM48 reflects the evolution of specificities rather than evolution of species. This is a feature different
from the one observed for the starch-binding domain of the family CBM20 where the starch-binding domain evolution
reflects the evolution of species.
Key words: α-amylase enzyme family; pullulanase subfamily; starch-binding domain; domain evolution; evolutionary tree.
Abbreviations: AMPK, AMP-activated protein kinase; CBM, carbohydrate-binding module; GBD, glycogen-binding domain; GBE, glycogen branching enzyme; GH, glycoside hydrolase; IAM, isoamylase; MOTH, maltooligosyl trehalohydrolase;
PUL, pullulanase; RMSD, root mean square deviation; SBD, starch-binding domain; SBE, starch branching enzyme.
Introduction
The α-amylase family (Kuriki & Imanaka 1999; MacGregor et al. 2001) forms within the CAZy classification (Coutinho & Henrissat 1999a) the clan GH-H of
three glycoside hydrolase (GH) families: GH13, GH70
and GH77. Despite an extremely large number of available sequences (more than 4,500 CAZy entries) and almost 30 different enzyme specificities, the α-amylase
family members possess several typical common features. These are (Svensson 1994; Kuriki & Imanaka
1999; MacGregor et al. 2001; Janecek 1997, 2002a; van
der Maarel et al. 2002; MacGregor 2005; Kuriki et al.
2006; Seo et al. 2008): (i) acting on α-glucosidic linkages, i.e. their hydrolysis or formation by transglycosylation; (ii) possessing from 4 up to 7 conserved sequence
regions; (iii) adopting the parallel (β/α)8-barrel domain
(i.e. TIM-barrel) as a fold for the catalytic domain; (iv)
using the catalytic triad machinery consisting of aspartic acid, glutamic acid and aspartic acid at the TIM-barrel β-strands β4, β5 and β7, respectively; and (v)
employing the retaining reaction mechanism.
Based on detailed previous evolutionary studies,
several subfamilies of the α-amylase family were identified, such as the oligo-1,6-glucosidase and neopullulanase subfamilies (Oslancova & Janecek 2002), cyclodextrin glucanotransferase and pullulanase subfamilies (Janecek et al. 2007). The main GH α-amylase
family, family GH13, has recently been divided by the
CAZy curators into subfamilies (both monofunctional
* Corresponding author
c 2008 Institute of Molecular Biology, Slovak Academy of Sciences
and polyspecific) in order to establish robust groups
exhibiting an improved correlation between sequence
and enzymatic specificity (Stam et al. 2006). This trend
is opposite to earlier efforts, which focused on grouping
related families into clans (Henrissat & Bairoch 1996).
The members of the individual GH13 subfamily share in
general a closer relatedness to each other than to the remaining GH13 family members, i.e. higher similarity in
sequence, structure, specificity, and/or even taxonomy
(Oslancova & Janecek 2002; Stam et al. 2006; Janecek
et al. 2007). Similar divisions were done for other large
GH families, e.g., for the families GH1 (Marques et al.
2003), GH18 (Durand et al. 2005), GH57 (Zona et al.
2004) and GH97 (Naumoff 2005).
Approximately 10% of the amylolytic enzymes are
able to bind and degrade raw starch (Janecek &
Sevcik 1999). This ability is mostly due to the presence
of a distinct starch-binding domain (SBD) (Svensson
et al. 1989). However, if the secondary binding
sites are also taken into account (Tibbot et al. 2002; Bozonnet et al. 2005; Gasperik et al. 2005; Tranier et
al. 2005; Nielsen et al. 2008; Ragunath et al. 2008),
more than 10% of amylolytic enzymes can evidently
bind raw starch. The individual SBDs have been
classified into families of carbohydrate-binding modules (CBMs) (Coutinho & Henrissat 1999b) that analogously with GHs are also available within the CAZy
server (http://www.cazy.org/). Originally, the SBDs
were found in microbial amylases only (Svensson et
al. 1989; Janecek & Sevcik 1999) but at present it
has become clear that several regulatory proteins of
plant and animal origin also contain an SBD motif
(Rodriguez-Sanoja et al. 2005; Machovic & Janecek
2006a). Especially the motifs present in mammalian
proteins (e.g., laforin, genethonin-1) have to be considered rather as glycogen-binding domains (GBDs)
(Minassian et al. 2000; Janecek 2002b; Polekhina et
al. 2005; Gentry et al. 2007). Thus the SBDs and/or
GBDs can be found in the families CBM20, CBM21,
CBM25, CBM26, CBM34, CBM41, CBM45 and recently also in CBM48 (Machovic & Janecek 2006a).
Except for the family CBM45 (Mikkelsen et al. 2006),
three-dimensional structures are known for at least
one representative from each of the seven families
(Coutinho & Henrissat 1999a); interestingly these SBDs
all adopt a β-sandwich motif with an immunoglobulin fold (Boraston et al. 2004; Rodriguez-Sanoja et al.
2005; Hashimoto 2006; Machovic & Janecek 2006a).
From the evolutionary point of view, based on a detailed bioinformatics study the families CBM20 and
CBM21 have already been proposed to be classified
into a CBM clan (Machovic et al. 2005). Subsequently
many putative SBDs from both GH13 enzymes as well
as plant and animal regulatory proteins have been suggested to be added to the CBM20–CBM21 clan because
they share significant sequence similarities (Machovic &
Janecek 2006b). It is precisely these putative SBDs that were recently
classified as a new CAZy family CBM48 (Coutinho &
Henrissat 1999a). In addition to the GBD represented by
the β-subunit of the mammalian AMP-activated protein
kinase (AMPK) (Polekhina et al. 2005), the family
CBM48 contains the putative SBDs present in the four
enzyme specificities from the α-amylase enzyme family: pullulanase (PUL), isoamylase (IAM), maltooligosyl trehalohydrolase (MOTH), as well as the glycogen
branching enzyme (GBE) and the starch branching enzyme (SBE). These enzymes have recently been revealed to constitute the so-called GH13 PUL subfamily
(Janecek et al. 2007); the SBD being found as a domain
that precedes the catalytic (β/α)8-barrel (Katsuya et
al. 1998; Feese et al. 2000; Abad et al. 2002; Mikami
et al. 2006). Within the CAZy classification, these enzymes have been grouped in subfamilies as follows: GBE
and SBE (GH13_8, GH13_9), MOTH (GH13_10), IAM
(GH13_11) and PUL (GH13_12, GH13_13, GH13_14).
The aim of this work was to elucidate in detail the
evolutionary relationships within the GH13 PUL subfamily with a special focus on the evolution of their
CBM48. The rigorous phylogenetic analyses, as shown
recently also for a different CBM family, the family
CBM32 (Abbott et al. 2008), may even help to relate an
unknown CBM sequence to its biological function. With
regard to previous bioinformatics studies aimed at the
SBDs of amylolytic enzymes (Janecek & Sevcik 1999;
Janecek et al. 2003; Machovic et al. 2005; Machovic &
Janecek 2006b), the present work may contribute to a
better understanding of the individual CBM families covering binding of starch and/or glycogen.
Material and methods
Amino acid sequences of the CBM48 modules studied here
were retrieved from GenBank (Benson et al. 2000) and are
listed in Table 1. The final selection was based on information at the CAZy server (Coutinho & Henrissat 1999a; accessed January 2007). Although many hypothetical enzymes
have been classified to contain a CBM48, only enzymes
with experimentally confirmed amylolytic function were selected forming a set of 84 proteins (Table 1). Seventy-eight
of these belong to the four main specificities creating the
GH13 PUL subfamily, i.e. GBE and SBE (15 and 25, respectively), PUL (16), IAM (14) and MOTH (8). In addition,
six enzymes having different specificity and domain arrangement were also included in order to examine their relationship to the PUL subfamily. These were the α-amylase from
Roseburia sp. A2–194 (Ramsay et al. 2006), PUL type III
from Thermococcus aggregans (Niehaus et al. 2000), the α-amylase-pullulanases from Bacillus sp. KSM-1378 (Hatada
et al. 1996) and Bifidobacterium breve (Ryan et al. 2006)
and the amylopullulanases from Bacillus sp. XAL 601 (Lee
et al. 1994) and Geobacillus stearothermophilus (Chen et al.
2001).
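The composition of the data set described above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the counts are copied from the text, while the dictionary names and the short labels for the six additional enzymes are assumptions introduced here for convenience.

```python
# Counts per specificity as stated in the text (not re-derived from GenBank).
pul_subfamily = {"GBE": 15, "SBE": 25, "PUL": 16, "IAM": 14, "MOTH": 8}

# The six additional enzymes with a different specificity or domain arrangement
# (labels follow the abbreviations of Table 1; the grouping is for illustration).
other_enzymes = {"AMY": 1, "PUL3": 1, "AAPU": 2, "APU": 2}

subfamily_total = sum(pul_subfamily.values())
overall_total = subfamily_total + sum(other_enzymes.values())

print("GH13 PUL subfamily members:", subfamily_total)
print("All enzymes in the set:", overall_total)
```

The two totals should reproduce the 78 subfamily members and the 84 proteins quoted in the text.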
The borders for individual CBM48 modules were defined using the structural data of the GH13 PUL subfamily
representatives, i.e. the GBE from Escherichia coli (Abad
et al. 2002) (Protein Data Bank code 1M7X), PUL from
Klebsiella pneumoniae (Mikami et al. 2006) (2FHF), IAM
from Pseudomonas amyloderamosa (Katsuya et al. 1998)
(1BF2) and MOTH from Sulfolobus solfataricus (Feese et al.
2000) (1EH9), as well as using the UniProt (Apweiler et al.
2004), GenBank (Benson et al. 2000) and Pfam (Bateman
et al. 2002) databases. The GH13 PUL subfamily representatives, the GBD of AMPK β1 subunit from Rattus norvegi-
Table 1. The enzymes containing the CBM48 module used in the present study.

Abbreviation(a) Source                                  GenBank   Length(b)  CBM48(c)

Bacteria
GBE Agrtu       Agrobacterium tumefaciens               AAD03472  734        125–230
GBE Anago       Anaerobranca gottschalkii               CAJ38414  613        14–114
GBE Bacca       Bacillus caldolyticus                   CAA78440  666        17–122
GBE Bacce       Bacillus cereus                         BAE96028  645        17–122
GBE Butfi       Butyrivibrio fibrisolvens               AAA23007  639        18–123
GBE Escco       Escherichia coli                        AAA23872  728        117–223
GBE GeostI      Geobacillus stearothermophilus 1503–4R  AAA22482  639        17–122
GBE GeostT      Geobacillus stearothermophilus TRBE14   BAA19588  652        17–122
GBE Neide       Neisseria denitrificans                 AAF04747  762        128–233
GBE Strau       Streptomyces aureofaciens               AAA67437  764        154–259
GBE Strco1      Streptomyces coelicolor                 CAB72416  774        171–276
GBE Strco2      Streptomyces coelicolor                 CAB92878  741        135–240

Fungi, Mammal
GBE Aspor       Aspergillus oryzae                      BAB69770  689        51–162
GBE Homsa       Homo sapiens                            AAA58642  702        65–176
GBE Sacce       Saccharomyces cerevisiae                AAB64488  704        47–170

Plants
SBE Arath2      Arabidopsis thaliana                    AAB03099  854        190–299
SBE Arath2’     Arabidopsis thaliana                    AAB03100  800        153–263
SBE Horvu2a     Hordeum vulgare                         AAC69753  734        90–199
SBE Horvu2b     Hordeum vulgare                         AAC69754  829        184–293
SBE Ipoba2      Ipomoea batatas                         BAB64912  868        214–323
SBE Manes1      Manihot esculenta                       CAA54308  852        129–240
SBE Orysa1      Oryza sativa                            AAD28284  820        112–223
SBE Orysa3      Oryza sativa                            BAA03738  825        181–290
SBE Orysa4      Oryza sativa                            BAA82828  841        196–305
SBE Phavu1      Phaseolus vulgaris                      BAA82349  847        125–236
SBE Phavu2      Phaseolus vulgaris                      BAA82348  870        201–310
SBE Pissa1      Pisum sativum                           CAA56319  922        202–312
SBE Pissa2      Pisum sativum                           CAA56320  826        112–223
SBE Soltu1      Solanum tuberosum                       CAA49463  861        130–243
SBE Soltu2      Solanum tuberosum                       CAA03846  830        165–275
SBE Soltu2’     Solanum tuberosum                       CAB40748  882        214–323
SBE Sorbi1      Sorghum bicolor                         AAD50279  832        117–229
SBE Sorbi2b     Sorghum bicolor                         AAP72267  803        157–268
SBE Triae1      Triticum aestivum                       AAB17086  729        85–194
SBE Triae1’     Triticum aestivum                       AAG27622  833        119–230
SBE Triae1”     Triticum aestivum                       AAG27621  830        116–227
SBE Triae2      Triticum aestivum                       AAG27623  823        179–288
SBE Zeama1      Zea mays                                AAA82735  823        119–229
SBE Zeama2a     Zea mays                                AAB67316  814        166–275
SBE Zeama2b     Zea mays                                AAA18571  799        155–264

Bacteria
PUL Anago       Anaerobranca gottschalkii               AAS47565  865        255–350
PUL Anaho       Anaerobranca horikoshii                 AAP45012  865        255–350
PUL Bacde       Bacillus deramificans                   AAE10887  928        312–407
PUL Batth       Bacteroides thetaiotaomicron            AAC44685  668        37–136
PUL Calsa       Caldicellulosiruptor saccharolyticus    AAB06264  825        206–306
PUL Ferpe       Fervidobacterium pennivorans            AAD30387  849        230–325
PUL Geoth       Geobacillus thermoleovorans             CAC85704  718        104–197
PUL Klepn       Klebsiella pneumoniae                   AAA25124  1102       310–414
PUL Stcpn       Streptococcus pneumoniae                AAG33958  1287       455–566
PUL Thema       Thermotoga maritima                     AAD36907  843        222–317
PUL Thrsp       Thermus sp. IM6501                      AAC15073  718        104–197
PUL Thrth       Thermus thermophilus                    BAB62095  718        104–197

Plants
PUL Horvu       Hordeum vulgare                         AAD04189  904        147–246
PUL Orysa       Oryza sativa                            BAA09167  986        199–301
PUL Spiol       Spinacia oleracea                       CAA58803  964        204–304
PUL Zeama       Zea mays                                AAD11599  962        204–303

Archaea
IAM Sulac       Sulfolobus acidocaldarius               BAA11864  713        15–149

Bacteria
IAM Flasp       Flavobacterium sp.                      AAB63356  777        40–194
IAM Myrod       Myroides odoratus                       BAA82695  762        29–180
IAM Pseam       Pseudomonas amyloderamosa               AAA25854  771        27–170
IAM Psesp       Pseudomonas sp.                         AAA25855  776        27–170
IAM Rhoma       Rhodothermus marinus                    AAN89211  726        23–155
Table 1. (continued)

Abbreviation(a) Source                                  GenBank   Length(b)  CBM48(c)

Plants
IAM Horvu       Hordeum vulgare                         BAB72000  789        91–215
IAM Ipoba       Ipomoea batatas                         AAY84833  785        90–213
IAM OrysaA      Oryza sativa                            BAA29041  733        27–151
IAM OrysaB      Oryza sativa                            BAC75533  811        105–229
IAM Pissa       Pisum sativum                           AAZ81835  791        86–210
IAM Soltu       Solanum tuberosum                       AAN15317  793        81–211
IAM Triae       Triticum aestivum                       AAL31015  790        92–216
IAM Zeama       Zea mays                                AAA91298  818        118–243

Archaea
MOTH Sulac      Sulfolobus acidocaldarius               BAA11863  556        1–88
MOTH Sulsh      Sulfolobus shibatae                     AAF17553  559        1–88
MOTH Sulso      Sulfolobus solfataricus                 BAA11010  559        1–88

Bacteria
MOTH Artra      Arthrobacter ramosus                    BAB40766  575        1–81
MOTH Artsp      Arthrobacter sp. Q36                    BAA09668  598        1–98
MOTH Brehe      Brevibacterium helvolum                 AAB95369  589        1–89
MOTH Deira      Deinococcus radiodurans                 AAF10042  600        1–108
MOTH Rhisp      Rhizobium sp. M-11                      BAA11187  596        1–96

Bacteria
AAPU Bacsp      Bacillus sp. KSM-1378                   BAA11332  1938       1140–1242
AAPU Bifbr      Bifidobacterium breve                   AAY89038  1708       954–1057

Archaea
PUL3 Thcag      Thermococcus aggregans                  CAB94218  726        55–140

Bacteria
AMY Rossp       Roseburia sp. A2–194                    CAJ20070  1674       317–403
APU Geost       Geobacillus stearothermophilus          AAG44799  2018       42–128
APU Bacsp       Bacillus sp. XAL601                     BAA05832  2032       42–123
a GBE, glycogen branching enzyme (orange); SBE, starch branching enzyme (orange); PUL, pullulanase (blue); IAM, isoamylase
(green); MOTH, maltooligosyl trehalohydrolase (pink); AAPU, α-amylase-pullulanase (blue); PUL3, pullulanase type III; AMY, α-amylase; APU, amylopullulanase.
b Length of the enzyme.
c Borders of the individual CBM48 modules.
cus (Polekhina et al. 2005) (1Z0M) and the SBDs from
CBM20 and CBM21 families, i.e. CBM20s of glucoamylase
from Aspergillus niger (Sorimachi et al. 1997) (1AC0) and
of cyclodextrin glucanotransferase from Bacillus circulans
251 (Lawson et al. 1994) (1CDG) and CBM21 of glucoamylase from Rhizopus oryzae (Liu et al. 2007) (2DJM), were
superimposed on each other using the MULTIPROT server
at http://bioinfo3d.cs.tau.ac.il/MultiProt/ (Shatsky et al.
2004). All three-dimensional structures were retrieved from
the Protein Data Bank (Berman et al. 2002).
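The RMSD reported for each superposition is the root mean square distance between matched Cα atoms. The sketch below shows only this final measure for coordinates that are assumed to be already superimposed and paired; it does not reproduce the matching and superposition performed by the MULTIPROT server. The function name `rmsd` and the toy coordinates are hypothetical and are not taken from any PDB entry.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD over two equally long lists of (x, y, z) tuples.

    The coordinates are assumed to be already superimposed and paired
    residue by residue; the result has the same unit as the input.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equally long coordinate lists")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in Å: the second chain is shifted by 1 Å along y.
module_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
module_b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

print(rmsd(module_a, module_b))
```

Because an RMSD is computed only over the matched atoms, it is best read together with the number of overlapped residues, which is why both values are reported for each pair.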
Sequence alignments were performed using the programs CLUSTALW (Thompson et al. 1994), CLUSTALX
(Jeanmougin et al. 1998) and T-COFFEE (Notredame et
al. 1998), and manually tuned. The method used for calculating the evolutionary trees was the neighbour-joining
method (Saitou & Nei 1987) with the Phylip format tree
output (Felsenstein 1985) using the alignments including
the gaps. The trees were calculated for alignment of: (i)
CBM48 modules; (ii) complete sequences; (iii) sequences of
catalytic (β/α)8 -barrel domains including domain B; (iv)
domain C sequences; and (v) the α-amylase family conserved sequence regions. All the trees were displayed with
the program TREEVIEW (Page 1996) and then manually
tuned.
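The neighbour-joining principle (Saitou & Nei 1987) can be illustrated with a minimal sketch. This is not the CLUSTAL or Phylip implementation used in this study, and it starts from a ready-made distance matrix rather than from an alignment. The function name `neighbor_joining`, the internal node labels and the five toy taxa with their distances are assumptions chosen only for illustration.

```python
def neighbor_joining(names, dist):
    """Build an unrooted tree by neighbour-joining.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    Returns a list of (node, neighbour, branch_length) edges.
    """
    d = dict(dist)
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of every node from all other current nodes.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Choose the pair minimising Q(i, j) = (n - 2) d(i, j) - r(i) - r(j).
        best = None
        for a in range(n):
            for b in range(a + 1, n):
                i, j = nodes[a], nodes[b]
                q = (n - 2) * d[frozenset((i, j))] - r[i] - r[j]
                if best is None or q < best[0]:
                    best = (q, i, j)
        _, i, j = best
        dij = d[frozenset((i, j))]
        # Branch lengths from the joined pair to the new internal node.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{next_id}"
        next_id += 1
        edges += [(i, u, li), (j, u, lj)]
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Hypothetical toy distances between five taxa (not derived from any alignment).
toy = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
dist = {frozenset(pair): float(value) for pair, value in toy.items()}
leaves = ["a", "b", "c", "d", "e"]
edges = neighbor_joining(leaves, dist)
leaf_lengths = {node: length for node, _, length in edges if node in set(leaves)}

for node, neighbour, length in edges:
    print(f"{node} -- {neighbour}: {length}")
```

In the toy example the taxa a and b are joined first, and d and e end up as neighbours. With real data the distance matrix would be derived from the alignment, which is the step handled by the programs cited above.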
Results and discussion
Location of CBM48 module
The CBM48 modules from GH13 PUL subfamily studied here are listed in Table 1. It is obvious that the
N-terminus of the enzyme or a region close to the
N-terminal end of the polypeptide chain is a typical position for these CBMs (Fig. 1). This positioning is similar to that observed for example in 6-α-glucosyltransferase (Mukai et al. 2004), α-glucan, water dikinase (Baunsgaard et al. 2005) and laforin (Minassian et al. 2000) that all contain CBM20 (Coutinho
& Henrissat 1999b). Typical microbial amylolytic enzymes, however, have the classical SBD of CBM20 type
positioned C-terminally (Svensson et al. 1989; Janecek
& Sevcik 1999; Machovic et al. 2005). In all cases the
CBM48 motif originating from a GH13 PUL subfamily
member precedes the catalytic (β/α)8-barrel domain of
the protein.
As can be seen from Figure 1, the exact position of
a CBM48 module correlates with the enzyme specificity.
The CBM48 of some bacterial and eukaryotic GBEs is
situated closer to the N-terminus in comparison with
the motif from plant SBEs. This is in agreement with
the results of the study focused on the N-terminal end of
GBE from E. coli (Lo Leggio et al. 2002) indicating that
the long N-terminal domain of some prokaryotic GBEs
(group 1) has originated from a duplication of a similar N-domain present in various GH13 enzymes. This
event resulted in the existence of N1 and N2 modules
in group-1 GBEs, whereas the short N-terminal domain
of other prokaryotic GBEs (group 2) is a single domain
known as the N2 module (Lo Leggio et al. 2002). Thus,
Fig. 1. Positions of the CBM48 modules in the amino acid sequences. The black lines are drawn to scale to represent protein lengths.
For the two proteins with “a ” in front of the enzyme length (PUL Klepn, 1102 and PUL Stcpn, 1287), only the first 1000 residues
from the N-terminal end are shown. The abbreviations and colour code of the proteins are explained in Table 1.
the location of a CBM48 module from bacterial GBEs
belonging to the group 2 is closer to the N-terminus
than is the location of CBM48 modules from bacterial
GBEs of group 1 (Fig. 1). It was also pointed out (Lo
Leggio et al. 2002) that both mammalian GBEs and
plant SBEs belong to group 2 (having only the N2
module), although the location of their CBM48 modules
is more comparable with that of bacterial GBEs from
group 1.
The location of CBM48 modules in PULs seems
to be irregular with respect to taxonomy (Fig. 1). It
is thus worth mentioning that, for example, the PUL
from Klebsiella pneumoniae (Mikami et al. 2006) possesses three N-terminal domains: (i) domain N1, known
as the CBM41 module; (ii) domain N2; and (iii) domain N3, recognized as the CBM48 module, followed
by the characteristic α-amylase family GH13 domains
A, B and C. The number and arrangement of any additional N-terminal domains may also be decisive for the location
of the CBM48 module in PULs; the function of these domains
has not yet been confirmed experimentally.
On the other hand, the position of the CBM48 module of both archaeal and bacterial IAMs and MOTHs
is well conserved at the N-terminus. This is especially true
for MOTHs because the N-terminus of their polypeptide
chain coincides with the beginning of the CBM48
module (Fig. 1). The N-terminal position is not conserved for the modules from plant IAMs, with the exception of a rice enzyme (IAM OrysaA), which, however, has a correspondingly shorter sequence in comparison with
other plant IAMs (Table 1).
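The positional differences discussed above can be made comparable between enzymes of different length by expressing the start of the module as a fraction of the chain. The sketch below uses a few entries copied from Table 1. The helper `relative_start` and the choice of entries are assumptions made for illustration and were not part of the original analysis.

```python
# (enzyme length, first residue of the CBM48 module), copied from Table 1.
entries = {
    "MOTH Sulso": (559, 1),
    "IAM Pseam": (771, 27),
    "IAM OrysaA": (733, 27),
    "IAM Zeama": (818, 118),
    "GBE Escco": (728, 117),
    "PUL Klepn": (1102, 310),
}


def relative_start(length, start):
    """Fraction of the chain preceding the module (0.0 = starts at residue 1)."""
    return (start - 1) / length


for name, (length, start) in entries.items():
    print(f"{name}: {relative_start(length, start):.3f}")
```

A value of zero, as for the MOTH entry, corresponds to a module that begins at the very N-terminus of the polypeptide chain.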
Finally, the CBM48 locations of the remaining
six GH13 enzymes (Table 1) vary due to a different
domain arrangement and enzyme length. The two α-amylase-pullulanases from Bacillus sp. KSM-1378 and
Bifidobacterium breve are 1938 and 1708 amino acid
residues long, respectively (Table 1). They contain two independent catalytic
domains that are responsible for the α-amylase and
PUL activity occupying the N-terminal and C-terminal
part of the protein, respectively (Hatada et al. 1996;
Ryan et al. 2006). The CBM48 module is positioned
in the PUL part of the enzyme, i.e. in the N-terminal
region of the approximate second half of the primary
structure (Fig. 1). With regard to CBM48 modules
of the PUL type III from Thermococcus aggregans
and two amylopullulanases from Geobacillus stearothermophilus and Bacillus sp. XAL 601, all three proteins contain their CBM48 module close to the
N-terminal end, seemingly in a similar way as observed
for typical prokaryotic PULs (Fig. 1). As far as the
Fig. 2. Structure-based alignment of CBM48, CBM20 and CBM21 representatives. The CBM48 representatives (GBE Escco,
PUL Klepn, IAM Pseam and MOTH Sulso) are abbreviated in accordance with Table 1. Other CBM modules are abbreviated as
follows: CBM48 AMPK1 Ratno, GBD from R. norvegicus AMPK β-subunit 1; CBM20 GMY Aspni, SBD from A. niger GH15 glucoamylase; CBM20 CGT Bacci, B. circulans 251 GH13 cyclodextrin glucanotransferase; CBM21 GMY Rhior, SBD from R. oryzae
GH15 glucoamylase. The numbers preceding and succeeding the alignment indicate the position of the CBMs in primary structures.
Twelve residues in the alignment are highlighted by colours: (i) 11 consensus residues (Svensson et al. 1989) – two aromatics of the
starch-binding site 1 in yellow and remaining nine residues in turquoise; and (ii) the conserved phenylalanine (Machovic et al. 2005;
Machovic & Janecek 2006b) in green. In the case of a substitution (or a gap), the position is highlighted by pink. The β-strands and
α-helices are highlighted in grey and red, respectively, and they are also signified by italics.
α-amylase from Roseburia sp. A2-194 is concerned, it
possesses the CBM48 module in the N-terminal part
of the sequence, but the module is in this case preceded by two repeat units of ∼110 residues each (rich
in aromatic residues), a PUL-associated domain and an
additional unidentified domain (Ramsay et al. 2006).
It should be taken into account that, although this enzyme was assigned α-amylase specificity (EC 3.2.1.1),
this assignment may be refined in the future, because the domain arrangement and amino acid sequence
(discussed below) are rather unusual for a typical α-amylase.
Structure-based alignment and superposition of CBM48
modules
Since three-dimensional structures have been determined for each GH13 PUL subfamily specificity studied here as well as for the GBD of the AMPK β-subunit,
it was possible to align representative CBM48 modules while taking into account at least
the secondary structure elements. In addition, the representatives of closely related CBM families CBM20
and CBM21 (Machovic & Janecek 2006b) can also be
aligned with CBM48 modules in an effort to compare
the overall secondary structure arrangement of CBM48,
CBM20 and CBM21 modules (Fig. 2). It is well known
that a CBM in general can be characterized by a typical secondary structure composed of β-strands forming
the core of the motif (Boraston et al. 2004; Hashimoto
2006). This applies to the CBM20 (Penninga et al.
1996; Sorimachi et al. 1997) and CBM21 (Liu et al.
2007) as well as for the known structures of CBM48
modules (Fig. 2). Thus the modules of GBE from E.
coli (Abad et al. 2002) and MOTH from S. solfataricus (Feese et al. 2000) consist exclusively of β-strands,
whereas those of PUL from K. pneumoniae (Mikami et
al. 2006) and IAM from P. amyloderamosa (Katsuya et
al. 1998) possess a short α-helix succeeding the strand
β5.
From the alignment (Fig. 2) it seems that the position of CBM48, CBM20 and CBM21 β-strands is generally better conserved in the N-terminal part of the
modules. It is clear that the regions at or around the
residues known as consensus CBM20 signatures (Svensson et al. 1989) belong to the best conserved segments.
These consensus residues have been found well conserved not only among the real SBDs originating from
various microbial amylolytic enzymes (Janecek & Sevcik 1999; Janecek et al. 2003), but also in the putative
SBDs exhibiting sequence similarities to motifs from the
families CBM20, CBM21 and CBM48 (Machovic et al.
2005; Machovic & Janecek 2006b). Moreover, not all
of the consensus residues are indispensable for a CBM
module to harbour the starch- and/or glycogen-binding
activity (Machovic & Janecek 2006a,b). This was documented for example for the eukaryotic regulatory proteins, such as laforin (Minassian et al. 2000; Gentry et
al. 2007), AMPK (Polekhina et al. 2005), and starch
excess 4 protein (Kerk et al. 2006; Niittyla et al. 2006)
that were clearly shown to bind starch and/or glyco-
Table 2. Characteristics of the overlapped pairs of CBM motifs.(a)

          48 GBE  48 PUL  48 IAM  48 MOTH  48 AMPK  20 GMY  20 CGT  21 GMY
48 GBE      –     1.38    1.35    1.19     1.26     1.94    1.78    1.93
48 PUL      81      –     1.36    1.26     1.36     1.86    1.76    1.74
48 IAM      74     81       –     1.29     1.23     2.09    1.82    1.91
48 MOTH     79     81      73       –      1.29     1.89    1.69    1.75
48 AMPK     68     65      57      65        –      2.00    1.44    2.07
20 GMY      51     46      48      49       79        –     2.00    1.82
20 CGT      56     51      30      56       68       79       –     1.75
21 GMY      50     64      57      63       53       58      58       –

a The abbreviations are as follows: 48 GBE, CBM48 of glycogen branching enzyme from E. coli; 48 PUL, CBM48 of pullulanase
from K. pneumoniae; 48 IAM, CBM48 of isoamylase from P. amyloderamosa; 48 MOTH, CBM48 of maltooligosyl trehalohydrolase
from S. solfataricus; 48 AMPK, CBM48 of the β1 subunit of AMP-activated protein kinase from R. norvegicus; 20 GMY, CBM20 of
glucoamylase from A. niger; 20 CGT, CBM20 of cyclodextrin glucanotransferase from B. circulans strain 251; 21 GMY, CBM21 of
glucoamylase from R. oryzae. For every pair of superimposed CBMs two characteristic values are shown: the RMSD value
(Å) of each superposition (above the diagonal) and the number of mutually overlapped residues (below the diagonal). Both values
were obtained using the MULTIPROT server at http://bioinfo3d.cs.tau.ac.il/MultiProt/.
gen. It should thus be pointed out that not every one of
the four GH13 PUL subfamily specificities contains
the two aromatic residues corresponding to the raw
starch-binding site 1 of the family CBM20 (Trp543 and
Trp590 in SBD from A. niger GH15 glucoamylase) (Sorimachi et al. 1997). The details from comparison of sequences of CBM48 motifs from GBEs (SBEs), PULs,
IAMs and MOTHs have already been given by Machovic & Janecek (2006b). Since the presence of the
starch-binding site-1 appears to be in a close relationship with the real raw starch-binding ability (at least in
the families CBM20 and CBM21), the alignment of the
CBM48 sequences may indicate that within the GH13
PUL subfamily only the branching enzymes should exhibit the raw starch-binding function (Fig. 2).
As mentioned above, three-dimensional structures
have been available for at least one representative of
the four GH13 PUL subfamily specificities as well as
for GBD of AMPK β1 subunit. This made it possible
to superimpose these CBM48 modules. In addition, the
CBM20 and CBM21 representatives were superimposed
with CBM48 modules in order to compare the degree
of structural similarity within the family CBM48 with
that revealed among all the three CBM families (Table 2). The length of the motifs from the
GH13 PUL subfamily varies as follows: GBE – 105
residues, PUL – 103, IAM – 144 and MOTH – 87. Despite the fact that the CBM48 motifs from IAM and
MOTH are substantially longer and shorter, respectively, in comparison with the average CBM48 length
(∼100 residues) (Coutinho & Henrissat 1999b), at least
50% of the residues of IAM and more than 80% of the residues of
MOTH were aligned in each superposition (Table 2).
The lengths of CBM20 and CBM21 modules are comparable to those of CBM48 modules, i.e. CBM20 from
glucoamylase – 108 residues, CBM20 from cyclodextrin
glucanotransferase – 104 and CBM21 from glucoamylase – 106. The values of both the number of overlapped
Cα atoms and the root mean square deviation (RMSD)
clearly support the close similarity of the four GH13
PUL subfamily CBM48 motifs as well as the adequately
close relatedness of the CBM48 module of AMPK β1
subunit (87 residues) to the motifs from GBE, PUL,
IAM and MOTH. It is worth mentioning here that,
based on a sequence comparison, the GBDs of AMPK β-subunits clustered rather in the CBM20 part of the evolutionary tree (Machovic & Janecek 2006b). This
suggests that the family CBM48 could also join
the proposed clan of the families CBM20 and CBM21
(Machovic et al. 2005), or at least that the GBDs of AMPK
β-subunits could be considered as an intermediate between CBM20 and CBM48. This is strongly supported
by the values found for overlaying the AMPK’s CBM48
with both CBM20 representatives (Table 2); the values
for CBM48 from AMPK vs. CBM20 from cyclodextrin
glucanotransferase are even better than those for the
mutual CBM20 overlay. The evidently decreased number of overlapped residues for CBM20–CBM21 superpositions (Table 2) supports the previously postulated
idea (Machovic et al. 2005) that, although the families
may constitute a common CBM clan, they retain their
own independence.
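The coverage figures quoted above for IAM and MOTH follow from dividing the number of overlapped residues by the module length. The sketch below repeats this arithmetic for the superpositions within the GH13 PUL subfamily. It assumes that the overlap counts of Table 2 are read as listed in the comments, which is an interpretation of the table layout rather than a value restated in the text; the helper `coverage` is likewise hypothetical.

```python
# Module lengths as given in the text.
iam_length, moth_length = 144, 87

# Overlapped residues with the other GH13 PUL subfamily representatives.
# ASSUMPTION: counts read from Table 2 (below the diagonal) as follows:
# GBE-IAM 74, PUL-IAM 81, IAM-MOTH 73, GBE-MOTH 79, PUL-MOTH 81.
iam_overlaps = {"GBE": 74, "PUL": 81, "MOTH": 73}
moth_overlaps = {"GBE": 79, "PUL": 81, "IAM": 73}


def coverage(overlaps, module_length):
    """Fraction of the module aligned in each pairwise superposition."""
    return {partner: n / module_length for partner, n in overlaps.items()}


iam_cov = coverage(iam_overlaps, iam_length)
moth_cov = coverage(moth_overlaps, moth_length)

print("IAM, lowest coverage:", round(min(iam_cov.values()), 3))
print("MOTH, lowest coverage:", round(min(moth_cov.values()), 3))
```

Under this reading, the lowest coverage stays above one half for IAM and above four fifths for MOTH, in line with the statement in the text.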
Evolutionary tree of CBM48 module
The evolutionary relationships among the CBM48 modules are displayed in Figure 3. The evolutionary tree
possesses five main clusters – four of these represent
the four individual specificities of the GH13 PUL subfamily, while the fifth cluster is formed by the CBM48
modules from the two bacterial amylopullulanases, the
PUL type III from T. aggregans and the α-amylase from
Roseburia sp. A2–194 (Table 1). The GH13 PUL subfamily specificities form, in fact, three main parts of
the tree: (i) branching enzymes; (ii) MOTHs; and (iii)
IAMs and PULs. It should be pointed out that the
CBM48 evolutionary tree primarily reflects specificity;
taxonomy is more or less respected only within the
specificity clusters (Fig. 3). This is in contrast to what
has been observed for the SBD of the CBM20 family
(Janecek & Sevcik 1999).
The cluster of branching enzymes contains forty
CBM48 modules from bacterial and eukaryotic GBEs
and plant SBEs. It is clear that prokaryotic and eukaryotic branching enzymes are well separated. Prokaryotic
GBEs are found in three groups: Proteobacteria (with
one exception – the GBE from Butyrivibrio fibrisol-
Fig. 3. The evolutionary tree of the CBM48 modules. The abbreviations and colour code of the proteins are explained in Table 1. The
tree was constructed using the alignment including gaps.
vens), Actinobacteria and Firmicutes, whereas eukaryotic GBEs from human and fungi are clustered together
with eukaryotic SBEs from plants forming two compact
clusters (Fig. 3). It has already been demonstrated that
there exist two classes of SBEs, i.e. the branching enzyme I and the branching enzyme II and, in addition, in
monocots the branching enzyme II class is represented
by the two discrete sets of genes known as branching
enzyme IIa and branching enzyme IIb (Rahman et al.
2001). The mutual positions of the plant SBE isoforms
on the evolutionary tree are well conserved although the
SBEs I from wheat and pea (SBE Triae1, SBE Pissa1)
are placed within the cluster of SBE II and the SBE II
from pea (SBE Pissa2) is positioned among the SBEs I
(Fig. 3).
With regard to MOTHs, their cluster includes
three archaeal and five bacterial sources that all are
on their separate branches, the division between the
MOTHs from archaeons and bacteria being visible
(Fig. 3).
Concerning the observed close evolutionary relatedness of CBM48 motifs from IAMs and PULs, it is not
surprising if the overall significant three-dimensional
structural similarity of MOTH and PUL is taken into
account (Mikami et al. 2006). The cluster of IAMs is
formed by three different groups, thus separating archaeal, bacterial and plant enzymes, with one exception of interest: the bacterial CBM48 module of IAM
from Rhodothermus marinus (IAM Rhoma) goes well
with the archaeal counterpart from Sulfolobus acidocaldarius (Fig. 3). Remarkably, this shared position is
held not only in this CBM48 tree but (as will be
discussed below) also in the evolutionary trees based
on other domains. This indicates that the whole amino
acid sequence of the bacterial IAM from R. marinus possesses the features typical for the archaeal IAMs, as also verified by BLAST (Altschul et al. 1990; data not
shown). Note also that the lengths of the two IAMs
are mutually comparable: 726 (R. marinus) and 713
residues (S. solfataricus) in comparison with ∼760–770 residues characteristic of most bacterial IAMs (Table 1).
Fig. 4. The evolutionary trees of various motifs typical for the GH13 PUL subfamily. (a) Complete-sequence tree, (b) TIM-barrel
tree, (c) conserved-sequence-regions tree, and (d) C-domain tree. The abbreviations and colour code of the proteins are explained in
Table 1. All the trees were constructed using the relevant alignments including gaps.
As far as the PULs are concerned, the four plant
PUL CBM48 motifs form a small cluster and the individual CBM48 representatives of bacterial PULs are
on their own independent branches (Fig. 3). Two bacterial motifs from K. pneumoniae and Caldicellulosiruptor saccharolyticus appear to be the closest bacterial
counterparts to plant CBM48 modules. The CBM48
modules from the two bacterial α-amylase-pullulanases
are also positioned in the PUL cluster (Fig. 3). This
is not surprising as they may be considered here to be
pure PUL enzymes since their N-terminal α-amylase
parts were ignored.
Finally, the CBM48 motifs from the two bacterial amylopullulanases, the PUL type III from T. aggregans and the α-amylase from Roseburia sp. A2–194
are placed all together between the clusters of branching enzymes and MOTHs (Fig. 3). It is worth mentioning that the CBM48 module in these proteins has been
classified in the Conserved Domain Database (Marchler-Bauer et al. 2007) as a domain associated either N- or
C-terminally with different types of catalytic domains
and belonging to the so-called “E” or “early” set-like
proteins, i.e. the α-amylase-like sugar utilizing enzymes
that may be related to the immunoglobulin and/or fibronectin type III superfamilies.
Domain evolution of the GH13 PUL subfamily
Taking into account the modular character of
the enzymes from the GH13 PUL subfamily studied
here (Table 1), the evolutionary trees based on the
alignments of various parts of their sequences were
constructed (Fig. 4) in addition to the CBM48 tree
(Fig. 3). Thus the complete-sequence tree (Fig. 4a),
TIM-barrel tree (Fig. 4b), conserved-sequence-regions
tree (Fig. 4c), and C-domain tree (Fig. 4d) were calculated based on the alignment of complete sequences, the
sequences of the catalytic TIM-barrel domain including
domain B, the isolated sequences of conserved sequence
regions of the α-amylase family, and the sequences of
domain C succeeding the catalytic TIM-barrel, respectively.
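The preparation of such per-domain inputs can be illustrated with a minimal Python sketch. It assumes that domain boundaries are known as 1-based, inclusive residue positions; the toy sequence, the boundary values and the function name `slice_domains` are hypothetical and are not taken from Table 1 or from the alignments used here.

```python
from typing import Dict, Tuple

# Hypothetical 1-based, inclusive domain boundaries for one toy enzyme.
# Real boundaries would come from structural or sequence annotation.
DOMAIN_BOUNDS: Dict[str, Tuple[int, int]] = {
    "CBM48": (1, 8),
    "TIM_barrel": (9, 24),
    "C_domain": (25, 30),
}

# Toy 30-residue sequence assembled from three arbitrary pieces.
TOY_SEQUENCE = "MKTAYIAK" + "QRGVTHIELLPDVVYN" + "HGSWLE"


def slice_domains(sequence: str,
                  bounds: Dict[str, Tuple[int, int]]) -> Dict[str, str]:
    """Cut one sequence into named domain segments.

    Each segment can then be collected across all enzymes and aligned
    separately, giving one alignment (and one tree) per domain.
    """
    segments = {}
    for name, (start, end) in bounds.items():
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"boundaries for {name} fall outside the sequence")
        # Convert 1-based inclusive positions to a Python slice.
        segments[name] = sequence[start - 1:end]
    return segments


if __name__ == "__main__":
    for name, segment in slice_domains(TOY_SEQUENCE, DOMAIN_BOUNDS).items():
        print(name, segment)
```

The complete-sequence input corresponds simply to the unsliced sequence, so the same set of enzymes yields one input per tree type.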
The most important feature documented by the
trees is that the three trees based on complete sequences, TIM-barrels, and conserved sequence regions
(Fig. 4a,b,c) are comparable to each other and basically
also to the CBM48 tree (Fig. 3). The only substantial
difference is the branching of the group containing the two
bacterial amylopullulanases, the PUL type III from T.
aggregans and the α-amylase from Roseburia sp. A2-194
that in the trees shown in Figure 4 separates the clusters of GBE (SBE) and MOTH from those of IAM and
PUL. In the CBM48 tree (Fig. 3), the above-mentioned
group of the four enzymes that do not belong to any
of the four GH13 PUL specificities (branching enzyme,
MOTH, IAM and PUL) is positioned between the GBE
(SBE) and MOTH clusters. As pointed out already for
the CBM48 tree, the cluster of IAMs is unique since the
bacterial IAM from R. marinus is positioned together
with the archaeal IAM from S. solfataricus (Fig. 4).
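One way to make such a comparison of tree topologies quantitative is to count the clades that are present in one tree but not in the other. The following Python sketch does this for two toy rooted topologies written as nested tuples. The topologies only mimic the arrangement described in the text (the group labelled "OTHER" stands for the four enzymes outside the four specificities); they are illustrative assumptions, not the trees of Figs 3 and 4, and the function names are hypothetical.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a tree given as nested tuples.

    A leaf is a string; an internal node is a tuple of subtrees.
    Each clade is the frozenset of leaf names below an internal node.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def clade_distance(tree_a, tree_b):
    """Number of clades found in exactly one of the two trees."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must contain the same leaves")
    return len(clades_a ^ clades_b)


# Toy topology resembling the CBM48 tree: the extra group sits
# between the branching enzymes (BE) and the MOTHs.
TREE_CBM48_TOY = ((("BE", "OTHER"), "MOTH"), ("IAM", "PUL"))

# Toy topology resembling the other trees: the extra group separates
# BE and MOTH from IAM and PUL.
TREE_DOMAIN_TOY = ((("BE", "MOTH"), "OTHER"), ("IAM", "PUL"))

if __name__ == "__main__":
    print(clade_distance(TREE_CBM48_TOY, TREE_DOMAIN_TOY))
```

In this toy case the two topologies differ only in the placement of the extra group, while the IAM and PUL pairing is shared, which is the kind of limited difference described above.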
The obvious similarities among the trees can easily be explained by the facts that: (i) the conserved
sequence regions are the best conserved elements of
these enzymes, i.e. they may be considered as their
“sequence fingerprints” (Janecek 2002a); and (ii) the
TIM-barrel represents on average ∼60% of the enzyme length, i.e. it is a substantial part of the protein chain (Janecek et al. 2003). Perhaps slight visual differences, such as shortened branches, can be
seen in the conserved-sequence-regions tree (Fig. 4c)
when compared with the other trees (Fig. 3 and
Fig. 4a,b). This reflects the alignment of very similar and short sequence regions, i.e. the segments
around the strands β2, β3, β4, β5, β7 and β8 of the
TIM-barrel, e.g., 489 GVTHIELLP, 621 DVVYNH,
692 GFRFDLMGY, 721 YFFGEGWD, 848 YVSKHD
and 890 GIAFDQQGS in the PUL from K. pneumoniae. The presence of the clusters that are better separated from each other in an evolutionary tree is a
feature characteristic of the trees based on the short
alignments (i.e. conserved sequence regions) where the
differences between the sequences are automatically neglected (Zona et al. 2004).
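The effect of using short, highly similar regions can be illustrated with a small Python sketch that concatenates the conserved sequence regions into one pseudo-sequence and computes the proportion of differing positions (p-distance). The six K. pneumoniae segments are those quoted above; the second set of segments is a hypothetical variant invented for illustration only, and the p-distance is used here as a simple stand-in, not as the distance measure applied in this study.

```python
# Conserved sequence regions of the K. pneumoniae PUL as quoted in the text.
KLEPN_CSRS = ["GVTHIELLP", "DVVYNH", "GFRFDLMGY",
              "YFFGEGWD", "YVSKHD", "GIAFDQQGS"]

# Hypothetical second enzyme: the same regions with two substitutions.
TOY_CSRS = ["GVTHVELLP", "DVVYNH", "GFRFDVMGY",
            "YFFGEGWD", "YVSKHD", "GIAFDQQGS"]


def concatenate(regions):
    """Join the isolated conserved regions into one pseudo-sequence."""
    return "".join(regions)


def p_distance(seq_a, seq_b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


if __name__ == "__main__":
    a = concatenate(KLEPN_CSRS)
    b = concatenate(TOY_CSRS)
    print(len(a), p_distance(a, b))
```

Because the pseudo-sequence contains only 47 well-conserved positions, pairwise distances stay small, which is consistent with the shortened branches noted for the conserved-sequence-regions tree.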
Interestingly, the C-domain tree (Fig. 4d) exhibits
the most dissimilarities to the trees discussed above. A similar phenomenon has been observed in domain evolution of the α-amylase family members containing the
CBM20 module (Janecek et al. 2003). Only the GBEs
and SBEs keep themselves together in a cluster with
taxonomy-respecting arrangement within their cluster
(Fig. 4d). With regard to MOTHs, IAMs and PULs,
these three GH13 PUL subfamily enzyme specificities
are more-or-less scattered in the tree, although at least
some taxonomic groups are still positioned in clusters,
such as plant IAMs and plant PULs as well as archaeal
MOTHs (Fig. 4d).
Acknowledgements
This work was supported in part by the grant No. 2/0114/08
from the Slovak grant agency VEGA and the project AV4/2023/08 from the Ministry of Education of the Slovak
Republic.
References
Abad M.C., Binderup K., Rios-Steiner J., Arni R.K., Preiss J. &
Geiger J.H. 2002. The X-ray crystallographic structure of Escherichia coli branching enzyme. J. Biol. Chem. 277: 42164–
42170.
Abbott D.W., Eirin-Lopez J.M. & Boraston A.B. 2008. Insight
into ligand diversity and novel biological roles for family 32
carbohydrate-binding modules. Mol. Biol. Evol. 25: 155–167.
Altschul S.F., Gish W., Miller W., Myers E.W. & Lipman D.J.
1990. Basic local alignment search tool. J. Mol. Biol. 215:
403–410.
Apweiler R., Bairoch A., Wu C.H., Barker W.C., Boeckmann B.,
Ferro S., Gasteiger E., Huang H., Lopez R., Magrane M. &
Yeh L.S. 2004. UniProt: the Universal Protein knowledgebase.
Nucleic Acids Res. 32: D115–D119.
Bateman A., Birney E., Cerruti L., Durbin R., Etwiller L., Eddy
S.R., Griffiths-Jones S., Howe K.L., Marshall M. & Sonnhammer E.L. 2002. The Pfam protein families database. Nucleic
Acids Res. 30: 276–280.
Baunsgaard L., Lutken H., Mikkelsen R., Glaring M.A., Pham
T.T. & Blennow A. 2005. A novel isoform of glucan, water
dikinase phosphorylates pre-phosphorylated α-glucans and is
involved in starch degradation in Arabidopsis. Plant J. 41:
595–605.
Benson D.A., Karsch-Mizrachi I., Lipman D.J., Ostell J. &
Wheeler D.L. 2004. GenBank: update. Nucleic Acids Res. 32:
D23–D26.
Berman H.M., Battistuz T., Bhat T.N., Bluhm W.F., Bourne
P.E., Burkhardt K., Feng Z., Gilliland G.L., Iype L., Jain S.
& Zardecki C. 2002. The protein data bank. Acta Crystallogr.
D58: 899–907.
Boraston A.B., Bolam D.N., Gilbert H.J. & Davies G.J. 2004.
Carbohydrate-binding modules: fine-tuning polysaccharide
recognition. Biochem. J. 382: 769–781.
Bozonnet S., Bonsager B.C., Kramhoft B., Mori H., Abou
Hachem M., Willemoes M., Jensen M.T., Fukuda K., Nielsen
P.K., Juge N., Aghajari N., Tranier S., Robert X., Haser R.
& Svensson B. 2005. Binding of carbohydrates and protein
inhibitors to the surface of α-amylases. Biologia 60 (Suppl.
16): 27–36.
Chen J.T., Chen M.C., Chen L.L. & Chu W.S. 2001. Structure
and expression of an amylopullulanase gene from Bacillus
stearothermophilus TS-23. Biotechnol. Appl. Biochem. 33:
189–199.
Coutinho P.M. & Henrissat B. 1999a. Carbohydrate-active enzymes: an integrated database approach, pp. 3–12. In: Recent Advances in Carbohydrate Bioengineering (Gilbert H.J.,
Davies G., Henrissat B. & Svensson B., eds), The Royal Society of Chemistry, Cambridge; http://www.cazy.org/.
Coutinho P.M. & Henrissat B. 1999b. The modular structure
of cellulases and other carbohydrate-active enzymes: an integrated database approach, pp. 15–23. In: Genetics, Biochemistry and Ecology of Cellulose Degradation (Ohmiya K.,
Hayashi K., Sakka K., Kobayashi Y., Karita S. & Kimura T.,
eds), Uni Publishers Company, Tokyo.
Durand A., Hughes R., Roussel A., Flatman R., Henrissat B. &
Juge N. 2005. Emergence of a subfamily of xylanase inhibitors
within glycoside hydrolase family 18. FEBS J. 272: 1745–
1755.
Feese M.D., Kato Y., Tamada T., Kato M., Komeda T., Miura
Y., Hirose M., Hondo K., Kobayashi K. & Kuroki R. 2000.
Crystal structure of glycosyltrehalose trehalohydrolase from
the hyperthermophilic archaeum Sulfolobus solfataricus. J.
Mol. Biol. 301: 451–464.
Felsenstein J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39: 783–791.
Gasperik J., Hostinova E. & Sevcik J. 2005. Acarbose binding
at the surface of Saccharomycopsis fibuligera glucoamylase
suggests the presence of a raw starch binding site. Biologia
60 (Suppl. 16): 167–170.
Gentry M.S., Dowen R.H. 3rd, Worby C.A., Mattoo S., Ecker
J.R. & Dixon J.E. 2007. The phosphatase laforin crosses evolutionary boundaries and links carbohydrate metabolism to
neuronal disease. J. Cell Biol. 178: 477–488.
Hatada Y., Igarashi K., Ozaki K., Ara K., Hitomi J., Kobayashi
T., Kawai S., Watabe T. & Ito S. 1996. Amino acid sequence
and molecular structure of an alkaline amylopullulanase from
Bacillus that hydrolyzes α-1,4 and α-1,6 linkages in polysaccharides at different active sites. J. Biol. Chem. 271: 24075–
24083.
Hashimoto H. 2006. Recent structural studies of carbohydrate-binding modules. Cell. Mol. Life Sci. 63: 2954–2967.
Henrissat B. & Bairoch A. 1996. Updating the sequence-based
classification of glycosyl hydrolases. Biochem. J. 316: 695–
696.
Janecek S. 1997. α-Amylase family: molecular biology and evolution. Prog. Biophys. Mol. Biol. 67: 67–97.
Janecek S. 2002a. How many conserved sequence regions are there
in the α-amylase family? Biologia 57 (Suppl. 11): 29–41.
Janecek S. 2002b. A motif of a microbial starch-binding domain
found in human genethonin. Bioinformatics 18: 1534–1537.
Janecek S. & Sevcik J. 1999. The evolution of starch-binding
domain. FEBS Lett. 456: 119–125.
Janecek S., Svensson B. & MacGregor E.A. 2003. Relation between domain evolution, specificity, and taxonomy of the
α-amylase family members containing a C-terminal starch-binding domain. Eur. J. Biochem. 270: 635–645.
Janecek S., Svensson B. & MacGregor E.A. 2007. A remote but
significant sequence homology between glycoside hydrolase
clan GH-H and family GH31. FEBS Lett. 581: 1261–1268.
Jeanmougin F., Thompson J.D., Gouy M., Higgins D.G. & Gibson T.J. 1998. Multiple sequence alignment with Clustal X.
Trends Biochem. Sci. 23: 403–405.
Katsuya Y., Mezaki Y., Kubota M. & Matsuura Y. 1998. Three-dimensional structure of Pseudomonas isoamylase at 2.2 Å
resolution. J. Mol. Biol. 281: 885–897.
Kerk D., Conley T.R., Rodriguez F.A., Tran H.T., Nimick M.,
Muench D.G. & Moorhead G.B. 2006. A chloroplast localized
dual-specificity protein phosphatase in Arabidopsis contains a
phylogenetically dispersed and ancient carbohydrate-binding
domain, which binds the polysaccharide starch. Plant J. 46:
400–413.
Kuriki T. & Imanaka T. 1999. The concept of the α-amylase family: structural similarity and common catalytic mechanism. J.
Biosci. Bioeng. 87: 557–565.
Kuriki T., Takata H., Yanase M., Ohdan K., Fujii K., Terada
Y., Takaha T., Hondoh H., Matsuura Y. & Imanaka T. 2006.
The concept of the α-amylase family: a rational tool for interconverting glucanohydrolases/ glucanotransferases, and their
specificities. J. Appl. Glycosci. 53: 155–161.
Lawson C.L., van Montfort R., Strokopytov B., Rozeboom H.J.,
Kalk K.H., de Vries G.E., Penninga D., Dijkhuizen L. & Dijkstra B.W. 1994. Nucleotide sequence and X-ray structure
of cyclodextrin glycosyltransferase from Bacillus circulans
strain 251 in a maltose-dependent crystal form. J. Mol. Biol.
236: 590–600.
Lee S.P., Morikawa M., Takagi M. & Imanaka T. 1994. Cloning of
the aapT gene and characterization of its product, α-amylase-pullulanase (AapT), from thermophilic and alkaliphilic Bacillus sp. strain XAL601. Appl. Environ. Microbiol. 60: 3764–
3773.
Liu Y.N., Lai Y.T., Chou W.I., Chang M.D. & Lyu P.C. 2007.
Solution structure of family 21 carbohydrate-binding module
from Rhizopus oryzae glucoamylase. Biochem. J. 403: 21–30.
Lo Leggio L., Ernst H.A., Hilden I. & Larsen S. 2002. A structural model for the N-terminal N1 module of E. coli glycogen
branching enzyme. Biologia 57 (Suppl. 11): 109–118.
MacGregor E.A. 2005. An overview of clan GH-H and distantly
related families. Biologia 60 (Suppl. 16): 5–12.
MacGregor E.A., Janecek S. & Svensson B. 2001. Relationship of
sequence and structure to specificity in the α-amylase family
of enzymes. Biochim. Biophys. Acta 1546: 1–20.
Machovic M. & Janecek S. 2006a. Starch-binding domains in the
post-genome era. Cell. Mol. Life Sci. 63: 2710–2724.
Machovic M. & Janecek S. 2006b. The evolution of putative
starch-binding domains. FEBS Lett. 580: 6349–6356.
Machovic M., Svensson B., MacGregor E.A. & Janecek S. 2005. A
new clan of CBM families based on bioinformatics of starch-binding domains from families CBM20 and CBM21. FEBS J.
272: 5497–5513.
Marchler-Bauer A., Anderson J.B., Derbyshire M.K., DeWeese-Scott C., Gonzales N.R., Gwadz M., Hao L., He S., Hurwitz
D.I., Jackson J.D., Ke Z., Krylov D., Lanczycki C.J., Liebert
C.A., Liu C., Lu F., Lu S., Marchler G.H., Mullokandov M.,
Song J.S., Thanki N., Yamashita R.A., Yin J.J., Zhang D. &
Bryant S.H. 2007. CDD: a conserved domain database for
interactive domain family analysis. Nucleic Acids Res. 35:
D237–D240.
Marques A.R., Coutinho P.M., Videira P., Fialho A.M. & Sa-Correia I. 2003. Sphingomonas paucimobilis β-glucosidase
Bgl1: a member of a new bacterial subfamily in glycoside
hydrolase family 1. Biochem. J. 370: 793–804.
Mikami B., Iwamoto H., Malle D., Yoon H.J., Demirkan-Sarikaya
E., Mezaki Y. & Katsuya Y. 2006. Crystal structure of pullulanase: evidence for parallel binding of oligosaccharides in
the active site. J. Mol. Biol. 359: 690–707.
Mikkelsen R., Suszkiewicz K. & Blennow A. 2006. A novel
type carbohydrate-binding module identified in α-glucan,
water dikinases is specific for regulated plastidial starch
metabolism. Biochemistry 45: 4674–4682.
Minassian B.A., Ianzano L., Meloche M., Andermann E., Rouleau
G.A., Delgado-Escueta A.V. & Scherer S.W. 2000. Mutation
spectrum and predicted function of laforin in Lafora’s progressive myoclonus epilepsy. Neurology 55: 341–346.
Mukai K., Maruta K., Satouchi K., Kubota M., Fukuda S.,
Kurimoto M. & Tsujisaka Y. 2004. Cyclic tetrasaccharide-synthesizing enzymes from Arthrobacter globiformis A19.
Biosci. Biotechnol. Biochem. 68: 2529–2540.
Naumoff D.G. 2005. GH97 is a new family of glycoside hydrolases,
which is related to the α-galactosidase superfamily. BMC Genomics 6: 112.
Niehaus F., Peters A., Groudieva T. & Antranikian G. 2000.
Cloning, expression and biochemical characterisation of a
unique thermostable pullulan-hydrolysing enzyme from the
hyperthermophilic archaeon Thermococcus aggregans. FEMS
Microbiol. Lett. 190: 223–229.
Nielsen M.M., Seo E.S., Bozonnet S., Aghajari N., Robert X.,
Haser R. & Svensson B. 2008. Multi-site substrate binding
and interplay in barley α-amylase 1. FEBS Lett. 582: 2567–
2571.
Niittyla T., Comparot-Moss S., Lue W.L., Messerli G., Trevisan
M., Seymour M.D., Gatehouse J.A., Villadsen D., Smith
S.M., Chen J., Zeeman S.C. & Smith A.M. 2006. Similar
protein phosphatases control starch metabolism in plants
and glycogen metabolism in mammals. J. Biol. Chem. 281:
11815–11818.
Notredame C., Holm L. & Higgins D.G. 1998. COFFEE: a new
objective function for multiple sequence alignment. Bioinformatics 14: 407–422.
Oslancova A. & Janecek S. 2002. Oligo-1,6-glucosidase and neopullulanase enzyme subfamilies from the α-amylase family defined by the fifth conserved sequence region. Cell. Mol. Life
Sci. 59: 1945–1959.
Page R.D. 1996. TreeView: an application to display phylogenetic
trees on personal computers. Comput. Appl. Biosci. 12: 357–
358.
Polekhina G., Gupta A., van Denderen B.J., Feil S.C., Kemp
B.E., Stapleton D. & Parker M.W. 2005. Structural basis for
glycogen recognition by AMP-activated protein kinase. Structure 13: 1453–1462.
Ragunath C., Manuel S.G.A., Kasinathan C. & Ramasubbu N.
2008. Structure-function relationships in human salivary α-amylase: role of aromatic residues in a secondary binding site.
Biologia 63: 1028–1034.
Rahman S., Regina A., Li Z., Mukai Y., Yamamoto M., Kosar-Hashemi B., Abrahams S. & Morell M.K. 2001. Comparison of starch-branching enzyme genes reveals evolutionary
relationships among isoforms. Characterization of a gene for
starch-branching enzyme IIa from the wheat genome donor
Aegilops tauschii. Plant Physiol. 125: 1314–1324.
Ramsay A.G., Scott K.P., Martin J.C., Rincon M.T. & Flint H.J.
2006. Cell-associated α-amylases of butyrate-producing Firmicute bacteria from the human colon. Microbiology 152:
3281–3290.
Rodriguez-Sanoja R., Oviedo N. & Sanchez S. 2005. Microbial
starch-binding domain. Curr. Opin. Microbiol. 8: 260–267.
Ryan S.M., Fitzgerald G.F. & van Sinderen D. 2006. Screening
for and identification of starch-, amylopectin-, and pullulan-degrading activities in bifidobacterial strains. Appl. Environ.
Microbiol. 72: 5289–5296.
Saitou N. & Nei M. 1987. The neighbor-joining method: a new
method for reconstructing phylogenetic trees. Mol. Biol. Evol.
4: 406–425.
Seo E.S., Christiansen C., Abou Hachem M., Nielsen M.M.,
Fukuda K., Bozonnet S., Blennow A., Aghajari N., Haser R.
& Svensson B. 2008. An enzyme family reunion – similarities,
differences and eccentricities in actions on α-glucans. Biologia
63: 967–979.
Shatsky M., Nussinov R. & Wolfson H.J. 2004. A method for simultaneous alignment of multiple protein structures. Proteins
56: 143–156.
Sorimachi K., Le Gal-Coeffet M.F., Williamson G., Archer D.B.
& Williamson M.P. 1997. Solution structure of the granular starch binding domain of Aspergillus niger glucoamylase
bound to β-cyclodextrin. Structure 5: 647–661.
Stam M.R., Danchin E.G.J., Rancurel C., Coutinho P.M. & Henrissat B. 2006. Dividing the large glycoside hydrolase family
13 into subfamilies: towards improved functional annotations
of α-amylase-related proteins. Protein Eng. Des. Sel. 19: 555–
562.
Svensson B. 1994. Protein engineering in the α-amylase family:
catalytic mechanism, substrate specificity, and stability. Plant
Mol. Biol. 25: 141–157.
Svensson B., Jespersen H., Sierks M.R. & MacGregor E.A. 1989.
Sequence homology between putative raw-starch binding domains from different starch-degrading enzymes. Biochem. J.
264: 309–311.
Thompson J.D., Higgins D.G. & Gibson T.J. 1994. CLUSTAL
W: improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position specific gap
penalties and weight matrix choice. Nucleic Acids Res. 22:
4673–4680.
Tibbot B.K., Wong D.W.S. & Robertson G.H. 2002. Studies on
the C-terminal region of barley α-amylase 1 with emphasis
on raw starch-binding. Biologia 57 (Suppl. 11): 229–238.
Tranier S., Deville K., Robert X., Bozonnet S., Haser R., Svensson
B. & Aghajari N. 2005. Insights into the “pair of sugar tongs”
surface binding site in barley alpha-amylase isozymes and
crystallization of appropriate sugar tongs mutants. Biologia
60 (Suppl. 16): 37–46.
van der Maarel M.J., van der Veen B., Uitdehaag J.C.,
Leemhuis H. & Dijkhuizen L. 2002. Properties and applications of starch-converting enzymes of the α-amylase family.
J. Biotechnol. 94: 137–155.
Zona R., Chang-Pi-Hin F., O’Donohue M.J. & Janecek S. 2004.
Bioinformatics of the family 57 glycoside hydrolases and identification of catalytic residues in amylopullulanase from Thermococcus hydrothermalis. Eur. J. Biochem. 271: 2863–2872.
Received July 3, 2008
Accepted August 6, 2008