Supplementary Information
A metagenomic approach to discover a novel β-glucosidase from bovine rumens
Eukote Suwan1, Siriphan Arthornthurasuk2, and Prachumporn Kongsaeree1,2,3,‡
1 Interdisciplinary Graduate Program in Genetic Engineering, Faculty of Graduate School, Kasetsart University, Bangkok 10900, Thailand;
2 Department of Biochemistry, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand;
3 Center for Advanced Studies in Tropical Natural Resources, NRU-KU, Kasetsart University, Bangkok 10900, Thailand
‡Corresponding author: E-mail: [email protected]
Supplementary Table 1. The species of rumen microorganisms and the numbers of
β-glucosidase amino acid sequences obtained from the NCBI database.
Columns: No., Species, Strain, GH1, GH3
1 Actinobacillus succinogenes
2 Butyrivibrio fibrisolvens
3 Butyrivibrio proteoclasticus
4 Clostridiales bacterium
5 Desulfovibrio desulfuricans
6 Erysipelotrichaceae bacterium
7 Eubacterium rectale
8 Fibrobacter succinogenes
9 Holdemanella biformis
10 Lachnobacterium bovis
11 Lachnospiraceae bacterium
12 Lactobacillus mucosae
13 Lactobacillus ruminis
14 Megasphaera elsdenii
15 Peptostreptococcaceae bacterium
16 Prevotella ruminicola
17 Propionibacteriaceae bacterium
18 Ruminococcaceae bacterium
19 Ruminococcus albus
20 Ruminococcus bromii
21 Ruminococcus champanellensis
22 Sagittula stellata
23 Selenomonas ruminantium
24 Slackia heliotrinireducens
25 Streptococcus henryi
26 Treponema saccharophilum
16/4
FE2007
YRB2005
WTE3004
MD2001
ND3005
AB2020
B316
P6B7
FD2007
NK3B98
DSM 642
ATCC 29578
NK3D112
ATCC 33656
DSM 17629
M104/1
S85
AE2004
C6A12
NK4B19
AB2028
AC2012
AC2014
AC2028
AC2029
AC2031
AC3007
AD3010
FD2005
JC7
MD2004
NC2004
NC2008
ND2006
NK4A136
NK4A144
NK4A179
P6A3
V9D3004
YSB2008
LM1
DPC 6426
ATCC 27782
ATCC 25644
SPM0211
ATCC 25644
DPC 6832
DSM 20460
T81
24-50
VA2
Bryant 23
Ga6B6
P6A17
AB4001
AE2021
7
DSM 20455
8
AD2013
SY3
L2-63
18P13
TAM6421
AB3002
ATCC 12561
AC2024
DSM 2985
1
3
1
2
2
2
1
2
1
2
26
1
1
1
1
1
1
1
4
3
2
4
4
3
1
4
4
3
1
4
1
1
1
2
2
1
5
2
4
2
10
-
8
6
5
5
7
5
5
6
5
6
9
1
1
1
3
4
3
2
1
3
4
3
1
3
1
1
2
3
3
1
1
5
1
2
1
4
1
1
1
1
1
10
10
1
1
2
5
5
6
5
5
1
1
1
1
2
1
2
3
Supplementary Fig. 1. The phylogenetic trees of rumen microorganism β-glucosidase
sequences of GH1 (A) and GH3 (B).
cacagttcttgaattgtatacgactcactatagggcgaattgggcccgacgtcgcatgct
H S S - I V Y D S L - G E L G P T S H A
cccggccgccatggcggccgcgggaattcgattacaatgtaccactgggacctgccacat
P G R H G G R G N S I T M Y H W D L P H
gcccttcatctcaaagggggatggctgaatgatgattctcctaattggtttgctgaatat
A L H L K G G W L N D D S P N W F A E Y
gccaaggtaataaaaacatattttgggaaagaagtatcttactttatcacctttaatgaa
A K V I K T Y F G K E V S Y F I T F N E
cctcaggtttttgttggctgtgggtatttatcaggaaatcatgcaccaggatatcaatta
P Q V F V G C G Y L S G N H A P G Y Q L
ccaaaggctgaaattgtacgtatagctcataatgtacttaaggctcatggacttgcagtg
P K A E I V R I A H N V L K A H G L A V
aaagagttgagaaaaggggaaccatgcaaaataggctttactggtgcttcctgtccatgc
K E L R K G E P C K I G F T G A S C P C
ataccggcttctgatagaaaagaagatatagaagctgcctataatcaatatttttcaagt
I P A S D R K E D I E A A Y N Q Y F S S
aatagcaacgaatttgttttcacagatgcattttggtttgacccggttttgaagggcaga
N S N E F V F T D A F W F D P V L K G R
tatcccaaatgggtaacctacataaacaatgtaagcatgccaatcattaccaaggaggat
Y P K W V T Y I N N V S M P I I T K E D
atggaactcatcagtcagcccattgactttgtggggttgaatatttataacggaaaatat
M E L I S Q P I D F V G L N I Y N G K Y
gtaaatgaggatggtggtatccttcagaaaaaacaaggagttcccagaacagcaattggt
V N E D G G I L Q K K Q G V P R T A I G
tggcccatagcacaagaagcattatactggggaccgaggtttaccagtgaaaggtatcat
W P I A Q E A L Y W G P R F T S E R Y H
aagcctatcatgattaccgaaaacggtatgtcctgtcatgattgtatttcactggatggt
K P I M I T E N G M S C H D C I S L D G
aaggtgcatgacgagaaccgcattgactatatgcacagatatttactgcagctgaaaaaa
K V H D E N R I D Y M H R Y L L Q L K K
gcaattgcagatggtgtggatgtagaaggttattatgcctggtccctgttggataatttt
A I A D G V D V E G Y Y A W S L L D N F
gagtggaatcactagtgaattcgcggccgcctgcaggtcgaccatatgggagagctccca
E W N H - - I R G R L Q V D H M G E L P
acgcgttggatgcatagcttgagtattctataggtcaccctaaaaggttg
T R W M H S L S I L - V T L K G
Supplementary Fig. 2. Nucleotide and deduced amino acid sequences of Br1, which was
amplified from the rumen metagenomic DNA. The nucleotide sequences corresponding to
those of the degenerate primers are underlined. The amino acid sequence that showed 56%
identity to a β-glucosidase from Cellulosilyticum ruminicola JCM 14822 is highlighted in
grey. The conserved motifs unique to GH1 β-glucosidases, NEP and ITENG, are shown
as bold and underlined letters.
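The deduced amino acid sequence in Supplementary Fig. 2 can be reproduced with a short script. The following is a minimal sketch, assuming the standard genetic code and reading frame 1, with stop codons written as `-` as in the figure. The function names `translate` and `find_motifs` are illustrative and not part of the original work.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, stop="-"):
    """Translate a DNA string in frame 1; incomplete trailing codons are ignored."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # "X" for ambiguous codons
        protein.append(stop if aa == "*" else aa)
    return "".join(protein)


def find_motifs(protein, motifs=("NEP", "ITENG")):
    """Return the 0-based index of the first hit for each motif, or -1 if absent."""
    return {m: protein.find(m) for m in motifs}


if __name__ == "__main__":
    # Two consecutive 60-nt lines copied from Supplementary Fig. 2.
    fragment = (
        "gccaaggtaataaaaacatattttgggaaagaagtatcttactttatcacctttaatgaa"
        "cctcaggtttttgttggctgtgggtatttatcaggaaatcatgcaccaggatatcaatta"
    )
    protein = translate(fragment)
    print(protein)
    print(find_motifs(protein))
```

Running the same functions on the full Br1 nucleotide sequence should locate both motifs, provided the sequence is supplied in the frame used in the figure.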
Br2                          MGFPKDFLWGTATASYQIEGAAFEDGKGLNIWDVFSHQEGKIFENHNGDVACDHYNRLEE
Cellulosilyticum_ruminicola  MSFNKNFVWGAATASFQIEGAAYEDHKGLNIWDTFCREEGKVYGGHNGDVACDHYHRMEE
Lachnospiraceae_bacterium    MSFNKDFVWGVATSSYQIEGAAYEDGKGLSIWDVYCTQPGRVYEGHNGDVACDHYHRYKE
uncultured_Clostridium       MSFRKDFVWGAATASYQVEGAAYEDGKGLNIWDVFCKEDGHVYEHHTGDVACDQYHRYKE
Blautia_producta             MGFPESFLWGTATASYQIEGGAFEDGRGYTVWDDFCRTPGKVFSMHNGDVACDHYHRYKE
Hungatella_hathewayi         MGFQKDFVWGAATSSYQIEGAAFEDGKGLSIWDVYAHQPGKVFEGHNGDVACDHYHRFEE
Ruminococcus_torques         MGFKKDFIWGGATASYQVEGAAYEDGKGLNIWDIFCKDGGHIYENQTGDAACDQYHRYKE
*.* :.*:** **:*:*:**.*:** :* .:** :.
*::: :.**.***:*.* :*
Br2                          DLDILSKLGVKSYRFSVSWSRVLPAGIGQVNHKGIAFYQMLISGLRERGIIPCMTLYHWD
Cellulosilyticum_ruminicola  DVKLMAELGLKAYRFSVSWARILPEGTGEVCQAGLDFYNRLIDTLLEYGITPYMTLYHWD
Lachnospiraceae_bacterium    DVKMMKEMGIKAYRFSISWPRVLPNGIGEVNELGLAFYDNLVDELIEAGIEPYVTLFHWD
uncultured_Clostridium       DVAIMKEMGLKAYRFSVNWARILPEGTGKVNEKGLAYYDHLVNCLIENGIEPYMTLYHWD
Blautia_producta             DVKMMADMGIRAYRFSIAWSRILPEGRGEVNQSGIDFYNALIDELLKYNIKPCLTLFHWD
Hungatella_hathewayi         DVKLMKQLGIKAYRFSISWPRILPDGIGTVNQKGLDFYSRLTDALLENGITPYVTLYHWD
Ruminococcus_torques         DVQIMKEMGMKAYRFSLSWARIMPEGTGTVNEKGLKYYDNLINELLDNGIEPFVTLYHWD
*: :: .:*:::****: * *::* * * * . *: :*. * . * . * * :**:***
                                                                       ▼
Br2                          LPYALHLKGGWLNDDSPNWFAEYAKVIKTYFGKEVSYFITFNEPQVFVGCGYLSGNHAPG
Cellulosilyticum_ruminicola  LPYALHKKGGWLNDESVQWFAEFAAIISKNYSDRVKHFITFNEPQVFVGCGYKMGEHAPG
Lachnospiraceae_bacterium    LPYELHKKGGWMNPDSPMWFAEYTKVIVERLSDRVKYFMTFNEPQCFVGLGYSQGLHAPG
uncultured_Clostridium       LPYALHQRGGWLNPQSPEWFYEYAKLMAAHFSDRVSHFFTFNEPQCTVGLGYVTGEHAPG
Blautia_producta             LPFALHRMGGWQNPEIVNWFAEYAAVAARAFGDRVKFFMTFNEPQCFVGLGHVSGEHAPG
Hungatella_hathewayi         QPYELYLRGGWLNPDSPKWFAEYAAVVARALGDRVKNFITFNEPQVFIGLAFVDGVHAPG
Ruminococcus_torques         LPYALHLQGGWMNPNSPSWFYEYAKVVAEHFSDRVKNFFTINEPQCIVGLGYQTGEHAPG
*: *: *** * :
** *:: :
...*. *:*:**** :* .. * ****
Br2                          YQLPKAEIVRIAHNVLKAHGLAVKELRKGEPCK--IGFTGASCPCIPASDRKEDIEAAYN
Cellulosilyticum_ruminicola  YKLCDFELLQIGHNVLKAHGAATKALRENAPTSIEVGIVVATCPSIPVTENAADIKAAYA
Lachnospiraceae_bacterium    LKQSIRDTLEMAHHILLAHGHSVKTIRKYAKGEVKVGFAPTASMNYPASDSKEDIEAAKR
uncultured_Clostridium       LKIGPHDYFAIWHNVLKAHGRGVQAIREAAVRPVGVGMAPCGALYYPATDAKEDIEAARK
Blautia_producta             NIMSRRSVLEMAHHVMMAHGKAVQAIRSLV-PDAQIGYAPTSNPVIPASDTLEDIEAARR
Hungatella_hathewayi         HKLPRREALSMAHHVMMAHGLASMEIRSIV-PDAKIGYAPTSNVPVPVSSDPKDVEAARN
Ruminococcus_torques         LKVGPSDYFRIWHNVLKAHGRAVEALREFSKQPVKISMAPCGALYVPETNKPEDIEAARK
. . : *.:: *** .
:*.
:. .
* :.
*::**
Br2                          QYFSSNS---NEFVFTDAFWFDPVLKGRYPKWVTYINNVSMPIITKEDMELISQPIDFVG
Cellulosilyticum_ruminicola  AYNRANL---DNYIFTDPYWLDPIVFGHYPEEVMKTCGHLMPKITEEDMALIQGPLDFIG
Lachnospiraceae_bacterium    SLFEMPREIREEWAWNITWWNDPIFFGHYPEDGLELFKDYLPEIKEGDMEIISQPLDFLG
uncultured_Clostridium       ACFALPEADIRAASWDVAFCADPVFLGQYPEDIMKSFGQYFPKTLEKDLELISQPLDFYG
Blautia_producta             AYFAVEDK--PDYMWSVSWWSDPVMLGRYPEDGLKLFEKDMPEFKPEDLELMHQPLDFYG
Hungatella_hathewayi         AYFRMPEN--GDWSWNVSWWSDPVMLGNYPEEGLRILEKDLPVMGPDDMKIIHQKPDFYG
Ruminococcus_torques         ANFSLPENSIGACSWDVALCCDPVYLGQYPEDILKEFGQFFPKVTDADMKLISQPLDFLG
:
**: *.**:
:*
*: ::
** *
Supplementary Fig. 3. Amino acid sequence comparison of Br2 with selected
β-glucosidases from anaerobic bacteria present in the gastrointestinal tract of mammals. The
conserved motifs in GH1 enzymes, NEP and I/VTENG, are shown as white letters against
a black background, and the predicted catalytic nucleophiles are marked with filled triangles
above the sequences. The sources, accession numbers and percent identities of these
sequences are Cellulosilyticum ruminicola, WP_054742750.1, 59%; Lachnospiraceae
bacterium, WP_053984599.1, 56%; uncultured Clostridium, SCG87754.1, 55%; Blautia
producta, WP_033139656.1, 55%; Hungatella hathewayi, WP_006771328.1, 55%;
Ruminococcus torques, CUP83806.1, 54%.
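Percent identity and fully conserved columns can be computed directly from aligned rows such as those in Supplementary Fig. 3. The sketch below is illustrative only: the caption does not state how the reported identities were calculated, so the denominator used here (columns with no gap in either sequence) is an assumption, and the function names are hypothetical. Only fully conserved columns (`*`) are marked; the `:` and `.` similarity classes of the original consensus line are not reproduced.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity between two aligned rows, ignoring columns with a gap.

    Assumption: the denominator is the number of gap-free columns; other
    conventions (alignment length, shorter sequence length) give other values.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def conserved_columns(rows, gap="-"):
    """Return a string with '*' under columns where all rows share one residue."""
    return "".join(
        "*" if len(set(col)) == 1 and col[0] != gap else " "
        for col in zip(*rows)
    )


if __name__ == "__main__":
    # Last six columns of the third alignment block for the first three rows.
    rows = ["GNHAPG", "GEHAPG", "GLHAPG"]
    print(conserved_columns(rows))
    print(round(percent_identity(rows[0], rows[1]), 1))
```

To apply this to the full alignment, the blocks for each sequence would first be concatenated in order so that every row covers the complete alignment length.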
                                                                                       ▼
Br2                          LNIYNGKYVNED----GGILQKKQGVPRTAIGWPITQEALYWGPRFTSERYHKPIMITEN
Cellulosilyticum_ruminicola  TNIYRGRYIKADINGNPEYMGIKVGMPRTAIGWEITPEALYWGAKESSDRYHLPYYITEN
Lachnospiraceae_bacterium    QNVYNGREIKAGENGEIIYLTREAGSPKTALNWPITPKSLYWGPKFLYERYKKPIYITEN
uncultured_Clostridium       QNVYNAVPVRADENGNPVRVDRYPGFPKTAIQWPVTPEVLYWAPKFLYERYQKPIYVTEN
Blautia_producta             QNIYNGYRVKSDKKGGWETVERPVGYPRTGNGWPVVPESLYWGPRFLYERYRKPIVITEN
Hungatella_hathewayi         QNIYRGIPTKA-VPGGWETVPHSPGAPKTAINWHVDFDCLYWGVKFLYERYQTPVVITEN
Ruminococcus_torques         QNIYNAVTVRAGEDGKAVRAARYDGFPQTAIGWPVTPEVLYWAPKFMQERYKKPFMITEN
*:*..
.
* *:*. * : . ***. :
:**: * :***
Br2                          GMSCHDCISLDGKVHDENRIDYMHRYLLQLKKAIADGVDVEGYYAWSLLDNFEWANGYND
Cellulosilyticum_ruminicola  GMSAHDVVSLDGKVHDPNRIDYLNRYLKGLKRAASEGVDVRGYFTWSFLDNFEWAKGYAD
Lachnospiraceae_bacterium    GLSCHDVVSLDGKVHDPNRIDFLQRYLREFKRAGEDGVEVAGYFQWSLMDNFEWHSGYGE
uncultured_Clostridium       GMSSHDWVALDGKVHDASRVDFMHRYLREFKKAAADGVDLRGYFAWSLMDNFEWAYGYSE
Blautia_producta             GCCCADVVSLDGKVHDPGRIDFYHRYLLELGRAIEDGVRVDAYFAWSVIDNFEWAKGYSD
Hungatella_hathewayi         GMSSHDWPALDGKIHDYARIDYLHRHLRGLKRAAEEGVDVAGYFQWSLMDNFEWARGYND
Ruminococcus_torques         GMASHDWVGVDGKVHDQARVDFMARYLGAYKRAAEDGVDLAGYFAWSVMDNFEWAYGYSQ
* .. * .:***:** *:*: *:*
:* :** : .*: **.:***** ** :
Br2                          RFGITYVDYETQQRIIKDSGFFYQQIIETNGDLL
Cellulosilyticum_ruminicola  RFGLVYVDYETQKRTVKDSAYWYQTVIASNGENL
Lachnospiraceae_bacterium    RFGIVYVDYATGERIIKDSGYWYKSVIEANGENL
uncultured_Clostridium       RFGMVYVDYETQKRTMKDSGLFYKEVIASNGEIL
Blautia_producta             RFGMVYVDFETQERILKDSANWYAEVIRNNGANL
Hungatella_hathewayi         RFGLIYVDYATQERIPKDSFEWYRNTIMQNGENL
Ruminococcus_torques         RFGLVYTDYNTQKRIWKDSAYFYKNIIETNGENL
***: *.*: * :* *** :*
* ** *
Supplementary Fig. 3. Continued.
Lanes in each panel: M, 1, 2, 3, 4. Marker sizes (kDa): 116, 66.2, 45, 35, 25.
Supplementary Fig. 4. 10% SDS-PAGE analysis (left panel) and western blot (right panel)
of Br2 expression. Lane M, protein size markers; Lane 1, uninduced culture sample; Lane 2,
3-h induced culture sample; Lane 3, total cell lysate; Lane 4, insoluble fraction after cell lysis.