© 2001 Nature Publishing Group http://structbio.nature.com
letters

Characterization of a cellulosome dockerin domain from the anaerobic fungus Piromyces equi

S. Raghothama1,2, Ruth Y. Eberhardt3,4, Peter Simpson1, Darran Wigelsworth3, Peter White3,5, Geoffrey P. Hazlewood3,6, Tibor Nagy7, Harry J. Gilbert7 and Michael P. Williamson1

1Department of Molecular Biology and Biotechnology, Krebs Institute, University of Sheffield, Firth Court, Western Bank, Sheffield S10 2TN, UK.
2Present address: Sophisticated Instruments Facility, Indian Institute of Science, Bangalore 560 012, India.
3Laboratory of Molecular Enzymology, The Babraham Institute, Babraham, Cambridge CB2 4AT, UK.
4Present address: Department of Clinical Veterinary Medicine, University of Cambridge, Cambridge CB3 0ES, UK.
5Present address: CBD Porton Down, Salisbury SP4 0JG, UK.
6Present address: Finnfeeds International, PO Box 777, Marlborough SN8 1XN, UK.
7Department of Biological and Nutritional Sciences, University of Newcastle upon Tyne, Newcastle upon Tyne NE1 7RU, UK.

The recycling of photosynthetically fixed carbon in plant cell walls is a key microbial process. In anaerobes, the degradation is carried out by a high molecular weight multifunctional complex termed the cellulosome. This consists of a number of independent enzyme components, each of which contains a conserved dockerin domain, which functions to bind the enzyme to a cohesin domain within the scaffoldin protein. Here we describe the first three-dimensional structure of a fungal dockerin, the N-terminal dockerin of Cel45A from the anaerobic fungus Piromyces equi. The structure contains a novel fold of 42 residues. The ligand binding site consists of residues Trp 35, Tyr 8 and Asp 23, which are conserved in all fungal dockerins.
The binding site is on the opposite side of the molecule from the N- and C-termini, implying that tandem dockerin domains, seen in the majority of anaerobic fungal plant cell wall degrading enzymes, could present multiple simultaneous binding sites and, therefore, permit tailoring of binding to catalytic demands.

The recycling of photosynthetically fixed carbon trapped in the plant cell wall is mediated by a repertoire of different enzymes that act in synergy to hydrolyze the composite structure. Anaerobic organisms produce scaffoldin, a multidomain protein consisting of a cellulose binding module that attaches the scaffoldin to the surface layers of the host microorganism, and a series of highly self-homologous ‘cohesin’ domains that bind to highly conserved noncatalytic sequences, referred to as dockerins, in polysaccharide degrading enzymes [1,2]. In anaerobic bacteria that degrade plant cell walls, exemplified by Clostridium thermocellum, the dockerin domains of the catalytic polypeptides can bind equally well to any cohesin from the same organism. As a consequence, the enzymes assemble stochastically into a large (∼2 × 10⁶ Da) complex, termed the cellulosome, that catalyzes the complete breakdown of plant cell walls [3,4]. The interaction between cohesin and dockerin, however, appears to be species specific in bacteria.

More recently, anaerobic fungi, typified by Piromyces equi found in the horse caecum [5,6], have been suggested to also synthesize a cellulosome complex, although the dockerin sequences of the bacterial and fungal enzymes are completely different. The fungal complex has a mass of >700 kDa, with at least 15 polypeptides involved [7–9]. Fungal dockerins bind to a 100 kDa protein within the cellulase complex. This 100 kDa polypeptide has been proposed to be the scaffoldin, although it has yet to be sequenced and formally demonstrated to contain cohesin sequences [8].
The fungal enzymes contain one, two or three copies of the dockerin sequence in tandem within the catalytic polypeptide. In contrast, all the C. thermocellum cellulosome catalytic components contain a single dockerin domain [10]. Fungal dockerin domains have been characterized from a number of organisms and were shown to have similar sequences [11,12] (Fig. 1). In contrast to the bacterial proteins, they have broad interspecies specificity [8].

The cellulosome complex is the most efficient plant cell wall degrading system known and represents a powerful and versatile method of delivering multiple enzymes to a localized region in a composite substrate. Therefore, there have been a number of studies aimed at characterizing structure/function relationships in cellulosome components. Three-dimensional structures have been reported for the C. thermocellum scaffoldin cellulose binding module and two of its cohesin modules [13,14]. Recently a structure for the dockerin module has been reported, which forms a nonstandard paired EF-hand structure [15,16]. Currently there are no structural details available for any part of the fungal system.

Here we present the first structure of a fungal dockerin domain and characterize its ligand binding site. The domain is the first of three consecutive dockerin domains from the P. equi endoglucanase Cel45A [17].

Fig. 1 Comparison of the primary sequences of dockerin domains from P. equi polysaccharide hydrolases. At the top of the figure are given the secondary structures found in this study (S = sheet, H = helix and T = turn). The consensus sequence is given at the bottom. The residue numbering is the same as that used here for endoglucanase Cel45A-1. Note that for Cel5A, the Lys at the C-terminus of domain 2 is the same residue as the Lys at the N-terminus of domain 3. Similarly for mannanase ManA, the Ser residues at the C-termini of domains 1 and 2 are the same residues as the Ser at the N-termini of domains 2 and 3, respectively.
The last residue for Cel5A domain 3, ManA domain 3, ManB domain 2 and ManC domain 2 is the C-terminus of the protein. EMBL accession numbers are as follows: Cel5A is AJ277483; Cel45A, AJ277482; XylA, X91858; ManA, X91857; ManB, X97408; ManC, X97520; and EstA, AF164516.

nature structural biology • volume 8 number 9 • september 2001 775

Effect of construct length on stability and binding

The full length Cel45A sequence contains three repeats of the 38-residue dockerin domain, with 45% identity between all three domains (using residues 1–38 of the dockerins; Fig. 1). The residues from 39 onwards are much more variable. Our initial construct, therefore, consisted of residues 1–38 (plus two additional residues at the N-terminus, for convenience in cloning and to ensure that the full length domain was used). This protein domain binds both to the P. equi and to the Neocallimastix patriciarum plant cell wall degrading complexes at 4 °C. However, binding becomes substantially weaker as the temperature is raised above 20 °C. NMR spectra of the domain showed typical well-dispersed peaks at 4 °C, which moved gradually towards their random coil positions as the temperature was raised to 30 °C (data not shown). These results suggested that the protein was stably folded only at low temperature, and a longer construct was therefore prepared, consisting of residues 1–50 (plus two additional N-terminal residues). The longer construct binds approximately two orders of magnitude more tightly (data not shown). NMR confirmed that the longer construct is more stable and maintains a well-dispersed spectrum up to at least 45 °C. Therefore, some of the residues C-terminal of amino acid 39 are clearly needed for maintenance of a stably folded structure despite their apparent lack of sequence conservation among other dockerin sequences.
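To put the "two orders of magnitude" difference between the two constructs on a free-energy scale, the following is a minimal sketch rather than part of the reported analysis. It assumes that the difference in binding can be treated as a ratio of equilibrium dissociation constants, which the text does not state, and that the comparison applies at the 4 °C binding temperature. The function name `ddg_from_affinity_ratio` is hypothetical. Under those assumptions the standard relation ΔΔG = RT ln(ratio) applies.

```python
import math

# Molar gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3


def ddg_from_affinity_ratio(ratio: float, temp_c: float) -> float:
    """Return the free-energy difference (kcal/mol) for a fold change in affinity.

    Assumes `ratio` is a ratio of equilibrium dissociation constants;
    this is an illustrative assumption, not something the paper reports.
    """
    temp_k = temp_c + 273.15  # convert Celsius to kelvin
    return R_KCAL * temp_k * math.log(ratio)


# A 100-fold change in affinity at an assumed temperature of 4 °C.
ddg = ddg_from_affinity_ratio(100.0, 4.0)
print(f"{ddg:.2f} kcal/mol")
```

On this reading, the extra C-terminal residues contribute roughly 2.5 kcal/mol to binding, comparable to a few hydrogen bonds, which fits the interpretation that they stabilize the fold rather than form the binding site itself.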
Characterization of the functional dockerin

The anaerobic bacterial dockerins are homologous to EF hands (calcium-binding motifs found in calmodulin and troponin C) and have been reported to require calcium for activity. However, CaCl2, MgCl2, EDTA and EGTA had no effect on ligand binding of the fungal dockerin. Addition of CaCl2 to solutions in the NMR tube had no effect on signal positions. The fungal dockerin, therefore, does not require calcium.

Fig. 2 Solution structure of the P. equi dockerin. a, Stereo view of the ensemble of 34 structures, colored from red at the N-terminus to blue at the C-terminus. b, Stereo MOLSCRIPT [27] diagram of the dockerin structure, showing the locations of disulfide bridges and the binding site residues discussed in the text. The protein present in the sample consisted of 52 residues: the 38-residue ‘core’ sequence that is highly conserved in fungal dockerins, plus all 12 residues C-terminal of the core sequence prior to the start of the second dockerin domain, and two N-terminal residues from the GST-tag. The figure shows only the ordered residues –2 to 44. c, Surface of the protein, in the same orientation as (b). Side chains of residues Tyr 8 and Trp 35 are in purple, and Asp 23 in light lilac. Acidic residues are in red, basic in dark blue and hydrophobic in yellow.

The dockerin contains four Cys residues and comes from an organism that lives in a highly reducing environment. Therefore, the functional state of the Cys residues is not obvious. As isolated from Escherichia coli, the protein had a mass (measured using MALDI-TOF) of 4,309.4 Da, compared to an expected mass of 4,311 Da for a disulfide-bridged protein or 4,315 Da for a reduced protein. Binding was markedly reduced upon incubation of the protein with 10 mM dithiothreitol (DTT). Addition of DTT to a sample in an NMR tube produces rapid loss of signal dispersion.
Initial NMR structure calculations were carried out with no constraints on the Cys sulfur atoms and produced structures in which pairs of Cys were close together and could be linked together in an obvious pairing (1–10 and 11–36). These results demonstrate that the functional protein is oxidized.

Structure of the dockerin

The dockerin structure (Fig. 2) contains two helical stretches (residues 2–5 and a helical turn from 14–17), and four short β-strands: residues 20–23, 26–31 and 34–37, which form an antiparallel sheet structure, plus residues 9–11, which form an additional short twisted parallel strand. The N- and C-termini are adjacent to each other. A search using DALI revealed no other known similar structures; therefore, the structure represents a novel fold.

Fig. 3 15N T2 values for backbone nitrogens. Values (ms) are plotted against residue number.

A patch of well-conserved surface-exposed hydrophobic residues centered around Trp 35 presents itself as an obvious candidate for the site of interaction with cohesins (Fig. 2c). This region consists of a number of aromatic or hydrophobic residues (Trp 35, Tyr 8, Tyr 21 and Val 30) bordered by two acidic residues, Asp 23 and Glu 20. The patch is on the opposite face to the N- and C-termini and could easily be presented for binding. Moreover, the structure suggests that consecutive dockerins, as found in many proteins, could be organized such that the binding sites are next to one another and could bind to adjacent cohesin sites. Such an arrangement would give an opportunity for synergy in binding of multiple-dockerin proteins, as seen experimentally in the bacterial system [18,19].
In particular, increasing the number of domains in tandem may increase the binding affinity, thereby altering the mix of degradative enzymes present in the cellulosome complex. This would have advantages not only for production of an efficient degradative complex but could also be useful in the design of biotechnological applications.

To investigate how far the structured region extends towards the C-terminus, 15N T2 relaxation times were measured for backbone amide nitrogens (Fig. 3). The data show that the well-ordered region of the protein continues up to residue 42, in agreement with the lack of NOEs further along the chain and the lack of stability of the 1–38 construct. The side chains of residues Tyr 40 and Tyr 42 clearly interact with the body of the protein, as indicated by the observation of unusual chemical shifts for a number of resonances close to the rings. There is essentially no sequence similarity after residue 38, and a number of fungal dockerin domains terminate at or soon after residue 39 (Fig. 1). These latter domains may be stabilized by interactions with N-terminal tandem dockerin domains.

Binding site for cohesin

Residues Trp 2, Glu 6, Tyr 8, Glu 20, Tyr 21, Asp 23, Trp 28, Val 30 and Trp 35 were replaced by Ala and, in some cases, also by more conservative amino acid substitutions. Mutation of residues 2, 6, 20, 21 or 30 has very little effect on binding (Fig. 4), consistent with their relatively low degree of conservation among fungal dockerin sequences; for example, Trp 2 and Tyr 21 are conserved in 52% and 72% of sequences, respectively. Trp 28 forms part of the hydrophobic core of the protein and, not surprisingly, is crucial for binding (Fig. 4). Tyr 8 and Trp 35 are both partially exposed and adjacent on the surface of the protein in an unusual configuration. Both residues are aromatic in all sequences identified to date (Fig. 1).
Mutation of either residue to Ala completely abolishes binding, suggesting that both residues are involved in the binding of the dockerin to its ligand. However, the Y8F mutation actually gives improved binding. In agreement with previous results on the species specificity of binding, the affinities of the dockerin mutants were similar irrespective of whether the P. equi, N. patriciarum or Orpinomyces spp. complexes were used as the ligands. Thus, there is almost no species specificity of binding within fungal species (in contrast to the bacterial dockerins) and no identified sites that distinguish different species.

There are a number of surface residues in the vicinity of the binding site that might be expected to be important for ligand recognition. In particular, Asp 23 is completely conserved in all fungal dockerin sequences (Fig. 1). Mutation of this residue to Ala reduced binding by two orders of magnitude (Fig. 4), confirming that it is crucial for strong binding. Other conserved residues also have clear structural or functional significance, except for Asn 32, which has no obvious rationale for its conservation.

We conclude that the core binding site consists of two aromatic residues, Trp 35 and Tyr 8, plus an acidic residue, Asp 23. The binding site is formed from the edges rather than the faces of the aromatic rings (Fig. 2c) and is basically flat and exposed, with a combination of hydrophobic and hydrophilic interactions. This suggests that the ligand is a protein, rather than a saccharide, because saccharide binding typically requires exposed aromatic surfaces [20]. The interaction is, however, different both structurally and functionally [16] from the C. thermocellum dockerin–cohesin interaction. We have shown here that mutation of key dockerin residues completely abolishes binding, whereas studies to date on the bacterial dockerin show that no single residue is crucial for effective binding [21].
Moreover, the fungal system makes use of single, duplicated or triplicated tandem dockerin domains, with potential synergistic interactions for multiple repeats of the domain, whereas bacterial enzymes contain only single dockerin domains [19]. Thus, the results presented here show that the fungal system represents a novel and completely unrelated modular protein–protein docking interaction that may prove a useful model for developing further modular binding systems.

Fig. 4 Selected binding titration curves for dockerins. The dockerin concentrations are in molar units, and the absorbance is the change (from blank) at 450 nm, using a 1:1,000 dilution of the P. equi cellulase complex. The titrations show that Y8A, W28A, W28Q, W28F, W35A and W35F have no detectable binding; D23A has much reduced binding; and W2A, Q6A, Y8F, E20A, Y21A and V30A have binding close to that of wild type. The binding affinities of the mutants can best be expressed as the concentration of dockerin required to give a half-maximal change in absorbance in the ELISA-based assay at 4 °C, relative to the concentration of wild type protein in the same experiment (typically ∼100 nM). A value of 1.0, therefore, indicates that the mutation makes no difference to the affinity. The relative half-maximal concentrations are W2A = 1.6, Y8F = 0.2, E20A = 0.2, Y21A = 2.1, D23A = 170 and V30A = 1.0.

Table 1 Structural statistics

                                                      <DOCKERIN>¹        DOCKERINav-min¹
R.m.s. deviation from experimental restraints
  Distance restraints² (Å)                            0.042 ± 0.001      0.04
  Dihedral restraints³ (°)                            0.75 ± 0.033       0.65
R.m.s. deviations from idealized covalent geometry
  Bonds (Å)                                           0.005 ± 0.000      0.004
  Angles (°)                                          0.57 ± 0.007       0.53
  Impropers (°)                                       0.41 ± 0.008       0.38
X-PLOR energies (kcal mol⁻¹)
  Etotal                                              177.63 ± 2.63      158.8
  ENOE                                                84.14 ± 1.63       74.1
  Ebond                                               14.51 ± 0.32       12.6
  Eangle                                              57.83 ± 1.45       49.9
Ramachandran analysis (%)
  Most favored region                                 74.6               67.6
  Additionally allowed regions                        23.2               29.7
  Generously allowed regions                          2.2                2.7
  Disallowed regions                                  0.0                0.0
Precision analysis⁴
  Backbone atoms                                      0.20 ± 0.04
  All heavy atoms                                     0.58 ± 0.05

¹<DOCKERIN> is the ensemble of 34 structures; DOCKERINav-min, minimized average structure.
²Total number of experimental distance restraints is 929, comprising 294 intraresidue, 184 sequential, 147 medium-range, 274 long-range NOEs and 15 pairs of hydrogen bond restraints.
³Total number of experimental dihedral restraints is 63, comprising 37 φ and 26 χ1 dihedral restraints.
⁴R.m.s. deviation to mean structure for residues 1–43.

Methods

Cloning and expression. The dockerin sequence was amplified by PCR from the plasmid pKPC28 obtained from a P. equi cDNA library [17]. Primers incorporated a BamHI site at the 5′ end and an XhoI site plus a stop codon at the 3′ end of the PCR product. The sequence was ligated into the glutathione S-transferase (GST) fusion expression vector pGEX-4T-2 (Amersham Pharmacia Biotech) to generate dockerin fused to the C-terminus of glutathione S-transferase. The cells were grown at 37 °C in media containing 100 µg ml⁻¹ ampicillin. When the cells reached an OD600 of 0.5, expression was induced by addition of 1 mM IPTG. The protein was purified using Glutathione Sepharose 4B (Amersham Pharmacia Biotech) in batch mode. For production of cleaved protein, the matrix-bound protein was incubated with thrombin (50 U l⁻¹) for 16 h at room temperature. Mutations were introduced using PCR with the QuikChange site-directed mutagenesis kit (Stratagene).

Binding assays. The complex was isolated from P. equi or N. patriciarum cultured in cellobiose-containing medium as described [7,22]. Microtiter wells were coated with 100 µl per well of complex diluted 1:1,000 in 0.05 M sodium carbonate, pH 9.6, and incubated for 16 h at 4 °C. Dilutions of up to 1:25,000 gave essentially identical results. Wells were washed and blocked using 2% (w/v) BSA (bovine serum albumin)/0.05% (v/v) Tween-20 in PBS. Dilutions of GST–dockerin were prepared in the same solvent, typically ranging from 0.8 to 100 µg ml⁻¹. Aliquots (100 µl) were added to wells in triplicate and incubated for 1 h at 4 °C. After washing, 100 µl of a primary goat anti-GST antibody (diluted to 1:2,500) was added and incubated for 1 h at 4 °C. Wells were washed three times with 0.05% (v/v) Tween in PBS, followed by a secondary horseradish peroxidase conjugated anti-goat IgG diluted to 1:5,000, and incubated for 1 h at 4 °C. After washing, bound antibody was detected using 100 µl substrate solution (50 mM K2HPO4, pH 5.0, 0.1 mg ml⁻¹ 3,3′,5,5′-tetramethylbenzidine and 0.02% (v/v) H2O2) and incubating for 5 min at room temperature. The reaction was stopped by addition of 50 µl 1 M H2SO4, and the absorbance was measured at 450 nm using a Titertek Multiskan plate reader. All assays were carried out using GST as a control. When evaluating the effect of calcium, 50 mM Tris-HCl buffer, pH 7.5, containing 150 mM NaCl was used instead of PBS.

NMR studies. Solutions for NMR were ∼1 mM protein in 50 mM sodium phosphate, pH 6.5. NMR spectra were obtained at 500 and 600 MHz on Bruker DRX spectrometers using 5 mm probeheads with z gradients. Assignments and structure restraints were obtained from 2D and 3D 15N-separated NOESY and TOCSY spectra. Additional restraints were obtained from E.COSY, HNHA and HNHB experiments. In initial rounds of structure calculation, only unambiguous NOEs and ϕ-dihedral restraints were used. In subsequent rounds, further NOE restraints, including ambiguous restraints [23], were added along with side chain dihedral restraints, stereoassignments and hydrogen bonding restraints [24]. Structures were calculated by hybrid distance geometry/simulated annealing in X-PLOR [25]. In the final round of calculations, 50 structures were selected, of which the 34 with low energy and few restraint violations were used as the ensemble for analysis. A minimized average structure was calculated from the ensemble (Table 1). The structures had no distance violations >0.5 Å or dihedral violations >5°. 15N T2 relaxation measurements were carried out as described [26].

Coordinates. The coordinates have been deposited in the Protein Data Bank (accession codes 1E8P and 1E8Q for minimized average and ensemble, respectively), and chemical shifts in BioMagResBank (accession number 3322).

Acknowledgments

We thank the Biotechnology and Biological Sciences Research Council (BBSRC) for project grants and CASE studentships, Finnfeeds International for financial support, and BBSRC and the Wellcome Trust for equipment grants. The Krebs Institute is a BBSRC Centre. M.P.W., S.R. and P.J.S. are members of the BBSRC-funded North of England Structural Biology Centre.

Correspondence should be addressed to MPW email: [email protected]

Received 28 March, 2001; accepted 19 July, 2001.

1. Shoham, Y., Lamed, R. & Bayer, E.A. Trends Microbiol. 7, 275–281 (1999).
2. Bayer, E.A., Chanzy, H., Lamed, R. & Shoham, Y. Curr. Opin. Struct. Biol. 8, 548–557 (1998).
3. Yaron, S., Morag, E., Bayer, E.A., Lamed, R. & Shoham, Y. FEBS Lett. 360, 121–124 (1995).
4. Lytle, B.L., Myers, C., Kruus, K. & Wu, J.H.D. J. Bacteriol. 178, 1200–1203 (1996).
5. Munn, E.A. In Anaerobic fungi: biology, ecology, and function (eds Mountfort, D.O. & Orpin, C.G.) 47–105 (Marcel Dekker, New York; 1994).
6. Orpin, C.G. J. Gen. Microbiol. 123, 287–296 (1981).
7. Ali, B.R.S. et al. FEMS Microbiol. Lett. 125, 15–21 (1995).
8. Fanutti, C., Ponyi, T., Black, G.W., Hazlewood, G.P. & Gilbert, H.J. J. Biol. Chem. 270, 29314–29322 (1995).
9. Hazlewood, G.P. & Gilbert, H.J. In Carbohydrases from Trichoderma reesei and other microorganisms (eds Claeyssens, M., Nerinckx, W. & Piens, K.) 147–155 (Royal Society of Chemistry, Cambridge, UK; 1998).
10. Bayer, E.A., Shimon, L.J.W., Shoham, Y.
& Lamed, R. J. Struct. Biol. 124, 221–234 (1998).
11. Li, X.-L., Chen, H. & Ljungdahl, L.G. Appl. Environ. Microbiol. 63, 628–635 (1997).
12. Li, X.-L., Chen, H. & Ljungdahl, L.G. Appl. Environ. Microbiol. 63, 4721–4728 (1997).
13. Shimon, L.J.W. et al. Structure 5, 381–390 (1997).
14. Tavares, G.A., Souchon, H., Guérin, D.M., Lascombe, M.-B. & Alzari, P.M. In Carbohydrases from Trichoderma reesei and other microorganisms (eds Claeyssens, M., Nerinckx, W. & Piens, K.) 174–181 (Royal Society of Chemistry, Cambridge, UK; 1998).
15. Lytle, B.L., Volkman, B.F., Westler, W.M. & Wu, J.H.D. Arch. Biochem. Biophys. 379, 237–244 (2000).
16. Lytle, B.L., Volkman, B.F., Westler, W.M., Heckman, M.P. & Wu, J.H.D. J. Mol. Biol. 307, 745–753 (2001).
17. Eberhardt, R.Y., Gilbert, H.J. & Hazlewood, G.P. Microbiol. 146, 1999–2008 (2000).
18. Kataeva, I., Guglielmi, G. & Béguin, P. Biochem. J. 326, 617–624 (1997).
19. Tavares, G.A., Béguin, P. & Alzari, P.M. J. Mol. Biol. 273, 701–713 (1997).
20. Simpson, P.J., Xie, H., Bolam, D.N., Gilbert, H.J. & Williamson, M.P. J. Biol. Chem. 275, 41137–41142 (2000).
21. Mechaly, A. et al. J. Biol. Chem. 276, 9883–9888 (2001).
22. Kemp, P., Lander, D.J. & Orpin, C.G. J. Gen. Microbiol. 130, 27–37 (1984).
23. Nilges, M. J. Mol. Biol. 245, 645–660 (1995).
24. Baxter, N.J. & Williamson, M.P. J. Biomol. NMR 9, 359–369 (1997).
25. Simpson, P.J. et al. Structure 7, 853–864 (1999).
26. Sorimachi, K., Le Gal-Coëffet, M.-F., Williamson, G., Archer, D.B. & Williamson, M.P. Structure 5, 647–661 (1997).
27. Kraulis, P. J. Appl. Crystallogr. 24, 946–950 (1991).