doi:10.1016/S0022-2836(02)01036-7, available online at http://www.idealibrary.com
J. Mol. Biol. (2002) 324, 105–121

Analysis of Catalytic Residues in Enzyme Active Sites

Gail J. Bartlett1,2, Craig T. Porter1,2, Neera Borkakoti3 and Janet M. Thornton2*†

1 Department of Biochemistry and Molecular Biology, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK
2 European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
3 Roche Discovery Welwyn, Broadwater Road, Welwyn Garden City, Herts AL7 3AY, UK

*Corresponding author

We present an analysis of the residues directly involved in catalysis in 178 enzyme active sites. Specific criteria were derived to define a catalytic residue, and used to create a catalytic residue dataset, which was then analysed in terms of properties including secondary structure, solvent accessibility, flexibility, conservation, quaternary structure and function. The results indicate the dominance of a small set of amino acid residues in catalysis and give a picture of a general active site environment. It is hoped that this information will provide a better understanding of the molecular mechanisms involved in catalysis and a heuristic basis for predicting catalytic residues in enzymes of unknown function.

© 2002 Elsevier Science Ltd. All rights reserved

Keywords: enzyme active site; catalysis; amino acid residue; enzyme function

Introduction

Enzymes are probably the most studied biological molecules. They constitute nature’s toolkit for making and breaking down molecules required by cells in the course of growth, repair, maintenance and death. Virtually every biological process requires an enzyme at some point.
Enzymes are capable of carrying out complex transformations in aqueous solution, at biological temperatures and pH, in a stereospecific and regiospecific manner, a feat seldom achieved by the best of organic chemists.1 Perhaps the best-known enzyme catalytic mechanism is that of the serine proteases, which contain a Ser-His-Asp triad.2,3 This triad has evolved more than once in different structural folds.4 Knowledge and improved understanding of the properties of enzyme active sites and their assorted catalytic mechanisms are vital for novel protein design and for predicting protein function from structure.

Crystallographic and NMR studies of enzymes have shed light on the relationship between an enzyme’s three-dimensional structure and the chemical reaction it performs. However, extrapolating a catalytic mechanism from a structure alone is a challenging task. Detailed biochemical information about the enzyme can be used to design substrate or transition state analogues, which can then be bound into the enzyme for structure determination. These can reveal binding site locations and identify residues that are likely to take part in the chemical reaction. From this, a catalytic mechanism can be proposed and then confirmed by other information, for example site-directed mutagenesis, kinetic analyses and extrapolation from homologues.

Present address: N. Borkakoti, Medivir UK Ltd, Peterhouse Technology Park, 100 Fulbourn Road, Cambridge, UK.
† On secondment from the Department of Biochemistry and Molecular Biology, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK and Department of Crystallography, Birkbeck College, Malet Street, London WC1E 7HX, UK.
Abbreviations used: DOPS, diversity of position score; EC, Enzyme Commission; NRDB, Non-Redundant DataBase; PDB, Protein Data Bank.
E-mail address of the corresponding author: [email protected]
This analysis concentrates on the amino acid residues directly involved in enzyme catalysis, as revealed by structural studies. It builds on the work of Zvelebil & Sternberg,5 who in 1988 performed a comparative analysis of catalytic residues in just 17 enzymes. Since this work was published, the number of enzyme structures in the PDB6 has increased forty-fold†, and techniques for elucidating enzyme catalytic mechanisms have improved. Therefore, it is appropriate to re-examine amino acid residues involved in catalysis, as well as their properties and roles, on a wider scale. A major problem is the complexity of the data, and the difficulty of extracting the relevant information from the literature. In addition, the need to cluster proteins into related families to generate “good” unbiased data is non-trivial.

Table 1. Example of information extracted for each enzyme in the dataset: carboxypeptidase D

Attribute               Information
PDB code                1bcr
Enzyme name             Carboxypeptidase D
EC number               3.4.16.6
CATH classification     3.40.50.1570 (α/β-class, 3-layer (αβα) sandwich)
Reaction catalysed      Protease catalysing C-terminal hydrolysis of protein, with preference for arginine and lysine residues
Mechanism               Ser-His-Asp catalytic triad (see Figure 1)
Active site residues    (1) A Ser146: nucleophile
                        (2) B His397: primer, activates serine nucleophile; acid/base, donates proton to leaving NH group; activates water molecule for hydrolysis of covalent intermediate
                        (3) B Asp338: primer, ensures lone pair of electrons on His397 Nε2
                        (4) A Gly53-NH: transition state stabilisation, stabilises negative charge on tetrahedral intermediate
                        (5) A Tyr147-NH: transition state stabilisation, stabilises negative charge on tetrahedral intermediate
Bioactive form          Homodimer
Prosite entry           PDOC00122
Reference               Medline: 7727364

0022-2836/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved
The following properties of catalytic residues are examined: frequency distribution of residue type, function, secondary structure environment, solvent accessibility, flexibility, conservation, hydrogen bonding and quaternary structure. It is hoped that these data will improve our understanding of the generic principles of catalysis. They provide structure-based sequence annotation, which can help identify potential catalytic residues from structure, and a test-bed for developing tools to predict mechanism from structure. Such tools are the basis for predicting the function of structures produced by structural genomics initiatives.

† http://www.rcsb.org/pdb/holdings_table.html

Criteria and Analysis of Catalytic Residues

Collection of dataset

A Protein Site Atlas of functional sites, including literature-defined enzyme active sites, is currently under construction (C.T.P. & J.M.T., unpublished results). Starting from the EC system,7 for each EC number (see legend to Figure 3), enzymes with structures in the PDB were examined and, where possible, active site residues assigned. It must be noted that these are not simply the contents of the SITE records of the PDB files, but contain information manually extracted from the primary literature. In order to generate a non-homologous dataset, the CATH classification8,9 of each enzyme was examined, and any duplicates at the CATH “H” level (defined by structure and sequence comparisons as having a common ancestor) were removed. A complexity here is that CATH classifies protein domains, whereas this analysis concentrates on whole enzymes, which may have one or more domains. Domains with identical CATH numbers are retained in the dataset if they form part of the same enzyme (i.e. a tandem repeat) or if an identical domain is shared between two multi-domain proteins with different functions. From this list, a set of 178 was taken to form the dataset for our work.
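The redundancy filter described above can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used: it assumes each enzyme is represented by the list of CATH numbers of its domains, keeps the first enzyme seen for each CATH “H”-level number, and does not model the second exception (an identical domain shared between two multi-domain proteins with different functions). The function name and the entries `hypA` and `hypB` are hypothetical; only the 1bcr classification is taken from Table 1.

```python
from typing import Dict, List, Set


def select_non_homologous(enzymes: Dict[str, List[str]]) -> List[str]:
    """Greedy pass: keep an enzyme only if none of its CATH numbers was used before.

    `enzymes` maps an identifier to the CATH numbers (C.A.T.H) of its domains.
    Repeated domains within one enzyme (a tandem repeat) do not exclude it,
    because duplicates are only checked between enzymes.
    """
    seen: Set[str] = set()
    kept: List[str] = []
    for identifier, cath_numbers in enzymes.items():
        superfamilies = set(cath_numbers)  # a tandem repeat collapses to one entry
        if superfamilies & seen:
            continue  # shares an H-level classification with an enzyme already kept
        kept.append(identifier)
        seen |= superfamilies
    return kept


# 1bcr is from Table 1; hypA and hypB are hypothetical entries for illustration.
demo = {
    "1bcr": ["3.40.50.1570"],
    "hypA": ["3.40.50.1570"],              # same H level as 1bcr, so dropped
    "hypB": ["1.10.10.10", "1.10.10.10"],  # tandem repeat within one enzyme, kept
}
print(select_non_homologous(demo))
```

Because the pass is greedy, the outcome depends on the order in which enzymes are presented; a real curation step would need an explicit rule for choosing the representative.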
Each enzyme has an X-ray crystal structure (resolutions vary between 1.5 Å and 3.2 Å) in the PDB (except for two cases which are NMR-derived structures; PDB ID: 1mek and 1adn), a well-defined active site and a mechanism of action proposed in the literature, usually corroborated by site-directed mutagenesis and other data. The information gathered for the enzyme carboxypeptidase D10 is shown in Table 1 and Figure 1 as an example. Primary literature used to collate this dataset and residue assignments can be found at the website‡.

‡ http://www.ebi.ac.uk/thornton-srv/databases/CATRES/index.html

Figure 1. The catalytic mechanism of carboxypeptidase D.10

Definition of catalytic residues

Catalytic residues are not consistently defined in the literature; therefore, the following rules were adhered to for classifying active site residues as catalytic.

1. Direct involvement in the catalytic mechanism, e.g. as a nucleophile.
2. Exerting an effect, which aids catalysis, on another residue or water molecule that is directly involved in the catalytic mechanism (e.g. by electrostatic or acid–base action).
3. Stabilisation of a proposed transition-state intermediate.
4. Exerting an effect on a substrate or cofactor which aids catalysis, e.g. by polarising a bond which is to be broken. This includes steric and electrostatic effects.

Residues that bind substrate, cofactor or metal are not included, unless they also perform one of the functions listed above.

Residue analysis

Solvent accessibility was calculated using NACCESS, taking the biological molecule as defined by either the literature or PQS,11 in the presence and absence of ligands (either the true substrate or a substrate analogue). Enzyme clefts were derived using SURFNET.12 Temperature factors (B-factors) were taken from the PDB file for each atom in a residue, and then averaged over the whole residue.
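A minimal sketch of the B-factor handling is given here: atomic values are averaged per residue, then scaled across the protein so that each residue receives a value between 0 and 1. The text does not state the scaling formula, so the min-max scaling used below is an assumption, as are the function names and the tuple layout of the input.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

ResidueKey = Tuple[str, int]  # (chain identifier, residue number)


def residue_mean_bfactors(
    atoms: Iterable[Tuple[str, int, float]]
) -> Dict[ResidueKey, float]:
    """Average atomic B-factors over each residue.

    `atoms` yields (chain, residue number, B-factor) for every atom.
    """
    sums: Dict[ResidueKey, float] = defaultdict(float)
    counts: Dict[ResidueKey, int] = defaultdict(int)
    for chain, resnum, bfactor in atoms:
        sums[(chain, resnum)] += bfactor
        counts[(chain, resnum)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def normalise_over_protein(means: Dict[ResidueKey, float]) -> Dict[ResidueKey, float]:
    """Scale per-residue means to the range 0 to 1 (min-max scaling, an assumption)."""
    lowest, highest = min(means.values()), max(means.values())
    if highest == lowest:
        # Degenerate case: every residue has the same mean B-factor.
        return {key: 0.0 for key in means}
    return {key: (value - lowest) / (highest - lowest) for key, value in means.items()}


# Illustrative atoms only, not taken from any PDB entry.
atoms = [("A", 1, 10.0), ("A", 1, 20.0), ("A", 2, 30.0), ("A", 3, 45.0)]
normalised = normalise_over_protein(residue_mean_bfactors(atoms))
print(normalised)
```

Scaling within each protein rather than across the dataset keeps structures of different resolution and refinement protocol comparable, which is the stated purpose of the normalisation.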
To exclude variations between proteins, the B-factors for each protein were then normalised over the whole protein, giving a B-factor of between 0 and 1 for each residue. NMR models were removed for this part of the analysis. Homologous sequences from the Non-Redundant DataBase (NRDB, a database of protein sequences maintained by NCBI) were identified for each enzyme using the iterative profile search program PSI-BLAST,13 which was allowed a maximum of 20 iterations to reach convergence. The E-value threshold for inclusion of new sequences was set conservatively at 10^-40, in order to minimise profile drift but maximise the detection of remote homologues. The profile was used as a multiple alignment to score catalytic residue conservation, using the method of Valdar & Thornton.14 This method uses amino acid residue similarities inferred from a Dayhoff-like mutation data matrix15 to assess the diversity of amino acid residues at a given aligned position. Only those enzymes whose alignment had a diversity of position score (DOPS) of greater than 90 were included in this part of the analysis. DOPS is a measure of the number of distinct permutations of residue scores, based on Shannon’s entropy.16 The greater the DOPS, the more diverse the alignment, providing a more discriminating conservation score. Catalytic residue hydrogen bonding was investigated using HBPlus.17 The secondary structure environment of catalytic residues was analysed using PROMOTIF.18

Figure 2. The role of histidine in the first step of serine protease and adenylosuccinate lyase reactions.10,38 (a) Carboxypeptidase D, histidine primes serine residue for nucleophilic attack on the substrate; (b) Adenylosuccinate lyase, histidine residue directly deprotonates the substrate.

Table 2. Functional classification of catalytic residues

Catalytic function            Description
Acid–base                     Involved in proton abstraction, donation or both, to or from a substrate, as a direct part of the catalytic mechanism. Excludes residues which affect other residues or water molecules in this manner
Nucleophile (a)               Forms a covalent intermediate with the substrate via nucleophilic attack
Transition state stabiliser   Stabilises the transition state in some way (e.g. by stabilising an oxyanion hole formed during ester hydrolysis), lowering the activation energy of the reaction
Activate water                Alters the pKa of or deprotonates a water molecule which is directly involved in the reaction
Activate cofactor             Exerts a favourable effect on a cofactor (could be metal or minor substrate such as FAD) through various means (e.g. by altering redox potential, or increasing effective charge)
Primer                        Exerts a favourable effect on another residue directly involved in the catalytic mechanism, e.g. by acting as an acid or base, or through electrostatic effects
Activate substrate            Exerts a favourable effect on the substrate (e.g. by polarising a bond to be broken)
Radical                       Forms a radical which is involved in the catalytic reaction
Modified                      Modified in some way in order to perform catalysis during the reaction, e.g. carbamylated lysine residue

(a) This differs from the classical organic chemistry definition of a nucleophile, which is an electron pair donor. By this definition, bases would also be classified as nucleophiles; therefore, in this analysis, the definition of nucleophile applies to covalent catalysis only.

Residue function

There are many complications in assigning the function of a catalytic residue, due to the multistep nature of chemical reactions. One residue can play more than one role and can be involved in different steps of the reaction. Inevitably, catalytic mechanisms can only be properly modelled by quantum mechanical methods, and the “curly-arrow” diagrams are just a schematic representation.
Standard organic chemistry terms for the role a residue plays in a catalytic mechanism can occasionally be ambiguous. For example, the role of histidine in the first step of the serine protease mechanism and in the adenylosuccinate lyase mechanism is, chemically speaking, the same: both residues perform proton abstraction (see Figure 2). However, the effect in adenylosuccinate lyase is on the substrate itself, while in the serine proteases, the effect is on another protein residue directly involved in the reaction. In the classification proposed herein, the histidine residue in the serine protease is described as priming another residue in the first step, while in adenylosuccinate lyase, the histidine residue is described as a base. In addition, some residues will achieve the same function (e.g. lowering the pKa of another catalytic residue) by different means (e.g. by direct acid–base action, or by increasing the effective charge in the locality), so it is more meaningful to group these together. The identified catalytic residues have therefore been grouped into the classifications shown in Table 2. Some residues may perform more than one function in a particular reaction step, but the more functionally informative classification is chosen as the main classification. For instance, if during the course of a reaction, a residue acts as a base and deprotonates a substrate, and then the substrate goes on to perform nucleophilic attack on another substrate, then the classification “activates substrate” is chosen rather than “acid–base”. The groups can be broadly classified into two classes: primary and secondary. The primary groups, acid–base, nucleophile and transition state stabiliser, are at the forefront of the chemical reaction the enzyme performs. Residues that activate substrate, water, cofactor or prime another residue can be thought of as secondary catalytic residues, important for “setting up” the reaction.

Figure 3. Structural and functional description of enzyme dataset. (a) EC wheel functional classification of dataset. The EC classification7 assigns a four digit number to the reaction catalysed, where the first digit denotes the class of reaction (green, oxidoreductases (EC 1.-.-.-); red, transferases (EC 2.-.-.-); yellow, hydrolases (EC 3.-.-.-); blue, lyases (EC 4.-.-.-); orange, isomerases (EC 5.-.-.-); pink, ligases (EC 6.-.-.-)). The second, third and fourth levels classify type of bond or substrate acted upon, substrate/product specificities and cofactor dependency. The meaning of the second, third and fourth levels is dependent on the primary level. See Todd et al.39 for more details. (b) EC wheel functional classification of all known enzymes,40 colours as in (a). (c) The CATH structural classification of the dataset. The CATH classification assigns a four digit number to each protein domain according to its secondary structure9 (red, mainly α; green, mainly β; yellow, α/β; blue, few secondary structures). See http://www.biochem.ucl.ac.uk/bsm/cath_new/cath_info.html for more details.

Results

Description of dataset

There are 178 enzymes in the dataset, and 615 catalytic residues, giving each enzyme an average of 3.5 catalytic residues. A functional description of the dataset is given by the EC wheel (see Figure 3(a)). The EC wheel is a visual representation of all the EC numbers covered by the dataset. Each ring in the concentric pie chart represents one level of the EC classification. The primary classification (1st digit) is represented by colours and the innermost circle. The EC wheel for all enzymes which have been classified by the Enzyme Commission (EC)7 is shown in Figure 3(b) for comparison. The two datasets have similar proportions of each EC classification, although there are slightly fewer hydrolases (EC 3.-.-.-, 28% of dataset compared with 34% overall) and slightly more lyases (EC 4.-.-.-, 16% of dataset compared with 11% overall) in our dataset. This shows that the dataset is a reasonable representation of all known enzyme functions.

Figure 4. Observed frequency distribution of catalytic residue types compared with all residues in the dataset. CYSH indicates free cysteine residues. CYSS indicates disulphide-bridged cysteine residues. Catalytic residues were taken from each structure. In the case of structures with multiple subunits, the smallest possible unique unit was taken, e.g. in a homodimer with catalytic residues on one subunit only, one subunit was used for the all residue calculation. If the catalytic residues were split across two subunits, and there were two active sites in the homodimer, only one subunit was used for the all residue calculation. However, if catalytic residues were split across two subunits, and there was only one active site in the dimer, both subunits were used for the all residue calculation.

A structural description of the dataset is given by the CATH wheel (Figure 3(c)), with the CATH wheel for all proteins in the PDB as a comparison (Figure 3(d)). A total of 262 out of 303 protein domains (86%) in the dataset are fully classified in the CATH database. Of these, approximately two-thirds of the enzymes in the dataset fall into the α/β-class of proteins, with approximately one-sixth each in the mainly α and mainly β classes. There are a small number of enzymes with few secondary structures. The dominance of α/β-structures is different from the distribution across the whole PDB. The mainly α and mainly β classes are both under-represented when compared with all proteins in the PDB. It has previously been suggested19 that the under-representation of the mainly α class is due to the fact that in helices, the main-chain polar group hydrogen bonding potential is fully satisfied and these groups are not available for catalytic interactions.
The edges of β-sheets are thought to be more accessible for interactions with the substrate and catalytic machinery. The dominance of α/β-folds is largely due to the presence of the nucleotide binding domain in many enzymes, and has been seen in previous fold/function analyses.19,20

Frequency distribution

Figure 4 shows the observed frequency distribution of the different types of catalytic residue, compared with that of all residues in the dataset. Table 3 groups these into catalytic residue types. Of these, 65% of catalytic residues are provided by the charged group of residues (H, R, K, E, D), while 27% are provided by the polar group of residues (Q, T, S, N, C, Y, W), and just 8% are provided by the hydrophobic group of residues. This is as expected: catalysis involves the movement of protons and electrons and charge stabilisation, which needs electrostatic forces provided by charged and/or polar residues. There is no correlation between percentage abundance in the dataset and contribution to catalysis.

Table 3. Catalytic residue types and their secondary structure compared with all residues in the dataset

                      Catalytic residue type (a)                   Secondary structure environment
                      Charged (%)   Polar (%)   Hydrophobic (%)    Alpha helix (%)   Beta sheet (%)   Coil (%)
Catalytic residues    65            27          8                  28                22               50
All residues          25            25          50                 47                23               30

(a) Histidine has been included in the charged group of residues. Although strictly speaking it should be described as polar, its pKa in a protein is usually altered so that it behaves as a charged residue.

Figure 5. Catalytic propensity of residue types. Catalytic propensity is defined as the percentage of catalytic residues constituted by a particular residue type, divided by the percentage of all residues constituted by the same particular residue type.

Histidine constitutes 18% of all catalytic residues in proteins, although it has a low overall percentage abundance (2.7%).
Histidine is particularly suitable for carrying out catalytic reaction steps, as it can be either charged or neutral at physiological pH and can play the role of nucleophile, acid or base, or be involved in stabilising the transition state of a reaction. Aspartate and glutamate residues constitute 15% and 11% of catalytic residues, respectively. Their natural abundance is almost identical (5.7% and 5.9%, respectively). It could be that aspartate is slightly favoured over glutamate because its side-chain is shorter by one methylene group, making it less flexible so that it can be held in place, aiding catalysis. Arginine and lysine constitute 11% and 9% of catalytic residues, respectively. Arginine occurs more frequently in spite of its lower natural abundance in the dataset (4.9% for arginine and 5.8% for lysine). This preference may be due to the three nitrogen groups in the side-chain, all of which can perform electrostatic interactions, compared with just one in the side-chain of lysine. Additionally, since the side-chain of arginine can make more electrostatic interactions, it can be positioned more accurately to facilitate catalysis. The arginine side-chain also has a good geometry to stabilise a pair of oxygen atoms on a phosphate group, a common biological moiety. Cysteine constitutes 5.6% of catalytic residues, while its natural abundance is only 1.2%. Disulphide bridges identified by PROMOTIF18 were grouped separately, so only “free” cysteine residues are counted in the cysteine group. Four disulphide bridge-forming cysteine residues are involved in catalysis; these are found in glutathione reductase and protein disulphide isomerase. Formation and cleavage of a disulphide bridge between the two residues in glutathione reductase forms part of the catalytic cycle.21,22 Destabilisation of the disulphide bridge in protein disulphide isomerase is thought to be part of the driving force for catalysis in this enzyme.
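Catalytic propensity, as defined in the caption of Figure 5, is the percentage of catalytic residues of a given type divided by the percentage of all residues of that type. The short sketch below applies that ratio to the rounded percentages quoted in the text for six residue types; the function and variable names are illustrative, and values for the remaining residue types are not quoted in the text, so they are omitted.

```python
# (percent of catalytic residues, percent of all residues), as quoted in the text
QUOTED_PERCENTAGES = {
    "His": (18.0, 2.7),
    "Asp": (15.0, 5.7),
    "Glu": (11.0, 5.9),
    "Arg": (11.0, 4.9),
    "Lys": (9.0, 5.8),
    "Cys": (5.6, 1.2),
}


def catalytic_propensity(pct_catalytic: float, pct_all: float) -> float:
    """Ratio of a residue type's share of catalytic residues to its overall share."""
    return pct_catalytic / pct_all


# Rank residue types from highest to lowest propensity.
ranked = sorted(
    QUOTED_PERCENTAGES,
    key=lambda residue: catalytic_propensity(*QUOTED_PERCENTAGES[residue]),
    reverse=True,
)
for residue in ranked:
    print(residue, round(catalytic_propensity(*QUOTED_PERCENTAGES[residue]), 2))
```

With these rounded figures histidine and cysteine come out highest, and arginine ranks above glutamate even though both supply 11% of catalytic residues, because arginine is the less abundant of the two.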
The high proportion of catalytic cysteine residues highlights the importance of the thiol group in catalysis. Its –SH group is easily deprotonated to –S− as it has a pKa value of 9. Indeed, if one looks at the catalytic propensity (or “catalycity”) of each residue (the proportion of catalytic residues/proportion of all residues for each residue type, see Figure 5), cysteine has the second highest catalycity behind histidine. These two residues have the closest pKa values to biological pH of all the amino acid residue side-chains, and this may explain their high catalycity. Acid–base reactions are very important in enzyme catalysis: the easier it is to deprotonate and reprotonate a residue, the faster it will be able to perform its catalytic function, and the higher the turnover of the enzyme. However, it is well known that the pKa value of any amino acid residue can be altered from the standard solution value when buried within a protein environment.23 This feature is often important in catalysis.

Figure 5 shows the catalytic propensities of the 20 amino acid residues. Histidine and cysteine residues have the highest propensities; these are followed by the rest of the charged residues. Glutamate moves down in the order due to its higher abundance compared with arginine. The charged residues are followed by the polar residues. Tryptophan is the ninth out of 20 residues, an unusually high position. The side-chain of tryptophan is found to be catalytic in only nine situations; however, its propensity is raised by its very low natural abundance. After the polar residues come the rest of the hydrophobic and aromatic residues, as expected. These results are surprisingly similar to the distribution found by Zvelebil & Sternberg,5 whose dataset included only 17 enzymes and 36 catalytic residues. Minor differences between their results and these are probably due to the difference in size of the respective datasets.

Figure 6. Catalytic propensity of residues interacting via their main-chain N–H or C=O groups.

Table 4. Hydrophobic residues aiding catalysis via their side-chain as opposed to their main-chain

Residue   Enzyme                               Description of function
Met219    Human fibroblast stromelysin-1       Enhances effective concentration of Zn2+ cofactor, which coordinates and enhances the nucleophilicity of a hydroxyl nucleophile.41,42
Met20     Dihydrofolate reductase              Provides a hydrophobic region pushing positive charge from N5 of folate to C6 where it can accept hydride from NADPH.37
Leu28     Dihydrofolate reductase              Constrains folate ring in optimum position to receive hydride.37
Leu54     Dihydrofolate reductase              Constrains folate ring in optimum position to receive hydride.37
Leu20     D-amino-acid aminotransferase        Aids PLP cofactor catalysis by supporting the ring orientation without disturbing oscillating motions.43
Gly734    Pyruvate formate lyase               Radical formation at the C-α position.44,45
Phe31     Dihydrofolate reductase              Forces proximity between folate and NADPH optimising hydride transfer.37
Phe175    L-2-haloacid dehalogenase            Forms a halide stabilising cradle which makes the halide a better leaving group.46
Phe77     Pentalenene synthase                 Stabilises carbocation intermediate with Asn219, by cation–π interactions.47
Phe50     4-oxalocrotonate tautomerase         Provides a hydrophobic environment to lower the pKa of the N-terminal nucleophile.48

Side-chain and main-chain interactions

It is useful to distinguish between side-chain and main-chain interactions, because for main-chain interactions the identity of the residue is often irrelevant.
For main-chain interactions, only the N–H and C=O groups are involved, and any one of the 19 amino acid residues (i.e. all except proline) can provide this. Figure 6 shows the catalytic propensities of the 20 residues when acting via the main-chain. The side-chain is used by 92% of catalytic residues and the main-chain by 8%. Of those using the main-chain, 82% use the N–H group and 18% use the C=O group. Main-chain groups often stabilise transition state intermediates, e.g. Gly30 in phospholipase A2.24 Glycine constitutes by far the highest proportion of catalytic residues using the main-chain (44%). It is often seen stabilising oxyanion holes, as in phospholipase A2. Glycine is ideal for this role because of the small size of its side-chain, which can easily fit into any gap in the active site architecture. Its N–H and C=O groups are more accessible than those of bulkier amino acid residues, which are often occluded by the side-chain or their positions in secondary structure. It has been previously suggested that glycine residues provide flexibility necessary for enzyme active sites to change conformation.25 For the hydrophobic residues (M, F, L, I, G, A, P, V), 81% of interactions involve the main-chain, with a few notable exceptions (see Table 4). Where hydrophobic residue side-chains are classified as catalytic, their function is often to provide a neutral environment to increase the relative catalytic power of charged moieties in the same region, or to exert steric strain on substrates which lowers the energy of the transition state of a reaction.

Figure 7. Residue solvent accessibility in the absence of ligands.

Secondary structure

Table 3 shows the secondary structure distribution of catalytic residues compared with all residues in the dataset. The majority (50%) occur in coil regions (i.e. not helix or sheet), considerably more than expected by chance.
They are found with similar frequencies in α-helices and β-sheets (28% and 22%, respectively). This differs from the distribution of all residues, which has a much higher proportion of residues in an α-helical state. A high percentage of β-strand residues are either in an edge strand or at the end of a strand and are therefore available for catalytic interactions with substrates. On the other hand, a larger fraction of helical residues are “internal” to the helix, i.e. not at the ends of the helix, and so fewer are available for catalytic interactions. Indeed, the active sites of all enzymes in the TIM barrel family are found at the C-terminal end of a β-barrel, with catalytic residues either at the end of a β-sheet or in the loops connecting the β-sheets.26 These results differ significantly from those of Zvelebil & Sternberg,5 who found little difference between the secondary structure environment of catalytic residues and all residues in their dataset. This work probably gives a better representation of the distribution due to the increased size of the dataset.

Table 5. Occurrence of catalytic residues in clefts in the enzyme, as calculated by SURFNET12

                                                          Number of enzymes (%)
≥50% of catalytic residues in three largest clefts       151 (85%)
≥50% of catalytic residues in any cleft                  160 (90%)
At least one catalytic residue in any cleft              165 (93%)
No catalytic residues in any cleft                       12 (7%)

Solvent accessibility

Figure 7 shows the relative solvent accessibilities of catalytic residues compared with polar residues and all residues in the dataset, calculated in the absence of ligands. In total, 89% of catalytic residues have a relative solvent accessibility (%RSA, relative to fully exposed residues) of less than 30%. We find approximately 50% of all catalytic residues in the 0–10% bracket, and approximately 25% in the 10–20% bracket. Five per cent of all catalytic residues have 0% RSA and are totally buried.
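The bracket percentages quoted above come from expressing each residue's accessible surface area relative to a fully exposed reference and then counting residues per 10% bracket. A hedged sketch of that bookkeeping follows. The function names are illustrative, the reference area must be supplied by the caller (NACCESS's own reference values are not reproduced here), and the treatment of boundary values (a residue at exactly 10% falls in the 10–20% bracket) is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable


def relative_accessibility(asa: float, reference_asa: float) -> float:
    """%RSA: accessible area as a percentage of the fully exposed reference area."""
    return 100.0 * asa / reference_asa


def bracket_percentages(rsa_values: Iterable[float], width: float = 10.0) -> Dict[str, float]:
    """Percentage of residues falling in each %RSA bracket of the given width."""
    values = list(rsa_values)
    top_bin = int(100 // width) - 1
    counts: Counter = Counter()
    for rsa in values:
        # Values of 100% or more are clamped into the top bracket.
        counts[min(int(rsa // width), top_bin)] += 1
    return {
        f"{int(index * width)}-{int((index + 1) * width)}%": 100.0 * count / len(values)
        for index, count in sorted(counts.items())
    }


# Hypothetical %RSA values for four residues, for illustration only.
print(bracket_percentages([0.0, 5.0, 9.9, 15.0]))
```

For the four illustrative values, three fall in the 0–10% bracket and one in the 10–20% bracket, giving 75% and 25% respectively.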
One might expect to find all catalytic residues fully exposed on the surface of the protein, but the results show that this is not the case. Most catalytic residues have very little exposure to solvent. The major factor could be the need for correct positioning and restriction of the mobility of catalytic residues. Considering surface topography, we find that the majority of catalytic residues occur in a large cleft (see Table 5). In 160 enzymes (90%), over half of the catalytic residues are found in one of the ten largest clefts. Of these enzymes, almost all (151) have over half of their catalytic residues in one of the three largest clefts. The cleft environment will lower the effective dielectric response in the region, which will increase the stabilisation of polar transition states by neighbouring charged residues or metal ions.27

Binding of the ligand excludes the solvent still further (Figure 8). Of the structures in the dataset, 85 contain bound substrates and/or substrate analogues and/or inhibitors (i.e. not always the cognate ligand). Figure 8 shows the solvent accessibility of catalytic residues in these enzymes with and without the ligand present. Upon ligand binding, the percentage of residues with 0–5% relative solvent accessibility increases from 27% to 72%. However, we cannot take into account any domain motion that occurs on substrate binding: we can only examine the solvent accessibility of one rigid structure with and without the ligand present. Apo-enzymes may have exposed catalytic residues that become buried through domain motion on substrate binding.

Figure 8. Comparison of residue solvent accessibility in the presence and absence of ligands.

Residue flexibility

B-factors in the crystal structures were used as a measure of residue flexibility. Figure 9 shows the absolute temperature factors of catalytic residues and all residues in the dataset. Catalytic residues tend to have lower B-factors than all residues, suggesting that they have to be more rigidly held in place than the average residue. The B-factors of catalytic residues in enzymes without any ligand or cofactor present (182 residues) are similar to those of all catalytic residues, but slightly higher, suggesting that catalytic residues become slightly more "fixed" only when the substrate or cofactor is bound.

Figure 9. Catalytic residue B-factors: a measure of residue flexibility. Absolute B-factors were taken from the PDB file for each enzyme and normalised over the whole protein. Enzyme structures determined by NMR (1mek and 1adn) were excluded. Normalised B-factor values were placed into bins and the percentage of residues in each bin displayed.

B-factor plots for individual residue types can be seen in Figure 10. Catalytic arginine, lysine, aspartate and glutamate residues all have much lower B-factors than the average for their residue type. Arginine could have one or two nitrogen groups tethered while the others perform the catalytic function. Lysine, which normally has a very flexible side-chain, has to be tethered for catalysis. For glutamate and aspartate residues, one of the oxygen atoms of the carboxylic acid group can be tethered whilst the other performs its catalytic function. The distribution of B-factors for catalytic histidine and cysteine residues is more similar to that of all histidine and cysteine residues. This could be due to the higher proportion of these residues being catalytic.

Figure 10. Normalised B-factors for individual catalytic residues (Arg, Asp, Cys, Glu, His and Lys) compared with normalised B-factors for all residues of the same type.

Conservation

One hundred and ten enzymes in the dataset produced sequence alignments that were suitably diverse for meaningful conservation analysis. The conservation of catalytic residues compared with all residues is shown in Figure 11(a). Catalytic residues are clearly more conserved than the average residue. Figure 11(b) shows the conservation of residues within spheres of 4 Å, 8 Å and 12 Å radius around the centre of gravity of the catalytic residues, and also the conservation of residues ±4, 8 and 12 sequence positions away from each catalytic residue. The conservation of residues falls steadily as the distance from the catalytic residues increases. This highlights the strong selection pressures on catalytic residues compared with other residues in the vicinity of the active site, which will be important for substrate recognition. Efficient catalysis depends on exquisite positioning of critical atoms, which can often only be achieved by using specific amino acid residues (e.g. aspartate instead of glutamate). Additionally, residues structurally close to catalytic residues are more conserved than those close by in sequence. One caveat is that enzyme active sites are not necessarily spherical, and the sphere may also pick up some buried core residues which are conserved because they are essential for maintaining the structural integrity of the protein.

Figure 11. Residue conservation scores. (a) Catalytic residue conservation scores compared with conservation scores for all residues in the dataset. The conservation score ranges from 0 (least conserved) to 1 (most conserved). (b) Conservation scores in sequence and structural locality. The centre of gravity of the catalytic residues in each enzyme was calculated and the conservation score of any residue falling within a sphere of 4 Å, 8 Å and 12 Å of the centre of gravity was recorded. Additionally, the conservation scores of residues at sequence positions ±4, 8 and 12 amino acid residues from each catalytic residue were recorded.

Table 6. Catalytic residue hydrogen bonds (number of residues making ≥1 H-bond)

Residue             Via –N–H or –C=O group          Via side-chain atoms            Total
(number analysed)   To protein  To ligand  Total    To protein  To ligand  Total
His (107)           86          1          87       81 (96%)    20 (24%)   84       100
Asp (93)            74          6          77       73 (96%)    10 (13%)   76       90
Arg (67)            55          3          55       54 (92%)    30 (51%)   59       65
Glu (65)            59          1          59       44 (94%)    6 (12%)    47       62
Lys (55)            49          3          50       40 (89%)    17 (38%)   45       54
Cys (38)            32          3          32       16          2          16       33
Tyr (32)            27          2          27       21          5          22       31
Asn (26)            18          1          19       20          3          20       23
Ser (26)            22          5          25       20          4          21       26
Gly (24)            10          7          14       0           0          0        14
Thr (18)            16          3          16       13          3          14       18
Gln (14)            13          1          13       11          5          13       14
Trp (9)             8           0          8        5           0          5        8
Phe (7)             5           2          6        –           –          –        6
Leu (7)             6           1          6        –           –          –        6
Met (4)             2           0          2        –           –          –        2
Ala (1)             1           0          1        –           –          –        1
Ile (2)             2           0          2        –           –          –        2
Pro (2)             1           0          1        –           –          –        1
Val (1)             1           0          1        –           –          –        1
All (598)           487         39         501      498         105        422      557 (93%)
M/C a (48)          28          14         34       6           3          7        34 (71%)

Percentages shown for His, Asp, Arg, Glu and Lys are percentages of the total number making at least one hydrogen bond via side-chain atoms (e.g. 84 for histidine). Percentages shown on the "All" and M/C lines are percentages of the total number of residues considered (600 for All, 48 for M/C).
a Catalytic residues acting via main-chain groups.

Hydrogen bonding

Table 6 shows the hydrogen bonds made by all catalytic residues. Hydrogen bonds to water molecules were excluded from this part of the analysis, although these are often critical components of catalysis.
Of 598 catalytic residues considered, the majority (93%) enter into at least one hydrogen bond interaction, be it as a donor or acceptor. This shows that catalytic residues have limited conformational freedom. Some 84% of residues make at least one hydrogen bond via either their N–H or C=O group, while 75% of residues make at least one hydrogen bond via a side-chain atom. This suggests that the residue conformation is usually strongly tethered at both the main-chain and the side-chain. Of the residues making hydrogen bonds via the N–H or C=O groups, almost all (97%) hydrogen bond to another residue in the protein, and a very small proportion (8%) hydrogen bond to a ligand. Most of these hydrogen bonds will probably be necessary to maintain positioning of the catalytic residues. Of the residues making hydrogen bonds via side-chain atoms, almost all (94%) hydrogen bond with other amino acid residues in the protein. A relatively small proportion form a hydrogen bond with a ligand (19%). Looking at individual residues, a significantly higher proportion of the positively charged residues, lysine and arginine, hydrogen bond to a ligand (38% and 51%, respectively) compared with the negatively charged residues, aspartate and glutamate (13% and 12%, respectively). This is possibly because many metabolites are negatively charged: phosphorylation of compounds such as glucose is one mechanism by which they are retained inside the cell.

Residues taking part in catalysis via their main-chain groups form hydrogen bonds with the protein (94%) or with the ligand (41%). Again, these residues have tethered conformations. Only 21% form side-chain hydrogen bonds, reflecting in part the high percentage of glycine residues, but also the non-involvement of the side-chain in catalysis.
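Several of the proportions quoted in this section follow directly from the counts in Table 6. The sketch below recomputes them as a worked check; the counts are transcribed from the table, while the variable and function names are illustrative only. Small differences of a percentage point from the printed values can arise from rounding.

```python
# Counts transcribed from the "All" row of Table 6.
ALL_ANALYSED = 598          # catalytic residues analysed
ALL_ANY_HBOND = 557         # residues making at least one hydrogen bond
MAIN_CHAIN = {"to_protein": 487, "to_ligand": 39, "total": 501}

# Side-chain columns of Table 6: (bonded to ligand, total making >=1 side-chain H-bond).
SIDE_CHAIN = {
    "Arg": (30, 59),
    "Lys": (17, 45),
    "Asp": (10, 76),
    "Glu": (6, 47),
}


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Fraction of all residues making any hydrogen bond, and any main-chain hydrogen bond.
any_hbond = percent(ALL_ANY_HBOND, ALL_ANALYSED)
via_main_chain = percent(MAIN_CHAIN["total"], ALL_ANALYSED)

# Among main-chain hydrogen bonders: partner is the protein or a ligand.
main_to_protein = percent(MAIN_CHAIN["to_protein"], MAIN_CHAIN["total"])
main_to_ligand = percent(MAIN_CHAIN["to_ligand"], MAIN_CHAIN["total"])

# Per-residue fraction of side-chain hydrogen bonders that bond to a ligand.
ligand_fraction = {res: percent(lig, tot) for res, (lig, tot) in SIDE_CHAIN.items()}

print(f"any H-bond: {any_hbond:.1f}%, via main chain: {via_main_chain:.1f}%")
print(f"main chain to protein: {main_to_protein:.1f}%, to ligand: {main_to_ligand:.1f}%")
for res, value in ligand_fraction.items():
    print(f"{res} side chain to ligand: {value:.1f}%")
```

The per-residue loop makes the contrast in the text explicit: the positively charged residues (Arg, Lys) bond to ligands far more often than the negatively charged ones (Asp, Glu).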
Table 7. Catalytic residue functions

Residue (total)     Acid/base   Nucleophile  Transition state  Activates water/   Activates   Other (radical/
                                             stabiliser        cofactor/residue   substrate   modified)
Histidine (113)     58          4            18                37                 13          1
Aspartate (92)      31          6            10                45                 6           –
Arginine (68)       6           –            51                9                  5           –
Glutamate (67)      30          –            7                 31                 5           –
Lysine (56)         13          1            24                11                 9           4
Cysteine (39)       6           21           2                 5                  1           7
Tyrosine (34)       17          1            7                 4                  5           2
Asparagine (28)     1           –            19                6                  5           –
Serine (27)         6           9            4                 9                  1           –
Glycine (24)        –           –            19                3                  1           1
Threonine (18)      3           1            10                4                  –           –
Glutamine (15)      –           –            7                 5                  3           –
Tryptophan (9)      –           –            3                 3                  –           3
Phenylalanine (7)   –           –            5                 1                  1           –
Leucine (7)         –           –            3                 2                  2           –
Methionine (4)      –           –            2                 1                  1           –
Alanine (2)         –           –            2                 –                  –           –
Isoleucine (2)      –           –            –                 –                  2           –
Proline (2)         1           –            –                 –                  1           –
Valine (1)          –           –            –                 1                  –           –
All (615)           172 (28%)   44 (7.2%)    193 (31%)         178 (29%)          60 (9.8%)   17 (2.7%)

No. of enzymes with at least one residue performing this function:
                    106         44           96                104                41          11
Variation in no. of residues performing this function in any one enzyme:
                    1–7         1            1–6               1–6                1–6         1–2

Residue functions are as defined in Table 2. The activate water, activate cofactor and primer categories are grouped into one, as are the radical and modified groups. "All" percentages add up to more than 100 as a residue can have more than one function assigned to it. The range of numbers of residues which can perform each function in any one enzyme using that function is given, e.g. in an enzyme which uses at least one residue as an acid/base, there may be anything from one to seven catalytic residues performing that function in that enzyme.

Quaternary structure/domain usage

Almost all enzymes in our dataset (159) have their active site contained within just one subunit, with only 19 out of 178 enzymes (11%) having catalytic residues in more than one subunit of the enzyme, i.e. the active site is at the interface of two subunits. Of these 19 enzymes, 17 have catalytic residues split between two subunits, while just two have catalytic residues split between three subunits.
In addition, 108 out of 178 enzymes (60%) have more than one domain. Of these, 35 have catalytic residues split across more than one domain. However, as this analysis deals with catalytic residues and not with residues that only bind substrate or ligand, this number is probably an underestimate of enzymes whose active site is found at an interface between domains and/or subunits.

Functions

Table 7 shows catalytic residue function as defined by the classification previously described. Of 615 catalytic residues, roughly equal proportions are involved in stabilising a proposed transition state intermediate, activating a water/cofactor/other residue and acting as an acid/base (31%, 29% and 28%, respectively). Approximately 10% of residues activate the substrate in some way, while 7% of residues form a covalent intermediate with the substrate via nucleophilic attack. A very small number act as radicals or are modified to perform their function.

Just over half of the enzymes in this dataset have at least one residue stabilising a proposed transition state. The actual number of residues performing this function in each enzyme can range from one to six. Typically enzymes will have 1–3 residues acting in this way. Just three enzymes use six transition-state stabilising residues: pentalenene synthase, 2-haloacid dehalogenase and adenylate kinase.
Pentalenene synthase has to stabilise a positively charged carbocation intermediate, as well as a negatively charged pyrophosphate leaving group.28 The mechanism of 2-haloacid dehalogenase involves an ester hydrolysis step that produces a negatively charged oxyanion, which requires an oxyanion hole, as well as a halide-stabilising cradle for the leaving halide ion.29,30 Adenylate kinase uses mainly positively charged residues to stabilise the negatively charged penta-coordinated transition state which occurs during phosphate group transfer.31,32

Approximately half of the enzymes in this dataset have at least one residue acting as an acid or base, while the actual number of residues performing this function ranges from one to seven. Only one enzyme, aconitase, uses seven residues: three different ion pairs to transfer protons to hydroxide for elimination as water, as well as a base to extract a proton from the substrate.33,34 Typically enzymes will use 1–2 acid/base residues.

Almost 60% of enzymes in this dataset use at least one residue to activate a water, cofactor or other residue. Typically enzymes will use 1–3 residues in this way, but there are exceptions which use five or six. High molecular weight acid phosphatase uses four residues to alter the pKa of the nucleophile involved in the reaction and one residue to activate a water molecule for attack on the enzyme–substrate intermediate.35,36 Glutathione reductase uses six residues, four of which are involved in facilitating hydride transfer from NADPH to FAD. The other two are responsible for activating the cysteine nucleophile.21,22

Just over 20% of enzymes have a residue which activates the substrate in some way. The number of residues performing this role in each enzyme can vary from one to six. Typically 1–2 residues will perform this role.
However, dihydrofolate reductase uses six residues in this way to put electrostatic and steric strain on the substrate in order for the reaction to occur.37

A quarter of enzymes in this dataset use a nucleophile, but in each of these enzymes there is only ever one nucleophilic residue. A very small proportion of enzymes in this dataset employ radical formation or residue modification for catalysis. Usually 1–2 residues in each of these enzymes play such a role.

Some residues have typical roles. For instance, the nucleophiles are generally cysteine, serine and occasionally aspartate (but, surprisingly, hardly ever the corresponding similar residues threonine and glutamate; this could be due to increased bulkiness, as these residues have an extra carbon in their side-chain, or possibly to increased flexibility of the side-chain). The major role of arginine is to stabilise the transition state, while the negatively charged residues aspartate and glutamate are typically acid/bases. The major role of histidine is also as an acid/base, which it performs more often than expected on average. The most common function for the hydrophobic group of residues (G, F, L, M, A, I, P, V) is in the stabilisation of a proposed transition state intermediate, usually but not always via the main-chain groups.

Discussion

Caveats

The classification of catalytic residues presented here is dependent on manual extraction of information from the primary literature. The residue selection used is, therefore, only as complete as the literature from which it was extracted. For instance, if oxyanion hole-stabilising residues have not been identified in an enzyme that clearly uses a serine protease-like mechanism, they were not included in the analysis. Information in the literature is, in turn, dependent on the accuracy and reliability of structural data deposited in the PDB, and of the mutagenesis studies from which catalytic residues and mechanisms are inferred.
Not all the dataset has been fully classified in CATH, and although every effort has been made to ensure that the dataset is non-redundant, previously undetected homologies may come to light as those partially classified domains become fully classified. Analysis of catalytic residue conservation could only be performed on those enzyme families whose sequences, when searched with PSI-BLAST, produced a suitably diverse alignment (60% of the dataset).

Conclusions

This work represents a structural and functional analysis of enzyme catalytic residues across a dataset of 178 enzymes, chosen using the strict criteria described herein. Catalytic residue types are limited, with just six residue types (H, C, E, D, R, K) accounting for 70% of all catalytic residues. Surprisingly, serine, usually thought of as a typical example of a catalytic residue type, is not included in this set. Catalytic residues have very limited exposure to solvent (as defined by relative accessibility) despite their polarity. They are very precisely positioned and held in place, as shown by their low B-factors and hydrogen bonding. Nearly all catalytic residues are hydrogen bonded via backbone groups, and three-quarters also via side-chain groups. Some of these hydrogen bonds will be important to maintain the structural integrity of the active site; others will be important in setting up the reaction that the enzyme performs. In spite of their apparent rigidity, catalytic residues are often found in a coil environment, and the vast majority are found in a cleft. Catalytic residues are highly conserved, as is their local three-dimensional environment. Residues in the local three-dimensional environment are more conserved than residues close by in sequence. The most common catalytic residue function is to stabilise a proposed transition state intermediate. Primary catalytic residue functions (acid/base, nucleophile and transition state stabilisation) account for 65% of all catalytic residue functions.
Typically the number of residues performing any one function in an enzyme ranges from one to three, but there are some enzymes which use up to six or seven residues for functions such as transition state stabilisation, acid/base, activating the substrate and activating water, cofactor or other residue. This could be in part due to variation in the number of steps involved in a catalytic mechanism, but could also be due to variation in the number of residues quoted by authors as having a functional role.

Catalytic residues are most commonly found confined to one subunit or within a single domain. However, the fact that almost a third of enzymes in our dataset have their catalytic residues split across several subunits and/or domains has interesting implications for enzyme evolution. If one considers that the basic unit of protein structure is the domain, how did an active site evolve with catalytic residues split across two or more domains or subunits? One hypothesis is that the primitive enzyme had its catalytic machinery on one domain, and catalysed the first step of a reaction. Later steps may have involved a process that occurred naturally over time (e.g. hydrolysis). Then, by chance, the enzyme may have evolved another domain with residues that were well placed to speed up these processes, or to stabilise intermediates, so that the enzyme had an optimised turnover and a selective advantage. Another possible explanation is that convergent evolution of two different functional elements on two distinct domains occurred to form an enzyme with an adapted function.

These results will provide a heuristic basis for predicting catalytic residues in enzymes of unknown function and hopefully facilitate a better understanding of enzyme mechanisms.

Acknowledgements

G.J.B. is funded by a BBSRC CASE studentship in association with Roche Products Ltd.
We thank Annabel Todd and Stuart Rison for helpful discussion.

References

1. Walsh, C. (2001). Enabling the chemistry of life. Nature, 409, 226–231.
2. Blow, D., Birktoft, J. & Hartley, B. (1969). Role of a buried acid group in the mechanism of action of chymotrypsin. Nature, 221, 337–340.
3. Wright, C., Alden, R. & Kraut, J. (1969). Structure of subtilisin BPN′ at 2.5 angstrom resolution. Nature, 221, 235–242.
4. Wallace, A., Laskowski, R. & Thornton, J. (1996). Derivation of 3D coordinate templates for searching structural databases: application to Ser-His-Asp catalytic triads in the serine proteinases and lipases. Protein Sci. 5, 1001–1013.
5. Zvelebil, M. & Sternberg, M. (1988). Analysis and prediction of the location of catalytic residues in enzymes. Protein Eng. 2, 127–138.
6. Bernstein, F., Koetzle, T., Williams, G., Meyer, E. E. J., Brice, M., Rodgers, J. et al. (1977). The Protein Data Bank: a computer-based archival file for macromolecular structures. J. Mol. Biol. 112, 535–542.
7. Webb, E. (1992). Enzyme Nomenclature. Recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology. Academic Press, New York.
8. Orengo, C., Michie, A., Jones, S., Jones, D., Swindells, M. & Thornton, J. (1997). CATH – a hierarchic classification of protein domain structures. Structure, 5, 1093–1108.
9. Orengo, C., Pearl, F., Bray, J., Todd, A., Martin, A., Lo Conte, L. & Thornton, J. (1999). The CATH database provides insights into protein structure/function relationships. Nucl. Acids Res. 27, 275–279.
10. Bullock, T., Branchaud, B. & Remington, S. (1994). Structure of the complex of L-benzylsuccinate with wheat serine carboxypeptidase II at 2.0 Å resolution. Biochemistry, 33, 11127–11134.
11. Henrick, K. & Thornton, J. (1998). PQS: a protein quaternary structure file server. Trends Biochem. Sci. 23, 358–361.
12. Laskowski, R. (1995). SURFNET: a program for visualizing molecular surfaces, cavities, and intermolecular interactions. J. Mol. Graph. 13, 323–330. See also pp. 307–308.
13. Altschul, S., Madden, T., Schaffer, A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl. Acids Res. 25, 3389–3402.
14. Valdar, W. & Thornton, J. (2001). Conservation helps to identify biologically relevant crystal contacts. J. Mol. Biol. 313, 399–416.
15. Dayhoff, M. O., Schwartz, R. M. & Orcutt, B. C. (1978). A model of evolutionary change in proteins: matrices for detecting distant relationships. In Atlas of Protein Sequence and Structure, vol. 5, pp. 345–358, National Biomedical Research Foundation, Washington, DC.
16. Shannon, C. E. (1948). A mathematical theory of communication. Bell Sys. Tech. J. 27, 379–423. See also pp. 623–656.
17. McDonald, I. & Thornton, J. (1994). Satisfying hydrogen bonding potential in proteins. J. Mol. Biol. 238, 777–793.
18. Hutchinson, E. & Thornton, J. (1996). PROMOTIF – a program to identify and analyze structural motifs in proteins. Protein Sci. 5, 212–220.
19. Martin, A., Orengo, C., Hutchinson, E., Jones, S., Karmirantzou, M., Laskowski, R. et al. (1998). Protein folds and functions. Structure, 6, 875–884.
20. Hegyi, H. & Gerstein, M. (1999). The relationship between protein structure and function: a comprehensive survey with application to the yeast genome. J. Mol. Biol. 288, 147–164.
21. Pai, E. F. & Schulz, G. E. (1983). The catalytic mechanism of glutathione reductase as derived from X-ray diffraction analyses of reaction intermediates. J. Biol. Chem. 258, 1752–1757.
22. Karplus, P. A. & Schulz, G. E. (1989). Substrate binding and catalysis by glutathione reductase as derived from refined enzyme: substrate crystal structures at 2 Å resolution. J. Mol. Biol. 210, 163–180.
23. Silverman, R. (2000). The Organic Chemistry of Enzyme-catalysed Reactions, Academic Press, New York.
24. Scott, D., Otwinowski, Z., Gelb, M. & Sigler, P. (1990). Crystal structure of bee venom phospholipase A2 in a complex with a transition-state analogue. Science, 250, 1563–1566.
25. Yan, B. & Sun, Y. (1997). Glycine residues provide flexibility for enzyme active sites. J. Biol. Chem. 272, 3190–3194.
26. Kallenbach, N. (2001). Breaking open a protein barrel. Proc. Natl. Acad. Sci. USA, 98, 2958–2960.
27. Fersht, A. (1998). Structure and Mechanism in Protein Science, Freeman, San Francisco, CA.
28. Lesburg, C. A., Zhai, G., Cane, D. E. & Christianson, D. W. (1997). Crystal structure of pentalenene synthase: mechanistic insights on terpenoid cyclization reactions in biology. Science, 277, 1820–1824.
29. Li, Y. F., Hata, Y., Fujii, T., Hisano, T., Nishihara, M., Kurihara, T. & Esaki, N. (1998). Crystal structures of reaction intermediates of 2-haloacid dehalogenase and implications for the reaction mechanism. J. Biol. Chem. 273, 15035–15044.
30. Ridder, I. S., Rozeboom, H. J., Kalk, K. H. & Dijkstra, B. W. (1999). Crystal structures of intermediates in the dehalogenation of haloalkanoates by 2-haloacid dehalogenase. J. Biol. Chem. 274, 30672–30678.
31. Yan, H. G. & Tsai, M. D. (1991). Mechanism of adenylate kinase. Demonstration of a functional relationship between Aspartate 93 and Mg2+ by site-directed mutagenesis and proton, phosphorus-31, and magnesium-25 NMR. Biochemistry, 30, 5539–5546.
32. Muller, C. W. & Schulz, G. E. (1992). Structure of the complex between adenylate kinase from Escherichia coli and the inhibitor Ap5A refined at 1.9 Å resolution. Model for a catalytic transition state. J. Mol. Biol. 224, 159–177.
33. Zheng, L., Kennedy, M. C., Beinert, H. & Zalkin, H. (1992). Mutational analysis of active site residues in pig heart aconitase. J. Biol. Chem. 267, 7895–7903.
34. Lauble, H., Kennedy, M. C., Beinert, H. & Stout, C. D. (1992). Crystal structures of aconitase with isocitrate and nitroisocitrate bound. Biochemistry, 31, 2735–2748.
35. Lindqvist, Y., Schneider, G. & Vihko, P. (1994). Crystal structures of rat acid phosphatase complexed with the transition-state analogs vanadate and molybdate. Implications for the reaction mechanism. Eur. J. Biochem. 221, 139–142.
36. Zhang, M., Zhou, M., Van Etten, R. L. & Stauffacher, C. V. (1997). Crystal structure of bovine low molecular weight phosphotyrosyl phosphatase complexed with the transition state analog vanadate. Biochemistry, 36, 15–23.
37. Bystroff, C., Oatley, S. & Kraut, J. (1990). Crystal structures of Escherichia coli dihydrofolate reductase: the NADP+ holoenzyme and the folate·NADP+ ternary complex. Substrate binding and a model for the transition state. Biochemistry, 29, 3263–3277.
38. Toth, E. A. & Yeates, T. O. (2000). The structure of adenylosuccinate lyase, an enzyme with dual activity in the de novo purine biosynthetic pathway. Struct. Fold Des. 8, 163–174.
39. Todd, A., Orengo, C. & Thornton, J. (2001). Evolution of function in protein superfamilies, from a structural perspective. J. Mol. Biol. 307, 1113–1143.
40. Bairoch, A. (1993). The ENZYME data bank. Nucl. Acids Res. 21, 3155–3156.
41. Rawlings, N. & Barrett, A. (1995). Evolutionary families of metallopeptidases. Methods Enzymol. 248, 183–228.
42. Gomis-Ruth, F. & Stockler, W. (1993). Astacins, serralysins, snake venom and matrix metalloproteinases exhibit identical zinc-binding environments (HEXXHXXGXXH and Met-turn) and topologies and should be grouped into a common family, the metzincins. FEBS Letters, 331, 134–140.
43. Sugio, S., Kashima, A., Kishimoto, K., Peisach, D., Petsko, G., Ringe, D. et al. (1998). Crystal structures of L201A mutant of D-amino acid aminotransferase at 2.0 Å resolution: implication of the structural role of Leu201 in transamination. Protein Eng. 11, 613–619.
44. Plaga, W., Vielhaber, G., Wallach, J. & Knappe, J. (2000). Modification of Cys-418 of pyruvate formate-lyase by methacrylic acid, based on its radical mechanism. FEBS Letters, 466, 45–48.
45. Becker, A., Fritz-Wolf, K., Kabsch, W., Knappe, J., Schultz, S. & Volker Wagner, A. (1999). Structure and mechanism of the glycyl radical enzyme pyruvate formate-lyase. Nature Struct. Biol. 6, 969–975.
46. Ridder, I., Rozeboom, H., Kalk, K. & Dijkstra, B. (1999). Crystal structures of intermediates in the dehalogenation of haloalkanoates by L-2-haloacid dehalogenase. J. Biol. Chem. 274, 30672–30678.
47. Lesburg, C., Zhai, G., Cane, D. & Christianson, D. (1997). Crystal structure of pentalenene synthase: mechanistic insights on terpenoid cyclization reactions in biology. Science, 277, 1820–1824.
48. Czerwinski, R., Harris, T., Massiah, M., Mildvan, A. & Whitman, C. (2001). The structural basis for the perturbed pKa of the catalytic base in 4-oxalocrotonate tautomerase: kinetic and structural effects of mutations of Phe-50. Biochemistry, 40, 1984–1995.

Edited by M. Levitt

(Received 13 March 2002; received in revised form 26 July 2002; accepted 10 August 2002)