Evolution of protein fold in the presence of functional constraints
Antonina Andreeva and Alexey G Murzin

The functional requirement to form and maintain the active site structure probably exerts a strong selective pressure on a protein to adopt just one stable and evolutionarily conserved fold. Nonetheless, new evidence suggests that protein fold is likely to be neither physically nor biologically invariant. Alternative folds discovered in several proteins are composed of constant and variable parts. The latter display context-dependent conformations and a tendency to form new oligomeric interfaces. In turn, oligomerisation mediates fold evolution without loss of protein function. Gene duplication breaks down homo-oligomeric symmetry and relieves the pressure to maintain the local architecture of redundant active sites; this can lead to further structural changes.

Addresses
MRC Centre for Protein Engineering, Hills Road, Cambridge CB2 2QH, UK
Corresponding author: Murzin, Alexey G ([email protected])

Current Opinion in Structural Biology 2006, 16:399–408
This review comes from a themed issue on Sequences and topology
Edited by Nick V Grishin and Sarah A Teichmann
Available online 2nd May 2006
0959-440X/$ – see front matter © 2006 Elsevier Ltd. All rights reserved.
DOI 10.1016/j.sbi.2006.04.003

Introduction
“No new principle will declare itself from below a heap of facts”
Peter Medawar

An understanding of natural history at the molecular level derives from knowledge of probable evolutionary relationships, as deduced from sequence, structural and functional similarities among known genes and their products. The functional requirement to form and maintain a definite active site structure probably exerts a strong selective pressure on a protein to adopt just one stable and conserved fold. Therefore, protein structures are generally regarded as ‘fossil records’ of molecular evolution. Systematic analysis of accumulated structural data has provided many insights into the evolution of protein structure and function [1]. Yet our knowledge of the field has remained a collection of miscellaneous ‘facts of protein fold evolution’ that could fit very different theoretical scenarios [2–4], as evolutionary considerations thus far played little if any role in the selection of proteins for structure determination. The combination of structural genomics and targeted structural studies promises to fill the gaps in this knowledge. It has already made an impact on the growth rate and composition of structural data, with a significant increase in the number of protein families of known structure [5–7]. For this review, we have elected to discuss those families whose non-trivial structural relationships have enabled new advances in our understanding of how a protein fold can change without compromising the integrity of the functional site structure.

Shifting paradigm of protein fold
A protein fold is a simplified representation of protein structure that was originally intended to be invariant to possible conformational changes. The potential flexibility of protein loops and other peripheral regions, and their structural variability in homologous proteins, were recognised a long time ago. Therefore, the fold of a protein was defined by the composition, architecture and topology of its core secondary structure elements (i.e. α helices and/or β strands). The discovery of chameleon sequences that can adopt alternative secondary structures in the same protein, thus affecting its composition, has already shaken the presumed invariance of protein fold [8,9]. Recent structures revealed even more remarkable examples of large-scale fold variations, altering the protein architecture and topology. These examples provide new insights into protein fold evolution and may be of value in the quest for an evolving paradigm of changeable fold.

Chameleon sequences and alternative folds
The spindle assembly checkpoint protein Mad2 is one of the best-studied proteins shown to adopt two distinct folded conformations in native conditions. Previously, they were thought to represent two different functional states: one is the free protein, whereas the other is the ligand-bound form [10]. Recently, both conformations were shown to be at equilibrium in ligand-free Mad2 [11]. Comparison of the two alternative conformations revealed a compact common core comprising about 70% of the Mad2 protein chain. The rest of the chain undergoes a major conformational change involving the refolding and translocation of the C-terminal β-structure from one end of the main β-sheet to the other end. In addition, there are flexible regions, the sequence locations of which differ between the conformations. The conformational flexibility of the Mad2 C terminus appears to have a functional role. Its hinged mobile elements wrap around the elongated ligands like molecular ‘safety belts’ [12].

A new chameleon motif has been found in the AXH domain, a common region of the ataxin-1 and HBP1 proteins implicated in binding RNA. The structures of the AXH domains of both proteins have been published recently [13,14]. The ataxin-1 domain forms a dimer in the crystals, with the two subunits adopting similar but non-identical conformations. The most different parts are the N-terminal tails, which are found at the subunit interface and interact complementarily along the pseudo-twofold axis. This observation of a dimeric interface formed by two mutually adapting yet different chameleon motifs is novel. It is thought that this structural adaptability is essential in maintaining the AXH dimer [13].
The solution structure of the monomeric HBP1 AXH domain has revealed an even larger conformational variability in the N-terminal region than is seen in the ataxin-1 domain structures. This region retains similar secondary structure, but otherwise adopts a very different conformation and occupies a different position relative to the C-terminal common core. It should be noted here that the relationship between the monomeric HBP1 and dimeric ataxin-1 AXH domains is unlike the domain-swapping relationships discussed below.

Structural comparison of the cyanobacterial circadian clock proteins SasA and KaiB underlines a possible role for chameleon sequences in protein fold evolution. SasA is a histidine kinase that contains an additional N-terminal domain, N-SasA, whose sequence is very similar to the entire KaiB sequence (Figure 1). Both N-SasA and KaiB interact with the KaiC component of the same molecular clock. The isolated N-SasA domain is a monomer in solution and adopts a thioredoxin-like fold [15]. Surprisingly, despite its significant similarity to N-SasA and detectable similarity to other thioredoxin-like structures, KaiB adopts a quite different fold [16,17]. There is a good correlation between sequence and structure for the N-terminal halves of both proteins (which comprise a common βαβ unit), whereas the C-terminal portions of the KaiB and N-SasA folds differ substantially in their secondary structure and topology. Instead of the C-terminal ββα motif of N-SasA, KaiB has an ααβ motif (Figure 1). In the KaiB crystals, the C-terminal motif mediates the formation of a tetramer thought to be the KaiB biological unit [17,18]. The apparent homology between N-SasA and KaiB, and their probable remote homology to thioredoxins, suggest the hypothesis that their common ancestor is a thioredoxin-like protein that evolved a chameleon sequence.
Several critical amino acid substitutions in the KaiB lineage probably tipped the balance toward the formation of different local structures and facilitated the acquisition of a new protein function.

Figure 1
Comparison of (a) the sequences and (b) the structures of the SasA N-terminal domain (left, PDB code 1t4y) and the KaiB subunit (right, PDB code 1r5p). FASTA alignment of the two proteins is shown with identical residues highlighted and structurally equivalent segments boxed. Structures are rainbow coloured (blue→red) from the N to C termini, so aligned segments of the two proteins are the same colour. The structural cartoons in all figures were produced with PyMOL (http://www.pymol.org) [48].

Variable structural parts and oligomerisation
The above examples show that the entire protein fold, including every secondary structure element, is not necessarily invariant, but there may be a constant part. The remaining structure may vary depending on both the constant part and the external conditions (binding ligands, oligomerisation state and so on). The involvement of variable structures in the oligomerisation of the AXH domain, KaiB and, probably, Mad2 may reflect a more general phenomenon.

A remarkable yet overlooked example of this phenomenon was revealed by crystal structures of the oxygen-dependent coproporphyrinogen oxidase (CPO) from Saccharomyces cerevisiae (Hem13p) [19]. Its protein fold features a large central antiparallel sheet that is flanked by helices (Figure 2). The C-terminal, mainly helical segment contributes to the active site and forms the dimerisation interface, which is observed in two different crystal forms capturing the open and closed conformations of the active site cleft. A third crystal form contains a very different dimer. More than 30% of the subunit structure, including almost the entire C-terminal segment, is disordered and displaced to create an alternative, more intimate dimerisation interface. A very similar dimeric structure has been observed independently in a similar but distinct crystal form (PDB code 1txn), suggesting that this dimer, which incidentally produces better-diffracting crystals, is not a crystallisation artefact. This raises the hypothesis that CPO from other organisms may also form similar dimers, which would offer an alternative explanation for the deleterious effect of many of the mutations associated with the disease hereditary coproporphyria [19,20].

Figure 2
Two different dimerisation modes of yeast CPO. (a) Probable biological unit in the closed form and (b) the self-inhibited ‘tight’ dimer. One rainbow-coloured subunit is shown in approximately the same orientation in both dimers.

Non-homologous recombination
The proposed division of protein fold into constant and variable parts, and the probable role of the latter in the formation of new oligomeric interfaces, gain further support from the structure of the chimeric 1B11 protein obtained by non-homologous recombination [21,22]. This protein is composed of two subdomain-size segments, one from the cold shock protein CspA and the other from ribosomal protein S1 (Figure 3). Incidentally, these two proteins are distantly related, sharing the nucleic-acid-binding OB-fold. Remarkably, the CspA and S1 segments correspond to practically equivalent, rather than complementary, parts of this common fold, but there is no fold similarity between the ‘parent’ fragments in the 1B11 structure, which comprises a tetramer made of two segment-swapped dimers. The six-stranded β-barrel of the 1B11 compact core resembles the ‘parent’ five-stranded OB-fold barrel. The structure of the CspA segment is retained and superimposes well on the equivalent region of the CspA original structure, whereas the S1 secondary structures are rearranged to complete the β-barrel structure (Figure 3). Thus, the structure of the CspA segment can be seen as a template around which the rest of the protein folds. Besides the completion of the barrel fold, the S1 segment forms a compact structure composed of four identical chains interlocking around a new hydrophobic core in the centre of the tetramer.

Figure 3
Structures of the combinatorial 1B11 protein and its ‘parents’. (a) Cold shock protein CspA and (b) a domain of the ribosomal S1 protein share the common OB-fold. (c) One 1B11 subunit is composed of the N-terminal fragment of CspA (yellow) and the equivalent region of the S1 domain (blue); in the other subunit of the segment-swapped dimer, these regions are shown in orange and cyan, respectively. The second swapped dimer that completes the 1B11 tetramer is not shown.

The successful in vitro demonstration of the non-homologous recombination of subdomain-size segments to yield a stable protein of novel fold has enormous implications for our understanding of protein evolution. In theory, such recombination can happen readily on the DNA level, but it seemed unlikely to result in a viable protein. The ability of one segment to act as a template for the folding of the rest of the resulting polypeptide would increase the likelihood of successful non-homologous recombination in vivo, particularly if this segment contains all essential functional determinants of a parent protein. In principle, many proteins that contain subdomains with similar structures and functions, such as the DNA-binding HTH motif or RNA-binding KH and S4 (or αL) motifs, but otherwise display different protein folds, may have evolved by non-homologous recombination of these motifs with other protein segments.
However, it remains to be demonstrated that non-homologous recombination involving an intact subdomain can yield a functional protein. The recombinant 1B11 protein probably does not retain the nucleic-acid-binding activity of the parent proteins, as the putative RNA-binding site of CspA is buried in the tetramer.

Evolution of structural complexity
The remarkable ‘spontaneous’ oligomerisation of 1B11 appears to be a common feature of other combinatorial proteins selected by proteolytic stability [22]. This challenges the conventional paradigm of the evolution of oligomeric complexes from monomeric proteins [23]. Indeed, recent evidence suggests that the first oligomers were produced before genetics, as we know it, was at work.

Oligomeric biological units
There are examples of oligomeric proteins whose structures almost certainly predated their functions. Such oligomers are composed of interlocking monomers that are not compact by themselves, and cannot be rearranged into compact and contiguous units by exchanging the equivalent segments between different monomers. The unanticipated possibility of such interlocked oligomers presents a certain problem for their structure determination in solution; this has already resulted in a few known and probable structural artefacts [24,25].

There are other proteins whose oligomeric structures most likely preceded their functions. Composed of more compact monomers, they contain multiple equivalent active sites at the subunit interfaces. For example, in tetrameric flavin-dependent thymidylate synthase (ThyX), each of its four equivalent active sites is formed by three different subunits [26,27]. As each subunit contributes to three different active sites, its functionally essential residues are distributed over a large area of the subunit surface and therefore are very unlikely to have evolved in a functionless ThyX monomer. These ‘ancient’ oligomeric biological units are common and frequently observed in new protein structures.
Importantly, in these oligomers, segment swapping and gene duplication can occur without changing the overall and active site architectures, as illustrated by three structures from the AhpD-like superfamily (Figure 4). The alkylhydroperoxidase AhpD is a metabolic enzyme linked to antioxidant defence in mycobacteria [28,29]. It has a thioredoxin-like active site, but an unrelated all-α fold. AhpD monomers are composed of two structural repeats and assemble into a symmetrical homotrimer. The AhpD oligomeric architecture is strikingly similar to the hexameric architecture of TM1620 protein from the carboxymuconolactone decarboxylase (CMD) family (PDB codes 1p8c, 1vke), which also shows good sequence and structural similarity to AhpD in the active site region. The TM1620 subunit adopts a segment-swapped variant of the fold of one AhpD repeat. Surprisingly, the pair of TM1620 monomers that swap helices is not the same as the pair that corresponds to one AhpD monomer. More recently, the structure of another member of the CMD family, TTHA0727 [30], provided a halfway house on the route between the TM1620 and AhpD structures. TTHA0727 is a hexamer, like TM1620, but its subunit fold resembles one AhpD repeat without any segment swapping.

Figure 4
Segment swapping and gene duplication in the oligomeric biological unit. The architectures of the CMD family hexameric proteins (a) TM1620 and (b) TTHA0727, and (c) trimeric alkylhydroperoxidase AhpD. Helices that are common to all three proteins are shown in colour and grey; other helices are white. The structural repeats of one AhpD subunit are shown in cyan and blue, the corresponding TTHA0727 subunits are in yellow and green, and their TM1620 counterparts are in violet and magenta. The ‘swapped’ helical segment in a third TM1620 subunit is shown in purple.

Symmetry and duplication
Homo-oligomeric biological units usually contain multiple equivalent active sites that can be of very complex architecture. With one active site per monomer, they generally have higher ratios of active sites per residue than monomeric enzymes. This organisation may prove advantageous, provided there is a selective pressure to limit the amount of DNA in the cell. On the other hand, the oligomer symmetry imposes constraints on evolutionary changes of its structure that can hinder further optimisation and expansion of its functions.

The higher exact symmetry of homo-oligomers can be, and frequently is, broken by duplication, as seen in related structures with the same or similar function. For example, the recent structure of the TusBCD complex, a mediator of thiouridine modification of tRNA, revealed a heterohexamer of homologous but non-identical subunits related to the homohexameric YchN protein, which is implicated in sulfur metabolism [31,32]. The original D3 symmetry of the YchN hexamer is reduced in the TusBCD complex, which retains just one exact twofold symmetry axis. Of the six equivalent putative active sites in YchN, only two remain functional in TusBCD. The loss of redundant sites releases additional surface area available for interaction with other components of the sulfur relay system.

Usually, gene duplication is followed by in-frame fusion, resulting in a single-chain multidomain protein that retains many structural features of the original oligomeric unit. Generally, all but one of the active sites are lost after the duplication and fusion events. In theory, new linkers connecting the termini of former monomers into a single chain can obscure some of the original sites, thus influencing the selection of the location of the remaining active site, but there is also some evidence of random selection [33].
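
The reduction from D3 symmetry to a single twofold axis described above for YchN and TusBCD can be illustrated with a small group-theory calculation. The following is a minimal sketch and is not taken from the structures themselves: it assumes a hypothetical labelling of the six subunit positions of a D3 hexamer as two stacked rings (0, 1, 2 on top and 3, 4, 5 below), represents the symmetry operations as permutations of these positions, and counts how many classes of symmetry-equivalent positions exist under the full group and under one retained twofold axis. All names (`R3`, `S2`, `generate_group`, `orbits`) are illustrative.

```python
# Assumed labelling: top ring = positions 0,1,2; bottom ring = 3,4,5.
# A permutation is a tuple p where p[i] is the image of position i.

# Threefold rotation: advances each ring by one position.
R3 = (1, 2, 0, 4, 5, 3)
# One perpendicular twofold rotation: swaps the rings and reverses their sense.
S2 = (3, 5, 4, 0, 2, 1)


def compose(p, q):
    """Return the permutation that applies q first and then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate_group(generators):
    """Close a list of permutations under composition (finite group)."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            new = compose(h, g)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group


def orbits(group, n):
    """Partition positions 0..n-1 into classes related by the group."""
    seen, result = set(), []
    for start in range(n):
        if start in seen:
            continue
        orbit = {g[start] for g in group}
        seen |= orbit
        result.append(sorted(orbit))
    return result


if __name__ == "__main__":
    d3 = generate_group([R3, S2])   # full symmetry of the homohexamer
    c2 = generate_group([S2])       # only one twofold axis retained
    print(len(d3), orbits(d3, 6))
    print(len(c2), orbits(c2, 6))
```

Under the full group all six positions form a single class, so every site is constrained to be identical. With only one twofold axis retained, the positions split into three independent pairs, so one pair can stay functional while the others diverge. This is consistent with the two functional sites reported for TusBCD, but the sketch says nothing about which pair is retained or why.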

The chorismate mutase domain of Escherichia coli P-protein (EcCM) forms an interlocked homodimer that contains two equivalent active sites at the subunit interface. The structure of the yeast chorismate mutase (ScCM) subunit is similar to the dimeric assembly of EcCM, suggesting a probable duplication and fusion event in the chorismate mutase family. The structure of secreted chorismate mutase from Mycobacterium tuberculosis (*MtCM) probably resulted from an independent similar event. Both *MtCM and ScCM retain only one of the two active sites of their probable EcCM-like precursors, but they lost different sites.

The loss of redundant active sites upon duplication relieves the functional pressure to maintain their local architecture and can lead to further structural changes. These changes can be both subtle and extensive, as evidenced by the structure of monomeric dUTPase from Epstein–Barr virus and its relationship with the trimeric dUTPase structure [34]. In the trimeric dUTPase, the C-terminal tail of one subunit completes the active site at the interface of the two other subunits (Figure 5). In the monomeric dUTPase, there is virtually the same active site between two globular domains, also completed by the C-terminal tail. The two domains retain some similarity to the subunit structure of the trimeric enzyme, but they deviate from it in different ways and therefore are less similar to each other. There is a long but compact linker between the second domain and the C-terminal tail, a possible remnant of the third domain, which decayed almost completely. Analogous structural changes are proposed for the evolution of dimeric all-α dUTPase from the ancestral tetrameric NTP pyrophosphatases of the MazG-like superfamily [35].

Figure 5
Transition from the oligomeric biological unit to a monomeric multidomain enzyme. Cartoons of (a) the trimeric dUTPase structure, showing monomers in different colours, and (b) monomeric dUTPase in similar colours, highlighting the relationship between its ‘domains’ and subunits of the trimeric enzyme. dUTP molecules (space-fill) indicate the location of the active sites in both proteins.

Transient oligomers
There are many multidomain structures composed of segment-swapped structural repeats, suggesting that they may have evolved from monomeric single-domain proteins via transient oligomers. In theory, the segment boundaries can be selected so that, in the swapped oligomer, the active sites of the original monomers will be combined in a larger symmetrical site. This new symmetry subsequently can be utilised by evolution to bind complex molecules that display the same symmetry (or to stabilise the symmetrical transition states). Extant oligomers with combined symmetrical active sites are extremely rare. One recent example is the structure of the putative syrohydrochlorin cobaltochelatase CbiX (PDB code 1tjn), the swapped dimer of which bears similarities to one monomer of the cobalt chelatase CbiK [36].

The transient existence of such oligomers is supported by a newly discovered relationship between the sirtuin (Sir2) family of deacetylases, which catalyse NAD-dependent deacetylation of modified lysine residues in histones and other proteins [37], and the molybdenum-cofactor-containing enzymes of the DMSO reductase/formate dehydrogenase family [38]. The molybdenum cofactor (Mo-co) consists of two molecules of molybdopterin guanosine dinucleotide (MGD) [38] and binds at the interface of two structurally similar domains. The MGD-binding domains are related by a circular permutation and arranged about the pseudo-twofold symmetry axis, coinciding with the twofold symmetry axis of Mo-co (Figure 6).
They show previously unreported structural similarities to the sirtuin NAD-binding domain, extending to the architectures of the cofactor-binding sites and the modes of recognition of the GDP moieties of MGD and the ADP moiety of NAD (A Andreeva, unpublished). This suggests that these dinucleotide-binding domains probably have evolved from a common ancestor. A putative evolutionary pathway from an ancestral sirtuin-domain-like monomer to the molybdenum-cofactor-binding domain includes the formation of a segment-swapped dimer followed by a gene duplication and fusion event. There are many other proteins of analogous domain architectures that bind no symmetrical ligands in their ‘combined’ active sites. These proteins also may have evolved from pre-existing functional monomers by a similar mechanism, using duplication and fusion events as a necessary step to break down the redundant symmetry of the hypothetical transient oligomers.

Figure 6
Structural and functional relationship between the cofactor-binding domains of the DMSO reductase/formate dehydrogenase and sirtuin families. (a) Structure of the Mo-co domain of dissimilatory nitrate reductase (PDB code 2nap). The sphere represents the molybdenum atom, which is bound between two MGD molecules, shown in stick representation. (b) Structure of archaeal sirtuin AF1676 with bound NAD (PDB code 1ici). Two sets of structurally similar segments are coloured in similar hues: one in cyan and light and dark blue, the other in yellow and light and dark orange. (c) Schematic showing the sequential order of these segments in both structures (top, 2nap; bottom, 1ici). (d) Superimposition of the cofactor-binding sites of the MGD-binding domains (cyan and yellow) and NAD-binding domain (pink).

Evolution of structure in 4D
We have discussed the possible changes of protein structure, caused by domain (segment) swapping, duplication, deletion (of the redundant active sites and supporting structures) and decoration (with additional structures), that can occur without affecting the integrity of the remaining active site(s). A series of such ‘D-events’ in the evolution of a protein family may produce members with very dissimilar folds. Yet their evolutionary relationship can be traced step-by-step through extant intermediate structures, where these are available.

We illustrate this by a probable scenario for fold evolution in the phosphogluconate dehydrogenase (PGDH)-like suprafamily of oxidoreductases (Figure 7). Members of this family contain two different structural domains: an N-terminal α/β domain and a C-terminal all-α domain. There is a family-specific extension of the β-sheet of the N-terminal domain that distinguishes it from the related NAD(P)-binding Rossmann fold domains of other oxidoreductase families. In contrast, the C-terminal domain structures often appear dissimilar, most notably in the larger structures of the founding member PGDH [39], class II ketol-acid reductoisomerase (KARI) [40] and mannitol dehydrogenase [41]. The PGDH C-terminal domain consists of two structural repeats packed side-by-side. The class II KARI C-terminal domain contains two repeats of a different fold, intertwined into a ‘figure eight’ knot [42]. The C-terminal domain of mannitol dehydrogenase shows limited similarity to the PGDH and class II KARI C-terminal domains, and contains no structural repeats or unusual topological features.

Figure 7
Structural evolution of the PGDH-like oxidoreductases. The conserved N-terminal domains are shown in blue in all structures. In the top row, dimeric structures of UDPGDH (left) and GDPMDH (middle) are shown with the dimerisation domains of different monomers in green and pink; the extra C-terminal domains are removed for clarity. The structure of a GPD monomer is shown on the right; the colouring of its C-terminal domain highlights its similarity to a compact half of the GDPMDH dimer. In the bottom row, the subunit structures of PGDH (left) and class II KARI (middle) are shown; the internal structural repeats that correspond to parts of different subunits of the above dimers are shown in the same colours. The structure of mannitol dehydrogenase (right) is coloured by similarity to GPD, with additional helices shown in grey. Major D-events (see text) are shown with block arrows. Dup/Del designates a series of gene duplication and fusion events, followed by deletions of redundant domains.

In several other family members, the C-terminal domain provides the dimerisation interface and contributes to the active site formed by the coenzyme-binding domain of the symmetry-related subunit. The structures of these members can be organised in two different groups related by domain swapping within the dimeric biological unit: the UDP-glucose dehydrogenase (UDPGDH) group [43] and the GDP-mannose dehydrogenase (GDPMDH) group [44]. The ‘archetypal’ members of these groups (i.e. UDPGDH and GDPMDH) are closely related. They have similar sequences and extra C-terminal domains. The UDPGDH group includes hydroxyisobutyrate dehydrogenase [45], which has extensive sequence similarity to PGDH. The structure of one PGDH subunit closely resembles the structure of the hydroxyisobutyrate dehydrogenase dimer without one coenzyme-binding domain and most probably evolved from a hydroxyisobutyrate-dehydrogenase-like precursor by gene duplication and domain deletion events.
Another series of gene duplication and domain deletion events relates the subunit structure of the class II KARI to the structure of the class I KARI dimer [46] (from the GDPMDH group), decorated with additional helices. Finally, the GDPMDH group has a dimer-to-monomer swapping relationship with a third group of family members, represented by glycerol-3-phosphate dehydrogenase (GPD) [47]. One half of the GDPMDH dimeric fold, composed of complementary parts of different subunits enclosing one active site, is similar to the GPD subunit fold. The mannitol dehydrogenase structure belongs to the GPD group and its common fold is further decorated with additional secondary structures. What lies ahead The selection of newly observed structural relationships for discussion in this review benefited from an apparent redundancy of structural genomics efforts aimed at the determination of a representative structure for each protein family. So far, competition between structural genomic centres and independent structural biology groups has resulted in more than one structure having been determined independently for almost every structurally characterised family and, in a few cases, for the same protein. Hopefully, this promising trend will persist. As structural data continue to grow, one can expect to find more and more examples of protein families that display significant fold variations. There is the possibility of ‘accidental’ discoveries of unknown proteins with alternative stable folds. There is an even more intriguing possibility of finding alternative folds in known proteins. Such discoveries would help to trigger systematic Current Opinion in Structural Biology 2006, 16:399–408 research into the (in)variability of already known protein folds. References and recommended reading Papers of particular interest, published within the annual period of review, have been highlighted as: of special interest of outstanding interest 1. 
Murzin AG: How far divergent evolution goes in proteins. Curr Opin Struct Biol 1998, 8:380-387. 2. Grishin NV: Fold change in evolution of protein structures. J Struct Biol 2001, 134:167-185. 3. James LC, Tawfik DS: Conformational diversity and protein evolution – a 60-year-old hypothesis revisited. Trends Biochem Sci 2003, 28:361-368. 4. Friedberg I, Godzik A: Connecting the protein structure universe by using sparse recurring fragments. Structure 2005, 13:1213-1224. 5. Chandonia JM, Brenner SE: The impact of structural genomics: expectations and outcomes. Science 2006, 311:347-351. An insider review of the completion of phase one of the Protein Structure Initiative. 6. Wlodawer A: Giving credit where credit is due. Nat Struct Mol Biol 2005, 12:634. By citing this correspondence [6,7], we wish to express our support for the proposal to assign a document object identifier (DOI) to each PDB entry, so that structures deposited in the PDB can be referenced and accessed in the same manner as other electronic publications. Meanwhile, we list here the PDB identifiers of all currently unpublished structures discussed in this review: 1txn, 1p8c, 1vke, 1tjn. 7. Berman HM: Giving credit where credit is due – reply. Nat Struct Mol Biol 2005, 12:634. See annotation to [6]. 8. Minor DL Jr, Kim PS: Context-dependent secondary structure formation of a designed protein sequence. Nature 1996, 380:730-734. 9. Tidow H, Lauber T, Vitzithum K, Sommerhoff CP, Rosch P, Marx UC: The solution structure of a chimeric LEKTI domain reveals a chameleon sequence. Biochemistry 2004, 43:11238-11247. 10. Luo X, Tang Z, Rizo J, Yu H: The Mad2 spindle checkpoint protein undergoes similar major conformational changes upon binding to either Mad1 or Cdc20. Mol Cell 2002, 9:59-71. 11. Luo X, Tang Z, Xia G, Wassmann K, Matsumoto T, Rizo J, Yu H: The Mad2 spindle checkpoint protein has two distinct natively folded states. Nat Struct Mol Biol 2004, 11:338-345. 
   Mad2 is a relatively small, single-domain protein that shows some prion-like properties. A transiently formed heterodimer of the N1-Mad2 and N2-Mad2 conformers is converted into the wild-type N2-Mad2 homodimer. The N2-Mad2 solution structure has been determined using a monomeric mutant.
12. Sironi L, Mapelli M, Knapp S, De Antoni A, Jeang KT, Musacchio A: Crystal structure of the tetrameric Mad1-Mad2 core complex: implications of a ‘safety belt’ binding mechanism for the spindle checkpoint. EMBO J 2002, 21:2496-2506.
13. Chen YW, Allen MD, Veprintsev DB, Löwe J, Bycroft M: The structure of the AXH domain of spinocerebellar ataxin-1. J Biol Chem 2004, 279:3758-3765.
   See annotation to [14].
14. de Chiara C, Menon RP, Adinolfi S, de Boer J, Ktistaki E, Kelly G, Calder L, Kioussis D, Pastore A: The AXH domain adopts alternative folds: the solution structure of HBP1 AXH. Structure 2005, 13:743-753.
   One of many examples of the parallel determination of the same or closely related target structures by independent groups [13]. The discovery of unexpected structural diversity of this domain is a bonus.
15. Vakonakis I, Klewer DA, Williams SB, Golden SS, LiWang AC: Structure of the N-terminal domain of the circadian clock-associated histidine kinase SasA. J Mol Biol 2004, 342:9-17.
16. Garces RG, Wu N, Gillon W, Pai EF: Anabaena circadian clock proteins KaiA and KaiB reveal a potential common binding site to their partner KaiC. EMBO J 2004, 23:1688-1698.
17. Hitomi K, Oyama T, Han S, Arvai AS, Getzoff ED: Tetrameric architecture of the circadian clock protein KaiB. A novel interface for intermolecular interactions and its impact on the circadian rhythm. J Biol Chem 2005, 280:19127-19135.
18. Iwase R, Imada K, Hayashi F, Uzumaki T, Morishita M, Onai K, Furukawa Y, Namba K, Ishiura M: Functionally important substructures of circadian clock protein KaiB in a unique tetramer complex. J Biol Chem 2005, 280:43141-43149.
19. Phillips JD, Whitby FG, Warby CA, Labbe P, Yang C, Pflugrath JW, Ferrara JD, Robinson H, Kushner JP, Hill CP: Crystal structure of the oxygen-dependant coproporphyrinogen oxidase (Hem13p) of Saccharomyces cerevisiae. J Biol Chem 2004, 279:38960-38968.
20. Lee DS, Flachsova E, Bodnarova M, Demeler B, Martasek P, Raman CS: Structural basis of hereditary coproporphyria. Proc Natl Acad Sci USA 2005, 102:14232-14237.
21. de Bono S, Riechmann L, Girard E, Williams RL, Winter G: A segment of cold shock protein directs the folding of a combinatorial protein. Proc Natl Acad Sci USA 2005, 102:1396-1401.
   This paper reports the first structure of an artificial protein from the set of novel folded domains generated by random shuffling of nonhomologous polypeptide segments and discusses its evolutionary implications.
22. Riechmann L, Lavenir I, de Bono S, Winter G: Folding and stability of a primitive protein. J Mol Biol 2005, 348:1261-1272.
   The authors report the biophysical characterisation of the 1B11 combinatorial protein, the structure of which is described in [21]. They confirm that segment swapping and associated oligomerisation are both powerful ways of stabilising proteins, supporting the view that this may have been a feature of early protein evolution.
23. D’Alessio G: The evolutionary transition from monomeric to oligomeric proteins: tools, the environment, hypotheses. Prog Biophys Mol Biol 1999, 72:271-298.
24. Bobay BG, Andreeva A, Mueller GA, Cavanagh J, Murzin AG: Revised structure of the AbrB N-terminal domain unifies a diverse superfamily of putative DNA-binding proteins. FEBS Lett 2005, 579:5669-5674.
25. Coles M, Djuranovic S, Söding J, Frickey T, Koretke K, Truffault V, Martin J, Lupas AN: AbrB-like transcription factors assume a swapped hairpin fold that is evolutionarily related to double-psi β-barrels. Structure 2005, 13:919-928.
26. Mathews II, Deacon AM, Canaves JM, McMullan D, Lesley SA, Agarwalla S, Kuhn P: Functional analysis of substrate and cofactor complex structures of a thymidylate synthase-complementing protein. Structure 2003, 11:677-690.
27. Leduc D, Graziani S, Lipowski G, Marchand C, Le Marechal P, Liebl U, Myllykallio H: Functional evidence for active site location of tetrameric thymidylate synthase X at the interphase of three monomers. Proc Natl Acad Sci USA 2004, 101:7252-7257.
   Homo-oligomeric enzymes with active sites formed at the interface of three or more monomers are rare. In this case, each monomer is shown to contribute a catalytically essential residue, suggesting that the ThyX tetramer may have evolved by oligomerisation of ‘inactive’ monomers.
28. Bryk R, Lima CD, Erdjument-Bromage H, Tempst P, Nathan C: Metabolic enzymes of mycobacteria linked to antioxidant defense by a thioredoxin-like protein. Science 2002, 295:1073-1077.
29. Nunn CM, Djordjevic S, Hillas PJ, Nishida CR, Ortiz de Montellano PR: The crystal structure of Mycobacterium tuberculosis alkylhydroperoxidase AhpD, a potential target for antitubercular drug design. J Biol Chem 2002, 277:20033-20040.
30. Ito K, Arai R, Fusatomi E, Kamo-Uchikubo T, Kawaguchi S-I, Akasaka R, Terada T, Kuramitsu S, Shirouzu M, Yokoyama S: Crystal structure of the conserved protein TTHA0727 from Thermus thermophilus HB8 at 1.9 Å resolution: a CMD family member distinct from carboxymuconolactone decarboxylase (CMD) and AhpD. Protein Sci 2006; doi:10.1110/ps.062148506.
31. Numata T, Fukai S, Ikeuchi Y, Suzuki T, Nureki O: Structural basis for sulfur relay to RNA mediated by heterohexameric TusBCD complex. Structure 2006, 14:357-366.
   Recent genetic studies reveal that the products of five novel genes, tusABCDE, function in 2-thiouridine modification of tRNA wobble positions. The TusBCD complex is a dimer of heterotrimers of homologous subunits, related to hypothetical protein YchN [32], of which only TusD retains the catalytic cysteine residue.
32. Shin DH, Yokota H, Kim R, Kim SH: Crystal structure of a conserved hypothetical protein from Escherichia coli. J Struct Funct Genomics 2002, 2:53-66.
33. Ökvist M, Dey R, Sasso S, Grahn E, Kast P, Krengel U: 1.6 Å crystal structure of the secreted chorismate mutase from Mycobacterium tuberculosis: novel fold topology revealed. J Mol Biol 2006, 357:1483-1499.
   Structures of members of three chorismate mutase AroQ subclasses point to divergent evolution in the distant past. The AroQβ and AroQγ subclasses probably evolved from an ancestor of the much simpler AroQα subclass by gene duplication and fusion events to generate different fold variants.
34. Tarbouriech N, Buisson M, Seigneurin J-M, Cusack S, Burmeister WP: The monomeric dUTPase from Epstein-Barr virus mimics trimeric dUTPases. Structure 2005, 13:1299-1310.
   The monomeric and trimeric dUTPases both contain the same five characteristic sequence motifs, but in a different order. This example of evolution from the trimeric to the monomeric enzyme is contrary to the commonly observed trend for efficient genome usage in viruses, as the monomeric dUTPase needs more coding sequence than a trimeric one.
35. Moroz OV, Murzin AG, Makarova KS, Koonin EV, Wilson KS, Galperin MY: Dimeric dUTPases, HisE, and MazG belong to a new superfamily of all-α NTP pyrophosphohydrolases with potential ‘house-cleaning’ functions. J Mol Biol 2005, 347:243-255.
36. Schubert HL, Raux E, Wilson KS, Warren MJ: Common chelatase design in the branched tetrapyrrole pathways of heme and anaerobic cobalamin synthesis. Biochemistry 1999, 38:10660-10669.
37. Blander G, Guarente L: The Sir2 family of protein deacetylases. Annu Rev Biochem 2004, 73:417-435.
38. Kisker C, Schindelin H, Rees DC: Molybdenum-cofactor-containing enzymes: structure and mechanism. Annu Rev Biochem 1997, 66:233-267.
39. Adams MJ, Ellis GH, Gover S, Naylor CE, Phillips C: Crystallographic study of coenzyme, coenzyme analogue and substrate binding in 6-phosphogluconate dehydrogenase: implications for NADP specificity and the enzyme mechanism. Structure 1994, 2:651-668.
40. Biou V, Dumas R, Cohen-Addad C, Douce R, Job D, Pebay-Peyroula E: The crystal structure of plant acetohydroxy acid isomeroreductase complexed with NADPH, two magnesium ions and a herbicidal transition state analog determined at 1.65 Å resolution. EMBO J 1997, 16:3405-3415.
41. Kavanagh KL, Klimacek M, Nidetzky B, Wilson DK: Crystal structure of Pseudomonas fluorescens mannitol 2-dehydrogenase binary and ternary complexes: specificity and catalytic mechanism. J Biol Chem 2002, 277:43433-43442.
42. Campbell RE, Mosimann SC, van De Rijn I, Tanner ME, Strynadka NC: The first structure of UDP-glucose dehydrogenase reveals the catalytic residues necessary for the two-fold oxidation. Biochemistry 2000, 39:7012-7023.
43. Taylor WR: A deeply knotted protein structure and how it might fold. Nature 2000, 406:916-919.
44. Snook CF, Tipton PA, Beamer LJ: Crystal structure of GDP-mannose dehydrogenase: a key enzyme of alginate biosynthesis in P. aeruginosa. Biochemistry 2003, 42:4658-4668.
45. Lokanath NK, Ohshima N, Takio K, Shiromizu I, Kuroishi C, Okazaki N, Kuramitsu S, Yokoyama S, Miyano M, Kunishima N: Crystal structure of novel NADP-dependent 3-hydroxyisobutyrate dehydrogenase from Thermus thermophilus HB8. J Mol Biol 2005, 352:905-917.
46. Ahn HJ, Eom SJ, Yoon HJ, Lee BI, Cho H, Suh SW: Crystal structure of class I acetohydroxy acid isomeroreductase from Pseudomonas aeruginosa. J Mol Biol 2003, 328:505-515.
47. Suresh S, Turley S, Opperdoes FR, Michels PA, Hol WG: A potential target enzyme for trypanocidal drugs revealed by the crystal structure of NAD-dependent glycerol-3-phosphate dehydrogenase from Leishmania mexicana. Structure 2000, 8:541-552.
48. DeLano WL: The PyMOL Molecular Graphics System. San Carlos, CA, USA: DeLano Scientific; 2002.