Mikko Laitaoja: Structure-Function Studies of Zinc Proteins 140

110/2011 TORVINEN Mika: Mass spectrometric studies of host-guest complexes of glucosylcalixarenes
111/2012 KONTKANEN Maija-Liisa: Catalyst carrier studies for 1-hexene hydroformulation: cross-linked poly(4-vinylpyridine), nano zinc oxide and one-dimensional ruthenium polymer
112/2012 KORHONEN Tuulia: The wettability properties of nano- and micromodified paint surfaces
113/2012 JOKI-KORPELA Fatima: Functional polyurethane-based films and coatings
114/2012 LAURILA Elina: Non-covalent interactions in Rh, Ru, Os, and Ag complexes
115/2012 MAKSIMAINEN Mirko: Structural studies of Trichoderma reesei, Aspergillus oryzae and Bacillus circulans sp. alkalophilus beta-galactosidases – Novel insights into a structure-function relationship
116/2012 PÖLLÄNEN Maija: Morphological, thermal, mechanical, and tribological studies of polyethylene composites reinforced with micro- and nanofillers
117/2013 LAINE Anniina: Elementary reactions in metallocene/methylaluminoxane catalyzed polyolefin synthesis
118/2013 TIMONEN Juri: Synthesis, characterization and anti-inflammatory effects of substituted coumarin derivatives
119/2013 TAKKUNEN Laura: Three-dimensional roughness analysis for multiscale textured surfaces: Quantitative characterization and simulation of micro- and nanoscale structures
120/2014 STENBERG Henna: Studies of self-organizing layered coatings
121/2014 KEKÄLÄINEN Timo: Characterization of petroleum and bio-oil samples by ultrahigh-resolution Fourier transform ion cyclotron resonance mass spectrometry
122/2014 BAZHENOV Andrey: Towards deeper atomic-level understanding of the structure of magnesium dichloride and its performance as a support in the Ziegler-Natta catalytic system
123/2014 PIRINEN Sami: Studies on MgCl2/ether supports in Ziegler–Natta catalysts for ethylene polymerization
124/2014 KORPELA Tarmo: Friction and wear of micro-structured polymer surfaces
125/2014 HUOVINEN Eero: Fabrication of hierarchically structured polymer surfaces
126/2014 EROLA Markus: Synthesis of colloidal gold and polymer particles and use of the particles in preparation of hierarchical structures with self-assembly
127/2015 KOSKINEN Laura: Structural and computational studies on the coordinative nature of halogen bonding
128/2015 TUIKKA Matti: Crystal engineering studies of barium bisphosphonates, iodine bridged ruthenium complexes, and copper chlorides
129/2015 JIANG Yu: Modification and applications of micro-structured polymer surfaces
130/2015 TABERMAN Helena: Structure and function of carbohydrate-modifying enzymes
131/2015 KUKLIN Mikhail S.: Towards optimization of metalocene olefin polymerization catalysts via structural modifications: a computational approach
132/2015 SALSTELA Janne: Influence of surface structuring on physical and mechanical properties of polymer-cellulose fiber composites and metal-polymer composite joints
133/2015 CHAUDRI Adil Maqsood: Tribological behavior of the polymers used in drug delivery devices
134/2015 HILLI Yulia: The structure-activity relationship of Pd-Ni three-way catalysts for H2S suppression
135/2016 SUN Linlin: The effects of structural and environmental factors on the swelling behavior of Montmorillonite-Beidellite smectics: a molecular dynamics approach
136/2016 OFORI Albert: Inter- and intramolecular interactions in the stabilization and coordination of palladium and silver complexes: DFT and QTAIM studies
137/2016 LAVIKAINEN Lasse: The structure and surfaces of 2:1 phyllosilicate clay minerals
138/2016 MYLLER Antti T.: The effect of a coupling agent on the formation of area-selective monolayers of iron a-octabutoxy phthalocyanine on a nano-patterned titanium dioxide carrier
139/2016 KIRVESLAHTI Anna: Polymer wettability properties: their modification and influences upon water movement

Dissertations
Department of Chemistry
University of Eastern Finland
No. 140 (2016)

Mikko Laitaoja

Structure–Function Studies of Zinc Proteins

Department of Chemistry
University of Eastern Finland
Finland
Joensuu 2016

Mikko Laitaoja
Department of Chemistry, University of Eastern Finland
P.O. Box 111, FI-80101 Joensuu

Supervisor
Prof. Janne Jänis, University of Eastern Finland, Joensuu, Finland

Referees
Prof. Risto Kostiainen, University of Helsinki, Helsinki, Finland
Assoc. Prof. Vesa Hytönen, University of Tampere, Tampere, Finland

Opponent
Prof. Mariusz Jaskólski, Adam Mickiewicz University, Poznań, Poland

To be presented with the permission of the Faculty of Science and Forestry of the University of Eastern Finland for public criticism in Auditorium F100, Yliopistokatu 7, Joensuu, on December 9th 2016, at 12 noon.

Copyright © 2016 Mikko Laitaoja
ISBN: 978-952-61-2352-3
ISSN: 2242-1033
Grano Oy
Joensuu 2016

ABSTRACT

Zinc is one of the most abundant metals in biology. A wide variety of proteins include a zinc cofactor, having roles in protein folding, protein-protein interactions and enzyme catalysis. Zinc proteins range from small zinc fingers to large multi-protein complexes, and are present in all enzyme classes with zinc acting as an active site or a structural metal. Their role in gene expression and regulation makes them especially attractive targets for biomedical research. High bioavailability, redox-inertness and the possibility of different coordination numbers and geometries are the key aspects that allow zinc to hold a prominent place in biological systems.

In this work, different zinc proteins were studied by using protein structures available in the Protein Data Bank (PDB) or by experimental methods, especially high-resolution Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometry. Currently, the PDB contains thousands of zinc protein structures.
A detailed analysis of these structures revealed that many previous database surveys suffer from serious flaws and misinterpretations, such as exclusion of symmetry-related molecules in crystals or incorrect assignment of ligands, resulting in an incorrect coordination number and/or geometry. Thus, a more comprehensive analysis was conducted in this work, taking the aforementioned factors fully into account. Interestingly, a clear dependence of the metal-to-ligand bond lengths on the crystallographic resolution was observed, pointing to the need for further analysis and validation protocols of the metalloprotein crystal structures when assessing metal ion coordination.

SAP30L is a member of the Sin3A corepressor protein complex, which is involved in transcriptional regulation. SAP30L contains a novel zinc finger motif, which mediates the key protein, lipid and DNA interactions of the complex. By using high-resolution FT-ICR mass spectrometry, we characterized redox-dependent disulfide formation in the SAP30L ZnF motif and its structural and functional implications. Upon oxidative stress, SAP30L undergoes formation of two disulfide bonds with a concomitant release of the coordinated zinc ion. The oxidized SAP30L was shown to remain folded and to bind signaling phospholipids with markedly higher affinity as compared to holo-SAP30L. The results suggest that ZnF in SAP30L works as a redox switch, which may be essential in controlling the repression activity of the Sin3A complex.

Small zinc finger motifs are promising molecular scaffolds for protein design, serving as versatile building blocks for artificial nanocatalysts or specific biosensors. In the last part of this work, structural robustness of a designed ZnF motif, named MM1, was characterized with the use of FT-ICR mass spectrometry. The results showed that MM1 binds zinc specifically with sub-micromolar affinity. Additionally, only gold ions were able to form a complex with the peptide.
Surprisingly, MM1 was able to retain most of its metal ion binding affinity in the presence of selective alanine mutations of the primary zinc coordinating amino acid residues, indicating an exceptional structural stability.

LIST OF ORIGINAL PUBLICATIONS

This dissertation is a summary of the following original publications I–III.

I Laitaoja, M.; Valjakka, J.; Jänis, J. Zinc Coordination Spheres in Protein Structures. Inorg. Chem. 2013, 52, 10983–10991.

II Laitaoja, M.; Tossavainen, H.; Pihlajamaa, T.; Valjakka, J.; Viiri, K.; Lohi, O.; Permi, P.; Jänis, J. Redox-Dependent Disulfide Bond Formation in SAP30L Corepressor Protein: Implications for Structure and Function. Protein Sci. 2016, 25, 572–586.

III Laitaoja, M.; Isoniemi, S.; Valjakka, J.; Mándity, I.M.; Jänis, J. Deciphering Metal Ion Preference and Primary Coordination Sphere Robustness of a Designed Zinc Finger with High-Resolution Mass Spectrometry. Protein Sci., in press.

CONTENTS

ABSTRACT
LIST OF ORIGINAL PUBLICATIONS
CONTENTS
ABBREVIATIONS
1. INTRODUCTION
1.1. ZINC PROTEINS
1.2. ZINC FINGER MOTIFS
1.3. MASS SPECTROMETRY
1.3.1. GENERAL
1.3.2. ELECTROSPRAY IONIZATION
1.3.3. FOURIER TRANSFORM ION CYCLOTRON RESONANCE MASS SPECTROMETRY
1.3.4. PROTEIN MASS SPECTROMETRY
2. AIMS OF THE STUDY
3. EXPERIMENTAL
3.1. DATABASE SURVEY
3.2. PROTEIN AND PEPTIDE MATERIALS
3.3. MASS SPECTROMETRY
4. RESULTS AND DISCUSSION
4.1. DATABASE SURVEY ON ZINC PROTEINS I
4.2. NMR STRUCTURES
4.2.1. CLASSIFICATION AND COORDINATION SPHERES
4.2.2. COORDINATING LIGANDS AND BOND LENGTHS
4.3. X-RAY STRUCTURES
4.3.1. CLASSIFICATION AND COORDINATION SPHERES
4.3.2. COORDINATING LIGANDS AND BOND LENGTHS
4.3.3. INCOMPLETE SPHERES
4.4. CHARACTERIZATION OF THE SAP30L COREPRESSOR PROTEIN II
4.4.1. GENERAL
4.4.2. ZINC-INDUCED FOLDING OF SAP30L
4.4.3. REDOX-DEPENDENT DISULFIDE FORMATION
4.4.4. PHOSPHOLIPID AND DNA BINDING
4.4.5. SOLUTION NMR STRUCTURE OF SAP30L
4.4.6. DISCUSSION
4.5. DESIGNED ZINC FINGER PEPTIDES III
4.5.1. GENERAL
4.5.2. METAL ION BINDING OF ZINC FINGER PEPTIDES
4.5.3. ZINC COORDINATION SPHERE ROBUSTNESS OF MM1
5. CONCLUSIONS
ACKNOWLEDGEMENTS
REFERENCES

ABBREVIATIONS

CID       collision induced dissociation
DFF2      Designed Functional Finger 2
EC        Enzyme Commission
ECD       electron capture dissociation
ESI       electrospray ionization
FT-ICR    Fourier transform ion cyclotron resonance
HOAc      acetic acid
IUBMB     International Union of Biochemistry and Molecular Biology
Kcx       carboxylated lysine
Kd        dissociation constant
m/z       mass/charge ratio
MeCN      acetonitrile
MM1       Minimal Mutant 1
MS        mass spectrometry
NC        nucleocapsid
NH4OAc    ammonium acetate
NLS       nuclear localization signal
NMR       nuclear magnetic resonance
PDB       Protein Data Bank
ppm       parts-per-million
SAP30     Sin3A associated protein 30
SAP30L    Sin3A associated protein 30-like
SPPS      solid-phase peptide synthesis
ZnF       zinc finger

1. INTRODUCTION

1.1. ZINC PROTEINS

Approximately one third of all proteins require a metal cofactor for proper function.
In general, metal cofactors have important roles in protein folding and structural stabilization, in the formation of protein–protein assemblies, and as active site metals in many enzymes.1–6 Zinc holds a prominent place among these metals. Genomic studies have estimated that zinc is present in about 10% of all proteins expressed, representing a vast number of proteins across a broad range of protein families.7 The steadily growing number of zinc-containing protein structures in the Protein Data Bank (PDB) seems to reflect this. At present, the PDB contains about 10 500 protein structures with coordinated zinc ions. Zinc is the second most abundant trace metal in the human body after iron.7–10

Structural zinc sites are made up of small protein motifs or larger protein domains, requiring one or more zinc ions for correct folding and function.11–13 Zinc ions are typically coordinated by four cysteine and/or histidine residues, although penta- and hexa-coordinated zinc ions also exist with a variety of amino acid ligands. In catalytic and co-catalytic zinc sites, one of the coordination sites is usually occupied by a water molecule, which is easily displaced by an incoming substrate molecule to initiate catalysis. Zinc-containing enzymes are present in all six enzyme classes, catalyzing a wide variety of reactions. Notable examples of zinc metalloenzymes are carbonic anhydrases and alcohol dehydrogenases.14,15

At protein interfaces, zinc ions assist the formation of protein subunit assemblies.2,9,16,17 In these sites, zinc ions may be essential for complex formation or merely stabilize certain conformations.
While some of these interactions might be present only in a crystalline state, such transient intermolecular interactions may also be important for the regulation of protein expression.9,18,19 A good example of a zinc ion mediated protein assembly is insulin, where three protein dimers and two zinc ions form a hexameric assembly, a main physiological form of insulin.4,20

Methods for studying zinc proteins are quite limited due to the spectroscopically silent nature of zinc. The completely filled d orbitals of the Zn(II) ion (d10 electron configuration) render zinc diamagnetic and, thus, invisible in electron spectroscopy. In addition, zinc complexes are colorless and have no absorbance in the ultraviolet or microwave spectral regions.21 Therefore, indirect methods, e.g. metal substitution, have been used for biochemical studies of zinc proteins.

1.2. ZINC FINGER MOTIFS

Protein transcription factors that regulate gene expression and differentiation are relatively difficult to study owing to their low abundance and dynamic behavior with nucleic acids. The transcription factor IIIA (TFIIIA) from Xenopus laevis (African clawed frog) was the first to be characterized as containing a series of zinc-dependent structural motifs, able to “grip” RNA, hence the name “zinc finger” (ZnF). The first three-dimensional structure of a ZnF motif was determined from the 31st zinc finger of the Xfin protein (PDB 1ZNF) by using nuclear magnetic resonance (NMR) spectroscopy. The term “zinc finger” is broadly used to describe any small and compact protein motif capable of binding zinc ions in a tetrahedral arrangement by a combination of four cysteine and/or histidine residues.
In recent years, the rational design of short non-native amino acid sequences capable of performing specific functions, such as metal binding, oligomerization or genome editing with nucleases, has gained considerable interest.22–25 ZnF motifs provide an excellent starting point for such endeavors due to their small size, straightforward production and easily tailorable structures.

Figure 1 shows the three-dimensional structures of different types of zinc fingers. The first structure is a canonical Cys2His2 ZnF motif from the Xfin protein (PDB 1ZNF),11 in which the zinc ion is bound in a tetrahedral geometry by four amino acid ligands, two cysteine and two histidine residues.12 In the middle is a structure of the CREB binding protein (CBP) (PDB 1U2N), where three zinc ions coordinate to the same polypeptide chain, forming a scaffold for protein binding.26 When the terminal ligands of the two “zinc bundles” are removed, the region shown in blue folds independently to a stable non-natural structure.27,28 On the right is the N-terminal zinc finger from the HIV-1 nucleocapsid protein 7 (NCp7) (PDB 1HVO) bound to the pentanucleotide.29

Figure 1. Cartoon representation of structures of different types of zinc finger (ZnF) motifs. a) Canonical Cys2His2 zinc finger from X. laevis Xfin protein (PDB 1ZNF), b) CREB binding protein (CBP), containing three consecutive zinc finger motifs (PDB 1U2N) and c) N-terminal zinc finger of the HIV-1 nucleocapsid protein 7 (NCp7) bound to the pentanucleotide d(ACGCC) (PDB 1HVO).

1.3. MASS SPECTROMETRY

1.3.1. GENERAL

Mass spectrometry (MS) is an analytical technique that aims to determine the neutral molecular mass of an analyte (an atom or a molecule) by measuring the ratio of the mass and the charge (m/z) of the corresponding ion.
By its current definition, m/z is a dimensionless quantity, calculated as the mass of the ion (given in unified atomic mass units, u or Da) divided by its charge number (z).30 The analyzed ions can be positive, negative or radical ions. Since mass spectrometry is independent of the spectroscopic nature of the analyte, it provides an excellent means to characterize many different types of molecules, reactions and interactions.31–33 Besides analysis of intact molecules, mass spectrometry can be used to further identify or characterize ions by fragmenting them into smaller components.34,35 Various types of instruments have been developed, each with its own strengths and weaknesses. These include the ionization methods available, mass range (the lowest and the highest detected m/z), mass resolution and mass accuracy, sensitivity, dynamic range, speed, and the capability of performing tandem mass spectrometry (MS/MS) experiments.

1.3.2. ELECTROSPRAY IONIZATION

A mass spectrometer is only able to analyze charged atoms or molecules, and therefore the analyte molecules have to be ionized and transferred into the gas phase. The most common ionization method used in biomolecule analysis is electrospray ionization (ESI).36–38 ESI is a “soft” ionization method causing very little fragmentation of the analyte molecules or even their non-covalent complexes, thus making it suitable for analysis of interactions between biomolecules, e.g. proteins, nucleic acids, carbohydrates, metal ions or other small ligands.39 In ESI, a polar volatile solvent containing analyte molecules, typically at low micromolar concentrations, is sprayed through a narrow-bore metal capillary placed in a high electric field in ambient conditions.40 The electric field causes the formation of a Taylor cone and a jet at the apex of the cone. Due to electrostatic repulsion of the ions, the jet breaks into small non-spherical droplets carrying excess ionic charge (Figure 2).
The charge imbalance of the ions in the droplets is one of the key aspects of ESI, which eventually leads to the formation of gas-phase ions. The electrospray can be regarded as an electrochemical cell where the current is transported by the droplets. The evaporation of the solvent molecules from the droplets causes repeated shrinking and coulombic fissions of the droplets, ultimately forming gas-phase ions. Current instruments use a pneumatic nebulizer, where the inner metal capillary carries the solution and the outer capillary carries pressure-controlled nitrogen gas to produce uniformly sized droplets and to improve the evaporation of the droplets and the ionization process. A heated countercurrent drying gas is used to further aid the evaporation process and to prevent neutrals from entering the ion source.

Another key aspect of the ionization process of large biomolecules in ESI is the multiple charging phenomenon, which means that the same analyte may be present in the spectrum with several signals, corresponding to different numbers of charge carriers (e.g., protons) attached.41,42 Most biomolecules are either polybasic or acidic in nature and have multiple sites of charging, which usually complicates the spectral interpretation. The main advantage of the multiple charging phenomenon is that it lowers the requirement for the highest detected m/z for a given mass. In addition, mass accuracy is statistically increased when the mass is calculated from several peaks.

Figure 2. Schematic representation of the electrospray process. In the presence of a high electric field the solution creates the typical cone-jet plume from the Taylor cone upon exiting the capillary.

1.3.3. FOURIER TRANSFORM ION CYCLOTRON RESONANCE MASS SPECTROMETRY

In Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometry, detection of the externally generated ions occurs in an ICR cell, placed in the center of a large superconducting magnet.43–45 The ICR cell consists of pairs of trapping, excitation and detection plates (Figure 3a). In the presence of a high magnetic field, the ions start to circulate around the magnetic field axis, at a frequency that depends only on the m/z of the ions and the strength of the magnetic field. This phenomenon is called the cyclotron motion. The magnet is able to confine ions only radially; the axial confinement is accomplished by using a static electric field (a few volts) applied to the trapping plates. The radius of the natural cyclotron motion is too small and incoherent for detection, so in order to produce a coherent measurable signal the ions are excited to a larger radius by using a short radio-frequency (RF) pulse, applied to the excitation plates (Figure 3b). After the excitation, the ions circulate in a larger cyclotron radius and induce a small alternating current (image current) between the detection plates, known as the time-domain transient (Figure 3c). In the presence of an ultra-high vacuum (∼10⁻¹⁰ mbar) inside the cell, the ions travel tens of kilometers in one second, a typical transient time. Modern FT-ICR instruments have the capability of measuring millions of data points during the data acquisition period, providing unparalleled resolution and mass accuracy. In principle, mass resolution can be increased by acquiring a longer transient or by using a higher field magnet.
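The dependence of the cyclotron frequency on m/z and field strength can be illustrated with a minimal sketch. It assumes the textbook expression for the unperturbed cyclotron frequency, f = zeB/(2πm), and ignores trapping-field and space-charge shifts. The function names, the m/z value of 500 and the 1 cm orbit radius are illustrative assumptions, not values from the measurements in this work.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg


def cyclotron_frequency_hz(mz, b_tesla):
    """Unperturbed cyclotron frequency f = zeB / (2*pi*m) for an ion of given m/z."""
    return E_CHARGE * b_tesla / (2.0 * math.pi * mz * DALTON)


def path_length_km(mz, b_tesla, radius_m, seconds=1.0):
    """Distance travelled on a circular orbit of the given (assumed) radius."""
    turns = cyclotron_frequency_hz(mz, b_tesla) * seconds
    return 2.0 * math.pi * radius_m * turns / 1000.0


if __name__ == "__main__":
    # Hypothetical ion at m/z 500 on an assumed 1 cm post-excitation orbit
    for b in (4.7, 12.0):
        f = cyclotron_frequency_hz(500.0, b)
        d = path_length_km(500.0, b, radius_m=0.01)
        print(f"B = {b:4.1f} T: f = {f / 1e3:6.1f} kHz, path in 1 s = {d:4.1f} km")
```

Under these assumptions the frequency scales linearly with the field and inversely with m/z, and an orbit of about 1 cm gives a path of the order of ten kilometers or more per second, in line with the estimate given in the text.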
The higher the field, the higher the cyclotron frequency and the faster the data acquisition for a given resolution.46 For the same post-excitation radius, the theoretical maximum resolution does not increase with the field strength; however, the higher field allows longer transients to be measured without signal decay, resulting in higher resolution.34 The time-domain transient is then Fourier transformed into a frequency-domain signal, followed by magnitude calculation and frequency-to-mass conversion. Figure 3d shows the ESI FT-ICR mass spectrum of protonated arginine (monoisotopic mass 175.11895 Da) measured at a 12-T field. Due to the obtained ultra-high resolution (∼300 000), the arginine 15N and 13C isotopologue ions (176.11599 and 176.12231 Da, respectively) can be fully resolved. Note that the mass difference between these two ions is only ∼6.3 mDa, which corresponds to the mass of a dozen electrons.

Figure 3. a) Geometry and construction of a cylindrical ICR cell. b) Principles of ion excitation and detection in the ICR cell. c) Time-domain transient. d) 12-T ESI FT-ICR mass spectrum of protonated arginine (the inset shows the magnified view at the first isotopic peak pattern).

1.3.4. PROTEIN MASS SPECTROMETRY

The high sensitivity and small sample requirement associated with mass spectrometry have made it an indispensable tool in proteomics, and the advent of soft ionization methods has enabled measurements of intact proteins and even their non-covalent complexes.47–53 Figure 4a shows an ESI-MS spectrum of a recombinant non-glycosylated form of an avidin protein, expressed in E. coli, measured in a 500 mM ammonium acetate buffer (pH 6.8) at 1.5 µM concentration.54–56 The signals correspond to the charge states of 16+, 17+, and 18+ of the intact avidin tetramer, whose crystal structure (PDB 2AVI) is shown above. The approximate mass of the protein can be obtained by spectral deconvolution (see inset in Figure 4a).
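The idea behind deconvoluting a charge-state envelope can be sketched as follows. This is a simplified illustration, assuming that all charges are carried by protons and that the peaks belong to consecutive charge states; it is not the deconvolution algorithm of the instrument software. The function names and the 58 000 Da test mass are hypothetical.

```python
PROTON = 1.007276  # mass of a proton, Da


def charge_from_adjacent(mz_high, mz_low):
    """Charge z of the peak at mz_high, assuming the peak at mz_low carries z + 1 protons."""
    return round((mz_low - PROTON) / (mz_high - mz_low))


def neutral_mass(mz, z):
    """Neutral mass of an [M + zH]z+ ion observed at the given m/z."""
    return z * (mz - PROTON)


def deconvolute(mz_values):
    """Average neutral mass from a series of consecutive charge-state peaks."""
    peaks = sorted(mz_values, reverse=True)          # highest m/z = lowest charge
    z0 = charge_from_adjacent(peaks[0], peaks[1])
    masses = [neutral_mass(mz, z0 + i) for i, mz in enumerate(peaks)]
    return sum(masses) / len(masses)


if __name__ == "__main__":
    # Synthetic peaks for a hypothetical 58 000 Da species at 16+, 17+ and 18+
    true_mass = 58000.0
    peaks = [true_mass / z + PROTON for z in (16, 17, 18)]
    print("m/z values:", [round(p, 3) for p in peaks])
    print("recovered mass:", round(deconvolute(peaks), 3))
```

Averaging over several charge states is what gives the statistical gain in mass accuracy mentioned in the electrospray section.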
Figure 4b shows the same protein measured in a mixture of acetonitrile (MeCN) and water (50:50, v/v) with 1% of acetic acid (HOAc), showing only an unfolded protein monomer. Since the spacing of the baseline-resolved isotopic peaks (1/z) directly provides the charge state, the mass of the protein can be accurately calculated from any signal. Figure 4c shows a signal for a single fragment ion, corresponding to a C-terminal peptide of the avidin protein. Fragment ions can be obtained through MS/MS experiments, in which intact protein ions are fragmented, e.g. by collision induced dissociation (CID) or electron capture dissociation (ECD) techniques (a method known as top-down mass spectrometry).57–59 Such experiments can be used for determination of amino acid sequences or post-translational modifications.60–63

Figure 4. Protein mass spectra at different structural levels (see text for details). a) Native tetrameric avidin. The inset shows a deconvoluted mass spectrum. b) Avidin monomer and c) C-terminal fragment ion, corresponding to the residues of D109–E128. The structural models above the spectra are based on the crystal structure of egg-white avidin (PDB 2AVI).54

2. AIMS OF THE STUDY

Zinc proteins play essential roles in many physiological processes as enzymes, storage proteins, transcription factors, and replication proteins. Therefore, knowledge of their structure–function relationships is of importance for understanding these processes at the molecular level. Moreover, designed metal binding protein scaffolds, based on zinc proteins, hold great promise as versatile building blocks for a variety of purposes (e.g., artificial catalysts or specific biosensors). In this work, the structure and function of several different zinc proteins were studied. The specific aims were as follows:

1. Analysis of all zinc protein structures available in the PDB with respect to their structural classification, coordination geometry, ligand types, and metal-to-ligand bond lengths. These structures were manually analyzed to overcome some software-related limitations. The study aimed at addressing some of the shortcomings of the previous database surveys, and providing a more complete, up-to-date view on the zinc coordination environments in proteins.

2. Structural characterization of the SAP30L corepressor protein, especially its redox-dependent disulfide bond formation, by using high-resolution ESI FT-ICR mass spectrometry. SAP30L has been previously shown to contain an N-terminal Cys3His zinc finger motif, a key structural element for its folding and function. In addition, the aim was to determine the three-dimensional structure for holo-SAP30L and to study its DNA and lipid binding by using NMR spectroscopy.

3. Analysis of metal ion preference and structural robustness of the designed zinc finger motif (named MM1) by using ESI FT-ICR mass spectrometry. The main aim of the study was to characterize the primary coordination sphere robustness of MM1 towards efficient metal ion binding upon selective alanine substitutions of the primary zinc coordinating amino acid residues.

3. EXPERIMENTAL

3.1. DATABASE SURVEY

The Protein Data Bank (as of January 18, 2012) was queried for molecules containing zinc ions.10 In this study, only the protein structures without nucleic acids or their complexes were analyzed. Further analyses and handling of the data were conducted separately for the NMR and crystal structures. To avoid redundancy, structures with 95% sequence identity were removed by using the BLASTClust algorithm. No other restraints, such as resolution, molecular mass or atom-specific cut-offs, were used in collecting the dataset.
Since a large proportion of zinc proteins are enzymes, a separate search was performed on each enzyme class and the enzyme datasets were compared with the total number of zinc proteins. The final working dataset contained a total of 2616 structures (590 NMR and 2026 crystal structures). The structures were classified according to the Enzyme Commission (EC) numbers, assigned to each structure based on the reaction it catalyzes. The structures without enzymatic activity were classified as structural sites.

The structures were initially analyzed using Ligand Explorer 3.9 (available in the PDB) for zinc coordinating ligands, metal-to-ligand bond lengths and coordination geometries. In cases where the coordination sphere showed a largely distorted geometry or was clearly missing some of the coordinating ligands, the corresponding original publications (if available) were inspected for clarification. Further analyses of the structures were performed by using PyMOL 1.3 software.64 Protein figures in the original publications and in this thesis were created using the same software.

Coordinating ligands were listed in order of appearance and marked by their three-letter codes, e.g. Cys for cysteine and Kcx for carboxylated lysine. Water was marked as Wat and the other exogenous ligands, inhibitors and solvent molecules by their coordinating atoms, e.g. O for oxygen, N for nitrogen, and Cl for chloride. Histidine can bind via either of its nitrogen atoms; however, the structural and functional significance of these two binding modes remains elusive, thus the binding modes were not treated separately. Carboxylates, such as aspartate, also possess multiple binding modes. In this study, aspartate and glutamate residues were treated as monodentate ligands, i.e., possessing a two-electron donor to a single zinc ion.
The bidentate mode observed for carboxylates would form a four-membered ring with two highly distorted orbitals, and is represented by resonance structures of single coordination bonds. In the coordination spheres with four ligands, where a bidentate mode was observed, the bond angles of the other ligands were consistent with the tetrahedral geometry, thus supporting the analysis. When bridging two different metal ions, a carboxylate group can coordinate with both of its oxygen atoms. Since the coordination spheres usually have some distortions from ideal geometries, the structures were broadly categorized into tetrahedral, trigonal bipyramidal/square pyramidal, octahedral and incomplete geometries.

3.2. PROTEIN AND PEPTIDE MATERIALS

The N-terminal zinc finger motif of SAP30L (residues 25–92) was produced as a GST fusion protein by using a modified expression construct in the pGEX-4T1 vector in Escherichia coli Rosetta (DE3) cells.65 Thrombin and GST protein were removed by using benzamidine and glutathione affinity columns (GE Healthcare) and the SAP30L protein was further purified by using cation exchange (Resource S column; GE Healthcare) and size exclusion chromatography (Superdex 75 column; GE Healthcare). The construct used contained two additional N-terminal amino acids (GS) from the expression vector construct (a thrombin cleavage site). For mass spectrometric experiments, SAP30L was buffer-exchanged to 20 mM ammonium acetate (NH4OAc) buffer (pH 6.8) by using PD-10 desalting columns (GE Healthcare). Concentrations of the eluates were determined by UV absorbance at 280 nm, based on the extinction coefficient calculated from the amino acid sequence.
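The concentration determination follows the Beer–Lambert law, c = A / (ε·l). A minimal sketch is given below, using the extinction coefficients quoted for SAP30L in this section (2560 and 2800 M⁻¹ cm⁻¹). The absorbance readings, the 1 cm path length and the function name are hypothetical, chosen only to illustrate the arithmetic.

```python
def molar_concentration(absorbance, epsilon_m_cm, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L."""
    if epsilon_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (epsilon_m_cm * path_cm)


if __name__ == "__main__":
    # Hypothetical A280 readings; coefficients are the ones quoted in the text
    for label, a280, eps in (("apo/holo", 0.128, 2560.0), ("oxidized", 0.140, 2800.0)):
        c = molar_concentration(a280, eps)
        print(f"{label}: A280 = {a280:.3f} -> {c * 1e6:.1f} uM")
```

With such low extinction coefficients, even a 50 µM solution gives an absorbance of only about 0.13, so small baseline errors translate into noticeable concentration errors.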
The coefficient values used were 2560 M⁻¹ cm⁻¹ for the apo (without zinc) and holo (with zinc) forms and 2800 M⁻¹ cm⁻¹ for the oxidized form (ox).66 The protein eluates were concentrated with 3K MWCO (molecular weight cut-off) centrifugal concentrators (Amicon Ultra or Vivaspin2; Millipore) and were stored at 4 °C until used. Water, acetonitrile and acetic acid were of HPLC grade or the highest quality available. Chelation of zinc ions was performed with 1,10-phenanthroline, in vitro oxidation was carried out with hydrogen peroxide (H2O2), and disulfide reduction was done with dithiothreitol (DTT).67 For disulfide bond analysis of SAP30L, trypsin digestion was performed at a 1:40 (w/w) protease-to-substrate ratio in 20 mM NH4OAc (pH 6.8) at room temperature, for periods ranging from 15 minutes to overnight. Prior to the measurements, acetonitrile and acetic acid were added to the solutions to quench the reaction and enhance ionization. Online (pepsin) digestion was performed using a previously published protocol.68 The resulting peptides from all digestions were analyzed by using GPMAW 7.10 software (Lighthouse Data, Denmark) or online ProteinProspector tools (available at http://prospector.ucsf.edu).

The C8-phosphoinositides (PIPs) were dissolved in 20 mM NH4OAc buffer and used without further purification. The lipids were stored in a freezer until used. The binding experiments were performed by mixing appropriate amounts of SAP30L and lipid samples and heating the mixtures at 50 °C for 15 minutes.

The zinc finger peptides (MM1, DFF2, NC and the MM1 mutants) were either synthesized manually by solid-phase peptide synthesis (SPPS) at the University of Szeged (Szeged, Hungary) or purchased from Genecust (Dudelange, Luxembourg) as lyophilized powders (≥ 95% purity) and used without further purification.
The commercial peptides were identical to those made by SPPS, except that their C-terminus was a free acid rather than the amide obtained from SPPS, the latter resulting from the Tentagel R RAM resin used in the synthesis. The oxidized peptides were reduced by heating the stock solutions at 70 °C for 10 minutes in the presence of 1 mM DTT and stored at 4 °C until analyzed. The metal ions (Ag+, Cd2+, Co2+, Cu2+, Hg2+, Mg2+, Mn2+, Ni2+ and Zn2+) were added as acetate salts, except for Au+, where a chloride salt was used (all from Sigma-Aldrich).

3.3. MASS SPECTROMETRY

All mass spectrometry experiments were performed with a Bruker Apex-Qe FT-ICR mass spectrometer, equipped with a 4.7-T superconducting magnet, an Infinity ICR cell, an Apollo II electrospray ionization (ESI) source and a mass selective quadrupole. The positive ionization mode was used in all measurements. The ultra-high vacuum (~10⁻¹⁰ mbar) needed was generated with two rotary pumps and four turbomolecular pumps. A mass range of m/z 387–4000 and a 512-kWord time-domain transient size were used in all measurements, providing a maximum theoretical mass resolution of ∼170 000 at m/z 500. The mass spectra were externally calibrated with respect to the ions of the ESI Tuning Mix calibration mixture (Agilent Technologies). The data were measured and further processed with XMASS 7.0.8 software.

For intact protein mass analysis in denaturing solution conditions, SAP30L was diluted with a MeCN/H2O/HOAc (49.5:49.5:1.0, v/v) solvent mixture. For measurements in native conditions, 20 mM ammonium acetate (pH 6.8) was used as a solvent instead. The samples were directly infused using a syringe pump at a flow rate of 1.5 µL/min, with dry nitrogen gas serving as the nebulizer gas at a pressure of 1.0 bar. Heated nitrogen gas was used as the drying gas with a flow rate of 5.0 L/min and a temperature of 240 °C.
Electron capture dissociation (ECD) measurements were performed by isolating the peptide ion of interest in the quadrupole with an isolation window of 5 m/z units and subjecting it to low-energy electrons inside the ICR cell. For the lipid binding experiments, approximately 70 spectra were measured from a single sample using a serial measurement program. The binding affinities were calculated from the charge-normalized intensities over all the observed charge states.69

Similar parameters and solvent conditions were used for the measurements of the zinc finger peptides. A mass range of m/z 184–2000 was used in the measurements, and the instrument parameters were adjusted for optimal detection of the peptide signals and to prevent unintentional collisional activation of the peptide–metal ion complexes. The solvent used for the metal binding experiments was MeCN/H2O (50:50, v/v; pH 7.0), since pure water or an acetate buffer resulted in rapid peptide oxidation, and the excessive use of DTT (for reduction) was not possible due to its zinc ion affinity. The determination of the zinc binding affinity of MM1 was performed using a microchip-based nanoESI ion source (Advion Triversa Nanomate), where the flow rate is approximately 0.2 µL/min. In these measurements, the MM1 peptide concentration was 1.0 µM while the zinc concentration varied between 0.25 and 5.0 µM. The positions of the disulfide bonds in the oxidized peptides were determined by using collision-induced dissociation (CID) experiments.

4. RESULTS AND DISCUSSION

4.1. DATABASE SURVEY ON ZINC PROTEINS I

At the time of the database survey, the PDB (http://www.rcsb.org/pdb) contained about 76000 structures, with zinc present in around 7100 of them, in line with the share of zinc proteins estimated from genomic studies (ca. 10%).
Zinc has the highest number of structures and binding sites, and is only surpassed by magnesium in the number of individual ions.70 Most protein structures present in the PDB are determined from human, mouse, E. coli and yeast proteins. Highly similar structures were removed from the initial search results to avoid data redundancy (see the Experimental section for details). Thus, the working dataset contained about 35% of all deposited zinc protein structures. Since the search was conducted separately for NMR and crystal structures, the same protein structure may be present in both datasets, such as the zinc finger of histone lysine demethylase JARID1A (PDB entries 2KGI and 3GL6).71 Figure 5 shows the most common metal ions in protein structures and the growth of the unique zinc protein structures available in the PDB over the past twenty years.

Figure 5. a) Number of protein structures containing different metal ions in the Protein Data Bank and b) growth of the unique zinc protein structures, determined by NMR spectroscopy (red) and X-ray crystallography (blue).

The analyzed structures amounted to around 85% of zinc proteins determined by NMR. It must be noted that around 60% of the structures do not have a journal article associated with them, as most of these structures are from structural genomics or proteomic initiatives. Figure 6 shows the molecular mass distributions of zinc proteins determined by NMR and X-ray. The NMR structures have an average molecular mass of 8.6 kDa, thus the majority of these structures are from small zinc fingers. A classical zinc finger contains one zinc ion tetra-coordinated by four amino acid residues, whereas PHD-, LIM- and RING-type zinc fingers contain two zinc ions. The presence of zinc has to be known beforehand, as zinc is “invisible” in NMR spectroscopy; its position has to be defined by using distance and bond angle restraints, and the models are calculated based on these values.
The structures are highly similar as the short sequence motifs require fewer changes in the sequence to fulfill the identity criteria (<95%, i.e. 1/20 amino acids). This can also be seen from the molecular mass distribution (bimodal; centered around 5 and 10 kDa) (Figure 6a). The structures beyond 10 kDa are metallothioneins, enzymes and repeats of the single zinc finger motifs. The structure with the highest molecular mass determined with NMR is hexameric insulin with a mass of 35.6 kDa (PDB 1AI0).72 The NMR structures contained a total of 922 zinc ions.

Figure 6. Molecular mass distributions of zinc proteins determined by a) NMR and b) X-ray crystallography. Red bars in b) correspond to proteins where zinc ions are merely crystallization artifacts.

Among the crystal structures, numerous zinc proteins have been determined several times in different space groups, as different expression constructs or mutant structures, different protein–ligand complexes, or simply at higher resolution, and these duplicate structures were removed from the working dataset. For example, the database query for carbonic anhydrase II results in over 400 nearly identical structures.73 The largest zinc protein crystal structure is from RNA polymerase II (PDB 3H0G) with a structural molecular mass of 996.1 kDa. Figure 6 emphasizes the main difference between NMR and X-ray crystallography: ca. 90% of the zinc protein structures determined with NMR span the mass range of 2–20 kDa, whereas the mass range for X-ray extends up to 1000 kDa. Thus, NMR is still, in practice, limited to the study of only small proteins (with a few exceptions), despite major progress in the field in recent years. The resolution range for zinc proteins determined by X-ray crystallography is from 0.79 Å (atomic resolution) up to 4.30 Å with a weighted average of 2.06 Å (Figure 7a).
The average resolution has remained the same for the past 20 years, despite improvement in the crystallization techniques and the use of synchrotron radiation sources (data not shown). About 96% of the structures have a resolution better than 3.00 Å and the rest are very large protein complexes, where the size sets an obvious limit to the achievable resolution. In addition, the average structural molecular mass increases quite linearly with the achieved crystallographic resolution (Figure 7b). In total, the 2026 analyzed X-ray structures contained 6950 zinc ions.

Figure 7. a) Distribution of crystallographic resolution for zinc proteins (0.1 Å bins) and b) average structural molecular mass as a function of resolution. Red bars/dots correspond to the structures where the zinc ion is considered a crystallization artifact.

A large number of structures contained zinc ions bound to the protein surface, where the number of ligands ranged from zero to six, with zero being the most common case. Usually the binding was accompanied by more than one water molecule as a result of a nearby solvent channel. By inspecting these structures and the corresponding publications, it was realized that these zinc ions are not required for the protein function and are merely crystallization artifacts resulting from high concentrations of zinc-containing buffers used in crystallization experiments (Figures 6b and 7a). Many publications indicated that higher quality crystals having better X-ray diffraction could be obtained in the presence of zinc. Zinc ions presumably aid the crystal formation by stabilizing intermolecular contacts.

4.2. NMR STRUCTURES

4.2.1. CLASSIFICATION AND COORDINATION SPHERES

The coordination spheres in the NMR structures mainly comprise tetrahedral zinc sites (98%) with cysteine and histidine residues. As most of them are from small zinc fingers, they possess only a structural role.
The total number of enzymes among the NMR structures is 84 (of 590 structures). The majority of enzymes are hydrolases (EC 3.x.x.x) and transferases (EC 2.x.x.x) with minor contributions from ligases (EC 6.x.x.x) and oxidoreductases (EC 1.x.x.x). Only one lyase (EC 4.x.x.x) and no isomerases (EC 5.x.x.x) were found among the NMR structures. However, as the molecular mass of these enzymes is less than 22 kDa, it can be assumed that the zinc ions are structural rather than active site metals; the share of active site metals is less than 2%. These domains most likely have a role in substrate recognition or in regulating the enzyme activity, as can be seen from the zinc functions in transferases and ligases. Figure 8 shows the classification and the coordination geometry of zinc ions in the analyzed NMR structures.

Figure 8. a) Classification and b) coordination geometry for zinc ions in protein structures determined by NMR. In a), the two numbers for enzyme classes indicate the distribution of zinc ions in active sites and structural sites, respectively.

In the NMR structures, all structural zinc sites are tetrahedral (98% share), reflecting the high number of zinc fingers and the low number of zinc enzymes deposited. Trigonal pyramidal spheres were found in the enzyme-inhibitor complexes. Moreover, a few tricoordinated patterns were found in enzyme active sites, where the geometry clearly pointed to a tetrahedral coordination. Indeed, original publications indicated that the “empty” coordination sites in these structures are occupied by water molecules, increasing the actual coordination number to four. The most common coordination spheres in NMR structures are Cys3-His (36.4%), Cys4 (29.6%) and Cys2-His2 (26.6%), with some positional variations. As most of the NMR structures are small zinc fingers, cysteine and histidine residues account for 97.5% of coordinating ligands, in an approximate ratio of 3:1.
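As a small worked check of the ligand shares, the following sketch recomputes the relative shares from the occurrence counts listed in Table 1. Only the counts come from the thesis; the variable names and the layout are illustrative, and the script makes no claim about how the original statistics were produced.

```python
# Occurrence counts as listed in Table 1 (NMR structures)
occurrences = {
    "Cys": 2659, "His": 936, "Glu": 36, "Asp": 33,
    "O": 17, "Wat": 3, "S": 1,
}

total = sum(occurrences.values())
# Relative share of each ligand type in percent
shares = {lig: 100.0 * n / total for lig, n in occurrences.items()}

for lig, pct in shares.items():
    print(f"{lig:>3} {pct:6.2f} %")

cys_his_share = shares["Cys"] + shares["His"]
cys_to_his = occurrences["Cys"] / occurrences["His"]
print(f"Cys + His: {cys_his_share:.2f} %, Cys:His = {cys_to_his:.2f}:1")
```

The combined cysteine and histidine share comes out between 97% and 98%, and the Cys:His ratio is slightly below 3:1, in line with the approximate figures quoted in the text.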
Other coordinating amino acids are aspartate and glutamate residues, both having a ∼1% share. The rest of the ligands in these structures are oxygen- or sulfur-containing small molecules (inhibitors) or water molecules.

4.2.2. COORDINATING LIGANDS AND BOND LENGTHS

The bond lengths in the NMR structures were typically within the accepted values for zinc–ligand distances (Table 1). However, some unrealistic bond lengths were also found. The shortest bond length found was only 1.23 Å for a Zn–Cys bond (PDB 2FUU). In contrast, the longest bond found was 4.12 Å, observed for a Zn–His bond (PDB 2JMI). In these NMR ensembles, the quality of the calculated models is very low, resulting in clearly incorrect coordination geometries and bond lengths.74,75 In general, Zn–Cys bond lengths in the NMR structures display a rather symmetrical and narrow distribution, centered at 2.35 Å (Figure 9). For Zn–His bond lengths, a bimodal distribution can be observed instead, peaking at 2.05 Å and 2.35 Å, the latter being roughly the same as the zinc–cysteine bond length. This results when the cysteine and histidine residues are not treated separately in the structure calculation. A bimodal distribution is also observed for glutamate, and the average bond length (1.83 Å) is markedly shorter than for aspartate (2.10 Å), although the coordination should be identical. The low counts for Zn–Glu and Zn–Asp bonds, however, prevent good statistical comparison of these ligands.

Table 1. Coordinating ligands in zinc structures determined with NMR spectroscopy.

| Coordinating ligand | Occurrence | Bond length (Å)a | Relative share (%) |
|---|---|---|---|
| Cysteine (Cys) | 2659 | 2.32 ± 0.16 | 72.16 |
| Histidine (His) | 936 | 2.09 ± 0.14 | 25.40 |
| Glutamate (Glu) | 36 | 1.83 ± 0.16 | 0.98 |
| Aspartate (Asp) | 33 | 2.10 ± 0.24 | 0.90 |
| Other oxygen (O) | 17 | 2.20 ± 0.13 | 0.46 |
| Water (Wat) | 3 | 2.18 ± 0.03 | 0.08 |
| Other sulfur (S) | 1 | 2.63 | 0.03 |
| Total | 3685 | | |

a Reported as average ± standard deviation

Figure 9.
Distributions of zinc-ligand bond lengths (0.05 Å bins) in NMR structures for a) cysteine, b) histidine, c) glutamate and d) aspartate.

4.3. X-RAY STRUCTURES

4.3.1. CLASSIFICATION AND COORDINATION SPHERES

Protein structures determined by X-ray crystallography dominate in the PDB database and this also applies to zinc proteins. When the search was limited to enzyme classes, some additional structures were also found, as compared to the full database search. The additional structures are high-resolution structures that have not been classified as enzymes or are highly similar structures without enzymatic function, which are then excluded from the search. Furthermore, some enzymes have not been given an EC number, although the publications clearly state enzymatic function for these proteins. These structures were added to their corresponding enzyme classes or to the “other enzymes” class if the reaction catalyzed was uncertain. Similar to the NMR structures, for oxidoreductases, hydrolases, lyases and isomerases, a majority of the zinc ions are active site metals (Figure 10). On average, enzyme structures contained three zinc ions, which indicates a tendency to form higher oligomers as the functional unit, such as tetrameric D-hydantoinase or trimeric γ-carbonic anhydrase.3,76 In the X-ray structures, the coordination sphere was more diverse, but mostly tetrahedral geometries were found (Figure 10). Most structural proteins and mononuclear enzymes are tetrahedral, while many binuclear enzymes and inhibitor complexes display trigonal bipyramidal coordination geometry.

Figure 10. a) Classification and b) coordination geometry for zinc ions in protein structures determined by X-ray crystallography. In a), the two numbers for enzyme classes indicate the distribution of zinc ions in active sites and structural sites, respectively.
The most common coordination spheres in the X-ray structures are the same as in the case of NMR structures, with Cys4 having a prominent share (ca. 20% out of all spheres, or 842 out of a total of 4415 zinc ions). Table 2 summarizes the most common coordination spheres present in different classes of zinc proteins determined by X-ray. As nearly 50% of the structures are from enzymes, the coordination is much more diverse as compared to the NMR structures (more than 500 different spheres found).

Table 2. The most common coordination spheres in different zinc protein classes determined by X-ray crystallography.

| Functional class | Common coordination spheres (share-%)a |
|---|---|
| Structural | Cys4 (31.4%), Cys2-His-Cys (10.9%), Cys2-His2 (4.7%), His-Cys3 (4.5%), His3-Asp (4.0%), Glu2-His-Glu (2.0%) |
| Oxidoreductase | Cys4 (27.7%), His3-Asp (13.3%), Cys-His-Cys (12.5%), Cys-His-Asp (6.3%), Cys-His-Glu (3.9%), Asp-His-His (3.7%) |
| Transferase | Cys4 (45.8%), Cys3-His (6.9%), His-Cys3 (4.9%), Cys2-His-Cys (3.8%), Cys3 (3.6%), Cys-His-Cys2 (3.3%) |
| Hydrolase | Cys4 (7.1%), His3 (5.8%), His2-Glu (5.7%), His2-Kcx-Asp (4.3%), Kcx-His2 (4.3%), His3-Asp (4.2%), His2-Glu (4.0%), Asp-His2 (3.3%), His-Glu-His (3.1%), Asp-Glu-His (3.0%), His2-Asp2 (2.9%) |
| Lyase | His3 (28.6%), Cys-His-Cys (12.0%), Asp-His2 (8.1%), Cys-Asp-His-Cys (7.3%), Glu2-His2 (5.1%), Cys3 (3.8%) |
| Isomerase | His-Asp-His-Asp (17.0%), Cys4 (9.6%), Glu-Asp-His-Asp (8.5%), His3 (6.4%), His2-Glu-His (5.3%) |
| Ligase | Cys4 (42.9%), Cys-His-Cys2 (13.8%), Cys2-His-Cys (11.7%), Cys3-His (6.1%), Cys-His2 (3.6%), Cys2-His2 (3.1%) |
| Unclassified enzyme | His2-Glu (13.7%), His3 (9.2%), His2-Glu-Asp (5.9%), His2-Kcx-Asp (5.2%), Kcx-His2 (5.2%), Glu-Asp-His-Glu (5.2%), His3-Asp (4.9%), His-Glu-His (4.2%), Asp-His2 (4.2%) |
| Artifact | His (8.3%), His-Glu (5.7%), Glu2 (5.6%), Asp (4.4%), Glu (4.1%), His-Asp (4.1%), Asp2 (4.0%), Glu-His (3.6%), His2 (3.6%), Glu3 (3.2%), Asp-His (3.1%) |

a Note that these contain only the incorporated protein ligands and the actual
structures may also contain various external ligands, especially at low coordination numbers (incomplete spheres).

In enzymes, typically three amino acid residues are involved in the zinc ion binding and the fourth coordination site is occupied by a water molecule, substrate or an inhibitor molecule. In oxidoreductases, the most common sphere is Cys4, followed by His3-Asp, due to the high number of Cu/Zn superoxide dismutases, and Cys-His-Cys, which is typical in alcohol dehydrogenases and where the fourth coordination site is occupied by various oxygen ligands. Most inhibitors are based on sulfonamide, hydroxamic acid or phosphonate functional groups. Hydrolases have the most diverse coordination, most likely owing to the higher number of structures. Lyases, isomerases and the unclassified enzymes have spheres similar to those in hydrolases. Transferases and ligases have mostly classical zinc finger coordination spheres (~85%), which suggests that these enzymes use zinc merely for stabilization of the structures rather than as an active site metal.

4.3.2. COORDINATING LIGANDS AND BOND LENGTHS

Cysteine and histidine are the most frequent coordinating residues in X-ray structures, followed by aspartate, glutamate and a water molecule (Table 3). Other oxygen-containing (non-amino acid) ligands have the next biggest share, including a variety of molecules (e.g., substrates and inhibitors). Carboxylated lysine (Kcx) is also a notable ligand which has not been categorized in the previous studies. Kcx is an important ligand in binuclear zinc enzymes and is frequently found in hydrolases. Interestingly, carboxylated lysine residues were not observed in the coordination spheres of artifact zinc ions.

Table 3. Coordinating ligands in zinc protein structures determined with X-ray crystallography.
| Coordinating ligand | Functional | Artifact | Total |
|---|---|---|---|
| Cysteine (Cys) | 6102 | 122 | 6224 |
| Histidine (His) | 5716 | 1810 | 7526 |
| Aspartate (Asp) | 2026 | 1415 | 3441 |
| Water (Wat) | 1753 | 2586 | 4348 |
| Glutamate (Glu) | 1293 | 1952 | 3245 |
| Other oxygen (O) | 974 | 546 | 1521 |
| Carboxylated lysine (Kcx) | 161 | | 161 |
| Other nitrogen (N) | 137 | 193 | 330 |
| Chlorine (Cl) | 65 | 115 | 180 |
| Lysine (Lys) | 54 | 46 | 100 |
| Asparagine (Asn) | 51 | 50 | 101 |
| Other sulfur (S) | 49 | 1 | 50 |
| Serine (Ser) | 24 | 42 | 66 |
| Threonine (Thr) | 24 | 36 | 60 |
| Tyrosine (Tyr) | 22 | 14 | 36 |
| Glutamine (Gln) | 20 | 37 | 57 |
| Phosphoserine (Sep) | 7 | 2 | 9 |
| Selenomethionine (Mse) | 5 | | 5 |
| Methionine (Met) | 4 | 7 | 11 |
| Formylglycine (Fgl) | 4 | | 4 |
| Arginine (Arg) | | 6 | 6 |
| Bromine (Br) | | 2 | 2 |
| Tryptophan (Trp) | | 2 | 2 |
| Total | 18491 | 8984 | 27475 |

The bond length analysis of all coordinating residues shows that cysteine has a quite symmetrical and narrow length distribution, while the others have clearly broader distributions, which tail towards longer bond lengths (Figure 11). For some residues, especially nitrogen and carboxylated lysine, the occurrence counts are low, thus the distributions are not well defined.

Figure 11. Distributions of zinc-ligand bond lengths (0.05 Å bins) in the X-ray structures for a) cysteine, b) histidine, c) aspartate, d) glutamate, e) nitrogen, f) carboxylated lysine, g) water and h) oxygen ligands.

Figure 12 shows average zinc–ligand bond lengths for cysteine, histidine and water as a function of crystallographic resolution. The average bond lengths clearly increase as the resolution decreases. This is rather surprising, since the bond length should be independent of the achieved resolution. Furthermore, the data show that the bond length is only dependent on the crystallographic resolution, and is not affected by the date of the data acquisition, the molecular mass of the protein or the refinement method applied.

Figure 12. Zinc-ligand bond lengths and average bond length per crystallographic resolution (0.1 Å bins) for cysteine (a,b), histidine (c,d) and water (e,f) ligands.
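The per-resolution averaging behind Figure 12 can be sketched as below. This is a minimal illustration assuming that each observation is a (resolution, bond length) pair and that bins are 0.1 Å wide, as stated in the caption. The function name `mean_bond_length_per_bin` and the demo values are hypothetical and are not taken from the survey data.

```python
from collections import defaultdict


def mean_bond_length_per_bin(records, bin_width=0.1):
    """Average bond length (Å) per crystallographic resolution bin.

    records: iterable of (resolution_A, bond_length_A) pairs.
    Returns {lower bin edge in Å: mean bond length in Å}.
    """
    # Work in hundredths of an Å so that bin edges are exact integers
    width = round(bin_width * 100)
    grouped = defaultdict(list)
    for resolution, length in records:
        index = round(resolution * 100) // width
        grouped[index].append(length)
    return {i * width / 100: sum(v) / len(v) for i, v in sorted(grouped.items())}


# Hypothetical Zn-Cys observations, NOT values from the survey
demo = [(1.95, 2.30), (1.98, 2.32), (2.04, 2.34), (3.00, 2.40)]
for edge, mean in mean_bond_length_per_bin(demo).items():
    print(f"{edge:.1f} Å bin: {mean:.2f} Å")
```

Integer binning is used here because dividing floating-point resolutions directly by 0.1 can place values that sit exactly on a bin edge into the wrong bin.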
Interestingly, the Zn–His bond length is 2.03 ± 0.04 Å and the Zn–Cys bond length is 2.31 ± 0.03 Å on average for atomic resolution structures, which are very close to the average values determined from small molecule complexes. The variation of the average bond lengths with resolution is not well understood but a similar phenomenon has been observed with other metals, too.77,78

4.3.3. INCOMPLETE SPHERES

Zinc-specific coordination spheres have been previously analyzed in several database surveys, which have mostly dealt with high to medium-resolution crystal structures and have used different computer algorithms to obtain statistical data. In these surveys, coordination numbers from two to eight have been reported.79–82 Thus, a large number of di- and tricoordinated zinc ions have been reported in the crystal structures of zinc proteins. Finding di- or tricoordinated metal ions in proteins is very surprising given how rarely any transition metal complex possesses the required linear or trigonal planar coordination geometry.77,78,83–85 Re-examination of these structures revealed several factors which have been overlooked in the past, such as the exclusion of the symmetry-related molecules in crystals or missing electron densities for ligands, leaving the coordination spheres incomplete.86–89

In compounds, zinc appears almost exclusively in the oxidation state +II; only in some very rare compounds having Zn–Zn bonds is the oxidation state +I observed.18 Due to a fully filled d-subshell, covalent coordination bonds are formed by sp3 hybrid orbitals, with the Zn(II) ion acting as a strong electron acceptor. In the case of tetrahedral coordination, all four sp3 orbitals are fully occupied in accordance with the 18-electron rule. Removal of electrons from these orbitals greatly destabilizes the complex, leaving an electronically unsaturated sphere. 16-electron complexes are possible with certain low-spin d8 metals, with ligands having both σ-donor and π-acceptor properties.
These complexes are square planar with an empty dx²–y² orbital, which is not possible with zinc due to a d10 electron configuration. Higher coordination numbers for Zn(II) are also possible with the incorporation of empty 4d orbitals (i.e., penta- and hexacoordinate zinc ions). It must be noted that di- and tricoordinate zinc ions do exist in some organozinc compounds, but they are electron-deficient and highly reactive in ambient conditions. Thus, it is obvious that di- or tricoordinated zinc ions cannot exist in a biological environment.

To correct some of the misinterpretations of the coordination spheres in zinc proteins made in the previous database surveys, all the structures were manually validated in this study by taking symmetry molecules into account or by inspecting the actual electron density maps to find unassigned ligands, incorrect side-chain conformations or unusually long metal-to-ligand bonds. In addition, on many occasions, inspection of the original publications alone revealed details which pointed to unassigned ligands present in the structures, while being fully absent in the actual coordinate files. It must be emphasized that no structures with di- or tricoordinated zinc ions were found in this study upon taking the above-mentioned factors into account.

An incomplete sphere is a rare occurrence in the NMR structures, ca. 1% out of all spheres, and is mostly due to missing exogenous ligands or unresolved binding conformations. In the crystal structures, incomplete spheres are far more frequent, amounting to up to 40% of all zinc sites, even though a majority of them appear in the artifact sites. The following are the main reasons for incorrect assignments of the coordination spheres found in zinc protein structures:

a) Symmetry-related molecules in crystals. The coordinating ligands come from different polypeptide chains related to each other by crystallographic symmetry.
These are easily ignored for the simple reason that the asymmetric unit does not necessarily represent these interactions and the symmetry molecules need to be manually generated. Insulin provides an excellent example of this. In the hexameric insulin assembly, the two zinc ions are coordinated by three histidine residues (Figure 13a) from three insulin monomers that reside on the same three-fold symmetry axis.4

b) Missing or unassigned electron density for ligand. The placement of atoms is based on the electron density map, where either missing or ambiguous density prevents the exact placement of the atoms. This is usually true for water molecules, ligands (substrates or inhibitors) in the enzyme active sites or even entire side-chains (Figure 13b). For example, water molecules cannot be placed with certainty to the electron density in the low-resolution enzyme structures, even though the bond angles of the other three ligands would clearly indicate a tetrahedral coordination.90,91

c) Bond lengths over the cut-off values used. Even though the bond lengths between the metal and the ligand should be close to the values determined, on average, for small compounds, protein structures exhibit variations from these values due to limited conformations of the main and side-chains. Due to differences in resolution, some bond lengths may be longer than the others (Figure 13c). If specific cut-off values are used, such deviations from the average values may result in false interpretation of the coordination numbers. The low resolution limits the accuracy of the electron density refinement, which becomes clearly evident in very large structures, such as RNA polymerases.92

d) Multiple or incorrect side-chain conformations. In structures where the ligands have multiple conformations or incorrectly placed atoms, the coordination sphere may be interpreted as duplicate or incomplete (Figure 13d).
For example, in some structures, the imidazole ring of histidine may be flipped, so the carbon is coordinating to the metal, resulting in a seemingly vacant coordination site. Similar flips have been seen in the amide coordination. On the other hand, some structures have very distorted and clustered ligands that overlap with each other.89

e) Overlapping or an unknown metal. Metal ions are usually added to the structure when an unexpected, large electron density is found (Figure 13e). Good refinement statistics for a certain metal do not necessarily mean that the metal ion is correct. The atomic (van der Waals) radii for zinc and other transition metals are very similar and the identification of the correct metal ion can be problematic without additional experiments, as the refinement may give better values for incorrect metals.

The first three reasons account for the majority of errors observed in this study. However, the latter two also pose a need for re-refinement of the structures involved (Table 4). Since the protein crystal is a repeating unit, interactions between the molecules are not necessarily described by the asymmetric unit, especially when the surface contacts are involved. In many crystal structures, metal ions are present on the surface of the protein, resulting e.g. from high concentrations of metal ion containing buffers used in crystallization. Thus, the coordination is frequently misinterpreted if symmetry-related molecules are not taken into account, as in the case of hexameric insulin.

Table 4.
Factors leading to incomplete zinc coordination spheres in crystal structures

| Reason | Functional | Fraction of functional sites (%) | Artifact | Fraction of artifact sites (%) |
|---|---|---|---|---|
| Symmetry-related molecules | 90 | 10.7 | 760 | 39.2 |
| Missing solvent molecules | 21 | 2.5 | 642 | 33.1 |
| Missing water from active site | 467 | 55.5 | 132 | 6.8 |
| Symmetry with missing solvent | 116 | 13.8 | 288 | 14.9 |
| Missing sidechain or ligand | - | - | 8 | 0.4 |
| Metal placed to fit electron density | - | - | 70 | 3.6 |
| Sidechain conformation | 60 | 7.1 | 10 | 0.5 |
| Metal or ligand occupancy | 58 | 6.9 | 11 | 0.6 |
| Unknown or missing metal | 12 | 1.4 | 17 | 0.9 |
| Sidechain flip (His/Asn/Gln) | 17 | 2.0 | 2 | 0.1 |
| Total | 841 | 19.1 | 1940 | 76.5 |

Figure 13. Factors leading to misinterpretation of zinc coordination spheres in some protein crystal structures deposited in the PDB. Left: overall structure; right: zinc coordination site in detail. (a) Symmetry related molecule in the crystal (PDB 1OHT). (b) Missing ligand/electron density for ligand (1Z5R). (c) Bond lengths over cut-off values (2FE8). (d) Multiple occupancy or erroneous atom deposition (2APO). (e) Unknown metal ion in the active site (3QC3).

4.4. CHARACTERIZATION OF THE SAP30L COREPRESSOR PROTEIN II

4.4.1. GENERAL

The Sin3A-associated protein 30-like (SAP30L) is the newest member of the Sin3A corepressor complex, a multi-component regulatory element of gene expression in mammalian cells (Figure 14a).93,94 The Sin3A complex contains at least ten different proteins, including SAP30, which is highly homologous to SAP30L (∼70% sequence identity). Thus, SAP30 and SAP30L are collectively known as SAP30 proteins. Sin3A itself is an acidic protein involved in transcriptional repression, but is unable to bind DNA, and therefore requires the recruitment and interaction of the other DNA-binding proteins. It has been suggested that SAP30 and SAP30L serve as bridging and stabilizing molecules between Sin3A and other co-repressors and transcription factors.95–101

Figure 14.
a) Overview of the Sin3A multiprotein complex (for details, see original publication II). b) Known functional motifs of SAP30L: ZnF, zinc finger motif; NLS, nuclear localization signal; NoLS, nucleolar localization signal. c) Primary structure of the ZnF motif of SAP30L.

SAP30L has been previously shown to contain several structural motifs, which have different functions in the corepressor complex (Figure 14b,c).102 The N-terminal zinc finger motif is necessary for the folding and function of the protein, mainly responsible for the DNA and phospholipid binding. The ZnF motif is almost identical in SAP30 and SAP30L.98,102,103 In addition to the ZnF motif, SAP30L also contains an N-terminal low complexity region (residues 1–25), with a yet unknown function. These are followed by a central nuclear localization signal (NLS), which also mediates DNA binding.104–108 The C-terminal part contains an acidic region for histone 2A/2B binding and a Sin3A binding region along with a nuclear matrix targeting signal.65,102,103,109

In the previous study, native mass spectrometry was used to demonstrate that SAP30L contains a novel N-terminal Cys3His type ZnF motif with Cys29, Cys38, Cys74 and His77 serving as the primary zinc coordinating amino acid residues.102 In addition, the ZnF motif also contains a non-coordinating Cys30 residue, which is highly conserved through evolution. Previous mutation analyses indicated that this residue is not directly involved in the zinc ion binding.
Later, the solution NMR structure was determined for the homologous SAP30 (PDB 2KDP), which indicated that the ZnF motif adopts a totally new fold with two anti-parallel β-sheets and a pair of α-helices, which form a well-defined hydrophobic core where the zinc binding site is situated.110 Although lacking any sequence similarity to other zinc fingers, the three-dimensional structure bears resemblance to the treble clef zinc finger motif.12,110,111 The coordinating residues and the additional non-coordinating cysteine (Cys30) are highly conserved among the SAP30 family proteins, suggesting that they are critical for the protein folding and function. Similar structural motifs can be found in the zinc-binding THAP domains, which are also involved in nucleic acid binding and transcriptional regulation.111,112

In this study, we characterized a redox-dependent disulfide bond formation in SAP30L as a regulatory mechanism for its structure and function by using high-resolution ESI FT-ICR mass spectrometry. The main hypothesis was that the ZnF motif in SAP30L works as a redox switch, which controls the DNA and phospholipid binding and thus the repression activity of the whole Sin3A complex. In addition, we determined the three-dimensional structure for the SAP30L ZnF motif (hereafter simply referred to as SAP30L) by NMR spectroscopy. The construct used for the structure determination comprised residues 25–92, since the largely unstructured N-terminal region has not been shown to possess any function or importance for the folding.

4.4.2. ZINC-INDUCED FOLDING OF SAP30L

In denaturing solution conditions, the mass spectrum of SAP30L showed a wide charge state distribution (CSD), from 6+ to 14+ (centered around 10+), typical for a fully unfolded protein (Figure 15a).
The deconvoluted spectrum shows the experimental isotopic distribution, with the most abundant isotopic mass (8100.291 Da) in close agreement with the mass calculated from the amino acid sequence (8100.267 Da for C350H580N112O101S4). No zinc binding was observed in these conditions. In contrast, the mass spectrum measured in near-native conditions (Figure 15b) showed a much narrower CSD (5+ to 7+) as compared to the denaturing conditions, accompanied by a mass increment of ∼64 Da, which is consistent with the binding of a single zinc ion. The most abundant isotopic mass (8164.160 Da) agrees well with the theoretical mass of holo-SAP30L (8164.181 Da for C350H578N112O101S4Zn1), assuming removal of two protons upon zinc ion binding.113,114 A complete saturation of the protein with zinc was observed, indicating very high affinity. Zinc finger motifs are known to lose their tertiary structure upon demetallation, due to the lack of a large hydrophobic core. The mass spectrum of SAP30L in 20 mM NH4OAc (pH 6.9) in the presence of 1,10-phenanthroline (Figure 15c), a strong zinc chelator, was essentially the same as under denaturing conditions, indicating a lack of a stable fold in the absence of zinc.

Figure 15. ESI FT-ICR mass spectra of SAP30L measured in a) denaturing solution conditions (MeCN/H2O/HOAc 49.5:49.5:1.0, v/v) and in near-native conditions (20 mM NH4OAc, pH 6.8) in the b) absence or c) presence of 1,10-phenanthroline, and d) after in vitro oxidation for 10 min with H2O2. The numbers indicate different charge states. The insets show deconvoluted mass spectra with the arrows denoting the peaks representing the most abundant isotopologues.

4.4.3. REDOX-DEPENDENT DISULFIDE FORMATION

Cysteine residues in proteins are redox-active and can respond to oxidative stress by formation of disulfide bonds or other oxidative modifications.115–120 Thus, they can act as “redox switches” to sense changes in the redox status of the cell.
These structural changes may lead to conformational changes, affecting protein function, e.g. DNA binding. As ZnF proteins are known to be redox-regulated, we sought to investigate redox-dependent structural changes in SAP30L. Figure 15d shows the mass spectrum of SAP30L, measured in 20 mM NH4OAc (pH 6.9) after the in vitro oxidation with hydrogen peroxide (H2O2). Interestingly, the charge state distribution was very similar to that observed with holo-SAP30L (Figure 15b). However, the most abundant isotopic mass was determined to be 8096.219 Da, suggesting that two disulfide bonds had been formed with a concomitant release of the coordinated zinc ion (theor. 8096.236 Da for C350H576N112O101S4). This is an interesting finding, since SAP30L has only three cysteine residues coordinating to the zinc ion (Cys29, Cys38 and Cys74). Therefore, it is evident that the unliganded Cys30 was also involved in the oxidation. No dimeric forms of the protein were observed, indicating that only intramolecular disulfide bonds were formed upon oxidative stress. The very similar CSD observed for the oxidized SAP30L as compared to holo-SAP30L suggests that it remained folded in solution, having a stable tertiary structure. This is plausible, as the cysteine residues in SAP30L are in close proximity in the zinc-bound form, and it is assumed that only minor conformational changes are needed to form the two observed disulfide bonds.

To determine the reversibility of the oxidation, disulfide bond reduction was performed with the oxidized SAP30L, which resulted in full reduction and instant re-coordination of the zinc ion. This shows that the folding of SAP30L is initiated by the zinc ion binding and the oxidation is a fully reversible process. In the presence of a zinc-chelating agent (1,10-phenanthroline), the protein did not reduce at all, suggesting a combined effect of zinc and the reducing agent.
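As a quick consistency check on the mass assignments quoted in Sections 4.4.2 and 4.4.3, the following minimal sketch recomputes the mass errors and the number of disulfide bonds implied by the mass shift between the reduced apo form and the oxidized form. It uses only the measured and theoretical masses given in the text. The function names and the hydrogen mass constant (1.007825 Da, the standard monoisotopic value) are additions made here for illustration and are not part of the original work.

```python
# Illustrative sketch only; the function names are hypothetical.
H_MASS = 1.007825  # Da, monoisotopic mass of hydrogen (standard value, not from the text)

# (measured, theoretical) most abundant isotopic masses in Da, as quoted in the text
MASSES = {
    "apo": (8100.291, 8100.267),
    "holo": (8164.160, 8164.181),
    "oxidized": (8096.219, 8096.236),
}


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def disulfides_from_shift(reduced_mass, oxidized_mass):
    """Number of disulfide bonds implied by a mass loss (two H atoms per bond)."""
    return round((reduced_mass - oxidized_mass) / (2 * H_MASS))


for form, (measured, theoretical) in MASSES.items():
    print(f"{form}: {ppm_error(measured, theoretical):+.1f} ppm")

print("disulfide bonds:",
      disulfides_from_shift(MASSES["apo"][1], MASSES["oxidized"][1]))
```

With the values quoted above, all three mass errors stay within about 3 ppm, and the loss of roughly 4.03 Da between the reduced and oxidized forms corresponds to four hydrogen atoms, that is, two disulfide bonds.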
Zinc and cysteine form a known redox pair involved in the regulation of the function of many cellular proteins. The zinc protein oxidation causes two major effects, a consumption of the oxidative molecules and the release of the coordinated zinc ion, which may act as a secondary messenger in cellular environments.9

In order to determine the pairing of the cysteines, the oxidized and reduced SAP30L were digested with trypsin (in solution digestion) or pepsin (online digestion). Holo-SAP30L digested with trypsin in the presence of DTT showed all cysteine containing tryptic peptides in their reduced form (Figure 16a; for the list of all identified tryptic peptides, see Table II in the original publication II). The high mass accuracy of the FT-ICR instrument allowed unambiguous identification of the resulting peptides and full sequence coverage was obtained for the holoprotein. In contrast, the tryptic digest of the oxidized SAP30L revealed the presence of a number of oxidized peptides having disulfide bonds (Figure 16b; for the list of peptides, see Table II in II). Among the identified peptides, a peptide with an intramolecular disulfide bond between Cys29 and Cys30 (1613.64 Da) was observed, demonstrating that the disulfide bond is formed between the adjacent cysteine residues. Such vicinal disulfide bonds have been extensively studied in the past.119–122 Also, two other tryptic peptides were observed (2408.15 and 2564.25 Da), both having a disulfide bond between Cys38 and Cys74. Moreover, three larger disulfide-linked peptides were detected, containing both disulfides (e.g. 4003.78 Da, see Table II in II). The online pepsin digestion gave essentially the same results (see II for details). The combined results from both protease digestions indicated that the two specific disulfide bonds, Cys29-Cys30 and Cys38-Cys74, are formed in SAP30L upon oxidative stress.

Figure 16.
ESI FT-ICR mass spectra measured from tryptic digests of a) holo-SAP30L and b) oxidized SAP30L. The tryptic peptides carrying Cys residues have been assigned (for details, see original publication II). The insets show isotopic patterns for Cys-containing tryptic peptides.

The disulfide-containing peptides were subjected to ECD experiments for additional verification (see Supplementary Figure S5 in II), which further confirmed their identity and sequence.

4.4.4. PHOSPHOLIPID AND DNA BINDING

Cell signaling phospholipids are known to mediate the DNA binding of SAP30L by targeting the same binding site. Therefore, nuclear phospholipids can regulate chromatin association of SAP30L and decrease the repression activity of the whole Sin3A complex.102 SAP30L binds nucleic acids and phospholipids primarily by its polybasic region (Figure 14b,c). The phospholipid binding competes with the binding of the nucleic acids and leads to the detachment of the protein from the chromatin. Therefore, the lipid binding affinity of SAP30L in different redox states was studied by using three phosphatidylinositol monophosphates (PIPs), PI5P, PI4P and PI3P (bearing distal phosphate units at the 5-, 4-, or 3-position of the inositol ring, respectively), which are among the main cell signaling phospholipids, anchored to the intracellular membranes. Native mass spectra for the holo and the oxidized SAP30L were measured in the presence of an equimolar amount of each lipid to avoid extensive non-specific binding or oligomerization typically encountered with lipids (see Figure S7 in II). Holo-SAP30L was found to have the highest affinity towards PI5P, having a dissociation constant (Kd) in the micromolar range (Table 5). PI3P was found to bind to SAP30L with somewhat lower affinity than PI5P, although the structures are very similar and differ only by the orientation of the phosphate and hydroxyl groups.
These lipids also have the closest structures compared to the sugar–phosphate chain of DNA and, based on their structures, would be expected to bind with similar affinity.109 Surprisingly, PI4P was found to bind to holo-SAP30L with very similar affinity to that of PI5P, despite the structural differences of these lipids. The similar binding affinities of these lipids could be explained by a conformational change in the lipid sugar-phosphate ring, from axial to equatorial, upon the binding. The length of the acyl chain is not expected to have a role in the binding, since the chains are hidden in the cell membranes and only the head groups are accessible to proteins. Interestingly, it was observed that all PIPs bind to the oxidized SAP30L, with the determined Kd values indicating ca. 2.6- to 3.8-fold higher affinity compared to the holoprotein (Table 5). This further supports the finding that the oxidized SAP30L adopts a folded structure in solution, as suggested by the CSD analysis (Figure 15).

Table 5. Dissociation constants for phosphoinositide binding to SAP30L.

                 Dissociation constant Kd (µM)a
Protein form     PI5P        PI4P        PI3P
Holo             61 ± 4      66 ± 4      83 ± 6
Oxidized         16 ± 1      22 ± 1      32 ± 2

a Values reported as average ± standard deviation

4.4.5. SOLUTION NMR STRUCTURE OF SAP30L

NMR spectroscopy was used to characterize zinc-dependent folding and DNA/lipid binding of SAP30L and to determine its three-dimensional structure. The 1H,15N HSQC spectrum recorded for holo-SAP30L showed characteristics of a well-folded protein in solution with good signal dispersion and roughly equal cross-peak intensities (see Figure 3 in II). The slightly downfield Cβ shifts of Cys29, Cys38 and Cys74 (32.8, 31.3, and 33.3 ppm, respectively), compared to that of Cys30 (28.9 ppm), are in accordance with these three cysteines being coordinated to the zinc ion. The final NMR ensemble of the fifteen lowest-energy structures for SAP30L is shown in Figure 17a.
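Returning briefly to the dissociation constants in Table 5, the affinity difference between the two redox states can be made explicit with a short calculation. The sketch below is illustrative only: it computes the Kd ratio for each lipid and, as an added assumption not stated in the text, converts it to a binding free energy difference at 298.15 K. The function and variable names are hypothetical.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1
T = 298.15       # K; assumed temperature, not given in the text

# Kd values in micromolar from Table 5: (holo, oxidized)
KD_UM = {
    "PI5P": (61.0, 16.0),
    "PI4P": (66.0, 22.0),
    "PI3P": (83.0, 32.0),
}


def fold_change(kd_holo, kd_oxidized):
    """How many times tighter the oxidized form binds (ratio of Kd values)."""
    return kd_holo / kd_oxidized


def ddg_kj_per_mol(kd_holo, kd_oxidized, temperature=T):
    """Free energy difference (oxidized minus holo); negative means tighter binding."""
    return R * temperature * math.log(kd_oxidized / kd_holo) / 1000.0


for lipid, (kd_holo, kd_ox) in KD_UM.items():
    print(f"{lipid}: {fold_change(kd_holo, kd_ox):.1f}-fold, "
          f"{ddg_kj_per_mol(kd_holo, kd_ox):.1f} kJ/mol")
```

With the tabulated values the ratios range from about 2.6 (PI3P) to about 3.8 (PI5P), which under the assumed temperature corresponds to a stabilization of a few kJ/mol.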
The residues Ser28–Arg85 form the core structure, while the N- and C-terminal residues, Gly23–Gln27 and Asn86–Thr92, respectively, are disordered. The regular secondary structure elements are formed by the residues Leu31–Glu33 and Leu62–Ile64, making a short antiparallel β-sheet, and the residues Val51–Ser56 and Asp75–Ser85, forming the two α-helices (Figure 17b). The overall three-dimensional structure of SAP30L is remarkably similar to that of SAP30 determined earlier (Figure 17c), with a backbone atom RMSD of only 0.58 Å.

Figure 17. Solution NMR structure of SAP30L. (a) Ensemble of 15 lowest-energy structures. (b) Ribbon model of SAP30L with secondary structure elements and zinc binding site highlighted. (c) Overlay of the best matching structures of SAP30L (tan; PDB 2N1U) and SAP30 (light blue; PDB 2KDP) NMR ensembles.

The phospholipid binding of SAP30L was examined with NMR by a reverse titration because of the aggregation phenomenon occurring in the direct titration with PI5P at concentrations exceeding that of SAP30L. The cross-peaks shifted approximately linearly with an increasing concentration of the ligand, indicative of low, micro- to millimolar binding affinity that is consistent with the mass spectrometry results. All attempts to determine the DNA binding with native mass spectrometry, either with the holo or the oxidized protein, were unsuccessful, probably due to the low ionization efficiency of the protein–DNA complex or protein aggregation. However, by using NMR we were able to map chemical shift perturbations induced by the binding of the 8-bp DNA to the holoprotein. The protein aggregation compelled the use of a reverse titration in this case. The binding affinity was notably higher for DNA than for PI5P, given that the cross-peaks did not significantly shift after an equimolar concentration ratio had been reached.
A pattern of perturbations similar to that with PI5P was observed, namely, the most affected residues were located in the C-terminal region and in the first α-helix. No attempts were made to characterize the DNA binding to the oxidized SAP30L.

4.4.6. DISCUSSION

On the basis of the present results, the zinc finger motif of SAP30L undergoes a redox-dependent disulfide bond formation upon oxidative stress. Two disulfide bonds, Cys29-Cys30 and Cys38-Cys74, are formed, involving Cys30, the non-coordinating but highly conserved cysteine residue. The two adjacent cysteines, Cys29 and Cys30, form a well-known, redox-active Cys-Cys protein motif, which has been frequently reported to occur, especially in ribonucleases.119,123,124 The disulfide bond between two adjacent cysteine residues (i.e., a vicinal disulfide bond) is referred to in the literature as a “forbidden” disulfide, involving an eight-membered ring structure in which the peptide bond typically adopts a highly distorted, non-planar trans-conformation. Interestingly, the native mass spectrometry data suggested that the oxidized SAP30L retains a folded structure in solution, which is further supported by the observed affinity toward the phospholipids. This may have important implications in relation to its function in the Sin3A complex.

In the absence of a three-dimensional structure for the oxidized SAP30L, we modeled the disulfide bonds to the NMR structure of holo-SAP30L. The disulfide-bonded structure was generated by removing the zinc ion and rotating the torsional angles of the cysteine residues to be able to make the two disulfide bonds, followed by an energy-minimization of the resulting structure. The modeled structure shows that the formation of these two disulfide bonds is plausible without any major conformational changes in the overall structure (the backbone atom RMSD was 0.68 Å between the two structures).
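For readers less familiar with the RMSD values quoted in this chapter, the quantity is the root of the mean squared distance between equivalent atoms in two structures. The following minimal sketch shows the calculation for two coordinate sets that are assumed to be already superimposed and to list the same atoms in the same order. It is not the software used in the original work, the function name is hypothetical, and the coordinates are toy values rather than data from the SAP30L structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superimposed coordinate sets (in Å)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: three atoms, the second set displaced by 0.5 Å along y
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0), (3.0, 0.5, 0.0)]
print(f"RMSD = {rmsd(reference, model):.2f} Å")
```

In practice the two structures are first optimally superimposed (for example with a least-squares fit of the backbone atoms), and only then is the deviation evaluated as above.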
The peptide bond in the resulting eight-membered ring of the vicinal Cys29-Cys30 adopts a clearly distorted trans-conformation, consistent with the other reported vicinal disulfide bonds. Therefore, the zinc center in SAP30L forms a redox switch, which, upon oxidative stress, releases the coordinated zinc ion with a concomitant formation of the two specific disulfide bonds, a vicinal Cys29-Cys30 and Cys38-Cys74. Figure 18 depicts the schematic of the redox-active zinc center in SAP30L and the three-dimensional structures of the zinc centers of the holo and oxidized proteins.

The oxidized SAP30L was found to bind all the lipids with markedly higher affinity than the holoprotein, and these differences suggest a conformational change upon the oxidation of the protein. The binding of the lipids with the oxidized SAP30L further supports that it remains folded in solution, likely having a functional role. The zinc coordination is not required for the binding of the lipids and the oxidation might enable conformational freedom not available for the holoprotein. The increased lipid binding affinity of the oxidized SAP30L might further affect its release from the DNA and therefore the repression activity of the protein complex.

Figure 18. Schematic representation of the redox switch in SAP30L N-terminal ZnF motif. Upon oxidative stress, the two specific disulfide bonds, a vicinal Cys29-Cys30 and Cys38-Cys74, are formed. In the holoprotein, the zinc coordination is perfectly tetrahedral. In the oxidized form, the vicinal Cys29-Cys30 forms an eight-membered ring with the ring peptide bond adopting a highly distorted trans-conformation.

4.5. DESIGNED ZINC FINGER PEPTIDES III

4.5.1. GENERAL

Small zinc finger (ZnF) motifs are promising molecular scaffolds for protein design, owing to their structural robustness and versatility. Moreover, their characterization provides important insights into protein folding in general.
ZnF motifs usually possess an exceptional specificity and high affinity towards the Zn(II) ion to drive folding. A recently discovered non-native CHANCE (Cys/His peptide exhibiting a nonexpected conformational ensemble) motif, serendipitously found during the analysis of the first Cys/His-rich domain (CH1) of the transcriptional regulator CREB binding protein (CBP; see Figure 1), has shown that ZnF-like designer peptides can mimic their natural counterparts well.24,28,125 Based on the CHANCE motif, Sharpe et al. designed a number of “minimal mutant” (MM) peptides, where several amino acid residues were mutated to alanines, and a set of “designed functional finger” (DFF) peptides, where the surface residues were further mutated to mimic some common ZnF motifs (Figure 19).24,28 Their objective was to find minimal sequence features, which could retain the original fold, and to test a possibility of grafting the surface of a designer peptide with a specific (DNA or protein binding) function. Although the latter objective appeared challenging,28 it was demonstrated that the CHANCE motif retains a stable fold upon multiple alanine mutations, suggesting its potential as a versatile molecular scaffold for protein design. Based on the NMR structure of MM1, the zinc ion is coordinated by Cys5, Cys10, His19, and Cys23 residues (Figure 19). In addition, there is a further histidine residue (His22) in close vicinity (ca. 3.7 Å), which could participate in the transient metal ion binding upon folding, or act as an important second-shell ligand.125

Figure 19. NMR structure of MM1 zinc finger motif (PDB 1WO3).28 Underneath are the amino acid sequences of the peptides used in this study. The zinc binding residues are marked with a blue background and changes from the CHANCE motif are marked with red.
In this study, high-resolution FT-ICR mass spectrometry was used for characterization of the metal ion specificity and affinity, as well as the primary coordination sphere robustness of several small ZnF motifs. The earlier studies demonstrated the structural robustness of the CHANCE motif, and the most stable MM peptide, named MM1, was chosen as the starting point.24,27,28 For comparison, the designer mutant named DFF2, which mimics the N-terminal ZnF motif of the HIV-1 nucleocapsid protein 7 (NCp7), was characterized and compared to the native 18-residue peptide (NC), which represents one of the smallest known stable ZnF motifs.29 Moreover, several Cys/His-to-Ala mutant peptides of MM1 were synthesized to study the importance of the primary and some secondary amino acid ligands toward the zinc ion binding.

4.5.2. METAL ION BINDING OF ZINC FINGER PEPTIDES

Initial measurements in the denaturing solution conditions showed that the peptides were synthesized correctly and no impurities were present in the samples (see Table S1 in III). In these conditions, the peptides were found to be unfolded and no metal ion binding could be observed (Figure 20a and Figure S2 in III). The analyses also indicated that the MM1 peptide was partly oxidized, by forming an internal disulfide bond, which was determined by CID experiments to be between Cys5 and Cys10. Similarly, the DFF2 peptide was also found to be partially oxidized, forming the same disulfide bond. The NC peptide was found to exist as a disulfide-linked dimer. The oxidation of NCp7 and similar zinc finger motifs in the absence of zinc has been previously observed.126–129 The formed disulfide could easily be reduced with DTT, an ESI-MS compatible reducing agent.67 However, during prolonged storage times the peptides were found to reoxidize even in the presence of DTT. In addition, as DTT can also chelate zinc ions, the use of excessive DTT was avoided in further experiments.
To characterize zinc ion binding of the peptides, the experiments were first attempted in water. However, peptide re-oxidation occurred rapidly in the absence of zinc. Therefore, a mixture of water and MeCN was tested as an alternative solvent mixture and a solvent ratio of 50:50 (v/v) was found optimal for further experiments, as it minimized the peptide oxidation in the absence of zinc while preserving the folding of the peptides in the presence of zinc. A mass spectrum measured for MM1 in the presence of 10-fold molar excess of Zn(II) ions showed a marked change in the CSD (2+ to 4+; average charge zav ≈ 3.0), as compared to the apo-peptide (Figure 20a,b) and the mass increased by ∼61.9111 Da, consistent with the binding of one zinc ion. Only a small amount of the apo-peptide was observed in this sample. The solvent accessible surface area for MM1 as calculated from the NMR structure (PDB ID: 1WO3) by using the PDBePISA server130 is 2032 Å². This value translates into zav ≈ 2.9, based on the empirical charge–surface area correlation proposed earlier,131 which is essentially the same as observed experimentally. Therefore, the observed shift in the CSD and the saturating binding of one specific zinc ion indicate the Zn(II) ion-induced folding of MM1 into a well-defined three-dimensional structure.

Figure 20. ESI FT-ICR mass spectra of a) MM1 peptide without zinc and b) with 10-fold molar excess of Zn2+ ions in solution. The peptide concentration was 2.5 µM in both. c) Titration curve of zinc binding to MM1 peptide.

For better estimation of the zinc binding affinity, a zinc titration was also performed with MM1. The fractional saturation of the peptide versus free zinc ion concentration is shown in Figure 20c. The dissociation constant (Kd) determined from the curve fitting is (112 ± 9) × 10⁻⁹ M. The obtained sub-micromolar affinity is approximately 7–8 orders of magnitude weaker than that observed in native zinc fingers.
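To illustrate what a Kd of 112 nM implies for the titration described above, the following minimal sketch evaluates a simple 1:1 binding model in two forms: the hyperbolic isotherm as a function of free zinc, and the exact quadratic solution as a function of total concentrations. The 1:1 model and the function names are assumptions made here for illustration; the actual curve-fitting procedure is described in publication III and is not reproduced.

```python
import math

KD = 112e-9       # M, dissociation constant reported for MM1
PEPTIDE = 2.5e-6  # M, peptide concentration given in the Figure 20 caption


def saturation_from_free_zinc(kd, free_zinc):
    """Fractional saturation for 1:1 binding: theta = [Zn] / (Kd + [Zn])."""
    return free_zinc / (kd + free_zinc)


def saturation_from_totals(kd, peptide_total, zinc_total):
    """Fractional saturation from total concentrations (exact quadratic solution)."""
    b = kd + peptide_total + zinc_total
    complex_conc = (b - math.sqrt(b * b - 4.0 * peptide_total * zinc_total)) / 2.0
    return complex_conc / peptide_total


for fold in (1, 3, 10):
    theta = saturation_from_totals(KD, PEPTIDE, fold * PEPTIDE)
    print(f"{fold:>2}-fold zinc: {100 * theta:.1f} % of peptide zinc-bound")
```

Under these assumptions the model predicts that more than 99 % of the peptide carries zinc at a 10-fold excess, in line with the observation that only a small amount of the apo-peptide remained in that sample.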
Similar results were obtained with the DFF2 and NC peptides (Figure S2 in III). However, the zinc ion concentration requirements were markedly different for these peptides. The DFF2 peptide required ca. 15-fold zinc excess to reach saturation, suggesting a somewhat lower affinity than MM1. In contrast, the NC peptide was fully saturated at a 3-fold zinc concentration, implying very high affinity. The dissociation constant between NC and zinc has been estimated to be around 10⁻¹⁵ M.132 When acetic acid (pH 3.2) was added to the peptide solutions, only the apo-peptides were observed due to the acid-induced unfolding.

The specificity of MM1 to bind zinc ions was assessed by using a variety of other alkali and transition metals. Of the metals tested, MM1 was able to bind Mn2+ and Ca2+, but with very low affinity. In addition, up to three Hg2+ ions were seen to bind the peptide, suggesting mostly non-specific binding. Surprisingly, only weak binding with Co2+ was observed, although cobalt is used as a probe to study the coordination spheres of zinc fingers. In fact, Sharpe et al. reported the use of Co2+ to probe the coordination geometry of MM1 by absorption spectroscopy measurements.28 Only gold (Au+) was able to form an abundant complex with the MM1 peptide (Figure 21e). However, the binding of the gold ion to the peptide did not shift the CSD, indicating that it does not induce similar folding of the peptide as compared to zinc. Gold ions usually have a bidentate binding mode in ZnFs, involving two cysteine residues. A further increase of the gold ion concentration caused the peptide to oxidize rapidly, forming a disulfide bond. However, the gold ion did not bind to the oxidized peptide, indicating a requirement for free thiols for the complex formation.

Figure 21. ESI FT-ICR mass spectra of MM1 peptide with 12-fold molar excess of a) Mn2+, b) Ca2+, c) Co2+, d) Hg2+ and e) Au+ ions. The peptide concentration was 2.5 µM in each.

4.5.3.
ZINC COORDINATION SPHERE ROBUSTNESS OF MM1

Several Cys/His-to-Ala mutants of MM1 were synthesized in order to study the importance of the residues in the primary coordination sphere for efficient zinc binding. A robust and stable primary coordination sphere would be essential for the design of novel ZnF based protein scaffolds. The preliminary experiments showed that the affinity of the mutants was slightly lower than that of the MM1 peptide. Therefore, the zinc concentration was raised to 36-fold (90 µM) while keeping the peptide concentration constant. Surprisingly, all of the mutant peptides, except C10A, were able to bind zinc ions (Figure 22), suggesting that Cys10 is the most critical residue for the zinc ion binding in MM1 and cannot be replaced or substituted by any other residue. In contrast, the C23A mutant was fully saturated in these conditions, suggesting that the absence of this residue can be compensated for by other residues, such as the non-coordinating His22. Also, the H19C mutant was almost fully saturated in these conditions; however, the affinity was lower than that of the MM1 peptide. An increase in the number of cysteine residues in the coordination sphere generally correlates with a higher entropy contribution in the binding free energy, resulting in a higher sensitivity to factors such as pH, temperature or ionic strength.133 The C5A and H19A mutants had similar affinities to MM1, but lower in comparison to the C23A and H19C mutants.

Figure 22. Mass spectra of zinc binding to Cys/His-to-Ala mutants of MM1 peptide. a) C5A, b) C10A, c) C23A, d) H19A and e) H19C. Zinc ion concentration was 36-fold in each case.

Second-shell interactions also play an important role in the stabilization of structural zinc sites. MM1 contains a non-coordinating His22 residue, which is only ~3.7 Å away from the coordinated zinc ion (Figure 19), having a plausible role as a second-shell ligand.
Previously, the H22Y mutation was shown to prevent complete folding of the peptide.28 In contrast, the H22A mutant was capable of binding zinc, although the affinity was lower when compared to the MM1 peptide (data not shown). As tyrosine rather than alanine was used in the previous study, a direct comparison is difficult. In summary, it seems that His22 can act as a coordinating residue in MM1, especially in the case of the C23A mutation.

In order to rationalize the results, the structural models for the studied MM1 mutants were obtained through molecular mechanics calculations by using the solution NMR structure of MM1 as a template. The energy-minimized model for MM1 retained the overall fold well, with a backbone root-mean-square deviation (RMSD) of only 0.780 Å. In addition, all mutant peptide models also retained the overall fold (backbone RMSDs between 0.187 and 0.846 Å) with their zinc coordination sites, showing a tetrahedral coordination geometry (see Figure 5 in III). These results also suggest that His22 could act as a substitutional zinc ligand. Interestingly, the obtained model for the C10A mutant also showed nearly perfect zinc binding geometry. However, the absence of the Cys10 residue residing in the long loop connecting the two helical regions (V2–A4 and V16–M20) supposedly results in destabilization, since there are no other polar contacts (e.g. hydrogen bonds) present in that region. A more detailed structural analysis would require molecular dynamics simulations or heteronuclear NMR analysis to probe structural changes and possible unfolding of this mutant. These results suggest that small designed zinc finger motifs, such as MM1 and the like, may be more robust structurally than their natural counterparts in larger proteins or enzymes.22,134–136

5. CONCLUSIONS

Zinc proteins are of utmost importance in many physiological processes.
Thus, their analysis provides a way to understand these processes at the molecular level, which may have important implications in finding targeted therapies for diseases or, for example, designing artificial nanocatalysts, such as zinc finger nucleases or specific biosensors for a variety of purposes.22–25 In this study, the structure and functions of several different zinc proteins were studied by using three-dimensional protein structures available in the structure database (PDB) and by experimental methods, especially high-resolution mass spectrometry.

The large-scale analysis of zinc protein structures deposited in the PDB revealed that the coordination spheres are very diverse and exhibit large variations in the metal-to-ligand bond distances. Previous database surveys have been either limited in scale or have used improper analysis methods or statistical rules, which have led to severe misinterpretations of the true coordination chemistry of zinc ions.80–82 For example, di- and tricoordinated structures have been reported, which are very unlikely for any transition metal in biological environments. A wealth of new information regarding the reasons behind the shortcomings of the previous database surveys was pointed out in this study, emphasizing the need for establishing better analysis and validation protocols for structural characterization of metalloproteins in general. Although zinc ions are redox-inert and unique in their coordination environments, the principles used here can be applied to any other biological metal, where the oxidation state and the electron configuration may play an even more significant role.
Zinc ions themselves cannot participate in the redox-regulation in cellular conditions; however, the coordination to cysteine residues in proteins provides the necessary redox activity for the structural and enzymatically active zinc sites.9,137 As zinc finger motifs have several cysteine residues which coordinate to the zinc ion, they are sensitive towards oxidative stress. This may result in cysteine oxidation, which releases the zinc ion and alters the protein conformation. In this work, redox-regulation of the SAP30L corepressor protein was studied. The N-terminal zinc finger motif of SAP30L was shown to undergo in vitro oxidation, which directly affects the binding affinity toward nuclear phospholipids. As shown in the previous studies, the presence of hydrogen peroxide led to the detachment of SAP30L from the chromatin and its relocalization to the cytoplasm, although this was associated with the increased amount of phospholipids upon oxidative stress.102,109 In this study, the molecular basis for the redox-regulation of SAP30L by the formation of the two specific disulfide bonds and the release of the coordinated zinc ion was demonstrated as having a plausible role in the repression activity of the Sin3A corepressor complex.

The design of ZnF based molecular scaffolds, whether based on a known or a novel amino acid sequence, has gained considerable interest in recent years.22–25,28 The designer zinc finger peptides have shown that the primary structures of these peptides can be tolerant to several amino acid mutations without marked difference in their folding or metal ion binding characteristics. This makes them very robust and versatile molecular templates for further design of ZnF based protein scaffolds for a variety of purposes. Previously, a non-native CHANCE motif was shown to be exceptionally tolerant of the replacement of most of its amino acid residues with alanine without a marked decrease in its ability to adopt a fully folded structure.
One of the resulting “minimal mutant” structures, named MM1, was further characterized in this study. As a continuation of the previous studies, the robustness of the structure of MM1 was evaluated in terms of its primary zinc coordination sphere. It was demonstrated that MM1 retains its zinc specificity and binding affinity and remains folded upon selective Cys/His-to-Ala mutations of its primary zinc-coordinating residues. This suggests that small designer peptides may have access to a wider conformational space required for efficient metal ion binding, as compared to similar structural motifs in natural proteins. Three-dimensional structures of the MM1 mutants would offer further structural insights into the robustness of the zinc center in MM1, in particular the roles of some secondary amino acid residues and the conformations adopted by the mutant structures.
Biochemical studies of zinc-binding proteins are challenging due to the limited number of applicable spectroscopic methods for their characterization. This study showed that high-resolution mass spectrometry serves as an excellent analytical tool for characterizing various structural aspects of small and large zinc metalloproteins, including metal ion specificity and affinity, ligand binding, redox chemistry, and metal-dependent folding. The high sensitivity and specificity, together with the direct information obtained on binding stoichiometry, are the key figures of merit that make mass spectrometry competitive with the more traditional techniques.
ACKNOWLEDGEMENTS
This work was carried out at the Department of Chemistry, University of Eastern Finland, between 2008 and 2016.
Grants from the Department of Chemistry at the beginning of these studies, the financial support and travel grants from the Graduate School of Organic Chemistry and Chemical Biology (GSOCCB), and funding from the TEKES project “Biomedical product concept development based on big data gathered from immunogenomics and –proteomics” are gratefully acknowledged.
I express my deepest gratitude to my supervisor, Professor Janne Jänis, for the opportunity to work with mass spectrometry and zinc proteins. Your guidance and support over the years have been invaluable for the completion of these studies. I would also like to thank Jarkko Valjakka, Ph.D., for his guidance during these studies, his critical reading of the manuscripts, and especially for asking the important questions. Special thanks are also due to our former laboratory technician, Ritva Romppanen, for her assistance in operating the mass spectrometry instruments during the research and for being able to locate the necessary laboratory equipment and chemicals. The referees of this dissertation, Prof. Risto Kostiainen and Assoc. Prof. Vesa Hytönen, are also owed special recognition and thanks for their critical reading and suggested improvements. Docent Greg Watson was responsible for the language revision of this manuscript. The former and current members of the mass spectrometry and protein groups, as well as the entire staff and student body of the Department of Chemistry, also deserve warm acknowledgement for having created a pleasant atmosphere to work in.
Finally, I would like to thank my family for their constant support throughout my life. Without your support and care during these years, the completion of this work would not have been possible.
Joensuu, November 2016
Mikko Laitaoja
REFERENCES
1. Harding, M. M.; Nowicki, M. W.; Walkinshaw, M. D. Crystallogr. Rev. 2010, 16, 247.
2. Loo, J. A. Int. J. Mass Spectrom. 2001, 204, 113.
3. Abendroth, J.; Niefind, K.; Schomburg, D. J. Mol. Biol. 2002, 320, 143.
4. Frankær, C. G.; Knudsen, M. V.; Norén, K.; Nazarenko, E.; Ståhl, K.; Harris, P. Acta Crystallogr. 2012, D68, 1259.
5. Vahrenkamp, H. Dalton Trans. 2007, 4751.
6. Waldron, K. J.; Rutherford, J. C.; Ford, D.; Robinson, N. J. Nature 2009, 460, 823.
7. Andreini, C.; Banci, L.; Bertini, I.; Rosato, A. J. Proteome Res. 2006, 5, 196.
8. Coleman, J. E. Annu. Rev. Biochem. 1992, 61, 897.
9. Maret, W.; Li, Y. Chem. Rev. 2009, 109, 4682.
10. Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. Nucleic Acids Res. 2000, 28, 235.
11. Lee, M. S.; Gippert, G. P.; Soman, K. V.; Case, D. A.; Wright, P. E. Science 1989, 245, 635.
12. Krishna, S. S.; Majumdar, I.; Grishin, N. V. Nucleic Acids Res. 2003, 31, 532.
13. Kwan, A. H.; Mobli, M.; Gooley, P. R.; King, G. F.; Mackay, J. P. FEBS J. 2011, 278, 687.
14. Auld, D. S. BioMetals 2009, 22, 141.
15. Andreini, C.; Bertini, I. J. Inorg. Biochem. 2012, 111, 150.
16. Auld, D. S. BioMetals 2001, 14, 271.
17. Andreini, C.; Bertini, I.; Cavallaro, G. PLoS ONE 2011, 6, e26325.
18. Hakanpää, J.; Szilvay, G. R.; Kaljunen, H.; Maksimainen, M.; Linder, M.; Rouvinen, J. Protein Sci. 2006, 15, 2129.
19. Wlodawer, A.; Minor, W.; Dauter, Z.; Jaskolski, M. FEBS J. 2008, 275, 1.
20. Fabris, D.; Fenselau, C. Anal. Chem. 1999, 71, 384.
21. Penner-Hahn, J. E. Coord. Chem. Rev. 2005, 249, 161.
22. Dahiyat, B. I.; Mayo, S. L. Science 1997, 278, 82.
23. Pabo, C. O.; Peisach, E.; Grant, R. A. Annu. Rev. Biochem. 2001, 70, 313.
24. Sharpe, B. K.; Matthews, J. M.; Kwan, A. H. Y.; Newton, A.; Gell, D. A.; Crossley, M.; Mackay, J. P. Structure 2002, 10, 639.
25. Cerasoli, E.; Sharpe, B. K.; Woolfson, D. N. J. Am. Chem. Soc. 2005, 127, 15008.
26. De Guzman, R. N.; Wojciak, J. M.; Martinez-Yamout, M. A.; Dyson, H. J.; Wright, P. E. Biochemistry 2005, 44, 490.
27. Newton, A. L.; Sharpe, B. K.; Kwan, A.; Mackay, J. P.; Crossley, M. J. Biol. Chem. 2000, 275, 15128.
28. Sharpe, B. K.; Liew, C. K.; Kwan, A. H.; Wilce, J. A.; Crossley, M.; Matthews, J. M.; Mackay, J. P. Structure 2005, 13, 257.
29. South, T. L.; Summers, M. F. Protein Sci. 1993, 2, 3.
30. Murray, K. K.; Boyd, R. K.; Eberlin, M. N.; Langley, G. J.; Li, L.; Naito, Y. Pure Appl. Chem. 2013, 85, 1515.
31. Fabris, D.; Hathout, Y.; Fenselau, C. Inorg. Chem. 1999, 38, 1322.
32. Smirnova, J.; Zhukova, L.; Witkiewicz-Kucharczyk, A.; Kopera, E.; Olędzki, J.; Wysłouch-Cieszyńska, A.; Palumaa, P.; Hartwig, A.; Bal, W. Anal. Biochem. 2007, 369, 226.
33. Larabee, J. L.; Hocker, J. R.; Hanas, J. S. Arch. Biochem. Biophys. 2005, 434, 139.
34. Marshall, A. G.; Guan, S. Rapid Commun. Mass Spectrom. 1996, 10, 1819.
35. Heck, A. J. R. Nat. Methods 2008, 5, 927.
36. Kebarle, P.; Verkerk, U. H. Mass Spec. Rev. 2009, 28, 898.
37. Kaltashov, I. A.; Eyles, S. J. Mass Spec. Rev. 2002, 21, 37.
38. Cole, R. B. Electrospray and MALDI Mass Spectrometry: Fundamentals, Instrumentation, Practicalities, and Biological Applications, 2nd ed., John Wiley & Sons, 2011.
39. Wilm, M. Mol. Cell. Proteomics 2011, 10, M111.009407.
40. Felitsyn, N.; Peschke, M.; Kebarle, P. Int. J. Mass Spectrom. 2002, 219, 39.
41. Iavarone, A.; Jurchen, J.; Williams, E. J. Am. Soc. Mass Spectrom. 2000, 11, 976.
42. Lemaire, D.; Marie, G.; Serani, L.; Laprévote, O. Anal. Chem. 2001, 73, 1699.
43. Marshall, A. G.; Hendrickson, C. L.; Jackson, G. S. Mass Spec. Rev. 1998, 17, 1.
44. Barrow, M. P.; Burkitt, W. I.; Derrick, P. J. Analyst 2005, 130, 18.
45. Scigelova, M.; Hornshaw, M.; Giannakopulos, A.; Makarov, A. Mol. Cell. Proteomics 2011, 10, M111.009431.
46. Qi, Y.; O’Connor, P. B. Mass Spec. Rev. 2014, 33, 333.
47. Loo, J. A.; Edmonds, C. G.; Udseth, H. R.; Smith, R. D. Anal. Chem. 1990, 62, 693.
48. Loo, J. A. Mass Spec. Rev. 1997, 16, 1.
49. Kaltashov, I.; Zhang, M.; Eyles, S.; Abzalimov, R. Anal. Bioanal. Chem. 2006, 386, 472.
50. Kaltashov, I.; Abzalimov, R. J. Am. Soc. Mass Spectrom. 2008, 19, 1239.
51. Marcoux, J.; Robinson, C. V. Structure 2013, 21, 1541.
52. Daniel, J. M.; Friess, S. D.; Rajagopalan, S.; Wendt, S.; Zenobi, R. Int. J. Mass Spectrom. 2002, 216, 1.
53. van den Bremer, E. T. J.; Jiskoot, W.; James, R.; Moore, G. R.; Kleanthous, C.; Heck, A. J. R.; Maier, C. S. Protein Sci. 2002, 11, 1738.
54. Livnah, O.; Bayer, E. A.; Wilchek, M.; Sussman, J. L. Proc. Natl. Acad. Sci. USA 1993, 90, 5076.
55. Leppiniemi, J.; Määttä, J. A. E.; Hammaren, H.; Soikkeli, M.; Laitaoja, M.; Jänis, J.; Kulomaa, M. S.; Hytönen, V. P. PLoS ONE 2011, 6, e16576.
56. Iavarone, A. T.; Udekwu, O. A.; Williams, E. R. Anal. Chem. 2004, 76, 3944.
57. Zubarev, R. A.; Kruger, N. A.; Fridriksson, E. K.; Lewis, M. A.; Horn, D. M.; Carpenter, B. K.; McLafferty, F. W. J. Am. Chem. Soc. 1999, 121, 2857.
58. Zubarev, R. A. Mass Spec. Rev. 2003, 22, 57.
59. Cooper, H. J.; Håkansson, K.; Marshall, A. G. Mass Spec. Rev. 2005, 24, 201.
60. Gorman, J. J.; Wallis, T. P.; Pitt, J. J. Mass Spec. Rev. 2002, 21, 183.
61. Meng, F.; Forbes, A. J.; Miller, L. M.; Kelleher, N. L. Mass Spec. Rev. 2005, 24, 126.
62. Paizs, B.; Suhai, S. Mass Spec. Rev. 2005, 24, 508.
63. Breuker, K.; Jin, M.; Han, X.; Jiang, H.; McLafferty, F. W. J. Am. Soc. Mass Spectrom. 2008, 19, 1045.
64. The PyMOL Molecular Graphics System, Version 1.3r1, Schrödinger, LLC, Cambridge, MA, 2010.
65. Viiri, K. M.; Korkeamäki, H.; Kukkonen, M. K.; Nieminen, L. K.; Lindfors, K.; Peterson, P.; Mäki, M.; Kainulainen, H.; Lohi, O. Nucleic Acids Res. 2006, 34, 3288.
66. Pace, C. N.; Vajdos, F.; Fee, L.; Grimsley, G.; Gray, T. Protein Sci. 1995, 4, 2411.
67. Scigelova, M.; Green, P.; Giannakopulos, A.; Rodger, A.; Crout, D.; Derrick, P. Eur. J. Mass Spectrom. 2001, 7, 29.
68. Marie, G.; Serani, L.; Laprévote, O. Anal. Chem. 2000, 72, 5423.
69. Wang, W.; Kitova, E. N.; Klassen, J. S. Anal. Chem. 2003, 75, 4945.
70. Andreini, C.; Cavallaro, G.; Lorenzini, S.; Rosato, A. Nucleic Acids Res. 2013, 41, D312.
71. Wang, G. G.; Song, J.; Wang, Z.; Dormann, H. L.; Casadio, F.; Li, H.; Luo, J.-L.; Patel, D. J.; Allis, C. D. Nature 2009, 459, 847.
72. Chang, X.; Jørgensen, A. M. M.; Bardrum, P.; Led, J. J. Biochemistry 1997, 36, 9409.
73. Berman, H. M.; Coimbatore Narayanan, B.; Costanzo, L. D.; Dutta, S.; Ghosh, S.; Hudson, B. P.; Lawson, C. L.; Peisach, E.; Prlić, A.; Rose, P. W.; Shao, C.; Yang, H.; Young, J.; Zardecki, C. FEBS Lett. 2013, 587, 1036.
74. Taverna, S. D.; Ilin, S.; Rogers, R. S.; Tanny, J. C.; Lavender, H.; Li, H.; Baker, L.; Boyle, J.; Blair, L. P.; Chait, B. T.; Patel, D. J.; Aitchison, J. D.; Tackett, A. J.; Allis, C. D. Mol. Cell 2006, 24, 785.
75. Brown, E. N.; Ramaswamy, S. Acta Crystallogr. 2007, D63, 941.
76. McCall, K. A.; Huang, C.; Fierke, C. A. J. Nutr. 2000, 130, 1437S.
77. Zheng, H.; Chruszcz, M.; Lasota, P.; Lebioda, L.; Minor, W. J. Inorg. Biochem. 2008, 102, 1765.
78. Abriata, L. A. Acta Crystallogr. 2012, D68, 1223.
79. Alberts, I. L.; Nadassy, K.; Wodak, S. J. Protein Sci. 1998, 7, 1700.
80. Patel, K.; Kumar, A.; Durani, S. Biochim. Biophys. Acta 2007, 1774, 1247.
81. Tamames, B.; Sousa, S. F.; Tamames, J.; Fernandes, P. A.; Ramos, M. J. Proteins 2007, 69, 466.
82. Sousa, S. F.; Lopes, A. B.; Fernandes, P. A.; Ramos, M. J. Dalton Trans. 2009, 7946.
83. Magyar, J. S.; Weng, T.-C.; Stern, C. M.; Dye, D. F.; Rous, B. W.; Payne, J. C.; Bridgewater, B. M.; Mijovilovich, A.; Parkin, G.; Zaleski, J. M.; Penner-Hahn, J. E.; Godwin, H. A. J. Am. Chem. Soc. 2005, 127, 9495.
84. Abriata, L. A. Acta Crystallogr. 2013, B69, 176.
85. Friedman, R. Dalton Trans. 2014, 43, 2878.
86. Pozharski, E.; Weichenberger, C. X.; Rupp, B. Acta Crystallogr. 2013, D69, 150.
87. Read, R. J.; Adams, P. D.; Arendall III, W. B.; Brunger, A. T.; Emsley, P.; Joosten, R. P.; Kleywegt, G. J.; Krissinel, E. B.; Lütteke, T.; Otwinowski, Z.; Perrakis, A.; Richardson, J. S.; Sheffler, W. H.; Smith, J. L.; Tickle, I. J.; Vriend, G.; Zwart, P. H. Structure 2011, 19, 1395.
88. Sommerhalter, M.; Lieberman, R. L.; Rosenzweig, A. C. Inorg. Chem. 2005, 44, 770.
89. Zheng, H.; Chordia, M. D.; Cooper, D. R.; Chruszcz, M.; Müller, P.; Sheldrick, G. M.; Minor, W. Nat. Protoc. 2014, 9, 156.
90. Feng, L.; Yan, H.; Wu, Z.; Yan, N.; Wang, Z.; Jeffrey, P. D.; Shi, Y. Science 2007, 318, 1608.
91. Adler, M.; Bryant, J.; Buckman, B.; Islam, I.; Larsen, B.; Finster, S.; Kent, L.; May, K.; Mohan, R.; Yuan, S.; Whitlow, M. Biochemistry 2005, 44, 9339.
92. Spåhr, H.; Calero, G.; Bushnell, D. A.; Kornberg, R. D. Proc. Natl. Acad. Sci. USA 2009, 106, 9185.
93. Grzenda, A.; Lomberk, G.; Zhang, J.-S.; Urrutia, R. Biochim. Biophys. Acta 2009, 1789, 443.
94. McDonel, P.; Costello, I.; Hendrich, B. Int. J. Biochem. Cell Biol. 2009, 41, 108.
95. Laherty, C. D.; Billin, A. N.; Lavinsky, R. M.; Yochum, G. S.; Bush, A. C.; Sun, J.-M.; Mullen, T.-M.; Davie, J. R.; Rose, D. W.; Glass, C. K.; Rosenfeld, M. G.; Ayer, D. E.; Eisenman, R. N. Mol. Cell 1998, 2, 33.
96. Huang, N. E.; Lin, C.-H.; Lin, Y.-S.; Yu, W. C. Y. Biochem. Biophys. Res. Commun. 2003, 306, 267.
97. Kuzmichev, A.; Zhang, Y.; Erdjument-Bromage, H.; Tempst, P.; Reinberg, D. Mol. Cell. Biol. 2002, 22, 835.
98. Xie, T.; He, Y.; Korkeamäki, H.; Zhang, Y.; Imhoff, R.; Lohi, O.; Radhakrishnan, I. J. Biol. Chem. 2011, 286, 27814.
99. Teittinen, K. J.; Grönroos, T.; Parikka, M.; Junttila, S.; Uusimäki, A.; Laiho, A.; Korkeamäki, H.; Kurppa, K.; Turpeinen, H.; Pesu, M.; Gyenesei, A.; Rämet, M.; Lohi, O. J. Cell. Biochem. 2012, 113, 3843.
100. Sichtig, N.; Körfer, N.; Steger, G. Arch. Biochem. Biophys. 2007, 467, 67.
101. Le May, N.; Mansuroglu, Z.; Léger, P.; Josse, T.; Blot, G.; Billecocq, A.; Flick, R.; Jacob, Y.; Bonnefoy, E.; Bouloy, M. PLoS Pathog. 2008, 4, e13.
102. Viiri, K. M.; Jänis, J.; Siggers, T.; Heinonen, T. Y. K.; Valjakka, J.; Bulyk, M. L.; Mäki, M.; Lohi, O. Mol. Cell. Biol. 2009, 29, 342.
103. Viiri, K. M.; Heinonen, T. Y.; Mäki, M.; Lohi, O. BMC Evol. Biol. 2009, 9, 149.
104. Lemmon, M. A. Nat. Rev. Mol. Cell Biol. 2008, 9, 99.
105. Kaadige, M. R.; Ayer, D. E. J. Biol. Chem. 2006, 281, 28831.
106. Gozani, O.; Karuman, P.; Jones, D. R.; Ivanov, D.; Cha, J.; Lugovskoy, A. A.; Baird, C. L.; Zhu, H.; Field, S. J.; Lessnick, S. L.; Villasenor, J.; Mehrotra, B.; Chen, J.; Rao, V. R.; Brugge, J. S.; Ferguson, C. G.; Payrastre, B.; Myszka, D. G.; Cantley, L. C.; Wagner, G.; Divecha, N.; Prestwich, G. D.; Yuan, J. Cell 2003, 114, 99.
107. Di Paolo, G.; De Camilli, P. Nature 2006, 443, 651.
108. Kutateladze, T. G. Nat. Chem. Biol. 2010, 6, 507.
109. Viiri, K.; Mäki, M.; Lohi, O. Sci. Signal. 2012, 5, pe19.
110. He, Y.; Imhoff, R.; Sahu, A.; Radhakrishnan, I. Nucleic Acids Res. 2009, 37, 2142.
111. Liew, C. K.; Crossley, M.; Mackay, J. P.; Nicholas, H. R. J. Mol. Biol. 2007, 366, 382.
112. Jauch, R.; Bourenkov, G. P.; Chung, H.-R.; Urlaub, H.; Reidt, U.; Jäckle, H.; Wahl, M. C. Structure 2003, 11, 1393.
113. Fabris, D.; Zaia, J.; Hathout, Y.; Fenselau, C. J. Am. Chem. Soc. 1996, 118, 12242.
114. Konrat, R.; Weiskirchen, R.; Bister, K.; Kräutler, B. J. Am. Chem. Soc. 1998, 120, 7127.
115. Shlomai, J. Antioxid. Redox Signal. 2010, 13, 1429.
116. Kröncke, K.-D.; Klotz, L.-O. Antioxid. Redox Signal. 2009, 11, 1015.
117. Ilbert, M.; Graf, P. C. F.; Jakob, U. Antioxid. Redox Signal. 2006, 8, 835.
118. Lee, C.; Lee, S. M.; Mukhopadhyay, P.; Kim, S. J.; Lee, S. C.; Ahn, W.-S.; Yu, M.-H.; Storz, G.; Ryu, S. E. Nat. Struct. Mol. Biol. 2004, 11, 1179.
119. Park, C.; Raines, R. T. Protein Eng. 2001, 14, 939.
120. Wouters, M. A.; Fan, S. W.; Haworth, N. L. Antioxid. Redox Signal. 2010, 12, 53.
121. Carugo, O.; Čemažar, M.; Zahariev, S.; Hudáky, I.; Gáspári, Z.; Perczel, A.; Pongor, S. Protein Eng. 2003, 16, 637.
122. de Araujo, A. D.; Herzig, V.; Windley, M. J.; Dziemborowicz, S.; Mobli, M.; Nicholson, G. M.; Alewood, P. F.; King, G. F. Antioxid. Redox Signal. 2013, 19, 1976.
123. Kim, B.-M.; Schultz, L. W.; Raines, R. T. Protein Sci. 1999, 8, 430.
124. Lima, W. F.; Wu, H.; Nichols, J. G.; Manalili, S. M.; Drader, J. J.; Hofstadler, S. A.; Crooke, S. T. J. Biol. Chem. 2003, 278, 14906.
125. Tang, J.; Kang, S.-G.; Saven, J. G.; Gai, F. J. Mol. Biol. 2009, 389, 90.
126. Walkup, G. K.; Imperiali, B. J. Am. Chem. Soc. 1997, 119, 3443.
127. Yu, X.; Hathout, Y.; Fenselau, C.; Sowder, R. C.; Henderson, L. E.; Rice, W. G.; Mendeleyev, J.; Kun, E. Chem. Res. Toxicol. 1995, 8, 586.
128. Mely, Y.; Cornille, F.; Fournié-Zaluski, M.-C.; Darlix, J.-L.; Roques, B. F.; Gérard, D. Biopolymers 1991, 31, 899.
129. Lehmann, E.; Zenobi, R. Angew. Chem. Int. Ed. 1998, 37, 3430.
130. Krissinel, E.; Henrick, K. J. Mol. Biol. 2007, 372, 774.
131. Kaltashov, I. A.; Mohimen, A. Anal. Chem. 2005, 77, 5370.
132. McLendon, G.; Hull, H.; Larkin, K.; Chang, W. J. Biol. Inorg. Chem. 1999, 4, 171.
133. Kochańczyk, T.; Drozd, A.; Krężel, A. Metallomics 2015, 7, 244.
134. Jeloková, J.; Karlsson, C.; Estonius, M.; Jörnvall, H.; Höög, J.-O. Eur. J. Biochem. 1994, 225, 1015.
135. Simpson, R. J. Y.; Cram, E. D.; Czolij, R.; Matthews, J. M.; Crossley, M.; Mackay, J. P. J. Biol. Chem. 2003, 278, 28011.
136. Bergman, T.; Zhang, K.; Palmberg, C.; Jörnvall, H.; Auld, D. S. Cell. Mol. Life Sci. 2008, 65, 4019.
137. Maret, W. Antioxid. Redox Signal. 2006, 8, 1419.
Dissertations, Department of Chemistry, University of Eastern Finland, No. 140 (2016). Mikko Laitaoja: Structure-Function Studies of Zinc Proteins