05. Structure Elucidation

BIOINF 4120 Bioinformatics 2 - Structures and Systems - Oliver Kohlbacher, Summer 2012

Overview
• Protein structure elucidation
• X-ray diffraction (XRD)
• Crystallization
• Physics of electron diffraction
• Phasing, modeling, refinement
• Nuclear magnetic resonance spectroscopy (NMR)
• Physical foundations of NMR spectroscopy
• 1D and 2D spectra and their interpretation
• Comparison NMR and XRD
• Structural Databases: the PDB

X-Ray Crystallography
[Figure: workflow X-ray source → protein crystal → detector → analysis]

X-Rays
[Figure: X-rays]

Protein Crystals
Proteins are difficult to crystallize:
• Irregular shape ⇒ large "holes" in the crystal
• Rather large crystals required (0.1 – 0.5 mm)
• Large amounts of protein necessary
• Protein needs to be very pure
• Crystal growth is very slow (weeks to months)
• Some proteins do not crystallize at all (membrane proteins!)
(Branden, Tooze, p. 376)

Crystallization – "Hanging Drop"
[Figure: hanging-drop crystallization setup; Nölting, p. 70; Branden/Tooze, p. 376]

Protein Crystals
• Regular arrangement of protein molecules in a three-dimensional lattice
• Irregular shape of proteins causes water-filled holes in the crystal ⇒ high water content (20 – 90%)
• Unit cell: smallest subunit of the crystal from which the whole crystal can be created by translation
(Branden, Tooze, p. 375)

Protein Crystal
• Example: Fab unit cell contains two copies of Fab
• Crystal is formed by translation of this unit cell along a regular lattice

X-ray Diffraction of Proteins
• Bernal and Crowfoot observed in 1934 that pepsin crystals create a well-defined diffraction pattern
• Nearly three decades and the invention of computers were necessary until Kendrew and Perutz could solve the first structures in 1960 (myoglobin, hemoglobin)
[Photo: Max Perutz, John Kendrew]

Wave Equations
• Euler's formula: e^(iφ) = cos φ + i sin φ
• Any periodic sine or cosine function can therefore be represented as a complex exponential function

Wave Equations
[Figure: wave with phase φ and wavelength λ travelling along direction s]
• The wave at time t and position r is described by a complex exponential function with unit vector s pointing along the direction of the wave front, frequency ω, wavelength λ, phase φ and i² = –1

Interference
[Figure: superposition of two waves]
• Constructive: amplitude increases
• Destructive: amplitude decreases
• Depends on phase difference
• Interference of two coherent waves E1 and E2 of equal amplitude E0: the sum separates into a constant factor and a phase factor
• Resulting wave has the same frequency as the original waves E1 and E2
• Amplitude depends on the phase factor, i.e., the phase difference Δφ
• Amplitude is easily observable
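
The dependence of the amplitude on Δφ can be checked numerically. The following is a minimal sketch, assuming two waves of equal amplitude E0 that differ only by a phase difference; the function names and the closed form 2·E0·|cos(Δφ/2)| are not taken from the slides and serve only as an illustration.

```python
import cmath
import math


def superpose(e0: float, delta_phi: float, phase: float = 0.0) -> complex:
    """Sum of two coherent waves of equal amplitude e0 at one instant.

    The second wave is shifted by delta_phi relative to the first.
    """
    e1 = e0 * cmath.exp(1j * phase)
    e2 = e0 * cmath.exp(1j * (phase + delta_phi))
    return e1 + e2


def predicted_amplitude(e0: float, delta_phi: float) -> float:
    """Closed-form amplitude of the sum (assumed standard result)."""
    return 2.0 * e0 * abs(math.cos(delta_phi / 2.0))


# Compare the numerical sum with the closed form for a few phase differences.
for dphi in (0.0, math.pi / 2, math.pi):
    print(dphi, abs(superpose(1.0, dphi)), predicted_amplitude(1.0, dphi))
```

A phase difference of 0 gives fully constructive interference, a phase difference of π gives fully destructive interference, and everything in between scales with the phase factor.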

Diffraction at Two Centers
[Figure: incident wave along s0 is scattered along s at the origin and at position r and travels to the detector; scattering angle 2θ; path segments r·s0 and r·s; λS = s – s0]
• The retardation of the interfering waves is r·s – r·s0, and thus the phase difference is
  Δφ = 2π λ⁻¹ r·(s – s0) = 2π r·S with S = (s – s0) / λ
• Considering the ratio of the wave E and the wave E0(r, t) diffracted at the origin isolates the phase factor determined by Δφ
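
As a worked step, the ratio of the two waves can be written out explicitly. This is a sketch under the assumption that both waves have the same amplitude and differ only by the phase difference Δφ given above; the slide's own equation may use a different notation.

```latex
\frac{E(\mathbf{r}, t)}{E_0(\mathbf{r}, t)}
  = e^{i\,\Delta\varphi}
  = e^{2\pi i\,\mathbf{r}\cdot\mathbf{S}},
\qquad
\mathbf{S} = \frac{\mathbf{s} - \mathbf{s}_0}{\lambda}
```

Under this assumption the time dependence cancels, so the contribution of a scattering center depends only on its position r and on the scattering vector S.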

Structure Factor
• Apart from the phase difference, the diffraction probability f_i of atom i is important
• The whole diffraction pattern is then the sum of all diffracted waves originating from atoms i at unit cell positions r_i:
  F(S) = Σ_i f_i exp(2πi r_i·S)
• F is also called structure factor
• Structure factor depends on atom positions and diffraction probabilities

Structure Factor
• Structure factor corresponds to the Fourier transform of the atom coordinates of the diffracting protein
• Diffraction occurs by interaction with the atoms' electron shells, not with the nuclei
• Thus, we introduce a continuous electron density ρ(r)
• F then becomes F(S) = ∫ ρ(r) exp(2πi r·S) dr

Diffraction Pattern of a Protein
[Figure: diffraction pattern of a protein; Nölting]

Fourier Analysis
[Figures: Fourier analysis, including a representation in the complex plane (Re, Im); Nölting]
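
The discrete structure factor from the Structure Factor slides can be evaluated directly. The sketch below assumes the form F(S) = Σ f_i exp(2πi r_i·S) with positions in fractional coordinates; the two-atom "unit cell" and the value f = 6.0 are made up for illustration only.

```python
import cmath
import math


def structure_factor(atoms, S):
    """Sum the diffracted waves of all atoms for scattering vector S.

    atoms: iterable of (f_i, (x, y, z)) with scattering strength f_i
    S:     scattering vector (same coordinate system as the positions)
    """
    total = 0j
    for f, r in atoms:
        r_dot_s = sum(rc * sc for rc, sc in zip(r, S))
        total += f * cmath.exp(2j * math.pi * r_dot_s)
    return total


# Hypothetical cell: two identical atoms half a cell apart along x.
atoms = [(6.0, (0.0, 0.0, 0.0)), (6.0, (0.5, 0.0, 0.0))]

F_in_phase = structure_factor(atoms, (2.0, 0.0, 0.0))      # r·S = 1, waves add
F_out_of_phase = structure_factor(atoms, (1.0, 0.0, 0.0))  # r·S = 1/2, waves cancel

print(abs(F_in_phase), abs(F_out_of_phase))
```

The same atoms give a strong or a vanishing reflection depending on S, which is why the diffraction pattern encodes the atom positions.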

Phasing Problem
• Diffraction pattern corresponds to the Fourier transform of the electron density
• Inverse Fourier transform yields the electron density from this
• Problem: detector measures intensity only, not the phase! I = |F(S)|² = F(S)F*(S)
• Phase information, however, is required to compute the electron density!
• Phasing problem: reconstruction of the phase information
• Common way to solve this: heavy atom replacement
[Cartoon: John O'Brien, © The New Yorker Collection 1991; Rhodes, p. 18]
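
Why the missing phases matter can be illustrated with a toy calculation. This is a minimal sketch using a one-dimensional, eight-point "electron density" and a plain discrete Fourier transform; the density values and helper names are invented for illustration and do not correspond to a real crystallographic program.

```python
import cmath
import math


def dft(values):
    """Forward transform, using the same sign convention as exp(2*pi*i r·S)."""
    n = len(values)
    return [
        sum(v * cmath.exp(2j * math.pi * k * x / n) for x, v in enumerate(values))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse of dft(): recovers the density from complex coefficients."""
    n = len(coeffs)
    return [
        sum(c * cmath.exp(-2j * math.pi * k * x / n) for k, c in enumerate(coeffs)) / n
        for x in range(n)
    ]


density = [0.0, 3.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # toy 1D density
F = dft(density)
intensities = [abs(c) ** 2 for c in F]  # all that a detector would record

# With the true phases the density is recovered ...
with_phases = [v.real for v in inverse_dft(F)]
# ... with amplitudes only (all phases set to zero) it is not.
amplitudes_only = [v.real for v in inverse_dft([abs(c) for c in F])]

print([round(v, 3) for v in with_phases])
print([round(v, 3) for v in amplitudes_only])
```

The amplitudes alone produce a different, symmetric map, so additional experiments such as heavy atom replacement are needed to recover the phases.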

Overview X-Ray Diffraction
[Figure: overview of structure determination by X-ray diffraction; Nölting, p. 68]

Electron Density Maps
[Figures: electron density maps]

Resolution
• Resolution of a structure determines information content
• Determined by quality of the crystal:
  • Purity
  • Inclusions
  • Water content
  • Stability under irradiation
• Resolution can be estimated from the diffraction pattern
(Nölting)

Resolution
• Resolution determines which atomic details are recognizable
• Poor resolution (large value) blurs the details of the structure
• Resolution is measured in Å
• A resolution of 2 Å does not mean that the error in the atom coordinates is about 2 Å!
• The error in the atom coordinates would be about 0.3 Å in that case

Resolution
Resolution [Å]   Information obtainable
4.0              Fold class, some secondary structures
3.5              Helices and strands become distinguishable
3.0              Most side chains recognizable
2.5              All side chains well defined, φ and ψ of the backbone partially well defined, water can be seen
1.5              All backbone torsions well defined, first hydrogen atoms visible
1.0              Hydrogen atoms become visible
The slide additionally marks ranges of this scale with the labels poor, typical, very good, and possible.

Nuclear Magnetic Resonance
• 1H nuclei possess a nuclear magnetic moment
• In an external magnetic field B0, every nucleus assumes one of two possible states (spins): α or β
• The two states differ in energy; spin state α (parallel to B0) is energetically more favorable
• Addition of energy can invert the spin state
[Figure: energy levels α and β in the field B0, separated by ΔE; absorption of h·ν inverts the spin from α to β]

Nuclear Magnetic Resonance
ΔE depends on
• The magnitude of the external magnetic field
• The electronic environment of the nucleus

Nuclear Magnetic Resonance
ΔE
• Can be measured (absorption)
• Has different magnitude for different types of atoms
• For a system of atoms we thus obtain an NMR spectrum

Angular Momentum
• Nuclei have a nuclear angular momentum P
• P can be considered the quantum mechanical analog of a classical angular momentum (which does not suggest that the nuclei are rotating in any way!)
• P depends on the spin quantum number I: P = √(I(I+1)) ħ (ħ = h/2π, with Planck's constant h)
• I is a function of the nuclide, i.e., of the number of neutrons and protons in the nucleus
• For even-even (g,g) nuclei (even number of protons and even number of neutrons) I becomes zero ⇒ invisible for NMR!

Magnetic Moment
• Magnetic moment µ = γ P is proportional to the angular momentum
• Proportionality constant γ is called the magnetogyric ratio
• γ determines the sensitivity of the measurement: high γ = high sensitivity
• γ differs for each nuclide

Properties of Important Nuclides
Nuclide   Nat. abundance [%]   I     γ [10⁷ T⁻¹ s⁻¹]   Rel. sensitivity
1H        99.985               ½     26.7519           1.00
2H (D)    0.015                1     4.1066            0.01
12C       98.9                 0     -                 -
13C       1.1                  ½     6.7283            0.01
14N       99.63                1     1.9338            0.001
15N       0.37                 ½     -2.7126           0.001
16O       99.96                0     -                 -
17O       0.0037               5/2   -3.6280           0.03
31P       100                  ½     10.8394           0.07
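
The relative sensitivities in the table can be approximately reproduced from γ and I. The sketch below assumes the commonly used relation that, at constant field and for equal numbers of nuclei, sensitivity scales with |γ|³ · I(I+1). That relation is not stated on the slides and is used here only as an assumption; the γ and I values are copied from the table.

```python
from fractions import Fraction

# (spin quantum number I, gamma in 10^7 T^-1 s^-1), values from the table
NUCLIDES = {
    "1H": (Fraction(1, 2), 26.7519),
    "2H": (Fraction(1), 4.1066),
    "13C": (Fraction(1, 2), 6.7283),
    "14N": (Fraction(1), 1.9338),
    "15N": (Fraction(1, 2), -2.7126),
    "17O": (Fraction(5, 2), -3.6280),
    "31P": (Fraction(1, 2), 10.8394),
}


def relative_sensitivity(nuclide: str, reference: str = "1H") -> float:
    """Sensitivity relative to the reference, assuming ~ |gamma|^3 * I(I+1)."""
    spin, gamma = NUCLIDES[nuclide]
    ref_spin, ref_gamma = NUCLIDES[reference]
    weight = float(spin * (spin + 1))
    ref_weight = float(ref_spin * (ref_spin + 1))
    return (abs(gamma) / abs(ref_gamma)) ** 3 * weight / ref_weight


for name in NUCLIDES:
    print(f"{name:>4}: {relative_sensitivity(name):.4f}")
```

The sign of γ does not matter for this estimate, which is why the absolute value is used.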

Quantization of P
• In an external magnetic field with magnetic flux density B0, P and µ are quantized along the direction of B0 (z-axis)
• Possible states of the nuclear spin are described by the magnetic quantum number m:
  Pz = mħ with m = –I, –I + 1, ..., +I
  and µz = γmħ
• For a nucleus with I = ½ (e.g., 1H) we obtain m = +½, –½, and thus there are two possible spin states:
  Pz,+½ = +½ħ and Pz,–½ = –½ħ
[Figure: the two orientations Pz,+½ and Pz,–½ relative to the z-axis]

Energy in a Magnetic Field
• For simplicity, we will consider only nuclei with I = ½. Similar statements hold for other nuclei.
• Every atom with µ ≠ 0 is a magnetic dipole in an external magnetic field
• Classically, the energy E of a dipole is E = –µz B0 = –m γ ħ B0
• The energy difference between the two spin states is thus ΔE = |E–½ – E+½| = |γħB0|
• This energy difference corresponds to a resonance frequency ν with hν = ΔE

Energy in a Magnetic Field
• Resonance frequency depends on γ and B0
• Stronger magnetic fields correspond to larger energy differences, which in turn correspond to higher resonance frequencies
[Figure: energy E versus field strength; the levels m = +½ and m = –½ diverge from E = 0, giving ΔE1 at field B1 and a larger ΔE2 at field B2]
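
The relations E = –mγħB0 and hν = ΔE can be turned into numbers. The following is a minimal sketch, assuming the γ value of 1H from the nuclide table and a field of 7.05 T; the function names are chosen for illustration.

```python
import math

H = 6.62607015e-34         # Planck constant [J s]
HBAR = H / (2.0 * math.pi)
GAMMA_1H = 26.7519e7       # magnetogyric ratio of 1H [T^-1 s^-1], from the table


def level_energies(gamma: float, b0: float, spin: float = 0.5) -> dict:
    """Energies E_m = -m * gamma * hbar * B0 for m = -I, ..., +I."""
    n_levels = int(round(2 * spin)) + 1
    return {(-spin + k): -(-spin + k) * gamma * HBAR * b0 for k in range(n_levels)}


def resonance_frequency(gamma: float, b0: float) -> float:
    """nu = delta_E / h = |gamma| * B0 / (2 * pi), in Hz."""
    return abs(gamma) * b0 / (2.0 * math.pi)


energies = level_energies(GAMMA_1H, 7.05)
delta_e = abs(energies[-0.5] - energies[0.5])
nu = resonance_frequency(GAMMA_1H, 7.05)

print(f"delta E = {delta_e:.4e} J, nu = {nu / 1e6:.2f} MHz")
```

For 1H at 7.05 T this gives a resonance frequency of roughly 300 MHz, and doubling B0 doubles both ΔE and ν.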

Spin Populations
• Each atom can assume one of two states; all atoms are thus split into two populations of size Nα and Nβ
• The majority of nuclei assume the ground state, i.e., the state of lowest energy (Nβ < Nα)
• Occupancy of states follows the Boltzmann distribution: Nβ / Nα = exp(–ΔE / (kB T))
• Example: T = 300 K, B0 = 7.05 T ⇒ Nβ = 0.99995 Nα
• Differences in occupancy of the states are very small since energy differences are small compared to kB T!

NMR – Hardware
[Figures: NMR spectrometers, including the 800 MHz instrument at Pacific Northwest National Laboratory; http://en.wikipedia.org/wiki/Image:Pacific_Northwest_National_Laboratory_800_MHz_NMR_Spectrometer.jpg]
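
The example on the Spin Populations slide (T = 300 K, B0 = 7.05 T) can be recomputed from the Boltzmann distribution. This is a small sketch assuming ΔE = |γ|ħB0 for 1H and standard values for the physical constants; the function name is illustrative.

```python
import math

H = 6.62607015e-34     # Planck constant [J s]
K_B = 1.380649e-23     # Boltzmann constant [J/K]
GAMMA_1H = 26.7519e7   # magnetogyric ratio of 1H [T^-1 s^-1]


def population_ratio(gamma: float, b0: float, temperature: float) -> float:
    """N_beta / N_alpha = exp(-delta_E / (k_B * T)) with delta_E = |gamma| * hbar * B0."""
    delta_e = abs(gamma) * H / (2.0 * math.pi) * b0
    return math.exp(-delta_e / (K_B * temperature))


ratio = population_ratio(GAMMA_1H, 7.05, 300.0)
print(f"N_beta / N_alpha = {ratio:.6f}")
```

The ratio stays very close to 1, which illustrates why NMR is an insensitive method: only a tiny excess of spins in the ground state contributes to the signal.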

NMR Spectra of CHnBr4–n
• In principle, resonance frequencies should depend on the nuclide only ⇒ should be identical for all 1H nuclei
• Counterexample: bromomethanes – resonance frequency depends on the chemical environment of the nuclei

Shielding
• "Chemical environment" influences ν ⇒ NMR contains structural information!
• Electrons create a field B´ that shields the nucleus from B0
• The nucleus thus is exposed to an effective field Beff = B0 – B´
• B´ is proportional to B0: Beff = B0 – B´ = B0 – σB0 = (1 – σ) B0 with shielding σ

Chemical Shift
• ν depends on the nuclide and B0
• Instruments differ in their B0
• To simplify comparison between instruments, introduce a scale independent of B0
• Chemical shift δ
• δ is usually given in ppm (10⁻⁶) relative to the resonance frequency νref of a reference substance
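
That δ is independent of B0 can be verified with a short calculation. The sketch below assumes the common definition δ = (ν – νref) / νref in ppm and combines it with the shielding relation Beff = (1 – σ) B0; the two shielding constants are hypothetical values chosen only for illustration.

```python
import math

GAMMA_1H = 26.7519e7  # magnetogyric ratio of 1H [T^-1 s^-1], from the nuclide table


def resonance_frequency(gamma: float, b0: float, sigma: float = 0.0) -> float:
    """nu = |gamma| * (1 - sigma) * B0 / (2 * pi): the nucleus experiences B_eff."""
    return abs(gamma) * (1.0 - sigma) * b0 / (2.0 * math.pi)


def chemical_shift_ppm(nu_sample: float, nu_ref: float) -> float:
    """delta = (nu_sample - nu_ref) / nu_ref in ppm (assumed definition)."""
    return (nu_sample - nu_ref) / nu_ref * 1e6


SIGMA_REF = 30e-6     # hypothetical shielding of the reference substance
SIGMA_SAMPLE = 25e-6  # hypothetical shielding of the observed nucleus


def shift_at_field(b0: float) -> float:
    """Chemical shift of the sample nucleus measured at field strength b0."""
    nu_ref = resonance_frequency(GAMMA_1H, b0, SIGMA_REF)
    nu_sample = resonance_frequency(GAMMA_1H, b0, SIGMA_SAMPLE)
    return chemical_shift_ppm(nu_sample, nu_ref)


for b0 in (7.05, 18.8):
    print(b0, shift_at_field(b0))
```

The absolute frequencies differ strongly between the two instruments, but the ratio cancels B0, so both report the same δ.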

Reference Substances
• Reference substances should
  • Be chemically inert
  • Be easy to handle
  • Yield clear, intense signals
• Reference substance for 1H and 13C is often tetramethylsilane (TMS)
• All 1H and 13C nuclei in TMS are chemically equivalent, i.e., they result in only one peak each
[Figure: structure of TMS, a silicon atom bonded to four CH3 groups]

Structural Information in Spectra
• Chemical shift depends on
  • Topology (constitution)
  • Geometry (conformation)
• Certain experiments yield
  • Topological information (neighborhood)
  • Distance information (e.g., NOE constraints)
• In combination, these data can be used to deduce the structure of the protein

1H-NMR Spectrum of EtOH
[Figure: 1H-NMR spectrum of ethanol]

Scalar Coupling
• Spins interact with each other
  ⇒ Energy of one nucleus depends on another nucleus' spin state
  ⇒ Energy levels shifted
  ⇒ Resonance frequency shifted
• Coupling is mediated across bonds
  • 1J coupling – across one bond
  • 2J coupling – across two bonds (geminal)
  • 3J coupling – across three bonds (vicinal)
[Figure: molecule with 2J and 3J couplings marked]

Scalar Coupling
Example: spin system A–X
[Figure: energy-level diagram with transitions hνA1 and hνA2; the signal at νA is split into two lines νA1 and νA2 separated by J]

Influence of Structure on δ
• Chemical shift is caused by changes in electron density
• Electron density is influenced by:
  • Topology
    • Directly neighboring atoms (+I, –I effect, …)
    • Implicitly given by the type of the amino acid
  • Geometry
    • Charges in the vicinity (electrostatics)
    • Aromatic systems (ring current effect)

Random Coil Shift
• Nuclei in a similar environment, i.e., with identical neighboring atoms, have similar chemical shifts
• Differences in conformation can cause differences in the shift
• Random coil shifts are the shifts of amino acid atoms in a random coil, a peptide without explicit secondary structure

Chemical Shifts of Amino Acids
[Figure: chemical shifts of amino acids]

1H-NMR Spectrum of a Protein
[Figure: 1H-NMR spectrum of a protein]

2D NMR Spectrum
• Peaks on the diagonal correspond to the shifts in the 1D spectrum
• Cross peaks (off-diagonal) are caused by transfer of magnetization between two nuclei, i.e., interaction between these nuclei
• It usually implies "closeness" of these nuclei
[Figure: schematic 2D spectrum with axes δ1 and δ2, diagonal peaks at δA and δB and cross peaks between them]

(H,H)-COSY
• COrrelated SpectroscopY – magnetization is transferred along bonds
• Cross peaks occur between nuclei separated by two or three bonds

(H,H)-COSY
• COSY shows characteristic patterns for certain amino acids
• This allows the assignment of peaks to certain amino acids and thus their identification in a spectrum
[Figure: (H,H)-COSY spectrum]

Structure Elucidation with NMR
1. Multiple NMR experiments
2. Determination of
   • Coupling constants (yields backbone torsion angles)
   • NOE distances (yields interatomic distances)
   • Hydrogen bond patterns
3. Modeling of the structure consistent with these structural constraints

Comparison XRD – NMR
XRD                            NMR
• Also for large proteins      • < 30 kDa
• Requires crystals            • From solution
• Hydrogen atoms invisible     • Hydrogen atoms are essential
• Unlabeled protein            • Isotope-labeled protein required
• Higher spatial resolution    • Information on flexibility

Databases – PDB
PDB (Protein Data Bank) – http://www.rcsb.org
• Database for biomolecular structures
• Maintained by the RCSB (Research Collaboratory for Structural Bioinformatics)
• Deposition of structures in the PDB is a prerequisite for the publication of the structure in a journal
• Each structure is given a unique identifier (PDB ID)
  • 4 characters
  • 1st character – version
  • 2nd – 4th character – structure ID
• Example:
  • 2PTI, 3PTI, 4PTI are different structures of protein BPTI
  • 2PTI: 1973, 3PTI: 1976, 4PTI: 1983

PDB – Growth
[Figure: bar chart of yearly growth and total number of structures in the PDB, 1972 to 2012, y-axis from 0 to 90,000. Data from: http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=total&seqid=100. Data as of 11.04.2012]

PDB – Statistics
        Proteins   Protein-NA complexes   Nucleic acids   Total
XRD     66,098     3,266                  1,348           70,714
NMR     8,190      186                    979             9,362
Total   74,756     3,575                  2,356           80,710
(http://www.rcsb.org, data as of 11.04.2012)

PDB – The First Entry!
HEADER    OXYGEN STORAGE                     05-APR-73   1MBN                 1MBNH   1
COMPND    MYOGLOBIN (FERRIC IRON - METMYOGLOBIN)                              1MBN    4
SOURCE    SPERM WHALE (PHYSETER CATODON)                                      1MBNM   1
AUTHOR    H.C.WATSON,J.C.KENDREW                                              1MBNG   1
[…]
REVDAT  20   27-OCT-83 1MBNS   1       REMARK                                 1MBNS   1
JRNL        AUTH   H.C.WATSON                                                 1MBNG   2
JRNL        TITL   THE STEREOCHEMISTRY OF THE PROTEIN MYOGLOBIN               1MBNG   3
JRNL        REF    PROG.STEREOCHEM.              V.   4   299 1969            1MBNG   4
JRNL        REFN   ASTM PRSTAP  US ISSN 0079-6808                  419        1MBNG   5
[…]
SEQRES   1  153  VAL LEU SER GLU GLY GLU TRP GLN LEU VAL LEU HIS VAL          1MBN   39
[…]
HET    HEM      1      44     PROTOPORPHYRIN IX WITH FE(OH), FERRIC           1MBND  10
FORMUL   2  HEM    C34 H32 N4 O4 FE1 +++ .                                    1MBNG  25
FORMUL   2  HEM    H1 O1                                                      1MBNG  26
HELIX    1   A SER      3  GLU     18  1     N=3.63,PHI=1.73,H=1.50           1MBN   52
[…]
TURN     1 CD1 PHE     43  PHE     46        BETW C/D HELICES IMM PREC CD2    1MBN   60
[…]
ATOM      1  N   VAL     1      -2.900  17.600  15.500  1.00  0.00     2      1MBN   72
ATOM      2  CA  VAL     1      -3.600  16.400  15.300  1.00  0.00     2      1MBN   73
ATOM      3  C   VAL     1      -3.000  15.300  16.200  1.00  0.00     2      1MBN   74
ATOM      4  O   VAL     1      -3.700  14.700  17.000  1.00  0.00     2      1MBN   75
ATOM      5  CB  VAL     1      -3.500  16.000  13.800  1.00  0.00     2      1MBN   76
ATOM      6  CG1 VAL     1      -2.100  15.700  13.300  1.00  0.00     2      1MBNP   4
ATOM      7  CG2 VAL     1      -4.600  14.900  13.400  1.00  0.00     2      1MBNL   8
ATOM      8  N   LEU     2      -1.700  15.100  16.000  1.00  0.00     1      1MBN   79
ATOM      9  CA  LEU     2       -.900  14.100  16.700  1.00  0.00            1MBN   80
ATOM     10  C   LEU     2      -1.000  13.900  18.300  1.00  0.00            1MBN   81
ATOM     11  O   LEU     2       -.900  14.900  19.000  1.00  0.00            1MBN   82
ATOM     12  CB  LEU     2        .600  14.200  16.500  1.00  0.00            1MBN   83
ATOM     13  CG  LEU     2       1.100  14.300  15.100  1.00  0.00     1      1MBN   84
ATOM     14  CD1 LEU     2        .400  15.500  14.400  1.00  0.00     1      1MBNL   9
[…]
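
ATOM records like the ones above are fixed-column text and can be read by slicing. The sketch below assumes the column layout of the current PDB format (serial in columns 7 to 11, coordinates in columns 31 to 54, and so on), which differs in details from the 1973 layout shown on the slide. The helper `format_atom` and the chain identifier "A" are illustrative additions; the coordinates are those of the first two atoms of 1MBN.

```python
import math


def format_atom(serial, name, res_name, chain, res_seq, x, y, z, occ, b):
    """Build an ATOM line in the (assumed) modern fixed-column layout."""
    return (
        f"{'ATOM':<6}{serial:>5} {name:<4} {res_name:>3} {chain:1}{res_seq:>4}"
        f"    {x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}"
    )


def parse_atom(line):
    """Extract the main fields of an ATOM line by column position."""
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
    }


lines = [
    format_atom(1, "N", "VAL", "A", 1, -2.900, 17.600, 15.500, 1.00, 0.00),
    format_atom(2, "CA", "VAL", "A", 1, -3.600, 16.400, 15.300, 1.00, 0.00),
]
atoms = [parse_atom(line) for line in lines]

# Distance between the backbone N and CA atoms of the first residue, in Å.
distance = math.dist(atoms[0]["xyz"], atoms[1]["xyz"])
print(f"N-CA distance: {distance:.3f} Å")
```

Slicing by column is preferable to splitting on whitespace, because adjacent numeric fields in PDB files are allowed to touch without a separating space.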

References + Materials
Structure elucidation in general
• B. Nölting, Methods in Modern Biophysics, Springer, Berlin
• Branden, Tooze, Introduction to Protein Structure, Garland, New York, 1999
• R. Cotterill, Biophysics – An Introduction, Wiley, West Sussex, 2002
• T. Creighton, Proteins – Structures and Molecular Properties, Freeman, 2nd ed., 1992
X-ray diffraction
• T. L. Blundell and L. N. Johnson, Protein Crystallography, Academic Press, New York, 1976
• G. Rhodes, Crystallography Made Crystal Clear, Elsevier, 1999
NMR
• H. Günther, NMR-Spektroskopie, Thieme, Stuttgart
• H. Friebolin, Basic One- and Two-Dimensional NMR Spectroscopy, VCH, Weinheim
• K. Wüthrich, NMR of Proteins and Nucleic Acids, John Wiley and Sons, 1986
• J. Cavanagh, W. J. Fairbrother, A. G. Palmer, and N. J. Skelton, Protein NMR Spectroscopy: Principles and Practice, Academic Press Inc., San Diego, 1996