Lecture 1

Proteomics
George Tsaprailis, Director
Linda Breci, Associate Director
Arizona Proteomics Consortium
University of Arizona
What is proteomics and how is mass spectrometry used in proteomics?
Proteomics: the study of the
Proteome
A collection of proteins, usually comprising
a biological system
Important because (1) proteins perform most cellular
functions, (2) proteins are the major elements of
most cellular structures, and (3) proteins are targets
of drugs/toxicants
Why Proteomics?
Mass Spectrometry
Protein ID
Proteomics involves
• Protein Chemistry
• Mass Spectrometry
• Computing (+ Bioinformatics)
Protein Chemistry
• Sample isolation/clean-up
• Sample purification
Protein fractions → digest → peptides
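Key to Proteomics is to obtain peptide masses and/or sequences
Mass Spectrometry
[Diagram: two routes from a protein mixture to peptides for MS analysis]
• Protein mixture → separation → Proteins → digestion → Peptides
• Protein mixture → digestion → Peptide mixture → separation → Peptides
Peptides → MS analysis (MS, MS/MS) → MS data
All types of hardware used in proteomics
Mass Spectrometry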
Computing (+ Bioinformatics)
The Proteomic Approach
[Diagram: workflow in three stages]
• Protein Chemistry: Sample (1D PAGE, 2D PAGE, protein(s) solution, HPLC fractions, IP eluent) → pre-prep steps → Protein → Digest → Peptides
• Mass Spectrometry: Mass Spectrometer (ESI LC-MS/MS or MALDI MS)
• Computing (+ Bioinformatics): Protein ID + Informatics
Mass Spectrometry
• What is a mass spectrometer and what does it
measure?
– An instrument that makes ions
– Measures the mass/charge (m/z) of ions
• Mass Spectrometry in proteomics
– For proteins and peptides
• whole protein mass measurements
• protein identification based on peptide mass measurement
• protein identification based on peptide structure analysis
(fragmentation)
• Need to know some basic principles
Protein/peptide relationship
Enzyme
Protein
Peptides
Making ions
[Figure: chemical structure of the peptide Ala-Leu-Phe-Lys, with added protons (H+) shown on the N-terminal amine and on the lysine side-chain amine]
Ala-Leu-Phe-Lys mass of neutral = 477.3
Ala-Leu-Phe-Lys m/z of singly charged = 478.3
Ala-Leu-Phe-Lys m/z of doubly charged = 239.6
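The three values above can be reproduced with a short calculation. This is a minimal sketch, assuming standard monoisotopic residue masses and a proton mass of 1.00728 Da; the function names are illustrative and not part of the lecture.

```python
# Monoisotopic residue masses (Da) for the four residues in the example.
RESIDUE_MASS = {
    "A": 71.03711,   # Ala
    "L": 113.08406,  # Leu
    "F": 147.06841,  # Phe
    "K": 128.09496,  # Lys
}
WATER = 18.01056   # H2O completing the free N- and C-termini
PROTON = 1.00728   # mass of one H+ (assumed value)


def neutral_mass(sequence):
    """Monoisotopic mass of the uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(mass, charge):
    """m/z of the ion [M + zH]z+ for a neutral mass M."""
    return (mass + charge * PROTON) / charge


m = neutral_mass("ALFK")
print(f"neutral  M        = {m:.2f}")
print(f"[M+H]+   m/z      = {mz(m, 1):.2f}")
print(f"[M+2H]2+ m/z      = {mz(m, 2):.2f}")
```

With these inputs the doubly charged ion comes out near 239.65, which the slide reports as 239.6.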
Making ions
Ions are made in an ion source
Important methods in Proteomics:
1) MALDI (matrix-assisted laser desorption/ionization)
2) ESI (electrospray ionization)
Electrospray Ionization (ESI)
[Figure: ESI mass spectrum, relative intensity vs. m/z (1000 to 2000), showing a series of multiply charged peaks: +13 at 1101.40, +12 at 1193.20, +11 at 1301.53, +10 at 1431.47, +9 at 1590.33, +8 at 1789.00. Inset: calculated mass spectrum with a peak at 14306.0]
Matrix Assisted Laser Desorption Ionization (MALDI)
[Figure: MALDI mass spectrum, intensity vs. m/z (0 to 40000), with peaks labeled [M+2H]2+ 7157.18, [M+H]+ 14318.68 and [2M+H]+ 28,638.7]
Analyzing ions
The ion source is coupled to the analyzer
Important analyzers in Proteomics:
1) TOF (time of flight)
2) Ion Trap
Matrix Assisted Laser Desorption (MALDI) (ion source)
Time of Flight (analyzer)
[Diagram: pulsed laser light strikes the sample and matrix on the tip of a solid probe; the resulting ion beam travels through the flight tube analyzer to the detector]
Time of Flight (TOF)
Linear Mode:
better sensitivity
poor resolution
Reflectron Mode:
less sensitivity
higher resolution
http://www.abrf.org/ABRFNews/1997/June1997/jun97lennon.html
MALDI-TOF spectrum (mix of peptides)
[Figure: MALDI-TOF reflectron spectrum of a peptide mixture, m/z 500 to 2500]
MALDI Reflectron Spectrum of ACTH
[Figure: M@LDI TOF LD+ spectrum acquired 13-Nov-2003, m/z 2444 to 2482, showing resolved isotope peaks at 2466.324, 2467.344, 2468.330, 2469.316, 2470.337 and 2471.290, plus a minor peak at 2451.525]
Electrospray (ESI) (ion source)
Ion Trap (analyzer)
[Diagram: liquid sample from the HPLC is sprayed from a needle or capillary held at 4500 V; dry gas or heat desolvates the charged droplets, and the resulting ion beam passes into the ion trap analyzer and on to the detector]
ESI-Ion Trap Spectrum
Expected m/z values for M = 951.4:
[M+H]+ = 951.4 + 1 = 952.4
[M+2H]2+ = (951.4 + 2) / 2 = 476.7
[2M+H]+ = (951.4 x 2) + 1 = 1903.8
[Figure: ESI-ion trap spectrum, relative intensity vs. m/z (200 to 2000), with observed peaks [M+2H]2+ at 476.9, [M+H]+ at 952.4 and [2M+H]+ at 1903.4]
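The same relationships can be run in reverse, going from each observed peak back to the neutral mass M. The sketch below is illustrative only: it assumes a proton mass of 1.00728 Da (the slide rounds this to 1), and the function name and peak list are made up for this example from the labels in the spectrum.

```python
PROTON = 1.00728  # mass of one H+ in Da (assumed value; the slide uses 1)


def neutral_mass(observed_mz, charge, n_molecules=1):
    """Neutral mass M from an ion [nM + zH]z+ observed at observed_mz."""
    return (observed_mz * charge - charge * PROTON) / n_molecules


# (label, observed m/z, charge z, molecules per ion n), read from the spectrum
peaks = [
    ("[M+H]+", 952.4, 1, 1),
    ("[M+2H]2+", 476.9, 2, 1),
    ("[2M+H]+", 1903.4, 1, 2),
]

estimates = {label: neutral_mass(mz, z, n) for label, mz, z, n in peaks}
for label, mass in estimates.items():
    print(f"{label:9s} -> M = {mass:.1f}")
```

All three ions should point to nearly the same M; the spread between them gives a feel for the mass accuracy of the measurement.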
View an ion trap animation
• Exercise 1
Resolution and mass accuracy vary by instrument
INSTRUMENT       MASS RANGE (m/z)   Resolution (at m/z 1,000)   Accuracy (Error) (at m/z 1,000)
LCQ (Ion Trap)   to 2,000           2,000 (full scan)           0.03% (300 ppm)
                                    10,000 (zoom scan)
MALDI/TOF        to 400,000         15,000 (reflectron)         0.006% (60 ppm) Ext. Cal.
                                                                0.003% (30 ppm) Int. Cal.
FTICR            to 4,000           500,000                     0.0001% (1 ppm)
$$\mathrm{ppm} = \frac{\mathrm{TheoreticalMW} - \mathrm{MeasuredMW}}{\mathrm{TheoreticalMW}} \times 10^{6}$$
[Figure: peak shapes at resolutions of 30,000, 10,000, 3,000 and 1,000. Source: http://www.matrixscience.com]
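To get a feel for what the table's numbers mean at m/z 1,000, the sketch below converts a ppm error into an absolute error in m/z units and a resolution into an approximate peak width. It assumes the common definition of resolution as m/Δm, which the slide does not state explicitly; the function names are illustrative.

```python
def ppm_error(theoretical, measured):
    """Relative mass error in parts per million, as defined in the slide."""
    return (theoretical - measured) / theoretical * 1e6


def ppm_to_absolute(mz, ppm):
    """Absolute error (in m/z units) that corresponds to a ppm error at mz."""
    return mz * ppm / 1e6


def peak_width(mz, resolution):
    """Peak width delta-m, assuming resolution is defined as m / delta-m."""
    return mz / resolution


# (instrument, resolution, accuracy in ppm) taken from the table, at m/z 1,000
instruments = [
    ("LCQ ion trap, full scan", 2_000, 300),
    ("MALDI/TOF reflectron, ext. cal.", 15_000, 60),
    ("FTICR", 500_000, 1),
]

for name, resolution, ppm in instruments:
    width = peak_width(1000.0, resolution)
    error = ppm_to_absolute(1000.0, ppm)
    print(f"{name:32s} peak width ~{width:.4f}  error ~{error:.4f}")
```

Under these assumptions an ion trap full scan cannot separate isotope peaks of highly charged ions, whereas an FTICR measurement at 1 ppm pins a mass near 1,000 down to about 0.001.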
MALDI Reflectron Spectrum of ACTH
[Figure: MALDI reflectron (TOF LD+) spectrum of ACTH, m/z 2444 to 2482, with resolved isotope peaks from 2466.324 to 2471.290]
You must know the resolution of your
instrument to analyze the data!
– We need to know the possible error in the
measurement
– Is the peak monoisotopic?
– Is the peak average?
Analysis of whole proteins
by MALDI-TOF and ESI-Ion trap
• MALDI-TOF = measure with 1 or 2 protons
– large molecules like Proteins require Linear mode
(much lower resolution)
• ESI-Ion Trap = measure with many protons (high
charge state)
– mass of the protein can be calculated from the
multiply charged peaks
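The calculation from multiply charged peaks can be sketched as follows. If two neighbouring peaks differ by exactly one proton, the charge of the higher-m/z peak is z = (m_low − H) / (m_high − m_low), and the neutral mass is M = z (m_high − H). The peak list is taken from the charge-state labels in this lecture's ESI spectrum; the proton mass and function names are assumptions for illustration.

```python
PROTON = 1.00728  # mass of one H+ in Da (assumed value)


def charge_of_higher_mz_peak(mz_high, mz_low):
    """Charge z of the peak at mz_high, assuming mz_low carries z + 1 protons."""
    return round((mz_low - PROTON) / (mz_high - mz_low))


def neutral_mass(mz, z):
    """Neutral mass M of an ion [M + zH]z+ observed at mz."""
    return z * (mz - PROTON)


# m/z of adjacent multiply charged peaks, highest m/z first
peaks = [1789.00, 1590.33, 1431.47, 1301.53, 1193.20, 1101.40]

charges = []
masses = []
for mz_high, mz_low in zip(peaks, peaks[1:]):
    z = charge_of_higher_mz_peak(mz_high, mz_low)
    charges.append(z)
    masses.append(neutral_mass(mz_high, z))
    print(f"m/z {mz_high:8.2f}  z = +{z:<2d}  M = {masses[-1]:.1f}")

average_mass = sum(masses) / len(masses)
print(f"average M = {average_mass:.1f}")
```

Each pair of neighbouring peaks gives an independent estimate of M, so averaging them reduces the effect of the error in any single peak position.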
Mass Spec measures isotopes
Excel calculated example: Carbon (12C) is 12.000
For every 12C there is 1.1% 13C
Isotopes add up:
• 10 carbons = 11% 13C peak
• for 100 carbons, the 13C peak is larger than the 12C peak
Relative peak heights (%), where peak 1 is all 12C and each later peak has one more 13C:
Peak           1      2      3      4      5     6     7
10 carbons     100    10.8   0.5
100 carbons    92.5   100    53.5   18.8   5     1     0.2
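The carbon-only isotope pattern follows a binomial distribution, so the Excel example can be approximated in a few lines. This is a sketch that assumes a natural 13C abundance of 1.07% (the slide rounds this to 1.1%) and ignores every element other than carbon; the function name is illustrative.

```python
from math import comb

P_13C = 0.0107  # assumed natural abundance of 13C


def carbon_isotope_pattern(n_carbons, n_peaks=7):
    """Relative heights (%) of the first n_peaks isotope peaks.

    Peak k is the probability that exactly k of the n_carbons atoms
    are 13C, scaled so that the tallest peak equals 100.
    """
    probs = [
        comb(n_carbons, k) * P_13C**k * (1 - P_13C) ** (n_carbons - k)
        for k in range(n_peaks)
    ]
    top = max(probs)
    return [100 * (p / top) for p in probs]


pattern_10 = carbon_isotope_pattern(10)
pattern_100 = carbon_isotope_pattern(100)
print("10 carbons: ", [round(x, 1) for x in pattern_10])
print("100 carbons:", [round(x, 1) for x in pattern_100])
```

For 100 carbons the tallest peak is the one containing a single 13C, which is the point the slide makes: as molecules grow, the monoisotopic peak is no longer the largest.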
Proteins have very large isotope widths
Theoretical isotope distribution of lysozyme
1st isotope: 14304.885
9th isotope: 14313.906
Isotope #   m/z          % Maximum
0           14304.885      0.2
1           14305.888      1.2
2           14306.891      4.6
3           14307.893     12.8
4           14308.896     26.9
5           14309.898     46.3
6           14310.900     67.6
7           14311.902     86.3
8           14312.904     97.7
9           14313.906    100.0
10          14314.908     93.5
11          14315.910     80.4
12          14316.912     64.2
13          14317.914     47.8
14          14318.916     33.4
15          14319.918     21.8
16          14320.920     13.2
17          14321.922      7.5
18          14322.924      3.9
19          14323.925      1.8
20          14324.927      0.7
21          14325.929      0.2
Lysozyme by MALDI/TOF
Average mass = 14,314
[Figure: MALDI/TOF spectrum of lysozyme, intensity vs. m/z (0 to 40000), with peaks labeled [M+2H]2+ 7157.18, [M+H]+ 14316.24 (also labeled 14318.68) and [2M+H]+ 28638.68]
Lysozyme by ESI-Ion Trap
Average mass = 14,314
[Figure: ESI-ion trap spectrum of lysozyme, relative intensity vs. m/z (1000 to 2000). Inset: calculated mass spectrum (0 to 15000) with peaks labeled 14306.0 and 14318.2]
Charge state   m/z
+8             1789.00
+9             1590.33
+10            1431.47
+11            1301.53
+12            1193.20
+13            1101.40
So we’ve made measurements
Now What?
• A lot of information is available on-line about
proteins and/or the gene
• We will explore protein information in general
• We will then use the available info to perform
data analysis
MALDI-TOF analysis of Alkaline phosphatase – Computer Exercise #2
[Figure: MALDI-TOF spectrum of alkaline phosphatase, intensity vs. m/z (0 to 100000), with peaks labeled [M+2H]2+ 23445.54, [M+H]+ 47155.22 and [2M+H]+ 94472.97]
Protein identification – Two strategies
• single stage mass spectrometry (MS)
– called “peptide mass mapping”
– measure all peptides in one spectrum
– MALDI-TOF
– produces low confidence results
• tandem mass spectrometry (MS/MS)
– measure peptides as they elute from an HPLC
– ESI-Ion Trap
– produces high confidence results
Single Stage – Peptide mass mapping
Using MALDI-TOF
MALDI-TOF Spectrum of tryptic digest
[Figure: MALDI-TOF reflectron spectrum of a tryptic digest, m/z 500 to 2500]
MALDI Reflectron Spectrum of ACTH
[Figure: MALDI reflectron (TOF LD+) spectrum of ACTH, m/z 2444 to 2482, with resolved isotope peaks from 2466.324 to 2471.290]
Data Analysis for peptide mass mapping
[Diagram: protein → peptides → MS → identify (?)]
For example: Measured Peptide = 1274.5183
[Table: database search output ranking candidate sequences (NDALYFPT..., SWDLTAL..., PTDLDVSY...) by how well the MS peptide MW matches peptides found in the selected databases]
• Important data
– multiple peaks
– mass accuracy
– confirming information (pI, approx. mass, organism, etc.)
Data Analysis for peptide mass mapping
>gi|27807105|ref|NP_777037.1| solute carrier family 6 (neurotransmitter transporter,
glycine), member 9 [Bos taurus] gi|1279843|gb|AAB01159.1| glycine transporter
MAAAQGPVAPSKLEQNGAVPSEATKSDQNLGQGNWRNQIEFVLTSVGYAVGLGNV
WRFPYLCYRNGGGAFMFPYFIMLIFCGIPLFFMELSFGQFASQGCLGVWRISPMFK
GVGYGMMVVSTYIGIYYNVVICIAFYYFFSSMTPVLPWTYCNNPWNTPDCMSVLDN
PNITNGSQPPALPGNVSQALNQTLKRTSPSEEYWRLYVLKLSDDIGNFGEVRLPLLG
CLGVSWVVVFLCLIRGVKSSGKVVYFTATFPYVVLTILFIRGVTLEGAFTGIMYYLTPQ
WDKILEAKVWGDAASQIFYSLGCAWGGLVTMASYNKFHNNCYRDSVIISITNCATSV
YAGFVIFSILGFMANHLGVDVSRVADHGPGLAFVAYPEALTLLPISPLWSLLFFFMLILL
GLGTQFCLLETLVTAIVDEVGNEWILQKKTYVTLGVAVAGFLLGIPLTSQAGIYWLLLM
DNYAASFSLVIISCIMCVSIMYIYGHQNYFQDIQMMLGFPPPLFFQICWRFVSPAIIFFIL
IFSVIQYQPITYNQYQSSQTGLPLFTCQIAPAHVPQPLSGARTPSPKPWSVRVSVLRA
PLCSDSPGRAASNPL
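Measured Peptide = 1274.5183
Theoretical peptide masses from the start of the sequence:
MAAAQGPVAPSK = 1127.5883
LEQNGAVPSEATK = 1343.6807
SDQNLGQGNWR = 1274.5878 (closest match)
1274.5878 theoretical
1274.5183 measured
0.0695 difference
error = 0.0695 / 1274.5878 x 10^6 ≈ 55 ppm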
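The matching step above can be sketched in code: digest the sequence in silico, compute a theoretical mass for each peptide, and report the ppm error of the closest match. Several assumptions are made here. Cleavage follows the usual trypsin rule (after K or R, not before P), which is consistent with the three peptides listed. The slide's theoretical values are reproduced when one hydrogen mass (1.00783) is added to the neutral monoisotopic mass, so that convention is used. All function names are hypothetical.

```python
import re

# Monoisotopic residue masses (Da)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
HYDROGEN = 1.00783  # assumed: matches the convention of the slide's values


def tryptic_peptides(protein):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass plus one hydrogen."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + HYDROGEN


def ppm_error(theoretical, measured):
    return (theoretical - measured) / theoretical * 1e6


# First 36 residues of the sequence shown above
n_terminus = "MAAAQGPVAPSKLEQNGAVPSEATKSDQNLGQGNWR"
measured = 1274.5183

peptides = tryptic_peptides(n_terminus)
best = min(peptides, key=lambda p: abs(peptide_mass(p) - measured))
best_ppm = ppm_error(peptide_mass(best), measured)

for p in peptides:
    print(f"{p:15s} {peptide_mass(p):10.4f}")
print(f"closest match: {best} ({best_ppm:.0f} ppm)")
```

A real search engine repeats this comparison for every protein in the database and scores how many measured peaks fall within the instrument's mass tolerance.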
Computer Exercise #4
Analyze peptide mass mapping data
• 4 lists of peptide masses provided on
worksheet
– (Alternate address of excel data):
http://www.chem.arizona.edu/facilities/msf/index.html
Problems with whole protein analysis
• Peaks are broad
– large groups of isotope peaks
– peaks further broadened by adducts (contaminants, salts)
• Proteins are often modified
– Instrument may not resolve the mass difference
– No information regarding which amino acid is modified
• Proteins are in a complex matrix
– background stuff
– other proteins (too complex!!!)
Therefore proteins are identified from peptides!
How are proteins separated?
• Proteins from biological organisms are a complex mixture
• Separating proteins
– 1D SDS-PAGE
• Degree of cross-linking controls the MW range separated
• Low resolution technique; a band can contain tens to hundreds of proteins
– 2D SDS-PAGE
• Best for complex protein mixtures (IEF + SDS-PAGE)
• Other methods
– Chromatography (reverse phase, size exclusion, ion exchange, affinity)
– Preparative isoelectric focusing (IEF)
1D Electrophoresis
Great clean-up tool (rid of salts, detergents, etc…)
Great concentration tool
Biological analytes
Various stains available – various detection limits
USE PRECAST GELS (polymer issue) if possible
Various size gels (spatial resolution)
Various MW ranges
Protein mixture or IP eluent → 1D SDS-PAGE
1D Electrophoresis
http://www.biorad.com
2D Electrophoresis
(1) Isoelectric focusing: separation on the basis of intrinsic charge (isoelectric point, pI)
(2) SDS-PAGE (SDS gel electrophoresis): separation on the basis of size
2D Electrophoresis
Protein Mixture or IP eluant
or Cell/tissue
2D SDS-PAGE
Great clean-up tool (rid of salts, detergents, etc…)
Various stains available – various detection limits
Protein profiling
Various pH ranges
2D gels are very much sample dependent (the sample may require further clean-up prior to the 2D gel)
Avoid excess salts in the sample (proteins will not focus and IPG strips burn; 30-40 mM salt maximum)
Often Automated w/ robotics–high throughput (MALDI-TOF)
Often good for visualizing PTMs
The 1st D: Isoelectric Focusing
[Figure: four panels showing proteins in a pH 3 to pH 10 gradient between the – and + electrodes, migrating until they focus at their isoelectric point (example shown at pH 7.5)]
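The 2nd D: SDS-PAGE
[Figure: the focused IEF strip (pH 3 to pH 10) is placed along the top of an SDS gel and proteins migrate from the – electrode toward the + electrode, so that one axis of the gel separates by charge and the other by size]
Proteins migrate through the gel at a rate inversely related to their size
Smallest proteins travel the furthest distance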
• Do Computer Exercise #3
• Laboratory tour