Database search algorithms

• Sequest
– Written by Jimmy Eng/John Yates
• Mascot
– Core written by David Perkins and Darryl Pappin
• X!Tandem
– Written by Ron Beavis
David Perkins (left) and Darryl Pappin (right) outside ICRF in May 1999
Database Search
• Computer scans database for sequences that match precursor ion mass (candidates such as AGLSQ…, VKERG…, LHYRA…)
• Computer predicts MS/MS spectra for the candidate sequences
• Computer compares predicted spectra to the acquired MS/MS spectrum
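The three database search steps (filter by precursor mass, predict fragments, compare with the acquired spectrum) can be sketched as below. This is a minimal illustration, not the scoring used by Sequest, Mascot, or X!Tandem: the function names, the tolerances, and the simple shared-peak count are assumptions chosen for clarity, and only singly charged b- and y-ions are predicted.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # Da
PROTON = 1.00728   # Da


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def predict_fragments(seq):
    """Predicted m/z values of singly charged b- and y-ions."""
    b_ions, y_ions = [], []
    running = 0.0
    for aa in seq[:-1]:              # b1 .. b(n-1), from the N-terminus
        running += RESIDUE_MASS[aa]
        b_ions.append(running + PROTON)
    running = 0.0
    for aa in reversed(seq[1:]):     # y1 .. y(n-1), from the C-terminus
        running += RESIDUE_MASS[aa]
        y_ions.append(running + WATER + PROTON)
    return b_ions + y_ions


def shared_peak_count(predicted, observed, tol=0.5):
    """Number of predicted peaks with an observed peak within tol (m/z)."""
    return sum(any(abs(p - o) <= tol for o in observed) for p in predicted)


def search(database, precursor_mass, observed_peaks, mass_tol=1.0, frag_tol=0.5):
    """Rank candidate sequences whose mass matches the precursor."""
    hits = []
    for seq in database:
        # Step 1: keep only sequences that match the precursor ion mass.
        if abs(peptide_mass(seq) - precursor_mass) > mass_tol:
            continue
        # Steps 2 and 3: predict the spectrum and compare it to the acquired one.
        score = shared_peak_count(predict_fragments(seq), observed_peaks, frag_tol)
        hits.append((score, seq))
    return sorted(hits, reverse=True)


# Hypothetical three-peptide database; the "acquired" spectrum is synthetic.
database = ["HLVDEPQNLIK", "LSKELQAAQAR", "GASPVTK"]
observed = predict_fragments("HLVDEPQNLIK")
hits = search(database, peptide_mass("HLVDEPQNLIK"), observed)
```

A real search engine would also digest protein sequences in silico, handle modifications and multiple charge states, and use a statistically grounded score rather than a raw peak count.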
MS/MS Spectrum Quality
Cross-Correlation Score (xC):
Indicates the quality of the match between the acquired MS/MS spectrum and the algorithm's predicted spectrum
xC is influenced by:
• the coverage of fragment ions (i.e., b-, y-, and a-type ions)
• the presence of unassigned ions
• the relative intensities of fragment ions
xC criteria (minimum xC by charge state):
• +1 ions: xC ≥ 1.90
• +2 ions: xC ≥ 2.20
• +3 ions: xC ≥ 3.75
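The xC criteria above amount to a charge-dependent cutoff. A minimal sketch of applying them follows; the function and dictionary names are illustrative, and the behavior for charge states not listed on the slide (an error) is an assumption.

```python
# Minimum xC per charge state, taken from the criteria listed above.
XC_MIN_BY_CHARGE = {1: 1.90, 2: 2.20, 3: 3.75}


def passes_xc(charge, xc):
    """Return True if xc meets the minimum for the given charge state."""
    threshold = XC_MIN_BY_CHARGE.get(charge)
    if threshold is None:
        # Assumption: charge states without a listed criterion are rejected loudly.
        raise ValueError(f"no xC criterion defined for charge +{charge}")
    return xc >= threshold
```

For example, an xC of 3.60 passes for a +2 ion but would fall short of the stricter cutoff for a +3 ion.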
Elucidated tandem spectrum of HLVDEPQNLIK
TIC = 2.5 × 10^5
xC = 3.60
• Go to Mascot
[Figure: full-scan MS spectrum (A_PRIM_CHL_02 #492, RT 30.25 min, + c ESI Full ms, m/z 150.00-2000.00, NL 2.78E6) and the MS/MS spectrum from the following scan (#493, RT 30.30 min, + c d Full ms2, m/z 210.00-1635.00, NL 1.39E5); axes: relative abundance vs. m/z]
[Figure: chromatogram of the full scans; axes: relative abundance vs. time (min)]
MS and MS/MS Spectrum
[Figure: MS spectrum at 32.2 min (#1730) and two MS/MS spectra; axes: intensity vs. m/z]
Data-Dependent Scanning
[Figure: scan timeline alternating MS full scans with data-dependent MS/MS spectra (1st MS/MS spectrum, 2nd MS/MS spectrum)]
Data-Dependent Duty Cycle
Experimental parameters that influence duty cycle length:
• Number of full scan microscans
• Number of MS/MS microscans
• Ion trap injection times
– Abundance of ions
Data-Dependent Duty Cycle
The instrumental processing time required to acquire one
analytical full scan and one analytical MS/MS scan.
Duty cycle length by number of microscans:

Full scan       MS/MS microscans
microscans      1         5         9
1               1.9 s     5.5 s     10.1 s
5               4.9 s     8.4 s     12.6 s
9               7.7 s     10.1 s    15.4 s

Approximately:
• 0.7 s per full scan microscan
• 1.0 s per MS/MS microscan
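The approximate per-microscan times suggest a simple linear estimate of the duty cycle. The sketch below is an illustration under that assumption, not an instrument model: it ignores fixed overhead and injection-time effects. It also assumes the table rows are full scan microscans and the columns are MS/MS microscans.

```python
FULL_SCAN_MICROSCAN_S = 0.7   # approximate time per full scan microscan
MSMS_MICROSCAN_S = 1.0        # approximate time per MS/MS microscan


def estimate_duty_cycle(n_full, n_msms):
    """Linear estimate of the duty cycle length in seconds."""
    return FULL_SCAN_MICROSCAN_S * n_full + MSMS_MICROSCAN_S * n_msms


# Tabulated values, keyed as (full scan microscans, MS/MS microscans).
# The row/column assignment is an assumption about the table layout.
MEASURED_S = {
    (1, 1): 1.9, (1, 5): 5.5, (1, 9): 10.1,
    (5, 1): 4.9, (5, 5): 8.4, (5, 9): 12.6,
    (9, 1): 7.7, (9, 5): 10.1, (9, 9): 15.4,
}

for (n_full, n_msms), measured in sorted(MEASURED_S.items()):
    estimate = estimate_duty_cycle(n_full, n_msms)
    print(f"{n_full} full / {n_msms} MS/MS: "
          f"estimated {estimate:.1f} s, tabulated {measured:.1f} s")
```

The estimate tracks some entries closely and others only roughly, which is consistent with the slide calling the per-microscan times approximate.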
Chromatographic Peak Capacity
The maximum number of separated peaks
that can fit into the space provided by a
separation method.
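$$n_c = \frac{L}{w R_s} = \frac{L}{4\sigma R_s}$$
where w = 4σ is the peak width, L is the total separation window, and R_s is the resolution.
[Figure: chromatographic peak annotated with the peak width (w) and the separation window L]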
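As a worked example of the peak capacity relation, the sketch below evaluates n_c = L / (w R_s) directly. The 60 min separation window is a hypothetical value chosen for illustration; the 40 s peak width matches the average base width quoted elsewhere in these slides, and a resolution of 1.0 is an assumed default.

```python
def peak_capacity(window, peak_width, resolution=1.0):
    """Peak capacity n_c = L / (w * R_s); window and peak_width share units."""
    if window <= 0 or peak_width <= 0 or resolution <= 0:
        raise ValueError("window, peak_width and resolution must be positive")
    return window / (peak_width * resolution)


def peak_capacity_from_sigma(window, sigma, resolution=1.0):
    """Same relation, using w = 4 * sigma for the peak width."""
    return peak_capacity(window, 4.0 * sigma, resolution)


# Hypothetical 60 min (3600 s) window with 40 s wide peaks at R_s = 1.
n_c = peak_capacity(3600.0, 40.0)
```

Narrower peaks or a longer separation window raise the peak capacity, which is one motivation for coupling a second separation dimension.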
Duty Cycle vs. Peak Capacity
• Avg. duty cycle: ~10 s
• Avg. peak width: ~40 s (at base)
[Figure: chromatographic peak showing the intensity threshold and the points at which MS/MS spectra are acquired; axes: intensity vs. time]
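The relationship between duty cycle and peak width can be made concrete by counting how many complete duty cycles fit within one chromatographic peak. The sketch below is a rough upper bound under stated assumptions: it uses the full base width, whereas in practice only the portion of the peak above the intensity threshold triggers MS/MS. The function name is illustrative.

```python
import math


def msms_events_per_peak(peak_width_s, duty_cycle_s):
    """Number of complete duty cycles that fit within one peak width."""
    if peak_width_s <= 0 or duty_cycle_s <= 0:
        raise ValueError("peak width and duty cycle must be positive")
    return math.floor(peak_width_s / duty_cycle_s)


# Average values quoted above: ~40 s peak width, ~10 s duty cycle.
events = msms_events_per_peak(40.0, 10.0)
```

With these averages a peptide gets only about four MS/MS opportunities per peak, so lengthening the duty cycle with extra microscans directly reduces how many co-eluting peptides can be sampled.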
MS/MS Quality vs. Elution Time
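[Figure: chromatographic peak eluting between 25.0 and 27.0 min, annotated with a score for each MS/MS spectrum acquired across the peak (labeled values: 3.80, 3.91, 3.74, 3.74, 3.27, 2.65, 2.01, 1.35); axes: relative abundance vs. time (min)]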
2D HPLC MS/MS
• Couple orthogonal chromatographic systems together
– for example SCX and reversed phase (C18)
• PROs
– Excellent for complex mixture analysis
– Eliminates gel electrophoresis
– Minimizes sample preparation
– Can be quantitative using stable isotope labeling
• CONs
– Tedious to implement and maintain
– Uses a lot of instrument time
Note: 2D HPLC MS/MS is also referred to as MudPIT (multidimensional protein identification technology)
Column Packing
• Nano LC columns cost between $350 and $500
• One bad sample can completely destroy the column
• Alternative: pack your own
• Items you need: desire, a pressure bomb, fused silica (various sizes), column media, zero dead volume unions, frits, and a graduate student with good hands
• Major challenge: minimize post-column dead volume
2-D Capillary Packing Scheme
• SCX (strong cation exchange): Partisil SCX, 10 µm, 3-5 cm
• Reverse phase: Macrosphere, 5 µm, 15 cm
• 1000 psi helium
• Fused silica capillary with 2 µm frit
• Typical lab-packed 150 µm o.d. 2-phase (SCX/C18) column
• Tape marker for packing volume
• Special zero dead volume union (LC Packings) or machined ZVIXC union (VICI)
2-D Separation Scheme
[Figure: timeline of the SCX and C18 steps: sample loading, organic gradient, low salt, organic gradient, high salt, organic gradient; axis: time]
Typical 2-D Separation
[Figure: series of chromatograms (intensity vs. time), one per salt step: no salt, then 25 mM, 50 mM, 75 mM ammonium acetate, etc.]
Esquire 3000plus
LC Packings Ultimate
2-D LC-MS/MS Chromatograms Containing Apolipoprotein E Peptides
[Figure: chromatograms for the 15, 20, 30, 50, 100, and 150 mM ammonium acetate salt steps; annotated peptide LSKELQAAQAR, Xcorr: 3.39]