Novel method for estimating isotope incorporation using the ‘half-decimal place rule‘ Ingo Fetzer Department of Environmental Microbiology userR Conference 2009, Rennes Problem 1.2e+7 Intensity 1.0e+7 0% 50% 100% 8.0e+6 6.0e+6 4.0e+6 2.0e+6 0.0 835 840 845 850 855 860 865 870 875 880 mass Substrate fluxes in Procaryotes Function Activity Identitiy Interactions: Competition Mutualism Page 2 Goal Develop an algorithm for estimating 13C incorporation by using ‘half decimal place rule’ Page 3 ‘Half-decimal place rule’ (HDPR) Mann (1995) Decimal places 100% 0% m/z tryptic peptides Heliobacter pylori [Da] Schmidt et al 2003 Page 4 Outline 1. Peptide mass calculation for 12C and 13C 2. Estimation of 12C and 13C slopes (HDPR) 3. Estimation of relative 13C incorporation rates (of user data) implemented in ‘R‘ (R-project.org) Page 5 Flowchart Script 1 Peptide database criterions Peptide sequences AA molecular formula Atomic weights Peptide sequence masses Page 6 Peptide mass calculation for 12C and 13C Virtual digestion with MS-Digest dataset M. tuberculosis H37Rv amino acid sequences length 2 – 40 315,579 peptide fragments ChemScore ≥ 10 Missing cleavage = 0 Modifications = Null Mol. weight ≤ 5000 Da 90,637 peptide sequences Sanger Institute (ftp://ftp.sanger.ac.uk/pub/tb/sequences/TB.pep) Page 7 Peptide mass calculation for 12C and 13C Page 8 Peptide mass calculation for 12C and 13C Script 1: 1. Reduction of dataset (ChemScore, Modification etc.) 315,579 90,637 2 a. Peptide mass calculation Sequence in DB GAG + Molecular formula of AA G=C2H6NO2 Why? Page 9 A=C3H8NO2 Sum of C, H, N, O for each sequence C7 H20 N3 O6 Calculation of percentage 13C incorporation Peptide mass calculation for 12C and 13C 2b. Peptide mass calculation Sum of + Atomic weights C, H, N, O of each 12C = 12.000000 Da sequence 13C = 13.003355 Da C7 H20 N3 O6 N = 14.003074 Da O = 15.994915 Da H = 1.007825 Da Molecular weights of sequences (with decimal residuals) 12C=242.135212 Da 13C=249.158697 Da Page 10 Peptide mass calculation for 12C and 13C 2a. Peptide mass calculation + Atomic weights Count sum of C, H, N, O, S for each sequence 12C = 12.000000 Da = 13,003355 Da N = 14.003074 Da O = 15.994915 Da H = 1.007825 Da S = 31.972071 Da 13C f o s e s s a Molecular M 13 C peptides and H C R 1 NH3+ H O + C R 2 OH MW(H3 Page 11 H O C NH3+ O+) C R 1 OH = 19.01839 Da C NH3+ O Molecular weights of sequences 12 C (with decimal residuals) H C C N H R 2 O C + H3O+ OH Flowchart Script 2 Peptide sequence masses Transformation Transf. Peptide sequence masses k-means clustering Groupings Slope 0%/100% 13C Linear fitting Transf. Decimal Residuals Page 12 m/z – DR plot Estimation of 12C and 13C slopes 0.8 0.6 0.4 m/zTemp = m/z - 1800 * DR 0.0 0.2 Decimal residuals 1.0 Script 2: 1000 2000 3000 m/z Page 13 4000 5000 6000 0.4 0.6 0.8 Hartigan & Wong (1979) 0.0 0.2 Decimal residuals 1.0 Estimation of 12C and 13C slopes 0 k-means clustering using kmeans() Page 14 1000 2000 3000 4000 m/zTemp 5000 6000 1.0 Estimation of 12C and 13C slopes Add 2 3 0.4 0.6 0.8 1 0.0 0.2 Decimal residuals 0 Hartigan & Wong (1979) 0 k-means clustering using kmeans() Page 15 1000 2000 3000 4000 m/zTemp 5000 6000 2.5 3.0 100% slope = 0.000633 0% slope = 0.000514 1.5 2.0 100% 1.0 0% lm() function Stats package 0.0 0.5 Decimal residuals 3.5 Estimation of 12C and 13C slopes 0 1000 2000 3000 m/z Page 16 4000 5000 Flowchart Script 3 User data k-means clustering Robust linear fitting Slope 0% + 100% 13C Slope user data reference calculation Estimation of relative 13C incorporation rates Page 17 3.0 100% slope = 0.000633 0% slope = 0.000514 (with a=0) 2.0 2.5 DR = b ∗ m/z 100% 1.5 ‘outliers‘ 1.0 0% Influence on slope estimation 0.0 0.5 Decimal residuals Script 3: 3.5 Estimation of relative 13C incorporation rates 0 1000 2000 m/z Page 18 3000 4000 5000 Estimation of relative 13C incorporation rates 1.0 1.5 2.0 2.5 3.0 SSE = min ∑ ( DR − b ∗ m/z ) 2 0.5 Decimal residuals 3.5 Linear fitting: minimizing the error sum of the squares 0.0 Breakdown point value = 0 0 1000 2000 m/z Page 19 3000 4000 5000 2.0 1.5 1.0 Breakdown point value set = 0.5 >50% of datapoints must change 0.0 0 rlm() function 1000 2000 m/z MASS package (Venable & Ripley 2002) Page 20 robust linear fitting 2.5 3.0 Robust linear model (Huber 1973) M-estimator type Iterative re-weighted least squares (IWLS; Huber 1981, Hampel et al 1986) 0.5 Decimal residuals 3.5 Estimation of relative 13C incorporation rates 3000 4000 5000 3.0 100% slope = 0.000633 0% slope = 0.000514 (with a=0) Your incorporation rate is 98.3% 1.5 2.0 2.5 DR = b ∗ m/z 100% 0% 1.0 Decimal residuals 3.5 Estimation of relative 13C incorporation rates buser − b12C ∗100 CUser % = b13C − b12C 0.0 0.5 13 0 1000 2000 m/z Page 21 3000 4000 5000 User data input + Page 22 R Script 3 User data output + Page 23 R Script 3 Sensitivity of Method Dataset Pseudomonas putida 1. Calculated 50% and 100% 13C incorporation 2. Randomly sampled 100 times each 10-100 (steps by 10), 150, 200, 300, 500, 1000 sequences (0%,50% and 100%) 3. Statistics on estimated incorporation rate for 0%, 50% and 100% Page 24 Incorporation [%] Sensitivity of Method 150 140 130 120 110 100 90 80 70 60 50 100 90 80 70 60 50 40 30 20 10 0 50 40 30 20 10 0 -10 -20 -30 -40 -50 100% ~100 samples 50% 0% 10 50 100 150 200 300 Number of peptides Page 25 500 1000 Conclusion 1. ‚Half-decimal place rule‘ useful for the estimation of 13C incorporation rates http://s3.amazonaws.com/readers/2008/08/14/draddalybig_1.jpg 2. Robust linear models better suited for fitting of highly variable user data than MinSSE fitting 3. >100 measurements needed for prescision <5% incorporation estimation Page 26 Outlook 1. Application of HPDR on DNA 2. Backcalculation to 12C-peaks function identification 3. Include N-isotope incoorporation http://www.sharp.co.jp/plasmacluster-tech/en/release/images/041117_3.gif Page 27 Acknowledgement Nico Jemlich (UFZ) Carsten Vogt (UFZ) Martin von Bergen (UFZ) Hans-Hermann Richnow (UFZ) http://cache.gawker.com/assets/images/gizmodo/2009/01/bactsunsuet_01.jpg Hauke Harms (UFZ) Frank Schmidt (Uni Greifwald) Jens Mattow (MPI Berlin) Bernd Thiede (Uni Oslo) R development team Page 28
© Copyright 2026 Paperzz