Assigning Adduct and Charge States to High-resolution Accurate-mass Mass Spectral Data Using Frequency of Assignment in Multiple Difference Networks

Thomas D. McClure, Matthew D. Kump, Michael Athanas
Thermo Fisher Scientific, San Jose, California
Thermo Scientific Poster Note PN-64093-ASMS-EN-0614S

Overview

Purpose: To use a graph-theory based approach to determine adduct and neutral-loss species within a mass spectrum as part of a small-molecule component detection workflow.

Methods: Graph-theory based difference networks are used to determine adduct and charge states for component detection analysis of mass spectrometry data.

Results: This work shows that difference networks provide accurate results when applied to complicated collections of adduct and charge states in a mass spectrum.

Introduction

Liquid or gas chromatography coupled with mass spectrometry has been demonstrated to be a powerful tool for characterizing small molecules in biological samples. Often the goal is to understand differences in the types and amounts of these small molecules in metabolomic studies, metabolism studies, and virtually any approach involving a complex matrix where untargeted profile information is desired. The data sets from these experiments are usually large and complex. Such complexity results, in no small part, from more than one signal being detected for each compound. These multiple signals per compound come from the formation of multiple charge states, in-source neutral-loss fragmentations, chemical adducts, and the formation of gas-phase polymers. In addition, each of these species also produces a number of isotope signals.

In this presentation, we demonstrate the utility of applying a graph-theory based algorithm for reducing complexity in an LC-MS experiment. Graph-theory mathematics is not a new tool in the analysis of MS data.
It has been used for analyzing mass spectrometric data in a number of applications, including de novo peptide sequencing [1, 2], isotope assignment, and protein identification and quantification. We will use this algorithm to assign and group the multiple signals that arise for individual compounds in a mixture from chemical adducts, in-source neutral-loss fragmentations, and adduct-related multiple charge states.

Methods

Synthetic Data Creation for Algorithm Verification
An LC-MS data set was created containing thirty compounds, each with three adducts ([M+H], [M+Na], and [M+NH4]) as well as the isotopes for each of these species. A mass-sorted peak list was generated from this LC-MS data file using software that produces extracted ion current chromatograms, which are then evaluated using parameter-less peak detection.

Amino Acid Mixture Sample Preparation
A commercially available standard mixture, Thermo Scientific™ Pierce™ Amino Acid Standard H (P/N 20088), was used. The concentration of the amino acids in this mixture was 2.5 µmol/ml, except cysteine at 1.25 µmol/ml. A diluted stock solution was prepared from 100 µl of Amino Acid Standard H and 900 µl of water. The final sample was then prepared by diluting the stock in a dilution series to a final dilution of 1:1000 with HPLC-grade water. A 2 µl aliquot was injected directly onto the HPLC column.

HPLC Conditions
A Thermo Scientific™ Dionex™ UltiMate™ 3000 RSLC system was used with a Thermo Scientific™ Hypersil GOLD™ HPLC column (150 × 2.1 mm, 1.9 µm, P/N 25002152130). The HPLC solvents were: A, 0.1% formic acid in water; B, 0.1% formic acid in methanol. The elution gradient was: 0.5% B to 55% B in 5.5 min, 50% B to 98% B in 0.5 min, hold 98% B for 6 min. The flow rate was 450 µl/min and the column was heated to 55 °C.
MS Conditions
A Thermo Scientific™ Q Exactive™ mass spectrometer with a Thermo Scientific™ HESI-II source was used with the following gas settings: sheath gas was set to 45 and aux gas to 8. The spray voltage was 3.8 kV, the capillary temperature was 320 °C, and the HESI heater temperature was 350 °C. The mass analyzer had the following settings: positive polarity; full MS, 67–1000 AMU; AGC target, 3E6; resolution, 70,000; maximum ion injection time, 100 ms.

Results

Algorithm Verification Using Synthetic Data
We verified the operation and accuracy of the assignments by the difference network using a synthetically generated raw file with 30 components, each having three adducts (MH+, M+Na+, and M+NH4+). Each adduct had a minimum of three, and usually four, isotopes. Two component examples are shown below.
Synthetic Example 1
C19H43N5, m/z 342.35912, Apex RT 4.90

Adduct  Charge  Isotope  m/z        Intensity  Peak Area
M+H     1       A0       342.35912  669135     5867777
M+H     1       A1       343.36230  147786     1295968
M+H     1       A2       344.36543  15569      135995
M+H     1       A3       345.36827  951        7155
M+Na    1       A0       364.34106  891390     7816780
M+Na    1       A1       365.34420  197788     1734444
M+Na    1       A2       366.34732  20891      181364
M+Na    1       A3       367.35016  1353       10581
M+NH4   1       A0       359.38567  656801     5759615
M+NH4   1       A1       360.38875  147162     1290490
M+NH4   1       A2       361.39184  15687      137002
M+NH4   1       A3       362.39456  1034       7830
Synthetic Example 2
C45H88N4O, m/z 701.70310, Apex RT 4.65

Adduct  Charge  Isotope  m/z        Intensity  Peak Area
M+H     1       A0       701.70310  586712     7854121
M+H     1       A1       702.70633  302768     4052986
M+H     1       A2       703.70959  77602      1038850
M+H     1       A3       704.71275  13145      176007
M+H     1       A4       705.71652  1271       17021
M+Na    1       A0       723.68501  743036     9947609
M+Na    1       A1       724.68832  383766     5134197
M+Na    1       A2       725.69154  98287      1316105
M+Na    1       A3       726.69469  16661      223059
M+Na    1       A4       727.69848  1611       21570
M+NH4   1       A0       718.72963  778223     10421120
M+NH4   1       A1       719.73287  404727     5414875
M+NH4   1       A2       720.73603  104345     1396982
M+NH4   1       A3       721.73925  17783      238128
M+NH4   1       A4       722.74231  1961       24241

In both cases, the adduct and charge states are correctly assigned.
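As a quick consistency check on the two synthetic examples, the sketch below back-calculates the neutral mass M from the A0 m/z of each adduct and reports how far the three estimates disagree. It is an illustration only, not the poster's software: the function name `neutral_mass_spread_ppm`, the dictionary layout, and the atomic and electron masses (standard monoisotopic values rounded to six decimals) are assumptions introduced here.

```python
# Assumed monoisotopic masses in Da (not taken from the poster).
ELECTRON = 0.000549
H, N, NA = 1.007825, 14.003074, 22.989770

# Mass added to the neutral molecule M by each singly charged carrier.
CARRIER_MASS = {
    "M+H": H - ELECTRON,
    "M+Na": NA - ELECTRON,
    "M+NH4": N + 4 * H - ELECTRON,
}

# A0 m/z values copied from the Synthetic Example 1 and 2 tables (all charge 1).
EXAMPLE_1 = {"M+H": 342.35912, "M+Na": 364.34106, "M+NH4": 359.38567}
EXAMPLE_2 = {"M+H": 701.70310, "M+Na": 723.68501, "M+NH4": 718.72963}


def neutral_mass_spread_ppm(a0_mz, charge=1):
    """Return the spread (ppm) of the neutral masses implied by each adduct."""
    neutral = [mz * charge - CARRIER_MASS[adduct] for adduct, mz in a0_mz.items()]
    mean = sum(neutral) / len(neutral)
    return (max(neutral) - min(neutral)) / mean * 1e6


for name, example in (("Example 1", EXAMPLE_1), ("Example 2", EXAMPLE_2)):
    print(name, f"{neutral_mass_spread_ppm(example):.3f} ppm")
```

For both examples the three adducts imply the same neutral mass to well under 1 ppm, which is what correct M+H, M+Na, and M+NH4 assignments require.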
Analysis of a Dilute Amino Acid Mixture

The amino acid mixture previously described was analyzed using the difference network algorithm. The following chromatogram is labeled with the amino acids in the mixture and their elution times.

[Figure: chromatogram of the amino acid mixture, peaks labeled by amino acid and retention time. Later-eluting labels: proline, valine, methionine, isoleucine, leucine, tyrosine, and phenylalanine, with retention-time labels of 0.69, 0.95, 1.10, 1.41, 1.73, 1.81, and 2.39 min; the pairing of these labels is not recoverable from the figure text.]

RT (min)  Amino Acid
0.52      Lysine
0.54      Histidine
0.55      Arginine
0.56      Cystine
0.57      Serine
0.58      Aspartic Acid
0.59      Alanine
0.61      Threonine
0.61      Glutamic Acid

Lysine
m/z 147.11255, Apex RT 0.46

Adduct       Charge  Isotope  m/z        Intensity  Peak Area
M+H          1       A0       147.11255  54321440   66288519
M+H          1       A1       148.11589  3604857    4250354
M+H          1       A1       148.10962  192584     316075
M+H          1       A2       149.11687  457894     2173745
M+H-NH3      1       A0       130.08606  10109494   11013528
M+H-NH3      1       A1       131.08937  812226     559401
M+Na+CH3CN   1       A0       210.12015  195011     143738
M+Na         1       A0       169.09444  2110288    2464782
M+Na         1       A1       170.09763  114523     72849

Tyrosine
m/z 182.08078, Apex RT 1.40

Adduct       Charge  Isotope  m/z        Intensity  Peak Area
M+H          1       A0       182.08078  67884942   130861879
M+H          1       A1       183.08418  7242606    11166390
M+H          1       A2       184.08751  306655     349548
M+H          1       A2       184.08508  370365     627524
M+H-NH3      1       A0       165.05429  7440865    12655904
M+H-NH3      1       A1       166.05765  753306     1231796
M+Na         1       A0       204.06277  766400     1321692

Analysis of a Complicated Mixture of Adducts

In our final example, from an unpublished study, we analyzed a sample that contained 13 adducts in one mass spectrum. Of these 13, we detected the following 10 species: M+CaCOOH (z=1), M+MgCOOH (z=1), M+Fe-H (z=1), M+Na (z=1), M+H (z=1), M+H2(H2O) (z=1), M+Ca+2(CH3CN) (z=2), M+Ca+CH3CN (z=2), M+Mg+CH3CN (z=2), M+Ca+H2O (z=2).

Notice that there are both single and double charge states in this series of adducts.

Adduct and Neutral-Loss Assignment Algorithm

We use the following formula to identify the important mass-carrying components that are present in each species detected by the mass spectrometer.
$$\Delta m/z = \frac{n_1 M_1 + M_{a1} + M_{cc1}}{Z_1} - \frac{n_2 M_2 + M_{a2} + M_{cc2}}{Z_2} \qquad (1)$$

Where:
∆ m/z is the difference in mass-to-charge between two different species of the same molecule,
n is the gas-phase cluster or polymeric number for the base molecule,
M is the parent neutral molecule mass,
Ma is the total mass contributed to the species by the adduct or adducts, or by the neutral loss or losses,
Mcc is the total mass contributed to the species by the charge carrier or carriers, and
Z is the total charge of the detected species.
The subscripts 1 and 2 denote the two species being compared.

1. Establishing a List of Known Adducted Species
Using the terms in equation 1, we construct a table of candidate "modifying" species with combinations of charge carriers, neutral adducts, and neutral losses that can combine with M to form additional signals. These candidate "modifying" species are used to generate a combinatorial table of mass differences.

2. Generating the Nodes for the Graph
Nodes are generated by matching differences in the list of candidate species with mass differences determined from monoisotopic m/z values in the mass spectrum being analyzed. The matching m/z values from the spectrum are added as nodes to the graph, and an edge is drawn between the two nodes. The result is a large number of nodes and edges for the mass spectrum. This is illustrated below using hypothetical data.
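Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the candidate table, the function name `build_network`, the 0.002 Da tolerance, and the restriction to n = 1 are all hypothetical choices made here. Equation 1 is applied in rearranged form: a label is assumed for one peak, the neutral mass M is solved for, and the m/z that a second label would produce is compared with the other peak. The peak list uses four m/z values from the hypothetical network figure plus one invented unrelated peak.

```python
from itertools import combinations

ELECTRON = 0.000549  # Da, assumed value

# Hypothetical candidate table: label -> (Ma + Mcc in Da, charge Z), with n = 1.
CANDIDATES = {
    "M+H": (1.007825 - ELECTRON, 1),
    "M+Na": (22.989770 - ELECTRON, 1),
    "M+NH4": (18.034374 - ELECTRON, 1),
    "M+2H": (2 * (1.007825 - ELECTRON), 2),
}


def build_network(peaks, candidates=CANDIDATES, tol=0.002):
    """Return (nodes, edges); an edge is (mz_a, label_a, mz_b, label_b)."""
    edges = []
    for mz_a, mz_b in combinations(sorted(peaks), 2):
        for label_a, (mass_a, z_a) in candidates.items():
            neutral = mz_a * z_a - mass_a  # M implied by peak a under label_a
            if neutral <= 0:
                continue
            for label_b, (mass_b, z_b) in candidates.items():
                if label_b == label_a:
                    continue
                predicted_b = (neutral + mass_b) / z_b  # equation 1, rearranged
                if abs(predicted_b - mz_b) <= tol:
                    edges.append((mz_a, label_a, mz_b, label_b))
    # Only peaks that take part in at least one edge become nodes.
    nodes = sorted({mz for edge in edges for mz in (edge[0], edge[2])})
    return nodes, edges


# Monoisotopic m/z values; 412.10000 is an invented, unrelated peak.
PEAKS = [154.61465, 308.22202, 325.24857, 330.20396, 412.10000]
nodes, edges = build_network(PEAKS)
for edge in edges:
    print(edge)
```

With real spectra the candidate table would be much larger, so many peak pairs match more than one candidate pair; that ambiguity is what the later trimming step has to resolve.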
[Figure: hypothetical difference network. Nodes (assignment, m/z): Node 1 (M+H)+ 308.22202; Node 2 (M+Na)+ 330.20396; Node 3 (M+NH4)+ 325.24857; Node 4 (M+K+H2O)+ 264.18847; Node 5 (M+2H)++ 154.61465; Node 6 (M+H+Na+H2O)++ 174.62281; Node 7 (2M+H)+ 614.42894; Node 8 (2M+Na+H2O)+ 655.45307; Node 9 (2M+Na+H)++ 319.21299; Node 10 (2M+H+Na+H2O+ACN)++ 348.73165.]

3. Edge Trimming and Assignment of Species
Because of the large number of nodes and edges, ambiguities in assignment can arise. Nodes with multiple assignments are then ordered by frequency of assignment for M+H (positive ion mode) or M–H (negative ion mode). The algorithm also allows weighting factors to be applied to predicted species should a priori information provide appropriate insight. The weighting factors can be node specific, which causes edges to move up or down the ordered list.

4. Assignment of Species
The assignments of the charge carrier(s) and neutral adduct(s) or neutral loss(es) are then made according to final position in the ordered list. Should ambiguities still arise, all possibilities are reported.

Discussion

5. Overview
The current approach is able to identify different charge states as well as numerous uncommon adducts in a complicated mixture, all within the same mass spectrum. This approach is also able to identify species resulting from in-source neutral losses, such as loss of water or ammonia, as shown by the amino acid samples.

6. Example Results
Using the synthetic data, we were able to verify the operation of the algorithm and accurately assign both the neutral adducts and charged species for a number of combinations.
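The ordering by frequency of assignment described in steps 3 and 4, which underlies these results, can be sketched as a simple vote count. This is a simplified illustration and an assumption about how such a ranking might be coded, not the poster's algorithm: each edge casts one vote (optionally weighted) for the label it proposes at each of its two nodes, the labels at each node are ordered by score, ties are reported together, and edges that disagree with the winning labels are trimmed. The function names, the edge format, and the example edges (including the competing M+Na / M+K edge) are hypothetical.

```python
from collections import defaultdict

# Hypothetical edges: (mz_a, label_a, mz_b, label_b). The last edge is a
# competing interpretation of the 308.22202 peak.
EDGES = [
    (308.22202, "M+H", 330.20396, "M+Na"),
    (308.22202, "M+H", 325.24857, "M+NH4"),
    (325.24857, "M+NH4", 330.20396, "M+Na"),
    (308.22202, "M+Na", 324.19596, "M+K"),
]


def rank_assignments(edges, weights=None):
    """Order the candidate labels at each node by (weighted) vote count."""
    weights = weights or {}
    votes = defaultdict(lambda: defaultdict(float))
    for mz_a, label_a, mz_b, label_b in edges:
        votes[mz_a][label_a] += weights.get(label_a, 1.0)
        votes[mz_b][label_b] += weights.get(label_b, 1.0)
    return {
        mz: sorted(tally.items(), key=lambda item: (-item[1], item[0]))
        for mz, tally in votes.items()
    }


def assign(ranked):
    """Keep the top-ranked label per node; report all labels if tied."""
    result = {}
    for mz, ordered in ranked.items():
        top_score = ordered[0][1]
        result[mz] = [label for label, score in ordered if score == top_score]
    return result


def trim_edges(edges, assignment):
    """Drop edges whose labels disagree with the winning label(s) at either end."""
    return [
        e for e in edges
        if e[1] in assignment[e[0]] and e[3] in assignment[e[2]]
    ]


assignment = assign(rank_assignments(EDGES))
trimmed = trim_edges(EDGES, assignment)
print(assignment[308.22202], len(trimmed))
```

Passing a weight, for example `rank_assignments(EDGES, weights={"M+Na": 2.0})`, raises the score of sodium assignments; in this toy case it produces a tie at 308.22202, so both labels would be reported.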
The amino acid data sets provided additional complexity, showing that neutral losses, adducts, and charge-carrying species are accurately assigned in the presence of other potentially interfering signals. The assignment of loss of ammonia from both lysine and tyrosine is consistent with the elemental composition determined by accurate mass. The assignment of the acetonitrile and sodium adduct to lysine is questionable, given that the mobile phase used methanol and not acetonitrile, and that the signal is comparatively weak. In this case, improved accuracy could be achieved by using additional information, such as the mobile phase composition, to restrict the list of predicted adducts.

Uncommon adducts such as those containing calcium, magnesium, and iron were detected and labeled as shown in the final example. Some of the calcium and magnesium adducts also included neutral solvent species and were doubly charged.

Conclusion

Using the described graph-theory based algorithm to assign neutral adducts, in-source neutral losses, and charge species to mass spectral data produces useful results.

Using the method described here, complex mixtures of adducts with varied charge states and in-source neutral losses are detected and properly assigned.

The complexity of the mass spectral information is reduced as a consequence of identifying the aforementioned species, which gives the analyst the capability to group compound-related signals.

The accuracy of this algorithm is further enhanced by incorporating information such as mobile phase composition and knowledge of in-source fragmentation.

Preference can be given to species known to occur through the use of weighting factors for the predicted adducts, charge carriers, and neutral-loss species.

References

1. Taylor, J. A.; Johnson, R. S. Rapid Commun. Mass Spectrom. 1997, 11 (9), 1067–1075.
2. Clauser, K. R.; Baker, P.; Burlingame, A. L. Anal. Chem. 1999, 71 (14), 2871–2882.
3. Cox, J.; Mann, M. Nat. Biotechnol. 2008, 26 (12), 1367–1372.

www.thermofisher.com
©2016 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information is presented as an example of the capabilities of Thermo Fisher Scientific products. It is not intended to encourage use of these products in any manners that might infringe the intellectual property rights of others. Specifications, terms and pricing are subject to change. Not all products are available in all countries. Please consult your local sales representative for details.
PO64093-EN 0614S
PN-64093-EN-0716S