Application Note LCMS-116

What are we eating? MetaboScape® Software: Enabling the De-replication and Identification of Unknowns in Food Metabolomics

Authors
Nikolas Kessler, Heiko Neuweger, Verena Tellström, Aiko Barsch
Bruker Daltonik GmbH, Bremen, Germany

Keywords: Metabolomics, Structure elucidation, Structure confirmation, in-silico fragmentation, Library search, de-replication, unknown ID, Food Profiling
Technology and Software: impact II, MetaboScape, CompoundCrawler, SmartFormula3D, MetFrag

Introduction

Determining the structure of secondary metabolites is a significant bottleneck often faced by today’s plant and food metabolomics scientists. The identification of compounds of interest is a key step in enabling the biological interpretation of observed changes in metabolite profiles. Additionally, there is a need to quickly tag those compounds which have been characterised previously. This so-called de-replication process saves time which might otherwise be spent on the repetitive annotation of already known compounds.

Here we re-evaluated a part of the data acquired for the showcase study of the “Metabolomics 2015 - 11th International Conference of the Metabolomics Society”, which was organized by the local conference hosts based at the University of California, Davis. They prepared three different food plates, chosen on the basis of large differences in dietary components, representing a fast food meal (coined “USA” food plate), a California food plate (based on USDA MyPlate dietary recommendations - http://www.choosemyplate.gov/) and a “Davis” food plate, which was inspired by Korean cuisine. In 2015, we employed complementary approaches such as high resolution accurate mass LC-QTOF-MS/MS and GC-APCI-QTOF-MS/MS for a comprehensive analysis of both food metabolites and natural products found in the three different food plates. Data evaluation focused on the identification and annotation of characteristic, i.e. differentiating, small molecules found in the food samples.
In the present study we re-investigated the data acquired by LC-QTOF-MS/MS in ESI positive ionisation mode and present novel features of the MetaboScape 2.0 software solution that facilitate the identification of natural products. The information can subsequently be used to build well-characterised MS/MS libraries, enabling a quick de-replication of known compounds. The MetaboScape software addresses the challenge of identifying unknowns and enables the confident assignment of known target compounds, both of which are critical steps in turning raw MS data into knowledge.

Experimental

According to the organizers of the Metabolomics 2015 showcase study, food plates were homogenized with an industrial-grade food service blender, lyophilized under vacuum (except for volatile profiling), and stored in a -80 °C freezer prior to shipment. In our lab, three replicates each of the USA, Davis and California food plate samples were dissolved in 100 µL 80% methanol. Five µL of each sample were analysed in two technical replicates by UHPLC-QTOF-MS/MS, resulting in a total of eighteen runs, excluding blank and quality control samples.

Chromatographic separation was carried out using a Dionex RSLC system (Thermo Fisher Scientific) with a 100 x 2 mm Acclaim RSLC 120 C18 column at a flow rate of 0.3 mL/min. Solvent A: water + 0.1% HCOOH; Solvent B: acetonitrile + 0.1% HCOOH. The following gradient was used: 0 – 2 min 1% B; 2 – 17 min linear gradient from 1% to 99% B; 17 – 20 min 99% B; 20.1 min 1% B; total run time 30 min.

MS detection was performed using a Bruker impact II Qq-TOF mass spectrometer (Bruker Daltonics). The instrument was operated in ESI positive mode, acquiring full scan MS and MS/MS data using the InstantExpertise™ routine. The resulting data were processed using the FindMolecularFeatures (FMF) algorithm and clustered in a bucket table with ProfileAnalysis 2.3 software.
The subsequent data analysis and compound identification workflow was performed using tools integrated into the MetaboScape 2.0 software: Automatic molecular formula determination was carried out by combined evaluation of mass accuracy, isotopic patterns, adduct and fragment information using SmartFormula3D™ software. Statistical data evaluation and structure identification, including MetFrag [1] based in-silico fragmentation, were accomplished on the same data. MS/MS spectra of confirmed compounds were stored in the spectral Library Editor integrated in MetaboScape 2.0.

Results

Data pre-processing for statistical analysis

In the non-targeted metabolomics workflow presented here, the detection of compounds via the FindMolecularFeatures (FMF) peak finder was an important initial step of data pre-processing prior to statistical analysis. The FMF algorithm combines ions belonging to one compound, such as common adducts (e.g. +Na, +K, +NH4), fragments originating from neutral losses, isotopes and charge states, into one FMF compound. In a subsequent bucketing process the features extracted from the different samples were aligned across all samples and combined into a so-called bucket table. Here, a bucket table containing the 18 samples from the USA, Davis and California food plates was calculated and 1163 features were assigned throughout the samples. Following the import of the bucket table into the client-server based MetaboScape 2.0 software, an automated assignment of high-resolution accurate mass (HRAM) MS/MS spectra to the respective buckets enabled the subsequent confident de-replication of known compounds and the structure elucidation of unknown compounds.

Confident, automatic de-replication

The information for extracted features contained in the bucket table included retention time, accurate mass and isotopic pattern (TIP™ - True Isotopic Pattern) of precursor and fragment spectra and hence enabled the automatic annotation of compounds at different confidence levels:

1. A custom “Analyte List” was used to confidently annotate compounds in the bucket table. This list of known target compounds included metabolite name, molecular formula, retention time information from the applied C18 reversed phase chromatography and MS/MS library spectra. The graphical Annotation Quality (“AQ”) representation made it possible to readily derive the confidence for each annotation based on user definable levels for matching of accurate mass, retention time, isotopic fidelity and MS/MS library score (see Figure 1).

2. Buckets which were not annotated using the Analyte List were queried against two complementary MS/MS spectral libraries: the “Bruker HMDB Metabolite Library” and the “Bruker MetaboBASE Personal Library”. This allowed the assignment of features based on spectral similarity. Since no retention time information is evaluated in this workflow, compound identification is considered “tentative”.

3. For the buckets which were not annotated by the first two approaches, molecular formulas were automatically calculated by SmartFormula3D. The implemented algorithm considers accurate mass and isotopic pattern information in MS and MS/MS spectra. Furthermore, information from adducts and neutral losses, as well as additional filters for elemental compositions [2, 3], was applied to narrow down the list of possible molecular formulas to biologically relevant candidates.

Figure 1. Overview perspective in MetaboScape 2.0 software.

Statistical evaluation via PCA and ANOVA in MetaboScape software revealed a characteristic compound with 943.525 m/z eluting at 10.86 min to be much more abundant in “Davis” food platter samples compared to “CA” and “USA” (see Figure 2).
This compound was not annotated by the Analyte List or via the MS/MS spectral library query, but was selected for further characterisation due to its relevance as a differentiating feature.

Figure 2. Box Plot representation for Bucket 10.86 min: 943.525 m/z revealing higher abundance in Davis compared to CA and USA food platter samples.

Figure 3. Assignment of elemental composition via SmartFormula3D. Based on precursor m/z information, dozens of candidate formulas in a 1 mDa mass accuracy window are possible. In addition to mass accuracy, SmartFormula3D considers the True Isotopic Pattern and MS/MS fragment information and returned the molecular formula C48H78O18 as the most likely candidate. Confidence in this result is not only based on the 0.94 ppm mass accuracy and very good isotopic pattern fit (2.71 mSigma value), but is also supported by 80 fragment ions, constituting 92% of the MS/MS spectral intensity, for each of which an unambiguous molecular formula could be assigned.

Figure 4. A) Searching online compound databases with CompoundCrawler for C48H78O18 returned multiple candidate structures. In-silico fragmentation of selected candidates using the MetFrag [1] algorithm generated scores for the likelihood of the structures to match the MS/MS fragment peaks. The best candidate molecule was Soyasaponin I. The characteristic aglycon fragment with 441.373 m/z highlighted on the Soyasaponin I molecule substantiated this structural hypothesis.

Identification of Soyasaponin I as a characteristic compound for “Davis” samples - SmartFormula3D, CompoundCrawler and in-silico fragmentation with MetFrag
The first critical piece of information allowing for the identification of this metabolite was the correct molecular formula: Evaluation by the SmartFormula3D software readily assigned the molecular formula C48H78O18 to the precursor with high confidence (see Figure 3), with a mass accuracy of 0.94 ppm and an mSigma value of 2.71 for the [M+H]+ ion (the lower the mSigma value, the better the fit between measured and simulated isotopic pattern; the scale ranges from 0 – 1000). The [M+Na]+ adduct contained in the extracted feature also pointed to this molecular formula, consisting only of C, H, and O atoms. Additional confidence in this molecular formula was derived from 80 MS/MS fragment peaks for which formulas could be assigned, covering 92% of fragment peak intensity.

A search for this molecular formula in public databases using the integrated CompoundCrawler software functionality generated multiple hits for possible structures. In-silico fragmentation of the selected candidates via the fully integrated MetFrag algorithm delivered Soyasaponin I as the compound with the best MetFrag score (see Figure 4 A). The characteristic aglycon fragment with 441.373 m/z highlighted in Figure 4 A and additional in-silico generated structures (Figure 4 B) matching measured fragment ion peaks substantiated this structural hypothesis.

Confirmation of Soyasaponin I with a reference standard

The identity of the compound could be confirmed by measuring the reference standard of Soyasaponin I and comparing retention time and MS/MS spectra (see Figure 5 A and B). Data for the reference compound were acquired approximately 12 months after analysing the original showcase samples, using the same general setup but not the identical LC-MS/MS system. To demonstrate the transferability and reproducibility from one setup to another, a replicate of a Davis sample that was not analysed during the initial study was redissolved and analysed on the new setup.
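The accurate-mass part of the formula assignment discussed in this section can be illustrated with a short calculation. The following is a minimal sketch, not the SmartFormula3D algorithm: it only uses monoisotopic mass, ignores isotopic pattern and MS/MS fragment information, and restricts the search to C, H, N and O. The function names, the element limits and the hydrogen-count filter are assumptions chosen for illustration. Because the bucket label 943.525 m/z is rounded to three decimals, the deviation computed from it will not reproduce the 0.94 ppm reported by the software, and a slightly wider window (2 mDa, also an assumption) is used for the candidate search.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard values).
MASS = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.9949146196}
PROTON = 1.00727646688  # mass of H+ (hydrogen atom minus one electron)


def neutral_mass(c, h, n, o):
    """Monoisotopic mass of the neutral molecule CcHhNnOo."""
    return c * MASS["C"] + h * MASS["H"] + n * MASS["N"] + o * MASS["O"]


def ppm_error(measured, theoretical):
    """Relative mass deviation in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def _part(symbol, count):
    """Format one element of a formula string, e.g. ('O', 18) -> 'O18'."""
    if count == 0:
        return ""
    return symbol if count == 1 else f"{symbol}{count}"


def candidate_formulas(mz, tol_da, max_n=6, max_o=25):
    """Brute-force CHNO formulas whose [M+H]+ lies within tol_da of mz.

    Illustrative only: limits on N and O and the simple hydrogen-count
    filter (H <= 2C + N + 2) are assumptions, not the software's filters.
    """
    target = mz - PROTON  # neutral monoisotopic mass implied by [M+H]+
    hits = []
    for c in range(1, int(target // MASS["C"]) + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                rest = target - neutral_mass(c, 0, n, o)
                h = round(rest / MASS["H"])
                if h < 0 or h > 2 * c + n + 2:
                    continue
                if abs(neutral_mass(c, h, n, o) - target) <= tol_da:
                    hits.append(
                        _part("C", c) + _part("H", h) + _part("N", n) + _part("O", o)
                    )
    return hits


mz_theory = neutral_mass(48, 78, 0, 18) + PROTON
print(f"[M+H]+ of C48H78O18: {mz_theory:.4f}")
print(f"deviation of rounded bucket m/z: {ppm_error(943.525, mz_theory):.2f} ppm")

hits = candidate_formulas(943.525, tol_da=0.002)
print(len(hits), "CHNO candidates, including C48H78O18:", "C48H78O18" in hits)
```

Even with only four elements, accurate mass alone leaves more than one candidate in the window, which is why isotopic pattern and fragment information are needed to single out one formula.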
The retention time and MS/MS spectrum of the candidate bucket acquired in 2015 matched the data acquired in 2016 (see Figure 5 B and C). Considering that the Davis food platter was inspired by Korean cuisine, the identification of Soyasaponin I is in agreement with the “biological” context: The organizers of the showcase study disclosed that the Davis food platter contained bean sprouts, and those have been described before to contain Soyasaponin I [4].

Figure 4. B) Further, in-silico generated fragment structures matching measured fragment ion peaks (617.405 m/z, 781.473 m/z, 485.151 m/z) added to the annotation confidence.

Figure 5. Retention time and MS/MS spectrum of the Soyasaponin I reference standard (A) match the chromatographic signal in the Davis food study samples reanalysed in 2016 (B) and the corresponding data acquired in 2015 (approximately 12 months before) (C).

Figure 6. MS/MS Bucket matches: Three connected buckets based on similar HRAM MS/MS spectra – the similarity indicates these analytes to be related to Soyasaponin.

Identification of further Soyasaponins by MS/MS spectral similarity search

In addition to Soyasaponin I, several other soyasaponins have been described in black beans [4]. Since chemically related compounds typically reveal similar MS/MS fragmentation patterns, an MS/MS spectral similarity search was performed with the aim of discovering further soyasaponins within the current data set. Figure 6 represents the outcome of an MS/MS similarity match between the MS/MS spectrum of the identified Soyasaponin I and all other MS/MS spectra of buckets contained in the bucket table. As in a typical MS/MS spectral library query, a query spectrum is compared to other MS/MS spectra and a matching score is calculated. The difference in similarity matching is that the MS/MS query spectrum is not matched against a spectral library of known compounds but against other MS/MS spectra contained in the same bucket table.
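The idea of scoring one MS/MS spectrum against all others in the same table can be sketched with a generic cosine similarity. The scoring function used by MetaboScape is not described in this note, so the following is an assumption-laden illustration only: the greedy peak pairing, the 5 mDa tolerance, the scaling to 0 – 1000 and the toy spectra are all hypothetical and do not represent measured data.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.005):
    """Cosine similarity, scaled to 0-1000, between two centroided spectra.

    Each spectrum is a list of (mz, intensity) pairs. Peaks of spec_a are
    processed from most to least intense and paired with the most intense
    unused peak of spec_b whose m/z differs by at most tol.
    """
    used = set()
    dot = 0.0
    for mz_a, int_a in sorted(spec_a, key=lambda p: -p[1]):
        best = None
        for j, (mz_b, int_b) in enumerate(spec_b):
            if j in used or abs(mz_a - mz_b) > tol:
                continue
            if best is None or int_b > spec_b[best][1]:
                best = j
        if best is not None:
            used.add(best)
            dot += int_a * spec_b[best][1]
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return 1000.0 * dot / (norm_a * norm_b)


def rank_similar(query, table):
    """Score every spectrum in a {bucket_name: spectrum} table against query."""
    scores = {name: cosine_score(query, spec) for name, spec in table.items()}
    return sorted(scores.items(), key=lambda item: -item[1])


# Toy spectra with invented peaks, for illustration only.
query = [(100.0, 10.0), (200.0, 5.0), (300.0, 1.0)]
table = {
    "related": [(100.001, 9.0), (200.002, 6.0), (350.0, 2.0)],
    "unrelated": [(150.0, 10.0), (250.0, 5.0)],
}
for name, score in rank_similar(query, table):
    print(f"{name}: {score:.0f}")
```

In such a scheme, buckets sharing most of their intense fragments with the query receive high scores, while buckets with no common fragments score zero.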
Two buckets with similar MS/MS spectra were returned: 11.16 min: 797.468 m/z with a score of 899 and 10.69 min: 959.520 m/z with a score of 917. Applying the same workflow as described for Soyasaponin I (molecular formula generation followed by database searches for candidate structures and in-silico fragmentation) resulted in the tentative identification of Soyasaponin III and Soyasaponin V, respectively.

Conclusions

The Bruker impact II series of Q-TOF MS instruments, based on its Full Sensitivity Resolution (FSR) mode, provides an uncompromising combination of mass accuracy, isotopic fidelity, resolution, dynamic range, sensitivity and MS/MS performance - a key requirement for analysing highly complex samples. Fully exploiting this high quality data using the novel MetaboScape 2.0 software allowed for an automated and confident de-replication of known target compounds based on user definable confidence levels for mass accuracy, isotopic fidelity, retention time and MS/MS score.

Additionally, the integrated structure elucidation solutions SmartFormula3D™ and MetFrag enabled the identification of a secondary metabolite with m/z > 900 as a characteristic compound for the Davis food platter samples. In detail, unambiguous molecular formula assignment to the precursor ion, followed by in-silico fragmentation of structure candidates obtained from public database queries, led to the successful identification of Soyasaponin I. The Davis food platter sample, inspired by Korean cuisine, contained among other ingredients bean sprouts, which are known to contain soyasaponins as the predominant saponins. A subsequent MS/MS spectral similarity search allowed the tentative annotation of two additional soyasaponins, III and V.
These three target compounds can now be added to a custom MS/MS library in the MetaboScape software, extending the list of ‘known knowns’, and, in combination with an extended Analyte List, will enable quick identification of these compounds in other metabolite extracts.

Acknowledgements

Food study samples were provided by Arpana Vaniya (Fiehn laboratory) from University of California Davis and Nancy Keim (USDA team at Davis, California) as part of the Metabolomics 2015 conference showcase study. We also thank Steffen Neumann and his team at the IPB in Halle, Germany for helpful discussions and for providing the source code of the MetFrag algorithm.

References

[1] Wolf et al. BMC Bioinformatics 2010, 11:148.
[2] Kind T. and Fiehn O. BMC Bioinformatics 2007, 8:105.
[3] Kessler N. et al. PLOS One 2014, 9(11):e113909.
[4] Lee MR et al. J Mass Spectrom. 1999, 34(8):804-12.

For research use only. Not for use in diagnostic procedures.

Bruker Daltonik GmbH, Bremen · Germany, Phone +49 (0)421-2205-0, Fax +49 (0)421-2205-103
Bruker Daltonics Inc., Billerica, MA · USA, Phone +1 (978) 663-3660, Fax +1 (978) 667-5993
www.bruker.com

Bruker Daltonics is continually improving its products and reserves the right to change specifications without notice. © Bruker Daltonics 08-2016, LCMS-116, 1845963