J Cheminform. 2016 Jan 29:8:3.

doi: 10.1186/s13321-016-0115-9. eCollection 2016.

MetFrag relaunched: incorporating strategies beyond in silico fragmentation

Christoph Ruttkies¹, Emma L Schymanski², Sebastian Wolf³, Juliane Hollender⁴, Steffen Neumann¹

Affiliations

¹ Leibniz Institute of Plant Biochemistry, Department of Stress and Developmental Biology, Weinberg 3, 06120 Halle, Germany.
² Eawag: Swiss Federal Institute for Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland.
³ Leibniz Institute of Plant Biochemistry, Department of Stress and Developmental Biology, Weinberg 3, 06120 Halle, Germany; R&D NMR Software, Bruker BioSpin GmbH, Silberstreifen, 76287 Rheinstetten, Germany.
⁴ Eawag: Swiss Federal Institute for Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland; Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, 8092 Zürich, Switzerland.

PMID: 26834843
PMCID: PMC4732001
DOI: 10.1186/s13321-016-0115-9

Christoph Ruttkies et al. J Cheminform. 2016.

Abstract

Background: The in silico fragmenter MetFrag, launched in 2010, was one of the first approaches combining compound database searching and fragmentation prediction for small molecule identification from tandem mass spectrometry data. Since then many new approaches have evolved, as has MetFrag itself. This article details the latest developments to MetFrag and its use in small molecule identification since the original publication.

Results: MetFrag has undergone algorithmic and scoring refinements. New features include the retrieval of reference, data source and patent information via ChemSpider and PubChem web services, as well as InChIKey filtering to reduce candidate redundancy due to stereoisomerism. Candidates can be filtered or scored differently based on criteria such as the occurrence of certain elements and/or substructures prior to fragmentation, or presence in so-called "suspect lists". Retention time information can now be calculated either within MetFrag, given a sufficient number of user-provided retention times, or incorporated separately as "user-defined scores" to be included in candidate ranking. The changes to MetFrag were evaluated on the original dataset as well as a dataset of 473 merged high resolution tandem mass spectra (HR-MS/MS) and compared with another open source in silico fragmenter, CFM-ID. Using HR-MS/MS information only, MetFrag2.2 and CFM-ID had 30 and 43 Top 1 ranks, respectively, using PubChem as a database. Including reference and retention information in MetFrag2.2 improved this to 420 and 336 Top 1 ranks with ChemSpider and PubChem (89 and 71%), respectively, and even up to 343 Top 1 ranks (PubChem) when combining with CFM-ID. The optimal parameters and weights were verified using three additional datasets of 824 merged HR-MS/MS spectra in total. Further examples are given to demonstrate the flexibility of the enhanced features.
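
The InChIKey filtering mentioned in the Results can be illustrated with a minimal sketch. It relies on the fact that the first 14-character block of an InChIKey hashes the molecular connectivity, so stereoisomers share it. The candidate records, the placeholder InChIKeys, the function name `filter_by_first_block` and the choice to keep the highest-scoring entry per block are all assumptions made for illustration, not MetFrag's actual implementation.

```python
from typing import Dict, List

# Hypothetical candidate records; the InChIKeys are placeholders in the
# standard 14-10-1 layout and do not correspond to real compounds.
candidates = [
    {"id": "cand-1", "inchikey": "AAAAAAAAAAAAAA-BBBBBBBBSA-N", "score": 0.82},
    {"id": "cand-2", "inchikey": "AAAAAAAAAAAAAA-CCCCCCCCSA-N", "score": 0.91},
    {"id": "cand-3", "inchikey": "DDDDDDDDDDDDDD-UHFFFAOYSA-N", "score": 0.40},
]


def filter_by_first_block(cands: List[Dict]) -> List[Dict]:
    """Keep one candidate per InChIKey first block (assumed: the best-scoring one)."""
    best: Dict[str, Dict] = {}
    for cand in cands:
        block = cand["inchikey"].split("-")[0]  # 14-character connectivity hash
        if block not in best or cand["score"] > best[block]["score"]:
            best[block] = cand
    return list(best.values())


unique = filter_by_first_block(candidates)
print([c["id"] for c in unique])
```

Here `cand-1` and `cand-2` share a first block and collapse into a single entry, while `cand-3` is kept as a distinct structure.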

Conclusions: In many cases additional information is available from the experimental context to add to small molecule identification, which is especially useful where the mass spectrum alone is not sufficient for candidate selection from a large number of candidates. The results achieved with MetFrag2.2 clearly show the benefit of considering this additional information. The new functions greatly enhance the chance of identification success and have been incorporated into a command line interface in a flexible way designed to be integrated into high throughput workflows. Feedback on the command line version of MetFrag2.2 available at http://c-ruttkies.github.io/MetFrag/ is welcome.

Keywords: Compound identification; High resolution mass spectrometry; In silico fragmentation; Metabolomics; Structure elucidation.

Figures

**Fig. 1**
Top 1 ranks with PubChem (XlogP3) on the Orbitrap XL Dataset. The results were obtained with MetFrag formula query and the inclusion of references and retention time. The reference score was calculated with the number of patents (PNP) and PubMed references (PPC). The *larger dots* show the best result (336 Top 1 ranks), 75th percentile (320), median (312), 25th percentile (249) and worst result (61). For the best result, the weights were $\omega_{Frag} = 0.50$, $\omega_{RT} = 0.16$ and $\omega_{Refs} = 0.34$.

**Fig. 2**
Top 1 ranks with ChemSpider on the Orbitrap XL Dataset. The results were obtained with MetFrag formula query and the inclusion of references and retention time. The reference score was calculated with the ChemSpider reference count (CRC). The *larger dots* show the best result (420 Top 1 ranks), 75th percentile (399), median (388), 25th percentile (311) and worst result (104). The weights for the best result were $\omega_{Frag} = 0.49$, $\omega_{RT} = 0.19$ and $\omega_{Refs} = 0.32$.
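
The weights reported in the two captions suggest a combined score built from fragmentation, retention time and reference terms. The sketch below assumes a weighted sum in which each term is first divided by its maximum over all candidates. That normalisation, the function name `combined_scores` and the two candidate records are assumptions for illustration. Only the weights are taken from the Fig. 1 caption.

```python
from typing import Dict, List, Tuple

# Weights for the best PubChem result, as given in the Fig. 1 caption.
WEIGHTS = {"frag": 0.50, "rt": 0.16, "refs": 0.34}

# Hypothetical raw scores for two candidates (not from the paper).
candidates = [
    {"id": "A", "frag": 100.0, "rt": 0.5, "refs": 2.0},
    {"id": "B", "frag": 80.0, "rt": 1.0, "refs": 50.0},
]


def combined_scores(cands: List[Dict], weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank candidates by a weighted sum of max-normalised score terms."""
    # Largest raw value of each term, used to scale every term into [0, 1].
    maxima = {term: max(c[term] for c in cands) for term in weights}
    ranked = []
    for c in cands:
        total = sum(
            w * (c[term] / maxima[term] if maxima[term] > 0 else 0.0)
            for term, w in weights.items()
        )
        ranked.append((c["id"], total))
    # Highest combined score first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


ranked = combined_scores(candidates, WEIGHTS)
for cand_id, score in ranked:
    print(cand_id, round(score, 4))
```

In this toy case candidate A has the better fragmentation score, but B ranks first once retention time and reference counts are weighted in, which mirrors the kind of re-ranking the captions describe.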

Cited by

Online Prioritization of Toxic Compounds in Water Samples through Intelligent HRMS Data Acquisition.
Meekel N, Vughs D, Béen F, Brunner AM. Anal Chem. 2021 Mar 30;93(12):5071-5080. doi: 10.1021/acs.analchem.0c04473. PMID: 33724776.
Navigating common pitfalls in metabolite identification and metabolomics bioinformatics.
Novoa-Del-Toro EM, Witting M. Metabolomics. 2024 Sep 21;20(5):103. doi: 10.1007/s11306-024-02167-2. PMID: 39305388. Review.
Wheat growth, applied water use efficiency and flag leaf metabolome under continuous and pulsed deficit irrigation.
Stallmann J, Schweiger R, Pons CAA, Müller C. Sci Rep. 2020 Jun 22;10(1):10112. doi: 10.1038/s41598-020-66812-1. PMID: 32572060.
Food Phenotyping: Recording and Processing of Non-Targeted Liquid Chromatography Mass Spectrometry Data for Verifying Food Authenticity.
Creydt M, Fischer M. Molecules. 2020 Aug 31;25(17):3972. doi: 10.3390/molecules25173972. PMID: 32878155. Review.
Estimating LoD-s Based on the Ionization Efficiency Values for the Reporting and Harmonization of Amenable Chemical Space in Nontargeted Screening LC/ESI/HRMS.
Souihi A, Kruve A. Anal Chem. 2024 Jul 16;96(28):11263-11272. doi: 10.1021/acs.analchem.4c01002. PMID: 38959408.
