Anal Chem. 2021 Aug 31;93(34):11692-11700.
doi: 10.1021/acs.analchem.1c01465. Epub 2021 Aug 17.

CFM-ID 4.0: More Accurate ESI-MS/MS Spectral Prediction and Compound Identification

Fei Wang et al. Anal Chem.

Abstract

In metabolomics, mass spectrometry (MS) is the method most commonly used for identifying and annotating metabolites. Identification typically involves matching a measured MS spectrum against an experimentally acquired reference spectral library, so the approach is limited by the size and coverage of such libraries (which typically number in the thousands of compounds). These experimental libraries can be greatly extended by predicting the MS spectra of known chemical structures (which number in the millions) to create computational reference spectral libraries. To facilitate the generation of predicted spectral reference libraries, we developed CFM-ID, a computer program that can accurately predict the ESI-MS/MS spectrum of a given compound structure. CFM-ID is one of the best-performing methods for compound-to-mass-spectrum prediction and also one of the top tools for in silico mass-spectrum-to-compound identification. This work improves CFM-ID's ability to predict ESI-MS/MS spectra from compounds by (1) learning parameters from features based on the molecular topology, (2) adding a new approach to ring cleavage that models such cleavage as a sequence of simple chemical bond dissociations, and (3) expanding its hand-written rule-based predictor to cover more chemical classes, including acylcarnitines, acylcholines, flavonols, flavones, flavanones, and flavonoid glycosides. We demonstrate that this new version of CFM-ID (version 4.0) is significantly more accurate than previous CFM-ID versions in terms of both ESI-MS/MS spectral prediction and compound identification. CFM-ID 4.0 is available as a web server at http://cfmid4.wishartlab.com/, and Docker images can be downloaded at https://hub.docker.com/r/wishartlab/cfmid.
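
The library-matching step described in the abstract can be illustrated with a minimal sketch. It assumes a spectrum is a list of (m/z, intensity) pairs and scores candidates with a binned cosine similarity. This is a generic similarity measure chosen for illustration, not necessarily the scoring function CFM-ID uses; the function names, the bin width, and the peak values are all hypothetical.

```python
import math

def bin_spectrum(peaks, bin_width=0.01):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumption)."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def cosine_similarity(spec_a, spec_b, bin_width=0.01):
    """Cosine similarity between two spectra after binning; 0.0 if either is empty."""
    a = bin_spectrum(spec_a, bin_width)
    b = bin_spectrum(spec_b, bin_width)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_candidates(query, library):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, cosine_similarity(query, peaks)) for name, peaks in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Made-up peaks for illustration only; not real measurements or predictions.
query = [(100.0, 1.0), (50.0, 0.5)]
library = {
    "candidate_A": [(100.0, 1.0), (50.0, 0.5)],
    "candidate_B": [(75.0, 1.0)],
}
ranking = rank_candidates(query, library)
```

In this sketch the library may hold experimental or predicted spectra interchangeably, which is the point the abstract makes about extending coverage with computed spectra.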

Figures

Figure 1.
A fragmentation graph for the acetic acid [M+H]+ ion. Here each node denotes an ion fragment and each edge denotes the transition between a pair of nodes.
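
A fragmentation graph of the kind described in this caption can be sketched as a plain adjacency mapping. The three fragments and two transitions below are an illustrative assumption (a water loss followed by a CO loss) and are not taken from the figure itself; the helper names are hypothetical.

```python
from collections import deque

# Illustrative nodes: ion formula -> approximate monoisotopic m/z.
fragment_mz = {
    "C2H5O2+": 61.0284,  # protonated acetic acid, [M+H]+
    "C2H3O+": 43.0178,   # assumed product of water loss
    "CH3+": 15.0229,     # assumed product of CO loss
}

# Illustrative edges: each key can transition to the fragments in its list.
transitions = {
    "C2H5O2+": ["C2H3O+"],
    "C2H3O+": ["CH3+"],
    "CH3+": [],
}

def reachable_fragments(graph, root):
    """Breadth-first walk returning every fragment reachable from the root ion."""
    seen = [root]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen

def candidate_peaks(graph, masses, root):
    """Sorted m/z values of all fragments reachable from the root ion."""
    return sorted(masses[name] for name in reachable_fragments(graph, root))

peaks = candidate_peaks(transitions, fragment_mz, "C2H5O2+")
```

The sketch only enumerates which peaks could appear; assigning intensities would require transition probabilities, which are not modelled here.
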
Figure 2.
(a) An example of how earlier versions of CFM-ID model a fragmentation that involves only one chemical bond (highlighted in green). (b) An example of how earlier versions of CFM-ID model a fragmentation that involves a chemical ring structure. Note that the order of FV0 and FV1 is arbitrary. (c) An example of how the new version of CFM-ID models a fragmentation that involves the same chemical ring structure. Note that this method does not need the “double” feature vector design used by earlier versions of CFM-ID.
Figure 3.
(a) An ion fragment with its root atom highlighted by the blue circle. (b) Extracting a graph G from the ion fragment. (c) Labelling every vertex in the graph G. (d) Indexing each vertex based on its label. (e) Selecting a subgraph SG from graph G. (f) Computing the adjacency matrix from subgraph SG. (g) Creating two tensors from the adjacency matrix. (h) Flattening the two tensors into vectors. (i) Joining the two vectors into the output feature vector.
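
Steps (b) through (i) can be sketched in a few lines of Python. The caption does not specify the labelling rule, the subgraph size, or what the two tensors contain, so everything concrete here is an assumption: vertices are labelled by breadth-first depth from the root and by element, the subgraph keeps the first `max_atoms` vertices, and the two tensors are a binary connectivity matrix and a one-hot bond-order tensor. All names are hypothetical.

```python
from collections import deque

BOND_ORDERS = [1, 2, 3]  # assumed bond-type vocabulary

def ordered_vertices(atoms, bonds, root):
    """Steps (c)-(d): label vertices by (depth from root, element) and sort by label."""
    neighbours = {a: [] for a in atoms}
    for u, v in bonds:
        neighbours[u].append(v)
        neighbours[v].append(u)
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in neighbours[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return sorted(depth, key=lambda a: (depth[a], atoms[a], a))

def feature_vector(atoms, bonds, root, max_atoms=3):
    # Step (e): keep the first max_atoms vertices as the subgraph SG.
    kept = ordered_vertices(atoms, bonds, root)[:max_atoms]
    index = {a: i for i, a in enumerate(kept)}
    # Step (f): adjacency matrix of SG, storing bond orders (0 = no bond).
    adj = [[0] * max_atoms for _ in range(max_atoms)]
    for (u, v), bond_order in bonds.items():
        if u in index and v in index:
            adj[index[u]][index[v]] = bond_order
            adj[index[v]][index[u]] = bond_order
    # Step (g): two tensors (assumed): connectivity and one-hot bond order.
    connectivity = [[1 if x else 0 for x in row] for row in adj]
    bond_one_hot = [[[1 if x == b else 0 for b in BOND_ORDERS] for x in row]
                    for row in adj]
    # Step (h): flatten both tensors.
    flat_conn = [x for row in connectivity for x in row]
    flat_bond = [x for row in bond_one_hot for cell in row for x in cell]
    # Step (i): join into one feature vector.
    return flat_conn + flat_bond

# Heavy atoms of acetic acid: methyl C, carbonyl C, carbonyl O, hydroxyl O.
atoms = {0: "C", 1: "C", 2: "O", 3: "O"}
bonds = {(0, 1): 1, (1, 2): 2, (1, 3): 1}
vec = feature_vector(atoms, bonds, root=1)
```

With `max_atoms=3` the vector has a fixed length of 3*3 + 3*3*3 = 36 entries regardless of fragment size, which is the practical reason for selecting a bounded subgraph before building the matrix.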
