BMC Bioinformatics. 2010 Mar 22:11:148.
doi: 10.1186/1471-2105-11-148.

In silico fragmentation for computer assisted identification of metabolite mass spectra

Sebastian Wolf et al. BMC Bioinformatics. 2010.

Abstract

Background: Mass spectrometry has become the analytical method of choice in metabolomics research. The identification of unknown compounds is the main bottleneck. In addition to the precursor mass, tandem MS spectra carry informative fragment peaks, but spectral libraries of measured reference compounds are far from covering the complete chemical space. Compound libraries such as PubChem or KEGG describe a larger number of compounds, whose in silico fragmentation can be compared with the spectra of unknown metabolites.

Results: We created the MetFrag suite to obtain a candidate list from compound libraries based on the precursor mass, subsequently ranked by the agreement between measured and in silico fragments. In the evaluation MetFrag was able to rank most of the correct compounds within the top 3 candidates returned by an exact mass query in KEGG. Compared to a previously published study, MetFrag obtained better results than the commercial MassFrontier software. Especially for large compound libraries, the candidates with a good score show a high structural similarity or differ only in stereochemistry; a subsequent clustering based on chemical distances reduces this redundancy. The in silico fragmentation requires less than a second to process a molecule, and MetFrag performs a search in KEGG or PubChem on average within 30 or 300 seconds, respectively, on an average desktop PC.

Conclusions: We presented a method that is able to identify small molecules from tandem MS measurements, even without spectral reference data or a large set of fragmentation rules. With today's massive general purpose compound libraries we obtain dozens of very similar candidates, which still allows a confident estimate of the correct compound class. Our tool MetFrag improves the identification of unknown substances from tandem MS spectra and delivers better results than comparable commercial software. MetFrag is available through a web application, web services and as a Java library. The web frontend allows the end-user to analyse single spectra and browse the results, whereas the web service and console application are intended for batch searches and evaluation.
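The search described in the abstract (candidates selected by precursor mass, then ranked by agreement between measured and in silico fragments) can be illustrated with a minimal Python sketch. It is not MetFrag's implementation: the `Candidate` class, the function names, the 10 ppm tolerance and all masses are hypothetical, and the score is simply the number of explained peaks, whereas the real scoring may weigh further factors.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    exact_mass: float             # neutral monoisotopic mass in Da (toy value)
    fragment_masses: list[float]  # hypothetical in silico fragment masses


def within_ppm(measured: float, theoretical: float, ppm: float) -> bool:
    """True if the two masses agree within a relative tolerance in ppm."""
    return abs(measured - theoretical) <= theoretical * ppm * 1e-6


def select_candidates(library, precursor_mass, ppm=10.0):
    """Exact mass query: keep library entries matching the precursor mass."""
    return [c for c in library if within_ppm(precursor_mass, c.exact_mass, ppm)]


def count_explained_peaks(peaks, candidate, ppm=10.0):
    """Assumed score: number of measured peaks matched by any fragment."""
    return sum(
        any(within_ppm(p, f, ppm) for f in candidate.fragment_masses)
        for p in peaks
    )


def rank_candidates(library, precursor_mass, peaks, ppm=10.0):
    """Return (score, name) pairs, best score first, ties broken by name."""
    hits = select_candidates(library, precursor_mass, ppm)
    scored = [(count_explained_peaks(peaks, c, ppm), c.name) for c in hits]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


# Made-up toy library and spectrum, for illustration only.
library = [
    Candidate("A", 272.0685, [153.0182, 119.0491]),
    Candidate("B", 272.0685, [153.0182]),
    Candidate("C", 300.0000, [153.0182, 119.0491]),
]
peaks = [153.0180, 119.0490]
ranking = rank_candidates(library, 272.0684, peaks)
```

Here candidate C is excluded by the mass query, and A ranks above B because it explains both peaks.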

Figures

Figure 1
Workflow of a search based on exact mass and tandem MS spectrum. First the upstream compound library is searched using its web service API. The candidates are then scored by matching the measured peaks against the in silico fragments.
Figure 2
Algorithm for in silico fragmentation. Each compound is fragmented using the bond dissociation approach. Bonds in ring systems need special treatment. Every possible structure is generated until a given tree depth is reached. The redundancy heuristics and mass checks reduce the search space.
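A toy version of the bond dissociation idea in this caption can be sketched in Python. Everything here is an assumption made for illustration: the molecule is a plain graph of atom indices with made-up masses, the function names are hypothetical, duplicate states are skipped as a stand-in for the redundancy heuristics, and fragments lighter than `min_mass` are dropped as a stand-in for the mass checks. Ring bonds are handled in the simplest way: removing one ring bond only opens the ring, so a second cleavage (one more tree level) is needed before a piece is released.

```python
from collections import deque


def components(atoms, bonds):
    """Split an atom set into connected components under the given bonds."""
    adjacency = {a: set() for a in atoms}
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)
    remaining, parts = set(atoms), []
    while remaining:
        stack = [remaining.pop()]
        part = set(stack)
        while stack:
            for n in adjacency[stack.pop()]:
                if n not in part:
                    part.add(n)
                    stack.append(n)
        remaining -= part
        parts.append(frozenset(part))
    return parts


def fragment(masses, bonds, max_depth=2, min_mass=0.0):
    """Enumerate fragments by breadth-first bond removal up to max_depth.

    masses: dict atom index -> mass; bonds: iterable of (atom, atom) pairs.
    Returns the set of atom sets reached (excluding the intact molecule).
    """
    root = (frozenset(masses), frozenset(bonds))
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        (atoms, edges), depth = queue.popleft()
        if depth == max_depth:
            continue  # tree depth limit reached
        for bond in edges:
            rest = edges - {bond}
            for part in components(atoms, rest):
                kept = frozenset(b for b in rest if b[0] in part and b[1] in part)
                child = (part, kept)
                mass = sum(masses[a] for a in part)
                if mass < min_mass or child in seen:
                    continue  # mass check / redundancy check
                seen.add(child)
                queue.append((child, depth + 1))
    seen.discard(root)
    return {atoms for atoms, _ in seen}


# Toy inputs with made-up masses: a three-atom chain and a three-membered ring.
chain_masses = {0: 12.0, 1: 14.0, 2: 16.0}
chain = fragment(chain_masses, {(0, 1), (1, 2)}, max_depth=2)
ring_depth1 = fragment(chain_masses, {(0, 1), (1, 2), (0, 2)}, max_depth=1)
ring_depth2 = fragment(chain_masses, {(0, 1), (1, 2), (0, 2)}, max_depth=2)
```

For the ring, depth one yields only the opened ring (the same three atoms), while depth two releases the smaller pieces, which is one way to see why ring bonds need separate handling and why the number of states grows quickly with tree depth.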
Figure 3
Annotated tandem MS spectrum of Epicatechin. This spectrum for Epicatechin was measured on a Bruker-micrOTOFQ mass spectrometer and manually annotated by an expert. The measured peaks and corresponding fragments for the major signals are depicted. In addition, the non-topological water loss is highlighted in blue.
Figure 4
MetFrag web interface. The web interface with the search parameters at the top and the result list below. The extra window can be opened for each result and shows details such as the spectrum and matching fragment structures.
Figure 5
Top candidates for Naringenin against PubChem. The 9 top ranked compounds are shown; the correct solution (CID 932) is reported at (tied) rank 8. Two clusters of structures (green and blue) are identical apart from their stereochemistry. The remaining three structures (yellow) that explain all six tandem MS peaks have a Tanimoto similarity < 0.95. After clustering with a similarity ≥ 0.95 the stereoisomers are collapsed into one cluster, resulting in a cluster rank 5 for the correct solution.
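The effect of collapsing near-identical candidates can be sketched in Python. The code below is an illustration under stated assumptions, not the clustering used by MetFrag: fingerprints are plain sets of on-bits with made-up values, clustering is single-linkage at the 0.95 threshold from the caption, and the cluster rank is counted pessimistically (every cluster whose best score ties or beats the target's cluster counts). All names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def cluster(fingerprints, threshold=0.95):
    """Single-linkage clustering: link any pair with similarity >= threshold.

    Returns a dict mapping each name to a representative cluster label.
    """
    names = list(fingerprints)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, x in enumerate(names):
        for y in names[i + 1:]:
            if tanimoto(fingerprints[x], fingerprints[y]) >= threshold:
                parent[find(x)] = find(y)
    return {n: find(n) for n in names}


def cluster_rank(scores, fingerprints, target, threshold=0.95):
    """Assumed convention: count clusters scoring at least as well as the target's."""
    labels = cluster(fingerprints, threshold)
    best = {}
    for name, score in scores.items():
        label = labels[name]
        best[label] = max(best.get(label, score), score)
    target_score = best[labels[target]]
    return sum(1 for s in best.values() if s >= target_score)


# Made-up data: two pairs of stereoisomers share identical 2D fingerprints.
fingerprints = {
    "a1": {1, 2, 3}, "a2": {1, 2, 3},
    "b1": {4, 5, 6}, "b2": {4, 5, 6},
    "c": {1, 2, 7},
    "d": {8, 9},
}
scores = {"a1": 6, "a2": 6, "b1": 6, "b2": 6, "c": 6, "d": 4}
rank_of_c = cluster_rank(scores, fingerprints, "c")
```

In this toy case five candidates tie on score, but only three clusters remain after collapsing the stereoisomer pairs, so the pessimistic rank of candidate `c` improves from 5 to 3.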
Figure 6
Empirical runtime. Runtime for the in silico fragmentation step on 5900 compounds randomly drawn from PubChem, with uniform mass distribution between 100 and 1000 Da. Limiting the tree depth of the in silico fragmentation to two (orange) results in an average runtime of 0.2 s for one compound. The exponential growth in runtime is most visible when a larger tree depth (red) is used, raising the runtime to 3.4 s.
