Anal Chem. 2018 Mar 6;90(5):3156-3164.
doi: 10.1021/acs.analchem.7b04424. Epub 2018 Feb 9.

METLIN: A Technology Platform for Identifying Knowns and Unknowns


Carlos Guijas et al. Anal Chem.

Abstract

METLIN originated as a database to characterize known metabolites and has since expanded into a technology platform for the identification of known and unknown metabolites and other chemical entities. Through this effort it has become a comprehensive resource containing over 1 million molecules including lipids, amino acids, carbohydrates, toxins, small peptides, and natural products, among other classes. METLIN's high-resolution tandem mass spectrometry (MS/MS) database, which plays a key role in the identification process, has data generated from both reference standards and their labeled stable isotope analogues, facilitated by METLIN-guided analysis of isotope-labeled microorganisms. The MS/MS data, coupled with the fragment similarity search function, expand the tool's capabilities into the identification of unknowns. Fragment similarity search is performed independently of the precursor mass, relying solely on the fragment ions to identify similar structures within the database. Stable isotope data also facilitate characterization by coupling the similarity search output with the isotopic m/z shifts. Examples of both are demonstrated here with the characterization of four previously unknown metabolites. METLIN also now features in silico MS/MS data, which have been made possible through the creation of algorithms trained on METLIN's MS/MS data from both standards and their isotope analogues. With these informatic and experimental data features, METLIN is being designed to address the characterization of known and unknown molecules.
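
The idea of a fragment similarity search that ignores the precursor mass can be illustrated with a minimal sketch. This is not METLIN's implementation: the function name, the tolerance, the scoring (a simple count of matched fragments), and the toy library values are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def fragment_similarity_search(
    query_fragments: List[float],
    library: Dict[str, List[float]],
    tol_da: float = 0.01,
) -> List[Tuple[str, int]]:
    """Rank library entries by the number of query fragments they contain.

    The precursor m/z is deliberately never consulted: only fragment
    m/z values are compared, within an absolute tolerance in Da.
    """
    hits = []
    for name, lib_fragments in library.items():
        matched = sum(
            any(abs(q - f) <= tol_da for f in lib_fragments)
            for q in query_fragments
        )
        if matched:
            hits.append((name, matched))
    # Most matched fragments first; ties broken alphabetically.
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits


# Hypothetical toy library: placeholder m/z values, not METLIN data.
toy_library = {
    "compound_A": [85.03, 121.05, 179.07, 299.13],
    "compound_B": [85.03, 97.06, 150.10],
    "compound_C": [60.04, 72.08],
}
query = [85.03, 121.05, 179.07, 299.13]
ranked = fragment_similarity_search(query, toy_library)
print(ranked)
```

Because the precursor is ignored, a modified analogue of a library compound (for example a conjugate) can still surface as a hit through its shared fragments.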

Conflict of interest statement

Notes

The authors declare no competing financial interest.

Figures

Figure 1
METLIN search functions for metabolite identification. (A) Simple Search and Advanced Search allow the user to search small molecules against a database of 1 million compounds according to different criteria and to retrieve their chemical, spectral, and other information of interest. Batch Search facilitates the simultaneous search of many m/z values of interest, helping to identify different m/z values as distinct adducts or water losses of the same molecule. (B) With the MS/MS Spectrum Match Search, experimental and library MS/MS spectra can be searched, matched, and scored automatically. (C) Fragment Similarity Search and Neutral Loss Search aid the identification of metabolites or chemical structures by searching m/z values of the fragments or neutral losses, respectively, regardless of the precursor mass.
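
How several m/z values can be recognized as adducts or water losses of one molecule, as described for Batch Search, can be sketched as follows. This is an illustrative assumption, not METLIN's code: the adduct list is limited to three common positive-mode species, the tolerance is arbitrary, and glucose is used only as a convenient example molecule.

```python
from typing import Dict, List, Optional

# Mass offsets (Da) added to the neutral monoisotopic mass M.
ADDUCT_OFFSETS = {
    "[M+H]+": 1.007276,        # proton
    "[M+Na]+": 22.989218,      # sodium cation
    "[M+H-H2O]+": -17.003289,  # proton minus water (1.007276 - 18.010565)
}


def annotate_adducts(
    observed_mz: List[float], neutral_mass: float, tol_da: float = 0.005
) -> Dict[float, Optional[str]]:
    """Label each observed m/z with the adduct of `neutral_mass` it matches."""
    labels: Dict[float, Optional[str]] = {}
    for mz in observed_mz:
        labels[mz] = None
        for adduct, offset in ADDUCT_OFFSETS.items():
            if abs(mz - (neutral_mass + offset)) <= tol_da:
                labels[mz] = adduct
                break
    return labels


# Example molecule chosen for illustration: glucose, C6H12O6.
GLUCOSE_MONOISOTOPIC = 180.06339
peaks = [181.0707, 203.0526, 163.0601, 250.0000]
annotations = annotate_adducts(peaks, GLUCOSE_MONOISOTOPIC)
for mz, label in annotations.items():
    print(mz, label)
```

Three of the four peaks collapse onto a single neutral mass, which is the kind of grouping the caption attributes to Batch Search; the fourth peak stays unassigned.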
Figure 2
Fragment Similarity Search facilitates the identification of unknown metabolites where no MS/MS spectral data are available. Two examples are shown where an unknown metabolite is characterized by use of Fragment Similarity Search: (A) a glucuronide of xanthohumol and (B) a desaturation variation of α-tocopherol. (A) The fragments of an unknown metabolite were searched against METLIN and all four fragments were found to match xanthohumol. The comparison between experimental and library MS/MS spectra implies high structural similarity. Furthermore, the 176.03 Da difference between the precursor of the experimental spectrum and the protonated species of xanthohumol can be attributed to glucuronidation. This mass difference corresponds to the addition of glucuronic acid to xanthohumol with loss of H2O (condensation product). (B) Five selected fragments of an unknown metabolite matched three fragments of α-tocopherol; however, the mass difference for nonmatching fragments as well as the precursor is 2.01 Da. This could be attributed to an extra double bond within the structure of α-tocopherol, presumably on the long aliphatic chain.
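
The two mass differences in this caption can be checked with a short calculation. The sketch below is an illustration under stated assumptions: the helper names and the two-entry shift table are hypothetical, the sign convention is (unknown minus reference), and the caption's 2.01 Da difference is treated as negative on the assumption that an extra double bond makes the unknown lighter by two hydrogens.

```python
from typing import Dict, List

# Monoisotopic atomic masses (Da).
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def formula_mass(counts: Dict[str, int]) -> float:
    """Monoisotopic mass of an elemental composition such as {'C': 6, 'H': 8}."""
    return sum(MONO[element] * n for element, n in counts.items())


# Glucuronic acid (C6H10O7) minus water (H2O) leaves a net gain of C6H8O6.
GLUCURONIDE_SHIFT = formula_mass({"C": 6, "H": 10, "O": 7}) - formula_mass(
    {"H": 2, "O": 1}
)

# Hypothetical lookup table of modification shifts (unknown minus reference).
KNOWN_SHIFTS = {
    "glucuronidation (+C6H8O6)": GLUCURONIDE_SHIFT,
    "desaturation (-H2)": -formula_mass({"H": 2}),
}


def explain_shift(delta_da: float, tol_da: float = 0.01) -> List[str]:
    """Return the known modifications whose mass shift matches `delta_da`."""
    return [
        label
        for label, shift in KNOWN_SHIFTS.items()
        if abs(delta_da - shift) <= tol_da
    ]


print(round(GLUCURONIDE_SHIFT, 4))  # about 176.0321
print(explain_shift(176.03))
print(explain_shift(-2.01))
```

A real annotation table would hold many more biotransformations; two entries are enough to reproduce the reasoning in panels (A) and (B).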
Figure 3
METLIN-guided use of 13C-labeled microorganism extracts as internal standards in mass spectrometry. Yeast are grown in the presence of 13C-glucose, yielding a labeling efficiency of 99% for their metabolites. After the extraction of the compounds of interest to use as internal standards, samples are spiked with those extracts to quantify many metabolites at the same time, using the MS/MS data provided by the spectral databases. The generation of MS/MS spectra to populate databases is a limiting step in this workflow.
Figure 4
Isotope-labeled microorganisms as a source of MS/MS spectra to populate spectral repositories. (A) An untargeted metabolomics analysis of two extracts of 12C- and 13C-labeled yeast was carried out to collect MS/MS spectra for METLIN and isoMETLIN. (B) If the putative metabolite MS/MS spectrum is recorded in METLIN, the fragmentation spectrum of its 13C-labeled analogue is easily identified for inclusion into isoMETLIN. (C) If the putative metabolite MS/MS spectrum is not displayed in METLIN, it is possible to obtain both 12C- and 13C-labeled spectra for their inclusion into METLIN and isoMETLIN, respectively, through the use of METLIN search functions, together with the in silico prediction and fragment predicted structure of structurally related molecules. Even if the parent m/z of the candidate molecule is not found in METLIN, it is likely that one will obtain structural information leading to its identification by use of METLIN tools. With this workflow, spectral databases are used to self-populate, by using their tools and current spectra to identify new MS/MS spectra.
Figure 5
Use of isotope-labeled microorganisms and METLIN to determine the structure of unknown molecules. Starting from the unlabeled and 13C-labeled MS/MS spectra of an unknown metabolite, it is possible to obtain structural information with the use of METLIN tools. The m/z shift of 30.10 Da in the parent ions indicates the presence of 30 carbons in this metabolite. The neutral loss of 141.02 Da in the unlabeled molecule, together with the neutral loss of 143.03 Da in the 13C-labeled molecule, indicates the presence of a phosphoethanolamine group (C2H8NO4P). Fragments of 44.05 and 46.06 Da represent the main fragments of the phosphoethanolamine group in the unlabeled and labeled molecules, respectively. Given that the glycerophosphoethanolamine group is composed of 5 carbons, the rest of the molecule must have 25 carbons. The most likely biomolecule fitting those requirements, with a parent m/z within an instrument error of 10 ppm, is 1-hexadecanoyl-2-(9-oxononanoyl)-sn-glycero-3-phosphoethanolamine.
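
The carbon counting in this caption follows from the mass difference between 13C and 12C, about 1.00335 Da per carbon. The sketch below reproduces that arithmetic; the function names are hypothetical, full labeling of every carbon is assumed, and the m/z pair passed to the ppm helper is a made-up placeholder rather than a value from the paper.

```python
C13_MINUS_C12 = 1.0033548  # Da, mass difference between 13C and 12C


def carbons_from_shift(shift_da: float) -> int:
    """Estimate the carbon count from a labeled-minus-unlabeled m/z shift.

    Assumes every carbon in the ion is 13C in the labeled sample.
    """
    return round(shift_da / C13_MINUS_C12)


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Values quoted in the caption.
total_carbons = carbons_from_shift(30.10)                # parent ion shift
headgroup_carbons = carbons_from_shift(143.03 - 141.02)  # neutral loss shift
fragment_carbons = carbons_from_shift(46.06 - 44.05)     # 44.05 vs 46.06 fragment
remaining_carbons = total_carbons - 5                    # minus glycerophosphoethanolamine

print(total_carbons, headgroup_carbons, fragment_carbons, remaining_carbons)

# Placeholder m/z values, used only to show the 10 ppm criterion.
print(abs(ppm_error(500.0050, 500.0000)) <= 10.0 + 1e-6)
```

The same division applies to any fragment or neutral loss observed in both spectra, which is how the labeled data constrain the elemental composition of each piece of the molecule.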
Figure 6
In silico data generation. (A) Workflow for in silico data simulation. A generalization of the input–output kernel regression model, especially designed to predict fragments of known molecules, is used to generate in silico data. Both unlabeled and isotope-labeled compounds are used for model training, providing additional information through the number of isotope-labeled atoms of each fragment. (B) Comparison between the experimental MS/MS spectrum generated by lysoPE(18:0) and its in silico prediction in METLIN, at a collision energy of 10 eV. Notably, 6 of the 7 main fragments of the experimental spectrum match the in silico simulated data (highlighted in blue).
