Anal Chem. 2011 Aug 1;83(15):5895-902.
doi: 10.1021/ac2006137. Epub 2011 Jun 28.

Applying in-silico retention index and mass spectra matching for identification of unknown metabolites in accurate mass GC-TOF mass spectrometry

Sangeeta Kumari et al. Anal Chem. 2011.

Abstract

One of the major obstacles in metabolomics is the identification of unknown metabolites. We tested constraints for reidentifying the correct structures of 29 known metabolite peaks from GCT Premier accurate mass chemical ionization GC-TOF mass spectrometry data without any use of mass spectral libraries. Correct elemental formulas were retrieved within the top-3 hits for most molecular ion adducts using the "Seven Golden Rules" algorithm. An average of 514 potential structures per formula was downloaded from the PubChem chemical database and derivatized in silico using the ChemAxon software package. After chemical curation, Kovats retention indices (RI) were predicted for up to 747 potential structures per formula using the NIST MS group contribution algorithm and corrected for the contribution of trimethylsilyl groups using the Fiehnlib RI library. When the range of predicted RI values was matched against the experimentally determined peak retention, all but three incorrect formulas were excluded. For all remaining isomeric structures, accurate mass electron ionization spectra were predicted using the MassFrontier software and scored against experimental spectra. Using a mass error window of 10 ppm for fragment ions, 89% of all isomeric structures were removed, and the correct structure was reported within the top-5 hits in 73% of the cases.
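The retention-index constraint in the abstract (keep a formula only if the experimental peak retention falls within the range of RI values predicted for its candidate structures) can be illustrated with a minimal Python sketch. This is not the authors' code: the class and function names, the optional `tolerance` parameter, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FormulaCandidate:
    """One candidate elemental formula with predicted RIs of its structures.

    Hypothetical container; the values used below are illustrative only.
    """
    formula: str
    predicted_ris: List[float]


def ri_range_contains(candidate: FormulaCandidate,
                      experimental_ri: float,
                      tolerance: float = 0.0) -> bool:
    """Return True if the experimental RI lies inside the predicted RI range.

    `tolerance` (an assumption, not from the paper) widens the range on
    both sides to allow for prediction error.
    """
    if not candidate.predicted_ris:
        return False
    low = min(candidate.predicted_ris) - tolerance
    high = max(candidate.predicted_ris) + tolerance
    return low <= experimental_ri <= high


def filter_formulas(candidates: List[FormulaCandidate],
                    experimental_ri: float,
                    tolerance: float = 0.0) -> List[FormulaCandidate]:
    """Keep only formulas whose predicted RI range covers the experiment."""
    return [c for c in candidates
            if ri_range_contains(c, experimental_ri, tolerance)]


# Illustrative (made-up) predicted RI values for two candidate formulas.
candidates = [
    FormulaCandidate("formula_A", [1500.0, 1620.0, 1710.0]),
    FormulaCandidate("formula_B", [2100.0, 2250.0]),
]
kept = filter_formulas(candidates, experimental_ri=1650.0)
print([c.formula for c in kept])
```

A range test of this kind is deliberately permissive: it discards a formula only when none of its candidate structures could plausibly elute near the observed peak, and leaves ranking of the surviving isomers to a later step.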

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Workflow for de-novo annotation of unknown metabolic peaks matching predicted versus experimental data in derivatization-based accurate mass GC-TOFMS analysis.
Figure 2. Constraining structures using the JChem ‘Standardizer’ and ‘Reactor’ tools. Example for 6 out of 570 structures downloaded from the PubChem database for a query of C6H14N2O2, constrained as a tetra-trimethylsilylated structure. Standardizer: Structures CID 134478 and 161547 are removed due to isotope inclusion or fragmented structures. Chiral centers are removed. Reactor: Structures CID 148609 and 147026 are removed because all of their derivatization products have fewer than 4 trimethylsilyl (TMS) groups. Structure CID 331 yields 2 new structures, each with 4 TMS groups, which are used for retention index calculation.
Figure 3. TMS-group contribution error for retention index prediction using the NIST MS tool.
Figure 3A: Difference of experimental versus predicted spectra for 286 hydroxyl-, carboxyl-, amine- and thiol-comprising metabolites of the Fiehnlib metabolomics library. The trendline for dependence of prediction error versus number of trimethylsilyl groups was used as additional TMS-group contribution correction.
Figure 3B: 26 trimethylsilylated terminal amines from the Fiehnlib metabolomics library showed a positive trend from underivatized amines to tetraTMS-derivatized amines. This error for 2 and 4 terminal amine-TMS groups was subsequently used to correct the NIST MS tool as additional group contribution.
Figure 4. Scoring remaining structures by accurate mass EI-spectra matching in Mass Frontier. Top panel: experimental spectra (here magnified from m/z 116–141 u) are matched against all ions that are predicted by the Mass Frontier algorithm (lower panels, spectra labeled red). For clarity, computational fragment ions that do not match at least the nominal experimental masses are left out. For asparagine 3TMS (PubChem CID 6267), all experimental ions were computationally predicted within a 10 ppm mass range. For CID 352913 (isoasparagine 3TMS) and CID 9904561 (3-amino-1,4-dihydroxy-pyrrolidin-2-one 3TMS), no ions matching m/z 132.0834 could be found (labeled in red) when considering all possible fragmentation routes given by Mass Frontier.
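The 10 ppm fragment-ion window used for spectral matching can be illustrated with a short Python sketch. It is a simplified stand-in, not the Mass Frontier scoring itself: the function names and the "fraction of experimental ions explained" score are assumptions made for illustration, and the predicted m/z values below are hypothetical. Only the experimental ion m/z 132.0834 and the 10 ppm window come from the text above.

```python
from typing import Sequence


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed ion relative to a predicted ion, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def fraction_explained(experimental_mz: Sequence[float],
                       predicted_mz: Sequence[float],
                       window_ppm: float = 10.0) -> float:
    """Fraction of experimental ions that have a predicted ion within the window.

    A deliberately simple score (an assumption, not the paper's scoring):
    1.0 means every experimental ion is explained by some predicted fragment.
    """
    if not experimental_mz:
        return 0.0
    matched = sum(
        1 for obs in experimental_mz
        if any(abs(ppm_error(obs, pred)) <= window_ppm for pred in predicted_mz)
    )
    return matched / len(experimental_mz)


# m/z 132.0834 is the experimental ion named in the Figure 4 caption;
# the predicted values are hypothetical, chosen to fall inside and
# outside the 10 ppm window respectively.
inside = fraction_explained([132.0834], [132.0840])
outside = fraction_explained([132.0834], [132.0860])
print(inside, outside)
```

At m/z 132, a 10 ppm window corresponds to roughly ±0.0013 u, which is why a candidate structure with no predicted fragment close to 132.0834 can be distinguished from one that has such a fragment.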

