Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry
- PMID: 17389044
- PMCID: PMC1851972
- DOI: 10.1186/1471-2105-8-105
Abstract
Background: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas.
Results: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratios of nitrogen, oxygen, phosphorus, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Second, the rules were shown to effectively reduce the full set of eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Third, 6,000 pharmaceutical, toxic and natural compounds were selected from the DrugBank, TSCA and DNP databases. The correct formulas were retrieved as the top hit at 80-99% probability when assuming data acquisition with complete resolution of unique compounds, 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as the top hit when combining the seven rules with database queries.
Conclusion: The seven rules enable automatic exclusion of molecular formulas that are either wrong or contain an unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found among the first three hits with a probability of 65-81%. Corresponding software and supplemental data are available for download from the authors' website.
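To make rules (2), (4) and (5) from the Results more concrete, the following is a minimal Python sketch of how a candidate formula might be screened. It is not the authors' software. The valence table, the H/C range and the heteroatom-to-carbon limits are assumptions chosen for illustration, since the abstract does not state numeric thresholds, and the function names (`parse_formula`, `senior_check`, `ratio_check`, `passes_filters`) are hypothetical.

```python
import re

# Assumed standard valences for the six elements named in the abstract.
VALENCE = {"C": 4, "H": 1, "N": 3, "O": 2, "P": 3, "S": 2}

# Illustrative limits only (assumptions, not taken from the abstract).
HC_RANGE = (0.2, 3.1)                                   # rule 4: H/C ratio
MAX_RATIO_TO_C = {"N": 1.3, "O": 1.2, "P": 0.3, "S": 0.8}  # rule 5


def parse_formula(formula):
    """Turn a string such as 'C6H12O6' into {'C': 6, 'H': 12, 'O': 6}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def senior_check(counts):
    """Rule 2 (SENIOR-style check) using the assumed valences above."""
    if any(element not in VALENCE for element in counts):
        raise ValueError("element outside the C, H, N, O, P, S set")
    total_valence = sum(VALENCE[e] * n for e, n in counts.items())
    atoms = sum(counts.values())
    max_valence = max(VALENCE[e] for e in counts)
    return (
        total_valence % 2 == 0                  # sum of valences is even
        and total_valence >= 2 * max_valence    # enough bonds for the largest valence
        and total_valence >= 2 * (atoms - 1)    # enough bonds for a connected molecule
    )


def ratio_check(counts):
    """Rules 4 and 5: element ratios relative to carbon."""
    carbon = counts.get("C", 0)
    if carbon == 0:
        return True  # ratios to carbon are undefined; skipped in this sketch
    hc = counts.get("H", 0) / carbon
    if not HC_RANGE[0] <= hc <= HC_RANGE[1]:
        return False
    return all(
        counts.get(element, 0) / carbon <= limit
        for element, limit in MAX_RATIO_TO_C.items()
    )


def passes_filters(formula):
    counts = parse_formula(formula)
    return senior_check(counts) and ratio_check(counts)


if __name__ == "__main__":
    for candidate in ["C6H12O6", "CH5", "C2H4O6"]:
        print(candidate, passes_filters(candidate))
```

Under these assumed limits, glucose (C6H12O6) passes, CH5 fails the valence parity test, and C2H4O6 is rejected for its oxygen/carbon ratio. A real implementation would also need the isotopic pattern scoring, the element count restrictions and the database lookup described in the abstract, none of which are sketched here.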