Nat Biotechnol. 2022 Dec;40(12):1774-1779.
doi: 10.1038/s41587-022-01368-1. Epub 2022 Jul 7.

Enhancing untargeted metabolomics using metadata-based source annotation

Julia M Gauglitz # 1,2; Kiana A West # 1,2; Wout Bittremieux # 1,2; Candace L Williams 3; Kelly C Weldon 1,2,4; Morgan Panitchpakdi 1,2; Francesca Di Ottavio 1; Christine M Aceves 1,2; Elizabeth Brown 2,5; Nicole C Sikora 1,2; Alan K Jarmusch 1,2; Cameron Martino 4,6,7; Anupriya Tripathi 2,5,6; Michael J Meehan 1,2; Kathleen Dorrestein 1,2; Justin P Shaffer 6; Roxana Coras 8; Fernando Vargas 1,2,5; Lindsay DeRight Goldasich 6; Tara Schwartz 6; MacKenzie Bryant 6; Gregory Humphrey 6; Abigail J Johnson 9; Katharina Spengler 1; Pedro Belda-Ferre 4,6; Edgar Diaz 6; Daniel McDonald 6; Qiyun Zhu 6; Emmanuel O Elijah 1,2; Mingxun Wang 1,2; Clarisse Marotz 6; Kate E Sprecher 10,11; Daniela Vargas-Robles 12; Dana Withrow 10; Gail Ackermann 6; Lourdes Herrera 13; Barry J Bradford 14; Lucas Maciel Mauriz Marques 15; Juliano Geraldo Amaral 16; Rodrigo Moreira Silva 17; Flavio Protasio Veras 15; Thiago Mattar Cunha 15; Rene Donizeti Ribeiro Oliveira 18; Paulo Louzada-Junior 18; Robert H Mills 1,2,6,19; Paulina K Piotrowski 20; Stephanie L Servetas 20; Sandra M Da Silva 20; Christina M Jones 20; Nancy J Lin 20; Katrice A Lippa 20; Scott A Jackson 20; Rima Kaddurah Daouk 21,22,23; Douglas Galasko 24; Parambir S Dulai 25; Tatyana I Kalashnikova 26; Curt Wittenberg 26; Robert Terkeltaub 8,27; Megan M Doty 6,28; Jae H Kim 29; Kyung E Rhee 6; Julia Beauchamp-Walters 30; Kenneth P Wright Jr 10; Maria Gloria Dominguez-Bello 31; Mark Manary 32; Michelli F Oliveira 33; Brigid S Boland 25; Norberto Peporine Lopes 17; Monica Guma 8; Austin D Swafford 4; Rachel J Dutton 5; Rob Knight 34,35,36,37,38; Pieter C Dorrestein 39,40,41,42,43

Julia M Gauglitz et al. Nat Biotechnol. 2022 Dec.

Erratum in

  • Author Correction: Enhancing untargeted metabolomics using metadata-based source annotation.
    Gauglitz JM, West KA, Bittremieux W, Williams CL, Weldon KC, Panitchpakdi M, Di Ottavio F, Aceves CM, Brown E, Sikora NC, Jarmusch AK, Martino C, Tripathi A, Meehan MJ, Dorrestein K, Shaffer JP, Coras R, Vargas F, Goldasich LD, Schwartz T, Bryant M, Humphrey G, Johnson AJ, Spengler K, Belda-Ferre P, Diaz E, McDonald D, Zhu Q, Elijah EO, Wang M, Marotz C, Sprecher KE, Vargas-Robles D, Withrow D, Ackermann G, Herrera L, Bradford BJ, Marques LMM, Amaral JG, Silva RM, Veras FP, Cunha TM, Oliveira RDR, Louzada-Junior P, Mills RH, Piotrowski PK, Servetas SL, Da Silva SM, Jones CM, Lin NJ, Lippa KA, Jackson SA, Daouk RK, Galasko D, Dulai PS, Kalashnikova TI, Wittenberg C, Terkeltaub R, Doty MM, Kim JH, Rhee KE, Beauchamp-Walters J, Wright KP Jr, Dominguez-Bello MG, Manary M, Oliveira MF, Boland BS, Lopes NP, Guma M, Swafford AD, Dutton RJ, Knight R, Dorrestein PC. Nat Biotechnol. 2023 Nov;41(11):1656. doi: 10.1038/s41587-023-02025-x. PMID: 37853256. No abstract available.

Abstract

Human untargeted metabolomics studies annotate only ~10% of molecular features. We introduce reference-data-driven analysis to match metabolomics tandem mass spectrometry (MS/MS) data against metadata-annotated source data as a pseudo-MS/MS reference library. Applying this approach to food source data, we show that it increases MS/MS spectral usage 5.1-fold over conventional structural MS/MS library matches and allows empirical assessment of dietary patterns from untargeted data.

Figures

Figure 1. The concept of the reference-data-driven (RDD) analysis workflow.
(1) Perform spectral alignment of the MS/MS-based untargeted metabolomics data from human biospecimens with data from reference samples that have controlled vocabularies for metadata. This can optionally be combined with MS/MS libraries. (2) Link the spectral matches to the source information in the metadata of the reference samples. Create a data table of source ontology, human biospecimen and counts to enable data science and interpretation.
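The linking and counting steps in this workflow can be illustrated with a minimal Python sketch. It assumes the spectral alignment has already been done upstream and that each match is available as a (biospecimen, reference sample) pair. All sample names, ontology terms, and the `build_count_table` helper are hypothetical and are not taken from the paper or its software.

```python
from collections import Counter

# Hypothetical toy inputs; none of these names or values come from the paper.
# Each match pairs a human biospecimen sample with the reference sample whose
# MS/MS spectrum it aligned to (the alignment step is assumed done upstream).
spectral_matches = [
    ("plasma_01", "ref_apple"),
    ("plasma_01", "ref_apple"),
    ("plasma_01", "ref_cheddar"),
    ("plasma_02", "ref_cheddar"),
    ("plasma_02", "ref_yogurt"),
]

# Controlled-vocabulary metadata for the reference samples: one source
# ontology term per reference sample (a single ontology level, for brevity).
reference_metadata = {
    "ref_apple": "fruit",
    "ref_cheddar": "dairy",
    "ref_yogurt": "dairy",
}


def build_count_table(matches, metadata):
    """Count spectral matches per (biospecimen, source term) pair."""
    counts = Counter()
    for biospecimen, reference in matches:
        source = metadata.get(reference)
        if source is None:
            continue  # skip matches to references that lack metadata
        counts[(biospecimen, source)] += 1
    return counts


count_table = build_count_table(spectral_matches, reference_metadata)
for (biospecimen, source), n in sorted(count_table.items()):
    print(biospecimen, source, n)
```

The resulting table of biospecimen, source term, and count is the kind of input that downstream statistics or ordination could work from. A real analysis would carry several ontology levels per reference sample instead of one.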
Figure 2. RDD with food reference data.
a. Food RDD analysis schema. b. Food spectral counts (1% FDR) observed in plasma from a sleep restriction and circadian misalignment study that controlled the diet of the participants (n=371 samples from 20 healthy adults). The size of each node represents the relative number of spectral matches at each food level. Blue arrow: foods that could be explained although they were not provided in the study; orange arrow: source is not known. c. A crossover experiment between centenarian data from Italy and a sleep and circadian study from the US, for both fecal and plasma samples. Study-region-specific foods consumed by those individuals (yes) vs. a different set of study-region-specific foods (no) (one-way Welch’s t-test; the thick line is the mean, the box spans the interquartile range from the 25th to the 75th percentile, and the whiskers are the minimum and maximum). d. PCA of food counts, color-coded by vegan (brown) vs. omnivore (green) data. e. Statistical analysis of the food counts at level 3 of the ontology, in relation to omnivore and vegan data (Wilcoxon test, n=36, 19 are vegan and 19 are omnivore). f. Same as e, but at level 4 of the ontology, using unique spectral counts (spectral usage is the percentage of MS/MS spectra used in the analysis). Because the ontology levels are unnamed, unlike the ranks found in microorganism phylogeny in microbiome science (e.g., kingdom, genus, species), we have denoted them as layers (Table S1). For e and f, the boxes represent the interquartile range (IQR): the lower limit (Q1) is the 25th percentile, the center line is the median (Q2), and the upper limit (Q3) is the 75th percentile. Bars show Q3 + 1.5 × IQR and Q1 − 1.5 × IQR.
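The box-plot quantities named in the caption for panels e and f (Q1, median, Q3, and the bars at Q3 + 1.5 × IQR and Q1 − 1.5 × IQR) can be computed as in the sketch below. The input values are toy numbers, not data from the study, and the "inclusive" quartile method is an assumption, since the caption does not say which quartile convention was used. The `box_plot_limits` helper is hypothetical.

```python
import statistics


def box_plot_limits(values):
    """Return (Q1, median, Q3, lower bar, upper bar) for a list of numbers."""
    # Assumption: the "inclusive" quartile convention; other conventions can
    # give slightly different quartiles for the same data.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1, median, q3, q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Toy values for illustration only; these are not counts from the paper.
toy_counts = [1, 2, 3, 4, 5, 6, 7, 8, 9]
q1, median, q3, lower, upper = box_plot_limits(toy_counts)
print(q1, median, q3, lower, upper)  # 3.0 5.0 7.0 -3.0 13.0
```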
