Nat Methods. 2025 Jun;22(6):1247-1254.
doi: 10.1038/s41592-025-02660-z. Epub 2025 May 12.

A universal language for finding mass spectrometry data patterns

Tito Damiani#, Alan K Jarmusch#, Allegra T Aron#, Daniel Petras#, Vanessa V Phelan#, Haoqi Nina Zhao#, Wout Bittremieux, Deepa D Acharya, Mohammed M A Ahmed, Anelize Bauermeister, Matthew J Bertin, Paul D Boudreau, Ricardo M Borges, Benjamin P Bowen, Christopher J Brown, Fernanda O Chagas, Kenneth D Clevenger, Mario S P Correia, William J Crandall, Max Crüsemann, Eoin Fahy, Oliver Fiehn, Neha Garg, William H Gerwick, Jeffrey R Gilbert, Daniel Globisch, Paulo Wender P Gomes, Steffen Heuckeroth, C Andrew James, Scott A Jarmusch, Sarvar A Kakhkhorov, Kyo Bin Kang, Nikolas Kessler, Roland D Kersten, Hyunwoo Kim, Riley D Kirk, Oliver Kohlbacher, Eftychia E Kontou, Ken Liu, Itzel Lizama-Chamu, Gordon T Luu, Tal Luzzatto Knaan, Helena Mannochio-Russo, Michael T Marty, Yuki Matsuzawa, Andrew C McAvoy, Laura-Isobel McCall, Osama G Mohamed, Omri Nahor, Heiko Neuweger, Timo H J Niedermeyer, Kozo Nishida, Trent R Northen, Kirsten E Overdahl, Johannes Rainer, Raphael Reher, Elys Rodriguez, Timo T Sachsenberg, Laura M Sanchez, Robin Schmid, Cole Stevens, Shankar Subramaniam, Zhenyu Tian, Ashootosh Tripathi, Hiroshi Tsugawa, Justin J J van der Hooft, Andrea Vicini, Axel Walter, Tilmann Weber, Quanbo Xiong, Tao Xu, Tomáš Pluskal, Pieter C Dorrestein, Mingxun Wang
Tito Damiani et al. Nat Methods. 2025 Jun.

Erratum in

  • Author Correction: A universal language for finding mass spectrometry data patterns.
    Damiani T, et al. Nat Methods. 2025 Sep;22(9):1995. doi: 10.1038/s41592-025-02785-1. PMID: 40781363.

Abstract

Despite being information rich, the vast majority of untargeted mass spectrometry data are underutilized; most analytes are not used for downstream interpretation or reanalysis after publication. The inability to dive into these rich raw mass spectrometry datasets is due to the limited flexibility and scalability of existing software tools. Here we introduce a new language, the Mass Spectrometry Query Language (MassQL), and an accompanying software ecosystem that addresses these issues by enabling the community to directly query mass spectrometry data with an expressive set of user-defined mass spectrometry patterns. Illustrated by real-world examples, MassQL provides a data-driven definition of chemical diversity by enabling the reanalysis of all public untargeted metabolomics data, empowering scientists across many disciplines to make new discoveries. MassQL has been widely implemented in multiple open-source and commercial mass spectrometry analysis tools, which enhances the ability, interoperability and reproducibility of mining of mass spectrometry data for the research community.

Conflict of interest statement

Competing interests: P.C.D. is an advisor to Cybele and a co-founder and scientific advisor to Ometa and Enveda with prior approval by UC San Diego. M.W. is a co-founder of Ometa Labs LLC. R.S., S.H. and T.P. are co-founders of mzio GmbH. T.R.N. is an advisor of Brightseed Bio. J.J.J.v.d.H. is a member of the Scientific Advisory Board of NAICONS Srl., Milano, Italy, and is consulting for Corteva Agriscience, Indianapolis, IN, USA. O.K. and T.S. are officers in OpenMS Inc., a non-profit foundation that manages the international coordination of OpenMS development. The other authors declare no competing interests.

Figures

Fig. 1. Schematic representation of the MassQL ecosystem.
a, Examples of molecules that produce distinctive data patterns when measured by MS as mass/charge (m/z) and intensity (i) peaks. b, MassQL query representing MS/MS fragmentation patterns that encapsulates a characteristic mass loss. The query can be translated to nine languages for enhanced accessibility. c, MassQL is a universal tool to query MS data. MassQL enables data searching at scales from a single file to entire MS repositories. MassQL has also been incorporated into a wide range of MS software. d, MassQL queries are shared and reused via the Community Compendium, which increases reproducibility and knowledge dissemination.
Fig. 2. Insights from siderophores at a repository scale.
a, The MassQL query for iron-binding searches for the characteristic ⁵⁴Fe¹²C peak with 6.3% abundance relative to the ⁵⁶Fe¹²C peak. Peaks queried by MassQL are colored in red. b, The molecular network of MassQL spectra hits after clustering by MSCluster (gray, no MS/MS annotation by GNPS MS/MS library; orange, MS/MS annotation by GNPS MS/MS libraries). Singletons (no neighbors in the network) have been excluded from this molecular network. c, A spectral family of the molecular network containing desferrioxamines, including proton-bound and iron-bound desferrioxamines E, G and B, in addition to structurally related analogs. d, Less than 1% of clustered MS/MS are annotated as siderophores by GNPS libraries when masses associated with ethylenediaminetetraacetic acid (an anticoagulant added to MS samples) are removed from the network.
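The isotope-pattern check described in panel a can be illustrated outside MassQL. The following is a minimal Python sketch, not the authors' query or implementation: it looks for a pair of peaks separated by the 54Fe/56Fe mass difference in which the lighter peak has roughly 6.3% of the heavier peak's intensity. The isotope masses, the tolerances, the function name find_iron_pairs and the example spectrum are all assumptions made for illustration.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

# Assumed monoisotopic masses of 56Fe and 54Fe (not given in the source).
FE56_FE54_DELTA = 55.934936 - 53.939609  # about 1.9953 Da


def find_iron_pairs(
    peaks: List[Peak],
    ratio: float = 0.063,     # 54Fe peak relative to 56Fe peak, from the caption
    ratio_tol: float = 0.25,  # hypothetical: allow +/-25% around the expected intensity
    mz_tol: float = 0.01,     # hypothetical absolute m/z tolerance
    charge: int = 1,
) -> List[Tuple[float, float]]:
    """Return (mz_54Fe, mz_56Fe) pairs that fit the iron isotope pattern."""
    delta = FE56_FE54_DELTA / charge
    pairs = []
    for mz_hi, int_hi in peaks:
        if int_hi <= 0:
            continue
        expected = int_hi * ratio
        for mz_lo, int_lo in peaks:
            # The lighter isotope peak must sit one Fe isotope spacing below.
            if abs((mz_hi - mz_lo) - delta) > mz_tol:
                continue
            # Its intensity must be close to 6.3% of the heavier peak.
            if abs(int_lo - expected) <= ratio_tol * expected:
                pairs.append((mz_lo, mz_hi))
    return pairs


# Hypothetical MS1 peak list, invented for this example.
spectrum = [(612.27, 630.0), (614.2653, 10000.0), (615.27, 3000.0), (700.0, 500.0)]
print(find_iron_pairs(spectrum))
```

A real search would also need to handle charge states, adducts and noise, which this sketch ignores.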
Fig. 3. Discovering OPEs at a repository scale.
a, The general structure of OPEs and the MassQL query for the characteristic phosphate fragment. b, The molecular network of MassQL MS/MS in a repository scale query after clustering by Falcon (green, annotation by GNPS library; blue, annotation by an OPEs list curated by Ye et al.; gray, unannotated). c, Summary of MS matching results (precursor m/z match with 20 ppm mass error) by the GNPS MS/MS library and the OPEs list by Ye et al. These putative identifications were based on precursor only (level 3 annotations). d, A molecular family containing alkyl-OPEs. OPEs reported by Ye et al. or in the MS/MS database search are indicated: tributyl phosphate and dimers (light orange and dark orange), and trioctyl phosphate and dimers (light blue and dark blue). Dibutyl phosphate (green) was not reported by Ye et al. or in the MS/MS database search. The structures displayed are illustrative of one possible isomer of the alkyl chain; the specific structure is beyond the scope of this report.
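The precursor matching in panel c relies on a 20 ppm mass error window. As a rough illustration of that arithmetic only, the Python sketch below computes a ppm error and matches observed precursor m/z values against a reference list. The function names, the reference entries (compound_A, compound_B) and the observed values are hypothetical placeholders, not data from the study.

```python
from typing import Dict, List, Tuple


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Signed mass error in parts per million relative to the reference m/z."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_precursors(
    observed: List[float],
    references: Dict[str, float],
    tol_ppm: float = 20.0,  # window stated in the caption
) -> List[Tuple[float, str]]:
    """Return (observed m/z, reference name) for every match within tol_ppm."""
    matches = []
    for mz in observed:
        for name, ref_mz in references.items():
            if abs(ppm_error(mz, ref_mz)) <= tol_ppm:
                matches.append((mz, name))
    return matches


# Hypothetical reference list and observed precursors, invented for this example.
references = {"compound_A": 267.1720, "compound_B": 435.3598}
observed = [267.1750, 267.1900, 435.3600]

for mz, name in match_precursors(observed, references):
    print(f"{mz:.4f} matches {name} ({ppm_error(mz, references[name]):+.1f} ppm)")
```

Because the match uses precursor m/z alone, several isomers can share one hit, which is consistent with the caption treating these as putative level 3 annotations.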
