Front Mol Biosci. 2023 Mar 7:10:1130781.
doi: 10.3389/fmolb.2023.1130781. eCollection 2023.

Simulated-to-real benchmarking of acquisition methods in untargeted metabolomics

Joe Wandy et al. Front Mol Biosci.

Abstract

Data-Dependent Acquisition (DDA) and Data-Independent Acquisition (DIA) are both widely used to acquire MS2 spectra in untargeted liquid chromatography tandem mass spectrometry (LC-MS/MS) metabolomics analyses. Despite this, few studies have systematically compared their MS/MS spectral annotation performance in untargeted settings, owing to the lack of ground truth and the cost of running a large number of acquisitions. Here, we present a systematic in silico comparison of these two acquisition methods in untargeted metabolomics by extending our Virtual Metabolomics Mass Spectrometer (ViMMS) framework with a DIA module. Our results show that the relative performance of the methods depends most strongly on the average number of co-eluting ions. At low numbers, DIA outperforms DDA, but at higher numbers DDA has the advantage because DIA can no longer deal with the large number of overlapping ion chromatograms. Results from simulation were further validated on an actual mass spectrometer, demonstrating that conclusions drawn from ViMMS simulations translate well to the real world.

A key advantage of this work is the versatility of ViMMS in simulating different parameters of both DDA and DIA modes. Researchers can explore and compare acquisition methods within the framework without expensive and time-consuming experiments on a real instrument. By identifying the strengths and limitations of each acquisition method, researchers can optimize their choice and obtain more accurate and robust results. Simulating before validating can also save significant time and resources by reducing the number of experiments required. This work not only provides insight into the performance of DDA and DIA but also opens the door to further advances in LC-MS/MS data acquisition methods.
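
To make the contrast between the two acquisition modes concrete, the following is a minimal sketch and not ViMMS code: every function name and parameter here is hypothetical. It assumes an MS1 scan is a list of (m/z, intensity) pairs, that DDA Top-N simply fragments the N most intense ions (dynamic exclusion is left out), and that a SWATH-style DIA method fragments everything inside fixed-width isolation windows. Counting ions per window gives a rough picture of why many co-eluting ions make DIA spectra harder to deconvolve.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity); an assumed representation

def top_n_precursors(ms1_peaks: List[Peak], n: int,
                     min_intensity: float = 0.0) -> List[float]:
    """DDA-style choice: the n most intense ions above a threshold."""
    candidates = [p for p in ms1_peaks if p[1] >= min_intensity]
    candidates.sort(key=lambda p: p[1], reverse=True)
    return [mz for mz, _ in candidates[:n]]

def swath_windows(min_mz: float, max_mz: float,
                  width: float) -> List[Tuple[float, float]]:
    """DIA-style choice: contiguous isolation windows over the m/z range.

    A single window spanning the whole range corresponds to an
    all-ion-fragmentation (AIF) style scan.
    """
    windows = []
    lo = min_mz
    while lo < max_mz:
        hi = min(lo + width, max_mz)
        windows.append((lo, hi))
        lo = hi
    return windows

def ions_per_window(ms1_peaks: List[Peak],
                    windows: List[Tuple[float, float]]) -> List[int]:
    """Number of precursor ions fragmented together in each window."""
    return [sum(1 for mz, _ in ms1_peaks if lo <= mz < hi)
            for lo, hi in windows]

if __name__ == "__main__":
    scan = [(100.0, 10.0), (150.0, 50.0), (220.0, 30.0), (230.0, 5.0)]
    print(top_n_precursors(scan, n=2))
    print(ions_per_window(scan, swath_windows(100.0, 300.0, 100.0)))
```

In this toy model, Top-N yields one fragmentation spectrum per selected precursor but skips the less intense ions, whereas each window fragments all of its ions at once, so the number of ions per window grows with sample complexity.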

Keywords: data independent acquisition; data-dependent acquisition; digital twin; liquid chromatography tandem mass spectrometry; metabolomics.

Conflict of interest statement

JJJVdH is a member of the Scientific Advisory Board of NAICONS Srl., Milano, Italy. All other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
The overall schematic of the ViMMS framework. (A) The Simulated Environment in ViMMS allows new acquisition methods to be developed against a Virtual MS that takes simulated molecules as input. The new DIA methods, e.g., SWATH and AIF (solid purple box), as well as existing DDA methods (faded orange box), are implemented as controllers and initially tested here in the Simulated Environment. A controller is a specific Python implementation of an acquisition method in ViMMS. (B) Acquisition methods can be run for method validation on the Real Environment in ViMMS, connected to a Thermo Orbitrap Fusion instrument via IAPI, to acquire real experimental scans. Controllers developed in the Simulated Environment can be transferred to the Real Environment easily (shown by the dashed purple line). Because the environments abstract away the low-level scan generation process, the underlying Python controller code for the new DIA methods remains unchanged when transferred from the Simulated to the Real Environment (dashed purple box).
FIGURE 2
The mean proportion of unique chemical annotations at varying numbers of chemicals and similarity thresholds across five replicates. The error bar shows the 95% confidence interval.
FIGURE 3
The distribution of cosine similarity scores when matching observed spectra to the true reference spectra for varying numbers of chemicals.
FIGURE 4
The distribution of cosine similarity scores as a boxplot (left) and a histogram (right) when matching observed spectra to the true reference spectra for 5,000 chemicals.
FIGURE 5
The distribution of pairwise cosine similarity scores of the ground truth (true chemical spectra), and fragmentation spectra from Top-N, SWATH and AIF. The plot y-axes are truncated at 25% similarity.
FIGURE 6
Proportion of matched features to the total number of detected features from the fullscan data.
FIGURE 7
Pairwise similarities of spectra in each dataset. DIA methods (AIF, SWATH) produce deconvoluted spectra that are more similar to each other.
FIGURE 8
Distribution of cosine similarity of annotated features for Top-N and DIA methods for the (A) GNPS/NIST14 library and (B) Multi-Injection library.
FIGURE 9
Venn diagram showing the overlap of annotated features between Top-N, SWATH and AIF at matching threshold ≥60% for the (A) GNPS/NIST14 library, and (B) Multi-Injection library.
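
Several of the captions above report cosine similarity scores between observed and reference spectra, and a matching threshold of ≥60%. This page does not give the exact scoring procedure (m/z tolerance, intensity weighting, peak matching), so the following is only an illustrative sketch under stated assumptions: spectra are lists of (m/z, intensity) pairs, peaks are grouped into fixed-width m/z bins, and the score is the plain cosine of the two binned intensity vectors. The function names and the 0.05 bin width are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity); an assumed representation

def bin_spectrum(peaks: List[Peak], bin_width: float) -> Dict[int, float]:
    """Sum intensities of peaks falling into the same m/z bin."""
    binned: Dict[int, float] = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned

def cosine_similarity(a: List[Peak], b: List[Peak],
                      bin_width: float = 0.05) -> float:
    """Cosine of the angle between two binned spectra, in [0, 1]."""
    va = bin_spectrum(a, bin_width)
    vb = bin_spectrum(b, bin_width)
    dot = sum(v * vb.get(k, 0.0) for k, v in va.items())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)

def is_annotated(observed: List[Peak], reference: List[Peak],
                 threshold: float = 0.6) -> bool:
    """Count a spectrum as annotated when its score meets the threshold."""
    return cosine_similarity(observed, reference) >= threshold

if __name__ == "__main__":
    reference = [(100.02, 1.0), (150.03, 1.0)]
    observed = [(100.02, 1.0)]  # one of the two reference fragments is missing
    print(round(cosine_similarity(observed, reference), 3))
    print(is_annotated(observed, reference))
```

Fixed-width binning is chosen here only for brevity. Tolerance-based peak matching avoids splitting a peak that falls on a bin boundary, at the cost of a longer implementation.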
