Review
Metabolites. 2019 Dec 17;9(12):308.
doi: 10.3390/metabo9120308.

From Samples to Insights into Metabolism: Uncovering Biologically Relevant Information in LC-HRMS Metabolomics Data

Julijana Ivanisevic et al. Metabolites.

Abstract

Untargeted metabolomics (including lipidomics) is a holistic approach to biomarker discovery and mechanistic insights into disease onset and progression, and response to intervention. Each step of the analytical and statistical pipeline is crucial for the generation of high-quality, robust data. Metabolite identification remains the bottleneck in these studies; therefore, confidence in the data produced is paramount in order to maximize the biological output. Here, we outline the key steps of the metabolomics workflow and provide details on important parameters and considerations. Studies should be designed carefully to ensure appropriate statistical power and adequate controls. Subsequent sample handling and preparation should avoid the introduction of bias, which can significantly affect downstream data interpretation. It is not possible to cover the entire metabolome with a single platform; therefore, the analytical platform should reflect the biological sample under investigation and the question(s) under consideration. The large, complex datasets produced need to be pre-processed in order to extract meaningful information. Finally, the most time-consuming steps are metabolite identification, as well as metabolic pathway and network analysis. Here we discuss some widely used tools and the pitfalls of each step of the workflow, with the ultimate aim of guiding the reader towards the most efficient pipeline for their metabolomics studies.

Keywords: data processing; experimental design; liquid chromatography–mass spectrometry (LC-MS); metabolic pathway and network analysis; metabolism; metabolite identification; sample preparation; univariate and multivariate statistics; untargeted metabolomics.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Common experimental designs. (A) Cross-over design involving a large patient cohort. Two drugs are administered sequentially to each patient, with a crucial washout period between each drug to enable the effects of each drug to be elucidated. (B) Factorial design, where both the gender of the subject and effect of the drug are being studied. (C) Common cross-sectional design in metabolomics studies, comparing controls and two drug dose levels in both genders.
Figure 2
Setting up the data acquisition worklist to facilitate metabolite quantification and identification. Prior to the batch run, the instrument should be conditioned (or “passivated”) using the pooled quality control (QC) of biological samples. During conditioning, high-quality MS/MS data can be acquired in data-dependent acquisition (DDA) mode by taking advantage of iterative injections through the application of PC-driven exclusion (of ions for which the MS/MS data have already been acquired). In this way, the amount of acquired high-quality MS/MS data is maximized. The batch run can start (and end) with the analysis of a diluted QC series that serves to remove the features whose response is not linear; however, this removal should be performed carefully by evaluating low-abundance features and those with saturation issues. Finally, samples should be run in a randomized fashion (considering the most important confounding factors, such as disease, sex, age, etc., depending on the experiment) with pooled QCs every 4–10 samples (depending on the size of the batch). Extracted blanks can be analyzed after the sample run and used for the removal of background (chemical and informatic) noise. Abbreviations: MS/MS data, fragmentation pattern; HRMS, high-resolution mass spectrometry; DDA, data-dependent acquisition; DIA, data-independent acquisition; AIF, all ion fragmentation (on Agilent or Thermo systems); MSE, all ion fragmentation on Waters systems; SWATH, sequential window acquisition of all theoretical mass spectra (DIA strategy on Sciex systems); SONAR, scanning quadrupole DIA (DIA strategy on Waters systems).
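The run order described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function name `build_worklist`, the injection labels, and the default values are all hypothetical, and a plain shuffle stands in for the confounder-aware randomization the caption recommends. The diluted QC series at the start and end of the batch is omitted for brevity.

```python
import random


def build_worklist(samples, qc_every=5, n_conditioning=5, seed=0):
    """Return an injection order: conditioning QCs, randomized samples
    bracketed by pooled QCs, then an extracted blank.

    All labels are illustrative placeholders (assumption).
    """
    rng = random.Random(seed)  # fixed seed so the order is reproducible
    order = list(samples)
    rng.shuffle(order)  # simple randomization; not stratified by confounders

    # Conditioning ("passivation") injections of the pooled QC.
    worklist = [f"QC_conditioning_{i + 1}" for i in range(n_conditioning)]

    qc_count = 0

    def next_qc():
        nonlocal qc_count
        qc_count += 1
        return f"QC_pooled_{qc_count}"

    worklist.append(next_qc())  # pooled QC before the first sample
    for i, sample in enumerate(order, start=1):
        worklist.append(sample)
        # Insert a pooled QC after every `qc_every` samples, except at the end.
        if i % qc_every == 0 and i < len(order):
            worklist.append(next_qc())
    worklist.append(next_qc())  # pooled QC after the last sample
    worklist.append("blank_extracted")  # blank run after the samples
    return worklist


if __name__ == "__main__":
    for line in build_worklist([f"S{i:02d}" for i in range(1, 13)], qc_every=4):
        print(line)
```

In a real study the shuffle would typically be replaced by block or stratified randomization over the main confounding factors, and `qc_every` chosen within the 4–10 range according to batch size.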
Figure 3
Overview of lipidomic data analysis (acquired by DDA) using MS-DIAL, the open-access software designed for simultaneous metabolite quantification and identification. Displayed are the MS/MS matched peaks (each lipid class is differently colored) with the example of phosphatidylcholine annotation using MS/MS matching against LipidBlast.
Figure 4
Simplified overview of PCA and OPLS-DA showing (A) good separation on PCA and OPLS-DA scores plots. High R2 and Q2 values indicate good model robustness and predictive capability. The permutation test indicates a valid model. (B) No separation on the PCA scores plot of PC1 vs. PC2, but separation is still achieved using OPLS-DA. In this instance, the model could be overfitted and unreliable. It is advisable to check for separation in other components, e.g., PC2 vs. PC3, as well as to assess R2 and Q2 and perform permutation tests. CV-ANOVA can also be used to assess model validity (not shown).
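The logic of a label permutation test can be illustrated with a short Python sketch. It is a simplification under stated assumptions: instead of refitting an OPLS-DA model and recomputing Q2 for each permutation, it uses a placeholder statistic (the absolute difference between two group means), and the function names are hypothetical. The principle is the same: if the observed statistic is no better than what shuffled class labels produce, the apparent separation is likely to be an artifact of overfitting.

```python
import random


def abs_mean_difference(values, labels):
    """Placeholder statistic (assumption): |mean(group 0) - mean(group 1)|."""
    a = [v for v, lab in zip(values, labels) if lab == 0]
    b = [v for v, lab in zip(values, labels) if lab == 1]
    return abs(sum(a) / len(a) - sum(b) / len(b))


def permutation_p_value(values, labels, statistic, n_permutations=1000, seed=0):
    """Empirical p-value: share of label shufflings whose statistic is at
    least as large as the one observed with the true labels."""
    rng = random.Random(seed)
    observed = statistic(values, labels)
    shuffled = list(labels)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between data and class labels
        if statistic(values, shuffled) >= observed:
            count += 1
    # The +1 terms count the observed labelling itself and avoid p = 0.
    return (count + 1) / (n_permutations + 1)


if __name__ == "__main__":
    values = [1.0, 1.1, 0.9, 1.2, 1.0, 5.0, 5.1, 4.9, 5.2, 5.0]
    labels = [0] * 5 + [1] * 5
    print(permutation_p_value(values, labels, abs_mean_difference))
```

For a multivariate model, `statistic` would be replaced by a function that refits the model on the permuted labels and returns its cross-validated performance.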
Figure 5
Metabolite mapping on the metabolic networks—an overview of MetExplore network Viz functionalities. The projected network has been created from the list of chemical reactions (in the cart on the right side of the figure)—derived from the list of identified metabolites whose levels varied significantly (as a result of brain cell profiling). The extent of each pathway has been encircled and colored for visualization. Alanine, aspartate and glutamate metabolism, and arginine biosynthesis have been highlighted as enriched (using integrated ORA).
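Over-representation analysis (ORA) is commonly based on a one-sided hypergeometric test, which asks how likely it is to see at least the observed number of significant metabolites in a pathway by chance. The sketch below shows that calculation only; the function name and the example counts are hypothetical, and it is not a description of how MetExplore implements ORA (which may, for instance, apply a multiple-testing correction across pathways).

```python
from math import comb


def ora_p_value(n_background, n_pathway, n_hits, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_background: metabolites in the reference set (e.g. the whole network)
    n_pathway:    metabolites of that set belonging to the pathway
    n_hits:       significant metabolites drawn from the reference set
    n_overlap:    significant metabolites that fall in the pathway
    """
    max_k = min(n_pathway, n_hits)
    total = comb(n_background, n_hits)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_hits - k)
        for k in range(n_overlap, max_k + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative numbers only: 1000 background metabolites, a pathway of 30,
    # 40 significant metabolites, 6 of which map to the pathway.
    print(ora_p_value(1000, 30, 40, 6))
```

The choice of background set strongly affects the result; restricting it to metabolites that the analytical platform can actually detect is generally more conservative than using the full network.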

