Review
Metabolites. 2018 Aug 28;8(3):47.
doi: 10.3390/metabo8030047.

Statistical Analysis of NMR Metabolic Fingerprints: Established Methods and Recent Advances

Helena U Zacharias et al. Metabolites. 2018.

Abstract

In this review, we summarize established and recent bioinformatic and statistical methods for the analysis of NMR-based metabolomics. Data analysis of NMR metabolic fingerprints exhibits several challenges, including unwanted biases, high dimensionality, and typically low sample numbers. Common analysis tasks comprise the identification of differential metabolites and the classification of specimens. However, analysis results strongly depend on the preprocessing of the data, and there is no consensus yet on how to remove unwanted biases and experimental variance prior to statistical analysis. Here, we first review established and new preprocessing protocols and illustrate their pros and cons, including different data normalizations and transformations. Second, we give a brief overview of state-of-the-art statistical analysis in NMR-based metabolomics. Finally, we discuss a recent development in statistical data analysis, where data normalization becomes obsolete. This method, called zero-sum regression, builds metabolite signatures whose estimation as well as predictions are independent of prior normalization.
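The normalization independence claimed for zero-sum signatures can be illustrated with a small numerical sketch. It rests on two assumptions that the abstract does not spell out: that the features are log-transformed intensities, and that normalization amounts to multiplying each spectrum by its own positive factor. Under those assumptions, scaling a sample adds the same constant to all of its log features, and that constant cancels whenever the coefficients sum to zero. The data, the coefficients, and the helper `zero_sum_score` are hypothetical and serve only as illustration; this is not the authors' implementation.

```python
import numpy as np

def zero_sum_score(log_intensities, beta, intercept=0.0):
    """Linear score on log-transformed features; beta is assumed to sum to zero."""
    return intercept + log_intensities @ beta

rng = np.random.default_rng(0)

# Hypothetical raw bucket intensities: 4 samples x 5 NMR buckets, all positive.
raw = rng.uniform(1.0, 10.0, size=(4, 5))

# Hypothetical coefficients, centered so that they sum to zero.
beta = rng.normal(size=5)
beta -= beta.mean()

# A sample-wise normalization (e.g. dividing by creatinine or by total area)
# multiplies every row by its own positive factor.
scale = rng.uniform(0.1, 5.0, size=(4, 1))

score_raw = zero_sum_score(np.log2(raw), beta)
score_scaled = zero_sum_score(np.log2(raw * scale), beta)

# log2(c * x) = log2(c) + log2(x), so each score shifts by log2(c) * sum(beta) = 0.
print(np.allclose(score_raw, score_scaled))
```

The sketch only demonstrates the invariance of predictions for fixed coefficients; estimating the coefficients under the zero-sum constraint is a separate optimization problem that is not shown here.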

Keywords: NMR; data normalization; data scaling; metabolic fingerprinting; statistical data analysis; zero-sum.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Normalization of two different urine spectra with respect to creatinine. (Left) Before normalization and (right) after normalization.
Figure 2
Test for differentially regulated metabolites in 1D 1H urinary nuclear magnetic resonance (NMR) fingerprints between acute kidney injury (AKI) and non-AKI patients with respect to different normalization strategies. –Log10(p-values) of moderated t-test analysis are shown after preprocessing with four different normalization methods: (a) scaling to equal total spectral area, (b) scaling to creatinine, (c) probabilistic quotient normalization (PQN), and (d) scaling to the internal reference TSP, plotted versus the ppm regions of the corresponding NMR buckets (upper panels). The significance threshold, Benjamini–Hochberg (B/H) adjusted p-values below 0.01, corresponding to a false discovery rate (FDR) below 1%, is marked by an orange line, and the significant NMR features are indicated as orange diamonds. The corresponding log2 fold changes (log2 FC) plotted versus the ppm regions are shown in the lower panels (e–h). Since log2 FCs were calculated as AKI minus non-AKI, positive log2 FCs correspond to higher values in AKI than in non-AKI samples. Figure adapted from Zacharias et al. (2017) [22].
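The normalization strategies named in this caption can be sketched in a few lines. The block below is a minimal illustration that assumes a samples-by-buckets matrix of positive intensities. The function names, the toy data, and the choice of the reference column are hypothetical, and the PQN variant shown (total-area scaling first, then the median spectrum as reference) is one common formulation rather than necessarily the exact protocol used for the figure.

```python
import numpy as np

def scale_to_total_area(X):
    """Divide each spectrum (row) by its summed intensity."""
    return X / X.sum(axis=1, keepdims=True)

def scale_to_reference(X, ref_idx):
    """Divide each spectrum by the summed intensity of its reference buckets,
    e.g. the buckets covering a creatinine or TSP signal."""
    ref = X[:, ref_idx].sum(axis=1, keepdims=True)
    return X / ref

def pqn(X):
    """Probabilistic quotient normalization, sketched in its common form."""
    X_area = scale_to_total_area(X)          # step 1: total-area scaling
    reference = np.median(X_area, axis=0)    # step 2: median reference spectrum
    quotients = X_area / reference           # step 3: bucket-wise quotients
    dilution = np.median(quotients, axis=1, keepdims=True)  # step 4: most probable quotient
    return X_area / dilution

# Toy data: one hypothetical base spectrum observed at three dilutions.
base = np.array([4.0, 1.0, 2.0, 8.0, 5.0])
dilution_factors = np.array([0.5, 1.0, 3.0])
X = base[None, :] * dilution_factors[:, None]

# Column 2 stands in for the reference signal; this choice is arbitrary.
for name, Y in [
    ("total area", scale_to_total_area(X)),
    ("reference", scale_to_reference(X, [2])),
    ("PQN", pqn(X)),
]:
    # With pure dilution differences, every method maps all rows to the same spectrum.
    print(name, np.allclose(Y, Y[0]))
```

On pure dilution differences all three approaches agree. They diverge once samples differ biologically, for example when the reference metabolite itself changes between groups, which is the kind of dependence on normalization that the panels compare.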
Figure 3
Receiver operating characteristic (ROC) curves as well as Venn diagrams of selected classification features for the discrimination of AKI from non-AKI patients based on urinary 1D 1H NMR fingerprints. Four different normalization strategies were employed: scaling to total spectral area (violet solid line), scaling to creatinine (red dashed line), probabilistic quotient normalization (PQN) (blue dotted line), and scaling to the internal reference TSP (cyan dashed–dotted line). Common classification approaches such as (a) support vector machine (SVM) in combination with t-test based feature filtering, and (b) least-absolute shrinkage and selection operator (LASSO) regression show a clear dependence on the chosen normalization strategy, whereas (c) zero-sum regression is completely independent thereof. Figure adapted from Zacharias et al. (2017) [22].

References

    1. Klein M.S., Buttchereit N., Miemczyk S.P., Immervoll A.K., Louis C., Wiedemann S., Junge W., Thaller G., Oefner P.J., Gronwald W. NMR metabolomic analysis of dairy cows reveals milk glycerophosphocholine to phosphocholine ratio as prognostic biomarker for risk of ketosis. J. Proteome Res. 2012;11:1373–1381. doi: 10.1021/pr201017n.
    2. Zacharias H.U., Schley G., Hochrein J., Klein M.S., Köberle C., Eckardt K.U., Willam C., Oefner P.J., Gronwald W. Analysis of Human Urine Reveals Metabolic Changes Related to the Development of Acute Kidney Injury Following Cardiac Surgery. Metabolomics. 2013;9:697–707. doi: 10.1007/s11306-012-0479-4.
    3. Zacharias H.U., Hochrein J., Vogl F.C., Schley G., Mayer F., Jeleazcov C., Eckardt K.-U., Willam C., Oefner P.J., Gronwald W. Identification of Plasma Metabolites Prognostic of Acute Kidney Injury after Cardiac Surgery with Cardiopulmonary Bypass. J. Proteome Res. 2015;14:2897–2905. doi: 10.1021/acs.jproteome.5b00219.
    4. Davis R.A., Charlton A.J., Godward J., Jones S.A., Harrison M., Wilson J.C. Adaptive binning: An improved binning method for metabolomics data using the undecimated wavelet transform. Chemom. Intell. Lab. 2007;85:144–154. doi: 10.1016/j.chemolab.2006.08.014.
    5. Vu T.N., Laukens K. Getting your peaks in line: A review of alignment methods for NMR spectral data. Metabolites. 2013;3:259–276. doi: 10.3390/metabo3020259.
