Anal Chem. 2015 Apr 7;87(7):3606-15.
doi: 10.1021/ac502439y. Epub 2015 Mar 6.

Statistical methods for handling unwanted variation in metabolomics data

Alysha M De Livera et al. Anal Chem. 2015.

Abstract

Metabolomics experiments are inevitably subject to a component of unwanted variation, due to factors such as batch effects, long runs of samples, and confounding biological variation. Although the removal of this unwanted variation is a vital step in the analysis of metabolomics data, it is considered a gray area in which there is a recognized need to develop a better understanding of the procedures and statistical methods required to achieve statistically relevant optimal biological outcomes. In this paper, we discuss the causes of unwanted variation in metabolomics experiments, review commonly used metabolomics approaches for handling this unwanted variation, and present a statistical approach for the removal of unwanted variation to obtain normalized metabolomics data. The advantages and performance of the approach relative to several widely used metabolomics normalization approaches are illustrated through two metabolomics studies, and recommendations are provided for choosing and assessing the most suitable normalization method for a given metabolomics experiment. Software for the approach is made freely available.

Figures

Figure 1
A graphical representation of the steps involved in the process of normalizing data from a typical metabolomics experiment. The first step involves identifying overall sources of variation. Here, the unwanted variation component is shown in red, and the unmeasurable unwanted variation examples are shown in italics. The second step involves normalizing (either removing the overall unwanted variation component or accommodating it in an appropriate statistical model). The third step involves assessing the normalizing method.
Figure 2
The first three principal components of the (a) unadjusted data, and (b) the data normalized by RUV-random improved for clustering (k = 8, λ = 1.43). The shapes and colours indicate different instruments and temperatures respectively, and Mix I and Mix II samples are shown by the hollow and solid points respectively.
Figure 3
Within-group RLA plots of (a) the unadjusted data, and the data normalized by (b) SIS, (c) RUV-random (k = 3, λ = 0.03), and (d) RUV-random improved for clustering (k = 8, λ = 1.43). The colours represent different temperatures.
Figure 4
Plots showing (a) the first three principal components and (b) a generalised pairs plot of the variables age, batch, gender and BMI. The diagonal panels show the marginal distribution of each variable, and the off-diagonal panels display pairwise relationships between the quantitative variables (age, BMI) and the categorical variables (gender, batch). Scatter plots, boxplots, and mosaic plots are used to represent, respectively, the relationship between two quantitative variables, between a categorical and a quantitative variable, and between two categorical variables. In the mosaic plots, areas are proportional to counts.
Figure 5
Figures showing the first two principal components of the unadjusted and normalized data. Colours indicate different batches.
Figure 6
Figures showing (a) volcano plots, and (b) histograms of p-values for unadjusted and RUV-random normalized data.
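
The within-group RLA (relative log abundance) plots in Figure 3 can be approximated with the short sketch below. It assumes the usual definition of within-group RLA: for each metabolite, subtract the median log abundance of the sample's own group, then draw one boxplot per sample from the resulting values. The function name `within_group_rla` and the toy data are illustrative assumptions, not the authors' software.

```python
import numpy as np

def within_group_rla(log_data, groups):
    """Return within-group RLA values (hypothetical helper, for illustration).

    log_data : (n_samples, n_metabolites) array of log-transformed abundances
    groups   : length n_samples sequence of group labels
    """
    log_data = np.asarray(log_data, dtype=float)
    groups = np.asarray(groups)
    rla = np.empty_like(log_data)
    for g in np.unique(groups):
        mask = groups == g
        # Centre each metabolite on its median within this group only.
        rla[mask] = log_data[mask] - np.median(log_data[mask], axis=0)
    return rla

# Toy data: 5 samples x 2 metabolites, two groups.
toy = [[1, 10], [3, 14], [5, 12], [2, 0], [4, 2]]
labels = ["a", "a", "a", "b", "b"]
rla_values = within_group_rla(toy, labels)
```

Each row of `rla_values` would supply one boxplot. For well-normalized data, the boxes are expected to be centred near zero with similar spread across samples.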
