PLoS Comput Biol. 2014 Apr 3;10(4):e1003531.
doi: 10.1371/journal.pcbi.1003531. eCollection 2014 Apr.

Waste not, want not: why rarefying microbiome data is inadmissible

Paul J McMurdie et al. PLoS Comput Biol.

Abstract

Current practice in the normalization of microbiome count data is inefficient in the statistical sense. For apparently historical reasons, the common approach is either to use simple proportions (which do not address heteroscedasticity) or to rarefy counts, even though both of these approaches are inappropriate for detection of differentially abundant species. Well-established statistical theory is available that simultaneously accounts for library size differences and biological variability using an appropriate mixture model. Moreover, specific implementations for DNA sequencing read count data (based on a Negative Binomial model, for instance) are already available in RNA-Seq focused R packages such as edgeR and DESeq. Here we summarize the supporting statistical theory and use simulations and empirical data to demonstrate substantial improvements provided by a relevant mixture model framework over simple proportions or rarefying. We show how both proportions and rarefied counts result in a high rate of false positives in tests for species that are differentially abundant across sample classes. Regarding microbiome sample-wise clustering, we also show that the rarefying procedure often discards samples that can be accurately clustered by alternative methods. We further compare different Negative Binomial methods with a recently described zero-inflated Gaussian mixture, implemented in a package called metagenomeSeq. We find that metagenomeSeq performs well when there is an adequate number of biological replicates, but it nevertheless tends toward a higher false positive rate. Based on these results and well-established statistical theory, we advocate that investigators avoid rarefying altogether. We have provided microbiome-specific extensions to these tools in the R package phyloseq.
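
To make the two criticized normalizations concrete, the following is a minimal sketch in plain Python (not the phyloseq, edgeR, or DESeq implementation). The function names `rarefy` and `proportions` and the count values are hypothetical, chosen only for illustration. Rarefying is taken here to mean subsampling every sample, without replacement, down to the smallest library size.

```python
import random

def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from one sample's OTU counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the sample's library size")
    # Expand the counts into one OTU label per read, then draw without replacement.
    reads = [otu for otu, n in enumerate(counts) for _ in range(n)]
    kept = rng.sample(reads, depth)
    out = [0] * len(counts)
    for otu in kept:
        out[otu] += 1
    return out

def proportions(counts):
    """Divide each count by the library size (the sample's total read count)."""
    total = sum(counts)
    return [n / total for n in counts]

# Hypothetical OTU counts for two samples with very different library sizes.
samples = {"A": [62, 38, 0], "B": [480, 500, 20]}
rng = random.Random(0)

# Rarefy every sample to the smallest library size.
depth = min(sum(c) for c in samples.values())
rarefied = {name: rarefy(c, depth, rng) for name, c in samples.items()}

# Reads thrown away by rarefying, summed over all samples.
discarded = sum(sum(c) for c in samples.values()) - depth * len(samples)
print(rarefied, discarded)
```

In this toy table, sample A is left unchanged while 900 of the 1000 reads in sample B are discarded, which is the loss of information the abstract objects to. Proportions keep all reads but hide the fact that the estimates for B rest on ten times more data than those for A.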

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. A minimal example of the effect of rarefying on statistical power.
Hypothetical abundance data in its original (Top-Left) and rarefied (Top-Right) form, with corresponding formal test results for differentiation (Bottom).
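
The kind of comparison this caption describes can be sketched as follows. The counts below are hypothetical stand-ins, not values read from the figure, and the test is a plain Pearson chi-square test on a 2x2 table without continuity correction, which is an assumption about the "formal test" rather than a statement of the authors' method. The rarefied table uses the expected counts after subsampling sample B to 100 reads.

```python
import math

def chi2_test_2x2(table):
    """Pearson chi-square test (no continuity correction) on a 2x2 count table.

    Rows are OTUs, columns are samples. Returns (statistic, p_value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Upper tail of a chi-square with 1 degree of freedom: erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: sample A has 100 reads, sample B has 1000 reads.
original = [[62, 500],
            [38, 500]]
# Sample B rarefied to 100 reads (expected counts after subsampling).
rarefied = [[62, 50],
            [38, 50]]

stat_orig, p_orig = chi2_test_2x2(original)
stat_rare, p_rare = chi2_test_2x2(rarefied)
print(f"original: chi2={stat_orig:.2f}, p={p_orig:.3f}")
print(f"rarefied: chi2={stat_rare:.2f}, p={p_rare:.3f}")
```

With these values the original table gives a p-value below 0.05 and the rarefied table gives one above 0.05: the same underlying difference is no longer detected once 900 reads are discarded.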
Figure 2. Graphical summary of the two simulation frameworks.
Both Simulation A (clustering) and Simulation B (differential abundance) are represented. All simulations begin with real microbiome count data from a survey experiment referred to here as “the Global Patterns dataset”. Tables of integers with multiple columns represent an abundance count matrix (“OTU table”), while a single column of integers represents a multinomial of OTU counts/proportions. In both simulation illustrations an effect size is explained and given an example value of 10 for easy mental computation, but its meaning is different for each simulation. Note that effect size is altogether different from library size, the latter being equivalent to both the column sums and the number of reads per sample. A grey highlight indicates count values for which an effect has been applied in Simulation B. Protocol S1 includes the complete source code used to compute the example values shown here, as well as the full simulations discussed below.
Figure 3. Examples of overdispersion in microbiome data.
Common-Scale Variance versus Mean for Microbiome Data. Each point in each panel represents a different OTU's mean/variance estimate for a biological replicate and study. The data in this figure come from the Global Patterns survey and the Long-Term Dietary Patterns study, with results from additional studies included in Protocol S1. (Right) Variance versus mean abundance for rarefied counts. (Left) Common-scale variances and common-scale means, estimated according to Equations 6 and 7 from Anders and Huber, implemented in the DESeq package (Text S1). The dashed gray line denotes the σ² = μ case (Poisson; φ = 0). The cyan curve denotes the fitted variance estimate using DESeq, with method = ‘pooled’, sharingMode = ‘fit-only’, fitType = ‘local’.
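
The overdispersion shown here can be illustrated with the standard Negative Binomial variance relation σ² = μ + φμ², which reduces to the Poisson case σ² = μ when φ = 0. The sketch below is plain Python with hypothetical counts. The method-of-moments estimator is a deliberately crude assumption used only to show the idea; it is not the estimator used by DESeq.

```python
from statistics import mean, variance

def nb_variance(mu, phi):
    """Negative Binomial variance for mean `mu` and dispersion `phi`."""
    return mu + phi * mu ** 2

def estimate_dispersion(counts):
    """Method-of-moments dispersion: (s^2 - m) / m^2, floored at zero.

    A crude illustration only, assuming counts are already on a common scale.
    """
    m = mean(counts)
    s2 = variance(counts)  # sample variance (n - 1 denominator)
    return max(0.0, (s2 - m) / m ** 2)

# Hypothetical common-scale counts of one OTU across five biological replicates.
replicate_counts = [10, 50, 5, 100, 35]
phi_hat = estimate_dispersion(replicate_counts)

print("mean:", mean(replicate_counts))
print("sample variance:", variance(replicate_counts))
print("estimated dispersion:", round(phi_hat, 3))
print("Poisson variance at this mean:", nb_variance(mean(replicate_counts), 0))
```

For these counts the sample variance is far larger than the mean, so a point for this OTU would sit well above the dashed Poisson line.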
Figure 4. Clustering accuracy in simulated two-class mixing.
Clustering accuracy (vertical axis) from partitioning around medoids, following different normalization and distance methods. Points denote the mean values of replicates, with a vertical bar representing one standard deviation above and below. Normalization method is indicated by both shade and shape, while panel columns and panel rows indicate the distance metric and median library size, respectively. The horizontal axis is the effect size, which in this context is the ratio of target to non-target values in the multinomials that were used to simulate microbiome counts. Each multinomial is derived from two microbiomes that have negligible overlap in OTUs (Fecal and Ocean microbiomes in the Global Patterns dataset [48]). Higher values of effect size indicate an easier clustering task. For simulation details and precise definitions of abbreviations see Simulation A of the Methods section.
Figure 5. Normalization by rarefying only, dependency on library size threshold.
Unlike the analytical methods represented in Figure 4, here rarefying is the only normalization method used, but at varying values of the minimum library size threshold, shown as library-size quantile (horizontal axis). Panel columns, panel rows, and point/line shading indicate effect size (ES), median library size, and distance method applied after rarefying, respectively. Because discarded samples cannot be accurately clustered, the line y = 1 − x, where x is the library-size quantile, is the maximum achievable accuracy.
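
The ceiling on accuracy described in this caption follows from simple counting: any sample whose library size falls below the threshold is discarded and therefore cannot be clustered correctly. A minimal sketch, assuming a simple order-statistic rule for turning a quantile into a threshold (the helper name `rarefying_depth` and the library sizes are hypothetical):

```python
def rarefying_depth(library_sizes, quantile):
    """Return (threshold, number of samples discarded) for a quantile in [0, 1).

    Assumption: the floor(quantile * n) smallest libraries are discarded and
    the next-smallest library size becomes the rarefying depth.
    """
    if not 0 <= quantile < 1:
        raise ValueError("quantile must be in [0, 1)")
    ordered = sorted(library_sizes)
    n_dropped = int(quantile * len(ordered))
    return ordered[n_dropped], n_dropped

# Hypothetical library sizes (reads per sample) for ten samples.
sizes = [500, 1200, 2000, 2500, 3000, 4000, 5000, 8000, 9000, 12000]

depth, n_dropped = rarefying_depth(sizes, 0.2)
retained = len(sizes) - n_dropped

# Even if every retained sample is clustered correctly, the discarded
# samples count as errors, so accuracy cannot exceed this fraction.
max_accuracy = retained / len(sizes)
print(depth, n_dropped, max_accuracy)
```

Raising the threshold keeps more reads per retained sample but lowers this ceiling, which is the trade-off the figure explores.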
Figure 6. Performance of differential abundance detection with and without rarefying.
Performance is summarized here by the “Area Under the Curve” (AUC) metric of a Receiver Operating Characteristic (ROC) curve (vertical axis). Briefly, the AUC value varies from 0.5 (random) to 1.0 (perfect), incorporating both sensitivity and specificity. The horizontal axis indicates the effect size, shown as the actual multiplication factor applied to OTU counts in the test class to simulate a differential abundance. Each curve traces the respective normalization method's mean performance for that panel, with a vertical bar indicating one standard deviation in performance across all replicates and microbiome templates. The right-hand side of the panel rows indicates the median library size, while the darkness of line shading indicates the number of samples per simulated experiment. Color shade and shape indicate the normalization method. See Methods section for the definitions of each normalization and testing method. For all methods, detection among multiple tests was defined using a False Discovery Rate (Benjamini-Hochberg [52]) significance threshold of 0.05.
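
The two quantities named in this caption can be sketched in plain Python. This is a generic illustration of AUC and of the Benjamini-Hochberg adjustment, not the code from Protocol S1. The function names and the p-values are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a truly differential OTU scores higher
    than a non-differential one (ties count one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values from four per-OTU tests.
raw = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(raw)
detected = [i for i, q in enumerate(adj) if q <= 0.05]

# Hypothetical scores (for example 1 - p) for true and null OTUs.
print(adj, detected, auc([0.9, 0.8], [0.1, 0.2]))
```

Here only the first test survives the 0.05 threshold after adjustment, although three of the four raw p-values are below 0.05.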

References

    1. Shendure J, Lieberman Aiden E (2012) The expanding scope of DNA sequencing. Nature Biotechnology 30: 1084–1094.
    2. Shendure J, Ji H (2008) Next-generation DNA sequencing. Nature Biotechnology 26: 1135–1145.
    3. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5: 621–628.
    4. Pace NR (1997) A molecular view of microbial diversity and the biosphere. Science 276: 734–740.
    5. Wilson KH, Wilson WJ, Radosevich JL, DeSantis TZ, Viswanathan VS, et al. (2002) High-Density Microarray of Small-Subunit Ribosomal DNA Probes. Appl Environ Microbiol 68: 2535–2541.