ISME J. 2011 Feb;5(2):169-72.
doi: 10.1038/ismej.2010.133. Epub 2010 Sep 9.

UniFrac: an effective distance metric for microbial community comparison

Catherine Lozupone et al. ISME J. 2011 Feb.
No abstract available

Figures

Figure 1
Rarefaction of data from a study of obese twins (Turnbaugh et al., 2009). This study produced >1 million reads from the V2 region of ribosomal RNA using pyrosequencing. Samples with fewer than 3000 sequences were first excluded (leaving 112 samples). In each of five replicate trials, sequences from all 112 samples were subsampled so that each sample had a set number of sequences (from 50 to 2925 in steps of 125). Pairwise UniFrac values were calculated for all pairs of samples with both the unweighted (a) and weighted (b) versions. To assess how community divergence (the raw UniFrac value) affects sensitivity to sampling, the most similar and most different pairs of samples were identified from the largest subsampled data set (2925 sequences per sample) as those in the lower and upper quartiles of UniFrac values, respectively (calculated separately for unweighted and weighted). The points represent the average UniFrac value at each sample depth for (1) all pairwise comparisons and (2) the pairs identified as being in the upper and lower quartiles. Individual points for each of the five replicate trials are plotted, but the replicate values were close enough that they generally overlap, except for the smallest subsamples.
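
The subsampling procedure in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the four-taxon tree, the branch lengths, the two sample compositions and the helper names (`rarefy`, `unweighted_unifrac`, `weighted_unifrac`) are all hypothetical. The two distance functions follow the commonly cited definitions of unweighted UniFrac (fraction of observed branch length leading to only one sample) and non-normalized weighted UniFrac (branch lengths weighted by differences in relative abundance). Only the depth schedule (50 to 2925 in steps of 125) and the five replicates come from the caption.

```python
import random
from collections import Counter

def rarefy(counts, depth, rng):
    """Draw `depth` sequences without replacement from one sample's taxon counts."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer sequences than the requested depth")
    return Counter(rng.sample(pool, depth))

def unweighted_unifrac(branches, a, b):
    """Fraction of observed branch length that leads to only one of the samples."""
    unique = shared = 0.0
    for length, tips in branches:
        in_a = any(a.get(t, 0) > 0 for t in tips)
        in_b = any(b.get(t, 0) > 0 for t in tips)
        if in_a and in_b:
            shared += length
        elif in_a or in_b:
            unique += length
    return unique / (unique + shared)

def weighted_unifrac(branches, a, b):
    """Branch lengths weighted by the difference in relative abundance below them."""
    total_a, total_b = sum(a.values()), sum(b.values())
    return sum(
        length * abs(sum(a.get(t, 0) for t in tips) / total_a
                     - sum(b.get(t, 0) for t in tips) / total_b)
        for length, tips in branches
    )

# Hypothetical tree ((t1,t2),(t3,t4)) with every branch of length 1,
# stored as (branch length, set of tips below the branch).
branches = [
    (1.0, {"t1"}), (1.0, {"t2"}), (1.0, {"t3"}), (1.0, {"t4"}),
    (1.0, {"t1", "t2"}), (1.0, {"t3", "t4"}),
]
# Hypothetical samples with 3000 sequences each.
sample_a = {"t1": 2000, "t2": 1000}
sample_b = {"t2": 1500, "t3": 1500}

depths = list(range(50, 2926, 125))  # 50, 175, ..., 2925 as in the caption
rng = random.Random(0)
results = {}  # depth -> list of (unweighted, weighted), one entry per replicate
for depth in depths:
    results[depth] = []
    for _ in range(5):  # five replicate trials
        ra = rarefy(sample_a, depth, rng)
        rb = rarefy(sample_b, depth, rng)
        results[depth].append(
            (unweighted_unifrac(branches, ra, rb), weighted_unifrac(branches, ra, rb))
        )

full_unweighted = unweighted_unifrac(branches, sample_a, sample_b)
full_weighted = weighted_unifrac(branches, sample_a, sample_b)
```

Plotting the replicate values in `results` against depth would give a curve of the kind the caption describes. With real data the tree and the per-sample counts would come from the sequencing pipeline rather than being typed in by hand.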
Figure 2
PCoA jackknifing results for the bacteria from the stool of 106 individuals from 60 mammal species reported in Ley et al. (2008), for 100 replicates with unweighted UniFrac and (a) 40 or (b) 100 sequences per sample. The full data set had between 21 and 1060 sequences/sample, and the main clustering was explained by diet. This clustering pattern was recaptured with only 40 sequences/sample, with the herbivores (green), omnivores (red) and carnivores (blue) largely clustering with each other (samples with fewer than 40 sequences were excluded from the analysis). With 100 sequences/sample, the same trend appears but with less variability in the point distribution, consistent with the decrease in the standard deviation with sample depth detected in the simulations. These plots were made using QIIME (Caporaso et al., 2010), which supplies a 3D view in which the confidence ellipses for selected PC axes can be viewed dynamically.
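
The resampling half of this jackknifing procedure can be sketched as follows. This is an illustration only and does not reproduce QIIME: the sample names and compositions are hypothetical, a Jaccard distance on presence/absence stands in for unweighted UniFrac (no tree is used), and the PCoA and confidence-ellipse steps are omitted. The sketch repeats the subsampling 100 times at depths of 40 and 100 sequences per sample, as in the caption, and records the spread of each pairwise distance across replicates.

```python
import random
import statistics
from collections import Counter

def subsample(counts, depth, rng):
    """Draw `depth` sequences without replacement from one sample."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    return Counter(rng.sample(pool, depth))

def jaccard_distance(a, b):
    """Presence/absence distance; a stand-in for unweighted UniFrac."""
    taxa_a, taxa_b = set(a), set(b)
    return 1.0 - len(taxa_a & taxa_b) / len(taxa_a | taxa_b)

def jackknife_sd(samples, depth, replicates, rng):
    """Standard deviation of each pairwise distance across resampled replicates."""
    # Samples with fewer than `depth` sequences are excluded, as in the caption.
    kept = {name: c for name, c in samples.items() if sum(c.values()) >= depth}
    names = sorted(kept)
    pairs = [(x, y) for i, x in enumerate(names) for y in names[i + 1:]]
    values = {pair: [] for pair in pairs}
    for _ in range(replicates):
        sub = {name: subsample(kept[name], depth, rng) for name in names}
        for x, y in pairs:
            values[(x, y)].append(jaccard_distance(sub[x], sub[y]))
    return {pair: statistics.pstdev(v) for pair, v in values.items()}

# Hypothetical samples; "tiny" has 21 sequences and is dropped at both depths.
samples = {
    "herbivore_1": {f"otu{i}": 1 + (i % 7) * 5 for i in range(0, 40)},
    "omnivore_1": {f"otu{i}": 1 + (i % 5) * 6 for i in range(20, 60)},
    "carnivore_1": {f"otu{i}": 1 + (i % 3) * 10 for i in range(40, 80)},
    "tiny": {"otu0": 10, "otu1": 11},
}

sd_40 = jackknife_sd(samples, 40, 100, random.Random(0))
sd_100 = jackknife_sd(samples, 100, 100, random.Random(0))
for pair in sd_40:
    print(pair, round(sd_40[pair], 3), round(sd_100[pair], 3))
```

In a full analysis, each replicate distance matrix would be passed through PCoA, and the replicate coordinates for each sample would be summarized as the confidence ellipses mentioned in the caption.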

References

    1. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7:335–336.
    2. Castro HF, Classen AT, Austin EE, Norby RJ, Schadt CW. Soil microbial community responses to multiple experimental climate change drivers. Appl Environ Microbiol. 2010;76:999–1007.
    3. Chao A, Chazdon RL, Colwell RK, Shen TJ. A new statistical approach for assessing similarity of species composition with incidence and abundance data. Ecol Lett. 2005;8:148–159.
    4. Hamady M, Lozupone C, Knight R. Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data. ISME J. 2010;4:17–27.
    5. Kuczynski J, Costello EK, Nemergut DR, Zaneveld J, Lauber CL, Knights D, et al. Direct sequencing of the human microbiome readily reveals community differences. Genome Biol. 2010;11:210.
