Microbiome. 2022 Dec 13;10(1):225.
doi: 10.1186/s40168-022-01423-8.

The community ecology perspective of omics data

Stephanie D Jurburg et al. Microbiome. 2022.

Abstract

The measurement of uncharacterized pools of biological molecules through techniques such as metabarcoding, metagenomics, metatranscriptomics, metabolomics, and metaproteomics produces large, multivariate datasets. Analytical methods borrowed from community ecology have been applied successfully to these datasets to characterize the molecular diversity of samples (ɑ-diversity) and to assess how these profiles change in response to experimental treatments or across gradients (β-diversity). However, sample preparation and data collection methods generate biases and noise that confound molecular diversity estimates and require special attention. Here, we examine how technical biases and noise introduced into multivariate molecular data affect the estimation of the components of diversity (i.e., the total number of different molecular species, or entities; the total number of molecules; and the abundance distribution of molecular entities). We then explore the conditions under which these biases affect the measurement of ɑ- and β-diversity and highlight how novel methods commonly used in community ecology can be adopted to improve the interpretation and integration of multivariate molecular data.

Keywords: Community ecology; Molecular ecology; Multivariate statistics.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Sample collection and preparation, data collection, and post-processing are inextricably linked in MME techniques, and each can affect estimates of diversity. The effect of researcher choices on S, N (middle row), and SAD (bottom row) during data generation is shown below each step. The true diversity in two samples is shown in red and blue, and the measured diversity is shown as dotted lines. Technical errors during sample collection and storage can increase S (e.g., due to non-specific contamination [19]), resulting in higher estimates of S and a steeper SAD (a). In contrast, sample preparation can reduce the detectability of certain molecular entities (e.g., during PCR amplification in metabarcoding [20]), resulting in a lower S and a flatter SAD (b). Some data collection instruments (e.g., sequencers) limit the number of observations, placing a technical ceiling on N and potentially resulting in more even communities (i.e., a flatter SAD) (c). During processing, applying a less stringent species definition can result in a reduced S and a flatter SAD (d).
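The ceiling on N described for panel (c) can be explored with a small simulation. The following is a minimal sketch, not the authors' analysis: it assumes a hypothetical community given as a list of counts, draws a fixed number of molecules without replacement, and reports how many entities are still detected. The function names, the counts, and the depths are all illustrative assumptions.

```python
import random
from collections import Counter


def subsample_counts(counts, depth, seed=0):
    """Draw `depth` molecules without replacement from a community of counts."""
    # Expand the count vector into one list entry per molecule.
    pool = [entity for entity, n in enumerate(counts) for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of molecules in the sample")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    drawn = Counter(rng.sample(pool, depth))
    return [drawn.get(entity, 0) for entity in range(len(counts))]


def richness(counts):
    """Number of entities with at least one observation (S)."""
    return sum(1 for n in counts if n > 0)


# Hypothetical "true" community: a few dominant and several rare entities.
true_counts = [5000, 2000, 500, 100, 20, 5, 2, 1, 1, 1]

for depth in (sum(true_counts), 1000, 100):
    observed = subsample_counts(true_counts, depth)
    print(f"N = {depth:5d}  observed S = {richness(observed)}")
```

In this sketch, observed S can never exceed the true S, and rare entities are the first to go undetected as the depth shrinks.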
Fig. 2
Measuring ɑ-diversity with Hill numbers. ɑ-diversity indices were first unified by ecologist Mark Hill as the inverse of mean proportional abundances in a community [123, 124]. The value of q (or order of diversity) describes how this mean is calculated, affecting the sensitivity of diversity indices to rare species. In a–c, Hill numbers are shown for a metabarcoding dataset obtained from the fecal sample of an Ecuadorian finch (publicly available in NCBI with accession number SRR6486665 [125]). When q=0, the weighted harmonic mean of species’ proportional abundances is measured, and richness is assessed (a). When q=1, the weighted geometric mean is measured, and Shannon’s entropy is assessed (b). When q=2, the weighted arithmetic mean is measured, and the inverse Simpson index is assessed (c). All Hill numbers are expressed in units of effective numbers of species, or the number of species that would be expected in a community in which all species are equally abundant.
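The Hill numbers described in this caption can be computed directly from a vector of counts. The sketch below assumes the standard definition, (Σ p_i^q)^(1/(1−q)), with the q=1 case taken as the limit exp(−Σ p_i ln p_i). The function name and the example counts are hypothetical and are not drawn from the finch dataset.

```python
import math


def hill_number(counts, q):
    """Hill number of order q (effective number of species) for raw counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Proportional abundances of the entities that were actually observed.
    p = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # The general formula is undefined at q = 1; use its limit,
        # the exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


# Hypothetical counts for five molecular entities in one sample.
counts = [50, 30, 15, 4, 1]

for q in (0, 1, 2):
    print(f"q = {q}: {hill_number(counts, q):.3f} effective species")
```

For a perfectly even community, every order q returns the number of entities. For uneven communities, the value declines as q increases because rare entities carry less weight.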

References

    1. Goldenfeld N, Woese C. Biology’s next revolution. Nature. 2007;445(7126):369. doi: 10.1038/445369a.
    2. Group G. Genetics (Macmillan Science Library) (4 Volume set). New York: Macmillan Reference USA; 2002.
    3. Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, et al. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science. 2008;321(5891):956–960. doi: 10.1126/science.1160342.
    4. Anderson NL, Anderson NG. Proteome and proteomics: new technologies, new concepts, and new words. Electrophoresis. 1998;19(11):1853–1861. doi: 10.1002/elps.1150191103.
    5. Klassen A, Faccio AT, Canuto GAB, da Cruz PLR, Ribeiro HC, Tavares MFM, et al. Metabolomics: definitions and significance in systems biology. Adv Exp Med Biol. 2017;965:3–17. doi: 10.1007/978-3-319-47656-8_1.