Review
Trends Ecol Evol. 2012 Apr;27(4):233-43.
doi: 10.1016/j.tree.2011.11.010. Epub 2012 Jan 11.

Sequencing our way towards understanding global eukaryotic biodiversity

Holly M Bik et al. Trends Ecol Evol. 2012 Apr.

Abstract

Microscopic eukaryotes are abundant, diverse and fill critical ecological roles across every ecosystem on Earth, yet there is a well-recognized gap in understanding of their global biodiversity. Fundamental advances in DNA sequencing and bioinformatics now allow accurate en masse biodiversity assessments of microscopic eukaryotes from environmental samples. Despite a promising outlook, the field of eukaryotic marker gene surveys faces significant challenges: how to generate data that are most useful to the community, especially in the face of evolving sequencing technologies and bioinformatics pipelines, and how to incorporate an expanding number of target genes.

Figures

Figure 1
Typical standardized workflow (from environment to sequences) for high-throughput marker gene studies. Soils and sediments are typically frozen upon collection (−80°C to preserve RNA) and brought back to the lab for bulk extraction of environmental DNA. Marker genes (e.g. rRNA) are amplified from genomic extracts using barcoded, conserved primer pairs. Following high-throughput sequencing (typically conducted on 454 or Illumina platforms), datasets are processed and clustered into Operational Taxonomic Units (OTUs) under a range of pairwise identity cutoffs. OTUs are subsequently used to conduct alpha and beta diversity analyses, summarize community taxonomy, and interpret assemblages in a phylogenetic context. (Depiction of community analysis modified from Parks et al. [56].)
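As a rough illustration of the alpha and beta diversity step named in this caption, the sketch below computes one common alpha metric (the Shannon index) and one common beta metric (Bray-Curtis dissimilarity) from OTU counts. The choice of these two metrics, the function names and the sample counts are assumptions made for illustration; the caption does not specify which measures are used.

```python
import math

def shannon(counts):
    """Shannon index H' (natural log) for one sample's OTU counts.

    Assumes the sample contains at least one read.
    """
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples' OTU count vectors.

    Both vectors must list the same OTUs in the same order.
    """
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * shared / (sum(a) + sum(b))

# Hypothetical OTU table: rows are samples, columns are OTUs.
otu_table = {
    "sediment_A": [10, 10, 0, 0],
    "sediment_B": [5, 5, 5, 5],
}

alpha = {name: shannon(counts) for name, counts in otu_table.items()}
beta = bray_curtis(otu_table["sediment_A"], otu_table["sediment_B"])
```

Here alpha diversity is computed per sample and beta diversity per pair of samples, which is why the second function takes two count vectors.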
Figure 2
High-throughput studies follow a common workflow that begins with raw sequence data and sample metadata (primer barcodes and environmental data). Raw data are filtered and processed, with the option of denoising (a step currently applicable only to 454 data), before Operational Taxonomic Units (OTUs) are picked through reference-based or de novo approaches. OTU picking can include pre-clustering steps such as Single Linkage Preclustering (SLP [94]), prefix-suffix filtering or collapsing of identical sequences to reduce compute time (all methods available within the QIIME pipeline [51]). The recommended and default OTU picking workflow in QIIME currently involves sorting sequences by abundance, collapsing identical reads, picking OTUs de novo with uclust, and subsequently inflating the ‘identical reads’ to recapture abundance information about the initial sequences. Taxonomy is next assigned to OTU reference sequences, followed by construction of an OTU abundance matrix and a phylogenetic tree; with a closed-reference OTU picking protocol, it is not necessary to make taxonomic assignments or build a phylogenetic tree, as these can be obtained directly from the reference data set. These outputs can subsequently be used for ecological diversity analyses and visualization approaches.
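The collapse, sort, cluster and inflate sequence described in this caption can be sketched in a few lines. This is a toy stand-in, not QIIME or uclust: it assumes reads are pre-aligned and of equal length so that identity is a simple per-position match fraction, and the function names, the data layout and the example reads are all hypothetical.

```python
from collections import Counter

def identity(a, b):
    """Fraction of matching positions between two reads.

    Simplifying assumption: reads are pre-aligned and equal in length.
    Reads of different lengths are treated as non-matching.
    """
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)

def pick_otus(reads, cutoff=0.97):
    """Greedy de novo OTU picking on dereplicated, abundance-sorted reads."""
    unique = Counter(reads)  # collapse identical reads, remembering counts
    # Most abundant unique sequences are considered first as OTU seeds.
    ordered = sorted(unique.items(), key=lambda kv: (-kv[1], kv[0]))
    otus = []
    for seq, count in ordered:
        for otu in otus:
            if identity(seq, otu["seed"]) >= cutoff:
                otu["members"][seq] = count
                break
        else:
            otus.append({"seed": seq, "members": {seq: count}})
    return otus

def otu_abundance(otu):
    """Inflate collapsed reads back to the original read count."""
    return sum(otu["members"].values())

# Hypothetical reads: two near-identical variants and one distinct sequence.
reads = ["ACGTACGTAC"] * 5 + ["ACGTACGTAA"] * 2 + ["TTTTTTTTTT"]
otus = pick_otus(reads, cutoff=0.8)
abundances = [otu_abundance(o) for o in otus]
```

Seeding clusters with the most abundant sequences reflects the common assumption that abundant reads are less likely to be sequencing errors; rarer variants then join the nearest existing seed.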
Figure 3
OTU reference sequences can be placed into an evolutionary context using tools such as pplacer [39] or the Evolutionary Placement Algorithm (EPA [40]), which place short reads into a guide tree framework constructed from full-length reference sequences. For each pre-aligned OTU or sequencing read, likelihood scores are calculated for all possible positions in the tree, and the sequence is subsequently inserted at the node exhibiting the best score.
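The selection step in this caption (score every candidate position, keep the best) can be sketched as follows. The log-likelihood values are assumed to have been computed elsewhere; the function name, edge labels and numbers are hypothetical, and the normalized weight returned alongside the best edge is an added illustration of placement confidence rather than something the caption describes.

```python
import math

def best_placement(log_likelihoods):
    """Pick the highest-scoring position for one query sequence.

    log_likelihoods maps a candidate tree position (edge label) to the
    log-likelihood of the tree with the query attached there.
    Returns the best edge and its share of the total likelihood.
    """
    best_edge = max(log_likelihoods, key=log_likelihoods.get)
    peak = log_likelihoods[best_edge]
    # Subtract the peak before exponentiating to avoid numerical underflow.
    denom = sum(math.exp(v - peak) for v in log_likelihoods.values())
    return best_edge, 1.0 / denom

# Hypothetical scores for one OTU reference sequence on a three-edge tree.
scores = {"edge_a": -1050.2, "edge_b": -1047.9, "edge_c": -1060.0}
edge, weight = best_placement(scores)
```

A weight close to 1 indicates that one position dominates; a weight spread across several edges suggests the read is too short or too conserved to be placed confidently.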
Figure 4
High-throughput biodiversity research is an active and rapidly evolving field. Future analytical tools will expand towards a number of emerging research areas, including (A) OTU network analysis, (B) visualization as an exploratory tool (modified from Shapiro et al. [95]), (C) Edge Principal Component Analysis to identify the biological lineages that define community assemblages (after Matsen & Evans), and (D) quantifying the impact of different OTU picking strategies on cluster formation.
Figure 5
For future success, biodiversity research must adhere to a trifecta of biology, bioinformatics and database resources; none of these foci can exist in isolation, and each area must serve to inform the others. Biological questions drive high-throughput studies, and so computational pipelines and cyberinfrastructure need to functionally inform our knowledge of ecosystem processes. Likewise, computational resources must be complementary, whereby bioinformatic outputs are effectively databased, and evolving database resources produce continuing refinements in analytical pipelines. Seamless integration between these sectors will be crucial for enabling comparative metadata analyses and untangling complex ecological patterns – for example, mining published datasets for co-occurring species, or linking specific OTUs with environmental parameters (pH, salinity, temperature, etc.).
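One simple way to link a specific OTU with an environmental parameter, as suggested at the end of this caption, is to correlate its abundance across samples with the measured value. The sketch below uses a plain Pearson correlation; the choice of statistic, the function name and the data are assumptions made for illustration, and real analyses would normally normalize counts and correct for multiple testing.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series.

    Assumes neither series is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical data: one OTU's read counts and pH across four samples.
otu_counts = [2, 8, 15, 30]
sample_ph = [5.5, 6.0, 6.8, 7.9]
r = pearson(otu_counts, sample_ph)
```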
