Sci Data. 2017 Aug 1;4:170093.
doi: 10.1038/sdata.2017.93.

Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition

Adriana Alberti et al. Sci Data.

Abstract

A unique collection of oceanic samples was gathered by the Tara Oceans expeditions (2009-2013), targeting plankton organisms ranging from viruses to metazoans, and providing rich environmental context measurements. Thanks to recent advances in genomics, extensive sequencing has been performed for a deep genomic analysis of this huge collection of samples. A strategy combining several approaches (metabarcoding, metagenomics, single-cell genomics and metatranscriptomics) was chosen for the analysis of size-fractionated plankton communities. Here, we provide the detailed procedures applied for genomic data generation, from nucleic acid extraction to sequence production, and we describe the registries of genomics datasets available at the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena). Associating these metadata with the experimental procedures used to generate the data will help the scientific community to access these data and will facilitate their analysis. This paper complements other efforts to provide a full description of the experiments and open science resources generated by the Tara Oceans project, further extending their value for the study of the world's planktonic ecosystems.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Overview of -omics analysis strategy applied on Tara Oceans samples.
Figure 2. Data processing flowchart.
Figure 3. Overview of experimental pipeline from nucleic acids to sequences.
Red crosses highlight QC steps where experiments can be stopped.
Figure 4. Agilent Bioanalyzer profiles of amplified libraries.
(a) Example of an electropherogram obtained following the metagenomic library preparation protocol described in paragraph 4.1. The size distribution of this kind of library is very narrow because of the size selection step used to generate overlapping paired-end reads. (b) Example of a metatranscriptomic library generated following the TS_RNA protocol.
Figure 5. Representative examples of tabulated data reports generated by the LIMS for multiple datasets.
(a) Example of a sequencing report for metagenomic libraries. It displays metrics that are particularly useful for evaluating the quality of this type of data, such as the percentage of merged reads, the median length and the estimated insert size. (b) Example of a report for metatranscriptomic libraries from poly(A)+ RNA. Quality control of these libraries focuses on the duplication rate and on potential contamination by bacteria and fungi, whose percentages are easily visualized on the report.
Figure 6. Representative examples of key data reports generated by the LIMS for individual datasets.
(a) Quality score box plot of 100-bp Illumina reads. This plot summarizes quality per position over all reads: it shows a box plot for each read position and the smoothed average as a black line. (b) Nucleotide distribution per read position: left, before trimming of adapters and low-quality reads; right, after trimming. In the left plot, a non-random distribution in the first 12 bases is typical of metatranscriptomic libraries generated with the SMART-dT protocol, which leaves the SMARTer adapter sequence at the beginning of the cDNA insert. (c) Graphical representation of known overrepresented sequences (primers and adapters used for library preparation) before (left panel) and after (right panel) adapter trimming. Again, the overrepresentation of the SMARTer adapter is easily visualised in the left panel (red bar) and disappears after trimming (right panel). (d) Report of taxonomic assignment by organism (left), by division (middle) and by keyword (right). Bacterial and fungal percentages below 5% are highlighted in green to facilitate manual validation of the dataset. (e) Report of rRNA sequence detection and trimming, detailing the percentage of each rRNA species. (f) Krona chart of the same taxonomic assignment reported in (d). (g) Length distribution of the reads obtained after merging the paired reads generated by sequencing a metagenomic library.

Dataset use reported in

  • doi: 10.1126/science.1261498
  • doi: 10.1126/science.1261605
  • doi: 10.1126/science.1261359
  • doi: 10.1038/nature19366

References

Data Citations

    1. 2015. GenBank. NC_001422.1
    2. 2012. European Nucleotide Archive. PRJEB402
    3. Tara Oceans Consortium C., Tara Oceans Expedition P. 2015. PANGAEA. http://dx.doi.org/10.1594/PANGAEA.859953
    4. Alberti A., Pesant S. 2017. PANGAEA. https://dx.doi.org/10.1594/PANGAEA.875581
