Gigascience. 2017 Aug 1;6(8):1-11.
doi: 10.1093/gigascience/gix047.

The metagenomic data life-cycle: standards and best practices

Petra Ten Hoopen et al. Gigascience. 2017.

Abstract

Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonized way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (i) material sampling, (ii) material sequencing, (iii) data analysis, and (iv) data archiving and publishing. Taking examples from marine research, we summarize essential variables used to describe material sampling processes and sequencing procedures in a metagenomics experiment. These aspects of metagenomics dataset generation have been to some extent addressed by the scientific community, but greater awareness and adoption are still needed. We emphasize the lack of standards relating to reporting how metagenomics datasets are analysed and how the metagenomics data analysis outputs should be archived and published. We propose best practice as a foundation for a community standard to enable reproducibility and better sharing of metagenomics datasets, leading ultimately to greater metagenomics data reuse and repurposing.

Keywords: best practice; data analysis; metadata; metagenomics; sampling; sequencing; standard.

Figures

Figure 1: A generalized metagenomics data analysis workflow in the context of other “omics” approaches.
Figure 2: A common data model for read data and associated metadata.
Figure 3: Schematic overview of best practice for analysis metadata collection with example fields. A) Overarching metadata; B) Analysis component; C) Workflow.
