mSystems. 2021 Feb 23;6(1):e01194-20.
doi: 10.1128/mSystems.01194-20.

Microbiome Metadata Standards: Report of the National Microbiome Data Collaborative's Workshop and Follow-On Activities

Pajau Vangay et al. mSystems.

Erratum in

  • Correction for Vangay et al., "Microbiome Metadata Standards: Report of the National Microbiome Data Collaborative's Workshop and Follow-On Activities".
    Vangay P, Burgin J, Johnston A, Beck KL, Berrios DC, Blumberg K, Canon S, Chain P, Chandonia JM, Christianson D, Costes SV, Damerow J, Duncan WD, Dundore-Arias JP, Fagnan K, Galazka JM, Gibbons SM, Hays D, Hervey J, Hu B, Hurwitz BL, Jaiswal P, Joachimiak MP, Kinkel L, Ladau J, Martin SL, McCue LA, Miller K, Mouncey N, Mungall C, Pafilis E, Reddy TBK, Richardson L, Roux S, Schriml LM, Shaffer JP, Sundaramurthi JC, Thompson LR, Timme RE, Zheng J, Wood-Charlson EM, Eloe-Fadrosh EA. mSystems. 2021 May 4;6(3):e00273-21. doi: 10.1128/mSystems.00273-21. PMID: 33947809.

Abstract

Microbiome samples are inherently defined by the environment in which they are found. Therefore, data that provide context and enable interpretation of measurements produced from biological samples, often referred to as metadata, are critical. Important contributions have been made in the development of community-driven metadata standards; however, these standards have not been uniformly embraced by the microbiome research community. To understand how these standards are being adopted, or the barriers to adoption, across research domains, institutions, and funding agencies, the National Microbiome Data Collaborative (NMDC) hosted a workshop in October 2019. This report provides a summary of discussions that took place throughout the workshop, as well as outcomes of the working groups initiated at the workshop.

Keywords: data standards; metadata; microbiome; ontology.

Figures

FIG 1
Examples of different types of metadata along the workflow from environmental samples to data and analysis tables. Submitting data to central repositories typically requires sample and preparation metadata. Sample metadata include information about when, where, and what sample was collected; preparation metadata describe how the sample was processed and turned into data products; data processing and feature metadata are generated by the repository or analysis software. Refer to Text S1 in the supplemental material for additional information.
FIG 2
Usage of metadata standards across sample environments. For several MIxS packages, the working group identified representative metagenome organism name(s) for each package (see Table S1 for details) in order to assess how the MIxS packages were used across communities. The standards were evaluated as follows: (i) “Expected MIxS checklist/package,” the chosen checklist/package used for sample registration was the most appropriate MIxS option based on the metagenome organism name provided (Table S1); (ii) “Other checklist/package,” the chosen checklist/package used for sample registration may not have been the most appropriate MIxS checklist/package or followed an alternative set of standards; or (iii) “ENA default checklist or NCBI metagenome package,” the chosen checklist/package used for sample registration was the ENA/NCBI defined minimum for samples/metagenome samples and did not use a specific sample metadata standard. Only public samples and their associated studies for raw read submissions of metagenomic and amplicon data (MIMS and MIMARKS survey) to ENA or SRA were included in the respective counts (counts reflect only submitted data to each repository and exclude mirrored data). Associated studies were counted once for each unique metagenome organism name represented in the study, and hence may have been counted more than once (i.e., a study associated with samples assigned x unique metagenome organism names may be counted x times). Queries were run in fall 2020. ENA queries used the ENA Portal API with the respective taxon criteria and checklist ID (Table S1) (e.g., ENA sample counts with expected use of the Air MIxS checklist: https://www.ebi.ac.uk/ena/portal/api/search?result=read_run&query=(sample_accession=%22SAMEA*%22%20OR%20sample_accession=%22ERS*%22)%20AND%20(tax_eq(655179)%20OR%20tax_eq(1708701)%20OR%20tax_eq(1643811))%20AND%20checklist=%22ERC000012%22&fields=sample_accession).
SRA queries used the NCBI Entrez Programming Utilities (e.g., SRA sample counts with expected use of the MIMS Air MIxS package, esearch -db biosample -query ("biosample sra"[Filter]) AND ((("ncbi"[Filter]) AND ("air metagenome"[Organism] OR "aerosol metagenome"[Organism] OR "cloud metagenome"[Organism]))) AND "package mims metagenome/environmental, air version 5 0"[Properties]).
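The two example queries in the FIG 2 caption can be assembled programmatically. The following is a minimal sketch, not the authors' actual tooling: the function names `build_ena_url` and `build_biosample_term` are hypothetical, and the code only builds the query strings shown in the caption without contacting either service. It assumes the ENA query always restricts sample accessions to the `SAMEA*` and `ERS*` patterns, as in the caption's example.

```python
from urllib.parse import quote

# Endpoint taken from the example URL in the FIG 2 caption.
ENA_SEARCH = "https://www.ebi.ac.uk/ena/portal/api/search"


def build_ena_url(taxon_ids, checklist_id):
    """Build an ENA Portal API search URL (hypothetical helper).

    Mirrors the caption's example: restrict to SAMEA*/ERS* samples,
    match any of the given taxon IDs, and require one checklist ID.
    """
    samples = '(sample_accession="SAMEA*" OR sample_accession="ERS*")'
    taxa = "(" + " OR ".join(f"tax_eq({t})" for t in taxon_ids) + ")"
    query = f'{samples} AND {taxa} AND checklist="{checklist_id}"'
    # Keep parentheses, '=' and '*' literal so the URL matches the caption;
    # spaces become %20 and double quotes become %22.
    encoded = quote(query, safe="()=*")
    return f"{ENA_SEARCH}?result=read_run&query={encoded}&fields=sample_accession"


def build_biosample_term(organisms, package_property):
    """Build the Entrez BioSample query term (hypothetical helper).

    The result is the string passed to `esearch -db biosample -query`
    in the caption's SRA example.
    """
    orgs = " OR ".join(f'"{o}"[Organism]' for o in organisms)
    return (
        '("biosample sra"[Filter]) AND ((("ncbi"[Filter]) AND '
        f'({orgs}))) AND "{package_property}"[Properties]'
    )


if __name__ == "__main__":
    # Taxon IDs, checklist ID, organism names, and package string
    # are the ones given in the caption for the Air package.
    print(build_ena_url([655179, 1708701, 1643811], "ERC000012"))
    print(build_biosample_term(
        ["air metagenome", "aerosol metagenome", "cloud metagenome"],
        "package mims metagenome/environmental, air version 5 0",
    ))
```

Swapping in the taxon IDs and checklist or package identifiers listed in Table S1 would yield the corresponding queries for the other MIxS packages, under the same assumptions.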
