F1000Res. 2024 May 28;12:926.
doi: 10.12688/f1000research.134798.2. eCollection 2023.

Facilitating accessible, rapid, and appropriate processing of ancient metagenomic data with AMDirT

Maxime Borry et al. F1000Res.

Abstract

Background: Access to sample-level metadata is important when selecting public metagenomic sequencing datasets for reuse in new biological analyses. The Standards, Precautions, and Advances in Ancient Metagenomics community (SPAAM, https://spaam-community.org) has previously published AncientMetagenomeDir, a collection of curated and standardised sample metadata tables for metagenomic and microbial genome datasets generated from ancient samples. However, while sample-level information is useful for identifying relevant samples for inclusion in new projects, Next Generation Sequencing (NGS) library construction and sequencing metadata are also essential for appropriately reprocessing ancient metagenomic data. Currently, recovering information for downloading and preparing such data is difficult when laboratory and bioinformatic metadata is heterogeneously recorded in prose-based publications.

Methods: Through a series of community-based hackathon events, AncientMetagenomeDir was updated to provide standardised library-level metadata of existing and new ancient metagenomic samples. In tandem, the companion tool 'AMDirT' was developed to facilitate rapid data filtering and downloading of ancient metagenomic data, as well as improving automated metadata curation and validation for AncientMetagenomeDir.

Results: AncientMetagenomeDir was extended to include standardised metadata of over 6000 ancient metagenomic libraries. The companion tool 'AMDirT' provides both graphical- and command-line interface based access to such metadata for users from a wide range of computational backgrounds. We also report on errors with metadata reporting that appear to commonly occur during data upload and provide suggestions on how to improve the quality of data sharing by the community.

Conclusions: Together, standardised metadata reporting and tooling will make it easier to incorporate and reuse public ancient metagenomic datasets in future analyses.

Keywords: FAIR data; aDNA; environmental; metadata; metagenomics; microbial; microbiome; palaeogenomics.

Conflict of interest statement

No competing interests were disclosed.

Figures

Figure 1. Growth of studies curated in the AncientMetagenomeDir as of v24.03.
(a) Number of ancient metagenomic publications published per year with open sequencing data and included in AncientMetagenomeDir. The original AncientMetagenomeDir publication was in 2020. (b) Cumulative sum of the number of published samples with publicly accessible sequencing data. (c) Cumulative sum of the number of ancient metagenomic sequencing data accessions of the samples in panel b. Data from Fellows Yates et al.
Figure 2. Updated workflow for submission to AncientMetagenomeDir using the AMDirT autofill functionality.
The updated AncientMetagenomeDir submission workflow. The general workflow remains the same: issue creation for publication proposals, then metadata submission by contributors via a branch and pull request, which undergoes automated validation (with AMDirT validate) and later peer review by AncientMetagenomeDir curators. The new addition is autofill, which is called via a GitHub Actions ‘bot’. The bot generates a partially completed library metadata table and uploads it to the pull request in a comment. The table can then be filled in, reviewed for accuracy, and appended to the corresponding AncientMetagenomeDir library table as part of the original sample pull request.
Figure 3. Example workflow of using AMDirT viewer.
(a) The viewer opens in a user’s web browser, where the desired AncientMetagenomeDir version and table is selected. (b) Interaction with columns follows standard operations common to most spreadsheet software. Samples for download are selected using checkboxes. (c) The same interface can be used for the subsequent library metadata filter table. (d) After pressing ‘Validate library selection’, buttons appear for downloading various download scripts, reference, and pipeline input sheets.
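
The select-then-filter-then-download flow described for the viewer can be sketched in a few lines of Python. This is only an illustration of the idea and not AMDirT's implementation or API: the tables are made-up in-memory rows, the column names (`sample_name`, `material`, `library_layout`, `download_links`) are assumed for the example, and the URLs are placeholders.

```python
import csv
import io

# Made-up example rows; column names are assumptions for illustration only.
SAMPLES_TSV = (
    "project_name\tsample_name\tmaterial\n"
    "StudyA2020\tS1\tdental calculus\n"
    "StudyA2020\tS2\ttooth\n"
    "StudyB2021\tS3\tdental calculus\n"
)

LIBRARIES_TSV = (
    "sample_name\tlibrary_name\tlibrary_layout\tdownload_links\n"
    "S1\tS1.lib1\tPAIRED\thttps://example.org/S1_1.fastq.gz;https://example.org/S1_2.fastq.gz\n"
    "S1\tS1.lib2\tSINGLE\thttps://example.org/S1b.fastq.gz\n"
    "S2\tS2.lib1\tSINGLE\thttps://example.org/S2.fastq.gz\n"
    "S3\tS3.lib1\tSINGLE\thttps://example.org/S3.fastq.gz\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def select_samples(samples, material):
    """Step 1: pick sample names whose material matches."""
    return {row["sample_name"] for row in samples if row["material"] == material}


def libraries_for(libraries, sample_names, layout=None):
    """Step 2: keep libraries of the selected samples, optionally by layout."""
    return [
        row
        for row in libraries
        if row["sample_name"] in sample_names
        and (layout is None or row["library_layout"] == layout)
    ]


def download_script(libraries):
    """Step 3: emit a shell script with one curl call per download link."""
    lines = ["#!/usr/bin/env bash"]
    for row in libraries:
        for link in row["download_links"].split(";"):
            lines.append(f"curl -L -O {link}")
    return "\n".join(lines) + "\n"


samples = read_tsv(SAMPLES_TSV)
libraries = read_tsv(LIBRARIES_TSV)
chosen = select_samples(samples, "dental calculus")
selected_libs = libraries_for(libraries, chosen, layout="SINGLE")
script = download_script(selected_libs)
print(script)
```

Filtering at the sample level first and the library level second mirrors the two-table layout described in the abstract, where sample metadata identifies what is relevant and library metadata determines how the data should be retrieved and processed.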

References

    1. Anagnostou P, Capocasa M, Milia N, et al. When data sharing gets close to 100%: what human paleogenetics can teach the open science movement. PLoS One. March 2015;10(3):e0121409. doi: 10.1371/journal.pone.0121409
    2. Wilkinson MD, Dumontier M, Aalbersberg IJJ, et al. The FAIR guiding principles for scientific data management and stewardship. Sci. Data. March 2016;3:160018. doi: 10.1038/sdata.2016.18
    3. Fellows Yates JA, Andrades Valtueña Å, Vågene ÅJ, et al. Community-curated and standardised metadata of published ancient metagenomic samples with AncientMetagenomeDir. Sci. Data. January 2021;8(1):31. doi: 10.1038/s41597-021-00816-y
    4. Schubert M, Ermini L, Der Sarkissian C, et al. Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX. Nat. Protoc. May 2014;9(5):1056–1082. doi: 10.1038/nprot.2014.063
    5. Fellows Yates JA, Lamnidis TC, et al. Reproducible, portable, and efficient ancient genome reconstruction with nf-core/eager. PeerJ. March 2021;9:e10947. doi: 10.7717/peerj.10947