Front Bioinform. 2025 Apr 9:5:1585717.
doi: 10.3389/fbinf.2025.1585717. eCollection 2025.

A cost and community perspective on the barriers to microbiome data reuse

Julia M Kelliher et al. Front Bioinform.

Abstract

Microbiome research is becoming a mature field with a wealth of data amassed from diverse ecosystems, yet the ability to fully leverage multi-omics data for reuse remains challenging. To provide a view into researchers' behavior and attitudes towards data reuse, we surveyed over 700 microbiome researchers to evaluate data sharing and reuse challenges. We found that many researchers are impeded by difficulties with metadata records, challenges with processing and bioinformatics, and problems with data repository submissions. We also explored the cost constraints of data reuse at each step of the data reuse process to better understand "pain points" and to provide a more quantitative perspective from sixteen active researchers. The bioinformatics and data processing step was estimated to be the most time consuming, which aligns with some of the most frequently reported challenges from the community survey. From these two approaches, we present evidence-based recommendations for how to address data sharing and reuse challenges with concrete actions for future work.

Keywords: FAIR data; data reuse; data standards; metadata; microbiome; multi-omics; survey.

Conflict of interest statement

Author MK is employed by In-Pipe Technology. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Synthesis of results from the community survey about challenges for data discovery and sharing. The answers to both free-form questions were classified into the broader categories for (A) data search: respondents were asked, “What is currently your biggest challenge when searching for microbiome data in available resources?” and (B) data sharing: respondents were asked, “What is currently your biggest challenge with respect to sharing microbiome data?”. All respondent answers and bins are available here: https://doi.org/10.5281/zenodo.14948343 (Kelliher et al., 2025).
FIGURE 2
Estimates of personnel time for microbiome data reuse steps. These steps include: Experimental design, Searching for relevant datasets, Assessing and linking metadata with datasets, Data download, Bioinformatics, Downstream statistics, analyses, and figure generation, and Writing publications. Estimates were gathered from 16 NMDC Champions. Several Champions estimated “greater than” the hours that were used in the final average and median calculations (e.g., >100 h was treated as 100 h). Figure was generated using the ggplot2 R package and the microshades color palette (Wickham, 2011; Dahl et al., 2022).
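The handling of open-ended estimates described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes responses arrive as free-text strings; the `parse_hours` helper and the sample responses are hypothetical and are not the authors' code or survey data. Only the rule that ">100 h" counts as 100 h comes from the caption.

```python
from statistics import mean, median


def parse_hours(estimate: str) -> float:
    """Convert a free-text estimate such as '40 h' or '>100 h' to hours.

    A "greater than" estimate is replaced by its stated bound,
    so '>100 h' becomes 100.0, as described in the figure caption.
    """
    text = estimate.strip().lower().rstrip("h").strip()
    if text.startswith(">"):
        text = text[1:].strip()
    return float(text)


# Hypothetical responses for one data reuse step (not real survey data).
responses = ["20", "40 h", ">100 h", "60"]

hours = [parse_hours(r) for r in responses]
summary = {"mean": mean(hours), "median": median(hours)}
print(summary)
```

Because each "greater than" answer is replaced by its bound, a mean computed this way is a lower bound on the true average for that step.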

References

Abdill R. J., Graham S. P., Rubinetti V., Ahmadian M., Hicks P., Chetty A., et al. (2025). Integration of 168,000 samples reveals global patterns of the human gut microbiome. Cell 188, 1100–1118.e17. doi: 10.1016/j.cell.2024.12.017
Alexander H., Hu S. K., Krinos A. I., Pachiadaki M., Tully B. J., Neely C. J., et al. (2023). Eukaryotic genomes from a global metagenomic data set illuminate trophic modes and biogeography of ocean plankton. mBio 14, e0167623. doi: 10.1128/mbio.01676-23
Arkin A. P., Cottingham R. W., Henry C. S., Harris N. L., Stevens R. L., Maslov S., et al. (2018). KBase: the United States Department of Energy systems biology knowledgebase. Nat. Biotechnol. 36, 566–569. doi: 10.1038/nbt.4163
Baker W., van den Broek A., Camon E., Hingamp P., Sterk P., Stoesser G., et al. (2000). The EMBL nucleotide sequence database. Nucleic Acids Res. 28, 19–23. doi: 10.1093/nar/28.1.19
Bhandary P., Seetharam A. S., Arendsee Z. W., Hur M., Wurtele E. S. (2018). Raising orphans from a metadata morass: a researcher’s guide to re-use of public ’omics data. Plant Sci. 267, 32–47. doi: 10.1016/j.plantsci.2017.10.014