Editorial
Bioinform Adv. 2024 Apr 25;4(1):vbae057.
doi: 10.1093/bioadv/vbae057. eCollection 2024.

Perspectives on tracking data reuse across biodata resources

Karen E Ross et al. Bioinform Adv.

Abstract

Motivation: Data reuse is a common and vital practice in molecular biology and enables the knowledge gathered over recent decades to drive discovery and innovation in the life sciences. Much of this knowledge has been collated into molecular biology databases, such as UniProtKB, and these resources derive enormous value from sharing data among themselves. However, quantifying and documenting this kind of data reuse remains a challenge.

Results: The article reports on a one-day virtual workshop hosted by the UniProt Consortium in March 2023, attended by representatives from biodata resources, experts in data management, and NIH program managers. Workshop discussions focused on strategies for tracking data reuse, best practices for reusing data, and the challenges associated with data reuse and tracking. Surveys and discussions showed that data reuse is widespread, but critical information for reproducibility is sometimes lacking. Challenges include costs of tracking data reuse, tensions between tracking data and open sharing, restrictive licenses, and difficulties in tracking commercial data use. Recommendations that emerged from the discussion include: development of standardized formats for documenting data reuse, education about the obstacles posed by restrictive licenses, and continued recognition by funding agencies that data management is a critical activity that requires dedicated resources.

Availability and implementation: Summaries of survey results are available at: https://docs.google.com/forms/d/1j-VU2ifEKb9C-sW6l3ATB79dgHdRk5v_lESv2hawnso/viewanalytics (survey of data providers) and https://docs.google.com/forms/d/18WbJFutUd7qiZoEzbOytFYXSfWFT61hVce0vjvIwIjk/viewanalytics (survey of users).

Conflict of interest statement

A.B. is Editor-in-Chief of Bioinformatics Advances, but was not involved in the editorial process of this manuscript.

Figures

Figure 1.
Selected survey responses. (a) Word cloud showing data types that providers reported reusing. Larger words were mentioned more frequently. Created with Free Word Cloud Generator (https://www.freewordcloudgenerator.com/). (b) Frequency of citation methods for reused data as reported by providers (left bar of each pair) and users (right bar of each pair). (c) Frequency of methods for discovering/tracking data reuse as reported by providers. (d) Frequency of challenges accessing the original source of reused data as reported by users. Full survey results: Providers: https://docs.google.com/forms/d/1j-VU2ifEKb9C-sW6l3ATB79dgHdRk5v_lESv2hawnso/viewanalytics (Supplementary File 1); Users: https://docs.google.com/forms/d/18WbJFutUd7qiZoEzbOytFYXSfWFT61hVce0vjvIwIjk/viewanalytics (Supplementary File 2).
