Sci Data. 2022 Nov 19;9(1):714.
doi: 10.1038/s41597-022-01807-3.

Unifying the identification of biomedical entities with the Bioregistry

Charles Tapley Hoyt et al. Sci Data. 2022.

Abstract

The standardized identification of biomedical entities is a cornerstone of interoperability, reuse, and data integration in the life sciences. Several registries have been developed to catalog resources maintaining identifiers for biomedical entities such as small molecules, proteins, cell lines, and clinical trials. However, existing registries have struggled to provide sufficient coverage and metadata standards that meet the evolving needs of modern life sciences researchers. Here, we introduce the Bioregistry, an integrative, open, community-driven metaregistry that synthesizes and substantially expands upon 23 existing registries. The Bioregistry addresses the need for a sustainable registry by leveraging public infrastructure and automation, and employing a progressive governance model centered around open code and open data to foster community contribution. The Bioregistry can be used to support the standardized annotation of data, models, ontologies, and scientific literature, thereby promoting their interoperability and reuse. The Bioregistry can be accessed through https://bioregistry.io and its source code and data are available under the MIT and CC0 Licenses at https://github.com/biopragmatics/bioregistry.

Conflict of interest statement

DDF received salary from Enveda Biosciences.

Figures

Fig. 1
(a) Summary of the pairwise overlap (horizontal orange bars) between the prefixes in the Bioregistry and each of its integrated external registries. The horizontal blue bars show records that could not be automatically aligned, and the horizontal green bars represent additional prefixes available in the Bioregistry but not in the external resource. The absolute number of records in the union of each external registry with the Bioregistry (accounting for known overlaps) is shown on the right, with the percentage relative gain introduced by the Bioregistry in parentheses. A large orange section corresponds to high content reuse, while a large blue section corresponds to either high novelty of content in the external registry or high potential for semi-automated import into the Bioregistry. Counts on sections of these bar plots representing fewer than 70 prefixes are omitted. (b) A histogram of the number of cross-references each entry in the Bioregistry has to external registries. The green bar highlights the prefixes that have no cross-references and appear only in the Bioregistry. (c) A schematic diagram depicting the Bioregistry as an interoperability layer between external registries. Using the NCBI Taxonomy identifier resource as an example, the prefixes used for this resource in external registries, which the Bioregistry aligns, are shown in purple boxes. Additional synonyms for this resource curated in the Bioregistry are shown in orange boxes. The components of this figure are regenerated daily with GitHub Actions and stored in https://github.com/bioregistry/bioregistry/tree/main/docs/img.
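The alignment idea in panel (c) can be illustrated with a minimal Python sketch that maps several spellings of a prefix onto one canonical form. The synonym table below is a hand-written sample for illustration only, not an export of the Bioregistry, and the function names are hypothetical.

```python
# Illustrative sample only: NOT an export of the Bioregistry's curated data.
SYNONYMS = {
    "ncbitaxon": ["NCBITaxon", "NCBI_TaxID", "taxonomy", "taxid", "taxon"],
}


def _key(name: str) -> str:
    """Reduce a prefix to a comparison key: case-folded, alphanumerics only."""
    return "".join(ch for ch in name.casefold() if ch.isalnum())


def build_index(synonyms: dict) -> dict:
    """Build a lookup from every known spelling to its canonical prefix."""
    index = {}
    for canonical, alternatives in synonyms.items():
        for name in [canonical, *alternatives]:
            existing = index.setdefault(_key(name), canonical)
            if existing != canonical:
                # Two canonical prefixes claim the same spelling.
                raise ValueError(f"ambiguous synonym: {name!r}")
    return index


def normalize_prefix(prefix: str, index: dict):
    """Return the canonical prefix, or None if the spelling is unknown."""
    return index.get(_key(prefix))


INDEX = build_index(SYNONYMS)
print(normalize_prefix("NCBI_TaxID", INDEX))
print(normalize_prefix("Taxonomy", INDEX))
print(normalize_prefix("not-a-prefix", INDEX))
```

Raising an error on conflicting synonyms, rather than silently keeping the first match, reflects the assumption that an ambiguous spelling should be resolved by curation and not by insertion order.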
Fig. 2
Website screenshots. (a) The homepage of https://bioregistry.io prominently features a combined prefix search and CURIE resolution box, along with links to all of the components of the site. (b) The full registry of prefixes, resource names, and descriptions can be viewed and searched by full text. (c) Each prefix page shows metadata about the corresponding resource and its identifiers, and serves as a hub for the additional functionality shown in (d), (e), and (f). (d) The prefix page additionally includes the metaregistry's cross-registry mappings from the prefix to external registries' prefixes. (e) Each external registry page shows metadata and the capability list of the external resource. (f) A sample identifier demonstrates all of the providers through which it can be resolved.
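As a rough sketch of what CURIE resolution involves, the Python snippet below splits a compact identifier (CURIE) of the form prefix:identifier and builds a resolver link. The URL pattern (base URL followed by the CURIE) is an assumption made for this example and is not documented in this record; the function names are hypothetical.

```python
from typing import Tuple

# Assumed resolver base; the "/<prefix>:<identifier>" path layout is an
# assumption for illustration, not taken from this record.
BASE_URL = "https://bioregistry.io"


def parse_curie(curie: str) -> Tuple[str, str]:
    """Split a CURIE at its first colon into (prefix, identifier)."""
    prefix, sep, identifier = curie.strip().partition(":")
    if not sep or not prefix or not identifier:
        raise ValueError(f"not a valid CURIE: {curie!r}")
    return prefix, identifier


def resolution_url(curie: str, base_url: str = BASE_URL) -> str:
    """Build a resolver URL for a CURIE under the assumed URL pattern."""
    prefix, identifier = parse_curie(curie)
    return f"{base_url}/{prefix}:{identifier}"


print(parse_curie("ncbitaxon:9606"))
print(resolution_url("ncbitaxon:9606"))
```

Splitting at the first colon only means that identifiers which themselves contain a colon stay intact in the identifier part.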
