Review
Clin Transl Sci. 2022 Aug;15(8):1848-1855.
doi: 10.1111/cts.13302. Epub 2022 Jun 6.

Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science

Deepak R Unni et al. Clin Transl Sci. 2022 Aug.

Abstract

Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs have left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.
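
To make the idea of hierarchical classes (categories) concrete, the following is a minimal sketch in Python. The parent map is a simplified, illustrative assumption and not a copy of the actual Biolink Model hierarchy; the helper names `ancestors` and `is_a` are hypothetical.

```python
# Toy parent map: child category -> parent category.
# ASSUMPTION: simplified for illustration; the real model's parentage may differ.
PARENT = {
    "biolink:Disease": "biolink:DiseaseOrPhenotypicFeature",
    "biolink:PhenotypicFeature": "biolink:DiseaseOrPhenotypicFeature",
    "biolink:DiseaseOrPhenotypicFeature": "biolink:BiologicalEntity",
    "biolink:Gene": "biolink:BiologicalEntity",
    "biolink:BiologicalEntity": "biolink:NamedThing",
}


def ancestors(category):
    """Return the chain of parent categories, nearest parent first."""
    chain = []
    while category in PARENT:
        category = PARENT[category]
        chain.append(category)
    return chain


def is_a(category, target):
    """True if `category` equals `target` or descends from it."""
    return category == target or target in ancestors(category)


if __name__ == "__main__":
    print(ancestors("biolink:Disease"))
    print(is_a("biolink:Gene", "biolink:NamedThing"))
```

A hierarchy of this kind lets a consumer that asks for a broad category also accept nodes typed with any of its more specific descendants.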

Conflict of interest statement

The authors declared no competing interests for this work.

Figures

FIGURE 1
An example of an Association represented in Biolink Model. In (a), the green ovals represent the subject and object classes, connected by a predicate. Together, the classes and the predicate constitute a statement or “core triple” in the model. Edge properties provide further context and qualification to the core triple. The entire diagram, including the core triple and its provenance, represents a Biolink Model “association.” In (b), we see a specific example of a “biolink:DiseaseToPhenotypicFeatureAssociation,” where the subject is “biolink:Disease,” the object is “biolink:PhenotypicFeature,” and the predicate is “biolink:has_phenotype.” In addition, the “biolink:publications” property (lavender oval) records the provenance of the core triple.
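
The association described in this caption can be sketched as a small data structure. This is a minimal illustration in Python, not the model's official implementation: the `Node` and `Association` classes and the `core_triple` method are hypothetical names, and the node and publication identifiers are placeholders. Only the `biolink:` category, predicate, and property names come from the caption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A graph node: an identifier plus its Biolink category."""
    id: str
    category: str


@dataclass
class Association:
    """A core triple (subject, predicate, object) plus edge properties."""
    subject: Node
    predicate: str
    object: Node
    category: str = "biolink:Association"
    # Edge property recording provenance (biolink:publications in the caption).
    publications: list = field(default_factory=list)

    def core_triple(self):
        """Return the statement without its qualifying edge properties."""
        return (self.subject.id, self.predicate, self.object.id)


# Placeholder identifiers, invented for illustration only.
disease = Node(id="EXAMPLE:disease-1", category="biolink:Disease")
phenotype = Node(id="EXAMPLE:phenotype-1", category="biolink:PhenotypicFeature")

assoc = Association(
    subject=disease,
    predicate="biolink:has_phenotype",
    object=phenotype,
    category="biolink:DiseaseToPhenotypicFeatureAssociation",
    publications=["EXAMPLE:publication-1"],
)

if __name__ == "__main__":
    print(assoc.core_triple())
```

Keeping the core triple separate from properties such as publications mirrors the caption's distinction between the statement itself and the provenance that qualifies it.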
FIGURE 2
An overview of the Translator architecture that supports biomedical KG‐based question‐answering, including the role of Biolink Model, in the context of an example question. In this example, a user has posed the natural‐language question: what chemicals or drugs might be used to treat neurological disorders, such as epilepsy, that are associated with genomic variants of RHOBTB2? The question is translated into a graph query, as shown in the top left panel, which is then translated into a Translator standard machine query (not shown). The KG shown in the second panel from the left is derived from a variety of diverse “knowledge sources,” a subset of which are displayed in the figure, that are exposed by Translator “knowledge providers.” Biolink Model provides standardization and semantic harmonization across the disparate knowledge sources, thereby allowing them to be integrated into a KG capable of supporting question‐answering. In this example, Translator provided two answers or results of interest to the investigative team who posed the question, namely, fostamatinib disodium and ruxolitinib, as shown in the bottom left panel. KG, knowledge graph.
