Bioinformatics. 2023 May 4;39(5):btad297.
doi: 10.1093/bioinformatics/btad297.

Metapaths: similarity search in heterogeneous knowledge graphs via meta-paths

Ayush Noori et al. Bioinformatics.

Abstract

Summary: Heterogeneous knowledge graphs (KGs) have enabled the modeling of complex systems, from genetic interaction graphs and protein-protein interaction networks to networks representing drugs, diseases, proteins, and side effects. Analytical methods for KGs rely on quantifying similarities between entities, such as nodes, in the graph. However, such methods must consider the diversity of node and edge types contained within the KG via, for example, defined sequences of entity types known as meta-paths. We present metapaths, the first R software package to implement meta-paths and perform meta-path-based similarity search in heterogeneous KGs. The metapaths package offers various built-in similarity metrics for node pair comparison by querying KGs represented as either edge or adjacency lists, as well as auxiliary aggregation methods to measure set-level relationships. Indeed, evaluation of these methods on an open-source biomedical KG recovered meaningful drug and disease-associated relationships, including those in Alzheimer's disease. The metapaths framework facilitates the scalable and flexible modeling of network similarities in KGs with applications across KG learning.

Availability and implementation: The metapaths R package is available via GitHub at https://github.com/ayushnoori/metapaths and is released under MPL 2.0 (Zenodo DOI: 10.5281/zenodo.7047209). Package documentation and usage examples are available at https://www.ayushnoori.com/metapaths.

Conflict of interest statement

None declared.

Figures

Figure 1.
Evaluation of the metapaths package for similarity search in the ogbl-biokg biomedical KG. (a) We query using the RDPF (i.e. drug–disease–protein–function) meta-path. (b) The function call used to calculate meta-path-based similarity scores is shown. (c) The meta-path traversal function identifies three paths following the specified meta-path that connect donepezil, a drug used to treat Alzheimer’s disease (AD), with the regulation of amyloid fibril formation pathway, which is implicated in AD (Supplementary data). (d) The computed similarity scores using Path Count, Normalized Path Count, and Degree-Weighted Path Count metrics are shown.
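
To make the idea of meta-path traversal concrete, the following is a minimal Python sketch of the Path Count metric on a toy heterogeneous graph. It is not the metapaths R package or its API: the node names, the edge list, and the helper functions build_adjacency, find_paths, and path_count are all hypothetical and chosen only for illustration. The sketch assumes undirected edges and counts only simple paths (no repeated nodes); the package's own definitions may differ.

```python
from collections import defaultdict

# Hypothetical toy data: node names, types, and edges are illustrative only
# and are not taken from ogbl-biokg or from the metapaths package.
node_type = {
    "drug_A": "drug",
    "disease_X": "disease",
    "disease_Y": "disease",
    "protein_1": "protein",
    "protein_2": "protein",
    "function_F": "function",
}
edge_list = [
    ("drug_A", "disease_X"),
    ("drug_A", "disease_Y"),
    ("disease_X", "protein_1"),
    ("disease_X", "protein_2"),
    ("disease_Y", "protein_2"),
    ("protein_1", "function_F"),
    ("protein_2", "function_F"),
]


def build_adjacency(edges):
    """Convert an edge list into an undirected adjacency list."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return dict(adjacency)


def find_paths(adjacency, types, origin, destination, meta_path):
    """Return all simple paths from origin to destination whose node types
    follow meta_path, a sequence of node types."""
    if types.get(origin) != meta_path[0]:
        return []
    paths = [[origin]]
    # Extend every partial path by one step per remaining type in the meta-path.
    for wanted_type in meta_path[1:]:
        paths = [
            path + [neighbor]
            for path in paths
            for neighbor in sorted(adjacency.get(path[-1], ()))
            if types.get(neighbor) == wanted_type and neighbor not in path
        ]
    return [path for path in paths if path[-1] == destination]


def path_count(adjacency, types, origin, destination, meta_path):
    """Path Count: the number of paths that follow the meta-path."""
    return len(find_paths(adjacency, types, origin, destination, meta_path))


adjacency = build_adjacency(edge_list)
meta_path = ["drug", "disease", "protein", "function"]

for path in find_paths(adjacency, node_type, "drug_A", "function_F", meta_path):
    print(" -> ".join(path))
print(path_count(adjacency, node_type, "drug_A", "function_F", meta_path))
```

In this toy graph, three paths follow the drug-disease-protein-function meta-path from drug_A to function_F, so the Path Count is 3. Normalized and degree-weighted variants would rescale this raw count, for example to avoid favoring highly connected nodes; they are left out here because their exact definitions in the package are not given in this record.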

