J Biomed Inform. 2018 Jun;82:189-199.
doi: 10.1016/j.jbi.2018.05.003. Epub 2018 May 12.

Exploiting semantic patterns over biomedical knowledge graphs for predicting treatment and causative relations

Gokhan Bakal et al. J Biomed Inform. 2018 Jun.

Abstract

Background: Identifying new potential treatment options for medical conditions that cause human disease burden is a central task of biomedical research. Since not all candidate drugs can be tested in animal and clinical trials, in vitro approaches are first attempted to identify promising candidates. Likewise, identifying different causal relations between biomedical entities is critical to understanding biomedical processes. Generally, natural language processing (NLP) and machine learning are used to predict specific relations between any given pair of entities using the distant supervision approach.

Objective: To build high-accuracy supervised models that predict previously unknown treatment and causative relations between biomedical entities based only on semantic graph pattern features extracted from biomedical knowledge graphs.

Methods: We used 7000 hand-curated treats relations and 2918 hand-curated causes relations from the UMLS Metathesaurus to train and test our models. Our graph pattern features are extracted from simple paths connecting biomedical entities in the SemMedDB graph (based on the well-known SemMedDB database made available by the U.S. National Library of Medicine). Using these graph patterns as features of logistic regression and decision tree models, we computed mean performance measures (precision, recall, F-score) over 100 distinct 80-20% train-test splits of the datasets. For all experiments, we used a positive:negative class imbalance of 1:10 in the test set to model more realistic scenarios.
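To make the idea of graph pattern features concrete, the following is a minimal sketch, not the authors' implementation. It enumerates simple paths (no repeated nodes) between a candidate entity pair in a toy graph and counts the predicate sequences along them. The entity names, predicates, the `max_edges` limit, and the choice to follow edges only in their stated direction are all assumptions made for illustration; the patterns used in the paper may carry more information than predicate labels alone.

```python
from collections import defaultdict

# Toy (subject, predicate, object) triples. These are hypothetical
# examples, not records taken from SemMedDB.
TRIPLES = [
    ("drug_A", "INHIBITS", "gene_X"),
    ("gene_X", "CAUSES", "disease_D"),
    ("drug_A", "INTERACTS_WITH", "protein_P"),
    ("protein_P", "ASSOCIATED_WITH", "disease_D"),
    ("drug_A", "TREATS", "disease_E"),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    adj = defaultdict(list)
    for subj, pred, obj in triples:
        adj[subj].append((pred, obj))
    return adj


def simple_path_patterns(adj, source, target, max_edges=3):
    """Return the predicate sequence of every simple path source -> target.

    A simple path visits each node at most once. max_edges is an
    assumed cap on path length to keep the search small.
    """
    patterns = []

    def dfs(node, visited, preds):
        if node == target and preds:
            patterns.append(tuple(preds))
            return
        if len(preds) == max_edges:
            return
        for pred, nxt in adj.get(node, []):
            if nxt not in visited:
                dfs(nxt, visited | {nxt}, preds + [pred])

    dfs(source, {source}, [])
    return patterns


def pattern_features(adj, source, target, max_edges=3):
    """Count how often each predicate pattern links the entity pair."""
    counts = defaultdict(int)
    for pattern in simple_path_patterns(adj, source, target, max_edges):
        counts["->".join(pattern)] += 1
    return dict(counts)


adjacency = build_adjacency(TRIPLES)
features = pattern_features(adjacency, "drug_A", "disease_D")
print(features)
```

Each entity pair then becomes a sparse feature vector keyed by pattern string, which is the kind of input a logistic regression or decision tree model can consume.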

Results: Our models predict treats and causes relations with high F-scores of 99% and 90% respectively. Logistic regression model coefficients also help us identify highly discriminative patterns that have an intuitive interpretation. In collaboration with two physician co-authors, we also identified some new plausible relations among the false positives that our models scored highly. Finally, our decision tree models are able to retrieve over 50% of treatment relations from a recently created external dataset.
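The F-score combines precision and recall, and precision in particular is sensitive to the 1:10 positive:negative imbalance used in the test sets. The sketch below shows the standard computation on a hypothetical test set. The label vectors are invented for illustration and do not reproduce any result from the paper.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical test set with a 1:10 imbalance: 10 positives, 100 negatives.
y_true = [1] * 10 + [0] * 100
# Hypothetical predictions: 9 of 10 positives found, 2 negatives flagged.
y_pred = [1] * 9 + [0] * 1 + [1] * 2 + [0] * 98

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With ten times as many negatives as positives, even a small false positive rate adds noticeably to the denominator of precision, which is why evaluating under imbalance gives a more conservative picture than a balanced test set.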

Conclusions: We employed semantic graph patterns connecting pairs of candidate biomedical entities in a knowledge graph as features to predict treatment/causative relations between them. We provide what we believe is the first evidence for direct prediction of biomedical relations based on graph features. Our work complements lexical-pattern-based approaches in that the graph patterns can be used as additional features for weakly supervised relation prediction.

Keywords: Information extraction; Relation prediction; Semantic graph patterns.

Figures

Figure 1. A sample graph of biomedical relations
Figure 2. An example of primitive patterns generated from a compound pattern (from Eq. (1))
Figure 3. Schematic of graph pattern based relation prediction
Figure 4. Example discriminative treats patterns obtained through our methods
