JMIR Med Inform. 2021 Jul 20;9(7):e26714.
doi: 10.2196/26714.

A Biomedical Knowledge Graph System to Propose Mechanistic Hypotheses for Real-World Environmental Health Observations: Cohort Study and Informatics Application

Karamarie Fecho et al. JMIR Med Inform.

Abstract

Background: Knowledge graphs are a common form of knowledge representation in biomedicine and many other fields. We developed an open biomedical knowledge graph-based system termed Reasoning Over Biomedical Objects linked in Knowledge Oriented Pathways (ROBOKOP). ROBOKOP consists of both a front-end user interface and a back-end knowledge graph. The ROBOKOP user interface allows users to posit questions and explore answer subgraphs. Users can also posit questions through direct Cypher query of the underlying knowledge graph, which currently contains roughly 6 million nodes or biomedical entities and 140 million edges or predicates describing the relationship between nodes, drawn from over 30 curated data sources.

Objective: We aimed to apply ROBOKOP to survey data on workplace exposures and immune-mediated diseases from the Environmental Polymorphisms Registry (EPR) within the National Institute of Environmental Health Sciences.

Methods: We analyzed EPR survey data and identified 45 associations between workplace chemical exposures and immune-mediated diseases, as self-reported by study participants (n=4574), with 20 associations significant at P<.05 after false discovery rate correction. We then used ROBOKOP to (1) validate the associations by determining whether plausible connections exist within the ROBOKOP knowledge graph and (2) propose biological mechanisms that might explain them and serve as hypotheses for subsequent testing. We highlight the following three exemplar associations: carbon monoxide-multiple sclerosis, ammonia-asthma, and isopropanol-allergic disease.

Results: ROBOKOP successfully returned answer sets for three queries that were posed in the context of the driving examples. The answer sets included potential intermediary genes, as well as supporting evidence that might explain the observed associations.

Conclusions: We demonstrate real-world application of ROBOKOP to generate mechanistic hypotheses for associations between workplace chemical exposures and immune-mediated diseases. We expect that ROBOKOP will find broad application across many biomedical fields and other scientific disciplines due to its generalizability, speed to discovery and generation of mechanistic hypotheses, and open nature.

Keywords: data exploration; discovery; generalizability; immune-mediated disease; knowledge graph; knowledge representation; open science.
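
Illustrative note on the false discovery rate correction mentioned in the Methods: the abstract does not state which procedure was used, so the sketch below assumes the common Benjamini-Hochberg step-up method. The function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the abstract only says "false discovery rate correction";
    Benjamini-Hochberg is used here purely as a representative method.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for illustration only (not study data).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

With these made-up inputs, the two smallest p-values remain below .05 after adjustment, while 0.039 and 0.041 do not, which shows how an association that is nominally significant can drop out after correction.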

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
High-level schema for the Reasoning Over Biomedical Objects linked in Knowledge Oriented Pathways (ROBOKOP) knowledge graph, showing classes of nodes as defined by the Biolink model. Note that the schema provides a user guide to assist with the proper construction of queries by providing a visual overview of nodes that are connected in the ROBOKOP knowledge graph.
Figure 2
A Reasoning Over Biomedical Objects linked in Knowledge Oriented Pathways (ROBOKOP) high-level question (A) implemented as a machine question or meta-graph (B) designed to explore genes that might mediate the observed association between workplace exposure to carbon monoxide and Environmental Polymorphisms Registry participant self-report of multiple sclerosis. Users select nodes and edges in the ROBOKOP user interface (or by direct Cypher query) to translate the desired natural-language question into an executable machine question, using the schema provided in Figure 1 as a guide. The resultant aggregated answer graph (C) and list of answer subgraphs (D) show six potential mediating genes. Both the answer graph and the list of answer subgraphs are interactive and can be explored by users. For example in (D), users can click on each answer subgraph to explore knowledge sources, predicate assertions, and publication support. TNF: tumor necrosis factor; BDNF: brain-derived neurotrophic factor; IL10: interleukin-10; NGF: nerve growth factor; IRF8: interferon regulatory factor 8; KCNMA1: potassium calcium-activated channel subfamily M alpha 1; CASP8: caspase 8.
Figure 3
The top-ranked answer subgraph suggesting the involvement of TNF (tumor necrosis factor) (A) as a potential gene that might mediate the observed significant association between workplace exposure to carbon monoxide and multiple sclerosis. The Reasoning Over Biomedical Objects linked in Knowledge Oriented Pathways (ROBOKOP) machine question was structured as follows: carbon monoxide – gene – multiple sclerosis. Example publication support for the linkage between carbon monoxide and TNF suggesting a role for heme oxygenase-1 (B). Metadata for the carbon monoxide node (C).
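
Illustrative sketch of the machine question described above (carbon monoxide – gene – multiple sclerosis) expressed as a direct Cypher query. This is not the ROBOKOP implementation: the helper name `build_path_query`, the node label `biolink:Gene`, and the `name` property are assumptions made for illustration, and the actual labels and properties in the ROBOKOP knowledge graph may differ.

```python
def build_path_query(start_name, middle_label, end_name, limit=10):
    """Build a Cypher string and parameters for start - middle - end paths.

    Assumptions (hypothetical, not confirmed by the source): nodes carry a
    `name` property and intermediate nodes carry a Biolink-style label.
    """
    # Cypher cannot take a label as a query parameter, so the label is
    # interpolated; restrict it to letters, digits, ':' and '_' first.
    if not middle_label.replace(":", "").replace("_", "").isalnum():
        raise ValueError(f"unsafe label: {middle_label!r}")
    query = (
        f"MATCH (a)-[r1]-(g:`{middle_label}`)-[r2]-(d) "
        "WHERE a.name = $start_name AND d.name = $end_name "
        "RETURN g.name AS intermediate, type(r1) AS edge_1, type(r2) AS edge_2 "
        "LIMIT $limit"
    )
    params = {"start_name": start_name, "end_name": end_name, "limit": limit}
    return query, params


# Build (but do not run) the query for the exemplar association.
query, params = build_path_query(
    "carbon monoxide", "biolink:Gene", "multiple sclerosis"
)
```

The returned string and parameter dictionary could be passed to any Cypher-capable client; the names are supplied as parameters rather than interpolated so that only the validated label is inserted into the query text.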
