Inferring the semantic relationships of words within an ontology using random indexing: applications to pharmacogenomics

Bethany Percha¹, Russ B Altman¹

PMID: 24551397
PMCID: PMC3900134

Comparative Study

Bethany Percha et al. AMIA Annu Symp Proc. 2013 Nov 16;2013:1123-32. eCollection 2013.

Affiliation

¹ Stanford University, Stanford, CA.

Abstract

The biomedical literature presents a uniquely challenging text mining problem. Sentences are long and complex, the subject matter is highly specialized with a distinct vocabulary, and producing annotated training data for this domain is time-consuming and expensive. In this environment, unsupervised text mining methods that do not rely on annotated training data are valuable. Here we investigate the use of random indexing, an automated method for producing vector-space semantic representations of words from large, unlabeled corpora, to address the problem of term normalization in sentences describing drugs and genes. We show that random indexing produces similarity scores that capture some of the structure of PHARE, a manually curated ontology of pharmacogenomics concepts. We further show that random indexing can be used to identify likely word candidates for inclusion in the ontology, and can help localize these new labels among classes and roles within the ontology.
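
To make the idea of random indexing concrete, the following is a minimal sketch of the general technique, not the authors' implementation. Each word receives a sparse random "index vector", and a word's semantic vector is the sum of the index vectors of the words that appear within a fixed window around it. The function names, the toy sentences, and the parameter values (`dim`, `nonzero`, `width`) are assumptions chosen for illustration; the abstract does not specify them.

```python
import math
import random
from collections import defaultdict


def make_index_vector(dim, nonzero, rng):
    """Sparse ternary vector: `nonzero` positions set to +1 or -1, the rest 0."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec


def random_indexing(sentences, dim=300, nonzero=10, width=1, seed=0):
    """Build semantic vectors by summing index vectors of nearby words.

    `width` is the number of tokens considered on each side of the target
    word (an assumed reading of "window width"; hypothetical parameter).
    """
    rng = random.Random(seed)

    # Step 1: assign every distinct token a fixed random index vector.
    index = {}
    for tokens in sentences:
        for tok in tokens:
            if tok not in index:
                index[tok] = make_index_vector(dim, nonzero, rng)

    # Step 2: accumulate the index vectors of each word's neighbors.
    semantic = defaultdict(lambda: [0] * dim)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - width)
            hi = min(len(tokens), i + width + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                neighbor = index[tokens[j]]
                target = semantic[word]
                for k in range(dim):
                    target[k] += neighbor[k]
    return dict(semantic)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (invented for illustration): "inhibits" and "blocks" share contexts.
corpus = [
    ["warfarin", "inhibits", "cyp2c9"],
    ["warfarin", "blocks", "cyp2c9"],
    ["cyp2d6", "metabolizes", "codeine"],
]
vectors = random_indexing(corpus)
sim_shared = cosine(vectors["inhibits"], vectors["blocks"])
sim_unrelated = cosine(vectors["inhibits"], vectors["metabolizes"])
print(sim_shared, sim_unrelated)
```

In this toy corpus, "inhibits" and "blocks" occur between the same neighbors, so their semantic vectors coincide and their similarity is maximal, while "metabolizes" shares no context with either and scores much lower. On a real corpus the vectors would be accumulated over many sentences, and similarity scores of this kind could then be compared against a word's position in an ontology.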

Figures

**Figure 1.**
An example of relation normalization using PHARE. Here two sentences that look very different on the surface are mapped to the same normalized “fact”.

**Figure 2.**
Bar plots of correlations between number of common parents in ontology and distributional similarity scores for (left) concepts and (right) roles. Each bar represents a different type of semantic vector. Orange bars represent vectors with width 1, gray width 3, and blue width 5.

**Figure 3.**
Correct concepts/roles found, by position in the ranked list. Separate graphs are shown for (left) roles and (right) concepts. The total number of concepts included here was 228 and the total number of roles was 54.

**Figure 4.**
Dependency parses for the two example sentences shown in Figure 1. Because the structure of these sentences is so similar, one could conceive of using distributional semantics methods to establish an alignment between them, thus performing a task akin to normalization without the use of an ontology.

Grants and funding

U01 GM061374/GM/NIGMS NIH HHS/United States
