Inferring the semantic relationships of words within an ontology using random indexing: applications to pharmacogenomics
- PMID: 24551397
- PMCID: PMC3900134
Abstract
The biomedical literature presents a uniquely challenging text mining problem. Sentences are long and complex, the subject matter is highly specialized with a distinct vocabulary, and producing annotated training data for this domain is time-consuming and expensive. In this environment, unsupervised text mining methods that do not rely on annotated training data are valuable. Here we investigate the use of random indexing, an automated method for producing vector-space semantic representations of words from large, unlabeled corpora, to address the problem of term normalization in sentences describing drugs and genes. We show that random indexing produces similarity scores that capture some of the structure of PHARE, a manually curated ontology of pharmacogenomics concepts. We further show that random indexing can be used to identify likely word candidates for inclusion in the ontology and can help localize these new labels among classes and roles within the ontology.
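As a rough illustration of the general random indexing technique named in the abstract, the sketch below assigns each word a sparse random index vector, sums the index vectors of neighbouring words into a context vector, and compares words by cosine similarity. It is not the authors' implementation: the function names, the dimensionality, the number of nonzero entries, the window size, and the toy sentences are all assumptions chosen for illustration.

```python
import numpy as np


def index_vector(rng, dim, nonzero):
    """Sparse ternary vector: `nonzero` entries are +1 or -1, the rest are 0."""
    vec = np.zeros(dim)
    positions = rng.choice(dim, size=nonzero, replace=False)
    vec[positions] = rng.choice([-1.0, 1.0], size=nonzero)
    return vec


def random_indexing(sentences, dim=512, nonzero=8, window=2, seed=0):
    """Build one context vector per word from tokenized sentences.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    rng = np.random.default_rng(seed)
    vocab = sorted({w for s in sentences for w in s})
    index = {w: index_vector(rng, dim, nonzero) for w in vocab}
    context = {w: np.zeros(dim) for w in vocab}
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    # Accumulate the index vectors of words in the window.
                    context[word] += index[sent[j]]
    return context


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


# Hypothetical toy corpus, invented for this example.
toy = [
    "warfarin is metabolized by cyp2c9".split(),
    "clopidogrel is metabolized by cyp2c19".split(),
    "cyp2c9 variants alter warfarin dose".split(),
]
vectors = random_indexing(toy)
sim = cosine(vectors["warfarin"], vectors["clopidogrel"])
```

Words that occur in similar contexts accumulate overlapping index vectors, so their context vectors tend to have higher cosine similarity. In an ontology setting, such scores could be compared against existing labels to rank candidate terms, though how the paper does this is not specified in the abstract.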
Similar articles
- Semantic role labeling for protein transport predicates. BMC Bioinformatics. 2008 Jun 11;9:277. doi: 10.1186/1471-2105-9-277. PMID: 18547432.
- SIFR annotator: ontology-based semantic annotation of French biomedical text and clinical notes. BMC Bioinformatics. 2018 Nov 6;19(1):405. doi: 10.1186/s12859-018-2429-2. PMID: 30400805.
- Identification of key concepts in biomedical literature using a modified Markov heuristic. Bioinformatics. 2003 Feb 12;19(3):402-7. doi: 10.1093/bioinformatics/btg010. PMID: 12584127.
- Natural Language Processing methods and systems for biomedical ontology learning. J Biomed Inform. 2011 Feb;44(1):163-79. doi: 10.1016/j.jbi.2010.07.006. Epub 2010 Jul 18. PMID: 20647054. Review.
- Text mining for drug-drug interaction. Methods Mol Biol. 2014;1159:47-75. doi: 10.1007/978-1-4939-0709-0_4. PMID: 24788261. Review.
Cited by
- Learning the Structure of Biomedical Relationships from Unstructured Text. PLoS Comput Biol. 2015 Jul 28;11(7):e1004216. doi: 10.1371/journal.pcbi.1004216. eCollection 2015 Jul. PMID: 26219079.
- Pharmacogenomics in the clinic. Nature. 2015 Oct 15;526(7573):343-50. doi: 10.1038/nature15817. PMID: 26469045. Review.
- An ontology for Autism Spectrum Disorder (ASD) to infer ASD phenotypes from Autism Diagnostic Interview-Revised data. J Biomed Inform. 2015 Aug;56:333-47. doi: 10.1016/j.jbi.2015.06.026. Epub 2015 Jul 4. PMID: 26151311.