Generating Domain Terminologies using Root- and Rule-Based Terms
- PMID: 34194119
- PMCID: PMC8240749
Abstract
Motivated by the need for flexible, intuitive, reusable, and normalized terminology for guiding search and building ontologies, we present a general approach for generating such terminologies from natural language documents. The approach produces root- and rule-based terms through a series of rules designed to be flexible, to evolve, and, perhaps most importantly, to protect against ambiguity and to standardize semantically similar but syntactically distinct phrases to a normal form. It combines several linguistic and computational methods that can be automated with the help of training sets to extract normalized terms quickly and consistently. We discuss how the approach can be extended as natural language technologies improve and how it applies to common use cases such as search, document entry and archiving, and identifying, tracking, and predicting scientific and technological trends.
Keywords: dependency parsing; natural language processing; ontology generation; search; terminology generation; unsupervised learning.
Cited by
- A Web Resource for Exploring the CORD-19 Dataset Using Root- and Rule-Based Phrases. J Indian Inst Sci. 2020;100(4):725-731. doi: 10.1007/s41745-020-00193-2. Epub 2020 Sep 29. PMID: 33013023. Free PMC article. Review.