Two biomedical sublanguages: a description based on the theories of Zellig Harris
- PMID: 12755517
- DOI: 10.1016/s1532-0464(03)00012-1
Abstract
Natural language processing (NLP) systems have been developed to provide access to the tremendous body of data and knowledge that is available in the biomedical domain in the form of natural language text. These NLP systems are valuable because they can encode and amass the information in the text so that it can be used by other automated processes to improve patient care and our understanding of disease processes and treatments. Zellig Harris proposed a theory of sublanguage that laid the foundation for natural language processing in specialized domains. He hypothesized that the informational content and structure form a specialized language that can be delineated in the form of a sublanguage grammar. The grammar can then be used by a language processor to capture and encode the salient information and relations in text. In this paper, we briefly summarize his language and sublanguage theories. In addition, we summarize our prior research, which is associated with the sublanguage grammars we developed for two different biomedical domains. These grammars illustrate how Harris' theories provide a basis for the development of language processing systems in the biomedical domain. The two domains and their associated sublanguages discussed are: the clinical domain, where the text consists of patient reports, and the biomolecular domain, where the text consists of complete journal articles.
Similar articles
- Paraphrasing for condensation in journal abstracting. J Biomed Inform. 2002 Aug;35(4):265-77. doi: 10.1016/s1532-0464(03)00016-9. PMID: 12755521
- The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text. J Biomed Inform. 2003 Dec;36(6):462-77. doi: 10.1016/j.jbi.2003.11.003. PMID: 14759819
- Information extraction from biomedical text. J Biomed Inform. 2002 Aug;35(4):260-4. doi: 10.1016/s1532-0464(03)00015-7. PMID: 12755520
- Status of text-mining techniques applied to biomedical text. Drug Discov Today. 2006 Apr;11(7-8):315-25. doi: 10.1016/j.drudis.2006.02.011. PMID: 16580973 Review.
- A survey of current work in biomedical text mining. Brief Bioinform. 2005 Mar;6(1):57-71. doi: 10.1093/bib/6.1.57. PMID: 15826357 Review.
Cited by
- Care episode retrieval: distributional semantic models for information retrieval in the clinical domain. BMC Med Inform Decis Mak. 2015;15 Suppl 2(Suppl 2):S2. doi: 10.1186/1472-6947-15-S2-S2. Epub 2015 Jun 15. PMID: 26099735 Free PMC article.
- A survey of automated methods for biomedical text simplification. J Am Med Inform Assoc. 2022 Oct 7;29(11):1976-1988. doi: 10.1093/jamia/ocac149. PMID: 36083212 Free PMC article.
- A cross-institutional evaluation on breast cancer phenotyping NLP algorithms on electronic health records. Comput Struct Biotechnol J. 2023 Aug 22;22:32-40. doi: 10.1016/j.csbj.2023.08.018. eCollection 2023. PMID: 37680211 Free PMC article.
- The biomedical discourse relation bank. BMC Bioinformatics. 2011 May 23;12:188. doi: 10.1186/1471-2105-12-188. PMID: 21605399 Free PMC article.
- The UAB Informatics Institute and 2016 CEGS N-GRID de-identification shared task challenge. J Biomed Inform. 2017 Nov;75S:S54-S61. doi: 10.1016/j.jbi.2017.05.001. Epub 2017 May 3. PMID: 28478268 Free PMC article.