PLoS Comput Biol. 2013 Apr;9(4):e1003044.
doi: 10.1371/journal.pcbi.1003044. Epub 2013 Apr 25.

Chapter 16: text mining for translational bioinformatics

K Bretonnel Cohen et al. PLoS Comput Biol. 2013 Apr.

Abstract

Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall into both T1 translational research (translating basic science results into new interventions) and T2 translational research (translational research for public health). Potential use cases include better phenotyping of research subjects and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.
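
As a rough illustration of two ideas in the abstract, the rule-based approach and evaluation against a corpus, the following Python sketch scores a toy rule-based classifier against a small hand-labelled set of sentences. Everything in it is an assumption made for illustration: the fever-detection task, the keyword and negation patterns, the names `rule_based_fever` and `precision_recall_f1`, and the five-sentence `TOY_CORPUS`. It is not the authors' method and not a usable clinical tool.

```python
import re

# Hypothetical hand-written rules (assumptions for illustration only).
FEVER_WORDS = re.compile(r"\b(fever|febrile)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(no|denies|without)\s+(fever|febrile)\b", re.IGNORECASE)


def rule_based_fever(text):
    """Return True if the text asserts fever according to the toy rules."""
    if NEGATION.search(text):
        return False
    return bool(FEVER_WORDS.search(text))


def precision_recall_f1(predictions, gold):
    """Compare boolean predictions with gold labels from an annotated corpus."""
    pairs = list(zip(predictions, gold))
    tp = sum(1 for p, g in pairs if p and g)
    fp = sum(1 for p, g in pairs if p and not g)
    fn = sum(1 for p, g in pairs if not p and g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented toy corpus: (sentence, gold label). Not real clinical data.
TOY_CORPUS = [
    ("Patient presents with fever and cough.", True),
    ("Denies fever or chills.", False),
    ("Febrile on arrival.", True),
    ("Temp 39.2 C, shaking.", True),  # no keyword, so the rules miss it
    ("No complaints.", False),
]

if __name__ == "__main__":
    predictions = [rule_based_fever(text) for text, _ in TOY_CORPUS]
    gold = [label for _, label in TOY_CORPUS]
    precision, recall, f1 = precision_recall_f1(predictions, gold)
    print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The missed temperature-only sentence shows the kind of gap that a machine-learning component, or a further rule, might cover in a hybrid system.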

Conflict of interest statement

The authors have declared that no competing interests exist.
