Cancer Res. 2017 Nov 1;77(21):e115-e118.
doi: 10.1158/0008-5472.CAN-17-0615.

DeepPhe: A Natural Language Processing System for Extracting Cancer Phenotypes from Clinical Records

Guergana K Savova et al. Cancer Res. 2017.

Abstract

Precise phenotype information is needed to understand the effects of genetic and epigenetic changes on tumor behavior and responsiveness. Extraction and representation of cancer phenotypes are currently performed mostly by hand, making it difficult to correlate phenotypic data with genomic data. In addition, genomic data are being produced at an ever-faster pace, exacerbating the problem. The DeepPhe software enables automated extraction of detailed phenotype information from the electronic medical records of cancer patients. The system implements advanced natural language processing and knowledge engineering methods within a flexible modular architecture, and was evaluated using a manually annotated dataset of University of Pittsburgh Medical Center breast cancer patients. The resulting platform provides critical, previously missing methods for computational phenotyping. Working in tandem with advanced analysis of high-throughput sequencing, these approaches will further accelerate the transition to precision cancer treatment. Cancer Res; 77(21); e115-8. ©2017 AACR.

Figures

Figure 1. (A) Domain model and (B) example of phenotypes produced by the DeepPhe system. Bold and underlined items in (B) show the mentions extracted by cTAKES pipeline 1 (mention detection). Abbreviations: cTAKES, Apache Clinical Text Analysis and Knowledge Extraction System; FHIR, Fast Healthcare Interoperability Resources; RAD, radiology note; SP, surgical pathology note.
