An information extraction framework for cohort identification using electronic health records
- PMID: 24303255
- PMCID: PMC3845757
Abstract
Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at the point of care and for enabling secondary use of electronic health records (EHRs) in clinical and translational research. However, a high-performance IE system can be very challenging to construct because of the complexity and dynamic nature of human language. In this paper, we report a knowledge-driven IE framework for cohort identification using EHRs, developed under the Unstructured Information Management Architecture (UIMA). Subject matter experts can build a system that extracts specific information by engineering the externalized knowledge resources used in the framework.
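The idea of driving extraction from externalized knowledge resources can be illustrated with a minimal sketch. It is not the framework reported in the paper and does not use UIMA; the resource contents, the concept names, and the functions `compile_resource`, `extract_concepts`, and `identify_cohort` are all hypothetical, chosen only to show how editing a term list, rather than code, could change what is extracted. The sketch does plain term matching and ignores negation, context, and section structure.

```python
import re

# Hypothetical externalized knowledge resource. In practice this would live in
# an editable file maintained by subject matter experts, not in source code.
KNOWLEDGE = {
    "diabetes": ["diabetes mellitus", "type 2 diabetes", "t2dm"],
    "heart_failure": ["heart failure", "chf"],
}


def compile_resource(resource):
    """Turn each concept's term list into one case-insensitive regex."""
    compiled = {}
    for concept, terms in resource.items():
        # Longest terms first so longer phrases are tried before shorter ones.
        ordered = sorted(terms, key=len, reverse=True)
        pattern = r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b"
        compiled[concept] = re.compile(pattern, re.IGNORECASE)
    return compiled


def extract_concepts(text, compiled):
    """Return the set of concepts with at least one term found in the text."""
    return {concept for concept, rx in compiled.items() if rx.search(text)}


def identify_cohort(notes, compiled, required):
    """Return sorted patient ids whose notes mention every required concept."""
    cohort = []
    for patient_id, texts in notes.items():
        found = set()
        for text in texts:
            found |= extract_concepts(text, compiled)
        if required <= found:
            cohort.append(patient_id)
    return sorted(cohort)
```

Under these assumptions, a subject matter expert would adapt the extraction to a new cohort by adding or revising entries in `KNOWLEDGE`, leaving the matching code unchanged.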