Comparing ICD9-encoded diagnoses and NLP-processed discharge summaries for clinical trials pre-screening: a case study
- PMID: 18999285
- PMCID: PMC2656007
Abstract
The prevalence of electronic medical record (EMR) systems has made mass screening for clinical trials viable through secondary use of clinical data, which often exist in both structured and free-text formats. The tradeoffs of using information in either format for clinical trial screening are understudied. This paper compares the results of clinical trial eligibility queries over ICD9-encoded diagnoses and NLP-processed textual discharge summaries. The strengths and weaknesses of both data sources are summarized along the following dimensions: information completeness, expressiveness, code granularity, and accuracy of temporal information. We conclude that NLP-processed patient reports supply important supplementary information for eligibility screening and should be used in combination with structured data.