Automated Classification of Radiology Reports for Acute Lung Injury: Comparison of Keyword and Machine Learning Based Natural Language Processing Approaches

Imre Solti¹, Colin R Cooke, Fei Xia, Mark M Wurfel

PMID: 21152268
PMCID: PMC2998031
DOI: 10.1109/BIBMW.2009.5332081

Imre Solti et al. Proceedings (IEEE Int Conf Bioinformatics Biomed). 2009 Nov;2009:314-319.

Affiliation

¹ Department of Medical Education and Biomedical Informatics, University of Washington, Seattle WA.

Abstract

This paper compares the performance of keyword-based and machine learning-based classification of chest x-ray reports for Acute Lung Injury (ALI). ALI mortality is approximately 30 percent. This high mortality is, in part, a consequence of delayed manual chest x-ray classification. An automated system could reduce the time to recognize ALI and lead to reductions in mortality. For our study, two corpora of 96 and 857 chest x-ray reports were labeled for ALI by domain experts. We developed a keyword-based and a Maximum Entropy-based classification system. Word unigrams and character n-grams provided the features for the machine learning system. The Maximum Entropy algorithm with character 6-grams achieved the highest performance (Recall=0.91, Precision=0.90 and F-measure=0.91) on the 857-report corpus. This study has shown that, for the classification of ALI chest x-ray reports, the machine learning approach is superior to the keyword-based system and achieves results comparable to those of the highest-performing physician annotators.
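As a rough illustration of the feature type and metric named in the abstract, the sketch below extracts overlapping character 6-grams from a report string and computes a balanced F-measure from precision and recall. It is not the authors' implementation: the normalization (lowercasing, whitespace collapsing), the function names `char_ngrams` and `f_measure`, and the sample report fragment are all assumptions made for this example, and the abstract does not say which F-measure weighting was used.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 6) -> Counter:
    """Count overlapping character n-grams in a string.

    Assumption: text is lowercased and runs of whitespace are collapsed
    to one space; the paper's actual preprocessing is not given here.
    """
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical report fragment, not taken from the study corpora.
features = char_ngrams("Bilateral infiltrates", n=6)
print(len(features))                     # number of distinct 6-grams
print(round(f_measure(0.90, 0.91), 3))   # F-measure from the rounded abstract values
```

With the rounded values from the abstract (Precision=0.90, Recall=0.91), the balanced F-measure comes out at about 0.905. That is compatible with the reported 0.91 if the authors computed it from unrounded precision and recall, though the abstract does not confirm this.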

Figures

**Figure 1**
Accuracy peaks at character 6-grams.
