Automated classification of radiology reports to facilitate retrospective study in radiology

Yihua Zhou¹, Per K Amundson, Fang Yu, Marcus M Kessler, Tammie L S Benzinger, Franz J Wippold

PMID: 24874407
PMCID: PMC4391070
DOI: 10.1007/s10278-014-9708-x

Yihua Zhou et al. J Digit Imaging. 2014 Dec;27(6):730-6.

Affiliation

¹ The Department of Radiology, Saint Louis University School of Medicine, 3635 Vista Blvd at Grand Blvd, Saint Louis, MO, 63110, USA, yzhou31@slu.edu.

Abstract

Retrospective research is an important tool in radiology. Identifying imaging examinations appropriate for a given research question from unstructured radiology reports is extremely useful but labor-intensive. Using the machine learning text-mining methods implemented in LingPipe [1], we evaluated the performance of the dynamic language model (DLM) and Naïve Bayesian (NB) classifiers in classifying radiology reports to facilitate the identification of radiological examinations for research projects. The training dataset consisted of 14,325 sentences from 11,432 radiology reports randomly selected from a database of 5,104,594 reports in all disciplines of radiology. The training sentences were manually assigned to six categories (Positive, Differential, Post Treatment, Negative, Normal, and History). Ten-fold cross-validation [2] was used to evaluate the performance of the models, which were then tested on the classification of radiology reports for cases of sellar or suprasellar masses and colloid cysts. The average accuracies for the DLM and NB classifiers were 88.5% with 95% confidence interval (CI) of 1.9% and 85.9% with 95% CI of 2.0%, respectively. The DLM performed slightly better and was used to classify 1,397 radiology reports containing the keywords "sellar or suprasellar mass" or "colloid cyst". The DLM produced an accuracy of 88.2% with 95% CI of 2.1% for 959 reports that contain "sellar or suprasellar mass" and an accuracy of 86.3% with 95% CI of 2.5% for 437 reports of "colloid cyst". We conclude that automated classification of radiology reports using machine learning techniques can effectively facilitate the identification of cases suitable for retrospective research.
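The evaluation procedure described in the abstract can be illustrated with a minimal sketch. This is not the LingPipe implementation used in the study: it substitutes a small multinomial Naive Bayes classifier over character 4-grams, written with the Python standard library only. The toy sentences, the function names, and the way the 95% CI is computed (a normal-approximation half-width over the per-fold accuracies) are all assumptions made for illustration; the abstract does not state how the interval was derived.

```python
import math
import random
from collections import Counter, defaultdict


def char_ngrams(text, n=4):
    """Return the overlapping character n-grams of a lower-cased sentence."""
    padded = " " + text.lower() + " "
    return [padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))]


def train_nb(samples, n=4):
    """Count classes and n-grams for a multinomial Naive Bayes model."""
    class_counts = Counter()
    gram_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        grams = char_ngrams(text, n)
        class_counts[label] += 1
        gram_counts[label].update(grams)
        vocab.update(grams)
    return class_counts, gram_counts, vocab


def predict_nb(model, text, n=4):
    """Return the label with the highest add-one-smoothed log posterior."""
    class_counts, gram_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        # The extra +1 reserves probability mass for unseen n-grams.
        denom = sum(gram_counts[label].values()) + len(vocab) + 1
        for gram in char_ngrams(text, n):
            score += math.log((gram_counts[label][gram] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(samples, k=10, n=4, seed=0):
    """Return (mean accuracy, assumed 95% CI half-width) over k folds."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # needs len(samples) >= k
    accuracies = []
    for i, held_out in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train_nb(train, n)
        correct = sum(predict_nb(model, t, n) == y for t, y in held_out)
        accuracies.append(correct / len(held_out))
    mean = sum(accuracies) / k
    variance = sum((a - mean) ** 2 for a in accuracies) / (k - 1)
    return mean, 1.96 * math.sqrt(variance / k)


# Hypothetical toy sentences; the study used 14,325 manually labeled ones.
toy = []
for i in range(10):
    toy.append((f"There is a suprasellar mass measuring {i + 5} mm.", "Positive"))
    toy.append((f"No evidence of a colloid cyst on series {i + 1}.", "Negative"))

mean_acc, ci = cross_validate(toy, k=10)
print(f"accuracy {mean_acc:.3f} +/- {ci:.3f}")
```

Each sentence is held out exactly once, so the mean accuracy reflects performance on sentences the model did not see during training. A real replication would use the six categories named above rather than the two used here.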

Figures

**Fig. 1**
A screenshot of the case finder program

**Fig. 2**
Ten-fold cross-validation. The precision represents the overall precision for all classes. The performance for individual classes is not shown

**Fig. 3**
Performance of different classes using a 4-gram dynamic language model in the 10-fold cross-validation analysis. The performance is measured by accuracy, recall rate, and precision. *All* results of all classes combined, *Hx* history class, *PostTx* post treatment class, *Pos* positive class, *Neg* negative class, *Normal* normal class, *DDx* differential diagnosis class
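The three per-class measures named in the Fig. 3 caption can be computed from paired gold and predicted labels, as in the sketch below. The caption does not define per-class accuracy, so the one-vs-rest definition used here, (TP + TN) / total, is an assumption, as are the function name and the example labels.

```python
from collections import Counter


def per_class_metrics(gold, predicted):
    """Return one-vs-rest precision, recall, and accuracy for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[g] += 1  # gold label gains a false negative
    total = len(gold)
    metrics = {}
    for label in sorted(set(gold) | set(predicted)):
        predicted_n = tp[label] + fp[label]
        actual_n = tp[label] + fn[label]
        tn = total - tp[label] - fp[label] - fn[label]
        metrics[label] = {
            # Undefined ratios (no predictions or no gold items) are reported as 0.0.
            "precision": tp[label] / predicted_n if predicted_n else 0.0,
            "recall": tp[label] / actual_n if actual_n else 0.0,
            "accuracy": (tp[label] + tn) / total,
        }
    return metrics


# Hypothetical labels, using the class abbreviations from the caption.
gold = ["Pos", "Pos", "Neg", "Neg", "Hx", "Normal"]
pred = ["Pos", "Neg", "Neg", "Neg", "Hx", "Pos"]
metrics = per_class_metrics(gold, pred)
for label, values in metrics.items():
    print(label, values)
```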

References

1. Alias-i. 2008. LingPipe 4.0.1. http://ir.exp.sis.pitt.edu/ne/lingpipe-2.4.0/index.html Accessed September 25, 2010.
2. Cavnar WB, Trenkle JM. N-gram-based text categorization. Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, 1994.
3. Rahmoun A, Elberrichi Z. Experimenting N-Grams in text categorization. Int Arab J Inf Technol. 2007;4(4):377–385.
4. Lang K. Newsweeder: learning to filter news. Proceedings of the 12th International Conference on Machine Learning 331–339, 1995.
5. Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence 2:1137–1143, 1995.
