Natural Language-based Machine Learning Models for the Annotation of Clinical Radiology Reports
- PMID: 29381109
- DOI: 10.1148/radiol.2018171093
Abstract
Purpose: To compare different methods for generating features from radiology reports and to develop a method to automatically identify findings in these reports.

Materials and Methods: In this study, 96 303 head computed tomography (CT) reports were obtained. The linguistic complexity of these reports was compared with that of alternative corpora. Head CT reports were preprocessed, and machine-analyzable features were constructed by using bag-of-words (BOW), word embedding, and latent Dirichlet allocation-based approaches. Ultimately, 1004 head CT reports were manually labeled for findings of interest by physicians, and a subset of these findings were deemed critical. Lasso logistic regression was used to train models for the physician-assigned labels on 602 of the 1004 head CT reports (60%) using the constructed features, and the performance of these models was validated on the held-out 402 reports (40%). Models were scored by area under the receiver operating characteristic curve (AUC), and aggregate AUC statistics were reported for (a) all labels, (b) critical labels, and (c) the presence of any critical finding in a report. Sensitivity, specificity, accuracy, and F1 score were reported for the best-performing model's (a) predictions of all labels and (b) identification of reports containing critical findings.

Results: The best-performing model (BOW with unigrams, bigrams, and trigrams plus an averaged word-embedding vector) had a held-out AUC of 0.966 for identifying the presence of any critical head CT finding and an average AUC of 0.957 across all head CT findings. Sensitivity and specificity for identifying the presence of any critical finding were 92.59% (175 of 189) and 89.67% (191 of 213), respectively. Average sensitivity and specificity across all findings were 90.25% (1898 of 2103) and 91.72% (18 351 of 20 007), respectively. Simpler BOW methods achieved results competitive with those of more sophisticated approaches, with an AUC for presence of any critical finding of 0.951 for unigram BOW versus 0.966 for the best-performing model. The Yule's I of the head CT corpus was 34, markedly lower than that of the Reuters corpus (103) or the i2b2 discharge summaries (271), indicating lower linguistic complexity.

Conclusion: Automated methods can be used to identify findings in radiology reports. The success of this approach benefits from the standardized language of these reports. With this method, a large labeled corpus can be generated for applications such as deep learning.

© RSNA, 2018. Online supplemental material is available for this article.
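The winning pipeline in the abstract is simple enough to sketch end to end: BOW counts over unigrams, bigrams, and trigrams feed an L1-penalized ("lasso") logistic regression, evaluated by held-out AUC. The sketch below is a minimal illustration only: scikit-learn and the toy reports and labels are assumptions, since the abstract does not name the software stack, the preprocessing, or the embedding details.

```python
# Minimal sketch of the abstract's pipeline: n-gram BOW features plus
# lasso logistic regression, scored by AUC on a held-out split.
# scikit-learn and the toy data are assumptions for illustration; the
# study itself used 96 303 head CT reports and physician labels.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Toy stand-ins for report text and labels (1 = critical finding present).
reports = [
    "no acute intracranial hemorrhage or mass effect",
    "acute subdural hematoma with midline shift",
    "no evidence of acute infarct",
    "large acute intraparenchymal hemorrhage",
    "unremarkable noncontrast head ct",
    "acute epidural hematoma with mass effect",
    "no acute intracranial abnormality",
    "acute subarachnoid hemorrhage present",
    "stable postsurgical changes no hemorrhage",
    "new acute infarct with hemorrhagic conversion",
]
labels = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# Unigrams, bigrams, and trigrams, as in the best-performing BOW model.
vectorizer = CountVectorizer(ngram_range=(1, 3))
X = vectorizer.fit_transform(reports)

# 60/40 train/held-out split, mirroring the paper's 602/402 split.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.4, stratify=labels, random_state=0)

# L1 penalty = lasso: drives most n-gram coefficients to exactly zero.
clf = LogisticRegression(penalty="l1", solver="liblinear")
clf.fit(X_tr, y_tr)

auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(f"held-out AUC: {auc:.3f}")
```

On toy data the AUC value is not meaningful; the point is the shape of the pipeline. A sparse linear model over n-gram counts also stays interpretable: each surviving nonzero coefficient maps to a specific phrase such as "subdural hematoma".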
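The corpus-complexity comparison rests on Yule's I, a vocabulary-richness statistic (head CT reports: 34; Reuters: 103; i2b2 discharge summaries: 271; lower means more repetitive language). A minimal sketch of the statistic, assuming naive whitespace tokenization, since the abstract does not describe the exact preprocessing:

```python
# Yule's I = M1^2 / (M2 - M1), where M1 is the total token count and
# M2 = sum over frequency classes i of i^2 * V(i), with V(i) the number
# of distinct words occurring exactly i times. Lower I means a more
# repetitive, less lexically diverse corpus.
from collections import Counter

def yules_i(tokens):
    freqs = Counter(tokens)                  # word -> occurrence count
    m1 = sum(freqs.values())                 # total tokens
    m2 = sum(c * c for c in freqs.values())  # equals sum_i i^2 * V(i)
    return (m1 * m1) / (m2 - m1)             # undefined if every word is unique

text = "no acute hemorrhage no acute infarct no midline shift"
print(yules_i(text.split()))  # -> 10.125
```

Even this tiny radiology-style snippet shows the effect the paper measures: repeated boilerplate phrases ("no acute ...") pull I down, which is exactly why templated head CT reports score far below general-news text.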
Similar articles
- Automated Classification of Free-Text Radiology Reports: Using Different Feature Extraction Methods to Identify Fractures of the Distal Fibula. Rofo. 2023 Aug;195(8):713-719. doi: 10.1055/a-2061-6562. Epub 2023 May 9. PMID: 37160146. Free PMC article.
- Comparison of deep learning models for natural language processing-based classification of non-English head CT reports. Neuroradiology. 2020 Oct;62(10):1247-1256. doi: 10.1007/s00234-020-02420-0. Epub 2020 Apr 25. PMID: 32335686.
- Natural Language Processing of Radiology Reports in Patients With Hepatocellular Carcinoma to Predict Radiology Resource Utilization. J Am Coll Radiol. 2019 Jun;16(6):840-844. doi: 10.1016/j.jacr.2018.12.004. Epub 2019 Mar 2. PMID: 30833164.
- Basic Artificial Intelligence Techniques: Natural Language Processing of Radiology Reports. Radiol Clin North Am. 2021 Nov;59(6):919-931. doi: 10.1016/j.rcl.2021.06.003. PMID: 34689877. Review.
- Automated image label extraction from radiology reports - A review. Artif Intell Med. 2024 Mar;149:102814. doi: 10.1016/j.artmed.2024.102814. Epub 2024 Feb 14. PMID: 38462277. Review.
Cited by
- CAD and AI for breast cancer-recent development and challenges. Br J Radiol. 2020 Apr;93(1108):20190580. doi: 10.1259/bjr.20190580. Epub 2019 Dec 16. PMID: 31742424. Free PMC article. Review.
- Rule-based natural language processing for automation of stroke data extraction: a validation study. Neuroradiology. 2022 Dec;64(12):2357-2362. doi: 10.1007/s00234-022-03029-1. Epub 2022 Aug 1. PMID: 35913525.
- Computer-aided diagnosis in the era of deep learning. Med Phys. 2020 Jun;47(5):e218-e227. doi: 10.1002/mp.13764. PMID: 32418340. Free PMC article. Review.
- AI musculoskeletal clinical applications: how can AI increase my day-to-day efficiency? Skeletal Radiol. 2022 Feb;51(2):293-304. doi: 10.1007/s00256-021-03876-8. Epub 2021 Aug 3. PMID: 34341865. Review.
- Risk Factors for Gastrointestinal Bleeding in Patients With Acute Myocardial Infarction: Multicenter Retrospective Cohort Study. J Med Internet Res. 2025 Jan 30;27:e67346. doi: 10.2196/67346. PMID: 39883922. Free PMC article.