Rule-based natural language processing for automation of stroke data extraction: a validation study

Dane Gunter¹, Paulo Puac-Polanco², Olivier Miguel², Rebecca E Thornhill³, Amy Y X Yu⁴, Zhongyu A Liu⁴, Muhammad Mamdani⁵, Chloe Pou-Prom⁶, Richard I Aviv⁷ ⁸

Affiliations

¹ The Ottawa Hospital Research Institute, Ottawa, ON, Canada.
² Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, The Ottawa Hospital Civic Campus Room C110, 1053 Carling Ave, Ottawa, ON K1Y 4E9, Canada.
³ Division of Medical Physics, Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, Ottawa, ON, Canada.
⁴ Department of Medicine (Neurology), University of Toronto, Sunnybrook Health Sciences Centre, Toronto, ON, Canada.
⁵ Department of Medicine, Unity Health Toronto, University of Toronto, Toronto, ON, Canada.
⁶ Unity Health Toronto, Toronto, ON, Canada.
⁷ The Ottawa Hospital Research Institute, Ottawa, ON, Canada. raviv@toh.ca.
⁸ Department of Radiology, Radiation Oncology and Medical Physics, University of Ottawa, The Ottawa Hospital Civic Campus Room C110, 1053 Carling Ave, Ottawa, ON K1Y 4E9, Canada. raviv@toh.ca.

PMID: 35913525
DOI: 10.1007/s00234-022-03029-1

Dane Gunter et al. Neuroradiology. 2022 Dec;64(12):2357-2362. doi: 10.1007/s00234-022-03029-1. Epub 2022 Aug 1.

Abstract

Purpose: Manual data extraction from free-text radiology reports is time-consuming. More automated extraction methods using natural language processing (NLP) have recently been proposed. A previously developed rule-based NLP algorithm, CHARTextract, showed promise in extracting stroke-related data from radiology reports. We aimed to externally validate the accuracy of CHARTextract in extracting stroke-related data from free-text radiology reports.
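To make the idea of rule-based extraction concrete, the following is a minimal sketch of how hand-written rules might pull two of the study's variables (ASPECTS and presence of hemorrhage) from report text. The patterns, function names, and negation handling are hypothetical and written for illustration only; they are not CHARTextract's actual rules or interface.

```python
import re

# Hypothetical rules for illustration; NOT the rule set used by CHARTextract.
# Matches "ASPECTS" (or "ASPECT") followed within a few characters by a 1-2 digit number.
ASPECTS_RULE = re.compile(r"\bASPECTS?\b\D{0,20}?(\d{1,2})\b", re.IGNORECASE)
# Matches both "hemorrhage" and "haemorrhage".
HEMORRHAGE_RULE = re.compile(r"\bh(?:a)?emorrhage\b", re.IGNORECASE)
# A negation cue earlier in the same sentence, with no period in between.
NEGATION_RULE = re.compile(r"\b(?:no|without|negative for)\b[^.]*$", re.IGNORECASE)


def extract_aspects(report):
    """Return the first ASPECTS value (0-10) mentioned in the report, or None."""
    match = ASPECTS_RULE.search(report)
    if match is None:
        return None
    score = int(match.group(1))
    return score if 0 <= score <= 10 else None


def has_hemorrhage(report):
    """Return True if any sentence mentions hemorrhage without a preceding negation cue."""
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        match = HEMORRHAGE_RULE.search(sentence)
        if match and not NEGATION_RULE.search(sentence[:match.start()]):
            return True
    return False


example = "ASPECTS is 8. No acute intracranial hemorrhage."
aspects_value = extract_aspects(example)
hemorrhage_flag = has_hemorrhage(example)
```

Rules of this kind are transparent and easy to audit, but they depend on the wording radiologists actually use, which is why variation in report style and lexicon can reduce accuracy.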

Methods: Free-text reports of CT angiography (CTA) and perfusion (CTP) studies of consecutive patients with acute ischemic stroke admitted to a regional stroke center for endovascular thrombectomy were analyzed from January 2015 to 2021. Stroke-related variables were manually extracted from clinical reports as the reference standard, including proximal and distal anterior circulation occlusion, posterior circulation occlusion, presence of ischemia or hemorrhage, Alberta Stroke Program Early CT Score (ASPECTS), and collateral status. The same variables were extracted using a rule-based NLP algorithm. The NLP algorithm's accuracy, specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV) were assessed.
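The five performance measures named above follow from the standard 2x2 comparison of algorithm output against the manual reference standard. The sketch below shows one way to compute them for a single binary variable; the function name and the toy labels are assumptions for illustration and are not data from the study.

```python
def diagnostic_metrics(reference, predicted):
    """Compare binary predictions against a binary reference standard.

    Returns accuracy, sensitivity, specificity, PPV, and NPV.
    A metric is None when its denominator is zero.
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # true positives
    tn = sum(1 for r, p in pairs if not r and not p)  # true negatives
    fp = sum(1 for r, p in pairs if not r and p)      # false positives
    fn = sum(1 for r, p in pairs if r and not p)      # false negatives

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Toy labels (hypothetical): 1 = variable present in the report, 0 = absent.
manual_labels = [1, 1, 1, 0, 0, 0, 0, 1]
nlp_labels = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = diagnostic_metrics(manual_labels, nlp_labels)
```

Multi-valued variables such as ASPECTS or collateral status would need a different comparison (for example exact agreement per report), which this binary sketch does not cover.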

Results: The NLP algorithm's accuracy was >90% for identifying distal anterior circulation occlusion, posterior circulation occlusion, hemorrhage, and ASPECTS. Accuracy was 85%, 74%, and 79% for proximal anterior circulation occlusion, presence of ischemia, and collateral status, respectively. The algorithm confirmed the absence of variables from radiology reports with 87-100% accuracy.

Conclusions: Rule-based NLP shows moderate to good performance for stroke-related data extraction from free-text imaging reports. The algorithm's accuracy was affected by inconsistent report styles and lexicon among reporting radiologists.

Keywords: Data extraction; Natural language processing; Rule-based; Stroke; Stroke surveillance.
