Classification of neurologic outcomes from medical notes using natural language processing

Marta B Fernandes^{1,2,3}, Navid Valizadeh^{1,2}, Haitham S Alabsi^{1,2}, Syed A Quadri^{1,2,3}, Ryan A Tesh^{1,2,3}, Abigail A Bucklin^{1,2,3}, Haoqi Sun^{1,2,3}, Aayushee Jain^{1,3}, Laura N Brenner^{2,4,5}, Elissa Ye^{1,3}, Wendong Ge^{1,2,3}, Sarah I Collens¹, Stacie Lin², Sudeshna Das^{1,2}, Gregory K Robbins^{2,6}, Sahar F Zafar^{1,2}, Shibani S Mukerji^{1,2,7}, M Brandon Westover^{1,2,3,8}

Affiliations

¹ Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States.
² Harvard Medical School, Boston, MA, United States.
³ Clinical Data Animation Center (CDAC), MGH, Boston, MA, United States.
⁴ Division of Pulmonary and Critical Care Medicine, MGH, Boston, MA, United States.
⁵ Division of General Internal Medicine, MGH, Boston, MA, United States.
⁶ Division of Infectious Diseases, MGH, Boston, MA, United States.
⁷ Vaccine and Immunotherapy Center, Division of Infectious Diseases, MGH, Boston, MA, United States.
⁸ McCance Center for Brain Health, MGH, Boston, MA, United States.

PMID: 36865787
PMCID: PMC9974159
DOI: 10.1016/j.eswa.2022.119171

Marta B Fernandes et al. Expert Syst Appl. 2023 Mar 15:214:119171. doi: 10.1016/j.eswa.2022.119171. Epub 2022 Nov 6.

Abstract

Neurologic disability level at hospital discharge is an important outcome in many clinical research studies. Outside of clinical trials, neurologic outcomes must typically be extracted by labor-intensive manual review of clinical notes in the electronic health record (EHR). To overcome this challenge, we set out to develop a natural language processing (NLP) approach that automatically reads clinical notes to determine neurologic outcomes, making larger-scale neurologic outcomes studies possible. We obtained 7314 notes from 3632 patients hospitalized at two large Boston hospitals between January 2012 and June 2020, including discharge summaries (3485), occupational therapy notes (1472), and physical therapy notes (2357). Fourteen clinical experts reviewed the notes to assign scores on the Glasgow Outcome Scale (GOS), with 4 classes ('good recovery', 'moderate disability', 'severe disability', and 'death'), and on the Modified Rankin Scale (mRS), with 7 classes ('no symptoms', 'no significant disability', 'slight disability', 'moderate disability', 'moderately severe disability', 'severe disability', and 'death'). For the notes of 428 patients, two experts scored each case, providing interrater reliability estimates for GOS and mRS. After preprocessing the notes and extracting features from them, we trained a multiclass logistic regression model using LASSO regularization, with 5-fold cross-validation for hyperparameter tuning. On the test set, the model achieved a micro-average area under the receiver operating characteristic curve (AUROC) of 0.94 (95% CI 0.93-0.95) and an F-score of 0.77 (0.75-0.80) for GOS, and an AUROC of 0.90 (0.89-0.91) and an F-score of 0.59 (0.57-0.62) for mRS. Our work demonstrates that an NLP algorithm can accurately assign neurologic outcomes based on free-text clinical notes. Such an algorithm increases the scale at which research on neurologic outcomes can be conducted with EHR data.
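As an illustration of the two reported metrics, the following is a minimal sketch of how micro-averaged F-score and micro-averaged AUROC can be computed for a multiclass problem. It is not the authors' code: the function names, the toy labels, and the pairwise-comparison AUROC are assumptions chosen to keep the example dependency-free. Micro-averaging pools the per-class true positives, false positives, and false negatives before computing the score; for AUROC, each (note, class) pair is treated as one binary decision.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F-score: pool TP, FP, FN over all classes first."""
    tp = fp = fn = 0
    for c in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == c and t == c:
                tp += 1
            elif p == c:
                fp += 1
            elif t == c:
                fn += 1
    return 2 * tp / (2 * tp + fp + fn)


def binary_auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_auroc(y_true, proba, classes):
    """Micro-averaged AUROC: flatten the one-hot labels and class scores.

    proba[i][j] is the predicted score of sample i for classes[j].
    """
    flat_labels = [int(t == c) for t in y_true for c in classes]
    flat_scores = [s for row in proba for s in row]
    return binary_auroc(flat_labels, flat_scores)


# Toy example using GOS-style labels (1, 3, 4, 5); the values are invented.
GOS_CLASSES = [1, 3, 4, 5]
toy_true = [1, 3, 4, 5, 5, 3]
toy_pred = [1, 3, 5, 5, 5, 4]
toy_score = micro_f1(toy_true, toy_pred)
```

When every note has exactly one true label and one predicted label, as here, the micro-averaged F-score equals plain accuracy, which is worth keeping in mind when comparing it with macro-averaged figures.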

Keywords: Coronavirus; Glasgow outcome scale; Intensive care unit; Machine learning; Modified Rankin Scale; Natural language processing.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

**Fig. 1.**
Methodology for notes preprocessing and modeling. The list of extraction fields for discharge summary processing using regular expressions (regex) is shown in Supplementary Table A.2.
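The regex-based field extraction mentioned in this caption might look roughly like the sketch below. The actual extraction fields are listed in Supplementary Table A.2, which is not reproduced here, so the header names, the `extract_fields` function, and the sample note are all hypothetical and serve only to show the general technique of splitting a discharge summary on section headers.

```python
import re

# Hypothetical section headers; the real list is in Supplementary Table A.2.
FIELD_HEADERS = ["Hospital Course", "Discharge Condition", "Discharge Disposition"]


def extract_fields(note, headers=FIELD_HEADERS):
    """Return {header: text} for each known header found at the start of a line."""
    alternation = "|".join(re.escape(h) for h in headers)
    # Capture lazily until the next known header at a line start, or end of note.
    pattern = re.compile(
        rf"^({alternation}):[ \t]*(.*?)(?=^(?:{alternation}):|\Z)",
        re.MULTILINE | re.DOTALL,
    )
    # Collapse internal whitespace so multi-line sections become one string.
    return {m.group(1): " ".join(m.group(2).split()) for m in pattern.finditer(note)}


# Invented sample text, for illustration only.
sample_note = (
    "Hospital Course: Patient admitted with stroke.\n"
    "Improved over 5 days.\n"
    "Discharge Condition: Alert, needs assistance walking.\n"
    "Discharge Disposition: Rehab facility\n"
)
fields = extract_fields(sample_note)
```

Real notes vary in header spelling, casing, and punctuation, so a production version would likely need a broader pattern and a fallback for notes without recognizable headers.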

**Fig. 2.**
Study cohorts for classification of Glasgow Outcome Scale (GOS) and modified Rankin Scale (mRS) outcomes, with inclusion and exclusion criteria. ICU – Intensive Care Unit. Persistent vegetative state (GOS 2) was omitted from the analysis because this outcome was rare in our cohort.

**Fig. 3.**
Models’ performance on the hold-out test set by class label, for (a) Glasgow Outcome Scale and (b) modified Rankin Scale. Labels: GOS 1, mRS 6 – death; GOS 3, mRS 5 – severe disability; mRS 4 – moderately severe disability; GOS 4, mRS 3 – moderate disability; mRS 2 – slight disability; mRS 1 – no significant disability; mRS 0 – no symptoms; GOS 5 – good recovery.

**Fig. 4.**
Confusion matrices normalized by (a) recall and (b) precision, for the GOS model, and normalized by (c) recall and (d) precision, for the mRS model, evaluated in the hold-out test sets. Labels: GOS 1, mRS 6 – death; GOS 3, mRS 5 – severe disability; mRS 4 – moderately severe disability; GOS 4, mRS 3 – moderate disability; mRS 2 – slight disability; mRS 1 – no significant disability; mRS 0 – no symptoms; GOS 5 – good recovery.
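The two normalizations in Fig. 4 can be illustrated with a short sketch. It assumes the common convention that rows are true classes and columns are predicted classes (the caption does not state the orientation), and the function names and toy labels are invented for the example. Under that convention, dividing each row by its total puts per-class recall on the diagonal, and dividing each column by its total puts per-class precision on the diagonal.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Count matrix with rows = true class, columns = predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def normalize_by_recall(matrix):
    """Divide each row by its sum, so the diagonal holds per-class recall."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def normalize_by_precision(matrix):
    """Divide each column by its sum, so the diagonal holds per-class precision."""
    totals = [sum(col) for col in zip(*matrix)]
    return [[v / t if t else 0.0 for v, t in zip(row, totals)] for row in matrix]


# Toy example using GOS-style labels (1, 3, 4, 5); the values are invented.
cm_classes = [1, 3, 4, 5]
cm_true = [1, 3, 4, 5, 5, 3]
cm_pred = [1, 3, 5, 5, 5, 4]
counts = confusion_matrix(cm_true, cm_pred, cm_classes)
by_recall = normalize_by_recall(counts)
by_precision = normalize_by_precision(counts)
```

Showing both views side by side is useful for imbalanced outcomes: a rare class can have high recall but low precision, or the reverse, and a single normalization hides one of the two.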
