Assessment of Natural Language Processing Methods for Ascertaining the Expanded Disability Status Scale Score From the Electronic Health Records of Patients With Multiple Sclerosis: Algorithm Development and Validation Study
- PMID: 35019849
- PMCID: PMC8792771
- DOI: 10.2196/25157
Abstract
Background: The Expanded Disability Status Scale (EDSS) score is a widely used measure to monitor disability progression in people with multiple sclerosis (MS). However, extracting and deriving the EDSS score from unstructured electronic health records can be time-consuming.
Objective: We aimed to compare rule-based and deep learning natural language processing algorithms for detecting and predicting the total EDSS score and EDSS functional system subscores from the electronic health records of patients with MS.
Methods: We studied 17,452 electronic health records of 4906 MS patients followed at one of Canada's largest MS clinics between June 2015 and July 2019. We randomly divided the records into training (80%) and test (20%) data sets, and compared the performance characteristics of 3 natural language processing models. First, we applied a rule-based approach, extracting the EDSS score from sentences containing the keyword "EDSS." Next, we trained a convolutional neural network (CNN) model to predict the 19 half-step increments of the EDSS score. Finally, we used a combined rule-based-CNN model. For each approach, we determined the accuracy, precision, recall, and F-score compared with the reference standard, which was manually labeled EDSS scores in the clinic database.
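The rule-based step and its combination with a learned model can be illustrated with a minimal sketch. It is not the authors' implementation: the regular expression, the function names `rule_based_edss` and `combined_edss`, the validity check (0 to 10 in half steps), and the "use the rule first, fall back to the model" ordering are all assumptions made for illustration, and `fallback` stands in for a trained CNN classifier.

```python
import re
from typing import Callable, Optional

# Assumed pattern: the keyword "EDSS" followed, before the next period or
# line break, by a number such as 2, 2.0, or 6.5.
_EDSS_PATTERN = re.compile(
    r"\bEDSS\b[^\d.\n]{0,40}?(\d{1,2}(?:\.\d)?)(?!\d)",
    re.IGNORECASE,
)


def rule_based_edss(note: str) -> Optional[float]:
    """Return the first plausible EDSS score stated next to the keyword, if any."""
    for match in _EDSS_PATTERN.finditer(note):
        value = float(match.group(1))
        # Assumed validity check: scores lie in [0, 10] on a half-step grid.
        if 0.0 <= value <= 10.0 and (value * 2).is_integer():
            return value
    return None


def combined_edss(note: str, fallback: Callable[[str], float]) -> float:
    """Use the keyword rule when it fires; otherwise defer to a model.

    `fallback` is a hypothetical stand-in for a trained classifier.
    """
    score = rule_based_edss(note)
    return score if score is not None else fallback(note)


if __name__ == "__main__":
    note = "Neurological exam today. EDSS score is 3.5. Follow up in 6 months."
    print(rule_based_edss(note))
    print(combined_edss("Patient walks with a cane.", lambda _: 6.0))
```

A rule of this kind can only fire on notes that state the score explicitly next to the keyword, which is the gap a learned model is meant to cover.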
Results: Overall, the combined keyword-CNN model performed best, with accuracy, precision, recall, and F-score of 0.90, 0.83, 0.83, and 0.83, respectively. The corresponding figures were 0.57, 0.91, 0.65, and 0.70 for the rule-based model alone and 0.86, 0.70, 0.70, and 0.70 for the CNN model alone. Because of missing data, model performance for the EDSS subscores was lower than that for the total EDSS score. Performance improved when only notes with known EDSS subscore values were considered.
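For readers unfamiliar with how accuracy, precision, recall, and F-score are computed for a multiclass label such as the EDSS score, the following is a minimal sketch. The abstract does not state how per-class values were averaged, so the macro averaging used here is an assumption, and the function name `macro_metrics` is hypothetical.

```python
from typing import Dict, Sequence


def macro_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F-score.

    Macro averaging (an unweighted mean over classes) is an assumption;
    the source does not specify the averaging scheme.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f_scores = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_scores.append(f_score)

    n_classes = len(labels)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f_score": sum(f_scores) / n_classes,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    print(macro_metrics([1.0, 1.0, 2.0, 2.0], [1.0, 2.0, 2.0, 2.0]))
```

Under macro averaging, accuracy can differ from precision and recall, as it does in the reported results.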
Conclusions: A combined keyword-CNN natural language processing model can extract and accurately predict EDSS scores from patient records. This approach can be automated for efficient information extraction in clinical and research settings.
Keywords: machine learning; multiple sclerosis; natural language processing.
©Zhen Yang, Chloé Pou-Prom, Ashley Jones, Michaelia Banning, David Dai, Muhammad Mamdani, Jiwon Oh, Tony Antoniou. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 12.01.2022.
Conflict of interest statement
Conflicts of Interest: JO reports grants from MS Society of Canada, The Barford and Love MS Fund of St. Michael’s Hospital Foundation, National MS Society, Brain Canada, Biogen-Idec, Roche, and EMD-Serono; and personal fees for consulting or speaking from Biogen-Idec, EMD-Serono, Roche, Sanofi-Genzyme, Novartis, and Celgene.