Predicting life expectancy with a long short-term memory recurrent neural network using electronic medical records
- PMID: 30819172
- PMCID: PMC6394008
- DOI: 10.1186/s12911-019-0775-2
Abstract
Background: Life expectancy is one of the most important factors in end-of-life decision making. Good prognostication, for example, helps to determine the course of treatment and to anticipate the procurement of health care services and facilities; more broadly, it facilitates Advance Care Planning. Advance Care Planning improves the quality of the final phase of life by stimulating doctors to explore preferences for end-of-life care with their patients and the people close to them. Physicians, however, tend to overestimate life expectancy and miss the window of opportunity to initiate Advance Care Planning. This research tests the potential of machine learning and natural language processing techniques for predicting life expectancy from electronic medical records.
Methods: We approached the prediction of life expectancy as a supervised machine learning task. We trained and tested a long short-term memory recurrent neural network on the medical records of deceased patients. We developed the model with a ten-fold cross-validation procedure and evaluated its performance on a held-out set of test data. We compared the performance of a model that does not use text features (baseline model) with that of a model that uses features extracted from the free text of the medical records (keyword model), and with doctors' performance on a similar task as described in the scientific literature.
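The development procedure described in the Methods can be illustrated with a minimal sketch of the data split: a test set is set aside first, and the remaining records are divided into ten folds for cross-validation. The function name `split_with_holdout`, the 10% test fraction, the fixed seed, and the choice to split at the patient level are all assumptions made for illustration; the abstract does not state them.

```python
import random
from typing import List, Tuple

Fold = Tuple[List[int], List[int]]  # (training indices, validation indices)

def split_with_holdout(
    n_patients: int,
    n_folds: int = 10,
    test_fraction: float = 0.1,  # assumed value, not given in the abstract
    seed: int = 0,
) -> Tuple[List[int], List[Fold]]:
    """Hold out a test set, then build k cross-validation folds on the rest."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)

    # The held-out test set is never used during model development.
    n_test = int(n_patients * test_fraction)
    test = indices[:n_test]
    dev = indices[n_test:]

    folds: List[Fold] = []
    for k in range(n_folds):
        # Every n_folds-th development index, starting at k, forms fold k.
        val = dev[k::n_folds]
        val_set = set(val)
        train = [i for i in dev if i not in val_set]
        folds.append((train, val))
    return test, folds

test_idx, cv_folds = split_with_holdout(100)
print(len(test_idx), len(cv_folds))
```

Each patient in the development set appears in exactly one validation fold, so every development record is used for validation once and for training nine times, while the test indices stay untouched until the final evaluation.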
Results: Both doctors and the baseline model were correct in 20% of the cases, where a prognosis counts as correct if it falls within a margin of 33% around the actual life expectancy. The keyword model, in comparison, attained an accuracy of 29% with its prognoses. While doctors overestimated life expectancy in 63% of the incorrect prognoses, which harms the anticipation of appropriate end-of-life care, the keyword model overestimated life expectancy in only 31% of the incorrect prognoses.
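The two figures reported in the Results, accuracy within a 33% margin and the share of overestimates among incorrect prognoses, can be computed as in the sketch below. It assumes the margin is taken relative to the actual life expectancy, that is, a prognosis is correct when |predicted - actual| <= 0.33 * actual; the abstract does not spell out the exact inequality. The function name `margin_accuracy` and the example values (in months) are hypothetical.

```python
from typing import Sequence, Tuple

def margin_accuracy(
    actual: Sequence[float],
    predicted: Sequence[float],
    margin: float = 0.33,
) -> Tuple[float, float]:
    """Return (accuracy, share of overestimates among incorrect prognoses)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equally long")

    correct = 0
    overestimated = 0
    for a, p in zip(actual, predicted):
        if abs(p - a) <= margin * a:
            correct += 1          # within the margin around the actual value
        elif p > a:
            overestimated += 1    # incorrect and too optimistic

    incorrect = len(actual) - correct
    accuracy = correct / len(actual)
    over_share = overestimated / incorrect if incorrect else 0.0
    return accuracy, over_share

# Hypothetical life expectancies in months: two correct, one over, one under.
print(margin_accuracy([12, 6, 24, 10], [10, 9, 12, 10]))  # (0.5, 0.5)
```

Under this definition the overestimation share is computed only over the incorrect prognoses, which matches how the 63% and 31% figures are phrased.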
Conclusions: Prognostication of life expectancy is difficult for humans. Our research shows that machine learning and natural language processing techniques offer a feasible and promising approach to predicting life expectancy. The research has potential for real-life applications, such as supporting timely recognition of the right moment to start Advance Care Planning.
Keywords: Advance care planning; Clinical free-text; Life expectancy prediction; Long short-term memory.
Conflict of interest statement
Ethics approval and consent to participate
The data used in this study were gathered through an informed opt-out procedure by the Transitie Project. The Transitie Project, hosted at the academic hospital Radboudumc, approved the use of its data for this research. Retrospective research on patient files requires adherence to the Personal Data Protection Act. Therefore, the data were anonymized and processed in a secure research environment.
As determined by the Central Committee on Research Involving Human Subjects (the national medical-ethical review committee,
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.