Representation learning for clinical time series prediction tasks in electronic health records
- PMID: 31842854
- PMCID: PMC6916209
- DOI: 10.1186/s12911-019-0985-7
Abstract
Background: Electronic health records (EHRs) offer opportunities to improve patient care and facilitate clinical research. However, applications of EHRs face many challenges, such as temporality, high dimensionality, sparseness, noise, random error and systematic bias. In particular, traditional machine learning methods struggle to use temporal information effectively, even though the sequential information in EHRs is very useful.
Method: We propose a general-purpose patient representation learning approach to summarize sequential EHRs. Specifically, a recurrent neural network based denoising autoencoder (RNN-DAE) encodes the in-hospital records of each patient into a low-dimensional dense vector.
Results: Using EHR data collected from Shuguang Hospital, affiliated to Shanghai University of Traditional Chinese Medicine, we evaluate the proposed RNN-DAE method on both a mortality prediction task and a comorbidity prediction task. Extensive experiments show that RNN-DAE outperforms existing methods. In addition, we use the "Deep Feature" representation produced by RNN-DAE to track similar patients with t-SNE, which also yields some interesting observations.
Conclusion: We propose an effective unsupervised RNN-DAE method to summarize sequential patient information in EHR data. The method is useful for both mortality prediction and comorbidity prediction.
Keywords: Electronic health records; Mortality prediction; Recurrent neural network; Representation learning.
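The Method paragraph describes the RNN-DAE only in outline, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes a plain tanh recurrent cell, masking noise as the corruption, and a mean squared reconstruction error; the abstract confirms none of these choices. All names, sizes and patient values are hypothetical, the weights are random and untrained, and the gradient-based training loop is omitted.

```python
import math
import random


def corrupt(sequence, drop_prob, rng):
    """Masking noise: zero each value independently with probability drop_prob."""
    return [[0.0 if rng.random() < drop_prob else v for v in step]
            for step in sequence]


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(sequence, w_in, w_rec):
    """Run a tanh RNN over the records; the final hidden state is the patient vector."""
    hidden = [0.0] * len(w_rec)
    for step in sequence:
        pre = [a + b for a, b in zip(matvec(w_in, step), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
    return hidden


def decode(code, w_dec, w_out, n_steps):
    """Unroll a decoder RNN from the patient vector, emitting one record per step."""
    hidden, outputs = code, []
    for _ in range(n_steps):
        outputs.append(matvec(w_out, hidden))
        hidden = [math.tanh(p) for p in matvec(w_dec, hidden)]
    return outputs


def reconstruction_loss(clean, reconstructed):
    """Mean squared error against the CLEAN sequence (the denoising objective)."""
    errors = [(c - r) ** 2
              for c_step, r_step in zip(clean, reconstructed)
              for c, r in zip(c_step, r_step)]
    return sum(errors) / len(errors)


rng = random.Random(0)
n_features, n_hidden = 4, 3          # hypothetical sizes
patient = [[0.5, 1.2, 0.0, 0.3],     # three made-up in-hospital records
           [0.6, 1.1, 0.2, 0.3],
           [0.9, 0.8, 0.4, 0.1]]

w_in = init_matrix(n_hidden, n_features, rng)
w_rec = init_matrix(n_hidden, n_hidden, rng)
w_dec = init_matrix(n_hidden, n_hidden, rng)
w_out = init_matrix(n_features, n_hidden, rng)

noisy = corrupt(patient, drop_prob=0.3, rng=rng)   # encoder sees the noisy input
patient_vector = encode(noisy, w_in, w_rec)        # low-dimensional dense vector
loss = reconstruction_loss(
    patient, decode(patient_vector, w_dec, w_out, len(patient)))
print(len(patient_vector), loss >= 0.0)
```

The point of the sketch is the asymmetry in the objective: the encoder reads the corrupted sequence, while the loss compares the reconstruction with the clean one. After training, patient_vector would play the role of the fixed-length representation passed to downstream tasks such as mortality or comorbidity prediction.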
Conflict of interest statement
The authors declare that they have no competing interests.
Similar articles
- Endpoint prediction of heart failure using electronic health records. J Biomed Inform. 2020 Sep;109:103518. doi: 10.1016/j.jbi.2020.103518. Epub 2020 Jul 25. PMID: 32721582
- Combining structured and unstructured data for predictive models: a deep learning approach. BMC Med Inform Decis Mak. 2020 Oct 29;20(1):280. doi: 10.1186/s12911-020-01297-6. PMID: 33121479 Free PMC article.
- Feature rearrangement based deep learning system for predicting heart failure mortality. Comput Methods Programs Biomed. 2020 Jul;191:105383. doi: 10.1016/j.cmpb.2020.105383. Epub 2020 Feb 6. PMID: 32062185
- Deep representation learning of patient data from Electronic Health Records (EHR): A systematic review. J Biomed Inform. 2021 Mar;115:103671. doi: 10.1016/j.jbi.2020.103671. Epub 2020 Dec 31. PMID: 33387683 Free PMC article.
- Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review. J Am Med Inform Assoc. 2018 Oct 1;25(10):1419-1428. doi: 10.1093/jamia/ocy068. PMID: 29893864 Free PMC article.
Cited by
- Semisupervised Calibration of Risk with Noisy Event Times (SCORNET) using electronic health record data. Biostatistics. 2023 Jul 14;24(3):760-775. doi: 10.1093/biostatistics/kxac003. PMID: 35166342 Free PMC article.
- Clinical relevance of deep learning models in predicting the onset timing of cancer pain exacerbation. Sci Rep. 2023 Jul 17;13(1):11501. doi: 10.1038/s41598-023-37742-5. PMID: 37460584 Free PMC article.
- Enhancing Patient Outcome Prediction Through Deep Learning With Sequential Diagnosis Codes From Structured Electronic Health Record Data: Systematic Review. J Med Internet Res. 2025 Mar 18;27:e57358. doi: 10.2196/57358. PMID: 40100249 Free PMC article.
- Patient Representation From Structured Electronic Medical Records Based on Embedding Technique: Development and Validation Study. JMIR Med Inform. 2021 Jul 23;9(7):e19905. doi: 10.2196/19905. PMID: 34297000 Free PMC article.
- A semi-supervised adaptive Markov Gaussian embedding process (SAMGEP) for prediction of phenotype event times using the electronic health record. Sci Rep. 2022 Oct 22;12(1):17737. doi: 10.1038/s41598-022-22585-3. PMID: 36273240 Free PMC article.