J Am Med Inform Assoc. 2018 Oct 1;25(10):1419-1428.
doi: 10.1093/jamia/ocy068.

Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review

Cao Xiao et al. J Am Med Inform Assoc.

Abstract

Objective: To conduct a systematic review of deep learning models for electronic health record (EHR) data, and illustrate various deep learning architectures for analyzing different data sources and their target applications. We also highlight ongoing research and identify open challenges in building deep learning models of EHRs.

Design/method: We searched PubMed and Google Scholar for papers on deep learning studies using EHR data published between January 1, 2010, and January 31, 2018. We summarize them according to these axes: types of analytics tasks, types of deep learning model architectures, special challenges arising from health data and tasks and their potential solutions, as well as evaluation strategies.

Results: We surveyed and analyzed multiple aspects of the 98 articles we found and identified the following analytics tasks: disease detection/classification, sequential prediction of clinical events, concept embedding, data augmentation, and EHR data privacy. We then studied how deep architectures were applied to these tasks. We also discussed some special challenges arising from modeling EHR data and reviewed a few popular approaches. Finally, we summarized how performance evaluations were conducted for each task.

Discussion: Despite early success in applying deep learning to health analytics, a number of issues remain to be addressed. We discuss them in detail, including data and label availability, model interpretability and transparency, and ease of deployment.

Figures

Figure 1. Illustration of literature search and selection procedure.
Figure 2. Longitudinal EHR data are transformed into input vectors (top left), which can support the different analytics tasks described in the survey (top right). The underlying deep learning models are illustrated at the bottom. (a) Feedforward neural networks stack multiple fully connected layers with nonlinear activations (e.g., sigmoid or rectified linear unit). (b) Recurrent neural networks process variable-length input sequences through their recurrent connections. (c) Restricted Boltzmann machines are bipartite neural networks that consist of binary stochastic nodes; they capture a latent representation of the input data by learning its generative probability. (d) Generative adversarial networks generate realistic synthetic samples by training a generator and a discriminator in an adversarial game. (e) Convolutional neural networks capture local features of the input data and stack those features through a sequence of convolutions to derive global features. (f) Word2vec exploits the co-occurrence information of discrete concepts (e.g., words in text, codes in EHR data) to derive concept representations. (g) Denoising autoencoders (AE) try to reconstruct the original input from a corrupted version of it, thus learning robust representations of the input data.
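The first step in Figure 2, turning longitudinal EHR data into input vectors, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the encoding used by any particular study in the review: each patient is assumed to be a list of visits, each visit a list of code strings, and the function names (`build_vocab`, `encode_visit`, `encode_patient`, `aggregate_patient`) and the toy codes are hypothetical. A per-visit multi-hot sequence would suit a sequence model such as a recurrent neural network, while the aggregated count vector would suit a feedforward network or an autoencoder.

```python
from typing import Dict, List, Sequence

Visit = Sequence[str]            # codes recorded at one visit
Patient = Sequence[Visit]        # visits in chronological order


def build_vocab(patients: Sequence[Patient]) -> Dict[str, int]:
    """Map each distinct code to a column index (sorted for reproducibility)."""
    codes = sorted({code for visits in patients for visit in visits for code in visit})
    return {code: index for index, code in enumerate(codes)}


def encode_visit(visit: Visit, vocab: Dict[str, int]) -> List[int]:
    """Multi-hot vector: 1 where a code occurs in the visit, 0 elsewhere."""
    vector = [0] * len(vocab)
    for code in visit:
        vector[vocab[code]] = 1
    return vector


def encode_patient(visits: Patient, vocab: Dict[str, int]) -> List[List[int]]:
    """Variable-length sequence of multi-hot visit vectors."""
    return [encode_visit(visit, vocab) for visit in visits]


def aggregate_patient(visits: Patient, vocab: Dict[str, int]) -> List[int]:
    """Fixed-length vector counting how often each code occurs across visits."""
    counts = [0] * len(vocab)
    for visit in visits:
        for code in visit:
            counts[vocab[code]] += 1
    return counts


# Hypothetical toy records; the code strings are placeholders for illustration.
patients = [
    [["E11.9", "I10"], ["E11.9"]],
    [["I10"], ["J45", "I10"], ["J45"]],
]

vocab = build_vocab(patients)
sequences = [encode_patient(p, vocab) for p in patients]
counts = [aggregate_patient(p, vocab) for p in patients]
```

The two patients here have different numbers of visits, so `sequences` holds lists of different lengths, while every entry of `counts` has the same length as the vocabulary.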
