Review

Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis

Benjamin Shickel et al. IEEE J Biomed Health Inform. 2018 Sep;22(5):1589-1604. doi: 10.1109/JBHI.2017.2767063. Epub 2017 Oct 27.

Abstract

The past decade has seen an explosion in the amount of digital information stored in electronic health records (EHRs). Although EHRs were primarily designed for archiving patient information and performing administrative healthcare tasks such as billing, many researchers have found secondary uses for these records in a variety of clinical informatics applications. Over the same period, the machine learning community has seen widespread advances in the field of deep learning. In this review, we survey the current research on applying deep learning to clinical tasks based on EHR data, where we find a variety of deep learning techniques and frameworks being applied to several types of clinical applications including information extraction, representation learning, outcome prediction, phenotyping, and deidentification. We identify several limitations of current research involving topics such as model interpretability, data heterogeneity, and lack of universal benchmarks. We conclude by summarizing the state of the field and identifying avenues of future deep EHR research.

Figures

Fig. 1
Trends in the number of Google Scholar publications relating to deep EHR through August 2017. The top distribution shows overall results for “deep learning” and “electronic health records”. The bottom two distributions show these same terms in conjunction with a variety of specific application areas and technical methods. Large yearly jumps are seen for most terms beginning in 2015.
Fig. 2
Neural network with 1 input layer, 1 output layer, and 2 hidden layers.
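As a concrete counterpart to Fig. 2, the short sketch below builds the same shape of network, one input layer, two hidden layers, and one output layer, as a fully connected model. It assumes PyTorch, and all layer sizes are illustrative choices rather than values from the paper.

    import torch
    import torch.nn as nn

    # Feed-forward network mirroring Fig. 2; all sizes are illustrative assumptions.
    model = nn.Sequential(
        nn.Linear(64, 32),   # input layer -> first hidden layer
        nn.ReLU(),
        nn.Linear(32, 16),   # first hidden layer -> second hidden layer
        nn.ReLU(),
        nn.Linear(16, 1),    # second hidden layer -> single output unit
        nn.Sigmoid(),        # e.g., a binary clinical outcome
    )

    x = torch.randn(8, 64)   # batch of 8 synthetic 64-dimensional inputs
    y = model(x)             # shape: (8, 1)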
Fig. 3
The most common deep learning architectures for analyzing EHR data. Architectures differ in terms of their node types and the connection structure (e.g. fully connected versus locally connected). Below each model type is a list of selected references implementing the architecture for EHR applications. Icons based on the work of van Veen [30].
Fig. 4
Example of a convolutional neural network (CNN) for classifying images. This particular model includes two convolutional layers, each followed by a pooling/subsampling layer. The output from the second pooling layer is fed to a fully connected layer and a final output layer. [31]
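The pipeline in Fig. 4 (two convolution/pooling stages, a fully connected layer, and an output layer) can be sketched as the hypothetical PyTorch module below; the 1-channel 28x28 input, channel counts, and 10 output classes are assumptions for illustration, not details taken from [31].

    import torch
    import torch.nn as nn

    class SmallCNN(nn.Module):
        """Two conv+pool stages, then a fully connected layer and an output layer."""
        def __init__(self, num_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 8, kernel_size=5),   # first convolutional layer
                nn.ReLU(),
                nn.MaxPool2d(2),                  # first pooling/subsampling layer
                nn.Conv2d(8, 16, kernel_size=5),  # second convolutional layer
                nn.ReLU(),
                nn.MaxPool2d(2),                  # second pooling/subsampling layer
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(16 * 4 * 4, 32),        # fully connected layer
                nn.ReLU(),
                nn.Linear(32, num_classes),       # final output layer
            )

        def forward(self, x):
            return self.classifier(self.features(x))

    out = SmallCNN()(torch.randn(1, 1, 28, 28))   # shape: (1, 10)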
Fig. 5
Symbolic representation of an RNN (left) with the equivalent expanded representation (right) for an example input sequence of length three, three hidden units, and a single output. Each input time step is combined with the current hidden state of the RNN, which itself depends on the previous hidden state, demonstrating the memory effect of RNNs.
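A minimal sketch of the unrolled network in Fig. 5, assuming a plain Elman RNN in PyTorch with an input sequence of length three, three hidden units, and a single output read from the final hidden state:

    import torch
    import torch.nn as nn

    rnn = nn.RNN(input_size=1, hidden_size=3, batch_first=True)  # three hidden units
    readout = nn.Linear(3, 1)                                    # single output

    x = torch.randn(1, 3, 1)        # one sequence of length three
    outputs, h_n = rnn(x)           # each step's hidden state depends on the previous one
    y = readout(h_n.squeeze(0))     # single output from the final hidden state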
Fig. 6
Example of a stacked autoencoder with two independently trained hidden layers. In the first layer, the output is the reconstruction of the input x, and z is the lower-dimensional representation (i.e., the encoding) of x. Once the first hidden layer is trained, the embeddings z are used as input to a second autoencoder, demonstrating how autoencoders can be stacked. [33]
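The layer-wise training in Fig. 6 amounts to fitting one autoencoder, keeping its encoder, and fitting a second autoencoder on the resulting codes. The sketch below, assuming PyTorch, illustrates this with made-up dimensions and a bare-bones training loop.

    import torch
    import torch.nn as nn

    def train_autoencoder(encoder, decoder, data, epochs=10):
        """Minimize the reconstruction error ||decoder(encoder(x)) - x||^2."""
        params = list(encoder.parameters()) + list(decoder.parameters())
        opt = torch.optim.Adam(params, lr=1e-3)
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.mse_loss(decoder(encoder(data)), data)
            loss.backward()
            opt.step()

    x = torch.rand(256, 100)                       # synthetic input vectors

    # First autoencoder: x -> z (encoding) -> reconstruction of x
    enc1, dec1 = nn.Linear(100, 32), nn.Linear(32, 100)
    train_autoencoder(enc1, dec1, x)

    # The second autoencoder is trained on the first layer's encodings z,
    # which is what "stacking" means in Fig. 6.
    z = enc1(x).detach()
    enc2, dec2 = nn.Linear(32, 16), nn.Linear(16, 32)
    train_autoencoder(enc2, dec2, z)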
Fig. 7
EHR Information Extraction (IE) and example tasks.
Fig. 8
Illustration of how autoencoders can be used to transform extremely sparse patient vectors into a more compact representation. Since medical codes are represented as binary categorical features, raw patient vectors can have dimensions in the thousands. Training an autoencoder on these vectors produces an encoding function to transform any given vector into its distributed and dimensionality-reduced representation.
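To make the transformation in Fig. 8 concrete, the sketch below maps a sparse binary medical-code vector to a dense, lower-dimensional code with a single-hidden-layer autoencoder. The vocabulary size, encoding dimension, and reconstruction loss are assumptions for illustration only, and training is omitted.

    import torch
    import torch.nn as nn

    n_codes = 5000                     # assumed size of the medical-code vocabulary
    encoder = nn.Sequential(nn.Linear(n_codes, 128), nn.ReLU())
    decoder = nn.Sequential(nn.Linear(128, n_codes), nn.Sigmoid())

    # Sparse binary patient vector: 1 wherever a code appears in the record.
    patient = torch.zeros(1, n_codes)
    patient[0, [12, 403, 4999]] = 1.0

    recon = decoder(encoder(patient))
    loss = nn.functional.binary_cross_entropy(recon, patient)

    # After training, encoder(patient) serves as the compact distributed representation.
    dense = encoder(patient)           # shape: (1, 128)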
Fig. 9
Beaulieu-Jones and Greene's [49] autoencoder-based phenotype stratification for case (1) vs. control (0) diagnoses, illustrated with t-SNE. (A) shows clustering based on raw clinical descriptors, where there is little separable structure. (B-F) show the resulting clusters following 0-10,000 training epochs of the single-layer autoencoder. As the autoencoder is trained, clear boundaries emerge between the two labels, suggesting that the unsupervised autoencoder discovers latent structure in the raw data without any human input.
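A plot like Fig. 9 can be produced by running t-SNE on the encoded patient vectors and coloring points by case/control label. The sketch below assumes scikit-learn and matplotlib and uses random arrays purely as stand-ins for real embeddings and diagnoses.

    import numpy as np
    from sklearn.manifold import TSNE
    import matplotlib.pyplot as plt

    # Stand-ins for autoencoder embeddings and case (1) vs. control (0) labels.
    embeddings = np.random.rand(200, 32)
    labels = np.random.randint(0, 2, size=200)

    coords = TSNE(n_components=2).fit_transform(embeddings)
    plt.scatter(coords[:, 0], coords[:, 1], c=labels)
    plt.title("t-SNE of autoencoder representations (case vs. control)")
    plt.show()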
Fig. 10
Example of the positive effect of sparsity constraints on model interpretability. Shown are the first hidden layer weights from Lasko et al.'s [50] autoencoder framework for phenotyping uric acid sequences, which in effect form functional element detectors.
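One common way to impose the kind of sparsity constraint discussed for Fig. 10 is to add a penalty on hidden activations to the reconstruction loss. The sketch below uses an L1 activation penalty in PyTorch as an illustrative stand-in; it is not necessarily the exact formulation used in [50], and the dimensions and penalty weight are assumed.

    import torch
    import torch.nn as nn

    encoder = nn.Sequential(nn.Linear(50, 20), nn.Sigmoid())
    decoder = nn.Linear(20, 50)

    x = torch.rand(64, 50)             # e.g., fixed-length clinical lab sequences
    z = encoder(x)

    # Reconstruction error plus an L1 penalty on the hidden activations, which
    # pushes most units toward zero and encourages the sparse, more interpretable
    # first-layer weights shown in Fig. 10.
    sparsity_weight = 1e-3             # assumed hyperparameter
    loss = nn.functional.mse_loss(decoder(z), x) + sparsity_weight * z.abs().mean()
    loss.backward()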

References

    1. Birkhead GS, Klompas M, Shah NR. Uses of Electronic Health Records for Public Health Surveillance to Advance Public Health. Annu Rev Public Health. 2015;36:345–59. - PubMed
    2. The Office of the National Coordinator for Health Information Technology. Adoption of Electronic Health Record Systems among U.S. Non-Federal Acute Care Hospitals: 2008-2015. 2016. Available: https://dashboard.healthit.gov/evaluations/data-briefs/non-federal-acute....
    3. J E, N N. Electronic Health Record Adoption and Use among Office-based Physicians in the U.S., by State: 2015 National Electronic Health Records Survey. The Office of the National Coordinator for Health Information Technology, Tech. Rep. 2016.
    4. Botsis T, Hartvigsen G, Chen F, Weng C. Secondary Use of EHR: Data Quality Issues and Informatics Opportunities. AMIA Joint Summits on Translational Science Proceedings. 2010;2010:1–5. Available: http://www.ncbi.nlm.nih.gov/pubmed/21347133. - PMC - PubMed
    5. Jensen PB, Jensen LJ, Brunak S. Mining electronic health records: towards better research applications and clinical care. Nature Reviews Genetics. 2012;13:395–405. - PubMed