Flexible-Window Predictions on Electronic Health Records

Mehak Gupta et al. Proc AAAI Conf Artif Intell. 2022 Feb-Mar;36(11):12510-12516. doi: 10.1609/aaai.v36i11.21520. Epub 2022 Jun 28.

Abstract

Various machine learning techniques are available for analyzing electronic health records (EHRs). For predictive tasks, most existing methods explicitly or implicitly divide these time-series data into predetermined observation and prediction windows. However, patients have medical histories of different lengths, and the desired predictions (for purposes such as diagnosis or treatment) are needed at different times in the future. In this paper, we propose a method that uses a sequence-to-sequence generator model to map an input sequence of EHR data to a sequence of user-defined target labels, giving end users "flexible" observation and prediction windows to define. Our design combines adversarial and semi-supervised approaches: the sequence-to-sequence model acts as a generator, and a discriminator distinguishes between the actual (observed) and generated labels. We evaluate our models through an extensive series of experiments on two large EHR datasets from adult and pediatric populations. In an obesity prediction case study, we show that our model, trained only once, achieves superior results on flexible-window prediction tasks, even with high rates of missing values in the input EHR data. Moreover, through a series of attention analysis experiments, we show that the proposed model effectively learns the features most relevant to each prediction task.
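To make the setup concrete, below is a minimal, hypothetical PyTorch sketch of an adversarial sequence-to-sequence arrangement of the kind the abstract describes: a generator maps an observation-window EHR sequence to a label sequence over a chosen prediction window, and a discriminator distinguishes observed from generated label sequences. All module names, dimensions, and loss terms are illustrative assumptions, not the authors' implementation, and the paper's semi-supervised component is omitted.

# Hypothetical sketch of the adversarial seq2seq setup (shapes only).
import torch
import torch.nn as nn

class Seq2SeqGenerator(nn.Module):
    def __init__(self, n_features, hidden, pred_steps):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.decoder = nn.GRUCell(1, hidden)      # consumes the previous label estimate
        self.out = nn.Linear(hidden, 1)
        self.pred_steps = pred_steps

    def forward(self, x):
        # x: (batch, obs_steps, n_features) EHR sequence in the observation window
        _, h = self.encoder(x)
        h = h.squeeze(0)
        y_prev = torch.zeros(x.size(0), 1, device=x.device)
        outputs = []
        for _ in range(self.pred_steps):          # prediction-window length chosen by the user
            h = self.decoder(y_prev, h)
            y_prev = torch.sigmoid(self.out(h))   # per-step label probability
            outputs.append(y_prev)
        return torch.cat(outputs, dim=1)          # (batch, pred_steps)

class Discriminator(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(1, hidden, batch_first=True)
        self.clf = nn.Linear(hidden, 1)

    def forward(self, y_seq):
        # y_seq: (batch, pred_steps) label sequence, observed or generated
        _, h = self.rnn(y_seq.unsqueeze(-1))
        return self.clf(h.squeeze(0))             # real-vs-generated logit

# Illustrative training step on random tensors.
gen = Seq2SeqGenerator(n_features=20, hidden=64, pred_steps=4)
disc = Discriminator()
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce_logits, bce = nn.BCEWithLogitsLoss(), nn.BCELoss()

x = torch.randn(8, 12, 20)                        # 12 observed timestamps, 20 features
y_real = torch.randint(0, 2, (8, 4)).float()      # observed labels in the prediction window

y_fake = gen(x)
loss_d = bce_logits(disc(y_real), torch.ones(8, 1)) + \
         bce_logits(disc(y_fake.detach()), torch.zeros(8, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

loss_g = bce_logits(disc(y_fake), torch.ones(8, 1)) + bce(y_fake, y_real)  # adversarial + supervised
opt_g.zero_grad(); loss_g.backward(); opt_g.step()

In this reading, the "flexible window" comes from the observation span of the input and the number of decoding steps being chosen at prediction time rather than fixed in the model design, which matches the abstract's claim that one trained model serves different window configurations.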


Figures

Figure 1:
(A) An example of the observed and missing data configurations (gray shows missing). (B) Flexible observation and prediction windows for a patient with 10 years of data.
Figure 2:
Our model's structure. Red arrows show the last hidden state of the decoder, yellow arrows show the weighted vector given to the decoder, and purple arrows show ȳ_t, which is fed to the decoder's linear layer (see the decoder-step sketch after the figure list).
Figure 3:
Attention analysis on one positive sample. Each row of numbers on the right shows the predicted probability of obesity. Removing the factors positively associated with obesity (shown in red) decreases the predicted probability; the opposite effect is seen when medication is removed (green).
Figure 4:
Average global time attention scores for the 12 timestamps in the 3-year observation window.
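
As a companion to the Figure 2 caption, here is a hedged sketch of a single decoding step in which an attention-weighted summary of the encoder states (the "weighted vector") is given to the decoder and the previous label estimate ȳ_t enters the decoder's linear output layer. Dot-product attention and all names and shapes are illustrative assumptions, not the authors' exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def decoder_step(enc_states, h_dec, y_bar_t, gru_cell, out_layer):
    # enc_states: (batch, obs_steps, hidden) encoder hidden states
    # h_dec:      (batch, hidden)            current decoder hidden state
    # y_bar_t:    (batch, 1)                 previous label estimate (ȳ_t)
    scores = torch.bmm(enc_states, h_dec.unsqueeze(-1)).squeeze(-1)          # (batch, obs_steps)
    attn = F.softmax(scores, dim=-1)                                         # time attention weights
    context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)            # the "weighted vector"
    h_next = gru_cell(context, h_dec)                                        # decoder state update
    y_next = torch.sigmoid(out_layer(torch.cat([h_next, y_bar_t], dim=-1)))  # ȳ_t enters the linear layer
    return h_next, y_next, attn

# Shape check with random tensors.
B, T, H = 4, 12, 64
h, y, a = decoder_step(torch.randn(B, T, H), torch.randn(B, H), torch.zeros(B, 1),
                       nn.GRUCell(H, H), nn.Linear(H + 1, 1))

Averaging the returned attention weights over samples would yield per-timestamp scores of the kind Figure 4 reports for the 12 timestamps of the observation window.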

