Sci Rep. 2018 Apr 17;8(1):6085.
doi: 10.1038/s41598-018-24271-9.

Recurrent Neural Networks for Multivariate Time Series with Missing Values

Zhengping Che et al. Sci Rep. 2018.

Abstract

Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a phenomenon known as informative missingness. Very little work has exploited these missing patterns for effective imputation and improved prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on the Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, masking and time interval, and incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series but also uses the missing patterns to achieve better prediction results. Experiments on time series classification tasks with real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and use of missing values in time series analysis.
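
The two representations of missing patterns named above, masking and time interval, can be illustrated with a small sketch. The abstract does not spell out how they are computed, so the code below assumes the usual convention: the mask is 1 where a variable is observed and 0 otherwise, and the time interval of a variable is the time elapsed since its last observation. The function name `masks_and_intervals`, the use of `None` for a missing value, and the toy data are illustrative assumptions, not taken from the paper.

```python
from typing import List, Optional, Tuple


def masks_and_intervals(
    x: List[List[Optional[float]]], s: List[float]
) -> Tuple[List[List[int]], List[List[float]]]:
    """Derive masking and time-interval matrices from a series.

    x: T rows of D measurements, with None marking a missing value (assumed encoding).
    s: T time stamps, one per row, in increasing order.
    """
    T = len(x)
    D = len(x[0]) if T else 0

    # Mask: 1 if the variable is observed at this step, 0 if it is missing.
    m = [[0 if v is None else 1 for v in row] for row in x]

    # Interval: time since the variable was last observed (0 at the first step).
    delta = [[0.0] * D for _ in range(T)]
    for t in range(1, T):
        gap = s[t] - s[t - 1]
        for d in range(D):
            if m[t - 1][d] == 1:
                # Observed at the previous step: the interval restarts.
                delta[t][d] = gap
            else:
                # Still missing at the previous step: the interval accumulates.
                delta[t][d] = gap + delta[t - 1][d]
    return m, delta


# Toy example with two variables and four time stamps (made-up values).
x_toy = [[47.0, None], [49.0, 15.0], [None, 14.0], [40.0, None]]
s_toy = [0.0, 1.0, 3.0, 6.0]
m_toy, delta_toy = masks_and_intervals(x_toy, s_toy)
# m_toy     == [[1, 0], [1, 1], [0, 1], [1, 0]]
# delta_toy == [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [5.0, 3.0]]
```

In the toy series, the first variable is missing at the third time stamp, so its interval at the fourth time stamp spans both gaps (2 + 3 = 5), while the second variable was observed at the third time stamp and its interval is only the last gap (3).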

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Demonstration of informative missingness on the MIMIC-III dataset. The bottom figure shows the missing rate of each input variable. The middle figure shows the absolute values of Pearson correlation coefficients between the missing rate of each variable and mortality. The top figure shows the absolute values of Pearson correlation coefficients between the missing rate of each variable and each ICD-9 diagnosis category. More details can be found in supplementary information Section S1.
Figure 2
An example of measurement vectors x_t, time stamps s_t, masking m_t, and time interval δ_t.
Figure 3
Graphical illustrations of the original GRU (top-left), the proposed GRU-D (bottom-left), and the whole network architecture (right).
Figure 4
Classification performance on Gesture synthetic datasets with different correlation values.
Figure 5
Plots of input decay γ_{x_t} for all variables (top) and histograms of hidden state decay weights W_{γ_h} for 10 variables (bottom) in the GRU-D model for predicting mortality on the PhysioNet dataset.
Figure 6
Early prediction capacity and model scalability comparisons of GRU-D and other RNN baselines on the MIMIC-III dataset.
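
Figure 5 refers to an input decay and to hidden state decay weights, which the captions here do not define. As a rough sketch only, assuming the exponential form γ = exp(−max(0, w·δ + b)) given in the full GRU-D paper, a missing value can be filled by decaying the last observation toward the variable's empirical mean. The scalar parameters `w` and `b`, the function names, and the fallback to the mean before any observation are illustrative assumptions; in the model the decay parameters are learned rather than fixed.

```python
import math
from typing import List, Optional


def decay_rate(delta: float, w: float, b: float) -> float:
    """Decay factor in (0, 1]; it shrinks as the time interval delta grows (for w > 0)."""
    return math.exp(-max(0.0, w * delta + b))


def impute_with_decay(
    x: List[Optional[float]],
    m: List[int],
    delta: List[float],
    x_mean: float,
    w: float,
    b: float,
) -> List[float]:
    """Fill the missing entries of one variable's series.

    Observed values pass through unchanged. A missing value is a mix of the
    last observation and the empirical mean, weighted by the decay factor.
    """
    x_last = x_mean  # assumption: use the mean until something is observed
    filled = []
    for x_t, m_t, d_t in zip(x, m, delta):
        if m_t == 1:
            x_last = float(x_t)
            filled.append(x_last)
        else:
            g = decay_rate(d_t, w, b)
            filled.append(g * x_last + (1.0 - g) * x_mean)
    return filled


# Made-up series: one observation followed by two missing steps.
filled = impute_with_decay(
    x=[10.0, None, None], m=[1, 0, 0], delta=[0.0, 1.0, 2.0],
    x_mean=0.0, w=1.0, b=0.0,
)
# filled[1] is 10 * exp(-1) and filled[2] is 10 * exp(-2): the longer the gap,
# the closer the imputed value moves to the mean (0.0 here).
```

The `max(0, ·)` term keeps the decay factor at or below 1, so an imputed value never overshoots the last observation; it can only move toward the mean.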
