Sci Rep. 2025 Oct 3;15(1):34515.
doi: 10.1038/s41598-025-17703-w.

Explainability and importance estimate of time series classifier via embedded neural network


Ho Tung Jeremy Chan et al. Sci Rep.

Abstract

Time series are common across disciplines; however, their analysis is not trivial due to the inter- and intra-relationships between ordered data sequences. This imposes limitations on the interpretation and importance estimation of the features within a time series. In the case of multivariate time series, these features are the individual time series and the time steps, which are intertwined. Many time series analyses exist, such as autocorrelation and Granger causality, which are based on statistical or econometric approaches. However, analyses that can inform the importance of features within a time series are uncommon, especially methods that utilise the embedded methods of neural networks (NNs). We approach this problem by expanding upon our previous work, Pairwise Importance Estimate Extension (PIEE). We adapted the existing method to make it compatible with time series. This led to the formulation of the aggregated Hadamard product, which can produce an importance estimate for each time point within a multivariate time series, which in turn allows each time series within a multivariate time series to be interpreted. In this work, we conducted an empirical study with univariate and multivariate time series, comparing the interpretations and importance estimates of features from existing embedded NN approaches, an explainable AI (xAI) approach, and our adapted PIEE approach. We verified interpretations and importance estimates against ground truth or existing domain knowledge where available. Otherwise, we conducted an ablation study by retraining the model with Leave-One-Out (LOO) and Singleton feature subsets to assess their contribution towards model performance. Our adapted PIEE method was able to produce feature importance heatmaps and rankings in line with the ground truth, the existing domain knowledge or the ablation study.
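The abstract names an aggregated Hadamard product that turns per-channel importance profiles into per-time-step estimates, but does not specify the computation. The following is a minimal, hypothetical sketch of one way such an aggregation could work (the function name, the use of absolute values, and the normalisation are all assumptions, not the paper's method):

```python
import numpy as np

def aggregated_hadamard_importance(profiles):
    """Hypothetical sketch: combine per-channel importance profiles
    into one per-time-step estimate via a Hadamard product.

    profiles: array of shape (channels, time_steps), one importance
    profile per input signal (e.g. from weights or gradients).
    """
    profiles = np.asarray(profiles, dtype=float)
    # Elementwise (Hadamard) product across channels emphasises time
    # steps that every channel marks as important.
    hadamard = np.prod(np.abs(profiles), axis=0)
    # Normalise so the estimate sums to 1 over time.
    total = hadamard.sum()
    return hadamard / total if total > 0 else hadamard

# Two toy channels that both mark the middle time step as important:
est = aggregated_hadamard_importance([[0.1, 0.9, 0.1],
                                      [0.2, 0.8, 0.2]])
```

Under these assumptions, the middle time step receives the largest share of the normalised importance.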


Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Sensitivity Analysis with Perturbation within the context of time series, where different parts of the series are replaced with 0.
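Fig. 1 describes zero-perturbation sensitivity analysis. A minimal sketch of the general technique, with a toy stand-in model (the function name, window scheme, and scalar-score model are assumptions for illustration):

```python
import numpy as np

def perturbation_sensitivity(model, x, window):
    """Hypothetical sketch of perturbation-based sensitivity analysis:
    zero out a sliding window of the series and record how much the
    model's score changes. `model` is any callable mapping a 1-D
    series to a scalar score.
    """
    baseline = model(x)
    scores = np.zeros(len(x) - window + 1)
    for start in range(len(scores)):
        perturbed = x.copy()
        perturbed[start:start + window] = 0.0  # replace this segment with 0
        scores[start] = abs(baseline - model(perturbed))
    return scores

# Toy model: the score is simply the mean of the series.
x = np.array([0.0, 0.0, 5.0, 0.0, 0.0])
sens = perturbation_sensitivity(lambda s: s.mean(), x, window=1)
```

Zeroing the only non-zero time step changes the toy score the most, so that position gets the highest sensitivity.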
Fig. 2
(a) Reduce and Retrain within the context of multivariate time series, where signals from the time series are removed. (b) Reduce and Retrain within the context of multivariate time series, where time steps across the signals from the time series are removed.
Fig. 3
(a) Outline of the principles behind PIEE. (b) Outline of Importance Estimate via Weight Profile and Importance Estimate via Gradient Profile.
Fig. 4
(a) A shallow NN. (b) A shallow NN with a linear layer between the input and the NN’s first layer. (c) A shallow NN with a pairwise layer between input and the NN’s first layer. (d) A shallow NN with an immutable pairwise layer between input and the NN’s first layer, where gradient changes for each respective feature are collected by the Gradient Collection Module.
Fig. 5
An example of Retrain with Singleton subsets within the context of multivariate time series, where only one signal is used as the input for retraining. This allows for a performance comparison, which informs the significance of the one signal used.
Fig. 6
An example of Retrain with LOO subsets within the context of multivariate time series, where one signal is removed from the input signals, resulting in a LOO subset, which is then retrained for performance comparison. This informs the significance of the signal excluded from the LOO subset.
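Figs. 5 and 6 describe retraining on Singleton and Leave-One-Out (LOO) channel subsets. A minimal sketch of how those subsets could be enumerated before retraining (the channel names and function name are hypothetical):

```python
def feature_subsets(channels):
    """Hypothetical sketch: enumerate the Singleton and Leave-One-Out
    (LOO) channel subsets used for retrain-and-compare ablation.

    Each Singleton subset keeps exactly one channel; each LOO subset
    drops exactly one channel.
    """
    singleton = [[c] for c in channels]
    loo = [[c for c in channels if c != left_out] for left_out in channels]
    return singleton, loo

# Three toy sensor channels:
single, loo = feature_subsets(["accel_x", "accel_y", "gyro_z"])
```

A model retrained on each subset can then be compared against the full-feature baseline to gauge each channel's contribution.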
Fig. 7
Datasets from E1: Simulated Time Series Datasets. (a) Simulated Univariate time series. (b) Simulated Multivariate time series.
Fig. 8
Occupancy Detection Dataset.
Fig. 9
Simulated Univariate Datasets with DeepLIFT's importance estimates: (a) The time series for which (b)'s importance estimates are calculated. (b) DeepLIFT's importance estimate for each class of the time series across 5 runs with different splits. DeepLIFT succinctly captures the characteristic of each class.
Fig. 10
(a) Simulated Univariate Dataset 0 with existing embedded FS NN’s and PIEE’s importance estimate using AHP across 5 runs with different splits: PIEE’s Grad methods’ estimates are able to identify the key periods in differentiating the different classes, whereas others’ estimates are akin to noise. (b) Simulated Univariate Dataset 1 with existing embedded FS NN’s and PIEE’s importance estimate using AHP across 5 runs with different splits: PIEE’s Grad methods’ estimates are able to identify the key periods in differentiating the different classes. PIEE’s Weight-Naive method is able to capture some characteristic of the time series, although its estimate does not line up with the key periods.
Fig. 11
Multivariate Dataset 0 with DeepLIFT's importance estimate: (a) The time series for which (b)'s importance estimates are calculated. (b) DeepLIFT's importance estimate for each class of the time series across 5 runs with different splits. DeepLIFT succinctly captures each class's characteristic for each channel.
Fig. 12
Multivariate Dataset 1 with DeepLIFT's importance estimate: (a) The time series for which (b)'s importance estimates are calculated. (b) DeepLIFT's importance estimate for each class of the time series across 5 runs with different splits. DeepLIFT succinctly captures each class's characteristic for each channel.
Fig. 13
(a) Simulated Multivariate Dataset 0 with adapted existing embedded FS NN’s and PIEE’s importance estimate using EWM across 5 runs with different splits: PIEE’s Weight-Naive and Grad methods unanimously agree on their estimate, which is supported by the ground truth. DF and NFS are unable to do the same. (b) Multivariate Dataset 1 with adapted existing embedded FS NN’s and PIEE’s importance estimate using EWM across 5 runs with different splits: DF, NFS and PIEE’s Weight-Naive would produce high importance for the noise channel whilst PIEE’s Grad methods only produce high importance for the channels with relevant information for classification. This is supported by the ground truth of the simulation.
Fig. 14
Simulated Multivariate Dataset 0 with adapted existing embedded NN's and PIEE's importance estimate using the AHP approach across 5 runs with different splits: PIEE's Grad methods agree on their estimate, which is supported by the ground truth of the simulation.
Fig. 15
Multivariate Dataset 0 with adapted existing embedded NN’s and PIEE’s importance estimate using AHP approach across 5 runs with different splits: PIEE’s Grad methods agree on their estimate, and produce output supported by the ground truth.
Fig. 16
AR EEG dataset with adapted PIEE's importance estimate of time steps using EWM (left), AHP (middle) and DeepLIFT (right) across 5 runs of user-fold split: red highlights periods with a significant difference, green highlights the event-related theta peak. The graphs demonstrate that the PIEE methods, as well as DeepLIFT, peak during the significant periods (red and green).
Fig. 17
AR EEG dataset with DeepLIFT's importance estimate of EEG channels across 5 runs of user-fold split: the dark blue areas represent higher importance estimates than the lighter blue areas, as informed by the colour bar. The graphs show that DeepLIFT favours the right centro-parietal region, similar to the Gradient-based analysis methods in Fig. 18.
Fig. 18
AR EEG dataset with adapted PIEE's importance estimate of EEG channels using EWM (top) and AHP (bottom) across 5 runs of user-fold split: the dark blue areas represent higher importance estimates than the lighter blue areas, as informed by the colour bars. The graphs show that the Gradient-based analysis methods tend to favour the right centro-parietal regions across both adaptations, whereas the Weight-Naive method of the Weight-based analysis is inconclusive across both adaptations.
Fig. 19
Occupancy Detection dataset with adapted existing embedded FS NN’s and PIEE’s importance estimate using EWM and AHP across 5 runs with different splits: Apart from DF, all methods using both approaches agree on Light being an important channel.
Fig. 20
HHAR dataset with adapted existing embedded FS NN’s and PIEE’s importance estimate using EWM and AHP across 5 runs with different splits: Both approaches of PIEE’s Grad methods agree consistently.
