Sensors (Basel). 2017 Nov 6;17(11):2556. doi: 10.3390/s17112556

Deep Recurrent Neural Networks for Human Activity Recognition

Abdulmajid Murad et al.

Abstract

Adopting deep learning methods for human activity recognition has been effective in extracting discriminative features from raw input sequences acquired from body-worn sensors. Although human movements are encoded in a sequence of successive samples in time, typical machine learning methods perform recognition tasks without exploiting the temporal correlations between input data samples. Convolutional neural networks (CNNs) address this issue by using convolutions across a one-dimensional temporal sequence to capture dependencies among input data. However, the size of convolutional kernels restricts the captured range of dependencies between data samples. As a result, typical models are unadaptable to a wide range of activity-recognition configurations and require fixed-length input windows. In this paper, we propose the use of deep recurrent neural networks (DRNNs) for building recognition models that are capable of capturing long-range dependencies in variable-length input sequences. We present unidirectional, bidirectional, and cascaded architectures based on long short-term memory (LSTM) DRNNs and evaluate their effectiveness on various benchmark datasets. Experimental results show that our proposed models outperform methods employing conventional machine learning, such as support vector machines (SVMs) and k-nearest neighbors (KNN). Additionally, the proposed models yield better performance than other deep learning techniques, such as deep belief networks (DBNs) and CNNs.
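The fixed-length windowing that the abstract contrasts with variable-length input could be sketched as follows (a minimal illustration only; the window and step lengths are arbitrary, and `segment_windows` is a hypothetical helper, not code from the paper):

```python
import numpy as np

def segment_windows(signal, window_len, step):
    """Segment a multichannel sensor stream (T x C) into
    fixed-length, possibly overlapping windows."""
    windows = []
    for start in range(0, len(signal) - window_len + 1, step):
        windows.append(signal[start:start + window_len])
    return np.stack(windows)  # shape: (num_windows, window_len, C)

# Example: 10 s of 3-axis accelerometer data at 100 Hz,
# cut into 2 s windows with 50% overlap.
stream = np.random.randn(1000, 3)
wins = segment_windows(stream, window_len=200, step=100)
```

Each resulting window is then a fixed-size input a CNN can consume, whereas the DRNN models proposed in the paper can also process sequences of varying length.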

Keywords: deep learning; human activity recognition; recurrent neural networks.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Schematic diagram of an RNN node, where h_{t-1} is the previous hidden state, x_t is the current input sample, h_t is the current hidden state, y_t is the current output, and an activation function is applied inside the node.
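In the standard notation (writing φ for the node's activation function; the exact parameterization in the paper may differ), the recurrence depicted in Figure 1 is commonly given as:

```latex
h_t = \varphi\left(W_{xh} x_t + W_{hh} h_{t-1} + b_h\right), \qquad
y_t = W_{hy} h_t + b_y
```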
Figure 2
Schematic of the LSTM cell structure with an internal recurrence c_t and an outer recurrence h_t. The cell gates are the input gate i_t, input modulation gate g_t, forget gate f_t, and output gate o_t. In contrast to an RNN node, the current output y_t is considered equal to the current hidden state h_t.
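The gates named in the caption follow the standard LSTM formulation (a conventional writing, with σ the logistic sigmoid and ⊙ element-wise multiplication; the paper's exact weight layout may differ):

```latex
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
g_t = \tanh(W_g x_t + U_g h_{t-1} + b_g) \\
c_t = f_t \odot c_{t-1} + i_t \odot g_t \\
h_t = o_t \odot \tanh(c_t), \qquad y_t = h_t
```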
Figure 3
The proposed HAR architecture. The inputs are raw signals obtained from multimodal sensors, segmented into windows of length T and fed into the LSTM-based DRNN model. The model outputs class prediction scores for each timestep, which are then merged via late fusion and fed into the softmax layer to determine class membership probabilities.
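The late-fusion step described in the caption, merging per-timestep class scores before a final softmax, might look like this (a sketch that assumes score averaging as the fusion rule, which is one common choice; the paper's exact merge operation may differ, and `late_fusion_predict` is a hypothetical helper):

```python
import numpy as np

def softmax(z):
    z = z - z.max()            # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def late_fusion_predict(timestep_scores):
    """timestep_scores: (T, num_classes) raw class scores,
    one row per timestep of the window."""
    fused = timestep_scores.mean(axis=0)   # merge scores over time
    return softmax(fused)                  # class membership probabilities

scores = np.array([[2.0, 0.5, 0.1],
                   [1.5, 0.7, 0.2],
                   [2.5, 0.3, 0.0]])
probs = late_fusion_predict(scores)
```

Averaging before the softmax lets every timestep's evidence contribute to a single per-window prediction.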
Figure 4
Unidirectional LSTM-based DRNN model consisting of an input layer, several hidden layers, and an output layer. The number of hidden layers is a hyperparameter that is tuned during training.
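A toy numpy forward pass of such a stack, where each layer consumes the hidden states of the layer below, could look like this (dimensions, initialization, and the hypothetical helpers `lstm_step` and `run_stack` are illustrative, not the paper's implementation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell update; W, U, b each pack the four gates
    (input, forget, output, modulation) row-wise."""
    z = W @ x + U @ h + b
    H = h.size
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[2*H:3*H])
    g = np.tanh(z[3*H:])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def run_stack(xs, params, hidden):
    """Unidirectional stack: layer l consumes layer l-1's hidden states."""
    seq = xs
    for (W, U, b) in params:
        h, c = np.zeros(hidden), np.zeros(hidden)
        out = []
        for x in seq:
            h, c = lstm_step(x, h, c, W, U, b)
            out.append(h)
        seq = out
    return np.stack(seq)   # (T, hidden): top-layer hidden states

rng = np.random.default_rng(0)
T, in_dim, hidden, layers = 5, 3, 4, 2
params = []
for d in [in_dim] + [hidden] * (layers - 1):
    params.append((rng.normal(size=(4*hidden, d)),
                   rng.normal(size=(4*hidden, hidden)),
                   np.zeros(4*hidden)))
outs = run_stack([rng.normal(size=in_dim) for _ in range(T)], params, hidden)
```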
Figure 5
Bidirectional LSTM-based DRNN model consisting of an input layer, multiple hidden layers, and an output layer. Every layer has a forward track (LSTM_f) and a backward track (LSTM_b), and the number of hidden layers is a hyperparameter that is tuned during training.
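The forward/backward wiring of one such layer can be sketched as follows (to keep the sketch short, a plain tanh recurrence stands in for the LSTM cell; `bidirectional_layer` is a hypothetical helper, and all weights are random):

```python
import numpy as np

def rnn_step(x, h, W, U):
    # Simple tanh recurrence used in place of an LSTM cell
    # so the bidirectional wiring is easy to see.
    return np.tanh(W @ x + U @ h)

def bidirectional_layer(xs, Wf, Uf, Wb, Ub, hidden=4):
    """Forward track reads the window left-to-right, the backward
    track right-to-left; outputs are concatenated per timestep."""
    hf, fwd = np.zeros(hidden), []
    for x in xs:
        hf = rnn_step(x, hf, Wf, Uf)
        fwd.append(hf)
    hb, bwd = np.zeros(hidden), []
    for x in reversed(xs):
        hb = rnn_step(x, hb, Wb, Ub)
        bwd.append(hb)
    bwd.reverse()   # re-align backward states with forward time
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])

rng = np.random.default_rng(1)
xs = [rng.normal(size=3) for _ in range(6)]
Wf, Wb = rng.normal(size=(4, 3)), rng.normal(size=(4, 3))
Uf, Ub = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
out = bidirectional_layer(xs, Wf, Uf, Wb, Ub)
```

Because each timestep's output concatenates both tracks, every prediction can draw on past and future context within the window.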
Figure 6
Cascaded unidirectional and bidirectional LSTM-based DRNN model. The first layer is bidirectional, whereas the upper layers are unidirectional. The number of hidden unidirectional layers is a hyperparameter that is tuned during training.
Figure 7
Accuracy and cost of the unidirectional DRNN model for the USC-HAD dataset over mini-batch training iterations: (a) training and testing accuracies; (b) cross-entropy cost between ground truth labels and predicted labels for both training and testing.
Figure 8
Performance results of the proposed unidirectional DRNN model for the UCI-HAD dataset: (a) Confusion matrix for the test set containing the activity recognition results. The rows represent the true labels and the columns represent the model classification results; (b) Comparative accuracy of the proposed model against other methods.
Figure 9
Performance results of the proposed unidirectional DRNN model for the USC-HAD dataset: (a) Confusion matrix for the test set displaying activity recognition results with per-class precision and recall; (b) Comparative accuracy of the proposed model against other methods.
Figure 10
Performance results of the proposed bidirectional DRNN model for the Opportunity dataset: (a) Confusion matrix for the test set as well as per-class precision and recall results; (b) Comparative F1 score of the proposed model against other methods.
Figure 11
Performance results of the proposed cascaded DRNN model for the Daphnet FOG dataset: (a) Confusion matrix for the test set, along with per-class precision and recall; (b) F1 score of the proposed method in comparison with other methods.
Figure 12
Performance results of the proposed cascaded DRNN model for the Skoda dataset: (a) Confusion matrix for the test set as well as per-class precision and recall results; (b) Comparative accuracy of the proposed model against other methods.
