1 Introduction

Recognizing human activities enables an assessment of human performance and thus of its efficiency in daily life [65]. From this perspective, artificial intelligence (AI) plays an effective role in evaluating and recognizing human activities [38]. In recent years, human activity recognition (HAR) has become a popular research topic because of its importance in several fields, such as sports, healthcare, and fitness [18, 24, 34, 35, 52, 57], human-computer interaction, interactive gaming, smart manufacturing [10], and remote monitoring systems.

In the healthcare field, for example, wearable accelerometers are utilized to evaluate human activity for remote communication between hospitals and patients [50]. However, the low precision of these accelerometers is a challenging issue that has yet to be fully overcome [11, 44]. Many traditional machine learning algorithms have been developed to identify human activity [33, 45, 48, 58]. However, the accuracy of these algorithms remains an issue [23, 27].

The gated recurrent unit (GRU), a deep learning algorithm, offers an effective solution to this low-accuracy issue. The GRU has already proven useful in applications such as digital image processing [51], speech classification [62], and language modeling [31], and it is also applicable to identifying human activities. The motivation for using the GRU is that it tackles the vanishing gradient problem and is well suited to processing time sequences.

In this paper, a GRU deep learning algorithm is proposed to classify human activities, with the objective of increasing the classification accuracy of the GRU algorithm by presenting a hyper-parameter tuning method. In this context, the primary aim of implementing the proposed GRU algorithm is to achieve high accuracy in identifying human activities. In addition, a k-fold cross-validation technique is used to achieve high classification accuracy. The GRU algorithm is trained and tested on the Wireless Sensor Data Mining (WISDM) dataset. Figure 1 shows the framework of the proposed work, i.e., a gated recurrent unit (GRU) that recognizes six human activities: walking, sitting, downstairs, jogging, standing, and upstairs.

Fig. 1

Framework of the proposed work

The main contributions of the paper are as follows:

  • Implementing the GRU algorithm to classify human activities;

  • Achieving maximum testing accuracy of the GRU algorithm with a hyper-parameter tuning method on a central processing unit (CPU);

  • Evaluating the performance of the proposed algorithm using different evaluation metrics with the WISDM dataset;

  • Applying the k-fold cross-validation technique to enhance the performance of the proposed GRU algorithm.

The remainder of the paper is organized as follows: Section 2 reviews the related work. Section 3 presents the theoretical background of the GRU. Section 4 introduces the methodology of the proposed work. Section 5 explains the evaluation metrics used to assess the performance of the proposed algorithm. Section 6 demonstrates the experimental results. Section 7 discusses the results. Conclusions from the proposed work are covered in Section 8.

2 Related work

In the literature, several deep learning algorithms have been presented for the recognition of human activities. In [21], Hammarela et al. proposed a bi-directional long short-term memory (LSTM) algorithm to identify a large number of human activities. The authors used inertial sensors to pick up humans’ hand signals. This algorithm was trained on the Opportunity dataset [12] and achieved an F1-score of 92.7%. In [42], Pienaar and Malekian developed an LSTM algorithm for HAR that used a regularization method to streamline the computations for the large WISDM dataset [32] and reached a maximum accuracy of 94%. However, the performance of this work was assessed via only two evaluation metrics, namely the learning curve and the confusion matrix. Cruciani et al. [17] implemented a convolutional neural network (CNN) algorithm for HAR that was applied to the UCI-HAR dataset available in [9] with an overall accuracy of 91.98%. This algorithm was assessed using a variety of evaluation metrics. Ordonez and Roggen [40] proposed a ConvLSTM algorithm to classify human activities. The authors utilized several inertial measurement units (IMUs) and accelerometers. The algorithm achieved an F1-score of 95.8% and classified five activities using the Skoda dataset [49]. In [59], Xia et al. introduced an LSTM-CNN algorithm for daily life activities. This algorithm was also trained on the WISDM dataset [32] and reached an accuracy of 95.85%. However, the computational time consumed in the training process was considerable.

Alani et al. [2] presented CNN, LSTM, and CNN-LSTM algorithms to recognize imbalanced data for HAR. These algorithms were trained on the SPHERE dataset [54], which contains twenty different human activities, and reached accuracies of 92.98%, 93.55%, and 93.67%, respectively. The algorithms had limited performance and were evaluated using a single metric. In [8], Alzantot et al. proposed an LSTM algorithm to identify human activities that was utilized to differentiate between real and synthesized data; however, the accuracy achieved by this algorithm was quite low, and its many training layers led to architectural complexity.

Researchers [7, 37, 47, 63] have introduced CNN and LSTM deep learning algorithms to identify human activities of daily living. Shakya et al. [47] presented recurrent neural network (RNN) and CNN algorithms, utilizing the Shoaib SA and Actitracker datasets, which were partitioned randomly, and reached accuracies of 81.74% and 92.22%, respectively. Mekruksavanich et al. [37] also proposed an LSTM algorithm, achieving an overall accuracy of 96.2% and an F1-score of 96.3%. In [7], Alsheikh et al. achieved an overall accuracy of 86.6%, but did not specify which dataset was utilized for testing. An LSTM architecture was introduced in [63], which achieved a maximum accuracy of 92.1% on an unknown test dataset split.

Agarwal et al. [1] applied an RNN-LSTM algorithm to classify human activity, utilizing the WISDM dataset. The authors used only two metrics for the performance assessment of the RNN-LSTM algorithm and achieved a total accuracy of 95.78%. Zhao et al. [66] presented a bi-directional LSTM algorithm to recognize human activities, for which a number of sensors were utilized to gather the datasets. The primary disadvantage of this work was the long time consumed in the training process, which made it unsuitable for real-time applications. Cipolla et al. [16] implemented an LSTM algorithm to recognize human activities using the SPHERE dataset. The proposed algorithm demonstrated a strong ability to deal with unbalanced data; it achieved a classification accuracy of 83.2% and was applied to five activities.

In recent years, CNNs have gained considerable attention and are often utilized in fields such as text analysis [19], image classification [53], and natural language processing [64]. In [26], Ignatov trained a CNN algorithm for HAR, which achieved accuracies of 90.42% and 93.32% on the testing and training datasets, respectively. Xu et al. [61] applied a CNN algorithm to a randomly selected 70% of a dataset and then utilized the algorithm to assess the remaining 30%, achieving a maximum accuracy of 91.97%. Huang et al. [25] proposed an architecture of two sequential CNNs and utilized a cross-validation technique to achieve an F1-score of 84.6%. Although a number of researchers have implemented accurate algorithms to recognize human activities, there is still a wide margin for improvement. In particular, previous studies did not apply a hyper-parameter tuning method to achieve high accuracy in recognizing human activities.

3 Theoretical background of the gated recurrent unit (GRU)

One of the popular deep learning approaches is the gated recurrent unit (GRU) [14], which is a special type of recurrent neural network (RNN). RNNs suffer from the vanishing gradient problem [20, 46], which the GRU was created to tackle, and the GRU is well suited to processing time sequences. Unlike LSTM layers, GRU layers do not have memory blocks recurrently linked in a memory cell. The architecture of a GRU cell is demonstrated in Fig. 2; it comprises two gates, a reset gate and an update gate [15], which reject or accept information passing across the cell. In this figure, the reset gate decides how much of the past information to forget. This decision is executed via a sigmoid activation function (σ), whose output is rt. If the sigmoid output is 1, the data is passed through the GRU cell; if the sigmoid output is 0, the data cannot pass. The inputs of the reset gate are the present input xt and the previous hidden state ht-1. The update gate decides what information will be updated and passed along to the future state, so it depends on a very simple part of the previous state. The update gate likewise includes a sigmoid activation function for updating the state; this function has a range from 0 to 1, and its output is zt. This output is multiplied with the output of a tanh function, which yields the candidate state ĥt; the tanh has values from −1 to 1. The result of this multiplication, zt × ĥt, is added to (1 − zt) × ht-1 to produce ht, the present hidden state output of the cell. The reset gate and update gate are calculated from Eqs. (1) and (2), respectively, where Wr and Wz represent the weights of the reset gate and update gate; these weights are learned in the training phase. Further, the ht of the GRU cell is determined using Eq. (3) [13].

Fig. 2

The architecture of the gated recurrent unit (GRU) cell

$${r}_t=\sigma \left({W}_r\left[{h}_{t-1},{x}_t\right]\right)$$
(1)
$${z}_t=\sigma \left({W}_z\left[{h}_{t-1},{x}_t\right]\right)$$
(2)
$${h}_t=\left(1-{z}_t\right)\times {h}_{t-1}+{z}_t\times {\hat{h}}_t$$
(3)
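
For completeness, the candidate state ĥt referenced above can be written out as well. In the standard GRU formulation [14], it is produced from the reset-gated previous state; W here denotes the candidate-state weight matrix, which the text does not name explicitly:

$${\hat{h}}_t=\tanh \left(W\left[{r}_t\times {h}_{t-1},{x}_t\right]\right)$$

The following minimal NumPy sketch illustrates how Eqs. (1)–(3) and the candidate state combine in a single GRU time step. It is an illustrative re-implementation under these standard assumptions, not the TensorFlow code used in this work, and the dimensions are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU time step following Eqs. (1)-(3).

    x_t    : input at time t, shape (input_dim,)
    h_prev : previous hidden state h_{t-1}, shape (hidden_dim,)
    W_r, W_z, W_h : weights, each of shape (hidden_dim, hidden_dim + input_dim)
    (biases are omitted for brevity, matching the paper's equations)
    """
    concat = np.concatenate([h_prev, x_t])           # [h_{t-1}, x_t]
    r_t = sigmoid(W_r @ concat)                      # Eq. (1): reset gate
    z_t = sigmoid(W_z @ concat)                      # Eq. (2): update gate
    # Candidate state from the reset-gated previous state (standard GRU form).
    h_hat = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]))
    return (1.0 - z_t) * h_prev + z_t * h_hat        # Eq. (3): new hidden state

# Toy usage with hypothetical sizes: 3 input features, 32 hidden units.
rng = np.random.default_rng(0)
W_r, W_z, W_h = (rng.normal(scale=0.1, size=(32, 35)) for _ in range(3))
h_t = gru_step(rng.normal(size=3), np.zeros(32), W_r, W_z, W_h)
print(h_t.shape)  # (32,)
```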

4 Methodology

The GRU algorithm is developed to recognize human activities. The proposed GRU algorithm is chosen due to its high performance; the GRU is well suited to processing time-sequence data [29]. One advantage of the GRU is that it needs little computational time and thus trains quickly [22]. The GRU is a memory extension of the recurrent neural network (RNN); accordingly, another advantage of the GRU is that it avoids the vanishing gradient issue [36].

4.1 Gated recurrent unit (GRU) algorithm

Figure 3 illustrates the architecture of the proposed GRU algorithm, which is implemented via the TensorFlow framework. It comprises an input layer, two GRU layers, and an output layer. The input layer includes three features and 90 time steps with a number of samples. The features are ax, ay, and az, where ax represents the acceleration in the x-axis, while ay and az are the accelerations in the y-axis and the z-axis, respectively. The two GRU layers are used to capture the temporal features of the data sequence; they are stacked to add depth to the algorithm and thereby improve its stability and accuracy. Each GRU layer has 32 hidden units [55] and utilizes a rectified linear unit (ReLU) activation function to enhance the robustness of the GRU algorithm. The output layer has six neurons with a softmax activation function to determine the six classes. In this algorithm, the training epochs are set to 50 with a batch size of 64, and the learning rate is initialized to 0.0025; this rate controls the training speed of the algorithm. Additionally, the utilized optimizer is Adam, which computes the proper weights for the GRU algorithm, avoids errors, and increases the training accuracy [30]. Further, a regularization technique based on a cross-entropy loss function is implemented to prevent the GRU algorithm from over-fitting [60].
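
A minimal TensorFlow/Keras sketch matching this description is given below. It is reconstructed from the parameters stated in the text (90 time steps, three features, two stacked 32-unit GRU layers with ReLU activations, a six-way softmax output, the Adam optimizer with a learning rate of 0.0025, and a cross-entropy loss); any detail beyond those, such as the sparse (integer) label encoding, is an assumption.

```python
import tensorflow as tf

NUM_TIMESTEPS, NUM_FEATURES, NUM_CLASSES = 90, 3, 6

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_TIMESTEPS, NUM_FEATURES)),  # (90, 3) windows
    # First GRU layer returns the full sequence so the second can stack on it.
    tf.keras.layers.GRU(32, activation="relu", return_sequences=True),
    # Second GRU layer returns only its final hidden state.
    tf.keras.layers.GRU(32, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0025),
    # Cross-entropy loss; the "sparse" variant assumes integer class labels.
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.summary()
```

With this configuration, training with the stated settings would be invoked as `model.fit(X_train, y_train, epochs=50, batch_size=64)`.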

Fig. 3

The architecture of the proposed GRU algorithm

4.2 Dataset description

The WISDM dataset utilized to classify human activities is available in [28]. It contains 1,098,207 samples of different human activities, including walking, sitting, downstairs, jogging, standing, and upstairs; the sample percentages for these activities are 38.6%, 5.5%, 9.1%, 31.2%, 4.4%, and 11.2%, respectively. The WISDM dataset was gathered from 36 individuals using a mobile phone with an internal accelerometer sensor positioned in a front trouser pocket. The readings of the WISDM dataset were recorded at a 20 Hz sampling frequency. The WISDM dataset is based on six attributes with information referring to the human activities: user, activity, timestamp, and the x-, y-, and z-accelerations. The WISDM dataset is split into a training set (80%) and a testing set (20%); the testing set is utilized to assess the proposed GRU algorithm.

The WISDM dataset is selected because it includes a variety of daily life activities, and its overall size of 1,098,207 samples provides sufficient data to train the proposed GRU algorithm. Using the Scikit-learn framework, the WISDM dataset is divided into 20% for testing the GRU algorithm and 80% for training it. A probability sampling approach is used to randomly distribute the dataset, which provides a strong training phase for the GRU algorithm and thus improves its performance [41].
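
A sketch of this preparation step is given below, assuming the raw accelerometer stream has already been segmented into fixed windows of 90 readings (the window length named in Section 4.1). The segmentation helper, the 50% window overlap, and the variable names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def make_windows(signal, labels, window=90, step=45):
    """Segment an (N, 3) accelerometer stream into (window, 3) examples.

    Each window is labeled with the majority activity it contains.
    The 50% overlap (step=45) is an assumption, not stated in the paper.
    """
    X, y = [], []
    for start in range(0, len(signal) - window + 1, step):
        X.append(signal[start:start + window])
        values, counts = np.unique(labels[start:start + window],
                                   return_counts=True)
        y.append(values[np.argmax(counts)])
    return np.asarray(X), np.asarray(y)

# signal: (1_098_207, 3) array of [ax, ay, az]; activity: matching labels.
# X, y = make_windows(signal, activity)
# Random 80/20 split via scikit-learn, as described above:
# X_train, X_test, y_train, y_test = train_test_split(
#     X, y, test_size=0.2, random_state=42, shuffle=True)
```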

Figure 4 illustrates a sample of the acceleration data gathered from one person over a period of time. The figure shows three acceleration signals: ax, ay, and az, each expressed as a function of the gravitational acceleration (g). The blue signal represents the acceleration in the x-direction, ax; the green and orange signals indicate ay and az, the accelerations in the y- and z-directions, respectively. The acceleration signals are recorded for 22,000 seconds, and the three signals have amplitudes ranging from −g to g.

Fig. 4

A sample of the acceleration detected for one person

5 Evaluation metrics

Several evaluation metrics are used to assess the performance of the proposed GRU algorithm [56], for instance accuracy, sensitivity, precision, and F1-score. These metrics depend on the statistical values obtained from a confusion matrix: False Negative (FN), True Negative (TN), False Positive (FP), and True Positive (TP). A TP is an output in which the algorithm correctly predicts the positive class, and a TN is an output in which the algorithm correctly predicts the negative class. An FP occurs when the algorithm wrongly predicts the positive class, and an FN occurs when the algorithm wrongly predicts the negative class. The confusion matrix is also utilized to gauge the ability of an algorithm to classify multiple classes correctly; for the GRU algorithm it is computed on the testing dataset by comparing each true label with the predicted label.

The Accuracy metric is the ratio of true predictions to the overall number of predictions on the testing dataset. The accuracy is determined via Eq. (4).

$$Accuracy=\frac{TP+TN}{TP+TN+FP+FN}$$
(4)

The Sensitivity metric represents the ratio of correctly classified positives of a particular class to the overall number of true instances of that class in the testing dataset. Sensitivity is also called Recall and can be determined from Eq. (5).

$$Sensitivity=\frac{TP}{TP+FN}$$
(5)

The Precision metric is the ratio of true predictions of a certain class activity to the overall number of predictions of the same class in the testing dataset. It is calculated via Eq. (6).

$$Precision=\frac{TP}{TP+FP}$$
(6)

The F1-score metric is the harmonic mean of the sensitivity and precision, as calculated from Eq. (7). This metric is also called the balanced F1-Measure. Because the F1-score takes both false negatives (FN) and false positives (FP) into account, it is a more informative metric for assessment than accuracy alone. The best possible value for all of the evaluation metrics mentioned above is 1, and the worst value is 0.

$$F1\text{-}score=2\times \frac{Precision\times Recall}{Precision+Recall}$$
(7)

Also, the area under the ROC curve (AUC) is utilized as an assessment metric for the GRU algorithm. The AUC is calculated by integrating the true positive rate (TPR) with respect to the false positive rate (FPR); see Eq. (8). The possible AUC values are always between 0 and 1. High AUC values imply that an algorithm is capable of differentiating among classes of human activity; thus, an algorithm with a large area under the ROC curve is a high-performance algorithm for classifying human activities.

$$AUC={\int}_0^1 TPR\ d(FPR)$$
(8)

The FPR and TPR are determined from Eqs. (9) and (10). The FPR is the ratio of the number of false positives (FP) to the sum of true negatives and false positives (TN + FP); it is also called 1 − Specificity. The TPR is the ratio of the number of true positives (TP) to the sum of false negatives and true positives (FN + TP).

$$FPR=\frac{FP}{TN+FP}$$
(9)
$$TPR=\frac{TP}{FN+TP}$$
(10)

The evaluation metrics used (accuracy, sensitivity, precision, F1-score, and AUC) are selected because of their effectiveness in assessing and analyzing the performance of the proposed GRU algorithm [3,4,5,6, 39, 43].
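
As an illustration of how these metrics are obtained in practice, the sketch below computes them with scikit-learn from a trained model's predictions. The variable names (`model`, `X_test`, `y_test`) are carried over from the hypothetical preprocessing and architecture sketches above.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)

# Predicted class probabilities, shape (n_samples, 6).
y_prob = model.predict(X_test)
y_pred = np.argmax(y_prob, axis=1)

print(accuracy_score(y_test, y_pred))                  # Eq. (4)
# Per-class precision, sensitivity (recall), and F1-score, Eqs. (5)-(7).
print(classification_report(y_test, y_pred, digits=4))
print(confusion_matrix(y_test, y_pred))                # TP/TN/FP/FN counts
# One-vs-rest AUC per Eq. (8), averaged over the six classes.
print(roc_auc_score(y_test, y_prob, multi_class="ovr"))
```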

6 Experimental results

The proposed GRU algorithm is implemented experimentally via the Python programming language in a Spyder environment. The algorithm is executed on a personal computer (PC) with a Windows 7 operating system, an Intel Core2Duo processor with two central processing units running at 3.3 GHz, and 4 GB of RAM. Figure 5 demonstrates the testing and training accuracy curves of the GRU algorithm; the training curve is also known as the learning curve. The brown curve shows the testing accuracy, which starts from 88.40% and reaches 97.08% after 50 epochs. The red curve represents the training accuracy, which changes continuously and reaches a maximum value of 97.56% after 50 epochs.
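
A sketch of the training call that would produce such curves is shown below, continuing the hypothetical variable names from Section 4; Keras records the per-epoch accuracy and loss in the returned `history` object, from which plots like Figs. 5 and 6 can be drawn.

```python
import matplotlib.pyplot as plt

# Train for 50 epochs with batch size 64, tracking the held-out test set.
history = model.fit(X_train, y_train,
                    epochs=50, batch_size=64,
                    validation_data=(X_test, y_test))

plt.plot(history.history["accuracy"], label="training accuracy")
plt.plot(history.history["val_accuracy"], label="testing accuracy")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.show()
```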

Fig. 5

Accuracy curves of the training and testing for the GRU algorithm

Figure 6 demonstrates the loss-rate curves of the GRU algorithm for the testing and training sets. The loss rate continuously declines as the number of training epochs increases. For the training set, the loss rate reaches 0.204 after 50 epochs. For the testing set, the loss rate starts at 0.645 and decreases to 0.221.

Fig. 6

Loss curves of the training and testing for the GRU algorithm

The confusion matrix obtained from the GRU algorithm is depicted in Fig. 7, demonstrating that 10,662 instances in the testing dataset are correctly classified. The matrix compares the predicted labels with the true labels of the testing dataset. The diagonal values of the matrix indicate correct classifications, while the values below and above the diagonal represent the errors that occurred. The matrix records 916, 3444, 586, 479, 1078, and 4159 true positives for the six human activities: downstairs, jogging, sitting, standing, upstairs, and walking, respectively.

Fig. 7

The confusion matrix for the GRU algorithm

The normalized confusion matrix for the GRU algorithm is presented in Fig. 8. It is clear that the GRU algorithm performs very well, with few errors occurring above and below the diagonal of the normalized confusion matrix. It achieves a classification accuracy of 0.90 for the downstairs class, and accuracies of 0.99, 0.98, 0.99, 0.92, and 0.98 for the jogging, sitting, standing, upstairs, and walking classes, respectively.

Fig. 8

The normalized confusion matrix for the GRU algorithm

Table 1 introduces a classification report of the GRU algorithm in terms of the Precision, Sensitivity, and F1-score, to analyze its performance with the WISDM dataset. The average Precision, Sensitivity, and F1-score are 97.11%, 97.09%, and 97.10%, respectively.

Table 1 Classification report of the GRU using the WISDM dataset

ROC curves can be considered probability curves utilized to assess the GRU algorithm; they plot the true positive rate (TPR) on the ordinate against the false positive rate (FPR) on the abscissa for threshold values from 0.0 to 1.0. The ROC curves for the GRU are illustrated in Fig. 9, and the AUC values for all six activities are 1.00. This result indicates that the GRU algorithm achieves high performance.

Fig. 9

ROC curves for the GRU algorithm

Figure 10 shows the precision-recall (PR) curves of the GRU algorithm for each activity. Downstairs ("class 0") achieves an area value of 0.955, while jogging ("class 1") has an area of 0.999. Sitting ("class 2"), standing ("class 3"), upstairs ("class 4"), and walking ("class 5") achieve area values of 0.996, 0.992, 0.970, and 0.998, respectively. Overall, the area under the micro-average PR curve is 0.995. The average value is computed by summing the areas of all six classes and dividing by the number of classes (six in this work).

Fig. 10

PR curves for the GRU algorithm

PR and ROC curves are robust evaluation metrics that incorporate all of the statistical values (TN, TP, FN, and FP), and both are utilized in the assessment of the proposed GRU algorithm. Additionally, both curves enable an instant and easy visual diagnosis of the algorithm’s behavior. These curves also depend on the area factor: a larger area indicates a more useful test, and the areas under the PR and ROC curves are utilized to compare the usefulness of tests.
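
Both curve families can be generated per class with scikit-learn, as in the hedged sketch below; a one-vs-rest binarization of the six labels is assumed, and the variable names (`y_test`, `y_prob`) follow the earlier sketches.

```python
from sklearn.metrics import auc, precision_recall_curve, roc_curve
from sklearn.preprocessing import label_binarize

classes = list(range(6))
y_bin = label_binarize(y_test, classes=classes)  # one-vs-rest labels

for c in classes:
    # ROC curve from the per-threshold FPR and TPR, Eqs. (9)-(10),
    # then the area under it per Eq. (8).
    fpr, tpr, _ = roc_curve(y_bin[:, c], y_prob[:, c])
    print(f"class {c}: ROC AUC = {auc(fpr, tpr):.3f}")
    # PR curve and the area under it, as reported for Fig. 10.
    prec, rec, _ = precision_recall_curve(y_bin[:, c], y_prob[:, c])
    print(f"class {c}: PR AUC  = {auc(rec, prec):.3f}")
```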

The performance of the proposed GRU algorithm is also assessed via the k-fold cross-validation technique. In this technique, the WISDM dataset is partitioned into k equal parts, where k = 5 in this work (one part is used for validation and four parts for training). The proposed GRU algorithm is thus trained five times on different partitions of the WISDM dataset. The mean accuracy of the GRU is 97.97% with a standard deviation of ±0.47%, while the average sensitivity is 97.92% with a standard deviation of ±0.58%. Similarly, the mean precision is 97.96% with a ±0.52% standard deviation, and the average F1-score is 97.91% with a ±0.42% standard deviation. Thereby, the proposed GRU algorithm has high performance and a low standard deviation. Furthermore, the k-fold cross-validation technique enhances the performance assessment and avoids biased performance results through a proper division of the training and testing data.
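
A sketch of this 5-fold procedure with scikit-learn's `KFold` is shown below; `build_model()` stands in for the Keras construction from Section 4.1 and is a hypothetical helper, as are the full arrays `X` and `y`.

```python
import numpy as np
from sklearn.model_selection import KFold

scores = []
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, val_idx in kfold.split(X):
    model = build_model()  # rebuild the two-layer GRU for each fold
    model.fit(X[train_idx], y[train_idx],
              epochs=50, batch_size=64, verbose=0)
    _, acc = model.evaluate(X[val_idx], y[val_idx], verbose=0)
    scores.append(acc)

# Report the mean accuracy and standard deviation across the five folds.
print(f"{np.mean(scores):.4f} +/- {np.std(scores):.4f}")
```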

7 Discussion

The results illustrate that the proposed GRU algorithm achieves high accuracy and a low loss rate for both testing and training. Results from the normalized confusion matrix and PR curves show that accurate prediction of human activities can be achieved with the proposed GRU algorithm. Figure 5 illustrates the performance of the GRU algorithm in terms of testing and training accuracy on the WISDM dataset. The distributions of error percentages for the classes are shown in Figs. 7 and 8.

The micro-average PR curve has an area of 99.5% for the GRU algorithm. Also, for all six activities, the AUC is 100%. Further, the technique of k-fold cross-validation is used to assess the performance results for the GRU algorithm in terms of the average sensitivity, precision, accuracy, and F1-score at k = 5.

Finally, Table 2 and Fig. 11 present a comparison of the accuracy between this work and previously published works. The accuracy achieved by the GRU algorithm, 97.08%, is better than that of the algorithms in [17, 21, 40, 42, 59].

Table 2 Comparison of the accuracy between this work and previous works
Fig. 11

Comparison of the accuracy between this work and previous works

The better performance of the GRU algorithm stems from the accurate tuning of its hyper-parameters, which include the optimizer type, the activation and loss functions, the dropout rate, the batch size, the learning rate, the number of epochs, and the number of neurons in each layer of the proposed GRU algorithm.

In particular, when the learning rate, batch size, and number of epochs were set to 0.0025, 32, and 10, respectively, with a softmax activation function, the GRU algorithm reached a testing accuracy of 93.87%. When the batch size was adjusted to 128, the learning rate changed to 0.001, and the epochs reconfigured to 5 with the same activation function, the GRU algorithm's testing accuracy was 92.32%.

Thus, the proper settings of these hyper-parameters significantly improve the results. The best performance of the GRU algorithm is achieved when the hyper-parameters are set to a learning rate of 0.0025, 50 epochs, and a batch size of 64, with the Adam optimizer and a regularization method based on a cross-entropy loss function.

The hyper-parameters of the GRU algorithm are tuned via the GridSearchCV technique, which automatically searches for the hyper-parameter values that achieve the best performance of the proposed GRU algorithm.
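
A hedged sketch of such a search is given below, using scikeras's `KerasClassifier` wrapper as one common way to expose a Keras model to scikit-learn's `GridSearchCV`; the paper does not state which wrapper was used, and the searched grid here is only illustrative (drawn from the settings mentioned above).

```python
import tensorflow as tf
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import GridSearchCV

def build_model(learning_rate=0.0025):
    # Reuse the two-layer GRU architecture from Section 4.1.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(90, 3)),
        tf.keras.layers.GRU(32, activation="relu", return_sequences=True),
        tf.keras.layers.GRU(32, activation="relu"),
        tf.keras.layers.Dense(6, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

clf = KerasClassifier(model=build_model, verbose=0)
param_grid = {
    "model__learning_rate": [0.001, 0.0025],
    "batch_size": [32, 64, 128],
    "epochs": [5, 10, 50],
}
search = GridSearchCV(clf, param_grid, cv=5, scoring="accuracy")
# search.fit(X, y); print(search.best_params_, search.best_score_)
```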

8 Conclusion

This paper proposes an architecture of the GRU algorithm for the classification of daily human activities. The main aim of this paper is to achieve maximum testing accuracy of the GRU algorithm with a hyper-parameter tuning method on a central processing unit (CPU). The performance of the proposed algorithm is evaluated using different evaluation metrics with the WISDM dataset. Experiments carried out to maximize the testing accuracy of the proposed algorithm yielded a testing accuracy of 97.08%, a testing loss rate of 0.221, a training accuracy of 97.56%, and a training loss rate of 0.204.

The normalized confusion matrix, precision, sensitivity, F1-score, and ROC curves are computed to assess the algorithm's performance. The GRU algorithm achieved a sensitivity of 97.09%, a precision of 97.11%, and an F1-score of 97.10%. The micro-average area under the PR curve is 99.5%. Finally, the GRU algorithm achieved an AUC of 100% for all classes.

The hyper-parameters of the GRU algorithm, e.g., the batch size, the optimizer type, the dropout rate, the number of training epochs, the learning rate, the activation and loss functions, and the number of neurons in the layers of the proposed GRU algorithm, are found to significantly affect its accuracy, and the high accuracy achieved is strongly correlated with the optimal settings applied.

In addition, the performance of the GRU algorithm is assessed in terms of accuracy, sensitivity, precision, and F1-score via the k-fold cross-validation technique.

In the future, the proposed GRU algorithm can be trained on different datasets, and its performance can be compared against execution in a graphics processing unit (GPU) environment. Other deep learning algorithms, such as long short-term memory (LSTM), could also be implemented.