1 Introduction

Human beings are physically active by nature and perform a wide range of activities in their day-to-day lives. In the age of the Internet of Things (IoT) and the Internet of Everything (IoE), rapid advances in sensing technologies and computational methods have made it possible to study many aspects of human existence with intelligent, automated decision-making. The recognition of human activities using smart, automated, unobtrusive systems is one such popular research domain, and it still poses several unsolved challenges.

Human activity recognition involves several distinct challenges, such as the identification of specific activities like walking, sitting, standing, and falling [1,2,3,4], identification of gait-related parameters [5,6,7,8], estimation of disease onset [9,10,11,12], gesture recognition and biometrics [13,14,15,16], etc. These challenges are mostly addressed using data gathered from Inertial Measurement Unit (IMU) sensors, which are a common choice for the development of such systems. On the machine intelligence side, researchers use various machine learning algorithms to develop automated decision-making systems, namely, Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbors (kNN), Artificial Neural Networks (ANN), Deep Neural Networks (DNN), etc.

The current work addresses one such challenge of human activity recognition using conventional machine intelligence techniques. In particular, human walking speed is classified through machine learning techniques. The purpose of this study is to design an affordable technique that will help in tracking mentally incompetent individuals, as well as in childcare, pediatric orthopedics, wildlife monitoring, etc. Ten subjects of different physiques were selected. The subjects were asked to walk, with a smartphone tied steadily to their waist, at three different speeds (slow, moderate, and fast) according to their own ability. Smartphone acceleration data from these walks were recorded online. These data were later labeled and fed into different classifiers as features to identify the walking speed in each case. The use of a curated primary dataset and only three accelerometer features constitutes the novelty of this work.

The remainder of this paper is organized as follows. Section 2 discusses some of the pertinent state-of-the-art works that address the same problem, followed by relevant observations in Sect. 2.1. Section 3 highlights the motivation for the proposed work, and Sect. 4 presents the problem formulation and contributions. The data used in this work is described in Sect. 5. Section 6 discusses the experimental setup, including the learning models and evaluation metrics used. The classification results are discussed and analyzed in Sect. 7. Concluding remarks are offered in Sect. 8.

2 Literature Survey

This section briefly reviews related works on human activity recognition, with a specific focus on human gait and walking speed detection using IMU sensor data.

In 2013, Zhang et al. [17] proposed the identification of human single stride and walking style based on walking data recorded at different speeds. The speeds were not manipulated or regulated during data acquisition. In 2015, Lee et al. [18] developed a system to identify the activities of walking, running, and jumping using pre-defined speeds on a treadmill. Cuzzolin et al. in 2017 [19] proposed the detection of Parkinson’s disease with the aid of Hidden Markov Model (HMM) representations of walking data. In 2019, Mahoney et al. [20] developed a system of 3 IMU sensors to identify the step type (walking, running, or jogging) from activities performed on a treadmill at constant speeds. In the same year, Dauriac et al. [21] proposed a system to determine the walking speed of people with transfemoral amputation based on IMU sensors embedded in their prosthetic lower limbs. Also in 2019, Byun et al. [22] estimated the walking speed of older adults using their proposed system.

In 2020, Paraschiv-Ionescu et al. [23] proposed a system for the identification of walking speed in gait-impaired and mobility-affected patients using a system of 4 IMU sensors. Also, Chen et al. [24] developed a method of identifying the different types of walking patterns (level, slope, stairs) with the aid of a shoe-fitted sensor setup. In the same year, Baroudi et al. [25] developed a system for determining walking speed with bio-logging-based high-resolution and lightweight sensors. For comparison between a Baropodometric Platform and IMU data, walk data at normal and slow speeds was captured by Correale et al. in 2020 [26].

In 2021, Revi et al. [27] suggested a system to identify the walking speed of healthy and hemiparetic people with the aid of a thigh-worn IMU sensor setup. Lueken et al. [28] proposed the detection of artificially induced neuro-degenerative and musculoskeletal diseases through walking data of increasing speeds in 2022. Khalili et al. [29], in the same year, developed a system based on smartphone sensors to determine different walking activities while using their smartphones, such as calling, texting, and swinging. In 2022 and 2023, Chakraborty and Mukherjee developed a system to distinguish between actual walking activity and an imitated or fake walking activity based on IMU signals captured from lower limb movements [30, 31].

2.1 Observations

The study of the state-of-the-art works in this domain shows that using a large number of features in gait analysis not only complicates the machine learning algorithms but also makes them difficult to use for long-duration analysis of time series data.

3 Motivation

With the advent of low-cost wearable devices, IMU-based data collection and analysis for walking activity detection has become a popular research interest. However, walking activity data recorded at different unregulated speeds has not been used previously. Each person has their own perception of speed, so what counts as a slow or fast walk differs from person to person depending on their physique. This introduces a novel challenge and is the motivation for the current work.

4 Problem Formulation and Contributions

The present work uses only accelerometer data to capture the human walking action and categorizes it into different speeds, along with the action of sitting, with the aim of developing a cost-effective automated system that identifies these activities efficiently.

5 Data Description

The current work uses a primary dataset that was recorded online. A standard off-the-shelf smartphone (Samsung Galaxy M31 series) was used for this purpose. Ten volunteers of different physiques were asked to perform 4 different types of activity: fast walk (class 0), moderate walk (class 1), slow walk (class 2), and sitting (class 3). Each volunteer was asked to perform the activity for a duration of up to 10 min. During the activity sequence, the smartphone was strapped to the front of the torso near the waist in an upright, front-facing position, as shown in Fig. 1. The 3-axis Inertial Measurement Unit (IMU) sensor data was gathered from the smartphone. The ground truth was recorded during the activities and used to label the final dataset for supervised learning.

Fig. 1 Smartphone strapped in an upright, front-facing position at the waist of one of the subjects

The accelerometer in this phone has a measurement range of \(\pm 8g\), and the gyroscope, which senses rotation, has a maximum range of 17.45326 rad/s (approximately 1000°/s). Slow, moderate, and fast walks are designated based on the average speeds [32, 33] taken as follows:

  1. Slow walk: \(< 1\ \text{m/s}\)
  2. Moderate walk: \(< 1.45\ \text{m/s}\)
  3. Fast walk: \(> 1.5\ \text{m/s}\)
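As an illustration of how these thresholds could be applied when labeling a walk by its average speed, a minimal sketch follows. The function name `speed_to_class` is hypothetical and not part of the original work. The sketch assumes that the moderate range begins where the slow range ends (1 m/s), and it returns `None` for speeds between 1.45 and 1.5 m/s, which the listed thresholds leave unassigned. Class ids follow the numbering used in this section.

```python
from typing import Optional

# Class ids as numbered in this section: fast = 0, moderate = 1, slow = 2.
FAST, MODERATE, SLOW = 0, 1, 2


def speed_to_class(speed_mps: float) -> Optional[int]:
    """Map an average walking speed in m/s to a class id.

    Returns None when the speed falls in the range that the listed
    thresholds do not cover (1.45 m/s up to and including 1.5 m/s).
    """
    if speed_mps < 0:
        raise ValueError("speed must be non-negative")
    if speed_mps < 1.0:
        return SLOW
    if speed_mps < 1.45:  # assumption: moderate starts at 1.0 m/s
        return MODERATE
    if speed_mps > 1.5:
        return FAST
    return None  # not covered by the thresholds as stated


for v in (0.8, 1.2, 1.47, 1.8):
    print(v, speed_to_class(v))
```

In the dataset itself the labels come from the recorded ground truth, so a mapping of this kind would only serve as a consistency check.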

Table 1 provides a tabular summary of the dataset used in this work.

Table 1 Summary of the dataset

6 Proposed Solution and Experimental Setup

The experiments are modeled as supervised learning (classification) tasks using 5 conventional machine learning algorithms, namely Decision Tree (DT), Logistic Regression (LR), Support Vector Machine (SVM), Gaussian Naive Bayes (GNB), and k-Nearest Neighbors (kNN). Because the system is meant to run on long-duration time series data and uses only three features for classification, neural networks are not evaluated in this work, in order to keep the overall complexity of the system low [34, 35]. Each algorithm was trained on a randomly chosen 70% of the dataset, and the remaining 30% was used to test the performance of the classification algorithms.
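The random 70/30 split and one of the listed classifiers (kNN) can be sketched with the Python standard library alone. This is an illustrative sketch, not the implementation used in this work: the data are synthetic stand-ins (one Gaussian blob of three accelerometer-like features per class), and the helper names, the seed values, and the choice of k = 3 are all assumptions.

```python
import math
import random
from collections import Counter


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, features, k=3):
    """Majority vote among the k training rows closest to `features`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def make_toy_rows(n_per_class=50, seed=1):
    """Synthetic (ax, ay, az) samples; NOT the dataset described in Sect. 5."""
    rng = random.Random(seed)
    rows = []
    for label in range(4):
        centre = (3.0 * label, 9.8, 0.0)  # arbitrary, well-separated centres
        for _ in range(n_per_class):
            point = tuple(c + rng.gauss(0.0, 0.3) for c in centre)
            rows.append((point, label))
    return rows


rows = make_toy_rows()
train, test = train_test_split(rows)
correct = sum(knn_predict(train, x) == y for x, y in test)
accuracy = correct / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2f}")
```

Because the synthetic classes are deliberately well separated, the accuracy printed here says nothing about the performance on real walking data.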

To evaluate the classification performance of each of the aforementioned algorithms, standard metrics have been used for a comparative study. Precision, Recall, F1-score, and Accuracy provide abstracted summaries of classifier performance. Accuracy is the fraction of all samples that are classified correctly. Precision is the fraction of samples predicted as a given activity that actually belong to it, so it penalizes false positives. Recall is the fraction of samples of a given activity that the model identifies correctly, so it penalizes false negatives. F1-score is the harmonic mean of precision and recall. The Receiver Operating Characteristic (ROC) curve represents the classifier behavior over all possible decision thresholds by plotting the True Positive Rate (TPR) against the False Positive Rate (FPR). If the curve lies on the diagonal, the model performs no better than random guessing, i.e., it has learned little to nothing from the IMU data about the different speeds. Because the current work is a multi-class classification problem with 4 classes, the ROC curve cannot be applied directly. Hence, in all the discussions of results, the One-vs-Rest (OvR) approach has been used to obtain the ROC characteristics: each class is compared against the other 3 classes taken together, which reduces the procedure to a binary classification problem.
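The standard definitions of these metrics can be made concrete with a short sketch. The labels below are made-up toy values used only to exercise the formulas, and the function names are hypothetical.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Precision, recall and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    else:
        f1 = 0.0
    return precision, recall, f1


def accuracy_score(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels for the four classes (0 to 3); not results from this work.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 1, 3, 3]

print("accuracy:", accuracy_score(y_true, y_pred))
for c in range(4):
    print(c, per_class_metrics(y_true, y_pred, positive=c))
```

In this toy example class 1 has perfect recall but a precision of only 0.5, because two samples of other classes are wrongly predicted as class 1.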

7 Results and Discussion

The results obtained by the proposed work are discussed in this section with the help of standard metrics and ROC curves. Also, a comparative study is performed to determine the efficiency of the proposed work with respect to other state-of-the-art works.

7.1 Using standard metrics

In this section, the conventional measures of accuracy, precision, recall, and f1-score are used to judge the performance of the five classification algorithms. The results are summarized in Table 2 and illustrated in Fig. 2. The observations from these results are discussed below:

  1. The DT classifier outperforms all the other models on the different standard metrics of performance evaluation, with accuracy and f1-scores of around 95%. High precision and recall are also noted, which signifies the capacity of the model to eliminate false positives (FP) and false negatives (FN).

  2. The SVM and kNN models perform almost identically in terms of accuracy, precision, recall, and f1-score. These two models come second to the DT model, with an accuracy and f1-score of about 84% and 76% respectively. The difference in the capacities of these models will become clear in the second part of the discussion of results using the ROC curve.

  3. The LR and GNB models are also similar in performance to each other and come third in decreasing order of accuracy. Between these two models, GNB outperforms LR in terms of precision and f1-score, while the recall is almost the same in both cases. This suggests that LR returns a larger number of false positives (FP) than GNB, which is an undesirable characteristic.

Table 2 Results of classifier performance
Fig. 2 Illustration of standard metrics

The above discussion provides an overview of the classification results of the five models. However, the individual characteristics of each model with respect to the four classes of data in the current work need to be analyzed further. This is undertaken in the following section.

Fig. 3 One-vs-Rest classification histograms for all the models

Fig. 4 One-vs-Rest ROC curves for all the models

7.2 Using the ROC curve

The observations from the results as illustrated in Figs. 3 and 4 are discussed as follows:

  1. For the DT classifier (Figs. 3a and 4a), the OvR histograms show the maximum segregation and minimum overlap in the predictions for all scenarios. The ROC curves reinforce this: for classes 2 and 3 the ROC curve is near perfect, whereas for classes 0 and 1 there is only minimal misclassification. Among all the classification models, the DT model performs the best on all the chosen standard metrics of evaluation. This outstanding performance of the DT algorithm may, however, be attributable to overfitting of the model.

  2. For the LR model (Figs. 3b and 4b), the performance is much worse than that of the previous model. For classes 0 and 1, there is almost complete overlap between the data, as well as a minimal probability of proper classification of class 1. In the case of class 2, most of the data are predicted to overlap with other classes, and only a few data points are predicted with a probability higher than 0.5 and without any overlap. Only for class 3 are the results perfect, similar to the observations from the DT model. The ROC curves reflect the same observations: the curve is closest to the diagonal for class 1, followed by class 0 and class 2, which suffer from a high FPR. The ROC curve for class 3 is perfect.

  3. The SVM model (Figs. 3c and 4c) is similar to the previous models in terms of class 3 prediction, while class 2 is predicted better than by the LR model. Class 0 data points are predicted better than class 1 data, as there is more misclassification in the latter case.

  4. For the GNB classifier (Figs. 3d and 4d), the class 1 data has the most overlap in prediction, followed by class 0. Class 2 prediction has minimal overlap, while class 3 data points are perfectly predicted. This is similar to the other classification algorithms. The ROC curves reflect the same trends, and the curve for class 2 closely resembles the LR curve for class 2. The overall performance is marginally better than LR but worse than the SVM and DT models.

  5. The performance of the kNN model is better than that of the LR, SVM, and GNB models for all 4 classes of data. Figures 3e and 4e illustrate that there is some overlap in the predictions of classes 0 and 1, minimal overlap for class 2, and perfect prediction of class 3. The kNN model outperforms GNB and LR with respect to class 2 data.

  6. Overall, a perfect classification of class 3 data is observed for all the chosen models in this work. This is unsurprising, as class 3 corresponds to the sitting activity while all other classes involve motion in a walking state.

  7. The greatest overlap in predictions is noted between classes 0 and 1, which, by the class labels defined in Sect. 5, correspond to fast and moderate walking speeds respectively. It can thus be concluded that most of the classification models fail to distinguish between these two adjacent walking speeds. This may also be because volunteers of different physiques have different estimates of moderate speed; that is to say, a moderate-speed walk for one person may be a fast walk for another. From the results, it can be deduced that the best segregation of these activities is achieved by DT and kNN, whereas the worst performance in this regard is shown by the LR model.
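The One-vs-Rest ROC analysis discussed in this section can be sketched as follows. For each class, the predicted probability of that class is used as the score, the class itself is treated as positive, and the other three classes together are treated as negative. The probability vectors below are hypothetical toy values, not outputs of the models in this work, and the function names are assumptions.

```python
def roc_points(scores, is_positive):
    """(FPR, TPR) points obtained by sweeping the threshold over all scores."""
    pairs = sorted(zip(scores, is_positive), key=lambda p: -p[0])
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples tied at the same score cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def one_vs_rest_auc(probabilities, y_true, n_classes=4):
    """AUC per class: class c is positive, all other classes are negative."""
    result = {}
    for c in range(n_classes):
        scores = [p[c] for p in probabilities]
        is_positive = [t == c for t in y_true]
        result[c] = auc(roc_points(scores, is_positive))
    return result


# Hypothetical per-class probabilities for six samples (toy values).
y_true = [0, 0, 1, 1, 2, 3]
probabilities = [
    (0.6, 0.3, 0.1, 0.0),
    (0.4, 0.5, 0.1, 0.0),
    (0.3, 0.6, 0.1, 0.0),
    (0.5, 0.4, 0.1, 0.0),
    (0.1, 0.2, 0.7, 0.0),
    (0.0, 0.0, 0.0, 1.0),
]
aucs = one_vs_rest_auc(probabilities, y_true)
print(aucs)
```

An AUC of 1.0 corresponds to a perfect curve, while 0.5 corresponds to the diagonal. Each class must have at least one positive and one negative sample for the rates to be defined.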

7.3 Comparison

The proposed work uses a set of conventional supervised learning algorithms to distinguish between walking activities of different speeds, recorded with a smartphone-based IMU sensor, which is a novel aspect. As a result, this work is not directly comparable to the available state-of-the-art works, as most of them work with image and/or video data. However, a comparison based on the classifier performance on IMU sensor data, as reported by researchers over the years, can be undertaken.

The work by Park et al. [36] demonstrates that the unmodified SVM classifier performs poorly in comparison to the DT model, which agrees with the observation in this work. Similarly, reliable performance in determining walking speed was noted in the work by Wang et al. [37]. Khalili et al. [29] highlighted the reliable performance of the SVM and DT algorithms, which is also observed in this work. In the recent works by Chakraborty and Mukherjee [30, 31], features from IMU data were used that correspond to two activities: walking and swinging legs. This is similar to the activities of sitting and walking in the present work. Their results show that the kNN and DT classifiers perform better than SVM and GNB, which is similar to the performance of these classifiers in the present study. Thus, the proposed system is comparable to several other state-of-the-art works dealing with similar challenges in terms of the performance of the chosen classifiers on IMU data.

8 Conclusion

The current work proposes a novel approach to distinguishing the activity of walking at different speeds, namely, slow, moderate, and fast, from the sitting activity. The IMU sensor-based dataset has been curated in a supervised manner with the recording of ground truth information. The main challenge introduced by this dataset is that the speeds were not regulated or manipulated, i.e. each volunteer performed the activity according to their own understanding of a slow, moderate, and fast walk. For experimentation, five different classification models were built and their performances were analyzed using standard metrics like accuracy, precision, recall, and f1-score. Further analysis of class-specific performance using the ROC curve highlights how the supervised learning algorithms handled the challenge introduced in this work. The DT model was found to be the best at segregating the walks of different speeds, followed by the kNN and SVM models.

The dataset used in the current work can be expanded into a more substantial dataset with more variance. The duration of each activity can also be increased. In the future, other classification models can be included to develop a more advanced system of gait speed analysis. Addressing these limitations may open up scope for further research in this domain.