A comparison of machine learning classifiers for smartphone-based gait analysis

Altilio, Rosa; Rossetti, Andrea; Fang, Qiang; Gu, Xudong; Panella, Massimo

doi:10.1007/s11517-020-02295-6

Original Article
Open access
Published: 06 February 2021

Volume 59, pages 535–546, (2021)

Medical & Biological Engineering & Computing

Rosa Altilio¹, Andrea Rossetti¹, Qiang Fang², Xudong Gu³ & Massimo Panella¹ (ORCID: orcid.org/0000-0002-9876-1494)

Abstract

This paper proposes a reliable monitoring scheme that can assist medical specialists in watching over a patient’s condition. Although several technologies are traditionally used to acquire patients’ motion data, their high cost and the large spaces they require make them difficult to apply to rehabilitation in a home context. A reliable patient monitoring technique, which can automatically record and classify patient movements, is mandatory for a telemedicine protocol. In this paper, several state-of-the-art machine learning classifiers are compared on stride data collected with a smartphone. The main goal is to identify a robust methodology that ensures a suitable classification of gait movements, so that patients can be monitored over time and a pathological gait can be discriminated from a physiological one. Additionally, because smartphones are compact, cost-effective and relatively easy to operate, they are particularly suited for home-based rehabilitation programs.

1 Introduction

An important field of application for data classification and screening is human motion, in particular gait analysis, defined as the systematic study of human walking [1,2,3,4,5]. Considerable information can be extracted by analyzing patients’ walking because it contains important biometric features. Indeed, gait is related to the walker’s physical and, sometimes, psychological state [6].

Decision-making for gait analysis could be supported by computational intelligence techniques that automatically determine the status of a patient. For instance, in [7,8,9] several classification algorithms are evaluated and compared in terms of their ability to discriminate between physiological and pathological gait.

Generally, the most widely adopted tools for gait analysis are based on high-complexity motion capture systems exploiting active or passive markers, electromyography (EMG), dynamometric platforms, and so on [10,11,12,13,14]. Unfortunately, their high cost and complexity make them suitable only for specialized clinicians and hospitals.

To ease the monitoring of patients’ motion while keeping the cost low, many researchers have focused on inertial measurement unit (IMU) sensors [15,16,17]. Unfortunately, these sensors have severe drift problems, so they must be used in parallel with other technologies [18]. To overcome these issues, some studies have analyzed the impact that new technologies based on both visual and non-visual systems could have on the related research and application fields [6, 19,20,21].

Indeed, in the last decade, everyday life has been strongly influenced by technology such as tablets and smartphones that are continuously connected through mobile networks. Research is increasingly striving to exploit and enrich daily experiences by improving existing functionality and developing new functionality that people could ask for.

A recent research trend is the investigation of smartphone-based applications for movement monitoring and analysis, as considered for example in [22]. Gait analysis experiments showing that a smartphone can accurately quantify gait parameters with a sufficient level of consistency have been performed in [23] and in several other works, for instance [24,25,26,27]. However, these approaches mainly rely on human experts (i.e., doctors) to analyze the smartphone data clinically and make a decision accordingly. The feasibility, efficacy and usefulness of machine learning techniques for automatically discriminating gait movements and for assessing the main extracted features have not been systematically evaluated so far.

In this work, we extend the use of the sensors contained in a smartphone to build a system particularly suited for a future home-based rehabilitation approach. Compared with other onerous and expensive technologies, a smartphone is compact, cost-effective and relatively easy to operate automatically, which makes it particularly suited for this context. In the biomedical context, a low-cost smartphone-based system could bring great advantages to both patients and clinicians, by improving the patients’ quality of life and reducing the average rehabilitation cost. Although this approach is of interest for the treatment of many diseases such as Parkinson’s disease, multiple sclerosis and coxarthrosis, we describe a specific solution tailored to the monitoring of people recovering from a stroke. Post-stroke rehabilitation has been proven to be essential and effective in helping stroke patients to gradually regain part of their body functionality. In particular, gait analysis, which is the standard practice for diagnosis, assessment, monitoring and discussion of diseases that affect gait, is used to detect the walking patterns and posture that are unique to hemiplegic patients at different recovery stages.

We propose a reliable remote monitoring scheme that can assist medical specialists in watching over the patient’s condition. A smartphone is used to collect stride data, and useful information is obtained from these data by means of advanced feature extraction methods. The system is intended to assist medical specialists in analyzing the rehabilitation path remotely, even when the patient is no longer in the hospital. In this way, assuming that the approach is part of a well-scheduled program of “home” rehabilitation, it will be possible to reduce costs while improving the patient’s quality of life and allowing clinicians to evaluate the patient’s improvements in a safer and faster manner.

The novelty, with respect to state-of-the-art applications, is the combination of data acquisition and filtering on the device with data fusion and pattern recognition techniques that provide a correct definition of the gait movements, making it possible to monitor the patient over time as well as to discriminate between a pathological and a physiological gait.

The rest of the paper is organized as follows. We introduce the proposed approach in Section 2. The approach is validated by extensive computer simulations and several benchmark results, which are reported in Section 3 and discussed in Section 4. Finally, our conclusions are drawn in Section 5.

2 Methods

2.1 Selection and description of patients

In the experimental process, which is described in detail below, we evaluate the gait of two sets of individuals: healthy people and post-stroke patients. We collected 60 different walking trials through heterogeneous smartphone devices from different manufacturers. Of these 60 records, 25 belong to volunteer patients from the Rehabilitation Medical Center of the 2nd Hospital of Jiaxing, Zhejiang province, China; the remaining 35 belong to healthy persons among the academic researchers and doctors of the same Medical Center. Additionally, the records differ in the length of the recording session: 41 of them (13 from patients and 28 from healthy people) last 10 s, and the remaining 19 (12 of which are from patients) last 20 s. Participants were asked to walk along a straight path, without deviation.

All research activities in this study were conducted in accordance with the ethical principles of the Declaration of Helsinki. As it involves human participants, the present study was performed in accordance with the relevant institutional and national guidelines, with informed written consent from all human subjects involved in the study including for publication of the results. However, the study is exempt from the explicit ethics approval of appropriate institutional Committees, as it is mainly focused on the engineering aspects pertaining to the use of specific ICT technologies as well as signal processing and analysis of the related data. All subject anonymity is preserved as identifying information is not included in the manuscript.

2.2 Procedural information

In this study, we analyze the user’s stride in order to extract suitable features for classification. A low-cost smartphone (i.e., a Samsung Galaxy A3 2017 SM-A320F with Android v6.0.1) is placed in the pocket of a band that is tied around the user’s calf, as shown in Fig. 1. In fact, most of the gait information and the lower limb’s angular movements are linked to this muscle. Only one device is adopted, so as to represent a realistic scenario in a home-based context where a user has only one smartphone.

In order to implement a simple and cheap approach, we decided to use only one smartphone in the clinical trials. Further research might investigate the use of two or more devices, although this raises severe issues such as time synchronization and sensor mismatch among smartphones. Consequently, we are not able to provide a full representation of the gait cycle, which is illustrated in Fig. 2, since a single sensor node cannot capture it. Rather, our aim is to evaluate some additional features that could be used for home-based lower limb rehabilitation, focusing on the features in Table 1, which should be easy to recover with the adopted hardware setup.

Table 1 Spatio-temporal features adopted for the gait analysis

The proposed system for gait monitoring is illustrated in the flow chart of Fig. 3 and it can be summarized into several main operations, described in the following subsections.

2.2.1 Raw data collection from sensors

Firstly, data acquisition is performed as follows:

the user opens the application (we worked in this case on an Android® device, but any operating system can be adopted) and sets the recording time;
the smartphone is put into the band’s pocket and fastened around the calf;
the user taps the “Start” command on the screen and, after a countdown of 3 s, the app begins to record data;
during the recording time, the user performs the walking test, and the device vibrates to signal that the recording has ended;
at the end of the trial, the user uploads the data into the database after explicit consent is granted via the application.

If the user is a volunteer hemiplegic patient, an assistant helps him/her as needed. Accelerometer, gyroscope, and magnetometer data are captured during each trial. However, in the following we do not consider magnetometer data because they are too sensitive to the presence of metal objects in the environment.

2.2.2 Data resampling

Once data are collected from the sensors, the second main step consists in resampling them in order to reduce the differences caused by acquiring data with different smartphones having heterogeneous control hardware and sensor technologies. In addition, the app code usually cannot set the sampling rate at software level, as it depends on the hardware available on the adopted device. Depending on which device is used, each sensor is sampled at a different rate (i.e., from 50 to 350 Hz). In order to apply the same denoising algorithm to the whole set of data, we need a common sampling rate. We chose 200 Hz as a reasonable trade-off, also considering the final target rate after the wavelet filtering discussed below; depending on the starting rate, we applied either resampling or low-pass filtering to bring every signal to this common rate.
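As an illustration of bringing recordings onto a common 200 Hz grid, the following minimal Python sketch uses plain linear interpolation. It is not the authors' implementation: the function name `resample_linear` and its parameters are hypothetical, timestamps are assumed to be strictly increasing and expressed in seconds, and the anti-aliasing low-pass filter that would be needed before reducing a higher rate is deliberately omitted.

```python
def resample_linear(timestamps, values, target_rate_hz=200.0):
    """Resample a signal onto a uniform grid by linear interpolation.

    Assumption: timestamps are strictly increasing (seconds). No
    anti-aliasing filter is applied, so this sketch only suits
    signals whose original rate is at or below the target rate.
    """
    if len(timestamps) != len(values) or len(timestamps) < 2:
        raise ValueError("need at least two (timestamp, value) pairs")
    step = 1.0 / target_rate_hz
    t0, t_end = timestamps[0], timestamps[-1]
    n_out = int((t_end - t0) * target_rate_hz + 1e-9) + 1
    out = []
    j = 0  # index of the left neighbour in the original series
    for i in range(n_out):
        t = t0 + i * step
        # Advance until the segment [timestamps[j], timestamps[j + 1]] contains t.
        while j < len(timestamps) - 2 and timestamps[j + 1] < t:
            j += 1
        t_a, t_b = timestamps[j], timestamps[j + 1]
        w = (t - t_a) / (t_b - t_a)
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out
```

For example, a 1 s recording sampled at 50 Hz (51 samples) becomes 201 samples at 200 Hz.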

2.2.3 Denoising and filtering

The next step consists in denoising both acceleration and gyroscope data with a wavelet-based estimation algorithm and then low-pass filtering the reconstructed signals in order to ease the feature extraction process.

A wavelet-based denoising algorithm is firstly applied by using the following model [28]:

$$ s(n)=f(n)+\sigma e(n) , \tag{1} $$

where n is the sample (time) index, s(n) is the noisy signal, f(n) is the signal to be recovered, e(n) is a zero-mean, unit-variance Gaussian white noise and σ is the noise level. The adopted algorithm is able to suppress the noise part of the signal s(n) and to recover f(n) through the following steps:

1. a wavelet decomposition of s(n) at level W is evaluated (we used the family of Daubechies’ least asymmetric wavelets as default option);
2. a thresholding operation is applied to the detail coefficients for each computed level from 1 to W;
3. the wavelet reconstruction is computed based on the level W original approximation coefficients and on the modified detail coefficients from level 1 to W.

In the following, we consider W = 4 levels, taking into account a minimum of 128 samples per recorded trial. The noise level is estimated at each wavelet level to rescale the reference noise model σ = 1; soft thresholding is then performed using a “universal threshold” approach for minimax performance [29], with a threshold proportional to $\sqrt {2\ln (L)}$, where L is the length of the considered signal.
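The thresholding step alone can be sketched in a few lines of Python. This is only an illustration of soft thresholding with a universal threshold: the wavelet decomposition and reconstruction are not shown, the function names are hypothetical, and the per-level noise estimate is assumed to be supplied by the caller as `sigma`.

```python
import math


def universal_threshold(n_samples, sigma=1.0):
    """Universal threshold sigma * sqrt(2 ln L) for a signal of length L.

    Assumption: sigma is a noise-level estimate provided by the caller.
    """
    return sigma * math.sqrt(2.0 * math.log(n_samples))


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`.

    Coefficients whose magnitude does not exceed the threshold become 0.
    """
    out = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        out.append(math.copysign(magnitude, c) if magnitude > 0 else 0.0)
    return out
```

With a threshold of 1.0, the detail coefficients `[3.0, -0.5, -2.0, 0.2]` become `[2.0, 0.0, -1.0, 0.0]`: small coefficients, assumed to be noise, are removed and large ones are shrunk.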

After denoising, acceleration and gyroscope signals are passed through a 4-level Mallat’s filter bank [30] for low-pass filtering and downsampling. Each level halves the sampling rate, so the 200 Hz signals are reduced to 200/2⁴ = 12.5 Hz. Since the average walking frequency of healthy people is about 1.8 Hz [31], a final sampling frequency of 12.5 Hz, as shown in Fig. 4, eases the subsequent feature extraction process without losing any useful information.

2.2.4 Power spectrum density estimation

As is well known from the literature [32], feature extraction for gait classification should also be based on the power spectrum density (PSD) of the considered signals, which here are the absolute values of the acceleration and gyroscope signals after denoising and filtering. PSD estimation is performed by the “Periodogram” method, and we use the maximum PSD magnitude values (i.e., peaks) for both acceleration and gyroscope data. An example is shown in Fig. 5, where the input denoised unfiltered data are shown.

It is worth pointing out that the average stride frequency can be estimated and used as a parameter to find the walking cycle in each trial; in most cases, the acceleration’s average frequency (PSD peak) is located at the gyroscope’s second harmonic (see Fig. 5). However, because of relevant fluctuations due to the mechanism through which data are measured, this rule may not always hold.
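To make the PSD peak feature concrete, here is a minimal pure-Python periodogram sketch. It is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the mean is removed before the transform so that the DC component does not dominate the peak (an assumption not stated in the text), and the one-sided scaling factor is omitted because it does not change the peak location.

```python
import cmath
import math


def periodogram(signal, fs):
    """Return (frequencies, PSD estimate) for a real signal sampled at fs Hz.

    Assumption: the mean is subtracted first. A direct O(n^2) DFT is used,
    which is adequate for short trials at a low sampling rate.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(centered))
        psd.append(abs(acc) ** 2 / (fs * n))
        freqs.append(k * fs / n)
    return freqs, psd


def psd_peak(signal, fs):
    """Return (frequency, magnitude) of the largest periodogram value."""
    freqs, psd = periodogram(signal, fs)
    k = max(range(len(psd)), key=psd.__getitem__)
    return freqs[k], psd[k]
```

For a 10 s trial at 12.5 Hz (125 samples) the frequency resolution is 0.1 Hz, so a walking frequency near 1.8 Hz falls close to a single bin.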

2.2.5 Gait feature extraction

In addition to the PSD peaks introduced above, three other features are used to discriminate gait cycles: cycle duration (C_d), cycle regularity (C_r), and cadence or revolutions per minute (R_m).

Those features are generally used by clinicians, in combination with other information on the subjects, to monitor the progress of a therapy or the evolution of a disease [5, 33, 34]. With the proposed approach, we introduce a novel use of these features that combines gait feature extraction on low-cost devices with automatic classification by machine learning models.

Each stride is recognizable from the acceleration pattern as the time between two “valleys”. In fact, when the foot hits the ground, the sudden acceleration causes a spike followed by a deceleration that is represented by a valley; then, the subsequent leg swing causes a new acceleration and the process is repeated cyclically. However, this behavior is more evident in healthy people than in hemiplegic ones, as shown in Fig. 6.

In order to find the time instants of foot contacts, both accelerometer and gyroscope data are considered. First, the absolute PSD peak of both accelerometer and gyroscope data is measured; from the related frequencies we obtain a rough estimate of the average time between two steps according to acceleration and gyroscope data, respectively. The average of these time intervals is then used as a rolling window to find a minimum of the accelerometer data in a small interval around an initial guess. The final result is shown in Fig. 6.
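One possible reading of this valley search is sketched below in Python. It is an interpretation, not the authors' algorithm: the function name, the `search_fraction` parameter (how wide the search interval is relative to the period), and the rule used to pick the first valley are all assumptions introduced for illustration. The argument `stride_period_s` stands for the averaged time interval estimated from the PSD peaks.

```python
def find_valleys(accel, fs, stride_period_s, search_fraction=0.25):
    """Locate one acceleration valley per cycle and return times in seconds.

    Assumptions: the first valley is the minimum within the first estimated
    period; each following valley is the minimum in a small window centred
    one period after the previous valley.
    """
    if not accel:
        raise ValueError("empty signal")
    period = max(1, round(stride_period_s * fs))  # period in samples
    half = max(1, int(search_fraction * period))  # half-width of the search window
    first = min(range(min(period, len(accel))), key=accel.__getitem__)
    valleys = [first]
    while True:
        guess = valleys[-1] + period  # initial guess for the next valley
        if guess >= len(accel):
            break
        lo = max(valleys[-1] + 1, guess - half)
        hi = min(len(accel), guess + half + 1)
        valleys.append(min(range(lo, hi), key=accel.__getitem__))
    return [v / fs for v in valleys]
```

Because each search restarts from the previously found valley rather than from a fixed grid, small variations in stride duration do not accumulate along the trial.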

Since we assume that the first valley is the starting point of the first stride and that one stride is performed during the time gap between two consecutive valleys, we can define the cycle (stride) duration as:

$$ C_{d}[k]=V[k+1]-V[k] , \tag{2} $$

where V[k] is the array containing the valleys’ locations (in seconds) and k is their index. Thus C_d[k] represents the time difference between two consecutive valleys, i.e., the duration of one gait cycle. The cycle (stride) regularity C_r is expressed as the standard deviation of the elements in the vector C_d; it is measured in seconds, and the gait is more regular as C_r tends to zero. Finally, the cadence is defined as:

$$ R_{m} =\frac{60 N_{c}}{V_{l}-V_{f}} , \tag{3} $$

where N_c is the number of cycles in the recorded array, V_l is the last valley location (in seconds) and V_f is the first one. Consequently, the cadence is the projected number of strides performed in a minute, and it is thus expressed in cycles/min.

All of the features adopted in this paper are summarized in Table 1; for the cycle duration representing a recorded trial we consider the average value of the elements in the vector C_d.
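Equations (2) and (3) translate directly into code. The Python sketch below is illustrative only: the function name and dictionary keys are hypothetical, N_c is taken to be the number of intervals between consecutive valleys, and the population standard deviation is used for C_r because the text does not say whether the sample or population form is meant.

```python
import statistics


def stride_features(valleys_s):
    """Compute cycle duration, cycle regularity and cadence from valley times.

    Assumptions: valleys_s is sorted, in seconds; N_c = len(valleys_s) - 1;
    regularity uses the population standard deviation.
    """
    if len(valleys_s) < 2:
        raise ValueError("at least two valleys are needed to define a cycle")
    # Eq. (2): C_d[k] = V[k+1] - V[k]
    cycle_durations = [b - a for a, b in zip(valleys_s, valleys_s[1:])]
    n_cycles = len(cycle_durations)
    return {
        "cycle_duration": statistics.fmean(cycle_durations),      # mean of C_d
        "cycle_regularity": statistics.pstdev(cycle_durations),   # C_r
        # Eq. (3): R_m = 60 * N_c / (V_l - V_f)
        "cadence": 60.0 * n_cycles / (valleys_s[-1] - valleys_s[0]),
    }
```

For valleys at 0.5, 1.5, 2.5, 3.5 and 4.5 s, every cycle lasts 1 s, so the average duration is 1.0 s, the regularity is 0.0 s and the cadence is 60 cycles/min.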

2.2.6 Feature normalization

Before using data for classification purposes it is helpful to perform data normalization in order to scale the features in the same numerical range, which in this case is chosen between 0 and 1. Let M be the number of patterns in the available dataset, where each pattern x_m, $m=1{\dots } M$, is a collection of N features (i.e., N = 5 in the present approach) associated with a specific recorded trial:

$$ \mathbf{x}_{m}=[x_{m_{1}} x_{m_{2}} {\dots} x_{m_{N}} ] , m=1{\dots} M . \tag{4} $$

Since data features are completely heterogeneous, patterns cannot be normalized globally; instead, each feature is normalized by its own affine transformation, independently of the others:

$$ x_{m_{j}}\gets \frac{x_{m_{j}}-b_{j}}{a_{j}-b_{j}} , j=1 {\dots} N , m=1 {\dots} M , \tag{5} $$

where the terms are defined as ${a_{j}=\max \limits _{m}\{x_{m_{j}}\}}$ and ${b_{j}=\min \limits _{m}\{x_{m_{j}}\}}$, with ${j=1 {\dots } N}$.
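The per-feature normalization of Eq. (5) can be sketched as follows. The function name is hypothetical, and mapping a constant feature (a_j = b_j) to 0 is an assumption added to avoid a division by zero; the text does not discuss that case.

```python
def min_max_normalize(patterns):
    """Scale every feature (column) independently to the range [0, 1].

    patterns: list of M patterns, each a list of N feature values.
    Assumption: a feature that is constant over the dataset is mapped to 0.0.
    """
    n_features = len(patterns[0])
    lows = [min(p[j] for p in patterns) for j in range(n_features)]    # b_j
    highs = [max(p[j] for p in patterns) for j in range(n_features)]   # a_j
    normalized = []
    for p in patterns:
        normalized.append([
            (p[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_features)
        ])
    return normalized
```

For instance, the patterns `[[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]` become `[[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]`.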

2.2.7 Data classification and results

The last step of the proposed algorithm consists in training a binary classifier by using well-known machine learning paradigms in order to categorize data and discriminate between healthy and unhealthy people. This is useful also to understand if the considered gait features are able to support this kind of classification, as for many other application fields [35,36,37,38].

We investigated all possible combinations of input features in our experiments, thus considering 2⁵ − 1 = 31 different datasets. A 10-fold stratified cross-validation is performed, and several classification algorithms are compared in terms of classification error on each dataset.
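The count of 31 datasets follows from enumerating every non-empty subset of the five features. A short Python sketch of that enumeration is given below; the identifiers in `FEATURES` are illustrative labels chosen here, not names taken from the paper's software.

```python
from itertools import combinations

# Hypothetical identifiers for the five features described in the text.
FEATURES = (
    "psd_peak_acceleration",
    "psd_peak_gyroscope",
    "cycle_duration",
    "cycle_regularity",
    "cadence",
)


def feature_subsets(features=FEATURES):
    """Yield every non-empty subset of features, smallest subsets first."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)
```

Calling `len(list(feature_subsets()))` gives 31, matching 2⁵ − 1.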

2.3 Statistics

Several classification algorithms are used to assess the validity of the proposed approach:

Linear Discriminant Analysis (LDA): tries to characterize data using a linear polynomial in order to separate patterns into two or more classes. It maximizes the inter-class discriminatory information by using the Fisher Discriminant technique for surface separation [39, 40]. For the method to perform well, data should satisfy the homoscedasticity hypothesis. No hyperparameters need to be set in advance.
Quadratic Discriminant Analysis (QDA): similarly to LDA, tries to characterize a dataset using a quadratic polynomial based on Gaussian conditional density functions [41, 42]. It does not require the homoscedasticity hypothesis, so it is more suitable for real contexts. No hyperparameters need to be set in advance.
K-Nearest Neighbor (KNN): classifies a pattern depending on the most frequent class in the neighborhood of the pattern itself [43]. It does not require any assumptions on data and, in the following, we use the Euclidean distance between patterns and K = 5 as a default value.
Naive Bayes (NB): is a statistical technique that seeks to verify whether an element belongs to a class based on Bayes’ Theorem [44, 45]. The algorithm calculates various conditional probabilities and assigns each pattern to the class with the highest probability. In the following we use Gaussian kernel smoothing to estimate and model the data density.
Support Vector Machine (SVM): is a supervised learning approach that can be applied to both regression and classification problems [46, 47]. Being based on the solution of a convex quadratic problem, it finds a global minimum even in complex nonlinear problems. In the following we adopt as default options a Radial Basis Function (RBF) kernel with a Sequential Minimal Optimization (SMO) solver.
Neuro-Fuzzy classifier (NF): is used to partition the feature space into fuzzy sets and assign non-mutually-exclusive membership values representing how reliably the pattern belongs to each class [48, 49]. In the following the model is trained by a scaled conjugate gradient method with 100 maximum epochs and one cluster per class.
Classification and Regression Tree (CART): operates by recursively splitting data until ending points, defined by some predefined criteria, are reached [50, 51]. It can handle nonlinear relations between features and classes [52], finding a suitable trade-off between computational complexity and accuracy. Prior class probabilities are estimated in the following based on class frequencies.
Probabilistic Neural Network (PNN): this approach is based on a four-layer neural network employing Bayesian decision-making theory and data-driven learning [53, 54]. The spread of radial basis functions is set by default to 0.1.
Fuzzy Inference System (FIS): this method adopts first-order fuzzy rules and a data-driven inference system trained by means of the Subtractive Clustering method [48, 55]. Gaussian membership functions are adopted with one rule per fuzzy cluster and 0.5 influence of the cluster center (normalized data space).
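As a concrete instance of one entry in this list, the KNN rule with Euclidean distance and K = 5 can be sketched in a few lines of Python. This is a bare-bones illustration with a hypothetical function name, not the implementation used in the experiments; with two classes and an odd K, no tie-breaking rule is needed.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Return the majority label among the k training patterns nearest to query.

    Distances are Euclidean; features are assumed to be already normalized.
    """
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Note that no model is fitted here: the training patterns themselves are kept and searched at prediction time.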

All the classifiers use the same set of data; no ad hoc changes are made to adapt a dataset to a specific classification model. It should be underlined, however, that some general differences exist in the way each algorithm extracts information from the data. For instance, KNN and SVM classifiers do not derive a mathematical model of the classifier from training data. In fact, the SVM classifier gives the support vectors as output, while KNN seeks, for each pattern to be classified, the nearest patterns from which the output label is extracted.

On the contrary, statistical and fuzzy logic-based classification algorithms aim at finding, by using training data, the parameters of a mathematical model that is able to infer the probability of, or the fuzzy membership to, a class for the pattern under classification. An intermediate behavior is that of CART classifiers, where a decision tree, rather than a parametric model, is obtained from training data. Further details can be found in the references cited for each classifier listed above.

3 Results

In this section we report the obtained numerical results. For the sake of illustration, let us first consider the PSD magnitude of acceleration and gyroscope data for classifying health status. In Fig. 7, the first 35 (blue) records are from the healthy group while the following 25 (red) records are from the post-stroke patients; from the differences in the maximum PSD magnitude the reader can obtain a rough, but not very accurate, estimate of the health status. Consequently, a more accurate classification approach is required in order to perform a robust analysis.

To this end, we considered the classification models introduced in Section 2.3 and performed an exhaustive search over all 31 possible combinations of the 5 features listed in Table 1. A stratified 10-fold validation procedure was adopted for evaluating the classification accuracy; for each subset of features a classifier is trained 6 times, by classifying each time 10 different patterns (i.e., subjects or recorded trials) and using the remaining 50 patterns for training. As most of the classification models depend on a random initialization of model parameters, and the 10-fold partitioning of the dataset is random as well, we repeated the above procedure 10 times; the accuracy values reported in the following are averages over these 10 trials.
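The repeated stratified validation loop can be sketched generically in Python. Everything here is illustrative: the function names, the `fit_predict` callback signature, and the seeding scheme are assumptions, and the number of folds is left as a parameter rather than fixed.

```python
import random
from statistics import fmean, pstdev


def stratified_folds(labels, n_folds, rng):
    """Split pattern indices into n_folds folds with similar class proportions."""
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    offset = 0  # continue dealing where the previous class stopped
    for members in by_class.values():
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[(offset + pos) % n_folds].append(idx)
        offset += len(members)
    return folds


def repeated_cv_accuracy(x, y, fit_predict, n_folds, n_repeats=10, seed=0):
    """Return (mean, standard deviation) of accuracy over repeated runs.

    fit_predict(train_x, train_y, test_x) is a hypothetical callback that
    trains a classifier and returns predicted labels for test_x.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        correct = 0
        for test_idx in stratified_folds(y, n_folds, rng):
            held_out = set(test_idx)
            train_x = [x[i] for i in range(len(x)) if i not in held_out]
            train_y = [y[i] for i in range(len(y)) if i not in held_out]
            preds = fit_predict(train_x, train_y, [x[i] for i in test_idx])
            correct += sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(y))  # accuracy pooled over all folds
    return fmean(scores), pstdev(scores)
```

Each repetition reshuffles the partition, so the returned standard deviation reflects the variability due to random fold assignment (and to any randomness inside `fit_predict`).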

For each classifier we report in Table 2 the subset of features that yielded the best (average) classification accuracy among all possible combinations of features. More precisely, there are 5 rows (one per feature) and 10 columns (one per algorithm plus the number of times that each feature is selected in an optimal dataset); each element of the table takes value 1 if the corresponding feature is selected and 0 otherwise. When two subsets yielded identical accuracy, we chose the one with fewer features, according to a regularization approach [56].
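The selection rule, best average accuracy with ties broken in favour of fewer features, can be written compactly. The sketch below is illustrative: the function name and the dictionary layout (feature-subset tuple mapped to average accuracy) are assumptions, and ties are detected by exact equality of the stored accuracy values.

```python
def best_subset(accuracy_by_subset):
    """Return the feature subset with the highest accuracy.

    accuracy_by_subset: dict mapping a tuple of feature names to an
    average accuracy. Among equal accuracies, the smaller subset wins.
    """
    return max(
        accuracy_by_subset,
        key=lambda subset: (accuracy_by_subset[subset], -len(subset)),
    )
```

A row of Table 2 for one classifier could then be derived by marking with 1 the features contained in the returned tuple and with 0 all the others.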

Table 2 Best feature subset per classifier and number of times a feature is adopted

The numerical classification results are summarized in Table 3, which reports, for each classification algorithm, the average classification accuracy and the related standard deviation over the 10 trials carried out on the subset of features that yielded the best (average) accuracy. The number of adopted features is obtained by summing the ones in the related column of Table 2.

Table 3 Average classification accuracy and standard deviation for the best feature subset

4 Discussion

From the results in Table 2, it is evident that the acceleration is not a useful feature, while the gyroscope must be taken into account for a good discrimination. In fact, the sole use of the PSD Peak Gyroscope feature is the best option for the LDA, CART, and FIS classifiers. On the other hand, QDA, SVM, and PNN are able to achieve a good classification by using some other features, such as Cadence, which is sufficient for SVM and PNN. It is worth pointing out that these results are largely in accordance with current medical practice.

Looking at the overall performance of the proposed classification approaches, we note that the accuracy in discriminating between pathological and physiological gait is always maintained at high levels, from 80 to 90%. In addition to PSD peak gyroscope, cadence is the feature that, by means of PNN, obtains the best accuracy of 91.13%. Cycle regularity allows the NB classifier to achieve 90.38%, which is the second score in the ranking. In all cases, the performance volatility measured by the standard deviation is adequate. Overall, the strong performance of PNN with only 2 features suggests that a data-driven machine learning approach can bring improvements with respect to statistical approaches based, for instance, on Discriminant Analysis, as well as to non-parametric models such as KNN.

As a final remark, we note that the previous numerical results are strictly dependent on the uncertainty of the measurements through which data are gathered and then processed. In the present case, measurement error depends on two main factors: the accuracy and precision of the hardware sensors; and the objectivity of the experimental setup, which mainly depends on placing the smartphone on the same point of the body as well as on the reproducibility of clinical trials (same walking, same movements, etc.). In this work, the influence of such errors is mitigated by the use of several different hardware devices and by the involvement of a relatively large number of patients during the clinical tests.

5 Conclusions

The novelty with respect to state-of-the-art applications is the combination of data acquisition and filtering on the device with pattern recognition and data fusion techniques that provide a correct discrimination of gait movements. An exhaustive feature selection approach is considered in order to find the best subset of features able to discriminate between healthy and unhealthy subjects.

The procedure has also been used to evaluate the performance of several classification models in terms of classification accuracy. Very good performance, even reaching 100% accuracy, is obtained on the clinical trials performed in this research. It is important to point out that this is a feasibility study, not a clinical trial of a model. However, the results are very promising for assisting medical specialists in analyzing the rehabilitation path in the near future.

In particular, the model could be extended for use in specific and personalized home rehabilitation programs meant to improve the patient’s quality of life while boosting the treatment effectiveness and thus shortening the patient’s recovery time.

References

Nilsson NJ (1969) Survey of pattern recognition. Ann N Y Acad Sci 161(2):380–401
Secco J, Farina M, Demarchi D, Corinto F, Gilli M (2016) Memristor cellular automata for image pattern recognition and clinical applications. In: Circuits and Systems (ISCAS), 2016 IEEE International Symposium on. IEEE, pp 1378–1381
Altilio R, Liparulo L, Panella M, Proietti A, Paoloni M (2015) Multimedia and gaming technologies for telerehabilitation of motor disabilities [leading edge]. IEEE Technol Soc Mag 34(4):23–30
Pugazhenthi D, Priya VS (2013) Pattern recognition using automatic image classification and recognition methods: A literature review. International Journal of Engineering Sciences & Research Technology, pp 1354–1356
Whittle MW (2014) Gait analysis: an introduction. Butterworth-Heinemann
Tao D, Li X, Wu X, Maybank SJ (2007) General tensor discriminant analysis and gabor features for gait recognition. IEEE Trans Pattern Anal Mach Intell 29(10)
Altilio R, Paoloni M, Panella M (2017) Selection of clinical features for pattern recognition applied to gait analysis. Med Biol Eng Comput 55(4):685–695
Horst F, Kramer F, Schäfer B, Eekhoff A, Hegen P, Nigg BM, Schöllhorn WI (2016) Daily changes of individual gait patterns identified by means of support vector machines. Gait Posture 49:309–314
Liparulo L, Zhang Z, Panella M, Gu X, Fang Q (2017) A novel fuzzy approach for automatic Brunnstrom stage classification using surface electromyography. Med Biol Eng Comput 55(8):1367–1378
Samuel OW, Li X, Fang P, Li G (2016) Examining the effect of subjects’ mobility on upper-limb motion identification based on EMG-pattern recognition. In: Asia-Pacific Conference on Intelligent Robot Systems (ACIRS). IEEE, pp 137–141
Cesqui B, Tropea P, Micera S, Krebs HI (2013) EMG-based pattern recognition approach in post stroke robot-aided rehabilitation: a feasibility study. J Neuroeng Rehab 10(1):75
Dipietro L, Ferraro M, Palazzolo JJ, Krebs HI, Volpe BT, Hogan N (2005) Customized interactive robotic treatment for stroke: EMG-triggered therapy. IEEE Trans Neural Syst Rehab Eng 13(3):325–334
Shusong X, Xia Z (2010) EMG-driven computer game for post-stroke rehabilitation. In: 2010 IEEE Conference on Robotics Automation and Mechatronics (RAM). IEEE, pp 32–36
Zhang X, Zhou P (2012) High-density myoelectric pattern recognition toward improved stroke rehabilitation. IEEE Trans Biomed Eng 59(6):1649–1657
Mantyjarvi J, Lindholm M, Vildjiounaite E, Makela S-M, Ailisto HA (2005) Identifying users of portable devices from gait pattern with accelerometers. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’05), vol 2. IEEE, pp ii–973
Thang HM, Viet VQ, Thuc ND, Choi D (2012) Gait identification using accelerometer on mobile phone. In: 2012 International Conference on Control, Automation and Information Sciences (ICCAIS). IEEE, pp 344–348
Zhang Z, Liparulo L, Panella M, Gu X, Fang Q (2016) A fuzzy kernel motion classifier for autonomous stroke rehabilitation. IEEE J Biomed Health Inf 20(3):893–901
Li J, Besada JA, Bernardos AM, Tarrío P, Casar JR (2017) A novel system for object pose estimation using fused vision and inertial data. Inf Fusion 33:15–28
Niyogi SA, Adelson EH et al (1994) Analyzing and recognizing walking figures in XYT. In: CVPR, vol 94, pp 469–474
Qin L-, Ma H, Liao W-H (2015) Insole plantar pressure systems in the gait analysis of post-stroke rehabilitation. In: 2015 IEEE International Conference on Information and Automation. IEEE, pp 1784–1789
Zhang Z, Fang Q, Ferry F (2011) Upper limb motion capturing and classification for unsupervised stroke rehabilitation. In: IECON 2011-37th Annual Conference on IEEE Industrial Electronics Society. IEEE, pp 3832–3836
DelRosario MB, Redmond SJ, Lovell NH (2015) Tracking the evolution of smartphone sensing for monitoring human movement. Sensors 15(8):18901–18933
LeMoyne R, Mastroianni T, Cozza M, Coroian C, Grundfest W (2010) Implementation of an iPhone as a wireless accelerometer for quantifying gait characteristics. In: 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology. IEEE, pp 3847–3851
Nishiguchi S, Yamada M, Nagai K, Mori S, Kajiwara Y, Sonoda T, Yoshimura K, Yoshitomi H, Ito H, Okamoto K et al (2012) Reliability and validity of gait analysis by Android-based smartphone. Telemed e-Health 18(4):292–296
LeMoyne R, Mastroianni T (2017) Wearable and wireless gait analysis platforms: smartphones and portable media devices. In: Wireless MEMS Networks and Applications. Elsevier, pp 129–152
Yamada M, Aoyama T, Mori S, Nishiguchi S, Okamoto K, Ito T, Muto S, Ishihara T, Yoshitomi H, Ito H (2012) Objective assessment of abnormal gait in patients with rheumatoid arthritis using a smartphone. Rheumatol Int 32(12):3869–3874
Ellis RJ, Ng YS, Zhu S, Tan DM, Anderson B, Schlaug G, Wang Y (2015) A validated smartphone-based assessment of gait and gait variability in Parkinson’s disease. PLoS ONE 10(10):e0141694
Abramovich F, Benjamini Y, Donoho DL, Johnstone IM (2006) Adapting to unknown sparsity by controlling the false discovery rate. Ann Stat 34(2):584–653
Donoho DL, Johnstone IM (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika 81(3):425–455
Mallat SG (1989) A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans Pattern Anal Mach Intell 11(7):674–693
Ji T et al (2005) Frequency and velocity of people walking. Struct Eng 84(3):36–40
Preece SJ, Goulermas JY, Kenney LPJ, Howard D (2009) A comparison of feature extraction methods for the classification of dynamic activities from accelerometer data. IEEE Trans Biomed Eng 56(3):871–879
Perry J, Davids JR et al (1992) Gait analysis: normal and pathological function. J Pediatr Orthop 12(6):815
Vaughan CL, Davis BL, O’Connor JC (1992) Dynamics of human gait. Human Kinetics Publishers
Dash M, Liu H (1997) Feature selection for classification. Intell Data Anal 1(1):131–156
Rizzi A, Buccino NM, Panella M, Uncini A (2008) Genre classification of compressed audio data. In: 2008 IEEE 10th Workshop on Multimedia Signal Processing, pp 654–659
James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning. Springer
Proietti A, Liparulo L, Leccese F, Panella M (2016) Shapes classification of dust deposition using fuzzy kernel-based approaches. Measurement 77:344–350
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugenics 7:179–188
Balakrishnama S, Ganapathiraju A (1998) Linear discriminant analysis-a brief tutorial. Inst Signal Inf Process 18
Srivastava S, Gupta MR, Frigyik BA (2007) Bayesian quadratic discriminant analysis. J Mach Learn Res 8:1277–1305
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: Data mining, inference, and prediction, 2nd edn. Springer
Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27
Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval. Cambridge University Press
Rish I (2001) An empirical study of the naive Bayes classifier. In: IJCAI 2001 workshop on empirical methods in artificial intelligence, vol 3. IBM New York, pp 41–46
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
Suykens JAK, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300
Sun C-T, Jang J-S (1993) A neuro-fuzzy classifier and its applications. In: Second IEEE International Conference on Fuzzy Systems. IEEE, pp 94–98
Cetişli B, Barkana A (2009) Speeding up the scaled conjugate gradient algorithm and its application in neuro-fuzzy classifier training. Soft Comput 14(4):365–378
Breiman L, Friedman J, Olshen R, Stone C (1984) Classification and regression trees. CRC Press
Rokach L, Maimon O (2008) Data mining with decision trees: theory and applications. World Scientific Pub Co Inc.
Friedl MA, Brodley CE (1997) Decision tree classification of land cover from remotely sensed data. Remote Sens Environ 61(3):399–409
Specht DF (1990) Probabilistic neural networks. Neural Netw 3(1):109–118
Wasserman PD (1993) Advanced methods in neural computing. Van Nostrand Reinhold
Chiu S (1994) Fuzzy model identification based on cluster estimation. J Intell Fuzzy Syst 2(3):267–278
Bishop CM (1995) Neural networks for pattern recognition. Oxford University Press

Funding

Open Access funding provided by Università degli Studi di Roma “La Sapienza”.

Author information

Authors and Affiliations

Department of Information Engineering, Electronics and Telecommunications (DIET), University of Rome “La Sapienza”, Via Eudossiana 18, 00184, Rome, Italy
Rosa Altilio, Andrea Rossetti & Massimo Panella
Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, 515063, China
Qiang Fang
Second Hospital of Jiaxing, Jiaxing, 314000, China
Xudong Gu

Corresponding author

Correspondence to Massimo Panella.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Altilio, R., Rossetti, A., Fang, Q. et al. A comparison of machine learning classifiers for smartphone-based gait analysis. Med Biol Eng Comput 59, 535–546 (2021). https://doi.org/10.1007/s11517-020-02295-6

Received: 01 April 2020
Accepted: 14 December 2020
Published: 06 February 2021
Issue Date: March 2021
DOI: https://doi.org/10.1007/s11517-020-02295-6
