1 Introduction

Neurological diseases, traumatic injuries or even aging have a significant impact on gait performance and, therefore, on quality of life. Recent studies [20, 22, 52] have shown the importance of rehabilitation therapies to alleviate these effects. However, to be effective, therapies must be adapted to each individual, which requires approaches that determine the functional state and detect possible changes.

Traditionally, standardized clinical scales have been used to assess the functional state of an individual. However, both the workload entailed by the associated tests and the limited resources available lead to a reduced frequency of assessment sessions, which complicates proper individualization of the therapy [16]. To overcome these limitations, recent works [2, 15, 24, 33, 39, 40, 46, 47] have proposed technological approaches that monitor individuals’ gait and extract indicators of changes in the functional state, as decision-making support for the therapist.

Among the different technological solutions proposed [40], wearable devices based on inertial sensors (accelerometers, gyroscopes) are currently the most popular [24, 29, 38]. However, the methods used to keep them attached to the body can be uncomfortable for certain individuals [7, 36, 42], especially those with reduced mobility. In response to this problem, less invasive wearable alternatives such as wristbands, watches or smartphones have been proposed more recently [3, 26, 41]. Nevertheless, in these cases, parasitic motions have hindered their applicability. Given these problems, recent studies have proposed including sensors in Assistive Devices for Walking (ADW) such as crutches, canes or walkers [6, 8, 14, 18, 43, 51]. ADW are already in common use among individuals with motor impairment, which avoids user rejection and makes this solution the most suitable for monitoring these people’s gait.

In terms of application of the captured gait data, most studies focus on classifying gait patterns of specific neurological diseases or on distinguishing these individuals from healthy ones [1, 5, 12, 21, 29]. However, once a diagnosis has been made and the functional state of the patient characterized, the effectiveness of a particular rehabilitation therapy depends on being able to adapt it to the specific evolution of each individual, which the aforementioned applications are not designed for. In this context, anomaly detection approaches based on personalized gait pattern models have been proposed as a solution to aid the decision-making process of the therapist. However, existing published works [4, 10, 32, 34, 44] are mainly based on wearable sensors and are designed for people who do not require ADW.

Hence, this work proposes an individualized Machine Learning-based approach to detect significant changes in the functional state of individuals who require Assistive Devices for Walking. Unlike the approaches found in the literature, it provides three main contributions: (1) a methodology to create an individualized model for each person, based on current gait pattern data, that is used to detect anomalies or significant deviations; (2) an approach focused on people who require Assistive Devices for Walking and on specific devices to monitor their gait; and (3) a validation based on real data from people with multiple sclerosis.

The rest of the article is organized as follows: Sect. 2 provides an insight into the most relevant works related to anomaly detection approaches, and their application to gait patterns. Section 3 presents the sensorized tip used for data acquisition; Sect. 4 details the methodology for the development of a gait anomaly detector; in Sect. 5 the validation of the methodology in healthy people is shown; while Sect. 6 presents five case studies based on people with multiple sclerosis where the methodology has been tested; finally, the most important ideas are summarized in Sect. 7.

2 Related work

The goal of anomaly detection (AD) techniques is to detect data that do not fit into a normal pattern. AD has many applications and is of special interest in areas such as fraud detection, intrusion detection, network anomaly detection or industrial fault detection. In the context of this work, an anomaly is considered a deviation from a normal gait pattern for a specific individual.

Among the different types of anomalies that are distinguished in the literature, point anomalies, which are defined as instances that can be considered as abnormal with respect to the rest of the data, are the most studied [9]. Gait anomalies are included in this group, as a change in the gait pattern is constituted by samples that differ from others considered as normal.

In order to detect this type of anomaly, multiple techniques are proposed in the literature, which can be classified into two main groups: Machine Learning (ML)-based techniques and non-ML-based techniques. Among the latter, statistical techniques stand out [37], as they provide statistically justifiable solutions. However, they rely on the assumption that the data are generated from a particular distribution, which does not hold in many cases [9].

In recent years, the use of ML-based techniques has increased significantly in the field of anomaly detection, due to their ability to deal with nonlinear and multidimensional data, such as human gait data [15]. These techniques are used to build models that distinguish between normal and anomalous classes. According to [37], anomaly detection can be split into three broad categories based on the training data used to build these models: supervised, semi-supervised and unsupervised anomaly detection.

Supervised anomaly detection assumes the availability of a training data set which includes labeled instances for both normal and anomalous classes. However, in this mode, two issues arise: on the one hand, the database is usually not balanced, as more data corresponding to normality are typically available; on the other hand, in some applications, such as gait anomaly detection, it is difficult to obtain a training data set which covers every possible anomalous behavior that can occur in the data [9]. In this situation, the most common practice in anomaly detection is to use semi-supervised or unsupervised techniques. In the former case, the training data set includes only data corresponding to the normal class; in the latter case, the labels are unknown and there is an implicit assumption that anomalous samples are less frequent than normal ones.

Among the various ML techniques, Support Vector Machine (SVM) is the most widely used in anomaly detection [37]. There are studies that use this technique in a supervised manner [17, 28]; however, it is common to use the One Class Support Vector Machine (OC-SVM) variant trained only with samples corresponding to normal behavior, that is, in a semi-supervised mode. This technique has been widely studied and used in areas such as computer networks [49] or intrusion detection [30].

In the particular field of gait analysis, although some statistical analysis-based approaches exist [32, 34, 44], most works focus on ML-based approaches [4, 10, 31, 48], which allow gait patterns to be analyzed from multidimensional data. In this field, techniques such as K-Nearest Neighbor (KNN) or Support Vector Machine (SVM) are used, with SVM being the most common, due to its remarkable generalization capability even on small to medium data sets [15, 27].

However, it should be noted that most of these works focus on distinguishing healthy people from those with an impairment by using a multi-class classification approach, in which at least two classes (healthy or not) are considered for training the developed model. In fact, few studies focus on gait anomaly detection in which data from only one class (the one considered as normal) are available. The work in [48] follows this approach, using an OC-SVM to distinguish healthy people from people with mobility problems. There, a model is trained with data corresponding only to healthy people, and it is tested with other subjects’ data, without prior knowledge about their physiological state. Moreover, all previous studies are based on data provided by wearable sensors, and works related to people who require ADW are very limited. Among the latter, in [19], a cane is used as the acquisition system and changes in gait are studied, although the aim of the work is to evaluate the effectiveness of the device, not to detect anomalies in the gait pattern.

In conclusion, the analyzed related works indicate that (1) most works focus on multi-class classification, in which at least two data sets related to healthy people and people suffering from a particular disease or impairment are considered; (2) studies focus mainly on the data provided by wearable sensors, and very few address people who need ADW; (3) most approaches are not individualized, as a single model is defined for the whole available population; and (4) most works only validate their approach with healthy people with simulated impairments. Hence, it can be concluded that there is a lack of works related to (a) individualized gait pattern anomaly detection, in which a single class is considered (normal state); (b) people who require ADW; and (c) validation with people who suffer from diseases or impairments. The current work contributes by covering all these gaps, as illustrated in the following sections.

3 Sensorized tip

Considering the advantages of using ADW for monitoring the gait performance of patients, in this work, the sensorized tip presented in [6] (Fig. 1) is used to capture data. It is worth mentioning that, unlike sensorized crutches or canes presented in [8, 14, 18, 43, 51], the proposed sensorized tip can be attached to the personal assistive device of each individual.

Fig. 1 Sensorized tip

The sensorized tip includes a data acquisition device and three sensors (a force sensor, a barometer and an Inertial Measurement Unit) integrated in an aluminum enclosure. The force sensor provides information on the axial load exerted by the patient. The barometer measures the atmospheric pressure, which can be used to estimate the relative height of the device. Finally, the Inertial Measurement Unit provides linear acceleration, angular speed and magnetic field data in the local x, y, z axes of the crutch. This last sensor also integrates an algorithm [6] to estimate the Euler angles in the global reference frame, from which the anteroposterior and lateromedial crutch angles that define the relative 3D motion of the crutch are estimated (Fig. 1).

Fig. 2 Gait anomaly detector design methodology

4 A new methodology for a personalized gait anomaly detector based on sensorized tip data

This section details the methodology defined to develop the individualized gait anomaly detector, which allows significant changes in the functional state of an individual to be detected. This detector is designed using the data provided by the sensorized tip detailed in the previous section. Note that the system is designed to support the decision-making process of the therapist who is treating the monitored patient.

The proposed methodology is based on the literature on anomaly detection and gait recognition systems [15]. Although some studies work directly with raw captured data [13, 50], most works in the area propose a feature engineering-based approach, in which the data are processed to obtain a reduced set of features that characterize them [15, 27, 37]. This approach is generally divided into three phases: (1) feature generation, (2) feature extraction for dimensionality reduction and (3) pattern classification.

Based on this analysis, in this work, the design methodology summarized in Fig. 2 is proposed. This approach requires defining a model of the current state of the specific individual, which is tuned by using data from his/her current and previous states. This model is used to train an SVM-based anomaly detector, since SVM is the most widely used technique in anomaly detection and gait recognition due to its generalization capability, even with a limited training data set [15, 27], as stated in Sect. 2.

The detailed procedure is defined in the next subsections and briefly summarized here. First, a database for the specific patient has to be built from a series of simple experimental tests related to both the current state (CS) of the individual and the previous one (PS) (Sect. 4.1). Then, based on the captured data, a set of features is generated (Sect. 4.2). In a third step, the previously defined set of features and their relative importance for detecting the changes from the previous to the current state are analyzed, and those features that are not relevant or provide redundant information are removed (Sect. 4.3). For this purpose, correlation analysis and the Random Forest technique are used. Once the most relevant features are selected, the set of optimal hyperparameters for the SVM-based anomaly detector is identified (Sect. 4.4). Then, using the selected hyperparameters, a One Class SVM is trained using the current state data to implement the gait anomaly detector (Sect. 4.5). Finally, the developed detector is used to evaluate new/future data, determining whether the patient’s state has changed significantly or not (Sect. 4.6).

4.1 Database generation

In order to develop the gait anomaly detector, a model of the current functional state of the individual is required. This makes it possible to detect anomalies as significant changes with respect to the current state. For this purpose, a proper data set is required, which is built from the data extracted from experimental tests with the individual.

As the focus of this work is on individuals who require ADW, the tests have to be simple to perform and non-invasive. Therefore, based on previous works [6], the database is generated from the data obtained while the individual walks in a straight line at a speed that is comfortable for his/her condition, with the sensorized tip (Sect. 3) attached to his/her personal assistive device. This test allows the gait performance to be evaluated easily in order to detect anomalies.

As detailed in Fig. 2, several data sets are defined for this purpose. In order to model the current state and design the gait anomaly detector, the Initialization Set, composed of the Previous State and Current State data sets, is defined. These data sets are related to different functional states of the patient that have been previously identified by the therapist (i.e., significant changes exist in the patient’s condition from the previous state (PS) to the current state (CS)). Each of these data sets is composed of different straight-line walking tests captured on different dates and times, in order to capture the variability that may exist within the same functional state. As will be analyzed later, although only current state data will be used to train the detector (Training Set), previous state data are used for identifying the hyperparameters of the detector, among them the most relevant features to detect state changes. Hence, the initialization of this approach requires an anomaly previously detected by a therapist.

The new data captured once the system is operational are included in the New Data data set (future state), which is fed to the detector to evaluate whether it belongs to the Current State or not. In the latter case, an anomaly, i.e., a significant state change, is automatically detected.

4.2 Feature set generation

A feature-based approach is applied to the database with the aim of creating and selecting useful features that help identify individualized anomaly patterns in gait [15, 27, 37]. In this case, the proposed process is divided into three phases: (1) trimming the signals; (2) performing a segmentation process; and (3) generating the features that characterize each segment.

The first phase consists of removing the signals captured during approximately the first and the last meter of each test. This is done to exclude the acceleration and deceleration phases at the start and end of each test.

Then, in a second step, signal segmentation is performed using the procedure defined in [35]. This procedure is based on the cycles of use of the crutch, which are composed of a stance phase (in which the crutch is in contact with the ground) and a swing phase (in which there is no contact between the crutch and the ground). This way, the signals provided by the tip sensors are divided into sequential windows that are associated with those cycles. The instant at which the crutch comes in contact with the ground defines the start of each window and therefore, the segmentation. This can be easily detected by considering the force sensor signal, as illustrated in Fig. 3.
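The segmentation idea can be illustrated with a minimal sketch. It is not the procedure of [35]: it simply assumes that ground contact is detected when the force signal rises through a fixed threshold, and the function name, the threshold value and the synthetic force trace are all hypothetical.

```python
def segment_cycles(force, contact_threshold):
    """Split a force signal into crutch cycles.

    A cycle starts at each sample where the force rises from below the
    threshold to at or above it (assumed ground contact) and ends just
    before the next such sample. Returns (start, end) index pairs, end
    exclusive. The trailing incomplete cycle is discarded.
    """
    starts = [
        i for i in range(1, len(force))
        if force[i - 1] < contact_threshold <= force[i]
    ]
    return list(zip(starts[:-1], starts[1:]))


# Synthetic force trace (hypothetical units): load in stance, ~0 in swing.
force = [0, 0, 5, 9, 6, 0, 0, 0, 4, 8, 7, 0, 0, 6, 9, 0]
cycles = segment_cycles(force, contact_threshold=2.0)
print(cycles)
```

Each returned window covers one stance phase followed by one swing phase, so the same index pairs can be used to cut every other sensor signal.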

Fig. 3 Cycle of use of an ADW and its phases

The aforementioned segments are small time windows, in which each variable captured by the sensorized tip has a time evolution. In the last phase of the feature generation, a set of statistical, time or frequency-based operators can be applied to each crutch cycle or segment to characterize it with a set of features [12, 29]. In this case, the set of operators used is based on the one proposed in [35] and is summarized in the first column of Table 1. These operators are applied to the following variables captured by the sensorized tip: the x, y, z linear accelerations (measured by the 3D accelerometer); the \(\omega _x\), \(\omega _y\), \(\omega _z\) angular speeds (measured by the gyroscope); the lateromedial and anteroposterior angles (estimated using the algorithms of the sensorized tip); and the normalized force, i.e., the ratio of the force measured by the force sensor to the force associated with the individual’s weight. Note that, unlike in [35], the magnetometer and Euler angle signals are left out of the characterization, as they deal with absolute angles that can lead to errors. Similarly, the barometer signal is not used either, since its relative accuracy (0.12 hPa \(\approx\) 1 m) is not high enough to detect variations in a short straight-line test.

After performing the three phases of the defined process, 90 features (presented in Table 1) associated with each cycle of use of the crutch are obtained.
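As a minimal sketch of this feature generation step, the following code applies a small set of operators to every signal within one cycle. The operators shown are illustrative assumptions only (the actual set is the one listed in Table 1), and the signal names, sample values and body weight are hypothetical.

```python
import statistics

# Illustrative operators only; the operator set actually used is in Table 1.
OPERATORS = {
    "mean": statistics.fmean,
    "std": statistics.pstdev,
    "min": min,
    "max": max,
    "range": lambda xs: max(xs) - min(xs),
}


def normalize_force(force, body_weight_newtons):
    """Express the measured axial force as a fraction of body weight."""
    return [f / body_weight_newtons for f in force]


def cycle_features(signals, start, end):
    """Apply every operator to every signal within one cycle [start, end)."""
    features = {}
    for signal_name, samples in signals.items():
        window = samples[start:end]
        for op_name, op in OPERATORS.items():
            features[f"{signal_name}_{op_name}"] = op(window)
    return features


# Hypothetical signals for a single four-sample cycle.
signals = {
    "acc_x": [0.1, 0.3, 0.2, 0.4],
    "force_norm": normalize_force([0.0, 140.0, 210.0, 70.0],
                                  body_weight_newtons=700.0),
}
features = cycle_features(signals, start=0, end=4)
print(len(features), features["force_norm_max"])
```

The number of features per cycle is the number of signals times the number of operators, and each feature keeps a name that maps back to a physical signal.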

Table 1 Features generated from the data provided by the sensorized tip

4.3 Feature selection

The set of 90 features obtained in the previous section makes it possible to characterize the gait of each individual. However, not all of them are relevant for detecting gait anomalies; moreover, changes in each individual may be reflected in different features, which highlights the need for an individual approach. Furthermore, some of the features may be correlated and provide redundant information.

In order to optimize the anomaly detector, feature selection is performed in this section, eliminating those features that are not relevant or that provide redundant information. In fact, several studies in anomaly detection state that feature selection and extraction improve the accuracy and computational efficiency of the models [15, 27, 37]. According to these studies, Principal Component Analysis (PCA) is one of the most used feature extraction techniques; however, this algorithm transforms the initial features into new ones, so the physical meaning of the initial information, which is needed for proper diagnosis and interpretation of results in multiple applications, is lost. For this reason, in this work, correlation analysis and Random Forest techniques have been used for feature selection, an approach that preserves the explainability of the results and has proven to be effective for feature selection [35].

4.3.1 Correlation analysis

The feature set obtained in the previous step (Sect. 4.2) is first processed through a correlation analysis that identifies redundant features. For that purpose, a correlation matrix is calculated using the samples contained in the Initialization Set (composed of the Previous State and Current State databases defined in Sect. 4.1), and pairs of features whose correlation coefficient is greater than 0.95 are considered highly correlated. From each group of correlated features, only one is kept in the feature set, and the rest are removed.
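The correlation filter can be sketched as follows. Two choices here are assumptions that the text does not specify: the Pearson coefficient is compared in absolute value, and the first feature encountered in each correlated group is the one kept. Feature names and values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant feature: treated as uncorrelated (assumption)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.95):
    """Keep the first feature of each highly correlated group.

    columns maps feature name -> list of values (one value per sample).
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns over five samples.
columns = {
    "force_max": [1.0, 2.0, 3.0, 4.0, 5.0],
    "force_range": [1.1, 2.0, 3.1, 3.9, 5.0],  # nearly a copy of force_max
    "acc_x_std": [2.0, 1.0, 4.0, 1.5, 3.0],
}
selected = drop_correlated(columns)
print(selected)
```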

4.3.2 Random forest-based ordering

Random Forest is a Machine Learning technique that is widely used for regression and classification tasks and that includes an approach to estimate the relative importance of the features for classification [11].

In this work, Random Forest is used to identify the gait-related features that are most significant for detecting a change of state in an individual. For that purpose, the samples contained in the Initialization Set (composed of the Previous State and Current State databases defined in Sect. 4.1) are used as input. The algorithm is implemented using MATLAB's Statistics and Machine Learning Toolbox.

Once the results are obtained, the features are sorted in decreasing order considering their contribution for distinguishing the samples corresponding to the previous and current states. Features with a positive relative importance contribute to the classification, whereas a negative weight means that the feature is detrimental to it. The latter are removed from the set of features.
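The ordering and filtering step can be sketched as below, given importance scores that have already been estimated. The scores and feature names are hypothetical, and discarding features with exactly zero importance is an assumption, since the text only addresses positive and negative values.

```python
def order_by_importance(importances):
    """Return feature names with positive importance, most important first."""
    positive = {name: w for name, w in importances.items() if w > 0}
    return sorted(positive, key=positive.get, reverse=True)


# Hypothetical importance scores, e.g. as estimated by a Random Forest.
importances = {"force_max": 0.42, "acc_x_std": -0.05, "gyro_y_mean": 0.17}
ordered = order_by_importance(importances)
print(ordered)
```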

Note that the results of the Random Forest will vary from one patient to another, as not all patients change their gait in the same way. Knowing which features are most significant for each patient is also useful information for the therapist, so this phase, apart from optimizing the process by removing irrelevant features, also provides potentially valuable information for the specialist.

Following this two-step procedure, a relevance-ordered, representative feature set containing relevant and non-redundant information is obtained, which will be used as the basis to train the detector.

Fig. 4 Hyperparameter optimization

4.4 Optimal hyperparameter selection

Given the current state of the patient (normal state), the purpose is to determine whether the new/future data belong to the same group, or on the contrary, correspond to a new one, which means that the patient’s state has changed (anomalous state). Thus, the detector can be designed considering a classification problem between these two states.

Note that, although there are two groups (normal state and anomalous state), this problem cannot be treated as a supervised binary classification, since the anomalous state may contain multiple groups, corresponding to different states of the patient. In this situation, a binary approach could lead to errors, as the samples belonging to the anomalous state need not be similar to one another. Therefore, in this case, the One Class Support Vector Machine (OC-SVM) semi-supervised Machine Learning technique has been selected to implement the detector, as the only data known to the detector are those related to the normal state. This approach has been widely used in anomaly detection [23, 48].

An OC-SVM model is trained using only the data corresponding to one class (usually the normal class, which in this work corresponds to the Current State data, as seen in Fig. 2). During the training process, the algorithm defines a boundary around the regions that contain the training data, and once the process has finished, the model can be used to classify new data as normal (those inside the boundary, related to the normal state) or different (those outside the boundary, related to the anomalous state). In general, the distribution of the training set is complex and the anomalies are not linearly separable from the normal data; in such cases, kernel functions are used to project the samples into a higher-dimensional space, where a linear separation is feasible.

In this work, an OC-SVM with a radial basis function (RBF) kernel is used, as it is suitable for data with a complex distribution. When using this kernel, it is necessary to define a hyperparameter called \(\gamma\), which controls the bandwidth of the kernel. The larger this parameter, the smaller the kernel width and, therefore, the more closely the model fits the training data, with the associated risk of overfitting. The main challenge when using SVM-based techniques is to define the hyperparameters [15], for which cross-validation is a common practice [30].

According to the literature [25], the value of \(\gamma\) can vary significantly in a range between \(10^{-2}\) and \(10^2\). Based on experimental tests, in this case, three values have been proposed: 0.01, 0.1 and 1, from which the methodology itself will select the optimal value for each patient. Apart from this hyperparameter, the methodology will also optimize the number of features to be used in each case, by selecting a subset of the n most relevant features from the ordered feature set obtained in the feature selection step (Sect. 4.3).
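The effect of \(\gamma\) on the kernel width can be illustrated with a short sketch. It assumes the common parameterization \(K(x, y) = \exp(-\gamma \Vert x - y\Vert ^2)\), which may differ from the one used by a particular toolbox; the two sample points are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical samples at squared distance 2.
x, y = [0.0, 0.0], [1.0, 1.0]
similarities = {gamma: rbf_kernel(x, y, gamma) for gamma in (0.01, 0.1, 1.0)}
for gamma, value in similarities.items():
    print(gamma, round(value, 3))
```

For the same pair of samples, the similarity drops as \(\gamma\) grows: with a large \(\gamma\) only very close samples are regarded as similar, so the boundary wraps tightly around the training data.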

Figure 4 summarizes the procedure developed to optimize the two hyperparameters mentioned above. For this purpose, the data from the Initialization Set are used, as it contains the data of two different states of the individual. Since the main objective of the work is to classify a test on the basis of information provided by other tests that are taken as a reference, first, a Leave One Out (LOO) approach is applied to the Current State database, thus generating several training sets, each formed by all the samples in this database except those corresponding to one test. Once these sets are generated, for each combination of parameters, m classifiers are trained (m being the number of tests in the Current State database and, therefore, the number of sets obtained by the LOO approach) using the training sets previously generated. Each classifier is evaluated using the tests from the Initialization Set that have not been used in its training. These tests are validated separately, obtaining in each case the percentage of correctly identified samples; then, the geometric mean of the obtained percentages is calculated, which evaluates the performance of each classifier. Note that the geometric mean penalizes the misidentification of each test, thus assessing the ability of the classifiers to correctly identify tests that are both similar and dissimilar to those in the training set.

Once the m classifiers have been evaluated, the geometric mean of the obtained results is calculated again, in order to account for the effect that the training data have on the classifier performance. This way, a single value is obtained for each combination of hyperparameters, which allows the different combinations to be evaluated and compared in order to determine the best performing configuration.
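The selection loop described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this work: `train_stand_in` is a deliberately simple nearest-sample rule that replaces the OC-SVM so that the example needs no external library, and the data, function names and the 0.5 acceptance level are hypothetical.

```python
import math
from itertools import product


def geometric_mean(values):
    """Geometric mean of rates in [0, 1]; zero as soon as one rate is zero."""
    if any(v == 0 for v in values):
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))


def train_stand_in(samples, gamma):
    """Stand-in for OC-SVM training (NOT an SVM; illustration only).

    A sample is 'normal' when its RBF similarity to the closest training
    sample is at least 0.5.
    """
    def is_normal(sample):
        closest = min(
            sum((a - b) ** 2 for a, b in zip(sample, ref)) for ref in samples
        )
        return math.exp(-gamma * closest) >= 0.5
    return is_normal


def select_hyperparameters(cs_tests, ps_tests, gammas, feature_counts, train):
    """Return (best score, all (gamma, n_features) pairs reaching it)."""
    scores = {}
    for gamma, n in product(gammas, feature_counts):
        # Keep only the n most relevant features of every sample.
        cs = [[s[:n] for s in test] for test in cs_tests]
        ps = [[s[:n] for s in test] for test in ps_tests]
        fold_scores = []
        for held_out in range(len(cs)):  # leave one Current State test out
            training = [s for i, t in enumerate(cs) if i != held_out for s in t]
            is_normal = train(training, gamma)
            # Held-out CS test should be accepted, every PS test rejected.
            rates = [sum(is_normal(s) for s in cs[held_out]) / len(cs[held_out])]
            rates += [sum(not is_normal(s) for s in t) / len(t) for t in ps]
            fold_scores.append(geometric_mean(rates))
        scores[(gamma, n)] = geometric_mean(fold_scores)
    best = max(scores.values())
    return best, [combo for combo, score in scores.items() if score == best]


# Hypothetical data: each test is a list of samples with features already
# sorted by relevance (Sect. 4.3).
cs_tests = [
    [[0.0, 0.1], [0.2, 0.0]],
    [[0.1, 0.2], [0.3, 0.1]],
    [[0.2, 0.1], [0.0, 0.3]],
]
ps_tests = [
    [[3.0, 0.2], [3.2, 0.1]],
    [[2.9, 0.0], [3.1, 0.3]],
]
best_score, best_combos = select_hyperparameters(
    cs_tests, ps_tests, gammas=(0.01, 0.1, 1.0), feature_counts=(1, 2),
    train=train_stand_in,
)
print(best_score, best_combos)
```

Because the geometric mean collapses to zero when any single test is entirely misclassified, a configuration that accepts everything (as the smallest \(\gamma\) does on this toy data) cannot win, even though it classifies every Current State sample correctly. The function returns every configuration that reaches the best score, since ties are possible.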

4.5 One class SVM training

Given the optimal hyperparameters obtained from the aforementioned proposed procedure, the One Class SVM classifier is trained. For this purpose, only the samples in the Current State database are used.

Note that, although unusual, in some cases, there may be multiple optimal combinations of hyperparameters. In these cases, one classifier is trained for each optimal combination, and the trained classifiers are used to implement the detector, combining their outputs as analyzed in the next section.

4.6 New/future data evaluation

Once trained off-line, the OC-SVM classifier is operational and can be used periodically with new data captured from the particular patient following the procedure detailed in Sect. 4.1.

The OC-SVM classifier will provide an output for each of the new segments or crutch cycles, labeling them as part of the normal state or not (thus, an anomalous state). As a test contains multiple cycles, a voting strategy is used to determine the main class of the test. This way, the predominant label is assigned to the test, determining whether the patient’s functional state has changed or not. If there are multiple optimal hyperparameter configurations and, therefore, multiple trained classifiers, the classifications made by all of them are taken into account, and the test is assigned the label that is most repeated among all of them.
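A minimal sketch of this voting strategy is given below. It assumes that, when several classifiers are available, the cycle labels of all of them are pooled into a single vote; tie handling is not specified in the text, and here a tie resolves to the label that appears first. The labels and function name are hypothetical.

```python
from collections import Counter


def classify_test(cycle_labels_per_classifier):
    """Majority vote over all cycle labels from all classifiers.

    cycle_labels_per_classifier holds one list of labels per trained
    classifier, each label being "normal" or "anomalous" for one cycle.
    """
    votes = Counter(
        label
        for labels in cycle_labels_per_classifier
        for label in labels
    )
    return votes.most_common(1)[0][0]


# Hypothetical outputs of two classifiers on a six-cycle test.
labels = [
    ["normal", "anomalous", "anomalous", "anomalous", "normal", "anomalous"],
    ["anomalous", "anomalous", "normal", "anomalous", "normal", "normal"],
]
print(classify_test(labels))
```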

Note that the proposed approach is designed to be updated each time a new test is identified, improving this way its performance. Figure 5 shows the updating process: once the new test is evaluated, if it belongs to the current state, it will be included in the Current State database, completing it. On the contrary, if it belongs to another class, the databases are reordered, removing the data from the Previous State, moving the data from the Current State to the Previous State database and introducing the new test into the Current State database.

Fig. 5 System update process

Note that the previous state data are removed for classification purposes only, not from the original database, since all the historical data about the patient must be preserved for clinical monitoring. It must be highlighted that this is a support system for the therapist, so if a change is detected, the specialist will have to carry out a study with the patient to confirm that the detected change is relevant. If so, more tests corresponding to that state will be performed in order to complete the Current State database.
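The update rule can be sketched as a small function. The names are hypothetical, tests are represented by plain identifiers, and the separate `archive` list is an assumed way of keeping the full clinical history while the working databases are reordered.

```python
def update_databases(previous_state, current_state, new_test, is_anomalous,
                     archive):
    """Return the updated (previous_state, current_state) lists of tests.

    archive keeps every test ever recorded, so no clinical history is lost.
    """
    archive.append(new_test)
    if not is_anomalous:
        # Same state: the new test extends the Current State database.
        return previous_state, current_state + [new_test]
    # State change: the old Current State becomes the Previous State and
    # the new test starts the new Current State database.
    return current_state, [new_test]


# Hypothetical test identifiers.
ps, cs = ["t1", "t2"], ["t3", "t4"]
archive = ps + cs
ps, cs = update_databases(ps, cs, "t5", is_anomalous=False, archive=archive)
ps, cs = update_databases(ps, cs, "t6", is_anomalous=True, archive=archive)
print(ps, cs, len(archive))
```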

5 Validation in healthy people

In order to validate the proposed detector, first, a set of tests is carried out with healthy people in a controlled environment with the approval of the University of the Basque Country’s (UPV/EHU) Ethics Committee (Code M10/2021/325). The aim of these tests is to verify the potential of the proposed approach.

Note that since the anomaly detector is based on obtaining a personalized model for each individual, in the design of the validation tests, participants of different characteristics are considered. In particular, ten participants have been selected for these tests, whose main data are summarized in Table 2.

Table 2 Participants’ main data

In this section, the validation procedure is detailed: First, the tests are described, and then, the results are shown and analyzed.

5.1 Test setup

In order to make the tests carried out with healthy people as similar as possible to real cases, these tests have been designed in such a way that the participants need to use a crutch. Based on specialists’ expertise, 3D impediments have been designed and manufactured together with an insole; placed under the sole of one foot, they generate discomfort that forces the participant to use a crutch to walk.

As the aim of the methodology is to identify gait changes, three types of impediments have been designed (Fig. 6). The degree of discomfort generated by each of these impediment configurations is different, with the first state being the slightest, and the third being the most critical, where the use of the crutch becomes essential. This way, three different states can be simulated.

Fig. 6 Impediments

For each configuration of impediments, the participants use the assistive device in the way that seems most appropriate to them; this way, the results are not conditioned, and simulate an evolution similar to that of a rehabilitation process, in which the patient’s ability to set the foot on the ground changes gradually.

Based on this test design, the participants have performed 4 independent 10-meter tests in each state. The data of each test have been processed as detailed in Sect. 4.2, thus obtaining approximately six or seven samples (segments or cycles) per test. As mentioned in Sect. 4.1, each test consists of walking in a straight line at a comfortable speed, with the sensorized tip attached to the assistive device.

Six of the 12 tests performed by each participant have been used to initialize the system: three tests of the same state constitute the Previous State database, and three tests corresponding to another state form the Current State database. Once the data sets are defined, the procedure described in the previous section is applied to train a One Class SVM.

In reality, the patients' functional state evolves over time, and therefore, the last test performed will always be the one to classify. However, in this validation process, since the states are simulated, each test has been treated as independent, and multiple scenarios have been generated by ordering the tests in all possible combinations, each scenario representing a different evolution. This allows a more comprehensive analysis of the performance of the proposed design approach.

Using the aforementioned approach, and considering the possible test and state combinations, 96 different Initialization Sets are obtained; in each of them, 6 tests remain to be classified and are used to evaluate the designed detector. Thus, for each participant, 576 scenarios are analyzed: 480 in which a significant change exists between the Current State database tests and the test to be classified, and 96 in which the test to be classified is similar to those in the Current State database. Evaluating all possible scenarios yields a set of overall performance results, which are analyzed in the next subsection.
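The scenario counts above can be checked with a short enumeration. The sketch below is only one plausible reading of "all possible combinations": it assumes that the Previous State and Current State databases are each built from three of the four tests of two different states, and that a remaining test counts as a change whenever its state differs from the Current State. The function and constant names are hypothetical and do not come from the original implementation.

```python
from itertools import combinations, permutations

STATES = (1, 2, 3)        # three simulated impediment states
TESTS_PER_STATE = 4       # four 10-meter tests per state
TESTS_PER_DATABASE = 3    # tests used to build each database (assumed from the text)


def enumerate_scenarios():
    """Yield (previous_db, current_db, test, is_change) for one participant."""
    all_tests = [(s, t) for s in STATES for t in range(TESTS_PER_STATE)]
    # Ordered pairs of distinct states: (Previous State, Current State)
    for prev_state, curr_state in permutations(STATES, 2):
        prev_tests = [(prev_state, t) for t in range(TESTS_PER_STATE)]
        curr_tests = [(curr_state, t) for t in range(TESTS_PER_STATE)]
        for prev_db in combinations(prev_tests, TESTS_PER_DATABASE):
            for curr_db in combinations(curr_tests, TESTS_PER_DATABASE):
                used = set(prev_db) | set(curr_db)
                for test in all_tests:
                    if test in used:
                        continue
                    # Assumption: a change is expected whenever the state of
                    # the test differs from the Current State.
                    yield prev_db, curr_db, test, test[0] != curr_state


scenarios = list(enumerate_scenarios())
n_init_sets = len({(p, c) for p, c, _, _ in scenarios})
n_change = sum(1 for *_, is_change in scenarios if is_change)
n_no_change = len(scenarios) - n_change
print(n_init_sets, len(scenarios), n_change, n_no_change)
```

Under these assumptions, each Initialization Set leaves one unused test from the Current State and five from the other two states, which reproduces the 96/480 split reported above.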

5.2 Results and discussion

Table 3 summarizes the overall results, showing the specificity, sensitivity, accuracy, precision and F-score data for each participant. When obtaining these values, state changes have been considered as a positive event, while no change has been taken as a negative event.
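For reference, the five measures can be computed from the four confusion-matrix counts using their standard definitions, with a state change as the positive class. The sketch below is illustrative only: the function name is hypothetical, and the example counts are invented to match the 480 change and 96 no-change scenarios per participant; they are not values from Table 3.

```python
def detection_metrics(tp, fn, tn, fp):
    """Return the five performance measures for one participant.

    tp: changes detected as anomalies    fn: changes labeled as normal
    tn: unchanged tests labeled normal   fp: unchanged tests flagged as anomalies
    Assumes every denominator is non-zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f_score": f_score,
    }


# Hypothetical counts (NOT taken from Table 3): 480 positives, 96 negatives.
example = detection_metrics(tp=400, fn=80, tn=88, fp=8)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because positive scenarios outnumber negative ones five to one, accuracy is weighted toward the detection of changes, which is one reason to read it together with specificity and precision.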

Table 3 Performance of the methodology in validation with healthy people

The results show that the proposed approach provides an average accuracy of 87.5%, which is slightly higher than the \(\simeq\) 84% accuracy obtained in other studies with a similar approach based on wearable sensors [4, 10]. The detector is more specific than sensitive, and it shows high precision. This means that if the system detects a change, there is a high probability that the change has actually occurred, which is desirable in order not to overload the therapist with false positives.

In order to further analyze the performance of the proposed approach, a set of individual examples is analyzed next. Figure 7 shows three cases where the designed detector has successfully identified normal and anomalous states. In the selected examples, the designed One Class SVM models have as their optimal configuration only the two most relevant features (according to the methodology described in Sect. 4.4), allowing a 2D representation of the classifier (in the first shown case—participant 4—the 75th percentile of normalized force and the cycle time; in the second case—participant 5—the 50th percentile of linear acceleration in x axis and the maximum value of normalized force; and in the last case—participant 9—the stance end value of anteroposterior angle and the 75th percentile of normalized force).

In each figure, the different shapes represent the 3 different states emulated by the impediments, while colors are used to distinguish the different tests (except those that form the Previous State or Current State database, which, as part of the same set, have been represented with the same color). A small shape indicates a sample labeled as normal, whereas a large shape indicates a sample identified as an anomaly.

Fig. 7 Graphical representation of one class SVM performance in three successful cases

As can be seen in Fig. 7, the developed detectors have been able to identify changes through the two features considered to be the most important ones. It must be highlighted that, as the methodology is individualized, the relevant features can differ in each scenario. In the three cases in Fig. 7, features associated with the force signal are among the most relevant. This is because the 3D impediments used to simulate motor impairments make it difficult to set the foot on the ground, so the participants use the crutch for support, as in a rehabilitation process. The greater the discomfort, the greater the force exerted on the crutch, and therefore, the features associated with this variable are related to the simulated state changes. Although force is a relevant feature in these cases, it is not sufficient on its own: the system has determined that the 2-feature configuration is the best. The second feature is different for each participant and reflects that, as discomfort increases, apart from exerting more load on the crutch, some participants may change their gait speed, others the crutch tilt, and others the acceleration or angular speed of the crutch. This highlights the importance of the individual approach, through which state changes have been well identified despite the variability that exists between participants.

Although cases with good performance predominate (with an average accuracy of 87.5%), there are also cases where the designed detectors have misclassified some tests; Fig. 8 shows three of these cases. As in the previous case, in order to illustrate the performed classification, the cases in which a set of two features form the optimal configuration for the One Class SVM detector design have been selected (in the first shown case—participant 2—25th percentile of the normalized force and the 25th percentile of acceleration in y axis; in the second case—participant 7—75th percentile of acceleration in x axis and the correlation coefficient between the angular speed in y and z axes of the crutch; and in the last case—participant 10—the lateromedial angle value in the maximum force moment and the kurtosis of the angular speed in y axis).

Fig. 8 Graphical representation of one class SVM performance in three unfavorable cases

These examples illustrate situations in which, using the selected features, the detector fails to detect the anomalies and considers them part of the normal state. This may occur if the patient changes his or her gait pattern in a way that differs from the changes already known. In that case, since the methodology has not characterized that type of change, the system may not be able to classify the test correctly. However, note that this system is designed to be updated as detailed in Sect. 4.6; therefore, it will always have information on the latest changes, which minimizes the risk of misclassification.

In addition, it should be highlighted that in the cases shown in Fig. 8, there is a large difference between the states represented by the Previous State and Current State databases, and the tests to be classified correspond to the intermediate state. The larger the difference between the initialization samples, the more features will undergo variations; therefore, the features selected as the most important will properly classify the samples of the known states, but may fail when an intermediate state has to be classified. However, in real cases, if the patient is well monitored, such cases would be unusual, especially in rehabilitation processes (like the one simulated in this work), where changes occur gradually.

6 Validation in people with multiple sclerosis

In this section, the performance of the proposed approach is evaluated with a set of people with multiple sclerosis. The aim is to prove the validity of the approach with real patients' data.

Multiple sclerosis is a chronic, neurodegenerative disease that affects people from a very young age (average 30 years) and progresses over many years. Fatigue and motor impairment are the most common symptoms; in fact, 15 years after the first outbreak, 80% of patients need an assistive technical device to walk [45]. In many cases, the functional state of these patients is reflected in their gait pattern; therefore, the methodology developed in this work is also suitable for these case studies.

In this section, the test procedure is detailed: First, the tests carried out with patients are described, and then, the obtained results are shown and analyzed.

6.1 Test setup

Tests have been performed with five people with multiple sclerosis (PwMS) who use a crutch in their daily life. The basic data of these patients are detailed in Table 4. These tests have been carried out at the Multiple Sclerosis Association of Bizkaia (ADEMBI) with the approval of the Basque Government PS2018017 Ethics Committee.

Table 4 Patients’ main data

In order to monitor the temporal evolution of these patients, three sessions have been carried out, one every three months. In each of them, patients have performed two repetitions of a single test consisting of walking 10 m in a straight line at a comfortable speed according to their condition, with the sensorized tip attached to their own assistive device.

Table 5 shows the number of samples or crutch cycles obtained in each test. Note that in a 10 m test, only a few samples are obtained, which means that a wrong step (one that does not correspond to the patient's real state) could condition the results. Testing over longer distances would increase the number of samples, thus minimizing this risk and yielding a more robust system. However, it must also be taken into account that PwMS have mobility problems, which prevent them from walking long distances without aid. For this reason, it has been decided to carry out tests of 10 m, which is the minimum distance for the system to work, as determined in the validation phase.

Table 5 Number of samples (crutch cycles) per test

Apart from capturing data, in each session, the functional state of the PwMS has also been assessed by a specialist using for this purpose the standardized Short Physical Performance Battery (SPPB) scale. This scale consists of several tests to assess the patient’s balance, strength and gait speed. The score that can be obtained ranges from 0 to 12, with 12 being the best score. Table 6 shows the results obtained in this assessment.

It must be taken into account that multiple sclerosis is a complex disease, whose evolution is very unstable. This explains the variations that are observed in patients’ states.

Table 6 SPPB values

Once the tests have been carried out, the methodology detailed in Sect. 4 is applied. In this case, as real study cases with a real temporal evolution are analyzed, the four tests of the first two sessions should be used to initialize the system. After the initialization, the system’s objective would be to determine whether the patient’s functional state has changed from the second to the third testing session. However, since the anomaly detection approach presented in this work is not subject to temporal evolution, in this validation process, the test sessions have been treated as independent, and they have been ordered in all possible combinations as detailed in Sect. 5. Thus, a more comprehensive analysis of the proposed system’s performance is carried out.

Note that, according to the performed assessment (Table 6), there are three patients (patient 2, patient 3 and patient 5) whose SPPB value has changed in all three sessions, reflecting a significant state change; in the case of patient 1 and patient 4, on the contrary, sessions with the same SPPB value are observed. As the methodology requires the current and the previous states (between which there has been a change of state) for the initialization, four cases have been analyzed for patients 1 and 4, whereas six possible initialization combinations have been studied for patients 2, 3 and 5. In each of them, the remaining two tests (the ones that have not been used in the initialization phase) have been evaluated. Table 7 shows all the possible combinations used in this validation phase.
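The four-versus-six count follows from treating each initialization as an ordered pair of sessions (Previous State, Current State) with different SPPB values, and evaluating the two tests of the remaining session. A minimal sketch of that reasoning is given below; the function name is hypothetical and the SPPB scores are invented for illustration, not taken from Table 6. Using the SPPB value as the reference label for the remaining session is also an assumption of this sketch.

```python
from itertools import permutations


def initialization_cases(sppb_by_session):
    """List (previous, current, remaining, expected_change) for three sessions.

    sppb_by_session maps a session number to its SPPB score. A pair of
    sessions is a valid initialization only if their scores differ.
    """
    cases = []
    for prev, curr in permutations(sorted(sppb_by_session), 2):
        if sppb_by_session[prev] == sppb_by_session[curr]:
            continue  # no change of state between the two databases
        # With exactly three sessions, a single session remains to classify.
        (remaining,) = set(sppb_by_session) - {prev, curr}
        # Assumption: the SPPB score is the reference for the expected label.
        expected_change = sppb_by_session[remaining] != sppb_by_session[curr]
        cases.append((prev, curr, remaining, expected_change))
    return cases


# Hypothetical SPPB scores (NOT the values of Table 6).
all_different = initialization_cases({1: 8, 2: 9, 3: 10})
two_equal = initialization_cases({1: 9, 2: 9, 3: 7})
print(len(all_different), len(two_equal))
```

With three distinct scores all six ordered pairs are valid; when two sessions share a score, the two pairs formed by those sessions are discarded, leaving four.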

6.2 Results and discussion

After applying the developed approach to the data obtained in the aforementioned tests, the results shown in Table 7 have been obtained. Each row represents a case study, in which the tests used as Previous State and Current State databases are defined. In each of these case studies, the remaining two tests have been evaluated as normal or as anomaly. Thus, cells in italics are those that have been correctly identified by the system, while those in bold are the ones that have been mislabeled.

Table 7 Results of the methodology in validation in people with multiple sclerosis

Based on the results shown in Table 7, the accuracy of the approach for each patient has been calculated. Table 8 presents the obtained values.

Table 8 Performance of the methodology in validation in people with multiple sclerosis

Results show that, in general, the methodology classifies the validation tests correctly; however, there are cases in which misclassification has occurred.

On the one hand, there are cases where one of the validation tests indicates that there has been an anomaly, while the other determines that the state corresponds to the Current State. This phenomenon is due in part to the small number of samples available. Note that the methodology classifies each sample (gait cycle) as either normal or anomalous, and the predominant label is assigned to the test. Therefore, a reduced number of gait cycles makes each sample carry more weight in the final decision. The first image in Fig. 9 shows one of these cases (patient 1, case study 3), in which the methodology has determined that a single feature (the standard deviation of the linear acceleration in x axis) forms the optimal configuration. In this case, the tests to be evaluated correspond to the Current State; however, some validation samples are slightly different from the samples used for training, so the system labels them as anomalous. This reflects that, in this particular case, the samples acquired in the 10-meter tests have not been sufficient to capture the variability of a specific state or to characterize it adequately. In fact, if the two validation tests are considered together and not as separate tests, the system determines that they belong to the Current State (CS), as indicated by the SPPB value. Longer tests would be appropriate in this case in order to optimize the results, but as noted in Sect. 6.1, people with multiple sclerosis have mobility problems, which prevent them from walking long distances without aid. Moreover, it should be noted that in most cases, 10-meter tests have been shown to be sufficient, obtaining an average accuracy of 82.5%, so any increase in distance would have to be analyzed taking into account each patient's impairment.
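The effect of the small sample size on the test-level decision can be illustrated with a simple majority vote over per-cycle labels. The sketch below is an assumption-laden illustration: the function name, the per-sample labels and the tie-breaking rule (ties resolved as normal) are hypothetical and are not specified in the text.

```python
def label_test(sample_is_anomaly):
    """Assign the predominant per-sample label to a whole test.

    sample_is_anomaly: list of booleans, one per gait cycle.
    Ties are resolved as "normal" (an assumption of this sketch).
    """
    n_anomalous = sum(sample_is_anomaly)
    if n_anomalous > len(sample_is_anomaly) / 2:
        return "anomaly"
    return "normal"


# Hypothetical per-cycle labels for two short validation tests.
test_a = [False, True, True, False, True]            # 3 of 5 cycles flagged
test_b = [False, False, True, False, False, False]   # 1 of 6 cycles flagged

print(label_test(test_a))            # decision for the first test alone
print(label_test(test_b))            # decision for the second test alone
print(label_test(test_a + test_b))   # decision when both tests are pooled
```

In this invented example, a single cycle decides the label of the first test, whereas pooling both tests (4 of 11 cycles flagged) gives a normal label, mirroring the behavior described for patient 1.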

As in Figs. 7 and 8, in Fig. 9 shapes represent states, colors distinguish the tests (except those that form the Previous State or Current State databases, which have been represented with the same color) and size differentiates the samples labeled as normal (the small shapes) from the ones classified as anomalies (the large shapes).

Fig. 9 Graphical representation of one class SVM performance in three unfavorable real cases

On the other hand, there are cases in which both validation tests have the wrong label, such as case study 6 of patient 2, or case studies 1 and 3 of patient 5. The second and the third images in Fig. 9 show two of these case studies: case study 6 of patient 2 (where the standard deviation of angular speed in x axis and the standard deviation of angular speed in z axis form the optimal configuration) and case study 3 of patient 5 (where the 50th percentile of linear acceleration in x axis is the most relevant feature). In these cases, according to the most relevant features, there has been no change between the CS and the new tests, but the SPPB value has changed.

This phenomenon may be due to several reasons. On the one hand, there may be some features that represent the change between these states that the methodology has not been able to identify. On the other hand, it may also be that the actual gait of the patients has not changed much between these tests even though the value on the SPPB scale is different. In fact, in the case of patient 5, the results are consistent, and the two scenarios in which the methodology has failed indicate that the patient's gait pattern is similar in sessions 2 and 3. In the case of patient 2, something similar occurs: even though one of the tests indicates an anomaly, if all of the validation samples were taken into account together, the methodology would conclude that the gait pattern has not changed between sessions 1 and 2.

Despite these unfavorable cases, the overall performance of the system is acceptable, with an average accuracy of 82.5%. Figure 10 shows some of the cases in which the One Class SVM classifier has performed accurately. In the case of patient 1, the correlation coefficient between the angular speed in x and y axes is used; in the case of patient 2, the correlation coefficient between the angular speed in x and y axes and the lateromedial angle in the stance end moment are the most relevant features; and in the case of patient 3, the lateromedial angle in the stance end moment, the 25th percentile of the lateromedial angle, and the standard deviation of the angular speed in x axis form the optimal configuration.

Fig. 10 Graphical representation of one class SVM performance in three successful real cases

Note that in these validation cases, the force does not appear among the most important features. Instead, angles and the gyroscope-related angular speeds predominate. It is to be noted that among PwMS, some use the crutch for support and others use it to maintain balance, so state changes can be reflected in different variables depending on the participant. This highlights once again the need for an individual approach when carrying out this type of study.

In conclusion, the overall results demonstrate that the proposed approach to design gait anomaly detectors is promising when applied to real scenarios. Also, it must be highlighted that this work proposes an individualized approach and consequently, the results cannot be taken as representative of the whole population.

7 Conclusions

In this work, a new ML-based methodology for the design of personalized gait anomaly detectors to detect gait pattern changes in people who need an assistive device for walking is proposed. These gait pattern variations may reflect changes in the patients’ functional state, and therefore, an early detection is important so that the specialist can redesign the rehabilitation according to the patient’s new functional state, thus improving his/her quality of life.

For this purpose, a sensorized tip that can be attached to the patient's own assistive device for walking has been used as the monitoring system. Once the data acquisition technology has been defined, a methodology to design the ML-based detector has been proposed. Taking into account the variability that exists among patients, the proposed approach creates an individualized model of the patient to detect the state changes.

The approach has been tested with healthy people and people with multiple sclerosis, obtaining an average accuracy of 87.5% and 82.5%, respectively. These results have emphasized the need to use individualized approaches for anomaly detection in complex diseases whose evolution depends on the individual. This proposal will be extended to longer monitoring periods in order to enrich and validate a time-adaptive approach.

It is to be noted that the proposed approach requires a previously detected anomaly for its initialization, in order to determine the most important features to be included in the individualized ML model. Furthermore, as analyzed, in some specific cases the length of the tests may condition the results, and further research is advisable to determine the optimal test length to capture the variability of the gait pattern for a specific state. Hence, future work by the authors includes exploring new initialization approaches in order to reduce the amount of previous data required to initialize the anomaly detector, and analyzing the effects of the test length on the results.