Introduction

Gait is now widely recognized as a reliable indicator of health status [1] and gait speed is even considered to be the “6th vital sign” [1, 2]. A number of conditions are known to influence gait, including neurological [3], musculoskeletal [4, 5], cardiovascular [6, 7] and respiratory [8] diseases. A typical way of assessing gait is to administer a walking test. The six-minute walk test (6MWT) is one of the most widely used assessments [9], largely because it has been shown to reflect the functional impact of a large number of distinct indications [10,11,12]. In addition, the 6MWT is “easy to administer, better tolerated, and more reflective of activities of daily living” than other walking tests [9]. The 6MWT has been used to measure responses to medical interventions [13,14,15,16], as a one-time measure of functional status [10], to monitor disease state [17, 18], as a predictor of morbidity and mortality [19, 20], and even to assess the impact of walking-induced fatigue on gait parameters in multiple sclerosis patients [21].

Despite its multiple advantages, the 6MWT has a number of limitations. First, the only outcome of the test is the total six-minute walking distance (6MWD): a single cumulative measure that does not provide any insight into the kind of gait disturbance being presented by the patient at the time of the test. Second, the 6MWT does not allow any knowledge to be gained about how the patient’s status or gait evolved during the execution of the test. Last but not least, the sole outcome of the test, i.e. the total 6MWD, is typically estimated manually by a rater based on the number of laps the patient walked around the course and the markings on the floor or the wall for the last partial lap [22].

Given the importance and widespread use of the 6MWT in clinical practice, there has been great interest in proposing wearable devices or smartphone applications as potential solutions for enhancing this assessment. Indeed, the precision of several potential solutions for estimating the 6MWD has been assessed in previous studies, including evaluations of a commercially available wearable inertial measurement unit (IMU) [23], the iPhone’s built-in algorithm [24], purpose-built smartphone-based algorithms [25,26,27] and a custom machine-learning algorithm [28]. While some of these studies yielded good results, several of these applications had the disadvantage of requiring a priori information such as the patient’s height [25] or the course length [26, 27], or pre-calibration for each subject being tested [28]. In addition, it is important to note that many of the wearable device studies carried out so far also lacked proper validation. In most cases, a rater-estimated distance served as the reference, whereas distance-measuring equipment would have been preferable for establishing the validity of new 6MWD-estimating solutions. Indeed, among the previous studies mentioned above, only the study carried out by Salvi et al. [26] involved comparison of the test system with a distance-measuring device, in this case a trundle wheel. Although the trundle wheel served as a more reliable reference than a rater, the data collected in this study were based on assessment of only three young male adults [26].

FeetMe® insoles are among the recently developed wearable gait assessment devices that have not yet been specifically evaluated for measuring the 6MWD. These insoles were designed to evaluate a number of gait parameters on an individual step basis, as well as to provide an accurate and consistent estimation of the total distance walked during assessments. They thus allow monitoring of the evolution of a patient’s gait parameters during walking assessments, providing useful insights into the exact gait disturbance presented by the patient, as well as data on elements such as patient fatigability [29]. In addition, the FeetMe® insoles should simplify the execution of the 6MWT, potentially to such an extent that the test could be self-administered in local healthcare centers.

The validity and reliability of FeetMe® insoles for measuring gait parameters have already been demonstrated against the GaitRite® mat in healthy volunteers [30], in patients post-stroke [31], and in those with Parkinson’s disease [32]. Thus, the aim of the present study was to assess the validity of the FeetMe® insoles for measuring the 6MWD in healthy volunteers along 10-m and 30-m tracks. Taking into account the limitations of previous studies, the total 6MWD was simultaneously estimated by a rater and by the FeetMe® device, and was also measured by a second investigator with a surveyor's wheel to provide the ground truth.

Materials and methods

Study design

This single-center prospective study was conducted at the Delafontaine Hospital Center (Saint-Denis, France) between October 2021 and August 2022. The study was approved by a French ethics committee, CPP EST I, and complied with the Declaration of Helsinki and all subsequent amendments (registration number, ID-RCB: 2021-A00037-34). It was carried out in a population of healthy adults.

Inclusion

All healthy volunteers aged between 18 and 80 years, who were able to walk 100 m unaided, had no gait disturbances, and who were accustomed to using a smartphone, were eligible to participate in the study. To ensure a more balanced distribution and sufficient representation across different age groups in the study population, the recruitment process aimed to enroll an approximately equal number of participants in each of the three age segments: 18 to 38, 39 to 59, and 60 to 80 years old.

Volunteers who had undergone a surgical intervention that could potentially impact gait in the previous 3 months (e.g. orthopedic surgery, an intervention for trauma of the lower limbs or spine, gynecological or urological surgery, or brain or spinal cord surgery) and those with a chronic disease affecting walking (e.g. rheumatological, orthopedic, pain, or neurological disorders) were excluded.

The volunteers were provided with information about the study by phone or e-mail prior to the first study visit, and were then given the opportunity to ask any questions. All volunteers provided signed consent prior to the study start. Participants were instructed to attend the study site on the designated day to receive the training and undergo the required tests. They were advised to wear comfortable footwear during their visit, with no specific restrictions on the type of shoes allowed.

Instrumentation

The study used size 35 to 46 FeetMe® insoles (FeetMe SAS, Paris, France), a Class Im CE(93/42/EC) and Class I FDA 510(k) exempt medical device (Fig. 1a). Each insole contains 18 capacitive pressure sensors and a 6 degrees of freedom IMU, as well as a Bluetooth® Low Energy (BLE) emitter. The FeetMe® insoles were used together with the FeetMe® Evaluation smartphone application to administer the 6MWT (Fig. 1c to e). Data collected by the insoles were transferred to the smartphone application via the BLE emitter, allowing information on plantar pressure, gait parameters and walking distance to be received in real time. Several standard walk tests can be accessed through the application, including the 6MWT. Once the 6MWT had been selected and launched by the rater, the application collected and recorded the user’s gait parameters for each of their steps over the entire duration of the test, and then automatically stopped recording after 6 min and informed the user that the test had been completed. Test results, including the 6MWD in meters, were then displayed in the application or on the associated web platform, the FeetMe® Mobility Dashboard.

Fig. 1

a A pair of FeetMe® insoles. b An example of a surveyor’s wheel. c to e FeetMe® Evaluation mobile application interface. Source (figures (a), (c), (d) and (e)): FeetMe® company. Written permission to publish these images was obtained from the FeetMe® company

A Laserliner® RollPlot Mini surveyor’s wheel (UMAREX GmbH & Co. KG, Arnsberg, Germany; Fig. 1b) was used to determine the ground truth and provide an objective reference for comparison of the estimates obtained using the FeetMe® device and by the rater.

Intervention

The data analyzed in this study were collected during a single hospital visit during which each participant carried out two 6MWTs under the following conditions:

  • participant walked along the 10-m track while wearing the FeetMe® device and was simultaneously followed by an investigator with a surveyor’s wheel and evaluated by the rater;

  • participant walked along the 30-m track while wearing the FeetMe® device and was simultaneously followed by an investigator with a surveyor’s wheel and evaluated by the rater.

A schematic representation of the tests performed is shown in Fig. 2. The participants were asked to walk at a comfortable speed (i.e., a speed self-selected by the volunteer). During the test, the rater informed the participant of the time remaining every minute, then 30 s and 10 s before the end, but did not give the participant any signs of encouragement. The two tests were carried out in a random order to reduce test order bias. The randomization list was computer generated; participants were blinded to the randomization, but the rater was unblinded. The surveyor’s wheel was used as the reference tool for the distance measurement. A mandatory resting time of 15 min was applied between the two tests, and the second test only began once the participant indicated that they were sufficiently rested and ready to perform it.
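As an illustration of how a computer-generated randomization list for the test order could be produced, a minimal Python sketch is given below. The function name, the seed and the use of simple (unrestricted) randomization are assumptions made for illustration; the scheme actually used in the study is not described in the text.

```python
import random


def randomization_list(n_participants, seed=2021):
    """Return one randomly ordered pair of track conditions per participant.

    Hypothetical helper: simple randomization with a fixed seed so the
    list can be regenerated. Not the study's actual procedure.
    """
    rng = random.Random(seed)          # private generator, reproducible
    orders = []
    for _ in range(n_participants):
        order = ["10-m", "30-m"]
        rng.shuffle(order)             # shuffles the two conditions in place
        orders.append(tuple(order))
    return orders


schedule = randomization_list(33)
for participant_id, order in enumerate(schedule[:3], start=1):
    print(participant_id, order)
```

Fixing the seed keeps the list auditable; a block-randomized design would be an alternative if equal numbers of each order were required.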

Fig. 2

Schematic representation of the tracks used for the 6MWT. The participant wearing the FeetMe® device walks on the track and the rater monitors the time and counts the number of turns. The participant was also followed by an investigator with a surveyor’s wheel, which was used to measure the exact distance walked

Study outcomes

The main outcome was evaluation of the validity of the FeetMe® insoles to measure the 6MWD in meters along the 10-m and 30-m tracks compared to the ground truth measured by the surveyor’s wheel. The accuracy of the FeetMe estimates was then assessed in comparison to that of the rater assessments for both track lengths.

Statistical analysis

The normality of the 6MWD data was assessed using Q-Q plots and Shapiro–Wilk normality tests. The mean and the standard deviation (SD) of the recorded 6MWDs were calculated for the FeetMe® device and the rater, as well as for the surveyor’s wheel. The bias (i.e., systematic error), the 95% confidence interval of differences (i.e., limits of agreement), Pearson correlation coefficient, intraclass correlation coefficient (ICC (2,1)), coefficient of determination, and mean absolute error (MAE) were calculated for the FeetMe® device versus the surveyor’s wheel and for the rater versus the surveyor’s wheel. A Levene test was used to assess significant differences between the SDs of the distances measured by the surveyor’s wheel and those estimated by the FeetMe® device or the rater. In addition, a paired t-test was performed on the absolute errors of the FeetMe® and rater measurements to detect any significant differences.
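The indicators listed above can be sketched in Python with NumPy and SciPy as follows. This is a minimal illustration, not the authors' analysis script: the data are synthetic, the helper name `agreement_metrics` is hypothetical, the limits of agreement are taken as bias ± 1.96 SD of the differences, and the coefficient of determination is assumed to be the square of the Pearson coefficient.

```python
import numpy as np
from scipy import stats


def agreement_metrics(estimate, reference):
    """Accuracy and agreement of one method against the reference."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diff = estimate - reference            # signed error per participant
    bias = diff.mean()                     # systematic error
    sd_diff = diff.std(ddof=1)             # sample SD of the differences
    r, _ = stats.pearsonr(estimate, reference)
    return {
        "bias": bias,
        "limits_of_agreement": (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff),
        "pearson_r": r,
        "r_squared": r ** 2,               # assumption: R^2 of a simple linear fit
        "mae": np.abs(diff).mean(),
    }


# Synthetic, purely illustrative distances in metres (NOT the study data).
rng = np.random.default_rng(0)
wheel = rng.normal(450.0, 60.0, size=32)
device = wheel - 9.0 + rng.normal(0.0, 12.0, size=32)
rater = wheel - 35.0 + rng.normal(0.0, 25.0, size=32)

device_metrics = agreement_metrics(device, wheel)
rater_metrics = agreement_metrics(rater, wheel)

# Shapiro-Wilk normality test on one distance series.
shapiro_stat, shapiro_p = stats.shapiro(wheel)

# Levene's test: do the spreads of estimated and reference distances differ?
levene_stat, levene_p = stats.levene(device, wheel)

# Paired t-test on the absolute errors of the two methods.
t_stat, t_p = stats.ttest_rel(np.abs(device - wheel), np.abs(rater - wheel))
```

The ICC(2,1) is not covered by this sketch because SciPy has no built-in function for it.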

Agreement between the ground truth and the distances estimated by the FeetMe® device or the rater was analyzed using Bland–Altman plots [33] and linear regression plots (FeetMe® versus the surveyor’s wheel and rater versus the surveyor’s wheel).
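The quantities behind these two types of plot can be sketched as below. The helper names are hypothetical, and plotting the differences against the mean of the two methods (rather than against the reference alone) is an assumption, since the text does not state which convention was used.

```python
import numpy as np
from scipy import stats


def bland_altman_points(estimate, reference):
    """x/y coordinates of a Bland-Altman plot (assumed convention)."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    pair_mean = (estimate + reference) / 2.0   # x-axis: mean of the two methods
    difference = estimate - reference          # y-axis: estimate minus reference
    return pair_mean, difference


def regression_line(estimate, reference):
    """Least-squares fit of the estimate on the reference distance."""
    fit = stats.linregress(reference, estimate)
    return fit.slope, fit.intercept, fit.rvalue ** 2


# Toy example: an estimator that is always exactly 10 m short.
reference = np.array([400.0, 450.0, 500.0, 550.0])
estimate = reference - 10.0

x, y = bland_altman_points(estimate, reference)
slope, intercept, r_squared = regression_line(estimate, reference)
print(y.mean(), slope, intercept, r_squared)
```

In this toy case the regression line is parallel to the identity line (slope 1, intercept -10 m), which is how a pure systematic bias appears in a regression plot.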

The following criteria were used to assess the degree of correlation [34]: < 0.30 negligible, 0.30–0.50 low, 0.50–0.70 moderate, 0.70–0.90 good, and 0.90–1.00 excellent. The same criteria were used for the coefficients of determination. For the ICCs, values below 0.50 were deemed to indicate poor validity, values between 0.50 and 0.75 to indicate moderate validity, values between 0.75 and 0.90 to indicate good validity and values greater than 0.90 to indicate excellent validity (as described previously [35]). A priori significance levels (α) were set at 0.05 for all analyses. All data and statistical analyses were performed using Python software (version 3.8).
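For illustration, the ICC(2,1) and the interpretation thresholds above can be sketched in Python as follows. The ICC is computed with the standard two-way random-effects, absolute-agreement, single-measure formula; the function names and the treatment of values lying exactly on a threshold are assumptions, and this is not the authors' code.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1) for an (n subjects x k methods) array of measurements."""
    y = np.asarray(ratings, dtype=float)
    n, k = y.shape
    grand = y.mean()
    ss_rows = k * np.sum((y.mean(axis=1) - grand) ** 2)   # between subjects
    ss_cols = n * np.sum((y.mean(axis=0) - grand) ** 2)   # between methods
    ss_err = np.sum((y - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def interpret_correlation(r):
    """Label a correlation (or R^2) value; lower bounds assumed inclusive."""
    if r >= 0.90:
        return "excellent"
    if r >= 0.70:
        return "good"
    if r >= 0.50:
        return "moderate"
    if r >= 0.30:
        return "low"
    return "negligible"


def interpret_icc(icc):
    """Label an ICC value; 'excellent' requires a value above 0.90."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Toy data: a method that is perfectly correlated with the reference but
# 30 m short scores ICC(2,1) < 1, because absolute agreement penalizes bias.
reference = np.array([400.0, 450.0, 500.0, 550.0])
biased = np.column_stack([reference - 30.0, reference])
identical = np.column_stack([reference, reference])
print(icc_2_1(identical), icc_2_1(biased))
```

This property is why an estimator can show a high Pearson coefficient yet a lower ICC when it carries a systematic offset.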

Results

Demographics and population distribution

A total of 33 healthy volunteers, 15 females and 18 males, were included in the study. Two investigators carried out all of the rater assessments and distance measurements using the surveyor’s wheel. Participants ranged in age from 23 to 73 years, with a mean age of 42 years (see Figure S1 for the age distribution). The average height and weight of the population were 173.9 ± 9.3 cm and 70.9 ± 10.9 kg, respectively. For one participant, the insoles were not set up correctly and the FeetMe® recordings were unusable. Therefore, the FeetMe–surveyor’s wheel and rater–surveyor’s wheel comparisons were carried out using data from 32 and 33 subjects, respectively.

The normality of the data distribution for each method (FeetMe®, rater, and the surveyor’s wheel) was confirmed by both the Q-Q plot and Shapiro‑Wilk analyses (Fig. 3 and Table 1).

Fig. 3

Q-Q plots for the 6MWDs obtained by the rater, FeetMe® and the surveyor’s wheel on the 30-m and 10-m tracks

Table 1 The total distances measured by the surveyor’s wheel (ground truth) and those estimated by FeetMe® and the rater

Validity assessment

The population mean (SD) 6MWDs estimated by FeetMe® and by the rater were both similar to the ground truths measured by the surveyor’s wheel, regardless of the track length (Table 1). This was confirmed by the results of the Levene tests: for both track lengths, no significant differences were observed between the SDs of the FeetMe® estimations and the ground truth, or between the SDs of the rater estimations and the ground truth (Table 1).

According to the data presented in Table 2, the estimations provided by FeetMe® exhibited limited bias and mean absolute error (MAE) when compared to the measurements obtained using the surveyor's wheel. On the 30-m track, the bias was -8.3 m, with an MAE of 9.75 m. Similarly, on the 10-m track, the bias was -9.0 m, with an MAE of 12.86 m. These values are small relative to the mean (SD) 6MWDs of 476.2 m (61.1 m) and 441.2 m (58.0 m) achieved on the 30-m and 10-m tracks, respectively.

Table 2 Analysis of the accuracy and of the agreement of the distances estimated by FeetMe® and the rater along the 30-m and 10-m tracks

Regression and Pearson correlation analyses indicated a high level of agreement between FeetMe® estimations and the ground truth, with coefficients of determination of 0.96 and 0.95 on the 30-m and 10-m tracks, respectively, and Pearson correlation coefficients of 0.98 and 0.97 on the same respective tracks. Furthermore, the intraclass correlation coefficient (ICC) values demonstrated excellent validity of the estimations on both tracks, with values of 0.97 on the 30-m track and 0.96 on the 10-m track. In addition, the ICC confidence intervals were narrow for both track lengths, with lower bounds above 0.85.

Comparative analysis

When examining the accuracy of FeetMe® estimations in comparison to estimations made by the rater, several noteworthy observations emerge. Firstly, it is evident that the rater estimations consistently exhibited a higher level of bias, with values of -16.24 m compared to -8.3 m on the 30-m track, and -37.73 m compared to -9.0 m on the 10-m track.

The coefficients of determination, while slightly higher for the rater on the 30-m track (0.99) compared to FeetMe® (0.96), were lower on the 10-m track (0.85 for the rater and 0.95 for FeetMe®). Despite this difference, both approaches demonstrated good to excellent agreement with the ground truth, as shown in Table 2. However, the linear regression analysis reveals a greater degree of deviation from the ground truth in the rater estimations, particularly on the 10-m track (Fig. 4).

Fig. 4

Linear regression plots between the distances recorded by the rater and the ground truth, and between FeetMe® and the ground truth, on the 30-m and 10-m tracks. The ground truth was measured using a surveyor’s wheel

Pearson correlation analysis indicated that both approaches displayed a similar level of correlation with the ground truth on the 30-m track (0.99 for the rater and 0.98 for FeetMe®). However, on the 10-m track, the rater estimations exhibited considerably lower correlation with the ground truth (0.76) than the FeetMe® estimations (0.97). Overall, the Pearson correlation coefficients confirm an excellent correlation with the ground truth for both systems on the 30-m track. On the 10-m track, the FeetMe® estimations still maintain an excellent level of correlation, while the rater estimations only reach a good level.

Differences in ICC values were also observed between the two methods (Table 2). The FeetMe® ICC values were excellent (0.96 or higher), regardless of the track length. Even when the lower bounds of the 95% CIs were considered, the ICC remained good (0.86) to excellent (0.90). In the case of the rater estimations, the ICC was excellent on the 30-m track (0.95) but was close to the boundary between good and moderate (0.76) on the 10-m track. The lower bounds of the 95% CIs of the ICCs for the rater estimates were very low (0.12 and 0 on the 30-m and 10-m tracks, respectively), indicating very poor validity.

In addition, the MAEs of the FeetMe® estimations were lower than those of the rater estimations (Table 2). The MAEs were 9.75 m and 12.86 m for FeetMe® and 16.24 m and 38.88 m for the rater, on the 30-m and 10-m tracks, respectively. Significant differences between the absolute errors for the two methods were observed using the paired t-test (Table 2), indicating that the FeetMe® device estimations were significantly more accurate than those made by the rater.

In accordance with the results of the statistical analyses shown in Table 2, the Bland–Altman plots (Fig. 5) revealed that, while both the FeetMe® device and the rater underestimated the 6MWD, the extent of this underestimation was much greater for the rater (16.24 m and 37.73 m on the 30-m and 10-m tracks, respectively) than for FeetMe® (below 10 m for both conditions).

Fig. 5

Bland–Altman plots between the distances recorded by the rater and the ground truth and between FeetMe® and the ground truth, provided by the surveyor’s wheel, on the 30-m and 10-m tracks. The solid lines indicate the mean difference values, and the dashed lines indicate the upper and lower limits of agreement (95% confidence intervals)

In summary, when assessing the accuracy of the two methods, it is evident that FeetMe® estimations demonstrate less bias and error and show a closer alignment with the ground truth, particularly on the 10-m track.

Discussion

This study aimed to evaluate the validity of FeetMe® insoles in estimating the 6MWD in a population of healthy adults, using the ground truth measured by a surveyor’s wheel as a reference. Additionally, the study conducted a comparative analysis of the accuracies of FeetMe® insoles and of the rater, who performed the 6MWD assessment simultaneously with FeetMe® insoles.

Both the FeetMe® insoles and the rater performed well and showed excellent correlations with the ground truth, with Pearson correlation coefficients greater than 0.9. However, differences in the ICC values were observed between the two methods, particularly on the shorter 10-m track: the FeetMe® insoles obtained excellent ICCs on both tracks (above 0.95), whereas the rater’s ICC was excellent on the 30-m track (0.95) but only good on the 10-m track (above 0.75). Furthermore, the lower limits of the 95% CIs of the ICCs were very low for the rater (below 0.15), but were still good for FeetMe® (above 0.85).

In addition, comparisons demonstrated that the FeetMe® estimates were more accurate than those made by the rater. Indeed, the MAE values were systematically and significantly lower for FeetMe® than for the rater. Moreover, although both methods were found to underestimate the total distance walked relative to the ground truth, the extent of this underestimation was greater for the rater than for FeetMe®. This underestimation by the rater may be explained by the fact that the rater only took into account the distance walked along the straight parts of the tracks and excluded the distance walked during half-turns. This hypothesis is supported by the observation that the rater underestimation was greater when the test was performed using the 10-m track, on which the participants were obliged to make more turns during the 6 min of walking, than when the test was performed on the 30-m track.
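A back-of-the-envelope check of the half-turn hypothesis can be made from the values reported in the Results. The sketch below is not part of the authors' analysis: it assumes that the mean 6MWDs of 476.2 m and 441.2 m are the surveyor's wheel means, and that the number of half-turns is roughly the distance divided by the track length.

```python
# Reported values: mean 6MWD (m) and rater bias (m) for each track length (m).
tracks = {
    "30-m": {"track_length": 30.0, "mean_6mwd": 476.2, "rater_bias": -16.24},
    "10-m": {"track_length": 10.0, "mean_6mwd": 441.2, "rater_bias": -37.73},
}

per_turn_shortfall = {}
for name, t in tracks.items():
    # Assumption: one half-turn per completed length of the track.
    n_turns = t["mean_6mwd"] / t["track_length"]
    per_turn_shortfall[name] = abs(t["rater_bias"]) / n_turns
    print(f"{name}: about {n_turns:.1f} half-turns, "
          f"{per_turn_shortfall[name]:.2f} m missed per half-turn")
```

Under these assumptions the rater's shortfall is close to 1 m per half-turn on both tracks, which is what a roughly constant uncounted turning distance would produce.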

The higher number of unaccounted for half-turns on the 10-m track was also likely to have contributed to the higher MAE values obtained for the rater estimates, most notably on the shorter track. Indeed, the rater estimates were considerably less precise on the 10-m track than on the 30-m track (MAEs of 38.88 m versus 16.24 m respectively). In contrast, the MAE values for the FeetMe® estimates on the 10-m and 30-m tracks were similar (MAEs of 12.86 m and 9.75 m, respectively), showing that the FeetMe® device was more accurate and less sensitive to the track length than the rater.

The results of the current study can be examined from the viewpoint of the minimal clinically important difference (MCID) a test or a tool is able to detect. In chronic obstructive pulmonary disease (COPD), the MCID of the 6MWT has been found to be around 25 m [36]. A similar MCID value has been reported in the literature for coronary artery disease [37]. Higher MCID values of up to 30 m have been reported for patients with Duchenne muscular dystrophy [38], and values of 42 m have been reported for patients with lung cancer [39]. Thus, with MAE values of 9.75 m and 12.86 m for the 30-m and 10-m tracks, respectively, the FeetMe® insoles appear well suited to detecting the MCIDs in all these diseases, irrespective of the track length used for the 6MWT. In contrast, based on the results of this study, the ability of a rater to detect the MCID in patients with these diseases may be limited, particularly in patients with COPD or coronary artery disease or when the test is conducted using shorter tracks. Overall, our findings indicate that the FeetMe® insoles should be suitable for use in clinical studies, as well as for the longitudinal follow-up of patients, provided their test–retest reliability is demonstrated. In addition, the FeetMe® insoles could be used as a tool to quantify the outcomes of an intervention, such as a rehabilitation program or surgery.
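The comparison between measurement error and MCID can be made explicit with a short sketch using the MAEs reported in the Results. Treating "MAE below MCID" as the criterion is a simplification adopted here for illustration, and the coronary artery disease MCID is set to 25 m only because the text describes it as similar to the COPD value.

```python
# MCID values in metres as cited in the text (CAD value is an assumption).
mcid_m = {
    "COPD": 25.0,
    "coronary artery disease": 25.0,
    "Duchenne muscular dystrophy": 30.0,
    "lung cancer": 42.0,
}

# MAEs in metres reported in the Results for each method and track.
mae_m = {
    ("FeetMe", "30-m"): 9.75,
    ("FeetMe", "10-m"): 12.86,
    ("rater", "30-m"): 16.24,
    ("rater", "10-m"): 38.88,
}

# True where the typical measurement error is smaller than the MCID.
error_below_mcid = {
    (method, track, disease): mae < mcid
    for (method, track), mae in mae_m.items()
    for disease, mcid in mcid_m.items()
}

for key, ok in error_below_mcid.items():
    print(key, "MAE < MCID" if ok else "MAE >= MCID")
```

With this criterion, the only combinations that fail are the rater on the 10-m track for the three diseases with an MCID of 30 m or less.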

Finally, the current study indicated that the performance of the FeetMe® device was superior to, and more consistent than, that reported in the literature for other wearable devices and smartphone-based algorithms. Between-study differences in procedures, populations and reference methods make comparisons between studies problematic. However, the study by Shah et al. [23] evaluated a similar type of technology to that assessed in the current study, used analogous age-based inclusion criteria, and had a comparable sample size. The results of Shah et al. demonstrated that the commercially available IMU being tested had an MAE of 19.77 m for 6MWTs conducted on a 15-m track and an MAE of 18.36 m when the tests were conducted on a 20-m track. In our study, the FeetMe® estimates were associated with considerably lower MAEs of 12.86 m and 9.75 m for the 10-m and 30-m tracks, respectively. However, in this case, the superior performance of the FeetMe® device might be explained by the fact that the population included by Shah et al. comprised both multiple sclerosis patients and healthy volunteers.

It is not meaningful to compare the performance of the FeetMe® insoles to that of smartphone-based systems requiring a priori information such as track length or a patient’s height. However, the insoles can be compared to systems that do not require any such information to function. For instance, Ata et al. characterized the performance of the iPhone CMPedometer algorithm on a 100-ft (30.48 m) course and found a bias ± SD of 56% ± 44% [24]. Even after applying a linear correction factor of 0.75 to reduce this error, they found a distance estimation bias ± SD of 8% ± 32%, compared to approximate values of -2.0% ± 3.2% and -1.7% ± 2.4% for the bias calculated for FeetMe® on the 10-m and 30-m tracks, respectively. Juen et al. developed a smartphone-based machine-learning algorithm that evaluated the 6MWD by identifying laps and estimating the per-lap average speed of the patients [28]. The performance of this algorithm was evaluated on a 10-m track, and the results obtained indicated error rates of 1.82% for healthy subjects and 3.78% for patients with pulmonary disease. However, the error characterization approach used in this evaluation differed from that used in other studies, with Juen et al. using a 10-fold cross-validation procedure rather than validation on an independent dataset. In addition, while these results were very encouraging, the applied algorithm had to be calibrated for each subject prior to the test, making this system more complicated to use in real-world clinical practice.
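The approximate relative bias values quoted for FeetMe® can be reproduced from the absolute biases and mean 6MWDs given in the Results. The sketch below assumes that the relative bias is simply the absolute bias divided by the mean 6MWD on the same track; the ± SD terms cannot be recomputed from the values given in the text.

```python
# Reported absolute bias (m) and mean 6MWD (m) for FeetMe on each track.
feetme = {
    "10-m": {"bias_m": -9.0, "mean_6mwd_m": 441.2},
    "30-m": {"bias_m": -8.3, "mean_6mwd_m": 476.2},
}

# Assumption: relative bias = absolute bias / mean distance, in percent.
relative_bias_pct = {
    track: 100.0 * v["bias_m"] / v["mean_6mwd_m"] for track, v in feetme.items()
}

for track, pct in relative_bias_pct.items():
    print(f"{track} track: relative bias of {pct:.1f}%")
```

Expressing the bias as a percentage is what allows the comparison with studies that report only relative errors.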

The current study provided the first assessment of the validity of FeetMe® insoles for measuring 6MWDs and, compared with the validation methods described in previous studies of other devices, used an improved validation approach involving the ground truth provided by the surveyor’s wheel. However, this study also had some limitations. In particular, it would have been interesting to study results obtained from a larger sample population, including both healthy adults and those with gait anomalies, and to evaluate estimates obtained from multiple centers rather than a single center. These limitations should be addressed in future studies involving a larger multicentric study population that includes more elderly participants as well as participants with pathological gait. Most importantly, a study validating the use of the device at home should be performed.

Conclusions

In conclusion, this study showed that the FeetMe® connected insoles can be used as a valid and accurate solution for measuring the 6MWD. Using the data from 32 healthy subjects, we demonstrated that the distance estimates made by the FeetMe® device were consistent with the ground truth measured by a surveyor’s wheel, and were more accurate than the estimations provided by the rater. The accuracy of the FeetMe® insoles was maintained regardless of the length of the track, with MAEs remaining below 13 m on both tracks (for example, 13 m represents only 2.9% of a 450-m walk). The high level of accuracy of the FeetMe® estimates means that the insoles should be suitable for detecting MCIDs in patients with a wide range of diseases, making this device a highly relevant tool for use during patient appraisal. Although further studies are required, the ease of use of the FeetMe® system and the ergonomic design of the insoles mean that the FeetMe® device could eventually be used to safely self-administer 6MWTs at home, with a reduced track length of 10 m to ease space constraints, and could therefore reduce the burden associated with patients commuting to assessment clinics.