Prediction of Knee Joint Compartmental Loading Maxima Utilizing Simple Subject Characteristics and Neural Networks

Joint loading may affect the development of osteoarthritis, but patient-specific load estimation requires cumbersome motion laboratory equipment. This reliance could be eliminated using artificial neural networks (ANNs) to predict loading from simple input predictors. We used subject-specific musculoskeletal simulations to estimate knee joint contact forces for 290 subjects during over 5000 stance phases of walking and then extracted compartmental and total joint loading maxima from the first and second peaks of the stance phase. We then trained ANN models to predict the loading maxima from predictors that can be measured without motion laboratory equipment (subject mass, height, age, gender, knee abduction-adduction angle, and walking speed). When compared to the target data, our trained models had NRMSEs (RMSEs normalized to the mean of the response variable) between 0.14 and 0.42 and Pearson correlation coefficients between 0.42 and 0.84. The loading maxima were predicted most accurately using the models trained with all predictors. We demonstrated that prediction of knee joint loading maxima may be possible without laboratory-measured motion capture data. This is a promising step in facilitating knee joint loading predictions in simple environments, such as a physician’s appointment. In future, the rapid measurement and analysis setup could be utilized to guide patients in rehabilitation to slow development of joint disorders, such as osteoarthritis. Supplementary Information The online version contains supplementary material available at 10.1007/s10439-023-03278-y.

net intersegmental forces and moments, static optimization (SO) to solve individual muscle forces, and finally calculation of joint contact forces using joint reaction analysis (JRA).
In the experimental data extraction marker trajectories and ground reaction forces (GRFs) were read from C3D motion capture files.The stance phase was detected based on the vertical GRF threshold (dataset depending, see details from Dataset-specific notes in the musculoskeletal analysis workflow) and further analysis was commenced only for the stance phase.We analyzed a single stance phase per trial.Automated checks were implemented to identify those ground contacts that were made by the leg of interest (more symptomatic leg in the CAROT dataset, dominant leg in the Fukuchi dataset, and right leg in the Camargo, Horst, and Schreiber datasets), verify that the contact with the force plate was made only with a single leg and the foot did not make contact to the ground beyond the force plate.
A static standing trial was used to scale the generic musculoskeletal model and adjust the positions of virtual markers on the model.The markers used for scaling the segments differed between the Plug-in-Gait marker set and had biomechanical modeling performed on the trials, we utilized joint centers for scaling the segments.For the other datasets, we used markers on easily defined anatomical landmarks.The location of medial and lateral tibiofemoral contact points in the frontal plane was set according to a regression equation that estimated femur intercondylar distance based Knee intercondylar distance).During the scaling, we also estimated participant-specific frontal plane knee alignment (varus/valgus angle) to be used as a predictor in the artificial neural networks.This was done by opening the varus/valgus DoF of the knee joint and letting the IK algorithm estimate the joint angles during static standing.Knee alignment determined using this approach has been shown to correlate with knee alignment measured from radiographs 13,21 .However, the participant-specific knee alignment was not utilized for subsequent musculoskeletal simulations since our validation showed that assuming the knee alignment to be neutral provided a better estimate of the medial-lateral distribution of the tibiofemoral contact forces.Finally, we opted to preserve the mass distribution of the segments during scaling.
After scaling the segments of the model, we scaled the muscles forces of the model by multiplying The muscle strength scaling method has been used previously 7 and follows from allometric scaling of surface area-based properties (muscle strength depends on the physiological cross-sectional area of the muscle).
Joint kinematics during walking were calculated using the IK algorithm which estimates joint angles by minimizing the sum of the squared difference between the positions of the virtual markers of the model and the positions of the experimental markers.The input marker trajectories used was dataset dependent but we favoured markers on clusters when available.When used as input in subsequent analyses, the IK was lowpass filtered with a 12-Hz cutoff frequency.We used the IK and the GRFs as inputs to solve the net intersegmental forces and moments using the ID algorithm.Subsequently, using the SO tool, we solved the contributions of individual muscle forces to the inverse dynamics solution by minimizing the sum of squared muscle activations.The model was supplemented with reserve actuators at each DoF to prevent failing simulations in case the muscles in the model would not be able to provide the required joint moments.The optimal force (joint moment provided with activation coefficient of 1) of these reserve actuators was set to 1 Nm discouraging their utilization during SO.However, the reserve actuator in the hip flexion/extension DoF was defined so that it had 1 Nm optimal force in the extension direction but a 100 Nm optimal force in the flexion direction.Thus, SO favoured using the reserve actuator for providing the hip flexion moment.The decision was based on our validation of tibiofemoral contact force estimation and the findings by Simonsen et al. 18 suggesting that hip flexion moment in walking is, at least partly, provided by passive structures such as ligaments rather than muscle forces.The hip flexor reserve actuator was used to model the effect of passive structures providing the moment.
Finally, we used the JRA tool to estimate the resultant forces and moments transferred between consecutive bodies as a result of all loads acting on the model.From this analysis, we extracted the forces applied to the medial and lateral contact points of the tibia (representing forces transmitted through medial and lateral compartments of the tibiofemoral joint).From these forces, we concentrate on the component of the force along to long axis of the tibia (i.e.component perpendicular to the tibial plateau).In case the lateral compartment contact force reported by the JRA was a physiologically impossible tensile force, we set the force to zero.In this case, it is assumed that the lateral compartment is unloaded and the force required to stabilize the tibiofemoral joint in the frontal plane was provided by the lateral collateral ligament 22 .Total tibiofemoral contact force was calculated as the sum of the medial and lateral compartments.
We refer to the forces calculated for the tibiofemoral joint as knee joint contact forces (KJCFs).
The KJCF outcomes that we tried to estimate using artificial neural networks were the loading response and terminal extension peaks (the first and the second peaks of the stance phase, respectively) and the full-stance peak (the higher peak of the first and second peaks) of the medial, lateral, and total (medial and lateral sides summed together) KJCF.We also calculated medial force ratio (MFR) of the loading response, terminal extension, and total KJCF by dividing the value of the corresponding medial KJCF by the peak of corresponding full-stance KJCF at the instant where the peak occurred.Therefore, we tried to estimate nine KJCF outcomes and three MFR outcomes.

Validation of the musculoskeletal modeling and simulation
Optimization and validation of the musculoskeletal modeling and simulation pipeline was done using CAMS-Knee dataset 19 .The dataset provides direct in vivo measurement of threedimensional forces and moments applied to the tibial component of total knee replacement from six participants.The compressive force applied on the medial compartment was calculated as , where Fz is the compressive force applied to the tibial tray of the implant, My is the frontal plane moment and l is the distance between the medial and lateral contact points 8 .The dataset also provides motion capture data (marker trajectories and GRF) and we used these to simulate the KJCFs.The simulated KJCFs were compared with the directly measured.
Walking trials with self-selected walking speed were utilized to optimize and validate the simulation pipeline while data from stair descent was used for scaling the musculoskeletal models as it included a phase of static standing.Three to four stance phases of walking were included in the validation from each of the six participants and used for selecting the following parameters for modeling and simulation pipeline of the current study: participant-specific vs. generic frontal plane alignment of the knee, participant-specific vs. generic femoral intercondylar distance, reserve actuator strength and muscle force scaling.The reasoning for exploring these parameters is briefly explained below.
Frontal plane knee alignment (knee adduction angle) is known to be an important factor that affects the medio-lateral distribution of tibiofemoral contact forces 8 .Measurement of dynamic knee adduction angle during gait is heavily affected by soft-tissue artefacts 1 and thus, it is unreasonable to estimate kinematics of this DoF based on marker trajectories.However, it may improve the estimation of the medio-lateral distribution of tibiofemoral contact forces if the knee alignment is adjusted participant-specifically and kept statically in that alignment for further analyses.In absence of imaging-based knee alignment measurement, the alignment can be estimated based on motion capture data 13,21 .We tested the effect of having the frontal plane knee alignment as neutral (0 degrees of adduction) or as participant specific based on motion capture data and found that the assumption of neutral knee alignment produced a slightly better and more consistent estimate of the medio-lateral distribution of KJCFs (Fig. S1).More specifically, keeping the frontal plane knee alignment neutral produced a more consistent estimate of the MFR in which lateral unloading was not present while all participants presented similar predominant loading of the medial compartment as calculated also from the in vivo data.The mean MFR was also slightly better estimated at the peak of the total tibiofemoral force with neutral compared to informed knee alignment.However, the total KJCF was slightly better estimated in the first half of the stance when knee alignment was estimated using motion capture.It is worth noting that we observed the more consistent MFR estimation using the neutral knee alignment although there was an average of 3.8 degrees of varus alignment in the participants as measured by post-surgery computed tomography 9 and we were able to reasonably estimate the alignment using the motion capture data (r = 0.82).We selected to keep the knee alignment neutral for the analyses.However, the alignment was estimated during the scaling process and the information on the estimated alignment was saved for use in the artificial neural network prediction of KJCFs.The distance between the medial and lateral tibiofemoral contact points in the implemented knee model affects the medio-lateral distribution of tibiofemoral contact forces 11 and hence may be important to set them participant-specifically when possible.We tested the effect of this by running the analyses with a model having the intercondylar distance estimated based on a regression equation with the distance between the markers over the medial and lateral epicondyles of the femur as the input 22 and with a model having the intercondylar distance set based on known implant geometry.The intercondylar of the implant was found by measuring the distance between the apex points of the convex surfaces of the femoral implant geometry.The tibial tray had a corresponding concave shape and thus it was assumed that the center of pressure on both compartments is located on these points.It was found that the model with intercondylar distance informed by implant geometry produced a better estimate of the medio-lateral distribution of tibiofemoral contact forces, especially in early stance (Fig. S2).The average intercondylar distance obtained using the regression equation was 73 mm for this population which is considerably longer than typically observed (see Fig. S5).This difference is potentially due to differences in the populations and the amount of soft tissue over the epicondyles.Since in this study we had participants with varying body compositions (from lean to obese) we decided to create a regression equation to estimate the participant-specific intercondylar distance from a measurement that is not depending on body composition (see section Knee intercondylar distance) and used this regression equation to set the intercondylar distance for the analyses.Preliminary analyses with the CAMS knee dataset indicated that the second peak of the simulated tibiofemoral contact force may be overestimated.The overestimation was associated with a high rectus femoris muscle activation during the late stance when the internal hip flexor moment occurs.This causes a cascade of compensatory muscle activations in hamstrings and gastrocnemius to overcome the knee extension moment that the rectus femoris creates while flexing the hip.
Simonsen et al. 18 concluded based on their experimental findings that the hip flexor moments in late stance is mainly due to passive elastic structures resisting hip joint extension.As typical musculoskeletal models used for simulating walking, also the one utilized in the current study, does not model passive structure around the joints, the muscles in the model are required to generate the moments that those structures create.In the case of hip flexor moment, this limitation in the model may cause the overestimation in the second peak of the tibiofemoral contact force.
To overcome the limitation we tested if supplementing the model with a strong reserve actuator in flexion direction would improve the accuracy of the simulated KJCFs.The rationale here is that when the model has a strong hip flexor reserve actuator the optimization favours using the reserve actuator to provide the requited hip flexion moment which in turn would prevent potentially unrealistic activation of the biarticular hip flexor, rectus femoris.Thus, the strong reserve actuator in the model can be viewed as a representation of the moments provided by the passive structure around the hip.It should be noted that this approach would not be appropriate if estimating loading of the hip joint.Our analyses showed that the accuracy of the estimated peak of the total KJCF in the second half of the stance improved when the optimal force of the hip flexor reserve actuator was >10 Nm.A value of 100 Nm provided a slightly better estimation of the total KJCF at midstance and a slightly better MFR in the second half of the stance than 1000 Nm.Thus, we selected the value of 100 Nm for the analyses of this study.However, it should be noted that the individual responses to the reserve actuator strength were highly variable.Here, we selected the option that provided the best mean prediction.Future studies may utilize other solutions such as EMG informed simulations of modeling the passive structures around the hip.Muscle strength of the model, i.e. maximal isometric muscle force of each muscle, needs to be adjusted since the body masses vary a lot between individuals in the dataset and differ from the generic model.In lack of any other relevant information, the muscle strength was adjusted by scaling with the assumption that muscle strength adapts to the habitual loading of the body which is dominated by the loading induced by body mass during daily activities.Therefore, a person with higher body mass is assumed to also have stronger muscles 4 .We tested the following scaling factors for the muscle strengths: , which represent, respectively, simple scaling based on body mass, allometric scaling, and scaling based on body mass but all muscle forces increased by 50% to make sure that muscle activations do not saturate to maximum muscle activation.The strength scaling had only a minor effect on the estimated KJCFs (Fig. S4).This was expected since in a task requiring submaximal muscle forces the differences in muscle strength between models can be compensated by adjusting muscles activation during SO.However, we observed a more marked overestimation of the total KJCF at the second half of the stance when the scale factor was set to option 3. We selected option 2 for its strong theoretical basis (allometric scaling of a variable that is dependent on the surface area, in this case, the physiological cross-section area of the muscle) and experimental confirmation in lean individuals 4 .

Knee intercondylar distance
Tibiofemoral contact forces are transmitted through two contact points in the musculoskeletal model utilized in the current study.The location of the contact point in the frontal plane affects the distribution of the contact forces between the medial and lateral compartments of the tibiofemoral joint 11 and hence it is reasonable to set them participant-specifically when possible.A regression equation has been previously established to estimate the femoral intercondylar distance based on the distance between markers placed over the femoral epicondyles 22 and utilized to set the intercondylar distance (ICD) for musculoskeletal models to examine compartmental loading in the tibiofemoral joint 16 .However, this previously established regression equation is critically dependent on the soft tissue thicknesses around the knee.As this study included participants with varying levels of adipose tissue, we created a regression equation to estimate ICD based on body height.We measured ICD (distance between centre points of each condyle) from magnetic resonance images of 50 participants from the CAROT dataset and obtained a positive correlation between the participants' body height and ICD (r = 0.584, p < 0.001).The correlation between tibia or femur lengths and ICD were also tested but found to be weaker than the correlation between body height and ICD.The regression model (Fig. S5) was used to set the intercondylar distance for the participant-specific musculoskeletal models in the current study.Similarly, to previous studies, it was assumed that the medial and lateral contact points of the tibiofemoral joint are located equidistant from the knee joint centre 16,22 .Thus, both contact points were located ½*estimated ICD from the knee joint centre in the frontal plane.

Automatic KJCF peak detection
The KJCF time series in lateral and medial compartments, as well as in their summed profile, is characterized by two maxima resulting from the loading response and terminal extension phases and a local minimum between them.We identified the local minimum by first assuming its location as the minimum between 40% and 60% of the KJCF time series.If we detected the minimum at the edge of this range, the range was broadened until the minimum was detected inside the range.
There was generally no need to broaden the range because the local minimum was easily detected.
Loading response and terminal extensions peaks were then defined as the maximum values of the time series on the left and right side of the local minimum, respectively.In some datasets, artefacts were detected as sharp peaks in the beginning or the end of the time series, which could in some cases have values higher than the peaks of interest.To avoid identifying these artefacts as loading response and terminal extension peaks, we used the following algorithm: whenever the maximum was found within a set number of percentage points of the edge of the range (between the beginning of the time series and the local minimum for the loading response peak, and between the local minimum and the end of the time series for the terminal extension peak), the range was narrowed from that side by one percentage point and the maximum was detected again.We used three percentage points as the threshold for loading response peaks and eight percentage points for the terminal extension peaks.
After these detection steps, if we still failed to detect the peaks or the local minimum, the resulting invalid or undefined peak values were set to -1.We then omitted data points with peak values of -1 when the dataset was constructed for artificial neural network analysis.

Dataset-specific notes in the musculoskeletal analysis workflow
The subjects in all datasets performed the walking trials barefoot.However, varying levels of noise and artefacts in the GRFs and marker trajectories, as well as nonstandard marker sets in different datasets, necessitated the use of different parameters in identifying the stance phase, filtering the data, and determining the settings for musculoskeletal simulation-based analyses.Some data may have been filtered twice, but the effect of repeated filtering on the data is negligible.

CAROT
The force threshold for stance phase detection was set to 10 N and the GRF data was lowpass filtered with a 4 th order zero-lag Butterworth filter at 12 Hz in Vicon Nexus.The marker trajectories were smoothed as part of the plug-in gait pipeline in Vicon Nexus with a Woltring filter.

Schreiber
The force threshold for stance phase detection was set to 10 N. The GRF and marker trajectory data in the dataset were already lowpass filtered at 15 Hz and 6 Hz, respectively 17 .Therefore, we applied no marker trajectory filtering but filtered the GRF data at 12 Hz.

Fukuchi
The force threshold for stance phase detection was set to 200 N and the GRF data and marker trajectories were lowpass filtered at 12 Hz using a 4 th order Butterworth lowpass filter.A large force threshold was necessary to limit the stance phase because otherwise, the lowpass filtering would have caused strong artefacts to COP signal near the time boundaries of the stance phase.

Horst
The force threshold for stance phase detection was set to 300 N and the marker trajectories and GRF data were lowpass filtered at 12 Hz using a 4 th order Butterworth filter.Force and moment data were filtered for the whole trial, but COP data were filtered only for the detected contact period.

Camargo
The force threshold for stance phase detection was set to 50 N and both the GRF data and marker trajectories were lowpass filtered at 12 Hz using a 4 th order Butterworth filter.There were no separate standing trials in the dataset but scaling was performed on a static standing phase in each walking trial.

Dispersion of response variables in the combined dataset
The values of different response variables and their standard deviations in the combined dataset are presented in Figure S6, where the height of the bar indicates the mean of the response variable and the whiskers indicate its standard deviation.In Figure S7, the range of the values and the 25 th and the 75 th percentiles are shown, while Figure S8 illustrates the values normalized to the bodyweights of the subjects; the central mark denotes the median value, the edges of the box the 25 th and 75 th percentiles, and the whiskers the range of all values.Full-stance summed bodyweightnormalized KJCF peaks have a median of 3.15, 25 th percentile of 2.79, and 75 th percentile of 3.59.

Folds in the outer loop
As shown in Figure S9, the full dataset is divided into k1 cross-validation folds.For each crossvalidation fold, we construct a training set and a test set.The training set is constructed from all the data in all folds except the current fold.The data in the current fold then constitutes the test set.
For example, the training set for the first fold (Training set of fold 1) comprises the data from folds 2 to k1.The test set for the first fold (Test set of fold 1) is just the data in the first fold.

Subfolds in the inner loop
Next, the training set is treated like a whole dataset; for the inner cross-validation loop, it is divided into k2 folds.Let us call these inner loop folds subfolds to separate them from the folds in the outer loop.
Like in the outer loop, we construct a training set and a test set for each subfold; the data in a subfold is the test set for that subfold, and the data in all other subfolds is the training data for that subfold.
In the illustrated example, data from folds 2 to k1, also known as the training set of fold 1, are divided into k2 subfolds; each subfold is assigned a training set comprising the data of all other subfolds (training set of subfold 1), while the data in that subfold is its own test set (test set of When training stops, the network predicts response values in the test set; the error of these predictions is the error for that subfold.We evaluate that error for all desired hyperparameter combinations.The best hyperparameter combination for each subfold is defined as the combination that that results in the smallest RMSE.In summary, for each subfold and each hyperparameter combination, we trained a separate artificial neural network and evaluated its accuracy on the data in that subfold.
Training and evaluation of outer loop networks Now, for each fold in the outer loop, we have k2 candidates for the best hyperparameter combination.Quite often the candidates are not unique.Therefore, we select the most frequent combination as the hyperparameter combination for each outer fold.As a result, each outer fold has its own hyperparameters, although some folds may share the same hyperparameters.In many datasets, subjects had trials at different velocities.For example, each subject in the CAROT dataset had trials at a comfortable walking speed and a fixed, instructed walking speed.
Therefore, it is reasonable to use weights that give different trial types the same importance regardless of the number of trials within that trial type.On the other hand, if a certain trial type had just one sample and another type had 20 samples, it could be argued that the latter trial type should be weighted more during training because it is less likely to have a high random error in the mean target value due to the high number of repetitive samples.
Additionally, because the trial type structure varied between datasets, creating general weighting rules that apply to all datasets is difficult.However, creating a customized weight-determining algorithm for each dataset would be cumbersome.We devised a satisfactory compromise with four main assumptions that were used to create a general algorithm that created the weights for all datasets.
First, we assumed that initially each subject is equally important in training the prediction algorithm.Therefore, we weighted the trials under each subject so that the sum of the weights equaled 1 for each subject.It made no difference if the subjects had different kinds of trials.
Second, we assumed that the fewer trials there are for a subject, the more random error there is in the biomechanically analyzed KJCF output compared to the actual value.Therefore, we applied a weight to all trials under an individual subject; that weight is a measure of the amount of random error.The measure was chosen as the square root of the number of trials under that individual subject.Now, if a subject had only one trial, it was weighted with a value of 1.If a subject had 4 trials, they were both weighted with a value of 2. If a subject had 16 trials, they were all weighted with 4. This emphasized subjects with many trials over subjects with few trials without completely overriding the effects of the first assumption.This method has flaws, because in the dataset some subjects had trials where they had been instructed to walk with different velocities.If we had wanted to be more exact, we would have applied the second assumption to these unique combinations of subjects and instructed velocities instead of only subjects.However, we believe that this method was reasonable while being simple to implement.
With the first and the second method combined, if there were N trials under a subject, then each trial was weighted by .Thus, by summing the weights of all trials under a subject, the weight for that subject became .
Third, we assumed that each dataset except the CAROT dataset is equally important in training the prediction algorithm.Therefore, we wanted to weigh the trials under each dataset except CAROT so that the sum of the weights for each dataset equaled 1.This assumption was like the first, but for datasets with respect to the number of trials instead of subjects with respect to the number of trials.The CAROT dataset, unlike all other datasets included in this study, comprised represented the gait the ANN should generalize, and the weights of CAROT trials were multiplied by 0.5.This decision was supported by the fact that the gait kinematics of KOA patients have been shown to differ from those of healthy controls 2,14 .
Fourth, we assumed that the fewer subjects a dataset has, the less value from variety it has.
Therefore, we wanted to give greater weights to datasets with many subjects and smaller weights to datasets with few subjects.We selected the square root of the number of subjects in the dataset as the weight for the dataset.
For each trial, the weights from these four assumptions were multiplied together to obtain the final weight.

Validation of the artificial neural networks
Between k-fold cross-validation and holdout validation, it is often argued that cross-validation is better if one has the computational resources to use it, because it trains the model and checks it against a test set several times while utilizing the entire dataset for training 5 .This is unlike in holdout validation, where only a portion of the dataset can be used for training and the model is trained and checked against the test set only once 20 .Tohka and van Gils recommend crossvalidation for datasets containing less than one thousand subjects 20 .
We opted, as per common convention, to use 5-fold cross-validation: the entire dataset is split into

Figure S1 .
Figure S1.Simulated and measured total tibiofemoral contact forces and medial force fraction

Figure S2 .
Figure S2.Simulated and measured total tibiofemoral contact forces and medial force fraction

Figure S3 .
Figure S3.Simulated and measured total tibiofemoral contact forces and medial force fraction

Figure S4 .
Figure S4.Simulated and measured total tibiofemoral contact forces and medial force fraction

Figure S6 .
Figure S6.Mean knee joint contact force peaks for different response variables (heights of bars)

Figure S7 .
Figure S7.The medians of response values (red marks), the range of response values between 25 th

Figure S8 .
Figure S8.The medians of the bodyweight-normalized response values (red marks), the ranges of subfold 1).Training and evaluation of inner loop networks Then, the training set is split into a weight training set and a validation set.The weight training set is used to modify the weights of the artificial neural network iteratively during training; each iteration, the network at that point is used to predict response values in the validation set, and the error is calculated for the validation set.When the error stops decreasing for 6 consecutive iterations, the training stops.
Similarly to how the training sets of inner folds were split, the training set of each outer fold is split into a weight training set and a validation set.The weight training set is used to train the network, and the validation set is used to determine when to stop training.Lastly, the RMSE for the subfold is evaluated using the test set.Finally, we take the mean of all k1 RMSE values to determine the RMSE of the artificial neural network model.Pearson correlation coefficients are processed similarly.Note that the illustration and the explanation describe a case where early stopping is used with the trainlm algorithm.In the case of Bayesian regularization with the trainbr algorithm, no validation set exists, and the weight training set is the entire training set.Formulation of the weights of the artificial neural network training samplesMany trainable algorithms allow the specification of weights for different data samples during training.If no weights are specified, each sample is equally important in constructing the structure of the prediction algorithm.Because the combined dataset contained samples from datasets of different sizes and different subjects within a single dataset had a varying number of trials, it is justified to specify weights to avoid gross overrepresentation of a single subject, dataset or trial type, which could cause a significant bias.

5
Figures Figure S10 Figure S14 illustrate some KJCF curves that were manually omitted based on

Figure S10 .
Figure S10.An example of KJCF curves where the automatic stance detection failed to detect the full stance phase and the resulting KJCF curve contains approximately half of the stance phase.

Figure S11 .
Figure S11.An example of KJCF curves where the automatic stance detection detected several stance phases as one stance, and the force plate forces had an offset artifact.

Figure S12 .
Figure S12.An example of KJCF curves where the static optimization stage of musculoskeletal analysis failed to reach sufficient activation in hip muscles.

Figure S13 .
Figure S13.An example of KJCF curves where the force plate moment artifact in the early stance phase distorted the beginning of the KJCF curves.

Figure S14 .
Figure S14.An example of KJCF curves where a single-frame marker jump in motion capture data distorted some joint angles and the middle of the KJCF curves.

Figure S15 .
Figure S15.An example of KJCF curves where several marker jumps in motion capture data