1 Introduction

In recent years, the prevalence of chronic diseases has escalated worldwide, exerting a profound impact on healthcare systems. To address the challenges posed by these conditions, innovative healthcare delivery models, such as telerehabilitation, have emerged, offering customized programs that extend beyond traditional hospital settings.

Traditional home-based rehabilitation heavily relies on the exercise programs prescribed by physiotherapists, yet patient adherence to these regimens often falls short [1]. On the other hand, telerehabilitation involves consistent monitoring of patients’ health status and progress, aligning closely with treatment objectives and individual expectations [2, 3].

To ensure the effectiveness and safety of rehabilitation programs delivered at a distance, it is imperative to maintain a level of supervision comparable to that provided by in-person therapists. This necessitates the monitoring of both postural alignment and kinematics. An emerging field in the realm of rehabilitation technologies is dedicated to the advancement of digital biofeedback systems [4, 5]. These systems utilize external sensors to collect motion data, which is then analyzed and conveyed to the user, with the primary aim of delivering support and motivation throughout the home-based rehabilitation process. Widely adopted approaches often rely on wearable and/or vision-based systems to monitor patients during exercises, offering a cost-effective solution for remote analysis of human motion.

The integration of motion capture system technology for recording subjects’ performance can yield reliable outcomes when coupled with specific and precise data processing algorithms. Machine learning algorithms have become increasingly prevalent for motion assessment, with the aim of providing comprehensive movement evaluation and feedback to ensure the accurate execution of gestures. Assistive systems play a vital role in supporting patients with in-home exercises [6, 7]. To be effective, assistive systems need to be adequate, affordable, and easily accessible, while also ensuring user engagement [8]. Furthermore, it is essential for these systems to assess patients’ performance and provide therapists with the necessary data to track progress and make informed clinical decisions [9, 10].

Prior research has delved into computerized approaches for at-home rehabilitation. These advancing systems frequently incorporate intricate interaction designs, providing both visual and auditory cues [8, 11]. They employ marker-based motion capture or Kinect-based setups to evaluate patients’ exercise execution via motion kinematics. Although these endeavors highlight the promise of computer-driven systems in enhancing movement proficiency, they frequently encounter obstacles due to technical intricacies.

The assessment of human motion is an intricate endeavor with wide-ranging implications, spanning from enhancing athletic performance to aiding in diagnostic and therapeutic applications within the field of rehabilitation. Researchers and innovators have invested significant time and effort in developing vision tools designed to monitor and evaluate motion, particularly in remote settings, to address the inherent complexities of this task [12].

The development of digital assistive systems for precise human motion evaluation has posed a formidable challenge in the field of artificial intelligence (AI) research [13]. This challenge is amplified by the diverse range of movements exhibited by individuals, variations in physical capabilities, and the inherent subjectivity involved in the algorithmic approaches used to assess human motion. Constructing an automated system capable of addressing these intricacies is a substantial undertaking, demanding a profound understanding of the diverse spectrum of human movements, physical abilities, and the subjective nature of human motion assessment [14, 15, 16].

While most existing computational approaches to automatic human motion assessment involve comparing exercise movements to predefined templates or ground truth, the goal for digital assistive rehabilitation systems is to accurately identify and count each exercise repetition to provide assistance and feedback [11, 17]. These systems must also categorize motion patterns as correct or incorrect, evaluating factors such as speed, direction, quality, and compensatory patterns to replicate the nuanced work of an in-person physical therapist [18].

Current repositories of rehabilitation datasets often suffer from limited scope and inadequate data modeling. Recognizing the crucial need for ongoing monitoring and support in rehabilitation, our study aims to overcome these limitations and addresses the pressing demand for assistive evaluation systems for rehabilitation exercises. By harnessing artificial intelligence to analyze human motion, we propose annotating exercise motion phases within an existing dataset. These annotations serve a dual purpose: they support the evaluation of execution correctness, so that feedback can be provided to patients, and they enable quantitative analysis. This solution can be a valuable tool for patients, helping to ensure correct exercise performance and offering real-time feedback during sessions. We suggest adapting a previously published dataset to fit the requirements of a digital assistive system aimed at remotely aiding patients, essentially acting as a virtual physical therapist [19, 20]. This approach involves meticulously labeling human movement data to ensure a more personalized and effective rehabilitation experience. By closely monitoring the accuracy of exercise repetitions and offering corrective feedback for any deviations in movement patterns, our adapted dataset enables systems to deliver tailored support to patients [18]. Through these additional annotations, we not only enable systems to track exercise repetitions effectively but also provide actionable insights to rectify any erroneous movement patterns, thus enhancing the effectiveness and customization of rehabilitation endeavors.

This paper underscores the pivotal importance of assessing the quality of human motion within the context of rehabilitation. The rehabilitation data modeling presented here sets the stage for the development of more effective and personalized rehabilitation programs, thereby offering vital support to individuals on their path to recovery. In Sect. 2, we conduct a comparative analysis with other existing studies and datasets. Section 3 outlines the methodology employed for dataset modeling, Sect. 4 presents the dataset obtained, and Sects. 5 and 6 provide a discussion and concluding remarks that underscore the contributions of this work.

2 Related study

Recent advancements in research have ventured into the realm of mathematical modeling and evaluation techniques for the analysis of human motion in the context of physical therapy. These innovative methods can be broadly categorized into three distinct areas: comparison, categorization, and scoring [21].

Comparison techniques involve assessing a patient's exercise performance by contrasting their motion sequences with predefined template motion sequences. These templates are typically derived from accurate executions of exercises by healthy individuals, clinicians, or patients under clinical supervision. Categorization methods, on the other hand, are designed to classify patient activities into discrete groups, such as correct and incorrect movement patterns. Scoring approaches aim to provide automated evaluations of a patient’s motion using personalized clinical or non-clinical scoring systems, utilizing both continuous and categorical scales.

Unlike comparative and scoring analyses, categorization methods play a crucial role in delivering real-time patient feedback and guidance during rehabilitation sessions. The selection of the most suitable computational technique for monitoring and assessing human movements significantly impacts the collection and organization of datasets.

Several datasets have been published in the field of rehabilitation research. Datasets like SPHERE [22], MSR-3D [23], and UTD-MHAD [24] have been compiled and used for monitoring applications, primarily offering graphical or statistical comparisons of patient actions without providing definitive patient evaluations. For example, the SPHERE dataset includes examples of both normal and physically impaired movements for activities like walking, ascending stairs, and sit-to-stand transitions. The KIMORE [25] and UI-PRMD [26] datasets have been designed to develop an evaluation system that assigns a score to exercise performance.

Conversely, other datasets like the Toronto Rehab Stroke Pose (TRSP) [27], Kinect 3D Active (K3Da) [28], and IntelliRehabDS (IRDS) [29] were created to classify movements into categorical classes. For instance, the TRSP dataset contains pose estimations of stroke patients and healthy subjects performing upper limb rehabilitation movements, annotated for regular and compensatory patterns of motion during an assisted-robot rehabilitation session. The IRDS dataset contains gestures of simple rehabilitation movements performed by both healthy and pathological subjects in various conditions (sitting or standing), labeled for correct and incorrect executions.

This work aims to describe a methodology for modeling a rehabilitation dataset that can be adopted to support AI-driven digital assistive systems, capable of assisting and monitoring home programs and providing feedback to improve patient performance. Given the limitations of currently available published datasets in terms of motion data modeling, a subset of the freely available IRDS was chosen for manipulation in alignment with our objectives. The goal is to implement a set of categorical classes of movements associated with the corresponding exercise phase, distinguishing each exercise repetition in finer terms of the quality of the motion pattern [30].

3 Materials and methods

3.1 Original dataset

In pursuit of the primary goal of this study, namely to model a dataset supporting AI-driven digital assistive systems, a subset of the freely available IRDS dataset was manipulated.

This selected subset comprises repetitions of shoulder movements executed by 29 subjects (15 patients and 14 healthy controls). Recorded using a Microsoft Kinect One sensor at 30 frames per second, the dataset provides 3D coordinates of 25 body joints and corresponding depth maps for each frame. Each movement is annotated with exercise type (shoulder flexion, shoulder abduction, shoulder forward elevation), the limb side involved (right, left, both sides), the subject’s position (sitting or standing), and the correctness of the motion pattern (correct or incorrect). The shoulder flexion and abduction exercises comprise gestures performed separately with the left and the right side of the body, whereas shoulder forward elevation is performed with both sides simultaneously.

Table 1 summarizes the gesture index for each exercise.

Table 1 Description of gesture categories included in the new dataset

What sets the IRDS apart from other datasets is its inclusion of natural motions performed by both patients and healthy subjects, encompassing correct and incorrect patterns. Moreover, the dataset features patients with diverse pathological conditions, such as stroke and spinal cord injury, offering a valuable resource for enhancing the monitoring and assistance of upper limb rehabilitation across various populations, including those with neurological and musculoskeletal disorders.

While the current published form of the IRDS dataset serves as a valuable resource for categorizing exercises and assessing their correctness, it falls short of addressing the needs of digital assistive rehabilitation systems. These systems require a finer evaluation, recognizing repetitions of each exercise and assessing them for the completeness of range of motion (ROM) while analyzing incorrect patterns of motion, as suggested by recent works [31, 32]. The ultimate aim is to provide a dataset that facilitates the evaluation of whether an exercise is performed correctly according to its prescription or not.

This work outlines the methods employed to model the IRDS dataset, supporting a movement classification approach for the evaluation of human motions. This approach enables real-time feedback to patients regarding the performance of their prescribed shoulder rehabilitation exercises, allowing for a more detailed analysis of each exercise repetition [30].

3.2 Data modeling

The dataset aims to provide a comprehensive evaluation of ROM and compensatory movement patterns exhibited by patients during shoulder exercise rehabilitation sessions.

To enhance the granularity of the analysis, we associated categorical classes of ROM with specific exercise phases in an individual’s performance. This enables a detailed examination of each movement, considering both the completeness and correctness of its execution.

In the IRDS dataset, each frame, whether it captures a correct or an incorrect movement, was labeled with a ROM execution class. An expert in human physiology assessed the depth images to assign one of four classes: complete concentric phase (up), complete eccentric phase (down), incomplete eccentric phase (incomplete down), and incomplete concentric phase (incomplete up). These labels serve three main objectives in rehabilitation evaluation: segmenting exercise routines into repetitions; assessing each repetition’s completion status; and providing feedback for performance enhancement by indicating which phase of the repetition is considered invalid (concentric, eccentric, or both, i.e., totally incomplete).
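As an illustration of how the per-frame ROM labels could support these three objectives, the following minimal sketch groups a label sequence into repetitions and derives a feedback message for each one. It is not the procedure used to build the dataset: the string label names, the function names, and the rule that a repetition is a concentric run optionally followed by an eccentric run are all assumptions made for the example.

```python
from itertools import groupby

# Hypothetical string names for the four ROM classes described above.
CONCENTRIC = {"up", "incomplete_up"}
ECCENTRIC = {"down", "incomplete_down"}


def segment_repetitions(frame_labels):
    """Split per-frame ROM labels into (concentric, eccentric) repetitions.

    Assumption: a repetition is one run of concentric-phase frames,
    optionally followed by one run of eccentric-phase frames.
    """
    # Collapse consecutive identical labels into one entry per run.
    runs = [label for label, _ in groupby(frame_labels)]
    repetitions = []
    i = 0
    while i < len(runs):
        if runs[i] in CONCENTRIC:
            concentric, eccentric = runs[i], None
            if i + 1 < len(runs) and runs[i + 1] in ECCENTRIC:
                eccentric = runs[i + 1]
                i += 1
            repetitions.append((concentric, eccentric))
        # An eccentric run with no preceding concentric run is skipped.
        i += 1
    return repetitions


def repetition_feedback(repetition):
    """Report which phase of a repetition is considered invalid."""
    concentric, eccentric = repetition
    concentric_ok = concentric == "up"
    eccentric_ok = eccentric == "down"
    if concentric_ok and eccentric_ok:
        return "complete"
    if concentric_ok:
        return "eccentric phase incomplete"
    if eccentric_ok:
        return "concentric phase incomplete"
    return "totally incomplete"


if __name__ == "__main__":
    labels = ["up"] * 3 + ["down"] * 3 + ["incomplete_up"] * 2 + ["down"] * 2
    for rep in segment_repetitions(labels):
        print(rep, "->", repetition_feedback(rep))
```

Counting the returned repetitions gives the repetition count, while the feedback string indicates the phase to correct.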

Furthermore, incorrect movements were labeled to distinguish various classes of ROM, allowing for a nuanced evaluation of repetitions not only in terms of motion completeness but also in terms of the percentage of correct and incorrect patterns. This approach provides insights into the quality of motion and facilitates tracking advancements during the rehabilitation process.
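The percentage of correct and incorrect patterns within a repetition can be computed directly from the per-frame pattern labels. The sketch below is only an illustration: the function name and the label values "correct" and "incorrect" are hypothetical and do not reflect the encoding used in the dataset files.

```python
from collections import Counter


def pattern_percentages(pattern_labels):
    """Return the percentage of frames assigned to each motion-pattern label."""
    total = len(pattern_labels)
    if total == 0:
        raise ValueError("at least one frame label is required")
    counts = Counter(pattern_labels)
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical repetition of four frames, one of them compensatory.
    print(pattern_percentages(["correct", "correct", "correct", "incorrect"]))
```

Tracking these percentages across sessions is one simple way to follow changes in motion quality over time.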

The methodology is based on the hypothesis that separating the evaluation of ROM and the recognition of compensatory movements into distinct stages can enhance the accuracy of AI-assisted rehabilitation systems. This separation reduces the complexity of each task by minimizing the number of classes, thereby potentially improving overall system performance.

Formally, we consider a set of input–output pairs \(D=\{(x_i,y_i)\}_{i=1}^N\), where \(x_i \in \mathbb{R}^n\) are the N samples of the input feature space obtained from the 3D video motion analysis and \(y_i \in C = \{C_j \vert 1 \le j \le J\}\) are the class labels to which these features correspond, with J the number of classes. The evaluation looks for a decision function \(f:\mathbb{R}^n \rightarrow C\) that, given a sample (in this work, features of the body joints’ 3D pose data), determines the output class describing the kind of execution performed: \(x \rightarrow y = f(x,w)\), where w denotes the parameters of f. Throughout the following, let \(X(z) = \{X_t, m_t, c_t \}_{t=1}^N\) denote a labeled sequence of N frames of one of the three shoulder exercises \(z \in \{ S1, S2, S3 \}\), where \(X_t\) denotes the acquired RGB-D image at frame t and \(m_t\) represents the ROM phase of the exercise execution. The moments of a given exercise comprise both key ROM phases (concentric and eccentric), each in a complete or an incomplete form, so that \(m_t \in C = \{C_j \vert 1 \le j \le 4\}\). Moreover, \(c_t \in C = \{C_j \vert 0 \le j \le 2\}\) denotes whether frame t belongs to a physiological or a compensatory motion pattern class.
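The separation of ROM evaluation and compensatory pattern recognition into two decision functions can be sketched as follows. This is a minimal illustration under stated assumptions, not the system proposed here: the names `FramePrediction` and `two_stage_predict` are hypothetical, the feature vector is assumed to be the flattened (x, y, z) coordinates of the 25 joints, and the two classifiers are placeholders for any trained models.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

N_JOINTS = 25                # joints per frame, as in the dataset description
N_FEATURES = N_JOINTS * 3    # assumption: x is the flattened (x, y, z) vector


@dataclass(frozen=True)
class FramePrediction:
    rom_phase: int   # m_t, an index in 1..4
    pattern: int     # c_t, an index in 0..2


def two_stage_predict(
    x: Sequence[float],
    rom_classifier: Callable[[Sequence[float]], int],
    pattern_classifier: Callable[[Sequence[float]], int],
) -> FramePrediction:
    """Apply two independent decision functions to one frame's features."""
    if len(x) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} features, got {len(x)}")
    m = rom_classifier(x)        # stage 1: ROM phase (4 classes)
    c = pattern_classifier(x)    # stage 2: motion pattern (3 classes)
    if m not in range(1, 5):
        raise ValueError(f"ROM phase index out of range: {m}")
    if c not in range(0, 3):
        raise ValueError(f"pattern index out of range: {c}")
    return FramePrediction(rom_phase=m, pattern=c)


if __name__ == "__main__":
    # Placeholder classifiers that always return a fixed class.
    frame = [0.0] * N_FEATURES
    print(two_stage_predict(frame, lambda x: 1, lambda x: 0))
```

Each stage handles a small label set (four and three classes, respectively) instead of a single classifier over all combinations.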

4 Results

The manipulated dataset contains 1488 files for shoulder rehabilitation movements, following the structure of the original dataset.

Each file is associated with (i) subject ID, (ii) gesture type, (iii) repetition number, (iv) label of correct or incorrect movement, and (v) subject position.

The nomenclature of the files is as follows:

$$\text{SubjectID.GestureType.RepetitionNo.CorrectMovement.Position.csv} \tag{1}$$
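The naming scheme in Eq. 1 can be split into its five fields with a few lines of code. The sketch below is illustrative only: the names `RecordingName` and `parse_recording_name` are hypothetical, all fields are kept as strings because their encoding is not specified here, and the example file name is invented.

```python
from pathlib import Path
from typing import NamedTuple


class RecordingName(NamedTuple):
    subject_id: str
    gesture_type: str
    repetition_no: str
    correct_movement: str
    position: str


def parse_recording_name(path):
    """Split a file name of the form given in Eq. 1 into its five fields."""
    name = Path(path).name
    if not name.endswith(".csv"):
        raise ValueError(f"not a csv file: {name}")
    fields = name[: -len(".csv")].split(".")
    if len(fields) != 5:
        raise ValueError(f"expected 5 dot-separated fields, got {len(fields)}")
    return RecordingName(*fields)


if __name__ == "__main__":
    # Invented example file name; the field values are placeholders.
    print(parse_recording_name("data/215.0.3.1.stand.csv"))
```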

The data contents of each file, presented in csv format, can be described as follows: (i) each file contains n frames, (ii) each frame contains spatial information of 25 joints along the three axes (x, y, z), and (iii) each frame contains temporal information on the ROM phase execution.

The ROM phase executions are available along with depth image files. Figure 1 displays a chronological sequence of motion for ROM classes, exemplifying their temporal order in execution.

Fig. 1 A chronological sequence of left-side shoulder flexion, labeled within the context of ROM classes, captured in depth images of Patient ID 215

The comprehensive dataset composition is outlined in Table 2, providing a detailed breakdown of ROM classes for both the healthy and patient categories.

Table 2 Description of the new dataset in terms of samples for each ROM class category along the various gesture types

The distribution of data for each ROM class was systematically examined within each patient and healthy subject category for various gesture types. Figure 2 depicts a bar plot that distinguishes between patient and healthy category IDs for each gesture type.

Fig. 2 Distribution of ROM classes by gesture type for the healthy and patient subject categories

5 Discussion

Ensuring the effectiveness of at-home rehabilitation programs relies on providing both qualitative and quantitative evaluations of exercises to patients. Recent advancements in machine learning and computer vision techniques have sparked increased interest in automating the assessment of rehabilitation exercises [33]. However, significant challenges and unanswered questions still exist before we can widely deploy datasets to develop AI-assisted digital systems for both home-based and clinic-based settings [34, 35].

This paper’s contribution lies in presenting a dataset focusing on shoulder rehabilitation exercises. Specifically, a subset of the IRDS dataset, involving exercises performed by 29 subjects, was modeled to assess repetitions at a granular level. This modeling aims to distinguish ideal ROM execution from incomplete execution and to identify incorrect patterns of movement.

To achieve this, a comprehensive set of key ROM poses was meticulously labeled for both correct and incorrect movement patterns at each frame of the exercises. The proposed method offers several advantages for future AI applications in exercise assessment and feedback assistance.

A notable aspect of this method is its applicability in clinical practice. Associating specific categories of motion with each phase of exercise performance enables the system to provide real-time feedback that simulates the expertise of a rehabilitation specialist. During the exercise, camera-captured data is processed by a 3D pose estimation algorithm, extracting features that are fed into an AI system for frame-by-frame motion analysis according to predefined evaluation rules. The system, trained to identify correct concentric and eccentric ROM poses, can discern individual repetitions within a continuous motion sequence, evaluating them in terms of completeness [18, 36]. Additionally, the system can recognize incorrect patterns of motion, offering detailed feedback on deviations from correct execution and quantifying the overall exercise performance quality by analyzing pose validity frame by frame.

One strength of this dataset is its generalizability across subject characteristics (such as age, sex, and physical impairment conditions) and exercise settings (e.g., sitting or standing), together with the presence of both correct and incorrect movement patterns. However, some limitations exist, such as the limited variety of subject positioning in front of the camera. To enhance the algorithm’s generalizability for rehabilitation exercises, it is important to consider greater variability in subject positioning, including various distances and angles in front of the camera (e.g., sagittal, frontal, and lateral views) [37].

Furthermore, the data reveal a notable class imbalance, especially within the category of patients with incomplete execution of ROM. We hypothesize that these individuals might have physical impairments that impede their ability to attain the full ROM. However, given the absence of specific medical condition data for each subject, a more comprehensive analysis of the underlying factors cannot be carried out at this time.
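When such imbalanced labels are used to train a classifier, one common mitigation is to weight each class inversely to its frequency. The sketch below illustrates the widely used "balanced" heuristic, in which the weight of a class is the number of samples divided by the product of the number of classes and the class count. It was not applied in this work, and the function name and label names are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class as n_samples / (n_classes * class_count)."""
    if len(labels) == 0:
        raise ValueError("at least one label is required")
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * n) for label, n in counts.items()}


if __name__ == "__main__":
    # Invented label counts: the rarer class receives the larger weight.
    print(balanced_class_weights(["up"] * 6 + ["incomplete_up"] * 2))
```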

Future efforts should aim to expand this dataset, serving as a crucial starting point for further research in the development and testing of AI-assisted digital systems for rehabilitation purposes.

6 Conclusions

The field of rehabilitation technologies is increasingly focused on developing digital assistive systems to support and motivate patients during home sessions. Consequently, a crucial challenge for researchers is devising methods for automating the assessment of patient performance and functional recovery.

Despite a wealth of previous literature on computational approaches for motion evaluation, there has been relatively limited exploration of how to model rehabilitation datasets for the dual objective of evaluating performance quality and recognizing specific types of movement errors in order to give corrective feedback.

This study represents an initial step towards advancing the development of future datasets, poised to serve as the foundation for new systems offering remote assistance in rehabilitation procedures. This transformation is expected to become widespread and fundamentally reshape the assessment of home rehabilitation programs. Our methodology’s strength lies in the meticulous labeling and annotation of datasets based on movement phases, crucial for constructing systems akin to virtual coaches. However, certain limitations persist. The inclusion of quantitative metrics and the annotation of motion according to performance evaluation categories are imperative for further refinement and utility. Nevertheless, our primary objective is to provide assistive feedback to patients during online rehabilitation sessions.

We advocate for other researchers to integrate movement phase annotation into their datasets or provide comprehensive information for subsequent labeling, thus facilitating progress in this field. Furthermore, it is equally vital that the data encompass a broader spectrum of subjects with diverse physical and health-related characteristics. Nonetheless, addressing privacy concerns and securing informed consent from patients is a significant challenge in this domain. Patients must be fully informed about how their data will be used and its potential impact on their treatment. Additionally, any annotated and labeled data containing personal information should be accompanied by clear explanations of its practical application and how it can enhance clinical practices.

Ultimately, research into constructing assistive systems for remote support hinges heavily on dataset creation and curation. Through our efforts, we seek to underscore the importance of developing and disseminating datasets pertinent to rehabilitation, offering online patient information and objective performance metrics for clinical outcome monitoring.