Introduction

Previous research has shown that computer games are a promising adjunct to improve patient motivation in physiotherapy regimens [1,2,3,4]. Gamification principles can make rehabilitation exercises more enjoyable and can provide support for patients in a home-based rehabilitation program, thereby supporting treatment adherence [5]. In addition, the implementation and use of wearable sensors for telemonitoring of patients in a home-based rehabilitation setting has long been advocated [6, 7]. The addition of wearable technology to gaming provides an easy way of delivering direct feedback and improving supervision in exercise programs [5, 8].

Distal radius fractures are one of the most frequently occurring types of traumatic injury, making up 18% of fractures in patients presenting to the emergency department, and up to 25% of all fractures in the elderly patient group [9,10,11,12]. Distal radius fractures lead to considerable morbidity, resulting in temporary loss of productivity, thus causing a substantial impact on society [13,14,15]. Physiotherapy or self-supervised exercise programs are commonly recommended as a rehabilitation strategy after such injuries [16,17,18]. Previous research has shown that practical constraints such as time, costs and travel distance lead to low adherence to physiotherapy referrals and exercises [19, 20]. Treatment adherence to home-based, self-supervised exercise programs is equally low, mainly due to a lack of monitoring and feedback to patients [21]. Receiving feedback and encouragement is thought to motivate patients, leading to higher self-efficacy and thereby increasing treatment adherence [22].

To support patients in their rehabilitation process while simultaneously improving patient monitoring, the serious game ReValidate! has been developed. The game is controlled by two wearable motion sensors, placed proximally and distally of the wrist, so that they bridge the wrist joint. Each sensor contains an accelerometer, a gyroscope and a magnetometer, enabling three-dimensional registration of movements in the wrist joint. The paired sensors act as controllers for playing the serious game ReValidate!, which is installed as a mobile application on a smartphone or tablet. The smartphone or tablet itself acts as the gaming platform. The ReValidate! game incorporates telemonitoring of the range of motion (ROM) of the wrist joint and of treatment adherence. In order for the in-game monitoring of patients’ progress to be safe during a clinical rehabilitation program outside of a research setting, and not pose a threat to patient privacy and data safety, the game must comply with medical device regulations [23]. The ROM outcomes as measured by the game application also need to be valid and reliable. Previously, the game has been established to be a promising and valid therapeutic support tool for patients recovering from a distal radius fracture [24]. Other research has established that wearable motion sensors can be used as a reliable method for measuring ROM in patients [25]. The universal goniometer is used as the reference standard, as this is the most frequently and widely used method of measurement worldwide, next to visual estimation [26]. Newly developed smartphone apps are increasingly researched, and seem to be reliable [27, 28]. The objective of this study was to determine the reliability of active ROM measurements as measured by the wearable-controlled ReValidate! mobile game application, compared to the active ROM as measured by an experienced surgeon using a universal goniometer as the reference standard.

Patients and methods

Ethics and study design

The study was designed as an observational cohort study. The study was approved by the medical ethical review board of our institution (MEC-AMC W17_003 #17.015).

Subjects

A total of 34 patients with a restricted wrist ROM and 7 healthy volunteers were recruited for this study. Patient characteristics are shown in the baseline table (Table 1). The patients were asked to participate in the study by their treating physician when visiting the outpatient clinics of the department of plastic, reconstructive and hand surgery, or the department of trauma surgery of our hospital. All patients visited the outpatient clinics for regular follow-ups, varying from 6 weeks to 1 year after their original injury. All patients had a restricted range of motion to varying degrees. The seven healthy volunteers, with no history of wrist injury or movement restriction due to other disease or factors, were recruited from the hospital staff.

Table 1 Baseline table with patient characteristics

For the purpose of determining internal validity of the electronic goniometer, sample size calculations were based on the intraclass correlation coefficient. To detect an intraclass correlation coefficient (ICC) of 0.9 when the null-hypothesis is 0.6, using a two-way mixed effects model, with a power of 80% and a significance level of 5%, seven participants were needed. The sample size to test external validity was in agreement with previous studies, and deemed to be sufficient in determining a difference in means of ROM of 5° between ROM measured using the digital game wearable goniometer and the goniometer measurements performed by experienced surgeons, with a significance level of 0.05 and a power of 80% [26, 29, 30]. Though a 5° difference in ROM may not predict clinical relevance, increasing restrictions in range of motion are linked to poorer functional outcomes [31,32,33,34,35]. Variations up to 18° between goniometry measurements performed by different medical professionals have been described previously [34], yet a difference of 5° is also used in comparable studies [27, 34], and was therefore considered to be the maximum for reliable measurements using the ReValidate! game application.

Game setup

The ReValidate! serious game is played on a tablet computer (Apple™ iPad®, Apple, Cupertino, California, USA). The game is controlled by the wearable Valedo motion sensors (Valedo®, Hocoma, Switzerland). Alternatively, a version that can be played on a smartphone (Apple™ iPhone®, Apple, Cupertino, California, USA) and that is controlled using the Myo™ Armband (Thalmic Labs, Kitchener, Canada) has been developed to meet the needs of patients not possessing both types of device. As the Valedo® sensors were previously established to be reliable for range of motion measurements [36], and were more stable in their connection to the mobile application than the smartphone version of the game, only the tablet version of the game was tested for its reliability of range of motion measurements in this study.

Using two separate motion sensors, placed both proximally and distally of the wrist joint, the isolated wrist joint motions can be used as game control, and ROM can be measured. Patients were instructed both orally as well as using pictures within the game, how to attach the sensors to their arm and hand. One sensor is securely strapped to the dorsum of the hand at metacarpal level, while the other sensor is securely strapped to the lateral side of the forearm, just distally to the elbow (Fig. 1), so that the sensors cannot move during gameplay. Sensor placement was checked by the main researcher to ensure correct placement, and corrected if necessary, before continuing to the gameplay session.

Fig. 1 Setup of the Valedo® sensors around dorsum of the hand and lateral side of proximal forearm

The game is programmed to register both the maximum and average ROM values, as well as the complete motion ‘path’. By moving the wrist, the player controls an underwater avatar (Fig. 2). For each combination of opposing movements (flexion and extension; radial deviation and ulnar deviation; and pronation and supination) a different avatar is controlled. The game consists of 42 levels of increasing difficulty, one for each day of a 6-week rehabilitation program. Each level is completed by steering all three avatars through an underwater course successfully. Exercise duration and frequency, average ROM during gameplay, as well as maximum ROM are stored on the device itself, and when connected to the internet, are sent to and stored in a secured web-based database. This database is located on the highly secured hospital computer servers, which comply with the General Data Protection Regulation (GDPR), medical device regulation (MDR) and national and international safety standards including the ISO 27001 and NEN 7510 [37,38,39]. The data can be retrieved only by patients and healthcare providers using a personal login, and is available for telemonitoring of patients’ progress and recovery.

Fig. 2 Overview of exercises per movement arc. a: pronation/supination (angler fish); b: flexion/extension (shark); c: radial/ulnar deviation (penguin)

Internal (construct) validity measurements of the game

To establish internal validity of the goniometer embedded within the ReValidate! game, a test–retest was performed using healthy volunteers. These volunteers were instructed to play the game and to reach their own maximum ROM during gameplay. To ensure the test–retest would not be influenced by confounding factors such as strain or exhaustion resulting from playing the game itself, volunteers were asked to play one complete level of the game twice, with at least 30 min of rest in between. The intraclass correlation coefficient (ICC) between the ROM as measured in the first and second gameplay session was determined using a two-way mixed-effects model (single measures, absolute agreement).
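As an illustration of the test–retest statistic described above, the following is a minimal sketch of a single-measures, absolute-agreement ICC computed from two-way ANOVA mean squares. It is not the SPSS procedure used in the study; the function name `icc_single_absolute` and the example values are hypothetical. The point estimate follows the commonly cited mean-squares form (MSR − MSE) / (MSR + (k − 1)·MSE + k·(MSC − MSE)/n), with n subjects and k sessions.

```python
from statistics import mean


def icc_single_absolute(scores):
    """Single-measures, absolute-agreement ICC from a subjects x sessions table.

    scores: list of rows, one row per subject, one value per session.
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of sessions per subject
    grand = mean(v for row in scores for v in row)

    # Sums of squares for subjects (rows), sessions (columns) and residual error
    ss_rows = k * sum((mean(row) - grand) ** 2 for row in scores)
    ss_cols = n * sum((mean(col) - grand) ** 2 for col in zip(*scores))
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical flexion-extension arcs (degrees) for seven volunteers,
# first and second gameplay session. These are NOT study data.
example = [
    [118, 121], [132, 128], [105, 110], [140, 137],
    [126, 124], [111, 117], [135, 133],
]
print(round(icc_single_absolute(example), 3))
```

Because absolute agreement is requested, a constant shift between the two sessions lowers the coefficient even when the ranking of volunteers is unchanged.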

Inter-rater reliability of experts using a universal goniometer (reference standard) for active ROM measurements

Directly after the two gameplay sessions, four experienced medical professionals were asked to measure the active ROM (flexion–extension, pronation-supination and radial deviation-ulnar deviation arcs) in the healthy volunteers, using a universal (analogue, short-arm) goniometer, as the reference standard. Experts had at least 5 years of experience and used the universal goniometer in their daily practice. To establish if measurements reported by the professionals using a universal goniometer can be considered reliable, the inter-rater reliability between the measurements of the participating professionals was tested. The experts were blinded to previously measured ROM outcomes, both as measured by the game and as measured by the other experts. Participants were instructed not to disclose any previous measurement outcomes. The inter-rater reliability was analyzed using a two-way random-effects model (average measures, absolute agreement).
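The inter-rater analysis above can be sketched in the same way. The example below is an assumption-laden illustration, not the SPSS routine used in the study: the helper names `mean_squares` and `icc_average_absolute` and the ratings are hypothetical. It uses the commonly cited average-measures, absolute-agreement form (MSR − MSE) / (MSR + (MSC − MSE)/n), where n is the number of subjects.

```python
from statistics import mean


def mean_squares(ratings):
    """Return (MSR, MSC, MSE) for a subjects x raters table."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    ss_subjects = k * sum((mean(row) - grand) ** 2 for row in ratings)
    ss_raters = n * sum((mean(col) - grand) ** 2 for col in zip(*ratings))
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_subjects - ss_raters
    return (
        ss_subjects / (n - 1),
        ss_raters / (k - 1),
        ss_error / ((n - 1) * (k - 1)),
    )


def icc_average_absolute(ratings):
    """Average-measures, absolute-agreement ICC across all raters."""
    n = len(ratings)
    msr, msc, mse = mean_squares(ratings)
    return (msr - mse) / (msr + (msc - mse) / n)


# Hypothetical radial-ulnar deviation arcs (degrees): seven volunteers,
# each measured by four experts. These are NOT study data.
example = [
    [52, 60, 48, 55], [45, 58, 50, 47], [61, 55, 49, 63], [50, 49, 57, 44],
    [58, 66, 51, 60], [47, 52, 59, 50], [55, 47, 62, 53],
]
print(round(icc_average_absolute(example), 3))
```

When raters disagree by more than the volunteers differ from one another, the estimate approaches zero and can become negative.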

External (concurrent) validity

External validity was established by comparing ROM outcomes of the ReValidate! game in patients with various levels of restriction in wrist ROM with the ROM outcomes measured by highly experienced medical professionals using a universal handheld goniometer. Experts were two plastic surgeons and two trauma surgeons, all of whom were specialized in treating wrist injuries. All experts performed active ROM measurements in their daily practice. First, the maximum active ROM in the affected wrist joint of patients was measured by one of the experts. Subsequently, patients played a single level of the ReValidate! game using the same hand and maximum ROM outcomes were extracted from the game database.

Concurrent validity between the game ROM outcomes and the ROM outcomes as measured by the experts was determined using a one-sample t-test to compare mean differences. Values were represented as means and standard deviations; p-values of < 0.05 were considered statistically significant. A difference in means of 5° was considered acceptable for reliable measurements. Scatter plots and Bland–Altman plots were constructed to evaluate agreement and to assess for any systematic bias.
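The quantities behind a Bland–Altman plot can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `bland_altman` and the paired values are hypothetical, and differences are taken as game minus expert.

```python
from statistics import mean, stdev


def bland_altman(game, expert):
    """Return the bias and 95% limits of agreement for paired measurements."""
    diffs = [g - e for g, e in zip(game, expert)]   # game minus expert
    pair_means = [(g + e) / 2 for g, e in zip(game, expert)]  # x-axis of the plot
    bias = mean(diffs)
    sd = stdev(diffs)                               # sample standard deviation
    return {
        "pair_means": pair_means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
    }


# Hypothetical paired arcs in degrees. These are NOT study data.
result = bland_altman(game=[100, 110, 120], expert=[102, 108, 126])
print(result["bias"], result["lower"], result["upper"])
```

The bias is the mean of the paired differences, and the limits of agreement span the range in which roughly 95% of individual differences are expected to fall if the differences are approximately normally distributed.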

Statistical analysis

All analyses were performed using the statistical package for the social sciences (SPSS) version 26 (IBM, New York, USA). ICC outcomes of 0.5 or lower were considered poor, 0.5–0.75 were considered moderate, 0.75–0.9 were considered good, and outcomes larger than 0.9 were considered excellent agreement.
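The interpretation bands above can be written as a small lookup. This is a sketch only: the function name `classify_icc` is hypothetical, and because the text does not say to which band the boundary values 0.75 and 0.9 belong, the code assumes each upper bound is inclusive.

```python
def classify_icc(icc):
    """Map an ICC estimate to the agreement categories used in this study.

    Assumption: upper bounds are inclusive (0.5 -> poor, 0.75 -> moderate,
    0.9 -> good); the text only fixes the 0.5 and 'larger than 0.9' cases.
    """
    if icc <= 0.5:
        return "poor"
    if icc <= 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


for value in (0.02, 0.52, 0.863, 0.95):
    print(value, classify_icc(value))
```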

Results

Internal validity of the in-game range of motion measurements

The test–retest reliability of the game, reflecting the internal validity of measurement outcomes established by the game was determined by calculating the intraclass correlation coefficient (ICC) using a two-way mixed-effects model (single measures, absolute agreement). Even though patients are instructed to keep their wrist straight when calibrating the game and sensors, the neutral position can vary, since the game lacks visual analysis of movements. Therefore, the motion arcs are represented. Arcs are calculated by adding up the two maximum measurements of motions in opposite directions (flexion + extension; radial deviation + ulnar deviation; pronation + supination). Outcomes are represented in mean ROM and standard deviation, mean difference and ICC and 95% confidence intervals (Table 2).
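The arc calculation described above amounts to summing the two opposing maxima, which makes the result independent of where the neutral position was set during calibration. A minimal sketch, with hypothetical key names and example values:

```python
# Pairs of opposing movements that together form one motion arc
ARC_PAIRS = {
    "flexion-extension": ("flexion", "extension"),
    "radial-ulnar deviation": ("radial_deviation", "ulnar_deviation"),
    "pronation-supination": ("pronation", "supination"),
}


def motion_arcs(maxima):
    """Sum the maximum angles (degrees) of opposing movements into arcs."""
    return {arc: maxima[a] + maxima[b] for arc, (a, b) in ARC_PAIRS.items()}


# Hypothetical maxima in degrees for one gameplay session. NOT study data.
session = {
    "flexion": 60, "extension": 55,
    "radial_deviation": 20, "ulnar_deviation": 30,
    "pronation": 70, "supination": 75,
}
print(motion_arcs(session))
```

If the calibrated neutral position is offset by a few degrees towards flexion, the measured flexion decreases and the measured extension increases by the same amount, so the arc is unchanged.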

Table 2 Test–retest of mean ROM in degrees as measured within the game

Though only the mean difference for the palmar-dorsal flexion is within the predefined 5° range, the mean differences are small, ranging from 1° to 8°. There were no significant differences between the test and retest. Intraclass correlations range from poor (0.376) for the pronation-supination arc, to moderate-to-good (0.693–0.863) for the palmar-dorsal flexion and radial-ulnar deviation arcs.

Inter-rater reliability between medical professionals

To evaluate the reliability of ROM measurements by experienced professionals using a universal goniometer, the inter-rater reliability was determined. Measurements were registered in degrees. Means and standard deviations between the medical professionals showed large variations, especially in the palmar-dorsal flexion and pronation-supination arcs (ICC 0.16 and 0.02, respectively). Though the measurements showed more agreement in the radial-ulnar deviation arc, there was still a considerable variation and the ICC was only moderate (ICC 0.52). Outcomes of the inter-rater reliability (two-way random-effects model (average measures, absolute agreement)) are represented in Table 3.

Table 3 Mean ROM in degrees as measured by the four experts, and inter-rater reliability between the different measurements

External validity and reliability

Mean differences were calculated by determining the difference between the ROM measured by the game and measured by the expert (game outcomes – expert outcomes). Mean differences were tested using a one-sample t-test to determine their difference from zero, with zero being the ideal outcome (meaning a perfect agreement between the doctors’ and the game measurements). Outcomes are represented as mean difference, 95% confidence interval of the mean difference and the significance level (Table 4).
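The test statistic for this comparison can be sketched as below. The function name `one_sample_t` and the example differences are hypothetical; the sketch returns only the t statistic and degrees of freedom, since the p-value and confidence interval require the t distribution, which is left to statistical software.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(diffs, mu0=0.0):
    """t statistic and degrees of freedom for H0: mean(diffs) == mu0."""
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)   # sample SD divided by sqrt(n)
    t = (mean(diffs) - mu0) / standard_error
    return t, n - 1


# Hypothetical game-minus-expert differences in degrees. NOT study data.
t_stat, dof = one_sample_t([-2, 2, -6])
print(round(t_stat, 3), dof)
```

A t statistic close to zero indicates that the mean difference is small relative to its standard error, which is the outcome that supports agreement between the two measurement methods.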

Table 4 Mean differences between game measurements and measurements by their treating physician using a universal goniometer

Scatter plots show the agreement between the two measurement techniques (Fig. 3). Though the scatter plots do not show perfect agreement (line of agreement, bold line), they show comparability between the game and universal goniometer measurements. Bland–Altman plots provide an insight into measurement distribution and possible systematic bias (Fig. 4). The palmar-dorsal flexion arc plot shows a mean difference of –1.5° (Fig. 4a; bold line), meaning that the game measures, on average, a value 1.5° lower than the experts. The radial-ulnar deviation plot shows a mean difference of –3.6° (Fig. 4b; bold line). The lower and upper 95% limits of agreement are –71.47° and 72.33° for palmar-dorsal flexion, and –46.13° and 38.67° for radial-ulnar deviation, respectively (mean ± 1.96 SD; dashed lines in Fig. 4a, b). As these limits of agreement contain zero, the plots give no indication of a consistent difference between the values measured using the game and the universal goniometer. The limits of agreement are wider than the predetermined acceptable difference of 5°, however, which can be explained by the large variation in both the game measurements and the expert measurements. The plots show that both methods have a large variation in outcomes, yet there is no systematic bias in either method.

Fig. 3 a, b Scatter plots of game versus Universal goniometer measurements. a Palmar-dorsal flexion arc. b Radial-ulnar deviation arc (ROM range of motion)

Fig. 4 a, b Bland–Altman plots of the differences against the means, for the palmar-dorsal flexion arc (a), and the radial-ulnar deviation arc (b). The bold lines indicate the mean difference, dashed lines denote upper and lower 95% limits of agreement (ROM range of motion, G game, UG universal goniometer)

Discussion

The ReValidate! game for wrist rehabilitation has previously been shown to be valid in home-based patient care for training active range of motion in patients who have suffered a distal radius fracture, and has the potential to improve rehabilitation outcomes [24]. To safely use the game as a telemonitoring tool, its ROM measurement outcomes must be reliable when compared to the most widely used method of ROM measurement in clinical practice, namely the universal goniometer [27].

This study shows a good test–retest reliability (or construct validity) for the radial-ulnar deviation and palmar-dorsal flexion arcs, as measured during repeated playing sessions of the game. Pronation and supination test–retest outcomes are less reliable and show considerable variation between the gameplay sessions. External validity testing, comparing game outcomes to measurements by experts, shows a good comparability for radial-ulnar deviation and palmar-dorsal flexion arcs. Though the limits of agreement are wider than the ideal level of 5°, the variation can also be attributed to the variation in measurements between medical professionals. The motion sensors (Valedo®, Hocoma, Switzerland) used for controlling the ReValidate! game and evaluated in this study, are therefore sufficiently reliable to be used in patient home-based monitoring, compared to ROM measurements as performed by an experienced medical professional.

Regarding monitoring in a home-based rehabilitation setting, the ReValidate! mobile game application can be seen as a valid monitoring tool for dorsal-palmar flexion and radial-ulnar deviation, as the mean differences between the game and the medical professionals are within the 4°–6° range. A restriction in ROM in the wrist has been shown in previous studies to be associated with lower functional scores [33, 35]. Remote monitoring of ROM could therefore provide medical professionals with important information on the effectiveness of a rehabilitation program, or warning signs that a patient is at risk for persistent functional limitations [28]. A successful remote monitoring system can thereby help control the increasing demand on healthcare due to the aging population.

It is striking that the participating medical experts, all with at least 5 years of experience and experienced in using a universal goniometer in daily practice, show significant variation in measuring active ROM of the wrist (ICC varying between 0.02 and 0.52). These measurements are the most frequently used type of measurement for wrist range of motion, next to visual estimation, and are therefore considered the ‘gold standard’ by some. As they are used as a reference for the results of the game, these variations inevitably influence the outcome of the study. Previous studies show that the inter-rater reliability is dependent on the measurement technique used and the experience and training of the medical professional [29, 40]. Caution is recommended when interpreting measurements made by different professionals, as variation may be larger, and differences of 6°–10° in measurement are described as acceptable [29, 41].

With previous research mainly focused on flexion and extension measurements as a tool for monitoring rehabilitation progress [42], the question remains whether small variations in other ROM measurements should be considered for patient monitoring in a rehabilitation program. The large differences in pronation and supination measurements by the game application, as found in this study, are most likely due to the placement of the motion sensors around the hand and forearm. As the sensor is strapped around the forearm, it will move with the skin as the skin moves relative to the bones. While surgeons use static measurement landmarks (for example, the elbow joint or the upper arm), the game uses a sensor which is not static during pronation and supination movements, thereby leading to smaller measured values for pronation and supination movements. The ROM outcomes as measured by the game application were found to be consistently around 65° smaller than those measured by the surgeons. Therefore, these outcome values could still be used for the purpose of patient follow-up when the game is used as a stand-alone treatment and measurement method in home-based rehabilitation.

While it is a limitation that the game is currently only available for Apple™ (Cupertino, California, USA) devices, this choice was made as Apple™ HealthKit™ (Apple, Cupertino, California, USA) can directly send data to the Epic® electronic health records (Epic Systems Corporation, Verona, Wisconsin, USA). This will allow physicians immediate access to game data, including progress reports and treatment adherence details. Though the smartphone version of the game has not been tested for reliability of goniometric measurements, this version still allows for home-based monitoring of treatment adherence. The game has been developed specifically for compatible smartphones and tablets, and follows the ‘bring your own device’ principle. This principle means that the more types of mobile devices the game is compatible with, the more patients are able to practice their rehabilitation exercises using the game. The game will therefore be expanded to be compatible with other types of devices in future development.

It can also be considered a limitation of this study that the ROM measurements by experts contain too much ‘human error’. Though these measurements are commonly used to monitor patients’ rehabilitation progress in daily medical practice, they seem too inconsistent to be considered the gold standard for ROM measurements.

Conclusions

In this study, experienced healthcare professionals’ measurements show a poor inter-rater reliability using a universal goniometer in measuring active ROM of the wrist. In contrast, the wearable-controlled mobile game ReValidate!, which incorporates a digital goniometer, shows a higher reliability and validity in measuring active ROM of the wrist joint. This study shows that medical professionals, including surgeons and hand therapists, can rely on a commercially available off-the-shelf tool to reliably monitor the progress of their patients participating in a home-based rehabilitation program, without requiring hospital visits for monitoring. Especially during a worldwide pandemic, with restricted access to healthcare and where social distancing rules apply, this is an interesting development.