1 Introduction

Fall-related injuries have become a significant contributor to health-care costs (Maki et al. 2003). Balance training can help to reduce the chance of fall-related injury (Emery et al. 2005). Currently, balance training is available as part of various exercise classes and in fitness and sports training. However, these options are not appealing to all populations (Findorff et al. 2009). With the recent introduction of active gaming and home training using video game–based platforms, the options for balance training have extended into Virtual Reality (VR).

Virtual Reality can be defined as an artificial environment generated through software and presented on a computer-generated display, where the observer’s actions produce actions in the artificial environment, leading to a sense of being in the environment. There are many off-the-shelf VR platforms that fit this definition, including video gaming systems like the Sony PlayStation® and Nintendo® Wii™. On these platforms, some games and applications require different types and amounts of user movement as input to the game. The display is shown on a standard television monitor.

The high-fidelity graphics, the combination of visual scene motion with self-motion in controlling the game, and the strong sense of immersion that these Virtual Environments (VE) provide can produce simulator sickness symptoms (Sparto et al. 2004; Kennedy et al. 1993). Simulator sickness is similar to motion sickness and can manifest itself through a wide variety of symptoms.

It is believed that active head movement training could be an important element in improving balance control. Cohen et al. (1995) showed that purposeful head movement exercises improved balance in vestibular patients. This improvement could be due to individual vestibular or visual field training effects or a recalibration of one sensory input with respect to another (McConville et al. 1994). Gottshall et al. (2006) demonstrated in healthy subjects that active, volitional head movements, as part of a training program to adapt to altered visual inputs, improve scores on the dynamic gait index. This set of tests measures balance while walking with and without volitional head movement with respect to the trunk. Eye-head coordination is improved with active head movements in healthy subjects (Hoshowsky et al. 1994). Active head movement training together with meaningful visual stimuli could lead to balance improvement (Foster 1994).

Arm movements have been shown to be very important in balance maintenance and recovery (Maki and McIlroy 2006). Both unilateral and bilateral arm movements generate anticipatory and reactive responses in postural leg muscles (Mochizuki et al. 2004) and hip and trunk muscles (Yamazaki et al. 2005). Furthermore, reaching and grasping actions are important in balance recovery and prevention of falls (Maki and McIlroy 2006).

The purpose of this study was to characterize balance-related motor learning effects through game performance and to evaluate the effectiveness and specificity of a video game for balance training in healthy subjects. The game scores were expected to improve over the course of the training, and it was hypothesized that this improvement could be characterized mathematically. Furthermore, it was hypothesized that balance tests would show improvement after the video game–based training. The video game selection was based on evidence which showed the importance of visual inputs with active head and arm movements.

2 Methods

The experiment consisted of game play training using the Sony PlayStation® 2 and EyeToy® combination with the Harmonix AntiGrav™ game. Game scores were recorded along with simulator sickness symptoms. Balance tests were performed before the training commenced in session 1 and after the completion of all training in session 9. The results were compared with the performance on the same tests of control subjects, who did not receive the game-based training. In other words, the experimental group, like the control group, was tested at the beginning and at the end of the 3-week period, with one exception: the experimental group was also tested before and after session 4.

2.1 Subjects

Seventeen subjects were recruited for the experiment: nine male, eight female; ages 22–34 years, height 150–180 cm, weight 50–90 kg. The subjects were randomly divided into two groups: a training group and a control group. Seven training subjects and seven control subjects completed the experiment. All subjects reported no medical conditions affecting balance and no history of falls. Based on subject responses, all but one were right-handed, and all had normal or corrected vision. Each subject provided informed, written consent to comply with ethics approval granted by the institutional review board (Reference 2006-107-1).

2.2 Apparatus

Sony’s EyeToy® used video capture technology, imaging the user with a small video camera mounted above the display and applying face-recognition and motion-tracking technologies (no body markers were required) (Weiss et al. 2004; Wang et al. 2006). The system used two-dimensional user motion in the frontal plane as inputs. Each user was individually calibrated before each session to account for height differences and range of head motion. During this procedure, the subject’s own image was shown on the screen overlaid with a calibration figure, instructions, and controls. The subject had to move to a position such that his/her face was centered on the figure and was then instructed to wave his/her hand over the “lock” control. In the subsequent screen, the subject was required to bend to the left and right such that the head was aligned with target positions. If this procedure was not properly executed, it became apparent in the early stages of game play that the game was not responsive (the avatar movements did not respond to user movements), and the calibration was repeated. A diagram of the system and screenshots are shown in Fig. 1.

Fig. 1

a Conceptual schema of the PlayStation® 2, EyeToy®, and hoverboard game (not an actual screenshot) with the user. b Screenshot of the first step in calibration. The actual image of the user is shown superimposed (off to the side in this picture to highlight the two images) on an outline of a human frame with an opening at the face. c Screenshot of the game in progress. The avatar seen in the middle of the screen is riding on a “rail,” which constrains the path while presenting targets for arm movements. While elsewhere head movements steer the avatar, here head movements are required to lean toward targets (if presented on one side) to be struck with rapid arm movements and to duck or jump over obstacles. The lower right corner has an indicator of the user’s head and arm positions in real time. These are also reflected in the avatar actions. In the center, the accumulated points are shown along with the target number of points to beat. The time remaining is shown in the top left corner. The bar on the left side indicates the current position along the racetrack. d Screenshot of the game in progress. Here the user has successfully struck a pair of targets with both hands, and the user has been given a credit of extra time to complete the race

The user was portrayed in the game as an avatar and the user’s image was not shown after the calibration procedure. The avatar raced on a hoverboard along roadways and rails, over jumps, and when airborne, through rings. The user, standing in front of the display, used head movements to guide the avatar and was required to jump or duck to avoid obstacles. The user also had to reach with either arm or both to strike targets as the avatar passed them. The arm motions were directed diagonally up or down or to the side, requiring additional head/upper body motions to contact the targets. The game activity occurred at a rapid pace, in a very compelling immersive environment, with sound effects and “life or death” consequences. It provided challenging and unpredictable situations requiring maintenance of balance and cognitive skills (such as path selection and memory) at the same time. As the user gained experience in the game, anticipation of upcoming movements became easier.

2.3 Experimental protocol

The control (no training) and experimental (video game training) groups completed the balance tests (as described below) at the beginning and at the end of a 3-week period. Both the control group and the experimental group were free to conduct their daily activities as normal, and thus, the control group controlled for the effect of the specific experimental video game training. The experimental group trained 9 times during the 3-week period, and the control group received no training.

A tutorial was provided at the beginning of the first training session; it asked the user to make basic movements like bending the upper body left and right to steer the avatar. The tutorial proceeded to the next step when each basic skill had been successfully performed. Next steps included using the arms to strike at targets, crouching and jumping to avoid obstacles, and moving the arms to perform tricks.

The game had a fixed duration of approximately 7 min. Based on previous studies with video game–based training (Flynn et al. 2007; Deutsch et al. 2008), 1-h training sessions were scheduled. Four games were played in each session with a 10-min break after 2 games, resulting in 2 training blocks of approximately 15 min each per session. Since we wanted to see the learning progress at more than one level of difficulty, the subject was moved to the next level of difficulty for the second block of training if performance in the first block was above a threshold score of 30,000. This threshold was provided by the game as the “score to beat” (see Fig. 1c, d); by observation, it reflected a moderate level of competence. If the threshold was not achieved, the subject was kept at the same level for the second 15-min block. The first training block each day was always at the lower level of difficulty.
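The block-progression rule described above can be sketched as a small function. This is only an illustration, not the software used in the study: the function name `second_block_level`, the choice of the mean of the two first-block game scores as the measure of "performance", and the cap at Level 2 are assumptions made here.

```python
THRESHOLD = 30_000  # "score to beat" displayed by the game


def second_block_level(first_block_scores, first_level=1, max_level=2):
    """Return the difficulty level for the second training block.

    Assumption (not stated in the text): block performance is the mean
    of the scores of the two games played in the first block.
    """
    performance = sum(first_block_scores) / len(first_block_scores)
    if performance > THRESHOLD:  # strictly above the threshold
        return min(first_level + 1, max_level)
    return first_level  # threshold not reached: stay at the same level


# Hypothetical scores, for illustration only
print(second_block_level([32_000, 35_000]))  # moves up one level
print(second_block_level([25_000, 29_000]))  # stays at the first level
```

Keeping the threshold as a named constant makes it easy to explore how a different "score to beat" would change how quickly subjects advance.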

2.3.1 Outcome measures

Game performance was measured by the number of tokens or targets struck with arm movements and the speed and accuracy in navigating the avatar through the race course using head movements. Additional points could be achieved by doing tricks, executed by arm movements while airborne. The scores were provided after each game, and points were displayed when awarded, which reinforced the immediate feedback provided during game play.

Three balance tests [Balance Board (BB), Tandem Romberg (TR), and One-Leg Standing (OL)] were selected to evaluate transfer of the video game training to the real world. Because these tests were conducted without the virtual environment, they tested for changes in balance outside of the game. Many balance-related measures were found in the literature, ranging from the Dynamic Gait Index to the Sensory Organization Test to various functional measures and clinical tests. Our tests were selected based on common usage and relevance to overall balance (Quesada et al. 2007; Vereeck et al. 2008; Curb et al. 2006) rather than factors such as sensory contribution or functional outcomes such as locomotion.

The average scores from two trials for each test were used. To minimize a “ceiling effect” (which occurs if most subjects pass a balance test given for a specified time, e.g., 30 s), the time was recorded when balance was lost. This gave a finer measure to detect and quantify improvement.

The first test (BB) used a Balance Board, also known as a wobble board (which was supported on a half sphere and “wobbled” when a person stood on it) (Quesada et al. 2007). In this test, the subject was required to maintain balance on the Balance Board, with feet apart, eyes open, and arms at the sides. The second test was the “tandem Romberg” (TR) test, also known as the “sharpened Romberg test” (Vereeck et al. 2008). In the TR test, the subject was instructed to stand heel to toe, head straight ahead, arms folded against the chest. The eyes were closed to make the task more challenging for young subjects and to reduce any ceiling effect. In the third test, the One-Leg test (OL), the subject was required to maintain his/her balance on one leg (right) while keeping his/her eyes closed and arms straight against the sides of the body. This test was recommended by Curb et al.: “a new recommended battery includes unassisted single-leg stand”, showing a test–retest reliability correlation coefficient of 0.69 with a 30 s ceiling in healthy adults (Curb et al. 2006).

A stopwatch was used to record the time a subject maintained balance under the three test conditions. The results were analyzed using paired, two-tailed t tests for the means and a single factor ANOVA.
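As a minimal sketch of the paired, two-tailed t test named above, the following computes the t statistic from pre- and post-test balancing times and compares it with the standard two-tailed critical value for 6 degrees of freedom (7 paired subjects) at the 0.05 level. The function name `paired_t` and all of the times in the example are invented placeholders, not measurements from this study.

```python
import math
import statistics

# Two-tailed critical t value for alpha = 0.05 with 6 degrees of freedom
T_CRIT_DF6 = 2.447


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


# Placeholder balancing times in seconds for 7 subjects (NOT study data)
pre = [50, 62, 71, 40, 88, 65, 93]
post = [120, 140, 150, 95, 170, 130, 180]

t_stat, df = paired_t(pre, post)
print(f"t = {t_stat:.2f}, df = {df}, significant: {abs(t_stat) > T_CRIT_DF6}")
```

The test operates on within-subject differences, which is why it suits a pre/post design in which each subject serves as his/her own baseline.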

A Simulator Sickness Questionnaire (SSQ) was completed during the break and after the training protocol on each training day. The SSQ was developed for simulators with visual displays by Kennedy et al. (1993) from 1,100 motion sickness questionnaires completed in the US Navy. It contained 16 items on which the subjects rated their responses [ranging on a 4-point scale; 0 (none), 1 (slight), 2 (moderate), 3 (severe)] (Kennedy et al. 1993). This was used to ensure the safety and comfort of the participants.

3 Results

3.1 Training game performance

The performance in the game was based on the game scores in both levels of difficulty. All subjects except one achieved the target score for Level 1 in the first session and were moved to training at both Levels 1 and 2 by the second session. The remaining subject achieved the target score in the second session and began both levels of training in the third session. Overall, there was a gradual increase in the total scores for both levels over the period of the experiment as seen in Fig. 2.

Fig. 2

Game scores over the 9 training sessions. Each symbol reflects one subject’s scores. a Game scores for Level 1 training. The scores for the two games from each session were averaged. b Game scores for Level 2 training. The scores for the two games from each session were averaged. Level 2 scores begin from session 2, since only Level 1 was played during the first training session

3.1.1 Model of motor learning through game scores

Game scores improved consistently over the training sessions, which could be attributed to increasing familiarity with the virtual environment, anticipation of required movements, learned cognitive decisions and strategies, as well as improved sensorimotor coordination and balance. The improvement in game scores was consistent with the common understanding of improved performance with practice and with studies of motor learning (Newell et al. 2001). The game scores were averaged across subjects and fitted to the exponentials given in Eqs. (1) and (2) for Level 1 and Level 2 games, respectively (McConville and Virk 2007):

$$ y(n) = 57{,}500 - 41{,}000\,\mathrm{e}^{-0.26n} \tag{1} $$
$$ y(n) = 57{,}500 - 43{,}500\,\mathrm{e}^{-0.087n} \tag{2} $$

where n is the session number and y is the curve fit value. All three parameters (the constant term, the amplitude, and the exponent) were initially left free to obtain the curve fits. It was found that constraining the constant (DC) term to be the same for both curves did not affect the quality of the fit; this term represented the average performance plateau after a large number of sessions. These curves are shown together with the averaged data in Fig. 3. The curve fits reflected the increase and plateau of motor learning as a function of the session number for the first level (Level 1), and suggested that the subjects had not yet mastered Level 2 because of the absence of the plateau.
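The fitted curves in Eqs. (1) and (2) can be evaluated directly to see how close each level comes to the shared plateau by the last session. The sketch below uses only the coefficients given in the equations; the helper names (`fitted_score`, `level1`, `level2`, `fraction_of_gap_closed`) are introduced here for illustration.

```python
import math

PLATEAU = 57_500  # shared constant term in Eqs. (1) and (2)


def fitted_score(n, amplitude, rate, plateau=PLATEAU):
    """Evaluate y(n) = plateau - amplitude * exp(-rate * n)."""
    return plateau - amplitude * math.exp(-rate * n)


def level1(n):
    """Eq. (1): Level 1 curve fit."""
    return fitted_score(n, 41_000, 0.26)


def level2(n):
    """Eq. (2): Level 2 curve fit."""
    return fitted_score(n, 43_500, 0.087)


def fraction_of_gap_closed(n, rate):
    """Fraction of the n = 0 gap to the plateau closed by session n."""
    return 1 - math.exp(-rate * n)


for n in range(1, 10):
    print(n, round(level1(n)), round(level2(n)))

print(f"Level 1 gap closed by session 9: {fraction_of_gap_closed(9, 0.26):.0%}")
print(f"Level 2 gap closed by session 9: {fraction_of_gap_closed(9, 0.087):.0%}")
```

By session 9, the Level 1 fit has closed about 90 % of its initial gap to the plateau, whereas the Level 2 fit has closed only about 54 %, which matches the observation that Level 2 had not yet been mastered.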

Fig. 3

Averaged scores over all seven training subjects for Level 1 and Level 2 training along with corresponding exponential curve fits (see text)

3.2 Balance tests

Performance on balance tests improved over the course of the training period for the training subjects as compared to controls. During pilot trials, it was found that some volunteers were balancing for longer than 3 min on the Balance Board and 2 min on the Tandem Romberg tests. The BB and TR tests were therefore discontinued after 3 and 2 min, respectively, because of the potential for muscle fatigue, which would interfere with the measure of balancing ability (Gandevia 2001). Of the 7 subjects in each group (training and control), three control subjects reached the maximum time in the BB pre-test and one control subject in the TR pre-test. In order to ensure that the control and training subjects were drawn from the same population, they were compared with a one-tailed, 2-sample t test. No significant differences were found for any of the 3 pre-tests between the control and experimental groups: Balance Board (p = 0.20), Tandem Romberg (p = 0.19), and One-Leg standing (p = 0.12).

Figure 4 shows the balancing times in seconds averaged for training and control subjects for each test. The differences between pre- and post-training measures were evaluated with paired two-tailed t tests for the mean. For the Balance Board test, there was no significant difference between the pre-test and post-test among control subjects (p = 0.63), while trained subjects showed a significant difference (p = 0.0012). For the Tandem Romberg test, control subjects showed no significant difference (p = 0.32), while trained subjects showed a significant difference at p < 0.05 (p = 0.012). No significant difference was seen in the control or trained subjects for the One-Leg standing test (p = 0.53 and p = 0.20, respectively). Analysis of the difference between the control and trained subjects using ANOVA showed similar results. The BB results were significantly different (p = 0.00017), the TR results were also significantly different but to a lesser degree (p = 0.0016), and the OL results were not significantly different (p = 0.17).

Fig. 4

a Time that the subjects were able to maintain balance on a Balance Board (BB) pre- and post-training. The performance for the training subjects (n = 7), mean = 67 and 142 s, is compared to the performance of the control subjects (n = 7) who did not receive training during the 3-week period; mean = 97 and 90 s. The trained subjects had a significant difference in balancing times (p < 0.005). b Time that the subjects were able to maintain balance in the Tandem Romberg (TR) test pre- and post-training. The performance for the training subjects (n = 7), mean = 41 and 102 s, is compared to the performance of the control subjects (n = 7) who did not receive training during the 3-week period; mean = 59 and 64 s. The trained subjects had a significant difference in balancing times (p < 0.05). c Time that the subjects were able to maintain balance in the One-Leg (OL) standing test pre- and post-training. The performance for the training subjects (n = 7), mean = 15 and 24 s, is compared to the performance of the control subjects (n = 7) who did not receive training during the 3-week period; mean = 25 and 19 s. The differences in balancing times were not significant for either group

3.3 Simulator Sickness Questionnaire (SSQ)

The results of the SSQ revealed that 4 of the 7 subjects (57 %) experienced slight sweating during the trials. This was expected and was attributed to the game requiring active head and body movements to hit targets and to navigate. Three of the 7 subjects (43 %) experienced simulator sickness symptoms other than sweating (fullness sensation in the head, headache, eyestrain, dizziness, and general discomfort) in the first few training sessions. None of the simulator sickness symptoms were severe. On subsequent trials, most of the symptoms subsided. Figure 5 shows the progression of symptoms on the SSQ over the training sessions. Based on the downward trend in symptoms observed in Fig. 5, linear regression was performed to test for non-zero slope. The ANOVA performed on this regression verified a non-zero slope with a significance of p < 0.01 (p = 0.0078).
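The regression test for a non-zero slope can be sketched as follows. The example fits a least-squares line to one symptom total per session and forms the regression F statistic (regression mean square over residual mean square), which is then compared with the standard critical value of F(1, 7) at the 0.01 level. The use of one total per session for 9 sessions, the function name `slope_f_test`, and the symptom counts themselves are assumptions made for illustration; the counts are invented and are not the study data.

```python
# Critical value of F(1, 7) at alpha = 0.01 (equal to 3.499 squared)
F_CRIT_1_7 = 12.25


def slope_f_test(x, y):
    """Return (slope, F statistic) for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_regression = slope * sxy  # variation explained by the line
    ss_residual = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)
    )
    f_stat = ss_regression / (ss_residual / (n - 2))
    return slope, f_stat


sessions = list(range(1, 10))
# Placeholder symptom totals per session (NOT study data)
symptom_totals = [9, 8, 8, 6, 5, 5, 3, 3, 2]

slope, f_stat = slope_f_test(sessions, symptom_totals)
print(f"slope = {slope:.2f} per session, F = {f_stat:.1f}")
print("non-zero slope at p < 0.01:", f_stat > F_CRIT_1_7)
```

For a regression with a single predictor, this F test is equivalent to a two-tailed t test on the slope, since F equals the square of the slope's t statistic.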

Fig. 5

Simulator sickness symptoms across all subjects from the 16-item Simulator Sickness Questionnaire over the course of the training sessions

4 Discussion

4.1 Proposed motor learning parameter to compare game difficulty for video game design

The improvement in subject scores in the video game play was modeled as a negative exponential increase with an almost linearly increasing portion followed by a plateau. Likewise, the results from Kim et al. (1999) showed an approximately linear increase over 4 training sessions, which was likely due to an insufficient number of sessions to fully learn the task. The data showed that when mastery was achieved, it was important to increase the challenge level to Level 2 to ensure continued learning.

We could thus begin to quantify the difficulty levels of the game using the above analysis. Because the time constants (the reciprocals of the exponents in Eqs. (1) and (2)) are proportional to the half times to reach the plateau, their ratio could be used to express the relative difficulty of the game levels. The ratio of the exponents is 0.26/0.087 = 3.0, so the Level 2 time constant is about 3 times that of Level 1. By this measure, Level 2 would be 3 times as difficult as Level 1. This analysis could provide a parameter for the design of video games for training and rehabilitation, to ensure a suitable increment of difficulty and continued progress in motor learning.
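As a worked check of this argument, write each fit in Eqs. (1) and (2) as y(n) = A − B e^{−kn}. The symbols A, B, k, τ, and n_{1/2} are notation introduced here for illustration. The time constant is τ = 1/k, and the half time, the number of sessions needed to close half of the remaining gap to the plateau, is n_{1/2} = (ln 2)/k, which is proportional to τ:

$$ \tau_{1} = \frac{1}{0.26} \approx 3.8 \ \text{sessions}, \qquad \tau_{2} = \frac{1}{0.087} \approx 11.5 \ \text{sessions} $$
$$ n_{1/2,1} = \frac{\ln 2}{0.26} \approx 2.7 \ \text{sessions}, \qquad n_{1/2,2} = \frac{\ln 2}{0.087} \approx 8.0 \ \text{sessions} $$
$$ \frac{\tau_{2}}{\tau_{1}} = \frac{n_{1/2,2}}{n_{1/2,1}} = \frac{0.26}{0.087} \approx 3.0 $$

The ratio is the same whether time constants or half times are used, because the factor ln 2 cancels.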

This was a theoretical analysis of the results, and it suggests further experiments to correlate difficulty with balance improvement. According to this analysis, a subject could not continue to improve by staying on Level 1, because the learning curve would reach a plateau. An elderly person may benefit from staying on Level 1 for more training sessions, because of the longer learning curve. These concepts, along with the response to training in different populations, should be tested experimentally in future studies.

4.2 Balance improvement

The findings suggest that the AntiGrav™ game is effective in improving balance for two-legged stance in young healthy subjects but not for one-legged stance. The training was performed with feet side-by-side. This position was also required on the Balance Board, and the improvements were the greatest in this test. Moderate results were obtained in the TR test with two feet in line, heel to toe. Weakest improvements (not achieving statistical significance) were obtained in the OL test. These results suggest the specificity of training which has also been observed by others (Di Fabio and Anderson 1993; Coyle et al. 1981; Izquierdo et al. 2002). Future studies should investigate the specificity of training to the head movements used in the training.

It was recognized that functional balance involves a combination of movements. This was the basis for dual-task training conducted by Silsupadol et al. (2006). The AntiGrav™ game also presented simultaneous tasks and tasks in rapid succession, with a combination of head and arm movement targets as well as cognitive challenge in decision making. In future studies, the task breakdown, measurement, and analysis could also be performed.

4.3 Simulator sickness symptoms

Video games have become ingrained in everyday life as a popular leisure activity and are now used by an increasingly wide range of age groups with the appearance of systems like the Wii Fit™. The sensory conflict between the user’s side-to-side head movements and the visual feedback on the display related to forward motion may have produced some of the mild symptoms of simulator sickness observed in the SSQ in the young healthy subjects. Stoffregen et al. (2008) observed an incidence of 42–56 % in motion sickness-related symptoms in an undergraduate student population when playing console video games, which is consistent with our findings. The simulator sickness symptom reduction over repeated exposures is consistent with studies in other types of virtual environments (Sparto et al. 2004; Jacob et al. 1993).

4.4 Limitations

It is recognized that the fast-paced nature of this game is best suited for young healthy subjects. These populations can benefit from balance training for injury prevention. Modified games can be developed for and evaluated on younger patients. Balance disorders in these groups could range from viral infection to Menière’s disease. Custom-designed games can also be developed for older populations interested in balance training.

It is possible that the extra balance testing in session 4 might have affected the results of the final test. However, the minimal series of 3 balance testing sessions overall in the experimental group was demonstrated to be insufficient to produce significant improvement on the OL test. Furthermore, direct training on instability devices alone has been shown to produce results over the course of 6 weeks or 12 sessions (Heitkamp et al. 2001), which is twice the training period used here and many more sessions than the one extra balance testing session.

It is recognized that the sense of immersion may have been limited by the real world visible in the periphery around the display. This was not considered to be a detrimental factor due to concerns about potential simulator sickness symptoms and the subjective reports of a significant sensation of immersion by the subjects in all cases. In fact, the peripheral cues could serve as a safety factor preventing loss of balance during training while still allowing improvement in the balance scores. There is an obvious trade-off between not eliciting motion sickness symptoms for comfort reasons versus the requirement of the user to experience these symptoms for recovery (Norré and Beckers 1988). This trade-off will be investigated in future studies.

Traditionally, training is performed directly on the desired activity, without the need to transfer learning from a simulated environment. Virtual Reality has many benefits, such as the safety and comfort of the home environment, which is important to the elderly and others who are unsure of themselves in actual sport activity. VR also facilitates dual tasking, which should be explored in future studies. It would be a complementary form of training and exercise to improve overall fitness including balance fitness, extending the options available to those who could benefit the most.

5 Conclusions

The purpose of this study was to investigate the potential for an off-the-shelf video game for balance training in healthy subjects, to evaluate specificity of training and motor learning, and to quantify simulator sickness symptoms. It was found that game scores and some balance test results improved over the course of the training. Motor learning effects were modeled at two levels of difficulty and a difficulty parameter was proposed. The results suggest that such a video game could improve balance in a manner specific to the training tasks. In the future, such training can be generalized to other populations, ideally covering a range of tasks and training characteristics such as optimized difficulty levels.