Introduction

The median age of the general population is projected to rise significantly in the coming years [1]. As the elderly population grows in age and size, an increased burden will be placed on already overloaded clinics and hospitals. Major contributors to this increase are the need for care that helps healthy elderly people stay healthy (such as physical exercise, fall detection and fall risk reduction) and the need for rehabilitation after stroke, for which age is a significant risk factor. The demand for technologically advanced methods of elderly care, which can be accessed at any time and used in a private, home-based setting while still providing rehabilitation instructions and progress tracking, is expected to expand. The Kinect is the forerunner in commercially available hardware upon which these methods can be built while remaining affordable enough for large-scale distribution [2]. In this paper we present a review of the most current avenues of research into Kinect-based elderly care and stroke rehabilitation systems to provide an overview of the state of the art, limitations, and issues of concern as well as suggestions for future work in this direction. Figure 1 presents the structure of the manuscript, that is, how studies included in this review are grouped together into relevance-based subsections. Elderly Care comprises two subsections: 1) Fall detection and 2) Fall risk reduction. Stroke Rehabilitation contains: 1) Evaluation of Kinect’s Spatial Accuracy, and 2) Kinect-based Rehabilitation Methods. We have allocated a third section titled ‘Serious and exercise games’ for studies that are indirectly related to the first two sections and present a complete system for elderly care or stroke rehabilitation in a Kinect-based game format.
A concise summary of each study included in the review, with significant findings and subject demographics (if applicable), is also provided in table format (Tables 1 and 2) to facilitate readers’ access to more detailed information for studies of interest. The remainder of the Introduction section provides a brief overview of each of the three main sections of the paper.

Figure 1

Manuscript Structure. Structure of the manuscript summarizing how studies included in this review were grouped together into relevance-based subsections. The Applications in Elderly Care section is comprised of two subsections: 1) Fall detection and 2) Fall risk reduction. The Applications in Stroke Rehabilitation section contains: 1) Evaluation of Kinect’s spatial accuracy and 2) Kinect-based rehabilitation methods. We have included a third section titled ‘Serious and exercise games’ for studies that we believe are indirectly related to the first two sections and present a complete system for elderly care or stroke rehabilitation in a Kinect-based game format. There are many applications of the Kinect in rehabilitative and assistance-based research that, while extremely important, fall outside the scope of this systematic review.

Table 1 Overview of studies categorized under the section applications of Kinect in elderly care
Table 2 Overview of studies categorized under the section applications of Kinect in stroke rehabilitation

Users of the Kinect are able to intuitively interact with a computer through various gestures and postures. This natural method of human-computer interaction allows for the development of specialized forms of elderly care applications and medical alert support systems. These alert systems focus on reducing the probability that a fall incident will occur; falls are a leading cause of injury, emotional distress, and financial burden to the elderly [7–17]. Current prototype alert support systems for fall detection and/or risk reduction show tentative promise of becoming successful tools to extend elderly independent life through accurate fall detection [18–22] and automated gait assessment [23–26].

Ideally, all stroke rehabilitation exercises would be performed with therapist-assisted daily practice; however, the demand this would create for therapists makes it logistically impractical and quite expensive. Kinect-based stroke rehabilitation applications have the potential to reduce this impracticality through guided interactive rehabilitation and virtualized therapists. The accuracy of the Kinect for clinical use to this end is strong [27–32], supporting the potential for full realization of the latter virtualization paradigm, which could make pseudo-therapist assisted home-based rehabilitation a reality [33]. The various Kinect-based rehabilitation methods noted in this review hold great potential not only for supporting accurate completion of rehabilitation [34], but also for possibly enhancing clinical record keeping and future medical diagnostic methods [35].

The Kinect also provides a platform for the development of stimulating game-based applications in both elderly care and stroke rehabilitation. Serious games offer patients rehabilitation environments which help motivate successful completion of otherwise dreary or demanding rehabilitation regimens [36], whereas the aim of exercise games (also termed “exergames”) is to create stimulating methods of maintaining an active lifestyle tailored to the specific physiological and psychological requirements of the elderly and disabled while providing the benefits of physical exercise routines [37–46].

Methods

Inclusion criteria

All peer-reviewed journal and conference proceedings articles published in English, directly (e.g., fall detection) or indirectly (e.g., gait assessment) related to elderly care and stroke rehabilitation and conducted within one of the subfields presented in the previous section, and also in Figure 1, were included in this review.

Exclusion criteria

As the volume of literature regarding the usage of the Kinect in the fields of interest was not anticipated to be extraordinarily large, but rather not yet aggregated, the exclusion criteria for this review were minimal: studies which did not go through a peer-review process for publication, were published in a non-English language, were not directly or indirectly related to elderly care or stroke rehabilitation, or were out of the scope of the subfields mentioned previously, and also in Figure 1, were excluded from the review.

Information databases & search methodology

The following electronic bibliographic databases were searched: IEEE/IET Electronic Library, PubMed, ACM Digital Library, Computer Science Index, Safari Tech Books Online, and ISI Web of Science. Articles were located using the keyword kinect and derived combination sets of the following: stroke, rehabilitation, gesture, posture, clinical, geriatrics, elderly, ageing, aged, alert, fall, gait, exergame, and serious game. No date range or other limits were imposed during the search. Titles and abstracts of all articles were scanned for relevance and a complete list of possible inclusions was compiled with citation information retained in a LaTeX bibliography file. All relevant papers were then closely examined and if a tagged journal paper was deemed a more complete study of a tagged conference paper, the journal version was included and the conference version was discarded. The literature review was concluded on August 1st, 2013.

Data collection and presentation

As the topic of this review spans two overarching categories including multiple, not directly inter-related subcategories, data was collected, and is presented, on a by-topic basis through included charts and graphs as well as in-line text.

Contingency bias

Initially, we planned to assess study quality with the Downs and Black check-list [47]; however, based on the observation that current systems developed utilizing the Kinect are often at an immature ‘proof of concept’ state, this approach was deemed unfruitful. As the Kinect is a newly emerging care and rehabilitation tool, discussing the possibility of various biases within individual papers was determined to be out of the scope of this survey. To this end, all within-scope peer-reviewed studies, including unverified and/or only anecdotally supported, are included in this review.

Results

Figure 2 summarizes the stages of the article search and the inclusion/exclusion process. A total of 948 records were located through a search utilizing the methods described in the previous section. An additional 7 records were located by manual searches of relevant research laboratory websites. As the keyword set used was generalized with a large overlap between paper topic areas (e.g., a reference to ‘ageing’ typically would return papers relevant to both elderly care and stroke rehabilitation), a large percentage of initially located papers were duplicates (≈48%). After removal of duplicate papers, 461 records remained. Of these records, 378 were excluded for relevancy issues deduced from titles and abstracts. For example, searching for “Kinect AND stroke” returned publications from a variety of sub-fields unrelated to this review, such as controller-free exploration of medical image data. Then 83 full-text articles were examined, which resulted in 35 more exclusions for the following reasons: the Kinect was mentioned only as future work, the Kinect appeared only in citation information, the paper was not directly or indirectly related to either elderly care or stroke rehabilitation, the paper was not peer reviewed, and/or a more recent and comprehensive version of the study was located in a journal publication. The remaining 50 studies met all criteria of inclusion, and an overview of the bibliographical information and main results of all studies is provided in Table 1 and Table 2. To extend the utility of these tables to the reader, each study is categorized as a research, methodology or review paper. In addition to the study type, the tables summarize the outcome measures, key findings and human subject demographics (if applicable) of each study.

Figure 2

Study results during PRISMA phases. Visual representation of the article search and inclusion/exclusion process during different phases of the conducted review process.

Kinect applications in elderly care

In this section we provide a review of applications of Kinect in elderly care grouped under two categories: 1) Fall Detection and 2) Fall Risk Reduction. Technologically advanced alert support systems are a potential avenue of assistance for the independently living elderly person, and these systems could also then be leveraged to produce affordable in-home telerehabilitation methods of care [48]. With falls being a main cause of injury and mortality for the elderly [7–17, 49], development of robust, affordable, and widely dispersible in-home fall detection and risk reduction alert systems is needed, and there is significant interest in applications of Kinect to address this need. We refer the reader to Table 1 for a more detailed summary of the studies covered in this section.

Fall detection

Fall detection has traditionally relied on one or more of the following technologies: panic buttons, audio sensors, physically worn accelerometers or gyroscopes, and/or 2D video capture. Each of these systems comes with inherent limitations: patients with various cognitive deficiencies may be unable to successfully utilize panic buttons, audio sensors are easily overloaded by background noise from televisions or music, physically worn devices are cumbersome and easily forgotten, and the performance of 2D video capture systems degrades significantly in shadow-filled or unlit environments [18]. The creation of 3D Red-Green-Blue-Depth (RGBD) cameras is leading to the development of novel alert support system prototypes striving to overcome these previous limitations as well as to enable anonymous, privacy-preserving fall detection [19, 50]. While the added depth field measurement of the Kinect allows for enhancements in previously employed fall detection methods, conclusive evidence of significant improvement in generalized fall detection performance compared to current strategies has yet to be demonstrated. Current studies utilizing the Kinect have been limited to mostly clean environments lacking basic occlusion, and while the results are generally positive, it should be noted that detecting authentic fall occurrences in occluded home environments is a significantly more challenging goal. The Kinect’s initial performance in fall detection remains in need of validation by more extensive research.

As fall detection systems are typically employed in living environments, variables such as distance from the sensor and illumination are major challenges for current video-based strategies to overcome. Nevertheless, Zhang et al. [19] presented a method of fall detection that, through RGB video and image processing methodologies, continued to perform well in the absence of normal lighting as well as when the participant was beyond the depth-sensing range of the Kinect. The reported rate of successful fall detection utilizing this method on simulated events performed by five participants was as high as 98%. Similarly, Lee et al. [20] put forth an algorithm capable of monitoring in shadow-filled or completely dark environments. This system used three unique features (bounding box ratios, normalized 2-D velocity variations from the centroids, and Kinect-gathered depth information) in order to overcome the error introduced by shadows during moving object tracking. An overall accuracy of 97% with a minimal false positive rate of 2% was reported when applying the system to an unspecified ground-truth dataset.

Another current trend of research in RGBD camera-based fall detection support systems is the development of robust systems that can perform unhindered by occlusion caused by static or mobile scenery. The following three methods aim to overcome this pitfall by utilizing, respectively, a 3D bounding box, pre-occlusion velocity data analysis for falls which end in an occluded state, and the orientation of a participant’s derived “spine” with respect to a ground plane.

First, Mastorakis and Makris’s [21] bounding box method utilized a participant’s width, height and depth instead of more standard approaches such as skeletization, calculations derived from a center of mass, or specific measurements of predetermined body points. This approach enabled the system to function fully without data pertaining to scene objects; however, one drawback of this system is that it does not differentiate between bounding boxes created for valid participants and those created for scene objects. If an object falls over, a false positive will occur.
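The bounding box idea can be illustrated with a minimal Python sketch. It is not the implementation from [21]: the function names, the axis convention (x = width, y = height, z = depth, in metres), and the speed threshold are assumptions chosen only for illustration.

```python
import numpy as np

def bounding_box_dims(points):
    """Return (width, height, depth) of the axis-aligned 3D box around
    an (N, 3) array of x, y, z points (assumed to be in metres)."""
    pts = np.asarray(points, dtype=float)
    extent = pts.max(axis=0) - pts.min(axis=0)
    return tuple(float(v) for v in extent)

def looks_like_fall(dims_before, dims_after, dt, height_drop_speed=1.0):
    """Flag a fall when the box height shrinks quickly while its horizontal
    footprint grows. The 1.0 m/s threshold is a hypothetical value."""
    w0, h0, d0 = dims_before
    w1, h1, d1 = dims_after
    height_speed = (h0 - h1) / dt            # positive when the box gets shorter
    footprint_grew = max(w1, d1) > max(w0, d0)
    return height_speed > height_drop_speed and footprint_grew

# Example: an upright person (0.5 x 1.7 x 0.3 m) ends up lying down 0.5 s later.
standing = bounding_box_dims([[0.0, 0.0, 2.0], [0.5, 1.7, 2.3]])
lying = bounding_box_dims([[0.0, 0.0, 2.0], [1.7, 0.3, 2.5]])
fall_flag = looks_like_fall(standing, lying, dt=0.5)
```

Because the check uses only box dimensions, a toppling chair would trigger it just as a person would, which mirrors the false-positive drawback noted above.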

The second method was developed by Rougier et al. [22] and utilized two forms of data: centroid height relative to floor level and the velocity of a moving body. This method allowed for fall detection through use of the former method when there are no significant occlusions, and through utilization of the latter method when the subject is fully occluded after a fall.
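A minimal sketch of this two-branch logic follows. It is an illustration under stated assumptions rather than the algorithm of [22]: the thresholds are hypothetical, and the vertical speed of the centroid is used as a stand-in for the body velocity described in the text.

```python
def detect_fall(heights, visible, dt, height_thresh=0.4, speed_thresh=1.5):
    """heights: centroid height above the floor (m) per frame, None when occluded.
    visible: one bool per frame. dt: frame period in seconds.
    Both thresholds are hypothetical illustration values."""
    if visible[-1]:
        # No occlusion at the end: decide from the centroid height alone.
        return heights[-1] < height_thresh
    # Subject occluded at the end: fall back to the downward speed measured
    # over the last two consecutive visible frames before occlusion.
    for i in range(len(heights) - 1, 0, -1):
        if visible[i] and visible[i - 1]:
            downward_speed = (heights[i - 1] - heights[i]) / dt
            return downward_speed > speed_thresh
    return False

# Visible fall: the centroid ends close to the floor.
visible_fall = detect_fall([0.9, 0.9, 0.6, 0.2], [True] * 4, dt=0.1)
# Occluded fall: the subject drops quickly, then disappears behind furniture.
occluded_fall = detect_fall([0.9, 0.7, None, None], [True, True, False, False], dt=0.1)
```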

The third method, presented by Planinc et al. [18], is a fall detection system that relied on a calculation of a participant’s “spine,” and its orientation with respect to the ground plane. This “spine” is not directly related to the physiological spine, but instead is derived from an analysis of full-body 3D data collected with a Kinect. It can then be assumed that scenarios where this spine rapidly transforms from a state of perpendicular to the ground plane to parallel, and does not return to perpendicular within a specified amount of time, can be considered as a potential fall event.
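The orientation test described above can be sketched as follows. This is an illustrative approximation, not the code of [18]: the angle thresholds, the maximum transition time, and the hold time are all assumed values, and the “spine” direction is taken as a given input vector.

```python
import numpy as np

def spine_angle_to_ground(spine_vec, ground_normal):
    """Angle in degrees between the spine direction and the ground plane
    (90 = perpendicular/upright, 0 = parallel/lying)."""
    s = np.asarray(spine_vec, dtype=float)
    n = np.asarray(ground_normal, dtype=float)
    sin_angle = abs(s @ n) / (np.linalg.norm(s) * np.linalg.norm(n))
    return float(np.degrees(np.arcsin(np.clip(sin_angle, 0.0, 1.0))))

def potential_fall(angles, dt, upright_deg=70.0, lying_deg=20.0,
                   max_transition_s=1.0, hold_s=3.0):
    """Flag a rapid upright-to-lying transition that is not followed by a
    return to upright within hold_s seconds (all thresholds hypothetical)."""
    hold = int(round(hold_s / dt))
    last_upright = None
    for i, angle in enumerate(angles):
        if angle >= upright_deg:
            last_upright = i
        elif angle <= lying_deg and last_upright is not None:
            fast = (i - last_upright) * dt <= max_transition_s
            window = angles[i:i + hold]
            stayed_down = len(window) == hold and all(a < upright_deg for a in window)
            if fast and stayed_down:
                return True
    return False

# 1 s upright, a 0.3 s transition, then 4 s on the ground (10 frames per second).
fall_trace = [90.0] * 10 + [60.0, 30.0, 10.0] + [5.0] * 40
# Same transition, but the person is upright again after 0.5 s.
recover_trace = [90.0] * 10 + [60.0, 30.0, 10.0] + [5.0] * 5 + [90.0] * 40
```

With these traces, `potential_fall(fall_trace, dt=0.1)` flags an event while `potential_fall(recover_trace, dt=0.1)` does not.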

Fall risk reduction

Creation of successful algorithms to directly analyze and predict fall potential through real-time Kinect-based systems is a tantalizing, and potentially possible, idea; however, to date, no direct fall prevention algorithms have yet been demonstrated. The existing studies along this line of research are comprised mainly of gait assessment and pre-emptive in-home fall-prevention training exercises, and while gait-based and pre-emptive exercise methods are, at best, only indirectly related to true fall prevention methodologies, they nonetheless have strong potential to contribute to the enabling of accurate fall detection and prevention in the future.

Gait assessment to reduce fall events

The studies in this section represent differing approaches in current Kinect-based automated gait assessment methodology research. Stone and Skubic [23, 24] conducted a comparison of the Kinect with a ground-truth Vicon system. Gabel et al. [25] and Parajuli et al. [26] each separately demonstrated that a wide range of full-body biomechanical parameters, such as core posture or angular velocities of limbs, were accurately acquirable using the Kinect. Stone and Skubic [23, 24] also performed a successful real-world pilot study of their automated gait analysis system in a functional assisted living facility.

Stone and Skubic [24] collected and compared gait related data using the Kinect, a dual web-camera system, and a ground-truth Vicon motion capture system. Not only was the Kinect reported to have sufficient accuracy for clinical gait assessment, it provided significant improvements in foreground capture and overall computational requirements when compared to the low-cost web camera system.

Gabel et al. [25] developed a method of full body gait analysis through the use of Kinect-based data and a multiple additive regression tree algorithm [51, 52]. The system monitored the time of the stride, stance, and swing phases of a gait cycle, as well as angular velocities of arm movements; however, measurements of lower limb angular velocities and core posture were also noted to be possible. Prediction of kinematic measurements using these algorithms resulted in a difference of stride measurements between the two systems of 35–71 ms and a correlation coefficient of angular velocity readings between the Kinect and a gyroscope of greater than 0.91 for both arms. Furthering these findings, Parajuli et al. [26] demonstrated that variables of Kinect-based posture and gait recognition can also be accurately acquired (up to 99%) through the use of the specific biomechanic and algorithmic parameters of: height/shoulder width, arms’ coordinates, and c-support vector machine scaling (for an optimal hyper-plane [53]).
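Agreement figures of the kind quoted above (a correlation coefficient between two sensors and a mean difference between paired measurements) can be computed as in the following sketch. The function name and the synthetic signals are assumptions for illustration only; no data from [25] are reproduced.

```python
import numpy as np

def agreement(kinect, reference):
    """Return (Pearson correlation, mean absolute difference) for two
    equally sampled signals, e.g. arm angular velocity from the Kinect
    and from a gyroscope."""
    k = np.asarray(kinect, dtype=float)
    r = np.asarray(reference, dtype=float)
    pearson_r = float(np.corrcoef(k, r)[0, 1])
    mean_abs_diff = float(np.mean(np.abs(k - r)))
    return pearson_r, mean_abs_diff

# Synthetic example: a periodic arm swing plus a small deterministic disturbance.
t = np.linspace(0.0, 10.0, 500)
gyro = np.sin(t)                       # stand-in for the reference gyroscope
kinect = gyro + 0.1 * np.cos(3.0 * t)  # stand-in for the Kinect-derived estimate
r_value, mad = agreement(kinect, gyro)
```

The same function can be applied to paired stride durations, in which case the mean absolute difference is expressed in the time unit of the inputs.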

Translating Kinect-based gait assessment application from only system accuracy and biomechanic parameter readings to real-world applications, Stone and Skubic [23] developed a system which yielded the simultaneous creation of accurate, autonomous daily in-home gait data profiles of multiple residents of an assisted living facility. These profiles were then examined and reported as containing sufficient parameters for diagnostic use.

Limitations of Kinect-based elderly care

As the Kinect is a relatively new piece of hardware, establishing the limitations of the sensor within specific application scenarios is an ongoing process. Nevertheless, we provide a list of current limitations of Kinect that we noted based on our review of applications in elderly care systems.

  1. Current Kinect-based fall risk reduction strategies are derived from gait-based, early intervention methodologies and thus are only indirectly related to true fall prevention, which would require some form of feedback prior to a detected potential fall event.

  2. Occlusion in fall detection algorithms, while partially accounted for through the methodologies of the various systems discussed, is still a major challenge inherent in Kinect-based fall detection systems. Current strategies focus on a subject who stands, sits, and falls in an ideal location of the Kinect’s field of vision, while authentic falls in realistic home environment conditions are more varied; therefore, the current results should not be taken as normative.

  3. The Kinect sensor must be fixed to a specific location and has a range of capture of roughly ten meters. This limitation dictates that fall events must occur directly in front of the sensor’s physical location. While it has been noted that a strategically placed array of Kinect sensors could mitigate this limitation [32, 54], a system utilizing this methodology has not yet been implemented and evaluated.

  4. Without careful consideration of the opinions of a system’s proposed user base, concerns regarding ubiquitous always-on video capture systems, such as the Kinect, may inhibit wide-scale system adoption. During the review, it was noted that research related to the reception of alert support systems is at an early phase, likely due to in-home hardware previously being cumbersome and expensive. With the Kinect having the potential to be widely distributed in in-home monitoring systems, this avenue of research has become more viable and relevant [55–57].

Kinect applications in stroke rehabilitation

In this section we provide a review of applications of Kinect in stroke rehabilitation grouped under two categories: 1) Evaluation of Kinect’s Spatial Accuracy, and 2) Kinect-based Rehabilitation Methods. These categories follow the trend of the literature to first evaluate the Kinect sensor as a clinically viable tool for rehabilitation. Motor function rehabilitation for stroke patients typically aims to strengthen and retrain muscles to rejuvenate debilitated functions, but inadequate completion of rehabilitation exercises drastically reduces the potential outcome of overall motor recovery. These exercises are often unpleasant and/or painful, which decreases patients’ tolerance for exercise, as indicated by Dobkin et al. [58]. Lange et al. [59] noted that decreased tolerance or motivation often leads to intentional and unintentional ‘cheating’ or, in the worst case scenario, avoidance of rehabilitation exercises altogether. The Kinect may have the potential to overcome these barriers to in-home stroke rehabilitation as an engaging and accurate markerless motion capture tool and controller interface; however, a functional foundation of Kinect-based rehabilitation potential needs to be established focusing on the underlying strategies of rehabilitation schemas rather than the placating effects offered by serious games. We refer the reader to Table 2 for a more detailed and comprehensive summary of articles focusing on Kinect-based stroke rehabilitation.

Evaluation of Kinect’s spatial accuracy

Gesture-controlled user interfaces have only recently surged in popularity and functionality due to the development of new, affordable computer vision technology. Historically, a majority of research in gesture-controlled virtual reality interfaces has been focused on upper limb rehabilitation [60], usually utilizing hand gestures that required various bulky and impractical designs [61, 62]. With ubiquitous computing hardware advances, such as the Kinect, current research is rapidly migrating toward a more compact and direct human communication method of gestures and gesture patterns. Through these advances, the benefits that virtual reality systems have previously shown in clinical settings, as well as novel home-based stroke rehabilitation paradigms, are becoming feasible. However, before Kinect-based motion capture systems can be deemed useful, the spatial accuracy and resolution of the data at the heart of these systems, namely Kinect-gathered data, need to be thoroughly examined.

The following studies focus on evaluating the accuracy of the Kinect, and when placed under scrutiny, the Kinect has been found, in general, to carry significant potential for a cost-effective motion capture system for rehabilitation. Chang et al. [27] and Loconsole et al. [30] specifically examined accuracy of Kinect in recording upper extremity movements, whereas a whole-body postural evaluation approach was taken by Clark et al. [28], Fern et al. [31], and Obdrzálek et al. [29]. Furthermore, in an attempt to remove the variability introduced through human subjects during accuracy evaluations, Pedro et al. [32] utilized a robotic arm. As the volume of accuracy evaluation studies focusing on specific postures or scenarios is large, we refer the reader to Table 2 for more detailed reports and accounts of these individual studies. In the remainder of this section, we provide a summary of most prevalent results of Kinect accuracy evaluation studies.

The Kinect’s ability to accurately capture upper extremity movements is consistently reported as sufficient for clinical use with regard to elbow and wrist joint tracking; however, mixed results have been reported for the shoulder. Loconsole et al. [30] leveraged a setup containing real, rather than solely virtual, objects, and while some accuracy variation in all joint trajectories was noted depending on the object’s distances on the Z axis (towards the object from the camera) and X axis (horizontal, sideways from the camera), all tests, including those for the shoulder joint trajectories, resulted in readings within 2 cm of the correct/baseline values. This was reported as well within the limits of rehabilitation needs. Chang et al. [27], however, observed acceptable wrist and elbow joint tracking but found the shoulder trajectory readings to be widely inconsistent. The authors attribute these inconsistencies to differing methods of motion capture and joint estimation between the ground-truth OptiTrack system and the Kinect. On the other hand, even with these inconsistencies, when participants were asked to utilize external rotation of the shoulder during game play, the system successfully identified all non-external rotation movements performed as incorrect.

Clark et al. [28], Obdrzálek et al. [29], and Fern et al. [31] concluded that, in general, the Kinect has sufficient accuracy for the assessment of whole-body kinematics for postural control and diagnostic purposes. The notable issues of concern with regards to Kinect-based accuracy values between these three studies ultimately related to one of two things: self-occlusion errors (which can be caused by the angle between a participant and the sensor, specific movements such as placing a hand on one’s lumbar spine, or when the scene contained non-participant objects such as wheelchairs or walkers) or proportional biases which, when observed, always occurred in complex embedded systems such as the pelvis, sternum or shoulders.

In order to simplify the method of verifying the Kinect’s accuracy for rehabilitation and to avoid the influence of various errors introduced by human biomechanics, Pedro et al. [32] utilized a ‘points of interest’ approach rather than a whole-body kinematic analysis. Under this methodology, the Kinect readings had good repeatability in both X and Y axes, whereas repeatability worsened as the distance to the ‘point of interest’ (Z axis) increased. Data gathered in this study showed that the average of the standard deviation of spatial error increased quadratically with distance; however, even with this limitation it was noted that the Kinect retained a sufficient level of accuracy at manufacturer recommended distances for use in rehabilitation applications.
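The quadratic growth of spatial error with distance can be made concrete with a least-squares fit, as in the sketch below. The distances and error values are hypothetical, noise-free numbers generated purely for illustration; they are not measurements from [32], and the function names are assumptions.

```python
import numpy as np

def fit_quadratic_error(distances_m, sigma_mm):
    """Least-squares fit of sigma(z) = a*z**2 + b*z + c; returns (a, b, c)."""
    a, b, c = np.polyfit(np.asarray(distances_m, dtype=float),
                         np.asarray(sigma_mm, dtype=float), deg=2)
    return float(a), float(b), float(c)

def predicted_sigma(coeffs, z):
    """Evaluate the fitted error model at distance z (metres)."""
    return float(np.polyval(coeffs, z))

# Hypothetical illustration values: error standard deviation (mm) that grows
# with the square of the distance to the point of interest (m).
z = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
sigma = 0.5 * z ** 2
coeffs = fit_quadratic_error(z, sigma)
sigma_at_4m = predicted_sigma(coeffs, 4.0)  # extrapolated from the fitted model
```

A model of this form makes it straightforward to check whether the expected error at a planned working distance stays within the tolerance required by a given rehabilitation task.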

In an attempt to further improve the accuracy of the Kinect, including lessening occlusion error or enhancing fine motor control capture, the use of the Kinect together with various inertial sensors has sparked interest. Hussain et al. [63] made use of Kinect-monitored manipulation of specially designed intelligent objects (i.e., a can, a jar, and a key-like object embedded with inertial sensors) for fine motor control diagnostics of the hand and wrist. This allows a virtual environment to monitor the location and kinematics of both the user and the objects manipulated by the user. A variety of hand-held objects very similar, if not identical, to those prototyped by Hussain et al. are utilized in current, widely used stroke impairment classification tests [64]. Data fusion systems of this type have the potential to enable low-cost home-based stroke impairment quantification tests for both gross and fine motor skills.

As noted by Obdrzálek et al. [29], the Kinect does not perform well at skeletizing positions under significant participant occlusion or non-participant object interference. When compensation for this deficiency is required for more specific applications, skeletization based on a fusion of Kinect-gathered and worn inertial sensor data shows promise for accurate data collection. Bo et al. [65] used inertial motion sensing units composed of 2-axis gyrometers and 3-axis accelerometers together with the Kinect (using Primesensor NITE Middleware) to support accurate Kinect-based data capture with significant occlusion-based error reduction. This error reduction was accomplished by utilizing Kinect readings, when available, as a method of inertial sensor calibration, and inertial sensor estimations when Kinect readings were unavailable due to occlusion.
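The calibrate-when-visible, estimate-when-occluded strategy can be sketched for a single joint angle as follows. This is a deliberately simplified offset correction chosen for illustration; it is not the estimator used in [65], and the function name and sample values are assumptions.

```python
def fuse(kinect_readings, inertial_readings):
    """Fuse per-frame joint angles (degrees). kinect_readings holds None for
    frames lost to occlusion; inertial_readings is available for every frame."""
    offset = 0.0  # correction applied to the inertial estimate
    fused = []
    for kinect, inertial in zip(kinect_readings, inertial_readings):
        if kinect is not None:
            # Kinect available: trust it and recalibrate the inertial offset.
            offset = kinect - inertial
            fused.append(kinect)
        else:
            # Kinect occluded: use the inertial estimate with the last offset.
            fused.append(inertial + offset)
    return fused

# Frames 3 and 4 are occluded; the inertial sensor reads 2 degrees low.
fused_angles = fuse([10.0, 12.0, None, None, 20.0],
                    [8.0, 10.0, 13.0, 15.0, 17.0])
```

In practice a drift-aware filter would likely replace the constant offset, but the control flow (Kinect as calibration source, inertial data as fallback) stays the same.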

Kinect-based rehabilitation methods

The ultimate goals of validating Kinect’s accuracy for rehabilitation are diagnostics (quantifying motor function improvement level of patients) and development of home-based rehabilitation protocols. In this section, we provide a summary of studies that focused on Kinect’s applications to pursue these goals, and their results on provisionary physical and mental benefits.

Virtual reality-based rehabilitation offers a highly interactive system with many documented benefits specifically to stroke patients [66], and a large variety of Kinect-based approaches to stroke rehabilitation have recently come to the forefront. From the virtual exercise guide and game-based rehabilitation systems of Da Gama et al. [67], Pastor et al. [68], and Yeh et al. [69], to Shiratuddin et al.’s [70] interactive visuotactile 3D virtual environment and Frisoli et al.’s [71] multi-modal architecture for a brain-controlled interface-driven robotic upper limb exoskeleton, there is a broad range of potential Kinect-based applications.

Promoting proper form/posture, repetition, and enjoyability of stroke-based impairment rehabilitation exercises supports and fosters motor recovery. Toward enabling proper form, Da Gama et al. [67] developed a system which focuses on the guidance and correction of targeted upper extremity exercise movements. This system monitors and corrects inappropriate postural compensation, a common but discouraged strategy during stroke rehabilitation. Yeh et al. [69] proposed a system that, through the manipulation of varied virtual balls, aims to entertain a patient who has to perform repetitive and otherwise dull exercises. The enjoyability of a task is commonly linked in part to personal performance, and building on this premise, Pastor et al. [68] presented a game-based system where the level of difficulty can be personalized to the patient’s specific impairment-related needs through explicit/direct parameter adjustments or based on performance during game play.

Hints of various multidisciplinary directions Kinect-based research is expanding toward can be seen in the more intricate applications of the Kinect presented by Frisoli et al. [71] and Shiratuddin et al. [70]. Frisoli et al. proposed a Kinect-based, multi-modal architecture for a brain-controlled interface-driven robotic upper limb exoskeleton with a goal of providing active assistance during reaching tasks for stroke rehabilitation. At the level of action planning, the patient’s intention to move towards an object is acquired through a Kinect-based vision system that identifies and tracks physical objects, and an eye-tracking system. At the level of action, brain activity is analyzed during motor imagery and controls the exoskeleton accordingly. Experimental results demonstrated that operating the exoskeleton movement through brain-computer interaction was successful with a classification error rate of 89.4 ± 5.0%. Shiratuddin et al. also proposed a unique framework which utilizes non-contact natural user interfaces, such as the Kinect, in an interactive visuotactile 3D virtual environment rehabilitation system.

These initial benefits, system ideas, and hints toward future research demonstrate strong potential both for fruitful Kinect development and for enhancing previously developed, widely used out-patient rehabilitation services [72]. These initial studies, by and large, present positive results; however, the potential impact of Kinect-based rehabilitation on future paradigms is only now emerging, and it is difficult to predict how widespread such systems will become. Their use depends largely on their success in practical implementation, validation of benefits, and acceptance by users.

Limitations of Kinect-based stroke rehabilitation

The current stage of Kinect-based rehabilitation literature is lacking in reported functional and validation data because a majority of systems are only at a proof of concept stage. The following over-arching limitations have been derived from the current state of the literature:

  1. While initial Kinect-based comparisons with research grade motion capture systems demonstrated highly correlated trends and reasonable accuracy, a majority of evaluation studies focused only on sets of gross movements that are advantageous for Kinect. Evaluation of more realistic and/or specific diagnostic movement sets is still needed.

  2. The Kinect is unable to accurately assess internal joint rotations of the shoulder and instead utilizes a much less clinically viable single-point estimation. Use of the Kinect for specific shoulder-based functionality requirements has yet to be shown to be clinically viable.

  3. Rehabilitation goals which include fine motor skills cannot be captured by the Kinect alone; however, initial studies suggest fusion systems of Kinect and inertial sensors can be a feasible alternative.

  4. Kinect systems are usually not suitable for severely disabled patients, as gross movements that remain extremely small in their entirety are difficult for the Kinect to accurately capture.

Serious and exercise games

Historically, virtual reality rehabilitation has been a promising field with an infeasibly high price tag for mass implementation [73]. Tanaka et al. [2] note that recent research, focused on hardware and software, has led to Sony’s EyeToy, Microsoft’s Kinect, and Nintendo’s Wii becoming the top three market contenders as tools for low-cost virtual reality rehabilitation platforms. This initial comparison study concluded that the Kinect is the forerunner of these top three tools, citing three main reasons: 1) the Kinect provides the most natural form of human-computer interaction; 2) the Kinect is the most feasible technology for a widely dispersed system of elderly exergaming, as it utilizes vision-based data capture and requires no extraneous hardware, and 3) the Kinect offers the freedom of controller-free data acquisition and the ease of developer access to the platform required for the development of novel and high quality rehabilitation systems and exergames.

The benefits of focused physical tasks and exercise to stroke victims and the elderly have a rich and well documented history [37–46]. Growing out of this solid foundation, Kinect-based gaming has notable potential to create a low-cost and enjoyable exercise setting while simultaneously gathering quantitative data related to rehabilitation progress, general caloric expenditure, and aerobic activity [74, 75]. Current Kinect-based gaming research consists of exergames and serious games. Exergames (a term for exercise games) aim to combine natural human movements and the entertainment of video games to promote elderly exercise and enable built-in unobtrusive diagnostics, whereas serious games intend to simultaneously rehabilitate motor-impaired users while evaluating patient progress and monitoring for potential patient injury. In this section, we provide a review of studies and their results involving use of Kinect in serious and exercise game applications. Again, we refer the reader to Tables 1 and 2 for more comprehensive summaries of articles focusing on this topic.

Design considerations

In the past, game development has focused on utilizing a player’s cognitive abilities to create an enjoyable experience. The physical dimension of serious and exergames has added an extra challenge to game development while simultaneously enabling video games to find alternative applications in facilitating general and rehabilitation-based exercises. The majority of current design considerations focus on accessibility challenges caused by software development decisions [76] and hand-held and floor-based physical devices such as the Wii, EyeToy, and Dance-Dance Revolution modifications, in addition to Kinect [77, 78]; however, our discussion here focuses mainly on design considerations for Kinect-based systems.

In their comparison of elderly preferences between button-based, mixed button/gesture-based, and gesture-based controllers for game play, Pham et al. [79] observed that Kinect-based controller-free design carries the benefit of being the preferred choice of the elderly. Three main findings were reported: 1) older participants preferred fewer or no physical controlling devices (42% preferred gesture-only, 25% preferred mixed, 8% preferred button-only, and 25% had no preference); 2) the requirement of larger physical movement of the Kinect did not stop it from being the most attractive system, and 3) older adults perceived the need to develop their knowledge and skill further for complete use of the Kinect.

Arntzen et al. [80] examined the physical and cognitive requirements of game design targeted for elderly players based on interviews and a literature review. The resulting design considerations for controller-free game development were compiled to define seven categories: 1) visual; 2) hearing; 3) motion; 4) technological; 5) acceptance; 6) enjoyment, and 7) emotional response. Furthermore, they suggested an iterative approach to game development, in which a preliminary assessment should be done with patients using traditional games, and results of the assessment should inform refinement and definition of requirements. Once a game prototype is developed, another assessment should be conducted on the usability of the system by those with age-related cognitive and/or physical disabilities. Gerling et al. [81] also proposed an iterative method of game development and noted that while age-related visual, hearing, and motion impairments can be accounted for during design, it is advisable to conduct multiple stages of user-feedback driven design prototypes in order to accommodate more specific impairments as well as to ensure user approval in the cognitive categories of technological acceptance, enjoyment, and emotional response.

McNiell et al. [82] offered suggestions for future rehabilitation game development based on previous work and a literature survey. They highlighted that the response to failure and poor performance, in any game, is integral to its use by a player base, and hence should be taken as an important consideration. They suggested that appropriate positive and encouraging feedback mechanisms are necessary tools to overcome the discouragement that system unfamiliarity and poor motor skills will inevitably cause during use of a serious or exercise game.

Jiang et al. [83] suggested a number of heuristics for selecting Kinect-based gesture patterns specific to patients with upper extremity impairments. The guidelines for appropriate gesture selection were derived using a human-based approach which constructs the gesture lexicon based on studying how potential users interact with each other rather than what would be easy for the system to recognize. These guidelines included the following: (1) Select gestures that do not strain the muscles; (2) Select gestures that do not require much outward elbow joint extension; (3) Select gestures that do not require much outward shoulder joint extension; (4) Select gestures that avoid outer positions; (5) Select dynamic gestures instead of static gestures; (6) Select vertical plane gestures where hands’ extension should be avoided; (7) Relaxed neutral position is in the middle between outer positions, and (8) Select gestures that do not require wrist joint extension caused by hand rotation.

Exercise games

The Kinect is not unique in its ability to provide vision-based data capture capable of supporting gaming paradigms; however, current research grade multiple camera motion capture systems are typically expensive, difficult to set up, and require a knowledgeable operator. The Kinect does not suffer from these challenges and, with its low cost, is leading the emergence of natural gaming paradigms and the development of targeted exercise games (the term “exergames” is also commonly used) in a variety of areas.

Pham and Theng [79] demonstrated an interesting interaction between participant performance and preferences. When given the choice among a button-based system (Wii), a system that fused button-based and vision-based (Wii/Kinect), and a solely vision-based (Kinect) system, the majority of elderly participants gravitated toward the Kinect-only system; however, performance measurements suggested that a fusion system of physical buttons and Kinect resulted in higher performance. Two main benefits cited for this general preference of the Kinect-only system were the remote range provided and the more comfortable method of human-computer interaction. Hassani et al. [84] noted that this more comfortable interaction method was especially observed in frail or partially disabled participants who did not desire to get up and walk toward a computer screen. Furthermore, a completely home-based system, as described by Maggiorini et al. [54], would be ideal for a game-based exercise and rehabilitation paradigm.

Gerling et al. [85] conducted two studies to examine the use of Kinect as a human-computer interface for older adults. In the first study, an evaluation of elderly participants’ performance using a set of gestures developed with the aid of a physical therapist was performed. Based on the resulting limitations observed in movement patterns, an exergame was designed targeting institutionalized elderly participants. The second study then investigated how participants responded to the derived game-based gestures, and concluded that Kinect-based gaming has a positive effect on users’ emotional well-being. Sun et al. [86] developed a Kinect-based exergame which allowed players to participate in interactive balance exercises with visual feedback, and explored how Kinect-based balance training exercises influence the balance control ability and the tolerable intensity level of a player. The game would move various body-outline shapes toward the player’s avatar, and the player would then have to imitate the body-outline shape in order to pass through it without touching the outline. As differing player experience evaluation methods resulted in different findings, it was concluded that care must be taken in deciding which evaluation methods are to be employed within game design. Chiang et al. [87] examined the health benefits of somatosensory video games, specifically related to reaction time and hand-eye coordination, for institutionalized older adults confined to wheelchairs. “Follow the Arrow”, “Matchmaker”, and “Mouse Mayhem”, three previously developed games, were modified for the Kinect and then utilized to gather participant-related data. A significant decrease in the mean and standard deviation of times from start to target was noted in the experimental group (which received Kinect-based training), while the control group lacked any observable improvement. Chen et al. [88] presented a study which attempted to quantify the health benefits of Kinect-based somatosensory video games for older adults with disabilities. Various physical benefits were noted throughout the study; however, mental benefits, in general, showed no significant differences between experimental and control groups.

Each of these studies concluded with overall positive results and demonstrated that Kinect-based gaming can significantly improve quality of life using a variety of measures, such as participants’ emotional state [85], physical function, level of body pain [88], visual performance skills, reaction time, and hand-eye coordination [87]. A caveat to these positive results is that evaluation methods based specifically on player experience can result in notably different outcomes. Thus, exergames for training purposes that build strongly on player experience as a metric should be designed with care [86]. Also, to understand the efficacy of somatosensory video game intervention, more rigorous examination needs to be conducted in order to strengthen these initial results.

Serious games

The physical changes that accompany ageing affect a wide range of functions, including sensory-perceptual processes, motor abilities, response speed and cognitive processes [89]. Research on the efficacy of serious games to retain and rehabilitate optimal abilities has been limited mainly to qualitative studies with small sample sizes, focusing on a variety of controllers and inertial sensor systems [33]. This limitation can also be seen in the literature for current Kinect-based serious games, as the majority of studies have not yet moved beyond initial game design and development.

For many stroke patients, balance and weight shift management constitute a risk of secondary injury [90]. Lange et al. [91] prototyped a serious game based on their Flexible Action and Articulated Skeleton Toolkit (FAAST) which enabled a Kinect-based system to run Jewel Mine, a balance rehabilitation game which encourages the user to reach outside of their base of support. Based on discussion and technical support from Lange et al., Huang et al. [92] proposed a smart glove extension to their system for concurrent hand and upper limb rehabilitation by requiring a player to actually grasp the virtual gems and place them into a receptacle instead of just hitting them. Also utilizing the fusion of a Kinect sensor and a haptic glove, Sadihov et al. [93] developed three minigame applications to offer a variety of game play, motor requirement, and difficulty options: 1) a table wiping game; 2) a meteor deflection game, and 3) a rope pulling game. This methodology of developing multiple minigames for maximum variety of play and motor tasks was also employed by Borghese et al. [34] in the development of Animal Feeder, which offered training for dual task management, and of Fruit Catcher, a game which required reaching and weight shift techniques to be employed without movement of the feet. Borghese et al.’s system also carries the potential to utilize Kinect-obtained information about a patient both to fine tune rehabilitation game parameters and to assess patients’ improvement.

Crosbie et al. [60] noted that friendly competition built into a stroke-based serious game can increase social activity and enjoyment; however, it is also reasonable to anticipate that a patient inadvisedly attempting to ‘win’ or surpass a set ‘high score’ may become physically exhausted from overuse, especially in systems based on remote monitoring and lacking direct clinical monitoring [35]. This competitive aspect of human nature has the potential to drastically hamper successful rehabilitation. As a solution to this problem, Saini et al. [94] proposed the “watch dog” monitor in order to prevent overuse or overexertion injuries. This monitoring system alerts a user if a game maneuver they execute goes beyond therapist-recommended kinematic limb and body angle limits and ensures that users will not exercise for periods of time so extended as to be counter-productive to rehabilitation goals.
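
To make the idea of such a monitor concrete, the following is a minimal sketch of a limit check on joint angles and session length. It is not the implementation of Saini et al. [94]; the names `Limits`, `joint_angle_deg`, and `check_frame`, the joint labels, and the numeric limits are all hypothetical, and the 3D joint positions are assumed to come from some skeleton-tracking source.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Limits:
    """Hypothetical therapist-set limits (illustrative only)."""
    max_angle_deg: Dict[str, float]  # joint name -> largest allowed angle
    max_session_s: float             # longest allowed session, in seconds


def joint_angle_deg(a: Sequence[float], b: Sequence[float],
                    c: Sequence[float]) -> float:
    """Angle at point b, in degrees, formed by the 3D points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        return 0.0  # degenerate frame: two joints coincide
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_angle))


def check_frame(angles_deg: Dict[str, float], elapsed_s: float,
                limits: Limits) -> List[str]:
    """Return alert messages for one tracked frame (empty if all is well)."""
    alerts = []
    for joint, angle in angles_deg.items():
        limit = limits.max_angle_deg.get(joint)
        if limit is not None and angle > limit:
            alerts.append(f"{joint}: {angle:.1f} deg exceeds {limit:.1f} deg")
    if elapsed_s > limits.max_session_s:
        alerts.append("session time limit reached")
    return alerts


# Example with made-up values: a right angle at the middle joint.
limits = Limits(max_angle_deg={"elbow": 140.0}, max_session_s=1200.0)
elbow = joint_angle_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(check_frame({"elbow": elbow}, elapsed_s=300.0, limits=limits))
```

In a sketch of this kind the limits are kept in a separate structure so that a therapist could adjust them per patient without any change to the game logic.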

Concerning lower-limb rehabilitation, Llorens et al. [95] developed a serious game which functioned by estimating participants’ foot locations and then creating two virtual feet on a screen, with the game objective of using these virtual feet to step on randomly rising targets that emerged from the floor. The results of this follow-up study involving chronic stroke patients showed an improvement on the Berg Balance Scale [4] from 49.00 to 52.13, which was noted by the authors as surpassing standards of post-stroke improvement in functionality previously established by Liston et al. [96].

Throughout all of these developed systems, one thread of consistency is the positive reception by players and initial rehabilitation results. The overall view in the literature related to serious games is positive; however, the level of current confidence is almost unanimously recognized as tentative and needing further study.

Limitations of current Kinect-based serious and exercise games

The limitations specific to Kinect-based serious and exercise gaming applications, considering the requirement of clinical data capture for specific limb movements; specific player-base-desired design considerations; varying levels of limb impairment, and previously defined serious and exergame-based requirements, can be summarized as the following:

  1. Any games designed specifically for diagnostic usage are limited to non-occluding movements. This implies that standard stroke impairment level tests requiring extensive occluding movement sets may be untenable for a Kinect-based system to capture.

  2. Diagnostic potential for extremities is limited to gross movements, as fine movements of the hand and foot are currently outside the Kinect’s capture sensitivity.

  3. Games targeted at rehabilitation may be prone to “cheating” (e.g. excessive, unnatural and counter-productive trunk-based compensation).

  4. Appropriate response to failure and poor performance, if not accounted for during game design, can inherently limit positive outcomes due to demotivation/discouragement resulting in less frequent/consistent use of the system.

  5. The current benefits of Kinect-based gaming have only tentatively been studied with mainly short term and small sample sized studies. Data to date should be seen as initial results, and not normative.

Discussion

As Kinect-based elderly care and stroke rehabilitation research is in its infancy, a majority of the data acquired is qualitative with a focus on self-report and personal opinion. Compounding this observation, the data is derived only from small groups of participants; it is anything but normative and should be viewed as tentative initial results. Filling the deficiencies of quantitative and large population based research thus remains a potentially fruitful and necessary avenue of future work. The current applications of Kinect in elderly care seem to be at a more mature state than those of stroke rehabilitation; however, even with the current deficiency, we believe that the Kinect carries the potential to become a future cornerstone of widely dispersed care and rehabilitation systems.

With regard to fall detection and fall risk reduction applications, each of the current technologies (panic buttons, audio sensors, body-fixed systems, and 2D and 3D video capture systems) comes with inherent physical and financial limitations. While the Kinect does not render any of these technologies obsolete, and comes with its own limitations, its unique aspects may enhance current systems, or potentially be the foundation for newly developed functionalities.

  1. The autonomous nature of the Kinect allows for fall detection without requiring a user to physically trigger a panic button system or to wear cumbersome physical devices, which can often be forgotten.

  2. The Kinect comes with built-in directional sound capture capabilities, possibly enabling it to be used as, or in unison with, current audio sensor-based systems. This is a functionality of the Kinect not yet studied.

  3. The added third dimension depth field measurement offered by the Kinect requires less overhead than current methodologies, and enables the development of more accurate methods of fall-related image processing.

  4. The low-cost, marker-less, and widely dispersible nature of an already commercially available gaming system immediately enables the current clinical virtual reality-based rehabilitation methodologies to be rapidly relocated to individual home settings.

Several questions remain unanswered: which level of impairment benefits most strongly from use of the Kinect, how to build an ideal exercise routine for preventative or rehabilitation needs, which fall detection algorithm performs best, and how to most efficiently leverage the various capabilities of the Kinect. Nevertheless, the overarching conclusions of this review point toward the Kinect as a promising technology for a wide range of elderly care and stroke rehabilitation applications, and toward the need for studies involving larger participant pools to establish reliability and validity.

Remarks and suggestions for future work

We have compiled the following list based on our review of the current state of Kinect-related elderly care and stroke rehabilitation literature. It contains our remarks as well as suggestions for relevant future research.

  1. A majority of applications in elderly care and stroke rehabilitation require a robust and easily manipulated user interface which at present cannot be readily found in current commercial systems. When developing exercise games, serious games, and applications of Kinect-based rehabilitation, it is vitally important to remember that repurposing a technology initially intended for a younger and healthier audience requires careful adherence to new design strategies focused on both the physiological and psychological requirements of aging and injured users. Therefore we suggest that multidisciplinary research teams involving engineers as well as clinicians, human factors experts and cognitive psychologists would be best positioned to tackle the challenges that such game development efforts would entail.

  2. A critical validation step for Kinect-based applications in both fields is a focused experimental evaluation of the accuracy and latency of the motion capture data obtained from Kinect in comparison with a research grade motion capture device, and statistical evaluation of this data for specific diagnostic potential. The effect of distance from the Kinect sensor on gathered data is another important consideration. In order to verify that the Kinect has the potential to make therapy financially accessible and medically beneficial to a large population of elderly and stroke patients, more targeted studies involving relevant rehabilitation and preventative exercise movements are needed.

  3. Kinect applications may have the potential to simultaneously achieve care or stroke related goals while capturing real-time, clinically viable data for injury risk evaluation. The real-time data gathering aspect of the Kinect has yet to be satisfactorily examined, as a majority of documented work focuses solely on providing assistance or motivation to the user while ignoring this important potential function.

  4. As alert systems potentially gather data in an always-on fashion, even what might normally be considered mundane activities can turn into potential privacy infringements. Because of this potential problem, the reception and concerns of the elderly related to always-on systems require a thorough and careful examination.

  5. Game content specifically designed for aging and/or injured users must simultaneously allow for high standards of captivating game play and long-term enjoyment potential while maintaining seamless methodologies of adaptation and monitoring of players’ needs, which are critical characteristics of a successful low-cost home-based rehabilitation paradigm.

  6. Generic game play may be unsuitable for many patients with secondary disabilities not solely defined by motor function. Strategic adaptation schemas for game play are necessary traits, as complex demands for speech, memory, or cognition patterns significantly reduce the potential for games with large and diverse player bases.

  7. The stimulating aspect of exergames and serious games, while being beneficial in patient motivation, should be closely monitored so as to provide an exercise or rehabilitation environment that discourages overexertion injuries in both individual movements and length of play.

  8. Assumptions of rehabilitation success should not be based on in-game score improvement, as arbitrary scores do not necessarily correlate with actual functional improvement, and evaluation methods based specifically on player experience can result in notably different outcomes. Using player experiences or in-game scores as metrics of rehabilitation success should therefore be avoided.
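
As an illustration of the kind of accuracy comparison suggested in item 2 above, the following minimal sketch computes the root-mean-square error and the Pearson correlation between a Kinect-derived joint angle series and a reference series from a research grade system. The function names and the sample values are hypothetical and invented for illustration, and the two series are assumed to be already time-aligned and sampled at the same instants.

```python
import math
from typing import Sequence


def rmse(kinect: Sequence[float], reference: Sequence[float]) -> float:
    """Root-mean-square error between two equal-length series."""
    if len(kinect) != len(reference) or len(kinect) == 0:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(k - r) ** 2 for k, r in zip(kinect, reference)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Made-up joint angles in degrees (not measured data): the Kinect series
# follows the reference trend but carries a constant 2-degree offset.
kinect_deg = [10.0, 20.0, 30.0, 40.0]
reference_deg = [12.0, 22.0, 32.0, 42.0]
print(rmse(kinect_deg, reference_deg))       # 2.0
print(pearson_r(kinect_deg, reference_deg))  # 1.0
```

The example shows why both measures are worth reporting: a series can be perfectly correlated with the reference, matching its trend, while still carrying a systematic offset that only an error measure reveals.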

Conclusions

In this paper, various applications of the Kinect in the fields of elderly care and stroke rehabilitation have been examined. We have classified these applications into the groups of (1) Fall Detection, (2) Fall Risk Reduction, (3) Evaluation of Kinect’s Spatial Accuracy, (4) Kinect-based Rehabilitation Methods, and (5) Serious and Exercise Games (serious games for stroke rehabilitation and exercise games for the elderly). While only in its initial stages of development, the Kinect already shows notable potential in making therapy and alert systems financially accessible and medically beneficial to a large population of elderly and stroke patients; however, some significant technological limitations remain: a fixed location sensor with a range of capture of only roughly ten meters; difficulty in fine movement capture; limited shoulder joint biomechanical accuracy, and fall risk reduction methodologies that only utilize indirect, gait-based pre-emptive training. The directions for future work are vast and hold promise for enhancing elderly care, stroke patient motivation to accurately complete rehabilitation exercises, rehabilitation record keeping, and future medical diagnostic and rehabilitation methods. Based on our review of the literature, we have reported a summary of critical issues and suggestions for future work in this domain.

Authors’ information

OC received his B.S. and M.S. degrees in Mechanical Engineering in 2004 and 2006 respectively, from Istanbul Technical University, Turkey. He completed his Ph.D. in Mechanical Engineering in 2011 at Rice University and was a Research Assistant at the Mechatronics and Haptic Interfaces (MAHI) Lab from 2006 to 2011. He was an Assistant Professor at San Francisco State University from 2011 to 2013. He is currently an Assistant Professor at Colorado School of Mines and Director of the Biomechatronics Research Laboratory. He was a recipient of the best paper award at the IEEE World Haptics Conference in 2011. His research interests include robotic rehabilitation, biomechatronics, robotics, haptics, human sensorimotor control system, motor adaptation and learning.

DW received his B.A. in Music from the University of Georgia in 2009 and his M.S. in Computer Science from San Francisco State University in 2013. From 2012 to 2013, he was a Research Assistant in the School of Engineering’s Biomechatronics Research Laboratory. His research interests include computer vision, human-computer interaction, biomechatronics, robotic rehabilitation, and kinematic modeling.