Virtual Reality Social Cognition Training for Young Adults with High-Functioning Autism

Few evidence-based social interventions exist for young adults with high-functioning autism, many of whom encounter significant challenges during the transition into adulthood. The current study investigated the feasibility of an engaging Virtual Reality Social Cognition Training intervention focused on enhancing social skills, social cognition, and social functioning. Eight young adults diagnosed with high-functioning autism completed 10 sessions across 5 weeks. Significant increases on social cognitive measures of theory of mind and emotion recognition, as well as in real-life social and occupational functioning, were found post-training. These findings suggest that the virtual reality platform is a promising tool for improving social skills, cognition, and functioning in autism.


Introduction
Transition into adulthood is often a challenging time for individuals with autism spectrum disorders (ASD). Social impairments, inherent in high-functioning autism (HFA), interfere with the process of building relationships, functioning occupationally, and participating and integrating into the community (Hendricks and Wehman 2009), all of which are aspects of social functioning that take on a larger role during the transition to adulthood. For many individuals with HFA, navigating these new social demands can be challenging. Recognition of these challenges has led to the development of treatments for adults with HFA that aim to increase functioning by focusing on selected social skills, such as emotion recognition, that in turn may improve social cognition and social functioning. Less often, interventions are based directly on social cognition, the thought processes of accurately integrating, interpreting, and responding to social cues. However, there are few evidence-based interventions for adults with HFA (Moxon and Gates 2001) that apply either a social skills or a social cognition approach. Further, identifying informative variables with which to measure change over the course of social interventions remains elusive (Cunningham 2011).
Previous adult interventions, outside the realm of Virtual Reality (VR), have quantified social skills and cognition over time using a variety of techniques and measures, and have had mixed success. Some adult group intervention studies have utilized observational techniques, such as rating conversational style during simulated scenarios of a party or a job interview (Howlin and Yates 1999), or observing the frequency and types of interactions (Hillier et al. 2007). Self-rated questionnaires, such as those assessing empathy and acceptance by others, also have been selected (Hillier et al. 2007). Significant changes on self-rated questionnaires and observations of social skills training were reported in these intervention studies after 8 weeks and 1 year, respectively, in samples of thirteen and ten HFA adults (Hillier et al. 2007; Howlin and Yates 1999). Another way of measuring social change over time is through the use of social cognition performance measures. Performance measures are recorded while participants engage in social tasks and the accuracy of the response is quantified. For example, measures of emotion recognition in faces or voices are recorded while a participant is asked to label or match an emotion or emotion word. Emotion recognition tasks have been used in previous social skills and cognition training studies in adults (Golan and Baron-Cohen 2006; Turner-Brown et al. 2008). In both aforementioned studies, improvements were found on these measures of emotion recognition, although other measures of theory of mind and conversational skills did not show a change after the interventions. Overall, the previous literature in HFA adults has demonstrated that targeted interventions can improve social skills and social cognition. However, the majority of these interventions have been social skills-based and group-based, which may limit the amount of practice of social skills and the amount of time spent interacting with others outside the autism spectrum. These interventions may also be somewhat limited by the participant's imagination, which has been noted to be impacted in this population (Herrera et al. 2008; Wing and Gould 1979). This lack of real-world training may hinder the generalization of treatment effects. Less is known about the potential change in social skills and social cognition when training is conducted in an individual, real-time simulation of authentic social interactions with neurotypical adults.
One platform that provides opportunities to practice dynamic and real-life social interactions is VR, which is a computer-based simulation of reality in which visual representations, based in everyday life settings, are presented on a screen. VR has been used previously and shown to be an effective intervention tool in treating various conditions including anxiety, phobias such as fear of flying, stroke, and post-traumatic stress disorder (Riva 2005; Anderson et al. 2001; Cameirao et al. 2010; Broeren et al. 2008; Rothbaum et al. 1999). Its utility is likely due to several unique characteristics. Unlike other therapeutic options, such as role-playing, VR represents real-life experiences in a safe, controllable manner that allows for repeated practice and exposure, which is a key element in treatment. VR can also provide naturalistic environments with unlimited social scenarios and has been shown to replicate social conditions (Wallace et al. 2010). Online representations of people, referred to as 'avatars', can be controlled by clinicians, and software can morph clinicians' voices to match the avatar's demographics; for example, an older woman can look and sound like a young boy. Also, Artificial Intelligence (AI) can drive scripted characters that have rote roles without the need for a clinician. Importantly, VR allows for social interactions without the high levels of stress and the fears of mistakes or rejection that are commonly encountered in face-to-face exchanges. For use in helping the autism population, VR leverages a common interest for many adults with ASD, computers, to increase motivation, investment in the treatment, and generalization (Parsons and Mitchell 2002).
The flexibility of VR environments, their removal of common stressors of face-to-face interactions, and their inherent appeal to many adults with ASD, all suggest that VR may prove to be a more effective platform for enhancing social skills and social cognition in ASD compared to other therapeutic tools that are more didactic and constrained in their application. The current study examines whether gaining experience in life-like contexts in VR environments can enhance social skills, social cognition, and social functioning.
There are few VR studies in ASD and even fewer in adults with ASD. Earlier studies have found that VR can be utilized with children with ASD as a learning tool (Strickland et al. 1996), to teach safety skills (Self et al. 2007; Josman et al. 2008), to hold their interest (Cobb et al. 2002; Max and Burke 1997), to monitor eye gaze (Lahiri et al. 2011), to aid learning of pretend play (Herrera et al. 2008), and to help them interpret emotions of avatars accurately (Moore et al. 2005; Cheng and Ye 2010). Three previous studies reported preliminary findings using social scenes of a virtual café and/or bus, and AI, to teach social conventions to adolescents with ASD. The VR software provided users an opportunity to maneuver an avatar and engage in simplified interactions, such as finding a place to sit. Parsons et al. (2004) investigated the use of the virtual café in twelve children (13-18 years) with ASD having estimated full scale intelligence quotients (IQ) falling in the mildly impaired to average ranges. When compared to the matched control group, the ASD group had difficulties maneuvering avatars; the authors explain this as potentially being related to difficulties with personal space rather than motor or executive dysfunction or misinterpretation of the VR environment. The ASD children were also observed to use and engage in the virtual environment as a representation of reality. Building from that examination, a qualitative case study of two adolescent boys (IQ 73 and 100) was conducted to further investigate the use of VR in ASD interventions (Parsons et al. 2006). Again, results indicated that the ASD adolescents interpreted the VR as life-like, while enjoying the task and discussions with the real-life facilitator seated next to them in the physical world. A subsequent study utilized trained raters to quantify social judgment and reasoning in six adolescents (IQ range 65-110) using Likert scales at three time points. After the use of the adapted virtual training environment, progress in some social decisions, such as where to sit, was reported. These preliminary studies have demonstrated that individuals with ASD can use, appropriately understand, enjoy, and practice social interactions in VR.
While these previous VR studies showed promise, they were limited in several ways. First, the VR software incorporated AI in which the participant made a mouse click on the screen to activate feedback on programmed social decisions. This explicit and discrete skill-set approach and software may limit practice, feedback, and possibly the effects of social interventions. Second, measurement of social performance over time was limited to experimental measures. Measurement of social skills and behavior is difficult, especially since few social measures are published or standardized. A further complication is the lack of sensitive measures, which can affect the results of social skills programs, making them appear less effective (Gresham et al. 2001). The need for reliable and valid tools to measure social cognition and functioning remains a challenge, particularly for adult populations with ASD (Hillier et al. 2007). Additionally, these studies were limited to children and adolescents. Whether VR training is an effective intervention tool for ASD adults across a variety of social skills and cognition remains unaddressed.
VR holds enormous promise for social improvement in ASD by offering a platform to safely practice and integrate social cues that may improve social skills, social cognition, and social functioning. No published studies to date have examined VR intervention in adults with ASD. The Virtual Reality Social Cognition Training (VR-SCT) intervention was developed to address these gaps in providing effective treatments for adults with HFA. Primarily, this pilot study investigated the feasibility of a 10-session VR-SCT intervention in adults with HFA. A secondary aim was to quantify social change over time using social performance and skill measures, and a functional questionnaire. We hypothesized that change scores (post minus pre) on performance measures of social perception, emotion recognition, theory of mind, and social conversation would be significant. A 6-month follow-up phone questionnaire was also conducted to examine social, academic, and occupational functioning.

Participants
Eight participants ranging in age from 18 to 26 years were recruited by the Center for BrainHealth® at The University of Texas at Dallas (UTD) as part of an on-going study of social cognition. All procedures were approved by the Institutional Review Boards at UTD and the University of Texas Southwestern Medical Center at Dallas. Each participant provided written informed consent to participate in the research study. All participants held a primary diagnosis of either Asperger Syndrome or PDD-NOS from a licensed psychiatrist, and diagnoses were confirmed via the Autism Diagnostic Observation Schedule (ADOS; Lord et al. 2002). Participants were excluded if they had an acute psychiatric condition or Axis I psychopathology, except managed depression, or a history of neurologic disorders. All had average to above average estimated IQ score ranges on the Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler 1999). Seven of the participants were Caucasian and six were either employed or full-time students. Refer to Table 1 for demographic information.

Virtual Reality Social Skills Intervention
Technology and VR Environment The VR technology was developed using Second Life™ (Linden Lab 2003), a three-dimensional virtual world software available to the public. The study utilized Second Life™ version 2.1 running on Microsoft Windows XP or newer, with an ATI Radeon 8500 or better graphics card and a 1.5 GHz x86 CPU. The monitor was 24 inches with a resolution of 1920 × 1200.
A protected Second Life™ island, accessible only for the purpose of participating in our VR-SCT training, was designed and utilized for this study. The virtual reality environment included the following locations: an office building, a pool hall, a fast food restaurant, a technology store, an apartment, a coffee house, an outlet store, a school, a campground, and a central park. Avatars, representing the user in the virtual world, were modeled to resemble each participant and the coach involved in the study. Although the avatars' physical appearance could be altered (e.g., changing clothes, height, weight, hair color), movements of the avatar within Second Life™ were limited to arm/body gestures and mouth movements only. Facial movements conveying affect were neither displayed nor changed on the character. Avatars were driven by a standard QWERTY keyboard and mouse. Audio voice manipulation software, MorphVox™ (Screaming Bee 2005), was utilized by the confederate therapist to morph her voice to match the avatar character she was driving in the VR.
Intervention Method The VR-SCT was developed to provide realistic and dynamic opportunities to engage in, practice, and attain feedback on meaningful young adult social scenarios. Each young adult logged into the VR while physically located at the Center for BrainHealth®. See Fig. 1 for a screenshot showing the modeling of an avatar to resemble each participant. After logging onto the VR system, participants were instructed by the coach (lead clinician) inside the VR with a prompt that directed them to a social situation at a specific location and with a specific person with whom to interact (confederate clinician). Social scenarios were constructed in order to emphasize the learning objective of the session in varying contexts, such as meeting new people, dealing with a roommate conflict, negotiating financial or social decisions, and interviewing for a job. These scenarios were developed to mimic commonly experienced real-world social situations for young adults. Table 2 describes the learning objectives and social situations across the sessions. The VR-SCT intervention manual provided the procedure, standardized prompts, and questions for the two clinicians, the "coach" and the "confederate," involved in each session. Please see Table 3 for more information on the VR-SCT. The coach met with the participant in real life prior to each session, helped set him or her up at the computer, and moderated each session in the VR with an avatar resembling her physical characteristics. The confederate clinician changed avatars (e.g., older man to younger female) and morphed her voice to match the gender, age, and race of the avatar being portrayed in each scenario. For example, in Session 5 (Job Interview), after logging into the VR system a participant would be greeted by the coach avatar in the VR and instructed to go to the office building for his/her first interview (see Fig. 2 for a screenshot of the job interview scenario). During the first interview with the interviewer (confederate therapist acting as an older male), the coach would observe the participant's performance and take notes on social objectives (e.g., recognizing emotion and interest of interviewer, responding appropriately with relevant language, conveying emotion and interest). After the interview ended and the participant exited the office building, the coach asked structured questions about the participant's insight into his/her performance during the interview and then provided education and individualized feedback. Next, the participant would be instructed to go to the electronic store for his/her second interview and to attempt to incorporate the feedback he/she had just received.

Pre- and Post-Measures
A battery of social cognition measures was selected to assess performance before and after the VR-SCT intervention in three areas of social skills and cognition: verbal and non-verbal emotion recognition, Theory of Mind, and conversation skills. These measures have been used to assess previous social skills and cognition training in young adults with HFA (Golan and Baron-Cohen 2006; Turner-Brown et al. 2008) or have a standardization sample. For all measures, higher scores indicate better performance.

Verbal and Non-Verbal Emotion Recognition The new Advanced Clinical Solutions for WAIS-IV and WMS-IV Social Perception Subtest (ACS-SP; Pearson 2009) was utilized to measure social perception abilities. This measure yields four scaled scores derived from various tasks. First, SP-Affect Naming assessed the ability to match photographs of faces and people interacting to basic emotion words. For example, the participant said or pointed to the word "happy" when presented with a picture of a person with a happy expression. Second, SP-Prosody is a similar assessment but with auditory stimuli. These two subtests combine into the third score, the Social Perception Total Score (SP-Total). Fourth, the SP-Pairs score reflects a combination of abilities in deciphering non-literal language, such as sarcasm, and the intention of the speaker. Across ACS-SP scores, average internal consistency is reported as r a = 0. […]
Theory of Mind […] eyes and asked participants to match cognitive states or complex emotions (e.g., "desiring"). The Eyes was presented on paper and raw scores range from 0 to 36. Another ToM measure, Triangles, also known as the Social Perception Task (Abell et al. 2000), was additionally selected. In this experimental measure, adapted from the original videos of Heider and Simmel (1944), participants are asked to narrate the movements of inanimate shapes presented as short videos on a computer screen. Responses were recorded, transcribed, and scored, with more points awarded when participants mentalized or attributed thoughts or feelings to the triangles. The Triangles animation and intentionality scoring criteria are based on the 6-point Likert scale methods of Castelli et al. (2000). Participants describing higher levels of intentional and mental states of the stimuli are awarded higher scores, with a raw score range of 0-30.
Conversational Skills A performance-based measure called the Social Skills Performance Assessment, Version 3.2(2) (SSPA; Patterson et al. 2001) was utilized to assess conversational abilities (such as clarity, fluency, social appropriateness, affect, and overall argument) in response to two structured prompts. Responses from the semi-structured, role-played conversations were audio recorded. Ratings were completed by two raters who were blinded to the pre- or post-time point. Scores were averaged, as ratings were within approximately one point of each other on each item and disagreement was low. Each scene was scored with the manual's pre-defined 5-point Likert scale in eight categories for scene 1 and nine categories for scene 2. Total scores range from 17 to 85.
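The scoring arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (one list of item ratings per scene per rater), the function names, and the validation rules are hypothetical and are not taken from the SSPA manual. The sketch averages two raters item by item and sums across the 8 + 9 = 17 items, which reproduces the stated total range of 17 to 85.

```python
# Hypothetical layout: each rater supplies {"scene1": [...8 ratings...], "scene2": [...9 ratings...]}
SCENE_ITEMS = {"scene1": 8, "scene2": 9}   # item counts stated in the text
LIKERT_MIN, LIKERT_MAX = 1, 5              # 5-point Likert scale


def _check(ratings, scene):
    """Validate one rater's ratings for one scene (assumed rules)."""
    if len(ratings) != SCENE_ITEMS[scene]:
        raise ValueError(f"{scene}: expected {SCENE_ITEMS[scene]} ratings")
    if any(not (LIKERT_MIN <= r <= LIKERT_MAX) for r in ratings):
        raise ValueError(f"{scene}: ratings must lie in 1..5")


def sspa_total(rater_a, rater_b):
    """Average the two raters item by item, then sum over all 17 items."""
    total = 0.0
    for scene in SCENE_ITEMS:
        _check(rater_a[scene], scene)
        _check(rater_b[scene], scene)
        total += sum((a + b) / 2 for a, b in zip(rater_a[scene], rater_b[scene]))
    return total


def max_item_gap(rater_a, rater_b):
    """Largest absolute difference between the raters on any single item."""
    return max(
        abs(a - b)
        for scene in SCENE_ITEMS
        for a, b in zip(rater_a[scene], rater_b[scene])
    )
```

A check such as `max_item_gap(a, b) <= 1` is one simple way to confirm that two raters stayed within one point of each other before averaging; a formal reliability statistic would be a separate step.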

Functional Measure
VR-SCT Follow-up Survey A follow-up survey was constructed to assess the long-term impact of the VR-SCT on social skills, cognition, and functioning. Using yes or no answers, the questionnaire queried whether the VR-SCT directly impacted social skills and cognition such as emotion recognition and expression, perspective taking, conversation skills, and assertion. Questions also asked whether social, academic, and occupational functioning were directly and positively impacted by the VR-SCT and whether this intervention would be recommended to others.

Procedure
Individuals completed a pretesting battery within 2 weeks before starting the training program. Each participant completed ten VR-SCT sessions, 2 per week, 1 hour each. Before the VR-SCT commenced, each participant was trained how to navigate in the VR environment with the use of a standard keyboard. Post-testing occurred no more than 2 weeks following the last training session. Three clinicians trained in autism spectrum disorders were involved in this study. A follow-up phone call was placed after 6 months to collect responses for the VR-SCT Follow-up Survey by one researcher involved in the study.

Data Analysis
To assess the efficacy of the VR-SCT in improving various aspects of social skill and social cognition, mean difference change scores were analyzed with 95% confidence intervals. Effect sizes (Cohen's d) were reported and interpreted using Cohen (1988) metrics: 0.20-0.49 as a small effect, 0.50-0.79 as a medium effect, and 0.80 or above as a large effect. Due to the pilot nature of the study, significance levels for multiple tests were not adjusted. Analyses were performed using SPSS version 18.
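As a rough illustration of this kind of analysis, the following Python sketch computes change scores, a 95% confidence interval based on the t distribution, and a standardized effect size. Everything in it is an assumption made for illustration: the scores are invented, the function names are hypothetical, and the effect size is taken to be the mean change divided by the standard deviation of the change scores, a choice the text does not specify. The study itself used SPSS.

```python
import math
import statistics

# Two-tailed 95% critical values of Student's t for a few small df values
T_CRIT_95 = {5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def change_score_summary(pre, post):
    """Summarize paired pre/post scores as a mean change with a 95% CI."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [b - a for a, b in zip(pre, post)]   # post minus pre
    n = len(diffs)
    if n - 1 not in T_CRIT_95:
        raise ValueError("no tabulated critical value for this sample size")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)            # sample SD (n - 1 denominator)
    half_width = T_CRIT_95[n - 1] * sd_diff / math.sqrt(n)
    return {
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "ci_low": mean_diff - half_width,
        "ci_high": mean_diff + half_width,
        "d": mean_diff / sd_diff,                # assumed effect-size definition
    }


def interpret_effect(d):
    """Label an effect size using the 0.20 / 0.50 / 0.80 cut-offs from the text."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"


# Invented scores for eight hypothetical participants
pre = [10, 12, 9, 11, 13, 8, 10, 12]
post = [12, 13, 11, 11, 15, 10, 12, 13]
summary = change_score_summary(pre, post)
print(summary, interpret_effect(summary["d"]))
```

With eight participants the interval uses 7 degrees of freedom, so an interval that excludes zero corresponds to a significant paired change at the unadjusted .05 level.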

Results
All participants completed all ten VR-SCT training sessions as well as pre- and post-testing. Table 4 presents individual pre-, post-, and change scores. Table 5 presents data on measures of emotion recognition, theory of mind, and social performance.

Conversational Skills
Social performance scores as measured by the SSPA increased, although not significantly, after the intervention, mean difference (SD) = 3.50 (4.26), 95% CI: -0.06 to 7.06, p = .053.

Participants learned to navigate within the VR environment with relative ease. Behaviors observed online mimicked real-life behaviors with similar conversational styles, such as speaking too much on one topic, showing a lack of initiation during conversation, as well as atypical use of proxemics, as observed during the ADOS. The participants' verbal feedback indicated that they enjoyed the sessions and would have enjoyed additional sessions. At least 6 months after completing the post-testing, participants were asked to provide comments or feedback to help improve the intervention. Please refer to Table 6 for results of the VR-SCT Follow-up Questionnaire. Comments included the following: "It provides years of social training in just a few sessions, and I think everyone could benefit from this, not only those with Asperger's." Participants reported specific social benefits as a direct consequence of their participation in the intervention. Most participants indicated that using the computer intervention and advanced technology helped draw them out into social situations and boosted their overall confidence. This added insight and confidence made them more willing to experience social opportunities within everyday situations.

Discussion
This pilot study is among the first to investigate the use of VR to enhance social skills, social cognition, and social functioning in young adults on the autism spectrum. The VR-SCT was developed as a semi-structured, manualized intervention that utilized the strengths of the VR platform and dynamic practice of meaningful social scenarios for young adults. The VR-SCT learning objectives focused on real-time performance of emotion recognition (recognizing others' feelings and tone of voice), theory of mind (recognizing and responding to others' thoughts and desires), and conversational skills (initiating, maintaining, and closing). After 10 sessions of the VR-SCT intervention, scores significantly increased on some measures of verbal and non-verbal emotion recognition and theory of mind. These findings support the potential benefit as well as the feasibility of implementing and further testing a VR social skills training program to practice various social abilities in a relatively short period of time. Similar to past studies, the current study found that a social skills training intervention may enhance performance on emotion recognition from faces. A large within-group effect size was previously found on a version of Ekman faces after intervention (Turner-Brown et al. 2008). The improvement in face emotion recognition in the present study was achieved in a shorter window of treatment than the longer treatment periods utilized in previous studies. In our study, the gain was measured after 10 sessions delivered over 5 weeks. Earlier studies identified gains in face emotion recognition after a longer intervention period: 18 weeks of training using the SCIT-A intervention in the Turner-Brown et al. study, and 10-15 weeks of social software home use in a study led by Golan and Baron-Cohen (2006). Unlike Golan and Baron-Cohen, the current VR-SCT study did not specifically address and train emotion recognition from faces. The improved performance on recognizing emotions from faces may suggest a carryover treatment effect to an untrained, but socially relevant, domain of social cognition.
In addition to significant improvements on recognizing emotions from faces, recognizing emotions from voice on the ACS-SP also significantly increased after the current VR-Social Cognition Training. Golan and Baron-Cohen (2006) also found improvements on a measure of prosody recognition, which was a targeted and trained social skill, after their treatment. According to the ACS-SP manual, practice effects of the ACS-SP subtests are minimal and therefore the difference at post-testing is likely unrelated to practice effects. Additionally, scores on the ACS-SP Total also significantly increased at post-testing, which suggests that the ability to integrate a variety of social cues (i.e., faces, tone of voice, body language in pictures) had improved.
Another essential social cognitive skill that is often targeted by social training interventions is ToM, the ability to mentalize. The current study utilized two measures to assess these skills before and after the intervention. Significant improvement was observed on the Triangles measure but not on the Eyes task, possibly because some participants' performance worsened at post-testing. The performance decrease may be attributed to variable test conditions for each participant (e.g., fatigue, outside real-world stress factors influencing each person) or may be a result of the difficult nature of the Eyes task. This may be an indication that this measure is not as sensitive to true change in mentalization as other ToM assessments. Similarly, the Eyes test did not reveal significant increases at post-testing in a computer software study (Golan and Baron-Cohen 2006). The authors explained that the Eyes task was utilized as a generalization measure, which suggests that generalization did not occur with the software program, possibly due to the limited time of use. However, improvements on other measures of ToM have been reported (Turner-Brown et al. 2008). In the current study, the VR-SCT was developed to address ToM issues, in particular what another person may be thinking or feeling and how to respond appropriately to that emotion; therefore, the increase in Triangles scores may indicate enhanced sensitivity of this measure to ToM abilities.
Conversational skills, in past studies, have been rated by various methods. For example, Hillier et al. (2007) observed the frequency of participation in group discussion, Howlin and Yates (1999) observed conversation skills, and Turner-Brown et al. (2008) used a standardized, semi-structured role-play assessment, the SSPA, as used in this study. Like Turner-Brown et al., our study did not find increased scores on this measure after intervention. Performance at post-testing may indicate that conversational skills were not enhanced; however, this study's follow-up survey results indicated that individuals perceived this as an area of improvement. That is, conversation skills were most frequently rated as the most changed skill in response to the intervention. As Turner-Brown et al. reported, the SSPA may lack ecological validity compared to real-life conversations, or it may lack sensitivity to the nature of communication change; further, in our study, the subjective ratings may have influenced results. Future studies may consider more sensitive measures.
The 6-month follow-up phone questionnaire, VR-SCT Follow-up Survey, provided further information about the generalization of skills, current functioning of the participants and the feedback of areas for improvement. Although this questionnaire needs further validation, overall, the participants responded favorably to the items that queried about the perceived gains after the intervention. All participants felt like the practice in the VR-SCT helped them to gain social skills to maintain a conversation in real-life following the intervention. Other frequently endorsed skills by the majority of participants included understanding other people's point of view and establishing relationships. The VR-SCT was most frequently rated to impact occupational functioning.
Overall, the results suggest that VR-SCT offers promise as a platform and training protocol to enhance social cognition. Further controlled trials are needed to validate these preliminary findings. This study has additional caveats that will need to be addressed. The small sample size and the lack of a control group limit the generalizability of the study. Additionally, commonly cited limitations of social cognition measures may have affected the measurement of, and sensitivity to, change in social cognition over time, and the subjective ratings of the SSPA in our study were possibly limiting. The VR-SCT Follow-up Survey may have also influenced the results, as it was administered by one of the researchers involved with the participants; further, the design of the dichotomous 'yes' or 'no' answers potentially confounded the results. The VR-SCT intervention was also limited by the technology available, as avatars could make only programmed movements and lacked the ability to display facial emotions in real time.
Future research in VR platforms and social skills and cognition training should utilize technology that includes facial tracking and/or movement on the avatar. This would allow naturalistic (and sometimes subtle) real-time facial affect to be projected onto an avatar in the VR, thus decreasing the loss of social information from real life to VR. In addition, studying the feasibility of a remote application of the VR-SCT, such as at home or in a school setting, will increase accessibility of interventions and establish whether results are consistent across venues. Future research into measures of social cognition will also be useful in understanding the impact of treatments on social abilities and functioning. For example, there is a lack of standardized and published measures for rating an individual's personal emotion expression in facial expression and speech/prosody (e.g., timing of what is said and how it is said). Additionally, other measures such as behavioral observations, depression, and quality of life inventories could prove meaningful to the outcomes of treatment studies in this population.
Overall the current investigation provides preliminary evidence for the feasibility and use of VR and the VR-SCT intervention in young adults with ASD. The major void in evidence-based interventions presents a demand to provide treatment for young adults with ASD transitioning into adulthood. The VR-SCT is an interactive and visually stimulating approach to treatment with preliminary data suggesting it to be a promising, dynamic practice of basic to complex social skills used to enhance meaningful young adult skills and functioning.