Virtual reality (VR) applications are becoming increasingly promising in assessing and treating children with neurodevelopmental disorders, as the technology is more affordable than before (Coburn et al., 2017). Previous reviews on VR assessment and treatment studies (Bailey et al., 2022; Dechsling et al., 2021; Mesa-Gresa et al., 2018; Stasolla, 2021; Valentine et al., 2020; Wieckowski & White, 2017) have focused on broad technological perspectives (including augmented, mixed, virtual reality and desktop VR designs), and have not concentrated on immersive virtual reality (IVR). In this scoping review, we studied articles that report on the assessment and treatment of neurodevelopmental disorders with IVR. Our aim was specifically to look into feasibility, task design, and human guidance during IVR assessment/treatment. We also examined what the studies report on the generalization of the treated skills to everyday life. Furthermore, we assessed the research quality of the scoped studies.

Virtual Reality

Virtual reality means an interactive, three-dimensional digital representation of a real or imagined space (Howard et al., 2021). Immersion describes the technical capabilities of a system, and the subjective correlate of immersion is presence (Slater & Sanchez-Vives, 2016). Immersive virtual reality is most commonly achieved with head-mounted displays (HMDs) and cave automatic virtual environments, CAVEs (Cruz-Neira et al., 1993; Howard et al., 2021). The HMD covers the user’s vision and tracks the head movements of the user to align the presented environment with the user’s point of view, which provides an immersive experience (Howard et al., 2021). Current immersive virtual reality HMD technology has been found to be safe and feasible for users with autism spectrum disorder (ASD) (McCleery et al., 2020; Newbutt et al., 2020). CAVEs project a digital environment around users on the walls of a room, causing users to be surrounded by the immersive virtual environment (IVE) (Howard et al., 2021).

Neurodevelopmental Disorders

According to ICD-11 (International Classification of Diseases, 11th Revision), neurodevelopmental disorders (hereafter NDDs) are behavioral and cognitive disorders that arise during the developmental period (World Health Organization, 2018). Only disorders whose core features are neurodevelopmental are included in this grouping. The etiology of NDDs is complex, and in many individual cases unknown (ICD-11; World Health Organization, 2018). Neurodevelopmental disorders frequently co-occur (Thapar & Rutter, 2015). They may involve atypical development of intellectual, social, and linguistic skills (ICD-11; Thapar & Rutter, 2015), and lead to diagnoses such as disorders of intellectual development, developmental speech and language disorders, autism spectrum disorder, developmental learning disorder, and attention deficit hyperactivity disorder (ICD-11; Thapar & Rutter, 2015).

ASD is a complex neurodevelopmental disorder characterized by restricted, repetitive, and inflexible behaviors and persistent deficits in reciprocal social communication (ICD-11; World Health Organization, 2018). The symptoms of social communication vary individually from mild to severe, but include difficulties in understanding contextual language and nonverbal communication (Loukusa et al., 2018). In developmental language disorder (DLD), there are persistent difficulties in the acquisition, understanding, production, or use of language, which arise during early childhood and cause limitations in communication (Bishop et al., 2017). Attention deficit hyperactivity disorder (ADHD) is characterized by a persistent pattern (at least 6 months) of inattention and/or hyperactivity-impulsivity that has a direct negative impact on academic, occupational, or social functioning (ICD-11; World Health Organization, 2018).

Assessment and Intervention

Assessment is the process of collecting valid and reliable information, e.g., using traditional standardized norm referenced tests, semi-structured observation, questionnaires, and rating scales (Le Couteur et al., 2008; Makransky & Bilenberg, 2014; McCauley, 2013), on which clinical decisions and interventions are grounded (Shipley & McAfee, 2019). Interventions aim at improving human health and behavior (Smith et al., 2015) and strengthening the individual’s ability to participate in life, and can target both the individual and the community (ICF, World Health Organization, 2001). Health Interventions consist of three dimensions: target, action, and means (ICHI, World Health Organization, 2020). In this review, we analyze actions and means in immersive VR interventions.

It is crucial to consider the generalization of the effects of an intervention to everyday life when evaluating its effectiveness (Carruthers et al., 2020). Generalization is an active process, and therefore interventions should include direct approaches to promote it (Carruthers et al., 2020; Stokes & Baer, 1977). For example, generalization can be supported with sufficient and varied examples and including common aspects of the original learning environment and the new contexts (Carruthers et al., 2020; Stokes & Osnes, 1989). Individuals with ASD experience difficulty in applying things they have learned in one context to another (e.g., from school to home) (Carruthers et al., 2020; Ingersoll, 2008; Jones et al., 2011), and generalization has increasingly become an explicit focus in many autism intervention programs (Carruthers et al., 2020; Green & Garg, 2018; Lord et al., 2005). According to a systematic review of generalization following traditional, non-VR social communication interventions for children with autism, there are only a few methodologically sound social communication intervention studies exploring generalization in autism, and no consensus on how it should be measured (Carruthers et al., 2020). There is some evidence on generalization in ASD following intervention, according to experimental and single-subject design literature, but methodologies vary and findings are inconsistent (Carruthers et al., 2020).

Virtual Reality in Neurodevelopmental Disorders

Immersive virtual reality allows the recreation of controlled real-life situations with high fidelity, a sense of presence in a 3D virtual environment, and measurement in real time (Alcañiz et al., 2019; Peeters, 2019). Wearable VR technologies can be used to evaluate the treatment effect and to personalize the therapy (e.g., Di Palma et al., 2017). For example, recent advances in eye tracking technology enable fast and accurate tracking of the user’s gaze (Courgeon et al., 2014; Piumsomboon et al., 2017). A high-immersion VE (e.g., CAVE, HMD) may be more effective than a low-immersion VE in evaluating and teaching complex social skills, e.g., conversation skills, which require integration of emotion and intention identification, gesturing, and receptive language (Alcañiz et al., 2019; Miller & Bugnariu, 2016).

According to recent reviews, technology (mobile apps/tablets, robots, gaming, computerized tests, videos, and virtual reality) has been used to facilitate assessment and treatment, especially in ASD and ADHD (Bailey et al., 2022; Stasolla, 2021; Valentine et al., 2020; Wieckowski & White, 2017). In the reviews on VR/AR interventions associated with communication or social skills for people with ASD or NDD, the number of articles varied from 31 to 69 studies (Bailey et al., 2022; Dechsling et al., 2021; Mesa-Gresa et al., 2018). Moderate evidence for the effectiveness of VR interventions for children with autism was found (Mesa-Gresa et al., 2018). Most of the research concerning the effectiveness of technological interventions on social communication has focused on design and potential application, and only rarely on the usability of the technology (Wieckowski & White, 2017). Recent reviews have pointed out several weaknesses in the studies, which limit the generalizability of the results, such as small sample sizes, lack of control groups, and lack of interactive activities (Mesa-Gresa et al., 2018; Stasolla, 2021; Valentine et al., 2020). Most participants studied have been individuals with high functioning ASD (Dechsling et al., 2021; Mesa-Gresa et al., 2018). Female participants with ASD were underrepresented (Dechsling et al., 2021). User experiences were not collected systematically (Dechsling et al., 2021), although it is crucial to include persons with ASD and other NDDs (and their stakeholders) in the design and customizing of VR applications to meet their needs and expectations (Garzotto et al., 2018; Newbutt & Bradley, 2022).

What has been lacking in previous reviews is an analysis of the type and amount of human guidance in the virtual environment (see Bailey et al., 2022). Guidance during intervention achieves its purpose when it supports the client’s intrinsic motivation (Deci & Ryan, 2008) to succeed or reduces attitudinal, emotional, or behavioral barriers to proceeding with treatment (Barone, 2016). It is known that persons with NDDs need guidance in VR (Bailey et al., 2022; Parsons & Cobb, 2011). For example, in an IVR intervention for children and young adults with mild to moderate intellectual disability (ID), the participants with moderate ID needed more guidance than did those with mild ID (Eden & Bezer, 2011). Children with ASD needed more facilitation in IVR with HMDs than in a CAVE virtual environment, because the HMD isolates the user from the surroundings (Li et al., 2019). Furthermore, efficacious web-based interventions contained an element of human interaction (either by a real person or an avatar) (Khan et al., 2019).

The Aim of the Study

The main aim of this scoping review was to analyze research on immersive virtual reality assessment and intervention in neurodevelopmental disorders to obtain a comprehensive view of which interventions affect functioning in everyday life, and whether human guidance, besides IVR, is essential in the generalization process. We found no results on NDDs other than ASD and ADHD (except one article on stuttering), and therefore concentrated on these two NDDs. Furthermore, we assessed the quality of the research. The research objectives lend themselves to a scoping review approach, because of the novelty of the research area. The specific research questions were as follows:

1. With which neurodevelopmental disorders were immersive virtual reality assessment and/or intervention studied?

2. How were tasks in immersive virtual reality suited for children with neurodevelopmental disorders? How many subjects interrupted the tasks, and did they report cybersickness/nausea? What was reported about user experience and motivation?

3. What skills were assessed, how was behavior in IVR measured, and were other measures used?

4. What skills were targeted in intervention, what were the results of the IVR practice, and could the skills be generalized to everyday life?

5. What was the level of evidence of the studies?

Methods

Eligibility Criteria

The methods and results were reported according to PRISMA scoping review guidelines (Tricco et al., 2018). The search was limited to the years 2010 to present, because immersive VR technology develops rapidly, and the technical quality of the HMDs affects the user experience (see Newbutt et al., 2020). The search was limited to articles in English. Although we originally included conference proceedings because the field of research is so recent, they were excluded from the final analysis because of their limited methodological quality.

Information Sources

The data search was performed on 25.6.2021 in Scopus (multi-disciplinary citation database of peer-reviewed literature), Ovid PsycINFO (provides access to international literature in psychology and related disciplines), and Web of Science (citation databases of peer-reviewed journals, books, and conference proceedings covering sciences, social sciences, arts, and humanities).

Search

The search terms of neurodevelopmental disorders were based on ICD-11 (World Health Organization, 2018), but the key concepts were similar to those used in previous versions of the diagnostic manuals. Motor disorders (developmental motor coordination disorder and stereotyped movement disorder) were not included. The search terms regarding immersive virtual reality (see Table 1) were designed based on preliminary searches on the subject. We consulted an information specialist of the University of Helsinki when we designed the search phrases, and selected the sources of evidence to obtain a comprehensive range of research literature on immersive virtual reality and neurodevelopmental disorders. The search phrases are available in Supplement 1.

Table 1 The concepts and the related search terms

In addition, we used Iris.ai (AI-powered search tool) in March 2022 to search for relevant articles published after 2020, with no results. After that (the last search was performed on 25.3.2022), we searched for articles manually on Google Scholar (search words virtual reality, neurodevelopmental disorders, developmental language disorder, autism spectrum disorder, ADHD).

Selection of Sources of Evidence

The inclusion criteria for the studies were as follows: studies with children and adolescents; neurodevelopmental disorders and immersive virtual reality intervention or assessment; and subjects, methods, and results reported and full text available, in English, from 2010 to June 2021.

We first screened the search results by title following the eligibility criteria (Fig. 1). The eligibility criteria were revised at each stage of the screening by all three authors. The first and last authors performed the screening, and the results were then agreed on between the authors. In screening by title, the article was included if the title mentioned neurodevelopmental disorders and immersive virtual reality or related terms.
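The title-screening rule described above (include a record only if its title mentions both a neurodevelopmental disorder and immersive virtual reality or a related term) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term lists and the function names are hypothetical, the actual search phrases are those in Table 1 and Supplement 1, and the screening in this review was performed by the authors rather than automatically.

```python
import re

# Hypothetical term lists for illustration only; the real search phrases
# are given in Table 1 and Supplement 1.
NDD_TERMS = ["neurodevelopmental", "autism", "asd", "adhd",
             "attention deficit", "developmental language disorder"]
IVR_TERMS = ["virtual reality", "immersive", "head-mounted display",
             "hmd", "cave"]


def _mentions(title: str, terms: list[str]) -> bool:
    """Return True if any term occurs in the title as a whole word or phrase."""
    lowered = title.lower()
    return any(re.search(r"\b" + re.escape(term) + r"\b", lowered)
               for term in terms)


def include_by_title(title: str) -> bool:
    """Keep a record only if the title names both an NDD term and an IVR term."""
    return _mentions(title, NDD_TERMS) and _mentions(title, IVR_TERMS)
```

A rule like this can only pre-sort records; titles that are ambiguous about the type of VR still need the abstract and full-text checks described in this section.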

Fig. 1 Flow diagram (Tricco et al., 2018) on identification of the studies via databases

The first and last authors performed the screening by abstracts according to the eligibility criteria. Because the abstract in many articles did not define virtual reality clearly, the first author searched for the type of VR from the whole text. At this stage, we excluded all mixed reality, desktop VR, and eye tracking studies with desktop or real surroundings. We also excluded studies including robots, because the design of the task and possibilities for use are very different from those of IVR. Articles that focused only on technological aspects and not on the assessment or intervention effects of participants were excluded. We decided to exclude conference papers (nine on autism intervention and one on the assessment of ADHD) from the analysis, because of their limited methodological quality.

One study in the results aimed at evaluating the relevance of using a virtual classroom in clinical practice with school-age children and adolescents who stutter, measuring the subjects’ confidence as speakers and social anxiety (Moise-Richard et al., 2021). The results showed that a real audience created higher anticipatory anxiety than a virtual one (Moise-Richard et al., 2021). However, neither the self-reported anxiety levels nor the stuttering severity ratings of the participants when talking in front of a virtual class differed from those observed when talking to a real audience, and both were significantly higher than when talking in an empty virtual apartment (Moise-Richard et al., 2021). It was not possible to review the use of immersive virtual reality on stuttering using just one article, and therefore it was not included in the analysis.

Data Charting Process

All three authors designed the data extraction form together, the first author tested it, and all three authors accepted it. Information from each article was extracted by the three authors.

Data Items

The authors defined the data variables together and charted with Excel software. We extracted information on the following details: publication year and type, participant characteristics (number of subjects, diagnosis, age and gender, control group and diagnostic measures reported and/or used in the study), research design (aim and duration of the research, baselines, training period, session length, dosage of VR in total, participant guidance pre, during, and post VR, place of intervention/assessment), the VR technology and its design (hardware configuration, stimulus type, response type, task in VR, description of training/assessment, tutorial scene in VR, hint(s), distractor stimulus, immersion, interaction with the VR environment, avatar/peer interaction during VR, virtual environment(s), game elements), results (time point for measuring outcomes, outcome assessor(s), standardized assessment methods, other assessment methods (behavioral/virtual reality)), generalization of the results (measurement of generalization, standardized assessment methods, other assessment methods (behavioral/virtual reality), time point for measuring generalization, assessors, reported benefits/possibilities), and user experience and side effects.

In the results, some studies related to ASD use the three functional levels (FL) (Level 1, requiring support, to Level 3, requiring very substantial support, according to DSM-5; APA, 2013) to describe the functioning of the participants instead of IQ. In this review, we use both functional level and IQ (if mentioned) for participants with ASD to provide consistency and to clarify the results. When IQ is not mentioned, but the participants are described as attending mainstream school or using spoken language as their primary mode of communication, the functional level is interpreted to be 1. The IQ measures reported in the results also vary: full scale IQ (WISC IV, Wechsler, 2003, and Wechsler Abbreviated Scale of Intelligence III: WASI III); verbal comprehension and perceptual reasoning only (WISC IV), because processing speed and working memory are often the dependent variables in the studies related to ADHD; and nonverbal reasoning ability (Raven’s progressive matrices, Raven & Court, 1998).

Research quality assessment of the studies was conducted according to JBI (Joanna Briggs Institute Evidence Based Practices Database) Levels of Evidence and Grades of Recommendation (Jordan et al., 2019). We used Diagnosis Levels 1–5 on the studies related to assessment and Effectiveness Levels 1–5 on the studies related to intervention. Diagnosis Levels range from Studies of test accuracy among consecutive patients (Level 1), Studies of test accuracy among non-consecutive patients (Level 2), Diagnostic case control studies (Level 3), Diagnostic yield studies (Level 4), to Expert opinion and bench research (Level 5). Effectiveness Levels range from Experimental designs (Level 1), Quasi-experimental designs (Level 2), Observational analytic designs (Level 3), Observational descriptive studies (Level 4), to Expert opinion and bench research (Level 5). The first and last authors first rated the research design quality independently and then resolved any disagreements by discussion, if needed.
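The two JBI scales described above amount to a simple lookup: the study type selects the scale, and the level number selects the design category. The following Python sketch shows one way to encode this. The level descriptions are taken from the paragraph above; the names `Scale`, `JBI_LEVELS`, `scale_for`, and `describe_level` are hypothetical and are not part of any JBI tool.

```python
from enum import Enum


class Scale(Enum):
    DIAGNOSIS = "Diagnosis"          # applied to assessment studies
    EFFECTIVENESS = "Effectiveness"  # applied to intervention studies


# Level descriptions as listed in the text (Jordan et al., 2019).
JBI_LEVELS = {
    Scale.DIAGNOSIS: {
        1: "Studies of test accuracy among consecutive patients",
        2: "Studies of test accuracy among non-consecutive patients",
        3: "Diagnostic case control studies",
        4: "Diagnostic yield studies",
        5: "Expert opinion and bench research",
    },
    Scale.EFFECTIVENESS: {
        1: "Experimental designs",
        2: "Quasi-experimental designs",
        3: "Observational analytic designs",
        4: "Observational descriptive studies",
        5: "Expert opinion and bench research",
    },
}


def scale_for(study_type: str) -> Scale:
    """Map a study type ('assessment' or 'intervention') to its JBI scale."""
    mapping = {"assessment": Scale.DIAGNOSIS,
               "intervention": Scale.EFFECTIVENESS}
    try:
        return mapping[study_type.lower()]
    except KeyError:
        raise ValueError(f"unknown study type: {study_type!r}") from None


def describe_level(study_type: str, level: int) -> str:
    """Return a readable label such as 'Diagnosis Level 2: ...'."""
    scale = scale_for(study_type)
    if level not in JBI_LEVELS[scale]:
        raise ValueError("JBI levels range from 1 to 5")
    return f"{scale.value} Level {level}: {JBI_LEVELS[scale][level]}"
```

For instance, `describe_level("intervention", 3)` returns the label for observational analytic designs on the Effectiveness scale.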

Results

Description of Studies

Immersive virtual reality assessment and intervention were mainly studied in connection with autism spectrum disorder (15 studies) and attention deficit hyperactivity disorder (16 studies). Concerning ASD, there were 6 assessment and 9 intervention studies; for ADHD, there were 14 studies on assessment and 2 studies on intervention. There were participants with ASD and ADHD in three studies, two of which concentrated on assessment. There was one article on stuttering, but no results on other neurodevelopmental disorders.

Participant Characteristics

Reporting of the diagnostic measures varied considerably in the research articles identified with the current search both in the assessment studies (Table 2, participant characteristics of the assessment studies) and in the intervention studies (Table 3, participant characteristics of the intervention studies).

Table 2 Participant characteristics of individuals with ASD and ADHD in assessment studies in IVR tasks
Table 3 Participant characteristics of individuals with ASD and ADHD in intervention studies in IVR tasks

It is noteworthy that information on diagnostic measures was not available in 10 studies out of 15 on ASD and 1 study out of 16 on ADHD. Only 4 out of 15 ASD studies (Alcaniz Raya et al., 2020; Greffou et al., 2012; Raya et al., 2020; Simoes et al., 2020) reported that the diagnosis of autism was confirmed according to the current gold standard, the Autism Diagnostic Observation Schedule (ADOS, Lord et al., 1999). In addition, one out of three studies with participants with both ASD and ADHD, Boo et al. (2022), used ADOS. In one study (Jarrold et al., 2013), autism spectrum-related features were screened with the Autism Spectrum Screening Questionnaire (ASSQ; Posserud et al., 2006), the Social Communication Questionnaire (SCQ, Corsello et al., 2007), and the Social Responsiveness Scale (SRS, Constantino et al., 2004), and in another study (Ip et al., 2018), the participants completed CAST (Williams et al., 2005). Six studies mention that the participants had a clinical diagnosis of ASD based on DSM-5 criteria (APA, 2013) by a licensed professional. In five studies, the children had verifiable autism diagnoses and were recruited from a local hospital, therapy center, or special school. We accepted variability in the ASD diagnostics to obtain a comprehensive view of the development of IVR assessment and intervention, and of the research quality. Considering functioning in ASD, 10 out of 15 studies had participants with Functional Level 1. Although reported IQ varied in the studies related to ADHD, the reported IQ measures of the participants were average (around 100–110) in 9 out of 16 studies.

Feasibility of IVR

Of the 34 studies, 14 reported factors related to user experience, collected with questionnaires (5), parental interviews (2), and verbal feedback (1). The rest of these 14 studies reported the researchers’ observations or did not specify the feedback received.

Participants’ refusal and interruption were reported in 6/34 studies of the current review. The proportion of children with ASD or ADHD who interrupted varied between 1 and 25% in these six studies. Three studies reported that participants with ASD had challenges adjusting to the HMD or viewing goggles (Dixon et al., 2020; Ravindran et al., 2019; Yuan & Ip, 2018). Familiarization and desensitization helped subjects with ASD to proceed with the HMD and virtual environment. In Ravindran et al. (2019), the participants (N = 12) with ASD (ages 9–16) attended 80.3% of VR joint attention training sessions, although nearly half of the participants were pre- or nonverbal, and one-third minimally verbal. The participants were able to complete 97.6% of the VR sessions they attended after school staff had helped them to get accustomed to the HMDs (Ravindran et al., 2019). In three studies, the researchers observed that users with ASD aged 4–19 were able to use the HMD and complete the VR tasks without problems (Fernandez Herrero & Lorenzo, 2020; Johnston et al., 2020; Simoes et al., 2020).

Seven of the 34 studies in the current review reported whether nausea related to immersive virtual reality occurred; of these, the following three reported on factors connected to actual side effects. The participants with ADHD and the typically developing controls reported very few negative experiences in the Simulator Sickness Questionnaire, SSQ (Kennedy et al., 1993), as a result of IVR exposure in Negut et al. (2017) and Seesjärvi et al. (2022). In one study with participants with ASD, the school staff noted that the participants’ comfort with the IVR equipment increased, but the participants’ responses in pictorial questionnaires were inconsistent with these observations (Ravindran et al., 2019).

Factors related to user experience and motivation were reported within the results of 3/34 articles. In one study, nearly a hundred children with ASD participated in the CAVE virtual environment training, and after the research positive feedback was received from parents (Ip et al., 2017). In a study related to ADHD, immersive virtual reality continuous performance task (CPT) was perceived as more enjoyable than desktop CPT in participant debriefings after the test (Eom et al., 2019).

Skills Assessed in IVR

The studies related to ADHD (N = 14) assessed the overall cognitive executive performance using virtual continuous performance tests, VR CPTs (AULA Nesplora Classroom or Nesplora Aquarium). Behavior was quantified by actions with game controllers and HMD head movement recording. The immersive VR CPTs utilized sensory modality (visual vs. auditory) and presence/absence of distractors in addition to the variables of traditional desktop CPTs (omissions, commissions and average response time) (Climent et al., 2011; Areces et al., 2016). VR CPT measurement results discriminated the ADHD group from typically developing controls (e.g., Areces et al., 2018; Camacho-Conde & Climent, 2020).
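The three traditional CPT variables named above can be computed directly from a trial log. The following Python sketch assumes a simplified log in which each trial records whether the stimulus was a target, whether the participant responded, and the response time. The `Trial` structure and `score_cpt` function are hypothetical and do not reproduce the scoring of AULA Nesplora or any other commercial test.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Trial:
    is_target: bool                 # True if a response was required
    responded: bool                 # True if the participant responded
    rt_ms: Optional[float] = None   # response time in ms, if any


def score_cpt(trials: list[Trial]) -> dict:
    """Count omission and commission errors and average the hit response times."""
    # Omission: a target was shown but no response was given.
    omissions = sum(1 for t in trials if t.is_target and not t.responded)
    # Commission: a response was given to a non-target.
    commissions = sum(1 for t in trials if not t.is_target and t.responded)
    # Average response time over correct responses to targets only.
    hit_rts = [t.rt_ms for t in trials
               if t.is_target and t.responded and t.rt_ms is not None]
    mean_rt = mean(hit_rts) if hit_rts else None
    return {"omissions": omissions,
            "commissions": commissions,
            "mean_rt_ms": mean_rt}
```

Averaging only over hits is one common convention and is an assumption here; the additional VR variables mentioned above (sensory modality, distractor presence) could be handled by scoring subsets of trials separately.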

It is notable that the ADHD studies repeatedly utilized VR CPTs, and therefore the research focus is limited. An interesting result of one study was that the reaction time variability of the ADHD group improved significantly when a virtual teacher character who presented social cues (pointing gestures or verbal instructions) was present (Eom et al., 2019). One of the studies assessed attentional executive dysfunctions in open-ended daily living tasks (EPELI, Seesjärvi et al., 2022). The discriminant validity of the open-ended everyday life contexts VR task was excellent and comparable to that of CPT (Seesjärvi et al., 2022).

The methods used to quantify behavior varied across the studies related to ASD assessment (N = 6) and the studies related to ASD, ADHD, and comorbid symptoms of both (N = 2), and they aimed at recognizing diagnosis-related features in IVR. Electrodermal activity (EDA) differed between TD (typically developing) and ASD groups across varied CAVE virtual environments and stimulus conditions (Raya et al., 2020). Motion and body posture tracking was conducted with methods integrated with IVR (n = 2) and other methods (n = 2). Differences between children and adolescents with ASD and typically developing controls were found in postural control, reactivity, and stability (Greffou et al., 2012), controlling interpersonal distance in real vs. immersive virtual worlds (Simoes et al., 2020), responding to a virtual character’s greeting (Alcaniz Raya et al., 2020), and reaction time (Ip et al., 2017).

Only two studies analyzed verbal responses as part of the assessment (Boo et al., 2022; Jarrold et al., 2013). The verbal responses of the participants revealed that the children from the diagnostic groups (ASD, ADHD, and combined) produced less complex structural language than TD children, and language complexity decreased in all groups with increasing social demands (Boo et al., 2022). Children with ASD displayed evidence of atypical social orienting when they were required to simultaneously speak and attend to virtual peers in a virtual classroom (Jarrold et al., 2013).

Gaze and head movement tracking were conducted with HMDs’ built-in eye tracking (N = 1) and by deducing the gaze based on head motion recorded by the HMD (N = 5). The children with neurodevelopmental disorders gazed significantly longer at the virtual teacher (built-in eye tracking in VIVE Pro Eye) than typically developing children did (mean 11.63 s compared to mean 8.21 s) when auditory disturbances were created as the teacher was talking (total event 75 s, disturbances at 30–45 s) (Ide-Okochi et al., 2022). In the other studies, the direction of gaze was deduced based on head motion (Boo et al., 2022; Fernandez Herrero & Lorenzo, 2020; Jarrold et al., 2013; Johnston et al., 2020; Mangalmurti et al., 2020). For example, in a study addressing auditory hypersensitivity, the HMD tracked voluntary involvement with auditory stimuli (Johnston et al., 2020).
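To illustrate what deducing gaze from head motion can mean in practice, the Python sketch below treats the head's forward direction as a proxy for gaze and sums the time during which that direction falls within an angular threshold of a target, such as a virtual teacher. Everything here is an assumption made for illustration: the yaw/pitch convention, the 15-degree threshold, the fixed sampling interval, and the function names are hypothetical, and the reviewed studies do not report their procedures at this level of detail.

```python
import math

Vector = tuple[float, float, float]


def forward_vector(yaw_deg: float, pitch_deg: float) -> Vector:
    """Unit vector of the head's forward direction (assumed convention:
    yaw 0 and pitch 0 look along +z, positive pitch looks up)."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (math.cos(pitch) * math.sin(yaw),
            math.sin(pitch),
            math.cos(pitch) * math.cos(yaw))


def angle_between_deg(a: Vector, b: Vector) -> float:
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def dwell_time_s(samples: list[tuple[float, float]],
                 target_dir: Vector,
                 sample_interval_s: float,
                 threshold_deg: float = 15.0) -> float:
    """Total time the head points within threshold_deg of the target.

    samples holds (yaw_deg, pitch_deg) pairs recorded at a fixed interval.
    """
    looking = sum(
        1 for yaw, pitch in samples
        if angle_between_deg(forward_vector(yaw, pitch), target_dir)
        <= threshold_deg
    )
    return looking * sample_interval_s
```

Head direction is only a coarse proxy, since the eyes can move independently of the head; this is one reason why built-in eye tracking, as used by Ide-Okochi et al. (2022), gives a more direct measure.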

It is notable that HMD VR technology makes measurement possible throughout the intervention and can be used as an assessment tool. The IVR system captured task start and end times during the intervention, accomplishment time, and selective (focus changes) and sustained attention, as well as the reference point of each look event. The system collected data about the number of interactions between the avatars and the user, taking into account the actions taken by the researcher during the conversation (Fernandez Herrero & Lorenzo, 2020). The total time of visual contact between the user and the virtual characters during their interaction was also registered (Fernandez Herrero & Lorenzo, 2020). There was a connection between improvement in performance in the immersive virtual environment, as recorded by the IVR system, and the parents’ responses to questionnaires on communication and social interaction (Fernandez Herrero & Lorenzo, 2020).

Targets of Intervention

ASD intervention studies concentrated on social communication (Fig. 2). The studies aimed at improving nonverbal communication and social interaction (Cheng et al., 2015), verbal and social communication (Fernandez Herrero & Lorenzo, 2020), emotional and social adaptation skills (Ip et al., 2018; Yuan & Ip, 2018), recognition of six basic emotions and social skills (Tsai et al., 2021), and joint attention skills (Ravindran et al., 2019).

Fig. 2 The targets of the IVR interventions

Only two studies concentrated on intervention in ADHD (Bioulac et al., 2020; Tabrizi et al., 2020). The first study compared VR therapy and medication for children with ADHD to improve memory functions (Tabrizi et al., 2020). The other study analyzed attention and the inhibition of distractors in children with ADHD, comparing the effect of methylphenidate treatment, immersive virtual classroom training, and psychotherapy (Bioulac et al., 2020). In addition, one study with eight participants with ASD and ADHD concentrated on anxiety and disruptive classroom behavior (Bossenbroek et al., 2020).

Results of VR Task and Generalization to Everyday Life

To evaluate the results and generalization, we examined guidance and task design in IVR interventions. Nine of the 34 studies systematically reported on human guidance pre-, during, and/or post-IVR (see Table 4). We identified various types of guidance from the included ASD articles. No guidance was reported in either the assessment or the intervention of ADHD. Guidance is defined here as human interaction before, during, or after immersive VR (verbal instructions for HMD VR, guiding participation in CAVE VR intervention, as well as assistance to proceed in either HMD or CAVE VR tasks). In this review, we defined the provider of guidance as a trainer (in the studies: researcher, research assistant, clinician, therapist, teacher, mediator, monitor, or trainer). IVEs, avatars, and virtual characters shaped the interactions in IVR as well. Table 4 describes the guidance and functionalities of the IVEs.

Table 4 Guidance and the functionalities of the IVR

The generalization of the effects of intervention to contexts other than IVR was measured in 83% (10 of 12) of the intervention studies (Table 5). The methods varied: standardized questionnaires (5 of the 12 intervention studies) and observation (4 studies). In two studies, the measurement was conducted in a real environment (Dixon et al., 2020; Miller et al., 2019). In four studies, the generalization was measured by questionnaires and tasks developed by the researchers, not in IVR, but closely related to the intervention settings and providers.

Table 5 Task in IVR, guidance, and generalization

Only one of the intervention studies on ASD reported neither human guidance nor generalization of the results; instead, a non-playable character acted as a guide within the game, and the participants were rewarded with gems (Johnston et al., 2020). The intervention aimed at decreasing auditory hypersensitivity, and interaction with auditory stimuli (by controllers and head movements) increased significantly between sessions one and four (Johnston et al., 2020).

There were two immersive VR interventions for ADHD, of which one reported on generalization. The intervention’s task in a virtual classroom was not very ecologically valid, and human guidance for the participants was not described (Bioulac et al., 2020). The participants improved in the immersive virtual environment and in CPT as much as the control group with medication did (Bioulac et al., 2020). However, there was no significant difference in daily life behavior as observed by parents (ADHD-RS) before and after the intervention (Bioulac et al., 2020).

According to these ten studies on IVR intervention, ecologically valid task design and human guidance that promotes the intervention’s aim seem to ease the generalization process. A task can be considered ecologically valid when it suits the aim of the intervention; for example, in social communication interventions, responding by gaze, with gestures, and verbally (Fernandez Herrero & Lorenzo, 2020; Ip et al., 2018; Yuan & Ip, 2018).

The Level of Evidence of the Studies

Two studies related to the assessment of ADHD reached Level 2 (Studies of test accuracy among non-consecutive patients) (Diaz-Orueta et al., 2014; Zulueta et al., 2019) (Fig. 3). Three of the studies related to intervention reached Level 3 (Observational analytic designs): one on the intervention of ASD (Ip et al., 2018) and two on the intervention of ADHD (Bioulac et al., 2020; Tabrizi et al., 2020). The rest of the intervention studies were at Level 4 (Observational descriptive studies).

Fig. 3

The levels of evidence for diagnosis in the assessment studies and the levels of evidence for effectiveness in the intervention studies (Jordan et al., 2019). ¹Levels of evidence for diagnosis: Level 2 = Studies of test accuracy among non-consecutive patients, Level 3 = Diagnostic case control studies, Level 4 = Diagnostic yield studies. ²Levels of evidence for intervention: Level 3 = Observational analytic designs, Level 4 = Observational descriptive studies

Discussion

The aim of the current scoping review was to analyze immersive virtual reality assessment and intervention in neurodevelopmental disorders and specifically examine feasibility, task design, and human guidance during IVR, and generalization of skills to everyday life. Furthermore, we assessed the research quality of the scoped studies.

According to our synthesis of the literature, immersive VR research is still focused on the symptomatology of ADHD and ASD. There is a gap in immersive virtual reality research for other neurodevelopmental disorders. For example, developmental language disorder (DLD) was not addressed, although there is a need for research on language interventions in DLD (Tarvainen et al., 2021). In particular, the assessment and intervention of language comprehension have not been targeted. IVR was studied in the assessment of individuals with ADHD (41% of the 34 studies), intervention for individuals with ASD (26%), assessment of individuals with ASD (18%), intervention for individuals with ADHD (6%), and assessment of and intervention for combined groups with ADHD and ASD.

The participant characteristics in the inspected studies on ASD were in line with those of previous reviews: most of the included studies were conducted with high-functioning individuals (Bailey et al., 2022; Dechsling et al., 2021; Mesa-Gresa et al., 2018). However, the diagnostic measures were reported in only 33% and IQ in 53% of the studies, although functional level affects the guidance needed in IVR (Eden & Bezer, 2011). Notably, not all articles in the current review used the ADOS, which is considered the current gold standard for diagnosing ASD. Therefore, there may be some differences in participant groups between studies, which may explain some of the results. In the future, diagnostic methods should be reported more consistently, and uniform, internationally recognized diagnostic criteria should be used when possible. We found that the sex of participants with ASD in the studies was not balanced: males were studied more often than females, as in the reviews by Dechsling et al. (2021) and Mesa-Gresa et al. (2018). In our results, too, there were fewer female participants than in the population with ASD in general (12% in the assessment studies, 1 study N/A, and 17% in the intervention studies, 1 study N/A), and this may hinder understanding of the suitability of IVR for all individuals with ASD. This points to a growing need to include female participants in ASD studies.

Concerning the feasibility of immersive VR, only a minority of the studies (41%) in our review reported factors related to user experience in IVR, which has also been noted in a previous review (Dechsling et al., 2021). Overall, the IVR technology was suitable for participants with NDDs. Some individuals with ASD faced challenges in adjusting to HMDs, but familiarization and desensitization enabled them to use the equipment in most reported cases (e.g., Ravindran et al., 2019). For the development of engaging IVR for persons with NDDs, there is a need to collect feedback systematically on side effects (e.g., Simulator Sickness Questionnaire, Kennedy et al., 1993), user experiences, and motivation (e.g., Cognitive Absorption Scale, Agarwal & Karahanna, 2000).

Regarding assessment in immersive VR, all the studies on ADHD targeted attention and cognitive executive functions, whereas the assessment studies on ASD and combined groups targeted diagnosis-related features. Immersive virtual reality allows objective, real-time measurement of behavioral data connected to diagnostic features of ASD and ADHD. Immersive VEs can capture behavioral change during intervention and could be used to measure initial learning (Carruthers et al., 2020). The IVE characteristics (e.g., the situations of the avatars in the IVE) influence the VR data acquired during intervention (see Fernandez Herrero & Lorenzo, 2020), and therefore assessment in IVEs needs further development. In the future, it will also be essential to assess how typically developing individuals perform in the same IVE.

Considering the targets of immersive virtual reality, 67% of interventions on ASD targeted social communication. Human guidance before, during, and after IVR was reported only in studies concerning ASD (10 studies). Previous research suggests that individuals with ASD experience difficulties applying things they have learned in one context to another (Carruthers et al., 2020; Ingersoll, 2008; Jones et al., 2011). This may have supported the stronger emphasis on guidance, which also enabled lower-functioning individuals to participate in IVR. For example, individuals with ASD functioning at Level 2 or 3 were able to participate in an IVR joint attention intervention when provided with guidance to proceed in the task (Ravindran et al., 2019).

To evaluate generalization effectively, it is necessary to measure both the initial learning of the targeted skills and the skills in a context that differs from the intervention (Carruthers et al., 2020). Our results suggest that ecologically valid tasks closely connected to intervention targets (e.g., responding with gestures in a social communication intervention) may ease the generalization of the skills. In 83% of the 12 intervention studies, the tasks in immersive VR were ecologically valid and measurable. Eighty-three percent of the intervention studies included in the review reported whether generalization of the effects to contexts other than IVR occurred, but the measurement methods, timings, and contexts were variable (standardized questionnaires 42% of the 12 studies, observation 33%, questionnaires and tasks closely related to intervention developed by the researchers 33%, real environment 17%). In future studies, it will be crucial to measure the transfer of skills from IVR to daily life contexts (for example, to observe asking for help in everyday life in a social communication intervention), as well as to document the targeted skills, the tasks in VR, and initial learning in sufficient detail.

Interestingly, we found that guidance well targeted to the aims of the intervention pre-, during, and post-IVR, combined with ecologically valid task designs in IVR, may ease the generalization process. When exercising new skills, social and contextual conditions that support the feelings of competence, autonomy, and relatedness are the basis for maintaining intrinsic motivation (Deci & Ryan, 2008). The intervention provider’s co-presence in IVR provides personalized opportunities for guidance and for initiating and responding to joint attention and joint activities. For example, the intervention provider could act as another child in a relevant situation (e.g., buying ice cream together), which may increase motivation and provide repeated opportunities to practice. As in efficacious web-based interventions delivered to children and young people with NDDs, the element of human interaction (therapist involvement, by a real person or an avatar) (Khan et al., 2019) may prove important in generalization, but should be planned carefully.

The research quality assessment showed that the most common assessment research design was the Diagnostic case control study (Level 3), and the most common intervention research design was the Observational descriptive study (Level 4) (Jordan et al., 2019). The evaluation of research quality should be matched to the stage of development of the intervention (Rychetnik et al., 2002), but IVR technology and the field of research are developing rapidly. Concerning ASD interventions, the research still concentrates mainly on implementation of the various immersive virtual designs for the participants (see Wieckowski & White, 2017). More robust research designs (with preferred diagnostic measures, standardized measures of outcomes, and measurement of generalization), multidisciplinary co-operation, and involvement of the stakeholders in the design process are needed to support persons with NDDs (Dechsling et al., 2021; Mesa-Gresa et al., 2018; Newbutt & Bradley, 2022).

The limitations of the current study were that we searched only four databases and did not conduct a complete search after June 2021. Conference proceedings were not included in the analysis because of their limited research quality, although the field of research is very recent. The only article on stuttering, and the information it would have provided on the immersive VR design, was excluded from the analysis, because it was not possible to review stuttering intervention based on only one study. We analyzed human guidance and interaction, but presented the role of virtual characters and IVEs only briefly, because it was not within the scope of the current review.

In the future, there will be a need for immersive virtual reality research on assessment of and intervention for other NDDs besides ADHD and ASD. For example, assessment and intervention of developmental language disorder seem feasible, as IVR is already used in communication interventions in ASD and in foreign language education (see Peixoto et al., 2021). The search terms also aimed to find immersive virtual reality research that utilizes eye tracking, but only one recent study utilized gaze tracking (HMDs’ built-in eye tracking function) in data collection (Ide-Okochi et al., 2022). With the developing technology, HMD-integrated eye tracking offers new possibilities in the assessment of NDDs. For example, specific improvement in core symptoms of ASD has rarely been reported, which may be due to the lack of valid outcome measures capturing change (Kitzerow et al., 2016). With integrated eye tracking technology, IVR designs may provide an opportunity for capturing these core features and changes.

Conclusion

Immersive VR intervention can capture behavioral change, acting as an objective measurement tool, and should be broadened to target more neurodevelopmental disorders. Our review suggests that IVR makes it possible to practice complex social skills in a controlled situation close to daily life. Generalization of the IVR interventions was analyzed in only a few studies, with small groups of participants and variable designs, which limits the conclusions. Active human trainers and ecologically valid tasks in immersive VR involve the child in interaction and joint activities and may help to generalize the results to everyday life. They may also help to personalize and customize the intervention for the individual, making it possible to focus on individual strengths and challenges in IVR. Participatory guidance may, in fact, act as a bridge from IVR to real life, supporting the gaining of new skills for social communication, and this should be planned in more detail.