Introduction

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder characterised by restricted and repetitive patterns of behaviours and persistent deficits in social interactions (American Psychiatric Association, 2013). Pervasive social, communication and adaptive behavioural difficulties emerge early in childhood and have lifelong effects on the psychosocial functioning of individuals with ASD into adulthood (Henninger & Taylor, 2012; van Heijst & Geurts, 2015). ASD affects up to one in 59 children (Baio et al., 2018), with similar prevalence rates recorded in the adult population (Brugha et al., 2011). Although adulthood comprises the majority of one’s life, and approximately half a million young people with ASD will transition to adulthood within the next decade, research beyond middle childhood is limited (Shattuck et al., 2012). Only an estimated 2% of ASD research funding is used to investigate lifespan issues for adults with autism in the US, with the majority of research funding spent on understanding the nature of ASD and improving early identification (Interagency Autism Coordinating Committee, 2016). This is concerning as the estimated national cost for adults with autism in the US is $175 billion per annum (Buescher et al., 2014).

Cognitive, language, social functioning and behavioural outcomes are generally poor for adults with ASD (Magiati et al., 2014). Adults with autism reported difficulties with employment outcomes and independent living skills (Wise et al., 2019). In a systematic review of longitudinal follow-up studies, Magiati et al. (2014) reported social integration and independence outcomes for adults with ASD to be poor or very poor and employment outcomes to be low. While health, education and transition services for children with autism are relatively well established around the world, service provision tailored for adults is only starting to develop (Magiati et al., 2014). While we expect a growing number of older adults with autism will seek clinical services, there are significant gaps in our knowledge of how to support this population. As such, there is a need to identify efficacious interventions that may reduce the economic impact of ASD, as well as improve health outcomes and quality of life.

A systematic review design is an appropriate methodology to consolidate evidence about interventions and inform recommendations for clinical service provision (Higgins & Green, 2011). In particular, a systematic review of randomised controlled trials (RCTs) can provide the highest level of evidence for interventions (Harbour, Miller, & The Scottish Intercollegiate Guidelines Network Grading Review Group, 2001). While there have been attempts to consolidate the already limited research on interventions for adults with autism, another limitation is the lack of valid and reliable outcome measures for adults with autism, which further compromises attempts at treatment evaluation (Brugha et al., 2015).

In a systematic review of outcome measures used in treatment trials (n=30) for older adolescents and adults with autism, Brugha et al. (2015) found that the outcome measures used were inconsistent, with frequent use of non-standardised assessments and limited use of measures designed for individuals with autism or measures focusing on core ASD deficits (i.e., social functioning or communication). Concerns around the use of outcome measures are also reported in a study by Bolte and Diehl (2013), where data from 195 prospective trials were evaluated. Bolte and Diehl (2013) reported identifying 289 different outcome measures. Of these, 61.6% were used once and 20.8% were investigator-designed. Furthermore, only three tools were used in more than 2% of the studies (Bolte & Diehl, 2013).

In a systematic review of psychosocial interventions for adults with autism, Bishop-Fitzpatrick et al. (2013) identified 13 studies, of which only four studies were RCTs. Most of the studies were single-case studies or non-randomised controlled trials, which reported on applied behaviour analysis or social cognitive training. While the number and quality of the studies were limited, results indicated a positive effect of psychosocial interventions for adults with autism (d = 0.14–3.59).

In another systematic review specific to behavioural interventions targeting adaptive skill building for young adults with autism, Palmen et al. (2012) identified 20 studies with 116 participants (sample sizes n = 1–22) with participants ranging from 16 to 55 years (mean age 16.5 years). Improvements were reported in 19 of the studies, which included a range of adaptive skills including social interaction skills, practical academic skills, vocational and everyday living skills. However, 15 of the included studies used a form of single subject design, with only 5 studies using a group design with non-randomised group assignment.

In a more recent systematic review, Hedley et al. (2017) identified 50 studies investigating the effects of vocational rehabilitation programs for adults with autism. Studies were included regardless of their design and Hedley et al. (2017) concluded that the current literature on vocational rehabilitation for adults with autism was limited by poor participant characterisation, small sample sizes, and a lack of randomisation and of appropriate controls.

The current study takes a step towards addressing some of the limitations identified in previous systematic reviews in this area by including all types of non-pharmacological interventions for adults with ASD and by using the most robust evidence available. The aim of this systematic review was to identify and evaluate RCTs that investigated the effects of non-pharmacological interventions for adults with autism. We therefore aimed to (1) identify non-pharmacological interventions for adults with autism using an RCT design, (2) identify the intervention approach, reported outcomes and effectiveness of existing interventions, and (3) determine the methodological quality of the identified non-pharmacological intervention studies.

Method

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Liberati et al., 2009) guided the conduct and reporting of this review (see supplementary Table 1). The methods of the analysis and inclusion criteria were specified a priori and documented in the registration to the International Prospective Register of Systematic Reviews (reference number removed).

Eligibility Criteria

The inclusion criteria were as follows: (1) participants aged 18 years or over; (2) a diagnosis of ASD (including Asperger syndrome or Pervasive Developmental Disorder Not Otherwise Specified prior to the DSM-V) (American Psychiatric Association, 2013); and (3) an RCT design was used to determine the effectiveness of a non-pharmacological intervention. Studies containing young people with autism were only included provided adults aged 18 years or over were also in the sample. Studies were considered an RCT if participants were assigned to any group (intervention or comparison) by means of random assignment. To be included in the review, the intervention group must have received a non-pharmacological intervention and the comparison group could have received no intervention, intervention as usual or an alternate intervention.

We included all studies investigating the effects of non-pharmacological interventions (e.g., independent living/self-management skills, cognitive behaviour therapy, social functioning and vocational rehabilitation), which commonly address core features of ASD (i.e., social and communication skills; restricted, repetitive patterns of behaviour, interests or activities; and adaptive behaviour and functional skills). However, biofeedback interventions were excluded. In addition, studies must have been published in English. Only original, peer-reviewed RCT articles were included. Conference abstracts, reviews, opinion pieces and study designs other than RCTs were excluded.

Information Sources and Search Strategies

A literature search was conducted in September 2020 using the following five electronic databases: CINAHL, EMBASE, ERIC, PsycINFO and PubMed. The full search strategies for each database are described in Table 1. Supplementary grey literature searches were conducted and reference lists were checked to identify studies.

Table 1 Search terms

Study Selection

One author independently screened all titles and abstracts of the retrieved articles against the inclusion criteria. To ensure rating accuracy, a second author independently screened a random sample of 40% of the abstracts, which was examined to determine the inter-rater reliability: Weighted Kappa = 0.87 (95% CI = 0.79–0.95). Where articles could not be excluded based on title and abstract, the reviewers assessed full papers for relevance independently. Disagreements about article inclusion between the reviewers were discussed and resolved by consensus within the research team.

Data Collection Process

A data extraction form was created to extract data from the included articles. Two authors were involved in extracting information about the methodological quality, study characteristics, intervention design, and main findings regarding intervention effectiveness independently. A third author checked all extracted data for accuracy and consistency. Data relating to the characteristics of the individual studies were extracted for: inclusion and exclusion criteria of participants, characteristics of the intervention and control groups, and screening and outcome measures. The Template for Intervention Description and Replication (TIDieR) checklist (Hoffmann et al., 2014) was used as a guide to extract the information relevant to the intervention design. This included the intervention aim, intervention materials (physical and informational), intervention procedures (activities and processes), intervention agent (who delivers the intervention), modes of delivery (how intervention is delivered), dosage (number of sessions, duration, intensity and dose), tailoring (planned adaptations), and modifications (modifications over intervention course).

Risk of Bias of Individual Studies

To assess the potential risk of bias for each study we used the Qualsyst critical appraisal tool, a commonly used quality assessment checklist for evaluating primary research papers from a variety of fields (Kmet et al., 2004). The checklist consists of 14 criteria used to assess the methodological quality of individual studies. Depending on the degree to which each criterion is met, a score between 0 and 2 is given (2 = meets the criterion; 1 = partially meets the criterion; 0 = does not meet the criterion). A total score was derived by adding up the scores from the 14 criteria, with the lowest possible score of 0 and the highest possible total score of 28 for RCTs. The total score was converted to an overall quality percentage score (the total score divided by 28 and multiplied by 100). An overall quality percentage score of 80% or higher indicates strong methodological quality, a score between 70% and 79% good quality, a score between 50% and 69% adequate quality, and a score below 50% poor quality. Two authors scored the included studies against the 14 criteria independently. Disagreements about the scoring between the raters were resolved by consensus.
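The scoring arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure or software used by the review team: the function name `qualsyst_rating` and the example item scores are hypothetical, and the cut-offs are applied to the unrounded percentage, which is an assumption (with a 28-point maximum, no attainable percentage falls between 79% and 80%, so the choice does not affect the resulting category).

```python
def qualsyst_rating(item_scores):
    """Return (total, percent, category) for one RCT scored on 14 criteria.

    Each criterion is scored 0 (not met), 1 (partially met) or 2 (met),
    so the maximum total for an RCT is 28.
    """
    if len(item_scores) != 14:
        raise ValueError("expected scores for exactly 14 criteria")
    if any(score not in (0, 1, 2) for score in item_scores):
        raise ValueError("each criterion must be scored 0, 1 or 2")

    total = sum(item_scores)
    percent = total / 28 * 100  # overall quality percentage score

    # Category cut-offs as described in the text.
    if percent >= 80:
        category = "strong"
    elif percent >= 70:
        category = "good"
    elif percent >= 50:
        category = "adequate"
    else:
        category = "poor"
    return total, percent, category


# Hypothetical item scores: 12 criteria fully met, 2 partially met.
total, percent, category = qualsyst_rating([2] * 12 + [1] * 2)
print(total, round(percent), category)
```

For the hypothetical scores shown, the total is 26 out of 28, which corresponds to a percentage of about 93% and a rating of strong methodological quality.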

Data Synthesis

After data were extracted using comprehensive data extraction forms, data were extrapolated and synthesized within a number of categories: participant characteristics, inclusion criteria, treatment conditions and outcomes, components of studies, components of the interventions, and methodological quality. Intervention outcomes were assessed using effect sizes and the statistical significance of findings. However, we could not conduct a meta-analysis due to heterogeneity in intervention designs and outcome measures used and the methodological quality of the included studies (Guolo & Varin, 2017).

Results

The process of study selection is illustrated in Fig. 1 (PRISMA flow diagram). The initial search using subject headings and free text retrieved a total of 3865 abstracts after duplicates were removed, with 41 articles meeting the final eligibility criteria for this review (see Fig. 1). No additional articles were identified by grey literature searches and checking references of the included articles.

Fig. 1 PRISMA flow chart

Description of Studies

The study characteristics of the 41 included articles are described in Table 2. The majority of the RCT studies (n = 23) were conducted in the US (Capriola-Hall et al., 2020; Cox et al., 2017; Eack et al., 2018; Faja et al., 2012; Gantman et al., 2012; Gentry et al., 2015; Gorenstein et al., 2020; Hayes et al., 2015; Laugeson et al., 2015; Maisel et al., 2019; McVey et al., 2016; McVey et al., 2017; Morgan et al., 2014; Murza et al., 2014; Oswald et al., 2018; Smith et al., 2014; Strickland et al., 2013; Van Bourgondien et al., 2003; Wehman et al., 2017; Wehman et al., 2014; Wehman et al., 2020; White et al., 2016; White et al., 2019). Three studies each were conducted in Japan (Kumazaki et al., 2019; Kumazaki et al., 2017; Miyajima et al., 2016), the UK (Ashman et al., 2017; Gaigg et al., 2020; Russell et al., 2013) and Germany (Bölte et al., 2002; Mastrominico et al., 2018; Rosenblau et al., 2020); two studies each in Spain (Garcia-Villamisar & Dattilo, 2010; Garcia-Villamisar et al., 2016) and the Netherlands (Spek et al., 2013; Wijker et al., 2020); and one study each in Israel (Saban-Bezalel & Mashal, 2015), Sweden (Hesselmark et al., 2014), Canada (Nadig et al., 2018), Nigeria (Akabogu et al., 2019) and Australia (Tang et al., 2020). The studies spanned a period of 18 years (2002 to 2020), with most studies published between 2014 and 2019 (see Table 2).

Table 2 Study characteristics

Participants

Across the 41 articles, there were a total of 846 adults with autism in the intervention groups and 819 in the control groups. Study sample sizes ranged from 218 (intervention group: 114; control group: 104) in McVey et al. (2017) to eight (intervention group: 2; control group: 4) in White et al. (2016). Only eight studies involved participants with an age group mean above the age of 30 years (Ashman et al., 2017; Gaigg et al., 2020; Garcia-Villamisar & Dattilo, 2010; Garcia-Villamisar et al., 2016; Hesselmark et al., 2014; Miyajima et al., 2016; Rosenblau et al., 2020; Spek et al., 2013). One study included adults older than 65 years of age (Gaigg et al., 2020), and five studies included adults aged 60 years or older (Ashman et al., 2017; Mastrominico et al., 2018; Russell et al., 2013; Spek et al., 2013; Wijker et al., 2020). There were considerably more male participants across all included studies except for Akabogu et al. (2019). Three studies included only male participants in the intervention group (Bölte et al., 2002; Morgan et al., 2014; Strickland et al., 2013; Van Bourgondien et al., 2003) and two studies did not report the gender split (Faja et al., 2012; McVey et al., 2017). The total number of male participants was 610 compared with 270 female participants.

Measures for Confirming Diagnosis and Screening for Inclusion

Across the 41 included studies, 21 different standardised measures were used to screen for and confirm ASD diagnosis or other inclusion criteria (see Table 2). Participant diagnosis of ASD (including Asperger syndrome and PDD-NOS) was confirmed using a range of measures. The Diagnostic and Statistical Manual of Mental Disorders (DSM-IV; American Psychiatric Association, 2000; or DSM-V; American Psychiatric Association, 2013) was used to confirm the diagnosis of ASD in 14 studies (see Table 2). The Autism Diagnostic Observation Schedule (ADOS; Lord et al., 2012) was used to confirm ASD diagnosis in 13 studies, the Autism Diagnostic Interview (ADI; Rutter et al., 2003) in five studies, and the International Classification of Diseases-10 (ICD; World Health Organisation, 1994) in two studies (see Table 2). Five studies used both the DSM and ADOS or ADI (Faja et al., 2012; Maisel et al., 2019; Morgan et al., 2014; Oswald et al., 2018; Spek et al., 2013), seven studies reported only using previous school or medical records to confirm diagnosis (Garcia-Villamisar & Dattilo, 2010; Gentry et al., 2015; Hayes et al., 2015; Murza et al., 2014; Strickland et al., 2013; Wehman et al., 2017; Wehman et al., 2020) and three studies referred to an existing diagnosis only (Van Bourgondien et al., 2003; Wehman et al., 2014; Wijker et al., 2020). Of the 41 studies, 18 used two or more methods for confirming diagnosis (see Table 2).

Across the 41 included studies, a range of 19 different measures, in addition to ADOS or ADI, were used for screening of inclusion criteria (see Table 2). The Wechsler Abbreviated Scale of Intelligence (WASI or WASI-2; Wechsler, 2011) or Wechsler Adult Intelligence Scale (WAIS-III or WAIS-IV; Wechsler, 2008) was the most frequently used measure for screening inclusion criteria and was reported across 10 studies. The Autism Spectrum Quotient (AQ; Baron-Cohen et al., 2001) was reported across five studies and the Kaufman Brief Intelligence Test (KBIT; Kaufman & Kaufman, 2005), the Social Responsiveness Scale (SRS; Constantino, 2005), and the Pervasive Developmental Disorder–Autism Society Japan Rating Scale (PARS; PARS Committee, 2008) were each reported across three studies (see Table 2).

Outcome Measures

Across the 41 included studies, 91 different outcome measures were used, with only 18 of the 91 measures used in more than one study (see Table 2). The Social Responsiveness Scale (2nd Edition, SRS; Constantino, 2005) was used in 12 studies and was the most frequently used outcome measure. The Quality of Socialization Questionnaire (QSQ; Gantman et al., 2012), Test of Young Adult Social Skills Knowledge (TYASSK; Gantman et al., 2012) and Vineland Adaptive Behavior Scales (2nd Edition, VABS; Sparrow et al., 2004) were each reported across four studies. The Empathy Quotient (EQ; Lawrence et al., 2004), Social Skills Improvement System-Rating Scales (SSIS-RS; Gresham & Elliott, 2008) or Social Skills Rating Scales (SSRS; Gresham & Elliott, 1990), Supports Intensity Scale (SIS; Thompson et al., 2004), Symptom Checklist 90 (SCL-90; Derogatis & Cleary, 1977) and the Clinical Global Impression-Improvement (CGI-I; Guy, 1976) were each reported across three studies. Some studies used multiple outcome measures to evaluate the same outcome (e.g., anxiety) and many used eight outcome measures or more (see Table 2).

Interventions

We categorised the 41 included studies, which reported 38 non-pharmacological interventions (three studies reported two different non-pharmacological interventions; Gaigg et al., 2020; Hesselmark et al., 2014; White et al., 2016), into four intervention groups (see Table 3). Interventions were categorised into those aiming to improve (1) social functioning and language skills, (2) vocational rehabilitation outcomes, (3) cognitive skills training and (4) independent living skills. Social functioning and language skills interventions were most studied (n = 20), followed by cognitive training interventions (n = 11). Vocational rehabilitation was reported in 10 studies and only one study reported an independent living skills intervention (see Table 3). Across the 41 included studies, the interventions were compared to a waitlist control group (n = 21), alternate intervention (n = 13) or treatment as usual (n = 7; see Table 2).

Table 3 Summary of non-pharmacological interventions for adults with ASD

Social Functioning and Language Skills Interventions

Interventions targeting social functioning and language skills included a range of approaches (see Table 3). Four studies evaluated a specific social intervention, the Program for the Education and Enrichment of Relational Skills (PEERS) for young adults, which aimed to improve social skills (Gantman et al., 2012; Laugeson et al., 2015; McVey et al., 2016; McVey et al., 2017). Seven studies evaluated interventions employing a social skills training-based component (Akabogu et al., 2019; Ashman et al., 2017; Gorenstein et al., 2020; Nadig et al., 2018; Oswald et al., 2018; Van Bourgondien et al., 2003; White et al., 2016). All but one (Van Bourgondien et al., 2003) of these studies used weekly modules in a group setting. Of those using weekly modules in a group setting, all except two (Akabogu et al., 2019; Nadig et al., 2018) specified a component targeting skill generalisation (e.g., biweekly social outings, socialisation homework tasks). All publications on the PEERS program reported significant improvements in the majority of areas targeted, with both Gantman et al. (2012) and McVey et al. (2016) reporting medium to large effect sizes. Two studies reported significant improvements (Akabogu et al., 2019; Van Bourgondien et al., 2003), eleven reported mixed results (Ashman et al., 2017; Bölte et al., 2002; Garcia-Villamisar & Dattilo, 2010; Faja et al., 2012; Gorenstein et al., 2020; Hesselmark et al., 2014; Mastrominico et al., 2018; Oswald et al., 2018; Nadig et al., 2018; Murza et al., 2014; White et al., 2016), and three studies reported no significant improvements (Wijker et al., 2020; White et al., 2016; Saban-Bezalel & Mashal, 2015).

Three studies used computer-based facial recognition (Bölte et al., 2002; Faja et al., 2012; White et al., 2016). One study (Bölte et al., 2002) reported significant improvements on the affect recognition task, but not for behaviour modification; one study reported improvement (Faja et al., 2012) and one reported no change (White et al., 2016). No effect sizes were reported.

Four studies used leisure or recreational therapy (Garcia-Villamisar & Dattilo, 2010; Garcia-Villamisar et al., 2016; Hesselmark et al., 2014; Mastrominico et al., 2018) and one study (Wijker et al., 2020) used a novel approach of dog-assisted therapy to address social awareness and communication. In Garcia-Villamisar and Dattilo (2010) and Hesselmark et al. (2014), the recreational therapy program was designed to improve quality of life (QoL). Garcia-Villamisar et al. (2016) aimed to increase executive functions, social skills, adaptive behaviours and well-being, and Mastrominico et al. (2018) aimed to increase empathy. All five studies reported mixed results, with significant improvements in some of the targeted areas (see Table 3). Hesselmark et al. (2014) and Mastrominico et al. (2018) reported no significant between-group differences.

Only two studies focused on language skills, and both evaluated language comprehension interventions (Murza et al., 2014; Saban-Bezalel & Mashal, 2015; see Table 3). Saban-Bezalel and Mashal (2015) found that, for both the Irony questionnaire and hemispheric processing, the ASD intervention group was significantly less accurate than a typically-developing control group before (p < 0.05), but not after (p > 0.05), the intervention, with medium to large effect sizes reported. ASD controls remained significantly less accurate both before and after passive intervention (p < 0.05). Murza et al. (2014) reported mixed findings. Significant improvements were found for one inference generation reading measure and one metacognitive measure, with small to large effect sizes reported. However, no significant improvements were found for social inference and reading comprehension.

Vocational Rehabilitation Outcome Interventions

Within the included studies, 10 studies reported on vocational rehabilitation interventions (see Table 3). Six studies aimed to improve interview skills (Hayes et al., 2015; Kumazaki et al., 2019; Kumazaki et al., 2017; Morgan et al., 2014; Smith et al., 2014; Strickland et al., 2013), with four studies focusing on improving employment performance and outcomes (Gentry et al., 2015; Wehman et al., 2017; Wehman et al., 2014; Wehman et al., 2020). The vocational rehabilitation intervention with the most available evidence was Project SEARCH Plus ASD Supports (Wehman et al., 2017; Wehman et al., 2014; Wehman et al., 2020). Most studies used some form of technology to deliver the intervention (i.e., web- or app-based intervention tool), with four interventions using a face-to-face group delivery component (Morgan et al., 2014; Wehman et al., 2017; Wehman et al., 2014; Wehman et al., 2020).

Three of the six studies aiming to improve interview skills reported significant improvement with large effect sizes (Morgan et al., 2014; Smith et al., 2014; Strickland et al., 2013). Hayes et al. (2015) also reported significant improvement without reporting effect size, and Kumazaki et al. (2019) and Kumazaki et al. (2017) reported mixed results.

Findings were mixed for the four studies targeting employment performance and outcomes. Gentry et al. (2015) reported that the difference in functional performance between groups was not significant. Wehman et al. (2014) reported a significant difference in the number of participants who attained employment, but no significant differences in wages or need for employment supports were found. By contrast, Wehman et al. (2017) reported significant large differences between groups for employment attainment, wages, hours worked and supports needed, and Wehman et al. (2020) reported a significant difference in employment, with wages at or above minimum wage after 1 year.

Cognitive Skills Training Interventions

Eleven of the included studies evaluated cognitive training interventions (see Table 3). Four focused on reducing comorbid symptoms (Capriola-Hall et al., 2020; Gaigg et al., 2020; Russell et al., 2013; Spek et al., 2013), three focused on improving cognitive and social functioning (Miyajima et al., 2016; Rosenblau et al., 2020; Tang et al., 2020), one on improving QoL or social emotional outcomes (Hesselmark et al., 2014; Tang et al., 2020), one on core cognitive and social cognitive outcomes with employment as a secondary outcome (Eack et al., 2018), one on adjustment to college (White et al., 2019), and one on reducing distressing thoughts (Maisel et al., 2019). Most studies reported significant improvements following cognitive training intervention. Three studies (Eack et al., 2018; Gaigg et al., 2020; Spek et al., 2013; White et al., 2019) reported significant large intervention effects. Hesselmark et al. (2014) reported significant moderate effects for improving QoL. Miyajima et al. (2016) and Rosenblau et al. (2020) found improvements in cognitive and social functioning. However, no effect sizes were reported in Miyajima et al. (2016). Maisel et al. (2019) reported a significant reduction in thought believability after intervention with medium to large effect sizes. Russell et al. (2013) reported a significant reduction in OCD symptoms with small to large effect sizes. However, cognitive training was not found to be significantly different when compared to anxiety management. Capriola-Hall et al. (2020) and Tang et al. (2020) reported mixed results.

Independent Living Skills Interventions

One study (Cox et al., 2017) used an intervention to improve independent living skills. Cox et al. (2017) used a Virtual Reality Driving Simulation Training (VRDST) program to improve driving performance. Participants were allocated to one of three types of VRDST or routine training. VRDST was found to significantly improve driving skills compared to routine training (see Table 3).

Methodological Quality Assessment

Of the 41 studies, 20 studies were rated as having ‘strong’ methodological quality and 18 studies were rated as having ‘good’ methodological quality (see Table 3 and Supplementary Table 2) using the Qualsyst critical appraisal tool (Kmet et al., 2004). The range in the rated methodological quality of the included studies was between 54% ('Adequate'; Bölte et al., 2002) and 93% ('Strong'; Garcia-Villamisar et al., 2016; Russell et al., 2013).

Of the 41 included studies, 34 described participant characteristics sufficiently and 19 studies reported the sampling strategy used clearly (see Table 3). However, two studies (Akabogu et al., 2019; Wijker et al., 2020) did not report a between-group comparison of participants’ characteristics at baseline. All studies reported random allocation of participants. However, only 18 studies reported the procedure of randomisation. Most studies were at risk of observation bias due to a lack of blinding of assessors and/or interviewers. No study reported the blinding of participants and only 11 studies reported the blinding of investigators or assessors. Blinding, however, is inherently difficult in the type of intervention studies included in this review. Only three studies conducted a power analysis to determine the required sample size (Garcia-Villamisar et al., 2016; McVey et al., 2017; Russell et al., 2013). Twenty studies calculated and reported effect size regarding the effectiveness of the intervention. The remaining studies did not provide sufficient information to assess the appropriateness of sample size or the effectiveness of the intervention.

Discussion

Interventions

This systematic review evaluated RCTs reporting on the effectiveness of non-pharmacological interventions for adults with autism. The 41 included studies were categorised into (1) social functioning and language skills interventions, (2) vocational rehabilitation outcomes, (3) cognitive training, and (4) independent living skills.

Our first main finding was that social skills interventions were the most frequently targeted interventions, followed by cognitive skills training and vocational rehabilitation interventions. This finding is similar to a previous systematic review of behavioural interventions for adaptive skill development in adults with ASD, which found social interaction skills were the most common intervention target (Palmen et al., 2012). This finding is also supported by a number of previous systematic reviews of psychosocial interventions using varying study designs for adults with autism (Bishop-Fitzpatrick et al., 2013) and of social skills interventions in young people and adults with autism (Ke et al., 2018). The finding that a relatively high number of interventions targeted social functioning is not surprising given that young adults with autism are more likely to be single, live with family members, and have ongoing social functioning difficulties compared with adults with other mental health disorders (Barneveld et al., 2014; Tobin et al., 2014).

Regarding the type of intervention designs, similar studies were grouped around themes arising from their content (e.g., vocational rehabilitation). However, many interventions, even those grouped together in terms of treatment content, differed in terms of breadth, where some targeted specific skills and others had broader outcomes. Moreover, while there were multiple interventions targeting social skills, only two interventions targeted language skills and one targeted independent living skills. This finding is of concern given that communication difficulties are a core and well documented difficulty among people with ASD (Magiati et al., 2014; Wise et al., 2019). Further research on interventions aiming to improve the language and communication skills and independent living skills of adults with ASD is critical, given that communication skills are crucial for social functioning and independent living skills are critical for quality of life and psychological wellbeing (Magiati et al., 2014; Wise et al., 2019).

Effectiveness

Two programs, PEERS for young adults and Project SEARCH Plus ASD Supports, were identified to have the most robust evidence for improving social skills and employment outcomes, respectively. The effectiveness of the PEERS program was evaluated in four RCTs (Gantman et al., 2012; Laugeson et al., 2015; McVey et al., 2016; McVey et al., 2017), all of which found significant improvement in primary outcomes and two of which reported medium to large effect sizes (Laugeson et al., 2015; McVey et al., 2016). Project SEARCH was evaluated in three RCTs (Wehman et al., 2017; Wehman et al., 2014; Wehman et al., 2020), which reported significant improvements on primary outcomes, and one of which reported a large effect size (Wehman et al., 2017). Both programs were found to use standardised outcome measures, adding to the rigour of these findings (Bolte & Diehl, 2013).

There were limitations to determining the effectiveness of existing non-pharmacological interventions for adults with ASD. Although there were some promising results from different studies in this systematic review, it is difficult to provide overall conclusions regarding the strength of treatments outside of the study context due to several factors. These factors include heterogeneity in intervention designs, outcome measures used and the methodological quality of the included studies. We were unable to conduct a meta-analysis as a result of these factors (Guolo & Varin, 2017).

Methodological Study Quality

Regarding methodological quality and study design, some of the studies included small sample sizes, and results from these studies should be interpreted with caution. In terms of generalisation, participants in the studies may be relatively homogeneous and not reflect the broader population of adults with ASD. The age ranges show that most studies included young adults, mainly in their 20s and early 30s. It should be noted that few studies involved adults with autism over the age of 60. The studies included in this review were also exclusively from economically advanced countries and mainly from the US or Europe, with the geographical exceptions of the studies conducted in Israel (Saban-Bezalel & Mashal, 2015) and Japan (Kumazaki et al., 2019; Kumazaki et al., 2017; Miyajima et al., 2016). In summary, there are clear limitations in terms of generalising the findings both at the individual study level and considering the whole sample of studies included in this review.

There are also several points for consideration regarding the outcome measures used across the included studies. As reported by Bolte and Diehl (2013) and Brugha et al. (2015), we found a large variation in the outcome measures used across studies, with few outcome measures used in more than one study. Furthermore, some outcome measures were not standardised, and several studies used a high number of outcome measures without specifying which were primary. This inconsistent use of outcome measures not only makes comparisons of studies difficult for evaluation and meta-analytic purposes (Brugha et al., 2015), but also demonstrates a lack of consensus around outcome measurement in adult autism intervention trials. Collectively, these findings suggest the need to establish guidelines for appropriate outcome measure use in autism intervention trials (Bolte & Diehl, 2013). The heterogeneity of the studies and the outcome measures used in this review made it impossible to conduct a meta-analysis, which would call for comparisons within a random-effects model framework, an analytical approach that requires a substantial number of similar studies to make warranted inferences (Guolo & Varin, 2017).
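To illustrate why a random-effects framework depends on having several comparable studies, the following is a minimal sketch of one common random-effects estimator (DerSimonian-Laird). It is not an analysis performed in this review; the function name and all input values are hypothetical, and it assumes each study reports the same kind of effect size together with its sampling variance.

```python
from math import sqrt


def dersimonian_laird(effects, variances):
    """Pool effect sizes under a random-effects model (DerSimonian-Laird).

    effects   -- one effect size per study, all on the same scale (assumed)
    variances -- the within-study sampling variance of each effect size
    Returns (pooled effect, its standard error, between-study variance tau^2).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Inverse-variance (fixed-effect) weights and the fixed-effect estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Method-of-moments estimate of between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to every study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = sqrt(1.0 / sum(w_re))
    return pooled, se, tau2


if __name__ == "__main__":
    # Hypothetical standardised mean differences and variances, for illustration only.
    pooled, se, tau2 = dersimonian_laird([0.2, 0.8, 0.5], [0.04, 0.05, 0.03])
    print(f"pooled={pooled:.3f}, se={se:.3f}, tau2={tau2:.3f}")
```

Because the between-study variance is estimated from only k - 1 degrees of freedom, the estimate is unstable when few studies share a comparable outcome measure, which is the situation described above.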

Limitations

The current study followed a rigorous review process: relevant electronic databases were searched, titles, abstracts and articles were comprehensively screened by two independent reviewers, and good inter-rater agreement was ensured for the Qualsyst methodological quality ratings. Despite good rater agreement, this systematic review is subject to limitations. No meta-analyses could be conducted due to methodological shortcomings in the included studies according to the Qualsyst ratings and the large variety in intervention designs and outcome measures. As a result, only preliminary conclusions could be made about the effectiveness of non-pharmacological interventions in adults with autism.

Conclusion

In conclusion, 41 studies of non-pharmacological interventions were identified in this review. Most studies reported on social skills training, followed by cognitive skills training and vocational rehabilitation interventions. Two interventions, PEERS and Project SEARCH, showed the most robust evidence of effectiveness.

Due to heterogeneity in intervention designs, outcome measures and methodological quality of the included studies, no meta-analysis could be performed. While emerging evidence suggests that non-pharmacological interventions could be effective, there is an urgent need to establish guidelines around the use of outcome measures in these trials and for more RCTs of interventions aiming to improve the communication and independent living skills of adults with ASD.