Children and adults on the autism spectrum report higher levels of stress compared to typically developing individuals (Bishop-Fitzpatrick et al., 2015, 2017a; Browning et al., 2009; Groden et al., 2006; Hirvikoski & Blomqvist, 2015; McGillivray & Evert, 2018). A hypothesized reciprocal relationship between the severity of autism characteristics and high levels of perceived stress has been demonstrated in previous research (Bishop-Fitzpatrick et al., 2015; Groden et al., 2006; Hirvikoski & Blomqvist, 2015; Pahnke et al., 2014; Porges et al., 2013). Additionally, autism symptoms may restrict the ability of these individuals to seek help or social support when needed (Hirvikoski & Blomqvist, 2015). This may lead to the long-term presence of high levels of stress, which has a profound negative effect on physical and mental health, as demonstrated in typically developing individuals (Mendelson, 2013; Slavich, 2016).

The most frequently reported information in stress research covers objective features related to the stress response, including changes at physiological and behavioral levels. Yet, another important aspect of stress is the level of perceived stress, which is defined as “the feelings and thoughts an individual has related to the stressfulness of their life and their ability to overcome stressful events” (Phillips, 2013, pp. 1453–1454). As these thoughts and feelings are related to factors such as personality, coping resources, and support, individuals may encounter similar negative life events but appraise their impact or severity differently (Phillips, 2013). This aspect could be referred to as the subjective information concerning stress and should rely on self-reported measures. In several studies, Bishop-Fitzpatrick et al. (2015, 2017a, 2018) have reported that in adults on the autism spectrum, with and without co-occurring intellectual disability, high levels of perceived stress were associated with poor social functioning, social outcome, and quality of life. In addition, it has been stated that interventions for adults on the autism spectrum may be less effective due to high levels of perceived stress, as these may hamper the use of learned cognitive control strategies to control behavior (Bishop-Fitzpatrick et al., 2017a). These findings point towards the clinical significance of perceived stress and its assessment in individuals on the autism spectrum. However, in contrast to research on the stress response, little research attention has been paid to this subjective component in these individuals. This underrepresentation may relate to the fact that individuals on the autism spectrum often display difficulties with reporting their own affective states (DuBois et al., 2016).
Furthermore, they often encounter difficulties with communication and use of figurative language (Happé, 1995) as well as with remembering what has happened in the past (Crane et al., 2013). This may lead to problems with comprehension of the questions in the self-report measure. In addition, symptoms of stress in individuals on the autism spectrum may have been coupled to other concepts such as quality of life, mood symptoms, and problems with emotion regulation (Bishop-Fitzpatrick et al., 2017a). Accordingly, self-report measures have often been perceived as inaccurate and unreliable in individuals on the autism spectrum (Baron et al., 2006), leading to a scarcity of information regarding, for instance, perceived stress. However, it has been posited that individuals on the autism spectrum may show a different way of processing their emotions rather than an absence of this processing (Berthoz & Hill, 2005). Furthermore, as is discussed by Keith et al. (2019), the absence of self-report measures in individuals on the autism spectrum is problematic, given that this misses the individual’s perspective on his or her symptoms. In contrast, informant reports rely solely on observable behaviors and spontaneous sharing of emotions and internal states in order to measure internally experienced symptoms (Keith et al., 2019). Fortunately, in recent years, the field is evolving, acknowledging the increased need for reliable self-report measures in individuals on the autism spectrum. Despite the awareness of the difficulties that may be encountered when using self-reports in individuals on the autism spectrum, valid and reliable self-report measures have been found with regard to depression (Cassidy et al., 2018a), suicidality (Cassidy et al., 2018b), emotion regulation (Berthoz & Hill, 2005), and anxiety and sensory problems (Keith et al., 2019). Therefore, a similar finding is expected for self-report measures on stress in this population. 
This systematic review addressed two research questions: (1) Which self-report measures have been used to report stress in populations on the autism spectrum? (2) Is information regarding the psychometric properties of these tools available for individuals on the autism spectrum? It is important to note that some studies may not use the specific term perceived stress but instead may only refer to the measurement of self-reported stress. Therefore, the extent to which the included measures correspond to the definition of perceived stress as described by Phillips (2013) will be discussed as well.

Methods

This systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (see Electronic Supplementary Material (ESM) 1). Analysis of methodological quality was performed using aspects of the COSMIN Risk of Bias checklist and, additionally, the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system to determine the level of quality (Mokkink et al., 2018; Prinsen et al., 2018; Terwee et al., 2018). The COSMIN Risk of Bias checklist was developed to assess the methodological quality of single studies in a systematic review of patient-reported outcome measures by screening for risk of bias, that is, whether the results of the studies are trustworthy. The checklist contains nine boxes with standards for design requirements and preferred statistical methods of studies on measurement properties. Each standard is rated on a 4-point scale (very good, adequate, doubtful, or inadequate). The overall quality rating of a box is determined by taking the lowest rating of any of the standards in that box. Next, the result for each measurement property is rated against the criteria for good measurement properties, resulting in a rating of sufficient, insufficient, or indeterminate. The final step is grading the quality of the evidence for each measurement property according to the GRADE system, which uses the factors “risk of bias,” “inconsistency,” “indirectness,” and “imprecision” to determine the quality of evidence. High quality of evidence reflects the confidence of the authors that the true measurement property lies close to that of the pooled or summarized result, moderate quality of evidence reflects moderate confidence, low quality of evidence reflects limited confidence, and very low quality reflects very little confidence as the true measurement property is likely to be substantially different from the summarized or pooled result.
When studies did not have the intention of assessing psychometric properties of the questionnaire(s), the reviewers extracted preliminary data in order to be rated by the COSMIN checklist.
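The two rating rules described above (lowest rating per box, then grading of the evidence) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the number of levels subtracted per GRADE factor is an assumption supplied by the caller, not something specified in this review.

```python
# Rating scales ordered from best to worst, as listed in the text.
STANDARD_RATINGS = ["very good", "adequate", "doubtful", "inadequate"]
EVIDENCE_LEVELS = ["high", "moderate", "low", "very low"]


def box_rating(standard_ratings):
    """Overall rating of one checklist box: the lowest (worst) rating wins."""
    if not standard_ratings:
        raise ValueError("at least one rated standard is required")
    return max(standard_ratings, key=STANDARD_RATINGS.index)


def grade_quality(downgrades):
    """Start at 'high' and move down one level per downgrade step.

    `downgrades` maps a GRADE factor (e.g. "risk of bias") to the number of
    levels to subtract. These step sizes are assumptions for illustration.
    """
    steps = sum(downgrades.values())
    return EVIDENCE_LEVELS[min(steps, len(EVIDENCE_LEVELS) - 1)]


print(box_rating(["very good", "doubtful", "adequate"]))      # doubtful
print(grade_quality({"risk of bias": 1}))                     # moderate
print(grade_quality({"risk of bias": 2, "imprecision": 1}))   # very low
```

Under these assumed step sizes, a single one-level downgrade for risk of bias yields "moderate", which matches the way the quality ratings are justified in the Results.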

Information Sources and Search Strategy

The search strategy, based on the Population, Intervention, Comparison, and Outcome (PICO) method, was entered in PubMed, EMBASE, Web of Science, and Cochrane Library in November 2019 and was last updated in June 2021. A combination of free text words, controlled terminology (e.g., MeSH terms), and linguistic variations was used based on the concepts of “stress” and “Autism Spectrum Disorder” (see ESM 2). No filters were applied. After the database screening, a hand search of the reference lists of the included articles was performed as well.

Eligibility Criteria and Screening Procedure

In order to be included, studies needed to fulfill the following eligibility criteria: (1) a study population older than 6 years as younger individuals may struggle to complete the self-report format, (2) a diagnosis of autism spectrum disorder according to the Diagnostic and Statistical Manual of Mental Disorders (American Psychiatric Association, 1980, 1987, 1994, 2000, 2013) or International Statistical Classification of Diseases and Related Health Problems (World Health Organization, 2016, 2019) or otherwise confirmed by standardized diagnostic tests and/or by clinical interview, (3) use of self-reported questionnaires as a stress assessment tool, and (4) peer-reviewed research written in English or Dutch. Reviews, meta-analyses, qualitative designs, case studies/case series, editorials, conference papers, books and book chapters, trial registrations, unpublished manuscripts, letters to the editor, abstracts only, and expert opinions were excluded.

The selection process consisted of two phases. In the first phase, one researcher screened all articles by title and abstract while a second independent researcher screened 20% of the articles (level of agreement based on ICC was 0.96). Articles advanced to the second screening phase if they met the abovementioned eligibility criteria or if eligibility could not yet be determined. In the second phase, two independent researchers screened all full texts following the same eligibility criteria (level of agreement based on ICC was 0.87). During a consensus meeting, all doubts or disagreements were resolved. Finally, the reference lists of the included studies were screened and additional articles were included if eligible.

Data Extraction and Questionnaire Evaluation

Two independent researchers performed data extraction for all full texts based on the following variables: population characteristics of individuals on the autism spectrum (diagnosis, age, gender, and exclusion criteria), the employed questionnaire, the reporting interval, and information concerning the content and construct of the questionnaire. When available, information regarding psychometric properties of the questionnaires in populations on the autism spectrum was extracted from the studies and rated according to the COSMIN guidelines (Tables 1, 2, 3, 4, and 5). The following information concerning psychometric properties was gathered: internal consistency, reliability and validity measures, and the presence of data from a comparison group or questionnaire assessing a similar construct within the same study. The latter item was included to provide preliminary evidence of construct validity, confirming the hypothesis that higher scores were reported in individuals on the autism spectrum in comparison with control groups (discriminant validity) or that a positive correlation was present with questionnaire(s) assessing a similar construct (convergent validity). Finally, for interventional studies, information on responsiveness to change of the questionnaires was extracted. Any disagreements or doubts were resolved during a consensus meeting (needed for 5% of the data). Information concerning psychometric properties determined in other populations falls outside the scope of this review but has been described in previous reports for most of the included questionnaires (Antony et al., 1998; De Bruin et al., 2018; Goodwin et al., 2007; Groden et al., 2001; Lee, 2012; Osika et al., 2007; Stallknecht et al., 2017).

Table 1 Characteristics of the individual studies
Table 2 Overview of the internal consistency of included questionnaires
Table 3 COSMIN Risk of Bias checklist
Table 4 Rating against “good measurement properties”
Table 5 GRADE quality of evidence

For clarity, a categorization in general “trait-like” stress measures and moment-specific “state-like” stress measures was used. A trait is thereby considered as part of an individual’s personality, thus a long-term characteristic, whereas a state is influenced by external events, thus temporary (The Oxford Review Encyclopaedia of Terms, 2019). The general measures were further divided into (1) questionnaires solely including stress-specific questions and (2) combined questionnaires including other psychological symptoms.

Results

Study Selection and Population Characteristics

The search strategy in the four different databases resulted in 10,799 articles after deduplication. After two screening phases, 29 articles were retained and two additional articles were included after reference screening, resulting in 31 included articles (for the selection flow chart, see Fig. 1).

Fig. 1 Flow chart of the selection process

In total, 28 different study samples were identified, as some study samples were independently reported twice in different articles (Table 1). In total, 2350 individuals on the autism spectrum were included, of whom 1353 were male. Three studies reported gender identities other than male and female. The gender distribution represented a male preponderance in most study samples (in 23 out of 28) as typically found in populations on the autism spectrum (Giarelli et al., 2010), except for five studies reporting more females than males or an equal distribution between genders in their study sample. The ages ranged between 6 and 71 years, with only six studies using self-reports of stress in children and/or adolescents. The most frequently used exclusion criteria were based on intellectual ability (intelligence quotient (IQ) ≤ 70, 80, or 85) and the presence of co-occurring psychiatric disorders or problems such as current psychotic disorders, suicide risk, and substance abuse. The studies defined the presence of an autism diagnosis using different terminologies including (high-functioning) autism spectrum disorder, Asperger’s syndrome, or pervasive developmental disorder not otherwise specified (PDD-NOS). More detailed information can be found in Table 1.

A total of eight different questionnaires were used to assess self-reported stress in individuals on the autism spectrum. More detailed information concerning the psychometric properties of the questionnaires is available in Table 2 and Table 3.

General “Trait-Like” Stress Questionnaires

Stress-Specific Questionnaires

This category is characterized by questionnaires focusing on the frequency of behavior, feelings, and/or somatic problems related to stress or on the intensity of the stress reaction. Five questionnaires fulfilled this description: (1) the Adjusted Stress Survey Schedule (SSS; Groden et al., 2001), (2) the Perceived Stress Scale (PSS; Cohen & Williamson, 1988; Cohen et al., 1983), (3) the Stress in Children (SiC) questionnaire (Osika et al., 2007), (4) the Chronic Stress Questionnaire for Children and Adolescents (CSQ-CA; De Bruin et al., 2018), and (5) a self-developed questionnaire (Hillier et al., 2016).

The SSS was the only questionnaire included in this review that was specifically developed for individuals on the autism spectrum (Groden et al., 2001). Although its original version constituted an informant-reported measure, a modified self-report version of the SSS for adolescents and adults on the autism spectrum was recently developed and adopted in three studies (Bishop-Fitzpatrick et al., 2017a; McGillivray & Evert, 2018; Pahnke et al., 2014). The respondents are asked to rate the intensity of the stress reaction in common daily activities, divided into eight categories: (1) Changes and Threats, (2) Anticipation/Uncertainty, (3) Unpleasant Events, (4) Pleasant Events (such as presents or birthday parties), (5) Sensory/Personal Contact, (6) Food-Related Activity, (7) Social/Environment Interactions, and (8) Ritual-Related Stress (Groden et al., 2001). Pahnke et al. (2014) used both the original informant-reported and the self-report versions but found no significant correlations between the total scores (p > 0.10).

Reliability

Cronbach’s alpha was reported in only two studies (Bishop-Fitzpatrick et al., 2017a; McGillivray & Evert, 2018), indicating excellent internal consistency (α = 0.96–0.97) of the questionnaire’s total score. The quality of evidence, rated according to the COSMIN guidelines, was moderate due to the presence of a serious risk of bias. The internal consistency of the subscales, ranging from 0.58 to 0.89, reported in the study of McGillivray and Evert (2018) was similar to the internal consistency values of the original version by Groden et al. (2001).

Construct Validity

Low quality of evidence was found for construct validity as only one small study compared the scores on the SSS of adults on the autism spectrum to those of typical peers. They found statistically significant differences, indicating a higher stress intensity for adults on the autism spectrum (Bishop-Fitzpatrick et al., 2017a).

Responsiveness

One study demonstrated lower scores on the SSS after acceptance and commitment therapy in adolescents on the autism spectrum (Pahnke et al., 2014), based on very low quality of evidence due to the small sample size.

The PSS was developed as a self-report questionnaire and designed to measure “the degree to which individuals appraise situations in their lives as stressful” (Cohen et al., 1983). The items focus on stress-related behaviors and feelings. Three versions of the PSS exist, with the original version containing 14 items (PSS-14), followed by the development of two shorter versions that contain 10 (PSS-10) and 4 (PSS-4) items (Cohen & Williamson, 1988). In two studies, the original PSS-14 was used in adults on the autism spectrum, but information on psychometric properties was not reported (Hirvikoski & Blomqvist, 2015; Pahnke et al., 2019). The PSS-10 was used in three studies using two different adult populations on the autism spectrum (Bishop-Fitzpatrick et al., 2017a, 2018; Wijker et al., 2020). Two studies with the same study sample measured the degree of perceived stress during the last month (Bishop-Fitzpatrick et al., 2017a, 2018) while the third study did not specify the reporting interval (Wijker et al., 2020). Lastly, the PSS-4 was reported in two studies using the same adult population on the autism spectrum (Bishop-Fitzpatrick et al., 2017b; Hong et al., 2016). Both studies specified the reporting interval as “during the last month.”

Reliability

Good internal consistency was reported for the PSS-10 in two studies with a Cronbach’s alpha of 0.87 (Bishop-Fitzpatrick et al., 2017a, 2018) whereas the PSS-4 demonstrated an acceptable internal consistency (α = 0.76) (Bishop-Fitzpatrick et al., 2017b; Hong et al., 2016). All results are based on very low quality of evidence due to a very serious risk of bias and small sample sizes.

Construct Validity

For the PSS-10 and the PSS-14, only one study each compared the scores of adults on the autism spectrum to those of typical peers, and both found significantly higher perceived stress in individuals on the autism spectrum (Bishop-Fitzpatrick et al., 2017a; Hirvikoski & Blomqvist, 2015). These results are based on low quality of evidence due to the small sample sizes of the studies.

Responsiveness

Lower levels of perceived stress were reported after acceptance and commitment therapy based on the PSS-14 (Pahnke et al., 2019) and after dog-assisted therapy based on the PSS-10 (Wijker et al., 2020), in both adults on the autism spectrum. These results are based on very low and low quality of evidence, respectively, due to a serious risk of bias and the small sample sizes of both studies.

The SiC Questionnaire was developed as a self-report questionnaire by Osika et al. (2007) to assess the degree of perceived distress in children. In addition, the presence of symptoms of lower well-being and important aspects of coping and social support are examined as well. Children need to rate the frequency of 21 physical and emotional symptoms of stress on a 4-point Likert scale. Two studies have used this questionnaire in children and adolescents on the autism spectrum (7–17 years; Choque et al., 2017; Jonsson et al., 2019). The developers of this questionnaire have advised the use of cutoff criteria to categorize the child’s stress level as follows: “No stress” (< 2), “Medium stress” (2–2.5), and “High stress” (≥ 2.5) (Stallknecht et al., 2017). However, these were not applied in the two studies included in this review (Choque et al., 2017; Jonsson et al., 2019).
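As an illustration of how the cutoff criteria quoted above could be applied, the following minimal sketch maps a score to a stress category. The function name is hypothetical, and two points are assumptions not stated in the text: the input is taken to be the mean item score, and a score of exactly 2.5 is assigned to “High stress” because the quoted “Medium” and “High” ranges both include that value.

```python
def sic_stress_category(mean_score):
    """Map a SiC score to the cutoff categories quoted in the text.

    Assumption: `mean_score` is the mean item score of the questionnaire.
    Assumption: exactly 2.5 counts as "High stress" (the quoted ranges overlap).
    """
    if mean_score < 2:
        return "No stress"
    if mean_score < 2.5:
        return "Medium stress"
    return "High stress"


for score in (1.8, 2.0, 2.4, 2.5, 3.1):
    print(score, sic_stress_category(score))
```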

Responsiveness

Both studies used the SiC Questionnaire to assess the effectiveness of social skills group training but found no significant differences related to the intervention (Choque et al., 2017; Jonsson et al., 2019). This finding is based on high quality of evidence.

The CSQ-CA was specifically developed for children and adolescents to assess chronic levels of stress (De Bruin et al., 2018). Therefore, respondents need to rate the relevance of 19 described feelings and behaviors, using a 4-point scale, according to their relevance during the past 3 months. One study used this questionnaire in children and adolescents on the autism spectrum (Ridderinkhof et al., 2018).

Reliability

Internal consistency was good, as rated by Cronbach’s alpha (α = 0.86) but based on very low quality of evidence due to a very serious risk of bias and the small sample size of the study.

Responsiveness

This questionnaire was used to determine the short- and long-term effects of a mindfulness-based program (Ridderinkhof et al., 2018). A significant reduction of stress was only present at 2-month follow-up but not at posttest and 1-year follow-up. These results were based on very low quality of evidence caused by a serious risk of bias and the small sample size of the study.

One study used a self-developed questionnaire consisting of a Likert scale to measure the degree of perceived stress on an average day in adolescents and young adults on the autism spectrum (Hillier et al., 2016).

Responsiveness

This study used the questionnaire to determine the effect of a technology-based music program and reported a decrease of stress in 63% of their study population. However, no statistical information was reported and the quality of evidence is rated as very low due to a very serious risk of bias and the small sample size of the study.

Combined Questionnaires

This category contains only one questionnaire, the Depression Anxiety Stress Scale (DASS; Lovibond & Lovibond, 1995a). The DASS contains items reflecting on symptoms of depression, stress, and anxiety. Respondents are asked to rate the frequency of these symptoms during the past week. As this review concerns self-reports on stress, only a description of the findings related to the stress subscale will be provided. Three studies used the original 42-item version in adolescent and adult populations on the autism spectrum (Adams et al., 2021; McGillivray & Evert, 2014, 2018).

Reliability

One study reported excellent internal consistency for the stress subscale score (α = 0.92–0.97; Adams et al., 2021). Furthermore, good test–retest reliability (r = 0.73–0.77) was reported in the study of Adams et al. (2021) with an interval of 10 weeks. Both results are based on low quality of evidence due to a very serious risk of bias.

Construct Validity

One study reported higher mean scores on all subscales in young adults on the autism spectrum (McGillivray & Evert, 2018) compared to the normative data from the DASS manual (Lovibond & Lovibond, 1995b), based on high quality of evidence.

Responsiveness

One study reported lower scores on the stress subscale for adolescents and young adults following group-based cognitive behavioral therapy (McGillivray & Evert, 2014). This finding is based on very low quality of evidence due to a serious risk of bias and the small sample size of the study.

Thirteen studies used the short 21-item version in adolescents and adults on the autism spectrum, of which six specified the reporting interval as “during the past week,” following the manual’s instructions (Beck et al., 2020; Bemmer et al., 2021; Bernardin et al., 2021; Cage et al., 2018; Maddox & White, 2015; Zimmerman et al., 2017). One study used the DASS-21 to measure current levels of symptoms (Jackson et al., 2018), whereas the remaining six studies did not specify the reporting interval in which the symptoms had to be present (Demetriou et al., 2021; George & Stokes, 2018; Maisel et al., 2019; Nah et al., 2018; Park et al., 2019, 2020).

Reliability

Measures of internal consistency were reported in five studies, with a Cronbach’s alpha of 0.84 to 0.89 for the stress subscale score (Cage et al., 2018; George & Stokes, 2018; Maddox & White, 2015; Nah et al., 2018; Park et al., 2020). In addition, satisfactory item-total correlations (r = 0.40–0.77) and item-scale correlations (r = 0.32–0.82) were demonstrated in the study of Park et al. (2020). All these findings are based on moderate quality of evidence due to a serious risk of bias.

Construct Validity

Preliminary evidence for construct validity, based on hypothesis testing, was demonstrated in several studies, with the quality of evidence ranging from moderate (due to inconsistent results) to high. These studies reported higher subscale and/or total scores in individuals on the autism spectrum as compared to typical peers or norm values (Cage et al., 2018; Demetriou et al., 2021; George & Stokes, 2018; Maddox & White, 2015; Maisel et al., 2019; Nah et al., 2018) or young adults with psychosis (Park et al., 2019). In addition, a gender interaction effect was found in the study of Bernardin et al. (2021), where only men on the autism spectrum scored higher on the stress subscale whereas this difference was not found in women. Low quality of evidence was present for the comparison with clinical groups with anxiety and depression, as lower scores were reported in individuals on the autism spectrum in the study of Nah et al. (2018), but no significant differences were found in the study of Park et al. (2019). In addition, in the latter study, no significant differences were found between young adults on the autism spectrum and young adults with bipolar disorder, based on high quality of evidence. When compared to individuals with social anxiety disorder, no significant differences could be demonstrated, based on moderate quality of evidence due to the small sample size (Demetriou et al., 2021; Maddox & White, 2015). However, individuals on the autism spectrum with co-occurring social anxiety disorder did score significantly higher than individuals on the autism spectrum without co-occurring social anxiety disorder, based on low quality of evidence due to the small sample size of the study (Maddox & White, 2015).

Convergent Validity

Preliminary high-quality evidence of convergent validity was demonstrated in the study of Nah et al. (2018) based on a moderate correlation between the stress subscale of the DASS-21 and the Mini-Social Phobia Inventory (r = 0.42; p < 0.01). In addition, Park et al. (2020) conducted the first study to validate the use of the DASS-21 in an adult population on the autism spectrum without intellectual disability. They demonstrated adequate convergent validity with high quality of evidence based on moderate correlations between the stress subscale and the Hamilton Rating Scale for Depression (r = 0.56) as well as the Liebowitz Social Anxiety Scale Self-Report (r = 0.57; p < 0.001).

Factorial Validity

The 3-factorial structure was confirmed by Park et al. (2020) in adults on the autism spectrum using confirmatory factor analysis and is based on moderate quality of evidence due to a serious risk of bias.

Responsiveness

One study reported lower scores on the stress subscale in adolescents and adults on the autism spectrum after modified cognitive behavioral therapy for social anxiety and social functioning (Bemmer et al., 2021). This finding is based on low quality of evidence due to a serious risk of bias and the small sample size of the study.

Moment-Specific “State-Like” Questionnaires

Two questionnaires were included in this category to assess moment-specific stress: the Subjective Units of Distress Survey (SUDS; Barrios & Hartmann, 1988) and a momentary stress questionnaire based on event sampling method (ESM).

The SUDS is a questionnaire that measures self-reported perceived stress towards an anxiety-provoking situation or a stressful situation, which was used in one study with children on the autism spectrum (Lopata et al., 2008). Two questions were provided with a Visual Analogue Scale ranging from 0 to 100 with 0 referring to “no stress at all” or “not feeling good at all” and 100 referring to “the most stress you have ever felt” or “the best I have ever felt.” After scoring one question in reverse order, the scores were averaged to create the SUDS composite score.
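The composite scoring described above can be sketched as follows. This is a minimal illustration rather than the scoring procedure of the original study: the function name is hypothetical, and it assumes that the “feeling good” item is the one that is reverse-scored, so that a higher composite indicates more distress.

```python
def suds_composite(stress_rating, feeling_good_rating):
    """Average two 0-100 visual analogue ratings after reversing one of them.

    Assumption: the "feeling good" item is the reverse-scored item, so that
    a higher composite score means more distress.
    """
    for value in (stress_rating, feeling_good_rating):
        if not 0 <= value <= 100:
            raise ValueError("ratings must lie between 0 and 100")
    reversed_feeling_good = 100 - feeling_good_rating
    return (stress_rating + reversed_feeling_good) / 2


print(suds_composite(80, 30))  # 75.0
```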

Reliability

Good to excellent internal consistency based on the SUDS composite score was reported with Cronbach’s alpha ranging between 0.85 and 0.92, as the questionnaire was used in two study conditions. This result is based on very low quality of evidence due to an extremely serious risk of bias and the small sample size of the study. The items were negatively correlated (r = −0.74 to −0.85), confirming the measurement of the same construct through opposite scaling.

Criterion Validity

A mild to moderate relationship was found between a physiological measure (cortisol) and the total score on the self-report, based on low quality of evidence due to the small sample size of the study.

ESM is a self-reporting technique, which assesses affect, stress, and contextual correlates in everyday life. Respondents are asked to fill out a short questionnaire at random times during the day. This technique was used in two studies with the same adult population on the autism spectrum (van der Linden et al., 2020; van Oosterhout et al., 2021). The same momentary stress questionnaire with a 7-point rating system was used in which the total score contained the summation of three different stress measures based on activity-related stress, event-related stress, and social stress.

Reliability

Activity-related stress contained three questions and demonstrated a Cronbach’s alpha of 0.72 (van Oosterhout et al., 2021), based on very low quality of evidence due to a very serious risk of bias and the small sample size of the study. Information regarding the psychometric properties of the entire questionnaire was not reported.

Discussion

The purpose of this systematic review was to provide an exhaustive overview of the used self-report measures regarding stress in individuals on the autism spectrum in addition to a description of the psychometric properties, when available. In total, eight different questionnaires were used in 28 different study populations of individuals on the autism spectrum to measure self-reported stress. Based on the results presented above, the use of any of these questionnaires cannot be recommended since evidence on psychometric properties is currently too scarce. These results are an important call to action for the research community for whom multiple implications for future research are addressed below.

Age Ranges Covered per Questionnaire

Adults and adolescents on the autism spectrum were included in studies using the DASS and the adjusted SSS to assess the level of stress (Adams et al., 2021; Beck et al., 2020; Bemmer et al., 2021; Bernardin et al., 2021; Bishop-Fitzpatrick et al., 2017a; Cage et al., 2018; Demetriou et al., 2021; George & Stokes, 2018; Jackson et al., 2018; Maddox & White, 2015; Maisel et al., 2019; McGillivray & Evert, 2014, 2018; Nah et al., 2018; Pahnke et al., 2014; Park et al., 2019, 2020; Zimmerman et al., 2017). In addition, the PSS was only administered in adults on the autism spectrum (Bishop-Fitzpatrick et al., 2017a, 2017b, 2018; Hirvikoski & Blomqvist, 2015; Hong et al., 2016; Pahnke et al., 2019; Wijker et al., 2020) despite the presence of modified versions of the PSS, including one for adolescents (van der Ploeg, 2013). The momentary stress questionnaire using ESM was used in only one adult study population on the autism spectrum (van der Linden et al., 2020; van Oosterhout et al., 2021). Up until now, no studies have used the SSS, DASS, PSS, or the momentary stress questionnaire using ESM in children on the autism spectrum; thus, information regarding feasibility and other psychometric properties of these questionnaires in this young population is lacking. However, the presence of stress in children and adolescents on the autism spectrum was examined in other studies using child-adapted questionnaires, such as the SiC Questionnaire and the CSQ-CA. Lastly, the SUDS (Lopata et al., 2008) and the self-developed questionnaire by Hillier et al. (2016) were each used in only one study, with children and with adolescents and young adults on the autism spectrum, respectively.

Evidence on the Importance of Self-Reports

Evidence concerning the unique contribution of self-reports on internalizing states in individuals on the autism spectrum has been mentioned in previous research (Berthoz & Hill, 2005; Keith et al., 2019; Rieffe et al., 2011) and has been supported by the studies included in this review. First, the feasibility of the reported questionnaires in various study populations on the autism spectrum was confirmed. Second, the absence of significant correlations between self-reports of adolescents on the autism spectrum and informant reports was demonstrated in the study of Pahnke et al. (2014), using the modified version of the SSS. Therefore, it is hypothesized that the content of subjective stress reports differs from the content gathered by informant reports (teachers) due to the adolescents’ difficulties with communicating stress to their teachers or, alternatively, their difficulties with interpreting their own emotional status. These findings are in line with other studies indicating poor correlations between self-reports and informant reports of people with psychiatric symptoms, including autism spectrum disorder (Keith et al., 2019; Miller et al., 2014). In addition, it is more sensible to ask individuals themselves about their internalizing states since the experience of emotions and the presence of internalizing symptoms are internal processes to which only they have direct access (Barrett et al., 2007; Lambie & Marcel, 2002). Although unique information can be provided by self-reports of individuals on the autism spectrum, informant reports are more commonly used to gain insight into internalizing states of individuals on the autism spectrum (Keith et al., 2019). Thus, greater awareness of the value of self-report tools regarding stress in individuals on the autism spectrum is needed.

Evidence on Reliability

The results of this systematic review revealed that, although the psychometric properties of some of the included questionnaires have been assessed thoroughly in various populations, this is not the case for populations on the autism spectrum. Some studies reported values of internal consistency as a preliminary indication of reliability properties (Henson, 2001). These results implied a good to excellent internal consistency of the SSS, DASS, CSQ-CA, and SUDS, based on the total and/or subscale scores (see Table 2). However, caution must be taken with the interpretation of these results as most of them were rated as doubtful and two results as inadequate according to the COSMIN Risk of Bias checklist. This resulted in low to very low quality of evidence for most of the questionnaires. Moderate quality of evidence was reported for the internal consistency of the adjusted SSS and the DASS-21. In addition, the numerous reports on deficiencies of using Cronbach’s alpha should be mentioned. Over the last few years, this measure has been regarded as inappropriate to measure internal consistency since it can vary according to different factors and can be biased in different directions (Dunn et al., 2014). Furthermore, its assumptions are rigid and almost never met (Dunn et al., 2014; McNeish, 2018). For instance, unidimensionality of the scale is one of those assumptions. In this review, only two studies included a measurement of unidimensionality based on item-total and item-scale correlations for the DASS-21 (Park et al., 2020) and inter-item correlations for the SUDS (Lopata et al., 2008), which resulted in only satisfactory and strong correlations, respectively. In order to be perceived as a unidimensional measure, those correlations should be perfect (Dunn et al., 2014). Additionally, the DASS-21 does not claim to be a unidimensional scale (Lovibond & Lovibond, 1995a). Thus, using Cronbach’s alpha may not result in an appropriate measure for internal consistency. 
Numerous alternatives have been put forward, such as coefficient omega. Compared with Cronbach’s alpha, coefficient omega carries less risk of overestimating or underestimating reliability and rests on more realistic assumptions. Further alternatives built on the same concept as Cronbach’s alpha have been reported elsewhere (Dunn et al., 2014; McNeish, 2018).
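
To make the contrast between the two coefficients concrete, the following is a minimal sketch of how each is commonly computed. It is an illustration only and was not part of the reviewed studies: the function names, the item scores, and the factor loadings are hypothetical, and the omega formula shown assumes a one-factor model with uncorrelated errors whose loadings would in practice come from a fitted factor analysis.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha from raw scores.

    scores: list of respondents, each a list of k item scores.
    Uses sample variances of each item and of the total score.
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def omega_total(loadings, error_variances):
    """Coefficient omega for a single-factor model.

    loadings: factor loading of each item (assumed to come from a
    fitted one-factor model; values below are made up).
    error_variances: unique (residual) variance of each item.
    """
    common = sum(loadings) ** 2
    return common / (common + sum(error_variances))


# Hypothetical data: 5 respondents answering 4 items.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [0, 1, 1, 0],
    [2, 3, 2, 3],
]
print("alpha:", round(cronbach_alpha(responses), 3))

# Hypothetical standardized loadings and matching error variances.
print("omega:", round(omega_total([0.8, 0.7, 0.6, 0.5],
                                  [0.36, 0.51, 0.64, 0.75]), 3))
```

Alpha is computed from item and total-score variances alone and treats all items as equally related to the construct, whereas omega weights each item by its own loading, which is why it requires a factor model to be fitted first.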

Test–retest reliability was only demonstrated for the DASS-42 in an adult population on the spectrum without intellectual disability (Adams et al., 2021). However, this result was based on low quality of evidence. Finally, no reports on internal consistency or other reliability measures were found for the PSS-14, SiC Questionnaire, and the entire ESM momentary stress questionnaire in individuals on the autism spectrum.

Evidence on Validity

Only one study assessed the validity of the DASS-21 in an adult population on the autism spectrum without intellectual disability (Park et al., 2020). None of the other studies intended to assess the psychometric properties of the relevant questionnaires in their study population. However, based on the definitions in the COSMIN taxonomy, preliminary low to high quality of evidence for construct validity, more specifically defined as hypothesis testing (discriminative and/or convergent validity), was available for some questionnaires in this review. To this end, the hypothesis that individuals on the autism spectrum would report higher perceived stress than other populations was used (discriminative validity). For adults on the autism spectrum, higher total scores on the SSS and PSS-14 were reported in comparison with typical peers in only one study for each questionnaire (Bishop-Fitzpatrick et al., 2017a; Hirvikoski & Blomqvist, 2015). Six studies using the DASS-21 and one study using the DASS-42 reported higher total and/or subscale scores for adolescents and adults on the autism spectrum as compared to typical peers or norm values (Cage et al., 2018; Demetriou et al., 2021; George & Stokes, 2018; Maddox & White, 2015; Maisel et al., 2019; McGillivray & Evert, 2018; Nah et al., 2018). The discriminative capacity of the DASS-21 was insufficient when comparing individuals on the autism spectrum and individuals with other psychiatric symptoms. This is not surprising given the high rate of co-occurring psychiatric problems in individuals on the autism spectrum (Mannion & Leader, 2013; Matson & Goldin, 2013). Indeed, the presence of any psychiatric disorder might lead to equal or similar amounts of perceived stress but with different levels of impact on daily functioning, which might not be distinguished by using the DASS-21.
Thus, the latter might have sufficient construct validity for identifying individuals from clinical groups versus individuals in the general population but might be insufficient for the discrimination between different clinical groups, especially when clinical groups with high prevalence of co-occurring disorders such as autism spectrum disorder are included. However, this is in contrast with the findings of Antony et al. (1998), who reported differences in scores between several clinical groups and between clinical and nonclinical groups, providing evidence for discriminant validity of both DASS versions. The recent validation study of Park et al. (2020) provided moderate to high quality of evidence, respectively, for the DASS-21 in adults on the autism spectrum with regard to construct validity as based on convergent validity and for structural validity as based on factorial validity. Additional research is needed to support the preliminary low to high quality of evidence for construct validity of the SSS, DASS-42, DASS-21, PSS-14, and PSS-10, next to defining validity properties of the other questionnaires included in this review. For instance, the SiC Questionnaire gathers information about different constructs in one questionnaire, which has been considered as being a part of a higher-order dimension of subjective health, such as stress (Osika et al., 2007). This may influence the construct validity of the SiC Questionnaire, but, up until now, no evidence regarding this psychometric property is available in children on the autism spectrum. Furthermore, no information was available on the scaling of the self-developed questionnaire of Hillier et al. (2016), which made it difficult to compare its construct with the other questionnaires included in this review.

Preliminary low quality of evidence for criterion validity was reported for the SUDS as its scores correlated with a physiological gold standard for stress measurements (cortisol), albeit with a large variation across the results (Klimes-Dougan et al., 2001; Selye, 1950). The authors hypothesized that, based on their results, the self-ratings on the SUDS from children on the autism spectrum might be valid when reporting moderate or greater distress but might be invalid when lower levels of distress are reported (Lopata et al., 2008). Although these results are preliminary, further research might enhance the level of evidence and confirm this hypothesis. However, the ongoing discussion on whether physiological and self-reported measures of stress correlate in individuals on the autism spectrum should be taken into consideration (Romanczyk & Gillis, 2006). Self-reports on stress might uncover unique information concerning this topic, which cannot be provided or confirmed by physiological data. This could be an alternative explanation for the large variation found in the study of Lopata et al. (2008).

Evidence on Responsiveness

Several studies included a self-report measure on stress to report change after an intervention such as acceptance and commitment therapy, cognitive behavioral therapy, dog-assisted therapy, social skills training, mindfulness, or a technology-based music program (Table 1). The quality of evidence ranged between very low and low for all questionnaires but one, the SiC Questionnaire, for which high quality of evidence was determined. Small sample sizes were the main cause of the very low and low quality of evidence for responsiveness.

Differences on Item Level

A comparison of the questionnaires on item level pointed towards differences between the contents of the questionnaires. Only two questionnaires in this review (PSS and SiC Questionnaire) fully covered the concept of perceived stress, according to the definition of Phillips (2013), including items concerning symptoms of stress and the ability to cope with them. The DASS and CSQ-CA also included the description of stress-related symptoms but no items on coping abilities. Finally, using the SSS, SUDS, the momentary stress questionnaire using ESM, and the self-developed questionnaire of Hillier et al. (2016), respondents are asked to rate the intensity of their stress reaction in contrast to rating the frequency of stress-related symptoms as in the previously mentioned questionnaires. In addition, the SSS consists of very concrete descriptions of situations known to be stress provoking in individuals on the autism spectrum, whereas the other questionnaires in this review were not developed for individuals on the autism spectrum specifically. Therefore, it is important for researchers and clinicians to take into account which concept they aim to measure with self-reported questionnaires concerning stress as not all questionnaires cover the same aspects. This could result in different outcomes, such as a different referral decision, when different questionnaires are used as a screening measure for the same individual.

Differences in Reporting Interval

It is important to note the differences in the reporting interval across questionnaires and between the studies, which complicate the comparison of the results. Some studies implemented rather broad reporting intervals to examine symptoms, such as during the past month or past week, using the DASS or PSS (Beck et al., 2020; Bemmer et al., 2021; Bernardin et al., 2021; Bishop-Fitzpatrick et al., 2017a, 2017b, 2018; Cage et al., 2018; Hong et al., 2016; Maddox & White, 2015; McGillivray & Evert, 2014, 2018; Zimmerman et al., 2017). Other studies even included the entire life span to gather information concerning stress with the PSS in individuals on the autism spectrum (Hirvikoski & Blomqvist, 2015; Pahnke et al., 2019). These reporting intervals might induce recall bias, which, in turn, might differ between individuals on the autism spectrum and those who are not, as frequently observed in clinical practice. Individuals on the autism spectrum tend to focus more on one specific stressor, and they usually experience more difficulties with describing stress or mood over a longer period. This different perception of stress over time could cause differences in the response pattern on the questionnaires. Although this fell beyond the scope of the included studies, future researchers should consider this possible confounding factor. Furthermore, using the DASS, current symptom assessment was reported as well. Although this might provide valuable information, the momentary assessment of symptom levels may be strongly influenced by the situations that the individual has encountered in the few hours before the administration of the questionnaire in addition to the individual’s mood that day. Thus, assessing current symptoms reflects only a snapshot of the presence of certain symptoms, which usually cannot be generalized to the individual’s overall mood status.
Therefore, a well-evaluated reporting interval should be considered when using trait-like questionnaires as mentioned above. In contrast, state-like questionnaires such as the SUDS or use of the ESM technique can cover a short time span due to the momentary character of this assessment regarding an individual’s perceived stress, for instance to evaluate the immediate effect of a certain stressor.

Clinical Relevance

As previously mentioned, higher levels of perceived stress and difficulties with coping have been reported in children and adults on the autism spectrum (Bishop-Fitzpatrick et al., 2015, 2017a; Browning et al., 2009; Groden et al., 2006; Hirvikoski & Blomqvist, 2015; McGillivray & Evert, 2018). Several associations have been demonstrated in previous research between the level of stress and autistic traits (Hirvikoski & Blomqvist, 2015), higher intellectual capacities (George & Stokes, 2018), gender, and age (McGillivray & Evert, 2018). It is also recognized that heightened levels of perceived stress may further compromise social functioning in adults on the autism spectrum and negatively influence their quality of life (Bishop-Fitzpatrick et al., 2015, 2017b; Hirvikoski & Blomqvist, 2015; Hong et al., 2016; Park et al., 2019). Therefore, assessment of perceived stress in individuals on the autism spectrum with appropriate measurement tools and subsequent treatment is of high clinical interest. Next, it is important to note that each of the included questionnaires covered different aspects of stress. Clinicians and researchers must base the choice of the most appropriate self-report measure on the initial purpose of using that measure. In order to achieve an increased use of self-reports in individuals on the autism spectrum, adaptations of the current self-report tools may be necessary, as well as further examination of their psychometric properties. Furthermore, since evidence of superiority is lacking and it seems that self-reports and informant reports might provide different information, it would be best to combine both types of report.

Limitations of the Study

Some limitations need to be considered. First, only peer-reviewed studies were included in this review, which led to the exclusion of potentially interesting studies reported as abstracts or conference papers. However, due to their methodology, these reports offered insufficient information to discuss in this review.

Second, the COSMIN Risk of Bias assessment and the rating of the quality of evidence according to the GRADE system resulted in only a few psychometric properties with moderate to high quality of evidence. In addition, apart from one study (Park et al., 2020), none of the studies aimed at examining the psychometric properties of their relevant questionnaires in their study populations. This stresses the need for future research to focus on studies determining the psychometric properties of the reported questionnaires. Third, given the combined character of the DASS, it could be argued that this questionnaire should have been excluded from the systematic review since it did not focus on the measurement of perceived stress only. However, given the absence of a predefined exclusion criterion for combined questionnaires and the presence of a stress-specific subscale, this questionnaire was eventually included for data extraction and further discussion.

Finally, some features concerning the study samples need to be considered as these might limit the interpretation of the results found in this review. First, most study samples represented a male preponderance, similar to what is typically reported in studies concerning individuals on the autism spectrum (Giarelli et al., 2010). However, more women were included in two studies using the DASS-21 (Cage et al., 2018; George & Stokes, 2018), which was attributed to the format of the data collection by means of a survey (Cage et al., 2018), a format that might attract more female than male responders (Sax et al., 2003). A preponderance of female reports might, in turn, have an impact on the level of reported stress and/or the consequences related to stress. In a sample of typically developing adults, women reported more daily stress with more conflicts, frustration, daily demands, and chronic problems (Matud, 2004). Additionally, the different results between typically developing men and women in comparison with those on the autism spectrum as found in the study of Bernardin et al. (2021) support the use of gender-specific norms. The latter might provide more insight into the experience of stress in men and women on the autism spectrum. Second, most studies excluded individuals on the autism spectrum with intellectual disability. Therefore, the findings from this review are not generalizable to the general population on the autism spectrum, which encompasses individuals with lower intellectual abilities as well. However, the PSS-4 was used in a sample of adults on the autism spectrum of whom one-third was diagnosed with an intellectual disability (Bishop-Fitzpatrick et al., 2017b; Hong et al., 2016). This might be explained by the limited number of questions in this questionnaire, making it more feasible to administer in individuals with lower intellectual abilities, although research to confirm this hypothesis needs to be conducted.
Third, in the majority of study populations, the mean age of diagnosis was in the adult range (Cage et al., 2018; Hirvikoski & Blomqvist, 2015). This is not in accordance with common practice where the mean age of diagnosis occurs primarily in childhood or early adolescence due to early detection, screening procedures, and the fact that autism spectrum disorder is a neurodevelopmental condition (Elsabbagh et al., 2012; Lai et al., 2014). However, this shift in mean age of diagnosis might be partly explained by the large proportion of females in one of these studies (Cage et al., 2018) for whom a diagnosis might be found later in life in comparison with males on the autism spectrum (Giarelli et al., 2010). Finally, in some studies, the participants were not recruited using strict inclusion criteria (Cage et al., 2018), especially in one study where no detailed information regarding diagnosis or diagnostic procedures was provided (George & Stokes, 2018). Furthermore, in the study of Jackson et al. (2018), 20 participants scored below the cutoff criterion of the 10-item Autism Quotient (AQ-10) but were still included in the autistic group as the authors suggested that these participants had false negative scores. All previously mentioned factors are important to consider when interpreting the results of this review since they refer to heterogeneous representations of populations on the autism spectrum, as is commonly reported in the literature.

Implications for Future Research

Clear clinical relevance is present with regard to assessing self-reported stress levels and the feasibility of administering such tools in individuals on the autism spectrum. In contrast, evidence on psychometric properties of these self-reports is still scarce, except for the DASS-21. This gap in current research should be addressed by using appropriate study designs and psychometric approaches in future research. Therefore, the different aspects of reliability and validity that are mentioned in the COSMIN checklist should be addressed as current evidence is scarce and mainly of low to very low quality. The most important contributors to this low level of quality are small sample sizes and high levels of risk of bias. Inconsistent results were the main contributors to the low quality of evidence of hypothesis testing regarding the DASS-21. No information has been reported for any of the included self-report measures in populations on the autism spectrum regarding content validity, cross-cultural validity, and measurement error. In addition, factor analysis was not performed in the included studies of this review, apart from the study of Park et al. (2020) regarding the DASS-21. However, factor analysis provides information with regard to the dimensionality of the questionnaires, which would allow deciding upon the most appropriate psychometric approach. With respect to the hypothesized construct validity, future research could include typically developing peers and populations with clinical disorders other than autism spectrum disorder in order to further investigate this aspect. Including a comparison with questionnaires on similar constructs (internalizing symptoms) can provide more insight into construct validity as based on convergent validity.
The collection of normative and gender-specific data on self-reported measures in individuals on the autism spectrum can provide useful insights into screening for stress-related complaints in these individuals (McGillivray & Evert, 2014; Ozsivadjian et al., 2014). In addition, repeated assessments might provide more insight into reliability and responsiveness features of the reported questionnaires in this review. As previously mentioned, a more accurate determination of internal consistency can be accomplished by using measures other than Cronbach’s alpha. Following standardized guidelines, such as the COSMIN checklist, can increase the homogeneity in future study designs. In addition, the examined reporting interval should be mentioned to enhance the comparability of different study results. Next, as ESM is less susceptible to recall bias and has been used multiple times in individuals with psychiatric disorders (Myin-Germeys et al., 2009), it is of utmost importance to validate its use in individuals on the autism spectrum. Feasibility studies of the SSS, DASS, and PSS in children and adolescents on the autism spectrum need to be conducted in addition to studies focusing on the psychometric properties in this population. This could be combined with adapting the questions according to the developmental and age-specific situations that this population encounters. Finally, the reliability and quality of current self-reports in individuals on the autism spectrum with intellectual disability might be lower due to their limited ability to reflect upon their inner state. Future researchers should therefore aim to develop adapted versions of self-reports to increase the feasibility of use by simplifying the questions and using more concrete language. In addition, an adapted version of informant reports, as proposed by Hong et al. (2016), could be used for the assessment of perceived stress in this population.
This adapted version asks parents how they think their child would respond to the questions (Sheldrick et al., 2012) instead of typical other reports, where parents are asked to estimate the perceived stress of their child (Li et al., 2015). Correlations between self-reports and these adapted informant reports were higher compared to correlations between self-reports and “typical” informant reports. This argues for the use of adapted informant reports in order to gather information on a certain topic whenever respondents are unable to answer themselves (Hong et al., 2016). However, it should be noted that the questionnaires used addressed quality of life, for which the adapted informant reports might be more feasible than for topics related to the experience of stress. In sum, a combination of the previously mentioned adaptations regarding self-reports and informant reports could further enhance the knowledge of self-reported stress in individuals on the autism spectrum with intellectual disability and should be addressed in future research.

Conclusion

This review included eight different questionnaires based on 31 studies regarding self-reported stress in individuals on the autism spectrum. It is important to keep in mind which concept of stress researchers aim to measure as not all questionnaires encompass the same aspects of perceived stress. Based on the self-report measures found in this review for adults and children on the autism spectrum, only the PSS and the SiC Questionnaire respectively cover the concept of perceived stress whereas the other questionnaires reflect upon the frequency or intensity of symptoms of stress. Currently, the use of any of these questionnaires cannot be recommended as evidence on psychometric properties is too scarce. Therefore, the first step for future research is to examine the psychometric properties of the questionnaires for individuals on the autism spectrum. Second, it may be necessary to implement autism-specific adaptations of the questions to enhance the comprehensibility in this population whenever unsatisfactory results for psychometric properties are found.