Introduction

Internalizing behaviors including depression, anxiety, social withdrawal, and somatic problems impact the behavioral, social, and emotional functioning of up to 20% of youth in the United States (Walker et al., 2000). Internalizing symptoms, such as worry, low self-esteem, and low mood usually occur when individuals try to control their emotions or impulses and tend to influence individuals’ emotions in a maladaptive manner (Merrell & Gueldner, 2010). Given the range of negative outcomes associated with internalizing distress, including functional impairment (Kendall et al., 2010), social skills deficits (Nelson et al., 2004), and substance abuse (Lopez et al., 2005), there is a critical need to support youth with internalizing symptomology. One means to support youth is providing evidence-based care, which includes monitoring responsiveness to treatment.

Progress monitoring, or continual routine tracking, is an important aspect of evidence-based practice as it facilitates more timely decision making than traditional pre-post assessment (APA, 2006). Reliance on infrequent monitoring can lead to waiting extended periods of time to determine intervention effectiveness and may result in youth receiving ineffective services while clinicians lack timely information about progress. However, to effectively progress monitor response to interventions that target internalizing difficulties, it is essential that reliable tools be validated for this specific assessment purpose. Specifically, for progress monitoring data to defensibly inform decision making, it is important for tools to demonstrate evidence of reliability, validity, and treatment sensitivity (i.e., the ability to detect changes in student behavior over time; Briesch & Volpe, 2007). Many self-report rating scales exist for assessing internalizing concerns in youth, such as the Children’s Depression Inventory, Second Edition (CDI-2; Kovacs, 2011), the Multidimensional Anxiety Scale for Children, Second Edition (MASC-2; March, 2013), the Revised Children’s Manifest Anxiety Scale, Second Edition (RCMAS-2; Reynolds & Richmond, 2008), and the Reynolds Adolescent Depression Scale, Second Edition (RADS-2; Reynolds, 2004). Although these traditional, norm-referenced behavior rating scales have impressive psychometric properties, they were designed for diagnostic or summative purposes, comparing student reports to normative benchmarks. Additionally, youth are typically asked to reflect on the extent to which they generally experience particular thoughts, feelings, and/or behaviors as opposed to focusing assessment on a shorter period of time (e.g., daily, weekly). As such, the utility of these tools within a progress monitoring context is unknown.

A systematic review was recently conducted by Dart et al. (2019) to better understand the landscape of available progress monitoring measures for internalizing symptoms that can be utilized with youth. In searching for tools appropriate for progress monitoring, the authors used the inclusion criterion that “an assessment is able to detect small changes in functioning and is designed for frequent (i.e., at least weekly) administration (i.e., resistant to practice effects, practical, and feasible) to provide information on treatment progress and inform treatment decisions” (Dart et al., 2019, p. 267). Although the review identified a total of eight self-report measures, psychometric evidence was available for only three of these measures. Whereas the Separation Anxiety Avoidance Inventory (SAAI; In-Albon et al., 2013) focuses exclusively on the assessment of separation anxiety across 12 possible situations, the Brief Problem Checklist (BPC; Chorpita et al., 2010) provides a broad assessment of both internalizing and externalizing concerns across 15 items. In contrast, the Outcome Questionnaire (OQ; Vermeersch et al., 2000) measures three domains of adolescent functioning and potential impairment including subjective discomfort, interpersonal relations, and social role performance. Consideration of the limited scope of these measures in combination with the available psychometric evidence (i.e., the SAAI being the only measure with evidence of reliability, validity, and sensitivity to change) suggests a continuing need to identify self-report measures designed for the regular progress monitoring of youth with internalizing concerns.

Depression Anxiety Stress Scales

One tool that has been used to progress monitor internalizing concerns in adults is the Depression Anxiety Stress Scales-Short Form (DASS-21; Lovibond & Lovibond, 1995). This 21-item self-report rating scale is an abbreviated version of the full 42-item DASS, consisting of three subscales. The depression subscale measures dysphoria, hopelessness, low self-esteem, self-depreciation, lack of interest, anhedonia, and low positive affect. The anxiety subscale assesses autonomic arousal, musculo-skeletal symptoms, situational anxiety, and subjective experiences of anxious arousal. Finally, the stress subscale measures difficulty relaxing, tension, agitation, irritability, and negative affect (Lovibond & Lovibond, 1995). Each of the three subscales consists of 7 items, which ask respondents to indicate how much a statement applied to them over the past week using a 4-point scale (i.e., 0 = Did not apply to me at all; 3 = Applied to me very much, or most of the time). Higher scores indicate more frequent symptomatology. Items on the DASS-21 were selected from the full-length DASS based on several criteria, including strong factor loadings and item means (Lovibond & Lovibond, 1995). The DASS-21 is available for free from the Psychology Foundation of Australia (n.d.).
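The scoring procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than an official scoring routine: the function name is hypothetical, and the item-to-subscale key is an assumption based on the commonly published DASS-21 scoring template, so it should be checked against the official materials before any applied use.

```python
# Item-to-subscale key (1-based item numbers).
# Assumption: this follows the commonly published DASS-21 scoring template;
# verify it against the official key before relying on it.
SUBSCALE_ITEMS = {
    "depression": (3, 5, 10, 13, 16, 17, 21),
    "anxiety": (2, 4, 7, 9, 15, 19, 20),
    "stress": (1, 6, 8, 11, 12, 14, 18),
}


def score_dass21(responses):
    """Return subscale sums and the total score for one respondent.

    responses: sequence of 21 integers (0-3), in item order.
    """
    if len(responses) != 21:
        raise ValueError("expected 21 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be an integer from 0 to 3")
    # Each subscale score is the sum of its seven items.
    scores = {
        name: sum(responses[i - 1] for i in items)
        for name, items in SUBSCALE_ITEMS.items()
    }
    scores["total"] = sum(responses)
    return scores
```

With seven items scored 0 to 3, each subscale sum ranges from 0 to 21 and the total from 0 to 63.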

The DASS-21 was initially validated in a nonclinical sample of participants aged 17–69 years old, and numerous psychometric studies support its use with adult populations (e.g., Clara et al., 2001; Ng et al., 2007). Additionally, the English-language measure has been successfully translated into, and validated across, multiple languages including Chinese (Cao et al., 2023), Hindi (e.g., Kumar et al., 2019), and Spanish (e.g., Bados et al., 2005). Although the lower age limit of the development samples was 17 years, Lovibond and Lovibond (1995) noted no compelling reasons not to use the DASS-21 with children as young as 12. As such, the DASS-21 may be able to fill the gap that exists with regard to psychometrically defensible progress monitoring tools for internalizing concerns. Because the scale was developed for, and tested primarily with, adult populations, however, one cannot assume that similar psychometric properties exist when used with a younger population. As a result, it is necessary to better understand the extant evidence supporting use of the DASS-21 specifically with youth populations.

Purpose of Study

Despite increasingly widespread application of the DASS-21 with adult and youth populations, psychometric evidence in support of its use specifically with a youth population needs to be investigated to fully understand its utility. Given this gap in the literature, this paper documents a systematic review of the literature pertaining to the psychometric properties of the DASS-21 with adolescent populations. The goal of this systematic review was to present information for both researchers and clinicians while also identifying limitations of the literature and highlighting both ideas for future research and clinical implications. More specifically, we sought to evaluate the appropriateness of using the DASS-21 for progress monitoring with youth (i.e., under 18 years old) populations by answering the following questions:

  1. What evidence exists regarding the reliability (i.e., test–retest, internal consistency) of data obtained from the DASS-21 with children and adolescents?

  2. What evidence exists regarding the validity (i.e., structural, convergent, discriminant) of data obtained from the DASS-21 with children and adolescents?

  3. What evidence exists regarding the sensitivity to change of data obtained from the DASS-21 with children and adolescents?

Method

Search Procedure

This systematic literature review adhered to the reporting guidelines for systematic reviews (Preferred Reporting Items for Systematic Reviews and Meta-Analyses [PRISMA]; Moher et al., 2009). During March 2023, the APA PsycInfo, eBook Collection (EBSCOhost), ERIC, MEDLINE, Psychology and Behavioral Sciences Collection, SPORTDiscus, and Dissertation and Theses databases were searched for papers exploring the psychometric properties of the DASS-21 with youth populations. No date restrictions were used. All possible combinations of the following search terms were used: (a) either “DASS-21” or “depression, anxiety, stress scale” and (b) “psychometric,” “reliability,” “validity,” “sensitivity to change,” “factor structure,” “factor analysis,” or “reproducibility of results.”

The database search identified 1266 articles (see Fig. 1). After removing duplicates, 468 unique records were screened based on their titles and abstracts to remove irrelevant articles. Specifically, studies were selected for inclusion in the review if two criteria were met. First, the paper was published in English and in a peer-reviewed scientific journal or as a dissertation. Second, the authors examined any psychometric properties of the DASS-21 relevant to progress monitoring assessment with child or adolescent populations. Studies that primarily (i.e., > 50% of the sample) included data on adults (aged 18 and above) were excluded. One reviewer screened all titles and abstracts and, if needed, the full article, to determine eligibility for inclusion. Any questionable articles were discussed with a second reviewer until a consensus was reached. Screening resulted in the exclusion of 450 papers, due to the following exclusion criteria: (a) no examination of the psychometrics of the DASS-21: 355 articles (79%), (b) reviewed DASS-21 with adult populations only: 91 articles (20%), and (c) articles published in a language other than English: 7 articles (2%). After screening, 18 articles were included.

Fig. 1 Study inclusion flowchart

Coding Procedure and Analysis

After confirming eligible studies, each paper was reviewed to extract any psychometric evidence relevant to progress monitoring purposes (Briesch & Volpe, 2007). First, we looked for validity evidence based on internal structure by reviewing factor analytic findings. Specifically, we sought to determine whether the original three-factor model proposed by Lovibond and Lovibond (1995) was replicated in youth samples. Although factor analytic studies often compare multiple structural models, for the purposes of this study, we sought to identify the model identified by the authors as demonstrating the best model fit. Related, internal consistency reliability was coded if the authors assessed the degree of interrelatedness among the items either for the overall measure or for particular subscales. In reviewing factor analytic findings, we also coded whether each study provided evidence of measurement invariance across two or more demographic groups (e.g., gender, race, geography).

Second, we looked for validity evidence based on relations to other variables in the form of convergent and discriminant evidence. Convergent evidence was coded if the authors assessed the degree to which the scores from the DASS-21 related to scores on another measure to which it should be related. In contrast, discriminant evidence was coded if the authors assessed the degree to which the scores from the DASS-21 did not relate to scores on another scale that measured dissimilar constructs. Regarding the interpretation of coefficients, validity evidence is generally classified as weak (r < 0.30), moderate (0.30 ≤ r < 0.70), or strong (r ≥ 0.70).
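The classification rule above can be written as a short function. This is a sketch under stated assumptions: the function name is hypothetical, and the rule is applied to the absolute value of the coefficient, on the assumption that a negative sign only reflects the scoring direction of the comparison measure.

```python
def classify_validity(r):
    """Label a correlation coefficient as weak, moderate, or strong."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    # Assumption: only the magnitude matters; the sign reflects the
    # scoring direction of the comparison measure.
    magnitude = abs(r)
    if magnitude < 0.30:
        return "weak"
    if magnitude < 0.70:
        return "moderate"
    return "strong"
```

Under this rule a coefficient of exactly 0.30 is labeled moderate and a coefficient of exactly 0.70 is labeled strong.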

Finally, a psychometric criterion unique to progress monitoring is the responsiveness of a scale (Centers for Disease Control & Prevention, 2000). Scale responsiveness includes both whether data remain stable in the absence of an effect (i.e., test–retest reliability) and whether the measure is able to detect an actual effect when it occurs (i.e., sensitivity to change). Internal consistency reliability is most often reported using Cronbach’s coefficient alpha (α), with values closer to 1.0 indicating stronger reliability. For the purposes of this review, coefficients between 0.60 and 0.70 indicated an acceptable level of reliability, whereas 0.80 or greater indicated a very good level. In cases where coefficient omega was reported, omega values above 0.80 were interpreted as good internal consistency. Sensitivity to change could be assessed using either single-measure or comparative approaches. Within a single-measure approach, the target measure is administered over time to a sample or individual whose behavior is expected to change to determine whether meaningful change is captured in the data (e.g., Hustus et al., 2020). Within a comparative approach, both the target measure and an external criterion measure are administered over time to determine whether similar change is captured by the two measures (e.g., Stratford et al., 1996).
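For readers less familiar with Cronbach’s coefficient alpha, the standard formula can be sketched as follows. The function name and the respondents-by-items input layout are illustrative assumptions; the reviewed studies may have used different software and estimation details.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix.

    rows: list of per-respondent lists, each holding the same k item scores.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must answer the same items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Alpha approaches 1.0 when items rise and fall together across respondents and falls toward 0 when item scores are unrelated.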

Results

The 18 studies identified through the systematic review were published between 2005 and 2023 (see Table 1). Total study sample sizes varied extensively from 216 (Duffy et al., 2005) to 4,202 (Naumova, 2022) and may be explained, in part, by the fact that studies were split between those that required youth to complete the measure independently (e.g., online) and those that administered the measure in a group setting (e.g., during class time). The majority of studies focused on general samples of secondary school students (78%, n = 14), whereas four studies focused on college-age students. The greatest number of studies were published in Australia (28%, n = 5) and the United States (28%, n = 5), with three studies published in Europe, three in Asia, one in South America, and one containing a multi-national sample. Of the 18 identified studies, 1 (6%) reported on temporal stability, 13 (72%) reported on internal consistency, 16 (89%) reported on the factor structure of the measure, and 5 (28%) reported on convergent evidence. No studies investigated discriminant validity evidence or sensitivity to change of the DASS-21 with youth populations.

Table 1  Characteristics of the DASS-21 Studies Included in the Systematic Review

Validity Evidence Based on Internal Structure

Factor analysis of the DASS-21 was conducted in 16 of the studies. Exploratory factor analysis (EFA) was used with five samples, confirmatory factor analysis (CFA) was used with 10 adolescent samples, and one study utilized both methods (see Table 2). Studies investigating the structural validity of the DASS-21 in adolescent samples have produced inconsistent findings. Extant research supports various internal structures of the DASS-21, including unidimensional, two-factor, DASS-consistent three-factor, and bifactor (i.e., tripartite, quadripartite) models.

Unidimensional Model

Two studies provided support for a unidimensional structure of the DASS-21 among adolescents. Patrick et al. (2010) and Shaw et al.’s (2017) analyses revealed that a single, general factor explained most of the common variance in the DASS-21 scores for adolescents and young people (almost 90%; Shaw et al., 2017). Both authors suggested that one factor should be extracted, indicating that the measure does not discriminate between depression, anxiety and stress in children and adolescents, but rather measures a single distress dimension.

Two-Factor Model

Two studies provided support for a two-factor model. Both Silva et al. (2016) and Yap and Lee (2023) conducted EFAs and found the best model fit when combining the items in the anxiety and stress scales into a single factor and the depression items in a second factor.

DASS-Consistent Three-Factor Model

The original three-factor model, with three correlated factors of depression, anxiety, and stress, has been supported across three studies (Anghel, 2020; Mellor et al., 2015; Norton, 2007). Additionally, Mellor et al. (2015) found evidence of measurement invariance across Australian, Chilean, Chinese, and Malaysian samples, suggesting that this three-factor model is supported across contexts. Norton (2007) similarly found evidence of measurement invariance across racial groups (i.e., African American, Asian, Latinx, White); however, the strength of the factor intercorrelations varied across racial groups.

Bifactor Models

A number of studies have specified bifactor measurement models for the DASS-21. Bifactor measurement models account for both (a) a general factor, onto which all items load (e.g., distress) and (b) specific factors that account for some proportion of variance above and beyond the general factor (e.g., depression, anxiety, stress).
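The bifactor structure described above can be written compactly as a measurement equation. The notation below is an illustrative assumption rather than the notation of any reviewed study: x_{ij} is respondent i's score on item j, G is the general factor, S_{k(j)} is the specific factor to which item j is assigned, and the zero covariance between general and specific factors reflects the orthogonality constraint commonly imposed in bifactor models.

```latex
x_{ij} = \lambda^{G}_{j}\, G_{i} + \lambda^{S}_{j}\, S_{i,k(j)} + \varepsilon_{ij},
\qquad \operatorname{Cov}(G, S_{k}) = 0 \ \text{for all } k
```

In this form, a specific factor contributes unique information only to the extent that its loadings remain non-negligible once the general factor has been accounted for.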

Tripartite Model. Although the DASS-21 was developed to discriminate between depression, anxiety, and stress as distinct states of negative emotionality (Lovibond & Lovibond, 1995), some studies have tested the fit of the tripartite model proposed by Clark and Watson (1991). This model suggests that whereas there are some distinctive symptoms of anxiety (i.e., physiological hyperarousal) and depression (i.e., anhedonia, lack of positive affect), there are also many symptoms of negative affect (e.g., irritability, difficulty sleeping) that are common to both. Two studies provided support for the tripartite model. The tripartite model identified by Tully et al. (2009) contained a general factor (Negative Affectivity) on which all items were allowed to load and then two specific factors of depression and anxiety (anhedonia and physiological hyperarousal, respectively). This model was found to be invariant across younger (i.e., 12–14) and older (i.e., 15–18) adolescents. Willemsen et al. (2011) replicated the tripartite model in a sample of Belgian secondary school students and also found evidence of measurement invariance across males and females. In contrast, Duffy et al. (2005) found that a two-factor model comprising (1) physiological arousal and (2) generalized negativity (e.g., depression + negative affectivity) provided the best fit.

Quadripartite Model. Henry and Crawford (2005) first developed a quadripartite model, involving a common negative affectivity (NA) factor, as well as three specific factors of depression, anxiety, and stress when testing the construct validity of the DASS-21 for the general adult population. They assessed whether stress was synonymous with NA or if it represented a related, but separate, construct, as it had remained unclear if the DASS stress subscale represented general psychological distress (i.e., NA) or a distinct stress component (Henry & Crawford, 2005). Data supported a quadripartite structure, including one general factor of psychological distress and three orthogonal factors of depression, anxiety, and stress.

Five studies have replicated the results of the quadripartite model with younger samples. Jovanović et al. (2021), Le et al. (2017), Naumova (2022), and Szabó (2010) reported this to be the best fitting model within their samples of secondary school students, though all noted that only the DASS-D and DASS-A contributed unique information beyond the general factor. Although Moore et al. (2017) also found strong support for the bifactor model with three specific factors, they reported that the specific factors failed to explain variance beyond the general factor of negative affectivity. Additionally, Jovanović et al. (2021) conducted tests of measurement invariance and found that invariance was supported across both gender and age.

Although Chin et al. (2019) attempted to replicate the quadripartite model, their research involving undergraduate students in the U.S. ultimately resulted in a unique bifactor model. Specifically, these authors found weak results for the anxiety and stress subscales, thus supporting a bifactor model that consisted of a general distress factor (onto which all 21 depression, anxiety, and stress items loaded) as well as a distinct 6-item depression factor.

Internal Consistency Reliability

Internal consistency coefficients (i.e., Cronbach’s alpha, omega) were reported in nine studies for the depression, anxiety, and stress subscales and in seven studies for the DASS-21 Total Score (see Table 2). Coefficients ranged from good to excellent across samples for the DASS-D (α = 0.72–0.93), DASS-A (α = 0.77–0.90), DASS-S (α = 0.70–0.94), and DASS-T (α = 0.83–0.97). Overall, strong internal consistency was identified across scales.

Table 2 Factor Structure and Internal Consistency of the DASS-21 Subscales and Total Score

Convergent Validity

To date, five studies have reported on the convergent validity of the DASS-21 when used with youth populations.

Depression Subscale. Four studies examined convergent validity by comparing the DASS-D subscale to other measures of depression or mental health. Evans et al. (2021) found strong correlations with the Center for Epidemiologic Studies Depression Scale (CES-D; Radloff, 1977) for both male (0.87) and female (0.86) secondary school students. Le et al. (2017) found the correlation with the Depression domain of the Duke Health Profile Adolescent Vietnamese validated version (ADHP-V; Parkerson et al., 1990) to also be statistically significant, though more moderate in strength (i.e., −0.58). Finally, Jovanović et al. (2021) and Norton (2007) found that the DASS-D was a significant predictor of scores on the Positive and Negative Affect Schedule-Positive Affect (PANAS-PA; Laurent et al., 1999; β = −0.28) and the Beck Depression Inventory (BDI; Beck et al., 1996; η² = 0.20), respectively (see Table 3).

Table 3 Validity of the DASS-21 Subscales and Total Score

Anxiety Subscale. Four studies examined convergent validity by comparing the DASS-A subscale to other measures of anxiety or mental health. Evans et al. (2021) found moderate correlations with the GAD Screener (GAD7; Spitzer et al., 2006) for male (0.50) secondary school students, whereas correlations were stronger for female students (0.72). Le et al. (2017) found the correlation with the Anxiety domain of the ADHP-V (Parkerson et al., 1990) to also be statistically significant, though more moderate in strength (i.e., −0.50). Finally, the DASS-A was a stronger predictor of scores on the Beck Anxiety Inventory (BAI; Beck et al., 1988; η² = 0.21) in the study by Norton (2007) than of scores on the PANAS-NA (Laurent et al., 1999; β = −0.05) in the study by Jovanović et al. (2021).

Stress Subscale. The stress subscale is intended to measure difficulty relaxing, tension, agitation, irritability, and negative affect. Three studies examined convergent validity by comparing the DASS-21 stress subscale to measures of related constructs, including depression (e.g., ADHP-V, BDI), anxiety (e.g., ADHP-V, BAI), affect (e.g., PANAS-NA, PANAS-PA), and general health (e.g., ADHP-V). Moderate correlations were found with the ADHP-V Anxiety (−0.50), Depression (−0.48), and General Health (−0.47) subscales (Le et al., 2017). Norton (2007) found that the DASS-S significantly predicted scores on the BDI (η² = 0.10), PANAS-NA (η² = 0.08), and BAI (η² = 0.05). Jovanović et al. (2021) found that the DASS-S was not a significant predictor of scores on the PANAS-PA (β = 0.02) and PANAS-NA (β = 0.07).

Total Score. Three studies examined the convergent validity of the total score. Evans et al. (2021) found strong correlations with the Langner Symptom Survey (LSS; Langner, 1962) for both male (0.80) and female (0.76) secondary school students. Le et al. (2017) found correlations with the Mental Health and General Health domains of the ADHP-V (Parkerson et al., 1990) to also be statistically significant, though more moderate in strength (i.e., Mental Health = −0.63, General Health = −0.66). Finally, Jovanović et al. (2021) found the DASS-T to be a stronger predictor of scores on the PANAS (Laurent et al., 1999) PA (β = −0.43) and NA (β = 0.73) than the individual subscales.

Test–Retest Reliability

One study reported evidence of test–retest reliability of the DASS-21. Silva et al. (2016) administered the DASS-21 on two occasions one week apart and found intraclass correlation coefficients to be very strong across the subscales (DASS-D = 0.86; DASS-A = 0.80; DASS-S = 0.82).

Discussion

Although it has been suggested that the DASS-21 may be appropriate to use with youth in a progress monitoring context, much of the psychometric evidence has been obtained with adults to date. The purpose of this paper was therefore to review the growing literature pertaining to the psychometric properties of the DASS-21 when used specifically with youth populations to clarify the applied utility of the measure. The systematic review yielded 18 papers meeting inclusion criteria and helps to elucidate those areas in which psychometric evidence supporting use of the DASS-21 with youth populations is robust, as well as those in which additional inquiry appears warranted.

First and foremost, understanding the factor structure of the measure is critically important, as it provides guidance regarding how the items should be combined and summarized. Although the DASS-21 was designed to assess the three subscales of depression, anxiety, and stress in adults (Lovibond & Lovibond, 1995), studies utilizing the measure with youth participants have found varied factor structures (i.e., ranging from 1–4 factors). The greatest support (i.e., 31% of studies) was found for the hierarchical quadripartite model that involves a common negativity factor, as well as the three specific factors of depression, anxiety, and stress. It is notable, however, that several of the studies reviewed either found insufficient support for the three specific factors (supporting instead use of the total score; e.g., Patrick et al., 2010; Shaw et al., 2017) or only found support for the depression and anxiety scales (e.g., Silva et al., 2016; Tully et al., 2009). Related, Le et al. (2017) found that the DASS-D, DASS-A, and DASS-S correlated as strongly with measures of the same internalizing domain as they did with measures of other internalizing domains. Taken together, these results seem to suggest that there may not be adequate differentiation between the subscales of the DASS-21. As such, the accumulated evidence seems to support use of the DASS-21 total score while further research is needed to understand the utility of the subscales.

A second notable finding was that few studies to date have established validity evidence based on relations to other variables when using the DASS-21 with youth. Understanding whether the DASS-21 scales are correlated with other measures of the same construct (as well as not correlated with measures of other constructs) is important, as it provides evidence that the measure is actually assessing what it purports to assess. Across domains, there appears to be the strongest convergent evidence for the total score (DASS-T). That is, three studies highlighted moderate to strong relationships between the DASS-T and overall measures of mental health and distress. When the depression and anxiety subscales were directly compared to extant measures of the same constructs (e.g., BDI, GAD7), the strength of these relationships was found to vary across these three studies (r = 0.50–0.87). The range of these correlations is consistent, however, with evidence found with adult populations (e.g., Bibi et al., 2020; Oei et al., 2013). Although moderate associations have been identified between the stress subscale and other measures of mental (e.g., anxiety, depression) and general health, it is important to note that no studies to date have specifically examined correlations with established measures of stress.

Third, virtually no evidence was identified in support of the sensitivity to change of the DASS-21 when used with youth populations. Although reliability and validity evidence are relevant across all assessment purposes, sensitivity to change is uniquely relevant to progress monitoring. This indicator is important because if the DASS-21 is not sensitive to change, this means that it is unable to detect when meaningful changes occur as the result of treatment. The one study to report test–retest reliability found that subscale scores were highly consistent when the DASS-21 was administered one week apart; however, the study used a non-clinical youth sample who were not receiving treatment (Silva et al., 2016). Studies are therefore needed to understand whether the DASS-21 captures changes that occur as a function of treatment.
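To make the idea of a single-measure sensitivity-to-change analysis concrete, the sketch below computes a standardized response mean (mean change divided by the standard deviation of change). The choice of this index is an assumption made for illustration, since the review does not prescribe a specific statistic, and the function name is hypothetical.

```python
from statistics import mean, stdev


def standardized_response_mean(pre, post):
    """Mean pre-to-post change divided by the SD of the change scores.

    pre, post: paired total scores for the same respondents, in the same order.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two paired observations")
    changes = [after - before for before, after in zip(pre, post)]
    spread = stdev(changes)
    if spread == 0:
        raise ValueError("change scores have zero variance")
    return mean(changes) / spread
```

Because higher DASS-21 scores indicate more frequent symptomatology, a negative value indicates that scores decreased on average between the two administrations.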

Finally, it is important to consider that differences in contextual and cultural factors may limit the generalizability of evidence on the psychometric properties of the DASS-21 to other youth populations. Although the current review found that structural validity has been examined across multiple non-English-speaking countries (i.e., Belgium, Brazil, Chile, China, Israel, Macedonia, Malaysia, Serbia, Singapore, Vietnam), additional evidence in support of reliability (e.g., internal consistency, test–retest) was not consistently presented and validity evidence was present in only two cases (i.e., Jovanović et al., 2021; Le et al., 2017). As such, additional research is warranted to ensure that the DASS-21 operates similarly across international contexts.

Limitations and Recommendations for Future Research

The current review has limitations that should be acknowledged. First, the review excluded non-English papers, as well as unpublished work, which may have contributed relevant information. Second, most of the studies reviewed employed general, non-clinical samples. Although general samples are appropriate for screening purposes, it may be useful to investigate clinical samples of youth to provide support for use within progress monitoring. Third, most of the studies that evaluated internal consistency (i.e., all but one) used Cronbach’s alpha, which makes strong assumptions of unidimensionality and equal factor loadings. It was evident in the review of factor analysis studies that these assumptions were not always met.

Conclusions and Implications for Practice

Extant research has demonstrated that outcome measurement can improve the quality of care and increase the percentage of clients whose functioning improves during treatment (Blais et al., 2009). The DASS-21 has the potential to be a feasible means of monitoring youth functioning over time, given that it is both brief and freely available in the public domain. The majority of studies reporting psychometric properties in this review utilized the DASS-21 with mixed-gender groups of secondary school students, thus supporting its use with similar populations. Pending additional inquiry, however, practitioners are encouraged to utilize the total score, for which stronger evidence of reliability and validity exists. Additionally, future research focused on examining the sensitivity to change of DASS-21 data would help to support use with youth for clinical progress monitoring purposes.