Introduction

To improve the quality of mental health care, the field of psychiatry has increasingly acknowledged the importance of evidence-based practices, including evidence-based treatments and assessments (Pincus et al., 2007; Wright et al., 2022; Youngstrom, 2013). Although several studies have found that evidence-based, standardized assessments result in more accurate diagnoses that facilitate accurate clinical communication and effective treatment (Basco et al., 2000; Miller et al., 2001), these measures are often not used in clinical practice (Pincus et al., 2007). Surveys of mental health clinicians (including psychologists, psychiatrists, social workers, and counselors) have revealed that many clinicians report not using standardized tools, often due to time limitations, financial constraints, and limited training in administration, scoring, and interpretation of these tools (Arbuckle et al., 2013; Bruchmüller et al., 2011; Jensen-Doss & Hawley, 2010). Instead, many practitioners opt for unstructured clinical interviews, despite significant concern over the reliability and accuracy of many approaches to unstructured interviews (Basco et al., 2000; Zimmerman, 2016; Zimmerman & Mattia, 1999). Therefore, there is a significant need for brief, cost effective, easy-to-use standardized assessment tools with validated diagnostic accuracy. These tools are needed particularly on inpatient psychiatric units where length of hospitalization grows increasingly shorter, decreasing to under a week on average in the past few decades (Case et al., 2007; Lieberman et al., 1998). Further, inpatient psychiatric hospitalizations for adolescents are on the rise, with recent studies suggesting that more than one in ten adolescents who receive mental health care are hospitalized as a part of their care (Mojtabai & Olfson, 2020).

Of particular importance on adolescent inpatient units is the screening for depressive disorders, which are often linked to one of the most common reasons for psychiatric hospitalization in adolescents: suicidal behavior (Case et al., 2007; Laukkanen et al., 2016; Mathai & Bourne, 2009). While adolescents often present with multiple contributing factors for hospitalization, the majority of inpatient adolescents report suicidal ideation and related behaviors (Alqueza et al., 2021; Hanssen-Bauer et al., 2011; Rodriguez-Quintana & Ugueto, 2021). In addition, among adolescents hospitalized for suicidal behavior, 50–90% meet current or lifetime criteria for a depressive diagnosis (Millon et al., 2022; Poyraz Fındık et al., 2022). Given the brief nature of hospital stays and the prevalence of depressive disorders within inpatient settings, a number of short screening tools have been developed for depressive disorders, including rating scales (e.g., Children’s Depression Inventory [CDI; Kovacs, 2015], Center for Epidemiologic Studies Depression Scale [CES-D; Radloff, 1977]) and brief interviews (e.g., clinician-administered questions briefly assessing depressive symptoms; Sharp & Lipsky, 2002; Young et al., 2010). While these measures are more efficient than full diagnostic interviews such as the Kiddie Schedule for Affective Disorders and Schizophrenia (KSADS; Kaufman et al., 1997, 2016), they have several limitations. For example, although depression rating scales often take only 5–10 minutes to complete (Sharp & Lipsky, 2002), this amount of time can quickly become overwhelming in fast-paced settings when assessment of several disorders is needed. Interpreting these measures also often requires knowledge of psychometric properties and the use of norm-referenced scores. In addition, many of the brief interviews have not been systematically validated to clearly establish their reliability and screening utility (Roseman et al., 2016; Sharp & Lipsky, 2002), and there is evidence that existing screening measures continue to be used ineffectively in clinical practice (Fuchs et al., 2015).

Further, the content of most existing rating scales and brief interviews is limited in one particularly critical area: they do not assess the duration of depressive symptoms, instead focusing solely on the presence of depressive symptomology. For example, the two-item version of the Patient Health Questionnaire (PHQ-2) screens for depression with only two questions regarding the presence of depressed mood and anhedonia over the last two weeks, without assessing the chronicity of the depressive symptoms (Richardson et al., 2010). These types of assessments therefore often result in a high rate of false positives (Roberts et al., 1991). For some community screenings, such over-diagnosis can be acceptable because it reduces the likelihood that individuals with depression are missed by the screener (i.e., false negatives). However, on an inpatient unit where critical decisions such as medication management need to be made quickly, both sensitivity and specificity (i.e., reducing both false positive and false negative screening results) are important considerations when selecting a screening tool.

Research suggests that assessment of the duration of symptoms is also critical in detecting persistent levels of depression (i.e., Persistent Depressive Disorder [PDD]) that do not meet the threshold for Major Depressive Disorder (MDD). While many screening tools emphasize assessment of symptoms over the last two weeks in line with an MDD diagnosis (Kovacs, 2015; Richardson et al., 2010), the duration and frequency of depressive symptoms can vary greatly in adolescents, with depressive periods ranging from a few weeks to multiple years (Karlsson et al., 2007). For example, in a sample of over 300 adolescents with MDD, the most common episode duration was two months, but durations ranged from two weeks to ten years (Lewinsohn et al., 1994). Persistent depressive symptoms are particularly problematic for adolescents and have been shown to carry even greater impairment and mortality risk than MDD due to the longstanding nature of the depressive symptoms, even though less severe symptoms may be present (Alaie et al., 2021; Jonsson et al., 2011; Nobile et al., 2003). Children and adolescents with both MDD and PDD (i.e., “double depression”) have longer, more severe depressive episodes and greater risk for suicidality, comorbidity, and social impairment than those with either MDD or PDD alone (Nobile et al., 2003). Finally, both MDD and PDD have been characterized as recurrent disorders with varying rates of recovery (Birmaher et al., 2007; Kovacs et al., 1994). Thus, existing research suggests that duration and chronicity of depressive symptoms are critical pieces of information that a clinician must gather in order to inform treatment. However, given the potential ranges in duration and chronicity that are likely to result, measures must also determine the optimal duration and chronicity to be considered of clinical concern (i.e., indicative of a depressive disorder) to guide clinical practice.

Therefore, the current study aimed to validate a newly developed screening measure for depressive disorders in a sample of inpatient adolescents. Unlike existing screening measures for depression, this screening tool not only assesses the presence of depressed mood but also provides a standardized and efficient method for assessing the duration and consistency of these mood states, which are key in distinguishing between depressive diagnoses. To guide the use of this measure, the consistency of depressive symptoms that optimally screens for depressive diagnoses was first determined using empirical methods. It was then hypothesized that this new screening measure based on consistency of depression would be able to identify a group of individuals with elevated rates of depressive disorder diagnoses and suicidal behavior. Finally, the utility of the newly developed screening method was compared to the utility of an established screening tool based on depressive symptomology, and it was hypothesized that the consistency-based method would identify depressive diagnoses and suicidal behavior with similar or greater accuracy than the symptomology-based method.

Method

Participants

Data were collected from a sample of 396 adolescents (aged 12–17; M = 14.47) hospitalized at an inpatient psychiatric unit in the southeastern United States. This short-term unit (i.e., average length of stay is approximately 7 days) provides care to adolescents who present to the emergency room of a major public hospital with a variety of psychiatric emergencies. In the present sample, 84% of patients were admitted for suicidal behavior, 13% for homicidal/aggressive behavior, 2% for psychotic behavior, and 1% for other psychiatric concerns. This sample includes all adolescents admitted to the unit during the span of data collection (September 2020 to May 2022) who met inclusion criteria: 1) patients were assigned to the care of the primary investigator (i.e., incoming patients were assigned to one of two physicians, per standard hospital procedures, by hospital administrators who were not involved in the current project), 2) patients were determined to be adequate reporters of their mental status (i.e., individuals with severe psychosis/substance use, intellectual/developmental disabilities, or aggression/noncompliance were excluded), and 3) parental consent and patient assent were obtained. The sample was predominantly female (71%) and White (62%; see Table 1).

Table 1 Demographic and study variable descriptive characteristics

Procedure

Study procedures were approved by the hospital institutional review board (#2020-025). For all adolescents admitted to the unit, standard intake procedures were conducted by the primary investigator or a graduate student and included the brief depression screen (described below) and questions related to the patient’s medical, psychological, and developmental history. Parents or legal guardians of patients who were deemed to be study candidates were then contacted (often by phone) to provide informed consent. After receiving parental consent, eligible patients were informed of study procedures (e.g., the voluntary nature of participation, discharge planning being unaffected by participation) and asked to provide informed assent. After consent/assent was obtained by study staff (i.e., the primary investigator or a graduate/undergraduate student), participants then completed an additional semi-structured diagnostic interview with a graduate student who was blind to the results of the intake interview. Finally, participants completed self-report questionnaires with the assistance of a graduate or undergraduate student. The intake interview took place on the first day of the participant’s hospitalization, and remaining data collection often took place over the course of several days during the participant’s hospitalization depending on patient and staff availability.

Measures

Brief Adolescent Depression Screen (BADS)

The BADS is part of a larger screening interview for multiple disorders and refers to the section that assesses for the possibility of depressive disorders. The BADS consists of one screen or stem question followed by three questions assessing onset and consistency of depressive symptoms. Following a screen question of “Have you ever felt really, really sad or depressed?”, individuals who endorsed depression were asked, “When did you first experience feelings of depression?”. Individuals who reported onset of more than one year ago were then asked, “What percent of the time have you consistently felt depressed over the last twelve months?”, resulting in an answer between 0 and 100. Individuals who reported onset of less than one year ago were instead asked “What percent of the time have you consistently felt depressed from when it first began until now?”, which is used to inform clinical judgement but does not correspond to a diagnosis. All participants regardless of onset were then asked, “What percent of the time have you consistently felt depressed over the last two weeks?”. The screen for MDD is based on a composite of the consistency over the past two weeks and the past year, with the intention of screening for a wide range of depressive episodes rather than only the most recent. The screen for PDD is based on the consistency over the past year. Although the duration of administration was not systematically measured in the current study, administration of the BADS takes approximately one minute according to clinician estimate.
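The branching structure of the BADS described above can be sketched in a few lines of Python. This is only an illustration of the question flow, not the authors' administration materials: the function name `bads_follow_ups` and the question keys are hypothetical, encoding onset in months is an assumption, and treating an onset of exactly twelve months as belonging to the one-year branch is also an assumption, since the text does not specify that boundary.

```python
from typing import List, Optional


def bads_follow_ups(endorsed: bool, onset_months_ago: Optional[float]) -> List[str]:
    """Return the keys of the consistency questions asked after the stem and onset questions.

    endorsed: answer to the stem question
        ("Have you ever felt really, really sad or depressed?").
    onset_months_ago: answer to the onset question, in months (hypothetical encoding).
    """
    if not endorsed:
        # Assumption: no follow-up questions are asked when the stem is not endorsed.
        return []
    if onset_months_ago is None:
        raise ValueError("onset is required when depression is endorsed")
    if onset_months_ago >= 12:
        # Assumption: exactly 12 months is treated as the one-year branch.
        first = "pct_last_twelve_months"
    else:
        # Used to inform clinical judgement only; not tied to a diagnosis.
        first = "pct_since_onset"
    # The two-week consistency question is asked regardless of onset.
    return [first, "pct_last_two_weeks"]


print(bads_follow_ups(True, 30))
print(bads_follow_ups(True, 4))
print(bads_follow_ups(False, None))
```

Each returned key stands for one "percent of the time" question answered on a 0 to 100 scale.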

Behavior Assessment System for Children: 3rd Edition (BASC), Depression Scale

Depression was also screened using the adolescent self-report version of the BASC (Reynolds & Kamphaus, 2015). Participants completed the full measure, but only the depression subscale was used in current analyses. This subscale consists of 12 items assessing depressive symptoms (e.g., “I feel depressed”, “I feel life isn’t worth living”), which are rated either on a four-point scale from “never” (1) to “almost always” (4) or as “true” (1) or “false” (0). This scale showed good internal consistency in the current sample (Cronbach’s α = 0.90). Using the BASC-3 scoring software, raw scores were converted to T-scores (M = 50, SD = 10) based on gender-combined age-based norms. T-scores of at least 70 (i.e., 2 standard deviations above the mean) are typically considered to indicate significant clinical risk for disorder (Reynolds & Kamphaus, 2015).

Kiddie Schedule for Affective Disorders and Schizophrenia: Present Lifetime Version (KSADS)

The depressive disorders supplement of the KSADS (Kaufman et al., 1997, 2016) assessed for MDD and PDD diagnoses according to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5; APA, 2013) criteria. First, the depression screen questions assessed for feelings of depression, irritability, anhedonia, suicidal ideation, suicide attempts, and non-suicidal self-injury (NSSI). If established criteria were met for any of these screen questions, the depression supplement was administered, which assessed full criteria for both MDD and PDD. Individuals with periods of depression, irritability, or anhedonia lasting at least two weeks (either current episode or lifetime), along with the requisite number of diagnostic symptoms (e.g., sleep disturbance, appetite change), were diagnosed with MDD. Individuals with periods of depression lasting at least one year (either current episode or lifetime), along with the requisite number of diagnostic symptoms, were diagnosed with PDD.

Suicidal Behavior Assessment

As a part of standard intake procedures, history of suicidal behavior was assessed for each patient, including frequency of suicidal ideation (“never” = 0, “less than monthly” = 1, “monthly” = 2, “2-4x/month” = 3, “weekly” = 4, “daily” = 5), number of suicide attempts, and history of NSSI (“yes” = 1, “no” = 0). Due to the bimodal distribution of the frequency of suicidal ideation variable (see Table 1), this variable was dichotomized into “once a month or less” (n = 173) or “more than once a month” (n = 218) for analyses.
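The dichotomization of ideation frequency can be written out explicitly. The sketch below uses the response codes listed above; placing the split between “monthly” (2) and “2-4x/month” (3) is an assumption read off the category labels, and the names `IDEATION_CODES` and `ideation_more_than_monthly` are hypothetical.

```python
# Frequency codes as listed in the text.
IDEATION_CODES = {
    "never": 0,
    "less than monthly": 1,
    "monthly": 2,
    "2-4x/month": 3,
    "weekly": 4,
    "daily": 5,
}


def ideation_more_than_monthly(code: int) -> bool:
    """Return True for "more than once a month", False for "once a month or less"."""
    if code not in IDEATION_CODES.values():
        raise ValueError(f"unknown frequency code: {code}")
    # Assumption: codes 0-2 are "once a month or less"; codes 3-5 are "more than once a month".
    return code >= 3


print(ideation_more_than_monthly(IDEATION_CODES["monthly"]))
print(ideation_more_than_monthly(IDEATION_CODES["weekly"]))
```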

Results

Empirically Derived Screening Cutoffs

To determine the optimal cutoff for a positive screen on the BADS, receiver operating characteristic (ROC) curve analyses were conducted to determine the point (i.e., level of depressive consistency) that optimally screened for a diagnosis of MDD or PDD on the KSADS. The highest percent of the one-year and two-week depression consistency questions from the BADS was used to screen for MDD; for the purposes of these analyses, those who did not endorse depression on the screen question were coded as having 0 percent depression over the past year and two weeks, and those who had been depressed for less than a year were coded based only on their percent in the last two weeks. Individuals with missing data on either variable were excluded from these analyses. ROC curves were used to calculate Youden’s indices (sensitivity + specificity − 1; i.e., a measure of the accuracy for a specific cutoff) for all possible values in order to find the cutoff value (0–100) with maximal sensitivity and specificity. The overall area under the curve (AUC) was also calculated to provide an estimate of the overall accuracy of the screener across all cutoffs.
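The coding rules and the cutoff search described above can be illustrated with a small, self-contained sketch. It is not the authors' analysis code: the function names are hypothetical, the data are invented toy values rather than study data, and a positive screen is assumed to mean a score at or above the cutoff. Ties in Youden's index are resolved in favor of the lowest cutoff, which is one reading of "optimal minimum cutoff".

```python
from typing import Optional, Sequence, Tuple


def mdd_screen_score(endorsed: bool, over_one_year: bool,
                     pct_last_year: Optional[float],
                     pct_last_two_weeks: Optional[float]) -> float:
    """Composite consistency score used to screen for MDD (complete data assumed)."""
    if not endorsed:
        return 0.0  # stem question not endorsed: coded as 0 percent
    if not over_one_year:
        return float(pct_last_two_weeks)  # depressed under a year: two-week value only
    return float(max(pct_last_year, pct_last_two_weeks))  # highest of the two percents


def youden_index(scores: Sequence[float], diagnoses: Sequence[bool], cutoff: float) -> float:
    """Sensitivity + specificity - 1, with a positive screen defined as score >= cutoff."""
    tp = sum(1 for s, d in zip(scores, diagnoses) if d and s >= cutoff)
    fn = sum(1 for s, d in zip(scores, diagnoses) if d and s < cutoff)
    tn = sum(1 for s, d in zip(scores, diagnoses) if not d and s < cutoff)
    fp = sum(1 for s, d in zip(scores, diagnoses) if not d and s >= cutoff)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def best_cutoff(scores: Sequence[float], diagnoses: Sequence[bool]) -> Tuple[int, float]:
    """Search every integer cutoff from 0 to 100; max() keeps the lowest cutoff on ties."""
    best = max(range(0, 101), key=lambda c: youden_index(scores, diagnoses, c))
    return best, youden_index(scores, diagnoses, best)


# Invented toy data: screen scores and reference-standard diagnoses.
toy_scores = [0, 10, 30, 50, 70, 80, 90, 100]
toy_diagnoses = [False, False, False, True, False, True, True, True]
print(best_cutoff(toy_scores, toy_diagnoses))
```

Because many adjacent cutoffs classify a sample identically, the search returns the lowest member of the best-performing range; rounding to a clinically convenient value, as done in the study, is a separate step.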

A cutoff of being depressed 67% of the time over the last two weeks or the last year according to the BADS was determined to be the optimal minimum cutoff (Youden’s index = 0.461) for correctly identifying those who met criteria for MDD on the KSADS. For ease of clinician use and because most individuals (96%) reported values that were multiples of five, this cutoff was rounded up to 70% (Youden’s index = 0.456; rounding down to 65% resulted in a slightly lower Youden’s index of 0.450). Therefore, individuals who reported being depressed at least 70% of the time over either the last year or last two weeks were considered to have a positive screen for MDD according to the BADS. The AUC for these analyses was 0.769, indicating that this screener can classify individuals with acceptable accuracy (i.e., there is a 77% chance that a randomly selected individual with MDD reports a higher level of depressive consistency than a randomly selected individual without MDD). Similar analyses were conducted testing the optimal cutoff for the BASC depression scale, and a T-score of 70 was confirmed as the optimal cutoff in identifying KSADS MDD diagnoses (Youden’s index = 0.343, AUC = 0.712).

To determine the optimal cutoff for PDD, percent of time depressed over the last year was used to screen for KSADS PDD diagnoses. For the purposes of these analyses, those who were depressed less than one year were coded as having 0 percent depression for this variable. A cutoff of being depressed 55% of the time over the last year according to the BADS was determined to be the optimal minimum cutoff, resulting in a Youden’s index of 0.449. Therefore, individuals who reported being depressed at least 55% of the time over the last year were considered to have a positive screen for PDD. Because individuals who reported being depressed 70% of the time or more for the last year also showed a positive screen for MDD, those individuals would have a positive screen for both MDD and PDD (i.e., PDD with persistent or intermittent major depressive episodes). The AUC for these analyses was 0.764. Further, the BASC depression scale had an AUC of 0.729 in screening for a diagnosis of PDD on the KSADS, and a T-score of 70 was confirmed as the optimal cutoff (Youden’s index = 0.385).
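Taken together, the two rounded cutoffs define a simple decision rule, sketched below. The function and constant names are hypothetical, and the inputs are assumed to follow the coding rules described in this section (0 percent for the past year when depression began less than a year ago or was not endorsed).

```python
from typing import Tuple

MDD_CUTOFF = 70  # percent of time depressed, last year OR last two weeks
PDD_CUTOFF = 55  # percent of time depressed, last year only


def bads_screen(pct_last_year: float, pct_last_two_weeks: float) -> Tuple[bool, bool]:
    """Return (mdd_positive, pdd_positive) for one respondent."""
    mdd_positive = max(pct_last_year, pct_last_two_weeks) >= MDD_CUTOFF
    pdd_positive = pct_last_year >= PDD_CUTOFF
    return mdd_positive, pdd_positive


def screen_label(pct_last_year: float, pct_last_two_weeks: float) -> str:
    """Map the two screens onto the four groups reported in the text."""
    mdd_positive, pdd_positive = bads_screen(pct_last_year, pct_last_two_weeks)
    if mdd_positive and pdd_positive:
        return "MDD and PDD"
    if mdd_positive:
        return "MDD only"
    if pdd_positive:
        return "PDD only"
    return "neither"


for year, two_weeks in [(80, 20), (60, 30), (0, 90), (40, 50)]:
    print(year, two_weeks, screen_label(year, two_weeks))
```

Under this rule, a "PDD only" result can arise only when the past-year value falls between 55 and 69 and the two-week value is below 70.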

Based on these BADS cutoffs, 142 individuals (37%) showed a positive screen for both MDD and PDD, 59 individuals (15%) had a positive screen for MDD only, 14 individuals (4%) had a positive screen for PDD only, and 168 individuals (44%) did not have a positive screen for either MDD or PDD. Thirteen individuals had missing data on the BADS. On the BASC, 192 individuals (51%) had a positive screen for depressive diagnoses, 185 individuals (49%) had a negative screen for depressive diagnoses, and 18 individuals had missing data.

Agreement Between Depressive Screening Tools and KSADS Depressive Diagnoses

Agreement between BADS MDD screening and KSADS MDD diagnoses is provided in Table 2. The two measures agreed on MDD diagnosis or non-diagnosis in 73% of cases. BASC depression screening and KSADS MDD diagnosis (also presented in Table 2) agreed in 67% of cases. Further indicators of agreement or association (i.e., sensitivity, specificity, positive predictive value, negative predictive value, chi-square, and phi) between the two screening measures and KSADS MDD diagnoses are displayed in Table 3. The BADS was similar to or better than the BASC in identifying KSADS MDD diagnoses according to all indicators.
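For readers less familiar with these indicators, the sketch below shows how each is computed from a 2 × 2 table of screen results against diagnoses. The counts are invented for illustration and are not taken from Table 2; the function name is hypothetical, and the chi-square shown is the uncorrected Pearson statistic, which for a 2 × 2 table equals N times phi squared.

```python
import math
from typing import Dict


def agreement_indicators(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Agreement indicators for a screen (rows) against a reference diagnosis (columns).

    tp: screen positive, diagnosis present    fp: screen positive, diagnosis absent
    fn: screen negative, diagnosis present    tn: screen negative, diagnosis absent
    """
    n = tp + fp + fn + tn
    phi = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (fn + tn) * (tp + fn) * (fp + tn)
    )
    return {
        "agreement": (tp + tn) / n,        # proportion of cases where the two agree
        "sensitivity": tp / (tp + fn),     # diagnosed cases caught by the screen
        "specificity": tn / (tn + fp),     # non-cases correctly screened negative
        "ppv": tp / (tp + fp),             # positive screens that are true cases
        "npv": tn / (tn + fn),             # negative screens that are true non-cases
        "phi": phi,
        "chi_square": n * phi ** 2,        # Pearson chi-square, 1 df, no continuity correction
    }


# Invented counts for illustration only.
for name, value in agreement_indicators(tp=40, fp=10, fn=20, tn=30).items():
    print(f"{name}: {value:.3f}")
```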

Table 2 Agreement between depressive diagnostic and screening tools
Table 3 Agreement indicators of depressive screening tools

Agreement between BADS PDD screening and KSADS PDD diagnoses is provided in Table 2. The two measures agreed on PDD diagnoses in 73% of cases, while depressive screens on the BASC agreed with KSADS PDD diagnosis in 68% of cases (see Table 2). Table 3 also displays agreement indicators for identifying KSADS PDD diagnoses. The BASC displayed higher sensitivity and negative predictive value than the BADS, but at the cost of lower specificity and positive predictive value due to a high number of false positives (i.e., 83 individuals were identified by the BASC but did not meet KSADS PDD criteria).

Agreement between a composite BADS positive screen (for either MDD or PDD) and composite KSADS diagnoses (of either MDD or PDD) is also provided in Table 2. The two measures agreed on a depressive diagnosis in 73% of cases, while depressive screens on the BASC agreed with composite KSADS diagnoses in 70% of cases (see Table 2). The BADS composite depressive screen outperformed the BASC in identifying composite depressive diagnoses on the KSADS according to all indicators with the exception of specificity (see Table 3).

Agreement Between Depressive Screening Tools and History of Suicidal Behavior

Of those identified by the BADS as screening positive for MDD, 70% reported a history of NSSI (χ²[1] = 25.42, Φ = 0.26, p < 0.001). Of those identified as a positive screen for PDD according to the BADS, 74% reported a history of NSSI (χ²[1] = 27.75, Φ = 0.27, p < 0.001). Meanwhile, of those identified by the BASC depressive screening, 67% reported a history of NSSI (χ²[1] = 13.12, Φ = 0.19, p < 0.001). Of those identified as having a positive MDD or PDD screen on the BADS, 76% and 75%, respectively, reported having suicidal ideation more than once a month (MDD: χ²[1] = 63.17, Φ = 0.41, p < 0.001; PDD: χ²[1] = 39.12, Φ = 0.32, p < 0.001). Of those identified with a positive screen by the BASC, 74% reported having suicidal ideation more than once a month (χ²[1] = 49.51, Φ = 0.36, p < 0.001). Those with a positive BADS MDD screen had an average of 1.86 suicide attempts (compared to 1.07 attempts for those with a negative screen, t[312] = −2.82, p < 0.01) and those with a positive BADS PDD screen had an average of 2.21 suicide attempts (compared to 1.00 attempts for those with a negative screen, t[198] = −3.68, p < 0.001). Those with a positive BASC depressive screen had an average of 1.91 suicide attempts, compared to 1.10 attempts for those with a negative screen (t[328] = −2.75, p < 0.01).

Relative Clinical Utility of Depressive Screening Tools

The ability of each screening measure (i.e., the BADS or BASC) to identify KSADS diagnoses and suicidal behavior was tested using regression analyses controlling for age, sex, and race/ethnicity. As shown in Table 4, all diagnostic methods (BADS MDD, BADS PDD, BADS DEP, and BASC DEP) were associated with significant differences in all outcomes. Effect sizes (i.e., odds ratios) for the BADS diagnoses were approximately equal to or greater than effect sizes for the BASC.

Table 4 Clinical utility of BADS and BASC

Because the BASC screening was significantly associated with all outcomes, the incremental validity of the BADS screening over and above the screening utility of the BASC was also tested using regression analyses, which controlled for age, sex, race/ethnicity, and the BASC screening outcome (Table 5). After accounting for the BASC, the BADS continued to be associated with statistically significant differences in all outcomes, suggesting that the BADS screening offered incremental validity over and above the BASC in screening for KSADS diagnoses and suicidal behavior.

Table 5 Incremental clinical utility of the BADS after accounting for the BASC

Discussion

With the acknowledged importance of evidence-based assessments, there remains significant need for screening measures for psychopathology that are appropriate and validated for use in inpatient settings with high demands on clinician time. Screening tools for depression are particularly needed, given the association between depression and suicidal risk. To address this need, the present study presents initial validation of a new screening measure for depressive disorders in inpatient adolescents. This measure, the BADS, is extremely brief and can be administered with only minimal training. Further, it assesses duration of depressed mood, an important dimension that is often missed by other screenings and is necessary to inform clinical practice. Despite its brevity, the BADS showed strong utility in screening for depressive diagnoses according to a semi-structured diagnostic interview and in screening for a history of suicidal behaviors. The clinical utility of the time-efficient BADS was comparable to or greater than a standard rating scale and offered incremental validity over and above this measure.

An important aspect of current analyses was the determination of the level of duration and consistency of depressive symptoms that optimally screens for a depressive disorder. A possible reason why duration is not considered in other depressive screening tools is that it adds a level of complexity in determining what level of duration should be considered clinically significant (i.e., a positive screen). ROC analyses indicated that being depressed 70% of the time over the last two weeks or the last year was an optimal cutoff for screening for MDD, and being depressed 55% of the time over the last year was an optimal cutoff for screening for PDD. These findings could have important implications for other tools assessing depression in adolescents as well. Given that adolescent depressive symptoms can vary widely in duration and consistency (Karlsson et al., 2007; Lewinsohn et al., 1994), these findings can guide clinicians in determining the pervasiveness of depressive symptoms that optimally indicates the presence of a depressive disorder.

Importantly, our simple screening using the BADS performed as well as or better than a commercially available rating scale, the BASC, in screening for both depressive diagnoses and suicidal behavior. Because the BADS is brief and easy to use, this finding suggests that it could be a particularly useful tool in clinical settings, such as inpatient units, that face significant demands on time and resources. It is important to note, though, that the current study limited its focus to only one rating scale, and there are several other commonly used rating scales for depressive screening, such as the CDI (Kovacs, 2015) and the CES-D (Radloff, 1977). Similar comparisons of the BADS to these other screening measures are warranted. Further, the BADS, like other screening measures, produces a number of false positives (12–15% depending on the diagnosis) that are not supported by KSADS diagnoses, emphasizing the importance of using this tool as a screening measure and not a replacement for a full diagnostic interview.

While current findings have important implications, they need to be interpreted in light of several limitations to the study. While the inpatient adolescent sample allowed us to investigate the ability to identify clinical diagnoses in a high-risk sample, it is possible that the screening accuracy of the BADS may not be the same in samples with lower base rates of depression and suicidal behavior. That is, the accuracy of screening measures can vary greatly depending on the base rates of the outcome across samples (Wilson & Reichmuth, 1985). An additional limitation of the adolescent inpatient sample was the high degree of overlap between MDD and PDD. On the KSADS, only 32 individuals diagnosed with PDD did not also have MDD, and only 14 individuals with a positive PDD screen on the BADS did not also have a positive MDD screen. Thus, the BADS MDD and PDD screening did not offer discriminant validity in their ability to identify KSADS MDD and PDD diagnoses, which may not be the case in samples with less overlap in diagnoses. For example, individuals may be most likely to be hospitalized when in the midst of a major depressive episode and, as a result, those with PDD may report significantly higher levels of depression in the two weeks leading up to psychiatric hospitalization. Thus, the clinical utility of the BADS and its ability to discriminate between MDD and PDD require further study in non-inpatient samples. Further, the current sample was predominantly female and White, which, despite being similar to other inpatient adolescent samples (particularly those with high rates of suicidal behavior; Millon et al., 2022; Patel et al., 2020), does indicate that these findings need to be replicated in samples with diverse demographic characteristics to determine their generalizability. These analyses were also cross-sectional, taking place during a limited hospitalization of the adolescent. Of particular note is that the measures of suicidal behavior assessed past suicidal behavior prior to hospitalization. While past suicidal behavior is consistently the best predictor of future suicidal behavior in adolescents (Barzilay & Apter, 2014), the ability of the BADS to predict future suicidal behavior could not be tested in the current study. In addition, as noted above, the BADS was only compared to a single other rating scale assessing depression, and other interview methods of screening for depression were not included in the current study. We were also only able to investigate the validity of the self-report of the BADS, which research has shown to be the most valid method for assessing depression in adolescents (Frick et al., 2020). However, further research should test whether the addition of other informants improves its validity.

Finally, it should be noted that the entirety of data collection took place following the onset of the COVID-19 pandemic in the United States, which may limit the generalizability of findings. Investigations of adolescent inpatient psychiatric hospitalizations have revealed that depression is more severe for individuals hospitalized after the onset of the pandemic in comparison to pre-pandemic hospitalizations, potentially as a result of social distancing restrictions such as school closures and limited social interaction (Ramirez et al., 2022). As such, the present sample may display more severe depression and a higher base rate of depressive diagnoses than other inpatient samples. The severity of depression may also have fluctuated throughout data collection as social distancing measures decreased in the years following pandemic onset. In the current study, we conducted follow-up analyses to investigate whether time since the onset of the COVID-19 pandemic (which, in the current location, was approximately mid-March 2020) was correlated with severity of depression. The number of months between the onset of the pandemic and admission date was not significantly correlated with any measure of depression. Therefore, while we were unable to compare our findings pre- and post-pandemic, these results suggest that there were no major differences in depression from the beginning of the pandemic to approximately two years later (when a majority of social distancing restrictions had been removed).

Within the context of these limitations, current findings support further validation of the BADS as a quick and efficient screening tool for depression. In particular, it shows promise for use in assessing this critically important outcome, in a way that is feasible within the time, cost, and staff limitations faced by many clinical settings, including inpatient hospitals. More broadly, these results illustrate the great potential of evidence-based assessment in helping to bridge the gap between research and practice by providing tools that can greatly enhance the clinical care for youth who are in critical need of effective services.