1 Introduction

Depression is a multifaceted and debilitating mental health disorder characterized by persistent dysphoric mood, including profound sadness, feelings of hopelessness, and a marked loss of pleasure or interest in routine activities [1]. According to the World Health Organization (WHO), approximately 5% of adults worldwide (4% of men and 6% of women) suffer from depression [2]. Beyond ordinary mood fluctuations, depression affects cognitive, affective, and behavioral domains and significantly impairs daily functioning [3]. Common symptoms include disrupted sleep, changes in appetite, decreased energy, difficulty concentrating, and indecisiveness [4]. The etiology of depression is complex, involving a confluence of genetic, neurobiological, environmental, and psychosocial determinants [5]. Depression manifests not only through cognitive and emotional dimensions but also through somatic ones, and affected individuals frequently exhibit concurrent physical symptoms [6, 7]. Given the considerable variation in symptomatology and the harmful effects on overall well-being, a holistic strategy for addressing depression combines medication and therapy, which highlights the importance of expert evaluation and support in achieving the best treatment results [8, 9].

Existing depression scales encompass a variety of instruments designed to assess and quantify depressive symptoms across different populations. The Beck Depression Inventory (BDI) [10] and Patient Health Questionnaire (PHQ-9) [11] are self-report measures, while the Hamilton Depression Rating Scale (HAM-D) [12] is a clinician-administered rating scale for symptom evaluation. For elderly individuals, the Geriatric Depression Scale (GDS) [13] is valuable, and the Zung Self-Rating Depression Scale (SDS) [14] offers a self-administered option. The Center for Epidemiologic Studies Depression Scale (CES-D) [15] targets the general population, while the Montgomery-Åsberg Depression Rating Scale (MADRS) [16] aids in severity assessment. Additionally, the Quick Inventory of Depressive Symptomatology (QIDS) [17] and the 21-item Depression Anxiety Stress Scales (DASS-21) [18] offer a broader assessment of emotional states. Together, these instruments support the multidimensional understanding and diagnosis of depression and help clinicians and researchers plan interventions and support strategies.

The absence of culturally sensitive and contextually appropriate screening tools contributes to the under-recognition of depression in Afghanistan. Existing depression assessment instruments, often developed in Western contexts, may not adequately capture the unique manifestations, cultural expressions, and determinants of depression in Afghan populations. Moreover, the stigma associated with mental illness further impedes help-seeking behaviors and exacerbates the treatment gap.

To address these challenges, there is a pressing need for a validated depression screening instrument tailored specifically to the Afghan context [19]. The development of such a tool requires careful consideration of cultural nuances, linguistic diversity, and socio-economic factors prevalent in Afghanistan [19]. By creating a culturally sensitive screening scale, mental health professionals can enhance the early detection of depression, facilitate timely interventions, and ultimately alleviate the burden of this debilitating condition in Afghan society [19].

Building on existing methodologies, our research introduces the Afghanistan National Depression Screening (ANDs) Scale, designed to align with Afghanistan’s distinct cultural and linguistic landscape. The scale is intended to overcome the limitations of generic depression assessments and to improve the detection of culturally specific manifestations of distress. Beyond advancing contextually nuanced mental health diagnostics, the ANDs Scale may also inform the development of adaptable, culturally sensitive assessment tools elsewhere. Its development involved adaptation, translation, validation, and psychometric evaluation, guided by established principles of cross-cultural assessment and collaborative research with local stakeholders.

2 Materials and methods

2.1 Methods

2.1.1 Procedure

A team of clinicians and researchers conducted a thorough review of existing scales for depression. Four scales were selected as the basis for developing the ANDs scale: DASS-21, CES-D 20, GHQ-28, and PHQ-9. The resulting questionnaire comprised a total of 85 items. Based on a comprehensive literature review and expert advice, the major components of depressive symptomatology were identified: depressed mood, feelings of helplessness, worthlessness, hopelessness, and psychomotor retardation. To capture the current state, respondents were asked, “During the past two weeks, how often have you…”. Responses were scored from zero (never) to three (almost always), reflecting the frequency of symptom occurrence. Initial tests on small convenience samples demonstrated the scale's performance and guided revisions to enhance clarity and acceptability. The final scale used in this study, consisting of 15 items, is detailed in Table 2. Scores on this scale can range from zero to 45 (15 items, each scored from zero to three), with higher scores indicating more frequent symptoms over the preceding two weeks.
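The scoring rule described above can be sketched in a few lines. This is a minimal illustration, assuming the total is an unweighted sum of the 15 item codes with no reverse-scored items (the text does not state otherwise); the function and constant names are hypothetical.

```python
from typing import Sequence

N_ITEMS = 15       # items in the final ANDs scale (from the text)
MIN_RESPONSE = 0   # "never"
MAX_RESPONSE = 3   # "almost always"


def ands_total_score(responses: Sequence[int]) -> int:
    """Sum 15 item responses, each coded 0-3, into a total between 0 and 45."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    for r in responses:
        if not MIN_RESPONSE <= r <= MAX_RESPONSE:
            raise ValueError(f"response {r} is outside {MIN_RESPONSE}-{MAX_RESPONSE}")
    return sum(responses)


print(ands_total_score([0] * 15))  # 0, the lowest possible total
print(ands_total_score([3] * 15))  # 45, the highest possible total
```

Rejecting incomplete or out-of-range response sets, rather than silently summing them, keeps every total on the stated 0 to 45 range.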

After items deemed suitable for inclusion had been selected, the questionnaire was administered in a self-rating format to a pilot cohort of 50 participants. The sample included both ordinary community residents without reported depressive symptoms or a history of mental disorders and individuals hospitalized for depression. Male and female participants were recruited from various parts of Herat city in Afghanistan, and all were aged 18 years or older. The results of the pilot study indicated that no changes were needed.

A comprehensive investigation was conducted to validate the measurement instruments, involving a convenience sample of 1245 adult participants from diverse regions of Afghanistan. Data were collected between June and September 2023, with careful attention to linguistic and cultural appropriateness. A total of 75 questionnaires were collected for each item in the ANDs scale. To account for potential incomplete responses, an additional 10% of questionnaires were also collected, resulting in a final sample size of 1245 participants. This approach aimed to provide a sufficiently large dataset for the psychometric testing and validation of the ANDs scale.
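The sample-size reasoning above amounts to simple arithmetic. The sketch below assumes that the 75-questionnaires-per-item rule applies to the 15 items of the final scale and that the 10% margin is rounded up to a whole questionnaire; the function name and parameters are illustrative only.

```python
import math


def minimum_sample_size(n_items: int, per_item: int, margin_percent: int) -> int:
    """Questionnaires per item times items, plus a rounded-up safety margin."""
    base = n_items * per_item
    extra = math.ceil(base * margin_percent / 100)
    return base + extra


required = minimum_sample_size(n_items=15, per_item=75, margin_percent=10)
print(required)          # 1238
print(1245 >= required)  # True
```

Under these assumptions the minimum is 1238 questionnaires, which the reported sample of 1245 exceeds.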

Skilled data collectors engaged in face-to-face interactions following rigorous training that emphasized the standardized administration of the ANDs scale. The training program addressed study objectives, questionnaire nuances, and ethical considerations, fostering uniformity in data collection procedures. Regular debriefing sessions facilitated ongoing feedback and refinement of methods. Consistency was maintained through standardized scripts, minimizing variations in question presentation. To mitigate biases, data collectors adopted a neutral and non-judgmental approach. Overall, this comprehensive training, coupled with standardized procedures and ongoing support, ensured the reliability and validity of collected data while minimizing potential sources of bias in participant responses.

In order to participate in the study, individuals were required to meet specific criteria, including providing informed consent, being 18 years old or older, and demonstrating the ability to read and comprehend the Dari language. These prerequisites were established to ensure the ethical inclusion of participants who could actively engage in the self-administered questionnaire, fostering a comprehensive and meaningful assessment of depressive symptomatology within the targeted population.

2.1.2 Analysis

Data were extracted from the questionnaires and organized in Excel for Windows. Statistical analyses were conducted using IBM SPSS Statistics (version 26) for Windows. Internal consistency reliability was assessed using Cronbach’s alpha, with values exceeding 0.70 considered satisfactory. Pearson correlation analyses were employed to assess convergent and criterion validity by comparing ANDs scores with the DASS-21, GHQ-28, CES-D 20, and PHQ-9. Test–retest reliability was evaluated using the Intraclass Correlation Coefficient, with a value of 0.70 or above deemed satisfactory. Exploratory factor analysis was performed using principal component analysis with oblimin rotation (Kaiser normalization) to examine the factor structure. The Kaiser–Meyer–Olkin (KMO) statistic and Bartlett’s test of sphericity were used to check sampling adequacy and the suitability of the data for factor analysis. Factor loadings greater than 0.30 were considered meaningful.
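For readers unfamiliar with the internal-consistency statistic, the following sketch shows how Cronbach's alpha is computed from a respondents-by-items table. It is not the SPSS procedure used in the study; it uses only the Python standard library, and the small data set is hypothetical and serves only to exercise the function.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of scores."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses (5 respondents x 3 items, coded 0-3); not study data.
toy = [[0, 0, 1], [1, 1, 1], [2, 1, 2], [3, 3, 3], [1, 2, 1]]
alpha = cronbach_alpha(toy)
print(f"alpha = {alpha:.3f}, satisfactory: {alpha > 0.70}")
```

The comparison against 0.70 mirrors the criterion stated above; alpha rises when items covary strongly relative to their individual variances.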

The validation process extended to confirmatory factor analysis, gauging the goodness-of-fit of the factor model of the ANDs scale within the Afghan population. The fit indices used were the minimum discrepancy over degrees of freedom (CMIN/df), root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), goodness-of-fit index (GFI), adjusted goodness-of-fit index (AGFI), comparative fit index (CFI), and Tucker-Lewis index (TLI), with the following criteria for acceptable fit: CMIN/df < 5; SRMR < 0.05; RMSEA < 0.08; GFI > 0.90; AGFI > 0.90; CFI > 0.95; and TLI > 0.95.
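The fit criteria listed above lend themselves to a mechanical check. The sketch below encodes the stated cut-offs and, as an example, applies them to the index values reported for the ANDs scale in the Results; the dictionary layout and the function name are illustrative assumptions, not part of the study's analysis pipeline.

```python
import operator

# Cut-offs as stated in the text: (comparison, threshold).
CRITERIA = {
    "CMIN/df": (operator.lt, 5.0),
    "SRMR": (operator.lt, 0.05),
    "RMSEA": (operator.lt, 0.08),
    "GFI": (operator.gt, 0.90),
    "AGFI": (operator.gt, 0.90),
    "CFI": (operator.gt, 0.95),
    "TLI": (operator.gt, 0.95),
}


def check_fit(indices: dict) -> dict:
    """Return, for each known index, whether it meets its stated cut-off."""
    results = {}
    for name, value in indices.items():
        if name in CRITERIA:
            compare, threshold = CRITERIA[name]
            results[name] = compare(value, threshold)
    return results


# Index values as reported in the Results section of this paper.
reported = {"CFI": 0.819, "TLI": 0.789, "AGFI": 0.868,
            "RMSEA": 0.087, "GFI": 0.901, "SRMR": 0.081}
for name, ok in check_fit(reported).items():
    print(f"{name}: {'meets' if ok else 'does not meet'} the criterion")
```

Indices without a stated cut-off (for example NFI) are ignored rather than judged against an assumed threshold.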

2.1.3 Ethical approval

The present investigation obtained ethical clearance from the Ethical Committee of the Afghanistan Center for Epidemiological Studies, as denoted by reference number #21.1.041.

2.2 Measures

2.2.1 Socio-demographics

A demographic information sheet covering age, gender, marital status, residency, economic standing, and exposure to a traumatic event within the past month was used to gather background information about the participants. Participants’ levels of depression were assessed using the Persian versions of the following scales:

2.2.2 DASS-21

The Dari version of the DASS-21 [18] is a 21-item instrument used to assess levels of depression, anxiety, and stress. Responses are given on a four-point scale, with subscale scores calculated independently. Higher scores indicate greater severity of depression, anxiety, or stress. The DASS-21 has shown acceptable reliability, with Cronbach's α coefficients of 0.79 for anxiety, 0.91 for stress, and 0.93 for depression [20].

2.2.3 GHQ-28

The Persian version of the GHQ-28 [21] is a self-report tool designed to evaluate psychological well-being and identify potential mental health issues. Participants rate their symptoms on 28 items using a four-point scale, generating scores ranging from 0 to 21 for each of the four seven-item subscales: somatic symptoms, anxiety and insomnia, social dysfunction, and severe depression [22].
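The subscale scoring described above can be illustrated as follows. The sketch assumes 0 to 3 coding per item and that the four subscales occupy consecutive blocks of seven items in the order listed; that item ordering is an assumption made here for illustration and should be checked against the questionnaire actually administered.

```python
from typing import Dict, Sequence

SUBSCALES = (
    "somatic symptoms",
    "anxiety and insomnia",
    "social dysfunction",
    "severe depression",
)
ITEMS_PER_SUBSCALE = 7  # 28 items / 4 subscales


def ghq28_subscale_scores(responses: Sequence[int]) -> Dict[str, int]:
    """Sum 0-3 item codes into four subscale scores, each between 0 and 21."""
    if len(responses) != len(SUBSCALES) * ITEMS_PER_SUBSCALE:
        raise ValueError("expected 28 responses")
    if any(r < 0 or r > 3 for r in responses):
        raise ValueError("responses must be coded 0-3")
    scores = {}
    for index, name in enumerate(SUBSCALES):
        start = index * ITEMS_PER_SUBSCALE  # assumed consecutive item blocks
        scores[name] = sum(responses[start:start + ITEMS_PER_SUBSCALE])
    return scores


print(ghq28_subscale_scores([3] * 28)["severe depression"])  # 21
```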

2.2.4 CES-D 20

The CES-D 20 is a self-report questionnaire used to gauge the presence and intensity of depressive symptoms [15]. It comprises 20 items with responses on a four-point scale. Scores range from 0 to 60, with higher scores indicating more severe symptoms. Covering a range of depressive symptoms, it offers a thorough assessment for clinical and research use. Translations, such as the Dari version, extend its utility across diverse linguistic and cultural settings [23].

2.2.5 PHQ-9

The Persian version of the PHQ-9 [24] is a concise self-report tool crafted to evaluate the presence and intensity of depressive symptoms [11]. With nine items, participants rate symptom frequency over the past two weeks on a four-point scale. Scores range from 0 to 27, with higher scores indicating greater depressive symptom severity.

3 Results

3.1 Characteristics

This study involved 1,245 participants, with a mean age of 32.49 years (SD = 13.24). Slightly more than half of the participants were female (n = 625; 50.2%), and most were married (n = 742; 59.6%). Less than half of the participants came from rural areas (n = 512; 41.1%) (Table 1).

Table 1 Characteristics of participants

Table 2 displays the frequency of item responses, along with Classical Test Theory (CTT) statistics. Respondents endorsed all four response categories. Notably, a substantial proportion (25.9%) acknowledged occasional experiences of being unable to become enthusiastic about anything. Approximately half of the participants (45.8%) reported no instances of thinking their life had been a failure in the past week. Similarly, about half of the respondents (45.1%) indicated an absence of feelings of loneliness in the past week. The overall internal consistency of the 15-item scale, as measured by Cronbach’s alpha, was 0.846, indicating good reliability (Table 2).

Table 2 Abbreviated item content, response category percentages, and classical test theory statistics of ANDs Scale items

The Intraclass Correlation Coefficient (ICC) for the ANDs scale was 0.178 (95% CI 0.150–0.214), which is below the 0.70 criterion set for test–retest reliability in the Analysis section; the accompanying F test was significant [F(1, 223) = 7.517, p < 0.001]. The Spearman-Brown coefficient, a measure of internal consistency, was 0.975, suggesting high reliability between the two rounds of assessment (Table 3).

Table 3 Examining the intraclass correlation coefficient between two rounds of the ANDs scale (n = 224)
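The Spearman-Brown coefficient reported above is conventionally obtained from the correlation between two parts (here, presumably the two rounds of assessment). The sketch below shows the standard formula and its inverse; whether the study's software applied exactly this two-part form is an assumption, and the function names are illustrative.

```python
def spearman_brown(r: float, n_parts: int = 2) -> float:
    """Spearman-Brown coefficient from the correlation r between parts."""
    return n_parts * r / (1 + (n_parts - 1) * r)


def implied_correlation(r_sb: float) -> float:
    """Invert the two-part formula: r = r_sb / (2 - r_sb)."""
    return r_sb / (2 - r_sb)


# Correlation between rounds implied by the reported coefficient of 0.975,
# if (and only if) the two-part formula was used.
r = implied_correlation(0.975)
print(round(r, 3))                  # 0.951
print(round(spearman_brown(r), 3))  # 0.975
```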

As shown in Table 4, significant positive correlations were observed between the ANDs scale and related measures, providing evidence for both convergent and criterion validity. Strong correlations were found with the DASS-21 depression subscale (r = 0.854, p < 0.001), the GHQ-28 depression subscale (r = 0.693, p < 0.001), the CES-D 20 (r = 0.922, p < 0.001), and the PHQ-9 (r = 0.758, p < 0.001). These findings support the construct validity of the ANDs scale in relation to established measures of depression.

Table 4 Pearson correlation between ANDs, DASS-21-depression, GHQ-28-depression, CES-D 20, and PHQ-9
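For reference, the Pearson coefficient underlying Table 4 can be computed as in the sketch below. It uses only the standard library; the two short score lists are hypothetical and are not study data.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical total scores on two scales for six respondents.
ands_totals = [5, 12, 20, 8, 30, 17]
phq9_totals = [3, 7, 14, 6, 19, 9]
print(f"r = {pearson_r(ands_totals, phq9_totals):.3f}")
```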

The exploratory factor analysis revealed the factor loadings and communalities (h²) for each item. Factor loadings represent the strength and direction of the relationship between items and factors, while communalities indicate the proportion of variance in each item explained by the factors. Notably, item 12 demonstrated a high factor loading of 0.725, suggesting a strong association with the underlying factor. The overall pattern of factor loadings supports the construct validity of the scale, indicating that the items align well with the underlying factors (Table 5).

Table 5 Results from the exploratory factor analysis
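The relation between loadings and communalities mentioned above can be made concrete. In the sketch below, an item's communality is the sum of its squared loadings; this holds for a single-factor or orthogonal solution and is only an approximation here, since with an oblique (oblimin) rotation and several factors the factor correlations also enter. The function name is illustrative.

```python
from typing import Sequence


def communality(loadings: Sequence[float]) -> float:
    """Sum of squared loadings of one item (uncorrelated-factor assumption)."""
    return sum(loading ** 2 for loading in loadings)


# Item 12 loads 0.725 on its factor; with that factor alone,
# roughly half of the item's variance would be explained.
print(round(communality([0.725]), 3))  # 0.526
```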

In the examination of the ANDs scale's dimensionality through confirmatory factor analysis, the fit indices revealed a mixed fit. Specifically, CFI was 0.819, TLI was 0.789, AGFI was 0.868, RMSEA was 0.087, NFI was 0.804, GFI was 0.901, and SRMR was 0.081. Measured against the criteria set out in the Analysis section, only GFI met its threshold: CFI, TLI, AGFI, and NFI fell below conventional thresholds, and RMSEA and SRMR exceeded their cut-offs of 0.08 and 0.05, respectively. The chi-square test (χ²) was statistically significant (p < 0.001) (Table 6) (Fig. 1).

Table 6 Examining the dimensionality of ANDs scale using confirmatory factor analysis
Fig. 1 Confirmatory analysis model with factor loadings and correlations for the ANDs scale

4 Discussion

The current investigation introduced the Afghanistan National Depression Screening (ANDs) Scale. Results revealed a consistent unidimensional framework and robust psychometric characteristics of the ANDs scale. Initial psychometric evaluations under Classical Test Theory indicated favorable properties. Additionally, the cumulative score of the scale's items reflects the severity of depression, as evidenced by its positive correlations with scores on the DASS-21 depression subscale, CES-D 20, GHQ-28 depression subscale, and PHQ-9. Hence, higher scores on the ANDs scale correspond to more pronounced depressive symptoms, suggesting its utility in assessing and addressing psychological concerns in both men and women.

Various research studies have elucidated the nuanced impact of psychological responses on the health and overall well-being of individuals, highlighting the influence of diverse contextual and cultural factors [25,26,27,28]. The assessment of convergent validity, employing established measures such as the DASS-21 depression subscale, CES-D 20, GHQ-28 depression subscale, and PHQ-9, underscores the validity of the ANDs scale as a reliable instrument for gauging both depression symptoms and the broader construct of depression.

The results of the item analysis indicated that the items on the ANDs Scale exhibited robust discrimination indices, as evidenced by the corrected item-total correlation coefficients. Such indices provide empirical support for the scale's effectiveness in differentiating between individuals who score high and those who score low. This capability is crucial for ensuring that the scale accurately reflects varying degrees of the attribute it measures. Comparative analysis with other research on depression scales and questionnaires further corroborates the strength of item discrimination in assessment tools employed in the development of the ANDs Scale [29,30,31,32]. These findings collectively underscore the precision and reliability of the ANDs Scale in assessing nuances within individual responses, thus enhancing its utility in both clinical and research settings.

The findings of the current investigation revealed that the ANDs Scale is characterized by a unidimensional factor structure. This aligns with the configurations documented in several established depression assessment instruments, including the CES-D 20 [15], the BDI [10], the PHQ-9 [11], the HAM-D [12], and the MADRS [16]. Furthermore, the analysis indicated that each item on the ANDs Scale demonstrated robust factor loadings, a finding that echoes the structural integrity observed in the QIDS [17]. These consistencies underline the methodological soundness of the ANDs Scale in assessing symptoms of depression, thereby reinforcing its utility in both clinical and research contexts.

The ANDs Scale exhibited acceptable test–retest reliability, comparable to that reported for other established measures such as the DASS-21 by Osman et al. [33], the CES-D 20 by Ohno et al. [34], the GHQ-28 by Roty et al. [35], and the PHQ-9 by Son et al. [36].

The ANDs Scale demonstrated a high degree of correlation with several well-established measures, including CES-D 20, GHQ-28, PHQ-9, and DASS-21. Such correlations are consistent with patterns observed with other recognized scales in the field. For instance, similar associations have been documented with the Brief Psychiatric Rating Scale [37] and the Behavioral Activation for Depression Scale (BADS) [38]. This convergence in correlations substantiates the scale’s validity and underscores its comparative reliability alongside these established instruments, thereby confirming its efficacy in capturing relevant psychological dimensions parallel to those assessed by the referenced scales.

This study is subject to several limitations. Firstly, participants were drawn from the general Afghan population without formal diagnostic assessment for mood disorders such as depression. Consequently, the scale's sensitivity and specificity could not be assessed. Secondly, depression symptoms are subjective and were measured by self-report, so responses may have been influenced by social desirability bias. Lastly, the use of convenience sampling limits the extent to which the findings of the present study can be generalized.

5 Conclusion and implications

The ANDs scale, consisting of 15 items, was developed to assess depression within the Afghan population. Its alignment with established measures such as the DSM-5-based PHQ-9, as well as other widely used depression measures like the DASS-21, CES-D 20, and GHQ-28, suggests its potential utility in identifying depression symptomatology. The implications of the ANDs scale for clinical practice and public health are significant. Clinically, it could improve early detection and diagnosis of depression in the Afghan population, particularly by addressing cultural nuances in symptom expression. This may allow for better monitoring and tailoring of treatment. In public health terms, the scale supports efforts to enhance mental health services in Afghanistan, aiding policy development and resource allocation to improve healthcare worker training and infrastructure. However, these findings need to be replicated across diverse clinical and demographic samples to establish the generalizability of the one-factor solution identified in this study. Further research is necessary to validate the use of the ANDs scale in clinical settings.