Empowerment is a core concept of the World Health Organization’s (WHO) perspective on health promotion, and it is one of the cross-cutting principles in their mental health strategic plan (World Health Organization, 2013, 2021). Promoting empowerment is therefore central to the person-centred, recovery-oriented, and rights-based paradigm (World Health Organization, 2021). Empowerment is also a component of the CHIME recovery framework (Leamy et al., 2011), that is, the current international consensus about the key components of the recovery process, which serves as the foundation for research and practical applications in the field of personal recovery. This framework emerged from a systematic review and narrative synthesis of 97 publications in which the concept was defined, identifying five key elements. These elements, represented by the acronym CHIME, are Connectedness, Hope, Identity, Meaning in life, and Empowerment.

Together with the promotion of mutual support and advocacy, empowerment is one of the original goals of the users and survivors of psychiatry movements (Chamberlin et al., 1989). Reflecting this, in the early 1980s the psychologist Julian Rappaport proposed empowerment as a key concept for a paradigm shift and as an approach to intervention that could move beyond paternalism (Rappaport, 1981). In the almost two decades since the WHO recognized empowerment as a key priority (WHO – Regional Office for Europe, 2005), many countries and regions have incorporated empowerment promotion as an objective in their strategic plans. This shift in public policy is gradually being reflected in practice, which is increasingly driven by an emphasis on promoting user empowerment to make informed decisions (Barr et al., 2015).

As a result of this shift, empowerment is now seen both as a strategy to promote mental health and recovery and as an outcome in itself (Pekonen et al., 2020; Wallerstein, 2006). This means that interventions aimed at empowering users need to be evaluated. Although interventions are frequently described as promoting user empowerment, it is often unclear whether programs that employ this concept offer demonstrable benefits over those that do not (Chamberlin, 1997). Furthermore, there is still relatively little research on tools for evaluating empowerment outcomes (Cyril et al., 2016). This is particularly notable in Spanish-speaking countries, where instruments specifically designed (or adapted) and validated for measuring empowerment in mental health are lacking.

Since the promotion of empowerment began to be included as a goal of mental health services, a number of assessment tools have been developed in English-speaking and northern European countries. Barr et al. (2015) conducted a scoping review aimed at identifying scales that evaluate patient empowerment. They found five scales specifically designed to assess empowerment for mental health users (López et al., 2010; Oades et al., 2011; Rogers et al., 1997; Segal et al., 1995; Svedberg et al., 2007), but all of them had important limitations due to the low methodological quality of their respective validation studies, which lacked comprehensive psychometric testing (Barr et al., 2015).

In light of these limitations, Boevink et al. (2017) designed the Netherlands Empowerment List (NEL), a 40-item scale encompassing six domains: self-management, social support, caring community, connectedness, confidence and purpose, and professional help. Unlike earlier instruments, the NEL was developed following the COSMIN criteria (Mokkink et al., 2010), and the authors examined a broader range of psychometric properties, obtaining satisfactory results. The NEL has since been used in studies involving users with chronic and treatment-resistant anxiety or depressive disorders (Zoun et al., 2019), people with psychotic disorders (Vogel et al., 2020), and users diagnosed with severe mental disorders engaged in assertive community treatment (Tjaden et al., 2021).

The aim of the present study was to translate and culturally adapt the NEL into Spanish and then to test its psychometric properties with Spanish-speaking users of mental health services. The psychometric validation included analysis of differential item functioning by gender. The overall goal is to provide users, healthcare providers and the scientific community with a measurement instrument that yields valid and reliable scores, and which may be used to assess empowerment in Spanish-speaking populations.

Method

Participants

Participants were adult (≥ 18 years) users of community mental health services with no relevant cognitive impairment or comprehension difficulties, nor a severe or decompensated somatic disease. Informed consent to participate was signed by 415 users, but nine were excluded due to missing data. The final sample therefore comprised 406 participants.

Measures

Netherlands Empowerment List (NEL) (Boevink et al., 2017). The NEL has 40 items distributed across six subscales: confidence and purpose (CaP), social support (ScS), caring community (CrC), connectedness (Cnn), self-management (S-M), and professional help (PrH). Items are rated using a 5-point Likert-type scale (1 = Completely disagree; 5 = Completely agree). In both samples of the study by Boevink et al. (2017), scores on the NEL showed excellent internal consistency (Cronbach’s α = .94 and .95) and were moderately correlated (rs = .66) with scores on the Empowerment Scale (Rogers et al., 1997).

To obtain validity evidence for the Spanish version of the NEL based on relationships with other variables, we administered a series of additional instruments to two subsamples of our 406 participants. Specifically, and in addition to the NEL, a subsample of 148 participants also responded to the Empowerment Scale (ES), while a subsample of 228 participants also completed the Maryland Assessment of Recovery Scale (MARS-12), the Dispositional Hope Scale (DHS) and the Multidimensional Scale of Perceived Social Support (MSPSS). These scales were selected for several reasons. Firstly, they assess recovery (i.e., MARS-12), which includes empowerment as a key element, or evaluate other components of the CHIME framework, such as hope (i.e., DHS) and social support (i.e., MSPSS), which are expected to correlate with empowerment. Secondly, these constructs (recovery, hope, and social support) have established associations with empowerment in the existing literature (Corrigan et al., 1999; Rogers et al., 2010). Lastly, social support constitutes one of the six dimensions within the NEL itself.

Empowerment Scale (ES) (Rogers et al., 1997, 2010). The ES is a 28-item scale comprising five factors: self-esteem/self-efficacy, power/powerlessness, community activism and autonomy, optimism/control over the future, and righteous anger. Items are rated using a 4-point Likert-type scale (1 = Totally disagree; 4 = Totally agree). Although a number of Spanish translations of the ES have been used in the literature, the scale has yet to be validated in a Spanish-speaking sample. For the present study, therefore, we developed a Spanish adaptation of the ES following the same procedure as used for the NEL and described in the Procedure section. In our sample (n = 148), scores on the ES showed good internal consistency (McDonald's ω = .83; Cronbach’s α = .82).

Maryland Assessment of Recovery Scale (MARS-12) (Drapalski et al., 2012, 2016; Medoff, 2015). The MARS-12 comprises 12 items that measure six components of personal recovery: self-direction/empowerment, holistic, non-linear, strengths-based, responsibility and hope. Items are rated using a 5-point Likert-type scale (1 = Not at all; 5 = Very much). The Spanish version of the MARS-12 has recently been validated, showing good psychometric properties (Balluerka et al., 2024). In our sample (n = 228), scores on this scale exhibited excellent internal consistency (McDonald's ω = .94; Cronbach’s α = .94).

Dispositional Hope Scale (DHS) (Snyder et al., 1991). The DHS is a 12-item scale comprising two domains: pathway and agency. Item responses are given using a 4-point Likert-type scale (1 = Definitely false; 4 = Definitely true). The DHS has been adapted into Spanish, with the authors providing evidence of adequate psychometric properties (Galiana et al., 2015). In our sample (n = 228), scores on the DHS showed excellent internal consistency (McDonald's ω = .91; Cronbach’s α = .91).

Multidimensional Scale of Perceived Social Support (MSPSS) (Zimet et al., 1988, 1990). The MSPSS is a 12-item scale with three factors corresponding to three sources of social support: family, friends and significant others. Items are rated using a 7-point Likert-type scale (1 = Very strongly disagree; 7 = Very strongly agree). This scale has been adapted into Spanish, evidencing good reliability and validity (Ruiz-Jiménez et al., 2017). In our sample (n = 228), scores on the MSPSS exhibited excellent internal consistency (McDonald's ω = .93; Cronbach’s α = .93).

For all the above scales, a total score is calculated by summing the item scores, and in each case, a higher score indicates a higher level of the variable evaluated.

Procedure

The NEL was developed and validated in Dutch, but its authors also report an English version (Boevink et al., 2017). Both languages were used as the basis for our translation. In accordance with International Test Commission guidelines (International Test Commission, 2018), the steps in translating and adapting the NEL were as follows:

  1. We obtained permission from the authors of the NEL to adapt the scale into Spanish.

  2. An official translation service translated the original Dutch version of the NEL into Spanish.

  3. The English version of the NEL was also translated into Spanish by four independent bilingual psychologists using the parallel translation procedure.

  4. All translations were compared to arrive at a consensus version, evaluating whether the instructions, response options and items maintained their original meaning.

  5. A group of users of mental health services, considered experts due to their lived experience, was constituted (seven women and four men, with a mean age of 50.2 years [SD = 7.9, range 37–61]), whose task was to individually assess the clarity (wording and cultural appropriateness) of the instructions, response options and each item of the Spanish NEL. Ratings were given using a 4-point Likert-type scale (1 = Not at all clear; 4 = Perfectly clear). For each item rated below 4, the expert was asked to explain why and offer an alternative wording.

  6. A multidisciplinary committee, including the four bilingual psychologists, members of the research team, and experts by lived experience, reviewed the results and reached a consensus version.

  7. A Spanish language consultancy reviewed and refined the syntax, grammar and terminology of the instructions, response options and items of the Spanish version of the NEL.

  8. An official translation service carried out a blind back-translation of the scale into English.

  9. The first and second authors of the original scale were contacted and asked to rate the agreement between the original and the back-translated versions of the NEL. Ratings were given using a 4-point Likert scale (1 = Item does not capture the conceptual meaning of the original; 4 = Item completely captures the conceptual meaning of the original). Additionally, both authors were invited to include explanatory comments for all ratings below 4.

Participants were recruited through convenience sampling in 17 mental health community rehabilitation services (CRS) across Catalonia, Spain. All users of these services who met the aforementioned inclusion criteria were invited to take part in the study, and 415 of them agreed to participate. They were informed about the study’s objectives by a professional from each CRS along with a research team member. Participation was voluntary and no financial compensation was offered.

Data collection spanned from January 24, 2022, to April 29, 2023. Participants responded to the battery of scales individually, and they were also requested to provide sociodemographic information (i.e., age, gender, diagnosis, marital status, living arrangement, education level, and employment status). This process occurred in the CRS settings, facilitated by a research team member and supported by a professional from each service.

Among the 17 CRS participating in our study, 12 agreed to facilitate a second data collection session for the retest. Subsequently, all individuals from these CRS who completed the initial NEL assessment were invited to participate in the retest, which was scheduled between one and two weeks after their first evaluation. Ultimately, 66 participants consented to and completed the retest.

The research was approved by the Bioethics Committee of the University of Barcelona (CBUB; Institutional Review Board Number: IRB00003099) and was conducted in accordance with the ethical standards of the Helsinki Declaration and its later amendments.

Statistical Analysis

Data distribution at the item level, frequency of responses for each category, and skewness and kurtosis were computed. When the absolute values of skewness were greater than 1, the distribution was considered highly skewed, and when the kurtosis was greater than 3 it was considered high (Bulmer, 1979). The distribution of NEL total scores was examined by calculating the mean and standard deviation and then applying the Shapiro-Wilk normality test. Multivariate normality was assessed using the Mardia test.
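The item-level screening rule described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the authors' R code. The function names and the example responses are hypothetical, and the sketch assumes moment-based estimators and that the cutoff of 3 refers to raw (not excess) kurtosis, which the text does not specify.

```python
from statistics import fmean


def central_moment(values, order):
    """Return the central moment of the given order (divisor n)."""
    mean = fmean(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (0 for a symmetric distribution)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment-based raw kurtosis: m4 / m2**2 (3 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


def describe_item(values, skew_cutoff=1.0, kurt_cutoff=3.0):
    """Flag an item using the cutoffs stated in the text (|skewness| > 1, kurtosis > 3)."""
    s, k = skewness(values), kurtosis(values)
    return {
        "skewness": s,
        "kurtosis": k,
        "highly_skewed": abs(s) > skew_cutoff,
        "high_kurtosis": k > kurt_cutoff,
    }


# Hypothetical responses to one item on the 1-5 response scale.
item_responses = [5, 5, 5, 5, 5, 5, 4, 4, 1, 5]
print(describe_item(item_responses))
```

Responses piled up at the top of the scale with a single low value give a negative skewness beyond the cutoff, which is the pattern the screening rule is meant to flag.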

To provide validity evidence based on the internal structure of the NEL, we performed a confirmatory factor analysis (CFA) with the weighted least squares mean and variance adjusted (WLSMV) estimator. Two models were assessed: the six first-order factor model reported in the original validation study (Boevink et al., 2017), and a six first-order and one second-order factor model with Empowerment as a general factor. The fit of these models was assessed by calculating the chi-square statistic/degrees of freedom ratio (χ²/df), as well as the comparative fit index (CFI), the Tucker-Lewis index (TLI) and the root mean squared error of approximation (RMSEA). These indices were interpreted as follows: χ²/df less than 2, CFI and TLI values ≥ .95 and RMSEA values ≤ .06 indicated excellent fit; χ²/df less than 3, CFI and TLI values ≥ .90 and RMSEA values ≤ .08 indicated acceptable fit (Hu & Bentler, 1999).
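The interpretation rule for the fit indices can be written as a short decision function. The following Python sketch is illustrative only. The function name is hypothetical, the "poor" label for models meeting neither set of cutoffs is an assumption, and so is the requirement that all four criteria hold jointly, since the text does not say how mixed results were handled.

```python
def classify_fit(chi2, df, cfi, tli, rmsea):
    """Classify model fit using the cutoffs stated in the text.

    Assumption: every criterion in a band must be met for that band to apply.
    """
    ratio = chi2 / df
    if ratio < 2 and cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.06:
        return "excellent"
    if ratio < 3 and cfi >= 0.90 and tli >= 0.90 and rmsea <= 0.08:
        return "acceptable"
    return "poor"  # label assumed; the text defines only the two bands above


# Hypothetical fit statistics for two candidate models.
print(classify_fit(chi2=100.0, df=60, cfi=0.97, tli=0.96, rmsea=0.04))   # excellent
print(classify_fit(chi2=250.0, df=100, cfi=0.93, tli=0.92, rmsea=0.07))  # acceptable
```

Under this joint rule, a model whose RMSEA is within the excellent band but whose CFI and TLI fall between .90 and .95 is classified as acceptable rather than excellent.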

The presence of differential item functioning (DIF) by gender in the Spanish version of the NEL was explored using the ordinal logistic regression (OLR) method (Choi et al., 2011). This technique offers a flexible framework for detecting various types of DIF by integrating trait scores based on item response theory and an iterative process using group-specific item parameters. The total DIF effect, uniform DIF, and non-uniform DIF were assessed using a significance level of .05 and McFadden’s R², following the guidelines of Jodoin and Gierl (2001) for measuring effect size.
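The effect-size step can be sketched as follows. This Python example is a simplified illustration and does not reproduce the lordif procedure. It assumes that the DIF effect is the change in McFadden's pseudo-R² between a model with the trait score only and a model that adds group terms, and it uses the Jodoin and Gierl (2001) bands (negligible below .035, moderate from .035 to .070, large above .070). The function names and log-likelihood values are hypothetical.

```python
def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's pseudo-R²: 1 - LL(model) / LL(intercept-only model)."""
    return 1.0 - loglik_model / loglik_null


def classify_dif(delta_r2):
    """Classify a pseudo-R² change with the Jodoin and Gierl (2001) bands."""
    if delta_r2 < 0.035:
        return "negligible"
    if delta_r2 <= 0.070:
        return "moderate"
    return "large"


# Hypothetical log-likelihoods for one item.
ll_null = -600.0         # intercept-only model
ll_trait = -450.0        # trait score only
ll_trait_group = -447.0  # trait score + group + trait-by-group interaction

delta = mcfadden_r2(ll_trait_group, ll_null) - mcfadden_r2(ll_trait, ll_null)
print(round(delta, 3), classify_dif(delta))
```

A statistically significant likelihood-ratio test can therefore coexist with a negligible effect size when adding the group terms barely changes the pseudo-R².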

Internal consistency was assessed by calculating McDonald’s omega (ω) and Cronbach’s alpha (α) for ordinal variables. These coefficients were interpreted as recommended by Kline (2016), and hence values over .90 were considered excellent, those over .80 as very good and values above .70 as adequate. Temporal stability was assessed by calculating the intraclass correlation coefficient (ICC) and interpreting the results as follows: above .90, excellent; .75 – .89, good; .50 – .74, moderate; and below .49, poor (Koo & Li, 2016). Internal consistency and temporal stability were calculated for both the total NEL score and subscale scores.
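The two sets of interpretation bands can be expressed as simple lookup functions. The Python sketch below is illustrative. The function names are hypothetical, the label for coefficients at or below .70 is an assumption (the text gives none), and the handling of the exact boundary values for the ICC (.90, .75, .50) is also an assumption, because the bands as stated leave small gaps.

```python
def interpret_consistency(coefficient):
    """Interpret omega or alpha with the bands attributed to Kline (2016)."""
    if coefficient > 0.90:
        return "excellent"
    if coefficient > 0.80:
        return "very good"
    if coefficient > 0.70:
        return "adequate"
    return "below adequate"  # label assumed; not defined in the text


def interpret_icc(icc):
    """Interpret an ICC with the bands attributed to Koo and Li (2016)."""
    if icc >= 0.90:  # boundary handling assumed
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Hypothetical coefficients.
for value in (0.93, 0.85, 0.74):
    print(value, interpret_consistency(value))
for value in (0.93, 0.80, 0.60, 0.30):
    print(value, interpret_icc(value))
```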

Finally, validity evidence based on relationships with other variables was obtained by calculating Spearman correlation coefficients between scores on the Spanish version of the NEL (both total and subscale scores) and total scores on the ES, the MARS-12, the DHS and the MSPSS.
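For readers unfamiliar with the Spearman coefficient, the following Python sketch shows how it can be computed as a Pearson correlation between ranks, with tied values receiving the average of their ranks. It is an illustration only, not the authors' R code. The function names and the score vectors are hypothetical.

```python
from statistics import fmean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mean_x, mean_y = fmean(x), fmean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / (ss_x * ss_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical total scores for five respondents on two scales.
nel_total = [120, 135, 150, 98, 160]
es_total = [70, 78, 85, 66, 80]
print(round(spearman(nel_total, es_total), 2))  # 0.9
```

Because the coefficient depends only on ranks, it does not require normally distributed scores.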

Statistical analyses were conducted using R 4.2.2 with the packages psych (Revelle, 2014), lavaan (Rosseel, 2012), likert (Bryer, 2022) and lordif (Choi et al., 2011).

Results

Spanish Adaptation of the NEL

All members of the group of users, considered as experts by lived experience, rated the clarity of the instructions and response options of the NEL as 4 (“Perfectly clear”), and hence no rewording was required. By contrast, six items were rated below 4 by at least one member of the committee and, in these cases, an alternative wording was proposed (see Supplemental Table 1 for more details). Following the committee's proposals, items 29 and 33 were reworded. However, no changes were made to items 6, 20, 26 and 30 because the proposed rewording modified the meaning of the original item. For example, for item 26, "I know what I am good at", the proposed wording in Spanish would have rendered it as "I know that I am a good person". The Spanish language consultancy also suggested minor changes to the wording of five items, all of which were incorporated. Finally, following the review of the back-translated version by the scale's authors, item 10 was modified slightly so as to capture better the tone of the original item, "I know what is good and what is not good for me" (the proposed Spanish version had implied a stronger imperative, "I know what I must do"). The final version of the Spanish NEL is shown in Supplemental Table 2.

Sample Description

The mean age of participants was 47.8 years (SD = 9.8; range 20 – 71). The majority were male (52.5%), single (50.74%) and in receipt of a disability pension (67.24%). The most common living arrangement was with their original family (39.90%), and the most frequent educational level was secondary (43.84%). The main self-reported diagnoses were depression (28.82%), schizophrenia (19.21%) and bipolar disorder (17.98%). For more details on the sociodemographic characteristics of the sample, see Table 1.

Table 1 Sample sociodemographic characteristics (N = 406)

Item Analysis

As shown in Fig. 1, nearly one in three responses on the Spanish NEL fell into the Agree category (32.30%), followed by Neither Agree nor Disagree (24.50%) and Agree Strongly (21.63%). Few participants endorsed response options 1 or 2 (Strongly Disagree or Disagree, respectively).

Fig. 1 Item distribution analysis

Most items of the Spanish NEL showed a negatively skewed distribution, with items 1, 6, 12, 14, and 16 showing a high degree of negative skewness. For further details, see Table 2.

Table 2 Scores distribution on the NEL

According to the Shapiro-Wilk normality test, the distribution of total scores on the Spanish NEL deviated significantly from normality (W = .99, p = .028; M = 138.32, SD = 25.88). The Mardia test for multivariate normality also showed that the data were non-normally distributed (skewness = 24612.35, p < .001; kurtosis = 81.62, p < .001).

Internal Structure

The results of the CFA supported a six first-order factor structure for the Spanish NEL (χ²(725) = 1754.42, p < .001, χ²/df = 2.42, CFI = .943, TLI = .939, RMSEA = .059, 90% CI [.056, .063]). Although the CFA also supported the one-factor second-order model (χ²(734) = 1881.42, p < .001, χ²/df = 2.56, CFI = .937, TLI = .933, RMSEA = .062, 90% CI [.059, .066]), the fit of the six first-order factor model was better. Factor loadings were above .40 for all items and were all significant at p < .001. Correlations between factors ranged from .47 (CrC and PrH) to .96 (CaP and S-M), and they were all significant at p < .001. More details are shown in Fig. 2.

Fig. 2 Path diagram of the six first-order factor model of the NEL

The OLR method detected differential item functioning by gender in four items, namely item 40 of the subscale CaP, items 5 and 17 of the subscale ScS, and item 6 of the subscale PrH. Figure 3 shows that although these items display uniform DIF, the effect size (measured by McFadden’s R²) was less than .035, and hence the DIF was negligible (Jodoin & Gierl, 2001).

Fig. 3 Items of the NEL with differential item functioning by gender

Internal Consistency and Temporal Stability

Internal consistency was excellent for the Spanish NEL total score (ω = .98; α = .96) and for scores on two of its subscales: CaP (ω = .95, α = .92) and ScS (ω = .92, α = .91). It was very good for scores on a further two subscales, CrC (ω = .86, α = .88) and Cnn (ω = .81, α = .82), and adequate for the remaining two, S-M (ω = .78, α = .78) and PrH (ω = .72, α = .78).

Judged against the criteria of Koo and Li (2016), temporal stability was good for the Spanish NEL total score (ICC = .86) and for scores on five of its subscales (S-M: ICC = .86; CaP: ICC = .85; ScS: ICC = .84; Cnn: ICC = .83; PrH: ICC = .75). The exception was the CrC subscale, whose scores showed moderate temporal stability (ICC = .72). Table 3 shows more details regarding internal consistency and temporal stability.

Table 3 NEL and NEL subscales internal consistency and temporal stability

Validity Evidence Based on Relationships with Other Variables

The Spanish NEL total score showed strong positive correlations with the ES, the MARS-12, the DHS, and the MSPSS. The subscales of the Spanish NEL also correlated strongly with scores on scales measuring a similar construct. Specifically, the CaP, S-M and Cnn subscales correlated strongly with the MARS-12, the DHS, and the ES; the ScS subscale was strongly correlated with the MSPSS scores, and the CrC subscale correlated strongly with the ES scores. The PrH subscale showed lower correlations with scores on the other scales. More details are shown in Table 4.

Table 4 Correlations between NEL scores and the ES, MARS-12, DHS, and MSPSS

Discussion

This study aimed to develop a Spanish version of the NEL and to conduct a psychometric validation to determine its usefulness for assessing empowerment in Spanish-speaking populations.

The results obtained in the process of translating and culturally adapting the scale support its semantic, linguistic, and conceptual equivalence with respect to the original instrument and provide evidence of content validity for the Spanish version of the NEL. Although the model fit was good for both of the factor structures evaluated (i.e., a six first-order factor model and a six first-order and one second-order factor model), it was better for the six first-order factor model. This suggests that the Spanish NEL maintains the same factor structure as its original Dutch version (Boevink et al., 2017). However, the fact that the one-factor second-order model showed a good fit indicates that the Spanish NEL measures a single underlying construct, namely the level of empowerment. Accordingly, both its total score and the scores corresponding to its six dimensions may be used (Zoun et al., 2019).

The DIF analysis showed the presence of uniform DIF by gender in four items but with a negligible effect. The direction of DIF suggests that men tended to express greater agreement with the items "I am not afraid to rely on myself", and "The people around me accept me", while women expressed greater agreement with the items "I can obtain adequate support when I need it", and "My caregiver takes my abilities as a starting point, not my limitations". Although these results do not point unequivocally to the presence of DIF, they suggest that men and women may show a differential response pattern on these items. In a study examining gender invariance in the measurement of psychological empowerment, Boudrias et al. (2004) concluded that there were a few notable differences. Specifically, items written in the first person and reaffirming a personal dimension of empowerment might be more positive for men, while items that describe more of a social dimension of support could be more positive for women. Whatever the case, this was not a general trend affecting all items of the Spanish NEL, and as already mentioned, the associated effect size was negligible.

As in the original validation study by Boevink et al. (2017), we analysed internal consistency and temporal stability for both the total score and subscale scores of the Spanish NEL. For the NEL total score, internal consistency was excellent and temporal stability was good according to the criteria of Koo and Li (2016) (ICC = .86), with coefficients slightly higher than in the Dutch sample. The internal consistency of subscale scores on the Spanish NEL ranged from excellent to adequate, and their temporal stability ranged from good to moderate.

Total scores on the Spanish NEL were strongly correlated with scores on all the other scales administered, which measured empowerment, personal recovery, hope and perceived social support. Importantly, the results provide evidence of convergent validity with the ES (Rogers et al., 1997), an instrument that measures the same construct (empowerment). Furthermore, the correlation between scores on the NEL and the ES was higher in our sample than in Boevink et al. (2017), who reported a moderate coefficient. The total score on the Spanish NEL was also strongly correlated with scores on the MARS-12 (Medoff, 2015), as well as with scores on the DHS (Snyder et al., 1991) and the MSPSS (Zimet et al., 1988), two scales that measure components of the CHIME model (Leamy et al., 2011), namely hope and connectedness. These results support the idea that connectedness, hope, and empowerment are key components of the recovery process, and as such they can be used as variables in assessing recovery-oriented interventions. Finally, as expected, the strongest correlations for NEL subscales were obtained with scales measuring related constructs (i.e., the ES, the MARS-12 and the DHS with CaP, S-M and Cnn, and the MSPSS with ScS). This further supports the validity of scores on the Spanish NEL.

The main limitation of this study stems from the use of convenience sampling, with all participants being users of community mental health services. Accordingly, none of them were in a state of clinical decompensation, but they all required professional assistance to promote their autonomy, social functioning and/or community inclusion. This represents a specific profile within the population of users of mental health services. Future research should aim to validate the Spanish version of the NEL in samples with different characteristics. It is also worth noting that, although the proportion of participants who agreed to participate in the retest was small, the sample size was sufficient to evaluate the temporal stability of the NEL scores. Moreover, the present study did not examine the responsiveness of the Spanish NEL. Further studies are needed to determine the scale's ability to detect changes in empowerment resulting from interventions.

This research also has some strengths. The NEL was translated and adapted into Spanish in strict accordance with international guidelines for test adaptation (International Test Commission, 2018). Furthermore, the psychometric properties of the Spanish version were examined in a relatively large sample of users of mental health services. To our knowledge, this is the first validated scale in Spanish for assessing empowerment. In this respect, our study addresses an important gap in our socio-cultural context, which to date has lacked a valid and reliable instrument for evaluating programs or interventions that claim to promote the empowerment of users of mental health services. Furthermore, the availability of a validated instrument for assessing empowerment in mental health within our cultural context lays the groundwork for future cross-cultural research.

Conclusions

The Spanish version of the NEL yields valid and reliable scores and it may be used to assess empowerment among Spanish-speaking users of mental health services. It is therefore a useful tool for evaluating interventions aimed at promoting empowerment in this population.