Introduction

Obesity stigma, also known as weight stigma and weight bias, is pervasive in Western societies and has been defined as a globalizing health challenge [1]. Weight stigma refers to social devaluation and denigration of individuals because of their excess body weight, and can lead to negative attitudes, stereotypes, prejudice, and discrimination [2]. Weight bias, when explicit, refers to overt, consciously held negative attitudes, while implicit weight bias refers to automatic, negative attributions and stereotypes which exist outside of conscious awareness [2].

Weight stigma has been associated with psychological distress [3], poorer mental health [4], disordered eating and unhealthy eating behaviours [5], substance use [6], more physiological stress [7], reduced motivation to engage in physical activity [8], additional barriers to weight loss maintenance [9], reduced healthcare seeking behaviour [10], and a higher risk for suicidality and mortality [11, 12].

Because weight stigma negatively affects weight-related behaviours and health, it can, paradoxically, contribute to worsened problems associated with obesity and create additional barriers to healthy behaviour change [13]. Therefore, weight stigma is considered a psychosocial contributor to obesity [14]. A recent meta-analysis has supported bidirectional relationships between weight stigma and paediatric obesity [15].

Weight bias internalization (WBI) occurs when individuals engage in self-blame and self-directed weight stigma because of their weight [2]. WBI is also pervasive and potentially damaging for health beyond body weight and experiences of stigma [16, 17], which has led to increased research attention in recent years, especially in adults. In contrast, not much is known about WBI in youth [18]. Children and adolescents can be particularly vulnerable to weight stigma, with long-lasting consequences that negatively affect their life course and emotional and physical well-being [6]. Thus, tools for assessing experienced and internalized weight bias in children and adolescents are needed as a first step to study their negative impact, and as an outcome measure in future studies focused on how to address weight bias.

A systematic review [19] identified two questionnaires designed to assess WBI in people who are overweight or have obesity: the Weight Self-Stigma Questionnaire (WSSQ) [20] and the Weight Bias Internalization Scale (WBIS) [21]. As negative effects of WBI have been observed when controlling for BMI, a call for measures that allow for assessment of weight stigmatization across different body weight categories was made [22]. The Modified Weight Bias Internalization Scale (WBIS-M; Pearl and Puhl [23]) was consequently developed to be applied across body weight statuses to assess the full impact of this problem.

Several adaptations of the WBIS-M to other languages have been developed in recent years. A Turkish validation was conducted with college students [24]. Two Chinese versions for children and adolescents have been developed based on samples from Hong Kong [25] and mainland China [26]. Two different German versions are also available. The Weight Bias Internalization Scale for Children (WBIS-C) was developed for children aged 9–13 years [27] and was based on the previous German version of the WBIS [28]. A recent three-item short version (WBIS-3) has also been developed using a German sample [29]. A Spanish version of the WBIS-M has been developed with a sample of adults from the general population [30]. However, there are currently no instruments for assessing WBI in Spanish children and adolescents across different weight statuses. This study was designed to address this gap. The specific aims were: (1) to study the internal structure of our WBIS-M version using exploratory and confirmatory factor analysis; (2) to study its internal consistency; (3) to study the association between the WBIS-M score and the same set of external variables analysed in the original version [23] to provide information about its validity; and (4) to study its relationship with gender and weight status.

Methods

Participants

This study is part of a funded project on weight bias in adolescents carried out using a representative sample of the city of Terrassa (Barcelona). Schools and classrooms were selected using multistage cluster sampling. The final sample of this study consisted of 298 secondary school students (18.75% of the expected sample), assessed up to the point when data collection was stopped due to the COVID-19 lockdown. No specific exclusion criteria were used, and all students present at the time of the assessment with parental consent participated. Only students who did not have parental consent, refused to participate, did not respond to the parental consent request, or provided invalid responses (because of language issues or because they did not pass the survey's control questions) were excluded from the original class lists. Figure 1 shows the flow diagram of the sample.

Fig. 1 Participants’ flow diagram

The sample came from one public and four grant-aided schools and was composed of students from the four years of Compulsory Secondary Education in the Spanish system. Table 1 shows the main characteristics of the sample.

Table 1 Descriptions and tests of significance for age, BMI, parental origin, weight status, DFT, Binge Eating, and WBIS-M based on comparison between genders

Measures

Demographics

The Four-Factor Index of Social Status [31] was used to determine the socioeconomic status (SES) of the household using a weighted average of each parent’s educational and occupational level. Total scores were categorized into five different levels, namely high, medium–high, medium, medium–low and low. Height in cm and weight in kg were measured using a SECA portable stadiometer, model 214 (20–207 cm; accuracy range of 0.1 cm), and SECA portable scales, model 8777021094 (0–200 kg; accuracy range of 0.1 kg), respectively. Weight status was calculated using the World Health Organization growth reference criteria [32]. Participants also reported information about age, gender, and parental origin.

Weight Bias Internalization

The WBIS-M [23] was based on the original 11 items from the WBIS [21]. Items that included the word “overweight” were replaced with phrases that instead used the words “my weight”. Responses are rated on a 7-point Likert scale ranging from “Strongly Disagree” to “Strongly Agree”. Two items (1 and 9) are reverse-scored. The mean of the item responses serves as the participant’s score (range 1–7), with higher scores indicating higher internalized weight bias. The original WBIS-M had a Cronbach’s alpha of 0.94, showed strong construct validity, and presented a one-factor structure. The translation process of the WBIS-M is described in the Procedure section.
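The scoring rule described above can be illustrated with a minimal Python sketch. It is not the authors' scoring syntax: the function name `score_wbis_m` and the optional `drop_items` argument are hypothetical, and the sketch assumes that responses are supplied as integers from 1 to 7 in item order.

```python
from statistics import mean

REVERSED_ITEMS = {1, 9}          # item numbers (1-based) that are reverse-scored
SCALE_MIN, SCALE_MAX = 1, 7      # 7-point Likert response range
N_ITEMS = 11


def score_wbis_m(responses, drop_items=()):
    """Return the mean item score (1-7) for one participant.

    responses  -- sequence of 11 integers (1-7), item 1 first
    drop_items -- hypothetical option: item numbers to leave out of the mean
    """
    if len(responses) != N_ITEMS:
        raise ValueError("expected 11 item responses")
    scored = []
    for item_no, value in enumerate(responses, start=1):
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"item {item_no}: response out of range")
        if item_no in drop_items:
            continue
        if item_no in REVERSED_ITEMS:
            # Reverse-scoring on a 1-7 scale maps 1 -> 7, 2 -> 6, ..., 7 -> 1.
            value = SCALE_MIN + SCALE_MAX - value
        scored.append(value)
    return mean(scored)


# A respondent who strongly agrees with the two reverse-keyed items and
# strongly disagrees with all others receives the lowest possible score.
lowest = score_wbis_m([7, 1, 1, 1, 1, 1, 1, 1, 7, 1, 1])
```

Reverse-scoring before averaging ensures that a higher mean always indicates higher internalized weight bias, regardless of how an item is worded.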

We included the same set of measures as were used in the original versions of both WBIS and WBIS-M, but in their Spanish versions, to test their construct and convergent validity.

Anti-fat bias

The Dislike subscale of the Anti-fat Attitudes Questionnaire (AAQ-D) [33] in its Spanish version [34] was also administered. It has 7 items with responses ranging from 1 (‘strongly disagree’) to 7 (‘strongly agree’). Cronbach’s alpha has been reported to range between 0.70 and 0.86. In our sample, Cronbach’s alpha was 0.74.

Self-esteem and mood

The Spanish versions of the Rosenberg Self-Esteem Scale (RSE) [35] and the Depression Anxiety Stress Scales (DASS-21) [36] were included. The Spanish version of the RSE consists of 10 items based on a 4-point scale from 1 (‘totally disagree’) to 4 (‘totally agree’). Cronbach’s alpha (0.85 to 0.88) and test–retest correlation (0.84) were found to be satisfactory. Cronbach’s alpha in our study was 0.88. In the Spanish version of the DASS-21, respondents evaluate the severity/frequency with which they have experienced each of the 21 negative emotional symptoms of depression, anxiety, and stress during the previous week on a scale from 0 to 3. Both a three-factor model and a one-factor model were used. The discriminant validity was satisfactory and Cronbach’s alpha values were 0.84, 0.70, and 0.82 for the Depression, Anxiety, and Stress subscales, respectively. Cronbach’s alpha for the whole scale in our sample was 0.92.

Body image and eating disorder pathology

A short 10-item Spanish version of the Body Shape Questionnaire (BSQ) [37] and the Spanish version of the Drive for Thinness (DFT) subscale of the Eating Disorders Inventory-3 [38] were included. Item responses in the BSQ range from 1 (‘never’) to 6 (‘always’). This version demonstrated metric invariance and was found to be more consistent than other short versions [37]. In our sample, Cronbach’s alpha was 0.91. The DFT consists of 7 items that measure restrictive tendencies in eating and weight behaviours and cognitions. Cronbach’s alpha in nonclinical samples ranged from 0.44 to 0.95, and test–retest values ranged from 0.85 to 0.99. In our sample, Cronbach’s alpha was 0.82.

The original development of the WBIS-M included two questions from the Eating Disorder Diagnostic Scale (EDDS) [39] for assessing frequency of binge eating in the past three and six months. In our study, binge eating was measured using four previously established and validated questions used in studies focused on weight stigma in adolescents [18]. These questions were updated to align with Binge Eating Disorder criteria from DSM-V [40], assessing the presence of binge eating in the past three months (yes/no), with or without loss of control (yes/no), frequency of binge eating with loss of control (4-point scale from “every day” to “less than once per month”), and distress over binge eating (4-point scale from “not at all” to “a lot”). Based on previous research [41], these items were combined to determine a severity score on a 4-point ordinal scale.

Procedure

This study was supported by Terrassa City Council’s Health and Community Service. The Principal Investigator of the funded research project held meetings with the management teams of the schools to introduce the project, agree the assessment scheduling, and obtain the lists of the enrolled students. Parental consent was requested. The survey was designed in an online format on a specific platform of the company Digital Insights, and included controls such as response ranges and three interspersed control questions to test attention levels. Participants who did not pass the survey’s controls were excluded and their data were not registered. This system prevented missing data. The survey was administered in classrooms during normal class hours, under the supervision of a group of previously trained psychology master’s students and the regular class teacher. Assent from each participant was also requested. Groups of 5–7 participants were taken to a private room to take height and weight measures, following a standardized protocol [42]. The participants then returned to the classroom to continue with the survey. Confidentiality was guaranteed throughout the entire process, and data processing was based on anonymous data. The assessments were conducted between March 4 and 12, 2020. On March 13, the COVID-19 lockdown was decreed and assessments were stopped. This study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee on Animal and Human Experimentation at the Universitat Autònoma de Barcelona (CEAAH 3451).

The adaptation of the WBIS-M from English to Spanish was conducted following the International Test Commission Guidelines [43]. Two translators whose native tongue was the target language (Spanish) translated the original questionnaire from English to Spanish. A panel of experts on the subject, whose native language was Spanish and who had a good level in the source language (English), unified a preliminary Spanish version. Then, the Spanish version was back-translated into English by two other independent translators. A bilingual expert reconciliation panel, made up of the translators and members of the research team, then agreed on a final Spanish version that guaranteed conceptual equivalence. Last, a pilot study was carried out with fifteen participants within the age group of interest to confirm that the final Spanish version could be read and understood. The final document did not need to be modified.

Data analysis

The following analytic plan was pre-specified. The psychometric properties of the WBIS-M for adolescents were analysed using IBM SPSS Statistics, version 24 (IBM Corp., Armonk, NY, USA), and IBM SPSS Amos Statistics, version 20 (IBM Corp., Chicago, IL, USA). A cross-validation was conducted [44], for which the sample was split into two random subsamples. Exploratory factor analysis (EFA) was conducted with subsample 1 (n = 149) and a confirmatory factor analysis (CFA) was conducted with subsample 2 (n = 149). No statistically significant differences were observed between subsamples in terms of school (χ2 = 9.061, df = 5, p = 0.107), year (χ2 = 2.403, df = 3, p = 0.493) and gender (χ2 = 0.215, df = 1, p = 0.643). The adequacy of the data to conduct the EFA was tested using the Kaiser–Meyer–Olkin (KMO) test and Bartlett’s sphericity test [45]. In line with the original development of the WBIS [21], the EFA was conducted with varimax rotation. In addition, the Kaiser–Guttman rule [46] and parallel analysis (Hayton, Allen & Scarpello, 2004) were applied to decide how many factors to retain.
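As an illustration of the random split used for cross-validation, the following minimal Python sketch divides 298 participant identifiers into two non-overlapping halves of 149. The analyses reported here were run in SPSS; the function name `split_half`, the seed value, and the use of consecutive integers as identifiers are assumptions made only for this example.

```python
import random


def split_half(participant_ids, seed=2020):
    """Randomly split IDs into two non-overlapping subsamples of equal size."""
    ids = list(participant_ids)
    rng = random.Random(seed)    # the seed is an arbitrary, hypothetical choice
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


# Hypothetical identifiers 1..298: one half for the EFA, the other for the CFA.
efa_ids, cfa_ids = split_half(range(1, 299))
```

Fixing the seed makes the split reproducible, so that the comparison of the two subsamples on school, year, and gender can be rerun on exactly the same partition.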

CFA was conducted using the unweighted least squares (ULS) method, because the data did not fit multivariate normality. The following goodness-of-fit indices were obtained: the goodness-of-fit index (GFI), the parsimony GFI (PGFI), the normed fit index (NFI), and the standardized root-mean-square residual (SRMR). The cutoff points for these indices were: equal to or higher than 0.9 for GFI and NFI, equal to or higher than 0.6 for PGFI, and lower than 0.08 for SRMR [47, 48]. The internal consistency of the scale was analysed using Cronbach’s alpha, McDonald’s omega, and corrected item–total correlations. According to Tavakol and Dennick [49], acceptable internal consistency coefficients range from 0.7 to 0.95. A cutoff value of 0.4 was used for acceptable item–total correlations [50].
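For readers less familiar with these reliability indices, the sketch below shows how Cronbach’s alpha and corrected item–total correlations can be computed from a participants-by-items response matrix. It is a minimal illustration rather than the SPSS procedure used in this study; the function names and the toy data are hypothetical, and McDonald’s omega is not covered because it requires factor loadings.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def corrected_item_total(rows):
    """Correlate each item with the sum of the remaining items."""
    k = len(rows[0])
    result = []
    for j in range(k):
        item = [r[j] for r in rows]
        rest = [sum(r) - r[j] for r in rows]   # total score without item j
        result.append(pearson(item, rest))
    return result


# Hypothetical toy data: 4 respondents x 3 items (not study data).
toy = [[1, 2, 1], [2, 3, 3], [4, 4, 5], [5, 5, 6]]
alpha = cronbach_alpha(toy)
item_total = corrected_item_total(toy)
```

The correction consists of removing the item from the total before correlating, which avoids inflating the coefficient by correlating an item with itself.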

A descriptive analysis of the questionnaires was conducted. Spearman’s correlation (for the ordinal variable Binge Eating) and partial correlations controlling for BMI were obtained to assess the relationship between the WBIS-M and the other subscales. Moreover, the WBIS-M score was correlated with BMI. Mean comparisons (ANOVA), adjusted for parental origin, age, and socioeconomic status, were also calculated to analyse the relationship between weight status and gender and the scores obtained in the WBIS-M. It must be noted that standardised values (z) of BMI were considered. Simple contrasts were also obtained.
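The idea behind a partial correlation controlling for BMI can be sketched with the standard first-order formula, which removes the linear association of both variables with a third one. The Python code below is an illustration under the assumption of Pearson correlations (it does not cover the Spearman case); the function names and the toy vectors are hypothetical and do not come from the study data.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy vectors: scale score, external measure, control variable.
wbis = [1, 2, 3, 4]
external = [2, 1, 4, 3]
control = [1, -1, -1, 1]         # constructed to be uncorrelated with both
r_partial = partial_corr(wbis, external, control)
```

When the control variable is unrelated to both measures, as in the toy vectors, the partial correlation equals the zero-order correlation; the two diverge as the control variable shares more variance with either measure.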

Results

Descriptive analysis of items

Descriptive analyses of items are provided in Table 2. It must be noted that item 8 showed higher skewness and kurtosis values than the rest of the items.

Table 2 Descriptive statistics of items

Internal structure

The data obtained from subsample 1 (n = 149) proved to be adequate for analysis through an EFA (KMO = 0.915, χ2 = 1075.633, df = 55, p < 0.0001). According to the parallel analysis and the Kaiser–Guttman rule, a one-factor solution was retained, explaining 54.57% of the variance. Item 1 showed a low communality (0.092) and the lowest factor loading (0.302), whereas the rest of the items showed higher communalities and adequate factor loadings of above 0.6 in all cases (Table 3). These results suggested that item 1 did not fit adequately into the one-factor model. Moreover, its corrected item–total correlation was low (0.299), whereas the rest of the items showed values higher than 0.6. Therefore, item 1 was excluded from further analysis.

Table 3 Communalities and factor loadings obtained from subsample 1 (n = 149) in the EFA

CFA was conducted with subsample 2 (n = 149) to test whether the unidimensional structure of the questionnaire could be confirmed in another set of data. The ten items of the questionnaire (excluding item 1) were included in the analysis. The data showed an adequate fit to the one-factor model (χ2 = 200.515, df = 35, GFI = 0.995, PGFI = 0.992, NFI = 0.991, and SRMR = 0.060). Standardised regression weights were adequate in all cases according to Floyd and Widaman [44], ranging from 0.449 to 0.880.

Internal consistency

Reliability analyses were performed on both subsamples and the total sample. As shown in Table 4, internal consistency was higher than 0.9 in both subsamples and in the total sample, according to both Cronbach’s alpha and McDonald’s omega. Moreover, item analyses revealed that all the items contributed to the internal consistency of the scale, since item–total correlations were higher than 0.6 in both subsamples and in the total sample. The scale thus showed adequate internal consistency in terms of both the reliability coefficients and the item–total correlations.

Table 4 Internal consistency coefficients and item–total correlations in both subsamples and the total sample

Correlations and differences by gender and weight status

Table 5 shows the means and standard deviations of all the scales. The mean of the WBIS-M was 2.22 (SD = 1.37). The same table shows the WBIS-M correlations obtained in this study and those reported by Pearl and Puhl [23] for the original WBIS-M. The original WBIS correlations found by Durso and Latner [21] are also shown.

Table 5 Means, correlations, and partial correlations of WBIS and WBIS-M with study measures

A significant positive correlation was found between WBIS-M and BMI. The total and partial correlations (controlling for BMI) for all the other measures were comparable to those reported in the original validation studies, except for those corresponding to the Dislike scale, which were slightly lower [21, 23]. WBIS-M showed significant correlations with all the outcome variables when controlling for BMI.

Figure 2 shows the WBIS-M means based on the weight status (categorized according to zBMI) and gender.

Fig. 2 Line graph with the means of WBIS-M by gender and weight status

In the ANOVA, adjusted for parental origin, age, and socioeconomic status (degrees of freedom are corrected for non-homogeneity of variances), the interaction between the two variables was significant (F(1, 287) = 6.14; p = 0.014; partial η2 = 0.060), as were the effects of weight status (F(1, 287) = 22.65; p < 0.001; partial η2 = 0.193) and gender. WBIS-M scores were significantly higher in girls than in boys (F(1, 287) = 33.47; p < 0.001; partial η2 = 0.104), with means of 2.57 (SD = 1.55) and 1.90 (SD = 1.09), respectively (Cohen’s d = 0.5).
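The reported effect size for gender can be approximately reproduced from the means and standard deviations given above. The sketch below assumes equal group sizes when pooling the standard deviations, which is a simplification (the actual group sizes are those reported in Table 1); the function name is hypothetical.

```python
from math import sqrt


def cohens_d_equal_n(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with a pooled SD that assumes equally sized groups."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Girls: M = 2.57, SD = 1.55; boys: M = 1.90, SD = 1.09 (values from the text).
d = cohens_d_equal_n(2.57, 1.55, 1.90, 1.09)
```

Under this assumption the result is approximately 0.50, consistent with the reported value and conventionally interpreted as a medium effect.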

A simple contrast shows that participants with obesity status (reference category) scored significantly higher on the WBIS-M than participants with underweight status (p < 0.001), normal weight status (p < 0.001), and overweight status (p < 0.001).

Discussion

To the best of our knowledge, this is one of the few studies that provides a validated version of the WBIS-M for use with adolescents across weight categories, and the first one adapted to Spanish. After conducting both an EFA and a CFA of the WBIS-M with different subsamples, the results confirmed adequate psychometric properties of our WBIS-M version, with adequate fit of the 10-item version (excluding item 1), high internal consistency, and support for its validity.

The study had four specific objectives. The first was to study the internal structure of this Spanish version with adolescents using EFA and CFA. Item 1 (“Because of my weight, I feel that I am just as competent as anyone”), a reverse-keyed item, obtained a low communality when compared with the rest of the items, as well as a low factor loading and a low item–total correlation. Therefore, it was not included in subsequent analyses. These results are consistent with previous studies conducted with adults who were overweight or had obesity, which also suggested that this item did not correctly fit the WBIS-M structure [18, 28, 51, 52]. A similar result was found with WBIS versions for adolescents with obesity seeking bariatric surgery [53]. Regarding WBIS-M versions, the German WBIS-C for children also removed item 1 [27], as did the Mainland Chinese version for children and adolescents [26]. In the Chinese version with children and adolescents from Hong Kong, item 1 was retained, although it showed a very low factor loading [25]. A similar result was found in the Spanish version of the WBIS-M for adults, where item 1 was retained in the final version despite showing the lowest factor loading and the lowest standardized regression weight [30]. Item 1 has also been removed from the original 11-item WBIS-M in recent multinational studies on WBI correlates [54]. In summary, our exclusion of item 1 based on its anomalous psychometric functioning is supported by evidence from previous studies. After removing this item, the WBIS-M yielded a one-factor structure with adequate fit, replicating the unidimensional structure identified in the original development of the WBIS [21] and the WBIS-M [23]. This one-factor structure was also found in all adaptations of the WBIS-M to other languages [24,25,26,27, 29, 30].

The mean WBIS-M score was slightly below the middle range. When compared with other versions of the WBIS-M that used the original 1–7 score range, our mean WBIS-M score was slightly lower than in versions focused on adults, including the original version [23, 30], and similar to other versions with youth and adolescents [24, 26]. Because WBI research has mainly focused on adults, and few studies have been conducted with adolescents across different weight statuses, more studies are needed to confirm this trend.

The second objective was to study the internal consistency. We obtained excellent reliability results in terms of both the internal consistency coefficient and item–total correlations in both subsamples and for the whole sample. The internal consistency was very similar to the original version [23] and similar or slightly higher than in the other WBIS-M versions published to date [24,25,26,27, 29, 30].

The third goal was to study the association between the WBIS-M score and the same set of external variables analysed in the original version. We found that higher WBIS-M scores were associated with greater dislike of people with obesity, higher body dissatisfaction and drive for thinness, lower self-esteem, higher symptoms of stress, depression and anxiety, and a higher frequency and severity of binge eating behaviour. Total and partial correlations (adjusted for BMI) were highly comparable to those reported in the original version, except the total correlation for the binge eating measure, which was lower in our study. Nevertheless, it must be taken into account that we used the Spanish versions of the same measures used in the original version, except in the case of binge eating, for which we used the stricter DSM-V criteria [40], and that this variable was analysed as ordinal (and not as quantitative, as in the original). The original version of the WBIS-M [23] used the same measures as the original version of the WBIS [21], and DSM-V criteria were not available when the WBIS was originally developed. All the correlated variables have been considered as correlates of WBI in adolescents [18, 53, 55,56,57]. These results provide support for the validity of the questionnaire.

The fourth and last goal of this study was to assess the relationships between WBI, gender, and weight status. Higher scores of WBI were found in girls and in higher weight statuses. Social pressure to attain the beauty ideal is higher among young women than among young men [58], which could make women more vulnerable to weight-based stigmatization [59]. Furthermore, young people who are overweight or are living with obesity are more prone to experience weight-based discrimination [6]. Our results are in line with research in adults across weight categories showing that women and people with higher BMIs show higher WBI levels than men and people with lower BMIs [23, 54]. Regarding children and adolescents, a higher risk of reporting weight discrimination has been associated with the female gender and those with a higher BMI in Chinese adolescents [57]. Higher WBI scores in girls and participants who were overweight than in boys and participants who were not overweight have been reported in previous research conducted in Germany with a general population sample [27]. The same results were observed in another German study conducted with adolescents seeking weight loss [56]. In contrast, other studies conducted with adolescents from the USA did not find gender differences regarding WBI in adolescents [18, 53]. Notably, these last studies were conducted with samples of adolescents seeking weight loss or bariatric surgery. Differences in sample characteristics could explain these results. More studies are needed to clarify the role of gender regarding WBI in adolescents. Together, our results regarding WBI and gender and weight status differences support the known-groups validity of our WBIS-M version.

Strength and limits

This study has several limitations and strengths. Regarding limitations, there is a lack of data on the test–retest reliability. Although studies conducted with adults have provided stability data for the WBIS-M (e.g. Macho et al. [30]), up to now only the Mainland Chinese version with a general population sample [26] and the German version with participants seeking weight loss [56] have reported satisfactory stability with adolescent samples. Another limitation is related to the sample. A bigger sample would provide a more accurate analysis of the internal structure of the questionnaire. Moreover, it has to be noted that the sample is not representative of Spanish adolescents. Last, measurement invariance by relevant groups such as age, gender, or weight status is necessary and this could not be estimated in our study because of the insufficient sample size. Measurement invariance would provide substantial endorsement for using the same version of the questionnaire with different groups (e.g., by gender, different ages, weight status) without targeted adaptations to specific group characteristics, and for detecting real differences in WBI across these groups that are not attributable to different interpretations of WBIS-M item content between groups. To date, only the German WBIS-C [27] and the WBIS-M version with Chinese adolescents from Hong Kong [25] have provided invariance for gender groups, and the latter only partially for weight status. We are planning to study measurement invariance for gender, age groups, and weight status once the total sample initially conceived is assessed next year, if the restrictions derived from the COVID-19 epidemic have been lifted.

On the other hand, the study has several strengths. This is one of the few studies to provide a validated version of the WBIS-M for use with adolescents across weight categories, and the first one adapted to Spanish. The internal structure of our WBIS-M version was analysed by means of both EFA and CFA in different subsamples. The translation process followed the International Test Commission Guidelines [43]. To ensure the quality of data collection, the survey was designed in an online format, included response ranges and three interspersed control questions, and was administered in classrooms under the supervision of a group of previously trained technicians. Last, the adolescents’ weight and height were measured objectively using portable and accurate instruments and following a standardized protocol, which avoided the inaccuracy of self-reported data in adolescents [60, 61].

Conclusion

Our findings provide support for the validity and reliability of our WBIS-M version for use with adolescents across weight categories. The availability of a Spanish version of the WBIS-M for adolescents will provide research teams and practitioners from Spain who are addressing weight bias in the population with a validated version of one of the few existing, and the most widely used, instruments for assessing WBI. In addition to its value for research and clinical settings, data obtained with this new tool could contribute evidence to support the promotion of educational, regulatory, and legal initiatives designed to prevent weight stigma and discrimination, as recommended by a recent international consensus statement for ending obesity stigma [2].

What is already known on this subject?

Weight bias internalization (WBI) is pervasive and potentially damaging for health, independent of weight status. The WBIS-M is one of the few existing, and the most widely used, instruments for assessing WBI, and several validations in different languages have been developed. There are currently no Spanish versions for assessing WBI in children and adolescents across different weight statuses.

What this study adds?

This study provides research teams and practitioners from Spain who are addressing weight bias in the population with a validated version of one of the few existing, and the most widely used, instruments for assessing WBI among adolescents across different weight statuses.