Abstract
Purpose
The Modified Weight Bias Internalization Scale (WBIS-M) is perhaps the most frequently used measure of internalised weight bias and has growing support for its psychometric properties. However, there is a lack of clarity regarding how many items are necessary for adequate interpretation of the WBIS-M and limited study of internalised weight bias in young adults. The aims of this study are to evaluate different versions of the WBIS-M, assessing structural and convergent validity.
Methods
The current study recruited 205 university students (aged 18–46, mean body mass index = 22.60 kg/m2) in the UK and examined the factor structure, reliability, and convergent validity of the WBIS-M, looking at 11-item, 10-item, and 9-item versions.
Results
Confirmatory factor analysis suggested that a 10-item version of the WBIS-M showed acceptable structural validity and expected correlations with relevant constructs (depression, anxiety, weight status, and eating pathology). Estimates of internal consistency reliability were high for all three versions.
Conclusion
Given potential problems with one item, the 10-item WBIS-M presents a measure of internalised weight bias with sound psychometric properties in young adults.
Level of evidence: Level III, well-designed cohort study.
Introduction
Discrimination towards individuals with obesity has long been considered a driver of poorer mental health, impaired quality of life, and social problems in this population [1]. Further, negative attitudes towards weight can impair individuals of different weight statuses, with the potential for stigma to be both received from others and directed towards oneself (e.g., [2]).
Research into internalisation of weight-biased attitudes (weight-bias internalisation, or WBI) has been increasing since the turn of the millennium and suggests that WBI “is consistently associated with negative mental health outcomes such as depression, anxiety, poor self-esteem and body image, disordered eating and impaired mental HRQOL [health-related quality of life]” ([2], p. 1159). Several sociodemographic variables have been linked to higher WBI, and a number of studies have found higher weight bias in younger, compared to older, adults (e.g., [3]), suggesting that this population is at heightened risk for weight-biased attitudes.
Given the influence of WBI on health-related outcomes, various measures have been developed to assess WBI and weight stigma, and a review completed in 2020 [4, 5] found 18 measures intended to assess internalised weight bias, the most commonly used of which was the Weight Bias Internalization Scale (WBIS; [6]). Whilst support for the psychometric properties of the WBIS has generally been found [4], the measure itself is restricted to use with individuals who consider themselves overweight. As a result, Pearl and Puhl [7] modified the wording of the WBIS to cover individuals of all weight statuses and thus assess internalised weight bias regardless of weight classification, a measure known as the Modified WBIS, or WBIS-M.
Adding to data on the WBIS, the WBIS-M has been found to demonstrate good psychometric properties (e.g., discriminant validity) and expected correlations with measures of eating pathology, depression, and anxiety (e.g., [7, 8]). In addition to other estimates of validity, the factor structure (i.e., structural validity) of the WBIS-M has been investigated with exploratory (EFA) and confirmatory factor analysis (CFA). In samples of adults, translations of the WBIS-M into Spanish [9], Greek [10], Turkish [11], and Norwegian [12] have offered support for a one-factor, 11-item measure. In a sample of secondary school students in Barcelona, EFA and CFA suggested a one-factor structure of a 10-item version, with one item (Item 1: “Because of my weight, I feel that I am just as competent as anyone”) excluded due to a poor factor loading [13]. A similar study of first-generation Asian immigrants in the United States examined the factor structure of this 10-item version of the WBIS-M, confirming its unidimensionality and also offering “preliminary support for a nine-item version” (also excluding Item 9, “I am OK being the weight that I am,” which is reverse-scored; [14], p. 17). Finally, in a large multinational study [15], the (unidimensional) 11-item WBIS-M was found to evidence poor fit, which was improved by removing Item 1 (as in [13], and [8]) and allowing some residual (error) variances to correlate (see also [12]).
Whilst results from these studies offer support for the unidimensionality of the WBIS-M, findings are inconsistent (see also [2]) and there are limitations to both the samples and methods used. There have been limited studies using English language versions, and few with young adult and university samples, who may present with stronger internalised weight bias. One (unpublished) work has reported CFA of the 11-item WBIS-M with a UK sample [16], which comprised young adults (mean age = 26.0 years), around 75% of whom were university students. The fit indices (Table C1 in [16]) suggest near-adequate fit on some measures, but poor fit on another, leading to uncertain conclusions regarding structural validity in such samples.
Given inconsistent findings regarding the structural validity of the 11-item version [9,10,11,12,13,14,15,16], few studies in English-speaking countries and the UK in particular, and limited support for reduced-item measures, the current study aims to compare three hypothesised factor structures of the WBIS-M—specifically, 11-item, 10-item, and 9-item [14] versions. Further, the study will look at associations with variables known to be correlated with WBI (i.e., depression, anxiety, and eating pathology) across all models.
Methods
Participants
Two hundred and five undergraduate and postgraduate students were recruited from a moderately large UK university through local advertising (posters on campus) and the Psychology Research Participation Scheme. Data collection, including the provision of informed consent, took place online, and the study was approved by the University’s Ethics Committee.
Measures
Participants were asked to provide demographic information (age, gender, weight, height, ethnicity; see Table 1), in addition to responses to several questionnaires. The WBIS-M [7] consists of 11 items rated on a 7-point Likert scale (from ‘Strongly Disagree’ to ‘Strongly Agree’). Items are averaged to produce a total score, with higher scores indicative of greater internalised weight bias. The 8-item PHQ-8 assesses symptoms of depression using a 4-point Likert scale (from ‘Not at all’ to ‘Nearly every day’ over the past 2 weeks), and a higher total score indicates more severe symptoms [17]. The 7-item GAD-7 assesses symptoms of anxiety, also using a 4-point Likert scale (from ‘Not at all’ to ‘Nearly every day’ over the past 2 weeks). As with the PHQ-8, a higher total score indicates more severe symptoms [18]. The 26 items of the Eating Attitudes Test (EAT) can be used as a measure of eating disorder symptoms, using a 6-point Likert scale (‘Always’ to ‘Never’), whereby higher scores indicate more frequent disordered eating attitudes [19].
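The scoring rule described above (averaging item ratings, with the item set depending on the version) can be sketched as follows. This is an illustrative Python sketch, not the authors' code (their analyses were run in R). The function and variable names are hypothetical, and the set of reverse-scored items is an assumption: the article identifies Item 9 as reverse-scored, and Item 1 is treated the same way here because of its positive wording; the published scale should be checked before use.

```python
from statistics import mean

# Assumption: Items 1 and 9 are reverse-scored (the article states this
# only for Item 9); ratings are coded 1-7.
REVERSE_SCORED = {1, 9}

# Item sets for the three versions compared in the study.
VERSIONS = {
    "11-item": list(range(1, 12)),
    "10-item": [i for i in range(1, 12) if i != 1],
    "9-item": [i for i in range(1, 12) if i not in (1, 9)],
}


def score_wbis_m(responses, version="10-item"):
    """Return the mean item score for one participant.

    responses: dict mapping item number (1-11) to a rating from 1 to 7.
    """
    scored = []
    for item in VERSIONS[version]:
        rating = responses[item]
        if not 1 <= rating <= 7:
            raise ValueError(f"Item {item}: rating {rating} is outside 1-7")
        # Reverse-scoring maps 1 -> 7, 2 -> 6, ..., 7 -> 1.
        scored.append(8 - rating if item in REVERSE_SCORED else rating)
    return mean(scored)


# Hypothetical participant who answers 2 on every item.
example = {item: 2 for item in range(1, 12)}
for name in VERSIONS:
    print(name, round(score_wbis_m(example, name), 2))
```

Because the reverse-scored items are the ones dropped in the shorter versions, the same set of responses can yield different mean scores across versions, which is one reason to report which version was scored.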
Statistical analysis
Analyses were conducted using R (v. 4.3.0; [20]). For CFA, the lavaan package (v. 0.6–17; [21]) was used, and the psych package [22] was used for estimates of skewness, kurtosis, sampling adequacy (Kaiser–Meyer–Olkin measure; KMO), and factor analysis reliability (ρFA; see Note 1; [23, 24]). The correlation package [25] was used for correlations.
To identify models within CFA, factor variances were fixed to 1 and items were treated as categorical (see [21]). Mardia’s test suggested the presence of non-normality, so robust estimation (Weighted Least Squares Means- and Variance-adjusted; WLSMV) was used in CFA. Skewness estimates for WBIS-M items ranged from −0.30 (Item 3) to 1.28 (Item 8). To assess model specification, common fit indices (comparative fit index [CFI], Tucker–Lewis index [TLI], root mean square error of approximation [RMSEA, including 90% CIs], standardized root mean squared residual [SRMR]) were used (see [26]). Recent work has suggested that the RMSEA is sensitive (and more likely to suggest rejection of models) in cases of strong factor loadings (i.e., λs = 0.70–0.90), although this phenomenon will also affect the CFI and SRMR, albeit to a lesser extent (e.g., [27, 28]). As such, whilst the RMSEA is reported, we will interpret the findings with this possibility in mind, given the reasonably strong factor loadings (typically λs > 0.65) predicted for the WBIS-M (e.g., [9, 10, 13]).
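For readers less familiar with the fit indices named above, the sketch below shows the textbook formulas for the RMSEA, CFI, and TLI computed from a model chi-square and a baseline (independence) model chi-square. It is a simplified Python illustration, not the authors' procedure: the robust (scaled) indices that lavaan reports under WLSMV involve additional corrections, conventions differ on whether N or N − 1 appears in the RMSEA denominator, and all input values here are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA point estimate, using the (N - 1) convention."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2, df, chi2_base, df_base):
    """Comparative fit index relative to the baseline model."""
    numerator = max(chi2 - df, 0.0)
    denominator = max(chi2_base - df_base, chi2 - df, 0.0)
    return 1.0 if denominator == 0 else 1.0 - numerator / denominator


def tli(chi2, df, chi2_base, df_base):
    """Tucker-Lewis index (can fall outside the 0-1 range)."""
    ratio_base = chi2_base / df_base
    return (ratio_base - chi2 / df) / (ratio_base - 1.0)


# Hypothetical values for a 10-item, one-factor model with N = 205.
chi2, df, n = 120.0, 35, 205
chi2_base, df_base = 3000.0, 45

print("RMSEA:", round(rmsea(chi2, df, n), 3))
print("CFI:  ", round(cfi(chi2, df, chi2_base, df_base), 3))
print("TLI:  ", round(tli(chi2, df, chi2_base, df_base), 3))
```

The formulas make one feature of these indices visible: the RMSEA depends only on the model's own misfit per degree of freedom, whereas the CFI and TLI compare that misfit with a baseline model, so the two families of indices can disagree when the baseline model fits very badly (as it does when loadings are strong).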
To assess the relationship between WBI and other constructs of interest, non-parametric (Spearman’s r) correlations were conducted among the sum scores of the WBIS-M (Total scores based on each model’s items) and PHQ-8, GAD-7, and EAT scores. Participants’ body mass index (BMI; kg/m2) was calculated from self-reported weight and height, and used as a further criterion variable.
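As a minimal illustration of the two calculations in this paragraph, the Python sketch below computes BMI from weight and height and a Spearman correlation as the Pearson correlation of average ranks. It is not the authors' code (they used the R correlation package); the function names and the small data set are hypothetical.

```python
import math


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of average ranks)."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five participants.
wbis_scores = [2.1, 3.4, 1.8, 5.0, 4.2]
phq8_scores = [4, 9, 3, 15, 9]
print("BMI:", round(bmi(70.0, 1.75), 2))
print("Spearman:", round(spearman(wbis_scores, phq8_scores), 3))
```

Because the rank transformation is unchanged by any monotonic rescaling, a Spearman correlation is the same whether a given version of the WBIS-M is scored as a sum or as a mean of its items.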
A sample size of at least 200 was planned, given guidance on conducting CFA and the magnitude of expected item loadings [29, 30]. This figure is also sufficient for correlation analyses (given previous estimates of r ≈ 0.45; [7]).
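The claim that this sample size is sufficient for the correlation analyses can be checked with a standard approximation based on the Fisher z transformation. The article does not state how this was determined, so the Python sketch below is only one plausible check; the two-tailed alpha of 0.05 and power of 0.80 are assumptions, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a correlation of size r.

    Uses the Fisher z approximation with a two-tailed test.
    alpha and power are assumed values, not taken from the article.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)


required = n_for_correlation(0.45)
print("Approximate n required for r = 0.45:", required)
print("Planned sample of 200 is sufficient:", required <= 200)
```

Under these assumptions the required sample is far below 200, so the CFA, rather than the correlation analyses, drives the planned sample size.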
Results
The overall KMO statistic for WBIS-M items was 0.93 (range = 0.78–0.97), suggesting that data were appropriate for factor analysis. Of note, Item 1 represented the lowest value, with the next lowest being 0.91; this reflects the pattern of (standardised) factor loadings (λs), which were strong (> 0.75) except for Item 1 (see Table 2). Inter-item correlations for the WBIS-M are provided in Online Resource 1. No item on the WBIS-M had more than one missing data point (overall mean = 0.003%), so data were deleted listwise where necessary.
Model fit
Robust fit indices for the 10-item and 9-item models were improved over the 11-item model (Table 2). Values fell within the ‘acceptable’ range for the CFI and TLI (i.e., between 0.90 and 0.95), and were ‘good’ for the SRMR (< 0.08; [26]). Regarding the RMSEA, values indicated poor fit, and strong factor loadings of the WBIS-M were seen; specifically, the highest factor loading of the 10-item WBIS-M in this sample was 0.93, with an average of 0.84 (see Note 2). Factor analysis reliability estimates were ≥ 0.95 for the three versions of the WBIS-M (see Table 2).
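The link between strong loadings and high reliability estimates can be illustrated with the usual formula for factor analysis reliability (McDonald's omega) in a one-factor model with standardised loadings and uncorrelated residuals. The Python sketch below is illustrative only: it assumes, for simplicity, that all ten loadings equal the reported average of 0.84, which is not the actual pattern of loadings in Table 2, and the function name is hypothetical.

```python
def omega_from_loadings(loadings):
    """Reliability of a unit-weighted composite under a one-factor model.

    Assumes standardised loadings and uncorrelated residuals, so each
    item's residual variance is 1 - loading**2.
    """
    common = sum(loadings) ** 2
    unique = sum(1 - loading ** 2 for loading in loadings)
    return common / (common + unique)


# Simplifying assumption: ten items, each loading at the reported mean (0.84).
loadings = [0.84] * 10
print("Omega:", round(omega_from_loadings(loadings), 3))
```

With loadings of this size, adding or removing a single weakly loading item changes the estimate very little, which is consistent with the similar reliability values reported for the three versions.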
Correlations
Correlations with relevant criterion variables (i.e., PHQ-8, GAD-7, EAT, BMI) were all significant (ps < 0.001, two-tailed) and of a similar magnitude regardless of which version of the WBIS-M was used (see Table 2).
Discussion
This study examined the factor structure and psychometric properties of the WBIS-M in a sample of university students in the UK, supporting previous work in other samples demonstrating acceptable fit of a one-factor model. However, whilst fit for the 11-item model was marginal, poor factor loadings for Item 1 suggest that use of the 10-item version of the WBIS-M is more defensible. Given existing work in samples from different backgrounds (e.g., [12,13,14,15]) and few notable drawbacks, use of the 10-item WBIS-M to assess WBI should be encouraged.
In addition to structural validity, the 10-item WBIS-M demonstrated expected correlations with depression, anxiety, and eating pathology. The WBIS-M also showed moderate correlations with BMI in this student sample, in contrast to findings with the ‘original’ WBIS [6, 7]. Taken together, a number of findings support the convergent validity of the WBIS-M (e.g., [10, 13]), with the current study suggesting that this is the case regardless of which version is used. Removal of Item 9 [14], however, does not seem to confer particular advantages in terms of model fit, at least in this UK student sample. The proportion of missing data for the WBIS-M was very low, in line with previous work [14, 31].
Examining findings regarding structural validity, RMSEA values were above recommended cutoffs for all versions of the WBIS-M and might appear to indicate poor fit, particularly for the reduced-item models. However, recent empirical work has suggested that the RMSEA is very likely to indicate poor fit when factor loadings are high (see [27, 28]), a difficulty exacerbated as removal of one item is based largely on its weak factor loading (in the current study, λItem 1 = 0.29), as well as small correlations with other items. Similarly, some fit indices may be more likely to indicate poor model fit when the proportion of missing data is low [32]. Thus, as often recommended (e.g., [28, 29]), interpretation of fit indices should consider the context and complexity of the models under study rather than adhering strictly to a given ‘cutoff’. In the current study, therefore, the fit indices reported can be taken to indicate good fit of all models given the strong factor loadings, and good performance on the SRMR in particular. However, as Item 1 represents a clear exception, it seems both logical and empirically supported (e.g., [13, 15]) to omit this question from the WBIS-M in future work. Future psychometric studies of the WBIS-M should therefore consider the strong factor loadings typically seen, the small degrees of freedom, and the often-observed data completeness.
A recent study [33] proposed a short form of the WBIS-M, comprising three items (“I feel anxious about my weight because of what people might think of me”, “Whenever I think a lot about my weight, I feel depressed”, “I hate myself for my weight”). Results suggested that this measure can be interpreted similarly regardless of gender, age, and weight status [33], and the suggested factor structure has also been supported in a sample of Lebanese adults, using an Arabic version of the WBIS-3 [34]. However, a structural equation model with only three indicators and one latent variable has zero degrees of freedom and is thus ‘just-identified’. Its fit cannot be tested with standard CFA approaches and requires different methods (e.g., see [35]); the WBIS-3 was therefore not evaluated in the current study.
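The point about identification follows from a simple count of degrees of freedom. The Python sketch below, with a hypothetical function name, counts them for a one-factor model with continuous indicators, uncorrelated residuals, and the factor variance fixed to 1. Under categorical estimation the bookkeeping differs (thresholds and polychoric correlations replace variances), but for a one-factor model the resulting degrees of freedom are the same.

```python
def one_factor_df(n_items):
    """Degrees of freedom for a one-factor CFA with uncorrelated residuals.

    Observed information: unique variances and covariances of the items.
    Free parameters: one loading and one residual variance per item
    (the factor variance is fixed to 1 for identification).
    """
    observed = n_items * (n_items + 1) // 2
    free_parameters = 2 * n_items
    return observed - free_parameters


for n_items in (3, 9, 10, 11):
    print(f"{n_items} items: df = {one_factor_df(n_items)}")
```

With three items the model reproduces the observed covariances exactly, so there is no remaining information with which to test fit, whereas the 9-, 10-, and 11-item versions leave ample degrees of freedom.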
Strengths and limits
This is one of only a few studies assessing the structural validity and psychometric properties of the English-language WBIS-M and, additionally, directly compares the performance of three different versions. Commonly used and well-validated measures were used to assess construct validity, and findings suggest that one item of the WBIS-M may be (statistically) redundant and could be removed without compromising key advantages of the questionnaire.
Whilst there is good reason to support the interpretation of the findings based on fit indices (e.g., [26,27,28,29, 32]), there are some inconsistencies, particularly when compared to relatively close-fitting models reported in the literature (e.g., CFI of 0.977 [14] and Goodness-of-Fit Index of 0.989 [9]; cf. [12]). Further study, ideally with larger samples, modern methods, and diverse groups, might help clarify this. Similarly, whilst the sample size of the current study was adequate for CFA, this might have affected interpretation of some fit indices [30], and small subgroups (e.g., men) did not afford testing of measurement invariance of the WBIS-M. Finally, convergent validity could have been further assessed through inclusion of measures assessing weight stigma or WBI.
What is already known on this subject?
The WBIS-M has shown acceptable psychometric properties across several international samples. However, there has been limited assessment of a 10-item version of this measure, few empirical comparisons of different versions, and a dearth of work with university students, who are at elevated risk for internalised weight bias.
What this study adds
The current study offers further support for the unidimensionality of the WBIS-M in a UK student sample, suggesting that a 10-item version shows sound psychometric properties and excludes one item which evidences poor relationships to the overall score. The findings show that internalised weight bias is strongly related to depression, anxiety, eating pathology, and BMI.
Availability of data and materials
Data supporting this study are available from the University of Reading Research Data Archive at https://doi.org/10.17864/1947.001395.
Notes
1. This statistic is also referred to as McDonald’s omega [ω] and used as an indicator of internal consistency reliability; see [23] for a discussion.
2. To illustrate this apparent paradox, Saris et al. [27] found that manipulating factor loadings from 0.70 to 0.85 across a consistent model resulted in a change in RMSEA from 0.00 (indicating near-perfect fit) to 0.092 (representing poor fit, by the Hu & Bentler [26] criteria); Saris et al. conclude that this is “very inconvenient… because the better the measurement model – high loadings – the higher probability of [the model] getting rejected” (p. 569).
References
Wu Y-K, Berry DC (2018) Impact of weight stigma on physiological and psychological health outcomes for overweight and obese adults: a systematic review. J Adv Nurs 74(5):1030–1042. https://doi.org/10.1111/jan.13511
Pearl RL, Puhl RM (2018) Weight bias internalization and health: a systematic review. Obes Rev 19(8):1141–1163. https://doi.org/10.1111/obr.12701
Puhl RM, Himmelstein MS, Quinn DM (2018) Internalizing weight stigma: prevalence and sociodemographic considerations in US adults. Obesity 26(1):167–175. https://doi.org/10.1002/oby.22029
Papadopoulos S, de la Piedad Garcia X, Brennan L (2021) Evaluation of the psychometric properties of self-reported weight stigma measures: a systematic literature review. Obes Rev 22(8):e13267. https://doi.org/10.1111/obr.13267
Lindloff M, Meadows A (2023) Assessment of weight-related stigmatization. In: Meule A (ed) Assessment of eating behaviour. Hogrefe, Göttingen, pp 218–236
Durso LE, Latner JD (2008) Understanding self-directed stigma: development of the Weight Bias Internalization Scale. Obesity 16(S2):S80–S86. https://doi.org/10.1038/oby.2008.448
Pearl RL, Puhl RM (2014) Measuring internalized weight attitudes across body weight categories: validation of the modified weight bias internalization scale. Body Image 11(1):89–92. https://doi.org/10.1016/j.bodyim.2013.09.005
Saunders JF, Nutter S, Russell-Mayhew S (2022) Examining the conceptual and measurement overlap of body dissatisfaction and internalized weight stigma in predominantly female samples: a meta-analysis and measurement refinement study. Front Glob Womens Health 3:877554. https://doi.org/10.3389/fgwh.2022.877554
Macho S, Andrés A, Saldaña C (2021) Validation of the modified weight bias internalization scale in a Spanish adult population. Clin Obes 11(4):e12454. https://doi.org/10.1111/cob.12454
Argyrides M, Charalambous Z, Anastasiades E, Michael K (2022) Translation and validation of the Greek version of the modified weight bias internalization scale in an adult population. Clin Obes 12(2):e12503. https://doi.org/10.1111/cob.12503
Apay SE, Yilmaz E, Aksoy M, Akalin H (2017) Validity and reliability study of modified weight bias internalization scale in Turkish. Int J Caring Sci 10(3):1341–1347
Lussier T, Tangen JHQ, Eik-Nes TT, Karlsen HR, Berg KH, Fiskum C (2024) Testing the validity of the Norwegian translation of the modified weight bias internalization scale. J Eat Disord 12:117. https://doi.org/10.1186/s40337-024-01067-z
Andrés A, Fornieles-Deu A, Sepúlveda AR, Beltrán-Garrayo L, Montcada-Ribera A, Bach-Faig A, Sánchez-Carracedo D (2022) Spanish validation of the modified weight bias internalization scale (WBIS-M) for adolescents. Eat Weight Disord 27:3245–3256. https://doi.org/10.1007/s40519-022-01453-z
Adams V (2024) Validation of the modified weight bias internalization scale (WBIS-M) among first-generation Asian immigrants. Health Soc Work 49(1):17–24. https://doi.org/10.1093/hsw/hlad033
Aimé A, Fuller-Tyszkiewicz M, Dion J et al (2020) Assessing positive body image, body satisfaction, weight bias, and appearance comparison in emerging adults: a cross-validation study across eight countries. Body Image 35:320–332. https://doi.org/10.1016/j.bodyim.2020.09.014
Meadows A (2017) Fear and self-loathing: Internalised weight stigma and maladaptive coping in higher-weight individuals. Dissertation, University of Birmingham. https://etheses.bham.ac.uk/id/eprint/8465/1/Meadows18PhD.pdf. Accessed 8 Dec 2024
Kroenke K, Strine TW, Spitzer RL, Williams JBW, Berry JT, Mokdad AH (2009) The PHQ-8 as a measure of current depression in the general population. J Affect Disord 114(1–3):163–173. https://doi.org/10.1016/j.jad.2008.06.026
Spitzer RL, Kroenke K, Williams JBW, Löwe B (2006) A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med 166(10):1092–1097. https://doi.org/10.1001/archinte.166.10.1092
Garner DM, Olmsted MP, Bohr Y, Garfinkel PE (1982) The eating attitudes test: psychometric features and clinical correlates. Psychol Med 12(4):871–878. https://doi.org/10.1017/S0033291700049163
R Core Team (2023) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Rosseel Y (2012) lavaan: an R package for structural equation modeling. J Stat Softw 48(2):1–36. https://doi.org/10.18637/jss.v048.i02
Revelle W (2023) psych: procedures for psychological, psychometric, and personality research. Northwestern University, Evanston, Illinois. R package version 2.3.12. https://CRAN.R-project.org/package=psych
Cho E (2021) Neither Cronbach’s alpha nor McDonald’s omega: a commentary on Sijtsma and Pfadt. Psychometrika 86:877–886. https://doi.org/10.1007/s11336-021-09801-1
McDonald RP (1999) Test theory: a unified treatment. Lawrence Erlbaum, Mahwah, NJ
Makowski D, Wiernik BM, Patil I, Lüdecke D, Ben-Shachar MS (2022) correlation: Methods for correlation analysis (0.8.3) [R package]. https://CRAN.R-project.org/package=correlation (Original work published 2020)
Hu L-t, Bentler PM (1999) Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Model 6(1):1–55. https://doi.org/10.1080/10705519909540118
Saris WE, Satorra A, van der Veld W (2009) Testing structural equation models or detection of misspecifications? Struct Equ Model 16(4):561–582. https://doi.org/10.1080/10705510903203433
McNeish D, An J, Hancock GR (2018) The thorny relation between measurement quality and fit index cutoffs in latent variable models. J Pers Assess 100(1):43–52. https://doi.org/10.1080/00223891.2017.1281286
Jackson DL, Voth J, Frey MP (2013) A note on sample size and solution propriety for confirmatory factor analytic models. Struct Equ Model 20(1):86–97. https://doi.org/10.1080/10705511.2013.742388
Wolf EJ, Harrington KM, Clark SL, Miller MW (2013) Sample size requirements for structural equation models: an evaluation of power, bias, and solution propriety. Educ Psychol Meas 73(6):913–934. https://doi.org/10.1177/0013164413495237
Schraven S, Hübner C, Eichler J, Mansfeld T, Sander J, Seyfried F, Kaiser S, Dietrich A, Schmidt R, Hilbert A (2024) Psychometric properties of the WBIS/M in a representative prebariatric sample: evidence for an improved 10-item version. Obes Facts 17(4):329–337. https://doi.org/10.1159/000537689
Zhang X, Savalei V (2020) Examining the effect of missing data on RMSEA and CFI under normal theory full-information maximum likelihood. Struct Equ Model 27(2):219–239. https://doi.org/10.1080/10705511.2019.1642111
Kliem S, Puls H-C, Hinz A, Kersting A, Brähler E, Hilbert A (2020) Validation of a three-item short form of the modified weight bias internalization scale (WBIS-3) in the German population. Obes Facts 13(6):560–571. https://doi.org/10.1159/000510923
Fekih-Romdhane F, He J, Malaeb D, Dabbous M, Hallit R, Obeid S, Hallit S (2023) Psychometric properties of the Arabic versions of the three-item short form of the modified weight bias internalization scale (WBIS-3) and the muscularity bias internalization scale (MBIS). J Eat Disord 11:82. https://doi.org/10.1186/s40337-023-00805-z
Czerwiński SK, Atroszko PA (2023) A solution for factorial validity testing of three-item scales: an example of tau-equivalent strict measurement invariance of three-item loneliness scale. Curr Psychol 42:1652–1664. https://doi.org/10.1007/s12144-021-01554-5
Funding
The authors declare that no specific grant from funding agencies in the public, commercial, or not-for-profit sectors was received during the preparation of this manuscript.
Author information
Contributions
PEJ: conceptualization, methodology, formal analysis, writing- original draft preparation, supervision. LB: data curation, writing- reviewing and editing.
Ethics declarations
Ethics approval and consent to participate
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of the University of Reading (Ref: 2023-034-PJ).
Informed consent
Informed consent was obtained from all individual participants included in the study, who were also aware that their data would be preserved in an anonymised form for possible re-use.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Jenkins, P.E., Baysen, L. The modified weight bias internalization scale: psychometric validation of three versions in a sample of university students. Eat Weight Disord 30, 28 (2025). https://doi.org/10.1007/s40519-025-01741-4
DOI: https://doi.org/10.1007/s40519-025-01741-4


