Abstract
The economics of obesity literature implicitly assumes that measured anthropometrics are error-free and they are often treated as a gold standard when compared to self-reported data. We use factor mixture models to analyse measurement error in both self-reported and measured anthropometrics with nationally representative data from the 2013 National Health Survey in Brazil. A small but statistically significant fraction of measured anthropometrics are attributed to recording errors, while, as they are imprecisely recorded and due to reporting behaviour, only between 10 and 23% of our self-reported anthropometrics are free from any measurement error. Post-estimation analysis allows us to calculate hybrid anthropometric predictions that best approximate the true body weight and height distribution. BMI distributions based on the hybrid measures do not differ between our factor mixture models, with and without covariates, and are generally close to those based on measured data, while BMI based on self-reported data under-estimates the true BMI distribution. “Corrected self-reported BMI” measures, based on common methods to mitigate reporting error in self-reports using predictions from corrective equations, do not seem to be a good alternative to our “hybrid” BMI measures. Analysis of regression models for the association between BMI and health care utilization shows only small differences, concentrated at the far-right tails of the BMI distribution, when they are based on our hybrid measure as opposed to measured BMI. However, more pronounced differences are observed, at the lower and higher tails of BMI, when these are compared to self-reported or “corrected self-reported” BMI.
Notes
These fabrication errors (if they exist) are unlikely to result in mean reversion/mean divergence but may be fairly random errors. Existing studies have shown evidence of misperception of body size (Zelenytė et al. 2021), suggesting that interviewers may not be able to accurately predict participants’ body weight/height (if not measured) and, thus, not be able to make guesses that may lead to mean reversion/mean divergence (i.e. guesswork that is strongly correlated with true body weight and height).
The factor mixture measurement error model proposed by Kapteyn and Ypma (2007) assumes that observed administrative income data are a mixture of correct matches and mismatches (with survey data). However, they argue that, over and above potential mismatches in the linkage between administrative and survey data, it is also likely that administrative and survey data may capture conceptually different things. As such, they argue that there is no loss of generality to assume that measurement error in administrative data may reflect different sources. Analogously, in our analysis measurement error in measured anthropometrics may reflect different sources (as described above), in particular interviewers’ errors related to entering values from the measurement equipment to the survey materials, fabrication of the measurement of anthropometrics by the interviewer or even physical measurements for the wrong household member.
Even in the case of fabricated interviews or when anthropometric measurement is not conducted for the intended respondent, this may be a strong assumption if quality control takes place. However, there is no such quality control undertaken in the dataset used in our analysis (as well as in many other multi-purpose social science datasets that collect anthropometrics).
Self-reported anthropometrics are collected as integer values (cm for height and kg for weight), while the corresponding measured values are recorded to one decimal place. In those cases where the respondent provided a non-integer value of their self-reported body weight and/or height (for example 61.5 kg), the interviewer recorded an integer value (such as 61 kg or 62 kg).
Mean reversion (ρ < 0) means that respondents with high (low) values of true anthropometric measures, relative to the true mean, tend to under-report (over-report) their body weight and height in self-reports; the opposite is the case for mean divergence (ρ > 0).
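As an illustration of this sign convention, the following minimal simulation sketch assumes a linear reporting rule, s = ξ + ρ(ξ − μ) + noise, which is a common form in this class of models but is an assumption here rather than the specification estimated in the article. The function name, the sample size and the parameter values (ρ = −0.1, a true mean of 70 kg) are hypothetical.

```python
import numpy as np

def simulate_self_reports(xi, rho, noise_sd, rng):
    """Return self-reports s = xi + rho * (xi - mean(xi)) + noise (assumed linear form)."""
    mu = xi.mean()
    noise = rng.normal(0.0, noise_sd, size=xi.shape)
    return xi + rho * (xi - mu) + noise

rng = np.random.default_rng(0)
xi = rng.normal(70.0, 12.0, size=100_000)   # hypothetical true body weights (kg)
s = simulate_self_reports(xi, rho=-0.1, noise_sd=1.0, rng=rng)

error = s - xi                               # reporting error
above_mean = xi > xi.mean()
mean_error_above = error[above_mean].mean()  # mean error among those above the true mean
mean_error_below = error[~above_mean].mean() # mean error among those below the true mean

# Slope of the reporting error on the true value; approximately rho in this setup.
slope = np.polyfit(xi, error, 1)[0]
```

With ρ < 0, the average reporting error is negative above the true mean and positive below it; switching the sign of `rho` would produce the mean-divergence pattern instead.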
Moreover, one may argue that survey mode may influence measurement error in self-reported anthropometrics. For example, social desirability bias is much lower in the case of self-completion than in an open interview (Bowling 2005); thus, assuming that being taller and not of excess weight is more socially desirable, shorter people and those with excess weight may have distinct reporting patterns across collection modes. However, existing studies do not confirm the presence of such influences on reporting errors. Davillas and Jones (2021) find that measurement errors in anthropometrics do not differ according to the mode of interview, with similar patterns observed when self-reported anthropometrics are collected using randomly assigned open interview and self-completion modes. Along similar lines, Cawley et al. (2015), who also discuss mean reversion in reporting error in weight, highlight that interviewers do not amend or correct the self-reported anthropometrics based on measured data in their datasets and, thus, no additional interviewer effects are expected.
Typically, failures of measurement equipment may also be relevant for measurement error in physical measurements of anthropometrics. However, we believe that the risk of equipment failure is less relevant in our dataset given the prevention mechanisms/protocols we describe above.
The mean square error is computed as \(E(\text{predictor} - \xi)^{2} = \text{Bias}^{2} + \text{Variance}\). Reliability measures are computed as follows: \(\text{Rel}1(r) = \text{cov}(\xi, r)/\text{var}(r)\), \(\text{Rel}1(s) = \text{cov}(\xi, s)/\text{var}(s)\), \(\text{Rel}2(r) = \text{cov}(\xi, r)^{2}/[\text{var}(\xi) \cdot \text{var}(r)]\) and \(\text{Rel}2(s) = \text{cov}(\xi, s)^{2}/[\text{var}(\xi) \cdot \text{var}(s)]\). Further details can be found in Jenkins and Rios-Avila (2023a).
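The formulas in this note can be sketched on simulated data in which the true value is known. In the article these quantities are derived from the estimated model, since ξ is latent; the sketch below is only a sample-moment illustration. It assumes that ξ denotes the true value, r the measured value and s the self-report, and all function names and data-generating parameters are hypothetical.

```python
import numpy as np

def reliability(xi, x):
    """Return (Rel1, Rel2) for an observed measure x of the true value xi."""
    cov = np.cov(xi, x)                    # 2x2 sample covariance matrix
    cov_xi_x = cov[0, 1]
    rel1 = cov_xi_x / cov[1, 1]            # cov(xi, x) / var(x)
    rel2 = cov_xi_x ** 2 / (cov[0, 0] * cov[1, 1])  # squared correlation
    return rel1, rel2

def mean_square_error(predictor, xi):
    """Return Bias^2 + Variance of the prediction error."""
    err = predictor - xi
    bias = err.mean()
    variance = err.var()                   # ddof=0, so the decomposition is exact
    return bias ** 2 + variance

rng = np.random.default_rng(1)
xi = rng.normal(70.0, 12.0, size=50_000)   # hypothetical true weights (kg)
r = xi + rng.normal(0.0, 0.3, size=xi.shape)  # measured: small random error
# Self-report: mild mean reversion, noise, and rounding to an integer.
s = np.round(xi - 0.1 * (xi - xi.mean()) + rng.normal(0.0, 2.0, size=xi.shape))

rel1_r, rel2_r = reliability(xi, r)
rel1_s, rel2_s = reliability(xi, s)
mse_r = mean_square_error(r, xi)
mse_s = mean_square_error(s, xi)
```

Rel2 is the squared correlation between the observed measure and the true value, so it always lies between 0 and 1, whereas Rel1 is the slope of a regression of the true value on the observed measure and need not be bounded by 1.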
The 2013 National Health Survey of Brazil is publicly available online: https://www.ibge.gov.br/estatisticas/sociais/saude/9160-pesquisa-nacional-de-saude.html?=&t=microdados.
In PNS-2019, which collected data in 2019, body weight and height were measured for a much smaller sub-sample of respondents, due to the difficulties in carrying out physical anthropometric measurements for the full survey sample selected for individual interviews (Reis et al. 2022). In PNS-2013, by contrast, the anthropometric measurements were carried out on all residents selected for the individual interview, except pregnant women (Damacena et al. 2015). Collection of both self-reported and measured anthropometrics in the same wave is necessary for our research question and for the estimation requirements of our factor mixture models. Given that measured anthropometrics are only available for a small fraction of the total survey sample in PNS-2019, and because time sensitivity is not a constraint for the scope and nature of our research question, we have used the PNS-2013 data for our analysis.
Figure 4 (Appendix) plots the absolute differences between the 1st and 2nd physical measurements of body weight and height. The graph shows that the mass of the absolute differences is concentrated at zero, and that only a few observations have absolute differences between the 1st and 2nd measurements that exceed 1.5 kg (for body weight) or 1.5 cm (for body height).
The corresponding kernel density distributions for self-reported and measured body weight, height and BMI are presented in Figure 5 (Appendix). It seems that both self-reported and measured body height data have approximately normally shaped distributions, although right-skewed distributions are observed for the case of body weight and BMI. This is important as our model assumes normality for the factor distributions and identification of the components of the mixture of normals stems from non-normality in the (joint) distribution of observed outcomes.
Existing studies in the economics of obesity literature that rely on self-reported anthropometrics often estimate corrective equations (or utilize the coefficients from existing equations) based on the relationship between measured and self-reported body weight and height data from alternative data sources (Cawley 2015). To mimic correction procedures for self-reported anthropometrics in the existing studies, we estimate analogous “corrective” equations by regressing measured weight and height data on self-reports and a vector of demographics (results from these equations are available in Appendix, Table 14). The predictions from these equations are used to calculate self-reports of body weight and height that are corrected for reporting error; these yield our “corrected self-reported BMI” measure, as presented in Tables 10 and 11.
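The correction procedure described in this note can be sketched as follows. This is a minimal illustration on simulated data, not the specification reported in Table 14: the choice of demographics (age and a female indicator), the data-generating parameters and all function names are assumptions made for the example, and the equations are fitted and applied in the same sample for simplicity.

```python
import numpy as np

def design_matrix(self_reported, demographics):
    """Stack an intercept, the self-report and the demographic covariates."""
    return np.column_stack([np.ones(len(self_reported)), self_reported, demographics])

def fit_corrective_equation(measured, self_reported, demographics):
    """OLS regression of the measured value on the self-report and demographics."""
    X = design_matrix(self_reported, demographics)
    beta, *_ = np.linalg.lstsq(X, measured, rcond=None)
    return beta

def predict_corrected(beta, self_reported, demographics):
    """Corrected self-report: fitted value from the corrective equation."""
    return design_matrix(self_reported, demographics) @ beta

def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    return weight_kg / (height_cm / 100.0) ** 2

rng = np.random.default_rng(2)
n = 5_000
age = rng.uniform(18, 80, size=n)                   # hypothetical demographics
female = rng.integers(0, 2, size=n).astype(float)
demographics = np.column_stack([age, female])

measured_w = rng.normal(72.0, 13.0, size=n)         # hypothetical measured weight (kg)
measured_h = rng.normal(166.0, 9.0, size=n)         # hypothetical measured height (cm)
# Hypothetical self-reports: biased, noisy and rounded to integers.
self_w = np.round(measured_w - 0.08 * (measured_w - 72.0) + rng.normal(0, 2.0, size=n))
self_h = np.round(measured_h + 0.5 + rng.normal(0, 1.5, size=n))

beta_w = fit_corrective_equation(measured_w, self_w, demographics)
beta_h = fit_corrective_equation(measured_h, self_h, demographics)
corrected_w = predict_corrected(beta_w, self_w, demographics)
corrected_h = predict_corrected(beta_h, self_h, demographics)
corrected_bmi = bmi(corrected_w, corrected_h)
```

In applied work the coefficients are often estimated in one dataset that contains both measures and then applied to another that contains only self-reports, so out-of-sample performance may differ from the in-sample fit shown here.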
References
Arntsen SH, Borch KB, Wilsgaard T, Njølstad I, Hansen AH (2023) Time trends in body height according to educational level: a descriptive study from the Tromsø Study 1979–2016. PLoS ONE 18(1):e0279965
Baum CL II, Ruhm CJ (2009) Age, socioeconomic status and obesity growth. J Health Econ 28(3):635–648
Baum CL (2007) The effects of race, ethnicity, and age on obesity. J Popul Econ 20:687–705
Bilger M, Kruger EJ, Finkelstein EA (2017) Measuring socioeconomic inequality in obesity: looking beyond the obesity threshold. Health Econ 26:1052–1066
Bowling A (2005) Mode of questionnaire administration can have serious effects on data quality. J Public Health 27(3):281–291
Cawley J (2015) An economy of scales: a selective review of obesity’s economic causes, consequences, and solutions. J Health Econ 43:244–268
Cawley J, Meyerhoefer C (2012) The medical care costs of obesity: an instrumental variables approach. J Health Econ 31(1):219–230
Cawley J (2004) The impact of obesity on wages. J Hum Resources 39(2):451–474
Cawley J, Maclean JC, Hammer M, Wintfeld N (2015) Reporting error in weight and its implications for bias in economic models. Econ Hum Biol 19:27–44
Damacena GN, Szwarcwald CL, Malta DC et al (2015) The development of the National Health survey in Brazil, 2013. Epidemiologia e Serviços De Saúde 24:197–206
Davillas A, Benzeval M (2016) Alternative measures to BMI: exploring income-related inequalities in adiposity in Great Britain. Soc Sci Med 166:223–232
Davillas A, Jones AM (2020) Regional inequalities in adiposity in England: distributional analysis of the contribution of individual-level characteristics and the small area obesogenic environment. Econ Hum Biol 38:100887
Davillas A, Jones AM (2021) The implications of self-reported body weight and height for measurement error in BMI. Econ Lett 209:110101
Davillas A, Pudney S (2017) Concordance of health states in couples: analysis of self-reported, nurse administered and blood-based biomarker data in the UK Understanding Society panel. J Health Econ 56:87–102
Davillas A, Pudney S (2020a) Biomarkers as precursors of disability. Econ Hum Biol 36:100814
Davillas A, Pudney S (2020b) Biomarkers, disability and health care demand. Econ Hum Biol 39:100929
Engstrom JL, Paterson SA, Doherty A et al (2003) Accuracy of self-reported height and weight in women: an integrative review of the literature. J Midwifery Womens Health 48(5):338–345
Finn A, Ranchhod V (2017) Genuine fakes: the prevalence and implications of data fabrication in a large South African survey. World Bank Econ Rev 31(1):129–157
Fryar CD, Carroll MD, Gu Q, Afful J, Ogden CL (2021) Anthropometric reference data for children and adults. U.S. Department of Health & Human Services, National Center for Health Statistics, United States
Gil J, Mora T (2011) The determinants of misreporting weight and height: the role of social norms. Econ Hum Biol 9:78–91
Gorber SC, Tremblay M, Moher D, Gorber B (2007) A comparison of direct vs. self-report measures for assessing height, weight and body mass index: a systematic review. Obes Rev 8(4):307–326
Groves RM (2005) Survey errors and survey costs. Wiley
Jenkins SP, Rios-Avila F (2020) Modelling errors in survey and administrative data on employment earnings: sensitivity to the fraction assumed to have error-free earnings. Econ Lett 192:109253
Jenkins SP, Rios-Avila F (2021) Measurement error in earnings data: replication of Meijer, Rohwedder, and Wansbeek’s mixture model approach to combining survey and register data. J Appl Economet 36(4):474–483
Jenkins SP, Rios-Avila F (2023a) Reconciling reports: modelling employment earnings and measurement errors using linked survey and administrative data. J R Stat Soc Ser A Stat Soc 186(1):110–136
Jenkins SP, Rios-Avila F (2023b) Finite mixture models for linked survey and administrative data: estimation and post-estimation. Stata J 23(1):53–85
Johnston DW, Propper C, Shields MA (2009) Comparing subjective and objective measures of health: evidence from hypertension for the income/health gradient. J Health Econ 28(3):540–552
Kapteyn A, Ypma JY (2007) Measurement error and misclassification: a comparison of survey and administrative data. J Labor Econ 25:513–551
Keith SW, Fontaine KR, Pajewski NM, Mehta T, Allison DB (2011) Use of self-reported height and weight biases the body mass index–mortality association. Int J Obes 35(3):401–408
Knäuper B, Carrière K, Chamandy M, Xu Z, Schwarz N, Rosen NO (2016) How aging affects self-reports. Eur J Ageing 13:185–193
Li J, Simon G, Castro MR, Kumar V, Steinbach MS, Caraballo PJ (2021) Association of BMI, comorbidities and all-cause mortality by using a baseline mortality risk model. PLoS ONE 16(7):e0253696
Lin X, Xu Y, Xu J et al (2020) Global burden of noncommunicable disease attributable to high body mass index in 195 countries and territories, 1990–2017. Endocrine 69(2):310–320
Ljungvall Å, Gerdtham UG, Lindblad U (2015) Misreporting and misclassification: implications for socioeconomic disparities in body-mass index and obesity. Eur J Health Econ 16:5–20
Meijer E, Rohwedder S, Wansbeek T (2012) Measurement error in earnings data: using a mixture model approach to combine survey and register data. J Bus Econ Stat 30:191–201
Ng M, Fleming T, Robinson M et al (2014) Global, regional, and national prevalence of overweight and obesity in children and adults during 1980–2013: a systematic analysis for the Global Burden of Disease Study 2013. The Lancet 384(9945):766–781
O’Neill D, Sweetman O (2013) The consequences of measurement error when estimating the impact of obesity on income. IZA J Labor Econ 2(1):1–20
Olbrich L, Kosyakova Y, Sakshaug JW (2022) The reliability of adult self-reported height: the role of interviewers. Econ Hum Biol. https://doi.org/10.1016/j.ehb.2022.101118
PNS (2013) Pesquisa Nacional de Saúde 2013 – Manual de Antropometria. Instituto Brasileiro de Geografia e Estatistica. Rio de Janeiro. Available at: https://biblioteca.ibge.gov.br/visualizacao/instrumentos_de_coleta/doc3426.pdf
Prospective Studies Collaboration, Whitlock G, Lewington S et al (2009) Body-mass index and cause-specific mortality in 900 000 adults: collaborative analyses of 57 prospective studies. The Lancet 373(9669):1083–1096
Puhl RM, Heuer CA (2009) The stigma of obesity: a review and update. Obesity 17(5):941–964
Reis RCPD, Duncan BB, Malta DC et al (2022) Evolution of diabetes in Brazil: prevalence data from the 2013 and 2019 Brazilian National Health Survey. Cad Saude Publica 38:e00149321. https://doi.org/10.1590/0102-311X00149321
Rimes-Dias KA, Costa JC, Canella DS (2022) Obesity and health service utilization in Brazil: data from the National Health Survey. BMC Public Health 22(1):1474
Rooth DO (2009) Obesity, attractiveness, and differential treatment in hiring: a field experiment. J Hum Resources 44(3):710–735
Rtveladze K, Marsh T, Webber L et al (2013) Health and economic burden of obesity in Brazil. PLoS ONE 8(7):e68785. https://doi.org/10.1371/journal.pone.0068785
Sattler KM, Deane FP, Tapsell L, Kelly PJ (2018) Gender differences in the relationship of weight-based stigmatisation with motivation to exercise and physical activity in overweight individuals. Health Psychol Open 5(1):2055102918759691. https://doi.org/10.1177/2055102918759691
Sherry B, Jefferds ME, Grummer-Strawn LM (2007) Accuracy of adolescent self-report of height and weight in assessing overweight status: a literature review. Arch Pediatr Adolesc Med 161(12):1154–1161
Szwarcwald CL, Malta DC, Pereira CA et al (2014) Pesquisa Nacional de Saúde no Brasil: concepção e metodologia de aplicação. Cien Saude Colet 19(2):333–342
Triaca LM, Jacinto PA, França MTA, Tejada CAO (2020) Does greater unemployment make people thinner in Brazil? Health Econ 29:1279–1288
U.S.D.H.H.S. (2010) The Surgeon General’s Vision for a Healthy and Fit Nation. U.S. Department of Health and Human Services, Office of the Surgeon General, Rockville, MD
Zelenytė V, Valius L, Domeikienė A et al (2021) Body size perception, knowledge about obesity and factors associated with lifestyle change among patients, health care professionals and public health experts. BMC Fam Pract 22(1):1–13
Zhang Q, Wang Y (2004) Socioeconomic inequality of obesity in the United States: do gender, age, and ethnicity matter? Soc Sci Med 58(6):1171–1180
About this article
Cite this article
Davillas, A., de Oliveira, V.H. & Jones, A.M. A model of errors in BMI based on self-reported and measured anthropometrics with evidence from Brazilian data. Empir Econ (2024). https://doi.org/10.1007/s00181-024-02616-w
DOI: https://doi.org/10.1007/s00181-024-02616-w