A model of errors in BMI based on self-reported and measured anthropometrics with evidence from Brazilian data

Empirical Economics

Abstract

The economics of obesity literature implicitly assumes that measured anthropometrics are error-free, and they are often treated as a gold standard against which self-reported data are compared. We use factor mixture models to analyse measurement error in both self-reported and measured anthropometrics, using nationally representative data from the 2013 National Health Survey in Brazil. A small but statistically significant fraction of measured anthropometrics is attributed to recording errors. Because self-reported anthropometrics are imprecisely recorded and affected by reporting behaviour, only between 10 and 23% of them are free from any measurement error. Post-estimation analysis allows us to calculate hybrid anthropometric predictions that best approximate the true body weight and height distribution. BMI distributions based on the hybrid measures do not differ between our factor mixture models with and without covariates and are generally close to those based on measured data, whereas BMI based on self-reported data under-estimates the true BMI distribution. “Corrected self-reported BMI” measures, which use predictions from corrective equations (a common method to mitigate reporting error in self-reports), do not seem to be a good alternative to our “hybrid” BMI measures. Regression models for the association between BMI and health care utilization show only small differences, concentrated at the far-right tails of the BMI distribution, when they are based on our hybrid measure as opposed to measured BMI. However, more pronounced differences are observed, at the lower and higher tails of BMI, when these are compared with self-reported or “corrected self-reported” BMI.

Notes

  1. These fabrication errors (if they exist) are unlikely to result in mean reversion/mean divergence and are more likely to be fairly random errors. Existing studies have shown evidence of misperception of body size (Zelenytė et al. 2021), suggesting that interviewers may not be able to accurately predict participants’ body weight/height (if not measured) and, thus, may not be able to make the kind of guesses that would lead to mean reversion/mean divergence (i.e. guesswork that is strongly correlated with true body weight and height).

  2. The factor mixture measurement error model proposed by Kapteyn and Ypma (2007) assumes that observed administrative income data are a mixture of correct matches and mismatches (with survey data). However, they argue that, over and above potential mismatches in the linkage between administrative and survey data, it is also likely that administrative and survey data may capture conceptually different things. As such, they argue that there is no loss of generality to assume that measurement error in administrative data may reflect different sources. Analogously, in our analysis measurement error in measured anthropometrics may reflect different sources (as described above), in particular interviewers’ errors related to entering values from the measurement equipment to the survey materials, fabrication of the measurement of anthropometrics by the interviewer or even physical measurements for the wrong household member.

  3. Even in the case of fabricated interviews or when anthropometric measurement is not conducted for the intended respondent, this may be a strong assumption if quality control takes place. However, there is no such quality control undertaken in the dataset used in our analysis (as well as in many other multi-purpose social science datasets that collect anthropometrics).

  4. Self-reported anthropometrics are collected as integer values (cm for height and kg for weight), while the corresponding measured values are recorded to one decimal place. Where a respondent provided a non-integer value for self-reported body weight and/or height (for example 61.5 kg), the interviewer recorded an integer value (such as 61 kg or 62 kg).

  5. Mean reversion (ρ < 0) means that respondents with high (low) values of true anthropometric measures, relative to the true mean, tend to under-report (over-report) their body weight and height in self-reports; the opposite is the case for mean divergence (ρ > 0).

  6. Moreover, one may argue that survey mode may influence measurement error in self-reported anthropometrics. For example, social desirability bias is much lower in the case of self-completion as opposed to the open interview (Bowling 2005); thus, assuming that being taller and not of excess weight is more socially desirable, shorter people and those with excess weight may have distinct reporting patterns across collection modes. However, existing studies do not confirm the presence of such influences on reporting errors. Davillas and Jones (2021) find that measurement errors in anthropometrics do not differ according to the mode of interview, with similar patterns observed when self-reported anthropometrics are collected using randomly assigned open interview and self-completion modes. Along similar lines, Cawley et al. (2015), who also discuss mean reversion in reporting error in weight, highlight that interviewers do not amend/correct the self-reported anthropometrics based on measured data in their datasets and, thus, no additional interviewer effects are expected.

  7. Typically, failures of measurement equipment may also be relevant for measurement error in physical measurements of anthropometrics. However, we believe that the risk of equipment failure is less relevant in our dataset given the prevention mechanisms/protocols we describe above.

  8. The user-written Stata command “ky_fit” predicts the seven “hybrid” measures proposed by Meijer et al. (2012). Table 6 in Jenkins and Rios-Avila (2023b) provides the descriptions of the predictors (“hybrid” outcomes), with the corresponding derivation of the formulae presented in their appendix.

  9. The mean square error is computed as \(E\left[ \left( \text{predictor} - \xi \right)^{2} \right] = \text{Bias}^{2} + \text{Variance}\). Reliability measures are computed as follows: \(\text{Rel}1(r) = \text{cov}(\xi, r)/\text{var}(r)\), \(\text{Rel}1(s) = \text{cov}(\xi, s)/\text{var}(s)\), \(\text{Rel}2(r) = \text{cov}(\xi, r)^{2}/\left[ \text{var}(\xi) \cdot \text{var}(r) \right]\) and \(\text{Rel}2(s) = \text{cov}(\xi, s)^{2}/\left[ \text{var}(\xi) \cdot \text{var}(s) \right]\). Further details can be found in Jenkins and Rios-Avila (2023a).

  10. The 2013 National Health Survey of Brazil is publicly available online: https://www.ibge.gov.br/estatisticas/sociais/saude/9160-pesquisa-nacional-de-saude.html?=&t=microdados.

  11. In PNS-2019, which collected data in 2019, body weight and height were measured for a much smaller sub-sample of respondents, due to the difficulties of carrying out physical anthropometric measurements for the full survey sample selected for individual interviews (Reis et al. 2022). By contrast, in PNS-2013, the anthropometric measurements were carried out on all residents selected for the individual interview, except pregnant women (Damacena et al. 2015). Collection of both self-reported and measured anthropometrics in the same wave is necessary for our research question and for the estimation requirements of our factor mixture models. Given that measured anthropometrics are only available for a small fraction of the total survey sample in PNS-2019, and because time sensitivity is not a constraint given the scope and nature of our research question, we use the PNS-2013 data for our analysis.

  12. Figure 4 (Appendix) plots the absolute differences between the 1st and 2nd physical measurements of body weight and height. The graph shows that the mass of the absolute differences is concentrated at zero, and a few observations have absolute differences between the 1st and 2nd measurements that exceed 1.5 kg (for body weight) or 1.5 cm (for body height).

  13. The corresponding kernel density distributions for self-reported and measured body weight, height and BMI are presented in Figure 5 (Appendix). It seems that both self-reported and measured body height data have approximately normally shaped distributions, although right-skewed distributions are observed for the case of body weight and BMI. This is important as our model assumes normality for the factor distributions and identification of the components of the mixture of normals stems from non-normality in the (joint) distribution of observed outcomes.

  14. Existing studies in the economics of obesity literature that rely on self-reported anthropometrics often estimate corrective equations (or utilize the coefficients from existing equations) based on the relationship between measured and self-reported body weight and height data from alternative data sources (Cawley 2015). To mimic the correction procedures for self-reported anthropometrics used in existing studies, we estimate analogous “corrective” equations by regressing measured weight and height data on self-reports and a vector of demographics (results from these equations are available in the Appendix, Table 14). The predictions from these equations are used to calculate self-reports of body weight and height that are corrected for reporting error; these yield our “corrected self-reported BMI” measure, as presented in Tables 10 and 11.
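
The definitions in Notes 4, 5 and 9 can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the estimation procedure used in the article: the distribution of true weight, the noise levels, the value of ρ and the function names (rel1, rel2, mse_parts) are all hypothetical choices made for illustration, and the measured and self-reported values are themselves treated as the "predictor" of ξ in the mean square error.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical "true" body weight in kg (the latent xi of Note 9).
xi = rng.normal(75.0, 15.0, n)

# Measured weight r: assumed to equal xi plus small device noise,
# recorded to one decimal place as described in Note 4.
r = np.round(xi + rng.normal(0.0, 0.3, n), 1)

# Self-reported weight s: assumed mean-reverting error (rho < 0, Note 5),
# an average under-report of 1 kg, and integer recording (Note 4).
rho = -0.1
s = np.round(xi + rho * (xi - xi.mean()) + rng.normal(-1.0, 2.0, n))


def rel1(xi, x):
    """Rel1(x) = cov(xi, x) / var(x)."""
    return np.cov(xi, x)[0, 1] / np.var(x, ddof=1)


def rel2(xi, x):
    """Rel2(x) = cov(xi, x)^2 / [var(xi) * var(x)] (squared correlation)."""
    c = np.cov(xi, x)[0, 1]
    return c**2 / (np.var(xi, ddof=1) * np.var(x, ddof=1))


def mse_parts(xi, x):
    """Return (MSE, Bias^2, Variance) of x treated as a predictor of xi."""
    err = x - xi
    # err.var() uses ddof=0, so MSE = Bias^2 + Variance holds exactly.
    return np.mean(err**2), err.mean() ** 2, err.var()


for name, x in (("measured r", r), ("self-reported s", s)):
    mse, bias2, var = mse_parts(xi, x)
    print(f"{name}: Rel1={rel1(xi, x):.3f} Rel2={rel2(xi, x):.3f} "
          f"MSE={mse:.3f} Bias^2={bias2:.3f} Var={var:.3f}")
```

In this simulated setup, mean reversion compresses self-reports towards the mean, so Rel1(s) can exceed one even though Rel2(s), a squared correlation, stays below Rel2(r).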

References

  • Arntsen SH, Borch KB, Wilsgaard T, Njølstad I, Hansen AH (2023) Time trends in body height according to educational level: a descriptive study from the Tromsø Study 1979–2016. PLoS ONE 18(1):e0279965

  • Baum CL II, Ruhm CJ (2009) Age, socioeconomic status and obesity growth. J Health Econ 28(3):635–648

  • Baum CL (2007) The effects of race, ethnicity, and age on obesity. J Popul Econ 20:687–705

  • Bilger M, Kruger EJ, Finkelstein EA (2017) Measuring socioeconomic inequality in obesity: looking beyond the obesity threshold. Health Econ 26:1052–1066

  • Bowling A (2005) Mode of questionnaire administration can have serious effects on data quality. J Public Health 27(3):281–291

  • Cawley J (2015) An economy of scales: a selective review of obesity’s economic causes, consequences, and solutions. J Health Econ 43:244–268

  • Cawley J, Meyerhoefer C (2012) The medical care costs of obesity: an instrumental variables approach. J Health Econ 31(1):219–230

  • Cawley J (2004) The impact of obesity on wages. J Hum Resources 39(2):451–474

  • Cawley J, Maclean JC, Hammer M, Wintfeld N (2015) Reporting error in weight and its implications for bias in economic models. Econ Hum Biol 19:27–44

  • Damacena GN, Szwarcwald CL, Malta DC et al (2015) The development of the National Health survey in Brazil, 2013. Epidemiologia e Serviços De Saúde 24:197–206

  • Davillas A, Benzeval M (2016) Alternative measures to BMI: exploring income-related inequalities in adiposity in Great Britain. Soc Sci Med 166:223–232

  • Davillas A, Jones AM (2020) Regional inequalities in adiposity in England: distributional analysis of the contribution of individual-level characteristics and the small area obesogenic environment. Econ Hum Biol 38:100887

  • Davillas A, Jones AM (2021) The implications of self-reported body weight and height for measurement error in BMI. Econ Lett 209:110101

  • Davillas A, Pudney S (2017) Concordance of health states in couples: analysis of self-reported, nurse administered and blood-based biomarker data in the UK Understanding Society panel. J Health Econ 56:87–102

  • Davillas A, Pudney S (2020a) Biomarkers as precursors of disability. Econ Hum Biol 36:100814

  • Davillas A, Pudney S (2020b) Biomarkers, disability and health care demand. Econ Hum Biol 39:100929

  • Engstrom JL, Paterson SA, Doherty A et al (2003) Accuracy of self-reported height and weight in women: an integrative review of the literature. J Midwifery Womens Health 48(5):338–345

  • Finn A, Ranchhod V (2017) Genuine fakes: the prevalence and implications of data fabrication in a large South African survey. World Bank Econ Rev 31(1):129–157

  • Fryar CD, Carroll MD, Gu Q, Afful J, Ogden CL (2021) Anthropometric reference data for children and adults. U.S. Department of Health & Human Services, National Center for Health Statistics, United States

  • Gil J, Mora T (2011) The determinants of misreporting weight and height: the role of social norms. Econ Hum Biol 9:78–91

  • Gorber SC, Tremblay M, Moher D, Gorber B (2007) A comparison of direct vs. self-report measures for assessing height, weight and body mass index: a systematic review. Obes Rev 8(4):307–326

  • Groves RM (2005) Survey errors and survey costs. Wiley

  • Jenkins SP, Rios-Avila F (2020) Modelling errors in survey and administrative data on employment earnings: sensitivity to the fraction assumed to have error-free earnings. Econ Lett 192:109253

  • Jenkins SP, Rios-Avila F (2021) Measurement error in earnings data: replication of Meijer, Rohwedder, and Wansbeek’s mixture model approach to combining survey and register data. J Appl Economet 36(4):474–483

  • Jenkins SP, Rios-Avila F (2023a) Reconciling reports: modelling employment earnings and measurement errors using linked survey and administrative data. J R Stat Soc Ser A Stat Soc 186(1):110–136

  • Jenkins SP, Rios-Avila F (2023b) Finite mixture models for linked survey and administrative data: estimation and post-estimation. Stata J 23(1):53–85

  • Johnston DW, Propper C, Shields MA (2009) Comparing subjective and objective measures of health: evidence from hypertension for the income/health gradient. J Health Econ 28(3):540–552

  • Kapteyn A, Ypma JY (2007) Measurement error and misclassification: a comparison of survey and administrative data. J Labor Econ 25:513–551

  • Keith SW, Fontaine KR, Pajewski NM, Mehta T, Allison DB (2011) Use of self-reported height and weight biases the body mass index–mortality association. Int J Obes 35(3):401–408

  • Knäuper B, Carrière K, Chamandy M, Xu Z, Schwarz N, Rosen NO (2016) How aging affects self-reports. Eur J Ageing 13:185–193

  • Li J, Simon G, Castro MR, Kumar V, Steinbach MS, Caraballo PJ (2021) Association of BMI, comorbidities and all-cause mortality by using a baseline mortality risk model. PLoS ONE 16(7):e0253696

  • Lin X, Xu Y, Jl Xu et al (2020) Global burden of noncommunicable disease attributable to high body mass index in 195 countries and territories, 1990–2017. Endocrine 69(2):310–320

  • Ljungvall Å, Gerdtham UG, Lindblad U (2015) Misreporting and misclassification: implications for socioeconomic disparities in body-mass index and obesity. Eur J Health Econ 16:5–20

  • Meijer E, Rohwedder S, Wansbeek T (2012) Measurement error in earnings data: using a mixture model approach to combine survey and register data. J Bus Econ Stat 30:191–201

  • Ng M, Fleming T, Robinson M et al (2014) Global, regional, and national prevalence of overweight and obesity in children and adults during 1980–2013: a systematic analysis for the Global Burden of Disease Study 2013. The Lancet 384(9945):766–781

  • O’Neill D, Sweetman O (2013) The consequences of measurement error when estimating the impact of obesity on income. IZA J Labor Econ 2(1):1–20

  • Olbrich L, Kosyakova Y, Sakshaug JW (2022) The reliability of adult self-reported height: the role of interviewers. Econ Hum Biol. https://doi.org/10.1016/j.ehb.2022.101118

  • PNS (2013) Pesquisa Nacional de Saúde 2013 – Manual de Antropometria. Instituto Brasileiro de Geografia e Estatistica. Rio de Janeiro. Available at: https://biblioteca.ibge.gov.br/visualizacao/instrumentos_de_coleta/doc3426.pdf

  • Prospective Studies Collaboration, Whitlock G, Lewington S et al (2009) Body-mass index and cause-specific mortality in 900000 adults: collaborative analyses of 57 prospective studies. The Lancet 373(9669):1083–1096

  • Puhl RM, Heuer CA (2009) The stigma of obesity: a review and update. Obesity 17(5):941–964

  • Reis RCPD, Duncan BB, Malta DC et al (2022) Evolution of diabetes in Brazil: prevalence data from the 2013 and 2019 Brazilian National Health Survey. Cad Saude Publica 38:e00149321. https://doi.org/10.1590/0102-311X00149321

  • Rimes-Dias KA, Costa JC, Canella DS (2022) Obesity and health service utilization in Brazil: data from the National Health Survey. BMC Public Health 22(1):1474

  • Rooth DO (2009) Obesity, attractiveness, and differential treatment in hiring: a field experiment. J Hum Resources 44(3):710–735

  • Rtveladze K, Marsh T, Webber L et al (2013) Health and economic burden of obesity in Brazil. PLoS ONE 8(7):e68785. https://doi.org/10.1371/journal.pone.0068785

  • Sattler KM, Deane FP, Tapsell L, Kelly PJ (2018) Gender differences in the relationship of weight-based stigmatisation with motivation to exercise and physical activity in overweight individuals. Health Psychol Open 5(1):2055102918759691. https://doi.org/10.1177/2055102918759691

  • Sherry B, Jefferds ME, Grummer-Strawn LM (2007) Accuracy of adolescent self-report of height and weight in assessing overweight status: a literature review. Arch Pediatr Adolesc Med 161(12):1154–1161

  • Szwarcwald CL, Malta DC, Pereira CA et al (2014) Pesquisa Nacional de Saúde no Brasil: concepção e metodologia de aplicação. Cien Saude Colet 19(2):333–342

  • Triaca LM, Jacinto PA, França MTA, Tejada CAO (2020) Does greater unemployment make people thinner in Brazil? Health Econ 29:1279–1288

  • U.S.D.H.H.S. (2010) The Surgeon General’s Vision for a Healthy and Fit Nation. U.S. Department of Health and Human Services, Office of the Surgeon General, Rockville, MD

  • Zelenytė V, Valius L, Domeikienė A et al (2021) Body size perception, knowledge about obesity and factors associated with lifestyle change among patients, health care professionals and public health experts. BMC Fam Pract 22(1):1–13

  • Zhang Q, Wang Y (2004) Socioeconomic inequality of obesity in the United States: do gender, age, and ethnicity matter? Soc Sci Med 58(6):1171–1180

Author information

Corresponding author

Correspondence to Apostolos Davillas.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Figs. 4 and 5 and Tables 12, 13 and 14.

Fig. 4 Kernel densities for the absolute differences between the 1st and 2nd body weight and height physical measurement

Fig. 5 Kernel densities: body weight, height and BMI

Table 12 Estimation of factor mixture model for body weight and height (measured weight/height data are rounded to the nearest integer)

Table 13 Estimation of factor mixture model for body weight and height (measured data: average between 1st and 2nd measurement)

Table 14 “Corrective equations” for body weight and height

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Davillas, A., de Oliveira, V.H. & Jones, A.M. A model of errors in BMI based on self-reported and measured anthropometrics with evidence from Brazilian data. Empir Econ (2024). https://doi.org/10.1007/s00181-024-02616-w

  • DOI: https://doi.org/10.1007/s00181-024-02616-w
