A model of errors in BMI based on self-reported and measured anthropometrics with evidence from Brazilian data

Davillas, Apostolos; de Oliveira, Victor Hugo; Jones, Andrew M.

doi:10.1007/s00181-024-02616-w

Published: 20 May 2024

Empirical Economics (2024)

Apostolos Davillas ORCID: orcid.org/0000-0002-6607-274X¹,
Victor Hugo de Oliveira² &
Andrew M. Jones³

Abstract

The economics of obesity literature implicitly assumes that measured anthropometrics are error-free and they are often treated as a gold standard when compared to self-reported data. We use factor mixture models to analyse measurement error in both self-reported and measured anthropometrics with nationally representative data from the 2013 National Health Survey in Brazil. A small but statistically significant fraction of measured anthropometrics are attributed to recording errors, while, as they are imprecisely recorded and due to reporting behaviour, only between 10 and 23% of our self-reported anthropometrics are free from any measurement error. Post-estimation analysis allows us to calculate hybrid anthropometric predictions that best approximate the true body weight and height distribution. BMI distributions based on the hybrid measures do not differ between our factor mixture models, with and without covariates, and are generally close to those based on measured data, while BMI based on self-reported data under-estimates the true BMI distribution. “Corrected self-reported BMI” measures, based on common methods to mitigate reporting error in self-reports using predictions from corrective equations, do not seem to be a good alternative to our “hybrid” BMI measures. Analysis of regression models for the association between BMI and health care utilization shows only small differences, concentrated at the far-right tails of the BMI distribution, when they are based on our hybrid measure as opposed to measured BMI. However, more pronounced differences are observed, at the lower and higher tails of BMI, when these are compared to self-reported or “corrected self-reported” BMI.

Notes

These fabrication errors (if they exist) are unlikely to result in mean reversion/mean divergence but may be fairly random errors. Existing studies have shown evidence of misperception of body size (Zelenytė et al. 2021), suggesting that interviewers may not be able to accurately predict participants’ body weight/height (if not measured) and, thus, not be able to make guesses that may lead to mean reversion/mean divergence (i.e. guesswork that is strongly correlated with true body weight and height).
The factor mixture measurement error model proposed by Kapteyn and Ypma (2007) assumes that observed administrative income data are a mixture of correct matches and mismatches (with survey data). However, they argue that, over and above potential mismatches in the linkage between administrative and survey data, administrative and survey data may also capture conceptually different things. As such, they argue that there is no loss of generality in assuming that measurement error in administrative data may reflect different sources. Analogously, in our analysis measurement error in measured anthropometrics may reflect different sources (as described above), in particular interviewers’ errors in transcribing values from the measurement equipment to the survey materials, fabrication of the anthropometric measurements by the interviewer, or even physical measurement of the wrong household member.
Even in the case of fabricated interviews or when anthropometric measurement is not conducted for the intended respondent, this may be a strong assumption if quality control takes place. However, there is no such quality control undertaken in the dataset used in our analysis (as well as in many other multi-purpose social science datasets that collect anthropometrics).
Self-reported anthropometrics are collected as integer values (cm for height and kg for weight), while the corresponding measured values are recorded to one decimal place. In those cases where the respondent provided a non-integer value of their self-reported body weight and/or height (for example 61.5 kg), the interviewer recorded an integer value (such as 61 kg or 62 kg).
Mean reversion (ρ < 0) means that respondents with high (low) values of true anthropometric measures, relative to the true mean, tend to under-report (over-report) their body weight and height in self-reports; the opposite is the case for mean divergence (ρ > 0).
Moreover, one may argue that survey mode may influence measurement error in self-reported anthropometrics. For example, social desirability bias is much lower in the case of self-completion as opposed to the open interview (Bowling 2005); thus, assuming that being taller and not of excess weight is more socially desirable, shorter people and those with excess weight may have distinct reporting patterns across collection modes. However, existing studies do not confirm the presence of such influences in reporting errors. Davillas and Jones (2021) find that measurement errors in anthropometrics do not differ according to the mode of interview, with similar patterns observed when self-reported anthropometrics are collected using randomly assigned open interview and self-completion modes. Along similar lines, Cawley et al. (2015), who also discuss mean reversion in reporting error in weight, highlight that interviewers do not amend or correct the self-reported anthropometrics based on measured data in their datasets and, thus, no additional interviewer effects are expected.
Typically, failures of measurement equipment may also be relevant for measurement error in physical measurements of anthropometrics. However, we believe that the risk of equipment failure is less relevant in our dataset given the prevention mechanisms/protocols we describe above.
The user-written Stata command “ky_fit” predicts the seven “hybrid” measures proposed by Meijer et al. (2012). Table 6 in Jenkins and Rios-Avila (2023b) provides the descriptions of the predictors (“hybrid” outcomes), with the corresponding derivation of the formulae presented in their appendix.
The mean square error is computed as \(E\left( {{\text{predictor}} - \xi } \right)^{2} = {\text{Bias}}^{2} + {\text{Variance}}\). Reliability measures are computed as follows: \({\text{Rel}}1\left( r \right) = {\text{cov}} \left( {\xi ,r} \right)/{\text{var}} \left( r \right)\), \({\text{Rel}}1\left( s \right) = {\text{cov}} \left( {\xi ,s} \right)/{\text{var}} \left( s \right)\), \({\text{Rel}}2\left( r \right) = {\text{cov}} \left( {\xi ,r} \right)^{2} /\left[ {{\text{var}} \left( \xi \right) \cdot {\text{var}} \left( r \right)} \right]\) and \({\text{Rel}}2\left( s \right) = {\text{cov}} \left( {\xi ,s} \right)^{2} /\left[ {{\text{var}} \left( \xi \right) \cdot {\text{var}} \left( s \right)} \right]\). Further details can be found in Jenkins and Rios-Avila (2023a).
The 2013 National Health Survey of Brazil is publicly available online: https://www.ibge.gov.br/estatisticas/sociais/saude/9160-pesquisa-nacional-de-saude.html?=&t=microdados.
In PNS-2019, which collected data in 2019, body weight and height were measured for a much smaller sub-sample of respondents, due to the difficulties in physical anthropometric measurements for the full survey sample selected for individual interviews (Reis et al. 2022). On the other hand, in PNS-2013, the anthropometric measurements were carried out on all residents selected for the individual interview, except pregnant women (Damacena et al. 2015). Collection of both self-reported and measured anthropometrics in the same wave is necessary for our research question and the estimation requirements of our factor mixture models. Given that measured anthropometrics are only available for a small fraction of the total survey sample in PNS-2019, and because time sensitivity is not a constraint for the scope and nature of our research question, we have used the PNS-2013 data for our analysis.
Figure 4 (Appendix) plots the absolute differences between the 1st and 2nd physical measurements of body weight and height. The graph shows that the mass of the absolute differences is concentrated at zero, and there are only a few observations with absolute differences between the 1st and 2nd measurements that exceed 1.5 kg (for body weight) or 1.5 cm (for body height).
The corresponding kernel density distributions for self-reported and measured body weight, height and BMI are presented in Figure 5 (Appendix). It seems that both self-reported and measured body height data have approximately normally shaped distributions, although right-skewed distributions are observed for the case of body weight and BMI. This is important as our model assumes normality for the factor distributions and identification of the components of the mixture of normals stems from non-normality in the (joint) distribution of observed outcomes.
Existing studies in the economics of obesity literature that rely on self-reported anthropometrics often estimate corrective equations (or utilize the coefficients from existing equations) based on the relationship between measured and self-reported body weight and height data from alternative data sources (Cawley 2015). To mimic correction procedures for self-reported anthropometrics in the existing studies, we estimate analogous “corrective” equations by regressing measured weight and height data on self-reports and a vector of demographics (results from these equations are available in Appendix, Table 14). The predictions from these equations are used to calculate self-reports of body weight and height that are corrected for reporting error; these corrected values yield our “corrected self-reported BMI” measure, as presented in Tables 10 and 11.
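The corrective-equation procedure described in the last note can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the demographics are reduced to age and a female indicator, the equations are fitted in-sample by ordinary least squares, and the function names are hypothetical. It is not the specification reported in Table 14.

```python
import numpy as np

def fit_ols(y, X):
    """OLS coefficients (intercept first) of y on the columns of X."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def predict(beta, X):
    """Fitted values from coefficients returned by fit_ols."""
    return np.column_stack([np.ones(X.shape[0]), X]) @ beta

def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2."""
    return weight_kg / (height_cm / 100.0) ** 2

# Synthetic sample (all parameter values are hypothetical).
rng = np.random.default_rng(1)
n = 5_000
age = rng.uniform(18, 80, n)
female = rng.integers(0, 2, n).astype(float)
meas_w = rng.normal(72, 13, n) - 8 * female    # measured weight, kg
meas_h = rng.normal(172, 7, n) - 12 * female   # measured height, cm

# Self-reports: under-reported weight, over-reported height, integer-valued.
self_w = np.round(meas_w - 0.05 * (meas_w - 70) - 0.5 + rng.normal(0, 2, n))
self_h = np.round(meas_h + 1.0 + rng.normal(0, 1.5, n))

# Corrective equations: measured value on self-report plus demographics.
Xw = np.column_stack([self_w, age, female])
Xh = np.column_stack([self_h, age, female])
corr_w = predict(fit_ols(meas_w, Xw), Xw)
corr_h = predict(fit_ols(meas_h, Xh), Xh)

bmi_measured = bmi(meas_w, meas_h)
bmi_self = bmi(self_w, self_h)
bmi_corrected = bmi(corr_w, corr_h)

print("mean BMI, measured      :", round(bmi_measured.mean(), 2))
print("mean BMI, self-reported :", round(bmi_self.mean(), 2))
print("mean BMI, corrected     :", round(bmi_corrected.mean(), 2))
```

In this toy setting the corrected weight and height match the measured means by construction, because each equation includes an intercept and is fitted on the same sample. That says nothing about how well the corrected measure tracks the tails of the BMI distribution.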
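The notes on mean reversion (ρ < 0) and on the mean square error and reliability measures can likewise be made concrete on simulated data, where the true value ξ is known. The sketch below is an assumption-laden illustration: the error structure (a self-report s = ξ + ρ(ξ − μ) + noise and a measured value r that is correct with high probability and contaminated otherwise) is a simplified stand-in rather than the estimated model, and all parameter values are hypothetical.

```python
import numpy as np

def reliability(xi, x):
    """Return (Rel1, Rel2) of an observed measure x for the true value xi.

    Rel1 = cov(xi, x) / var(x)
    Rel2 = cov(xi, x)^2 / (var(xi) * var(x))
    """
    cov = np.cov(xi, x, ddof=1)[0, 1]
    rel1 = cov / np.var(x, ddof=1)
    rel2 = cov ** 2 / (np.var(xi, ddof=1) * np.var(x, ddof=1))
    return rel1, rel2

def mse_parts(xi, predictor):
    """Return (MSE, bias, variance) with MSE = bias^2 + variance."""
    err = predictor - xi
    bias = err.mean()
    variance = err.var()       # population variance, so the identity is exact
    mse = np.mean(err ** 2)
    return mse, bias, variance

rng = np.random.default_rng(0)
n = 50_000
mu = 70.0
xi = rng.normal(mu, 12.0, n)   # hypothetical true body weight, kg

# Measured value: error-free with probability 0.98, otherwise contaminated.
is_correct = rng.random(n) < 0.98
r = np.where(is_correct, xi, xi + rng.normal(0.0, 8.0, n))

# Self-report with mean reversion (rho < 0) and additive reporting noise.
rho = -0.1
s = xi + rho * (xi - mu) + rng.normal(-0.5, 2.0, n)

# Slope of the reporting error on the true value recovers rho.
slope = np.polyfit(xi, s - xi, 1)[0]

rel1_r, rel2_r = reliability(xi, r)
rel1_s, rel2_s = reliability(xi, s)
mse_s, bias_s, var_s = mse_parts(xi, s)

print("slope of (s - xi) on xi:", round(slope, 3))
print("Rel1(r), Rel2(r):", round(rel1_r, 3), round(rel2_r, 3))
print("Rel1(s), Rel2(s):", round(rel1_s, 3), round(rel2_s, 3))
print("MSE(s), bias^2 + variance:", round(mse_s, 3), round(bias_s ** 2 + var_s, 3))
```

With ρ < 0, respondents above the true mean under-report and those below it over-report, which shows up as a negative slope of the reporting error on ξ. Note that Rel1 can exceed one under mean reversion, whereas Rel2 is a squared correlation and is bounded by one.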

References

Arntsen SH, Borch KB, Wilsgaard T, Njølstad I, Hansen AH (2023) Time trends in body height according to educational level: a descriptive study from the Tromsø Study 1979–2016. PLoS ONE 18(1):e0279965
Baum CL II, Ruhm CJ (2009) Age, socioeconomic status and obesity growth. J Health Econ 28(3):635–648
Baum CL (2007) The effects of race, ethnicity, and age on obesity. J Popul Econ 20:687–705
Bilger M, Kruger EJ, Finkelstein EA (2017) Measuring socioeconomic inequality in obesity: looking beyond the obesity threshold. Health Econ 26:1052–1066
Bowling A (2005) Mode of questionnaire administration can have serious effects on data quality. J Public Health 27(3):281–291
Cawley J (2015) An economy of scales: a selective review of obesity’s economic causes, consequences, and solutions. J Health Econ 43:244–268
Cawley J, Meyerhoefer C (2012) The medical care costs of obesity: an instrumental variables approach. J Health Econ 31(1):219–230
Cawley J (2004) The impact of obesity on wages. J Hum Resources 39(2):451–474
Cawley J, Maclean JC, Hammer M, Wintfeld N (2015) Reporting error in weight and its implications for bias in economic models. Econ Hum Biol 19:27–44
Damacena GN, Szwarcwald CL, Malta DC et al (2015) The development of the National Health Survey in Brazil, 2013. Epidemiologia e Serviços de Saúde 24:197–206
Davillas A, Benzeval M (2016) Alternative measures to BMI: exploring income-related inequalities in adiposity in Great Britain. Soc Sci Med 166:223–232
Davillas A, Jones AM (2020) Regional inequalities in adiposity in England: distributional analysis of the contribution of individual-level characteristics and the small area obesogenic environment. Econ Hum Biol 38:100887
Davillas A, Jones AM (2021) The implications of self-reported body weight and height for measurement error in BMI. Econ Lett 209:110101
Davillas A, Pudney S (2017) Concordance of health states in couples: analysis of self-reported, nurse administered and blood-based biomarker data in the UK Understanding Society panel. J Health Econ 56:87–102
Davillas A, Pudney S (2020a) Biomarkers as precursors of disability. Econ Hum Biol 36:100814
Davillas A, Pudney S (2020b) Biomarkers, disability and health care demand. Econ Hum Biol 39:100929
Engstrom JL, Paterson SA, Doherty A et al (2003) Accuracy of self-reported height and weight in women: an integrative review of the literature. J Midwifery Womens Health 48(5):338–345
Finn A, Ranchhod V (2017) Genuine fakes: the prevalence and implications of data fabrication in a large South African survey. World Bank Econ Rev 31(1):129–157
Fryar CD, Carroll MD, Gu Q, Afful J, Ogden CL (2021) Anthropometric reference data for children and adults. U.S. Department of Health & Human Services, National Center for Health Statistics, United States
Gil J, Mora T (2011) The determinants of misreporting weight and height: the role of social norms. Econ Hum Biol 9:78–91
Gorber SC, Tremblay M, Moher D, Gorber B (2007) A comparison of direct vs. self-report measures for assessing height, weight and body mass index: a systematic review. Obes Rev 8(4):307–326
Groves RM (2005) Survey errors and survey costs. Wiley
Jenkins SP, Rios-Avila F (2020) Modelling errors in survey and administrative data on employment earnings: sensitivity to the fraction assumed to have error-free earnings. Econ Lett 192:109253
Jenkins SP, Rios-Avila F (2021) Measurement error in earnings data: replication of Meijer, Rohwedder, and Wansbeek’s mixture model approach to combining survey and register data. J Appl Economet 36(4):474–483
Jenkins SP, Rios-Avila F (2023a) Reconciling reports: modelling employment earnings and measurement errors using linked survey and administrative data. J R Stat Soc Ser A Stat Soc 186(1):110–136
Jenkins SP, Rios-Avila F (2023b) Finite mixture models for linked survey and administrative data: estimation and post-estimation. Stata J 23(1):53–85
Johnston DW, Propper C, Shields MA (2009) Comparing subjective and objective measures of health: evidence from hypertension for the income/health gradient. J Health Econ 28(3):540–552
Kapteyn A, Ypma JY (2007) Measurement error and misclassification: a comparison of survey and administrative data. J Labor Econ 25:513–551
Keith SW, Fontaine KR, Pajewski NM, Mehta T, Allison DB (2011) Use of self-reported height and weight biases the body mass index–mortality association. Int J Obes 35(3):401–408
Knäuper B, Carrière K, Chamandy M, Xu Z, Schwarz N, Rosen NO (2016) How aging affects self-reports. Eur J Ageing 13:185–193
Li J, Simon G, Castro MR, Kumar V, Steinbach MS, Caraballo PJ (2021) Association of BMI, comorbidities and all-cause mortality by using a baseline mortality risk model. PLoS ONE 16(7):e0253696
Lin X, Xu Y, Xu JL et al (2020) Global burden of noncommunicable disease attributable to high body mass index in 195 countries and territories, 1990–2017. Endocrine 69(2):310–320
Ljungvall Å, Gerdtham UG, Lindblad U (2015) Misreporting and misclassification: implications for socioeconomic disparities in body-mass index and obesity. Eur J Health Econ 16:5–20
Meijer E, Rohwedder S, Wansbeek T (2012) Measurement error in earnings data: using a mixture model approach to combine survey and register data. J Bus Econ Stat 30:191–201
Ng M, Fleming T, Robinson M et al (2014) Global, regional, and national prevalence of overweight and obesity in children and adults during 1980–2013: a systematic analysis for the Global Burden of Disease Study 2013. The Lancet 384(9945):766–781
O’Neill D, Sweetman O (2013) The consequences of measurement error when estimating the impact of obesity on income. IZA J Labor Econ 2(1):1–20
Olbrich L, Kosyakova Y, Sakshaug JW (2022) The reliability of adult self-reported height: the role of interviewers. Econ Hum Biol. https://doi.org/10.1016/j.ehb.2022.101118
PNS (2013) Pesquisa Nacional de Saúde 2013 – Manual de Antropometria. Instituto Brasileiro de Geografia e Estatística. Rio de Janeiro. Available at: https://biblioteca.ibge.gov.br/visualizacao/instrumentos_de_coleta/doc3426.pdf
Prospective Studies Collaboration, Whitlock G, Lewington S et al (2009) Body-mass index and cause-specific mortality in 900 000 adults: collaborative analyses of 57 prospective studies. The Lancet 373(9669):1083–1096
Puhl RM, Heuer CA (2009) The stigma of obesity: a review and update. Obesity 17(5):941–964
Reis RCPD, Duncan BB, Malta DC et al (2022) Evolution of diabetes in Brazil: prevalence data from the 2013 and 2019 Brazilian National Health Survey. Cad Saude Publica 38:e00149321. https://doi.org/10.1590/0102-311X00149321
Rimes-Dias KA, Costa JC, Canella DS (2022) Obesity and health service utilization in Brazil: data from the National Health Survey. BMC Public Health 22(1):1474
Rooth DO (2009) Obesity, attractiveness, and differential treatment in hiring: a field experiment. J Hum Resources 44(3):710–735
Rtveladze K, Marsh T, Webber L et al (2013) Health and economic burden of obesity in Brazil. PLoS ONE 8(7):e68785. https://doi.org/10.1371/journal.pone.0068785
Sattler KM, Deane FP, Tapsell L, Kelly PJ (2018) Gender differences in the relationship of weight-based stigmatisation with motivation to exercise and physical activity in overweight individuals. Health Psychol Open 5(1):2055102918759691. https://doi.org/10.1177/2055102918759691
Sherry B, Jefferds ME, Grummer-Strawn LM (2007) Accuracy of adolescent self-report of height and weight in assessing overweight status: a literature review. Arch Pediatr Adolesc Med 161(12):1154–1161
Szwarcwald CL, Malta DC, Pereira CA et al (2014) Pesquisa Nacional de Saúde no Brasil: concepção e metodologia de aplicação. Cien Saude Colet 19(2):333–342
Triaca LM, Jacinto PA, França MTA, Tejada CAO (2020) Does greater unemployment make people thinner in Brazil? Health Econ 29:1279–1288
U.S.D.H.H.S. (2010) The Surgeon General’s Vision for a Healthy and Fit Nation. U.S. Department of Health and Human Services, Office of the Surgeon General, Rockville, MD
Zelenytė V, Valius L, Domeikienė A et al (2021) Body size perception, knowledge about obesity and factors associated with lifestyle change among patients, health care professionals and public health experts. BMC Fam Pract 22(1):1–13
Zhang Q, Wang Y (2004) Socioeconomic inequality of obesity in the United States: do gender, age, and ethnicity matter? Soc Sci Med 58(6):1171–1180

Author information

Authors and Affiliations

University of Macedonia, 156 Egnatia Street, 54636, Thessaloníki, Greece
Apostolos Davillas
Instituto de Pesquisa e Estratégia Econômica do Ceará, Fortaleza, Brazil
Victor Hugo de Oliveira
Department of Economics and Related Studies, University of York, York, UK
Andrew M. Jones

Corresponding author

Correspondence to Apostolos Davillas.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Figs. 4 and 5 and Tables 12, 13 and 14.

Table 12 Estimation of factor mixture model for body weight and height (measured weight/height data rounded to the nearest integer)

Table 13 Estimation of factor mixture model for body weight and height (measured data: average of the 1st and 2nd measurements)

Table 14 “Corrective equations” for body weight and height

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Davillas, A., de Oliveira, V.H. & Jones, A.M. A model of errors in BMI based on self-reported and measured anthropometrics with evidence from Brazilian data. Empir Econ (2024). https://doi.org/10.1007/s00181-024-02616-w

Received: 01 June 2023
Accepted: 07 May 2024
Published: 20 May 2024
DOI: https://doi.org/10.1007/s00181-024-02616-w
