Statistical Analysis—Measurement Error

Brakenhoff, Timo B.; van Smeden, Maarten; Oberski, Daniel L.

doi:10.1007/978-3-031-36678-9_6

Timo B. Brakenhoff⁵,
Maarten van Smeden⁶ &
Daniel L. Oberski^6,7

348 Accesses

Abstract

An important aspect of data quality when conducting clinical analyses using real-world data is how variables in the data have been recorded or measured. The discrepancy between an observed value and the true value is called measurement error (also known as noise in the artificial intelligence and machine learning literature) and can have consequences for your analyses in all kinds of contexts. To properly assess the potential impact of measurement error it is essential to understand the relationship between the true and observed variables as well as the goal of the analysis and how it will be implemented in practice. Commonly, measurement error is distinguished as being classical, Berkson, systematic and/or differential. While it is clear that measurement error can have far-reaching consequences on analyses, the effect can differ depending on whether analyses are descriptive, explanatory or predictive. Validation studies can inform the estimation and characterization of measurement error as well as provide crucial information for correction methods that are available in several statistical programming languages such as SAS, R and Python.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

eBook: USD 16.99; Price excludes VAT (USA)

Hardcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Algan G, Ulusoy I. Label noise types and their effects on deep learning. 2020. ArXiv: https://arxiv.org/abs/2003.10471
Bauldry S, Bollen KA, Adair LS. Evaluating measurement error in readings of blood pressure for adolescents and young adults. Blood Press. 2015;24:96–102. https://doi.org/10.3109/08037051.2014.986952.
Article Google Scholar
Boeschoten L, Oberski D, De Waal T. Estimating classification errors under edit restrictions in composite survey-register data using multiple imputation latent class modelling (MILC). J Off Stat. 2017;33:921–62. https://doi.org/10.1515/jos-2017-0044.
Article Google Scholar
Boeschoten L, van Kesteren E-J, Bagheri A, Oberski DL. Achieving fair inference using error-prone outcomes. Int J Interact Multimed Artif Intell. 2021;6:9. https://doi.org/10.9781/ijimai.2021.02.007.
Article Google Scholar
Boudreau DM, Daling JR, Malone KE, et al. A validation study of patient interview data and pharmacy records for antihypertensive, statin, and antidepressant medication use among older women. Am J Epidemiol. 2004;159:308–17. https://doi.org/10.1093/aje/kwh038.
Article Google Scholar
Brakenhoff TB, Mitroiu M, Keogh RH, et al. Measurement error is often neglected in medical literature: a systematic review. J Clin Epidemiol. 2018;98:89–97. https://doi.org/10.1016/j.jclinepi.2018.02.023.
Article Google Scholar
Brakenhoff TB, van Smeden M, Visseren FLJ, Groenwold RHH. Random measurement error: why worry? An example of cardiovascular risk factors. PLoS ONE. 2018;13: e0192298. https://doi.org/10.1371/journal.pone.0192298.
Article Google Scholar
Buonaccorsi JP. Measurement error: models, methods, and applications. New York: Chapman and Hall/CRC; 2010.
Book MATH Google Scholar
Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement error in nonlinear models: a modern perspective. 2nd ed. New York: Chapman and Hall/CRC; 2006.
Book MATH Google Scholar
Carroll RJ, Spiegelman CH, Lan KKG, et al. On errors-in-variables for binary regression models. Biometrika. 1984;71:19–25. https://doi.org/10.1093/biomet/71.1.19.
Article MathSciNet MATH Google Scholar
Carroll RJ, Stefanski LA. Approximate quasi-likelihood estimation in models with surrogate predictors. J Am Stat Assoc. 1990;85:652–63. https://doi.org/10.1080/01621459.1990.10474925.
Article MathSciNet Google Scholar
Ching T, Himmelstein DS, Beaulieu-Jones BK, et al. Opportunities and obstacles for deep learning in biology and medicine. J R Soc Interface. 2018;15:20170387. https://doi.org/10.1098/rsif.2017.0387.
Article Google Scholar
Cole SR, Chu H, Greenland S. Multiple-imputation for measurement-error correction. Int J Epidemiol. 2006;35:1074–81. https://doi.org/10.1093/ije/dyl097.
Article Google Scholar
Cook JR, Stefanski LA. Simulation-extrapolation estimation in parametric measurement error models. J Am Stat Assoc. 1994;89:1314–28. https://doi.org/10.1080/01621459.1994.10476871.
Article MATH Google Scholar
Delate T, Jones AE, Clark NP, Witt DM. Assessment of the coding accuracy of warfarin-related bleeding events. Thromb Res. 2017;159:86–90. https://doi.org/10.1016/j.thromres.2017.10.004.
Article Google Scholar
Ferrari P, Friedenreich C, Matthews CE. The role of measurement error in estimating levels of physical activity. Am J Epidemiol. 2007;166:832–40. https://doi.org/10.1093/aje/kwm148.
Article Google Scholar
Freedman LS, Commins JM, Willett W, et al. Evaluation of the 24-hour recall as a reference instrument for calibrating other self-report instruments in nutritional cohort studies: evidence from the validation studies pooling project. Am J Epidemiol. 2017;186:73–82. https://doi.org/10.1093/aje/kwx039.
Article Google Scholar
Freedman LS, Schatzkin A, Midthune D, Kipnis V. Dealing with dietary measurement error in nutritional cohort studies. JNCI J Natl Cancer Inst. 2011;103:1086–92. https://doi.org/10.1093/jnci/djr189.
Article Google Scholar
Frenay B, Verleysen M. Classification in the presence of label noise: a survey. IEEE Trans Neural Netw Learn Syst. 2014;25:845–69. https://doi.org/10.1109/TNNLS.2013.2292894.
Article MATH Google Scholar
Fuller WA. Measurement error models. New York: John Wiley & Sons; 1987.
Book MATH Google Scholar
Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med. 2018;178:1544. https://doi.org/10.1001/jamainternmed.2018.3763.
Article Google Scholar
Goldman GT, Mulholland JA, Russell AG, et al. Impact of exposure measurement error in air pollution epidemiology: effect of error type in time-series studies. Environ Health. 2011;10:61. https://doi.org/10.1186/1476-069X-10-61.
Article Google Scholar
Gravel CA, Platt RW. Weighted estimation for confounded binary outcomes subject to misclassification. Stat Med. 2018;37:425–36. https://doi.org/10.1002/sim.7522.
Article MathSciNet Google Scholar
Guolo A. Robust techniques for measurement error correction: a review. Stat Methods Med Res. 2008;17:555–80. https://doi.org/10.1177/0962280207081318.
Article MathSciNet Google Scholar
Gupta S, Gupta A. Dealing with noise problem in machine learning data-sets: a systematic review. Procedia Comput Sci. 2019;161:466–74. https://doi.org/10.1016/j.procs.2019.11.146.
Article Google Scholar
Gustafson P. Measurement error and misclassification in statistics and epidemiology: impacts and bayesian adjustments. CRC Press (2003)
Google Scholar
Gyorkos TW, Frappier-Davignon L, Dick Maclean J, Viens P. Effect of screening and treatment on imported intestinal parasite infections: results from a randomized, Controlled Trial. Am J Epidemiol. 1989;129:753–61. https://doi.org/10.1093/oxfordjournals.aje.a115190
Gyorkos TW, Genta RM, Viens P, Maclean JD. Seroepidemiology of Strongyloides infection in the Southeast Asian refugee population in. Canada. Am. J. Epidemiol. 1990;257–64
Google Scholar
Hardin JW, Schmiediche H, Carroll RJ. The regression-calibration method for fitting generalized linear models with additive measurement error. Stata J Promot Commun Stat Stata. 2003;3:361–72. https://doi.org/10.1177/1536867X0400300406.
Article Google Scholar
Hardin JW, Schmiediche H, Carroll RJ. The simulation extrapolation method for fitting generalized linear models with additive measurement error. Stata J Promot Commun Stat Stata. 2003;3:373–85. https://doi.org/10.1177/1536867X0400300407.
Article Google Scholar
He W, Xiong J, Yi GY, SIMEX R package for accelerated failure time models with covariate measurement error. J Stat Softw. 2012;46:1–14. https://doi.org/10.18637/jss.v046.c01
Hui SL, Walter SD. Estimating the error rates of diagnostic tests. Biometrics. 1980;36:167–71. https://doi.org/10.2307/2530508.
Article MATH Google Scholar
Jiang T, Gradus JL, Lash TL, Fox MP. Addressing measurement error in random forests using quantitative bias analysis. Am J Epidemiol. 2021. https://doi.org/10.1093/aje/kwab010.
Article Google Scholar
Joseph L, Gyorkos TW, Coupal L. Bayesian estimation of disease prevalence and the parameters of diagnostic tests in the absence of a gold standard. Am J Epidemiol. 1995;141:263–72. https://doi.org/10.1093/oxfordjournals.aje.a117428.
Article Google Scholar
Karimi D, Dou H, Warfield SK, Gholipour A. Deep learning with noisy labels: exploring techniques and remedies in medical image analysis. Med Image Anal. 2020;65: 101759. https://doi.org/10.1016/j.media.2020.101759.
Article Google Scholar
Keogh RH, Shaw PA, Gustafson P, et al. STRATOS guidance document on measurement error and misclassification of variables in observational epidemiology: Part 1—Basic theory and simple methods of adjustment. Stat Med. 2020;39:2197–231. https://doi.org/10.1002/sim.8532.
Article MathSciNet Google Scholar
https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.8531
Article MathSciNet MATH Google Scholar
Lash TL, Fox MP, MacLehose RF, et al. Good practices for quantitative bias analysis. Int J Epidemiol. 2014;43:1969–85. https://doi.org/10.1093/ije/dyu149.
Article Google Scholar
Lederer W, Küchenhoff H. A short introduction to the SIMEX and MCSIMEX. Newsl R Proj. 2006;6(4):26–31.
Google Scholar
Liao X, Zucker DM, Li Y, Spiegelman D. Survival analysis with error-prone time-varying covariates: a risk set calibration approach. Biometrics. 2011;67:50–8. https://doi.org/10.1111/j.1541-0420.2010.01423.x.
Article MathSciNet MATH Google Scholar
Lim S, Wyker B, Bartley K, Eisenhower D. Measurement error of self-reported physical activity levels in New York City: assessment and correction. Am J Epidemiol. 2015;181:648–55. https://doi.org/10.1093/aje/kwu470.
Article Google Scholar
Luijken K, Groenwold RHH, Calster BV, et al. Impact of predictor measurement heterogeneity across settings on the performance of prediction models: a measurement error perspective. Stat Med. 2019;38:3444–59. https://doi.org/10.1002/sim.8183.
Article MathSciNet Google Scholar
Luijken K, Wynants L, van Smeden M, et al. Changing predictor measurement procedures affected the performance of prediction models in clinical examples. J Clin Epidemiol. 2020;119:7–18. https://doi.org/10.1016/j.jclinepi.2019.11.001.
Article Google Scholar
McCaffrey DF, Griffin BA, Almirall D, et al. A tutorial on propensity score estimation for multiple treatments using generalized boosted models. Stat Med. 2013;32:3388–414. https://doi.org/10.1002/sim.5753.
Article MathSciNet Google Scholar
Murray RP, Connett JE, Lauger GG, Voelker HT. Error in smoking measures: effects of intervention on relations of cotinine and carbon monoxide to self-reported smoking. The Lung Health Study Research Group. Am J Public Health. 1993;83:1251–7. https://doi.org/10.2105/AJPH.83.9.1251.
Article Google Scholar
Nab L, Groenwold RHH., Welsing PMJ, van Smeden M. Measurement error in continuous endpoints in randomised trials: problems and solutions. Stat Med. 2019;38:5182–96. https://doi.org/10.1002/sim.8359.
Nab L, van Smeden M, de Mutsert R, et al. Sampling strategies for internal validation samples for exposure measurement error correction: a study of visceral adipose tissue measures replaced by waist circumference measures. Am J Epidemiol Kwab. 2021a;114. https://doi.org/10.1093/aje/kwab114
Nab L, van Smeden M, Keogh RH, Groenwold RHH. mecor: An R package for measurement error correction in linear regression models with a continuous outcome. Comput Methods Programs Biomed. 2021b;208:
Google Scholar
Nicholson B, Sheng VS, Zhang J. Label noise correction and application in crowdsourcing. Expert Syst Appl. 2016;66:149–62. https://doi.org/10.1016/j.eswa.2016.09.003.
Article Google Scholar
Nigam N, Dutta T, Gupta HP. Impact of noisy labels in learning techniques: a survey. In: Kolhe ML, Tiwari S, Trivedi MC, Mishra KK, editors. Advances in Data and Information Sciences. Singapore: Springer; 2020. p. 403–11.
Chapter Google Scholar
Nir G, Hor S, Karimi D, et al. Automatic grading of prostate cancer in digitized histopathology images: learning from multiple experts. Med Image Anal. 2018;50:167–80. https://doi.org/10.1016/j.media.2018.09.005.
Article Google Scholar
Nissen F, Morales DR, Mullerova H, et al. Validation of asthma recording in the clinical practice research datalink (CPRD). BMJ Open. 2017;7: e017474. https://doi.org/10.1136/bmjopen-2017-017474.
Article Google Scholar
Nitzan M, Slotki I, Shavit L. More accurate systolic blood pressure measurement is required for improved hypertension management: a perspective. Med Devices Auckl NZ. 2017;10:157–63. https://doi.org/10.2147/MDER.S141599.
Article Google Scholar
Pajouheshnia R, van Smeden M, Peelen LM, Groenwold RHH. How variation in predictor measurement affects the discriminative ability and transportability of a prediction model. J Clin Epidemiol. 2019;105:136–41. https://doi.org/10.1016/j.jclinepi.2018.09.001.
Article Google Scholar
Pot M, Kieusseyan N, Prainsack B. Not all biases are bad: equitable and inequitable biases in machine learning and radiology. Insights Imag. 2021;12:13. https://doi.org/10.1186/s13244-020-00955-7.
Article Google Scholar
Ratner A, Bach SH, Ehrenberg H, et al. Snorkel: rapid training data creation with weak supervision. Proc VLDB Endow Int Conf Very Large Data Bases 2017;11:269–282. https://doi.org/10.14778/3157794.3157797
Ravì D, Wong C, Deligianni F, et al. Deep learning for health informatics. IEEE J Biomed Health Inform. 2017;21:4–21. https://doi.org/10.1109/JBHI.2016.2636665.
Article Google Scholar
Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70:41–55.
Article MathSciNet MATH Google Scholar
Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am J Epidemiol. 1990;132:734–45. https://doi.org/10.1093/oxfordjournals.aje.a115715.
Article Google Scholar
Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and confidence intervals for random within-person measurement error. Am J Epidemiol. 1992;136:1400–13. https://doi.org/10.1093/oxfordjournals.aje.a116453.
Article Google Scholar
Rosseel, Y. lavaan: an R package for structural equation modeling. J Stat Softw. 2012;48:1–36. https://doi.org/10.18637/jss.v048.i02.
Rothman KJ, Greenland S, Lash TL. Modern epidemiology. Wolters Kluwer Health/Lippincott Williams & Wilkins Philadelphia; 2008.
Google Scholar
Sánchez BN, Budtz-Jørgensen E, Ryan LM, Hu H. Structural equation models. J Am Stat Assoc. 2005;100:1443–55. https://doi.org/10.1198/016214505000001005.
Article MATH Google Scholar
Schnack, H. Bias, noise, and interpretability in machine learning. In: Machine Learning. Elsevier; 2020. p. 307–28
Google Scholar
Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005;58:323–37. https://doi.org/10.1016/j.jclinepi.2004.10.012.
Article Google Scholar
Shanthini A, Vinodhini G, Chandrasekaran RM, Supraja P. A taxonomy on impact of label noise and feature noise using machine learning techniques. Soft Comput. 2019;23:8597–607. https://doi.org/10.1007/s00500-019-03968-7.
Article Google Scholar
Shaw PA, Deffner V, Keogh RH, et al. Epidemiologic analyses with error-prone exposures: review of current practice and recommendations. Ann Epidemiol. 2018;28:821–8. https://doi.org/10.1016/j.annepidem.2018.09.001.
Article Google Scholar
Shaw PA, Gustafson P, Carroll RJ, et al. STRATOS guidance document on measurement error and misclassification of variables in observational epidemiology: Part 2—More complex methods of adjustment and advanced topics. Stat Med. 2020;39:2232–63. https://doi.org/10.1002/sim.8531.
Article MathSciNet Google Scholar
Sheppard L, Burnett RT, Szpiro AA, et al. Confounding and exposure measurement error in air pollution epidemiology. Air Qual Atmosphere Health. 2012;5:203–16. https://doi.org/10.1007/s11869-011-0140-9.
Article Google Scholar
Shmueli G. To Explain or to Predict? Stat Sci. 2010;25.https://doi.org/10.1214/10-STS330.
Smedt TD, Merrall E, Macina D, et al. Bias due to differential and non-differential disease- and exposure misclassification in studies of vaccine effectiveness. PLoS ONE. 2018;13: e0199180. https://doi.org/10.1371/journal.pone.0199180.
Article Google Scholar
Stefanski LA. Unbiased estimation of a nonlinear function a normal mean with application to measurement err oorf models. Commun Stat - Theory Methods. 1989;18:4335–58. https://doi.org/10.1080/03610928908830159.
Article MATH Google Scholar
Thiébaut ACM, Freedman LS, Carroll RJ, Kipnis V. Is It necessary to correct for measurement error in nutritional epidemiology? Ann Intern Med. 2007;146:65. https://doi.org/10.7326/0003-4819-146-1-200701020-00012.
Article Google Scholar
van Smeden M, Lash TL, Groenwold RHH. Reflection on modern methods: five myths about measurement error in epidemiological research. Int J Epidemiol. 2020;49:338–47. https://doi.org/10.1093/ije/dyz251.
Article Google Scholar
van der Wel MC, Buunk IE, van Weel C, et al. A novel approach to office blood pressure measurement: 30-minute office blood pressure vs daytime ambulatory blood pressure. Ann Fam Med. 2011;9:128–35. https://doi.org/10.1370/afm.1211.
Article Google Scholar
White JT, Fienen MN, Doherty JE. A python framework for environmental model uncertainty analysis. Environ Model Softw. 2016;85:217–28. https://doi.org/10.1016/j.envsoft.2016.08.017.
Article Google Scholar
Yu AYX, Quan H, McRae AD, et al. A cohort study on physician documentation and the accuracy of administrative data coding to improve passive surveillance of transient ischaemic attacks. BMJ Open. 2017;7: e015234. https://doi.org/10.1136/bmjopen-2016-015234.
Article Google Scholar
Zeger SL, Thomas D, Dominici F, et al. Exposure measurement error in time-series studies of air pollution: concepts and consequences. Environ Health Perspect. 2000;108:419–26. https://doi.org/10.1289/ehp.00108419.
Article Google Scholar
Zhu X, Wu X. Class noise vs. attribute noise: a quantitative study. Artif Intell Rev. 2004;22:177–210. https://doi.org/10.1007/s10462-004-0751-8.
Article MATH Google Scholar

Download references

Author information

Authors and Affiliations

Julius Clinical, Zeist, The Netherlands
Timo B. Brakenhoff
Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht University, Utrecht, The Netherlands
Maarten van Smeden & Daniel L. Oberski
Dept. Methodology & Statistics, Utrecht University, Utrecht, The Netherlands
Daniel L. Oberski

Authors

Timo B. Brakenhoff
View author publications
You can also search for this author in PubMed Google Scholar
Maarten van Smeden
View author publications
You can also search for this author in PubMed Google Scholar
Daniel L. Oberski
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Timo B. Brakenhoff .

Editor information

Editors and Affiliations

Amsterdam University Medical Center, University of Amsterdam, Amsterdam, The Netherlands
Folkert W. Asselbergs
Institute of Health Informatics, University College London, London, UK
Spiros Denaxas
Department of Data Science and Biostatistics, University Medical Center Utrecht (UMCU), Utrecht, The Netherlands
Daniel L. Oberski
Cedars-Sinai Medical Center, Los Angeles, CA, USA
Jason H. Moore

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Brakenhoff, T.B., van Smeden, M., Oberski, D.L. (2023). Statistical Analysis—Measurement Error. In: Asselbergs, F.W., Denaxas, S., Oberski, D.L., Moore, J.H. (eds) Clinical Applications of Artificial Intelligence in Real-World Data. Springer, Cham. https://doi.org/10.1007/978-3-031-36678-9_6

Download citation

DOI: https://doi.org/10.1007/978-3-031-36678-9_6
Published: 05 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36677-2
Online ISBN: 978-3-031-36678-9
eBook Packages: MedicineMedicine (R0)

Publish with us

Policies and ethics