Differential Item Functioning and its Relevance to Epidemiology

Jones, Richard N.

doi:10.1007/s40471-019-00194-5

Epidemiologic Methods (P Howards, Section Editor)
Published: 01 May 2019

Volume 6, pages 174–183, (2019)

Current Epidemiology Reports

Abstract

Purpose of Review

In this review, I trace the origins, applications, limitations, and future prospects of research on measurement item bias, or differential item functioning (DIF), in health research. DIF arises when multiple-item or multiple-symptom health instruments are used to rate the level of a particular condition. It describes the situation in which persons at the same level of the underlying condition do not all have the same probability of endorsing one or more symptoms. The presence of DIF can bias assessments of group differences and confound research on risk factors and outcomes.
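
To make this definition concrete, the following is a minimal sketch that is not taken from the article. It assumes a two-parameter logistic (2PL) item response function; the name `endorse_prob` and all parameter values are hypothetical and chosen only for illustration. Two persons at the same level of the underlying condition, belonging to groups for which the item has different difficulty, end up with different probabilities of endorsing the symptom (uniform DIF).

```python
import math


def endorse_prob(theta, discrimination, difficulty):
    """Assumed 2PL item response function: P(endorse symptom | theta)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Both persons have the same level of the underlying condition.
theta = 0.0

# Hypothetical item parameters: same discrimination, but the item is
# "harder" to endorse in the focal group. That difference is the DIF.
p_reference = endorse_prob(theta, discrimination=1.5, difficulty=0.0)
p_focal = endorse_prob(theta, discrimination=1.5, difficulty=0.5)

print(f"reference group: {p_reference:.3f}")
print(f"focal group:     {p_focal:.3f}")
```

With no DIF, the two difficulty values would be equal and the two probabilities would coincide for every value of `theta`.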

Recent Findings

The epidemiologic literature includes a great many applied, review, and methodological articles focusing on DIF. The preponderance of the literature appears in the areas of health-related quality of life, physical functioning, cognition, and mental health outcomes.

Summary

Epidemiologists and other researchers in the health sciences often rely upon multiple-item rating scales or questionnaires to assess the presence or level of health conditions or states that are not directly observable. When population subgroups at the same underlying level respond differently to a subset of the items, the phenomenon is called differential item functioning (DIF) and may be a source of bias.
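
As a rough illustration of how DIF in a subset of items can bias a group comparison, the sketch below computes the expected sum score on a hypothetical five-item scale for two groups at the same underlying level. It is an assumption-laden example, not the article's method: the logistic item response function, the helper names `endorse_prob` and `expected_sum_score`, and every item parameter are invented for illustration.

```python
import math


def endorse_prob(theta, a, b):
    """Assumed logistic item response function with slope a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def expected_sum_score(theta, items):
    """Expected number of endorsed items at a given underlying level."""
    return sum(endorse_prob(theta, a, b) for a, b in items)


# Hypothetical (slope, difficulty) pairs for the reference group.
reference_items = [(1.0, -1.0), (1.0, -0.5), (1.0, 0.0), (1.0, 0.5), (1.0, 1.0)]

# Focal group: identical items, except the third is harder to endorse (DIF).
focal_items = list(reference_items)
focal_items[2] = (1.0, 1.0)

theta = 0.0  # both groups are at the same underlying level
ref_score = expected_sum_score(theta, reference_items)
focal_score = expected_sum_score(theta, focal_items)

print(f"expected sum score, reference: {ref_score:.3f}")
print(f"expected sum score, focal:     {focal_score:.3f}")
print(f"spurious group difference:     {ref_score - focal_score:.3f}")
```

The groups do not differ on the underlying condition, yet the summed scale scores do, so a naive comparison of sum scores would report a group difference that is produced entirely by the one item with DIF.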

Notes

Interested readers can type “net from http://s3.amazonaws.com/mplusmimicbucket” and install our Stata module mplusmimic, which automates Mplus/MIMIC and multiple-group confirmatory factor analysis DIF detection algorithms.

Author information

Authors and Affiliations

Departments of Psychiatry and Human Behavior, and Neurology, Warren Alpert Medical School, Brown University, Butler Hospital, 345 Blackstone Boulevard, Box G-BH, Providence, RI, 02906-4800, USA
Richard N. Jones

Corresponding author

Correspondence to Richard N. Jones.

Ethics declarations

Conflict of Interest

R.N.J. declares no potential conflict of interest.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Epidemiologic Methods

About this article

Cite this article

Jones, R.N. Differential Item Functioning and its Relevance to Epidemiology. Curr Epidemiol Rep 6, 174–183 (2019). https://doi.org/10.1007/s40471-019-00194-5

Published: 01 May 2019
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s40471-019-00194-5
