Monitoring Countries in a Changing World: A New Look at DIF in International Surveys

Zwitser, Robert J.; Glaser, S. Sjoerd F.; Maris, Gunter

doi:10.1007/s11336-016-9543-8

Published: 14 November 2016

Psychometrika, Volume 82, pages 210–232 (2017)

Robert J. Zwitser¹, S. Sjoerd F. Glaser¹ & Gunter Maris¹,²

Abstract

This paper discusses the issue of differential item functioning (DIF) in international surveys. DIF is likely to occur in international surveys. What is needed is a statistical approach that takes DIF into account, while at the same time allowing for meaningful comparisons between countries. Some existing approaches are discussed and an alternative is provided. The core of this alternative approach is to define the construct as a large set of items, and to report in terms of summary statistics. Since the data are incomplete, measurement models are used to complete the incomplete data. For that purpose, different models can be used across countries. The method is illustrated with PISA’s reading literacy data. The results indicate that this approach fits the data better than the current PISA methodology; however, the league tables are nearly identical. The implications for monitoring changes over time are discussed.
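The idea of completing incomplete data with a country-specific measurement model and then reporting a summary statistic on the full item set can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a dichotomous Rasch model, treats each student's ability estimate as given, and fills unobserved items with their model-expected score. All function names, item labels, and numbers are hypothetical.

```python
import math


def rasch_prob(theta, beta):
    """P(correct) under the Rasch model for ability theta and item difficulty beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def completed_sum_score(observed, theta, betas):
    """Score on the full item set, with unobserved items filled in by the model.

    observed: dict item -> 0/1 for the items this student actually answered
    theta:    ability estimate under the student's own country model (assumed given)
    betas:    dict item -> difficulty in that country's model, covering all items
    """
    total = 0.0
    for item, beta in betas.items():
        if item in observed:
            total += observed[item]            # keep the observed response
        else:
            total += rasch_prob(theta, beta)   # model-based completion
    return total


def country_mean(students, betas):
    """Summary statistic: mean completed sum score over a country's students."""
    scores = [completed_sum_score(obs, theta, betas) for obs, theta in students]
    return sum(scores) / len(scores)


# Hypothetical data: two countries with different difficulties for item "i2",
# i.e. DIF is absorbed by the country-specific model rather than removed.
betas_a = {"i1": -1.0, "i2": 0.0, "i3": 0.0}
betas_b = {"i1": -1.0, "i2": 1.0, "i3": 0.0}
students_a = [({"i1": 1}, 0.0), ({"i1": 0, "i2": 1}, 0.0)]
students_b = [({"i1": 1, "i3": 1}, 0.0)]

mean_a = country_mean(students_a, betas_a)
mean_b = country_mean(students_b, betas_b)
```

Because both means are expressed as scores on the same full item set, they can be compared directly even though the item parameters differ between the two countries.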

Notes

In fact, PISA consists of participating economies. However, since most economies are countries, and since we think that the term countries is easier for the reader, we use the term countries instead of economies.
The parameters of polytomous items are connected with a dotted line.
The data were retrieved from http://pisa2003.acer.edu.au/downloads.php and http://pisa2006.acer.edu.au/downloads.php on August 22nd, 2013.
The item numbering is according to the order in which the items appear in booklet 6 of PISA 2006.
Around 2000, it was discussed whether this construct should be part of the PISA survey.

Author information

Authors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Robert J. Zwitser, S. Sjoerd F. Glaser & Gunter Maris
Cito Institute for Educational Measurement, Arnhem, The Netherlands
Gunter Maris

Corresponding author

Correspondence to Robert J. Zwitser.

Appendix

See Table 5.

Table 5 Item parameters of the models explained in Section 4.1.

About this article

Cite this article

Zwitser, R.J., Glaser, S.S.F. & Maris, G. Monitoring Countries in a Changing World: A New Look at DIF in International Surveys. Psychometrika 82, 210–232 (2017). https://doi.org/10.1007/s11336-016-9543-8

Received: 22 February 2014
Revised: 06 October 2016
Published: 14 November 2016
Issue Date: March 2017
DOI: https://doi.org/10.1007/s11336-016-9543-8
