Abstract
This chapter provides an overview of issues in the cross-cultural comparability of latent constructs, such as attitudes, values, and beliefs, measured in ILSA questionnaires. We present the aims and scope of statistical approaches for analyzing cross-cultural comparability in four parts. First, we introduce the framework of bias and equivalence that guides methodological considerations for valid cross-cultural comparisons. Second, we review the practice of construct validation and equivalence testing documented across various ILSAs, highlighting the urgency of, and the challenges in, ensuring comparability for multiple-group comparisons, as well as the related scientific debates. Third, we briefly describe recently developed approaches that accommodate nonequivalence arising from the numerous cultural, linguistic, and psychological differences in ILSA data. Finally, we discuss the findings and provide recommendations for ILSA data publishers, policy stakeholders, and researchers who use ILSA data for secondary analysis or who collect data for cross-cultural comparisons.
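To make the core idea of equivalence testing concrete, the sketch below simulates questionnaire responses for two hypothetical country groups from a one-factor model and compares the item loadings estimated separately per group. This is only an illustration of the intuition, assuming invented data, loadings, and group labels; it is not the multigroup confirmatory factor analysis workflow with configural, metric, and scalar models that the chapter reviews.

```python
# Minimal sketch (not the chapter's actual procedure): why measurement
# invariance matters before comparing groups. We estimate one-factor
# loadings separately per "country" and compare them; all names and
# numbers are hypothetical.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(seed=1)

def simulate_group(loadings, n=5000):
    """Item responses = loading * latent trait + noise."""
    trait = rng.normal(size=(n, 1))                      # latent attitude
    noise = rng.normal(scale=0.5, size=(n, len(loadings)))
    return trait @ np.asarray(loadings)[None, :] + noise

# Groups share three loadings; item 4 behaves differently in group B,
# mimicking item bias (e.g., a translation that shifted the meaning).
items_a = simulate_group([0.8, 0.7, 0.6, 0.7])
items_b = simulate_group([0.8, 0.7, 0.6, 0.2])

for label, items in (("A", items_a), ("B", items_b)):
    fa = FactorAnalysis(n_components=1).fit(items)
    load = fa.components_.ravel()
    load = load * np.sign(load.sum())                    # fix sign indeterminacy
    print(f"Group {label} loadings: {np.round(load, 2)}")

# Similar loadings across groups would support metric invariance; the
# diverging fourth loading flags a noninvariant item, so scale-score
# means are not directly comparable across these groups.
```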
Appendix
[1] ICCS 2009: Schulz, W., Ainley, J., & Fraillon, J. (Eds.). (2011). ICCS 2009 technical report. Amsterdam: International Association for the Evaluation of Educational Achievement (IEA). Retrieved from: https://www.iea.nl/sites/default/files/2019-04/ICCS_2009_Technical_Report.pdf
[2] ICCS 2016: Schulz, W., Carstens, R., Losito, B., & Fraillon, J. (Eds.). (2017). International Civic and Citizenship Education Study 2016 technical report. IEA Secretariat. Retrieved from: https://www.iea.nl/sites/default/files/2019-07/ICCS%202016_Technical%20Report_FINAL.pdf
[3] ICILS 2013: Fraillon, J., Schulz, W., Friedman, T., Ainley, J., & Gebhardt, E. (Eds.). (2015). ICILS 2013 technical report. IEA Secretariat. Retrieved from: http://pub.iea.nl/fileadmin/user_upload/Publications/Electronic_versions/ICILS_2013_Technical_Report.pdf
[4] PASEC 2014: Conference of Ministers of Education of French-Speaking Countries [CONFEMEN]. (2015). PASEC 2014: Education system performance in Francophone Sub-Saharan Africa. Competencies and learning factors in primary education. Retrieved from: http://www.pasec.confemen.org/wp-content/uploads/2015/12/Rapport_Pasec2014_GB_webv2.pdf
[5] PIAAC: OECD. (2016). Technical report of the Survey of Adult Skills (PIAAC) (2nd ed.). OECD. Retrieved from: http://www.oecd.org/skills/piaac/PIAAC_Technical_Report_2nd_Edition_Full_Report.pdf
[6] PIRLS 2011: Martin, M. O., & Mullis, I. V. S. (Eds.). (2012). Methods and procedures in TIMSS and PIRLS 2011. TIMSS & PIRLS International Study Center, Boston College. Retrieved from: https://timssandpirls.bc.edu/methods/index.html
[7] PIRLS 2016: Martin, M. O., Mullis, I. V. S., & Hooper, M. (Eds.). (2017). Methods and procedures in PIRLS 2016. TIMSS & PIRLS International Study Center, Boston College. Retrieved from: https://timssandpirls.bc.edu/publications/pirls/2016-methods.html
[8] PISA 2012: OECD. (2014). PISA 2012 technical report. OECD. Retrieved from: https://www.oecd.org/pisa/pisaproducts/PISA-2012-technical-report-final.pdf
[9] PISA 2015: OECD. (2017). PISA 2015 technical report. OECD. Retrieved from: https://www.oecd.org/pisa/data/2015-technical-report/PISA2015_TechRep_Final.pdf
[10] PRIDI 2013: Verdisco, A., Cueto, S., Thompson, J., Engle, P., Neuschmidt, O., Meyer, S., et al. (2014). Urgency and possibility: Results of PRIDI, a first initiative to create regionally comparative data on child development in four Latin American countries. Technical annex. Inter-American Development Bank. Retrieved from: https://webimages.iadb.org/education/Instrumentos/PRIDI_Technical_Annex.pdf
[11] STEP: Pierre, G., Sanchez Puerta, M. L., Valerio, A., & Rajadel, T. (2014). STEP skills measurement surveys: Innovative tools for assessing skills (Social Protection and Labor Discussion Paper No. 1421). World Bank Group. Retrieved from: https://openknowledge.worldbank.org/handle/10986/19985
[12] TALIS 2013: OECD. (2014). TALIS 2013 technical report. OECD. Retrieved from: http://www.oecd.org/education/school/TALIS-technical-report-2013.pdf
[13] TALIS 2018: OECD. (2019). TALIS 2018 technical report. OECD. Retrieved from: http://www.oecd.org/education/talis/TALIS_2018_Technical_Report.pdf
[14] TEDS-M 2008: Tatto, M. T. (Ed.). (2013). The Teacher Education and Development Study in Mathematics (TEDS-M): Policy, practice, and readiness to teach primary and secondary mathematics in 17 countries. Technical report. IEA. Retrieved from: http://pub.iea.nl/fileadmin/user_upload/Publications/Electronic_versions/TEDS-M_technical_report.pdf
[15] TERCE 2013: United Nations Educational, Scientific and Cultural Organization – Oficina Regional de Educación para América Latina y el Caribe. (2016). Reporte técnico: Tercer Estudio Regional Comparativo y Explicativo (TERCE) [Technical report: Third Regional Comparative and Explanatory Study]. Retrieved from: https://dokumen.tips/documents/reporte-tecnico-230916-final-agradecimientos-el-reporte-tecnico-detalla-la.html
[16] TIMSS 2011: Martin, M. O., & Mullis, I. V. S. (Eds.). (2012). Methods and procedures in TIMSS and PIRLS 2011. TIMSS & PIRLS International Study Center, Boston College. Retrieved from: https://timssandpirls.bc.edu/methods/index.html
[17] TIMSS 2015: Martin, M. O., Mullis, I. V. S., & Hooper, M. (Eds.). (2016). Methods and procedures in TIMSS 2015. TIMSS & PIRLS International Study Center, Boston College. Retrieved from: http://timssandpirls.bc.edu/publications/timss/2015-methods.html
[18] TIMSS Advanced 2015: Martin, M. O., Mullis, I. V. S., & Hooper, M. (Eds.). (2016). Methods and procedures in TIMSS Advanced 2015. TIMSS & PIRLS International Study Center, Boston College. Retrieved from: http://timss.bc.edu/publications/timss/2015-a-methods.html
Cite this entry
He, J., Buchholz, J., & Fischer, J. (2021). Cross-cultural comparability of latent constructs in ILSAs. In T. Nilsen, A. Stancel-Piątak, & J.-E. Gustafsson (Eds.), International handbook of comparative large-scale studies in education (Springer International Handbooks of Education). Springer, Cham. https://doi.org/10.1007/978-3-030-38298-8_58-1