Abstract
Purpose
The cross-cultural equivalence of patient-reported outcome (PRO) instruments is critical when they are used in international settings. The Universalist model of equivalence was proposed as a framework to investigate cross-cultural equivalence. The purpose of this paper was to illustrate how quantitative methods can be used to investigate cross-cultural equivalence within this framework.
Methods
The six types of equivalence of the Universalist model were reviewed from a statistical perspective, and statistical techniques that address the question underlying each type were identified. These methods are described, and examples of how they can be applied are provided. An integrated pragmatic approach to the exploration of cross-cultural equivalence was developed based on these methods.
Results
The statistical techniques identified were factor analysis to explore conceptual equivalence, differential item functioning analysis to explore semantic and item equivalence, and comparison of measurement properties to explore measurement equivalence. The statistical techniques addressing operational equivalence were found to be diverse and highly specific to the operational aspect under investigation. Functional equivalence involves a comprehensive appraisal of the potential impact of the results for the other types of equivalence on the conclusions of the research. This structured appraisal of functional equivalence offers a framework for a comprehensive but flexible approach to the efficient application of statistical analyses to explore the cross-cultural equivalence of PRO instruments.
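As a purely illustrative sketch of what comparing measurement properties across language versions might look like, the Python snippet below computes Cronbach's alpha separately for two groups of respondents. The data are simulated, and the names `cronbach_alpha`, `alpha_by_group`, `scores` and `language` are hypothetical; this is not the analysis reported in the paper, and a formal comparison would additionally require a statistical test of the difference between the coefficients.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for a respondents-by-items matrix of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the sum score
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)


def alpha_by_group(items, groups):
    """Compute alpha separately within each group (e.g. language version)."""
    items = np.asarray(items, dtype=float)
    groups = np.asarray(groups)
    return {str(g): cronbach_alpha(items[groups == g]) for g in np.unique(groups)}


# Simulated data (an assumption for illustration only): 200 respondents,
# 5 items driven by one latent trait plus independent noise.
rng = np.random.default_rng(0)
trait = rng.normal(size=(200, 1))
scores = trait + rng.normal(scale=0.8, size=(200, 5))
language = np.repeat(["original", "translated"], 100)

alphas = alpha_by_group(scores, language)
for version, alpha in alphas.items():
    print(f"{version}: alpha = {alpha:.2f}")
```

Similar internal consistency in both versions would be one piece of evidence in favour of measurement equivalence; other measurement properties, such as validity and responsiveness, would need to be compared in the same spirit.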
Conclusion
The different types of equivalence of the Universalist model can be investigated using quantitative methods. An integrated approach, which could be used in a variety of settings, was developed to allow the whole notion of cross-cultural equivalence to be comprehensively and efficiently addressed.
Acknowledgments
We would like to thank Christine de la Loge for having initiated this work and for her participation in the earliest phase of this work.
About this article
Cite this article
Regnault, A., Herdman, M. Using quantitative methods within the Universalist model framework to explore the cross-cultural equivalence of patient-reported outcome instruments. Qual Life Res 24, 115–124 (2015). https://doi.org/10.1007/s11136-014-0722-8
DOI: https://doi.org/10.1007/s11136-014-0722-8