
Using quantitative methods within the Universalist model framework to explore the cross-cultural equivalence of patient-reported outcome instruments

  • Quantitative Methods Special Section
Quality of Life Research

Abstract

Purpose

The cross-cultural equivalence of patient-reported outcome (PRO) instruments is critical when they are used in international settings. The Universalist model of equivalence was proposed as a framework to investigate cross-cultural equivalence. The purpose of this paper was to illustrate how quantitative methods can be used to investigate cross-cultural equivalence within this framework.

Methods

The six types of equivalence of the Universalist model were reviewed from a statistical perspective, and statistical techniques able to address the underlying question of each type were identified. These methods are described, and examples of how they can be applied are provided. An integrated pragmatic approach to the exploration of cross-cultural equivalence was developed based on these methods.

Results

The statistical techniques identified were factor analysis to explore conceptual equivalence, differential item functioning to explore semantic and item equivalence, and comparison of measurement properties to explore measurement equivalence. The statistical techniques addressing operational equivalence were found to be diverse and highly specific to the operational aspect under investigation. Functional equivalence involves a comprehensive appraisal of the potential impact of the results of the other equivalences on the conclusions of the research. This structured appraisal of functional equivalence offers a framework for a comprehensive but flexible approach to the efficient application of statistical analyses to explore cross-cultural equivalence of PRO instruments.
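
As an illustration of what a differential item functioning (DIF) analysis can look like in practice, the following minimal Python sketch computes the Mantel-Haenszel common odds ratio for a single item across two language groups. It is not taken from the article: the function name, the "ref"/"focal" group labels, the restriction to dichotomous (0/1) items, and the choice of the total score as the matching variable are all assumptions made for the example. PRO items are often ordinal, in which case other techniques (for example ordinal logistic regression) may be more appropriate.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(responses, groups, item):
    """Mantel-Haenszel common odds ratio for one dichotomous item (illustrative).

    responses: list of lists of 0/1 item scores, one inner list per respondent
    groups:    list of "ref" or "focal" labels, one per respondent (hypothetical labels)
    item:      index of the studied item
    Respondents are matched on their total score (an assumption of this sketch).
    """
    # One 2x2 table per total-score stratum: counts[group][response]
    strata = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for row, group in zip(responses, groups):
        strata[sum(row)][group][row[item]] += 1

    num = den = 0.0
    for table in strata.values():
        a, b = table["ref"][1], table["ref"][0]      # reference: endorsed / not endorsed
        c, d = table["focal"][1], table["focal"][0]  # focal: endorsed / not endorsed
        n = a + b + c + d
        num += a * d / n
        den += b * c / n

    if num == 0 or den == 0:
        return None  # odds ratio is undefined for these data

    alpha = num / den                 # 1.0 means no DIF at matched score levels
    delta = -2.35 * math.log(alpha)   # same quantity on the ETS delta scale
    return alpha, delta
```

An odds ratio close to 1 (delta close to 0) suggests that, among respondents with the same total score, the two groups endorse the item at similar rates. How large a departure should be flagged as DIF is a separate judgment that this sketch does not make.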

Conclusion

The different types of equivalence of the Universalist model can be investigated using quantitative methods. An integrated approach, which could be used in a variety of settings, was developed to allow the whole notion of cross-cultural equivalence to be comprehensively and efficiently addressed.



Acknowledgments

We would like to thank Christine de la Loge for initiating this work and for participating in its earliest phase.

Author information

Corresponding author

Correspondence to Antoine Regnault.

About this article

Cite this article

Regnault, A., Herdman, M. Using quantitative methods within the Universalist model framework to explore the cross-cultural equivalence of patient-reported outcome instruments. Qual Life Res 24, 115–124 (2015). https://doi.org/10.1007/s11136-014-0722-8

  • DOI: https://doi.org/10.1007/s11136-014-0722-8
