The practical impact of differential item functioning analyses in a health-related quality of life instrument

Scott, Neil W.; Fayers, Peter M.; Aaronson, Neil K.; Bottomley, Andrew; de Graeff, Alexander; Groenvold, Mogens; Gundy, Chad; Koller, Michael; Petersen, Morten A.; Sprangers, Mirjam A. G.

doi:10.1007/s11136-009-9521-z

Published: 04 August 2009

Quality of Life Research, Volume 18, pages 1125–1130 (2009)

Abstract

Introduction

Differential item functioning (DIF) analyses are commonly used to evaluate health-related quality of life (HRQoL) instruments. There is, however, a lack of consensus as to how to assess the practical impact of statistically significant DIF results.

Methods

Using our previously published ordinal logistic regression DIF results for the Fatigue scale of a HRQoL instrument as an example, the practical impact on a particular Norwegian clinical trial was investigated. The results were used to determine the difference in mean Fatigue scores assuming that the same trial was conducted in the UK. The results were then compared with published information on what would be considered a clinically important change in scores.

Results

The item with the largest DIF effect resulted in differences between the mean English and Norwegian Fatigue scores that, although small, could be considered clinically important. Sensitivity analyses showed that larger differences were found for shorter scales, and when the proportions in each response category were equal.

Discussion

Our scenarios suggest that translation differences in an item can result in small, but clinically important, differences at the scale score level. This is more likely to be problematic for observational studies than for clinical trials, where randomised groups are stratified by centre.

References

Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Lawrence Erlbaum.
Crane, P. K., Gibbons, L. E., Jolley, L., & van Belle, G. (2006). Differential item functioning analysis with ordinal logistic regression techniques. Medical Care, 44, S115–S123.
Groenvold, M., & Petersen, M. A. (2005). The role and use of differential item functioning (DIF) analysis of quality of life data from clinical trials. In P. Fayers & R. Hays (Eds.), Assessing quality of life in clinical trials (pp. 195–208). Oxford: Oxford University Press.
Teresi, J. A. (2006). Overview of quantitative measurement methods: Equivalence, invariance, and differential item functioning in health applications. Medical Care, 44, S39–S49.
Teresi, J. A. (2006). Different approaches to differential item functioning in health applications: Advantages, disadvantages and some neglected topics. Medical Care, 44, S152–S170.
Scott, N. W., Fayers, P. M., Bottomley, A., Aaronson, N. K., de Graeff, A., Groenvold, M., et al. (2006). Comparing translations of the EORTC QLQ-C30 using differential item functioning analyses. Quality of Life Research, 15, 1103–1115.
Clauser, B. E., & Mazor, K. M. (1998). Using statistical procedures to identify differentially functioning test items. Educational Measurement: Issues and Practice, 2, 31–44.
Millsap, R. E. (2006). Comments on methods for the investigation of measurement bias in the mini-mental state examination. Medical Care, 44, S171–S175.
Jodoin, M. G., & Gierl, M. J. (2001). Evaluating type I error and power rates using an effect size measure with the logistic regression procedure for DIF detection. Applied Measurement in Education, 14, 329–349.
Bjorner, J. B., Kreiner, S., Ware, J. E., Damsgaard, M. T., & Bech, P. (1998). Differential item functioning in the Danish translation of the SF-36. Journal of Clinical Epidemiology, 51, 1189–1202.
Fayers, P., Aaronson, N., Bjordal, K., Groenvold, M., Curran, D., & Bottomley, A. (2001). EORTC QLQ-C30 scoring manual. Brussels: European Organization for Research and Treatment of Cancer.
Scott, N. W., Fayers, P. M., Aaronson, N. K., Bottomley, A., de Graeff, A., Groenvold, M., et al. (2007). The use of differential item functioning analyses to identify cultural differences in responses to the EORTC QLQ-C30. Quality of Life Research, 16, 115–129.
Wisloff, F., Hjorth, M., Kaasa, S., & Westin, J. (1996). Effect of interferon on the health-related quality of life of multiple myeloma patients: Results of a Nordic randomized trial comparing melphalan-prednisone to melphalan-prednisone + alpha-interferon. British Journal of Haematology, 94, 324–332.
Zieky, M. (1993). Practical questions in the use of DIF statistics in test development. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 337–348). Hillsdale, NJ: Lawrence Erlbaum.

Author information

Authors and Affiliations

Section of Population Health, Institute of Applied Health Sciences, University of Aberdeen, Polwarth Building, Foresterhill, Aberdeen, AB25 2ZD, UK
Neil W. Scott & Peter M. Fayers
Department of Cancer Research and Molecular Medicine, Faculty of Medicine, Norwegian University of Science and Technology, Trondheim, Norway
Peter M. Fayers
Division of Psychosocial Research and Epidemiology, Netherlands Cancer Institute, Amsterdam, The Netherlands
Neil K. Aaronson & Chad Gundy
Quality of Life Department, European Organisation for Research and Treatment of Cancer Headquarters, Brussels, Belgium
Andrew Bottomley
Division of Medical Oncology, Department of Internal Medicine, University Medical Centre, Utrecht, The Netherlands
Alexander de Graeff
Department of Palliative Medicine, Bispebjerg Hospital, Copenhagen, Denmark
Mogens Groenvold & Morten A. Petersen
Institute of Public Health, University of Copenhagen, Copenhagen, Denmark
Mogens Groenvold
Centre for Clinical Studies, University Hospital Regensburg, Regensburg, Germany
Michael Koller
Department of Medical Psychology, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
Mirjam A. G. Sprangers

Corresponding author

Correspondence to Peter M. Fayers.

Additional information

On behalf of the EORTC Quality of Life Group and the Quality of Life Cross-Cultural Meta-Analysis Group.

Appendix

Coding the four response categories (not at all, a little, quite a bit, very much) as 0, 1, 2 and 3, respectively, the following three equations apply when conducting ordinal logistic regression using the proportional odds model:

$$ \text{logit}\bigl(\Pr(Y \ge j)\bigr) = \beta_{0j} + \beta_{1}\,\text{LANG} + \beta_{2}\,\text{FA} + \beta_{3}\,\text{AGE} + \cdots, \quad j = 1, 2, 3 $$

(1)

where LANG is the language/translation (0 for English, 1 for Norwegian), FA is the overall Fatigue scale score and adjustment is also made for age (AGE) and other covariates.

These equations may then be written out separately for Norwegian (NO) and English (EN) speakers:

$$ \begin{aligned} \text{logit}\bigl(\Pr(Y_{\text{NO}} \ge j)\bigr) &= \beta_{0j} + \beta_{1} + \beta_{2}\,\text{FA} + \beta_{3}\,\text{AGE} + \cdots, \quad j = 1, 2, 3 \\ \text{logit}\bigl(\Pr(Y_{\text{EN}} \ge j)\bigr) &= \beta_{0j} + \beta_{2}\,\text{FA} + \beta_{3}\,\text{AGE} + \cdots, \quad j = 1, 2, 3 \end{aligned} $$

(2)

Combining the two equations and rearranging gives

$$ \Pr(Y_{\text{EN}} \ge j) = \frac{1}{\left( \frac{1 - \Pr(Y_{\text{NO}} \ge j)}{\Pr(Y_{\text{NO}} \ge j)} \right) e^{\beta_{1}} + 1}, \quad j = 1, 2, 3 $$

(3)
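
The rearrangement behind equation (3) can be spelled out as follows. This is a sketch that assumes both lines of equation (2) are evaluated at the same FA, AGE and other covariate values, so that those terms cancel; the shorthand $p_{\text{NO},j} = \Pr(Y_{\text{NO}} \ge j)$ and $p_{\text{EN},j} = \Pr(Y_{\text{EN}} \ge j)$ is introduced here for brevity and is not part of the original notation.

$$ \begin{aligned} \text{logit}(p_{\text{NO},j}) - \text{logit}(p_{\text{EN},j}) &= \beta_{1} \\ \frac{p_{\text{EN},j}}{1 - p_{\text{EN},j}} &= e^{-\beta_{1}}\,\frac{p_{\text{NO},j}}{1 - p_{\text{NO},j}} \\ \frac{1 - p_{\text{EN},j}}{p_{\text{EN},j}} &= e^{\beta_{1}}\,\frac{1 - p_{\text{NO},j}}{p_{\text{NO},j}} \\ p_{\text{EN},j} &= \frac{1}{\left( \frac{1 - p_{\text{NO},j}}{p_{\text{NO},j}} \right) e^{\beta_{1}} + 1} \end{aligned} $$

The first line subtracts the English equation from the Norwegian one, the second exponentiates to move from log-odds to odds, and the last two invert the odds and solve for $p_{\text{EN},j}$.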

From the results of our DIF analyses for Q18, the estimate of β₁ was found to be −1.089. From Table 2, for this particular study:

$$ \begin{aligned} &{\Pr (Y_{\text{NO}} \ge 1) = 450/513 = 0.877}\\ &{\Pr (Y_{\text{NO}} \ge 2) = 242/513 = 0.472} \\ &{\Pr (Y_{\text{NO}} \ge 3) = 87/513\;\;= 0.170} \end{aligned} $$

(4)

Substituting these values into formula (3) gives:

$$ \begin{aligned} \Pr(Y_{\text{EN}} \ge 1) &= 0.955 \\ \Pr(Y_{\text{EN}} \ge 2) &= 0.726 \\ \Pr(Y_{\text{EN}} \ge 3) &= 0.378 \end{aligned} $$

(5)
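
The substitution can be checked numerically. The following is a minimal Python sketch, not the authors' own code: the function name `english_cumulative` and the constant names are hypothetical, while the counts, the total of 513 and the value of β₁ are the ones quoted in the text.

```python
import math

def english_cumulative(p_no: float, beta1: float) -> float:
    """Equation (3): Pr(Y_EN >= j) given Pr(Y_NO >= j) and the language effect beta1."""
    odds_against = (1.0 - p_no) / p_no  # (1 - Pr(Y_NO >= j)) / Pr(Y_NO >= j)
    return 1.0 / (odds_against * math.exp(beta1) + 1.0)

BETA1 = -1.089                        # language effect for Q18 quoted in the text
N_TOTAL = 513                         # Norwegian respondents, from equation (4)
COUNTS_GE = {1: 450, 2: 242, 3: 87}   # respondents with Y >= j, from equation (4)

# Apply equation (3) at each of the three thresholds.
p_en = {j: english_cumulative(count / N_TOTAL, BETA1) for j, count in COUNTS_GE.items()}

for j, p in p_en.items():
    print(f"Pr(Y_EN >= {j}) = {p:.3f}")
```

Rounded to three decimals, the printed values agree with equation (5).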

Hence the proportions choosing each category can be deduced, assuming that the same study was conducted using English-speaking patients:

$$ \begin{aligned} \Pr(\text{not at all}_{\text{EN}}) &= 1 - 0.955 = 0.045 \\ \Pr(\text{a little}_{\text{EN}}) &= 0.955 - 0.726 = 0.229 \\ \Pr(\text{quite a bit}_{\text{EN}}) &= 0.726 - 0.378 = 0.348 \\ \Pr(\text{very much}_{\text{EN}}) &= 0.378 \end{aligned} $$

(6)
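
The step from cumulative probabilities to category proportions is a chain of successive differences. A short Python sketch of that step, using the rounded values from equation (5) as input; the helper name `category_proportions` is hypothetical and chosen only for illustration.

```python
def category_proportions(cumulative):
    """Convert [Pr(Y>=1), Pr(Y>=2), Pr(Y>=3)] into Pr(Y=k) for k = 0..3."""
    bounds = [1.0, *cumulative, 0.0]  # Pr(Y>=0) = 1 and Pr(Y>=4) = 0 by definition
    return [upper - lower for upper, lower in zip(bounds, bounds[1:])]

LABELS = ["not at all", "a little", "quite a bit", "very much"]
P_EN_CUMULATIVE = [0.955, 0.726, 0.378]  # rounded values from equation (5)

props = category_proportions(P_EN_CUMULATIVE)
for label, p in zip(LABELS, props):
    print(f"Pr({label}, EN) = {p:.3f}")
```

Because each proportion is a difference of adjacent cumulative probabilities, the four values necessarily sum to one.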

By comparison with the proportions in Table 2, this would mean that English speakers would be more likely to score highly on this item than Norwegian speakers.

Assuming that Q18 is the only item with true DIF, how would this affect the mean Fatigue scale scores of the patients in this study? Using scores of 0, 33.33, 66.67 and 100 for the four categories of Q18, the average scores for this item for Norwegian and English speakers would be:

$$ \begin{aligned} \text{Norwegian: } & 0 \times 0.123 + 33.33 \times 0.406 + 66.67 \times 0.302 + 100 \times 0.170 = 50.62 \\ \text{English: } & 0 \times 0.045 + 33.33 \times 0.229 + 66.67 \times 0.348 + 100 \times 0.378 = 68.63 \end{aligned} $$

(7)

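Therefore, Norwegian speakers would be expected to score on average 68.63 − 50.62 = 18.01 points lower on this item. (The Norwegian mean of 50.62 is reproduced with the unrounded proportions implied by the counts in equation (4); the three-decimal proportions shown in equation (7) give 50.67.)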

The Fatigue scale score is made up of three items of equal weighting, so this would mean that for the overall scale score Norwegians would be expected to score 18.01/3 = 6.00 points lower than English speakers on average.
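
As a cross-check on this arithmetic, note that with category scores of 0, 100/3, 200/3 and 100 the mean item score equals 100/3 times the sum of the three cumulative probabilities. The following worked equation is a sketch based on that identity, using the rounded values from equations (4) and (5); the symbols $\bar{x}_{\text{EN}}$ and $\bar{x}_{\text{NO}}$ for the mean Q18 scores are introduced here and do not appear in the original.

$$ \bar{x}_{\text{EN}} - \bar{x}_{\text{NO}} = \frac{100}{3} \sum_{j=1}^{3} \bigl[ \Pr(Y_{\text{EN}} \ge j) - \Pr(Y_{\text{NO}} \ge j) \bigr] \approx \frac{100}{3} \, (2.059 - 1.519) = 18.0 $$

Dividing by the three items in the scale gives a difference of about 6.0 points on the Fatigue scale, with Norwegian speakers scoring lower, in line with the item-level result above.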

Table 1 shows that the 95% confidence limits for the language effect for Q18 (β₁) were −1.271 and −0.908. Following the same methods as for β₁ itself (working not shown), this implies that the difference between English and Norwegian speakers on the Fatigue subscale would be 6.00 (95% CI: 5.03–6.95).
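
Since the working for the confidence limits is not shown, the following Python sketch illustrates one way to reproduce it, under the assumption that each limit of β₁ is simply substituted into equation (3) and carried through the same steps. All function and constant names are hypothetical; the counts and β₁ values are those quoted in the text.

```python
import math

N_TOTAL = 513
COUNTS_GE = [450, 242, 87]                 # Norwegian counts with Y >= 1, 2, 3 (equation 4)
P_NO = [c / N_TOTAL for c in COUNTS_GE]    # unrounded cumulative probabilities
N_ITEMS = 3                                # equally weighted items in the Fatigue scale
STEP = 100 / 3                             # score gap between adjacent response categories

def english_cumulative(p_no, beta1):
    """Equation (3): Pr(Y_EN >= j) from Pr(Y_NO >= j) and the language effect."""
    return 1.0 / (((1.0 - p_no) / p_no) * math.exp(beta1) + 1.0)

def item_mean(cumulative):
    """Mean of an item scored 0, 33.33, 66.67, 100: STEP times the sum of Pr(Y >= j)."""
    return STEP * sum(cumulative)

def scale_difference(beta1):
    """English minus Norwegian mean Fatigue score, assuming only Q18 shows DIF."""
    p_en = [english_cumulative(p, beta1) for p in P_NO]
    return (item_mean(p_en) - item_mean(P_NO)) / N_ITEMS

for label, beta1 in [("estimate", -1.089), ("lower limit", -1.271), ("upper limit", -0.908)]:
    print(f"beta1 = {beta1:+.3f} ({label}): scale difference = {scale_difference(beta1):.2f}")
```

Under this assumption the three printed differences round to 6.00, 6.95 and 5.03. The lower confidence limit of β₁ (the more negative value) yields the upper limit of the score difference, and vice versa.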

About this article

Cite this article

Scott, N.W., Fayers, P.M., Aaronson, N.K. et al. The practical impact of differential item functioning analyses in a health-related quality of life instrument. Qual Life Res 18, 1125–1130 (2009). https://doi.org/10.1007/s11136-009-9521-z

Received: 13 March 2009
Accepted: 11 July 2009
Published: 04 August 2009
Issue Date: October 2009
DOI: https://doi.org/10.1007/s11136-009-9521-z
