Quality of Life Research, Volume 23, Issue 1, pp 217–227

Difference in method of administration did not significantly impact item response: an IRT-based analysis from the Patient-Reported Outcomes Measurement Information System (PROMIS) initiative

  • Jakob B. Bjorner
  • Matthias Rose
  • Barbara Gandek
  • Arthur A. Stone
  • Doerte U. Junghaenel
  • John E. Ware Jr.



The aim of this study was to test the impact of method of administration (MOA) on the measurement characteristics of items developed in the Patient-Reported Outcomes Measurement Information System (PROMIS).


Two non-overlapping parallel 8-item forms from each of three PROMIS domains (physical function, fatigue, and depression) were completed by 923 adults (age 18–89) with chronic obstructive pulmonary disease, depression, or rheumatoid arthritis. In a randomized cross-over design, subjects answered one form by interactive voice response (IVR) technology, paper questionnaire (PQ), personal digital assistant (PDA), or personal computer (PC) on the Internet, and a second form by PC, in the same administration. Structural invariance, equivalence of item responses, and measurement precision were evaluated using confirmatory factor analysis and item response theory methods.


Multigroup confirmatory factor analysis supported equivalence of factor structure across MOA. Analyses by item response theory found no differences in item location parameters and strongly supported the equivalence of scores across MOA.


We found no statistically or clinically significant differences in score levels in IVR, PQ, or PDA administration as compared to PC. Availability of large item response theory-calibrated PROMIS item banks allowed for innovations in study design and analysis.


Keywords: Patient-reported outcomes; Quality of life; Questionnaire; Mode of administration; Method of administration; Item response theory



Abbreviations

  • CAT: Computerized adaptive testing
  • COPD: Chronic obstructive pulmonary disease
  • IRT: Item response theory
  • IVR: Interactive voice response
  • MOA: Method of administration
  • PC: Personal computer
  • PDA: Personal digital assistant
  • PF: Physical functioning
  • PQ: Paper questionnaire
  • PROC MIXED: SAS procedure for estimating mixed models
  • PRO: Patient-reported outcomes
  • PROMIS: Patient-Reported Outcomes Measurement Information System
  • WLSMV: Weighted least squares with mean and variance adjustment



The Patient-Reported Outcomes Measurement Information System (PROMIS) is a National Institutes of Health (NIH) Roadmap initiative to develop a computerized system measuring patient-reported outcomes in respondents with a wide range of chronic diseases and demographic characteristics. PROMIS was funded by cooperative agreements to a Statistical Coordinating Center (Northwestern University, PI: David Cella, PhD, U01AR52177) and six Primary Research Sites (Duke University, PI: Kevin Weinfurt, PhD, U01AR52186; University of North Carolina, PI: Darren DeWalt, MD, MPH, U01AR52181; University of Pittsburgh, PI: Paul A. Pilkonis, PhD, U01AR52155; Stanford University, PI: James Fries, MD, U01AR52158; Stony Brook University, PI: Arthur Stone, PhD, U01AR52170; and University of Washington, PI: Dagmar Amtmann, PhD, U01AR52171). NIH Science Officers on this project are Deborah Ader, PhD, Susan Czajkowski, PhD, Lawrence Fine, MD, DrPH, Louis Quatrano, PhD, Bryce Reeve, PhD, William Riley, PhD, and Susana Serrate-Sztein, PhD. This manuscript was reviewed by the PROMIS Publications Subcommittee prior to external peer review. The authors would like to thank two anonymous PROMIS reviewers and two journal reviewers for comments on a previous version of this manuscript. See the PROMIS web site for additional information on the PROMIS cooperative group.



Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Jakob B. Bjorner (1, 2, 3)
  • Matthias Rose (4, 5)
  • Barbara Gandek (5)
  • Arthur A. Stone (6)
  • Doerte U. Junghaenel (6)
  • John E. Ware Jr. (5, 7)

  1. QualityMetric, Lincoln, USA
  2. Department of Public Health, University of Copenhagen, Copenhagen, Denmark
  3. National Research Centre for the Working Environment, Copenhagen, Denmark
  4. Department of Psychosomatic Medicine and Psychotherapy, Medical Clinic, Charité, Universitätsmedizin, Berlin, Germany
  5. Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, USA
  6. Department of Psychiatry and Behavioral Science, Stony Brook University, Stony Brook, USA
  7. John Ware Research Group, Worcester, USA
