Quality of Life Research, Volume 20, Issue 9, pp 1349–1357

Migrating from a legacy fixed-format measure to CAT administration: calibrating the PHQ-9 to the PROMIS depression measures

  • Laura E. Gibbons
  • Betsy J. Feldman
  • Heidi M. Crane
  • Michael Mugavero
  • James H. Willig
  • Donald Patrick
  • Joseph Schumacher
  • Michael Saag
  • Mari M. Kitahata
  • Paul K. Crane



We provide detailed instructions for analyzing patient-reported outcome (PRO) data collected with an existing (legacy) instrument so that scores can be calibrated to the PRO Measurement Information System (PROMIS) metric. This calibration eases migration to PROMIS computerized adaptive test (CAT) data collection and allows historical legacy data to be analyzed alongside new PROMIS data.


A cross-sectional convenience sample (n = 2,178) from the Universities of Washington and Alabama at Birmingham HIV clinics completed the PROMIS short form and Patient Health Questionnaire (PHQ-9) depression symptom measures between August 2008 and December 2009. We calibrated the tests using item response theory. We compared measurement precision of the PHQ-9, the PROMIS short form, and simulated PROMIS CAT.
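Measurement precision in item response theory is commonly summarized by the standard error of measurement, which is approximately the inverse square root of the test information at a given trait level. The sketch below illustrates that idea under an assumed graded response model. The item bank parameters are invented placeholders, the function names are hypothetical, and the code is not the simulation software used in the study. It shows how the precision of a fixed set of items might be computed and how a CAT could pick its next item by maximum information.

```python
import math


def cumulative_probs(theta, a, thresholds):
    """P(X >= k) for each category boundary, bracketed by 1 and 0."""
    inner = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    return [1.0] + inner + [0.0]


def item_information(theta, a, thresholds):
    """Fisher information of one graded-response item at trait level theta."""
    cum = cumulative_probs(theta, a, thresholds)
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]  # probability of category k
        # Derivative of the category probability with respect to theta.
        dp = a * (cum[k] * (1 - cum[k]) - cum[k + 1] * (1 - cum[k + 1]))
        if p > 0:
            info += dp * dp / p
    return info


def sem(theta, items):
    """Standard error of measurement at theta for a set of items."""
    total = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total)


def most_informative(theta, bank, administered):
    """Index of the unused item with the highest information at theta."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))


# Hypothetical bank: (discrimination, ordered thresholds). Not real PROMIS values.
HYPOTHETICAL_BANK = [
    (3.0, (-0.5, 0.0, 0.5)),
    (1.5, (0.0, 1.0, 2.0)),
    (2.0, (1.5, 2.5, 3.5)),
]

first = most_informative(0.0, HYPOTHETICAL_BANK, set())
print("first item selected at theta = 0:", first)
print("SEM after one item:", round(sem(0.0, [HYPOTHETICAL_BANK[first]]), 3))
print("SEM with the full bank:", round(sem(0.0, HYPOTHETICAL_BANK), 3))
```

Because information is additive across items, a CAT that always administers the most informative remaining item can approach the precision of a longer fixed form with fewer questions, which is the kind of comparison reported here.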


Dimensionality analyses confirmed that the PHQ-9 could be calibrated to the PROMIS metric. We provide code used to score the PHQ-9 on the PROMIS metric. The mean standard errors of measurement were 0.49 for the PHQ-9, 0.35 for the PROMIS short form, and 0.37, 0.28, and 0.27 for simulated 3-, 8-, and 9-item CATs, respectively.
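One way to picture scoring a fixed-format questionnaire on an IRT metric is expected a posteriori (EAP) estimation under a graded response model. The sketch below is illustrative only: it is not the code the authors provide, the item parameters are invented placeholders rather than the published calibration, and all function names are hypothetical. It assumes nine items with four response categories each (scored 0 to 3), a standard normal prior, and the conventional T-score transformation of 50 + 10θ.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Graded response model: probability of each ordered response category."""
    inner = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum = [1.0] + inner + [0.0]  # P(X >= k), bracketed by 1 and 0
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def eap_score(responses, items, n_points=121, lo=-6.0, hi=6.0):
    """Posterior mean and posterior SD of theta on an evenly spaced grid."""
    step = (hi - lo) / (n_points - 1)
    grid = [lo + i * step for i in range(n_points)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for resp, (a, thresholds) in zip(responses, items):
            if resp is None:  # skipped item: contributes no information
                continue
            w *= grm_category_probs(theta, a, thresholds)[resp]
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


def t_score(theta):
    """Assumed T-score metric: mean 50, SD 10 in the reference population."""
    return 50.0 + 10.0 * theta


# Hypothetical (discrimination, thresholds) pairs. Not the published calibration.
HYPOTHETICAL_ITEMS = [
    (1.8, (0.2, 1.1, 1.9)),
    (2.2, (0.4, 1.2, 2.0)),
    (1.2, (-0.3, 0.9, 1.8)),
    (1.4, (-0.5, 0.7, 1.6)),
    (1.3, (0.3, 1.3, 2.1)),
    (2.0, (0.5, 1.3, 2.0)),
    (1.5, (0.6, 1.5, 2.3)),
    (1.4, (1.0, 1.9, 2.7)),
    (1.9, (1.4, 2.2, 3.0)),
]

responses = [1, 2, 1, 0, 1, 2, 0, 0, 0]  # one example response pattern
theta_hat, posterior_sd = eap_score(responses, HYPOTHETICAL_ITEMS)
print("theta:", round(theta_hat, 2), "SE:", round(posterior_sd, 2),
      "T-score:", round(t_score(theta_hat), 1))
```

The posterior standard deviation plays the role of the standard error of measurement for an individual score, so averaging it over respondents gives a summary comparable in kind to the mean values quoted above.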


The strategy described here facilitated migration from a fixed-format legacy scale to PROMIS CAT administration and may be useful in other settings.


Keywords: Calibration; Computerized adaptive testing; Depression; Item banks; Item response theory; PROMIS



CAT: Computerized adaptive testing
CFA: Confirmatory factor analysis
CFI: Comparative Fit Index
DIF: Differential item functioning
PHQ-9: Patient Health Questionnaire from the PRIME-MD depression measure
PRO: Patient-reported outcome
PROMIS: Patient-Reported Outcome Measurement Information System
RMSEA: Root mean square error of approximation
SD: Standard deviation
SEM: Standard error of measurement
TLI: Tucker–Lewis Index
UW: University of Washington
UAB: University of Alabama at Birmingham



This work was supported by National Institutes of Health grants U01 AR 057954, R01 MH 084759, P30 AI 27757, P30 AI 27767, R24 AI 067039, K23 MH 082641, and the Mary Fisher CARE Fund. The Patient-Reported Outcomes Measurement Information System (PROMIS) is an NIH Roadmap initiative to develop a computerized system measuring PROs in respondents with a wide range of chronic diseases and demographic characteristics. PROMIS II was funded by cooperative agreements with a Statistical Center (Northwestern University, PI: David F. Cella, PhD, 1U54AR057951), a Technology Center (Northwestern University, PI: Richard C. Gershon, PhD, 1U54AR057943), a Network Center (American Institutes for Research, PI: Susan (San) D. Keller, PhD, 1U54AR057926) and thirteen Primary Research Sites (State University of New York, Stony Brook, PIs: Joan E. Broderick, PhD and Arthur A. Stone, PhD, 1U01AR057948; University of Washington, Seattle, PIs: Heidi M. Crane, MD, MPH, Paul K. Crane, MD, MPH, and Donald L. Patrick, PhD, 1U01AR057954; University of Washington, Seattle, PIs: Dagmar Amtmann, PhD, and Karon Cook, PhD, 1U01AR052171; University of North Carolina, Chapel Hill, PI: Darren A. DeWalt, MD, MPH, 2U01AR052181; Children’s Hospital of Philadelphia, PI: Christopher B. Forrest, MD, PhD, 1U01AR057956; Stanford University, PI: James F. Fries, MD, 2U01AR052158; Boston University, PIs: Stephen M. Haley, PhD, and David Scott Tulsky, PhD, 1U01AR057929; University of California, Los Angeles, PIs: Dinesh Khanna, MD, and Brennan Spiegel, MD, MSHS, 1U01AR057936; University of Pittsburgh, PI: Paul A. Pilkonis, PhD, 2U01AR052155; Georgetown University, Washington DC, PIs: Carol M. Moinpour, PhD, and Arnold L. Potosky, PhD, U01AR057971; Children’s Hospital Medical Center, Cincinnati, PI: Esi M. Morgan Dewitt, MD, 1U01AR057940; University of Maryland, Baltimore, PI: Lisa M. Shulman, MD, 1U01AR057967; and Duke University, PI: Kevin P. Weinfurt, PhD, 2U01AR052186).
NIH Science Officers on this project have included Deborah Ader, PhD, Vanessa Ameen, MD, Susan Czajkowski, PhD, Basil Eldadah, MD, PhD, Lawrence Fine, MD, DrPH, Lawrence Fox, MD, PhD, Lynne Haverkos, MD, MPH, Thomas Hilton, PhD, Laura Lee Johnson, PhD, Michael Kozak, PhD, Peter Lyster, PhD, Donald Mattison, MD, Claudia Moy, PhD, Louis Quatrano, PhD, Bryce Reeve, PhD, William Riley, PhD, Ashley Wilder Smith, PhD, MPH, Susana Serrate-Sztein, MD, Ellen Werner, PhD, and James Witter, MD, PhD. This manuscript was reviewed by PROMIS reviewers before submission for external peer review. See the PROMIS Web site for additional information on the PROMIS initiative.

Supplementary material

Supplementary material 1: 11136_2011_9882_MOESM1_ESM.doc (DOC, 272 kb)



Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  • Laura E. Gibbons (1)
  • Betsy J. Feldman (2)
  • Heidi M. Crane (2)
  • Michael Mugavero (3)
  • James H. Willig (4)
  • Donald Patrick (5)
  • Joseph Schumacher (6)
  • Michael Saag (7)
  • Mari M. Kitahata (2)
  • Paul K. Crane (1)

  1. General Internal Medicine, University of Washington, Seattle, USA
  2. Allergy and Infectious Diseases, University of Washington, Seattle, USA
  3. Department of Medicine, Division of Infectious Disease, University of Alabama at Birmingham, Birmingham, USA
  4. Department of Medicine, Division of Infectious Disease, University of Alabama at Birmingham, Birmingham, USA
  5. Department of Health Services, University of Washington, Seattle, USA
  6. Division of Preventive Medicine, School of Medicine, University of Alabama at Birmingham, Birmingham, USA
  7. Center for AIDS Research, University of Alabama at Birmingham, Birmingham, USA
