Efficiency of static and computer adaptive short forms compared to full-length measures of depressive symptoms
Short-form patient-reported outcome measures are popular because they minimize patient burden. We assessed the efficiency of static short forms and computer adaptive testing (CAT) using data from the Patient-Reported Outcomes Measurement Information System (PROMIS) project.
We evaluated the 28-item PROMIS depressive symptoms bank. We used post hoc simulations based on the PROMIS calibration sample to compare several short-form selection strategies and the PROMIS CAT to the total item bank score.
Scores from all short forms and the CAT correlated highly with full-bank scores, but the CAT outperformed each static short form on almost all criteria. Even so, the best short-form selection strategies performed only marginally worse than the CAT, and the remaining performance gap of static forms was reduced further by a two-stage branching test format.
For a calibrated unidimensional bank of polytomous items measuring depressive symptoms, CAT was only marginally more efficient than well-chosen static short forms. The efficiency of a two-stage semi-adaptive testing strategy came so close to that of CAT that it warrants further consideration and study.
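The core of the CAT simulated here is maximum-information item selection: at each step, administer the remaining item with the greatest Fisher information at the current ability estimate, then update the estimate from the accumulated responses. The sketch below illustrates this loop under simplifying assumptions: dichotomous 2PL items stand in for the polytomous graded-response items of the PROMIS bank, the ability estimate is a grid-based EAP with a standard-normal prior, and the 12-item bank parameters are hypothetical, not PROMIS calibrations.

```python
import math
import random

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (a = discrimination, b = difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_info(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

def eap_estimate(responses, grid):
    """Expected a posteriori ability estimate with a standard-normal prior."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior density
        for (a, b), x in responses:
            p = p_endorse(theta, a, b)
            w *= p if x else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total

def run_cat(true_theta, bank, test_length, rng):
    """Administer a fixed-length CAT with maximum-information item selection."""
    grid = [g / 10.0 for g in range(-40, 41)]
    responses, remaining, theta_hat = [], list(bank), 0.0
    for _ in range(test_length):
        # pick the unused item most informative at the current estimate
        item = max(remaining, key=lambda it: item_info(theta_hat, *it))
        remaining.remove(item)
        x = rng.random() < p_endorse(true_theta, *item)  # simulate a response
        responses.append((item, x))
        theta_hat = eap_estimate(responses, grid)
    return theta_hat

# Hypothetical 12-item bank: (discrimination, difficulty) pairs
bank = [(1.0 + 0.1 * i, -2.0 + 0.35 * i) for i in range(12)]
estimate = run_cat(true_theta=1.0, bank=bank, test_length=8, rng=random.Random(0))
```

A static short form corresponds to fixing the item sequence in advance; a two-stage branching form administers a common first block, then routes to one of a few second blocks based on the interim estimate, approximating the adaptive loop above with only one branch point.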
Keywords: Computer adaptive testing, PROMIS, Item response theory, Short form, Two-stage testing