A comparison of computer adaptive tests (CATs) and short forms in terms of accuracy and number of items administrated using PROMIS profile
In the Patient-Reported Outcomes Measurement Information System (PROMIS), seven domains (Physical Function, Anxiety, Depression, Fatigue, Sleep Disturbance, Social Function, and Pain Interference) are packaged together as profiles. Each of these domains can also be assessed using computer adaptive tests (CATs) or short forms (SFs) of varying length (e.g., 4, 6, and 8 items). We compared the accuracy and the number of items administered for CATs versus each SF.
PROMIS instruments are scored using item response theory (IRT) with the graded response model, and scores are reported as T scores (mean = 50, SD = 10). For each domain, we simulated 10,000 subjects from a normal distribution with a standard deviation of 10 and a mean of 60 for symptom scales and 40 for function scales. We considered a subject’s score to be accurate when its standard error (SE) was less than 3.0. We recorded the range of accurate scores (the accurate range) and the number of items administered.
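The accuracy criterion above can be illustrated with a minimal Python sketch. It is not the authors' simulation: the six items in `HYPOTHETICAL_SF6` and their parameters are invented for illustration and are not PROMIS item parameters. The sketch also skips generating item responses and estimating scores. Instead it assumes the asymptotic standard error, 10 / sqrt(test information), evaluated at each simulated subject's true score, with T = 50 + 10 × theta.

```python
import math
import random


def grm_item_information(theta, a, thresholds):
    """Fisher information of one graded-response-model item at latent score theta."""
    # Cumulative probabilities P(X >= k), padded with 1 (lowest) and 0 (highest).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    info = 0.0
    for upper, lower in zip(cum[:-1], cum[1:]):
        prob = upper - lower  # probability of this response category
        slope = a * (upper * (1 - upper) - lower * (1 - lower))  # d(prob)/d(theta)
        if prob > 1e-12:  # skip categories with numerically zero probability
            info += slope ** 2 / prob
    return info


def standard_error_t(theta, items):
    """Asymptotic SE on the T-score metric (assumption: no prior information)."""
    total = sum(grm_item_information(theta, a, b) for a, b in items)
    return 10.0 / math.sqrt(total)


# Hypothetical symptom-domain short form: (discrimination, ordered thresholds).
# These values are made up for the example; they are not PROMIS parameters.
HYPOTHETICAL_SF6 = [
    (3.0, [-0.4, 0.3, 1.0, 1.8]),
    (3.3, [-0.1, 0.6, 1.3, 2.1]),
    (2.8, [0.2, 0.9, 1.6, 2.4]),
    (3.5, [-0.7, 0.0, 0.8, 1.5]),
    (3.1, [0.0, 0.7, 1.4, 2.2]),
    (2.9, [-0.3, 0.5, 1.2, 2.0]),
]

rng = random.Random(2019)
# Symptom scale: true T scores drawn from a normal distribution, mean 60, SD 10.
true_t = [rng.gauss(60, 10) for _ in range(10_000)]

# A score counts as accurate when its SE on the T metric is below 3.0.
accurate_t = [
    t for t in true_t
    if standard_error_t((t - 50) / 10, HYPOTHETICAL_SF6) < 3.0
]

if accurate_t:
    print(f"lowest and highest accurate T score: "
          f"{min(accurate_t):.1f}, {max(accurate_t):.1f}")
print(f"share of subjects scored accurately: {len(accurate_t) / len(true_t):.1%}")
```

Because a fixed short form always administers the same items, its SE depends only on the subject's score, so the accurate range reflects where the form's thresholds are concentrated. A CAT selects items for each subject, which is how it can keep the SE below 3.0 over a wider range of scores.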
The average number of items administered in CAT was 4.7 across all domains. In each domain, the accurate range was wider for CAT than for any SF. CAT was notably better at extending the accurate range into very poor health for Fatigue, Physical Function, and Pain Interference. Most SFs provided a reasonably wide accurate range.
Relative to SFs, CATs provided the widest accurate range, using slightly more items than SF4 and fewer than SF6 and SF8. Most SFs, especially the longer ones, provided a reasonably wide accurate range.
Keywords: Computer adaptive testing (CAT), Short form, PROMIS, Item response theory
This study was funded by the National Institutes of Health (U2CCA186878, Recipient David Cella).
Compliance with ethical standards
Conflict of interest
Dr. Cella is an unpaid board member of the PROMIS Health Organization (PHO). He declares no other conflict of interest. Eisuke Segawa declares that he has no conflict of interest. Benjamin David Schalet declares that he has no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.