Grooming a CAT: customizing CAT administration rules to increase response efficiency in specific research and clinical settings
To evaluate the degree to which applying alternative stopping rules would reduce response burden while maintaining score precision in the context of computer adaptive testing (CAT).
Analyses were conducted on secondary data consisting of CATs administered in a clinical setting at multiple time points (baseline and up to two follow-ups) to 417 study participants who had back pain (51.3%) and/or depression (47.0%). Participant mean age was 51.3 years (SD = 17.2) and ranged from 18 to 86. Participants tended to be white (84.7%), relatively well educated (77% with at least some college), female (63.9%), and married or living in a committed relationship (57.4%). The unit of analysis was the individual assessment history (i.e., CAT item response history) from the parent study. Data were first aggregated across all individuals, domains, and time points into an omnibus dataset of assessment histories and then disaggregated by measure for domain-specific analyses. Finally, assessment histories within a “clinically relevant range” (score ≥ 1 SD from the mean in the direction of poorer health) were analyzed separately to explore score level-specific findings.
Two different sets of CAT administration rules were compared. The original CAT (CATORIG) rules required that at least four and no more than 12 items be administered. If the score standard error (SE) reached a value < 3 points (T score metric) before 12 items were administered, the CAT was stopped. We simulated applying alternative stopping rules (CATALT), which removed the requirement that a minimum of four items be administered and stopped a CAT if responses to the first two items were both associated with best health, if the SE was < 3, if the SE change was < 0.1 (T score metric), or if 12 items were administered. We then compared score fidelity and response burden, defined as the number of items administered, between CATORIG and CATALT.
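The two rule sets described above can be sketched as functions that walk a recorded assessment history and report how many items would have been administered. This is a minimal illustration, not the study's code. The `Step` record, the function names, and the reading of "SE change" as the absolute difference between SEs after consecutive items are all assumptions made for the example, as is the idea that CATALT can be simulated by truncating a recorded history.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One administered item in a recorded history (hypothetical structure)."""
    best_health: bool  # response fell in the best-health category
    se: float          # score SE (T score metric) after this item


def items_under_orig(history: List[Step], min_items: int = 4,
                     max_items: int = 12, se_cut: float = 3.0) -> int:
    """Items administered under the original rules (CATORIG)."""
    for n, step in enumerate(history, start=1):
        # Stop at the item cap, or once the minimum is met and SE < cutoff.
        if n >= max_items or (n >= min_items and step.se < se_cut):
            return n
    return len(history)


def items_under_alt(history: List[Step], max_items: int = 12,
                    se_cut: float = 3.0, se_change_cut: float = 0.1) -> int:
    """Items administered under the alternative rules (CATALT)."""
    for n, step in enumerate(history, start=1):
        # Rule 1: first two responses both indicate best health.
        if n == 2 and history[0].best_health and history[1].best_health:
            return n
        # Rule 2: SE below the precision cutoff (no minimum item count).
        if step.se < se_cut:
            return n
        # Rule 3 (assumed reading): SE changed by < 0.1 since the prior item.
        if n >= 2 and abs(history[n - 2].se - step.se) < se_change_cut:
            return n
        # Rule 4: item cap reached.
        if n >= max_items:
            return n
    return len(history)


def percent_savings(orig_items: int, alt_items: int) -> float:
    """Reduction in response burden, as a percentage of CATORIG items."""
    return 100.0 * (orig_items - alt_items) / orig_items


# Illustrative histories only; these values are not study data.
floor_case = [Step(True, 5.0 - 0.15 * i) for i in range(12)]
typical_case = [Step(False, se) for se in (6.0, 4.5, 3.8, 3.2, 2.9)]
```

In the made-up `floor_case`, a respondent endorses best health on every item and the SE never drops below 3, so the original rules run to the 12-item cap while the alternative rules stop after two items. In `typical_case`, both rule sets stop at the same item because the SE cutoff is the first criterion met.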
CATORIG and CATALT scores differed little, especially within the clinically relevant range, and response burden was substantially lower under CATALT (e.g., 41.2% savings in the omnibus dataset).
Alternative stopping rules result in substantial reductions in response burden with minimal sacrifice in score precision.
Keywords: Computer adaptive testing; CAT stopping rules; Response burden; PROMIS®
Funding was provided by the U.S. Army.