Item selection, scaling and construct validation of the Patient-Reported Inventory of Self-Management of Chronic Conditions (PRISM-CC) measurement tool in adults

Purpose To select and scale items for the seven domains of the Patient-Reported Inventory of Self-Management of Chronic Conditions (PRISM-CC) and assess its construct validity.
Methods Using an online survey, data on 100 potential items, and on other variables for assessing construct validity, were collected from 1055 adults with one or more chronic health conditions. Based on a validated conceptual model, confirmatory factor analysis (CFA) and item response theory (IRT) models were used to select and scale potential items and to assess the internal consistency and structural validity of the PRISM-CC. To further assess construct validity, hypothesis testing of known relationships was conducted using structural equation models.
Results Of 100 potential items, 36 (4–8 per domain) were selected, providing excellent fit to our hypothesized correlated factors model and demonstrating the internal consistency and structural validity of the PRISM-CC. Hypothesized associations between PRISM-CC domains and other measures and variables were confirmed, providing further evidence of construct validity.
Conclusion The PRISM-CC overcomes limitations of assessment tools currently available to measure patient self-management of chronic health conditions. This study provides strong evidence for the internal consistency and construct validity of the PRISM-CC as an instrument to assess patient-reported difficulty in self-managing different aspects of daily life with one or more chronic conditions. Further research is needed to assess its measurement equivalence across patient attributes, its ability to measure clinically important change, and its utility to inform self-management support.
Supplementary Information The online version contains supplementary material available at 10.1007/s11136-022-03165-4.

Item Selection and Model Refinement
Even the first iteration of the models provided evidence of fit to the conceptual model, and only a few items performed poorly. For example, the first run of models with all items yielded Cronbach's alphas for the seven domains ranging from 0.86 to 0.93, and SRMRs ranging from 0.054 to 0.093. Accordingly, the item selection process focused on model refinement to identify the best-performing set of items, providing a parsimonious measurement instrument that captures the full range of each PRISM-CC domain and differentiates between domains. Item selection and model refinement proceeded through six iterations. No more than a few items were excluded at each iteration, and models were then refitted before further decisions were made. At later iterations, items discarded at earlier iterations were reconsidered by adding them back into more refined versions of the models to see how they performed.
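For concreteness, the following minimal Python sketch shows the Cronbach's alpha calculation as it might be applied per domain. The function, the simulated 5-point responses, and all variable names are illustrative assumptions, not the study's analysis code.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Classical internal-consistency estimate for a set of domain items."""
    k = items.shape[1]                         # number of items in the domain
    item_vars = items.var(axis=0, ddof=1)      # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed score
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Fake correlated 5-point responses for demonstration only.
rng = np.random.default_rng(0)
theta = rng.normal(size=(200, 1))
demo = np.clip(np.rint(3 + theta + rng.normal(scale=0.8, size=(200, 6))), 1, 5)
print(f"alpha = {cronbach_alpha(demo):.2f}")
```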
Decisions to remove items were based on consideration of multiple statistical criteria and reassessment of face and content validity by our interdisciplinary team. Cognitive interview data and qualitative analyses, which informed item development, were often used to guide decisions. Statistical information informing item selection decisions included:
1. Low item variance: In the first iteration we excluded four items that had low variance (including due to ceiling or floor effects) or that were very weakly correlated with other items in the domain.
2. High rates of item non-response or "not applicable" responses: We excluded items where a high rate of missing or "not applicable" responses likely resulted from circumstances not salient to many respondents. For example, the potential resource domain item "When I need to, I access supports and resources to help deal with my health condition(s) at work" pertains to work environments and thus is not applicable to many persons not in the workforce, and the potential activity domain item "I use tools/aids/equipment to make everyday activities easier" was not applicable to many respondents who did not perceive that they used tools/aids/equipment. We excluded five items in the resource domain and one item in the activity domain based on this criterion.
3. Weak standardized factor loadings (<0.6) or weak discrimination parameters in the IRT models (<1.35): Supported by other conceptual and statistical criteria (especially #4 below), this criterion contributed to about a third of the item exclusions.
4. IRT item information and response: The performance of each item was assessed through examination of threshold parameters, item category response function plots (plots of the estimated probability of choosing each response to each item by level of theta), and item information function plots; a sketch of these computations follows this list. Specifically, we considered:
a. Whether thresholds and response curves for each item showed that the latent variable (i.e., theta) was associated with the probability of selecting each sequential ordinal response category, and that each response category discriminated between levels of the latent variable. All items that met criteria #1–#3 also satisfied this criterion.
b. The extent to which thresholds, response curves, and information for each item showed that the ordinal response categories measured a broad spectrum of the latent variable. All items satisfied this criterion. The difficulty response scale, which was used for most items, was particularly strong on this criterion.
5. Evidence of differential item functioning (DIF) by sociodemographic variables: At later iterations, analysis of differential item functioning was used to inform item selection decisions. Specifically, we used CFA models to test for differences in factor loadings and thresholds by age group (18–30, 31–60, 61+), gender identification (male, female, other), and education (high school diploma or less, post-secondary trade or bachelor's degree, and graduate degree). Only one item was eliminated based on DIF (the social domain item "I make good choices about the time I spend with others" showed differential item discrimination by gender).
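To make criteria #3 and #4 concrete, the sketch referenced in criterion #4 appears below: it computes category response functions and item information under a graded response model. This is a minimal numpy illustration; the discrimination value, thresholds, and function names are invented for demonstration and are not the PRISM-CC estimates.

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    """P(X = k | theta) for an item with discrimination `a` and ordered
    thresholds b_1 < ... < b_{K-1} (cumulative-logit parameterization)."""
    theta = np.asarray(theta)[:, None]
    b = np.asarray(thresholds)[None, :]
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))     # P(X >= k)
    upper = np.hstack([np.ones((len(theta), 1)), p_star])
    lower = np.hstack([p_star, np.zeros((len(theta), 1))])
    return upper - lower                                # P(X = k), k = 0..K-1

def grm_item_information(theta, a, thresholds):
    """Fisher information contributed by the item at each theta."""
    theta = np.asarray(theta)[:, None]
    b = np.asarray(thresholds)[None, :]
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    q = np.hstack([np.ones((len(theta), 1)), p_star])   # P(X >= 0) = 1 included
    r = np.hstack([p_star, np.zeros((len(theta), 1))])  # P(X >= K) = 0 included
    d_cat = a * (q * (1 - q) - r * (1 - r))             # dP_k / dtheta
    probs = q - r                                       # category probabilities
    return (d_cat ** 2 / probs).sum(axis=1)

# Curves like those used in criterion #4, with made-up parameters.
theta_grid = np.linspace(-3, 3, 121)
probs = grm_category_probs(theta_grid, a=1.8, thresholds=[-1.2, 0.0, 1.1])
info = grm_item_information(theta_grid, a=1.8, thresholds=[-1.2, 0.0, 1.1])
```

Plotting `probs` against `theta_grid` gives the item category response functions, and `info` gives the item information function; items with discrimination below the 1.35 cutoff would show flat, poorly separated curves.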

Sensitivity Analyses for Potentially Careless Responses
Careless or non-reflective responses in online surveys can result in underestimation of model fit and in data that are missing not at random (MNAR).(1,2) Of the 1,055 respondents included in our analysis, 5% spent, on average, less than 3.6 seconds per question, giving plausibility to this concern. Accordingly, sensitivity analyses were conducted to assess the potential impact of careless responses on the model fit of each PRISM-CC domain.
We used a procedure proposed by Hong and Cheng.(3) Two person-fit statistics were estimated to measure the plausibility of each respondent's responses given the IRT graded response model.(4) The first, Gpoly, is based on the number of polytomous Guttman errors for each respondent.(5) Respondents were classified as "non-reflective" if they were in the top 5% of Gpoly values. The second, lzpoly, is the standardized log-likelihood of the respondent's response vector, which is expected to be asymptotically normally distributed.
Low values indicate poor person fit, and we classified non-reflective respondents as those with the 5% lowest values. The 5% cut-off is somewhat arbitrary and does not differentiate the magnitude of potential non-reflectiveness. Accordingly, we also treated lzpoly as a continuous indicator of non-reflectiveness, which was used to weight the contribution of each respondent.
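The following sketch illustrates one way to compute both person-fit statistics from the definitions above. It assumes complete ordinal response data and externally supplied graded-response-model category probabilities; the function names and implementation details are our assumptions, not Hong and Cheng's code.

```python
import numpy as np

def gpoly(X: np.ndarray) -> np.ndarray:
    """Count polytomous Guttman errors per respondent.
    X: (n, m) ordinal responses coded 0..K-1, no missing values."""
    K = X.max() + 1
    # Decompose each item into item steps: indicator of reaching category >= k.
    steps = np.concatenate([(X >= k).astype(int) for k in range(1, K)], axis=1)
    steps = steps[:, np.argsort(-steps.mean(axis=0))]  # most popular steps first
    # Error: a more popular (earlier) step failed while a later step is passed.
    fails_so_far = np.cumsum(1 - steps, axis=1)
    return (steps[:, 1:] * fails_so_far[:, :-1]).sum(axis=1)

def lz_poly(X: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Standardized log-likelihood person fit.
    P: (n, m, K) model category probabilities for each respondent and item,
    evaluated at that respondent's estimated theta."""
    logP = np.log(P)
    observed = np.take_along_axis(logP, X[:, :, None], axis=2)[:, :, 0]
    l0 = observed.sum(axis=1)                        # observed log-likelihood
    mean = (P * logP).sum(axis=2)                    # E[log-lik] per item
    var = (P * logP ** 2).sum(axis=2) - mean ** 2    # Var[log-lik] per item
    return (l0 - mean.sum(axis=1)) / np.sqrt(var.sum(axis=1))
```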
For sensitivity analyses, we assessed the improvement in global model fit indices and standardized factor loadings for each of the seven PRISM-CC domains after (1) dropping subjects who were classified as non-reflective respondents, and (2) weighting the data for each respondent according to the inverse of their normalized person fit, based on the lzpoly person fit statistic. Substantial improvements in model fit would indicate high potential impact of careless responses.
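The weighting step could be constructed as below. Because "the inverse of their normalized person fit" admits more than one reading, this sketch assumes the intent is to down-weight respondents with poorer (lower) lzpoly values via min-max normalization; the normalization choice and function name are assumptions, not the study's specification.

```python
import numpy as np

def person_fit_weights(lz: np.ndarray) -> np.ndarray:
    """Map lz_poly values to (0, 1] weights: poorer person fit, smaller weight."""
    misfit = lz.max() - lz                          # larger = poorer person fit
    return 1.0 - misfit / (misfit.max() + 1e-6)     # normalized inverse misfit
```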
As shown in Table 1, non-reflective responses result in underestimation of the fit of the PRISM-CC. Dropping potentially non-reflective respondents increased standardized factor loadings by 0.05 or more, and substantially improved indices of model fit, including the RMSEA.