Deriving SF-12v2 physical and mental health summary scores: a comparison of different scoring algorithms
- Cite this article as:
- Fleishman, J.A., Selim, A.J. & Kazis, L.E. Qual Life Res (2010) 19: 231. doi:10.1007/s11136-009-9582-z
Summary scores for the SF-12, version 2 (SF-12v2) health status measure are based on scoring coefficients derived for version 1 of the SF-36, despite changes in item wording and response scales and even though the SF-12 scales contain only a subset of the SF-36 items. This study derives new summary scores directly from SF-12v2 data from a recent U.S. sample and compares them with the standard ones. Because the methods for developing scoring coefficients for the summary scores are controversial, we also compare summary scores produced by different methods.
We analyzed nationally representative U.S. data, which provided 53,399 observations for the SF-12v2 in 2003–2005. In addition to the standard SF-12v2 scoring algorithm, summary scores were generated using exploratory factor analysis (EFA), principal components analysis (PCA), and confirmatory factor analysis (CFA), with orthogonal and oblique rotation. We examined correlations among the different summary scores, their associations with demographic and clinical variables, and the consistency between changes in scale scores and changes in summary scores over time.
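To make one of these approaches concrete, the following is a minimal sketch of PCA with orthogonal (varimax) rotation applied to eight scale scores. It is not the study's code: the data are synthetic, the names `varimax` and `summary_scores` are hypothetical, and the rescaling to mean 50 and SD 10 within the sample is an assumption borrowed from the usual norm-based convention. Oblique rotation and CFA are not shown.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (p x k) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Standard SVD-based update for the varimax criterion.
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        grad = loadings.T @ (rotated ** 3 - rotated @ col_ss / p)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion != 0 and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation


def summary_scores(scales, n_components=2):
    """Two rotated component scores from an (n x 8) array of scale scores."""
    z = (scales - scales.mean(axis=0)) / scales.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Principal component loadings: eigenvectors scaled by sqrt(eigenvalue).
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    rotated = varimax(loadings)
    # Scoring coefficients: solve corr @ weights = rotated loadings.
    weights = np.linalg.solve(corr, rotated)
    raw = z @ weights
    # Assumed convention: rescale to mean 50, SD 10 within this sample.
    scores = 50 + 10 * (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
    return scores, weights


# Synthetic data (NOT the study data): two correlated latent dimensions
# driving eight scales through a made-up loading pattern.
rng = np.random.default_rng(0)
n = 2000
physical = rng.normal(size=n)
mental = 0.4 * physical + rng.normal(size=n)
pattern = np.array([[0.8, 0.1], [0.8, 0.1], [0.7, 0.2], [0.6, 0.3],
                    [0.3, 0.6], [0.2, 0.7], [0.1, 0.8], [0.1, 0.8]])
latent = np.column_stack([physical, mental])
scales = latent @ pattern.T + rng.normal(scale=0.5, size=(n, 8))

scores, weights = summary_scores(scales)
print(np.corrcoef(scores, rowvar=False))
print(weights)
```

In this sketch the two summary scores are uncorrelated by construction, even though the simulated latent dimensions are correlated. That constraint is what distinguishes orthogonal from oblique solutions.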
The eight scale means in the current data were similar to the 1998 SF-12v2 means, with the exception of the vitality scale. Correlations among the scales based on SF-12v2 data differed slightly from those based on SF-36 data. Correlations among summary scores derived using different methods were high (≥0.84). However, changes in summary scores derived using orthogonal rotation of components or factors were not consistent with changes in the sub-scales, whereas changes in summary scores derived using oblique rotation were more consistent with patterns of change in the sub-scales.
Although the basic structure of the SF-12 is stable, summary scores derived from oblique rotation are preferable and more consistent with changes in individual scales. On empirical and conceptual grounds, we suggest using summary scores based on oblique CFA.