Interpreting SF-36 summary health measures: A response
- Cite this article as:
- Ware, J.E. & Kosinski, M. Qual Life Res (2001) 10: 405. doi:10.1023/A:1012588218728
In response to questions raised about the “accuracy” of SF-36 physical (PCS) and mental (MCS) component summary scores, particularly extremely high and low scores, we briefly comment on: how they were developed, how they are scored, the factor content of the eight SF-36 subscales, cross-tabulations between item-level responses and extreme summary scores, and published and new tests of their empirical validity.
Published cross-tabulations between SF-36 items and PCS and MCS scores, reanalyses of public datasets (N = 5919), and preliminary results from the Medicare Health Outcomes Survey (HOS) (N = 172,314) yielded little or no evidence in support of Taft's hypothesis that extreme scores are an invalid artifact of some negative scoring weights. For example, in the HOS, those (N = 432) with “unexpected” PCS scores worse than 20 (which, according to Taft, indicate better mental health rather than worse physical health) were about 25% more likely to die within two years, in comparison with those scoring in the next highest (21–30) category. In this test and in all other empirical tests, results of predictions supported the validity of extreme PCS and MCS scores.
We recommend against the interpretation of average differences smaller than one point in studies that seek to detect “false” measurement, and we repeat our 7-year-old recommendation that results based on summary measures should be thoroughly compared with the SF-36 profile before drawing conclusions. To facilitate such comparisons, scoring utilities and user-friendly graphs for SF-36 profiles and physical and mental summary scores (both orthogonal and oblique scoring algorithms) have been made available on the Internet at www.sf-36.com/test.