Abstract
Purpose
This study compares three retrospective frequency response formats for the Healthy Days Symptoms Module (HDSM). Responses are compared in terms of intra-individual consistency, psychometric value, and participant feedback about each format.
Methods
Respondents each completed three versions of the HDSM, in which items were framed to elicit an open-ended frequency, a fixed-choice frequency, or a vague-quantifier response. Traditional reliability statistics were used to evaluate intra-individual consistency. Differential item functioning (DIF) analyses tested for response format effects, and item response theory (IRT) scale scores and standard errors were computed for each form to compare psychometric value. Linear mixed models examined the associations of IRT scale scores across response formats with respondent characteristics.
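As a rough illustration of the IRT scoring step, the sketch below computes expected a posteriori (EAP) scale scores and their standard errors under a graded response model. The item parameters, quadrature grid, and three-item form are invented for illustration; they are not the study's estimates.

```python
import math

def grm_category_probs(theta, a, bs):
    """Samejima graded-response category probabilities for one item.

    a: discrimination; bs: ordered category boundaries (len = n_cats - 1).
    """
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in bs] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(bs) + 1)]

def eap_score(responses, items, grid=None):
    """EAP scale score and posterior SD (used as the score's standard error).

    responses: observed category index per item; items: (a, bs) pairs.
    Uses a standard normal prior evaluated on a fixed quadrature grid.
    """
    if grid is None:
        grid = [-4.0 + 8.0 * i / 80 for i in range(81)]
    post = []
    for t in grid:
        weight = math.exp(-0.5 * t * t)  # N(0, 1) prior, up to a constant
        like = 1.0
        for resp, (a, bs) in zip(responses, items):
            like *= grm_category_probs(t, a, bs)[resp]
        post.append(weight * like)
    z = sum(post)
    mean = sum(t * p for t, p in zip(grid, post)) / z
    var = sum((t - mean) ** 2 * p for t, p in zip(grid, post)) / z
    return mean, math.sqrt(var)

# Hypothetical 3-item form with four ordered categories per item
items = [(1.5, [-1.0, 0.0, 1.0])] * 3
low_score, low_se = eap_score([0, 0, 0], items)
high_score, high_se = eap_score([3, 3, 3], items)
```

Comparing posterior standard deviations from forms scored this way is one simple way to contrast measurement precision across response formats.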
Results
Respondents were largely consistent in how they reported their health regardless of the response format, and no DIF was detected between formats. The IRT scores computed from the two “# of days” frequency formats tended to have better measurement precision than those from vague quantifiers. Open-ended frequencies captured a greater span of individual differences among people reporting fewer symptoms; however, little measurement precision was lost in collapsing the frequencies into categories.
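The collapsing step can be pictured with a toy helper that bins open-ended 30-day counts into ordered categories. The cut-points and labels below are hypothetical, chosen only for illustration; they are not the HDSM's actual fixed-choice options.

```python
from bisect import bisect_right

# Hypothetical cut-points and labels -- not the HDSM's actual categories.
CUTPOINTS = [1, 6, 14, 30]   # bins: 0, 1-5, 6-13, 14-29, and 30 days
LABELS = ["0 days", "1-5 days", "6-13 days", "14-29 days", "30 days"]

def bin_days(days: int) -> str:
    """Collapse an open-ended 0-30 day count into an ordinal category."""
    if not 0 <= days <= 30:
        raise ValueError("counts on a 30-day recall must lie in 0..30")
    return LABELS[bisect_right(CUTPOINTS, days)]
```

Binning discards only within-bin distinctions (e.g., 15 versus 29 symptom days map to the same category), which is why collapsing frequencies into categories costs relatively little precision.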
Conclusions
Both the open-ended and fixed-choice frequency response formats offer more measurement precision than vague quantifiers. Although the open-ended format may capture more individual differences, respondents report more difficulty with exact frequency recall and thus prefer the fixed-choice format.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Cite this article
Magnus, B.E., Kirkman, M., Dutta, T. et al. To bin or not to bin? A comparison of symptom frequency response formats in the assessment of health-related quality of life. Qual Life Res 28, 841–853 (2019). https://doi.org/10.1007/s11136-018-2064-4