Abstract
Data harmonization is a topic of growing importance to demographers, who increasingly conduct domestic or international comparative research. Many self-reported survey items cannot be directly compared across demographic groups or countries because these groups differ in how they use subjective response categories. Anchoring vignettes, already appearing in numerous surveys worldwide, promise to overcome this problem. However, many anchoring vignettes have not been formally evaluated for adherence to the key measurement assumptions of vignette equivalence and response consistency. This article tests these assumptions in some of the most widely fielded anchoring vignettes in the world: the health vignettes in the World Health Organization (WHO) Study on Global AGEing and Adult Health (SAGE) and World Health Survey (WHS) (representing 10 countries; n = 52,388), as well as similar vignettes in the Health and Retirement Study (HRS) (n = 4,528). Findings are encouraging regarding adherence to response consistency, but reveal substantial violations of vignette equivalence both cross-nationally and across socioeconomic groups. That is, members of different sociocultural groups appear to interpret vignettes as depicting fundamentally different levels of health. The evaluated anchoring vignettes do not fulfill their promise of providing interpersonally comparable measures of health. Recommendations for improving future implementations of vignettes are discussed.
Notes
Although we focus on health (given the widespread use of anchoring vignettes in health surveys), similar issues arise whenever subjective self-ratings are used.
King and Wand’s nonparametric method contends with respondents misordering vignettes in a series, but treats such misorderings as “random measurement error,” not as fundamental violations of VE (2007:49).
In some surveys, two closely related evaluation questions (e.g., regarding “pain” and “discomfort”) followed each vignette. Here, we present one question from each pair, given that pairs yielded extremely similar ratings, and European surveys included only the first question.
To ensure sequential increases in cutpoints, exponential coding is used: that is, $\tau_i^1 = \gamma^1 X_i$ and $\tau_i^k = \tau_i^{k-1} + \exp(\gamma^k X_i)$, $k = 2, \ldots, K - 1$ (as in, e.g., van Soest and Vonkova 2014). Note that in the cutpoint parametrization, the covariate vector $X$ includes a constant term.
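As a minimal sketch of the exponential coding described in the note above, the following Python function builds one respondent's cutpoints from a covariate vector and a set of coefficient vectors. The function name, the covariate values, and the coefficients are all hypothetical and chosen only for illustration; they are not estimates from the article. Because each increment is an exponential, and therefore positive, the cutpoints come out strictly increasing whatever coefficients are supplied.

```python
import math


def cutpoints(x, gammas):
    """Return the K - 1 ordered cutpoints for one respondent.

    x      -- covariate vector whose first element is the constant term 1.0
    gammas -- list of K - 1 coefficient vectors, one per cutpoint
    """
    def dot(g, v):
        return sum(gj * vj for gj, vj in zip(g, v))

    # First cutpoint is linear in the covariates: tau^1 = gamma^1 . x
    taus = [dot(gammas[0], x)]
    # Each later cutpoint adds a positive increment:
    # tau^k = tau^(k-1) + exp(gamma^k . x)
    for g in gammas[1:]:
        taus.append(taus[-1] + math.exp(dot(g, x)))
    return taus


# Hypothetical inputs (assumptions for illustration only).
x = [1.0, 0.5]  # constant term plus one made-up covariate
gammas = [[-1.0, 0.2], [0.1, -0.3], [-0.5, 0.4], [0.0, 0.0]]
taus = cutpoints(x, gammas)
```

With five response categories (K = 5) there are four cutpoints, which is why four coefficient vectors are supplied here.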
Some refer to this as “chopit” (with “c” standing for “compound”; Rabe-Hesketh and Skrondal 2002); others use “chopit” only when multiple ratings of each vignette enable calculation of individual-level random effects. We do not calculate random effects, so we use “hopit” to avoid ambiguity.
van Soest and Vonkova (2014) present an extension of the hopit model allowing for unobserved heterogeneity, and recommend other model variants as well, which future researchers may wish to consider. We are confident that our main (parametric) conclusions regarding VE are not artifacts of modeling assumptions, however, since our entirely nonparametric weak tests support the same conclusions.
One-way analysis of variance (ANOVA) confirms that between-country variation in perceived vignette locations dwarfs within-country variance, in all WHO vignette series (p < .001).
To align and facilitate comparison of the two sets of bars, Model C units (standard deviation of the self-rating) were converted to Model A units (standard deviation of the reference vignettes), and a constant was added to Model C’s predicted cutpoints. Graphs reflect these conversions.
To this end, patterns in rank-order violations may have diagnostic utility. For example, in WHO self-care vignettes, 35.71% of respondents misordered Severities 3 versus 4, while fewer than 10% misordered all other adjacent vignette pairs. Vignettes 3 and 4 thus particularly invite further investigation and refinement.
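The kind of diagnostic described in the note above can be sketched in a few lines of Python. The sketch assumes that each respondent's ratings are listed in order of intended severity (mildest first), that higher numbers mean more reported difficulty, and that only strict reversals count as misorderings (ties are not counted). These conventions, the function name, and the toy ratings are assumptions for illustration, not the article's data or its exact procedure.

```python
def adjacent_misordering_rates(ratings):
    """Share of respondents who strictly reverse each adjacent vignette pair.

    ratings -- list of per-respondent lists, ordered by intended severity
               (mildest first); higher numbers mean more reported difficulty
    """
    n_pairs = len(ratings[0]) - 1
    rates = []
    for j in range(n_pairs):
        # A reversal: the milder vignette is rated as worse than the next one.
        reversed_count = sum(1 for r in ratings if r[j] > r[j + 1])
        rates.append(reversed_count / len(ratings))
    return rates


# Hypothetical ratings from four respondents on a five-vignette series.
toy = [
    [1, 2, 3, 4, 5],
    [1, 2, 4, 3, 5],
    [2, 2, 3, 3, 4],
    [1, 3, 5, 4, 5],
]
rates = adjacent_misordering_rates(toy)
```

In this toy example only the third pair (Severities 3 versus 4) is ever reversed, so its rate stands out from the others in the way the note describes.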
A full cost-benefit analysis of anchoring vignettes would consider both challenges of vignette development/assessment and subsequent challenges of analysis. At present, vignette analyses are often time-consuming to run, and typically adjust only dependent variables. Bago D’Uva et al. (2011a:641) reported that theirs was “only the second study” to use vignette-adjusted independent variables.
For example, both raw and vignette-adjusted self-ratings of distance vision yield this ranking of SAGE countries (best to worst; respondents aged 50+): China, Mexico, Russia, South Africa, Ghana, and India. An objective ranking, based on LogMAR vision tests, is Ghana, South Africa, China, Russia, India, and Mexico.
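One way to summarize how far the two rankings in the note above diverge is a Spearman rank correlation. The sketch below computes it directly from the two country orderings given in that note; the function name is hypothetical, and this summary statistic is an illustration added here, not a result reported in the article.

```python
# Country orderings as given in the note (best to worst).
subjective = ["China", "Mexico", "Russia", "South Africa", "Ghana", "India"]
objective = ["Ghana", "South Africa", "China", "Russia", "India", "Mexico"]


def spearman_rho(order_a, order_b):
    """Spearman rank correlation between two orderings of the same items.

    Uses the no-ties formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    where d is the difference between an item's ranks in the two orderings.
    """
    n = len(order_a)
    position_b = {item: i for i, item in enumerate(order_b)}
    d_squared = sum((i - position_b[item]) ** 2 for i, item in enumerate(order_a))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(subjective, objective)
```

For these six countries the squared rank differences sum to 42, giving a correlation of about -0.2: the self-rated ordering is, if anything, weakly inversely related to the LogMAR-based one.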
References
Angel, R. (2013). After Babel: Language and the fundamental challenges of comparative aging research. Journal of Cross-Cultural Gerontology, 28, 223–238.
Au, N., & Lorgelly, P. K. (2014). Anchoring vignettes for health comparisons: An analysis of response consistency. Quality of Life Research, 23, 1721–1731.
Bago D’Uva, T., Lindeboom, M., O’Donnell, O., & van Doorslaer, E. (2011a). Education-related inequity in healthcare with heterogeneous reporting of health. Journal of the Royal Statistical Society: Series A, 174, 639–664.
Bago D’Uva, T., Lindeboom, M., O’Donnell, O., & van Doorslaer, E. (2011b). Slipping anchor? Testing the vignettes approach to identification and correction of reporting heterogeneity. Journal of Human Resources, 46, 875–906.
Biss, E. (2005). The pain scale. Seneca Review, 35(1), 5–25.
Burgard, S. A., & Chen, P. V. (2014). Challenges of health measurement in studies of health disparities. Social Science & Medicine, 106, 143–150.
Corrado, L., & Weeks, M. (2010). Identification strategies in survey response using vignettes (Cambridge Working Papers in Economics No. 1031). Cambridge, UK: Cambridge University. Retrieved from http://ideas.repec.org/p/cam/camdae/1031.html
Dong, H., Campbell, C., Kurosu, S., Yang, W., & Lee, J. Z. (2015). New sources for comparative social science: Historical population panel data from East Asia. Demography, 52, 1061–1088.
Dowd, J. B., & Zajacova, A. (2007). Does the predictive power of self-rated health for subsequent mortality risk vary by socioeconomic status in the US? International Journal of Epidemiology, 36, 1214–1221.
Grol-Prokopczyk, H. (2014). Age and sex effects in anchoring vignette studies: Methodological and empirical contributions. Survey Research Methods, 8, 1–17.
Grol-Prokopczyk, H., Freese, J., & Hauser, R. M. (2011). Using anchoring vignettes to assess group differences in self-rated health. Journal of Health and Social Behavior, 52, 246–261.
Hanna, L. C., Hunt, S. M., & Bhopal, R. S. (2012). Using the Rose Angina Questionnaire cross-culturally: The importance of consulting lay people when translating epidemiological questionnaires. Ethnicity & Health, 17, 241–251.
Hopkins, D. J., & King, G. (2010). Improving anchoring vignettes: Designing surveys to correct interpersonal incomparability. Public Opinion Quarterly, 74, 201–222.
Hunt, S. M., & Bhopal, R. (2004). Self report in clinical and epidemiological studies with non-English speakers: The challenge of language and culture. Journal of Epidemiology and Community Health, 58, 618–622.
Iburg, K. M., Salomon, J. A., Tandon, A., & Murray, C. J. L. (2002). Cross-population comparability of physician-assessed and self-reported measures of health. In C. J. L. Murray, J. A. Salomon, C. D. Mathers, & A. D. Lopez (Eds.), Summary measures of population health: Concepts, ethics, measurement and applications (pp. 433–448). Geneva, Switzerland: World Health Organization.
Inglehart, R., & Welzel, C. (2005). Modernization, cultural change and democracy. New York, NY: Cambridge University Press.
Jürges, H. (2007). True health vs response styles: Exploring cross-country differences in self-reported health. Health Economics, 16, 163–178.
Jylhä, M., Guralnik, J. M., Ferrucci, L., Jokela, J., & Heikkinen, E. (1998). Is self-rated health comparable across cultures and genders? Journals of Gerontology, Series B: Psychological Sciences and Social Sciences, 53, S144–S152.
Kapteyn, A. (2010). What can we learn from (and about) global aging? Demography, 47(Suppl.), S191–S209.
King, G., Murray, C. J. L., Salomon, J. A., & Tandon, A. (2004). Enhancing the validity and cross-cultural comparability of survey research. American Political Science Review, 98, 191–207.
King, G., & Wand, J. (2007). Comparing incomparable survey responses: Evaluating and selecting anchoring vignettes. Political Analysis, 15, 46–66.
Kowal, P., Chatterji, S., Naidoo, N., Biritwum, R., Fan, W., Lopez Ridaura, R., . . . Boerma, J. T. (2012). Data resource profile: The World Health Organization Study on Global AGEing and Adult Health (SAGE). International Journal of Epidemiology, 41, 1639–1649.
Kristensen, N., & Johansson, E. (2008). New evidence on cross-country differences in job satisfaction using anchoring vignettes. Labour Economics, 15, 96–117.
Menec, V. H., Shooshtari, S., & Lambert, P. (2007). Ethnic differences in self-rated health among older adults: A cross-sectional and longitudinal analysis. Journal of Aging and Health, 19, 62–86.
Murray, C. J. L., Özaltin, E., Tandon, A., Salomon, J. A., Sadana, R., & Chatterji, S. (2003). Empirical evaluation of the anchoring vignette approach in health surveys. In C. J. L. Murray & D. B. Evans (Eds.), Health systems performance assessment: Debates, methods and empiricism (pp. 369–399). Geneva, Switzerland: World Health Organization.
Murray, C. J. L., Tandon, A., Salomon, J. A., Mathers, C. D., & Sadana, R. (2002). New approaches to enhance cross-population comparability of survey results. In C. J. L. Murray, J. A. Salomon, C. D. Mathers, & A. D. Lopez (Eds.), Summary measures of population health: Concepts, ethics, measurement and applications (pp. 421–431). Geneva, Switzerland: World Health Organization.
National Institute on Aging (NIA). (2012). Harmonization strategies for behavioral, social science, and genetic research (Workshop Summary Report). Retrieved from https://www.nia.nih.gov/sites/default/files/nia_bssg_harmonization_summary_version_2-5-20122.pdf
Pan, Y., & Fond, M. (2014). Evaluating multilingual questionnaires: A sociolinguistic perspective. Survey Research Methods, 8, 181–194.
Pasick, R. J., Stewart, S. L., Bird, J. A., & D’Onofrio, C. N. (2001). Quality of data in multiethnic health surveys. Public Health Reports, 116(Suppl. 1), 223–243.
Rabe-Hesketh, S., & Skrondal, A. (2002). Estimating chopit models in gllamm: Political efficacy example from King et al. Retrieved from http://www.gllamm.org/chopit.pdf
Rice, N., Robone, S., & Smith, P. (2011). Analysis of the validity of the vignette approach to correct for heterogeneity in reporting health system responsiveness. European Journal of Health Economics, 12, 141–162.
Ruggles, S. (2014). Big microdata for population research. Demography, 51, 287–297.
Sadana, R., Mathers, C. D., Lopez, A. D., Murray, C. J. L., & Moesgaard Iburg, K. (2002). Comparative analyses of more than 50 household surveys on health status. In C. J. L. Murray, J. A. Salomon, C. D. Mathers, & A. D. Lopez (Eds.), Summary measures of population health: Concepts, ethics, measurement and applications (pp. 369–386). Geneva, Switzerland: World Health Organization.
Schenker, N., Raghunathan, T. E., & Bondarenko, I. (2010). Improving on analyses of self-reported data in a large-scale health survey by using information from an examination-based survey. Statistics in Medicine, 29, 533–545.
Schiavenato, M., & Craig, K. D. (2010). Pain assessment as a social transaction: Beyond the “gold standard.” Clinical Journal of Pain, 26, 667–676.
Sen, A. (2002). Health: Perception versus observation. BMJ, 324, 860–861.
Shetterly, S. M., Baxter, J., Mason, L. D., & Hamman, R. F. (1996). Self-rated health among Hispanic vs non-Hispanic white adults: The San Luis Valley Health and Aging Study. American Journal of Public Health, 86, 1798–1801.
Skevington, S. M. (2002). Advancing cross-cultural research on quality of life: Observations drawn from the WHOQOL development. Quality of Life Research, 11, 135–144.
Smith, T. W. (2003). Developing comparable questions in cross-national surveys. In J. A. Harkness, F. J. R. van der Vijver, & P. P. Mohler (Eds.), Cross-cultural survey methods (pp. 69–91). Hoboken, NJ: John Wiley & Sons.
Tandon, A., Murray, C. J. L., Salomon, J. A., & King, G. (2003). Statistical models for enhancing cross-population comparability. In C. J. L. Murray & D. B. Evans (Eds.), Health systems performance assessment: Debates, methods and empiricism (pp. 727–741). Geneva, Switzerland: World Health Organization.
United Nations Development Programme (UNDP). (2008). Human development report 2007/2008. Retrieved from http://hdr.undp.org/en/reports/global/hdr2007-2008/
van Soest, A., Delaney, L., Harmon, C., Kapteyn, A., & Smith, J. P. (2011). Validating the use of anchoring vignettes for the correction of response scale differences in subjective questions. Journal of the Royal Statistical Society: Series A, 174, 575–595.
van Soest, A., & Vonkova, H. (2014). Testing the specification of parametric models by using anchoring vignettes. Journal of the Royal Statistical Society: Series A, 177, 115–133.
Zimmer, Z., Natividad, J., Lin, H.-S., & Chayovan, N. (2000). A cross-national examination of the determinants of self-assessed health. Journal of Health and Social Behavior, 41, 465–481.
Acknowledgments
This article uses data from the World Health Organization (WHO) Study on Global Ageing and Adult Health (SAGE), the WHO World Health Surveys (WHS), and the Health and Retirement Study (HRS). SAGE is supported by the National Institute on Aging (NIA) Division of Behavioral and Social Research through interagency agreements and research grants, and the WHO Department of Health Statistics and Information Systems. The HRS is sponsored by the NIA (grant number NIA U01AG009740) and is conducted by the University of Michigan. The first author’s work on this research was supported by a Robert Wood Johnson Foundation Health & Society Scholars Dissertation Grant, and by research assistantships through the Center for Demography of Health and Aging and the Center for Demography and Ecology (core grants P30 AG017266 and R24 HD047873) at the University of Wisconsin–Madison. We thank Joan Fujimura, Robert M. Hauser, Pamela Herd, Cameron Macdonald, Claire Wendland, and James R. Walker for helpful comments on earlier versions of this manuscript.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Online Resource 1
(DOCX 27.7 kb)
Online Resource 2
(DOCX 131 kb)
Online Resource 3
(DOCX 68.4 kb)
Online Resource 4
(ZIP 179 kb)
Cite this article
Grol-Prokopczyk, H., Verdes-Tennant, E., McEniry, M. et al. Promises and Pitfalls of Anchoring Vignettes in Health Survey Research. Demography 52, 1703–1728 (2015). https://doi.org/10.1007/s13524-015-0422-1
DOI: https://doi.org/10.1007/s13524-015-0422-1