Abstract
Comparative measures such as paired comparisons and rankings are frequently used to evaluate health states and quality of life. The present article introduces log-linear Bradley-Terry (LLBT) models to evaluate intervention effectiveness when outcomes are measured as paired comparisons or rankings and presents a combination of the LLBT model and model-based recursive partitioning (MOB) to detect treatment effect heterogeneity. The MOB LLBT approach enables researchers to identify subgroups that differ in their preference order and in the effect an intervention has on choice behavior. Applicability of MOB LLBT models is demonstrated using an artificial data example with a known data-generating mechanism and a real-world data example focusing on drug-harm perception among music festival visitors. In the artificial data example, the MOB LLBT model adequately recovers the “true” (population) model. In the real-world data example, the standard LLBT model confirms the existence of a situational willingness among festival visitors to trivialize drug harm when peer consumption behavior is made cognitively accessible. In addition, MOB LLBT results suggest that this trivialization effect is highly context-dependent and most pronounced for participants with low-to-moderate alcohol intoxication who also proactively contacted a substance counselor at the festival venue. Both data examples suggest that MOB LLBT models allow for more nuanced statements about the effectiveness of interventions. We provide R code examples to implement MOB LLBT models for paired comparisons, rankings, and rating (Likert-type) data.
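To make the central contrast concrete, the following is a minimal sketch, not the article's implementation (the article's own examples are in R). It fits a plain Bradley-Terry model separately to a control and an intervention group and compares the resulting object parameters, which is the kind of intervention effect an LLBT model is meant to capture. The win counts, the function name `fit_bradley_terry`, and the use of a simple minorization-maximization update instead of a Poisson log-linear fit are all assumptions made for illustration; the recursive partitioning step is not shown.

```python
import math

def fit_bradley_terry(wins, n_iter=500, tol=1e-10):
    """Estimate Bradley-Terry worth parameters (summing to 1).

    wins[j][k] is the number of times object j was preferred over object k.
    Uses a standard minorization-maximization update. Assumes every object
    is compared with the others and wins at least once.
    """
    n = len(wins)
    worth = [1.0 / n] * n
    for _ in range(n_iter):
        updated = []
        for j in range(n):
            total_wins = sum(wins[j][k] for k in range(n) if k != j)
            denom = sum(
                (wins[j][k] + wins[k][j]) / (worth[j] + worth[k])
                for k in range(n) if k != j
            )
            updated.append(total_wins / denom)
        scale = sum(updated)
        updated = [w / scale for w in updated]
        change = max(abs(a - b) for a, b in zip(updated, worth))
        worth = updated
        if change < tol:
            break
    return worth

def object_parameters(worth):
    """Convert worths to log-scale parameters, lambda_j = 0.5 * ln(worth_j),
    with the last object fixed at 0 as the reference."""
    ref = math.log(worth[-1])
    return [0.5 * (math.log(w) - ref) for w in worth]

# Hypothetical counts for three objects, 100 judgments per pair and group.
wins_control = [[0, 60, 70], [40, 0, 55], [30, 45, 0]]
wins_intervention = [[0, 45, 50], [55, 0, 50], [50, 50, 0]]

worth_control = fit_bradley_terry(wins_control)
worth_intervention = fit_bradley_terry(wins_intervention)

# Per-object change in the log-scale parameter under the intervention,
# relative to the reference object.
lambda_shift = [
    li - lc
    for li, lc in zip(object_parameters(worth_intervention),
                      object_parameters(worth_control))
]
print("control worths:     ", worth_control)
print("intervention worths:", worth_intervention)
print("parameter shift:    ", lambda_shift)
```

In this sketch a nonzero shift for an object indicates that the intervention changed its position in the preference order relative to the reference object. A tree-based extension would repeat this comparison within covariate-defined subgroups and test whether the parameters are stable across them.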
Acknowledgements
The authors are indebted to Dr. Regina Dittrich for valuable comments on an earlier draft of the article.
Ethics declarations
Ethical Approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. This article does not contain any studies with animals performed by any of the authors.
Informed Consent
Informed consent was obtained from all individual participants in the study.
Conflict of Interest
The authors declare that they have no conflict of interest.
Cite this article
Wiedermann, W., Frick, U. & Merkle, E.C. Detecting Heterogeneity of Intervention Effects in Comparative Judgments. Prev Sci 24, 444–454 (2023). https://doi.org/10.1007/s11121-021-01212-z
DOI: https://doi.org/10.1007/s11121-021-01212-z