A Recursive Partitioning Method for the Prediction of Preference Rankings Based Upon Kemeny Distances

D’Ambrosio, Antonio; Heiser, Willem J.

doi:10.1007/s11336-016-9505-1

Published: 01 July 2016

Volume 81, pages 774–794 (2016)

Psychometrika

Abstract

Preference rankings usually depend on the characteristics of both the individuals judging a set of objects and the objects being judged. This topic has been handled in the literature with log-linear representations of the generalized Bradley-Terry model and, recently, with distance-based tree models for rankings. A limitation of these approaches is that they only work with full rankings or with a pre-specified pattern governing the presence of ties, and/or they are based on quite strict distributional assumptions. To overcome these limitations, we propose a new prediction tree method for ranking data that is totally distribution-free. It combines Kemeny’s axiomatic approach to define a unique distance between rankings with the CART approach to find a stable prediction tree. Furthermore, our method is not limited by any particular design of the pattern of ties. The method is evaluated in an extensive full-factorial Monte Carlo study with a new simulation design.
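
To make the distance mentioned in the abstract concrete, the following is a minimal sketch of the Kemeny distance between two rankings that may contain ties. It assumes the usual Kemeny-Snell convention, in which each ranking is coded pairwise as +1 (first object preferred), -1 (second object preferred), or 0 (tied), and the distance sums the absolute differences over all unordered pairs of objects. The function name `kemeny_distance` and the rank-vector input format are illustrative choices, not taken from the article.

```python
from itertools import combinations


def kemeny_distance(r1, r2):
    """Kemeny distance between two rankings given as rank vectors.

    r1[k] and r2[k] are the ranks assigned to object k (lower = preferred).
    Equal ranks denote ties. This is an illustrative sketch, not the
    authors' implementation.
    """
    if len(r1) != len(r2):
        raise ValueError("rankings must cover the same set of objects")

    def sign(x):
        # Returns +1, 0, or -1 depending on the sign of x.
        return (x > 0) - (x < 0)

    total = 0
    for i, j in combinations(range(len(r1)), 2):
        a = sign(r1[j] - r1[i])  # +1 if object i is preferred to j in r1
        b = sign(r2[j] - r2[i])  # same pairwise coding for r2
        total += abs(a - b)      # 0 if agree, 1 if tie vs. strict, 2 if reversed
    return total


if __name__ == "__main__":
    print(kemeny_distance([1, 2, 3], [3, 2, 1]))  # fully reversed ranking
    print(kemeny_distance([1, 2, 3], [1, 1, 2]))  # one pair becomes tied
```

Under this coding, a pair on which the two rankings disagree completely contributes 2, while a pair that is tied in one ranking and strictly ordered in the other contributes 1, so tied and partial rankings are handled without any pre-specified pattern of ties.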

Acknowledgments

The authors would like to thank anonymous reviewers for their helpful comments, which have helped us to greatly improve the quality of this manuscript.

Author information

Authors and Affiliations

Department of Economics and Statistics, University of Naples Federico II, Via Cinthia, 80126, Naples, Italy
Antonio D’Ambrosio
Mathematical Institute and Institute of Psychology, Leiden University, Leiden, The Netherlands
Willem J. Heiser

Corresponding author

Correspondence to Antonio D’Ambrosio.

About this article

Cite this article

D’Ambrosio, A., Heiser, W.J. A Recursive Partitioning Method for the Prediction of Preference Rankings Based Upon Kemeny Distances. Psychometrika 81, 774–794 (2016). https://doi.org/10.1007/s11336-016-9505-1

Received: 30 September 2014
Revised: 16 December 2015
Published: 01 July 2016
Issue Date: September 2016
DOI: https://doi.org/10.1007/s11336-016-9505-1
