Regression trees for detecting preference patterns from rank data

Shih, Yu-Shan; Liu, Kuang-Hsun

doi:10.1007/s11634-018-0332-3

Regular Article
Published: 25 July 2018

Volume 13, pages 683–702, (2019)

Advances in Data Analysis and Classification

Abstract

A regression tree method for analyzing rank data is proposed. A key ingredient of the methodology is the conversion of ranks into scores by paired comparison. We then apply the GUIDE tree method to the score vectors to identify the preference patterns in the data. The method is free of selection bias, and simulation results show that it selects split variables well and, in some cases, predicts more accurately than the two other methods investigated. Furthermore, it is applicable to complex data that may contain incomplete ranks and missing covariate values. We demonstrate its usefulness in two real data studies.
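
To make the rank-to-score step concrete, the following is a minimal sketch of one plausible paired-comparison scoring. The abstract does not spell out the exact rule, so everything here is an assumption for illustration: the function name `ranks_to_scores` is hypothetical, each pair of ranked items is assumed to add +1 to the preferred item and -1 to the other, and pairs involving a tied or unranked item are assumed to contribute nothing. The article's actual treatment of incomplete ranks may differ.

```python
from itertools import combinations


def ranks_to_scores(ranks):
    """Convert one rank vector into a paired-comparison score vector.

    ranks[i] is the rank of item i (1 = most preferred); None marks an
    unranked item. Assumed rule (not taken from the article): each pair
    with two distinct ranks gives +1 to the preferred item and -1 to the
    other; ties and pairs with a missing rank contribute 0.
    """
    scores = [0] * len(ranks)
    for i, j in combinations(range(len(ranks)), 2):
        ri, rj = ranks[i], ranks[j]
        if ri is None or rj is None or ri == rj:
            continue  # uninformative pair under the assumed rule
        winner, loser = (i, j) if ri < rj else (j, i)
        scores[winner] += 1
        scores[loser] -= 1
    return scores


# Complete ranking of four items: item 1 is ranked first, item 2 last.
print(ranks_to_scores([2, 1, 4, 3]))        # [1, 3, -3, -1]
# Incomplete ranking: only items 0 and 2 are ranked.
print(ranks_to_scores([1, None, 2, None]))  # [1, 0, -1, 0]
```

Under this rule every score vector sums to zero, and the resulting vectors could serve as the multivariate response passed to a regression tree.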

Notes

Pearson’s Chi-Square test of independence and its Bonferroni-adjusted p value were used in CHAID, a classical tree method (Kass 1980).
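
As a small illustration of the two ingredients named in this note, the sketch below computes Pearson's chi-square test of independence for a 2x2 table and applies a plain Bonferroni adjustment. It is not CHAID itself: the function names are hypothetical, the table is restricted to 2x2 with positive margins so that the p value has a closed form, and `n_tests` is an assumed generic count of comparisons rather than the multiplier Kass (1980) derives.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p value for a 2x2 table (1 df).

    Assumes every row and column total is positive.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts: rows are two groups, columns are two outcomes.
stat, p = chi_square_2x2([[30, 10], [10, 30]])
adjusted = bonferroni(p, n_tests=5)  # 5 candidate splits, an assumed count
print(stat, p, adjusted)
```

The adjustment simply multiplies the raw p value by the number of tests considered, which guards against picking a split variable only because many candidates were examined.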

References

Alvo M, Yu PLH (2014) Statistical methods for ranking data. Springer, New York
Bradley RA, Terry ME (1952) Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika 39:324–345
Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth, Belmont
Cattelan M (2012) Models for paired comparison data: a review with emphasis on dependent data. Stat Sci 27:412–433
Cheng W, Hühn J, Hüllermeier E (2009) Decision tree and instance-based learning for label ranking. In: International conference on machine learning, Montreal
Critchlow DE (1985) Metric methods for analyzing partially ranked data. Springer, New York
D’Ambrosio A, Heiser WJ (2016) A recursive partitioning method for the prediction of preference rankings based upon Kemeny distances. Psychometrika 81:774–794
Davidson RR (1970) On extending the Bradley–Terry model to accommodate ties in paired comparison experiments. J Am Stat Assoc 65:317–328
De’ath G (2002) Multivariate regression trees: a new technique for modeling species-environment relationships. Ecology 83:1105–1117
Diaconis P (1988) Group representations in probability and statistics. Institute of Mathematical Statistics, Hayward
Emond EJ, Mason DW (2002) A new rank correlation coefficient with application to the consensus ranking problem. J Multi-Criteria Decis Anal 11:17–28
Francis B, Dittrich R, Hatzinger R, Penn R (2002) Analysing partial ranks by using smoothed paired comparison methods: an investigation of value orientation in Europe. J R Stat Soc Ser C (Appl Stat) 51:319–336
Francis B, Dittrich R, Hatzinger R, Humphreys L (2014) A mixture model for longitudinal partially ranked data. Commun Stat Theory Methods 43:722–734
Hatzinger R, Dittrich R (2012) prefmod: an R package for modeling preferences based on paired comparisons, rankings, or ratings. J Stat Softw 48:1–31
Hothorn T, Hornik K, Zeileis A (2006) Unbiased recursive partitioning: a conditional inference framework. J Comput Graph Stat 15:651–674
Hsiao WC, Shih YS (2007) Splitting variable selection for multivariate regression trees. Stat Probab Lett 77:265–271
Inglehart R (1977) The silent revolution: changing values and political styles among western publics. Princeton University Press, Princeton
Kass GV (1980) An exploratory technique for investigating large quantities of categorical data. Appl Stat 29:119–127
Kemeny JG, Snell JL (eds) (1962) Preference rankings: an axiomatic approach. In: Mathematical models in the social sciences. The MIT press, Cambridge, pp 9–23
Kung YH, Lin CT, Shih YS (2012) Split variable selection for tree modeling on rank data. Comput Stat Data Anal 56:2830–2836
Lee PH, Yu PLH (2010) Distance-based tree models for ranking data. Comput Stat Data Anal 54:1672–1682
Liu KH, Shih YS (2016) Score-scale decision tree for paired comparison data. Statistica Sinica 26:429–444
Loh WY (2014) Fifty years of classification and regression trees (with discussion). Int Stat Rev 82:329–370
Loh WY, Zheng W (2013) Regression trees for longitudinal and multiresponse data. Ann Appl Stat 7:495–522
Marden JI (1995) Analyzing and modeling rank data. Chapman & Hall, London
Qinglong L (2015) StatMethRank: statistical methods for ranking data. R package version 1.3
Strobl C, Wickelmaier F, Zeileis A (2011) Accounting for individual differences in Bradley–Terry models by means of recursive partitioning. J Educ Behav Stat 36:135–153
Vermunt J (2003) Multilevel latent class models. Sociol Methodol 33:213–239
Yandell BS (1997) Practical data analysis for designed experiments. Chapman & Hall, Boca Raton
Yu PLH, Wan WM, Lee PH (2010) Decision tree modeling for ranking data. In: Fürnkranz J, Hüllermeier E (eds) Preference learning. Springer, New York, pp 83–106
Yu PLH, Lee PH, Cheung SF, Lau EYY, Mok DSY, Hui HC (2016) Logit tree models for discrete choice data with application to advice-seeking preferences among Chinese Christians. Comput Stat 31:799–827
Zeileis A, Hornik K (2007) Generalized M-fluctuation tests for parameter instability. Statistica Neerlandica 61:488–508
Zeileis A, Hothorn T, Hornik K (2008) Model-based recursive partitioning. J Comput Graph Stat 17:492–514

Acknowledgements

The authors are very grateful to the two reviewers and the editors for many helpful comments and suggestions.

Author information

Authors and Affiliations

Department of Mathematics, National Chung Cheng University, Chiayi, 621, Taiwan
Yu-Shan Shih
XDM Technology, Hsinchu, 300, Taiwan
Kuang-Hsun Liu

Corresponding author

Correspondence to Yu-Shan Shih.

Additional information

This research is supported in part by Taiwan MOST Grant 106-2118-M-194-002.

About this article

Cite this article

Shih, YS., Liu, KH. Regression trees for detecting preference patterns from rank data. Adv Data Anal Classif 13, 683–702 (2019). https://doi.org/10.1007/s11634-018-0332-3

Received: 22 October 2017
Revised: 16 July 2018
Accepted: 20 July 2018
Published: 25 July 2018
Issue Date: 01 September 2019
DOI: https://doi.org/10.1007/s11634-018-0332-3

Keywords

Mathematics Subject Classification

62G08
