Bayesian Plackett–Luce Mixture Models for Partially Ranked Data

Abstract

Many psychological and behavioral experiments elicit ordinal judgments on multiple alternatives to investigate the preference or choice orientation of a specific population. The Plackett–Luce model is one of the most widely applied parametric distributions for analyzing rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett–Luce models to account for unobserved sample heterogeneity in partially ranked data. We describe an efficient way to incorporate the latent group structure into the data augmentation approach and show how existing maximum likelihood procedures arise as special instances of the proposed Bayesian method. Inference combines the Expectation-Maximization algorithm for maximum a posteriori estimation with the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the goodness of fit of ranking distributions, both conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett–Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and with a Bayesian nonparametric mixture model, both of which assume the Plackett–Luce model as the mixture component. Our analysis of real datasets reveals the importance of an accurate diagnostic check for an in-depth understanding of the heterogeneous nature of partial ranking data.
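
As a minimal sketch of the distribution the abstract refers to, the following Python code evaluates the standard Plackett–Luce probability of a partial (top-q) ranking and the corresponding finite-mixture probability. It assumes the usual sequential-choice form, in which each ranked item is chosen with probability proportional to its support among the items not yet ranked. The function names `pl_log_prob` and `mixture_log_prob`, and the parameter names `support`, `weights` and `supports`, are illustrative and are not taken from the article.

```python
import math


def pl_log_prob(ranking, support):
    """Log-probability of a partial top-q ranking under a Plackett-Luce model.

    ranking: item indices ordered from most to least preferred (only the
             q ranked items; unranked items are simply omitted).
    support: positive support parameters, one per item (illustrative name).
    """
    remaining = sum(support)  # total support of the items still available
    log_p = 0.0
    for item in ranking:
        # Each stage picks `item` with probability support[item] / remaining.
        log_p += math.log(support[item]) - math.log(remaining)
        remaining -= support[item]
    return log_p


def mixture_log_prob(ranking, weights, supports):
    """Log-probability of a ranking under a finite Plackett-Luce mixture.

    weights:  mixture weights, assumed positive and summing to one.
    supports: one support vector per mixture component.
    """
    terms = [math.log(w) + pl_log_prob(ranking, p)
             for w, p in zip(weights, supports)]
    m = max(terms)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(t - m) for t in terms))


if __name__ == "__main__":
    support = [0.5, 0.3, 0.2]
    print(math.exp(pl_log_prob([0, 1, 2], support)))  # full ranking of 3 items
    print(math.exp(pl_log_prob([0], support)))        # top-1 partial ranking
```

Working on the log scale keeps the computation stable when rankings are long or supports are very unequal; the sketch covers likelihood evaluation only and does not implement the estimation procedures described in the abstract.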

Author information

Correspondence to Cristina Mollica.

Cite this article

Mollica, C., Tardella, L. Bayesian Plackett–Luce Mixture Models for Partially Ranked Data. Psychometrika 82, 442–458 (2017). https://doi.org/10.1007/s11336-016-9530-0

Keywords

  • ranking data
  • Plackett–Luce model
  • mixture models
  • data augmentation
  • MAP estimation
  • Gibbs sampling
  • label switching
  • goodness-of-fit