Bayesian analysis of ranking data with the Extended Plackett–Luce model

  • Original Paper
Statistical Methods & Applications

Abstract

Multistage ranking models, including the popular Plackett–Luce distribution (PL), rely on the assumption that the ranking process is performed sequentially, by assigning the positions from the top to the bottom one (forward order). A recent contribution to the ranking literature relaxed this assumption with the addition of the discrete-valued reference order parameter, yielding the novel Extended Plackett–Luce model (EPL). Inference on the EPL and its generalization into a finite mixture framework was originally addressed from the frequentist perspective. In this work, we propose the Bayesian estimation of the EPL in order to address more directly and efficiently the inference on the additional discrete-valued parameter and the assessment of its estimation uncertainty, possibly uncovering potential idiosyncratic drivers in the formation of preferences. We overcome initial difficulties in employing a standard Gibbs sampling strategy to approximate the posterior distribution of the EPL by combining the data augmentation procedure and the conjugacy of the Gamma prior distribution with a tuned joint Metropolis–Hastings algorithm within Gibbs. The effectiveness and usefulness of the proposal are illustrated with applications to simulated and real datasets.

Acknowledgements

We are deeply grateful to both anonymous referees, whose comments and suggestions allowed us to improve the article. This work has been supported by Sapienza Università di Roma, Grant RP11816436B15B6B.

Author information

Corresponding author

Correspondence to Cristina Mollica.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 1682 KB)

Appendix

1.1 An example of the EPL model

Here is a simple example to clarify the EPL formulation introduced in (1). Without loss of generality, let us suppose that an \(\text {EPL}(\dot{\rho },\underline{\dot{p}})\) with parameter values \(\dot{\rho }=(4,1,3,2)\) and \(\underline{\dot{p}}=(0.4,0.3,0.2,0.1)\) represents the true data-generating mechanism. Under this EPL scenario, the positions are assigned according to an alternating preference scheme: the item selected at the first stage is the least-liked alternative (\(\dot{\rho }(1)=4\)); at the second stage, the most-liked item is specified (\(\dot{\rho }(2)=1\)); the item selected at the third stage receives the third position (\(\dot{\rho }(3)=3\)); and, finally, the alternative remaining at the fourth stage is placed second in order of preference (\(\dot{\rho }(4)=2\)). The support parameters reflect a decreasing first-stage choice probability, \({\dot{p}}_i\propto (K+1)-i\) with \(K=4\) items. Hence, the chance of being ranked last decreases when moving from alternative 1 to alternative 4: item 1 is the most likely to be chosen at the first stage and, thus, to be ranked last, followed in order by items 2, 3 and 4.
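The generative mechanism just described can be sketched in a few lines of Python. This is only an illustrative simulation under the stated example values, not the authors' implementation: the function name `sample_epl_ordering` and its arguments are hypothetical. At each stage an item is drawn among those still available with probability proportional to its support parameter, and the item drawn at stage \(t\) is placed in position \(\dot{\rho }(t)\) of the ordering.

```python
import random


def sample_epl_ordering(rho, p, rng):
    """Draw one ordering (item in 1st place, ..., item in last place)
    from an EPL with reference order rho and support parameters p.

    Hypothetical helper for illustration; items and ranks are 1-based.
    """
    K = len(p)
    remaining = list(range(1, K + 1))   # items not yet selected
    ordering = [None] * K               # ordering[r - 1] = item ranked r-th
    for stage in range(K):
        # Choose among the remaining items proportionally to their support.
        weights = [p[item - 1] for item in remaining]
        item = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(item)
        # The item selected at this stage receives rank rho(stage).
        ordering[rho[stage] - 1] = item
    return tuple(ordering)


rng = random.Random(0)  # seeded only to make the illustration reproducible
rho_dot = (4, 1, 3, 2)
p_dot = (0.4, 0.3, 0.2, 0.1)
print(sample_epl_ordering(rho_dot, p_dot, rng))
```

With \(\dot{\rho }(1)=4\), the item drawn first ends up in the last position, so over many draws item 1 should occupy the last place in roughly 40% of the simulated orderings.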

Since the rank assignment order \(\rho\) is not restricted to the identity permutation \(\rho _{\text {F}}\) as in the PL, a generic ordering \(\pi ^{-1}\) does not coincide with the sequence \(\eta ^{-1}=\pi ^{-1}\circ \rho\) listing the items selected at each stage of the ranking process. For example, the considered \(\text {EPL}(\dot{\rho },\underline{\dot{p}})\) implies that observing the ordering \(\pi ^{-1}=(3,1,4,2)\) corresponds to the sequential item selections indicated by the composition below

$$\begin{aligned} \eta ^{-1}=\pi ^{-1}\circ \dot{\rho }&= (\pi ^{-1}(\dot{\rho }(1)),\pi ^{-1}(\dot{\rho }(2)),\pi ^{-1}(\dot{\rho }(3)),\pi ^{-1}(\dot{\rho }(4)))\\&= (\pi ^{-1}(4),\pi ^{-1}(1),\pi ^{-1}(3),\pi ^{-1}(2))=(2,3,4,1), \end{aligned}$$

that is, one chooses item 2 at the first stage, item 3 at the second stage, item 4 at the third stage and item 1 at the last stage.
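The composition above amounts to reading the ordering at the positions listed by the reference order. A minimal Python sketch follows; the helper name `compose` is an assumption introduced here for illustration, and the indices are shifted by one because the text uses 1-based positions.

```python
def compose(ordering, rho):
    """Return eta^{-1} = pi^{-1} o rho, the item selected at each stage.

    ordering[r - 1] is the item ranked r-th; rho[t - 1] is the rank
    assigned at stage t (both 1-based, as in the text).
    """
    return tuple(ordering[r - 1] for r in rho)


rho_dot = (4, 1, 3, 2)
ordering = (3, 1, 4, 2)
print(compose(ordering, rho_dot))  # (2, 3, 4, 1)
```

Under the forward order \(\rho _{\text {F}}=(1,2,3,4)\) the same function returns the ordering unchanged, which is the PL special case.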

Equation (1) states that the probability mass associated with \(\pi ^{-1}=(3,1,4,2)\) under the specified \(\text {EPL}(\dot{\rho },\underline{\dot{p}})\) can be computed from the PL distribution after rearranging the components of \(\pi ^{-1}\) according to the reference order \(\dot{\rho }\):

$$\begin{aligned} {{\,\mathrm{\mathbf {P}}\,}}_{\text {EPL}}(\pi ^{-1}&=(3,1,4,2) |\dot{\rho },\underline{\dot{p}})={{\,\mathrm{\mathbf {P}}\,}}_{\text {PL}}(\eta ^{-1}=(2,3,4,1) |\underline{\dot{p}}) \\ &= \frac{0.3}{1} \cdot \frac{0.2}{0.4+0.2+0.1} \cdot \frac{0.1}{0.4+0.1} \cdot 1 \approx 0.017. \end{aligned}$$
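The arithmetic above can be checked numerically. The following Python sketch is an illustration under the example's parameter values, with hypothetical function names `pl_prob` and `epl_prob`; it evaluates the PL stage-wise product on the rearranged sequence and also verifies that the EPL probabilities of all \(4!=24\) orderings sum to one.

```python
from itertools import permutations


def pl_prob(stage_items, p):
    """Plackett-Luce probability of selecting items in the given stage order."""
    available = set(range(1, len(p) + 1))
    prob = 1.0
    for item in stage_items:
        # Support of the chosen item over the total support still available.
        prob *= p[item - 1] / sum(p[j - 1] for j in available)
        available.remove(item)
    return prob


def epl_prob(ordering, rho, p):
    """EPL probability of an ordering: PL probability of pi^{-1} o rho."""
    eta_inv = [ordering[r - 1] for r in rho]
    return pl_prob(eta_inv, p)


rho_dot = (4, 1, 3, 2)
p_dot = (0.4, 0.3, 0.2, 0.1)

print(round(epl_prob((3, 1, 4, 2), rho_dot, p_dot), 3))  # 0.017

# Sanity check: the probabilities of all 24 orderings add up to one.
total = sum(epl_prob(o, rho_dot, p_dot) for o in permutations(range(1, 5)))
print(abs(total - 1.0) < 1e-12)  # True
```

Because the denominator is recomputed from the items still available, the support parameters do not need to be normalized to sum to one.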

Cite this article

Mollica, C., Tardella, L. Bayesian analysis of ranking data with the Extended Plackett–Luce model. Stat Methods Appl 30, 175–194 (2021). https://doi.org/10.1007/s10260-020-00519-5

  • DOI: https://doi.org/10.1007/s10260-020-00519-5
