1 Introduction

In social choice theory, preference rankings and approvals are two popular ways to collect the preferences of a group of agents on a set of alternatives. Preference rankings order the alternatives from best to worst without distinguishing between acceptable and unacceptable alternatives. That is, if a is ranked above b, we can only infer that a is preferred to b, but we cannot infer anything about their absolute acceptability. In contrast, the approval voting system (Brams & Fishburn, 1978) consists of separating the set of acceptable alternatives from the set of unacceptable alternatives, without considering preferences over either the acceptable or the unacceptable alternatives.

Preference rankings and approvals are related, but they convey fundamentally different types of information, and neither can be inferred from the other.

In this paper, we focus on preference–approval structures. They combine preferences over the alternatives, through a weak order, and establish which alternatives are acceptable (Brams, 2008, Chapter 3; Brams & Sanver, 2009; Sanver, 2010). In preference–approval structures, voters can pay attention to which alternatives are acceptable and simultaneously rank-order them. Voters may either rank-order unacceptable alternatives or avoid declaring their preferences about them (Footnote 1) by (implicitly) showing indifference between these alternatives (Footnote 2).

The distance and correlation between two preference rankings are of particular interest within this framework. Kendall’s correlation coefficient (Kendall, 1948) is arguably the best-known ranking metric. It is a measure of similarity that can be linearly transformed into a measure of dissimilarity (i.e., Kendall’s distance), which counts the number of pairwise disagreements between two rankings. Emond and Mason (2002, p. 20) demonstrated that when indifference between alternatives is allowed (weak orders), Kendall’s distance violates the triangle inequality. Moreover, Kendall’s correlation between the all-ties ranking and any other weak order is undefined, since it yields the indeterminate form 0/0.

Spearman’s distance (Spearman, 1987), another famous ranking metric, between two rank vectors is calculated by taking the square root of the sum of the squared rank differences. Spearman’s distance uses rank values as if they were mathematical variates, which leads to anomalous behaviour. Indeed, Spearman’s distance suffers from sensitivity to irrelevant alternatives (see Emond, 1997, p. 4; Emond & Mason, 2000, p. 16). In short, including additional irrelevant objects in the ranking exercise may alter the maximum agreement solution.

Kemeny and Snell (1962) took a different approach to this problem. They defined a set of additional axioms that should be applicable to any distance measure between two weak orders, and introduced a distance that satisfies all these constraints. Besides the classical properties (positivity, symmetry, identity of indiscernibles and triangle inequality), the Kemeny distance is not affected by a relabeling of alternatives (neutrality), and it is consistent in measurement as the number of objects varies (i.e., it is not sensitive to irrelevant alternatives). The Kemeny distance is a city block and a geodesic distance in the permutation polytope (see Heiser, 2004). It takes the shortest path between two rankings.

Generally, extensions to ranking measures have mainly focused on the definition of weighted distances (see García-Lapresta & Pérez-Román, 2010; Albano & Plaia, 2021; Plaia et al., 2021). In recent years, there has been a marked increase in publications about preference–approval structures and in the introduction of consensus and distance measures in that setting.

Erdamar et al. (2014) introduced a family of distances in the preference–approval setting, and they applied them to measure the consensus in that framework. Kamwa (2019) studied the propensity of preference–approval voting to elect the Condorcet winner/loser when they exist.

Dong et al. (2021) established some axioms implying the existence of a distinct distance function for preference–approval structures. They investigated a preference aggregation model in the context of group decision-making based on the proposed axiomatic distance function.

Kruger and Sanver (2021) investigated the compatibility between ordinal and evaluative approaches to social choice theory under two weak assumptions: respect for unanimity and independence of evaluation of each alternative. They claimed that there is an incompatibility between the two, and described some options whenever the second assumption is relaxed.

Long et al. (2021) developed a two-stage consensus reaching method for multi-attribute group decision making problems with preference–approval structures, promoting the efficiency of consensus reaching.

Barokas and Sprumont (2021) extended the classical Borda count to rank alternatives in the preference–approval setting, constructing an axiomatization of a new aggregation procedure called the broken Borda rule.

In this paper, we propose a new distance for preference–approvals, following the axiomatic approach of the Kemeny distance. However, while the Kemeny distance can only consider the preference–discordance, our approach takes into account the approval-discordance as well, and uses an aggregation function to combine the two types of information for each pair of alternatives.

We show that using the family of weighted power means (a class of weighted quasiarithmetic means) as the aggregation function brings the benefit of many interesting properties. The final aggregated distance will thus be the sum of the pairwise preference–approval discordances. Furthermore, we show that our distance satisfies the fundamental properties required of a metric and that, under certain assumptions, it has a precise geometric interpretation.

Our proposal can be regarded as the generalization of the Erdamar et al. (2014) distance measure, with the two coinciding for a specific parameter setting. However, we show that the proposed distance family has some advantages over the existing one as it is more versatile and performs better in cluster analysis.

Finally, the proposed metric is used to cluster a set of preference–approvals into homogeneous groups, considering the whole 2-dimensional universe of preference–approvals and a real case study.

The paper is organized as follows. Section 2 introduces the basic notation, preference–approvals and the codifications used throughout the paper. Section 3 includes our proposal for measuring distances between preference–approvals and some results. Section 4 offers some applications to the clustering task. Finally, Sect. 5 concludes the paper with some remarks.

2 Preliminaries

Let \(\,X=\{x_1,\dots ,x_n\}\,\) be a finite set of alternatives, with \(\,n\ge 2\). A weak order (or complete preorder) on X is a complete and transitive binary relation on X. A linear order on X is an antisymmetric weak order on X. With \(\,W(X)\,\) and \(\,L(X)\,\) we denote the set of weak and linear orders on X, respectively. Given \(\,R\in W(X)\), with \(\,\succ \,\) and \(\,\sim \,\) we denote the asymmetric and the symmetric parts of R, respectively: \(\,x_i \succ x_j\,\) if not \(\,x_j\,R\,x_i\), and \(\,x_i \sim x_j\,\) if \(\,x_i\,R\,x_j\,\) and \(\,x_j\,R\,x_i\).

Given a set Y, with \(\,\mathcal {P}(Y)\,\) we denote its power set, i.e., \(\,I\in \mathcal {P}(Y) \,\Leftrightarrow \, I\subseteq Y\). In turn, with \(\,\# Y\,\) we denote the cardinality of Y.

2.1 Preference–approval

Consider that a set of voters \(\,V=\{v_1,\dots ,v_m\}\), with \(\,m\ge 2\), have to express their opinions over X. We assume that each voter ranks the alternatives in X by means of a weak order and, additionally, assesses each alternative as either acceptable or unacceptable by partitioning X into A, the set of acceptable alternatives, and \(\,U=X\setminus A\), the set of unacceptable alternatives, where A and U can be empty sets.

We also assume the following consistency condition: given two alternatives \(x_i\) and \(x_j\), if \(x_j\) is acceptable and \(x_i\) is ranked above \(x_j\), then \(x_i\) should be acceptable as well.

Definition 1

A preference–approval on X is a pair \(\,(R,A)\in W(X) \times \mathcal {P}(X)\,\) satisfying the following condition:

$$\begin{aligned} \forall x_i,x_j\in X\; \big ((x_i\,R\,x_j \text{ and } x_j\in A) \;\Rightarrow \; x_i\in A\big ). \end{aligned}$$

With \(\,\mathcal {R}(X)\,\) we denote the set of preference–approvals on X.

Remark 1

If \(\,(R,A)\in \mathcal {R}(X)\), then the following conditions are satisfied:

  1. \(\forall x_i,x_j\in X\; \big ((x_i\in A \text{ and } x_j\in U) \;\Rightarrow \; x_i \,\succ \,x_j\big )\).

  2. \(\forall x_i,x_j\in X\; \big ((x_i\,R\,x_j \text{ and } x_i\in U) \;\Rightarrow \; x_j\in U\big )\).

We now illustrate preference–approval structures through the following example.

Example 1

Let us consider \(\,(R,A)\in \mathcal {R}(\{x_1,\dots ,x_8\})\,\) represented by

$$\begin{aligned} \begin{array}{c} x_4\\ x_1\;x_6\\ x_2 \\ \hline \\ x_3\\ x_5\;x_7\;x_8.\\ \end{array} \end{aligned}$$

It means that alternatives in the same row are indifferent, alternatives in upper rows are preferred to those located in lower rows, alternatives above the line are acceptable, i.e., \(\,A=\{x_1,x_2,x_4,x_6\}\), and those below the line are unacceptable, i.e., \(\,U=\{x_3,x_5,x_7,x_8\}\).

Table 1 includes the number of possible approvals, linear orders, weak orders and preference–approvals when the number of alternatives is \(\,n=2,3,\dots ,10\).

Table 1 Number of approvals, linear orders, weak orders and preference–approvals

It is well known that the total numbers of approvals (subsets of X) and linear orders are \(\,2^n\,\) and \(\,n!\), respectively. The number of weak orders is approximately \(\,n!(\log _2\,e)^{n+1}/2\) (see Good, 1980). A formula for calculating the number of preference–approvals has not previously been given in the literature. For the first time, the exact number of preference–approvals for \(\,n=2,3,\dots ,10\,\) alternatives is reported herein in Table 1. The formula to compute the exact number of preference–approvals on a set of n alternatives is

$$\begin{aligned} \omega (n)=\sum _{r=0}^n(r+1)!\,{{S}}_n^{(r)}, \end{aligned}$$
(1)

where \(\,{{S}}_n^{(r)}\,\) is a Stirling integer (number) of the second kind defined by David and Barton (1962, p. 294), Abramowitz and Stegun (1964, p. 824) and more thoroughly in Fisher and Yates (1953, p. 78), while r denotes the number of distinct positions in a weak order on n alternatives, also known as buckets. For example, considering four alternatives, if two are tied for first place and the other two are tied for third place, we can say that the number of distinct positions, or buckets, is two.
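The count in Eq. (1) can be checked numerically. The following is a minimal Python sketch, assuming the standard recurrence for Stirling numbers of the second kind; the function names `stirling2`, `count_weak_orders` and `count_preference_approvals` are illustrative and not part of the original work. It relies on the observation that a weak order with r buckets admits r + 1 positions for the approval line.

```python
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def stirling2(n: int, r: int) -> int:
    """Stirling number of the second kind: partitions of n items into r non-empty blocks."""
    if n == r:
        return 1  # includes the convention S(0, 0) = 1
    if r == 0 or r > n:
        return 0
    # The n-th item either joins one of the r existing blocks or opens a new block.
    return r * stirling2(n - 1, r) + stirling2(n - 1, r - 1)


def count_weak_orders(n: int) -> int:
    """Exact number of weak orders: r! * S(n, r) of them have exactly r buckets."""
    return sum(factorial(r) * stirling2(n, r) for r in range(n + 1))


def count_preference_approvals(n: int) -> int:
    """Eq. (1): (r + 1) approval-line positions for each weak order with r buckets."""
    return sum(factorial(r + 1) * stirling2(n, r) for r in range(n + 1))


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, count_weak_orders(n), count_preference_approvals(n))
```

For two alternatives the sketch returns 8 preference–approvals, which agrees with the eight structures enumerated in Sect. 4.1.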

Table 2 shows the quotients between preference–approvals and approvals. In turn, Table 3 shows the quotients between preference–approvals and weak orders.

It is clear that the expressivity of voters explodes with preference–approvals.

Table 2 Quotients between preference–approvals and approvals
Table 3 Quotients between preference–approvals and weak orders

2.2 Codifications

Assigning positions to alternatives in linear orders is trivial because indifferences among distinct alternatives are not allowed. Given \(\,R\in L(X)\), the position of each alternative \(\,x_i\in X\,\) in \(R\,\) is defined through the mapping \(\,P_R:X\longrightarrow \{1,\dots ,n\}\,\) that assigns 1 to the alternative ranked first, 2 to the alternative ranked second, and so on.

There are different ways of assigning positions to the alternatives in weak orders. One of them is used by García-Lapresta and Pérez-Román (2011) and it is based on Smith (1973), Black (1976) and Cook and Seiford (1982).

Given \(\,R\in W(X)\), the position of \(\,x_i\in X\,\) in \(R\,\) is assigned through the mapping \(\,P_R:X\longrightarrow [1,n]\,\) defined as

$$\begin{aligned} P_R(x_i)=n - \#\left\{ x_k\in X \mid x_i \succ x_k\right\} - \frac{1}{2} \cdot \#\left\{ x_k\in X \setminus \{x_i\} \mid x_i \sim x_k\right\} . \end{aligned}$$
(2)

Given \(\,A \subseteq X\), the indicator function (or characteristic function) of A, \(\,I_A:X\longrightarrow \{0,1\}\), is defined as

$$\begin{aligned} I_A(x_i)=\left\{ \begin{array}{ll} 1, \quad \text { if} \;x_i \in A,\\ 0, \quad \text { if} \;x_i \in X\setminus A. \end{array} \right. \end{aligned}$$
(3)

Remark 2

Every preference–approval \(\,(R,A)\in \mathcal {R}(\{x_1,\dots ,x_n\})\,\) can be codified in terms of \(\,P_R(x_i)\,\) [Eq. (2)] and \(\,I_A(x_i)\,\) [Eq. (3)] as follows:

$$ \begin{aligned} \big (P_R(x_1),P_R(x_2),\dots ,P_R(x_n)\big )\, \& \,\big (I_A(x_1),I_A(x_2),\dots ,I_A(x_n)\big ) . \end{aligned}$$
(4)

Example 2

Consider the preference–approval \(\,(R,A)\in \mathcal {R}(\{x_1,x_2,x_3,x_4\})\,\) represented by

$$\begin{aligned} \begin{array}{c} x_4\\ x_1\\ x_2 \\ \hline \\ x_3 \end{array} \end{aligned}$$

Following Eq. (4), \(\,(R,A)\,\) is codified as \( \,(2,3,4,1)\, \& \,(1,1,0,1)\).
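The codification of Eq. (4) can be sketched in a few lines of Python. The representation below (a weak order given as a list of buckets, best first, plus a set of approved alternatives) and the function name `codify` are assumptions made for illustration only; the positions follow Eq. (2) and the indicators follow Eq. (3).

```python
def codify(buckets, approved, alternatives):
    """Return (positions, indicators) for a preference-approval.

    buckets      -- list of lists of alternatives, best bucket first (the weak order R)
    approved     -- set of acceptable alternatives (the set A)
    alternatives -- alternatives in the fixed order x_1, ..., x_n
    """
    n = len(alternatives)
    # Bucket index of each alternative; a smaller index means a better rank.
    level = {x: k for k, bucket in enumerate(buckets) for x in bucket}
    positions = []
    for x in alternatives:
        beaten = sum(1 for y in alternatives if level[x] < level[y])
        tied = sum(1 for y in alternatives if y != x and level[x] == level[y])
        positions.append(n - beaten - 0.5 * tied)  # Eq. (2)
    indicators = [1 if x in approved else 0 for x in alternatives]  # Eq. (3)
    return positions, indicators


if __name__ == "__main__":
    # Example 2: x4 above x1 above x2 (all acceptable), x3 unacceptable.
    alts = ["x1", "x2", "x3", "x4"]
    print(codify([["x4"], ["x1"], ["x2"], ["x3"]], {"x1", "x2", "x4"}, alts))
```

Applied to Example 1, the same function assigns the averaged positions 2.5 to the tied alternatives x1 and x6, and 7 to x5, x7 and x8.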

The sign function, \(\,\text{ sgn } : \mathbb {R} \longrightarrow \{-1,0,1\}\), is defined as

$$\begin{aligned} \text{ sgn }\,(a)= \left\{ \begin{array}{rl} 1, &{} \, \text{ if } \, a>0 \, ,\\ 0, &{} \, \text{ if } \, a=0\, ,\\ -1, &{} \, \text{ if } \, a<0\,. \end{array} \right. \end{aligned}$$

Taking into account May (1952) and Fishburn (2015), we now introduce an index that codifies the order between two alternatives in a weak order \(\,R\in W(X)\):

$$\begin{aligned} O_R(x_i,x_j)= \left\{ \begin{array}{rl} 1, &{} \quad \text{ if } \, x_i \succ x_j \, ,\\ 0, &{} \quad \text{ if } \, x_i \sim x_j\, ,\\ -1, &{} \quad \text{ if } \, x_j \succ x_i\,. \end{array} \right. \end{aligned}$$
(5)

It is worth noting that the index \(\,O_R(x_i,x_j)\,\) is also known in the literature as \(\,a_{ij}\,\) (see Kemeny & Snell, 1962, p. 11) or score matrix (see Emond & Mason, 2002). In this paper, to avoid confusion with the approval-discordance notation of Eq. (8), we chose to use the notation \(\,O_R(x_i,x_j)\).

3 The proposal

Given two preference–approvals \(\,(R_1,A_1), (R_2,A_2) \in \mathcal {R}(X)\,\) and two generic alternatives \(\,x_i,x_j\in X\), we now introduce two indices that measure the discordances between these alternatives with respect to preferences and approvals, respectively.

The preference–discordance between \(x_i\) and \(x_j\) is defined as

$$\begin{aligned} p_{ij} = \frac{1}{2} \cdot \vert \,\text{ sgn } \left( P_{R_1} (x_j) - P_{R_1} (x_i) \right) - \text{ sgn } \left( P_{R_2} (x_j) - P_{R_2} (x_i) \right) \vert . \end{aligned}$$
(6)

Taking into account Eq. (5), Eq. (6) can be written in an equivalent and simpler way:

$$\begin{aligned} p_{ij} = \frac{1}{2} \cdot \vert O_{R_1}(x_i,x_j) - O_{R_2}(x_i,x_j) \vert , \end{aligned}$$
(7)

and therefore, \(\,p_{ij} \in \{0,\,0.5,\,1\}\).

The approval-discordance between \(x_i\) and \(x_j\) is defined as

$$\begin{aligned} a_{ij} = \frac{1}{2} \cdot \big ( \vert I_{A_1}(x_i) - I_{A_2}(x_i) \vert + \vert I_{A_1}(x_j) - I_{A_2}(x_j) \vert \big ), \end{aligned}$$
(8)

and again \(\,a_{ij} \in \{0,\,0.5,\,1\}\).
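The two pairwise indices of Eqs. (6) and (8) can be computed directly from the codification of Eq. (4). The sketch below is illustrative: the function names and the zero-based indices `i`, `j` are our own conventions, and each preference–approval is assumed to be given as a position vector and an indicator vector.

```python
def sgn(a):
    """Sign function with values in {-1, 0, 1}."""
    return (a > 0) - (a < 0)


def preference_discordance(pos1, pos2, i, j):
    """Eq. (6): half the absolute difference of the two ordering signs."""
    return 0.5 * abs(sgn(pos1[j] - pos1[i]) - sgn(pos2[j] - pos2[i]))


def approval_discordance(ind1, ind2, i, j):
    """Eq. (8): half the number of approval disagreements on x_i and x_j."""
    return 0.5 * (abs(ind1[i] - ind2[i]) + abs(ind1[j] - ind2[j]))


if __name__ == "__main__":
    # Opposite rankings and opposite approvals on two alternatives.
    print(preference_discordance((1, 2), (2, 1), 0, 1))  # 1.0
    print(approval_discordance((1, 0), (0, 1), 0, 1))    # 1.0
    # A strict ranking against a tie, with identical approvals.
    print(preference_discordance((1, 2), (1.5, 1.5), 0, 1))  # 0.5
    print(approval_discordance((1, 1), (1, 1), 0, 1))        # 0.0
```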

In both cases, the values of 0, 0.5 and 1 indicate a null, moderate and high discordance, respectively. In order to generate a global measure of discordance between two alternatives, we consider an aggregation function (see Beliakov et al., 2007; Grabisch et al., 2009; Ramík & Vlach, 2012, Sect. 2, among others).

Definition 2

Given an aggregation function \(\,h:[0,1]\times [0,1] \longrightarrow [0,1]\), the distance associated with h, \(\,D:\mathcal {R}(X) \times \mathcal {R}(X) \longrightarrow [0,1]\), is defined as

$$\begin{aligned} D\big ( (R_1,A_1), (R_2,A_2) \big ) = \frac{2}{n \cdot (n-1)} \cdot \sum _{\begin{array}{c} i,j=1 \\ i<j \end{array}}^n h(p_{ij}, a_{ij}). \end{aligned}$$
(9)

Among the huge variety of aggregation functions, in this proposal we consider a class of weighted quasiarithmetic means (Footnote 3): the family of weighted power means, \(\,h:[0,1] \times [0,1] \longrightarrow [0,1]\), defined as

$$\begin{aligned} h(x,y) = \left( \lambda \cdot x^r + (1-\lambda )\cdot y^r\right) ^{\frac{1}{r}}, \end{aligned}$$
(10)

where \(\,\lambda \in [0,1]\,\) and \(\,r>0\).

Remark 3

Weighted power means, defined in Eq. (10), have interesting properties [see, for instance, Beliakov et al. (2007, pp. 45–47)]:

  1. Continuity: h is continuous.

  2. Monotonicity: (\(x\le x'\,\) and \(\,y\le y'\)) \(\,\Rightarrow \; h(x,y) \le h(x',y')\), for all \(\,x,y,x',y'\in [0,1]\).

  3. Idempotency: \(h(x,x)=x\,\) for every \(\,x\in [0,1]\).

  4. Compensativeness: \(\min \{x,y\} \le h(x,y) \le \max \{x,y\}\,\) for all \(\,x,y\in [0,1]\).

  5. Comparability: h is increasing in r.

  6. Symmetry: \(h(x,y)=h(y,x)\,\) for all \(\,x,y\in [0,1] \;\Leftrightarrow \; \lambda =0.5\).

  7. \(\displaystyle \lim _{r \rightarrow \infty } h(x,y)=\max \{x,y\}\).

  8. \(\displaystyle \lim _{r \rightarrow 0} h(x,y)= x^{\lambda } \cdot y^{1-\lambda }\,\) (weighted geometric mean).
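Several of the properties listed above can be observed numerically. The following Python sketch implements Eq. (10) under the stated constraints on \(\lambda\) and r; the function name `weighted_power_mean` and the chosen sample values are illustrative assumptions, and the limits are only approximated by a very large and a very small r.

```python
import math


def weighted_power_mean(x, y, lam=0.5, r=1.0):
    """Eq. (10): h(x, y) = (lam * x^r + (1 - lam) * y^r)^(1 / r), with r > 0."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if r <= 0:
        raise ValueError("r must be positive")
    return (lam * x ** r + (1.0 - lam) * y ** r) ** (1.0 / r)


if __name__ == "__main__":
    grid = (0.0, 0.5, 1.0)  # the only values p_ij and a_ij can take
    for r in (1, 2, 5, 10):
        row = [round(weighted_power_mean(p, a, 0.5, r), 3) for p in grid for a in grid]
        print(f"r = {r:>2}: {row}")
    # Idempotency, and numerical approximations of the two limits.
    print(math.isclose(weighted_power_mean(0.5, 0.5, 0.75, 2), 0.5))
    print(weighted_power_mean(0.5, 1.0, 0.5, 100))   # close to max{0.5, 1}
    print(weighted_power_mean(0.5, 1.0, 0.5, 1e-6))  # close to sqrt(0.5 * 1)
```

Printing the grid for increasing r makes the comparability property visible: each entry is non-decreasing in r and bounded above by the larger of its two inputs.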

Notice that the inputs of h in Eq. (9) are pairs of values in \(\,\{0, \,0.5,\, 1\}\). In Tables 4 and 5 we show the values of \(\,h\,\) for these pairs and different values of the parameter \(\,r\,\) for \(\,\lambda = 0.5\,\) and \(\,\lambda = 0.75\), respectively.

According to Tables 4 and 5, the parameter r governs the penalty for each pair of values. Indeed, as r increases, so does the value of \(\,h(p_{ij},a_{ij})\). As a result, taking an excessively large r value results in very similar penalties and reduces the weight of high discordance compared to moderate discordance.

Table 4 Values of \(\,h\,\) for \(\,\lambda =0.5\)
Table 5 Values of \(\,h\,\) for \(\,\lambda =0.75\)

Taking into account Eq. (9) with the aggregation function h in Eq. (10), we now introduce the family of distances on preference–approvals that we analyze in the present paper.

Definition 3

Given \(\,\lambda \in [0,1]\,\) and \(\,r>0\), the distance associated with \(\,\lambda \,\) and \(\,r\,\) is the mapping \(\,D^r_{\lambda }:\mathcal {R}(X) \times \mathcal {R}(X) \longrightarrow [0,1]\,\) defined as

$$\begin{aligned} D^r_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big ) = \frac{2}{n \cdot (n-1)} \cdot \sum _{\begin{array}{c} i,j=1 \\ i<j \end{array}}^n \left( \lambda \cdot p_{ij}^r + (1-\lambda )\cdot a_{ij}^r\right) ^{\frac{1}{r}}. \end{aligned}$$
(11)
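Definition 3 translates into a short routine. The sketch below is a minimal Python illustration, assuming each preference–approval is supplied in the codified form of Eq. (4) as a pair (positions, indicators); the function name `pa_distance` is hypothetical and the example inputs are our own.

```python
from itertools import combinations


def sgn(a):
    """Sign function with values in {-1, 0, 1}."""
    return (a > 0) - (a < 0)


def pa_distance(pa1, pa2, lam=0.5, r=1.0):
    """Eq. (11): normalized sum of weighted power means of p_ij and a_ij over i < j."""
    (pos1, ind1), (pos2, ind2) = pa1, pa2
    n = len(pos1)
    total = 0.0
    for i, j in combinations(range(n), 2):
        p = 0.5 * abs(sgn(pos1[j] - pos1[i]) - sgn(pos2[j] - pos2[i]))  # Eq. (6)
        a = 0.5 * (abs(ind1[i] - ind2[i]) + abs(ind1[j] - ind2[j]))     # Eq. (8)
        total += (lam * p ** r + (1.0 - lam) * a ** r) ** (1.0 / r)     # Eq. (10)
    return 2.0 * total / (n * (n - 1))


if __name__ == "__main__":
    first = ((1, 2, 3), (1, 1, 0))   # x1 > x2 > x3, with x1 and x2 acceptable
    second = ((2, 1, 3), (0, 1, 0))  # x2 > x1 > x3, with only x2 acceptable
    for r in (1, 2, 5):
        print(r, pa_distance(first, second, lam=0.5, r=r))
```

For the two structures in the usage example, the pairwise contributions with \(r=1\) and \(\lambda =0.5\) are 0.75, 0.25 and 0, so the distance equals 1/3.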

Remark 4

When \(\,r=2\,\) and \(\,\lambda =0.5\), the geometric interpretation of \(\,h(p_{ij},a_{ij})\,\) is related to the Euclidean distance.

Figure 1 reports the preference–approval plane, that is a Euclidean plane having on the x-axis the preference–discordance, \(\,p_{ij}\), and on the y-axis the approval-discordance, \(\,a_{ij}\).

If \(\,r=2\,\) and \(\,\lambda =0.5\), then \(\,h(p_{ij},a_{ij})\,\) is proportional to the Euclidean distance between \(\,(p_{ij},a_{ij})\,\) and the origin, (0, 0), \(\,d\big ((p_{ij},a_{ij}),(0,0)\big )\):

$$\begin{aligned} \begin{aligned}&h(p_{ij},a_{ij})=\sqrt{0.5\cdot (p^2_{ij}+a^2_{ij})}=\sqrt{0.5}\cdot d\big ((p_{ij},a_{ij}),(0,0)\big ), \text{ i.e., }\\&h(p_{ij},a_{ij}) \propto d\big ((p_{ij},a_{ij}),(0,0)\big ). \end{aligned} \end{aligned}$$

This means that the aggregation function h can be interpreted as a proper distance in the preference–approval plane. As a result, the point of greatest discordance, (1, 1), will be the farthest from the origin of the axes. Conversely, (0, 0) represents the point of greatest agreement. The red segments in Fig. 1 are proportional to the values \(h(p_{ij},a_{ij})\) for each \(p_{ij}, a_{ij} \in \{0,0.5,1\}\).

Thus, the aggregated distance \(\,D^2_{0.5}\,\big ( (R_1,A_1), (R_2,A_2) \big )\,\) [see Eq. (11)] can be interpreted as the normalized sum of \(\,\frac{n\cdot (n-1)}{2}\,\) Euclidean distances in the preference–approval plane. That is,

$$\begin{aligned} D^2_{0.5}\,\big ( (R_1,A_1), (R_2,A_2) \big )=\frac{2\sqrt{0.5}}{n \cdot (n-1)} \cdot \sum _{\begin{array}{c} i,j=1 \\ i<j \end{array}}^n d\big ((p_{ij},a_{ij}),(0,0)\big ). \end{aligned}$$

Fig. 1 Preference–approval plane

Proposition 1

\(D^r_{\lambda }\,\) is a metric on \(\,\mathcal {R}(X)\,\) for all \(\,\lambda \in (0,1)\,\) and \(\, r\ge 1\). That is, for all \(\,(R_1,A_1),(R_2,A_2) \; \in \mathcal {R}(X)\,\) the following conditions are satisfied (Footnote 4):

  1. Positivity: \(D^r_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big )\ge 0\).

  2. Symmetry: \(D^r_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big )=D^r_{\lambda }\big ( (R_2,A_2), (R_1,A_1) \big )\).

  3. Identity of indiscernibles: \(D^r_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big )=0 \;\Leftrightarrow \; (R_1,A_1)=(R_2,A_2)\).

  4. Triangle inequality: \(D^r_{\lambda }\big ( (R_1,A_1), (R_3,A_3) \big )\le D^r_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big )+D^r_{\lambda }\big ( (R_2,A_2), (R_3,A_3) \big )\), for every \(\,(R_3,A_3) \in \mathcal {R}(X)\).

Remark 5

If \(\,\lambda \in \{0,1\}\), then \(\,D^r_{\lambda }\,\) is not a metric.

If \(\,\lambda =0\), let \(\,(R_1,A_1),(R_2,A_1)\in \mathcal {R}(X)\,\) be such that \(\,R_1 \ne R_2\). Then, we have \(\,D^r_{\lambda }\big ((R_1,A_1),(R_2,A_1)\big ) = 0\).

If \(\,\lambda =1\), let \(\,(R_1,A_1),(R_1,A_2)\in \mathcal {R}(X)\,\) be such that \(\,A_1 \ne A_2\). Then, we have \(\,D^r_{\lambda }\big ((R_1,A_1),(R_1,A_2)\big ) = 0\).

Consequently, if \(\,\lambda \in \{0,1\}\), then \(\,D^r_{\lambda }\,\) does not verify the identity of indiscernibles, hence it is not a metric.

Proposition 2 demonstrates that our proposal can be considered a generalization of the preference–approval distance proposed by Erdamar et al. (2014).

In their approach, the distance between two preference–approvals \(\,(R_1,A_1), (R_2,A_2) \in \mathcal {R}(X)\), denoted by \(\,d_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big )\), is obtained by computing the preference distance and the approval distance separately and then aggregating them through a convex combination.

The authors measure the disagreement between preferences by using the Kemeny metric (Kemeny, 1959), \(d_K\):

$$\begin{aligned} d_K(R_1,R_2) = \sum _{\begin{array}{c} i,j=1 \\ i<j \end{array}}^n \vert \,\text{ sgn } \left( P_{R_1} (x_j) - P_{R_1} (x_i) \right) - \text{ sgn } \left( P_{R_2} (x_j) - P_{R_2} (x_i) \right) \vert . \end{aligned}$$

Or, equivalently, by considering Eq. (5):

$$\begin{aligned} d_K(R_1,R_2) = \sum _{\begin{array}{c} i,j=1 \\ i<j \end{array}}^n \vert O_{R_1}(x_i,x_j) - O_{R_2}(x_i,x_j) \vert . \end{aligned}$$
(12)

Notice that \(\,d_K(R_1,R_2) \in [0,n\cdot (n-1)]\).

In turn, the approval disagreement is measured through the Hamming metric (Hamming, 1950), \(d_H\):

$$\begin{aligned} d_H(A_1,A_2) = \sum _{i=1}^n \vert I_{A_1}(x_i) - I_{A_2}(x_i) \vert . \end{aligned}$$
(13)

Notice that \(\,d_H (A_1,A_2) \in [0,n]\).

In order to aggregate \(d_K\) and \(d_H\) into a global distance, the two metrics are normalized to the same codomain \(\,[0,1]\,\) by dividing each by its maximum value.

The mappings \(\,d_R:\mathcal {R}(X)\times \mathcal {R}(X) \longrightarrow [0,1]\,\) and \(\,d_A:\mathcal {R}(X)\times \mathcal {R}(X) \longrightarrow [0,1]\,\) are defined as

$$\begin{aligned} d_R\big ( (R_1,A_1), (R_2,A_2) \big )=\frac{d_K(R_1,R_2)}{n \cdot (n-1)},\\ d_A\big ( (R_1,A_1), (R_2,A_2) \big )= \frac{d_H(A_1,A_2)}{n}. \end{aligned}$$

The two normalized distances are eventually aggregated in a final preference–approval distance, \(\,d_{\lambda }:\mathcal {R}(X)\times \mathcal {R}(X) \longrightarrow [0,1]\), defined as

$$\begin{aligned}&d_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big )= \nonumber \\&\lambda \cdot d_R\big ( (R_1,A_1), (R_2,A_2) \big ) + (1-\lambda )\cdot d_A\big ( (R_1,A_1), (R_2,A_2) \big ), \end{aligned}$$
(14)

where \(\,\lambda \in [0,1]\,\) is a parameter used to control the relative relevance of the two components.

Taking into account Eqs. (12) and (13), Eq. (14) can be rewritten as

$$\begin{aligned}&d_{\lambda }\big ((R_1,A_1), (R_2,A_2)\big ) = \nonumber \\&\frac{\lambda }{n \cdot (n-1)} \cdot \sum _{\begin{array}{c} i,j=1 \\ i<j \end{array}}^n \vert O_{R_1}(x_i,x_j) - O_{R_2}(x_i,x_j) \vert + \nonumber \\&\frac{1-\lambda }{n} \cdot \sum _{i=1}^n \vert I_{A_1}(x_i) - I_{A_2}(x_i) \vert . \end{aligned}$$
(15)

Proposition 2

For all \(\,(R_1,A_1), (R_2,A_2) \in \mathcal {R}(X)\,\) and \(\,\lambda \in [0,1]\,\) it holds

$$\begin{aligned} D^1_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big )\,= d_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big ). \end{aligned}$$

Note that Proposition 2 is valid for weighted power means. They are the proper weighted quasiarithmetic means that allow us to generalize the distance between preference–approvals introduced by Erdamar et al. (2014).
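Proposition 2 can also be checked by brute force on a small universe. The Python sketch below enumerates all preference–approvals on n alternatives and compares \(D^1_{\lambda }\) [Eq. (11) with \(r=1\)] against \(d_{\lambda }\) [Eq. (15)]. The encoding of a weak order as a tuple of dense bucket indices (0 for the best bucket) and all function names are assumptions made for this illustration; a numerical check of this kind is not a substitute for the proof.

```python
from itertools import combinations, product


def all_preference_approvals(n):
    """Enumerate every preference-approval as (bucket levels, approval indicators)."""
    universe = []
    for levels in product(range(n), repeat=n):
        k = max(levels) + 1
        if set(levels) != set(range(k)):
            continue  # keep only dense labelings, so each weak order appears once
        for cut in range(k + 1):  # buckets with index below the cut are acceptable
            universe.append((levels, tuple(int(lv < cut) for lv in levels)))
    return universe


def order_index(levels, i, j):
    """O_R(x_i, x_j) of Eq. (5): 1 if x_i is preferred, -1 if x_j is, 0 if tied."""
    return (levels[i] < levels[j]) - (levels[i] > levels[j])


def proposed_distance_r1(pa1, pa2, lam):
    """Eq. (11) with r = 1."""
    (lv1, in1), (lv2, in2) = pa1, pa2
    n = len(lv1)
    total = 0.0
    for i, j in combinations(range(n), 2):
        p = 0.5 * abs(order_index(lv1, i, j) - order_index(lv2, i, j))
        a = 0.5 * (abs(in1[i] - in2[i]) + abs(in1[j] - in2[j]))
        total += lam * p + (1.0 - lam) * a
    return 2.0 * total / (n * (n - 1))


def convex_combination_distance(pa1, pa2, lam):
    """Eq. (15): normalized Kemeny and Hamming distances combined with weight lam."""
    (lv1, in1), (lv2, in2) = pa1, pa2
    n = len(lv1)
    d_k = sum(abs(order_index(lv1, i, j) - order_index(lv2, i, j))
              for i, j in combinations(range(n), 2))
    d_h = sum(abs(u - v) for u, v in zip(in1, in2))
    return lam * d_k / (n * (n - 1)) + (1.0 - lam) * d_h / n


def max_gap(n, lam):
    """Largest absolute difference between the two distances over all pairs."""
    universe = all_preference_approvals(n)
    return max(abs(proposed_distance_r1(a, b, lam) - convex_combination_distance(a, b, lam))
               for a in universe for b in universe)


if __name__ == "__main__":
    print(len(all_preference_approvals(3)), max_gap(3, 0.3))
```

Up to floating-point rounding, the reported gap is zero for the parameter values tried, which is consistent with the statement of Proposition 2.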

In Proposition 2, we have shown that \(\,D^r_{\lambda }=d_{\lambda }\,\) when \(\,r=1\). We now show that this is not true if \(\,r \ne 1\).

Proposition 3

If \(\,r \ne 1\), then the equality \(\,D^r_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big )= d_{\lambda }\big ( (R_1,A_1), (R_2,A_2) \big )\,\) does not hold for all \(\,(R_1,A_1), (R_2,A_2) \in \mathcal {R}(X)\,\) and \(\,\lambda \in [0,1]\).

Proof

Let us consider the case of two alternatives. Notice that in Eq. (11), when \(\,n=2\), \(\,D^r_{\lambda }\,\) reduces to the function h evaluated at \(i=1\) and \(j=2\). That is, \(\,D^r_{\lambda }\big ( (R_1,A_1), (R_2,A_2)\big )=h(p_{12},a_{12})= (\lambda \cdot p_{12}^r + (1-\lambda ) \cdot a_{12}^r)^{\frac{1}{r}}\). By Proposition 2, we have \(\,D^1_{\lambda }\big ( (R_1,A_1),(R_2,A_2)\big ) = d_{\lambda } \big ( (R_1,A_1), (R_2,A_2)\big ) = \lambda \cdot p_{12} + (1-\lambda ) \cdot a_{12}\).

If we force the equality \(\,D^1_{\lambda }\big ( (R_1,A_1), (R_2,A_2)\big )=D^r_{\lambda }\big ((R_1,A_1), (R_2,A_2)\big )\), we have \(\,\lambda \cdot p_{12} + (1-\lambda ) \cdot a_{12}= (\lambda \cdot p_{12}^r + (1-\lambda ) \cdot a_{12}^r)^{\frac{1}{r}}\), i.e.,

$$\begin{aligned} (\lambda \cdot p_{12}+(1-\lambda )\cdot a_{12})^r=\lambda \cdot p_{12}^r+(1-\lambda )a_{12}^r. \end{aligned}$$
(16)

It suffices to exhibit \(\,p_{12},a_{12} \in \{0,0.5,1\}\,\) and \(\,\lambda \in [0,1]\,\) for which Eq. (16) fails whenever \(\,r\ne 1\).

If \(\,p_{12}=1\,\) and \(\,a_{12}=0\), then Eq. (16) becomes \(\,\lambda ^r=\lambda \), which for \(\,r \ne 1\,\) holds if and only if \(\,\lambda \in \{0,1\}\). Hence, for every \(\,\lambda \in (0,1)\,\) and \(\,r \ne 1\), Eq. (16) is false. \(\square \)
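As a concrete instance of this argument (a worked illustration with parameter values chosen here, not taken from the original text), let \(\,n=2\), let \(R_1\) rank \(x_1\) above \(x_2\), let \(R_2\) rank \(x_2\) above \(x_1\), and let \(\,A_1=A_2=X\), so that \(\,p_{12}=1\,\) and \(\,a_{12}=0\). With \(\,\lambda =0.5\,\) and \(\,r=2\), the two distances are

$$\begin{aligned} D^1_{0.5}\big ( (R_1,A_1), (R_2,A_2) \big )&= 0.5\cdot 1 + 0.5 \cdot 0 = 0.5,\\ D^2_{0.5}\big ( (R_1,A_1), (R_2,A_2) \big )&= \left( 0.5\cdot 1^2 + 0.5\cdot 0^2\right) ^{\frac{1}{2}} = \sqrt{0.5} \approx 0.707, \end{aligned}$$

which differ, in agreement with Proposition 3.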

4 Clustering tasks

This section shows how the proposed distance can be used to study the universe of preference–approvals and to determine clusters.

Section 4.1 examines the universe of preference–approvals in the case of two alternatives in order to observe how the values of \(\,r\,\) and \(\,\lambda \,\) affect the creation of homogeneous clusters. Afterwards, the influence of the two parameters \(\,r\,\) and \(\,\lambda \,\) when the number of alternatives \(\,n\,\) varies is investigated.

Section 4.2 provides an application on real data, to investigate how the countries of the European Union can be clustered into groups, according to their preference–approvals on nine alternatives concerning social values. The dataset used comes from the Eurobarometer website (Footnote 5).

4.1 Universe of preference–approvals

Let us consider the 2-dimensional preference–approval universe where the set of alternatives is \(\,X=\{x_1,x_2\}\). Following Eq. (4), the preference–approvals \(\,(R_i,A_i)\), \(\, i=1,2,\dots ,8\), are represented by two 2-dimensional vectors:

$$ \begin{aligned} (2,1)\, \& \,(1,1) \;\equiv \begin{array}{c} x_2 \\ x_1\\ \hline \end{array} \qquad \qquad (2,1)\, \& \,(0,1) \;\equiv \begin{array}{c} x_2 \\ \hline \\ x_1 \end{array} \end{aligned}$$
$$ \begin{aligned} (2,1)\, \& \,(0,0) \;\equiv \begin{array}{c} \hline x_2 \\ x_1 \end{array} \qquad \qquad (1,2)\, \& \,(1,1) \;\equiv \begin{array}{c} x_1 \\ x_2 \\ \hline \end{array} \end{aligned}$$
$$ \begin{aligned} (1,2)\, \& \,(1,0) \;\equiv \begin{array}{c} x_1 \\ \hline x_2 \end{array} \qquad \qquad (1,2)\, \& \,(0,0) \;\equiv \begin{array}{c} \hline x_1 \\ x_2 \end{array} \end{aligned}$$
$$ \begin{aligned} (1.5,1.5)\, \& \,(1,1) \;\equiv \begin{array}{c} x_1 \; x_2 \\ \hline \end{array} \qquad \qquad (1.5,1.5)\, \& \,(0,0) \;\equiv \begin{array}{c} \hline x_1 \; x_2 \end{array} \end{aligned}$$

The distances between preference–approvals on two alternatives for \(\,r=1\,\) and \(\,\lambda =0.5\) (Fig. 2) and \(\,\lambda =0.75\) (Fig. 3) are reported in the heatmaps.

Fig. 2 Distances between preference–approvals for 2 alternatives, \(\,r=1\,\) and \(\,\lambda =0.5\)

Fig. 3 Distances between preference–approvals for 2 alternatives, \(\,r=1\,\) and \(\,\lambda =0.75\)
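Pairwise distance matrices of the kind displayed in these heatmaps can be computed as follows. This is a minimal Python sketch that re-implements Eq. (11) on the eight codified preference–approvals listed above; the names `UNIVERSE`, `pa_distance` and `distance_matrix` are illustrative, and the ordering of rows and columns is our own choice and need not match the figures.

```python
from itertools import combinations

# The eight preference-approvals on X = {x1, x2}, codified as in Eq. (4):
# (positions of x1 and x2) and (approval indicators of x1 and x2).
UNIVERSE = [
    ((2, 1), (1, 1)), ((2, 1), (0, 1)), ((2, 1), (0, 0)),
    ((1, 2), (1, 1)), ((1, 2), (1, 0)), ((1, 2), (0, 0)),
    ((1.5, 1.5), (1, 1)), ((1.5, 1.5), (0, 0)),
]


def sgn(a):
    """Sign function with values in {-1, 0, 1}."""
    return (a > 0) - (a < 0)


def pa_distance(pa1, pa2, lam, r):
    """Eq. (11) for two codified preference-approvals."""
    (pos1, ind1), (pos2, ind2) = pa1, pa2
    n = len(pos1)
    total = 0.0
    for i, j in combinations(range(n), 2):
        p = 0.5 * abs(sgn(pos1[j] - pos1[i]) - sgn(pos2[j] - pos2[i]))
        a = 0.5 * (abs(ind1[i] - ind2[i]) + abs(ind1[j] - ind2[j]))
        total += (lam * p ** r + (1.0 - lam) * a ** r) ** (1.0 / r)
    return 2.0 * total / (n * (n - 1))


def distance_matrix(universe, lam, r):
    """Full symmetric matrix of pairwise distances."""
    return [[pa_distance(a, b, lam, r) for b in universe] for a in universe]


if __name__ == "__main__":
    for lam in (0.5, 0.75):
        print(f"lambda = {lam}, r = 1")
        for row in distance_matrix(UNIVERSE, lam, 1.0):
            print("  ".join(f"{d:.3f}" for d in row))
```

For example, the distance between \( (2,1)\, \& \,(1,1)\) and \( (1,2)\, \& \,(1,1)\), which differ only in the preference part, is 0.5 for \(\,\lambda =0.5\,\) and 0.75 for \(\,\lambda =0.75\).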

Increasing the value of \(\,\lambda \,\) emphasizes the discordance in the preference part and modifies the relationships between the corresponding preference–approvals. Indeed, when \(\,\lambda =0.75\), there is an increase in the intensity of the distances at the top right-hand side of the graph, which concerns the triples

$$ \begin{aligned} (2,1)\, \& \,(1,1),\; (2,1)\, \& \,(0,1), \; (2,1)\, \& \,(0,0) \end{aligned}$$

and

$$ \begin{aligned} (1,2)\, \& \,(0,0),\; (1,2)\, \& \,(1,0),\; (1,2)\, \& \,(1,1). \end{aligned}$$

The hierarchical relationship between objects is reported in Fig. 4; the dendrograms show how the hierarchical clustering of the eight preference–approvals changes based on \(\,D^r_{\lambda }\).

Fig. 4 Hierarchical clustering dendrogram for 2 alternatives, \(\,r=1\), \(\,\lambda =0.5\,\) (left) and \(\,\lambda =0.75\,\) (right)

Figure 4 shows that the value of \(\lambda \) strongly influences the hierarchical aggregation of preference–approvals.

A similar analysis can be carried out by varying the value of r. Figure 5 shows the distances between the corresponding preference–approvals for \(\,r=2\,\) and \(\,\lambda =0.5\).

Fig. 5 Distance between preference–approvals for 2 alternatives, \(\,r=2\), \(\,\lambda =0.5\)

Compared to Fig. 2, Fig. 5 shows a general increase in distances caused by the increase in r. In particular,

$$\begin{aligned} D^2_{\lambda }\big ( (R_1,A_1), (R_2,A_2)\big )\ge D^1_{\lambda }\big ((R_1,A_1), (R_2,A_2)\big ), \end{aligned}$$

for all \(\,(R_1,A_1), (R_2,A_2) \in \mathcal {R}(X)\). This is due to h being increasing in r.

The dendrograms for the preference–approvals are reported in Fig. 6.

Fig. 6 Hierarchical clustering dendrogram for 2 alternatives, \(\,r=1\,\) (left), \(\,r=2\,\) (right) and \(\,\lambda =0.5\)

Figure 6 shows that an increase in r contributes differently (with respect to an increase in \(\lambda \)) to the change of the hierarchical aggregation structure. In fact, the two dendrograms merge preference–approvals in the same way. What changes is the “height” at which there is the aggregation or, in other words, the distance to be tolerated to aggregate two preference–approvals. Note that this happens only for two alternatives.

Tables 6, 7, 8 and 9 show the cophenetic correlation coefficient (Footnote 6; see Sokal & Rohlf, 1962; Schlee, 1973, pp. 278–284) between dendrograms, for \(\,n=2,\,3,\,4,\,5\,\) and \(\,\lambda =0.5\). The cophenetic coefficient was computed in R using the dendextend package (Galili, 2015).

Table 6 Cophenetic dendrogram correlations for \(\,n=2\), \(\, r=1,\,1.5,\,2,\,5,\,10\,\) and \(\,\lambda =0.5\)
Table 7 Cophenetic dendrogram correlations for \(\,n=3\), \(\, r=1,\,1.5,\,2,\,5,\,10\,\) and \(\,\lambda =0.5\)
Table 8 Cophenetic dendrogram correlations for \(\,n=4\), \(\, r=1,\,1.5,\,2,\,5,\,10\,\) and \(\,\lambda =0.5\)
Table 9 Cophenetic dendrogram correlations for \(\,n=5\), \(\,r=1,\,1.5,\,2,\,5,\,10\,\) and \(\,\lambda =0.5\)

Tables 678 and 9 show that dendrogram correlations are strictly related to the values of r and n. Overall, the correlations between dendrograms tend to decrease as r increases. This is especially evident when we examine the first column of each table, which reports the correlation between dendrograms obtained with \(r=1\) and dendrograms obtained with \(\,r=1.5,\,2,\,5,\,10\). In terms of the number of alternatives, it should be noted that as n increases, the dendrogram correlations generally decrease with an oscillatory trend.

In other words, Tables 6, 7, 8 and 9 highlight that the parameter r has a considerable influence, not only on the resulting values of the proposed distance \(D_{\lambda }^r\), but also on the cluster structure discovered among the observations of the preference–approvals universe. Specifically, as n increases and the expressiveness of the voters explodes (Table 1), so does the discriminating power of r, allowing different clustering structures to be highlighted. Indeed, the proposed family of distances \(\,D_{\lambda }^r\,\) is more flexible than the existing one, and this flexibility comes from a new parameter that can be exploited in various applications, such as maximizing the goodness of a clustering procedure.

To explore this issue further, let us consider a simulation study on the universe of 5 alternatives, which involves three steps:

  • generate four groups of clustered preference–approvals;

  • apply a hierarchical clustering algorithm for different values of r;

  • compute an external validation index, the Adjusted Rand index (Hubert & Arabie, 1985), to investigate which value of r maximises the similarity between the estimated and the theoretical clusters.

Therefore, we aim to find the value of r that provides more reliable clusters, i.e. clusters that are more consistent with the data-generating process.

The number of preference–approvals (on five alternatives) generated within each cluster was determined by randomly drawing four values from a normal distribution \(\,\mathcal {N}(50, 4)\,\) and converting them into integer numbers.

Orderings and approvals were generated individually and merged to produce the final set of preference–approvals. Specifically, orderings within each sub-partition were generated from a Mallows Model (Mallows, 1957), one of the earliest probability models suggested for rankings and still widely used in theoretical and applied research. It is an exponential model defined by a central permutation \(\sigma _0\) and a dispersion parameter \(\theta \). When \(\theta \ne 0\), \(\sigma _0\) represents the mode of the distribution, i.e., the permutation with the highest probability of being generated. The probability of any other ranking decays exponentially with increasing distance to the central permutation. The dispersion parameter \(\theta \) controls the steepness of this decline. The \(\theta \) values for our simulation studies are \(\,\{0,\,0.5,\,1,\,1.5,\,2\}\). Assuming that \(\sigma \) is a generic ranking, the probability of this ranking is a function of \(\theta \) and is given by:

$$\begin{aligned} \hbox {Pr}(\theta )=\frac{\exp (-\theta d(\sigma ,\sigma _0^{-1}))}{\psi (\theta )}, \end{aligned}$$
(17)

where d is a ranking distance measure and \(\psi (\theta )\) is a normalization constant.

We generated rankings assuming the Kemeny distance \(d_K\). The cluster central permutations, \(\sigma _0\), used in the analysis are reported in Table 10.
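As an illustration of the sampling step, the following Python sketch enumerates all rankings of five alternatives, weights each one as in Eq. (17), and draws from the resulting distribution. It is a sketch under stated assumptions rather than the code used in the study: the distance is the number of discordant pairs, to which the Kemeny distance is proportional for tie-free rankings (the constant can be absorbed into \(\theta \)); the distance is taken directly to the central ranking; and the ranking `center` is hypothetical, not one of those in Table 10.

```python
import math
import random
from itertools import combinations, permutations


def discordant_pairs(rank_a, rank_b):
    """Number of item pairs ordered oppositely by two tie-free rank vectors."""
    return sum(
        1
        for i, j in combinations(range(len(rank_a)), 2)
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0
    )


def mallows_pmf(center, theta):
    """Exact Mallows probabilities over all rankings of len(center) items."""
    universe = list(permutations(range(1, len(center) + 1)))
    weights = [math.exp(-theta * discordant_pairs(s, center)) for s in universe]
    psi = sum(weights)  # normalization constant psi(theta)
    return universe, [w / psi for w in weights]


def sample_mallows(center, theta, size, rng):
    """Draw `size` rankings from the Mallows model by exact enumeration."""
    universe, probs = mallows_pmf(center, theta)
    return rng.choices(universe, weights=probs, k=size)


center = (1, 2, 3, 4, 5)  # hypothetical central ranking, not from Table 10
rng = random.Random(0)
draws = sample_mallows(center, theta=1.0, size=50, rng=rng)
```

Enumeration is feasible here because five alternatives give only 120 tie-free rankings. With \(\theta =0\) every ranking receives the same probability, which is why the central permutation is the mode only when \(\theta \ne 0\).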

Table 10 Cluster central permutations

Approvals within each cluster are generated from four multinomial distributions, with probability vectors \(p_{ik}\) described in Table 11. Specifically, \(p_{ik}\) is the probability of drawing i approved alternatives in the k-th cluster.

Table 11 Multinomial probability vectors

After deriving clusters, the adjusted Rand index (Hubert & Arabie, 1985) is used to assess their goodness. The adjusted Rand index is a measure of the similarity between two clusterings; it is the corrected-for-chance version of the Rand index (Rand, 1971). The correction uses the expected similarity of all pair-wise comparisons between clusterings under a random model as a baseline. Although the Rand index can only take values between 0 and +1 (0 when the two clusterings do not agree on any pair of points, and 1 when they are exactly the same), the adjusted Rand index can return negative values when the agreement is lower than that expected under the random model.
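For reference, the adjusted Rand index can be computed from the contingency table of the two partitions. The Python sketch below follows the usual pair-counting formula; the label vectors are hypothetical and the function assumes at least two objects.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index of two partitions given as label sequences."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))  # contingency table
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Numbers of object pairs placed together in both / in each partition.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # chance-level agreement
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


# Hypothetical labels: the same partition up to a relabeling of the clusters.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Hypothetical labels: no pair of objects is grouped together in both.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Here `same` equals 1 because the index ignores cluster names, while `crossed` is negative because the agreement falls below the chance baseline.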

The results (Table 12) are obtained by averaging the adjusted Rand index over ten randomly generated datasets for each value of \(\theta \).

Table 12 Average adjusted Rand index over r and \(\theta \)

Table 12 shows that, except for the case \(\theta =1.5\), our measure \(\,D_{\lambda }^r\,\) with \(\,r\ne 1\,\) results in higher average adjusted Rand indices. Thus, \(\,r\ne 1\,\) allows the true clustered structure of the data to be recovered more accurately.

4.2 A real data application

This subsection shows how the proposed metric can be used to perform cluster analysis on real data retrieved from the Eurobarometer website.

Since 1973, Eurobarometer has undertaken a series of public opinion polls on behalf of the European Commission and other European Union (EU) institutions. These polls cover a wide range of topics concerning the EU and its member countries. The data used in this analysis are from question Q5 of the poll titled “Defending Democracy, Empowering citizens. Public Opinion at the legislature’s midpoint” (Footnote 7).

A group of voters, divided by country, was asked to indicate which of the following values the European Parliament should defend as a matter of priority:

  • \(x_1\): Equality between women and men.

  • \(x_2\): The fight against discrimination and for the protection of minorities.

  • \(x_3\): Tolerance and respect for diversity in society.

  • \(x_4\): Solidarity between EU Member States and between its regions.

  • \(x_5\): Solidarity between the EU and poor countries in the world.

  • \(x_6\): The protection of human rights in the EU and worldwide.

  • \(x_7\): Freedom of religion and belief.

  • \(x_8\): Freedom of movement.

  • \(x_9\): Freedom of speech and thought.

As a result, data are stored in a table (see Table 14) with 27 rows (one row for each EU member country) and 9 columns (each column representing an alternative of \(\,X=\{x_1, \dots , x_9\}\)). The total number of votes cast by the i-th country in favor of the j-th alternative is shown in the table’s generic cell \(\,ij\).

In order to transform the original table into a set of preference–approvals, preferences and approvals need to be derived. For each country, the alternatives are ranked in order of popularity, beginning with the one that received the most votes and ending with the one that received the fewest. Furthermore, in order to generate a vector of approvals, those alternatives that received more votes than the national average were deemed acceptable.
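The transformation just described can be sketched in Python as follows. This is an illustration under stated assumptions: tied vote counts receive the average of the positions they occupy (the convention that matches the worked example for France), an alternative is approved only when its count is strictly above the mean, and the vote counts in `votes` are hypothetical.

```python
def to_preference_approval(votes):
    """Turn one country's vote counts into (positions, approvals).

    positions[j] is the position of alternative j (1 = most voted), with
    tied alternatives sharing the average of the positions they occupy.
    approvals[j] is 1 when alternative j received more votes than the mean.
    """
    mean = sum(votes) / len(votes)
    positions = []
    for v in votes:
        higher = sum(1 for w in votes if w > v)
        tied = sum(1 for w in votes if w == v)  # counts the alternative itself
        # Tied alternatives occupy positions higher+1, ..., higher+tied.
        positions.append(higher + (tied + 1) / 2)
    approvals = [1 if v > mean else 0 for v in votes]
    return positions, approvals


# Hypothetical vote counts for six alternatives (not Eurobarometer data).
votes = [30, 10, 20, 5, 20, 15]
positions, approvals = to_preference_approval(votes)
```

With these counts the mean is about 16.67, so the alternatives with 30 and 20 votes are approved, and the two alternatives tied at 20 votes share position 2.5.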

For example, in Table 13 we show the votes expressed in France (the votes of all countries are included in Table 14).

Table 13 Votes in France

Since the votes’ average is 19.44, the votes in France are transformed into a preference–approval codification [see Eq. (4)] as

$$ \begin{aligned} (1, 5, 3.5, 8, 6, 3.5, 8, 8, 2) \, \& \, (1, 0, 1, 0, 0, 1, 0, 0, 1) \end{aligned}$$

that can be visualized as follows:

$$\begin{aligned} \begin{array}{c} x_1 \\ x_9 \\ x_3 \; x_6\\ \hline x_2 \\ x_5 \\ x_4 \; x_7 \; x_8.\\ \end{array} \end{aligned}$$
Table 14 Votes in the EU

To run the cluster analysis, the \(\,27\times 27\,\) distance matrix was constructed using Eq. (11). All the alternatives seem important in this example, so a distinction between acceptable and unacceptable alternatives should not be interpreted as a distinction between valuable and not valuable, but instead as a distinction between more and less urgent. For this reason, \(\,\lambda =0.75\,\) was chosen in order to emphasize preference differences more than approvals.

A cluster-wise measure of cluster stability (Hennig, 2007) is used to jointly discover the optimal value of r and the optimal number of clusters k. Stability refers to the property of a meaningful and valid cluster that does not change easily when the data set is perturbed in a non-essential way. That is, when applied to many datasets collected from the same data distribution, a reliable clustering method should produce similar partitions. The cluster stability method (Hennig, 2007) employs three steps:

  1. use various strategies to resample new data sets from the original and apply the hierarchical clustering method to each of them;

  2. for every given original cluster, find the most similar cluster using the Jaccard coefficient (Jaccard, 1901) in the new data set and record the similarity value;

  3. assess the cluster stability of every single cluster by the mean similarity taken over the resampled data sets.
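The three steps above can be sketched in Python as follows. This is a simplified illustration, not Hennig's implementation: it uses only bootstrap resampling, drops duplicated points, and compares each original cluster, restricted to the resampled points, with the new clusters. The function `toy_cluster_fn` is a hypothetical stand-in for the hierarchical clustering step, and the data in `items` are invented.

```python
import random


def jaccard(a, b):
    """Jaccard coefficient of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def clusterwise_stability(items, cluster_fn, n_resamples, rng):
    """Mean best-match Jaccard similarity of each original cluster."""
    original = [set(c) for c in cluster_fn(items)]
    totals = [0.0] * len(original)
    for _ in range(n_resamples):
        # Step 1: bootstrap resample (duplicates dropped) and re-cluster.
        sample = sorted(set(rng.choices(items, k=len(items))))
        new_clusters = cluster_fn(sample)
        present = set(sample)
        for idx, cluster in enumerate(original):
            # Step 2: most similar new cluster, comparing only resampled points.
            totals[idx] += max(jaccard(cluster & present, c) for c in new_clusters)
    # Step 3: average the recorded similarities over the resamples.
    return [t / n_resamples for t in totals]


def toy_cluster_fn(values):
    """Hypothetical clustering rule: split one-dimensional values at 5."""
    low = {v for v in values if v < 5}
    high = {v for v in values if v >= 5}
    return [c for c in (low, high) if c]


items = [1, 2, 3, 10, 11, 12]  # invented, well-separated data
stability = clusterwise_stability(items, toy_cluster_fn, n_resamples=100,
                                  rng=random.Random(0))
```

Because the toy data are well separated, both clusters are recovered in almost every resample and their stability is close to 1. In the application, the clustering rule would instead depend on \(D_{\lambda }^r\) and on the number of clusters k.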

The average cluster-wise stability is shown in Fig. 7 as a function of r (for \(k=2,3,4\) clusters). The procedure suggests that the most stable cluster configuration is \(k=2\) and \(r=2\). It is worth noting that, regardless of the value of \(\,k\), \(\,r>1\,\) always leads to improved cluster stability: with two clusters (\(k=2\)) the value of r that maximizes stability is \(\,r=2\), whereas with three or four clusters the optimal solution is \(\,r=4\). In addition, as the number of clusters k increases, the average stability decreases.

Fig. 7 Average cluster-wise stability over r

For several reasons, stability is a particularly relevant cluster validation measure in this example for determining the best value of r. First, it is not possible to use external validation measures in this case, as the true clustered structure of the EU countries is not known. At the same time, most internal validation measures employ the distance between observations (\(D_{\lambda }^r\)) to assess the goodness of clusters. This may be an issue in our case, since that distance is itself influenced by r. Therefore, to determine which value of r yields more accurate clusters, a measure that is independent of r is desirable. Furthermore, cluster stability has been examined both theoretically and practically (Hennig, 2007; Von Luxburg, 2010; Ullmann et al., 2022), and it has been shown to be capable of distinguishing meaningful, stable clusters from spurious ones.

Figures 8 and 9 show the resulting dendrogram and clusters, respectively, obtained with \(\,k=2\,\) and \(\,r=2\).

Fig. 8 EU cluster dendrogram

Fig. 9 Map of EU voters with clusters

The clustering procedure suggests that the EU countries can be separated into two large groups. Cluster 1 is mainly made up of Western European countries, whereas Cluster 2 consists mainly of Eastern European countries.

To provide a more in-depth picture of how the EU countries express their views on the nine alternatives proposed, the two preference–approvals that represent the two clusters, which we call representative preference–approvals, are shown in Eq. (18).

To obtain the representative preference–approvals that summarize each cluster, preferences and approvals need to be aggregated. In each cluster, the set of preferences is combined into a unique weak order by deriving the average position for each alternative and ranking them according to it. Note that this aggregation method is equivalent to the Borda count (Borda, 1781) extended to weak orders (see Smith, 1973; Black, 1976; Cook & Seiford, 1982).

In our example, for each country, the extended Borda count assigns to each alternative a score equal to the number of alternatives ranked below it plus half the number of alternatives that are indifferent to it:

$$\begin{aligned} B_R(x_i)=\#\left\{ x_k\in X \mid x_i \succ x_k\right\} + \frac{1}{2} \cdot \#\left\{ x_k\in X \setminus \{x_i\} \mid x_i \sim x_k\right\} . \end{aligned}$$

Similarly, the set of approvals is combined into a unique approval vector by taking the average approval for each alternative, and then considering those alternatives whose average approval is greater than 0.5 as approved.

$$\begin{aligned} \begin{array}{c} \text{ Cluster } 1\\ x_1 \,x_9 \\ x_6\\ \hline x_4\\ x_2\\ x_3 \\ x_8\\ x_5 \\ x_7\\ \end{array} \qquad \qquad \begin{array}{c} \text{ Cluster } 2\\ x_6 \\ x_9\\ x_8\\ x_4\\ \hline x_1 \, x_3 \\ x_2\\ x_5\\ x_7\\ \end{array} \end{aligned}$$
(18)
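The aggregation that leads to the representative preference–approvals can be sketched in Python as follows. It is an illustration under stated assumptions: each country is given as a pair of position and approval vectors, ties in the total Borda score share the average of the positions they occupy, and the three-voter `profile` is hypothetical.

```python
def borda_scores(positions):
    """Extended Borda score of each alternative in one weak order.

    positions[i] is the position of alternative i (smaller is better). The
    score counts the alternatives ranked below i plus half of those tied.
    """
    scores = []
    for i, p_i in enumerate(positions):
        below = sum(1 for p_k in positions if p_k > p_i)
        tied = sum(1 for k, p_k in enumerate(positions) if k != i and p_k == p_i)
        scores.append(below + 0.5 * tied)
    return scores


def representative(profile):
    """Aggregate a list of (positions, approvals) pairs into one pair."""
    m = len(profile)
    n = len(profile[0][0])
    all_scores = [borda_scores(positions) for positions, _ in profile]
    totals = [sum(scores[i] for scores in all_scores) for i in range(n)]

    # Weak order induced by the totals: a higher total means a better position.
    rep_positions = []
    for t in totals:
        higher = sum(1 for u in totals if u > t)
        tied = sum(1 for u in totals if u == t)  # counts the alternative itself
        rep_positions.append(higher + (tied + 1) / 2)

    # An alternative is approved when more than half of the voters approve it.
    rep_approvals = [
        1 if sum(approvals[i] for _, approvals in profile) / m > 0.5 else 0
        for i in range(n)
    ]
    return rep_positions, rep_approvals


# Hypothetical cluster of three voters on four alternatives.
profile = [
    ([1, 2, 3, 4], [1, 1, 0, 0]),
    ([2, 1, 3.5, 3.5], [1, 1, 0, 0]),
    ([1, 3, 2, 4], [1, 0, 1, 0]),
]
rep_positions, rep_approvals = representative(profile)
```

For a single weak order on n alternatives, the extended Borda score of an alternative equals n minus its position, which is why ranking by average position and ranking by total Borda score give the same weak order.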

It is worth noting that \(\,x_6\,\) and \(\,x_9\), namely, “The protection of human rights in the EU and worldwide” and “Freedom of speech and thought”, respectively, are above the approval line in the two representative preference–approvals, indicating that they can be considered very urgent. Regarding \(\,x_1\), that is “Equality between women and men”, it is ranked at the top of the representative preference–approval of Cluster 1, while it is just below the approval line in the Cluster 2 representative preference–approval. Similarly, \(\,x_4\), that is “Solidarity between EU Member States and between its regions”, is ranked fourth (above the approval line) in Cluster 2, but it is the first alternative below the approval line in Cluster 1. Furthermore, Cluster 2 prioritizes \(\,x_8\), that is “Freedom of movement”, which is near the bottom of the preference–approval of Cluster 1. Finally, in both representative preference–approvals, \(x_7\), that is “Freedom of religion and belief”, is ranked last.

Table 15 reports the \(D_{0.75}^2\) distances of each country to the representative cluster preference–approvals.

Table 15 Distance between countries and representative cluster preference–approvals

It should be noted that, except for Greece, each country is closer to the representative preference–approval of its own cluster than to that of the other cluster. Despite being reasonable, this result is not trivial since the technique for obtaining the cluster preference–approval does not involve \(D_{\lambda }^r\).

Some countries can be considered central in their clusters as they are very close to the representative preference–approval, e.g. Belgium (0.092), Austria (0.096), Malta (0.096) for Cluster 1, and the Czech Republic (0.036), Lithuania (0.094), Hungary (0.072) and Slovenia (0.105) for Cluster 2. As a rule of thumb, the greater a country's distance from the preference–approval of its own cluster, the more the country disagrees with the other countries in that cluster. Finally, it is worth noting that some countries, such as Ireland, Italy and Greece, are located between the two clusters, as they have similar distances to the two cluster preference–approvals.

5 Concluding remarks

In social choice theory, preference rankings and approvals are two popular ways to collect the preferences of a group of agents on a set of alternatives. In the preference–approval setting, each agent, in addition to ordering a set of alternatives from best to worst, submits a cut-off line to distinguish between acceptable and unacceptable. Within this framework, in this paper, we propose a new distance for preference–approvals, following the approach of the Kemeny distance.

Given two preference–approvals and two alternatives, we introduce two indices that measure the discordances between these alternatives with respect to preference and approvals, and an aggregation function belonging to the class of weighted power means to define a new distance. This new distance depends on two parameters. The effect of these parameters on the distance is analyzed and described through some heatmaps. The proposed distance can be used to study the universe of preference–approvals and to determine clusters of voters: how the two parameters characterizing the distances affect the clustering process is shown with some dendrograms and by the cophenetic correlations among them. We have shown that the new distance family offers some advantages compared to the existing distance function. Specifically, through a simulation study and the adjusted Rand index, we have shown that, in all but one of the dispersion settings considered, \(D_{\lambda }^r\) with \(r\ne 1\) allows the true clustered structure of data to be found more accurately. Similarly, through a cluster-wise stability index, we have shown that \(D_{\lambda }^r\) with \(r\ne 1\) produces more stable clusters on the real data example.

In future work, axiomatizing the new family of distance functions might prove important.

Moreover, future research should examine consensus measures based on distances between preference–approvals (see Erdamar et al., 2014), algorithms to determine representative preference–approvals efficiently (see D’Ambrosio, 2017), clustering on alternatives (see González del Pozo et al., 2017), and also consensus-reaching processes (see Palomares et al., 2014; García-Lapresta & Pérez-Román, 2017; Chao et al., 2021, among others).