
1 The Classic Kemeny’s Median

The most popular aggregation principle is the majority rule. It has been proved that if the majority relation is transitive, then it is the median: for a set of rankings, this median is itself a ranking with the least total distance to the others. This median is known as the Kemeny's median. There are different algorithms to find the Kemeny's median; one version of the locally optimal algorithm calculates a special loss matrix.

Let \( A = \{ a_1 , \ldots ,a_N \} \) be an unordered set of alternatives. A ranking P is defined as an ordering of the alternatives by preference (\( \succ \)) and indistinguishability (\( \sim \)) relations between them. The ranking P can therefore be presented by the relation matrix \( M(N,N) \) with elements

$$ m_{ij} = \left\{ {\begin{array}{*{20}c} {\;1,\;a_i \succ a_j } \\ {\;0,\;a_i \sim a_j } \\ { - 1,\;a_i \prec a_j \;.} \\ \end{array} } \right. $$

Let two rankings \( P_u \) and \( P_v \) be presented by relation matrices \( M_u \) and \( M_v \), with the distance between them, under the same numeration of A, defined as

$$ d(P_u ,P_v ) = \frac{1}{2}\sum_{i = 1}^N {\sum_{j = 1}^N {|m_{ij}^u - m_{ij}^v |} } {\mkern 1mu} . $$
(1)

It is known that (1) is a metric for binary relations in (quasi)ordinal scales, i.e. for rankings [1].
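As an illustration, the relation matrix and the distance (1) can be computed directly. The sketch below is not from the paper; it assumes a ranking is encoded as a list of tiers, best tier first, so that `[['a'], ['b', 'c']]` means \( a \succ b \sim c \):

```python
def relation_matrix(tiers):
    """Relation matrix M of a ranking given as tiers, best tier first.

    [['a'], ['b', 'c']] encodes a > b ~ c; m[(ai, aj)] is in {1, 0, -1}.
    """
    level = {alt: t for t, tier in enumerate(tiers) for alt in tier}
    alts = sorted(level)
    m = {}
    for ai in alts:
        for aj in alts:
            if level[ai] < level[aj]:
                m[(ai, aj)] = 1        # ai is preferred to aj
            elif level[ai] > level[aj]:
                m[(ai, aj)] = -1
            else:
                m[(ai, aj)] = 0        # same tier (or ai == aj)
    return m


def kemeny_distance(mu, mv):
    """Distance (1): half the sum of absolute element-wise differences."""
    return sum(abs(mu[key] - mv[key]) for key in mu) / 2
```

For example, the rankings \( a \succ b \succ c \) and \( b \succ a \succ c \) differ only in the pair (a, b), which contributes \( |1 - (-1)| \) twice, giving a distance of 2.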

Let n individual preferences (rankings) be given. The task is to find a group relation P coordinated in a certain way with the relations \( P_1 , \ldots ,P_n \). Methods to find a group relation are usually called concordance principles. It has been proved that the transitive group relation (or one transformed to it) is the median, specifically the Kemeny's one. The median P* is the ranking with the least total distance to the other rankings:

$$ P^{*} = \mathop {\arg \hbox{min} }\limits_P \sum_{u = 1}^n {d(P,P_u )} . $$
(2)

One version of the locally optimal algorithm to find the Kemeny's median calculates the loss matrix \( Q(N,N) \) for N alternatives [2].

Let some ranking P and the experts' rankings \( P_1 , \ldots ,P_n \) be presented by relation matrices M and \( M_1 , \ldots ,M_n \). The total distance from P to all the other rankings is defined by

$$ \sum_{u = 1}^n {d(P,P_u ) = } \frac{1}{2}\sum_{i = 1}^N {\sum_{j = 1}^N {\sum_{u = 1}^n {d_{ij} (P,P_u )} } } $$

with partial distances defined, under the condition \( m_{ij} = 1 \), as

$$ d_{ij} (P,P_u ) = \;|m_{ij} - m_{ij}^u |\; = \left\{ {\begin{array}{*{20}c} {0,\;m_{ij}^u = 1\;\;\;\;} \\ {1,\;\;m_{ij}^u = 0\;\;\;} \\ {2,\;m_{ij}^u = - 1\;} \\ \end{array} } \right.. $$

The loss matrix element \( q_{ij} = \sum_{u = 1}^n {d_{ij} (P,P_u )} \) defines the total mismatch losses of the preference \( a_i \succ a_j \) in the unknown ranking P relative to corresponding preferences in rankings \( P_1 , \ldots P_n \).

The Kemeny's algorithm finds the ordering of alternatives that minimizes the sum of the elements above the main diagonal of the loss matrix Q, and consists of the following steps.

The matrix Q is reduced step by step by eliminating the row (and the corresponding column) with the minimal sum of losses; the corresponding alternative takes the last place in the ordered sequence of alternatives. The result is the so-called ranking PI. Then, in a sequential reverse scan, the correspondence between the preferences in the ranking (\( a_i \succ a_j \)) and the penalties in the loss matrix (\( q_{ij} \le q_{ji} \)) is checked. In case of a violation, the current pair of alternatives is reversed. The final result is the so-called ranking PII, i.e. the Kemeny's median [2].
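The two stages above can be sketched in Python. This is a hypothetical, simplified implementation, not the authors' code: it handles strict rankings only, and the reverse repair scan is repeated until no violation \( q_{ij} > q_{ji} \) remains among adjacent pairs, which a single pass does not guarantee:

```python
def strict_relation_matrix(order):
    """Relation matrix of a strict ranking given as an ordered list, best first."""
    rank = {alt: i for i, alt in enumerate(order)}
    return {(ai, aj): (rank[aj] > rank[ai]) - (rank[aj] < rank[ai])
            for ai in order for aj in order}


def loss_matrix(relation_matrices, alts):
    """q_ij: total losses of asserting a_i > a_j against all experts' rankings."""
    q = {(ai, aj): 0 for ai in alts for aj in alts}
    for m in relation_matrices:
        for ai in alts:
            for aj in alts:
                if ai != aj:
                    q[(ai, aj)] += abs(1 - m[(ai, aj)])  # 0, 1 or 2
    return q


def kemeny_locally_optimal(q, alts):
    """Stage 1 builds the ranking P_I by row elimination; stage 2 repairs it."""
    remaining, order = list(alts), []
    while remaining:
        # the alternative with the minimal row sum of losses takes the last place
        worst = min(remaining,
                    key=lambda ai: sum(q[(ai, aj)] for aj in remaining if aj != ai))
        order.insert(0, worst)
        remaining.remove(worst)
    changed = True
    while changed:  # reverse scans until q_ij <= q_ji holds for adjacent pairs
        changed = False
        for i in range(len(order) - 2, -1, -1):
            ai, aj = order[i], order[i + 1]
            if q[(ai, aj)] > q[(aj, ai)]:
                order[i], order[i + 1] = aj, ai
                changed = True
    return order
```

For three experts with rankings \( a \succ b \succ c \), \( a \succ b \succ c \), and \( b \succ a \succ c \), the repaired result is \( a \succ b \succ c \), in agreement with the majority preferences.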

2 The Kemeny’s Metric Median

Let the distance matrix \( D(n,n) \) be calculated between all pairs of rankings according to (1). With it, the set of rankings \( P_1 , \ldots ,P_n \) is treated as an unordered set of points immersed in a Euclidean space.

If no metric violations occur in the set configuration, it is known that the matrix of scalar products calculated relative to some origin is positive definite, or at least positive semidefinite [3].

According to Torgerson's method of principal projections [4], it is convenient to use the central element of the given set as the origin. This central element P0 can be presented by its distances to the other rankings:

$$ d^2 (P_0 ,P_i ) = d_{0i}^2 = \frac{1}{n}\sum_{p = 1}^n {d_{ip}^2 } - \frac{1}{2n^2 }\sum_{p = 1}^n {\sum_{q = 1}^n {d_{pq}^2 } } \;,\,\,\,i = 1, \ldots \;n. $$
(3)

The central element P0 is the least distant from the other elements of the set, like the median P*, and must satisfy (2) as well. Nevertheless, the median P* is represented both by its distances (1) to the other rankings and by the ranking itself, while the central element P0 is represented only by the distances (3) and does not exist as a ranking.
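Formula (3) maps the pairwise distance matrix D directly to the squared distances from the center. A minimal sketch (pure Python, with D given as a list of rows; not from the paper):

```python
def center_distances(D):
    """Squared distances d_{0i}^2 from the set center P0 to each ranking,
    computed from the pairwise distance matrix D by formula (3)."""
    n = len(D)
    total = sum(D[p][q] ** 2 for p in range(n) for q in range(n))
    return [sum(D[i][p] ** 2 for p in range(n)) / n - total / (2 * n * n)
            for i in range(n)]
```

For two rankings at distance 2, the center is the midpoint, at distance 1 from each of them.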

From the mathematical point of view, these points must coincide: \( P^{*} = P_0 \). In this case, well-known clustering and machine learning algorithms (for example, k-means) developed for pairwise comparisons [5, 6] can be correctly used instead of discrete optimization ones for ordinal scales.

A monotonic transformation is admissible for ordinal scales and consists in relocating alternatives on a numerical axis without changing their ordering relative to each other. In general, the center P0 and the ranking P* are both presented by their own distances to the other elements \( P_1 , \ldots ,P_n \). Based on a monotonic transformation, we can show that the ranking P* is equivalent by distances to the center P0.

We denote by the Kemeny's metric median \( P_0^\ast \) the Kemeny's median P* having the same distances to the other rankings as the center P0. We use the Kemeny's algorithm to find it.

Let the points Pu, P0, and P* be given, where \( \delta = d(P_0 ,P_u ) - d(P^{*},P_u ) \ne 0 \). There are two cases for the value of \( \delta \), if P0 and P* are similar to each other as rankings [7].

  1. The case of \( \delta > 0 \). To compensate for this difference, it is necessary to distribute this value uniformly among all nonzero elements of Mu and to calculate the new relation matrix \( \tilde{M}_u \) with elements

$$\tilde{m}_{{ij}}^{u} = \left\{ {\begin{array}{*{20}c} { + 1 + 2\delta /k{\mkern 1mu} ,} & {a_{i} \succ a_{j} \;} \\ {0{\mkern 1mu} ,} & {a_{i} \sim a_{j} \;} \\ { - 1 - 2\delta /k{\mkern 1mu} ,} & {a_{i} \prec a_{j} \;,} \\ \end{array} } \right.$$

where \( k = N^2 - N - N_0 \) is the total number of nonzero elements in Mu outside the main diagonal, and N0 is the number of zero off-diagonal elements \( m_{ij}^u = 0 \) in Mu. This new relation matrix \( \tilde{M}_u \) does not change the expert's ranking Pu.

The value \( 2\delta /k \) is used since in (1) each difference is counted twice. Only nonzero elements need to be corrected, since each zero element indicates that two alternatives share the same place in the expert's ranking. Hence, the distance between the modified median and the other ranking increases to compensate for \( \delta > 0 \). In this case, for \( m_{ij} = 1 \) the preference \( a_i \succ a_j \) in the unknown ranking P is penalized by the expert's ranking Pu to form the partial distances

$$ d_{ij} (P,P_u ) = \left\{ {\begin{array}{*{20}c} {0 + 2\delta /k{\mkern 1mu} ,\;\tilde{m}_{ij}^u = + 1 + 2\delta /k\;} \\ {\;\;\;1{\mkern 1mu} ,\;\tilde{m}_{ij}^u = {\mkern 1mu} {\mkern 1mu} {\mkern 1mu} {\mkern 1mu} {\mkern 1mu} 0\;\;\;\;} \\ {2 + 2\delta /k{\mkern 1mu} ,\;\tilde{m}_{ij}^u = - 1 - 2\delta /k\;.} \\ \end{array} } \right. $$
  2. The case of \( - 1 < 2\delta /k^{\prime} < 0 \). To remove this difference, it is also necessary to distribute this value uniformly among the nonzero elements of Mu. Additionally, the number of modified elements in Mu is decreased to \( k^{\prime} = k - \Delta k \), where \( \Delta k \) is the number of coinciding elements \( m_{ij}^u = m_{ij}^\ast \) in the relation matrices Mu and M*. Indeed, the partial distance is zero, \( d_{ij} (P^{*},P_u ) = 0 \), in this case, and any negative additive to \( m_{ij}^u \) necessarily increases the distance between Pu and P*.

To keep the expert's ranking Pu unchanged, it is necessary to calculate the new relation matrix \( \tilde{M}_u \) with elements

$$ \tilde{m}_{ij}^u = \left\{ {\begin{array}{*{20}c} { + 1 - |2\delta /k^{\prime}|{\mkern 1mu} ,\;a_i \succ a_j } \\ {\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;0{\mkern 1mu} ,\;a_i \sim a_j } \\ { - 1 + |2\delta /k^{\prime}|{\mkern 1mu} ,\;a_i \prec a_j } \\ {\;\;\;\;\;\;\;\;\;m_{ij}^\ast {\mkern 1mu} ,\;m_{ij}^u = m_{ij}^\ast \,.} \\ \end{array} } \right. $$

For \( m_{ij} = 1 \), the preference \( a_i \succ a_j \) in the unknown ranking P is penalized by the expert's ranking Pu to form the partial distances

$$ d_{ij} (P,P_u ) = \left\{ {\begin{array}{*{20}c} {0{\mkern 1mu} ,\;\tilde{m}_{ij}^u = 1\;\;\;\;} \\ {1{\mkern 1mu} ,\;\tilde{m}_{ij}^u = 0\;\;\;} \\ {0 + |2\delta /k'|{\mkern 1mu} ,\;\tilde{m}_{ij}^u = + 1 - |2\delta /k'|\;} \\ {2 - |2\delta /k'|{\mkern 1mu} ,\;\tilde{m}_{ij}^u = - 1 + |2\delta /k'|.} \\ \end{array} } \right. $$
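The correction for the first case (\( \delta > 0 \)) can be sketched as follows. This is a hypothetical Python sketch, not the authors' code: relation matrices are stored as dicts over ordered pairs, and the function name is our own label. Every nonzero element \( \pm 1 \) becomes \( \pm (1 + 2\delta /k) \), which raises each of the k affected terms of (1) by \( 2\delta /k \), and hence raises the distance to any ranking with elements in {1, 0, −1} by exactly \( \delta \):

```python
def stretch_relation_matrix(m, delta):
    """Case delta > 0: add 2*delta/k to every nonzero off-diagonal element
    of M_u, keeping the signs, so the expert's ordering is unchanged."""
    k = sum(1 for (ai, aj), v in m.items() if ai != aj and v != 0)
    shift = 2.0 * delta / k
    return {key: (v + shift if v > 0 else v - shift if v < 0 else 0.0)
            for key, v in m.items()}
```

A quick check with two strict rankings of three alternatives confirms that the distance (1) from the modified expert matrix to any fixed relation matrix grows by exactly \( \delta \).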

As a result, the metric median has the following main properties. In the general case, the Kemeny's metric median does not necessarily coincide with the classic Kemeny's median as a binary relation. This is not a contradiction, since the metric median only refines the classic median. Namely, previously indistinguishable alternatives usually become distinguishable, while the positions of previously distinguishable alternatives do not change relative to each other.

The metric median as the ranking \( P_0^\ast \) can be finally presented by its relation matrix \( M_0^\ast \) with distances (1) to the other rankings presented by their relation matrices \( M_u ,\;u = 1, \ldots ,n \). Unfortunately, such distances again differ from the distances of the center P0 to the other rankings. In order to preserve the "metric" distances, it is necessary to keep all the modified experts' relation matrices \( \tilde{M}_u ,\;u = 1, \ldots ,n \).

This is not always convenient. Moreover, changes in the expert group (for example, a reduction by one expert) change the Kemeny's median and, consequently, the modified relation matrices of the remaining experts.

3 The Kemeny’s Weighted Median

Usually, weights are interpreted as experts' competences and determine the proportion in which their opinions are taken into account in the group decision. Different techniques have been developed to determine a competence level, for example in [8].

Let \( w_1 \ge 0, \ldots ,w_n \ge 0 \), where \( \sum_{u = 1}^n {w_u = 1} \), be the weights of the expert opinions. An element \( q_{ij} \) is defined as \( q_{ij} = \sum_{u = 1}^n {w_u d_{ij} (P,\;P_u )} \) for the weighted loss matrix \( Q(N,N) = \sum_{u = 1}^n {w_u Q_u (N,N)} \). In this case, the problem (2) is actually solved with weighted partial distances defined, under the condition \( m_{ij} = 1 \), as

$$ \tilde{d}_{ij} (P,P_u ) = \;w_u d_{ij} (P,P_u ) = w_u |m_{ij} - m_{ij}^u |\; = \left\{ {\begin{array}{*{20}c} {\;\;\;0,\;m_{ij}^u = 1\;\;\;\;} \\ {\;w_u ,\;m_{ij}^u = 0\;\;\;} \\ {2w_u ,\;m_{ij}^u = - 1\;} \\ \end{array} } \right.. $$

Any ranking within the convex hull of the given set \( P_1 , \ldots ,P_n \) can be found by a linear combination of the individual rankings as elements of a discrete set. In particular, any individual ranking can be obtained as the Kemeny's median \( P^{*} = P_i \) for the weights \( w_i = 1 \), \( w_{j \ne i} = 0 \), \( j = 1, \ldots ,n \).

The classic Kemeny's median is found with the weights \( w_i = 1/n,\;i = 1, \ldots ,n \). Moreover, the Kemeny's median is the same for any constant weights \( w_i = \mathrm{const} > 0 \), and is inverted for \( w_i = \mathrm{const} < 0,\;i = 1, \ldots ,n \).
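The weighted loss matrix can be built as a weighted version of the earlier per-expert losses. A hypothetical sketch (relation matrices as dicts over ordered pairs, as above; not from the paper): with weight 1 on a single expert it reproduces that expert's own loss matrix, recovering \( P^{*} = P_i \), while uniform weights \( 1/n \) reproduce the classic loss matrix up to the constant factor \( 1/n \):

```python
def weighted_loss_matrix(relation_matrices, weights, alts):
    """Weighted losses q_ij = sum_u w_u * d_ij(P, P_u) for asserting a_i > a_j."""
    q = {(ai, aj): 0.0 for ai in alts for aj in alts}
    for w, m in zip(weights, relation_matrices):
        for ai in alts:
            for aj in alts:
                if ai != aj:
                    q[(ai, aj)] += w * abs(1 - m[(ai, aj)])  # per-expert 0, 1 or 2
    return q
```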

The problem of finding the Kemeny's weighted median is formulated as follows. Let the rankings be presented by the distance matrix \( D(n,n) \), the center P0 by the distances \( d(P_0 ,P_u ),\;u = 1, \ldots ,n \), and the Kemeny's median by the distances \( d(P^{*},P_u ),\;u = 1, \ldots ,n \). These distances define the deviations \( \delta_u = d(P_0 ,P_u ) - d(P^{*},P_u ) \), \( u = 1, \ldots ,n \).

4 Weights Searching

The task is to solve the optimization problem under constraints:

$$ \sum_{u = 1}^n {\delta_u^2 \to \hbox{min} } ,\,\,\sum_{u = 1}^n {w_u } = 1,\;w_u \ge 0,\;u = 1, \ldots \,n. $$

The procedure is based on the Gauss-Seidel coordinate descent algorithm, where each expert determines a "coordinate" axis. Changing the weight of an expert's individual loss matrix \( Q_u (N,N) \) in the range from 0 to 1 is treated as a variation along the corresponding coordinate axis.

At the beginning, all n experts' weights are varied from \( 1/(n - 1), \ldots ,0, \ldots ,1/(n - 1) \), corresponding to a completely excluded loss matrix Qu, to the weights 0, …, 1, …, 0, corresponding to the loss matrix Qu alone. Each step ends, after the weight variations for all experts, with selecting the "optimal" expert with the number u* that yields the minimal total deviation \( \sum_{u = 1}^n {\delta_u^2 } \).

At the next step, the weight of the loss matrix Qu varies in the interval \( 0 \le p \le 1 \) and defines the normalized weight \( w_u = p \). The other loss matrices have constant weights \( q_i ,\;i = 1, \ldots ,n \), \( i \ne u \), with the constant sum \( V = \sum_{i = 1}^n {q_i } ,\;i \ne u \).

The normalized weights of the loss matrices for the other rankings vary in the range from \( w_i = q_i /V \) to \( w_i = 0 \), taking the values \( w_i = q_i (1 - p)/V \) within the variation range. The initial weights \( p = q_u \), \( q_i ,\;i = 1, \ldots ,n \), \( i \ne u \) correspond to the inner point of the variation interval \( w_i = q_i ,\;i = 1, \ldots ,n \). The Kemeny's median is calculated for the linear combination of losses \( Q(N,N) = \sum_{u = 1}^n {w_u Q_u (N,N)} \), and the deviations \( \delta_u ,\;u = 1, \ldots ,n \) are determined.
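The weight variation at a single coordinate step can be sketched as follows (a hypothetical helper, not from the paper; `base` holds the current weights \( q_i \), and the function returns the normalized profile \( w_u = p \), \( w_i = q_i (1 - p)/V \)):

```python
def weight_profile(base, u, p):
    """Set expert u's weight to p and rescale the other weights
    proportionally, so that w_u = p, w_i = q_i * (1 - p) / V,
    and the whole profile still sums to 1."""
    V = sum(w for i, w in enumerate(base) if i != u)
    return [p if i == u else w * (1.0 - p) / V for i, w in enumerate(base)]
```

Sweeping p from 0 to 1 thus moves from the profile with expert u completely excluded to the profile where only the loss matrix Qu is used.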

In this problem statement, the solution does not guarantee that the difference between the distances from the center P0 and from the weighted median \( P_0^w \) to the other rankings becomes zero. In that case, it is necessary to take this weighted median as the original one and find the metric median by modifying the experts' relation matrices.

5 Conclusion

The considered types of the Kemeny's median allow correct solving of important tasks that arise in the rank aggregation problem: coordination of individual rankings, identification of conflicting opinions, and reaching consensus.

The classical Kemeny's median is one of the relevant methods for solving the rank aggregation problem. However, rankings are discrete objects, and after immersion in a continuous metric space they form a set of isolated points (there is nothing between them). In this situation, the metric and weighted Kemeny's medians allow refining the group ranking relative to the correct arithmetic mean of the set of points representing the individual rankings.

In the case of conflicting opinions, the classical Kemeny's median often contains many indistinguishable alternatives. Using metric medians reduces the number of such alternatives, i.e. improves the quality of the group ranking. Immersing the rankings in a metric space allows identifying groups of experts with very different opinions as a solution of the clustering problem. For this purpose, it is convenient to apply, for example, the well-known k-means algorithm (in the appropriate metric form) instead of the usually complex discrete optimization algorithms. Then, for the average object of each cluster, the group ranking is defined as the metric or weighted Kemeny's median.

Finally, to reach a consensus, it is necessary to determine the weights of individual or group opinions as a solution of the opinion competence problem. Using the weighted Kemeny's median allows taking conflicting opinions into account more accurately, in the right proportion. It also improves the quality of the group decision compared to the case when only the most competent individual or group opinions are used.