
1 The Classic Kemeny’s Median

The most popular aggregation principle is the majority rule. It has been proved that if the majority relation is transitive, then it is the median: for a set of rankings, this median is itself a ranking with the least total distance to the others. This median is known as the Kemeny's median. There are different algorithms to find the Kemeny's median; one version of the locally optimal algorithm calculates a special loss matrix.

Let \( A = \{ a_1 , \ldots ,a_N \} \) be an unordered set of alternatives. A ranking P is defined as an ordering of the alternatives by preference (\( \succ \)) and indistinguishability (\( \sim \)) relations between them. The ranking P can therefore be presented by the relation matrix \( M(N,N) \) with elements

$$ m_{ij} = \left\{ {\begin{array}{*{20}c} {\;1,\;a_i \succ a_j } \\ {\;0,\;a_i \sim a_j } \\ { - 1,\;a_i \prec a_j \;.} \\ \end{array} } \right. $$

Let two rankings \( P_u \) and \( P_v \) be presented by relation matrices \( M_u \) and \( M_v \), with the distance between them, under the same numeration of A, defined as

$$ d(P_u ,P_v ) = \frac{1}{2}\sum_{i = 1}^N {\sum_{j = 1}^N {|m_{ij}^u - m_{ij}^v |} } {\mkern 1mu} . $$
(1)

It is known that (1) is a metric for binary relations in (quasi)ordinal scales, i.e. for rankings [1].
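As an illustration, the relation matrix and the distance (1) can be computed directly. The sketch below is not from the paper; it assumes a ranking is encoded as a list of tiers, best tier first, so that `[['a'], ['b', 'c']]` means \( a \succ b \sim c \):

```python
def relation_matrix(tiers):
    """Relation matrix M of a ranking given as tiers, best tier first.

    [['a'], ['b', 'c']] encodes a > b ~ c; m[(ai, aj)] is in {1, 0, -1}.
    """
    level = {alt: t for t, tier in enumerate(tiers) for alt in tier}
    alts = sorted(level)
    m = {}
    for ai in alts:
        for aj in alts:
            if level[ai] < level[aj]:
                m[(ai, aj)] = 1        # ai is preferred to aj
            elif level[ai] > level[aj]:
                m[(ai, aj)] = -1
            else:
                m[(ai, aj)] = 0        # same tier (or ai == aj)
    return m


def kemeny_distance(mu, mv):
    """Distance (1): half the sum of absolute element-wise differences."""
    return sum(abs(mu[key] - mv[key]) for key in mu) / 2
```

For example, the rankings \( a \succ b \succ c \) and \( b \succ a \succ c \) differ only in the pair (a, b), which contributes \( |1 - (-1)| \) twice, giving a distance of 2.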

Let n individual preferences (rankings) be given. The task is to find a group relation P coordinated in a certain way with the relations \( P_1 , \ldots ,P_n \). Methods to find a group relation are usually called concordance principles. It has been proved that the transitive group relation (or one transformed to it) is the median, specifically the Kemeny's one. The median P* is the ranking with the least total distance to the other rankings:

$$ P^{*} = \mathop {\arg \hbox{min} }\limits_P \sum_{u = 1}^n {d(P,P_u )} . $$
(2)

One version of the locally optimal algorithm to find the Kemeny's median calculates the loss matrix \( Q(N,N) \) for N alternatives [2].

Let some ranking P and the experts' rankings \( P_1 , \ldots ,P_n \) be presented by relation matrices M and \( M_1 , \ldots ,M_n \). The total distance from P to all the other rankings is defined by

$$ \sum_{u = 1}^n {d(P,P_u ) = } \frac{1}{2}\sum_{i = 1}^N {\sum_{j = 1}^N {\sum_{u = 1}^n {d_{ij} (P,P_u )} } } $$

with partial distances defined, under the condition \( m_{ij} = 1 \), as

$$ d_{ij} (P,P_u ) = \;|m_{ij} - m_{ij}^u |\; = \left\{ {\begin{array}{*{20}c} {0,\;m_{ij}^u = 1\;\;\;\;} \\ {1,\;\;m_{ij}^u = 0\;\;\;} \\ {2,\;m_{ij}^u = - 1\;} \\ \end{array} } \right.. $$

The loss matrix element \( q_{ij} = \sum_{u = 1}^n {d_{ij} (P,P_u )} \) defines the total mismatch losses of the preference \( a_i \succ a_j \) in the unknown ranking P relative to corresponding preferences in rankings \( P_1 , \ldots P_n \).

The Kemeny's algorithm finds the ordering of alternatives that minimizes the sum of the elements above the main diagonal of the loss matrix Q, and consists of the following steps.

The matrix Q is reduced step by step by eliminating the row (and the corresponding column) with the minimal sum of losses; the corresponding alternative takes the last place in the ordered sequence of alternatives. The result is the so-called ranking PI. Then, in a sequential reverse scan, the correspondence between the preferences in the ranking (\( a_i \succ a_j \)) and the penalties in the loss matrix (\( q_{ij} \le q_{ji} \)) is checked. In case of a violation, the current pair of alternatives is reversed. The final result is the so-called ranking PII, i.e. the Kemeny's median [2].
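The two stages above can be sketched in Python. This is a hypothetical, simplified implementation, not the authors' code: it handles strict rankings only, and the reverse repair scan is repeated until no violation \( q_{ij} > q_{ji} \) remains among adjacent pairs, which a single pass does not guarantee:

```python
def strict_relation_matrix(order):
    """Relation matrix of a strict ranking given as an ordered list, best first."""
    rank = {alt: i for i, alt in enumerate(order)}
    return {(ai, aj): (rank[aj] > rank[ai]) - (rank[aj] < rank[ai])
            for ai in order for aj in order}


def loss_matrix(relation_matrices, alts):
    """q_ij: total losses of asserting a_i > a_j against all experts' rankings."""
    q = {(ai, aj): 0 for ai in alts for aj in alts}
    for m in relation_matrices:
        for ai in alts:
            for aj in alts:
                if ai != aj:
                    q[(ai, aj)] += abs(1 - m[(ai, aj)])  # 0, 1 or 2
    return q


def kemeny_locally_optimal(q, alts):
    """Stage 1 builds the ranking P_I by row elimination; stage 2 repairs it."""
    remaining, order = list(alts), []
    while remaining:
        # the alternative with the minimal row sum of losses takes the last place
        worst = min(remaining,
                    key=lambda ai: sum(q[(ai, aj)] for aj in remaining if aj != ai))
        order.insert(0, worst)
        remaining.remove(worst)
    changed = True
    while changed:  # reverse scans until q_ij <= q_ji holds for adjacent pairs
        changed = False
        for i in range(len(order) - 2, -1, -1):
            ai, aj = order[i], order[i + 1]
            if q[(ai, aj)] > q[(aj, ai)]:
                order[i], order[i + 1] = aj, ai
                changed = True
    return order
```

For three experts with rankings \( a \succ b \succ c \), \( a \succ b \succ c \), and \( b \succ a \succ c \), the repaired result is \( a \succ b \succ c \), in agreement with the majority preferences.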

2 The Kemeny’s Metric Median

Let the distance matrix \( D(n,n) \) be calculated between all pairs of rankings according to (1). With it, the set of rankings \( P_1 , \ldots ,P_n \) is treated as an unordered set of points immersed in a Euclidean space.

If no metric violations occur in the set configuration, it is known that the matrix of scalar products calculated relative to some origin is positive definite, or at least positive semidefinite [3].

According to Torgerson's method of principal projections [4], it is convenient to use the central element of the given set as the origin. This central element P0 can be presented by its distances to the other rankings:

$$ d^2 (P_0 ,P_i ) = d_{0i}^2 = \frac{1}{n}\sum_{p = 1}^n {d_{ip}^2 } - \frac{1}{2n^2 }\sum_{p = 1}^n {\sum_{q = 1}^n {d_{pq}^2 } } \;,\,\,\,i = 1, \ldots \;n. $$
(3)

The central element P0 is the least distant from the other elements of the set, like the median P*, and must satisfy (2) as well. Nevertheless, the median P* is represented both by its distances (1) to the other rankings and by the ranking itself, while the central element P0 is represented only by the distances (3) and does not exist as a ranking.
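Formula (3) maps the pairwise distance matrix D directly to the squared distances from the center. A minimal sketch (pure Python, with D given as a list of rows; not from the paper):

```python
def center_distances(D):
    """Squared distances d_{0i}^2 from the set center P0 to each ranking,
    computed from the pairwise distance matrix D by formula (3)."""
    n = len(D)
    total = sum(D[p][q] ** 2 for p in range(n) for q in range(n))
    return [sum(D[i][p] ** 2 for p in range(n)) / n - total / (2 * n * n)
            for i in range(n)]
```

For two rankings at distance 2, the center is the midpoint, at distance 1 from each of them.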

From the mathematical point of view, these points must coincide: \( P^{*} = P_0 \). In this case, well-known clustering and machine learning algorithms (for example, k-means) developed for pairwise comparisons [5, 6] can be correctly used instead of discrete optimization ones for ordinal scales.

A monotonic transformation is admissible for ordinal scales and consists in relocating alternatives on a numerical axis without changing their ordering relative to each other. In general, the center P0 and the ranking P* are both presented by their own distances to the other elements \( P_1 , \ldots ,P_n \). Based on a monotonic transformation, we can show that the ranking P* is equivalent by distances to the center P0.

We denote by the Kemeny's metric median \( P_0^\ast \) the Kemeny's median P* having the same distances to the other rankings as the center P0. We use the Kemeny's algorithm to find it.

Let the points Pu, P0, and P* be given, where \( \delta = d(P_0 ,P_u ) - d(P^{*},P_u ) \ne 0 \). There are two cases for the value of \( \delta \), if P0 and P* are similar to each other as rankings [7].

  1. The case of \( \delta > 0 \). To compensate for this difference, it is necessary to distribute this value uniformly among all nonzero elements of Mu and to calculate the new relation matrix \( \tilde{M}_u \) with elements

$$\tilde{m}_{{ij}}^{u} = \left\{ {\begin{array}{*{20}c} { + 1 + 2\delta /k{\mkern 1mu} ,} & {a_{i} \succ a_{j} \;} \\ {0{\mkern 1mu} ,} & {a_{i} \sim a_{j} \;} \\ { - 1 - 2\delta /k{\mkern 1mu} ,} & {a_{i} \prec a_{j} \;,} \\ \end{array} } \right.$$

where \( k = N^2 - N - N_0 \) is the total number of nonzero elements in Mu outside the main diagonal, and N0 is the number of zero off-diagonal elements \( m_{ij}^u = 0 \) in Mu. This new relation matrix \( \tilde{M}_u \) does not change the expert's ranking Pu.

The value \( 2\delta /k \) is used since in (1) each difference is counted twice. Only nonzero elements need to be corrected, since each zero element indicates that two alternatives share the same place in the expert's ranking. Hence, the distance between the modified median and the other ranking increases to compensate for \( \delta > 0 \). In this case, for \( m_{ij} = 1 \) the preference \( a_i \succ a_j \) in the unknown ranking P is penalized by the expert's ranking Pu to form the partial distances

$$ d_{ij} (P,P_u ) = \left\{ {\begin{array}{*{20}c} {0 + 2\delta /k{\mkern 1mu} ,\;\tilde{m}_{ij}^u = + 1 + 2\delta /k\;} \\ {\;\;\;1{\mkern 1mu} ,\;\tilde{m}_{ij}^u = {\mkern 1mu} {\mkern 1mu} {\mkern 1mu} {\mkern 1mu} {\mkern 1mu} 0\;\;\;\;} \\ {2 + 2\delta /k{\mkern 1mu} ,\;\tilde{m}_{ij}^u = - 1 - 2\delta /k\;.} \\ \end{array} } \right. $$
  2. The case of \( - 1 < 2\delta /k^{\prime} < 0 \). To remove this difference, it is also necessary to distribute this value uniformly among the nonzero elements of Mu. Additionally, the number of modified elements in Mu is decreased to \( k^{\prime} = k - \Delta k \), where \( \Delta k \) is the number of coinciding elements \( m_{ij}^u = m_{ij}^\ast \) in the relation matrices Mu and M*. Indeed, the partial distance is zero, \( d_{ij} (P^{*},P_u ) = 0 \), in this case, and any negative additive to \( m_{ij}^u \) necessarily increases the distance between Pu and P*.

To keep the expert's ranking Pu unchanged, it is necessary to calculate the new relation matrix \( \tilde{M}_u \) with elements

$$ \tilde{m}_{ij}^u = \left\{ {\begin{array}{*{20}c} { + 1 - |2\delta /k^{\prime}|{\mkern 1mu} ,\;a_i \succ a_j } \\ {\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;0{\mkern 1mu} ,\;a_i \sim a_j } \\ { - 1 + |2\delta /k^{\prime}|{\mkern 1mu} ,\;a_i \prec a_j } \\ {\;\;\;\;\;\;\;\;\;m_{ij}^\ast {\mkern 1mu} ,\;m_{ij}^u = m_{ij}^\ast \,.} \\ \end{array} } \right. $$

For \( m_{ij} = 1 \), the preference \( a_i \succ a_j \) in the unknown ranking P is penalized by the expert's ranking Pu to form the partial distances

$$ d_{ij} (P,P_u ) = \left\{ {\begin{array}{*{20}c} {0{\mkern 1mu} ,\;\tilde{m}_{ij}^u = 1\;\;\;\;} \\ {1{\mkern 1mu} ,\;\tilde{m}_{ij}^u = 0\;\;\;} \\ {0 + |2\delta /k'|{\mkern 1mu} ,\;\tilde{m}_{ij}^u = + 1 - |2\delta /k'|\;} \\ {2 - |2\delta /k'|{\mkern 1mu} ,\;\tilde{m}_{ij}^u = - 1 + |2\delta /k'|.} \\ \end{array} } \right. $$
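The correction for the first case (\( \delta > 0 \)) can be sketched as follows. This is a hypothetical Python sketch, not the authors' code: relation matrices are stored as dicts over ordered pairs, and the function name is our own label. Every nonzero element \( \pm 1 \) becomes \( \pm (1 + 2\delta /k) \), which raises each of the k affected terms of (1) by \( 2\delta /k \), and hence raises the distance to any ranking with elements in {1, 0, −1} by exactly \( \delta \):

```python
def stretch_relation_matrix(m, delta):
    """Case delta > 0: add 2*delta/k to every nonzero off-diagonal element
    of M_u, keeping the signs, so the expert's ordering is unchanged."""
    k = sum(1 for (ai, aj), v in m.items() if ai != aj and v != 0)
    shift = 2.0 * delta / k
    return {key: (v + shift if v > 0 else v - shift if v < 0 else 0.0)
            for key, v in m.items()}
```

A quick check with two strict rankings of three alternatives confirms that the distance (1) from the modified expert matrix to any fixed relation matrix grows by exactly \( \delta \).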

As a result, the metric median has the following main properties. In the general case, the Kemeny's metric median does not necessarily coincide with the classic Kemeny's median as a binary relation. This is not a contradiction, since the metric median only refines the classic median. Namely, previously indistinguishable alternatives usually become distinguishable, while the positions of previously distinguishable alternatives do not change relative to each other.

The metric median as the ranking \( P_0^\ast \) can be finally presented by its relation matrix \( M_0^\ast \) with distances (1) to the other rankings presented by their relation matrices \( M_u ,\;u = 1, \ldots ,n \). Unfortunately, such distances again differ from the distances of the center P0 to the other rankings. In order to preserve the "metric" distances, it is necessary to keep all the modified experts' relation matrices \( \tilde{M}_u ,\;u = 1, \ldots ,n \).

This is not always convenient. Moreover, changes in the expert group (for example, a reduction by one expert) change the Kemeny's median and, consequently, the modified relation matrices of the remaining experts.

3 The Kemeny’s Weighted Median

Usually, weights are interpreted as experts' competences and determine the proportion in which their opinions are taken into account in the group decision. Different techniques have been developed to determine a competence level, for example in [8].

Let \( w_1 \ge 0, \ldots ,w_n \ge 0 \), where \( \sum_{u = 1}^n {w_u = 1} \), be the weights of the expert opinions. An element \( q_{ij} \) is defined as \( q_{ij} = \sum_{u = 1}^n {w_u d_{ij} (P,\;P_u )} \) for the weighted loss matrix \( Q(N,N) = \sum_{u = 1}^n {w_u Q_u (N,N)} \). In this case, the problem (2) is actually solved with weighted partial distances defined, under the condition \( m_{ij} = 1 \), as

$$ \tilde{d}_{ij} (P,P_u ) = \;w_u d_{ij} (P,P_u ) = w_u |m_{ij} - m_{ij}^u |\; = \left\{ {\begin{array}{*{20}c} {\;\;\;0,\;m_{ij}^u = 1\;\;\;\;} \\ {\;w_u ,\;m_{ij}^u = 0\;\;\;} \\ {2w_u ,\;m_{ij}^u = - 1\;} \\ \end{array} } \right.. $$

Any ranking within the convex hull of the given set \( P_1 , \ldots ,P_n \) can be found by a linear combination of the individual rankings as elements of a discrete set. In particular, any individual ranking can be obtained as the Kemeny's median \( P^{*} = P_i \) for the weights \( w_i = 1 \), \( w_{j \ne i} = 0 \), \( j = 1, \ldots ,n \).

The classic Kemeny's median is found with the weights \( w_i = 1/n,\;i = 1, \ldots ,n \). Moreover, the Kemeny's median is the same for any constant weights \( w_i = \mathrm{const} > 0 \), and is inverted for \( w_i = \mathrm{const} < 0,\;i = 1, \ldots ,n \).
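The weighted loss matrix can be built as a weighted version of the earlier per-expert losses. A hypothetical sketch (relation matrices as dicts over ordered pairs, as above; not from the paper): with weight 1 on a single expert it reproduces that expert's own loss matrix, recovering \( P^{*} = P_i \), while uniform weights \( 1/n \) reproduce the classic loss matrix up to the constant factor \( 1/n \):

```python
def weighted_loss_matrix(relation_matrices, weights, alts):
    """Weighted losses q_ij = sum_u w_u * d_ij(P, P_u) for asserting a_i > a_j."""
    q = {(ai, aj): 0.0 for ai in alts for aj in alts}
    for w, m in zip(weights, relation_matrices):
        for ai in alts:
            for aj in alts:
                if ai != aj:
                    q[(ai, aj)] += w * abs(1 - m[(ai, aj)])  # per-expert 0, 1 or 2
    return q
```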

The problem of finding the Kemeny's weighted median is formulated as follows. Let the rankings be presented by the distance matrix \( D(n,n) \), the center P0 by the distances \( d(P_0 ,P_u ),\;u = 1, \ldots ,n \), and the Kemeny's median by the distances \( d(P^{*},P_u ),\;u = 1, \ldots ,n \). These distances define the deviations \( \delta_u = d(P_0 ,P_u ) - d(P^{*},P_u ) \), \( u = 1, \ldots ,n \).

4 Weights Searching

The task is to solve the optimization problem under constraints:

$$ \sum_{u = 1}^n {\delta_u^2 \to \hbox{min} } ,\,\,\sum_{u = 1}^n {w_u } = 1,\;w_u \ge 0,\;u = 1, \ldots \,n. $$

The procedure is based on the Gauss-Seidel coordinate descent algorithm, where each expert determines a "coordinate" axis. Changing the weight of an expert's individual loss matrix \( Q_u (N,N) \) in the range from 0 to 1 is treated as a variation along the corresponding coordinate axis.

At the beginning, all n experts' weights are varied from \( 1/(n - 1), \ldots ,0, \ldots ,1/(n - 1) \), corresponding to a completely excluded loss matrix Qu, to the weights 0, …, 1, …, 0, corresponding to the loss matrix Qu alone. Each step ends, after the weight variations for all experts, with selecting the "optimal" expert with the number u* that yields the minimal total deviation \( \sum_{u = 1}^n {\delta_u^2 } \).

At the next step, the weight of the loss matrix Qu varies in the interval \( 0 \le p \le 1 \) and defines the normalized weight \( w_u = p \). The other loss matrices have constant weights \( q_i ,\;i = 1, \ldots ,n \), \( i \ne u \), with the constant sum \( V = \sum_{i = 1}^n {q_i } ,\;i \ne u \).

The normalized weights of the loss matrices for the other rankings vary in the range from \( w_i = q_i /V \) to \( w_i = 0 \), taking the values \( w_i = q_i (1 - p)/V \) within the variation range. The initial weights \( p = q_u \), \( q_i ,\;i = 1, \ldots ,n \), \( i \ne u \) correspond to the inner point of the variation interval \( w_i = q_i ,\;i = 1, \ldots ,n \). The Kemeny's median is calculated for the linear combination of losses \( Q(N,N) = \sum_{u = 1}^n {w_u Q_u (N,N)} \), and the deviations \( \delta_u ,\;u = 1, \ldots ,n \) are determined.
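The weight variation at a single coordinate step can be sketched as follows (a hypothetical helper, not from the paper; `base` holds the current weights \( q_i \), and the function returns the normalized profile \( w_u = p \), \( w_i = q_i (1 - p)/V \)):

```python
def weight_profile(base, u, p):
    """Set expert u's weight to p and rescale the other weights
    proportionally, so that w_u = p, w_i = q_i * (1 - p) / V,
    and the whole profile still sums to 1."""
    V = sum(w for i, w in enumerate(base) if i != u)
    return [p if i == u else w * (1.0 - p) / V for i, w in enumerate(base)]
```

Sweeping p from 0 to 1 thus moves from the profile with expert u completely excluded to the profile where only the loss matrix Qu is used.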

In this problem statement, the solution does not guarantee that the difference between the distances from the center P0 and from the weighted median \( P_0^w \) to the other rankings becomes zero. In that case, it is necessary to take this weighted median as the original one and find the metric median by modifying the experts' relation matrices.

5 Conclusion

The considered types of the Kemeny's median allow correct solving of important tasks that arise in the rank aggregation problem: coordination of individual rankings, identification of conflicting opinions, and reaching consensus.

The classical Kemeny's median is one of the relevant methods for solving the rank aggregation problem. However, rankings are discrete objects, and after immersion in a continuous metric space they form a set of isolated points (there is nothing between them). In this situation, the metric and weighted Kemeny's medians allow refining the group ranking relative to the correct arithmetic mean of the set of points representing the individual rankings.

In the case of conflicting opinions, the classical Kemeny's median often contains many indistinguishable alternatives. Using metric medians reduces the number of such alternatives, i.e. improves the quality of the group ranking. Immersing the rankings in a metric space allows identifying groups of experts with very different opinions as a solution of the clustering problem. For this purpose, it is convenient to apply, for example, the well-known k-means algorithm (in the appropriate metric form) instead of the usually complex discrete optimization algorithms. Then, for the average object of each cluster, the group ranking is defined as the metric or weighted Kemeny's median.

Finally, to reach a consensus, it is necessary to determine the weights of individual or group opinions as a solution of the opinion competence problem. Using the weighted Kemeny's median allows taking conflicting opinions into account more accurately, in the right proportion. It also improves the quality of the group decision compared to the case when only the most competent individual or group opinions are used.