Abstract
In this article, we present a probabilistic framework that serves as the basis for deriving instance-based algorithms for the supervised ranking problem. This framework constitutes a simple and novel approach to supervised ranking, and we give several typical examples of how such a derivation can be achieved.
In this general framework, we pursue a cumulative and stochastic approach, relying heavily on the concept of stochastic dominance. We show how the median can be used to extract, in a consistent way, a single (classification) label from a returned cumulative probability distribution function. We emphasize that all operations used are mathematically sound, i.e., they rely only on ordinal properties.
Usually, when confronted with the problem of learning a ranking, the training data is not monotone in itself, and some cleansing operation is performed to remove the 'inconsistent' examples. Our framework, however, deals with such occurrences of 'reversed preference' in a non-invasive way. Moreover, it even allows us to incorporate information gained from the occurrence of these reversed preferences. This is exactly what happens in the second realization of the main theorem.
Lievens, S., De Baets, B. & Cao-Van, K. A probabilistic framework for the design of instance-based supervised ranking algorithms in an ordinal setting. Ann Oper Res 163, 115–142 (2008). https://doi.org/10.1007/s10479-008-0326-1