Maximizing upgrading and downgrading margins for ordinal regression

Carrizosa, Emilio; Martin-Barragan, Belen

doi:10.1007/s00186-011-0368-z

Original Article
Published: 20 September 2011

Mathematical Methods of Operations Research, Volume 74, pages 381–407 (2011)

Emilio Carrizosa¹ &
Belen Martin-Barragan²

Abstract

In ordinal regression, a score function and threshold values are sought to classify a set of objects into a set of ranked classes. Classifying an individual in a class with higher (respectively lower) rank than its actual rank is called an upgrading (respectively downgrading) error. Since upgrading and downgrading errors may not have the same importance, they should be treated as two different criteria when measuring the quality of a classifier. In Support Vector Machines, margin maximization is used as an effective and computationally tractable surrogate for the minimization of misclassification errors. As an extension, we consider in this paper the maximization of upgrading and downgrading margins as a surrogate for the minimization of upgrading and downgrading errors, and we address the biobjective problem of finding a classifier that simultaneously maximizes the two margins. The whole set of Pareto-optimal solutions of this biobjective problem is described as translations of the optimal solutions of a scalar optimization problem. For the most popular case, in which the Euclidean norm is considered, the scalar problem has a unique solution, which implies that all the Pareto-optimal solutions of the biobjective problem are translations of each other. Hence, the Pareto-optimal solutions can easily be provided to the analyst, who, after inspecting the misclassification errors they cause, can choose the most convenient classifier at a later stage. This analysis provides a theoretical foundation for a strategy popular among practitioners, based on the so-called ROC curve, which is shown here to equal the set of Pareto-optimal solutions of simultaneously maximizing the downgrading and upgrading margins.
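
The notions in the abstract can be illustrated with a small Python sketch. It is not the authors' method: it takes precomputed scores and thresholds as given, assigns each object to a ranked class, and counts upgrading and downgrading errors. It then shifts all thresholds by a common amount, which is one assumed reading of "translation", to show how the two error counts trade off. The function names, toy scores, ranks, and thresholds are all hypothetical.

```python
from bisect import bisect_right


def classify(score, thresholds):
    """Return the 0-based rank of the class whose interval contains `score`.

    `thresholds` must be sorted in increasing order; k thresholds define
    k + 1 ranked classes. A score equal to a threshold goes to the higher class.
    """
    return bisect_right(thresholds, score)


def count_errors(scores, ranks, thresholds):
    """Count (upgrading, downgrading) errors for the given thresholds."""
    upgrading = downgrading = 0
    for score, actual in zip(scores, ranks):
        predicted = classify(score, thresholds)
        if predicted > actual:
            upgrading += 1      # placed in a higher-ranked class than the true one
        elif predicted < actual:
            downgrading += 1    # placed in a lower-ranked class than the true one
    return upgrading, downgrading


def sweep_translations(scores, ranks, thresholds, shifts):
    """Shift every threshold by the same amount t and record the error counts.

    Assumption: a "translation" is modelled here as a common shift of all
    thresholds, with the score function left unchanged.
    """
    results = []
    for t in shifts:
        shifted = [b + t for b in thresholds]
        results.append((t, *count_errors(scores, ranks, shifted)))
    return results


# Hypothetical toy data: six objects, three ranked classes (0 < 1 < 2).
scores = [0.2, 0.9, 1.1, 1.8, 2.1, 2.9]
ranks = [0, 0, 1, 1, 2, 2]
thresholds = [1.0, 2.0]

tradeoff = sweep_translations(scores, ranks, thresholds, [-0.5, 0.0, 0.5])
for shift, up, down in tradeoff:
    print(f"shift={shift:+.1f}  upgrading={up}  downgrading={down}")
```

On this toy data, lowering the thresholds produces only upgrading errors and raising them produces only downgrading errors, which is the kind of trade-off an analyst would inspect when choosing among translated classifiers.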

Author information

Authors and Affiliations

Facultad de Matemáticas, Universidad de Sevilla (Spain), Avda. Reina Mercedes, s/n, 41012, Sevilla, Spain
Emilio Carrizosa
Facultad de CCSSJJ, Universidad Carlos III de Madrid (Spain), c/Madrid, 126, 28903, Getafe, Madrid, Spain
Belen Martin-Barragan

Corresponding author

Correspondence to Belen Martin-Barragan.

Additional information

This research was partially supported by projects MTM2009-14039 and ECO2008-05080 of Ministerio de Educación y Ciencia (Spain), FQM-329 of Plan Andaluz de Investigación (Andalucía, Spain) and CCG07-UC3M/ESP-3389 of the Comunidad de Madrid (Spain).

About this article

Cite this article

Carrizosa, E., Martin-Barragan, B. Maximizing upgrading and downgrading margins for ordinal regression. Math Meth Oper Res 74, 381–407 (2011). https://doi.org/10.1007/s00186-011-0368-z

Received: 09 July 2010
Accepted: 07 July 2011
Published: 20 September 2011
Issue Date: December 2011
DOI: https://doi.org/10.1007/s00186-011-0368-z
