Abstract
A multi-class classifier based on the Bradley-Terry model predicts the multi-class label of an input by combining the outputs of multiple binary classifiers, where the combination must be designed a priori as a code word matrix. The code word matrix was originally designed to consist of +1 and −1 codes, and was later extended to handle ternary codes {+1, 0, −1}, that is, to allow 0 codes. This extension appears to work effectively but in fact contains a problem: a binary classifier forcibly assigns examples with 0 codes to either +1 or −1, and this forced decision obscures the prediction of the multi-class label. In this article, we propose a Boosting algorithm that deals with three categories by allowing a ‘don’t care’ category corresponding to 0 codes, and present a modified decoding method called a ‘ternary’ Bradley-Terry model. In addition, we propose a couple of fast decoding schemes that reduce the heavy computation required by the existing Bradley-Terry model-based decoding.
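As a rough illustration of the setting described above, the following Python sketch builds a ternary code word matrix and evaluates the conventional Bradley-Terry link between class probabilities and binary classifier outputs. It is not the ternary decoding method or the Boosting algorithm proposed in the article. The function names `one_vs_one_code_matrix` and `bt_positive_prob` are hypothetical, the one-vs-one design is only one possible choice of code word matrix, and the formula used (probability mass of the +1 classes divided by the mass of the +1 and −1 classes) is the commonly used generalized Bradley-Terry form, assumed here for illustration.

```python
import numpy as np
from itertools import combinations


def one_vs_one_code_matrix(n_classes):
    """Build a ternary code word matrix (hypothetical helper).

    Rows index classes, columns index binary classifiers.
    Entry +1 / -1: the class is a positive / negative example for that classifier.
    Entry 0: the class is not used by that classifier.
    """
    pairs = list(combinations(range(n_classes), 2))
    W = np.zeros((n_classes, len(pairs)), dtype=int)
    for j, (a, b) in enumerate(pairs):
        W[a, j] = 1
        W[b, j] = -1
    return W


def bt_positive_prob(W, p):
    """Probability that each binary classifier outputs +1 (assumed BT form).

    For classifier j: sum of p over classes coded +1, divided by the sum of p
    over classes coded +1 or -1. Classes coded 0 are ignored, which is exactly
    the step where the conventional model discards information.
    """
    p = np.asarray(p, dtype=float)
    pos = (W == 1).astype(float).T @ p
    neg = (W == -1).astype(float).T @ p
    return pos / (pos + neg)


W = one_vs_one_code_matrix(3)
# W has one column per class pair: (0, 1), (0, 2), (1, 2)
probs = bt_positive_prob(W, [0.5, 0.3, 0.2])
print(W)
print(probs)
```

In this sketch each column of `W` contains exactly one 0 code, so every binary classifier is silent about one class during training yet must still output +1 or −1 for inputs from that class at prediction time, which is the ambiguity the abstract points to.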
Additional information
Editor: Phil Long.
About this article
Cite this article
Takenouchi, T., Ishii, S. Ternary Bradley-Terry model-based decoding for multi-class classification and its extensions. Mach Learn 85, 249–272 (2011). https://doi.org/10.1007/s10994-011-5240-0
DOI: https://doi.org/10.1007/s10994-011-5240-0