Machine Learning, Volume 71, Issue 2–3, pp 219–242

On reoptimizing multi-class classifiers

  • Chris Bourke
  • Kun Deng
  • Stephen D. Scott
  • Robert E. Schapire
  • N. V. Vinodchandran

Abstract

Significant changes in the instance distribution or associated cost function of a learning problem require one to reoptimize a previously learned classifier to work under new conditions. We study the problem of reoptimizing a multi-class classifier based on its ROC hypersurface and a matrix describing the costs of each type of prediction error. For a binary classifier, it is straightforward to find an optimal operating point based on its ROC curve and the relative cost of true positive to false positive error. However, the corresponding multi-class problem (finding an optimal operating point based on an ROC hypersurface and cost matrix) is more challenging and until now, it was unknown whether an efficient algorithm existed that found an optimal solution. We answer this question by first proving that the decision version of this problem is \(\mathsf{NP}\)-complete. As a complementary positive result, we give an algorithm that finds an optimal solution in polynomial time if the number of classes n is a constant. We also present several heuristics for this problem, including linear, nonlinear, and quadratic programming formulations, genetic algorithms, and a customized algorithm. Empirical results suggest that under both uniform and non-uniform cost models, simple greedy methods outperform more sophisticated methods.
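The binary case the abstract calls straightforward can be sketched concretely: given a set of candidate operating points on an ROC curve, the class prior, and the relative costs of the two error types, the optimal point is the one minimizing expected cost. The sketch below is illustrative only; the function name, argument names, and example numbers are assumptions, not taken from the paper.

```python
# Minimal sketch: choosing the cost-optimal operating point on a binary
# ROC curve. Names and numbers are illustrative, not from the paper.

def optimal_operating_point(roc_points, c_fp, c_fn, p_pos):
    """Return the (fpr, tpr) pair minimizing expected misclassification cost.

    roc_points: list of (false-positive rate, true-positive rate) pairs.
    c_fp, c_fn: costs of a false positive / false negative.
    p_pos:      prior probability of the positive class.
    """
    def expected_cost(point):
        fpr, tpr = point
        # False positives come from the negative class; false negatives
        # (rate 1 - tpr) come from the positive class.
        return c_fp * fpr * (1.0 - p_pos) + c_fn * (1.0 - tpr) * p_pos
    return min(roc_points, key=expected_cost)

# Four candidate operating points on a hypothetical ROC curve.
curve = [(0.0, 0.0), (0.1, 0.8), (0.4, 0.95), (1.0, 1.0)]
best = optimal_operating_point(curve, c_fp=1.0, c_fn=2.0, p_pos=0.3)
```

With these illustrative costs the point (0.1, 0.8) wins; making false negatives costlier (or the positive class more prevalent) shifts the optimum toward higher true-positive rates. The multi-class analogue, where the curve becomes a hypersurface and the cost ratio becomes an n×n matrix, is exactly what the paper shows to be \(\mathsf{NP}\)-complete in general.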

Keywords

Receiver Operating Characteristic (ROC) · Classifier reoptimization · Multi-class classification


Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Chris Bourke (1)
  • Kun Deng (1)
  • Stephen D. Scott (1)
  • Robert E. Schapire (2)
  • N. V. Vinodchandran (1)

  1. Dept. of Computer Science, University of Nebraska, Lincoln, USA
  2. Dept. of Computer Science, Princeton University, Princeton, USA
