Optimization and Engineering, Volume 14, Issue 2, pp 225–250

Robust formulations for clustering-based large-scale classification

  • Saketha Nath Jagarlapudi
  • Aharon Ben-Tal
  • Chiranjib Bhattacharyya
Article

DOI: 10.1007/s11081-011-9166-y

Cite this article as:
Jagarlapudi, S.N., Ben-Tal, A. & Bhattacharyya, C. Optim Eng (2013) 14: 225. doi:10.1007/s11081-011-9166-y

Abstract

Chebyshev-inequality-based convex relaxations of Chance-Constrained Programs (CCPs) are shown to be useful for learning classifiers on massive datasets. In particular, an algorithm is proposed that integrates efficient clustering procedures with CCP approaches for computing classifiers on large datasets. The key idea is to identify high-density regions, or clusters, from the individual class-conditional densities and then use a CCP formulation to learn a classifier on the clusters. The CCP formulation ensures that most of the data points in a cluster are correctly classified by employing a Chebyshev-inequality-based convex relaxation. This relaxation depends heavily on second-order statistics. However, this formulation, and more generally any relaxation that depends on second-order moments, is susceptible to moment estimation errors. One of the contributions of the paper is to propose several formulations that are robust to such errors. In particular, a generic way of making such formulations robust to moment estimation errors is illustrated using two novel confidence sets. An important contribution is to show that when either of the confidence sets is employed, for the special case of a spherical normal distribution of clusters, the robust variant of the formulation can be posed as a second-order cone program. Empirical results show that the robust formulations achieve accuracies comparable to those obtained with the true moments, even when the moment estimates are erroneous. The results also illustrate the benefits of employing the proposed methodology for robust classification of large-scale datasets.
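
To make the Chebyshev-based relaxation concrete, the sketch below uses the standard one-sided Chebyshev (Cantelli) bound; it is an illustration under stated assumptions, not the paper's exact formulation. For a cluster with mean mu, covariance cov and label y, the chance constraint Prob(y(w·X − b) ≥ 1) ≥ eta is implied by the second-order cone constraint y(w·mu − b) ≥ 1 + kappa·sqrt(wᵀ cov w), with kappa = sqrt(eta / (1 − eta)). The function names `kappa` and `cluster_constraint_slack`, the margin value 1, and the example numbers are all hypothetical choices made for this illustration.

```python
import math


def kappa(eta):
    """Chebyshev multiplier sqrt(eta / (1 - eta)) for a confidence level eta in [0, 1)."""
    if not 0.0 <= eta < 1.0:
        raise ValueError("eta must lie in [0, 1)")
    return math.sqrt(eta / (1.0 - eta))


def cluster_constraint_slack(w, b, mu, cov, label, eta):
    """Slack of the relaxed constraint for one cluster (illustrative helper).

    Returns label * (w.mu - b) - 1 - kappa(eta) * sqrt(w' cov w).
    A non-negative value means the cone constraint holds, which in turn
    guarantees (by the one-sided Chebyshev bound) that at least a fraction
    eta of the cluster's probability mass lies on the correct side of the
    hyperplane with margin 1, whatever the distribution with these moments.
    """
    n = len(w)
    projected_mean = sum(w[i] * mu[i] for i in range(n))
    projected_var = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    return label * (projected_mean - b) - 1.0 - kappa(eta) * math.sqrt(projected_var)


if __name__ == "__main__":
    # Hypothetical spherical cluster: identity covariance, mean (4, 0), label +1.
    slack = cluster_constraint_slack(
        w=[1.0, 0.0], b=0.0, mu=[4.0, 0.0],
        cov=[[1.0, 0.0], [0.0, 1.0]], label=1, eta=0.8,
    )
    print("constraint satisfied:", slack >= 0.0)
```

For a spherical cluster with covariance sigma²·I, the square-root term reduces to sigma·‖w‖, which is why the constraint takes second-order cone form. Because the bound depends only on the mean and covariance, errors in those estimates translate directly into errors in the constraint, which is the sensitivity the robust formulations are meant to address.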

Keywords

  • Confidence sets
  • Robustness
  • Large dataset classification
  • SOCP
  • Chebyshev inequality

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Saketha Nath Jagarlapudi (1, 2)
  • Aharon Ben-Tal (3)
  • Chiranjib Bhattacharyya (1)
  1. Dept. of Computer Science and Automation, Indian Institute of Science, Bangalore, India
  2. Dept. of Computer Science and Engg., Indian Institute of Technology Bombay, Mumbai, India
  3. Faculty of Industrial Engineering and Management, Technion, Haifa, Israel
