ICDM 2013: Advances in Data Mining. Applications and Theoretical Aspects, pp. 151–165
Robust Feature Selection for SVMs under Uncertain Data
Abstract
In this paper, we consider the problem of feature selection and classification under data uncertainty, which is inherent in almost all datasets. Using principles of Robust Optimization, we propose a robust scheme to handle data with ellipsoidal model uncertainty. The difficulty of treating the zero-norm ℓ0 in the feature selection problem is overcome by using an appropriate approximation together with DC (Difference of Convex functions) programming and DCA (DC Algorithm). The computational results show that the proposed robust optimization approach is better than a traditional approach at immunizing the classifier against perturbations of the data.
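As a rough illustration of two ingredients named in the abstract, the sketch below checks the standard closed form for a linear classifier's worst-case margin under ellipsoidal uncertainty, and evaluates one common concave approximation of the zero-norm together with a DC split of it. The abstract does not spell out the paper's formulation, so everything here is an assumption: the uncertainty model x = x_bar + P u with ||u||_2 <= 1, the exponential approximation and its parameter alpha, and all function names are hypothetical and not taken from the paper. The DCA iterations themselves are not shown.

```python
import math

def worst_case_margin(w, b, x_bar, P, y):
    """Minimum of y * (w.x + b) over x = x_bar + P u with ||u||_2 <= 1.

    Standard robust-counterpart identity (assumed model, y in {-1, +1}):
    the minimum equals the nominal margin minus ||P^T w||_2.
    """
    nominal = y * (sum(wj * xj for wj, xj in zip(w, x_bar)) + b)
    n, m = len(w), len(P[0])
    # k-th entry of P^T w is column k of P dotted with w
    ptw = [sum(P[j][k] * w[j] for j in range(n)) for k in range(m)]
    return nominal - math.sqrt(sum(v * v for v in ptw))

def sampled_margin(w, b, x_bar, P, y, steps=3600):
    """Brute-force check when P has exactly two columns: scan the unit circle.

    The margin is linear in u, so its minimum over the ball lies on the boundary.
    """
    best = float("inf")
    for s in range(steps):
        t = 2.0 * math.pi * s / steps
        u = (math.cos(t), math.sin(t))
        x = [x_bar[j] + P[j][0] * u[0] + P[j][1] * u[1] for j in range(len(x_bar))]
        margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
        best = min(best, margin)
    return best

def l0_approx(w, alpha=5.0):
    """Concave exponential surrogate of ||w||_0 (an assumed choice of approximation)."""
    return sum(1.0 - math.exp(-alpha * abs(wj)) for wj in w)

def dc_parts(w, alpha=5.0):
    """Split l0_approx(w) = g(w) - h(w) with both g and h convex.

    g(w) = alpha * ||w||_1
    h(w) = sum_j (alpha * |w_j| - 1 + exp(-alpha * |w_j|))
    """
    g = sum(alpha * abs(wj) for wj in w)
    h = sum(alpha * abs(wj) - 1.0 + math.exp(-alpha * abs(wj)) for wj in w)
    return g, h

if __name__ == "__main__":
    w, b, x_bar, y = [3.0, 4.0], -1.0, [1.0, 1.0], 1
    P = [[0.1, 0.0], [0.0, 0.2]]  # shape matrix of the assumed ellipsoid
    print("closed form:", worst_case_margin(w, b, x_bar, P, y))
    print("sampled    :", sampled_margin(w, b, x_bar, P, y))
    print("l0 approx  :", l0_approx([0.0, 2.0, -3.0], alpha=50.0))
```

Under this model, requiring the worst-case margin to be at least 1 - ξ turns each classification constraint into a second-order cone constraint in (w, b, ξ). The DC split is what lets a DCA-style scheme linearize h at the current iterate and solve a convex subproblem at each step.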
Keywords
Feature selection, SVM, Robust Optimization, DC programming, DCA