Robust Feature Selection for SVMs under Uncertain Data

Le Thi, Hoai An; Vo, Xuan Thanh; Pham Dinh, Tao

doi:10.1007/978-3-642-39736-3_12

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7987)

Included in the following conference series:

Industrial Conference on Data Mining

Abstract

In this paper, we consider the problem of feature selection and classification under data uncertainty, which is inherent in almost all datasets. Using principles of Robust Optimization, we propose a robust scheme to handle data with an ellipsoidal uncertainty model. The difficulty of treating the zero-norm ℓ₀ in the feature selection problem is overcome by using an appropriate approximation together with DC (Difference of Convex functions) programming and DCA (DC Algorithm). The computational results show that the proposed robust optimization approach performs better than a traditional approach in immunizing the classifier against perturbations of the data.
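The abstract does not state the formulation, so the following is only a minimal sketch of how DCA can handle an approximated zero-norm, not the authors' robust SVM model. It assumes a concave exponential approximation of ℓ₀, namely 1 - exp(-alpha*|w_i|) per coordinate, and replaces the SVM loss with a toy least-squares fit to a vector a so that each convex subproblem has a closed-form solution. The names dca, soft_threshold, objective, lam and alpha are hypothetical and chosen for illustration.

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def objective(w, a, lam, alpha):
    """Toy objective: 0.5*||w - a||^2 + lam * sum(1 - exp(-alpha*|w_i|))."""
    fit = 0.5 * sum((wi - ai) ** 2 for wi, ai in zip(w, a))
    penalty = lam * sum(1.0 - math.exp(-alpha * abs(wi)) for wi in w)
    return fit + penalty

def dca(a, lam=1.0, alpha=5.0, max_iter=100, tol=1e-10):
    """DCA on f = g - h, with the (assumed) DC decomposition
         g(w) = 0.5*||w - a||^2 + lam*alpha*||w||_1              (convex)
         h(w) = lam * sum(alpha*|w_i| - 1 + exp(-alpha*|w_i|))   (convex, smooth)
    so that g - h equals the toy objective above.
    """
    w = list(a)
    history = [objective(w, a, lam, alpha)]
    for _ in range(max_iter):
        # Step 1: linearize the concave part, y = gradient of h at w.
        y = [lam * alpha * math.copysign(1.0 - math.exp(-alpha * abs(wi)), wi)
             for wi in w]
        # Step 2: solve the convex subproblem min g(w) - <y, w>.
        # Here it separates by coordinate and reduces to soft-thresholding.
        w_new = [soft_threshold(ai + yi, lam * alpha) for ai, yi in zip(a, y)]
        history.append(objective(w_new, a, lam, alpha))
        converged = max(abs(u - v) for u, v in zip(w_new, w)) < tol
        w = w_new
        if converged:
            break
    return w, history

if __name__ == "__main__":
    w, history = dca([3.0, 0.2, -2.5, 0.05])
    print("solution:", w)
    print("objective values:", history)
```

On the input above, the two small entries are set exactly to zero while the two large ones stay close to their original values, and the objective values never increase from one iteration to the next, which is the descent property DCA guarantees. In a robust SVM setting the subproblem in Step 2 would generally be a convex program solved numerically rather than in closed form.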

Author information

Authors and Affiliations

Laboratory of Theoretical and Applied Computer Science EA 3097, University of Lorraine, Ile de Saulcy, 57045, Metz, France
Hoai An Le Thi & Xuan Thanh Vo
Lorraine Research Laboratory in Computer Science and Its Applications, CNRS UMR 7503, University of Lorraine, 54506, Nancy, France
Hoai An Le Thi
Laboratory of Mathematics, National Institute for Applied Sciences-Rouen, Avenue de l’Université, 76801, Saint-Etienne-du-Rouvray cedex, France
Tao Pham Dinh

Editor information

Editors and Affiliations

Institute of Computer Vision and Applied Computer Sciences, IBaI, Leipzig, Germany
Petra Perner

About this paper

Cite this paper

Le Thi, H.A., Vo, X.T., Pham Dinh, T. (2013). Robust Feature Selection for SVMs under Uncertain Data. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2013. Lecture Notes in Computer Science, vol 7987. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39736-3_12

DOI: https://doi.org/10.1007/978-3-642-39736-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39735-6
Online ISBN: 978-3-642-39736-3
eBook Packages: Computer Science, Computer Science (R0)
