Robust Feature Selection for SVMs under Uncertain Data

  • Conference paper
Advances in Data Mining. Applications and Theoretical Aspects (ICDM 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7987)

Abstract

In this paper, we consider the problem of feature selection and classification under data uncertainty, which is inherent in almost all datasets. Using principles of Robust Optimization, we propose a robust scheme to handle data with an ellipsoidal uncertainty model. The difficulty of treating the zero-norm ℓ0 in the feature selection problem is overcome by using an appropriate approximation together with DC (Difference of Convex functions) programming and DCA (DC Algorithm). The computational results show that the proposed robust optimization approach is better than a traditional approach at immunizing the classifier against perturbations of the data.
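
The two ingredients named in the abstract can be illustrated with a minimal sketch; it is not the authors' formulation. It assumes a linear classifier (w, b), an uncertainty set of the form x = x_bar + P u with ||u||_2 <= 1 (the matrix P and both function names are hypothetical), and a concave exponential surrogate for ℓ0 that may differ from the approximation used in the paper. Under these assumptions, the worst case of the margin y(w·x + b) over the ellipsoid has the standard closed form y(w·x_bar + b) - ||Pᵀw||_2.

```python
import math

def worst_case_margin(w, b, x_bar, y, P):
    """Smallest value of y * (w.x + b) over x = x_bar + P u, ||u||_2 <= 1.

    P is a list of rows (assumed shape: n_features x k); it defines the
    ellipsoid around the nominal point x_bar.
    """
    nominal = y * (sum(wj * xj for wj, xj in zip(w, x_bar)) + b)
    n_cols = len(P[0])
    # P^T w: dot each column of P with w
    ptw = [sum(P[j][k] * w[j] for j in range(len(w))) for k in range(n_cols)]
    # The adversary shifts x along -y * P P^T w, costing exactly ||P^T w||_2
    return nominal - math.sqrt(sum(v * v for v in ptw))

def l0_exp_surrogate(w, alpha=5.0):
    """Concave surrogate sum_j (1 - exp(-alpha * |w_j|)) for ||w||_0.

    Illustrative choice only; alpha controls how tightly it approximates
    the count of nonzero weights.
    """
    return sum(1.0 - math.exp(-alpha * abs(wj)) for wj in w)

if __name__ == "__main__":
    w, b = [3.0, 4.0], 0.0
    P = [[0.5, 0.0], [0.0, 0.5]]  # ball of radius 0.5 around x_bar
    print(worst_case_margin(w, b, [1.0, 1.0], 1, P))
    print(l0_exp_surrogate([0.0, 0.0, 2.0], alpha=50.0))
```

A robust classifier in this sense would require the worst-case margin, rather than the nominal one, to exceed the usual threshold, while the surrogate stands in for the count of selected features in the objective.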




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Le Thi, H.A., Vo, X.T., Pham Dinh, T. (2013). Robust Feature Selection for SVMs under Uncertain Data. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2013. Lecture Notes in Computer Science, vol 7987. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39736-3_12

  • DOI: https://doi.org/10.1007/978-3-642-39736-3_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39735-6

  • Online ISBN: 978-3-642-39736-3

  • eBook Packages: Computer Science, Computer Science (R0)
