Stable L2-Regularized Ensemble Feature Weighting

  • Yun Li
  • Shasha Huang
  • Songcan Chen
  • Jennie Si
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7872)

Abstract

Stability is a highly desired property when selecting features for knowledge discovery applications. Stability of feature selection means that the selection outcomes vary only insignificantly when the underlying data change slightly. Several stable feature selection methods have been proposed, but their stability has only been evaluated empirically. In this paper, we provide a theoretical analysis of the stability of our ensemble feature weighting algorithm. As an example, a feature weighting method based on L2-regularized logistic loss, together with its ensembles built by linear aggregation, is introduced. A detailed analysis of the uniform stability and rotation invariance of the ensemble feature weighting method is then presented. Experiments on real-world microarray data sets show that the proposed ensemble feature weighting methods preserve stability while delivering satisfactory classification performance. In most cases, at least one of them provides a better or comparable tradeoff between stability and classification than other methods designed to boost stability.
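For reference, the uniform stability notion analyzed here is the standard one from the algorithmic-stability literature (Bousquet and Elisseeff): a learning algorithm A has uniform stability β with respect to a loss ℓ if, for every training set S of m examples and every index i,

    \sup_{z} \left| \ell(A_S, z) - \ell(A_{S^{\setminus i}}, z) \right| \le \beta ,

where A_S denotes the output of A trained on S and S^{\setminus i} is S with the i-th example removed. The smaller β is (ideally on the order of 1/m), the less the learned feature weights can change when a single sample is perturbed.

The following is a minimal sketch, in Python, of the general recipe the abstract describes: fit an L2-regularized logistic-loss learner on resampled versions of the data and linearly aggregate the resulting weight vectors. The use of scikit-learn, bootstrap resampling, and absolute coefficient magnitudes as feature weights are illustrative assumptions, not the paper's exact construction.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.utils import resample

    def ensemble_feature_weights(X, y, n_members=20, C=1.0, seed=0):
        """Average L2-regularized logistic-loss weight vectors over
        bootstrap resamples (linear aggregation); assumes binary labels."""
        rng = np.random.RandomState(seed)
        weights = np.zeros(X.shape[1])
        for _ in range(n_members):
            # Bootstrap resample of the training data (assumed ensemble scheme).
            Xb, yb = resample(X, y, random_state=rng)
            # L2 penalty on the logistic loss; C is the inverse regularization strength.
            clf = LogisticRegression(penalty="l2", C=C, max_iter=1000)
            clf.fit(Xb, yb)
            weights += np.abs(clf.coef_.ravel())
        return weights / n_members

Features can then be ranked by the averaged weights, e.g. np.argsort(weights)[::-1] lists feature indices from most to least heavily weighted.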



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Yun Li (1)
  • Shasha Huang (1)
  • Songcan Chen (2)
  • Jennie Si (3)

  1. College of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
  2. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
  3. School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, USA
