Online Feature Selection Based on Passive-Aggressive Algorithm with Retaining Features

  • Hai-Tao Zheng
  • Haiyang Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9313)


Feature selection is an important topic in data mining and machine learning and has been studied extensively in the literature. Unlike traditional batch learning methods, online learning is more efficient for real-world applications. Most existing studies of online learning require access to all the features of the training instances, but in the real world it is often expensive to acquire the full set of attributes. In the online feature selection process, when a training instance arrives, a small, fixed number of features is selected and the remaining features are ignored. However, the ignored features may be useful and may be selected for later instances. If only the new instances are considered for these features, large errors can result. To address these issues, we propose a novel algorithm that combines the Passive-Aggressive algorithm with retained features. We evaluate the proposed algorithm for online feature selection on several public datasets, and the experiments show that it consistently outperforms the baseline algorithms in all tested settings.
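As a rough illustration of the setting described in the abstract, the following Python sketch pairs a standard Passive-Aggressive (PA-I) update with a truncation step that keeps only a fixed number of weights after each instance. It is not the algorithm proposed in the paper: the feature-retaining mechanism is not specified in the abstract and is not modeled here, and for simplicity the sketch reads every feature of each instance. The names `pa_update`, `truncate`, `online_select`, the budget `B`, and the aggressiveness parameter `C` are assumptions made for this example.

```python
import numpy as np

def pa_update(w, x, y, C=1.0):
    """One PA-I step for a binary label y in {-1, +1}."""
    loss = max(0.0, 1.0 - y * float(np.dot(w, x)))  # hinge loss
    norm_sq = float(np.dot(x, x))
    if loss == 0.0 or norm_sq == 0.0:
        return w  # passive: margin already satisfied (or empty instance)
    tau = min(C, loss / norm_sq)  # aggressive step size, capped by C
    return w + tau * y * x

def truncate(w, B):
    """Keep the B largest-magnitude weights and zero out the rest."""
    if B >= w.size:
        return w.copy()
    keep = np.argsort(np.abs(w))[-B:]
    out = np.zeros_like(w)
    out[keep] = w[keep]
    return out

def online_select(X, Y, B, C=1.0):
    """Process instances one at a time; return final weights and mistake count."""
    w = np.zeros(X.shape[1])
    mistakes = 0
    for x, y in zip(X, Y):
        if y * float(np.dot(w, x)) <= 0:  # predict before updating
            mistakes += 1
        w = truncate(pa_update(w, x, y, C), B)
    return w, mistakes
```

The nonzero entries of the returned weight vector are the currently selected features, so at most `B` features are active after every step.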


Feature Selection, Online Learning, Binary Classification





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Hai-Tao Zheng (1)
  • Haiyang Zhang (1)
  1. Graduate School at Shenzhen, Tsinghua University, Shenzhen, P.R. China
