Rethinking Unsupervised Feature Selection: From Pseudo Labels to Pseudo Must-Links

  • Xiaokai Wei
  • Sihong Xie
  • Bokai Cao
  • Philip S. Yu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10534)


High-dimensional data are prevalent in many machine learning applications. Feature selection is a useful technique for alleviating the curse of dimensionality, and the unsupervised feature selection problem tends to be more challenging than its supervised counterpart due to the lack of class labels. State-of-the-art approaches usually rely on pseudo labels: discriminative features are selected via the regression coefficients associated with pseudo labels, but pseudo labels derived from clustering are often inaccurate. In this paper, we propose a new perspective on unsupervised feature selection: Discriminatively Exploiting Similarity (DES). By forming similar and dissimilar data pairs, implicit discriminative information can be exploited, and the similar/dissimilar relationships of data pairs can serve as guidance for feature selection. Based on this idea, we propose hypothesis-testing-based and classification-based methods as instantiations of the DES framework. We evaluate the proposed approaches extensively on six real-world datasets. Experimental results demonstrate that our approaches significantly outperform state-of-the-art unsupervised methods. More surprisingly, our unsupervised method even achieves performance comparable to a supervised feature selection method. Code related to this chapter is available at:
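The pair-based idea sketched in the abstract can be illustrated with a small example. The snippet below is a hedged sketch, not the authors' exact DES algorithm: it forms pseudo must-link pairs from each point's nearest neighbors and pseudo cannot-link pairs from its farthest points, then scores each feature with a two-sample t-test comparing its per-pair gaps. The names `pseudo_pairs` and `des_scores`, the Euclidean distance, and the specific t-test are illustrative assumptions standing in for the paper's hypothesis-testing instantiation.

```python
import numpy as np
from scipy.stats import ttest_ind

def pseudo_pairs(X, k=3):
    """Form pseudo must-link (k nearest neighbors) and pseudo
    cannot-link (k farthest points) pairs for each sample,
    using Euclidean distance."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # exclude self-pairs
    similar, dissimilar = [], []
    for i in range(n):
        order = np.argsort(d[i])         # self (inf) sorts last
        similar += [(i, j) for j in order[:k]]
        dissimilar += [(i, j) for j in order[-(k + 1):-1]]
    return similar, dissimilar

def des_scores(X, similar, dissimilar):
    """Score each feature by a two-sample t-test on its per-pair
    absolute gaps |x_i - x_j|: a discriminative feature has small
    gaps on must-links and large gaps on cannot-links, hence a
    large t-statistic."""
    gap = lambda pairs: np.abs(X[[i for i, _ in pairs]]
                               - X[[j for _, j in pairs]])
    t, _ = ttest_ind(gap(dissimilar), gap(similar), axis=0)
    return t                             # larger => more discriminative
```

On data where feature 0 separates two tight clusters and feature 1 is pure noise, feature 0 receives the higher score, matching the intuition that must-link pairs should agree on informative features while cannot-link pairs should differ.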


Keywords: Feature selection



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Xiaokai Wei (1)
  • Sihong Xie (2)
  • Bokai Cao (3)
  • Philip S. Yu (3)
  1. Facebook Inc., Menlo Park, USA
  2. CSE Department, Lehigh University, Bethlehem, USA
  3. Department of Computer Science, University of Illinois at Chicago, Chicago, USA
