
Unsupervised Feature Selection via Local Total-Order Preservation

  • Conference paper
Published in: Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning (ICANN 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11728)

Abstract

Without class labels, unsupervised feature selection methods choose a subset of features that faithfully maintains the intrinsic structure of the original data. Conventional methods assume that the exact values of the pairwise sample distances used in structure regularization are reliable. However, this assumption imposes strict restrictions on feature selection and causes more features to be kept for data representation. Motivated by this, we propose Unsupervised Feature Selection via Local Total-order Preservation (UFSLTP). In particular, we characterize the local structure by a novel total-order relation, which relies on comparing pairwise sample distances rather than on their exact values. To obtain a desirable feature subset, we map the total-order relation into a probability space and preserve the relation by minimizing the difference between the probability distributions computed before and after feature selection. Owing to the inherent nature of the total-order relation, fewer features are needed to represent the data without adversely affecting learning performance. Moreover, we propose two efficient methods, namely Adaptive Neighbors Selection (ANS) and Uniform Neighbors Serialization (UNS), to reduce the computational complexity and improve performance. Experiments on benchmark datasets demonstrate that the proposed method significantly outperforms state-of-the-art methods: compared with its competitors on clustering performance, it achieves an average improvement of 31.01% in terms of NMI and 14.44% in terms of Silhouette Coefficient.
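
To make the mechanism described in the abstract concrete, the sketch below shows one plausible reading of it: neighbor relations are encoded as per-sample probability distributions over negative squared pairwise distances, and feature selection is modelled as a non-negative weight vector optimized so that the weighted distributions stay close (in KL divergence) to the original ones. This is an illustrative approximation rather than the authors' algorithm; the function names (neighbor_probs, ufsltp_objective), the softmax and KL choices, the L1 penalty, and the use of SciPy's L-BFGS-B are all assumptions.

    # Illustrative sketch only; hypothetical names, not the paper's implementation.
    import numpy as np
    from scipy.optimize import minimize

    def neighbor_probs(X, w=None):
        """Per-sample softmax over negative squared pairwise distances."""
        Xw = X * w if w is not None else X
        d2 = np.sum((Xw[:, None, :] - Xw[None, :, :]) ** 2, axis=2)
        np.fill_diagonal(d2, np.inf)              # exclude self-pairs
        p = np.exp(-d2)
        return p / p.sum(axis=1, keepdims=True)

    def ufsltp_objective(w, X, P, lam=0.1):
        """KL divergence between the original and re-weighted neighbor
        distributions, plus an L1 term that drives uninformative weights to zero."""
        Q = neighbor_probs(X, w)
        kl = np.sum(P * (np.log(P + 1e-12) - np.log(Q + 1e-12)))
        return kl + lam * np.abs(w).sum()

    rng = np.random.default_rng(0)
    X = rng.random((50, 20))                      # 50 samples, 20 features
    P = neighbor_probs(X)                         # distributions before selection
    res = minimize(ufsltp_objective, x0=np.ones(X.shape[1]), args=(X, P),
                   method="L-BFGS-B", bounds=[(0.0, 1.0)] * X.shape[1])
    selected = np.argsort(-res.x)[:5]             # keep the 5 highest-weighted features
    print("selected feature indices:", selected)

In this toy setting the five highest-weighted features are returned; the paper's actual formulation differs in that it preserves the total order of pairwise distances rather than their exact values and builds the neighbor sets via ANS and UNS.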


Notes

  1. In the following, we refer to pairwise sample similarity as the global structure, to keep it consistent with the local manifold structure.

  2. http://archive.ics.uci.edu/ml/datasets.html.

  3. http://featureselection.asu.edu/datasets.php.


Acknowledgment

This work is supported by the National Key Research and Development Program of China (2016YFB1000101), the National Natural Science Foundation of China (Grant No. 61379052), the Science Foundation of the Ministry of Education of China (Grant No. 2018A02002), and the Natural Science Foundation for Distinguished Young Scholars of Hunan Province (Grant No. 14JJ1026).

Author information

Corresponding author

Correspondence to Yijie Wang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ma, R., Wang, Y., Cheng, L. (2019). Unsupervised Feature Selection via Local Total-Order Preservation. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning. ICANN 2019. Lecture Notes in Computer Science, vol 11728. Springer, Cham. https://doi.org/10.1007/978-3-030-30484-3_2

  • DOI: https://doi.org/10.1007/978-3-030-30484-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30483-6

  • Online ISBN: 978-3-030-30484-3

  • eBook Packages: Computer Science (R0)
