A High-Dimensional and Multi-granularity Feature Selection Method Based on CNN and RF

  • Yinghong Sun
  • Lei Liu (Email author)
  • Sheng Chen
  • Liangwen Hou
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1074)

Abstract

Feature engineering determines the upper limit on the performance of a machine learning algorithm, and feature selection is its most critical step. However, high-dimensional, multi-granularity feature data give rise to the curse of dimensionality, which makes effective feature selection very difficult. To address this issue, we propose a feature selection method based on Convolutional Neural Networks and Random Forest (FSCNNRF). The model has two parts: a Feature Selection Convolutional Neural Network (FSCNN), used for dimensionality reduction, and a Random Forest (RF), used for feature selection. First, FSCNN reduces the dimensionality of the high-dimensional, multi-granularity feature data so that each feature becomes a single-granularity feature. Then RF selects the valid features. Experiments show that the model selects features more effectively on high-dimensional, multi-granularity datasets and improves the performance of machine learning algorithms.
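The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' FSCNNRF implementation. Two things are assumed here and do not come from the paper: the FSCNN stage is replaced by a fixed (untrained) 1-D convolution with ReLU and global max pooling, and the RF stage is replaced by a bagged ensemble of depth-1 trees. The helper names `conv_pool`, `best_stump` and `stump_forest_importance`, the kernel weights and the synthetic data are all hypothetical.

```python
import random

def conv_pool(values, kernel=(0.25, 0.5, 0.25)):
    """Collapse one multi-value feature to a scalar.
    Assumed stand-in for FSCNN: fixed 1-D convolution, ReLU, global max pooling."""
    k = len(kernel)
    values = list(values)
    if len(values) < k:                      # zero-pad features shorter than the kernel
        values += [0.0] * (k - len(values))
    responses = [sum(w * values[i + j] for j, w in enumerate(kernel))
                 for i in range(len(values) - k + 1)]
    return max(max(r, 0.0) for r in responses)

def gini(labels):
    """Gini impurity for binary 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_stump(X, y, feature_ids):
    """Return (feature, impurity decrease) of the best single threshold split."""
    n, parent, best = len(y), gini(y), (None, 0.0)
    for f in feature_ids:
        for t in sorted(set(row[f] for row in X)):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if gain > best[1]:
                best = (f, gain)
    return best

def stump_forest_importance(X, y, n_trees=50, seed=0):
    """Simplified stand-in for RF importance: bootstrap samples, random feature
    subsets, depth-1 trees, accumulated Gini decrease normalised to sum to 1."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = max(1, int(d ** 0.5))                # features tried per tree
    importance = [0.0] * d
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        f, gain = best_stump(Xb, yb, rng.sample(range(d), m))
        if f is not None:
            importance[f] += gain
    total = sum(importance) or 1.0
    return [v / total for v in importance]

def make_sample(rng, label):
    """Synthetic multi-granularity sample: four features of length 8, 1, 4 and 2.
    Only the first feature depends on the label."""
    seq = [rng.random() for _ in range(8)]
    if label == 1:
        for i in (2, 3, 4):
            seq[i] += 3.0
    return [seq, [rng.random()], [rng.random() for _ in range(4)],
            [rng.random() for _ in range(2)]]

rng = random.Random(42)
y = [i % 2 for i in range(60)]
raw = [make_sample(rng, label) for label in y]

# Stage 1: every feature, whatever its length, becomes one scalar.
X = [[conv_pool(feature) for feature in sample] for sample in raw]

# Stage 2: rank the single-granularity features and keep the top two.
importance = stump_forest_importance(X, y)
ranked = sorted(range(len(importance)), key=lambda f: importance[f], reverse=True)
selected = ranked[:2]
print("importance:", [round(v, 3) for v in importance])
print("selected feature indices:", selected)
```

In this toy setup only the first feature carries label information, so it should dominate the ranking. A learned CNN and a full random forest would replace the two stand-ins in practice.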

Keywords

Feature engineering · Feature selection · CNN · RF

Notes

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Grant Nos. 61105040 and 61203284), the Beijing Natural Science Foundation (Grant No. 4133085), the General Program of the Science and Technology Development Project of the Beijing Municipal Education Commission (Grant No. KM201810005005), the Beijing Municipal Commission of Education Young Top-Notch Personnel Plan, and the Beijing University of Technology Science Foundation (Grant No. 006000543115502).

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Yinghong Sun (1)
  • Lei Liu (1, Email author)
  • Sheng Chen (1)
  • Liangwen Hou (1)
  1. College of Applied Sciences, Beijing University of Technology, Beijing, China