Feature Selection Using Ensemble Techniques

  • Conference paper
  • First Online:
Futuristic Trends in Network and Communication Technologies (FTNCT 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1395)

Abstract

Data used in machine learning tasks must be pre-processed to improve its quality. The features, or variables, in a dataset strongly influence the results obtained from machine learning models, so features irrelevant to the problem domain should be discarded to improve the accuracy and validity of the results. Feature selection serves this purpose: it reduces dimensionality by retaining the most informative of the original features and removing redundant or noisy ones. Effective feature selection often yields better learning performance, such as lower computational cost and more interpretable models. Understanding feature selection is especially important for student performance measurement, because constructive educational interventions depend on identifying the right set of predictive features; higher-quality data models, in turn, can improve the quality of education through better performance prediction. The choice of selection algorithm must therefore be made carefully, since it directly affects classification accuracy and simplifies model operation, and a poor choice can introduce selection bias into the analysis. In this paper, two ensemble techniques, Random Forests and Gradient Boosting Machines, are applied to a dataset for the purpose of feature selection, and several experiments are performed to demonstrate the effectiveness of the proposed method. Experimental results show that Gradient Boosting Machines perform better at feature selection.
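The full text is behind the paywall, so the exact pipeline is not shown here; however, the approach the abstract describes, ranking features by the importance scores of a Random Forest and a Gradient Boosting Machine, can be sketched with scikit-learn. The dataset below is a synthetic stand-in, not the authors' data, and the mean-importance threshold is an illustrative assumption.

```python
# Hedged sketch of ensemble-based feature selection (not the authors' exact method):
# fit two ensemble models, then keep features whose importance exceeds the mean.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in dataset: 10 features, of which 4 are informative.
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           n_redundant=2, random_state=0)

for name, model in [("Random Forest",
                     RandomForestClassifier(n_estimators=100, random_state=0)),
                    ("Gradient Boosting",
                     GradientBoostingClassifier(random_state=0))]:
    model.fit(X, y)
    # SelectFromModel with the default threshold keeps features whose
    # importance is above the mean of all importances.
    selector = SelectFromModel(model, prefit=True)
    kept = selector.get_support(indices=True)
    print(f"{name}: kept feature indices {list(kept)}")
```

Comparing the two kept sets (e.g., by retraining a classifier on each reduced dataset and comparing accuracy) is one way to reproduce the kind of comparison the abstract reports.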

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kaushik, Y., Dixit, M., Sharma, N., Garg, M. (2021). Feature Selection Using Ensemble Techniques. In: Singh, P.K., Veselov, G., Vyatkin, V., Pljonkin, A., Dodero, J.M., Kumar, Y. (eds) Futuristic Trends in Network and Communication Technologies. FTNCT 2020. Communications in Computer and Information Science, vol 1395. Springer, Singapore. https://doi.org/10.1007/978-981-16-1480-4_25

  • DOI: https://doi.org/10.1007/978-981-16-1480-4_25

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-1479-8

  • Online ISBN: 978-981-16-1480-4

  • eBook Packages: Computer Science (R0)
