Feature Selection Using Genetic Algorithm for Big Data

Saidi, Rania; Ncir, Waad Bouaguel; Essoussi, Nadia

doi:10.1007/978-3-319-74690-6_35


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 723)

Included in the following conference series:

International Conference on Advanced Machine Learning Technologies and Applications


Abstract

Feature selection is a powerful technique for dimensionality reduction and an important step in successful machine learning applications. In the last few decades, data has grown progressively larger in both the number of instances and the number of features, which makes the feature selection problem harder to handle. To cope with this new era of big data, new techniques need to be developed to address this problem effectively. Current feature selection algorithms degrade severely and become inapplicable when data size exceeds hundreds of gigabytes. In this paper, we introduce a scalable, parallel implementation of feature selection using a genetic algorithm, built on the MapReduce model. The experimental results show that the proposed method can improve the performance of feature selection.
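
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: a genetic algorithm over feature bitmasks whose fitness is computed in a map step per data partition and combined in a reduce step. The toy data, the additive agreement score, the `PENALTY` value, and every function name are assumptions made for illustration; plain Python functions stand in for a real MapReduce runtime, and this should not be read as the authors' method.

```python
import random
from collections import defaultdict

# Toy data (an assumption for illustration): 4 binary features, binary label.
# Features 0 and 2 copy the label; features 1 and 3 are uncorrelated with it.
ROWS = [((i % 2, (i // 2) % 2, i % 2, (i // 4) % 2), i % 2) for i in range(40)]
PARTITIONS = [ROWS[k:k + 10] for k in range(0, 40, 10)]  # stand-in for input splits
PENALTY = 1.0  # hypothetical cost charged per selected feature


def map_phase(partition, population):
    """Emit (chromosome index, partial score) pairs for one data partition."""
    for idx, mask in enumerate(population):
        partial = 0
        for features, label in partition:
            for bit, value in zip(mask, features):
                if bit:  # only selected features contribute
                    partial += 1 if value == label else -1
        yield idx, partial


def reduce_phase(pairs, population):
    """Sum partial scores per chromosome, then subtract the size penalty."""
    totals = defaultdict(float)
    for idx, partial in pairs:
        totals[idx] += partial
    return [totals[i] - PENALTY * sum(mask) for i, mask in enumerate(population)]


def evaluate(population):
    """Run the map step on every partition and reduce the emitted pairs."""
    pairs = [pair for part in PARTITIONS for pair in map_phase(part, population)]
    return reduce_phase(pairs, population)


def run_ga(generations=30, pop_size=8, mutation_rate=0.1, seed=0):
    """Evolve feature masks with elitism, one-point crossover and bit-flip mutation."""
    rng = random.Random(seed)
    n = len(ROWS[0][0])
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        fitness = evaluate(population)
        ranked = [m for _, m in sorted(zip(fitness, population),
                                       key=lambda t: t[0], reverse=True)]
        next_pop = [ranked[0][:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            a, b = rng.sample(ranked[:pop_size // 2], 2)  # parents from the top half
            cut = rng.randint(1, n - 1)
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        population = next_pop
    fitness = evaluate(population)
    best = max(range(pop_size), key=lambda i: fitness[i])
    return population[best], fitness[best]


if __name__ == "__main__":
    mask, score = run_ga()
    print("selected feature mask:", mask, "fitness:", score)
```

Because the score here is additive over rows, each partition can be scored independently and the partial results summed, which is what makes the fitness evaluation fit the map and reduce pattern. A fitness based on a trained classifier would need a different decomposition.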



Author information

Authors and Affiliations

LARODEC, ISG, University of Tunis, Tunis, Tunisia
Rania Saidi, Waad Bouaguel Ncir & Nadia Essoussi


Corresponding author

Correspondence to Rania Saidi.

Editor information

Editors and Affiliations

Faculty of Computers and Information, Information Technology Department, Cairo University, Giza, Egypt
Aboul Ella Hassanien
Faculty of Computers and Information Sciences, Ain Shams University, Cairo, Egypt
Mohamed F. Tolba
Faculty of Computers and Information, Department of Computer Science and Engineering, Mansoura University, Dakahlia, Egypt
Mohamed Elhoseny
Arab Academy for Science, Technology and Maritime Transport (AASTMT), Dokki, Egypt
Mohamed Mostafa


About this paper

Cite this paper

Saidi, R., Ncir, W.B., Essoussi, N. (2018). Feature Selection Using Genetic Algorithm for Big Data. In: Hassanien, A., Tolba, M., Elhoseny, M., Mostafa, M. (eds) The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2018). AMLTA 2018. Advances in Intelligent Systems and Computing, vol 723. Springer, Cham. https://doi.org/10.1007/978-3-319-74690-6_35


DOI: https://doi.org/10.1007/978-3-319-74690-6_35
Published: 26 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-74689-0
Online ISBN: 978-3-319-74690-6
eBook Packages: Engineering, Engineering (R0)
