Abstract
Many real-world optimization and search problems have state spaces too large to be solved exactly in reasonable time, so a mechanism for generating approximate solutions must be adopted. Genetic Algorithms, a subclass of Evolutionary Algorithms, are among the most widely used methods for finding approximate yet useful solutions to hard problems. Because they are population-based and iterative, Evolutionary Algorithms are well suited to parallelization and distribution. Several distributed models have been proposed to meet the challenges of implementing parallel Evolutionary Algorithms. Among them, the MapReduce paradigm has proved to be a suitable abstraction for mapping the evolutionary process. In this paper, we propose a generic framework, DHE\(^{2}\) (Distributed Hybrid Evolution Engine), that implements distributed Evolutionary Algorithms on top of the open-source MapReduce implementation in Apache Hadoop. Within DHE\(^{2}\), we propose and implement two distributed hybrid evolution models, MasterSlaveIslands and MicroMacroIslands, alongside a real-world clustering application that efficiently avoids local optima. Experiments with this application demonstrate the increased performance of DHE\(^{2}\).
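To make the idea of mapping an evolutionary process onto MapReduce concrete, the following is a minimal, single-process sketch of a generic island-model Genetic Algorithm written as map, shuffle, and reduce steps. It is not the DHE\(^{2}\) API, it does not use Hadoop, and it does not reproduce the MasterSlaveIslands or MicroMacroIslands models, whose details are not given in the abstract. All names (`map_phase`, `reduce_phase`, `migrate`, `evolve`), the OneMax toy fitness function, the ring migration scheme, and the parameter values are assumptions chosen for illustration.

```python
import random
from collections import defaultdict

GENOME_LEN = 20  # assumed toy genome size


def fitness(genome):
    # OneMax toy objective (assumption): number of 1-bits in the genome.
    return sum(genome)


def map_phase(population):
    # "Map": score every individual independently, keyed by its island id.
    for island_id, genome in population:
        yield island_id, (fitness(genome), genome)


def shuffle_phase(pairs):
    # Group mapper output by key, as a MapReduce runtime would do.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(island_id, scored, rng):
    # "Reduce": breed one island's next generation using elitism,
    # binary tournament selection, one-point crossover, bit-flip mutation.
    scored = sorted(scored, key=lambda s: s[0], reverse=True)
    children = [list(scored[0][1])]  # elitism: keep the island's best
    while len(children) < len(scored):
        p1 = max(rng.sample(scored, 2), key=lambda s: s[0])[1]
        p2 = max(rng.sample(scored, 2), key=lambda s: s[0])[1]
        cut = rng.randrange(1, GENOME_LEN)
        child = p1[:cut] + p2[cut:]
        child = [bit ^ 1 if rng.random() < 0.02 else bit for bit in child]
        children.append(child)
    return [(island_id, child) for child in children]


def migrate(population, n_islands):
    # Ring migration (assumption): copy each island's best individual
    # over the worst individual of the next island.
    islands = defaultdict(list)
    for island_id, genome in population:
        islands[island_id].append(genome)
    bests = {i: max(genomes, key=fitness) for i, genomes in islands.items()}
    for i, best in bests.items():
        dest = islands[(i + 1) % n_islands]
        worst = min(range(len(dest)), key=lambda k: fitness(dest[k]))
        dest[worst] = list(best)
    return [(i, g) for i in sorted(islands) for g in islands[i]]


def evolve(n_islands=4, island_size=10, generations=30,
           migrate_every=5, seed=0):
    rng = random.Random(seed)
    population = [(i, [rng.randint(0, 1) for _ in range(GENOME_LEN)])
                  for i in range(n_islands) for _ in range(island_size)]
    for gen in range(1, generations + 1):
        groups = shuffle_phase(map_phase(population))  # one "job" per generation
        population = [ind for island_id in sorted(groups)
                      for ind in reduce_phase(island_id, groups[island_id], rng)]
        if gen % migrate_every == 0:
            population = migrate(population, n_islands)
    return population


if __name__ == "__main__":
    final = evolve()
    print("best fitness:", max(fitness(g) for _, g in final))
```

In this sketch the island id serves as the shuffle key, so each reducer sees exactly one subpopulation and islands evolve independently between migrations. In an actual Hadoop deployment each generation would typically be a separate job, which makes per-job overhead a central design concern.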
O. Stroie, E.-S. Apostol and C.-O. Truică contributed equally to this work.
Acknowledgments
The publication of this paper is supported by the University Politehnica of Bucharest through the PubArt program.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Stroie, O., Apostol, E.-S., Truică, C.-O. (2020). DHE\(^{2}\): Distributed Hybrid Evolution Engine for Performance Optimizations of Computationally Intensive Applications. In: Song, M., Song, I.-Y., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2020. Lecture Notes in Computer Science, vol 12393. Springer, Cham. https://doi.org/10.1007/978-3-030-59065-9_2
DOI: https://doi.org/10.1007/978-3-030-59065-9_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59064-2
Online ISBN: 978-3-030-59065-9
eBook Packages: Computer Science, Computer Science (R0)