Implementing Parallel Differential Evolution on Spark

  • Diego Teijeiro
  • Xoán C. Pardo
  • Patricia González
  • Julio R. Banga
  • Ramón Doallo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9598)

Abstract

Metaheuristics are gaining increased attention as an efficient way of solving hard global optimization problems. Differential Evolution (DE) is one of the most popular algorithms in that class. However, its application to realistic problems results in excessive computation times. Several parallel DE schemes have therefore been proposed, most of them focused on traditional parallel programming interfaces and infrastructures. With the emergence of Cloud Computing, new programming models, like Spark, have appeared that are suited to large-scale data processing on clouds. In this paper we investigate the applicability of Spark to develop parallel DE schemes to be executed in a distributed environment. Both the master-slave and the island-based DE schemes usually found in the literature have been implemented using Spark. The speedup and efficiency of all the implementations were evaluated on the Amazon Web Services (AWS) public cloud, concluding that the island-based solution is the best suited to the distributed nature of Spark. It achieves a good speedup versus the serial implementation and shows decent scalability as the number of nodes grows.
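
To make the island-based scheme mentioned above more concrete, the following is a minimal single-process sketch in Python, not the authors' Spark implementation. Each island is a plain list standing in for a data partition, the DE variant (DE/rand/1/bin), the ring migration policy, the sphere objective, and all parameter values and function names are assumptions chosen only for illustration.

```python
import random


def sphere(x):
    """Hypothetical test objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def de_generation(pop, fit, f_obj, rng, F=0.5, CR=0.9):
    """Run one DE/rand/1/bin generation over one subpopulation, in place."""
    n, dim = len(pop), len(pop[0])
    for i in range(n):
        # Three distinct individuals, all different from the target i.
        a, b, c = rng.sample([j for j in range(n) if j != i], 3)
        j_rand = rng.randrange(dim)  # at least one gene comes from the mutant
        trial = [
            pop[a][k] + F * (pop[b][k] - pop[c][k])
            if (rng.random() < CR or k == j_rand) else pop[i][k]
            for k in range(dim)
        ]
        t_fit = f_obj(trial)
        if t_fit <= fit[i]:  # greedy selection keeps the better vector
            pop[i], fit[i] = trial, t_fit


def island_de(f_obj, dim=5, islands=4, island_size=10,
              rounds=20, gens_per_round=10, seed=0):
    """Island-based DE: isolated evolution phases separated by migrations."""
    rng = random.Random(seed)
    pops = [[[rng.uniform(-5.0, 5.0) for _ in range(dim)]
             for _ in range(island_size)] for _ in range(islands)]
    fits = [[f_obj(x) for x in pop] for pop in pops]

    for _ in range(rounds):
        # Evolution phase: islands do not interact, so in a distributed
        # setting each one could be processed by a different worker.
        for pop, fit in zip(pops, fits):
            for _ in range(gens_per_round):
                de_generation(pop, fit, f_obj, rng)

        # Migration phase (assumed policy): ring topology, the best of
        # island i replaces the worst of island i+1 if it is better.
        bests = []
        for pop, fit in zip(pops, fits):
            b = min(range(island_size), key=fit.__getitem__)
            bests.append((list(pop[b]), fit[b]))
        for i, (migrant, m_fit) in enumerate(bests):
            dest = (i + 1) % islands
            w = max(range(island_size), key=fits[dest].__getitem__)
            if m_fit < fits[dest][w]:
                pops[dest][w], fits[dest][w] = list(migrant), m_fit

    best_fit, best_x = min(
        (f, x) for pop, fit in zip(pops, fits) for x, f in zip(pop, fit)
    )
    return best_x, best_fit


if __name__ == "__main__":
    x, fx = island_de(sphere)
    print("best fitness found:", fx)
```

In this sketch the only synchronization point is the migration phase, which is why the island model tends to map more naturally onto partitioned, coarse-grained execution than a master-slave scheme that exchanges data on every fitness evaluation.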

Keywords

Metaheuristics; Differential evolution; Cloud computing; Spark; Amazon Web Services

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Diego Teijeiro (1)
  • Xoán C. Pardo (1)
  • Patricia González (1)
  • Julio R. Banga (2)
  • Ramón Doallo (1)

  1. Grupo de Arquitectura de Computadores, Universidade da Coruña, A Coruña, Spain
  2. BioProcess Engineering Group, IIM-CSIC, Pontevedra, Spain
