Synergies Between Reinforcement Learning and Evolutionary Dynamic Optimisation

  • Aman Soni
  • Peter R. Lewis
  • Anikó Ekárt
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 732)


A connection has recently been drawn between dynamic optimization and reinforcement learning problems as subsets of a broader class of sequential decision-making problems. We present a unified approach that enables the cross-pollination of ideas between established communities, and could help to develop rigorous methods for algorithm comparison and selection for real-world resource-constrained problems.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Aston Labs for Intelligent Collectives Engineering (ALICE), Aston University, Birmingham, UK
