
Reinforcement Learning-Based Differential Evolution for Global Optimization

Chapter in: Differential Evolution: From Theory to Practice

Part of the book series: Studies in Computational Intelligence (SCI, volume 1009)


Abstract

Reinforcement learning is a computational approach that mimics learning from interaction and complements the existing supervised and unsupervised learning methods within the machine learning field. It is based on mapping a given situation to an action, where each action is evaluated by a reward. Crucially, this mapping is performed using suitable policies that correspond to a set of so-called psychological stimulus-response rules (associations). In reinforcement learning, however, we are not interested in immediate rewards but in a value function that specifies how good the rewards are in the long run. In this study, a reinforcement learning differential evolution is proposed. On the one hand, a Q-learning algorithm, capable of ensuring good behavior of the evolutionary search process through explicit strategy exploration, is engaged to identify the more prominent mutation strategies within an ensemble of strategies. On the other hand, the reinforcement learning mechanism selects among strategies incorporated from the original L-SHADE algorithm, which uses the 'DE/current-to-pbest/1/bin' mutation strategy, through iL-SHADE, to jSO, which uses the 'DE/current-to-pbest-w/1/bin' mutation strategy. The proposed RL-SHADE algorithm was tested on the well-established function benchmark suites from the popular CEC special sessions/competitions on real-parameter single-objective optimization of the last decade, during which three different benchmark suites were issued. We expected the results of the proposed RL-SHADE algorithm to outperform those of the three original algorithms in solving all the observed benchmarks.
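
The abstract describes a Q-learning mechanism that chooses among the mutation strategies of L-SHADE, iL-SHADE, and jSO. The following Python sketch is not the authors' RL-SHADE implementation; it only illustrates the idea under stated assumptions: a single-state tabular Q-learning selector, a toy sphere objective instead of the CEC benchmarks, fixed F and CR instead of SHADE's success-history adaptation, and a simplified distinction between the strategies in which only the jSO-style weighted scale factor differs. The names QStrategySelector, mutate_current_to_pbest, and run are hypothetical.

    import random
    import numpy as np

    # Strategy labels; in this toy sketch only the weighted-F variant (jSO-style)
    # differs from the base mutation; iL-SHADE-specific parameter schedules are omitted.
    STRATEGIES = ("DE/current-to-pbest/1/bin (L-SHADE)",
                  "DE/current-to-pbest/1/bin (iL-SHADE, simplified)",
                  "DE/current-to-pbest-w/1/bin (jSO)")

    def mutate_current_to_pbest(pop, i, pbest, r1, r2, F, weighted, g_ratio):
        # v_i = x_i + Fw * (x_pbest - x_i) + F * (x_r1 - x_r2);
        # the weighted variant scales Fw with search progress (illustrative values).
        Fw = F
        if weighted:
            Fw = F * (0.7 if g_ratio < 0.2 else 0.8 if g_ratio < 0.4 else 1.2)
        return pop[i] + Fw * (pop[pbest] - pop[i]) + F * (pop[r1] - pop[r2])

    class QStrategySelector:
        """Single-state tabular Q-learning over the ensemble of mutation strategies."""

        def __init__(self, n_actions, alpha=0.1, gamma=0.9, eps=0.1):
            self.q = np.zeros(n_actions)
            self.alpha, self.gamma, self.eps = alpha, gamma, eps

        def select(self):
            if random.random() < self.eps:          # explore
                return random.randrange(len(self.q))
            return int(np.argmax(self.q))           # exploit

        def update(self, action, reward):
            # Q(a) <- Q(a) + alpha * (r + gamma * max_a' Q(a') - Q(a))
            target = reward + self.gamma * float(np.max(self.q))
            self.q[action] += self.alpha * (target - self.q[action])

    def sphere(x):
        return float(np.sum(x * x))                 # toy objective, not a CEC benchmark

    def run(dim=10, pop_size=30, F=0.5, CR=0.9, generations=200, seed=1):
        rng = np.random.default_rng(seed)
        random.seed(seed)
        pop = rng.uniform(-5.0, 5.0, (pop_size, dim))
        fit = np.array([sphere(x) for x in pop])
        selector = QStrategySelector(len(STRATEGIES))

        for gen in range(generations):
            action = selector.select()
            best_before = fit.min()
            order = np.argsort(fit)                 # indices sorted by fitness
            for i in range(pop_size):
                pbest = int(order[random.randrange(max(2, pop_size // 10))])
                r1, r2 = random.sample([j for j in range(pop_size) if j != i], 2)
                v = mutate_current_to_pbest(pop, i, pbest, r1, r2, F,
                                            weighted=(action == 2),
                                            g_ratio=gen / generations)
                mask = rng.random(dim) < CR         # binomial crossover
                mask[random.randrange(dim)] = True
                u = np.where(mask, v, pop[i])
                fu = sphere(u)
                if fu <= fit[i]:                    # greedy selection
                    pop[i], fit[i] = u, fu
            # Reward the chosen strategy only if it improved the best-so-far fitness.
            selector.update(action, 1.0 if fit.min() < best_before else 0.0)

        return fit.min(), selector.q

    if __name__ == "__main__":
        best, q_values = run()
        print("best fitness:", best, "Q-values:", q_values)

In the actual RL-SHADE family, the reward would be tied to the behavior of the evolutionary search and each strategy would carry the full parameter-adaptation machinery of L-SHADE, iL-SHADE, or jSO; the sketch only shows where the Q-value update and the epsilon-greedy selection plug into a DE generation loop.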

Acknowledgements

Iztok Fister Jr. is grateful to the Slovenian Research Agency for the financial support under Research Core Funding No. P2-0057. Dušan Fister is grateful to the Slovenian Research Agency for the financial support under Research Core Funding No. P5-0027. Iztok Fister thanks the Slovenian Research Agency for the financial support under Research Core Funding No. P2-0042 - Digital twin.

Author information

Correspondence to Iztok Fister Jr.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Fister, I., Fister, D., Fister, I. (2022). Reinforcement Learning-Based Differential Evolution for Global Optimization. In: Kumar, B.V., Oliva, D., Suganthan, P.N. (eds) Differential Evolution: From Theory to Practice. Studies in Computational Intelligence, vol 1009. Springer, Singapore. https://doi.org/10.1007/978-981-16-8082-3_3
