Abstract
Reinforcement learning is a computational approach that mimics learning from interaction and complements the established supervised and unsupervised learning methods within the machine learning field. It is based on mapping a given situation to an action, where each action is evaluated by a reward. Crucially, this mapping is performed using suitable policies that correspond to a set of so-called psychological stimulus-response rules (associations). In reinforcement learning, however, we are not interested in immediate rewards, but in a value function that specifies how good the rewards are in the long run. This study proposes a reinforcement learning differential evolution. On the one hand, a Q-learning algorithm, capable of ensuring good behavior of the evolutionary search process through explicit strategy exploration, is engaged to identify the more prominent mutation strategies within an ensemble of strategies. On the other hand, the reinforcement learning mechanism selects among strategies incorporated from the original L-SHADE algorithm (using the ‘DE/current-to-pbest/1/bin’ mutation strategy), through iL-SHADE, to jSO (using the ‘DE/current-to-pbest-w/1/bin’ mutation strategy). The proposed RL-SHADE algorithm was tested on the well-established function benchmark suites from the popular CEC special sessions/competitions on real-parameter single-objective optimization over the last decade, during which three different benchmark suites were issued. We expected the proposed RL-SHADE algorithm to outperform the three original algorithms on all the observed benchmarks.
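The strategy-selection idea described above can be illustrated with a minimal sketch. This is not the RL-SHADE implementation from the chapter: the strategy names are the two mutation strategies mentioned in the abstract, while the single-state Q-table, the epsilon-greedy policy, the reward definition (relative fitness improvement), and all parameter values (`alpha`, `gamma`, `epsilon`) are illustrative assumptions only.

```python
import random

# Two candidate mutation strategies forming the ensemble (from the abstract).
STRATEGIES = ["DE/current-to-pbest/1/bin", "DE/current-to-pbest-w/1/bin"]
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate

# Tabular Q-learning with a single state: one Q-value per strategy.
q = {s: 0.0 for s in STRATEGIES}

def select_strategy():
    """Epsilon-greedy policy: explore a random strategy, else exploit the best."""
    if random.random() < epsilon:
        return random.choice(STRATEGIES)
    return max(q, key=q.get)

def update(strategy, reward):
    """Q-learning update; with one state, max_a' Q(s', a') reduces to max(q)."""
    q[strategy] += alpha * (reward + gamma * max(q.values()) - q[strategy])

# Toy loop: the reward is the fitness improvement achieved by the chosen
# strategy.  A random decrement stands in for one real generation of DE.
random.seed(1)
best = 100.0
for generation in range(50):
    s = select_strategy()
    new_best = best - random.uniform(0.0, 1.0 if s.endswith("w/1/bin") else 0.5)
    update(s, reward=max(0.0, best - new_best))
    best = new_best
```

In this single-state setting Q-learning degenerates to tracking a running estimate of each strategy's average payoff; a fuller design could condition the state on search-progress features such as population diversity or stagnation counters.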
Acknowledgements
Iztok Fister Jr. is grateful to the Slovenian Research Agency for the financial support under Research Core Funding No. P2-0057. Dušan Fister is grateful to the Slovenian Research Agency for the financial support under Research Core Funding No. P5-0027. Iztok Fister thanks the Slovenian Research Agency for the financial support under Research Core Funding No. P2-0042 - Digital twin.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Fister, I., Fister, D., Fister, I. (2022). Reinforcement Learning-Based Differential Evolution for Global Optimization. In: Kumar, B.V., Oliva, D., Suganthan, P.N. (eds) Differential Evolution: From Theory to Practice. Studies in Computational Intelligence, vol 1009. Springer, Singapore. https://doi.org/10.1007/978-981-16-8082-3_3
DOI: https://doi.org/10.1007/978-981-16-8082-3_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8081-6
Online ISBN: 978-981-16-8082-3
eBook Packages: Intelligent Technologies and Robotics