Abstract
Improving on a previous paper, we explicitly relate reinforcement learning and selection learning (PBIL) algorithms for combinatorial optimization, understood here as the task of finding a fixed-length binary string that maximizes an arbitrary function. We show the equivalence of searching for an optimal string and searching for a probability distribution over strings that maximizes the expectation of the function; in this paper, however, we restrict attention to the family of Bernoulli distributions. We then introduce two gradient dynamical systems acting on probability vectors. The first maximizes the expectation of the function and leads to reinforcement learning algorithms, whereas the second maximizes the logarithm of that expectation and leads to selection learning algorithms. We conclude with a stability analysis of the solutions.
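The two dynamics named in the abstract can be sketched with a small Monte Carlo experiment. The snippet below is not from the paper; the OneMax objective, sample size, step size, and clipping bounds are illustrative assumptions. It estimates the gradient of the expectation of the function under a Bernoulli product distribution with the score-function (REINFORCE) estimator, derives the normalized gradient of its logarithm from it, and performs gradient ascent on the probability vector with either one.

```python
import random

def f(x):
    # Toy objective (OneMax): number of ones. Stands in for an arbitrary
    # function on fixed-length binary strings.
    return sum(x)

def sample(p):
    # Draw one binary string from the Bernoulli product distribution p.
    return [1 if random.random() < pi else 0 for pi in p]

def grad_estimates(p, n_samples=200):
    # Monte Carlo estimates of grad E_p[f] and grad log E_p[f] = grad E_p[f] / E_p[f].
    xs = [sample(p) for _ in range(n_samples)]
    fs = [f(x) for x in xs]
    mean_f = sum(fs) / n_samples
    grad = []
    for i, pi in enumerate(p):
        # Score-function (REINFORCE) estimator: E[f(x) * d log P(x) / d p_i],
        # where d log P(x) / d p_i = (x_i - p_i) / (p_i (1 - p_i)) for a Bernoulli bit.
        g = sum(fv * (x[i] - pi) / (pi * (1 - pi)) for x, fv in zip(xs, fs)) / n_samples
        grad.append(g)
    # Normalizing by the mean fitness gives the selection-style (log-expectation) gradient.
    grad_log = [g / mean_f for g in grad]
    return grad, grad_log

def ascend(p, lr=0.01, steps=300, use_log=False):
    # Gradient ascent on the probability vector; use_log selects the
    # selection-style dynamics, otherwise the reinforcement-style dynamics.
    for _ in range(steps):
        g, g_log = grad_estimates(p)
        d = g_log if use_log else g
        # Clip away from {0, 1} to keep the score-function estimator finite.
        p = [min(max(pi + lr * di, 0.05), 0.95) for pi, di in zip(p, d)]
    return p
```

Under both dynamics the probability vector drifts toward the all-ones corner on this toy objective; the normalized (log-expectation) variant rescales the step by the current mean fitness, which is what gives it its selection-like character.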
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Berny, A. (2000). Selection and Reinforcement Learning for Combinatorial Optimization. In: Schoenauer, M., et al. Parallel Problem Solving from Nature PPSN VI. PPSN 2000. Lecture Notes in Computer Science, vol 1917. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45356-3_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41056-0
Online ISBN: 978-3-540-45356-7