Dynamic Multi-Armed Bandits and Extreme Value-Based Rewards for Adaptive Operator Selection in Evolutionary Algorithms

  • Álvaro Fialho
  • Luis Da Costa
  • Marc Schoenauer
  • Michèle Sebag
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5851)


The performance of many efficient algorithms critically depends on the tuning of their parameters, which in turn depends on the problem at hand. For example, the performance of Evolutionary Algorithms critically depends on the judicious setting of the operator rates. The Adaptive Operator Selection (AOS) heuristic proposed here rewards each operator based on the extreme value of the fitness improvements it has recently produced, and uses a Multi-Armed Bandit (MAB) selection process based on those rewards to choose which operator to apply next. This Extreme-based Multi-Armed Bandit approach is experimentally validated against the Average-based MAB method, and is shown to outperform previously published methods, whether using a classical Average-based rewarding technique or the same Extreme-based mechanism. The validation test suite includes the easy One-Max problem and a family of hard problems known as “Long k-paths”.
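
The two ingredients named in the abstract, an extreme-value reward and a bandit-based choice of the next operator, can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: the class name `ExtremeRewardBandit`, the sliding `window` of recent improvements, the `scaling` factor on the exploration term, and the use of the standard UCB1 index as the bandit rule. The change-detection part implied by "Dynamic" in the title is not shown.

```python
import math
from collections import deque


class ExtremeRewardBandit:
    """UCB1-style operator selector fed with extreme-value rewards (illustrative sketch)."""

    def __init__(self, n_operators, window=50, scaling=1.0):
        # Assumption: each operator keeps its last `window` fitness improvements.
        self.recent = [deque(maxlen=window) for _ in range(n_operators)]
        # Assumption: `scaling` weights exploration against the empirical reward.
        self.scaling = scaling
        self.counts = [0] * n_operators    # how often each operator was applied
        self.means = [0.0] * n_operators   # running mean of rewards per operator

    def select(self):
        # Try every operator once before relying on the UCB scores.
        for op, n in enumerate(self.counts):
            if n == 0:
                return op
        total = sum(self.counts)
        scores = [
            self.means[op]
            + self.scaling * math.sqrt(2.0 * math.log(total) / self.counts[op])
            for op in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, op, fitness_improvement):
        # Negative changes count as "no improvement" (an assumption of this sketch).
        self.recent[op].append(max(0.0, fitness_improvement))
        # Extreme-value reward: the best improvement seen in the recent window,
        # rather than the average of that window.
        reward = max(self.recent[op])
        self.counts[op] += 1
        self.means[op] += (reward - self.means[op]) / self.counts[op]
        return reward
```

In an evolutionary loop, one would call `select()` to pick an operator, apply it, measure the fitness change of the offspring relative to its parent, and pass that change to `update()`. Replacing `max(self.recent[op])` with the window mean would give an average-based reward of the kind the abstract compares against.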





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Álvaro Fialho (1)
  • Luis Da Costa (2)
  • Marc Schoenauer (1, 2)
  • Michèle Sebag (1, 2)

  1. Microsoft Research – INRIA Joint Centre, Orsay, France
  2. TAO team, INRIA Saclay – Île-de-France & LRI (UMR CNRS 8623), Orsay, France
