The Bayesian Search Game

Marc Toussaint
Part of the Natural Computing Series book series (NCS)


The aim of this chapter is to draw links between (1) No Free Lunch (NFL) theorems, which, interpreted inversely, lay the foundation for designing search heuristics that exploit prior knowledge about the function; (2) partially observable Markov decision processes (POMDPs) and their approach to the problem of sequentially and optimally choosing search points; and (3) the use of Gaussian processes as a representation of belief, i.e., of knowledge about the problem. On the one hand, this joint discussion of NFL, POMDPs and Gaussian processes gives a broader view of the problem of designing search heuristics. On the other hand, it naturally introduces efficient global optimization algorithms that are well known in operations research and geology (Gutmann, J Glob Optim 19:201–227, 2001; Jones et al., J Glob Optim 13:455–492, 1998; Jones, J Glob Optim 21:345–383, 2001) and which, in our view, arise naturally from a discussion of NFL and POMDPs.
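The efficient global optimization approach the abstract alludes to can be sketched in a few lines: fit a Gaussian process to the points evaluated so far, then pick the next search point by maximizing the standard expected-improvement criterion (Jones et al., 1998). The sketch below is a minimal illustration, not code from the chapter; the toy objective `f`, the RBF length scale, the candidate grid, and all other parameters are invented for demonstration.

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(a, b, ls=0.3):
    # squared-exponential (RBF) covariance between two 1-D point sets
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, Xq, noise=1e-6):
    # GP posterior mean and variance at query points Xq (zero prior mean)
    K = rbf(X, X) + noise * np.eye(len(X))
    Kq = rbf(X, Xq)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Kq.T @ alpha
    v = np.linalg.solve(L, Kq)
    var = 1.0 - np.sum(v * v, axis=0)   # diag of prior RBF covariance is 1
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, best):
    # expected improvement over the current best value (minimization)
    s = np.sqrt(var)
    z = (best - mu) / s
    Phi = 0.5 * (1.0 + np.array([erf(zi / sqrt(2)) for zi in z]))
    phi = np.exp(-0.5 * z ** 2) / sqrt(2 * pi)
    return (best - mu) * Phi + s * phi

def f(x):
    # hypothetical expensive black-box objective, invented for this demo
    return np.sin(3 * x) + 0.5 * x

X = np.array([0.1, 0.5, 0.9])        # initial design points
y = f(X)
Xq = np.linspace(0.0, 1.0, 201)      # candidate grid for the next query
for _ in range(5):
    mu, var = gp_posterior(X, y, Xq)
    x_next = Xq[np.argmax(expected_improvement(mu, var, y.min()))]
    X, y = np.append(X, x_next), np.append(y, f(x_next))
best_x, best_y = X[np.argmin(y)], y.min()
```

The GP posterior plays the role of the belief state discussed in the chapter: the expected-improvement rule is a myopic (one-step) policy on that belief, trading off the posterior mean (exploitation) against the posterior variance (exploration).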


Keywords: Gaussian process, optimal policy, belief state, prior belief, search point



This research was supported by the German Research Foundation (DFG), Emmy Noether fellowship TO 409/1-3.


References

  1. A. Auger, O. Teytaud, Continuous lunches are free plus the design of optimal optimization algorithms. Algorithmica 57(1), 121–146 (2010)
  2. H. Gutmann, A radial basis function method for global optimization. J. Glob. Optim. 19, 201–227 (2001)
  3. N. Hansen, A. Ostermeier, Completely derandomized self-adaptation in evolution strategies. Evol. Comput. 9, 159–195 (2001)
  4. M. Hutter, Towards a universal theory of artificial intelligence based on algorithmic probability and sequential decision theory. arXiv:cs.AI/0012011 (2000)
  5. C. Igel, M. Toussaint, A no-free-lunch theorem for non-uniform distributions of target functions. J. Math. Model. Algorithms 3, 313–322 (2004)
  6. D. Jones, M. Schonlau, W. Welch, Efficient global optimization of expensive black-box functions. J. Glob. Optim. 13, 455–492 (1998)
  7. D.R. Jones, A taxonomy of global optimization methods based on response surfaces. J. Glob. Optim. 21, 345–383 (2001)
  8. M. Pelikan, D.E. Goldberg, F. Lobo, A survey of optimization by building and using probabilistic models. Technical Report IlliGAL-99018, Illinois Genetic Algorithms Laboratory (1999)
  9. J. Pineau, G. Gordon, S. Thrun, Anytime point-based approximations for large POMDPs. J. Artif. Intell. Res. 27, 335–380 (2006)
  10. J. Poland, Explicit local models: towards optimal optimization algorithms. Technical Report IDSIA-09-04 (2004)
  11. P. Poupart, C. Boutilier, Bounded finite state controllers, in Advances in Neural Information Processing Systems 16 (NIPS 2003), Vancouver (MIT Press, 2004)
  12. P. Poupart, N. Vlassis, J. Hoey, K. Regan, An analytic solution to discrete Bayesian reinforcement learning, in Proceedings of the 23rd International Conference on Machine Learning (ICML 2006), Pittsburgh, 2006, pp. 697–704
  13. C.E. Rasmussen, C. Williams, Gaussian Processes for Machine Learning (MIT Press, Cambridge, 2006)
  14. M. Toussaint, Compact representations as a search strategy: compression EDAs. Theor. Comput. Sci. 361, 57–71 (2006)
  15. H. Ulmer, F. Streichert, A. Zell, Optimization by Gaussian processes assisted evolution strategies, in International Conference on Operations Research (OR 2003) (Springer, Heidelberg, 2003), pp. 435–442
  16. D.H. Wolpert, W.G. Macready, No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1(1), 67–82 (1997)

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

Machine Learning & Robotics Lab, Free University of Berlin, Berlin, Germany
