The Bayesian Search Game
The aim of this chapter is to draw links between (1) No Free Lunch (NFL) theorems, which, interpreted inversely, lay the foundation for designing search heuristics that exploit prior knowledge about the function; (2) partially observable Markov decision processes (POMDPs) and their approach to the problem of sequentially and optimally choosing search points; and (3) the use of Gaussian processes as a representation of belief, i.e., knowledge about the problem. On the one hand, this joint discussion of NFL, POMDPs, and Gaussian processes gives a broader view of the problem of search heuristics. On the other hand, it naturally introduces efficient global optimization algorithms that are well known in operations research and geology (Gutmann, J Glob Optim 19:201–227, 2001; Jones et al., J Glob Optim 13:455–492, 1998; Jones, J Glob Optim 21:345–383, 2001) and that, in our view, arise naturally from a discussion of NFL and POMDPs.
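To make the idea of a Gaussian process as a belief state concrete, the following is a minimal sketch, not the chapter's own algorithm. It assumes a one-dimensional search space given as a finite list of candidate points, a squared-exponential kernel with a hand-picked length scale, a zero prior mean, and expected improvement as a greedy one-step criterion for choosing the next search point (the criterion used in efficient global optimization). All function names and parameter values are hypothetical and chosen for illustration only.

```python
import math

def rbf(a, b, length=0.3):
    """Squared-exponential covariance between two scalar search points."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def posterior(xs, ys, x, noise=1e-4):
    """Mean and standard deviation of the belief about f(x) given observations."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x) for a in xs]
    mean = sum(ki * wi for ki, wi in zip(k, solve(K, ys)))
    var = rbf(x, x) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, math.sqrt(max(var, 0.0))

def expected_improvement(mean, std, best):
    """Expected amount by which f(x) falls below `best` (minimization)."""
    if std == 0.0:
        return max(best - mean, 0.0)
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (best - mean) * cdf + std * pdf

def search(f, candidates, n_steps=8):
    """Greedy search: update the belief, then query the most promising point."""
    xs = [candidates[0], candidates[-1]]  # assumption: start from the two end points
    ys = [f(x) for x in xs]
    for _ in range(n_steps):
        best = min(ys)
        remaining = [x for x in candidates if x not in xs]
        if not remaining:
            break
        x_next = max(
            remaining,
            key=lambda x: expected_improvement(*posterior(xs, ys, x), best),
        )
        xs.append(x_next)
        ys.append(f(x_next))
    i = ys.index(min(ys))
    return xs[i], ys[i]

if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]
    print(search(lambda x: (x - 0.3) ** 2, grid))
```

The criterion here looks only one step ahead. An optimal policy in the POMDP sense would instead plan over the whole remaining sequence of search points, which is generally far more expensive to compute.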
Keywords: Gaussian Process, Optimal Policy, Belief State, Prior Belief, Search Point
This research was supported by the German Research Foundation (DFG), Emmy Noether fellowship TO 409/1-3.