Abstract
In Chap. 4, we consider a global optimization approach, called model reference adaptive search (MRAS), which provides a broad framework for updating a probability distribution over the solution space in a way that ensures convergence to an optimal solution. After introducing the theory and convergence results in a general optimization problem setting, we apply the MRAS approach to various MDP settings. For the finite- and infinite-horizon settings, we show how the approach can be used to perform optimization in policy space. In the setting of Chap. 3, we show how MRAS can be incorporated to further improve the exploration step in the evolutionary algorithms presented there. Moreover, for the finite-horizon setting with both large state and action spaces, we combine the approaches of Chaps. 2 and 4 and propose a method for sampling the state and action spaces. Finally, we present a stochastic approximation framework for studying a class of simulation- and sampling-based optimization algorithms. We illustrate the framework through an algorithm instantiation called model-based annealing random search (MARS) and discuss its application to finite-horizon MDPs.
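The core idea in the abstract, repeatedly updating a probability distribution over the solution space so that it concentrates on good solutions, can be illustrated with a minimal sketch. The code below is not the MRAS algorithm from the chapter: it is a simplified cross-entropy-style loop that assumes a one-dimensional Gaussian sampling model, and the function `model_based_search`, the test function `objective`, and all parameter values are hypothetical choices made only for illustration.

```python
import math
import random
import statistics


def model_based_search(objective, mean, std, n_samples=200, elite_frac=0.1,
                       smoothing=0.7, iterations=50, seed=0):
    """Maximize `objective` over the reals by adapting a Gaussian sampling model.

    Illustrative only: a cross-entropy-style update, not the MRAS update itself.
    """
    rng = random.Random(seed)
    n_elite = max(2, int(n_samples * elite_frac))
    for _ in range(iterations):
        # Sample candidate solutions from the current distribution.
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # Keep the best-performing candidates (the "elite" set).
        elite = sorted(samples, key=objective, reverse=True)[:n_elite]
        # Move the distribution parameters toward the elite statistics.
        mean = smoothing * statistics.fmean(elite) + (1 - smoothing) * mean
        std = smoothing * statistics.pstdev(elite) + (1 - smoothing) * std
    return mean, std


def objective(x):
    # Hypothetical multimodal test function; its global maximum is at x = 2.
    u = x - 2.0
    return -u * u + 3.0 * math.cos(3.0 * u)


best_mean, best_std = model_based_search(objective, mean=0.0, std=5.0)
print(f"estimated maximizer: {best_mean:.4f}, final std: {best_std:.2e}")
```

The smoothing parameter blends the new elite statistics with the previous parameters, which slows the collapse of the distribution and leaves more room for exploration in early iterations. In an MDP setting such as the one the abstract describes, the sampled object would be a policy rather than a real number, and the objective would be an estimate of that policy's value.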
Copyright information
© 2013 Springer-Verlag London
About this chapter
Cite this chapter
Chang, H.S., Hu, J., Fu, M.C., Marcus, S.I. (2013). Model Reference Adaptive Search. In: Simulation-Based Algorithms for Markov Decision Processes. Communications and Control Engineering. Springer, London. https://doi.org/10.1007/978-1-4471-5022-0_4
DOI: https://doi.org/10.1007/978-1-4471-5022-0_4
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5021-3
Online ISBN: 978-1-4471-5022-0
eBook Packages: Engineering, Engineering (R0)