Model Reference Adaptive Search

Chang, Hyeong Soo; Hu, Jiaqiao; Fu, Michael C.; Marcus, Steven I.

doi:10.1007/978-1-4471-5022-0_4

Part of the book series: Communications and Control Engineering ((CCE))

Abstract

In Chap. 4, we consider a global optimization approach, called model reference adaptive search (MRAS), which provides a broad framework for updating a probability distribution over the solution space in a way that ensures convergence to an optimal solution. After introducing the theory and convergence results in a general optimization problem setting, we apply the MRAS approach to various MDP settings. For the finite- and infinite-horizon settings, we show how the approach can be used to perform optimization in policy space. In the setting of Chap. 3, we show how MRAS can be incorporated to further improve the exploration step in the evolutionary algorithms presented there. Moreover, for the finite-horizon setting with both large state and action spaces, we combine the approaches of Chaps. 2 and 4 and propose a method for sampling the state and action spaces. Finally, we present a stochastic approximation framework for studying a class of simulation- and sampling-based optimization algorithms. We illustrate the framework through an algorithm instantiation called model-based annealing random search (MARS) and discuss its application to finite-horizon MDPs.
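The idea of updating a probability distribution over the solution space can be illustrated with a minimal sketch. The code below is not the MRAS algorithm of the chapter; it is a simplified, cross-entropy-style elite-sample update of a one-dimensional Gaussian sampling distribution, written under stated assumptions. The function names, the two-peak test objective, and all parameter values (`n_samples`, `elite_fraction`, `smoothing`, `iterations`) are hypothetical choices made for illustration only.

```python
import math
import random


def two_peak_objective(x):
    """Hypothetical test function: global maximum near x = 2, local maximum near x = -2."""
    return math.exp(-(x - 2.0) ** 2) + 0.8 * math.exp(-(x + 2.0) ** 2)


def model_based_search(objective, mean=0.0, std=5.0, n_samples=200,
                       elite_fraction=0.1, smoothing=0.7, iterations=40, seed=0):
    """Maximize `objective` by repeatedly refitting a Gaussian to the best samples.

    This is an illustrative sketch of a generic model-based search loop,
    not the chapter's MRAS update rule.
    """
    rng = random.Random(seed)
    n_elite = max(2, int(n_samples * elite_fraction))
    for _ in range(iterations):
        # 1. Sample candidate solutions from the current distribution.
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # 2. Keep the best-performing (elite) candidates.
        samples.sort(key=objective, reverse=True)
        elite = samples[:n_elite]
        # 3. Refit the Gaussian parameters to the elite candidates.
        elite_mean = sum(elite) / n_elite
        elite_var = sum((x - elite_mean) ** 2 for x in elite) / n_elite
        elite_std = math.sqrt(elite_var)
        # 4. Smooth the update so the distribution does not collapse too quickly.
        mean = smoothing * elite_mean + (1.0 - smoothing) * mean
        std = max(smoothing * elite_std + (1.0 - smoothing) * std, 1e-12)
    return mean, std


if __name__ == "__main__":
    final_mean, final_std = model_based_search(two_peak_objective)
    print(f"estimated maximizer: {final_mean:.4f} (spread {final_std:.2e})")
```

In this sketch the sampling distribution starts wide enough to cover both peaks and then concentrates around the better one as its parameters are refitted. A full MRAS treatment would replace the elite refit in step 3 with the update derived in the chapter.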

Author information

Authors and Affiliations

Dept. of Computer Science and Engineering, Sogang University, Seoul, South Korea
Hyeong Soo Chang
Dept. Applied Mathematics & Statistics, State University of New York, Stony Brook, NY, USA
Jiaqiao Hu
Smith School of Business, University of Maryland, College Park, MD, USA
Michael C. Fu
Dept. Electrical & Computer Engineering, University of Maryland, College Park, MD, USA
Steven I. Marcus

About this chapter

Cite this chapter

Chang, H.S., Hu, J., Fu, M.C., Marcus, S.I. (2013). Model Reference Adaptive Search. In: Simulation-Based Algorithms for Markov Decision Processes. Communications and Control Engineering. Springer, London. https://doi.org/10.1007/978-1-4471-5022-0_4

DOI: https://doi.org/10.1007/978-1-4471-5022-0_4
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5021-3
Online ISBN: 978-1-4471-5022-0
eBook Packages: Engineering, Engineering (R0)
