
Part of the book series: Communications and Control Engineering ((CCE))

Abstract

In Chap. 4, we consider a global optimization approach, called model reference adaptive search (MRAS), which provides a broad framework for updating a probability distribution over the solution space in a way that ensures convergence to an optimal solution. After introducing the theory and convergence results in a general optimization problem setting, we apply the MRAS approach to various MDP settings. For the finite- and infinite-horizon settings, we show how the approach can be used to perform optimization in policy space. In the setting of Chap. 3, we show how MRAS can be incorporated to further improve the exploration step in the evolutionary algorithms presented there. Moreover, for the finite-horizon setting with both large state and action spaces, we combine the approaches of Chaps. 2 and 4 and propose a method for sampling the state and action spaces. Finally, we present a stochastic approximation framework for studying a class of simulation- and sampling-based optimization algorithms. We illustrate the framework through an algorithm instantiation called model-based annealing random search (MARS) and discuss its application to finite-horizon MDPs.
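The idea of repeatedly updating a probability distribution over the solution space can be illustrated with a small sketch. This is not the MRAS algorithm from the chapter: it is a generic elite-sample update of a one-dimensional Gaussian, closer in spirit to a cross-entropy-style method, and the objective function, parameter names, and default values are all assumptions chosen for illustration.

```python
import math
import random


def objective(x):
    # Hypothetical multimodal test function (an assumption, not from the chapter):
    # a taller peak near x = 2 and a lower, competing peak near x = -2.
    return math.exp(-(x - 2.0) ** 2) + 0.8 * math.exp(-(x + 2.0) ** 2)


def model_based_search(f, mu=0.0, sigma=5.0, n_samples=200,
                       elite_frac=0.1, smoothing=0.7, iterations=50, seed=0):
    """Maximize f by iteratively refitting a Gaussian sampling distribution.

    Each iteration samples candidates from N(mu, sigma^2), keeps the best
    fraction, and moves (mu, sigma) toward the mean and spread of that elite set.
    """
    rng = random.Random(seed)
    n_elite = max(2, int(n_samples * elite_frac))
    for _ in range(iterations):
        # Sample candidate solutions from the current distribution.
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # Rank candidates by objective value, best first, and keep the elite set.
        samples.sort(key=f, reverse=True)
        elite = samples[:n_elite]
        # Estimate the elite set's mean and standard deviation.
        elite_mu = sum(elite) / n_elite
        elite_var = sum((x - elite_mu) ** 2 for x in elite) / n_elite
        # Smoothed parameter update: blend the new estimates with the old values.
        mu = smoothing * elite_mu + (1.0 - smoothing) * mu
        sigma = smoothing * math.sqrt(elite_var) + (1.0 - smoothing) * sigma
    return mu, sigma


if __name__ == "__main__":
    best_mu, best_sigma = model_based_search(objective)
    print(f"distribution mean: {best_mu:.4f}, spread: {best_sigma:.2e}")
```

In this sketch the sampling distribution starts wide enough to cover both peaks and then concentrates as the elite set tightens. The smoothing factor controls how quickly the distribution commits to a region, and this simple heuristic carries no convergence guarantee of the kind the chapter establishes for MRAS.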

Copyright information

© 2013 Springer-Verlag London

About this chapter

Cite this chapter

Chang, H.S., Hu, J., Fu, M.C., Marcus, S.I. (2013). Model Reference Adaptive Search. In: Simulation-Based Algorithms for Markov Decision Processes. Communications and Control Engineering. Springer, London. https://doi.org/10.1007/978-1-4471-5022-0_4

  • DOI: https://doi.org/10.1007/978-1-4471-5022-0_4

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-5021-3

  • Online ISBN: 978-1-4471-5022-0

  • eBook Packages: Engineering, Engineering (R0)
