  • S. Bhatnagar
  • H. Prasad
  • L. Prashanth
Part of the Lecture Notes in Control and Information Sciences book series (LNCIS, volume 434)


Optimization methods play a central role in many engineering disciplines. In this chapter, we give a broad overview of optimization settings and algorithms, including simultaneous perturbation approaches, and briefly summarize the later chapters.


Reinforcement Learning, Queue Length, Markov Decision Process, Stochastic Optimization, Stochastic Approximation
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  1. Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India
