# Applicational aspects of stochastic approximation

• Georg Pflug
Chapter
Part of the DMV Seminar book series (OWS, volume 17)

## Abstract

Let F(x) be a real function defined on $$\mathbb{R}^k$$ or a subset of it. In this part we will consider the optimization problem $$(P)\ \left\|\begin{array}{l} F(x) = \min! \\ x \in S \end{array}\right.$$ where $$S \subseteq \mathbb{R}^k$$ is a set of constraints. Any point $$x^*$$ which solves (P) is called a global minimizer of F on S. If there is an open set U such that a point $$x_0$$ is the solution of $$(P)\ \left\|\begin{array}{l} F(x) = \min! \\ x \in S \cap U \end{array}\right.$$ then $$x_0$$ is called a local minimizer of F on S. In general, for deterministic procedures which use the gradient f(x) of F(x), only convergence to the set of critical points $$\{x : f(x) = 0\}$$ can be proved. There are, however, more elaborate deterministic methods which avoid convergence to non-global minimizers (Dixon and Szegö 1975; Ge 1990).
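The remark about gradient procedures can be illustrated with a minimal sketch. The test function F(x) = x^4 - 3x^2 + x, the step size, and the starting points below are assumptions chosen for illustration and do not come from the chapter. With k = 1 and S = ℝ, plain deterministic gradient descent ends up at whichever critical point lies downhill from the start, which need not be the global minimizer.

```python
# Illustrative only: F, the step size and the starting points are assumptions.

def F(x):
    """A tilted double-well function with two local minimizers."""
    return x**4 - 3.0 * x**2 + x


def f(x):
    """Gradient (here: derivative) of F."""
    return 4.0 * x**3 - 6.0 * x + 1.0


def gradient_descent(x0, step=0.01, n_iter=2000):
    """Deterministic gradient descent x_{n+1} = x_n - step * f(x_n)."""
    x = x0
    for _ in range(n_iter):
        x = x - step * f(x)
    return x


x_right = gradient_descent(2.0)   # started to the right of both wells
x_left = gradient_descent(-2.0)   # started to the left of both wells

for name, x in (("start  2.0", x_right), ("start -2.0", x_left)):
    print(f"{name}: x = {x:.4f}, F(x) = {F(x):.4f}, f(x) = {f(x):.2e}")
```

Both runs stop at a point where f(x) is numerically zero, but only the run started at -2.0 reaches the lower of the two function values. The run started at 2.0 stays at a local, non-global minimizer, which is the situation the global methods cited above are designed to escape.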

## Keywords

Stationary Distribution; Asymptotic Distribution; Design Point; Confidence Region; Stochastic Approximation

## References

1. Chiang T.S., Hwang C.R., Sheu S.J. (1987). Diffusions for global optimization in ℝ^n. SIAM J. Control Optim., Vol. 25, 737–752.
2. Chow Y., Robbins H. (1965). On the asymptotic theory of fixed-width sequential confidence intervals. Ann. Math. Statist., Vol. 36, 457–462.
3. Chung K.L. (1954). On a stochastic approximation method. Ann. Math. Statist., Vol. 25, 463–483.
4. David (1970). Order statistics. J. Wiley and Sons, New York.
5. Dekker A., Aarts E. (1991). Global optimization and simulated annealing. Math. Programming, Vol. 50, 367–393.
6. Dixon L., Szegö G. (1975). Towards Global Optimization. North Holland, Amsterdam.
7. Fabian V. (1968). On asymptotic normality in stochastic approximation. Ann. Math. Statist., Vol. 39, 1327–1332.
8. Fabian V. (1978). On asymptotically efficient recursive estimation. Ann. Statist., Vol. 6, No. 4, 854–856.
9. Farrell R.H. (1962). Bounded length confidence intervals for the zero of a regression function. Ann. Math. Statist., Vol. 33, 237–247.
10. Fedorov V. (1972). Theory of optimal experiment. Academic Press, New York.
11. Föllmer H. (1988). Random Fields and Diffusion Processes. Ecole d'été de probabilité de St.-Flour XV-XVII, Springer Lecture Notes 1362.
12. Gelfand S.B., Mitter S.K. (1991). Recursive stochastic algorithms for global optimization in ℝ^d. SIAM J. Control, Vol. 29, No. 5, 999–1018.
13. Ge Renpu (1990). A filled function method for finding a global minimizer of a function of several variables. Math. Programming, Vol. 46.
14. Geman S., Hwang C.R. (1986). Diffusions for global optimization. SIAM J. Control and Optimization, Vol. 24, No. 5, 1031–1043.
15. Gihman J., Skorohod A. (1968). Stochastic differential equations. Kiev: Nauk. dumka (in Russian).
16. Graham A. (1981). Kronecker products and matrix calculus. Ellis Horwood.
17. Heyde C.C. (1974). On martingale limit theory and strong convergence results for stochastic approximation procedures. Stoch. Proc. and Appl., Vol. 2, 359–370.
18. Högnäs G. (1986). Comparison for some nonlinear autoregressive processes. J. Time Series Analysis, Vol. 7, No. 3, 205–211.
19. Hwang C.R. (1980). Laplace’s method revisited: weak convergence of probability measures. Ann. Probab., Vol. 8, 1177–1182.
20. Karlin S., Taylor H. (1981). A second course in stochastic processes. Academic Press, New York.
21. Kersting G.D. (1977). Some results on the asymptotic behavior of the Robbins-Monro procedure. Bull. Int. Stat. Inst., Vol. 47, 327–335.
22. Kersting G.D. (1978). A weak convergence theorem with application to the Robbins-Monro process. Ann. Probab., Vol. 6, No. 6, 1015–1025.
23. Kushner H., Hai-Huang (1981). Asymptotic properties of stochastic approximation with constant coefficients. SIAM J. Control Vol. 19, 87–105.
24. Kushner H. (1987). Asymptotic global behavior for stochastic approximation and diffusions with slowly decreasing noise effects: global minimization via Monte Carlo. SIAM J. Appl. Math., Vol. 47, No. 1, 169–185.
25. Major P., Revesz P. (1973). A limit theorem for the Robbins-Monro approximation. Z. Wahrscheinlichkeitstheorie verw. Geb., Vol. 27, 79–86.
26. Marti K. (1980). On Accelerations of the Convergence in Random Search Methods. Meth. Oper. Res., Vol. 37, 391–406.
27. Marti K. (1992). Semi-Stochastic Approximation by the Response Surface Methodology. Optimization.
28. Matyas J. (1965). Random Optimization. Automation and Remote Control, Vol. 26, 246–253.
29. von Mises R., Pollaczek-Geiringer H. (1929). Praktische Verfahren der Gleichungsauflösung. Z. angew. Math. Mech., Vol. 9, 58–77.
30. Nevel'son M.B., Hasminskij R.S. (1972). Stochastic approximation and recurrent estimation. Nauka, Moskwa (in Russian). Translated in Amer. Math. Soc. Transl. Monographs, Vol. 24, Providence, R.I.
31. Neveu J. (1974). Discrete Parameter Martingales. North Holland, Amsterdam.
32. Pflug G. (1986). Stochastic optimization with constant step-size. Asymptotic laws. SIAM J. of Control, Vol. 24, No. 4, 655–666.
33. Pflug G. (1988). Stepsize rules, stopping times and their implementation in stochastic quasigradient algorithms. In: Numerical Techniques for Stoch. Optimization (Y. Ermoliev, R. Wets eds.), Springer Series in Computational Mathematics.
34. Pflug G. (1989). Sampling derivatives of probability measures. Computing, Vol. 42, 315–328.
35. Pflug G. (1990). Non-asymptotic Confidence Bounds for Stochastic Approximation Algorithms with Constant Step Size. Monatsh. Math., Vol. 110, 297–314.
36. Pflug G. (1991). A note on the comparison of stationary laws of Markovian processes. Statistics and Probability Letters, Vol. 11, No. 4, 331–334.
37. Pflug G. (1992). Gradient estimates for the performance of Markov Chains and Discrete Event Processes. To appear in: Annals of OR.
38. Pflug G. Ch., Schachermayer W. (1992). Coefficients of ergodicity for stochastically monotone Markov Chains. To appear in: Advances of Applied Probability.
39. Polyak B. (1991). Novi metod tipa stochasticekoi approksmacii. Automatika i Telemechanika No. 7, 98–107 (in Russian).
40. Rachev S.T. (1984). The Monge-Kantorovich mass transformation problem and its stochastic applications. Theory of Probability and its Applications, Vol. 29, No. 4, 647–676.
41. Revuz D. (1975). Markov Chains. North Holland Publ. Comp., Amsterdam.
42. Robbins H., Monro S. (1951). A stochastic approximation method. Ann. Math. Statist., Vol. 22, 400–407.
43. Robbins H., Siegmund D. (1971). A convergence theorem for nonnegative almost supermartingales and some applications. Optimizing methods in Statistics, ed. by J.S. Rustagi. Academic Press, New York, 233–257.
44. Rubinstein R. (1986). The score function approach for sensitivity analysis of computer simulation models. Mathematics and Computers in Simulation, Vol. 28, 351–379.
45. Sielken R.L. (1973). Some stopping rule for stochastic approximation procedures. Z. Wahrscheinlichkeitstheorie verw. Geb., Vol. 27, 79–86.
46. Solis F., Wets R. (1981). Minimization by random search techniques. Mathematics of Operations Research, Vol. 6, No. 1, 19–30.
47. Strassen V. (1965). The existence of probability measures with given marginals. Ann. Math. Statist., Vol. 36, 423–439.
48. Stroup D.F., Braun H.I. (1982). A new stopping rule for stochastic approximation. Z. Wahrscheinlichkeitstheorie verw. Geb., Vol. 60, 535–554.