Abstract
An algorithm for searching for a zero of an unknown function ϕ: ℝ → ℝ is considered: x_t = x_{t−1} − γ_{t−1} y_t, t = 1, 2, ..., where y_t = ϕ(x_{t−1}) + ξ_t is the value of ϕ measured at x_{t−1} and ξ_t is the measurement error. The step sizes γ_t > 0 are modified in the course of the algorithm according to the rule: γ_t = min{u γ_{t−1}, ḡ} if y_{t−1} y_t > 0, and γ_t = d γ_{t−1} otherwise, where 0 < d < 1 < u and ḡ > 0. That is, at each iteration γ_t is multiplied either by u or by d, with the resulting value never exceeding the predetermined level ḡ. The function ϕ may have one or several zeros; the random variables ξ_t are independent and identically distributed, with zero mean and finite variance. Under some additional assumptions on ϕ, ξ_t, and ḡ, conditions on u and d guaranteeing a.s. convergence of the sequence {x_t}, as well as a.s. divergence, are determined. In particular, if P(ξ_1 > 0) = P(ξ_1 < 0) = 1/2 and P(ξ_1 = x) = 0 for any x ∈ ℝ, one has convergence for ud < 1 and divergence for ud > 1. Owing to the multiplicative updating rule for γ_t, the sequence {x_t} converges rapidly, like a geometric progression (when convergence takes place), but the limit value may only approximate one of the zeros of ϕ rather than coincide with it. By adjusting the parameters u and d, one can achieve arbitrarily high precision of the approximation; higher accuracy is obtained at the expense of a lower convergence rate.
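The iteration and the multiplicative step-size rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation; all names (`adaptive_root_search`, `noise`, `n_steps`) and the particular parameter values are assumptions chosen for the example.

```python
import random

def adaptive_root_search(phi, x0, gamma0, u, d, g_bar, noise, n_steps):
    """Sketch of x_t = x_{t-1} - gamma_{t-1} * y_t with the multiplicative
    step-size rule: grow gamma by u (capped at g_bar) when two consecutive
    measurements have the same sign, shrink it by d otherwise."""
    x, gamma = x0, gamma0
    y_prev = None
    for _ in range(n_steps):
        y = phi(x) + noise()          # noisy measurement of phi at the current point
        x = x - gamma * y             # Robbins-Monro-type update
        if y_prev is not None:        # the rule first applies at t = 2
            if y_prev * y > 0:        # same sign: multiply by u, cap at g_bar
                gamma = min(u * gamma, g_bar)
            else:                     # sign change: multiply by d
                gamma = d * gamma
        y_prev = y
    return x

# Example run with ud = 1.05 * 0.9 = 0.945 < 1 (the convergence regime for
# symmetric, atomless noise) and phi(x) = x, whose zero is at the origin.
random.seed(0)
x_star = adaptive_root_search(
    phi=lambda x: x, x0=1.0, gamma0=0.5, u=1.05, d=0.9, g_bar=0.5,
    noise=lambda: random.uniform(-0.01, 0.01), n_steps=2000)
```

With ud < 1 the step sizes shrink geometrically on average, so the iterates settle quickly; as the abstract notes, the limit only approximates the zero, with precision (and speed) controlled by u and d.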
Plakhov, A., Cruz, P. A stochastic approximation algorithm with multiplicative step size modification. Math. Meth. Stat. 18, 185–200 (2009). https://doi.org/10.3103/S1066530709020057