Loss Functions and Stochastic Approximation
Gradient descent as a technique for finding the minimum of a loss function J(v) was introduced in Section 2.10. Recall that the technique consists of finding the gradient ∇ J(v) and then adjusting the parameter vector v so that the change in v is in the direction of the negative of the gradient.
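The update described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the chapter's own procedure: the quadratic loss, the fixed step size `step`, the iteration count `iters`, and the names `gradient_descent` and `grad_J` are all assumptions introduced here.

```python
def gradient_descent(grad, v0, step=0.1, iters=200):
    """Repeatedly move v a small step along the negative gradient.

    grad  : function returning the gradient of J at v (a list of floats)
    v0    : initial parameter vector
    step  : fixed step size (an assumption; the text does not specify one)
    iters : number of updates to perform (also an assumption)
    """
    v = list(v0)
    for _ in range(iters):
        g = grad(v)
        # The change in v is -step * gradient, i.e. opposite to the gradient.
        v = [vi - step * gi for vi, gi in zip(v, g)]
    return v


# Hypothetical loss for illustration: J(v) = (v[0] - 1)^2 + (v[1] + 2)^2,
# whose minimum is at v = (1, -2).
def grad_J(v):
    return [2.0 * (v[0] - 1.0), 2.0 * (v[1] + 2.0)]


v_min = gradient_descent(grad_J, [0.0, 0.0])
print(v_min)
```

For this particular loss the iterates approach (1, -2); for other losses, convergence depends on the choice of step size.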