Abstract
In this work, we present and analyze C-SAGA, a (deterministic) cyclic variant of SAGA. C-SAGA is an incremental gradient method that minimizes a sum of differentiable convex functions by cyclically accessing their gradients. Although the theory of stochastic algorithms is generally more mature than that of their cyclic counterparts, practitioners often prefer cyclic algorithms. We prove that C-SAGA converges linearly under standard assumptions. We then compare its convergence rate with those of the full gradient method, (stochastic) SAGA, and the incremental aggregated gradient (IAG) method, both theoretically and experimentally.
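To make the iteration concrete, the following is a minimal sketch, in Python with NumPy, of a cyclic SAGA-style update: it applies the standard SAGA step (new component gradient, minus the stored one, plus the average of the stored gradients) while sweeping the component index in a fixed cyclic order rather than sampling it at random. The function name c_saga, the step-size choice, and the least-squares usage at the end are illustrative assumptions, not the paper's implementation or experiments.

import numpy as np

def c_saga(grads, x0, step, epochs):
    """Illustrative sketch of a cyclic SAGA-style iteration.

    Minimizes (1/n) * sum_i f_i(x), where grads[i](x) returns grad f_i(x).
    The component index is visited in the fixed cyclic order 0, 1, ..., n-1,
    whereas (stochastic) SAGA draws it uniformly at random.
    """
    n = len(grads)
    x = np.array(x0, dtype=float)
    table = [g(x) for g in grads]          # stored gradient of each component
    avg = sum(table) / n                   # average of the stored gradients
    for _ in range(epochs):
        for i in range(n):                 # deterministic cyclic sweep
            g_new = grads[i](x)
            # variance-reduced step: new gradient minus stale one plus average
            x = x - step * (g_new - table[i] + avg)
            avg = avg + (g_new - table[i]) / n   # keep average consistent with table
            table[i] = g_new
    return x

# Hypothetical usage: least squares with f_i(x) = 0.5 * (a_i @ x - b_i) ** 2
rng = np.random.default_rng(0)
A, b = rng.standard_normal((200, 10)), rng.standard_normal(200)
grads = [lambda x, a=a_i, y=b_i: (a @ x - y) * a for a_i, b_i in zip(A, b)]
x_hat = c_saga(grads, np.zeros(10), step=1e-2, epochs=100)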

Acknowledgements
Ernest Ryu was supported in part by NSF Grant DMS-1720237 and ONR Grant N000141712162.
Cite this article
Park, Y., Ryu, E.K. Linear convergence of cyclic SAGA. Optim Lett 14, 1583–1598 (2020). https://doi.org/10.1007/s11590-019-01520-y