Skip to main content


Log in

Accelerating stochastic sequential quadratic programming for equality constrained optimization using predictive variance reduction

  • Published:
Computational Optimization and Applications Aims and scope Submit manuscript


In this paper, we propose a stochastic method for solving equality constrained optimization problems that utilizes predictive variance reduction. Specifically, we develop a method based on the sequential quadratic programming paradigm that employs variance reduction in the gradient approximations. Under reasonable assumptions, we prove that a measure of first-order stationarity evaluated at the iterates generated by our proposed algorithm converges to zero in expectation from arbitrary starting points, for both constant and adaptive step size strategies. Finally, we demonstrate the practical performance of our proposed algorithm on constrained binary classification problems that arise in machine learning.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5

Similar content being viewed by others

Data Availability Statement

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.


  1. For lemmas with proofs equivalent to those in [4], we refer interested reader to the appropriate sections.


  1. Achiam, J., Held, D., Tamar, A., Abbeel, P.: Constrained policy optimization. In: international conference on machine learning, pp. 22–31 (2017). PMLR

  2. Bai, J., Hager, W.W., Zhang, H.: An inexact accelerated stochastic ADMM for separable convex optimization. Comput. Optim. Appl. 81(2), 479–518 (2022)

  3. Berahas, A.S., Curtis, F.E., O’Neill, M.J., Robinson, D.P.: A stochastic sequential quadratic optimization algorithm for nonlinear equality constrained optimization with rank-deficient Jacobians. arXiv preprint arXiv:2106.13015 (2021a)

  4. Berahas, A.S., Curtis, F.E., Robinson, D., Zhou, B.: Sequential quadratic optimization for nonlinear equality constrained stochastic optimization. SIAM J. Optim. 31(2), 1352–1379 (2021)

    Article  MathSciNet  MATH  Google Scholar 

  5. Bian, F., Liang, J., Zhang, X.: A stochastic alternating direction method of multipliers for non-smooth and non-convex optimization. Inverse Probl. 37(7), 075009 (2021)

    Article  MathSciNet  MATH  Google Scholar 

  6. Bottou, L., Curtis, F.E., Nocedal, J.: Optimization methods for large-scale machine learning. Siam Rev. 60(2), 223–311 (2018)

    Article  MathSciNet  MATH  Google Scholar 

  7. Byrd, R.H., Curtis, F.E., Nocedal, J.: An inexact SQP method for equality constrained optimization. SIAM J. Optim. 19(1), 351–369 (2008)

    Article  MathSciNet  MATH  Google Scholar 

  8. Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2(3), 1–27 (2011)

    Article  Google Scholar 

  9. Chatterjee, N., Chen, Y.-H., Maas, P., Carroll, R.J.: Constrained maximum likelihood estimation for model calibration using summary-level information from external big data sources. J. Am. Statistical Assoc. 111(513), 107–117 (2016)

    Article  MathSciNet  Google Scholar 

  10. Chen, C., Tung, F., Vedula, N., Mori, G.: Constraint-aware deep neural network compression. In: proceedings of the European conference on computer vision (ECCV), pp. 400–415 (2018)

  11. Curtis, F.E., O’Neill, M.J., Robinson, D.P.: Worst-Case Complexity of an SQP method for nonlinear equality constrained stochastic optimization. arXiv preprint arXiv:2112.14799 (2021a)

  12. Curtis, F.E., Robinson, D.P., Zhou, B.: Inexact sequential quadratic optimization for minimizing a stochastic objective function subject to deterministic nonlinear equality constraints. arXiv preprint arXiv:2107.03512 (2021b)

  13. Defazio, A., Bach, F., Lacoste-Julien, S.: SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. Adv. Neural Inf. Process. Syst., pp. 1646–1654 (2014)

  14. Geyer, C.J.: Constrained maximum likelihood exemplified by isotonic convex logistic regression. J. Am. Statistical Assoc. 86(415), 717–724 (1991)

    Article  Google Scholar 

  15. Ghadimi, S., Lan, G., Zhang, H.: Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Math. Program. 155(1–2), 267–305 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  16. Johnson, R., Zhang, T.: Accelerating stochastic gradient descent using predictive variance reduction. Adv. Neural Inf. Process. Syst. 26, 315–323 (2013)

    Google Scholar 

  17. Lan, G.: An optimal method for stochastic composite optimization. Math. Program. 133(1), 365–397 (2012)

    Article  MathSciNet  MATH  Google Scholar 

  18. Lan, G.: First-order and Stochastic Optimization Methods for Machine Learning. Springer, Berlin (2020)

    Book  MATH  Google Scholar 

  19. Lioutikov, R., Paraschos, A., Peters, J., Neumann, G.: Sample-based informationl-theoretic stochastic optimal control. In: 2014 IEEE international conference on robotics and automation (ICRA), pp. 3896–3902 (2014). IEEE

  20. Malikopoulos, A.A.: Stochastic optimal control for series hybrid electric vehicles. In: 2013 American control conference, pp. 1189–1194 (2013). IEEE

  21. Márquez-Neila, P., Salzmann, M., Fua, P.: Imposing hard constraints on deep networks: Promises and limitations. arXiv: 1706.02025 (2017)

  22. Na, S., Anitescu, M., Kolar, M.: An adaptive stochastic sequential quadratic programming with differentiable exact augmented Lagrangians. arXiv preprint arXiv:2102.05320 (2021a)

  23. Na, S., Anitescu, M., Kolar, M.: Inequality constrained stochastic nonlinear optimization via active-set sequential quadratic programming. arXiv preprint arXiv:2109.11502 (2021b)

  24. Nandwani, Y., Pathak, A., Singla, P.: A primal-dual formulation for deep learning with constraints. In: proceedings of neural information processing systems (NeurIPS), pp. 12157–12168 (2019)

  25. Négiar, G., Dresdner, G., Tsai, A., El Ghaoui, L., Locatello, F., Freund, R., Pedregosa, F.: Stochastic frank-wolfe for constrained finite-sum minimization. In: international conference on machine learning, pp. 7253–7262 (2020). PMLR

  26. Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  27. Nguyen, L.M., Liu, J., Scheinberg, K., Takáč, M.: SARAH: A novel method for machine learning problems using stochastic recursive gradient. In: international conference on machine learning, pp. 2613–2621 (2017)

  28. Nocedal, J., Wright, S.: Numerical Optimization. Springer Series in Operations Research and Financial Engineering, Springer, New York (2006)

    MATH  Google Scholar 

  29. Ouyang, H., He, N., Tran, L., Gray, A.: Stochastic alternating direction method of multipliers. In: international conference on machine learning, pp. 80–88 (2013). PMLR

  30. Ravi, S.N., Dinh, T., Lokhande, V.S., Singh, V.: Explicitly imposing constraints in deep networks via conditional gradients gives improved generalization and faster convergence. In: Proceedings of the AAAI conference on artificial intelligence, vol. 33, pp. 4772–4779 (2019)

  31. Reddi, S.J., Sra, S., Póczos, B., Smola, A.: Stochastic Frank-Wolfe methods for nonconvex optimization. In: 2016 54th annual Allerton conference on communication, control, and computing (Allerton), pp. 1244–1251 (2016a). IEEE

  32. Reddi, S.J., Hefny, A., Sra, S., Poczos, B., Smola, A.: Stochastic variance reduction for nonconvex optimization. In: international conference on machine learning, pp. 314–323 (2016b). PMLR

  33. Robbins, H., Monro, S.: A stochastic approximation method. Ann Math Statistics, 400–407 (1951)

  34. Ross, S.M.: Simulation. Academic Press, Amsterdam (2013)

    MATH  Google Scholar 

  35. Roy, S.K., Mhammedi, Z., Harandi, M.: Geometry aware constrained optimization techniques for deep learning. In: proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4460–4469 (2018)

  36. Schmidt, M., Le Roux, N., Bach, F.: Minimizing finite sums with the stochastic average gradient. Math. Program. 162(1–2), 83–112 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  37. Shalev-Shwartz, S., Zhang, T.: Stochastic dual coordinate ascent methods for regularized loss minimization. J. Mach. Learn. Res. 14(2), 567 (2013)

    MathSciNet  MATH  Google Scholar 

  38. Shapiro, A., Dentcheva, D., Ruszczynski, A.: Lectures on stochastic programming: modeling and theory. SIAM, (2021)

  39. Shi, J., Spall, J.C.: SQP-based Projection SPSA algorithm for stochastic optimization with inequality constraints. In: 2021 American control conference (ACC), pp. 1244–1249 (2021). IEEE

  40. Summers, T., Warrington, J., Morari, M., Lygeros, J.: Stochastic optimal power flow based on conditional value at risk and distributional robustness. Int. J. Electric. Power Energy Syst. 72, 116–125 (2015)

    Article  Google Scholar 

  41. Uryasev, S., Pardalos, P.M.: Stochastic Optimization: Algorithms and Applications, vol. 54. Springer (2013)

    Google Scholar 

  42. Vrakopoulou, M., Mathieu, J.L., Andersson, G.: Stochastic optimal power flow with uncertain reserves from demand response. In: 2014 47th Hawaii international conference on system sciences, pp. 2353–2362 (2014). IEEE

  43. Wood, A.J., Wollenberg, B.F., Sheblé, G.B.: Power Generation, Operation, and Control. Wiley, New Jersey, USA (2013)

    Google Scholar 

  44. Zhong, W., Kwok, J.: Fast stochastic alternating direction method of multipliers. In: international conference on machine learning, pp. 46–54 (2014). PMLR

  45. Zhu, Y., Zabaras, N., Koutsourelakis, P.-S., Perdikaris, P.: Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J. Comput. Phys. 394, 56–81 (2019)

    Article  MathSciNet  MATH  Google Scholar 

  46. Ziemba, W.T., Vickson, R.G.: Stochastic Optimization Models in Finance. Academic Press (2014)

    MATH  Google Scholar 

Download references


This material is based upon work supported by the Office of Naval Research under award number N00014-21-1-2532.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Albert S. Berahas.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (zip 45 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Berahas, A.S., Shi, J., Yi, Z. et al. Accelerating stochastic sequential quadratic programming for equality constrained optimization using predictive variance reduction. Comput Optim Appl 86, 79–116 (2023).

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: