Abstract
Many numerical methods for conic problems use the homogeneous primal–dual embedding, which yields a primal–dual solution or a certificate of primal or dual infeasibility. Following Themelis and Patrinos (IEEE Trans Autom Control, 2019), we express the embedding as the problem of finding a zero of a mapping composed of a skew-symmetric linear function and projections onto cones and their duals. We focus on the special case in which this mapping is regular, i.e., differentiable with a nonsingular derivative matrix, at a solution point. While this is not always the case, it occurs very commonly in practice. In this paper we do not aim for new theoretical results. Instead, we propose a simple method that uses LSQR, a variant of conjugate gradients for least squares problems, together with the derivative of the residual mapping, to refine an approximate solution, i.e., to increase its accuracy. LSQR is a matrix-free method: it requires only the evaluation of the derivative mapping and its adjoint, and so avoids forming or storing large matrices. This makes it efficient even for cone problems in which the data matrices are given and dense, and also allows the method to extend to cone programs in which the data are given as abstract linear operators. Numerical examples show that the method improves an approximate solution of a conic program, often dramatically, at a computational cost that is typically small compared to the cost of obtaining the original approximate solution. For completeness, we describe methods for computing the derivative of the projection onto the cones commonly used in practice: nonnegative, second-order, semidefinite, and exponential. The paper is accompanied by an open source implementation.
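A matrix-free refinement step of the kind described above can be sketched with SciPy's LSQR. The residual map F below is a smooth toy stand-in, not the paper's conic residual; the names F and refine, and the tanh nonlinearity, are illustrative assumptions. The point is only the pattern: supply the derivative and its adjoint as a LinearOperator, and take one Gauss–Newton step.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

rng = np.random.default_rng(0)
n = 50
# Well-conditioned toy problem; F is a smooth stand-in for the residual map.
A = np.eye(n) + 0.2 * rng.standard_normal((n, n)) / np.sqrt(n)
z_star = rng.standard_normal(n)
b = A @ z_star + 0.1 * np.tanh(z_star)

def F(z):
    return A @ z + 0.1 * np.tanh(z) - b

def refine(z):
    # Matrix-free Jacobian of F at z: J = A + 0.1 * diag(1 - tanh(z)^2).
    d = 0.1 * (1.0 - np.tanh(z) ** 2)
    J = LinearOperator((n, n),
                       matvec=lambda v: A @ v + d * v,
                       rmatvec=lambda v: A.T @ v + d * v)
    # One Gauss-Newton step: solve min ||J dz + F(z)|| with LSQR.
    dz = lsqr(J, -F(z), atol=1e-14, btol=1e-14)[0]
    return z + dz

z0 = z_star + 1e-3 * rng.standard_normal(n)   # coarse approximate solution
z1 = refine(z0)
```

Because the residual is smooth and the Jacobian nonsingular here, a single step reduces the residual norm by several orders of magnitude, mirroring the "dramatic" improvement reported in the abstract.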
References
Ali, A., Wong, E., Kolter, J.: A semismooth Newton method for fast, generic convex programming. In: Proceedings of the 34th International Conference on Machine Learning, pp. 272–279 (2018)
Boyd, S., Busseti, E., Diamond, S., Kahn, R., Koh, K., Nystrup, P., Speth, J.: Multi-period trading via convex optimization. Found. Trends Optim. 3(1), 1–76 (2017)
Bauschke, H., Combettes, P.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, Berlin (2017)
Busseti, E., Ryu, E., Boyd, S.: Risk-constrained Kelly gambling. J. Invest. 25(3), 118–134 (2016)
Browder, F.: Convergence theorems for sequences of nonlinear operators in Banach spaces. Math. Z. 100(3), 201–225 (1967)
Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization. SIAM, Philadelphia (2001)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Boyd, S., Vandenberghe, L.: Introduction to Applied Linear Algebra - Vectors, Matrices, and Least Squares. Cambridge University Press, Cambridge (2018)
Chen, X., Qi, H.D., Tseng, P.: Analysis of nonsmooth symmetric-matrix-valued functions with applications to semidefinite complementarity problems. SIAM J. Optim. 13(4), 960–985 (2003)
Diamond, S., Boyd, S.: CVXPY: a Python-embedded modeling language for convex optimization. J. Mach. Learn. Res. 16(83), 1–5 (2016)
Domahidi, A., Chu, E., Boyd, S.: ECOS: an SOCP solver for embedded systems. In: 2013 European Control Conference, pp. 3071–3076. IEEE (2013)
Evans, L., Gariepy, R.: Measure Theory and Fine Properties of Functions. CRC Press, Boca Raton (1992)
El Ghaoui, L., Lebret, H.: Robust solutions to least-squares problems with uncertain data. SIAM J. Matrix Anal. Appl. 18(4), 1035–1064 (1997)
Fu, A., Narasimhan, B., Boyd, S.: CVXR: an R package for disciplined convex optimization. J. Stat. Softw. (2019) (to appear)
Grant, M., Boyd, S.: Graph implementations for nonsmooth convex programs. In: Recent Advances in Learning and Control, Lecture Notes in Control and Information Sciences, pp. 95–110. Springer (2008)
Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 2.1. http://cvxr.com/cvx (2014)
Gardiner, J., Laub, A., Amato, J., Moler, C.: Solution of the Sylvester matrix equation \(AXB^{ T}+CXD^{T}=E\). ACM Trans. Math. Softw. 18(2), 223–231 (1992)
Jiang, H.: Global convergence analysis of the generalized Newton and Gauss–Newton methods of the Fischer–Burmeister equation for the complementarity problem. Math. Oper. Res. 24(3), 529–543 (1999)
Jones, E., Oliphant, T., Peterson, P., et al.: SciPy: open source scientific tools for Python. http://www.scipy.org/ (2001). Accessed 4 Mar 2019
Kanzow, C., Ferenczi, I., Fukushima, M.: On the local convergence of semismooth Newton methods for linear and nonlinear second-order cone programs without strict complementarity. SIAM J. Optim. 20(1), 297–320 (2009)
Löfberg, J.: YALMIP: A toolbox for modeling and optimization in MATLAB. In: Proceedings of the IEEE International Symposium on Computer Aided Control Systems Design, pp. 284–289 (2004)
Lasdon, L., Mitter, S., Waren, A.: The conjugate gradient method for optimal control problems. IEEE Trans. Autom. Control 12(2), 132–138 (1967)
Moreau, J.-J.: Décomposition orthogonale d’un espace hilbertien selon deux cônes mutuellement polaires. Bulletin de la Société Mathématique de France 93, 273–299 (1965)
MOSEK ApS: The MOSEK optimization toolbox for MATLAB manual, version 8.0 (revision 57) (2017)
Malick, J., Sendov, H.: Clarke generalized Jacobian of the projection onto the cone of positive semidefinite matrices. Set-Valued Anal. 14(3), 273–293 (2006)
Nash, S.: A survey of truncated-Newton methods. J. Comput. Appl. Math. 124(1–2), 45–59 (2000)
Nocedal, J., Wright, S.: Numerical Optimization. Springer Series in Operations Research and Financial Engineering, 2nd edn. Springer, Berlin (2006)
Numba Development Team: Numba. http://numba.pydata.org (2015). Accessed 4 Mar 2019
O’Donoghue, B., Chu, E., Parikh, N., Boyd, S.: Conic optimization via operator splitting and homogeneous self-dual embedding. J. Optim. Theory Appl. 169(3), 1042–1068 (2016)
Oliphant, T.: A Guide to NumPy, vol. 1. Trelgol Publishing, Spanish Fork (2006)
Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1(3), 123–231 (2014)
Permenter, F., Friberg, H.A., Andersen, E.D.: Solving conic optimization problems via self-dual embedding and facial reduction: a unified approach. SIAM J. Optim, 27(3), 1257–1282 (2017)
Paige, C., Saunders, M.: LSQR: an algorithm for sparse linear equations and sparse least squares. ACM Trans. Math. Softw. 8(1), 43–71 (1982)
Qi, L., Sun, J.: A nonsmooth version of Newton’s method. Math. Program. 58(3, Ser. A), 353–367 (1993)
Qi, L., Sun, D.: A survey of some nonsmooth equations and smoothing Newton methods. In: Progress in Optimization, Applied Optimization, vol. 30, pp. 121–146. Kluwer (1999)
Rockafellar, R.: Convex Analysis. Princeton University Press, Princeton (1970)
Rockafellar, R., Wets, R.: Variational Analysis. Springer, Berlin (1998)
Stellato, B., Banjac, G., Goulart, P., Bemporad, A., Boyd, S.: OSQP: an operator splitting solver for quadratic programs. ArXiv e-prints (2017)
O’Donoghue, B., Chu, E., Parikh, N., Boyd, S.: SCS: splitting conic solver, version 1.1.0. https://github.com/cvxgrp/scs (2015)
Sun, D., Sun, J.: Löwner’s operator and spectral functions in Euclidean Jordan algebras. Math. Oper. Res. 33(2), 421–445 (2008)
Sturm, J.: Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11(1–4), 625–653 (1999)
Sylvester, J.: Sur l’équation linéaire trinôme en matrices d’un ordre quelconque. Comptes Rendus de l’Académie des Sciences 99, 527–529 (1884)
Taylor, J.: Convex Optimization of Power Systems. Cambridge University Press, Cambridge (2015)
Themelis, A., Patrinos, P.: SuperMann: a superlinearly convergent algorithm for finding fixed points of nonexpansive operators. IEEE Trans. Autom. Control (2019). https://doi.org/10.1109/TAC.2019.2906393
Udell, M., Mohan, K., Zeng, D., Hong, J., Diamond, S., Boyd, S.: Convex optimization in Julia. In: SC14 Workshop on High Performance Technical Computing in Dynamic Languages (2014)
Wright, S., Holt, J.: An inexact Levenberg–Marquardt method for large sparse nonlinear least squares. ANZIAM J. 26(4), 387–403 (1985)
Ye, Y., Todd, M., Mizuno, S.: An \({O}(\sqrt{n}{L})\)-iteration homogeneous and self-dual linear programming algorithm. Math. Oper. Res. 19(1), 53–67 (1994)
Acknowledgements
The authors thank Yinyu Ye, Michael Saunders, Nicholas Moehle, and Steven Diamond for useful discussions.
Appendices
Appendix A
Differentiability properties of the residual map
Let C be a nonempty closed convex subset of \({\mathbf{R}}^n\). It is well known that the projection \(\Pi _C\) onto C is (firmly) nonexpansive (see, e.g., [5, Proposition 2]), hence Lipschitz continuous with Lipschitz constant at most 1. Consequently, if \(A: {\mathbf{R}}^n\rightarrow {\mathbf{R}}^m\) is linear, then the composition \(A\circ \Pi _C\) is also Lipschitz continuous. By the Rademacher theorem (see, e.g., [37, Theorem 9.60] or [12, Theorem 3.2]), both \(\Pi _C\) and \(A\circ \Pi _C\) are therefore differentiable almost everywhere, and we conclude that the residual map (7) is differentiable almost everywhere. Moreover, for \(z\in {\mathbf{R}}^{m+n+1}\), \({\mathcal {R}}\) is differentiable at z whenever \(\Pi \) is differentiable at z.
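As a quick numerical illustration of the (firm) nonexpansiveness invoked above, the following sketch checks the inequalities \(\Vert \Pi (x)-\Pi (y)\Vert ^2 \le \langle \Pi (x)-\Pi (y),\, x-y\rangle \le \Vert x-y\Vert ^2\) for the projection onto the nonnegative orthant on random pairs; the choice of cone and all names are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
proj = lambda x: np.maximum(x, 0.0)  # projection onto the nonnegative orthant

# Firm nonexpansiveness: ||Pi(x)-Pi(y)||^2 <= <Pi(x)-Pi(y), x-y> <= ||x-y||^2.
for _ in range(1000):
    x, y = rng.standard_normal(8), rng.standard_normal(8)
    d = proj(x) - proj(y)
    assert d @ d <= d @ (x - y) + 1e-12
    assert d @ (x - y) <= (x - y) @ (x - y) + 1e-12
```

The same inequalities hold for the projection onto any nonempty closed convex set; the orthant is used only because its projection has a one-line closed form.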
Appendix B
Semidefinite cone projection derivative
Let \(X\in {\mathbf{S}}^n\), let \(X=U\mathbf {diag} (\lambda ) U^T\) be an eigendecomposition of X, and suppose that \(\det (X)\ne 0\). Without loss of generality, we can and do assume that the entries of \(\lambda \) are in increasing order. That is, there exists \(k\in \{1,\ldots , n\}\) such that
\(\lambda _1\le \cdots \le \lambda _k< 0< \lambda _{k+1}\le \cdots \le \lambda _n.\)
We also note that
where \(\lambda _- = -\min (\lambda , 0)\). It follows from (11), (18), and the orthogonality of U that
Note that
Let \({\mathsf {D}} {\Pi }(X): {\mathbf{S}}^n \rightarrow {\mathbf{S}}^n\) be the derivative of \(\Pi \) at X, and let \({\widetilde{X}}\in {\mathbf{S}}^n\). We now show that (12) holds.
Indeed, using the first order Taylor approximation of \(\Pi \) around X, for \(\Delta X\in {\mathbf{S}}^n\) such that \(||\Delta X||_F\) is sufficiently small (here \(||\cdot ||_F\) denotes the Frobenius norm) we have
To simplify the notation, we set \(\Delta Y={\mathsf {D}} \Pi (X)(\Delta X)\). Now
Here, (22a) follows from applying (20) with X replaced by \(X + \Delta X\), (22b) follows from combining (22a) and (21), (22c) follows from (20) by neglecting second order terms, (22d) follows from multiplying (22c) from the left by \(U^T\) and from the right by U, (22e) follows from the fact that \(UU^T=I\) and finally (22f) follows from (19). We rewrite the Sylvester [17, 42] Eq. (22f) as
Using (23), we learn that for any \(i \in \{1, \ldots , n\}\) and \(j \in \{1, \ldots , n\}\), we have
Recalling (17), if \(i \le k, \, j > k\) we have \((\lambda _-)_j = (\lambda _+)_i=0\). Otherwise, \((\lambda _-)_j +(\lambda _+)_i\ne 0 \) and
Proceeding by cases in view of (17), and using that \(\Delta Y \) is symmetric (so is \(U^T\Delta Y U\)), we conclude that
Therefore, combining with (24) we obtain
where “\(\circ \)” denotes the Hadamard (i.e., entrywise) product. Recalling the definition of \(\Delta Y\) and using that \(UU^T=I\) we conclude that
Letting \(||\Delta X||_F\rightarrow 0\) and applying the implicit function theorem, we conclude that (12) holds.
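The conclusion of this appendix can be checked numerically. The sketch below uses the closed-form Hadamard factor for nonsingular X that is standard in the literature on the semidefinite-cone projection (cf. Malick and Sendov), \(B_{ij} = ((\lambda _i)_+ + (\lambda _j)_+)/(|\lambda _i| + |\lambda _j|)\), which equals 1 when both eigenvalues are positive, 0 when both are negative, and interpolates in the mixed case, consistent with the case analysis above. This is a hedged sketch with illustrative function names, not the paper's implementation.

```python
import numpy as np

def proj_psd(X):
    # Projection onto the PSD cone: clip negative eigenvalues to zero.
    lam, U = np.linalg.eigh(X)
    return (U * np.maximum(lam, 0.0)) @ U.T

def dproj_psd(X, dX):
    # Derivative of the projection at a nonsingular X, applied to dX:
    #   DPi(X)(dX) = U (B o (U^T dX U)) U^T,  "o" = Hadamard product.
    lam, U = np.linalg.eigh(X)
    lp = np.maximum(lam, 0.0)
    B = (lp[:, None] + lp[None, :]) / (np.abs(lam)[:, None] + np.abs(lam)[None, :])
    return U @ (B * (U.T @ dX @ U)) @ U.T
```

A central finite-difference check, \((\Pi (X+t\,\Delta X)-\Pi (X-t\,\Delta X))/(2t)\) for small t, agrees with dproj_psd to first order on random symmetric inputs.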
Appendix C
Exponential cone projection derivative
The Lagrangian of the constrained optimization problem (13) is
where \(\mu \in {\mathbf{R}}\) is the dual variable. The KKT conditions at a solution \((x^*,y^*,z^*,\mu ^*)\) are
Considering the differentials dx, dy, dz and \(dx^*,dy^*,dz^*, d\mu ^*\) of the KKT conditions in (25), the authors of [1, Lemma 3.6] obtain the system of equations
Note that, since (13) is feasible, D is invertible. Therefore, \(du^*=D^{-1}(du)\). Consequently, the upper left \(3\times 3 \) block matrix of \(D^{-1}\) is the Jacobian of the projection at (x, y, z) in Case 4.
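As a hedged numerical sketch (not the paper's code), the implicit-differentiation pattern above can be reproduced for the boundary constraint \(g(x,y,z)=y e^{x/y}-z=0\) with \(y>0\): solve the KKT system with a generic root finder, build the Jacobian D of the KKT residual by finite differences, and recover the differential of the solution map as \(du^*=D^{-1}(du)\), keeping the first three components. The helpers project and dproj are hypothetical names, and the finite-difference D is a stand-in for the analytic Jacobian.

```python
import numpy as np
from scipy.optimize import fsolve

def g(u):                         # exponential-cone boundary constraint
    x, y, z = u
    return y * np.exp(x / y) - z

def grad_g(u):
    x, y, z = u
    e = np.exp(x / y)
    return np.array([e, e * (1.0 - x / y), -1.0])

def kkt(w, v):
    # Stationarity u - v + mu * grad g(u) = 0 and feasibility g(u) = 0.
    u, mu = w[:3], w[3]
    return np.append(u - v + mu * grad_g(u), g(u))

def project(v):                   # solve the KKT system for w = (u*, mu*)
    return fsolve(lambda w: kkt(w, v), np.append(v, 0.0), xtol=1e-13)

def dproj(v, dv, h=1e-6):
    w = project(v)
    # D = Jacobian of the KKT residual in w (finite differences); the
    # implicit function theorem then gives dw = D^{-1} [dv; 0].
    D = np.zeros((4, 4))
    for i in range(4):
        e = np.zeros(4); e[i] = h
        D[:, i] = (kkt(w + e, v) - kkt(w - e, v)) / (2.0 * h)
    dw = np.linalg.solve(D, np.append(dv, 0.0))
    return dw[:3]                 # differential of u*; drop the mu* part
```

For a point such as v = (0, 1, 0.5), which violates \(y e^{x/y}\le z\), the difference project(v + dv)[:3] - project(v)[:3] agrees with dproj(v, dv) to first order, which is the content of the implicit-differentiation argument.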
Cite this article
Busseti, E., Moursi, W.M. & Boyd, S. Solution refinement at regular points of conic problems. Comput Optim Appl 74, 627–643 (2019). https://doi.org/10.1007/s10589-019-00122-9