
Solution refinement at regular points of conic problems


Abstract

Many numerical methods for conic problems use the homogeneous primal–dual embedding, which yields a primal–dual solution or a certificate establishing primal or dual infeasibility. Following Themelis and Patrinos (IEEE Trans Autom Control, 2019), we express the embedding as the problem of finding a zero of a mapping containing a skew-symmetric linear function and projections onto cones and their duals. We focus on the special case when this mapping is regular, i.e., differentiable with nonsingular derivative matrix, at a solution point. While this is not always the case, it is a very common occurrence in practice. In this paper we do not aim for new theoretical results; rather, we propose a simple method that uses LSQR, a variant of conjugate gradients for least squares problems, and the derivative of the residual mapping to refine an approximate solution, i.e., to increase its accuracy. LSQR is a matrix-free method, i.e., it requires only the evaluation of the derivative mapping and its adjoint, and so avoids forming or storing large matrices. This makes it efficient even for cone problems in which the data matrices are given and dense, and also allows the method to extend to cone programs in which the data are given as abstract linear operators. Numerical examples show that the method improves an approximate solution of a conic program, often dramatically, at a computational cost that is typically small compared to the cost of obtaining the original approximate solution. For completeness we describe methods for computing the derivative of the projection onto the cones commonly used in practice: nonnegative, second-order, semidefinite, and exponential cones. The paper is accompanied by an open source implementation.
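As a rough illustration of the refinement step summarized above, the following Python sketch applies one matrix-free LSQR correction to an approximate solution z of the embedding. The callables residual, deriv_matvec, and deriv_rmatvec are hypothetical placeholders for the residual mapping, its derivative, and the derivative's adjoint; this is our sketch of the idea, not the paper's accompanying implementation.

```python
from scipy.sparse.linalg import LinearOperator, lsqr

def refine(z, residual, deriv_matvec, deriv_rmatvec, lsqr_iters=30):
    """One refinement step for an approximate solution z of the embedding.

    residual(z)          -> R(z), the residual of the embedding
    deriv_matvec(z, w)   -> DR(z) @ w    (derivative applied to w)
    deriv_rmatvec(z, w)  -> DR(z).T @ w  (adjoint applied to w)

    These callables are placeholders the user must supply; only
    matrix-vector products are needed, so DR(z) is never formed or stored.
    """
    r = residual(z)
    D = LinearOperator(
        (r.size, z.size),
        matvec=lambda w: deriv_matvec(z, w),
        rmatvec=lambda w: deriv_rmatvec(z, w),
    )
    # Approximately minimize ||DR(z) dz + R(z)|| with LSQR, then correct z.
    dz = lsqr(D, -r, iter_lim=lsqr_iters)[0]
    return z + dz
```

Because LSQR needs only these two matrix–vector products, the derivative is never formed or stored, which is what lets the approach handle dense data or data given as abstract linear operators.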


References

  1. Ali, A., Wong, E., Kolter, J.: A semismooth Newton method for fast, generic convex programming. In: Proceedings of the 34th International Conference on Machine Learning, pp. 272–279 (2018)

  2. Boyd, S., Busseti, E., Diamond, S., Kahn, R., Koh, K., Nystrup, P., Speth, J.: Multi-period trading via convex optimization. Found. Trends Optim. 3(1), 1–76 (2017)


  3. Bauschke, H., Combettes, P.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, Berlin (2017)


  4. Busseti, E., Ryu, E., Boyd, S.: Risk-constrained Kelly gambling. J. Invest. 25(3), 118–134 (2016)


  5. Browder, F.: Convergence theorems for sequences of nonlinear operators in Banach spaces. Math. Z. 100(3), 201–225 (1967)


  6. Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization. SIAM, Philadelphia (2001)


  7. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)


  8. Boyd, S., Vandenberghe, L.: Introduction to Applied Linear Algebra - Vectors, Matrices, and Least Squares. Cambridge University Press, Cambridge (2018)


  9. Chen, X., Qi, H.D., Tseng, P.: Analysis of nonsmooth symmetric-matrix-valued functions with applications to semidefinite complementarity problems. SIAM J. Optim. 13(4), 960–985 (2003)


  10. Diamond, S., Boyd, S.: CVXPY: a Python-embedded modeling language for convex optimization. J. Mach. Learn. Res. 16(83), 1–5 (2016)


  11. Domahidi, A., Chu, E., Boyd, S.: ECOS: an SOCP solver for embedded systems. In: 2013 European Control Conference, pp. 3071–3076. IEEE (2013)

  12. Evans, L., Gariepy, R.: Measure Theory and Fine Properties of Functions. CRC Press, Boca Raton (1992)


  13. El Ghaoui, L., Lebret, H.: Robust solutions to least-squares problems with uncertain data. SIAM J. Matrix Anal. Appl. 18(4), 1035–1064 (1997)


  14. Fu, A., Narasimhan, B., Boyd, S.: CVXR: an R package for disciplined convex optimization. J. Stat. Softw. (2019) (to appear)

  15. Grant, M., Boyd, S.: Graph implementations for nonsmooth convex programs. In: Recent Advances in Learning and Control, Lecture Notes in Control and Information Sciences, pp. 95–110. Springer (2008)

  16. Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 2.1. http://cvxr.com/cvx (2014)

  17. Gardiner, J., Laub, A., Amato, J., Moler, C.: Solution of the Sylvester matrix equation \(AXB^{ T}+CXD^{T}=E\). ACM Trans. Math. Softw. 18(2), 223–231 (1992)


  18. Jiang, H.: Global convergence analysis of the generalized Newton and Gauss–Newton methods of the Fischer–Burmeister equation for the complementarity problem. Math. Oper. Res. 24(3), 529–543 (1999)


  19. Jones, E., Oliphant, T., Peterson, P., Others: SciPy: Open source scientific tools for Python. http://www.scipy.org/ (2001). Accessed 4 Mar 2019

  20. Kanzow, C., Ferenczi, I., Fukushima, M.: On the local convergence of semismooth Newton methods for linear and nonlinear second-order cone programs without strict complementarity. SIAM J. Optim. 20(1), 297–320 (2009)


  21. Löfberg, J.: YALMIP: A toolbox for modeling and optimization in MATLAB. In: Proceedings of the IEEE International Symposium on Computer Aided Control Systems Design, pp. 284–289 (2004)

  22. Lasdon, L., Mitter, S., Waren, A.: The conjugate gradient method for optimal control problems. IEEE Trans. Autom. Control 12(2), 132–138 (1967)


  23. Moreau, J.-J.: Décomposition orthogonale d’un espace hilbertien selon deux cônes mutuellement polaires. Bulletin de la Société Mathématique de France 93, 273–299 (1965)


  24. MOSEK ApS. The MOSEK optimization toolbox for MATLAB manual, version 8.0 (revision 57) (2017)

  25. Malick, J., Sendov, H.: Clarke generalized Jacobian of the projection onto the cone of positive semidefinite matrices. Set-Valued Anal. 14(3), 273–293 (2006)


  26. Nash, S.: A survey of truncated-Newton methods. J. Comput. Appl. Math. 124(1–2), 45–59 (2000)


  27. Nocedal, J., Wright, S.: Numerical Optimization. Springer Series in Operations Research and Financial Engineering, 2nd edn. Springer, Berlin (2006)


  28. Numba Development Team. Numba. http://numba.pydata.org (2015). Accessed 4 Mar 2019

  29. O’Donoghue, B., Chu, E., Parikh, N., Boyd, S.: Conic optimization via operator splitting and homogeneous self-dual embedding. J. Optim. Theory Appl. 169(3), 1042–1068 (2016)


  30. Oliphant, T.: A Guide to NumPy, vol. 1. Trelgol Publishing, Spanish Fork (2006)


  31. Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1(3), 123–231 (2014)


  32. Permenter, F., Friberg, H.A., Andersen, E.D.: Solving conic optimization problems via self-dual embedding and facial reduction: a unified approach. SIAM J. Optim. 27(3), 1257–1282 (2017)


  33. Paige, C., Saunders, M.: LSQR: an algorithm for sparse linear equations and sparse least squares. ACM Trans. Math. Softw. 8(1), 43–71 (1982)


  34. Qi, L., Sun, J.: A nonsmooth version of Newton’s method. Math. Program. 58(3, Ser. A), 353–367 (1993)


  35. Qi, L., Sun, D.: A survey of some nonsmooth equations and smoothing Newton methods. In: Progress in Optimization, volume 30 of Applied Optimization, pp. 121–146. Kluwer (1999)

  36. Rockafellar, R.: Convex Analysis. Princeton University Press, Princeton (1970)


  37. Rockafellar, R., Wets, R.: Variational Analysis. Springer, Berlin (1998)


  38. Stellato, B., Banjac, G., Goulart, P., Bemporad, A., Boyd, S.: OSQP: an operator splitting solver for quadratic programs. ArXiv e-prints (2017)

  39. SCS: Splitting conic solver, version 1.1.0. https://github.com/cvxgrp/scs (2015)

  40. Sun, D., Sun, J.: Löwner’s operator and spectral functions in Euclidean Jordan algebras. Math. Oper. Res. 33(2), 421–445 (2008)


  41. Sturm, J.: Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11(1–4), 625–653 (1999)


  42. Sylvester, J.: Sur l’équation linéaire trinôme en matrices d’un ordre quelconque. Comptes Rendus de l’Académie des Sciences 99, 527–529 (1884)


  43. Taylor, J.: Convex Optimization of Power Systems. Cambridge University Press, Cambridge (2015)


  44. Themelis, A., Patrinos, P.: SuperMann: a superlinearly convergent algorithm for finding fixed points of nonexpansive operators. IEEE Trans. Autom. Control (2019). https://doi.org/10.1109/TAC.2019.2906393


  45. Udell, M., Mohan, K., Zeng, D., Hong, J., Diamond, S., Boyd, S.: Convex optimization in Julia. In: SC14 Workshop on High Performance Technical Computing in Dynamic Languages (2014)

  46. Wright, S., Holt, J.: An inexact Levenberg–Marquardt method for large sparse nonlinear least squares. ANZIAM J. 26(4), 387–403 (1985)


  47. Ye, Y., Todd, M., Mizuno, S.: An \({O}(\sqrt{n}{L})\)-iteration homogeneous and self-dual linear programming algorithm. Math. Oper. Res. 19(1), 53–67 (1994)



Acknowledgements

The authors thank Yinyu Ye, Michael Saunders, Nicholas Moehle, and Steven Diamond for useful discussions.

Author information

Correspondence to Walaa M. Moursi.

Appendices

Appendix A

Differentiability properties of the residual map

Let C be a nonempty closed convex subset of \({\mathbf{R}}^n\). It is well known that the projection \(\Pi _C\) onto C is (firmly) nonexpansive (see, e.g., [5, Proposition 2]); hence it is Lipschitz continuous with Lipschitz constant at most 1. Consequently, if \(A: {\mathbf{R}}^n\rightarrow {\mathbf{R}}^m\) is linear, then the composition \(A\circ \Pi _C\) is also Lipschitz continuous. Therefore, by the Rademacher theorem (see, e.g., [37, Theorem 9.60] or [12, Theorem 3.2]), both \(\Pi _C\) and \(A\circ \Pi _C\) are differentiable almost everywhere. This allows us to conclude that the residual map (7) is differentiable almost everywhere. Moreover, let \(z\in {\mathbf{R}}^{m+n+1}\). Clearly \({\mathcal {R}}\) is differentiable at z if \(\Pi \) is differentiable at z.
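As an elementary illustration (ours, not taken from the paper), consider the nonnegative orthant, the simplest of the cones mentioned in the abstract: its projection \(z \mapsto \max (z, 0)\) is differentiable at every point with no zero coordinate, i.e., everywhere except a set of measure zero, and its derivative there is the 0/1 diagonal matrix selecting the positive coordinates.

```python
import numpy as np

def proj_nonneg(z):
    """Projection onto the nonnegative orthant."""
    return np.maximum(z, 0.0)

def dproj_nonneg(z):
    """Derivative of the projection onto the nonnegative orthant; it exists
    whenever no coordinate of z is exactly zero (all of R^n except a set of
    measure zero) and equals the 0/1 diagonal matrix selecting z_i > 0."""
    if np.any(z == 0.0):
        raise ValueError("projection not differentiable at this point")
    return np.diag((z > 0.0).astype(float))
```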

Appendix B

Semi-definite cone projection derivative

Let \(X\in {\mathbf{S}}^n\), let \(X=U\mathbf {diag} (\lambda ) U^T\) be an eigendecomposition of X, and suppose that \(\det (X)\ne 0\). Without loss of generality, we can and do assume that the entries of \(\lambda \) are in increasing order. That is, there exists \(k\in \{1,\ldots , n\}\) such that

$$\begin{aligned} \lambda _1\le \cdots \le \lambda _k<0<\lambda _{k+1} \le \cdots \le \lambda _n. \end{aligned}$$
(17)

We also note that

$$\begin{aligned} \Pi X - X = U\mathbf {diag} (\lambda _- ) U^T, \end{aligned}$$
(18)

where \(\lambda _- = -\min (\lambda , 0)\). It follows from (11), (18), and the orthogonality of U that

$$\begin{aligned} U^T\Pi X U = \mathbf {diag} (\lambda _+ ), \quad U^T(\Pi X -X)U= \mathbf {diag} (\lambda _-). \end{aligned}$$
(19)

Note that

$$\begin{aligned} \Pi X (\Pi X - X) =U \mathbf {diag}(\lambda _+ ) \mathbf {diag} (\lambda _- ) U^T= 0. \end{aligned}$$
(20)

Let \({\mathsf {D}} {\Pi }(X): {\mathbf{S}}^n \rightarrow {\mathbf{S}}^n\) be the derivative of \(\Pi \) at X, and let \({\widetilde{X}}\in {\mathbf{S}}^n\). We now show that (12) holds.

Indeed, using the first order Taylor approximation of \(\Pi \) around X, for \(\Delta X\in {\mathbf{S}}^n\) such that \(||\Delta X||_F\) is sufficiently small (here \(||\cdot ||_F\) denotes the Frobenius norm) we have

$$\begin{aligned} \Pi (X + \Delta X) \approx \Pi X + {\mathsf {D}} \Pi (X)(\Delta X). \end{aligned}$$
(21)

To simplify the notation, we set \(\Delta Y={\mathsf {D}} \Pi (X)(\Delta X)\). Now

$$\begin{aligned} 0&=\Pi (X + \Delta X)\, \big (\Pi (X + \Delta X) - X - \Delta X\big ) \qquad \text {(22a)} \\
&\approx (\Pi X + \Delta Y)\, (\Pi X + \Delta Y - X - \Delta X) \qquad \text {(22b)} \\
&=\Pi X(\Pi X-X) +\Delta Y(\Pi X-X) +\Pi X(\Delta Y-\Delta X) +\Delta Y (\Delta Y-\Delta X) \\
&\approx \Pi X(\Delta Y-\Delta X) + \Delta Y(\Pi X-X) \qquad \text {(22c)} \\
&\approx U^T\Pi X(\Delta Y-\Delta X)U + U^T\Delta Y(\Pi X-X)U \qquad \text {(22d)} \\
&=(U^T\Pi XU)\, U^T(\Delta Y-\Delta X)U + U^T\Delta YU\, \big (U^T(\Pi X-X)U\big ) \qquad \text {(22e)} \\
&=\mathbf {diag} (\lambda _+ )\, U^T(\Delta Y-\Delta X)U + U^T\Delta YU\, \mathbf {diag} (\lambda _- ). \qquad \text {(22f)} \end{aligned}$$

Here, (22a) follows from applying (20) with X replaced by \(X + \Delta X\); (22b) follows from combining (22a) and (21); (22c) follows from (20) after neglecting second-order terms; (22d) follows from multiplying (22c) from the left by \(U^T\) and from the right by U; (22e) follows from the fact that \(UU^T=I\); and, finally, (22f) follows from (19). We rewrite (22f), which is a Sylvester equation [17, 42], as

$$\begin{aligned} \mathbf {diag} (\lambda _+ ) U^T\Delta YU + U^T\Delta YU \mathbf {diag} (\lambda _-) \approx \mathbf {diag} (\lambda _+ ) U^T\Delta X U. \end{aligned}$$
(23)

Using (23), we learn that for any \(i \in \{1, \ldots , n\}\) and \(j \in \{1, \ldots , n\}\), we have

$$\begin{aligned} ((\lambda _-)_j +(\lambda _+)_i)(U^T\Delta Y U)_{ij} \approx (\lambda _+)_i(U^T\Delta X U)_{ij} . \end{aligned}$$

Recalling (17), if \(i \le k, \, j > k\) we have \((\lambda _-)_j = (\lambda _+)_i=0\). Otherwise, \((\lambda _-)_j +(\lambda _+)_i\ne 0 \) and

$$\begin{aligned} (U^T\Delta Y U)_{ij} \approx \underbrace{\frac{(\lambda _+)_i}{(\lambda _-)_j +(\lambda _+)_i}}_{=B_{ij}} (U^T\Delta X U)_{ij} . \end{aligned}$$
(24)

Proceeding by cases in view of (17), and using that \(\Delta Y \) is symmetric (so is \(U^T\Delta Y U\)), we conclude that

$$\begin{aligned} B_{ij} = {\left\{ \begin{array}{ll} 0, &{}~~\text {if}~~i \le k, \, j \le k; \\ \frac{(\lambda _+)_i}{(\lambda _-)_j+(\lambda _+)_i}, &{} ~~\text {if}~~i> k, \, j \le k; \\ \frac{(\lambda _+)_j}{(\lambda _-)_i+(\lambda _+)_j}, &{}~~\text {if}~~i \le k, \, j> k; \\ 1,&{}~~\text {if}~~i> k, \, j > k.\\ \end{array}\right. } \end{aligned}$$

Therefore, combining with (24) we obtain

$$\begin{aligned} U^T\Delta Y U \approx B \circ (U^T\Delta X U ), \end{aligned}$$

where “\(\circ \)” denotes the Hadamard (i.e., entrywise) product. Recalling the definition of \(\Delta Y\) and using that \(UU^T=I\) we conclude that

$$\begin{aligned} {\mathsf {D}} \Pi (X)(\Delta X) \approx U (B \circ (U^T\Delta X U )) U^T. \end{aligned}$$

Letting \(||\Delta X||_F\rightarrow 0\) and applying the implicit function theorem, we conclude that (12) holds.
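For concreteness, the following sketch (our illustration, not the paper's accompanying code; the function name dproj_psd is ours) evaluates \({\mathsf {D}} \Pi (X)(\Delta X) \approx U (B \circ (U^T\Delta X U )) U^T\) with NumPy, assuming X is symmetric with no zero eigenvalues.

```python
import numpy as np

def dproj_psd(X, dX):
    """Apply the derivative of the projection onto the PSD cone at X
    (symmetric, no zero eigenvalues) to a symmetric direction dX, using
    D Pi(X)(dX) = U (B o (U^T dX U)) U^T with B as defined above."""
    lam, U = np.linalg.eigh(X)           # eigenvalues in increasing order
    lam_pos = np.maximum(lam, 0.0)       # lambda_+
    lam_neg = np.maximum(-lam, 0.0)      # lambda_-
    # B_ij = (lambda_+)_i / ((lambda_-)_j + (lambda_+)_i) where the
    # denominator is nonzero; the remaining block is filled by symmetry.
    num = np.tile(lam_pos[:, None], (1, lam.size))
    den = lam_neg[None, :] + lam_pos[:, None]
    B = np.divide(num, den, out=np.zeros_like(den), where=den > 0)
    B = np.maximum(B, B.T)
    return U @ (B * (U.T @ dX @ U)) @ U.T
```

A quick sanity check is to compare the output with the finite difference \((\Pi (X+t\Delta X)-\Pi (X))/t\) for a small \(t>0\).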

Appendix C

Exponential cone projection derivative

The Lagrangian of the constrained optimization problem (13) is

$$\begin{aligned} \tfrac{1}{2}||(x,y,z) - ({\overline{x}},{\overline{y}},{\overline{z}})||^2 +\mu ({\overline{y}} e^{{\overline{x}}/{\overline{y}}}-{\overline{z}}), \end{aligned}$$

where \(\mu \in {\mathbf{R}}\) is the dual variable. The KKT conditions at a solution \((x^*,y^*,z^*,\mu ^*)\) are

$$\begin{aligned} x^*-x+\mu ^*e^{x^*/y^*}&=0\nonumber \\ y^*-y+\mu ^*e^{x^*/y^*}\big (1-\tfrac{x^*}{y^*}\big )&=0\nonumber \\ z^*-z-\mu ^*&=0\nonumber \\ y^*e^{x^*/y^*}-z^*&=0. \end{aligned}$$
(25)

Considering the differentials \(dx, dy, dz\) and \(dx^*,dy^*,dz^*, d\mu ^*\) of the KKT conditions in (25), the authors of [1, Lemma 3.6] obtain the system of equations

$$\begin{aligned} \underbrace{ \begin{bmatrix} 1+\tfrac{\mu ^*e^{x^*/y^*}}{y^*} & -\tfrac{\mu ^*x^*e^{x^*/y^*}}{{y^*}^2} & 0 & e^{x^*/y^*} \\ -\tfrac{\mu ^*x^*e^{x^*/y^*}}{{y^*}^2} & 1+\tfrac{\mu ^*{x^*}^2e^{x^*/y^*}}{{y^*}^3} & 0 & (1-x^*/y^*)e^{x^*/y^*} \\ 0 & 0 & 1 & -1 \\ e^{x^*/y^*} & (1-x^*/y^*)e^{x^*/y^*} & -1 & 0 \end{bmatrix} }_{D} \underbrace{ \begin{bmatrix} dx^* \\ dy^* \\ dz^* \\ d\mu ^* \end{bmatrix} }_{du^*} = \underbrace{ \begin{bmatrix} dx \\ dy \\ dz \\ 0 \end{bmatrix} }_{du}. \end{aligned}$$
(26)

Note that, since (13) is feasible, D is invertible. Therefore \(du^*=D^{-1}du\). Consequently, the upper-left \(3\times 3\) block of \(D^{-1}\) is the Jacobian of the projection at \((x, y, z)\) in Case 4.
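To illustrate, the following sketch (ours; the function name dproj_exp_case4 is hypothetical and the routine is not taken from the accompanying package) assembles the matrix D of (26) from an already-computed Case 4 projection \((x^*,y^*,z^*)\) with \(y^*>0\), recovers \(\mu ^*\) from the third KKT condition in (25), and returns the upper-left \(3\times 3\) block of \(D^{-1}\).

```python
import numpy as np

def dproj_exp_case4(x, y, z, xs, ys, zs):
    """Jacobian of the exponential-cone projection at (x, y, z), given an
    already-computed Case 4 projection (xs, ys, zs) = (x*, y*, z*) with
    ys > 0. Builds D from (26) and returns the upper-left 3x3 block of
    its inverse; mu* is recovered from the third KKT condition in (25)."""
    mu = zs - z                        # z* - z - mu* = 0  =>  mu* = z* - z
    r = xs / ys
    e = np.exp(r)
    D = np.array([
        [1 + mu * e / ys,   -mu * r * e / ys,        0.0,  e          ],
        [-mu * r * e / ys,   1 + mu * r**2 * e / ys, 0.0, (1 - r) * e ],
        [0.0,                0.0,                    1.0, -1.0        ],
        [e,                 (1 - r) * e,            -1.0,  0.0        ],
    ])
    # D du* = du with du = (dx, dy, dz, 0), so d(x*, y*, z*)/d(x, y, z)
    # is the upper-left 3x3 block of D^{-1}.
    return np.linalg.inv(D)[:3, :3]
```

For a \(4\times 4\) system, forming \(D^{-1}\) explicitly is harmless; alternatively one can solve \(D\,du^*=du\) for the three right-hand sides \((e_i,0)\) and avoid the explicit inverse.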


Cite this article

Busseti, E., Moursi, W.M. & Boyd, S. Solution refinement at regular points of conic problems. Comput Optim Appl 74, 627–643 (2019). https://doi.org/10.1007/s10589-019-00122-9
