Infeasibility Detection in the Alternating Direction Method of Multipliers for Convex Optimization

Abstract

The alternating direction method of multipliers is a powerful operator splitting technique for solving structured optimization problems. For convex optimization problems, it is well known that the algorithm generates iterates that converge to a solution, provided that it exists. If a solution does not exist, then the iterates diverge. Nevertheless, we show that they yield conclusive information regarding problem infeasibility for optimization problems with linear or quadratic objective functions and conic constraints, which includes quadratic, second-order cone, and semidefinite programs. In particular, we show that in the limit the iterates either satisfy a set of first-order optimality conditions or produce a certificate of either primal or dual infeasibility. Based on these results, we propose termination criteria for detecting primal and dual infeasibility.



References

  1. Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1(3), 123–231 (2013). https://doi.org/10.1561/2400000003
  2. Bauschke, H.H., Borwein, J.M.: On projection algorithms for solving convex feasibility problems. SIAM Rev. 38(3), 367–426 (1996). https://doi.org/10.1137/S0036144593251710
  3. Bauschke, H.H., Combettes, P.L., Luke, D.R.: Finding best approximation pairs relative to two closed convex sets in Hilbert spaces. J. Approx. Theory 127(2), 178–192 (2004). https://doi.org/10.1016/j.jat.2004.02.006
  4. Boley, D.: Local linear convergence of the alternating direction method of multipliers on quadratic or linear programs. SIAM J. Optim. 23(4), 2183–2207 (2013). https://doi.org/10.1137/120878951
  5. O’Donoghue, B., Chu, E., Parikh, N., Boyd, S.: Conic optimization via operator splitting and homogeneous self-dual embedding. J. Optim. Theory Appl. 169(3), 1042–1068 (2016). https://doi.org/10.1007/s10957-016-0892-3
  6. Zheng, Y., Fantuzzi, G., Papachristodoulou, A., Goulart, P., Wynn, A.: Chordal decomposition in operator-splitting methods for sparse semidefinite programs. Math. Program. (2019). https://doi.org/10.1007/s10107-019-01366-3
  7. O’Donoghue, B., Stathopoulos, G., Boyd, S.: A splitting method for optimal control. IEEE Trans. Control Syst. Technol. 21(6), 2432–2442 (2013). https://doi.org/10.1109/TCST.2012.2231960
  8. Jerez, J., Goulart, P., Richter, S., Constantinides, G., Kerrigan, E., Morari, M.: Embedded online optimization for model predictive control at megahertz rates. IEEE Trans. Autom. Control 59(12), 3238–3251 (2014). https://doi.org/10.1109/TAC.2014.2351991
  9. Banjac, G., Stellato, B., Moehle, N., Goulart, P., Bemporad, A., Boyd, S.: Embedded code generation using the OSQP solver. In: IEEE Conference on Decision and Control (CDC), pp. 1906–1911 (2017). https://doi.org/10.1109/CDC.2017.8263928
  10. Beck, A., Teboulle, M.: Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Process. 18(11), 2419–2434 (2009). https://doi.org/10.1109/TIP.2009.2028250
  11. Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4(4), 1168–1200 (2005). https://doi.org/10.1137/050626090
  12. Combettes, P.L., Pesquet, J.C.: Proximal splitting methods in signal processing. Springer Optim. Appl. 49, 185–212 (2011). https://doi.org/10.1007/978-1-4419-9569-8_10
  13. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011). https://doi.org/10.1561/2200000016
  14. Stathopoulos, G., Shukla, H., Szucs, A., Pu, Y., Jones, C.: Operator splitting methods in control. Found. Trends Syst. Control 3(3), 249–362 (2016). https://doi.org/10.1561/2600000008
  15. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York (2011). https://doi.org/10.1007/978-1-4419-9467-7
  16. Ghadimi, E., Teixeira, A., Shames, I., Johansson, M.: Optimal parameter selection for the alternating direction method of multipliers (ADMM): quadratic problems. IEEE Trans. Autom. Control 60(3), 644–658 (2015). https://doi.org/10.1109/TAC.2014.2354892
  17. Giselsson, P., Boyd, S.: Linear convergence and metric selection for Douglas–Rachford splitting and ADMM. IEEE Trans. Autom. Control 62(2), 532–544 (2017). https://doi.org/10.1109/TAC.2016.2564160
  18. Banjac, G., Goulart, P.: Tight global linear convergence rate bounds for operator splitting methods. IEEE Trans. Autom. Control 63(12), 4126–4139 (2018). https://doi.org/10.1109/TAC.2018.2808442
  19. Naik, V.V., Bemporad, A.: Embedded mixed-integer quadratic optimization using accelerated dual gradient projection. In: IFAC World Congress, pp. 10723–10728 (2017). https://doi.org/10.1016/j.ifacol.2017.08.2235
  20. Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1), 293–318 (1992). https://doi.org/10.1007/BF01581204
  21. Bauschke, H.H., Dao, M.N., Moursi, W.M.: The Douglas–Rachford algorithm in the affine-convex case. Oper. Res. Lett. 44(3), 379–382 (2016). https://doi.org/10.1016/j.orl.2016.03.010
  22. Bauschke, H.H., Moursi, W.M.: The Douglas–Rachford algorithm for two (not necessarily intersecting) affine subspaces. SIAM J. Optim. 26(2), 968–985 (2016). https://doi.org/10.1137/15M1016989
  23. Bauschke, H.H., Moursi, W.M.: On the Douglas–Rachford algorithm. Math. Program. 164(1), 263–284 (2017). https://doi.org/10.1007/s10107-016-1086-3
  24. Moursi, W.M.: The Douglas–Rachford operator in the possibly inconsistent case: static properties and dynamic behaviour. Ph.D. thesis, University of British Columbia (2016). https://doi.org/10.14288/1.0340501
  25. Raghunathan, A.U., Di Cairano, S.: Infeasibility detection in alternating direction method of multipliers for convex quadratic programs. In: IEEE Conference on Decision and Control (CDC), pp. 5819–5824 (2014). https://doi.org/10.1109/CDC.2014.7040300
  26. Toh, K.C.: An inexact primal-dual path following algorithm for convex quadratic SDP. Math. Program. 112(1), 221–254 (2008). https://doi.org/10.1007/s10107-006-0088-y
  27. Henrion, D., Malick, J.: Projection Methods in Conic Optimization, pp. 565–600. Springer, New York (2012). https://doi.org/10.1007/978-1-4614-0769-0_20
  28. Stellato, B., Banjac, G., Goulart, P., Bemporad, A., Boyd, S.: OSQP: an operator splitting solver for quadratic programs. arXiv:1711.08013 (2018)
  29. Rockafellar, R.T., Wets, R.J.B.: Variational Analysis. Grundlehren der mathematischen Wissenschaften. Springer, New York (1998). https://doi.org/10.1007/978-3-642-02431-3
  30. Lourenço, B.F., Muramatsu, M., Tsuchiya, T.: Weak infeasibility in second order cone programming. Optim. Lett. 10(8), 1743–1755 (2016). https://doi.org/10.1007/s11590-015-0982-4
  31. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
  32. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004). https://doi.org/10.1017/CBO9780511804441
  33. Gabay, D.: Applications of the method of multipliers to variational inequalities. Stud. Math. Appl. 15(C), 299–331 (1983). https://doi.org/10.1016/S0168-2024(08)70034-1
  34. Giselsson, P., Fält, M., Boyd, S.: Line search for averaged operator iteration. In: IEEE Conference on Decision and Control (CDC), pp. 1015–1022 (2016). https://doi.org/10.1109/CDC.2016.7798401
  35. Lions, P., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979). https://doi.org/10.1137/0716071
  36. Pazy, A.: Asymptotic behavior of contractions in Hilbert space. Isr. J. Math. 9(2), 235–240 (1971). https://doi.org/10.1007/BF02771588
  37. Baillon, J.B., Bruck, R.E., Reich, S.: On the asymptotic behavior of nonexpansive mappings and semigroups in Banach spaces. Houst. J. Math. 4(1), 1–9 (1978)
  38. Borchers, B.: SDPLIB 1.2, a library of semidefinite programming test problems. Optim. Methods Softw. 11(1), 683–690 (1999). https://doi.org/10.1080/10556789908805769
  39. Ramana, M.V.: An exact duality theory for semidefinite programming and its complexity implications. Math. Program. 77(1), 129–162 (1997). https://doi.org/10.1007/BF02614433


Acknowledgements

We are grateful to Walaa Moursi for helpful comments and pointing out additional references. This work was supported by the People Programme (Marie Curie Actions) of the European Union Seventh Framework Programme (FP7/2007–2013) under REA Grant Agreement No. 607957 (TEMPO).

Author information


Correspondence to Goran Banjac.


Communicated by Jalal M. Fadili.

Appendix A: Supporting Results

Lemma A.1

The first-order optimality conditions for problem (2) are conditions (3).

Proof

We first rewrite problem (2) in the form

$$\begin{aligned} \underset{(x,z)}{\mathrm{min}} \quad \left( \tfrac{1}{2}x^TP x + q^Tx + \mathcal {I}_\mathcal {C}(z) \right) \quad \mathrm{s.t.} \quad Ax=z \end{aligned}$$

and then form its Lagrangian

$$\begin{aligned} \mathcal {L}(x,z,y) :=\tfrac{1}{2}x^TP x + q^Tx + \mathcal {I}_\mathcal {C}(z) + y^T(Ax-z). \end{aligned}$$
(35)

Provided that the problem satisfies a suitable constraint qualification [15, Cor. 26.3], its solution can be characterized via a saddle point of (35). The first-order optimality conditions can therefore be written as [29, Ex. 11.52]

$$\begin{aligned} z&\in \mathcal {C} \\ 0&= -\nabla _x \mathcal {L}(x,z,y) = -(Px + q + A^Ty) \\ N_{\mathcal {C}}(z)&\ni \,-\nabla _z \mathcal {L}(x,z,y) = y \\ 0&= \nabla _y \mathcal {L}(x,z,y) = Ax - z. \end{aligned}$$

\(\square \)
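The conditions above can be sanity-checked numerically on a toy instance. The following sketch uses hypothetical data (not from the paper), taking \(\mathcal {C}\) to be the nonnegative orthant and a one-dimensional problem whose minimizer and multiplier are known in closed form:

```python
import numpy as np

# Hypothetical 1-D instance of problem (2): min 0.5 x'Px + q'x  s.t.  Ax = z, z in C,
# with C = R_+, P = 1, q = 1, A = 1.  The minimizer is x = 0 with multiplier y = -1.
P, q, A = np.array([[1.0]]), np.array([1.0]), np.array([[1.0]])
x, z, y = np.array([0.0]), np.array([0.0]), np.array([-1.0])

assert np.all(z >= 0)                       # z in C
assert np.allclose(P @ x + q + A.T @ y, 0)  # stationarity: Px + q + A'y = 0
assert np.allclose(A @ x - z, 0)            # primal feasibility: Ax = z
# y in N_C(z): for C = R_+, the normal cone at z = 0 is (-inf, 0]
assert np.all(y[z == 0] <= 0) and np.allclose(y[z > 0], 0)
print("optimality conditions hold")
```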

Lemma A.2

The dual of problem (1) is given by problem (4).

Proof

The dual function can be derived from the Lagrangian (35) as follows:

$$\begin{aligned} g(y)&:=\inf _{(x,z)} \mathcal {L}(x,z,y) \\&= \inf _x \lbrace \tfrac{1}{2}x^TP x + (A^Ty + q)^Tx \rbrace + \inf _{z\in \mathcal {C}} \lbrace -y^Tz \rbrace \\&= \inf _x \lbrace \tfrac{1}{2}x^TP x + (A^Ty + q)^Tx \rbrace - \sup _{z\in \mathcal {C}} \lbrace y^Tz \rbrace . \end{aligned}$$

Note that the minimum of the Lagrangian over x is attained when \(Px + A^Ty + q = 0\), and the second term in the last line is \(S_{\mathcal {C}}(y)\). The dual problem, defined as the problem of maximizing the dual function, can then be written in the form (4), where the conic constraint on y is just the restriction of y to the domain of \(S_{\mathcal {C}}\) [31, p.112 and Cor. 14.2.1]. \(\square \)
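The derivation can be made concrete on a hypothetical one-dimensional instance with \(P=1\), \(q=1\), \(A=1\), and \(\mathcal {C}=\mathbb {R}_+\) (illustrative data, not from the paper): there \(g(y)=-\tfrac{1}{2}(y+1)^2 - S_{\mathcal {C}}(y)\), and since \(S_{\mathcal {C}}(y)=0\) for \(y\le 0\) and \(+\infty \) otherwise, the dual problem restricts \(y\) to the polar cone:

```python
import numpy as np

# Dual function of the hypothetical instance: g(y) = inf_x {0.5 x^2 + (y+1) x} - S_C(y).
def g(y):
    if y > 0:                        # outside dom S_C: the support function is +inf
        return -np.inf
    return -0.5 * (y + 1.0) ** 2     # inf over x, attained at x = -(y + 1)

ys = np.linspace(-3.0, 0.0, 3001)
y_star = ys[np.argmax([g(y) for y in ys])]
assert abs(y_star + 1.0) < 1e-3      # dual maximizer y* = -1
assert abs(g(y_star)) < 1e-6         # dual optimal value 0 equals the primal optimum
```

The dual maximizer coincides with the multiplier appearing in the optimality conditions of Lemma A.1 for the same instance, as expected from strong duality.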

Lemma A.3

For any vectors \(v\in \mathbb {R}^n\), \(b\in \mathbb {R}^n\) and a nonempty, closed, and convex cone \(\mathcal {K}\subseteq \mathbb {R}^n\),

  (i) \({\Pi }_{\mathcal {K}_b}(v) = b + {\Pi }_{\mathcal {K}}(v - b)\).

  (ii) \(({{\,\mathrm{Id}\,}}- {\Pi }_{\mathcal {K}_b})(v) = {\Pi }_{{\mathcal {K}}^{\circ }}(v - b)\).

  (iii) \(\left\langle {{\Pi }_{\mathcal {K}_b}(v)},{({{\,\mathrm{Id}\,}}- {\Pi }_{\mathcal {K}_b})(v)}\right\rangle = \left\langle {b},{{\Pi }_{{\mathcal {K}}^{\circ }}(v - b)}\right\rangle \).

  (iv) \(\left\langle {{\Pi }_{\mathcal {K}}(v)},{v}\right\rangle = ||{\Pi }_{\mathcal {K}}(v)||^2\).

  (v) \(S_{\mathcal {K}_b} ( {\Pi }_{{\mathcal {K}}^{\circ }}(v) ) = \left\langle {b},{{\Pi }_{{\mathcal {K}}^{\circ }}(v)}\right\rangle \).

Proof

Part (i) is from [15, Prop. 28.1(i)].

  (ii) From part (i), we have

    $$\begin{aligned} ({{\,\mathrm{Id}\,}}- {\Pi }_{\mathcal {K}_b})(v) = v - b - {\Pi }_{\mathcal {K}}(v - b) = {\Pi }_{{\mathcal {K}}^{\circ }}(v - b), \end{aligned}$$

    where the second equality follows from the Moreau decomposition [15, Thm. 6.29].

  (iii) Follows directly from parts (i) and (ii) and the Moreau decomposition.

  (iv) From the Moreau decomposition, we have

    $$\begin{aligned} \left\langle {{\Pi }_{\mathcal {K}}(v)},{v}\right\rangle = \left\langle {{\Pi }_{\mathcal {K}}(v)},{{\Pi }_{\mathcal {K}}(v) + {\Pi }_{{\mathcal {K}}^{\circ }}(v)}\right\rangle = ||{\Pi }_{\mathcal {K}}(v)||^2, \end{aligned}$$

    since \(\left\langle {{\Pi }_{\mathcal {K}}(v)},{{\Pi }_{{\mathcal {K}}^{\circ }}(v)}\right\rangle = 0\).

  (v) Since the support function of \(\mathcal {K}\) evaluated at any point in \({\mathcal {K}}^{\circ }\) is zero, we have

    $$\begin{aligned} S_{\mathcal {K}_b} ( {\Pi }_{{\mathcal {K}}^{\circ }}(v) ) = \left\langle {b},{{\Pi }_{{\mathcal {K}}^{\circ }}(v)}\right\rangle + S_{\mathcal {K}} ( {\Pi }_{{\mathcal {K}}^{\circ }}(v) ) = \left\langle {b},{{\Pi }_{{\mathcal {K}}^{\circ }}(v)}\right\rangle . \end{aligned}$$

\(\square \)
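For cones whose projections are available in closed form, the identities of Lemma A.3 are easy to verify numerically. The sketch below uses \(\mathcal {K}=\mathbb {R}^n_+\) (so \({\mathcal {K}}^{\circ }=\mathbb {R}^n_-\)) and takes \(\mathcal {K}_b\) to be the translated cone \(b+\mathcal {K}\), consistent with part (i); the data are random and purely illustrative:

```python
import numpy as np

# K = R_+^n, K° = R_-^n, K_b = b + K; projections are componentwise max/min.
rng = np.random.default_rng(0)
n = 5
v, b = rng.normal(size=n), rng.normal(size=n)

proj_K  = lambda u: np.maximum(u, 0.0)   # projection onto K
proj_Kp = lambda u: np.minimum(u, 0.0)   # projection onto the polar cone K°
proj_Kb = lambda u: b + proj_K(u - b)    # part (i): projection onto b + K

# Moreau decomposition underlying the proofs: v = Π_K(v) + Π_{K°}(v), orthogonally
assert np.allclose(v, proj_K(v) + proj_Kp(v))
assert np.isclose(proj_K(v) @ proj_Kp(v), 0.0)
# (ii): (Id - Π_{K_b})(v) = Π_{K°}(v - b)
assert np.allclose(v - proj_Kb(v), proj_Kp(v - b))
# (iii): <Π_{K_b}(v), (Id - Π_{K_b})(v)> = <b, Π_{K°}(v - b)>
assert np.isclose(proj_Kb(v) @ (v - proj_Kb(v)), b @ proj_Kp(v - b))
# (iv): <Π_K(v), v> = ||Π_K(v)||²
assert np.isclose(proj_K(v) @ v, proj_K(v) @ proj_K(v))
```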

Lemma A.4

Suppose that \(\mathcal {K}\subseteq \mathbb {R}^n\) is a nonempty, closed, and convex cone, let \(\lbrace {v^k} \rbrace _{k\in {\mathbb {N}}}\) be a sequence with \(v^k\in \mathbb {R}^n\), and set \(\delta v :=\lim _{k\rightarrow \infty }\tfrac{1}{k}v^k\), assuming that the limit exists. Then for any \(b\in \mathbb {R}^n\),

$$\begin{aligned} \lim _{k\rightarrow \infty }\tfrac{1}{k}{\Pi }_{\mathcal {K}_b}(v^k) = \lim _{k\rightarrow \infty }\tfrac{1}{k}{\Pi }_{\mathcal {K}}(v^k-b) = {\Pi }_{\mathcal {K}}(\delta v). \end{aligned}$$

Proof

Write the limit as

$$\begin{aligned} \lim _{k\rightarrow \infty }\tfrac{1}{k}{\Pi }_{\mathcal {K}_b}(v^k)&= \lim _{k\rightarrow \infty }\tfrac{1}{k}\left( b + {\Pi }_{\mathcal {K}}(v^k - b) \right) \\&= \lim _{k\rightarrow \infty }{\Pi }_{\mathcal {K}} \left( \tfrac{1}{k}(v^k - b) \right) \\&= {\Pi }_{\mathcal {K}} \left( \lim _{k\rightarrow \infty }\tfrac{1}{k} v^k \right) , \end{aligned}$$

where the first equality uses Lemma A.3(i) and the second and third follow from the positive homogeneity [15, Prop. 28.22] and continuity [15, Prop. 4.8] of \({\Pi }_{\mathcal {K}}\), respectively. \(\square \)
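The lemma can be illustrated numerically: for \(v^k = k\,\delta v + w\) with a fixed offset \(w\), we have \(\tfrac{1}{k}v^k\rightarrow \delta v\), and the scaled projection converges to \({\Pi }_{\mathcal {K}}(\delta v)\). A sketch for \(\mathcal {K}=\mathbb {R}^n_+\) with random illustrative data:

```python
import numpy as np

# v^k = k*dv + w, so (1/k) v^k -> dv; check (1/k) Π_{K_b}(v^k) ≈ Π_K(dv) for large k.
rng = np.random.default_rng(1)
n = 4
b, dv, w = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)

proj_K  = lambda u: np.maximum(u, 0.0)   # projection onto K = R_+^n
proj_Kb = lambda u: b + proj_K(u - b)    # projection onto b + K (Lemma A.3(i))

k = 10**7
vk = k * dv + w
assert np.allclose(proj_Kb(vk) / k, proj_K(dv), atol=1e-5)
```

Note how the contribution of the fixed vectors \(b\) and \(w\) vanishes at rate \(1/k\), which is exactly the mechanism exploited in the proof.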

Lemma A.5

Suppose that \(\mathcal {B}\subseteq \mathbb {R}^n\) is a nonempty, convex, and compact set, let \(\lbrace {v^k} \rbrace _{k\in {\mathbb {N}}}\) be a sequence with \(v^k\in \mathbb {R}^n\), and set \(\delta v :=\lim _{k\rightarrow \infty }\tfrac{1}{k}v^k\), assuming that the limit exists. Then

$$\begin{aligned} \lim _{k\rightarrow \infty } \tfrac{1}{k} \left\langle {v^k},{{\Pi }_{\mathcal {B}}(v^k)}\right\rangle = \lim _{k\rightarrow \infty } \left\langle {\delta v},{{\Pi }_{\mathcal {B}}(v^k)}\right\rangle = S_{\mathcal {B}}(\delta v). \end{aligned}$$

Proof

Let \(z^k :={\Pi }_{\mathcal {B}}(v^k)\). We have the following inclusion [15, Prop. 6.46]

$$\begin{aligned} v^k - z^k \in N_{\mathcal {B}}(z^k), \end{aligned}$$

which, due to [15, Thm. 16.23], and the facts that \(S_{\mathcal {B}}\) is the Fenchel conjugate of \(\mathcal {I}_\mathcal {B}\) and \(N_{\mathcal {B}}\) is the subdifferential of \(\mathcal {I}_\mathcal {B}\), is equivalent to

$$\begin{aligned} \left\langle {\tfrac{1}{k}(v^k-z^k)},{z^k}\right\rangle = S_{\mathcal {B}}\left( \tfrac{1}{k}(v^k-z^k)\right) . \end{aligned}$$

Taking the limit of the identity above, we obtain

$$\begin{aligned} \lim _{k\rightarrow \infty } \left\langle {\tfrac{1}{k}(v^k-z^k)},{z^k}\right\rangle= & {} \lim _{k\rightarrow \infty } S_{\mathcal {B}}\left( \tfrac{1}{k}(v^k-z^k)\right) \nonumber \\= & {} S_{\mathcal {B}}\big (\lim _{k\rightarrow \infty }\tfrac{1}{k}(v^k-z^k)\big ) = S_{\mathcal {B}}(\delta v), \end{aligned}$$
(36)

where the second equality follows from the continuity of \(S_{\mathcal {B}}\) [15, Ex. 11.2] and the third from the compactness of \(\mathcal {B}\). Since \(\lbrace {z^k} \rbrace _{k\in {\mathbb {N}}}\) remains in the compact set \(\mathcal {B}\), we can derive the following relation from (36):

$$\begin{aligned} \left|S_{\mathcal {B}}(\delta v) - \lim _{k\rightarrow \infty } \left\langle {\delta v},{z^k}\right\rangle \right|&= \left|\lim _{k\rightarrow \infty } \left\langle {\tfrac{1}{k}(v^k-z^k)},{z^k}\right\rangle - \left\langle {\delta v},{z^k}\right\rangle \right|\\&= \left|\lim _{k\rightarrow \infty } \left\langle {\tfrac{1}{k}v^k - \delta v},{z^k}\right\rangle - \tfrac{1}{k} \left\langle {z^k},{z^k}\right\rangle \right|\\&\le \lim _{k\rightarrow \infty } \underbrace{||\tfrac{1}{k} v^k - \delta v||}_{\rightarrow 0} \, ||z^k|| + \tfrac{1}{k} ||z^k||^2 \\&= 0, \end{aligned}$$

where the third line follows from the triangle and Cauchy–Schwarz inequalities and the fourth from the compactness of \(\mathcal {B}\). Finally, we can derive the following identity from (36):

$$\begin{aligned} S_{\mathcal {B}}(\delta v) = \lim _{k\rightarrow \infty } \left\langle {\tfrac{1}{k}(v^k-z^k)},{z^k}\right\rangle = \lim _{k\rightarrow \infty } \left\langle {\tfrac{1}{k} v^k},{z^k}\right\rangle - \underbrace{\tfrac{1}{k}||z^k||^2}_{\rightarrow 0}. \end{aligned}$$

This concludes the proof. \(\square \)
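Lemma A.5 admits a similar numerical check. For the unit \(\ell _\infty \) ball \(\mathcal {B}\), the projection is componentwise clipping and \(S_{\mathcal {B}}(u)=\Vert u\Vert _1\); with \(v^k = k\,\delta v + w\) and random illustrative data:

```python
import numpy as np

# B = unit l_inf ball: Π_B clips componentwise, S_B(u) = ||u||_1.
rng = np.random.default_rng(2)
n = 4
dv, w = rng.normal(size=n), rng.normal(size=n)

proj_B = lambda u: np.clip(u, -1.0, 1.0)
S_B = lambda u: np.abs(u).sum()

k = 10**6
vk = k * dv + w                      # (1/k) v^k -> dv
# both limits in the lemma agree with S_B(dv)
assert np.isclose((vk @ proj_B(vk)) / k, S_B(dv), atol=1e-4)
assert np.isclose(dv @ proj_B(vk), S_B(dv), atol=1e-4)
```

For large \(k\) the projection \({\Pi }_{\mathcal {B}}(v^k)\) saturates at the face of \(\mathcal {B}\) that supports the direction \(\delta v\), which is why both inner products converge to the support function value.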


Cite this article

Banjac, G., Goulart, P., Stellato, B. et al. Infeasibility Detection in the Alternating Direction Method of Multipliers for Convex Optimization. J Optim Theory Appl 183, 490–519 (2019). https://doi.org/10.1007/s10957-019-01575-y


Keywords

  • Convex optimization
  • Infeasibility detection
  • Alternating direction method of multipliers
  • Conic programming

Mathematics Subject Classification

  • 90C06
  • 90C20
  • 90C22
  • 90C25
  • 68Q25