An optimal variant of Kelley’s cutting-plane method

  • Full Length Paper
  • Series A
  • Published in: Mathematical Programming

Abstract

We propose a new variant of Kelley’s cutting-plane method for minimizing a nonsmooth convex Lipschitz-continuous function over the Euclidean space. We derive the method through a constructive approach and prove that it attains the optimal rate of convergence for this class of problems.

Fig. 1

Notes

  1. In order to avoid overly numerous special cases, we adopt the convention \(\frac{0}{0}=0\).

  2. Note that since both problems admit a compact feasible set, attainment of both values is warranted.

References

  1. Auslender, A.: Numerical methods for nondifferentiable convex optimization. In: Cornet, B., Nguyen, V., Vial, J. (eds.) Nonlinear Analysis and Optimization, Mathematical Programming Studies, vol. 30, pp. 102–126. Springer, Berlin (1987). doi:10.1007/BFb0121157

  2. Auslender, A., Teboulle, M.: Interior gradient and epsilon-subgradient descent methods for constrained convex minimization. Math. Oper. Res. 29(1), 1–26 (2004)

  3. Ben-Tal, A., Nemirovski, A.: Non-Euclidean restricted memory level method for large-scale convex optimization. Math. Progr. 102(3), 407–456 (2005)

  4. Ben-Tal, A., Nemirovskii, A.S.: Lectures on Modern Convex Optimization. SIAM, Philadelphia (2001)

  5. Benders, J.F.: Partitioning procedures for solving mixed-variables programming problems. Numer. Math. 4(1), 238–252 (1962)

  6. Cheney, E.W., Goldstein, A.A.: Newton’s method for convex programming and Tchebycheff approximation. Numer. Math. 1(1), 253–268 (1959)

  7. de Oliveira, W., Sagastizábal, C.: Bundle Methods in the XXIst Century: A Bird's-Eye View. Optimization Online Report 4088 (2013)

  8. Drori, Y., Teboulle, M.: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Progr. Ser. A 145, 451–482 (2014)

  9. Fan, K.: Minimax theorems. Proc. Natl. Acad. Sci. USA 39(1), 42 (1953)

  10. Grone, R., Johnson, C.R., Sá, E.M., Wolkowicz, H.: Positive definite completions of partial Hermitian matrices. Linear Algebr. Appl. 58, 109–124 (1984)

  11. Kelley Jr, J.E.: The cutting-plane method for solving convex programs. J. Soc. Ind. Appl. Math. 8(4), 703–712 (1960)

  12. Kim, D., Fessler, J.A.: Optimized first-order methods for smooth convex minimization. arXiv:1406.5468 (2014)

  13. Kiwiel, K.C.: Proximity control in bundle methods for convex nondifferentiable minimization. Math. Progr. 46(1–3), 105–122 (1990)

  14. Kiwiel, K.C.: Proximal level bundle methods for convex nondifferentiable optimization, saddle-point problems and variational inequalities. Math. Progr. 69(1–3), 89–109 (1995)

  15. Kiwiel, K.C.: Efficiency of proximal bundle methods. J. Optim. Theory Appl. 104(3), 589–603 (2000)

  16. Lemaréchal, C.: An extension of Davidon methods to non differentiable problems. In: Nondifferentiable Optimization, pp. 95–109. Springer, Berlin (1975)

  17. Lemaréchal, C., Nemirovskii, A., Nesterov, Y.: New variants of bundle methods. Math. Progr. 69(1–3), 111–147 (1995)

  18. Lemaréchal, C., Sagastizábal, C.: Variable metric bundle methods: from conceptual to implementable forms. Math. Progr. 76(3), 393–410 (1997)

  19. Lukšan, L., Vlček, J.: A Bundle–Newton method for nonsmooth unconstrained minimization. Math. Progr. 83(1–3), 373–391 (1998)

  20. Mäkelä, M.: Survey of bundle methods for nonsmooth optimization. Optim. Methods Softw. 17(1), 1–29 (2002)

  21. Nemirovsky, A.S., Yudin, D.B.: Problem Complexity and Method Efficiency in Optimization. A Wiley-Interscience Publication. Wiley, New York (1983) (Translated from the Russian and with a preface by E. R. Dawson, Wiley-Interscience Series in Discrete Mathematics)

  22. Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Applied Optimization. Kluwer Academic Publishers, Dordrecht (2004)

  23. Schramm, H., Zowe, J.: A version of the bundle idea for minimizing a nonsmooth function: conceptual idea, convergence analysis, numerical results. SIAM J. Optim. 2(1), 121–152 (1992)

  24. Wolfe, P.: A method of conjugate subgradients for minimizing nondifferentiable functions. In: Nondifferentiable Optimization, pp. 145–173. Springer, Berlin (1975)

Acknowledgments

We thank the two referees and the associate editor for their constructive comments and useful suggestions.

Author information

Corresponding author

Correspondence to Marc Teboulle.

Additional information

This research was partially supported by the Israel Science Foundation under ISF Grant No. 998-12.

Appendix: a tight lower-complexity bound

In this appendix, we refine the proof from [22, Sect. 3.2] to obtain a new lower-complexity bound for the class of nonsmooth, convex, Lipschitz-continuous functions, which, together with the results discussed above, yields a tight complexity result for this class of problems. More precisely, under the setting of Sect. 2.1, we show that for any first-order method, the worst-case absolute inaccuracy after N steps cannot be smaller than \(\frac{LR}{\sqrt{N}}\), which is exactly the bound attained by Algorithm KLM.

In order to simplify the presentation, and following [22, Sect. 3.2], we restrict our attention to first-order methods that generate sequences that satisfy the following assumption:

Assumption 1

The sequence \(\{x_i\}\) satisfies

$$\begin{aligned} x_i \in x_1 + \mathrm {span}\{f'(x_1),\dots ,f'(x_{i-1})\}, \end{aligned}$$

where \(f'(x_i)\in \partial f(x_i)\) is obtained by evaluating a first-order oracle at \(x_i\).

As noted by Nesterov [22, Page 59], this assumption is not necessary and can be avoided by some additional reasoning.

The lower-complexity result is stated as follows.

Theorem 2

For any \(L,R>0\), \(N,p\in \mathbb {N}\) with \(N\le p\), and any starting point \(x_1\in \mathbb {R}^p\), there exists a convex and Lipschitz-continuous function \(f:\mathbb {R}^p\rightarrow \mathbb {R}\) with Lipschitz constant L and \(\Vert x^*_f-x_1\Vert \le R\), and a first-order oracle \(\mathcal {O}(x)= (f(x), f'(x))\), such that

$$\begin{aligned} f(x_N)-f^*\ge \frac{LR}{\sqrt{N}} \end{aligned}$$

for all sequences \(x_1,\dots ,x_N\) that satisfy Assumption 1.

Proof

The proof proceeds by constructing a “worst-case” function, on which any first-order method that satisfies Assumption 1 will not be able to improve its initial objective value during the first N iterations.

Let \(f_N:\mathbb {R}^p\rightarrow \mathbb {R}\) and \(\bar{f}_N:\mathbb {R}^p\rightarrow \mathbb {R}\) be defined by

$$\begin{aligned}&f_N(x) = \max _{1\le i \le N} \langle x, e_i\rangle , \\&\bar{f}_N(x) =L\max (f_N(x), \Vert x\Vert -R(1+N^{-1/2})), \end{aligned}$$

Then it is easy to verify that \(\bar{f}_N\) is Lipschitz-continuous with constant L and that

$$\begin{aligned} \bar{f}_N^*= -\frac{LR}{\sqrt{N}} \end{aligned}$$

is attained for \(x^*\in \mathbb {R}^p\) such that

$$\begin{aligned} x^*= -\frac{R}{\sqrt{N}}\sum _{i=1}^N e_i. \end{aligned}$$
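
These claims are easy to confirm numerically. The following sketch is ours, not from the paper, and the values of \(L, R, N, p\) are arbitrary illustrations; it evaluates \(\bar{f}_N\) at \(x^*\) and checks that random points never fall below \(-LR/\sqrt{N}\):

```python
import numpy as np

# Arbitrary illustrative parameters (not from the paper).
L, R, N, p = 2.0, 1.0, 4, 6

def f_N(x):
    # f_N(x) = max_{1 <= i <= N} <x, e_i>: the largest of the first N coordinates.
    return np.max(x[:N])

def f_bar(x):
    # \bar f_N(x) = L * max(f_N(x), ||x|| - R(1 + N^{-1/2})).
    return L * max(f_N(x), np.linalg.norm(x) - R * (1 + N ** -0.5))

# The minimizer x* = -(R / sqrt(N)) * (e_1 + ... + e_N).
x_star = np.zeros(p)
x_star[:N] = -R / np.sqrt(N)

print(f_bar(x_star))  # -L*R/sqrt(N) = -1.0 with these parameters

# Random points never go below the optimal value.
rng = np.random.default_rng(0)
assert all(f_bar(rng.normal(size=p)) >= f_bar(x_star) - 1e-9 for _ in range(1000))
```

At \(x^*\) the two terms in the maximum coincide: \(f_N(x^*) = -R/\sqrt{N}\) and \(\Vert x^*\Vert - R(1+N^{-1/2}) = -R/\sqrt{N}\), which is why the minimum value is exactly \(-LR/\sqrt{N}\).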

We equip \(\bar{f}_N\) with the oracle \(\mathcal {O}_N(x)= (\bar{f}_N(x), \bar{f}'_N(x))\) by choosing \(\bar{f}'_N(x)\in \partial \bar{f}_N(x)\) according to:

$$\begin{aligned} \bar{f}'_N(x) = {\left\{ \begin{array}{ll} L f'_N(x), &{} f_N(x)\ge \Vert x\Vert -R\left( 1+N^{-1/2}\right) ,\\ L\frac{x}{\Vert x\Vert }, &{} f_N(x)< \Vert x\Vert -R\left( 1+N^{-1/2}\right) , \end{array}\right. } \end{aligned}$$
(8.1)

where

$$\begin{aligned} f'_N(x) = e_{i^*}, \quad i^*= \min \{ i : f_N(x)=\langle x, e_i\rangle \}. \end{aligned}$$
(8.2)
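
For concreteness, the oracle \(\mathcal {O}_N\) defined by (8.1)–(8.2) can be sketched as follows (our illustration, not code from the paper; the parameter values are arbitrary). Note that `np.argmax` returns the first maximizing index, which implements the tie-breaking rule \(i^*= \min \{ i : f_N(x)=\langle x, e_i\rangle \}\):

```python
import numpy as np

# Arbitrary illustrative parameters (not from the paper).
L, R, N, p = 2.0, 1.0, 4, 6

def oracle(x):
    # First-order oracle O_N(x) = (f_bar_N(x), f_bar_N'(x)) following (8.1)-(8.2).
    fN = np.max(x[:N])                              # f_N(x)
    ball = np.linalg.norm(x) - R * (1 + N ** -0.5)  # ||x|| - R(1 + N^{-1/2})
    value = L * max(fN, ball)
    if fN >= ball:
        # Subgradient L * e_{i*}; np.argmax picks the smallest maximizing
        # index, matching i* = min{i : f_N(x) = <x, e_i>}.
        g = np.zeros(p)
        g[int(np.argmax(x[:N]))] = L
    else:
        g = L * x / np.linalg.norm(x)               # gradient of L(||x|| - const)
    return value, g

v, g = oracle(np.zeros(p))
print(v, g)  # value 0.0 and subgradient L * e_1
```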

We also denote

$$\begin{aligned} \mathbb {R}^{i,p} := \{x\in \mathbb {R}^p : \langle x, e_j\rangle =0,\ i+1\le j\le p\}. \end{aligned}$$

Now, let \(x_1,\dots ,x_N\) be a sequence that satisfies Assumption 1 with \(f=\bar{f}_N\) and the oracle \(\mathcal {O}_N\), where without loss of generality we assume \(x_1=0\). Then \(\bar{f}'_N(x_1) = Le_1\) and we get \(x_2\in \mathrm {span}\{\bar{f}'_N(x_1)\}=\mathbb {R}^{1,p}\). Now, from \(\langle x_2,e_2\rangle =\dots =\langle x_2,e_N\rangle =0\), we get that \(\min \{ i : f_N(x_2)=\langle x_2, e_i\rangle \}\le 2\), and it follows by (8.1) and (8.2) that \(f'_N(x_2)\in \mathbb {R}^{2,p}\) and \(\bar{f}'_N(x_2)\in \mathbb {R}^{2,p}\). Hence, we conclude from Assumption 1 that \(x_3 \in \mathrm {span}\{\bar{f}'_N(x_1),\bar{f}'_N(x_2)\}\subseteq \mathbb {R}^{2,p}\). Continuing this argument inductively shows that \(x_i \in \mathbb {R}^{i-1,p}\) and \(\bar{f}'_N(x_i)\in \mathbb {R}^{i,p}\) for \(i=1,\dots ,N\); in particular, \(x_N \in \mathbb {R}^{N-1,p}\). Finally, since for every \(x\in \mathbb {R}^{N-1,p}\) we have \(\bar{f}_N(x)\ge L\langle x,e_N\rangle =0\), we immediately get

$$\begin{aligned} \bar{f}_N(x_{N})-\bar{f}_N^*\ge \frac{LR}{\sqrt{N}}, \end{aligned}$$

which completes the proof. \(\square \)
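
The mechanism of the proof can also be observed numerically: any method whose iterates stay in the span of past subgradients, e.g. plain subgradient descent (our illustrative choice, with arbitrary parameters and step size, not from the paper), sees only the value \(\bar{f}_N(x_1)=0\) during its first N oracle calls, so its gap to \(\bar{f}_N^*= -LR/\sqrt{N}\) is exactly \(LR/\sqrt{N}\):

```python
import numpy as np

# Arbitrary illustrative parameters and step size (not from the paper).
L, R, N, p = 2.0, 1.0, 4, 6

def oracle(x):
    # Oracle (8.1)-(8.2) for f_bar_N.
    fN = np.max(x[:N])
    ball = np.linalg.norm(x) - R * (1 + N ** -0.5)
    if fN >= ball:
        g = np.zeros(p)
        g[int(np.argmax(x[:N]))] = L   # first maximizer: i* = min{...}
    else:
        g = L * x / np.linalg.norm(x)
    return L * max(fN, ball), g

# Subgradient descent from x_1 = 0: each iterate lies in the span of the
# previously observed subgradients, so Assumption 1 is satisfied.
x, values = np.zeros(p), []
for _ in range(N):
    v, g = oracle(x)
    values.append(v)
    x = x - 0.1 * g

f_star = -L * R / np.sqrt(N)
print(min(values) - f_star)  # gap = L*R/sqrt(N) = 1.0 here
```

Each oracle call reveals one new coordinate direction, exactly as in the induction above, so all of the first N objective values equal 0.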

About this article

Cite this article

Drori, Y., Teboulle, M. An optimal variant of Kelley’s cutting-plane method. Math. Program. 160, 321–351 (2016). https://doi.org/10.1007/s10107-016-0985-7

