Abstract
We systematically study the local single-valuedness of the Bregman proximal mapping and the local smoothness of the Bregman–Moreau envelope of a nonconvex function under relative prox-regularity, an extension of the notion of prox-regularity originally introduced by Poliquin and Rockafellar. As Bregman distances are asymmetric in general, it is natural, in accordance with Bauschke et al., to consider two variants of the Bregman proximal mapping, which, depending on the order of the arguments, are called the left and the right Bregman proximal mapping. We first consider the left Bregman proximal mapping. Then, via a translation result, we obtain analogous (and partially sharper) results for the right Bregman proximal mapping. The class of relatively prox-regular functions significantly extends the recently considered class of relatively hypoconvex functions. In particular, relative prox-regularity allows for functions with a possibly nonconvex domain. Moreover, as a main source of examples and in analogy to the classical setting, we introduce relatively amenable functions, i.e., convexly composite functions for which the inner nonlinear mapping is component-wise smooth adaptable, a recently introduced extension of Lipschitz differentiability. By way of example, we apply our theory to interpret joint alternating Bregman minimization with proximal regularization, locally, as a Bregman proximal gradient algorithm applied to a smooth adaptable function.
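For orientation, the following display records the conventions that are standard in the cited Bregman literature; the arrow-decorated symbols are used here purely for illustration, and the paper's own notation and normalization may differ. Given a Legendre function \(\phi \), a step size \(\lambda > 0\) and a function \(f\), the Bregman distance generated by \(\phi \) and the induced left and right Bregman proximal mappings read
$$\begin{aligned} D_{\phi }(x, y)&:= \phi (x) - \phi (y) - \langle \nabla \phi (y), x - y \rangle ,\\ \overleftarrow{P}{}^{\phi }_{\lambda f}(\bar{x})&:= \mathop {\mathrm{argmin}}\limits _{x} \Big \{ f(x) + \tfrac{1}{\lambda } D_{\phi }(x, \bar{x}) \Big \}, \qquad \overrightarrow{P}{}^{\phi }_{\lambda f}(\bar{x}) := \mathop {\mathrm{argmin}}\limits _{x} \Big \{ f(x) + \tfrac{1}{\lambda } D_{\phi }(\bar{x}, x) \Big \}, \end{aligned}$$
and the associated left and right Bregman–Moreau envelopes are obtained by replacing the argmin by the corresponding infimal value.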
Notes
We write “nonconvex Bregman projection” or “nonconvex Bregman proximal mapping” for the sake of convenience; by these we mean the Bregman projection onto a (possibly) nonconvex set and the Bregman proximal mapping of a (possibly) nonconvex function, respectively.
References
Moreau, J.J.: Proximité et dualité dans un espace Hilbertien. Bulletin de la S. M. F. 93, 273–299 (1965)
Attouch, H.: Convergence de fonctions convexes, des sous-différentiels et semi-groupes associés. Comptes Rendus de l’Académie des Sciences de Paris 285, 539–542 (1977)
Attouch, H.: Variational Convergence for Functions and Operators. Pitman Advanced Publishing Program, Boston (1984)
Poliquin, R.A., Rockafellar, R.T.: Prox-regular functions in variational analysis. Trans. Am. Math. Soc. 348, 1805–1838 (1996)
Poliquin, R.A.: Integration of subdifferentials of nonconvex functions. Nonlinear Anal. Theory Methods Appl. 17(4), 385–398 (1991)
Rockafellar, R.T., Wets, R.J.B.: Variational Analysis. Springer, New York (1998)
Bačák, M., Borwein, J.M., Eberhard, A., Mordukhovich, B.: Infimal convolutions and Lipschitzian properties of subdifferentials for prox-regular functions in Hilbert spaces. J. Convex Anal. 17, 732–763 (2010)
Jourani, A., Thibault, L., Zagrodny, D.: Differential properties of the Moreau envelope. J. Funct. Anal. 266(3), 1185–1237 (2014)
Poliquin, R.A., Rockafellar, R.T., Thibault, L.: Local differentiability of distance functions. Trans. Am. Math. Soc. 352(11), 5231–5249 (2000)
Bregman, L.M.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7, 200–217 (1967)
Bauschke, H.H., Bolte, J., Teboulle, M.: A descent lemma beyond Lipschitz gradient continuity: first-order methods revisited and applications. Math. Oper. Res. 42(2), 330–348 (2017)
Bolte, J., Sabach, S., Teboulle, M., Vaisbourd, Y.: First order methods beyond convexity and Lipschitz gradient continuity with applications to quadratic inverse problems. SIAM J. Optim. 28(3), 2131–2151 (2018)
Teboulle, M.: Entropic proximal mappings with applications to nonlinear programming. Math. Oper. Res. 17(3), 670–690 (1992)
Bauschke, H.H., Bolte, J., Chen, J., Teboulle, M., Wang, X.: On linear convergence of non-Euclidean gradient methods without strong convexity and Lipschitz gradient continuity. J. Optim. Theory Appl. 182(3), 1068–1087 (2019). https://doi.org/10.1007/s10957-019-01516-9
Mukkamala, M.C., Ochs, P., Pock, T., Sabach, S.: Convex-concave backtracking for inertial Bregman proximal gradient algorithms in non-convex optimization. arXiv:1904.03537 (2019)
Bauschke, H.H., Combettes, P.L., Noll, D.: Joint minimization with alternating Bregman proximity operators. Pac. J. Optim. 2(3), 401–424 (2006)
Bauschke, H.H., Borwein, J.M., Combettes, P.L.: Bregman monotone optimization algorithms. SIAM J. Control Optim. 42(2), 596–636 (2003)
Bauschke, H.H., Combettes, P.L.: Iterating Bregman retractions. SIAM J. Optim. 13(4), 1159–1173 (2003)
Eckstein, J.: Nonlinear proximal point algorithms using Bregman functions, with applications to convex programming. Math. Oper. Res. 18(1), 202–226 (1993)
Ochs, P., Fadili, J., Brox, T.: Non-smooth non-convex Bregman minimization: unification and new algorithms. J. Optim. Theory Appl. 181(1), 244–278 (2018)
Byrne, C., Censor, Y.: Proximity function minimization using multiple Bregman projections, with applications to split feasibility and Kullback–Leibler distance minimization. Ann. Oper. Res. 105(1), 77–98 (2001)
Censor, Y., Reich, S.: The Dykstra algorithm with Bregman projections. Commun. Appl. Anal. 2, 407–419 (1998)
Censor, Y., Herman, G.: Block-iterative algorithms with underrelaxed Bregman projections. SIAM J. Optim. 13(1), 283–297 (2002)
Censor, Y., Zenios, S.A.: Parallel Optimization: Theory, Algorithms, and Applications. Oxford University Press, New York (1997)
Kassay, G., Reich, S., Sabach, S.: Iterative methods for solving systems of variational inequalities in reflexive Banach spaces. SIAM J. Optim. 21(4), 1319–1344 (2011)
Kiwiel, K.: Proximal minimization methods with generalized Bregman functions. SIAM J. Control Optim. 35(4), 1142–1168 (1997)
Nguyen, Q.: Variable quasi-Bregman monotone sequences. Numer. Algorithms 73(4), 1107–1130 (2016)
Davis, D., Drusvyatskiy, D., MacPhee, K.J.: Stochastic model-based minimization under high-order growth. arXiv:1807.00255 (2018)
Reem, D., Reich, S., De Pierro, A.: A telescopic Bregmanian proximal gradient method without the global Lipschitz continuity assumption. J. Optim. Theor. Appl. 182(3), 851–884 (2019). https://doi.org/10.1007/s10957-019-01509-8
Hanzely, F., Richtarik, P., Xiao, L.: Accelerated Bregman proximal gradient methods for relatively smooth convex optimization. arXiv:1808.03045 (2018)
Lu, H., Freund, R., Nesterov, Y.: Relatively smooth convex optimization by first-order methods and applications. SIAM J. Optim. 28(1), 333–354 (2018)
Burachik, R., Kassay, G.: On a generalized proximal point method for solving equilibrium problems in Banach spaces. Nonlinear Anal. Theory Methods Appl. 75(18), 6456–6464 (2012)
Mukkamala, M.C., Ochs, P.: Beyond alternating updates for matrix factorization with inertial Bregman proximal gradient algorithms. In: Advances in Neural Information Processing Systems 32, pp. 4268–4278. Curran Associates, Inc. (2019)
Nguyen, Q.: Forward–backward splitting with Bregman distances. Vietnam J. Math. 45, 1–21 (2017)
Benning, M., Betcke, M.M., Ehrhardt, M.J., Schönlieb, C.B.: Choose your path wisely: gradient descent in a Bregman distance framework. arXiv:1712.04045 (2017)
Censor, Y., Zenios, S.: Proximal minimization algorithm with D-functions. J. Optim. Theory Appl. 73(3), 451–464 (1992)
Bauschke, H.H., Dao, M., Lindstrom, S.: Regularizing with Bregman–Moreau envelopes. SIAM J. Optim. 28(4), 3208–3228 (2018)
Nemirovsky, A.S., Yudin, D.B.: Problem Complexity and Method Efficiency in Optimization. Wiley, Chichester (1983)
Chen, Y.Y., Kan, C., Song, W.: The Moreau envelope function and proximal mapping with respect to the Bregman distance in Banach spaces. Vietnam J. Math. 40(2&3), 181–199 (2012)
Kan, C., Song, W.: The Moreau envelope function and proximal mapping in the sense of the Bregman distance. Nonlinear Anal. Theory Methods Appl. 75(3), 1385–1399 (2012)
Bauschke, H.H., Wang, X., Ye, J., Yuan, X.: Bregman distances and Chebyshev sets. J. Approx. Theory 159(1), 3–25 (2009)
Wang, X.: On Chebyshev functions and Klee functions. J. Math. Anal. Appl. 368(1), 293–310 (2010)
Laude, E., Wu, T., Cremers, D.: A nonconvex proximal splitting algorithm under Moreau–Yosida regularization. In: Proceedings of the 21st International Conference on Artificial Intelligence and Statistics, vol. 84, pp. 491–499. PMLR (2018)
Laude, E., Wu, T., Cremers, D.: Optimization of inf-convolution regularized nonconvex composite problems. In: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, vol. 89, pp. 547–556. PMLR (2019)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Bauschke, H.H., Borwein, J.M.: Legendre functions and the method of random Bregman projections. J. Convex Anal. 4(1), 27–67 (1997)
Bauschke, H.H., Lewis, A.S.: Dykstra’s algorithm with Bregman projections: a convergence proof. Optimization 48(4), 409–427 (2000)
Bauschke, H.H., Borwein, J.M., Combettes, P.L.: Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces. Commun. Contemp. Math. 3(04), 615–647 (2001)
Bauschke, H.H., Macklem, M.S., Wang, X.: Chebyshev sets, Klee sets, and Chebyshev centers with respect to Bregman distances: recent results and open problems. In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pp. 1–21. Springer, New York (2011)
Harville, D.A.: Matrix Algebra: Exercises and Solutions. Springer, Berlin (2001)
Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Łojasiewicz inequality. Math. Oper. Res. 35(2), 438–457 (2010)
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1–2), 459–494 (2014)
Lewis, A.S., Luke, D.R., Malick, J.: Local linear convergence for alternating and averaged nonconvex projections. Found. Comput. Math. 9(4), 485–513 (2009)
Ochs, P.: Local convergence of the heavy-ball method and iPiano for non-convex optimization. J. Optim. Theory Appl. 177, 153–180 (2018)
Acknowledgements
We would like to thank Tao Wu for fruitful discussions and helpful comments.
Additional information
Communicated by Nicolas Hadjisavvas.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Proof of First Part of Theorem 4.1
Lemma A.1
Let the assumptions in Theorem 4.1 hold. Then the following statements hold for the iterates produced by Algorithm 1:
- (i) A monotonic sufficient decrease over the iterates is guaranteed:
$$\begin{aligned} F_\lambda (u^{t+1}, x^{t+1}) + D_{\sigma }(x^{t+1}, x^t) + D_{\omega }(u^{t+1}, u^t) \le F_\lambda (u^{t}, x^{t}), \end{aligned}$$ (36)
- (ii) \(\{u^t, x^t\}_{t \in \mathbb {N}}\) is bounded and \(x^t \in {{\,\mathrm{int}\,}}(\mathrm{dom}\,\phi )\) for all t.
- (iii) \(F_\lambda (u^t, x^t)\) is uniformly bounded from below, i.e., \(-\infty < \beta \le F_\lambda (u^t, x^t)\) for all t, and \(\{F_\lambda (u^{t+1}, x^{t+1})\}_{t\in \mathbb {N}}\) converges.
Proof
In view of the coercivity of \(F_\lambda \) and since f, g are proper and lsc, the iterates are well defined.
For part (i) note that, by the definition of the \(x\)-update, we have
$$\begin{aligned} F_\lambda (u^{t}, x^{t+1}) + D_{\sigma }(x^{t+1}, x^t) \le F_\lambda (u^{t}, x^{t}), \end{aligned}$$
and, by the definition of the \(u\)-update,
$$\begin{aligned} F_\lambda (u^{t+1}, x^{t+1}) + D_{\omega }(u^{t+1}, u^t) \le F_\lambda (u^{t}, x^{t+1}). \end{aligned}$$
Summing the two inequalities yields (36).
For part (ii) note that the boundedness of \(\{u^t, x^t\}_{t \in \mathbb {N}}\) follows from (36) and the coercivity of \(F_\lambda \). By the qualification condition and an argument similar to the one in the proof of Lemma 3.3, we have \(x^t \in {{\,\mathrm{int}\,}}(\mathrm{dom}\,\phi )\) for all t.
For part (iii) note that \(F_\lambda \) is proper and lsc and the iterates are bounded due to part (ii). In view of [6, Corollary 1.10], \(F_\lambda \) is bounded from below over the iterates and the conclusion follows. \(\square \)
We are now ready to prove the statement from Theorem 4.1:
Proof
We sum the estimate (36) from \(t=0\) to \(T\) and obtain, in view of Lemma A.1(iii), that
$$\begin{aligned} \sum _{t=0}^{T} \big ( D_{\sigma }(x^{t+1}, x^t) + D_{\omega }(u^{t+1}, u^t) \big ) \le F_\lambda (u^{0}, x^{0}) - F_\lambda (u^{T+1}, x^{T+1}) \le F_\lambda (u^{0}, x^{0}) - \beta . \end{aligned}$$
We take \(T \rightarrow \infty \) and deduce that
$$\begin{aligned} \sum _{t=0}^{\infty } \big ( D_{\sigma }(x^{t+1}, x^t) + D_{\omega }(u^{t+1}, u^t) \big ) < \infty , \end{aligned}$$
and therefore \(D_{\sigma }(x^{t+1}, x^t) \rightarrow 0\) and \(D_{\omega }(u^{t+1}, u^t) \rightarrow 0\). In view of the strict convexity of \(\sigma ,\omega \) on \({{\,\mathrm{int}\,}}(\mathrm{dom}\,\phi )\), we also have \(\Vert x^{t+1}-x^t\Vert \rightarrow 0\) and \(\Vert u^{t+1}-u^t\Vert \rightarrow 0\). In view of the \(x\)- and \(u\)-updates and the qualification condition (5), we obtain that:
and
In view of [6, Exercise 8.8(c)] and [6, Proposition 10.5] and since \(x^{t+1} \in {{\,\mathrm{int}\,}}(\mathrm{dom}\,\phi )\), this means
In view of Lemma A.1(ii), the iterates are bounded and we may consider a convergent subsequence \(\{u^{t_j}, x^{t_j}\}_{j \in \mathbb {N}} \subset \{u^t, x^t\}_{t \in \mathbb {N}}\). Let \((u^*, x^*)\) denote the limit point. In view of the closedness of \({{\,\mathrm{gph}\,}}\partial F_\lambda \) under the \(F_\lambda \)-attentive topology, we obtain, for \(j \rightarrow \infty \), since \(F_\lambda (u^{t_j}, x^{t_j}) \rightarrow F_\lambda (u^*, x^*)\), by the continuity of \(\nabla \sigma ,\nabla \omega ,A\), and since \(\Vert x^{t+1}-x^t\Vert \rightarrow 0\) and \(\Vert u^{t+1}-u^t\Vert \rightarrow 0\), that:
It remains to argue that the limit point \(x^*\) is also contained in the interior of \(\mathrm{dom}\,\phi \): In view of the qualification condition (5) and an argument similar to the one in the proof of Lemma 3.3, as well as [6, Proposition 10.5], we obtain that \(x^* \in {{\,\mathrm{int}\,}}(\mathrm{dom}\,\phi )\) and conclude that the optimality conditions (32) and (33) hold. \(\square \)
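To make the sufficient-decrease mechanism of Lemma A.1(i) concrete, the following minimal numerical sketch runs an alternating scheme with Bregman proximal regularization on a toy one-dimensional problem and checks the analogue of inequality (36) at every iteration. It is not the paper's Algorithm 1: the objective F, the choice of the Boltzmann–Shannon entropy as Legendre function, and the grid-based exact minimization of the subproblems are illustrative assumptions.

```python
# Minimal numerical sketch (NOT the paper's Algorithm 1): alternating
# minimization with Bregman proximal regularization on a toy 1-D problem.
# The objective F, the entropy phi, and the grid-based exact minimization
# are illustrative assumptions; the point is to observe the sufficient
# decrease inequality analogous to (36) at every iteration.
import numpy as np

def phi(z):
    # Boltzmann-Shannon entropy, strictly convex on (0, infinity)
    return z * np.log(z) - z

def bregman(z, w):
    # D_phi(z, w) = phi(z) - phi(w) - phi'(w) * (z - w), with phi'(w) = log(w)
    return phi(z) - phi(w) - np.log(w) * (z - w)

def F(u, x):
    # toy coupling objective, nonconvex in x through the sine term
    return 0.5 * (2.0 * x - u) ** 2 + np.sin(3.0 * x) + 0.1 * u ** 2

grid = np.linspace(1e-3, 5.0, 20001)  # both variables are confined to (0, 5]
u, x = grid[4000], grid[4000]         # start on grid points so each step is exact
for t in range(30):
    # x-update: exact minimization of F(u, .) + D_phi(., x) over the grid
    x_new = grid[np.argmin(F(u, grid) + bregman(grid, x))]
    # u-update: exact minimization of F(., x_new) + D_phi(., u) over the grid
    u_new = grid[np.argmin(F(grid, x_new) + bregman(grid, u))]
    # sufficient decrease: F(u+, x+) + D(x+, x) + D(u+, u) <= F(u, x), cf. (36)
    assert F(u_new, x_new) + bregman(x_new, x) + bregman(u_new, u) <= F(u, x) + 1e-9
    u, x = u_new, x_new
print("final iterate:", u, x, "objective:", F(u, x))
```

Since each subproblem is minimized exactly over a grid that contains the previous iterate as a candidate, the two per-step inequalities summed in the proof of Lemma A.1(i) hold by construction, without any smoothness assumption on F.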
Cite this article
Laude, E., Ochs, P. & Cremers, D. Bregman Proximal Mappings and Bregman–Moreau Envelopes Under Relative Prox-Regularity. J Optim Theory Appl 184, 724–761 (2020). https://doi.org/10.1007/s10957-019-01628-2