On the asymptotic behavior of the Douglas–Rachford and proximal-point algorithms for convex optimization

Banjac et al. (J Optim Theory Appl 183(2):490–519, 2019) recently showed that the Douglas–Rachford algorithm provides certificates of infeasibility for a class of convex optimization problems. In particular, they showed that the difference between consecutive iterates generated by the algorithm converges to certificates of primal and dual strong infeasibility. Their result was shown in a finite-dimensional Euclidean setting and for a particular structure of the constraint set. In this paper, we extend the result to real Hilbert spaces and a general nonempty closed convex set. Moreover, we show that the proximal-point algorithm applied to the set of optimality conditions of the problem generates similar infeasibility certificates.


Introduction
Due to its very good practical performance and its ability to handle nonsmooth functions, the Douglas–Rachford algorithm has attracted a lot of interest for solving convex optimization problems. Provided that a problem is solvable and satisfies a certain constraint qualification, the algorithm is known to converge to an optimal solution [BC17, Cor. 27.3]. When the problem is infeasible, some of its iterates diverge [EB92].
Results on the asymptotic behavior of the Douglas–Rachford algorithm for infeasible problems are scarce, and most of them study specific cases such as feasibility problems involving two convex sets that do not intersect [BDM16, BM16, BM17]. Although there have been some recent results on a more general setting [RLY19], they do not provide practical conditions for detecting infeasibility of a problem. Instead, the asymptotic behavior is characterized via the so-called infimal displacement vector, which is not known prior to solving the problem. The authors in [BGSB19] consider the problem of minimizing a convex quadratic function over a particular constraint set, and show that the iterates of the Douglas–Rachford algorithm generate an infeasibility certificate when the problem is primal or dual strongly infeasible. A similar analysis was applied in [LMK20] to show that the proximal-point algorithm used for solving a convex quadratic program can also detect infeasibility.
The constraint set of the problem studied in [BGSB19] is represented in the form Ax ∈ C, where A is a real matrix and C is the Cartesian product of a convex compact set and a translated closed convex cone. This paper extends the result of [BGSB19] to Hilbert spaces and a general nonempty closed convex set C. Moreover, we show that a similar analysis applies to the proximal-point algorithm for solving the same class of problems, which generates similar infeasibility certificates.
The paper is organized as follows. We introduce definitions and notation in the remainder of Section 1, and the problem under consideration in Section 2. Section 3 presents supporting results that are essential for generalizing the results in [BGSB19]. Finally, Sections 4 and 5 analyze the asymptotic behavior of the Douglas–Rachford and proximal-point algorithms, respectively, and show that they provide infeasibility certificates for the considered problem.

Notation
Let H, H_1, H_2 be real Hilbert spaces with inner products ⟨· | ·⟩, induced norms ‖·‖, and identity operators Id. The power set of H is denoted by 2^H. Let N denote the set of positive integers, R the set of real numbers, R^n the n-dimensional Euclidean space, and R^{m×n} the space of real m-by-n matrices. For a sequence (s_n)_{n∈N}, we define δs_{n+1} := s_{n+1} − s_n.
Let D be a nonempty subset of H. An operator T : D → H is nonexpansive if ‖Tx − Ty‖ ≤ ‖x − y‖ for all x, y ∈ D, and it is α-averaged with α ∈ ]0, 1[ if there exists a nonexpansive operator R : D → H such that T = (1 − α) Id + αR. We denote the range of T by ran T. A set-valued operator B : H → 2^H, characterized by its graph gra B = {(x, u) ∈ H × H | u ∈ Bx}, is monotone if ⟨x − y | u − v⟩ ≥ 0 for all (x, u), (y, v) ∈ gra B, and maximally monotone if there exists no monotone operator whose graph properly contains gra B.

For a proper lower semicontinuous convex function f : H → ]−∞, +∞], we define its:
subdifferential: ∂f(x) := {u ∈ H | (∀y ∈ H) ⟨y − x | u⟩ + f(x) ≤ f(y)};
proximal operator: prox_f(x) := argmin_{y∈H} ( f(y) + (1/2)‖x − y‖² );
Fenchel conjugate: f*(u) := sup_{x∈H} ( ⟨x | u⟩ − f(x) ).

The closure of a set D ⊆ H is denoted by cl D. For a nonempty closed convex set C ⊆ H, we define its:
indicator function: ι_C(x) := 0 if x ∈ C, and +∞ otherwise;
support function: σ_C(u) := sup_{x∈C} ⟨x | u⟩;
projection operator: P_C(x) := argmin_{y∈C} ‖x − y‖;
recession cone: rec C := {u ∈ H | (∀x ∈ C) x + u ∈ C};
normal cone: N_C(x) := {u ∈ H | (∀y ∈ C) ⟨y − x | u⟩ ≤ 0};
and, when C is additionally a cone, its polar cone: C^⊖ := {u ∈ H | (∀x ∈ C) ⟨x | u⟩ ≤ 0}.
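To fix ideas, the following sketch (with hypothetical example data) instantiates some of the objects above for a box C = [l, u]^n in R^n: the projection P_C is coordinatewise clipping, the proximal operator of the indicator ι_C coincides with P_C, and the support function σ_C has a simple closed form.

```python
import numpy as np

# A small numerical instantiation of the notation above for a box
# C = [l, u]^n in R^n (hypothetical example data): the projection P_C is
# coordinatewise clipping, prox of the indicator iota_C coincides with P_C,
# and sigma_C(y) = <u | max(y, 0)> + <l | min(y, 0)>.

l, u = -1.0, 2.0  # box bounds

def proj_box(x):
    """Projection P_C onto the box [l, u]^n (= prox of iota_C)."""
    return np.clip(x, l, u)

def support_box(y):
    """Support function sigma_C(y) = sup_{x in C} <x | y>."""
    return u * np.maximum(y, 0.0).sum() + l * np.minimum(y, 0.0).sum()

x = np.array([3.0, -2.5, 0.5])
print(proj_box(x))                               # [ 2.  -1.   0.5]
print(support_box(np.array([1.0, -1.0, 0.0])))   # 2.0 + 1.0 = 3.0
```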

Problem of Interest
Consider the following convex optimization problem:

minimize_{x∈H_1}  (1/2)⟨x | Qx⟩ + ⟨q | x⟩
subject to  Ax ∈ C,    (1)

with Q : H_1 → H_1 a monotone self-adjoint bounded linear operator, q ∈ H_1, A : H_1 → H_2 a bounded linear operator with closed range, and C a nonempty closed convex subset of H_2. The objective function of the problem is convex, continuous, and Fréchet differentiable [BC17, Prop. 17.36].
When H_1 = R^n and H_2 = R^m, problem (1) reduces to the one considered in [BGSB19], where the Douglas–Rachford algorithm (which is equivalent to the alternating direction method of multipliers) was shown to generate certificates of primal and dual strong infeasibility. Moreover, the authors proposed termination criteria for infeasibility detection, which are easy to implement and are used in several numerical solvers; see, e.g., [SBG+20, GCG19, HTP19].
To prove the main results, they relied on the assumption that C can be represented as the Cartesian product of a convex compact set and a translated closed convex cone, which was exploited heavily in their proofs. In this paper, we extend these results to the case where H_1 and H_2 are real Hilbert spaces and C is a general nonempty closed convex set.

Optimality Conditions
We can rewrite problem (1) in the form

minimize_{x∈H_1}  (1/2)⟨x | Qx⟩ + ⟨q | x⟩ + ι_C(Ax).

Provided that a certain constraint qualification holds, we can characterize its solutions by [BC17, Thm. 27.2]

0 ∈ Qx + q + A* ∂ι_C(Ax),

and, introducing a dual variable y ∈ ∂ι_C(Ax), we can rewrite the inclusion as

Qx + q + A*y = 0,  y ∈ ∂ι_C(Ax).

Introducing an auxiliary variable z ∈ C and using ∂ι_C = N_C, we can write the optimality conditions for problem (1) as

Qx + q + A*y = 0,    (3a)
Ax − z = 0,          (3b)
z ∈ C,  y ∈ N_C(z).  (3c)
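As an illustration, the following sketch (with hypothetical data and C a box in R^2) checks the optimality conditions (3a)–(3c) for a candidate primal-dual pair of a toy instance of problem (1).

```python
import numpy as np

# A sketch checking the optimality conditions (3a)-(3c) on a toy instance of
# problem (1) (hypothetical data): H1 = H2 = R^2, A = Id, C = [l, u]^2.
# (3a): Qx + q + A*y = 0; (3b): Ax - z = 0; (3c): z in C and y in N_C(z).

Q = 2.0 * np.eye(2)
q = np.array([-2.0, -2.0])
A = np.eye(2)
l, u = 0.0, 0.5  # C = [l, u]^2

# The unconstrained minimizer is (1, 1), so the solution sits on the upper
# bound of the box; y is then recovered from (3a).
x = np.array([0.5, 0.5])
z = A @ x
y = -(Q @ x + q)

r_stat = np.linalg.norm(Q @ x + q + A.T @ y)  # residual of (3a)
r_prim = np.linalg.norm(A @ x - z)            # residual of (3b)
z_in_C = np.all((z >= l) & (z <= u))
# Normal cone of a box: y_i <= 0 where z_i = l, y_i >= 0 where z_i = u,
# and y_i = 0 in the interior.
on_l, on_u = (z == l), (z == u)
interior = ~on_l & ~on_u
y_in_NC = (np.all(y[on_l] <= 0) and np.all(y[on_u] >= 0)
           and np.all(y[interior] == 0))
print(r_stat, r_prim, z_in_C, y_in_NC)  # 0.0 0.0 True True
```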

Infeasibility Certificates
The authors in [BGSB19] derived the following conditions for characterizing strong infeasibility of problem (1) and its dual. If there exists ȳ ∈ H_2 such that

A*ȳ = 0  and  σ_C(ȳ) < 0,

then problem (1) is strongly infeasible and ȳ is a certificate of primal strong infeasibility. Similarly, if there exists x̄ ∈ H_1 such that

Qx̄ = 0,  Ax̄ ∈ rec C,  and  ⟨q | x̄⟩ < 0,

then the dual of problem (1) is strongly infeasible and x̄ is a certificate of dual strong infeasibility.
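The following sketch verifies the primal condition on a toy infeasible instance; the data are hypothetical, and the support function of the box is evaluated in closed form.

```python
import numpy as np

# A sketch verifying the primal certificate above on a toy instance
# (hypothetical data). With A = [1; -1] and C = [1, 2]^2, the constraint
# Ax in C asks for x in [1, 2] and -x in [1, 2] simultaneously, which is
# impossible; y_bar = (-1, -1) certifies this.

A = np.array([[1.0], [-1.0]])
l = np.array([1.0, 1.0])
u = np.array([2.0, 2.0])  # C = [l, u]

def support_box(y):
    """sigma_C(y) for the box C = [l, u]."""
    return u @ np.maximum(y, 0.0) + l @ np.minimum(y, 0.0)

y_bar = np.array([-1.0, -1.0])
print(A.T @ y_bar)         # [0.]  -> A* y_bar = 0
print(support_box(y_bar))  # -2.0  -> sigma_C(y_bar) < 0
```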

Auxiliary Results
Lemma 3.1. Suppose that T : H → H is an averaged operator, and let s_0 ∈ H, s_n = T^n s_0, and δs := P_{cl ran(T−Id)}(0). Then (1/n)s_n → δs and δs_{n+1} → δs as n → ∞.

Proposition 3.2. Let D be a nonempty closed convex subset of H_2, let (v_n)_{n∈N} be a sequence in H_2 such that (1/n)v_n → v, and set p_n := P_D(v_n). Then:
(i) (1/n)p_n → P_{rec D}(v);    (4)
(ii) (1/n)(v_n − p_n) → P_{(rec D)^⊖}(v);    (5)
(iii) σ_D(v_n − p_n) = ⟨p_n | v_n − p_n⟩ for all n ∈ N;    (6)
(iv) (1/n)σ_D(v_n − p_n) → σ_D(P_{(rec D)^⊖}(v)).

Proof. From [BC17, Prop. 6.47], we have v_n − p_n ∈ N_D(p_n), which, due to [BC17, Thm. 16.29] and the facts that ι*_D = σ_D and ∂ι_D = N_D, is equivalent to (6); this in turn implies ⟨p | v_n − p_n⟩ ≤ ⟨p_n | v_n − p_n⟩ for any fixed p ∈ D. Dividing by n and taking the limit, we obtain a limiting inequality whose left-hand side, due to (4) and (5), is the inner product of terms in rec D and (rec D)^⊖, and is thus always nonpositive. Therefore, the limit must vanish. The result then follows from [BC17, Cor. 6.31].
(iv): Taking the limit of the inequality and taking the supremum of the left-hand side over D, we get one bound on the limit. Taking the limit of (6) and using the inequality above, we obtain the other. Since p_n ∈ D, we also have the corresponding inequality with p replaced by p_n. The result follows by combining the two inequalities above.
The results of Prop. 3.2 are straightforward under the additional assumption that D is compact, since then rec D = {0} and (rec D)^⊖ = H_2, and thus lim_{n→∞} (1/n)p_n = 0 and lim_{n→∞} (1/n)(v_n − p_n) = v. Moreover, due to the continuity of σ_D [BC17, Ex. 11.2], taking the limit of (6) implies lim_{n→∞} (1/n)σ_D(v_n − p_n) = σ_D(v). When D is a (translated) closed convex cone, the recession cone is the cone itself, and the results of Prop. 3.2 can be shown using the Moreau decomposition and some basic properties of the projection operator; see [BGSB19, Lem. A.3 & Lem. A.4] for details.
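Lemma 3.1 can also be observed numerically. The sketch below (hypothetical data) iterates the Douglas–Rachford operator T = Id + P_{C2}(2P_{C1} − Id) − P_{C1} for two disjoint boxes, for which T is averaged and has no fixed point; both (1/n)s_n and δs_{n+1} approach the same vector δs, here the gap vector between the closest points of the two sets.

```python
import numpy as np

# A numerical illustration of Lemma 3.1 (hypothetical data). The
# Douglas-Rachford operator T = Id + P_C2 (2 P_C1 - Id) - P_C1 for the
# disjoint boxes C1 = [-2, 0]^2 and C2 = [1, 3]^2 is averaged and fixed-point
# free; both s_n / n and s_{n+1} - s_n approach delta_s = P_{cl ran(T-Id)}(0),
# which here is the gap vector (1, 1) between the two boxes.

P1 = lambda s: np.clip(s, -2.0, 0.0)  # projection onto C1
P2 = lambda s: np.clip(s, 1.0, 3.0)   # projection onto C2
T = lambda s: s + P2(2.0 * P1(s) - s) - P1(s)

s = np.array([5.0, -4.0])
for n in range(1, 2001):
    s_next = T(s)
    if n % 500 == 0:
        print(n, s_next / n, s_next - s)  # both tend to (1, 1)
    s = s_next
```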

Douglas-Rachford Algorithm
The Douglas–Rachford algorithm is an operator splitting method that can be used to solve composite minimization problems of the form

minimize_{w∈H}  f(w) + g(w),    (7)

where f and g are proper lower semicontinuous convex functions. An iteration of the algorithm in application to problem (7) can be written as

u_n = prox_f(s_n),
v_n = prox_g(2u_n − s_n),
s_{n+1} = s_n + α(v_n − u_n),

where α ∈ ]0, 2[ is the relaxation parameter.
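A minimal sketch of this iteration, assuming f and g are supplied through their proximal operators (the example data are hypothetical):

```python
import numpy as np

# A minimal sketch of the iteration above, assuming f and g are supplied
# through their proximal operators (hypothetical data):
#   f(w) = (1/2) ||w - a||^2   =>  prox_f(w) = (w + a) / 2,
#   g = indicator of the box [0, 1]^3  =>  prox_g = clipping onto the box.
# The minimizer of f + g is the projection of a onto the box.

a = np.array([2.0, -1.0, 0.3])
prox_f = lambda w: 0.5 * (w + a)
prox_g = lambda w: np.clip(w, 0.0, 1.0)

alpha = 1.0  # relaxation parameter in ]0, 2[
s = np.zeros(3)
for _ in range(200):
    u = prox_f(s)
    v = prox_g(2.0 * u - s)
    s = s + alpha * (v - u)

print(prox_f(s))  # approx [1.  0.  0.3], the minimizer of f + g
```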

If we rewrite problem (1) as

minimize_{(x,z)∈H_1×H_2}  (1/2)⟨x | Qx⟩ + ⟨q | x⟩ + ι_{{(x,z) | Ax = z}}(x, z) + ι_C(z),

then an iteration of the Douglas–Rachford algorithm takes the form (8a)–(8c) given in [BGSB19, SBG+20]. We will exploit the following well-known result to analyze the asymptotic behavior of the algorithm [LM79]:

Fact 4.1. Iteration (8) amounts to the fixed-point iteration s_{n+1} = T s_n, where T is an averaged operator.

The solution to the subproblem in (8a) satisfies the optimality condition (9). If we rearrange (8b) to isolate x̃_n and substitute it into (8c) and (9), we obtain the relations (10a) and (10b) between the iterates. Let us also define the auxiliary iterates (z_n, y_n) of iteration (8). Observe that the pair (z_n, y_n) satisfies optimality condition (3c) for all n ∈ N [BC17, Prop. 6.47], and that the right-hand terms in (10) indicate how far the iterates (x_n, z_n, y_n) are from satisfying (3a) and (3b).
The following corollary (Cor. 4.2), which defines the limits δx, δz, and δy used below, follows directly from Lemma 3.1.
Proposition 4.3. The following relations hold between δx, δz, and δy, which are defined in Cor. 4.2:

Proof. (i): Divide (10a) by n, take the limit, and use Cor. 4.2(v). (ii): Divide (10b) by n, take the inner product of both sides with δx, and take the limit; here we use (12) and Cor. 4.2(iv) in the second equality, and Cor. 4.2(viii) in the third. Due to [BC17, Cor. 18.18], the resulting equality implies the claim. (iii): Divide (10b) by n, take the limit, and use (13); here we use Cor. 4.2(iv) in the second equality.
(iv): Subtracting (10a) at iterations n + 1 and n and taking the limit yields the claim, where the second equality follows from (12).
Proposition 4.4. The following identities hold for δx and δy, which are defined in Cor. 4.2:

Proof. Take the inner product of both sides of (10b) with δx and use (13). Taking the limit and using Prop. 4.3(i) and Cor. 4.2(vii)&(viii) gives (14), where the inequality follows from Cor. 4.2(iii)&(v), as the inner product of terms in rec C and (rec C)^⊖ is nonpositive. Now take the inner product of both sides of (10a) with δy. Due to Prop. 4.3(iii), the first inner product on the left-hand side is zero. Taking the limit and using Cor. 4.2 gives (15). Summing (14) and (15) and using Cor. 4.2(ix), we obtain (16). Now take the inner product of both sides of (10b) with x_n. Dividing by n, taking the limit, and using Prop. 4.3(i)&(ii) and Cor. 4.2(vii)&(viii) yields an equality whose last term on the left-hand side can be rewritten, where the first equality follows from (10a), the second from Prop. 4.3(i) and Cor. 4.2(iv)&(vii), and the third from Cor. 4.2(vi). Plugging the resulting equality into the preceding one, we obtain (17), where the inequality follows from the monotonicity of Q. Comparing inequalities (16) and (17), it follows that they must be satisfied with equality. Consequently, the left-hand sides of (14) and (15) must be zero. This concludes the proof.

Our results pave the way for similar developments in the more general setting considered here.
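To illustrate the infeasibility certificates discussed in this section, the sketch below applies the Douglas–Rachford iteration, with the splitting from the beginning of this section and all step parameters set to 1, to a primal strongly infeasible toy instance of problem (1). The data and the particular dual-variable surrogate y_n are assumptions of this example, not the exact iteration (8).

```python
import numpy as np

# A sketch of infeasibility detection with the Douglas-Rachford iteration on
# a primal strongly infeasible toy instance of problem (1) (hypothetical
# data; step parameters set to 1, and y_n below is one concrete dual-variable
# surrogate, not the exact iterate of (8)). With Q = 0, q = 0, A = [1; -1],
# and C = [1, 2]^2, the constraint Ax in C is infeasible, and delta y_n
# converges to a certificate: A* delta_y = 0 and sigma_C(delta_y) < 0.

Q = np.zeros((1, 1)); q = np.zeros(1)
A = np.array([[1.0], [-1.0]])
l, u = np.array([1.0, 1.0]), np.array([2.0, 2.0])  # C = [l, u]

K = Q + np.eye(1) + A.T @ A  # matrix of the prox_f linear system

def prox_f(sx, sz):
    """prox of (1/2)<x|Qx> + <q|x> + indicator of {Ax = z} at (sx, sz)."""
    x = np.linalg.solve(K, sx - q + A.T @ sz)
    return x, A @ x

sx, sz = np.zeros(1), np.zeros(2)
y_old, dy = np.zeros(2), np.zeros(2)
for n in range(100):
    ux, uz = prox_f(sx, sz)
    tz = 2.0 * uz - sz
    vz = np.clip(tz, l, u)      # prox of the indicator of C
    y = tz - vz                 # y_n lies in N_C(v_n)
    sx, sz = ux, sz + vz - uz   # Douglas-Rachford update with alpha = 1
    dy, y_old = y - y_old, y

sigma_C = u @ np.maximum(dy, 0.0) + l @ np.minimum(dy, 0.0)
print(A.T @ dy, sigma_C)  # approx [0.] and -2.0: primal certificate
```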

Proximal-Point Algorithm
The proximal-point algorithm is a method for finding a vector w ∈ H that solves the inclusion problem

0 ∈ B(w),    (18)

where B : H → 2^H is a maximally monotone operator. An iteration of the algorithm in application to problem (18) can be written as

w_{n+1} = (Id + γ^{-1}B)^{-1}(w_n),

where γ > 0 is the regularization parameter.
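A minimal sketch of this iteration for an affine maximally monotone operator (hypothetical data); the resolvent step then reduces to a linear solve.

```python
import numpy as np

# A minimal sketch of the proximal-point iteration (hypothetical data),
# assuming B(w) = M w + c with M positive semidefinite (here its symmetric
# part is the identity), so B is maximally monotone and the resolvent step
# w_{n+1} = (Id + gamma^{-1} B)^{-1}(w_n) reduces to a linear solve.

M = np.array([[1.0, 2.0], [-2.0, 1.0]])
c = np.array([1.0, -1.0])
gamma = 2.0  # regularization parameter

R = np.eye(2) + M / gamma  # Id + gamma^{-1} M
w = np.zeros(2)
for _ in range(200):
    w = np.linalg.solve(R, w - c / gamma)  # resolvent applied to w_n

print(w, M @ w + c)  # w approximately solves 0 = M w + c
```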