Convergence Analysis of Iterative Algorithms for Phase Retrieval

This chapter surveys the analysis of the phase retrieval problem as an inconsistent and nonconvex feasibility problem. We apply a convergence framework for iterative mappings developed by Luke, Tam and Thao in 2018 to the inconsistent and nonconvex phase retrieval problem and establish the convergence properties (with rates) of popular projection methods for this problem. Although our main purpose is to illustrate the convergence results and their underlying concepts, we demonstrate how our theoretical analysis aligns with practical numerical computation applied to laboratory data.


Introduction
We highlight recent theoretical advances that have opened the door to a quantitative convergence analysis of well-known phase retrieval algorithms. As shown in Chap. 6, phase retrieval problems have a natural and easy characterization as feasibility problems, and issues like noise and model misspecification do not affect the abstract regularity of the problem formulation. This was also observed in studies by Bauschke et al. [1] and Marchesini [2] reviewing phase retrieval algorithms in the context of fixed point iterations, though in those works the theory only provided convex heuristics for understanding the most successful algorithms. A slow progression of the theory for nonconvex feasibility, culminating in the work by Luke et al. in [3], now provides a firm theoretical basis for understanding most of the standard algorithms for phase retrieval.
The approach is fixed-point theoretic and is based on a framework introduced by Luke et al. in [3]. Given a (set-valued) mapping T : E ⇒ E, where E is a finite-dimensional Euclidean space, the algorithms are studied as mere generators of sequences (x^k)_{k∈N} through the fixed point iteration x^{k+1} ∈ T x^k (k ∈ N) with x^k → x^*, where x^* ∈ T x^*. We demonstrate the convergence framework of [3] on a few of the more prevalent iterative phase retrieval algorithms introduced in Chap. 6.
The analysis is based on two main properties. The first of these is the regularity of the mapping defining the fixed point iteration; the second property concerns the stability of the fixed points of the mapping. The first property is covered by the notion of pointwise almost averagedness, a generalization of regularity concepts like (firm) nonexpansiveness. Already in the 1960s Opial [4] showed that an iterative sequence defined by an averaged self-mapping with nonempty fixed point set converges to a fixed point. It is no surprise, then, that generalizations of averagedness should play a central role in convergence for more general fixed point mappings. In the setting of feasibility problems, i.e. finding a point in the intersection of a collection of sets, pointwise almost averagedness of the fixed point mapping is inherited from the regularity of the sets.
The other concept that is central to the analysis concerns stability of the fixed points. This is characterized by the notion of metric subregularity as presented in Dontchev and Rockafellar [5], and Ioffe [6, 7]. Metric subregularity of the mapping at fixed points guarantees quantitative estimates for the rate of convergence of the iterates. This is closely related to the existence of error bounds and weak sharp minima, among other equivalent notions that provide a path to a quantitative convergence analysis.
In Sect. 23.2 we remind the reader of the phase retrieval problem. Section 23.3 and its subsections introduce basic notations and concepts. This is followed by a toolkit for convergence in Sect. 23.4 that describes the convergence framework we are working with. The use of this theoretical toolkit is demonstrated on two of the most prevalent algorithms for phase retrieval. We conclude this chapter with some numerical remarks in Sect. 23.8.

Phase Retrieval as a Feasibility Problem
The phase retrieval problem reviewed in Chaps. 2 and 6 involves reconstructing a complex-valued field in a plane (the object plane) from measurements of its amplitude under a unitary mapping in a plane somewhere downstream from the object plane (the image plane). We use the notation for the phase retrieval problem already introduced in previous chapters; for a detailed description see Sect. 6.1.1. The measurements are represented by the sets

M_j := {x ∈ C^n | |(F_j x)_i| = √I_{j,i}, i = 1, 2, …, n}, j = 1, 2, …, m,   (23.1)

where F_j denotes the unitary mapping from the object plane to the j-th image plane and √I_{j,i} are the measured amplitudes.

The problem of recovering the phase from just the modulus of unitarily transformed measurements cannot be solved uniquely. Usually nonuniqueness is associated with ill-posedness, but for feasibility problems it is rather existence that is the source of difficulty. In real-world problems measurement errors and model misspecification have profound implications for feasibility models, but not for the reasons one might expect. The geometry of the individual measurement sets does not change in the presence of noise or model misspecification. The issue is that the measurements are not consistent with one another. In other words, there is no point that satisfies the measurements and the other model requirements (like nonnegativity, in the case of real objects). A solution computed from the provided information is then only an approximation to the actual signal. Mathematically these characteristics translate into an inconsistent feasibility problem: the intersection of the sets in the feasibility model is empty. Inconsistency has been investigated in many works (see for instance [8–11]), but most of these studies consider convex sets. Unfortunately, the sets involved in the phase retrieval problem are mostly nonconvex and have empty intersection. In [3] the authors provided a scheme that handles even this case. The following sections are devoted to their work and present the most important concepts.
To avoid ambiguities in recovering the phase, one often uses a priori information about the model. Common examples are knowledge of the support of the signal, real-valuedness, non-negativity, sparsity, or information about an amplitude: for a set of indices I ⊂ {1, 2, …, n}, a ∈ R^n_+ and s ∈ {1, 2, …, n}, one has, for instance, the nonnegative support and sparsity constraint sets

S_+ := {x ∈ R^n_+ | x_i = 0 for all i ∉ I},  A_s := {x ∈ C^n | ‖x‖_0 ≤ s},   (23.2)

where R^n_+ = {x ∈ R^n | x_i ≥ 0, 1 ≤ i ≤ n} and ‖x‖_0 counts the nonzero entries of x. In the following we focus on the (non-negative) support constraint.
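The projector onto the nonnegative support constraint can be sketched in a few lines. The following snippet is an illustrative numerical sketch, not taken from this chapter; the function name and the NumPy setting are our own choices.

```python
import numpy as np

def project_support_nonneg(x, support):
    """Projector onto S_+ = {x in R^n_+ : x_i = 0 for i outside the support}.

    Zero out the entries outside the support and clip the remaining
    entries at zero; this is the Euclidean-nearest point in S_+.
    """
    y = np.zeros_like(x)
    y[support] = np.maximum(x[support], 0.0)
    return y

x = np.array([1.5, -2.0, 0.3, 4.0])
support = np.array([0, 2])            # indices allowed to be nonzero
p = project_support_nonneg(x, support)
print(p)  # entries outside the support are zeroed, negatives clipped
```

Since S_+ is closed and convex, this projector is single-valued everywhere, in contrast to the projectors onto the (nonconvex) measurement sets.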

Notation and Basic Concepts
Our setting throughout this chapter is a finite dimensional real Euclidean space E equipped with inner product ⟨·, ·⟩ and induced norm ‖·‖. The open unit ball is denoted by B, whereas S stands for the unit sphere in E. The open ball with radius δ and center x is denoted by B_δ(x). The iterative algorithms we analyze can be represented by mappings T : E ⇒ E, where ⇒ indicates that T is a point-to-set mapping. N denotes the natural numbers. The inverse mapping T^{-1} at a point y in the range of T is defined as the set of all points x such that y ∈ T(x).

Projectors
We follow in this section the definitions introduced in Chap. 6. As a reminder: the distance of a point x to a set Ω ⊂ E is defined by

dist(x, Ω) := inf_{y∈Ω} ‖x − y‖.

The corresponding projector onto the set Ω is given by

P_Ω(x) := {y ∈ Ω | ‖x − y‖ = dist(x, Ω)}.

Similarly to the projector, the reflector onto a set Ω is defined by R_Ω := 2P_Ω − Id.

The regularity of a set influences the properties of the corresponding projector onto the set. The best properties are generated by convex sets: a convex set Ω is a set that contains the line segment between any two of its points x, y ∈ Ω. The projector onto a closed convex set is not only single-valued, but can be characterized by a variational inequality (see for instance [12, Theorem 3.14]). As we see in Sect. 23.3.2, the algorithms considered here are all composed of projectors and reflectors. This leads to an analysis of the projectors onto the sets introduced in Sect. 23.2. The projector onto the measurement sets M_j defined in (23.1) was already discussed in Sect. 6.1.2. The projectors onto the support constraint sets are even simpler. The following statement is taken from [1, Example 3.14].
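For intuition, here is a minimal numerical sketch of the projector onto a single magnitude set M = {x : |Fx| = b}, with F the unitary DFT as in Sect. 6.1.2, together with the generic reflector R = 2P − Id. The selection of the phase at vanishing Fourier coefficients is an arbitrary but standard choice; all names are illustrative.

```python
import numpy as np

def project_magnitude(x, b):
    """Projector onto M = {x : |F x| = b} with F the unitary DFT.

    Replace the modulus of each Fourier coefficient by the measured
    amplitude b_i, keeping the current phase: a standard selection of
    one element of the, in general set-valued, projector.
    """
    X = np.fft.fft(x, norm="ortho")
    absX = np.abs(X)
    safe = np.where(absX > 0, absX, 1.0)
    phase = np.where(absX > 0, X / safe, 1.0)   # phase 1 where X_i = 0
    return np.fft.ifft(b * phase, norm="ortho")

def reflect(P, x):
    """Reflector R = 2P - Id associated with a projector P."""
    return 2 * P(x) - x

rng = np.random.default_rng(0)
x = rng.standard_normal(8) + 1j * rng.standard_normal(8)
b = np.abs(np.fft.fft(x, norm="ortho"))
# x already satisfies the magnitude constraint, so it is a fixed point:
print(np.allclose(project_magnitude(x, b), x))  # True
```

Because F is unitary (norm="ortho"), projecting in the Fourier domain and transforming back gives a nearest point in M with respect to the Euclidean norm.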
The projectors onto other constraint sets can be found, for instance, in [13] or [14] for a sparsity constraint, or in [1, Example 3.14] for an amplitude constraint or real-valued sparsity constraint. Except for the amplitude and sparsity constraints, all other mentioned constraint sets are closed and convex. The type of regularity of the constraint sets is discussed later in Remark 23.5.1. Another concept closely related to that of projectors is the normal cone. Let Ω ⊆ E and let x̄ ∈ Ω.
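A projector onto the sparsity constraint A_s, for instance, keeps the s entries of largest modulus. The sketch below is a standard selection of one element of the, in general set-valued, projector (illustrative code, not reproduced from [13] or [14]):

```python
import numpy as np

def project_sparsity(x, s):
    """One selection of the projector onto A_s = {x : ||x||_0 <= s}.

    Keep the s entries of largest modulus and zero the rest.  The set
    A_s is nonconvex, so the projector can be set-valued when entries
    tie in modulus; argsort breaks such ties arbitrarily.
    """
    y = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-s:]    # indices of the s largest moduli
    y[idx] = x[idx]
    return y

print(project_sparsity(np.array([0.1, -3.0, 2.0, 0.5]), 2))  # keeps -3.0 and 2.0
```

The same routine applies to complex vectors, since only the moduli enter the selection.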
(i) The proximal normal cone of Ω at x̄ is defined by

N^prox_Ω(x̄) := cone(P_Ω^{-1}(x̄) − x̄).

Equivalently, x* ∈ N^prox_Ω(x̄) whenever there exists σ ≥ 0 such that ⟨x*, x − x̄⟩ ≤ σ‖x − x̄‖² for all x ∈ Ω.

(ii) The limiting (proximal) normal cone of Ω at x̄ is defined by

N_Ω(x̄) := Limsup_{x→x̄, x∈Ω} N^prox_Ω(x),

where the limit superior is taken in the sense of the Painlevé–Kuratowski outer limit (for more details on the outer limit see for instance [15, Chap. 4]).
When x̄ ∉ Ω, all normal cones at x̄ are empty (by definition). If the set Ω is convex, the given definitions of the normal cones coincide (see for instance [16]).

Algorithms
In the context of feasibility problems, a prominent class of iterative algorithms are projection algorithms. Among these, the most prominent and probably one of the easiest to compute is the method of cyclic projections as introduced in Sect. 6.2.1. Given a finite number of closed sets Ω_1, Ω_2, …, Ω_m ⊆ E and an initial point x^0 ∈ E, it generates the next iterate by consecutively projecting onto each of the individual sets. For only two sets the algorithm reduces to the method of alternating projections. In Sect. 6.2.3 the error reduction algorithm was identified with the method of alternating projections applied to a measurement and a support constraint. This connection was first made by Levi and Stark in [17]. Considering again only two sets, Sect. 6.1.2 introduced the well-known Douglas–Rachford algorithm as well as its relaxed version, the relaxed averaged alternating reflection algorithm introduced by Luke in [10]. For one magnitude constraint and a support constraint, Douglas–Rachford yields Fienup's hybrid input-output method (HIO) [18]. The connection between HIO and Douglas–Rachford was already observed by Bauschke et al. [1]. These three algorithms are the ones we focus on here. Nevertheless, we emphasize that the analysis shown below can be applied to other projection methods as well.
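The two basic iterations can be sketched abstractly. A minimal sketch, with two lines in R² standing in as purely illustrative constraint sets:

```python
import numpy as np

def alternating_projections(PA, PB, x0, iters=100):
    """Method of alternating projections: x^{k+1} = P_A(P_B(x^k))."""
    x = x0
    for _ in range(iters):
        x = PA(PB(x))
    return x

def douglas_rachford(PA, PB, x0, iters=100):
    """Douglas-Rachford: x^{k+1} = (1/2)(R_A R_B + Id) x^k, R = 2P - Id."""
    x = x0
    for _ in range(iters):
        rB = 2 * PB(x) - x
        x = 0.5 * (2 * PA(rB) - rB + x)
    return x

# Toy sets: A = x-axis, B = diagonal {y = x}; they meet at the origin.
PA = lambda z: np.array([z[0], 0.0])
PB = lambda z: np.full(2, (z[0] + z[1]) / 2)
print(np.round(alternating_projections(PA, PB, np.array([3.0, 1.0])), 8))  # [0. 0.]
```

For these two lines both methods converge linearly to the intersection; the rate is governed by the angle between the lines, in line with the transversality discussion later in the chapter.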
Our survey is far from complete. Other approaches worthy of mention are several of the algorithms discussed in Chap. 5 and those in Chap. 6. Readers familiar with the physics literature will also miss the Hybrid Projection Reflection algorithm [19], the difference map [20], the solvent flipping algorithm [21], and Fienup's Basic Input-Output algorithm (BIO). BIO is, in fact, nothing more than Dykstra's algorithm, see [1]. Like the BIO algorithm, most of the known approaches to phase retrieval fit into a concise scheme presented in [22].

Fixed Points and Regularities of Mappings
We refer to Fix T as the set of fixed points of the mapping T, i.e. x ∈ Fix T if and only if x ∈ T x. The continuity of set-valued mappings is a well-developed concept and follows the familiar patterns of continuity for single-valued functions. One key property is nonexpansiveness, which is nothing more than Lipschitz continuity with constant 1: given two points, their images under the mapping T are no further from each other than the initial points. A slightly stronger notion than nonexpansiveness is averagedness. For set-valued mappings, a finer distinction between the types of continuity, whether pointwise or uniform, for example, is necessary. The following definition captures the crucial types of continuity and regularity of set-valued mappings that lie at the heart of the numerical analysis of algorithms for phase retrieval.

Definition 23.3 (pointwise almost averaged mappings) Let T : E ⇒ E, let D ⊆ E and let y ∈ D. The mapping T is said to be pointwise almost nonexpansive at y on D with violation ε ≥ 0 if

‖x⁺ − y⁺‖ ≤ √(1 + ε) ‖x − y‖ for all x ∈ D, x⁺ ∈ T x, y⁺ ∈ T y.

The mapping T is pointwise almost averaged at y on D with violation ε and averaging constant α ∈ (0, 1) if there is a mapping T̃ that is pointwise almost nonexpansive at y on D with violation ε/α such that T = (1 − α) Id + α T̃. If the averaging constant α = 1/2, then T is said to be (pointwise) (almost) firmly nonexpansive on D (with violation ε) (at y).
From the above definition it can easily be seen that if a set-valued mapping is nonexpansive at a point, then it is single-valued there. This is a crucial property for our analytical framework, but should not be confused with uniqueness of fixed points: a multi-valued operator can be single-valued at its fixed points without having unique fixed points.
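A simple illustration of this distinction is the projector onto the unit circle: it is single-valued everywhere except at the center, where every point of the circle is nearest. A minimal sketch (illustrative names, with None flagging the set-valued case):

```python
import numpy as np

def project_circle(x, radius=1.0):
    """Projector onto the circle {z in R^2 : ||z|| = radius}.

    Away from the center the nearest point is unique (radial scaling);
    at the center the projector is the whole circle, i.e. set-valued,
    which we flag here by returning None.
    """
    n = np.linalg.norm(x)
    if n == 0:
        return None                 # set-valued: all of the circle
    return radius * x / n

print(project_circle(np.array([3.0, 4.0])))   # [0.6 0.8]
print(project_circle(np.zeros(2)))            # None
```

Every point of the circle is a fixed point of this projector, and the projector is single-valued at each of them, even though the fixed point set is far from a singleton.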
Averaged mappings do not enjoy as nice a calculus as nonexpansive mappings, but the next proposition shows that averagedness of some sort is preserved under addition and composition.

A Toolkit for Convergence
With the characterization of algorithms simply as self-mappings with certain regularity properties, we show in this section how those properties come together to guarantee convergence of the algorithmic iterates to fixed points. The fixed points need not be solutions to the feasibility problem (indeed, such solutions do not exist for inconsistent phase retrieval) but will in general be points from which one can compute another point that does have some physical significance, such as a local best approximation point.
It turns out that convergence itself is provided by the regularity properties introduced in Sect. 23.3.3. The basic convergence idea goes back to Opial [4]: averagedness of a single-valued self-mapping T and nonemptiness of the fixed point set imply convergence of the iterative sequence (T^k x^0)_{k∈N} to a point in Fix T for any x^0 ∈ E. In short, averagedness of T and a nonempty fixed point set are enough to obtain convergence. As one would expect, it can be difficult for a mapping to satisfy these properties globally; indeed, they frequently fail to hold globally in nonconvex problem instances. Thus, we seek a statement that involves only local properties, which in our case is pointwise almost averagedness as introduced in Definition 23.3.
But convergence alone is not enough for iterative procedures: eventually one has to stop the iteration, and without knowing the rate of convergence it is impossible to estimate how far a given iterate is from the solution. A quantitative convergence analysis is achieved with the second essential property: metric (sub)regularity. This concept has been studied by many authors in the literature (see for instance [5–7, 15, 23, 24]). For the definition of metric regularity we need gauge functions, i.e. nondecreasing functions μ : [0, ∞) → [0, ∞) with μ(0) = 0.

Definition 23.6 (metric regularity on a set) Let Φ : E ⇒ E, let U, V ⊆ E and let Λ ⊆ E. The mapping Φ is called metrically regular with gauge μ on U × V relative to Λ if

dist(x, Φ^{-1}(y)) ≤ μ(dist(y, Φ(x)))   (23.6)

holds for all x ∈ U ∩ Λ and y ∈ V with 0 < μ(dist(y, Φ(x))). When the set V consists of a single point, V = {ȳ}, then Φ is said to be metrically subregular for ȳ on U with gauge μ relative to Λ ⊂ E. When μ is a linear function (that is, μ(t) = κt for all t ∈ [0, ∞)), one says "with constant κ" instead of "with gauge μ(t) = κt". When Λ = E, the quantifier "relative to" is dropped. When μ is linear, the smallest constant κ for which (23.6) holds is called the modulus of metric regularity.
While this definition might seem abstract, there are properties that directly imply metric regularity, or reformulations that allow one to prove it. One of these is polyhedrality (see [3, Proposition 2.6]). A mapping T : E ⇒ E is called polyhedral if its graph is the union of finitely many sets that can be expressed as the intersection of finitely many closed half-spaces and/or hyperplanes [5].
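To make the definition concrete, the modulus of metric subregularity can be estimated numerically for a simple polyhedral example. The sketch below uses T = P_A P_B for two lines in R² (so Φ = T − Id is polyhedral with Φ^{-1}(0) = {0}) and samples the ratio dist(x, Φ^{-1}(0)) / ‖Φ(x)‖; the sets and sample sizes are illustrative assumptions.

```python
import numpy as np

# T = P_A P_B for A = x-axis and B = diagonal in R^2, both polyhedral:
# T(x, y) = ((x + y)/2, 0), so Phi = T - Id is the linear (hence
# polyhedral) map below with Phi^{-1}(0) = {0}.  Metric subregularity
# for 0 asks for kappa with  dist(x, Phi^{-1}(0)) <= kappa ||Phi(x)||.
def Phi(z):
    x, y = z
    return np.array([(x + y) / 2.0 - x, -y])

rng = np.random.default_rng(1)
kappas = []
for _ in range(1000):
    z = rng.standard_normal(2)
    kappas.append(np.linalg.norm(z) / np.linalg.norm(Phi(z)))
print(max(kappas) < 3.0)  # True: here kappa is bounded by the norm of the inverse
```

Since Φ is an invertible linear map in this toy case, the sampled ratios are bounded by ‖Φ^{-1}‖, which is exactly the mechanism behind [3, Proposition 2.6]: polyhedral mappings are metrically subregular with a linear gauge.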
Collecting the concepts we have established so far, we arrive at the following convergence result, which goes back to Luke et al. [3, Theorem 2.2]; we state it here in abbreviated form.

Theorem 23.4.1 (convergence with rates) Let T : Λ ⇒ Λ for Λ ⊆ E and let S ⊂ Λ be closed and nonempty with Fix T ∩ S ≠ ∅. Suppose that T is pointwise almost averaged at all points of S with violation ε and averaging constant α on a neighborhood of S, and that Φ := T − Id is metrically subregular for 0 on this neighborhood with gauge μ relative to Λ, where μ satisfies a growth condition balancing it against the violation ε and the averaging constant α. Then, for any x^0 ∈ Λ close enough to S, the iterates x^{j+1} ∈ T x^j satisfy dist(x^j, Fix T ∩ S) → 0 at a rate determined by the gauge μ; in particular, the convergence is R-linear whenever μ is linear with sufficiently small constant.

Regularities of Sets and Their Collection
In this section we connect the regularities of sets to the regularities of the projectors onto them, which in turn affect the regularity of the mapping T. When dealing with nonconvex sets there are numerous set-regularity definitions available. A recent survey by Kruger et al. [26] sorted the different classes of nonconvex sets to highlight their dependencies and differences. Uniting several concepts of regularity, we use the notion of ε-set regularity as introduced in [26] and refined in [27].

Definition 23.7 (ε-set regularity)
Let Ω ⊂ E be nonempty and let x̄ ∈ Ω. The set Ω is said to be ε-subregular relative to Λ at x̄ for (ȳ, v̄) ∈ gph(N_Ω) if it is locally closed at x̄ and there exist an ε > 0 and a neighborhood U of x̄ such that the defining inequality (23.9) holds. If for every ε > 0 there is a neighborhood (depending on ε) such that (23.9) holds, then Ω is said to be subregular relative to Λ at x̄ for (ȳ, v̄) ∈ gph(N_Ω). If Λ = {x̄}, then the qualifier "relative to" is dropped.
In the phase retrieval problem one type of nonconvexity that is also covered by ε-subregularity is prox-regularity.

This concept dates back to Federer [28], who called sets with this property sets of positive reach. The definition we use is taken from [29, Proposition 1.2]. The authors in [29] showed that their definition of prox-regularity at x̄ ∈ C is equivalent to several statements, one of the most prominent being local single-valuedness of the projector P_C. As the next remark shows, all constraint sets involved in the phase retrieval problem are, in fact, prox-regular.

Remark 23.5.1 (phase retrieval constraint sets are prox-regular)
Of great importance for the convergence analysis of the introduced algorithms is the ε-subregularity of the measurement sets defined in (23.1). By [3, Example 3.1.b] circles are subregular at any of their points x̄ for all (x̄, v) in the graph of the normal cone of the set. As mentioned before, ε-subregularity covers a diverse range of regularity notions for sets. The measurement sets investigated here are in fact shown to be semi-algebraic [30, Proposition 3.5] and prox-regular by [29, Theorem 1.3] and (6.11).
The other sets involved in the phase retrieval problem are the qualitative constraints introduced in (23.2) or mentioned before. Except for the amplitude constraint and the sparsity constraint, all of these sets are convex and thus by [3, Proposition 3.1 (vii)] subregular. Fortunately, the amplitude constraint describes coordinatewise circles when the other coordinates are fixed, like the measurement constraint. Hence the amplitude constraint is ε-subregular as well (and additionally semi-algebraic and prox-regular). The sparsity constraint A_s is prox-regular at all points x satisfying ‖x‖_0 = s (similar to the proof in [14, Proposition 4.4]).
By [12, Proposition 4.8] the projector onto a closed convex set is averaged with constant α = 1/2. Allowing sets to have more general regularity, here prox-regularity, yields regularity of the projectors as well.
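This can be observed numerically. The sketch below empirically estimates the violation of firm nonexpansiveness of the projector onto the unit circle (a prox-regular set) at a point on the circle; the sampling scale and names are illustrative assumptions.

```python
import numpy as np

def P_circle(x):
    """Projector onto the unit circle in R^2 (single-valued for x != 0)."""
    return x / np.linalg.norm(x)

# Empirically estimate the violation eps2 of pointwise almost firm
# nonexpansiveness at a point y ON the circle (so y - P y = 0):
#   ||Px - Py||^2 + ||(x - Px) - (y - Py)||^2 <= (1 + eps2) ||x - y||^2.
rng = np.random.default_rng(2)
y = np.array([1.0, 0.0])
eps2 = 0.0
for _ in range(2000):
    x = y + 0.1 * rng.standard_normal(2)     # sample points near y
    lhs = np.linalg.norm(P_circle(x) - y) ** 2 + np.linalg.norm(x - P_circle(x)) ** 2
    eps2 = max(eps2, lhs / np.linalg.norm(x - y) ** 2 - 1.0)
print(eps2 >= 0.0)  # True; the observed violation shrinks with the neighborhood
```

For a convex set the estimated violation would be zero; for the circle it is positive but small on small neighborhoods, exactly the behavior the violation constants ε₂ and ε₃ in Proposition 23.9 quantify.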

Proposition 23.9 (projectors and reflectors onto prox-regular sets) Let Ω ⊂ E be nonempty and closed, let U be a neighborhood of x̄ ∈ Ω, and let Λ ⊂ Ω ∩ U. If Ω is prox-regular at x̄ with constant ε ∈ [0, 1) on the neighborhood U, then the following hold.

(i) The projector P_Ω is pointwise almost firmly nonexpansive at each ȳ ∈ Λ with violation ε₂ := 2ε + 2ε² on U.

(ii) The reflector R_Ω is pointwise almost nonexpansive at each ȳ ∈ Λ with violation ε₃ := 4ε + 4ε² on U.

The regularity of the individual sets is not the whole story; the interplay between the sets of a feasibility problem is captured by subtransversality. For a collection of closed sets {Ω_1, Ω_2, …, Ω_m}, define the mapping Υ : E^m ⇒ E^m by Υ(x) := P_Ω(Π x) − x, where Ω := Ω_1 × Ω_2 × ⋯ × Ω_m, the projection P_Ω is with respect to the Euclidean norm on E^m, and Π : x = (x_1, x_2, …, x_m) → (x_2, x_3, …, x_m, x_1) is the permutation mapping on the product space E^m for x_j ∈ E (j = 1, 2, …, m). Let x̄ = (x̄_1, x̄_2, …, x̄_m) ∈ E^m and ȳ ∈ Υ(x̄). The collection of sets is said to be subtransversal with gauge μ relative to Λ ⊂ E^m at x̄ for ȳ if Υ is metrically subregular at x̄ for ȳ on some neighborhood U of x̄ (metrically regular on U × {ȳ}) with gauge μ relative to Λ. As in Definition 23.6, when μ(t) = κt for all t ∈ [0, ∞), one says "constant κ" instead of "gauge μ(t) = κt". When Λ = E^m, the quantifier "relative to" is dropped.

Analysis of Cyclic Projections
Having introduced the main tools for convergence, this section is devoted to an explicit demonstration of how this framework can be applied. In particular, we present the main steps of the convergence analysis of the cyclic projection mapping as done by Luke et al. in [3].
As introduced in Algorithm 6.2.1, the method of cyclic projections on a finite collection of closed subsets {Ω_1, Ω_2, …, Ω_m} of E (m ≥ 2) is defined by the mapping

P_0 := P_{Ω_1} P_{Ω_2} ⋯ P_{Ω_m},   (23.10)

where we write P_0 for notational simplicity. For an initial point u^0 the algorithm generates a sequence (u^k)_{k∈N} by u^{k+1} ∈ P_0 u^k. For the analysis of P_0 it is convenient to introduce some auxiliary sets. We denote by Ω the product of the sets Ω_j on E^m, Ω := Ω_1 × Ω_2 × ⋯ × Ω_m, and by W_0 ⊂ E^m the set of all cycles of the cyclic projection method, where each coordinate of x ∈ W_0 corresponds to an inner iterate of P_0. The first coordinate x_1 of x ∈ W_0 is thus a fixed point of P_0. Let u ∈ Fix P_0 and fix ζ̄ ∈ Z(u), where Z(u) denotes the set of difference vectors ζ = (ζ_1, ζ_2, …, ζ_m) associated with u, whose coordinate entries are the differences between consecutive inner iterates of a cycle through u; note that ∑_{j=1}^m ζ̄_j = 0. The difference vectors provide information about the gaps between the inner iterates of a cycle of the mapping P_0.
To monitor the inner iterations, we consider the cyclic projection algorithm lifted to the product space E^m. That is, generate the sequence (x^k)_{k∈N} by x^{k+1} ∈ T_ζ̄ x^k, with T_ζ̄ as defined in (23.13), for ζ̄ ∈ Z(u) where u ∈ Fix P_0. The first entry of T_ζ̄ x^k thus belongs to the cyclic projection mapping P_0, whereas the other entries of T_ζ̄ x^k indicate how close x^{k+1} is to a certain cycle specified by ζ̄. In order to isolate cycles, we restrict our attention to relevant subsets of E^m. The set W(ζ̄), defined in (23.14), contains all points whose entries have a prescribed distance to each other, namely ζ̄_j; in particular, W(ζ̄) contains all fixed points of T_ζ̄. The affine subspace L is used to restrict the analysis to an affine subspace that contains the iterates x^k of T_ζ̄. To apply the convergence framework, Theorem 23.4.1, there are two major steps we have to take. First, we have to show that the mapping is almost averaged. Since the cyclic projection mapping is, as its name suggests, a composition of projectors, this is not hard to show using the concepts presented in Sect. 23.5. Second, metric subregularity needs to be proven. For this, we state an auxiliary result that relates metric subregularity to subtransversality of the collection of sets (see [3, Proposition 3.4]).
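The lifted construction can be sketched for m = 2. A toy inconsistent pair (two parallel lines in R², an illustrative assumption rather than a phase retrieval instance) shows how the coordinates of the lifted iterate encode the gap via the difference vector:

```python
import numpy as np

# Lifted cyclic projections on E^2 for two sets (m = 2): each iterate
# x = (x1, x2) collects the inner iterates of one sweep of P_0.
# A = x-axis and B = {y = 1} do not intersect (gap 1): an inconsistent
# two-set toy problem.
PA = lambda z: np.array([z[0], 0.0])
PB = lambda z: np.array([z[0], 1.0])

def T(x):
    """One sweep: x2^+ = P_B(x1), then x1^+ = P_A(x2^+)."""
    x1, x2 = x
    x2p = PB(x1)
    x1p = PA(x2p)
    return (x1p, x2p)

x = (np.array([5.0, 3.0]), np.array([0.0, 0.0]))
for _ in range(20):
    x = T(x)
zeta = x[0] - x[1]      # difference vector between the inner iterates
print(zeta)  # [ 0. -1.]: the entries of zeta encode the local gap
```

The first coordinate x[0] is the fixed point of the plain cyclic projection mapping, while the difference vector recovers the gap between the two sets, mirroring the role of ζ̄ ∈ Z(u) above.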

Theorem 23.6.1 (convergence of cyclic projections) Let S_j ⊂ Ω_j collect the j-th coordinates of the points of S, and let U_j be neighborhoods of S_j satisfying the neighborhood conditions (23.17)–(23.18) of [3, Theorem 3.2] together with

P_{Ω_j} U_{j+1} ⊆ U_j for each j = 1, 2, …, m (U_{m+1} := U_1).   (23.19)

For fixed ζ̄ ∈ Z and x̄ ∈ S with ζ̄ = x̄ − Π x̄, generate the sequence (x^k)_{k∈N} by x^{k+1} ∈ T_ζ̄ x^k for T_ζ̄ defined by (23.13), seeded by a point x^0 ∈ W(ζ̄) ∩ U for W(ζ̄) defined by (23.14) with x^0_1 ∈ Ω_1 ∩ U_1. Suppose that, for Λ := L ∩ aff(∪_{ζ∈Z} W(ζ)) ⊃ S such that T_ζ : Λ ⇒ Λ for all ζ ∈ Z and an affine subspace L ⊃ aff((x^k)_{k∈N}), the following hold:

(i) Ω_j is prox-regular at all x̄_j ∈ S_j with constant ε_j ∈ (0, 1) on the neighborhood U_j for j = 1, 2, …, m;

(ii)–(iv) the collection of sets {Ω_1, Ω_2, …, Ω_m} is subtransversal relative to Λ at the points of S with a suitable gauge, together with the technical conditions of [3, Theorem 3.2] relating the gauge to the prox-regularity constants ε_j.

Then the sequence (x^k)_{k∈N} converges to a point in Fix T_ζ̄ ∩ W(ζ̄) at a rate determined by the gauge; in particular, the convergence is R-linear when the gauge is linear.

Proof This is a special case of [3, Theorem 3.2] when the sets are prox-regular.

Remark 23.6.2 Theorem 23.6.1 is rather long and technical at first sight, though the pieces are easily parsed. Equations (23.17)–(23.19) force the iterates to stay in specific neighborhoods. This is needed to apply Proposition 23.9, with the help of (i), to deduce pointwise almost averagedness of P_0 and likewise of T_ζ̄. Assumptions (ii) and (iii) then yield metric subregularity of Φ_ζ̄ = T_ζ̄ − Id by Proposition 23.11. This is where the construction in the product space comes into play: working on E^m, we are able to use subtransversality to show metric subregularity of Φ_ζ̄. It is worth mentioning that, until now, we have not been able to show metric subregularity for the mapping directly associated with P_0. Adding assumption (iv) of Theorem 23.6.1, we can finally apply Theorem 23.4.1 and deduce convergence of the iterates of T_ζ̄ with the given constants. At this point the definition of T_ζ̄ becomes crucial: since the first coordinate of the sequence (x^k) generated by the mapping T_ζ̄ is nothing more than the result of applying the method of cyclic projections P_0, convergence of x^k implies convergence of x^k_1, that is, of the sequence generated by cyclic projections. In [25] Luke et al. discussed the necessity of subtransversality for alternating projections to converge R-linearly.

Application to Phase Retrieval Algorithms
In Sect. 23.6 we have seen how to apply Theorem 23.4.1 to the method of cyclic projections. This section is devoted to the analysis of other well-known algorithms introduced in Sect. 23.3.2. The analysis in Sect. 23.6 focused on showing how to satisfy the assumptions of Theorem 23.4.1 in the context of set feasibility. This section aims to provide broad intuition for the convergence of projection-based algorithms used to solve the phase retrieval problem. This also explains why the statements given next are presented in a cartoon-like manner: they include only the most important ingredients that yield local convergence, but neither the constants nor the rates. Nevertheless, these are verifiable by following the approach in Sect. 23.6.

Corollary 23.12 (convergence of the error reduction algorithm) Let Fix P_S P_{M_1} ≠ ∅. The error reduction algorithm, that is, alternating projections on the sets S and M_1 as discussed in Sect. 6.2.3, converges locally linearly to a point x̄ ∈ Fix P_S P_{M_1} whenever the mapping Φ = P_S P_{M_1} − Id is locally metrically subregular at its zeros.
Proof Following Luke et al. [32, Sect. 3.2.2], we represent C as R² and reformulate the phase retrieval problem as a feasibility problem with entrywise values in R². The statement is then an application of Theorem 23.4.1 using Remark 23.5.1.

Remark 23.7.1 In contrast to Theorem 23.6.1, metric subregularity is required directly in Corollary 23.12. Equivalently, we could demand subtransversality of the collection of sets {S, M_1} plus the additional assumption (iii) of Theorem 23.6.1. The difficulty is that, so far, it is not clear when and where these two assumptions are satisfied. Illustrative examples and numerical simulations indicate that they hold in many instances. Nevertheless, there are situations in which at least one of the two assumptions is violated (see for instance [33]). Moreover, allowing metric subregularity under some gauge sometimes reflects reality better than restricting the analysis to a linear setting. One example is alternating projections applied to the sphere S and a line tangent to S at x̄ = (0, −1). In this instance the algorithm does not converge linearly to x̄, although it does converge for suitable initial points (see for instance [3]). This issue is interesting not only for the type of convergence, but also for the actual numerical implementation of algorithms. Although sets in real-life applications intersect tangentially only on a set of measure zero, beyond a certain numerical accuracy the distinction between tangential intersection and linear convergence with a rate constant within 15 digits of 1 is rather academic. Having a relatively large gap between the sets in inconsistent feasibility is in fact an advantage for the numerical performance of an algorithm.
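A minimal numerical sketch of error reduction on a synthetic 1-D instance (sizes, seed and names are illustrative assumptions, not laboratory data) illustrates the classical monotonicity of the error metric along the iterates:

```python
import numpy as np

# Error reduction = alternating projections x^{k+1} = P_S(P_M(x^k)) on a
# toy problem: a real nonnegative signal supported on its first half,
# measured through the modulus of its unitary DFT (noiseless data).
rng = np.random.default_rng(3)
n = 32
true = np.zeros(n)
true[: n // 2] = rng.random(n // 2)
b = np.abs(np.fft.fft(true, norm="ortho"))    # measured Fourier amplitudes

def P_M(x):
    """Projector onto M = {x : |F x| = b}: keep phase, replace modulus."""
    X = np.fft.fft(x, norm="ortho")
    absX = np.abs(X)
    safe = np.where(absX > 0, absX, 1.0)
    return np.fft.ifft(b * np.where(absX > 0, X / safe, 1.0), norm="ortho")

def P_S(x):
    """Projector onto the nonnegative support constraint."""
    y = np.maximum(np.real(x), 0.0)
    y[n // 2:] = 0.0
    return y

x = P_S(rng.random(n))          # feasible start for the support constraint
errs = []
for _ in range(300):
    errs.append(np.linalg.norm(P_M(x) - x))   # distance of x^k to M
    x = P_S(P_M(x))
print(errs[-1] <= errs[0] + 1e-12)  # True: the error metric is nonincreasing
```

The monotonicity follows because x^k ∈ S for all k: dist(x^{k+1}, M) ≤ ‖x^{k+1} − P_M x^k‖ = dist(P_M x^k, S) ≤ ‖P_M x^k − x^k‖ = dist(x^k, M). Local linear convergence, by contrast, is exactly what requires the metric subregularity hypothesis of Corollary 23.12.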
Corollary 23.13 (convergence of relaxed averaged alternating reflections) The relaxed averaged alternating reflections algorithm (6.22) applied to a phase retrieval problem converges locally linearly to a point x̄ ∈ Fix T_RAAR whenever the mapping Φ = (λ/2)(R_S R_{M_1} + Id) + (1 − λ)P_{M_1} − Id is locally metrically subregular at its zeros.
A detailed proof of the convergence analysis for the relaxed averaged alternating reflection algorithm can be found in [33], by the authors of this chapter. There we use subtransversality of the collections of sets in general feasibility problems to make the connection to metric subregularity of the algorithm in question. The analysis does not use prox-regularity as the type of set regularity yielding the almost averaging property, but rather the property of being super-regular at a distance, which extends notions of set regularity to their effect on points not in the sets. This definition is in line with ε-subregularity and is thus connected to the analysis of [3].
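The relaxation can be illustrated numerically. The sketch below applies one RAAR-type step, T = (λ/2)(R_A R_B + Id) + (1 − λ)P_B, to a toy inconsistent pair of parallel lines (an illustrative stand-in for the magnitude and support sets); at λ = 1 the iteration reduces to Douglas–Rachford, which has no fixed points here, while the relaxed iteration does.

```python
import numpy as np

# RAAR-type iteration T = (lam/2)(R_A R_B + Id) + (1 - lam) P_B for the
# inconsistent toy pair A = x-axis, B = {y = 1} (parallel lines, gap 1).
PA = lambda z: np.array([z[0], 0.0])
PB = lambda z: np.array([z[0], 1.0])
RA = lambda z: 2 * PA(z) - z
RB = lambda z: 2 * PB(z) - z

def raar(z, lam=0.5, iters=200):
    for _ in range(iters):
        z = 0.5 * lam * (RA(RB(z)) + z) + (1 - lam) * PB(z)
    return z

z = raar(np.array([2.0, 5.0]))
print(np.round(z, 8))  # [2. 0.]: a fixed point on A, although A and B do not intersect
```

With λ = 1 (plain Douglas–Rachford) the second coordinate drifts by −1 per step and never settles, matching the remark below that the existence of fixed points for the relaxed version does not depend on consistency of the feasibility problem.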

Remark 23.7.4
In [33] we not only provided a convergence statement for the relaxed averaged alternating reflections method, but also gave a description of the fixed point set of the underlying mapping. For sets that are super-regular at a distance, the fixed points, if they exist, are either points in the intersection of both sets or, if the intersection is empty, relate to the local gap between the sets. This result is in line with [11], where Luke studied the case of one set being convex and the other prox-regular. In contrast to the original Douglas–Rachford algorithm, the main advantage of the relaxed version is that the existence of fixed points does not depend on whether the feasibility problem is consistent. Connecting this observation to the convergence analysis presented here explains why, in practice, Douglas–Rachford/HIO is much less stable than the relaxed version.
Following the ideas above, it is not hard to show that most projection methods are pointwise almost averaged mappings when applied to the phase retrieval problem. Nonetheless, verifying metric subregularity remains an open problem in some important cases. Thus, local convergence can often be easily verified, but it is hard to quantify.

Final Remarks
When it comes to computing (see Remark 23.7.1), whether a method converges, let alone the rate at which it does, depends on the numerical precision. Inconsistency, too, has an impact on numerical performance. Closely related to this, we want to stress another feature of the analysis surveyed here: sometimes less information can lead to better performance of an algorithm. For a demonstration we analyze a data set recorded by undergraduates at the X-Ray Physics Institute at the University of Göttingen. It is an optical diffraction image with measured amplitudes √I_{j,i}, j = 1, 2, …, m, as in (23.1) with m = 1 and n the dimension of the image, and an additional support constraint. The full data set has dimension n = 1392 × 1040, the cropped data set n = 128². The graphs shown in Figs. 23.1 and 23.2 are produced by applying the alternating projections algorithm, i.e. error reduction, to each data set individually.
As it turns out, alternating projections on the full data set (Fig. 23.2) shows worse convergence behavior than on the limited data set (Fig. 23.1). Not only does the algorithm need more iterations to reach a given accuracy (9.8485 × 10⁴ instead of 666), but the rate of linear convergence once the iterates reach a suitable neighborhood is also worse. Noteworthy is the gap observed in both problem instances: in the full data set it is smaller than in the limited data set. We conjecture that this behavior is closely related to the property of metric subregularity or, in the context of set feasibility, subtransversality. The more, and better, information one has, the closer the constraint sets come to intersecting. But this can include cases in which the sets intersect tangentially. In such cases the method of alternating projections need not converge locally linearly but can exhibit sublinear convergence (see for instance [3, Remark 3.2]). The take-home message in this context is that more information does not necessarily yield a better image when applying numerical algorithms. This is good news and bad news for these algorithms. The good news is that one can profit from implicit regularization with smaller problem sizes. The bad news is that this indicates a type of dimension dependence of these methods: the higher the dimension, the worse the constants in the linear convergence rates. This is not surprising and points to the need for models that lead to algorithms whose performance (that is, regularity) is dimension independent. While our discussion here focuses on the theoretical analysis rather than a comparison of the presented algorithms, we point the reader to a study by Luke et al. [22], where the authors present a thorough review of first-order proximal methods for phase retrieval.
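The observed rates reported above can be extracted from an iteration by fitting the slope of the log-residuals over the tail of the run. A minimal sketch on synthetic, exactly geometric residuals (the data and the rate 0.9 are illustrative):

```python
import numpy as np

# Estimate an observed linear rate from residuals ||x^{k+1} - x^k||:
# fit the slope of log(residual) over the tail of the iteration.  Here
# the residuals decay exactly geometrically with rate 0.9.
res = 0.9 ** np.arange(50) * 3.0
tail = np.log(res[20:])                      # skip the transient phase
slope = np.polyfit(np.arange(len(tail)), tail, 1)[0]
rate = np.exp(slope)
print(round(rate, 3))  # 0.9
```

Skipping the transient phase before fitting matters in practice: as discussed above, the linear rate is only attained once the iterates reach a suitable neighborhood of the fixed point set.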