Abstract
Phase retrieval refers to the problem of reconstructing an unknown vector \(x_0 \in {\mathbb {C}}^n\) or \(x_0 \in {\mathbb {R}}^n \) from m measurements of the form \(y_i = \big \vert \langle \xi ^{\left( i\right) }, x_0 \rangle \big \vert ^2 \), where \( \left\{ \xi ^{\left( i\right) } \right\} ^m_{i=1} \subset {\mathbb {C}}^n \) are known measurement vectors. While Gaussian measurements allow for recovery of arbitrary signals provided the number of measurements scales at least linearly in the number of dimensions, it has been shown that ambiguities may arise for certain other classes of measurements \( \left\{ \xi ^{\left( i\right) } \right\} ^{m}_{i=1}\) such as Bernoulli measurements or Fourier measurements. In this paper, we will prove that even when a subgaussian vector \( \xi ^{\left( i\right) } \in {\mathbb {C}}^n \) does not fulfill a small-ball probability assumption, the PhaseLift method is still able to reconstruct a large class of signals \(x_0 \in {\mathbb {R}}^n\) from the measurements. This extends recent work by Krahmer and Liu from the real-valued to the complex-valued case. However, our proof strategy is quite different and we expect some of the new proof ideas to be useful in several other measurement scenarios as well. We then extend our results to \(x_0 \in {\mathbb {C}}^n \), up to an additional assumption which, as we show, is necessary.
1 Introduction
Phase retrieval refers to the problem of reconstructing an unknown vector \(x_0 \in {\mathbb {C}}^n\) from m measurements of the form
$$\begin{aligned} y_i = \big \vert \langle \xi ^{\left( i\right) }, x_0 \rangle \big \vert ^2 + w_i, \quad i \in \left[ m\right] , \end{aligned}$$(1.1)
where the \(\xi ^{\left( i\right) } \in {\mathbb {C}}^n\) are known measurement vectors and \(w_i \in {\mathbb {R}}\) represents additive noise. Such problems are ubiquitous in many areas of science and engineering such as X-ray crystallography [23, 32], astronomical imaging [18], ptychography [35], and quantum tomography [28].
The foundational papers [4, 7, 13] proposed to reconstruct \(x_0\) via the PhaseLift method, a convex relaxation of the original problem. These papers have triggered many follow-up works since they were the first to establish rigorous recovery guarantees under the assumption that the measurement vectors \(\xi ^{\left( i\right) }\) are sampled uniformly at random from the sphere. Since then several papers have analyzed scenarios where the measurement vectors possess a significantly reduced amount of randomness, in particular spherical designs [21] and coded diffraction patterns [5, 22]. However, the theoretical results for coded diffraction patterns rely on the assumption that the modulus of the illumination patterns is varying. Indeed, it was shown in [17] that for certain illumination patterns with constant modulus ambiguities can arise, i.e., it is not possible to determine \(x_0\) uniquely from the measurements \(y_i\). In fact, such ambiguities can already arise in much simpler settings, where the measurement vectors \( \left( \xi ^{\left( i\right) } \right) \) are i.i.d. subgaussian. For example, consider the case that \(\xi ^{\left( i\right) } = \left( \varepsilon ^{\left( i\right) }_{1} , \ldots , \varepsilon ^{\left( i\right) }_{n} \right) \), where the \( \varepsilon ^{\left( i\right) }_{j} \) are i.i.d. Rademacher random variables. That is, they only take the values \(+1\) and \(-1\) each with probability \(\frac{1}{2}\). In this case the vector \(x_0:=e_1=\left( 1, 0, \ldots , 0\right) \) can never be distinguished from the vector \( {\tilde{x}}_0:=e_2=\left( 0, 1, \ldots , 0\right) \). 
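This ambiguity can be verified exhaustively. The following minimal sketch (illustrative; the helper `measure` is ours, not from the paper) enumerates all Rademacher sign patterns in dimension 4 and checks that \(e_1\) and \(e_2\) always produce the same measurement:

```python
from itertools import product

n = 4
e1 = [1, 0, 0, 0]
e2 = [0, 1, 0, 0]

def measure(xi, x):
    # a single phaseless measurement y = |<xi, x>|^2
    return abs(sum(a * b for a, b in zip(xi, x))) ** 2

# Every Rademacher sign pattern xi in {-1, +1}^n yields the identical
# measurement y = 1 for both e1 and e2, so no number of such
# measurements can distinguish the two signals.
for xi in product([-1, 1], repeat=n):
    assert measure(xi, e1) == measure(xi, e2) == 1
```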
Note that in this scenario it holds that \({\mathbb {E}} \left[ \big \vert \xi ^{\left( i\right) }_j \big \vert ^4 \right] = {\mathbb {E}} \left[ \big \vert \xi ^{\left( i\right) }_j \big \vert ^2 \right] \) and, hence, the vector \( \xi ^{\left( i\right) } \) does not fulfill a small-ball probability assumption, which means that there is no constant \(c>0\) such that for all \(\varepsilon >0\) and for all vectors x it holds that
When the signals are complex, additional classes of ambiguities can arise. For example, when the measurement vectors \( \xi ^{\left( i\right) } \) are real, any signal x and its complex-conjugate signal \( {\overline{x}} \) will result in identical observations.
For these reasons, previous works on phase-retrieval from subgaussian measurements (see, e.g., [11]) work with real signals and require that all entries of the vector \( \xi ^{\left( i\right) } \) fulfill
for all \( j \in \left[ n\right] \) or make even stronger assumptions.
The only exception is [26], which shows for the real-valued case (\(x_0 \in {\mathbb {R}}^n\) and \(\xi ^{\left( i\right) } \in {\mathbb {R}}^n \)) that PhaseLift recovers a large class of signals from subgaussian measurements even if estimates of the type (1.3) are not satisfied. More precisely, one obtains that all signals \(x_0\) whose peak-to-average power ratio satisfies a mild bound of the form
for some absolute constant \(\mu >0\), can be recovered with high probability as long as \( m \gtrsim n \). However, as the approach in [26] is intrinsically based on arguments in [16] it cannot be generalized to the complex case in a straightforward manner. This paper provides an analysis both for real-valued and complex-valued signals. We believe that this understanding will be of importance for the subsequent study of structured scenarios such as coded diffraction patterns, which are also intrinsically complex in nature.
While the proofs in previous papers [5, 21, 22, 26] relied on the construction of a so-called dual certificate, our paper will employ a more geometric approach based on Mendelson’s small ball method [25, 31]. This is motivated by recent work [24, 27, 28], which showed that a geometric analysis based on the descent cone of the trace norm can often yield additional insights compared to an approach based on dual certificates.
For the problem studied in this paper, however, the small-ball method cannot be applied directly to the entire descent cone or the entire cone of directions in which positive semidefiniteness is preserved. Rather we divide the latter cone into two parts: One that contains all the problematic cases, but is small, and one that is larger, but easier to analyze. Then we control one of these cones using a restricted isometry property and one via the small-ball method.
We think that this novel viewpoint and also some of the techniques developed in this paper will be useful for the analysis of other interesting measurement scenarios, such as the case of heavy-tailed measurement vectors \(\xi ^{\left( i\right) } \) or the case that \(\xi ^{\left( i\right) }\) has only entries 0 and 1.
2 Background and Main Results
2.1 Notation
\({\mathcal {S}}^n\) denotes the vector space of all Hermitian matrices in \({\mathbb {C}}^{n \times n} \). By \({\mathcal {S}}^n_{+} \subset {\mathcal {S}}^n \) we will denote the set of all positive semidefinite Hermitian matrices. For \(A,B \in {\mathcal {S}}^n\) the Hilbert-Schmidt inner product is defined by \( \left\langle A,B \right\rangle _{HS} := \text {Tr}\,\left( A^* B\right) \). The corresponding norm will be denoted by \( \Vert \cdot \Vert _{HS}\). For a matrix \(Z \in {\mathcal {S}}^n \) we will denote its eigenvalues by \(\lambda _1 \left( Z\right) , \lambda _2 \left( Z\right) , \ldots , \lambda _n \left( Z\right) \), which are assumed to be arranged in decreasing order, i.e., \( \lambda _1 \left( Z\right) \ge \lambda _2 \left( Z\right) \ge \cdots \ge \lambda _n \left( Z\right) \). If no confusion can arise, we will suppress the dependence on Z and write \(\lambda _i \) instead of \(\lambda _i \left( Z\right) \). By \(\Vert Z \Vert _1\) we will denote the Schatten-1 norm of Z, i.e., \( \Vert Z \Vert _1 := \sum _{i=1}^{n} \vert \lambda _i \left( Z\right) \vert \). By \(\text {diag}\left( Z\right) \in {\mathcal {S}}^n\) we denote the matrix obtained by setting all off-diagonal entries of Z equal to zero. We will write \(a \lesssim b\) or \(b \gtrsim a \) if there is a universal constant \(C>0\) such that \( a \le Cb \).
2.2 PhaseLift
The PhaseLift method was first introduced in [7]. In this paper we focus on a variant [4, 13] based on the observation that the measurements \(y_i\) can be rewritten in the form
$$\begin{aligned} y_i = \big \langle \xi ^{\left( i\right) } \big (\xi ^{\left( i\right) }\big )^*, X_0 \big \rangle _{HS} + w_i, \quad i \in \left[ m\right] , \end{aligned}$$(2.1)
where \(X_0 = x_0 x^*_0 \) is a rank-1 matrix encoding the signal to be recovered up to the true inherent phase ambiguity. From this observation, PhaseLift relaxes the constraint that \(X_0\) is of rank 1 to obtain the optimization problem
$$\begin{aligned} \underset{X \in {\mathcal {S}}^n_{+}}{\text {minimize}} \ \sum _{i=1}^{m} \Big \vert \big \langle \xi ^{\left( i\right) } \big (\xi ^{\left( i\right) }\big )^*, X \big \rangle _{HS} - y_i \Big \vert . \end{aligned}$$(2.2)
In order to simplify notation we introduce the linear operator \({\mathcal {A}}: {\mathcal {S}}^n \rightarrow {\mathbb {R}}^m \) as
$$\begin{aligned} \left( {\mathcal {A}} \left( X\right) \right) _i := \big (\xi ^{\left( i\right) }\big )^* X \, \xi ^{\left( i\right) } = \text {Tr}\,\Big ( \xi ^{\left( i\right) } \big (\xi ^{\left( i\right) }\big )^* X \Big ) . \end{aligned}$$(2.3)
Hence, setting \(y:= \left( y_1, \ldots , y_m \right) \in {\mathbb {R}}^m \), (2.2) can be rewritten as
$$\begin{aligned} \underset{X \in {\mathcal {S}}^n_{+}}{\text {minimize}} \ \Vert {\mathcal {A}} \left( X\right) - y \Vert _{\ell _1}. \end{aligned}$$(2.4)
We note that while understanding the relaxation (2.4) is an important benchmark approach and it can be solved in polynomial time, it is typically not practical for applications, as lifting increases the number of optimization variables. For this reason, a very active line of research studies recovery guarantees for algorithms that operate in the natural parameter domain such as alternating minimization (see, e.g., [34, 43]), gradient-descent based formulations (see, e.g., [6, 9, 10, 38, 39]), and anchored regression [1,2,3, 20]. However, most of these guarantees have been shown under the assumption that the measurement vectors \(\left\{ \xi ^{\left( i\right) } \right\} _{i=1}^m \) are sampled i.i.d. from the unit sphere, so it will be a natural follow-up of this work to study to what extent our results generalize to the more practical nonconvex algorithms. In particular, most reconstruction guarantees for these non-convex approaches require an appropriate initialization. For this reason, one needs to study which initialization schemes work for the measurements considered in this paper. A natural approach will be to try spectral initializations and recent generalizations that have been shown to be feasible for a basically minimal number of measurements [15, 29, 30, 33]. We expect that the analysis provided in this paper will prove useful for this endeavour as the spectral initialization is somewhat connected to trace-norm minimization.
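The lifting step in Sect. 2.2 rests on the identity \( \vert \langle \xi , x \rangle \vert ^2 = \langle \xi \xi ^*, x x^* \rangle _{HS} \), which turns a quadratic measurement of the vector into a linear measurement of a rank-1 matrix. A minimal numerical check (illustrative; variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
xi = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# quadratic measurement of the vector x ...
y_vec = abs(np.vdot(xi, x)) ** 2

# ... equals a linear measurement of the lifted rank-1 matrix X = x x^*
Xi = np.outer(xi, xi.conj())              # measurement matrix xi xi^*
X = np.outer(x, x.conj())                 # lifted signal
y_lift = np.trace(Xi.conj().T @ X).real   # <Xi, X>_HS = Tr(Xi^* X)

assert np.isclose(y_vec, y_lift)
```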
2.3 Subgaussian Measurements
We consider random measurement vectors \( \left\{ \xi ^{\left( i\right) } \right\} ^m_{i=1} \) given as independent copies of a random vector \(\xi \), whose entries \(\xi _j\) are assumed to be i.i.d. subgaussian random variables with parameter K, expectation \({\mathbb {E}}\left[ \xi _j\right] =0 \), and variance \( {\mathbb {E}}\left[ \vert \xi _j \vert ^2 \right] =1\). Recall that a random variable X is subgaussian with parameter K, if and only if
It is well known (see, e.g., [42]) that from this definition it follows for any (measurable) random variable X that
Since \( \Vert \xi _1 \Vert _{L_2}^2 = {\mathbb {E}} \left[ \vert \xi _1 \vert ^2 \right] =1 \) inequality (2.6) immediately implies that
Moreover, it is well known (see, e.g., [42]) that for all \( x\in {\mathbb {C}}^n\) the random variable \( \langle x,\xi \rangle \) is subgaussian with parameter \(K \Vert x \Vert \).
2.4 Previous Work
A number of previous works have studied phase retrieval with subgaussian measurements in the real-valued setting, i.e., \(x_0 \in {\mathbb {R}}^n\) and \(\xi \in {\mathbb {R}}^n\). For measurements fulfilling \( {\mathbb {E}} \left[ \big \vert \xi _j \big \vert ^4 \right] > {\mathbb {E}} \left[ \big \vert \xi _j \big \vert ^2 \right] \), it was shown in [11] that PhaseLift admits order-optimal uniform recovery guarantees. Without the assumption \( {\mathbb {E}} \left[ \big \vert \xi _j \big \vert ^4 \right] > {\mathbb {E}} \left[ \big \vert \xi _j \big \vert ^2 \right] \), the following result was proven in [26], again for the real-valued case.
Theorem 1
[26, Theorem V.1] Let \(\xi = \left( \xi _1, \ldots , \xi _n \right) \in {\mathbb {R}}^n \) be a random vector with i.i.d. subgaussian entries. Then there exist constants \(C_1\), \(C_2\), \(C_3\), and \(0< \mu < 1 \), which depend only on the distribution of \(\xi _1\), such that whenever
the following statement holds with probability at least \(1-\exp \left( - C_2 n \right) \): For all signals \(x_0 \in {\mathbb {R}}^n\) with \(\Vert x_0 \Vert _{\infty } \le \mu \Vert x_0 \Vert \) and all noise vectors \( w \in {\mathbb {R}}^m \) any minimizer of (2.4) fulfills
3 Main Results
3.1 Complex Signals and Complex Measurement Vectors
In Theorem 1 both the signal \(x_0\) and the measurement vectors \(\xi ^{\left( i\right) }\) are assumed to be real. While for the measurement vectors this is often too restrictive, the signal \(x_0\) is indeed typically real-valued in applications. This important special case will be discussed in Sect. 3.2 below. Nevertheless, it is still interesting from a mathematical point of view to understand under which assumptions recovery is possible for complex-valued signals. Our first result deals with this case.
As we have explained in Sect. 1, there are subgaussian distributions for which we cannot achieve uniform recovery of all signals \( x_0 \in {\mathbb {C}}^n\). For this reason, we define for all \(0 < \mu \le 1 \) the set of all signals of mildly bounded peak-to-average power ratio
$$\begin{aligned} {\mathcal {X}}_{\mu } := \left\{ x \in {\mathbb {C}}^n \setminus \left\{ 0 \right\} : \ \Vert x \Vert _{\infty } \le \mu \Vert x \Vert \right\} . \end{aligned}$$(3.1)
Indeed, this restriction is very mild as \(\mu \) will not depend on the dimension, whereas for a Gaussian random signal the ratio \(\frac{\Vert x \Vert _{\infty }}{\Vert x \Vert } \) would scale like \( \sqrt{\frac{\log n}{n}} \). Now we are prepared to state the following theorem, which is our first main result.
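As an aside, the \(\sqrt{\log n / n}\) scaling for a Gaussian random signal is easy to observe empirically. A minimal sketch (illustrative; the dimension \(n = 2\cdot 10^5\) is an arbitrary choice of ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.standard_normal(n)

ratio = np.abs(x).max() / np.linalg.norm(x)   # peak-to-average power ratio
prediction = np.sqrt(np.log(n) / n)

# For a Gaussian signal the ratio is of the order sqrt(log(n)/n),
# i.e., vanishing in the dimension, so the constraint in X_mu is mild.
print(f"ratio = {ratio:.4f}, sqrt(log n / n) = {prediction:.4f}")
assert ratio < 0.02
```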
Theorem 2
Let the observation vector y be given as in (1.1), where the random measurement vectors \( \left\{ \xi ^{\left( i\right) } \right\} _{i=1}^m \) are defined as in Sect. 2.3. Assume that \( \vert {\mathbb {E}} \left[ \xi _1^2 \right] \vert ^2 \le 1-\beta \) for some \(\beta \in \left( 0,1\right) \) and that
$$\begin{aligned} m \ge C_1 \frac{K^{20}}{\beta ^{5/2}} n. \end{aligned}$$(3.2)
Then for some probability parameter \(p_{\beta }=1- {\mathcal {O}} \left( \exp \left( \frac{-m \beta ^4}{C_2 K^{16}} \right) \right) \) the following two statements hold.
1.
With probability at least \(p_{\beta }\) one has that for all vectors \(x_0 \in {\mathcal {X}}_{1/81} \) and any noise vector \(w\in {\mathbb {R}}^m\) any solution \({\hat{X}}\) of (2.4) satisfies
$$\begin{aligned} \Vert {\hat{X}} - x_0 x^*_0 \Vert _{1} \le C_3 \frac{K^8}{m \beta ^{5/2}} \Vert w \Vert _{\ell _1}. \end{aligned}$$(3.3)
2.
If, in addition, \({\mathbb {E}} \left[ \vert \xi _1 \vert ^4 \right] \ge 1+\beta \), then with probability at least \(p_{\beta }\) inequality (3.3) holds for all \(x_0 \in {\mathbb {C}}^n \setminus \left\{ 0 \right\} \).
Here \(C_1\), \(C_2\), and \(C_3\) are universal constants.
The first case of Theorem 2, where one makes no assumption on the fourth moment of \(\xi _1 \), can be applied also to certain scenarios where unique recovery is not possible without this assumption. One important example is that the entries \(\xi _i\) are drawn uniformly at random from the complex unit circle \( \left\{ z\in {\mathbb {C}}: \ \vert z \vert =1 \right\} \). Note that these measurements will always yield the same observations y for the two signals
Such very sparse signals are exactly excluded by the condition \(x_0 \in {\mathcal {X}}_{1/81} \) in the first statement, so there is no contradiction to the theorem's conclusion that unique recovery can be achieved via (2.4) for all signals \(x_0\) such that \(\Vert x_0 \Vert _{\infty } \le \frac{1}{81} \Vert x_0 \Vert \).
Note that in the second scenario, where assumptions on the fourth moment of \(\xi _1\) are available, we obtain a uniform recovery result over all \(x_0 \in {\mathbb {C}}^n \). In the real-valued case a similar result has been shown in [11].
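For the uniform unimodular distribution from the example above, the two moment conditions of Theorem 2 can be inspected directly. A Monte Carlo sketch (illustrative; sample size and seed are our choices):

```python
import numpy as np

rng = np.random.default_rng(2)
theta = rng.uniform(0.0, 2.0 * np.pi, size=1_000_000)
xi = np.exp(1j * theta)               # entries uniform on the unit circle

# |E[xi^2]|^2 = 0, so the assumption |E[xi_1^2]|^2 <= 1 - beta holds ...
assert abs((xi ** 2).mean()) < 1e-2
# ... but E[|xi|^4] = 1 exactly, so the strengthened fourth-moment
# condition E[|xi_1|^4] >= 1 + beta of the second statement fails.
assert np.isclose((np.abs(xi) ** 4).mean(), 1.0)
```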
Remark 1
An assumption of the form \( \vert {\mathbb {E}} \left[ \xi _1^2 \right] \vert ^2 \le 1-\beta \) cannot be avoided, as the following argument shows. Indeed, if \( \vert {\mathbb {E}} \left[ \xi _1^2 \right] \vert ^2 =1\), the assumption \( {\mathbb {E}} \left[ \vert \xi _1 \vert ^2 \right] =1 \) implies that \( \xi = \lambda {\tilde{\xi }} \) almost surely, where \( \lambda \in \left\{ z\in {\mathbb {C}}: \ \vert z \vert =1 \right\} \) is fixed and \({\tilde{\xi }} \) is a real random vector. We observe that for every \(x_0 \in {\mathbb {C}}^n\)
$$\begin{aligned} \big \vert \langle \xi , x_0 \rangle \big \vert ^2 = \big \vert \langle {\tilde{\xi }}, x_0 \rangle \big \vert ^2 = \big \vert \langle {\tilde{\xi }}, \overline{x_0} \rangle \big \vert ^2 = \big \vert \langle \xi , \overline{x_0} \rangle \big \vert ^2 . \end{aligned}$$
Consequently, \(x_0\) and its complex-conjugate \(\overline{x_0}\) will always lead to the same measurements.
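The conjugation ambiguity of Remark 1 is immediate to verify numerically; a minimal sketch (illustrative; dimensions are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 6, 20
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
Xi = rng.standard_normal((m, n))      # real-valued measurement vectors

# For real xi one has <xi, conj(x)> = conj(<xi, x>), hence equal intensities:
y_x = np.abs(Xi @ x) ** 2
y_xbar = np.abs(Xi @ x.conj()) ** 2
assert np.allclose(y_x, y_xbar)
```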
3.2 Real Signals and Complex Measurement Vectors
We have seen in Remark 1 that the assumption \( \vert {\mathbb {E}} \left[ \xi _1^2 \right] \vert ^2 \le 1-\beta \) is necessary to distinguish between a signal \(x_0\) and \(\overline{x_0}\). However, if, as in many practical applications, it is known a priori that the signal \(x_0\) is real-valued, then this ambiguity cannot arise and we can recover uniquely without additional assumptions via the following natural variant of the PhaseLift method, where we restrict the search space to real-valued matrices:
$$\begin{aligned} \underset{X \in {\mathcal {S}}^n_{+} \cap {\mathbb {R}}^{n \times n}}{\text {minimize}} \ \Vert {\mathcal {A}} \left( X\right) - y \Vert _{\ell _1}. \end{aligned}$$(3.10)
The following theorem shows that in this scenario the assumption \(\vert {\mathbb {E}} \left[ \xi _1^2 \right] \vert ^2 \le 1-\beta \) is indeed not necessary.
Theorem 3
Let the observation vector y be given as in (1.1), where the random measurement vectors \( \left\{ \xi ^{\left( i\right) } \right\} _{i=1}^m \) are as defined in Sect. 2.3. Then the following two statements hold.
1.
Assume that
$$\begin{aligned} m \ge C_1 K^{20} n. \end{aligned}$$(3.11)
Then, with probability at least \(1- {\mathcal {O}} \left( \exp \left( \frac{-m }{C_2 K^{16}} \right) \right) \) one has that for all vectors \(x_0 \in {\mathcal {X}}_{1/81}\cap {\mathbb {R}}^n\) and any noise vector \(w\in {\mathbb {R}}^m\) any solution \({\hat{X}}\) of (3.10) satisfies
$$\begin{aligned} \Vert {\hat{X}} - x_0 x^*_0 \Vert _{1} \le C_3 \frac{K^8}{m} \Vert w \Vert _{\ell _1}. \end{aligned}$$(3.12)
2.
If, in addition, it holds that \({\mathbb {E}} \left[ \vert \xi _1 \vert ^4 \right] \ge 1+\beta \) for some \(\beta \in (0,1] \), then, under the refined assumption
$$\begin{aligned} m \ge C_1 \frac{K^{20}}{\beta ^{5/2}} n, \end{aligned}$$(3.13)
one has a more general bound. Namely, it holds that with probability at least \(1- {\mathcal {O}} \left( \exp \left( \frac{-m \beta ^4}{C_2 K^{16}} \right) \right) \) and for all vectors \(x_0 \in {\mathbb {R}}^n \backslash \left\{ 0\right\} \), again for arbitrary noise vectors \(w\in {\mathbb {R}}^m\), any solution \({\hat{X}}\) of (3.10) satisfies
$$\begin{aligned} \Vert {\hat{X}} - x_0 x^*_0 \Vert _{1} \le C_3 \frac{K^8}{m \beta ^{5/2}} \Vert w \Vert _{\ell _1}. \end{aligned}$$(3.14)
Here \(C_1\), \(C_2\), and \(C_3\) are universal constants.
Remark 2
In comparison to Theorem 1 the probability bound in Theorems 2 and 3 is slightly better, as it improves from \(1-\exp \left( - \Omega \left( n\right) \right) \) to \( 1-\exp \left( - \Omega \left( m\right) \right) \). Moreover, note that in contrast to Theorem 1 the dependence on the subgaussian distribution of \(\xi \) is not hidden in the constants. Also note that in our result the dependence on \(\beta \) is stated explicitly. However, we do not know whether these bounds are optimal with respect to K and \(\beta \).
4 Proof of Main Results
4.1 Proof of Theorem 2
Our goal is to show that with high probability the matrix \(x_0 x_0^*\) is close to the minimizer \({\hat{X}}\) of the expression \(\Vert {\mathcal {A}}(W)-y\Vert _{\ell _1}\) over all \(W\in {\mathcal {S}}^n_{+}\). A common proof strategy that we will also follow is to establish that all \(X \in {\mathcal {S}}^n_{+}\) with
$$\begin{aligned} \Vert {\mathcal {A}} \left( X\right) -y \Vert _{\ell _1} \le \Vert w \Vert _{\ell _1} \end{aligned}$$(4.1)
are sufficiently close to the true solution in \(\Vert \cdot \Vert _{1} \)-norm. (Note that the minimizer \({\hat{X}}\) fulfills (4.1), as \( \Vert {\mathcal {A}} ( {\hat{X}} )-y \Vert _{\ell _1} \le \Vert {\mathcal {A}} \left( x_0 x^*_0 \right) -y \Vert _{\ell _1} = \Vert w \Vert _{\ell _1} \).) More precisely, a sufficient condition for inequality (3.3) is that every X fulfilling condition (4.1) satisfies
$$\begin{aligned} \Vert X - x_0 x^*_0 \Vert _{1} \le C_3 \frac{K^8}{m \beta ^{5/2}} \Vert w \Vert _{\ell _1}. \end{aligned}$$(4.2)
Setting \(Z= X- x_0 x_0^*\), Eq. (4.1) reads
$$\begin{aligned} \Vert {\mathcal {A}} \left( Z\right) - w \Vert _{\ell _1} \le \Vert w \Vert _{\ell _1}. \end{aligned}$$
By the triangle inequality this implies that
$$\begin{aligned} \Vert {\mathcal {A}} \left( Z\right) \Vert _{\ell _1} \le 2 \Vert w \Vert _{\ell _1}. \end{aligned}$$
Hence, the upper bound (4.2) that we aim to establish directly follows from an appropriate lower bound for \(\Vert {\mathcal {A}} \left( Z\right) \Vert _{\ell _1}/ \Vert Z \Vert _1 \). Here \(Z \in {\mathcal {S}}^n\) ranges over those matrices for which \(x_0 x^*_0 + Z \) is positive semidefinite. This set is convex, so it is locally well-approximated by a convex cone. To establish a uniform recovery result over all \(x_0 \in {\mathcal {X}}_{\mu }\), we need to study the union of the corresponding cones as given by
$$\begin{aligned} {\mathcal {M}}_{\mu } := \bigcup _{x_0 \in {\mathcal {X}}_{\mu }} \left\{ Z \in {\mathcal {S}}^n : \ x_0 x^*_0 + tZ \in {\mathcal {S}}^n_{+} \ \text {for some } t>0 \right\} . \end{aligned}$$
We will refer to this set as the cone of admissible directions.
With this notation, our proof strategy can be summarized as establishing a lower bound for
$$\begin{aligned} \lambda _{\min } \left( {\mathcal {A}}, {\mathcal {M}}_{\mu } \right) := \underset{Z \in {\mathcal {M}}_{\mu } \setminus \left\{ 0 \right\} }{\inf } \frac{ \Vert {\mathcal {A}} \left( Z\right) \Vert _{\ell _1}}{ \Vert Z \Vert _{1}}, \end{aligned}$$
which in the literature is commonly referred to as the minimum conic singular value (see, e.g., [27, 40]). Except for the precise nature of the cone under consideration, this strategy is exactly analogous to a number of works in the recent literature on linear inverse problems [8, 28]. In particular, the following lemma, which summarizes our motivating considerations above, can be seen as a variant of [8, Proposition 2.2].
Lemma 1
Let \({\mathcal {A}}\) be the operator defined in (2.3). Assume that \(y= {\mathcal {A}} \left( x_0 x^*_0 \right) +w \) for some \(x_0 \in {\mathcal {X}}_{\mu }\). Then the minimizer \({\hat{X}}\) of (2.4) satisfies
$$\begin{aligned} \Vert {\hat{X}} - x_0 x^*_0 \Vert _{1} \le \frac{2 \Vert w \Vert _{\ell _1}}{\lambda _{\min } \left( {\mathcal {A}}, {\mathcal {M}}_{\mu } \right) }. \end{aligned}$$
In the following, our goal will be to derive an appropriate lower bound for \(\lambda _{\min } \left( {\mathcal {A}}, {\mathcal {M}}_{\mu } \right) \). One difficulty in the analysis is that not all matrices belonging to \({\mathcal {M}}_{\mu } \) are positive semidefinite; if they were, one could use that for positive semidefinite matrices an approximate \(\ell _1\)-isometry holds (see, e.g., [7, Sect. 3]). While not all matrices in \({\mathcal {M}}_{\mu }\) are positive semidefinite, the following lemma states that each matrix belonging to \({\mathcal {M}}_{\mu } \) possesses at most one negative eigenvalue.
Lemma 2
Suppose that \(Z\in {\mathcal {M}}_{\mu }\). Then Z has at most one strictly negative eigenvalue.
Proof
Let \(Z\in {\mathcal {M}}_{\mu }\). By definition of \( {\mathcal {M}}_{\mu }\) we can find \(x_0 \in {\mathcal {X}}_{\mu }\) and \(t>0 \) such that
$$\begin{aligned} x_0 x^*_0 + tZ \in {\mathcal {S}}^n_{+}. \end{aligned}$$(4.8)
Suppose now by contradiction that Z has two (strictly) negative eigenvalues with corresponding eigenvectors \( z_1, z_2 \in {\mathbb {C}}^n\). Then we can find a vector \(u \in \text {span} \left\{ z_1, z_2 \right\} \backslash \left\{ 0 \right\} \) such that \( \langle u,x_0 \rangle =0 \). This implies that for any \( t >0 \) we have that
$$\begin{aligned} u^* \left( x_0 x^*_0 + tZ \right) u = t \, u^* Z u < 0, \end{aligned}$$
which is a contradiction to (4.8). \(\square \)
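Lemma 2 can be sanity-checked numerically: any \(Z = X - x_0 x_0^*\) with X positive semidefinite is a rank-one downdate of a positive semidefinite matrix, so by eigenvalue interlacing at most the smallest eigenvalue can become negative. A sketch (illustrative; the random instance is our choice):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 8
x0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x0 /= np.linalg.norm(x0)

# A random positive semidefinite X (a feasible point of the program) ...
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = B @ B.conj().T
# ... yields an admissible direction Z = X - x0 x0^*.
Z = X - np.outer(x0, x0.conj())

eigs = np.linalg.eigvalsh(Z)          # eigenvalues in ascending order
# all eigenvalues except possibly the smallest are nonnegative (Lemma 2)
assert np.all(eigs[1:] >= -1e-10)
```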
Recall that for a matrix \(Z \in {\mathcal {S}}^n\) we denoted its eigenvalues by \(\left\{ \lambda _i \left( Z\right) \right\} ^n_{i=1} \) in decreasing order. By the previous lemma it holds that \( \lambda _i \left( Z\right) \ge 0\) for all \( i \in \left[ n-1\right] \) and all \(Z \in {\mathcal {M}}_{\mu } \). For the proof we will partition \({\mathcal {M}}_{\mu }\) into two sets. Namely, for \( \alpha > 0 \) we define
$$\begin{aligned} {\mathcal {M}}_{1,\mu , \alpha }&:= \left\{ Z \in {\mathcal {M}}_{\mu }: \ - \lambda _n \left( Z\right) \le \alpha \sum _{i=1}^{n-1} \lambda _i \left( Z\right) \right\} , \\ {\mathcal {M}}_{2,\mu , \alpha }&:= \left\{ Z \in {\mathcal {M}}_{\mu }: \ - \lambda _n \left( Z\right) > \alpha \sum _{i=1}^{n-1} \lambda _i \left( Z\right) \right\} . \end{aligned}$$
The two sets can be interpreted in the following way. If we had \(\alpha =1 \), it would follow that \(\text {Tr}\,\left( Z\right) < 0 \) for all matrices \(Z\in {\mathcal {M}}_{2,\mu , \alpha } \). In particular, this implies that there is \(x_0 \in {\mathcal {X}}_{\mu }\) such that Z is in the descent cone of the function \(\text {Tr}\,\left( \cdot \right) \) at the point \(x_0 x^*_0 \). Hence, for \( \alpha < 1 \) we can interpret \({\mathcal {M}}_{2,\mu , \alpha } \) as a slightly enlarged union of descent cones. In order to bound \( \underset{Z \in {\mathcal {M}}_{2,\mu , \alpha } }{\inf } \Vert {\mathcal {A}} \left( Z\right) \Vert _{\ell _1} / \Vert Z \Vert _1 \) from below we will rely on the following lemma, which is proven in Sect. 6.
Lemma 3
Assume that one of following two conditions is satisfied for \( \beta \in (0,1] \):
1.
It holds that \( \vert {\mathbb {E}} \left[ \xi _1^2 \right] \vert ^2 \le 1-\beta \). In this case we set \(\mu =1/81\).
2.
In addition to \( \big \vert {\mathbb {E}} \left[ \xi _1^2 \right] \big \vert ^2 \le 1-\beta \), the inequality \({\mathbb {E}} \left[ \vert \xi _1 \vert ^4 \right] \ge 1+\beta \) is fulfilled. In this case we set \( \mu =1 \).
Moreover, assume that
$$\begin{aligned} m \ge C_1 \frac{K^{20}}{\beta ^{5/2}} n. \end{aligned}$$
Then with probability at least \(1- 2\exp \left( \frac{-m\beta ^4}{C_2 K^{16}} \right) \) it holds that
$$\begin{aligned} \underset{Z \in {\mathcal {M}}_{2,\mu , \alpha } \setminus \left\{ 0 \right\} }{\inf } \frac{\Vert {\mathcal {A}} \left( Z\right) \Vert _{\ell _1}}{\Vert Z \Vert _{1}} \ge \frac{m \beta ^{5/2}}{C_3 K^{8}}, \end{aligned}$$
where \( \alpha = 4/5 \). Here \(C_1\), \(C_2\), and \(C_3\) are universal constants.
The proof of Lemma 3 makes use of the fact that the set \( {\mathcal {M}}_{2,\mu , \alpha } \) has low complexity in the sense that the matrices in \( {\mathcal {M}}_{2,\mu , \alpha } \) are approximately low-rank.
In contrast, the set \( {\mathcal {M}}_{1,\mu , \alpha } \) has rather high complexity. For example, note that \({\mathcal {S}}^n_{+} \subset {\mathcal {M}}_{1,\mu , \alpha } \). Nevertheless, the quantity \( \underset{Z\in {\mathcal {M}}_{1,\mu , \alpha } \setminus \left\{ 0 \right\} }{\inf } \Vert {\mathcal {A}} \left( Z\right) \Vert _{\ell _1} / \Vert Z\Vert _{1} \) can be bounded from below, because the measurement matrices \(\xi ^{\left( i\right) } (\xi ^{\left( i\right) })^* \) are positive semidefinite and the matrices in \({\mathcal {M}}_{1,\mu , \alpha } \) also have a dominant positive semidefinite component. This is achieved by the following lemma, whose proof can be found in Sect. 5.
Lemma 4
Let \(0 < \mu \le 1\), \( \alpha >0\), and \( \delta >0\). Assume that
Then with probability at least \(1- {\mathcal {O}} \left( \exp \left( -\frac{m}{C_2K^4} \right) \right) \) for all \(Z\in {\mathcal {M}}_{1,\mu ,\alpha } \) it holds that
Here \(C_1\) and \(C_2\) are absolute constants.
We remark that Lemma 4 would no longer hold if the measurement matrices \(\xi ^{\left( i\right) } (\xi ^{\left( i\right) })^* \) were replaced by symmetric matrices with i.i.d. Gaussian entries (see [37, Proposition 1]).
Having gathered all the necessary ingredients we can prove the main result of this manuscript.
Proof of Theorem 2
Set \(\alpha =4/5\). The proof of the two statements is analogous, except that for the first statement we set \( \mu =1/81 \) whereas for the second statement we set \( \mu =1 \). By Lemma 4 and Assumption (3.2) it follows that with probability at least \(1-{\mathcal {O}} \left( \exp \left( -\frac{m}{CK^4} \right) \right) \)
Furthermore, by Lemma 3 we have with probability at least \(1- 2\exp \left( \frac{-m\beta ^4}{C_2 K^{16}} \right) \) that
holds.
Set \(Z:= {\hat{X}} - x_0 x^*_0\). Note that by definition we have that Z is an admissible direction, i.e., \(Z\in {\mathcal {M}}_{\mu }\). It follows by (4.16), (4.17), and \({\mathcal {M}}_{\mu }= {\mathcal {M}}_{1,\mu , \alpha } \cup {\mathcal {M}}_{2,\mu , \alpha } \) that
where in the last inequality we used (2.7) and \(0<\beta \le 1 \). It follows by Lemma 1 that
which finishes the proof. \(\square \)
4.2 Proof of Theorem 3
The proof of Theorem 3 is in large parts analogous to the proof of Theorem 2. For this reason, we will only highlight the main differences. Replacing \({\mathcal {X}}_{\mu }\) by \( {\mathcal {X}}_{\mu }\cap {\mathbb {R}}^n \) and \( {\mathcal {M}}_{\mu }\) by \({\mathcal {M}}_{\mu }\cap {\mathbb {R}}^{n \times n} \) we can argue analogously to Sect. 4.1 with the only difference that Lemma 3 has to be replaced by the following variant.
Lemma 5
Assume that one of following two conditions is satisfied for \( \mu , \beta \in (0,1] \):
1.
It holds that \(\mu = \frac{1}{81} \) and \( \beta =1 \).
2.
It holds that \({\mathbb {E}} \left[ \vert \xi _1 \vert ^4 \right] \ge 1+\beta \) and \(\mu =1 \).
Moreover, assume that
Then with probability at least \(1- 2\exp \left( \frac{-m\beta ^4}{C_2 K^{16}} \right) \) it holds that
where \( \alpha = 4/5 \). Here \(C_1\), \(C_2\), and \(C_3\) are universal constants.
Lemma 5 can be proven along the same lines as Lemma 3; in the proof of Lemma 3 in Sect. 6 we highlight the necessary modifications.
5 Proof of Lemma 4
Proof
Note that for any \(z\in {\mathbb {C}}^n \) we have that \(\Vert {\mathcal {A}} \left( zz^*\right) \Vert _{\ell _1} = \sum _{i=1}^{m} \vert \langle \xi ^{\left( i\right) } , z \rangle \vert ^2 \). Let \(A\in {\mathbb {C}}^{m\times n}\) be the matrix whose rows are given by \( \left\{ \xi ^{\left( i\right) } \right\} _{i=1}^m \). It follows that \( \Vert {\mathcal {A}} \left( zz^*\right) \Vert _{\ell _1} = \Vert Az \Vert ^2 \). It follows from [42, Theorem 4.6.1] that due to our assumption on m with probability at least \( 1- {\mathcal {O}} \left( \exp \left( -\frac{m}{CK^4} \right) \right) \) for all \(z\in {\mathbb {C}}^n \) it holds that
Due to the observation above this is equivalent to
for all \(z \in {\mathbb {C}}^n \). We will assume in the following that (5.2) holds for all \(z \in {\mathbb {C}}^n\).
Let \(Z\in {\mathcal {M}}_{1,\mu , \alpha }\) with corresponding eigenvalue decomposition \( Z = \sum _{i=1}^{n} \lambda _i v_i v^*_i \). We observe that
By Lemma 2 we know that Z has at most one negative eigenvalue. If all eigenvalues \(\lambda _i \left( Z\right) \) are positive, this inequality chain and inequality (5.2) imply that
which shows (4.15). Now suppose that \(\lambda _n \left( Z\right) <0\). By (5.2) and \( -\lambda _n \left( Z\right) \le \alpha \sum _{i=1}^{n-1} \lambda _i \left( Z\right) \), which is due to \(Z\in {\mathcal {M}}_{1,\mu ,\alpha } \), we obtain that
Again using the relation \( -\lambda _n \left( Z\right) \le \alpha \sum _{i=1}^{n-1} \lambda _i \left( Z\right) \) we can also observe that
Combining (5.9) and (5.10) shows (4.15), which finishes the proof. \(\square \)
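The two-sided estimate (5.2) underlying this proof is a standard subgaussian concentration statement; empirically, \( \frac{1}{m} \Vert {\mathcal {A}} \left( zz^*\right) \Vert _{\ell _1} = \frac{1}{m} \Vert Az \Vert ^2 \approx \Vert z \Vert ^2 \) once \(m \gtrsim n\). A sketch with Rademacher entries as a concrete subgaussian example (dimensions are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 50, 2000
A = rng.choice([-1.0, 1.0], size=(m, n))   # rows play the role of the xi^(i)
z = rng.standard_normal(n)

# ||A(zz^*)||_{l1} = sum_i |<xi^(i), z>|^2 = ||Az||^2, and the average
# (1/m) ||Az||^2 concentrates around ||z||^2:
lhs = np.linalg.norm(A @ z) ** 2 / m
assert abs(lhs / np.linalg.norm(z) ** 2 - 1.0) < 0.3
```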
6 Proof of Lemmas 3 and 5
In order to prove Lemmas 3 and 5 we will use the following version of Mendelson's small ball method [25, 31], a tool for deriving lower bounds for nonnegative empirical processes.
Lemma 6
[14, Lemma 1] Let \({\mathcal {Z}} \subset {\mathcal {S}}^n \) and let \(\xi ^{(1)}, \xi ^{(2)}, \ldots , \xi ^{(m)} \) be i.i.d. random vectors. Let \(u>0\) and \(t>0\) and define
Then, with probability at least \(1-2\exp \left( -2t^2 \right) \), it holds that
where \( \left( \varepsilon _{i} \right) ^m_{i=1} \) are independent, symmetric, \(\left\{ -1,1 \right\} \)-valued random variables that are independent of \( \big ( \xi ^{\left( i\right) } \big ) ^m_{i=1} \).
Our goal is to apply Lemma 6 to \({\mathcal {Z}}= {\mathcal {M}}_{2,\mu , \alpha } \cap \left\{ Z \in {\mathcal {S}}^n: \ \Vert Z \Vert _F =1 \right\} \). The following key lemma shows that matrices in \( {\mathcal {M}}_{2,\mu , \alpha } \) have two favorable properties: They are approximately low-rank, and, provided \(\mu \) is small, their mass with respect to the Frobenius norm is not concentrated on the diagonal. The first property follows directly from the fact that the positive part of the spectrum is dominated by the single negative eigenvalue; the second property requires the spectral flatness of \(x_0\), i.e., that \(\mu \) is small.
Lemma 7
Let \( \alpha >0 \) and \( 0 < \mu \le 1 \). Assume that \(Z \in {\mathcal {M}}_{2,\mu , \alpha }\). Then it holds that
1.
$$\begin{aligned} \Vert Z \Vert _1 \le \left( 1+ \frac{1}{\alpha } \right) \Vert Z \Vert _{\text {HS}}, \end{aligned}$$(6.4)
2.
$$\begin{aligned} \Vert \text {diag}\left( Z \right) \Vert _{HS} \le \left( \sqrt{1 - \frac{1}{ \left( 1 + \alpha ^{-1} \right) ^2 }} + 3\mu \right) \Vert Z \Vert _{HS}. \end{aligned}$$(6.5)
Proof
Let \(Z \in {\mathcal {M}}_{2,\mu , \alpha } \). By definition of \({\mathcal {M}}_{2,\mu , \alpha } \) we have that \( \alpha \sum _{i=1}^{n-1} \lambda _i \left( Z\right) < - \lambda _n \left( Z\right) \), which implies that
$$\begin{aligned} \Vert Z \Vert _1 = \sum _{i=1}^{n-1} \lambda _i \left( Z\right) - \lambda _n \left( Z\right) \le \left( 1+ \frac{1}{\alpha } \right) \left( - \lambda _n \left( Z\right) \right) \le \left( 1+ \frac{1}{\alpha } \right) \Vert Z \Vert _{HS}. \end{aligned}$$(6.6)
This proves inequality (6.4).
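Inequality (6.4) can be sanity-checked on a synthetic matrix whose single negative eigenvalue dominates \(\alpha \) times the positive spectral mass, as in the definition of \( {\mathcal {M}}_{2,\mu , \alpha } \). A sketch (illustrative; the spectrum is an arbitrary choice of ours):

```python
import numpy as np

rng = np.random.default_rng(6)
n, alpha = 10, 4 / 5

# positive eigenvalues whose alpha-weighted sum is dominated by the
# single negative eigenvalue, as required for Z in M_{2,mu,alpha}
pos = rng.uniform(0.0, 1.0, size=n - 1)
neg = -(alpha * pos.sum() + 0.1)
lam = np.concatenate([pos, [neg]])

Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random eigenbasis
Z = Q @ np.diag(lam) @ Q.T

nuc = np.abs(lam).sum()               # Schatten-1 norm of Z
hs = np.linalg.norm(lam)              # Hilbert-Schmidt (Frobenius) norm
assert nuc <= (1 + 1 / alpha) * hs    # inequality (6.4)
```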
In order to prove the second inequality note that by definition of \({\mathcal {M}}_{2,\mu ,\alpha } \subset {\mathcal {M}}_{\mu } \) we can choose \(x_0 \in {\mathcal {X}}_{\mu } \cap S^{n-1} \) such that there exists \(t>0\) with \(x_0x^*_0 + tZ\) positive semidefinite. For this choice of \(x_0\) we can decompose Z uniquely into
$$\begin{aligned} Z = Z_1 + Z_2 = \left( -\lambda x_0 x^*_0 + u x^*_0 + x_0 u^* \right) + Z_2, \end{aligned}$$
where \( \lambda \in {\mathbb {R}} \), \( \langle u,x_0 \rangle =0\), and \(Z_2x_0=0\). We observe that
$$\begin{aligned} \Vert \text {diag}\left( Z \right) \Vert _{HS} \le \Vert \text {diag}\left( Z_1 \right) \Vert _{HS} + \Vert \text {diag}\left( Z_2 \right) \Vert _{HS}. \end{aligned}$$
We will bound the two summands separately. We begin with \( \Vert \text {diag}\left( Z_1 \right) \Vert _{HS} \) and observe that
In the first inequality we used the triangle inequality and in the third line we used that \(\Vert x_0 \Vert _{\infty } \le \mu \Vert x_0 \Vert = \mu \) due to \(x_0 \in {\mathcal {X}}_{\mu }\cap S^{n-1}\). In the fourth line we used that \(\vert \lambda \vert \le \Vert Z_1 \Vert _{HS} \) and \( \Vert u \Vert \le \Vert Z_1 \Vert _{HS} \), which follows from the fact that the summands of \(Z_1= -\lambda x_0 x^*_0 + ux^*_0 + x_0 u^* \) are orthogonal to each other. In the last line we again used that \(\Vert Z_1 \Vert _{HS} \le \Vert Z \Vert _{HS} \) as Z is decomposed orthogonally into \(Z=Z_1+Z_2\).
In order to bound \( \Vert \text {diag}\left( Z_2 \right) \Vert _{HS}\) we note first that \(Z_2\) is positive semidefinite. Indeed, suppose by contradiction that \(Z_2\) is not positive semidefinite. Then there would exist a vector \(v\in {\mathbb {C}}^n\) such that \( \langle v, x_0 \rangle =0\) and \( v^* Z_2 v <0\). Since \( \langle v, x_0 \rangle =0 \) implies \( v^* x_0 x^*_0 v = 0 \) and \( v^* Z_1 v = 0 \), this would yield that \(v^* \left( x_0 x^*_0 + t Z \right) v = t \, v^* Z_2 v <0 \) for all \( t >0\), which is a contradiction to our choice of \(x_0\).
Now let \(w\in {\mathbb {C}}^n\) be the normalized (i.e., \( \Vert w \Vert =1 \)) eigenvector corresponding to the eigenvalue \( \lambda _n \left( Z\right) \). Then we obtain that
where the first inequality follows from the fact that \(Z_2\) is positive semidefinite. Using this observation we obtain that
where in the fourth line we used that \( - \lambda _n \left( Z\right) \ge \frac{1}{1+\alpha ^{-1}} \Vert Z \Vert _{1} \), which is a consequence of the first inequality of (6.6). Combining this estimate with (6.8) and (6.9) shows part (2), which finishes the proof.
\(\square \)
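The nuclear-norm bound (6.4) only uses the eigenvalue condition \( \alpha \sum _{i=1}^{n-1} \lambda _i \left( Z\right) < - \lambda _n \left( Z\right) \), so it can be sanity-checked numerically on synthetic matrices satisfying this condition. The following Python snippet is an illustration only, not part of the formal argument; the construction of test matrices is our own:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 4 / 5  # the value of alpha used later in the proof
n = 8

def random_test_matrix(rng, n, alpha):
    """Random symmetric matrix with one negative eigenvalue -1 whose
    eigenvalues satisfy alpha * sum_{i<n} lambda_i < -lambda_n."""
    pos = rng.uniform(0, 1, size=n - 1)
    pos *= 0.9 / (alpha * pos.sum())   # rescale so alpha * pos.sum() = 0.9 < 1
    eigs = np.concatenate([np.sort(pos)[::-1], [-1.0]])
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal basis
    return Q @ np.diag(eigs) @ Q.T

for _ in range(1000):
    Z = random_test_matrix(rng, n, alpha)
    nuc = np.abs(np.linalg.eigvalsh(Z)).sum()  # nuclear norm ||Z||_1
    fro = np.linalg.norm(Z)                    # Frobenius norm ||Z||_HS
    assert nuc <= (1 + 1 / alpha) * fro + 1e-9
print("inequality (6.4) holds for 1000 random test matrices")
```

Since the positive part of the spectrum is dominated by the single negative eigenvalue, the nuclear norm never exceeds \( \left( 1 + \alpha ^{-1} \right) \) times the Frobenius norm, exactly as the lemma predicts.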
In analogy to [25] we bound \(Q_{{\mathcal {Z}}} \left( 2u\right) \) using the following lemma, whose proof is based on the Paley–Zygmund inequality. A key difference is that we use the Hanson–Wright inequality to control the fourth moment \({\mathbb {E}} \vert \xi ^* A \xi \vert ^4 \) appropriately.
Lemma 8
Let \(A \in {\mathcal {S}}^n\) and let \( \xi = \left( \xi _1, \ldots , \xi _n \right) \) be a random vector with independent and identically distributed entries \( \xi _i \) taking values in \( {\mathbb {C}} \) such that \( {\mathbb {E}} \xi _i = 0 \), \( {\mathbb {E}} \vert \xi _i \vert ^2 =1 \), and \(\Vert \xi _i \Vert _{\psi _2} \le K \). Then we have that
Here \(C>0 \) is an absolute constant.
Proof
Note that by the Paley–Zygmund inequality (see, e.g., [12]) we have that for all \( 0 < t \le {\mathbb {E}}\vert \xi ^* A \xi \vert ^2\)
$$\begin{aligned} {\mathbb {P}} \left( \vert \xi ^* A \xi \vert ^2 \ge t \right) \ge \frac{ \left( {\mathbb {E}} \vert \xi ^* A \xi \vert ^2 - t \right) ^2 }{ {\mathbb {E}} \vert \xi ^* A \xi \vert ^4 }. \end{aligned}$$(6.13)
In particular, setting \(t={\mathbb {E}}\vert \xi ^* A \xi \vert ^2/2 \) yields that
$$\begin{aligned} {\mathbb {P}} \left( \vert \xi ^* A \xi \vert ^2 \ge \frac{{\mathbb {E}} \vert \xi ^* A \xi \vert ^2}{2} \right) \ge \frac{ \left( {\mathbb {E}} \vert \xi ^* A \xi \vert ^2 \right) ^2 }{ 4 \, {\mathbb {E}} \vert \xi ^* A \xi \vert ^4 }. \end{aligned}$$(6.14)
To estimate \({\mathbb {E}} \vert \xi ^* A \xi \vert ^4\) from above we note that the triangle inequality yields that
In order to estimate the first summand we will use that \( \big \vert \xi ^* A \xi - {\mathbb {E}}\left[ \xi ^* A \xi \right] \big \vert \) has a mixed subgaussian/subexponential tail. We can bound the tail probability using the Hanson–Wright inequality (in the version of [36]), which states that there is a numerical constant \(c>0\) such that for all \(t>0\) it holds that
$$\begin{aligned} {\mathbb {P}} \left( \big \vert \xi ^* A \xi - {\mathbb {E}}\left[ \xi ^* A \xi \right] \big \vert > t \right) \le 2 \exp \left( -c \min \left\{ \frac{t^2}{K^4 \Vert A \Vert ^2_{HS}}, \frac{t}{K^2 \Vert A \Vert } \right\} \right) . \end{aligned}$$(6.16)
This yields that
where the third line follows from a change of variables. Combining this inequality chain with (6.15) we obtain that
Inserting this into (6.14) finishes the proof. \(\square \)
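The Paley–Zygmund step can be illustrated numerically. The snippet below is an illustration only; we specialize to real Rademacher entries (for which the subgaussian norm \(K\) is of order one), compute the exact distribution of \(X = \vert \xi ^\top A \xi \vert ^2\) by enumerating all sign vectors, and verify the bound \( {\mathbb {P}} \left( X \ge {\mathbb {E}}X/2 \right) \ge \left( {\mathbb {E}}X \right) ^2 / \left( 4 \, {\mathbb {E}}X^2 \right) \):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
A = (A + A.T) / 2  # symmetric test matrix

# Exact distribution of X = |xi^T A xi|^2 over all 2^n Rademacher vectors.
X = np.array([(np.array(s) @ A @ np.array(s)) ** 2
              for s in itertools.product([-1.0, 1.0], repeat=n)])

EX, EX2 = X.mean(), (X ** 2).mean()
lhs = (X >= EX / 2).mean()       # P(X >= E[X]/2), computed exactly
rhs = 0.25 * EX ** 2 / EX2       # Paley-Zygmund bound with t = E[X]/2
assert lhs >= rhs
print(f"P(X >= EX/2) = {lhs:.3f} >= Paley-Zygmund bound {rhs:.3f}")
```

The point of Lemma 8 is precisely that the right-hand side stays bounded away from zero once the fourth moment is controlled via Hanson–Wright.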
In order to apply Lemma 8 we need a lower bound for \({\mathbb {E}} \left[ \vert \xi ^* A \xi \vert ^2 \right] \). The next lemma computes this quantity.
Lemma 9
Let \( \xi = \left( \xi _1, \ldots , \xi _n \right) \) be a random vector with independent and identically distributed entries \( \xi _i \) taking values in \( {\mathbb {C}} \) such that \( {\mathbb {E}} \xi _i = 0 \) and \( {\mathbb {E}} \vert \xi _i \vert ^2 =1 \). Then for all matrices \( A \in {\mathcal {S}}^n \) it holds that
Proof
First, we observe that
where in the third line we used that \( {\mathbb {E}} \left[ \xi _i \right] = 0 \) and that the entries of \(\xi \) are independent, which implies that there are no summands where one index appears exactly three times. The first summand can be computed by
where we have used that \(A_{i,i} = \overline{ A_{i,i} } \) for all \( i \in \left[ n\right] \) and \( {\mathbb {E}} \left[ \vert \xi _i \vert ^2 \right] =1\). The second summand can be computed by
For equation (a) we used the observation that
By summing up (I) and (II) we obtain equality (6.22).
\(\square \)
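For real Rademacher entries, where \( {\mathbb {E}} \xi _i^2 = {\mathbb {E}} \vert \xi _i \vert ^4 = 1 \), the computation of Lemma 9 specializes to the classical identity \( {\mathbb {E}} \left( \xi ^\top A \xi \right) ^2 = \left( \text {tr} \, A \right) ^2 + 2 \Vert A - \text {diag}\left( A \right) \Vert ^2_{HS} \) for real symmetric A. This special case can be checked exactly by enumeration; the snippet is an illustration of the lemma, not part of its proof:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
A = (A + A.T) / 2  # real symmetric test matrix

# Exact E[(xi^T A xi)^2] over all 2^n Rademacher sign vectors.
moment = np.mean([(np.array(s) @ A @ np.array(s)) ** 2
                  for s in itertools.product([-1.0, 1.0], repeat=n)])

off = A - np.diag(np.diag(A))  # off-diagonal part of A
closed_form = np.trace(A) ** 2 + 2 * np.linalg.norm(off) ** 2
assert abs(moment - closed_form) < 1e-8
print("E[(xi^T A xi)^2] = (tr A)^2 + 2||A - diag(A)||_HS^2 verified")
```

Only index patterns in which every factor appears an even number of times survive the expectation, which is exactly the observation used in the proof above.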
The lemmas above allow us to derive a lower bound for \(Q_{{\mathcal {Z}}} \left( 2u\right) \) in Lemma 6. We still need an upper bound for the Rademacher complexity \({\mathbb {E}} \left[ \underset{Z \in {\mathcal {Z}} }{\sup } \Big \vert \sum _{i=1}^{m}\varepsilon _i \left\langle \xi _i \xi ^*_i, Z \right\rangle _{HS} \Big \vert \right] \). The next lemma provides such a bound. A version of this lemma has already been presented in [28]; nevertheless, we include a proof for completeness.
Lemma 10
Assume that \(m \ge C_1n\). Let \(\alpha >0 \) and \(0 < \mu \le 1 \), and set \({\mathcal {Z}}:= {\mathcal {M}}_{1,\mu , \alpha } \cap \left\{ Z \in {\mathcal {S}}^n: \ \Vert Z \Vert _{HS}=1 \right\} \). Then we have that
Here \(C_1\) and \(C_2\) are absolute constants.
Proof
First, we note that by Hölder’s inequality and Lemma 7 we obtain that
To bound \( {\mathbb {E}} \left[ \Big \Vert \sum ^m_{i=1} \varepsilon _i \xi ^{\left( i\right) } \left( \xi ^{\left( i\right) }\right) ^* \Big \Vert \right] \), let \( {\mathcal {N}} \) be a \(\frac{1}{4}\)-covering of the unit sphere \(S^{n-1} \subset {\mathbb {R}}^n\) with respect to the Euclidean norm such that
By [41, Lemma 5.4] we have that
Fix \( x \in {\mathcal {N}} \) and observe that
where we have set \(z_i := \varepsilon _i \vert \langle \xi _i ,x \rangle \vert ^2 \). We observe that \( {\mathbb {E}} \left[ z_i\right] = 0 \) and, moreover,
where the first equality follows directly from the definition of the \(\Vert \cdot \Vert _{\psi _1} \)-norm. The first inequality can be seen using [42, Lemma 2.7.6] and the second one using [42, Lemma 3.4.2]. By the Bernstein inequality (see, e.g., [42, Theorem 2.8.1]) we obtain that
where \( c>0 \) is some numerical constant. It follows from (6.40), (6.41), (6.43), and a union bound that
where \({\tilde{c}} = \log 12 \). Then, whenever \( m \ge \frac{{\tilde{c}}}{c} n \), we obtain that
In order to finish we need to estimate the two integrals. By a change of variables and [19, Lemma C.7] we obtain that
Inserting this in the inequality chain above yields that
Combined with inequality (6.39) this finishes the proof. \(\square \)
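The operator-norm quantity at the heart of this proof, \( {\mathbb {E}} \Big \Vert \sum ^m_{i=1} \varepsilon _i \xi ^{\left( i\right) } \left( \xi ^{\left( i\right) }\right) ^* \Big \Vert \), grows only like \(\sqrt{mn}\) up to logarithmic factors rather than linearly in m, because the Rademacher signs cause cancellations. The following Monte Carlo experiment is our own illustration of this scaling, using Gaussian measurement vectors (one admissible subgaussian distribution) for simplicity:

```python
import numpy as np

rng = np.random.default_rng(0)

def opnorm_sum(n, m, trials=50):
    """Monte Carlo estimate of E || sum_i eps_i xi_i xi_i^T || for
    i.i.d. Rademacher signs eps_i and standard Gaussian vectors xi_i."""
    norms = []
    for _ in range(trials):
        Xi = rng.standard_normal((m, n))       # rows are the xi_i
        eps = rng.choice([-1.0, 1.0], size=m)  # Rademacher signs
        M = (Xi * eps[:, None]).T @ Xi         # = sum_i eps_i xi_i xi_i^T
        norms.append(np.linalg.norm(M, 2))     # spectral norm
    return float(np.mean(norms))

n = 20
e1, e2 = opnorm_sum(n, 200), opnorm_sum(n, 800)
ratio = e2 / e1
# Quadrupling m roughly doubles the expected norm (sqrt(m)-type growth),
# far below the 4x growth that a bound linear in m would predict.
assert 1.5 < ratio < 3.0
print(f"m: 200 -> 800 grows the norm by a factor of {ratio:.2f}")
```

This square-root growth in m is what ultimately makes the Rademacher complexity term negligible compared to the small-ball term once \(m \gtrsim n\).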
Now we have gathered all the ingredients to complete the proof.
Proof of Lemmas 3 and 5 We will start by showing that \({\mathbb {E}} \left[ \vert \left\langle \xi \xi ^*, Z \right\rangle _{HS} \vert ^2 \right] \gtrsim \beta \Vert Z \Vert ^2_{HS} \) for all \( Z \in {\mathcal {M}}_{2,\mu , \alpha } \) in the case of Lemma 3, or for all \( Z \in {\mathcal {M}}_{2,\mu , \alpha } \cap {\mathbb {R}}^{n \times n} \) in the case of Lemma 5, respectively.
We first consider the second case and assume that the condition \({\mathbb {E}} \left[ \vert \xi _i \vert ^4 \right] \ge 1+ \beta \) is satisfied for some \( \beta >0 \). By Lemma 9 we obtain that for all \(Z \in {\mathcal {M}}_{2,\mu ,\alpha } \) under the conditions of Lemma 3
Under the assumptions of Lemma 5 we observe that \(\sum _{i \ne j} \text {Im} \left( Z_{i,j} \right) ^2=0 \) and \(\sum _{i \ne j} \text {Re} \left( Z_{i,j} \right) ^2 = \Vert Z - \text {diag}\left( Z \right) \Vert ^2_{HS} \). Hence, a similar argument as before also leads to
Under the first assumption we obtain by Lemma 9 that for all \(Z \in {\mathcal {M}}_{2,\mu , \alpha } \)
Similarly, under the assumptions of Lemma 5 we can again use that \(\sum _{i \ne j} \text {Im} \left( Z_{i,j} \right) ^2=0 \) and \(\sum _{i \ne j} \text {Re} \left( Z_{i,j} \right) ^2 = \Vert Z - \text {diag}\left( Z \right) \Vert ^2_{HS} \) to obtain by an analogous argument that
The remainder of the proof will be the same for Lemmas 3 and 5. By Lemma 7 we have that
By the triangle inequality it follows that
for \({\tilde{C}}=100\). Inserting into (6.57) one obtains that
Hence, we have shown in all cases that \({\mathbb {E}} \left[ \vert \left\langle \xi \xi ^*, Z \right\rangle _{HS} \vert ^2 \right] \ge \frac{\beta }{{\tilde{C}}^2} \Vert Z \Vert ^2_{HS} \).
By Lemma 8 it follows that for all \(Z\in {\mathcal {M}}_{2,\mu , \alpha } \)
Note that for all \(Z \in {\mathcal {M}}_{2,\mu , \alpha } \)
where in the last inequality we also used that \( \alpha = \frac{4}{5} \). This shows that for all \(Z \in {\mathcal {M}}_{2,\mu , \alpha } \) it holds that
where we used that \( K \gtrsim 1 \) due to (2.7). Now recall that \( {\mathcal {Z}}:= {\mathcal {M}}_{2,\mu , \alpha } \cap \left\{ Z\in {\mathcal {S}}^n: \ \Vert Z \Vert _{HS} =1 \right\} \). Thus we have shown that
where \(Q_{{\mathcal {Z}}} \left( \cdot \right) \) is defined in (6.1). We used that \( u= \frac{ \sqrt{\beta }}{2\sqrt{2} {\tilde{C}}} \) and \(C''>0\) is a constant chosen large enough. From Lemma 10 it follows that
Combining this inequality with our choice of u and choosing the constant in assumption (4.12) large enough it follows that
Applying Lemma 6 yields that with probability at least \(1-2\exp \left( -2t^2\right) \)
Setting \(t= \frac{\sqrt{m} \beta ^2 }{4 C'' K^8} \) it follows that with probability at least \(1- 2 \exp \left( \frac{-m\beta ^4}{8 (C'')^2 K^{16}} \right) \) it holds that
Hence, by the definition of \({\mathcal {A}}\) and \({\mathcal {Z}}\) it follows that
Due to \(\alpha =\frac{4}{5} \) and Lemma 7 we have that \(\Vert Z \Vert _{1} \le \frac{9}{4} \Vert Z \Vert _{HS} \) for all \(Z \in {\mathcal {M}}_{2,\mu , \alpha } \). Combined with (6.75) this shows (4.13). \(\square \)
References
Bahmani, S., Romberg, J.: A flexible convex relaxation for phase retrieval. Electron. J. Stat. 11(2), 5254–5281 (2017)
Bahmani, S., Romberg, J.: Phase retrieval meets statistical learning theory: a flexible convex relaxation. In: International Conference on Artificial Intelligence and Statistics, vol. 54, pp. 252–260 (2017)
Bahmani, S., Romberg, J.: Anchored regression: solving random convex equations via convex programming. Found. Comput. Math. (to appear)
Candès, E.J., Li, X.: Solving quadratic equations via PhaseLift when there are about as many equations as unknowns. Found. Comput. Math. 14(5), 1017–1026 (2014)
Candès, E.J., Li, X., Soltanolkotabi, M.: Phase retrieval from coded diffraction patterns. Appl. Comput. Harmon. Anal. 39(2), 277–299 (2015)
Candès, E.J., Li, X., Soltanolkotabi, M.: Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory 61(4), 1985–2007 (2015)
Candès, E.J., Strohmer, T., Voroninski, V.: PhaseLift: exact and stable signal recovery from magnitude measurements via convex programming. Commun. Pure Appl. Math. 66(8), 1241–1274 (2013)
Chandrasekaran, V., Recht, B., Parrilo, P.A., Willsky, A.S.: The convex geometry of linear inverse problems. Found. Comput. Math. 12(6), 805–849 (2012)
Chen, Y., Candès, E.J.: Solving random quadratic systems of equations is nearly as easy as solving linear systems. Commun. Pure Appl. Math. 70(5), 822–883 (2017)
Chen, Y., Chi, Y., Fan, J., Ma, C.: Gradient descent with random initialization: fast global convergence for nonconvex phase retrieval. Math. Program. (to appear)
Chen, Y., Chi, Y., Goldsmith, A.J.: Exact and stable covariance estimation from quadratic sampling via convex programming. IEEE Trans. Inf. Theory 61(7), 4034–4059 (2015)
de la Peña, V., Giné, E.: Decoupling, from Dependence to Independence, Randomly Stopped Processes, \(U\)-Statistics and Processes, Martingales and Beyond. Springer, New York (1998)
Demanet, L., Hand, P.: Stable optimizationless recovery from phaseless linear measurements. J. Fourier Anal. Appl. 20(1), 199–221 (2014)
Dirksen, S., Lecué, G., Rauhut, H.: On the gap between restricted isometry properties and sparse recovery conditions. IEEE Trans. Inform. Theory 64(8), 5478–5487 (2018)
Dudeja, R., Bakhshizadeh, M., Ma, J., Maleki, A.: Analysis of spectral methods for phase retrieval with random orthogonal matrices. IEEE Trans. Inf. Theory (2020)
Eldar, Y.C., Mendelson, S.: Phase retrieval: stability and recovery guarantees. Appl. Comput. Harmon. Anal. 36(3), 473–494 (2014)
Fannjiang, A.: Absolute uniqueness of phase retrieval with random illumination. Inverse Probl. 28(7), 20 (2012)
Fienup, C., Dainty, J.: Phase retrieval and image reconstruction for astronomy. Image Recov. Theory Appl. 231, 275 (1987)
Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing, vol. 1. Birkhäuser, Basel (2013)
Goldstein, T., Studer, C.: PhaseMax: convex phase retrieval via basis pursuit. IEEE Trans. Inf. Theory 64(4), 2675–2689 (2018)
Gross, D., Krahmer, F., Kueng, R.: A partial derandomization of PhaseLift using spherical designs. J. Fourier Anal. Appl. 21(2), 229–266 (2015)
Gross, D., Krahmer, F., Kueng, R.: Improved recovery guarantees for phase retrieval from coded diffraction patterns. Appl. Comput. Harmon. Anal. 42(1), 37–64 (2017)
Harrison, R.W.: Phase problem in crystallography. JOSA a 10(5), 1046–1055 (1993)
Kabanava, M., Kueng, R., Rauhut, H., Terstiege, U.: Stable low-rank matrix recovery via null space properties. Inf. Inference 5(4), 405–441 (2016)
Koltchinskii, V., Mendelson, S.: Bounding the smallest singular value of a random matrix without concentration. Int. Math. Res. Not. IMRN 2015(23), 12991–13008 (2015)
Krahmer, F., Liu, Y.-K.: Phase retrieval without small-ball probability assumptions. IEEE Trans. Inform. Theory 64(1), 485–500 (2018)
Krahmer, F., Stöger, D.: On the convex geometry of blind deconvolution and matrix completion. arXiv:1902.11156 (2019)
Kueng, R., Rauhut, H., Terstiege, U.: Low rank matrix recovery from rank one measurements. Appl. Comput. Harmon. Anal. 42(1), 88–116 (2017)
Lu, Y.M., Li, G.: Phase transitions of spectral initialization for high-dimensional non-convex estimation. Inform. Inference 9, 507–541 (2017)
Luo, W., Alghamdi, W., Lu, Y.M.: Optimal spectral initialization for signal recovery with applications to phase retrieval. IEEE Trans. Signal Process. 67(9), 2347–2356 (2019)
Mendelson, S.: Learning without concentration. In: Conference on Learning Theory, pp. 25–39 (2014)
Millane, R.P.: Phase retrieval in crystallography and optics. JOSA A 7(3), 394–411 (1990)
Mondelli, M., Montanari, A.: Fundamental limits of weak recovery with applications to phase retrieval. Found. Comput. Math. 19(3), 703–773 (2019)
Netrapalli, P., Jain, P., Sanghavi, S.: Phase retrieval using alternating minimization. In: Advances in Neural Information Processing Systems, pp. 2796–2804 (2013)
Rodenburg, J.M.: Ptychography and related diffractive imaging methods. Adv. Imaging Electron. Phys. 150, 87–184 (2008)
Rudelson, M., Vershynin, R.: Hanson-Wright inequality and sub-Gaussian concentration. Electron. Commun. Probab. 18, 9 (2013)
Slawski, M., Li, P., Hein, M.: Regularization-free estimation in trace regression with symmetric positive semidefinite matrices. In: Advances in Neural Information Processing Systems, pp. 2782–2790 (2015)
Soltanolkotabi, M.: Structured signal recovery from quadratic measurements: breaking sample complexity barriers via nonconvex optimization. IEEE Trans. Inf. Theory 65(4), 2374–2400 (2019)
Sun, J., Qu, Q., Wright, J.: A geometric analysis of phase retrieval. Found. Comput. Math. 18(5), 1131–1198 (2018)
Tropp, J.A.: Convex recovery of a structured signal from independent random linear measurements. In: Sampling Theory, a Renaissance. Compressive Sensing and Other Developments, pp. 67–101. Birkhäuser/Springer, Cham (2015)
Vershynin, R.: Introduction to the Non-asymptotic Analysis of Random Matrices, pp. 210–268. Cambridge University Press, Cambridge (2012)
Vershynin, R.: High-Dimensional Probability. An Introduction with Applications in Data Science. Camb. Ser. Stat. Probab. Math., vol. 47 (2018)
Waldspurger, I.: Phase retrieval with random Gaussian sensing vectors by alternating projections. IEEE Trans. Inf. Theory 64(5), 3301–3312 (2018)
Acknowledgements
This work has been supported by the German Science Foundation (DFG) in the context of the joint project Bilinear Compressed Sensing (KR 4512/2-1) as part of the Priority Program 1798 as well as the Emmy Noether Junior Research Group Randomized Sensing of Signals and Images (KR 4512/1-1). Furthermore, the authors want to thank Peter Jung for inspiring discussions.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Communicated by Holger Rauhut.
Krahmer, F., Stöger, D. Complex Phase Retrieval from Subgaussian Measurements. J Fourier Anal Appl 26, 89 (2020). https://doi.org/10.1007/s00041-020-09797-9