
Fast Rank-One Alternating Minimization Algorithm for Phase Retrieval


Abstract

The phase retrieval problem is fundamental in many fields. The goal is to recover a signal vector \({\tilde{{\mathbf {x}}}}\in {\mathbb {C}}^d\) from a set of N measurements \(b_n=|{\mathbf {f}}^*_n{\tilde{{\mathbf {x}}}}|^2,\ n=1,\ldots , N\), where \(\{{\mathbf {f}}_n\}_{n=1}^N\) forms a frame of \({\mathbb {C}}^d\). Existing algorithms typically fit the measurements in a least squares sense, which leads to the minimization of a quartic polynomial. In this paper, we employ a different strategy: we split the variables and solve a bi-variate optimization problem that is quadratic in each of the variables. We propose an alternating gradient descent algorithm and prove its convergence from any initialization. Because the smaller Hessian of each subproblem allows a larger step size, the alternating gradient descent algorithm converges faster than gradient descent (the Wirtinger flow algorithm) applied to the quartic objective without splitting the variables. Numerical results illustrate that the proposed algorithm needs fewer iterations than Wirtinger flow to achieve the same accuracy.
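As an informal illustration of the splitting strategy, the following Python sketch implements one plausible instantiation of the idea. It assumes the split objective \(E({\mathbf {x}},{\mathbf {y}})=\frac{1}{N}\sum _{n=1}^N|({\mathbf {f}}_n^*{\mathbf {x}})\overline{({\mathbf {f}}_n^*{\mathbf {y}})}-b_n|^2+\lambda \Vert {\mathbf {x}}-{\mathbf {y}}\Vert _2^2\); this form and its normalization are assumptions for illustration, and the paper's actual objective (3.2) may differ, so this is a sketch of the general scheme rather than the authors' implementation.

```python
import numpy as np

def grad_x(F, b, x, y, lam):
    # Wirtinger gradient of the assumed split objective with respect to x.
    # F is N x d with rows f_n^*; b holds the intensity measurements b_n.
    Fx, Fy = F @ x, F @ y
    r = Fx * np.conj(Fy) - b                      # residuals of the split model
    return (F.conj().T @ (r * Fy)) / len(b) + lam * (x - y)

def alternating_gd(F, b, x0, lam=0.0, iters=200):
    # Alternate one gradient step in x (with y fixed) and one in y (with x fixed).
    x, y = x0.copy(), x0.copy()
    for _ in range(iters):
        w = np.abs(F @ y) ** 2                    # |f_n^* y|^2
        H = (F.conj().T * w) @ F / len(b)         # Hessian block of the x-subproblem
        alpha = 1.0 / np.linalg.norm(H, 2)        # conservative step (appendix derives 2/||H|| for lam = 0)
        x = x - alpha * grad_x(F, b, x, y, lam)
        y = y - alpha * grad_x(F, b, y, x, lam)   # by symmetry of the assumed objective (step reused for brevity)
    return x, y
```

Both copies of the variable start from the same initial guess, so the regime \({\mathbf {x}}_k\approx {\mathbf {y}}_k\) analyzed in the appendix below holds at least initially.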



Acknowledgements

The authors would like to thank Emmanuel Candès, Mo Mu and Aditya Viswanathan for very helpful discussions.

Author information

Corresponding author

Correspondence to Haixia Liu.

Additional information

JFC was supported in part by HKRGC Grant 16300616. Part of the research work of YW was completed while the author was at the Department of Mathematics, Michigan State University. This research was supported in part by National Science Foundation Grant DMS-1043032, AFOSR Grant FA9550-12-1-0455, and HKRGC Grant 16306415.

Larger step size

In this appendix, we demonstrate that, when \({\mathbf {x}}\) and \({\mathbf {y}}\) are sufficiently close, our alternating gradient descent algorithm is roughly 1.5 times faster than the WF algorithm in the real case.

To this end, we let \(E({\mathbf {x}},{\mathbf {y}})\) be the function defined in (3.2). Choose \(\lambda =0\) and assume \({\mathbf {x}}_k\approx {\mathbf {y}}_k\). Then, by Taylor’s expansion, we obtain

$$\begin{aligned} \begin{aligned}&E({\mathbf {x}}_{k+1},{\mathbf {y}}_k)\\&\quad \approx E({\mathbf {x}}_{k},{\mathbf {y}}_k)+\nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)^*\left[ \begin{matrix}{\mathbf {x}}_{k+1}-{\mathbf {x}}_k\\ \overline{{\mathbf {x}}_{k+1}-{\mathbf {x}}_k}\end{matrix}\right] +\frac{1}{2} \left[ \begin{matrix}{\mathbf {x}}_{k+1}-{\mathbf {x}}_k\\ \overline{{\mathbf {x}}_{k+1}-{\mathbf {x}}_k}\end{matrix}\right] ^*\nabla _{{\mathbf {x}}}^2E({\mathbf {x}}_k,{\mathbf {y}}_k) \left[ \begin{matrix}{\mathbf {x}}_{k+1}-{\mathbf {x}}_k\\ \overline{{\mathbf {x}}_{k+1}-{\mathbf {x}}_k}\end{matrix}\right] \\&\quad =E({\mathbf {x}}_{k},{\mathbf {y}}_k)-\alpha _k\left\| \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)\right\| _2^2+\frac{\alpha ^2_k}{2}\mathfrak {R}\left( \nabla _{{\mathbf {x}}}E({\mathbf {x}}_k,{\mathbf {y}}_k)^*\left( \sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\right) \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)\right) \\&\quad =E({\mathbf {x}}_{k},{\mathbf {y}}_k)-\alpha _k\left\| \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)\right\| _2^2 +\frac{\alpha _k^2}{2}\nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)^*\left( \sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\right) \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)\\&\quad =E({\mathbf {x}}_{k},{\mathbf {y}}_k)-\alpha _k\left( \left\| \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)\right\| _2^2 -\frac{\alpha _k}{2}\nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)^*\left( \sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\right) \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)\right) \end{aligned} \end{aligned}$$
(A.1)

The real part can be dropped because \(\sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\) is Hermitian, so the quadratic form is real. Since \(\alpha _k>0\), it follows that \(E({\mathbf {x}}_{k+1},{\mathbf {y}}_k)-E({\mathbf {x}}_{k},{\mathbf {y}}_k)\le 0\) as long as

$$\begin{aligned} \frac{2}{\alpha _k}\ge \frac{\nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)^*\left( \sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\right) \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)}{\left\| \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)\right\| _2^2}, \end{aligned}$$

which is guaranteed if

$$\begin{aligned} \alpha _k\le \frac{2}{\Vert \sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\Vert _2}. \end{aligned}$$
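As a hedged numerical illustration (assuming the vectors \({\mathbf {f}}_n^*\) are stored as the rows of a matrix F, and ignoring any \(1/N\) normalization that E may carry), this bound can be evaluated as follows; note that \({\mathbf {f}}_n^*{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n=|{\mathbf {f}}_n^*{\mathbf {y}}_k|^2\), so the matrix in the bound is \(\sum _n|{\mathbf {f}}_n^*{\mathbf {y}}_k|^2{\mathbf {f}}_n{\mathbf {f}}_n^*\).

```python
import numpy as np

def step_size_bound(F, y):
    # Conservative bound alpha_k <= 2 / || sum_n |f_n^* y|^2 f_n f_n^* ||_2.
    # F is N x d with rows f_n^*.  Any normalization of E (e.g. a 1/N factor)
    # would rescale this matrix, and hence the bound, accordingly.
    w = np.abs(F @ y) ** 2                      # |f_n^* y|^2
    H = (F.conj().T * w) @ F                    # sum_n |f_n^* y|^2 f_n f_n^*
    return 2.0 / np.linalg.norm(H, 2)           # spectral norm
```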

To minimize \(E({\mathbf {x}}_{k+1},{\mathbf {y}}_k)-E({\mathbf {x}}_{k},{\mathbf {y}}_k)\), it is easily seen from (A.1) that the optimal choice of \(\alpha _k\) is

$$\begin{aligned} \alpha _k=\frac{\left\| \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)\right\| _2^2}{\nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)^*\left( \sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\right) \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)}. \end{aligned}$$

In this case,

$$\begin{aligned} E({\mathbf {x}}_{k+1},{\mathbf {y}}_k)-E({\mathbf {x}}_{k},{\mathbf {y}}_k) \approx -\frac{\left\| \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)\right\| _2^4}{2\nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)^*\left( \sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\right) \nabla _{{\mathbf {x}}}E({\mathbf {x}}_{k},{\mathbf {y}}_k)}. \end{aligned}$$
(A.2)
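Continuing the hedged sketch above (same matrix F with rows \({\mathbf {f}}_n^*\)), the optimal \(\alpha _k\) can be evaluated matrix-free; here g stands for the current gradient \(\nabla _{{\mathbf {x}}}E({\mathbf {x}}_k,{\mathbf {y}}_k)\), under whatever normalization convention is used for E.

```python
import numpy as np

def optimal_alpha(F, y, g):
    # alpha_k = ||g||_2^2 / ( g^* ( sum_n |f_n^* y|^2 f_n f_n^* ) g ), applied
    # matrix-free; g is grad_x E(x_k, y_k).  The matrix should carry the same
    # normalization (e.g. 1/N) as the gradient convention used for g.
    w = np.abs(F @ y) ** 2                       # |f_n^* y|^2
    Hg = F.conj().T @ (w * (F @ g))              # (sum_n |f_n^* y|^2 f_n f_n^*) g
    denom = np.real(np.vdot(g, Hg))              # real since the matrix is Hermitian
    return np.vdot(g, g).real / denom
```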

Now consider the WF algorithm, which minimizes \(G({\mathbf {x}})=\frac{1}{N}\sum ^N_{n=1}(|{\mathbf {f}}^*_n{\mathbf {x}}|^2-b_n)^2\). Assume we start from the same \({\mathbf {x}}_k\) as in the alternating gradient descent algorithm; the WF algorithm generates the next iterate by

$$\begin{aligned} {\mathbf {x}}_{k+1}={\mathbf {x}}_k-\delta _k\nabla _{{\mathbf {x}}}G({\mathbf {x}}_k). \end{aligned}$$
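For comparison, a minimal sketch of this update is given below. It assumes one common Wirtinger-gradient convention, \(\nabla G({\mathbf {x}})=\frac{2}{N}\sum _n(|{\mathbf {f}}_n^*{\mathbf {x}}|^2-b_n){\mathbf {f}}_n{\mathbf {f}}_n^*{\mathbf {x}}\), which may differ from the paper's scaling by a constant factor.

```python
import numpy as np

def wf_gradient(F, b, x):
    # Wirtinger gradient of G(x) = (1/N) sum_n (|f_n^* x|^2 - b_n)^2 under the
    # convention grad G = (2/N) sum_n (|f_n^* x|^2 - b_n) f_n f_n^* x.
    Fx = F @ x
    r = np.abs(Fx) ** 2 - b
    return 2.0 * (F.conj().T @ (r * Fx)) / len(b)

def wf_step(F, b, x, delta):
    # One Wirtinger flow iterate x_{k+1} = x_k - delta * grad G(x_k).
    return x - delta * wf_gradient(F, b, x)
```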

With the optimal choice of \(\delta _k\), an analogous analysis leads to

$$\begin{aligned}&G({\mathbf {x}}_{k+1})-G({\mathbf {x}}_{k}) \nonumber \\&\quad \approx -\frac{1}{2}\frac{\left\| \nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)\right\| _2^4}{\mathfrak {R}\left( \nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)^*H_{11}({\mathbf {x}}_k)\nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)\right) +\mathfrak {R}\left( \nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)^TH_{21}({\mathbf {x}}_k)\nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)\right) }, \end{aligned}$$
(A.3)

where \(\mathfrak {R}\) denotes the real part, and

$$\begin{aligned}&H_{11}({\mathbf {x}}_k) =4\sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {x}}_k{\mathbf {x}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n,\\&H_{21}({\mathbf {x}}_k)=\sum _n\left( \overline{{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {x}}_k}{\mathbf {x}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n+\overline{{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {x}}_k}{\mathbf {x}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\right) . \end{aligned}$$

Since we assumed \({\mathbf {x}}_k\approx {\mathbf {y}}_k\) and \(\lambda =0\),

$$\begin{aligned} \nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)\approx 2\nabla _{{\mathbf {y}}}E({\mathbf {x}}_k,{\mathbf {y}}_k), \end{aligned}$$
(A.4)

which implies

$$\begin{aligned} \begin{aligned}&\mathfrak {R}\left( \nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)^*H_{11}({\mathbf {x}}_k)\nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)\right) \\&\quad =\nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)^*H_{11}({\mathbf {x}}_k)\nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)\\&\quad \approx \left( 2\nabla _{{\mathbf {y}}}E({\mathbf {x}}_k,{\mathbf {y}}_k)\right) ^*\left( 4\sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {x}}_k{\mathbf {x}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\right) \left( 2\nabla _{{\mathbf {y}}}E({\mathbf {x}}_k,{\mathbf {y}}_k)\right) \\&\quad \approx 16\cdot \left( \nabla _{{\mathbf {y}}}E({\mathbf {x}}_k,{\mathbf {y}}_k)\right) ^*\left( \sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\right) \nabla _{{\mathbf {y}}}E({\mathbf {x}}_k,{\mathbf {y}}_k). \end{aligned} \end{aligned}$$
(A.5)

If we further assume all vectors involved are real, then we have \(H_{21}({\mathbf {x}}_k)=2\sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {x}}_k{\mathbf {x}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\) and

$$\begin{aligned} \begin{aligned} \mathfrak {R}\left( \nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)^TH_{21}({\mathbf {x}}_k)\nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)\right)&=\nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)^*H_{21}({\mathbf {x}}_k)\nabla _{{\mathbf {x}}}G({\mathbf {x}}_k)\\&\approx 8\cdot \nabla _{{\mathbf {y}}}E({\mathbf {x}}_k,{\mathbf {y}}_k)^*\left( \sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\right) \nabla _{{\mathbf {y}}}E({\mathbf {x}}_k,{\mathbf {y}}_k). \end{aligned} \end{aligned}$$
(A.6)

Substituting (A.4), (A.5), and (A.6) into (A.3), we get

$$\begin{aligned} \left( G({\mathbf {x}}_{k+1})-G({\mathbf {x}}_k)\right) \approx \frac{2}{3}\left( E({\mathbf {x}}_{k+1},{\mathbf {y}}_k)-E({\mathbf {x}}_k,{\mathbf {y}}_k)\right) \end{aligned}$$

This means that, in terms of the per-iteration decrease of the objective, the alternating gradient descent algorithm is roughly 1.5 times faster than Wirtinger flow.
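The bookkeeping behind the 2/3 factor can be made explicit by transcribing (A.2) and (A.3) directly under the appendix's approximations: with \({\mathbf {g}}=\nabla _{{\mathbf {x}}}E({\mathbf {x}}_k,{\mathbf {y}}_k)\), \(M=\sum _n{\mathbf {f}}_n{\mathbf {f}}^*_n{\mathbf {y}}_k{\mathbf {y}}_k^*{\mathbf {f}}_n{\mathbf {f}}^*_n\), \(\nabla G\approx 2{\mathbf {g}}\) from (A.4), \(H_{11}\approx 4M\), and \(H_{21}\approx 2M\), the ratio of the two predicted decreases is 2/3 for any \({\mathbf {g}}\) and Hermitian positive definite \(M\). The sketch below simply evaluates the two formulas under these assumptions; it is an arithmetic check, not the paper's code.

```python
import numpy as np

def predicted_decreases(M, g):
    # Direct transcription of (A.2) and (A.3) with x_k ~ y_k in the real case,
    # using grad G ~ 2 g (A.4), H11 ~ 4 M and H21 ~ 2 M from the appendix.
    quad = lambda v: np.real(np.vdot(v, M @ v))            # v^* M v, real since M is Hermitian
    dec_E = -0.5 * np.vdot(g, g).real ** 2 / quad(g)       # (A.2)
    gG = 2.0 * g                                           # (A.4)
    dec_G = -0.5 * np.vdot(gG, gG).real ** 2 / (4 * quad(gG) + 2 * quad(gG))  # (A.3)
    return dec_E, dec_G                                    # dec_G / dec_E equals 2/3
```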

Cite this article

Cai, JF., Liu, H. & Wang, Y. Fast Rank-One Alternating Minimization Algorithm for Phase Retrieval. J Sci Comput 79, 128–147 (2019). https://doi.org/10.1007/s10915-018-0857-9
