Optimization Methods for Inverse Problems

  • Nan Ye
  • Farbod Roosta-Khorasani
  • Tiangang Cui
Part of the MATRIX Book Series (MXBS, volume 2)


Optimization plays an important role in solving many inverse problems. Indeed, the task of inversion often either involves or is fully cast as the solution of an optimization problem. In this light, the non-linear, non-convex, and large-scale nature of many of these inversions gives rise to some very challenging optimization problems. The inverse problem community has long been developing various techniques for solving such optimization tasks. However, other, seemingly disjoint communities, such as that of machine learning, have developed, almost in parallel, interesting alternative methods which may have stayed under the radar of the inverse problem community. In this survey, we aim to change that. In doing so, we first discuss the current state-of-the-art optimization methods widely used in inverse problems. We then survey recent related advances in addressing similar challenges in problems faced by the machine learning community, and discuss their potential advantages for solving inverse problems. By highlighting the similarities among the optimization challenges faced by the inverse problem and machine learning communities, we hope that this survey can serve as a bridge between these two communities and encourage cross-fertilization of ideas.
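As the abstract notes, inversion is often cast entirely as an optimization problem. A minimal sketch of this idea, using a toy linear forward operator and a Tikhonov-regularized least-squares objective solved by gradient descent (the operator, data, and regularization weight below are illustrative assumptions, not taken from the survey):

```python
import numpy as np

# Toy inverse problem: recover model parameters m from noisy data d = F m + noise
# by solving   min_m  0.5*||F m - d||^2 + 0.5*alpha*||m||^2   (Tikhonov regularization).
rng = np.random.default_rng(0)
F = rng.standard_normal((50, 20))                 # hypothetical linear forward operator
m_true = rng.standard_normal(20)                  # "true" model (unknown in practice)
d = F @ m_true + 0.01 * rng.standard_normal(50)   # noisy observations
alpha = 0.1                                       # regularization weight (assumed)

def grad(m):
    # Gradient of the misfit-plus-regularizer objective
    return F.T @ (F @ m - d) + alpha * m

# Plain gradient descent with a step size bounded by the gradient's Lipschitz constant
step = 1.0 / (np.linalg.norm(F, 2) ** 2 + alpha)
m = np.zeros(20)
for _ in range(2000):
    m = m - step * grad(m)

# Sanity check against the closed-form Tikhonov solution of the normal equations
m_star = np.linalg.solve(F.T @ F + alpha * np.eye(20), F.T @ d)
print(np.allclose(m, m_star, atol=1e-6))  # True: both minimize the same objective
```

Real inversions replace the linear `F` with a non-linear (often PDE-based) forward map, which makes the objective non-convex and each gradient evaluation expensive; that is precisely the regime where the methods surveyed here apply.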





We thank the anonymous reviewers for their helpful comments. The work was carried out when the author was affiliated with ACEMS & Queensland University of Technology.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Queensland, Brisbane, Australia
  2. School of Mathematical Sciences, Monash University, Clayton, Australia
