Ordered Line Integral Methods for Solving the Eikonal Equation

Abstract

We present a family of fast and accurate Dijkstra-like solvers for the eikonal equation and factored eikonal equation which compute solutions on a regular grid by solving local variational minimization problems. Our methods converge linearly but compute significantly more accurate solutions than competing first-order methods. In 3D, we present two different families of algorithms that significantly reduce the number of FLOPs needed to obtain an accurate solution to the eikonal equation. One method employs a fast search using local characteristic directions to prune unnecessary updates, and the other uses the theory of constrained optimization to achieve the same end. The proposed solvers are more efficient than the standard fast marching method in terms of the relationship between error and CPU time. We also modify our method for use with the additively factored eikonal equation, which can be solved locally around point sources to maintain linear convergence. We conduct extensive numerical simulations and provide theoretical justification for our approach. A library that implements the proposed solvers is available on GitHub.


Notes

  1. We thank D. Qi for helpful discussions regarding this problem.

References

  1. Alexandrescu, A.: Modern C++ Design: Generic Programming and Design Patterns Applied. Addison-Wesley, Boston (2001)
  2. Arndt, J.: Matters Computational: Ideas, Algorithms, Source Code. Springer, Berlin (2010)
  3. Bertsekas, D.P.: Network Optimization: Continuous and Discrete Models. Athena Scientific, Belmont (1998)
  4. Bertsekas, D.P.: Nonlinear Programming. Athena Scientific, Belmont (1999)
  5. Bornemann, F., Rasch, C.: Finite-element discretization of static Hamilton–Jacobi equations based on a local variational principle. Comput. Vis. Sci. 9(2), 57–69 (2006)
  6. Bronstein, A.M., Bronstein, M.M., Kimmel, R.: Numerical Geometry of Non-rigid Shapes. Springer, Berlin (2008)
  7. Chacon, A., Vladimirsky, A.: Fast two-scale methods for eikonal equations. SIAM J. Sci. Comput. 34(2), A547–A578 (2012)
  8. Chacon, A., Vladimirsky, A.: A parallel two-scale method for eikonal equations. SIAM J. Sci. Comput. 37(1), A156–A180 (2015)
  9. Chopp, D.L.: Some improvements of the fast marching method. SIAM J. Sci. Comput. 23(1), 230–244 (2001)
  10. Crandall, M.G., Lions, P.-L.: Viscosity solutions of Hamilton–Jacobi equations. Trans. Am. Math. Soc. 277(1), 1–42 (1983)
  11. Dahiya, D., Cameron, M.: An ordered line integral method for computing the quasi-potential in the case of variable anisotropic diffusion (2018). arXiv preprint arXiv:1806.05321
  12. Dahiya, D., Cameron, M.: Ordered line integral methods for computing the quasi-potential. J. Sci. Comput. 75(3), 1351–1384 (2018)
  13. Dial, R.B.: Algorithm 360: shortest-path forest with topological ordering. Commun. ACM 12(11), 632–633 (1969)
  14. Dijkstra, E.W.: A note on two problems in connexion with graphs. Numer. Math. 1(1), 269–271 (1959)
  15. Durou, J.-D., Falcone, M., Sagona, M.: Numerical methods for shape-from-shading: a new survey with benchmarks. Comput. Vis. Image Underst. 109(1), 22–43 (2008)
  16. Engquist, B., Runborg, O.: Computational high frequency wave propagation. Acta Numer. 12, 181–266 (2003)
  17. Fomel, S., Luo, S., Zhao, H.: Fast sweeping method for the factored eikonal equation. J. Comput. Phys. 228(17), 6440–6455 (2009)
  18. Fomel, S., Sethian, J.A.: Fast phase space computation of multiple arrivals. Proc. Natl. Acad. Sci. 99(11), 7329–7334 (2002)
  19. Gómez, J.V., Alvarez, D., Garrido, S., Moreno, L.: Fast methods for eikonal equations: an experimental survey. IEEE Access (2019)
  20. https://github.com/sampotter/olim/tree/sisc19. Project page for libolim on GitHub
  21. https://github.com/sampotter/olim/tree/sisc19/plotting. Link to section of libolim project page containing Python plotting scripts and instructions
  22. Ihrke, I., Ziegler, G., Tevs, A., Theobalt, C., Magnor, M., Seidel, H.-P.: Eikonal rendering: efficient light transport in refractive objects. ACM Trans. Graph. (TOG) 26(3), 59 (2007)
  23. Jeong, W.-K., Whitaker, R.T.: A fast iterative method for eikonal equations. SIAM J. Sci. Comput. 30(5), 2512–2534 (2008)
  24. Kao, C.-Y., Osher, S., Qian, J.: Legendre-transform-based fast sweeping methods for static Hamilton–Jacobi equations on triangulated meshes. J. Comput. Phys. 227(24), 10209–10225 (2008)
  25. Kim, S.: An \(O(N)\) level set method for eikonal equations. SIAM J. Sci. Comput. 22(6), 2178–2193 (2001)
  26. Kim, S.: 3-D eikonal solvers: first-arrival traveltimes. Geophysics 67(4), 1225–1231 (2002)
  27. Kimmel, R., Sethian, J.A.: Computing geodesic paths on manifolds. Proc. Natl. Acad. Sci. 95(15), 8431–8435 (1998)
  28. Kimmel, R., Sethian, J.A.: Optimal algorithm for shape from shading and path planning. J. Math. Imaging Vis. 14(3), 237–244 (2001)
  29. Lewiner, T., Lopes, H., Vieira, A.W., Tavares, G.: Efficient implementation of marching cubes’ cases with topological guarantees. J. Graph. Tools 8(2), 1–15 (2003)
  30. Luo, S., Qian, J.: Fast sweeping methods for factored anisotropic eikonal equations: multiplicative and additive factors. J. Sci. Comput. 52(2), 360–382 (2012)
  31. Luo, S., Zhao, H.: Convergence analysis of the fast sweeping method for static convex Hamilton–Jacobi equations. Res. Math. Sci. 3(1), 35 (2016)
  32. Mirebeau, J.-M.: Anisotropic fast-marching on Cartesian grids using lattice basis reduction. SIAM J. Numer. Anal. 52(4), 1573–1599 (2014)
  33. Mirebeau, J.-M.: Efficient fast marching with Finsler metrics. Numer. Math. 126(3), 515–557 (2014)
  34. Mitchell, J.S.B., Mount, D.M., Papadimitriou, C.H.: The discrete geodesic problem. SIAM J. Comput. 16(4), 647–668 (1987)
  35. Nethercote, N., Seward, J.: Valgrind: a framework for heavyweight dynamic binary instrumentation. In: ACM SIGPLAN Notices, pp. 89–100. ACM (2007)
  36. Nocedal, J., Wright, S.: Numerical Optimization. Springer, Berlin (2006)
  37. Osher, S., Fedkiw, R.: Level Set Methods and Dynamic Implicit Surfaces, vol. 153. Springer, Berlin (2006)
  38. Popovici, A.M., Sethian, J.A.: 3-D imaging using higher order fast marching traveltimes. Geophysics 67(2), 604–609 (2002)
  39. Potter, S.F.: http://umiacs.umd.edu/~sfp. Author’s personal webpage
  40. Potter, S.F., Cameron, M.K.: https://github.com/sampotter/olim-plots/blob/master/marmousi_2d.ipynb. Supplemental numerical experiments in 2D for the original Marmousi model
  41. Prados, E., Faugeras, O.: Shape from shading. In: Handbook of Mathematical Models in Computer Vision, pp. 375–388. Springer (2006)
  42. Prislan, R., Veble, G., Svenšek, D.: Ray-trace modeling of acoustic Green’s function based on the semiclassical (eikonal) approximation. J. Acoust. Soc. Am. 140(4), 2695–2702 (2016)
  43. Qi, D., Vladimirsky, A.: Corner cases, singularities, and dynamic factoring (2018). arXiv preprint arXiv:1801.04322
  44. Raghuvanshi, N., Snyder, J.: Parametric wave field coding for precomputed sound propagation. ACM Trans. Graph. (TOG) 33(4), 38 (2014)
  45. Raghuvanshi, N., Snyder, J.: Parametric directional coding for precomputed sound propagation. ACM Trans. Graph. (TOG) 37(4), 108 (2018)
  46. Sedgewick, R., Wayne, K.: Algorithms. Addison-Wesley Professional, Reading (2011)
  47. Sethian, J.A.: A fast marching level set method for monotonically advancing fronts. Proc. Natl. Acad. Sci. 93(4), 1591–1595 (1996)
  48. Sethian, J.A.: Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science, vol. 3. Cambridge University Press, Cambridge (1999)
  49. Sethian, J.A., Popovici, A.M.: 3-D traveltime computation using the fast marching method. Geophysics 64(2), 516–523 (1999)
  50. Sethian, J.A., Vladimirsky, A.: Fast methods for the eikonal and related Hamilton–Jacobi equations on unstructured meshes. Proc. Natl. Acad. Sci. 97(11), 5699–5703 (2000)
  51. Sethian, J.A., Vladimirsky, A.: Ordered upwind methods for static Hamilton–Jacobi equations: theory and algorithms. SIAM J. Numer. Anal. 41(1), 325–363 (2003)
  52. Slotnick, M.: Lessons in Seismic Computing. Society of Exploration Geophysicists, Tulsa (1959)
  53. Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis, vol. 12. Springer, Berlin (2013)
  54. Stroustrup, B.: The C++ Programming Language. Pearson Education, London (2013)
  55. Treister, E., Haber, E.: A fast marching algorithm for the factored eikonal equation. J. Comput. Phys. 324, 210–225 (2016)
  56. Tsai, Y.-H.R., Cheng, L.-T., Osher, S., Zhao, H.-K.: Fast sweeping algorithms for a class of Hamilton–Jacobi equations. SIAM J. Numer. Anal. 41(2), 673–694 (2003)
  57. Tsitsiklis, J.N.: Efficient algorithms for globally optimal trajectories. IEEE Trans. Autom. Control 40(9), 1528–1538 (1995)
  58. Van der Walt, S., Schönberger, J.L., Nunez-Iglesias, J., Boulogne, F., Warner, J.D., Yager, N., Gouillart, E., Yu, T.: scikit-image: image processing in Python. PeerJ 2, e453 (2014)
  59. Van Trier, J., Symes, W.W.: Upwind finite-difference calculation of traveltimes. Geophysics 56(6), 812–821 (1991)
  60. Versteeg, R.: The Marmousi experience: velocity model determination on a synthetic complex data set. Lead. Edge 13(9), 927–936 (1994)
  61. Vidale, J.E.: Finite-difference calculation of traveltimes in three dimensions. Geophysics 55(5), 521–526 (1990)
  62. Yang, S., Potter, S.F., Cameron, M.K.: Computing the quasipotential for nongradient SDEs in 3D. J. Comput. Phys. 379, 325–350 (2019)
  63. Yatziv, L., Bartesaghi, A., Sapiro, G.: \(O(N)\) implementation of the fast marching algorithm. J. Comput. Phys. 212(2), 393–399 (2006)
  64. Zhang, Y.-T., Zhao, H.-K., Qian, J.: High order fast sweeping methods for static Hamilton–Jacobi equations. J. Sci. Comput. 29(1), 25–56 (2006)
  65. Zhao, H.-K.: A fast sweeping method for eikonal equations. Math. Comput. 74(250), 603–627 (2005)

Acknowledgements

We thank Prof. A. Vladimirsky for valuable discussions during the course of this project.

Author information

Correspondence to Samuel F. Potter.


This work was partially supported by NSF CAREER Grant DMS-1554907 and MTECH Grant No. 6205.

Appendices

Minimum action integral for the eikonal equation

The eikonal equation (Eq. 1) is a Hamilton–Jacobi equation for u. If we let each fixed characteristic (ray) of the eikonal equation be parametrized by some parameter \(\sigma \) and denote \(p \equiv \nabla u\), the corresponding Hamiltonian is:

$$\begin{aligned} H{(p, x)} = \frac{\left\| p\right\| ^2}{2} - \frac{s(x)^2}{2} = 0. \end{aligned}$$
(43)

Since \(H = 0\), Eq. 43 implies \(L = \sup _p (\langle p, x' \rangle - H) = s(x) \left\| x'\right\| \). Since \(x' = \partial _p H = p\) and \(\left\| p\right\| = s(x)\), the Lagrangian can be expressed as:

$$\begin{aligned} L(x, x') = \langle p, x'\rangle = \langle x', x'\rangle = \langle \nabla u, x' \rangle = \frac{du}{d\sigma }. \end{aligned}$$
(44)

Let \(x(\sigma )\) be a characteristic arriving at \(\hat{x} = x(\hat{\sigma })\) from \(x_0 = x(0)\), which lies on the expanding front. Integrating from 0 to \(\hat{\sigma }\) and letting \(\hat{u} = u(\hat{x})\) and \(u_0 = u(x_0)\):

$$\begin{aligned} \hat{u} - u_0 = \int _{0}^{\hat{\sigma }} L(x, x') d\sigma = \int _{0}^{\hat{\sigma }} s(x) \left\| x'\right\| d\sigma = \int _0^L s(x) dl, \end{aligned}$$
(45)

where L is the length of the characteristic from \(x_0\) to \(\hat{x}\) and dl is the length element. A characteristic of Eq. 1 minimizes Eq. 45 over admissible paths. Then, if \(\hat{x}\) is fixed and \(\alpha \) is an arc-length parametrized curve with \(\alpha (L) = \hat{x}\), Eq. 45 is equivalent to:

$$\begin{aligned} \hat{u} = u(\hat{x}) = \min _\alpha \left\{ u(\alpha (0)) + \int _\alpha s(x) dl\right\} . \end{aligned}$$
(46)

Our update procedure is based on Eq. 46. This problem may have multiple local minima; \(\hat{u}\) above corresponds to the first arrival, which is our primary interest in this work.
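To make the role of Eq. 46 concrete, the following is a minimal sketch in Python of a single local update over a one-dimensional simplex (a triangle update). The helper name and the frozen-slowness quadrature are illustrative assumptions, not the libolim implementation:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def triangle_update(x_hat, x0, x1, u0, u1, s):
    """Illustrative local minimization of Eq. 46: the front value u and the
    base point are interpolated linearly along the segment [x0, x1], and the
    action integral is approximated by freezing s along the straight segment
    from the base point to the update point x_hat."""
    def F(lam):
        u_lam = (1.0 - lam) * u0 + lam * u1
        x_lam = (1.0 - lam) * x0 + lam * x1
        return u_lam + s * np.linalg.norm(x_hat - x_lam)
    res = minimize_scalar(F, bounds=(0.0, 1.0), method="bounded")
    return res.fun

# e.g., update the node at (1, 1) from the front segment between (0, 0) and (0, 1):
u_hat = triangle_update(np.array([1.0, 1.0]),
                        np.array([0.0, 0.0]), np.array([0.0, 1.0]),
                        u0=0.0, u1=1.0, s=1.0)
```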

Skipping Updates in the Bottom-Up Family of Algorithms

In this section, we describe how to use the KKT conditions to skip updates in the bottom-up algorithms. Throughout, we write:

$$\begin{aligned} A = \begin{bmatrix} -1 & & \\ & \ddots & \\ & & -1 \\ 1 & \cdots & 1 \end{bmatrix} \in \mathbb {R}^{(d + 1) \times d}, \qquad b = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \in \mathbb {R}^{d + 1} \end{aligned}$$
(47)

Using these, the set \(\Delta ^d\) can be written as a system of linear inequalities:

$$\begin{aligned} \lambda \in \Delta ^d \iff A\lambda \le b \end{aligned}$$
(48)

Let \(\mu \in \mathbb {R}^{d + 1}\) be the vector of Lagrange multipliers. Then, the Lagrangian function for Eq. 14 is:

$$\begin{aligned} L(\lambda , \mu ) = F(\lambda ) + (A\lambda - b)^\top \mu . \end{aligned}$$
(49)

Since \(F_0\) is strictly convex and since we assume h is small enough for \(F_1\) to be strictly convex, if \(\lambda ^*\) lies on the boundary of \(\Delta ^d\), we only need to check that the optimum Lagrange multipliers \(\mu ^*\) are dual feasible; i.e., whether \(\mu ^* \ge 0\) (this follows directly from the standard KKT conditions [4, 36]). For a fixed \(\lambda \in \Delta ^d\), define the set of indices of active constraints:

$$\begin{aligned} \mathcal {I} = \left\{ i : (A\lambda - b)_i = 0\right\} \end{aligned}$$
(50)

That is, \(i \in \mathcal {I}\) if the ith inequality holds with equality (“is active”). Stationarity of the Lagrangian in Eq. 49 then requires:

$$\begin{aligned} A^\top _{\mathcal {I}} \mu _{\mathcal {I}}^* = -\nabla F(\lambda ), \end{aligned}$$
(51)

where F denotes the objective (\(F_0\) or \(F_1\)). If \(i \notin \mathcal {I}\), we set \(\mu _i^* = 0\). If \(\mu ^*_i \ge 0\) for all i, then the update may be skipped.

When implementing this, since A is sparse, it is simplest and most efficient to write out the system given by Eq. 51 and write a specialized function to solve it. Note that since we always start with a lower-dimensional interior point solution lying on the boundary of a higher-dimensional problem, we only have to compute one Lagrange multiplier.
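As an illustration, here is a minimal sketch of this test in Python; the dense least-squares solve stands in for the specialized sparse solve described above, and the gradient of the objective is assumed to be available as a vector:

```python
import numpy as np

def can_skip_update(lam, grad, tol=1e-12):
    """Sketch of the skip test in Eqs. 47-51: given a point lam on the
    boundary of the simplex and grad = the gradient of the objective at lam,
    solve the stationarity system A_I^T mu_I = -grad (in the least-squares
    sense) for the active constraints, then check that the resulting
    Lagrange multipliers are dual feasible (mu_I >= 0)."""
    d = lam.size
    A = np.vstack([-np.eye(d), np.ones((1, d))])        # Eq. 47
    b = np.concatenate([np.zeros(d), [1.0]])
    active = np.abs(A @ lam - b) < tol                  # Eq. 50
    if not active.any():
        return False   # interior point: no constraints to check
    mu, *_ = np.linalg.lstsq(A[active].T, -grad, rcond=None)  # Eq. 51
    return bool((mu >= -tol).all())
```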

Proofs for Sect. 3.2

Proof

(Proof of proposition 1) For the gradient, we have:

$$\begin{aligned} \nabla F_0(\lambda ) = {\delta U}+ \frac{s^{\theta } h}{2 \Vert p_\lambda \Vert } \nabla p_\lambda ^\top p_\lambda = {\delta U}+ \frac{s^{\theta } h}{\Vert p_\lambda \Vert } {\delta P}^\top p_\lambda , \end{aligned}$$

since \(\nabla p_\lambda ^\top p_\lambda = 2 {\delta P}^\top p_\lambda \). For the Hessian:

$$\begin{aligned} \nabla ^2_\lambda F_0(\lambda )&= \nabla \left( \frac{s^{\theta } h}{\Vert p_\lambda \Vert } p_\lambda ^\top {\delta P}\right) = s^{\theta } h \left( \nabla \frac{1}{\Vert p_\lambda \Vert } p_\lambda ^\top {\delta P}+ \frac{1}{\Vert p_\lambda \Vert } \nabla p_\lambda ^\top {\delta P}\right) \\&= \frac{s^{\theta } h}{\Vert p_\lambda \Vert } \left( {\delta P}^\top {\delta P}- \frac{{\delta P}^\top p_\lambda p_\lambda ^\top {\delta P}}{p_\lambda ^\top p_\lambda }\right) = \frac{s^{\theta } h}{\Vert p_\lambda \Vert } {\delta P}^\top \left( I - \frac{p_\lambda p_\lambda ^\top }{p_\lambda ^\top p_\lambda }\right) {\delta P}, \end{aligned}$$

from which the result follows.
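The formulas of Proposition 1 are easy to sanity-check numerically; the following sketch, with arbitrary placeholder data, verifies the gradient formula against central differences (the Hessian can be checked the same way):

```python
import numpy as np

rng = np.random.default_rng(0)
n, h, s_theta = 3, 0.1, 1.3
dU = rng.standard_normal(n)                 # delta U
p0 = rng.standard_normal(n + 1)             # p_0
dP = rng.standard_normal((n + 1, n))        # delta P

def F0(lam):
    p = p0 + dP @ lam
    return dU @ lam + s_theta * h * np.linalg.norm(p)

def grad_F0(lam):                           # Proposition 1, gradient
    p = p0 + dP @ lam
    return dU + s_theta * h * dP.T @ p / np.linalg.norm(p)

def hess_F0(lam):                           # Proposition 1, Hessian
    p = p0 + dP @ lam
    proj = np.eye(p.size) - np.outer(p, p) / (p @ p)
    return s_theta * h / np.linalg.norm(p) * dP.T @ proj @ dP

lam = np.full(n, 1.0 / (n + 1))
eps = 1e-6
fd = np.array([(F0(lam + eps * e) - F0(lam - eps * e)) / (2 * eps)
               for e in np.eye(n)])
assert np.allclose(fd, grad_F0(lam), atol=1e-8)
```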

Proof

(Proof of proposition 2) Since \(F_1(\lambda ) = u_\lambda + h s^{\theta }_\lambda \Vert p_\lambda \Vert \), for the gradient we have:

$$\begin{aligned} \nabla F_1(\lambda ) = {\delta U}+ h \left( \theta \Vert p_\lambda \Vert {\delta s}+ \frac{s^{\theta }_\lambda }{2\Vert p_\lambda \Vert } \nabla p_\lambda ^\top p_\lambda \right) = {\delta U}+ \frac{h}{\Vert p_\lambda \Vert } \left( \theta p_\lambda ^\top p_\lambda {\delta s}+ s^{\theta } {\delta P}^\top p_\lambda \right) , \end{aligned}$$

and for the Hessian:

$$\begin{aligned} \begin{aligned} \nabla ^2 F_1(\lambda ) = \frac{h}{2 \Vert p_\lambda \Vert } \Bigg (\theta \Big (\nabla p_\lambda ^\top p_\lambda \, {\delta s}^\top&+\; {\delta s}\, {(\nabla p_\lambda ^\top p_\lambda )}^\top \Big ) \;+ \\&s^{\theta }_\lambda \left( \nabla ^2_\lambda p_\lambda ^\top p_\lambda - \frac{1}{2 p_\lambda ^\top p_\lambda } \nabla p_\lambda ^\top p_\lambda {(\nabla p_\lambda ^\top p_\lambda )}^\top \right) \Bigg ). \end{aligned} \end{aligned}$$

Simplifying this gives us the result.

Proof

(Proof of lemma 1) Let \(\nu _\lambda = p_\lambda /\Vert p_\lambda \Vert \in \mathbb {R}^n\) be the unit vector in the direction of \(p_\lambda \), and let U be chosen so that \(Q = \begin{bmatrix} \nu _\lambda&U \end{bmatrix} \in \mathbb {R}^{n \times n}\) is orthogonal. Then:

$$\begin{aligned} {\delta P}^\top \mathtt {Proj}^\perp _{p_\lambda } {\delta P}= {\delta P}^\top {(I - \nu _\lambda \nu _\lambda ^\top )} {\delta P}= {\delta P}^\top {(QQ^\top - \nu _\lambda \nu _\lambda ^\top )} {\delta P}= {\delta P}^\top U U^\top {\delta P}. \end{aligned}$$
(52)

Hence, \({\delta P}^\top \mathtt {Proj}^\perp {\delta P}\) is a Gram matrix and positive semidefinite.

Next, since \(\Delta ^n\) is nondegenerate, the vectors \(p_i\) for \(i = 0, \ldots , n - 1\) are linearly independent. Since the ith column of \({\delta P}\) is \({\delta p}_i = p_i - p_0\), we can see that the vector \(p_0\) is not in the range of \({\delta P}\); hence, there is no vector \(\mu \) such that \({\delta P}\mu = \alpha p_\lambda \), for any \(\alpha \ne 0\). What’s more, by definition, \(\text {ker}(\mathtt {Proj}_{p_\lambda }^\perp ) = \langle p_\lambda \rangle \). So, we can see that \(\mathtt {Proj}^\perp _{p_\lambda } {\delta P}\mu = 0\) only if \(\mu = 0\), from which we can conclude \({\delta P}^\top \mathtt {Proj}^\perp _{p_\lambda } {\delta P}\succ 0\). Altogether, bearing in mind that \(s_{{\text {min}}}\) is assumed to be positive, we conclude that \(\nabla ^2 F_0\) is positive definite.

Proof

(Proof of lemma 2) To show that \(\nabla ^2 F_1\) is positive definite for h small enough, note from Eq. 19 that \(\nabla ^2 F_1 = A + B\), where A is positive definite and B is small relative to A and indefinite. To use this fact, note that since \({\delta P}^\top \mathtt {Proj}^\perp _\lambda {\delta P}\) is symmetric positive definite, it has an eigenvalue decomposition \(Q \Lambda Q^\top \) where \(\Lambda _{ii} > 0\) for all i. Since \({\delta P}^\top \mathtt {Proj}^\perp _\lambda {\delta P}\) doesn’t depend on h, for a fixed set of vectors \(p_0, \ldots , p_n\), its eigenvalues are constant with respect to h. So, defining:

$$\begin{aligned} A = \frac{s^\theta _\lambda h}{\Vert p_\lambda \Vert } {\delta P}^\top \mathtt {Proj}^\perp _\lambda {\delta P}= Q \left( \frac{s^\theta _\lambda h}{\Vert p_\lambda \Vert } \Lambda \right) Q^\top \end{aligned}$$
(53)

we can expect this matrix’s eigenvalues to be \(\Theta (h)\); in particular, \(\lambda _{\min } \ge C h\) for some constant C, provided that \(s> s_{{\text {min}}}> 0\), as assumed. This gives us a bound for the positive definite part of \(\nabla ^2 F_1\).

The perturbation \(B = \left\{ {\delta P}^\top \nu _\lambda , \theta h {\delta s}\right\} \) is indefinite. Since \(\left\| {\delta s}\right\| = O(h)\), we find that:

$$\begin{aligned} |\lambda _{\max }(B)| = \left\| \left\{ {\delta P}^\top \nu _\lambda , \theta h {\delta s}\right\} \right\| _2 \le \theta h \sqrt{n} \left\| \left\{ {\delta P}^\top \nu _\lambda , {\delta s}\right\} \right\| _\infty = O(h^2), \end{aligned}$$
(54)

where we use the fact that the Lipschitz constant of s is \(K \le C\), so that:

$$\begin{aligned} |{\delta s}_i| = |s_i - s_0| \le K |x_i - x_0| \le K h \sqrt{n} \le Ch \sqrt{n}, \end{aligned}$$
(55)

for each i. Letting \(z \ne 0\), we compute:

$$\begin{aligned} z^\top \nabla ^2 F_1 z = z^\top A z + z^\top B z \ge \lambda _{\min }(A) z^\top z + z^\top B z \ge Ch z^\top z + z^\top B z. \end{aligned}$$
(56)

Now, since \(\left| z^\top B z\right| \le \left| \lambda _{\max }(B)\right| z^\top z \le D h^2 z^\top z\), where D is some positive constant, we can see that for h small enough, it must be the case that \(Ch z^\top z + z^\top B z > 0\); i.e., that \(\nabla ^2 F_1\) is positive definite; consequently, \(F_1\) is strictly convex in this case.
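The scaling argument in this proof can be observed directly. Below is a small sketch with placeholder data, taking \(s^\theta _\lambda = 1\) and interpreting \(\{a, b\}\) as the symmetrized outer product \(ab^\top + ba^\top \) (an assumption consistent with the structure of Eq. 19): the smallest eigenvalue of A decays like h while \(\Vert B\Vert \) decays like \(h^2\), so \(A + B\) is positive definite once h is small.

```python
import numpy as np

rng = np.random.default_rng(1)
n, theta = 3, 0.5
dP = rng.standard_normal((n + 1, n))
p = rng.standard_normal(n + 1)
nu = p / np.linalg.norm(p)
proj = np.eye(n + 1) - np.outer(nu, nu)   # projector onto the complement of <p>
v = dP.T @ nu

for h in [1e-1, 1e-2, 1e-3]:
    ds = h * rng.standard_normal(n)       # ||ds|| = O(h) since s is Lipschitz
    A = (h / np.linalg.norm(p)) * dP.T @ proj @ dP
    B = theta * h * (np.outer(v, ds) + np.outer(ds, v))
    print(h, np.linalg.eigvalsh(A).min(), np.linalg.norm(B, 2))
```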

Proofs for Sect. 3.3

In this section, we establish some technical lemmas that we will use to validate the use of mp0. Lemmas 4, 5, and 6 set up the conditions of Theorem 5 (Theorem 5.3.2 of Stoer and Bulirsch [53]), from which Theorem 1 readily follows.

Lemma 4

There exists \(\beta = O(h^{-1})\) s.t. \(\left\| \nabla ^2 F_1(\lambda )^{-1}\right\| \le \beta \) for all \(\lambda \in \Delta ^n\).

Proof

(Proof of lemma 4) To simplify Eq. 19, we temporarily define:

$$\begin{aligned} A = \frac{s^\theta _\lambda h}{\Vert p_\lambda \Vert } {\delta P}^\top \mathtt {Proj}^\perp _\lambda {\delta P}\;\mathrm{and}\;B = \frac{\theta h}{\Vert p_\lambda \Vert } \left\{ {\delta P}^\top p_\lambda , {\delta s}\right\} . \end{aligned}$$
(57)

Observe that \(\Vert A\Vert = O(h)\) and \(\Vert B\Vert = O(h^2)\), since \(\Vert \delta s\Vert = O(h)\) and since all other factors involved in A and B (excluding h itself) are independent of h. Hence:

$$\begin{aligned} \left\| A^{-1} B\right\| = \frac{\theta }{s^\theta _\lambda } \left\| \left( {\delta P}^\top \mathtt {Proj}^\perp _\lambda {\delta P}\right) ^{-1} \left\{ {\delta P}^\top p_\lambda , {\delta s}\right\} \right\| = O(h), \end{aligned}$$
(58)

since \(\left\| {\delta s}\right\| = O(h)\). Hence, \(\left\| A^{-1} B\right\| < 1\) for h small enough, and we can Taylor expand:

$$\begin{aligned} \begin{aligned} \nabla ^2 F_1(\lambda )^{-1}&= \left( A + B\right) ^{-1} = {(I + A^{-1} B)}^{-1} A^{-1} \\&= \left( I - A^{-1} B + {(A^{-1} B)}^2 - \cdots \right) A^{-1} \\&= A^{-1} - A^{-1} B A^{-1} + {(A^{-1} B)}^2 A^{-1} - \,\cdots , \end{aligned} \end{aligned}$$
(59)

which implies \(\left\| \nabla ^2 F_1(\lambda )^{-1}\right\| = O(h^{-1})\). To define \(\beta \), let:

$$\begin{aligned} \beta = \max _{\lambda \in \Delta ^n} \left\| \nabla ^2 F_1(\lambda )^{-1}\right\| = O(h^{-1}), \end{aligned}$$
(60)

completing the proof.

Lemma 5

There exists \(\alpha = O(h)\) s.t. \(\left\| \nabla ^2 F_1(\lambda _{0}^*)^{-1} \nabla F_1(\lambda _{0}^*)\right\| \le \alpha \).

Proof

(Proof of lemma 5) From Lemma 4 we have \(\left\| \nabla ^2 F_1(\lambda _0^*)^{-1}\right\| = O(h^{-1})\), so to establish the result we only need to show that \(\left\| \nabla F_1(\lambda _0^*)\right\| = O(h^2)\). To this end, let \(\underline{\lambda } = {(n + 1)}^{-1} \varvec{1}_{n \times 1}\) (i.e., the centroid of \(\Delta ^n\), where \(s^\theta \) is evaluated). Then, recalling Fig. 4, \(s^\theta _\lambda = s^\theta + {\delta s}^\top (\lambda - \underline{\lambda })\) so that, for a general \(\lambda \):

$$\begin{aligned} \begin{aligned} \nabla F_1(\lambda )&= \Vert p_\lambda \Vert h {\delta s}+ {\delta U}+ \frac{s^\theta + {\delta s}^\top (\lambda - \underline{\lambda })}{\Vert p_\lambda \Vert } h {\delta P}^\top p_\lambda \\&= \Vert p_\lambda \Vert h {\delta s}+ \nabla F_0(\lambda ) + \frac{{\delta s}^\top {(\lambda - \underline{\lambda })}}{\Vert p_\lambda \Vert } h {\delta P}^\top p_\lambda . \end{aligned} \end{aligned}$$
(61)

Since \(\nabla F_0(\lambda _0^*) = 0\) by optimality, we can conclude using Eq. 61 and \(\left\| {\delta s}\right\| = O(h)\) that:

$$\begin{aligned} \left\| \nabla F_1(\lambda _0^*)\right\| = h \left\| \Vert p_{\lambda _0^*}\Vert {\delta s}+ \frac{{\delta s}^\top {(\lambda _0^* - \underline{\lambda })}}{\Vert p_{\lambda _0^*}\Vert } {\delta P}^\top p_{\lambda _0^*} \right\| = O(h^2), \end{aligned}$$
(62)

which proves the result.

Lemma 6

The Hessian \(\nabla ^2 F_1\) is Lipschitz continuous with O(h) Lipschitz constant. That is, there is some constant \(\gamma = O(h)\) so that for any two points \(\lambda , \lambda ' \in \Delta ^n\):

$$\begin{aligned} \left\| \nabla ^2 F_1(\lambda ) - \nabla ^2 F_1(\lambda ')\right\| \le \gamma \left\| \lambda - \lambda '\right\| . \end{aligned}$$

Proof

(Proof of lemma 6) If we restrict our attention to \(\Delta ^n\), we see that \(\Vert p_\lambda \Vert ^{-1} {\delta P}^\top \mathtt {Proj}_\lambda ^\perp {\delta P}\) is a Lipschitz continuous function of \(\lambda \) with O(1) Lipschitz constant, and that \(\theta \{{{\delta P}}^\top p_\lambda , {\delta s}\} /\Vert p_{\lambda }\Vert \) is Lipschitz continuous with O(h) Lipschitz constant since \(\Vert \delta s\Vert = O(h)\). Then, since \(s^\theta _\lambda \) is O(1) Lipschitz, it follows that:

$$\begin{aligned} A(\lambda ) = \tfrac{s^\theta _\lambda h}{\Vert p_\lambda \Vert } {\delta P}^\top \mathtt {Proj}^\perp _\lambda {\delta P}\end{aligned}$$
(63)

has a Lipschitz constant that is O(h) for \(\lambda \in \Delta ^n\), using the notation of Lemma 4. Likewise,

$$\begin{aligned} B(\lambda ) = \tfrac{\theta h}{\Vert p_\lambda \Vert } \left\{ {\delta P}^\top p_\lambda , {\delta s}\right\} \end{aligned}$$
(64)

is Lipschitz continuous with an \(O(h^2)\) Lipschitz constant, since each of its terms involves a product of h and \({\delta s}\). Since \(\nabla ^2 F_1(\lambda ) = A(\lambda ) + B(\lambda )\), we can see immediately that \(\nabla ^2 F_1\) is also Lipschitz on \(\Delta ^n\) with a constant that is O(h).

Proof

(Proof of theorem 1) Our proof of theorem 1 relies on the following theorem on the convergence of Newton’s method, which we present for convenience.

Theorem 5

(Theorem 5.3.2, Stoer and Bulirsch) Let \(C \subseteq \mathbb {R}^n\) be an open set, let \(C_0\) be a convex set with \(\overline{C}_0 \subseteq C\), and let \(f : C \rightarrow \mathbb {R}^n\) be differentiable for \(x \in C_0\) and continuous for \(x \in C\). For \(x_0 \in C_0\), let \(r, \alpha , \beta , \gamma \) satisfy \(S_r(x_0) = \left\{ x : \left\| x - x_0\right\| < r\right\} \subseteq C_0\), \(\mu = \frac{\alpha \beta \gamma }{2} < 1\), \(r = \alpha (1 - \mu )^{-1}\), and let f satisfy:

  1. (a)

    for all \(x, y \in C_0\), \(\left\| D f(x) - D f(y)\right\| \le \gamma \left\| x - y\right\| \),

  2. (b)

    for all \(x \in C_0\), \((D f(x))^{-1}\) exists and satisfies \(\left\| (Df(x))^{-1}\right\| \le \beta \),

  3. (c)

    and \(\left\| (Df(x_0))^{-1} f(x_0)\right\| \le \alpha \).

Then, beginning at \(x_0\), each iterate:

$$\begin{aligned} x_{k+1} = x_k - Df(x_k)^{-1} f(x_k), \quad k = 0, 1, \ldots , \end{aligned}$$
(65)

is well-defined and satisfies \(\left\| x_k - x_0\right\| < r\) for all \(k \ge 0\). Furthermore, \(\lim _{k \rightarrow \infty } x_k = \xi \) exists and satisfies \(\left\| \xi - x_0\right\| \le r\) and \(f(\xi ) = 0\).

For our situation, Theorem 5.3.2 of Stoer and Bulirsch [53] indicates that if:

$$\begin{aligned} \Vert \nabla ^2 F_1(\lambda )^{-1}\Vert&\le \beta , \quad \mathrm{where}\;\beta = O(h^{-1}) , \end{aligned}$$
(66)
$$\begin{aligned} \Vert \nabla ^2 F_1(\lambda _0^*)^{-1} \nabla F_1(\lambda _0^*)\Vert&\le \alpha , \quad \mathrm{where}\;\alpha = O(h), \;\mathrm{and} \end{aligned}$$
(67)
$$\begin{aligned} \Vert \nabla ^2 F_1(\lambda ) - \nabla ^2 F_1(\lambda ')\Vert&\le \gamma \left\| \lambda - \lambda '\right\| \;\mathrm{for\ each}\;\lambda , \lambda ' \in \Delta ^n,\;\mathrm{where}\;\gamma = O(h), \end{aligned}$$
(68)

then with \(\lambda _0 = \lambda _0^*\), the iteration Eq.  20 is well-defined, with each iterate satisfying \(\left\| \lambda _k - \lambda _0\right\| \le r\), where \(r = \alpha /(1 - \alpha \beta \gamma /2)\). Additionally, the limit of this iteration exists, and the iteration converges to it quadratically; we note that since \(F_1\) is strictly convex for h small enough, the limit of the iteration must be \(\lambda _1^*\), so the theorem also gives us \(\left\| {\delta \lambda }^*\right\| = \left\| \lambda _1^* - \lambda _0^*\right\| \le r\).

Now, we note that the conditions in Eqs. 66, 67, and 68 correspond exactly to Lemmas 4, 5, and 6, respectively, which gave us the values of \(\beta \), \(\alpha \), and \(\gamma \). All that remains is to compute r. The preceding lemmas imply \(\alpha \beta \gamma = O(h)\); hence, \(\alpha \beta \gamma /2 < 1\) for h small enough. We have:

$$\begin{aligned} r = \frac{\alpha }{1 - \frac{\alpha \beta \gamma }{2}} = \alpha \left( 1 + \frac{\alpha \beta \gamma }{2} + \frac{\alpha ^2\beta ^2\gamma ^2}{4} + \cdots \right) = O(h), \end{aligned}$$
(69)

so that \(\left\| {\delta \lambda }^*\right\| = O(h)\), and the result follows.

To obtain the \(O(h^3)\) error bound, from Theorem 1, we have \(\left\| {\delta \lambda }^*\right\| = O(h)\). Then, Taylor expanding \(F_1(\lambda _0^*)\), we get:

$$\begin{aligned} F_1(\lambda _0^*) = F_1(\lambda _1^* - {\delta \lambda }^*) = F_1(\lambda _1^*) - \nabla F_1(\lambda _1^*)^\top {\delta \lambda }^* + \frac{1}{2} {{\delta \lambda }^*}^\top \nabla ^2 F_1(\lambda _1^*) {\delta \lambda }^* + R, \end{aligned}$$

where \(\left| R\right| = O(\left\| {\delta \lambda }^*\right\| ^3)\). Since \(\lambda _1^*\) is optimum, \(\nabla F_1(\lambda _1^*) = 0\). Hence:

$$\begin{aligned} \left| F_1(\lambda _1^*) - F_1(\lambda _0^*)\right| \le \frac{1}{2} \left\| \nabla ^2 F_1(\lambda _1^*)\right\| \left\| {\delta \lambda }^*\right\| ^2 + O(\left\| {\delta \lambda }^*\right\| ^3) = O(h^3), \end{aligned}$$

which proves the result.
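In practice, this argument is what justifies the warm-started Newton iteration of Eq. 20; a generic sketch follows, assuming callables for the gradient and Hessian of \(F_1\) (a sketch of the idea, not the libolim implementation):

```python
import numpy as np

def newton_warm_start(lam0_star, grad_F1, hess_F1, n_iters=2):
    """Sketch of the iteration in Eq. 20: starting from the minimizer
    lam0_star of F0, take full Newton steps on F1. By Theorem 1, the
    starting point is within O(h) of the minimizer of F1 and the iteration
    converges quadratically, so very few steps are needed."""
    lam = lam0_star.copy()
    for _ in range(n_iters):
        lam = lam - np.linalg.solve(hess_F1(lam), grad_F1(lam))
    return lam
```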

Proofs for Sect. 3.4

Proof

(Proof of theorem 2) We proceed by reasoning geometrically; Fig. 21 depicts the geometric setup. First, letting \({\delta P}= QR\) be the reduced QR decomposition of \({\delta P}\), and writing \(\nu _{\lambda ^*} = p_{\lambda ^*}/\Vert p_{\lambda ^*}\Vert \), we note that since:

$$\begin{aligned} \nabla F_0(\lambda ^*) = {\delta U}+ s^\theta h {\delta P}^\top \nu _{\lambda ^*} = 0, \end{aligned}$$
(70)

the optimum \(\lambda ^*\) satisfies:

$$\begin{aligned} - R^{-\top } \frac{{\delta U}}{s^\theta h} = Q^\top \nu _{\lambda ^*} \end{aligned}$$
(71)

Let \(\mathtt {Proj}_{{\delta P}} = QQ^\top \) denote the orthogonal projector onto \({\text {range}}({\delta P})\), and \(\mathtt {Proj}^\perp _{{\delta P}} = I - QQ^\top \) the projector onto its orthogonal complement. We split \(p_{\lambda ^*}\) into a component that lies in \({\text {range}}({\delta P})\) and one that lies in \({\text {range}}({\delta P})^\perp \). Letting \(p_{{\text {min}}}\) be the point in \(p_0 + {\text {range}}({\delta P})\) with the smallest 2-norm, we write:

$$\begin{aligned} p_{\lambda ^*} = (p_{\lambda ^*} - p_{{\text {min}}}) + p_{{\text {min}}}, \end{aligned}$$
(72)

where \(p_{\lambda ^*} - p_{{\text {min}}}\in {\text {range}}({\delta P})\) and \(p_{{\text {min}}}\in {\text {range}}({\delta P})^\perp \). The vector \(p_{{\text {min}}}\) corresponds to \(p_{\lambda _{{\text {min}}}}\) where \(\lambda _{{\text {min}}}\) satisfies:

$$\begin{aligned} 0 = {\delta P}^\top ({\delta P}\lambda _{{\text {min}}}+ p_0) = R^\top R \lambda _{{\text {min}}}+ R^\top Q^\top p_0, \end{aligned}$$
(73)

hence \(\lambda _{{\text {min}}}= -R^{-1} Q^\top p_0\), giving us:

$$\begin{aligned} p_{{\text {min}}}= p_0 + {\delta P}\lambda _{{\text {min}}}= \mathtt {Proj}^\perp _{{\delta P}} p_0. \end{aligned}$$
(74)

This vector is easily obtained. For \(p_{\lambda ^*} - p_{{\text {min}}}\), we note that \(\mathtt {Proj}_{{\delta P}} \nu _{\lambda ^*}\) is proportional to \(p_{\lambda ^*} - p_{{\text {min}}}\), suggesting that we determine the ratio \(\alpha \) satisfying \(p_{\lambda ^*} - p_{{\text {min}}}= \alpha \mathtt {Proj}_{{\delta P}} \nu _{\lambda ^*}\). In particular, from the similarity of the triangles \((\hat{p}, \nu _{\lambda ^*}, \mathtt {Proj}^\perp _{{\delta P}} \nu _{\lambda ^*})\) and \((\hat{p}, p_{\lambda ^*}, p_{{\text {min}}})\) in Fig. 21, we have, using Eqs. 71 and 74:

$$\begin{aligned} \alpha = \frac{\left\| p_{{\text {min}}}\right\| }{\left\| \mathtt {Proj}^\perp _{{\delta P}} \nu _{\lambda ^*}\right\| } = \sqrt{\frac{p_0^\top \mathtt {Proj}^\perp _{{\delta P}} p_0}{1 - \left\| Q^\top \nu _{\lambda ^*}\right\| ^2}} = \sqrt{\frac{p_0^\top \mathtt {Proj}^\perp _{{\delta P}} p_0}{1 - \left\| R^{-\top } \frac{{\delta U}}{s^\theta h}\right\| ^2}}. \end{aligned}$$
(75)

At the same time, since:

$$\begin{aligned} \nu _{\lambda ^*}^\top \mathtt {Proj}^\perp _{{\delta P}} \nu _{\lambda ^*} = \frac{{(\mathtt {Proj}^\perp _{{\delta P}} p_{\lambda ^*})}^\top {(\mathtt {Proj}^\perp _{{\delta P}} p_{\lambda ^*})}}{\Vert p_{\lambda ^*}\Vert ^2} = \frac{p_{{\text {min}}}^\top p_{{\text {min}}}}{\Vert p_{\lambda ^*}\Vert ^2} = \frac{p_0^\top \mathtt {Proj}^\perp _{{\delta P}} p_0}{\Vert p_{\lambda ^*}\Vert ^2}, \end{aligned}$$
(76)

we can conclude that:

$$\begin{aligned} \Vert p_{\lambda ^*}\Vert = \alpha = \sqrt{\frac{p_0^\top \mathtt {Proj}^\perp _{{\delta P}} p_0}{1 - \left\| R^{-\top } \frac{{\delta U}}{s^\theta h}\right\| ^2}}, \end{aligned}$$
(77)

giving us Eq. 22, proving the first part of Theorem 2.

Fig. 21: A schematic depiction of the proof of Theorem 2

Next, combining Eqs. 71, 72, 74, and 75, we get:

$$\begin{aligned} p_{\lambda ^*} = \mathtt {Proj}^\perp _{{\delta P}} p_0 - \sqrt{\frac{p_0^\top \mathtt {Proj}^\perp _{{\delta P}} p_0}{1 - \left\| R^{-\top } \frac{{\delta U}}{s^\theta h}\right\| ^2}} Q R^{-\top } \frac{{\delta U}}{s^\theta h}. \end{aligned}$$
(78)

This expression for \(p_{\lambda ^*}\) can be computed from our problem data and \({\delta P}\). Now, note that \(p_{\lambda ^*} = p_0 + {\delta P}\lambda ^*\) implies:

$$\begin{aligned} \lambda ^* = R^{-1} Q^\top (p_{\lambda ^*} - p_0). \end{aligned}$$
(79)

Substituting Eq. 78 into Eq.  79, we obtain Eq. 23 after making appropriate cancellations, establishing the second part of Theorem 2.

To establish Eq. 24, we note that by optimality of \(\lambda ^*\), our expression for \(\nabla F_0\) (Eq.  16 of Proposition 1) gives:

$$\begin{aligned} {\delta U}= -s^\theta h \frac{{\delta P}^\top p_{\lambda ^*}}{\Vert p_{\lambda ^*}\Vert }. \end{aligned}$$
(80)

This lets us write:

$$\begin{aligned} {\delta U}^\top \lambda ^* = -\frac{s^\theta h}{\Vert p_{\lambda ^*}\Vert } p_{\lambda ^*}^\top {\delta P}\lambda ^* = \frac{s^\theta h}{\Vert p_{\lambda ^*}\Vert } p_{\lambda ^*}^\top {(p_0 - p_{\lambda ^*})}. \end{aligned}$$
(81)

Combining Eq. 81 with our definition of \(F_0\) yields:

$$\begin{aligned} \hat{U} = F_0(\lambda ^*) = U_0 + {\delta U}^\top \lambda ^* + s^\theta h \Vert p_{\lambda ^*}\Vert = U_0 + \frac{s^\theta h}{\Vert p_{\lambda ^*}\Vert } p_{\lambda ^*}^\top {(p_0 - p_{\lambda ^*})} + \frac{s^\theta h}{\Vert p_{\lambda ^*}\Vert } p_{\lambda ^*}^\top p_{\lambda ^*}, \end{aligned}$$
(82)

which gives Eq. 24, completing the final part of the proof.
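Assembled into code, Theorem 2 gives a direct (iteration-free) way to evaluate the minimizer and the update value. The following sketch assumes a nondegenerate simplex and an interior minimizer, so the square roots below are real:

```python
import numpy as np

def exact_update(U0, p0, dP, dU, s_theta, h):
    """Sketch of the closed-form minimizer of F0 from Theorem 2: computes
    ||p_lam*|| (Eq. 77), lam* (Eq. 79), and U-hat (Eq. 24) from the problem
    data via the reduced QR decomposition dP = QR."""
    Q, R = np.linalg.qr(dP)
    c = np.linalg.solve(R.T, dU) / (s_theta * h)        # R^{-T} dU / (s h)
    p_min = p0 - Q @ (Q.T @ p0)                         # Proj-perp p0 (Eq. 74)
    p_norm = np.sqrt((p_min @ p_min) / (1.0 - c @ c))   # Eq. 77
    p_star = p_min - p_norm * (Q @ c)                   # Eq. 78
    lam_star = np.linalg.solve(R, Q.T @ (p_star - p0))  # Eq. 79
    U_hat = U0 + s_theta * h * (p_star @ p0) / p_norm   # Eq. 24
    return lam_star, U_hat
```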

Proofs for Sect. 3.5

Proof

(Proof of theorem 3) We assume that U is a linear function in the update simplex; hence, \(\nabla U\) is constant. By subtracting pairs of instances of Eq. 25 and stacking the result on top of the instance for a fixed index i, we obtain:

$$\begin{aligned} \begin{bmatrix} {\delta P}^\top \\ p_i^\top \end{bmatrix} \nabla U = \begin{bmatrix} {\delta U}\\ U_i - \hat{U} \end{bmatrix}. \end{aligned}$$
(83)

The inverse of the matrix in the left-hand side of Eq.  83 is:

$$\begin{aligned} \begin{bmatrix} \left( I - \frac{\nu _{{\text {min}}}p_i^\top }{\nu _{{\text {min}}}^\top p_i}\right) Q R^{-\top },&\frac{\nu _{{\text {min}}}}{\nu _{{\text {min}}}^\top p_i} \end{bmatrix}, \end{aligned}$$
(84)

which can be checked. This gives us:

$$\begin{aligned} \nabla U = \left( I - \frac{\nu _{{\text {min}}}p_i^\top }{\nu _{{\text {min}}}^\top p_i}\right) Q R^{-\top } {\delta U}+ \frac{U_i - \hat{U}}{\nu _{{\text {min}}}^\top p_i} \nu _{{\text {min}}}. \end{aligned}$$
(85)

Hence, setting \(\Vert \nabla U\Vert ^2 = {(s^\theta h)}^2\) yields a quadratic equation in \(\hat{U} - U_i\). Expanding \(\Vert \nabla U\Vert ^2\), a number of cancellations occur since \(Q^\top \nu _{{\text {min}}}= 0\). We have:

$$\begin{aligned} {\delta U}^\top R^{-1} Q^\top \left( I - \frac{\nu _{{\text {min}}}p_i^\top }{\nu _{{\text {min}}}^\top p_i}\right) ^{\top } \left( I - \frac{\nu _{{\text {min}}}p_i^\top }{\nu _{{\text {min}}}^\top p_i}\right) Q R^{-\top } {\delta U}= \left\| R^{-\top } {\delta U}\right\| ^2 + \frac{\left( p_i^\top Q R^{-\top } {\delta U}\right) ^2}{\left\| p_{{\text {min}}}\right\| ^2}, \end{aligned}$$
(86)

so that, written in standard form:

$$\begin{aligned} \begin{aligned} {(\hat{U} - U_i)}^2 + 2 p_i^\top Q R^{-\top } {\delta U}{(\hat{U} - U_i)} \,&+\, \left( p_i^\top Q R^{-\top } {\delta U}\right) ^2 \\&+\left\| p_{{\text {min}}}\right\| ^2 \left( \left\| R^{-\top } {\delta U}\right\| ^2 - \left( s^\theta h\right) ^2\right) = 0. \end{aligned} \end{aligned}$$
(87)

Solving for \(\hat{U} - U_i\) gives:

$$\begin{aligned} \hat{U} = U_i - p_i^\top Q R^{-\top } {\delta U}+ \left\| p_{{\text {min}}}\right\| \sqrt{\left( s^\theta h\right) ^2 - \Vert R^{-\top } {\delta U}\Vert ^2}, \end{aligned}$$
(88)

establishing Eq. 27.

Next, to show that \(\hat{U}' = \hat{U}\), we compute:

$$\begin{aligned} \hat{U}'&= U_0 + {\delta U}^\top \lambda ^* + s^\theta h \Vert p_{\lambda ^*}\Vert \\&= U_0 - \left( Q^\top p_0 + \Vert p_{\lambda ^*}\Vert R^{-\top } \frac{{\delta U}}{s^\theta h}\right) ^\top R^{-\top } {\delta U}+ s^\theta h \Vert p_{\lambda ^*}\Vert \quad \mathrm{(Eq.~23)} \\&= U_0 - p_0^\top Q R^{-\top } {\delta U}+ s^\theta h \Vert p_{\lambda ^*}\Vert \left( 1 - \left\| R^{-\top } \frac{{\delta U}}{s^\theta h}\right\| ^2\right) \\&= U_0 - p_0^\top Q R^{-\top } {\delta U}+ \left\| p_{{\text {min}}}\right\| \sqrt{\left( s^\theta h\right) ^2 - \left\| R^{-\top } {\delta U}\right\| ^2} = \hat{U}. \quad \mathrm{(Eq.~22)} \end{aligned}$$

To establish Eq. 28, first note that \(-R^{-\top } {\delta U}= s^\theta h Q^\top \nu _{\lambda ^*}\) by optimality. Substituting this into Eq. 27, we first obtain:

$$\begin{aligned} \hat{U} = U_i + \frac{s^\theta h}{\Vert p_{\lambda ^*}\Vert } \left( p_i^\top \mathtt {Proj}_{{\delta P}} p_{\lambda ^*} + \left\| p_{{\text {min}}}\right\| \sqrt{p_{\lambda ^*}^\top \mathtt {Proj}^\perp _{{\delta P}} p_{\lambda ^*}}\right) . \end{aligned}$$
(89)

Now, using the notation for weighted norms and inner products, we have:

$$\begin{aligned} p_i^\top \mathtt {Proj}_{{\delta P}} p_{\lambda ^*} + \left\| p_{{\text {min}}}\right\| \sqrt{p_{\lambda ^*}^\top \mathtt {Proj}^\perp _{{\delta P}} p_{\lambda ^*}} = \langle p_i, p_{\lambda ^*} \rangle _{\mathtt {Proj}_{{\delta P}}} + \left\| p_i\right\| _{\mathtt {Proj}^\perp _{{\delta P}}} \left\| p_{\lambda ^*}\right\| _{\mathtt {Proj}^\perp _{{\delta P}}}. \end{aligned}$$
(90)

Since \(\mathtt {Proj}^\perp _{{\delta P}}\) orthogonally projects onto \({\text {range}}({\delta P})^\perp \), and since the dimension of this subspace is 1, \(\mathtt {Proj}^\perp _{{\delta P}} p_i\) and \(\mathtt {Proj}^\perp _{{\delta P}} p_{\lambda ^*}\) are multiples of one another and their directions coincide (see Fig. 21); furthermore, the angle between them is zero since our simplex is nondegenerate. So, by Cauchy–Schwarz:

$$\begin{aligned} \left\| p_i\right\| _{\mathtt {Proj}^\perp _{{\delta P}}} \left\| p_{\lambda ^*}\right\| _{\mathtt {Proj}^\perp _{{\delta P}}} = \langle p_i, p_{\lambda ^*} \rangle _{\mathtt {Proj}^\perp _{{\delta P}}}. \end{aligned}$$
(91)

Combining Eq. 91 with Eq.  90 and cancelling terms yields:

$$\begin{aligned} p_i^\top \mathtt {Proj}_{{\delta P}} p_{\lambda ^*} + \left\| p_{{\text {min}}}\right\| \sqrt{p_{\lambda ^*}^\top \mathtt {Proj}^\perp _{{\delta P}} p_{\lambda ^*}} = p_i^\top p_{\lambda ^*}. \end{aligned}$$
(92)

Equation 28 follows.

To parametrize the characteristic found by solving the finite difference problem, first note that the characteristic arriving at \(\hat{p}\) is collinear with \(\nabla \hat{U}\). If we let \(\tilde{\nu }\) be the unit vector pointing from \(\hat{p}\) in the direction of the arriving characteristic, let \(\tilde{p}\) be the point of intersection between \(p_0 + {\text {range}}({\delta P})\) and \({\text {span}}{(\tilde{\nu })}\), and let \(\tilde{l} = \left\| \tilde{p}\right\| \), then, since \(\tilde{p} - p_0 \in {\text {range}}({\delta P})\):

$$\begin{aligned} \nu _{{\text {min}}}^\top (\tilde{p} - p_0) = 0. \end{aligned}$$
(93)

Rearranging this and substituting \(\tilde{p} = \tilde{l} \tilde{\nu }\), we get:

$$\begin{aligned} \tilde{l} = \frac{\nu _{{\text {min}}}^\top p_0}{\nu _{{\text {min}}}^\top \tilde{\nu }}. \end{aligned}$$
(94)

Now, if we assume that we can write \(\tilde{p} = {\delta P}\tilde{\lambda } + p_0\) for some \(\tilde{\lambda }\), then:

$$\begin{aligned} \tilde{\lambda } = R^{-1} Q^\top \left( \tilde{p} - p_0\right) = -R^{-1} Q^\top \left( I - \frac{\tilde{\nu } \nu _{{\text {min}}}^\top }{\tilde{\nu }^\top \nu _{{\text {min}}}}\right) p_0. \end{aligned}$$
(95)

To see that \(\tilde{p} = p_{\lambda ^*}\), note that since \(\tilde{\nu } = -\nabla \hat{U}/\left\| \nabla \hat{U}\right\| = -\nabla \hat{U}/(s^\theta h)\):

$$\begin{aligned} \mathtt {Proj}_{{\delta P}} \tilde{\nu } = \frac{-\mathtt {Proj}_{{\delta P}} \nabla \hat{U}}{s^\theta h} = \frac{-QR^{-\top } {\delta U}}{s^\theta h} = \mathtt {Proj}_{{\delta P}} \nu _{\lambda ^*}. \end{aligned}$$
(96)

Since \(\tilde{\nu }\) and \(\nu _{\lambda ^*}\) each lie on the unit sphere on the same side of the hyperplane spanned by the columns of \({\delta P}\), and since \(\mathtt {Proj}_{{\delta P}}\) orthogonally projects onto \({\text {range}}({\delta P})\), we can see that in fact \(\tilde{\nu } = \nu _{\lambda ^*}\). Hence, \(\tilde{p} = p_{\lambda ^*} \in p_0 + {\text {range}}({\delta P})\). The second and third parts of Theorem 3 follow.
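The identity \(\hat{U}' = \hat{U}\) established above can also be confirmed numerically. A short sketch with placeholder data follows (a small \({\delta U}\) keeps the square roots real); it compares the finite-difference form of Eq. 88 (with i = 0) against the variational form of Eq. 24:

```python
import numpy as np

rng = np.random.default_rng(2)
n, h, s_theta, U0 = 2, 0.1, 1.0, 1.0
p0 = rng.standard_normal(n + 1)
dP = rng.standard_normal((n + 1, n))
dU = 1e-3 * rng.standard_normal(n)

Q, R = np.linalg.qr(dP)
w = np.linalg.solve(R.T, dU)                  # R^{-T} dU
p_min = p0 - Q @ (Q.T @ p0)

# finite-difference form (Eq. 88 with i = 0):
U_fd = U0 - p0 @ (Q @ w) \
     + np.linalg.norm(p_min) * np.sqrt((s_theta * h)**2 - w @ w)

# variational form (Eqs. 77, 78, and 24):
c = w / (s_theta * h)
p_norm = np.sqrt((p_min @ p_min) / (1.0 - c @ c))
p_star = p_min - p_norm * (Q @ c)
U_var = U0 + s_theta * h * (p_star @ p0) / p_norm

assert np.isclose(U_fd, U_var)
```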

Proofs for Sect. 3.6

Proof

(Proof of theorem 4) For causality of \(F_0\), we want \(\hat{U} \ge \max _i U_i\), which is equivalent to \(\min _i(\hat{U} - U_i) \ge 0\). From Eq.  27, we have:

$$\begin{aligned} \min _i \left( \hat{U} - U_i\right) = s^\theta h \min _i \min _{\lambda \in \Delta ^n} \frac{p_i^\top p_\lambda }{\left\| p_\lambda \right\| } = s^\theta h \min _{i, j} \left\| p_i\right\| \nu _i^\top \nu _j \ge 0. \end{aligned}$$
(97)

The last equality follows because minimizing the cosine between two unit vectors is equivalent to maximizing the angle between them; since \(\lambda \) is restricted to lie in \(\Delta ^n\) and \(p_i^\top p_\lambda \) is linear in \(\lambda \), the minimum is attained at a vertex of \(\Delta ^n\).

For \(F_1\), first rewrite \(s_\lambda ^\theta \) as follows:

$$\begin{aligned} s_\lambda ^\theta = s^\theta + \theta (s_0 + {\delta s}^\top \lambda - \overline{s}), \end{aligned}$$
(98)

where \(\overline{s} = n^{-1} \sum _{i=0}^{n-1} s_i\). If \(\lambda _0^*\) and \(\lambda _1^*\) are the minimizing arguments for \(F_0\) and \(F_1\), respectively, and if \({\delta \lambda }^* = \lambda _1^* - \lambda _0^*\), then we have:

$$\begin{aligned} F_1(\lambda _1^*) = F_0(\lambda _1^*) + \theta \left( s_0 + {\delta s}^\top \lambda _1^* - \overline{s}\right) h \Vert p_{\lambda _1^*}\Vert . \end{aligned}$$
(99)

By the optimality of \(\lambda _0^*\) and strict convexity of \(F_0\) (Lemma 1), we can Taylor expand and write:

$$\begin{aligned} F_0(\lambda _1^*) = F_0(\lambda _0^*) + \nabla F_0(\lambda _0^*)^\top {\delta \lambda }^* + \frac{1}{2} {{\delta \lambda }^*}^\top \nabla ^2 F_0(\lambda _0^*) {\delta \lambda }^* + R \ge F_0(\lambda _0^*) + R, \end{aligned}$$
(100)

where \(\left| R\right| = O(h^3)\) by Theorem 1. Let \(\hat{U} = F_1(\lambda _1^*)\). Since \(F_0\) is causal, we can write:

$$\begin{aligned} \hat{U} \ge \max _i U_i + R + \theta \left( s_0 + {\delta s}^\top \lambda _1^* - \overline{s}\right) h \Vert p_{\lambda _1^*}\Vert . \end{aligned}$$
(101)

Since s is Lipschitz, the last term is \(O(h^2)\); in particular, \(\left\| {\delta s}\right\| = O(h)\), and \(\left| s_0 - \overline{s}\right| = O(h)\) since \(s_0\) and \(\overline{s}\) are values of s on the same \(O(h)\)-diameter simplex. So, because the gap \(\min _i(\hat{U} - U_i)\) for \(F_0\) is bounded below by a positive multiple of h while the perturbation terms are \(O(h^2)\), we can see that \(\hat{U} \ge \max _i U_i\) for h sufficiently small.
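The vertex condition behind Eq. 97 translates into a simple test. The following sketch checks whether a given set of stencil vectors \(p_i\) (stored as rows) satisfies the pairwise-angle condition that makes the \(F_0\) update causal:

```python
import numpy as np

def is_causal_stencil(P):
    """Sketch of the causality condition implied by Eq. 97: the F0 update
    is causal when no two of the vectors p_i subtend an obtuse angle,
    i.e. nu_i . nu_j >= 0 for all pairs i, j."""
    N = P / np.linalg.norm(P, axis=1, keepdims=True)  # unit vectors nu_i
    return bool((N @ N.T >= 0).all())

# e.g., the vectors of the standard nearest-neighbor stencil in 3D
# are pairwise orthogonal, hence causal:
print(is_causal_stencil(np.eye(3)))   # True
```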


Cite this article

Potter, S.F., Cameron, M.K. Ordered Line Integral Methods for Solving the Eikonal Equation. J Sci Comput 81, 2010–2050 (2019). https://doi.org/10.1007/s10915-019-01077-z


Keywords

  • Ordered line integral method
  • Eikonal equation
  • Factored eikonal equation
  • Simplified midpoint rule
  • Semi-Lagrangian method
  • Fast marching method

Mathematics Subject Classification

  • 65N99
  • 65Y20
  • 49M99