Path-following methods for calculating linear surface wave dispersion relations on vertical shear flows

The path-following scheme in [Loisel and Maxwell, SIAM J. Matrix Anal. Appl., 39-4 (2018), pp. 1726-1749] is adapted to efficiently calculate the dispersion relation curve for linear surface waves on an arbitrary vertical shear current. This is equivalent to solving the Rayleigh instability equation with linearised free-surface boundary condition for each sought point on the curve. Taking advantage of the analyticity of the dispersion relation, a path-following or continuation approach is adopted. The problem is discretized using a collocation scheme, parametrised along either a radial or angular path in the wave vector plane, and differentiated to yield a system of ODEs. After an initial eigenproblem solve using QZ decomposition, numerical integration proceeds along the curve using linear solves as the Runge--Kutta $F(\cdot)$ function; thus, many QZ decompositions on a $2N$ companion matrix are exchanged for one QZ decomposition and a small number of linear solves on a size $N$ matrix. A piecewise interpolant provides dense output. The integration represents a nominal setup cost afterwhich very many points can be computed at negligible cost whilst preserving high accuracy. Furthermore, a 2-dimensional interpolant suitable for scattered data query points in the wave vector plane is described. Finally, a comparison is made with existing numerical methods for this problem, revealing that the path-following scheme is asymptotically two orders of magnitude faster in number of query points.

1. Introduction. Engineering and the natural sciences are replete with eigenproblems for ordinary differential operators which depend on a finite set of parameters. We are interested in problems which are parametrised by a single real variable.
The canonical solution approach involves conversion to an algebraic problem via spatial discretization, which often leads to polynomial or even nonlinear eigenproblems of potentially large dimension. These can be solved using classical techniques for each sought parameter value. This strategy may become prohibitively expensive when the computation must be repeated many times. It can also be difficult to take advantage of the nearness of solutions for small parameter variations, forcing full calculations for each point in the parameter space. An alternative approach is to solve the discretized eigenproblem once for a fixed parameter value then use the local piecewise analyticity of the eigenvalue and eigenvector [17], [3], [28] to calculate along the solution curve using a path-following or numerical continuation algorithm.
In a more general setting, this comprises a numerical continuation method whereby the parameter-dependent solution is calculated as an implicitly defined curve [2]. Homotopy methods have a similar philosphy but introduce an artificial parameter to parametrise a convex homotopy to map from the solution of an 'easy' problem to the solution of the actual problem [20]. These methods tend to use predictor-corrector schemes such as pseudo-arclength continuation or similar approaches. We make reference to the homotopy method in [22] and the invariant subspace methods in [8], [7] as relevant examples. For a recent approach that shares a strong philosophical similarity with the material herein for working with time-varying matrix eigenproblems, albeit using different techniques (look-ahead finite difference formulas), see [41] and [40].
This paper is concerned with repurposing the path-following technique used in [21] to solve a specific classical problem from wave-current interactions [29], [30], [31], [42], [44], [27,s. IV], [34], [45], [26], [25] : that of calculating the dispersion relation for perturbative linear order free-surface waves travelling atop a vertical shear flow. The problem is particularly suited as a motivating example of the technique: it is conceptually simple, it has an eigenvalue-dependent boundary condition, it is wellknown from both the waves literature and hydrodynamic stability, and there is a practical requirement for efficient numerical solution.
We summarise our approach as follows. The original eigenproblem is spatially discretized using a collocation method, implicitly incorporating the boundary conditions, to obtain a parameter-varying system of equations that are then differentiated to yield an under-determined system of ODEs. An additional constraint is then included. After performing an initial eigenproblem solve, numerical integration can proceed along the solution curve using linear solves as the Runge-Kutta F (·) function. A piecewise polynomial interpolant provides dense output.
1.1. Outline of paper. We begin by introducing the geometry of the physical problem and some problem-specific background in section 2. The collocation scheme used is briefly described in section 3. The path-following method is described in section 4 for both the reduced and general problem using scattered data. In section 5, we provide numerical results to determine the expected accuracy of the collocation and path-following methods. In section 7, we evaluate the relative performance characteristics of the various methods and in section 8 we describe how to choose optimal parameters. Finally, in section 9, we provide some conclusions.

Preamble.
2.1. Problem description. Wave-current interaction problems are often studied by adopting a modal linear perturbative approach: waves are considered as firstorder perturbations of a stationary, incompressible, and inviscid bulk fluid flow. In this context, waves are dispersive with the phase velocity of a wave dependent on the wave vector in a nonlinear manner. The relationship between the wave vector, k, and the phase velocity, c, is termed the dispersion relation and is determined by factors such as water depth and background current.
In our context of first-order free-surface waves atop a vertical shear flow, the problem reduces to finding solutions of the eigenvalue problem formed from the Rayleigh instability equation and appropriate boundary conditions. The Rayleigh equation is a second order ODE that is equivalent to the Orr-Sommerfeld equation when viscous effects are neglected. A solution to the eigenproblem will yield a {k, c, w(z)} triplet where w(z) is the associated eigenfunction. The eigenfunction can be used to reconstruct the velocity and pressure field for the corresponding wave vector [19, s. 4.4]. Notably, there is substantial overlap with the literature on linear stability theory, e.g. [32, ch. 2] or [11], albeit with different boundary conditions. Closed-form expressions for the dispersion relation for free-surface waves exist only in specific scenarios such as quiescent water [24] or a linear shear current (constant vorticity) [12]. Integral approximation methods [35], [18], [13] and numerical schemes [34], [45], [36], [19] exist to calculate the dispersion relation for arbitrary shear profiles.
To make matters more concrete, we adopt the general approach used in [34], (a) An illustrative example of dispersive ring waves atop a shear current. An example shear profile is indicated with z-dependent arrows changing direction in the horizontal plane and an example k vector as k 0 .
(b) Geometry used for reduced problem. Shear profile U T (z) shown in blue. [19] and refer the reader to the derivations therein for full detail. For expediency, the approach is only summarised here. The physical model is depicted in Figure 1a. Dimensional quantities are denoted with an acute, e.g.ḱ.
The background flow is specified by a shear profileÚ (z) = (Ú x (z),Ú y (z)): a 2dimensional vector field describing the bulk fluid velocity in the horizontal plane for a givenź ∈ [−h, 0] whereh is the constant fluid depth and the unperturbed surface is atź = 0. Let (2.1) We useh as a characteristic length scale andÚ m as a characteristic velocity, to arrive at the following nondimensionalisation: so that a notional shear strength can be expressed with Froude number, F =Ú m / gh. The shear profile must be suitably regular, so we impose that U We also require c parallel to k and we henceforth only refer to scalar c = |c|. It is assumed that the current can influence the waves but not conversely and, for clarity of exposition, we neglect surface tension. We adopt the Ansatz that perturbations are plane waves proportional to exp[i(k·x−ωt)] where the wave angular frequency is ω = kc, and use a Fourier transform in the horizontal plane (coordinate space quantities indicated with a tilde), The velocity perturbations along the x, y, and z axes are respectively u = u(k, z), v = v(k, z), and w = w(k, z), whilst the pressure is p = p(k, z). The governing equations are the linearised Euler equations and incompressibility condition, e.g. recall [34]: for µ = k 2 . The velocity and pressure field for a specific wave vector can be recovered by substituting the appropriate eigenpair into [19, s. 4.4], The eigenvalue problem described by (2.4) is inherently 1-dimensional. Usually, the physical problem is posed so that U (z) is scalar for U (z) = (U (z), 0), with scalar k and c, see Figure 1b. However, there is no difficulty in solving the more general physical problem by simply projecting U (z) along k. Full 3-dimensional considerations only come to the fore when calculating the velocity and pressure field. To avoid the inherent ambiguities of '1d' / '2d' or '2d' / '3d' descriptions, we refer to the problem with scalar U (z) as the reduced problem and with vector U (z) as the general problem. In subsection 3.1, we note that the reduced problem is equivalent to solving the general problem in a radial 'slice' at some fixed angle θ 0 .
So far, we have deliberately avoided specifying which variable is the sought eigenvalue in (2.4): it can be chosen as either µ or c, with its counterpart parametrising the problem and always chosen to be real valued, in a similar manner to [9, s. 7.4]. Since we are always choosing the parameter to be real-valued, we are concerned with a subset of the spectrum in each case and can plot this arrangement as a function of the parameter.
• The spectrum for µ(c), for c in some suitable interval, is comprised of a countably infinite set of eigenvalues. The dominant eigenvalue, µ 1 = k 2 , in this case is the only positive eigenvalue, and corresponds to a propagating wave (for ±k). The negative eigenvalues correspond to the countably infinite set of discrete k arranged along the imaginary axis and are not mentioned further. See Figure 2a. • The spectrum for c(k), for k > 0, has both discrete and essential part (c such that U (z) − c = 0, causing the ODE to become singular). In this case, the sought eigenvalue is again dominant but may be located within the same interval as the essential spectrum and therefore can be difficult to identify within numerical solution sets. See Figure 2b.  2.2. Problem types: forward, backward, and inverse. We distinguish three types of problem.
1. For shear profile, U (z), and collection of wave numbers, {k (j) } J j=1 , calculate associated velocities {c(k (j) )} J j=1 . We denote this as the forward problem. 2. For shear profile, U (z), and collection of velocities, {c (j) } J j=1 , calculate associated wave numbers {k(c (j) )} J j=1 . We denote this as the backward problem. 3. For collection of wave number and phase velocity pairs, {(k (j) , c (j) )} J j=1 , determine shear profile, U (z). We denote this as the inverse problem, which is of an entirely different nature and herein not considered further. Both the forward and backward problem usually amount to calculating sufficiently many {k (j) , c (j) } pairs as to adequately specify the full dispersion relation for a given shear profile. For practical purposes, these problems are almost always posed as (2.3) with exponent of the form i(kx−ω(k)t) (see [27, p. 77, eqn. 4.6], [18, eqn. 1], [34, eqn. 2.4], etc). This, by default, presupposes solving the forward problem. Furthermore, handling of critical layers as in subsection 3.5 is, from a numerical standpoint, easier for the forward problem. Therefore, most of this paper concerns solution of the forward problem.
There are a few exceptions to this rule such as for wave problems with periodic or stationary time dependence, e.g. in ship waves. Hence, for purposes of completeness, we also describe solving the backwards problem using a rudimentary collocation scheme and the basic properties of the spectrum.

Summary of numerical schemes.
For ease of reference, we denote the various numerical schemes used or described in this paper: • CL-c : Collocation scheme for the forward problem, see subsection 3.4.
• CL-k : Collocation scheme for the backward problem, see subsection 3.3.
• PF-R-r-c : A path-following scheme with dense output to solve the forward reduced problem along a fixed angle θ 0 in the k-plane, see subsection 4.2.1. • PF-R-a-c : A path-following scheme with dense output to solve the forward reduced problem for c k0 (θ) along a fixed circle of radius k = k 0 with varying θ in the k-plane, which we term the angular solution, see subsection 4.2.2.
• PFmp-R-{r,a}-c : An illustrative scheme using a single high precision QZ solve to improve accuracy of PF-R-{r,a}-c, detailed in subsection 4.2.4. • PF-G-c : A scheme which solves the forward general problem by using PF-R-r-c and PF-R-a-c to allow rapid interpolation with 2-dimensional scattered data query points in the k-plane, see subsection 4.3. The CL-c and CL-k schemes incur an eigenvalue calculation for each query point, so the computational cost will increase linearly with the number of query points. The arrangement of points in the k-plane for the CL-schemes can be random without affecting computational cost.
The PF-R-r-c path-following algorithm is two stage: it first performs numerical integration to calculate control points along a radial 'slice' at fixed angle θ 0 , which incurs a nominal initial computational cost; query points on that slice are then calculated using a Hermite interpolant. Although the computational cost of interpolation is linear in the number of query points, it is so light-weight as to be of almost negligible cost in most situations: so after the initial computation, very many query points can be calculated efficiently. The angular PF-R-a-c scheme is similar but instead calculates along a circular path at a fixed radius k 0 .
The PF-G-c scheme is more involved because we accept query points in the kplane with no assumption on arrangement, i.e. scattered data. A naive approach would incur a complete first stage calculation of PF-R-r-c for every query point, which is unacceptable. The PF-G-c scheme instead precalculates a 2-dimensional polar grid of suitable control points and then can interpolate for query points at negligible cost.
Note that all methods presented can also make available the eigenfunction w(z) so that the velocity and pressure field can be reconstructed using (2.5).
2.4. Existing algorithms and approximation methods. Development in this area has been slightly unusual: despite the problem being readily amenable to numerical methods, there has been an emphasis on integral approximation schemes. In chronological order: Stewart & Joy [37] (infinite depth), Skop [35] (finite depth), Kirby & Chen [18] (finite depth to 2nd order), and finally Ellingsen & Li [13]. Since our focus is on numerical methods, we do not address approximation schemes further.
The principal algorithm against which we compare is 'DIM' from [19], which also contains a review of other numerical methods including the perennial piecewise-linear approach. For purposes of completeness, we also perform numerical simulations using a basic shooting method.

Shear profiles and parameters used.
For later numerical tests, we define a test shear profile U T for the reduced problem, as shown in Figure 1b, We choose the physical depthh = 20 and shear Froude number as F 2 = 0.05. We choose nondimensional k ∈ I k := [ 0.025, 250 ]. This corresponds broadly to gravity waves in the air-water interface regime [24, p. 4] with shortest period ≈ 0.2s. The function is chosen as a suitable test candidate because it has several stationary points and cannot be approximated exactly over a finite dimensional polynomial basis. For the figures produced from PF-G-c shown later in Figure 4, we use U T (z) along the x axis and an approximation of a flow in the Colombia River on the y axis, which we denote U CR . This is defined by a sixth degree polynomial and in our tests was scalled to have F 2 ≈ 0.01; the precise definition of U CR is not so important as it is used only for the illustrative plots in subsection 4.3.
For more general choice of shear profile and parameters, it may be possible to create critical layers. These are z ∈ [−1, 0] for which there exists some c(k) such that U (z) − c(k) = 0, i.e depths z for which the governing equation becomes singular. For our chosen shear profiles and parameters, critical layers are not encountered. Brief mention is made in subsection 3.5 of how critical layers may be processed for CL-c.
3. Collocation method for solving the dispersion relation.
3.1. General to reduced problem. The general problem (2.4) can be simplified to a reduced problem by projecting U along k, cf. [27, p. 77] [34, p. 566]. Define the scalar shear profile for the reduced problem as U θ (z) = (1/k)k · U = cos(θ)U x (z) + sin(θ)U y (z) where θ is taken to be the standard angular polar coordinate for k.
3.2. Discretization of the equations. We use sans serif notation to indicate matrices (uppercase) and vectors (lowercase), e.g. U or w, to distinguish from their continuous counterparts.
The problem is a two-point boundary value problem so is amenable to the standard 'row-replacement' strategy, see for example [39, ch. 7]. Specifically, we aim to construct eigenvalue equations which discretise the governing equation (2.4a) using the 'interior' rows 2 through N z of the differentiation and coefficient matrices. The free-surface boundary condition (2.4b) is incorporated in the first row of the matrices. The bottom Dirichlet (2.4c) boundary condition is accounted for by eliminating the last row and column in the matrices. For notational convenience, we define 'interior' differentiation and shear profile matrices as D int = D lm , U int = U lm , U int = U lm , and U int = U lm with l = 2 . . . N z , m = 1 . . . N z (in other words, eliminating the first and last rows, and last column). We also define a free-surface differentiation vector as the first row of D, d f := D 1m , m = 1 . . . N z , again without the last column.

Backward reduced problem (CL-k).
Treating c as a parameter and k as the eigenvalue, we obtain a regular Sturm-Liouville problem on z ∈ [−1, 0] with eigenvalue µ = k 2 , Let q j := u j /(u j − c), q int := [q 2 . . . q Nz ], and Q int = diag(q int ). The discretization of (3.1a) proceeds in the obvious manner, Discretising (3.1b) into a row vector gives, to obtain the generalised eigenvalue problem, Note that the only effect of B is to ensure that the row of A with the free-surface boundary condition is set equal to zero and is not dependent on the eigenvalue. (3.5) can be solved in several ways, e.g. using MATLAB's implementation of QZ as eig (A,B). For a given c, there is a countably infinite set of discrete µ j eigenvalues ordered µ 1 > µ 2 > . . .. However, the only positive eigenvalue is µ 1 , which corresponds to the only real k, hence ±k represent the only propagating waves; we solve only for the positive branch. This is shown in Figure 2a. 3.4. Forward reduced problem (CL-c). Now, treating k as a parameter and c as the eigenvalue, we rewrite the reduced problem (3.1) to emphasise the quadratic eigenvalue dependence in the free-surface boundary condition, We initially discretise (3.6a) as We proceed by expressing the free-surface condition as coefficients of the powers of c, In the same manner, we now separate (3.7) into powers of c, To obtain the sought solution, we solve the quadratic eigenproblem, There are several techniques to solve the quadratic eigenvalue problem, although a direct linearisation and then using a QZ decomposition is sufficient in this setting. MATLAB's polyeig(A2,A1,A0) implements such a linearisation, although some care must be taken. In particular, the A 2 matrix is badly rank-deficient. As a consequence, the QZ algorithm will return infinite and large-but-finite eigenvalues which are merely artefacts of the numerical method and must be removed.
The spectrum has two discrete branches and essential spectrum; we seek the positive branch (greatest eigenvalue), which corresponds to propagating waves. The essential spectrum contains all c such that U (z) − c = 0 for some z ∈ [−1, 0], see [11] and Figure 2b. A critical layer exists if phase velocity c ∈ [U min , U max ] (see shaded region in Figure 2b), in other words if c is in the region occupied by the essential spectrum. The QZ algorithm returns many points from the essential spectrum and, if c ∈ [U min , U max ], they are numerically indistinguishable. However, the eigenvectors from the essential spectrum have singular behaviour in the interior of their domain whereas this is not true for the eigenvector corresponding to sought eigenvalue c. Thus, the sought eigenvalue can, in principle at least, be identified and computation may still proceed when critical layer(s) are present. The question of critical layers is, however, not central to the theme of this paper and so shall not be mentioned further.
4. Path-following method for calculating the dispersion relation curve. 4.1. Review of Loisel-Maxwell path-following method for the field of values. In [21], the authors describe a path-following method to calculate the field of values boundary of a matrix, which we now briefly summarise. It concerns the solution of a parametrised Hermitian eigenvalue problem (which bounds the projection of the field of values onto the real axis), Here, and in the remainder of the paper, the overdot notation is used to indicate derivatives with respect to the problem parameter. This is to emphasise the parameter-varying or "time-varying" nature of the problems. Note that (4.1) is well-defined except perhaps for a finite number of τ j due to elementwise analyticity of the elements of H(e iτ A) and the analyticity, up to ordering, of the eigenvalue and eigenvectors. Differentiating (4.1) gives, (4.2) H(e it A)u(τ ) −λ(τ )u(τ ) − λ(τ )u(τ ) = −iS(e iτ A)u(τ ).
Since the system is under-determined, an additional constraint that u(τ ) must be tangent to its (elementwise) derivative is included, giving the system, which can be rewritten in matrix form, The system described by (4.4) can be solved for [u(τ ) * λ (τ ) * ] * using a linear solver and used as the F (·) function for a Runge-Kutta numerical integrator, which then generates control points along the curve. The authors use the Dormand-Prince RK5(4)7M method [10, p. 23] and interpolation method of Shampine [33, p. 148]. The near-interpolant solution from this method is 5 th order accurate.

4.2.
Path-following method for forward reduced problem. We now extend the same process to the quadratic eigenvalue problem posed in subsection 3.4. Recall (3.11), which upon differentiating (indicated with overdot) with respect to k gives, We further impose that w(k) * ẇ (k) = 0. Writing in matrix form, This is the general form in which the structure is clear. In the subsections below, we perform the same derivation but include the specific expressions for the radial and angular paths including boundary conditions. The approach taken is analogous to [21]: an initial eigenpair {c 0 , w 0 } is calculated using CL-c. Then by using (4.7) to solve for [ẇ(k) * ċ (k) * ] * , numerical integration can proceed along the curve in both directions. Hermite interpolation can then be used to query at arbitrary k.

System of equations along radial slice at fixed θ (PF-R-r-c).
For PF-R-r-c, we fix angle θ = θ 0 and parametrise by k. Thus, we are in the setting of the reduced problem with the constant shear profile being the relevant reduced shear profile, U θ (z). Writing (3.1a) in matrix form with c as eigenvalue and explicit dependence on parameter k, For notational succinctness, we use the shorthand c = c(k) and w = w(k). Differentiating (4.8) with respect to k (indicated by an overdot) gives, The free-surface condition can be written as, Differentiating (4.10) with respect to k, Upon rearranging terms, we define the block matrices: , (4.12c) so that the system of ODEs can now be written in matrix form as: .
Note that we do not include the row corresponding to the bottom surface, only the free-surface is included.

System of equations along angular circle at fixed k (PF-R-a-c).
For PF-R-r-c, the angle θ and hence the shear profile was held constant. For PF-Ra-c, we instead hold k constant and seek to use a θ angular dependence. Therefore, we must also specify the parametrisation of the shear profile.
So that in matrix form, and, upon differentiation with respect to θ (indicated with an overdot), Our starting point is the same, we use (4.8) but instead hold k constant and take the derivative with respect to θ. Temporarily adopting the abbreviated notation U int = U int (θ), w = w(θ), and c = c(θ): As before, the free-surface condition is (4.10), which we take the derivative of with respect to θ using the shorthandu 0 =u 0 (θ), In a similar manner to before, we define the block matrices: so that the system of ODEs can now be written in matrix form as: .
Note that the P and Q matrix have the same structure as in (4.13), it is R that changes.

4.2.
3. Path-following algorithm specification for reduced problem. We describe the algorithm for PF-R-r, the algorithm for PF-R-a follows in the obvious manner. Using the definitions of P, Q, R from (4.12) define matrix and vector functions, Given a candidate v(k) := [ w(k) * c(k) * ] * , define the Runge-Kutta F (·) function as, F (·) can be easily obtained using a linear solver, such as LU decomposition.
The algorithm requires an initial v 0 = v(k 0 ) calculated using CL-c. As in [21], the Dormand-Prince RK5(4)7M method [10, p. 23] and Hermite interpolation strategy of Shampine [33, p. 148] is used. We use automatic stepsize control as described in [14, p. 167]. For an interval [k (j) , k (j+1) ] with midpoint k (mid) , the integrator produces control points and v (j+1/2) = v(k (mid) ). Thus, after numerical integration, a solution set of v (j) , v (j) , and v (j+1/2) is obtained upon which piecewise Hermite interpolation can be performed. If both c(k) and the eigenvector w(k) is required then interpolation is over N + 1 length vectors; if only c(k) is required then interpolation is only 1-dimensional. Example output is shown in Figure 3 (1-dimensional output). Sample interpolant query points shown with red asterisks.

4.2.4.
PFmp-R-{r,a}-c: improving accuracy for PF-R-{r,a}-c. As shall be described in section 5, the error in the CL-methods are determined almost entirely by roundoff error incurred during the solution of the quadratic eigenvalue problem in double precision. The path-following algorithm essentially maintains the same error as is present in the initial v 0 . By calculating v(k 0 ) in high precision arithmetic then executing the path-following schemes in double precision as normal, an improvement in accuracy of two to three orders of magnitude is obtained. This is discussed further in subsection 5.1.

Path-following method for forward general problem (PF-G-c).
The PF-R-r and PF-R-a algorithms can be combined to create an efficient algorithm that can process scattered data query points, which we describe below.
i. First, PF-R-a is executed at some nominal k = k 0 and interpolation points at angles {θ (j) } J j=1 are calculated. See Figures 4a and 4b. ii. The results from step i. are used as the initial v 0 values for PF-R-r calculating radially along each θ (j) . The curve on each radial slice is then interpolated at predefined {k (i) } I i=1 points. The control points for each radial slice are then replaced with the control points at these fixed k (i) (we do not calculate new midpoint values). So there is now a 2-dimesional polar grid at angles θ (j) and radii k (i) . See arrangement in Figures 4c and 4d. iii. For an arbitrary query point (k q , θ q ) the nearest angles θ (l) , θ (l+1) and radii k (m) , k (m+1) are identified. The interpolant on radial slices at angles θ (l) , θ (l+1) are calculated at radius k q . (4.17) is then used to calculate the angular derivatives. Finally, (cubic) interpolation is performed in an angular direction for angle θ q to obtain the solution. See Figures 4e and 4f. Note that after steps i. and ii. are calculated once, only step iii. need be performed for further query points, in a similar manner to PF-R-{r,a}.
There is a loss in accuracy because of the required use of cubic interpolation -due to not having the midpoint-rather than the 4th order interpolation used in PF-R-{r,a}. However, for these purposes, it is not particularly significant. For clarity, we omit further analysis of PF-G: it is broadly similar to PF-R-r and does not add anything to the discussion.
(a) Planar plot of PF-R-a used at normial radius k 0 , interpolated at angles θ (j) . Blue circles are control points, red astrisks are interpolation points. Angles indicated in dotted grey.
(c) Planar plot of PF-R-r used along each θ (j) . Blue circles are control points, red astrisks are interpolation points (the k (i) ).

Convergence and error estimates.
It is well known that for sufficiently smooth solutions, spectral methods converge exponentially fast or with 'spectral accuracy' [9, ch. 1,2]. However, roundoff error poses a significant challenge for collocation methods due to the interaction of ill-conditioned matrices with commonly used double precision calculations [9] [5]. We adopt a heuristic strategy to estimate the accuracy of each algorithm.

5.1.
Dependence of eigenvalue accuracy on order N z . To determine accuracy depending on N z , we first calculate a reference dispersion relation ref values distributed along the test interval I k . This is done in high precision arithmetic, using the Advanpix library [16], for N ref = 384; this size of matrix exceeds what would be used in practice.
We calculate the relative normwise error in a candidate dispersion relation R cand (with k This is done for for the U T shear profile using the CL-c, PF-R-r-c, PFmp-R-r-c, and DIM algorithms. The CL and PF methods reduce error with spectral accuracy until roundoff error starts to dominate. The high-precision initial calculation for the PFmp algorithm avoids this roundoff error and it can be seen that the pathfollowing method itself retains this improved accuracy even in double precision. DIM is included for indicative purposes. See Figure 5a.
A possible explanation for this can be found in comparison of the backwards error and conditioning of the quadratic eigenvalue solve used for the CL-schemes compared to the linear solves predominantly used in PF-. Although it is not a direct comparison -the linear solves are used to calculate a derivative, not the value itself-it may lend some insight. The backwards error for the linear solve can be calculated with [15, eqn. 1.2] and the condition number in the usual manner: where w l is a corresponding left-eigenvector. For a range of N z , we calculate the · ∞ of the backwards errors η L , η Q and condition numbers κ L , κ Q over the k vector. This then permits calculating the product from (5.4). This is shown in Figure 5b. The condition numbers are of the same magnitude, κ L ≈ κ Q , but the backwards error for the linear solves in PF-r is clearly smaller, η L η Q . Although a direct comparison cannot be made, this suggests the path-following method has favourable numerical properties.
(a) Log-log plot of normwise relative error in candidate algorithms depending on Nz. The collocation method and path-following algorithm reduce error with spectral accuracy until roundoff error starts to dominate around Nz ≈ 65. Path-following+MP which uses a high-precision initial result maintains the broadly the same low error despite the actual path-following calculations being preformed in double precision. DIM reduces error as predicted, as O(N −2 z ).
(b) Backwards error, condition number, and error estimate (backwards error × condition number). The time series for the quadratic eigenproblem solves for CL shown in blue, the linear solves for PF in magenta. From the condition numbers, shown with dotted lines, it can be seen that CL solves are slightly better conditioned but PF solves are of the same order of magnitude. The backwards error, shown in dashed lines, shows the linear solves in PF are appreciably more backwards stable. The error estimate clearly favours PF. 5.2. k-dependent convergence. As can be seen from Figure 6a, the eigenvectors become numerically singular at the surface as k increases, implying that increasingly many basis polynomials are required to approximate the solution. This can be tested by using a similar algorithm as in [4] to determine when the Chebyshev series has converged. Specifically, we calculate an envelope then use a histogram to locate the plateau convergence region. We then determine the required N z to reach convergence for a range of k values, as shown in Figure 6b. For shorter wavelengths, much higher N z is required to reach convergence and so requiring more computational resources. This problem can be entirely ameliorated, as described in section 6.
6. Adaptive depth and partition of unity. It is clear from the results in subsection 5.2 that as k increases, the required N z becomes infeasibly large due to the singular behaviour of the eigenfunction. This can be avoided by using a smaller h so that h 1 for higher k, on the following rationale. We expect that the eigenfunction decays roughly as e kz . Therefore, we can estimate the depth below which the eigenfunction is effectively zero, for numerical purposes. Let δ be the tolerance below which numerical values are considered zero, e.g. the "machine epsilon". Let h δ (k) := min{1, − log(δ)/k} be an estimate of the depth, taking into account the finite depth, at which the eigenfunction decays below tolerance δ for a given k.
The CL-r scheme can be adapted for each calculate k. For a given k, we can set h = h δ (k). The calculated eigenvalue for the phase velocity will be correct automatically. The eigenvector may be remapped back onto the original interval on any suitably large set of z points chosen on the [−1, 0] interval using barycentric interpolation [6]; the eigenvector will be zero for z < −h δ (k).
This procedure becomes less obvious when considering the PF-r scheme because  it would require remapping the entire system at each Runge-Kutta step. To avoid this, we instead split the k domain into several, partially overlapping, subintervals for which the chosen depth is suitable for all k in that subinterval. The path-following algorithm is then used on each subinterval independently with the appropriate depth. To combine the subintervals and avoid loss of smoothness in the computed dispersion relation, a partition of unity method is used on the overlaps.
We seek a scheme to choose subintervals I b ] and corresponding depths h (j) that is both simple and easy to implement. For some k (j) ∈ I where C min , C max are constants controlling the proportion of the [h (j) , 0] interval that h δ (k (j) ) should be within. In our computations, we found that C min = 0.3 and C max = 0.8 worked well. The subintervals and associated depths are then calculated as, This generates intervals I (j) k in such a manner that k , i.e. there is some overlap in the intervals. We use the partition of unity method described in [1] to join the subintervals. This is demonstrated in Figure 7.
7. Performance analysis. There are two variables which control the expected computation time for the candidate algorithms: the number of z evaluation points, N z , and the number of query points N q . Since N z determines accuracy and is dependent on algorithm choice, we assume N z is set appropriately for each algorithm to achieve similar accuracy. Therefore, our primary concern shall be how the algorithms scale with N q . DIM will incur a fixed per-point computational cost that depends on the number of z points, which we denote σ DIM (N z ). So, the expected cost is O(σ DIM (N z )N q ). Similarly, the collocation algorithm incurs a fixed per-point computational cost which also depends on the number of z points used, σ CL (N z ). So, the expected cost is O(σ CL (N z )N q ). These estimates are valid for both the reduced and general problems.
In contrast, the reduced path-following algorithm incurs an initial computational cost dependent on the number of z points, σ PF-NI (N z ), whereafter there is a very lightweight per-point cost, σ PF-Q σ PF-NI (N z ). Therefore, the expected computational costs is O(σ PF-NI (N z ) + σ PF-Q N q ). The general path-following algorithm is similar with only the coefficients changed. This is summarised in the following table, assuming the eigenvector output is not required: is not too large and σ PF-Q is sufficiently small then as N q increases, the path-following algorithms are much more efficient.
The asymptotic complexity claims are confirmed by practical testing. For clarity, we only test with the reduced problem. By measuring the time taken for each algorithm to compute the dispersion relation for differing N q , we can determine the computational complexity in relation to N q as shown in in the log-log plot, Figure 8. Each algorithm can be executed with different parameter choices that influence accuracy. As such, we calibrated each algorithm to produce output at three different accuracies (measured as relative normwise error using (5.1)): ≈ 10 −4 , ≈ 10 −7 , and ≈ 10 −10 . As seen in the results, the path-following algorithm is asymptotically at least two orders of magnitude faster than both DIM and the collocation scheme. The break-even point in N q at which the path-following scheme becomes faster than DIM CL DIM DIM PF SH DIM Fig. 8: Performance plot for reduced problem. Note that the collocation and DIM algorithms are clearly linear in complexity with respect to Nq. The path-following algorithm is also linear but this is only visible after around 10 5 points because the interpolation cost is so small. For the ≈ 10 −4 results, the path-following algorithm breaks-even at around 1200 query points; at ≈ 10 −7 accuracy, it breaks-even at around 100 points; and, at ≈ 10 −10 accuracy it is always faster. Asymptotically, the path-following algorithm is at least two orders of magnitude faster than both DIM and the collocation scheme.
8. Guidance on optimal parameter choices. Optimal parameter choices are predicated on two key properties: the required accuracy and the anticipated number of query points.
As can be observed from Figure 8, the path-following algorithm is most effective when higher accuracy and at least a moderate number of query points are required. The nominal setup cost caused by the initial quadratic eigenproblem solve and numerical integration is dependent on the order of differentiation matrix used, N z . So this should be kept at the lowest value possible that maintains required accuracy. We found N z between 48 and 64 is optimal for the cases we tested. Furthermore, using N z too high risks roundoff error causing deleterious effects, cf. Figure 5a.
The Dormand-Prince integrator requires a tolerance for the adaptive stepsize control. We suggest that 10 −11 is the smallest value to use when the initial eigenproblem solve is performed in double precision. If the initial eigenvalue solve can be performed more accurately, for example in high precision arithmetic, then the tolerance can be set around 10 −15 . In any case, if using a smaller N z then the tolerance should be adjusted to match the accuracy from the collocation solution.
9. Conclusions. By considering the boundary value eigenproblem posed by the Rayleigh instability equation with linearised free-surface boundary condition as parameterised by wave number k, we can adapt the path-following scheme in [21] to efficiently calculate the dispersion relation at high accuracy. This efficiency is achieved by first exchanging many expensive QZ decompositions on a size 2N matrix for one QZ decomposition and some linear solves on a size N matrix; secondly, we 'front load' the computational cost into the numerical integration with light-weight Hermite interpolation being used to compute the sought solution points.
The accuracy tests in section 5 suggest that the path-following algorithm can maintain the accuracy of the initial eigenpair v 0 and appears to be numerically more stable than the QZ decomposition used to obtain the initial eigenpair.
The algorithm is extended to permit calculation in the 2d k-plane with scattered data and some preliminary discussion of critical layers is given.
In other tests, not included here, it is clear the same approach works well for other problems from physics and engineering, assuming the problem is parametrised by a real scalar. Additional difficulties arise when there are exceptional points or bifurcations in the solution curve, or if the ODEs become stiff. These challenges may form the basis of future work.
The MATLAB library used to perform the calculations used in this paper is maintained at [23].