A fast sparse spectral method for nonlinear integro-differential Volterra equations with general kernels

We present a sparse spectral method for nonlinear integro-differential Volterra equations based on the Volterra operator's banded sparsity structure when acting on specific Jacobi polynomial bases. The method is not restricted to convolution-type kernels of the form $K(x,y)=K(x-y)$ but instead works for general kernels at competitive speeds and with exponential convergence. We provide various numerical experiments on problems with or without known analytic solutions and comparisons with other methods.


Introduction
The Volterra integral operator is defined by

$$(V_K u)(x) = \int_0^x K(x,y)u(y)\,\mathrm{d}y. \qquad (1)$$

Integral equations involving Volterra operators occur in diverse applications in various disciplines of engineering [53,56], finance and economics [3,35] as well as the natural sciences [9,20,26,28,29,33,42,55]. Beyond linear Volterra integral equations of the first and second kind, which respectively take the forms

$$g(x) = \int_0^x K(x,y)u(y)\,\mathrm{d}y \qquad \text{and} \qquad u(x) = g(x) + \int_0^x K(x,y)u(y)\,\mathrm{d}y,$$

there is substantial interest in nonlinear Volterra integral equations and in particular nonlinear VIDEs, the last of which take the form

$$\sum_{i=0}^{m} \lambda_i u^{(i)}(x) = g(x) + \int_0^x K(x,y)\, f(u(y))\,\mathrm{d}y,$$

where $m \in \mathbb{N}$ and $\lambda_i \in \mathbb{R}$ for all $i$. Linear Volterra equations thus correspond to the special case $f(u) = u$.

For certain very well-behaved cases one can use variational iteration, Adomian decomposition or Laplace transform methods [54] to obtain analytic solutions for such equations, but in general one must use numerical methods to find approximate solutions. We assume in this paper that the equations we intend to solve with the proposed numerical scheme are solvable with unique solutions and omit discussion of ill-posed problems; a number of results on criteria for the existence of solutions are known, see e.g. [22,32,57] and the references therein.

Interest in efficient algorithms with good convergence properties is high, resulting in a variety of competitive methods for linear, nonlinear and integro-differential Volterra equations. For decades [11], researchers have proposed increasingly refined collocation and iteration methods for nonlinear VIDEs [1,2,13,15,45,51]. More recently these were joined by a number of wavelet-based methods [8,27,30,43,44,58], and numerical solvers based on homotopy perturbation methods have also been proposed [21]. A highly efficient spectral solver using low rank approximations for linear VIDEs with convolution kernels $K(x, y) = K(x - y)$ was recently described by Hale [24]. Gutleb and Olver described a general kernel sparse spectral method for linear Volterra integral equations in [23], which forms the background of the present paper.
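To make the structure of such problems concrete, a minimal quadrature sketch (in Python, entirely separate from the spectral method developed in this paper) shows how the causality of the Volterra operator turns a discretized second kind equation into an identity-minus-lower-triangular linear system; the function name and the trapezoid discretization are our own illustrative choices:

```python
import numpy as np

def volterra2_trapezoid(K, g, a, b, N):
    """Solve u(x) = g(x) + int_a^x K(x,y) u(y) dy on [a, b]
    on N grid points via the composite trapezoid rule."""
    x = np.linspace(a, b, N)
    h = x[1] - x[0]
    # W[i, j] = trapezoid weight of node j in the integral up to x[i]
    W = np.zeros((N, N))
    for i in range(1, N):
        W[i, :i + 1] = h
        W[i, 0] = W[i, i] = h / 2
    # causality: W is lower triangular, so I - W.K is cheap to solve
    A = np.eye(N) - W * K(x[:, None], x[None, :])
    return x, np.linalg.solve(A, g(x))

# Example: u(x) = 1 + int_0^x u(y) dy has the exact solution u(x) = exp(x)
x, u = volterra2_trapezoid(lambda x, y: np.ones_like(x * y),
                           lambda x: np.ones_like(x), 0.0, 1.0, 401)
err = np.max(np.abs(u - np.exp(x)))
```

For the test kernel $K \equiv 1$ and $g \equiv 1$ the error decreases at the quadrature rate $O(h^2)$; the spectral method of this paper instead works with banded operators on coefficient vectors, trading this dense value-space discretization for exponential convergence.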
In this paper we present a spectral method which works for linear, nonlinear, integro-differential and many other types of Volterra equations of first, second and third kind while utilizing polynomial spaces in which the Volterra operators have a sparse banded structure. As such this paper is a direct generalization of results in [23], which derived said banded structure of the Volterra operator in Jacobi polynomial bases and, on the basis of [40], developed an efficient Clenshaw algorithm based approach to numerically generate the operator. This method thus combines the exponential convergence of spectral methods with highly efficient sparse linear algebra, a combination which has proved successful in recent years [25,38,39,50,52]. In addition to efficiency via bandedness and exponential convergence, the proposed method is not limited to convolution kernel cases, i.e. kernels of the form $K(x, y) = K(x - y)$, a common restriction in competitively fast and accurate approaches [24]. Due to the extensive literature and code libraries available on numerical solutions for linear and nonlinear VIDEs, an exhaustive comparison with other methods is far beyond the scope of a single paper. We will however present comparisons with recent collocation methods, as these are the most competitive and ubiquitous solvers available.
The structure of this paper is as follows: in sections 1.1 and 1.2 we introduce the necessary framework of uni- and multivariate orthogonal polynomial approximation of functions. Section 1.3 briefly recounts relevant results from [23], detailing the banded structure of the Volterra operator in appropriately chosen triangle domain Jacobi polynomial bases. Section 2 details the extension of the linear Volterra integral equation method to the general linear integro-differential case by augmenting the system with appropriate evaluation operators. Section 3 details the extension to nonlinear Volterra integral equations using an iterative approach and describes further relevant numerical experiments. Section 4 combines these two approaches, finally extending the method to general kernel nonlinear VIDEs. Section 5 showcases various numerical experiments for linear VIDEs as well as nonlinear VIEs and VIDEs to verify and test applicability, convergence rate and competitiveness, including comparisons to collocation methods in Chebfun. We close with notes on convergence for the methods proposed in this paper in section 6 and discuss applicability and potential further research directions in the conclusion.

Function approximation with univariate orthogonal polynomials
In what follows we introduce the relevant elements of function approximation in univariate orthogonal polynomial bases, primarily focusing on the Jacobi polynomials, which are required to understand the proposed spectral method for nonlinear VIDEs. The reasons for highlighting the specific chosen properties above others will become clear when we discuss the methods in sections 2 and 3. For a more general and complete introduction to the theory of function approximation with univariate orthogonal polynomials, see e.g. [6,19].

The Jacobi polynomials $P_n^{(\alpha,\beta)}(t)$ are a univariate complete set of polynomials orthogonal with respect to the weight function $(1-t)^\alpha (1+t)^\beta$ on their natural domain $t \in [-1, 1]$, meaning they satisfy

$$\int_{-1}^{1} P_m^{(\alpha,\beta)}(t)\, P_n^{(\alpha,\beta)}(t)\, (1-t)^\alpha (1+t)^\beta\,\mathrm{d}t = 0 \quad \text{for } m \neq n,$$

where $\alpha, \beta > -1$. As such, the Legendre polynomials correspond to the special case $\alpha = \beta = 0$ and the ultraspherical or Gegenbauer polynomials correspond to the special case in which $\alpha = \beta$. Spectral methods can make use of the fact that complete sets of orthogonal polynomials can be used to approximate any sufficiently smooth function $f(t)$ defined on a real interval $(a, b)$ via the expansion

$$f(t) = \sum_{n=0}^{\infty} f_n\, p_n(t),$$

where $f_n$ are the unique constant coefficients of $f(t)$ in the given complete polynomial basis $p_n(t)$ which is orthogonal on the domain $(a, b)$. These coefficients $f_n$ may be computed efficiently for various sets of orthogonal polynomials using methods and C libraries by Slevinsky [46-48], while evaluation of polynomials is efficiently performed using Clenshaw's algorithm, see e.g. [19]. While the interval $[-1, 1]$ is the natural choice for the Jacobi polynomials, one can easily shift them to any other real interval required by an application. The method in this paper exclusively makes use of the Jacobi polynomials shifted to the unit interval $[0, 1]$, so we introduce the shorthand notation

$$\tilde P_n^{(\alpha,\beta)}(x) := P_n^{(\alpha,\beta)}(2x - 1), \qquad x \in [0,1].$$

Once expanded in the above way, performing addition and subtraction of functions has an obvious element-wise implementation.
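For the Legendre special case $\alpha = \beta = 0$ the expansion and Clenshaw evaluation just described can be sketched in a few lines. This is an illustrative stand-in, using a least-squares fit for the coefficients rather than the fast transforms of [46-48]:

```python
import numpy as np

# Expand f(t) = exp(t) in Legendre polynomials on [-1, 1] via a
# least-squares fit on a fine grid (a simple stand-in for fast transforms).
t = np.linspace(-1.0, 1.0, 2000)
f = np.exp(t)
c = np.polynomial.legendre.legfit(t, f, deg=20)

def clenshaw_legendre(c, t):
    """Evaluate sum_n c[n] P_n(t) by Clenshaw's backward recurrence,
    using (n+1) P_{n+1}(t) = (2n+1) t P_n(t) - n P_{n-1}(t)."""
    b1 = np.zeros_like(t)
    b2 = np.zeros_like(t)
    for n in range(len(c) - 1, -1, -1):
        alpha = (2 * n + 1) / (n + 1)   # recurrence coefficient of t*P_n
        beta_next = (n + 1) / (n + 2)   # recurrence coefficient at level n+1
        b1, b2 = c[n] + alpha * t * b1 - beta_next * b2, b1
    return b1                            # equals the expansion since P_0 = 1

err = np.max(np.abs(clenshaw_legendre(c, t) - f))
```

The backward recurrence avoids evaluating each $P_n$ individually and is the scalar analogue of the operator-valued Clenshaw algorithm used later for kernel expansions.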
Additionally, being orthogonal polynomials, the Jacobi polynomials satisfy a three-term recurrence relationship

$$t\, P_n^{(\alpha,\beta)}(t) = c_n P_{n-1}^{(\alpha,\beta)}(t) + a_n P_n^{(\alpha,\beta)}(t) + b_n P_{n+1}^{(\alpha,\beta)}(t),$$

which allows for efficient computation of function multiplication in this framework via a tridiagonal so-called Jacobi operator acting on coefficient vectors. The exact elements of the Jacobi operator depend on the Jacobi parameters $(\alpha, \beta)$ of the chosen basis, see e.g. [36, 18.9(i)] for explicit values of $a_i$, $b_i$ and $c_i$. As the sparsity of the operators in this paper relies heavily on correctly moving between bases with different Jacobi parameters $(\alpha, \beta)$, we will index coefficient vectors with the Jacobi parameters of the basis of expansion, i.e. by writing $f^{(\alpha,\beta)}$, where it might otherwise be ambiguous. The above properties allow for the development of software packages capable of performing arithmetic on functions using highly efficient sparse linear algebra, where functions are replaced by coefficient vectors. One such package is ApproxFun.jl [37], written in the Julia programming language [7], which is used as the background environment of the implementations presented in this paper. Other available software packages include, among others, the Dedalus project [14] and Chebfun [5,16,41].
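For Legendre polynomials the recurrence specializes to $t P_n(t) = \frac{n}{2n+1} P_{n-1}(t) + \frac{n+1}{2n+1} P_{n+1}(t)$, so the multiplication-by-$t$ operator on coefficient vectors is tridiagonal. A small sketch (our own illustrative code, checked against NumPy's `legmulx`):

```python
import numpy as np

def mult_by_t_legendre(N):
    """(N+1) x N matrix mapping Legendre coefficients of f (degree < N)
    to those of t*f(t), built from
    t P_n = n/(2n+1) P_{n-1} + (n+1)/(2n+1) P_{n+1}."""
    X = np.zeros((N + 1, N))
    for n in range(N):
        if n > 0:
            X[n - 1, n] = n / (2 * n + 1)
        X[n + 1, n] = (n + 1) / (2 * n + 1)
    return X

# check against numpy's Legendre multiply-by-x routine
rng = np.random.default_rng(0)
c = rng.standard_normal(8)                  # coefficients of f
tc = mult_by_t_legendre(8) @ c              # coefficients of t*f via the operator
direct = np.polynomial.legendre.legmulx(c)  # reference result
err = np.max(np.abs(tc - direct))
```

Multiplication by a general function expanded to degree $d$ yields a banded operator of bandwidth $d$, which is the bandedness the sparse spectral method exploits throughout.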
Beyond arithmetic with functions, there are a number of other useful operators in the sparse spectral method toolbox. We may, for example, freely shift from a basis with Jacobi parameters $(\alpha, \beta)$ to one with higher parameters $(\alpha + n, \beta + m)$, where $n, m \in \mathbb{N}$, using a sequence of upper bidiagonal raising operators [36, 18.9] for the first Jacobi parameter and an analogous sequence of operators for the second Jacobi parameter. Lowering operators are also available, although this is in general only possible in a sparse (lower bidiagonal) way with added weights [36, 18.9.6].

We may also mirror functions on their domain in a given Jacobi polynomial basis by using the very useful symmetry property [36, 18.6(i)]

$$P_n^{(\alpha,\beta)}(-t) = (-1)^n P_n^{(\beta,\alpha)}(t).$$

In particular, on $[0, 1]$ we can define a diagonal reflection operator with entries $(-1)^n$ which maps the coefficients of $f(x)$ in the $\tilde P^{(\alpha,\beta)}$ basis to those of $f(1-x)$ in the $\tilde P^{(\beta,\alpha)}$ basis.

Differentiation is also a sparse (diagonal) operation if we simultaneously increment the Jacobi parameters [36, 18.9.15], i.e.:

$$\frac{\mathrm{d}}{\mathrm{d}t} P_n^{(\alpha,\beta)}(t) = \frac{n + \alpha + \beta + 1}{2}\, P_{n-1}^{(\alpha+1,\beta+1)}(t).$$

Importantly, this means that repeated sparse differentiation is not equivalent to repeated application of the same operator $D^{(\alpha,\beta)}$. As the derivative operator shifts coefficient vectors to a higher parameter basis, the second derivative operator is actually a composition of two distinct derivative operators acting on different bases, and so on for higher derivatives. We thus denote the $n$-th derivative operator acting on a coefficient vector in the $\tilde P^{(\alpha,\beta)}$ basis by the composition $D^{(\alpha+n-1,\beta+n-1)} \cdots D^{(\alpha,\beta)}$ instead of the potentially misleading notation $D^n$, which may evoke false intuitions of commutativity of the operators.

The last component of theory we need for our univariate function approximation purposes are endpoint evaluation operators, which will be used to enforce boundary conditions in integro-differential equations. From the viewpoint described above, functions are coefficient vectors, and multiplications, derivatives and basis changes are operators on coefficient vectors (matrices in a finite dimensional approximation space). Functionals, e.g. evaluation operators $E$ at an endpoint, must act on coefficient vectors to return a scalar value and are thus represented by row vectors. In particular, for the Jacobi polynomials we can make use of the known property [36, 18.6(i)]

$$P_n^{(\alpha,\beta)}(1) = \frac{(\alpha + 1)_n}{n!},$$

where $(\cdot)_n$ denotes the Pochhammer symbol or rising factorial [36, 5.2(iii)]. Via the symmetry property in Equation (3) we obtain a similar evaluation operator for the other endpoint of our chosen interval domain.
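Both the differentiation identity and the endpoint evaluation formula are easy to verify numerically. The following sketch uses SciPy's `eval_jacobi` and `poch`; the parameter choices $\alpha = 1$, $\beta = 0$ and degree $n = 5$ are arbitrary:

```python
import numpy as np
from scipy.special import eval_jacobi, poch, factorial

a, b, n = 1.0, 0.0, 5
t = np.linspace(-0.9, 0.9, 7)

# Sparse differentiation: d/dt P_n^{(a,b)}(t) = (n+a+b+1)/2 * P_{n-1}^{(a+1,b+1)}(t),
# checked against a central finite difference.
h = 1e-6
lhs = (eval_jacobi(n, a, b, t + h) - eval_jacobi(n, a, b, t - h)) / (2 * h)
rhs = 0.5 * (n + a + b + 1) * eval_jacobi(n - 1, a + 1, b + 1, t)
deriv_err = np.max(np.abs(lhs - rhs))

# Endpoint evaluation functional: P_n^{(a,b)}(1) = (a+1)_n / n!
eval_err = abs(eval_jacobi(n, a, b, 1.0) - poch(a + 1, n) / factorial(n))
```

Stacking the values $P_0^{(\alpha,\beta)}(1), P_1^{(\alpha,\beta)}(1), \ldots$ into a row vector gives exactly the evaluation functional applied to coefficient vectors as described above.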

Function approximation with multivariate orth. polynomials
This section introduces the required elements of function approximation in multivariate orthogonal polynomial bases, focusing on the Jacobi polynomials on the triangle domain, which were discussed in detail in [40]. Multivariate orthogonal polynomials are not yet widely used in numerical methods despite their vast potential. For a more general and complete introduction to theoretical aspects of multivariate orthogonal polynomials, see [17].
Function approximation with multivariate orthogonal polynomials works in direct analogy to the univariate case. For a given multivariate function $f(x, y)$ defined on some suitable 2-dimensional domain and given a set of complete multivariate orthogonal polynomials $p_{n,k}(x,y)$ on said domain, we may expand the function via

$$f(x,y) = \sum_{n=0}^{\infty} \sum_{k=0}^{n} f_{n,k}\, p_{n,k}(x,y).$$

A generalization to $n$-dimensional cases is straightforward, cf. [17]. On the triangle

$$T^2 = \{(x,y) : x \geq 0,\ y \geq 0,\ x + y \leq 1\},$$

a sensible choice of polynomial basis is found in the triangle Jacobi polynomials, also known as Proriol polynomials, which are defined via reference to the univariate Jacobi polynomials [17, Proposition 2.4.1]:

$$P_{n,k}^{(\alpha,\beta,\gamma)}(x,y) = P_{n-k}^{(2k+\beta+\gamma+1,\,\alpha)}(2x-1)\,(1-x)^k\, P_k^{(\gamma,\beta)}\!\left(\frac{2y}{1-x} - 1\right).$$

Expansion coefficients for functions in the triangle Jacobi polynomial basis, such as required for the kernel $K(x, y)$ discussed in the next section, may be computed efficiently using C libraries by Slevinsky [46-48]. As in the univariate case, we can define a variety of operators acting on coefficient space, such as multiplication operators based on Jacobi operators, derivative operators and basis change operators [40]. The novelty is that for 2-dimensional domains such as the triangle we need to distinguish between the $x$ and $y$ variables and thus have two different Jacobi operators: $J_x$ for the $x$ variable and $J_y$ for the $y$ variable, which are now block tridiagonal operators instead of tridiagonal, see [40] for details.
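The expansion principle itself is conveniently illustrated with a tensor-product Legendre basis on the square $[-1,1]^2$ (a simpler setting than the triangle Proriol basis used in this paper, chosen here only to keep the sketch short):

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legvander

# Expand f(x, y) on [-1,1]^2 in the product basis P_n(x) P_k(y), computing
# the coefficient integrals with Gauss-Legendre quadrature.
f = lambda x, y: np.exp(x) * np.cos(y)
deg = 15
nodes, weights = leggauss(30)
Vx = legvander(nodes, deg)                       # Vx[i, n] = P_n(nodes[i])
norms = 2 / (2 * np.arange(deg + 1) + 1)         # int_{-1}^1 P_n^2 = 2/(2n+1)
F = f(nodes[:, None], nodes[None, :])

# C[n, k] = (1 / (h_n h_k)) * sum_{i,j} w_i w_j P_n(x_i) f(x_i, y_j) P_k(y_j)
C = (Vx.T * weights) @ F @ (Vx * weights[:, None]) / np.outer(norms, norms)

# evaluate the truncated expansion back on the quadrature grid
approx = Vx @ C @ Vx.T
err = np.max(np.abs(approx - F))
```

On the triangle the same idea applies, except the basis functions carry the degree-dependent parameters of the Proriol construction and the coefficients are indexed blockwise by $(n, k)$ with $k \leq n$.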

Banded sparsity of the linear Volterra operator in Jacobi bases
It was shown in [23] that the Volterra integral operator is sparse with banded structure on appropriate Jacobi polynomial spaces. Based on this, a sparse spectral method with exponential convergence for linear Volterra equations with general kernels was motivated and analyzed. The results are based on interpreting the Volterra operator as acting on multivariate Jacobi bases on a triangle domain. The idea behind the linear method follows the schemes in Algorithms 1 and 2. In this section we briefly review these methods to the degree necessary to follow the integro-differential and nonlinear extensions in this paper. For the full discussion of the linear case, we refer to [23].

The move to the triangle domain may initially be motivated by noting that the Volterra integral operator $\int K(x, y)u(y)\,\mathrm{d}y$ acting on $u$ may be considered for other upper bounds $l(x)$ than $x$, in particular $l(x) = 1 - x$. The Proriol polynomials with parameters $(0, 0, 0)$, being orthogonal on the triangle domain, behave well with respect to this integration:

$$\int_0^{1-x} P_{n,k}^{(0,0,0)}(x,y)\,\mathrm{d}y = (1-x)\,\tilde P_n^{(1,0)}(x)\,\delta_{k,0}.$$

Labeling the $(1 - x)$ as a weight term and referring to what remains as the operator $Q_y$, in reference to it being an integration with respect to $y$, aligns our notation with that in [23]. Using reflection operators this can be adapted for the more standard $l(x) = x$ case, see [23], which means that considering the kernel $K(x, y)$ amounts to looking at $K(1 - x, y)$ on this domain. The operator $Q_y$ acts on a function expanded in the Proriol polynomials with parameters $(0, 0, 0)$ on $T^2$ and, as seen above, selects the $k = 0$ coefficient from each block of the coefficient vector, i.e. its $n$-th block is the row vector $(1, 0, \ldots, 0)$ of length $n + 1$. We may account for the as of now omitted weight term $(1 - x)$ by direct multiplication with Jacobi operators, but for reasons of efficiency, and due to the need to reflect when $l(x) = x$, this is better performed using a bidiagonal lowering operator followed by a diagonal reflection and finally a bidiagonal raising operator.
The discussion so far explains why the operator for linear Volterra integral equations of the second kind in Algorithm 2 takes the form of the identity minus the appropriately weighted and reflected Volterra operator. Equations of the first kind are somewhat more subtle and we thus omit further details, referring instead to the original linear method derivations and proofs in [23].

We have assumed above that the function may be expanded in the Proriol polynomials, but we can make use of additional sparsity structures by instead thinking of $f_{n,k}$ as the extension of univariate function coefficients $f_n$ to the triangle domain via an expansion operator $E$ with $(Ef)_{n,k} = f_n \delta_{k,0}$. Choosing the respectively optimal bases $\tilde P^{(1,0)}(x)$ and $P^{(0,0,0)}(x, y)$ for this purpose results in the extension operator found in [23], which when multiplied with the integration from $0$ to $1 - x$ operator above results in a diagonal operator.

Via certain quasi-commutativity properties of the above-discussed operators and the Jacobi operators on the triangle domain, and using diagonal reflection operators appropriately, one may iteratively build the full Volterra integral operator for a general kernel via the efficient operator-valued Clenshaw algorithm for general kernel linear Volterra integral equations introduced in [23]. We will refer to this Volterra operator without the weight $(1 - x)$ as $V_K$ in the following sections and assume it is computed using the methods outlined here and detailed in [23]. The weight is accounted for by using appropriate basis shifts or multiplication as detailed in the algorithm steps.
Algorithm 2 Linear Volterra integral equation of the second kind [23]
1. Expand $g$ in the $\tilde P^{(1,0)}(x)$ basis.
2. Generate $V_K$ recursively via an operator-valued Clenshaw algorithm for the flipped kernel $K(1-x, y)$.
3. Solve the linear system for the coefficient vector $u$.
4. The approximate solution is $\tilde P^{(1,0)}(x)^\top u$.
Extension of the linear case sparse spectral method to integro-differential equations

Volterra integro-differential equations (VIDEs) are named such because the unknown appears in the equation under the action of both a Volterra integral and a derivative operator. In this section we consider linear VIDEs of the following generic form:

$$\sum_{m=0}^{M} \lambda_m u^{(m)}(x) = g(x) + \int_0^x K(x,y)u(y)\,\mathrm{d}y,$$

with constants $\lambda_m$ and $M \in \mathbb{N}$. Within the context of the present spectral method the integral operator is the Volterra operator from section 1.3, which as we saw maps a coefficient vector of a function in the $\tilde P^{(1,0)}(x)$ basis to the solution in the same basis. Consistent basis considerations, specifically the Jacobi parameters of our chosen basis, are crucial when developing a solution algorithm for integro-differential equations. As noted in section 1.1, there is an abundance of useful structure in the Jacobi polynomials which among other things allows us to take derivatives by shifting the basis parameters as in (4)-(6). Applying a derivative operator is the same as applying a parameter-scaled raising operator and thus incurs a basis change, which needs to be accounted for in the other operators.

We choose the integro-differential equation of second order as an example to illustrate this. To consistently obtain a solution from an extension to the above linear method it does not suffice to simply replace the second order derivative operator with the appropriate Jacobi polynomial basis derivative operator. Instead, due to the incurred basis shift, an additional conversion or shift operator must be applied to the Volterra operator as well.
Starting from the $\tilde P^{(1,0)}(x)$ basis in which we obtain our solution $u$, the second derivative operator carries us into the basis $\tilde P^{(3,2)}(x)$, meaning that, taking note of the steps in Algorithm 2, the appropriate operator form of the above second order example equation applies the two derivative operators along with conversion operators which raise all remaining terms, including the Volterra operator, into the $\tilde P^{(3,2)}(x)$ basis. We may collapse the compatible conversion operators down into a single one to obtain a slightly simpler form. Note that $g(x)$ must be expanded in the $\tilde P^{(3,2)}(x)$ basis instead of the $\tilde P^{(1,0)}(x)$ basis, or converted into said basis using the above-defined basis shift operators. This is for consistency reasons: as the operators acting on $u^{(1,0)}$ shift the basis from $\tilde P^{(1,0)}(x)$ to $\tilde P^{(3,2)}(x)$, the inverse of said operation must act on a function expanded in $\tilde P^{(3,2)}(x)$. This is not an artifact of our choice of the $\tilde P^{(1,0)}(x)$ basis for our solution: while that basis is particularly well-suited for Volterra integral equations as it results in a far more efficient kernel computation [23], the derivative operator in the VIDE will always shift the basis of our solution, so to optimize efficiency $g(x)$ is always initially expanded in the $\tilde P^{(1+M,M)}(x)$ basis, where $M$ is the order of the highest appearing derivative operator. Even in the general case with multiple derivative operators of different orders, the basis for the solution always remains $\tilde P^{(1,0)}(x)$ (for efficiency of kernel computations) while the highest order derivative operator determines the basis in which $g(x)$ must be expanded, along with the shift operators which respectively act on all lower order operators as well as the Volterra integral operator.
Attempting to invert the operator on the left-hand side of Equation (8) as-is will yield nonsensical results. This should be unsurprising, as the differential equation to which it corresponds does not have a unique solution unless initial conditions are supplied as well. Given a Volterra integro-differential equation with highest appearing derivative operator of order $M \in \mathbb{N}$, we will in general require initial conditions for all lower order derivatives to be given, i.e.

$$u^{(m)}(0) = c_m, \qquad m = 0, \ldots, M-1,$$

for given constants $c_m$. In the example case of Equation (8) the values $u(0)$ and $u'(0)$ must be given. In spectral methods such as the one discussed in this paper, boundary or initial conditions are enforced by extending the to-be-inverted operator by appropriate evaluation operators. The relevant Jacobi basis evaluation operators, being functionals, are represented in the coefficient vector and operator language as row vectors, as discussed in section 1.1. For the example in Equation (8) we thus append the two initial condition evaluation rows at the top of the operator, obtaining a now solvable system with consistently modified right-hand side. Similar procedures have previously been used to solve differential equations, cf. [25,52]. The discussion in this section in combination with the linear Volterra integral method in Algorithms 1 and 2 thus provides a recipe for the solution of general linear Volterra integro-differential equations satisfying a sufficient set of initial conditions; we state the general case method in Algorithm 3. The resulting operator on the left-hand side has filled-in top rows for each initial condition and thus is no longer fully banded, but still retains a very well-behaved (semi-banded) sparsity structure, leading to fast solutions even for high orders of polynomial approximation.
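The boundary row construction can be illustrated in a transparent toy setting: the monomial (Taylor) basis rather than the Jacobi bases of this paper, solving $u' - u = 0$ with $u(0) = 1$ by dropping the last row of the differential operator and appending the evaluation functional as a top row:

```python
import numpy as np

n = 15                                   # number of Taylor coefficients
# Derivative operator on Taylor coefficients: (Du)_k = (k+1) u_{k+1}
D = np.diag(np.arange(1, n), k=1)
# Evaluation functional at x = 0 is the row vector e_0
E0 = np.zeros((1, n)); E0[0, 0] = 1.0

# u' - u = 0 with u(0) = 1: drop the last row of (D - I) to make room
# for the appended boundary row, then solve the square system.
A = np.vstack([E0, (D - np.eye(n))[:-1]])
rhs = np.zeros(n); rhs[0] = 1.0
u = np.linalg.solve(A, rhs)

# coefficients should be 1/k!, so evaluating at x = 1 approximates e
approx_e = np.polynomial.polynomial.polyval(1.0, u)
```

The semi-banded structure mentioned above appears here in miniature: the operator is banded except for the dense functional rows at the top.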

Nonlinear Volterra equations via iterative methods
In this section we develop an iterative approach for solving nonlinear Volterra integral equations based on the linear case sparse spectral method. Computing solutions to nonlinear Volterra and Fredholm integral equations with iterative methods is not a novel idea in itself, see e.g. [15], but typically comes with significant drawbacks, cf. remarks in [4,18]. The core problem with iterative methods is how rapidly their computational cost scales with the expense of evaluation in each iteration: the slower the rate of convergence and the more expensive the individual evaluation in each step, the less feasible iterative methods become. Conversely, the presented sparse spectral method is very well suited to be used in conjunction with iterative methods, as it not only converges exponentially but also keeps evaluation cost comparatively low by making use of operator bandedness in the chosen bases. We will primarily use a simple Newton iteration algorithm without linesearch on the basis of implementations in NLsolve.jl [34] for the numerical experiments, but in principle many other iterative approaches may be used, resulting in further speed-ups in some cases.

The main idea of the extension to the nonlinear case is to notice that, given functions $K$, $g$ and $f$, the general nonlinear, second kind Volterra equation

$$u(x) = g(x) + \int_0^x K(x,y)\, f(u(y))\,\mathrm{d}y$$

may be cast into the form of a root-finding problem in function space for the objective function $F(u)$ defined by

$$F(u)(x) = g(x) + \int_0^x K(x,y)\, f(u(y))\,\mathrm{d}y - u(x).$$

The initial guess required for iterative approaches is thus made at the level of coefficient vectors, meaning that a guessed column vector representing the solution in the $\tilde P^{(1,0)}(x)$ basis is supplied to the iterative solver. When no convergence automation is used, the supplied length of the guess as well as $g$ determines the maximum polynomial degree and thus the approximation error. The step-by-step method is stated in Algorithm 4.

Algorithm 3 Linear Volterra integro-differential equation of the second kind
1. Expand $g$ in the $\tilde P^{(1+M,M)}(x)$ basis.
2. Generate $V_K$ recursively via an operator-valued Clenshaw algorithm for the flipped kernel $K(1-x, y)$.
3. Generate the operator on the left-hand side, applying the appropriate derivative and conversion operators and appending one evaluation row per initial condition.
4. Solve the linear system for the coefficient vector $u$.
5. The approximate solution is $\tilde P^{(1,0)}(x)^\top u$.
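The root-finding formulation can be demonstrated independently of the spectral machinery. The following hedged sketch discretizes $F(u)$ with a trapezoid rule in value space (not the coefficient-space operators of this paper) and applies an explicit Newton iteration to $u(x) = 1 + \int_0^x u(y)^2\,\mathrm{d}y$, whose exact solution is $1/(1-x)$:

```python
import numpy as np

def cumtrapz_matrix(x):
    """Lower-triangular matrix T with (T v)_i approximating int_{x_0}^{x_i} v(y) dy."""
    N = len(x); h = x[1] - x[0]
    T = np.zeros((N, N))
    for i in range(1, N):
        T[i, :i + 1] = h
        T[i, 0] = T[i, i] = h / 2
    return T

# Nonlinear VIE: u(x) = 1 + int_0^x u(y)^2 dy, exact solution 1/(1-x)
x = np.linspace(0.0, 0.5, 201)
T = cumtrapz_matrix(x)
u = np.ones_like(x)                           # simple initial guess
for _ in range(20):
    F = u - 1.0 - T @ u**2                    # objective F(u) = 0 at the solution
    J = np.eye(len(x)) - T @ np.diag(2 * u)   # Jacobian of F
    step = np.linalg.solve(J, F)
    u -= step
    if np.max(np.abs(step)) < 1e-14:
        break

err = np.max(np.abs(u - 1.0 / (1.0 - x)))
```

In the spectral method the same Newton iteration runs on coefficient vectors, with the banded operator $V_K$ and a banded multiplication operator in place of the dense matrices here.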

Nonlinear integro-differential Volterra equations
We can straightforwardly combine the considerations in sections 2 and 3 to obtain a sparse spectral method suitable for solving Volterra equations featuring both derivative operators and Volterra integral operators with nonlinearities. A very (but not exhaustively) general case of such an equation of second kind is

$$\sum_{m=0}^{M} \lambda_m u^{(m)}(x) = g(x) + \int_0^x K(x,y)\, f(u(y))\,\mathrm{d}y, \qquad (10)$$

with appropriate initial conditions. For brevity we only address equations of the form in Equation (10), but the methodology outlined in this paper is applicable to a much broader class of problems. The full step-by-step method is stated in Algorithm 5.
Algorithm 5 Nonlinear integro-differential Volterra equation of the second kind
1. Expand $g$ in the $\tilde P^{(1+M,M)}(x)$ basis.
2. Generate $V_K$ recursively via an operator-valued Clenshaw algorithm for the flipped kernel $K(1-x, y)$.
3. Generate the operator on the left-hand side, applying the appropriate derivative and conversion operators and appending one evaluation row per initial condition.
4. Cast the equation as a root-finding problem for the objective function $F(u)$ as in section 3.
5. Supply an initial guess coefficient vector and run the iterative root search.
6. The approximate solution is $\tilde P^{(1,0)}(x)^\top u$ for the obtained root $u$.
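The combined approach can again be illustrated outside the spectral setting with a manufactured problem: choosing $g(x) = e^x - (e^{2x} - 1)/2$ makes $u(x) = e^x$ the exact solution of the nonlinear VIDE $u'(x) = g(x) + \int_0^x u(y)^2\,\mathrm{d}y$, $u(0) = 1$. Integrating once removes the derivative (a value-space analogue of the basis conversions in section 2), after which Newton iteration applies as before; this is our own illustrative discretization, not Algorithm 5:

```python
import numpy as np

def cumtrapz_matrix(x):
    """Lower-triangular cumulative trapezoid integration matrix."""
    N = len(x); h = x[1] - x[0]
    T = np.zeros((N, N))
    for i in range(1, N):
        T[i, :i + 1] = h
        T[i, 0] = T[i, i] = h / 2
    return T

# Nonlinear VIDE: u'(x) = g(x) + int_0^x u(y)^2 dy, u(0) = 1, with
# g(x) = exp(x) - (exp(2x) - 1)/2 so that u(x) = exp(x) exactly.
x = np.linspace(0.0, 1.0, 201)
T = cumtrapz_matrix(x)
g = np.exp(x) - (np.exp(2 * x) - 1) / 2

# Integrating once yields u = 1 + T g + T T u^2, solved by Newton iteration.
u = np.ones_like(x)
for _ in range(30):
    F = u - 1.0 - T @ g - T @ (T @ u**2)
    J = np.eye(len(x)) - T @ T @ np.diag(2 * u)
    step = np.linalg.solve(J, F)
    u -= step
    if np.max(np.abs(step)) < 1e-14:
        break

err = np.max(np.abs(u - np.exp(x)))
```

The error here is limited by the $O(h^2)$ quadrature; in the spectral method both the integration and the derivative handling converge exponentially instead.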

Numerical experiments
Throughout this section we measure errors between analytic solutions $u(x)$ and computed approximate solutions $\tilde P^{(1,0)}(x)^\top u^{(1,0)}$ on the domain via the infinity norm of the absolute error,

$$\left\| u(x) - \tilde P^{(1,0)}(x)^\top u^{(1,0)} \right\|_\infty = \max_x \left| u(x) - \tilde P^{(1,0)}(x)^\top u^{(1,0)} \right|.$$

Set 1: Second kind, convolution kernels, one derivative operator
As a proof of concept we first test the above method on the three simple convolution kernel cases (11)-(13) with initial conditions (14)-(16). The analytic solutions, derived respectively via variational iteration, Adomian decomposition and Laplace transform methods, are found in [54]. We plot the absolute error of the computed solution compared to the analytic solution in semi-logarithmic scale in Figure 1, showing exponential convergence to the exact solutions.

Set 2: Collocation method for third kind integro-differential equations
A collocation method is used in [45] to solve certain types of third kind integro-differential Volterra equations, in which the unknown $u(x)$ appears multiplied by a coefficient function which vanishes at a point of the domain, in addition to appearing under the Volterra integral $\int_0^x K(x, y)u(y)\,\mathrm{d}y$.
While we do not explicitly treat third kind equations in this paper, the discussion of first and second kind integro-differential equations in section 2 implies an obvious extension to these cases. Shayanfard, Dastjerdi and Ghaini [45] discuss two numerical examples and provide a table of error values for differently chosen collocation points. Both numerical experiments feature non-convolution kernels with given initial conditions and known analytic solutions. In Figure 2 we compare the absolute errors of the results obtained with our sparse spectral approach to the errors obtained with their collocation method (as given in Tables 1 and 2 in [45]). The bandwidth of the sparse method's operators for these third kind VIDEs is large, as they feature additional multiplications with Jacobi operators involving poorly approximated rational powers, so while our proposed method still performs very well in terms of accuracy, there is not much computation time efficiency to be gained from sparsity for these two problems.

Set 3: Integro-differential equations in Chebfun
The Chebfun package allows state-of-the-art computations using polynomial approximations and collocation methods in MATLAB [5,16,41]. An implementation of an automatic collocation method for integral and integro-differential Volterra and Fredholm equations in Chebfun was presented in [15]. In this section we compare the performance of the sparse method with the dense collocation method used in Chebfun, for problems requiring low and high polynomial orders.

Low order solutions
The example in this section is given in [15] and is a non-convolution kernel linear VIDE which previously appeared in a discussion of higher order collocation methods for VIDEs by Brunner [12]. We seek a solution with initial condition $u_1(0) = 1$ (21) and known analytic solution $u_1(x) = e^{x^2}$. In Figure 3(a) we plot the absolute error of the solution obtained via the sparse spectral method with maximal polynomial approximation order $n$. We present a spy plot of the quasi-banded integro-differential operator generated by our sparse method in Figure 3(b). We find that for orders around $n = 20$, where machine precision accuracy is within reach, the operator for this problem is still dense, and thus the proposed method should realistically only match Chebfun's performance. That a speed-up is nevertheless observed, see Table 1, may be explained by language-specific differences between MATLAB and Julia, by the automatic convergence search which Chebfun performs but which was not used for the sparse method, or by a combination of such factors. Sparsity becomes an important factor for efficiency when treating equations where more complicated solutions are to be expected, which require polynomial approximations in the order of hundreds or thousands of coefficients.

High order solutions
For an example which requires a higher $n$ to solve with good accuracy we consider an equation with given initial condition and right-hand side function $g_2(x, k)$, whose analytic solution is known for all $k \in \mathbb{R}$. As this solution approximates a step-like function at $x = 0$ for increasing $k$, see Figure 5(a), it is easy to see why polynomial approximations quickly begin to require high orders. Figure 5(b) shows a spy plot of the quasi-banded integro-differential operator generated by our sparse method, while Figure 4 shows the absolute error of some solutions obtained via the sparse spectral method. The solution needs to be resolved in relatively high order polynomial approximations, so the bandedness of the operator results in notable performance improvements compared to Chebfun's dense collocation method, see Table 2.

Set 4: Bessel kernels with highly oscillatory solutions
Hale [24] discusses an integro-differential equation with Bessel function kernel, which appears in scattering applications and potential theory, on the basis of previous work on Bessel kernel Volterra equations by Xiang and Brunner [55]. We use this as the final numerical experiment in this section as it touches on the interesting case of highly oscillatory solutions without known analytic form. Specifically, we discuss the singularly perturbed version of the equation appearing in [24], which intentionally makes the problem significantly more oscillatory, with right-hand side $g(x, \mu, \nu)$ defined in terms of first kind Bessel functions $J_\mu$, where $\mu > 0$ and $\omega \in \mathbb{R}$. Equation (24) is further supplied with appropriate initial conditions. To allow comparisons with [24] we will consider the example parameters $\nu = 3$, $\mu = 2$, $\omega = 20$.
Analytic solutions to this equation are not known and convergence comparisons are thus made to high order approximate solutions ($n = 2000$) instead. We plot the highly oscillatory solution in Figure 6(a) and the convergence to the $n = 2000$ solution in Figure 6(b). Similarly to the results in [24] we observe rapid exponential convergence once the polynomial order becomes sufficient to resolve the frequency of the oscillations. Better convergence up to machine precision is possible when using a more sophisticated balancing of the approximation orders of the kernel, $g$ and the solution respectively, as opposed to linearly increasing the approximation order of each of them at the same time. This could also be done using an automated convergence algorithm if needed, but this example is primarily presented to show the broad range of applicability even for oscillatory problems. As this particular Bessel kernel is ultimately a convolution kernel, methods which take the additional convolution kernel structure into account, e.g. Hale's method in [24], will generally outperform the general kernel method presented in this paper in accuracy or performance (in particular if operating in low polynomial approximation orders or if they themselves make use of sparsity structure).

Set 1: Power nonlinearity Volterra integral equations
The simplest case of nonlinear Volterra integral equations, and thus also the case for which the most analytic solutions are available for direct comparison, is that of power nonlinearities of the form $f(u) = u^m$ for some positive integer $m$. We thus consider the examples in equations (25) and (26), whose analytic solutions, among them $u_2(x) = \sin(x)$, are derived respectively via a Picard type iteration and an Adomian decomposition method in [54].
As discussed above, and as is true for any iterative method, there are now multiple parameters which may be fine-tuned to the problems at hand in order to achieve faster and more precise convergence. To that end we may for example fine-tune the initial guess or the convergence cut-offs. As what can go wrong in a standard application case is of greater interest than what may happen in ideal circumstances, we omit such fine-tuning and instead simply supply a vector of all zeros of length $n$ for Equation (25) and a vector of all ones of length $n$ for Equation (26). We plot the maximal absolute errors between true and computed solutions in Figure 7. We observe exponential convergence as $n$ increases using simple Newton iteration without linesearch.

Fig. 1: Absolute error between analytic and computed solutions for $u_1(x)$, $u_2(x)$ and $u_3(x)$ in equations (11)-(13) for polynomial approximation of order $n$ with initial conditions in (14)-(16).

Set 2: Numerical experiments with nonlinear VIDEs
For nonlinear VIDEs we consider equations (27) and (28) with given initial conditions, whose analytic solutions were derived in [54] using the variational iteration method. As in section 5.5 we avoid making educated guesses for the initial guess supplied to the algorithm and merely increase the maximal allowed length of the solution coefficient vector, i.e., the maximal polynomial degree of the computed approximation. The initial guess for Equation (27) is a vector of all ones and the initial guess for Equation (28) is a vector of all zeros, of length $n$ respectively. We plot the maximal absolute errors between analytic and computed solutions in Figure 8.
We again observe exponential convergence as n increases using Newton iteration without linesearch.

Notes on algorithm convergence
Convergence of the above-discussed method in the case of nonlinear equations arises as a function of the convergence properties of the root search algorithm that is utilized, combined with the proofs for the respective linear variants. Proofs of convergence for second kind linear Volterra integro-differential equations may be given in a similar fashion to linear Volterra integral equations in [23] and differential equations in [25], the basic observation being that the full to-be-inverted operator is diagonally dominant for well-behaved functions and may be written as a compact perturbation of the identity, thus reducing the problem to standard finite section approximation convergence results, cf. [10,31,38,49]. As seen in the linear case proof for first kind VIEs in [23], proofs for first kind equations would require a deeper functional analysis approach due to these problems being somewhat ill-conditioned.

Discussion
We have presented a competitively fast general kernel sparse spectral method for nonlinear Volterra integro-differential and integral equations which extends the linear results in [23]. The method is notably not reliant on convolution kernel structure and applies to general kernels. Furthermore, as it does not rely on low rank approximations, it is applicable in more general cases where such approximations fail. It thus combines very broad applicability with high performance and accuracy.
One noteworthy drawback of this method is that, although as discussed in the numerical experiments section in [23] the method may yield sensible results for some types of singular kernels, there are as of now no known guarantees for such cases. That said, the presented method was shown to be convergent and well-behaved for problems which may be well approximated in the specified polynomial bases, which allow for a very general range of kernels.
The numerical experiments in this paper serve an illustrative purpose: in a practical application setting one would choose more sophisticated and efficient root search algorithms than a simple Newton iteration without linesearch, and make an educated initial guess for the root search based on background knowledge about the structure of the problem instead of supplying simple zero or one filled coefficient vectors. These points were deliberately ignored in this paper to illustrate that such more sophisticated methods are not required to achieve competitive performance and accuracy.