A New Approach to Shooting Methods for Terminal Value Problems of Fractional Differential Equations

For terminal value problems of fractional differential equations of order α ∈ (0, 1) that use Caputo derivatives, shooting methods are a well-developed and thoroughly investigated approach. Based on recently established analytic properties of such problems, we develop a new technique to select the required initial values that solves such shooting problems quickly and accurately. Numerical experiments indicate that this new proportional secting technique converges to the solution in very few iterations. Run time measurements indicate a speedup factor of between 4 and 10 when compared to the standard bisection method.


Introduction and Motivation
Differential equations of fractional (i.e., non-integer) order [6] are an object of great current interest. They are useful tools for modeling various phenomena in science and engineering, see, e.g., [1,2,27,28,29]. These problems typically have the form

D^α_a y(t) = f(t, y(t)),  y(a) = ỹ_0,  (1.1)

where α ∈ (0, 1) is the order of the differential operator. Note that α > 1 arises only rarely in applications, and this case will not be discussed here. In eq. (1.1), f : [a, b] × R → R is a given function and ỹ_0 ∈ R describes the initial state of the modeled system at t = a. The differential operator D^α_a in eq. (1.1) is the Caputo differential operator of order α with starting point a ∈ R as defined by [6, Definition 3.2], namely

D^α_a y(t) = (1 / Γ(1 − α)) ∫_a^t (t − s)^{−α} y′(s) ds.  (1.2)

In equations (1.1) and (1.2), the starting point a plays a special role as the start of the process that is being studied. A typical application from mechanics is an object made from a viscoelastic material under external loads that is still in its virgin state for t < a and to which forces are only applied for t ≥ a.
Here, problem (1.1) is an initial value problem since y(a) = ỹ_0 is the state of the process at the initial time t = a, and we are interested in finding y(t) for t ∈ [a, b] and some given time b > a.
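As an aside for readers who wish to experiment numerically, the Caputo derivative can be approximated on a uniform grid by the classical L1 finite-difference scheme. The short Python sketch below (the function name and set-up are ours, not taken from the paper) checks the scheme against the known value D^α_0 t = t^{1−α}/Γ(2 − α):

```python
import math

def caputo_l1(y_vals, h, alpha):
    """Approximate the Caputo derivative D^alpha y at the final grid
    point using the classical L1 finite-difference scheme."""
    n = len(y_vals) - 1
    c = 1.0 / (math.gamma(2 - alpha) * h**alpha)
    total = 0.0
    for j in range(n):
        w = (n - j)**(1 - alpha) - (n - j - 1)**(1 - alpha)
        total += w * (y_vals[j + 1] - y_vals[j])
    return c * total

# Check against the exact Caputo derivative of y(t) = t on [0, 1]:
# D^alpha t = t^(1-alpha) / Gamma(2-alpha); the L1 scheme is exact
# for linear functions, so the difference should be at rounding level.
alpha, h = 0.3, 1e-3
grid = [j * h for j in range(1001)]
approx = caputo_l1(grid, h, alpha)
exact = 1.0**(1 - alpha) / math.gamma(2 - alpha)
print(abs(approx - exact))
```

For a linear function the telescoping sum of L1 weights reproduces the derivative exactly, which makes this a convenient self-test.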
From the analytical viewpoint, such initial value problems are well understood, see, e.g., [6, Chapters 6 and 7]. Many numerical methods have been proposed and investigated; cf., e.g., [23]. However, from the modeling perspective, these methods often are of limited use because they hinge on the exact state of the process at the initial time t = a, and this may be impossible to determine in actual applications. If one can only measure the value of y(b) for some b > a but not at a itself, this leads to the terminal value problem

D^α_a y(t) = f(t, y(t)),  y(b) = y^*,  (1.3)

which is naturally solved in two steps: first, determine the initial value y(a) that is consistent with the given terminal value y^*; second, solve the resulting initial value problem. As indicated above, the second step of this process has a well understood structure and can be handled by standard methods. Therefore it does not require any special attention. Hence we focus only on the first step in this work.

Analytic Properties of Terminal Value Problems
Here we provide the basis for our numerical work with terminal value problems for fractional differential equations and recall some of their known analytical properties.
Conditions for well-posedness have been discussed and partially established in [5,8]. A complete analysis is given in [3], with additional aspects presented in [9,14]. For our work here, we shall specifically use the following result, which is an immediate consequence of a statement by Cong and Tuan regarding initial value problems [3, Theorem 3.5]. It follows the classical set-up in assuming that the function f in eq. (1.3) is continuous on [a, b] × R, maps into R and satisfies a Lipschitz condition with respect to the second variable.

Remark 2.1 When talking about initial value problems, it is common practice to discuss not only scalar but also multidimensional problems, i.e. to assume that the function f on the right-hand side of the differential equation maps from [a, b] × R^d to R^d with some d ≥ 1. In the initial value problem setting, this generalization does not introduce any difficulties as far as the existence and the uniqueness of solutions are concerned. However, if terminal value problems are addressed, as we are doing here, the situation is significantly different. To be precise, it was shown in [3] that well-posedness of terminal value problems in general only occurs for d = 1; a counterexample for d > 1, demonstrating that multiple solutions can arise, is given in [3, Section 6]. Therefore we only consider the scalar case d = 1 here.
A classical technique for investigating analytical properties of initial value problems for differential equations is to rewrite the given problem in the form of an equivalent integral equation. This can be done in the fractional case in exactly the same way as in the classical case of first order problems, see, e.g., [6, Lemma 6.2], and results in a Volterra integral equation. For terminal value problems of fractional differential equations, however, this changes significantly: when rewriting our fractional derivative problem in the form of an integral equation, the resulting integral equation will be of Fredholm type, not Volterra type [6, Theorem 6.18]. For first order differential equations, Fredholm integral equations may arise as well, but in connection with boundary value problems, not initial value problems. Therefore we shall employ techniques that are based on principles used for boundary value problems of integer order differential equations. Specifically, shooting methods, see [24], are the foundation of the numerical methods that we suggest for solving fractional terminal value problems.
To describe our method, we need further analytic prerequisites. An important concept is the one-parameter Mittag-Leffler function E_α : C → C, defined for Re α > 0 as

E_α(z) = Σ_{k=0}^∞ z^k / Γ(αk + 1),

see [20]. For the α values that are relevant in our setting, we exploit the following:

Lemma 2.1 For all α ∈ (0, 1), the function E_α is analytic. Moreover, for these α and all z ∈ R we have E_α(z) > 0, and E_α(z) is strictly increasing in z ∈ R.
Proof The analyticity of E_α follows from [20, Proposition 3.1]. The inequality E_α(z) > 0 is trivial for z ≥ 0 since Euler's Gamma function satisfies Γ(w) > 0 for all w > 0. For z < 0, both the inequality and the strict monotonicity are consequences of the properties discussed in [20, Subsection 3.7.2].

⊓ ⊔
To understand the nature of our numerical method, it is essential to be aware of further properties of initial value problems for fractional differential equations. Specifically, we address the following question: given two solutions of the same fractional differential equation on the same interval, but starting from two different initial values, what are the differences between these two solutions on this interval? Upper bounds for the differences are directly available by standard classical Gronwall type arguments. For our purposes, however, we also need lower bounds, about which much less is known. First we state our result for a linear differential equation. It is very simple but immediately provides important insights.
Proof The upper bound has been derived in [13,Theorem 5] and the lower bound is shown in [13,Theorem 4].

⊓ ⊔
In the general (nonlinear) case the result is more involved but the essential properties of the linear case remain intact.
Theorem 2.3 Assume that f satisfies the hypotheses of Theorem 2.1. Let y_1 and y_2, respectively, be the solutions of the initial value problems with initial values y_{0,1} and y_{0,2}, where y_{0,1} > y_{0,2}. Then, for all t ∈ [a, b], two-sided bounds on y_1(t) − y_2(t) hold that are analogous to those of the linear case.

Proof This is the result of [13, Theorem 7]. ⊓⊔

Theorem 2.3 can be applied to the linear case of Theorem 2.2. In this situation, the functions appearing in Theorem 2.2 coincide with the functions ℓ_* and ℓ^*, respectively, of Theorem 2.3.
We use the notation β ∼ γ for expressions β and γ that depend on the same quantities to denote that there exist absolute constants C_1 > 0 and C_2 > 0 such that C_1 β ≤ γ ≤ C_2 β for all admissible values of the quantities on which β and γ depend. With this notation and the findings of Lemma 2.1, we can summarize the statements of Theorems 2.2 and 2.3 more compactly.
for the functions ℓ_* and ℓ^* of (2.4). Since c^* ≥ c_* > 0 by (2.6), we can rewrite equation (2.5) accordingly. This observation is the foundation for our numerical method in Subsection 3.4.
For later reference, we note a few more facts.

Remark 2.2 In general, we cannot expect that the ratio ĉ between the terminal and the initial value of the solution to a problem is known exactly. However, from eq. (2.5), we know that ĉ is bounded above by c^* and below by c_* as given in (2.6). As long as no additional information is available to obtain a more precise approximate value for ĉ, one may use the mean of the upper and the lower bound, i.e. the approximation

ĉ = (c_* + c^*) / 2.  (2.8)

To compute this value from (2.6), we need to evaluate the quantities ℓ_*(b) and ℓ^*(b) as defined in eq. (2.4). If an approximate solution ŷ to the differential equation in question is known at least at some grid points a = t_0 < t_1 < t_2 < . . . < t_N = b, one may select a step size H > 0 and an integer M > 0 and approximate these lower and upper bounds by the expressions in eq. (2.9), respectively. Then we obtain ĉ from these approximate values instead of the exact values ℓ_*(b) and ℓ^*(b).

Remark 2.3
Estimating the proportionality factor ĉ as indicated in Remark 2.2 is quite useful when the differential equation in eq. (1.3) is dissipative, i.e. when (f(t, y_1) − f(t, y_2))(y_1 − y_2) ≤ 0 for all t ∈ [a, b] and all y_1, y_2 ∈ R. In this case, eq. (2.4) implies that ℓ_*(t) ≤ ℓ^*(t) ≤ 0 for all t, and due to the monotonicity of the Mittag-Leffler function E_α (see Lemma 2.1), we obtain 0 < c_* ≤ c^* ≤ 1. Therefore, the interval [c_*, c^*] in which the correct value of ĉ must lie is quite small, and by choosing this interval's midpoint as a starting point we only make a small error. If, on the other hand, the differential equation is not dissipative, then c^* may be very much larger than c_*, and the strategy described in Remark 2.2 may lead to an estimate for ĉ that is very far away from the correct value.
In Section 5 we work through example problems for either case.
Remark 2.4 Following Remark 2.3, we suggest yet another method for approximating ĉ: analyze the given differential equation and check whether the approach of Remark 2.2 is appropriate (i.e., whether it does not lead to an excessively large value for ĉ); if Remark 2.2 does not yield a useful ĉ value, choose a smaller one. More precisely:
1. Approximately compute the values ℓ_*(b) and ℓ^*(b) as indicated in Remark 2.2.
2. If ℓ_*(b) ≤ ℓ^*(b) ≤ 0, then there is no danger of obtaining extremely large values for c_* or c^*. Thus proceed as suggested in Remark 2.2.
3. If ℓ_*(b) ≤ 0 < ℓ^*(b), then c_* ≤ 1, but c^* may be very much larger than 1. To dampen the possible overestimation that c^* might induce, ignore the precise value of c^* and set ĉ = 1.
4. If 0 < ℓ_*(b) ≤ ℓ^*(b), then c^* ≥ c_* > 1. Again, to mitigate an overestimation, use the lower bound of the interval [c_*, c^*] as an estimate for ĉ, i.e. set ĉ = E_α(ℓ_*(b)(b − a)^α) as suggested by the first relation in eq. (2.6).
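This case distinction is easy to implement. The sketch below assumes, as the relations around eq. (2.6) suggest, that c_* = E_α(ℓ_*(b)(b − a)^α) and c^* = E_α(ℓ^*(b)(b − a)^α); the function names and the naive Mittag-Leffler series are our own illustrations, not the paper's code:

```python
import math

def ml(alpha, z, terms=200):
    """Truncated power series for the Mittag-Leffler function E_alpha."""
    return sum(z**k / math.gamma(alpha * k + 1) for k in range(terms))

def choose_c_hat(alpha, l_lower, l_upper, a, b):
    """Decision strategy of Remark 2.4 for the proportionality factor
    c-hat. l_lower and l_upper play the roles of the approximated
    quantities l_*(b) and l^*(b); we assume c_* and c^* are obtained
    from them via the Mittag-Leffler function as in eq. (2.6)."""
    c_lo = ml(alpha, l_lower * (b - a)**alpha)
    c_hi = ml(alpha, l_upper * (b - a)**alpha)
    if l_upper <= 0:                # case 2: bounds are tight,
        return 0.5 * (c_lo + c_hi)  # use the midpoint (Remark 2.2)
    if l_lower <= 0 < l_upper:      # case 3: c^* may be huge
        return 1.0
    return c_lo                     # case 4: both positive, lower bound

# Dissipative-like example: both bounds negative, c-hat lies in (0, 1).
print(choose_c_hat(0.3, -2.0, -1.0, 0.0, 1.0))
```

For dissipative problems the returned value stays in (0, 1), matching the observation of Remark 2.3.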

General Framework
As indicated in Section 2, the essential characteristics of problem (1.3) are identical to those of classical boundary value problems, and our approach involves shooting methods [24], which are a well established technique for boundary value problems. The basic steps of general shooting methods are as follows:
1. Set k = 0. Given problem (1.3), make an initial guess ỹ_0^{(0)} for the starting value y(a).
2. Numerically solve the initial value problem consisting of the given differential equation and the initial condition y(a) = ỹ_0^{(k)}, obtaining an approximate solution ỹ_k.
3. If |ỹ_k(b) − y^*| is smaller than the required tolerance, accept ỹ_k as the solution and stop.
4. Otherwise, compute an improved guess ỹ_0^{(k+1)} for the initial value, increase k by 1, and go back to step 2.
These simple components of shooting methods will be specified more precisely in the subsequent subsections. The overarching concern here is to keep the chosen shooting algorithm's computational complexity low. The computational cost of shooting algorithms is reflected in the number of operations required per iteration step, multiplied by the number of iterations needed to achieve satisfactory accuracy. Unless specific information about the given fractional ODE problem is available that suggests otherwise, we choose ỹ_0^{(0)} = y^* as our initial guess for y(a) required in step 1, i.e., we use the desired terminal value as a first guess for our initial value.
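The general shooting loop can be sketched in a few lines. The following Python skeleton uses a deliberately naive update rule and a toy affine stand-in for the IVP solver; both are our own illustrations of the control flow, not the paper's algorithm:

```python
def shoot(solve_ivp, update_guess, y_star, tol=1e-8, max_iter=100):
    """Generic shooting loop: solve_ivp(y0) returns the terminal value
    of the IVP started at y0; update_guess proposes the next starting
    value from the iteration history."""
    history = []                    # pairs (initial guess, terminal value)
    y0 = y_star                     # step 1: first guess = desired terminal value
    for k in range(max_iter):
        yb = solve_ivp(y0)          # step 2: solve the IVP numerically
        history.append((y0, yb))
        if abs(yb - y_star) <= tol: # step 3: accept if close enough
            return y0, k + 1
        y0 = update_guess(history, y_star)  # step 4: improved guess
    raise RuntimeError("shooting did not converge")

# Toy stand-in: the terminal value depends affinely on the initial
# value, and the naive update simply adds the terminal defect.
terminal = lambda y0: 0.5 * y0 + 1.0
naive = lambda hist, ys: hist[-1][0] + (ys - hist[-1][1])
y0, iters = shoot(terminal, naive, 3.0)
print(y0, iters)
```

The naive update converges only linearly here; the point of the paper's proportional secting strategy is precisely to supply a far better update rule in step 4.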

Selecting the Initial Guess ỹ_0^{(0)}
It is often assumed that a good choice of the initial guess at a leads to quick convergence with acceptable accuracy at b, while a poorly chosen initial guess might require more iterations, thus leading to a significantly higher overall computational cost. The examples in Section 5, however, indicate otherwise: in every test example that we have considered, our method achieved satisfactory accuracy within very few iterations, regardless of how close the starting guess at a was to the exact initial value.

Numerically Solving a Fractional ODE Initial Value Problem
Algorithms that compute the solution of the (artificially constructed) initial value problem in step 2 have been discussed by Ford et al. [14], showing that the fractional Adams-Bashforth-Moulton (ABM) method [10,11] is a good choice for non-stiff ODEs. However, for stiff differential equations or when the interval [a, b] is very large, the stability properties of ABM may be insufficient [16], and one should use an implicit linear multistep method such as the fractional trapezoidal method [18] or a fractional backward differentiation formula [25,26]. For our examples in Section 5, we present the results obtained with both alternative methods for comparison. There we use a uniform discretization of the basic interval [a, b] by choosing an integer N > 1 and equally spaced grid points t_j = a + jh with the step size h = (b − a)/N. This allows us to use FFT techniques and obtain a fast implementation, see [19,21].

Solutions to fractional differential equations of the type considered here are almost never differentiable at their initial point a, see [6, Theorem 6.26]. This adversely affects the convergence rate of many numerical methods such as the ABM method, see [11]. To improve convergence, one could replace the uniform mesh by a graded one, see [30], or one could use the non-polynomial collocation scheme that was suggested, analyzed and tested in [15]. Such techniques lead to faster convergence and can reach the required accuracy with larger step sizes, thus lowering the overall computational effort. But these ideas cannot be easily combined with the FFT technique, and consequently their numerical schemes become more costly overall. We shall not pursue these latter approaches any further.
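To make the IVP-solver building block concrete, here is a minimal sketch of the fractional Adams-Bashforth-Moulton predictor-corrector scheme of Diethelm, Ford and Freed, with a single corrector step and a plain O(N²) summation (i.e., without the FFT acceleration mentioned above); it is an illustration under these assumptions, not the implementation used in the paper:

```python
import math

def fabm(f, alpha, y0, a, b, n):
    """Fractional Adams-Bashforth-Moulton scheme for
    D^alpha y = f(t, y), y(a) = y0, on a uniform grid with n steps."""
    h = (b - a) / n
    t = [a + j * h for j in range(n + 1)]
    y, fv = [y0], [f(a, y0)]
    g = math.gamma(alpha + 2)
    for k in range(n):                      # compute y at t[k+1]
        # predictor: fractional rectangle rule
        p = y0
        for j in range(k + 1):
            bw = ((k + 1 - j)**alpha - (k - j)**alpha) \
                 * h**alpha / (alpha * math.gamma(alpha))
            p += bw * fv[j]
        # corrector: fractional trapezoidal weights
        s = (k**(alpha + 1) - (k - alpha) * (k + 1)**alpha) * fv[0]
        for j in range(1, k + 1):
            s += ((k - j + 2)**(alpha + 1) + (k - j)**(alpha + 1)
                  - 2 * (k - j + 1)**(alpha + 1)) * fv[j]
        ynew = y0 + (h**alpha / g) * (s + f(t[k + 1], p))
        y.append(ynew)
        fv.append(f(t[k + 1], ynew))
    return t, y

# For alpha = 1 the scheme reduces to a classical predictor-corrector
# method, so y' = -y, y(0) = 1 should reproduce exp(-t) closely.
t, y = fabm(lambda t, y: -y, 1.0, 1.0, 0.0, 1.0, 200)
print(abs(y[-1] - math.exp(-1.0)))
```

The quadratic cost of the history sums is exactly what the FFT-based implementations of [19,21] avoid.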

Improved Subsequent Guesses for the Initial Values
The major contribution of our work is a new efficient method for guessing initial values y(a) that hit y^* = y(b) more and more accurately. Traditional approaches [5,7,15] use classical bisection, which halves the size of the containment interval for the "correct" choice of y(a) in each step. Clearly, bisection is convergent, but it takes a large number of iterations to arrive in a sufficiently small neighbourhood of the exact solution if the containment interval is large, say 10^7 units wide. Note that ten interval halving steps reduce the error of the initial value location only by a factor of about 10^3 (since 2^10 ≈ 1000), and it would take 40 such iterations to reduce this error to a reasonable 10^−5. We suggest a different method that converges much faster. Section 5 compares the new approach with classical bisection based methods.
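The iteration count quoted above follows directly from the interval-halving arithmetic, as this one-liner confirms:

```python
import math

# Number of halvings needed to shrink a containment interval of width
# 1e7 down to 1e-5: log2(1e7 / 1e-5) = log2(1e12), which is about 39.9.
steps = math.ceil(math.log2(1e7 / 1e-5))
print(steps)
```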
Like the classical bisection method, our approach requires two initial guesses ỹ_0^{(0)} and ỹ_0^{(1)} for the initial value. By Corollary 2.1, the proportion between the two solution values y_1^* and y_2^* obtained at t = b and their starting values y_{0,1} and y_{0,2} at t = a indicates how to place further initial guesses until we find a starting value whose solution reaches the desired final value y^* within a chosen error bound.
As long as only one initial guess ỹ_0^{(0)} is available, i.e., when the next guess ỹ_0^{(1)} has not yet been computed, we assume, due to a lack of any information that might suggest otherwise, that the proportionality factor ĉ between the terminal values (i.e. the function values of the solution at t = b) and the initial values (the corresponding values at t = a), see Remark 2.2, is given by eq. (2.8). The values of c_* and c^* in this formula are then replaced by their approximations indicated in eq. (2.9). According to Remark 2.2, our next guess for the initial value is then given by eq. (3.1); it coincides with ỹ_0^{(0)} if and only if the latter has resulted in the exact solution, i.e., if and only if ỹ_0(b) = y^* and the problem has already been solved.
Remark 3.1 Note that evaluating the formulas in (2.9) requires knowledge of an approximate solution to the given differential equation for some initial condition. At this stage, such information is already available because we have computed a 'solution' using the first guess ỹ_0^{(0)}. For the Mittag-Leffler function evaluations involved, we suggest using the algorithm developed in [17].
- In the case of a non-dissipative fractional differential equation, we have seen in Remark 2.3 that the approach of Remark 2.2 may lead to very poor approximations of ĉ, whose true value may be massively over-estimated. Therefore, for non-dissipative problems, one is likely to be better off simply choosing an arbitrary, not excessively large, positive number for ĉ, such as ĉ = 1, in eq. (3.1).
- If the user believes that evaluating the formulas in (2.9) is too expensive, then one may again use ĉ = 1 in equation (3.1), even for dissipative equations. Then the guess ỹ_0^{(1)} may be worse than the one obtained with ĉ from (2.8), and the number of iterations until satisfactory accuracy may increase slightly.
- Remark 2.4 suggests another way to choose ĉ, as it tries to find a compromise between the two earlier suggestions.
In the numerical experiments of Section 5, we report the results of our new method for various choices of ĉ, such as the construction of Remark 2.2, the idea of Remark 2.4, or simply setting ĉ = 1. In practice, the method of Remark 2.4 usually leads to the smallest number of iterations and the quickest overall convergence.
Our strategy for constructing additional initial values ỹ_0^{(k)} for k = 2, 3, . . . is based on Corollary 2.1, which states that, given two fractional initial value problems for the same fractional differential equation but starting from different initial conditions, the difference in the terminal values of these two problems is approximately proportional to the difference in their initial values. Strictly speaking, Corollary 2.1 only holds in the asymptotic sense when the difference between subsequent initial values tends to zero. In practice, our proportional secting idea has worked exceedingly well for all types of fractional ODEs even when this assumption is not satisfied, and the results in Section 5 show this clearly.
Next we need to specify the initial value guesses ỹ_0^{(k)} for k ≥ 2 from the two most recent earlier guesses. Our algorithm always analyzes the two most recent iteration results for y at b and compares their calculated approximations ỹ_{k−1}(b) and ỹ_{k−2}(b) with the desired value y^*. How are these three values for y at b positioned? We found it convenient to express the target value y^* as a generalized convex combination of the two ỹ_μ(b) values for μ ∈ {k − 2, k − 1} and write

y^* = (1 − λ_k) ỹ_{k−2}(b) + λ_k ỹ_{k−1}(b)  (3.2)

for some λ_k ∈ R. Here λ_k can be computed immediately since all other quantities in (3.2) are known. Since any positioning of the three values relative to each other is possible, this concept may lead to λ_k < 0 or λ_k > 1, which would not be admissible in a classical convex combination, but this does not create any difficulties for our algorithm. From λ_k we compute the new guess for the next shooting start as

ỹ_0^{(k)} = (1 − λ_k) ỹ_0^{(k−2)} + λ_k ỹ_0^{(k−1)},  (3.3)

i.e., the new starting guess is a generalized convex combination of the two preceding guesses that uses the same proportions as those in eq. (3.2). Evidently, if the statement of Corollary 2.1 were an equality and no errors occurred in the numerical solver for the initial value problem, then (3.3) would lead to a starting guess that hits the desired target value y^* exactly.

With the value of λ_k as computed from equation (3.2), the next initial value guess can equivalently be written in the corrected form of eq. (3.4). Thus, the correction term that gets us from the previous initial value ỹ_0^{(k−1)} to the next starting guess ỹ_0^{(k)} does not depend on how the three values at b are positioned with respect to each other or to y^*. This formulation was chosen deliberately to avoid any logical case distinctions when executing our proportional secting method. The reference point in eq. (3.4) is always y^*. The algorithm computes ỹ_0^{(k)} whose associated function value ỹ_k(b) is typically closer to y^* than at least one of those generated by the initial values ỹ_0^{(k−1)} and ỹ_0^{(k−2)}. Once ỹ_0^{(k)} and the associated terminal value ỹ_k(b) have been computed, we drop the oldest data pair ỹ_0^{(k−2)} and ỹ_{k−2}(b), continue with the pairs with indices k and k − 1, and iterate until |ỹ_k(b) − y^*| has dropped below the required accuracy threshold.
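The update just described is, in essence, a secant-type iteration on the map from initial to terminal values. The Python sketch below reconstructs it from the description of eqs. (3.2)-(3.3); the stand-in "terminal value map" T is a hypothetical smooth monotone function, not a fractional ODE solver:

```python
import math

def proportional_sect(solve_ivp, y_star, g0, g1, tol=1e-10, max_iter=50):
    """Proportional secting: keep the two most recent (start, terminal)
    pairs, express y_star as a generalized convex combination of the
    two terminal values (lambda_k may lie outside [0, 1]), and apply
    the same combination to the two starts."""
    p0, p1 = (g0, solve_ivp(g0)), (g1, solve_ivp(g1))
    for k in range(max_iter):
        (x0, b0), (x1, b1) = p0, p1
        if abs(b1 - y_star) <= tol:
            return x1, k
        lam = (y_star - b0) / (b1 - b0)  # y* = (1-lam) b0 + lam b1
        xn = (1 - lam) * x0 + lam * x1   # same combination of starts
        p0, p1 = p1, (xn, solve_ivp(xn)) # drop the oldest pair
    raise RuntimeError("no convergence")

# Hypothetical y0 -> y(b) map, monotone and smooth.
T = lambda y0: y0 + 0.3 * math.sin(y0)
y0, iters = proportional_sect(T, 2.0, 2.0, 1.0)
print(iters, abs(T(y0) - 2.0))
```

On this smooth toy problem the iteration converges superlinearly, needing only a handful of steps where bisection of the same bracket would need dozens.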
Compared to classical bisection, our approach has two significant advantages:
1. Before a classical bisection method can be started, the correct initial value y(a) of the solution y must be known to lie inside the search interval, i.e., two numbers y_0 and ȳ_0 with y(a) ∈ [y_0, ȳ_0] must have been computed for the actual solution y with y(b) = y^*. Our method requires no such enclosure; any pair of first guesses can be used.
2. In classical bisection, one starts with the initial interval [y_0, ȳ_0] in which the exact solution's value for y(a) is known to be located. In each iteration step, the size of this interval (and hence the accuracy with which one knows the correct initial value) is reduced by one half. While this method clearly converges, its convergence is typically rather slow: when the interval [y_0, ȳ_0] is large, classical bisection often requires very many iterations to reach an acceptable accuracy in the 10^−6 or 10^−8 range.
Our examples demonstrate that our proportional secting scheme reduces the size of the initial search interval much faster and thereby solves the shooting problem with fewer iterations. The authors of [14] have neither stated any properties of this approach nor provided an analysis nor given any reasons for using this method. The two main advantages of the secant-type method over the bisection approach seem to have gone unnoticed so far.
A possible drawback is that classical bisection always maintains an enclosure of the solution's exact initial value, whereas our algorithm does not guarantee such an enclosure in any iteration, and we never know whether the current guesses bracket the exact initial value. Since for most practical applications it is not relevant to have this information, we do not consider this a major drawback. We could modify proportional secting so that it retains the enclosure property once it has been obtained: instead of always replacing the older of the two previous values, we might replace the value that is on the same side of the exact solution as the new one, thus employing the regula falsi (false position) method. However, using the regula falsi here comes at the cost of potentially requiring significantly more iterations before reaching acceptable accuracy. Therefore we do not pursue this idea further.

Selection of the Step Size for the Numerical IVP Solver
To reduce the computational cost of an iterative shooting algorithm, it has been proposed in [7] to vary the step size over the iterations: when the iteration counter k of the "shots" is small, one presumably is still relatively far from the exact solution because the initial value is not yet sufficiently accurate, so it does not make sense to solve the initial value problem with high accuracy, and one can use relatively large steps in the early iterations. Our numerical experiments in Section 5 indicate that no such difficult-to-implement step size variation is necessary for any of the three variants of our proportional secting approach, because we always arrive at a very accurate solution in a small number of iterations (3 to 8 instead of 18 to 65). Therefore we have decided to use a fixed step size and the same IVP solver in all shooting iterations.

Algorithmic Description of the Proportional Secting Scheme
We now write out the proportional secting method in a formal pseudo-code like manner in Algorithm 3.1 below.

Algorithm 3.1 Shooting method based on proportional secting
input data:
- the terminal value problem (1.3), i.e., the function f on the right-hand side of the differential equation and the desired terminal value y^*;
- a numerical method for solving fractional initial value problems (e.g., the fractional Adams method, the fractional trapezoidal method or a fractional BDF);
- a step size h to be used by the numerical IVP solver (or an arbitrary set of mesh points for the discretization of the time interval [a, b]);
- a desired accuracy level ε > 0;
- a decision strategy for choosing the value ĉ in eq. (3.1).

For a better understanding of the performance and behavior of our algorithm, we now give a short theoretical analysis.

Accuracy and Convergence
We begin with a look at error estimates. The total error of our numerical scheme has two components originating from two different sources, besides rounding and truncation errors.
When solving the terminal value problem (1.3) in the k-th iteration of the shooting procedure, we actually solve a neighboring problem starting from the guess y(a) = ỹ_0^{(k)} and obtain y_k with y_k(b) ≠ y(b) = y^* for the exact solution y. This is one component, e_1(k), of the total error. Additionally, there is another error component e_2(k, N) associated with the k-th iterative solution y_k, namely the computational error inherent in the numerical integration algorithm that we use; e_2(k, N) depends on the number N of grid points of the fractional IVP solver and on the k-th initial value problem. We shall assume that the grid points are uniformly spaced as t_j = a + j(b − a)/N for j = 0, 1, 2, . . . , N.
We first consider the limit case k → ∞, i.e. we assume that we have performed so many shooting iterations that the exact terminal value y_∞(b) = y(b) = y^* is reached numerically. We denote the solutions of this special terminal value problem by y_∞ for the exact solution and by ỹ_{∞,N} for the numerical solution, where, in contrast to the convention used in the remainder of this paper, the notation explicitly indicates the number N of grid points used in the IVP solver. Since k cannot be varied any more in the limit case, it remains to study the influence of the parameter N, i.e., the number of grid points of the IVP solver, on the error. The following theorem indicates that the convergence order of the underlying IVP solver is retained.

Remark 4.1 In Theorem 4.1, we require the initial value solver to solve the given differential equation with an O(N^{−p}) error for every initial value, not just for the special initial value of the exact solution of the given terminal value problem. This requirement is due to a special property of fractional differential equations: their solutions tend to have only weak smoothness properties in general, but may behave much more smoothly for some exceptional initial values, see [6, Section 6.4]. If the exact solution of the terminal value problem has such an exceptional initial value, then the IVP solver may compute a numerical approximation rapidly when starting from the exact initial value, but generally the iterations converge much more slowly for all other, even nearby, initial values.
Proof Our proof has two steps. First we prove (4.2), and then we use (4.2) to show eq. (4.1).
In an indirect proof, we assume that (4.2) is not true. Then there exist a function ϕ : N → R and a strictly increasing sequence (n_μ)_{μ=1}^∞ of positive integers such that lim_{μ→∞} ϕ(n_μ) = ∞ and the negated bound holds for all sufficiently large μ and some constant c > 0. By the definition of ỹ_{∞,N}, we have ỹ_{∞,N}(t_N) = ỹ_{∞,N}(b) = y^*, and whenever N occurs in the sequence (n_μ), we have:
- After convergence, z_{1,N} equals N^p times the absolute value of the difference between the exact solution of the given terminal value problem (1.3) evaluated at t = b, i.e. the solution of the initial value problem for the associated differential equation combined with the initial value y(t_0), and the exact solution of the terminal value problem as computed by the shooting method. The shooting method's terminal value problem is equivalent to the initial value problem for the same differential equation but with the initial value ỹ_{∞,N}(t_0). Thus, Corollary 2.1 and eq. (4.3) provide a lower bound for z_{1,N} that grows beyond all bounds.
- After convergence, z_{2,N} equals N^p times the absolute value of the error of the numerical solution of the initial value problem for y_∞ at the point t_N. By assumption, the IVP solver converges at an O(N^{−p}) rate, and therefore there exists some constant c′ > 0 such that 0 ≤ z_{2,N} ≤ c′ N^p N^{−p} = c′ for all sufficiently large N in the sequence (n_μ).
Consequently, for sufficiently large N in the sequence (n_μ), we have z_{1,N} > z_{2,N}, and hence the rightmost entry of eq. (4.4) is strictly positive, giving the required contradiction. So eq. (4.2) has been proved. Now the maximum over j = 0, 1, 2, . . . , N can be estimated by these two contributions. Moreover, ỹ_{∞,N}(t_0) = y_∞(t_0) because the initial value problem solver that generates the approximation ỹ_{∞,N} is exact at the initial point, so that eq. (4.1) follows by Corollary 2.1 and eq. (4.2), together with the convergence rate of the initial value problem solver, for sufficiently large N.

⊓ ⊔
For only finitely many shooting steps, the following similar error bounds hold: the total error splits into one component that depends only on k and another that depends on both k and the chosen number N of grid points.
Proof This can be shown in much the same way as Theorem 4.1.

⊓ ⊔
Numerical experiments in Section 5 will illustrate these estimates.

Stability and Robustness
The numerical stability and robustness of an algorithm are relevant issues when assessing its practical usefulness. For shooting algorithms, we have to deal with two essential aspects in this context. The fundamental idea here is to solve initial value problems that are close to, but not identical with, the initial value problem that is equivalent to the given terminal value problem. From Theorem 2.3 and Corollary 2.1, we can see the well-posedness of both the original terminal value problem and the associated equivalent initial value problem. Thus, small changes in either of these problems, no matter whether they are due to the way in which the shooting method works, to rounding errors in the given data, to inherent errors of the initial value problem solver, or to anything else, do not lead to significant perturbations of the algorithm's output.
Another key component is the initial value solving algorithm that is executed in every iteration of a shooting method. If an integrator with poor stability properties is used, or if the chosen step size is too large to guarantee stability, then the instability will propagate into the shooting method and may render its output meaningless. Fortunately, the stability properties of many frequently used fractional IVP solvers are well understood: for the Adams-Bashforth-Moulton (ABM) method, see [16]; for fractional linear multistep methods such as the fractional BDF2 or the fractional trapezoidal method, see [25]; for additional information, see [18]. These well established IVP solvers are highly suitable for our purposes and will be used in the numerical examples of Section 5.

Numerical Results
Here we present numerical experiments with our new proportional secting scheme, in all three of its variants for choosing the value ĉ required in eq. (3.1) when computing the second guess for the initial value, before continuing with further proportional secting iterations. We compare our new method with the conventionally used shooting method based on bisection. Our algorithm has been implemented in MATLAB R2022a on a notebook with an Intel Core i7-8550U CPU clocked at 1.8 GHz running Windows 10, and in MATLAB R2022b on a MacBook Pro with a 2.4 GHz Quad-Core Intel Core i5 and 16 GB RAM.
In all cases, we have tested the shooting methods with two different solvers for the initial value problems: the Adams-Bashforth-Moulton (ABM) scheme and the second-order backward differentiation formula (BDF2) of [25]. The ABM method was implemented in a P(EC)^m E structure with four corrector iterations [4]. BDF2 is an implicit method and hence needs to solve a nonlinear equation at each time step to compute the corresponding approximate solution; for this, we use Garrappa's implementation [18] and terminate its Newton iterations when two successive values differ by less than 10^{-10}. All shooting iterations are terminated when the approximate solution y_k at the endpoint b of [a, b] differs by at most ε from the desired value y(b) = y*, with ε = 10^{-6}, ε = 10^{-8}, and ε = 10^{-10} in our tests.
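To make the overall procedure concrete, the following is a minimal Python sketch of a shooting iteration with classical bisection and the terminal-value stopping criterion described above. It is only an illustration: the names `shoot_bisection` and `euler` are ours, `solve_ivp` stands in for a fractional IVP solver such as ABM or BDF2 (the actual experiments use the MATLAB implementations cited above), and a plain forward Euler method for a classical first-order ODE takes the solver's place here.

```python
import math

def shoot_bisection(solve_ivp, f, a, b, y_star, lo, hi, eps=1e-8, max_iter=200):
    """Bisection shooting for a terminal value problem y(b) = y_star.

    solve_ivp(f, a, b, y0) returns an approximation to y(b) for the IVP with
    y(a) = y0; it is assumed to be monotonically increasing in y0, and
    [lo, hi] must bracket the correct initial value.
    """
    for iters in range(1, max_iter + 1):
        y0 = 0.5 * (lo + hi)          # bisection guess for the initial value
        yb = solve_ivp(f, a, b, y0)   # inner IVP solve (ABM or BDF2 in the paper)
        if abs(yb - y_star) <= eps:   # terminal-value stopping criterion
            return y0, iters
        if yb < y_star:
            lo = y0                   # solution ends too low: raise the initial value
        else:
            hi = y0
    raise RuntimeError("shooting iteration did not converge")

def euler(f, a, b, y0, n=1000):
    """Toy stand-in for the fractional IVP solver: forward Euler for y' = f(t, y)."""
    h = (b - a) / n
    t, y = a, y0
    for _ in range(n):
        y += h * f(t, y)
        t += h
    return y

# Toy check: recover y(0) = 1 for y' = -y with prescribed terminal value y(1) = e^{-1}
y0, iters = shoot_bisection(euler, lambda t, y: -y, 0.0, 1.0,
                            math.exp(-1.0), 0.0, 2.0, eps=1e-6)
```

With the paper's fractional solvers in place of `euler`, this loop is exactly the classical bisection shooting baseline that appears in the tables below.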
The tables below list the chosen initial value problem solver together with the corresponding step size, the maximal error over the interval of interest, and the number of iterations that each of the shooting methods needed with the respective combination of IVP solver and step size to converge to the required accuracy. In this context, we note that, since the shooting strategies (and hence the sequences of chosen initial values) differ from each other, the approximate solutions computed by the four different approaches do not coincide exactly. Therefore, the respective maximal errors are also not precisely identical. However, at least for ε = 10^{-8} and ε = 10^{-10}, the maximal errors agree with each other within the accuracy listed in the tables. For ε = 10^{-6}, the error variations are somewhat larger, but their values have a common order of magnitude; for this case, the corresponding table columns list the errors for the worst of our four guessing approaches.

Example 5.1 Our first example is the terminal value problem whose exact solution is known in closed form; this is a standard example used for testing numerical methods in fractional calculus, cf., e.g., [10]. We report the results for the special case α = 0.3 in Tables 5.1, 5.2 and 5.3. We note the following:
- Even if a very small step size is used in the initial value solver, the best accuracy that we can achieve is limited by the parameter ε that governs the termination criterion of the shooting iterations. Generally, the total error is slightly larger than ε, with the excess depending on the chosen step size. For relatively small steps, the error component e_1(k) dominates the overall error estimate established in Theorem 4.2, and for larger steps e_2(k, N) is the dominant error contribution.
- In Tables 5.2 and 5.3, where the accuracy requirement ε is much smaller than the error term e_2(k, N) from Theorem 4.2, the convergence rate of BDF2 is around O(h^2), which is exactly the rate for the corresponding initial value problems, see [25]. This confirms the behavior predicted by Theorem 4.1, where we dealt with the case ε = 0. For the Adams method, the expected convergence rate is again O(h^2) for this example, see [4], but the actual rate in the data is a little lower at O(h^{1.9}), because the step size is still too large for the asymptotics to have set in.
- Varying the strategy for selecting the next initial guess (i.e., switching between classical bisection and proportional secting) while keeping the IVP solver fixed has no influence on the final result, because changing the starting point guessing rule means solving the same nonlinear equation by a different method, which should not lead to significantly different results.
- There is, however, a substantial difference in the number of iterations that the two guessing strategies require to obtain a result with the desired accuracy.
Classical bisection needs many iterations, and this induces a rather high computational cost. The proportional secting method performs much better: it typically requires only between 8% and 24% of the iterations of classical bisection.
- A simple run time comparison of the two algorithms reflects this speedup.
- For example, for the case ε = 10^{-10} shown in Table 5.3, the run time measurements confirm this advantage.

Example 5.2 The results for our second example are in Tables 5.4, 5.5 and 5.6. For this and the third example below, we list the data for fewer step sizes than for Example 5.1. Note the following:
- In Example 5.2, the exact solution satisfies y(b) = y(7) ≈ 0.6476, while y(a) = y(0) = 14/5 = 2.8, so y(b) is a relatively poor initial guess for y(0). All of our rules for guessing the shooting starting values perform very similarly to Example 5.1.
- The performance comparisons between bisection and proportional secting are essentially the same as for Example 5.1: proportional secting reaches the required accuracy with only between 8.3% and 15% of the iterations that the bisection method needs, and the run times decrease correspondingly.
- For both shooting strategies, the accuracy of the second-order backward differentiation formula BDF2 is much better than the accuracy of the ABM method.
- This example uses a dissipative fractional differential equation that is linear, homogeneous, and has the constant negative growth coefficient −3/2. For fractional ODEs with constant coefficients, the proportionality factor ĉ is discussed in Remark 2.2 and is simple to compute. In this example, the lower bound c_* coincides with the upper bound c^*; both have the value 0.23..., which we can use for ĉ. This leads to a slight reduction in the number of required iterations.
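The size of this iteration-count gap can be illustrated with a toy computation. By Corollary 2.1, the terminal value depends (approximately) proportionally on the initial value; an update that exploits this slope locates the root of the residual map far faster than interval halving. The Python sketch below is only a generic illustration, not the paper's exact proportional secting rule: it compares bisection with a secant-type step on a hypothetical linear residual map `g` with made-up constants `C` and `Y_STAR`.

```python
# Hypothetical residual map: terminal value proportional to the initial value
# y0 with an assumed factor C, minus the prescribed terminal value Y_STAR.
C, Y_STAR = 0.3679, 1.0
def g(y0):
    return C * y0 - Y_STAR

def bisect(g, lo, hi, eps=1e-10):
    """Interval halving on [lo, hi]; assumes g is increasing with a root inside."""
    it = 0
    while True:
        mid = 0.5 * (lo + hi)
        it += 1
        if abs(g(mid)) <= eps:
            return mid, it
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid

def secant(g, x0, x1, eps=1e-10):
    """Slope-exploiting secant update; exact after one step for a linear g."""
    it = 0
    while abs(g(x1)) > eps:
        x0, x1 = x1, x1 - g(x1) * (x1 - x0) / (g(x1) - g(x0))
        it += 1
    return x1, it

root_b, it_b = bisect(g, 0.0, 10.0)   # interval halving: roughly 35 iterations
root_s, it_s = secant(g, 0.0, 10.0)   # slope-based update: one step here
```

For the nonlinear fractional problems of this section the residual map is not exactly linear, so the reduction is less extreme than in this toy case, but the observed 8%-24% iteration counts are of the same nature.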

Example 5.3
The third example is based on an initial value problem whose exact solution is unknown. Using the second-order backward differentiation formula with 16,000,000 steps on [a, b] = [0, 20], and thus a step size of 20/(16 · 10^6) = 1.25 · 10^{-6}, we have computed the approximate solution shown in Figure 3.1 as the curve highlighted by the black dots. We are confident that this is very close to the exact solution, whose terminal value is y(b) = y(20) ≈ 0.8360565. By replacing the given initial condition by the corresponding terminal condition, we obtain a terminal value problem that we then solve with the fractional ODE shooting methods of this paper. This problem appears more challenging than those of Examples 5.1 and 5.2 because here we work on a much larger interval and with a decaying oscillatory exact solution. The results are listed in Tables 5.7, 5.8 and 5.9. Since precise information about the exact solution is unavailable, the listed errors are computed with respect to the reference solution constructed above. These tables again exhibit a behavior similar to Examples 5.1 and 5.2: the proportional secting method is substantially faster than the classical bisection method, requiring significantly fewer shooting iterations to converge to the required accuracy. The same holds true for the run time measurements, e.g., for ε = 10^{-8} (see Table 5.8) and a BDF2 solver with step size 0.02.

There is a significant difference in Example 5.3 compared to Example 5.2: using the strategy of Remark 2.2 to compute the second guess for the initial value leads to convergence, but it is slightly worse than simply using ĉ = 1 as suggested in Remark 3.2. This fractional differential equation is not dissipative, and its containment interval bounds given by (2.6) are c_* ≈ 0.05 and c^* ≈ 5 · 10^7.
Thus, the first containment interval for the correct proportionality factor is extremely large, and the midpoint method of Remark 2.2 then starts from a very large error and continues with a relatively poor second approximate solution, so that more iterations are needed until convergence.
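The impact of such a wide containment interval can be quantified with a back-of-the-envelope computation: a bisection-type search needs about log2((c^* − c_*)/tol) halvings to localize a value inside [c_*, c^*] to within a tolerance tol. A small Python sketch with the interval bounds quoted above (the tolerance 10^{-6} and the helper name `halvings` are illustrative choices of ours, not values from the paper):

```python
import math

def halvings(lo, hi, tol):
    """Number of interval halvings needed to shrink [lo, hi] to width tol."""
    return math.ceil(math.log2((hi - lo) / tol))

# Example 5.3: containment interval [0.05, 5e7] for the proportionality factor
n_wide = halvings(0.05, 5e7, 1e-6)   # 46 halvings for the wide interval
# For comparison, in Example 5.2 we had c_* = c^* ~ 0.23, so no search at all.
```

This explains why the midpoint strategy of Remark 2.2 starts so poorly here, whereas the simple choice ĉ = 1 already lies in a reasonable part of the interval.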

Conclusion
We have discussed shooting methods for the numerical solution of fractional terminal value problems with Caputo derivatives. Choosing the best numerical IVP solver was not our focus; this has already been discussed extensively in [14,16,18,19]. Instead, we have investigated and tested algorithms that select the initial values for the iterations of fractional ODE shooting procedures. Classical bisection is often recommended and most often used, but it converges rather slowly, requiring many iterations before it approximates the actual solution well. The newly proposed proportional secting method performs much better here: it computes the second and subsequent guesses for the starting values of the shooting procedure rather differently. Three separate variants, differing only in the guess of the second initial value, have been proposed, and their respective performances have been shown to differ only slightly in speed and accuracy. Our experimental findings are supported by the results of our analytical investigations.
Remark 6.1 The Caputo differential operator listed in eq. (1.2) is not the only fractional differential operator of order α with starting point a for which one can try to formulate terminal value problems. Indeed, it is conceivable to use the so-called Hilfer fractional derivatives of order α and type µ ∈ [0, 1] with starting point a [22,Definition 3.3] instead. The special case µ = 1 of this class of operators is, on a very large set of functions, equivalent to the Caputo operators that we have discussed here; for µ = 0 one obtains the Riemann-Liouville derivatives [6,Chapter 2]. In principle, one could use the approach that we have proposed here to handle such generalized terminal value problems. One would then have to replace the initial value problem solving subroutines for Caputo IVPs in Algorithm 3.1 by corresponding functions for the modified operators, which is a relatively straightforward matter. However, to theoretically justify the proportional secting idea in this generalized context, one would also need to generalize our Corollary 2.1 in a corresponding way. Such a result currently does not seem to be available.

Software
The MATLAB source codes of the algorithms described in this paper, including all required auxiliary functions, can be downloaded from a dedicated repository [12] on the Zenodo platform, thus allowing all readers to reproduce the results of our numerical experiments. Our functions were tested in MATLAB R2022a and MATLAB R2022b.

Declarations
Funding The authors did not receive support from any organization for the submitted work.

Declaration of Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Author Contributions Both authors contributed equally to this work.
Data Availability Not applicable.