Introduction

Given \(n+1\) distinct interpolation nodes \(x_0,\ldots ,x_n\) with associated data \(f_0,\ldots ,f_n\), the classical interpolation problem consists in finding a function \(r:\mathbb {R}\rightarrow \mathbb {R}\) that interpolates \(f_i\) at \(x_i\), that is,

$$\begin{aligned} r(x_i) = f_i, \qquad i=0,\ldots ,n. \end{aligned}$$
(1)

In the case of polynomial interpolation of degree at most n, this problem has a unique solution, which can be expressed in Lagrange form as

$$\begin{aligned} r(x) = \sum _{i=0}^n \ell _i(x) f_i, \qquad \ell _i(x) = \prod _{j=0,\,j\ne i}^n \frac{x-x_j}{x_i-x_j}. \end{aligned}$$

While this form is advantageous for theoretical analysis, its evaluation requires \(O(n^2)\) operations and can be numerically unstable. It is advisable to consider instead the first barycentric form of r,

$$\begin{aligned} r(x) = \ell (x) \sum _{i=0}^n \frac{w_i}{x-x_i} f_i, \qquad \ell (x) = \prod _{j=0}^n (x-x_j), \end{aligned}$$
(2)

with the Lagrange weights \(w_i\) defined as

$$\begin{aligned} w_i = \prod _{j=0,\,j\ne i}^n \frac{1}{x_i-x_j}, \qquad i=0,\ldots ,n. \end{aligned}$$
(3)

The first barycentric form is more efficient than the Lagrange form, as it can be evaluated in O(n) operations, after computing the \(w_i\), which are independent of x, in \(O(n^2)\) operations in a preprocessing step. Higham [1] further shows that this evaluation is backward stable with respect to perturbations of the data \(f_i\). Another means of evaluating r is given by the second barycentric form of r,

$$\begin{aligned} r(x) = \frac{\sum _{i=0}^n \frac{w_i}{x-x_i} f_i}{\sum _{i=0}^n \frac{w_i}{x-x_i}}, \end{aligned}$$
(4)

which can be derived from (2) by noticing that \(1=\ell (x)\sum _{i=0}^n\frac{w_i}{x-x_i}\). Evaluating this formula also requires O(n) operations, but it comes with the advantage that the \(w_i\) can be rescaled by a common factor to avoid underflow and overflow [2]. Moreover, the second barycentric form is forward stable, as long as the Lebesgue constant associated with the interpolation nodes \(x_i\) is small [1], which is the case, for example, for Chebyshev nodes of the first and the second kind, but not for equidistant nodes.
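The two barycentric forms for polynomial interpolation can be illustrated with a minimal Python sketch. The function names, the choice of Chebyshev nodes of the second kind, and the cubic test data are assumptions made only for this illustration; the evaluation point is assumed not to coincide with a node.

```python
import math

def lagrange_weights(nodes):
    """Lagrange weights w_i from (3); O(n^2) preprocessing, independent of x."""
    weights = []
    for i, xi in enumerate(nodes):
        prod = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                prod *= xi - xj
        weights.append(1.0 / prod)
    return weights

def first_form(x, nodes, weights, data):
    """First barycentric form (2); assumes x is not a node."""
    ell = 1.0
    for xj in nodes:
        ell *= x - xj
    return ell * sum(w / (x - xi) * f for w, xi, f in zip(weights, nodes, data))

def second_form(x, nodes, weights, data):
    """Second barycentric form (4); assumes x is not a node."""
    num = sum(w / (x - xi) * f for w, xi, f in zip(weights, nodes, data))
    den = sum(w / (x - xi) for w, xi in zip(weights, nodes))
    return num / den

# Hypothetical test setup: Chebyshev nodes of the second kind, cubic data.
n = 8
nodes = [math.cos(math.pi * i / n) for i in range(n + 1)]
data = [t**3 - 2.0 * t + 1.0 for t in nodes]
weights = lagrange_weights(nodes)

x = 0.3
exact = x**3 - 2.0 * x + 1.0  # a cubic is reproduced by interpolation of degree 8
r1 = first_form(x, nodes, weights, data)
r2 = second_form(x, nodes, weights, data)
```

Since a common factor of the weights cancels in (4), `second_form` returns essentially the same value when every entry of `weights` is multiplied by the same non-zero constant, whereas `first_form` does not have this property.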

In the case of rational interpolation, the interpolation problem (1) no longer has a unique solution, but Berrut and Mittelmann [3] show that every rational interpolant of degree at most n can be expressed in the second barycentric form (4) for a specific choice of weights \(w_i\). Vice versa, Schneider and Werner [4] note that for any set of non-zero weights \(w_i\), the function r in (4) is a rational interpolant of degree at most n. An important subset of these barycentric rational interpolants are those that do not have any poles in \(\mathbb {R}\). This is obviously true for the Lagrange weights in (3), but also for the Berrut weights [5]

$$\begin{aligned} w_i = {(-1)}^i, \qquad i=0,\dots ,n, \end{aligned}$$
(5)

and, assuming the interpolation nodes to be in ascending order, that is, \(x_0<\dots <x_n\), for the family of weights with parameter \(d\in \{0,\ldots ,n\}\),

$$\begin{aligned} w_i = \sum _{k=\max (i-d,0)}^{\min (i,n-d)} {(-1)}^k \prod \limits _{j=k,\,j\ne i}^{k+d} \frac{1}{x_i-x_j}, \qquad i=0,\ldots ,n, \end{aligned}$$
(6)

proposed by Floater and Hormann [6]. Note that this family includes the Berrut and the Lagrange weights as special cases for \(d=0\) and \(d=n\), respectively. Floater and Hormann further observe that the barycentric rational interpolant in (4) with the weights in (6) can also be written as

$$\begin{aligned} r(x) = \frac{\sum _{i=0}^{n} \frac{w_i}{x-x_i} f_i}{\sum _{i=0}^{n-d} \lambda _i(x)}, \end{aligned}$$
(7)

where

$$\begin{aligned} \lambda _i(x) = \frac{{(-1)}^i}{(x-x_i) \cdots (x-x_{i+d})}, \qquad i=0,\dots ,n-d. \end{aligned}$$
(8)

As the formula in (7) simplifies to (2) for \(d=n\), we refer to it as the first barycentric form of the corresponding rational interpolant. Note that the first and the second form are identical for Berrut’s interpolant, that is, for \(d=0\).
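The relation between (6), (7), and (8) can be checked numerically. The following Python sketch is a straightforward, unoptimized transcription of these formulas; the function names, the equidistant nodes, and the sine data are assumptions chosen only for illustration, and the evaluation point is assumed not to be a node.

```python
import math

def fh_weights(nodes, d):
    """Weights w_i from (6), computed directly from the formula."""
    n = len(nodes) - 1
    weights = []
    for i in range(n + 1):
        total = 0.0
        for k in range(max(i - d, 0), min(i, n - d) + 1):
            prod = 1.0
            for j in range(k, k + d + 1):
                if j != i:
                    prod /= nodes[i] - nodes[j]
            total += (-1) ** k * prod
        weights.append(total)
    return weights

def fh_lambdas(x, nodes, d):
    """Values lambda_i(x) from (8) for i = 0, ..., n - d."""
    n = len(nodes) - 1
    values = []
    for i in range(n - d + 1):
        prod = 1.0
        for j in range(i, i + d + 1):
            prod *= x - nodes[j]
        values.append((-1) ** i / prod)
    return values

def fh_first_form(x, nodes, d, data):
    """First barycentric form (7)."""
    w = fh_weights(nodes, d)
    num = sum(wi / (x - xi) * fi for wi, xi, fi in zip(w, nodes, data))
    return num / sum(fh_lambdas(x, nodes, d))

def fh_second_form(x, nodes, d, data):
    """Second barycentric form (4) with the weights from (6)."""
    w = fh_weights(nodes, d)
    num = sum(wi / (x - xi) * fi for wi, xi, fi in zip(w, nodes, data))
    den = sum(wi / (x - xi) for wi, xi in zip(w, nodes))
    return num / den

# Hypothetical test setup: 11 equidistant nodes in [0, 1], d = 3.
nodes = [i / 10 for i in range(11)]
data = [math.sin(t) for t in nodes]
x = 0.37
r_first = fh_first_form(x, nodes, 3, data)
r_second = fh_second_form(x, nodes, 3, data)
```

In exact arithmetic both functions return the same value; in floating point arithmetic they differ only by rounding errors, which is what the stability analysis quantifies.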

The result of Higham [1] can be extended to show that the evaluation of the second barycentric form (4) is forward stable, not only in the case of polynomial interpolation, but for general barycentric rational interpolants, provided that the weights \(w_i\) can be computed with a forward stable algorithm and that the corresponding Lebesgue constant is small [7]. For Berrut’s rational interpolant with weights in (5), this is the case for all well-spaced interpolation nodes [8], including equidistant and Chebyshev nodes. For the family of barycentric rational interpolants with weights in (6), the Lebesgue constant is known to grow logarithmically in n, for any constant \(d>0\) and equidistant [9] as well as quasi-equidistant [10] nodes, and the formula in (6) turns out to be a forward stable means of computing the weights \(w_i\) [11].

In this paper, we further generalize the proof of Salazar Celis [7], such that it can also be used for proving the forward stability of the first barycentric form (7), with two important changes (Sect. 3). On the one hand, the result relies on the fact that not only the \(w_i\), but also the \(\lambda _i\) can be evaluated with a forward stable algorithm. This is indeed the case, both for the formula in (8), which requires \(O(d(n-d))\) operations for \(0<d<n\), and for a more efficient, but slightly less stable formula that gets by with O(n) operations (Sect. 4). On the other hand, the Lebesgue constant must be replaced by a similar quantity that depends on the functions \(\lambda _i\), for which we prove that it is at most on the order of \(O(\mu ^d)\), where \(\mu \) is the mesh ratio of the interpolation nodes (Sect. 6). Moreover, we show that a more efficient formula [12] for computing the weights in (6) is forward stable, too (Sect. 4).

Regarding backward stability, Mascarenhas and Camargo [13] show that the second barycentric form is backward stable under the same assumptions, namely forward stable weights \(w_i\) and small Lebesgue constant. Moreover, Camargo [11] proves that the first barycentric form is backward stable, as long as the denominator in (7) is computed with a special algorithm in O(nd) operations.

In this paper, we derive a substantially different approach that can be used to show the backward stability of both barycentric forms (Sect. 5). For the second barycentric form, our result provides an upper bound on the perturbation of the data \(f_i\) that is smaller than the upper bound by Mascarenhas and Camargo [13]. For the first barycentric form, our upper bound is larger than the one found by Camargo [11], but it comes with the advantage of holding for a more efficient way of computing the denominator in (7) in O(n) operations, which is based on our new O(n) algorithm for evaluating the \(\lambda _i\) (Sect. 4).

It is important to note that our results hold under the assumption that the input values to the algorithm, \(x_i\), \(f_i\), and x, are given as floating point numbers, so they do not introduce any additional error when we compute both forms (4) and (7). Our stability analysis does not cover errors that result from initially rounding the given values to floating point numbers.

Our numerical experiments (Sect. 7) confirm that the new O(n) algorithm for the \(\lambda _i\) leads to an evaluation of the first barycentric form (7) of the rational interpolant with weights in (6) that is considerably faster than the algorithm proposed by Camargo [11], especially for larger d, at the price of only marginally larger forward errors. Evaluating the interpolant using the second barycentric form (4) is even faster and can be as stable, but may also result in significantly larger errors for certain choices of interpolation nodes. However, we also report a case in which the second form is stable and the first is not.

Preliminaries

Following Trefethen [14], a general problem g can be seen as a function \(g:U \rightarrow V\) from a normed vector space \((U,{\Vert \cdot \Vert }_U)\) of data to a normed vector space \((V,{\Vert \cdot \Vert }_V)\) of solutions and a numerical algorithm for solving the problem on a computer as another function \(\hat{g}:U\rightarrow V\) that approximates g.

For our analysis, we consider a computer that uses a set \(\mathbb {F}\) of floating point numbers with the corresponding machine epsilon \(\epsilon \) and let \({\textrm{fl}}:\mathbb {R}\rightarrow \mathbb {F}\) be the rounding function that maps each \(x\in \mathbb {R}\) to the closest floating point approximation \({\textrm{fl}}(x)\in \mathbb {F}\). Moreover, we denote by \(\circledast \) the floating point analogue of the arithmetic operation \(*\in \{+,-,\times ,\div \}\), that is,

$$\begin{aligned} x \circledast y = {\textrm{fl}}(x *y) \end{aligned}$$
(9)

for \(x,y\in \mathbb {F}\), and recall [14, Lecture 13] that \({\textrm{fl}}\) and \(\circledast \) introduce a relative error of size at most \(\epsilon \). That is, for all \(x\in \mathbb {R}\) there exists some \(\delta \in \mathbb {R}\) with \({|\delta |}\le \epsilon \), such that

$$\begin{aligned} {\textrm{fl}}(x) = x (1 + \delta ) \end{aligned}$$

and likewise

$$\begin{aligned} x \circledast y = (x *y) (1 + \delta ), \qquad *\in \{+,-,\times ,\div \} \end{aligned}$$
(10)

for all \(x,y\in \mathbb {F}\), which follows immediately from (9). We further note that \({\textrm{fl}}(-x)=-x\) for all \(x\in \mathbb {F}\), so that multiplying a floating point number by \({(-1)}^i\) or taking its absolute value does not entail any rounding error.
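The model in (9) and (10) can be checked on a concrete machine. The following Python sketch assumes IEEE double precision with rounding to nearest, for which \(\epsilon =2^{-53}\), and uses exact rational arithmetic from the standard `fractions` module as reference; the sample values and all names are hypothetical.

```python
from fractions import Fraction

u = 2.0 ** -53  # assumed machine epsilon (unit roundoff) of IEEE double precision

def relative_error(computed, exact):
    """Exact relative error |computed - exact| / |exact| as a Fraction."""
    return abs(Fraction(computed) - exact) / abs(exact)

x, y = 0.1, 0.3                   # Python floats, hence elements of F
X, Y = Fraction(x), Fraction(y)   # the same numbers as exact rationals

# Relative error delta of each floating point operation in (10).
errors = {
    "+": relative_error(x + y, X + Y),
    "-": relative_error(x - y, X - Y),
    "*": relative_error(x * y, X * Y),
    "/": relative_error(x / y, X / Y),
}

# Relative error of rounding the real number 1/3 to the closest float.
rounding = relative_error(float(Fraction(1, 3)), Fraction(1, 3))
```

Each entry of `errors` and the value `rounding` is an exact rational number, so the comparison with `Fraction(u)` is itself free of rounding errors.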

In order to analyse and to compare the quality of numerical algorithms, the usual quantities to consider are the absolute forward error \({\Vert \hat{g}(u)-g(u)\Vert }_V\) and the relative forward error \({\Vert \hat{g}(u)-g(u)\Vert }_V/{\Vert g(u)\Vert }_V\). The algorithm \(\hat{g}\) is called accurate or forward stable, if

$$\begin{aligned} \frac{{\Vert \hat{g}(u)-g(u)\Vert }_V}{{\Vert g(u)\Vert }_V} = O(\epsilon ) \end{aligned}$$

for all \(u\in U\) with \(g(u)\ne 0\) and backward stable at \(u\ne 0\), if

$$\begin{aligned} \hat{g}(u)=g(\hat{u}) \qquad \text {for some}\quad \hat{u}\in U \quad \text {with}\quad \frac{{\Vert \hat{u}-u\Vert }_U}{{\Vert u\Vert }_U} = O(\epsilon ), \end{aligned}$$

where the notation \(x=O(\epsilon )\) means that there exists some positive constant C, such that \({|x|}\le C\epsilon \) as \(\epsilon \rightarrow 0\). In the case of forward stability, this constant usually depends on the relative condition number

$$\begin{aligned} \kappa (u) = \lim _{\delta \rightarrow 0} \sup _{{\Vert h\Vert }_U\le \delta } \biggl ( \frac{{\Vert g(u+h)-g(u)\Vert }_V}{{\Vert g(u)\Vert }_V} \bigg / \frac{{\Vert h\Vert }_U}{{\Vert u\Vert }_U} \biggr ) \end{aligned}$$

of the problem g at u and on the dimensions of U and V.

In this paper, we analyse algorithms for evaluating barycentric rational interpolants, based on the formulas (4) and (7). Following Higham [1], we assume the evaluation point \(x\in \mathbb {F}\) and the interpolation nodes \({\varvec{x}}=(x_0,\dots ,x_n)\in \mathbb {F}^{n+1}\) to be fixed and prove the forward and backward stability with respect to the data \({\varvec{f}}=(f_0,\dots ,f_n)\in \mathbb {F}^{n+1}\). More precisely, we solve the problem

$$\begin{aligned} g :(\mathbb {F}^{n+1},{\Vert \cdot \Vert }_\infty ) \rightarrow (\mathbb {R},{|\cdot |}), \qquad g({\varvec{f}}) = r(x) \end{aligned}$$

by a numerical algorithm

$$\begin{aligned} \hat{g} :(\mathbb {F}^{n+1},{\Vert \cdot \Vert }_\infty ) \rightarrow (\mathbb {F},{|\cdot |}), \qquad \hat{g}({\varvec{f}}) = \hat{r}(x), \end{aligned}$$

and we show that for any choice of x, \({\varvec{x}}\), and \({\varvec{f}}\), the relative forward error is bounded from above (Sect. 3),

$$\begin{aligned} \frac{{|\hat{r}(x)-r(x)|}}{{|r(x)|}} \le C_1 \epsilon + O(\epsilon ^2), \end{aligned}$$

and that there exists some perturbed data \(\varvec{\hat{f}}\) with

$$\begin{aligned} \frac{{\Vert \varvec{\hat{f}}-{\varvec{f}}\Vert }_\infty }{{\Vert {\varvec{f}}\Vert }_\infty } \le C_2 \epsilon + O(\epsilon ^2), \end{aligned}$$

such that \(\hat{r}(x;{\varvec{x}},{\varvec{f}})=r(x;{\varvec{x}},\varvec{\hat{f}})\), where the additional arguments of r are used to emphasize the dependence of the interpolant on the interpolation nodes and the data (Sect. 5). In both cases, \(\epsilon \) is assumed to be sufficiently small, since this is enough to guarantee that the errors are \(O(\epsilon )\). We will see that both constants \(C_1\) and \(C_2\) depend linearly on n and a quantity \(\beta (x)\), which is independent of \({\varvec{f}}\) and different for the first and the second barycentric form. It turns out that this quantity is usually smaller when the first barycentric form is used to evaluate r (Sect. 7). As mentioned above, the constant \(C_1\) further depends on the condition number of the problem, and more specifically on the componentwise relative condition number [15]

$$\begin{aligned} \kappa (x;{\varvec{x}},{\varvec{f}}) = \lim _{\delta \rightarrow 0} \sup _{\begin{array}{c} {\varvec{h}}\in \mathbb {R}^{n+1}\setminus \{{\varvec{0}}\}\\ {|h_i|}\le \delta {|f_i|} \end{array}} \biggl ( \frac{{|r(x;{\varvec{x}},{\varvec{f}}+{\varvec{h}}) - r(x;{\varvec{x}},{\varvec{f}})|}}{{|r(x;{\varvec{x}},{\varvec{f}})|}} \bigg / \max _{\begin{array}{c} i=0,\dots ,n\\ f_i\ne 0 \end{array}} \frac{{|h_i|}}{{|f_i|}} \biggr ). \end{aligned}$$
(11)

For the derivation of these upper bounds, we shall frequently use the following basic facts:

  1.

    By Taylor expansion,

    $$\begin{aligned} \frac{1}{1+y} = \sum _{k=0}^\infty {(-1)}^k y^k \end{aligned}$$

    for any \(y\in \mathbb {R}\) with \({|y|}<1\). Consequently, if \(y=O(\epsilon )\), then

    $$\begin{aligned} \frac{1}{1+y} = 1 - y + O(\epsilon ^2). \end{aligned}$$
    (12)

    Moreover, for any \(\delta \in \mathbb {R}\) with \({|\delta |} \le \epsilon \), there exists some \(\delta '\in \mathbb {R}\) with \({|\delta '|}\le \epsilon +O(\epsilon ^2)\), such that

    $$\begin{aligned} \frac{1}{1+\delta } = 1 + \delta '. \end{aligned}$$
    (13)

    This observation is useful for “moving” the perturbation (10) caused by a floating point operation from the denominator to the numerator.

  2.

    For any \(\delta _1,\dots ,\delta _m\in \mathbb {R}\) with \({|\delta _i|}\le C_i\epsilon \) for some \(C_i>0\), \(i=1,\dots ,m\), there exists some \(\delta \in \mathbb {R}\) with \({|\delta |}\le C\epsilon +O(\epsilon ^2)\), where \(C=\sum _{i=1}^m C_i\), such that

    $$\begin{aligned} \prod \limits _{j=1}^{m}(1+\delta _j) = 1 + \delta . \end{aligned}$$
    (14)

    We use this observation to gather the perturbations caused by computing the product of m terms into a single perturbation.

  3.

    For any \(t_0,\dots ,t_m\in \mathbb {F}\), there exist some \(\varphi _0,\dots ,\varphi _m\in \mathbb {R}\) with \({|\varphi _0|},\dots ,{|\varphi _m|}\le m\epsilon +O(\epsilon ^2)\), such that

    $$\begin{aligned} {\textrm{fl}}\left( \sum _{i=0}^m t_i \right) = ( \cdots (( t_0 \oplus t_1) \oplus t_2) \cdots \oplus t_m) = \sum _{i=0}^m t_i (1+\varphi _i). \end{aligned}$$
    (15)

    This follows from the previous observation, and we use it to estimate the rounding error introduced by simple iterative summation of \(m+1\) floating point numbers.
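The third observation implies that the absolute error of simple iterative summation is at most \(m\epsilon \sum _{i=0}^m{|t_i|}+O(\epsilon ^2)\). A minimal Python sketch of this consequence is given below, assuming IEEE double precision with \(\epsilon =2^{-53}\). To obtain a rigorous inequality without the \(O(\epsilon ^2)\) term, the sketch uses the standard constant \(m\epsilon /(1-m\epsilon )\) instead of \(m\epsilon \); the alternating test terms and all names are assumptions for illustration only.

```python
from fractions import Fraction

u = Fraction(1, 2 ** 53)  # assumed machine epsilon of IEEE double precision

def recursive_sum(terms):
    """Simple iterative summation ((t_0 + t_1) + t_2) + ... as in (15)."""
    total = terms[0]
    for t in terms[1:]:
        total = total + t
    return total

# Hypothetical test data: m + 1 floats with alternating signs.
m = 1000
terms = [(-1.0) ** i / (i + 1) for i in range(m + 1)]

computed = recursive_sum(terms)
exact = sum(Fraction(t) for t in terms)  # exact sum of the same floats

error = abs(Fraction(computed) - exact)
# Rigorous variant of the first-order bound m * eps * sum |t_i|.
bound = m * u / (1 - m * u) * sum(abs(Fraction(t)) for t in terms)
```

The explicit loop is used on purpose, because the built-in `sum` may apply a compensated summation to floats in recent Python versions and would then not correspond to the algorithm described by (15).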

Forward stability

For analysing the relative forward error of barycentric rational interpolation, we first observe that (4) and (7) can both be written in the common form

$$\begin{aligned} r(x) = \frac{\sum _{i=0}^n a_i(x) f_i}{\sum _{j=0}^m b_j(x)}, \end{aligned}$$
(16)

where \(a_i(x)=w_i/(x-x_i)\) for both forms, while \(m=n\) and \(b_j(x)=a_j(x)\) for the second form and \(m=n-d\) and \(b_j(x)=\lambda _j(x)\) for the first form. Next, we define the functions

$$\begin{aligned} \alpha (x;{\varvec{f}}) = \frac{\sum _{i=0}^n {|a_i(x) f_i|}}{{|\sum _{i=0}^n a_i(x) f_i|}} \end{aligned}$$
(17)

and

$$\begin{aligned} \beta (x) = \frac{\sum _{j=0}^m {|b_j(x)|}}{{|\sum _{j=0}^m b_j(x)|}}. \end{aligned}$$
(18)

Assuming now that we have forward stable algorithms for computing \(a_i(x)\) as \(\hat{a}_i(x)\) and \(b_j(x)\) as \(\hat{b}_j(x)\), we can derive a general bound on the relative forward error for the function r in (16).

Theorem 1

Suppose that there exist \(\alpha _0,\dots ,\alpha _n\in \mathbb {R}\) with

$$\begin{aligned} \hat{a}_i(x) = a_i(x) (1+\alpha _i), \qquad {|\alpha _i|} \le A \epsilon + O(\epsilon ^{2}), \qquad i=0,\dots ,n \end{aligned}$$

and \(\beta _0,\dots ,\beta _m\in \mathbb {R}\) with

$$\begin{aligned} \hat{b}_j(x) = b_j(x) (1+\beta _j), \qquad {|\beta _j|} \le B \epsilon + O(\epsilon ^{2}), \qquad j=0,\dots ,m \end{aligned}$$

for some constants A and B. Then, assuming \({\varvec{f}}\in \mathbb {F}^{n+1}\), the relative forward error of r in (16) satisfies

$$\begin{aligned} \frac{{|\hat{r}(x)-r(x)|}}{{|r(x)|}} \le (n+2+A) \alpha (x;{\varvec{f}}) \epsilon + (m+B) \beta (x) \epsilon + O(\epsilon ^2), \end{aligned}$$
(19)

for \(\epsilon \) small enough.

Proof

We first notice that \(\hat{r}\) is given by

$$\begin{aligned} \hat{r}(x)&= \frac{\sum _{i=0}^n \hat{a}_i(x) f_i (1+\delta ^{\times }_i) (1+\varphi ^N_i)}{\sum _{j=0}^m \hat{b}_j(x) (1+\varphi ^D_j)} (1+\delta ^{\div })\\&= \frac{\sum _{i=0}^n a_i(x) (1+\alpha _i) f_i (1+\delta ^{\times }_i) (1+\varphi ^N_i)}{\sum _{j=0}^m b_j(x) (1+\beta _j) (1+\varphi ^D_j)} (1+\delta ^{\div }), \end{aligned}$$

where \(\delta ^{\times }_i\), \(\varphi ^N_i\), \(\varphi ^D_j\) and \(\delta ^{\div }\) are the relative errors introduced respectively by the product \(\hat{a}_i(x)f_i\), the sums in the numerator and the denominator, and the final division. It then follows from (10) that \({|\delta ^{\times }_i|},{|\delta ^{\div }|} \le \epsilon \), while from (15) we have \({|\varphi ^N_i|}\le n\epsilon +O(\epsilon ^2)\) and \({|\varphi ^D_j|}\le m\epsilon +O(\epsilon ^2)\). By (14), there exist some \(\eta _i,\mu _j\in \mathbb {R}\) with

$$\begin{aligned} \begin{aligned} {|\eta _i|}&\le (n+2+A)\epsilon + O(\epsilon ^2),&\qquad i&= 0,\dots ,n,\\ {|\mu _j|}&\le (m+B)\epsilon + O(\epsilon ^2),&j&= 0,\dots ,m, \end{aligned} \end{aligned}$$
(20)

such that

$$\begin{aligned} \hat{r}(x) = \frac{\sum _{i=0}^n a_i(x) f_i (1+\eta _i)}{\sum _{j=0}^m b_j(x) (1+\mu _j)}. \end{aligned}$$
(21)

Therefore,

$$\begin{aligned} \frac{\hat{r}(x)}{r(x)}&= \frac{\sum _{i=0}^n a_i(x) f_i(1+\eta _i)}{\sum _{j=0}^m b_j(x) (1+\mu _j)} \bigg / \frac{\sum _{i=0}^n a_i(x) f_i}{\sum _{j=0}^m b_j(x)}\\&= \frac{\sum _{i=0}^n a_i(x) f_i(1+\eta _i)}{\sum _{i=0}^n a_i(x) f_i} \frac{\sum _{j=0}^m b_j(x)}{\sum _{j=0}^m b_j(x) (1+\mu _j)}\\&= \biggl (1 + \frac{\sum _{i=0}^n a_i(x) f_i \eta _i}{\sum _{i=0}^n a_i(x) f_i} \biggr ) \frac{1}{1 + \frac{\sum _{j=0}^m b_j(x) \mu _j}{\sum _{j=0}^m b_j(x)}}. \end{aligned}$$

Since, by the triangle inequality,

$$\begin{aligned} {\biggl |\frac{\sum _{i=0}^n a_i(x) f_i \eta _i}{\sum _{i=0}^n a_i(x) f_i}\biggl |} \le \alpha (x;{\varvec{f}}) \max _{i=0,\dots ,n} {|\eta _i|} \le \alpha (x;{\varvec{f}}) (n+2+A) \epsilon + O(\epsilon ^2) \end{aligned}$$
(22)

and

$$\begin{aligned} {\biggl |\frac{\sum _{j=0}^m b_j(x) \mu _j}{\sum _{j=0}^m b_j(x)}\biggl |} \le \beta (x) \max _{j=0,\dots ,m} {|\mu _j|} \le \beta (x) (m+B) \epsilon + O(\epsilon ^2), \end{aligned}$$
(23)

we can use (12) to express this ratio as

$$\begin{aligned} \frac{\hat{r}(x)}{r(x)}&= \biggl (1 + \frac{\sum _{i=0}^n a_i(x) f_i \eta _i}{\sum _{i=0}^n a_i(x) f_i} \biggr ) \biggl (1 - \frac{\sum _{j=0}^m b_j(x) \mu _j}{\sum _{j=0}^m b_j(x)} + O(\epsilon ^2) \biggr )\\&= 1 + \frac{\sum _{i=0}^n a_i(x) f_i \eta _i}{\sum _{i=0}^n a_i(x) f_i} - \frac{\sum _{j=0}^m b_j(x) \mu _j}{\sum _{j=0}^m b_j(x)} + O(\epsilon ^2) \end{aligned}$$

and obtain the relative forward error of r as

$$\begin{aligned} \frac{{|\hat{r}(x)-r(x)|}}{{|r(x)|}} = {\biggl |\frac{\hat{r}(x)}{r(x)} - 1\biggl |} = {\biggl |\frac{\sum _{i=0}^n a_i(x) f_i \eta _i}{\sum _{i=0}^n a_i(x) f_i} - \frac{\sum _{j=0}^m b_j(x) \mu _j}{\sum _{j=0}^m b_j(x)} + O(\epsilon ^2)\biggl |}. \end{aligned}$$

The upper bound in (19) then follows immediately by using again (22) and (23). \(\square \)

While Theorem 1 holds for any function r that can be expressed as in (16), we shall now focus on the special cases for the two different forms of the barycentric rational interpolant. For the second barycentric form, the only assumption we need is that the weights \(w_i\) can be computed as \(\hat{w}_i\) with a forward stable algorithm. Moreover, we recall the definition of the Lebesgue function of the barycentric rational interpolant [9] as

$$\begin{aligned} \Lambda _n(x;{\varvec{x}}) = \frac{\sum _{i=0}^n {\bigl |\frac{w_i}{x-x_i}\bigl |}}{{\bigl |\sum _{i=0}^n \frac{w_i}{x-x_i}\bigl |}}. \end{aligned}$$

Corollary 1

Assume that there exist \(\psi _0,\dots ,\psi _n\in \mathbb {R}\) with

$$\begin{aligned} \hat{w}_i = w_i (1+\psi _i), \qquad {|\psi _i|} \le W \epsilon + O(\epsilon ^2), \qquad i=0,\dots ,n \end{aligned}$$
(24)

for some constant W. Then, assuming \({\varvec{f}}\in \mathbb {F}^{n+1}\), the relative forward error of the second barycentric form in (4) satisfies

$$\begin{aligned} \frac{{|\hat{r}(x)-r(x)|}}{{|r(x)|}} \le (n+4+W) \kappa (x;{\varvec{x}},{\varvec{f}}) \epsilon + (n+2+W) \Lambda _n(x;{\varvec{x}}) \epsilon + O(\epsilon ^{2}), \end{aligned}$$
(25)

for \(\epsilon \) small enough.

Proof

We first notice that \(a_i(x)=w_i/(x-x_i)\) can be computed with one subtraction and one division, so that, by (10), (13), (14), and (24),

$$\begin{aligned} \hat{a}_i(x) = a_i(x) (1+\alpha _i), \qquad i=0,\dots ,n \end{aligned}$$

for some \(\alpha _i\in \mathbb {R}\) with \({|\alpha _i|}\le (2+W)\epsilon +O(\epsilon ^2)\). Hence, the constants in (19) are \(A=2+W\) and further \(B=2+W\), because \(b_j(x)=a_j(x)\) in case of the second barycentric form. From the latter, it also follows immediately that \(\beta (x)\) in (18) is equal to \(\Lambda _n(x;{\varvec{x}})\) in this case, and it only remains to show that \(\alpha (x;{\varvec{f}})\) in (17) is equal to \(\kappa (x;{\varvec{x}},{\varvec{f}})\) in (11). To this end, we first use the triangle inequality to see that for any \(\delta >0\) and any \({\varvec{h}}=(h_0,\dots ,h_n)\in \mathbb {R}^{n+1}\setminus \{{\varvec{0}}\}\) with \({|h_i|}\le \delta {|f_i|}\),

$$\begin{aligned} {\Biggl |\sum _{i=0}^n a_i(x) h_i\Biggl |} = {\Biggl |\sum _{i=0}^n a_i(x) f_i \frac{h_i}{f_i}\Biggl |} \le H \sum _{i=0}^n {|a_i(x) f_i|}, \qquad H = \max _{\begin{array}{c} i=0,\dots ,n\\ f_i\ne 0 \end{array}} \frac{{|h_i|}}{{|f_i|}}, \end{aligned}$$

where equality is attained for \({\varvec{h}}\) with \(h_i=\delta {\textrm{sign}}(a_i(x)){|f_i|}\), \(i=0,\dots ,n\). Dividing both sides of this inequality by H and \({\bigl |\sum _{i=0}^n a_i(x)f_i\bigl |}\), taking the supremum over all admissible \({\varvec{h}}\) and the limit \(\delta \rightarrow 0\), gives \(\kappa (x;{\varvec{x}},{\varvec{f}})=\alpha (x;{\varvec{f}})\). \(\square \)
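The bound in (25) can be compared with the actual rounding error in a small experiment. The following Python sketch assumes IEEE double precision with \(\epsilon =2^{-53}\) and uses the Berrut weights in (5), which are exact, so that \(W=0\). The nodes, the data, the evaluation point, and all names are hypothetical, and the \(O(\epsilon ^2)\) term of (25) is ignored.

```python
from fractions import Fraction

u = Fraction(1, 2 ** 53)  # assumed machine epsilon of IEEE double precision

# Hypothetical test setup: equidistant nodes, Berrut weights (5), smooth data.
n = 20
nodes = [i / n for i in range(n + 1)]
data = [1.0 + t * t for t in nodes]
weights = [(-1.0) ** i for i in range(n + 1)]  # exact in F, hence W = 0
x = 0.333                                       # not a node

# Floating point evaluation of (4) with plain left-to-right summation.
num = 0.0
den = 0.0
for w, xi, fi in zip(weights, nodes, data):
    a = w / (x - xi)       # one subtraction and one division
    num = num + a * fi
    den = den + a
r_hat = num / den

# Exact reference values for the same floating point input.
X = Fraction(x)
a_exact = [Fraction(w) / (X - Fraction(xi)) for w, xi in zip(weights, nodes)]
f_exact = [Fraction(fi) for fi in data]
num_exact = sum(a * f for a, f in zip(a_exact, f_exact))
den_exact = sum(a_exact)
r = num_exact / den_exact

# kappa = alpha(x; f) from (17) and the Lebesgue function Lambda_n(x; x).
kappa = sum(abs(a * f) for a, f in zip(a_exact, f_exact)) / abs(num_exact)
lebesgue = sum(abs(a) for a in a_exact) / abs(den_exact)

rel_error = abs(Fraction(r_hat) - r) / abs(r)
bound = ((n + 4) * kappa + (n + 2) * lebesgue) * u  # first-order part of (25), W = 0
```

In this setting, `rel_error` is expected to stay well below `bound`, since (25) is a worst-case estimate in which all rounding errors are assumed to accumulate.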

For the first barycentric form, we additionally need to assume that the \(\lambda _i(x)\) can be computed with a forward stable algorithm as \(\hat{\lambda }_i(x)\), and we note that \(\beta (x)\) in (18) is equal to

$$\begin{aligned} \Gamma _d(x;{\varvec{x}}) = \frac{\sum _{i=0}^{n-d} {|\lambda _i(x)|}}{{\bigl |\sum _{i=0}^{n-d} \lambda _i(x)\bigl |}} \end{aligned}$$
(26)

in this case.

Corollary 2

Assume that the weights \(w_0,\dots ,w_n\) can be computed as in (24) and that there exist \(\gamma _0,\dots ,\gamma _{n-d}\in \mathbb {R}\) with

$$\begin{aligned} \hat{\lambda }_i(x) = \lambda _i(x) (1+\gamma _i), \qquad {|\gamma _i|} \le C \epsilon + O(\epsilon ^2), \qquad i=0,\dots ,n-d \end{aligned}$$
(27)

for some constant C. Then, assuming \({\varvec{f}}\in \mathbb {F}^{n+1}\), the relative forward error of the first barycentric form in (7) satisfies

$$\begin{aligned} \frac{{|\hat{r}(x)-r(x)|}}{{|r(x)|}} \le (n+4+W) \kappa (x;{\varvec{x}},{\varvec{f}}) \epsilon + (n-d+C) \Gamma _d(x;{\varvec{x}}) \epsilon + O(\epsilon ^{2}), \end{aligned}$$
(28)

for \(\epsilon \) small enough.

Proof

As the numerators of the first and the second barycentric form are identical, the only difference from the proof of Corollary 1 is that \(B=C\) in (19), because \(b_j(x)=\lambda _j(x)\) and \(m=n-d\). \(\square \)

While upper bounds for the Lebesgue function \(\Lambda _n(x;{\varvec{x}})\) can be found in the literature [8, 9], we are unaware of any previous work bounding the function \(\Gamma _d(x;{\varvec{x}})\), and we derive such an upper bound in Sect. 6.

Computing the weights \(w_i\) and evaluating the functions \(\lambda _i\)

It remains to work out the constants W and C, related to the computation of the weights \(w_i\) in (4) and the evaluation of the functions \(\lambda _i\) in (7). In particular, we study the error propagation that occurs in the implementation of different algorithms and further analyse them in terms of computational cost.

Regarding the \(w_i\), it was shown by Higham [1] that the Lagrange weights in (3) can be computed stably with \(W=2n\) in (24), and the Berrut weights in (5) can be represented exactly in \(\mathbb {F}\), so that \(W=0\). The same holds for the weights in (6) if the interpolation nodes are equidistant, because they simplify to the integers

$$\begin{aligned} w_i = (-1)^{i-d} \sum _{j=\max (i-d,0)}^{\min (i,n-d)} \left( {\begin{array}{c}d\\ i-j\end{array}}\right) ,\qquad i=0,\dots ,n \end{aligned}$$

in this special case [6]. For the general case, Camargo [11, Lemma 1] shows that \(W=3d\), if the \(w_i\) are computed with a straightforward implementation of the formula in (6). While this construction requires \(O(nd^2)\) operations, Hormann and Schaefer [12] suggest an improved O(nd) pyramid algorithm, which turns out to have the same precision. Their algorithm starts from the values

$$\begin{aligned} v^d_i = 1, \qquad i=0,\dots ,n-d \end{aligned}$$
(29)

and iteratively computes

$$\begin{aligned} v^l_i = \frac{v^{l+1}_{i-1}}{x_{i+l}-x_{i-1}} +\frac{v^{l+1}_i}{x_{i+l+1}-x_i}, \qquad i=0,\dots ,n-l \end{aligned}$$
(30)

for \(l=d-1,d-2,\dots ,0\), tacitly assuming \(v^l_i=0\) for \(i<0\) and \(i>n-l\). They show that the resulting values \(v^0_i\) are essentially the weights \(w_i\) in (6), up to a factor of \({(-1)}^{i-d}\).
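The recursion in (29) and (30) can be sketched in a few lines of Python. This is only an illustrative transcription of the two formulas, not the implementation of [12]; the function names and the test nodes are assumptions, and a direct evaluation of (6) is included solely for comparison.

```python
import math

def pyramid_weights(nodes, d):
    """Values v^0_i obtained from (29) and (30) in O(nd) operations."""
    n = len(nodes) - 1
    v = [1.0] * (n - d + 1)            # level l = d, see (29)
    for l in range(d - 1, -1, -1):     # levels l = d-1, ..., 0
        new = []
        for i in range(n - l + 1):     # i = 0, ..., n - l
            total = 0.0
            if i >= 1:                 # v^{l+1}_{i-1} is zero for i - 1 < 0
                total += v[i - 1] / (nodes[i + l] - nodes[i - 1])
            if i <= n - l - 1:         # v^{l+1}_i is zero for i > n - (l+1)
                total += v[i] / (nodes[i + l + 1] - nodes[i])
            new.append(total)
        v = new
    return v

def direct_weights(nodes, d):
    """Weights w_i computed directly from (6), for comparison only."""
    n = len(nodes) - 1
    weights = []
    for i in range(n + 1):
        total = 0.0
        for k in range(max(i - d, 0), min(i, n - d) + 1):
            prod = 1.0
            for j in range(k, k + d + 1):
                if j != i:
                    prod /= nodes[i] - nodes[j]
            total += (-1) ** k * prod
        weights.append(total)
    return weights

def signed(v, d):
    """Apply the factor (-1)^(i-d) to the values v^0_i."""
    return [vi if (i - d) % 2 == 0 else -vi for i, vi in enumerate(v)]

# Hypothetical non-equidistant nodes in ascending order.
nodes = [0.0, 0.1, 0.25, 0.4, 0.45, 0.7, 0.8, 1.0]
d = 2
w_pyramid = signed(pyramid_weights(nodes, d), d)
w_direct = direct_weights(nodes, d)
```

For the nodes \(x_i=i\), multiplying the values returned by `pyramid_weights` by \(d!\) should reproduce, up to rounding, the sums of binomial coefficients that appear in the formula for equidistant nodes above.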

Lemma 1

For any \(x_0,\dots ,x_n\in \mathbb {F}\), there exist \(\phi ^0_0,\dots ,\phi ^0_n\in \mathbb {R}\) with \({|\phi ^0_0|},\dots ,{|\phi ^0_n|}\le W\epsilon +O(\epsilon ^2)\) for \(W=3d\), such that the \(v^0_i\) in (30) satisfy

$$\begin{aligned} \hat{v}^0_i = v^0_i (1+\phi ^0_i), \qquad i=0,\dots ,n. \end{aligned}$$

Proof

The statement is a special case of the more general observation that there exists for any \(l=d,d-1,\dots ,0\) and \(i=0,\dots ,n-l\) some \(\phi ^l_i\in \mathbb {R}\) with \({|\phi ^l_i|}\le 3(d-l)\epsilon +O(\epsilon ^2)\), such that \(\hat{v}^l_i=v^l_i(1+\phi ^l_i)\), which can be shown by induction over l. The base case \(l=d\) follows trivially from (29). For the inductive step from \(l+1\) to l, we conclude from (10), (13), and (14), that \(\hat{v}^l_i\), computed with the formula in (30), satisfies

$$\begin{aligned} \hat{v}^l_i = \frac{\hat{v}^{l+1}_{i-1}}{x_{i+l}-x_{i-1}} (1+\rho _1) +\frac{\hat{v}^{l+1}_i}{x_{i+l+1}-x_i} (1+\rho _2) \end{aligned}$$

for some \(\rho _1,\rho _2\in \mathbb {R}\) with \({|\rho _1|},{|\rho _2|}\le 3\epsilon +O(\epsilon ^2)\), since both terms are affected by one subtraction, one division, and one sum. By induction hypothesis and (14), we can then assume the existence of some \(\sigma _1,\sigma _2\in \mathbb {R}\) with \({|\sigma _1|},{|\sigma _2|}\le 3(d-l)\epsilon +O(\epsilon ^2)\), such that

$$\begin{aligned} \hat{v}^l_i = \frac{v^{l+1}_{i-1}}{x_{i+l}-x_{i-1}} (1+\sigma _1) + \frac{v^{l+1}_i}{x_{i+l+1}-x_i} (1+\sigma _2), \end{aligned}$$

and the intermediate value theorem further guarantees that

$$\begin{aligned} \hat{v}^l_i = \biggl ( \frac{v^{l+1}_{i-1}}{x_{i+l}-x_{i-1}} + \frac{v^{l+1}_i}{x_{i+l+1}-x_i} \biggr ) (1+\phi ^l_i) \end{aligned}$$

for some \(\phi ^l_i\in [\min (\sigma _1,\sigma _2),\max (\sigma _1,\sigma _2)]\) with \({|\phi ^l_i|}\le 3(d-l)\epsilon +O(\epsilon ^2)\). \(\square \)

Let us now focus on the functions \(\lambda _i\) that appear in the barycentric formula (7) and first study the error propagation when computing them straightforwardly, following the formula in (8).

Lemma 2

For any \(x\in \mathbb {F}\) and \(x_0,\dots ,x_n\in \mathbb {F}\), there exist \(\theta _0,\dots ,\theta _{n-d}\in \mathbb {R}\) with \({|\theta _0|},\dots ,{|\theta _{n-d}|}\le C\epsilon +O(\epsilon ^2)\) for \(C=2d+2\), such that the \(\lambda _i(x)\) in (8) satisfy

$$\begin{aligned} \hat{\lambda }_i(x) = \lambda _i(x) (1+\theta _i), \qquad i=0,\dots ,n-d. \end{aligned}$$

Proof

Since computing \(\hat{\lambda }_i(x)\) requires \(d+1\) subtractions, d products, and one division, the result follows directly from (10), (14), and (13). \(\square \)

Evaluating \(\lambda _i\) in this way clearly has a computational cost of O(d) for \(d>0\) and O(1) for \(d=0\), so that computing all \(\lambda _i(x)\) requires \(O(d(n-d))\) operations for \(0<d<n\) and O(n) operations for \(d=0\) and \(d=n\). However, for \(0<d<n\) this can be improved by exploiting the fact that \(\lambda _i(x)\) and \(\lambda _{i+1}(x)\) have d common factors in the denominator, which suggests first computing the “central” \(\lambda _m(x)\) for \(m=\left\lfloor \frac{n-d}{2} \right\rfloor \) as above and then the remaining \(\lambda _i(x)\) iteratively as

$$\begin{aligned} \begin{aligned} \lambda _{i-1}(x)&= -\lambda _i(x) \frac{(x-x_{i+d})}{(x-x_{i-1})},&\qquad i&=m,m-1,\dots ,1,\\ \lambda _{i+1}(x)&= -\lambda _i(x) \frac{(x-x_i)}{(x-x_{i+1+d})},&\qquad i&=m,m+1,\dots ,n-d-1. \end{aligned} \end{aligned}$$
(31)

Computing all \(\lambda _i(x)\) this way requires only O(n) operations, but it comes at the price of a likely reduced precision.
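A minimal Python sketch of the iteration in (31), next to the direct evaluation of (8), may help to make the two variants concrete. The function names and the test setup are assumptions for illustration, and the evaluation point is assumed not to be a node.

```python
import math

def lambdas_direct(x, nodes, d):
    """All lambda_i(x) from (8); O(d(n-d)) operations for 0 < d < n."""
    n = len(nodes) - 1
    values = []
    for i in range(n - d + 1):
        prod = 1.0
        for j in range(i, i + d + 1):
            prod *= x - nodes[j]
        values.append((-1) ** i / prod)
    return values

def lambdas_iterative(x, nodes, d):
    """All lambda_i(x) via (31), starting from the central index m; O(n) operations."""
    n = len(nodes) - 1
    m = (n - d) // 2
    lam = [0.0] * (n - d + 1)
    # Central value computed as in (8).
    prod = 1.0
    for j in range(m, m + d + 1):
        prod *= x - nodes[j]
    lam[m] = (-1) ** m / prod
    # Downward sweep, i = m, m-1, ..., 1.
    for i in range(m, 0, -1):
        lam[i - 1] = -lam[i] * (x - nodes[i + d]) / (x - nodes[i - 1])
    # Upward sweep, i = m, m+1, ..., n-d-1.
    for i in range(m, n - d):
        lam[i + 1] = -lam[i] * (x - nodes[i]) / (x - nodes[i + 1 + d])
    return lam

# Hypothetical test setup: 21 equidistant nodes in [0, 1], d = 3.
nodes = [i / 20 for i in range(21)]
x = 0.4142
direct = lambdas_direct(x, nodes, 3)
iterative = lambdas_iterative(x, nodes, 3)
```

Each step of the two sweeps uses two subtractions, one product, and one division, and the change of sign is exact, so the rounding errors of the central value are propagated to the outer indices and grow with the distance from m.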

Lemma 3

For any \(x\in \mathbb {F}\) and \(x_0,\dots ,x_n\in \mathbb {F}\), there exist \(\zeta _0,\dots ,\zeta _{n-d}\in \mathbb {R}\) with \({|\zeta _0|},\dots ,{|\zeta _{n-d}|}\le C\epsilon +O(\epsilon ^2)\) for \(C=2n+4\), such that the \(\lambda _i(x)\) in (31) satisfy

$$\begin{aligned} \hat{\lambda }_i(x) = \lambda _i(x) (1+\zeta _i), \qquad i=0,\dots ,n-d. \end{aligned}$$

Proof

By Lemma 2, we know that \(\hat{\lambda }_m(x)=\lambda _m(x)(1+\zeta _m)\) with \({|\zeta _m|}\le (2d+2)\epsilon +O(\epsilon ^2)\). Each step in (31) involves two subtractions, one division, and one product, and therefore introduces a perturbation of \((1+\delta )\) with \({|\delta |}\le 4\epsilon +O(\epsilon ^2)\), and these perturbations accumulate during the iteration. Since the number of steps is at most \((n-d+1)/2\), the overall perturbation for each \(\lambda _i(x)\) is at most \((1+\zeta _i)\) with \({|\zeta _i|}\le [(2d+2)+4(n-d+1)/2]\epsilon +O(\epsilon ^2)=(2n+4)\epsilon +O(\epsilon ^2)\). \(\square \)

Backward stability

Similar to how we established the forward stability of both barycentric forms in a unified way in Sect. 3, we can prove the backward stability in general for the function r in (16) and then derive upper bounds on the perturbation of the data for both forms as special cases.

Theorem 2

Suppose that there exist \(\alpha _0,\dots ,\alpha _n\in \mathbb {R}\) with

$$\begin{aligned} \hat{a}_i(x) = a_i(x) (1+\alpha _i), \qquad {|\alpha _i|} \le A \epsilon + O(\epsilon ^{2}), \qquad i=0,\dots ,n \end{aligned}$$

and \(\beta _0,\dots ,\beta _m\in \mathbb {R}\) with

$$\begin{aligned} \hat{b}_j(x) = b_j(x) (1+\beta _j), \qquad {|\beta _j|} \le B \epsilon + O(\epsilon ^2), \qquad j=0,\dots ,m \end{aligned}$$

for some constants A and B. Then there exists for any \({\varvec{f}}\in \mathbb {F}^{n+1}\) some \(\varvec{\hat{f}}\in \mathbb {F}^{n+1}\) with

$$\begin{aligned} \frac{{\Vert \varvec{\hat{f}}-{\varvec{f}}\Vert }_\infty }{{\Vert {\varvec{f}}\Vert }_\infty } \le (n+2+A) \epsilon + (m+B) \max _x \beta (x) \epsilon + O(\epsilon ^{2}), \end{aligned}$$

for \(\epsilon \) small enough, such that the numerical evaluation of r in (16) satisfies \(\hat{r}(x;{\varvec{f}})=r(x;\varvec{\hat{f}})\).

Proof

Starting from (21), with the \(\eta _i\) and \(\mu _j\) satisfying (20), we get, again with the help of (12),

$$\begin{aligned} \hat{r}(x;{\varvec{x}},{\varvec{f}})&= \frac{\sum _{i=0}^n a_i(x) f_i (1+\eta _i)}{\sum _{j=0}^m b_j(x) + \sum _{j=0}^m b_j(x) \mu _j} \\&= \frac{\sum _{i=0}^n a_i(x) f_i (1+\eta _i)}{\sum _{j=0}^m b_j(x) \Bigl (1 + \frac{\sum _{j=0}^m b_j(x) \mu _j}{\sum _{j=0}^m b_j(x)} \Bigr )}\\&= \frac{\sum _{i=0}^n a_i(x) f_i (1+\eta _i)}{\sum _{j=0}^m b_j(x)} \biggl (1 - \frac{\sum _{j=0}^m b_j(x) \mu _j}{\sum _{j=0}^{m} b_j(x)} + O(\epsilon ^2) \biggr )\\&= \frac{\sum _{i=0}^n a_i(x) {\hat{f}}_i}{\sum _{j=0}^m b_j(x)} = r(x;{\varvec{x}},\varvec{\hat{f}}), \end{aligned}$$

where

$$\begin{aligned} \hat{f}_i = f_i (1+\eta _i) \biggl ( 1 - \frac{\sum _{j=0}^m b_j(x) \mu _j}{\sum _{j=0}^{m} b_j(x)} + O(\epsilon ^2) \biggr ), \qquad i=0,\dots ,n. \end{aligned}$$

By (23), this means that there exist some \(\xi _0,\dots ,\xi _n\in \mathbb {R}\) with \({|\xi _0|},\dots ,{|\xi _n|}\le (m+B)\max _x\beta (x)\epsilon +O(\epsilon ^2)\), such that

$$\begin{aligned} \hat{f}_i = f_i (1+\eta _i) (1+\xi _i), \qquad i=0,\dots ,n, \end{aligned}$$

and by (14) and (20) we can further assume the existence of some \(\varphi _0,\dots ,\varphi _n\) with \({|\varphi _0|},\dots ,{|\varphi _n|}\le (n+2+A)\epsilon +(m+B)\max _x\beta (x)\epsilon +O(\epsilon ^2)\), such that

$$\begin{aligned} \hat{f}_i = f_i(1+\varphi _i), \qquad i=0,\dots ,n. \end{aligned}$$

The statement then follows directly from

$$\begin{aligned} {\Vert \varvec{\hat{f}}-{\varvec{f}}\Vert }_\infty&= \max _{i=0,\dots ,n} {|\hat{f}_i-f_i|} = \max _{i=0,\dots ,n} {|f_i \varphi _i|}\\&\le \max _{i=0,\dots ,n} {|f_i|} \max _{i=0,\dots ,n} {|\varphi _i|} = {\Vert {\varvec{f}}\Vert }_\infty \max _{i=0,\dots ,n} {|\varphi _i|}. \end{aligned}$$

\(\square \)
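
The passage from the two factors to a single relative perturbation can be spelled out as follows. This is a sketch that assumes, as the final bound indicates, that (14) and (20) provide \({|\eta _i|}\le (n+2+A)\epsilon +O(\epsilon ^2)\):

$$\begin{aligned} (1+\eta _i)(1+\xi _i)&= 1 + \eta _i + \xi _i + \eta _i\xi _i = 1 + \varphi _i,\\ {|\varphi _i|}&\le {|\eta _i|} + {|\xi _i|} + {|\eta _i|}\,{|\xi _i|} \le (n+2+A)\epsilon + (m+B)\max _x\beta (x)\epsilon + O(\epsilon ^2), \end{aligned}$$

since the product \(\eta _i\xi _i\) is of order \(\epsilon ^2\).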

The special cases of Theorem 2 for the two different forms of the barycentric rational interpolant then follow with the same reasoning as in Sect. 3.

Corollary 3

Assume that the weights \(w_0,\dots ,w_n\) can be computed as in (24). Then there exists for any \({\varvec{f}}\in \mathbb {F}^{n+1}\) some \(\varvec{\hat{f}}\in \mathbb {F}^{n+1}\) with

$$\begin{aligned} \frac{{\Vert \varvec{\hat{f}}-{\varvec{f}}\Vert }_\infty }{{\Vert {\varvec{f}}\Vert }_\infty } \le (n+4+W) \epsilon + (n+2+W) \max _x\Lambda _n(x;{\varvec{x}}) \epsilon + O(\epsilon ^2), \end{aligned}$$

for \(\epsilon \) small enough, such that the second barycentric form in (4) satisfies \(\hat{r}(x;{\varvec{x}},{\varvec{f}}) = r(x;{\varvec{x}},\varvec{\hat{f}})\).

Corollary 4

Assume that the weights \(w_0,\dots ,w_n\) can be computed as in (24) and the values \(\lambda _0(x),\dots ,\lambda _{n-d}(x)\) as in (27). Then there exists for any \({\varvec{f}}\in \mathbb {F}^{n+1}\) some \(\varvec{\hat{f}}\in \mathbb {F}^{n+1}\) with

$$\begin{aligned} \frac{{\Vert \varvec{\hat{f}}-{\varvec{f}}\Vert }_\infty }{{\Vert {\varvec{f}}\Vert }_\infty } \le (n+4+W) \epsilon + (n-d+C) \max _x\Gamma _d(x;{\varvec{x}}) \epsilon + O(\epsilon ^2), \end{aligned}$$

for \(\epsilon \) small enough, such that the first barycentric form in (7) satisfies \(\hat{r}(x;{\varvec{x}},{\varvec{f}}) = r(x;{\varvec{x}},\varvec{\hat{f}})\).

Upper bound for \(\Gamma _d\)

In the case of the first barycentric form, the bounds for the forward and backward stability depend on the function \(\Gamma _d\) in (26), and we still need to show that this function is bounded from above. Note that \(\Gamma _0=\Lambda _n\), because \(w_i={(-1)}^i\) and \(\lambda _i={(-1)}^i/(x-x_i)\) if \(d=0\), so that the bound for the Lebesgue constant of Berrut’s interpolant [8] also holds for \(\Gamma _d\) in this case. In the following, we therefore assume \(d\ge 1\) and define

$$\begin{aligned} N(x) = \sum _{i=0}^{n-d} {|\lambda _i(x)|} \qquad \text {and}\qquad D(x) = {\Biggl |\sum _{i=0}^{n-d} \lambda _i(x)\Biggr |}, \end{aligned}$$

so that \(\Gamma _d(x;{\varvec{x}})=N(x)/D(x)\). We further assume that \(x\in (x_k,x_{k+1})\) for some \(k\in \{0,\dots ,n-1\}\). It then follows from the definition in (8) that all \(\lambda _i\) with index in

$$\begin{aligned} I_3 = I \cap \{ k-d+1, \dots , k \}, \end{aligned}$$

where \(I=\{0,\dots ,n-d\}\), have the same sign as \({(-1)}^{k+d}\) and that the sign alternates for decreasing indices “to the left” and increasing indices “to the right” of \(I_3\). More precisely, the \(\lambda _i\) with index in

$$\begin{aligned} I_2 = I \cap \{ k-d, k-d-2, \dots \} \qquad \text {or}\qquad I_4 = I \cap \{ k+1, k+3, \dots \} \end{aligned}$$

have the same sign as the ones with index in \(I_3\), while the sign is opposite, if the index is in

$$\begin{aligned} I_1 = I \cap \{ k-d-1, k-d-3, \dots \} \qquad \text {or}\qquad I_5 = I \cap \{ k+2, k+4, \dots \}. \end{aligned}$$

Without loss of generality, we assume that the \(\lambda _i\) are positive for \(i\in I_2,I_3,I_4\) and negative for \(i\in I_1,I_5\), since multiplying all \(\lambda _i\) by a common nonzero constant does not change \(\Gamma _d\). Letting

$$\begin{aligned} S_j = \sum _{i\in I_j} \lambda _i(x), \qquad j = 1,\dots ,5, \end{aligned}$$

we then find that

$$\begin{aligned} N(x) = - S_1 + S_2 + S_3 + S_4 - S_5 \qquad \text {and}\qquad D(x) = S_1 + S_2 + S_3 + S_4 + S_5. \end{aligned}$$
(32)
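
The sign pattern and the decomposition in (32) can be checked on a small example. The following is a minimal sketch in exact rational arithmetic; it assumes that (8) has the usual form \(\lambda _i(x)={(-1)}^i/\prod _{j=i}^{i+d}(x-x_j)\), and the nodes, the helper names `lam` and `index_sets`, and the values of n, d, k are hypothetical choices made only for illustration.

```python
from fractions import Fraction

def lam(i, x, nodes, d):
    """lambda_i(x) = (-1)^i / prod_{j=i}^{i+d} (x - x_j) (assumed form of (8))."""
    prod = Fraction(1)
    for j in range(i, i + d + 1):
        prod *= x - nodes[j]
    return Fraction((-1) ** i) / prod

def index_sets(n, d, k):
    """Split I = {0,...,n-d} into I_1,...,I_5 for x in (x_k, x_{k+1})."""
    I = range(n - d + 1)
    I3 = [i for i in I if k - d + 1 <= i <= k]
    I2 = [i for i in I if i <= k - d and (k - d - i) % 2 == 0]
    I1 = [i for i in I if i <= k - d and (k - d - i) % 2 == 1]
    I4 = [i for i in I if i >= k + 1 and (i - k - 1) % 2 == 0]
    I5 = [i for i in I if i >= k + 1 and (i - k - 1) % 2 == 1]
    return I1, I2, I3, I4, I5

n, d, k = 12, 3, 5
nodes = [Fraction(i * i, 10) for i in range(n + 1)]  # hypothetical ascending nodes
x = (nodes[k] + nodes[k + 1]) / 2                    # a point in (x_k, x_{k+1})

# Normalise the common sign (-1)^(k+d) so that lambda_i > 0 on I_3.
sign = (-1) ** (k + d)
lams = [sign * lam(i, x, nodes, d) for i in range(n - d + 1)]

I1, I2, I3, I4, I5 = index_sets(n, d, k)
S = [sum(lams[i] for i in Ij) for Ij in (I1, I2, I3, I4, I5)]
N = sum(abs(v) for v in lams)
D = abs(sum(lams))
```

Because all quantities are `Fraction` objects, the identities in (32) can be compared with `==` without any rounding concerns.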

To bound \(\Gamma _d\), we need to bound the sum of the negative \(\lambda _i\) with \(i\in I_1, I_5\) as well as the \(\lambda _i\) with \(i\in I_3\), which in turn requires first bounding the terms \(x-x_i\). To this end, let \(h_i=x_{i+1}-x_i\) for \(i\in \{0,\dots ,n-1\}\) and define

$$\begin{aligned} h_{\min }= \min \{h_0,\dots ,h_{n-1}\} \qquad \text {and}\qquad h_{\max }= \max \{h_0,\dots ,h_{n-1}\}. \end{aligned}$$

Proposition 1

For any \(i\in \{0,\dots ,n\}\), the distance between \(x\in (x_k,x_{k+1})\) and \(x_i\) is bounded as

$$\begin{aligned} \begin{aligned} \frac{h_{\min }}{2} (1+t+2(k-i))&\le x - x_i \le \frac{h_{\max }}{2} (1+t+2(k-i)),&\qquad i&\le k,\\ \frac{h_{\min }}{2} (1-t+2(i-k-1))&\le x_i - x \le \frac{h_{\max }}{2} (1-t+2(i-k-1)),&\qquad i&\ge k+1, \end{aligned} \end{aligned}$$

where \(t=2(x-x_k)/h_k-1\in (-1,1)\).

Proof

The statement follows directly by noting that

$$\begin{aligned} x = x_k + \frac{h_k}{2} (1+t) = x_{k+1} - \frac{h_k}{2} (1-t) \end{aligned}$$

for the given t and because \(h_{\min }\le h_i\le h_{\max }\) for any \(i\in \{0,\dots ,n-1\}\). \(\square \)
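
As a sketch of the omitted step for the case \(i\le k\), the distance can be split into the part inside \([x_k,x_{k+1}]\) and the full subintervals to the left of \(x_k\):

$$\begin{aligned} x - x_i = (x-x_k) + \sum _{j=i}^{k-1} h_j = \frac{h_k}{2}(1+t) + \sum _{j=i}^{k-1} h_j \ge \frac{h_{\min }}{2}(1+t) + (k-i)\,h_{\min } = \frac{h_{\min }}{2} \bigl (1+t+2(k-i)\bigr ), \end{aligned}$$

and the upper bound follows in the same way with \(h_{\max }\) in place of \(h_{\min }\). The case \(i\ge k+1\) is analogous, with the \(i-k-1\) full subintervals between \(x_{k+1}\) and \(x_i\).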

To bound the magnitude of the negative \(\lambda _i\) from above, it turns out to be useful to consider them in pairs, with indices from \(I_1\) and \(I_5\) at the same “distance” from \(I_3\).

Lemma 4

For any \(j\in \mathbb {N}\) and \(x\in (x_k,x_{k+1})\),

$$\begin{aligned} - \lambda _{k-d-2j+1}(x) - \lambda _{k+2j}(x) \le \biggl ( \frac{1}{h_{\min }}\biggr )^{d+1} \Biggl ( \frac{1}{\prod _{m=0}^d (2j+m)} + \frac{1}{\prod _{m=0}^d (2j-1+m)} \Biggr ), \end{aligned}$$

where we set \(\lambda _i(x)=0\) for any \(i\notin I\).

Proof

Since the denominator of \(\lambda _{k-d-2j+1}(x)\), for \(k-d-2j+1\ge 0\), contains the terms \(x-x_{k-2j+1-m}\) for \(m=0,\dots ,d\), it follows from Proposition 1 that

$$\begin{aligned} \frac{-1}{\lambda _{k-d-2j+1}(x)} \ge \biggl ( \frac{h_{\min }}{2} \biggr )^{d+1} \prod _{m=0}^d (4j+2m-1+t), \end{aligned}$$

with \(t\in (-1,1)\) and likewise

$$\begin{aligned} \frac{-1}{\lambda _{k+2j}(x)} \ge \biggl ( \frac{h_{\min }}{2} \biggr )^{d+1} \prod _{m=0}^d (4j+2m-1-t), \end{aligned}$$

for \(k+2j\le n-d\). Combining both bounds, we get

$$\begin{aligned} - \lambda _{k-d-2j+1}(x) - \lambda _{k+2j}(x) \le \biggl ( \frac{2}{h_{\min }}\biggr )^{d+1} g(t), \end{aligned}$$

where

$$\begin{aligned} g(t) = \frac{1}{\prod _{m=0}^d (4j+2m-1+t)} + \frac{1}{\prod _{m=0}^d (4j+2m-1-t)}. \end{aligned}$$

As g is clearly even and

$$\begin{aligned} g(1) = \frac{1}{2^{d+1}} \Biggl ( \frac{1}{\prod _{m=0}^d (2j+m)} + \frac{1}{\prod _{m=0}^d (2j-1+m)} \Biggr ), \end{aligned}$$

it remains to show that \(g(t)\le g(1)\) for \(t\in [0,1]\). To this end, note that \(g(t)=p(t)/q(t)\) for

$$\begin{aligned} p(t) = \prod _{m=0}^d (4j+2m-1+t) + \prod _{m=0}^d (4j+2m-1-t) \end{aligned}$$

and

$$\begin{aligned} q(t) = \prod _{m=0}^d \bigl ( {(4j+2m-1)}^2 - t^2 \bigr ). \end{aligned}$$

By the product rule,

$$\begin{aligned} p'(t) = \sum _{l=0}^d \Biggl ( \prod _{m=0,\,m\ne l}^d (4j+2m-1+t) - \prod _{m=0,\,m\ne l}^d (4j+2m-1-t) \Biggr ) \end{aligned}$$

and

$$\begin{aligned} q'(t) = -2t \sum _{l=0}^d \, \prod _{m=0,\,m\ne l}^d \bigl ( {(4j+2m-1)}^2 - t^2 \bigr ). \end{aligned}$$

For \(t\in [0,1]\), we observe that \(p(t)>0\), \(q(t)>0\), \(p'(t)\ge 0\), and \(q'(t)\le 0\), hence

$$\begin{aligned} p'(t) q(t) \ge 0 \ge p(t) q'(t), \end{aligned}$$

and it follows from the quotient rule that g is monotonically increasing over [0, 1]. \(\square \)
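
In more detail, each factor satisfies \(4j+2m-1+t\ge 4j+2m-1-t>0\) for \(t\in [0,1]\), \(j\ge 1\), and \(m\ge 0\), which gives the signs of \(p'\) and \(q'\) used above. The quotient rule then reads

$$\begin{aligned} g'(t) = \frac{p'(t)q(t) - p(t)q'(t)}{{q(t)}^2} \ge 0, \qquad t\in [0,1], \end{aligned}$$

so that \(g(t)\le g(1)\) on [0, 1] and, because g is even, on \((-1,1)\).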

Next, let us bound the \(\lambda _i\) with indices in \(I_3\) from below.

Lemma 5

For any \(i\in I_3\) and \(x\in (x_k,x_{k+1})\),

$$\begin{aligned} \lambda _i(x) \ge \biggl ( \frac{1}{h_{\max }} \biggr )^{d+1} \frac{4}{d!}. \end{aligned}$$
(33)

Proof

Since the denominator of \(\lambda _i(x)\), for \(i\in I_3\), contains the factors \(x-x_{k-m}\) for \(m=0,\dots ,k-i\) and the factors \(x_{k+1+l}-x\) for \(l=0,\dots ,i+d-k-1\), we conclude from Proposition 1 that

$$\begin{aligned} \frac{1}{\lambda _i(x)}&\le \biggl ( \frac{h_{\max }}{2}\biggr )^{d+1} \prod _{m=0}^{k-i} (1+t+2m) \prod _{l=0}^{i+d-k-1} (1-t+2l)\\&= \biggl ( \frac{h_{\max }}{2}\biggr )^{d+1} (1-t^2) \prod _{m=1}^{k-i} (1+t+2m) \prod _{l=1}^{i+d-k-1} (1-t+2l) \\&\le \biggl ( \frac{h_{\max }}{2}\biggr )^{d+1} \prod _{m=1}^{k-i} (2+2m) \prod _{l=1}^{i+d-k-1} (2+2l)\\&= \biggl ( \frac{h_{\max }}{2}\biggr )^{d+1} 2^{d-1} (k-i+1)! (i+d-k)!, \end{aligned}$$

and the statement then follows, because \(a!b!\le (a+b-1)!\) for any \(a,b\in \mathbb {N}\). \(\square \)
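
The two closing steps can be made explicit. For \(a,b\ge 1\),

$$\begin{aligned} \frac{(a+b-1)!}{a!} = \prod _{l=1}^{b-1} (a+l) \ge \prod _{l=1}^{b-1} (1+l) = b!, \end{aligned}$$

and with \(a=k-i+1\) and \(b=i+d-k\), both of which are at least 1 for \(i\in I_3\), we have \(a+b-1=d\), hence

$$\begin{aligned} \frac{1}{\lambda _i(x)} \le \biggl ( \frac{h_{\max }}{2}\biggr )^{d+1} 2^{d-1}\, d! = h_{\max }^{d+1}\, \frac{d!}{4}, \end{aligned}$$

which is (33) after taking reciprocals.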

We are now ready to derive a general upper bound on the function \(\Gamma _d\) in (26), which turns out to depend on d and the mesh ratio

$$\mu =\frac{h_{\max }}{h_{\min }}.$$

Theorem 3

If \(d\ge 1\), then

$$\begin{aligned} \Gamma _d(x;{\varvec{x}}) \le 1 + \frac{\mu ^{d+1}}{2d} \end{aligned}$$
(34)

for any set of ascending interpolation nodes \({\varvec{x}}=(x_0,\dots ,x_n)\in \mathbb {R}^{n+1}\) and any \(x\in [x_0,x_n]\).

Proof

If \(x=x_k\) for some \(k\in \{0,\dots ,n\}\), then, after multiplying both N(x) and D(x) by \(\prod _{i=0}^n{|x-x_i|}\), we find that \(\Gamma _d(x;{\varvec{x}})=1\), which is clearly smaller than the upper bound in (34). Otherwise, there exists some \(k\in \{0,\dots ,n-1\}\), such that \(x\in (x_k,x_{k+1})\), and it follows from Lemma 4 that

$$\begin{aligned} - S_1 - S_5&\le \sum _{j=1}^\infty \biggl ( \frac{1}{h_{\min }} \biggr )^{d+1} \Biggl ( \frac{1}{\prod _{m=0}^d (2j+m)} + \frac{1}{\prod _{m=0}^d (2j-1+m)} \Biggr )\\&= \biggl ( \frac{1}{h_{\min }}\biggr )^{d+1} \sum _{j=1}^\infty \prod _{m=0}^d \frac{1}{j+m}\\&= \biggl ( \frac{1}{h_{\min }}\biggr )^{d+1} \frac{1}{d \cdot d!}, \end{aligned}$$

where the sum of the series can be found in [17, p. 464]. Together with Lemma 5, this implies

$$\begin{aligned} \frac{-2(S_1 + S_5)}{\lambda _i(x)} \le \frac{\mu ^{d+1}}{2d} \end{aligned}$$

for any \(i\in I_3\). As \(\lambda _i(x)\le S_3\le D(x)\) and \(N(x)-D(x)=-2(S_1+S_5)\), we conclude that

$$\begin{aligned} N(x) - D(x) \le \frac{\mu ^{d+1}}{2d} D(x), \end{aligned}$$

and the statement then follows immediately. \(\square \)
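
The value of the series used in this proof can also be obtained by telescoping, since \(\prod _{m=0}^d\frac{1}{j+m}=\frac{1}{d}\bigl (\prod _{m=0}^{d-1}\frac{1}{j+m}-\prod _{m=1}^{d}\frac{1}{j+m}\bigr )\). The following minimal sketch checks this in exact rational arithmetic; the function names are hypothetical and serve only this illustration.

```python
from fractions import Fraction
from math import factorial

def term(j, d):
    """The j-th term of the series: prod_{m=0}^{d} 1/(j+m)."""
    value = Fraction(1)
    for m in range(d + 1):
        value /= j + m
    return value

def partial_sum(J, d):
    """Sum of the first J terms, computed directly."""
    return sum(term(j, d) for j in range(1, J + 1))

def telescoped(J, d):
    """Closed form of the same partial sum, obtained by telescoping:
    (1/d) * (1/d! - prod_{m=1}^{d} 1/(J+m))."""
    tail = Fraction(1)
    for m in range(1, d + 1):
        tail /= J + m
    return (Fraction(1, factorial(d)) - tail) / d

def limit(d):
    """Value of the infinite series, 1/(d * d!)."""
    return Fraction(1, d * factorial(d))
```

As the tail term tends to zero for growing J, the partial sums increase towards `limit(d)`, in agreement with the value quoted from [17].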

In the case of equidistant nodes, when the mesh ratio is \(\mu =1\), the upper bound in (34) is simply \(1+1/(2d)\), so it becomes smaller as d grows. For other nodes, \(\mu \) may depend on n, which may result in very large upper bounds. For example, in the case of Chebyshev nodes, one can show that \(\mu \) grows asymptotically linearly in n. However, our numerical experiments suggest that the function \(\Gamma _d\) is always small, and we believe that the upper bound in Theorem 3 can be improved significantly in future work.
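
The bound in (34) is easy to compare with sampled values of \(\Gamma _d\). The sketch below does this for (rounded) equidistant nodes in double precision; it assumes that (8) has the form \(\lambda _i(x)={(-1)}^i/\prod _{j=i}^{i+d}(x-x_j)\), and the function names, the choice \(n=40\), and the sample points are hypothetical.

```python
def gamma_d(x, nodes, d):
    """Gamma_d(x) = sum_i |lambda_i(x)| / |sum_i lambda_i(x)|, with
    lambda_i(x) = (-1)^i / prod_{j=i}^{i+d} (x - x_j) (assumed form of (8))."""
    n = len(nodes) - 1
    lams = []
    for i in range(n - d + 1):
        prod = 1.0
        for j in range(i, i + d + 1):
            prod *= x - nodes[j]
        lams.append((-1.0) ** i / prod)
    return sum(abs(v) for v in lams) / abs(sum(lams))

def theorem3_bound(nodes, d):
    """Right-hand side of (34): 1 + mu^(d+1) / (2d), with mu = h_max / h_min."""
    h = [b - a for a, b in zip(nodes, nodes[1:])]
    mu = max(h) / min(h)
    return 1.0 + mu ** (d + 1) / (2 * d)

n = 40
nodes = [2.0 * i / n - 1.0 for i in range(n + 1)]

# Three sample points strictly inside every subinterval (x_k, x_{k+1}).
points = [nodes[k] + s * (nodes[k + 1] - nodes[k])
          for k in range(n) for s in (0.1, 0.5, 0.9)]

# Map d -> (largest sampled Gamma_d, bound from Theorem 3).
results = {}
for d in (1, 2, 3, 5):
    worst = max(gamma_d(x, nodes, d) for x in points)
    results[d] = (worst, theorem3_bound(nodes, d))
```

Such a coarse sampling gives only a lower estimate of the true maximum of \(\Gamma _d\), so it can illustrate the bound but not prove it.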

Numerical experiments

We performed numerous experiments to verify the results proven in the previous sections numerically and report some representative results below. In particular, we analyze the various algorithms that implement the first barycentric form (7) both in terms of stability and computational cost (Sect. 7.1) and focus on the comparison between the first and the second form for an example where \(\Lambda _n\gg \Gamma _d\) (Sect. 7.2).

Our experimental platform is a Windows 10 laptop with a 1.8 GHz Intel Core i7-10510U processor and 16 GB RAM, and we implemented all algorithms in C++. In what follows, the ‘exact’ values were computed in multiple-precision (1024 bit) floating point arithmetic using the MPFR library [18], while all other values were computed in standard double precision. Moreover, we took care to provide all input data (interpolation nodes, data values, and evaluation points) in double precision, so that they do not cause any additional error.

Comparison of algorithms for the first barycentric form

For the first example, we consider the case of \(n+1\) interpolation nodes \(x_i={\textrm{fl}}(y_i)\in [-1,1]\) for \(i=0,\dots ,n\), derived from the equidistant nodes \(y_i=2i/n-1\) by rounding them to double precision, and associated (rounded) data \(f_i={\textrm{fl}}(f(y_i))\), sampled from the test function

$$\begin{aligned} f(x) = \tfrac{3}{4} e^{-\frac{(9x-2)^2}{4}} + \tfrac{3}{4} e^{-\frac{(9x+1)^2}{49}} + \tfrac{1}{2} e^{-\frac{(9x-7)^2}{4}} + \tfrac{1}{5} e^{-(9x-4)^2}, \end{aligned}$$

and we compare three ways of evaluating the first barycentric form in (7), which differ in the way the denominator is computed.

Fig. 1

Distribution of the relative forward errors of the first barycentric form for equidistant nodes at 50,000 random evaluation points (top) and overall running time in seconds (bottom), both on a logarithmic scale, for different n and three choices of d (left, middle, right), using the standard algorithm (blue), Camargo’s algorithm (red), and our efficient variant of the standard algorithm (green) (color figure online)

The first algorithm simply evaluates the functions \(\lambda _i\) as in (8), leading to the error mentioned in Lemma 2, and then sums up these values to get the denominator. The second algorithm by Camargo [11, Section 4.1] instead increases the stability of the summation by first computing sums of pairs of \(\lambda _i\)’s such that all these sums have the same sign. The third algorithm implements our iterative strategy in (31) before taking the sum, which is more efficient than the first algorithm, but less precise (cf. Lemma 3). All three algorithms compute the numerator of the first barycentric form in the same way, first dividing the weights \(w_i\) by \(x-x_i\), then multiplying the results by \(f_i\), and finally summing up these values. The \(w_i\) themselves are precomputed with the pyramid algorithm in (29) and (30). Note that, although the weights for equidistant nodes are integer multiples of each other in theory [6], they do not have this property in this example, because rounding makes the nodes \(x_i\) not exactly equidistant.
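
To illustrate the structure of the standard algorithm, the following minimal Python sketch evaluates the first and the second barycentric form side by side. It is not the C++ code used in the experiments: it assumes the usual expressions \(w_k=\sum _{i\in J_k}{(-1)}^i\prod _{j=i,\,j\ne k}^{i+d}\frac{1}{x_k-x_j}\) with \(J_k=\{\max (0,k-d),\dots ,\min (k,n-d)\}\) for the weights in (6) and \(\lambda _i(x)={(-1)}^i/\prod _{j=i}^{i+d}(x-x_j)\) for (8), it computes the weights directly rather than with the pyramid algorithm, and all function names and test data are hypothetical.

```python
def fh_weights(nodes, d):
    """Weights w_k (assumed form of (6)), evaluated directly from the
    definition; this is NOT the pyramid algorithm in (29) and (30)."""
    n = len(nodes) - 1
    w = []
    for k in range(n + 1):
        total = 0.0
        for i in range(max(0, k - d), min(k, n - d) + 1):
            prod = 1.0
            for j in range(i, i + d + 1):
                if j != k:
                    prod *= nodes[k] - nodes[j]
            total += (-1.0) ** i / prod
        w.append(total)
    return w

def first_form(x, nodes, f, w, d):
    """Numerator: sum of w_i / (x - x_i) * f_i.
    Denominator: sum of the lambda_i(x), each evaluated from its definition."""
    n = len(nodes) - 1
    num = sum(w[i] / (x - nodes[i]) * f[i] for i in range(n + 1))
    den = 0.0
    for i in range(n - d + 1):
        prod = 1.0
        for j in range(i, i + d + 1):
            prod *= x - nodes[j]
        den += (-1.0) ** i / prod
    return num / den

def second_form(x, nodes, f, w):
    """Numerator and denominator share the quotients w_i / (x - x_i)."""
    terms = [wi / (x - xi) for wi, xi in zip(w, nodes)]
    return sum(t * fi for t, fi in zip(terms, f)) / sum(terms)

# Hypothetical test data: a cubic is reproduced by the interpolant for d = 3.
n, d = 20, 3
nodes = [2.0 * i / n - 1.0 for i in range(n + 1)]
f = [xi ** 3 - xi for xi in nodes]
w = fh_weights(nodes, d)
x = 0.3141  # not an interpolation node
r1 = first_form(x, nodes, f, w, d)
r2 = second_form(x, nodes, f, w)
```

The two functions differ only in the denominator, which mirrors the remark above that all variants of the first form share the same numerator.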

Fig. 2

Plots of \(\kappa (x;{\varvec{x}},{\varvec{f}})\) (top) and \(\Gamma _d(x;{\varvec{x}})\) (bottom) for equidistant nodes and \(x\in [-1,1]\), both on a logarithmic scale, for \(n=39\) and three choices of d (left, middle, right)

To compare these three algorithms, we used them to evaluate the barycentric rational interpolant with weights in (6) for \(d\in \{1,5,25\}\) and an increasing number of interpolation nodes, \(n\in \{39,79,159,319,639,1279\}\), at 50,000 random points from \([-1+10^3\epsilon ,1-10^3\epsilon ]\setminus \{x_0,\dots ,x_n\}\). Figure 1 shows the corresponding running times and the distribution of the relative forward error of the computed values. For the latter, we chose a box plot, where the bottom and top of each box represent the interquartile range, from the first to the third quartile, and the line in the middle shows the median of the relative forward errors. The whiskers range from the minimum to the maximum, excluding those values that are more than 1.5 times the interquartile range greater than the third quartile, which are instead considered outliers and shown as isolated points.
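
The box plot convention described above can be sketched as follows. The quartile interpolation rule used for Figure 1 is not stated, so the `inclusive` method of Python's `statistics.quantiles` is an assumption, as are the function name and the returned dictionary keys.

```python
import statistics

def box_plot_stats(errors):
    """Quartiles, whisker ends, and outliers for a list of forward errors.

    Values more than 1.5 times the interquartile range above the third
    quartile are treated as outliers; the whiskers span the remaining values.
    """
    q1, median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    upper_fence = q3 + 1.5 * (q3 - q1)
    inliers = [e for e in errors if e <= upper_fence]
    outliers = sorted(e for e in errors if e > upper_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inliers),
        "whisker_high": max(inliers),
        "outliers": outliers,
    }

stats = box_plot_stats([1.0, 2.0, 3.0, 4.0, 100.0])
```

For this toy input, the value 100.0 lies above the upper fence and is reported as an outlier, while the upper whisker ends at 4.0.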

On the one hand, we observe that Camargo’s algorithm beats the standard algorithm in terms of running time, but that our efficient algorithm is the fastest, especially as d grows, because its time complexity does not depend on d. On the other hand, our efficient algorithm is less precise than the standard algorithm, as predicted by Lemma 3, and Camargo’s algorithm gives the smallest errors, except for \(d=5\) and \(n\in \{639,1279\}\). Nevertheless, the computations confirm the forward stability for all three algorithms. For Camargo’s algorithm, this follows from the backward stability, which is proven in [11], and for the other two algorithms it is implied by Corollary 2 and Lemmas 1–3.

The rather large errors of the outliers in the case \(d=25\) can be explained by the behaviour of the componentwise relative condition number \(\kappa \), shown in Figure 2. While \(\kappa \) is small for all \(x\in [-1,1]\) in the case of \(d=1\), it starts to grow considerably close to the end points of the interval as d grows, up to \(10^6\) for \(n=39\) and \(10^8\) for \(n=1279\) in the case of \(d=25\), and so does the upper bound on the relative forward error in (28). While this upper bound also depends on \(\Gamma _d\), Figure 2 shows that this function is always small and its maximum even decreases as d grows, independently of n. This is in agreement with Theorem 3, because \(\mu \approx 1\) in this example. The fact that the maximum error still seems to decrease for \(d=25\) as n increases is simply due to the fact that the 50,000 sample points are not sufficiently many to “catch” the worst case, because the region near the endpoints where \(\kappa \) grows rapidly actually shrinks as n grows.

Of course, it is also possible to evaluate the rational interpolant using the second barycentric form (4), which is actually the best choice for this example: its relative forward errors are similar to those of the standard algorithm for the first barycentric form, and it is roughly twice as fast as the efficient algorithm. Neither observation is surprising. Regarding the efficiency, the second form is superior, because the denominator can be computed “on-the-fly” at almost no extra cost during the evaluation of the numerator. As for the error, we note that \(\Lambda _n\) is much smaller than \(\kappa \) for the nodes used in this example [10], and so the upper bound in (25) is dominated by \(\kappa \), exactly as the upper bound in (28). However, for non-uniform nodes, the situation can be quite different, as the next example shows.

Fig. 3

Plots of \(\Lambda _n(x;{\varvec{x}})\) (left) and \(\Gamma _d(x;{\varvec{x}})\) (right) for a non-regular distribution of nodes and \(x\in [0,1]\), both on a logarithmic scale

Worst-case comparison of first and second barycentric form

The aim of the second example is to compare the standard algorithm for the first barycentric form in (7), as described in Sect. 7.1, with a straightforward implementation of the second barycentric form, following the formula in (4). The weights \(w_i\) are again precomputed with the pyramid algorithm.

We consider \(n=29\), \(d=3\), and interpolation nodes \(x_i={\textrm{fl}}(y_i)\in [0,1]\) for \(i=0,\dots ,n\), obtained by rounding to double precision the values \(y_i=F(t_i)\), where

$$\begin{aligned} F(t)={\left\{ \begin{array}{ll} 0, &{} \quad t=0,\\ e^{1-\frac{1}{t}}, &{} \quad t\in (0,1] \end{array}\right. } \end{aligned}$$

and \(t_i=i/n\). We choose these nodes, because the functions \(\Lambda _n\) and \(\Gamma _d\) behave completely differently in this case, as shown in Fig. 3. While \(\Lambda _n\) reaches huge values, up to \(10^{17}\), \(\Gamma _d\) is small (even though the latter is not guaranteed by Theorem 3). Hence, the upper bounds in Corollaries 1 and 2 suggest that we may see a big difference in the forward stability of the first and the second barycentric form, if we choose the data such that the condition number \(\kappa \) is small.
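
To see why Theorem 3 gives no useful guarantee here, one can compute the mesh ratio of these nodes directly. The following minimal sketch builds the nodes in double precision and evaluates the right-hand side of (34); the variable names are hypothetical.

```python
import math

def F(t):
    """Node-generating function: F(0) = 0 and F(t) = exp(1 - 1/t) on (0, 1]."""
    return 0.0 if t == 0 else math.exp(1.0 - 1.0 / t)

n, d = 29, 3
nodes = [F(i / n) for i in range(n + 1)]

# Node spacings h_i = x_{i+1} - x_i and the mesh ratio mu = h_max / h_min.
h = [b - a for a, b in zip(nodes, nodes[1:])]
mu = max(h) / min(h)

# Right-hand side of (34) for this node distribution.
bound = 1.0 + mu ** (d + 1) / (2 * d)
```

The first spacing is \(h_0=e^{-28}\), which is many orders of magnitude smaller than the largest one, so \(\mu \) and with it the bound in (34) become astronomically large.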

Fig. 4

Plots of \(l_n(x)\) (black) and the barycentric rational interpolant r(x) for \(d=3\) (red) for non-regularly distributed interpolation nodes over the whole interval [0, 1] (top left) and a close-up view over [0.21, 0.31] (top right). Evaluating r(x) at 10,000 equidistant evaluation points in \([10^3\epsilon ,1-10^3\epsilon ]\) with the standard implementations of the first (blue dots) and the second (green dots) barycentric form shows that the first form is stable, while the second form is not (bottom) (color figure online)

One such choice, which is also presented by Higham for the case \(d=n\) with equidistant nodes [1], is to take the data

$$\begin{aligned} f_i = {\left\{ \begin{array}{ll} 0, &{} \quad i=0,\dots ,n-1,\\ 1, &{} \quad i=n, \end{array}\right. } \end{aligned}$$

which can be interpreted as having been sampled from the nth Lagrange basis polynomial \(l_n(x)=\prod _{i=0}^{n-1}\frac{x-x_i}{x_n-x_i}\), that is, \(f_i={\textrm{fl}}(l_n(y_i))\). For this data, we know that \(\kappa (x;{\varvec{x}},{\varvec{f}})=1\) for all \(x\in [0,1]\), so the upper bounds on the relative forward errors in (25) and (28) are dominated by \(\Lambda _n\) and \(\Gamma _d\), respectively. Consequently, as shown in Fig. 4, the barycentric rational interpolant is reproduced faithfully by the first form, but not by the second, because the relative forward error for the first form is on the order of \(\epsilon \), while it can be on the order of 1 for the second form; see Fig. 5 (left).

Fig. 5

Plot of relative forward errors of the first (blue) and second (green) barycentric form for a non-regular distribution of nodes at 100 equidistant evaluation points in \([10^3\epsilon ,1-10^3\epsilon ]\) with data sampled from the nth Lagrange basis polynomial (left) and the constant one function (right). Since both plots are on a logarithmic scale and the second form is exact in the latter case, the corresponding errors are missing in the plot on the right (color figure online)

Fig. 6

Even though the barycentric rational interpolant of the constant one function for non-regularly distributed interpolation nodes is simply \(r(x)=1\), evaluating it at 10,000 equidistant evaluation points in \([10^3\epsilon ,1-10^3\epsilon ]\) shows that the second form (green dots) is stable, while the first form (blue dots) is not

However, the opposite may happen as well. If we consider the data

$$\begin{aligned} f_i=1, \qquad i=0,\dots ,n, \end{aligned}$$

sampled from the constant one function, then \(\kappa =\Lambda _n\), and the upper bounds in (25) and (28) suggest that both forms can be unstable, even though the barycentric rational interpolant is simply \(r(x)=1\). Figure 5 (right) and Figure 6 confirm that this is indeed the case for the first barycentric form. However, the second barycentric form is perfectly stable, because the numerator and the denominator in (4) are identical and cancel out to always give the correct function value 1.
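
The exact cancellation in the second form can be reproduced with a few lines of code. The sketch below is only an illustration under stated assumptions: for brevity it uses the weights \(w_i={(-1)}^i\) of the case \(d=0\) instead of \(d=3\), which does not affect the argument, because the cancellation holds for any weights as long as numerator and denominator are accumulated from the same quotients in the same order. The function names are hypothetical.

```python
import math

def F(t):
    """Node-generating function of this example."""
    return 0.0 if t == 0 else math.exp(1.0 - 1.0 / t)

def second_form(x, nodes, f, w):
    """Second barycentric form; numerator and denominator are built from
    the same quotients w_i / (x - x_i), summed in the same order."""
    terms = [wi / (x - xi) for wi, xi in zip(w, nodes)]
    num = sum(t * fi for t, fi in zip(terms, f))
    den = sum(terms)
    return num / den

n = 29
nodes = [F(i / n) for i in range(n + 1)]
w = [(-1.0) ** i for i in range(n + 1)]  # weights for d = 0, used for brevity
f = [1.0] * (n + 1)                      # data sampled from the constant one function

# Evaluate at the midpoint of every subinterval.
midpoints = [0.5 * (a + b) for a, b in zip(nodes, nodes[1:])]
values = [second_form(x, nodes, f, w) for x in midpoints]
```

Since multiplying by 1.0 is exact in floating point arithmetic, `num` and `den` are bitwise identical, and their quotient is exactly 1.0 at every evaluation point.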

Conclusion

Barycentric interpolation offers a fast and stable means of evaluating the polynomial Lagrange interpolant using either the first or the second barycentric form. While the first form is always numerically stable, the second form is stable only for interpolation nodes with a small Lebesgue constant.

Evaluating a rational interpolant via the second barycentric form comes with the same limitation, but for the special family of barycentric rational interpolants with weights in (6), a computationally more attractive first barycentric form is available. Instead of depending on the Lebesgue constant, both the forward and the backward stability of a straightforward implementation of this first barycentric form depend on the function \(\Gamma _d\) in (26).

Theorem 3 shows that, unlike the Lebesgue constant, the maximum of \(\Gamma _d\) admits an upper bound that depends on n only through the mesh ratio \(\mu \). Moreover, it is guaranteed to be very small for equidistant nodes, regardless of d, while the Lebesgue constant is known to grow logarithmically in n and exponentially in d in this case. Based on our numerical experiments, the maximum of \(\Gamma _d\) seems to be small for other distributions of interpolation nodes, too, even if the mesh ratio \(\mu \) is big, as in the example in Sect. 7.2, and we believe that the upper bound in (34) can be improved considerably in future work. For example, if \(d=n\), then \(\Gamma _d\) is just the constant one function, independent of \(\mu \).

Overall, we conclude that the most efficient way of stably evaluating a barycentric rational interpolant with weights in (6) is to determine the weights with the pyramid algorithm in (29) and (30) in a preprocessing step and, for every evaluation point x, to first compute the values \(\lambda _i(x)\) with our iterative strategy in (31) and then the value of the interpolant using a straightforward implementation of the first barycentric form in (7). Alternatively, computing the sum of the \(\lambda _i(x)\) in the denominator with Camargo’s algorithm [11] results in slightly smaller forward errors, but is significantly slower, especially for larger d.