1 Introduction

Numerical algorithms for solving continuous problems generally fall into two categories: nonadaptive algorithms and adaptive algorithms. By “adaptive” we mean here that in its successive steps the algorithm uses information about the problem instance (usually a real-valued function) obtained from the previous steps. Adaptive algorithms often outperform nonadaptive ones in that they enjoy an essentially better convergence rate. Examples include bisection or Newton’s method for solving nonlinear equations. A good deal of numerical literature is devoted to automatic integration using adaptive quadratures (see, e.g., [1, 6]), one of the first and probably best known being the adaptive Simpson quadrature [5]. The question of when and how much adaption helps is one of the main issues in information-based complexity [7, 12].

For the problems of function approximation or integration, adaptive algorithms are especially efficient when the underlying function is only piecewise smooth, since then adaption can be successfully used to localize the unknown singular points [9,10,11]. On the other hand, if the function is smooth in the whole domain, then adaptive algorithms can improve the error only by a constant factor compared to nonadaptive algorithms. The exact asymptotic constants for quadratures of degree of exactness r − 1 and for functions \(f\in C^{r}([a,b])\) with f(r) > 0 were obtained in [8] for r = 4 and in [2, 3] for arbitrary r. Procedures corresponding to the optimal strategies for automatic integration were also proposed.

While adaptive numerical algorithms for the problem of approximating smooth functions are sometimes constructed (see, e.g., [4, Sect. 6.14]), a similar quantitative analysis of such algorithms seems not to exist. The purpose of the current paper is to fill this gap. We consider approximation of functions \(f\in C^{r}([a,b])\) and algorithms that rely on piecewise polynomial interpolation of degree r − 1 with adaptive strategies of selecting m subintervals. The error is measured in the integral norm \(\|\cdot \|_{L^{p}}\) with \(1\le p\le +\infty \). It is well known that the optimal convergence rate is in this case of order \(m^{-r}\), and it is already achieved by the uniform, i.e., nonadaptive, partition of the initial interval. (Actually, the rate \(m^{-r}\) cannot be beaten even in the much larger class of algorithms that use m function evaluations, which follows in particular from [13].)

We first prove that, for any function in the class, a theoretically best adaptive strategy of interval subdivision relies on keeping the Lp errors equal in all subintervals. Then, the global Lp error of approximation asymptotically, as \(m\to +\infty ,\) equals

$$\frac{\alpha_{r,p}}{r!}\|f^{(r)}\|_{L^{1/(r+1/p)}(a,b)}m^{-r},$$

while for the uniform partition it equals

$$\frac{\alpha_{r,p}}{r!}(b-a)^{r}\|f^{(r)}\|_{L^{p}(a,b)}m^{-r},$$

where αr, p is given by (4) (see Propositions 1 and 2). The gain from using adaption can be significant. For instance, consider the \(L^{\infty }\) approximation of \(f(x)=1/(x+10^{-d})\) in the interval [0,1]. If d = 2, then the adaptive algorithm outperforms the nonadaptive one roughly by a factor of \(10^{6}\), and for d = 8 this factor becomes \(10^{29}\) (see Table 2).
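This factor can be checked directly. The following sketch (ours, not from the paper; the function name is hypothetical) evaluates the ratio \(R_{4,\infty }(f)\) of Remark 3 for \(f(x)=1/(x+10^{-d})\) on [0,1], using the closed form \(f^{(4)}(x)=24(x+10^{-d})^{-5}\) and the exact antiderivative of \((x+\varepsilon )^{-5/4}\).

```python
def gain_factor_linf(d: int) -> float:
    """R_{4,inf}(f) = (b-a)^4 ||f^(4)||_inf / ||f^(4)||_{L^{1/4}} on [0, 1]
    (so (b-a)^4 = 1 here) for f(x) = 1/(x + eps), eps = 10^-d,
    where f^(4)(x) = 24 (x + eps)^-5."""
    eps = 10.0 ** (-d)
    sup_norm = 24.0 * eps ** (-5)                  # sup attained at x = 0
    # ||g||_{L^{1/4}} = (int_0^1 g(x)^{1/4} dx)^4 with g(x) = 24 (x+eps)^-5,
    # and int_0^1 (x+eps)^{-5/4} dx = 4 (eps^{-1/4} - (1+eps)^{-1/4}).
    quasi_norm = 24.0 * (4.0 * (eps ** -0.25 - (1.0 + eps) ** -0.25)) ** 4
    return sup_norm / quasi_norm
```

This gives roughly 1.8 × 10^6 for d = 2 and 4 × 10^29 for d = 8, the orders of magnitude quoted above.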

Then, we show how the optimal strategy can be realized in practice. That is, for a given function \(f\in C^{r}([a,b])\) we construct a relatively simple procedure that uses a priority queue and produces an almost optimal mth partition using a number of evaluations of f proportional to m. Different versions of the procedure and the error analysis are presented depending on additional properties of f (see Theorems 1–4).

Next, we deal with automatic approximation. We consider a local subdivision strategy that is a departure point for obtaining a recursive procedure using the (almost) optimal strategy. For any ε > 0 and \(f\in C^{r}([a,b])\), the proposed procedures return an approximation with the Lp error at most ε, asymptotically as ε → 0+.

Finally, we notice that our results imply the previously known results for numerical integration mentioned earlier in this introduction.

The content of the paper is as follows. In Section 2 we formally define our problem and show some preliminary estimates. A theoretically optimal partition is constructed in Section 3, while Section 4 is devoted to its practical realization. The recursive procedures for automatic approximation are constructed in Section 5. In Section 6 we comment on relations to numerical integration. The theoretical findings are complemented by numerical examples.

2 Preliminaries

For an integer r ≥ 1 and \(-\infty <a<b<+\infty ,\) we denote by Cr([a, b]) the space of functions

$$f:[a,b]\to\mathbb R$$

that are r-times continuously differentiable in [a, b]. We assume that such functions are approximated using piecewise interpolation of degree r − 1 with possibly non-uniform partition of the interval [a, b] into subintervals. Specifically, we first fix points

$$ 0\le t_{1}<t_{2}<\cdots<t_{r}\le 1. $$
(1)

For a given \(f\in C^{r}([a,b])\), the interval [a, b] is subdivided into m subintervals that are determined by a choice of points

$$ a=x_{0}<x_{1}<\cdots<x_{m}=b. $$
(2)

In each subinterval [xj− 1, xj], the function is approximated by its Lagrange polynomial of degree r − 1 interpolating f at

$$ x_{j,i}=x_{j-1}+h_{j}t_{i},\qquad 1\le i\le r,$$

where hj = xjxj− 1. We denote such an approximation by Lm, rf.
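To fix ideas, here is a minimal Python sketch (our illustration; the names `lagrange_eval` and `piecewise_interpolant` are ours, not the paper's) of the piecewise interpolant Lm, rf for a given partition and nodes t1,…, tr.

```python
def lagrange_eval(xnodes, ynodes, x):
    """Evaluate at x the Lagrange polynomial through (xnodes[i], ynodes[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xnodes, ynodes)):
        li = 1.0                               # i-th Lagrange basis polynomial
        for j, xj in enumerate(xnodes):
            if j != i:
                li *= (x - xj) / (xi - xj)
        total += yi * li
    return total

def piecewise_interpolant(f, partition, t):
    """Return L_{m,r}f: on each [x_{j-1}, x_j] of the partition, the Lagrange
    polynomial of degree r-1 interpolating f at x_{j-1} + h_j * t_i."""
    def L(x):
        # locate the subinterval [c, d] containing x (linear scan for clarity)
        for c, d in zip(partition, partition[1:]):
            if x <= d:
                break
        h = d - c
        nodes = [c + h * ti for ti in t]
        return lagrange_eval(nodes, [f(u) for u in nodes], x)
    return L
```

Since on each piece the construction reproduces polynomials of degree smaller than r exactly, Lm, rf = f for such f.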

The error of approximation is measured in the Lp norm, i.e.,

$$ \|f-L_{m,r}f\|_{L^{p}(a,b)} = \left\{\begin{array}{ll} \left( {{\int}_{a}^{b}}|f(x)-L_{m,r}f(x)|^{p} \mathrm dx\right)^{1/p}, &\quad 1\le p<+\infty, \\ \operatornamewithlimits{ess sup}_{a\le x\le b}|f(x)-L_{m,r}f(x)|, &\quad p=+\infty. \end{array}\right. $$
(3)

We are interested in partitions (2) such that the errors for the corresponding approximations Lm, rf are asymptotically (as \(m\to +\infty \)) as small as possible. Note that the problem can be formally treated as a special way of approximating the embedding

$$ C^{r}([a,b])\hookrightarrow L^{p}(a,b).$$

Remark 1

Obviously, the uniform approximation Cr([a, b])↪C([a, b]) is also of interest. We do not analyze it separately, since it is equivalent to \(L^{\infty }\) approximation provided t1 = 0, tr = 1, and r ≥ 2. Indeed, then for any partition, the approximation Lm, rf is continuous in [a, b] and \(\|f-L_{m,r}f\|_{C([a,b])}=\|f-L_{m,r}f\|_{L^{\infty }(a,b)}\).

In the rest of the paper, we assume without loss of generality that f is not a polynomial of degree smaller than r, since otherwise we clearly have Lm, rf = f. Then, in particular, the derivative f(r) is a nontrivial function.

We now provide preliminary formulas for the approximation error that will be used later. Let

$$P_{r}(t)=(t-t_{1})(t-t_{2})\cdots(t-t_{r}),$$

where the \(t_{i}\) are given by (1). Let

$$ \alpha_{r,p} = \|P_{r}\|_{L^{p}(0,1)} = \left\{\begin{array}{ll} \left( {{\int}_{0}^{1}}|P_{r}(t)|^{p} \mathrm dt\right)^{1/p}, &\quad 1\le p<+\infty, \\ \max_{0\le t\le 1}|P_{r}(t)|, &\quad p=+\infty. \end{array}\right. $$
(4)

Then, the local errors, by which we mean the errors in the successive subintervals [xj− 1, xj], can be written as follows. For \(1\le p<+\infty ,\)

$$ \begin{array}{@{}rcl@{}} &&\|f-L_{m,r}f\|_{L^{p}(x_{j-1},x_{j})}\\ &=& \left( {\int}_{x_{j-1}}^{x_{j}}|f(x)-L_{m,r}f(x)|^{p}\mathrm dx\right)^{1/p} \\ &=& \left( {\int}_{x_{j-1}}^{x_{j}}\left|(x-x_{j,1})\cdots(x-x_{j,r})f[x_{j,1},\ldots,x_{j,r},x]\right|^{p}\mathrm dx\right)^{1/p}\\ &=& h_{j}^{r+1/p}\left( {{\int}_{0}^{1}}|(t-t_{1})\cdots(t-t_{r})|^{p}\mathrm dt\right)^{1/p}\left|f[x_{j,1},\ldots,x_{j,r},\xi_{j}]\right|\\ &=& \frac{\alpha_{r,p}}{r!} h_{j}^{r+1/p} \left|f^{(r)}(\eta_{j})\right|, \qquad\text{where}\quad\xi_{j},\eta_{j}\in[x_{j-1},x_{j}], \end{array} $$

and

$$ \begin{array}{@{}rcl@{}} \|f-L_{m,r}f\|_{L^{\infty}(x_{j-1},x_{j})} &=& \max_{x_{j-1}\le x\le x_{j}}|f(x)-L_{m,r}f(x)| \\ &=& \max_{x_{j-1}\le x\le x_{j}}\left|(x-x_{j,1})\cdots(x-x_{j,r})f[x_{j,1},\ldots,x_{j,r},x]\right| \\ &=& \frac{\alpha_{r,\infty}}{r!} {h_{j}^{r}} \left|f^{(r)}(\eta_{j})\right|, \qquad\text{where}\quad\eta_{j}\in[x_{j-1},x_{j}]. \end{array} $$

Hence,

$$ \begin{array}{@{}rcl@{}} \|f-L_{m,r}f\|_{L^{p}(a,b)}\!&=&\! \frac{\alpha_{r,p}}{r!} \left( \sum\limits_{j=1}^{m} h_{j}^{rp+1}\left|f^{(r)}(\eta_{j})\right|^{p}\right)^{1/p},\quad 1\!\le\! p{<}{+}\infty, \end{array} $$
(5)
$$ \begin{array}{@{}rcl@{}} \|f-L_{m,r}f\|_{L^{\infty}(a,b)}\!&=&\!\frac{\alpha_{r,\infty}}{r!} \max_{1\le j\le m}{h_{j}^{r}}\left|f^{(r)}(\eta_{j})\right|. \end{array} $$
(6)

In the sequel, ηj always denotes a point in the jth subinterval for which

$$\|f-L_{m,r}f\|_{L^{p}(x_{j-1},x_{j})} = \frac{\alpha_{r,p}}{r!} h_{j}^{r+1/p}\left|f^{(r)}(\eta_{j})\right|.$$

For convenience, we also use the following asymptotic notation. For two nonnegative functions a and b of the variable m we write

$$ a(m)\lessapprox b(m) \text{ iff } \limsup_{m\to\infty}\frac{a(m)}{b(m)}\le 1, \qquad a(m)\approx b(m) \text{ iff } \lim_{m\to\infty}\frac{a(m)}{b(m)}=1.$$

Obviously, a(m) ≈ b(m) iff \(a(m)\lessapprox b(m)\) and \(b(m)\lessapprox a(m)\).

Consider first the uniform partition of the interval [a, b], in which case

$$ x_{j}=a+j \frac{b-a}{m},\qquad 0\le j\le m. $$
(7)

Proposition 1

For the uniform partition (7) we have

$$ \|f-L_{m,r}f\|_{L^{p}(a,b)} \approx \frac{\alpha_{r,p}}{r!} (b-a)^{r}\left\|f^{(r)}\right\|_{L^{p}(a,b)} m^{-r},$$

for all \(1\le p\le +\infty \).

Proof

Indeed, by (5) we have

$$ \begin{array}{@{}rcl@{}} \|f-L_{m,r}f\|_{L^{p}(a,b)} &=& \frac{\alpha_{r,p}}{r!}\left( \frac{b-a}{m}\right)^{r}\left( \sum\limits_{j=1}^{m} \left( \frac{b-a}{m}\right) \left|f^{(r)}(\eta_{j})\right|^{p}\right)^{1/p}\\ &\approx& \frac{\alpha_{r,p}}{r!} (b-a)^{r}\left\|f^{(r)}\right\|_{L^{p}(a,b)} m^{-r}, \end{array} $$

and by (6) we have

$$ \begin{array}{@{}rcl@{}} \|f-L_{m,r}f\|_{L^{\infty}(a,b)}&=&\frac{\alpha_{r,\infty}}{r!} \max_{1\le j\le m}{h_{j}^{r}} \left|f^{(r)}(\eta_{j})\right| \\&\approx&\frac{\alpha_{r,\infty}}{r!} (b-a)^{r}\left\|f^{(r)}\right\|_{L^{\infty}(a,b)} m^{-r}. \end{array} $$
□

3 Optimal partition

We now show that an asymptotically optimal partition makes all local errors equal. That is, it asymptotically enjoys the smallest error as \(m\to +\infty ,\) for all functions \(f\in C^{r}([a,b])\). Specifically, for a given m, let

$$ a=x_{0}^{*}<x_{1}^{*}<\cdots<x_{m}^{*}=b $$
(8)

be such that all the quantities

$$ \|f-L_{m,r}^{*}f\|_{L^{p}(x_{j-1}^{*},x_{j}^{*})} = \frac{\alpha_{r,p}}{r!} h_{j}^{r+1/p}\left|f^{(r)}(\eta_{j})\right|,\qquad 1\le j\le m, $$

where \(L_{m,r}^{*}f\) denotes the approximation corresponding to (8), are equal. Observe that such a partition exists since the local errors continuously depend on the points xi.

In the sequel, \(\|g\|_{L^{q}(a,b)}=\left ({{\int \limits }_{a}^{b}}|g(x)|^{q} \mathrm dx\right )^{1/q}\) for all \(0<q\le +\infty \). This is obviously not a norm in case 0 < q < 1, since then the triangle inequality is not satisfied. We also adopt the notation that 1/p = 0 for \(p=+\infty \).

Proposition 2

The equal-local error partitions (8) and the corresponding approximations \(L_{m,r}^{*}\) are asymptotically optimal. That is, for the approximations Lm, r using other partitions we have

$$ \|f-L_{m,r}^{*}f\|_{L^{p}(a,b)} \lessapprox \|f-L_{m,r}f\|_{L^{p}(a,b)}.$$

Furthermore,

$$ \|f-L_{m,r}^{*}f\|_{L^{p}(a,b)} \approx \frac{\alpha_{r,p}}{r!} \left\|f^{(r)}\right\|_{L^{1/(r+1/p)}(a,b)} m^{-r}. $$

Proof

We first show the error formula for \(L_{m,r}^{*}\). Let A denote the common value \(A=h_{j}^{r+1/p}\left |f^{(r)}(\eta _{j})\right |,\) which by the choice of the partition does not depend on j. Then, for finite p, we have

$$ mA^{1/(r+1/p)} = \sum\limits_{j=1}^{m} h_{j}\left|f^{(r)}(\eta_{j})\right|^{1/(r+1/p)} \approx {{\int}_{a}^{b}}\left|f^{(r)}(x)\right|^{1/(r+1/p)} \mathrm dx, $$

where we used the fact that if f(r)(ηj) = 0, then f(r) vanishes on the whole interval [xj− 1, xj]. This implies

$$ \begin{array}{@{}rcl@{}} \|f-L_{m,r}^{*}f\|_{L^{p}(a,b)} &=&\frac{\alpha_{r,p}}{r!}(mA^{p})^{1/p} = \frac{\alpha_{r,p}}{r!}\left( mA^{1/(r+1/p)}\right)^{r+1/p} m^{-r} \\ &\approx&\frac{\alpha_{r,p}}{r!} \|f^{(r)}\|_{L^{1/(r+1/p)}(a,b)} m^{-r}. \end{array} $$

For infinite p we have in turn

$$ \|f - L_{m,r}^{*}f\|_{L^{\infty}(a,b)} = \frac{\alpha_{r,\infty}}{r!}A= \frac{\alpha_{r,\infty}}{r!}\left( mA^{1/r}\right)^{r}m^{-r} \approx \frac{\alpha_{r,\infty}}{r!} \|f^{(r)}\|_{L^{1/r}(a,b)} m^{-r}, $$

as claimed.

Now, we show that for any Lm, rf such that \(\|f-L_{m,r}f\|_{L^{p}(a,b)}\lessapprox Cm^{-r}\) we have \(C\ge \frac {\alpha _{r,p}}{r!} \left \|f^{(r)}\right \|_{L^{1/(r+1/p)}(a,b)}\). To that end, we fix \(\ell \ge 1\) and define \(u_{i}=a+iH,\) where \(H=(b-a)/\ell ,\) and

$$ C_{i} = \min_{x\in[u_{i-1},u_{i}]}\left|f^{(r)}(x)\right|,\qquad 1\le i\le\ell. $$

Suppose that Lm, rf uses the partition a = x0 < ⋯ < xm = b. We can assume without loss of generality that for m we have \(\{u_{i}\}_{i=0}^{\ell }\subset \{x_{j}\}_{j=0}^{m}\). Indeed, since we keep \(\ell \) fixed, we can always add the points ui to a given partition without asymptotically increasing the error, as \(m\to +\infty \). Let li be such that \(x_{l_{i}}=u_{i},\) and \(m_{i}=l_{i}-l_{i-1},\) \(1\le i\le \ell \).

Consider first finite p. We have

$$ \begin{array}{@{}rcl@{}} \|f - L_{m,r}f\|_{L^{p}(u_{i-1},u_{i})}\!\!&\ge&\!\!\frac{\alpha_{r,p}}{r!} C_{i}\left( \sum\limits_{j=l_{i-1}+1}^{l_{i}}h_{j}^{rp+1}\!\right)^{1/p} \!\ge\!\frac{\alpha_{r,p}}{r!} C_{i}\left( \!m_{i}\!\left( \!\frac{H}{m_{i}}\!\right)^{rp+1}\right)^{1/p} \\ \!&=&\! \frac{\alpha_{r,p}}{r!} C_{i} H^{r+1/p}m_{i}^{-r}, \end{array} $$

as the sum above is minimized for \(h_{j}=H/m_{i}\) for all \(l_{i-1}+1\le j\le l_{i}\). Hence,

$$\|f-L_{m,r}f\|_{L^{p}(a,b)} \ge \frac{\alpha_{r,p}}{r!} H^{r+1/p} \left( \sum\limits_{i=1}^{\ell}\left( \frac{C_{i}}{{m_{i}^{r}}}\right)^{p}\right)^{1/p}. $$

The minimization of the last sum with respect to \({\sum }_{i=1}^{\ell } m_{i}=m\) gives the optimal

$$m_{i}^{*} = \frac{C_{i}^{1/(r+1/p)}}{{\sum}_{j=1}^{\ell} C_{j}^{1/(r+1/p)}} m,$$

for which

$$\left( \sum\limits_{i=1}^{\ell}\left( \frac{C_{i}}{(m_{i}^{*})^{r}}\right)^{p}\right)^{1/p} = \left( \sum\limits_{i=1}^{\ell} C_{i}^{1/(r+1/p)}\right)^{r+1/p}m^{-r}.$$

Hence,

$$ \begin{array}{@{}rcl@{}} \|f-L_{m,r}f\|_{L^{p}(a,b)} &\ge& \frac{\alpha_{r,p}}{r!} H^{r+1/p}\left( \sum\limits_{i=1}^{\ell} C_{i}^{1/(r+1/p)}\right)^{r+1/p}m^{-r} \\ &=& \frac{\alpha_{r,p}}{r!} \left( \sum\limits_{i=1}^{\ell} HC_{i}^{1/(r+1/p)}\right)^{r+1/p}m^{-r}. \end{array} $$

The last sum in the parentheses is a Riemann sum for the integral \({{\int \limits }_{a}^{b}}|f^{(r)}(x)|^{1/(r+1/p)}\mathrm dx\). Hence, taking \(\ell \) and m sufficiently large, we can make \(m^{r}\|f-L_{m,r}f\|_{L^{p}(a,b)}\) arbitrarily close to \(\frac {\alpha _{r,p}}{r!}\left \|f^{(r)}\right \|_{L^{1/(r+1/p)}(a,b)}\).

For infinite p, we similarly have

$$\|f-L_{m,r}f\|_{L^{\infty}(a,b)} \ge \frac{\alpha_{r,\infty}}{r!}\max_{1\le i\le\ell} \left( C_{i} \max_{l_{i-1}+1\le j\le l_{i}}{h_{j}^{r}}\right) \ge \frac{\alpha_{r,\infty}}{r!} H^{r} \max_{1\le i\le\ell}C_{i}m_{i}^{-r}. $$

The right-hand side is minimized by

$$m_{i}^{*} = \frac{C_{i}^{1/r}}{{\sum}_{j=1}^{\ell} C_{j}^{1/r}} m,$$

for which \(\max \limits _{1\le i\le \ell }C_{i}(m_{i}^{*})^{-r}=\left ({\sum }_{i=1}^{\ell } C_{i}^{1/r}\right )^{r}m^{-r}\). Hence,

$$\|f-L_{m,r}f\|_{L^{\infty}(a,b)} \ge \frac{\alpha_{r,\infty}}{r!} \left( \sum\limits_{i=1}^{\ell} HC_{i}^{1/r}\right)^{r}m^{-r}.$$

The proof completes the observation that the last sum in the parentheses is a Riemann sum for the integral \({{\int \limits }_{a}^{b}}\left |f^{(r)}(x)\right |^{1/r}\mathrm dx\). □

Remark 2

The error of approximation depends on the points \(t_{i}\) via αr, p. Recall that for \(p\in \{1,2,+\infty \}\) this factor is minimized by taking the points \(t_{i}^{*}\) to be the zeros of appropriate orthogonal polynomials, adjusted to the interval [0,1]. For p = 1 these are Chebyshev polynomials of the second kind, for p = 2 these are Legendre polynomials, and for \(p=+\infty \) these are Chebyshev polynomials of the first kind. Consider, for example, r = 4. Then, the optimal points are as follows. For p = 1,

$$ \begin{array}{@{}rcl@{}} t_{1}^{*}=\frac12\left( 1+\cos\left( \frac{4\pi}{5}\right)\right),\qquad t_{2}^{*}=\frac12\left( 1+\cos\left( \frac{3\pi}{5}\right)\right),\\ t_{3}^{*}=\frac12\left( 1+\cos\left( \frac{2\pi}{5}\right)\right),\qquad t_{4}^{*}=\frac12\left( 1+\cos\left( \frac{\pi}{5}\right)\right). \end{array} $$

For p = 2,

$$ \begin{array}{@{}rcl@{}} t_{1}^{*}=\frac12\left( 1-\sqrt{\frac{15+2\sqrt{30}}{35}} \right),\qquad t_{2}^{*}=\frac12\left( 1-\sqrt{\frac{15-2\sqrt{30}}{35}} \right),\\ t_{3}^{*}=\frac12\left( 1+\sqrt{\frac{15-2\sqrt{30}}{35}} \right),\qquad t_{4}^{*}=\frac12\left( 1+\sqrt{\frac{15+2\sqrt{30}}{35}} \right). \end{array} $$

For \(p=+\infty ,\)

$$ \begin{array}{@{}rcl@{}} t_{1}^{*}=\frac12\left( 1+\cos\left( \frac{7\pi}{8}\right)\right),\qquad t_{2}^{*}=\frac12\left( 1+\cos\left( \frac{5\pi}{8}\right)\right),\\ t_{3}^{*}=\frac12\left( 1+\cos\left( \frac{3\pi}{8}\right)\right),\qquad t_{4}^{*}=\frac12\left( 1+\cos\left( \frac{\pi}{8}\right)\right). \end{array} $$

Table 1 shows the corresponding values of αr, p for the optimal points \(t_{i}^{*}\) and, for comparison, for the equispaced points \(t_{i}=(i-1)/(r-1),\) \(1\le i\le r\).

Table 1 The values of α4, p for the optimal and equispaced choices of the tis

Remark 3

We have shown that the optimal partition is asymptotically better than the uniform partition by the factor of

$$ R_{r,p}(f) = \frac{(b-a)^{r}\left\| f^{(r)}\right\|_{L^{p}(a,b)}}{\left\| f^{(r)}\right\|_{L^{1/(r+1/p)}(a,b)}}.$$

We obviously have that

$$1\le R_{r,p}(f)<+\infty,$$

where equality holds if f is a polynomial of degree r, and the more f(r) varies, the bigger Rr, p(f) is. An example is provided in Table 2.

Table 2 The values of R4, p(f) for \(f(x)=1/(x+10^{-d}),\) 0 ≤ x ≤ 1

Now we want to see how much we potentially lose by not using the optimal partition. For the error to go to zero as \(m\to +\infty \) we have to assume that the partitions satisfy

$$\lim_{m\to\infty} \max\left\{h_{j}: 1\le j\le m, f^{(r)}(\eta_{j})\ne 0\right\}=0.$$

Let \(A_{j}=h_{j}^{r+1/p}\left |f^{(r)}(\eta _{j})\right |\) and A = (A1,…, Am). Denoting \(\|\mathbf A\|_{\infty }=\max \limits _{1\le j\le m}|A_{j}|\) and \(\|\mathbf A\|_{q}=\left ({\sum }_{j=1}^{m}|A_{j}|^{q}\right )^{1/q}\) for \(0<q<+\infty ,\) we have that

$$\|\mathbf A\|_{\frac{1}{r+1/p}}\approx\left( {{\int}_{a}^{b}}\left|f^{(r)}(x)\right|^{1/(r+1/p)}\mathrm dx\right)^{r+1/p}$$

as \(m\to +\infty \). The error satisfies

$$ \|f-L_{m,r}f\|_{L^{p}(a,b)} = \frac{\alpha_{r,p}}{r!} \|\mathbf A\|_{p} \approx K_{m} \|f-L_{m,r}^{*}f\|_{L^{p}(a,b)}, $$
(9)

where

$$ K_{m} = \frac{\|\mathbf A\|_{p} m^{r}}{\|\mathbf A\|_{\frac1{r+1/p}}}. $$

Obviously, Km ≥ 1, and for the optimal partition Km = 1.

Let us check how big Km can be assuming that for all m sufficiently large we have

$$ \max_{1\le i,j\le m}A_{i}/A_{j}\le{\Omega}, $$
(10)

where Ω > 1 and 0/0 = 1. Since Km is a homogeneous function of A, we can assume without loss of generality that 1 ≤ Ai ≤ Ω for all i. It is clear that then the maximum is attained at A = (Ω,…,Ω,1,…,1), where Ω is repeated k times, for some k. If \(p=+\infty \), then the maximum is for k = 1 and

$$\max_{\mathbf A} K_{m} = \frac{\Omega m^{r}}{\left( {\Omega}^{1/r}+(m-1)\right)^{r}} \approx {\Omega}. $$

Let \(1\le p<+\infty \). Then, setting q = 1/(r + 1/p) we have

$$ K_{m} = \frac{\left( k({\Omega}^{p}-1)+m\right)^{1/p}}{\left( k\left( {\Omega}^{q}-1\right)+m\right)^{1/q}} m^{r}. $$

We treat Km as a function of k ∈ [0, m] and find its maximum. The maximum is for

$$ k^{*} = \left( \frac{q}{{\Omega}^{q}-1}-\frac{p}{{\Omega}^{p}-1}\right)\left( \frac{m}{p-q}\right); $$

therefore

$$ \begin{array}{@{}rcl@{}} \max_{\mathbf A}K_{m} &\le & \frac{(1-q/p)^{1/q}}{(p/q-1)^{1/p}} \frac{({\Omega}^{p}-1)^{1/q}}{({\Omega}^{q}-1)^{1/p}} ({\Omega}^{p}-{\Omega}^{q})^{1/p-1/q} \\ &=& \frac{(pr)^{r}}{(1+pr)^{r+1/p}} \frac{({\Omega}^{p}-1)^{r+1/p}}{({\Omega}^{1/(r+1/p)}-1)^{1/p}} \left( {\Omega}^{p}-{\Omega}^{1/(r+1/p)}\right)^{-r} < {\Omega}. \end{array} $$

Remark 4

Especially important will be the case where

$${\Omega}=2^{r+1/p}.$$

Then, Km in (9) is bounded from above by \(\kappa _{r,\infty }=2^{r}\) for \(p=+\infty ,\) and

$$ \kappa_{r,p} = \left( 1+\frac{1}{2^{1+pr}-2}\right)^{r}\left( 2^{1+pr}-1\right)^{1/p}\frac{(pr)^{r}}{(1+pr)^{r+1/p}} \quad\text{for}\quad 1\le p<+\infty. $$
(11)

The values of κr, p for \(p=1,2,\infty \) and 1 ≤ r ≤ 6 are given in Table 3.

Table 3 The values of κr, p for various p and r

4 An algorithm for (almost) optimal partitions

In this section, we show how asymptotically (almost) optimal partitions can be practically realized for a given m and fCr([a, b]). We allow algorithms that can evaluate f at any x ∈ [a, b].

Let us fix another point t0 ∈ [0,1] that is different from the points ti in (1) for 1 ≤ i ≤ r. For an interval I = [c, d] ⊂ [a, b] of length h = d − c, define the functional \(\mathcal L_{I}:C^{r}([a,b])\to \mathbb R,\)

$$ \mathcal L_{I}(f) = f(u_{0})-\sum\limits_{i=1}^{r} w_{i} f(u_{i}),\quad\text{where}\quad w_{i} = \prod\limits_{i\ne k=1}^{r}\frac{t_{0}-t_{k}}{t_{i}-t_{k}}$$

and ui = c + hti, 0 ≤ i ≤ r.

Remark 5

Observe that for each interval I the functional \(\mathcal L_{I}\) is uniquely (up to a multiplicative factor) defined by the conditions that it linearly combines the values of f at ui for 0 ≤ i ≤ r, and its kernel consists of all polynomials of degree at most r − 1. In our definition, \(\mathcal L_{I}(f)\) is just the error of interpolating f in I at u0, but it could equally well be the divided difference f[u0, u1,…, ur], for we have

$$ \mathcal L_{I}(f)=h^{r}\gamma_{r}f[u_{0},u_{1},\ldots,u_{r}],\quad\text{where}\quad \gamma_{r}=P_{r}(t_{0})=(t_{0}-t_{1})\cdots(t_{0}-t_{r}). $$
(12)

The algorithm that we present and analyze in this section uses a priority queue S whose elements are subintervals. For each \(I\in S\) of length h, its priority is given by

$$p_{f}(I) = h^{1/p}|\mathcal L_{I}(f)|.$$

In the following pseudocode, insert(S, I) and I := extract_max(S) denote the actions of inserting an interval into S and extracting from S an interval with the highest priority.

ALGORITHM [pseudocode figure]

After execution, the elements of S form a partition into m subintervals. Since a priority queue can be implemented using a heap, an mth partition can be obtained at cost proportional to \(m\log m\).
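The pseudocode itself appears in the source only as a figure. As a hedged reconstruction from the surrounding description (function and variable names are ours), the strategy can be sketched in Python with a heap-based priority queue, repeatedly halving the subinterval of highest priority \(p_{f}(I)=h^{1/p}|\mathcal L_{I}(f)|\):

```python
import heapq
from math import prod  # Python 3.8+

def L_I(f, c, d, t):
    """The functional L_I(f): the error at u0 = c + h*t[0] of the Lagrange
    interpolant of f through the nodes u_i = c + h*t[i], i = 1..r."""
    h = d - c
    u = [c + h * ti for ti in t]              # u[0] is the extra point t0
    r = len(t) - 1
    val = f(u[0])
    for i in range(1, r + 1):
        w_i = prod((t[0] - t[k]) / (t[i] - t[k])
                   for k in range(1, r + 1) if k != i)
        val -= w_i * f(u[i])
    return val

def adaptive_partition(f, a, b, m, t, p=float("inf")):
    """Greedy halving driven by the priority p_f(I) = h^{1/p} |L_I(f)|.
    Returns the m+1 breakpoints of the resulting partition of [a, b]."""
    inv_p = 0.0 if p == float("inf") else 1.0 / p
    prio = lambda c, d: (d - c) ** inv_p * abs(L_I(f, c, d, t))
    heap = [(-prio(a, b), a, b)]              # negate: heapq is a min-heap
    while len(heap) < m:
        _, c, d = heapq.heappop(heap)         # subinterval of highest priority
        mid = (c + d) / 2.0
        heapq.heappush(heap, (-prio(c, mid), c, mid))
        heapq.heappush(heap, (-prio(mid, d), mid, d))
    return sorted(c for _, c, _ in heap) + [b]
```

For f(x) = 1/(x + 1/100), r = 2, and (t0, t1, t2) = (1/2,0,1), the produced breakpoints cluster near 0, where f″ is large, as the theory predicts.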

Denote by \(L_{m,r}^{**}f\) the approximation corresponding to the mth partition obtained by our algorithm. Recall that αr, p and κr, p are given by (4) and (11), respectively.

Theorem 1

If the function \(f\in C^{r}([a,b])\) is such that its derivative f(r) does not vanish in [a, b], then

$$ \|f-L_{m,r}^{**}f\|_{L^{p}(a,b)} \lessapprox \kappa_{r,p} \frac{\alpha_{r,p}}{r!} \|f^{(r)}\|_{L^{1/(r+1/p)}(a,b)} m^{-r}.$$

Proof

In view of the definition of κr, p in (11) of Remark 4, it suffices to show that the value of Ω in (10) can be chosen arbitrarily close to 2r+ 1/p, provided m is large enough.

Suppose that f(r) > 0 (the case f(r) < 0 is symmetric and there is no need to consider it separately). Then, there are \(0<d\le D<+\infty \) depending on f such that

$$d\le f^{(r)}\le D.$$

For an interval I of length h, its priority can be written as

$$p_{f}(I)=\frac{|\gamma_{r}|}{r!}h^{r+1/p}f^{(r)}(\xi),\qquad\xi\in I,$$

where γr is defined in (12). This means that

$$d \frac{|\gamma_{r}|}{r!}h^{r+1/p} \le p_{f}(I) \le D \frac{|\gamma_{r}|}{r!}h^{r+1/p};$$

i.e., the priority is always positive, and it decreases to zero when an interval is successively subdivided. This leads to an important observation that the maximum length of a subinterval in an m th partition goes to zero as \(m\to +\infty \). Furthermore, if the interval I is further subdivided into I1 and I2, then for s = 1,2 we have

$$ \frac{p_{f}(I)}{p_{f}(I_{s})} = 2^{r+1/p}\frac{f^{(r)}(\xi)}{f^{(r)}(\xi_{s})}.$$

Since

$$ \left|\frac{f^{(r)}(\eta)}{f^{(r)}(\xi)}-1\right| = \frac{\left|f^{(r)}(\eta)-f^{(r)}(\xi)\right|}{f^{(r)}(\xi)} \le \frac{\omega(h)}{d}, $$
(13)

where ω is the modulus of continuity of the function f(r),

$$\frac{p_{f}(I)}{p_{f}(I_{s})} \le 2^{r+1/p}\left( 1+\frac{\omega(h)}{d}\right).$$

This in turn means that for any δ > 0 there is mδ such that for all m > mδ the ratio of the highest to lowest priorities in the m th partition is

$$ \max_{1\le i\le m} \frac{p_{f}([x_{i-1},x_{i}])}{p_{f}([x_{j-1},x_{j}])} \le 2^{r+1/p}\left( 1+\frac{\omega(\delta)}{d}\right). $$
(14)

(Indeed, mδ is such that the lengths of all subintervals in the corresponding partition are at most δ, and such that after dividing the subinterval with the highest priority, one of its successors has the lowest priority).

Consider now the partition for a particular m ≥ mδ. Since the local error in an interval I can be written as

$$ \|f-L_{m,r}^{**}f\|_{L^{p}(I)} = \frac{\alpha_{r,p}}{r!} h^{r+1/p}f^{(r)}(\eta) = p_{f}(I) \frac{\alpha_{r,p}}{|\gamma_{r}|} \frac{f^{(r)}(\eta)}{f^{(r)}(\xi)}, $$
(15)

by (13) and (14) we have that the ratio of any two local errors is upper bounded by

$$ \begin{array}{@{}rcl@{}} \frac{\|f-L_{m,r}^{**}f\|_{L^{p}([x_{i-1},x_{i}])}}{\|f-L_{m,r}^{**}f\|_{L^{p}([x_{j-1},x_{j}])}} &=& \frac{p_{f}([x_{i-1},x_{i}])}{p_{f}([x_{j-1},x_{j}])} \left( \frac{f^{(r)}(\eta_{i})/f^{(r)}(\xi_{i})}{f^{(r)}(\eta_{j})/f^{(r)}(\xi_{j})}\right) \\&\le& 2^{r+1/p} \frac{(1+\omega(\delta)/d)^{2}}{1-\omega(\delta)/d}. \end{array} $$

Since δ can be arbitrarily small, the right-hand side can be made arbitrarily close to 2r+ 1/p, as claimed. □

Example 1

Figure 1 shows results of a numerical experiment for regularity r = 4. The tested function is

$$f(x)=\frac{1}{x+\tfrac1{100}},\qquad 0\le x\le 1,$$

for which the 4th derivative is positive. We compare approximations based on the adaptive partitions obtained from ALGORITHM with those based on the uniform (nonadaptive) partitions. In this and all the numerical examples that follow, we take t0 = 0.5 and the optimal points \(t_{1}^{*},t_{2}^{*},t_{3}^{*},t_{4}^{*},\) cf. Remark 2. The errors are measured in the Lp norms with \(p\in \{1,2,+\infty \}\). The results perfectly confirm the theoretical findings. (An artifact for \(p=+\infty ,\) in the case of adaptive partitions and m close to \(10^{4},\) is a consequence of round-off errors, which show up earlier for \(p=+\infty \) than for p = 1,2.)

Fig. 1 Adaptive vs. nonadaptive strategies for the function f

Remark 6

In this paper, we consider the algorithm error versus the number m of subintervals. One may want to consider the error versus the number n of function values used. Then the choice of the equispaced points ti = (i − 1)/(r − 1) may lead to a better asymptotic constant than the choice of \(t_{i}^{*},\) 1 ≤ i ≤ r, despite the fact that the factor αr, p is in this case slightly larger, cf. Table 1. Consider, for instance, our algorithm for r = 4. If the points \(t_{i}^{*}\) are applied, then the algorithm produces an mth partition using n ≈ 10m function values. On the other hand, for the equispaced points ti we have n ≈ 4m, since all 5 function values computed for a given subinterval can be re-used when halving this interval in one of the following steps.

Unfortunately, Theorem 1 does not hold for all f satisfying f(r) ≥ 0 or f(r) ≤ 0. Indeed, suppose that \(\hat t:=\max \limits _{0\le i\le r}t_{i}<1\) and consider the function \(f(x)=(x-\hat t )_{+}^{r+1}\) for x ∈ [0,2]. Then, pf([0,1]) = 0 and pf(I) > 0 for all intervals I ⊂ [1,2]. Hence, the interval [0,1] will never be subdivided and the error does not go to zero as \(m\to +\infty \).

A key point in this example is that the set of points ti, 0 ≤ i ≤ r, does not contain both endpoints of the interval [0,1]. If this obstacle is removed, then Theorem 1 holds true for all functions such that f(r) does not change its sign. To show this, we need the following auxiliary result.

Lemma 1

Let \(1\le p\le +\infty \). Let

$$ \min(t_{0},t_{1})=0\quad\text{and}\quad\max(t_{0},t_{r})=1. $$
(16)

Then, there exists βr, p > 0 (given, e.g., by (19)) such that the following holds. For any interval [c, d] of length h = dc and any function gCr([c, d]) such that

(i) the derivative g(r) does not change its sign in [c, d], and

(ii) g vanishes at ui = c + tih for all 1 ≤ i ≤ r,

we have

$$ \|g\|_{L^{p}(c,d)}\le\beta_{r,p} h^{1/p} \left|g(u_{0})\right|\quad\text{where}\quad u_{0}=c+t_{0}h. $$
(17)

In particular, if g(u0) = 0, then g vanishes on the whole interval [c, d].

Proof

Assume without loss of generality that g(r) ≥ 0. We estimate g(u) for u different from any of the points ui. We have two cases: either u < u0 or u > u0.

If u < u0 then, by the explicit formula for divided differences, we have

$$ \begin{array}{@{}rcl@{}} g[u_{1},u_{2},\ldots,u_{r},u] &=& \frac{g(u)}{{\prod}_{i=1}^{r}(u-u_{i})} \ge 0,\qquad\text{and} \\ g[u_{0},u_{2},\ldots,u_{r},u] &=& \frac{g(u)}{(u-u_{0}){\prod}_{i=2}^{r}(u-u_{i})}+ \frac{g(u_{0})}{(u_{0}-u){\prod}_{i=2}^{r}(u_{0}-u_{i})} \ge 0 \\ \end{array} $$
(18)

Combining both inequalities we get that if \({\prod }_{i=1}^{r}(u-u_{i})>0\) then

$$ 0 \le g(u) \le \ell_{1}(u)g(u_{0})\quad\text{where}\quad\ell_{1}(u)=\prod\limits_{i=2}^{r}\frac{u-u_{i}}{u_{0}-u_{i}}.$$

On the other hand, if \({\prod }_{i=1}^{r}(u-u_{i})<0\), then 1(u)g(u0) ≤ g(u) ≤ 0.

In the case u > u0, we similarly combine (18) with

$$ g[u_{1},\ldots,u_{r-1},u_{0},u] = \frac{g(u)}{(u-u_{0}){\prod}_{i=1}^{r-1}(u-u_{i})}+ \frac{g(u_{0})}{(u_{0}-u){\prod}_{i=1}^{r-1}(u_{0}-u_{i})} \ge 0 $$

to get that either

$$ 0 \le g(u) \le \ell_{r}(u)g(u_{0})\quad\text{where}\quad\ell_{r}(u)=\prod\limits_{i=1}^{r-1}\frac{u-u_{i}}{u_{0}-u_{i}},$$

or r(u)g(u0) ≤ g(u) ≤ 0.

Thus,

$$ |g(u)| \le |\ell(u)g(u_{0})|\quad\text{where}\quad \ell(u)=\ell_{1}(u)\mathbf 1_{[c,u_{0})}(u)+\ell_{r}(u)\mathbf 1_{[u_{0},d]}(u),$$

and \(\|g\|_{L^{p}(c,d)}\le \|\ell \|_{L^{p}(c,d)}|g(u_{0})|\). Letting \(l(t)=l_{1}(t)\mathbf 1_{[0,t_{0})}(t)+l_{r}(t)\mathbf 1_{[t_{0},1]}(t),\) where

$$ l_{1}(t) = \prod\limits_{i=2}^{r}\frac{t-t_{i}}{t_{0}-t_{i}},\qquad l_{r}(t) = \prod\limits_{i=1}^{r-1}\frac{t-t_{i}}{t_{0}-t_{i}},$$

and applying the substitution u = c + th, we finally obtain that \(\|\ell \|_{L^{p}(c,d)}=h^{1/p}\|l\|_{L^{p}(0,1)};\) hence, the lemma holds with

$$ \beta_{r,p} = \| l \|_{L^{p}(0,1)}. $$
(19)
□

Remark 7

An important consequence of Lemma 1 is that for any subinterval I of a given partition we have

$$ \|f-L_{m,r}f\|_{L^{p}(I)} \le \beta_{r,p} p_{f}(I). $$
(20)

Indeed, it suffices to take g = fLm, rf in Lemma 1 and recall the definition of pf(I). If so, then for any partition we have

$$ \begin{array}{@{}rcl@{}} \|f-L_{m,r}f\|_{L^{p}(a,b)} &\le& \beta_{r,p}\left( \sum\limits_{i=1}^{m}\left|p_{f}(I_{i})\right|^{p}\right)^{1/p}, \quad 1\le p<+\infty,\\ \|f-L_{m,r}f\|_{L^{\infty}(a,b)} &\le& \beta_{r,\infty}\max_{1\le i\le m}p_{f}(I_{i}), \end{array} $$

which means that the inequality (20) allows us to control the exact error of approximation. For instance, if r = 2 and (t0, t1, t2) = (1/2,0,1), then \(\beta _{r,p}=\|l(t)\|_{L^{p}(0,1)},\) where l(t) = 1 + 2|t − 1/2|. We have β2,1 = 1.5, \(\beta _{2,\infty }=2,\) and

$$\beta_{2,p} = 2^{1+1/p}(p+1)^{-1/p}\left( 1-2^{-(p+1)}\right)^{1/p}\quad \text{for}\quad 1<p<+\infty.$$

We stress that the exact inequalities hold true only under the assumptions of Lemma 1. The values of βr, p given by (19) are by no means best possible. Optimization of βr, p is a separate problem and is not addressed in the present paper.
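The closed form for β2, p above is easy to verify numerically (our snippet; the helper names are ours) against the definition (19) with l(t) = 1 + 2|t − 1/2|:

```python
def beta2_closed(p: float) -> float:
    """Closed form for beta_{2,p} quoted above (it also holds at p = 1)."""
    if p == float("inf"):
        return 2.0
    return (2.0 ** (1.0 + 1.0 / p) * (p + 1.0) ** (-1.0 / p)
            * (1.0 - 2.0 ** (-(p + 1.0))) ** (1.0 / p))

def beta2_numeric(p: float, n: int = 200_000) -> float:
    """Midpoint-rule evaluation of ||l||_{L^p(0,1)} for l(t) = 1 + 2|t - 1/2|."""
    l = lambda t: 1.0 + 2.0 * abs(t - 0.5)
    if p == float("inf"):
        return max(l(0.0), l(1.0))        # l attains its maximum at 0 and 1
    s = sum(l((k + 0.5) / n) ** p for k in range(n)) / n
    return s ** (1.0 / p)
```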

Theorem 2

Let the assumption (16) of Lemma 1 be fulfilled. Then, the error estimate of Theorem 1 holds true if the derivative f(r) does not change its sign in [a, b].

Proof

Choose 0 < 𝜖 < ∥f(r)C([a, b]). For a given m, define

$$ \begin{array}{@{}rcl@{}} \mathcal I_{0} &=& \{ 1\le i\le m: \|f^{(r)}\|_{C(I_{i})}=0 \}, \\ \mathcal I_{1} &=& \{ 1\le i\le m: 0<\|f^{(r)}\|_{C(I_{i})}<\epsilon \}, \\ \mathcal I_{2} &=& \{ 1\le i\le m: \|f^{(r)}\|_{C(I_{i})}\ge\epsilon \}, \end{array} $$

where Ii is the i th subinterval. We assume that m is large enough, m ≥ m𝜖, so that \(\mathcal I_{2}\ne \emptyset \) and the modulus of continuity of f(r) at

$$\max\left\{ |I_{i}|: i\in\mathcal I_{1}\cup\mathcal I_{2}\right\},$$

denoted by ω, is smaller than 𝜖. Such an m𝜖 exists since by Lemma 1 we have pf(Ii) > 0 for \(i\in \mathcal I_{1}\cup \mathcal I_{2},\) which implies that the maximum length of such subintervals decreases to zero as \(m\to +\infty \).

Let

$$p_{f}^{*} = \max_{1\le i\le m}p_{f}(I_{i}).$$

Then, for \(i\in \mathcal I_{0}\), we have

$$p_{f}^{*} \le \frac{|\gamma_{r}|}{r!}(2h_{i})^{r+1/p}\omega,$$

since otherwise the predecessor of Ii would not be subdivided. This implies

$$ h_{i} \ge \frac{1}{2}\left( \frac{r!}{|\gamma_{r}|} \frac{p_{f}^{*}}{\omega}\right)^{1/(r+1/p)}. $$
(21)

For the same reason, for \(i\in \mathcal I_{1}\) we have

$$p_{f}^{*} \le \frac{|\gamma_{r}|}{r!}(2h_{i})^{r+1/p}(\epsilon+\omega),$$

which implies

$$ h_{i} \ge \frac1{2}\left( \frac{r!}{|\gamma_{r}|} \frac{p_{f}^{*}}{\epsilon+\omega}\right)^{1/(r+1/p)}. $$
(22)

For \(i\in \mathcal I_{2}\) we have in turn

$$p_{f}^{*} \ge \frac{|\gamma_{r}|}{r!}h_{i}^{r+1/p}(\epsilon-\omega),$$

which implies

$$ h_{i} \le \left( \frac{r!}{|\gamma_{r}|} \frac{p_{f}^{*}}{\epsilon-\omega}\right)^{1/(r+1/p)}. $$
(23)

Now, let \(m_{k}=\#\mathcal I_{k}\) and \(B_{k}=\cup _{i\in \mathcal I_{k}}I_{i},\) k = 0,1,2. Obviously m = m0 + m1 + m2 and [a, b] = B0B1B2. Using (21), (22), (23), we get that

$$ \begin{array}{@{}rcl@{}} m_{0} &\le& 2 |B_{0}|\left( \frac{|\gamma_{r}| \omega}{r! p_{f}^{*}}\right)^{1/(r+1/p)}, \\ m_{1} &\le& 2 |B_{1}|\left( \frac{|\gamma_{r}| (\epsilon+\omega)}{r! p_{f}^{*}}\right)^{1/(r+1/p)}, \\ m_{2} &\ge& |B_{2}|\left( \frac{|\gamma_{r}| (\epsilon-\omega)}{r! p_{f}^{*}}\right)^{1/(r+1/p)}. \end{array} $$

Hence,

$$ \lim_{m\to\infty}\frac{m_{0}}{m_{2}} \le 2 \lim_{m\to\infty}\frac{|B_{0}|}{|B_{2}|} \left( \frac{\omega}{\epsilon-\omega}\right)^{1/(r+1/p)} = 0, $$
(24)

where the last equality follows from the fact that, as \(m\to +\infty \), ω tends to zero, |B0| monotonically increases to \(|\overline B_{0}|,\) where \(\overline B_{0}=\{x\in [a,b]: f^{(r)}(x)=0\},\) and |B2| monotonically decreases to \(|\overline B_{2}|>0,\) where \(\overline B_{2}=\{x\in [a,b]: |f^{(r)}(x)|\ge \epsilon \}\). We also have that

$$ \limsup_{m\to\infty}\frac{m_{1}}{m_{2}} \le 2 \lim_{m\to\infty}\frac{|B_{1}|}{|B_{2}|} \left( \frac{\epsilon+\omega}{\epsilon-\omega}\right)^{1/(r+1/p)} = 2 \frac{|\overline B_{1}|}{|\overline B_{2}|}, $$
(25)

where \(\overline B_{1}=\{x\in [a,b]: 0<|f^{(r)}(x)|<\epsilon \}\). Note that the right-hand side of this inequality goes to zero when 𝜖 → 0+.

We now estimate the error of our approximation. Obviously \(\|f-L_{m,r}^{**}f\|_{L^{p}(B_{0})}=0\). From (20) it follows that

$$\|f-L_{m,r}^{**}f\|_{L^{p}(B_{1})} \le \beta_{r,p}p_{f}^{*}m_{1}^{1/p}.$$

For B2 we use (14) and (15) to get that

$$\|f-L_{m,r}^{**}f\|_{L^{p}(B_{2})} \gtrapprox \frac{\alpha_{r,p}}{2^{r+1/p}|\gamma_{r}|} p_{f}^{*} m_{2}^{1/p}.$$

In view of (25), this means that the error on B1 vanishes compared to that on B2 when 𝜖 → 0+.

Since for xB2 the derivative f(r) is bounded away from zero, we can use Theorem 1 together with (24) and (25) to obtain that

$$\limsup_{m\to+\infty} m^{r}\|f-L_{m,r}^{**}f\|_{L^{p}(B_{2})} \le \kappa_{r,p}\frac{\alpha_{r,p}}{r!}\|f^{(r)}\|_{L^{1/(r+1/p)}(\overline B_{2})} \left( 1+2 \frac{|\overline B_{1}|}{|\overline B_{2}|}\right).$$

Taking the limit of both sides of this inequality as 𝜖 → 0+ and using the fact that the error on B2 then dominates the error on the remaining part of the interval [a, b], we finally conclude that

$$ \limsup_{m\to+\infty} m^{r}\|f-L_{m,r}^{**}f\|_{L^{p}(a,b)} \le \kappa_{r,p}\frac{\alpha_{r,p}}{r!}\|f^{(r)}\|_{L^{1/(r+1/p)}(a,b)}. $$

The proof is complete. □

Now we want to relax the requirement that the derivative f(r) does not change its sign. It is clear that our original algorithm may then fail since, again, we may have pf(I) = 0 for an interval I in which f(r)≠ 0, and such an interval will never be subdivided.

To obtain a result similar to that of Theorem 1 in this case, we generalize the priority function pf, leaving the algorithm itself unchanged. We also do not assume that the points ti for 0 ≤ i ≤ r contain 0 and 1. The modified priority uses a predefined nonincreasing function \(\delta :(0,+\infty )\to [0,+\infty )\) and is given by

$$ \overline p_{f}(I) = \max\left( p_{f}(I),\delta(h) h^{r+1/p}\right),$$

where h is the length of the interval I. Obviously, we always have \(\overline p_{f}(I)\ge p_{f}(I),\) and \(\overline p_{f}(I)=p_{f}(I)\) if δ(h) = 0. Hence, \(\overline p_{f}\) is indeed a generalization of pf.
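In code the modified priority is a one-line guard. A minimal sketch (the argument names are ours; `p_f_I` stands for an already computed value of pf(I)):

```python
def p_bar(p_f_I, h, delta, r, p):
    # Generalized priority: max(p_f(I), delta(h) * h^(r + 1/p))
    # for an interval I of length h and a nonincreasing function delta.
    return max(p_f_I, delta(h) * h ** (r + 1.0 / p))
```

For δ ≡ 0 this reduces to pf(I), while for δ > 0 it is strictly positive, so every interval eventually becomes eligible for subdivision even where pf vanishes.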

Denote the resulting approximation by \(L_{m,r}^{***}f\). The following theorem generalizes Theorem 1.

Theorem 3

Suppose that \(\lim _{h\to 0^{+}}\delta (h)=0\). If f ∈ Cr([a, b]) is such that its derivative f(r) does not vanish in [a, b], or the modulus of continuity of f(r), denoted ωf, satisfies

$$ \limsup_{h\to 0^{+}}\frac{\omega_{f}(h)}{\delta(h)} = 0, $$
(26)

then the error estimate of Theorem 1 holds true, i.e.,

$$\|f-L_{m,r}^{***}f\|_{L^{p}(a,b)} \lessapprox \kappa_{r,p} \frac{\alpha_{r,p}}{r!} \left\|f^{(r)}\right\|_{L^{1/(r+1/p)}(a,b)} m^{-r}.$$

Proof

If f(r) does not vanish, then for all sufficiently large m we have \(\overline p_{f}(I_{i})=p_{f}(I_{i})\) for any subinterval Ii in the m th partition, and the theorem follows from Theorem 1.

Assume (26). The fact that the priority function is always positive assures that

$$ h^{*}=\max_{1\le i\le m}h_{i}$$

decreases to zero as \(m\to +\infty \). For a given m, define

$$\mathcal I_{1} = \{1\le i\le m: \overline p_{f}(I_{i})>p_{f}(I_{i})\},\qquad \mathcal I_{2} = \{1\le i\le m: \overline p_{f}(I_{i})=p_{f}(I_{i})\}.$$

Let

$$\overline p_{f}^{*} = \max_{1\le i\le m}\overline p_{f}(I_{i}).$$

For \(i\in \mathcal I_{1}\) we have \(\frac {|\gamma _{r}|}{r!}|f^{(r)}(\xi _{i})|<\delta (h_{i})\) for some ξiIi, and

$$\overline p_{f}^{*} \le (2h_{i})^{r+1/p}\max\left( \frac{|\gamma_{r}|}{r!}\left|f^{(r)}(\xi_{i}^{\prime})\right|, \delta(2h_{i})\right),$$

since otherwise the predecessor of Ii (to which \(\xi _{i}^{\prime }\) belongs) would not be subdivided. We also have

$$\left|f^{(r)}(\xi_{i}^{\prime})\right| \le \left|f^{(r)}(\xi_{i})\right|+\omega_{f}(2h_{i}) \le \frac{r!}{|\gamma_{r}|} \delta(h_{i})+\omega_{f}(2h_{i}). $$

Hence, by (26), for all sufficiently large m we have \(\overline p_{f}^{*}\le (2h_{i})^{r+1/p}2\delta (2h_{i}),\) which implies that

$$ h_{i} \ge \frac{1}{2}\left( \frac{\overline p_{f}^{*}}{2\delta(2h^{*})}\right)^{1/(r+1/p)},$$

and the number \(m_{1}=\#\mathcal I_{1}\) is at most proportional to \(\left (\frac {\delta (2h^{*})}{\overline p_{f}^{*}}\right )^{1/(r+1/p)}\).

For \(i\in \mathcal I_{2}\) we have

$$\overline p_{f}^{*} \ge \frac{|\gamma_{r}|}{r!}h_{i}^{r+1/p}\left|f^{(r)}(\xi_{i})\right|.$$

Let 0 < 𝜖 < ∥f(r)∥C([a, b]) and

$$\mathcal I_{2}^{\prime} = \left\{i\in\mathcal I_{2}: |f^{(r)}(x)|\ge\epsilon\text{ for all }x\in I_{i}\right\}. $$

Then, the set \(B_{2}^{\prime }=\cup _{i\in \mathcal I_{2}^{\prime }}I_{i}\) is nonempty for large m and nondecreasing as m increases. Hence, for \(i\in \mathcal I_{2}^{\prime }\) we have

$$ h_{i} \le \left( \frac{r!}{|\gamma_{r}|} \frac{\overline p_{f}^{*}}{\epsilon}\right)^{1/(r+1/p)},$$

which implies that the number \(m_{2}=\#\mathcal I_{2}\) is at least proportional to \(\left (\frac {\epsilon }{\overline p_{f}^{*}}\right )^{1/(r+1/p)}\).

Thus, we have shown that

$$ \lim_{m\to\infty}\frac{m_{1}}{m_{2}}=0. $$
(27)

To estimate the error, observe that for \(i\in \mathcal I_{1}\) we have

$$ \begin{array}{@{}rcl@{}} \|f-L_{m,r}^{***}f\|_{L^{p}(I_{i})} &=&\frac{\alpha_{r,p}}{r!}h_{i}^{r+1/p}\left|f^{(r)}(\eta_{i})\right| \le \frac{\alpha_{r,p}}{r!}h_{i}^{r+1/p}\left( \frac{r!}{|\gamma_{r}|}\delta(h_{i})+\omega_{f}(h_{i})\right) \\&\lessapprox& \frac{\alpha_{r,p}}{|\gamma_{r}|} \overline p_{f}^{*},\end{array} $$

which implies

$$\|f-L_{m,r}^{***}f\|_{L^{p}(B_{1})} \lessapprox \frac{\alpha_{r,p}}{|\gamma_{r}|} \overline p_{f}^{*}m_{1}^{1/p}, \qquad B_{1}=\cup_{i\in\mathcal I_{1}}I_{i}.$$

For \(i\in \mathcal I_{2}\) we use the condition (26) to claim, as in the proof of Theorem 2, that

$$\|f-L_{m,r}^{***}f\|_{L^{p}(B_{2})} \gtrapprox \frac{\alpha_{r,p}}{2^{r+1/p}|\gamma_{r}|} \overline p_{f}^{*} m_{2}^{1/p}, \qquad B_{2}=\cup_{i\in\mathcal I_{2}}I_{i}.$$

In view of (27), the error on B2 dominates the error on B1. Moreover, from (26) it follows that the value of Ω in (10) with i, j restricted to those in \(\mathcal I_{2}\) is asymptotically at most 2r+ 1/p. Hence,

$$ \begin{array}{@{}rcl@{}} \lefteqn{\limsup_{m\to+\infty}m^{r} \|f-L_{m,r}^{***}f\|_{L^{p}(a,b)} = \limsup_{m\to+\infty} {m_{2}^{r}} \|f-L_{m,r}^{***}f\|_{L^{p}(B_{2})}}\\ &&\lessapprox \kappa_{r,p}\frac{\alpha_{r,p}}{r!}\|f^{(r)}\|_{L^{1/(r+1/p)}(B_{2})} \le \kappa_{r,p}\frac{\alpha_{r,p}}{r!}\|f^{(r)}\|_{L^{1/(r+1/p)}(a,b)}, \end{array} $$

where the asymptotic inequality follows from Theorem 1. The proof is complete. □

Theorem 3 still does not cover the whole class of r-times continuously differentiable functions. The last theorem of this section does, at the expense of an asymptotic factor that depends on f.

Theorem 4

If δ(h) = δ0 > 0, then for all f ∈ Cr([a, b]) we have

$$ \|f-L_{m,r}^{***}f\|_{L^{p}(a,b)} \lessapprox \kappa_{r,p}\frac{\alpha_{r,p}}{r!} \left\|f^{(r)}_{\delta_{0}}\right\|_{L^{1/(r+1/p)}(a,b)} m^{-r}, $$

where \(f^{(r)}_{\delta _{0}}(x)=\max \limits \left (|f^{(r)}(x)|,\delta _{0}\right )\).

Proof

For any subinterval Ii of an m th partition we have

$$ \|f-L_{m,r}^{***}f\|_{L^{p}(I_{i})} = \frac{\alpha_{r,p}}{r!}h_{i}^{r+1/p}\left|f^{(r)}(\eta_{i})\right| \le \frac{\alpha_{r,p}}{r!}h_{i}^{r+1/p}f_{\delta_{0}}^{(r)}(\eta_{i}). $$

Since the maximum ratio of the highest to the lowest values of \(f_{\delta _{0}}^{(r)}(x)\) in the same subinterval goes to one as \(m\to +\infty ,\) the theorem follows directly from the proof of Theorem 1. □

Example 2

Consider the function

$$g(x)=\frac{\cos(100x)}{x+\tfrac1{100}},\qquad 0\le x\le 1,$$

for which the 4th derivative,

$$ \begin{array}{@{}rcl@{}} g^{(4)}(x) &=& -\frac{4 000 000 \sin(100x)}{(x+\tfrac1{100})^{2}} +\frac{2 400 \sin(100x)}{(x+\tfrac1{100})^{4}} \\ && +\frac{100 000 000 \cos(100x)}{(x+\tfrac1{100})} -\frac{120 000 \cos(100x)}{(x+\tfrac1{100})^{3}} +\frac{24 \cos(100x)}{(x+\tfrac1{100})^{5}}, \end{array} $$

changes its sign 32 times (see Fig. 2).
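This count can be checked numerically (a sketch; the uniform grid of 106 + 1 points is our choice, much finer than the oscillation scale π/100 of cos(100x)):

```python
import numpy as np

c = 1.0 / 100.0

def g4(x):
    # The fourth derivative of g(x) = cos(100x) / (x + 1/100), as given above.
    s, co = np.sin(100 * x), np.cos(100 * x)
    return (-4_000_000 * s / (x + c) ** 2 + 2_400 * s / (x + c) ** 4
            + 100_000_000 * co / (x + c) - 120_000 * co / (x + c) ** 3
            + 24 * co / (x + c) ** 5)

x = np.linspace(0.0, 1.0, 1_000_001)
sgn = np.sign(g4(x))
sign_changes = int(np.sum(sgn[:-1] * sgn[1:] < 0))  # 32 sign changes on [0, 1]
```

All zeros of g(4) are transversal and well separated relative to the grid spacing, so the grid-based count is reliable.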

Fig. 2 The graph of g(4)

In Fig. 3, we present the quality of the \(L^{\infty }\) approximation of g using ALGORITHM, for two extreme choices of the function δ: namely, δ(h) = 0 and δ(h) = 10^4. For comparison, we also include the corresponding error for the uniform subdivision.

For δ(h) = 0, i.e., for \(\overline p_{f}(I)=|\mathcal L_{I}(f)|,\) the error seems to decrease at speed m− 4, despite the fact that neither the assumptions of Theorem 3 nor those of Theorem 4 are fulfilled. However, the error fluctuates because of the difficulty of properly estimating the local errors in the intervals where g(4) changes its sign. Much better results are obtained for the “safe” choice δ(h) = 10^4, i.e., for \(\overline p_{f}(I)=\max \limits \left (|\mathcal L_{I}(f)|,(10h)^{4}\right ),\) for which Theorem 4 applies.

Fig. 3 Adaptive vs. nonadaptive strategies for the function g

5 Automatic approximation

In this section, we deal with automatic approximation. Ideally, we should have a procedure that for a given function f and an error threshold ε > 0 returns a partition, for which the corresponding approximation, say \(\mathcal A(f,\varepsilon ),\) satisfies

$$ \|f-\mathcal A(f,\varepsilon)\|_{L^{p}(a,b)} \le \varepsilon. $$
(28)

Obviously, such a procedure does not exist if it is supposed to work for all fCr([a, b]) and ε > 0, and use only finitely many function evaluations. We shall show however that the inequality (28) can be achieved asymptotically, as ε → 0+.

Since the accuracy ε (instead of the number m of function evaluations) is now an input parameter, in this section we use the asymptotic notation with respect to ε → 0+. That is,

$$a(\varepsilon)\lessapprox b(\varepsilon) \quad\text{iff}\quad \limsup_{\varepsilon\to 0^{+}}\frac{a(\varepsilon)}{b(\varepsilon)}\le 1, \quad\text{and}\quad a(\varepsilon)\approx b(\varepsilon) \quad\text{iff}\quad \lim_{\varepsilon\to 0^{+}}\frac{a(\varepsilon)}{b(\varepsilon)}=1.$$

To begin with, consider the following recursive procedure that corresponds to a local subdivision strategy. Here S is a set of subintervals. It is initially empty, and at the end it contains all subintervals in the resulting partition.

(Procedure AUTO1; the pseudocode is given as a figure in the original.)
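The pseudocode of AUTO1 is rendered as a figure in the original. The following Python sketch reconstructs its structure from the surrounding discussion (the names, the bisection step, and the scaling of the priority by αr, p/|γr| are our assumptions, not the authors' code):

```python
def auto1(p_bar, a, b, e, p=2.0):
    """Recursive local subdivision (a reconstruction of AUTO1).

    p_bar(c, d) -- the priority of subinterval [c, d], assumed already
    scaled by alpha_{r,p} / |gamma_r|.  A subinterval of length h is
    accepted once p_bar <= e * (h / (b - a))**(1/p); otherwise it is
    bisected and the procedure recurses on both halves.
    """
    S = []  # the resulting partition

    def rec(c, d):
        h = d - c
        if p_bar(c, d) <= e * (h / (b - a)) ** (1.0 / p):
            S.append((c, d))
        else:
            m = 0.5 * (c + d)
            rec(c, m)
            rec(m, d)

    rec(a, b)
    return S
```

With a toy priority proportional to h2 (mimicking a smooth function) and e = 10− 3, p = 2, the recursion terminates with a uniform dyadic partition covering [a, b].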

For simplicity, we restrict our analysis to the priority function

$$ \overline p_{f}(I)=\max\left( p_{f}(I),\delta(h)h^{r+1/p}\right)\quad\text{ with }\quad \delta(h) = \frac{|\gamma_{r}|}{\alpha_{r,p}} {\Delta}, $$
(29)

for some Δ > 0 (for the case Δ = 0, see Remark 8).

Suppose that AUTO1 is run for a given f ∈ Cr([a, b]) and a threshold e = ε. Let mε be the number of subintervals in the resulting partition. Then

$$\overline p_{f}(I_{i}) \le \varepsilon \frac{|\gamma_{r}|}{\alpha_{r,p}}\left( \frac{h_{i}}{b-a}\right)^{1/p} \qquad\text{for all}\quad 1\le i\le m_{\varepsilon},$$

which implies that for the corresponding approximation

$$ \|f-\mathcal A_{1}(f,\varepsilon)\|_{L^{p}(a,b)} \lessapprox \frac{\alpha_{r,p}}{|\gamma_{r}|} \left( \sum\limits_{i=1}^{m_{\varepsilon}} \overline p_{f}(I_{i})^{p}\right)^{1/p} \le \varepsilon \left( \sum\limits_{i=1}^{m_{\varepsilon}} \frac{h_{i}}{b-a}\right)^{1/p} = \varepsilon.$$

Admittedly, we have achieved our goal; however, the obtained partition is (almost) optimal only for \(p=+\infty \). Indeed, it is easy to see that AUTO1 tries to keep all the local errors proportional to \(h_{i}^{1/p},\) so that the factor depending on f in the overall error equals \(\left \|f_{\Delta }^{(r)}\right \|_{L^{1/r}(a,b)}\) instead of \(\left \|f_{\Delta }^{(r)}\right \|_{L^{1/(r+1/p)}(a,b)},\) where fΔ is any function in Cr([a, b]) such that \(f_{\Delta }^{(r)}(x)=\max \limits \left (|f^{(r)}(x)|, \frac {|\gamma _{r}|}{\alpha _{r,p}}{\Delta }\right )\).

To construct a procedure for \(1\le p<+\infty \) that uses an (almost) optimal partition, consider the following modification of AUTO1.

(Procedure AUTO2; the pseudocode is given as a figure in the original.)
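Again the pseudocode appears as a figure in the original. A minimal sketch under the same assumptions as for AUTO1 (caller-supplied scaled priority, our naming) differs only in the acceptance test, which no longer depends on h:

```python
def auto2(p_bar, a, b, e):
    # Accept a subinterval as soon as its (scaled) priority is at most e,
    # so that every local error is kept below the threshold; else bisect.
    S = []

    def rec(c, d):
        if p_bar(c, d) <= e:
            S.append((c, d))
        else:
            m = 0.5 * (c + d)
            rec(c, m)
            rec(m, d)

    rec(a, b)
    return S
```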

When run for f ∈ Cr([a, b]) with e = ε, this procedure keeps all the local errors at most ε, and the approximation corresponding to the resulting partition equals \(L_{m_{\varepsilon },r}^{***}f,\) where mε is, as before, the number of subintervals in the resulting partition. Then

$$ \begin{array}{@{}rcl@{}} \|f-L_{m_{\varepsilon},r}^{***}f\|_{L^{p}(a,b)} &\lessapprox& \|f_{\Delta}-L_{m_{\varepsilon},r}^{***}f_{\Delta}\|_{L^{p}(a,b)} \\ &\lessapprox&\frac{\alpha_{r,p}}{|\gamma_{r}|} \left( \sum\limits_{i=1}^{m_{\varepsilon}} \overline p_{f}(I_{i})^{p}\right)^{1/p} \le \varepsilon m_{\varepsilon}^{1/p}, \end{array} $$
(30)

where the first inequality follows from Theorem 3. Thus, to reach an ε-approximation it suffices to run AUTO2 with e = ε∗ such that \(\varepsilon ^{*}m_{\varepsilon ^{*}}^{1/p}\le \varepsilon \).

The value of ε∗ can be found as follows. We first use the lower bound of Proposition 2,

$$\|f_{\Delta}-L_{m_{\varepsilon},r}^{***}f_{\Delta}\|_{L^{p}(a,b)} \gtrapprox \frac{\alpha_{r,p}}{r!} \left\|f_{\Delta}^{(r)}\right\|_{L^{1/(r+1/p)}(a,b)} m_{\varepsilon}^{-r},$$

together with (30) to get that \(\left \|f_{\Delta }^{(r)}\right \|_{L^{1/(r+1/p)}(a,b)} \lessapprox \frac {r!}{\alpha _{r,p}} \varepsilon m_{\varepsilon }^{r+1/p}\). Then

$$\limsup_{m\to\infty} m^{r}\|f_{\Delta}-L_{m,r}^{***}f_{\Delta}\|_{L^{p}(a,b)} \le \kappa_{r,p}\frac{\alpha_{r,p}}{r!} \left\|f_{\Delta}^{(r)}\right\|_{L^{1/(r+1/p)}(a,b)} \lessapprox \kappa_{r,p} \varepsilon m_{\varepsilon}^{r+1/p}. $$

Hence, to have the error asymptotically at most ε, it suffices that

$$ m \ge \kappa_{r,p}^{1/r} m_{\varepsilon}^{1+\frac{1}{rp}}.$$

That is, the procedure AUTO2 may be run with

$$ \varepsilon^{*} = \varepsilon\left/\left( \kappa_{r,p}^{1/r} m_{\varepsilon}^{1+\frac1{rp}}\right)^{1/p}\right.. $$
(31)

(Observe that ε∗ = ε if \(p=+\infty ,\) which is consistent with the previous considerations.)

To summarize, our algorithm consists of two steps. First, we run the recursive procedure AUTO2 with the error threshold e = ε and find mε. Second, we resume the recursion with the updated threshold e = ε∗ given by (31) to get the final partition. If the recursion is implemented using a stack, then the cost of the algorithm is proportional to \(m_{\varepsilon ^{*}},\) which in turn is proportional to \(\|f_{\Delta }^{(r)}\|_{L^{1/(r+1/p)}(a,b)}^{1/r}\varepsilon ^{-1/r}\).
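The threshold update (31) is straightforward to implement; a sketch (κr, p is passed in as `kappa`, and `float('inf')` may be used for \(p=+\infty \)):

```python
def eps_star(eps, m_eps, kappa, r, p):
    # Formula (31): eps* = eps / (kappa^{1/r} * m_eps^{1 + 1/(r p)})^{1/p}.
    return eps / (kappa ** (1.0 / r) * m_eps ** (1.0 + 1.0 / (r * p))) ** (1.0 / p)
```

For \(p=+\infty \) the outer exponent 1/p vanishes and eps_star returns ε unchanged, matching the observation above.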

Denote the resulting approximation by \(\mathcal A_{2}(f,\varepsilon )\).

Theorem 5

For all functions f ∈ Cr([a, b]) we have

$$ \left\|f-\mathcal A_{2}(f,\varepsilon)\right\|_{L^{p}(a,b)} \lessapprox \varepsilon,$$

i.e., an ε-approximation is achieved asymptotically as ε → 0+.

Example 3

Results of numerical tests for the automatic approximation of the functions f and g of Examples 1 and 2 using AUTO2 are presented in Tables 4 and 5, respectively. We observe perfect behavior of the algorithm for f with Δ = 0, for \(p=1,2,+\infty \). Things are quite different for g. If Δ = 0, then the algorithm wrongly estimates the \(L^{\infty }\) error and terminates too early. Much better is the “safe” choice Δ = 10^4.

Table 4 Results from AUTO2 for the function f
Table 5 Results from AUTO2 for \(p=\infty ,\) for the function g

Remark 8

It is easy to verify using Theorems 1 and 2 that if Δ = 0 in (29), i.e., when the priority \(\overline p_{f}=p_{f},\) then Theorem 5 holds true provided f(r) does not vanish, or f(r) does not change its sign and the condition (16) is fulfilled. Moreover, in the latter case, it is possible to obtain an ε-approximation non-asymptotically. Indeed, it is enough to change the “if” condition in AUTO1 to βr, ppf([a, b]) ≤ e, where βr, p is as in Lemma 1, and run the procedure with e = ε. It immediately follows from (20) that we are then guaranteed an approximation with error at most ε.

The existence of a recursive procedure corresponding to AUTO2 that uses an (almost) optimal partition is problematic. Instead, one can apply the following iterative procedure, which is based on our initial algorithm discussed in Section 4.

(Procedure AUTO3; the pseudocode is given as a figure in the original.)
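The pseudocode of AUTO3 is again a figure in the original. For \(p=+\infty \) the iteration can be sketched as follows (a reconstruction: ALGORITHM's greedy step is modeled with a max-heap, the stopping test uses the guaranteed bound \(\beta _{r,\infty }\max \limits _{i}p_{f}(I_{i})\) from (20), and `p_f`, `beta` are supplied by the caller):

```python
import heapq

def auto3(p_f, a, b, eps, beta):
    # Greedy iteration: repeatedly bisect the subinterval of largest
    # priority until beta * max_i p_f(I_i) <= eps, which by (20)
    # guarantees an L^infty error of at most eps.
    heap = [(-p_f(a, b), a, b)]          # max-heap via negated priorities
    while -heap[0][0] * beta > eps:
        _, c, d = heapq.heappop(heap)
        m = 0.5 * (c + d)
        heapq.heappush(heap, (-p_f(c, m), c, m))
        heapq.heappush(heap, (-p_f(m, d), m, d))
    return sorted((c, d) for _, c, d in heap)
```

The heap makes the "subdivide the worst interval" step cost O(log m) per iteration, so the total cost stays proportional to the final number of subintervals up to a logarithmic factor.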

It is worth mentioning that AUTO3 produces an approximation with an error unnecessarily much smaller than the required ε, and consequently its running time is much higher than that of AUTO2. This is due to the fact that βr, ppf(I) in (20) usually considerably overestimates the error in an interval I. For instance, consider again the \(L^{\infty }\) approximation of the function f from the previous examples. Let \(\beta _{4,\infty }\) be defined as in (19). Then, for ε = 10− 3,10− 6,10− 9, the procedure AUTO3 produces approximations with errors 3.0211e − 06, 3.3098e − 10, and 5.6843e − 14 using, respectively, 88, 878, and 9749 subintervals (compare with the corresponding results for AUTO2 in Table 4).

6 Remarks on numerical integration

Adaptive quadratures are frequently used for automatic integration,

$$ \mathcal If = {{\int}_{a}^{b}}f(x) \mathrm dx. $$
(32)

Such quadratures can be obtained, for instance, by integrating the interpolant Lm, rf, which results in the compound quadrature

$$Q_{m,r}f = \mathcal I(L_{m,r}f).$$

Then, our results for the L1 approximation provide upper bounds for the quadrature error, and the procedures constructed for automatic approximation can be used for automatic integration as well. Indeed, we have

$$ \begin{array}{@{}rcl@{}}\left|\mathcal If-Q_{m,r}f\right|&=&\left|{{\int}_{a}^{b}}(f-L_{m,r}f)(x) \mathrm dx\right|\\&\le& {{\int}_{a}^{b}}\left|(f-L_{m,r}f)(x)\right| \mathrm dx = \left\|f-L_{m,r}f\right\|_{L^{1}(a,b)}.\end{array} $$

The bound above often overestimates the actual error. This happens when the degree of exactness of the quadrature Qm, r is at least r. Then, for s ≤ r + 1 and for any function f ∈ Cs([a, b]) with f(s)≠ 0, the error \(|\mathcal If-Q_{m,r}f|\) is of order m− s, while \(\|f-L_{m,r}f\|_{L^{1}(a,b)}\) decreases to zero no faster than m− r.

Consider now the case when the base quadrature \(\mathcal Q_{r}\) for approximating the integral \({{\int \limits }_{0}^{1}}f(x) \mathrm dx\) is such that its degree of exactness equals r − 1 and the Peano kernel of the error functional \(f\mapsto {{\int \limits }_{0}^{1}}f(x) \mathrm dx-\mathcal Q_{r}f\) does not change its sign. The quadrature \(\mathcal Q_{r}\) may, but does not have to, use the points (1). (Obvious examples include the Newton-Cotes and Gauss-Legendre quadratures.) Suppose that the integral (32) is approximated by the compound quadrature \(\mathcal Q_{m,r}\) corresponding to \(\mathcal Q_{r}\), applied to a given partition consisting of m subintervals. Then, there is λr such that the quadrature error in each subinterval [xj− 1, xj] equals

$$ \lambda_{r} h_{j}^{r+1}f^{(r)}(\zeta_{j})\quad\text{for some}\quad\zeta_{j}\in[x_{j-1},x_{j}]. $$
(33)

If f(r) does not change its sign in [a, b], then the formula (33) allows us to apply the whole machinery of Sections 3 and 4 to claim that an asymptotically optimal partition makes all local integration errors equal. For the corresponding quadrature \(\mathcal Q_{m,r}^{*}\) we have

$$|\mathcal If-\mathcal Q_{m,r}^{*}f| \approx \lambda_{r} \left\|f^{(r)}\right\|_{L^{1/(r+1)}(a,b)} m^{-r} \qquad\text{as}\quad m\to+\infty,$$

which reproduces the results of [8] for r = 4, and those of [2, 3] for arbitrary r. Moreover, if the quadrature uses the partition produced by ALGORITHM, then its error bound is asymptotically worse than the optimal error by a factor of κr,1.

An example is provided by the standard adaptive Simpson quadrature [5], where r = 4, the points in (1) are \((t_{0}, t_{1}, t_{2}, t_{3}, t_{4})=(0, \frac 14, \frac 12, \frac 34,1),\) and

$$\mathcal Q_{4}f=\tfrac1{12}\left( f(0)+4f(\tfrac14)+2f(\tfrac12)+4f(\tfrac34)+f(1)\right).$$
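Its degree of exactness, 3 = r − 1, is easy to check numerically (a quick sketch; the rule is exact for monomials up to degree 3, but not for x4, whose exact integral over [0, 1] is 1/5):

```python
def q4(f):
    # The base quadrature Q_4 on [0, 1] used by the adaptive Simpson rule.
    return (f(0.0) + 4 * f(0.25) + 2 * f(0.5) + 4 * f(0.75) + f(1.0)) / 12.0

# Errors of Q_4 on the monomials x^k, k = 0..4 (the exact integral is 1/(k+1)).
errors = [abs(q4(lambda x, k=k: x ** k) - 1.0 / (k + 1)) for k in range(5)]
```

The first four errors vanish (up to rounding), while the error for x4 is about 5.2 · 10− 4, confirming that the degree of exactness is exactly 3.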