Abstract
We present an asymptotic analysis of adaptive methods for Lp approximation of functions f ∈ Cr([a, b]), where \(1\le p\le +\infty \). The methods rely on piecewise polynomial interpolation of degree r − 1 with an adaptive strategy for selecting m subintervals. The optimal speed of convergence is in this case of order \(m^{-r}\), and it is already achieved by the uniform (nonadaptive) subdivision of the initial interval; however, the asymptotic constant crucially depends on the chosen strategy. We derive asymptotically best adaptive strategies and show their applicability to automatic Lp approximation with a given accuracy ε.
1 Introduction
Numerical algorithms for solving continuous problems generally fall into two categories: nonadaptive algorithms and adaptive algorithms. By “adaptive” we here mean that in its successive steps the algorithm uses information about the problem instance (usually a real-valued function) obtained from the previous steps. Adaptive algorithms often outperform nonadaptive ones in that they enjoy an essentially better convergence rate. Examples include bisection or Newton’s method for solving nonlinear equations. A good deal of numerical literature is devoted to automatic integration using adaptive quadratures (see, e.g., [1, 6]), one of the first and probably best known being the adaptive Simpson quadrature [5]. The question when and how much adaption helps is one of the main issues in information-based complexity [7, 12].
For the problems of function approximation or integration, adaptive algorithms are especially efficient when the underlying function is piecewise smooth only, since then adaption can be successfully used to localize the unknown singular points [9,10,11]. On the other hand, if the function is smooth in the whole domain, then adaptive algorithms can improve the error only by a constant compared to nonadaptive algorithms. The exact asymptotic constants for quadratures of degree of exactness r − 1 and for functions f ∈ Cr([a, b]) with f(r) > 0 were obtained in [8] for r = 4 and in [2, 3] for arbitrary r. Procedures corresponding to the optimal strategies for automatic integration were also proposed.
While adaptive numerical algorithms for the problem of function approximation of smooth functions are sometimes constructed (see, e.g., [4, Sect. 6.14]), a similar quantitative analysis of such algorithms seems not to exist. The purpose of the current paper is to fill this gap. We consider approximation of functions f ∈ Cr([a, b]) and algorithms that rely on piecewise polynomial interpolation of degree r − 1 with adaptive strategies of selecting m subintervals. The error is measured in the integral norm \(\|\cdot \|_{L^{p}}\) with \(1\le p\le +\infty \). It is well known that the optimal convergence rate is in this case of order \(m^{-r}\), and it is already achieved by the uniform, i.e., nonadaptive, partition of the initial interval. (Actually, the rate \(m^{-r}\) cannot be beaten even in the much larger class of algorithms that use m function evaluations, which follows in particular from [13].)
We first prove that for any function in the class a theoretically best adaptive strategy of interval subdivision relies on keeping the Lp errors equal in all subintervals. Then, the global Lp error of approximation asymptotically, as \(m\to +\infty ,\) equals
\(\frac {\alpha _{r,p}}{r!}\left \|f^{(r)}\right \|_{L^{1/(r+1/p)}(a,b)}\,m^{-r},\)
while for the uniform partition it equals
\(\frac {\alpha _{r,p}}{r!}\,(b-a)^{r}\left \|f^{(r)}\right \|_{L^{p}(a,b)}\,m^{-r},\)
where αr, p is given by (4) (see Propositions 1 and 2). The gain from using adaption can be significant. For instance, consider the \(L^{\infty }\) approximation of \(f(x)=1/(x+10^{-d})\) in the interval [0,1]. If d = 2, then the adaptive algorithm outperforms the nonadaptive one roughly by a factor of \(10^{6}\), and for d = 8 this factor becomes \(10^{29}\) (see Table 2).
Then, we show how the optimal strategy can be realized in practice. That is, for a given function f ∈ Cr([a, b]) we construct a relatively simple procedure that uses a priority queue and produces an almost optimal m th partition using a number of evaluations of f proportional to m. Different versions of the procedure and of the error analysis are presented, depending on additional properties of f (see Theorems 1–4).
Next, we deal with automatic approximation. We consider a local subdivision strategy that is a departure point for obtaining a recursive procedure using the (almost) optimal strategy. For any ε > 0 and f ∈ Cr([a, b]), the proposed procedures return an approximation with the Lp error at most ε, asymptotically as ε → 0+.
Finally, we notice that our results imply the previously known results for numerical integration mentioned earlier in this introduction.
The content of the paper is as follows. In Section 2 we formally define our problem and show some preliminary estimates. A theoretically optimal partition is constructed in Section 3, while Section 4 is devoted to its practical realization. The recursive procedures for automatic approximation are constructed in Section 5. In Section 6 we comment on relations to the numerical integration. Theoretical findings are complemented by some numerical examples.
2 Preliminaries
For an integer r ≥ 1 and \(-\infty <a<b<+\infty ,\) we denote by Cr([a, b]) the space of functions
that are r-times continuously differentiable in [a, b]. We assume that such functions are approximated using piecewise interpolation of degree r − 1 with a possibly non-uniform partition of the interval [a, b] into subintervals. Specifically, we first fix points
\(0\le t_{1}<t_{2}<{\cdots }<t_{r}\le 1. \qquad (1)\)
For a given f ∈ Cr([a, b]), the interval [a, b] is subdivided into m subintervals that are determined by a choice of points
\(a=x_{0}<x_{1}<{\cdots }<x_{m}=b. \qquad (2)\)
In each subinterval [xj− 1, xj], the function is approximated by its Lagrange polynomial of degree r − 1 interpolating f at
\(x_{j-1}+h_{j}t_{i},\qquad 1\le i\le r,\)
where hj = xj − xj− 1. We denote such an approximation by Lm, rf.
The error of approximation is measured in the Lp norm, i.e.,
\(\|f-L_{m,r}f\|_{L^{p}(a,b)}=\left ({{\int \limits }_{a}^{b}}|f(x)-L_{m,r}f(x)|^{p}\,\mathrm dx\right )^{1/p}\) for \(1\le p<+\infty ,\) and \(\|f-L_{m,r}f\|_{L^{\infty }(a,b)}=\operatorname {ess\,sup}_{a\le x\le b}|f(x)-L_{m,r}f(x)|.\)
We are interested in partitions (2) such that the errors for the corresponding approximations Lm, rf are asymptotically (as \(m\to +\infty \)) as small as possible. Note that the problem can be formally treated as a special way of approximating the embedding
Remark 1
Obviously, the uniform approximation Cr([a, b])↪C([a, b]) is also of interest. We do not analyze it separately, since it is equivalent to \(L^{\infty }\) approximation provided t1 = 0, tr = 1, and r ≥ 2. Indeed, then for any partition, the approximation Lm, rf is continuous in [a, b] and \(\|f-L_{m,r}f\|_{C([a,b])}=\|f-L_{m,r}f\|_{L^{\infty }(a,b)}\).
In the rest of the paper, we assume without loss of generality that f is not a polynomial of degree smaller than r, since otherwise we clearly have Lm, rf = f. Then, in particular, the derivative f(r) is a nontrivial function.
We now provide preliminary formulas for the approximation error that will be used later. Let
where the \(t_{i}\)’s are given by (1). Let
Then, the local errors, by which we mean the errors in the successive subintervals [xj− 1, xj], can be written as follows. For \(1\le p<+\infty ,\)
and
Hence,
In the sequel, ηj always denotes a point in the j th subinterval for which
For convenience, we also use the following asymptotic notation. For two nonnegative functions a and b of the variable m we write
\(a(m)\lessapprox b(m)\quad \text {iff}\quad \limsup _{m\to +\infty }\frac {a(m)}{b(m)}\le 1.\)
Obviously, a(m) ≈ b(m) iff \(a(m)\lessapprox b(m)\) and \(b(m)\lessapprox a(m)\).
Consider first the uniform partition of the interval [a, b], in which case
\(x_{j}=a+j\,\frac {b-a}{m},\qquad 0\le j\le m. \qquad (7)\)
Proposition 1
For the uniform partition (7) we have
for all \(1\le p\le +\infty \).
Proof
Indeed, by (5) we have
and by (6) we have
□
3 Optimal partition
We now show that an asymptotically optimal partition makes all local errors equal. That is, it asymptotically enjoys the smallest error as \(m\to +\infty ,\) for all functions f ∈ Cr([a, b]). Specifically, for a given m, let
\(a=x_{0}^{*}<x_{1}^{*}<{\cdots }<x_{m}^{*}=b \qquad (8)\)
be such that all the quantities
\(\left \|f-L_{m,r}^{*}f\right \|_{L^{p}(x_{j-1}^{*},x_{j}^{*})},\qquad 1\le j\le m,\)
where \(L_{m,r}^{*}f\) denotes the approximation corresponding to (8), are equal. Observe that such a partition exists since the local errors continuously depend on the points xi.
In the sequel, \(\|g\|_{L^{q}(a,b)}=\left ({{\int \limits }_{a}^{b}}|g(x)|^{q} \mathrm dx\right )^{1/q}\) for all \(0<q\le +\infty \). This is obviously not a norm in case 0 < q < 1, since then the triangle inequality is not satisfied. We also adopt the notation that 1/p = 0 for \(p=+\infty \).
Proposition 2
The equal-local error partitions (8) and the corresponding approximations \(L_{m,r}^{*}\) are asymptotically optimal. That is, for the approximations Lm, r using other partitions we have
Furthermore,
Proof
We first show the error formula for \(L_{m,r}^{*}\). Let \(A=h_{j}^{r+1/p}\left |f^{(r)}(\eta _{j})\right |\). Then, for finite p, we have
where we used the fact that if f(r)(ηj) = 0, then f(r) nullifies on the whole interval [xj− 1, xj]. This implies
For infinite p we have in turn
as claimed.
Now, we show that for any Lm, rf such that \(\|f-L_{m,r}f\|_{L^{p}(a,b)}\lessapprox Cm^{-r}\) we have \(C\ge \frac {\alpha _{r,p}}{r!} \left \|f^{(r)}\right \|_{L^{1/(r+1/p)}(a,b)}\). To that end, we fix ℓ ≥ 1 and define ui = a + iH, where H = (b − a)/ℓ, and
Suppose that Lm, rf uses the partition a = x0 < ⋯ < xm = b. We can assume without loss of generality that for m ≥ ℓ we have \(\{u_{i}\}_{i=0}^{\ell }\subset \{x_{j}\}_{j=0}^{m}\). Indeed, since we keep ℓ fixed, we can always add the points ui to a given partition without asymptotically increasing the error, as \(m\to +\infty \). Let li be such that \(x_{l_{i}}=u_{i},\) and mi = li − li− 1, 1 ≤ i ≤ ℓ.
Consider first finite p. We have
as the sum above is minimized for \(h_{j}=H/m_{i}\) for all \(l_{i-1}+1\le j\le l_{i}\). Hence,
The minimization of the last sum with respect to \({\sum }_{i=1}^{\ell } m_{i}=m\) gives the optimal
for which
Hence,
The last sum in the parentheses is a Riemann sum for the integral \({{\int \limits }_{a}^{b}}\left |f^{(r)}(x)\right |^{1/(r+1/p)}\,\mathrm dx\). Hence, taking ℓ sufficiently large and m ≥ ℓ, we can make \(m^{r}\|f-L_{m,r}f\|_{L^{p}(a,b)}\) arbitrarily close to \(\frac {\alpha _{r,p}}{r!}\left \|f^{(r)}\right \|_{L^{1/(r+1/p)}(a,b)}\).
For infinite p, we similarly have
The right-hand side is minimized by
for which \(\max \limits _{1\le i\le \ell }C_{i}(m_{i}^{*})^{-r}=\left ({\sum }_{i=1}^{\ell } C_{i}^{1/r}\right )^{r}m^{-r}\). Hence,
The proof is completed by the observation that the last sum in the parentheses is a Riemann sum for the integral \({{\int \limits }_{a}^{b}}\left |f^{(r)}(x)\right |^{1/r}\mathrm dx\). □
Remark 2
The error of approximation depends on the points tis via αr, p. Recall that for \(p\in \{1,2,+\infty \}\) this factor is minimized by the points \(t_{i}^{*}\)s being zeros of appropriate orthogonal polynomials, adjusted to the interval [0,1]. For p = 1 these are Chebyshev polynomials of the second kind, for p = 2 these are Legendre polynomials, and for \(p=+\infty \) these are Chebyshev polynomials of the first kind. Consider, for example, r = 4. Then, the optimal points are as follows. For p = 1,
For p = 2,
For \(p=+\infty ,\)
Table 1 shows the corresponding values of αr, p for the optimal points \(t_{i}^{*}\) and, for comparison, for the equispaced points ti = (i − 1)/(r − 1), 1 ≤ i ≤ r.
Remark 3
We have shown that the optimal partition is asymptotically better than the uniform partition by the factor of
\(R_{r,p}(f)=\frac {(b-a)^{r}\left \|f^{(r)}\right \|_{L^{p}(a,b)}}{\left \|f^{(r)}\right \|_{L^{1/(r+1/p)}(a,b)}}.\)
We obviously have that \(R_{r,p}(f)\ge 1,\)
where equality holds when f is a polynomial of degree r; the more f(r) varies, the bigger Rr, p(f) becomes. An example is provided in Table 2.
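As a back-of-the-envelope check of the gain reported in Table 2 (our own sketch, not the paper’s code), take \(p=+\infty \), r = 4, and \(f(x)=1/(x+10^{-d})\) on [0,1]. Then \(f^{(4)}(x)=24/(x+10^{-d})^{5}\), and both norms entering the ratio of the asymptotic constants for the uniform and optimal partitions have closed forms:

```python
def gain_linf_r4(d):
    # Ratio of the uniform-partition constant ||f^{(4)}||_{L^inf(0,1)} to the
    # optimal-partition constant ||f^{(4)}||_{L^{1/4}(0,1)} for f = 1/(x + c),
    # c = 10^{-d}; here (b - a)^r = 1.  Closed forms:
    #   ||f^{(4)}||_inf = 24 / c^5,
    #   int_0^1 |f^{(4)}|^{1/4} dx = 24^{1/4} * 4 * (c^{-1/4} - (1 + c)^{-1/4}).
    c = 10.0 ** (-d)
    sup_norm = 24.0 / c ** 5
    integral = 24.0 ** 0.25 * 4.0 * (c ** -0.25 - (1.0 + c) ** -0.25)
    quasi_norm = integral ** 4            # ||f^{(4)}||_{L^{1/4}(0,1)}
    return sup_norm / quasi_norm
```

Here `gain_linf_r4(2)` is about 1.8e6 and `gain_linf_r4(8)` about 4e29, in line with the rough factors \(10^{6}\) and \(10^{29}\) quoted in the introduction.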
Now we want to see how much we potentially lose by not using the optimal partition. For the error to go to zero as \(m\to +\infty \) we have to assume that the partitions satisfy
Let \(A_{j}=h_{j}^{r+1/p}\left |f^{(r)}(\eta _{j})\right |\) and A = (A1,…, Am). Denoting \(\|\mathbf A\|_{\infty }=\max \limits _{1\le j\le m}|A_{j}|\) and \(\|\mathbf A\|_{q}=\left ({\sum }_{j=1}^{m}|A_{j}|^{q}\right )^{1/q}\) for \(0<q<+\infty ,\) we have that
as \(m\to +\infty \). The error satisfies
where
Obviously, Km ≥ 1, and Km = 1 for the optimal partition.
Let us check how big Km can be assuming that for all m sufficiently large we have
where Ω > 1 and 0/0 = 1. Since Km is a homogeneous function of A, we can assume without loss of generality that 1 ≤ Ai ≤ Ω for all i. It is clear that then the maximum is attained at A = (Ω,…,Ω,1,…,1), where Ω is repeated k times, for some k. If \(p=+\infty \), then the maximum is attained for k = 1 and
Let \(1\le p<+\infty \). Then, setting q = 1/(r + 1/p) we have
We treat Km as a function of k ∈ [0, m] and find its maximum. The maximum is for
therefore
Remark 4
Especially important will be the case where
Then, Km in (9) is bounded from above by \(\kappa _{r,\infty }=2^{r}\) for \(p=+\infty ,\) and
The values of κr, p for \(p=1,2,\infty \) and 1 ≤ r ≤ 6 are given in Table 3.
4 An algorithm for (almost) optimal partitions
In this section, we show how asymptotically (almost) optimal partitions can be practically realized for a given m and f ∈ Cr([a, b]). We allow algorithms that can evaluate f at any x ∈ [a, b].
Let us fix another point t0 ∈ [0,1] that is different from ti in (1) for 1 ≤ i ≤ r. For an interval I = [c, d] ⊂ [a, b] of length h = d − c, define the functional \(\mathcal L_{I}:C^{r}([a,b])\to \mathbb R,\)
and ui = c + hti, 0 ≤ i ≤ r.
Remark 5
Observe that for each interval I the functional \(\mathcal L_{I}\) is uniquely (up to a multiplicative factor) defined by the conditions that it linearly combines the values of f at ui for 0 ≤ i ≤ r, and its kernel consists of all polynomials of degree at most r − 1. In our definition, \(\mathcal L_{I}(f)\) is just the error of interpolating f in I at u0, but equally well it could be the divided difference f[u0, u1,…, ur]. Indeed, we have
\(\mathcal L_{I}(f)=f[u_{0},u_{1},\ldots ,u_{r}]\,{\prod }_{i=1}^{r}(u_{0}-u_{i}).\)
The algorithm that we present and analyze in this section uses a priority queue S whose elements are subintervals. For each I ∈ S of length h, its priority is given by
\(p_{f}(I)=h^{1/p}\,|\mathcal L_{I}(f)|.\)
In the following pseudocode, insert(S, I) and I := extract_max(S) denote actions corresponding to inserting an interval to S, and extracting from S an interval with highest priority.
![figure a](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs11075-021-01114-9/MediaObjects/11075_2021_1114_Figa_HTML.png)
After execution, the elements of S form a partition into m subintervals. Since a priority queue can be implemented using a heap, an m th partition can be obtained at cost proportional to \(m\log m\).
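As an illustration only (the paper’s ALGORITHM is given in the pseudocode above), the priority-queue strategy with interval halving can be sketched in Python. The helper names are ours, and we assume equispaced points ti = (i − 1)/(r − 1), the extra point t0 = 1/2, and the priority \(h^{1/p}|\mathcal L_{I}(f)|\):

```python
import heapq
import math

def interp_error_at_t0(f, c, d, r=4, t0=0.5):
    """|L_I(f)|: error at u0 = c + t0*h of the degree r-1 polynomial
    interpolating f at the equispaced nodes u_i = c + h*(i-1)/(r-1)."""
    h = d - c
    us = [c + h * i / (r - 1) for i in range(r)]
    u0 = c + h * t0
    err = f(u0)
    for i, ui in enumerate(us):
        li = 1.0                        # Lagrange basis l_i evaluated at u0
        for j, uj in enumerate(us):
            if j != i:
                li *= (u0 - uj) / (ui - uj)
        err -= f(ui) * li               # subtract the interpolant at u0
    return abs(err)

def adaptive_partition(f, a, b, m, r=4, p=math.inf):
    """Greedy strategy: repeatedly halve the subinterval with the largest
    priority h^{1/p} * |L_I(f)| until there are m subintervals."""
    def prio(c, d):
        h = d - c
        w = 1.0 if p == math.inf else h ** (1.0 / p)
        return w * interp_error_at_t0(f, c, d, r)
    heap = [(-prio(a, b), a, b)]        # max-queue via negated priorities
    while len(heap) < m:
        _, c, d = heapq.heappop(heap)   # subinterval with highest priority
        mid = 0.5 * (c + d)
        heapq.heappush(heap, (-prio(c, mid), c, mid))
        heapq.heappush(heap, (-prio(mid, d), mid, d))
    return sorted((c, d) for _, c, d in heap)
```

For instance, for \(f(x)=1/(x+10^{-2})\) on [0,1] the resulting subintervals cluster near 0, where f(4) is large.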
Denote by \(L_{m,r}^{**}f\) the approximation corresponding to the m th partition obtained by our algorithm. Recall that αr, p and κr, p are respectively given by (4) and (11).
Theorem 1
If the function f ∈ Cr([a, b]) is such that its derivative f(r) does not nullify in [a, b], then
Proof
In view of the definition of κr, p in (11) of Remark 4, it suffices to show that the value of Ω in (10) can be chosen arbitrarily close to 2r+ 1/p, provided m is large enough.
Suppose that f(r) > 0 (the case f(r) < 0 is symmetric and there is no need to consider it separately). Then, there are \(0<d\le D<+\infty \) depending on f such that
For an interval I of length h, its priority can be written as
where γ is defined in (12). This means that
i.e., the priority is always positive, and it decreases to zero when an interval is successively subdivided. This leads to an important observation that the maximum length of a subinterval in an m th partition goes to zero as \(m\to +\infty \). Furthermore, if the interval I is further subdivided into I1 and I2, then for s = 1,2 we have
Since
where ω is the modulus of continuity of the function f(r),
This in turn means that for any δ > 0 there is mδ such that for all m > mδ the ratio of the highest to lowest priorities in the m th partition is
(Indeed, mδ is such that the lengths of all subintervals in the corresponding partition are at most δ, and such that after dividing the subinterval with the highest priority, one of its successors has the lowest priority).
Consider now the partition for a particular m ≥ mδ. Since the local error in an interval I can be written as
by (13) and (14) we have that the ratio of any two local errors is upper bounded by
Since δ can be arbitrarily small, the right-hand side can be made arbitrarily close to 2r+ 1/p, as claimed. □
Example 1
Figure 1 shows results of a numerical experiment for regularity r = 4. The tested function is
for which the 4th derivative is positive. The approximations are based on the adaptive partitions obtained from ALGORITHM, and those based on the uniform (nonadaptive) partitions. In this and all the numerical examples that follow we take t0 = 0.5 and the optimal points \(t_{1}^{*},t_{2}^{*},t_{3}^{*},t_{4}^{*},\) cf. Remark 2. The errors are measured in the Lp norms with \(p\in \{1,2,+\infty \}\). The results perfectly confirm the theoretical findings. (An artifact for \(p=+\infty ,\) in case of adaptive partitions and m close to 104, is a consequence of round-off errors that show up earlier for \(p=+\infty \) than for p = 1,2.)
Remark 6
In this paper, we consider the algorithm error versus the number m of subintervals. One may want to consider the error versus the number n of function values used. Then the choice of the equispaced points ti = (i − 1)/(r − 1) may lead to a better asymptotic constant than the choice of \(t_{i}^{*},\) 1 ≤ i ≤ r, despite the fact that the factor αr, p is in this case slightly larger, cf. Table 1. Consider, for instance, our algorithm for r = 4. If the points \(t_{i}^{*}\) are applied, then the algorithm produces an m th partition using n ≈ 10m function values. On the other hand, for the equispaced tis we have n ≈ 4m, since all 5 function values computed for a given subinterval can be re-used when halving this interval in one of the following steps.
Unfortunately, Theorem 1 does not hold for all f satisfying f(r) ≥ 0 or f(r) ≤ 0. Indeed, suppose that \(\hat t:=\max \limits _{0\le i\le r}t_{i}<1\) and consider the function \(f(x)=(x-\hat t )_{+}^{r+1}\) for x ∈ [0,2]. Then, pf([0,1]) = 0 and pf(I) > 0 for all intervals I ⊂ [1,2]. Hence, the interval [0,1] will never be subdivided and the error does not go to zero as \(m\to +\infty \).
A key point in this example is that the set of points ti, 0 ≤ i ≤ r, does not contain both endpoints of the interval [0,1]. If this obstacle is removed, then Theorem 1 holds true for all functions such that f(r) does not change its sign. To show this, we need the following auxiliary result.
Lemma 1
Let \(1\le p\le +\infty \). Let
Then, there exists βr, p > 0 (given, e.g., by (19)) such that the following holds. For any interval [c, d] of length h = d − c and any function g ∈ Cr([c, d]) such that
-
(i)
the derivative g(r) does not change its sign in [c, d], and
-
(ii)
g nullifies at ui = c + tih for all 1 ≤ i ≤ r,
we have
In particular, if g(u0) = 0, then g nullifies on the whole interval [c, d].
Proof
Assume without loss of generality that g(r) ≥ 0. We estimate g(u) for u different from any of the points ui. We have two cases: either u < u0 or u > u0.
If u < u0 then, by the explicit formula for divided differences, we have
Combining both inequalities we get that if \({\prod }_{i=1}^{r}(u-u_{i})>0\) then
On the other hand, if \({\prod }_{i=1}^{r}(u-u_{i})<0\), then ℓ1(u)g(u0) ≤ g(u) ≤ 0.
In the case u > u0, we similarly combine (18) with
to get that either
or ℓr(u)g(u0) ≤ g(u) ≤ 0.
Thus,
and \(\|g\|_{L^{p}(c,d)}\le \|\ell \|_{L^{p}(c,d)}|g(u_{0})|\). Letting \(l(t)=l_{1}(t)\mathbf 1_{[0,t_{0})}(t)+l_{r}(t)\mathbf 1_{[t_{0},1]}(t),\) where
and applying the substitution u = c + th, we finally obtain that \(\|\ell \|_{L^{p}(c,d)}=h^{1/p}\|l\|_{L^{p}(0,1)};\) hence, the lemma holds with
□
Remark 7
An important consequence of Lemma 1 is that for any subinterval I of a given partition we have
Indeed, it suffices to take g = f − Lm, rf in Lemma 1 and recall the definition of pf(I). If so, then for any partition we have
which means that the inequality (20) allows us to control the exact error of approximation. For instance, if r = 2 and (t0, t1, t2) = (1/2,0,1), then \(\beta _{r,p}=\|l(t)\|_{L^{p}(0,1)},\) where l(t) = 1 + 2|t − 1/2|. We have β2,1 = 1.5, \(\beta _{2,\infty }=2,\) and \(\beta _{2,2}=\sqrt {7/3}\approx 1.528\).
We stress that the exact inequalities hold true only under the assumptions of Lemma 1. The values of βr, p given by (19) are by no means best possible. Optimization of βr, p is a separate problem and is not addressed in the present paper.
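As a quick numerical sanity check of the r = 2 example above (our own sketch, not part of the paper), the values of \(\|l\|_{L^{p}(0,1)}\) for l(t) = 1 + 2|t − 1/2| can be recovered by a midpoint rule:

```python
# Midpoint-rule approximation of ||l||_{L^p(0,1)} for l(t) = 1 + 2|t - 1/2|,
# the weight function from the r = 2, (t0, t1, t2) = (1/2, 0, 1) example.
N = 100_000
nodes = [(k + 0.5) / N for k in range(N)]
lvals = [1.0 + 2.0 * abs(t - 0.5) for t in nodes]

beta_1 = sum(lvals) / N                            # beta_{2,1}: expect 3/2
beta_2 = (sum(v * v for v in lvals) / N) ** 0.5    # beta_{2,2}: expect (7/3)^{1/2}
beta_inf = 2.0                                     # sup of l, attained at t = 0, 1
```

The midpoint rule is exact for the piecewise linear l (the kink t = 1/2 falls on a cell boundary), so `beta_1` reproduces 1.5 to rounding accuracy.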
Theorem 2
Let the assumption (16) of Lemma 1 be fulfilled. Then, the error estimate of Theorem 1 holds true if the derivative f(r) does not change its sign in [a, b].
Proof
Choose 0 < 𝜖 < ∥f(r)∥C([a, b]). For a given m, define
where Ii is the i th subinterval. We assume that m is large enough, m ≥ m𝜖, so that \(\mathcal I_{2}\ne \emptyset \) and the modulus of continuity of f(r) at
denoted by ω, is smaller than 𝜖. Such an m𝜖 exists since by Lemma 1 we have pf(Ii) > 0 for \(i\in \mathcal I_{1}\cup \mathcal I_{2},\) which implies that the maximum length of such subintervals decreases to zero as \(m\to +\infty \).
Let
Then, for \(i\in \mathcal I_{0}\), we have
since otherwise the predecessor of Ii would not be subdivided. This implies
For the same reason, for \(i\in \mathcal I_{1}\) we have
which implies
For \(i\in \mathcal I_{2}\) we have in turn
which implies
Now, let \(m_{k}=\#\mathcal I_{k}\) and \(B_{k}=\cup _{i\in \mathcal I_{k}}I_{i},\) k = 0,1,2. Obviously m = m0 + m1 + m2 and [a, b] = B0 ∪ B1 ∪ B2. Using (21), (22), (23), we get that
Hence,
where the last equality follows from the fact that if \(m\to +\infty \) then: ω goes to zero, |B0| monotonically increases to \(|\overline B_{0}|,\) where \(\overline B_{0}=\{x\in [a,b]: f^{(r)}(x)=0\},\) and |B2| monotonically decreases to \(|\overline B_{2}|>0,\) where \(\overline B_{2}=\{x\in [a,b]: |f^{(r)}(x)|\ge \epsilon \}\). We also have that
where \(\overline B_{1}=\{x\in [a,b]: 0<|f^{(r)}(x)|<\epsilon \}\). Note that the right-hand side of this inequality goes to zero when 𝜖 → 0+.
We now estimate the error of our approximation. Obviously \(\|f-L_{m,r}^{**}f\|_{L^{p}(B_{0})}=0\). From (20) it follows that
For B2 we use (14) and (15) to get that
In view of (25), this means that the error on B1 vanishes compared to that on B2 when 𝜖 → 0+.
Since for x ∈ B2 the derivative f(r) is separated away from zero, we can use Theorem 1 together with (24) and (25) to obtain that
Taking the limit of both sides of this inequality with respect to 𝜖 → 0+ and using the fact that then the error on B2 dominates the error on the remaining part of the interval [a, b], we finally claim that
The proof is complete. □
Now we want to relax the requirement that the derivative f(r) does not change its sign. It is clear that then our original algorithm may fail since, again, for an interval I we may have that pf(I) = 0 and this interval will not be further subdivided, while f(r)≠ 0 in I.
To obtain a result similar to that of Theorem 1 in this case, we generalize the priority function pf, leaving the algorithm unchanged. We also do not assume that the points ti for 0 ≤ i ≤ r contain 0 and 1. The modified priority uses a predefined nonincreasing function \(\delta :(0,+\infty )\to [0,+\infty )\) and is given as
\(\overline p_{f}(I)=h^{1/p}\max \left (|\mathcal L_{I}(f)|,\,h^{r}\delta (h)\right ),\)
where h is the length of the interval I. Obviously, we always have \(\overline p_{f}(I)\ge p_{f}(I),\) and \(\overline p_{f}(I)=p_{f}(I)\) if δ(h) = 0. Hence, \(\overline p_{f}\) is indeed a generalization of pf.
Denote the resulting approximation by \(L_{m,r}^{***}f\). The following theorem generalizes Theorem 1.
Theorem 3
Suppose that \(\lim _{h\to 0^{+}}\delta (h)=0\). If f ∈ Cr([a, b]) is such that its derivative f(r) does not nullify in [a, b], or the modulus of continuity of f(r), denoted ωf, satisfies
then the error estimate of Theorem 1 holds true, i.e.,
Proof
If f(r) does not nullify, then for all sufficiently large m we have \(\overline p_{f}(I_{i})=p_{f}(I_{i}),\) for any subinterval Ii in the m th partition, and the theorem follows from Theorem 1.
Assume (26). The fact that the priority function is always positive ensures that
decreases to zero as \(m\to +\infty \). For a given m, define
Let
For \(i\in \mathcal I_{1}\) we have \(\frac {|\gamma _{r}|}{r!}|f^{(r)}(\xi _{i})|<\delta (h_{i})\) for some ξi ∈ Ii, and
since otherwise the predecessor of Ii (to which \(\xi _{i}^{\prime }\) belongs) would not be subdivided. We also have
Hence, by (26), for all m sufficiently large we have \(\overline p_{f}^{*}\le (2h_{i})^{r+1/p}2\delta (2h_{i}),\) which implies that
and the number \(m_{1}=\#\mathcal I_{1}\) is at most proportional to \(\left (\frac {\delta (2h^{*})}{\overline p_{f}^{*}}\right )^{1/(r+1/p)}\).
For \(i\in \mathcal I_{2}\) we have
Let 0 < 𝜖 < ∥f(r)∥C([a, b]) and
Then, the set \(B_{2}^{\prime }=\cup _{i\in \mathcal I_{2}^{\prime }}I_{i}\) is nonempty for large m and nondecreasing as m increases. Hence, for \(i\in \mathcal I_{2}^{\prime }\) we have
which implies that the number \(m_{2}=\#\mathcal I_{2}\) is at least proportional to \(\left (\frac {\epsilon }{\overline p_{f}^{*}}\right )^{1/(r+1/p)}\).
Thus, we have shown that
To estimate the error, observe that for \(i\in \mathcal I_{1}\) we have
which implies
For \(i\in \mathcal I_{2}\) we use the condition (26) to claim, as in the proof of Theorem 2, that
In view of (27), the error on B2 dominates the error on B1. Moreover, from (26) it follows that the value of Ω in (10) with i, j restricted to those in \(\mathcal I_{2}\) is asymptotically at most 2r+ 1/p. Hence,
where the asymptotic inequality follows from Theorem 1. The proof is complete. □
Theorem 3 still does not cover the whole class of r-times continuously differentiable functions. The last theorem of this section does so, at the expense of an asymptotic factor depending on f.
Theorem 4
If δ(h) = δ0 > 0, then for all f ∈ Cr([a, b]) we have
where \(f^{(r)}_{\delta _{0}}(x)=\max \limits \left (|f^{(r)}(x)|,\delta _{0}\right )\).
Proof
For any subinterval Ii of an m th partition we have
Since the maximum ratio of the highest to the lowest values of \(f_{\delta _{0}}^{(r)}(x)\) in the same subinterval goes to one as \(m\to +\infty ,\) the theorem follows directly from the proof of Theorem 1. □
Example 2
Consider the function
for which the 4th derivative,
changes its sign 32 times (see Fig. 2).
In Fig. 3, we present the quality of \(L^{\infty }\) approximation of g using ALGORITHM, for two extreme choices of the function δ; namely, δ(h) = 0 and \(\delta (h)=10^{4}\). For comparison, we also include the corresponding error for the uniform subdivision.
For δ(h) = 0, i.e., for \(\overline p_{f}(I)=|\mathcal L_{I}(f)|,\) the error seems to decrease at speed \(m^{-4}\), despite the fact that neither the assumptions of Theorem 3 nor those of Theorem 4 are fulfilled. However, the error fluctuates because of difficulties in properly estimating the local errors in the intervals where g(4) changes its sign. Much better results are obtained for the “safe” choice \(\delta (h)=10^{4}\), i.e., for \(\overline p_{f}(I)=\max \limits \left (|\mathcal L_{I}(f)|,(10h)^{4}\right ),\) for which Theorem 4 applies.
5 Automatic approximation
In this section, we deal with automatic approximation. Ideally, we should have a procedure that for a given function f and an error threshold ε > 0 returns a partition, for which the corresponding approximation, say \(\mathcal A(f,\varepsilon ),\) satisfies
Obviously, such a procedure does not exist if it is supposed to work for all f ∈ Cr([a, b]) and ε > 0, and use only finitely many function evaluations. We shall show however that the inequality (28) can be achieved asymptotically, as ε → 0+.
Since the accuracy ε (instead of the number m of subintervals) is now an input parameter, in this section we use the asymptotic notation with respect to ε → 0+. That is,
\(a(\varepsilon )\lessapprox b(\varepsilon )\quad \text {iff}\quad \limsup _{\varepsilon \to 0^{+}}\frac {a(\varepsilon )}{b(\varepsilon )}\le 1.\)
To begin with, consider the following recursive procedure that corresponds to a local subdivision strategy. Here S is a set of subintervals. It is initially empty, and at the end it contains all subintervals in the resulting partition.
![figure b](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs11075-021-01114-9/MediaObjects/11075_2021_1114_Figb_HTML.png)
For simplicity, we restrict our analysis to the priority function
for some Δ > 0 (for the case Δ = 0, see Remark 8).
Suppose that AUTO1 is run for a given f ∈ Cr([a, b]) and a threshold e = ε. Let mε be the number of subintervals in the resulting partition. Then
which implies that for the corresponding approximation
Thus, we have achieved our goal; however, the obtained partition is (almost) optimal only for \(p=+\infty \). Indeed, it is easy to see that AUTO1 tries to keep all the local errors proportional to \(h_{i}^{1/p},\) so that the factor depending on f in the overall error equals \(\left \|f_{\Delta }^{(r)}\right \|_{L^{1/r}(a,b)}\) instead of \(\left \|f_{\Delta }^{(r)}\right \|_{L^{1/(r+1/p)}(a,b)},\) where fΔ is any function in Cr([a, b]) such that \(f_{\Delta }^{(r)}(x)=\max \limits \left (|f^{(r)}(x)|, \frac {|\gamma _{r}|}{\alpha _{r,p}}{\Delta }\right )\).
To construct a procedure for \(1\le p<+\infty \) that uses an (almost) optimal partition, consider the following modification of AUTO1.
![figure c](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs11075-021-01114-9/MediaObjects/11075_2021_1114_Figc_HTML.png)
When run for f ∈ Cr([a, b]) with e = ε, this procedure keeps all the local errors at most ε, and the approximation corresponding to the resulting partition equals \(L_{m_{\varepsilon },r}^{***}f,\) where mε is as before the number of subintervals in the resulting partition. Then
where the first inequality follows from Theorem 3. Thus, to reach an ε-approximation it suffices to run AUTO2 with e = ε∗ such that \(\varepsilon ^{*}m_{\varepsilon ^{*}}^{1/p}\le \varepsilon \).
The value of ε∗ can be found as follows. We first use the lower bound of Proposition 2,
together with (30) to get that \(\left \|f_{\Delta }^{(r)}\right \|_{L^{1/(r+1/p)}(a,b)} \lessapprox \frac {r!}{\alpha _{r,p}} \varepsilon m_{\varepsilon }^{r+1/p}\). Then
Hence, to have the error asymptotically at most ε, it suffices that
That is, the procedure AUTO2 may be run with
(Observe that ε∗ = ε if \(p=+\infty ,\) which is consistent with the previous considerations).
To summarize, our algorithm consists of two steps. First, we run the recursive procedure AUTO2 with the error threshold e = ε and find mε. Second, we resume the recursion with the updated threshold e = ε∗ given by (31) to get the final partition. If the recursion is implemented using a stack, then the cost of the algorithm is proportional to \(m_{\varepsilon ^{*}},\) which in turn is proportional to \(\|f_{\Delta }^{(r)}\|_{L^{1/(r+1/p)}(a,b)}^{1/r}\varepsilon ^{-1/r}\).
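The two-step strategy above can be sketched in Python. This is our illustration only, not the paper’s exact AUTO2: the acceptance test reflects the statement that AUTO2 keeps all local errors at most e, the local estimate mirrors the priority with constant δ(h) = Δ, and instead of the closed-form threshold (31) we simply iterate until the condition \(\varepsilon ^{*}m_{\varepsilon ^{*}}^{1/p}\le \varepsilon \) stabilizes; all names, the equispaced nodes, and t0 = 1/2 are our assumptions.

```python
def local_estimate(f, c, d, r=4, p=2.0, delta=0.0):
    # Local error estimate h^{1/p} * max(|L_I(f)|, delta * h^r); the form of
    # the safeguard term is our reading of the modified priority.
    h = d - c
    us = [c + h * i / (r - 1) for i in range(r)]   # equispaced nodes (assumed)
    u0 = c + 0.5 * h                               # extra point t0 = 1/2 (assumed)
    err = f(u0)
    for i, ui in enumerate(us):
        li = 1.0
        for j, uj in enumerate(us):
            if j != i:
                li *= (u0 - uj) / (ui - uj)
        err -= f(ui) * li                          # subtract the interpolant at u0
    return h ** (1.0 / p) * max(abs(err), delta * h ** r)

def auto2(f, c, d, e, parts, **kw):
    # Recursive local subdivision: accept I once its local estimate is <= e.
    if local_estimate(f, c, d, **kw) <= e:
        parts.append((c, d))
    else:
        mid = 0.5 * (c + d)
        auto2(f, c, mid, e, parts, **kw)
        auto2(f, mid, d, e, parts, **kw)

def approximate(f, a, b, eps, p=2.0, **kw):
    # Two-step driver: run with e = eps to learn the partition size m, then
    # tighten e = eps / m^{1/p} until the threshold stabilizes, so that
    # e * m^{1/p} <= eps holds for the final partition.
    e = eps
    while True:
        parts = []
        auto2(f, a, b, e, parts, p=p, **kw)
        e_new = eps / len(parts) ** (1.0 / p)
        if e_new >= e:
            return parts
        e = e_new
```

The loop terminates because the thresholds decrease monotonically and the (integer) partition size stabilizes as they converge.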
Denote the resulting approximation by \(\mathcal A_{2}(f,\varepsilon )\).
Theorem 5
For all functions f ∈ Cr([a, b]) we have
i.e., an ε-approximation is achieved asymptotically as ε → 0+.
Example 3
Results of numerical tests for the automatic approximation of the functions f and g of Examples 1 and 2 using AUTO2 are presented, correspondingly, in Tables 4 and 5. We observe perfect behavior of the algorithm for f and Δ = 0, for \(p=1,2,+\infty \). Things are quite different for g. If Δ = 0, then the algorithm wrongly estimates the \(L^{\infty }\) error and terminates too early. Much better is the “safe” choice \({\Delta }=10^{4}\).
Remark 8
It is easy to verify using Theorems 1 and 2 that if Δ = 0 in (29), i.e., when the priority \(\overline p_{f}=p_{f},\) then Theorem 5 holds true provided f(r) does not nullify, or f(r) does not change its sign and the condition (16) is fulfilled. Moreover, in the latter case, it is possible to obtain an ε-approximation non-asymptotically. Indeed, it is enough to change the “if” condition in AUTO1 to βr, ppf([a, b]) ≤ e, where βr, p is as in Lemma 1, and run the procedure with e = ε. It immediately follows from (20) that then we get for sure an approximation with error at most ε.
The existence of a recursive procedure corresponding to AUTO2 that uses an (almost) optimal partition is problematic. Instead, one can apply the following iterative procedure, which is based on our initial algorithm discussed in Section 4.
![figure d](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs11075-021-01114-9/MediaObjects/11075_2021_1114_Figd_HTML.png)
It is worth mentioning that AUTO3 produces an approximation with an error unnecessarily much smaller than the required ε, and consequently its running time is much higher than that of AUTO2. This is due to the fact that \(\beta _{r,p}\,p_{f}(I)\) in (20) usually considerably overestimates the error in any interval I. For instance, consider again the \(L^{\infty }\) approximation of the function f from the previous examples. Let \(\beta _{4,\infty }\) be defined as in (19). Then, for \(\varepsilon =10^{-3},10^{-6},10^{-9}\), the procedure AUTO3 produces approximations with errors 3.0211e−06, 3.3098e−10, and 5.6843e−14 using 88, 878, and 9749 subintervals, respectively (compare with the corresponding results for AUTO2 in Table 4).
6 Remarks on numerical integration
Adaptive quadratures are frequently used for automatic integration, i.e., for approximating the integral
$$\mathcal If={\int \limits _{a}^{b}}f(x)\,\mathrm dx \qquad (32)$$
with a prescribed accuracy.
Such quadratures can be obtained, for instance, by integrating the interpolant \(L_{m,r}f\), which results in the compound quadrature \(Q_{m,r}f={\int \limits _{a}^{b}}(L_{m,r}f)(x)\,\mathrm dx\).
Then our results for the \(L^{1}\) approximation provide upper bounds for the quadrature error, and the procedures constructed for automatic approximation can equally well be used for automatic integration. Indeed, we have
$$|\mathcal If-Q_{m,r}f|=\Big |{\int \limits _{a}^{b}}\big (f-L_{m,r}f\big )(x)\,\mathrm dx\Big |\le \|f-L_{m,r}f\|_{L^{1}(a,b)}.$$
The bound above often overestimates the actual error. This happens when the degree of exactness of the quadrature \(Q_{m,r}\) is at least r. Then, for s ≥ r + 1 and for any function \(f\in C^{s}([a,b])\) with \(f^{(s)}\ne 0\), the error \(|\mathcal If-Q_{m,r}f|\) is of order \(m^{-s}\), while \(\|f-L_{m,r}f\|_{L^{1}(a,b)}\) decreases to zero no faster than \(m^{-r}\).
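To make the construction concrete, here is a minimal sketch of a compound quadrature obtained by integrating a piecewise interpolant over a given partition. For simplicity we take r = 2, so the interpolant is piecewise linear and the resulting rule is the composite trapezoid rule; this is an illustration, not the paper's choice of base rule.

```python
def trapezoid(f, x0, x1):
    # Integrate the linear interpolant of f on [x0, x1] (the case r = 2).
    return 0.5 * (x1 - x0) * (f(x0) + f(x1))

def compound_quadrature(f, partition):
    """Compound rule Q_{m,r}f: sum of the base rule over the m
    subintervals of a partition given as a list of breakpoints."""
    return sum(trapezoid(f, x0, x1)
               for x0, x1 in zip(partition, partition[1:]))
```

Since the rule integrates the interpolant exactly, the quadrature error on each subinterval is bounded by the local \(L^{1}\) interpolation error, in line with the bound above.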
Consider now the case when the base quadrature \(\mathcal Q_{r}\) for approximating the integral \({{\int \limits }_{0}^{1}}f(x)\,\mathrm dx\) is such that its degree of exactness equals r − 1, and the Peano kernel of the error functional \(f\mapsto {{\int \limits }_{0}^{1}}f(x)\,\mathrm dx-\mathcal Q_{r}f\) does not change its sign. The quadrature \(\mathcal Q_{r}\) may, but does not have to, use the points (1). (Obvious examples include the Newton-Cotes or Gauss-Legendre quadratures.) Suppose that the integral (32) is approximated by the compound quadrature \(\mathcal Q_{m,r}\) corresponding to \(\mathcal Q_{r}\), applied to a given partition consisting of m subintervals. Then there is \(\lambda _{r}\) such that the quadrature error in each subinterval \([x_{j-1},x_{j}]\) equals
$$\lambda _{r}\,f^{(r)}(\xi _{j})\,(x_{j}-x_{j-1})^{r+1}\quad \text {for some } \xi _{j}\in [x_{j-1},x_{j}]. \qquad (33)$$
If \(f^{(r)}\) does not change its sign in [a, b], then formula (33) allows us to apply the whole machinery of Sections 3 and 4 to claim that an asymptotically optimal partition makes all local integration errors equal. For the corresponding quadrature \(\mathcal Q_{m,r}^{*}\) we have
$$\lim _{m\to \infty } m^{r}\,\big |\mathcal If-\mathcal Q_{m,r}^{*}f\big |=|\lambda _{r}|\,\big \|f^{(r)}\big \|_{L^{1/(r+1)}(a,b)},$$
which reproduces the results of [8] for r = 4, and those of [2, 3] for arbitrary r. Moreover, if the quadrature uses the partition produced by ALGORITHM, then its error bound is asymptotically worse than the optimal error by the factor \(\kappa _{r,1}\).
An example is provided by the standard adaptive Simpson quadrature [5], where r = 4, the points in (1) are \((t_{0}, t_{1}, t_{2}, t_{3}, t_{4})=(0, \frac 14, \frac 12, \frac 34,1)\), and
$$\mathcal Q_{4}f=\frac {1}{12}\Big (f(0)+4f\big (\tfrac 14\big )+2f\big (\tfrac 12\big )+4f\big (\tfrac 34\big )+f(1)\Big ),$$
for which \(\lambda _{4}=-\frac {1}{46080}\).
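For illustration, a minimal recursive implementation of the classical adaptive Simpson scheme in the spirit of [5]. The acceptance test |S2 − S1| ≤ 15·tol is the standard local error estimate for r = 4; this sketch re-evaluates f at shared points and does not implement the asymptotically optimal strategy of this paper.

```python
def simpson(f, a, b):
    # Simple Simpson rule on [a, b] (degree of exactness 3 = r - 1 for r = 4).
    return (b - a) / 6.0 * (f(a) + 4.0 * f(0.5 * (a + b)) + f(b))

def adaptive_simpson(f, a, b, tol):
    """Classical adaptive Simpson quadrature: accept a subinterval when
    the two-panel refinement agrees with the one-panel rule to 15*tol,
    otherwise bisect and split the tolerance between the halves."""
    m = 0.5 * (a + b)
    s1 = simpson(f, a, b)
    s2 = simpson(f, a, m) + simpson(f, m, b)
    if abs(s2 - s1) <= 15.0 * tol:
        return s2 + (s2 - s1) / 15.0   # Richardson extrapolation
    return (adaptive_simpson(f, a, m, 0.5 * tol)
            + adaptive_simpson(f, m, b, 0.5 * tol))
```

The subdivision it performs tends to equalize the local errors across subintervals, which is exactly the property of the asymptotically optimal partitions discussed above.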
References
Davis, P., Rabinowitz, P.: Methods of Numerical Integration, 2nd edn. Academic Press, New York (1984)
Goćwin, M.: On the optimal adaptive quadratures for automatic integration. BIT Numerical Mathematics, to appear
Jagieła, K.: Construction of optimal adaptive quadratures of arbitrary order (in Polish). Master Thesis, University of Warsaw (2015)
Kincaid, D., Cheney, W.: Numerical Analysis. Mathematics of Scientific Computing, 3rd ed. AMS, Providence (2002)
Lyness, J.N.: Notes on the adaptive Simpson quadrature routine. J. ACM 16, 483–495 (1969)
Lyness, J.N.: Guidelines for automatic quadrature routines. In: Freeman, C.V. (ed.) Information Processing 71, vol. 2, pp 1351–1355. North-Holland Publ (1972)
Novak, E.: On the power of adaption. J. Complex. 12, 199–238 (1996)
Plaskota, L.: Automatic integration using asymptotically optimal adaptive Simpson quadrature. Numer. Math. 131, 173–198 (2015)
Plaskota, L., Wasilkowski, G.W.: Adaption allows efficient integration of functions with unknown singularities. Numer. Math. 102, 123–144 (2005)
Plaskota, L., Wasilkowski, G.W.: Uniform approximation of piecewise r-smooth and globally continuous functions. SIAM J. Numer. Anal. 47, 762–785 (2009)
Plaskota, L., Wasilkowski, G.W., Zhao, Y.: The power of adaption for approximating functions with singularities. Math. Comput. 77, 2309–2338 (2008)
Traub, J.F., Wasilkowski, G.W., Woźniakowski, H.: Information-Based Complexity. Academic Press, Boston (1988)
Trojan, G.M.: Asymptotic setting for linear problems, manuscript (See also [12]) (1983)
Funding
L. Plaskota was supported by the National Science Centre, Poland, under project 2017/25/B/ST1/00945.
Cite this article
Plaskota, L., Samoraj, P. Automatic approximation using asymptotically optimal adaptive interpolation. Numer Algor 89, 277–302 (2022). https://doi.org/10.1007/s11075-021-01114-9