Abstract
Scientific problems often feature observational data received in the form \(w_1=l_1(f),\ldots ,w_m=l_m(f)\) of known linear functionals applied to an unknown function f from some Banach space \(\mathcal {X}\), and it is required either to approximate f (the full approximation problem) or to estimate a quantity of interest Q(f). In typical examples, the quantities of interest can be the maximum/minimum of f or some averaged quantity such as the integral of f, while the observational data consists of point evaluations. To obtain meaningful results about such problems, it is necessary to possess additional information about f, usually in the form of an assumption that f belongs to a certain model class \(\mathcal {K}\) contained in \(\mathcal {X}\). This is precisely the framework of optimal recovery, which has produced substantial investigations when the model class is a ball in a smoothness space, e.g., when it is the unit ball of a Lipschitz, Sobolev, or Besov space. This paper is concerned with other model classes described by approximation processes, as studied in DeVore et al. [Data assimilation in Banach spaces, (To Appear)]. Its main contributions are: (1) designing implementable optimal or near-optimal algorithms for the estimation of quantities of interest, and (2) constructing linear optimal or near-optimal algorithms for the full approximation of an unknown function using its point evaluations. While the existence of linear optimal algorithms for the approximation of linear functionals Q(f) is a classical result established by Smolyak, a numerically friendly procedure that performs this approximation is not generally available. In this paper, we show that in classical recovery settings, such linear optimal algorithms can be produced by constrained minimization methods. We illustrate these techniques on several examples involving the computation of integrals using point evaluation data.
In addition, we show that linearization of optimal algorithms can be achieved for the full approximation problem in the important situation where the \(l_j\) are point evaluations and \(\mathcal {X}\) is a space of continuous functions equipped with the uniform norm. It is also revealed how quasi-interpolation theory enables the construction of linear near-optimal algorithms for the recovery of the underlying function.
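The general pipeline — intersect the model class with the data-consistent functions, compute the range of Q over that set, and output the midpoint — can be sketched on a toy example. The sketch below uses an illustrative model class (polynomials on \([0,1]\) with \(\ell_1\)-bounded monomial coefficients), which is not one of the classes studied in the paper; with this class the set of consistent values of \(Q(f)=\int_0^1 f\) is an interval whose endpoints are given by two linear programs, and its midpoint is the pointwise optimal estimate.

```python
import numpy as np
from scipy.optimize import linprog

def qoi_interval(x, w, d, budget=1.0):
    """Range of Q(p) = int_0^1 p over polynomials p of degree <= d with
    ||coefficients||_1 <= budget (a toy model class) that match the data
    p(x_j) = w_j. Returns (lo, hi, midpoint); the midpoint is the
    Chebyshev center of the interval, hence the pointwise optimal estimate."""
    n = d + 1
    q = 1.0 / np.arange(1, n + 1)            # integrals of the monomials x^k
    A = np.vander(x, n, increasing=True)     # data map: A @ c = w
    # split c = u - v with u, v >= 0; optimization variable z = [u, v]
    A_eq = np.hstack([A, -A])
    A_ub = np.ones((1, 2 * n))               # sum(u) + sum(v) <= budget
    vals = []
    for sign in (+1, -1):                    # minimize, then maximize, q . c
        res = linprog(sign * np.concatenate([q, -q]),
                      A_ub=A_ub, b_ub=[budget],
                      A_eq=A_eq, b_eq=w, bounds=(0, None))
        assert res.success
        vals.append(sign * res.fun)
    lo, hi = min(vals), max(vals)
    return lo, hi, 0.5 * (lo + hi)
```

Any data-consistent f in the class has Q(f) inside the returned interval, so the midpoint errs by at most half the interval's length — the radius of information for this toy problem.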
Notes
It is worth mentioning that the classical notion of a Chebyshev center of a set considered in this paper is different from the more computable notion considered in [8], which corresponds to the center of the largest ball contained in the given set.
There are various conditions on Q ensuring that \(Q(\mathcal{K}_w(\epsilon ,V))\) is bounded, e.g., Q being a Lipschitz map.
The correction (4.4) may be omitted, since a pointwise near-optimal algorithm is already provided by \(w\mapsto v(w)\), but it makes the algorithm A data-consistent.
References
Adcock, B., Hansen, A.C.: Stable reconstructions in Hilbert spaces and the resolution of the Gibbs phenomenon. Appl. Comput. Harm. Anal. 32, 357–388 (2012)
Adcock, B., Hansen, A.C., Poon, C.: Beyond consistent reconstructions: optimality and sharp bounds for generalized sampling, and application to the uniform resampling problem. SIAM J. Math. Anal. 45, 3132–3167 (2013)
Adcock, B., Platte, R.B., Shadrin, A.: Optimal sampling rates for approximating analytic functions from pointwise samples. IMA J. Numer. Anal. (2018). https://doi.org/10.1093/imanum/dry024
Bakhvalov, N.S.: On the optimality of linear methods for operator approximation in convex classes of functions. USSR Comput. Math. Math. Phys. 11, 244–249 (1971)
Binev, P., Cohen, A., Dahmen, W., DeVore, R., Petrova, G., Wojtaszczyk, P.: Convergence rates for Greedy algorithms in reduced basis methods. SIAM J. Math. Anal. 43, 1457–1472 (2011)
Binev, P., Cohen, A., Dahmen, W., DeVore, R., Petrova, G., Wojtaszczyk, P.: Data assimilation in reduced modeling. SIAM/ASA J. Uncertain. Quant. 5, 1–29 (2017)
Bojanov, B.: Optimal recovery of functions and integrals. First European Congress of Mathematics, Vol. I (Paris, 1992), pp. 371–390, Progress in Mathematics, 119, Birkhäuser, Basel (1994)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Cohen, A., Dahmen, W., DeVore, R.: Compressed sensing and best \(k\)-term approximation. J. Am. Math. Soc. 22, 211–231 (2009)
Coppersmith, D., Rivlin, T.: The growth of polynomials bounded at equally spaced points. SIAM J. Math. Anal. 23, 970–983 (1992)
Creutzig, J., Wojtaszczyk, P.: Linear versus nonlinear algorithms for linear problems. J. Complex. 20, 807–820 (2004)
CVX Research, Inc. CVX: matlab software for disciplined convex programming, version 2.1. http://cvxr.com/cvx (2014)
DeVore, R.: Nonlinear approximation. Acta Numer. 7, 51–150 (1998)
DeVore, R., Lorentz, G.G.: Constructive Approximation, vol. 303. Springer Grundlehren, Berlin (1993)
DeVore, R., Petrova, G., Wojtaszczyk, P.: Data assimilation and sampling in Banach spaces. Calcolo 54, 1–45 (2017)
Driscoll, T.A., Hale, N., Trefethen, L.N. (eds.): Chebfun Guide. Pafnuty Publications, Oxford (2014)
Elad, M.: Sparse and Redundant Representations. Springer, Berlin (2010)
Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing. Birkhäuser, Basel (2013)
Kalman, J.A.: Continuity and convexity of projections and barycentric coordinates in convex polyhedra. Pac. J. Math. 11, 1017–1022 (1961)
Lindenstrauss, J.: Extension property for compact operators. Mem. Am. Math. Soc. 48 (1964)
Marcinkiewicz, J., Zygmund, A.: Mean values of trigonometrical polynomials. Fundamenta Mathematicae 28, 131–166 (1937)
Micchelli, C., Rivlin, T.: Lectures on optimal recovery. Numerical analysis (Lancaster, 1984), 21–93, Lecture Notes in Math., 1129, Springer, Berlin (1985)
Micchelli, C., Rivlin, T., Winograd, S.: The optimal recovery of smooth functions. Numerische Mathematik 26, 191–200 (1976)
Milman, V., Schechtman, G.: Asymptotic Theory of Finite Dimensional Normed Spaces, Lecture Notes in Mathematics, vol. 1200. Springer, Berlin (1986)
Osipenko, K.Yu.: Best approximation of analytic functions from information about their values at a finite number of points. Math. Notes Acad. Sci. USSR 19(1), 17–23 (1976)
Platte, R., Trefethen, L., Kuijlaars, A.: Impossibility of fast stable approximation of analytic functions from equispaced samples. SIAM Rev. 53, 308–318 (2011)
Schönhage, A.: Fehlerfortpflanzung bei Interpolation. Numer. Math. 3, 62–71 (1961)
Traub, J., Wozniakowski, H.: A General Theory of Optimal Algorithms. Academic Press, New York (1980)
Turetskii, A.H.: The bounding of polynomials prescribed at equally distributed points. Proc. Pedag. Inst. Vitebsk 3, 117–127 (1940). [Russian]
Wilson, M.W.: Necessary and sufficient conditions for equidistant quadrature formula. SIAM J. Numer. Anal. 7(1), 134–141 (1970)
Zippin, M.: Extension of Bounded Linear Operators, Handbook of the Geometry of Banach Spaces, vol. 2, pp. 1703–1741. North-Holland, Amsterdam (2003)
Zygmund, A.: Trigonometric Series. Cambridge University Press, Cambridge (2002)
Additional information
Communicated by Wolfgang Dahmen.
This research was supported by the ONR Contracts N00014-15-1-2181 and N00014-16-1-2706, the NSF Grant DMS 1521067, and DARPA through Oak Ridge National Laboratory; by the NSF Grant DMS 1622134; and by the National Science Centre, Poland, Grant UMO-2016/21/B/ST1/00241.
Appendix
In this appendix, we provide full justifications for several results that we relied on without proof, namely (3.12), (4.1), and Lemma 4.1. We start with (3.12).
Lemma 6.1
For any polynomial \(r\in \mathcal{P}_d\), one has
$$\begin{aligned} |r(1)| \le \frac{d+1}{\sqrt{2}}\, \Vert r\Vert _{L_2[-1,1]}, \end{aligned}$$
(6.1)
and the inequality is sharp.
Proof
Let us consider the expansion of r with respect to the Legendre polynomials \(P_j\) normalized so that \(P_j(1)=1\) and \(\Vert P_j\Vert _{L_2[-1,1]}^2=\dfrac{2}{2j+1}\); that is,
$$\begin{aligned} r = \sum _{j=0}^{d} c_j P_j. \end{aligned}$$
Since
$$\begin{aligned} r(1) = \sum _{j=0}^{d} c_j P_j(1) = \sum _{j=0}^{d} c_j, \end{aligned}$$
the statement in the lemma follows from the fact that
$$\begin{aligned} \bigg | \sum _{j=0}^{d} c_j \bigg | \le \bigg ( \sum _{j=0}^{d} \frac{2}{2j+1}\, c_j^2 \bigg )^{1/2} \bigg ( \sum _{j=0}^{d} \frac{2j+1}{2} \bigg )^{1/2} = \frac{d+1}{\sqrt{2}}\, \Vert r\Vert _{L_2[-1,1]}, \end{aligned}$$
which is a consequence of the Cauchy–Schwarz inequality and of \(\sum _{j=0}^{d} \dfrac{2j+1}{2} = \dfrac{(d+1)^2}{2}\).
Inequality (6.1) is sharp because all inequalities become equalities for \(r=k(1,\cdot )\). \(\square \)
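The lemma can be checked numerically. A short sketch (assuming the bound in question is \(|r(1)|\le \frac{d+1}{\sqrt{2}}\Vert r\Vert _{L_2[-1,1]}\), as reconstructed from the proof): the extremal polynomial \(r=k(1,\cdot )=\sum _{j=0}^d \frac{2j+1}{2}P_j\) attains the ratio exactly, while random polynomials stay below it.

```python
import numpy as np
from numpy.polynomial import legendre as leg

d = 7
# extremal polynomial r = k(1, .) = sum_j (2j+1)/2 * P_j, in the Legendre basis
coef = (2 * np.arange(d + 1) + 1) / 2
# Gauss-Legendre rule with d+1 nodes integrates polynomials of degree <= 2d+1 exactly
nodes, weights = leg.leggauss(d + 1)

def sup_ratio(c):
    """|r(1)| / ||r||_{L2[-1,1]} for the polynomial r with Legendre coefficients c."""
    return abs(leg.legval(1.0, c)) / np.sqrt(weights @ leg.legval(nodes, c) ** 2)

bound = (d + 1) / np.sqrt(2)
```

For the extremal coefficients above, \(r(1)=\Vert r\Vert _{L_2}^2=(d+1)^2/2\), so the ratio equals the bound exactly.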
Next, we continue by restating (4.1).
Lemma 6.2
Let V be an n-dimensional subspace of C(D) and \(x_1,\ldots ,x_m \in D\) be m distinct points in D. If \(\mathcal{N}:= \{ \eta \in C(D): \eta (x_1)= \cdots = \eta (x_m)=0 \}\), then
Proof
In view of (1.2), it is enough to establish that
Let us define
and pick \(v \in V\) with \(\max _{1 \le j \le m} |v(x_j)| = 1\) and \(\Vert v\Vert _{C(D)} = \mu \). If \( \mu > 1\), choose \(x^* \in D\) such that \(|v(x^*)| = \mu \) and therefore \(x^* \not \in \{x_1,\ldots ,x_m\}\). If \(\mu = 1\), choose \(x^* \in D \setminus \{x_1,\ldots ,x_m\}\) such that \(|v(x^*)| \ge \mu - \delta \) for an arbitrarily small \(\delta > 0\). We introduce a function \(h \in C(D)\) satisfying
Clearly, the function \(\eta := v-h\) belongs to \(\mathcal{N}\), and we have
Since \(\delta > 0\) was arbitrary, this proves (6.2). \(\square \)
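In the special case \(m = \dim V\) with V unisolvent at the \(x_j\), every \(v \in V\) equals \(\sum _j v(x_j)\ell _j\) for the Lagrange basis \(\ell _j\), so the quantity \(\mu \) from the proof is exactly the Lebesgue constant \(\max _{x \in D} \sum _j |\ell _j(x)|\). A small numerical sketch (polynomial interpolation on \([-1,1]\); the grid resolution is an illustrative choice) shows how strongly \(\mu \) depends on the placement of the points:

```python
import numpy as np

def lebesgue_constant(nodes, grid):
    """For m = n interpolation points, mu = max{||v||_C : v in V, max_j |v(x_j)| <= 1}
    equals the Lebesgue constant max_x sum_j |l_j(x)|, approximated on a fine grid."""
    n = len(nodes)
    L = np.ones((n, len(grid)))
    for j in range(n):            # evaluate the Lagrange basis l_j on the grid
        for k in range(n):
            if k != j:
                L[j] *= (grid - nodes[k]) / (nodes[j] - nodes[k])
    return np.abs(L).sum(axis=0).max()
```

For a dozen equispaced points the constant is in the tens (data at such points constrains \(v\) poorly in the uniform norm), whereas for Chebyshev points it stays below 3, consistent with the growth results of Coppersmith–Rivlin [10] and Schönhage [27] cited above.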
Finally, we prove Lemma 4.1 stated in a slightly different version below.
Lemma 6.3
Let \(\theta _1,\ldots ,\theta _N\) be N distinct points in \(\mathbb {R}^n\) with convex hull \(\mathcal{C}:= \mathrm{conv}\{\theta _1,\ldots ,\theta _N\}\). Then, there exist functions \(\psi ^{(N)}_j:\mathcal{C}\rightarrow \mathbb {R}\), \(j=1,\ldots ,N\), such that
(i) \(\psi ^{(N)}_1,\ldots ,\psi ^{(N)}_N\) are continuous on \(\mathcal{C}\);
(ii) for any linear function \(\lambda : \mathbb {R}^n \rightarrow \mathbb {R}\) (in particular for \(\lambda (\theta )=1\) and \(\lambda (\theta )=\theta \)),
$$\begin{aligned} \sum _{i=1}^N \psi ^{(N)}_i(\theta ) \lambda (\theta _i) = \lambda (\theta ) \qquad \text{ whenever } \theta \in \mathcal{C}; \end{aligned}$$
(iii) for all \(i=1,\ldots , N\), \(\psi ^{(N)}_i(\theta ) \ge 0\) whenever \(\theta \in \mathcal{C}\);
(iv) for all \(i,j = 1,\ldots ,N\), \(\psi ^{(N)}_i(\theta _j) = \delta _{i,j}\).
Proof
We proceed by induction on \(N \ge 1\). The result is clear for \(N=1\) and \(N=2\). Let us assume that it holds up to \(N-1\) for some integer \(N \ge 3\) and that we are given N distinct points \(\theta _1,\ldots ,\theta _N \in \mathbb {R}^n\). We separate two cases.
Case 1: Each \(\theta _j\) is an extreme point of \(\mathcal{C}:= \mathrm{conv} \{ \theta _1,\ldots , \theta _N \}\). In this case, we invoke the result of Kalman [19] and consider the functions \(\psi ^{(N)}_1,\ldots ,\psi ^{(N)}_N\) from [19] satisfying (i)–(iii). Condition (iv) then follows as a consequence of (ii)–(iii). Indeed, given \(j = 1,\ldots ,N\), one can find a linear function \(\lambda : \mathbb {R}^n \rightarrow \mathbb {R}\) such that \(\lambda (\theta _j)=0\) and \(\lambda (\theta _i) >0\) for all \(i \not = j\). Therefore,
$$\begin{aligned} 0 = \lambda (\theta _j) = \sum _{i=1}^N \psi ^{(N)}_i(\theta _j)\, \lambda (\theta _i), \end{aligned}$$
in which every summand is nonnegative by (iii), implies that \(\psi ^{(N)}_i(\theta _j) =0 \) for all \(i \not = j\), and then \(\psi ^{(N)}_j(\theta _j) =1 \) follows from \(\sum _{i=1}^N \psi ^{(N)}_i(\theta _j) =1\).
Case 2: One of the \(\theta _j\)’s belongs to the convex hull of the other \(\theta _i\)’s, say \(\theta _N \in \mathrm{conv} \{ \theta _1,\ldots , \theta _{N-1} \}\). Let \(\psi ^{(N-1)}_1,\ldots ,\psi ^{(N-1)}_{N-1}\) be the functions defined on \(\mathcal{C}= \mathrm{conv} \{ \theta _1,\ldots , \theta _{N-1} \} = \mathrm{conv} \{ \theta _1,\ldots , \theta _N \}\) that are obtained from the induction hypothesis applied to the \(N-1\) distinct points \(\theta _1,\ldots ,\theta _{N-1}\). Next, we introduce the set \(\Omega \), which has at least two elements, and the function \(\tau \), which is continuous on \(\mathcal{C}\), given by
$$\begin{aligned} \Omega := \big \{ i \in \{1,\ldots ,N-1\}: \psi ^{(N-1)}_i(\theta _N) > 0 \big \} \qquad \text{ and } \qquad \tau (\theta ) := \min _{i \in \Omega } \frac{\psi ^{(N-1)}_i(\theta )}{\psi ^{(N-1)}_i(\theta _N)}. \end{aligned}$$
Finally, we define functions \(\psi ^{(N)}_1,\ldots ,\psi ^{(N)}_N\) by
$$\begin{aligned} \psi ^{(N)}_i := \psi ^{(N-1)}_i - \psi ^{(N-1)}_i(\theta _N)\, \tau \quad \text{ for } i = 1,\ldots ,N-1, \qquad \text{ and } \qquad \psi ^{(N)}_N := \tau . \end{aligned}$$
These are continuous functions of \(\theta \in \mathcal{C}\), so (i) is satisfied. To verify (ii), given a linear function \(\lambda : \mathbb {R}^n \rightarrow \mathbb {R}\), we observe that
$$\begin{aligned} \sum _{i=1}^N \psi ^{(N)}_i(\theta )\, \lambda (\theta _i)&= \sum _{i=1}^{N-1} \psi ^{(N-1)}_i(\theta )\, \lambda (\theta _i) - \tau (\theta ) \sum _{i=1}^{N-1} \psi ^{(N-1)}_i(\theta _N)\, \lambda (\theta _i) + \tau (\theta )\, \lambda (\theta _N)\\&= \lambda (\theta ) - \tau (\theta )\, \lambda (\theta _N) + \tau (\theta )\, \lambda (\theta _N) = \lambda (\theta ). \end{aligned}$$
As for (iii), given \(\theta \in \mathcal{C}\), the fact that \(\psi ^{(N)}_N(\theta ) \ge 0\) is clear from the definition of \(\tau \), and for \(i=1,\ldots ,N-1\), the fact that \(\psi ^{(N)}_i(\theta ) \ge 0\) is equivalent to \(\psi ^{(N-1)}_i(\theta _N) \tau (\theta ) \le \psi ^{(N-1)}_i(\theta )\), which is obvious if \(i \not \in \Omega \) and follows from the definition of \(\tau \) if \(i \in \Omega \). Finally, to prove (iv), it is enough to verify that \(\psi ^{(N)}_i(\theta _i) = 1\) for all \(i=1,\ldots ,N\), which clearly holds for \(i=N\), and for \(i=1,\ldots ,N-1\), it is the identity \(\psi ^{(N-1)}_i(\theta _N) \tau (\theta _i) = 0\), valid both when \(i \not \in \Omega \) and when \(i \in \Omega \), that implies \(\psi ^{(N)}_i(\theta _i) = \psi ^{(N-1)}_i(\theta _i) =1\). We have now shown that the induction hypothesis holds for N, and this concludes the inductive proof. \(\square \)
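The Case 2 step is directly implementable. A minimal sketch (assuming, consistently with the identities checked in the proof, that \(\Omega =\{i:\psi ^{(N-1)}_i(\theta _N)>0\}\), \(\tau =\min _{i\in \Omega }\psi ^{(N-1)}_i/\psi ^{(N-1)}_i(\theta _N)\), and \(\psi ^{(N)}_i=\psi ^{(N-1)}_i-\psi ^{(N-1)}_i(\theta _N)\tau \) for \(i<N\), \(\psi ^{(N)}_N=\tau \)), starting from ordinary barycentric coordinates on a simplex as the base construction:

```python
import numpy as np

def simplex_coords(vertices, theta):
    """Barycentric coordinates on a nondegenerate simplex: solve
    sum_i psi_i * v_i = theta together with sum_i psi_i = 1
    (requires len(vertices) = n + 1 affinely independent points)."""
    A = np.vstack([vertices.T, np.ones(len(vertices))])
    return np.linalg.solve(A, np.append(theta, 1.0))

def extend(psi_prev, c):
    """One 'Case 2' step: append a point theta_N lying in the hull of the
    previous points, whose previous coordinates are c = psi_prev(theta_N)."""
    omega = np.flatnonzero(c > 0)
    def psi_new(theta):
        p = psi_prev(theta)
        tau = np.min(p[omega] / c[omega])
        return np.append(p - c * tau, tau)   # psi_i - c_i * tau, then psi_N = tau
    return psi_new
```

Properties (ii)–(iv) can then be verified numerically at the \(\theta _j\) and at interior points, e.g., for a triangle augmented by its centroid.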
Cite this article
DeVore, R., Foucart, S., Petrova, G. et al. Computing a Quantity of Interest from Observational Data. Constr Approx 49, 461–508 (2019). https://doi.org/10.1007/s00365-018-9433-7
Keywords
- Optimal recovery
- Data fitting
- Chebyshev centers
- \(\ell _1\)-minimization
- Quadrature formulas
- Reproducing kernel Hilbert spaces