Abstract
We study the embedding \(\mathrm {id}: \ell _p^b(\ell _q^d) \rightarrow \ell _r^b(\ell _u^d)\) and prove matching bounds for the entropy numbers \(e_k(\mathrm {id})\) provided that \(0<p<r\le \infty \) and \(0<q\le u\le \infty \). Based on this finding, we establish optimal dimension-free asymptotic rates for the entropy numbers of embeddings of Besov and Triebel–Lizorkin spaces of small dominating mixed smoothness, which gives a complete answer to an open problem mentioned in the recent monograph by Dũng, Temlyakov, and Ullrich. Both results rely on a novel covering construction recently found by Edmunds and Netrusov.
1 Introduction
Entropy numbers quantify the degree of compactness of a set, i.e., how well the set can be approximated by a finite set. Given a compact set K in a quasi-Banach space Y, the k-th entropy number \(e_k(K,Y)\) is defined to be the smallest radius \(\varepsilon > 0\) such that K can be covered with \(2^{k-1}\) copies of the ball \(\varepsilon B_Y\), i.e.,
The concept of entropy numbers can be easily extended to operators. Given a compact operator \(T: X \rightarrow Y\), where X and Y are quasi-Banach spaces, the k-th entropy number of the operator T is defined to be
If the spaces X, Y are clear from the context, we will abbreviate \(e_k(T:X \rightarrow Y)\) by \(e_k(T)\).
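In symbols, the two definitions above can be written in the following standard form. This is a sketch reconstructed from the verbal description; in particular, taking the centers \(y_i\) in Y (rather than in K) is an assumption.

$$\begin{aligned} e_k(K,Y)&:= \inf \Big \{ \varepsilon >0 : \exists \, y_1,\dots ,y_{2^{k-1}} \in Y \text { with } K \subset \bigcup _{i=1}^{2^{k-1}} (y_i + \varepsilon B_Y) \Big \} ,\\ e_k(T: X \rightarrow Y)&:= e_k(T(B_X),Y). \end{aligned}$$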
Entropy numbers (or the inverse concept of metric entropy) belong to the fundamental concepts of approximation theory. They appear in various approximation problems, e.g., in the estimation of the decay of operator eigenvalues [4, 11, 20]; in the estimation of learning rates for machine learning problems [39, 43]; or in bounding s-numbers such as approximation, Gelfand, or Kolmogorov numbers from below [4, 16]. We note that Gelfand numbers find application in the recent field of compressive sensing [6, 13, 16] and Information Based Complexity in general. Entropy numbers are also closely connected to small ball problems in probability theory [21, 24]. For further applications and basic properties, we refer to the monographs [5, 28], and the recent survey [8, Chapter 6].
The goal of this paper is to improve estimates for entropy numbers of embeddings between function spaces of dominating mixed smoothness
where \(\Omega \subset \mathbb {R}^n\) is a bounded domain, \(0<p_0, p_1, q_0, q_1 \le \infty \), and \(r_0-r_1>(1/p_0-1/p_1)_+\). The case \(A=B\) stands for the scale of Besov spaces of dominating mixed smoothness, while \(A=F\) refers to the scale of Triebel–Lizorkin spaces, which includes classical \(L_p\) and Sobolev spaces of mixed smoothness. That is why (1) also includes the classical embeddings
if \(r>1/p_0-1/p_1\). Function space embeddings of this type play a crucial role in hyperbolic cross approximation [8]. Entropy numbers of such embeddings have been the subject of intense study, see [42], [8, Chapt. 6], and the recent papers by A.S. Romanyuk [33, 34, 35, 36] and V.N. Temlyakov [40]. Note that there are a number of deep open problems connected to the case \(p_1 = \infty \), which reach out to probability and discrepancy theory [8, 2.6, 6.4].
Typically, one observes asymptotic decays of the form
where \(\eta >0\). This behavior is also well-known for s-numbers of these embeddings such as approximation, Gelfand, or Kolmogorov numbers, see [8] and the references therein. Although the main rate is the same as in the univariate case, the dimension still appears in the logarithmic term. We show that the logarithmic term completely disappears in regimes of small smoothness
That is, we establish sharp purely polynomial asymptotic bounds of the form
which depend on the underlying dimension n only in the constant. This settles several open questions stated in the literature [8, 42], see Sect. 5, and makes the framework highly relevant for high-dimensional approximation.
A key ingredient in the proof of (2) is a counterpart of Schütt’s theorem for the entropy numbers of the embedding
where \(0<p<r\le \infty \) and \(0<q\le u\le \infty \). We prove matching bounds for all parameter constellations. A particularly relevant case for the purpose of this paper is the situation where \(b\le d\) and
Here, we have the surprising behavior
Note that this relation is not a trivial extension of the classical Schütt result [38], which reads as
for the norm-1-embedding \(\mathrm {id}:\ell ^b_p \rightarrow \ell ^b_r\), where \(0<p\le r\le \infty \). In fact, using trivial embeddings would give an additional \(\log \)-term in the third case of (3). The absence of this \(\log \)-term makes (3) interesting and useful as we will see below.
For \(1 \le k \le \log (bd)\) and \(k \ge bd\), only trivial and standard volumetric arguments are required to establish matching bounds for the entropy numbers \(e_k(\mathrm {id}: \ell _p^b(\ell _q^d) \rightarrow \ell _r^b(\ell _u^d))\). The middle range \(\log (bd) \le k \le bd\) is much more involved. In general, it is far from straightforward to generalize the proof ideas from \(d=1\) (Schütt) to \(d>1\). Fortunately, the crucial work has already been done in a recent paper by Edmunds and Netrusov [10]. They prove a general abstract version of Schütt’s theorem for operators between vector-valued sequence spaces. It remains for us to turn these general, abstract bounds into explicit estimates for the entropy numbers \(e_k(\mathrm {id}:\ell _p^b(\ell _q^d)\rightarrow \ell _r^b(\ell _u^d))\). Unfortunately, the paper [10] is written very concisely, which makes it difficult to follow the arguments at several points. Hence, we decided to provide some additional, explanatory material. We hope that Sect. 3 helps a broader readership to appreciate the powerful ideas in [10], in particular, a novel covering construction based on dyadic grids.
Outline The paper is organized as follows. In Sect. 2, we recapitulate basic definitions and results including entropy numbers and Schütt’s theorem. Afterwards, in Sect. 3, we discuss the generalization of Schütt’s theorem by [10]. In Sect. 4, we show consequences of this result, including matching bounds for the entropy numbers \(e_k(\mathrm {id}:\ell _p^b(\ell _q^d) \rightarrow \ell _r^b(\ell _u^d))\). Finally, we improve upper bounds for the entropy numbers of Besov and Triebel–Lizorkin embeddings in regimes of small smoothness in Sect. 5.
Notation As usual \(\mathbb {N}\) denotes the natural numbers, \(\mathbb {N}_0:=\mathbb {N}\cup \{0\}\), \(\mathbb {Z}\) denotes the integers, \(\mathbb {R}\) the real numbers, \(\mathbb {R}_+\) the positive real numbers, and \(\mathbb {C}\) the complex numbers. For \(a\in \mathbb {R}\) we denote \(a_+ := \max \{a,0\}\). We write \(\log \) for the natural logarithm. \(\mathbb {R}^{m\times n}\) denotes the set of all \(m\times n\)-matrices with real entries and \(\mathbb {R}^n\) denotes the Euclidean space. Vectors are usually denoted with \(x,y\in \mathbb {R}^n\). For \(0<p\le \infty \) and \(x\in \mathbb {R}^n\), we use the quasi-norm \(\Vert x\Vert _p := (\sum _{i=1}^n |x_i|^p)^{1/p}\) with the usual modification in the case \(p=\infty \). If X is a (quasi-)normed space, then \(B_X\) denotes its unit ball and the (quasi-)norm of an element x in X is denoted by \(\Vert x\Vert _X\). If X is a Banach space, then we denote its dual by \(X^*\). We will frequently use the quasi-norm constant, i.e., the smallest constant \(\alpha _X\) satisfying
For a given \(0<p\le 1\) we say that \(\Vert \cdot \Vert _X\) is a p-norm if
As is well known, any quasi-normed space can be equipped with an equivalent p-norm (for a certain \(0<p\le 1\), see [2, 32]). If \(T:X\rightarrow Y\) is a continuous operator we write \(T\in \mathcal {L}(X,Y)\) and \(\Vert T\Vert \) for its operator (quasi-)norm. The notation \(X \hookrightarrow Y\) indicates that the identity operator \(\mathrm {Id}:X \rightarrow Y\) is continuous. For two non-negative sequences \((a_n)_{n=1}^{\infty },(b_n)_{n=1}^{\infty }\subset \mathbb {R}\) we write \(a_n \lesssim b_n\) if there exists a constant \(c>0\) such that \(a_n \le c\,b_n\) for all n. We will write \(a_n \simeq b_n\) if \(a_n \lesssim b_n\) and \(b_n \lesssim a_n\). If \(\alpha \) is a set of parameters, then we write \(a_n \lesssim _{\alpha } b_n\) if there exists a constant \(c_{\alpha }>0\) depending only on \(\alpha \) such that \(a_n \le c_{\alpha }\,b_n\) for all n.
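For reference, the two conditions on \(\Vert \cdot \Vert _X\) alluded to above have the following standard form. This is a sketch in the notation of this section, not a verbatim copy of the original displays.

$$\begin{aligned} \Vert x+y\Vert _X&\le \alpha _X \left( \Vert x\Vert _X + \Vert y\Vert _X\right) \quad \text {for all } x,y \in X,\\ \Vert x+y\Vert _X^p&\le \Vert x\Vert _X^p + \Vert y\Vert _X^p \quad \text {for all } x,y \in X. \end{aligned}$$

For instance, for \(0<p\le 1\) and \(n \ge 2\), the quasi-norm \(\Vert \cdot \Vert _p\) on \(\mathbb {R}^n\) is a p-norm, and its quasi-norm constant equals \(2^{1/p-1}\); the extremal pair is given by two distinct unit vectors.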
Let \(b,d \in \mathbb {N}\). For \(0<p,q \le \infty \), the bd-dimensional mixed space \(\ell _p^b(\ell _q^d)\) is defined as the space of all matrices \(x \in \mathbb {R}^{b\times d}\) equipped with the mixed (quasi-)norm
with the usual modification that the corresponding sum is replaced by a maximum in the case that either \(p=\infty \) or \(q=\infty \). We always refer to the \(\ell _p\)-space supported on \([b]:=\{1,\ldots ,b\}\) as the outer space and to the \(\ell _q\)-space supported on [d] as the inner space. For any \(S\subset [b]\times [d]\) and \(x\in \mathbb {R}^{b\times d}\) we define \(x_S\) as the matrix \((x_S)_{ij} = x_{ij}\) for \((i,j)\in S\), \((x_S)_{ij} = 0\) for \((i,j)\in S^c\).
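Written out, the mixed (quasi-)norm described above has the following standard form for \(p,q<\infty \). The display is a sketch consistent with the convention that the outer \(\ell _p\)-space runs over the rows \(i \in [b]\) and the inner \(\ell _q\)-space over the columns \(j \in [d]\); the example matrix is chosen here purely for illustration.

$$\begin{aligned} \Vert x\Vert _{\ell _p^b(\ell _q^d)} := \Big ( \sum _{i=1}^b \Big ( \sum _{j=1}^d |x_{ij}|^q \Big )^{p/q} \Big )^{1/p}, \qquad x = \begin{pmatrix} 3 &{} 4\\ 5 &{} 12 \end{pmatrix}: \quad \Vert x\Vert _{\ell _1^2(\ell _2^2)} = 5+13 = 18, \quad \Vert x\Vert _{\ell _2^2(\ell _2^2)} = \sqrt{194}, \quad \Vert x\Vert _{\ell _\infty ^2(\ell _2^2)} = 13. \end{aligned}$$

In the example, the rows have \(\ell _2\)-norms 5 and 13, and the outer norm is then applied to the vector (5, 13).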
2 Entropy Numbers and Schütt’s Theorem
Let us recall basic notions and properties concerning entropy numbers. Let K be a subset of a quasi-Banach space Y. Given \(\varepsilon > 0\), an \(\varepsilon \)-covering is a set of points \(x_1,\dots ,x_n \in K\) such that
An \(\varepsilon \)-packing is a set of points \(x_1,\dots , x_m \in K\) such that \(\Vert x_i - x_j\Vert _Y > \varepsilon \) for pairwise different i, j. The covering number \(N_\varepsilon (K,Y)\) is the smallest n such that there exists an \(\varepsilon \)-covering of K, while the packing number \(M_\varepsilon (K, Y)\) is the largest m such that there exists an \(\varepsilon \)-packing of K. It is easy to see that
The metric entropy is defined to be
see Remark 4 for the relation of metric entropy to other notions of entropy.
The k-th entropy number \(e_k(K,Y)\) can be redefined as
It is easy to see that the sequence of entropy numbers is non-increasing, i.e., \(e_1 \ge e_2 \ge \dots \ge 0\). Moreover, the set K is compact in Y if and only if \(\lim _{k \rightarrow \infty } e_k(K,Y) = 0\).
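The notions of this section can be summarized as follows. This is a sketch under stated assumptions: centers of coverings lie in K as above, the metric entropy is taken with the logarithm to base 2, and \(\alpha _Y\) denotes the quasi-norm constant of Y (so \(\alpha _Y = 1\) if Y is normed). The original displays may differ in such details.

$$\begin{aligned}&K \subset \bigcup _{i=1}^n \left( x_i + \varepsilon B_Y\right) , \qquad M_{2\alpha _Y\varepsilon }(K,Y) \le N_\varepsilon (K,Y) \le M_\varepsilon (K,Y),\\&H_\varepsilon (K,Y) := \log _2 N_\varepsilon (K,Y), \qquad e_k(K,Y) = \inf \left\{ \varepsilon >0 : N_\varepsilon (K,Y) \le 2^{k-1}\right\} . \end{aligned}$$

The right-hand inequality holds because a maximal \(\varepsilon \)-packing is automatically an \(\varepsilon \)-covering. The left-hand one holds because two points in the same ball of radius \(\varepsilon \) have distance at most \(2\alpha _Y\varepsilon \), so each ball contains at most one point of a \(2\alpha _Y\varepsilon \)-packing.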
Let T denote an operator mapping between two quasi-Banach spaces X and Y. Recall from the introduction that the operator’s entropy numbers are given by
Clearly, we have
If \(T_1,T_2\) are both operators from X to Y, and Y is a \(\vartheta \)-normed space, then the entropy numbers of the sum can be estimated as follows
Moreover, if \(S \in \mathcal {L}(X,Y)\) and \(R \in \mathcal {L}(Y,Z)\) then
In particular, this gives
For further general properties of entropy numbers and basic estimates, we refer the reader to the monographs [5, 25, 29]. For remarks on the history of entropy number research, see [5, 43].
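The properties referred to in this section have the following standard form in the literature. The display is a sketch in the notation used above (\(T, T_1, T_2: X \rightarrow Y\), Y being \(\vartheta \)-normed, \(S \in \mathcal {L}(X,Y)\), \(R \in \mathcal {L}(Y,Z)\), and \(k, k_1, k_2 \in \mathbb {N}\)); the exact formulation of the original displays may differ.

$$\begin{aligned}&\Vert T\Vert \ge e_1(T) \ge e_2(T) \ge \dots \ge 0,\\&e_{k_1+k_2-1}^{\vartheta }(T_1+T_2) \le e_{k_1}^{\vartheta }(T_1) + e_{k_2}^{\vartheta }(T_2),\\&e_{k_1+k_2-1}(RS) \le e_{k_1}(R)\, e_{k_2}(S),\\&e_k(RS) \le \Vert R\Vert \, e_k(S), \qquad e_k(RS) \le e_k(R)\, \Vert S\Vert . \end{aligned}$$

The last line follows from the third by taking \(k_1 = 1\) or \(k_2 = 1\) and using \(e_1 \le \Vert \cdot \Vert \).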
In the concrete situation where \(X=\ell _p^b\) and \(Y=\ell _q^b\) for \(0<p \le q \le \infty \), the entropy numbers of the embedding \(\mathrm {id}: \ell _p^b \rightarrow \ell _q^b\) are completely understood in terms of their decay in k and b. This central result is often referred to as Schütt’s theorem. For its history and references, see Remark 3. We only state the interesting case \(0<p<q \le \infty \) here.
Theorem 1
(Schütt’s theorem) For \(0<p\le q \le \infty \) and \(k,b \in \mathbb {N}\), we have
The constants in the estimates depend neither on k nor on b.
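For orientation, Schütt’s theorem is usually stated in the following three-regime form. The display is a sketch of the standard statement and not a verbatim copy of the original; in particular, the logarithmic factor may equivalently be written as \(\log (eb/k)\), since \(\log (1+b/k) \simeq \log (eb/k)\) for \(k \le b\).

$$\begin{aligned} e_k(\mathrm {id}: \ell _p^b \rightarrow \ell _q^b) \simeq {\left\{ \begin{array}{ll} 1 &{}: 1 \le k \le \log _2(b),\\ \left( \frac{\log (1+b/k)}{k}\right) ^{1/p-1/q} &{}: \log _2(b) \le k \le b,\\ 2^{-k/b}\, b^{1/q-1/p} &{}: k \ge b. \end{array}\right. } \end{aligned}$$

As a quick consistency check, at \(k=b\) the second case gives \((\log (2)/b)^{1/p-1/q} \simeq b^{1/q-1/p}\) and the third gives \(2^{-1} b^{1/q-1/p}\), so the regimes match up to constants.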
Remark 2
Note that \(e_k(\mathrm {id}: \ell _\infty ^b \rightarrow \ell _\infty ^b) = 1\) as long as \(k \le b\) because \(\Vert x - y\Vert _\infty = 2\) for different \(x,y \in \{-1,1\}^b\).
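The counting behind this remark can be made explicit. The following lines are a sketch of the standard pigeonhole argument; \(y\) denotes an arbitrary center and is introduced here only for illustration.

$$\begin{aligned}&\sharp \{-1,1\}^b = 2^b > 2^{k-1} \quad \text {for } k \le b,\\&x, x' \in y + \varepsilon B_{\ell _\infty ^b} \;\Longrightarrow \; \Vert x-x'\Vert _\infty \le 2\varepsilon < 2 \quad \text {if } \varepsilon < 1. \end{aligned}$$

Hence a ball of radius \(\varepsilon < 1\) contains at most one sign vector, so \(2^{k-1}\) such balls cannot cover \(\{-1,1\}^b \subset B_{\ell _\infty ^b}\), which gives \(e_k \ge 1\). The reverse inequality follows from \(e_k \le e_1 \le \Vert \mathrm {id}\Vert = 1\).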
Remark 3
In 1984, Schütt [38] gave a proof for the general case of symmetric Banach spaces, which implies Theorem 1 if \(1\le p \le q \le \infty \). In the range \(1 \le k \le b\), the upper bound was first proved for all \(0<p \le q \le \infty \) by Edmunds and Triebel [11] in 1996 by covering the unit ball using suitable sparse vectors. Edmunds and Netrusov [9, Thm. 2] generalized this covering construction in 1998 to arbitrary quasi-Banach spaces. In the same paper, Edmunds and Netrusov also proved matching lower bounds for general quasi-Banach spaces [9, Thm. 2]. Kühn [22] also proved the lower bound for \(e_k(\mathrm {id}: \ell _p^b \rightarrow \ell _q^b)\) with \(0<p\le q \le \infty \) in 2001. Both [9, Thm. 2] and [22] rely on the very same idea to pack the unit ball with sparse vectors and use the fundamental combinatorial fact discussed in Remark 12 (ii) below. In 2000, Guédon and Litvak [15, Thm. 6] provided an alternative proof of Theorem 1 that relies completely on interpolation arguments and improved the constants in the upper bound.
Remark 4
The concept of metric entropy for compact sets has been introduced independently by Kolmogorov [18] and Pontrjagin and Schnirelmann [31]. It should not be confused with the metric entropy of a dynamical system, which also has been introduced by Kolmogorov [19]. The latter entropy is also called Kolmogorov-Sinai entropy or measure-theoretic entropy. However, these two notions of metric entropy are related [1]. There is also a deep connection between Kolmogorov-Sinai entropy and the notions of information entropy and thermodynamic entropy [3].
3 Edmunds–Netrusov Revisited
In addition to Schütt’s theorem, the main tool that we employ in this work is a powerful result by Edmunds and Netrusov [10]. They prove a generalization of Schütt’s theorem for vector-valued sequence spaces. Let us restate the part of their result that is relevant for us.
Theorem 5
(Theorems 3.1 and 3.2 in [10]) Let \(b \in \mathbb {N}\) such that \(b \ge 2\), \(0<p\le r\le \infty \) and let X and Y be \(\gamma \)-normed quasi-Banach spaces. For \(k,m \in \mathbb {N}\) such that \(m \le k\), let
and
For \(k \ge \log _2(b)\), we have the following.
(i) If \(k \le b\), then
$$\begin{aligned} e_k(\mathrm {id}:\ell _p^b(X) \rightarrow \ell _r^b(Y)) \simeq A(k,b). \end{aligned}$$
(ii) If \(k \ge b\), then there are absolute constants \(c_1, c_2\) such that
$$\begin{aligned} D(c_1 k/b, k) \lesssim e_k(\mathrm {id}:\ell _p^b(X) \rightarrow \ell _r^b(Y)) \lesssim D(c_2 k/b,k). \end{aligned}$$
Theorem 5 gives abstract lower and upper bounds that are “matching” in the sense that both have the same functional form. At first glance, this functional form is neither obvious nor easy to interpret. In addition, we found it difficult to follow the arguments in [10] at several points due to its succinct style of presentation. We thus believe that it is of value to review their key arguments and to provide some additional material that makes Theorem 5 more comprehensible. This is the subject of the remainder of this section. The reader who is only interested in applications of Theorem 5 may proceed directly to Sect. 4.
Remark 6
Theorems 3.1 and 3.2 in [10] are only stated for \(0<p<r\le \infty \). However, these theorems also hold true for \(p=r\). First note that in the latter case, we have
Now for \(k \ge b\), Theorem 5 has been proved in [27, Thm. 4.3]. For \(k \le b\), the lower bound in Theorem 5 is a consequence of [27, Thm. 4.3] in combination with arguments analogous to Remark 12; the upper bound is trivial.
3.1 A Special Case to Begin with
If \(p=r=\infty \) it is clear that one simply has to take b-fold Cartesian products of the optimal covering and packing of \(B_X\) in Y to obtain the bounds
In any other case, simple Cartesian products will not be good enough.
The special case of equal inner spaces \(X=Y\) also allows for a rather straightforward solution if the dimension of the inner space is finite. To make the contribution of [10] (see Theorem 5 above) easier to understand, we find it instructive to give a direct proof of this special case and point out its limitations. Indeed, a straightforward generalization of the well-known Edmunds–Triebel covering construction [11] based on volume arguments suffices to establish the optimal upper bound. Recall that the essence of this covering construction is a result from best s-term approximation, sometimes referred to as Stechkin’s inequality, see [8, Sect. 7.4], which yields an \(s^{-1/p+1/r}\)-covering of \(B_{\ell _p^b}\) in \(\ell _r^b\) using only s-sparse vectors. We simply have to extend this approach to row-sparse matrices. To improve readability, we will omit some technical details in the following proof.
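The best s-term estimate mentioned above can be derived in a few lines. The following is a sketch of the standard argument and is not taken from the source; \(x^*\) denotes the non-increasing rearrangement of \((|x_i|)_{i=1}^b\) and \(x_{[s]}\) the vector that keeps only the s largest entries of x in modulus (both notations are introduced here for illustration), and we assume \(0<p<r<\infty \).

$$\begin{aligned}&s\,(x^*_s)^p \le \sum _{j=1}^s (x^*_j)^p \le \Vert x\Vert _p^p \quad \Longrightarrow \quad x^*_s \le s^{-1/p}\,\Vert x\Vert _p,\\&\Vert x - x_{[s]}\Vert _r^r = \sum _{j>s} (x^*_j)^{r-p}(x^*_j)^p \le (x^*_s)^{r-p}\, \Vert x\Vert _p^p \le s^{-(r-p)/p}\,\Vert x\Vert _p^r. \end{aligned}$$

Taking r-th roots gives \(\Vert x - x_{[s]}\Vert _r \le s^{-(1/p-1/r)}\Vert x\Vert _p\). Applied to the vector of row norms \((\Vert x_{i\cdot }\Vert _X)_{i=1}^b\), the same computation covers row-sparse approximation of matrices.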
Proposition 7
Let \(0<p \le r \le \infty \) and X be \(\mathbb {R}^d\) (quasi-)normed with \(\Vert \cdot \Vert _X\). Further let \(b,d \in \mathbb {N}\) and \(d>5\). Then, for \(1 \le k \le bd\),
Proof
The first case is trivial. The last case follows from volumetric arguments using the recent findings in [17, Sect. 3.2]. By these we know that
and similarly for \({{\,\mathrm{vol}\,}}(B_{\ell _r^b(X)})^{1/(bd)}\). For \(k>bd\) we use the standard volume argument to obtain
For the second case let \(s \in [b]\). Clearly, we have that
where \( B_I := \{ x \in B_{\ell _p^b(X)} : \Vert x_{i\cdot }\Vert _X \ge \Vert x_{j\cdot }\Vert _X \text { for } i \in I, j\in [b]\setminus I\}\). When we replace the s rows with the largest \(\Vert \cdot \Vert _X\)-(quasi-)norm by 0 in \(x \in B_I\), then the resulting matrix has an \(\ell _r^b(X)\)-(quasi-)norm of at most \(s^{-(1/p-1/r)}\), which follows from a well-known relation for best s-term approximation in \(\ell _r\). Hence, if we wish to cover the set \(B_I\) by balls of radius \(\varepsilon \simeq s^{-(1/p-1/r)}\), it suffices to take care of the s largest rows of the matrices in \(B_I\). That is, we take a suitable covering of \(B_{\ell _p^s(X)}\) in \(\ell _r^s(X)\) and append \(b-s\) zero rows to every matrix of the covering. A similar volumetric argument as above in (7) and (8) tells us that
so that we obtain a covering of \(B_I\) with cardinality \(2^{c_{p,q}sd}\).
Combining the coverings for all possible index sets I yields an \(\varepsilon \)-covering U of \(B_{\ell _p^b(X)}\) in \(\ell _r^b(X)\), where \(\varepsilon \simeq s^{-1/p+1/r}\), with cardinality
Now, given \(k \in [bd]\), we choose
such that
is assured. Consequently, we obtain the upper bound
\(\square \)
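The counting behind the choice of s in the preceding proof can be sketched as follows. We assume, as stated in the proof, that each \(B_I\) admits a covering of cardinality at most \(2^{csd}\) with an unspecified constant \(c>0\), and we use the standard bound \(\binom{b}{s} \le (eb/s)^s\) for the number of index sets I; the display is an illustration, not the original computation.

$$\begin{aligned} \sharp U \le \binom{b}{s}\, 2^{csd} \le \left( \frac{eb}{s}\right) ^{s} 2^{csd} = 2^{s\left( \log _2(eb/s) + cd\right) }. \end{aligned}$$

Requiring the exponent to be at most \(k-1\) suggests the choice \(s \simeq k/(\log (eb/k)+d)\), whenever this quantity is at least 1. The resulting radius \(\varepsilon \simeq s^{-(1/p-1/r)} \simeq \left( (\log (eb/k)+d)/k\right) ^{1/p-1/r}\) agrees with the rate stated in Theorem 13 (i.a) for \(d \le k \le bd\).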
Remark 8
One way to obtain the matching lower bound in the case \(X = Y\) is to generalize the proof idea underlying Schütt’s theorem (Theorem 1) in the case that \(\log (b) \le k \le b\). However, the standard combinatorial lemma is not sufficient here. A suitable packing to do this generalization has already been considered in [6, Prop. 5.3]. See also Remark 12 below.
3.2 The Covering Construction by Edmunds and Netrusov
The generalized Edmunds–Triebel covering is optimal for finite dimensional \(X=Y\), see Proposition 7 in the previous section. In the general situation, where X is compactly embedded into Y, it seems that the volumetric arguments underlying (10) are too coarse to obtain sharp estimates (at least in the finite dimensional situation). The main contribution of [10] is a covering construction which resolves this shortcoming by not using volumetric arguments at all. In particular, X and Y do not have to be finite dimensional. We give a detailed recapitulation of their idea in this section. For some comments concerning the lower bound in Theorem 5, see Remark 12 at the end of this section.
The covering in [10] works in the very general situation where we are given quasi-Banach spaces \(X_1,\dots , X_b\) and \(Y_1,\dots , Y_b\), see Proposition 10 below. The basic idea is to cover the unit ball \(B_{\ell _p(\{X_i\}_{i=1}^b)}\) by N cuboids
where \(v^1,\dots ,v^N \in \mathbb {R}_+^b\) and N is exponential in b (think of each cuboid as an anisotropically rescaled version of \(B_{\ell _\infty (\{X_i\}_{i=1}^b)}\)). The crux is to find suitable vectors \(v^i\) such that an optimal covering can be reached by covering the cuboid \(U(v^i)\) using a product of optimal coverings of \(B_{X_1},\dots ,B_{X_b}\). Edmunds and Netrusov [10] had the idea to consider vectors that form a dyadic grid derived from the simplex
The dyadic grid is constructed with the help of the following mapping. Let
and for \(x \in [0,1]^b\), put
This mapping \(\upsilon \) leads to a finite grid with the following properties.
Lemma 9
(Simplification of Lemma 2.2 in [10]) For \(b \in \mathbb {N}\), let \(\Gamma (b) = \upsilon (S(b))\). The set \(\Gamma (b)\) has the following properties.
(i) For all \(u \in S(b)\), there is \(v \in \Gamma (b)\) such that \(u_i \le v_i\) for all \(i \in [b]\).
(ii) For all \(v \in \Gamma (b)\), we have \(\Vert v\Vert _1 \le 2\).
(iii) For all \(v \in \Gamma (b)\), we have \(bv_i \in \mathbb {N}\) for each \(i \in [b]\).
(iv) We have \(\sharp \Gamma (b) \le 2^{3b}\).
Proof
Given \(x \in S(b)\), let \(v = \upsilon (x)\). We clearly have \(\sum _{i=1}^b v_i \le 2\) and \(b v_i \in \mathbb {N}\) for all indices \(i=1,\dots ,b\). Further
which is a crucial property to estimate the cardinality of the set \(\Gamma (b)\). Let
Clearly, \(\sharp B(v,k) \le \sharp C(v,k) \le \min \{b, b2^{1-k}\}\). Varying over all elements in the simplex, B(v, 0) can be any of the \(2^b\) subsets of [b]. Fixing B(v, 0), there are at most \(2^b\) possibilities for B(v, 1). Fixing B(v, 0) up to \(B(v,k-1)\), there are at most \(2^{b2^{1-k}}\) possibilities for B(v, k). Hence, in total the set \(\Gamma (b)\) may contain at most
many elements.\(\square \)
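The count at the end of the proof amounts to a geometric series. Assuming the factors are exactly those listed above (\(2^b\) choices for B(v, 0) and at most \(2^{b2^{1-k}}\) choices for B(v, k) with \(k \ge 1\)), the total can be sketched as

$$\begin{aligned} 2^{b} \cdot \prod _{k=1}^{\infty } 2^{b\,2^{1-k}} = 2^{\,b + b\sum _{k \ge 1} 2^{1-k}} = 2^{\,b + 2b} = 2^{3b}, \end{aligned}$$

which is consistent with the bound claimed in Lemma 9 (iv).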
The dyadic grid according to Lemma 9 allows us to establish the following upper bound on entropy numbers.
Proposition 10
(Reformulation of Lemma 2.3 in [10]) Let \(X_1,\dots ,X_b\) and \(Y_1,\dots ,Y_b\) be quasi-Banach spaces, let \(0<p\le r\le \infty \), and let \(k\in \mathbb {N}\) such that \(k \ge 8b\). Then, we have
Proof
Consider the transformed grid
By Lemma 9 (i), we have
where U(v) is the cuboid defined in (11).
Let \(v \in \Gamma (b,p)\) be given by \(v = (v_1^{1/p}, \dots , v_b^{1/p})\). For each
let \(\mathcal C_i\) be an \(e_{m_i}(v_i^{1/p}B_{X_i},Y_i)\)-covering. Then, for every \(x \in U(v)\), there is \(y \in \ell _r^b(Y)\) such that \(y_{i\cdot } \in \mathcal C_i\) and
By construction of the set \(\Gamma (b)\), we have \(\left( \sum _{i=1}^b v_i \right) ^{1/r} \le 2^{1/r}\) and
Finally, note that the product \(\mathcal C_1 \times \cdots \times \mathcal C_b\) has cardinality
which, in combination with \(\sharp \Gamma (b,p) \le 2^{3b}\), implies the desired result.\(\square \)
Proposition 10 does not yet give the complete answer. For \(k \le b\), we have to modify the proof of Proposition 7. We sketch the proof and refer to the proof of [10, Thm. 3.1] for technical details.
Proposition 11
Let \(\log _2(b) \le k \le b\). Then, we have
where A(k, b) is defined in Theorem 5.
Proof
(Proof sketch) Let \(s \in [k]\). It is clear that, analogously to (9), we have
Similar to Proposition 7, we can use a covering for \(B_{\ell _p^s(X)}\) to construct a covering for \(B_I\). Consider now \(\varepsilon = e_k(B_{\ell _p^s(X)},\ell _r^s(Y))\) and let \(\Gamma _0\) be a minimal \(\varepsilon \)-covering of \(B_{\ell _p^s(X)}\) in \(\ell _r^s(Y)\). Let \(\Gamma _I = \Gamma _0 \times \{0\}^{[b]\setminus I}\). Then, for every \(x \in B_I\), there is \(y \in \Gamma _I\) such that
where the second term on the right-hand side follows from the best s-term approximation result already used in Proposition 7. Consequently, we have
In contrast to Proposition 7, volumetric arguments would now give a suboptimal estimate for the entropy numbers \(e_k(B_{\ell _p^s(X)},\ell _r^s(Y))\). In this general situation, it requires Proposition 10 with \(X_1=\dots =X_b=X\) and \(Y_1=\dots =Y_b=Y\) to get the proper estimate. Concretely, since \(s \le k\), we have
which leads in combination with Proposition 10 and (12) to an upper bound of the form
The usual arguments show that it is optimal to choose s of the order \(k/\log (eb/k)\).\(\square \)
Remark 12
We close this section with some remarks concerning the lower bound in Theorem 5. Its proof relies on two surprisingly simple observations, see [10] for details.
(i) Let M be a maximal \(\varepsilon \)-packing of \(B_X\) in Y. Using the Gilbert–Varshamov bound, which is well-known in coding theory [14, 41], we know that \((2s)^{-1/p}M^{2s} \subset B_{\ell _p^b(X)}\) contains N elements of mutual distance \(s^{1/r-1/p}\varepsilon \), where \(N \simeq \mathrm {card}(M)^s\). This leads to the lower bound
see [27, p. 68] and [10, Lem. 2.6] for a more general formulation. Given \(k \in \mathbb {N}\), we have to make a good choice for the dimension s to maximize the lower bound. Choose \(s = k/m\) for some \(m \in [k]\) to obtain
If \(k \le b\), we conclude
If \(k \ge b\), then \(m \ge k/b\) guarantees \(s=k/m \le b\) and thus
(ii) Choose a vector \(x \in B_X\) such that
We construct a packing by building row-sparse matrices, where the nonzero rows contain copies of x and the row support sets are chosen according to the following combinatorial fact that is well-known in various disciplines of mathematics, see e.g., [13, Lemma 10.12], [22], [12], or [30, Prop. 2.21, p. 219]. Given \(s,n \in \mathbb {N}\) such that \(0< s < n/2\), there exist subsets \(I_1, \ldots , I_N\) of [n], where
such that each subset \(I_i\) has cardinality 2s and
This leads to the lower bound
In view of the packing construction that we have mentioned in Remark 8, it is somewhat surprising that it is not necessary to combine the combinatorics of the two observations in order to obtain the optimal abstract bound in Theorem 5. An explanation is given in [27, Rem. 4.13, p. 69].
4 Consequences of the Edmunds–Netrusov Result
We discuss some consequences of Theorem 5. Let us begin with considering the entropy numbers
We have the following matching bounds.
Theorem 13
Let \(0<p \le r\le \infty \) and \(0< q \le u \le \infty \). Then, we have
For \(\log (bd) \le k \le bd\), we have the following case distinctions.
(i) Let \(1/p-1/r > 1/q-1/u \ge 0\).
(i.a) In the special case \(q=u\), we have
$$\begin{aligned} e_k \simeq {\left\{ \begin{array}{ll} 1 &{}: \log (bd) \le k \le d,\\ \left\{ \frac{\log (eb/k)+d}{k} \right\} ^{1/p-1/r} &{}: d \le k \le bd. \end{array}\right. } \end{aligned}$$
(i.b) If \(q<u\) and \(b \le d\), then
$$\begin{aligned} e_k \simeq {\left\{ \begin{array}{ll} \left( \frac{\log (ed/k)}{k}\right) ^{1/q-1/u} &{}: \log (bd) \le k \le d,\\ \left( \frac{d}{k}\right) ^{1/p - 1/r} d^{1/u - 1/q} &{}: d \le k \le bd. \end{array}\right. } \end{aligned}$$
(i.c) If \(q < u\) and \(d \le b\), then
$$\begin{aligned} e_k \simeq {\left\{ \begin{array}{ll} \max \left\{ \left( \frac{\log (eb/k)}{k}\right) ^{1/p-1/r}, \left( \frac{\log (ed/k)}{k}\right) ^{1/q-1/u} \right\} &{}: \log (bd) \le k \le d,\\ \max \left\{ \Big (\frac{\log (eb/k)}{k}\Big )^{1/p-1/r}, \Big (\frac{d}{k}\Big )^{1/p-1/r}d^{1/u-1/q} \right\} &{}: d \le k \le b,\\ \left( \frac{d}{k}\right) ^{1/p-1/r} d^{1/u-1/q} &{}: b \le k \le bd. \end{array}\right. } \end{aligned}$$
(ii) Let \(1/q-1/u \ge 1/p - 1/r \ge 0\). Then, we have
$$\begin{aligned} e_k \simeq {\left\{ \begin{array}{ll} \left( \frac{\log (ebd/k)}{k}\right) ^{1/p-1/r} &{}: \log (bd) \le k \le b\log (d),\\ b^{1/r-1/p} \left( \frac{b\log (ebd/k)}{k}\right) ^{1/q-1/u} &{}: b\log (d) \le k \le bd. \end{array}\right. } \end{aligned}$$
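A useful sanity check on the case distinctions is that neighboring regimes agree at their common boundary. The following computation is a sketch added for illustration; it uses only the formulas stated in (i.b) and (ii), and in the second line the elementary equivalence \(\log (ed/\log (d)) \simeq \log (d)\) for \(d \ge 2\).

$$\begin{aligned} \text {(i.b) at } k=d:&\quad \left( \frac{\log (e)}{d}\right) ^{1/q-1/u} = d^{1/u-1/q} = \left( \frac{d}{d}\right) ^{1/p-1/r} d^{1/u-1/q},\\ \text {(ii) at } k=b\log (d):&\quad \left( \frac{\log (ed/\log (d))}{b\log (d)}\right) ^{1/p-1/r} \simeq b^{1/r-1/p} \simeq b^{1/r-1/p}\left( \frac{b\log (ed/\log (d))}{b\log (d)}\right) ^{1/q-1/u}. \end{aligned}$$

Note also that the second regime in (i.b) is purely polynomial in k and carries no logarithmic factor.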
Proof
For \(1 \le k \le \log (bd)\) and \(k \ge bd\), it requires only standard volumetric arguments, see [27, Appendix A] for details. Let us also refer to [7, Lemma 3], where this case has been already considered. Let D(m, k) and A(k, b) be as defined in Theorem 5. Moreover, throughout the proof, we write for \(k,l \in \mathbb {N}\),
Ad (i.a). Since \(q=u\), it follows from Theorem 1 that \(e_l(\mathrm {id}: \ell _q^d \rightarrow \ell _u^d) \simeq 1\) for \(1 \le l \le d\) and consequently that \(D(1,k) = D(k/b,k) \simeq 1\) and \(A(k,b) \simeq 1\) for all \(k \le d\). Now, for \(k \ge d\), we have that \(s_{k,l} \simeq (l/k)^{1/p-1/r}\) for \(1 \le l \le d\), so the sequence is bounded from above by a monotonically increasing sequence. For \(d \le l \le k\), we have
Since \(2^{-l/d}\) decays faster in l than \((l/k)^{1/p - 1/r}\) increases, we conclude that for \(d \le l \le k\), the sequence \(s_{k,l}\) is “essentially monotonically decreasing”. More precisely, \(t_{k,l}\) attains its maximum at \(l=\beta _{p,r}d\), where the factor \(\beta _{p,r}\) depends only on p and r. Hence, the maximum of \(s_{k,l}\) can be bounded from above by a constant times the maximum of \(t_{k,l}\) and therefore by \(c_{p,r}(d/k)^{1/p-1/r}\). Using analogous arguments for D(k/b, k), we conclude that \(\widetilde{D}(1,k) = D(k/b,k) \simeq (d/k)^{1/p-1/r}\) and
for \(d \le k \le b\).
Ad (i.b). Consider now \(0<q<u\) and \(b \le d\). For \(\log (bd) \le k \le b\), we have, in consequence of Theorem 1, that \(s_{k,l} \simeq (l/k)^{1/p - 1/r}\) for \(1 \le l \le \log (d)\) and
Since \(1/p-1/r > 1/q-1/u\), the sequence \(s_{k,l}\) is bounded from above and below up to a constant by a monotonically increasing sequence and consequently, the maximum is attained at \(l=k\) such that \(D(1,k) \simeq (\log (ed/k)/k)^{1/q-1/u}\). Since \(b \le d\), we further have
For \(b \le k \le d\) we find as before that \(D(k/b, k) \simeq (\log (ed/k)/k)^{1/q-1/u}\) and for \(d < k \le bd\), we have the estimate
Ad (i.c). Consider now \(d \le b\). For \(\log (bd) \le k \le d\), we find \(D(1,k) \simeq (\log (ed/k)/k)^{1/q-1/u}\) since the sequence \(s_{k,l}\) is bounded from below and above by a sequence that increases monotonically in l. If \(d \le k \le b\), then
and
Finally, if \(b \le k \le bd\), then \(D(k/b,k) \simeq (d/k)^{1/p-1/r} d^{1/u-1/q}\).
Ad (ii). For \(\log (bd) \le k \le b\), we observe that
since the term \(e_\ell (B_{\ell _q^d}, \ell _u^d)\) is decaying in \(\ell \) at least as fast as \((\ell /k)^{1/p-1/r}\) is growing. Hence,
Next, we consider \(b \le k \le b \log (d)\). Since \(k/b \le \log (d)\), we find
where we have used \(b/k \le 1\) in the last estimate. At the same time, since \(k/b \le \log (d)\), we also have \(\log (bd/k) \gtrsim \log (d)\) and thus
Finally, for \(b \log (d) \le k \le bd\) it is easy to see that
\(\square \)
Remark 14
The upper bound for \(k \ge bd\) in Theorem 13 also follows from [7, Lem. 3]. The upper bound in Theorem 13 (ii) has also been proved in [42, Lem 3.16] for the range \(b \max \{\log (d),\log (b)\} \le k \le bd\). The proof there uses the following covering construction, which as far as we know, first appeared in [23, Proof of Prop. 4]. Let \(X_1,\dots ,X_b\) and \(Y_1,\dots , Y_b\) be (quasi-)Banach spaces and \(0<p,r\le \infty \). The covering rests on the idea to split the ball \(B_{\ell _p^b({X_1,\dots ,X_b})}\) into subsets of matrices with non-increasing rows,
where the union is taken over all permutations of [b]. This leads to the upper bound
for \(n_1,\dots ,n_b \in \mathbb {N}\). If
with \(0<q\le u\), and we choose \(n_j \simeq j^{-\alpha }\) for some \(0<\alpha <1\) such that
then (13) is strong enough to obtain the upper bound in Theorem 13 (ii), provided
Now we increase the level of abstraction and consider mixed norms of higher order. Let, for \(\mu =1,\ldots ,b\), the weighted spaces \(X_{\mu }\) and \(Y_{\mu }\) be given by
with \(0<p\le r\le \infty \), \(0<q\le u\le \infty \), and \(\alpha , \beta \in \mathbb {R}\). The dimensions \((d_\mu )_\mu \) and \((b_\mu )_{\mu }\) are non-decreasing sequences of natural numbers satisfying \(d_{\mu } \gtrsim b_{\mu }\). These spaces are used as “inner spaces” in the sense that
Note that for \(x=(x_{\mu ,i,j})_{\mu ,i,j} \in X\) with \(\mu =1,\dots ,b\), \(i=1,\dots ,b_{\mu }\), \(j=1,\dots ,d_\mu \), the norm is given by
We are interested in the behavior of the entropy numbers
in the special situation \(1/q-1/u < 1/p-1/r\).
Proposition 15
Let \(0 \le 1/q-1/u < 1/p-1/r\) and \(\alpha -\beta \le 1/p-1/r-(1/q-1/u)\). Let further X, Y and \(X_{\mu }\), \(Y_{\mu }\) be as above. Then we have for all \(k\ge 8b\) and \(k \ge \max \limits _{\mu = 1,\ldots ,b} d_{\mu }\)
Proof
We use Theorem 5, in particular the upper bound in Proposition 10. Since \(k\ge 8b\) we obtain
Let us evaluate the first \(\max [\cdots ]\). With Theorem 13, (i.b), (i.c) we have
Because \(1/p-1/r > 1/q-1/u\) the maximum is attained for \(\ell = d_{\mu }\), which leads to
Let us discuss the second \(\max [\cdots ]\). Using again Proposition 10 we obtain
Due to our assumption the exponent for \(d_{\mu }\) is positive in both cases. Since \(k \ge d_\mu \) we may replace \(d_\mu \) by k to increase the right-hand side. This leads to
\(\square \)
We are now aiming for a similar relation for small k.
Proposition 16
Let \(\alpha -\beta >0\) and \(1/p-1/r>1/q-1/u \ge 0\). Then we have for \(8b \le k \le \min \limits _{\mu =1,\ldots ,b} d_{\mu }\) the estimate
Proof
Again we use Theorem 5, in particular the upper bound in Proposition 10. This gives
where we used once again Theorem 13, (i.b). Clearly, we get
Since the function \(x \mapsto x^{-(\alpha -\beta )}[\log (ex)]^{(1/q-1/u)}\) is bounded on \([1,\infty )\) we conclude with
\(\square \)
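The boundedness used in the last step of the proof can be checked numerically. The sketch below is only an illustration and rests on assumptions that are not fixed by the text: the logarithm is taken to be the natural one, and the parameters `a` and `c` (standing for \(\alpha -\beta >0\) and \(1/q-1/u\ge 0\)) are hypothetical sample values. Substituting \(t=\log x\) shows that the supremum over \([1,\infty )\) equals 1 if \(c\le a\) and \((c/a)^c e^{a-c}\) otherwise, attained at \(x=e^{c/a-1}\).

```python
import math

def f(x, a, c):
    """x^(-a) * (log(e*x))^c, with the natural logarithm (an assumption)."""
    return x ** (-a) * (1.0 + math.log(x)) ** c

def sup_closed_form(a, c):
    """Supremum of f(., a, c) over [1, inf) for a > 0 and c >= 0."""
    if c <= a:
        return 1.0  # f is non-increasing on [1, inf), so the maximum sits at x = 1
    return (c / a) ** c * math.exp(a - c)  # maximum attained at x = exp(c/a - 1)

def sup_on_grid(a, c, x_max=1e6, steps=20000):
    """Crude numerical maximum over a logarithmic grid in [1, x_max]."""
    log_max = math.log(x_max)
    return max(f(math.exp(i * log_max / steps), a, c) for i in range(steps + 1))

# Hypothetical sample values: alpha - beta = 1/4 and 1/q - 1/u = 1.
a, c = 0.25, 1.0
print(sup_closed_form(a, c), sup_on_grid(a, c))
```

The constant obtained in this way depends only on \(\alpha -\beta \) and \(1/q-1/u\), which is all that the argument requires.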
5 Polynomial Decay of Entropy Numbers for Multivariate Function Space Embeddings
We come to the main subject of this paper, improved upper bounds for entropy numbers of function space embeddings (1) in regimes of small mixed smoothness.
5.1 Function Spaces of Dominating Mixed Smoothness
Besov and Triebel–Lizorkin spaces of mixed smoothness are typically defined via a dyadic decomposition on the Fourier side. Let \(\{\varphi _{j}\}_{j\in \mathbb {N}_0^n}\) be the standard tensorized dyadic decomposition of unity, see [37] and [42]. We further denote by \(S'(\mathbb {R}^n)\) the space of tempered distributions and by \(D'(\Omega )\) the space of distributions (dual space of \(D(\Omega )\), which represents the space of test functions on the bounded domain \(\Omega \subset \mathbb {R}^n\)). The Besov space of dominating mixed smoothness \(S^r_{p,q}B(\mathbb {R}^n)\) with smoothness parameter \(r > 0\) and integrability parameters \(0<p,q\le \infty \) is given by
with the usual modification in the case \(q=\infty \). The Triebel–Lizorkin space of dominating mixed smoothness \(S^r_{p,q}F(\mathbb {R}^n)\) is given by \((p<\infty )\)
The latter scale of spaces contains the classical \(L_p\) spaces and Sobolev spaces with dominating mixed smoothness if \(1<p<\infty \) and \(q=2\), namely we have \(S^0_{p,2}F(\mathbb {R}^n) = L_p(\mathbb {R}^n)\) and \(S^k_{p,2}F(\mathbb {R}^n) = S^k_pW(\mathbb {R}^n)\) for \(k\in \mathbb {N}\). Note that we also have \(S^r_{p,p}B(\mathbb {R}^n) = S^r_{p,p}F(\mathbb {R}^n)\) for all \(0<p<\infty \) and \(r\in \mathbb {R}\). Though we have the embedding
for \(p_0 \le p_1\) and \(r_0-r_1>1/p_0-1/p_1\), see [37, Chapt. 2], the embedding (16) is never compact. Hence, the entropy numbers of embeddings between function spaces defined on the whole \(\mathbb {R}^n\) do not converge to zero. We restrict our considerations to spaces on bounded domains \(\Omega \). Let \(\Omega \) be an arbitrary bounded domain in \(\mathbb {R}^n\). Then, we define \(S^r_{p,q}A(\Omega )\) for \(A\in \{B,F\}\) as
and its (quasi-)norm is given by \(\Vert f\Vert _{S^r_{p,q}A(\Omega )}:=\inf _{g|_{\Omega } = f} \Vert g\Vert _{S^r_{p,q}A}\). The embedding (16) transfers to the bounded domain \(\Omega \), where it is compact, so that the entropy numbers converge to zero.
5.2 Sequence Spaces
The key to establishing the decay rate of entropy numbers for the embedding (1) is a discretization technique which has been developed over the years by several authors beginning with Maiorov [26]. Later, after wavelet isomorphisms had been established, this technique was refined by Lemarié, Meyer, Triebel, and many others. In [42, Thm. 2.10] Vybíral gave the necessary modifications to deal with the above defined \(S^r_{p,q}A(\Omega )\) spaces in detail. The main advantage of this approach is that it transfers questions about function space embeddings to certain sequence spaces.
Using sufficiently smooth wavelets with sufficiently many vanishing moments (and the notation from [42]) the mapping
represents a sequence space isomorphism between \(S^r_{p,q}B(\mathbb {R}^n), S^r_{p,q}F(\mathbb {R}^n)\) and
respectively, with the usual modification in the case \(\max \{p,q\} = \infty \). Here we denote for \(j \in \mathbb {N}_0^n\) and \(m \in \mathbb {Z}^n\)
and
Further, \(\chi _{j,m}\) denotes the characteristic function of \(Q_{j,m}\). Consider the sequence spaces
with (quasi-)norms given by
Let us also define the following building blocks for \(\mu \in \mathbb {N}_0\) fixed
Clearly, for \(\mu \in \mathbb {N}_0\) we have
and \(\sharp A_{j}^\Omega \simeq 2^{\Vert j\Vert _1} = 2^{\mu }\). Consider
for \(r_0-r_1 > 1/p_0-1/p_1\) such that the embedding is compact. Defining the building blocks
we have \(\mathrm {id}= \sum _{\mu =0}^{\infty } \mathrm {id}_{\mu }\). Of course, the identity
holds true, where \(\mathrm {id}_{\mu }'\) denotes the corresponding embedding operator on the respective block (17). Although these operators have the same mapping properties we use different notations to formally distinguish between them. If \(a = a^{\dag } = b\) we also have, for a finite index set I, that
where \(X_{\mu } = 2^{\mu (r_0-1/p_0)}\ell _{q_0}^{(\mu +1)^{n-1}}(\ell _{p_0}^{2^\mu })\) and \(Y_{\mu } = 2^{\mu (r_1-1/p_1)}\ell _{q_1}^{(\mu +1)^{n-1}}(\ell _{p_1}^{2^\mu })\), which means \(d_{\mu } = 2^{\mu }\) and \(b_\mu = (\mu +1)^{n-1}\) in the notation of (14). In particular, we have \(b_{\mu } \lesssim d_{\mu }\) .
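To make the block dimensions concrete, the following sketch counts the levels \(j\in \mathbb {N}_0^n\) with \(\Vert j\Vert _1=\mu \) by brute force and compares the count with \(b_\mu =(\mu +1)^{n-1}\) and \(d_\mu =2^\mu \). It is an illustration under assumptions: the function names are hypothetical, and reading \(b_\mu \) as the number of such levels up to constants depending only on n is our interpretation of the notation above. The exact count is the binomial coefficient \(\binom{\mu +n-1}{n-1}\), which lies between \((\mu +1)^{n-1}/(n-1)!\) and \((\mu +1)^{n-1}\).

```python
from itertools import product
from math import comb, factorial

def count_levels(n, mu):
    """Brute-force count of j in N_0^n with j_1 + ... + j_n = mu."""
    return sum(1 for j in product(range(mu + 1), repeat=n) if sum(j) == mu)

def block_dims(n, mu):
    """(b_mu, d_mu) as stated in the text: ((mu + 1)^(n - 1), 2^mu)."""
    return (mu + 1) ** (n - 1), 2 ** mu

n = 3  # hypothetical number of variables
for mu in range(6):
    b_mu, d_mu = block_dims(n, mu)
    # columns: mu, exact level count, binomial formula, b_mu, d_mu
    print(mu, count_levels(n, mu), comb(mu + n - 1, n - 1), b_mu, d_mu)
```

Since \(b_\mu \) grows polynomially in \(\mu \) while \(d_\mu \) grows exponentially, the relation \(b_\mu \lesssim d_\mu \) holds with a constant depending only on n.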
5.3 Entropy Numbers
As a consequence of the boundedness of certain restriction and extension operators, see [42, 4.5], the investigation of entropy numbers of Besov space embeddings can be shifted to the sequence space side. We formulate our first result in the framework of sequence spaces, which improves the upper bound. More specifically, we prove that the lower bound in (23) is sharp in the case that \(0 \le 1/p_0-1/p_1<r_0-r_1 \le 1/q_0-1/q_1\), which also includes the limiting case \(r_0-r_1 = 1/q_0 - 1/q_1\). What is known in this direction is summarized in Remark 20 below.
Proposition 17
Let \(\Omega \) be a bounded domain and \(0<q_0 < q_1 \le \infty \), \(0<p_0\le p_1\le \infty \) such that
Then we have
Proof
The lower bound follows by [42, Thm. 3.18]. The upper bound is the actual contribution. We argue as follows.
Step 1. Put \(\varrho :=\min \{1,p_1,q_1\}\) and fix \(m\ge m_0\), where \(m_0\) is large enough (depending on \(p_0,p_1,q_0,q_1,r_0,r_1\)). We decompose the identity operator \(\mathrm {id}\) as follows
where \(L_m := \lfloor \log _2(m) \rfloor \) and \(M_m := \lfloor m/8 \rfloor \). With an eye on Proposition 10, this means, in particular, that \(m \ge 8L_m\) and \(m\ge 8M_m\) (for m large enough). Using (4) we obtain
Step 2. We estimate the first summand. By (18) this breaks down to the entropy numbers
with \(X_\mu , Y_\mu \) chosen as after (18) and I denotes the range for \(\mu \). Putting
and \(\beta :=r_1-1/p_1\) in (14), and using \(m \ge \max \{8L_m, \max \limits _\mu d_\mu \}\), we may apply Proposition 15 to obtain
Note that, due to Proposition 15, we only used that \(r_0-r_1 \le 1/q_0-1/q_1\). To estimate the first summand in (20) it is not needed that \(r_0-r_1>1/p_0-1/p_1\).
Step 3. Let us address the second summand in (20). Clearly, it can be reduced to (21) with spaces \(X_\mu , Y_\mu \) defined analogously, but with \(\mu \) running this time in the range
Hence, we have \(b := \sharp I = M_m \le \min \limits _\mu d_\mu \). We apply Proposition 16 to end up with (22). Note that we have used here only that \(\alpha -\beta >0\), or, equivalently, \(r_0-r_1>1/p_0-1/p_1\).
Step 4. Finally, we deal with the third summand in (20). Clearly, we have
This gives
This concludes the proof.\(\square \)
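The elementary inequalities behind the splitting in Step 1 of the proof above can be verified directly. The sketch below is a sanity check only, with hypothetical function names; it tests \(m\ge 8L_m\), \(m\ge 8M_m\), and \(2^{L_m}\le m<2^{L_m+1}\) for \(L_m=\lfloor \log _2(m)\rfloor \) and \(M_m=\lfloor m/8\rfloor \), and reports the smallest m from which on all three hold within a finite search range. The threshold \(m_0\) in the proof may have to be larger for other reasons, since it also depends on the parameters \(p_0,p_1,q_0,q_1,r_0,r_1\).

```python
def split_points(m):
    """Return (L_m, M_m) = (floor(log2(m)), floor(m / 8)) for an integer m >= 1."""
    return m.bit_length() - 1, m // 8

def conditions_hold(m):
    """Check m >= 8 L_m, m >= 8 M_m and 2^L_m <= m < 2^(L_m + 1)."""
    L, M = split_points(m)
    return m >= 8 * L and m >= 8 * M and 2 ** L <= m < 2 ** (L + 1)

def smallest_threshold(limit):
    """Smallest m0 such that conditions_hold(m) for every m with m0 <= m <= limit."""
    m0 = 1
    for m in range(1, limit + 1):
        if not conditions_hold(m):
            m0 = m + 1  # every failure pushes the threshold past m
    return m0

print(smallest_threshold(10_000))
```

In this check the only binding condition is \(m\ge 8L_m\), which fails for \(2\le m\le 39\) and holds from \(m=40\) onwards.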
In the next proposition we consider the situation where a Besov type sequence space compactly embeds into a Triebel–Lizorkin type sequence space. This setting is particularly important, since it leads to results with target space \(L_p\).
Proposition 18
Let \(\Omega \) be a bounded domain and \(0<q_0 < q_1 \le \infty \), \(0<p_0\le p_1 < \infty \), \(q_0 <p_0\), \(r_0>r_1\) such that
Then we have
Proof
Again, the lower bounds follow from [42, Thm. 3.18].
Step 1. In the case \(p_1 > q_1\) we use the commutative diagram in Fig. 1.
Hence, we may use Proposition 17 and obtain
Step 2. Now we consider \(p_1 < q_1\). After decomposing the identity operator in an analogous way as in (19) and (20) we use the commutative diagrams in Fig. 2 for the first and second summand, respectively. In fact, for the first summand in (20) we obtain by (6)
Note that the identity operator is bounded since \(\Omega \) is a bounded domain.
Furthermore, the entropy numbers
can be estimated by the same reasoning as in Step 2 of the proof of Proposition 17. Note that \(r_0-r_1\) may be smaller than \(1/p_0 - 1/q_1\). However, this is not important for the argument (based on Proposition 15). It remains to consider the second summand in (20). Here we use the right diagram in Fig. 2 and obtain
We continue to estimate the appearing entropy numbers as in Step 3 of the proof of Proposition 17. Note that \(r_0-r_1\) might be larger than \(1/q_0-1/p_1\). However, for the argument, we only need \(r_0-r_1>1/p_0-1/p_1\). This concludes the proof.\(\square \)
Let us finally consider the situation, where a Triebel–Lizorkin type sequence space compactly embeds into a Besov type sequence space.
Proposition 19
Let \(0<q_0 < q_1 \le \infty \), \(0<p_0\le p_1 < \infty \), \(q_1 > p_1\), \(r_0>r_1\) such that
and let \(\Omega \) be a bounded domain. Then we have
Proof
The lower bound follows from [42, Thm. 3.18].
Step 1. For the upper bound in the case \(p_0<q_0\) we may use the commutative diagram in Fig. 3 below to decompose the identity operator. Afterwards, we use (5) to reduce everything to the situation in Proposition 17.
Step 2. In the case \(p_0>q_0\) we argue analogously to Step 2 of Proposition 18. This time we use the decompositions in Fig. 4 for the first and second summand in (20), respectively.\(\square \)
Unfortunately, we were not able to find a corresponding result for the \(f-f\) situation. So, this remains an open problem.
Remark 20
To clarify the contribution of this paper, let us briefly recapitulate the known results and open questions which motivated this work. For several results and historical remarks on the subject we refer to [8] and the references therein. In particular, Vybíral [42, Thm. 4.9] proved for \(0<p_0\le p_1 \le \infty \) and \(0<q_0\le q_1 \le \infty \) in the case of small smoothness
that there is for any \(\varepsilon >0\) a number \(C_{\varepsilon }>0\) such that
The result is a direct consequence of the bound for \(r>\max \{1/p_0-1/p_1,1/q_0-1/q_1\}\) (the case of “large smoothness”), saying that
In fact, the entropy numbers in (23) can be bounded from above by
if \(q^{*}\ge q_0\). Now choose \(q_1>q^{*}>q_0\) such that \(1/q^{*}-1/q_1+\varepsilon /(n-1)=r_0-r_1>1/q^{*}-1/q_1\), which, together with (24) and \(q_0\) replaced by \(q^{*}\), implies (23).
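The choice of \(q^{*}\) in this argument is explicit, and a small worked example may help. The sketch below solves \(1/q^{*}-1/q_1+\varepsilon /(n-1)=r_0-r_1\) for \(q^{*}\) in exact rational arithmetic and checks \(q_0<q^{*}<q_1\). The function name and the sample values are hypothetical, `r` stands for \(r_0-r_1\), and \(q_1\) is assumed finite for simplicity.

```python
from fractions import Fraction

def choose_q_star(r, q0, q1, eps, n):
    """Solve 1/q* - 1/q1 + eps/(n - 1) = r for q* and check q0 < q* < q1.

    r stands for r0 - r1; q1 is assumed finite and n >= 2.
    """
    if n < 2:
        raise ValueError("n >= 2 is required")
    inv_q_star = r + Fraction(1) / q1 - eps / (n - 1)
    if inv_q_star <= 0:
        raise ValueError("eps is too large: 1/q* would not be positive")
    q_star = 1 / inv_q_star
    if not (q0 < q_star < q1):
        raise ValueError("no admissible q* strictly between q0 and q1")
    return q_star

# Hypothetical sample values: r0 - r1 = 1/4, q0 = 1, q1 = 2, eps = 1/4, n = 3.
q_star = choose_q_star(Fraction(1, 4), Fraction(1), Fraction(2), Fraction(1, 4), 3)
print(q_star)
```

The check \(q_0<q^{*}<q_1\) succeeds exactly when \(0<r_0-r_1-\varepsilon /(n-1)<1/q_0-1/q_1\), so for \(0<r_0-r_1\le 1/q_0-1/q_1\) every sufficiently small \(\varepsilon >0\) is admissible.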
The propositions proved above allow us to improve a number of existing results for the entropy numbers of the embedding
Theorem 21
Let \(\Omega \) be a bounded domain and \(A, A^{\dag } \in \{B,F\}\) but \((A,A^{\dag }) \ne (F,F)\). Let \(0<q_0 < q_1 \le \infty \), \(0<p_0\le p_1 < \infty \) and \(r_0>r_1\) such that
In addition, we assume \(q_0<p_0\) if \((A, A^{\dag }) = (B,F)\) and \(q_1>p_1\) if \((A, A^{\dag }) = (F,B)\), respectively. Then the following holds
Proof
The result is a direct consequence of Propositions 17, 18, 19 and the machinery described in the proof of [42, Thm. 4.11].\(\square \)
As a corollary of Theorem 21, we obtain the following result, which settles Open Problem 6.4 in [8].
Corollary 22
Let \(\Omega \) be as above. Let further \(0<q<p_0\le p_1\), \(1<p_1<\infty \) and \(1/p_0-1/p_1<r\le 1/q-1/2\). Then we have
Proof
Identifying \(S^0_{p_1,2}F(\Omega ) = L_{p_1}(\Omega )\) in the case \(1<p_1<\infty \), the result is a direct consequence of Theorem 21.\(\square \)
With the final corollary below (from Theorem 21) we close some more gaps in [42, Thm. 4.18 (ii), (iii)].
Corollary 23
Let \(\Omega \) be as above. We have the following sharp bounds for entropy numbers.
(i) Let \(1 < p \le \infty \) and \(1/p<r\le 1\). Then, we have
(ii) Let \(1< p< q <\infty \) and \(1/p-1/q<r\le 1/2\). Then, we have
(iii) Let \(0< q<p \le \infty \), \(q<1\) and \(1/p < r \le 1/q - 1\). Then, we have
Remark 24
Entropy numbers of mixed smoothness Sobolev-Besov embeddings into \(L_p\), where \(1\le p\le \infty \), recently gained significant interest, see [40] and [33,34,35,36]. There are some fundamental open problems connected with \(p=\infty \), see [8, 2.6, 6.4, 6.5]. Interestingly, when choosing the third index q small enough in Corollaries 22, 23 we get rid of the logarithm.
Change history
06 July 2021
A Correction to this paper has been published: https://doi.org/10.1007/s00365-021-09556-z
References
Akashi, S.: An operator theoretical characterization of \(\varepsilon \)-entropy in Gaussian processes. Kodai Math. J. 9(1), 58–67 (1986)
Aoki, T.: Locally bounded linear topological spaces. Proc. Imp. Acad. 18(10), 588–594 (1942)
Billingsley, P.: Ergodic Theory and Information. Wiley, New York (1965)
Carl, B.: Entropy numbers, \(s\)-numbers, and eigenvalue problems. J. Funct. Anal. 41(3), 290–306 (1981)
Carl, B., Stephani, I.: Entropy, Compactness and the Approximation of Operators. Cambridge University Press, Cambridge (1990)
Dirksen, S., Ullrich, T.: Gelfand numbers related to structured sparsity and Besov space embeddings with small mixed smoothness. J. Complex. 48, 69–102 (2018)
Dũng, D.: Non-linear approximations using sets of finite cardinality or finite pseudo-dimension. In: 3rd Conference of the Foundations of Computational Mathematics, vol. 17, pp. 467–492. 2001, Oxford (1999)
Dũng, D., Temlyakov, V.N., Ullrich, T.: Hyperbolic Cross Approximation. Advanced Courses in Mathematics. CRM Barcelona. Birkhäuser/Springer, Basel/Berlin (2018)
Edmunds, D.E., Netrusov, Y.: Entropy numbers of embeddings of Sobolev spaces in Zygmund spaces. Studia Math. 128(1), 71–102 (1998)
Edmunds, D.E., Netrusov, Y.: Schütt’s theorem for vector-valued sequence spaces. J. Approx. Theory 178, 13–21 (2014)
Edmunds, D.E., Triebel, H.: Function Spaces, Entropy Numbers, Differential Operators. Cambridge Tracts in Mathematics, vol. 120. Cambridge University Press, Cambridge (1996)
Foucart, S., Pajor, A., Rauhut, H., Ullrich, T.: The Gelfand widths of \(\ell _p\)-balls for \(0< p\le 1\). J. Complex. 26(6), 629–640 (2010)
Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing. Applied and Numerical Harmonic Analysis. Birkhäuser/Springer, New York (2013)
Gilbert, E.: A comparison of signalling alphabets. Bell Syst. Tech. J. 31, 504–522 (1952)
Guédon, O., Litvak, A.E.: Euclidean projections of a \(p\)-convex body. In: Geometric Aspects of Functional Analysis, volume 1745 of Lecture Notes in Math., pp. 95–108. Springer, Berlin (2000)
Hinrichs, A., Kolleck, A., Vybíral, J.: Carl’s inequality for quasi-Banach spaces. J. Funct. Anal. 271(8), 2293–2307 (2016)
Kempka, H., Vybíral, J.: Volumes of unit balls of mixed sequence spaces. Math. Nachr. 290(8–9), 1317–1327 (2017)
Kolmogorov, A.N.: On certain asymptotic characteristics of completely bounded metric spaces. Dokl. Akad. Nauk SSSR (N.S.) 108, 385–388 (1956)
Kolmogorov, A.N.: A new metric invariant of transient dynamical systems and automorphisms in Lebesgue spaces. Dokl. Akad. Nauk SSSR (N.S.) 119, 861–864 (1958)
König, H.: Eigenvalue Distribution of Compact Operators. Birkhäuser Verlag, Basel (1986)
Kuelbs, J., Li, W.V.: Metric entropy and the small ball problem for Gaussian measures. J. Funct. Anal. 116(1), 133–157 (1993)
Kühn, T.: A lower estimate for entropy numbers. J. Approx. Theory 110(1), 120–124 (2001)
Kühn, T., Leopold, H.-G., Sickel, W., Skrzypczak, L.: Entropy numbers of embeddings of weighted Besov spaces. Constr. Approx. 23(1), 61–77 (2006)
Li, W.V., Linde, W., et al.: Approximation, metric entropy and small ball estimates for Gaussian measures. Ann. Probab. 27(3), 1556–1578 (1999)
Lorentz, G.G., Golitschek, M.V., Makovoz, Y.: Constructive Approximation, volume 304 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, Advanced problems (1996)
Maĭorov, V.E.: Discretization of the problem of diameters. Uspehi Mat. Nauk 30(6(186)), 179–180 (1975)
Mayer, S.: Preasymptotics via metric entropy. Dr. Hut Verlag, Munich, PhD thesis (2018)
Pietsch, A.: Operator ideals, volume 16. Deutscher Verlag der Wissenschaften, (1978)
Pietsch, A.: Eigenvalues and \(s\)-Numbers. Cambridge Studies in Advanced Mathematics, vol. 13. Cambridge University Press, Cambridge (1987)
Pinkus, A.: \(n\)-widths in approximation theory. Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)], vol. 7. Springer-Verlag, Berlin (1985)
Pontrjagin, L., Schnirelmann, L.: Sur une propriété métrique de la dimension. Ann. Math. (2) 33(1), 156–162 (1932)
Rolewicz, S.: On a certain class of linear metric spaces. Bull. Acad. Polon. Sci 5, 471–473 (1957)
Romanyuk, A.S.: Entropy numbers and widths of the classes \(B^r_{p,\theta }\) of periodic functions of many variables. Ukraïn. Mat. Zh. 68(10), 1403–1417 (2016)
Romanyuk, A.S.: Estimation of the entropy numbers and Kolmogorov widths for the Nikol’skii-Besov classes of periodic functions of many variables. Ukrainian Math. J. 67(11), 1739–1757 (2016)
Romanyuk, A. S.: Translation of Ukraïn. Mat. Zh. 67 , no. 11, 1540–1556 (2015)
Romanyuk, A.S.: Entropy numbers and widths for the Nikol’skii–Besov classes of functions of many variables in the space \(L_\infty \). Anal. Math. 45(1), 133–151 (2019)
Schmeisser, H.-J., Triebel, H.: Topics in Fourier Analysis and Function Spaces. Wiley, Chichester (1987)
Schütt, C.: Entropy numbers of diagonal operators between symmetric Banach spaces. J. Approx. Theory 40(2), 121–128 (1984)
Temlyakov, V.: Greedy Approximation. Cambridge Monographs on Applied and Computational Mathematics, vol. 20. Cambridge University Press, Cambridge (2011)
Temlyakov, V.: On the entropy numbers of the mixed smoothness function classes. J. Approx. Theory 217, 26–56 (2017)
Varshamov, R.: Estimate of the number of signals in error correcting codes. Dokl. Akad. Nauk SSSR 117, 739–741 (1957)
Vybíral, J.: Function spaces with dominating mixed smoothness. Dissertationes Math. (Rozprawy Mat.) 436, 73 (2006)
Williamson, R.C., Smola, A.J., Schölkopf, B.: Entropy numbers, operators and support vector kernels. In: Fischer, P., Simon, H.U. (eds.) Computational Learning Theory, pp. 285–299. Springer, Berlin (1999)
Acknowledgements
The authors would like to thank Dinh Dũng, Thomas Kühn, Van Kien Nguyen, Winfried Sickel and Vladimir N. Temlyakov for several discussions on the topic. They would also like to thank two anonymous referees for their valuable comments. T.U. and S.M. would like to acknowledge support by the DFG Ul-403/2-1 and by the Fraunhofer Cluster of Excellence “Cognitive Internet Technologies”.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Additional information
Communicated by Vladimir N. Temlyakov.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Mayer, S., Ullrich, T. Entropy Numbers of Finite Dimensional Mixed-Norm Balls and Function Space Embeddings with Small Mixed Smoothness. Constr Approx 53, 249–279 (2021). https://doi.org/10.1007/s00365-020-09510-5
DOI: https://doi.org/10.1007/s00365-020-09510-5
Keywords
- Entropy numbers
- Schütt’s theorem
- Finite-dimensional (quasi-)normed spaces
- Function spaces with small mixed smoothness
- Compact embeddings