1 Introduction

Building on the arguments of [23], Tanaka & Sugihara [24] proposed an algorithm for designing accurate approximation formulas in function spaces called weighted Hardy spaces, which are defined by

$$\begin{aligned} {\mathbb {H}}^\infty ({\mathcal {D}}_d, w) :=\left\{ f:{\mathcal {D}}_d\rightarrow {\mathbb {C}} \,\Big |\, f \ \text{ is } \text{ analytic } \text{ on } \ {\mathcal {D}}_d,\ \sup _{z\in {\mathcal {D}}_d}\left| \frac{f(z)}{w(z)}\right| <\infty \right\} , \end{aligned}$$
(1)

where \(d > 0\), \({\mathcal {D}}_d:=\{z\in {\mathbb {C}}\mid |\mathop {\textrm{Im}}z|<d\}\), and w is a weight function characterized later in Sect. 2.1. The spaces \({\mathbb {H}}^\infty ({\mathcal {D}}_d, w)\) are often considered as spaces of transformed functions for the widely used sinc approximation formulas shown later in (2). The objective of [23] and [24] was to provide formulas that outperform the sinc formulas. Although their methods have shown superiority to the sinc approximation formulas, those studies provided only heuristic analyses of the proposed formulas, without theoretical guarantees. In this study, we mathematically

  (1) prove near optimality of the formulas, and

  (2) provide a general upper bound of the errors of the proposed formulas and show that the bound coincides in asymptotic order with the heuristic bound derived by [23].

Below we describe the background of this study more precisely. The spaces \({\mathbb {H}}^\infty ({\mathcal {D}}_d, w)\) appear in the literature as spaces of variable-transformed functions [18, 19, 21, 25]. For example, the double exponential (DE) transform, which is widely used in numerical analysis [22], has the form

$$\begin{aligned} f(x)=g\left( \tanh \left( \frac{\pi }{2}\sinh (x)\right) \right) \end{aligned}$$

and shows a double-exponential decay. The TANH transform \(g(\tanh (x/2))\) is also commonly used [2, 15]. These variable transformations are employed for the accurate approximation of functions: they yield functions with rapid decay on \({\mathcal {D}}_d\), which enables us to neglect the values of the functions for large |x|. This motivates the analysis of approximation in weighted Hardy spaces with general weight functions w. After Sugihara [21] demonstrated the near optimality of the sinc approximation formulas

$$\begin{aligned} f(x) \approx \sum _{k=N_-}^{k=N_+}f(kh) {\textrm{sinc}}\left( \frac{x}{h}-k\right) \end{aligned}$$
(2)

for several weight functions w, attempts to construct optimal formulas for general weight functions were initiated in the literature.
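As a concrete illustration of (2), the following Python sketch applies the sinc approximation to a DE-transformed function; the choice \(g(t)=1-t^2\), the step size h, and the truncation range are our own sample assumptions and are not taken from [21].

```python
import numpy as np

def sinc_approx(f, h, N_minus, N_plus):
    """Return the map x -> sum_{k=N_-}^{N_+} f(kh) sinc(x/h - k), cf. (2)."""
    k = np.arange(N_minus, N_plus + 1)
    fk = f(k * h)
    # np.sinc(t) = sin(pi t) / (pi t) is the normalized sinc appearing in (2)
    return lambda x: np.sum(fk * np.sinc(np.asarray(x, float)[..., None] / h - k), axis=-1)

# Sample DE-transformed target: f(x) = g(tanh((pi/2) sinh x)) with g(t) = 1 - t^2,
# chosen so that f decays double-exponentially as |x| grows.
g = lambda t: 1.0 - t ** 2
f = lambda x: g(np.tanh(0.5 * np.pi * np.sinh(x)))

approx = sinc_approx(f, h=0.5, N_minus=-10, N_plus=10)
x = np.linspace(-3.0, 3.0, 13)
print(np.max(np.abs(f(x) - approx(x))))  # maximum error on the sample grid (small)
```

Here the sampling points kh are equispaced; the formulas of [23, 24] discussed below instead optimize the sampling points for the given weight.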

For this purpose, Tanaka et al. [23] employed potential-theoretic arguments to generate sampling points for the approximation of functions. Tanaka & Sugihara [24] then simplified the arguments and proposed accurate formulas \(L_n[a^{*};f](x)\), given later by (6), with special sets \(a^{*}\) of sampling points. The formulas \(L_n[a^{*};f](x)\) outperform the sinc methods for functions \(f \in {\mathbb {H}}^\infty ({\mathcal {D}}_d, w)\). The authors showed that

$$\begin{aligned} \sup _{\Vert f\Vert \le 1,\ x\in {\mathbb {R}}}|f(x)-L_n[a^*;f](x)| \le \exp \left( -\frac{F^\textrm{D}_{K, Q}(n)}{n-1}\right) , \end{aligned}$$
(3)

where \(\Vert f \Vert \) is a norm of \(f \in {\mathbb {H}}^\infty ({\mathcal {D}}_d, w)\) and \(F^\textrm{D}_{K, Q}(n)\) is determined later in (13) by a “discrete” energy minimization problem. Furthermore, they considered the minimum worst-case error \(E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))\) in (5) of n-point approximation formulas on \({\mathbb {H}}^\infty ({\mathcal {D}}_d, w)\) and bounded it from below as

$$\begin{aligned} \exp \left( -\frac{F^\textrm{C}_{K, Q}(n)}{n}\right) \le E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w)), \end{aligned}$$
(4)

where \(F^\textrm{C}_{K, Q}(n)\) is determined later in (11) by a “continuous counterpart” of the above energy minimization problem. The following problems about the formula \(L_n[a^*;f](x)\) were left unsolved in [24].

  (i) Since (the RHS of (4))\(\, \le \,\)(the LHS of (3)), the formula \(L_n[a^*;f](x)\) is guaranteed to be nearly optimal if \(F^\textrm{C}_{K, Q}(n)\) and \(F^\textrm{D}_{K, Q}(n)\) are close. However, their difference was not estimated.

  (ii) To estimate the convergence rate of the error in the LHS of (3), we need to know how \(F^\textrm{D}_{K, Q}(n)\) depends on n. However, this dependence was not known.

In this paper, we provide solutions to these problems. Our contributions (1) and (2) mentioned in the first paragraph of this section correspond to the solutions to problems (i) and (ii), respectively. More precisely, we show the following statements.

  (1) We show an estimate of the form

    $$\begin{aligned} F_{K, Q}^\textrm{D}(n)\lesssim F_{K, Q}^\textrm{C}(n)\lesssim 2 F_{K, Q}^\textrm{D}(n). \end{aligned}$$

    Its rigorous version is given by Theorem 23 in Sect. 2.3. The quantities \(F^\textrm{D}_{K, Q}(n)\) and \(F^\textrm{C}_{K, Q}(n)\) are obtained from the optimal solutions of the “discrete” energy minimization problem and its “continuous counterpart”, respectively. To prove this theorem, we therefore construct a feasible solution of the latter problem from the optimal solution of the former.

  (2) We show an inequality

    $$\begin{aligned} \frac{F_{K, Q}^\textrm{C}(n)}{n}\ge \frac{Q(\alpha _n)}{2}, \end{aligned}$$

    where \(Q(x) = - \log w(x)\) and \(\alpha _{n}\) is determined by a tractable inequality. Its details are given by Theorem 24 in Sect. 2.3. By combining this inequality, the above statement (1), and Inequality (3), we obtain explicit convergence rates of the proposed formulas. To prove this theorem, we consider the dual problem of the “continuous” energy minimization problem and provide a feasible solution of it. As preparation, we present a primal-dual theory of the energy minimization problem in Sect. 4.

As a result, we explicitly obtain lower bounds of \(F_{K, Q}^\textrm{C}(n)\) and demonstrate that the rates of these lower bounds coincide with those of the heuristic bounds in [23].

The rest of this paper is organized as follows. In Sect. 2, we present a mathematical overview of the existing studies and state our main results as mathematical statements. Section 3 gives the proof of the first result, i.e., Theorem 23. Section 4 contains general arguments introducing the concept of “positive semi-definite in measure”; we then show that the problem of our interest is a special case of that concept and derive the duality theorem. The evaluations for the second result, described by Theorem 24, are given in Sect. 5. We compare the bounds with those in [23] in Sect. 6. Finally, we give concluding remarks in Sect. 7.

2 Mathematical preliminaries and main results

2.1 General settings

We first give some definitions and formulate the problem mathematically. Let \(d>0\) and define the strip region \({\mathcal {D}}_d:=\{z\in {\mathbb {C}}\mid |\mathop {\textrm{Im}}z|<d\}\). Throughout this paper, a weight function \(w:{\mathcal {D}}_d\rightarrow {\mathbb {C}}\) is assumed to satisfy the following conditions:

  1. w is analytic and does not vanish over the domain \({\mathcal {D}}_d\) and takes values in (0, 1] on \({\mathbb {R}}\);

  2. w satisfies \(\lim _{x\rightarrow \pm \infty }\int _{-d}^d|w(x+iy)|\,\textrm{d}y=0\) and \(\lim _{y\nearrow d}\int _{-\infty }^\infty (|w(x+iy)|+|w(x-iy)|)\,\textrm{d}x<\infty \);

  3. \(\log w\) is strictly concave on \({\mathbb {R}}\).

For a weight function satisfying the above conditions, we define the weighted Hardy space \({\mathbb {H}}^\infty ({\mathcal {D}}_d, w)\) on \({\mathcal {D}}_d\) by (1). We define

$$\begin{aligned} \Vert f\Vert :=\sup _{z\in {\mathcal {D}}_d}\left| \frac{f(z)}{w(z)}\right| \end{aligned}$$

for \(f\in {\mathbb {H}}^\infty ({\mathcal {D}}_d,w)\), and the expression \(\Vert f\Vert <\infty \) shall also imply \(f\in {\mathbb {H}}^\infty (\mathcal {D}_{d}, w)\) in the following.

For an approximation formula over \({\mathbb {H}}^\infty ({\mathcal {D}}_d, w)\), an evaluation criterion needs to be defined. Based on [21] and [24], we adopt the minimum worst-case error

$$\begin{aligned}&E_n^{\min } ({\mathbb {H}}^\infty ({\mathcal {D}}_d, w)) \nonumber \\&\quad :=\inf \left\{ \sup _{\Vert f\Vert \le 1,\ x\in {\mathbb {R}}}\left| f(x)-\sum _{j=1}^l\sum _{k=0}^{n_j-1}f^{(k)}(a_j)\phi _{jk}(x) \right| \,\Big |\, \begin{array}{c} 1\le l\le n,\ n_1+\cdots +n_l=n,\\ a_j\in {\mathcal {D}}_d\ \text {are distinct},\\ \phi _{jk}:{\mathcal {D}}_d\rightarrow {\mathbb {C}}\ \text {are analytic} \end{array} \right\} \end{aligned}$$
(5)

as the optimal performance over all possible n-point interpolation formulas on \({\mathbb {R}}\) that are applicable to any \(f\in {\mathbb {H}}^\infty ({\mathcal {D}}_d, w)\).

2.2 Properties of approximation formulas to be analyzed

Let us introduce some functions dependent on an n-sequence \(a=\{a_j\}_{j=1}^n\subset {\mathbb {R}}\) as follows.

$$\begin{aligned} T_d(x)&:=\tanh \left( \frac{\pi }{4d}x\right) ,\\ B_n(x;a,{\mathcal {D}}_d)&:=\prod _{j=1}^n\frac{T_d(x)-T_d(a_j)}{1-T_d(a_j)T_d(x)},\\ B_{n;k}(x; a, {\mathcal {D}}_d)&:=\prod _{\begin{array}{c} 1\le j\le n,\\ j\ne k \end{array}}\frac{T_d(x)-T_d(a_j)}{1-T_d(a_j)T_d(x)}. \end{aligned}$$

Using these functions, we can give an n-point interpolation formula

$$\begin{aligned} L_n[a;f](x):=\sum _{k=1}^nf(a_k)\frac{B_{n;k}(x; a,{\mathcal {D}}_d)w(x)}{B_{n;k}(a_k; a,{\mathcal {D}}_d)w(a_k)} \frac{T_d'(x-a_k)}{T_d'(0)}, \end{aligned}$$
(6)

which is known to characterize the value \(E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))\) as follows.

Proposition 21

[21, 24] We have an upper bound of the error of (6) as

$$\begin{aligned} \sup _{\Vert f\Vert \le 1,\ x\in {\mathbb {R}}}|f(x)-L_n[a;f](x)| \le \sup _{x\in {\mathbb {R}}}|B_n(x;a,{\mathcal {D}}_d)w(x)| \end{aligned}$$

for any fixed sequence \(a=\{a_j\}_{j=1}^n\subset {\mathbb {R}}\) (of distinct points). Moreover, by taking the infimum of the above expression over all n-sequences, it holds that

$$\begin{aligned} E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))&= \inf _{a_j\in {\mathbb {R}}} \sup _{\Vert f\Vert \le 1,\ x\in {\mathbb {R}}}|f(x)-L_n[a;f](x)| \\&= \inf _{a_j\in {\mathbb {R}}}\sup _{x\in {\mathbb {R}}}|B_n(x;a,{\mathcal {D}}_d)w(x)|. \end{aligned}$$
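As a concrete illustration of (6), the following Python sketch evaluates \(L_n[a;f](x)\) directly from the definitions of \(T_d\) and \(B_{n;k}\); the function names, the vectorization, and the Gaussian-type sample weight are our own choices and are not taken from [21, 24].

```python
import numpy as np

def T(x, d):
    return np.tanh(np.pi * x / (4.0 * d))

def L_n(x, a, f, w, d):
    """Evaluate L_n[a; f](x) of (6) at the points x for sampling points a."""
    x = np.asarray(x, float)
    Tx = T(x, d)
    total = np.zeros_like(x)
    for k, ak in enumerate(a):
        num = np.ones_like(x)   # B_{n;k}(x; a, D_d)
        den = 1.0               # B_{n;k}(a_k; a, D_d)
        for j, aj in enumerate(a):
            if j == k:
                continue
            num *= (Tx - T(aj, d)) / (1.0 - T(aj, d) * Tx)
            den *= (T(ak, d) - T(aj, d)) / (1.0 - T(aj, d) * T(ak, d))
        # T_d'(x - a_k) / T_d'(0) = 1 / cosh^2(pi (x - a_k) / (4 d))
        kernel = 1.0 / np.cosh(np.pi * (x - ak) / (4.0 * d)) ** 2
        total += f(ak) * (num / den) * (w(x) / w(ak)) * kernel
    return total

# Sample usage with a Gaussian-type weight (our choice) and f = w itself, so that ||f|| <= 1
d = 1.0
w = lambda x: np.exp(-x ** 2)
a = np.linspace(-2.0, 2.0, 9)
x = np.linspace(-3.0, 3.0, 7)
print(np.max(np.abs(w(x) - L_n(x, a, w, w, d))))  # error of (6) on the sample grid
```

Each evaluation of \(L_n[a;f]\) at a point costs \(O(n^2)\) operations with this direct implementation.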

By this assertion, it is enough to consider interpolation formulas of the form (6). Additionally, this motivates us to analyze the value \(\sup _{x\in {\mathbb {R}}}|B_n(x;a,{\mathcal {D}}_d)w(x)|\), which is simpler than the worst-case error of (6). In [23] and [24],

$$\begin{aligned} -\log \left( \inf _{a_j\in {\mathbb {R}}}\sup _{x\in {\mathbb {R}}}|B_n(x;a,{\mathcal {D}}_d)w(x)| \right) \end{aligned}$$

is treated as the optimal value of an optimization problem; this reformulation is justified by the addition formula for \(\tanh \), which gives \(\frac{T_d(x)-T_d(a_j)}{1-T_d(a_j)T_d(x)}=T_d(x-a_j)\):

$$\begin{aligned} (\textrm{DC})\quad \begin{array}{lr}\text {maximize}&{}\displaystyle \inf _{x\in {\mathbb {R}}} \left( \sum _{i=1}^nK(x-a_i)+Q(x)\right) \\ \text {subject to}&{}\displaystyle a_1<\cdots <a_n, \end{array} \end{aligned}$$

where K and Q are defined by

$$\begin{aligned} K(x)&:=-\log |T_d(x)|\ \left( =-\log \left| \tanh \left( \frac{\pi }{4d}x\right) \right| \right) , \end{aligned}$$
(7)
$$\begin{aligned} Q(x)&:=-\log w(x). \end{aligned}$$
(8)

They considered a continuous relaxation of (DC) as

$$\begin{aligned} (\textrm{CT})\quad \begin{array}{lr}\text {maximize}&{}\displaystyle \inf _{x\in {\mathbb {R}}} \left( \int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (y)+Q(x)\right) \\ \text {subject to}&{}\displaystyle \mu \in {\mathcal {M}}_c({\mathbb {R}}, n), \end{array} \end{aligned}$$

where we define \({\mathcal {M}}({\mathbb {R}}, n)\) as the set of all (positive) Borel measures \(\mu \) over \({\mathbb {R}}\) with \(\mu ({\mathbb {R}})=n\) and

$$\begin{aligned} {\mathcal {M}}_c({\mathbb {R}}, n):=\{\mu \in {\mathcal {M}}({\mathbb {R}}, n)\mid \mathop {\textrm{supp}}\mu \ \text {is compact}\}. \end{aligned}$$

Because each feasible solution of (DC) can be interpreted as a sum of Dirac measures that is a feasible solution of (CT), we have

$$\begin{aligned} \text {(the optimal value of (DC))} \le \text {(the optimal value of (CT))} \end{aligned}$$
(9)

Potential-theoretic arguments [5, 14, 24] lead to the following proposition.

Proposition 22

[24, Theorems 2.4 and 2.5] The energy of \(\mu \in {\mathcal {M}}({\mathbb {R}}, n)\) is defined as

$$\begin{aligned} I_n^\textrm{C}(\mu ) :=\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y) +2\int _{\mathbb {R}}Q(x)\,\textrm{d}\mu (x). \end{aligned}$$
(10)

Then, there exists a unique minimizer \(\mu _n^*\) of \(I_n^\textrm{C}(\mu )\) over \({\mathcal {M}}({\mathbb {R}}, n)\); it has compact support and is also an optimal solution of (CT). Furthermore, if we define

$$\begin{aligned} F_{K,Q}^\textrm{C}(n)&:=I_n^\textrm{C}(\mu _n^*)-\int _{\mathbb {R}}Q(x)\,\textrm{d}\mu _n^*(x) \nonumber \\&\ \left( =\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu _n^*(x)\,\textrm{d}\mu _n^*(y) +\int _{\mathbb {R}}Q(x)\,\textrm{d}\mu _n^*(x)\right) , \end{aligned}$$
(11)

the optimal value of (CT) coincides with \(\displaystyle \frac{F_{K,Q}^\textrm{C}(n)}{n}\).

Following this proposition, Tanaka & Sugihara [24] considered discrete counterparts of \(I^\textrm{C}_n(\mu )\) and \(F_{K,Q}^\textrm{C}(n)\), which are defined for \(a=\{a_i\}_{i=1}^n\) (\(a_1<\cdots <a_n\)) as

$$\begin{aligned}&I_{K,Q}^\textrm{D}(a) :=\sum _{i\ne j}K(a_i-a_j)+\frac{2(n-1)}{n}\sum _{i=1}^nQ(a_i), \end{aligned}$$
(12)
$$\begin{aligned}&F_{K,Q}^\textrm{D}(n) :=I_{K, Q}^\textrm{D}(a^*)-\frac{n-1}{n}\sum _{i=1}^nQ(a_i^*), \end{aligned}$$
(13)

where \(a^*=\{a_i^*\}_{i=1}^n\) is the unique minimizer of \(I_{K, Q}^\textrm{D}(a)\), which certainly exists according to Theorem 3.3 in [24]. We can easily obtain \(a^*\) numerically because it is the solution of a convex programming problem, and the formula \(L_n[a^*;f](x)\) is known to satisfy [24, Theorem 4.1]

$$\begin{aligned} \sup _{\Vert f\Vert \le 1,\ x\in {\mathbb {R}}}|f(x)-L_n[a^*;f](x)| \le \exp \left( -\frac{F^\textrm{D}_{K, Q}(n)}{n-1}\right) . \end{aligned}$$
(14)

Then \(E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))\) is evaluated as [24, Remark 4.2]

$$\begin{aligned} \exp \left( -\frac{F^\textrm{C}_{K, Q}(n)}{n}\right) \le E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w)) \le \exp \left( -\frac{F^\textrm{D}_{K, Q}(n)}{n-1}\right) . \end{aligned}$$

Indeed, the left inequality holds by (9) and Proposition 22, and the right inequality follows from (14). By this evaluation, we can consider \(L_n[a^*;f](x)\) as a nearly optimal approximation formula if \(F^\textrm{C}_{K, Q}(n)/n\) and \(F^\textrm{D}_{K, Q}(n)/(n-1)\) are sufficiently close.
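Since \(a^*\) solves a convex programming problem, it can be computed with a generic optimizer. The following Python sketch minimizes \(I_{K,Q}^\textrm{D}(a)\) of (12) and evaluates the bound in (14); the sample weight \(Q(x)=\exp (|x|)\), the value \(d=1\), and the choice of optimizer are our own assumptions for illustration and not the procedure of [24].

```python
import numpy as np
from scipy.optimize import minimize

d = 1.0
Q = lambda x: np.exp(np.abs(x))      # sample weight: w(x) = exp(-exp(|x|))

def K(x):
    return -np.log(np.abs(np.tanh(np.pi * x / (4.0 * d))))

def I_D(a):
    """Discrete energy I^D_{K,Q}(a) of (12)."""
    a = np.asarray(a, float)
    n = a.size
    diff = a[:, None] - a[None, :]
    off = ~np.eye(n, dtype=bool)
    return K(diff[off]).sum() + 2.0 * (n - 1) / n * Q(a).sum()

n = 16
a0 = np.linspace(-2.0, 2.0, n)       # equispaced starting guess; K(0) = +inf keeps points separated
res = minimize(I_D, a0, method="Nelder-Mead",
               options={"maxiter": 50000, "maxfev": 50000, "xatol": 1e-9, "fatol": 1e-12})
a_star = np.sort(res.x)

F_D = I_D(a_star) - (n - 1) / n * Q(a_star).sum()   # F^D_{K,Q}(n) of (13)
print(np.exp(-F_D / (n - 1)))                       # approximate value of the error bound in (14)
```

A solver that exploits the convexity and the ordering constraint \(a_1<\cdots <a_n\) would be more efficient; the sketch only indicates that \(a^*\) and the bound (14) are directly computable.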

2.3 Main results

In this paper, we demonstrate the following two theorems. The first and second theorems, respectively, correspond to (1) and (2) in Sect. 1.

Theorem 23

For \(n\ge 2\), the following holds true:

$$\begin{aligned} \frac{F^\textrm{D}_{K,Q}(n)}{n-1}\le \frac{F_{K, Q}^\textrm{C}(n)}{n} \le \frac{n}{n-1}\left( \frac{2F^\textrm{D}_{K, Q}(n)}{n-1} +(3+\log 2)\right) . \end{aligned}$$

Theorem 24

Suppose w is even on \({\mathbb {R}}\). For \(\alpha _n>0\) that satisfies

$$\begin{aligned} \frac{2\alpha _n}{\pi \tanh (d)}\frac{Q(\alpha _n)^2+Q'(\alpha _n)^2}{Q(\alpha _n)}\le n, \end{aligned}$$

we have

$$\begin{aligned} \frac{F_{K, Q}^\textrm{C}(n)}{n}\ge \frac{Q(\alpha _n)}{2}. \end{aligned}$$

Theorem 23 shows the near optimality of the approximation formula \(L_n[a^*;f](x)\). In addition, Theorem 24 (combined with Theorem 23) gives an explicit upper bound of \(E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))\). We state these results in the following theorem.

Theorem 25

Let w be a weight function and let K and Q be given by (7) and (8), respectively. In addition, let \(a^*=\{a_i^*\}_{i=1}^n\) be the unique minimizer of \(I_{K, Q}^\textrm{D}(a)\) and let \(L_n[a^*;f](x)\) be the formula given by (6) with \(a = a^{*}\). Then, for arbitrary \(\varepsilon >0\), we have

$$\begin{aligned} \sup _{\Vert f\Vert \le 1,\ x\in {\mathbb {R}}}|f(x)-L_n[a^*;f](x)| \le \sqrt{2e^3}E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))^{\frac{1}{2+\varepsilon }} \end{aligned}$$

for each sufficiently large n. In addition, we have

$$\begin{aligned} E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))\le \sqrt{2e^3}\exp \left( -\frac{n-1}{4n}Q(\alpha _n)\right) . \end{aligned}$$
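For orientation, we note how the bounds in Theorem 25 can be assembled from (4), (14), and Theorems 23 and 24; the following is only a sketch. Rearranging the right inequality of Theorem 23 gives

$$\begin{aligned} \frac{F^\textrm{D}_{K, Q}(n)}{n-1}\ge \frac{1}{2}\left( \frac{n-1}{n}\cdot \frac{F_{K, Q}^\textrm{C}(n)}{n}-(3+\log 2)\right) , \end{aligned}$$

so (14) yields

$$\begin{aligned} \sup _{\Vert f\Vert \le 1,\ x\in {\mathbb {R}}}|f(x)-L_n[a^*;f](x)| \le \sqrt{2e^3}\exp \left( -\frac{n-1}{2n}\cdot \frac{F_{K, Q}^\textrm{C}(n)}{n}\right) . \end{aligned}$$

Because \(\exp (-F^\textrm{C}_{K, Q}(n)/n)\le E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))\le 1\) by (4) and the definition (5) (the choice \(\phi _{jk}\equiv 0\) together with \(w\le 1\) on \({\mathbb {R}}\) gives the upper bound 1), and because \(\frac{n-1}{2n}\ge \frac{1}{2+\varepsilon }\) holds for all sufficiently large n, the right-hand side above is bounded by \(\sqrt{2e^3}\,E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))^{\frac{1}{2+\varepsilon }}\). The second bound of Theorem 25 follows from the same chain combined with \(F_{K, Q}^\textrm{C}(n)/n\ge Q(\alpha _n)/2\) from Theorem 24.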

2.4 Basic ideas to show the main results

The left inequality of Theorem 23 is from Theorems 3.4 and 3.5 in [24]. To prove the right inequality of Theorem 23, we consider the optimization problem

$$\begin{aligned} (\textrm{P})\quad \begin{array}{lr}\text {minimize}&{}\displaystyle \int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y) + 2\int _{\mathbb {R}}Q(x)\,\textrm{d}\mu (x) \\ \text {subject to}&{} \begin{array}{r} \mu \in {\mathcal {M}}({\mathbb {R}}, n), \end{array} \end{array} \end{aligned}$$

whose solution provides \(F^\textrm{C}_{K, Q}(n)\) as shown in Proposition 22. The quantity \(F^\textrm{D}_{K, Q}(n)\) is obtained from the optimal solution of a discrete counterpart of (P) given by (12). We then construct a feasible solution of (P), given later by (16), from the optimal solution of the discrete counterpart. Using this feasible solution, we bound \(F^\textrm{C}_{K, Q}(n)\) from above in terms of \(F^\textrm{D}_{K, Q}(n)\).

To prove Theorem 24, we need a lower bound of the optimal value of (P). However, because (P) is a minimization problem, a concrete feasible solution only yields an upper bound of its optimal value and hence does not help us. Therefore, we prove that (P) can be regarded as an infinite-dimensional convex quadratic programming problem, since K is positive semi-definite in measure (Definition 41), and take its dual problem [1, 6]. We also show that the dual problem

$$\begin{aligned} (\textrm{D})\quad \begin{array}{lr}\text {maximize}&{}\displaystyle -\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\nu (x)\,\textrm{d}\nu (y) + 2ns \\ \text {subject to}&{} \begin{array}{r} \nu \ \text {is a signed Borel measure}\\ \displaystyle s-\int _{\mathbb {R}}K(\cdot -y)\,\textrm{d}\nu (y)\le Q \end{array} \end{array} \end{aligned}$$
(15)

satisfies weak and strong duality (Theorem 43), i.e., the optimal value of (D) coincides with that of (P). By this, we can obtain a lower bound for the optimal value of (P) by taking a concrete \(\nu \) and s. The practical advantage of working with (D) is that \(\nu \) can be a signed measure (we indeed deal with a slightly wider class in Sect. 4), which means that we can define \(\nu \) through a Fourier transform of a symmetric function without having to confirm its non-negativity. This resolves one of the problematic points of the evaluation in [23].

Remark 1

Problem (D) in (15) needs to be more rigorously stated to realize a primal-dual theory for (P) and (D). In Sect. 4, we provide a rigorous form of (D) by introducing a set \({\mathcal {S}}_{K}\) for \(\nu \).

3 Proof of Theorem 23

To prove Theorem 23, we prepare the following lemmas.

Lemma 31

For arbitrary \(t>0\), the following holds true.

$$\begin{aligned} \int _0^1K(tx)\,\textrm{d}x \le K(t)+1. \end{aligned}$$

Proof

Consider the function \(g(x):=K(x)+\log \left( \frac{\pi }{4d}x\right) \) defined for \(x>0\). We first prove that g(x) is strictly increasing and satisfies \(\lim _{x\searrow 0}g(x)=0\). Let \(h(x):=\exp \left( g\left( \frac{2d}{\pi }x\right) \right) \). Then, we have

$$\begin{aligned} h(x)=\frac{x}{2\tanh \frac{x}{2}} =\frac{x(e^{x}+1)}{2(e^{x}-1)} \end{aligned}$$

and

$$\begin{aligned} h'(x) =\frac{(xe^x+e^{x}+1)(e^{x}-1)-x(e^{x}+1)e^{x}}{2(e^{x}-1)^2} =\frac{e^{2x}-2xe^x-1}{2(e^x-1)^2}. \end{aligned}$$

Because \((e^{2x}-2xe^x-1)'=2(e^{2x}-e^x-xe^x)=2e^x(e^x-1-x)>0\) for \(x>0\) and \(e^{2x}-2xe^x-1\) vanishes at \(x=0\), we have \(h'(x)>0\) for \(x>0\). Evidently, we also have \(\lim _{x\searrow 0}h(x)=1\). Thus, g satisfies the above properties.

Because g is positive and increasing, \(\int _0^1g(tx)\,\textrm{d}x\le g(t)\) is valid. Therefore, we have

$$\begin{aligned} \int _0^1 K(tx) \,\textrm{d}x&=\int _0^1 g(tx) \,\textrm{d}x -\int _0^1 \log \left( \frac{\pi }{4d}tx\right) \,\textrm{d}x\\&\le g(t) - \log \left( \frac{\pi }{4d}t\right) + 1 =K(t)+1 \end{aligned}$$

as desired. \(\square \)

Lemma 32

For arbitrary \(x>0\), the following holds true.

$$\begin{aligned} K\left( \frac{x}{2}\right) \le K(x)+\log 2. \end{aligned}$$

Proof

By the definition of K, the assertion follows from the fact that \(\tanh x \le 2\tanh \frac{x}{2}\) for \(x\ge 0\), which holds because \(\tanh x = \frac{2\tanh \frac{x}{2}}{1+\tanh ^2\frac{x}{2}}\le 2\tanh \frac{x}{2}\). \(\square \)

We can now prove the first theorem.

Proof of Theorem 23

The left inequality is from Theorems 3.4 and 3.5 in [24].

Let us prove the right inequality. Let \(a=(a_1, \ldots , a_n)\) (with \(a_1<\cdots <a_n\)) be the minimizer of the discrete energy, satisfying

$$\begin{aligned} F_{K, Q}^\textrm{D}(n)=\sum _{i\ne j}K(a_i-a_j)+\frac{n-1}{n}\sum _{i=1}^nQ(a_i). \end{aligned}$$

Let \(\mu \) be a measure with the density function p defined by

$$\begin{aligned} p(x)={\left\{ \begin{array}{ll} \frac{n}{(n-1)(a_{i+1}-a_i)} &{} (x\in [a_i, a_{i+1}),\ i=1,\ldots ,n-1),\\ 0 &{} (\text {otherwise}). \end{array}\right. } \end{aligned}$$
(16)

Then, we have

$$\begin{aligned} F_{K, Q}^\textrm{C}(n) \le I_n^\textrm{C}(\mu _n^*) \le I_n^\textrm{C}(\mu ). \end{aligned}$$
(17)

In the following, we obtain an upper bound of \(I_n^\textrm{C}(\mu )\). First, we evaluate \(\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y)\). For \(1\le k\le n-1\) and \(y\in [a_k, a_{k+1})\), we have

$$\begin{aligned} \int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)&=\int _{\mathbb {R}}K(x-y)p(x)\,\textrm{d}x\\&=\sum _{i=1}^{n-1} \frac{n}{(n-1)(a_{i+1}-a_i)}\int _{a_i}^{a_{i+1}}K(x-y)\,\textrm{d}x\\&=\frac{n}{n-1}\sum _{i=1}^{n-1}\int _0^1K(a_i+(a_{i+1}-a_i)z-y)\,\textrm{d}z. \end{aligned}$$

Here, because \(y\in [a_k, a_{k+1})\), for \(i\not \in \{k-1, k, k+1\}\), the convexity and monotonicity of K over \((-\infty , 0)\) or \((0, \infty )\) show that

$$\begin{aligned} \int _0^1K(a_i+(a_{i+1}-a_i)z-y)\,\textrm{d}z\le {\left\{ \begin{array}{ll} \frac{1}{2}\left( K(a_i-a_k)+K(a_{i+1}-a_k)\right) &{}(i\le k-2),\\ \frac{1}{2}\left( K(a_i-a_{k+1})+K(a_{i+1}-a_{k+1})\right) &{}(i\ge k+2). \end{array}\right. } \end{aligned}$$

Therefore, by considering that K is non-negative, we have

$$\begin{aligned} \sum _{i\ne k-1,k,k+1}\int _0^1K(a_i+(a_{i+1}-a_i)z-y)\,\textrm{d}z&\le \sum _{j\le k-2}K(a_j-a_k)+\sum _{j\ge k+3}K(a_j-a_{k+1})\nonumber \\&\quad +\frac{1}{2}\left( K(a_{k-1}-a_k)+ K(a_{k+2}-a_{k+1}) \right) . \end{aligned}$$
(18)

Here, the terms that include an index of a outside the domain \(\{1,\ldots ,n\}\) are void. Next, we consider the cases \(i=k\pm 1\). If \(k-1\ge 1\) is valid, we have, by the monotonicity of K and Lemma 31,

$$\begin{aligned} \int _0^1K(a_{k-1}+(a_k-a_{k-1})z-y)\,\textrm{d}z&\le \int _0^1K(a_{k-1}+(a_k-a_{k-1})z-a_k)\,\textrm{d}z\nonumber \\&=\int _0^1K((a_k-a_{k-1})w)\,\textrm{d}w\nonumber \\&\le K(a_k-a_{k-1}) + 1 =K(a_{k-1}-a_k)+1. \end{aligned}$$
(19)

Similarly, if \(k+2\le n\) is valid, we have, by Lemma 31,

$$\begin{aligned} \int _0^1K(a_{k+1}+(a_{k+2}-a_{k+1})z-y)\,\textrm{d}z \le K(a_{k+2}-a_{k+1})+1. \end{aligned}$$
(20)

Finally, we deal with the case \(i=k\). We show that the integral

$$\begin{aligned} L_k(y):=\int _0^1 K(a_k+(a_{k+1}-a_k)z-y)\,\textrm{d}z \end{aligned}$$

is maximized at \(y=\frac{a_k+a_{k+1}}{2}\) (over \(y\in [a_k, a_{k+1})\)). If we define \(t:=\frac{y-a_k}{a_{k+1}-a_k}\) (\(t\in [0, 1)\)), the following holds true.

$$\begin{aligned} L_k(y) =\int _0^t K((a_{k+1}-a_k)w)\,\textrm{d}w + \int _0^{1-t} K((a_{k+1}-a_k)w)\,\textrm{d}w. \end{aligned}$$

For \(t<\frac{1}{2}\), we have

$$\begin{aligned}&L_k\left( \frac{a_k+a_{k+1}}{2}\right) -L_k(y)\\&\quad =\int _t^{\frac{1}{2}}K((a_{k+1}-a_k)w)\,\textrm{d}w-\int _{\frac{1}{2}}^{1-t}K((a_{k+1}-a_k)w)\,\textrm{d}w\\&\quad =\int _0^{\frac{1}{2}-t}\left( K((a_{k+1}-a_k)(t+w))-K\left( (a_{k+1}-a_k)\left( \frac{1}{2}+w\right) \right) \right) \,\textrm{d}w > 0. \end{aligned}$$

By symmetry, \(L_k(y)< L_k\left( \frac{a_k+a_{k+1}}{2}\right) \) is valid for \(t>\frac{1}{2}\). Therefore, by Lemma 31 and 32,

$$\begin{aligned}&\int _0^1 K(a_k+(a_{k+1}-a_k)z-y)\,\textrm{d}z \nonumber \\&\quad \le L_k\left( \frac{a_k+a_{k+1}}{2}\right) =2\int _0^{\frac{1}{2}}K((a_{k+1}-a_k)w)\,\textrm{d}w =\int _0^1K\left( \frac{a_{k+1}-a_k}{2}v\right) \,\textrm{d}v\nonumber \\&\quad \le K\left( \frac{a_{k+1}-a_k}{2}\right) +1 \le K(a_{k+1}-a_k)+1+\log 2. \end{aligned}$$
(21)

By (18)–(21), we have the bound

$$\begin{aligned}&\left( \frac{n-1}{n}\right) ^2 \int _{a_k}^{a_{k+1}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y) \le \frac{n-1}{n}\sup _{y\in [a_k, a_{k+1})}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)\\&\quad \le \sum _{j\le k-2}K(a_j-a_k) +\sum _{j\ge k+3}K(a_j-a_{k+1}) + 3+\log 2\\&\qquad +\frac{3}{2}K(a_{k-1}-a_k)+\frac{1}{2}K(a_k-a_{k+1}) +\frac{1}{2}K(a_{k+1}-a_k)+\frac{3}{2}K(a_{k+2}-a_{k+1}). \end{aligned}$$

Considering the sum of the right-hand side with respect to \(k=1,\ldots ,n-1\), the coefficient of each \(K(a_i-a_j)\) with \(|i-j|\ge 2\) is at most 1, and that of \(K(a_i-a_j)\) with \(|i-j|=1\) is at most 2 (\(=\frac{1}{2}+\frac{3}{2}\)), where we have distinguished \(K(a_i-a_j)\) from \(K(a_j-a_i)\). Therefore, we have

$$\begin{aligned} \left( \frac{n-1}{n}\right) ^2\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y) \le 2\sum _{i\ne j}K(a_i-a_j)+(n-1)(3+\log 2). \end{aligned}$$
(22)

Let us now evaluate the second term of \(I_n^\textrm{C}(\mu )\), i.e., \(\int _{\mathbb {R}}Q(x)\,\textrm{d}\mu (x)\). By the convexity of Q, we have

$$\begin{aligned} \int _{\mathbb {R}}Q(x)\,\textrm{d}\mu (x)&=\frac{n}{n-1}\sum _{i=1}^{n-1}\int _0^1Q(a_i+(a_{i+1}-a_i)z)\,\textrm{d}z\\&\le \frac{n}{n-1}\sum _{i=1}^{n-1}\max \{Q(a_i), Q(a_{i+1})\}. \end{aligned}$$

To estimate the sum, we consider the following two cases:

  1. Q is not monotone in \([a_{1}, a_{n}]\),

  2. Q is monotone in \([a_{1}, a_{n}]\).

In the former case, the unique minimizer \(q^{*}\) of Q on \({\mathbb {R}}\) exists in \([a_{1}, a_{n}]\) because of the strict convexity. Then, by the strict convexity of Q, we have

$$\begin{aligned} \max \{Q(a_i), Q(a_{i+1})\} = {\left\{ \begin{array}{ll} Q(a_{i}) &{} (a_{i}, a_{i+1}< q^{*}), \\ Q(a_{i+1}) &{} (q^{*} < a_{i}, a_{i+1} ), \\ Q(a_{j^{*}}) &{} (q^{*} \in [a_{i}, a_{i+1}]), \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} j^{*} \in \mathop {\textrm{argmax}}_{ j \in \{i, i+1\} } Q(a_j). \end{aligned}$$

Therefore

$$\begin{aligned} \sum _{i=1}^{n-1} \max \{ Q(a_{i}), Q(a_{i+1}) \} = \sum _{i \in \{1,\ldots , n \} \setminus \{ k \}} Q(a_{i}) \end{aligned}$$
(23)

holds for some \(k \in \{1, \ldots , n\}\). In the latter case (case 2 above), we have equality (23) for \(k = 1\) or \(k = n\). Therefore, in both cases, the following holds true:

$$\begin{aligned} \int _{\mathbb {R}}Q(x)\,\textrm{d}\mu (x) \le \frac{n}{n-1}\sum _{i=1}^{n}Q(a_i). \end{aligned}$$
(24)

Combining (22) and (24), we obtain

$$\begin{aligned} I_n^\textrm{C}(\mu )&=\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y) +2\int _{\mathbb {R}}Q(x)\,\textrm{d}\mu (x)\\&\le 2\left( \frac{n}{n-1}\right) ^2\sum _{i\ne j}K(a_i-a_j) +\frac{2n}{n-1}\sum _{i=1}^nQ(a_i)+ \frac{n^2}{n-1}(3+\log 2)\\&=2\left( \frac{n}{n-1}\right) ^2\left( \sum _{i\ne j}K(a_i-a_j) +\frac{n-1}{n}\sum _{i=1}^nQ(a_i)\right) + \frac{n^2}{n-1}(3+\log 2)\\&=2\left( \frac{n}{n-1}\right) ^2 F_{K,Q}^\textrm{D}(n) +\frac{n^2}{n-1}(3+\log 2). \end{aligned}$$

Now, using (17), we reach the conclusion. \(\square \)

4 Duality theorem for convex programming of measures

The following definition is a variant of the existing definitions of positive definite kernel [4, 17, 20].

Definition 41

Let X be a topological space. A non-negative measurable function \(k:X\times X\rightarrow {\mathbb {R}}_{\ge 0}\cup \{\infty \}\) is called positive semi-definite in measure if it satisfies

$$\begin{aligned} \int _X\int _Xk(x,y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y)\ +&\ \int _X\int _Xk(x,y)\,\textrm{d}\nu (x)\,\textrm{d}\nu (y)\nonumber \\&\ge \int _X\int _Xk(x,y)\,\textrm{d}\mu (x)\,\textrm{d}\nu (y)\nonumber \\&\quad +\int _X\int _Xk(x,y)\,\textrm{d}\nu (x)\,\textrm{d}\mu (y) \end{aligned}$$
(25)

for arbitrary (positive) \(\sigma \)-finite Borel measures \(\mu , \nu \) on X.

Remark 2

Let k be positive semi-definite in measure. Considering the Hahn-Jordan decomposition of a signed measure, we have

$$\begin{aligned} \int _X\int _Xk(x,y)\,\textrm{d}|\mu |(x)\,\textrm{d}|\mu |(y)<\infty \Longrightarrow \int _X\int _Xk(x,y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y)\ge 0 \end{aligned}$$

for an arbitrary signed Borel measure \(\mu \) on X with \(|\mu |\) being \(\sigma \)-finite, where \(|\mu |\) denotes the total variation of \(\mu \). This generalizes the ordinary notion of positive semi-definiteness. Notice that this non-negativity holds for an even wider class of “measures". Indeed, if we define

$$\begin{aligned} {\mathcal {S}}_k:=\left\{ (\mu _+, \mu _-) \,\Big |\, \begin{array}{c} \mu _+\ \text {and}\ \mu _-\ \text {are}\ \sigma \text {-finite Borel measures}\\ \int _X\int _Xk(x,y)\,\textrm{d}\mu _+(x)\,\textrm{d}\mu _+(y), \int _X\int _Xk(x,y)\,\textrm{d}\mu _-(x)\,\textrm{d}\mu _-(y)<\infty \end{array}\right\} \end{aligned}$$

and for each \(\nu =(\nu _+,\nu _-)\in {\mathcal {S}}_k\) define

$$\begin{aligned} \int _X\int _Xk(x,y)\,\textrm{d}\nu (x)\,\textrm{d}\nu (y)&:= \int _X\int _Xk(x,y)\,\textrm{d}\nu _+(x)\,\textrm{d}\nu _+(y)\ + \ \int _X\int _Xk(x,y)\,\textrm{d}\nu _-(x)\,\textrm{d}\nu _-(y)\\&\quad - \int _X\int _Xk(x,y)\,\textrm{d}\nu _+(x)\,\textrm{d}\nu _-(y) -\int _X\int _Xk(x,y)\,\textrm{d}\nu _-(x)\,\textrm{d}\nu _+(y), \end{aligned}$$

then this integral is well-defined and generalizes the quadratic form for ordinary signed measures. We formally write \(\nu =\nu _+-\nu _-\) in such a situation, and also call it the Hahn-Jordan decomposition of \(\nu \).

Lemma 42

Let \(K:{\mathbb {R}}\rightarrow {\mathbb {R}}_{\ge 0}\cup \{+\infty \}\) be an even function. If \(K\in L^1({\mathbb {R}})\), K is convex on \([0, \infty )\), and \(\lim _{x\searrow 0}K(x)=K(0)\), then \(K(x-y)\) is positive semi-definite in measure.

Proof

Because K is integrable and convex, K is continuous on \((0, \infty )\) and \(\lim _{x\rightarrow \infty }K(x)=0\) holds true. If \(K(0)<\infty \), then K is continuous on \({\mathbb {R}}\), and such a function is said to be of Pólya type. Pólya-type functions are known to be characteristic functions of positive bounded Borel measures, i.e., there exists a positive bounded measure \(\alpha \) on \({\mathbb {R}}\) such that

$$\begin{aligned} K(x)=\int _{\mathbb {R}}e^{-i\omega x} \,\textrm{d}\alpha (\omega ) \end{aligned}$$
(26)

is valid [4, 12]. Let \(\mu \) be a signed Borel measure with \(\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y)\) being finite and \(|\mu |\) being \(\sigma \)-finite. Then, we can take a sequence of increasing Borel sets \(A_1\subset A_2\subset \cdots \rightarrow {\mathbb {R}}\) satisfying \(|\mu |(A_k)<\infty \) for all k. Let \(\mu =\mu _+-\mu _-\) be the Hahn-Jordan decomposition and \(\mu _+^k:=\mu _+(A_k\cap \cdot )\), \(\mu _-^k:=\mu _-(A_k\cap \cdot )\). For each k, by Fubini’s theorem and (26), we have

$$\begin{aligned} \int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y) \,\textrm{d}(\mu _+^k-\mu _-^k)(x)\,\textrm{d}(\mu _+^k-\mu _-^k)(y) =\int _{\mathbb {R}}\left| \int _{\mathbb {R}}e^{-i\omega x}\,\textrm{d}(\mu _+^k-\mu _-^k)(x)\right| ^2\,\textrm{d}\alpha (\omega )\ge 0. \end{aligned}$$

This can be rewritten as

$$\begin{aligned} \int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y) \,\textrm{d}\mu _+^k(x)\,\textrm{d}\mu _+^k(y) \ +&\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y) \,\textrm{d}\mu _-^k(x)\,\textrm{d}\mu _-^k(y) \nonumber \\&\ge 2\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y) \,\textrm{d}\mu _+^k(x)\,\textrm{d}\mu _-^k(y). \end{aligned}$$
(27)

The integrals in (27) have integrands that are monotonically increasing with respect to k. Indeed, the first term of the left-hand side is written in the form

$$\begin{aligned} \int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y) \,\text {d}\mu _+^k(x)\,\text {d}\mu _+^k(y) =\int _{\mathbb {R}}\int _{\mathbb {R}}1_{A_k\times A_k}(x, y)K(x-y)\,\text {d}\mu _+(x)\,\text {d}\mu _+(y) \end{aligned}$$

and its integrand \( 1_{A_k\times A_k}(x, y)K(x-y) \) is monotone increasing with respect to k because \(A_{1} \subset A_{2} \subset \cdots \). Similar arguments can be applied to the other terms. Therefore we get the desired inequality by letting \(k \rightarrow \infty \) and using the monotone convergence theorem in (27).

Let us consider the case \(K(0)=\infty \). In this case, K is continuous on \((0, \infty )\) and satisfies \(\lim _{x\searrow 0}K(x)=K(0)=\infty \). For any \(\varepsilon >0\), define

$$\begin{aligned} K_\varepsilon (x):=\frac{1}{\varepsilon }\int _0^\varepsilon K(|x|+z)\,\textrm{d}z,\quad x\in {\mathbb {R}}. \end{aligned}$$

Then, by \(K\in L^1({\mathbb {R}})\), \(K_\varepsilon \) is bounded everywhere by \(\varepsilon ^{-1}\Vert K\Vert _{L^1}\). Moreover, \(K_\varepsilon \) is still convex, so that \(K_\varepsilon (x-y)\) is positive semi-definite in measure. Now, the continuity of K leads to

$$\begin{aligned} K_\varepsilon (x) =\int _0^1K(|x|+\varepsilon z) \,\textrm{d}z \nearrow K(|x|)=K(x)\quad (\varepsilon \searrow 0) \end{aligned}$$

by the monotone convergence theorem. Applying the monotone convergence theorem to both sides of (25) with \(K=K_\varepsilon \), we obtain the conclusion. \(\square \)
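As a quick numerical illustration of Lemma 42 (our own sketch, not part of the proof), the following Python snippet checks that a Gram matrix of the smoothed kernel \(K_\varepsilon \) used in the proof has no negative eigenvalues; the parameter values and the random points are arbitrary sample choices.

```python
import numpy as np
from scipy.integrate import quad

d = 1.0
K = lambda x: -np.log(np.abs(np.tanh(np.pi * x / (4.0 * d))))

def K_eps(x, eps=1e-2):
    # K_eps(x) = (1/eps) * int_0^eps K(|x| + z) dz, the smoothing from the proof
    val, _ = quad(lambda z: K(abs(x) + z), 0.0, eps)
    return val / eps

rng = np.random.default_rng(0)
pts = rng.uniform(-3.0, 3.0, size=8)
G = np.array([[K_eps(xi - xj) for xj in pts] for xi in pts])
print(np.linalg.eigvalsh(G).min())  # expected to be nonnegative up to rounding error
```

Since \(K_\varepsilon \) is even, bounded, convex, and decreasing on \([0,\infty )\), it is of Pólya type, so the matrix \((K_\varepsilon (x_i-x_j))_{i,j}\) is positive semi-definite.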

The function \(K=-\log \left| \tanh \left( \frac{\pi }{4d}\cdot \right) \right| \) satisfies the conditions of Lemma 42. Thus, we can regard the optimization problem

$$\begin{aligned} (\textrm{P})\quad \begin{array}{lr}\text {minimize}&{}\displaystyle \int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y) + 2\int _{\mathbb {R}}Q(x)\,\textrm{d}\mu (x) \\ \text {subject to}&{}\mu \in {\mathcal {M}}({\mathbb {R}}, n) \end{array} \end{aligned}$$

as a convex quadratic programming problem. By analogy with the finite-dimensional case in [1], we can formulate the dual problem as

$$\begin{aligned} (\textrm{D})\quad \begin{array}{lr}\text {maximize}&{}\displaystyle -\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\nu (x)\,\textrm{d}\nu (y) + 2ns\\ \text {subject to}&{}\displaystyle \nu \in {\mathcal {S}}_K,\ s-\int _{\mathbb {R}}K(\cdot -y)\,\textrm{d}\nu (y)\le Q. \end{array} \end{aligned}$$

Note that this is a rigorous version of problem (D) in (15). It should be noted here that we have not justified (D) as a formal (topological) dual problem. There are arguments limited to the optimization of Radon measures over compact spaces [10, 11, 27]. While those concern quadratic programming problems, there also exist more general duality theories, such as [3], von Neumann's minimax theorem [8, 16], and the Fenchel-Rockafellar duality theorem [13, 26]. However, because it is essential for our purposes that the duality can treat infinite measures \(\nu \) with unbounded support (we indeed use such a measure later as a dual feasible solution), it is difficult to simply apply these existing studies and check all the conditions for (D) to be a topological dual problem. Therefore, we do not go deeper into this aspect here, but only prove the assertion of Theorem 43. This assertion is sufficient to derive a lower bound of the optimal value of (P), which is our objective.

In the following, we demonstrate that weak duality and strong duality are still valid for this infinite-dimensional primal-dual pair. It should be noted that \((\nu , s)=(0, 0)\) is a trivial feasible solution of (D), so that the optimal value of (D) exists.

Theorem 43

The optimal value of (D) is equal to the optimal value of (P).

Proof

First, we prove the weak duality. Let \(\mu \) and \((\nu , s)\) be feasible solutions of (P) and (D), respectively, and let \(\nu =\nu _+-\nu _-\) be the Hahn-Jordan decomposition. If we write \(\langle \alpha , \beta \rangle _K:=\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\alpha (x)\,\textrm{d}\beta (y)\) for measures \(\alpha \) and \(\beta \),

$$\begin{aligned} \langle \nu , \nu \rangle _K=\langle \nu _+, \nu _+ \rangle _K+\langle \nu _-, \nu _- \rangle _K-2\langle \nu _+, \nu _- \rangle _K \end{aligned}$$

holds true. Because \(\langle \mu , \mu \rangle _K, \langle \nu _+, \nu _+ \rangle _K, \langle \nu _-, \nu _- \rangle _K<\infty \), we have \(\langle \mu , \nu _+ \rangle _K, \langle \mu , \nu _- \rangle _K, \langle \nu _+, \nu _- \rangle _K<\infty \) by K’s positive semi-definiteness in measure. Therefore, we have

$$\begin{aligned}&\left( \int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu (x)\,\textrm{d}\mu (y) + 2\int _{\mathbb {R}}Q(x)\,\textrm{d}\mu (x) \right) \\&\qquad -\left( -\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\nu (x)\,\textrm{d}\nu (y) + 2ns \right) \\&\quad =\langle \mu , \mu \rangle _K+\langle \nu , \nu \rangle _K+2\int _{\mathbb {R}}(Q(x)-s)\,\textrm{d}\mu (x)\\&\quad \ge \langle \mu , \mu \rangle _K+\left( \langle \nu _+, \nu _+ \rangle _K+\langle \nu _-, \nu _- \rangle _K-2\langle \nu _+, \nu _- \rangle _K \right) \\&\qquad +2\int _{\mathbb {R}}\left( -\int _{\mathbb {R}}K(x-y)\,\textrm{d}\nu (y)\right) \,\textrm{d}\mu (x)\\&\quad =\langle \mu , \mu \rangle _K+\langle \nu _+, \nu _+ \rangle _K+\langle \nu _-, \nu _- \rangle _K-2\langle \nu _+, \nu _- \rangle _K -2\langle \mu , \nu _+ \rangle _K+2\langle \mu , \nu _- \rangle _K\\&\quad =\langle \mu +\nu _-, \mu +\nu _- \rangle _K+\langle \nu _+, \nu _+ \rangle _K-2\langle \mu +\nu _-, \nu _+ \rangle _K\ge 0 \end{aligned}$$

by the positive semi-definiteness in measure. This indicates the weak duality. Note that we have the last inequality above by replacing \(\mu \) and \(\nu \) in (25) in Definition 41 with \(\mu + \nu _{-}\) and \(\nu _{+}\), respectively.

To prove the strong duality, we construct an optimal solution of (D) from that of (P). By Theorem 2.4 in [24], the optimal solution \(\mu ^*\) of (P) satisfies

$$\begin{aligned} \int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu ^*(y)+Q(x)\ge \frac{F_{K, Q}^\textrm{C}(n)}{n} \end{aligned}$$
(28)

for all \(x\in {\mathbb {R}}\). Hence \((\mu ^*, n^{-1}F^\textrm{C}_{K, Q}(n))\) is a feasible solution of (D). Moreover, the equality version of (28) holds on the support of \(\mu ^*\). Therefore we have

$$\begin{aligned}&-\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu ^*(x)\,\textrm{d}\mu ^*(y)+2n\frac{F^\textrm{C}_{K, Q}(n)}{n}\\&\quad =-\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu ^*(x)\,\textrm{d}\mu ^*(y) +2\int _{\mathbb {R}}\left( Q(x)+\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu ^*(y)\right) \,\textrm{d}\mu ^*(x)\\&\quad =\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)\,\textrm{d}\mu ^*(x)\,\textrm{d}\mu ^*(y)+2\int _{\mathbb {R}}Q(x)\,\textrm{d}\mu ^*(x). \end{aligned}$$

This shows the strong duality. \(\square \)

5 Proof of Theorem 24

We can now give a lower bound of \(F_{K, Q}^\textrm{C}(n)\) by using the dual problem (D) and thereby prove Theorem 24. Let \(\alpha >0\) be a constant and let f be the inverse Fourier transform of

$$\begin{aligned} \left( {\mathcal {F}}[f](\omega )=\right) \quad \frac{\omega }{\pi \tanh (d\omega )} \int _{-\alpha }^\alpha \left( Q(\alpha ) - Q(x) \right) e^{-i\omega x}\,\textrm{d}x \end{aligned}$$

Then f is square-integrable by Theorem 4.4 in [23]. Here, the Fourier transform of a function \(g\in L^1({\mathbb {R}})\cap L^2({\mathbb {R}})\) is defined by

$$\begin{aligned} {\mathcal {F}}[g](\omega ):=\int _{\mathbb {R}}g(x)e^{-i\omega x}\,\textrm{d}x \end{aligned}$$

and for the whole space \(L^2({\mathbb {R}})\), \({\mathcal {F}}[\cdot ]\) is defined as the continuous extension of \({\mathcal {F}}[\cdot ]|_{L^1\cap L^2}\). Because Q(x) is even by the assumption, f is an inverse Fourier transform of an even real function, so that f itself is an even real function. Then, the formula [9, p.43, 7.112]

$$\begin{aligned} {\mathcal {F}}\left[ \log \left| \tanh \left( \frac{\pi }{4d}\cdot \right) \right| \right] (\omega ) =-\frac{\pi }{\omega }\tanh (d\omega ) \end{aligned}$$

leads to the (almost everywhere) equation

$$\begin{aligned} {\mathcal {F}}\left[ \int _{\mathbb {R}}K(x-y)f(y)\,\textrm{d}y\right] (\omega ) ={\mathcal {F}}[K](\omega )\cdot {\mathcal {F}}[f](\omega ) =\int _{-\alpha }^\alpha \left( Q(\alpha ) - Q(x) \right) e^{-i\omega x}\,\textrm{d}x, \end{aligned}$$
(29)

where \(K\in L^1({\mathbb {R}})\cap L^2({\mathbb {R}})\) and \(f\in L^2({\mathbb {R}})\) are used for the justification of the first equality. The former statement \(K\in L^1({\mathbb {R}})\cap L^2({\mathbb {R}})\) follows from

$$\begin{aligned}&\int _{-\infty }^\infty \left( -\log \left| \tanh \left( \frac{\pi }{4d} x \right) \right| \right) \,\textrm{d}x< \infty \quad \text {and} \\&\int _{-\infty }^\infty \left( -\log \left| \tanh \left( \frac{\pi }{4d} x \right) \right| \right) ^2 \,\textrm{d}x < \infty . \end{aligned}$$

The integrability of \(K(x-\cdot )f(\cdot )\) follows from \(K\in L^1({\mathbb {R}})\) and \(f\in L^2({\mathbb {R}})\). Indeed, we have

$$\begin{aligned} \left| \int _{{\mathbb {R}}} K(x-y) f(y) \,\textrm{d}y \right| ^{2}&\le \left( \int _{{\mathbb {R}}} |K(x-y)|^{1/2} |K(x-y)|^{1/2} |f(y)| \,\textrm{d}y \right) ^{2} \\&\le \int _{{\mathbb {R}}} |K(x-y)| \,\textrm{d}y \int _{{\mathbb {R}}} |K(x-y)| |f(y)|^{2} \,\textrm{d}y \\&= \Vert K \Vert _{L^{1}} \int _{{\mathbb {R}}} |K(x-y)| |f(y)|^{2} \,\textrm{d}y, \end{aligned}$$

where the Cauchy-Schwarz inequality is used in the second inequality. Therefore the integrability is shown as follows:

$$\begin{aligned} \left\| \int _{{\mathbb {R}}} K(\cdot -y) f(y) \,\textrm{d}y \right\| _{L^{2}}^{2}&\le \Vert K \Vert _{L^{1}} \int _{{\mathbb {R}}} \int _{{\mathbb {R}}} |K(x-y)| |f(y)|^{2} \,\textrm{d}y \,\textrm{d}x \\&= \Vert K \Vert _{L^{1}} \int _{{\mathbb {R}}} \left( \int _{{\mathbb {R}}} |K(x-y)| \,\textrm{d}x \right) |f(y)|^{2} \,\textrm{d}y \\&= \Vert K \Vert _{L^{1}}^{2} \Vert f \Vert _{L^{2}}^{2} < \infty , \end{aligned}$$

where the Fubini theorem is used in the first equality. Considering the inverse Fourier transform of (29), we also have

$$\begin{aligned} \int _{\mathbb {R}}K(x-y)f(y)\,\textrm{d}y=1_{[-\alpha , \alpha ]}(x)(Q(\alpha )-Q(x)). \end{aligned}$$

It should be noted that \(f(x)\,\textrm{d}x \in {\mathcal {S}}_K\) follows from the inequality

$$\begin{aligned} \int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)|f(x)f(y)|\,\textrm{d}x\,\textrm{d}y\le \Vert K*|f|\,\Vert _{L^2}\Vert f\Vert _{L^2}\le \Vert K\Vert _{L^1}\Vert f\Vert _{L^2}^2<\infty . \end{aligned}$$

These two relations imply that \((f(x)\,\textrm{d}x, Q(\alpha ))\) is a feasible solution of (D). We can now evaluate the value of the objective function of (D). Let us define

$$\begin{aligned} F(\alpha ):=-\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)f(x)f(y)\,\textrm{d}x\,\textrm{d}y+2nQ(\alpha ). \end{aligned}$$
(30)

Because the first term can be considered as the inner product of \(K*f\) and f in \(L^2({\mathbb {R}})\), it can be computed through the Fourier transform as

$$\begin{aligned}&\int _{\mathbb {R}}\int _{\mathbb {R}}K(x-y)f(x)f(y)\,\textrm{d}x\,\textrm{d}y\nonumber \\&\quad =\frac{1}{2\pi }\int _{\mathbb {R}}\left( \frac{\omega }{\pi \tanh (d\omega )} \int _{-\alpha }^\alpha \left( Q(\alpha ) - Q(x) \right) e^{-i\omega x}\,\textrm{d}x\ \overline{\int _{-\alpha }^\alpha \left( Q(\alpha ) - Q(x) \right) e^{-i\omega x}\,\textrm{d}x}\right) \,\textrm{d}\omega \nonumber \\&\quad =\frac{1}{2\pi ^2}\int _{\mathbb {R}}\frac{\omega }{\tanh (d\omega )} \left| \int _{-\alpha }^\alpha \left( Q(\alpha ) - Q(x) \right) e^{-i\omega x}\,\textrm{d}x\right| ^2 \,\textrm{d}\omega . \end{aligned}$$
(31)

Let \(G(\alpha )\) be the value of the right-hand side. \(G(\alpha )\) can be decomposed into two parts, which are defined as

$$\begin{aligned} G_1(\alpha ):=\frac{1}{2\pi ^2}\int _{-1}^1\frac{\omega }{\tanh (d\omega )} \left| \int _{-\alpha }^\alpha \left( Q(\alpha ) - Q(x) \right) e^{-i\omega x}\,\textrm{d}x\right| ^2 \,\textrm{d}\omega \end{aligned}$$

and

$$\begin{aligned} G_2(\alpha ):=\frac{1}{2\pi ^2}\int _{[-1, 1]^c}\frac{\omega }{\tanh (d\omega )} \left| \int _{-\alpha }^\alpha \left( Q(\alpha ) - Q(x) \right) e^{-i\omega x}\,\textrm{d}x\right| ^2 \,\textrm{d}\omega . \end{aligned}$$

We first evaluate \(G_1\). Because the function \(\omega /\tanh (d\omega )\) is monotonically increasing in \([0, \infty )\) (see the proof of Lemma 31), we have

$$\begin{aligned} G_1(\alpha )&\le \frac{1}{\pi \tanh (d)}\cdot \frac{1}{2\pi }\int _{\mathbb {R}}\left| \int _{-\alpha }^\alpha \left( Q(\alpha ) - Q(x) \right) e^{-i\omega x}\,\textrm{d}x\right| ^2\,\textrm{d}\omega \nonumber \\&=\frac{1}{\pi \tanh (d)}\Vert 1_{[-\alpha , \alpha ]}(x)(Q(\alpha )-Q(x))\Vert _{L^2}^2\nonumber \\&\le \frac{2}{\pi \tanh (d)}\ \alpha Q(\alpha )^2. \end{aligned}$$
(32)

Next, we similarly evaluate \(G_2\). By integration by parts, we get

$$\begin{aligned} \omega \int _{-\alpha }^\alpha (Q(\alpha )-Q(x))e^{-i\omega x}\,\textrm{d}x =-\frac{1}{i}\int _{-\alpha }^\alpha Q'(x)e^{-i\omega x}\,\textrm{d}x. \end{aligned}$$

Thus, we have

$$\begin{aligned} G_2(\alpha )&=\frac{1}{2\pi ^2}\int _{[-1, 1]^c}\frac{1}{\omega \tanh (d\omega )} \left| \int _{-\alpha }^\alpha Q'(x)e^{-i\omega x}\,\textrm{d}x \right| ^2\,\textrm{d}\omega \nonumber \\&\le \frac{1}{\pi \tanh (d)}\Vert 1_{[-\alpha ,\alpha ]}(x)Q'(x)\Vert _{L^2}^2\nonumber \\&\le \frac{2}{\pi \tanh (d)}\ \alpha Q'(\alpha )^2. \end{aligned}$$
(33)

Finally, we reach the evaluation

$$\begin{aligned} G(\alpha )&\le \frac{2\alpha }{\pi \tanh (d)}\left( Q(\alpha )^2+Q'(\alpha )^2\right) , \\ F(\alpha )&\ge 2nQ(\alpha )-\frac{2\alpha }{\pi \tanh (d)}\left( Q(\alpha )^2+Q'(\alpha )^2\right) . \end{aligned}$$

By letting \(\alpha _n\) satisfy

$$\begin{aligned} \frac{2\alpha _n}{\pi \tanh (d)}\frac{Q(\alpha _n)^2+Q'(\alpha _n)^2}{Q(\alpha _n)}\le n, \end{aligned}$$

we get \(nQ(\alpha _n)\) as a lower bound for the optimal value of (P) by the weak duality in Theorem 43. For such \(\alpha _n\), we finally have

$$\begin{aligned} nQ(\alpha _n) \le I^\textrm{C}_{n}(\mu _n^*) \le 2F_{K, Q}^\textrm{C}(n), \end{aligned}$$

where the second inequality follows from (10), (11), and the non-negativity of K; this is equivalent to the assertion of Theorem 24.

6 Examples of convergence rates for several Q(x)’s

Although the asymptotic rates given in [23, Section 4.3] are derived through mathematically informal arguments, we here demonstrate that those rates roughly coincide with the bound in Theorem 24.

Example 61

(The case w is a single exponential) Consider the case

$$\begin{aligned} w(x)=\exp \left( -(\beta |x|)^\rho \right) ,\quad Q(x)=(\beta |x|)^\rho , \end{aligned}$$

for \(\beta >0\) and \(\rho \ge 1\). In this case, for a sufficiently large \(\alpha \) (satisfying \(\alpha \ge \rho \)), we have

$$\begin{aligned} \frac{2\alpha }{\pi \tanh (d)}\frac{Q(\alpha )^2+Q'(\alpha )^2}{Q(\alpha )} =\frac{2\alpha }{\pi \tanh (d)}\frac{(\beta \alpha )^{2\rho } +(\beta \rho )^2(\beta \alpha )^{2(\rho -1)}}{(\beta \alpha )^\rho } \le \frac{4\beta ^\rho \alpha ^{\rho +1}}{\pi \tanh (d)} \end{aligned}$$

and \(\alpha _n\) can be taken as

$$\begin{aligned}&\alpha _n=\left( \frac{\pi \tanh (d)}{4\beta ^\rho }n\right) ^{\frac{1}{\rho +1}}, \nonumber \\&\frac{Q(\alpha _n)}{2} =\frac{1}{2}\beta ^\rho \left( \frac{\pi \tanh (d)}{4\beta ^\rho }n\right) ^{\frac{\rho }{\rho +1}} \qquad \left( =\Theta \left( \beta ^{\frac{\rho }{\rho +1}}n^{\frac{\rho }{\rho +1}}\right) \right) , \end{aligned}$$
(34)

for sufficiently large n. This rate roughly coincides with (4.37) in [23].
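The closed form (34) can be checked numerically; in the following Python sketch, the parameter values \(d=1\), \(\beta =1\), \(\rho =2\) are our own sample choices.

```python
import numpy as np

d, beta, rho = 1.0, 1.0, 2.0
Q  = lambda x: (beta * x) ** rho
dQ = lambda x: rho * beta * (beta * x) ** (rho - 1)

def lhs(alpha):
    # left-hand side of the condition on alpha_n in Theorem 24
    return 2.0 * alpha / (np.pi * np.tanh(d)) * (Q(alpha) ** 2 + dQ(alpha) ** 2) / Q(alpha)

for n in [100, 1000, 10000]:
    alpha_n = (np.pi * np.tanh(d) * n / (4.0 * beta ** rho)) ** (1.0 / (rho + 1.0))  # (34)
    print(n, lhs(alpha_n) <= n, Q(alpha_n) / 2.0)  # condition holds; bound grows like n^(rho/(rho+1))
```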

Example 62

(The case w is a double exponential) Consider the case

$$\begin{aligned} w(x)=\exp \left( -\beta \exp (\gamma |x|)\right) , \quad Q(x)=\beta \exp (\gamma |x|), \end{aligned}$$

for \(\beta ,\gamma >0\). In this case,

$$\begin{aligned} \frac{2\alpha }{\pi \tanh (d)}\frac{Q(\alpha )^2+Q'(\alpha )^2}{Q(\alpha )} =\frac{2\alpha \beta (1+\gamma ^2)\exp (\gamma \alpha )}{\pi \tanh (d)} \end{aligned}$$

is valid. Let \(\alpha _n>0\) be such that the right-hand side is equal to n. Then, we have

$$\begin{aligned} \gamma \alpha _n=W\left( \frac{\pi \tanh (d)\gamma }{2\beta (1+\gamma ^2)}n \right) \qquad \left( \sim \log \left( \frac{\gamma }{\beta (1+\gamma ^2)}n\right) \right) , \end{aligned}$$

where W is Lambert’s W function, i.e., the inverse of \(x\mapsto xe^x\). Using this, we get

$$\begin{aligned} \frac{Q(\alpha _n)}{2}&=\frac{\beta }{2\gamma \alpha _n}\cdot \gamma \alpha _n\exp (\gamma \alpha _n) =\frac{\beta }{2\gamma \alpha _n}\frac{\pi \tanh (d)\gamma }{2\beta (1+\gamma ^2)}n =\frac{\pi \tanh (d)}{4(1+\gamma ^2)}\frac{n}{\alpha _n} \nonumber \\&= \frac{\pi \tanh (d) \gamma }{4(1+\gamma ^2)}\frac{n}{W\left( \frac{\pi \tanh (d)\gamma }{2\beta (1+\gamma ^2)}n \right) } \qquad \left( \sim \frac{\pi \tanh (d)\gamma }{4(1+\gamma ^2)} \frac{n}{\log \left( \frac{\gamma }{\beta (1+\gamma ^2)}n\right) } \right) . \end{aligned}$$
(35)

This rate roughly coincides with the asymptotic order (4.44) in [23] for each fixed constant \(\gamma \).
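Similarly, \(\alpha _n\) in Example 62 can be computed with the Lambert W function; in the Python sketch below, the values \(d=\beta =\gamma =1\) are our own sample choices (scipy.special.lambertw evaluates W).

```python
import numpy as np
from scipy.special import lambertw

d, beta, gamma = 1.0, 1.0, 1.0
c = np.pi * np.tanh(d) * gamma / (2.0 * beta * (1.0 + gamma ** 2))

for n in [10, 100, 1000]:
    alpha_n = np.real(lambertw(c * n)) / gamma        # gamma * alpha_n = W(c n)
    lhs = 2.0 * alpha_n * beta * (1.0 + gamma ** 2) * np.exp(gamma * alpha_n) / (np.pi * np.tanh(d))
    print(n, np.isclose(lhs, n), beta * np.exp(gamma * alpha_n) / 2.0)  # condition met with equality; bound ~ n / log n
```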

Remark 3

We choose the weight functions in Examples 61 and 62 for simplicity although they are not (necessarily) analytic in the strip region \({\mathcal {D}}_{d}\) for any \(d > 0\). This is because we just need their asymptotic properties for finding \(\alpha _{n}\).

7 Conclusion

In this study, we analyzed the approximation method proposed in [24] over the weighted Hardy spaces \({\mathbb {H}}^\infty ({\mathcal {D}}_d, w)\). We provided (1) a proof that the approximation formulas are nearly optimal from the viewpoint of the minimum worst-case error \(E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))\); and (2) upper bounds of \(E_n^{\min }({\mathbb {H}}^\infty ({\mathcal {D}}_d, w))\) that evaluate the convergence rates of the approximation errors as \(n\rightarrow \infty \). To obtain (2), we introduced the concept “positive semi-definite in measure” and, by using this, provided a lower bound for \(F_{K, Q}^\textrm{C}(n)\). We also compared the resulting bounds with those in [23], and demonstrated that they have the same convergence rate as \(n\rightarrow \infty \).

The new bounds do not indicate that the approximation formulas in [24] are optimal. Another method for bounding the error was recently considered in [7], although their bound does not show the optimality either. We need tighter bounds to show the optimality, which may require more sophisticated analysis. We leave such analysis to future work.