The approximation capabilities of Durrmeyer-type neural network operators

Coroianu, Lucian; Costarelli, Danilo; Natale, Mariarosaria; Pantiş, Alexandra

doi:10.1007/s12190-024-02146-9

The approximation capabilities of Durrmeyer-type neural network operators

Original Research
Open access
Published: 11 June 2024

(2024)
Cite this article

Download PDF

You have full access to this open access article

Journal of Applied Mathematics and Computing Aims and scope Submit manuscript

The approximation capabilities of Durrmeyer-type neural network operators

Download PDF

263 Accesses
Explore all metrics

Abstract

In this paper, a new family of neural network (NN) operators is introduced. The idea is to consider a Durrmeyer-type version of the widely studied discrete NN operators by Costarelli and Spigler (Neural Netw 44:101–106, 2013). Such operators are constructed using special density functions generated from suitable sigmoidal functions, while the reconstruction coefficients are based on a convolution between a general kernel function $\chi $ and the function being reconstructed, f. Here, we investigate their approximation capabilities, establishing both pointwise and uniform convergence theorems for continuous functions. We also provide quantitative estimates for the approximation order thanks to the use of the modulus of continuity of f; this turns out to be strongly influenced by the asymptotic behaviour of the sigmoidal function $\sigma $. Our study also shows that the estimates we provide are, under suitable assumptions, the best possible. Finally, $L^p$-approximation is also established. At the end of the paper, examples of activation functions are discussed.

Approximation by exponential sampling type neural network operators

Article 15 May 2021

Quantitative Estimates for Neural Network Operators Implied by the Asymptotic Behaviour of the Sigmoidal Activation Functions

Article 04 September 2022

Voronovskaja Type Theorems and High-Order Convergence Neural Network Operators with Sigmoidal Functions

Article 08 May 2020

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

In recent years, there has been a large increase in interest of the scientific community in neural network (NN) operators driven by their wide-ranging applications, in particular in numerical analysis, artificial intelligence, and machine learning [15, 30,31,32, 35, 37, 38].

Anastassiou [3], was the first to establish NN approximations for continuous functions. He provided estimates for the rate of convergence using NN operators of the Cardaliaguet-Euvrard and Squashing type [11], based on the modulus of continuity of the function being approximated, thereby deriving Jackson type inequalities. The same author continued the investigations on the rate of convergence for univariate and multivariate NN operators activated by the hyperbolic tangent and by the logistic sigmoidal functions (see, e.g., [4]). Ramp function activated NN operators were further examined in [8].

Afterward, in [19], the authors established a unified approach for a general class of sigmoidal functions and they introduced the following discrete NN operators

$$\begin{aligned} (F_nf)(x) := \frac{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }}f\left( \frac{k}{n}\right) \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }}\phi _{\sigma }(nx-k)}, \quad x\in [a,b], \end{aligned}$$

(I)

for sufficiently large $n \in \mathbb {N}^{+}$, where $\phi _\sigma $ is a suitable linear combination of sigmoidal activation functions $\sigma : \mathbb {R} \rightarrow \mathbb {R}$, and $f: [a,b] \rightarrow \mathbb {R}$ is a bounded function. In the above definition, $\lfloor \cdot \rfloor $ and $\lceil \cdot \rceil $ denote the integral and ceiling part of a given number.

In [20], replacing the sampling values f(k/w) in (I) with an average of f in an interval containing k/w, Costarelli and Spigler constructed the NN-type operators in the Kantorovich form, as follows

$$\begin{aligned} (K_nf)(x) := \frac{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }-1}\left[ n\int _{k/n}^{(k+1)/n}f(u)du\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }-1}\phi _{\sigma }(nx-k)}, \quad x\in [a,b], \end{aligned}$$

(II)

for sufficiently large $ n \in \mathbb {N}^{+} $ such that $ \lceil na \rceil \le \lfloor {nb \rfloor } - 1 $. For more detailed insights into such operators, we refer to [13, 14, 36].

In practice, the passage from the operators $F_n$ to $K_n$ followed the same path as the evolution from the Bernstein polynomials to the Kantorovich polynomials.

Following the same way, the next step appears to involve the introduction of a Durrmeyer version of the NN-type operators, just as a Durrmeyer version of the Bernstein polynomials has been defined, extensively studied, and extended in the literature (see, e.g., [5, 6, 17, 23,24,25,26,27, 29]).

In this paper, motivated by the above considerations, we introduce the Durrmeyer-type NN operators, as follows

$$\begin{aligned} \left( D_{n}^{\sigma ,\chi }f\right) (x):=\frac{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }-1} \left[ n\, \int _{a}^{b}\chi (nt-k)f(t)\, dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx-k)}, \quad x\in [a,b]. \end{aligned}$$

(III)

Here, the coefficients are obtained through a general convolution integral involving a kernel function $\chi $ and the function being approximated, denoted as f. The operators in (III) represent a generalization of the Kantorovich NN operators, as we will show in the Remark 8 below.

Here, we investigate the approximation capabilities of $D_n^{\sigma ,\chi }$, establishing both pointwise and uniform convergence theorems for continuous functions (Sect. 4). We also provide quantitative estimates for the approximation order thanks to the use of the modulus of continuity $\omega (f, \delta )$, $\delta > 0$, of f; this turns out to be strongly influenced by the asymptotic behaviour of the sigmoidal function $\sigma $ (Sect. 5). Our study also shows that the estimates we provide are, under suitable assumptions, the best possible on the space of continuous functions on [a, b]. Finally, also $L^p$, $1 \le p<+\infty $, convergence theorem is established (Sect. 6). In order to prove such a result, we exploit a norm inequality for the above operators, and the density of the continuous functions in $L^p$.

Finally, several examples of sigmoidal activation functions have been presented (see, e.g., [9, 10, 33, 34]); here we also show that also the case of the celebrated ReLU and RePUs function can be included in the results established by the present approach.

2 Preliminaries

Throughout the paper, we denote the integer part and the ceiling of a fixed real number by $\lfloor {\cdot \rfloor }$ and $\lceil {\cdot \rceil }$, respectively.

In line with the definition given in [22], we will assume that $\sigma $ is a sigmoidal function, i.e., a non-decreasing measurable function such that

$$\begin{aligned} \lim _{x\rightarrow -\infty }\sigma (x)=0,\quad \quad \lim _{x\rightarrow +\infty }\sigma (x)=1. \end{aligned}$$

Let us consider a sigmoidal function $\sigma :\mathbb {R}\rightarrow \mathbb {R}$ such that $\sigma (x)-1/2$ is an odd function, $\sigma \in C^{2}(\mathbb {R})$ is concave for $x\ge 0$, and $\sigma (x)={\mathcal {O}}\left( \left| x\right| ^{-\alpha }\right) $ as $x\rightarrow -\infty $ for some $\alpha >1$ (see, e.g., [19]).

Now, taking $\phi _{\sigma }(x)=\frac{1}{2}\left[ \sigma (x+1)-\sigma (x-1)\right] $, $x\in {\mathbb {R}}$, it is immediate that $\phi _{\sigma }$ is an even function taking non-negative values, non-decreasing on $(-\infty ,0]$ and non-increasing on $[0,+\infty )$. In addition to that, $\phi _{\sigma }$ satisfies a list of properties (see Lemmas 2.2, 2.4, 2.6 and 2.7 in [19] and Lemma 2.6 (I) in [20]), which can be summarized as follows.

Lemma 1

We have

(i)
$\displaystyle \sum _{k\in \mathbb {Z}}\phi _{\sigma }(x-k)=1$, for all $x\in \mathbb {R}$;
(ii)
$1\ge \displaystyle \sum _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }}\phi _{\sigma }(nx-k)\ge \phi _{\sigma }(1)>0$, for every $n\in \mathbb {N}^{+}$ such that $\lceil {na \rceil } \le \lfloor {nb \rfloor } $, and $x\in [a,b]$;
(iii)
$\displaystyle \sum _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }-1}\phi _{\sigma }(nx-k)\ge \phi _{\sigma }(2)>0$, for every $n\in \mathbb {N}^{+}$ such that $\lceil {na \rceil } \le \lfloor {nb \rfloor } - 1$, and $x\in [a,b];$
(iv)
$\phi _{\sigma }(x)={\mathcal {O}}\left( \left| x\right| ^{-\alpha }\right) $ as $x\rightarrow \pm \infty $, and consequently $\phi _{\sigma }\in L^{1}({\mathbb {R}})$; furthermore, we also have that $\displaystyle \int _{{\mathbb {R}}}\phi _{\sigma }(t)dt=1;$
(v)
for every $\gamma >0$, we have $\displaystyle \lim _{n\rightarrow +\infty }\sum _{\left| x-k\right| >\gamma n}\phi _{\sigma }(x-k)=0$, uniformly with respect to $x\in \mathbb {R}$.

Remark 2

We can observe that, the theory of the neural network operators holds also for sigmoidal activation functions $\sigma $ which not-necessarily belong to $C^{2}(\mathbb {R})$. In the latter case, we can assume that the corresponding $\phi _{\sigma }$ satisfies the following conditions

$\phi _{\sigma }(x)$ is non-decreasing for $x<0$ and non-increasing for $x\ge 0$;
$\phi _{\sigma }(1)>0$.

For more details on this matter one can check, e.g., [19].

We recall the notion of the discrete absolute moment of order $\nu \ge 0$ of $\phi _{\sigma }$, i.e.,

$$\begin{aligned} M_{\nu }(\phi _{\sigma }):=\sup _{x\in {\mathbb {R}}}\sum _{z\in {\mathbb {Z}}} \phi _{\sigma }(x-k)|x-k|^{\nu }. \end{aligned}$$

Given the aforementioned conditions on $\sigma $, it follows that $M_{\nu }(\phi _{\sigma })<+\infty ,$ $ 0\le \nu <\alpha -1$ (see, e.g., [16]). Now, we introduce the truncated discrete absolute moment of order $\nu \ge 0$, as follows

$$\begin{aligned} M_{\lceil {na \rceil }}^{\lfloor {nb \rfloor }}(\phi _{\sigma }, \nu , x):=\sum _{\lceil {na \rceil }}^{\lfloor {nb \rfloor }}\phi _{\sigma }(nx-k)|nx-k|^{\nu }, \quad x\in [a,b],\quad n\in \mathbb {N}^{+}. \end{aligned}$$

This notion can also be applied to express the sum in ii) of Lemma 1, that can be reformulated as $M_{\lceil {na \rceil }}^{\lfloor {nb \rfloor }}(\phi _{\sigma }, 0, x)\ge \phi _{\sigma }(1)>0$.

Furthermore, we recall the following lemmas that will be useful in the next sections. These lemmas serve as a formal representation of results established in [14].

Lemma 3

Let $\sigma $ be a sigmoidal function satisfying condition (iv) of Lemma 1 with $1<\alpha <2$. Then, there exist constants $H\ge 1$, $K>0$ such that

$$\begin{aligned} M_{\lceil {na \rceil }}^{\lfloor {nb \rfloor }}(\phi _{\sigma }, 1, x)\le H + Kn^{2-\alpha }, \quad x\in [a,b], \end{aligned}$$

for $n\in \mathbb {N}^{+}$.

Lemma 4

Let $\sigma $ be a sigmoidal function satisfying condition (iv) of Lemma 1 with $\alpha =2$. Then, there exist constants $H\ge 1$, $K>0$ such that

$$\begin{aligned} M_{\lceil {na \rceil }}^{\lfloor {nb \rfloor }}(\phi _{\sigma }, 1, x)\le H +K\ln n, \quad x\in [a,b], \end{aligned}$$

for $n\in \mathbb {N}^{+}$ sufficiently large.

3 The Durrmeyer-type NN operators

From now on, we assume for simplicity that both a, b are integers, with $a<b$.

Let $\chi :\mathbb {R}\rightarrow [0,+\infty )$ be bounded and $L^{1}$ -integrable on $\mathbb {R}$, such that

$$\begin{aligned} \int _{0}^{1}\chi (u)\,du\ =:\ {{{\mathcal {K}}}}\ >0, \end{aligned}$$

(1)

and its discrete absolute moment of order 0 is finite, i.e.,

$$\begin{aligned} M_{0}(\chi ):=\sup _{u\in {\mathbb {R}}} \sum _{k\in {\mathbb {Z}}} \chi (u-k)<+\infty . \end{aligned}$$

Moreover, for any $\nu \ge 0$, let us define the continuous absolute moment of order $\nu $ as follows

$$\begin{aligned} {\tilde{M}}_{\nu }(\chi ):=\int _{{\mathbb {R}}}\chi (t)|t|^{\nu }dt. \end{aligned}$$

Now, we are able to introduce the following definition.

Definition 5

If $f:[a,b]\rightarrow \mathbb {R}$ is bounded and $L^{1}$-integrable on [a, b], the Durrmeyer-type neural network (NN) operators associated to f with respect to $\phi _{\sigma }$ and $\chi $ is defined by

$$\begin{aligned} \left( D_{n}^{\sigma ,\chi }f\right) (x):=\frac{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)f(t)\, dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx-k)}, \;\; x\in [a,b],\;\; n\in \mathbb {N}^{+}. \end{aligned}$$

(2)

In order to show that the above definition is well-defined, we have to prove the following lemma.

Lemma 6

Under the above assumptions, the following inequality holds

$$\begin{aligned} \sum _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx-k)\ \ge \ {{{\mathcal {K}}}}\, \phi _{\sigma }(2) >0, \end{aligned}$$

$x \in [a,b]$, $n \in \mathbb {N}^+$ such that $n \ge 1/(b-a)$.

Proof

Using the change of variable $y=nt-k$, we can write what follows

$$\begin{aligned} \sum _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx-k)\ =\ \sum _{k=na}^{nb-1} \left[ \int _{na-k}^{nb-k}\chi (y)\, dy\right] \phi _{\sigma }(nx-k), \end{aligned}$$

$x \in [a,b]$. Since $\chi $ is non-negative, and the following inclusions hold

$$\begin{aligned}{}[0, 1] \subset [-i,n(b-a)-i], \quad \quad i=0, 1,..., n(b-a)-1, \end{aligned}$$

we immediately obtain

$$\begin{aligned} \begin{aligned} \sum _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx-k)\ {}&\ge \ \left[ \int _{0}^{1}\chi (y)\, dy\right] \sum _{k=na}^{nb-1} \phi _{\sigma }(nx-k) \\&\ge {{{\mathcal {K}}}}\,\phi _{\sigma }(2)>0, \end{aligned} \end{aligned}$$

thanks to the use of (1) and Lemma 1 (iii). $\square $

Therefore, by Lemma 6, we immediately obtain that $D_{n}^{\sigma ,\chi }$ is well-defined, e.g., for any function that is bounded over the interval [a, b]. Then, of course, $D_{n}^{\sigma ,\chi }$ is a positive linear operator on the space of bounded and $L^{1}$-integrable function on [a, b].

Remark 7

It is worth noting that the operator $D_{n}^{\sigma ,\chi }$ preserves constant functions, a property inherent in its definition. Specifically, $D_{n}^{\sigma ,\chi }e_{0}=e_{0}$, where $e_{0}:[a,b]\rightarrow \mathbb {R}$ and $e_{0}(x)=1$.

Remark 8

We emphasize that the Durrmeyer-type NN operators generalizes some other well-known families of NN operators. For instance, the Kantorovich-type NN operators (see, e.g., [20])

$$\begin{aligned} (K_n f)(x):=\frac{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{\frac{k}{n}}^{\frac{k+1}{n}}f(t)\, dt\right] \phi _{\sigma }(nx-k)}{ \displaystyle \sum \nolimits _{k=na}^{nb-1} \phi _{\sigma }(nx-k)}, \quad x\in [a,b], \end{aligned}$$

can be viewed as a special case of the Durrmeyer-type NN operators. In fact, for any bounded $f\in L^1([a,b])$ and $\chi (u)={\textbf{1}}_{[0,1]}(u)$, $u\in {\mathbb {R}}$, where ${\textbf{1}}_{[0,1]}$ is the characteristic function on the set $[0,1]\subset {\mathbb {R}}$, we have

$$\begin{aligned} \begin{aligned} \left( D_{n}^{\sigma ,{\textbf{1}}_{[0,1]} }f\right) (x)&= \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}{\textbf{1}}_{[0,1]} (nt-k)f(t)\, dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}{\textbf{1}}_{[0,1]} (nt-k)\, dt\right] \phi _{\sigma }(nx-k)}\\&=\frac{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{\frac{k}{n}}^{\frac{k+1}{n}}f(t)\, dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1} \phi _{\sigma }(nx-k)}=(K_n f)(x). \end{aligned} \end{aligned}$$

Thus, $\left( D_{n}^{\sigma ,{\textbf{1}}_{[0,1]} }f\right) (x)=(K_n f)(x)$, for every $x\in [a,b]$ and $n\in \mathbb {N}^{+}$.

In the next Remark 14, we will provide further examples illustrating kernel functions $\chi $ suitable for defining $D_n^{\sigma ,\chi }$.

4 Pointwise and uniform convergence

If $a<b$ and $f\in C([a,b])$, we denote, as usual, the uniform norm on C([a, b]), by

$$\begin{aligned} \left\| f\right\| _{\infty } =\max _{x\in [a,b]}\left| f(x)\right| . \end{aligned}$$

Theorem 9

Let $f:[a,b]\rightarrow \mathbb {R}$ be a bounded and $L^{1}$-integrable function. Then

$$\begin{aligned} \lim _{n\rightarrow +\infty }\left( D_{n}^{\sigma ,\chi }f\right) (x_{0})=f(x_{0}), \end{aligned}$$

at any point $x_0\in [a,b]$ of continuity of f. Moreover, if $f\in C([a,b])$, then

$$\begin{aligned} \lim _{n\rightarrow +\infty }\left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty } =0. \end{aligned}$$

Proof

Let $x_0\in [a,b]$ be a point of continuity of f. Using Lemma 6, we obtain

$$\begin{aligned}&\left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \nonumber \\&\quad =\left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\cdot \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx_0-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx_0-k)}\right| \nonumber \\&\quad =\left| \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1}n\left[ \int _{a}^{b}\chi (nt-k)\left( f(t) -f(x_0)\right) dt\right] \phi _{\sigma }(nx_{0}-k)}{ \displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx_0-k)}\right| \nonumber \\&\quad \le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\sum _{k=na}^{nb-1}n\left[ \int _{a}^{b}\chi (nt-k)\left| f(t) -f(x_0)\right| dt\right] \phi _{\sigma }(nx_{0}-k). \end{aligned}$$

(3)

Let now $\varepsilon >0$ be arbitrarily chosen. By the continuity of f at $x_0$, there exists $\gamma >0$ such that $|f(t)-f(x_0)|<\varepsilon $ for every $t\in [a,b]$ with $|t-x_0|\le \gamma $. At this point, we proceed to define

$$\begin{aligned} S_1:=\{k:\;na\le k\le nb-1 \text { and } |k/n -x_0|\le \gamma /2\}, \end{aligned}$$

and

$$\begin{aligned} S_2:=\{k:\;na\le k\le nb-1 \text { and } |k/n -x_0|>\gamma /2\}. \end{aligned}$$

Hence, we can write

$$\begin{aligned} \begin{aligned}&\left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \\&\quad \le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)} \Biggl \{\sum _{k\in S_1}n\left[ \int _{a}^{b}\chi (nt-k)\left| f(t) -f(x_0)\right| dt\right] \phi _{\sigma }(nx_{0}-k)\\&\quad +\sum _{k\in S_2}n\left[ \int _{a}^{b}\chi (nt-k)\left| f(t) -f(x_0)\right| dt\right] \phi _{\sigma }(nx_{0}-k)\Biggr \}\\&\quad =:\frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)} \left( I_1+I_2\right) . \end{aligned} \end{aligned}$$

The first term can be further divided into

$$\begin{aligned} \begin{aligned} I_1&=\sum _{k\in S_1}n\left[ \Biggl \{\int _{|t-k/n|<\gamma /2}+\int _{|t-k/n|\ge \gamma /2}\Biggr \}\chi (nt-k)\left| f(t) -f(x_0)\right| dt\right] \phi _{\sigma }(nx_{0}-k)\\&=:I_{1,1}+I_{1,2}. \end{aligned} \end{aligned}$$

For $t\in [a,b]$, such that $\left| t-k/n\right| <\gamma /2$, and $k\in S_{1}$, we have

$$\begin{aligned} |t-x_0|\le \left| t-\frac{k}{n}\right| +\left| \frac{k}{n}-x_0\right| <\frac{\gamma }{2}+\frac{\gamma }{2}=\gamma , \end{aligned}$$

therefore, it follows that $\left| f(t)-f(x_{0})\right| <\varepsilon $. This implies

$$\begin{aligned} \begin{aligned} I_{1,1}&<\varepsilon \sum _{k\in S_1}\phi _{\sigma }(nx_{0}-k)\;n\int _{|t-k/n|<\gamma /2}\chi (nt-k)dt\\&\le \varepsilon \sum _{k\in {\mathbb {Z}}}\phi _{\sigma }(nx_{0}-k)\int _{{\mathbb {R}}}\chi (u)du\\&=\varepsilon \left\| \chi \right\| _{1}, \end{aligned} \end{aligned}$$

where, in the above computations, we used the change of variable $nt-k=u$, and also condition (i) of Lemma 1.

Let us now estimate $I_{1,2}$. By the boundedness of f, it follows that

$$\begin{aligned} I_{1,2}\le 2\left\| f\right\| _{\infty }\sum _{k\in S_1} n\left[ \int _{|nt-k|\ge \frac{n\gamma }{2}}\chi (nt-k)dt\right] \phi _{\sigma }(nx_0-k), \end{aligned}$$

with

$$\begin{aligned} n\int _{|nt-k|\ge \frac{n\gamma }{2}}\chi (nt-k)dt=\int _{|y|\ge \frac{n\gamma }{2}}\chi (y)dy\rightarrow 0, \text { as } n\rightarrow +\infty , \end{aligned}$$

since $\chi \in L^{1}({\mathbb {R}})$. Then, using again i) of Lemma 1, $I_{1,2}\le 2\left\| f\right\| _{\infty }\varepsilon $, for sufficiently large n. Through analogous reasonings, we obtain the following inequalities

$$\begin{aligned} I_{2}\le & {} 2\left\| f\right\| _{\infty }\sum _{k\in S_2}n\left[ \int _{a}^{b}\chi (nt-k)dt\right] \phi _{\sigma }(nx_{0}-k)\\\le & {} 2\left\| f\right\| _{\infty } \left\| \chi \right\| _{1} \sum _{k\in S_2} \phi _{\sigma }(nx_{0}-k) \\\le & {} 2\left\| f\right\| _{\infty } \left\| \chi \right\| _{1} \sum _{\left| nx_{0}-k\right| >\frac{n\gamma }{2}}\phi _{\sigma }(nx_{0}-k). \end{aligned}$$

From property v) of Lemma 1, there exists a sufficiently large n, such that $I_{2}\le 2\left\| f\right\| _{\infty } \left\| \chi \right\| _{1} \varepsilon $.

In conclusion, setting $M:=\left\| \chi \right\| _{1}+2\left\| f\right\| _{\infty }(1+\left\| \chi \right\| _{1})$, we have

$$\begin{aligned} \left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \le \frac{M\varepsilon }{{{{\mathcal {K}}}}\phi _{\sigma }(2)}, \end{aligned}$$

for sufficiently large n, and by the arbitrariness of $\varepsilon $, it follows that

$$\begin{aligned} \lim _{n\rightarrow \infty }\left( D_{n}^{\sigma ,\chi }f\right) (x_{0})=f(x_{0}), \end{aligned}$$

therefore the first part of the statement is proved.

Finally, if $f\in C([a,b])$ the second part of the theorem easily follows, as we can now use the parameter $\gamma $ of the uniform continuity of f (in place of that of the continuity of f) for any arbitrary $x\in [a,b]$. $\square $

5 Quantitative estimates

In this section, we aim to figure out how fast are the approximations achieved by the operator $D_{n}^{\sigma ,\chi }$.

To address this task, it is necessary to recall the notion of the modulus of continuity of a given function $f\in C([a,b])$, defined as usual

$$\begin{aligned} \omega (f,\delta ):=\max \{|f(x)-f(y)|:\; x,y\in [a,b],\; |x-y|\le \delta \}, \end{aligned}$$

with $\delta >0$. It is interesting to point out that the following well-known inequality

$$\begin{aligned} \omega (f,\lambda \delta )\le (\lambda +1)\,\omega (f,\delta ), \end{aligned}$$

(4)

holds, with $\lambda ,\delta >0$.

Let $x_{0}$ be a fixed point within this interval, for a sake of clarity, we rewrite the inequality obtained in (3)

$$\begin{aligned}{} & {} \left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \le \frac{1}{\mathcal{K}\,\phi _{\sigma }(2)}\\{} & {} \sum _{k=na}^{nb-1}n\left[ \int _{a}^{b}\chi (nt-k)\left| f(t) -f(x_0)\right| dt\right] \phi _{\sigma }(nx_{0}-k). \end{aligned}$$

One can easily check that

$$\begin{aligned} \begin{aligned} \left| f(t)-f(x_{0})\right|&\le \omega \left( f,\left| t-x_{0}\right| \right) =\omega \left( f, \frac{\gamma (n)}{\gamma (n)}\cdot \left| t-x_{0}\right| \right) \\&\le (1+\gamma (n)|t-x_0|)\,\omega \left( f,\frac{1}{\gamma (n)}\right) , \end{aligned} \end{aligned}$$

where the previous estimate is a consequence of (4), with $\lambda =\gamma (n)|t-x_0|$ and $\delta =\frac{1}{\gamma (n)}$. Here, in accordance with the insights from [14], the function $\gamma (n)$ is defined as

$$\begin{aligned} \gamma (n):={\left\{ \begin{array}{ll} n^{\alpha -1}, &{} \text { if } \alpha \in (1,2), \\ \displaystyle \frac{n}{\ln n}, &{} \text { if } \alpha =2, \\ n, &{} \text { if } \alpha >2. \end{array}\right. } \end{aligned}$$

Hence, it follows

$$\begin{aligned} \begin{aligned}&\left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \\&\quad \le \frac{\omega \left( f,\frac{1}{\gamma (n)}\right) }{\mathcal{K}\,\phi _{\sigma }(2)}\sum _{k=na}^{nb-1}n\left[ \int _{a}^{b}\chi (nt-k)\,(1+\gamma (n)|t-x_0|)dt\right] \phi _{\sigma }(nx_{0}-k). \end{aligned} \end{aligned}$$

Through the change of variable $nt-k=u$, we have $\displaystyle \int _{a}^{b}\chi (nt-k)dt\le \frac{\left\| \chi \right\| _{1}}{n}$. Therefore, we can write

$$\begin{aligned} \begin{aligned}&n \int _{a}^{b}\chi (nt-k)\,(1+\gamma (n)|t-x_0|)dt\le \left\| \chi \right\| _{1}+n\,\gamma (n)\int _{a}^{b}\chi (nt-k)\,|t-x_0|dt\\&\le \left\| \chi \right\| _{1}+\gamma (n)\,n\int _{a}^{b}\chi (nt-k)\,\left| t-\frac{k}{n}\right| dt\, +\gamma (n)\,n\int _{a}^{b}\chi (nt-k)\,\left| \frac{k}{n}-x_0\right| dt\\&=\left\| \chi \right\| _{1}+\frac{\gamma (n)}{n}\,n\int _{a}^{b}\chi (nt-k)\,\left| nt-k\right| dt\, +\gamma (n)\,n\int _{a}^{b}\chi (nt-k)\,\left| \frac{k}{n}-x_0\right| dt\\&\le \left\| \chi \right\| _{1}+\frac{\gamma (n)}{n}{\tilde{M}}_{1}(\chi )+\frac{\gamma (n)}{n}\left\| \chi \right\| _{1}|k-nx_0|, \end{aligned} \end{aligned}$$

assuming that ${\tilde{M}}_{1}(\chi )<+\infty $. Consequently, by i) of Lemma 1, we get

$$\begin{aligned} \begin{aligned}&\left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \\&\le \frac{\omega \left( f,\frac{1}{\gamma (n)}\right) }{\mathcal{K}\,\phi _{\sigma }(2)}\left( \left\| \chi \right\| _{1}+\frac{\gamma (n)}{n}{\tilde{M}}_{1}(\chi )+\frac{\gamma (n)}{n} \left\| \chi \right\| _{1}\sum _{k=na}^{nb-1}|k-nx_0|\phi _{\sigma }(nx_{0}-k)\right) \\&=\frac{\omega \left( f,\frac{1}{\gamma (n)}\right) }{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \left\| \chi \right\| _{1}+\frac{\gamma (n)}{n}{\tilde{M}}_{1}(\chi )+\frac{\gamma (n)}{n} \left\| \chi \right\| _{1}M_{na}^{nb-1}(\phi _{\sigma },1,x_0)\right) , \end{aligned} \end{aligned}$$

where the truncated absolute moment of order 1, under the assumption on $\sigma $ assumed in Lemma 3 and in Lemma 4 can be bounded by an increasing function on the variable n. Thus, for $f\in C([a,b])$, we can immediately deduce the following estimates

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty } \le C\,\omega \left( f,\,\frac{1}{n^{\alpha -1}}\right) ,\quad&\text { if }\alpha \in \left( 1,2\right) \text {,} \end{aligned}$$

(5)

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty } \le C\,\omega \left( f,\,\frac{\ln n}{n}\right) ,\quad&\text { if }\alpha =2\text {,} \end{aligned}$$

(6)

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty } \le C\,\omega \left( f,\,\frac{1}{n}\right) ,\quad&\text { if }\alpha >2, \end{aligned}$$

(7)

where each constant C appearing in these estimates is independent on n and f.

Now, we proceed to prove that these estimates are the best possible.

We focus only on the case $\alpha =2$, employing an approach inspired by the recent paper [13]. The remaining cases follow by similar methodologies, and therefore, we omit discussing them here.

Throughout this section, the notation $f(x) \sim g(x)$ as $x\rightarrow a$, means that $f(x)={\mathcal {O}}(g(x))$ as $x\rightarrow a$, and $g(x)={\mathcal {O}}(f(x))$ as $x\rightarrow a$.

Now, if we define

$$\begin{aligned} \sigma (x):= {\left\{ \begin{array}{ll} -\displaystyle \frac{1}{8x}, &{}\quad x\le -1/2, \\ \displaystyle \frac{1}{2}x+\frac{1}{2}, &{}\quad -1/2<x<1/2, \\ \displaystyle 1-\frac{1}{8x}, &{}\quad x\ge 1/2, \end{array}\right. } \end{aligned}$$

it is easy to see that the corresponding $\phi _{\sigma }$ satisfies

$$\begin{aligned} \phi _{\sigma }(x)\,=\,{\frac{1}{8\,(x^{2}-1)}}\,\ge \,{\frac{1}{8\,x^{2}}},\quad |x|\,\ge \,\frac{3}{2}. \end{aligned}$$

Then, consider the function $f:[-1,1]\rightarrow \mathbb {R}$, $f(x)=\left| x\right| $. It is immediate that, for $\delta >0$ sufficiently small, we have

$$\begin{aligned} \omega \left( f,\,\delta \right) =\delta . \end{aligned}$$

(8)

Thus, recalling that $\phi _{\sigma }$ is even, we can write

$$\begin{aligned} \begin{aligned}&\left| \left( D_{n}^{\sigma ,\chi }f\right) (0)-f\left( 0\right) \right| \\&\quad =\left( D_{n}^{\sigma ,\chi }f\right) (0)=\frac{\displaystyle \sum \nolimits _{k=-n}^{n-1}n \left[ \int _{-1}^{1}\chi (nt-k)f(t)dt\right] \phi _{\sigma }(k)}{\displaystyle \sum \nolimits _{k=-n}^{n-1} \left[ \int _{-n-k}^{n-k}\chi (y)dy\right] \phi _{\sigma }(k)}\\&\quad \ge \frac{1}{\left\| \chi \right\| _{1}}\frac{ \displaystyle \sum \nolimits _{k=-n}^{n-1}n \left[ \int _{-1}^{1}\chi (nt-k)f(t)dt\right] \phi _{\sigma }(k)}{\displaystyle \sum \nolimits _{k=-n}^{n-1} \phi _{\sigma }(k)} \\&\quad \ge \frac{1}{\left\| \chi \right\| _{1}}\sum _{k=2}^{n-1}n\left[ \int _{-1}^{1}\chi (nt-k)f(t)dt\right] \phi _{\sigma }(k)\\&\quad = \frac{1}{\left\| \chi \right\| _{1}}\sum _{k=2}^{n-1}\left[ \int _{-n-k}^{n-k}\chi (y)\cdot \frac{|y+k|}{n}dy\right] \phi _{\sigma }(k)\\&\quad \ge \frac{1}{\left\| \chi \right\| _{1}}\sum _{k=2}^{n-1}\left[ \int _{0}^{1}\chi (y)\cdot \frac{|y+k|}{n}dy\right] \phi _{\sigma }(k)\\&\quad \ge \frac{1}{\left\| \chi \right\| _{1}}\sum _{k=2}^{n-1}\frac{k}{n}\left[ \int _{0}^{1}\chi (y)dy \right] \phi _{\sigma }(k)=\frac{\mathcal{K}}{\left\| \chi \right\| _{1}}\sum _{k=2}^{n-1}\frac{k}{n}\cdot \phi _{\sigma }(k)\\&\quad \ge \frac{{{{\mathcal {K}}}}}{8n\left\| \chi \right\| _{1}} \sum _{k=2}^{n-1}\frac{1}{k}. \end{aligned} \end{aligned}$$

Noting that

$$\begin{aligned} \sum _{k=2}^{n-1}\frac{1}{k}\,\ge \,\log n -\log 2\,= \,\log {\frac{n}{2}}\,\ge \,\frac{\log n}{2}, \end{aligned}$$

for sufficiently large $n\in \mathbb {N}^{+}$, and referring to (8), it can be deduced that

$$\begin{aligned} \left| \left( D_{n}^{\sigma ,\chi }f\right) (0)-f\left( 0\right) \right| \ge \frac{\mathcal{K}}{16\left\| \chi \right\| _{1}}\cdot \frac{\log n}{n}=\frac{\mathcal{K}}{16\left\| \chi \right\| _{1}}\omega \left( f,\,\frac{\log n}{n}\right) , \end{aligned}$$

and, hence

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty } \ge \frac{{{{\mathcal {K}}}}}{16\left\| \chi \right\| _{1}}\omega \left( f,\,\frac{\log n}{n}\right) , \end{aligned}$$

for sufficiently large $n\in \mathbb {N}^{+}$. It could be possible to generalize this estimation for the case of an arbitrary sigmoidal function $\sigma $ such that $\phi _{\sigma }(x)\sim |x|^{-2}$, as $x\rightarrow -\infty $, see, e.g., [13].

In conclusion, estimation (6) is the best possible in the case $\alpha =2$.

By similar reasonings, we deduce that the estimations (5) and (7) are the best possible, in the cases $\alpha \in \left( 1,2\right) $ and $\alpha >2$, respectively.

Therefore, we can summarize such properties in the following theorem.

Theorem 10

Consider a sigmoidal function $\sigma :{\mathbb {R}}\rightarrow {\mathbb {R}}$ and $\alpha > 1$, such that $\phi _{\sigma }(x)\sim \left| x\right| ^{-\alpha }$ as $x\rightarrow -\infty $, and let $\chi :\mathbb {R}\rightarrow [0,\infty )$ be a bounded and $L^{1}$-integrable function on $\mathbb {R}$. The Durrmeyer-type neural network operators, defined in (2), have the following properties:

(i)
if $\alpha \in (1,2)$, ${\mathcal {O}}\left( \omega \left( f,\,\frac{1}{n^{\alpha -1}}\right) \right) $, as $n\rightarrow +\infty $, is the best order of uniform approximation over the class C([a, b]) in the approximation by the operators $D_{n}^{\sigma ,\chi }$;
(ii)
if $\alpha =2$, ${\mathcal {O}}\left( \omega \left( f,\,\frac{\log n}{n}\right) \right) $, as $n\rightarrow +\infty $, is the best order of uniform approximation over the class C([a, b]) in the approximation by the operators $D_{n}^{\sigma ,\chi }$;
(iii)
if $\alpha >2$, ${\mathcal {O}}\left( \omega \left( f,\,\frac{1}{n}\right) \right) $, as $n\rightarrow +\infty $, is the best order of uniform approximation over the class C([a, b]) in the approximation by the operators $D_{n}^{\sigma ,\chi }$.

6 Convergence in Lebesgue spaces

Now, in order to obtain approximation results for not necessarily continuous functions, we work in the setting of Lebesgue spaces $L^{p}([a,b])$, $1\le p<+\infty $. Therefore, we want to exploit the integral form of the operator $D_{n}^{\sigma ,\chi }$ to show that it also converges with respect to the $L^{p}$-norm.

As an easy consequence of Theorem 9, we can prove the following.

Theorem 11

For every $f\in C([a,b])$ and $1\le p<+\infty $, we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\left\| D_{n}^{\sigma ,\chi }f -f\right\| _{p}=0, \end{aligned}$$

where $\left\| \cdot \right\| _{p}$ denotes the usual $L^{p}$-norm.

Proof

We immediately have

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{p}=\left( \int _{a}^{b}\left| \left( D_{n}^{\sigma ,\chi }f\right) (x)-f(x)\right| ^{p}dx\right) ^{1/p}\le \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty }\cdot (b-a)^{1/p}, \end{aligned}$$

from which we get the thesis, if $n\in \mathbb {N}^{+}$ is sufficiently large, thanks to Theorem 9. $\square $

Subsequently, to prove the convergence of the family of Durrmeyer-type neural network operators in $L^p$, it is necessary to establish the following inequality.

Theorem 12

For every $f\in L^{p}(\left[ a,b\right] ) $, $1\le p<+\infty $, we have

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f \right\| _{p}\le \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left\| f\right\| _{p}. \end{aligned}$$

Proof

For every $f\in L^{p}(\left[ a,b\right] ) $, $1\le p<+\infty $, taking into account the convexity of the function $x\rightarrow \left| x\right| ^{p}$, and applying the Jensen inequality, we obtain

$$\begin{aligned} \begin{aligned}&\left\| D_{n}^{\sigma ,\chi }f \right\| _{p} =\left( \int _{a}^{b}\left| \left( D_{n}^{\sigma ,\chi }f\right) (x) \right| ^{p}dx\right) ^{1/p} \\&=\left( \int _{a}^{b}\left| \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1}\left[ n \int _{a}^{b}\chi (nt-k)f(t) dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1}\left[ n \int _{a}^{b}\chi (nt-k)dt\right] \phi _{\sigma }(nx-k)}\right| ^{p}dx\right) ^{1/p} \\&\le \frac{1}{{{\mathcal {K}}}}\left( \int _{a}^{b}\left| \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1}\left[ n \int _{a}^{b}\chi (nt-k)f(t) dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1}\phi _{\sigma }(nx-k)}\right| ^{p}dx\right) ^{1/p}\\&\le \frac{1}{{{\mathcal {K}}}}\left( \int _{a}^{b} \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1}\left| n \int _{a}^{b}\chi (nt-k)f(t) dt \right| ^{p} \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1}\phi _{\sigma }(nx-k)} dx\right) ^{1/p} \end{aligned} \\ \begin{aligned}&\le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \int _{a}^{b} \sum _{k=na}^{nb-1}\left| n\int _{a}^{b}\chi (nt-k)f(t)dt\right| ^{p}\phi _{\sigma }(nx-k)dx\right) ^{1/p}\\&=\frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \sum _{k=na}^{nb-1}\int _{a}^{b}\phi _{\sigma }(nx-k)dx\left| n\int _{a}^{b}\chi (nt-k)f(t)dt\right| ^{p}\right) ^{1/p}. \end{aligned} \end{aligned}$$

One can easily verify that

$$\begin{aligned} \int _{a}^{b}\phi _{\sigma }(nx-k)dx\le \frac{1}{n}\int _{ \mathbb {R}}\phi _{\sigma }(t)dt=\frac{1}{n}, \end{aligned}$$

where the equality arises from (iv) of Lemma 1, which asserts that $\displaystyle \int _{\mathbb {R}}\phi _{\sigma }(t)dt=1$. Taking this into consideration, we obtain

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f\right\| _{p}\le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \frac{1}{n} \sum _{k=na}^{nb-1}\left| n\int _{a}^{b}\chi (nt-k)f(t) dt\right| ^{p}\right) ^{1/p}. \end{aligned}$$

Subsequently applying Jensen inequality once again, we deduce that

$$\begin{aligned} \begin{aligned} \left\| D_{n}^{\sigma ,\chi }f\right\| _{p}&= \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \frac{1}{n}\sum _{k=na}^{nb-1}\left| \frac{n\int _{a}^{b}\chi (nt-k)dt}{n\int _{a}^{b}\chi (nt-k)dt} \cdot n\int _{a}^{b}\chi (nt-k)f(t) dt\right| ^{p}\right) ^{1/p}\\&\le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \frac{1}{n} \sum _{k=na}^{nb-1}\left( n\int _{a}^{b}\chi (nt-k)dt\right) ^{p-1} \cdot n\int _{a}^{b}\chi (nt-k)|f(t)|^{p} dt\right) ^{1/p}\\&\le \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \int _{a}^{b}\left[ \sum _{k=na}^{nb-1}\chi (nt-k)\right] |f(t)|^{p} dt\right) ^{1/p}\\&\le \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \int _{a}^{b}|f(t)|^{p} dt\right) ^{1/p}\\&=\frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left\| f\right\| _{p}, \end{aligned} \end{aligned}$$

and now the proof is complete. $\square $

From Theorem 12, we deduce that $D_{n}^{\sigma ,\chi }$ map the whole space $L^{p}([a,b])$ into itself, and that $D_{n}^{\sigma ,\chi }$ are well-defined in $L^{p}([a,b])$.

Finally, by exploiting the density of C([a, b]) within $L^{p}(\left[ a,b\right] ) $, $1\le p<+\infty $, with respect to the norm $\left\| \cdot \right\| _{p}$, we achieve the following convergence result.

Theorem 13

For every $f\in L^{p}(\left[ a,b\right] ) $, $1\le p<+\infty $, we have

$$\begin{aligned} \lim _{n\rightarrow +\infty }\left\| D_{n}^{\sigma ,\chi }f -f\right\| _{p}=0. \end{aligned}$$

Proof

Let $f \in L^{p}([a,b])$ and $\varepsilon > 0$ be fixed. Since the space C([a, b]) is dense in $L^{p}([a,b])$ with respect to the norm $\Vert \cdot \Vert _p$, there exists $g \in C([a,b])$ such that $\Vert f - g\Vert _p < \left( \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}+1\right) ^{-1}\varepsilon /2$.

Then, using such a g and Theorem 12,

$$\begin{aligned} \begin{aligned} \Vert D_{n}^{\sigma ,\chi }f - f\Vert _p&\le \Vert D_{n}^{\sigma ,\chi }f - D_{n}^{\sigma ,\chi }g\Vert _p + \Vert D_{n}^{\sigma ,\chi }g - g\Vert _p + \Vert g - f\Vert _p \\&\le \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left\| f-g\right\| _{p}+ \Vert D_{n}^{\sigma ,\chi }g - g\Vert _p + \Vert g - f\Vert _p\\&=\left( \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}+1\right) \left\| f-g\right\| _{p}+ \Vert D_{n}^{\sigma ,\chi }g - g\Vert _p\\&<\frac{\varepsilon }{2}+ \Vert D_{n}^{\sigma ,\chi }g - g\Vert _p. \end{aligned} \end{aligned}$$

Finally, by Theorem 11,

$$\begin{aligned} \Vert D_{n}^{\sigma ,\chi }f - f\Vert _p<\frac{\varepsilon }{2}+\frac{\varepsilon }{2}=\varepsilon , \end{aligned}$$

for $n \in \mathbb {N}^+$ sufficiently large. Being $\varepsilon $ arbitrary, the proof follows. $\square $

Remark 14

Examples of sigmoidal functions satisfying all the assumption of the above theory can be easily provided. For instance, we can mention the well-known logistic function $\sigma _{l}(x)=(1+e^{-x})^{-1}$, $x\in {\mathbb {R}}$, and the hyperbolic tangent sigmoidal function $\sigma _{h}(x)=\frac{1}{2}(\tanh x+1)$, $x\in {\mathbb {R}}$ (see, e.g., [4, 7, 30]). An example of a non-smooth sigmoidal function can be provided by the ramp function $\sigma _{R}(x)$ [8, 33, 34], defined by

$$\begin{aligned} \sigma _{R}(x)={\left\{ \begin{array}{ll} 0, &{} \text { if } x<-\frac{1}{2}, \\ x+\frac{1}{2}, &{} \text { if } -\frac{1}{2}\le x\le \frac{1}{2}, \\ 1, &{} \text { if } x>\frac{1}{2}, \end{array}\right. } \end{aligned}$$

for which the corresponding function $\phi _{\sigma _{R}}$ has compact support.

Additionally, since it is well-known that in the last years the so-called rectified linear unit (ReLU) activation function received a lot of attention due to their useful peculiarities in the application of training algorithms, it is natural to ask if the previous results can be extended also to that case. In general, such a question can be non-trivial, since the ReLU function is not a sigmoidal one.

We recall that, the ReLU function is defined as $\psi _{\text {ReLU}}(x)=(x)_{+}$, $x\in {\mathbb {R}}$ (see, e.g., [12]), where the function $(x)_{+}:= \max \{x, 0\}$ denotes the positive part of $x\in {\mathbb {R}}$.

Similarly, also powers of ReLU functions, known as rectified power unit (RePUs, [28]) functions $\psi ^{k}_{\text {ReLU}}(x)=[(x)_{+}]^{k}$, $x\in {\mathbb {R}}$, are of interest in the theory of neural networks, for reasons similar to those written above.

However, it is not difficult to see that certain density functions $\phi _{\sigma }$ generated by suitable $\sigma $ can be viewed as finite linear combination of ReLu or RePUs, respectively (see [14]).

Finally, to conclude the paper, it is important to provide examples of functions $\chi $ that can be used in the definition of the Durrmeyer-type NN operators. This is quite easy since the required assumptions on $\chi $ are very soft. For instance, we can consider as $\chi $ the well-known central B-spline kernel of order n, expressed as

$$\begin{aligned} M_{n}(x):=\frac{1}{(n-1)!}\sum _{i=0}^{n}(-1)^{i}\left( {\begin{array}{c}n\\ i\end{array}}\right) \left( \frac{n}{2} +x-i \right) _{+}^{n-1}. \end{aligned}$$

Additionally, the Féjer kernel provides another suitable choice

$$\begin{aligned} F(x):=\frac{1}{2} \text {sinc}^{2}\left( \frac{x}{2}\right) ,\;\;\;x\in \mathbb {R}, \end{aligned}$$

with the sinc-function defined as

$$\begin{aligned} \text {sinc}(x):={\left\{ \begin{array}{ll} \frac{\sin (\pi x)}{\pi x} &{} \text { } x\in \mathbb {R}\setminus \{0\}, \\ 1, &{} \text { } x=0. \end{array}\right. } \end{aligned}$$

Further options can be explored in references such as [1, 2, 17, 18, 21].

7 Final remarks and conclusions

In this paper, the theory of the Durrmeyer-type NN operators is introduced and studied, in case of the approximation of functions of one variable. It is well-known that NN-type approximations typically involve multivariate data; hence, as a future work we aim to extend the above definition and results to the multivariate setting, following the same approach considered in [20].

Data availability

Not applicable.

References

Allasia, G., Cavoretto, R., De Rossi, A.: A class of spline functions for landmark-based image registration. Math. Methods Appl. Sci. 35, 923–934 (2012)
Article MathSciNet Google Scholar
Allasia, G., Cavoretto, R., De Rossi, A.: Lobachevsky spline functions and interpolation to scattered data. Comput. Appl. Math. 32, 71–87 (2013)
Article MathSciNet Google Scholar
Anastassiou, G.A.: Rate of convergence of some neural network operators to the unit-univariate case. J. Math. Anal. Appl. 212, 237–262 (1997)
Article MathSciNet Google Scholar
Anastassiou, G.A.: Intelligent Systems: Approximation by Artificial Neural Networks, Intelligent Systems Reference Library, vol. 19. Springer, Berlin (2011)
Google Scholar
Bardaro, C., Faina, L., Mantellini, I.: Quantitative Voronovskaja formulae for generalized Durrmeyer sampling type series. Math. Nachr. 289(14–15), 1702–1720 (2016)
Article MathSciNet Google Scholar
Bardaro, C., Mantellini, I.: Asymptotic expansion of generalized Durrmeyer sampling type series. Jean J. Approx. 6(2), 143–165 (2014)
MathSciNet Google Scholar
Baxhaku, B., Agrawal, P.N.: Neural network operators with hyperbolic tangent functions. Expert Syst. Appl. 226(15), 119996 (2023)
Article Google Scholar
Cao, F., Chen, Z.: The construction and approximation of a class of neural networks operators with ramp functions. J. Comput. Anal. Appl. 14(1), 101–112 (2012)
MathSciNet Google Scholar
Cao, F., Xie, T., Xu, Z.: The estimate for approximation error of neural networks: a constructive approach. Neurocomputing 71(4), 626–630 (2008)
Article Google Scholar
Cao, F., Zhang, R.: The errors of approximation for feed forward neural networks in the Lp metric. Math. Comput. Model. 49(7), 1563–1572 (2009)
Article Google Scholar
Cardaliaguet, P., Euvrard, G.: Approximation of a function and its derivative with a neural network. Neural Netw. 5(2), 207–220 (1992)
Article Google Scholar
Chen, H., Yu, D., Li, Z.: The construction and approximation of ReLU neural network operators. J. Funct. Spaces (2022)
Coroianu, L., Costarelli, D.: Best approximation and inverse results for neural network operators, in print in: Results in Mathematics (2024)
Coroianu, L., Costarelli, D., Kadak, U.: Quantitative estimates for neural network operators implied by the asymptotic behaviour of the sigmoidal activation functions. Mediterr. J. Math. 19, 211 (2022)
Article MathSciNet Google Scholar
Coroianu, L., Kadak, U.: Integrating multivariate fuzzy neural networks into fuzzy inference system for enhanced decision making. Fuzzy Sets Syst. 470, 108668 (2023)
Article MathSciNet Google Scholar
Costarelli, D., Natale, M., Vinti, G.: Convergence results for nonlinear sampling Kantorovich operators in modular spaces. Numer. Funct. Anal. Optim. 44(12), 1276–1299 (2023)
Article MathSciNet Google Scholar
Costarelli, D., Piconi, M., Vinti, G.: Quantitative estimates for Durrmeyer-sampling series in Orlicz spaces. Sampling Th. Signal Proc. Data Anal. 21(1), 3 (2023)
Article MathSciNet Google Scholar
Costarelli, D., Sambucini, A.R., Vinti, G.: Convergence in Orlicz spaces by means of the multivariate max-product neural network operators of the Kantorovich type and applications. Neural Comput. Appl. 31, 5069–5078 (2019)
Article Google Scholar
Costarelli, D., Spigler, R.: Approximation results for neural network operators activated by sigmoidal functions. Neural Netw. 44, 101–106 (2013)
Article Google Scholar
Costarelli, D., Spigler, R.: Convergence of a family of neural network operators of the Kantorovich type. J. Approx. Theory 185, 80–90 (2014)
Article MathSciNet Google Scholar
Cruz-Uribe, D., Hasto, P.: Extrapolation and interpolation in generalized Orlicz spaces. Trans. Am. Math. Soc. 370, 4323–4349 (2018)
Article MathSciNet Google Scholar
Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2, 303–314 (1989)
Article MathSciNet Google Scholar
Derriennic, M.M.: Sur l’approximation de fonctions integrables sur $[0,1]$ par des plynomes de Berstein modifies. J. Approx. Theory 31(4), 325–343 (1981)
Article MathSciNet Google Scholar
Durrmeyer, J.L.: Une formule d’inversion de la transformée de Laplace: applications à la théorie des moments, Thése de 3e cycle, Université de Paris (1967)
Gonska, H., Heilmann, M., Rasa, I.: Convergence of iterates of genuine and ultraspherical Durrmeyer operators to the limiting semigroup: $c^2$-estimates. J. Approx. Theory 160(1–2), 243–255 (2009)
Article MathSciNet Google Scholar
Gonska, H., Kacso, D., Rasa, I.: The genuine Bernstein–Durrmeyer operators revisited. Results Math. 62(3–4), 295–310 (2012)
Article MathSciNet Google Scholar
Gonska, H., Zhou, X.: A global inverse theorem on simultaneous approximation by Berstein–Durrmeyer operators. J. Approx. Theory 67, 284–302 (1991)
Article MathSciNet Google Scholar
Gühring, I., Raslan, M.: Approximation rates for neural networks with encodable weights in smoothness spaces. Neural Netw. 134, 107–130 (2021)
Article Google Scholar
Heilmann, M., Rasa, I.: A nice representation for a link between Baskakov- and Szász–Mirakjan–Durrmeyer operators and their Kantorovich variants. Results Math. 74, 1–12 (2019)
Article Google Scholar
Kadak, U.: Multivariate neural network interpolation operators. J. Comput. Appl. Math. 414, 114426 (2022)
Article MathSciNet Google Scholar
Kainen, P.C., Kurková, V., Vogt, A.: Approximative compactness of linear combinations of characteristic functions. J. Approx. Theory 257, 105435 (2020)
Article MathSciNet Google Scholar
Kurková, V., Sanguineti, M.: Classification by sparse neural networks. IEEE Trans. Neural Netw. Learn. Syst. 30(9), 2746–2754 (2019)
Article MathSciNet Google Scholar
Qian, Y., Yu, D.: Rates of approximation by neural network interpolation operators. Appl. Math. Comput. 418, 126781 (2022)
Article MathSciNet Google Scholar
Qian, Y., Yu, D.: Neural network interpolation operators activated by smooth ramp functions. Anal. Appl. 20(4), 791–813 (2022)
Article MathSciNet Google Scholar
Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)
Article Google Scholar
Sharma, M., Singh, U.: Some density results by deep Kantorovich type neural network operators. J. Math. Anal. Appl. 533(2), 128009 (2024)
Article MathSciNet Google Scholar
Zhou, D.X.: Universality of deep convolutional neural networks. Appl. Comput. Harmonic Anal. 48(2), 787–794 (2020)
Article MathSciNet Google Scholar
Zoppoli, R., Sanguineti, M., Gnecco, G., Parisini, T.: Neural Approximations for Optimal Control and Decision, Communications and Control Engineering book series CCE. Springer, Cham (2020)
Google Scholar

Download references

Acknowledgements

The second and the third authors are members of the Gruppo Nazionale per l’Analisi Matematica, la Probabilità e le loro Applicazioni (GNAMPA) of the Istituto Nazionale di Alta Matematica (INdAM), of the network RITA (Research ITalian network on Approximation), and of the UMI (Unione Matematica Italiana) group T.A.A. (Teoria dell’Approssimazione e Applicazioni).

Funding

The authors D. Costarelli and M. Natale have been partially supported within the 2024 GNAMPA-INdAM Project “Tecniche di approssimazione in spazi funzionali con applicazioni a problemi di diffusione", CUP E53C23001670001. Moreover, D. Costarelli has been partially supported within the projects: (1) 2023 GNAMPA-INdAM Project “Approssimazione costruttiva e astratta mediante operatori di tipo sampling e loro applicazioni" CUP E53C22001930001, (2) “National Innovation Ecosystem grant ECS00000041 - VITALITY", funded by the European Union - NextGenerationEU under the Italian Ministry of University and Research (MUR), and (3) PRIN 2022 PNRR: “RETINA: REmote sensing daTa INversion with multivariate functional modeling for essential climAte variables characterization", funded by the European Union under the Italian National Recovery and Resilience Plan (NRRP) of NextGenerationEU, under the MUR (Project Code: P20229SH29, CUP: J53D23015950001).

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, University of Oradea, Universitatii 1, 410610, Oradea, Romania
Lucian Coroianu & Alexandra Pantiş
Department of Mathematics and Computer Science, University of Perugia, 1, Via Vanvitelli, 06123, Perugia, Italy
Danilo Costarelli & Mariarosaria Natale

Authors

Lucian Coroianu
View author publications
You can also search for this author in PubMed Google Scholar
Danilo Costarelli
View author publications
You can also search for this author in PubMed Google Scholar
Mariarosaria Natale
View author publications
You can also search for this author in PubMed Google Scholar
Alexandra Pantiş
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Danilo Costarelli.

Ethics declarations

Conflict of interest

The author declares that he has no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Coroianu, L., Costarelli, D., Natale, M. et al. The approximation capabilities of Durrmeyer-type neural network operators. J. Appl. Math. Comput. (2024). https://doi.org/10.1007/s12190-024-02146-9

Download citation

Received: 06 May 2024
Revised: 06 May 2024
Accepted: 25 May 2024
Published: 11 June 2024
DOI: https://doi.org/10.1007/s12190-024-02146-9

Keywords

Mathematics Subject Classification

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

The approximation capabilities of Durrmeyer-type neural network operators

Abstract

Similar content being viewed by others

Approximation by exponential sampling type neural network operators

Quantitative Estimates for Neural Network Operators Implied by the Asymptotic Behaviour of the Sigmoidal Activation Functions

Voronovskaja Type Theorems and High-Order Convergence Neural Network Operators with Sigmoidal Functions

1 Introduction

2 Preliminaries

Lemma 1

Remark 2

Lemma 3

Lemma 4

3 The Durrmeyer-type NN operators

Definition 5

Lemma 6

Proof

Remark 7

Remark 8

4 Pointwise and uniform convergence

Theorem 9

Proof

5 Quantitative estimates

Theorem 10

6 Convergence in Lebesgue spaces

Theorem 11

Proof

Theorem 12

Proof

Theorem 13

Proof

Remark 14

7 Final remarks and conclusions

Data availability

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation