1 Introduction

In recent years, the scientific community has shown rapidly growing interest in neural network (NN) operators, driven by their wide-ranging applications, in particular in numerical analysis, artificial intelligence, and machine learning [15, 30, 31, 32, 35, 37, 38].

Anastassiou [3] was the first to establish NN approximations for continuous functions. He provided estimates for the rate of convergence using NN operators of the Cardaliaguet-Euvrard and Squashing type [11], based on the modulus of continuity of the function being approximated, thereby deriving Jackson type inequalities. The same author continued the investigations on the rate of convergence for univariate and multivariate NN operators activated by the hyperbolic tangent and by the logistic sigmoidal functions (see, e.g., [4]). Ramp function activated NN operators were further examined in [8].

Afterward, in [19], the authors established a unified approach for a general class of sigmoidal functions and they introduced the following discrete NN operators

$$\begin{aligned} (F_nf)(x) := \frac{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }}f\left( \frac{k}{n}\right) \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }}\phi _{\sigma }(nx-k)}, \quad x\in [a,b], \end{aligned}$$
(I)

for sufficiently large \(n \in \mathbb {N}^{+}\), where \(\phi _\sigma \) is a suitable linear combination of sigmoidal activation functions \(\sigma : \mathbb {R} \rightarrow \mathbb {R}\), and \(f: [a,b] \rightarrow \mathbb {R}\) is a bounded function. In the above definition, \(\lfloor \cdot \rfloor \) and \(\lceil \cdot \rceil \) denote the integer part and the ceiling of a given number, respectively.
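As a rough numerical illustration of (I), the following sketch evaluates \(F_n f\) on \([0,1]\). It assumes the logistic function as \(\sigma \) and the combination \(\phi _{\sigma }(x)=\frac{1}{2}[\sigma (x+1)-\sigma (x-1)]\) recalled in the Preliminaries; the names `sigma`, `phi` and `nn_operator` are illustrative and not taken from the cited papers.

```python
import math

def sigma(x):
    """Logistic sigmoid (an assumed choice), evaluated in a numerically stable way."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def phi(x):
    """phi_sigma(x) = [sigma(x + 1) - sigma(x - 1)] / 2."""
    return 0.5 * (sigma(x + 1.0) - sigma(x - 1.0))

def nn_operator(n, f, x, a=0.0, b=1.0):
    """Evaluate (F_n f)(x) from (I) for x in [a, b]."""
    # k runs from ceil(n a) to floor(n b), both included.
    ks = range(math.ceil(n * a), math.floor(n * b) + 1)
    num = sum(f(k / n) * phi(n * x - k) for k in ks)
    den = sum(phi(n * x - k) for k in ks)
    return num / den

if __name__ == "__main__":
    # Error at x = 0.5 for f(t) = t^2 and increasing n.
    for n in (10, 100, 1000):
        print(n, abs(nn_operator(n, lambda t: t * t, 0.5) - 0.25))
```

Because the same weights appear in numerator and denominator, constants are reproduced exactly, which is the reason for the normalization in (I).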

In [20], replacing the sampling values \(f(k/n)\) in (I) with an average of f over an interval containing \(k/n\), Costarelli and Spigler constructed the NN-type operators in the Kantorovich form, as follows

$$\begin{aligned} (K_nf)(x) := \frac{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }-1}\left[ n\int _{k/n}^{(k+1)/n}f(u)du\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }-1}\phi _{\sigma }(nx-k)}, \quad x\in [a,b], \end{aligned}$$
(II)

for sufficiently large \( n \in \mathbb {N}^{+} \) such that \( \lceil na \rceil \le \lfloor {nb \rfloor } - 1 \). For more detailed insights into such operators, we refer to [13, 14, 36].

In practice, the passage from the operators \(F_n\) to \(K_n\) followed the same path as the evolution from the Bernstein polynomials to the Kantorovich polynomials.

Along the same lines, the natural next step is the introduction of a Durrmeyer version of the NN-type operators, just as a Durrmeyer version of the Bernstein polynomials has been defined, extensively studied, and extended in the literature (see, e.g., [5, 6, 17, 23,24,25,26,27, 29]).

In this paper, motivated by the above considerations, we introduce the Durrmeyer-type NN operators, as follows

$$\begin{aligned} \left( D_{n}^{\sigma ,\chi }f\right) (x):=\frac{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }-1} \left[ n\, \int _{a}^{b}\chi (nt-k)f(t)\, dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx-k)}, \quad x\in [a,b]. \end{aligned}$$
(III)

Here, the coefficients are obtained through a general convolution integral involving a kernel function \(\chi \) and the function being approximated, denoted as f. The operators in (III) represent a generalization of the Kantorovich NN operators, as we will show in Remark 8 below.

Here, we investigate the approximation capabilities of \(D_n^{\sigma ,\chi }\), establishing both pointwise and uniform convergence theorems for continuous functions (Sect. 4). We also provide quantitative estimates for the order of approximation in terms of the modulus of continuity \(\omega (f, \delta )\), \(\delta > 0\), of f; the order turns out to be strongly influenced by the asymptotic behaviour of the sigmoidal function \(\sigma \) (Sect. 5). Our study also shows that the estimates we provide are, under suitable assumptions, the best possible on the space of continuous functions on \([a,b]\). Finally, an \(L^p\) convergence theorem, \(1 \le p<+\infty \), is also established (Sect. 6). In order to prove this result, we exploit a norm inequality for the above operators and the density of the continuous functions in \(L^p\).

Finally, several examples of sigmoidal activation functions are presented (see, e.g., [9, 10, 33, 34]); we also show that the celebrated ReLU and RePU functions can be included in the results established by the present approach.

2 Preliminaries

Throughout the paper, we denote the integer part and the ceiling of a fixed real number by \(\lfloor {\cdot \rfloor }\) and \(\lceil {\cdot \rceil }\), respectively.

In line with the definition given in [22], we will assume that \(\sigma \) is a sigmoidal function, i.e., a non-decreasing measurable function such that

$$\begin{aligned} \lim _{x\rightarrow -\infty }\sigma (x)=0,\quad \quad \lim _{x\rightarrow +\infty }\sigma (x)=1. \end{aligned}$$

Let us consider a sigmoidal function \(\sigma :\mathbb {R}\rightarrow \mathbb {R}\) such that \(\sigma (x)-1/2\) is an odd function, \(\sigma \in C^{2}(\mathbb {R})\) is concave for \(x\ge 0\), and \(\sigma (x)={\mathcal {O}}\left( \left| x\right| ^{-\alpha }\right) \) as \(x\rightarrow -\infty \) for some \(\alpha >1\) (see, e.g., [19]).

Now, taking \(\phi _{\sigma }(x)=\frac{1}{2}\left[ \sigma (x+1)-\sigma (x-1)\right] \), \(x\in {\mathbb {R}}\), it is immediate that \(\phi _{\sigma }\) is an even function taking non-negative values, non-decreasing on \((-\infty ,0]\) and non-increasing on \([0,+\infty )\). In addition to that, \(\phi _{\sigma }\) satisfies a list of properties (see Lemmas 2.2, 2.4, 2.6 and 2.7 in [19] and Lemma 2.6 (I) in [20]), which can be summarized as follows.

Lemma 1

We have

  1. (i)

    \(\displaystyle \sum _{k\in \mathbb {Z}}\phi _{\sigma }(x-k)=1\), for all \(x\in \mathbb {R}\);

  2. (ii)

    \(1\ge \displaystyle \sum _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }}\phi _{\sigma }(nx-k)\ge \phi _{\sigma }(1)>0\), for every \(n\in \mathbb {N}^{+}\) such that \(\lceil {na \rceil } \le \lfloor {nb \rfloor } \), and \(x\in [a,b]\);

  3. (iii)

    \(\displaystyle \sum _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }-1}\phi _{\sigma }(nx-k)\ge \phi _{\sigma }(2)>0\), for every \(n\in \mathbb {N}^{+}\) such that \(\lceil {na \rceil } \le \lfloor {nb \rfloor } - 1\), and \(x\in [a,b];\)

  4. (iv)

    \(\phi _{\sigma }(x)={\mathcal {O}}\left( \left| x\right| ^{-\alpha }\right) \) as \(x\rightarrow \pm \infty \), and consequently \(\phi _{\sigma }\in L^{1}({\mathbb {R}})\); furthermore, we also have that \(\displaystyle \int _{{\mathbb {R}}}\phi _{\sigma }(t)dt=1;\)

  5. (v)

    for every \(\gamma >0\), we have \(\displaystyle \lim _{n\rightarrow +\infty }\sum _{\left| x-k\right| >\gamma n}\phi _{\sigma }(x-k)=0\), uniformly with respect to \(x\in \mathbb {R}\).
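Some properties in Lemma 1 can be checked numerically for a concrete activation. The sketch below assumes the logistic function as \(\sigma \) and tests the partition of unity in (i), the evenness of \(\phi _{\sigma }\), and the unit integral in (iv); the truncation ranges and the midpoint quadrature are illustrative choices, not part of the lemma.

```python
import math

def sigma(x):
    """Logistic sigmoid (an assumed choice), evaluated in a numerically stable way."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def phi(x):
    """phi_sigma(x) = [sigma(x + 1) - sigma(x - 1)] / 2."""
    return 0.5 * (sigma(x + 1.0) - sigma(x - 1.0))

def partition_sum(x, K=60):
    """Truncated version of sum_{k in Z} phi(x - k); the tails decay exponentially."""
    return sum(phi(x - k) for k in range(-K, K + 1))

def integral_phi(L=40.0, m=8000):
    """Midpoint-rule approximation of the integral of phi over [-L, L]."""
    h = 2.0 * L / m
    return h * sum(phi(-L + (j + 0.5) * h) for j in range(m))

if __name__ == "__main__":
    for x in (0.0, 0.37, -2.5):
        print(x, partition_sum(x))
    print("integral:", integral_phi())
```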

Remark 2

We observe that the theory of neural network operators also holds for sigmoidal activation functions \(\sigma \) which do not necessarily belong to \(C^{2}(\mathbb {R})\). In the latter case, we can assume that the corresponding \(\phi _{\sigma }\) satisfies the following conditions

  • \(\phi _{\sigma }(x)\) is non-decreasing for \(x<0\) and non-increasing for \(x\ge 0\);

  • \(\phi _{\sigma }(1)>0\).

For more details on this matter one can check, e.g., [19].

We recall the notion of the discrete absolute moment of order \(\nu \ge 0\) of \(\phi _{\sigma }\), i.e.,

$$\begin{aligned} M_{\nu }(\phi _{\sigma }):=\sup _{x\in {\mathbb {R}}}\sum _{k\in {\mathbb {Z}}} \phi _{\sigma }(x-k)|x-k|^{\nu }. \end{aligned}$$

Given the aforementioned conditions on \(\sigma \), it follows that \(M_{\nu }(\phi _{\sigma })<+\infty \) for \( 0\le \nu <\alpha -1\) (see, e.g., [16]). Now, we introduce the truncated discrete absolute moment of order \(\nu \ge 0\), as follows

$$\begin{aligned} M_{\lceil {na \rceil }}^{\lfloor {nb \rfloor }}(\phi _{\sigma }, \nu , x):=\sum _{k=\lceil {na \rceil }}^{\lfloor {nb \rfloor }}\phi _{\sigma }(nx-k)|nx-k|^{\nu }, \quad x\in [a,b],\quad n\in \mathbb {N}^{+}. \end{aligned}$$

This notion can also be used to express the sum in (ii) of Lemma 1, which can be reformulated as \(M_{\lceil {na \rceil }}^{\lfloor {nb \rfloor }}(\phi _{\sigma }, 0, x)\ge \phi _{\sigma }(1)>0\).
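The truncated moment is a finite sum and is easy to evaluate directly. The following sketch assumes the logistic function as \(\sigma \) and the interval \([0,1]\); the function name `truncated_moment` is illustrative. With \(\nu =0\) it can be used to observe the two-sided bound in (ii) of Lemma 1.

```python
import math

def sigma(x):
    """Logistic sigmoid (an assumed choice), evaluated in a numerically stable way."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def phi(x):
    """phi_sigma(x) = [sigma(x + 1) - sigma(x - 1)] / 2."""
    return 0.5 * (sigma(x + 1.0) - sigma(x - 1.0))

def truncated_moment(n, nu, x, a=0.0, b=1.0):
    """Sum over k = ceil(n a), ..., floor(n b) of phi(n x - k) |n x - k|^nu."""
    ks = range(math.ceil(n * a), math.floor(n * b) + 1)
    return sum(phi(n * x - k) * abs(n * x - k) ** nu for k in ks)

if __name__ == "__main__":
    n = 10
    for i in range(0, 11, 2):
        x = i / 10
        # Order 0 (the sum in Lemma 1 (ii)) and order 1 (used in Lemmas 3 and 4).
        print(x, truncated_moment(n, 0, x), truncated_moment(n, 1, x))
```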

Furthermore, we recall the following lemmas that will be useful in the next sections. These lemmas serve as a formal representation of results established in [14].

Lemma 3

Let \(\sigma \) be a sigmoidal function satisfying condition (iv) of Lemma 1 with \(1<\alpha <2\). Then, there exist constants \(H\ge 1\), \(K>0\) such that

$$\begin{aligned} M_{\lceil {na \rceil }}^{\lfloor {nb \rfloor }}(\phi _{\sigma }, 1, x)\le H + Kn^{2-\alpha }, \quad x\in [a,b], \end{aligned}$$

for \(n\in \mathbb {N}^{+}\).

Lemma 4

Let \(\sigma \) be a sigmoidal function satisfying condition (iv) of Lemma 1 with \(\alpha =2\). Then, there exist constants \(H\ge 1\), \(K>0\) such that

$$\begin{aligned} M_{\lceil {na \rceil }}^{\lfloor {nb \rfloor }}(\phi _{\sigma }, 1, x)\le H +K\ln n, \quad x\in [a,b], \end{aligned}$$

for \(n\in \mathbb {N}^{+}\) sufficiently large.

3 The Durrmeyer-type NN operators

From now on, we assume for simplicity that both a, b are integers, with \(a<b\).

Let \(\chi :\mathbb {R}\rightarrow [0,+\infty )\) be bounded and \(L^{1}\) -integrable on \(\mathbb {R}\), such that

$$\begin{aligned} \int _{0}^{1}\chi (u)\,du\ =:\ {{{\mathcal {K}}}}\ >0, \end{aligned}$$
(1)

and its discrete absolute moment of order 0 is finite, i.e.,

$$\begin{aligned} M_{0}(\chi ):=\sup _{u\in {\mathbb {R}}} \sum _{k\in {\mathbb {Z}}} \chi (u-k)<+\infty . \end{aligned}$$

Moreover, for any \(\nu \ge 0\), let us define the continuous absolute moment of order \(\nu \) as follows

$$\begin{aligned} {\tilde{M}}_{\nu }(\chi ):=\int _{{\mathbb {R}}}\chi (t)|t|^{\nu }dt. \end{aligned}$$

Now, we are able to introduce the following definition.

Definition 5

If \(f:[a,b]\rightarrow \mathbb {R}\) is bounded and \(L^{1}\)-integrable on \([a,b]\), the Durrmeyer-type neural network (NN) operator associated to f with respect to \(\phi _{\sigma }\) and \(\chi \) is defined by

$$\begin{aligned} \left( D_{n}^{\sigma ,\chi }f\right) (x):=\frac{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)f(t)\, dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx-k)}, \;\; x\in [a,b],\;\; n\in \mathbb {N}^{+}. \end{aligned}$$
(2)

In order to show that the above operators are well-defined, we have to prove the following lemma.

Lemma 6

Under the above assumptions, the following inequality holds

$$\begin{aligned} \sum _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx-k)\ \ge \ {{{\mathcal {K}}}}\, \phi _{\sigma }(2) >0, \end{aligned}$$

\(x \in [a,b]\), \(n \in \mathbb {N}^+\) such that \(n \ge 1/(b-a)\).

Proof

Using the change of variable \(y=nt-k\), we can write

$$\begin{aligned} \sum _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx-k)\ =\ \sum _{k=na}^{nb-1} \left[ \int _{na-k}^{nb-k}\chi (y)\, dy\right] \phi _{\sigma }(nx-k), \end{aligned}$$

\(x \in [a,b]\). Since \(\chi \) is non-negative, and the following inclusions hold

$$\begin{aligned}{}[0, 1] \subset [-i,n(b-a)-i], \quad \quad i=0, 1,..., n(b-a)-1, \end{aligned}$$

we immediately obtain

$$\begin{aligned} \begin{aligned} \sum _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx-k)\ {}&\ge \ \left[ \int _{0}^{1}\chi (y)\, dy\right] \sum _{k=na}^{nb-1} \phi _{\sigma }(nx-k) \\&\ge {{{\mathcal {K}}}}\,\phi _{\sigma }(2)>0, \end{aligned} \end{aligned}$$

thanks to the use of (1) and Lemma 1 (iii). \(\square \)

Therefore, by Lemma 6, we immediately obtain that \(D_{n}^{\sigma ,\chi }\) is well-defined, e.g., for any function that is bounded over the interval \([a,b]\). Then, of course, \(D_{n}^{\sigma ,\chi }\) is a positive linear operator on the space of bounded and \(L^{1}\)-integrable functions on \([a,b]\).

Remark 7

It is worth noting that the operator \(D_{n}^{\sigma ,\chi }\) preserves constant functions, a property inherent in its definition. Specifically, \(D_{n}^{\sigma ,\chi }e_{0}=e_{0}\), where \(e_{0}:[a,b]\rightarrow \mathbb {R}\) and \(e_{0}(x)=1\).

Remark 8

We emphasize that the Durrmeyer-type NN operators generalize some other well-known families of NN operators. For instance, the Kantorovich-type NN operators (see, e.g., [20])

$$\begin{aligned} (K_n f)(x):=\frac{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{\frac{k}{n}}^{\frac{k+1}{n}}f(t)\, dt\right] \phi _{\sigma }(nx-k)}{ \displaystyle \sum \nolimits _{k=na}^{nb-1} \phi _{\sigma }(nx-k)}, \quad x\in [a,b], \end{aligned}$$

can be viewed as a special case of the Durrmeyer-type NN operators. In fact, for any bounded \(f\in L^1([a,b])\) and \(\chi (u)={\textbf{1}}_{[0,1]}(u)\), \(u\in {\mathbb {R}}\), where \({\textbf{1}}_{[0,1]}\) is the characteristic function on the set \([0,1]\subset {\mathbb {R}}\), we have

$$\begin{aligned} \begin{aligned} \left( D_{n}^{\sigma ,{\textbf{1}}_{[0,1]} }f\right) (x)&= \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}{\textbf{1}}_{[0,1]} (nt-k)f(t)\, dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}{\textbf{1}}_{[0,1]} (nt-k)\, dt\right] \phi _{\sigma }(nx-k)}\\&=\frac{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{\frac{k}{n}}^{\frac{k+1}{n}}f(t)\, dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1} \phi _{\sigma }(nx-k)}=(K_n f)(x). \end{aligned} \end{aligned}$$

Thus, \(\left( D_{n}^{\sigma ,{\textbf{1}}_{[0,1]} }f\right) (x)=(K_n f)(x)\), for every \(x\in [a,b]\) and \(n\in \mathbb {N}^{+}\).

In Remark 14 below, we will provide further examples illustrating kernel functions \(\chi \) suitable for defining \(D_n^{\sigma ,\chi }\).
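The identity of Remark 8 and the preservation of constants in Remark 7 can be observed numerically. The sketch below is only an illustration under stated assumptions: \(\sigma \) is taken to be the logistic function, \([a,b]=[0,1]\), and the integrals in (2) are approximated by a midpoint rule whose grid is aligned with the nodes \(k/n\), so that the jumps of \({\textbf{1}}_{[0,1]}\) never fall on a quadrature point. All function names are hypothetical.

```python
import math

def sigma(x):
    """Logistic sigmoid (an assumed choice), evaluated in a numerically stable way."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def phi(x):
    """phi_sigma(x) = [sigma(x + 1) - sigma(x - 1)] / 2."""
    return 0.5 * (sigma(x + 1.0) - sigma(x - 1.0))

def indicator01(u):
    """Characteristic function of [0, 1]."""
    return 1.0 if 0.0 <= u <= 1.0 else 0.0

def midpoint(g, lo, hi, m):
    """Composite midpoint rule with m subintervals on [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(g(lo + (j + 0.5) * h) for j in range(m))

def durrmeyer(n, f, x, chi, a=0, b=1, cells=20):
    """Approximate (D_n f)(x) from (2); a and b are integers, as assumed in Sect. 3."""
    m = cells * n * (b - a)  # quadrature grid aligned with the nodes k/n
    num = den = 0.0
    for k in range(n * a, n * b):
        w = phi(n * x - k)
        num += n * midpoint(lambda t: chi(n * t - k) * f(t), a, b, m) * w
        den += n * midpoint(lambda t: chi(n * t - k), a, b, m) * w
    return num / den

def kantorovich(n, f, x, a=0, b=1, cells=20):
    """Approximate (K_n f)(x) with the same midpoint rule on each [k/n, (k+1)/n]."""
    ks = range(n * a, n * b)
    num = sum(n * midpoint(f, k / n, (k + 1) / n, cells) * phi(n * x - k) for k in ks)
    den = sum(phi(n * x - k) for k in ks)
    return num / den

if __name__ == "__main__":
    f = lambda t: t * t
    print(durrmeyer(8, f, 0.3, indicator01), kantorovich(8, f, 0.3))
```

Other kernels \(\chi \) can be passed to `durrmeyer` unchanged, provided the quadrature resolves them; for kernels with wide support a larger value of `cells` may be needed.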

4 Pointwise and uniform convergence

If \(a<b\) and \(f\in C([a,b])\), we denote, as usual, the uniform norm on \(C([a,b])\) by

$$\begin{aligned} \left\| f\right\| _{\infty } =\max _{x\in [a,b]}\left| f(x)\right| . \end{aligned}$$

Theorem 9

Let \(f:[a,b]\rightarrow \mathbb {R}\) be a bounded and \(L^{1}\)-integrable function. Then

$$\begin{aligned} \lim _{n\rightarrow +\infty }\left( D_{n}^{\sigma ,\chi }f\right) (x_{0})=f(x_{0}), \end{aligned}$$

at any point \(x_0\in [a,b]\) of continuity of f. Moreover, if \(f\in C([a,b])\), then

$$\begin{aligned} \lim _{n\rightarrow +\infty }\left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty } =0. \end{aligned}$$

Proof

Let \(x_0\in [a,b]\) be a point of continuity of f. Using Lemma 6, we obtain

$$\begin{aligned}&\left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \nonumber \\&\quad =\left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\cdot \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx_0-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx_0-k)}\right| \nonumber \\&\quad =\left| \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1}n\left[ \int _{a}^{b}\chi (nt-k)\left( f(t) -f(x_0)\right) dt\right] \phi _{\sigma }(nx_{0}-k)}{ \displaystyle \sum \nolimits _{k=na}^{nb-1} \left[ n\, \int _{a}^{b}\chi (nt-k)\, dt\right] \phi _{\sigma }(nx_0-k)}\right| \nonumber \\&\quad \le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\sum _{k=na}^{nb-1}n\left[ \int _{a}^{b}\chi (nt-k)\left| f(t) -f(x_0)\right| dt\right] \phi _{\sigma }(nx_{0}-k). \end{aligned}$$
(3)

Let now \(\varepsilon >0\) be arbitrarily chosen. By the continuity of f at \(x_0\), there exists \(\gamma >0\) such that \(|f(t)-f(x_0)|<\varepsilon \) for every \(t\in [a,b]\) with \(|t-x_0|\le \gamma \). At this point, we proceed to define

$$\begin{aligned} S_1:=\{k:\;na\le k\le nb-1 \text { and } |k/n -x_0|\le \gamma /2\}, \end{aligned}$$

and

$$\begin{aligned} S_2:=\{k:\;na\le k\le nb-1 \text { and } |k/n -x_0|>\gamma /2\}. \end{aligned}$$

Hence, we can write

$$\begin{aligned} \begin{aligned}&\left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \\&\quad \le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)} \Biggl \{\sum _{k\in S_1}n\left[ \int _{a}^{b}\chi (nt-k)\left| f(t) -f(x_0)\right| dt\right] \phi _{\sigma }(nx_{0}-k)\\&\quad +\sum _{k\in S_2}n\left[ \int _{a}^{b}\chi (nt-k)\left| f(t) -f(x_0)\right| dt\right] \phi _{\sigma }(nx_{0}-k)\Biggr \}\\&\quad =:\frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)} \left( I_1+I_2\right) . \end{aligned} \end{aligned}$$

The first term can be further divided into

$$\begin{aligned} \begin{aligned} I_1&=\sum _{k\in S_1}n\left[ \Biggl \{\int _{|t-k/n|<\gamma /2}+\int _{|t-k/n|\ge \gamma /2}\Biggr \}\chi (nt-k)\left| f(t) -f(x_0)\right| dt\right] \phi _{\sigma }(nx_{0}-k)\\&=:I_{1,1}+I_{1,2}. \end{aligned} \end{aligned}$$

For \(t\in [a,b]\), such that \(\left| t-k/n\right| <\gamma /2\), and \(k\in S_{1}\), we have

$$\begin{aligned} |t-x_0|\le \left| t-\frac{k}{n}\right| +\left| \frac{k}{n}-x_0\right| <\frac{\gamma }{2}+\frac{\gamma }{2}=\gamma , \end{aligned}$$

therefore, it follows that \(\left| f(t)-f(x_{0})\right| <\varepsilon \). This implies

$$\begin{aligned} \begin{aligned} I_{1,1}&<\varepsilon \sum _{k\in S_1}\phi _{\sigma }(nx_{0}-k)\;n\int _{|t-k/n|<\gamma /2}\chi (nt-k)dt\\&\le \varepsilon \sum _{k\in {\mathbb {Z}}}\phi _{\sigma }(nx_{0}-k)\int _{{\mathbb {R}}}\chi (u)du\\&=\varepsilon \left\| \chi \right\| _{1}, \end{aligned} \end{aligned}$$

where, in the above computations, we used the change of variable \(nt-k=u\), and also condition (i) of Lemma 1.

Let us now estimate \(I_{1,2}\). By the boundedness of f, it follows that

$$\begin{aligned} I_{1,2}\le 2\left\| f\right\| _{\infty }\sum _{k\in S_1} n\left[ \int _{|nt-k|\ge \frac{n\gamma }{2}}\chi (nt-k)dt\right] \phi _{\sigma }(nx_0-k), \end{aligned}$$

with

$$\begin{aligned} n\int _{|nt-k|\ge \frac{n\gamma }{2}}\chi (nt-k)dt=\int _{|y|\ge \frac{n\gamma }{2}}\chi (y)dy\rightarrow 0, \text { as } n\rightarrow +\infty , \end{aligned}$$

since \(\chi \in L^{1}({\mathbb {R}})\). Then, using again (i) of Lemma 1, \(I_{1,2}\le 2\left\| f\right\| _{\infty }\varepsilon \), for sufficiently large n. By analogous reasoning, we obtain the following inequalities

$$\begin{aligned} I_{2}\le & {} 2\left\| f\right\| _{\infty }\sum _{k\in S_2}n\left[ \int _{a}^{b}\chi (nt-k)dt\right] \phi _{\sigma }(nx_{0}-k)\\\le & {} 2\left\| f\right\| _{\infty } \left\| \chi \right\| _{1} \sum _{k\in S_2} \phi _{\sigma }(nx_{0}-k) \\\le & {} 2\left\| f\right\| _{\infty } \left\| \chi \right\| _{1} \sum _{\left| nx_{0}-k\right| >\frac{n\gamma }{2}}\phi _{\sigma }(nx_{0}-k). \end{aligned}$$

By property (v) of Lemma 1, for sufficiently large n we have \(I_{2}\le 2\left\| f\right\| _{\infty } \left\| \chi \right\| _{1} \varepsilon \).

In conclusion, setting \(M:=\left\| \chi \right\| _{1}+2\left\| f\right\| _{\infty }(1+\left\| \chi \right\| _{1})\), we have

$$\begin{aligned} \left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \le \frac{M\varepsilon }{{{{\mathcal {K}}}}\phi _{\sigma }(2)}, \end{aligned}$$

for sufficiently large n, and by the arbitrariness of \(\varepsilon \), it follows that

$$\begin{aligned} \lim _{n\rightarrow \infty }\left( D_{n}^{\sigma ,\chi }f\right) (x_{0})=f(x_{0}), \end{aligned}$$

therefore the first part of the statement is proved.

Finally, if \(f\in C([a,b])\) the second part of the theorem easily follows, as we can now use the parameter \(\gamma \) of the uniform continuity of f (in place of that of the continuity of f) for any arbitrary \(x\in [a,b]\). \(\square \)
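The uniform convergence in Theorem 9 can be explored numerically. The sketch below rests on several assumptions: \(\sigma \) is the logistic function, \(\chi \) is the triangular kernel \(\chi (u)=\max \{0,1-|u|\}\) (bounded, integrable, with \(\int _0^1\chi =1/2>0\)), the integrals are approximated by a midpoint rule, and the uniform norm is replaced by a maximum over a coarse grid in \([0,1]\). The function names are hypothetical.

```python
import math

def sigma(x):
    """Logistic sigmoid (an assumed choice), evaluated in a numerically stable way."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def phi(x):
    """phi_sigma(x) = [sigma(x + 1) - sigma(x - 1)] / 2."""
    return 0.5 * (sigma(x + 1.0) - sigma(x - 1.0))

def hat(u):
    """Triangular kernel supported on [-1, 1]."""
    return max(0.0, 1.0 - abs(u))

def midpoint(g, lo, hi, m):
    """Composite midpoint rule with m subintervals on [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(g(lo + (j + 0.5) * h) for j in range(m))

def durrmeyer_on_grid(n, f, chi, xs, a=0, b=1, cells=8):
    """Approximate (D_n f)(x) for every x in xs; the coefficients are computed once."""
    m = cells * n * (b - a)
    ks = range(n * a, n * b)
    cf = [n * midpoint(lambda t, k=k: chi(n * t - k) * f(t), a, b, m) for k in ks]
    c1 = [n * midpoint(lambda t, k=k: chi(n * t - k), a, b, m) for k in ks]
    values = []
    for x in xs:
        w = [phi(n * x - k) for k in ks]
        num = sum(c * wk for c, wk in zip(cf, w))
        den = sum(c * wk for c, wk in zip(c1, w))
        values.append(num / den)
    return values

def sup_error(n, f, chi, a=0, b=1, grid=10):
    """Maximum of |D_n f - f| over grid + 1 equally spaced points of [a, b]."""
    xs = [a + (b - a) * i / grid for i in range(grid + 1)]
    vals = durrmeyer_on_grid(n, f, chi, xs, a, b)
    return max(abs(v - f(x)) for v, x in zip(vals, xs))

if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, sup_error(n, lambda t: abs(t - 0.5), hat))
```

A grid maximum only bounds the uniform error from below, so the printed values indicate the trend rather than the exact norm.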

5 Quantitative estimates

In this section, we study the rate of approximation achieved by the operators \(D_{n}^{\sigma ,\chi }\).

To address this task, it is necessary to recall the notion of the modulus of continuity of a given function \(f\in C([a,b])\), defined as usual

$$\begin{aligned} \omega (f,\delta ):=\max \{|f(x)-f(y)|:\; x,y\in [a,b],\; |x-y|\le \delta \}, \end{aligned}$$

with \(\delta >0\). It is interesting to point out that the following well-known inequality

$$\begin{aligned} \omega (f,\lambda \delta )\le (\lambda +1)\,\omega (f,\delta ), \end{aligned}$$
(4)

holds, with \(\lambda ,\delta >0\).

Let \(x_{0}\in [a,b]\) be fixed. For the sake of clarity, we rewrite the inequality obtained in (3)

$$\begin{aligned} \left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\sum _{k=na}^{nb-1}n\left[ \int _{a}^{b}\chi (nt-k)\left| f(t) -f(x_0)\right| dt\right] \phi _{\sigma }(nx_{0}-k). \end{aligned}$$

One can easily check that

$$\begin{aligned} \begin{aligned} \left| f(t)-f(x_{0})\right|&\le \omega \left( f,\left| t-x_{0}\right| \right) =\omega \left( f, \frac{\gamma (n)}{\gamma (n)}\cdot \left| t-x_{0}\right| \right) \\&\le (1+\gamma (n)|t-x_0|)\,\omega \left( f,\frac{1}{\gamma (n)}\right) , \end{aligned} \end{aligned}$$

where the previous estimate is a consequence of (4), with \(\lambda =\gamma (n)|t-x_0|\) and \(\delta =\frac{1}{\gamma (n)}\). Here, in accordance with the insights from [14], the function \(\gamma (n)\) is defined as

$$\begin{aligned} \gamma (n):={\left\{ \begin{array}{ll} n^{\alpha -1}, &{} \text { if } \alpha \in (1,2), \\ \displaystyle \frac{n}{\ln n}, &{} \text { if } \alpha =2, \\ n, &{} \text { if } \alpha >2. \end{array}\right. } \end{aligned}$$

Hence, it follows

$$\begin{aligned} \begin{aligned}&\left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \\&\quad \le \frac{\omega \left( f,\frac{1}{\gamma (n)}\right) }{\mathcal{K}\,\phi _{\sigma }(2)}\sum _{k=na}^{nb-1}n\left[ \int _{a}^{b}\chi (nt-k)\,(1+\gamma (n)|t-x_0|)dt\right] \phi _{\sigma }(nx_{0}-k). \end{aligned} \end{aligned}$$

Through the change of variable \(nt-k=u\), we have \(\displaystyle \int _{a}^{b}\chi (nt-k)dt\le \frac{\left\| \chi \right\| _{1}}{n}\). Therefore, we can write

$$\begin{aligned} \begin{aligned}&n \int _{a}^{b}\chi (nt-k)\,(1+\gamma (n)|t-x_0|)dt\le \left\| \chi \right\| _{1}+n\,\gamma (n)\int _{a}^{b}\chi (nt-k)\,|t-x_0|dt\\&\le \left\| \chi \right\| _{1}+\gamma (n)\,n\int _{a}^{b}\chi (nt-k)\,\left| t-\frac{k}{n}\right| dt\, +\gamma (n)\,n\int _{a}^{b}\chi (nt-k)\,\left| \frac{k}{n}-x_0\right| dt\\&=\left\| \chi \right\| _{1}+\frac{\gamma (n)}{n}\,n\int _{a}^{b}\chi (nt-k)\,\left| nt-k\right| dt\, +\gamma (n)\,n\int _{a}^{b}\chi (nt-k)\,\left| \frac{k}{n}-x_0\right| dt\\&\le \left\| \chi \right\| _{1}+\frac{\gamma (n)}{n}{\tilde{M}}_{1}(\chi )+\frac{\gamma (n)}{n}\left\| \chi \right\| _{1}|k-nx_0|, \end{aligned} \end{aligned}$$

assuming that \({\tilde{M}}_{1}(\chi )<+\infty \). Consequently, by (i) of Lemma 1, we get

$$\begin{aligned} \begin{aligned}&\left| \left( D_{n}^{\sigma ,\chi }f\right) (x_{0})-f(x_{0})\right| \\&\le \frac{\omega \left( f,\frac{1}{\gamma (n)}\right) }{\mathcal{K}\,\phi _{\sigma }(2)}\left( \left\| \chi \right\| _{1}+\frac{\gamma (n)}{n}{\tilde{M}}_{1}(\chi )+\frac{\gamma (n)}{n} \left\| \chi \right\| _{1}\sum _{k=na}^{nb-1}|k-nx_0|\phi _{\sigma }(nx_{0}-k)\right) \\&=\frac{\omega \left( f,\frac{1}{\gamma (n)}\right) }{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \left\| \chi \right\| _{1}+\frac{\gamma (n)}{n}{\tilde{M}}_{1}(\chi )+\frac{\gamma (n)}{n} \left\| \chi \right\| _{1}M_{na}^{nb-1}(\phi _{\sigma },1,x_0)\right) , \end{aligned} \end{aligned}$$

where the truncated absolute moment of order 1 can be bounded, under the assumptions on \(\sigma \) made in Lemma 3 and in Lemma 4, by an increasing function of the variable n. Thus, for \(f\in C([a,b])\), we can immediately deduce the following estimates

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty } \le C\,\omega \left( f,\,\frac{1}{n^{\alpha -1}}\right) ,\quad&\text { if }\alpha \in \left( 1,2\right) \text {,} \end{aligned}$$
(5)
$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty } \le C\,\omega \left( f,\,\frac{\ln n}{n}\right) ,\quad&\text { if }\alpha =2\text {,} \end{aligned}$$
(6)
$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty } \le C\,\omega \left( f,\,\frac{1}{n}\right) ,\quad&\text { if }\alpha >2, \end{aligned}$$
(7)

where each constant C appearing in these estimates is independent of n and f.
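The three regimes in (5)-(7) are governed by the function \(\gamma (n)\), and for a Lipschitz function f (so that \(\omega (f,\delta )\le L\delta \)) the bounds are of order \(1/\gamma (n)\). The following helper simply tabulates \(1/\gamma (n)\) for sample values of \(\alpha \); the function name and the chosen values are illustrative.

```python
import math

def gamma(n, alpha):
    """gamma(n) as defined above: n^(alpha-1), n / ln n, or n, depending on alpha."""
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if alpha < 2:
        return float(n) ** (alpha - 1)
    if alpha == 2:
        return n / math.log(n)  # requires n >= 2
    return float(n)

if __name__ == "__main__":
    for alpha in (1.5, 2, 3):
        # 1 / gamma(n) is the argument of the modulus of continuity in (5)-(7).
        print(alpha, [1.0 / gamma(n, alpha) for n in (10, 100, 1000)])
```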

Now, we proceed to prove that these estimates are the best possible.

We focus only on the case \(\alpha =2\), employing an approach inspired by the recent paper [13]. The remaining cases follow by similar methodologies, and therefore, we omit discussing them here.

Throughout this section, the notation \(f(x) \sim g(x)\) as \(x\rightarrow a\), means that \(f(x)={\mathcal {O}}(g(x))\) as \(x\rightarrow a\), and \(g(x)={\mathcal {O}}(f(x))\) as \(x\rightarrow a\).

Now, if we define

$$\begin{aligned} \sigma (x):= {\left\{ \begin{array}{ll} -\displaystyle \frac{1}{8x}, &{}\quad x\le -1/2, \\ \displaystyle \frac{1}{2}x+\frac{1}{2}, &{}\quad -1/2<x<1/2, \\ \displaystyle 1-\frac{1}{8x}, &{}\quad x\ge 1/2, \end{array}\right. } \end{aligned}$$

it is easy to see that the corresponding \(\phi _{\sigma }\) satisfies

$$\begin{aligned} \phi _{\sigma }(x)\,=\,{\frac{1}{8\,(x^{2}-1)}}\,\ge \,{\frac{1}{8\,x^{2}}},\quad |x|\,\ge \,\frac{3}{2}. \end{aligned}$$
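The closed form of \(\phi _{\sigma }\) for \(|x|\ge 3/2\) can be checked directly from the piecewise definition of \(\sigma \). In the sketch below, `sigma_pw` and `phi_pw` are illustrative names for the functions just defined.

```python
def sigma_pw(x):
    """Piecewise sigmoidal function defined above."""
    if x <= -0.5:
        return -1.0 / (8.0 * x)
    if x < 0.5:
        return 0.5 * x + 0.5
    return 1.0 - 1.0 / (8.0 * x)

def phi_pw(x):
    """phi_sigma(x) = [sigma(x + 1) - sigma(x - 1)] / 2 for the piecewise sigma."""
    return 0.5 * (sigma_pw(x + 1.0) - sigma_pw(x - 1.0))

if __name__ == "__main__":
    for x in (1.5, 2.0, 3.7, -5.0, 10.0):
        # Columns: x, phi(x), closed form 1/(8(x^2-1)), lower bound 1/(8x^2).
        print(x, phi_pw(x), 1.0 / (8.0 * (x * x - 1.0)), 1.0 / (8.0 * x * x))
```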

Then, consider the function \(f:[-1,1]\rightarrow \mathbb {R}\), \(f(x)=\left| x\right| \). It is immediate that, for \(\delta >0\) sufficiently small, we have

$$\begin{aligned} \omega \left( f,\,\delta \right) =\delta . \end{aligned}$$
(8)

Thus, recalling that \(\phi _{\sigma }\) is even, we can write

$$\begin{aligned} \begin{aligned}&\left| \left( D_{n}^{\sigma ,\chi }f\right) (0)-f\left( 0\right) \right| \\&\quad =\left( D_{n}^{\sigma ,\chi }f\right) (0)=\frac{\displaystyle \sum \nolimits _{k=-n}^{n-1}n \left[ \int _{-1}^{1}\chi (nt-k)f(t)dt\right] \phi _{\sigma }(k)}{\displaystyle \sum \nolimits _{k=-n}^{n-1} \left[ \int _{-n-k}^{n-k}\chi (y)dy\right] \phi _{\sigma }(k)}\\&\quad \ge \frac{1}{\left\| \chi \right\| _{1}}\frac{ \displaystyle \sum \nolimits _{k=-n}^{n-1}n \left[ \int _{-1}^{1}\chi (nt-k)f(t)dt\right] \phi _{\sigma }(k)}{\displaystyle \sum \nolimits _{k=-n}^{n-1} \phi _{\sigma }(k)} \\&\quad \ge \frac{1}{\left\| \chi \right\| _{1}}\sum _{k=2}^{n-1}n\left[ \int _{-1}^{1}\chi (nt-k)f(t)dt\right] \phi _{\sigma }(k)\\&\quad = \frac{1}{\left\| \chi \right\| _{1}}\sum _{k=2}^{n-1}\left[ \int _{-n-k}^{n-k}\chi (y)\cdot \frac{|y+k|}{n}dy\right] \phi _{\sigma }(k)\\&\quad \ge \frac{1}{\left\| \chi \right\| _{1}}\sum _{k=2}^{n-1}\left[ \int _{0}^{1}\chi (y)\cdot \frac{|y+k|}{n}dy\right] \phi _{\sigma }(k)\\&\quad \ge \frac{1}{\left\| \chi \right\| _{1}}\sum _{k=2}^{n-1}\frac{k}{n}\left[ \int _{0}^{1}\chi (y)dy \right] \phi _{\sigma }(k)=\frac{\mathcal{K}}{\left\| \chi \right\| _{1}}\sum _{k=2}^{n-1}\frac{k}{n}\cdot \phi _{\sigma }(k)\\&\quad \ge \frac{{{{\mathcal {K}}}}}{8n\left\| \chi \right\| _{1}} \sum _{k=2}^{n-1}\frac{1}{k}. \end{aligned} \end{aligned}$$

Noting that

$$\begin{aligned} \sum _{k=2}^{n-1}\frac{1}{k}\,\ge \,\log n -\log 2\,= \,\log {\frac{n}{2}}\,\ge \,\frac{\log n}{2}, \end{aligned}$$

for sufficiently large \(n\in \mathbb {N}^{+}\), and referring to (8), it can be deduced that

$$\begin{aligned} \left| \left( D_{n}^{\sigma ,\chi }f\right) (0)-f\left( 0\right) \right| \ge \frac{\mathcal{K}}{16\left\| \chi \right\| _{1}}\cdot \frac{\log n}{n}=\frac{\mathcal{K}}{16\left\| \chi \right\| _{1}}\omega \left( f,\,\frac{\log n}{n}\right) , \end{aligned}$$

and, hence

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty } \ge \frac{{{{\mathcal {K}}}}}{16\left\| \chi \right\| _{1}}\omega \left( f,\,\frac{\log n}{n}\right) , \end{aligned}$$

for sufficiently large \(n\in \mathbb {N}^{+}\). It is possible to generalize this estimate to the case of an arbitrary sigmoidal function \(\sigma \) such that \(\phi _{\sigma }(x)\sim |x|^{-2}\), as \(x\rightarrow -\infty \), see, e.g., [13].

In conclusion, estimate (6) is the best possible in the case \(\alpha =2\).
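The lower bound derived above can be observed numerically in a special case. The sketch assumes \(\chi ={\textbf{1}}_{[0,1]}\), so that \({{{\mathcal {K}}}}=\left\| \chi \right\| _{1}=1\) and, by Remark 8, the coefficients reduce to \(n\int _{k/n}^{(k+1)/n}|t|\,dt=|k+1/2|/n\); the function names are illustrative.

```python
import math

def sigma_pw(x):
    """Piecewise sigmoidal function used in the sharpness example."""
    if x <= -0.5:
        return -1.0 / (8.0 * x)
    if x < 0.5:
        return 0.5 * x + 0.5
    return 1.0 - 1.0 / (8.0 * x)

def phi_pw(x):
    """phi_sigma(x) = [sigma(x + 1) - sigma(x - 1)] / 2 for the piecewise sigma."""
    return 0.5 * (sigma_pw(x + 1.0) - sigma_pw(x - 1.0))

def durrmeyer_abs_at_zero(n):
    """(D_n f)(0) on [-1, 1] for f(t) = |t| and chi = indicator of [0, 1]."""
    num = den = 0.0
    for k in range(-n, n):
        w = phi_pw(-k)               # phi_sigma(n * 0 - k)
        num += abs(k + 0.5) / n * w  # n * integral of |t| over [k/n, (k+1)/n]
        den += w
    return num / den

def lower_bound(n):
    """Right-hand side K / (16 ||chi||_1) * log(n) / n with K = ||chi||_1 = 1."""
    return math.log(n) / (16.0 * n)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, durrmeyer_abs_at_zero(n), lower_bound(n))
```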

By similar reasoning, we deduce that the estimates (5) and (7) are the best possible, in the cases \(\alpha \in \left( 1,2\right) \) and \(\alpha >2\), respectively.

Therefore, we can summarize such properties in the following theorem.

Theorem 10

Consider a sigmoidal function \(\sigma :{\mathbb {R}}\rightarrow {\mathbb {R}}\) and \(\alpha > 1\), such that \(\phi _{\sigma }(x)\sim \left| x\right| ^{-\alpha }\) as \(x\rightarrow -\infty \), and let \(\chi :\mathbb {R}\rightarrow [0,\infty )\) be a bounded and \(L^{1}\)-integrable function on \(\mathbb {R}\). The Durrmeyer-type neural network operators, defined in (2), have the following properties:

  1. (i)

    if \(\alpha \in (1,2)\), \({\mathcal {O}}\left( \omega \left( f,\,\frac{1}{n^{\alpha -1}}\right) \right) \), as \(n\rightarrow +\infty \), is the best order of uniform approximation over the class \(C([a,b])\) in the approximation by the operators \(D_{n}^{\sigma ,\chi }\);

  2. (ii)

    if \(\alpha =2\), \({\mathcal {O}}\left( \omega \left( f,\,\frac{\log n}{n}\right) \right) \), as \(n\rightarrow +\infty \), is the best order of uniform approximation over the class \(C([a,b])\) in the approximation by the operators \(D_{n}^{\sigma ,\chi }\);

  3. (iii)

    if \(\alpha >2\), \({\mathcal {O}}\left( \omega \left( f,\,\frac{1}{n}\right) \right) \), as \(n\rightarrow +\infty \), is the best order of uniform approximation over the class \(C([a,b])\) in the approximation by the operators \(D_{n}^{\sigma ,\chi }\).

6 Convergence in Lebesgue spaces

Now, in order to obtain approximation results for not necessarily continuous functions, we work in the setting of Lebesgue spaces \(L^{p}([a,b])\), \(1\le p<+\infty \). Therefore, we want to exploit the integral form of the operator \(D_{n}^{\sigma ,\chi }\) to show that it also converges with respect to the \(L^{p}\)-norm.

As an easy consequence of Theorem 9, we can prove the following.

Theorem 11

For every \(f\in C([a,b])\) and \(1\le p<+\infty \), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\left\| D_{n}^{\sigma ,\chi }f -f\right\| _{p}=0, \end{aligned}$$

where \(\left\| \cdot \right\| _{p}\) denotes the usual \(L^{p}\)-norm.

Proof

We immediately have

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{p}=\left( \int _{a}^{b}\left| \left( D_{n}^{\sigma ,\chi }f\right) (x)-f(x)\right| ^{p}dx\right) ^{1/p}\le \left\| D_{n}^{\sigma ,\chi }f -f\right\| _{\infty }\cdot (b-a)^{1/p}, \end{aligned}$$

from which the assertion follows by letting \(n\rightarrow \infty \), thanks to Theorem 9. \(\square \)

Subsequently, to prove the convergence of the family of Durrmeyer-type neural network operators in \(L^p\), it is necessary to establish the following inequality.

Theorem 12

For every \(f\in L^{p}(\left[ a,b\right] ) \), \(1\le p<+\infty \), we have

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f \right\| _{p}\le \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left\| f\right\| _{p}. \end{aligned}$$

Proof

For every \(f\in L^{p}(\left[ a,b\right] ) \), \(1\le p<+\infty \), taking into account the convexity of the function \(x\rightarrow \left| x\right| ^{p}\), and applying the Jensen inequality, we obtain

$$\begin{aligned} \begin{aligned}&\left\| D_{n}^{\sigma ,\chi }f \right\| _{p} =\left( \int _{a}^{b}\left| \left( D_{n}^{\sigma ,\chi }f\right) (x) \right| ^{p}dx\right) ^{1/p} \\&=\left( \int _{a}^{b}\left| \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1}\left[ n \int _{a}^{b}\chi (nt-k)f(t) dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1}\left[ n \int _{a}^{b}\chi (nt-k)dt\right] \phi _{\sigma }(nx-k)}\right| ^{p}dx\right) ^{1/p} \\&\le \frac{1}{{{\mathcal {K}}}}\left( \int _{a}^{b}\left| \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1}\left[ n \int _{a}^{b}\chi (nt-k)f(t) dt\right] \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1}\phi _{\sigma }(nx-k)}\right| ^{p}dx\right) ^{1/p}\\&\le \frac{1}{{{\mathcal {K}}}}\left( \int _{a}^{b} \frac{\displaystyle \sum \nolimits _{k=na}^{nb-1}\left| n \int _{a}^{b}\chi (nt-k)f(t) dt \right| ^{p} \phi _{\sigma }(nx-k)}{\displaystyle \sum \nolimits _{k=na}^{nb-1}\phi _{\sigma }(nx-k)} dx\right) ^{1/p} \end{aligned} \\ \begin{aligned}&\le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \int _{a}^{b} \sum _{k=na}^{nb-1}\left| n\int _{a}^{b}\chi (nt-k)f(t)dt\right| ^{p}\phi _{\sigma }(nx-k)dx\right) ^{1/p}\\&=\frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \sum _{k=na}^{nb-1}\int _{a}^{b}\phi _{\sigma }(nx-k)dx\left| n\int _{a}^{b}\chi (nt-k)f(t)dt\right| ^{p}\right) ^{1/p}. \end{aligned} \end{aligned}$$

One can easily verify that

$$\begin{aligned} \int _{a}^{b}\phi _{\sigma }(nx-k)dx\le \frac{1}{n}\int _{ \mathbb {R}}\phi _{\sigma }(t)dt=\frac{1}{n}, \end{aligned}$$

where the equality arises from (iv) of Lemma 1, which asserts that \(\displaystyle \int _{\mathbb {R}}\phi _{\sigma }(t)dt=1\). Taking this into consideration, we obtain

$$\begin{aligned} \left\| D_{n}^{\sigma ,\chi }f\right\| _{p}\le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \frac{1}{n} \sum _{k=na}^{nb-1}\left| n\int _{a}^{b}\chi (nt-k)f(t) dt\right| ^{p}\right) ^{1/p}. \end{aligned}$$
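For completeness, the bound on \(\int _{a}^{b}\phi _{\sigma }(nx-k)dx\) used above can be written out by the change of variable \(t=nx-k\). The step below is a sketch that relies on \(\phi _{\sigma }\ge 0\), a property assumed here and not restated in this section:

$$\begin{aligned} \int _{a}^{b}\phi _{\sigma }(nx-k)dx=\frac{1}{n}\int _{na-k}^{nb-k}\phi _{\sigma }(t)dt\le \frac{1}{n}\int _{\mathbb {R}}\phi _{\sigma }(t)dt=\frac{1}{n}. \end{aligned}$$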

Applying the Jensen inequality once again, we deduce that

$$\begin{aligned} \begin{aligned} \left\| D_{n}^{\sigma ,\chi }f\right\| _{p}&\le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \frac{1}{n}\sum _{k=na}^{nb-1}\left| \frac{n\int _{a}^{b}\chi (nt-k)dt}{n\int _{a}^{b}\chi (nt-k)dt} \cdot n\int _{a}^{b}\chi (nt-k)f(t) dt\right| ^{p}\right) ^{1/p}\\&\le \frac{1}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \frac{1}{n} \sum _{k=na}^{nb-1}\left( n\int _{a}^{b}\chi (nt-k)dt\right) ^{p-1} \cdot n\int _{a}^{b}\chi (nt-k)|f(t)|^{p} dt\right) ^{1/p}\\&\le \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \int _{a}^{b}\left[ \sum _{k=na}^{nb-1}\chi (nt-k)\right] |f(t)|^{p} dt\right) ^{1/p}\\&\le \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left( \int _{a}^{b}|f(t)|^{p} dt\right) ^{1/p}\\&=\frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left\| f\right\| _{p}, \end{aligned} \end{aligned}$$

and now the proof is complete. \(\square \)
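The second application of the Jensen inequality in the proof above can be unpacked as follows. This is only a sketch, under the assumptions (not restated in this section) that \(\chi \ge 0\) and that \(M_{0}(\chi )\) denotes \(\sup _{u\in \mathbb {R}}\sum _{k\in \mathbb {Z}}\chi (u-k)\). Setting \(c_{k}:=n\int _{a}^{b}\chi (nt-k)dt>0\), the measure \(d\mu _{k}(t):=c_{k}^{-1}\,n\chi (nt-k)dt\) is a probability measure on \([a,b]\), so that

$$\begin{aligned} \left| n\int _{a}^{b}\chi (nt-k)f(t)dt\right| ^{p}=c_{k}^{p}\left| \int _{a}^{b}f\,d\mu _{k}\right| ^{p}\le c_{k}^{p}\int _{a}^{b}|f|^{p}d\mu _{k}=c_{k}^{p-1}\cdot n\int _{a}^{b}\chi (nt-k)|f(t)|^{p}dt. \end{aligned}$$

Moreover, \(c_{k}\le n\int _{\mathbb {R}}\chi (nt-k)dt=\left\| \chi \right\| _{1}\), and the factor 1/n cancels the n in front of the remaining integral, which yields the third line of the chain. The fourth line then follows from \(\sum _{k}\chi (nt-k)\le M_{0}(\chi )\).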

From Theorem 12, we deduce that the operators \(D_{n}^{\sigma ,\chi }\) map the whole space \(L^{p}([a,b])\) into itself and, in particular, that they are well-defined in \(L^{p}([a,b])\).

Finally, by exploiting the density of \(C([a,b])\) within \(L^{p}(\left[ a,b\right] ) \), \(1\le p<+\infty \), with respect to the norm \(\left\| \cdot \right\| _{p}\), we achieve the following convergence result.

Theorem 13

For every \(f\in L^{p}(\left[ a,b\right] ) \), \(1\le p<+\infty \), we have

$$\begin{aligned} \lim _{n\rightarrow +\infty }\left\| D_{n}^{\sigma ,\chi }f -f\right\| _{p}=0. \end{aligned}$$

Proof

Let \(f \in L^{p}([a,b])\) and \(\varepsilon > 0\) be fixed. Since the space \(C([a,b])\) is dense in \(L^{p}([a,b])\) with respect to the norm \(\Vert \cdot \Vert _p\), there exists \(g \in C([a,b])\) such that \(\Vert f - g\Vert _p < \left( \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}+1\right) ^{-1}\varepsilon /2\).

Then, using such a g and Theorem 12,

$$\begin{aligned} \begin{aligned} \Vert D_{n}^{\sigma ,\chi }f - f\Vert _p&\le \Vert D_{n}^{\sigma ,\chi }f - D_{n}^{\sigma ,\chi }g\Vert _p + \Vert D_{n}^{\sigma ,\chi }g - g\Vert _p + \Vert g - f\Vert _p \\&\le \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}\left\| f-g\right\| _{p}+ \Vert D_{n}^{\sigma ,\chi }g - g\Vert _p + \Vert g - f\Vert _p\\&=\left( \frac{\left\| \chi \right\| _{1}^{\frac{p-1}{p}}M_{0}(\chi )^{\frac{1}{p}}}{{{{\mathcal {K}}}}\,\phi _{\sigma }(2)}+1\right) \left\| f-g\right\| _{p}+ \Vert D_{n}^{\sigma ,\chi }g - g\Vert _p\\&<\frac{\varepsilon }{2}+ \Vert D_{n}^{\sigma ,\chi }g - g\Vert _p. \end{aligned} \end{aligned}$$

Finally, by Theorem 11,

$$\begin{aligned} \Vert D_{n}^{\sigma ,\chi }f - f\Vert _p<\frac{\varepsilon }{2}+\frac{\varepsilon }{2}=\varepsilon , \end{aligned}$$

for \(n \in \mathbb {N}^+\) sufficiently large. Since \(\varepsilon \) is arbitrary, the proof is complete. \(\square \)
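The convergence stated in Theorem 13 can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. The form of \(D_{n}^{\sigma ,\chi }\) is read off the first display in the proof of Theorem 12. The remaining choices are assumptions made only for illustration: the density \(\phi _{\sigma }(x)=\frac{1}{2}[\sigma (x+1)-\sigma (x-1)]\), the logistic \(\sigma \), the hat function as \(\chi \), the interval [0, 1], the test function \(f(t)=t^{2}\), and plain Riemann sums for all integrals. All function names are hypothetical.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def phi(x, sigma=logistic):
    # Assumed density function: phi_sigma(x) = (sigma(x+1) - sigma(x-1)) / 2.
    return 0.5 * (sigma(x + 1.0) - sigma(x - 1.0))

def hat(x):
    # Hat function (central B-spline of order 2), used here as the kernel chi.
    return np.maximum(1.0 - np.abs(x), 0.0)

def durrmeyer(f, n, x, a=0.0, b=1.0, grid=4001):
    # Evaluate (D_n f)(x) at the points of the array x, with Riemann-sum quadrature in t.
    t = np.linspace(a, b, grid)
    dt = t[1] - t[0]
    k = np.arange(int(np.ceil(n * a)), int(np.floor(n * b)))  # k = ceil(na), ..., floor(nb) - 1
    chi_vals = hat(n * t[None, :] - k[:, None])               # shape (len(k), grid)
    num_coef = n * (chi_vals * f(t)[None, :]).sum(axis=1) * dt
    den_coef = n * chi_vals.sum(axis=1) * dt
    weights = phi(n * x[:, None] - k[None, :])                # shape (len(x), len(k))
    return (weights @ num_coef) / (weights @ den_coef)

def lp_error(f, n, p=2.0, a=0.0, b=1.0, pts=1001):
    # Approximate || D_n f - f ||_p on [a, b] by a Riemann sum.
    x = np.linspace(a, b, pts)
    diff = np.abs(durrmeyer(f, n, x, a, b) - f(x)) ** p
    return (diff.mean() * (b - a)) ** (1.0 / p)

f = lambda t: t ** 2
errors = {n: lp_error(f, n) for n in (10, 20, 40, 80)}
```

Printing `errors` shows how the approximate \(L^{2}\) error behaves as n grows for this particular choice of \(\sigma \), \(\chi \) and f; other choices can be swapped in by replacing `logistic`, `hat` or `f`.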

Remark 14

Examples of sigmoidal functions satisfying all the assumptions of the above theory can be easily provided. For instance, we can mention the well-known logistic function \(\sigma _{l}(x)=(1+e^{-x})^{-1}\), \(x\in {\mathbb {R}}\), and the hyperbolic tangent sigmoidal function \(\sigma _{h}(x)=\frac{1}{2}(\tanh x+1)\), \(x\in {\mathbb {R}}\) (see, e.g., [4, 7, 30]). An example of a non-smooth sigmoidal function can be provided by the ramp function \(\sigma _{R}(x)\) [8, 33, 34], defined by

$$\begin{aligned} \sigma _{R}(x)={\left\{ \begin{array}{ll} 0, &{} \text { if } x<-\frac{1}{2}, \\ x+\frac{1}{2}, &{} \text { if } -\frac{1}{2}\le x\le \frac{1}{2}, \\ 1, &{} \text { if } x>\frac{1}{2}, \end{array}\right. } \end{aligned}$$

for which the corresponding function \(\phi _{\sigma _{R}}\) has compact support.
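The three sigmoidal functions above can be compared numerically. The sketch below assumes the density function has the form \(\phi _{\sigma }(x)=\frac{1}{2}[\sigma (x+1)-\sigma (x-1)]\), which is not restated in this section. It approximates \(\int _{\mathbb {R}}\phi _{\sigma }(t)dt\) by a Riemann sum on a truncated interval and locates the grid points where \(\phi _{\sigma _{R}}\) is positive. The function names are hypothetical.

```python
import numpy as np

def sigma_logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigma_tanh(x):
    return 0.5 * (np.tanh(x) + 1.0)

def sigma_ramp(x):
    # Equals 0 for x < -1/2, x + 1/2 on [-1/2, 1/2], and 1 for x > 1/2.
    return np.clip(x + 0.5, 0.0, 1.0)

def phi(sigma, x):
    # Assumed form of the density function generated by sigma.
    return 0.5 * (sigma(x + 1.0) - sigma(x - 1.0))

x = np.linspace(-60.0, 60.0, 240001)
dx = x[1] - x[0]

# Riemann-sum approximation of the integral of phi_sigma over the real line.
integrals = {
    s.__name__: float(np.sum(phi(s, x)) * dx)
    for s in (sigma_logistic, sigma_tanh, sigma_ramp)
}

# Grid points at which the ramp-generated density is strictly positive.
ramp_support = x[phi(sigma_ramp, x) > 0.0]
```

Under this assumption, the positive part of \(\phi _{\sigma _{R}}\) is confined to the interval from -3/2 to 3/2, while the densities generated by the logistic and hyperbolic tangent functions are positive on the whole real line and decay exponentially.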

Additionally, since the so-called rectified linear unit (ReLU) activation function has received considerable attention in recent years, owing to its useful properties in training algorithms, it is natural to ask whether the previous results can also be extended to that case. In general, this question is non-trivial, since the ReLU function is not sigmoidal.

We recall that the ReLU function is defined as \(\psi _{\text {ReLU}}(x)=(x)_{+}\), \(x\in {\mathbb {R}}\) (see, e.g., [12]), where \((x)_{+}:= \max \{x, 0\}\) denotes the positive part of \(x\in {\mathbb {R}}\).

Similarly, powers of the ReLU function, known as rectified power unit (RePU, [28]) functions \(\psi ^{k}_{\text {ReLU}}(x)=[(x)_{+}]^{k}\), \(x\in {\mathbb {R}}\), are also of interest in the theory of neural networks, for reasons similar to those given above.

However, it is not difficult to see that certain density functions \(\phi _{\sigma }\) generated by suitable \(\sigma \) can be viewed as finite linear combinations of ReLU or RePU functions, respectively (see [14]).
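One concrete instance of this observation can be checked directly for the ramp function. Since \(\sigma _{R}(x)=(x+\frac{1}{2})_{+}-(x-\frac{1}{2})_{+}\), and assuming again the form \(\phi _{\sigma }(x)=\frac{1}{2}[\sigma (x+1)-\sigma (x-1)]\) (an assumption not restated in this section), the density \(\phi _{\sigma _{R}}\) is a combination of four shifted ReLU functions. The sketch below compares the two expressions on a grid; the function names are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigma_ramp(x):
    return np.clip(x + 0.5, 0.0, 1.0)

def phi_ramp(x):
    # Assumed density function: (sigma_R(x+1) - sigma_R(x-1)) / 2.
    return 0.5 * (sigma_ramp(x + 1.0) - sigma_ramp(x - 1.0))

def phi_ramp_relu(x):
    # The same function written as a linear combination of four shifted ReLUs.
    return 0.5 * (relu(x + 1.5) - relu(x + 0.5) - relu(x - 0.5) + relu(x - 1.5))

x = np.linspace(-4.0, 4.0, 1601)
max_gap = float(np.max(np.abs(phi_ramp(x) - phi_ramp_relu(x))))
```

The value `max_gap` measures the largest pointwise difference between the two expressions on the grid.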

Finally, to conclude the paper, it is important to provide examples of functions \(\chi \) that can be used in the definition of the Durrmeyer-type NN operators. This is quite easy, since the required assumptions on \(\chi \) are very mild. For instance, we can take as \(\chi \) the well-known central B-spline kernel of order n, expressed as

$$\begin{aligned} M_{n}(x):=\frac{1}{(n-1)!}\sum _{i=0}^{n}(-1)^{i}\left( {\begin{array}{c}n\\ i\end{array}}\right) \left( \frac{n}{2} +x-i \right) _{+}^{n-1}. \end{aligned}$$
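The truncated-power formula above translates directly into code. The following sketch evaluates the central B-spline for a given order; the parameter is called `order` (a hypothetical name) to avoid confusion with the index n of the operators, and the case of order 1 is excluded because the formula then involves the ambiguous power \((\cdot )_{+}^{0}\).

```python
from math import comb, factorial

def bspline(order, x):
    # Central B-spline M_order(x) from the truncated-power formula (order >= 2).
    if order < 2:
        raise ValueError("this sketch handles order >= 2 only")
    total = 0.0
    for i in range(order + 1):
        base = order / 2.0 + x - i
        if base > 0.0:  # positive part: terms with base <= 0 contribute nothing
            total += (-1) ** i * comb(order, i) * base ** (order - 1)
    return total / factorial(order - 1)

# Sum of integer translates of the quadratic B-spline at a sample point.
translate_sum = sum(bspline(3, 0.3 - k) for k in range(-5, 6))
```

For order 2 the function reduces to the hat function \(1-|x|\) on [-1, 1], and the integer translates of the B-spline sum to 1, which can be checked through `translate_sum`.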

Additionally, the Fejér kernel provides another suitable choice

$$\begin{aligned} F(x):=\frac{1}{2} \text {sinc}^{2}\left( \frac{x}{2}\right) ,\;\;\;x\in \mathbb {R}, \end{aligned}$$

with the sinc-function defined as

$$\begin{aligned} \text {sinc}(x):={\left\{ \begin{array}{ll} \frac{\sin (\pi x)}{\pi x}, &{} \text { if } x\in \mathbb {R}\setminus \{0\}, \\ 1, &{} \text { if } x=0. \end{array}\right. } \end{aligned}$$
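The Fejér kernel can be evaluated with NumPy, whose `np.sinc` uses the same normalized convention \(\sin (\pi x)/(\pi x)\) as the definition above. The sketch below approximates the integral of F over a large truncated interval by a Riemann sum. The interval length and grid size are arbitrary choices made for illustration, and since F decays only quadratically the truncated value falls slightly short of the full integral.

```python
import numpy as np

def fejer(x):
    # np.sinc is the normalized sinc, sin(pi x) / (pi x), with value 1 at x = 0.
    return 0.5 * np.sinc(x / 2.0) ** 2

x = np.linspace(-400.0, 400.0, 800001)
dx = x[1] - x[0]

# Riemann-sum approximation of the integral of F over [-400, 400].
mass = float(np.sum(fejer(x)) * dx)
```

Unlike the B-spline kernels, F does not have compact support, so any numerical use of it requires truncation of this kind.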

Further options can be explored in references such as [1, 2, 17, 18, 21].

7 Final remarks and conclusions

In this paper, the theory of the Durrmeyer-type NN operators is introduced and studied in the case of the approximation of functions of one variable. It is well-known that NN-type approximations typically involve multivariate data; hence, as future work, we aim to extend the above definition and results to the multivariate setting, following the same approach considered in [20].