1 Correction to: Neural Computing and Applications https://doi.org/10.1007/s00521-020-05182-1

1.1 Introduction

Here we provide an addendum on SPOCU fitting and an erratum to the article "SPOCU: scaled polynomial constant unit activation function", https://doi.org/10.1007/s00521-020-05182-1.

Following the publication of our article [1], we have become aware that the legend of Fig. 8b should read as follows: generator h(·) of the activation function S. Moreover, Fig. 1 below provides the graph of the activation function S (c = ∞).

Fig. 1 Activation function S

The SPOCU activation function is given by

$$ s(x) = \alpha h\left( \frac{x}{\gamma} + \beta \right) - \alpha h(\beta) $$
(1)

where β ∈ (0,1), α, γ > 0 and

$$ h(x)=\left\{\begin{array}{lll} r(c),& x\geq c,\\ r(x),&x\in[0,c),\\ 0,&x<0, \end{array}\right.$$
(2)

with \(r(x) = x^{3}(x^{5} - 2x^{4} + 2)\) and \(1 \le c < \infty\) (we also admit \(c \to \infty\), in which case \(r(c) \to \infty\)).
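For reference, Eqs. (1)–(2) can be implemented directly; a minimal NumPy sketch (the function and parameter names are ours, not from any existing package):

```python
import numpy as np

def r(x):
    """Generator polynomial r(x) = x^3 (x^5 - 2x^4 + 2)."""
    return x**3 * (x**5 - 2 * x**4 + 2)

def h(x, c=1.0):
    """Piecewise generator h from Eq. (2); c = np.inf is admitted."""
    x = np.asarray(x, dtype=float)
    return np.where(x < 0, 0.0, r(np.minimum(x, c)))

def spocu(x, alpha=1.0, beta=0.5, gamma=1.0, c=1.0):
    """SPOCU activation s(x) = alpha*h(x/gamma + beta) - alpha*h(beta), Eq. (1)."""
    return alpha * h(x / gamma + beta, c) - alpha * h(beta, c)
```

Note that s(0) = 0 by construction, so the activation passes through the origin for any admissible parameters.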

Shortly after the publication of our article [1], we received questions from the community about how to implement SPOCU, since it is not included in current software packages, e.g. Matlab or Python (Keras). Here we provide a short guide on how to select the parameters α, β, γ and c.

The parametric space for the parameters of SPOCU is (α, β, γ, c) ∈ (0, ∞) × (0, 1) × (0, ∞) × (1, ∞] =: P. For a proper choice of the parameters, it is sufficient that they belong to P and satisfy several conditions. One set of conditions has been outlined in [1], which gives

$$ {\mathbb{E}}_{f} \left[ s(z) \right] =: \mathrm{Mean}(\alpha, \beta, c, \gamma) = 0 $$
(3)
$$ {\mathbb{E}}_{f} \left[ s^{2}(z) \right] =: \mathrm{Var}(\alpha, \beta, c, \gamma) = 1 $$
(4)
$$ \left\| J(\alpha, \beta, c, \gamma) \right\| < 1 $$
(5)

hold. Here f(z) is the pdf of the underlying distribution, J is the Jacobi matrix w.r.t. the mean and variance parameters, and ||·|| is an appropriate matrix norm, e.g. L1 or L2, as considered in the application.
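In practice, the two moment conditions (3)–(4) can be checked, or a parameter search over P guided, by estimating the moments numerically. A Monte Carlo sketch, assuming f is the standard normal pdf (the helper names are ours):

```python
import numpy as np

def r(x):
    return x**3 * (x**5 - 2 * x**4 + 2)

def h(x, c=1.0):
    x = np.asarray(x, dtype=float)
    return np.where(x < 0, 0.0, r(np.minimum(x, c)))

def spocu(x, alpha, beta, gamma, c=1.0):
    return alpha * h(x / gamma + beta, c) - alpha * h(beta, c)

def moment_estimates(alpha, beta, gamma, c=1.0, n=200_000, seed=0):
    """Monte Carlo estimates of E_f[s(z)] and E_f[s^2(z)] for z ~ N(0, 1)."""
    z = np.random.default_rng(seed).standard_normal(n)
    s = spocu(z, alpha, beta, gamma, c)
    return s.mean(), (s * s).mean()
```

One would then search P for parameters driving the first estimate toward 0 and the second toward 1, e.g. with a root finder, and verify condition (5) afterwards.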

However, other sets of conditions can be more attractive for the reader; it all depends on the aims and the type of network. E.g., similarly as in Section 5.1 of [1], one can fix c = γ = 1. Then, to compute α and β, one can consider the Pareto density \(g(x) = a(x+1)^{-a-1}\), x > 0, a > 1, and require that, for some a > 1,

$$ {\mathbb{E}}_{g} \left[ {s(x)} \right] = \frac{1}{a - 1} $$
(6)

with the additional condition \(s'(0)=1\); since \(s'(0) = (\alpha/\gamma)\, h'(\beta)\) and \(h'(\beta) = 2\beta^{2}(4\beta^{5} - 7\beta^{4} + 3)\), this reads \(2\alpha\beta^{2}(4\beta^{5} - 7\beta^{4} + 3) = \gamma\). Thus, for the Pareto tail parameter a = 4 we obtained, using Maple, α = 0.8874425243 and β = 0.4495811364.
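These values can be cross-checked numerically; a SciPy sketch under the same choices c = γ = 1 and a = 4, splitting the expectation at x = 1 − β, beyond which s is constant (variable names are ours):

```python
from scipy.integrate import quad

ALPHA, BETA, A = 0.8874425243, 0.4495811364, 4.0  # Maple solution for a = 4

def r(x):
    return x**3 * (x**5 - 2 * x**4 + 2)

def h(x, c=1.0):
    return 0.0 if x < 0 else r(min(x, c))

def s(x):
    # SPOCU with c = gamma = 1
    return ALPHA * h(x + BETA) - ALPHA * h(BETA)

# Condition s'(0) = 1, i.e. 2*alpha*beta^2*(4*beta^5 - 7*beta^4 + 3) = gamma = 1
slope = 2 * ALPHA * BETA**2 * (4 * BETA**5 - 7 * BETA**4 + 3)

# Condition (6): E_g[s(x)] = 1/(a - 1) under the Pareto density g(x) = a(x+1)^(-a-1)
body, _ = quad(lambda x: s(x) * A * (x + 1) ** (-A - 1), 0, 1 - BETA)
tail = s(1.0) * (2 - BETA) ** (-A)  # s constant for x >= 1 - beta; P(X > 1 - beta) = (2 - beta)^(-a)
expectation = body + tail
```

Both `slope` and `expectation` should reproduce 1 and 1/(a − 1) = 1/3, respectively, up to the precision of the printed digits.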