Statistical analysis and first-passage-time applications of a lognormal diffusion process with multi-sigmoidal logistic mean

We consider a lognormal diffusion process having a multisigmoidal logistic mean, useful to model the evolution of a population which reaches the maximum level of the growth after many stages. Referring to the problem of statistical inference, two procedures to find the maximum likelihood estimates of the unknown parameters are described. One is based on the resolution of the system of the critical points of the likelihood function, and the other is on the maximization of the likelihood function with the simulated annealing algorithm. A simulation study to validate the described strategies for finding the estimates is also presented, with a real application to epidemiological data. Special attention is also devoted to the first-passage-time problem of the considered diffusion process through a fixed boundary.

The differential equations which drive the growth of the aforementioned deterministic models are very useful to describe population dynamics.However, in order to make them more realistic, it is necessary to introduce a noise term in the equation.In this way, the differential equations are replaced by stochastic ones.Most of the times, the analysis of the resulting stochastic equation is quite complex, and the transition probability density of the resulting diffusion process cannot be determined (for example, see Campillo et al. (2018) [5], in which the authors propose, for this reason, a new approach to find the maximum likelihood estimates).Models based on diffusion processes are commonly used in various fields of applications, for example plant dynamics (cf.Rupšys et al. (2020) [32], where a hybrid growth is based on Gompertz and Vasicek models), resources consumption (for instance, Nafidi et al. (2019) [22] uses the Brennan-Schwartz process to model electricity consumption in Morocco) or particular fish species growth (cf.a stochastic version of the open-ended logistic model considered in Yoshioka et al. (2019) [38]).
In a recent paper, Di Crescenzo et al. (2020) [9] focuses on the generalization of the classical logistic growth model introducing more than one inflection point.To this end, firstly, two different birth-death processes, one with linear birth and death rates and the other with quadratic rates were considered.Then, a diffusive approximation was performed leading to a non-homogeneous lognormal diffusion process with mean of multi-sigmoidal logistic type.Attention was also given to the description of its main features of interest in applied contexts.For instance, the mean of the process is a generalized version of the classical logistic function (see, for instance, Di Crescenzo and Paraggio (2019) [8]) with more than one inflection point.The transition probability density of the process has been obtained explicitly and has been applied to plant dynamics.
Starting from the theoretical results of the previous works, in the present paper we approach the problem of the inference of the stochastic model.This is done by means of the maximum likelihood method, thanks to the availability in closed form of the likelihood function.We also address the treatment of some collateral problems that emerge in the development carried out, such as: (i) obtaining initial solutions to solve the system of likelihood equations, and (ii) bounding the parametric space for addressing the estimation by metaheuristic procedures.All development is supported by simulation examples.Subsequently, in order to provide an example of application to real phenomena, we adopt the proposed model to describe the behavior of the data on the evolution of COVID-19 in different European countries during the two first waves of infection.
Indeed, some of the main features of the diffusion process, such as the mean, the mode and the quantiles, may be used for prediction purposes and they are expressed as a function of the parameters of the process.
The problem of parameters estimation has been considered in several papers, for instance in Shimizu and Iwase (1987) [34] and in Tanaka (1987) [36].See also the more recent works of Garcia (2019) [14], in which the author converts the maximization of the likelihood function into an equivalent problem regarding the minimization of a square error, and of Ramos-Ábalos et al. (2020) [24] where maximum likelihood estimates of the parameters of the powers of the homogeneous Gompertz diffusion process are obtained.
Two different strategies to obtain the maximum likelihood estimates of the parameters are introduced.The first is based on the solution of the system of the critical points of the likelihood function, and the other stems from a meta-heuristic optimization method (simulated annealing) to maximize the likelihood function.This is the outline of the content of the paper.In Section 2, the most relevant characteristics of the deterministic and the corresponding stochastic model are recalled.Then, the problem of finding the maximum likelihood estimates of the involved parameters is described in Section 3. In several contexts of population dynamics, it may be relevant to know how long the population spends below a certain control threshold.For this reason the first-passage-time (FPT) problem is also addressed.More precisely, in Section 4, the R-package fptdApprox (see [39]) is used to determine the approximated FPT density of the lognormal diffusion process through a constant boundary.With the purpose of validating the described procedures for finding the maximum likelihood estimates, a simulation study is presented in Section 5. Finally, in Section 6 we propose an application of the model to real data concerning the COVID-19 infections in France, Italy, Spain and United Kingdom.

The multisigmoidal logistic model and the corresponding diffusion process
Consider the classical logistic equation with r, η, C > 0. If the intrinsic growth rate r is replaced by a polynomial P (t), then the solution of this equation, with the initial condition l(t 0 ) = l 0 , is given by where Q(t) − Q(t 0 ) = t t0 P (τ )dτ .With the hypotesis that Q(t) → +∞ when t → ∞, the carrying capacity of this generalized model is given by C η , and thus it is independent from the initial condition l 0 .In order to obtain a generalized logistic function in which the carrying capacity is dependent on the initial condition, we consider the following equation (cf.Di Crescenzo et al. (2020) in [9]) with and . Under these assumptions, the solution of the ordinary differential equation ( 1), with initial condition l m (t 0 ) = l 0 , is the so-called multisigmoidal logistic function given by We point out that the function l m may exhibit more than one inflection point, and its carrying capacity is where C = C(l 0 , η, β, t 0 ) = l 0 η + e −Q β (t0) and Q β is defined in Eq. (3).It is easy to note that the function ( 4) is not monotonous in general, since the monotonicity intervals depend on the coefficients β 1 , . . ., β p of the polynomial Q β , and the carrying capacity is the maximum value attainable by the function l m .See Figure 1 for some plots of the multisigmoidal logistic function.
The investigation of the inflection points in the case of multisigmoidal growth curves are of great interest.Unfortunately, since the expression of function ( 4) is quite complex, these points cannot be obtained explicitly, but it is possible to provide an equation in the unknown t solved by the inflection points, that is In Figure 2, the multisigmoidal logistic function and the corresponding inflection points are shown for some choices of the parameters.

The corresponding diffusion process
In Di Crescenzo et al. (2020) [9], a special time-dependent lognormal diffusion process {X(t); t ∈ I} has been considered, with I = [t 0 , +∞) and infinitesimal moments where h θ is defined in (2), θ = (η, β T ) T and σ > 0. The aforementioned process is determined by the following stochastic differential equation, obtained from Eq. ( 1) by adding a multiplicative noise term, = means equality in distribution, and where W (t) denotes a Wiener process independent from the (possibly random) initial state X 0 , for t ≥ t 0 .We point out that this is not the only way to randomize the growth deterministic equation.Indeed, in the case of random catastrophes, it may be more appropriate to consider as a noise term a Poisson process (see for example Schlomann (2018) [33]).The solution of Eq. ( 8) is with The existence and uniqueness of solution of the linear stochastic differential equation ( 8) is ensured by virtue of the continuity of function h θ (t) (see, for example, Arnold (1974) [1]).Moreover, if either X 0 is degenerated at x 0 , in the sense that P [X(t 0 ) = x 0 ] = 1, or X 0 follows a lognormal distribution Λ 1 µ 0 , σ 2 0 , then the finite dimensional distributions of the process are lognormal.Namely, for any n ∈ N and t 0 ≤ t 1 < . . .< t n , the vector (X(t 1 ), . . ., X(t n )) T follows an n-dimensional lognormal distribution Λ n (ϵ, Σ), where the entries of the vector ϵ are given by and the components of the matrix Σ = (σ i,j ) are given by Further, the conditional distribution of the process follows a lognormal distribution, i.e. for s < t From the above mentioned distributions, some characteristics associated to the process can be obtained (cf.Di Crescenzo et al. (2020) [9]).For example, the mean of X(t) conditional on X(t 0 ) = x 0 is given by Moreover, if and the α-percentiles for t ≥ t 0 are for 0 < α < 1, where z α is the α-percentile of the standard normal random variable.Note that the conditional mean (11) and the mean (12) are multisigmoidal logistic functions of t, in the sense that they solve the multisigmoidal logistic equation (1).

Maximum likelihood estimations
The stochastic model introduced in Section 2.1 can be employed in several applications, especially for describing real populations that exhibit a growth pattern with more than one inflection point.Clearly, in order to apply this model to real data, the unknown parameters need to be estimated.In Section 2.1 we obtained the distribution of the diffusion process X(t) defined in (9).Now we propose to estimate the parameters by means of the classical maximum likelihood method.The adoption of this strategy is particularly suggested by the availability in closed form of the transition distribution of the process X(t).Hence, we follow the same lines introduced in Román-Román et al. (2018) [26] for general lognormal diffusion processes.We consider a discrete sampling of X(t) based on d independent sample paths, with n i different observation instants for the i-th sample path, i.e. t ij , j = 1, . . ., n i , for i = 1, . . ., d.For simplicity, assume that the first observation time is identical for any trajectory, i.e. t i1 = t 0 , i = 1, . . ., d.Moreover, let the vector X i = (X(t i1 ), . . ., X(t ini )) T contain the variables of the i-th sample path, for i = 1, . . ., d, and let X = By supposing that X(t 0 ) follows a one-dimensional lognormal distribution Λ 1 µ 1 , σ 2 1 and by considering the transitions of the process X(t), the probability density function of X has the following expression Recalling (10), the parameters in (14) are given by and ∆ j+1,j i In order to obtain a more manageable expression of the density (14), the following change of variables may be considered: Hence, the probability density function of the vector for v = (v 01 , . . ., v 0d , v 11 , . . ., v 1(n1−1) , . . ., v d1 , . . ., v d(n d −1) ) ∈ R n+d , with n defined in (15), where lv 0 = (log v 01 , . . ., log v 0d ) T , and , for j = 1, . . ., n i − 1 and i = 1, . . ., d.
T and supposing that α and ξ are functionally independent, the log-likelihood function is given by The maximum likelihood estimations (MLEs) of α = µ 1 , σ 2 1 T can be computed easily.Indeed, by differentiating L V , from (17) we obtain Further on, in order to find the maximum likelihood estimates of ξ, two different approaches are available: (i) solving the nonlinear system ∂ ∂ξ LV = 0, (ii) maximizing the objective function LV .
Hereafter, in the Sections 3.1 and 3.2 we provide a description of the two strategies, whereas in Section 5 we present an application to a simulation study that involves the given strategies.
The availability of the probability density function of X in ( 14) allows to obtain explicitly the log-likelihood function given in (17).Consequently, following the maximum likelihood estimation procedure, in Section 3.1 we obtain the associated system of equations, the final form being reported in Eq. ( 23) below.However, since such system does not have an explicit solution, its resolution must be obtained by adopting numerical methods.

Solving the nonlinear system
Recalling that θ T = (θ 0 , θ 1 , . . ., θ p ) = (η, β 1 , . . ., β p ), the partial derivatives of LV are given by where Hence, the MLEs are the solutions of the following system of p + 2 nonlinear equations By defining the following quantities with l = 0, 1, . . ., p, the last p + 1 equations of the system (19) can be written as follows Substituting the expression ( 16) of m ξ in the previous equations, one has where, for any l = 0, 1, . . ., p, one has Hence, until now, the expression of the system solved by the MLEs is The first equation of system (21) can be further simplified.Indeed, by setting one has Consequently, the system (21) finally becomes Note that ( 23) is a system of p + 2 equations in the unknowns contained in ξ = (η, β 1 , . . ., β p , σ 2 ).
Remark 1 For the first equation of the system (23) in the unknown σ 2 , since Z 3 > 0, n > 0 and the only acceptable solution is Clearly, since in general system ( 23) cannot be solved analytically, then a numerical approach is needed.Specifically, we adopt the well-known Newton-Raphson method to solve (23) (for instance, see Dennis and Schnabel (1996) [7]).For such an iterative method, an initial approximation for the solutions of the system is needed.It can be obtained by a procedure similar to that used by Román-Román et al. (2019) [27].For the initial solution of the vector θ = (η, β 1 , . . ., β p ) T , by considering the multisigmoidal logistic function, i.e.
it can be supposed, without loss of generality, that t 0 = 0 (see Remark 2.1 of Di Crescenzo et al. (2020) [9]), so that Then, considering the sampling T defined in Section 3, consisting of d independent sample paths of the process X(t), for simplicity we suppose that any sample path of the process has the same number of observations, i.e. n i = N for any i = 1, . . ., d.However, the following remarks hold even in more general cases.Moreover, let m j be the values of the mean of the sample paths at the time t j , for j = 1, . . ., N , that is where x ij is the value of the i-th sample path at the time t j .
In general, the carrying capacity C/η is unknown.We suppose that the observations are available over a large time interval, such that the evolution of the population is terminated over such an interval.Hence, the carrying capacity C/η can be approximated with the last value of the sample mean m N .This approximation can be adopted also in the other cases, since it is used just to construct an initial solution for the parameters of the Newton-Raphson method for the estimate of θ.Thus, we can consider a polynomial regression for the pairs The coefficients ( β1 , . . ., βp , log η) of the approximating polynomial will be the initial values for the parameters (β, log η).Thus, the initial solution for η is given by η.
Finally, in order to construct the initial solution of σ 2 , let us now recall that for a lognormal distribution Y ∼ Λ 1 (α, δ), one has log Y ∼ N (α, δ), so that the quantity 2 log m m g gives an approximation for δ, where m and m g are respectively the arithmetic sample mean and the geometric sample mean of a random sample (y 1 , . . ., y n ) from Y .Hence, one has Since E[Y ] = e α+δ/2 is estimated by the sample mean m, we have As a consequence, in our setting an estimate for σ 2 0 + σ 2 t j is given by where m j and m g j denote respectively the arithmetic and the geometric sample mean of the observations performed at the time t j .Hence, an initial approximation for σ 2 can be obtained by performing a simple linear regression of σ 2 j − σ 2 0 against t j .In conclusion, in order to obtain the maximum likelihood estimates of the parameters contained in ξ = (η, β 1 , . . ., β p , σ 2 ), the steps of the proposed strategy to solve the system (23) are: (i) finding an initial solution for the parameters η and β with a polynomial regression of − log m N mj − 1 against t j , for any j = 1, . . ., N − 1; (ii) finding an initial solution for σ 2 with a simple linear regression of σ 2 j − σ 2 0 against t j , with σ 2 j = 2 log mj m g j , for any j = 1, . . ., N and where σ 2 0 can be obtained by means of the second of Eqs. ( 18); (iii) using the Newton-Raphson method to solve the system (23), with the initial solutions determined at steps (i) and (ii).
The adoption of the above strategy requires to start from good initial solutions for the unknown parameters.Unfortunately, even in this case it is not always possible to guarantee the convergence of this method.For this reason, recently various procedures have been proposed aimed at addressing the maximization of the likelihood function, by viewing this as a direct optimization problem.Indeed, there is a wide range of stochastic metaheuristic methods, which can be classified into two large families: those based on trajectories and those based on swarms.Hereafter, in Section 3.2 we employ one of the most widely used, the Simulated Annealing.This method requires necessarily to bound the parametric space, and this matter is the object of Section 3.2.2.

Maximizing the log-likelihood function
Let us now illustrate a strategy based on Simulated Annealing (S.A.) and finalized to obtain the MLEs for the parameters of the process (9).We first provide a brief description of this method in Section 3.2.1.Then, in Section 3.2.2we describe a suitable criterion to restrict the parametric space, this being essential to apply the S.A. method in the remainder of the paper.

Brief notes on Simulated Annealing
The aim of this section is to determine the MLEs by using the S.A. algorithm.The aforementioned method, introduced by Kirkpatrick et al. (1983) in [18], is a meta-heuristic optimization algorithm used for problems like finding arg min θ∈Θ f (θ).It is considered more suitable with respect to other numerical algorithms since it needs less restrictive conditions regarding the regularity of the domain Θ and the analytical properties of the objective function f .The algorithm works such that in every step a random point is chosen in the solution space.If the new solution is better than the previous one, then the latter is replaced.Otherwise, if the new solution is worse than the previous, then the latter may be replaced with a probability rate ρ = min{exp(−∆f /T ), 1} which depends on the increase of the objective function ∆f = f (ξ)−f (θ 0 ) and on a suitable scale factor T , that is named 'temperature' in agreement with the metallurgical process of annealing that inspired this algorithm.We recall that the S.A. is successful because it avoids local minima.In recent years it has been widely used in the context of estimation in diffusion processes (see, for example Luz Sant'Ana et al. (2018) [21] and Román-Román and Torres-Ruiz (2015) [30]).In this context, the algorithm works in the following way.It begins with an initial choice θ 0 for the parameters of interest, then ξ is generated from an uniform distribution in a neighborhood ν(θ 0 ) of θ 0 .Then, a new value θ 1 of θ is obtained in such a way ξ, with probability ρ θ 0 , with probability 1 − ρ.
Otherwise, if f (ξ) > f (θ 0 ), then ξ may be accepted anyway with probability ρ ∈ (0, 1).The temperature T is defined in such a way that at the beginning the probability of accepting ξ is high, and during the execution of the algorithm the function T decreases.The initial temperature T 0 must be sufficiently large so that the algorithm accept the solutions which let the objective function increases with a large probability p 0 .In literature, the choices of the initial parameters are usually p 0 = 0.9 and T 0 = −∆f + / log p 0 , where ∆f + denotes the average increase of the objective function in an application test where all the solutions which cause an increase are accepted.The cooling process which defines the temperature T is usually chosen of geometric type, i.e.T i = γT i−1 for i = 1, 2, . . . .Usually the constant γ is chosen among 0.8 and 0.99 in order to have a slow cooling procedure.In our case, we set γ = 0.95.In any iteration of the algorithm, a chain of L new solutions is obtained, for L = 50.As required, the algorithm stops when at least one of the following rules is satisfied: (i) the last L obtained values are equal, (ii) the maximum number of iterations (1000, in our case) is attained, (iii) the final temperature T F = 10 −7 is reached.

Bounding the parametric space
S.A. needs a restriction of the solution space Θ, namely the set which contains the parameters ξ = (η, β T , σ 2 ).Until now, this space is continuous and unbounded, since We consider 0 < σ < 0.1 so that the simulated sample paths are less variable around the sample mean, and thus the multisigmoidal logistic profile is advisable.For the parameters β = (β 1 , . . ., β p ) T , we find the confidence intervals by using the data of the polynomial regression performed previously to find the initial solutions.More in detail, it is known that the carrying capacity of the multisigmoidal logistic model with t 0 = 0 is l 0 1 + 1 η (see Eq. ( 5)).The carrying capacity can be approximated with the last value of the sample mean, whereas the initial value l 0 with the first value of the sample mean (24), so that one has From Eq. ( 25), it easily follows η ≈ Considering Eqs. ( 4) and ( 5), for t 0 = 0 one has Hence, by replacing η with its estimate η, we can use the resulting confidence intervals of the parameters of the polynomial regression as intervals of variation for the parameters β of the diffusion process.We adopt a confidence level equal to 0.999, to attain a high probability that the true parameters β belong to the computed intervals.
In order to approximate the range of variation of η, from Eq. ( 25) we have that the last value of the i-th sample path satisfies where x i,j with i = 1, 2, . . ., d and j = 1, 2, . . ., n i are the sample data.Hence, for the range of variation of η one has η ∈ (a, b), where In conclusion, the following bounded intervals are employed: • for β 1 , . . ., β p we consider the confidence intervals of the coefficients of the polynomial regression of − log m N mj − 1 η against t j , for j = 1, . . ., N , • for η we consider the interval I η = (a, b), with a and b defined in (26), • for σ 2 we consider the interval I σ 2 = (0, 0.01).

Asymptotic distribution of the MLEs
On the ground of the results given in Section 5 of Román-Román et al. (2018) [26], in this section we aim to determine the asymptotic distribution of the MLEs (i) of the parameters µ 1 , σ 2 1 of the initial distribution, and (ii) of the parameters ξ = (η, β T , σ 2 ) of the process.[26].
In the following section we address a relevant problem for the applications, namely the FPT problem of the diffusion process X(t) through a continuous boundary.Subsequently, in Section 5 we adopt a simulation-based approach as the basis of both computational methods described so far, namely the Newton-Raphson method and the S.A. method.The estimates of the parameters obtained through these methods are then used to perform inference on the FPT density.

First-passage-time problem
The FPT problem of a stochastic process X(t) through a boundary S(t) is a problem of great interest in many fields of application, such as medicine, biology or mathematical finance, since the threshold S(t) may represent a critical value of the modeled population size.Considering a stochastic process {X(t); t 0 ≤ t ≤ T }, the FPT of the process X(t) through the continuous boundary S(t), given X(t 0 ) = x 0 , is defined as the following random variable Finding the expression of the distribution of the variable T is hard in general.However, in literature there are several studies for particular types of processes, for example diffusion processes.It has been shown that if S(t) is a continuous and differentiable function, then the density of T , denoted by g (S(t), t|x 0 , t 0 ), solves a II-kind Volterra equation (cf.Eq. (2.4) of Buonocore et al. (1987) [4]).The aforementioned Volterra equation has an explicit solution only for certain special boundaries (see for example Sections 2.3 and 4.3 of Giorno and Nobile (2019) [15] in which the FPT density through special boudaries has been obtained for the restricted Gompertz-type diffusion processes).In certain instances, it is appropriate to adopt numerical procedures in order to approximate its solution.To this aim Buonocore et al. (1987) [4] proposed a simple but efficient algorithm, based on the composite trapezoidal formula.More in detail, Theorem 4 of Buonocore et al. (1987) [4] proves the convergence of the approximated FPT density to the theoretical one.However, the application of the proposed numerical procedure requires (i) the choice of a suitable step h of integration which ensures a good approximation of the real solution, (ii) the choice of an initial time instant t 0 and (iii) the choice of the final time instant T = t 0 + N h.Román-Román et al. (2008) in [25] studied the problems related to the practical application of the numerical procedure.The first problem is linked with a suitable choice of h.Indeed, taking into account the result of Theorem 4 of Buonocore et al. (1987) [4], it is easy to note that the convergence is ensured when h → 0 + .Consequently, the value of h should be small enough, but sufficiently far from 0. Indeed, if h is excessively small, then the computational cost may increase in vain because, with a larger integration step, a similar approximation may be obtained with a smaller number of iterations.On the other hand, if h is excessively large, the approximation may be unsatisfactory.These problems depend on the localization of the FPT T , and may be solved if the range of variation of T is known.For this reason, Román-Román et al. (2008) in [25] introduced a function, called 'FPT location' (FPTL), finalized to obtain, from a heuristic point of view, the range of variation of T .Specifically, the FPTL function is defined as follows where F (x, t|x 0 , t 0 ) is the transition distribution of the process X(t).Referring to the diffusion process with infinitesimal moments given by Eq. ( 7), and by Table 1 The mean, the standard deviation, the mode, the first, the fifth and the ninth decile of the FPT of the process X(t considering a fixed and constant boundary S > x 0 , given X(t 0 ) = x 0 , the FPTL function for the process X(t) is given by where Φ is the standard normal distribution and The information provided by the FPTL function is relevant for an efficient application of the algorithm proposed by Buonocore et al. (1987) in [4].Indeed, thanks to the FPTL function, an adaptive step of integration can be obtained.
In this way, the execution time of the algorithm is reduced.
Other useful quantities related to the FPT density are given in Table 1.Diffusion process: {X(t), t in [0, 50]}, P(X(0 Fig. 4 (a) The FPTL function and (b) the approximated FPT density of the process X(t) through the constant boundary S = 15, for the same assumptions of Figure 3.

Simulation
In Section 3, two procedures have been introduced to obtain the MLEs of the parameters involved in the diffusion process (9).The former procedure is based on the numerical resolution of a system of nonlinear equations, whereas the latter is based on the application of S.A. algorithm.In this section, a simulation study is developed to verify the validity of the two aforementioned procedures.We consider the diffusion process X(t) with infinitesimal moments (7), for p = 3, and β 1 ∈ {0.1, 0.5}, β 2 ∈ {−0.009, −0.007}, β 3 ∈ {0.0002, 0.0004}, η ∈ e −1 , e −3 and σ ∈ {0.01, 0.05}.These choices of the parameters are performed arbitrarily, to obtain different patterns of the growth curve.For example, the choice β 1 = 0.1, β 2 = −0.009,β 3 = 0.0002, η = e −1 refers to the case of a non monotonous multisigmoidal logistic function, whereas the choice β 1 = 0.1, β 2 = −0.007,β 3 = 0.0003 and η = e −1 to the case of an increasing multisigmoidal logistic curve (see Figure 5).To estimate the parameters in ξ, we consider the 32 combinations of the values of the parameters listed in Table 2, with x 0 = 5 in every case.For each case, we simulate 200 sample paths of X(t), by generating 501 simulated points at equidistant times for 0 ≤ t ≤ 50.
The remainder of this section is organized as follows: (a) since the degree of the polynomial Q β is unknown a priori, we propose the use of the strategy described in Román-Román et al. (2019) [27], by increasing the degree until the goodness of fit is optimal; (b) considering the degree obtained at the step (a), we use the two procedures described in Sections 3.1 and 3.2 to find the MLEs of the parameters.
The choice of the best degree of the polynomial Q β is performed under the goodness of fit criteria based on the four following measures:  (i) the absolute relative error (RAE) between the sample mean and the estimated mean, i.e.
where Ê(X (p) (t i )) denotes the mean of the estimated process considering a polynomial Q β of degree p; (ii) the Akaike information criterion (AIC), which is defined as (iii) the Bayesian information criterion (BIC), which is given by where n represents the number of observations, (iv) the resistor-average distance (D RA ) between the sample distribution f C and the p-th estimated distribution f Sp , for p = 2, . . ., 6, which is defined as the following harmonic mean (cf.Johnson and Sinanovic (2001) [17]): where D KL denotes the Kullback-Leibler divergence.Assuming that the sample distribution is lognormal with parameters and that the estimated distribution is lognormal with parameters the Kullback-Leibler divergence between the sample distribution f C and the p-th estimated distribution f Sp for p = 2, . . ., 6 is given by, for any with H ξ (t 0 , t) defined in (10).Clearly, if the theoretical distribution of the process is known, one can alternatively compute the resistor-average distance between the theoretical and the estimated distribution.We consider the expected distance and the median of the distance as reference values for the resistor-average distance.
In cases (ii) and (iii), the stochastic model is characterized by p + 2 parameters.Moreover, L V (α, ξ) is defined in (17), and α and ξ are the MLEs of the parameters α and ξ.The best fit is attained for the smallest value of the considered goodness measures.Table 3 shows the estimated parameters for the case no. 1 of Table 2, which is obtained by solving the system (23) for different degrees of the polynomial Q β .Furthermore, the results about the goodness of measures are given in Table 4 and in Figure 5.It can be noticed that the estimated parameters for p = 3 and p = 4 are almost identical, and that β 4 is very close to zero in the case p = 4. Hence, the results concerning the measures of goodness obtained in these two cases are quite similar.This conclusion is also confirmed by the analysis of the RAE measures (in Table 4), that are often used to measure the fit error of the model in terms of the fit of the mean function.The analysis is performed in terms of the scale of judgment of the model accuracy based on the Mean Absolute Percentage Error (MAPE), cf.Klimberg et al. (2010) [19] and Lewis (1982) [20].Indeed, the judgment suggested by the MAPE shows that p = 3 and p = 4 are referred as highly accurate, whereas p = 2 is evaluated as good forecast, with both p = 5  and p = 6 considered as reasonable forecast.Consequently, the choice p = 3 is taken as the best, since it involves the lowest number of parameters.
The same result can be obtained for the other parameters choices, but it is omitted for brevity.Here, we limit to mention that the AIC and its Bayesian version, the BIC, provide a global measure of the adjustment to the model in terms of the likelihood that the model itself gives to the observed sample, so that these measures also allow for model selection criteria.The AIC and the BIC are seen often as complementary measures to the use of the Resistor Average Distance between the sample and the theoretical distributions of the model.However, there is no criterion that indicates that one measure is better than another, and thus in general the use of several alternative measures is recommended, as usual in practical applications.In our analysis, the coincidence of the conclusions suggested by these measures supports the final decision.Hence, from now on, a polynomial of degree p = 3 will be considered.
Table 5 shows the estimated values of the parameters obtained by solving the nonlinear system (23) by means of the Newton-Raphson method.These values provide good parameters estimates, especially when σ is small.The last column of the Table 5 contains the RAE.In this case, it is defined as where m i are the values of the sample mean and Ê(X (3) (t i )) are the values of the estimated mean at the time t i considering a polynomial of degree p = 3.For a comparison between σ or η and the RAE, see Figure 7 (a)-(b): it can be noticed that the value of the RAE shows an increasing trend with respect to the parameter σ, whereas it shows a constant trend with respect to η.In Figure 8 (a)-(b) the theoretical, sample and estimated sample means for the parameters choices number 1 and 2 of the Table 2 are shown.Clearly, the best estimation is obtained when σ is small.Further on, the estimated values obtained via S.A. are given in Table 6, whose last column contains the value of the RAE defined in Eq. ( 28).Since S.A. is a heuristic algorithm, the M LEs have been computed as the average of the results obtained by 10 uses of the procedure.Figures 8 (c)-(d) provide the theoretical, the sample and the estimated (via S.A.) means for the cases no. 1 and no. 2 of the Table 2.In addition, in Figure 7 (c), the trend of the RAE is plotted as a function of the number of replications: clearly, the goodness of  the results improves as the number of replications increases.Moreover, Table 7 contains the estimated values of the parameters (obtained by solving the system ( 23)), as well as their real values and the asymptotic estimation error.Finally, Table 7 provides various confidence intervals obtained by applying the delta method and using the distribution given in Section 3.3 for the case no. 1 of Table 2.

Approximation of FPT density
In this section, the FPT problem is analyzed.With reference to a diffusion process X(t) with a multisigmoidal logistic mean and Q β (t) = 0.1t − 0.009t 2 + 0.0002t 3 , η = e −1 and σ = 0.01, we construct 50 simulated sample paths (see Figure 9-(a)), each one being formed by 361 data simulating X(t i ) for t i = (i − 1) 0.1, i = 1, . . ., 361.As in Section 5, we first chose the optimal polynomial degree (which corresponds to the best fit), and then we found the MLEs of the parameters by solving the system (23).Further on, the R package fptdApprox is used to approximate the FPT density of the process through a constant threshold S = 15.Table 8 provides the estimated parameters, whereas Figure 9-(b) shows the theoretical, the sample and the estimated means, for p = 2, 3, 4, 5, 6.Table 9 provides the four goodness measures for the considered degrees p of the polynomial Q β .Figure 10 shows the resistor-average distances between the theoretical and the estimated distributions, and also between the sample and the estimated distributions.From the given results it follows that the best degree is p = 3.Using the estimated model obtained so far, we now focus on the approximation of the FPT density through the boundary S = 15.Figure 11 shows the approximated FPT density and the FPTL function realized with the package Table 7 The estimated values of the parameters (obtained by solving the system (23)), their real values, their asymptotic estimation error and their 95%, 90% and 75% confidence intervals (simulation study).Diffusion process: {X(t), t in [0, 50]}, P(X(0 Fig. 11 The approximated FPT density and the FPTL function of the process X(t) through the boundary S = 15 (simulation study -FPT problem).

Application to real data
Multisigmoidal functions are suitable to model several special growth phenomena in which the carrying capacity is reached after various stages.In any of these stages a linear growth trend is followed by an explosion of exponential type which finally flattens to a specific value.A growth of this kind is typical of some fruit species, such as peaches or coffee berries (see, for instance the application given in Section 3 of Di Crescenzo et al. (2020) [9]).But also some population diseases follow an expansion with a multisigmoidal trend.In this section we apply the considered stochastic model to data concerning the COVID-19 infections in four different European countries, taken from [40].This is just an example finalized to show an application of the multisigmoidal logistic model, without taking into account specific more sophisticated models that describe epidemiological phenomena with greater precision.First of all, we note that the trend of infections in France, Italy, Spain and United Kingdom is similar (see Figure 12-(a)).This suggests to view these data as different trajectories of the diffusion process X(t) defined on I = [t 0 , t f ], having a multisigmoidal logistic mean (cf.Section 2.1).Hence, in order to find the MLEs of the parameters, we apply the procedure described in Section 3.1.For each country, the initial time t 0 = 0 corresponds to the 30-th day after the one in which the number of infections exceeded 100 (March 30th for France, March 24th for Italy, March 21st for Spain, April 5th for UK), and the final time is chosen as t f = 250.For any path, the data are scaled as divided by their maximum value, so to be interpreted as a percentage of  Table 13 The estimated values, the standard error and the 95%, 90% and 75% confidence intervals for the parameters, considering p = 3 (real application).the last and therefore the maximum value of the growth curve.The estimated means obtained for different degrees are plotted in Figure 12-(b).Table 11 provides the initial and the estimated values of the parameters, whereas Table 12 shows the four measures of goodness, for different degrees of the polynomial.Regarding the RAE, every time the degree increases, the approximation improves, whereas the AIC, the BIC and resistor-average distance show that the best choice is p = 3. Regarding the last measure of goodness, see also Figure 13-(a), in which the resistor-average distances between the sample and the estimated distributions are provided.Hence, in view of the results obtained for the measures of goodness, the degree p = 3 is considered.Table 13 shows the estimated values of the parameters, the estimation of their standard error and the 95%, 90% and 75% percentiles.Moreover, the α-percentiles (13) of the estimated diffusion process with p = 3 are provided in Figure 13  Let us now consider a restricted time range from t 0 = 0 to t f = 246, in order to predict the trend of the growth curve in a short-term prediction analysis.Indeed, forecasting the number of infections during a disease in progress is interesting also in the case of short terms, especially for the goodness of estimation (better in this case than in the long term analysis) and for the timeliness of the results.The considered procedure is the same of the one used above, so (i) the best degree p for the polynomial Q β is chosen by considering various measures of goodness, and (ii) the estimated values of the parameters are used to construct a diffusion process X(t) defined on I = [0, 250].The estimated values of the parameters are given in Table 14, the values of the four measures of goodness are given in Table 15, and finally in Figure 14-(a) we provide the resistor-average distances between the restricted sample and the estimated distributions.See also Figure 14-(b) for the plots of the estimated means for different degrees of the polynomial.Also in this case, the RAE does not provide a good measure of goodness, since every time the degree increases, the approximation improves.From the remaining results, the Table 18 The RAE, the AIC, the BIC, the median and the mean of the resistor-average distance D RA of the parameters for p = 2, 3, 4, 5, 6.For the resistor-average distance, the estimated and the sample distributions are considered (real application -FPT problem).4. The resulting FPT density is then compared to the approximated FPT density obtained by using the whole data set.More in detail, we consider only the first 220 data of COVID-19 infections in the restricted time range I R = [0, 219] and we investigate the best model to fit them.The choice of the optimal degree p of the polynomial Q β is based on the measures of goodness (i)-(iv) described in Section 5.The estimated parameters (given in Table 17) are obtained by solving the system (23).By comparing the results given in Table 18 and in Figure 15-(b), we choose p = 3 as the optimal degree.Then, we fix a constant threshold S = 0.7 which corresponds to the 70% of the last and maximum data in the complete time range I C = [0, 250] and we use the R package fptdApprox to obtain an estimation of the FPT density.The choice of the constant boundary S = 0.7 is not random.Indeed, it is worth observing that the descendent inflection points correspond to the peaks of the function representing the daily increments of the infections.More in detail, by means of Eq. ( 6), the function representing the sample mean of the infections shows two descendent inflection points, one at the time t F 1 = 21.65 and the other at the time t F 2 = 220.66.The population sizes corresponding to the inflection time instants are S F 1 = 0.08 and S F 2 = 0.7.In the time interval [0, t F 1 ], the mean function has a logistic trend, hence the FPT problem through the boundary S F 1 is beyond the scope of the present work.Instead, since the mean in the time interval [0, t F 2 ] has a multisigmoidal logistic profile, we focus our attention to the FPT problem through the threshold S = S F 2 = 0.7.
The approximated FPT density and the FPTL function of the estimated process X(t) through the boundary S = 0.7 are plotted in Figure 15(c)-(d).In order to validate the predicted results concerning the FPT, we consider also the same problem in the complete time range I C .The forecasted results for the restricted time range I R and the approximated results for the complete time range I C are given in Table 19.We note that the most meaningful index Table 19 The mean, the mode, the 1st and the 5th deciles and the standard deviation of the FPT in the complete time range I C and in the restricted time range I R (real application -FPT problem).order to model more complex population dynamics in which the maximum level of the growth is reached after many stages, we referred to the multisigmoidal logistic stochastic growth model.More in detail, the present work has been devoted to the analysis of the corresponding statistical inference and of the FPT problem.Two procedures useful to find the MLEs of the parameters have been described, one based on the resolution of the system of the critical points of the likelihood function, and the other one based on the maximization of the likelihood function by means of the S.A. algorithm.Then, the described strategies have been validated with a simulation study.The last section of the paper has been devoted to a real application concerning COVID-19 infections in four different European countries (France, Italy, Spain and United Kingdom).The data have been fitted using a suitable multisigmoidal logistic stochastic model.Finally, a study regarding the FPT through a fixed boundary has been also performed.Future developments can be oriented to find the MLEs of the parameters with other meta-heuristic optimization procedures (such as Variable Neighborhood Search or other swarm-based algorithms) in order to obtain nice estimates in a short computational time.We aim also to introduce a more sophisticated model suitable to describe better epidemiological dynamics with multiple waves, starting from the multisigmoidal logistic equation.Moreover, aiming at a thorough analysis of the convergence speed for parameter estimation in stochastic differential equations, these approaches will be compared with applications of the recent method called 'covariance matrix adaptation evolution strategy'.Indeed, the latter is used often in the presence of several parameters (cf., for instance, Ghosh et al. [13] and Willjuice and Baskar [37]).

Fig. 6
Fig. 6 The resistor average distance between (a)-(b) the sample and the estimated distribution, and (c)-(d) the theoretical and estimated distribution for the case 1 of Table2, for different degrees of the polynomial (simulation study).

Fig. 8
Fig. 8 The theoretical, sample and estimated means of the process X(t) for the parameters of the cases number 1 and 2 of Table 2 (from left to right).The results are obtained via Newton-Raphson method in (a) and (b) and via S.A. in (c) and (d).(Simulation study).

6 8Fig. 10
Fig.10Resistor-average distance between (a) the theoretical and the estimated distributions and (b) between the sample and the estimated distributions for the FPT density approximation (simulation study -FPT problem).

Fig. 13 (
Fig.13(a) The resistor-average distances between the sample and the estimated distributions considering different degrees of the polynomial.(b) The α-percentiles of the estimated diffusion process X(t) obtained for a degree p = 3 and for α = 95, 90, 75 (real application).
through the boundary S = 15.

Table 2
The values of the parameters (simulation study).

Table 3
(23)estimated parameters obtained by solving system(23)for different degrees of the polynomial Q β (simulation study).

Table 4
The goodness measures for different degree.For the resistor-average distance D RA , the estimated and the theoretical distributions are considered (simulation study).

Table 5
The estimated values of the parameters obtained by solving the system (simulation study).

Table 6
The estimated values of the parameters obtained via S.A. (simulation study).

Table 8
The estimated values of the parameters considering p = 2, 3, 4, 5, 6 for the FPT density approximation (simulation study -FPT problem).Finally, in Table10other useful quantities related to the FPT density are provided.It is worth noting that the results obtained in this section are in agreement with those given in Example 4.1.

Table 9
The RAE, the AIC, the BIC the median and the mean of the resistor-average distance D RA of the parameters for p = 2, 3, 4, 5, 6.For the resistor-average distance, the estimated and the theoretical distributions are considered (simulation study -FPT problem).

Table 11
The initial and the estimated values of the parameters considering different degrees.The results have been obtained by solving the system (real application).

Table 12
RAE, AIC, BIC, the median and the mean of the resistor-average distance D RA considering different degrees.For the resistor-average distance, the estimated and the sample distributions are considered (real application).

Table 17
The estimated values of the parameters considering p = 2, 3, 4, 5, 6 in the restricted time range I C (real application -FPT problem).