1 Introduction

A unique characteristic of the European labour market was pointed out by Olivier Blanchard and Lawrence Summers in 1986. According to them, standard macroeconomic theory predicts that demand and supply shocks may unexpectedly cause some deviations of the actual unemployment rate from its equilibrium level. However, the actual unemployment rate would eventually return to its equilibrium level in the long run (Blanchard and Summers 1986a, 1986b). This hypothesis is known as the natural rate hypothesis, which is one of the central ideas of theoretical macroeconomics (Song and Wu 1998). In the European labour market, by contrast, shocks are more persistent than the standard theory would predict, and high unemployment does not seem to return to its equilibrium level. Blanchard and Summers theorised this empirical regularity of European unemployment as the hysteresis hypothesis (Blanchard and Summers 1986a; Blanchard and Summers 1986b; Mitchell 1993; León–Ledesma 2002; Camarero and Tamarit 2004).

To test the hysteresis hypothesis empirically, researchers have used various types of unit root tests. A summary of the major empirical findings is reported in Table 1. As the table indicates, there are four different periods of empirical analysis, which are distinguished by the dominant method for the analysis of unemployment hysteresis. During the first period, from 1986 to 1996, researchers used standard unit root tests, such as the augmented Dickey-Fuller (ADF) test (Dickey and Fuller 1979, 1981) or the Phillips-Perron (PP) test (Phillips and Perron 1988). They observed the presence of hysteresis in the time-series data (Blanchard and Summers 1986b; Neudorfer et al. 1990; Brunello 1990; Mitchell 1993; Røed 1996).

Table 1 Summary of major empirical findings on unemployment hysteresis

In the second period (1998–2004), the dominant method for the empirical analysis of unemployment hysteresis changed from the individual time series-based unit root tests of the previous period to panel data-based unit root tests. Researchers used major panel unit root tests, such as the Levin-Lin-Chu (LLC) test (Levin et al. 2002), the Im-Pesaran-Shin (IPS) test (Im et al. 2003) and the multivariate ADF (MADF) test (Sarno and Taylor 1998). They generally concluded that there was no hysteresis in the panel data on unemployment rates (Song and Wu 1998; León–Ledesma 2002; Smyth 2003; Camarero and Tamarit 2004).

Furthermore, during the third period from 2004 to 2010, a popular method for empirical analysis of unemployment hysteresis was the unit root test with structural breaks. Researchers used several different types of structural break unit root tests, such as the ADF test with structural breaks (Lumsdaine and Papell 1997), the panel KPSS test with structural breaks (Carrion-i-Silvestre et al. 2005) and the panel LM test with structural breaks (Im et al. 2005). They concluded that there was no hysteresis in the unemployment rate (Camarero et al. 2005, 2006; Lee et al. 2009, 2010).

During the fourth and most recent period from 2011 to the present, researchers started to use the nonlinear unit root test to take account of unknown nonlinearity and they produced mixed results (Chang 2011; Bolat et al. 2014; Furuoka 2017; Akay et al. 2020). Some researchers detected the presence of hysteresis in unemployment rates (Chang 2011; Akay et al. 2020) while other researchers denied the presence of unemployment hysteresis (Bolat et al. 2014; Furuoka 2017; Yaya et al. 2019; Awolaja et al. 2021; Cheng 2022). For example, Chang (2011) examined unemployment hysteresis in OECD countries by using the Fourier KPSS test (Becker et al. 2006) and concluded that unemployment hysteresis existed in these countries. By contrast, Bolat et al. (2014) analysed the unemployment hysteresis in European countries using the panel KSS test without the Fourier function (Ucar and Omay 2009) and the panel KSS test with the Fourier function (Chang and Chang 2012). The panel KSS test without the Fourier function indicated that unemployment hysteresis existed in these countries, while the panel KSS test with the Fourier function showed that there was no unemployment hysteresis. Furthermore, Furuoka (2017) used the Fourier ADF with structural break (FADF-SB) test to examine the unemployment hysteresis in the Nordic countries and concluded that there was no unemployment hysteresis in these countries. However, Akay et al. (2020) examined the unemployment hysteresis in transition economies using the Kruse test (Kruse 2011) and the Fourier Kruse test (Güriş 2019) and they detected unemployment hysteresis in these countries.

Despite numerous research efforts, researchers could not establish empirically whether there was hysteresis in unemployment rates. As the summary in Table 1 indicates, researchers failed to produce consistent empirical findings on this crucial topic in macroeconomics. To overcome the inconsistency in the empirical findings from the unit root approach, Luis A. Gil-Alana suggested using the fractional integration approach for the analysis of unemployment hysteresis. Following Diebold and Rudebusch (1991), Hassler and Wolters (1994) and Lee and Schmidt (1996), he pointed out that the standard unit root approach considers only a restrictive dichotomy between the unit root process, \(I\left( 1 \right)\), and the stationary process, \(I\left( 0 \right)\). In other words, the traditional approach may ignore differences in unemployment dynamics that can be captured by the fractional integration approach, \(I\left( d \right)\). Thus, he suggested that Robinson’s Lagrange Multiplier (LM) method (Robinson 1994) could be used for the empirical estimation of the fractional integration parameter (d) in the unemployment dynamics (Gil-Alana 2001a, 2001b, 2002). There are three possible ranges of the fractional integration parameter, in line with the major hypotheses on unemployment rates. Firstly, if the estimated fractional integration parameter is in the range (0, 0.5), unemployment rates are stationary and mean-reverting in line with the natural rate hypothesis. Secondly, if it lies in the range [0.5, 1.0), unemployment rates are non-stationary but mean-reverting in line with the persistence hypothesis. Thirdly, if it lies in the range [1.0, ∞), unemployment rates are non-stationary and non-mean-reverting in line with the hysteresis hypothesis (Caporale and Gil‐Alana 2007; Cuestas et al. 2011; Caporale and Gil‐Alana 2018).
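As a rough illustration of this classification, the mapping from an estimated value of d to the three hypotheses can be sketched in Python. The function name and the label strings are hypothetical choices made for this example. Values of d at or below zero fall outside the three ranges listed above and are flagged separately, which is an assumption of the sketch rather than part of the cited classification.

```python
def classify_unemployment_dynamics(d: float) -> str:
    """Map an estimated fractional integration parameter d to a hypothesis."""
    if d <= 0.0:
        # Assumption: not covered by the three ranges given in the text.
        return "outside the listed ranges (d <= 0)"
    if d < 0.5:
        return "natural rate hypothesis (stationary, mean-reverting)"
    if d < 1.0:
        return "persistence hypothesis (non-stationary, mean-reverting)"
    return "hysteresis hypothesis (non-stationary, non-mean-reverting)"
```

In practice the decision would rest on a confidence interval for d rather than on the point estimate alone, since an interval may straddle two of the ranges.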

More recently, a new unit root approach for the analysis of unemployment dynamics was suggested by Yaya et al. (2021). The authors proposed a novel ADF-type unit root test based on the autoregressive neural network (ARNN) framework. The methodological advantage of this ARNN-ADF test is that the hidden layer of the neural network can capture the latent structure of unemployment dynamics in terms of nonlinearities. According to the results of their Monte Carlo simulation, the ARNN-ADF test suffers less from size distortion than the standard unit root test (Yaya et al. 2021). Also, because samples of unemployment data are short, often being obtained annually for a cross-section of countries, a good nonlinear approximator, such as the ARNN nonlinear function, is required in the unit root testing framework for unemployment hysteresis. Introducing it in the empirical analysis of unemployment data is expected to reduce substantially the bias that arises from having few annual time series observations.

Thus, the main objective of the current study is to propose a new fractional integration approach, the autoregressive neural network fractional integration (ARNN-FI) testing framework. It is an updated version of Robinson’s (1994) LM method that is based on ARNN nonlinearity rather than on the original linear ADF regression.

Our contribution to the literature is threefold. First, we propose a fractional integration test based on the ANN nonlinearity testing framework and provide a theoretical background for it. Second, we evaluate the size and power of the testing procedure in comparison with extant unit root tests, such as the recently proposed ARNN-ADF unit root test by Yaya et al. (2021), in a Monte Carlo simulation exercise. Third, we apply the new test to empirically ascertain the stationarity stance of unemployment rates in selected countries.

Following this introductory section, the theoretical properties of the new fractional integration approach are given in Sect. 2 and Sect. 3 presents a supporting asymptotic theory. Section 4 displays the simulation analysis, while Sect. 5 contains an empirical application to unemployment rates in selected European countries; finally, the overall conclusion is given in Sect. 6.

2 A new fractional integration approach

Testing for unit roots has become standard practice in the empirical analysis of economic data because econometric analysis of economic time series tends to rely on stationary time structures (Box and Jenkins 1976). The seminal paper by Dickey and Fuller (1979) on the Augmented Dickey-Fuller (ADF) unit root test led to the development of many other unit root tests in the literature over the last thirty years. These include those of Phillips and Perron (1988), Kwiatkowski et al. (1992), Elliott et al. (1996), and Ng and Perron (2001), among others. However, it is a familiar stylized fact that most unit root methods have very low power when the alternatives include structural breaks (Perron 1989; Campbell and Perron 1991), fractional unit roots (Diebold and Rudebusch 1991; Hassler and Wolters 1994; Lee and Schmidt 1996), regime-switching (Nelson et al. 2001), or more general nonlinear structures (Enders and Granger 1998).

One way to overcome these limitations is to incorporate nonlinearities in the deterministic components of the auxiliary regressions of the unit root tests, as a proxy for structural breaks. Fractional integration and structural breaks are intimately related and can be likened to nonlinearities in the time series, yet either nonlinearity or fractional integration is shown to dominate the other in some cases (van Dijk et al. 2002; Gil-Alana and Yaya 2021); hence, a powerful nonlinear approximator is needed. Franses and van Dijk (2000) emphasized nonlinear time series models as more appropriate tools for explaining and predicting economic time series. Several approaches have attempted to integrate nonlinear dynamics into the unit root testing framework; these include Caner and Hansen (2001), Shin and Lee (2001) and Kapetanios et al. (2003). In another study, Allen et al. (2016) extended the unit root tests to nonlinear models for the analysis of exchange rate movements. Furthermore, Trapletti et al. (2000) introduced the Autoregressive Neural Network (ARNN) process, driven by additive noise, and examined its stationarity behaviour. They further showed that the nonlinear unit root test examined would be satisfactory if the activation function of the ARNN is bounded. Earlier, Steurer (1996) demonstrated empirically that neural networks work best only for stationary data.

The artificial neural network (ANN) is a parametric approximator of other nonlinear time series models, such as the Threshold Autoregressive (TAR), Smooth Transition Autoregressive (STAR), Markov Switching (MS), and Bilinear models (Franses and van Dijk 2000). Thus, it is uncommon to regard an ANN model as the Data Generating Process (DGP) of a time-dependent system. Instead, in testing for neural network-type nonlinearity, the ANN serves as a universal nonlinear approximator, as it induces stronger nonlinearities than other extant nonlinear time series models (Lee et al. 1993).

So far, unit root tests based on ANN nonlinearity are scarce in the literature. A recent test, the ARNN-ADF unit root test, was proposed by Yaya et al. (2021). The testing procedure relies on the linear, quadratic, and cubic components of the neural network process that induce nonlinearity in the ARNN-ADF test. The authors applied the simplest form of the ANN model in the ARNN-ADF test regression.

In the present paper, we extend the ARNN-ADF unit root test of Yaya et al. (2021) to a fractional unit root framework, relying on the fact that the classical unit root tests, as well as other Dickey-Fuller-like tests, have very low power against fractional unit root alternatives. Thus, the proposed fractional integration test is based on the fractional integration approach using the ANN framework and is more general than the existing ARNN-ADF unit root test. This new fractional integration approach is based on the following model:

$$ y_{t} = \mathop \sum \limits_{p = 1}^{r} \theta_{p} F(\gamma_{p}^{\prime } w_{t} ) + x_{t} , \quad t = 1,{ }2,{ } \ldots ,T, $$
(1)

where \(y_{t}\) is the time series under investigation, \(F\left( {\gamma_{p}^{\prime } w_{t} } \right)\) is the ANN nonlinear function at time t, with \(\gamma_{p}\) and \(w_{t}\) defined later, \(\theta_{p}\), \(p = 1, \ldots ,r\), are the “connector strength” parameters and \(x_{t}\) is the fractionally integrated process given by:

$$ \left( {1 - L} \right)^{d} x_{t} = u_{t} ,\quad t = 1,{ }2,{ } \ldots ,T, $$
(2)

where L is the usual lag operator, such that \(L^{k} x_{t} = x_{t - k}\) for every integer lag k, d is the fractional integration parameter defined in the interval \(- 0.5 < d < 2\), which includes the moving average invertibility and nonstationary ranges of time series (Sowell 1992), and \(u_{t}\) is a covariance stationary I(0) process, assumed to be independently and identically distributed with mean 0 and variance \(\sigma_{u}^{2}\). We suppose \(x_{t} = 0\) for \(t \le 0\), following the Type II definition of fractional integration as in Marinucci and Robinson (1999). For the fractional integration parameter, at \(d = 0\) in (2), \(x_{t} = u_{t}\), while at \(d = 1\) and \(d = 2\) we have the respective differenced transformations \(x_{t} - x_{t - 1} = u_{t}\) and \(x_{t} - 2x_{t - 1} + x_{t - 2} = u_{t}\). The fractional difference operator \((1 - L)^{d}\) is expressed by the Maclaurin series as:

$$ \left( {1 - L} \right)^{d} = \sum\limits_{k = 0}^{\infty } {\frac{{\Gamma \left( { - d + k} \right)}}{{\Gamma \left( { - d} \right)\Gamma \left( {k + 1} \right)}}} L^{k} , $$
(3)

where \(\Gamma \left( . \right)\) is the Gamma function. Substituting (3) into (2), \(u_{t}\) can be expressed as:

$$ u_{t} = \sum\limits_{k = 0}^{\infty } {\frac{{\Gamma \left( { - d + k} \right)}}{{\Gamma \left( { - d} \right)\Gamma \left( {k + 1} \right)}}} x_{t - k} $$
(4)
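To make Eqs. (3) and (4) concrete, the following Python sketch computes the coefficients of \((1 - L)^{d}\) and applies them to a finite sample. It assumes NumPy, and the function names `fracdiff_weights` and `fracdiff` are hypothetical. Instead of evaluating the Gamma functions directly, it uses the equivalent recursion \(\pi_{0} = 1\), \(\pi_{k} = \pi_{k - 1} (k - 1 - d)/k\), and it truncates the sum in Eq. (4) at the start of the sample because \(x_{t} = 0\) for \(t \le 0\).

```python
import numpy as np


def fracdiff_weights(d, n):
    """First n coefficients pi_k of (1 - L)^d in Eq. (3).

    pi_k = Gamma(k - d) / (Gamma(-d) * Gamma(k + 1)), computed through the
    recursion pi_k = pi_{k-1} * (k - 1 - d) / k to avoid overflow.
    """
    k = np.arange(1, n)
    return np.concatenate(([1.0], np.cumprod((k - 1 - d) / k)))


def fracdiff(x, d):
    """Apply (1 - L)^d to x as in Eq. (4), with x_t = 0 before the sample."""
    x = np.asarray(x, dtype=float)
    weights = fracdiff_weights(d, len(x))
    # Element t of the convolution is sum_k pi_k * x_{t-k}.
    return np.convolve(weights, x)[: len(x)]
```

For \(d = 1\) the weights reduce to (1, −1, 0, …), so the filter returns first differences (with the first observation left unchanged because of the zero pre-sample values); for \(d = 0.5\) the leading weights are 1, −0.5, −0.125 and −0.0625.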

The function \(F\left( {\gamma_{p}^{\prime } w_{t} } \right)\) in (1) is known as the hidden unit of the ANN. It is a logistic function, bounded between 0 and 1, from which 1/2 is subtracted so that it equals zero when its argument is zero:

$$ \begin{aligned} & F\left( {\gamma_{p}^{\prime } w_{t} } \right) = \left\{ {1 + \exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)} \right\}^{ - 1} - \frac{1}{2}, \\ & \quad \quad = \frac{1}{{1 + \exp \left[ {\left( {c_{1} - \gamma_{11} y_{t - 1} - \gamma_{12} y_{t - 2} - ... - \gamma_{1p} y_{t - p} } \right)} \right]}} - \frac{1}{2} \\ \end{aligned} $$
(5)

where \(\gamma_{p} = ( - c_{1} ,\gamma_{11} , \ldots ,\gamma_{1p} )^{\prime}\) is a \(\left( {p + 1} \right) \times 1\) vector of parameters of the hidden units and, consistent with Eq. (5), \(w_{t} = (1,y_{t - 1} , \ldots ,y_{t - p} )^{\prime}\).

These hidden units are then approximated using a third-order Taylor series expansion of the logistic function (see Footnote 1):

$$ \begin{aligned} & F\left( {\gamma_{p}^{\prime } w_{t} } \right) = F\left( {\gamma_{p}^{\prime } w_{t}^{0} } \right) + \mathop \sum \limits_{i = 0}^{p} \frac{{\partial F\left( {\gamma_{p}^{\prime } w_{t}^{0} } \right)}}{{\partial \gamma_{i} }}\gamma_{i} + \frac{1}{2!}\mathop \sum \limits_{i = 0}^{p} \mathop \sum \limits_{j = 0}^{p} \frac{{\partial^{2} F\left( {\gamma_{p}^{\prime } w_{t}^{0} } \right)}}{{\partial \gamma_{i} \partial \gamma_{j} }}\gamma_{i} \gamma_{j} \\ & \quad \quad + \frac{1}{3!}\mathop \sum \limits_{i = 0}^{p} \mathop \sum \limits_{j = 0}^{p} \mathop \sum \limits_{l = 0}^{p} \frac{{\partial^{3} F\left( {\gamma_{p}^{\prime } w_{t}^{0} } \right)}}{{\partial \gamma_{i} \partial \gamma_{j} \partial \gamma_{l} }}\gamma_{i} \gamma_{j} \gamma_{l} + \cdots + R_{h} \left( {\gamma_{p} ,w_{t} ,w_{t}^{0} } \right) \\ \end{aligned} $$
(6)

where \(R_{h} \left( {\gamma_{p} ,w_{t} ,w_{t}^{0} } \right)\) is the remainder of the hth-order Taylor series expansion (Rech 2002; Medeiros et al. 2006; Yaya 2013; Yaya et al. 2021), and

$$ F\left( {\gamma_{p}^{\prime } w_{t}^{0} } \right) = \frac{1}{1 + \exp \left( 0 \right)} - \frac{1}{2} = 0; $$
(7)
$$ \frac{{\partial F\left( {\gamma_{p}^{\prime } w_{t} } \right)}}{{\partial \gamma_{i} }} = \left\{ {1 + \exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)} \right\}^{ - 2} \exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)w_{t} \;{\text{and}}\;{\text{for}}\;i \ge 1, $$
$$ \frac{{\partial F\left( {\gamma_{p}^{\prime } w_{t}^{0} } \right)}}{{\partial \gamma_{i} }} = \frac{1}{4}y_{t - i} ; $$
(8)
$$ \begin{aligned} & \frac{{\partial^{2} F\left( {\gamma_{p}^{\prime } w_{t} } \right)}}{{\partial \gamma_{i} \partial \gamma_{j} }} = \frac{{\left[ {\exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)\left\{ {1 + \exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)} \right\}^{2} - 2\left\{ {1 + \exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)} \right\}\exp \left( { - 2\gamma_{p}^{\prime } w_{t} } \right)} \right]}}{{\left\{ {1 + \exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)} \right\}^{4} }} \\ & \quad \quad = \frac{{\left[ {\exp \left( { - \gamma_{p}^{\prime } w_{t} } \right) - \exp \left( { - 2\gamma_{p}^{\prime } w_{t} } \right)} \right]}}{{\left\{ {1 + \exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)} \right\}^{3} }}\;{\text{for}}\;i,j \ge 1, \\ \end{aligned} $$
$$ \frac{{\partial^{2} F\left( {\gamma_{p}^{\prime } w_{t}^{0} } \right)}}{{\partial \gamma_{i} \partial \gamma_{j} }} = 0; $$
(9)
$$ \frac{{\partial^{3} F\left( {\gamma_{p}^{\prime } w_{t} } \right)}}{{\partial \gamma_{i} \partial \gamma_{j} \partial \gamma_{l} }} = \frac{{\left\{ {\exp \left( { - \gamma_{p}^{\prime } w_{t} } \right) - 2\exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)} \right\}\left\{ {1 - 2\exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)} \right\}}}{{\left\{ {1 + \exp \left( { - \gamma_{p}^{\prime } w_{t} } \right)} \right\}^{4} }}\;{\text{for}}\;i,j,l \ge 1. $$

Then, \(\frac{{\partial^{3} F\left( {\gamma_{p}^{\prime } w_{t}^{0} } \right)}}{{\partial \gamma_{i} \partial \gamma_{j} \partial \gamma_{l} }} = \frac{1}{16}y_{t - i} y_{t - j} y_{t - l}\) and, if \(i,j \ge 1\) and \(l = 0\),

$$ \frac{{\partial^{3} F\left( {\gamma_{p}^{\prime } w_{t}^{0} } \right)}}{{\partial \gamma_{i} \partial \gamma_{j} \partial \gamma_{l} }} = \frac{1}{16}y_{t - i} y_{t - j} $$
(10)
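The low-order behaviour of the hidden unit around zero can be checked numerically. The Python sketch below (NumPy assumed; all names are hypothetical) evaluates the centred logistic function of Eq. (5) and uses finite differences to confirm that, at a zero argument, the function value is zero as in Eq. (7), its slope with respect to the argument is 1/4 as in Eq. (8), and its curvature is zero as in Eq. (9).

```python
import numpy as np


def centred_logistic(z):
    """Logistic function minus 1/2, i.e. F in Eq. (5) as a function of z = gamma'w."""
    return 1.0 / (1.0 + np.exp(-z)) - 0.5


def hidden_unit(gamma, w):
    """Hidden unit F(gamma'w_t) for a parameter vector gamma and input vector w."""
    return centred_logistic(float(np.dot(gamma, w)))


# Central finite differences around z = 0 (step size chosen for illustration).
h = 1e-3
value_at_zero = centred_logistic(0.0)
slope_at_zero = (centred_logistic(h) - centred_logistic(-h)) / (2 * h)
curvature_at_zero = (
    centred_logistic(h) - 2 * centred_logistic(0.0) + centred_logistic(-h)
) / h**2
```

Because the slope is taken with respect to the scalar argument \(\gamma_{p}^{\prime } w_{t}\), the chain rule multiplies it by the corresponding element of \(w_{t}\), which is how the factor \(y_{t - i}\) enters Eq. (8).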

Thus, by substituting Eqs. (7)–(10) appropriately in Eq. (6), we obtain the approximated hidden unit of the ANN function \(F\left( {\gamma_{p}^{\prime } w_{t} } \right)\), which is used to mimic the nonlinear dynamics in (1). Merging terms of the same order in Eq. (6) then gives (see Footnote 2):

$$ \left( {1 - L} \right)^{d} y_{t} = m_{0} + \sum\limits_{i = 0}^{p} {m_{i} y_{t - i} } + \sum\limits_{i = 0}^{p} {\sum\limits_{j = 1}^{p} {m_{ij} y_{t - i} y_{t - j} } } + \sum\limits_{i = 0}^{p} {\sum\limits_{j = i}^{p} {\sum\limits_{l = j}^{p} {m_{ijl} y_{t - i} y_{t - j} y_{t - l} } } + \tilde{\varepsilon }_{t} } , $$
(11)

where \(m_{0}\) is the intercept and the \(m_{i}\) are the parameters of the linear component, which also act as the autoregressive parameters linking \(y_{t}\) and \(y_{t - i}\). In the nonlinear part, \(m_{ij}\) is the parameter of the quadratic component \(y_{t - i} y_{t - j}\), and \(m_{ijl}\) is the parameter of the cubic component \(y_{t - i} y_{t - j} y_{t - l}\).
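The regressors on the right-hand side of Eq. (11) can be assembled mechanically. The Python sketch below (NumPy assumed; the function name and column labels are hypothetical) builds the intercept, the linear terms, and the quadratic and cubic products of lagged values. As a simplifying assumption it starts all lags at 1 and keeps only the index combinations with \(i \le j \le l\), so that each product appears once.

```python
from itertools import combinations_with_replacement

import numpy as np


def arnn_regressors(y, p):
    """Return the design matrix and column labels for lags 1..p of y."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    # Column i-1 holds y_{t-i}; rows correspond to t = p, ..., T-1 (0-based).
    lags = np.column_stack([y[p - i:T - i] for i in range(1, p + 1)])
    columns = [np.ones(T - p)]  # intercept m_0
    labels = ["const"]
    for order in (1, 2, 3):  # linear, quadratic and cubic components
        # i <= j <= l, so each distinct product enters only once.
        for idx in combinations_with_replacement(range(p), order):
            columns.append(np.prod(lags[:, list(idx)], axis=1))
            labels.append("*".join(f"y[t-{i + 1}]" for i in idx))
    return np.column_stack(columns), labels
```

With \(p = 1\) the matrix has four columns (constant, \(y_{t - 1}\), \(y_{t - 1}^{2}\), \(y_{t - 1}^{3}\)); with \(p = 2\) it has ten, which shows how quickly the number of parameters grows with the lag length.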

Regarding the linearity of the process, failure to reject the null hypothesis:

$$ H_{0} :\left\{ {\begin{array}{*{20}l} {m_{i} = 0, i = 0,...,p} \hfill \\ {m_{ij} = 0, i = 0,...,p; j = i,...,p} \hfill \\ {m_{ijl} = 0, i = 0,...,p; j = i,...,p; l = j,...,p} \hfill \\ \end{array} } \right. $$
(12)

implies linearity of the time structure. A suitable F test of linearity against nonlinearity is therefore conducted. Note that the standard test may be carried out, and the problem of the nuisance parameter \(\theta \) not being identified under the null hypothesis, as noted in Luukkonen et al. (1988), is solved following Davies (1977).

To estimate the parameters of the model in Eq. (11), one needs to minimize the errors \(\tilde{\varepsilon }_{t}\), which can be rewritten in the linear-in-parameters form,

$$ \tilde{\varepsilon }_{t} = y_{t}^{*} - \hat{m}_{0} 1_{t}^{*} - \mathop \sum \limits_{i = 1}^{p} \hat{m}_{i} z_{t}^{*} - \mathop \sum \limits_{i = 0}^{p} \mathop \sum \limits_{j = 1}^{p} \hat{m}_{ij} zz_{t}^{*} - \mathop \sum \limits_{i = 0}^{p} \mathop \sum \limits_{j = i}^{p} \mathop \sum \limits_{l = j}^{p} \hat{m}_{ijl} zzz_{t}^{*} , $$
(13)

where \(y_{t}^{*} = (1 - L)^{d_{0}} y_{t}\); \(1_{t}^{*} = (1 - L)^{d_{0}} 1_{t}\); \(z_{t}^{*} = \left( {1 - L} \right)^{d_{0}} y_{t - i}\); \(zz_{t}^{*} = \left( {1 - L} \right)^{d_{0}} y_{t - i} y_{t - j}\) and \(zzz_{t}^{*} = \left( {1 - L} \right)^{d_{0}} y_{t - i} y_{t - j} y_{t - l}\), with the hypothesized value \(d = d_{0}\). The error process \(\tilde{\varepsilon }_{t}\) is assumed to be I(0).

In the absence of nonlinearity, that is, when the nested null hypothesis is not rejected, the ARNN-FI framework in Eq. (11) reduces to the linear specification of the Robinson fractional integration test for \(p = 1\) and \(y_{t - i} = t\), that is, a time trend with coefficient \(\beta\) and intercept \(m_{0} = \alpha\):

$$ \left( {1 - L} \right)^{d} y_{t} = \alpha + \beta t + \tilde{\varepsilon }_{t} , $$
(14)

with the error process \(\tilde{\varepsilon }_{t}\). One can easily estimate the coefficients α and β by conventional OLS methods such that,

$$ \tilde{\varepsilon }_{t} = y_{t}^{*} - \hat{\alpha } 1_{t}^{*} - \hat{\beta }t_{t}^{*} , $$
(15)

where \(y_{t}^{*} = (1 - L)^{d_{0}} y_{t}\); \(1_{t}^{*} = (1 - L)^{d_{0}} 1_{t}\); \(t_{t}^{*} = (1 - L)^{d_{0}} t_{t}\). The test is then carried out with the aid of the test statistic,

$$ \hat{R}\,\,\, = \,\,\,\frac{T}{{\hat{\sigma }^{4} }}\,\hat{a}^{\prime}\,\hat{A}^{ - 1} \,\hat{a}\,, $$
(16)

where T is the sample size, and

$$ \hat{a}\,\,\, = \,\,\,\frac{ - 2\pi }{T}\,\sum\limits_{f}^{*} {\psi (\lambda_{f} )\,g_{u} (\lambda_{f} ;\,\hat{\tau })^{ - 1} \,I(\lambda_{f} )} ;\quad \hat{\sigma }^{2} \,\,\, = \,\,\,\sigma^{2} (\hat{\tau })\,\,\, = \,\,\,\frac{2\pi }{T}\sum\limits_{f\, = \,1}^{T\, - \,1} {g_{u} (\lambda_{f} ;\,\hat{\tau })^{ - 1} \,I(\lambda_{f} )} , $$
$$ \hat{A} = \frac{2}{T}\left\{ {\sum\limits_{f}^{*} {\psi \left( {\lambda_{f} } \right)\psi \left( {\lambda_{f} } \right)^{\prime } - } \sum\limits_{f}^{*} {\psi \left( {\lambda_{f} } \right)\hat{\xi }\left( {\lambda_{f} } \right)^{\prime } \left[ {\sum\limits_{f}^{*} {\hat{\xi }\left( {\lambda_{f} } \right)\hat{\xi }\left( {\lambda_{f} } \right)^{\prime } } } \right]}^{ - 1} \sum\limits_{f}^{*} {\hat{\xi }\left( {\lambda_{f} } \right)\psi \left( {\lambda_{f} } \right)^{\prime } } } \right\}; $$
$$ \psi \left( {\lambda_{f} } \right) = \log \left| {2\sin \frac{{\lambda_{f} }}{2}} \right|;\quad \hat{\xi }\left( {\lambda_{f} } \right) = \frac{\partial }{\partial \tau }\log g_{\varepsilon } \left( {\lambda_{f} ;\hat{\tau }} \right), $$

where \(\lambda_{f} = 2\pi f/T\), f indexes the frequency of the sine function in \(\psi\), \(\pi \approx 3.142\), and * indicates that the sums are taken over all frequencies that are bounded in the spectrum. \(I(\lambda_{f} )\) is the periodogram of \(\tilde{\varepsilon }_{t}\), and \(\hat{\tau } = \arg \min_{{\tau \in T^{*} }} \sigma^{2} (\tau )\), where \(T^{*}\) is a subset of the Euclidean space \(R^{q}\).
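For intuition, the statistic in Eq. (16) simplifies considerably when the disturbances are assumed to be white noise, so that the function \(g\) is identically one and the terms in \(\hat{\xi }\) drop out of \(\hat{A}\). The Python sketch below (NumPy assumed; the function name is hypothetical) covers only this special case with a scalar \(\psi\), and it takes the starred sums over \(f = 1, \ldots ,T - 1\); both are simplifying assumptions of the example. Its input is the residual series \(\tilde{\varepsilon }_{t}\) obtained after differencing at \(d_{0}\). Alongside \(\hat{R}\) it returns the signed square root \(\hat{r}\), which is convenient for one-sided comparisons.

```python
import numpy as np


def robinson_lm_white_noise(residuals):
    """Return (r_hat, R_hat) for white-noise disturbances (g identically 1)."""
    u = np.asarray(residuals, dtype=float)
    T = len(u)
    lam = 2.0 * np.pi * np.arange(1, T) / T  # lambda_f, f = 1, ..., T-1
    # Periodogram at lambda_f; the modulus is unaffected by the FFT sign
    # and indexing conventions.
    periodogram = np.abs(np.fft.fft(u)[1:]) ** 2 / (2.0 * np.pi * T)
    psi = np.log(np.abs(2.0 * np.sin(lam / 2.0)))
    a_hat = -(2.0 * np.pi / T) * np.sum(psi * periodogram)
    sigma2_hat = (2.0 * np.pi / T) * np.sum(periodogram)
    A_hat = (2.0 / T) * np.sum(psi**2)  # xi terms vanish under white noise
    r_hat = np.sqrt(T / A_hat) * a_hat / sigma2_hat
    R_hat = T * a_hat**2 / (sigma2_hat**2 * A_hat)
    return r_hat, R_hat
```

Applied to white noise, the sketch should give a value of \(\hat{r}\) of moderate size, whereas applied to a random walk (testing \(d_{0} = 0\) when the true order is 1) it gives a large positive value, pointing towards \(d > d_{0}\).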

3 Asymptotic theory

This section offers an asymptotic theory for the newly developed autoregressive neural network–fractional integration (ARNN–FI) statistic. This new statistic is based on the nested hypothesis obtained by setting the parameter θ to zero. The zero restrictions are imposed on the sequence of real scalar values \(x_{t}\) (Robinson 1994; Gil-Alana and Robinson 1997):

$$ \varphi \left( L \right)x_{t} = u_{t} $$
(17)

where φ is a function of the lag operator L, and \(u_{t} \) is a covariance stationary sequence with zero mean and weak autocorrelation. The lag operator function φ can be expressed as \(\varphi \left( {z;\theta } \right)\), in which z is a variate and θ is a vector of parameters. The value of \(x_{t}\) is set to zero for \(t \le 0\). Furthermore, \(\varphi \left( {z;\theta } \right) = \varphi \left( z \right) \) for all z if and only if the zero restrictions are imposed on the parameter. The simple unit root model can be described as \(\varphi \left( z \right)x_{t} = \left( {1 - z} \right)x_{t} = u_{t}\). By contrast, the simple autoregressive model can be expressed as \(\varphi \left( {z;\theta } \right)x_{t} = \left( {1 - \left( {1 + \theta } \right)z} \right)x_{t} = u_{t}\) (Robinson 1994).

Using the fractional integration function, Eq. (17) can be re-formulated as (Robinson 1994; Gil-Alana and Robinson 1997; Gil-Alana 2000):

$$ (1 - L)^{d} x_{t} = u_{t} $$
(18)

where d is the fractional integration parameter, meaning that \(x_{t}\) is integrated of order d. The fractional integration function can be expressed as (Gil-Alana and Robinson 1997; Gil-Alana 2000):

$$ (1 - L)^{d} = \mathop \sum \limits_{j = 0}^{\infty } \left( {\begin{array}{*{20}c} d \\ j \\ \end{array} } \right)( - 1)^{j} L^{j} = 1 - dL + \frac{{d\left( {d - 1} \right)}}{2}L^{2} - \frac{{d\left( {d - 1} \right)\left( {d - 2} \right)}}{6}L^{3} + \ldots $$
(19)

In the fractional integration framework, the unit root hypothesis can be tested by examining whether the fractional integration parameter is equal to unity (\(d = 1\)). If the fractional integration parameter is greater than zero (\(d > 0\)), \(x_{t}\) can be considered a long memory process in which the autocorrelation persists in the long run. By contrast, if the fractional integration parameter is equal to zero (\(d = 0\)), \(x_{t}\) can be considered a short memory process in which the autocorrelation decays rapidly (Gil-Alana and Robinson 1997; Gil-Alana 2000).

The main objective of the fractional integration test could be to examine unit root processes or other forms of non-stationarity. In this context, Equation (2) could be expressed as (Tanaka 1999; Gil-Alana 2000):

$$ (1 - L)^{d + \theta } x_{t} = u_{t} $$
(20)

In this equation, the zero restrictions on the parameter could be expressed as the following null hypothesis (Diebold and Rudebusch 1989; Robinson 1994; Gil-Alana and Robinson 1997):

$$ H_{0} : \theta = 0 $$
(21)

This newly developed test uses the Lagrange multiplier (LM) statistic for hypothesis testing on the fractional integration parameter, treating the zero restrictions on the parameter as a constrained maximization problem. The LM statistic is widely used because it has a simple null distribution in the analysis of nested parametric hypotheses. However, in the analysis of unit root processes, the LM statistic may have non-standard null and local asymptotic distributions (Robinson 1994; Davidson and MacKinnon 2004). Robinson (1994) offered a statistical solution to this non-standard distribution problem with fractional integration statistics that are asymptotically locally most powerful. Under the assumption that \(u_{t}\) is Gaussian, Robinson’s test statistic can be considered efficient against local alternatives in the Pitman sense (Robinson 1994; Gil-Alana 2000). Our fractional integration test is an extension of his approach. In the estimation of this new fractional integration statistic, \(x_{t}\) is not observable, but it can be regarded as the error term in the multiple regression model (Robinson 1994; Gil-Alana and Robinson 1997):

$$ y_{t} = \beta^{\prime}z_{t} + x_{t} $$
(22)

where \(y_{t}\) and the \(k \times 1 \) vector \(z_{t}\) are observable, β is a \(k \times 1 \) vector of unknown parameters and k is the number of parameters. The parameter β can be estimated as (Robinson 1994; Gil-Alana and Robinson 1997):

$$ \tilde{\beta } = \left( {\mathop \sum \limits_{t = 1}^{T} w_{t} w^{\prime}_{t} } \right)^{ - 1} \mathop \sum \limits_{t = 1}^{T} w_{t} \left( {1 - L} \right)^{d} y_{t} $$
(23)

where \(w_{t} = \left( {1 - L} \right)^{d} z_{t}\) and T is the number of observations in the time series. Using the estimates from the multiple regression model of Eq. (22), \(u_{t}\) in Eq. (20) can also be estimated as (Robinson 1994; Gil-Alana and Robinson 1997):

$$ \tilde{u}_{t} = \left( {1 - L} \right)^{d} y_{t} - \tilde{\beta }^{\prime } w_{t} $$
(24)

Furthermore, the periodogram of \(\tilde{u}_{t}\) can be expressed as (Geweke and Porter‐Hudak 1983; Gil-Alana and Robinson 1997; Hurvich et al. 1998):

$$ I\left( {\lambda_{j} } \right) = \left| {\frac{1}{{\sqrt {2\pi T} }}\mathop \sum \limits_{t = 1}^{T} \tilde{u}_{t} e^{{i\lambda_{j} t}} } \right|^{2} \quad j = 1,2, \ldots ,m $$
(25)

where \(\lambda_{j} = 2\pi j/T\), and m is a positive integer. Using these estimates, our test statistic can be expressed as:

$$ \hat{R}_{NN} = \frac{T}{{\hat{\sigma }^{4} }}\hat{a}^{\prime } \hat{A}^{ - 1} \hat{a} = \hat{r}_{NN}^{\prime } \hat{r}_{NN} ;\quad \hat{r}_{NN} = \frac{{T^{1/2} }}{{\hat{\sigma }^{2} }}\hat{A}^{ - 1/2} \hat{a} $$
(26)

The estimation approach for Robinson’s test statistic \(\widehat{R }\) is largely similar to that for our test statistic \(\hat{R}_{NN} \), which is based on the autoregressive neural network (ARNN) process (Yaya et al. 2021). The difference between these two test statistics lies essentially in the definition of \(z_{t}\). In other words, we propose a new test statistic by modifying \(z_{t}\), following the example of the Fourier fractional integration test (Gil-Alana and Yaya 2021). In Robinson’s original test statistic, \(z_{t}\) consists of deterministic regressors (i.e. \(z_{t} = \left( {1,t} \right)\)), whereas in the Fourier fractional integration test \(z_{t}\) is defined as the Fourier function, so that Eq. (22) can be re-formulated as (Gil-Alana and Yaya 2021):

$$ y_{t} = f\left( t \right) + x_{t} $$
(27)

where \(f\left( t \right)\) is the smooth trend Fourier function. Similarly, in our new fractional integration test, Eq. (22) could be re-formulated as:

$$ y_{t} = n\left( t \right) + x_{t} $$
(28)

where \(n\left( t \right)\) is the autoregressive neural network function, which is defined as (Yaya et al. 2021):

$$ n\left( t \right) = \mathop \sum \limits_{i = 1}^{p} k_{i} y_{t - i} + \mathop \sum \limits_{i = 1}^{q} \mathop \sum \limits_{j = i}^{q} k_{ij} y_{t - i} y_{t - j} + \mathop \sum \limits_{i = 1}^{r} \mathop \sum \limits_{j = i}^{r} \mathop \sum \limits_{l = j}^{r} k_{ijl} y_{t - i} y_{t - j} y_{t - l} $$
(29)

where \(k_{i}\) is the coefficient of the linear component, \(k_{ij}\) is the coefficient of the quadratic component and \(k_{ijl}\) is the coefficient of the cubic component; p is the lag length of the linear component, q is the lag length of the quadratic component and r is the lag length of the cubic component (Yaya et al. 2021).

Theorem 1

Under the null hypothesis defined in (20) and (21) and under the condition:

$$ 0 < \det \left( \psi \right) < \infty $$
(30)

where det denotes determinant, the ARNN–FI test statistic would converge in distribution:

$$ \hat{r}_{NN} \to _{d} N\left( {0,I_{p} } \right)\;{\text{as}}\;T \to \infty $$
(31)

where \(I_{p}\) is the p-rowed identity matrix and p is the number of zero restrictions (refer to the Appendix for the proof of this theorem).

4 Monte Carlo simulation analysis

In this section, the Monte Carlo simulation is used to examine the finite-sample behaviour of three different fractional integration tests, namely the Robinson test (Robinson 1994), the Alana-Yaya (AY) test (Gil-Alana and Yaya 2021), and the proposed autoregressive neural network-fractional integration (ARNN-FI) test. Firstly, a simple version of the Robinson test (Robinson 1994) is based on the following equation:

$$ y_{t} = \mu + \beta t + x_{t} ;\quad (1 - L)^{d} x_{t} = \varepsilon_{t} , $$
(32)

where d is the fractional integration parameter, μ is the intercept, t is the time trend, β is the gradient (slope) parameter and \(\varepsilon_{t}\) is white noise. In this simulation analysis, the gradient parameter (β) is set to unity. Secondly, the Alana-Yaya (AY) test (Gil-Alana and Yaya 2021) is based on the equation:

$$ y_{t} = \mu + \beta t + \gamma_{1} \sin \left( {\frac{2\pi ft}{T}} \right) + \gamma_{2} \cos \left( {\frac{2\pi ft}{T}} \right) + x_{t} ; (1 - L)^{d} x_{t} = \varepsilon_{t} , $$
(33)

where sin is the sine function, cos is the cosine function, and \(\gamma_{1}\) and \(\gamma_{2}\) are the slope parameters of the trigonometric functions, which are set to unity in this simulation. The ARNN-FI test is based on the new multilayer perceptron framework with a time trend \(t\), as suggested in Yaya et al. (2021):

$$ y_{t} = \mu + \beta t + \mathop \sum \limits_{i = 1}^{r} m_{i} y_{t - i} + \mathop \sum \limits_{i = 1}^{s} \mathop \sum \limits_{j = i}^{s} m_{ij} y_{t - i} y_{t - j} + \mathop \sum \limits_{i = 1}^{v} \mathop \sum \limits_{j = i}^{v} \mathop \sum \limits_{l = j}^{v} m_{ijl} y_{t - i} y_{t - j} y_{t - l} + x_{t} ; (1 - L)^{d} x_{t} = \varepsilon_{t} . $$
(34)

In this simulation analysis, the lag of the linear component (r) is set to zero, and the lags of the quadratic component (s) and the cubic component (v) are set to 1. The null hypothesis in this Monte Carlo simulation could be formulated as:

$$ d = d_{0} , $$
(35)

For the size and power analysis, the null hypothesis is rejected when the absolute value of the estimated statistic is greater than the 5% critical value. In this Monte Carlo simulation, seven alternative values of the fractional integration parameter are used:

$$ d = d_{0} - 1, d_{0} - 0.75, d_{0} - 0.5, d_{0} - 0.25, d_{0} , d_{0} + 0.25, d_{0} + 0.50 $$

with four alternative values: \(d_{0} = 0, 0.25, 0.75, 1\). This simulation study uses 1,000 replications with five different sample sizes: \(T = 50\), \(T = 100\), \(T = 250\), \(T = 500\) and \(T = 1000\).
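The data-generating step of this design can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the code used for the reported tables: the fractionally integrated noise is generated from the truncated expansion of \((1 - L)^{-d}\) with zero pre-sample values, the innovations are assumed to be standard normal, and the intercept value `mu=0.0` is an assumption because the text does not state it. All function and constant names are hypothetical.

```python
import numpy as np

# Design grid as described in the text.
D0_VALUES = (0.0, 0.25, 0.75, 1.0)
D_OFFSETS = (-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5)  # d = d0 + offset
SAMPLE_SIZES = (50, 100, 250, 500, 1000)
N_REPLICATIONS = 1000


def frac_int_weights(d, n):
    """First n coefficients of (1 - L)^(-d): psi_0 = 1, psi_k = psi_{k-1} (k - 1 + d) / k."""
    psi = np.empty(n)
    psi[0] = 1.0
    for k in range(1, n):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi


def simulate_fi_noise(d, T, rng):
    """Draw x_t with (1 - L)^d x_t = eps_t, assuming eps_t ~ N(0, 1) and x_t = 0 for t <= 0."""
    eps = rng.standard_normal(T)
    # x_t = sum_{k=0}^{t-1} psi_k * eps_{t-k} (truncated moving-average form)
    return np.convolve(frac_int_weights(d, T), eps)[:T]


def simulate_linear_trend_dgp(d, T, rng, mu=0.0, beta=1.0):
    """Generate y_t = mu + beta * t + x_t as in Eq. (32); mu = 0.0 is an assumed value."""
    t = np.arange(1, T + 1)
    return mu + beta * t + simulate_fi_noise(d, T, rng)


rng = np.random.default_rng(12345)  # arbitrary seed for reproducibility
y = simulate_linear_trend_dgp(d=0.25, T=100, rng=rng)
```

With `d=0` the noise reduces to white noise, and with `d=1` it reduces to a random walk, which gives a quick sanity check on the weights.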

Table 2 reports the simulation results for the Robinson test (Robinson 1994). The simulation analysis indicates some power and size distortions in the Robinson test when the number of observations is small (\(T = 50\) or \(T = 100\)). For example, at the smallest sample size (\(T = 50\)), the power of the Robinson test is 0.270 when \(d_{0} = 0\) and \(d = - 0.25\), and the size of the test is 0.206 when both \(d_{0}\) and \(d\) are equal to zero. As the number of observations increases to 500 or 1000, these distortions tend to disappear. For example, at \(T = 1000\), the power of the test converges to 1.000 when \(d_{0} = 0\) and \(d = - 0.25\), and the size of the test decreases to 0.115 when both \(d_{0}\) and \(d\) are set to zero.
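The size and power figures discussed here are rejection frequencies across replications: size when \(d = d_{0}\) and power when \(d \ne d_{0}\). A minimal Python sketch of this calculation is given below. It assumes a scalar test statistic that is asymptotically standard normal, so that the two-sided 5% critical value is approximately 1.96; the helper name `rejection_rate` is hypothetical.

```python
import numpy as np

CRIT_5PCT = 1.96  # two-sided 5% critical value of N(0, 1), rounded (assumed reference distribution)


def rejection_rate(stats, crit=CRIT_5PCT):
    """Share of replications in which |statistic| exceeds the critical value."""
    stats = np.asarray(stats, dtype=float)
    return float(np.mean(np.abs(stats) > crit))


# Toy input: two of the four statistics exceed 1.96 in absolute value.
example_rate = rejection_rate([0.5, -2.5, 1.0, 3.0])
```

A well-sized test would give a rate close to 0.05 when the statistics are computed under the null, which is the benchmark against which the reported sizes can be read.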

Table 2 Statistical power and size of the Robinson test

The simulation results for the Alana-Yaya (AY) test (Gil-Alana and Yaya 2021) are reported in Table 3. In comparison with the Robinson test, the AY test shows relatively stronger power and size distortions when the number of observations is small (\(T = 50\) or \(T = 100\)). For example, at \(T = 100\), the power of the AY test is 0.390 whereas the power of the Robinson test is 0.727 when \(d_{0} = 0.75\) and \(d = 0.5\). At the same sample size, the size of the AY test is 0.218 and the size of the Robinson test is 0.125 when both \(d_{0}\) and \(d\) are equal to 0.75. As the number of observations increases to 500 or 1000, these distortions disappear only slowly in the AY test. For example, at \(T = 500\), the size of the AY test is still 0.162 when \(d_{0} = 0\) and \(d = 0\). At the largest sample size (\(T = 1000\)), the size of the AY test decreases to 0.121 when both \(d_{0}\) and \(d\) are set to zero.

Table 3 Statistical power and size of the Gil-Alana-Yaya test

More importantly, Table 4 reports the simulation results for the autoregressive neural network-fractional integration (ARNN-FI) test. In comparison with the Robinson test and the Alana-Yaya (AY) test, the ARNN-FI test has a lower size. For example, at \(T = 500\), the size of the Robinson test is 0.106, the size of the AY test is 0.147 and the size of the ARNN-FI test is 0.041 when both \(d_{0}\) and \(d\) are set to 0.25. This means that the ARNN-FI test has a relatively lower probability of incorrectly rejecting a true null hypothesis. On the other hand, the ARNN-FI test shows relatively larger power and size distortions when the number of observations is small. For example, at \(T = 100\), with \(d_{0} = 0.25\) and \(d = - 0.25\), the power of the ARNN-FI test is 0.614, the power of the Robinson test is 0.990 and the power of the AY test is 0.925. As the number of observations increases, these distortions tend to disappear in this test. For example, at \(T = 1000\), the power of the ARNN-FI test is equal to 1.000 when \(d_{0} = 0.25\) and \(d = - 0.25\). The fact that the ARNN-FI simulation results indicate higher power when the number of observations is small than when it is large further emphasises the applicability of the ARNN to modelling unit root processes in unemployment data, which often contain few observations.

Table 4 Statistical power and size of autoregressive neural network-fractional integration test

5 Empirical application

We examine hysteresis in monthly unemployment rates (1978M1–2020M12) in three European countries, namely France (FR), Germany (DE) and the United Kingdom (UK). Two non-European countries, namely the United States (US) and Japan (JP), are included for comparison. The total number of observations was 540. The source of the data was S&P Capital IQ (S&P Global 2023). Three fractional integration tests, namely the Robinson test (Robinson 1994), the Alana-Yaya (AY) test (Gil-Alana and Yaya 2021) and the autoregressive neural network-fractional integration (ARNN–FI) test, are used for the empirical analysis.

Firstly, the Robinson test (Robinson 1994) is based on Eq. (17). As Table 5 shows, the Robinson test indicates that the unemployment rates in the three European countries, namely France, Germany and the United Kingdom, are non-stationary and non-mean-reverting, in line with the hysteresis hypothesis. This means that the findings from the Robinson test confirm those of the seminal papers by Blanchard and Summers (1986a, b). On the other hand, the Robinson test failed to offer unambiguous results on the status of unemployment in the USA because its 95% confidence interval was in the range of [0.90–1.44]. By contrast, the findings from the Robinson test indicate that the unemployment rate in Japan is non-stationary but mean-reverting, in line with the persistence hypothesis.
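The way a confidence interval for d maps onto these verdicts can be illustrated with a short Python sketch. It assumes the conventional thresholds from the fractional integration literature (d < 0.5 stationary; 0.5 ≤ d < 1 non-stationary but mean-reverting; d ≥ 1 non-stationary and non-mean-reverting) and treats an interval that straddles a threshold as ambiguous, which is one possible reading of the US interval quoted above. The function name and the exact treatment of the boundaries are assumptions, not the paper's stated decision rule.

```python
def classify_d_interval(lower, upper):
    """Classify a confidence interval [lower, upper] for the parameter d.

    Hypothetical helper; thresholds 0.5 and 1.0 are the conventional ones
    and the boundary handling is an assumption made for this illustration.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if upper < 0.5:
        return "stationary and mean-reverting"
    if lower >= 0.5 and upper < 1.0:
        return "non-stationary but mean-reverting (persistence)"
    if lower >= 1.0:
        return "non-stationary and non-mean-reverting (hysteresis)"
    # The interval straddles 0.5 or 1.0, so no single verdict applies
    return "ambiguous"


# The interval reported in the text for the USA under the Robinson test
us_verdict = classify_d_interval(0.90, 1.44)
```

Under these assumptions the US interval is labelled ambiguous because it contains values both below and above unity.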

Table 5 Findings from the Robinson test

Secondly, the Alana-Yaya (AY) test (Gil-Alana and Yaya 2021) is based on Eq. (18). The methodological advantage of this test is that it can take account of unknown nonlinearity by using the Fourier approximation function. As Table 6 shows, the findings from the AY test confirm those from the Robinson test that the unemployment rates in the three European countries are non-stationary and non-mean-reverting, in line with the hysteresis hypothesis. On the other hand, the AY test also failed to produce clear-cut results on unemployment rates in the USA. However, the findings from the AY test show that the unemployment rate in Japan is non-stationary but mean-reverting.

Table 6 Findings from the Alana–Yaya test

Finally, the ARNN-FI test incorporates a new multilayer perceptron (Yaya et al. 2021) into the context of fractional integration analysis. The test is based on Eq. (19). As Table 7 shows, the three fractional unit root tests, namely the Robinson test, the AY test and the ARNN-FI test, produced consistent findings on the unemployment rates in the three European countries, which empirically support the validity of the hysteresis hypothesis. Unlike the other two tests, the ARNN-FI test also produced unambiguous findings on unemployment rates in the USA, which substantiate the hysteresis hypothesis. The findings from the ARNN-FI test indicate that unemployment rates in Japan are non-stationary but mean-reverting.

Table 7 Findings from the ARNN-FI test

In short, the main objective of the empirical analysis in this section is to employ the ARNN–FI test to revisit and re-examine a unique pattern of unemployment or unemployment hysteresis in the European labour market (Blanchard and Summers 1986b). In the context of the existing literature, the findings from the ARNN–FI test have largely confirmed the results from the earlier country-specific time series analyses that reported the presence of hysteresis in the unemployment rates in three European countries, namely, France, Germany and the UK (Blanchard and Summers 1986b; Mitchell 1993; Røed 1996; Camarero et al. 2005; Chang 2011). However, the findings from the ARNN–FI test contradict a recent study by Cheng (2022) that detected the natural rate of unemployment in France. This minor discrepancy in the results could be due to the differences in the methods. Cheng (2022) employed the threshold regression method which takes account of the change of the slope coefficients within the different regimes in the unemployment time series. In contrast, the ARNN–FI test does not incorporate the threshold effect in the estimation.

6 Conclusions

This paper proposed the autoregressive neural network-fractional integration (ARNN–FI) test which is a nonlinear fractional integration time series test based on a new multilayer perceptron framework. The methodological advantage of this multilayer perceptron framework is that its estimation model includes a hidden layer that is expected to capture a latent structure in time-series data as in Yaya et al. (2021). The theoretical properties and the asymptotic theory of the new fractional integration test are presented in the paper.

In the simulation exercise, the Monte Carlo analysis examined the size and power of three alternative fractional integration tests, namely the Robinson test, the AY test and the newly proposed ARNN–FI test. The simulation analysis showed that there were power and size distortions in all three methods when the number of observations was small. However, as the number of observations increased, these distortions tended to disappear in all three methods. As an empirical application, the three fractional integration tests examined unemployment hysteresis in three European countries, namely France, Germany and the United Kingdom. The findings from the ARNN–FI test are consistent with those from the Robinson test and the AY test and indicate that unemployment in these European countries was non-stationary and non-mean-reverting, in line with the hysteresis hypothesis.

The empirical findings reported in this article have theoretical and policy implications. From a theoretical perspective, the current findings have substantiated the hysteresis hypothesis proposed by Blanchard and Summers (1986a). In other words, this study offers additional empirical evidence that contradicts the mainstream theoretical perspective on unemployment dynamics or the natural rate hypothesis that assumes the mean-reversion of unemployment rates. In this context, a notable policy implication is that a high unemployment rate in the European labour market would not revert to a lower level without appropriate labour market interventions. In other words, policymakers in three European countries, namely, France, Germany and the UK, need to hammer out policies aimed at reducing the persistently high unemployment in the labour market.

There are some potential weaknesses in the ARNN–FI test. As mentioned in the previous section, this new fractional integration test does not take account of the regime change or structural break. In this context, Gil‐Alana (2008) proposed an innovative method to incorporate structural breaks in the fractional integration framework. Researchers may consider re-formulating the ARNN–FI test by adopting this pioneering idea. Another possible weakness is that the newly proposed test is based on the linear regression analysis. Future lines of research within the methodology of fractional integration might include the analysis of nonlinearities using stochastic approaches rather than deterministic ones. An earlier study by Caporale and Gil-Alana (2007) offered some examples of such approaches.