Introduction

Financial time series (FTS) comprise stock market, commodity, energy, gold, silver, and currency exchange prices, among others, whose behaviors are uncertain and sensitive to various international and socio-economic factors (Hsu et al. 2016). The inherent dynamism, nonlinearity, and non-stationarity of such FTS data make the prediction process challenging (Fama 1970; Kara et al. 2011). Because FTS are generated by a complex and dynamic system, their forecasting requires an accurate prediction model and has been an attractive research area in the domains of data mining and financial engineering. Statistical methods were the early approaches to FTS forecasting (Zhang 2003; Adhikari and Agrawal 2014; Box et al. 2015). However, their approximations are poor, and they mostly fail to model the underlying dynamism and replicate the ever-changing patterns of FTS (Akbilgic et al. 2014; Nayak and Misra 2019a; Das et al. 2022; Nayak 2022).

Recently, nonlinear approximation systems such as artificial neural networks (ANNs) and deep-learning-based forecasting have been applied in FTS forecasting. These methods are data-driven in nature and have established their predictability by exploring huge amounts of available historical financial data (Nayak 2022; Akman et al. 2020). Typically, ANNs comprise a multi-layered architecture, providing a “black box” visualization. Multiple layers lead to multiple weight and threshold updates, slow convergence, and a tendency to become trapped in local optima owing to gradient descent-based learning, thereby attracting experts to frame simple and flat neural models. Higher-order neural networks (HONNs) are a class of feed-forward ANNs capable of providing nonlinear decision margins and achieving an improved classification accuracy compared with linear neurons (Guler and Sahin 1994). The introduction of higher-order terms differentiates them from normal ANNs. Whereas conventional ANNs use only summing elements, HONNs use both summing elements and the products of weighted inputs, which are called higher-order terms. Because of their single layer and smaller number of adjustable model parameters, they have simple and flat architectures, with which they can capture the nonlinearities coupled with complex real-world data (Shin and Ghosh 1995; Park et al. 2000). The representational power of these higher-order terms helps enhance the information capacity of a model in solving nonlinear problems with a smaller network and faster convergence rate (Leerink et al. 1994). Altogether, HONNs exhibit fast learning, a robust approximation with fault tolerance, and effective input–output mapping with a single layer of trainable weights and thresholds (Wang et al. 2008). The functional link artificial neural network (FLANN) proposed by Pao (1992) and pi-sigma neural network (PSNN) proposed in Shin and Ghosh (1991) are popular HONNs applied in many areas of engineering optimization.
Financial forecasting using HONNs has been determined to be better than that using conventional multilayer ANNs (Das et al. 2020; Ghazali 2005; Nayak et al. 2015). To alleviate the limitations of traditional ANNs, higher order product unit neural networks (HPUNNs) (Giles and Maxwell 1987) and product unit neural networks (PUNNs) (Durbin and Rumelhart 1989) have been proposed in the literature with improved nonlinear mapping abilities. However, an exponential increase in higher-order terms increases the size and complexity of the networks. To better control the growing number of model parameters and processing units, a PSNN with a robust classification ability was suggested by Shin and Ghosh (1991). A PSNN was proposed in Nayak et al. (2015) for successful prediction of future stock indices. An HONN with a Bayesian confidence measure was proposed by Knowles et al. for EUR/USD exchange rate forecasting (Knowles et al. 2005). A FLANN with trigonometric basis functions and a recursive LMS algorithm for weight updating was developed for S&P 500 and DJIA index forecasting (Majhi et al. 2009).

The network size and learning method significantly influence the performance of an ANN. The most widely used learning method in ANNs, gradient descent learning, suffers from slow convergence, inaccurate learning, a tendency to become trapped in local optima, and increased computational overhead. To address these problems, several nature-inspired learning algorithms have been developed and extensively used for training ANNs. Although such methods are used to train ANNs to solve multifaceted problems, their efficiency depends mostly on well-tuned control parameters. The selection of such learning parameters while designing a model necessitates human intervention and domain expertise, making the method difficult to use. Typically, for a particular problem, the selection of such parameters requires numerous trial-and-error steps, and an improper choice leads the search towards a local optimum, thereby generating an inaccurate solution. Hence, techniques with fewer control parameters and good approximation capabilities are of interest.

To adjust ANN parameters, evolutionary algorithms such as particle swarm optimization (PSO), the genetic algorithm (GA), and differential evolution (DE) have been demonstrated to be proficient (Shadbolt 2004). As no single method has been found suitable for all classes of problems, existing methods are continually being improved through enhancement (Opara and Arabas 2018; Jiang et al. 2014) or hybridization (Nayak and Misra 2019b; Nayak and Ansari 2019a; Chiroma et al. 2015). Recently, the artificial electric field algorithm (AEFA) was proposed as an optimization procedure based on the principle of electrostatic force (Yadav 2019; Yadav and Kumar 2020, 2019). The AEFA is conceptualized around electric fields, charged particles, and the forces of attraction and repulsion between them. Its learning ability, acceleration update, and convergence speed have been demonstrated in the literature on benchmark optimization tasks (Yadav 2019; Yadav and Kumar 2020, 2019).

Metaheuristic algorithms are stochastic in nature and their performance and efficiency vary from one dataset to another. Several schemes have been proposed to modify existing metaheuristic algorithms to improve their accuracy. The oppositional-based learning (OBL) concept has been proposed to enhance the performance of machine learning algorithms (Tizhoosh 2005). Several attempts have been made to establish theoretical extensions, incorporating OBL with existing metaheuristics to improve their performance as well as real-world engineering applications. A systematic survey was conducted in Mahdavi et al. (2018), summarizing the use and growth of OBL.

In this study, we designed an improved optimization technique, called elitist-opposition-based AEFA (EOAEFA), by incorporating the concepts of elitism and OBL into the basic AEFA. Elitism advances the convergence of the AEFA towards global optima by retaining the best solutions obtained thus far, and OBL helps enhance the diversification ability of the AEFA. The EOAEFA was used to search for the weights and thresholds of two competitive HONNs, FLANN and PSNN, independently, thereby forming two hybrid models. The performances of the two hybrid models were evaluated for predicting the future prices of 11 dynamic and chaotic FTS, including four stock closing price, four currency exchange rate, and three energy price series. For fair comparison, five other forecasting models were developed in a similar manner. The comparison models include the basic AEFA-based FLANN and PSNN (i.e., AEFA-FLN and AEFA-PSNN), the genetic algorithm-based FLANN and PSNN (i.e., GA-FLN and GA-PSNN), and a back propagation neural network (BPNN). Training/test patterns were selected using the sliding window algorithm, data normalization was performed using the sigmoid method, and all models were adaptively trained to reduce training costs. Finally, comparative studies and statistical tests were conducted to verify the predictive ability of EOAEFA + HONN. The original contributions of this study are as follows:

  • An improved learning algorithm, EOAEFA, is proposed by integrating the OBL and elitism concepts with the basic AEFA.

  • The EOAEFA was used to adjust the parameters of two HONNs; thus, two computationally efficient hybrid models were created: EOAEFA-FLN and EOAEFA-PSNN.

  • The hybrid models were evaluated to predict 11 real-world FTS with a systematic performance evaluation.

This paper is organized into seven sections. Sect. "Background and related work" summarizes problem-related studies. Sect. "Proposed EOAEFA+HONN forecasting method" describes the methodologies used. Sect. "FTS data and statistical analysis" briefly discusses the FTS data and their statistics. Sect. "Results and analysis" describes and analyzes the simulation results. Sect. "Statistical testing and further analysis" discusses the significance of the models. Finally, Sect. "Conclusions and future work" presents the conclusions, limitations, and possible extensions of this study.

Background and related work

This section analyzes recent related research in the area of FTS forecasting using machine learning techniques. In addition, it explores metaheuristic learning methods, their variants, and opposition-based learning applied to ANN parameter tuning. Finally, the mathematical modelling of two HONNs, PSNN and FLANN, is discussed.

The distribution of financial data is complex because of the intricacy of human behavior and varying social environments. The detection, scope optimization, and interpretation of clusters in large-scale financial datasets are difficult tasks. Li et al. proposed a computationally efficient integrated approach that detects a reasonable number of clusters and evaluates their quality (Li et al. 2021). Kou et al. designed a bankruptcy prediction model using payment-network-based variables and transactional data, incorporating optimal feature subset selection and importance evaluations for small- and medium-sized enterprises (Kou et al. 2021a). This approach achieved a satisfactory classification accuracy with a reduced feature subset. Researchers have also evaluated five financial technology-based investments in European banking services (Kou et al. 2021b). That study identified payment and money transfer systems as the most important investment alternatives for advancing the financial performance of European banks, meeting customer expectations, easing the banks’ collection of receivables, and reducing operational outlays.

Although many conventional and statistical predictive systems for FTS forecasting have been proposed, they mostly fail to describe the nonlinearities coupled with huge volumes of financial data. They are less competent than machine learning (ML) methods, such as ANNs, recurrent neural networks (RNNs), deep learning (DL), and HONNs. ANNs have been well established for FTS forecasting. Different ML models such as random forest and gradient boosting approaches have been used by Yoon for real GDP growth forecasting (Yoon 2020). Methods such as multilayer perceptron (MLP) (Ecer et al. 2020), support vector regression (SVR) (Guo et al. 2018), and ANN (Nayak 2022; Akman et al. 2020; Shadbolt 2004) have proven successful in capturing the nonlinearities coupled with FTS. However, multiple hidden layers in conventional ANNs lead to slow convergence, a tendency to become stuck in local optima owing to traditional back-propagation-based learning, and black-box behavior, thereby paving the path towards designing flat and simple ANNs.

To address the disadvantages of the backpropagation learning of HONNs, several evolutionary algorithms have been developed and extensively used for HONN training in the last two decades. To improve the performance of HONNs, chemical reaction optimization (CRO) was used for training, and the resulting hybrid model was applied to stock market forecasting (Nayak et al. 2017a). The hybrid model was demonstrated to be superior to other comparative methods. HONN parameter tuning via CRO was also conducted for the effective prediction of exchange rate series (Nayak et al. 2017b). Adaptive HONNs trained using different evolutionary optimization techniques were proposed to predict fluctuations in stock market data (Sahu et al. 2016; Nayak 2017; Nayak et al. 2016). Some hybrid models called RAFLANNs have been proposed and determined to be effective in terms of computational cost and accuracy (Subhranginee et al. 2021). A hybrid PSNN model was proposed to predict GOLD/INR and GOLD/AED prices and showed significant results (Dash et al. 2021a). Naik et al. (2021) developed a hybrid FLANN for classification whose accuracy was superior to that of other comparative models. To control the uncertainties associated with crude oil prices, Nayak et al. (2020) proposed a hybrid forecasting model by aggregating the generalization power of PSNN with the effective learning ability of FWA and determined that the FWA-PSNN model is superior to an MLP and other models (Nayak 2020). The outcome of a hybrid PSNN-SDE model was tested using various evaluation metrics, such as root mean squared error (RMSE), mean absolute percentage error (MAPE), and Theil’s U statistic (TU), and exhibited an enhanced forecasting accuracy (Rajashree et al. 2019).
PSNN and FLANN have been hybridized with different training algorithms, such as GA, CRO, and COA, to expand their input space portrayal and high learning ability and establish significant developments in performance (Nayak and Ansari 2019b). The EPSNN was determined to be a better model for exchange rate prediction than a backpropagation neural network (BPNN) and other models (Sahu et al. 2019). To determine the uncertainty associated with stock data, Nayak et al. proposed ACFLN and determined that it is more efficient in handling uncertainty than the MLP, BPNN, auto-regressive integrated moving average, and other models (Nayak et al. 2018). Based on the literature, HONNs and their variants (hybridized HONNs and metaheuristic-based learning) are widely applied in FTS, particularly in the domains of stock index prediction, bitcoin price prediction, wind energy forecasting, function regularization, classification, and electricity consumption. Evidently, HONNs and their variants have achieved distinguished performance in terms of a high accuracy, fast convergence, and fewer prediction errors in FTS forecasting. Although several evolutionary algorithms have been used for HONN training to solve intricate problems, their efficiency is hindered by the learning parameters.

Many applications of AEFA have been reported in the literature. Table 1 lists some recent engineering applications of AEFA and its improved versions.

Table 1 AEFA applications and variants in engineering optimization

Behera et al. applied the basic AEFA to optimize an ANN structure for software reliability dataset forecasting (Behera et al. 2022). The ANN trained with AEFA yielded a better model than those obtained using other algorithms. In 2021, Al-Khraisat et al. used the AEFA for the optimum placement of PMU (Al-Khraisat, AL-Dmour and Al-Maitah 2021). Nayak et al. incorporated the concept of elitism into the basic AEFA to train a neuro-fuzzy model for the compressive strength prediction of concrete structures and established the competitiveness of the AEFA (Nayak et al. 2021). An improvement in the AEFA through the inclusion of an inertia factor and the concept of repulsive force was claimed by Bi et al. in 2022. They coined the IRAEFA model and applied it to the spherical minimum spanning tree problem. Their improved version was determined to be more effective than the other metaheuristics. However, it is associated with long execution times. To improve the convergence and search capability of the AEFA, a scheme for generating Coulomb’s constant was proposed in Cheng et al. (2022). A log-sigmoid function was suggested for generating Coulomb’s constant instead of an exponential function, which decreases rapidly through evolution. The improved version was tested with 18 benchmark functions as well as the optimization of neural networks. The Nelder-Mead simplex algorithm integrated with AEFA for the optimization problem was proposed in Izci et al. (2020). The former method helps achieve an improved local search ability, and the latter helps achieve a global search capacity. Another improvement proposed by the authors in Houssein et al. (2021) incorporated strategies such as a modified local escaping operator, opposition-based learning, and Lévy flight into the basic AEFA. The new version was tested on the optimization of the CEC 2020 functions and the parameters of a fuel cell; it was determined to be superior to nine other metaheuristics.
The potential of the AEFA in determining the controlling parameters of an automatic voltage regulator was established in Demirören et al. (2019). The AEFA accurately estimated the undefined parameters of the triple‐diode model of a photovoltaic unit in Selem et al. (2021). The AEFA pattern search method was adopted in Alanazi and Alanazi (2022) for distribution network reconfiguration. The results highlight the improved performance of the proposed technique in achieving a lower value of different objectives than the conventional AEFA, PSO, and grey wolf optimization (GWO) methods based on many-criteria reconfiguration. Zheng et al. proposed a sine-cosine-based AEFA for logistic distribution vehicle routing (Zheng et al. 2022). The sine–cosine update mechanism was integrated with the AEFA, which helps achieve a dynamic balance between the global and local searches of the AEFA. The basic AEFA was hybridized with a cuckoo search algorithm and refractive learning in Adegboye and Deniz Ülker (2023). The hybrid model was evaluated using benchmark function optimization and achieved a better convergence speed and search ability than the basic AEFA. The aforementioned studies show rapid growth in the application of the AEFA and its variants to engineering optimization. However, AEFA applications in the domain of data mining, specifically in FTS forecasting, are lacking and require further exploration.

Several schemes have been proposed to modify existing metaheuristic algorithms and improve their accuracy. OBL is a popular concept introduced in 2005 to enhance the performance of machine learning algorithms (Tizhoosh 2005). Subsequently, several attempts have been made to establish theoretical extensions, incorporating OBL with existing metaheuristics to improve their performance as well as real-world engineering applications. A systematic survey was conducted in Mahdavi et al. (2018) that summarized the use and growth of OBL. Table 2 summarizes recent advances in OBL and its integration with existing learning algorithms.

Table 2 OBL integrated with metaheuristics and their applications to the engineering optimization problem

The mathematical and theoretical aspects of OBL were established by Tizhoosh in 2005. A comprehensive survey of OBL, its variants, and real-world engineering applications was conducted (Mahdavi et al. 2018). An improved crow search algorithm (ICSA) with OBL was proposed in Shekhawat and Saxena (2020), revealing the competitive performance of ICSA over other methods. To enhance the exploration capacity of the whale optimization algorithm, OBL was integrated and determined to be better than competitive models in evaluating different benchmark functions as well as in estimating the parameters of solar cell diode models (Abd Elaziz and Oliva 2018). The same author proposed an enhanced sine–cosine algorithm with OBL for global optimization, claiming a fast convergence (Abd Elaziz et al. 2017). In 2020, Tubishat et al. used OBL and a local search algorithm to increase the population diversity and exploration capacity of the salp swarm algorithm and applied this method on 18 benchmark datasets from the UCI repository for the feature selection problem (Tubishat et al. 2020). The economic dispatch problem of a power system was solved in Pradhan et al. (2018) using a grey wolf optimization algorithm integrating OBL (OGWO). The proposed method accelerated the convergence rate of the standalone method in terms of computational and fuel costs. Jain and Saxena proposed OBL moth flame optimization (OB-MFO) for solving CEC-2017 functions and energy market datasets, and their study concluded with the potential effectiveness of the OB-MFO model (Jain and Saxena 2019). An opposition-based AEFA was designed by Demirören et al. (2021) for an FOPID controller design. The proposed method was determined to be statistically and computationally superior to the comparison methods. OBL has also been used with atom search optimization to estimate the control parameters of an automatic voltage regulator system (Ekinci et al. 2020).
For an appropriate feature selection problem, Ibrahim et al. (2019) applied OBL to social spider optimization and tested its performance. The dragonfly algorithm was coupled with OBL and applied to image segmentation (Bao et al. 2019). A genetic algorithm using OBL for FTS forecasting was developed in Kar et al. (2016). The proposed OBGA-trained ANN was determined to be superior to the ANN trained using the conventional GA. Dash et al. (2021b) proposed another OBL application to predict cryptocurrency prices. The method was determined to be better than other classical predictors in terms of generating a lower prediction error. A memetic search method using an opposition-based concept (OBMA) was developed to solve the maximum diversity problem (Zhou et al. 2017). The OBMA matches the best-known outcomes in most instances. The experimental analysis confirmed the effectiveness of integrating OBL with a memetic search, which significantly affected the search ability of the standard memetic search.
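As a minimal illustration of the OBL idea surveyed above, the sketch below computes the opposite of a candidate solution under the commonly used definition \(\tilde{x}_d = a_d + b_d - x_d\) for a search interval \([a_d, b_d]\), and keeps whichever of the two points has the lower fitness. The function names, the bounds, and the sphere fitness function are assumptions made for illustration; they are not taken from the reviewed studies.

```python
from typing import Callable, List, Sequence


def opposite_point(x: Sequence[float],
                   lower: Sequence[float],
                   upper: Sequence[float]) -> List[float]:
    """Opposite of x inside the box [lower, upper]: a + b - x per dimension."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]


def keep_better(x: Sequence[float],
                lower: Sequence[float],
                upper: Sequence[float],
                fitness: Callable[[Sequence[float]], float]) -> List[float]:
    """Return x or its opposite, whichever has the lower fitness (minimization)."""
    x_opp = opposite_point(x, lower, upper)
    return list(x) if fitness(x) <= fitness(x_opp) else x_opp


def sphere(x: Sequence[float]) -> float:
    """Hypothetical fitness function used only for this example."""
    return sum(v * v for v in x)


candidate = [4.0, -3.0]
lower, upper = [-5.0, -5.0], [5.0, 1.0]
selected = keep_better(candidate, lower, upper, sphere)
```

Evaluating a point together with its opposite costs one extra fitness evaluation per candidate, which is the usual trade-off accepted in exchange for broader coverage of the search space.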

Elitism helps retain fit individuals across generations in evolutionary algorithms. An elite opposition-based learning methodology was used to advance the grasshopper optimization algorithm (Yildiz et al. 2022). The improved method was applied to solve several engineering problems and determined to be effective. A distance strategy-based elitism method was proposed for selection operations in evolutionary algorithms (Du et al. 2018). In addition, elitism was used with the GA to facilitate layout design (Jerin Leno et al. 2016). The sine cosine algorithm (SCA) with an elitism strategy was used in Sindhu et al. (2017) to select discriminative feature sets that enhanced the classification accuracy of high-dimensional datasets. A chaotic map was integrated into the Henry gas solubility optimization algorithm to enhance the convergence rate, and the robustness of the resulting method was tested and established on various constrained optimization problems in the areas of manufacturing and mechanical design (Yıldız et al. 2022a). The grasshopper optimization algorithm has been hybridized with the Nelder-Mead algorithm and performed well in real-world engineering optimization problems, such as in designing robot grippers (Yildiz et al. 2021). The Nelder-Mead algorithm was hybridized with the salp swarm optimization method for the structural optimization of electric vehicle components (Yıldız 2020). By incorporating chaotic maps with a Lévy flight distribution, a new hybrid algorithm for engineering optimization was suggested in Yıldız et al. (2022b).

RNNs, such as long short-term memory (LSTM), have been suggested for modelling sequence data. As FTS are sequential, these methods can be used as effective approximators of such data. Bai et al. systematically evaluated generic recurrent networks, such as LSTMs, and convolutional architectures in sequence modelling using a broad range of datasets across diversified tasks (Bai et al. 2018). The results suggest the superior performance of a simple convolutional architecture, i.e., a temporal convolutional network, over LSTM. An attention mechanism-based sequence transduction model, called the transformer, was proposed to replace recurrent layers in the encoder-decoder model (Vaswani et al. 2017). This approach uses scaled dot-product attention and multi-headed self-attention methods to capture global input–output dependencies. A transformer reportedly trains significantly faster than models based on recurrent and convolutional layers.

A FLANN produces higher-order effects of input signals through nonlinear functional transformations via links. The attributes of an input pattern are expanded into several terms by passing them through a functional expansion unit. The sine and cosine trigonometric functions are used to expand the original input dimensions. For example, input \({x}_{i}\) expands into several terms through trigonometric expansion functions such as \({c}_{1}\left({x}_{i}\right)={x}_{i}\), \({c}_{2}\left({x}_{i}\right)=\sin\left({x}_{i}\right)\), \({c}_{3}\left({x}_{i}\right)=\cos\left({x}_{i}\right)\), \({c}_{4}\left({x}_{i}\right)=\sin\left(\pi {x}_{i}\right)\), \({c}_{5}\left({x}_{i}\right)=\cos\left(\pi {x}_{i}\right)\), \({c}_{6}\left({x}_{i}\right)=\sin\left(2\pi {x}_{i}\right)\), and \({c}_{7}\left({x}_{i}\right)=\cos\left(2\pi {x}_{i}\right)\). The weighted sum of the functional expansion unit outputs is passed to the activation function to estimate the value of the output neuron. The output of the model is then compared with the target output, and their difference, called the error signal, is calculated. The accumulated error signal is propagated back to train the model.
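The trigonometric expansion described above can be sketched as follows. This is a minimal illustration assuming the seven basis functions \(c_1\) to \(c_7\) listed in the text are applied to every input; the function names and the sample input window are hypothetical.

```python
import math
from typing import List, Sequence


def expand_feature(x: float) -> List[float]:
    """Return the seven terms c1(x)..c7(x) for a single input value."""
    return [
        x,
        math.sin(x),
        math.cos(x),
        math.sin(math.pi * x),
        math.cos(math.pi * x),
        math.sin(2 * math.pi * x),
        math.cos(2 * math.pi * x),
    ]


def expand_pattern(pattern: Sequence[float]) -> List[float]:
    """Concatenate the expansions of all inputs in one pattern."""
    expanded: List[float] = []
    for x in pattern:
        expanded.extend(expand_feature(x))
    return expanded


window = [0.2, 0.5, 0.8]            # hypothetical normalized inputs
features = expand_pattern(window)   # 3 inputs become 21 expanded terms
```

With this choice of basis, a pattern of size n yields 7n expanded terms, so the number of trainable weights grows linearly with the input size.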

For a given input pattern, the FLANN computes an output as follows. Let \(X\left(n\right)=\left\{{x}_{1}, {x}_{2}, \dots , {x}_{n}\right\}\), where n is the input vector size, be an input vector. Using trigonometric basis functions, this vector is expanded nonlinearly as \({X}_{expanded}(N)\). Given the input \({X}_{expanded}(N)\), the model produces an output \(\widehat{y}(n)\) as shown in Eq. (1):

$$\hat{y}\left( n \right) = X_{expanded} \left( N \right) \times W\left( n \right) + bias,$$
(1)

where \(W\left(n\right)\) is the weight vector associated with the \({n}^{th}\) pattern. This output is then passed through a sigmoid activation to produce an output \(y\left(n\right)\) as shown in Eq. (2). The \(error\left(n\right)\) is calculated using Eq. (3), where \(d\left(n\right)\) is the target output.

$$y\left(n\right)= \frac{1}{1+ {e}^{-\widehat{y}\left(n\right)}}$$
(2)
$$error\left(n\right)= d\left(n\right)- y \left(n\right)$$
(3)

The weights are updated as shown in Eqs. (4) and (5) using adaptive learning rules:

$$w_{ij} \left( {t + 1} \right) = w_{ij} \left( t \right) + \mu *\Delta \left( t \right),$$
(4)
$$\Delta \left( t \right) = \delta \left( t \right)*\left[ {{\text{f}}\left( {x_{i} } \right)} \right],$$
(5)

where \(\delta \left(t\right)=\left[{\delta }_{1}\left(t\right), {\delta }_{2}\left(t\right), \cdots , {\delta }_{k}\left(t\right)\right],\) \({\delta }_{i}\left(t\right)=\left(1-{\widehat{y}}_{i}^{2}\left(t\right)\right)*{e}_{i}\left(t\right)\), and µ is the learning parameter.
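The forward pass and weight update of Eqs. (1) to (5) can be sketched for a single output neuron as follows. This is an illustrative sketch, not the implementation used in the study: it assumes that the expanded input vector is already available, that the term \(\widehat{y}\) in \(\delta\) refers to the activated output, and that the bias is kept fixed because no bias update rule is stated. All function and variable names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def flann_forward(x_expanded: Sequence[float],
                  weights: Sequence[float],
                  bias: float) -> float:
    """Eqs. (1)-(2): weighted sum of expanded inputs, then sigmoid."""
    y_hat = sum(w * x for w, x in zip(weights, x_expanded)) + bias
    return 1.0 / (1.0 + math.exp(-y_hat))


def flann_update(x_expanded: Sequence[float],
                 weights: Sequence[float],
                 bias: float,
                 target: float,
                 mu: float) -> Tuple[List[float], float]:
    """One training step; returns the updated weights and the error."""
    y = flann_forward(x_expanded, weights, bias)
    error = target - y                       # Eq. (3)
    delta = (1.0 - y * y) * error            # assumed reading of delta_i(t)
    new_weights = [w + mu * delta * x        # Eqs. (4)-(5)
                   for w, x in zip(weights, x_expanded)]
    return new_weights, error


x_exp = [1.0, 0.0, 2.0]        # hypothetical expanded inputs
w0 = [0.0, 0.0, 0.0]
w1, err0 = flann_update(x_exp, w0, bias=0.0, target=1.0, mu=0.1)
_, err1 = flann_update(x_exp, w1, bias=0.0, target=1.0, mu=0.1)
```

In this toy setting the error magnitude decreases after one update, which is the behavior expected from a gradient-style rule with a small learning parameter.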

The PSNN computational model is as follows. It has a two-layered, fully connected feedforward network architecture. The first layer comprises sigma (summing) units, and the second layer is the pi (product) layer. The inputs are connected to the neurons of the sigma layer, and their outputs are fed to the neurons in the pi layer. The weights and thresholds between the input and summing layers are trainable, whereas those between the summing and product layers are set to unity. Owing to the single adjustable parameter set, the training time is drastically reduced. Neurons at the summing units use linear activations, and those at the product units use nonlinear activations to produce the network output. The product units provide a higher-order capability by expanding the lower-dimensional input space into higher dimensions. The resulting higher-dimensional space offers a superior nonlinear separability without exponential growth in the weights. The output of the jth sigma unit is computed as the weighted sum of each input \({x}_{i}\) and corresponding weight \({w}_{ij}\) as in Eq. (6):

$$y_{j} = \mathop \sum \limits_{i = 1}^{n} w_{ij} *x_{i} ,$$
(6)

where n denotes the input size. The pi-layer neuron computes the product of the outputs of the sigma units and applies a nonlinear activation to it, as in Eq. (7):

$$y = \sigma \left( {\mathop \prod \limits_{j = 1}^{k} y_{j} } \right),$$
(7)

where k is the number of sigma units, which is also the order of the network. The error signal \(error\left(n\right)\) is calculated as in Eq. (8):

$$error\left(n\right)= d\left(n\right)- y \left(n\right).$$
(8)
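A PSNN forward pass following Eqs. (6) to (8) can be sketched as below. The sketch assumes a logistic sigmoid for the nonlinear activation \(\sigma\) and omits thresholds, as Eq. (6) does; the weight layout `weights[j][i]` and all names are illustrative choices rather than details given in the text.

```python
import math
from typing import Sequence


def psnn_forward(x: Sequence[float],
                 weights: Sequence[Sequence[float]]) -> float:
    """weights[j][i] links input i to sigma unit j; returns the output y."""
    # Eq. (6): each sigma unit computes a weighted sum of the inputs.
    sigma_outputs = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    # Eq. (7): the pi unit multiplies the sigma outputs, then applies sigma.
    product = math.prod(sigma_outputs)
    return 1.0 / (1.0 + math.exp(-product))


def psnn_error(target: float, output: float) -> float:
    """Eq. (8): difference between the target and the network output."""
    return target - output


x = [1.0, 2.0]                                # hypothetical input pattern
second_order = [[1.0, 0.0], [0.0, 1.0]]       # k = 2 sigma units
y = psnn_forward(x, second_order)             # sigmoid(1.0 * 2.0)
```

Only the input-to-sigma weights appear as parameters here, which reflects the single trainable layer described in the text.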

We reviewed recent articles on financial forecasting, different HONN applications for FTS forecasting, different metaheuristics for HONN optimization, OBL integration with existing metaheuristics, and the elitism concept in improving the learning capacity. We obtained the following insights from these studies:

  • Automated predictive systems with improved accuracies remain lacking in the large-scale FTS forecasting domain.

  • Metaheuristic-based HONNs have demonstrated promising performances in different data-driven problems, and their proficiency in FTS forecasting must be assessed in depth.

  • The AEFA has emerged as a competitive optimization mechanism. However, research on AEFA-based HONNs for FTS forecasting remains limited.

  • The convergence rate and population diversity of the basic AEFA must be improved to maintain a balance between its exploration and exploitation capacities.

  • Finally, no predictive framework has been established that integrates the improved AEFA and HONN for large-scale FTS forecasting.

Proposed EOAEFA + HONN forecasting method

This section describes the proposed EOAEFA + HONN based forecasting method in two phases. Phase 1 improves the AEFA by incorporating elitism and OBL into the basic AEFA, yielding the EOAEFA. Phase 2 searches for the optimal parameters of the HONN using the EOAEFA.

Design of EOAEFA

As the proposed EOAEFA is based on the AEFA, OBL, and the concept of elitism, recalling these concepts is helpful for understanding its working principles.

Basis of EOAEFA

The AEFA treats a charged particle as an agent in the search space, and the strength of such an agent is measured in terms of its charge. A population of these charged particles moves in the search domain under electrostatic forces of attraction and repulsion. Particles interact with each other through their charges. The positions of these charges are taken as potential solutions to the target problem. Only the force of attraction is considered in the basic AEFA, meaning that all particles associated with lower charges are attracted to the particle with the highest charge, called the best particle or individual. The position of the \({i}^{th}\) charged particle (\({X}_{i}\)) at time \(t\) is represented by Eq. (9):

$${X}_{i}\left(t\right)=\left({X}_{i}^{1},{X}_{i}^{2}, {X}_{i}^{3}, \cdots ,{X}_{i}^{D} \right),\quad i=1, 2, 3,\cdots ,N \;\text{and}\; d=1, 2, 3,\cdots ,D,$$
(9)


where N and D are the total numbers of charged particles and parameters (dimensions), respectively. The best position found so far by the \({i}^{th}\) particle, \({P}_{i}\), is updated at time \((t+1)\) as in Eq. (10): it is replaced by the new position only when the new position achieves a fitness value that is no worse.

$$P_{i}^{d} \left( {t + 1} \right) = \left\{ {\begin{array}{*{20}l} {X_{i}^{d} \left( {t + 1} \right)\; if\;fitness\left( {X_{i} \left( {t + 1} \right)} \right) \le fitness\left( {P_{i} \left( t \right)} \right)} \hfill \\ {P_{i}^{d} \left( t \right) \;if \;fitness\left( {X_{i} \left( {t + 1} \right)} \right) > fitness\left( {P_{i} \left( t \right)} \right)} \hfill \\ \end{array} } \right.$$
(10)
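The personal-best rule of Eq. (10) can be sketched as follows, assuming a minimization problem as implied by the \(\le\) comparison. The sphere function is a hypothetical stand-in for the fitness function, and the names are illustrative.

```python
from typing import Callable, List, Sequence


def sphere(x: Sequence[float]) -> float:
    """Hypothetical fitness function used only for this example."""
    return sum(v * v for v in x)


def update_personal_best(p_best: Sequence[float],
                         x_new: Sequence[float],
                         fitness: Callable[[Sequence[float]], float]
                         ) -> List[float]:
    """Eq. (10): adopt the new position if its fitness is no worse."""
    if fitness(x_new) <= fitness(p_best):
        return list(x_new)
    return list(p_best)


improved = update_personal_best([1.0, 1.0], [0.5, 0.5], sphere)
unchanged = update_personal_best([1.0, 1.0], [2.0, 0.0], sphere)
```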

The charge associated with the \({i}^{th}\) particle (\({Q}_{i}(t)\)) at time t is expressed as Eq. (11):

$$Q_{i} \left( t \right) = \frac{{q_{i} \left( t \right)}}{{\mathop \sum \nolimits_{j = 1}^{N} q_{j} \left( t \right)}},\quad i = 1,2, \cdots ,N,$$
(11)

where \({q}_{i}\left(t\right)\) is a suitable charge function calculated as in Eq. (12) using the fitness values of the best- and worst-fit particles in the search space.

$${q}_{i}\left(t\right)=\exp\left(\frac{{fitness}_{i}\left(t\right)-{fitness}_{worst}(t)}{{fitness}_{best}\left(t\right)-{fitness}_{worst}(t)}\right)$$
(12)
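As a rough illustration of Eqs. (11) and (12), the following Python sketch computes normalized charges from a list of fitness values. The function name `charges`, the `maximize` flag, and the handling of a population in which all fitness values are equal are assumptions made for this example and are not taken from the original study.

```python
import math

def charges(fitness, maximize=False):
    """Normalized charges Q_i following Eqs. (11) and (12).

    `maximize` selects which end of the fitness range counts as "best";
    this flag is an assumption of the sketch.
    """
    best = max(fitness) if maximize else min(fitness)
    worst = min(fitness) if maximize else max(fitness)
    span = best - worst
    if span == 0:
        # Assumed guard: identical fitness values give equal charges.
        return [1.0 / len(fitness)] * len(fitness)
    q = [math.exp((f - worst) / span) for f in fitness]   # Eq. (12)
    total = sum(q)
    return [qi / total for qi in q]                       # Eq. (11)
```

Under this sketch the best-fit particle receives the raw charge exp(1) and the worst-fit particle exp(0) before normalization, so the charges always sum to one and the best particle carries the largest share.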

The force \({F}_{ij}^{d}\left(t\right)\) experienced at the \({i}^{th}\) particle holding charge \({Q}_{i}\left(t\right)\) because of the \({j}^{th}\) particle holding charge \({Q}_{j}\left(t\right)\) is defined as follows:

$$F_{ij}^{d} \left( t \right) = K\left( t \right)\frac{{Q_{i} \left( t \right) \cdot Q_{j} \left( t \right) \cdot \left( {P_{j}^{d} \left( t \right) - X_{i}^{d} \left( t \right)} \right)}}{{\left\| X_{i} \left( t \right) - X_{j} \left( t \right) \right\|^{2} + \varepsilon }},$$
(13)

where \(K(t)\) is Coulomb’s constant calculated in terms of the current and maximum iteration (as in Eq. (14)) and \(\varepsilon\) is a small positive constant.

$$K\left( t \right) = K_{0} \cdot \exp\left( - \alpha \frac{iteration}{max.\ iteration} \right)$$
(14)

The parameters are set to \(\alpha =30\) and \({K}_{0}=500\). A large initial value \({K}_{0}\) promotes exploration in the early iterations, and \(K(t)\) then decreases gradually through the iterations to refine the accuracy of the search. The resultant electrostatic force \({F}_{i}^{d}\) acting on the \({i}^{th}\) particle at time \(t\) is calculated as in Eq. (15), and the electric field is calculated as in Eq. (16).

$${F}_{i}^{d}\left(t\right)=\sum_{j=1,\, j\ne i}^{N} rand\cdot {F}_{ij}^{d}(t)$$
(15)
$${E}_{i}^{d}\left(t\right)=\frac{{F}_{i}^{d}(t)}{{Q}_{i}\left(t\right)}$$
(16)

Per Newton’s law of motion, the acceleration \({a}_{i}^{d}\left(t\right)\) of the ith charged particle with unit mass \({M}_{i}(t)\) at time t is computed as in Eq. (17).

$${a}_{i}^{d}\left(t\right)= \frac{{Q}_{i}\left(t\right)\cdot {E}_{i}^{d}\left(t\right)}{{M}_{i}(t)}$$
(17)

The velocity and position of the ith charged particle at time (t + 1) are updated according to Eqs. (18) and (19), respectively.

$${V}_{i}^{d}\left(t+1\right)={rand}_{i}\cdot {V}_{i}^{d}\left(t\right)+{a}_{i}^{d}(t)$$
(18)
$${X}_{i}^{d}\left(t+1\right)={X}_{i}^{d}\left(t\right)+{V}_{i}^{d}(t+1)$$
(19)
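The update rules in Eqs. (13) to (19) can be sketched in Python as follows. This is a minimal illustration and not the authors' MATLAB implementation. It assumes that the denominator of Eq. (13) is the squared Euclidean distance between the two particles plus \(\varepsilon\), that one random number is drawn per particle for the velocity term, and that all particles have unit mass. The names `coulomb_constant` and `aefa_step` are hypothetical.

```python
import math
import random

def coulomb_constant(iteration, max_iteration, k0=500.0, alpha=30.0):
    """Eq. (14): exponentially decaying Coulomb constant K(t)."""
    return k0 * math.exp(-alpha * iteration / max_iteration)

def aefa_step(X, V, P, Q, K, eps=1e-12, mass=1.0, rng=random):
    """One AEFA update of all particles, following Eqs. (13) and (15)-(19).

    X, V, P: lists of N position, velocity, and personal-best vectors.
    Q: list of N normalized charges; K: current Coulomb constant.
    """
    n, dim = len(X), len(X[0])
    new_X, new_V = [], []
    for i in range(n):
        force = [0.0] * dim
        for j in range(n):
            if j == i:
                continue
            # Assumed denominator: squared Euclidean distance plus eps.
            dist2 = sum((X[i][d] - X[j][d]) ** 2 for d in range(dim))
            for d in range(dim):
                f_ij = K * Q[i] * Q[j] * (P[j][d] - X[i][d]) / (dist2 + eps)  # Eq. (13)
                force[d] += rng.random() * f_ij                              # Eq. (15)
        r_i = rng.random()  # assumed: one random factor per particle
        v_i, x_i = [], []
        for d in range(dim):
            field = force[d] / Q[i]          # Eq. (16)
            accel = Q[i] * field / mass      # Eq. (17)
            v = r_i * V[i][d] + accel        # Eq. (18)
            v_i.append(v)
            x_i.append(X[i][d] + v)          # Eq. (19)
        new_V.append(v_i)
        new_X.append(x_i)
    return new_X, new_V
```

Because the field divides the force by \({Q}_{i}\) and the acceleration multiplies it back, the acceleration equals the resultant force when the mass is one. With \({K}_{0}=500\) and \(\alpha =30\), `coulomb_constant` falls from 500 at the first iteration to a value below \({10}^{-9}\) at the last, so the step size shrinks sharply as the search proceeds.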

The particle carrying the largest charge is considered the best individual. It attracts the other particles, which carry lower charges, and itself moves less in the search domain.

Population-based search methods begin their exploration with the initialization of a collection of candidate solutions called the initial population, which is commonly drawn at random from the search space. Exploration starts from these random solutions and is directed towards the global optimum through several control parameters and mechanisms. Occasionally, the random initial points lead the optimization algorithm towards a local optimum. To avoid such a trap, the concept of OBL, which is based on the theory of an opposite point, has been proposed. According to OBL, both the initial population and its opposite population are evaluated, which accounts for possible solutions in the opposite direction and thereby enables the algorithm to explore the search space more effectively. The opposite number \(\overline{x }\) of a given real number \(x\in [lb, ub]\) in one dimension can be computed as in Eq. (20):

$$\overline{x} = lb + ub - x,$$
(20)

where \(lb\) and \(ub\) denote the lower and upper bounds of the search domain, respectively. Equation (20) can be extended to a multidimensional space. Let \(x\in {R}^{n}, x=[{x}_{1}, {x}_{2}, \cdots , {x}_{n}]\), where \({x}_{i}\in R\); the opposite point \(\overline{x }=[\overline{{x }_{1}},\overline{{x }_{2}}, \cdots , \overline{{x }_{n}} ]\) can be computed as in Eq. (21).

$${\overline{x} }_{i}={lb}_{i}+{ub}_{i}-{x}_{i}, i=1, 2, \cdots ,n$$
(21)

In OBL, if the fitness function value of position \(x\) is inferior to that of its opposite position \(\overline{x }\) (i.e., \(fitness(x)<fitness(\overline{x })\)), it is replaced by \(\overline{x }\); otherwise, \(x\) is saved. The population is updated with the better value of \(x\) or \(\overline{x }\).
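A minimal Python sketch of Eq. (21) and of the replacement rule described above is given below. The function names and the example fitness function are hypothetical, and a larger fitness value is assumed to be better, in line with the inequality stated in the text.

```python
def opposite_point(x, lb, ub):
    """Eq. (21): component-wise opposite of x within the bounds [lb, ub]."""
    return [l + u - xi for xi, l, u in zip(x, lb, ub)]

def obl_update(population, lb, ub, fitness):
    """Keep x or its opposite, whichever has the larger fitness value."""
    updated = []
    for x in population:
        x_bar = opposite_point(x, lb, ub)
        updated.append(x_bar if fitness(x) < fitness(x_bar) else x)
    return updated

# Hypothetical usage: fitness is highest at the origin.
toy_fitness = lambda p: -sum(v * v for v in p)
print(obl_update([[9.0, 4.0]], [0.0, 0.0], [10.0, 5.0], toy_fitness))
```

In this toy case the point (9, 4) has the opposite (1, 1) within the bounds [0, 10] and [0, 5]; the opposite lies closer to the origin, so it replaces the original point.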

Maintaining a balance between intensification and diversification is a critical factor that significantly affects the performance of an evolutionary search algorithm. Elitism is applied to the selection operation of evolutionary algorithms for this purpose. This strategy retains the fittest candidates and benefits exploitation. In practice, the degree of elitism, i.e., the number of elites, is usually set to a small value. The elitism process copies a few of the best individuals from one generation into the population of the following generation. Its key purpose is to preserve promising regions of the search space across generations and enable their continued exploitation. Furthermore, it guarantees that the best individuals found during the entire run are present in the last generation, which is the final outcome.

EOAEFA algorithm

A poor convergence rate and trapping in local optima are the two weaknesses of the basic AEFA that influence its overall performance. The basic AEFA moves the current particles towards the best solution obtained thus far and may ignore better-fitting solutions that lie in the opposite direction from the current particle. The EOAEFA method avoids this through simultaneous consideration of a solution and its opposite, which helps improve the exploration ability of the basic AEFA. In addition, in each iteration, it saves the highly fit solutions as elites and carries them over to the next iteration, thereby helping achieve a better convergence. The EOAEFA does not significantly affect the configuration of the basic AEFA but improves its accuracy through the inclusion of OBL and elitism. Algorithm 1 presents the steps of the EOAEFA.

figure a

The EOAEFA method begins with the initialization of the algorithm-specific parameters and a randomly initialized population \(P\). The oppositional population \(\overline{P }\) is then generated from \(P\). Both \(P\) and \(\overline{P }\) are evaluated simultaneously, and the N best-fit particles are selected from \(P\cup \overline{P }\). Next, the AEFA operators are applied to the updated population to intensify and diversify the search. The positions and velocities of all particles are updated through the iterations. At the end of each iteration, M elite particles are identified. The value of M must be selected carefully and kept small: a larger value of M may significantly affect the quality of the solutions, as many duplicate solutions can then exist in the population. Finally, the M worst solutions are replaced by these elite solutions. Hence, the elite particles are carried to the next iteration, retaining the best solutions obtained thus far and improving the convergence rate of the algorithm.
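The two EOAEFA-specific steps, namely the opposition-based selection of the initial population and the elitist replacement at the end of an iteration, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions and does not reproduce Algorithm 1: the function names are hypothetical, a larger fitness value is assumed to be better, and the AEFA update between the two steps is omitted.

```python
def select_best(candidates, fitness, n):
    """Return the n candidates with the largest fitness values."""
    return sorted(candidates, key=fitness, reverse=True)[:n]

def opposition_init(P, lb, ub, fitness):
    """Build the opposite population and keep the N best of P and its opposite."""
    P_bar = [[l + u - x for x, l, u in zip(p, lb, ub)] for p in P]
    return select_best(P + P_bar, fitness, len(P))

def apply_elitism(population, elites, fitness):
    """Replace the M worst members of the population with the M stored elites."""
    ranked = sorted(population, key=fitness, reverse=True)
    m = len(elites)
    survivors = ranked[:len(ranked) - m] if m else ranked
    return survivors + [list(e) for e in elites]
```

In a full loop, the elites would be recorded with `select_best(population, fitness, M)` before the AEFA update and re-inserted with `apply_elitism` afterwards, so the population size stays at N throughout.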

EOAEFA + HONN-based forecasting

The proposed EOAEFA was used to adjust the parameters of the two HONNs: PSNN and FLANN; hence, two hybrid models were formed: EOAEFA-PSNN and EOAEFA-FLN. This process is explained as follows. An arbitrary HONN structure can be mapped onto a particle or agent of the EOAEFA. Therefore, the EOAEFA population can be viewed as a set of potential HONN structures. Each HONN structure, along with the training data, was evaluated and the corresponding fitness values were calculated as in Eq. (22).

$$fitness\left({HONN}_{i}\right)= \frac{1}{\left({\sum }_{k=1}^{input\ size}\left|{predicted}_{k}-{actual}_{k}\right|\right)/input\ size}$$
(22)
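Equation (22) is the reciprocal of the mean absolute error over the training patterns. A short Python sketch is given below; the function name `honn_fitness` and the small constant `eps`, which guards against division by zero for a perfect fit, are assumptions added for this example.

```python
def honn_fitness(predicted, actual, eps=1e-12):
    """Eq. (22): reciprocal of the mean absolute error (larger is better)."""
    mae = sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
    return 1.0 / (mae + eps)  # eps is an assumed guard, not part of Eq. (22)
```

For example, predictions that are each off by 0.5 give a mean absolute error of 0.5 and hence a fitness of about 2, whereas halving every error doubles the fitness.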

The EOAEFA is then applied to execute search operations as explained in the previous section on a population of the HONNs. The particles compete, and finally, the best solution, i.e., the best HONN architecture, evolves. The test data are then fed to the best HONN, and the deviation of the HONN output from the actual output is extracted and considered as the performance of the corresponding HONN model; the lower the deviation, the better the fitness of the HONN. Figure 1 illustrates this process.

Fig. 1
figure 1

EOAEFA + HONN based forecasting

FTS data and statistical analysis

For experimental purposes, real FTS datasets were collected from Internet sources. Four closing price series, NASDAQ, DJIA, S&P 500, and Russell 2000, were downloaded from https://finance.yahoo.com/, covering daily records for the trading days from August 23, 2021 to August 19, 2022. Each FTS consists of 255 index records comprising the date, open, high, low, close, and adj. close prices, and the volume. We considered only the open, high, low, and close price values for the experiment. Similarly, four daily exchange rate series against the US Dollar, Bitcoin (BTC), Euro (EUR), British Pound (GBP), and Japanese Yen (JPY), were collected from the same source, with each series containing 255 records. Another three FTS, daily crude oil prices, daily natural gas prices, and weekly coal prices, were collected from the U.S. Energy Information Administration (website: http://www.eia.doe.gov/). The crude oil and natural gas price series were collected from January 2, 2020, to August 16, 2022, each containing 658 records. Fifty-seven records of weekly coal prices were available from August 8, 2021, to August 13, 2022. Tables 3, 4, and 5 list the statistics of the closing price, exchange rate, and energy data series, respectively. The NASDAQ and Russell 2000 series in Table 3 show larger kurtosis values, indicating higher investment risks. All closing price series are platykurtic. The Russell 2000 series deviated significantly, whereas the other three were stable to an extent. All series showed weak correlations among their data points. The Phillips-Perron (PP) and augmented Dickey-Fuller (ADF) test results from these FTS show that the series are nonstationary. The Ljung-Box test results indicate a lack of autocorrelations in the series, and the KPSS statistics support the existence of non-stationarity around a deterministic trend. Figures 2 and 3 depict the trends in closing prices and distribution of the price series, respectively.

Table 3 Statistical properties of the closing prices series
Table 4 Statistical properties of the exchange rate series
Table 5 Statistical properties of energy price series
Fig. 2
figure 2

Daily closing prices series

Fig. 3
figure 3

Distribution of numerical closing prices of four FTS

Figure 2 shows the daily closing price data charts for the four FTS considered. The charts show frequent increases and decreases in the FTS, making the prediction of future prices difficult. Figure 3 shows the distribution of the closing price data. All four series exhibit an asymmetric distribution of data about the center, i.e., random variation. The DJIA, NASDAQ, and S&P 500 series skew left, whereas the Russell 2000 distribution is random. The four exchange rate series and three energy price series exhibit similar behaviors to the closing price series. Figures 4 and 5 present the exchange rate series and their distributions, respectively. Figures 6 and 7 present the energy price series and their distributions, respectively.

Fig. 4
figure 4

Trend of numerical closing prices of four FTS

Fig. 5
figure 5

Distribution of numerical closing prices of four FTS

Fig. 6
figure 6

Trend of numerical energy prices of four FTS

Fig. 7
figure 7

Distribution of numerical energy prices of four FTS

Results and analysis

This section presents the input preparation, normalization of the model input, research design, experimental outcomes, their analysis, and comparative studies.

Model input preparation and normalization

The next step after the FTS collection and analysis is the preparation of inputs for the forecasting models and the normalization of the input patterns. An ordinary forecasting process splits the available data into training and test sets. However, FTS samples cannot be selected arbitrarily or assigned at random to either set, as using future values to forecast past values is nonsensical. The temporal dependence of the data must be preserved during testing. Therefore, it is important to forecast future data from the current and immediately preceding observations. As the FTS prediction problem is a sequence prediction task, we adopted a sliding window mechanism to generate training and test patterns from the original FTS (Nayak et al. 2014). Figure 8 provides an example of the sliding window process, where \(\left[{X}_{i-k}, \cdots ,{X}_{i-2}, {X}_{i-1}\right]\) are used as the inputs and \({X}_{i}\) is considered the target, together forming one pattern. The window is then moved one step forward and the process repeats. The window size \(k\) remains fixed throughout the process and is determined experimentally. The sigmoid method, as in Eq. (23), is used to normalize data \({x}_{i}\) into \({x}_{norm}\) using the minimum (\({x}_{min}\)) and maximum (\({x}_{max}\)) data points of the current window under consideration.

Fig. 8
figure 8

Input pattern generation using sliding window process

$${x}_{norm}=\frac{1}{1+{e}^{-\left(\frac{{x}_{i}-{x}_{min}}{{x}_{max}-{x}_{min}}\right)}}$$
(23)
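The sliding window construction and the normalization of Eq. (23) can be sketched in Python as follows. The function names are hypothetical, and the treatment of a window whose values are all equal (which would otherwise divide by zero) is an assumption of this sketch. Whether the target value is included when the window minimum and maximum are taken is not stated in the text, so the normalizer is kept separate from the windowing step.

```python
import math

def sliding_windows(series, k):
    """Return (inputs, target) pairs: k consecutive values and the next one."""
    return [(list(series[i - k:i]), series[i]) for i in range(k, len(series))]

def sigmoid_normalize(values):
    """Eq. (23): min-max scale within the window, then apply a sigmoid."""
    lo, hi = min(values), max(values)
    span = hi - lo
    out = []
    for x in values:
        scaled = (x - lo) / span if span else 0.0  # assumed guard for flat windows
        out.append(1.0 / (1.0 + math.exp(-scaled)))
    return out
```

A series of 255 records with a window size of 12 yields 255 − 12 = 243 patterns. Because the scaled argument lies in [0, 1], Eq. (23) maps every value into the interval from 0.5 to approximately 0.731.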

Research design

Based on the methodologies described in Sect. "Proposed EOAEFA+HONN forecasting method", different experiments were designed using standardized input patterns from the 11 FTS. The window size in our experiments was set to 12, i.e., each input pattern consists of 12 data points. The same patterns were applied to all the forecasts to maintain an unbiased comparative study. Two consecutive patterns, formed by a one-step movement of the window over the series, differ only by the addition of one new data point and the removal of the oldest one. The variation in nonlinearity between two consecutive patterns is therefore nominal, so we used the optimized parameters of the previous pattern as the starting point for successive training instead of a new random parameter set. Once the network is trained using the first training set, the number of iterations for successive training sets is set to a small value. This type of adaptive training decreases the number of iterations and thus the training time. Seven models (six hybrid and one conventional ANN) were developed in a similar manner. The parameters of the two HONNs (i.e., FLANN and PSNN) were optimized using the proposed EOAEFA (forming EOAEFA-FLN and EOAEFA-PSNN-based forecasts), basic AEFA (forming AEFA-FLN and AEFA-PSNN-based forecasts), and GA (forming GA-FLN- and GA-PSNN-based forecasts); the seventh model was a BPNN forecast. Seven trigonometric functions (sine and cosine) were used in the FLANN for the functional expansion of the input data. Therefore, each 12-point input pattern was expanded to 84 terms. For the PSNN, a 12–8–1 architecture was used as the base. Both types of HONNs used sigmoid activation in their neurons. The AEFA parameters were set as follows: 50 particles, \(\alpha =30\), and \({K}_{0}=500\), following their respective articles (Yadav and Kumar 2020, 2019). The elitism factor was set to 2%, and the algorithm was iterated 100 times to reach the optimal parameter values. For the GA, the crossover and mutation probability values were set to 0.6 and 0.002, respectively. The GA was allowed 100 generations. The BPNN used a 12–25–1 architecture with a learning rate of 0.3, momentum factor of 0.4, and gradient descent learning. All experiments were conducted using MATLAB. To compensate for the stochastic behavior of all neural forecasts, each model was executed 30 times with the aforementioned parameters and random initial weights and threshold values. The average prediction error values from the 30 runs were recorded for performance comparisons. The mean absolute percentage error (MAPE) (Eq. (24)) was used to measure the prediction accuracy of all the forecasts.

$$MAPE= \frac{1}{No.of pattern}\sum_{i=1}^{No.of pattern}\frac{\left|{Target}_{i}-{Predicted}_{i}\right|}{{Target}_{i}}\times 100\%$$
(24)
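For reference, Eq. (24) can be computed with the short Python sketch below. The function name `mape` is hypothetical, and the sketch assumes that all target values are positive, as is the case for prices.

```python
def mape(targets, predictions):
    """Eq. (24): mean absolute percentage error, in percent."""
    n = len(targets)
    return 100.0 * sum(abs(t - p) / t for t, p in zip(targets, predictions)) / n
```

For instance, targets of 100 and 200 with predictions of 110 and 190 give relative errors of 0.10 and 0.05, and therefore a MAPE of 7.5%.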

Analysis of the results from the closing prices series

All seven forecasts were iterated 100 times, and Fig. 9 plots the error convergence graphs from the four closing price series. The EOAEFA-FLN converged the fastest in all series except the Russell 2000 series. The four AEFA-based models converged similarly to one another and better than the GA-based models, whereas the BPNN convergence was unsatisfactory. Table 6 lists the MAPE statistics for the seven forecasting approaches from the four series. Values less than \({10}^{-5}\) are considered to be zero. The best MAPE is indicated in bold, and the second best in italics. For the NASDAQ series, the EOAEFA-FLN generated the best average error of 0.010285 whereas the EOAEFA-PSNN was second-best with an average error of 0.010823. Similarly, both methods retained their respective positions in the case of the DJIA and Russell 2000 FTS, and tied in the S&P 500 FTS. Overall, EOAEFA-FLN-based forecasting performs better than the other methods.

Fig. 9
figure 9

MAPE convergence graphs from seven forecasts

Table 6 MAPE statistics from four closing prices series

To confirm the success of the proposed forecasts, Figs. 10, 11, 12, and 13 plot the model predictions against the actual closing prices for the DJIA, NASDAQ, S&P 500, and Russell 2000 FTS, respectively. Based on these plots, the closeness of the EOAEFA-HONN-based predictions to the actual closing prices is evident. The EOAEFA-FLN predictions are the closest to the actual values, followed by those of the EOAEFA-PSNN. Both models were competitive and efficient in preserving the patterns of the actual closing price series. The basic AEFA- and GA-based HONN predictions were moderate, whereas those from the BPNN-based forecasting were poor. The stability of the proposed forecasts is confirmed by the box–whisker plots depicted in Fig. 14.

Fig. 10
figure 10

Prediction plots of seven models from DJIA series

Fig. 11
figure 11

Prediction plots of seven models from NASDAQ series

Fig. 12
figure 12

Prediction plots of seven models from S&P 500 series

Fig. 13
figure 13

Prediction plots of seven models from Russell 2000 series

Fig. 14
figure 14

Box-Whisker plots of MAPE values with seven forecasts and four closing price FTS

Analysis of results from exchange rate series

The next type of FTS data considered for the evaluation of the EOAEFA + HONN-based forecasts comprises four exchange rate FTS: BTC, EUR, GBP, and JPY versus the US dollar. The input data preparation and normalization were the same as those for the closing price series. Figure 15 visualizes the convergence rates of all forecasts. Here, the EOAEFA-FLN and EOAEFA-PSNN converged faster than the other forecasts. Table 7 summarizes the MAPE statistics. Values less than \({10}^{-5}\) are considered to be zero. The proposed forecasts achieved lower error statistics than the others; in particular, the EOAEFA-FLN generated the lowest average errors of 0.005833, 0.006013, and 0.006599 from the BTC/USD, EUR/USD, and GBP/USD series, respectively, followed by the EOAEFA-PSNN. However, for the JPY/USD series, the AEFA-PSNN is the best performer with an error of 0.004579. Overall, the EOAEFA-FLN obtained the lowest minimum, mean, median, maximum, standard deviation, and interquartile range values, which is evidence of its superior performance in capturing the underlying nonlinearity coupled with the exchange rate FTS. Among the forecasts, the BPNN obtained poor prediction accuracies on the four FTS owing to backpropagation-based learning. The enhanced learning capability of the EOAEFA was established by these numerical outcomes. Figures 16, 17, 18, and 19 depict the predicted versus actual exchange rates from all models for BTC/USD, EUR/USD, GBP/USD, and JPY/USD, respectively. Evidently, the EOAEFA-FLN and EOAEFA-PSNN can follow the original FTS patterns more accurately than the others. Although the GA- and basic AEFA-based forecasts seem to follow the original pattern, their deviations from the actual values are much larger than those of the proposed models. Similar conclusions can be drawn from the box–whisker plots shown in Fig. 20.

Fig. 15
figure 15

MAPE convergence rate of seven forecasts from four exchange rate FTS

Table 7 MAPE statistics from four exchange rate series
Fig. 16
figure 16

Prediction plots of seven models from BTC/USD FTS

Fig. 17
figure 17

Prediction plots of seven models from EUR/USD FTS

Fig. 18
figure 18

Prediction plots of seven models from GBP/USD FTS

Fig. 19
figure 19

Prediction plots of seven models from JPY/USD FTS

Fig. 20
figure 20

Box-Whisker plots of MAPE values with seven forecasts and four exchange rate FTS

Analysis of results from the energy prices series

The proposed forecasts were then applied to three energy price FTS. The input selection, preprocessing, and model training were conducted in the same fashion as for the previous FTS. Figure 21 depicts the error convergence rate of all forecasts from the three FTS. Here, the EOAEFA-FLN again converged faster than the others. Table 8 lists the average error values from 30 independent runs. Values less than \({10}^{-5}\) are considered to be zero. For the crude oil and natural gas price series, the EOAEFA-FLN ranked first again, followed by the EOAEFA-PSNN; it achieved average errors of 0.003688 and 0.004725 for the crude oil and natural gas price series, respectively. For the weekly coal price series, the EOAEFA-FLN and EOAEFA-PSNN obtained the same average error, followed by the AEFA-PSNN-based forecasts. The AEFA- and GA-based predictions were moderate, whereas those of the BPNN were inferior. The prices predicted by the proposed approach were closer to the actual prices, as shown in Figs. 22, 23, and 24 for the crude oil, natural gas, and coal price series, respectively. Forecasted prices appear to deviate slightly from actual prices in the case of the weekly coal price data. Premature training resulting from insufficient training data (only 50 data points were available) may be the reason for this. However, the direction of movement of all FTS was well retained by the proposed forecasts. The box-whisker plot in Fig. 25 further supports the superiority of the proposed forecasts.

Fig. 21
figure 21

MAPE convergence rate of seven forecasts from three energy price FTS

Table 8 MAPE statistics from three energy price series
Fig. 22
figure 22

Prediction plots of seven models from crude oil price series

Fig. 23
figure 23

Prediction plots of seven models from natural gas price series

Fig. 24
figure 24

Prediction plots of seven models from weekly coal price series

Fig. 25
figure 25

Box-Whisker plots of MAPE from seven forecasts and three FTS

Statistical testing and further analysis

To further confirm the benefits of the EOAEFA + HONN-based forecasting, we conducted statistical tests, namely the Wilcoxon signed-rank and Diebold-Mariano (DM) tests. In addition, runtime, relative worth, and MAPE reduction percentage analyses were conducted for the proposed model. This section summarizes the outcomes of these tests and analyses.

The forecast performances were also compared in terms of computation time. The experiments in this study were conducted using a system with an Intel(R) Core(TM) i7-10750H CPU @ 2.60 GHz, 2.59 GHz, and 16.0 GB of memory in a MATLAB-2016 programming environment. Table 9 summarizes the execution times of all the models. For input size \(N\) and functional expansion unit size \(FE\), an FLN-based model must adjust (\(N\times FE\)) weights and one \(bias\) value. Similarly, for hidden layer width \(NH\), a PSNN-based model must adjust (\(N\times NH\)) weights and \(NH\) biases. With \(M\) hidden neurons and one output neuron, the BPNN model must fine-tune \(M\times (N+1)\) weights and (\(M+1\)) biases. Evidently, the BPNN required the highest running time because of its greater number of adjustable parameters and backpropagation learning. The run times of the EOAEFA-FLN and EOAEFA-PSNN were close to each other and slightly greater than those of the basic AEFA-based models because of the inclusion of OBL. However, this can be tolerated as the cost of a higher forecasting accuracy.
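The parameter counts stated above can be checked with a few lines of Python. The sketch below plugs in the settings reported in the research design (input size 12, a 12–8–1 PSNN, and a 12–25–1 BPNN) and assumes that the functional expansion unit size \(FE\) equals 7, corresponding to the seven trigonometric functions; the function names are hypothetical.

```python
def flann_params(n_inputs, fe):
    """N x FE weights plus one bias."""
    return n_inputs * fe + 1

def psnn_params(n_inputs, nh):
    """N x NH weights plus NH biases."""
    return n_inputs * nh + nh

def bpnn_params(n_inputs, m):
    """M x (N + 1) weights plus (M + 1) biases."""
    return m * (n_inputs + 1) + (m + 1)

print(flann_params(12, 7), psnn_params(12, 8), bpnn_params(12, 25))
```

Under these assumptions the FLN-based model adjusts 85 parameters, the PSNN-based model 104, and the BPNN 351, which is consistent with the BPNN being the slowest of the seven models.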

Table 9 Run-time (s) from seven forecasts and eleven FTS

The Wilcoxon signed-rank test, which is a paired two-sided test, was conducted as a significance check. The null hypothesis states that the differences between the proposed and comparative models come from a distribution with zero median. Rejection is indicated by the logical value h = 1. The DM test is a pairwise comparison of forecasts, used to determine whether the two forecasts under consideration are equally accurate. The null hypothesis states that the two forecasts have an equivalent accuracy, and the alternate hypothesis states that they have different levels of accuracy. If the computed DM statistic lies outside the critical interval \(-1.965<DM<1.965\), the null hypothesis of no difference is rejected. Table 10 presents the Wilcoxon signed-rank test (with a 5% significance level) and DM test results from the closing price series. These statistics indicate that the predictions of the proposed forecast significantly differ from those of the other forecasts.
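The DM decision rule can be illustrated with the following Python sketch. It assumes a squared-error loss and one-step-ahead forecasts, so that no autocovariance correction is applied; neither choice is stated in the text, and the function names are hypothetical. The sketch also assumes that the loss differentials are not all identical, since the variance in the denominator would otherwise be zero.

```python
import math

def dm_statistic(errors_a, errors_b):
    """DM statistic for two forecast error series (assumed squared-error loss)."""
    d = [ea * ea - eb * eb for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n
    return mean_d / math.sqrt(var_d / n)

def rejects_equal_accuracy(dm, critical=1.965):
    """True when the statistic falls outside (-critical, critical)."""
    return abs(dm) > critical
```

A positive statistic under this convention indicates that the first forecast has the larger average loss; the sign convention is a choice of the sketch and does not affect the two-sided decision.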

Table 10 Significance test results of the proposed forecast

The efficiency of the proposed forecasting approach was observed in the above discussion. In most cases, the proposed forecast achieved the lowest error statistics compared with the other forecasts. To quantify the comparative performance of the EOAEFA-FLN more precisely, another measure, called the relative worth (RW) of a model, was considered. This is the average reduction ratio of the prediction error of a particular model relative to the worst-performing model over all the FTS. We used the MAPE values to obtain the RWs. The relative worth \({RW}_{j}\) of a model over the worst-performing model is defined in Eq. (25). For our calculations, we considered the BPNN to be the worst-performing model:

$$RW_{j} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {\frac{{MAPE_{i} - MAPE_{ij} }}{{MAPE_{i} }}} \right) \times 100\% , \forall j = 1,2,3, \cdots , n,$$
(25)

where \({MAPE}_{ij}\) is the forecasting error of the \({j}^{th}\) model on the \({i}^{th}\) FTS, \({MAPE}_{i}\) is the error of the worst-performing forecast for the same FTS, and N is the number of datasets. Table 11 lists the computed RW values. Based on these statistics, the EOAEFA-based HONN forecasting has high RW values compared with those of the others.
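Equation (25) amounts to averaging, over the datasets, the percentage by which a model reduces the error of the worst-performing model. A minimal Python sketch follows; the function name and the numbers in the usage line are hypothetical and are not taken from Table 11.

```python
def relative_worth(mape_model, mape_worst):
    """Eq. (25): mean percentage error reduction over the worst model."""
    n = len(mape_model)
    return 100.0 * sum((w - m) / w for m, w in zip(mape_model, mape_worst)) / n

# Hypothetical MAPE values on two datasets.
print(relative_worth([0.01, 0.02], [0.10, 0.10]))
```

In this toy case the model reduces the worst model's error by 90% and 80% on the two datasets, giving an RW of 85%.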

Table 11 RW of forecasting models

The model predictions were also compared in terms of the percentage reduction in the MAPE (MR) obtained by adopting the proposed prediction model, computed as in Eq. (26). Figure 26 shows the computed MR values from all FTS, indicating that adopting the EOAEFA-FLN reduces the MAPE relative to the other forecasts.

Fig. 26
figure 26

MAPE reduction on adopting EOAEFA-FLN over other forecasts

$$MR=\frac{\text{MAPE of existing forecast}-\text{MAPE of proposed forecast}}{\text{MAPE of existing forecast}}\times 100\%$$
(26)

Conclusions and future work

The AEFA is a newly developed physics-inspired metaheuristic with a robust development ability, simple computations, and the ability to attain global optima with few control parameters. Avoiding poor convergence while improving the performance of the basic AEFA and solving real engineering problems remains an open challenge. To address these issues, this study proposes an elitism oppositional learning-based AEFA, called EOAEFA, to intensify the exploration and exploitation abilities of the AEFA. The EOAEFA was used to determine the weights and thresholds of the FLANN and PSNN, forming two hybrid models: EOAEFA-FLN and EOAEFA-PSNN. The proposed methods were evaluated for predicting future patterns of the closing price series of four fast-growing stocks, four exchange rate series from developed economies, and three energy price FTS. The combined effects of the enhanced exploration ability of the EOAEFA, higher fault tolerance capability, and powerful mapping of single-layer trainable weights of the HONNs made the FTS predictions effective and accurate.

To verify the competitiveness of the proposed approach, its predictions were compared with those of AEFA-FLN, AEFA-PSNN, GA-FLN, GA-PSNN, and BPNN-based forecasting. The results show that the EOAEFA + HONNs, particularly the EOAEFA-FLN, obtained accurate predictions for most FTS compared with the others. For the four closing price FTS, the EOAEFA-FLN generated the lowest MAPE values of 0.010285, 0.010895, 0.012230, and 0.009959. For three of the four exchange rate FTS, it obtained MAPE values of 0.005833, 0.006013, and 0.006599, which are lower than those of the others. Similarly, it generated the lowest MAPE values of 0.003688, 0.004725, and 0.039972 for the three energy price FTS. The EOAEFA-FLN provided the best forecast, followed by the EOAEFA-PSNN. In addition, the EOAEFA-FLN and EOAEFA-PSNN achieved relative worths of 87.85% and 83.85%, respectively, over the worst-performing model. The stronger performance of the EOAEFA-FLN was further established through statistical test results. Further improvements in the EOAEFA performance and its hybridization with other ANNs, such as RNNs and DL methods, are possible extensions of the current study. Moreover, the predictability of the proposed forecast can be further explored in the healthcare and material science engineering domains.