1 Introduction

Genetic algorithms (GAs), proposed by J. Holland, are one of the most important classes of evolutionary computation methods. A GA mimics the evolutionary process of nature, based on Darwin's theory of "survival of the fittest." GAs have five basic steps: generation of the initial population, selection of parents for the next generation, crossover, mutation and replacement. The initial population, consisting of chromosomes that correspond to candidate solutions, is generated randomly. This is followed by the selection of parents, which are recombined to create offspring for the next generation. Popular selection methods in GAs include roulette wheel selection, stochastic universal sampling, tournament selection and rank selection. Crossover and mutation are the operators that provide diversity to the solutions. The final step is replacement, which decides which individuals (from the parent and offspring sets) will form the population of the next generation; this choice influences the convergence of the algorithm [35]. All these steps are important for the GA process. Parameters such as population size, crossover rate, mutation rate and the number of chromosomes to be eliminated (replaced) also affect the performance of the GA. Although the GA literature suggests typical values for the population size, crossover rate and mutation rate, it offers no indicative value for the number of chromosomes to be eliminated in the replacement step. There are various strategies that try to preserve diversity in the population while, at the same time, trying to ensure good convergence.

This balance is known as the exploration/exploitation trade-off and is a challenging aspect of evolutionary computation, with no best method known so far. In this paper, a new approach called statistical-based replacement is proposed to determine the chromosomes to be eliminated in the replacement step. The chromosomes are chosen by a statistical evaluation based on the empirical distribution of the objective function values, where the objective function is the root mean square error (RMSE). If the RMSE values of the chromosomes in a population are normally distributed, the mean is taken as the suitable measure of central tendency, and the chromosomes with RMSE values greater than the mean are eliminated in the replacement step. Otherwise, the median is taken as the suitable measure of central tendency, and the chromosomes with RMSE values greater than the median are eliminated. The number of chromosomes to be eliminated is thus determined automatically, preserving about 50% of the chromosomes from the parent and offspring population. The performance of the proposed GA approach is evaluated on the training process of single multiplicative neuron model artificial neural networks (SMNM-ANN) proposed in [30].

The paper is organized as follows: in Sect. 2, we introduce the statistical-based replacement method. In Sect. 3, we give a brief introduction to SMNM-ANN. Section 4 summarizes very briefly the GA. Our proposed methodology is given in Sect. 5. Section 6 compares the results of our proposed algorithm against other methods. Finally, the paper ends with a discussion and conclusions in Sect. 7.

2 The proposed statistical-based replacement procedure

Determining the number of chromosomes to be eliminated is a very important step in the GA iterative process. The size of the population usually remains constant at each iteration. This is achieved in the replacement phase, where a constant number from the parent and offspring population is preserved for the next generation. Replacement is the last step of any genetic algorithm cycle.

Let us consider the objective function values obtained for each chromosome in the population as a data set. We verify whether this data set has a normal distribution or not. The Lilliefors test is used to check the normality of the distribution. When a data set has a normal distribution, the mean statistic is the best measure of central tendency. If a data set does not have a normal distribution, the median statistic can be preferred as a measure of central tendency.
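The normality check described above can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: the function names are hypothetical, the significance level (5%) is an assumption because the text does not state one, and the critical value 0.886/sqrt(n) is the tabulated large-sample approximation for the Lilliefors test, which is only appropriate for n > 30. For smaller samples or exact p-values, a statistics package (for example the `lilliefors` function in statsmodels) would be the safer choice.

```python
import math
from statistics import NormalDist, mean, stdev

def lilliefors_statistic(values):
    """Largest distance between the empirical CDF and a normal CDF
    whose mean and standard deviation are estimated from the data."""
    x = sorted(values)
    n = len(x)
    spread = stdev(x)
    if spread == 0:
        raise ValueError("all values are identical; the test is undefined")
    fitted = NormalDist(mean(x), spread)
    d = 0.0
    for i, v in enumerate(x):
        cdf = fitted.cdf(v)
        # the empirical CDF jumps from i/n to (i+1)/n at v
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

def looks_normal(values, coeff=0.886):
    """Assumed decision rule: 5% level, large-sample critical value coeff/sqrt(n)."""
    return lilliefors_statistic(values) <= coeff / math.sqrt(len(values))

# Illustrative samples only (not data from the paper)
normal_like = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
skewed = [i * 0.001 for i in range(45)] + [10.0 + i for i in range(5)]
normal_like_ok = looks_normal(normal_like)
skewed_ok = looks_normal(skewed)
```

With these two illustrative samples, the evenly spaced normal quantiles pass the check while the sample with a few large outliers does not, which is the situation in which the median would be used instead of the mean.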

From this statistical perspective, if the relevant data set (the objective function values of each chromosome in our case) has a normal distribution, the chromosomes with an objective function value greater than the mean statistic (obtained by dividing the sum of the objective function values of all chromosomes in the population by the total number of chromosomes) are eliminated when the objective function is to be minimized. When the objective function is to be maximized, the chromosomes with an objective function value smaller than the mean statistic are eliminated.

Otherwise, if the relevant data set does not have a normal distribution, the chromosomes with an objective function value greater than the median statistic are eliminated in the case of minimization. In the case of maximization, the chromosomes with an objective function value smaller than the median statistic are eliminated. This strategy is called statistical-based replacement in this paper.
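The elimination rule for both minimization and maximization can be sketched as below. The function name and the example values are hypothetical, and the outcome of the normality test is passed in as a boolean so that the sketch stays independent of any particular test implementation.

```python
from statistics import mean, median

def eliminated_indices(ofv, is_normal, minimize=True):
    """Indices of chromosomes to eliminate, given their objective function values."""
    # mean for normally distributed values, median otherwise
    threshold = mean(ofv) if is_normal else median(ofv)
    if minimize:
        # minimization: values above the threshold are the worse ones
        return [i for i, v in enumerate(ofv) if v > threshold]
    # maximization: values below the threshold are the worse ones
    return [i for i, v in enumerate(ofv) if v < threshold]

# Hypothetical objective function values, for illustration only
ofv = [0.1, 0.2, 0.3, 0.4, 2.0]
by_mean = eliminated_indices(ofv, is_normal=True)
by_median = eliminated_indices(ofv, is_normal=False)
by_median_max = eliminated_indices(ofv, is_normal=False, minimize=False)
```

In this example the single large value pulls the mean up to 0.6, so the mean-based rule removes only one chromosome, whereas the median-based rule removes the two worst ones.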

An example of the population representation used by the method is given in Fig. 1. There are \(n\) chromosomes, and each chromosome has \(g\) genes. OFV shows the objective function value obtained for each chromosome, and the median statistic of OFV is calculated by Eq. (1):

$${\text{med}} = \left\{ {\begin{array}{*{20}c} {{\text{X}}_{{\left( {\frac{{\text{n + 1}}}{{2}}} \right)}} } & {\text{if n is odd number}} \\ {\frac{{{\text{X}}_{{\left( {\frac{{\text{n}}}{{2}}} \right)}} {\text{ + X}}_{{\left( {\frac{{\text{n}}}{{2}}} \right){ + 1}}} }}{{2}}} & {\text{if n is even number}} \\ \end{array} } \right.$$
(1)

where \(X_{(k)}\) denotes the \(k\)-th smallest of the \(n\) OFV values, i.e. the value at rank \(k\) after sorting the OFV values in ascending order.
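As a small worked example of Eq. (1), the following sketch computes the median directly from the sorted values and handles the odd and even cases separately. The function name and the sample values are illustrative assumptions.

```python
def median_eq1(ofv):
    """Median of the objective function values, following Eq. (1)."""
    x = sorted(ofv)  # X_(1) <= X_(2) <= ... <= X_(n)
    n = len(x)
    if n == 0:
        raise ValueError("ofv must not be empty")
    if n % 2 == 1:
        # odd n: X_((n+1)/2); subtract 1 because Python lists are 0-based
        return x[(n + 1) // 2 - 1]
    # even n: average of X_(n/2) and X_(n/2 + 1)
    return (x[n // 2 - 1] + x[n // 2]) / 2

# Hypothetical RMSE values, for illustration only
ofv_odd = [0.30, 0.10, 0.50, 0.20, 0.40]
ofv_even = [0.30, 0.10, 0.50, 0.20]
med_odd = median_eq1(ofv_odd)
med_even = median_eq1(ofv_even)
```

For the odd-length list the result is the middle value of the sorted list, and for the even-length list it is the average of the two middle values.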

Fig. 1 An example of the representation used by the proposed method

3 Single multiplicative neuron model artificial neural networks

SMNM-ANN, proposed in [30], is an artificial neural network that uses a multiplicative neuron in its structure instead of the additive neuron that is most common in ANN models. The authors use the backpropagation (BP) learning algorithm to train SMNM-ANN.

The structure of SMNM-ANN is given in Fig. 2.

Fig. 2 The architecture of SMNM-ANN

In Fig. 2, \(f\) represents the activation function, the join function \(\Omega\) is multiplication, the weight and bias vector is \(\Theta =({w}_{1}, {w}_{2},\ldots ,{w}_{m},{b}_{1}, {b}_{2},\ldots ,{b}_{m})\) and the inputs are \(({x}_{1}, {x}_{2},\ldots ,{x}_{m})\). The neuron model with \(m\) input variables \(\left( {x_{i} , i = 1,2, \ldots ,m} \right)\) given in Fig. 2 has \(\left( {2 \times m} \right)\) parameters. Of these, \(m\) correspond to the weights \(\left( {w_{i} , i = 1,2, \ldots ,m} \right)\) and the other \(m\) to the biases \(\left( {b_{i} , i = 1,2, \ldots ,m} \right)\). The output of the SMNM-ANN with the logistic activation function is calculated using Eqs. (2) and (3):

$${\text{net}} = \mathop \prod \limits_{{{\text{i}} = 1}}^{m} \left( {w_{{\text{i}}} x_{{\text{i}}} + b_{i} } \right)$$
(2)
$${\text{y = f}}\left( {{\text{net}}} \right) = \frac{{1}}{{{\text{1 + exp}}\left( {\text{ - net}} \right)}}$$
(3)

Many heuristic optimization algorithms have also been used in the training process of SMNM-ANN in recent years. For the optimization of SMNM-ANN, [24] uses particle swarm optimization (PSO), [22] uses an improved backpropagation algorithm, [6] uses the harmony search algorithm, [37] employs online training algorithms, [19] uses an improved glow-worm swarm optimization algorithm, [10] uses the differential evolution algorithm (DEA), and [11] uses the artificial bat algorithm (ABA). [12] also uses PSO for the training of SMNM-ANN and proposes a robust learning algorithm for SMNM-ANN. [38] makes use of SMNM-ANN for forecasting hourly wind speed. [27] uses an SMNM-ANN based on a Gaussian activation function. [28] proposes an SMNM-ANN with an autoregressive coefficient for time-series modeling. [13] uses the sine cosine algorithm and [9] uses a hybrid ABA and BP algorithm in the training process of SMNM-ANN. [4] proposes a threshold-based SMNM-ANN for non-linear time-series forecasting.

4 Genetic algorithms

There are many metaheuristic optimization algorithms, with different properties, available in the literature. Among these algorithms, the harmony search algorithm aims to obtain the best harmony from different melodies, the sine cosine optimization algorithm employs the properties of the sine and cosine functions, the symbiotic learning algorithm is linked to the optimization process of symbiotic relationships in nature, and GA is a population-based heuristic optimization algorithm [18]. Many popular metaheuristic optimization algorithms similar to these are clearly explained, together with their algorithmic descriptions, in the studies of [16, 36] and [23].

Among these methods, GA is one of the oldest known heuristic algorithms. GA differs from other heuristic optimization algorithms with its unique paradigm and operators used in the optimization process.

GA is a search and optimization method inspired by natural selection and genetics. GA is an approximation method that is generally very effective in searching very large (almost intractable) search spaces and in finding a good solution. The elements of a GA are called chromosomes, the basic structural units of a chromosome are called genes, and the values of genes are called alleles. To solve a problem with a GA, it is necessary to create an initial population and to determine the types and rates of the crossover and mutation operators, the selection operator and the stopping criterion. All of these parameters and operators greatly affect the overall performance.

The chromosomes correspond to potential solutions. If the GA has a good initial population, the algorithm is considered to have a better chance of finding a good solution. The population size parameter, which is generally defined by the user, is one of the important factors affecting the performance of the GA. It is known that small population size can direct the algorithm to non-optimal solutions, and large population size may require more computation time for the algorithm to find a solution. The crossover operator, one of the most basic operators in GA, is an operator that allows the creation of new chromosomes according to a predetermined recombination mechanism applied to selected chromosome pairs. The mutation operator is an operator applied generally with a lower probability. The mutation operator takes a random walk around the chromosomes and has the effect of replacing the allele of a random gene of a randomly selected chromosome. In the final step of GA, which is the replacement of the population for the next generation, the chromosomes from the old population are replaced by new ones.

This process is iterative, repeated until a stopping condition is met.

GA has been used for different aims with ANNs. [17] and [29] proposed methods based on GA and ANN. GA-based ANN has been used for scheduling problems in [1, 2], for a fault diagnosis model in [33], for bankruptcy prediction in [5], and for forecasting in [34]. GA has been used for designing ANN in [3], [14] and [26]. [8] and [32] trained the feed-forward ANN with GA. [25] used GA for missing data problems in ANN. [20] and [31] compared GA with backpropagation and simulated annealing, respectively. [21] compared GA-based ANN with statistical methods. [39] used GA for input selection in ANN.

5 The proposed genetic algorithm with statistical-based replacement procedure for the training of SMNM-ANN

Determining the size and the elements of the chromosome set to be eliminated in the replacement step is an open problem in the GA process. In this paper, we propose a method to overcome the problem of selecting the chromosomes to be eliminated in the replacement step, using a statistical-based replacement procedure. The number and identity of the chromosomes to be eliminated are determined by whether the RMSE value of a chromosome is greater or smaller than the mean or median of the objective function values of the entire population. Chromosomes with RMSE values greater than the mean or median RMSE value of all chromosomes in the population are eliminated. The innovations and advantages of the proposed method are as follows:

  • The method is systematic and statistically based;

  • Important genes are preserved due to the fact that the top 50% of the chromosomes in the population are not eliminated;

  • The convergence speed of GA is increased by using the greedy selection strategy in the mutation and crossover steps;

  • The restart strategy is used to increase the probability that the algorithm will escape the local optimum trap;

  • In order to prevent the overfitting problem of the artificial neural network, the number of iterations in which the genetic algorithm cannot provide improvement in the best chromosome of the population is checked as an early stop condition.

The steps of the proposed methodology are given below.

Step 1. Set GA and SMNM-ANN parameters.

These parameters are:

  • \(nc\): the number of chromosomes in the population;

  • \(ng\): the number of genes in a chromosome;

  • \(cor\): the crossover ratio;

  • \(mr\): the mutation ratio;

  • \(m\): the number of the inputs for the network;

  • \(n\): the size of the training set.

Step 2. Generate the initial population.

The initial population is generated by considering the parameters \(nc, ng {\text{ and }} m\). Since an SMNM-ANN with \(m\) input will have \(m\) bias and \(m\) weight values, each chromosome in the population has a total of \(\left( {2 \times m} \right)\) genes. These gene values are generated with a uniform distribution between zero and one \(\left( {U\left( {0,1} \right)} \right)\). A population structure with \(m\) and \(nc\) parameters is represented in Fig. 3.

Fig. 3 The positions in the population of a genetic algorithm with statistical-based replacement
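Step 2 can be sketched as follows: each of the \(nc\) chromosomes holds \(2 \times m\) genes drawn from \(U(0,1)\). The function name, the seed and the example sizes are assumptions made for illustration.

```python
import random

def init_population(nc, m, rng):
    """Create nc chromosomes, each with 2*m genes (m weights followed by m biases)."""
    return [[rng.random() for _ in range(2 * m)] for _ in range(nc)]

# Example: 30 chromosomes for a network with 3 inputs (6 genes each)
rng = random.Random(0)  # fixed seed only to make the sketch reproducible
population = init_population(nc=30, m=3, rng=rng)
```

Note that `random.Random.random()` draws from the half-open interval [0, 1), which is the usual practical stand-in for \(U(0,1)\).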

Step 3. Calculate the objective function values of each chromosome in the population.

To this end, the net value of each chromosome is first calculated using Eq. (2). This net value is then used to calculate the SMNM-ANN output (\(\hat{y}_{t}\)) using Eq. (3). Finally, the RMSE criterion given in Eq. (4) is used as the fitness function value.

$${\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{{{\text{t}} = 1}}^{n} \left( {y_{{\text{t}}} - \hat{y}_{{\text{t}}} } \right)^{2} }}{n}}$$
(4)
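Step 3 can be sketched as a fitness function that decodes a chromosome, evaluates the network on every training pattern and returns the RMSE of Eq. (4). The function names and the toy data are hypothetical; the gene order (weights first, then biases) follows the definition of \(\Theta\) given for Fig. 2.

```python
import math

def smnm_predict(chromosome, x):
    """SMNM-ANN output for one input pattern x, Eqs. (2) and (3)."""
    m = len(x)
    weights, biases = chromosome[:m], chromosome[m:]
    net = math.prod(w * xi + b for w, xi, b in zip(weights, x, biases))
    if net >= 0:
        return 1.0 / (1.0 + math.exp(-net))
    e = math.exp(net)  # stable form for negative net values
    return e / (1.0 + e)

def rmse_fitness(chromosome, inputs, targets):
    """RMSE of Eq. (4) between the targets and the network outputs."""
    n = len(targets)
    squared = sum((y - smnm_predict(chromosome, x)) ** 2
                  for x, y in zip(inputs, targets))
    return math.sqrt(squared / n)

# Toy data, for illustration only: m = 2 inputs, n = 2 patterns
chromosome = [1.0, 2.0, 0.0, 0.1]   # w1, w2, b1, b2
inputs = [[0.5, 0.2], [0.0, 0.0]]
targets = [1.0, 0.0]
fitness_value = rmse_fitness(chromosome, inputs, targets)
```

A lower value indicates a better chromosome, so the GA treats this fitness as a quantity to be minimized.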

Step 4. Apply the genetic operators.

Step 4.1. Apply crossover operator.

Before applying the crossover operator, the number of chromosome pairs to be crossed is determined by the \(cor\) parameter. The chromosomes are selected randomly from the population, and the corresponding genes of each chromosome pair are exchanged. Each new chromosome is evaluated according to the greedy selection strategy: it is included in the population if its fitness value is better than that of the old chromosome; otherwise, the old chromosome is kept in the population.
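The crossover step with greedy selection can be sketched as below. Several details are assumptions, since the text does not fix them: one-point crossover is used as the recombination mechanism, the number of pairs is taken as \(cor \times nc / 2\), and each child is compared with the parent occupying the same position. The fitness function used in the example (the sum of the genes) is only a stand-in for the RMSE.

```python
import random

def greedy_crossover(population, fitness, cor, rng):
    """One-point crossover on random pairs; a child replaces its parent only if better."""
    new_pop = [list(c) for c in population]
    # assumption: cor is the fraction of chromosomes that take part in crossover
    n_pairs = int(cor * len(new_pop) / 2)
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(new_pop)), 2)
        a, b = new_pop[i], new_pop[j]
        cut = rng.randrange(1, len(a))  # cut point; needs at least 2 genes
        child_i = a[:cut] + b[cut:]
        child_j = b[:cut] + a[cut:]
        # greedy selection for minimization: keep a child only if its fitness is lower
        if fitness(child_i) < fitness(a):
            new_pop[i] = child_i
        if fitness(child_j) < fitness(b):
            new_pop[j] = child_j
    return new_pop

rng = random.Random(0)
old_pop = [[rng.random() for _ in range(4)] for _ in range(10)]
crossed_pop = greedy_crossover(old_pop, fitness=sum, cor=0.6, rng=rng)
```

Because a child is accepted only when it improves on the chromosome it replaces, the fitness at each position of the population can never get worse in this step.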

Step 4.2. Apply the mutation operator.

The number of chromosomes to be mutated is the product of the \(nc\) and \(mr\) parameters. The chromosomes are selected randomly from the population, and the gene or genes to be changed are also selected randomly. The new chromosome obtained as a result of the mutation is evaluated according to the greedy selection strategy: it is included in the population if it has a better fitness value than the parent chromosome; otherwise, the old chromosome is kept in the population.
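A minimal sketch of the mutation step with greedy selection is given below. It assumes that exactly one gene per selected chromosome is changed, that the new allele is redrawn from \(U(0,1)\), and that \(nc \times mr\) is rounded to the nearest integer; none of these details is specified in the text. The sum of the genes again stands in for the RMSE.

```python
import random

def greedy_mutation(population, fitness, mr, rng):
    """Mutate round(nc * mr) random chromosomes; keep a mutant only if it is better."""
    new_pop = [list(c) for c in population]
    n_mut = round(len(new_pop) * mr)  # assumption: rounded to the nearest integer
    for idx in rng.sample(range(len(new_pop)), n_mut):
        mutant = list(new_pop[idx])
        gene = rng.randrange(len(mutant))  # position of the gene to change
        mutant[gene] = rng.random()        # assumption: new allele drawn from U(0,1)
        # greedy selection for minimization
        if fitness(mutant) < fitness(new_pop[idx]):
            new_pop[idx] = mutant
    return new_pop

rng = random.Random(0)
before = [[rng.random() for _ in range(4)] for _ in range(20)]
after = greedy_mutation(before, fitness=sum, mr=0.2, rng=rng)
changed = sum(1 for a, b in zip(after, before) if a != b)
```

With 20 chromosomes and a mutation ratio of 0.2, at most four chromosomes can change, and only those whose mutant has a lower fitness value actually do.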

Step 5. Apply the restart strategy.

The number of iterations is checked. If a pre-set number of iterations is reached (this is set to 200 in our work), all chromosomes in the population are regenerated randomly.
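The restart check can be sketched as follows. The sketch assumes a periodic restart, that is, a restart at every multiple of the pre-set iteration count; the text could also be read as a single restart at that count. The function name and the tiny example population are hypothetical.

```python
import random

def restart_if_due(population, iteration, restart_every, rng):
    """Return a freshly drawn population when a restart is due, else the input population."""
    if iteration > 0 and iteration % restart_every == 0:
        ng = len(population[0])
        # every gene is redrawn from U(0,1), as in the initial population
        return [[rng.random() for _ in range(ng)] for _ in population]
    return population

rng = random.Random(1)
pop = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
unchanged = restart_if_due(pop, iteration=199, restart_every=200, rng=rng)
restarted = restart_if_due(pop, iteration=200, restart_every=200, rng=rng)
```

Whether the best chromosome found so far is stored separately before a restart is not stated in the text, so the sketch does not handle it.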

Step 6. Determine the number of chromosomes to be eliminated.

After the RMSE values of all chromosomes in the population have been calculated, these values are subjected to a normality test using the Lilliefors test. If the RMSE values have a normal distribution, the mean of the RMSE values (meanRMSE) is calculated, and the chromosomes whose RMSE values are greater than meanRMSE are eliminated. If the RMSE values do not have a normal distribution, the median of the RMSE values (medianRMSE) is calculated, and the chromosomes whose RMSE values are greater than medianRMSE are eliminated. The eliminated chromosomes are replaced by newly generated chromosomes.
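Step 6 can be sketched as a function that returns the next population. The result of the Lilliefors test is passed in as a boolean, and the eliminated chromosomes are regenerated from \(U(0,1)\) in the same way as the initial population; the latter is an assumption, as are the function name and the example values.

```python
import random
from statistics import mean, median

def statistical_replacement(population, rmse_values, is_normal, rng):
    """Replace every chromosome whose RMSE exceeds meanRMSE or medianRMSE."""
    threshold = mean(rmse_values) if is_normal else median(rmse_values)
    ng = len(population[0])
    next_pop, n_replaced = [], 0
    for chromosome, err in zip(population, rmse_values):
        if err > threshold:
            # eliminated: substitute a newly generated chromosome
            next_pop.append([rng.random() for _ in range(ng)])
            n_replaced += 1
        else:
            next_pop.append(list(chromosome))
    return next_pop, n_replaced

# Hypothetical population and RMSE values, for illustration only
rng = random.Random(2)
pop = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6], [0.5, 0.5]]
errors = [0.1, 0.2, 0.3, 0.4, 2.0]
pop_median, replaced_median = statistical_replacement(pop, errors, False, rng)
pop_mean, replaced_mean = statistical_replacement(pop, errors, True, rng)
```

The population size stays constant because every eliminated chromosome is substituted, while the number of substitutions is decided by the data rather than by a fixed percentage.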

Step 7. Check the stopping condition.

If the maximum number of iterations or the allowed number of consecutive iterations without improvement has been reached, the process ends. Otherwise, go to Step 3. The flowchart of the proposed method is given in Fig. 4.

Fig. 4 The flowchart for the statistical-based genetic algorithm
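The stopping condition of Step 7 can be sketched as a small control loop. The callable `step`, the parameter name `patience` and the example sequence are assumptions made for illustration: `step(iteration)` stands for one pass through Steps 3 to 6 and returns the best RMSE in the population after that pass.

```python
def run_with_early_stop(step, max_iter, patience):
    """Iterate until max_iter is reached or the best RMSE stalls for `patience` iterations."""
    best = float("inf")
    stale = 0  # consecutive iterations without improvement
    for iteration in range(1, max_iter + 1):
        current = step(iteration)
        if current < best:
            best, stale = current, 0
        else:
            stale += 1
        if stale >= patience:
            return best, iteration, "no_improvement"
    return best, max_iter, "max_iter"

# Hypothetical best-RMSE trajectory, for illustration only
trajectory = [0.5, 0.4, 0.4, 0.4, 0.4, 0.3]
early = run_with_early_stop(lambda it: trajectory[it - 1], max_iter=6, patience=3)
full = run_with_early_stop(lambda it: trajectory[it - 1], max_iter=6, patience=10)
```

With a patience of three iterations the loop stops before it sees the later improvement, which illustrates the trade-off between guarding against overfitting and stopping too early.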

6 Experiments

In the application part of the paper, a detailed analysis is carried out to evaluate the performance of the proposed method. S&P500, DOW JONES, and NIKKEI stock exchange closing time-series data sets between 2016 and 2018 are analyzed with the following methods: SMNM-ANN trained by the particle swarm optimization algorithm (PSO) proposed in [24], SMNM-ANN trained by the artificial bat algorithm (BAT) proposed in [11], SMNM-ANN trained by a hybrid of the artificial bat and backpropagation algorithms (BAT-BP) proposed in [9], SMNM-ANN trained by the differential evolution algorithm (DEA) proposed in [10], SMNM-ANN trained by GA, SMNM-ANN trained by the artificial bee colony algorithm (ABC), and SMNM-ANN trained by the backpropagation algorithm (BP) proposed in [30].

The performance on each time series is tested with different test set lengths. For this purpose, the test set lengths for each year of each time series are taken as 10, 20, and 50. For the proposed statistical-based genetic algorithm (SBGA), the number of inputs varies from 1 to 5 with an increment of 1, the number of chromosomes in the population varies from 30 to 100 with an increment of 10, the crossover rate takes values from 0.1 to 0.7 with an increment of 0.1, and the mutation rate takes the values 0.01, 0.05, 0.1, 0.15 and 0.2. The parameter sets that give the best result for each year of each time series with different test lengths are given in Table 1. The parameter settings used for the proposed method are the same as the ones used by the other methods we compared our results with.

Table 1 Parameter values used in experiments

All the methods analyzed in this paper are evaluated using the RMSE criterion given in Eq. (4) and the mean absolute percentage error (MAPE) criterion given in Eq. (5):

$${\text{MAPE}} = \frac{1}{n} \mathop \sum \limits_{{{\text{t}} = 1}}^{n} \left| {\frac{{y_{{\text{t}}} - \hat{y}_{{\text{t}}} }}{{y_{{\text{t}}} }}} \right|$$
(5)
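A direct transcription of Eq. (5) is sketched below. The function name and the sample values are illustrative; as in Eq. (5), the result is a fraction rather than a percentage, and the observed values are assumed to be non-zero.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error of Eq. (5), returned as a fraction."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    # each term is the absolute error relative to the observed value
    return sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / len(y_true)

# Illustrative values only: both forecasts are off by 10% of the observation
example_mape = mape([100.0, 200.0], [110.0, 180.0])
```

Unlike RMSE, MAPE is scale-free, which is why the two criteria can rank the same set of forecasts differently.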

The RMSE and MAPE results for the S&P500 stock exchange time-series test data for different years are given in Table 2. The results show that the proposed SBGA method is the best in terms of the RMSE criterion when the test length is 20 and 50 for the S&P500 2016 test data set and for all the test lengths (10, 20, and 50) for the S&P500 2018 test data set.

Table 2 RMSE and MAPE results of S&P500 stock exchange time-series test data

The SBGA method also obtains the best results for the MAPE criterion when the test length is 10 for the S&P500 2017 test data set. The results are also shown graphically in Fig. 5.

Fig. 5 A graphical representation of S&P500 data set analysis

Overall, the results are also very good for the other values of the parameters.

We next look at the DOW JONES stock exchange data sets. The RMSE and MAPE results are provided in Table 3. From the results presented, we can observe that the SBGA method is the best in terms of both the RMSE and MAPE criteria when the test length is 10 and 20 for the DOW JONES 2016 test data set. The SBGA method obtains the best results for the RMSE criterion and the second-best values for the MAPE criterion when the test length is 50 for the DOW JONES 2017 test data set. The SBGA method is also the best in terms of the RMSE criterion when the test length is 10 and 50 for the DOW JONES 2017 test data set and when the test length is 10, 20 and 50 for the DOW JONES 2018 test data set. Results are given in Fig. 6.

Table 3 RMSE and MAPE results for DOW JONES stock exchange time-series test data

Fig. 6 A graphical representation of DOW JONES data set analysis results

The last experiment is performed on the NIKKEI data set, with the results presented in Table 4. The SBGA method is the best in terms of the RMSE criterion when the test length is 20 and 50 for the NIKKEI 2016 test data set and when the test length is 10, 20, and 50 for the NIKKEI 2018 test data set. The SBGA method obtains the best results for the MAPE criterion when the test length is 10 for the NIKKEI 2017 test data set. The results are also shown graphically in Fig. 7.

Table 4 RMSE and MAPE results for NIKKEI stock exchange time-series test data

Fig. 7 A graphical representation of the NIKKEI data set analysis

Finally, we rank all the methods compared in this study in terms of their RMSE results for each of the parameter configurations considered. The results are presented in Table 5.

Table 5 RMSE ranking of all the methods for all the parameter value combinations

7 Conclusions and discussions

In the GA literature, although there are typical parameter values for population size, crossover and mutation rates, there is no specific value or percentage for the number of chromosomes to be eliminated in the replacement step. In order to fill this gap, a new replacement operator, named the statistical-based replacement procedure and based on the distribution of the objective function values, is proposed in this paper. The new method uses the mean or the median statistic in the replacement step, depending on whether the objective function values have a normal distribution or not. The proposed statistical-based replacement procedure is used in the training process of SMNM-ANN. The performance of the proposed method is assessed on different test sets with different test lengths. The analysis results show that the proposed method is the best among all the methods used. It can also be said that the GA method using statistical-based replacement has a significantly superior performance compared to the classical genetic algorithm.

For each data set analysis, the rank numbers are computed for all methods. The mean of the rank numbers is calculated by using the ranks as sample data for the different test set lengths. The mean rank numbers of the proposed method are 1.83, 2.83 and 1.5 for the NIKKEI data sets, 3.16, 2 and 1.66 for the S&P500 data sets and 1.33, 1.83 and 1.66 for the DOW JONES data sets for the different years, respectively. The proposed method is the best or second-best method for all individual analyses, and it has the best performance overall.

In future studies, the proposed method can be used for the training of different artificial neural network types and in other application areas with similar data types.