Traffic prediction using a self-adjusted evolutionary neural network
Abstract
Short-term prediction of traffic flow is one of the most essential elements of any proactive traffic control system. The aim of this paper is to provide a model based on neural networks (NNs) for multi-step-ahead traffic prediction. NNs’ dependency on parameter setting is the major challenge in using them as predictors. Given that the best combination of NN parameters yields the minimum prediction error, the main problem is NN optimization. It is viable to set the best combination of parameters according to a specific traffic behavior; at the same time, an automatic method, applicable in general cases, is strongly desired for setting appropriate NN parameters. This paper defines a self-adjusted NN that uses the non-dominated sorting genetic algorithm II (NSGA-II) as a multi-objective optimizer for short-term prediction. NSGA-II is used to optimize the number of neurons in the first and second layers of the NN, the learning ratio and the slope of the activation function. This model addresses the challenge of optimizing a multi-output NN in a self-adjusted way. The performance of the developed network is evaluated on both univariate and multivariate traffic flow data from an urban highway. The results, analyzed with several performance measures, show that the genetic algorithm tunes the NN well without any manual pre-adjustment. Prediction accuracy is reported with multiple measures, including the root mean square error (RMSE); in the best configuration of the proposed model, the RMSE is 10 for single-step-ahead and 12 for multi-step-ahead traffic flow prediction.
Keywords
Traffic prediction · Neural networks · Genetic algorithm · Self-adjusted framework
1 Introduction
Intelligent transportation systems (ITSs) are expected to alleviate traffic problems around the world. Short-term traffic prediction is a highly researched area within ITS, and its results are used by transportation practitioners to reduce congestion and increase mobility. Efforts in this field started with the application of autoregressive integrated moving average (ARIMA) models and nonparametric techniques to traffic prediction. Since then, several parametric, nonparametric and hybrid methods have been proposed. Basic parametric methods such as ARIMA [1], the seasonal autoregressive integrated moving average method (SARIMA) [2] and the Kalman filter [3] have been widely used in the literature. Developing these algorithms to meet the requirements of current engineering applications has been the subject of many research efforts in the past few decades [4]. For example, Luo et al. [5] proposed a hybrid prediction methodology based on an improved SARIMA model and a multi-input autoregressive (AR) model with genetic algorithm (GA) optimization, in order to improve prediction accuracy and reduce operation time.
Many nonparametric algorithms have also been proposed in this field. Huang and Sun [6] applied kernel regression with sparse metric learning to predict short-term traffic flow. In 2016, Habtemichael and Cetin [7] identified similar traffic patterns with an enhanced K-nearest neighbor (KNN) algorithm and provided data-driven short-term traffic prediction. The multilayer feedback NN [8] and the KNN-based neuro-fuzzy system [9] are examples of applying NNs to short-term traffic prediction. Owing to their ability to approximate any degree of nonlinearity, NNs have been widely used in the literature. However, because of their developmental nature, a large degree of uncertainty is present when trying to select the optimal network parameters. To overcome this deficiency, researchers have had to rely on very time-consuming and questionable rules of thumb [10].
An alternative approach for improving prediction accuracy is combining parametric, nonparametric and/or optimization algorithms to provide a hybrid method. In this approach, several methods are aggregated in order to provide a more efficient model. For instance, Hu et al. [11] used a combination of particle swarm optimization (PSO) and GA for traffic flow prediction. Cong et al. [12] proposed a model combined with support vector machine (SVM) and fruit fly optimization.
New interest in hybrid methods arises from the use of GAs. In 2015, Feng [13] analyzed the disadvantages of wavelet NNs and used a GA to optimize the weights and thresholds of the NN. The GA has also been used to optimize NNs for different types of roads, to optimize the links connecting input cells to hidden cells in an NN trained by the Levenberg–Marquardt method [14], and to optimize the weights of the NN [15].
1. Multi-step-ahead prediction with NNs is usually provided with two approaches: (1) training separate NNs for each prediction horizon or (2) using one trained NN and sequentially predicting the traffic flow at time t + 2 using the predicted traffic flow at t + 1 and so on. The first approach is very time-consuming, and in the second approach, the accuracy of the results decreases as the prediction horizon increases. The best approach is to use a multi-output NN, which then raises the challenge of optimizing its parameters using a consistent optimization algorithm. In this paper, the result of applying a multi-objective optimization algorithm to multilayer perceptron (MLP) NNs is discussed.
2. This paper aims to optimize these parameters for a multi-output MLP in a self-adjusted and evolutionary manner. Our goal is to reduce the dependency of the final parameters on the manually initialized ones.
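The two multi-step approaches contrasted above can be sketched as follows. This is an illustrative toy: the window-sliding logic is the point, while the stand-in models and data are hypothetical placeholders, not the paper's implementation.

```python
import numpy as np

def predict_iterative(model, history, steps):
    """Approach (2): feed each prediction back as input for the next step,
    so errors can compound as the horizon grows."""
    window = list(history)
    out = []
    for _ in range(steps):
        y = model(np.array(window))   # one-step-ahead prediction
        out.append(y)
        window = window[1:] + [y]     # slide the window over the prediction
    return out

def predict_multi_output(model, history):
    """Multi-output NN: all horizons predicted in one pass from real data."""
    return list(model(np.array(history)))  # e.g. [y(t+1), y(t+2)]

# Toy stand-in models: a persistence one-step model and a two-output model.
one_step = lambda w: w[-1]
two_step = lambda w: np.array([w[-1], w[-1]])

hist = [100.0, 110.0, 120.0, 115.0]
print(predict_iterative(one_step, hist, 2))
print(predict_multi_output(two_step, hist))
```

With a trained multi-output network in place of the toy model, both t + 1 and t + 2 are predicted from the original measurements rather than from earlier predictions.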
The remainder of this paper is organized as follows: Sect. 2 presents the methodological and optimization framework used in this paper. Section 3 discusses the data used in this study and also the temporal and spatial representation of the traffic data. In Sect. 4 we present the empirical results, and finally, in Sect. 5 we discuss the findings of this paper.
2 Methodology
The framework employed for prediction entails two major blocks: the traffic estimator and its optimizer. The estimator is an MLP NN with a backpropagation learning algorithm. The optimizer used to set the optimal variables is a specific kind of GA called the “non-dominated sorting genetic algorithm II,” or NSGA-II for short. This section gives a brief review of these two blocks.
2.1 MLP NNs
2.1.1 Standard backpropagation algorithm for MLP
This paper trains the MLPs with the error backpropagation algorithm and two hidden layers, with respect to four parameters: (1) the learning parameter, (2) the slope (gain) of the activation function, (3) the number of the first layer’s neurons, and (4) the number of the second layer’s neurons.
2.1.2 Backpropagation algorithm with momentum updating
The described learning algorithm has some important drawbacks. First, the learning parameter should be chosen small to ensure minimization of the total error function \(E_{j}\); however, a small learning parameter makes the learning process very slow. Large values, on the other hand, correspond to rapid learning but lead to parasitic oscillations that prevent the algorithm from converging to the desired solution. Moreover, if the error function contains many local minima, the network might get trapped in one of them or get stuck on a very flat plateau. One simple way to improve the backpropagation learning algorithm is to smooth the weight changes by over-relaxation, i.e., by adding a momentum term (Eq. 4) [16].
This means that the effective learning rate increases to the value \(\eta_{\text{eff}} = \frac{\eta}{1 - \alpha}\) without magnifying the parasitic oscillations [16]. This NN is trained with respect to five parameters: (1) the learning parameter, (2) the slope (gain) of the activation function, (3) the momentum term, (4) the number of the first layer’s neurons, and (5) the number of the second layer’s neurons.
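The momentum mechanism can be sketched numerically. This is a minimal illustration assuming the standard momentum form for Eq. 4, which is not reproduced in this excerpt: the weight change is the gradient step plus a fraction alpha of the previous change.

```python
import numpy as np

# Assumed form of Eq. 4: dw_j = -eta * grad_j + alpha * dw_{j-1}.
def momentum_step(grad, prev_dw, eta=0.1, alpha=0.9):
    """One smoothed weight change: gradient step plus momentum carry-over."""
    return -eta * grad + alpha * prev_dw

# On a constant gradient the update converges to -eta * g / (1 - alpha),
# i.e. an effective learning rate of eta / (1 - alpha), as stated above.
g = np.array([1.0])
dw = np.zeros(1)
for _ in range(200):
    dw = momentum_step(g, dw)
print(dw[0])  # close to -1.0 = -(0.1 / (1 - 0.9)) * 1.0
```

The geometric accumulation of past steps is what produces the larger effective learning rate while damping oscillations on flat regions of the error surface.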
2.2 NSGA-II
The main approach in multi-objective evolutionary algorithms (MOEAs) is to find a set of Pareto-optimal solutions in a single run. In multi-objective models, a set of Pareto-optimal solutions is reported instead of a single solution that optimizes all the objectives simultaneously.
Compared with the many MOEAs proposed in the past decade, NSGA-II, a well-known multi-objective GA proposed by Deb et al., finds a better spread of solutions across different problems [17].
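The ranking step at the heart of NSGA-II can be illustrated with a minimal sketch (not from the paper; the population values are made up): the Pareto-dominance test and extraction of the rank-1 front.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective (minimization)
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated_front(points):
    """Return the rank-1 (Pareto-optimal) subset of a population."""
    return [p for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]

# Two validation errors per candidate (e.g. EV1, EV2), both to be minimized.
pop = [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0), (3.0, 4.0)]
print(nondominated_front(pop))  # (3.0, 4.0) is dominated by (2.0, 2.0)
```

NSGA-II repeats this sorting over successive fronts and breaks ties with a crowding-distance measure to preserve the spread of solutions.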
3 Optimizing the NN using NSGA-II
NNs’ dependency on parameter setting is the major challenge in using them as predictors. Given that the best combination of NN parameters yields the minimum prediction error, the main problem is NN optimization. It is therefore viable to set the best combination of parameters according to a specific traffic behavior; at the same time, an automatic method, applicable in general cases, is strongly desired for setting the appropriate NN parameters. In this section, a self-adjusted framework is developed using an optimized NN for short-term prediction.
Most prediction systems depend on data transmission, which means a continuous flow of data about traffic parameters is necessary for efficient operation. However, it is common for real-time traffic data collection systems to experience failures. A real-time prediction system must therefore be able to generate predictions for multiple steps ahead to ensure its operation in cases of data collection failures [18].
Although multi-step-ahead prediction is reckoned to be a proper solution in cases of data collection failures, some previous studies such as [10] found that the correlation coefficient between actual and predicted flow series decreases as the prediction horizon increases. To solve this problem, we use a multi-output MLP and optimize its parameters with NSGA-II. The advantage of this combination is predicting multiple steps ahead from the original set of data with high accuracy (it will be shown that the correlation coefficient between actual and predicted flow series does not decrease as the prediction horizon increases). Optimizing the model using NSGA-II ensures that we obtain the minimum error simultaneously for all steps ahead.
As illustrated in Fig. 3, the optimized values for the number of neurons in the first and second layers of the NN, the learning ratio and the slope of the activation function (denoted q1, q2, eta and gamma) result from using NSGA-II to minimize the NN error. In this process, EV1 and EV2 represent the validation error for one-step- and two-step-ahead prediction, respectively; P and Q are the parent and child populations; and NN-BP is the MLP trained with the backpropagation algorithm. The optimized values are then transferred to the NN algorithm as the initial values of the mentioned parameters. The NN algorithm can be run once to provide the weights of the links connecting the three layers. These weights, alongside the estimated parameters, constitute our final NN for multi-step-ahead prediction.
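The optimization loop described above might be sketched as follows. This is a hypothetical outline: random_individual, train_and_validate and the parameter ranges are illustrative placeholders, and the stand-in fitness function merely mimics returning the two validation errors (EV1, EV2); a real run would train the NN-BP block for each candidate.

```python
import random

random.seed(42)  # deterministic toy run

def random_individual():
    """Draw one candidate parameter set; the ranges are illustrative guesses."""
    return {"q1": random.randint(2, 30),        # neurons in hidden layer 1
            "q2": random.randint(2, 30),        # neurons in hidden layer 2
            "eta": random.uniform(0.01, 0.9),   # learning ratio
            "gamma": random.uniform(0.1, 2.0)}  # activation-function slope

def train_and_validate(ind):
    """Stand-in for the NN-BP block: a real run would train the multi-output
    MLP with these parameters and return the two validation errors."""
    ev1 = abs(ind["q1"] - 12) + ind["eta"]   # synthetic surrogate for EV1
    ev2 = abs(ind["q2"] - 8) + ind["gamma"]  # synthetic surrogate for EV2
    return (ev1, ev2)

population = [random_individual() for _ in range(20)]    # parent population P
fitness = [train_and_validate(ind) for ind in population]
# NSGA-II would now rank P by non-dominated sorting and crowding distance,
# then apply selection, crossover and mutation to form the children Q.
best = min(zip(fitness, population), key=lambda fp: sum(fp[0]))
print(best[1])
```

The key point is the interface: a chromosome of (q1, q2, eta, gamma) in, a pair of per-horizon validation errors out, so the GA never needs hand-tuned NN settings.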
4 Study data
4.1 Temporal and spatial representation of data
1. Time-lagged events of \(V_{\text{down}}\), such as \(V_{\text{down}}(t-1), V_{\text{down}}(t-2), \ldots, V_{\text{down}}(t-n)\).
2. Time-lagged events of \(V_{\text{down}}\) plus spatial attributes. In this case, input values are time-lagged events of both \(V_{\text{down}}\) and \(V_{\text{up}}\), such as \(V_{\text{down}}(t-1), V_{\text{down}}(t-2), \ldots, V_{\text{down}}(t-n), V_{\text{up}}(t-1), V_{\text{up}}(t-2), \ldots, V_{\text{up}}(t-n)\).
The chosen highway section has three loop detectors: one collects data at the desired section, and the other two are placed at upstream sections. Suppose that A and B represent the traffic data of the upstream sections (collected by detectors No. 02 and No. 03, respectively) and C represents the downstream, desired section (collected by No. 01).

Univariate (type 1 input):
$$\left[ C(t), C(t+1) \right] = f\left( C(t-1), C(t-2), \ldots, C(t-n) \right).$$
Multivariate (type 2 input):
$$\left[ C(t), C(t+1) \right] = f\left( A(t-1), B(t-1), C(t-1), C(t-2), \ldots, C(t-n) \right).$$
4.2 Finding the input dimension
This section defines the aforementioned look-back window size, or simply the input dimension. Increasing the input dimension of an NN can exponentially increase the computational complexity, but it may also increase the forecasting accuracy. Choosing the best dimension is therefore a crucial issue. In this work, the statistical autocorrelation function (ACF) and partial autocorrelation function (PACF) are used to select the input dimension of a given time series in the nonparametric approach to traffic flow forecasting. Statistically, autocorrelation measures the degree of association between data in a time series separated by different time lags. The ACF is evaluated for various values of the lag time, and the results are plotted. For traffic flows in 5-min intervals, the lag time is measured in 5-min increments. Wherever the ACF curve intersects the lag-time axis, its value is zero, indicating that y(t − D) and y(t) are linearly independent, where D denotes the look-back window size. The lag time corresponding to the first point of intersection is chosen as the optimum input dimension [19].
The next step is to produce the partial autocorrelation plot of the data. The partial autocorrelation plot of the data with 95% confidence bands shows that only the partial autocorrelations of the first, second, third and fourth lag are significant.
The PACF curve enters the confidence band at D = 4, indicating that y(t − 4) and y(t) are linearly independent. The lag time D = 4 is therefore chosen as the optimum value to be used in the input dimension (Fig. 6c).
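The lag-selection step can be sketched under standard assumptions: the sample ACF, and the rule "choose the first lag whose correlation falls inside the 95% confidence band (about ±1.96/√n)". The toy series below is synthetic, standing in for the 5-min flow data.

```python
import numpy as np

def acf(x, lag):
    """Sample autocorrelation of series x at the given positive lag."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    return float(np.dot(xc[:-lag], xc[lag:]) / np.dot(xc, xc))

def first_insignificant_lag(x, max_lag=20):
    """First lag D at which the ACF is statistically indistinguishable
    from zero, i.e. y(t - D) and y(t) are treated as uncorrelated."""
    band = 1.96 / np.sqrt(len(x))
    for d in range(1, max_lag + 1):
        if abs(acf(x, d)) < band:
            return d
    return max_lag

# Toy short-memory series standing in for 5-min flow observations.
rng = np.random.default_rng(0)
noise = rng.normal(size=500)
series = 0.8 * noise[:-1] + noise[1:]   # MA(1): correlation dies after lag 1
print(first_insignificant_lag(series))
```

The same procedure applied to the study's PACF plot yields D = 4, the window size used in the remainder of the paper.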
In other words, a four-dimensional input traffic flow vector (four time-lagged periods of flow from No. 01) with two output units (representing traffic flow at No. 01 for the t + 1 and t + 2 time intervals) will be used to model the univariate set of data, and a six-dimensional input vector (four time-lagged periods of flow from No. 01 and one time-lagged period of flow from each of No. 02 and No. 03) with the same two output units will be used to model the multivariate set of data.
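Assembling the training pairs for the multivariate case can be sketched as follows. Detector labels A, B, C follow the text; the look-back window D = 4 comes from the PACF analysis, and the series below are toy placeholders.

```python
import numpy as np

def make_multivariate_samples(A, B, C, D=4):
    """Rows [A(t-1), B(t-1), C(t-1), ..., C(t-D)] mapped to [C(t), C(t+1)]."""
    X, y = [], []
    for t in range(D, len(C) - 1):
        X.append([A[t - 1], B[t - 1]] + [C[t - k] for k in range(1, D + 1)])
        y.append([C[t], C[t + 1]])
    return np.array(X), np.array(y)

A = list(range(100, 120))   # upstream detector No. 02
B = list(range(200, 220))   # upstream detector No. 03
C = list(range(300, 320))   # downstream detector No. 01
X, y = make_multivariate_samples(A, B, C)
print(X.shape, y.shape)  # (15, 6) (15, 2): six inputs, two outputs per sample
```

Dropping the A and B columns gives the four-dimensional univariate variant with the same two-output target.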
5 Empirical results
Comparing errors in detail based on NN type, we find that techniques such as adding a momentum rate to the classic gradient descent method did not improve performance. Both models perform similarly, and it is notable that the R^{2} values are very close and high for both one- and two-step-ahead predictions. Comparing errors based on data type (univariate vs. multivariate) shows that using spatial attributes in addition to temporal ones improves performance.
The aforementioned findings suggest that the multivariate approach can be used for traffic prediction at the selected highway site according to its predictive accuracy.
Comparison results: forecasting 10 min ahead with type 2 input data

Model                               MAPE    R^{2}
Optimized NN with gradient descent  17      0.97
Seasonal ARIMA                      18      0.92
Historical average                  20      0.90
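The two reported measures can be computed as follows, assuming their standard definitions (the paper does not spell them out in this excerpt): MAPE = 100 · mean(|y − ŷ| / y) and R² = 1 − SS_res / SS_tot. The flow values below are toy numbers, not the study's data.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred) / y_true) * 100)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1 - ss_res / ss_tot)

# Toy flow values (vehicles per 5 min).
y_true = [100, 120, 140, 160]
y_pred = [110, 120, 150, 150]
print(round(mape(y_true, y_pred), 2))      # 5.85
print(round(r_squared(y_true, y_pred), 2)) # 0.85
```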
6 Conclusion
The ability to predict the future values of traffic parameters helps improve the performance of traffic control systems. Both single- and multi-step-ahead predictions play a significant role in this field, but in cases of system failure, multi-step-ahead predicted values become especially beneficial. To avoid the low accuracy of long-term forecasts, a multi-output NN is used in this study instead of applying the iterated approach to the results of a single-output model. This paper applied NSGA-II to optimize the parameters of NNs with different learning rules and different types of inputs. This genetic algorithm is a well-known multi-objective genetic algorithm that finds a better spread of solutions across different problems.
The proposed framework predicts traffic flow values based on their recent temporal and spatial profiles at a given highway site over the past few minutes. Both temporal and spatial effects are found to be essential for more accurate prediction. Moreover, a longer prediction horizon was found not to decrease the accuracy of the results in this model. The model’s performance was validated using real traffic flow data obtained from the field.
This paper demonstrates the ability of this class of genetic algorithms to produce the best combination of network parameters. Results obtained from test data indicate that the model’s generalization ability is satisfactory. For the 5- and 10-min prediction horizons, the R^{2} indices are at least 0.98, which clearly shows the model’s generalization ability.
References
1. Voort MVD, Dougherty M, Watson S (1996) Combining Kohonen maps with ARIMA time series models to forecast traffic flow. Transp Res Part C 4(5):307–318
2. Williams BM, Hoel LA (2003) Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: theoretical basis and empirical results. J Transp Eng 129(6):664–672
3. Gong Y, Zhang Y (2013) Research of short-term traffic volume prediction based on Kalman filtering. Paper presented at the 6th international conference on intelligent networks and intelligent systems, Shenyang, China
4. van Lint H, van Hinsbergen C (2012) Short-term traffic and travel time prediction models. In: Artificial intelligence applications to critical transportation issues. Transportation Research Circular, Number E-C168, pp 22–41
5. Luo X, Niu L, Zhang S (2018) An algorithm for traffic flow prediction based on improved SARIMA and GA. KSCE J Civ Eng 22(10):1–9
6. Huang R, Sun S (2013) Kernel regression with sparse metric learning. J Intell Fuzzy Syst 24(4):775–787
7. Habtemichael FG, Cetin M (2016) Short-term traffic flow rate forecasting based on identifying similar traffic patterns. Transp Res Part C 66:61–78
8. Hou Y, Edara P, Sun C (2015) Traffic flow forecasting for urban work zones. IEEE Trans Intell Transp Syst 16(4):1761–1770
9. Wei CC, Chen TT, Lee SJ (2013) KNN-based neuro-fuzzy system for time series prediction. Paper presented at the 14th ACIS international conference on software engineering, artificial intelligence, networking and parallel/distributed computing, Honolulu, USA, pp 569–574
10. Vlahogianni EI, Karlaftis MG, Golias JC (2005) Optimized and meta-optimized neural networks for short-term traffic flow prediction: a genetic approach. Transp Res Part C 13:211–234
11. Hu W, Yan L, Liu K, Wang H (2016) A short-term traffic flow forecasting method based on the hybrid PSO-SVR. Neural Process Lett 43:155–172
12. Cong Y, Wang J, Li X (2016) Traffic flow forecasting by a least squares support vector machine with a fruit fly optimization algorithm. Procedia Eng 137:59–68
13. Feng G (2015) Network traffic prediction based on neural network. Presented at the international conference on intelligent transportation, big data and smart city (ICITBS), pp 527–530
14. Afandizadeh SH, Kianfar J (2009) A hybrid neuro-genetic approach to short-term traffic volume prediction. Int J Civ Eng 7(1):41–48
15. Cui J (2010) Traffic prediction based on improved neural network. J Converg Inf Technol (JCIT) 5(9):85
16. Cichocki A, Unbehauen R (1993) Neural networks for optimization and signal processing. Wiley, Chichester
17. Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6:182–197
18. Vlahogianni EI, Karlaftis MG (2010) Local and global iterative algorithms for real-time short-term traffic flow prediction. In: Urban transport and hybrid vehicles
19. Jiang X, Adeli H (2005) Dynamic wavelet neural network model for traffic flow forecasting. J Transp Eng 131(10):771
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.