Mode decomposition based deep learning model for multi-section traffic prediction

Road traffic prediction plays a vital role in real-time traffic management of an intelligent transportation system (ITS). Many prediction models achieve fine results. However, most ignore the intrinsic characteristics of traffic parameter data and do not consider the spatiotemporal effects of road sections, which can reflect the situation of all road traffic. Therefore, multi-section traffic prediction is still an open problem. In this paper, empirical mode decomposition (EMD) is employed to decompose the information of traffic parameters into many intrinsic mode function (IMF) components, which represent the original road traffic information in periodic and random sequences. Then, by considering the superiority of deep learning in multi-dimensional data processing, which can handle the spatiotemporal effects, a prediction model based on a convolutional neural network (CNN) is proposed to achieve the prediction of periodic and random sequences, whose results are combined to obtain the final prediction. The dataset from the Caltrans Performance Measurement System is used to validate the model. The proposed prediction model is compared to several well-known models, such as PCA-BP, Lasso-BP, and standard CNN. Experiments show that the proposed prediction model achieves higher accuracy.


Introduction
Road traffic prediction is the task of mining traffic patterns and predicting traffic trends by analyzing the traffic conditions of urban roads, including flow, speed, and occupancy. It not only helps traffic managers improve road operation efficiency but also provides travelers with important information on road conditions [16]. Therefore, many scholars are committed to researching reliable and precise traffic prediction models [24]. The autoregressive integrated moving average (ARIMA) model is one of the most widely used statistical time series models [23], and improvements such as the seasonal ARIMA model [25] and the ARIMA-GARCH model [4] have been proposed. Although these models have good prediction results, they do not consider spatiotemporal effects, which are a vital part of traffic data. Traffic parameters in different sections of a road can be regarded as describing processes in space and time. Various detectors can be deployed on a road network, each recording the current state of a traffic parameter. The status of these detectors reflects the traffic status of road sections, hence multi-section road traffic prediction can reflect the future traffic status of entire road sections. It is a challenging task and a research hotspot [1,3,7].
Yao et al. [26] considered the temporal and spatial correlation of the traffic parameters of each section of a road and established a multi-road-section traffic parameter prediction model based on a support vector machine (SVM) and principal component analysis (PCA). Experimental results based on this scheme are much better than those of a single road section. Jiang et al. [11] considered the influence of multiple sections and established a prediction model of traffic parameters based on Lasso, with the help of a nonlinear neural network. Experiments showed that the Lasso-NN model has an overall lower error rate. Li et al. [10] argued that there was information redundancy and correlation between road sections, and proposed a road traffic network prediction model based on rough set theory and SVM. The results showed a relatively good prediction effect.
The above models have accomplished the prediction of road traffic parameters to a certain extent. However, they ignore the intrinsic characteristics of traffic parameter data, which can be used to optimize prediction models. Due to their stochastic and nonlinear nature, traffic parameter data can be seen as a time series that can be decomposed into different frequency components [29]. Huang et al. [9] introduced a method called empirical mode decomposition (EMD) to decompose nonstationary signals into a set of intrinsic oscillatory modes. Therefore, studying the internal characteristic information of traffic parameter data may improve forecasting precision. Based on a single-section road traffic prediction model [28] that considers the periodic and random characteristics of road traffic parameters, this paper proposes a mode decomposition based deep learning model for multi-section road traffic prediction, which not only captures the intrinsic characteristics of traffic parameter data but also shows superior capability for spatiotemporal effects.
The rest of this paper is structured as follows. Related studies of EMD and deep learning methods on road traffic prediction are briefly described in Section 2. The method of EMD of road section traffic is demonstrated in Section 3. Section 4 describes the proposed convolutional neural network (CNN)-based road section traffic prediction model. Experiments and analysis are presented in Section 5. Section 6 discusses conclusions and suggestions for future work.

Related work
Several road traffic evaluation models using mode decomposition techniques have been studied. Their main goal is to decompose traffic information into so-called intrinsic mode functions (IMFs) in order to extract the most useful information from the original data. Chen et al. [5] introduced a hybrid short-term traffic flow prediction model based on EMD and a recurrent neural network. The model was shown to deliver superior short-term traffic flow predictions compared to other models. Duo et al. [6] employed EMD and GPSO-SVM as a hybrid model to forecast short-term traffic flow. Experimental results showed a better effect and higher accuracy compared to other models. Tian et al. [21] proposed an EMD-based model to predict short-term traffic flow. First, the original traffic flow sequence was decomposed by EMD. Then, prediction models were established according to the different randomness of the IMF components. Finally, the multiple predicted results were added together to obtain the predicted traffic flow. The results showed that the performance indices of the proposed model exceeded those of classical prediction models. Zheng et al. [28] proposed a mode decomposition based hybrid model that could analyze the characteristics of traffic flow information and predict traffic flow. Due to the complexity of spatial and temporal dependence, the above studies did not consider spatiotemporal effects. In fact, multiple sensors are deployed along the road network to capture traffic data, which can represent traffic information in space and time. Hence spatiotemporal effects can reflect the traffic status of entire road sections, and making full use of this dependence is the key to solving multi-section road traffic problems.
With the availability of effective open-source libraries and frameworks to implement basic learning algorithms, along with powerful and compact GPUs at affordable prices [17], deep neural networks, also known as deep learning, have become attractive. Li et al. [13,14] applied deep learning models in edge computing and fog computing to reduce the network traffic of information data from IoT devices to cloud servers and to optimize network performance. They deployed IoT devices as an edge layer, which is one of the layers in the multilayer structure of a deep learning model, to improve the efficiency of processing image data. Experimental results showed that deep learning models such as CNNs outperform other optimization methods.
CNN is one of many deep learning methods. Song et al. [19] used a CNN model for traffic speed prediction. Experimental results showed that the model could capture the local dependencies of traffic data and had an advantage regarding space and time effects. Ma et al. [15] proposed a CNN-based prediction model that converted spatial and temporal data to a two-dimensional space-time matrix, and used the advantage of CNN in image processing to predict traffic speed. Ratchanon Toncharoen [22] proposed traffic state prediction using a CNN model that transforms the traffic data of 40 nodes along an expressway into spatiotemporal matrices. Experiments have shown that prediction models based on CNN have an advantage regarding spatiotemporal effects and good prediction accuracy. However, the above studies ignored the intrinsic characteristics of traffic parameter data, which can further optimize the prediction model. This paper proposes a traffic prediction model that considers not only the intrinsic characteristics of traffic parameter data but also the spatiotemporal effects.

Empirical mode decomposition of road section traffic
One can think of the evolution of traffic parameters as a temporal and spatial process by considering the traffic status parameters of flow, occupancy, and speed as a triplet. According to traffic flow theory, speed, flow, and occupancy are all related [18]. Traffic speed is considered the mean speed of a traffic stream, and is typically expressed in kilometers or miles per hour. Traffic flow can be considered a temporal measurement. It is typically expressed as the number of vehicles per hour. Traffic occupancy is a parameter expressing the crowding of a section of a road. It is typically expressed as the number of vehicles per kilometer or mile, or vehicles per road section in each time interval. Therefore, the three parameters of flow, speed, and occupancy are selected as input variables. The traffic status of the $i$th observation location at the $t$th time interval is denoted as $(f^i_t, o^i_t, s^i_t)$. At time $T$, the task is to predict the three traffic parameters $(f^i_{T+1}, o^i_{T+1}, s^i_{T+1})$ at time $T + 1$ based on the historical observations.
The goal of EMD is to empirically identify the intrinsic oscillatory modes by their characteristic time scales in the data and then to decompose the data [9]. This complex system analysis method is suitable for processing nonlinear and non-stationary time series. EMD assumes that any complex signal consists of simple IMFs, and its basic idea is to divide an original irregular signal into multiple single-frequency signals and a residual wave. Here, the original signals $f(t)$, $s(t)$, and $o(t)$, which are the flow, speed, and occupancy data, each serve in turn as the input signal $x(t)$. The procedure of the EMD algorithm is shown in Figure 1. The process of EMD has four steps.
Step 1: Set the original signal $f(t)$, $s(t)$, or $o(t)$ as the input signal $x(t)$. Find all local extrema of $x(t)$, and use cubic spline interpolation to obtain the upper envelope $x_{up}(t)$ and lower envelope $x_{low}(t)$ of $x(t)$ separately. The mean $m(t)$ is estimated as the average of the two envelopes,
$$m(t) = \frac{x_{up}(t) + x_{low}(t)}{2}. \tag{1}$$
Step 2: Subtract the mean $m(t)$ from the input signal $x(t)$:
$$h(t) = x(t) - m(t). \tag{2}$$

Figure 1 Flow Diagram of EMD for Road Traffic Parameters
Step 3: Repeat Steps 1 and 2, with $h(t)$ as the input, to calculate a new $m(t)$ and $h(t)$ until $h(t)$ meets the requirements of an IMF, after which it is designated $c(t)$, the first IMF component of $x(t)$. The residual of the signal, $r(t)$, is obtained as
$$r(t) = x(t) - c(t). \tag{3}$$
Step 4: Using $r(t)$ as the signal to be decomposed, repeat the process until the residual signal satisfies the termination condition. The final result of EMD is
$$x(t) = \sum_{i=1}^{n} c_i(t) + r(t), \tag{4}$$
where $c_i(t)$ is the $i$th IMF component and $r(t)$ is the residual wave. From this equation, we see that EMD decomposes the original signal $x(t)$ into the sum of $n$ IMFs of different frequencies and the residual wave $r(t)$.
According to the terminal constraints introduced in equation (4), we can first obtain all the IMFs from traffic flow data, as shown in Figure 2. The results of the decomposition show that the fluctuation frequencies of IMF1, IMF2, and IMF3 are higher, so they belong to the random sequence. The fluctuation frequencies of IMF4, IMF5, and IMF6 are stable, so they belong to the periodic sequence. With similar EMD processing in our experiments, the traffic speed and occupancy data can also be decomposed into six IMFs.
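The four steps can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this paper and rests on several assumptions: piecewise-linear interpolation stands in for the cubic spline envelopes, a fixed number of sifting iterations (`n_iter`) replaces the IMF test, and the decomposition stops once the residual has fewer than two interior extrema. All function names and the test signal are hypothetical.

```python
import math

def local_extrema(x):
    """Return index lists (maxima, minima) of the interior local extrema of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, knots):
    """Piecewise-linear envelope through x at the knot indices.
    Assumption: linear interpolation replaces the cubic spline of Step 1."""
    knots = [0] + knots + [len(x) - 1]  # pin both ends to the signal
    env = []
    for a, b in zip(knots[:-1], knots[1:]):
        for i in range(a, b):
            w = (i - a) / (b - a)
            env.append((1 - w) * x[a] + w * x[b])
    env.append(x[-1])
    return env

def sift(x, n_iter=10):
    """Steps 1 to 3: subtract the envelope mean repeatedly to extract one IMF."""
    h = list(x)
    for _ in range(n_iter):  # fixed count instead of a formal IMF criterion
        maxima, minima = local_extrema(h)
        if len(maxima) + len(minima) < 2:
            break
        up, low = envelope(h, maxima), envelope(h, minima)
        m = [(u + l) / 2 for u, l in zip(up, low)]   # equation (1)
        h = [hi - mi for hi, mi in zip(h, m)]        # equation (2)
    return h

def emd(x, max_imfs=6):
    """Step 4: peel off IMFs until the residual has too few extrema."""
    imfs, r = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(r)
        if len(maxima) + len(minima) < 2:
            break
        c = sift(r)
        imfs.append(c)
        r = [ri - ci for ri, ci in zip(r, c)]        # equation (3)
    return imfs, r

# Synthetic one-day signal (288 five-minute points): fast plus slow oscillation.
x = [math.sin(2 * math.pi * t / 12) + 0.5 * math.sin(2 * math.pi * t / 144)
     for t in range(288)]
imfs, r = emd(x)
# Equation (4): the IMFs and the residual sum back to the input signal.
recon = [sum(c[t] for c in imfs) + r[t] for t in range(len(x))]
print(len(imfs), max(abs(a - b) for a, b in zip(recon, x)))
```

Whatever envelope method is used, the reconstruction identity of equation (4) holds by construction, which makes it a convenient sanity check for any EMD implementation.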

CNN-based road section traffic prediction model
By considering the output from the EMD algorithm and the length invariance of data after decomposition, the mode decomposition of multi-section road traffic must take spatial and temporal information into account. However, since the traffic state of one section depends greatly on nearby traffic sections in both time and space, the process is relatively complex. The ability of a CNN to extract important local features from input data makes it suitable for traffic state prediction under these conditions. Also, the interaction of traffic flow, speed, and occupancy in the traffic network affects the overall traffic status, so the task of predicting traffic parameters should consider all three together. When the three parameters are considered simultaneously, they are similar to the three primary colors red, green, and blue (RGB) in the image domain, as shown in Figure 3.
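The RGB analogy can be made concrete with a small sketch that stacks the three parameter matrices into one image-like tensor with three channels. The layout (sections × time × channels), the function name `stack_channels`, and the toy values are assumptions for illustration only.

```python
def stack_channels(flow, occupancy, speed):
    """Combine three (sections x time) matrices into one
    (sections x time x 3) tensor, one channel per traffic parameter."""
    tensor = []
    for f_row, o_row, s_row in zip(flow, occupancy, speed):
        tensor.append([[f, o, s] for f, o, s in zip(f_row, o_row, s_row)])
    return tensor

# Toy data: 2 road sections, 3 time intervals (values are made up).
flow = [[120, 130, 125], [90, 95, 100]]
occupancy = [[0.10, 0.12, 0.11], [0.07, 0.08, 0.09]]
speed = [[65, 63, 64], [70, 69, 68]]

x = stack_channels(flow, occupancy, speed)
print(len(x), len(x[0]), len(x[0][0]))  # sections, time steps, channels: 2 3 3
```

Each entry `x[i][t]` then plays the role of one pixel whose three channel values are the flow, occupancy, and speed of section `i` at interval `t`.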
In image processing, a CNN-based model has the advantages that it can capture local dependencies and is less sensitive to noise in data [27]. This has enabled big advances using CNN-based approaches in many research fields, including image and activity recognition [12]. Therefore, in the prediction stage, this paper uses CNN for multi-section road traffic prediction. CNN is one of many widely used deep learning models in image processing, whose examples include GoogLeNet [20] and ResNet [8]. However, these models are more suitable for image classification and may not work well with the unique characteristics of road traffic prediction.
The convolution layer has many convolution filters, which can extract and group lower features into higher and more abstract road network traffic features. The process can be defined as
$$o^l_k = f\left(\sum_{j=1}^{c^{(l-1)}} W^l_{jk} * o^{(l-1)}_j + b^l_k\right),$$
where $*$ denotes convolution; $j$ and $k$ are the indices of the convolutional filters; $o^l_k$, $o^{(l-1)}_j$, $W^l_{jk}$, and $b^l_k$ are respectively the output, input, weights, and additive bias of the $l$th convolution layer; $c^{(l-1)}$ is the number of convolutional filters in the $(l-1)$th layer; and $f$ is the activation function. In this study, the rectified linear unit (ReLU) is chosen as the activation function since it does not squeeze the input, and it increases the speed of training. The ReLU function can be written as
$$f(x) = \max(0, x).$$
We use the MaxPooling operation, which takes the maximum over a spatial neighborhood, to reduce the number of parameters for training the CNN. In this paper, 2 × 2 MaxPooling is used for the pooling layer. The function can be described as
$$o^l_k(x, y) = \max_{0 \le a, b \le 1} o^{(l-1)}_k(2x + a,\, 2y + b),$$
where $(x, y)$ indexes a position in the pooled output. The loss function is computed from $o_k$ and $p_k$, which are respectively the observation values and the proposed model output values.
There are two ways to implement the mode decomposition of multi-section road traffic by EMD. The first is to complete the mode decomposition directly in two-dimensional space because of the relation between the spatial and temporal dimensions. The second is to use EMD to achieve mode decomposition on single sections of road traffic and then synthesize the decomposition results of all sections, combining them into two dimensions for spatial and temporal relations. This paper uses the second method to avoid the correlation between road sections. The procedure of the proposed prediction model is shown in Figure 4, and it has the following three steps.
Step 1: The EMD method is used to decompose the time series x(t) to obtain n components of IMF and residual r(t).
Step 2: CNN models are established to train the periodic and random sequences, and the sub-prediction results are obtained. The implementation of the CNN model is presented in Figure 7.
Step 3: The sub-prediction results of the periodic and random sequences are aggregated to obtain the final prediction results.
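The building blocks that the CNN sub-models in Step 2 rely on, namely convolution, the ReLU activation, and 2 × 2 max pooling, can be sketched in plain Python as follows. This is a single-channel, single-filter illustration under stated assumptions, not the network of this paper: the kernel, the toy input, the "valid" border handling, and the function names are all hypothetical, and the convolution is implemented as cross-correlation, as is common in deep learning libraries.

```python
def relu(v):
    """Rectified linear unit: max(0, v)."""
    return v if v > 0 else 0.0

def conv2d_valid(img, kernel, bias=0.0):
    """Single-channel 'valid' 2-D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            s = bias
            for a in range(kh):
                for b in range(kw):
                    s += kernel[a][b] * img[i + a][j + b]
            row.append(relu(s))
        out.append(row)
    return out

def max_pool_2x2(img):
    """Non-overlapping 2 x 2 max pooling; an odd trailing row or column is dropped."""
    return [[max(img[i][j], img[i][j + 1], img[i + 1][j], img[i + 1][j + 1])
             for j in range(0, len(img[0]) - 1, 2)]
            for i in range(0, len(img) - 1, 2)]

# Toy input: 4 road sections (rows) x 5 time intervals (columns).
img = [[1, 2, 0, 1, 3],
       [0, 1, 2, 1, 0],
       [2, 1, 0, 0, 1],
       [1, 0, 1, 2, 2]]
kernel = [[1, -1]]  # 1 x 2 kernel: shrinks the time axis, keeps the section axis

features = conv2d_valid(img, kernel)   # 4 x 4 feature map
pooled = max_pool_2x2(features)        # 2 x 2 after pooling
print(pooled)
```

A 1 × 2 kernel is used here because it changes only the time dimension, which mirrors the idea of treating the long time axis and the short spatial axis differently.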

Experiments
We used traffic data collected by the Caltrans Performance Measurement System (PeMS) [2]. The first 23 weeks were used to train the model, and the last two weeks were used to test the performance. Based on the periodicity of traffic parameters, the predicted time is still one week. Therefore, the traffic state of a time interval was predicted using the data of the previous week. We chose the average value of the traffic flow of 10 road sections to predict the traffic state of the next time interval. To provide effective traffic information for travelers, the data prediction cycle was selected as one week, nine prediction steps were forecast for the next 45 minutes, and the prediction data were updated at every interval, that is, every five minutes.
Based on the PeMS data, an EMD model was first used to decompose the original sequence in a cycle of one day, which consisted of 288 points with a five-minute time interval. Across the 1750 samples from the 10 road sections, the decomposition of traffic flow data yielded 5, 6, 7, 8, or 9 IMFs. Decompositions with 5, 6, 7, 8, and 9 IMFs occurred 48, 624, 938, 137, and 3 times, respectively, so they were concentrated at 6, 7, and 8 IMFs. The speed and occupancy data showed the same result. Next, the combination into a periodic and a random sequence was carried out through comparative analysis. The results of the combined model are shown in Table 1. Figure 5 compares the periodic and random sequences of the traffic parameters speed, occupancy, and flow at a single location after six IMFs were selected. Data of the three traffic parameters at a certain time point for the 10 road sections were randomly selected and are shown in Figure 6.
The average flow value of the 10 road sections was selected to measure the traffic state of the next time interval, and it serves as the sub-prediction output of the network. The outputs of the corresponding periodic and random sequences were obtained by mode decomposition and mode combination. Statistical analysis showed that the number of IMFs is 5, 6, 7, and 8. The results are shown in Table 2.
Since a sensor collected traffic data every five minutes, it would collect 12 observations in an hour, or 2016 in a week, for each of the 10 road sections. Therefore, the corresponding matrix is 10 × 2016. Figure 7 shows the architecture of the proposed CNN model. Because of the imbalance between data length and width, two convolution layers were followed by a pooling layer, and the convolution kernels conv1 and conv4 changed the time dimension but not the corresponding spatial dimension. To reduce the number of parameters, we used max pooling to reduce the corresponding time and spatial dimensions by half. To effectively increase the data extraction ability, the convolution layers were designed with 4, 8, 16, and 32 filters. Moreover, to avoid overfitting, we adopted dropout and regularization strategies. The network parameters are shown in Table 3. It should be noted that to increase the training speed of the network, the ReLU activation function was added after each convolution layer. Our proposed model was implemented in TensorFlow, using the Adam algorithm for training and Xavier initialization for the weights.
To demonstrate the efficiency of the proposed prediction method, we chose standard CNN, Lasso-BP, and PCA-BP as comparative models. Lasso-BP and PCA-BP are hybrid models, in which Lasso and PCA are used to reduce the data dimension and the reduced data are then input to a BP network to realize multi-section data prediction. Mean absolute percentage error (MAPE), mean absolute error (MAE), and root mean square error (RMSE) were used for performance evaluation. These are defined as follows:
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\frac{|r_i - p_i|}{r_i},$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}|r_i - p_i|,$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(r_i - p_i)^2},$$
where $r_i$ is a real observation value, $p_i$ is a predicted result, and $n$ is the number of test samples. We set $m = 9$ because the average of nine predicted values is regarded as the predicted value of the current time node.
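The three error measures named above can be computed as in the following sketch, which uses their standard definitions with MAPE expressed as a fraction rather than a percentage. The function names and the sample values are illustrative assumptions, not data from the experiments.

```python
import math

def mape(real, pred):
    """Mean absolute percentage error, as a fraction (real values must be nonzero)."""
    return sum(abs(r - p) / abs(r) for r, p in zip(real, pred)) / len(real)

def mae(real, pred):
    """Mean absolute error."""
    return sum(abs(r - p) for r, p in zip(real, pred)) / len(real)

def rmse(real, pred):
    """Root mean square error."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(real, pred)) / len(real))

# Made-up observations and predictions (vehicles per interval).
real = [100.0, 200.0, 400.0]
pred = [110.0, 190.0, 400.0]

print(round(mape(real, pred), 4))  # 0.05
print(round(mae(real, pred), 4))   # 6.6667
print(round(rmse(real, pred), 4))  # 8.165
```

Because MAPE divides by the observed value, intervals with very small observed flow can dominate it, which is one reason to report MAE and RMSE alongside it.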
After the output from mode decomposition, two sub-prediction models were constructed. Considering that the prediction task is not meaningful after midnight, the error rate of the model was calculated only from 6:00 a.m. to midnight at 216 points. After sub-prediction results of periodic and random sequences were aggregated, the final prediction results were obtained. The error rate was 0.06674 for MAPE, 22.0999 for MAE, and 29.602 for RMSE. The comparison between predicted traffic flow and real observations is shown in Figure 8.
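The evaluation window and the aggregation step can be sketched as follows. Two points are assumptions rather than details confirmed by the text: the two sub-predictions are aggregated by point-wise summation (consistent with EMD reconstructing a signal as a sum of its components), and the daily series starts at midnight so that index 72 corresponds to 6:00 a.m. The names and the sample values are hypothetical.

```python
POINTS_PER_DAY = 24 * 60 // 5   # 288 five-minute intervals in one day
START = 6 * 60 // 5             # index of the 6:00 a.m. interval (72)

def aggregate(periodic_pred, random_pred):
    """Assumed aggregation: point-wise sum of the two sub-predictions."""
    return [p + r for p, r in zip(periodic_pred, random_pred)]

def evaluation_window(day_series):
    """Keep only the intervals from 6:00 a.m. up to midnight."""
    return day_series[START:POINTS_PER_DAY]

# Made-up sub-predictions for one day of one series.
periodic = [100.0 + 0.1 * t for t in range(POINTS_PER_DAY)]
random_part = [(-1.0) ** t for t in range(POINTS_PER_DAY)]

final = aggregate(periodic, random_part)
window = evaluation_window(final)
print(len(window))  # 216
```

The window length follows from 18 hours × 12 intervals per hour = 216 points, matching the number of evaluation points stated above.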
To validate the effectiveness of the proposed model, the well-known prediction models PCA-BP, Lasso-BP, and standard CNN were chosen for experimental comparison using the same dataset as our proposed model. The results are shown in Table 4.
The experimental results in Table 4 clearly show that the accuracy of the four models differs. Because of the advantage of CNN-based prediction models in dealing with multidimensional data, the RMSE, MAPE, and MAE of our model are much smaller than those of the other models; higher values of these metrics indicate lower prediction accuracy. These outcomes show that the proposed prediction model, which takes into account the intrinsic characteristics of traffic parameters and delivers the advantages of a CNN, achieves better prediction performance.

Conclusion
In this paper, a mode decomposition based deep learning model for multi-section road traffic prediction was proposed for highway traffic prediction. First, with consideration of the intrinsic characteristics of traffic parameters, the raw dataset was transformed to periodic and random sequences by EMD. Next, a prediction model based on CNN was established to complete the prediction of periodic and random sequences by considering the effect of spatiotemporal information. Finally, two parts of the sub-prediction results were aggregated to obtain the final prediction results. Experimental results show that the proposed prediction model is more accurate than several popular models. Nevertheless, this study does not consider other factors, such as weather conditions. Such factors will be considered in future work to achieve higher prediction accuracy.