Prediction of Spatiotemporal Evolution of Urban Traffic Emissions Based on Taxi Trajectories

With the rapid increase in the number of vehicles in urban areas, pollution from vehicle emissions is becoming increasingly serious. Precise prediction of the spatiotemporal evolution of urban traffic emissions plays an important role in urban planning and policy making. Most existing methods focus on estimating vehicle emissions at historical or current moments, which cannot meet the demands of future planning. Recent work has started to address the evolution of vehicle emissions at future moments using multiple emission-related attributes; however, these methods do not combine and utilize the different inputs effectively or efficiently. To address this issue, we propose a joint framework that predicts the future evolution of vehicle emissions from the GPS trajectories of taxis, using a multi-channel spatiotemporal network and the motor vehicle emission simulator (MOVES) model. Specifically, we first estimate spatial distribution matrices from GPS trajectories through map-matching algorithms. These matrices reflect attributes related to the traffic status of road networks, such as volume, speed and acceleration. Then, our multi-channel spatiotemporal network efficiently combines these three key attributes through a feature sharing mechanism and generates a precise prediction of them for the future period. Finally, we adopt the MOVES model to estimate vehicle emissions by integrating several traffic factors, including the predicted traffic states, road networks and the statistical information of urban vehicles. We evaluate our model on the Xi’an taxi GPS trajectories dataset. Experiments show that our proposed network can effectively predict the temporal evolution of vehicle emissions.


Introduction
Environmental pollution from traffic is becoming an important public concern. Urban traffic congestion and dense traffic flow have exacerbated environmental problems, and vehicle emissions have become one of the main sources of urban air pollution [1]. The pollutants emitted by vehicles include carbon monoxide, nitrogen oxides, and particulate matter, which are the main causes of smog and photochemical smog pollution [2]. Therefore, the urban transportation system needs an effective environmental monitoring and early warning system, and the key issue is the accurate prediction of the evolution of traffic emissions. The trend of vehicle pollution emissions is mainly affected by driving conditions, such as changes in vehicle speed, acceleration, and traffic volume. Predicting future motor vehicle emissions therefore means predicting the temporal and spatial trends of multiple related traffic condition variables, and an accurate and efficient prediction of these trends provides a sound basis for predicting the evolution of urban vehicle emissions.
In recent years, more and more researchers have focused on vehicle emissions and urban air pollution. Existing research on predicting the temporal and spatial distribution of urban mobile source emissions can be roughly divided into two main approaches: model-driven and data-driven. In the model-driven approach, vehicle emission models are used to obtain an emission inventory, which estimates or predicts the total traffic emissions of a specific area [3,4] within a specified time range (usually one year). Wang et al. [5] proposed a vehicle emission model based on mileage and emission factors to study vehicle emission trends in China. Emission models such as the motor vehicle emission simulator (MOVES) [6], the computer programme to calculate emissions from road transport (COPERT) [7,8] and the international vehicle emission (IVE) model [9] have been developed and adjusted according to vehicle information databases (such as vehicle type and fuel type) in various locations. MOVES was developed by the US Environmental Protection Agency and is used to estimate emissions from mobile sources on highways. It considers several mobile emission processes, including exhaust during driving, brake wear, tire wear, and driving losses. COPERT is a commonly used emission model in Europe. It uses a large amount of experimental data to determine the emission parameters of road transportation and obtain an emission inventory. The IVE model uses vehicle specific power (VSP) and engine stress (ES) as inputs to calculate emission factors. Recently, Jamshidnejad et al. [10] proposed a comprehensive framework that combines micro and macro emission models to estimate vehicle emissions. The emission inventory estimated by model-driven methods can provide the macro-emissions of a city, but it cannot satisfy the short-term and fine-grained forecasting needs of the early warning mechanism in an urban environmental monitoring system.
Although the above methods have made great progress, predicting vehicle emissions remains a challenging problem. Model-driven algorithms lack universality and ignore the influence of geographic and environmental factors on the distribution of traffic flow. Because of their limited accuracy, these algorithms can only predict the total emissions of an entire city or region, and cannot predict emissions at a fine scale.
With the rapid development of traffic data collection and big data technology [11,12], researchers have turned to data-driven methods, using mobile source pollution emission monitoring data and other urban multi-source data to study the spatiotemporal prediction of vehicle emissions. The development of laser remote sensing technology [13−16] has enabled on-road remote sensing equipment to measure the instantaneous emissions of vehicles while driving. Xu et al. [17] proposed long short-term memory (LSTM) networks combined with autoencoders to predict vehicle emissions based on data obtained from remote sensing stations. Xu et al. [18] proposed a deep spatiotemporal residual early-late fusion network with semi-supervised geographically weighted regression to predict vehicle emissions in urban areas using sparse monitoring stations. However, the data collected in this way is not extensive enough, and the sparseness of the monitoring stations makes the granularity of the emission predictions insufficient.
With the establishment and improvement of the environmental monitoring system and intelligent transportation system, a large number of GPS sensors have been installed on taxis. The resulting trajectory data, which has spatiotemporal properties, is distributed over the streets of the city and can reflect the traffic state in a small area or even at street scale. GPS trajectories have temporal continuity and closeness, which makes short-term estimation possible. Moreover, the data covers all the streets of the city, which not only overcomes the sparseness of telemetry stations but also enables fine-grained emission analysis.
Aslam et al. [19] verified that the traffic patterns reflected in taxi trajectory data obtained through high-density sampling follow roughly the same trend as the actual traffic patterns. Therefore, taxi trajectory data is widely used in research on urban traffic and traffic pollution. Nyhan et al. [20] inferred the spatial and temporal distribution of Singapore's vehicle emissions from taxi GPS trajectories and loop detector data. Shang et al. [21] used taxi trajectory data and urban road network information to infer vehicle emissions in Beijing. Nocera et al. [22] estimated the carbon emissions of road transport using incomplete traffic information collected by a flow estimator.
Existing research on using big data to estimate vehicle pollution emissions at any scale is mature. In most works, however, the current or past distribution of vehicle emissions is estimated with the aid of traffic conditions, while the spatiotemporal evolution of future emissions has not yet been analyzed.
It is a prediction problem to obtain evolution of vehicle emissions by using historical GPS trajectory data. There are some challenges here: 1) The prediction of emission trends is complicated and requires high accuracy and calculation efficiency. Only considering a single feature cannot achieve accurate modeling and prediction of emissions. In order to improve the accuracy of vehicle emission prediction, multiple traffic attributes need to be considered in the vehicle emission model. For example, the traffic average speed and volume are used in the COPERT model to calculate the emissions of vehicles; the MOVES model additionally considers average acceleration. However, predicting multiple features separately will greatly increase the computation cost and reduce the development efficiency. Therefore, how to improve calculation efficiency in the case of predicting multiple traffic state-related features at the same time becomes an urgent problem to be solved.
2) How to effectively extract and integrate multiple attributes related to traffic emissions is also a challenging task. The attributes related to traffic emissions include traffic volume, average speed, acceleration, etc. The temporal and spatial variation trends of these attributes are complex and nonlinear, and are hard to predict precisely. Besides, affected by the external environment and location factors, these attributes reflect different traffic conditions while remaining related to each other, which makes the design of the feature extraction and fusion mechanism difficult.
In order to solve the above problems, we propose a joint framework based on the multi-channel spatiotemporal graph convolution network and the MOVES emission factor model to predict spatial and temporal distribution of traffic emissions with reasonable accuracy, using multi-source datasets including meteorological data and road network data.
Specifically, we first match the GPS trajectories to different spatial grids with a map-matching algorithm, and extract the time-varying traffic status, namely the volume, average speed and acceleration of the taxis passing through each grid. Then the spatiotemporal feature spaces of taxi volume, speed and acceleration are constructed by multi-channel spatiotemporal graph convolution networks, and a feature sharing mechanism couples these features to predict traffic states in future periods. Finally, the MOVES emission model is used to calculate emission factors, which are combined with road network information to estimate vehicle emissions. Comparison and visualization experiments on the Xi'an taxi trajectories dataset demonstrate the effectiveness of this method.
The main contributions of this paper are as follows:
1) In order to improve the accuracy of pollution prediction and ensure the computational efficiency of the model, we use a multi-channel mechanism to predict multiple attributes simultaneously. At the same time, in order to balance the scale differences between features, we introduce homoscedastic uncertainty to learn the weight of the loss of each channel.
2) In order to better construct the spatiotemporal dependence of each attribute, we use spatiotemporal graph convolutional network (STGCN) in each channel to extract spatiotemporal features layer by layer. In addition, we introduce a feature sharing mechanism to model the selection tendency in the extraction of features between related attributes which helps to achieve better fusion and utilization of different attributes.
3) We evaluate the effectiveness of our method on the GPS trajectory dataset of taxis in Xi'an. The results show that the multi-channel mechanism shortens the training time by 17.72% without sacrificing prediction accuracy; with the feature sharing mechanism, the prediction accuracy for traffic volume and average speed increases by 4.86% and 4.68%, respectively. In addition, the predicted distribution of pollution across the different functional areas of the city is basically consistent with the actual situation, which indicates that the prediction of vehicle emissions is effective.
The rest of this paper is organized as follows. Section 2 gives the system overview. Section 3 describes the proposed urban traffic emission evolution prediction model. Section 4 presents the experimental setup and the results of the related experiments. Finally, Section 5 concludes the paper.

Definition
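Definition 1. Trajectory. Let $Tr_t$ denote a set of GPS trajectories at the $t$-th time interval, and let $tr: p_1 \rightarrow p_2 \rightarrow \cdots \rightarrow p_n$ be a trajectory in $Tr_t$, where each point $p_i$ has a geospatial coordinate set $g_i$ and a timestamp $t_i$, i.e., $p_i = (g_i, t_i)$.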

Methodology
In this section, we propose the prediction model for evolution of traffic emissions based on a joint training framework of map-matching and multi-channel spatiotemporal graph convolution network, as shown in Fig. 1. Specifically, we first match the GPS trajectory to corresponding spatial grids using the map-matching algorithm, and taxi volume, average speed and acceleration in each grid are calculated. Then, we adopt a multi-channel spatiotemporal graph convolution network to construct the feature spaces of volume, speed and acceleration respectively, and use the feature sharing mechanism to couple the above features to predict the traffic states in spatial grids in the next time interval. Finally, we estimate the emission of urban vehicles by the MOVES model, using road network and urban vehicle statistical information.

Traffic states construction
The GPS trajectories received from the vehicles are projected onto the road network using the map-matching algorithm [23]. After matching, each trajectory point is mapped onto the corresponding road segment. Given two consecutive trajectory points $p_i$ and $p_{i+1}$, the speed and acceleration at $p_i$ can be calculated as

$$v_{p_i} = \frac{\mathrm{Dist}(p_i, p_{i+1})}{t_{i+1} - t_i}, \qquad a_{p_i} = \frac{v_{p_{i+1}} - v_{p_i}}{t_{i+1} - t_i},$$

where $\mathrm{Dist}(\cdot,\cdot)$ is the function that calculates the road-network distance between two points. Likewise, we can calculate the average speed and average acceleration of grid $g$ at interval $t$ as follows:

$$\bar{v}_{g,t} = \frac{1}{|P_{g,t}|} \sum_{p \in P_{g,t}} v_p, \qquad \bar{a}_{g,t} = \frac{1}{|P_{g,t}|} \sum_{p \in P_{g,t}} a_p,$$

where $|\cdot|$ denotes the cardinality of a set, and $P_{g,t}$ and $Tr_{g,t}$ denote the set of trajectory points and the set of trajectories in grid $g$ at interval $t$. In addition, GPS trajectory data can be used to find the traffic volume of a certain area in a time interval. The taxi volume of grid $g$ at interval $t$ is defined as

$$f_{g,t} = |Tr_{g,t}|.$$

Therefore, three matrices representing the time-varying traffic conditions within the grid area can be extracted from the historical trajectories. We define the set of time intervals as $\{1, 2, \ldots, T\}$, and the traffic status, including average speed, acceleration and taxi volume in all $N$ regions, can be denoted as three matrices $S, A, F \in \mathbb{R}^{T \times N}$. The area within the Second Ring Road of Xi'an is divided into 8×8 disjoint grids. Each row of the three matrices represents a time interval, and each column represents a grid unit. Each element represents the average speed, the average acceleration, or the taxi volume of all vehicles passing through the grid area in a certain time interval, which reflects the traffic condition of the area. As shown in Fig. 2, when the trajectories in a grid are dense, i.e., the taxi volume is large, the average speed of vehicles is lower than in areas with sparse trajectories. Similarly, the average acceleration of vehicles is also lower than in grids with fewer vehicles.
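The aggregation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function name `grid_traffic_states`, the input layout, and the use of straight-line distance in place of the road-network distance are all hypothetical, and timestamps within a trajectory are assumed to be strictly increasing.

```python
from collections import defaultdict
from math import hypot


def grid_traffic_states(trajectories, cell_km=1.0):
    """Per-grid taxi volume, mean speed (km/h) and mean acceleration (km/h per s).

    trajectories: {traj_id: [(t_seconds, x_km, y_km), ...]}, sorted by time.
    Assumption: straight-line distance stands in for the road-network distance.
    """
    speeds = defaultdict(list)
    accels = defaultdict(list)
    visitors = defaultdict(set)
    for traj_id, pts in trajectories.items():
        # Speed at p_i is taken from the segment (p_i, p_{i+1}).
        v = [hypot(b[1] - a[1], b[2] - a[2]) / (b[0] - a[0]) * 3600.0
             for a, b in zip(pts, pts[1:])]
        for i, (t, x, y) in enumerate(pts[:-1]):
            cell = (int(x // cell_km), int(y // cell_km))
            visitors[cell].add(traj_id)
            speeds[cell].append(v[i])
            if i + 1 < len(v):
                # Acceleration at p_i: change of speed over the same time step.
                accels[cell].append((v[i + 1] - v[i]) / (pts[i + 1][0] - t))
    states = {}
    for cell, vs in speeds.items():
        acc = accels[cell]
        states[cell] = {
            "volume": len(visitors[cell]),        # number of distinct trajectories
            "speed": sum(vs) / len(vs),           # mean over trajectory points
            "accel": sum(acc) / len(acc) if acc else 0.0,
        }
    return states
```

Running this once per time interval and stacking the results row by row would give the three time-by-grid matrices used as network input.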

Graph construction
In this work, we define the road network as a set of time-varying spatial graphs $G_t = (V_t, E, W)$. In graph $G_t$ at the $t$-th time interval, $V_t$ is a set of vertices corresponding to the traffic status in the above-mentioned grids (nodes); $E$ is the set of edges representing the connectedness between nodes, while $W \in \mathbb{R}^{N \times N}$ denotes the weighted adjacency matrix of $G_t$.

Multi-channel STGCN
The multi-channel spatiotemporal graph convolutional network is a model that we propose based on STGCN [24]. It predicts three features of the traffic network simultaneously, namely taxi volume, average speed and acceleration, while guaranteeing reasonable accuracy and scale. The network structure is shown in Fig. 1. Our model introduces a novel multi-channel feature sharing mechanism. The network constructs the spatiotemporal feature spaces of volume, speed and acceleration separately, and interactively feeds them into the spatiotemporal convolution modules of the other channels. Through this feature sharing mechanism, each of the three channels eavesdrops on the feature information related to itself.
In this paper, spatiotemporal graph convolutional networks [24] are used to capture the dynamic spatial and temporal correlations on traffic networks. The network includes several spatiotemporal convolutional blocks, which are a combination of graph convolutional layers [25] and temporal convolutional layers to model spatial and temporal correlations.
Regarding the graph convolution layer, we mainly consider spectral convolution on arbitrary graphs [25]. Since it is difficult to express a meaningful translation operator in the node domain, the convolution operator on the graph is defined through its spectral representation [25], denoted as $*_G$. According to this definition, the convolution of node features $X \in \mathbb{R}^{N \times T}$ with a filter $g_w = \mathrm{diag}(w)$ parameterized by $w \in \mathbb{R}^N$ in the Fourier domain is

$$g_w *_G X = g_w(L) X = g_w(U \Lambda U^T) X = U g_w(\Lambda) U^T X,$$

where $U$ is the eigenvector matrix and $\Lambda$ is the diagonal matrix of the eigenvalues of the normalized graph Laplacian

$$L = I_N - D^{-1/2} W D^{-1/2} = U \Lambda U^T,$$

where $I_N$ represents the identity matrix and $D$ represents the diagonal degree matrix with $D_{ii} = \sum_j W_{ij}$. We define $g_w(\Lambda)$ as a function of the eigenvalues of $L$. However, since the complexity of multiplying by $U$ is $O(N^2)$, the computation is expensive. To solve this problem, Defferrard et al. [25] used a Chebyshev polynomial expansion to obtain an efficient approximation:

$$g_w *_G X \approx \sum_{k=0}^{K-1} w_k T_k(\tilde{L}) X, \qquad \tilde{L} = \frac{2}{\lambda_{\max}} L - I_N,$$

where $T_k$ is the Chebyshev polynomial of order $k$, $\lambda_{\max}$ is the largest eigenvalue of $L$, and $w \in \mathbb{R}^K$ is the vector of Chebyshev coefficients. Details about the Chebyshev polynomial approximation can be found in [25,26].
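To make the Chebyshev approximation concrete, the following is a minimal pure-Python sketch of the recurrence $T_0 X = X$, $T_1 X = \tilde{L} X$, $T_k X = 2\tilde{L}\,T_{k-1}X - T_{k-2}X$. The helper names are hypothetical, and $\lambda_{\max}$ is set to 2 by default, which is a common simplification (the eigenvalues of a normalized Laplacian lie in $[0, 2]$) rather than a value taken from the paper.

```python
from math import sqrt


def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def scaled_laplacian(W, lam_max=2.0):
    """L~ = 2 L / lam_max - I with L = I - D^{-1/2} W D^{-1/2}."""
    n = len(W)
    deg = [sum(row) for row in W]
    L = [[(1.0 if i == j else 0.0)
          - (W[i][j] / sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0)
          for j in range(n)] for i in range(n)]
    return [[2.0 * L[i][j] / lam_max - (1.0 if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def cheb_graph_conv(L_tilde, X, w):
    """Return sum_k w[k] * T_k(L~) X without any eigendecomposition."""
    t_prev, t_cur = X, matmul(L_tilde, X)          # T_0 X and T_1 X
    out = [[w[0] * x for x in row] for row in X]
    for k in range(1, len(w)):
        out = [[o + w[k] * t for o, t in zip(orow, trow)]
               for orow, trow in zip(out, t_cur)]
        nxt = matmul(L_tilde, t_cur)
        # Chebyshev recurrence: T_{k+1} X = 2 L~ T_k X - T_{k-1} X
        t_prev, t_cur = t_cur, [[2.0 * a - b for a, b in zip(r1, r2)]
                                for r1, r2 in zip(nxt, t_prev)]
    return out
```

Each term only needs a product with $\tilde{L}$, so a sparse adjacency matrix keeps the cost proportional to the number of edges times $K$.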
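The temporal convolutional layer contains a one-dimensional convolution with kernel width $K_t$, followed by gated linear units (GLU) [27] as the activation. The input of the temporal convolution for each node in graph $G$ can be regarded as a sequence of length $M$ with $C_i$ channels, denoted as $Y \in \mathbb{R}^{M \times C_i}$. The convolution kernel $\Gamma \in \mathbb{R}^{K_t \times C_i \times 2C_o}$ is designed to map $Y$ to a single output $[P\ Q] \in \mathbb{R}^{(M-K_t+1) \times 2C_o}$. Therefore, the temporal convolutional layer can be defined as

$$\Gamma *_T Y = P \odot \sigma(Q) \in \mathbb{R}^{(M-K_t+1) \times C_o},$$

where $P$ and $Q$ are the two halves of the convolution output that feed the GLU, $\odot$ is the element-wise product and $\sigma$ is the sigmoid function. The receptive field can be enlarged by stacking temporal convolutional layers. In addition, residual connections are used between the stacked temporal convolutional layers. Similarly, by applying the same convolution kernel to every node, the temporal convolution can be generalized to 3D variables, denoted as $\Gamma *_T \mathcal{Y}$ with $\mathcal{Y} \in \mathbb{R}^{M \times N \times C_i}$.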
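The gated temporal convolution can be illustrated on a single-channel sequence. The sketch below is a simplification under stated assumptions: one input channel, one output channel, and two hypothetical kernels `kernel_p` and `kernel_q` standing in for the two halves of the full kernel.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def glu_temporal_conv(seq, kernel_p, kernel_q):
    """1-D convolution without padding, followed by a gated linear unit.

    seq: list of length M; kernel_p, kernel_q: lists of length K_t.
    Returns a list of length M - K_t + 1 holding P * sigmoid(Q).
    """
    k = len(kernel_p)
    out = []
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        p = sum(w * x for w, x in zip(kernel_p, window))  # candidate values
        q = sum(w * x for w, x in zip(kernel_q, window))  # gate pre-activation
        out.append(p * sigmoid(q))
    return out
```

Because no padding is applied, each layer shortens the sequence by $K_t - 1$ steps, which is how stacked layers adjust the temporal span of the data.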
In order to fuse features from the spatial and temporal domains at the same time, Yu et al. [24] constructed a spatiotemporal convolutional block (ST-Conv block) based on the bottleneck strategy, which consists of two temporal convolutional layers, one at the top and one at the bottom, with a spatial convolution layer in the middle. When the input of block $l$ is the feature matrix $v^l$, the output $v^{l+1}$ is calculated as

$$v^{l+1} = \Gamma_1^l *_T \mathrm{ReLU}\left(\Theta^l *_G \left(\Gamma_0^l *_T v^l\right)\right),$$

where $\Gamma_0^l$ and $\Gamma_1^l$ are the parameters of the upper and lower temporal convolutional layers of block $l$, $\Theta^l$ is the spectral kernel of the graph convolution, and ReLU is the rectified linear unit activation function. After stacking two ST-Conv blocks, a temporal convolutional layer and a fully connected layer are used as the final output layer.
We design three parallel channels, namely the volume channel, the speed channel, and the acceleration channel, to extract the temporal and spatial dependence features of traffic volume, speed and acceleration respectively, as shown in Fig. 1. Each channel in the multi-channel spatiotemporal graph convolutional network (MC-STGCN) is an ST-subnetwork composed of two ST-Conv blocks and an output layer. The input of each of the three sub-networks is a time-ordered sequence of traffic attribute graphs. Because the networks run in parallel, the feature extraction processes of the attributes are independent of each other and proceed simultaneously, which improves the computational efficiency of our task.
Within MC-STGCN, each channel uses the same ST-Conv block structure of Yu et al. [24] (two temporal convolutional layers with a spatial convolution layer in the middle), but the block input is extended with features from related channels. For example, when the input of block $l$ is the feature matrix of the traffic speed, the feature matrix of the acceleration is also packaged into the input of the block. Denoting the packaged input by $\tilde{v}^l$, the output is calculated as

$$v^{l+1} = \Gamma_1^l *_T \mathrm{ReLU}\left(\Theta^l *_G \left(\Gamma_0^l *_T \tilde{v}^l\right)\right),$$

where $\Gamma_0^l$ and $\Gamma_1^l$ are the parameters of the upper and lower temporal convolutional layers of block $l$, $\Theta^l$ is the spectral kernel of the graph convolution, and ReLU is the rectified linear unit activation function. After stacking two ST-Conv blocks, a temporal convolutional layer and a fully connected layer are used as the final output layer.

Feature sharing mechanism
The prediction tasks for taxi volume, traffic speed and acceleration have closely related temporal and spatial characteristics, such as spatial correlation, temporal periodicity and temporal dependence. Owing to this similarity between channels, feature sharing can provide each task with more information about the spatiotemporal representation, thereby helping the ST-subnetworks extract more accurate feature representations.
In this article, the output of the first ST-Conv block in each channel is used as a shared feature and becomes an input feature of the second ST-Conv block in the other channels. Since the feature representations of different channels have different scales and statistical properties, the network would otherwise prefer features with larger values and ignore the other feature information. Therefore, we first standardize the shared features and then concatenate them into high-dimensional feature vectors. The input of the second ST-Conv block of the speed channel is

$$\tilde{H}_s = \mathrm{Concat}\left(\mathrm{Norm}(H_s), \mathrm{Norm}(H_a), \mathrm{Norm}(H_f)\right),$$

where $H_s$, $H_a$, $H_f$ are the outputs of the first ST-Conv block in the three channels, and $\mathrm{Norm}(\cdot)$ represents the standardization operation, which transforms features into a representation with the same mean and variance. $\mathrm{Concat}(\cdot)$ concatenates matrices along a given dimension. Similarly, the inputs of the second ST-Conv block of the acceleration channel and the volume channel are:

$$\tilde{H}_a = \mathrm{Concat}\left(\mathrm{Norm}(H_a), \mathrm{Norm}(H_s), \mathrm{Norm}(H_f)\right), \qquad \tilde{H}_f = \mathrm{Concat}\left(\mathrm{Norm}(H_f), \mathrm{Norm}(H_s), \mathrm{Norm}(H_a)\right).$$
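The standardize-then-concatenate step can be sketched as follows. This is only one plausible reading of the mechanism: the function names are hypothetical, standardization is taken over all entries of a feature matrix, and concatenation is along the channel dimension of a node-by-channel matrix.

```python
from math import sqrt


def standardize(H, eps=1e-8):
    """Rescale a node-by-channel matrix to zero mean and unit variance."""
    flat = [x for row in H for x in row]
    mean = sum(flat) / len(flat)
    var = sum((x - mean) ** 2 for x in flat) / len(flat)
    std = sqrt(var) + eps  # eps guards against constant features
    return [[(x - mean) / std for x in row] for row in H]


def share_features(own, others):
    """Concatenate the standardized own feature with standardized shared ones.

    own: N x C matrix from this channel's first ST-Conv block (assumed layout).
    others: list of N x C' matrices from the other channels.
    """
    parts = [standardize(own)] + [standardize(H) for H in others]
    # Join the channel lists node by node.
    return [sum((p[i] for p in parts), []) for i in range(len(own))]
```

Without the standardization, a feature measured in hundreds (for example volume) would dominate one measured in fractions (for example acceleration) after concatenation.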

External factor fusion
External factors like weather can affect urban traffic and road conditions. For example, heavy rain may congest the streets. External factors act like switches: when they are triggered, the road conditions change dramatically. Based on this observation, Zhang et al. [28] developed a gating-mechanism-based fusion, which takes the external features at time $t$, expressed as $E_t$, as input to the network, as shown in Fig. 1. Formally, we obtain the following gating value:

$$F_t = \sigma(W_e \cdot E_t + b_e),$$

where $W_e$ and $b_e$ are learnable parameters, $F_t$ is the output value of the gate, $\sigma$ is the sigmoid function and "$\cdot$" is the dot product of two matrices. Then a product fusion based on the gating mechanism is embedded in the output layer of each ST-subnetwork for speed $s$, acceleration $a$ and volume $f$:

$$\hat{x}_{t+1} = \tanh\left(F_{t+1} \circ \left(\Gamma_x *_T Z_x\right)\right), \qquad x \in \{s, a, f\},$$

where $\tanh$ is the hyperbolic tangent function, which ensures that the output value is between -1 and 1, $\circ$ is the element-wise product, $Z_x$ is the feature entering the output layer of channel $x$, $\Gamma_x$ is the parameter of the temporal convolutional layer, and $E_{t+1}$, from which $F_{t+1}$ is computed, represents the weather forecast of the environmental features at the future moment. The predicted values $\hat{s}_{t+1}$, $\hat{a}_{t+1}$ and $\hat{f}_{t+1}$ at time $t+1$ are obtained in this way.

Let $\Theta_s$ be the set of all learnable parameters of the speed prediction model of MC-STGCN. Our goal is to learn them by minimizing the loss function between the predicted value $\hat{s}$ and the true value $s$:

$$L_s(\Theta_s) = \sum_t \left\| \hat{s}_{t+1} - s_{t+1} \right\|^2,$$

and the losses $L_a(\Theta_a)$ and $L_f(\Theta_f)$ are defined in the same way, where $\Theta_a$ and $\Theta_f$ are the sets of all learnable parameters of the acceleration and volume prediction models. In these squared loss functions, $s$, $a$, $f$ are the ground truths, and $\hat{s}$, $\hat{a}$, $\hat{f}$ are the values predicted by the model.
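The gating-based fusion of external factors described above can be sketched as a sigmoid gate computed from the external feature vector, multiplied element-wise with a channel's output features and squashed by tanh. The function name, the vector shapes and the exact place where the gate is applied are assumptions made for illustration.

```python
from math import exp, tanh


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def gated_fusion(features, external, W, b):
    """tanh(gate * features) with gate = sigmoid(W . external + b).

    features: list of length N (one value per grid).
    external: list of length D (e.g. encoded weather variables).
    W: N x D weight matrix, b: bias list of length N (learnable in practice).
    """
    gate = [sigmoid(sum(w * e for w, e in zip(row, external)) + bi)
            for row, bi in zip(W, b)]
    return [tanh(g * f) for g, f in zip(gate, features)]
```

Because of the final tanh, the targets would need to be scaled to the same range before the loss is computed.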
The performance of the multi-channel learning model depends on the loss weights between channels. Manually adjusting the weights is time-consuming and labor-intensive. In order to better optimize our proposed multi-channel network, we use the strategy proposed in [29] to balance the three channels. The network loss function is then

$$L = \frac{1}{2\sigma_s^2} L_s + \frac{1}{2\sigma_a^2} L_a + \frac{1}{2\sigma_f^2} L_f + \log\left(\sigma_s \sigma_a \sigma_f\right),$$

where $\sigma_s$, $\sigma_a$, $\sigma_f$ are the balancing weights of the three channels, which are optimized as parameters during training. In this way, the weighting hyperparameters of the loss function are learned automatically through homoscedastic uncertainty, so that the loss of each task has a similar scale.
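As a small worked illustration of this weighting, the sketch below evaluates the combined loss for given per-channel losses. It parameterizes each channel by $s = \log \sigma^2$, a common choice for numerical stability that is an assumption here, not something stated in the paper; with this substitution each term $L/(2\sigma^2) + \log\sigma$ becomes $\tfrac{1}{2}(e^{-s} L + s)$.

```python
from math import exp


def uncertainty_weighted_loss(losses, log_vars):
    """Sum over channels of L_c / (2 sigma_c^2) + log sigma_c.

    losses: per-channel losses, e.g. [L_s, L_a, L_f].
    log_vars: per-channel s_c = log(sigma_c^2); these would be trainable
    parameters in a real training loop.
    """
    return sum(0.5 * (exp(-s) * loss + s) for loss, s in zip(losses, log_vars))
```

A channel with a large raw loss can lower its contribution by increasing its $\sigma$, while the $\log\sigma$ term prevents $\sigma$ from growing without bound.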

Emission model
Traffic speed, acceleration and volume information can be further used to estimate vehicle emissions in the road network. Different models from environmental science can be used for this purpose. These models quantify the relationship between emissions and speed and other factors based on large amounts of data. MOVES was developed by the US Environmental Protection Agency (EPA) and is capable of calculating vehicle pollutant emissions at different scales. We use this model because MOVES describes the working conditions and emission levels of vehicles in more detail. The MOVES model mainly calculates the distribution of vehicle operating conditions by combining speed, acceleration and vehicle specific power (VSP). Equation (22) is simplified to Equation (23) [30].
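For orientation, VSP can be computed from speed and acceleration alone once the road grade is neglected. The sketch below uses the widely cited light-duty approximation $\mathrm{VSP} = v\,(1.1a + 9.81\,\mathrm{grade} + 0.132) + 0.000302\,v^3$ (with $v$ in m/s, $a$ in m/s², VSP in kW/t). Whether this matches the exact form of the equations referenced in the paper is an assumption; the function name is hypothetical.

```python
def vehicle_specific_power(v_mps, a_mps2, grade=0.0):
    """Approximate VSP in kW per tonne for a light-duty vehicle.

    v_mps: speed in m/s; a_mps2: acceleration in m/s^2;
    grade: road grade as a fraction (0.0 means a flat road).
    The coefficients are generic literature values, assumed for illustration.
    """
    return v_mps * (1.1 * a_mps2 + 9.81 * grade + 0.132) + 0.000302 * v_mps ** 3
```

With the grade set to zero, only the predicted speed and acceleration of a grid are needed, which is what makes the network outputs sufficient to select an operating mode.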

A simplified MOVES model is proposed in [31], which divides the operating modes of vehicles into 23 types, corresponding to different default average emission rates (AER), as shown in Table 1. The values given in Table 1, which apply to Euro III passenger vehicles, are used to calculate the different kinds of emissions. Although the diversity of vehicles will influence accuracy, the results are still statistically useful, as we sample the most representative vehicles. The dataset applied in this paper consists of taxi GPS trajectories collected within the Second Ring Road of Xi'an, where vehicle speed in the urban area is generally limited to between 40 km/h and 60 km/h. Moreover, according to the statistical characteristics of speed, the average is 30.26 km/h and the standard deviation is 9.66 km/h. Therefore, the operating modes in Table 1 cover emission estimation under almost all driving states. The emission factor (EF), which is the amount of pollutants emitted by each vehicle per kilometer (g/km), is then calculated. The total emissions in a particular area are

$$E = EF \times L \times Q,$$

where $L$ is the road length and $Q$ is the traffic volume. Traffic volume is estimated from the predicted taxi volume and the ratio of taxis to the total number of urban vehicles. For example, in 2018, there were 3.24 million vehicles in Xi'an, and 86 267 taxis were collected in the dataset. Therefore, the data sample accounts for 2.67% of the total urban vehicles. The traffic volume in each grid is estimated by dividing the taxi volume in the grid by this ratio.
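The last step is simple arithmetic and can be sketched directly. The function name and argument names are hypothetical; the default taxi share of 2.67% is the ratio quoted in the text, and the emission factor is assumed to be given in g/km.

```python
def total_emissions(ef_g_per_km, road_km, taxi_volume, taxi_share=0.0267):
    """Total emissions (g) of one grid in one time interval.

    ef_g_per_km: emission factor per vehicle (g/km).
    road_km: total road length inside the grid (km).
    taxi_volume: predicted number of taxis in the grid.
    taxi_share: fraction of all vehicles that are sampled taxis.
    """
    traffic_volume = taxi_volume / taxi_share  # scale taxis up to all vehicles
    return ef_g_per_km * road_km * traffic_volume
```

For instance, 267 predicted taxis correspond to roughly 10 000 vehicles under this ratio, so the scaling step dominates the magnitude of the estimate and any error in the taxi share carries over proportionally.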

Dataset description
The dataset used in this paper is summarized in Table 2. The details are as follows.
We use a large-scale online taxi GPS dataset collected by Didi Chuxing, an online car-hailing company in China. The data source is https://gaia.didichuxing.com. The dataset contains taxi GPS trajectories from October 1, 2018 to November 29, 2018 in Xi'an. The road network within the Second Ring Road of Xi'an covers a spatial range of 8 km × 8 km, with a total length of 514 km. Urban roads are divided into 4 levels: expressways, main roads, secondary roads and branch roads.

We divide the Second Ring Road area of Xi'an into 8 × 8 grids, and the size of each grid is about 1 km × 1 km. The length of the time interval is set to 1 hour, so each node in the graph contains 24 data points per day. We use the Z-score method to convert traffic speed and acceleration to a scale with mean 0 and variance 1. In the experiments, the data from October 1, 2018 to November 17, 2018 (48 days) is used for training, and the data from November 18, 2018 to November 29, 2018 (12 days) is used as the validation set and the test set. When testing the prediction results, we use the first 12 time intervals to predict the value of the next time interval.
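The preprocessing above (Z-score normalization, then 12 historical intervals to predict the next one) can be sketched for a single grid as follows. The function names are hypothetical, and in practice the mean and standard deviation would be computed on the training split only.

```python
from math import sqrt


def zscore(series):
    """Rescale a series to mean 0 and variance 1."""
    mean = sum(series) / len(series)
    var = sum((x - mean) ** 2 for x in series) / len(series)
    std = sqrt(var) or 1.0  # avoid division by zero for a constant series
    return [(x - mean) / std for x in series]


def sliding_windows(series, history=12):
    """Pair each run of `history` hourly values with the value that follows."""
    return [(series[i:i + history], series[i + history])
            for i in range(len(series) - history)]
```

With hourly data, one day of 24 values yields 12 such input-target pairs per grid.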
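The adjacency matrix is calculated based on the distance between grids in the road network. In this paper, we use dynamic time warping [32] to calculate the similarity distance $d_{ij}$ between node (grid) $i$ and node (grid) $j$. The weighted adjacency matrix is $W = (w_{ij})$, where $w_{ij}$ is the weight of the edge between nodes $i$ and $j$, which is decided by

$$w_{ij} = \begin{cases} \exp\left(-\dfrac{d_{ij}^2}{\sigma^2}\right), & i \neq j \text{ and } \exp\left(-\dfrac{d_{ij}^2}{\sigma^2}\right) \geq \epsilon, \\ 0, & \text{otherwise}, \end{cases}$$

and $\sigma^2$ and $\epsilon$ are thresholds that control the distribution and sparsity of $W$, set to 10 and 0.5, respectively. The visualization of the similarity distances and $W$ is shown in Fig. 3.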
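The construction of the adjacency matrix can be sketched with a textbook dynamic time warping routine and a thresholded Gaussian kernel. The absolute difference as local cost, the function names, and the use of raw per-grid time series as input are assumptions for illustration; the thresholds 10 and 0.5 are the values quoted in the text.

```python
from math import exp, inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with |x - y| local cost."""
    n, m = len(a), len(b)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


def weighted_adjacency(series, sigma2=10.0, eps=0.5):
    """w_ij = exp(-d_ij^2 / sigma2) if i != j and the weight is >= eps, else 0."""
    n = len(series)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w = exp(-dtw_distance(series[i], series[j]) ** 2 / sigma2)
                if w >= eps:
                    W[i][j] = w
    return W
```

Raising `eps` removes more weak edges and makes the graph sparser, while `sigma2` controls how quickly the weight decays with distance.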

Experimental settings
We set the hyperparameters of the network based on the performance on the validation dataset. In our model, the graph convolutional layers of the first and second ST-Conv blocks use 64 and 128 convolution kernels, respectively. All temporal convolutional layers use 32 convolution kernels, and the temporal span of the data is adjusted by controlling the step size of the temporal convolution. During the training phase, the learning rate is 0.001 and the batch size is 16. In the experiments, the performance of the multi-channel spatiotemporal network is evaluated by two common metrics: mean absolute error (MAE) and root mean square error (RMSE). For $n$ ground-truth values $y_i$ and predictions $\hat{y}_i$, $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$ and $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$. We compare MC-STGCN with widely used time series regression models, including: 1) HA: Historical average [33].
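The two evaluation metrics named above follow their standard definitions and can be computed as in this short sketch (function names are illustrative):

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error; penalizes large deviations more than MAE."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

RMSE is never smaller than MAE on the same data, and a large gap between the two indicates that a few grids or intervals have much larger errors than the rest.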

Experiment results
We compare MC-STGCN with 7 baseline methods on the Xi'an taxi dataset. Table 3 shows the prediction performance for the next hour. According to the evaluation metrics, our model achieves strong performance on traffic volume, speed, and acceleration. As we can see, the prediction results of general time series analysis methods (HA, static, FNN, LSTM, GRU) are usually not ideal, because they only consider the temporal dependencies of features and ignore spatial correlation. Therefore, these methods have limited ability to model nonlinear and complex traffic data. VAR further considers the spatial correlation between features and, as a result, achieves better performance. However, it fails to capture complex nonlinear temporal dependencies and dynamic spatial correlation. In contrast, STGCN is superior to the other baselines, indicating that it can effectively capture the dynamic changes of traffic data.
The proposed MC-STGCN model is significantly better than the single-channel model on traffic speed, and slightly better on traffic acceleration. This indicates that the multi-channel feature sharing mechanism helps the network extract spatiotemporal features.
Moreover, we designed an ablation experiment on the feature sharing mechanism, which removes the concatenation of feature vectors after the first ST-Conv block. The results are given in Table 3, in which the MC-STGCN (no-FS) row is the result of the model without the feature sharing mechanism. Simply using a multi-channel model to train three traffic attributes in parallel yields a larger deviation than the single-task STGCN. This is because the loss function of MC-STGCN contains the errors of all attributes, which makes it difficult to reach the optimization effect of individual training. The introduction of the feature sharing mechanism makes up for this defect.
Fig. 4 visualizes the prediction of traffic volume, CO2, CO, HC and NO within the Second Ring Road of Xi'an on a weekday (Monday, 2018/11/26). As shown in the first row, the traffic volume from 8 AM to 9 AM is larger than that from 10 AM to 11 AM. This is due to the morning peak, which is consistent with common sense. Similarly, the volume distribution from 5 PM to 6 PM differs from that from 8 PM to 9 PM. When people leave work and school, traffic volume moves from the workplaces in the city center to the residential areas. Therefore, the predicted vehicle emissions are mainly concentrated on the dense roads in the city center during the morning and evening peak hours. After the evening peak, vehicle emissions increase in the direction of the suburbs.
However, the distribution of traffic volume and vehicle emissions on weekends is different from that on weekdays. As shown in the first row of Fig. 5, the traffic volume from 10 AM to 11 AM is significantly larger than that from 8 AM to 9 AM. People do not have to go to work on weekends, usually go out later, and are more inclined to go to the entertainment areas in the city center. Therefore, the vehicle emissions in the city center increase during this period. Similarly, from 5 PM to 6 PM, the city center is still a gathering place for citizens, and vehicle emissions remain high. From 8 PM to 9 PM, the traffic volume begins to spread along the main urban roads, and the distribution of vehicle emissions changes with the same trend.
Fig. 6 compares traffic volume and vehicle emissions on weekdays and weekends. From 8 AM to 9 AM, there is more traffic volume on weekdays than on weekends. In addition, the volume on weekdays is more concentrated than that on weekends, whether in the morning or at night. This is because people's travel destinations are more specific (workplace or school) on weekdays, and the travel time is concentrated in the same period. In contrast, on weekends, citizens' travel destinations, travel times, and the number of vehicles on the road are scattered. Although the purpose of travel is different, the distribution of vehicles is roughly the same because workplaces and entertainment areas are both concentrated in the city center. Therefore, the pollutants emitted by vehicles in the morning and evening peaks are concentrated in the city center, and vehicle emissions late at night are higher in the suburbs. In addition, vehicle volume and pollutant emissions are distributed along urban roads. Large traffic volume on dense roads causes congestion, and vehicles then emit more pollutants.

Conclusions
In this paper, we predict the evolution of vehicle emissions in urban road networks based on historical taxi GPS trajectories. The knowledge gained from our research can support many valuable applications for social welfare, such as vehicle emission warnings, improved urban planning, and the study of air pollution sources. Considering both efficiency and effectiveness, we solve this problem with a three-step method. We first map the trajectory data to road networks and calculate the average traffic speed and acceleration of each area. Then, we use the multi-channel STGCN network to predict the traffic status in future periods. Finally, the pollution emission distribution is calculated based on the predicted traffic status and the proportionally estimated traffic volume. We evaluate our method with extensive experiments, using GPS trajectories generated by more than 80 000 vehicles over two months. The results demonstrate the effectiveness and rationality of our method. In the future, we will further improve the vehicle flow estimation algorithm to make the predicted evolution of emission-related attributes more accurate.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.