Short-term multi-step-ahead sector-based traffic flow prediction based on the attention-enhanced graph convolutional LSTM network (AGC-LSTM)

Zhang, Ying; Xu, Shimin; Zhang, Linghui; Jiang, Weiwei; Alam, Sameer; Xue, Dabin

doi:10.1007/s00521-024-09827-3


S.I.: Deep Neural Networks for Traffic Forecasting
Open access
Published: 07 May 2024


Neural Computing and Applications


Ying Zhang^1,2,
Shimin Xu^1,2,
Linghui Zhang^2,3,
Weiwei Jiang⁴,
Sameer Alam⁵ &
…
Dabin Xue ORCID: orcid.org/0000-0002-4368-9261⁶


This article has been updated

Abstract

Accurate sector-based air traffic flow predictions are essential for ensuring the safety and efficiency of the air traffic management (ATM) system. However, because air traffic flow has inherent spatial and temporal dependencies, such prediction remains a challenging problem. Existing methods consider the relationships between sectors but do not take into account the complicated spatiotemporal dynamics and interdependencies between the traffic flows of the route segments related to a sector. To address this challenge, the attention-enhanced graph convolutional long short-term memory network (AGC-LSTM) model is applied to improve short-term sector-based traffic flow prediction, in which the spatial structures of route segments related to the sector are considered for the first time. Specifically, a graph convolutional network (GCN)-LSTM model is employed to capture the spatiotemporal dependencies of the flight data, and an attention mechanism is designed to concentrate on the informative features from key nodes at each layer of the AGC-LSTM model. The proposed model is evaluated through a case study of a typical enroute sector in the central–southern region of China. The prediction results show that the MAE is 14.4% lower than that of GCN-LSTM, the best performing of the other five models. Furthermore, the study includes comparative analyses to assess the influence of route segment range, input and output sequence lengths, and time granularities on prediction performance. This study helps air traffic managers predict flight situations more accurately and avoid implementing overly conservative or excessively aggressive flow management measures for the sectors.


1 Introduction

The aviation industry faces challenges of air traffic congestion and reduced flight operation efficiency [1]. These issues stem mainly from the demand for flights surpassing the capacity of available airspace and airport accommodation [2]. As an essential component of air traffic management (ATM), air traffic flow management (ATFM) is designed to achieve demand–capacity balancing (DCB) [3]. Conceptually, ATFM encompasses three distinct phases based on the time of implementation [4]: (1) strategic planning (a few months ahead) involving measures such as runway expansion [5] and shorter separation standards [6]; (2) pre-tactical planning (1 day ahead) that includes traffic flow and sector splitting [7]; and (3) tactical planning (on the day of implementation) that entails aircraft sequencing and re-sequencing during flight operations [8]. The traditional ATFM methods include ground delay programs [9, 10], airport surface management [11,12,13], flight rerouting [14, 15], flight scheduling [16, 17], and flight sequencing [18, 19].

Air traffic flow is an important indicator of smooth flight operation. Accurate traffic flow prediction can help identify air traffic operation bottlenecks and serves as the prerequisite and basis for effective ATFM [20]. Traditional air traffic flow prediction methods are the approaches used before the advent of modern data-driven and machine learning techniques. They rely on analytical and statistical approaches to forecast air traffic flow, such as time-series analysis, regression analysis, exponential smoothing, moving averages, seasonal decomposition, and historical averages [21,22,23,24]. Generally, traditional methods are valuable for predicting air traffic flow when historical data are limited and simpler models are preferred for their ease of implementation and interpretability. However, because they assume simplified linear relationships, traditional methods may be less capable of capturing the complex patterns and relationships present in large and dynamic datasets [25]. In contrast, data-driven and machine learning techniques have revolutionized air traffic flow prediction, allowing for more accurate and adaptive forecasts. These methods utilize historical data, real-time information, and various features to learn patterns and relationships in air traffic flow. Notably, deep learning neural networks, including convolutional neural networks (CNN), recurrent neural networks (RNN), and long short-term memory (LSTM) networks, are commonly used for air traffic flow prediction [26,27,28,29,30]. These models can capture complex spatiotemporal dependencies and have shown promising results in air traffic flow prediction.

Airspace is divided into sectors based on various factors, such as geographical location, traffic density, and complexity of the airspace. The sector capacity determines the number of aircraft that can be safely handled within a sector at a given time, depending on factors such as airspace configuration, available resources, and controller workload [31]. Airspace sector management plays a crucial role in flow management, especially during high-traffic periods or in congested areas. Clearly, air traffic flow prediction in sectors is crucial for managing airspace capacity, balancing traffic flows, ensuring safety, optimizing sector utilization, and enabling collaborative decision-making among stakeholders [32]. Accurate predictions can help authorities proactively manage air traffic, enhance operational efficiency, and maintain a safe and orderly flow of aircraft through the airspace. Figure 1 shows the schematic diagram of the air traffic flow management.

However, sector-based traffic flow prediction is a complex task that comes with several challenges. First, air traffic flow in sectors can be highly variable and is influenced by factors such as weather conditions, peak travel times, special events, and unforeseen incidents. Predicting the flow of traffic under such dynamic conditions requires robust models that can adapt to changing patterns. Then, air traffic flow exhibits nonlinear dependencies, where the relationships between input features and traffic flow can be complex and nonlinear. Traditional linear models may not capture these intricate dependencies effectively, necessitating the use of more sophisticated machine learning techniques. Additionally, there are temporal dependencies due to the sequential nature of flight operations. Capturing these dependencies accurately requires specialized modeling techniques. Last, the location and size of sector boundaries and the complicated and unique internal airway structure within the sector can also affect the flow of traffic. The traffic flow of one sector is also impacted by interactions with adjacent downstream and upstream sectors. Designing sector-specific prediction models that take into account the topological structure characteristics and traffic flow patterns of the sector and tailor predictions for individual sectors is an area that requires exploration. These challenges arise due to the dynamic and generally unpredictable nature of air traffic, as well as the need for accurate and timely forecasts several steps ahead to ensure safe and efficient airspace and air traffic flow management [33].

The utilization of deep learning technologies, specifically those rooted in graph neural networks, has found extensive applications across diverse domains. These include but are not limited to natural language processing [34, 35], computer vision [36, 37], recommendation systems [38, 39], graph analysis [40, 41], and traffic prediction [42, 43].

In both the road traffic and air traffic domains, graph convolutional network (GCN) methods combined with long short-term memory (LSTM) methods are applied to forecast traffic metrics such as flow, speed, and traffic complexity, because they can comprehensively model the spatiotemporal correlation features of the prediction target. Li et al. [44] integrated GCN and LSTM models to extract spatial–temporal traffic features and then implemented a soft attention mechanism to make the final road traffic flow prediction. He et al. [45] validated the effectiveness of GC-LSTM in capturing spatial and temporal characteristics, intra-station correlations, and exogenous factors for passenger flow forecasting in high-speed rail networks. Guo et al. [46] combined GCN and LSTM and built a Seq2Seq model to predict multi-step road traffic speed. Du et al. [47] employed a fusion of GCN and gated recurrent units (GRU) for forecasting traffic flow across multiple airports. Li et al. [48] combined graph convolutional modules with attention-based temporal convolutional modules to formulate a prediction model for airspace complexity.

Within the realm of sector traffic prediction, graph neural networks showcase substantial versatility and form the core of our investigation. In response to the aforementioned research challenges, we propose an attention-enhanced graph convolutional long short-term memory network (AGC-LSTM) model for short-term multi-step-ahead sector flow prediction. The model combines the attention mechanism with the graph convolution layer and captures the temporal dependencies of flight data using the LSTM layer.

The major contributions and highlights of this study can be summarized as follows:

(1) The topological structures of spatial route segments both within and beyond a sector are considered by the study. These structures are used as inputs to construct a graph representation, owing to the spatiotemporal correlations observed in traffic flow data between the adjacent route segments inside the sector, as well as those upstream and downstream of the focal sector. By utilizing the traffic data from those related route segments, a comprehensive representation and prediction of sector traffic flow are achieved. This approach effectively captures the influence of both traffic complexity and sector airspace structure complexity on sector-level traffic flow, consequently enhancing the accuracy of sector-level traffic flow predictions.
(2) The AGC-LSTM model integrates the attention-enhanced graph convolutional network and the long short-term memory network. The graph convolutional layer employs the multi-head attention mechanism to capture the multiple spatiotemporal dependencies of sector-based traffic flow, and the LSTM layer is applied to capture the temporal dependencies.
(3) We evaluate our approach using the typical sector traffic datasets. The proposed model can generate more accurate predictions on air traffic flows than the baseline models, which has the potential to help air traffic control officers (ATCOs) manage air traffic flow efficiently.

The rest of this paper is organized as follows. Section 2 reviews the related studies on air traffic prediction. Section 3 proposes the AGC-LSTM model for airspace sector-based traffic flow prediction. Section 4 introduces the flight data, generates the network graphs, and demonstrates the experimental results of the proposed model. Section 5 discusses the limitations and contributions of this study, as well as the scalability of the model for application to related fields. Finally, Sect. 6 provides a summary of the research conducted in this paper and outlines future research directions.

2 Related works

Air traffic flow prediction constitutes a crucial component of ATFM [49]. This section briefly reviews the related works from two aspects, i.e., airport-based traffic flow prediction and sector-based traffic flow prediction.

2.1 Airport-based traffic flow prediction

Airport-based traffic flow prediction is of significant importance for efficient air traffic management and airport operations [50]. Airports have a limited capacity to handle a certain number of flights and passengers within a given timeframe [51]. Airport flow prediction allows airport operators to allocate resources optimally such as runways, shuttles, gates, and taxiways [52]. Consequently, airport flow prediction directly impacts the passenger experience. By accurately estimating the flow of flights, airports can provide real-time information about expected wait times, gate changes, and potential disruptions. This allows passengers to plan their journeys better, manage their time, and navigate the airport more efficiently, reducing stress and enhancing the overall passenger experience [53, 54]. Additionally, many busy airports use slot management systems to schedule and allocate arrival and departure slots to airlines. Airport flow prediction plays a crucial role in determining the availability of slots and optimizing their allocation [55]. By accurately predicting the flow of flights, airport authorities can make informed decisions about slot assignments, reducing delays and maximizing the utilization of available slots [56]. Besides, air traffic control officers (ATCOs) rely on accurate arrival flow prediction to maintain safe separation between arriving aircraft [57]. By knowing the estimated arrival times and sequence of flights, controllers can plan and execute the necessary air traffic control instructions, including sequencing, spacing, and vectoring of aircraft for a safe and orderly flow of arrivals [58].

In recent years, there has been a surge in research focused on air traffic flow prediction at airports, with machine learning and deep learning models gaining popularity due to their superior prediction accuracy and learning capabilities [59]. Li and Wang [60] utilized the stacked autoencoder model, the long short-term memory (LSTM) model, and the gated recurrent unit (GRU) model to predict short-term traffic flow at capital airports. Similarly, LSTM-based air traffic flow prediction has been explored for Diyarbakır Airport [61]. Recognizing the impact of meteorological conditions, Yang et al. [62] proposed a combined LSTM and extreme gradient boosting method for predicting airport flight arrival flow. Zhu et al. [63] introduced a novel graph attention RNN model to forecast short-term airport throughput over a national air traffic network. Building on the strength of residual neural networks, GCN, and LSTM, Zang et al. [27] developed a deep learning architecture for predicting the spatiotemporal distribution of traffic flow at the airport network level. Considering the influence of the topological airport network, Yan et al. [64] introduced an airport traffic flow prediction network designed to capture the spatial–temporal dependencies of historical airport traffic flow (departure and arrival) for multi-step situational (network-level) arrival flow predictions. To model network-wide spatial dependencies among airports based on flight duration and flight schedule factors, a multi-view attention-based spatial–temporal network was presented [65]. Addressing heterogeneous and dynamic network dependencies, Yan et al. [66] proposed a novel large-range air traffic flow prediction model to improve airport arrival flow prediction.

2.2 Sector-based traffic flow prediction

Air traffic flow prediction at sectors is essential for efficient air traffic management and ensuring the safe and orderly flow of aircraft through specific airspace sectors. Each sector within the airspace has a limited capacity to handle a certain number of aircraft at a given time [67]. Therefore, accurately predicting the flow of air traffic in sectors can help prevent traffic congestion, flight delays, and airspace saturation, ensuring that aircraft can flow smoothly and safely through sectors. In addition, sector flow prediction can contribute to balancing the flow of air traffic among different sectors. By forecasting the expected demand and traffic volume in each sector, ATCOs can adjust the flow of aircraft [68], distribute the workload evenly [69], and detect potential aircraft conflicts [70, 71]. Besides, sector flow prediction can help authorities optimize the utilization of available airspace capacity by opening or closing certain routes or sectors, adjusting sector boundaries, or implementing flow management measures to accommodate the predicted traffic flows effectively [72].

Airspace operation complexity evaluation is significant for ensuring flight safety [73], optimizing airspace capacity [74], improving traffic flows [75], and supporting decision-making [76]. By evaluating and understanding complexity, authorities can implement measures and strategies that contribute to efficient and safe airspace operations. Accordingly, Shi-Garrier et al. [77] adopted a novel encoder–decoder LSTM neural network to predict ATC tasks based on the presented intrinsic complexity metric. Furthermore, a novel end-to-end learning framework was introduced by Xie et al. [78] to assess sector operation complexity. This approach employed a deep CNN to transform air traffic data into images, marking the first application of this technique for comprehensive complexity analysis. Subsequently, Sui et al. [79] extended the study by abstracting the multi-sector airspace scenario as an undirected graph. They then introduced a spatiotemporal GCN model to capture the correlations between changes in sector operational conditions over time and space. Xu et al. [80] proposed a Bayesian ensemble graph attention network for predicting stochastic traffic density near the terminal. Their model accounted for the intricate spatial–temporal variations in traffic patterns and considered the inter-dependencies within air traffic networks.

Currently, sector flow prediction has been carried out based on GCN [81], supervised learning [82], machine learning algorithms, RNN, and LSTM [83]. Moreover, researchers have explored various approaches to capture meaningful spatiotemporal correlations within high-dimensional feature space for traffic flow prediction. These methods include an end-to-end deep learning-based model [26], a three-dimensional CNN [84], and several machine learning models [85]. In the context of traffic flow coordination at major intersections, the flow-centric paradigm has been utilized to aid controllers in effectively managing intersecting traffic movements [86]. In line with this, Delahaye et al. [87] presented a transformer neural network model for flow prediction at coordination points. Additionally, studies have approached air traffic flow prediction from the perspective of air traffic flow networks. For instance, in enroute airspace, a dynamic network-based approach has been employed to achieve short-term air traffic flow prediction, characterizing the topological structure of airspace and the dynamics of air traffic flow [88]. Zhang et al. [89] proposed a hybrid model based on fuzzy c-means and GCN to capture the upstream and downstream dependencies within air traffic flow networks. Similarly, Cai et al. [90] introduced a temporal attention aware dual-graph convolution network for predicting air traffic flow, considering the airspace structure and routes of air traffic flows. Unlike these studies, this article applies the AGC-LSTM method, taking into account multiple spatial–temporal dependencies, including spatial adjacency and long-term and short-term temporal dependencies. The graph is constructed from route segments inside both the focal sector and its surrounding upstream and downstream sectors to capture the complex impact of internal traffic flow and airspace structure on the sector traffic flow.

3 Methodology

This section first introduces the method of spatiotemporal feature extraction and then presents a comprehensive explanation of the AGC-LSTM model.

3.1 Spatiotemporal feature representation and graph modeling

This study aims to predict the traffic flow within the sector for future time intervals. To achieve this, the task involves learning a mapping function that calculates the traffic flow $Y$ for the upcoming $Q$ time steps, based on the topological structure $G$ of the flight route segment network and the feature matrix $X$ of the preceding $P$ time steps. The schematic diagram of this process is depicted in Fig. 2, and the model function is expressed as follows:

$$\left[ {Y_{t + 1} , \cdots ,Y_{t + Q} } \right] = f\left( {G;\left( {X_{t - P + 1} , \cdots ,X_{t - 1} ,X_t } \right)} \right)$$

(1)

The spatial relationship between segments is transformed into an adjacency matrix $A^{n \times n}$ as shown in (2), where $n$ is the number of flight route segments, and the values in the matrix represent the connectivity between segments. The topological structure of the graph $G$ is established based on $A^{n \times n}$. A feature matrix $X$ is then constructed to describe the features of each node in $G$. Each row of the matrix represents a flight route segment, and each column represents a time interval. The vector $\left( {{\text{rs}}_{ij} ,{\text{ra}}_{ij} ,w_{ij} } \right)$ in each element of the matrix contains the features of flight route segment $i$ at time interval $j$, where ${\text{rs}}_{ij}$ and ${\text{ra}}_{ij}$ represent the scheduled and actual traffic flows of the route segment, respectively, and $w_{ij}$ represents the weekly periodicity of the flight schedules. Ultimately, a comprehensive input feature matrix $X^{n \times t \times 3}$ is obtained as shown in (3), where $t$ is the number of time slots, and 3 is the number of input features in our model. The output target variable vector $Y^{1 \times (t - P)}$ is shown in (4), where $P$ is the input sequence length.

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$

(2)

$$X = \begin{bmatrix} \left( {\text{rs}}_{11} ,{\text{ra}}_{11} ,w_{11} \right) & \left( {\text{rs}}_{12} ,{\text{ra}}_{12} ,w_{12} \right) & \cdots & \left( {\text{rs}}_{1t} ,{\text{ra}}_{1t} ,w_{1t} \right) \\ \left( {\text{rs}}_{21} ,{\text{ra}}_{21} ,w_{21} \right) & \left( {\text{rs}}_{22} ,{\text{ra}}_{22} ,w_{22} \right) & \cdots & \left( {\text{rs}}_{2t} ,{\text{ra}}_{2t} ,w_{2t} \right) \\ \vdots & \vdots & \ddots & \vdots \\ \left( {\text{rs}}_{n1} ,{\text{ra}}_{n1} ,w_{n1} \right) & \left( {\text{rs}}_{n2} ,{\text{ra}}_{n2} ,w_{n2} \right) & \cdots & \left( {\text{rs}}_{nt} ,{\text{ra}}_{nt} ,w_{nt} \right) \end{bmatrix}$$

(3)

$$Y = \begin{bmatrix} {\text{sr}}_{1,P + 1} & {\text{sr}}_{1,P + 2} & \cdots & {\text{sr}}_{1,t} \end{bmatrix}$$

(4)
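As a minimal sketch of how Eqs. (1) to (4) could be turned into training samples, the NumPy code below builds an adjacency matrix from a list of connected segment pairs and slices a feature tensor of shape $n \times t \times 3$ into input windows of length $P$ and target windows of length $Q$. The helper names `build_adjacency` and `make_windows`, the symmetric 0/1 encoding of connectivity, the edge list, and the random toy data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def build_adjacency(n, edges):
    """Return a symmetric 0/1 adjacency matrix for n route segments.

    Assumption: connectivity is undirected and unweighted.
    """
    A = np.zeros((n, n), dtype=float)
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0
    return A

def make_windows(X, y, P, Q):
    """Pair the P most recent steps of X (n, t, 3) with the next Q values of y (t,)."""
    t = X.shape[1]
    inputs, targets = [], []
    for end in range(P, t - Q + 1):
        inputs.append(X[:, end - P:end, :])   # X_{t-P+1}, ..., X_t
        targets.append(y[end:end + Q])        # Y_{t+1}, ..., Y_{t+Q}
    return np.stack(inputs), np.stack(targets)

# Toy data (hypothetical): 4 route segments in a chain, 20 time slots.
rng = np.random.default_rng(0)
n, t, P, Q = 4, 20, 6, 2
A = build_adjacency(n, [(0, 1), (1, 2), (2, 3)])
X = rng.random((n, t, 3))   # (scheduled flow, actual flow, weekly periodicity)
y = rng.random(t)           # sector-level flow per time slot
X_win, Y_win = make_windows(X, y, P, Q)
print(A.shape, X_win.shape, Y_win.shape)
```

With these toy sizes, each sample holds 6 input steps for all 4 segments and 2 target values; changing `P` and `Q` corresponds to varying the input and output sequence lengths.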

3.2 AGC-LSTM prediction model

To better explore spatiotemporal dependencies of sector traffic flows, we propose the AGC-LSTM model. Figure 3 illustrates the architecture of the AGC-LSTM model, comprising five layers, namely, the input layer, the multi-head attention-based graph convolutional layer, the dropout layer, the LSTM layer, and the output layer.

The input layer receives the feature dataset X and the target label dataset Y. The subsequent graph convolutional layer captures spatial dependencies within the data, enhancing feature extraction. It employs a multi-head attention mechanism to capture diverse relationships and feature importance in the graph structure. After the graph convolutional layer, a dropout layer randomly discards some outputs, so that the network learns from different subsets of features and becomes more robust. The resulting sequence is input to the LSTM layer, which captures temporal dependencies and extracts temporal features. High-level features from the previous layers are propagated through fully connected layers to predict sector traffic flow. This approach enables the AGC-LSTM model both to explore spatiotemporal dependencies and to predict traffic flow accurately.

3.2.1 Graph convolutional layer with a multi-head attention mechanism

In recent years, GCNs have been introduced to handle graph-structured data. They capture spatial features between vertices by constructing filters in the Fourier domain, which act on vertices and their first-order neighbors. The multi-head attention mechanism is an extension of the attention mechanisms used for sequential data [91]. It expands the single attention head into multiple parallel attention heads, thereby enhancing the expressive and modeling capabilities of the model. The operation of the entire multi-head attention graph convolutional layer can be represented by the following equations:

$$H_1 = {\text{Concatenate}}\left[ {\sigma \left( {D^{ - \frac{1}{2}} A_1 D^{ - \frac{1}{2}} X W_1 } \right), \sigma \left( {D^{ - \frac{1}{2}} A_2 D^{ - \frac{1}{2}} X W_2 } \right), \cdots ,\sigma \left( {D^{ - \frac{1}{2}} A_K D^{ - \frac{1}{2}} X W_K } \right)} \right]$$

(5)

$$A_K = {\text{softmax}}(e_K )$$

(6)

$$e_{ij} = {\text{LeakyReLU}}\left( {a_K^T [W_K X_i ||W_K X_j ]} \right)$$

(7)

where $H_1$ is the output of the multi-head attention graph convolution layer, and $K$ is the number of attention heads in the multi-head attention mechanism; each head has its own attention weight matrix and linear transformation matrix. In Eqs. (6) and (7), the subscript $K$ indexes an individual head: $A_K$ denotes the attention weight matrix of the $K$th head, $W_K$ represents its linear transformation matrix, and $a_K$ denotes its parameter vector. $D$ is the node degree matrix, $X$ is the feature matrix, and $\sigma ( \cdot )$ denotes the activation function. $e_K$ collects the unnormalized attention scores between all pairs of nodes in the $K$th attention head, which are normalized using the ${\text{softmax}}$ function so that the attention weights of each node sum to 1. $e_{ij}$ is the entry of $e_K$ for node $i$ and node $j$. $X_i$ and $X_j$ represent the feature vectors of node $i$ and node $j$, respectively, $||$ represents the concatenation operation of vectors, and ${\text{LeakyReLU}}$ represents the leaky rectified linear activation function.
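The following NumPy sketch illustrates one possible reading of Eqs. (5) to (7) for a single time step. Several details are assumptions that the text does not specify: attention is restricted to graph neighbours plus self-loops (as in graph attention networks), $D$ is computed from the binary adjacency matrix with self-loops, $\sigma$ is taken to be ReLU, the LeakyReLU slope is 0.2, and features are stored as rows, so $W_K X_i$ becomes `X @ W`. The function names and toy sizes are hypothetical.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def attention_head(X, adj, W, a):
    """One attention head: scores (Eq. 7), softmax (Eq. 6), propagation (one term of Eq. 5)."""
    H = X @ W                                    # transformed node features, (n, F_out)
    f_out = H.shape[1]
    # a^T [h_i || h_j] splits into a term for node i plus a term for node j.
    src = H @ a[:f_out]
    dst = H @ a[f_out:]
    e = leaky_relu(src[:, None] + dst[None, :])  # (n, n) raw scores
    # Assumption: only neighbours and the node itself receive attention.
    mask = (adj + np.eye(len(adj))) > 0
    e = np.where(mask, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)         # numerically stable softmax
    w = np.exp(e)
    A_k = w / w.sum(axis=1, keepdims=True)       # each row sums to 1
    deg = mask.sum(axis=1).astype(float)
    D_inv_sqrt = np.diag(deg ** -0.5)
    out = np.maximum(D_inv_sqrt @ A_k @ D_inv_sqrt @ H, 0.0)  # ReLU assumed for sigma
    return out, A_k

def multi_head_gcn(X, adj, Ws, a_vecs):
    """Concatenate the outputs of all heads along the feature axis (Eq. 5)."""
    outs = [attention_head(X, adj, W, a)[0] for W, a in zip(Ws, a_vecs)]
    return np.concatenate(outs, axis=1)

# Toy example (hypothetical sizes): 4 segments in a chain, 3 features, 2 heads of width 5.
rng = np.random.default_rng(1)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
X = rng.random((4, 3))
Ws = [rng.normal(size=(3, 5)) for _ in range(2)]
a_vecs = [rng.normal(size=10) for _ in range(2)]
H1 = multi_head_gcn(X, adj, Ws, a_vecs)
_, A_0 = attention_head(X, adj, Ws[0], a_vecs[0])
print(H1.shape)
```

Concatenating the heads widens the feature dimension from 5 to 10 here; averaging the heads would be an alternative design, but Eq. (5) specifies concatenation.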

3.2.2 Dropout layer

In the context of deep learning, dropout has been demonstrated to be effective in preventing overfitting in deep neural networks [92]. It is a commonly used regularization technique that randomly sets a portion of neuron outputs to zero during the training process of a neural network. The dropout operation can be seen as an ensemble learning technique that enhances the robustness of the network to small perturbations in the input. In the constructed model, the formula for dropout can be expressed as follows:

$$H_2 = M \odot H_1$$

(8)

where $H_1$ represents the input vector of the dropout layer, and $M$ is a binary mask vector with the same shape as $H_1$, indicating which neurons should be dropped. $\odot$ denotes element-wise multiplication, and $H_2$ represents the output vector after dropout.
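A minimal sketch of Eq. (8) in NumPy is given below. The function name `dropout`, the `rate` and `rescale` parameters, and the toy input are assumptions for illustration. By default the code applies the plain mask of Eq. (8); many deep learning frameworks additionally divide the kept values by $1 - \text{rate}$ during training ("inverted dropout"), which is exposed here as an option rather than claimed to be what the paper does.

```python
import numpy as np

def dropout(H1, rate, rng, training=True, rescale=False):
    """Apply H2 = M * H1 with a random binary mask M (Eq. 8).

    rate is the probability of dropping each entry. At inference time
    (training=False) the input is returned unchanged.
    """
    if not training or rate == 0.0:
        return H1
    M = (rng.random(H1.shape) >= rate).astype(H1.dtype)  # 1 = keep, 0 = drop
    H2 = M * H1
    if rescale:
        H2 = H2 / (1.0 - rate)  # optional inverted-dropout scaling
    return H2

rng = np.random.default_rng(2)
H1 = np.ones((4, 10))
H2 = dropout(H1, rate=0.3, rng=rng)
H2_eval = dropout(H1, rate=0.3, rng=rng, training=False)
print(H2.shape, float(H2.mean()))
```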

3.2.3 LSTM layer

By leveraging the aforementioned steps, we effectively extract spatial structural features from the data. Subsequently, the LSTM layer is employed to capture the temporal dependencies. Due to the limitations of traditional RNNs, such as the vanishing gradient and exploding gradient problems, the LSTM model [93] was proposed as a variant that can address these issues. As illustrated in Fig. 4, the core idea of LSTM is to control the flow of information through gate units, including the forget gate, input gate, and output gate.

These gates use learnable parameters to selectively retain and discard information, enabling important information to be captured and propagated along the sequence. The formulas below describe the implementation of the LSTM layer. $H_2$ is fed in sequentially, one time step at a time. The input gate determines the information that needs to be updated. It processes the current input and the previous hidden state using the ${\text{Sigmoid}}$ activation function to obtain a value between 0 and 1, representing the importance of each input. Equation (9) defines $i_t$ as the activation value of the input gate, which controls the importance of the current input; $b_i$ is the bias vector, and $x_t$ represents the input at the current time step. Similarly, the forget gate uses the ${\text{Sigmoid}}$ activation function to determine how much of the previous cell state is retained. Equation (10) defines $f_t$ as the activation value of the forget gate. Equation (11) defines $\tilde{c}_t$ as the candidate value for the update, generated by the hyperbolic tangent (${\text{tanh}}$) activation function. In Eq. (12), $c_t$ represents the cell state, responsible for transmitting and storing information, which is updated at each time step by combining the retained part of $c_{t-1}$ with the gated candidate. In Eqs. (13) and (14), $o_t$ represents the activation value of the output gate, and $h_t$ represents the current hidden state, which is the output of the LSTM model. $W$ and $b$ denote the weight matrices and bias vectors of the respective gates.

$$i_t = \sigma (W_{{\text{xi}}} \cdot x_t + W_{{\text{hi}}} \cdot h_{t - 1} + b_i )$$

(9)

$$f_t = \sigma (W_{{\text{xf}}} \cdot x_t + W_{{\text{hf}}} \cdot h_{t - 1} + b_f )$$

(10)

$$\tilde{c}_t = {\text{tanh}}(W_{{\text{xc}}} \cdot x_t + W_{{\text{hc}}} \cdot h_{t - 1} + b_c )$$

(11)

$$c_t = f_t \odot c_{t - 1} + i_t \odot \tilde{c}_t$$

(12)

$$o_t = \sigma (W_{{\text{xo}}} \cdot x_t + W_{{\text{ho}}} \cdot h_{t - 1} + b_o )$$

(13)

$$h_t = o_t \odot {\text{tanh}}(c_t )$$

(14)

By incorporating the gate mechanism in the LSTM layer, selective retention and forgetting of information can be achieved. This design addresses the challenge of capturing long-term dependencies in traditional RNN and enables better extraction of temporal features from air traffic data.
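To make Eqs. (9) to (14) concrete, the sketch below implements a single LSTM time step in NumPy and runs it over a short sequence. The parameter dictionary, the small random initialization, and the dimensions (input size 10, hidden size 8, 6 time steps) are assumptions chosen only for illustration; products between gate activations and states are taken element-wise.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following Eqs. (9)-(14)."""
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["b_i"])      # input gate, Eq. (9)
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["b_f"])      # forget gate, Eq. (10)
    c_tilde = np.tanh(p["W_xc"] @ x_t + p["W_hc"] @ h_prev + p["b_c"])  # candidate, Eq. (11)
    c_t = f_t * c_prev + i_t * c_tilde                                  # cell state, Eq. (12)
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["b_o"])      # output gate, Eq. (13)
    h_t = o_t * np.tanh(c_t)                                            # hidden state, Eq. (14)
    return h_t, c_t

def init_params(input_dim, hidden_dim, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    p = {}
    for gate in "ifco":
        p[f"W_x{gate}"] = 0.1 * rng.normal(size=(hidden_dim, input_dim))
        p[f"W_h{gate}"] = 0.1 * rng.normal(size=(hidden_dim, hidden_dim))
        p[f"b_{gate}"] = np.zeros(hidden_dim)
    return p

rng = np.random.default_rng(3)
input_dim, hidden_dim, steps = 10, 8, 6
params = init_params(input_dim, hidden_dim, rng)
sequence = rng.random((steps, input_dim))  # stands in for the dropout output over time
h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)
```

The same weights are reused at every time step; only `h` and `c` change, which is how the layer carries information across the input sequence.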

3.2.4 Loss function

Within the proposed model, we use the mean squared error of the regression model with an added L2 regularization term as the loss function. The formula for the L2-regularized loss is as follows:

$${\text{loss}} = \frac{{\sum^(\hat{y}_i - y_i )^2 }}{n} + \lambda L_{{\text{reg}}}$$

(15)

where $\hat{y}_i$ and $y_i$ represent the actual value and predicted value, respectively. $\lambda$ denotes the L2 regularization coefficient, and $L_{{\text{reg}}}$ represents the trainable weights of the regression model.
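As a hedged illustration of Eq. (15), the sketch below computes the mean squared error plus an L2 penalty. It assumes that $L_{{\text{reg}}}$ is the sum of squared trainable weights, which is the usual form of L2 regularization but is not spelled out in the text; the function name and the sample numbers are hypothetical.

```python
def regularized_mse(y_pred, y_true, weights, lam):
    """Mean squared error plus lam times an L2 penalty, as in Eq. (15).

    Assumption: the penalty is the sum of squared trainable weights.
    """
    n = len(y_true)
    mse = sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n
    l2_penalty = sum(w * w for w in weights)
    return mse + lam * l2_penalty

# Hypothetical values: two predictions, two weights, lambda = 0.01.
loss = regularized_mse([10.0, 12.0], [9.0, 14.0], [0.5, -1.0], 0.01)
```

Here the squared errors are 1 and 4 (mean 2.5) and the penalty is 0.25 + 1.0 = 1.25, so the loss is 2.5 + 0.01 × 1.25 = 2.5125.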

3.2.5 Evaluation metric

Four metrics were utilized to evaluate the predictive performance of AGC-LSTM: mean absolute error (MAE), root-mean-square error (RMSE), symmetric mean absolute percentage error (SMAPE), and coefficient of determination (R²). The formulas are as follows:

$${\text{MAE}} = \frac{1}{n}\sum_{i = 1}^n |\hat{y}_i - y_i |$$

(16)

$${\text{RMSE}} = \sqrt {\frac{1}{n}\sum_{i = 1}^n (\hat{y}_i - y_i )^2 }$$

(17)

$${\text{SMAPE}} = \frac{1}{n}\sum_{i = 1}^n \frac{|\hat{y}_i - y_i |}{(|\hat{y}_i | + |y_i |)/2} \times 100$$

(18)

$$R^2 = 1 - \frac{\sum_{i = 1}^n (\hat{y}_i - y_i )^2 }{\sum_{i = 1}^n (y_i - \overline{y})^2 }$$

(19)

where $\hat{y}_i$ represents the predicted value, $y_i$ represents the true value, $\overline{y}$ denotes the mean of the true values, and $n$ represents the number of samples. MAE and RMSE quantify the differences between the predicted and true values, SMAPE measures the magnitude of relative errors, and R² evaluates the extent to which the predictive model explains the total variability.
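The four metrics in Eqs. (16)–(19) can be sketched in plain Python as follows. The function names and sample values are hypothetical. One assumption is added: when both the predicted and the true value of a slot are zero, Eq. (18) is undefined, and this sketch counts that slot as zero error.

```python
import math

def mae(y_pred, y_true):
    """Mean absolute error, Eq. (16)."""
    return sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)

def rmse(y_pred, y_true):
    """Root-mean-square error, Eq. (17)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true))

def smape(y_pred, y_true):
    """Symmetric mean absolute percentage error in percent, Eq. (18)."""
    total = 0.0
    for p, t in zip(y_pred, y_true):
        denom = (abs(p) + abs(t)) / 2
        # Assumption: a slot with p == t == 0 contributes zero error.
        total += abs(p - t) / denom if denom else 0.0
    return 100 * total / len(y_true)

def r2(y_pred, y_true):
    """Coefficient of determination, Eq. (19)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical flows for four time slots.
y_true = [10, 12, 8, 14]
y_pred = [9, 12, 10, 13]
scores = (mae(y_pred, y_true), rmse(y_pred, y_true),
          smape(y_pred, y_true), r2(y_pred, y_true))
```

For these sample values the absolute errors are 1, 0, 2, and 1, giving an MAE of 1.0 and an R² of 0.7 (residual sum of squares 6 against a total sum of squares of 20).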

4 Experiments and results

Following the introduction of the AGC-LSTM model for sector-based traffic flow prediction, this section first introduces the experimental datasets, selecting two key sectors in the central–southern region of China for a case study. Subsequently, graph structures are constructed, taking into account the actual internal routes and traffic flow patterns within these selected sectors. Then, to verify the feasibility and validity of the model, AGC-LSTM is compared with five baseline models to evaluate single-step-ahead prediction performance at a time granularity of 15 min. Additionally, comparative experiments are conducted to evaluate the prediction results at statistical time granularities of 15, 30, 45, and 60 min. Furthermore, the prediction results for different look-ahead time steps are compared.

4.1 Datasets

The Automatic Dependent Surveillance-Broadcast (ADS-B) data are adopted to extract the spatiotemporal features of air traffic flows. ADS-B is based on the Global Navigation Satellite System (GNSS) and can provide comprehensive datasets including flight ID, timestamp, latitude, longitude, altitude, aircraft type, etc. Detailed information about ADS-B can be found in [94, 95]. The two selected sectors, namely, ZGGGAR11 and ZGGGAR22, are situated in the central–southern region of China. They act as crucial junctions connecting numerous airports, including those in the Greater Bay Area and the southwest region of China. In comparison with other airspace sectors, ZGGGAR11 and ZGGGAR22 accommodate a higher volume of flights and play a more significant role in air traffic management. Figure 5 provides a detailed spatial depiction of these two sectors. Both sectors share identical horizontal extents, but they differ in their vertical ranges (ZGGGAR11: 9200–12,500 m; ZGGGAR22: 6000–9200 m).

This study utilizes ADS-B data from March 2019, obtained from Variflight (http://www.variflight.com). The dataset comprises 72,512 records from a total of 34,167 flights, encompassing 20 flight route segments within the two airspace sectors (ZGGGAR11 and ZGGGAR22). For each segment, both the scheduled and actual traffic flows were determined, using the scheduled arrival time and the ADS-B arrival time, respectively. However, when the overflying times of the waypoints were calculated from the original ADS-B data, certain abnormal data arose due to inconsistencies in the adjustments of entry/exit times for each segment and sector. To address this issue, a set of criteria was applied to clean the flight data and ensure its reliability and accuracy. These selection criteria required the flying time of each route segment to be between 2 and 6 min; applying them yielded a dataset of valid flight route segment data comprising 70,276 records from 33,942 flights. Subsequently, the entry/exit times for each flight route segment were examined to calculate the air traffic flow for the 32 one-way flight route segments present in sectors ZGGGAR11/ZGGGAR22. These calculations were performed for specific time intervals, with each route segment corresponding to a node in the subsequent graph network to be constructed.
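The cleaning and counting steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the record layout (`segment`, `entry_min`, `exit_min`), the inclusive treatment of the 2 and 6 min bounds, and the choice to assign a flight to the time slot of its segment entry are all hypothetical.

```python
from collections import Counter

def clean_records(records, min_minutes=2.0, max_minutes=6.0):
    """Keep records whose segment flying time lies between 2 and 6 min.

    Assumption: both bounds are inclusive.
    """
    return [
        r for r in records
        if min_minutes <= r["exit_min"] - r["entry_min"] <= max_minutes
    ]

def segment_flow(records, bin_minutes=15):
    """Count flights per (segment, time slot).

    Assumption: a flight is counted in the slot containing its entry time.
    """
    flow = Counter()
    for r in records:
        slot = int(r["entry_min"] // bin_minutes)
        flow[(r["segment"], slot)] += 1
    return flow

# Hypothetical records; times are minutes after midnight.
records = [
    {"flight": "F1", "segment": "VQ-MAMSI", "entry_min": 1.0, "exit_min": 5.0},
    {"flight": "F2", "segment": "VQ-MAMSI", "entry_min": 10.0, "exit_min": 11.0},
    {"flight": "F3", "segment": "VQ-MAMSI", "entry_min": 14.0, "exit_min": 17.5},
    {"flight": "F4", "segment": "MAMSI-ENKUS", "entry_min": 16.0, "exit_min": 25.0},
    {"flight": "F5", "segment": "MAMSI-ENKUS", "entry_min": 20.0, "exit_min": 24.0},
]
valid = clean_records(records)
flow = segment_flow(valid)
```

In this toy input, F2 (1 min) and F4 (9 min) are discarded, leaving two flights on VQ-MAMSI in slot 0 and one flight on MAMSI-ENKUS in slot 1.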

4.2 Graph generation

This section presents the principle of graph generation through a series of transformations, as depicted in Fig. 6. Specifically, Fig. 6a illustrates the flight route segment network within the sectors ZGGGAR11/ZGGGAR22, comprising 32 flight route segments listed on the right side of the figure. To represent its topological structure, this network is converted into a directed graph denoted as $G=(V,E)$, where $V$ denotes the nodes (flight route segments) and $E$ denotes the edges (waypoints). For further clarity, Fig. 6b depicts the correspondence between the route segment structure of the main air route ONEMI-VQ-MAMSI-ENKUS within the sector and $G$. Similarly, Fig. 6c shows the correspondence between the route segment structure of NODOG-QP-MAMSI-SJG-AKNAV-ELKAL and $G$.
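A minimal sketch of this transformation is given below, using the two routes named in the text. Each one-way route segment becomes a node, and a directed edge links two consecutive segments that share a waypoint. Restricting edges to consecutive segments of the same listed route is an assumption made here for simplicity; the actual adjacency in Fig. 6 may differ, and `build_segment_graph` is a hypothetical helper.

```python
def build_segment_graph(routes):
    """Turn waypoint routes into a directed graph whose nodes are segments.

    Returns the node list (segments as waypoint pairs) and a 0/1 adjacency
    matrix where adjacency[i][j] == 1 means segment i feeds into segment j.
    """
    nodes = []
    edges = set()
    for route in routes:
        segments = list(zip(route[:-1], route[1:]))
        for seg in segments:
            if seg not in nodes:
                nodes.append(seg)
        # Consecutive segments meet at a shared waypoint, which forms the edge.
        for a, b in zip(segments[:-1], segments[1:]):
            edges.add((nodes.index(a), nodes.index(b)))
    n = len(nodes)
    adjacency = [[0] * n for _ in range(n)]
    for i, j in edges:
        adjacency[i][j] = 1
    return nodes, adjacency

routes = [
    ["ONEMI", "VQ", "MAMSI", "ENKUS"],
    ["NODOG", "QP", "MAMSI", "SJG", "AKNAV", "ELKAL"],
]
nodes, adjacency = build_segment_graph(routes)
```

The two routes yield 3 + 5 = 8 segment nodes and 2 + 4 = 6 directed edges; for example, the node (VQ, MAMSI) points to the node (MAMSI, ENKUS) through the waypoint MAMSI.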

4.3 Experimental settings

In the experiment, 70% of the data was used as the training set, while the remaining 30% was allocated to the test set. These datasets were utilized to predict the sector traffic flow for multiple subsequent time steps. The AGC-LSTM model is trained with a learning rate of 0.001 for a total of 300 epochs, using a batch size of 64. The number of attention heads in the graph convolutional layer is set to 2. Given the significant impact of hyperparameters on predictive accuracy, we conducted hyperparameter tuning to enhance model performance. The tuned hyperparameters were the number of hidden neurons in the GCN layer, the number of hidden neurons in the LSTM layer, and the dropout ratio. Figure 7 illustrates the prediction error results for different values of these hyperparameters. In particular, Fig. 7a and b depict the application of binary search to identify the optimal number of hidden neurons in the GCN and LSTM layers, respectively. Meanwhile, Fig. 7c illustrates the prediction error for each dropout ratio value. The resulting hyperparameters are presented in Table 1.
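The stated settings can be collected in a small configuration, together with a sketch of the 70/30 split. The dictionary keys and the helper name are hypothetical, and splitting in chronological order without shuffling is an assumption, since the text does not say how the split was drawn. The example series of 31 days × 96 slots is likewise illustrative rather than the actual sample count.

```python
# Values taken from the text; the key names are hypothetical.
TRAINING_CONFIG = {
    "learning_rate": 0.001,
    "epochs": 300,
    "batch_size": 64,
    "attention_heads": 2,
    "train_ratio": 0.7,
}

def chronological_split(samples, train_ratio):
    """Split an ordered sequence into training and test parts without shuffling."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]

# Illustrative series: one month of 15-min slots (31 days x 96 slots per day).
slots = list(range(31 * 96))
train, test = chronological_split(slots, TRAINING_CONFIG["train_ratio"])
```

Keeping the test period strictly after the training period avoids leaking future observations into training, which is a common precaution in time-series forecasting.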

Table 1 Hyperparameter values obtained through hyperparameter tuning


To gain a comprehensive understanding of the AGC-LSTM model, its algorithmic complexity is analyzed. Let $\left\| A \right\|_0$ represent the number of nonzero elements in the adjacency matrix, $N$ the number of nodes, $F$ the feature dimensionality, and $L$ the number of GCN layers. Within each GCN layer, forward propagation involves feature propagation and aggregation operations, while backward propagation requires gradient computation and parameter updates, resulting in an overall complexity of $O(L\left\| A \right\|_0 F + LNF^2 )$. When a multi-head attention mechanism with $K$ attention heads is incorporated into the GCN, the overall complexity becomes $O\left( LK(\left\| A \right\|_0 F + NF^2 ) \right)$. For the LSTM, with an input sequence length of $T$ and a hidden state dimensionality of $h$, the complexity is $O(4Th^2 + 4TFh)$. Therefore, the total complexity of the AGC-LSTM model is $O\left( LK\left( \left\| A \right\|_0 F + NF^2 \right) + \left( 4Th^2 + 4TFh \right) \right)$.
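To give a feel for the relative size of the two terms in the total complexity expression, the sketch below evaluates them for one set of values. Only $N = 32$ and $K = 2$ come from the text; the number of layers, the number of nonzero adjacency entries, the feature and hidden dimensions, and the sequence length are hypothetical placeholders, and the function name is introduced here for illustration.

```python
def agc_lstm_cost(L, K, nnz_A, N, F, T, h):
    """Evaluate the two terms of the total complexity expression.

    Returns (attention-GCN term, LSTM term) as operation counts.
    """
    gcn_attention = L * K * (nnz_A * F + N * F ** 2)
    lstm = 4 * T * h ** 2 + 4 * T * F * h
    return gcn_attention, lstm

# N = 32 nodes and K = 2 heads follow the text; all other values are assumed.
gcn_term, lstm_term = agc_lstm_cost(L=2, K=2, nnz_A=64, N=32, F=16, T=8, h=32)
```

With these placeholder values the attention-GCN term is 2 × 2 × (64 × 16 + 32 × 16²) = 36,864 and the LSTM term is 4 × 8 × 32² + 4 × 8 × 16 × 32 = 49,152, so neither part dominates for a graph of this size.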

To evaluate the performance of the AGC-LSTM model, it was compared with five baseline models: (1) Historical average (HA) is a forecasting method that predicts future values by taking the average of past observations. (2) Autoregressive integrated moving average (ARIMA) is a classical statistical time-series prediction technique. (3) Support vector regression (SVR) is a regression method that utilizes a linear support vector machine for series prediction. (4) LSTM is a specialized variant of the recurrent neural network (RNN) commonly used for sequential prediction tasks. (5) GCN-LSTM is a model designed to leverage the advantages of both graph-based and sequential data modeling.

The proposed AGC-LSTM is compared with five baseline models to evaluate the prediction performance at a time granularity of 15 min for the single-step ahead. Additionally, comparative experiments were conducted to evaluate the prediction results at different statistical time granularities for a single time interval of 15, 30, 45, and 60 min. Furthermore, we conducted experiments to compare the results of multi-step ahead predictions. All the experiments are conducted on a machine with NVIDIA GeForce GTX1050 (2 GB memory), i7-7700HQ CPU (2.80 GHz), and 8 GB of RAM.

4.4 Experimental results

4.4.1 Single-step ahead prediction results compared to baseline models

The training time of the AGC-LSTM model is 0.91 s per epoch, and the iteration number is 8700. When applied to the test dataset, the model has an average runtime of 0.39 s. AGC-LSTM is compared with five baseline models for single-step-ahead prediction of the sector-based traffic flow. Table 2 provides a detailed comparison of model performance for each prediction method under different input sequence lengths. The “Input_seq_length” represents the input sequence length, and the best values under different feature sequence lengths for each model are highlighted in boldface. The results indicate that the AGC-LSTM model demonstrates improved predictive performance across multiple metrics, including MAE, RMSE, SMAPE, and R². Compared with GCN-LSTM, the best-performing of the five baseline models, the AGC-LSTM model reduces the MAE by 14.4%, the RMSE by 14.1%, and the SMAPE by 10.2%, and increases R² by 9.96%. These results indicate that the AGC-LSTM model predicts more accurately. This is primarily because, while the GCN-LSTM model only extracts spatiotemporal features, the AGC-LSTM model incorporates the multi-head attention mechanism in the graph convolutional layer to fully exploit the spatial correlations in the data, and it focuses on critical node information within the flight route segment network to better learn the topological structure of the entire graph network. Additionally, as sector traffic flow fluctuates significantly over short periods, the AGC-LSTM model mitigates the impact of these perturbations by incorporating a dropout layer, thereby enhancing the robustness of the predictions.

Table 2 Performance comparison of the six prediction methods for single-step ahead prediction with features of 15-min time granularity


It can also be observed that the prediction accuracy of the model initially improves and then deteriorates as the input sequence length increases. A possible explanation is that, as the data dimension first increases, the quantity of input features related to the predicted results gradually increases, leading to optimal prediction accuracy. However, as the input data dimension continues to increase, the data become sparser, making it more difficult for the model to capture effective patterns and trends and thereby degrading its performance.

To assess the impact of the spatial range of route segments on the prediction results, Table 2 also includes the results of using the extended segment network shown in Fig. 8 to construct the spatial network structure of the AGC-LSTM model. The results indicate that the AGC-LSTM model using the expanded spatial network reduces the MAE by 1.1%, the RMSE by 1.2%, and the SMAPE by 6.7% compared with the AGC-LSTM model using a network constructed only from the route segments in the focused sector.

The features have a time granularity of 15 min and vary in input sequence length. Figure 9a presents the real sector-based traffic flow and the flows predicted by AGC-LSTM and GCN-LSTM. Figure 9b shows the absolute prediction errors for each time slot of the whole day. For the forecasted time slots 1–20 (0–5 h), where the actual values are relatively low, the AGC-LSTM model exhibits errors within 2, with most errors being within 1. In contrast, the errors of the GCN-LSTM model mostly exceed 2 for the same time slots, with the highest error reaching 6. Ordinarily, 5–10% of sector capacity is reserved to take care of all “non-adherence issues” caused by the low traffic predictability in the pre-tactical and tactical stages [96]. Because the maximum traffic flow of sectors in the central–southern region of China ordinarily does not exceed 21, an improvement of 1–2 flights in forecasting accuracy has a significant impact. Furthermore, for the subsequent time slots beyond 20 (5–24 h), the AGC-LSTM model achieves a closer fit, with smaller absolute prediction errors for the majority of the time slots.

4.4.2 Single-step ahead prediction results under different time granularities

To analyze the impact of different time granularities of the input and output features on the prediction performance, we extracted features at time granularities of 15 min, 30 min, 45 min, and 60 min to predict the traffic flow of the next time step. The experimental results are presented in Table 3.
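One simple way to obtain the coarser granularities is to sum consecutive 15-min counts, as sketched below. This assumes that flow is an additive count of flights per slot and that trailing slots which do not fill a complete coarse interval are dropped; the paper may instead have recomputed the flows directly from the raw records. The function name and sample counts are hypothetical.

```python
def aggregate_flow(counts_15min, factor):
    """Sum consecutive 15-min counts into slots of 15 * factor minutes.

    Assumption: incomplete trailing groups are discarded.
    """
    usable = len(counts_15min) - len(counts_15min) % factor
    return [sum(counts_15min[i:i + factor]) for i in range(0, usable, factor)]

# Hypothetical 15-min flows covering two hours.
counts = [3, 5, 4, 6, 2, 1, 0, 7]
flow_30 = aggregate_flow(counts, 2)  # 30-min granularity
flow_45 = aggregate_flow(counts, 3)  # 45-min granularity
flow_60 = aggregate_flow(counts, 4)  # 60-min granularity
```

Note that each step up in granularity shortens the series (eight 15-min slots become four 30-min slots or two 60-min slots), so fewer training samples are available at coarser granularities.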

Table 3 Performance comparison of prediction under different time granularities for single-step ahead prediction with AGC-LSTM model


It can be seen that when the time granularity increases, the model’s prediction performance decreases according to MAE and RMSE, while the SMAPE values tend to decrease, and R² continues to increase. The decrease in prediction performance may be attributed to the reduction in the number of input time series due to the increase in time granularity, resulting in a decrease in data volume and an increase in outliers, thereby disrupting the model’s learning process. This aligns with the characteristic that mean absolute error is less sensitive to outliers, while root-mean-square error amplifies the error of outliers. The increase in R² may be due to the smoothing effect of statistical data as the time granularity increases.

The statistical time granularity has an impact on TFM monitoring and implementation. The Enhanced Traffic Management System in the US typically uses a 15-min flow statistics time interval, while the Air Traffic Flow and Capacity Management (ATFCM) system in Europe typically uses a 1-h time interval. Additionally, the intervals can be manually adjusted [97, 98].

4.4.3 Multi-step ahead prediction results under different output sequence lengths

To analyze the performance of the AGC-LSTM model in multi-step-ahead prediction at a 15-min granularity, we conducted experiments using different input sequence lengths to predict the sector-based traffic flow for different output sequence lengths. For tactical air traffic flow management operations, flow prediction is often less accurate when the prediction look-ahead time (LAT) exceeds 1 h [99, 100], so output sequence lengths of two to five steps are used here. The experimental results are presented in Table 4. The “Predict_seq_length” column indicates the number of time steps ahead that were predicted. The optimal results for each number of prediction steps are highlighted in boldface.
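The pairing of input and output sequences for multi-step-ahead prediction can be sketched with a sliding window. This is a generic construction rather than the authors' code; `make_windows` and the toy series are hypothetical, and a single scalar series stands in for the per-node feature matrix.

```python
def make_windows(series, input_len, output_len):
    """Build (input, target) pairs for multi-step-ahead prediction.

    Each sample uses input_len past values to predict the next output_len values.
    """
    samples = []
    last_start = len(series) - input_len - output_len
    for start in range(last_start + 1):
        x = series[start:start + input_len]
        y = series[start + input_len:start + input_len + output_len]
        samples.append((x, y))
    return samples

# Toy series of ten 15-min slots: four input steps, two steps ahead.
windows = make_windows(list(range(10)), input_len=4, output_len=2)
```

At a 15-min granularity, output lengths of two to five steps correspond to look-ahead times of 30 to 75 min, which brackets the 1-h horizon mentioned above.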

Table 4 Performance comparison for 2–5 steps ahead prediction with AGC-LSTM model


The prediction results demonstrate that as the output sequence length increases, the AGC-LSTM model experiences slight performance degradation. The results also indicate that the optimal input sequence length increases with the number of prediction time steps. This can be attributed to the long-term memory characteristics of the LSTM layer: the AGC-LSTM model incurs less information loss when processing longer time series, thereby maintaining high prediction accuracy in long-term forecasting scenarios.

5 Discussion and implications

The model presented retains certain limitations. For instance, weather conditions have a significant impact on sector traffic flow, yet weather-related features have not been incorporated. Additionally, because the route segment structure differs between sectors, the current methodology requires the graph structures to be reconstructed and the networks to be retrained to predict traffic flow accurately across different sectors.

Nevertheless, this research offers several noteworthy contributions. Firstly, the utilization of the innovative AGC-LSTM model for sector traffic flow prediction based on sector-related flight segments demonstrates exceptional performance. Secondly, an exploration into the optimal input sequence length and granularity of time intervals for prediction markedly enhances prediction accuracy. Furthermore, the proposed model holds promise in aiding ATCOs in efficiently managing traffic flow and mitigating their operational burdens in practical settings. In practice, this entails a process of selecting the target region for prediction, followed by rigorous data collection, feature extraction, and model training. Subsequently, the model’s real-time application enables precise sector traffic flow prediction in a short time, facilitating controllers in evaluating airspace congestion and implementing effective management strategies. The practical application framework for the proposed AGC-LSTM method is shown in Fig. 10.

Regarding the scalability of the model, the enhancement of its performance can be pursued through the exploration of diverse attention mechanisms. Beyond conventional self-attention mechanisms, the incorporation of multi-scale attention and positional attention, among others, can augment the model’s adaptability to various focal points. Moreover, the enrichment of graph data representation by integrating additional node features, edge features, or subgraph structural information holds promise for further amplifying the model’s efficacy. In essence, the AGC-LSTM model exhibits commendable extensibility, permitting subsequent refinements and expansions tailored to specific task exigencies to elevate both performance and applicability.

6 Conclusions and future work

Data-driven and machine learning techniques offer the advantage of adaptability and flexibility, allowing models to learn from data and adjust predictions based on changing patterns in air traffic flow. These advanced approaches have significantly improved the accuracy and effectiveness of air traffic flow prediction, supporting safer and more efficient airspace operations. In this paper, we presented an innovative AGC-LSTM model for sector-based traffic flow prediction, which combines graph convolutional networks with attention mechanisms. The traffic flow in the sector is influenced not only by the number of planned flights but also by the airspace complexity within the sector. To address this, our model integrates the multi-head attention mechanism into the GCN, effectively capturing the sector’s topological structure and focusing on critical nodes. Furthermore, the LSTM model is incorporated to capture temporal dynamics in node attributes, allowing the extraction of essential spatiotemporal features from the data. Through comprehensive comparisons with the five baseline models, our proposed method demonstrates superior performance across all evaluation metrics. Notably, the AGC-LSTM model reduces the MAE to 1.6, which is significant for sector-based traffic flow management. The prediction performance can be improved further when the AGC-LSTM model is constructed with an expanded graph space range. Additionally, we explore the optimal input sequence length and time granularity for prediction, leading to significant improvements in prediction accuracy. The AGC-LSTM model also proves adept at making accurate predictions from long input sequences. All the results highlight the effectiveness of our AGC-LSTM approach in predicting sector traffic flow and showcase its potential for real-world applications in the aviation domain. Specifically, the proposed model has the potential to help ATCOs manage traffic flow efficiently and reduce their workload.

Based on the analysis of real aviation datasets, the following observations can be made regarding the proposed method. (1) The proposed method outperforms the GCN-LSTM model, showing superior predictive ability, particularly in high-flow scenarios. (2) The prediction accuracy of the model initially improves as the input sequence length increases. However, beyond a certain point, a further increase in the input sequence length leads to a deterioration in prediction accuracy. (3) The model’s prediction performance improves when the route segment range is extended beyond the current sector to include the neighboring sectors. (4) The model’s prediction performance is negatively affected as the time granularity increases. In other words, when dealing with coarser time intervals, the model’s accuracy decreases. (5) Extending the prediction time steps beyond a certain point does not significantly impact the prediction performance of the model. This verifies the stability of the AGC-LSTM model.

However, it is crucial to recognize that the research findings presented in this paper come with certain limitations and open avenues for future investigations in novel directions. Firstly, although the input features for prediction encompass periodicity and historical traffic flow, incorporating additional factors such as weather conditions has the potential to further enhance prediction performance. Secondly, the method proposed in this study necessitates reconstructing the graph structure and retraining the network when predicting the traffic flow for a different sector, which could be improved in future research. Exploring approaches that allow for more flexible and generalized predictions without the need for complete restructuring would be beneficial. Thirdly, it is worth noting that different sectors may exhibit unique spatial structural characteristics and traffic flow patterns, which can impact sector traffic flow differently. Investigating the extraction of universal input features that can accommodate diverse sector characteristics would further enhance the robustness and applicability of the prediction model. In summary, while this research lays a foundation for sector traffic flow prediction, addressing these limitations and exploring new research directions will contribute to advancing the accuracy, flexibility, and generalization of the prediction model.

Data availability

The data used for this research work are from a publicly available source. Data for the experiment will be shared on reasonable request.

Change history

19 May 2024
The original online version of this article was revised to correct the subheadings of section 3.2.

References

Dixit A, Jakhar SK (2021) Airport capacity management: a review and bibliometric analysis. J Air Transp Manag 91:102010
Xu Y, Prats X, Delahaye D (2020) Synchronised demand-capacity balancing in collaborative air traffic flow management. Transp Res Part C Emerg Technol 114:359–376
Juntama P, Delahaye D, Chaimatanan S, Alam S (2022) Hyperheuristic approach based on reinforcement learning for air traffic complexity mitigation. J Aerospace Inf Syst 19(9):633–648
ICAO (2016) Doc 4444: Procedures for Air Navigation Services, Air Traffic Management. International Civil Aviation Organization, Montreal
Dray L (2020) An empirical analysis of airport capacity expansion. J Air Transp Manag 87:101850
Cai Q, Ang HJ, Alam S (2023) Collision risk assessment of reduced aircraft separation minima in procedural airspace using advanced communication and navigation. Chin J Aeronaut 36(4):315–337
Ng KK, Chen C-H, Lee CK (2021) Mathematical programming formulations for robust airside terminal traffic flow optimisation problem. Comput Ind Eng 154:107119
Xue D, Hsu L-T, Wu C-L, Lee C-H, Ng KK (2021) Cooperative surveillance systems and digital-technology enabler for a real-time standard terminal arrival schedule displacement. Adv Eng Inform 50:101402
Liu Y, Liu Y, Hansen M, Pozdnukhov A, Zhang D (2019) Using machine learning to analyze air traffic management actions: ground delay program case study. Transp Res Part E Logist Transp Rev 131:80–95
Glover CN, Ball MO (2013) Stochastic optimization models for ground delay program planning with equity–efficiency tradeoffs. Transp Res Part C Emerg Technol 33:196–202
Ng K, Lee CK, Chan FT, Lv Y (2018) Review on meta-heuristics approaches for airside operation research. Appl Soft Comput 66:104–133
Corlu CG, de la Torre R, Serrano-Hernandez A, Juan AA, Faulin J (2020) Optimizing energy consumption in transportation: literature review, insights, and research opportunities. Energies 13(5):1115
Bolat A (2001) Models and a genetic algorithm for static aircraft-gate assignment problem. J Oper Res Soc 52(10):1107–1120
Ding W, Zhang Y, Hansen M (2018) Downstream impact of flight rerouting. Transp Res Part C Emerg Technol 88:176–186
McCrea MV, Sherali HD, Trani AA (2008) A probabilistic framework for weather-based rerouting and delay estimations within an airspace planning model. Transp Res Part C Emerg Technol 16(4):410–431
Birolini S, Antunes AP, Cattaneo M, Malighetti P, Paleari S (2021) Integrated flight scheduling and fleet assignment with improved supply-demand interactions. Transp Res Part B Methodol 149:162–180
Eufrásio ABR, Eller RA, Oliveira AV (2021) Are on-time performance statistics worthless? An empirical study of the flight scheduling strategies of Brazilian airlines. Transp Res Part E Logist Transp Rev 145:102186
Eun Y, Hwang I, Bang H (2010) Optimal arrival flight sequencing and scheduling using discrete airborne delays. IEEE Trans Intell Transp Syst 11(2):359–373
Xu S, Bi W, Zhang A, Mao Z (2022) Optimization of flight test tasks allocation and sequencing using genetic algorithm. Appl Soft Comput 115:108241
Dalmau R (2022) Predicting the likelihood of airspace user rerouting to mitigate air traffic flow management delay. Transp Res Part C Emerg Technol 144:103869
Montgomery DC, Jennings CL, Kulahci M (2015) Introduction to time series analysis and forecasting. John Wiley & Sons, New Jersey
Hong W-C (2011) Traffic flow forecasting by seasonal SVR with chaotic simulated annealing algorithm. Neurocomputing 74(12–13):2096–2107
Benítez RBC, Paredes RBC, Lodewijks G, Nabais JL (2013) Damp trend Grey Model forecasting method for airline industry. Expert Syst Appl 40(12):4915–4921
Huang C, Xu Y, Johnson ME (2017) Statistical modeling of the fuel flow rate of GA piston engine aircraft using flight operational data. Transp Res Part D Transp Environ 53:50–62
Halevy A, Norvig P, Pereira F (2009) The unreasonable effectiveness of data. IEEE Intell Syst 24(2):8–12
Lin Y, Zhang J-w, Liu H (2019) Deep learning based short-term air traffic flow prediction considering temporal–spatial correlation. Aerosp Sci Technol 93:105113
Zang H, Zhu J, Gao Q (2022) Deep learning architecture for flight flow spatiotemporal prediction in airport network. Electronics 11(23):4058
Kim YJ, Choi S, Briceno S, Mavris D (2016) A deep learning approach to flight delay prediction. In: 2016 IEEE/AIAA 35th Digital Avionics Systems Conference (DASC). IEEE, pp 1–6
Zhang X, Mahadevan S (2020) Bayesian neural networks for flight trajectory prediction and safety assessment. Decis Support Syst 131:113246
Shi Z, Xu M, Pan Q (2020) 4-D flight trajectory prediction with constrained LSTM network. IEEE Trans Intell Transp Syst 22(11):7242–7255
Sherali HD, Hill JM (2013) Configuration of airspace sectors for balancing air traffic controller workload. Ann Oper Res 203:3–31
Di Vaio A, Varriale L (2020) Blockchain technology in supply chain management for sustainable performance: evidence from the airport industry. Int J Inf Manage 52:102014
Tobaruela G, Fransen P, Schuster W, Ochieng WY, Majumdar A (2014) Air traffic predictability framework–Development, performance evaluation and application. J Air Transp Manag 39:48–58
Sun Q, Zhang K, Huang K, Li X, Zhang T, Xu T (2022) Enhanced graph convolutional network based on node importance for document-level relation extraction. Neural Comput Appl 34(18):15429–15439
Wu L, Chen Y, Shen K, Guo X, Gao H, Li S, Pei J, Long B (2023) Graph neural networks for natural language processing: a survey. Found Trends® Mach Learn 16(2):119–328
Cao P, Zhu Z, Wang Z, Zhu Y, Niu Q (2022) Applications of graph convolutional networks in computer vision. Neural Comput Appl 34(16):13387–13405
Han K, Wang Y, Guo J, Tang Y, Wu E (2022) Vision gnn: an image is worth graph of nodes. Adv Neural Inf Process Syst 35:8291–8303
Feng C, Liu Z, Lin S, Quek TQ (2019) Attention-based graph convolutional network for recommendation system. In: ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp 7560–7564
Dhawan S, Singh K, Rabaea A, Batra A (2022) ImprovedGCN: an efficient and accurate recommendation system employing lightweight graph convolutional networks in social media. Electron Commer Res Appl 55:101191
Vo T (2023) An integrated fuzzy neural supervision and attention-based graph neural network for improving network clustering. Neural Comput Appl 35(33):24015–24035
Tsitsulin A, Palowitch J, Perozzi B, Müller E (2023) Graph clustering with graph neural networks. J Mach Learn Res 24(127):1–21
Jiang W, Luo J (2022) Graph neural network for traffic forecasting: a survey. Expert Syst Appl 207:117921
Jiang W, Luo J, He M, Gu W (2023) Graph neural network for traffic forecasting: the research progress. ISPRS Int J Geo Inf 12(3):100
Li Z, Xiong G, Chen Y, Lv Y, Hu B, Zhu F, Wang F-Y (2019) A hybrid deep learning approach with GCN and LSTM for traffic flow prediction. In: 2019 IEEE intelligent transportation systems conference (ITSC). IEEE, pp 1929–1933
He Y, Zhao Y, Wang H, Tsui KL (2020) GC-LSTM: a deep spatiotemporal model for passenger flow forecasting of high-speed rail network. In: 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC). IEEE, pp 1–6
Guo J, Song C, Wang H (2019) A multi-step traffic speed forecasting model based on graph convolutional LSTM. In: 2019 Chinese Automation Congress (CAC). IEEE, pp 2466–2471
Du W, Chen S, Li Z, Cao X, Lv Y (2023) A spatial-temporal approach for multi-airport traffic flow prediction through causality graphs. In: IEEE Transactions on Intelligent Transportation Systems
Li B, Li Z, Chen J, Yan Y, Lv Y, Du W (2024) MAST-GNN: A multimodal adaptive spatio-temporal graph neural network for airspace complexity prediction. Transp Res Part C Emerg Technol 160:104521
Article Google Scholar
Jardines A, Soler M, Cervantes A, García-Heras J, Simarro J (2021) Convection indicator for pre-tactical air traffic flow management using neural networks. Mach Learn Appl 5:100053
Google Scholar
Kuhn KD (2016) A methodology for identifying similar days in air traffic flow management initiative planning. Transp Res Part C Emer Technol 69:1–15
Article Google Scholar
Jacquillat A, Odoni AR (2018) A roadmap toward airport demand and capacity management. Transp Res Part A Policy Pract 114:168–185
Article Google Scholar
Soltani M, Ahmadi S, Akgunduz A, Bhuiyan N (2020) An eco-friendly aircraft taxiing approach with collision and conflict avoidance. Transp Res Part C Emerg Technol 121:102872
Article Google Scholar
Wattanacharoensil W, Schuckert M, Graham A (2016) An airport experience framework from a tourism perspective. Transp Rev 36(3):318–340
Article Google Scholar
Harrison V (2015) Delivering a first class travel experience for passengers. J Airport Manage 9(4):317–326
Google Scholar
Badii C, Nesi P, Paoli I (2018) Predicting available parking slots on critical and regular services by exploiting a range of open data. IEEE Access 6:44059–44071
Article Google Scholar
Ivanov N, Netjasov F, Jovanović R, Starita S, Strauss A (2017) Air traffic flow management slot allocation to minimize propagated delay and improve airport slot adherence. Transp Res Part A Policy Pract 95:183–197
Article Google Scholar
Xue D, Yang J, Liu Z (2022) Potential impact of GNSS positioning errors on the satellite-navigation-based air traffic management. Space Weather 20(7):e2022SW003144
Article Google Scholar
Alharbi EA, Abdel-Malek LL, Milne RJ, Wali AM (2022) Analytical model for enhancing the adoptability of continuous descent approach at airports. Appl Sci 12(3):1506
Article Google Scholar
Polson NG, Sokolov VO (2017) Deep learning for short-term traffic flow prediction. Transp Res Part C Emer Technol 79:1–17
Article Google Scholar
Li J, Wang J (2017) Short term traffic flow prediction based on deep learning. CICTP 2019:2457–2469
Google Scholar
Dursun ÖO (2023) Air-traffic flow prediction with deep learning: a case study for Diyarbakır airport. J Aviat 7(2):196–203
Article Google Scholar
Yang Z, Wang Y, Li J, Liu L, Ma J, Zhong Y (2020) Airport arrival flow prediction considering meteorological factors based on deep-learning methods. Complexity 2020:1–11
Article Google Scholar
Zhu X, Lin Y, He Y, Tsui K-L, Chan PW, Li L (2022) Short-term nationwide airport throughput prediction with graph attention recurrent neural network. Front Artif Intell 5:884485
Article Google Scholar
Yan Z, Yang H, Li F, Lin Y (2021) A deep learning approach for short-term airport traffic flow prediction. Aerospace 9(1):11
Article Google Scholar
Yan Z, Yang H, Wu Y, Lin Y (2023) A multi-view attention-based spatial-temporal network for airport arrival flow prediction. Transp Res Part E Logist Transp Rev 170:102997
Article Google Scholar
Yan Z, Yang H, Guo D, Lin Y (2023) Improving airport arrival flow prediction considering heterogeneous and dynamic network dependencies. Inf Fus 100:101924
Article Google Scholar
Starita S, Strauss AK, Fei X, Jovanović R, Ivanov N, Pavlović G, Fichert F (2020) Air traffic control capacity planning under demand and capacity provision uncertainty. Transp Sci 54(4):882–896
Article Google Scholar
Hu Y, Xu B, Bard JF, Chi H (2015) Optimization of multi-fleet aircraft routing considering passenger transiting under airline disruption. Comput Ind Eng 80:132–144
Article Google Scholar
Kantowitz BH, Casper PA (2017) Human workload in aviation. Human error in aviation. Routledge, London, pp 123–153
Book Google Scholar
Jilkov VP, Ledet JH, Li XR (2018) Multiple model method for aircraft conflict detection and resolution in intent and weather uncertainty. IEEE Trans Aerosp Electron Syst 55(2):1004–1020
Article Google Scholar
Matsuno Y, Tsuchiya T, Wei J, Hwang I, Matayoshi N (2015) Stochastic optimal control for aircraft conflict resolution under wind uncertainty. Aerosp Sci Technol 43:77–88
Article Google Scholar
Metzger U, Parasuraman R (2017) Automation in future air traffic management: Effects of decision aid reliability on controller performance and mental workload. Decision Making in Aviation, Routledge, pp 345–360
Cao X, Zhu X, Tian Z, Chen J, Wu D, Du W (2018) A knowledge-transfer-based learning framework for airspace operation complexity evaluation. Transp Res Part C Emer Technol 95:61–81
Article Google Scholar
Li B, Du W, Zhang Y, Chen J, Tang K, Cao X (2021) A deep unsupervised learning approach for airspace complexity evaluation. IEEE Trans Intell Transp Syst 23(8):11739–11751
Article Google Scholar
Tang J, Liu G, Pan Q (2022) Review on artificial intelligence techniques for improving representative air traffic management capability. J Syst Eng Electron 33(5):1123–1134
Article Google Scholar
Du X, Lu Z, Wu D (2020) An intelligent recognition model for dynamic air traffic decision-making. Knowl-Based Syst 199:105274
Article Google Scholar
Shi-Garrier L, Delahaye D, Bouaynaya NC (2021) Predicting air traffic congested areas with long short-term memory networks. In: Fourteenth USA/Europe Air Traffic Management Research and Development Seminar (ATM2021).
Xie H, Zhang M, Ge J, Dong X, Chen H (2021) Learning air traffic as images: a deep convolutional neural network for airspace operation complexity evaluation. Complexity 2021:1–16
Google Scholar
Sui D, Liu K, Li Q (2022) Dynamic prediction of air traffic situation in large-scale airspace. Aerospace 9(10):568
Article Google Scholar
Xu Q, Pang Y, Liu Y (2023) Air traffic density prediction using Bayesian ensemble graph attention network (BEGAN). Transp Res Part C Emer Technol 153:104225
Article Google Scholar
Ma C, Alam S, Cai Q, Delahaye D (2022) Sector entry flow prediction based on graph convolutional networks. In: International Conference on Research in Air Transportation.
Brito IR, Rocha Murca MC, Oliveira Md, Oliveira AV (2021) A Machine Learning-based Predictive Model of Airspace Sector Occupancy. In: AIAA AVIATION 2021 FORUM. p 2324
Asirvadam TV, Rao S, Balachander T (2022) Predicting air traffic density in an air traffic control sector. ECS Trans 107(1):5037
Article Google Scholar
Liu H, Lin Y, Chen Z, Guo D, Zhang J, Jing H (2019) Research on the air traffic flow prediction using a deep learning approach. IEEE Access 7:148019–148030
Article Google Scholar
Moreno FP, Comendador VFG, Jurado RD-A, Suárez MZ, Janisch D, Valdés RMA (2023) Methodology of air traffic flow clustering and 3-D prediction of air traffic density in ATC sectors based on machine learning models. Expert Syst Appl 223:119897
Article Google Scholar
Hong Y, Choi B, Lee K, Kim Y (2017) Dynamic robust sequencing and scheduling under uncertainty for the point merge system in terminal airspace. IEEE Trans Intell Transp Syst 19(9):2933–2943
Article Google Scholar
Delahaye D, Ma C, Alam S, Cai Q (2022) Air traffic flow representation and prediction using transformer in flow-centric airspace. In: SESAR Innovation Days.
Chen D, Hu M, Ma Y, Yin J (2016) A network-based dynamic air traffic flow model for short-term en route traffic prediction. J Adv Transp 50(8):2174–2192
Article Google Scholar
Zhang Y, Lu Z, Wang J, Chen L (2023) FCM-GCN-based upstream and downstream dependence model for air traffic flow networks. Knowl-Based Syst 260:110135
Article Google Scholar
Cai K, Shen Z, Luo X, Li Y (2023) Temporal attention aware dual-graph convolution network for air traffic flow prediction. J Air Transp Manag 106:102301
Article Google Scholar
Niu Z, Zhong G, Yu H (2021) A review on the attention mechanism of deep learning. Neurocomputing 452:48–62
Article Google Scholar
Pham V, Bluche T, Kermorvant C, Louradour J (2014) Dropout improves recurrent neural networks for handwriting recognition. In: 2014 14th international conference on frontiers in handwriting recognition. IEEE, pp 285–290
Kiperwasser E, Goldberg Y (2016) Simple and accurate dependency parsing using bidirectional LSTM feature representations. Trans Assoc Comput Linguist 4:313–327
Article Google Scholar
Xue D, Yang J, Liu Z, Yu S (2023) Examining the economic costs of the 2003 Halloween storm effects on the North Hemisphere aviation using flight data in 2019. Space Weather 21(3):e2022SW003381
Article Google Scholar
Ali BS, Ochieng WY, Zainudin R (2017) An analysis and model for Automatic Dependent Surveillance Broadcast (ADS-B) continuity. GPS Solut 21(4):1841–1854
Article Google Scholar
Jovanovic R, Babic O, Toic V, Tošic V (2015) Pricing to reconcile predictability, efficiency and equity in ATM. In: Proceedings of the 11th USA/Europe ATM R&D Seminar.
Koelman H, Koelle R, Shetty K, Gulding J (2019) Comparison of ATFM Practices and Performance in The US and Europe (2015–2018). In: 2019 Integrated Communications, Navigation and Surveillance Conference (ICNS). IEEE, pp 1–16
Masalonis A, Mulgund S, Song L, Wanke C, Zobell S (2004) Using probabilistic demand predictions for traffic flow management decision support. In: AIAA guidance, navigation, and control conference and exhibit. p 5231
Gilbo E, Smith S (2007) A new model to improve aggregate air traffic demand predictions. In: AIAA Guidance, Navigation and Control Conference and Exhibit. p 6450
Lee H, Jung YC, Zelinski SJ, Zhu Z, Hosagrahara V (2019) Fast-Time Simulation for Evaluating the Impact of Estimated Flight Ready Time Uncertainty on Surface Metering. In: 2019 IEEE/AIAA 38th Digital Avionics Systems Conference (DASC). IEEE, pp 1–10

Acknowledgements

This work was supported by the National Key R&D Program of China (2022YFB2602403).

Funding

Open access funding provided by The Hong Kong Polytechnic University.

Author information

Authors and Affiliations

College of General Aviation and Flight, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Ying Zhang & Shimin Xu
State Key Laboratory of Air Traffic Management System, Nanjing, China
Ying Zhang, Shimin Xu & Linghui Zhang
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Linghui Zhang
School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China
Weiwei Jiang
Air Traffic Management Research Institute, School of Mechanical & Aerospace Engineering, Nanyang Technological University, Singapore, Singapore
Sameer Alam
Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong, China
Dabin Xue

Corresponding author

Correspondence to Dabin Xue.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhang, Y., Xu, S., Zhang, L. et al. Short-term multi-step-ahead sector-based traffic flow prediction based on the attention-enhanced graph convolutional LSTM network (AGC-LSTM). Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09827-3

Received: 16 November 2023
Accepted: 12 April 2024
Published: 07 May 2024
DOI: https://doi.org/10.1007/s00521-024-09827-3
