Flight risk evaluation based on flight state deep clustering network

Flight risk evaluation based on a data-driven approach is an essential topic of aviation safety management. Existing risk analysis methods ignore the coupling and time-variant characteristics of flight parameters, and cannot accurately establish the mapping relationship between flight state and loss-of-control risk. To deal with this problem, a flight state deep clustering network (FSDCN) model is proposed to mine latent loss-of-control risk information implicit in raw flight parameters. FSDCN integrates feature extraction and clustering into an end-to-end deep hybrid network to extract latent risk features from multivariate time-series flight parameters and cluster them. In the FSDCN model, a sequential multi-attention encoder-decoder network is designed to extract embedded risk features, and the feature clustering layer is designed to iteratively refine clustering effects and feature extraction. In addition, a loss-of-control classifier is added to optimize the risk feature vector expression and ensure that the features are sufficiently discriminative to facilitate clustering. The multi-task joint learning strategy is adopted to further improve the clustering performance of the model. According to the extracted risk features and similarity metrics, the optimal number of flight state clusters is set as 5. Comparative experiments show that FSDCN performs significantly better than other clustering models, with a performance percentage error below 6%. Through statistical analysis of clustering results, the risk level is quantified for each cluster. Three high-difficulty maneuver cases are presented to demonstrate FSDCN for flight risk evaluation.
The flight parameter sequences of the maneuver cases are input into the well-trained FSDCN to obtain the risk prediction results. The spatiotemporal distribution characteristics of the risk-quantized results are consistent with flight parameters over-limit situations, which demonstrates the effectiveness of FSDCN on clustering flight states. The experimental results on flight maneuver cases show that FSDCN can find potential loss-of-control risk features according to multivariate time-series flight data and provide support for in-flight risk warnings.


Introduction
Flight risk early warning has always been the focus of flight safety research, and its core is to evaluate the aircraft's performance objectively and in advance [1,2]. When the aircraft falls into complex conditions, accurate and objective risk evaluation of the aircraft's performance helps the crew take corresponding manipulation strategies to operate the aircraft away from loss-of-control (LOC) states [3]. In this way, flight crashes can be reduced effectively. The time-variant characteristics of flight parameters directly determine the aircraft's performance and contain important risk information reflecting LOC occurrence. Hence, mining the latent risk features of the flight parameters with a data-driven method is significant for flight risk evaluation and prediction.
Clustering analysis is a standard data-mining method based on unsupervised machine learning [4]. Without prior knowledge, the method can separate data into clusters according to the captured features, and then find the valuable information hidden in the data [5]. Due to its apparent advantages in unsupervised learning, clustering analysis has been widely used in medicine [6,7], industry [8], and aviation [9][10][11]. Flight data recording the airplane path, speed, attitude and operation are typically high-dimensional, redundant, and time-series in nature. Directly clustering the flight parameters in the original data space is the simplest clustering procedure [12]. However, the classic clustering algorithms used to detect abnormal flight status lack efficient representation learning for the coupling and time-variant characteristics of flight parameters. Large-scale, unevenly distributed and high-dimensional data greatly increase the clustering difficulty. Any special or abnormal data can affect the clustering analysis and reduce the reliability of the results. Deep clustering algorithms based on feature extraction can effectively address this problem by mapping the higher-dimensional data space to an embedded feature space.
Inspired by the deep clustering model, a flight state deep clustering network (FSDCN) model is proposed to jointly perform feature extraction and clustering assignments for multivariate time-series flight data. In the FSDCN model, a sequential multi-attention encoder-decoder network is designed to extract embedded risk features, and the feature clustering layer is designed to iteratively refine the cluster centroids by using an auxiliary target distribution. Minimizing the Kullback-Leibler (KL) divergence between the soft assignments of the embedded features and the auxiliary target distribution improves clustering assignment and feature representation simultaneously. In addition, the multi-task joint learning strategy is adopted to further improve the clustering performance of the model. Several clustering models, including the random forest K-means model, the autoencoder K-means model and the deep embedding cluster model, are also introduced for comparison with FSDCN. The comparative experiments show that FSDCN performs better than the other clustering models. Through statistical analysis of clustering results, the risk level is finally quantified for each cluster. Three high-difficulty maneuver cases are presented to demonstrate the effectiveness of FSDCN in clustering flight states. The experimental results confirm that a well-trained FSDCN can effectively define the LOC risk level of the flight state according to multivariate time-series flight data. The main contributions of this work are summarized as follows:
1. An unsupervised FSDCN is proposed to realize flight state clustering and flight risk evaluation. The end-to-end multi-dimensional learning structure integrates feature extraction and the unsupervised clustering procedure, and it can separate the flight state into different clusters according to the multivariate time-series flight parameters.
2. The multi-task joint learning strategy is designed by deriving a loss function containing reconstruction loss, classification loss and clustering loss. In network training, the shared network parameters are jointly adjusted to ensure convergence at different stages.
3. The LOC risk levels of clusters are defined, and the experimental results confirm the clustering ability of FSDCN and the effectiveness of flight risk evaluation.
This paper is organized as follows: The related works in terms of clustering analysis are delineated in "Related works". The flight state clustering problem is defined in "Flight state clustering and splitting". The flight state deep clustering network is illustrated in "Flight state deep clustering network". Experiments and results analysis are shown in "Experiments". Finally, some conclusions are presented in "Conclusions".

Classic clustering
In recent years, some classic clustering algorithms have been developed and expanded, such as the modified k-means clustering algorithm [13], density spatial clustering algorithm [14], K-nearest neighbor decision clustering algorithm [15], density deviation multi-peaks automatic clustering algorithm [16], and synchronization-inspired clustering algorithm [17]. The classic clustering algorithms are effective when the data structures are combinations of simple forms and the features are representative. Clustering analysis has grown in popularity, given the need for detecting abnormal flight characteristics. Jiang et al. [15] used the density-based spatial clustering algorithm to identify abnormal spatiotemporal flight trajectories on the basis of the pilot's operation. To address the negative impacts of outliers during clustering, Liu et al. [18] proposed a clustering with outlier removal algorithm and used it to detect abnormal flight trajectories. Gao et al. [19] combined principal component analysis and a hierarchical clustering algorithm to locate known exceedance flight data fragments and potential abnormal flight data fragments. Aslaner et al. [20] used the dynamic time-warping method to evaluate the similarities of the flight parameters in clusters. These clustering algorithms focus on anomaly detection for historical flight data, which leaves them unable to track cluster changes during flight. Zhao et al. [21] developed an online clustering algorithm to adjust clusters as onboard flight data are updated. Because these models do not consider multi-parameter coupling and time-varying characteristics from the perspective of flight safety, it is difficult for them to accurately describe the relationship between flight state and LOC risk.

Deep clustering
Recently, neural networks have developed rapidly and have been widely applied in many fields of modern technology due to their approximation properties and feature learning capabilities. Deep clustering algorithms based on neural networks are promising methods for both feature extraction and clustering assignments. Siłka et al. [22] developed a long short-term memory neural network model with hyperbolic tangent activations in the hidden layers, and used it to predict potential vibrations of a high-speed train from time series of recorded vibrations. Mittal et al. [23] introduced a Levenberg-Marquardt neural network into routing protocols for detecting malicious attacks over wireless sensor networks. Wieczorek et al. [24] proposed a complex solution for network training, and used the custom neural network to flexibly predict virus spread. Moreover, some researchers have attempted to combine neural networks with clustering algorithms. The neural network can improve the performance of the clustering algorithm by mapping the higher-dimensional data space to an embedded feature space. Qin et al. [25] proposed a deep hybrid model to detect anomalous flights. In the model, a time-feature attention-based convolutional neural network extracts flight features, and a hierarchical density-based spatial clustering algorithm detects anomalous flights according to the extracted features. The deep hybrid model has limitations because it decouples feature extraction from clustering assignments, which can result in a mismatch between the extracted features and the clustering target. Xie et al. [26] proposed an unsupervised deep embedding clustering (DEC) model. The DEC model integrates feature extraction and clustering assignments into a deep neural network, which can simultaneously improve feature expression and optimize clustering objectives. The DEC model and its improved versions have been successfully applied to graph classification [27][28][29], signal processing [30], document retrieval [31], and traffic crash prediction [32].

Flight state clustering and splitting
Flight state clustering aims to establish the mapping relationship between flight state and LOC risk. The time-variant characteristics of flight parameters are the primary basis for unsupervised cluster analysis and risk classification. Unlike the traditional risk classification method based on single flight parameter limitation, FSDCN first extracts the risk features implicit in multivariate time-series flight parameters, so that the original data space is abstracted into a feature space. Then, unsupervised clustering is completed in the risk feature space.
In this paper, eight flight parameters are selected and combined as the model input x, as shown in Eq. (1). x is the multivariate time series extracted from the flight data stream. The extraction range of the flight parameters is t = 10 s, and the detector collection frequency is f_s = 50 Hz, so each input sequence contains 500 samples of each parameter.
The flight parameters have different scale characteristics, which affect the initialization of network parameters and the efficiency of network training. The flight parameters usually vary within a specific range: their available limitations have obvious boundaries, and their performances are continuous. Hence, a normalization method for multi-scaled variables is proposed based on the available limitations of single flight parameters. The model input x is preprocessed by the maximum normalization method. For each dimensional parameter in {V, Ḣ, α, θ, φ, p, q, r}, the normalized value is calculated as shown in Eq. (2),
where x_uplimit is the upper limit of the parameter, x_downlimit is the lower limit of the parameter, as shown in Table 1, and mid(x) is the median value of the parameter x.
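The windowing and scaling described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the non-overlapping stride, the function names `split_windows` and `normalize`, and the specific scaling form (centering on the median and dividing by half the allowed range) are all assumptions, since Eq. (2) and the limits of Table 1 are not reproduced here.

```python
from statistics import median

FS_HZ = 50                      # collection frequency stated in the text
WINDOW_S = 10                   # extraction range stated in the text
WINDOW_LEN = FS_HZ * WINDOW_S   # 500 samples per input sequence


def split_windows(stream, window_len=WINDOW_LEN):
    """Cut a sequence of samples into non-overlapping windows.

    The non-overlapping stride is an assumption; trailing samples that do
    not fill a whole window are dropped.
    """
    n_full = len(stream) // window_len
    return [stream[k * window_len:(k + 1) * window_len] for k in range(n_full)]


def normalize(values, lower_limit, upper_limit):
    """Scale one flight parameter using its allowed range.

    Assumed form (hypothetical, not the paper's Eq. (2)): subtract the
    median of the values and divide by half of the allowed range.
    """
    mid = median(values)
    half_range = (upper_limit - lower_limit) / 2.0
    return [(v - mid) / half_range for v in values]
```

With these assumptions, a 25 s recording at 50 Hz yields two complete 10 s windows, and each of the eight parameters would be scaled with its own limits before being stacked into the model input.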
The FSDCN model automatically learns the nonlinear mapping relationship f between the original data space X and the risk feature space Z from the training data. Flight parameters {x_i ∈ X}, i = 1, ..., N, are first transformed to risk feature vectors {z_i ∈ Z}, i = 1, ..., N, with the nonlinear mapping relationship f. The risk feature vectors are clustered into K clusters whose centroids are {µ_j ∈ Z}, j = 1, ..., K, in the risk feature space Z. The risk level of each cluster is determined according to the number of LOC states.
where, represents learned parameters in the deep clustering network.

Risk feature extraction with sequential multi-attention encoder-decoder network
Encoder-decoder is a general network framework, which is widely used for machine learning. Hence, a sequential multi-attention encoder-decoder network is designed to extract

Risk feature extraction
The GRU neural network is an improvement of the recurrent neural network (RNN) with excellent long-term memory ability [33]. It not only contains the cyclic network structure but also introduces a gating mechanism to control the accumulation and update of information, which makes the GRU network suitable for processing time-series data. The GRU neural network extracts sequential features of the corresponding parameters, and the multi-head attention mechanism is adopted to adjust the feature weights at different time nodes. According to the many-to-one mapping criterion, the risk feature vector f_c is obtained. Next, the fully connected layer is used for further dimension reduction and yields the low-dimensional risk feature vector f_p.

Cluster Model
The decoder layer is composed of a flight parameter reconstructor and a LOC classifier. Specifically, a reverse network layer is adopted to reconstruct the input vector from the risk feature vector f_p. The reconstruction loss L_reconstruct is shown in Eq. (4). Since normal flight samples far exceed LOC samples, a dropout layer is adopted in the LOC classifier to avoid network over-fitting. The LOC classifier distinguishes normal data from LOC data based on the extracted low-dimensional risk feature vector f_p, which ensures that f_p retains sufficient latent LOC risk features. Moreover, focal loss [34] is introduced to address the extreme imbalance between normal samples and LOC samples during training, as shown in Eq. (5). This significantly increases the loss contribution of the LOC samples, so that the model tends to learn from these samples.
where x(t) is the real flight parameter and x̂(t) is the reconstructed parameter.
where p_y is the estimated probability for the LOC label y, and ζ represents the balance coefficient of positive and negative samples. Finally, the loss function of the sequential multi-attention encoder-decoder network is shown in Eq. (6). The designed encoder-decoder network is trained by updating its parameters to minimize the loss L_FE. The well-trained sequential multi-attention encoder-decoder network can summarize nonlinear mapping relationships and build an effective mapping from the original data space X to the risk feature space Z.
where δ is the weighting factor.
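To make the role of the balance coefficient concrete, the following sketch implements the standard binary focal loss from [34] for a single sample. It is an illustration under stated assumptions rather than the paper's Eq. (5): the function name `focal_loss`, the focusing exponent `focusing` and its value 2.0 are hypothetical, and ζ is assumed to weight the LOC class with 1 − ζ weighting the normal class. The default ζ = 0.65 is the value listed for Eq. (5) in Table 3.

```python
import math


def focal_loss(p_loc, y, zeta=0.65, focusing=2.0, eps=1e-7):
    """Binary focal loss for one sample (sketch, standard form from [34]).

    p_loc    : predicted probability that the sample is a LOC sample
    y        : 1 for a LOC sample, 0 for a normal sample
    zeta     : balance coefficient; assumed to weight the LOC class
    focusing : focusing exponent (hypothetical value, not given in the text)
    """
    p_loc = min(max(p_loc, eps), 1.0 - eps)       # avoid log(0)
    p_t = p_loc if y == 1 else 1.0 - p_loc        # probability of the true class
    weight = zeta if y == 1 else 1.0 - zeta       # class balance weight
    return -weight * (1.0 - p_t) ** focusing * math.log(p_t)
```

Under these assumptions, a confidently correct prediction contributes almost nothing to the loss, while a misclassified LOC sample contributes more than an equally misclassified normal sample, which is the behavior the text attributes to the focal loss.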

Risk feature clustering with feature clustering layer
Based on the obtained low-dimensional risk feature vector f_p, the feature clustering layer aims to learn K cluster centers in the risk feature space and determine the risk label of each data sample according to the similarity between the feature vector and the cluster center. The conventional clustering method updates the cluster centers and modifies the labels of the data samples by minimizing the value of the similarity measure, but it does not change the location and distribution of the data samples in the process. Different from the conventional clustering method, FSDCN combines the clustering process with feature extraction. The feature clustering layer introduces a soft assignment Q and an auxiliary target distribution P, and updates the layer parameters by minimizing the KL divergence that measures the difference between Q and P.
As shown in Fig. 3, K cluster centers {µ_j ∈ Z}, j = 1, ..., K, are initialized by the K-means algorithm in the risk feature space. The similarity between the risk feature vector z_i and the cluster center µ_j is calculated as in Eq. (7),
where q_ij is the soft assignment, i.e. the probability that the feature point z_i is assigned to the cluster centroid µ_j.
After obtaining the soft assignment of the data sample set, the auxiliary target distribution P is calculated by first squaring Q and then normalizing by the frequency per cluster, as shown in Eq. (8). This operation gives P a stricter probability distribution (its values are closer to 0 or 1) than Q, which facilitates learning from risk feature data with high-confidence assignments.
The clustering loss L_cluster is defined as the KL divergence that measures the difference between the soft assignment Q and the auxiliary target distribution P, as shown in Eq. (9). The clusters are iteratively refined by learning from high-confidence assignments. The clustering results guide feature extraction, which in turn optimizes the clustering effect.
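The three quantities of the feature clustering layer can be sketched in a few lines. The target distribution and the KL loss follow the description above; the Student's t kernel used for the soft assignment is an assumption borrowed from the DEC model [26], because Eq. (7) is not reproduced here. The function names and the degree-of-freedom parameter `alpha` are hypothetical.

```python
import math


def soft_assignment(features, centroids, alpha=1.0):
    """Soft assignment Q with a Student's t kernel (assumed, as in DEC [26])."""
    q = []
    for z_i in features:
        kernel = []
        for mu_j in centroids:
            dist_sq = sum((a - b) ** 2 for a, b in zip(z_i, mu_j))
            kernel.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(kernel)
        q.append([k / total for k in kernel])   # each row sums to 1
    return q


def target_distribution(q):
    """Auxiliary target P: square Q, divide by cluster frequency, renormalize."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weighted = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weighted)
        p.append([w / total for w in weighted])
    return p


def clustering_loss(p, q):
    """KL(P || Q) summed over all samples and clusters."""
    return sum(p_ij * math.log(p_ij / q_ij)
               for p_row, q_row in zip(p, q)
               for p_ij, q_ij in zip(p_row, q_row)
               if p_ij > 0.0)
```

In a training loop, P would typically be recomputed from Q at intervals and held fixed while the gradient of the KL term updates the encoder and the centroids, so that confident assignments pull the features toward their cluster centers.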

Network joint training
Flight risk state clustering relies on extracting the risk features contained in the parameter data set. FSDCN integrates risk feature extraction and cluster analysis into an end-to-end network structure. At the network joint training stage, a new loss function is formed from the extraction loss and the clustering loss, as shown in Eq. (10), so that the risk feature vectors are updated in a direction more conducive to clustering.
where γ is the weighting factor.

Multi-task joint learning strategy
According to multi-task joint learning strategy, meaningful and well-separated feature representations are firstly pro-  2.
(1) Risk feature extraction. Update the parameters of the sequential multi-attention encoder-decoder network to learn the nonlinear mapping function f. The network initially completes flight risk feature extraction.
(2) Risk feature clustering. Update the parameters of the feature clustering layer by minimizing the KL divergence that measures the difference between Q and P.
(3) Network joint training. Jointly fine-tune the parameters of the overall FSDCN, so that they are updated in the direction conducive to the clustering task.
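The three stages above differ mainly in which loss is minimized. The sketch below shows one way to select the objective per stage. The additive forms used for Eq. (6) and Eq. (10) are assumptions, since the equations are not reproduced here, and the function and stage names are hypothetical. The defaults δ = 0.5 and γ = 0.5 are the values listed in Table 3.

```python
def stage_loss(stage, l_reconstruct, l_classify, l_cluster, delta=0.5, gamma=0.5):
    """Return the objective minimized at a given training stage (sketch).

    Assumed forms (not confirmed by the text):
      L_FE    = L_reconstruct + delta * L_classify     (stand-in for Eq. (6))
      L_joint = L_FE + gamma * L_cluster               (stand-in for Eq. (10))
    """
    l_fe = l_reconstruct + delta * l_classify
    if stage == "feature_extraction":    # stage (1): encoder-decoder only
        return l_fe
    if stage == "feature_clustering":    # stage (2): clustering layer only
        return l_cluster
    if stage == "joint_training":        # stage (3): whole network
        return l_fe + gamma * l_cluster
    raise ValueError(f"unknown stage: {stage}")
```

In a real training script each stage would also restrict which parameters receive gradient updates; that part depends on the deep learning framework and is omitted here.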

Experiments
According to accident statistics, flying has become the safest mode of transportation thanks to the development of aviation technology.
Over the past 20 years, accident rates dramatically declined while flights rose steadily.LOC events rarely happen in actual flight.Hence, there are few flight data in case of complex situations, especially LOC accidents.Flight data used in this

Model parameters setting and evaluation metrics
Flight states are related to the number of clusters. The reconstruction loss, classification loss and clustering loss are weighted by the weighting factors δ and γ. The role of the weighted loss function is to guide network training in a direction more conducive to clustering. The optimal number of clusters depends on the risk features extracted for partitioning, as well as on the method used to measure similarity. The sum of squared error (SSE) and the gap value are common indexes used to select the optimal cluster number in the K-means algorithm, as shown in Eqs. (14) and (15). Both indexes can provide valuable information for cluster analysis by fitting the model with a range of values for the cluster number K. Hence, it is a good
The elbow method finds the elbow point by drawing a line plot of SSE against K. As shown in Fig. 5a, the elbow point occurs at cluster number K = 5. Gap statistics (GS) measures the cluster difference between the observed data and reference data drawn from a reference distribution. The optimal number of clusters can be chosen as the smallest value of K such that the gap value is within one standard deviation of the gap at K + 1. As shown in Fig. 5b, when K = 5, the gap value is greater than the value at K = 6, which satisfies Eq. (15). Hence, the optimal number of clusters is set as 5.
Meanwhile, Table 3 gives the other primary parameter configurations of FSDCN. The flight data are divided into a training dataset and a testing dataset to verify the performance of FSDCN. The k-fold cross-validation method is prevalent for evaluating classification algorithm performance [35]. The experiment in this paper uses ten-fold cross-validation. The flight data are randomly divided into ten disjoint datasets of approximately equal size, so the ratio of the training dataset to the testing dataset is 9:1. Each disjoint dataset is used in turn as the testing dataset to evaluate the flight state clustering effect, and the other nine disjoint datasets are used as the training dataset to learn feature representations and cluster assignments.
The silhouette coefficient (SC) combines cluster cohesion a(i) and cluster separation b(i). SC is calculated as shown in Eq. (16). SC is essentially the difference between b(i) and a(i) divided by the maximum of the two. Hence, the score of SC ranges within [−1, 1], and a value closer to 1 indicates a better clustering effect:
SC(i) = (b(i) − a(i)) / max{a(i), b(i)}   (16)
where a(i) refers to the average distance between an instance and all other samples within the same cluster, and b(i) refers to the average distance between an instance and all samples in other clusters. Their formulas are shown in Eq. (17),
where n represents the number of all other samples within the same cluster, and N represents the number of all samples in other clusters.
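The definitions of a(i), b(i) and SC given above can be sketched directly. The code follows the wording of this section, in which b(i) averages over all samples in other clusters; the more common silhouette definition instead uses only the nearest other cluster. The Euclidean distance and the function names are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def silhouette_scores(points, labels):
    """Per-sample SC with a(i) and b(i) as described in the text."""
    scores = []
    for i, (p_i, l_i) in enumerate(zip(points, labels)):
        same = [euclidean(p_i, p) for j, (p, l) in enumerate(zip(points, labels))
                if l == l_i and j != i]
        other = [euclidean(p_i, p) for p, l in zip(points, labels) if l != l_i]
        if not same or not other:
            scores.append(0.0)          # SC is undefined here; use 0 by convention
            continue
        a = sum(same) / len(same)       # cohesion: mean distance inside the cluster
        b = sum(other) / len(other)     # separation: mean distance to other clusters
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0.0 else 0.0)
    return scores
```

Averaging the per-sample scores gives a single SC value for a clustering; well-separated compact clusters give values near 1, and mislabeled points give negative values.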
The Calinski-Harabasz index (CH) combines the sum of inter-cluster dispersion GS(B_k) and the sum of intra-cluster dispersion for all clusters GS(W_k). CH is calculated as shown in Eq. (18). Unlike SC, CH has no bound, and a higher score means a better clustering effect,
where µ_0 is the centroid of all samples, µ_k is the centroid of the k-th cluster, z_j^k is the j-th sample of the k-th cluster, and n_k is the number of samples in the k-th cluster.
The Davies-Bouldin index (DB) describes the average similarity of each cluster with the cluster most similar to it, which combines the intra-cluster dispersions C_i and C_j and the separation measure M_ij.
FSDCN proposed in this paper is an unsupervised clustering model. The model input lacks true labels. Hence, there is no independent labeled validation data to evaluate FSDCN performance. To solve this problem, the statistics of the metrics (DB, SC and CH) are used to measure the performance of the clustering model. Since there is a randomness mechanism in k-fold cross-validation, the average value and standard deviation of the metrics can be adopted to verify the performance of unsupervised learning algorithms. The value of the metrics (average value) measures the cluster assignment, and the distribution of the metrics (standard deviation) measures the stability of the model. However, the scale characteristics of different metrics can affect the comparisons between clustering models. To compare the performance of different clustering models directly, the percentage error is introduced as shown in Eq. (22).
Table 3 parameter values: mini-batch size, 64; ζ in Eq. (5), 0.65; δ in Eq. (6), 0.5; γ in Eq. (10), 0.5.

Percentage error
where V_train is the metric value on the training dataset, and V_test is the metric value on the testing dataset.
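The evaluation protocol described in this section (ten disjoint folds, mean and standard deviation of each metric, and a train/test percentage error) can be sketched as follows. The relative-difference form of the percentage error is an assumption, since Eq. (22) is not reproduced here, and all function names are hypothetical.

```python
import random
from statistics import mean, stdev


def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train, test) index lists; each fold serves once as the test set."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)                 # random disjoint partition
    folds = [idx[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, folds[k]


def summarize(values):
    """Average value and standard deviation of one metric across folds."""
    return mean(values), stdev(values)


def percentage_error(v_train, v_test):
    """Relative train/test gap in percent (assumed form of Eq. (22))."""
    return abs(v_train - v_test) / abs(v_train) * 100.0
```

Because the percentage error is a ratio, it can be compared across DB, SC and CH even though the three metrics live on different scales, which is the motivation given in the text.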

Clustering results analysis
T-distributed stochastic neighbor embedding (T-SNE) is a popular tool for high-dimensional data visualization, which maps the data in a high-dimensional space to a low-dimensional space. Here, T-SNE is applied to map the high-dimensional latent risk features to three-dimensional vectors for visualization. The progression of the embedded representation of the risk features is visualized in Fig. 6. The clusters become increasingly well separated, which means that FSDCN extracts more distinguishable risk features and better serves flight state clustering. Figure 6a is the visualization result for embedded representation clustering in the initial stage. FSDCN starts from a chaotic state where clusters overlap too closely to capture the data separability. Figure 6b is the visualization result for embedded representation clustering in the risk feature extraction stage. The sequential multi-attention encoder-decoder network is updated to improve the performance of clustering. All data move closer toward the cluster centers, but the boundaries of clusters still overlap. Figure 6c is the visualization result for embedded representation clustering in the risk feature clustering stage. The deep cluster layer is updated to enhance the performance of clustering, and the distance between clusters increases. Figure 6d is the visualization result for embedded representation clustering in the multi-task joint learning stage. The parameters of the overall FSDCN are updated simultaneously at this stage. The distribution of clusters changes with the training iterations, the embedded representations of risk features flock together more distinctly, and the cluster assignments become more reasonable. Figure 6e is the clustering result for the embedded representation. The elements in each cluster are highly concentrated and the interval between clusters is apparent. This supports the effectiveness of the proposed multi-task joint learning strategy.
The performance of the evaluation metrics for FSDCN is shown in Fig. 7. As training proceeds, the DB index representing the separation between the clusters decreases gradually, and the SC index indicating the cohesion of the clusters increases gradually. The clustering effect is continuously improved by risk feature extraction. At the risk feature clustering stage, the DB and SC indexes change little, while the CH index, which contains both separation and cohesion information, increases significantly. A higher value of the CH index means the clusters are dense and well separated. At the multi-task joint learning stage, the three evaluation metrics, DB, SC, and CH, oscillate slightly with the training iterations, and the clustering effect is optimized continuously. As can be observed from Fig. 7, the three evaluation metrics finally converge to a stable state. FSDCN searches for a stable clustering solution that can extract latent risk features and cluster them reasonably.
The results of k-fold cross-validation for FSDCN are shown in Table 4. In both the training dataset and the testing dataset, FSDCN has high SC and CH indexes and a low DB index, which indicates that FSDCN has strong unsupervised clustering ability. Moreover, the average values of DB, CH, and SC are similar between the two datasets, and their variance values are small. The percentage error is not more than 6%, indicating that FSDCN has good generalization ability.
To validate the effectiveness of FSDCN, we compare its clustering effect with that of the following models:
(1) Random forest K-means model [36]: features extracted by a random forest are used to support K-means clustering.
(2) Autoencoder K-means model [37]: features extracted by an autoencoder are used to support K-means clustering.
(3) Deep embedding cluster model [26]: feature extraction and unsupervised cluster assignment are integrated into a deep embedding network.
Figure 8 presents the visualization results of the other clustering algorithms on the flight datasets, and Table 4 also reports their performance on the training and testing datasets. From Fig. 8, it is observed that the autoencoder has considerable potential in feature extraction. Compared with the random forest, the autoencoder significantly improves the performance of the clustering algorithm. The deep embedding cluster algorithm has the best metrics among the three other clustering algorithms according to Table 4, and its clustering result is better than those of the other two according to Fig. 8c. This means that a deep neural network integrating feature extraction and cluster assignment can further improve the performance of the clustering algorithm. However, the percentage error of the deep embedding cluster model is large and there are still overlapping areas between clusters, which means that its generalization ability is poor. Hence, inspired by the deep embedding cluster model, FSDCN is developed.
Comparing Figs. 6e and 8c, the boundaries between the clusters obtained by FSDCN are more apparent than those obtained by the deep embedding cluster algorithm. This means that the latent risk features contain more information useful for flight state division. Moreover, according to Table 4, the metric values of FSDCN are better than those of the other clustering algorithms; in particular, the percentage error of all metrics is less than 6%. This indicates its better clustering performance. In summary, FSDCN is a competitive algorithm that can simultaneously learn feature representations and cluster assignments.

Flight risk evaluation
According to the clustering results of flight data by FSDCN, all flight states are classified into clusters from RL_A to RL_E. Every cluster includes normal flight states and LOC states. Hence, some metrics used to evaluate the characteristics of the clusters are defined. Based on the relationship between flight parameters and latent LOC information, the risk level of the clusters is determined through statistical analysis.
Proportion of cluster data (PCD): the ratio of the sample number of each cluster to the total sample number. The value of PCD is related to the complexity and danger of flight maneuvers. Its expression is shown in Eq. (23):
PCD_i = n_c^i / n_all   (23)
where n_c^i is the sample number of the i-th cluster, and n_all is the sample number of all data.
Proportion of LOC data (PLD): the ratio of the LOC samples contained in the cluster to the total LOC samples. The larger the PLD value, the more LOC samples in the cluster. Its expression is shown in Eq. (24):
PLD_i = n_loss^i / n_loss^all   (24)
where n_loss^i is the number of LOC samples contained in the i-th cluster, and n_loss^all is the total number of LOC samples.
Average LOC correlation (ALC): the average correlation between LOC samples and other samples in the cluster. The larger the ALC value, the stronger the correlation between flight status and LOC in the cluster. Its expression is shown in Eq. (25).
where Pearson(*) is the Pearson correlation coefficient function; see [38].
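The two proportion metrics can be computed directly from cluster labels and LOC flags, as in the sketch below. The function name, the dictionary layout and the choice to return fractions rather than percentages are assumptions; ALC is left out because the exact form of Eq. (25) is not reproduced here.

```python
from collections import Counter


def cluster_proportions(cluster_labels, is_loc):
    """Compute PCD and PLD for every cluster (fractions in [0, 1]).

    cluster_labels : cluster label of each sample, e.g. "RL_A"
    is_loc         : True where the sample is a LOC sample
    """
    n_all = len(cluster_labels)
    n_loc_all = sum(1 for flag in is_loc if flag)
    n_per_cluster = Counter(cluster_labels)
    n_loc_per_cluster = Counter(c for c, flag in zip(cluster_labels, is_loc) if flag)
    return {
        c: {
            "PCD": n / n_all,                                        # share of all samples
            "PLD": n_loc_per_cluster[c] / n_loc_all if n_loc_all else 0.0,  # share of LOC samples
        }
        for c, n in n_per_cluster.items()
    }
```

Multiplying the returned fractions by 100 gives values comparable to the percentages reported for the clusters; summed over all clusters, both PCD and PLD equal 1.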
The statistical results of the clusters are shown in Table 5. For the RL_B cluster, which accounts for 35.33% of the whole dataset, its LOC samples account for only 1.56% of the total LOC samples. p_ALC of the RL_B cluster is less than 0.1. This indicates that the correlation between the flight states and LOC states in the RL_B cluster is extremely weak or absent. Hence, the risk level of the RL_B cluster is relatively low. For the RL_C and RL_E clusters, which account for 34.41% of the whole dataset, their LOC samples account for 17.31% of the
Fig. 7 The performance of evaluation metrics for FSDCN
shown in Table 5. The higher the risk level, the more serious the LOC correlation.
To verify the validity of FSDCN in classifying flight states, the flight parameter sequences of some flight maneuver cases are input into the well-trained FSDCN. Then, the risk level C_R for each input sequence is output. The risk level C_R and the flight parameter performance are put together to compare their temporal distribution characteristics. In addition, the trajectory with risk level C_R of each flight maneuver case is provided to show the spatial distribution characteristics. Some high-difficulty maneuvers are included in the flight cases, such as loops, barrel rolls, s-turns, wingovers, and nosedives. Figures 9, 10 and 11 show the risk heatmaps of the spatiotemporal distribution for the flight maneuver cases. By examining the risk heatmaps of the spatial distribution for the flight maneuver cases, it is found that LOC risk is mainly concentrated in the loop maneuver stage. After examining Figs. 9a, 10a and 11a, it is found that the distribution characteristics of the flight parameter over-limit are basically

Conclusions
In this paper, a flight state deep clustering network (FSDCN) was proposed for flight risk evaluation. FSDCN has an end-to-end multi-task learning structure that integrates feature extraction and unsupervised clustering. In FSDCN, a sequential multi-attention encoder-decoder network is designed to extract effective risk features from the original flight parameters, and a clustering layer is constructed to iteratively refine the clusters by measuring the clustering performance, which in turn facilitates feature extraction. A multi-task joint learning strategy is adopted to optimize the clustering performance of FSDCN. Compared with other deep clustering models, FSDCN has better clustering performance and obtains the most apparent boundaries between the clusters, which improves the accuracy of risk evaluation. Each cluster's LOC risk level is quantified through statistical analysis, using self-defined metrics that evaluate the relationship between flight parameters and latent LOC information. Three high-difficulty maneuver cases are presented to illustrate FSDCN for flight risk evaluation. The spatiotemporal risk distributions for the flight maneuver cases confirm the effectiveness of the proposed FSDCN in clustering flight states.
However, our work has some limitations, and the FSDCN model needs to be further improved in future work. Since the flight data used in this paper mainly came from flight aerobatics training with a simulator, the data exclude abnormalities such as noise, missing values and bias. Such near-perfect flight data may limit the practical application of FSDCN in a risk alarm system. Hence, future studies will focus on data cleaning to handle noise, missing values and deviations in the data. In addition, other factors associated with the shape of the input variables (e.g. parameter combination, extraction range and collection interval) need further study to clarify their effects on the clustering performance of FSDCN. These will lay the foundation for successful, accurate, and efficient data analysis. Finally, the display form of the risk information also needs attention, to clarify its suitability for improving the crew's situational awareness.

FSDCN adopts an end-to-end multi-task learning structure to simultaneously accomplish the feature extraction and clustering tasks for multi-dimensional flight parameters with time-series features. The framework of FSDCN is shown in Fig. 1. According to the priority of tasks, the training process of FSDCN is divided into three stages: the risk feature extraction stage, the risk feature clustering stage and the network joint training stage. A multi-task joint learning strategy is designed to jointly adjust the shared network parameters and ensure convergence at the different stages.

Input: Flight parameters data x
Normalization: Data maximum normalization
Initialization: Initialize FSDCN parameters
Switch epoch from 1 to Epochs
    Case epoch in (0, 0.2Epochs]
        Update parameters of sequential multi-attention encoder-decoder network by $-\nabla_{FE}(L_{FE})$
    Case epoch in (0.2Epochs, 0.6Epochs]
        Update parameters of deep cluster layer by $-\nabla_{cluster}(L_{cluster})$
    Case epoch in (0.6Epochs, Epochs]
        Update parameters of overall FSDCN by $-\nabla(L)$
end
Output: the class label of the corresponding input

The flight data used in this paper mainly came from flight aerobatics training with a simulator. The flight simulator and part of the flight aerobatics trajectory are shown in Fig. 4. The implementation of the FSDCN algorithm in this paper is based on Python 3.7 and the PyTorch 1.10.2 deep learning framework.
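The staged schedule of the multi-task joint learning strategy can be sketched in plain Python as follows. The sketch only selects which part of the network would be updated at a given epoch, using the 20% and 60% boundaries of the strategy; the function name `training_stage` and the stage labels are hypothetical, and the actual parameter updates, losses and optimizers of FSDCN are not reproduced here.

```python
def training_stage(epoch, epochs):
    """Return the stage active at a 1-based `epoch` out of `epochs` in total."""
    if not 1 <= epoch <= epochs:
        raise ValueError("epoch must lie in [1, epochs]")
    # Integer comparisons avoid floating-point rounding at the boundaries:
    # 5 * epoch <= epochs          is  epoch <= 0.2 * epochs
    # 5 * epoch <= 3 * epochs      is  epoch <= 0.6 * epochs
    if 5 * epoch <= epochs:
        return "feature_extraction"  # update encoder-decoder with L_FE
    if 5 * epoch <= 3 * epochs:
        return "clustering"          # update cluster layer with L_cluster
    return "joint"                   # update the whole network with L


# Example with a hypothetical budget of 10 epochs.
stages = [training_stage(e, 10) for e in range(1, 11)]
print(stages)
```

With 10 epochs, the first two epochs fall in the feature extraction stage, the next four in the clustering stage, and the last four in the joint training stage.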

Fig. 4 Flight aerobatics training with simulator

Three metrics are introduced to evaluate the quality of clustering: the Davies-Bouldin (DB) index, the Silhouette coefficient (SC), and the Calinski-Harabasz (CH) index. DB represents the similarity measurement between clusters, and SC represents how tightly grouped the samples in the clusters are. CH represents the ratio of the sum of inter-cluster dispersion to the sum of intra-cluster dispersion for all clusters.
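To make the three metrics concrete, the sketch below implements their standard textbook definitions in plain Python with Euclidean distance. It is an illustration under those assumptions, not the evaluation code used in this paper; the helper names and the toy points are hypothetical. scikit-learn offers equivalent functions (`davies_bouldin_score`, `silhouette_score` and `calinski_harabasz_score` in `sklearn.metrics`).

```python
from math import dist


def centroid(points):
    """Component-wise mean of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def group(points, labels):
    """Collect the points of each cluster label."""
    clusters = {}
    for p, c in zip(points, labels):
        clusters.setdefault(c, []).append(p)
    return clusters


def davies_bouldin(points, labels):
    """Mean over clusters of the worst similarity ratio (lower is better)."""
    clusters = group(points, labels)
    cents = {c: centroid(ps) for c, ps in clusters.items()}
    scatter = {c: sum(dist(p, cents[c]) for p in ps) / len(ps)
               for c, ps in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in clusters if j != i) for i in clusters]
    return sum(worst) / len(worst)


def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all samples (higher is better)."""
    clusters = group(points, labels)
    total = 0.0
    for p, c in zip(points, labels):
        own = clusters[c]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in ps) / len(ps)
                for k, ps in clusters.items() if k != c)
        m = max(a, b)
        total += (b - a) / m if m else 0.0
    return total / len(points)


def calinski_harabasz(points, labels):
    """Between- over within-cluster dispersion, scaled by degrees of freedom."""
    clusters = group(points, labels)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = sum(len(ps) * dist(centroid(ps), overall) ** 2
                  for ps in clusters.values())
    within = sum(dist(p, centroid(ps)) ** 2
                 for ps in clusters.values() for p in ps)
    return (between / (k - 1)) / (within / (n - k))


# Two well-separated toy clusters (hypothetical data, at least two clusters needed).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = [0, 0, 0, 1, 1, 1]
print(davies_bouldin(points, labels))
print(silhouette(points, labels))
print(calinski_harabasz(points, labels))
```

For well-separated clusters such as these, DB is close to 0, SC is close to 1, and CH is large, which matches the direction in which each metric indicates better clustering.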

Fig. 5 Determination of number of clusters

Fig. 8 The visualization results of other clustering algorithms

Table 1 The limitations of the flight parameters

Table 2 Multi-task joint learning strategy: implementation of FSDCN training

Table 3 Other primary parameter settings of FSDCN

Table 4 The results of k-fold cross validation for clustering algorithms

$p_{ALC}$ of the $RL_C$ and $RL_E$ clusters is 0.2567 and 0.3177, respectively. $p_{ALC} \in [0.2, 0.4]$ indicates that the correlation between the flight states and LOC states in the $RL_C$ and $RL_E$ clusters is medium. Hence, the risk level of the $RL_C$ and $RL_E$ clusters is moderate. For the $RL_A$ and $RL_D$ clusters, which account for 30.26% of the whole dataset, their LOC samples account for 81.13% of the total LOC samples. $p_{ALC}$ of the $RL_A$ and $RL_D$ clusters is 0.6489 and 0.8357, respectively. $p_{ALC} \in [0.6, 0.9]$ indicates that the correlation between the flight states and LOC states in the $RL_A$ and $RL_D$ clusters is strong. Hence, the risk level of the $RL_A$ and $RL_D$ clusters is high. Finally, the risk level of the different clusters is determined according to the PLD and ALC metrics.

Table 5 Flight state statistics and risk level evaluation