A binary-domain recurrent-like architecture-based dynamic graph neural network

Chen, Zi-chao; Lin, Sui

doi:10.1007/s43684-024-00067-9

Original Article
Open access
Published: 25 June 2024

Volume 4, article number 11, (2024)

Autonomous Intelligent Systems

Zi-chao Chen¹ &
Sui Lin¹

Abstract

The integration of Dynamic Graph Neural Networks (DGNNs) with Smart Manufacturing is crucial as it enables real-time, adaptive analysis of complex data, leading to enhanced predictive accuracy and operational efficiency in industrial environments. To address the poor coupling of the spatial and temporal domains and the low prediction accuracy of current dynamic graph neural networks, as well as the over-smoothing caused by traditional graph neural networks, a dynamic graph prediction method based on a spatiotemporal binary-domain recurrent-like architecture is proposed: the Binary Domain Graph Neural Network (BDGNN). The proposed model begins by utilizing a modified Graph Convolutional Network (GCN) without an activation function to extract meaningful graph topology information, ensuring non-redundant embeddings. In the temporal domain, a Recurrent Neural Network (RNN) and residual systems are employed to facilitate the transfer of dynamic graph node information between learner weights, aiming to mitigate the impact of noise within the graph sequence. In the spatial domain, the AdaBoost (Adaptive Boosting) algorithm is applied to replace the traditional approach of stacking layers in a graph neural network. This allows for the utilization of multiple independent graph learners, enabling the extraction of higher-order neighborhood information and alleviating the issue of over-smoothing. The efficacy of BDGNN is evaluated through a series of experiments, with performance metrics including Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR) for link prediction tasks, as well as metrics for traffic speed regression tasks across diverse test sets. Compared with other models, the experimental results demonstrate that BDGNN not only better integrates temporal and spatial information, but also extracts higher-order neighbor information to alleviate the over-smoothing phenomenon of the original GCN.

1 Introduction

Smart Manufacturing represents a revolutionary shift in industrial production, leveraging advanced technologies such as the Internet of Things (IoT), artificial intelligence (AI), and big data analytics to create highly automated and interconnected production environments. This integration allows for real-time data acquisition and analysis, essential for optimizing production lines, predicting maintenance needs, and ensuring quality control. Dynamic Graph Neural Networks (DGNNs) enhance these capabilities by learning from the temporal dynamics within the manufacturing environment. Unlike static models, DGNNs adapt to ongoing changes, providing more accurate predictions. For example, by modeling inter-component relationships within a production line as a dynamic graph, DGNNs can foresee potential failures or bottlenecks. This predictive insight enables preemptive actions to minimize downtime and boost productivity, thus maintaining continuous and efficient operations. By adopting DGNNs, Smart Manufacturing not only becomes more efficient but also shifts towards proactive maintenance strategies, reducing the risk of costly unplanned disruptions.

Initially, research on DGNNs primarily built upon the successes achieved in static graph research, which demonstrated notable outcomes in downstream tasks including link prediction [1–3], node classification [4], and community detection [5, 6]. To harness these capabilities for dynamic scenarios, researchers began integrating static GNN frameworks with Recurrent Neural Networks (RNNs), enabling the models to not only distill graph structures but also effectively encode temporal dynamics. Although recent years have witnessed significant strides in the domain of dynamic graph node embedding models [7, 8], the field is still in the development stage in general. First of all, contemporary research often integrates GNN models, originally tailored for static graphs, with RNNs without adequately accounting for the possibility that nodal changes could be constrained by both spatial and temporal factors. This oversight can result in a decoupling of spatiotemporal relationships. Secondly, the fundamental nature of the GCN model is akin to Laplacian smoothing; thus, with increasing network depth, traditional GCNs tend to over-facilitate information exchange among neighboring nodes. Consequently, regardless of the nodes’ proximity or topological dissimilarities, their feature representations tend to converge, leading to a homogenization effect known as over-smoothing. This issue significantly impairs the GNNs’ representational capacity, complicating the network’s ability to discern intricate graph patterns and ultimately impairing performance on subsequent analytical tasks. In view of this, this paper researches the embedding vector representation of nodes in dynamic graphs, and proposes the Binary Domain Graph Neural Network (BDGNN), which effectively mitigates the detrimental effects of over-smoothing while simultaneously reducing noise during the dynamic changes in graph topology.

The contributions of this paper can be summarized in three aspects: 1) BDGNN, a dynamic graph neural network based on a binary-domain RNN-like structure, is proposed. The AdaBoost [9] algorithm replaces the layer stacking of an ordinary dynamic graph neural network, and its application is extended to link prediction and regression tasks; 2) On the link prediction task, experiments verify that BDGNN alleviates over-smoothing [10], the phenomenon in which the embedding representation ability of a dynamic graph neural network decreases rather than increases as the network becomes deeper; 3) In time-domain processing, a residual connection is added to eliminate the interference of anomalous time steps with the embedding stability of subsequent time steps.

2 Related work

2.1 Dynamic graph neural network

A dynamic graph is characterized by a sequence of graph snapshots. Traditional graph neural networks are limited in their ability to capture the evolving nature of dynamic graphs, as they only consider data features from individual time steps without exploring inter-node relationships. Since links within a dynamic graph undergo temporal changes, there is a need for graph neural network models specifically designed to address dynamic graph analysis and modeling. There are several typical works:

Seo et al. [11] designed a GCRN model that employs GCN to encode the information of individual graphs at specific time steps, while integrating temporal information along the time axis using Long Short Term Memory (LSTM) [12]. This model is implemented in two different combinations: The first approach utilizes the GCN to obtain node embeddings, which are then fed into an LSTM designed to learn dynamics. The second is a modified LSTM that takes node features as input but replaces the fully connected layers with graph convolutions.

In order to generate stable graph embeddings, Goyal et al. [13] designed the DynGEM model, which is encoded with a graph auto-encoder and initialized in the next time step using the encoding of the previous time step. This model can handle growing dynamic graphs while keeping embeddings stable and takes less time than using static models on each snapshot.

In order to keep the size of the model not too large while maintaining high accuracy, Pareja et al. [14] designed the EvolveGCN model, which no longer inputs the node features learned by GCN into the RNN. Instead, the GCN parameters of previous step are used as the input of RNN, and the output of RNN is used as the parameters of GCN of the next time step, which effectively reduces the pressure of the dynamic graph model on the hardware equipment and improves the accuracy of the model.

Tensor decomposition is widely used to model high-order variable correlation. Further research by Takeuchi et al. [15] found that tensor decomposition can mine the temporal change patterns of objects. Inspired by this, Shi et al. [16] designed the GAEN model, which combines the Graph Attention Network (GAT) [17] with the Gated Recurrent Unit (GRU) [18] and uses PARAFAC tensor decomposition to extract graph change patterns. Experiments verified that this yields a certain improvement in accuracy.

In order to migrate the success of static GNNs to dynamic graphs, You et al. [19] proposed ROLAND, an effective graph representation learning framework. The main idea is to regard the node embeddings of different GNN layers as hierarchical node states, and then update them periodically over time. With the help of the ROLAND framework, researchers can easily apply any static GNN to dynamic graphs.

Based on the aforementioned studies, it is evident that the majority of current approaches for dynamic graph neural networks involve the integration of GNN and RNN temporal processors, supplemented by intermediate components to strengthen the interplay between these two components.

2.2 Over-smoothing

Since the essence of the GCN model is Laplacian smoothing, as the layers are stacked, GCN will cause the information exchange between neighbors to become too frequent. Regardless of the distance or topological similarities and differences of each node, their features will tend to be indistinguishable. This phenomenon is called over-smoothing. In order to alleviate the over-smoothing phenomenon, researchers have designed a variety of solutions, including several typical works:

In an effort to regulate the rate of neighborhood expansion, Xu et al. [20] proposed the Jumping Knowledge Network (JK-Net) framework. This architecture allows embeddings from each layer to directly connect to the output layer during the iterative aggregation process. The final output is then generated by aggregating these connections with a specific aggregation function. Each node adaptively selects information from its higher-order neighbors, ensuring that the influence of each node’s neighborhood is distinct, preserving node diversity and, to some extent, mitigating the over-smoothing phenomenon.

Rong et al. [21] proposed DropEdge, a model based on cutting edges. The design is relatively simple: some edges are randomly removed from the original graph. Removing edges makes the connections between nodes sparser, reduces the frequency of information exchange between nodes, and effectively alleviates the over-smoothing that arises as model layers are stacked.

In order to effectively mine the information of high-order neighbors, Sun et al. [22] designed the AdaGCN model. Because this model introduces the AdaBoost method to integrate the information of each order of neighbors, it can effectively absorb information from different neighborhood levels while building a deep graph model. The knowledge of high-order neighbors not only improves the prediction accuracy, but also avoids the over-smoothing of the original GCN to a certain extent.

In summary, researchers have proposed various solutions to alleviate over-smoothing in graph neural networks, such as adaptive information selection, edge cutting, and integration of high-order neighbor information.
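The over-smoothing effect described at the start of this subsection can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it repeatedly applies a normalized adjacency matrix to a single scalar node feature and tracks how far apart the node values remain. The 4-cycle graph, the feature vector, and the helper name `matvec` are assumptions chosen for illustration. A regular graph is used so that the limit is exactly uniform; on irregular graphs the symmetric normalization drives features toward values that scale with the square root of the node degree instead.

```python
def matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# 4-cycle 0-1-2-3-0 with self-loops. Every node has degree 3 in A + I,
# so the symmetric normalization reduces to A_hat = (A + I) / 3.
A_hat = [[1 / 3, 1 / 3, 0.0, 1 / 3],
         [1 / 3, 1 / 3, 1 / 3, 0.0],
         [0.0, 1 / 3, 1 / 3, 1 / 3],
         [1 / 3, 0.0, 1 / 3, 1 / 3]]

x = [1.0, 0.0, 0.0, 0.0]  # one scalar feature per node (illustrative)
spreads = []
for k in range(1, 7):
    x = matvec(A_hat, x)               # x now holds A_hat^k applied to the input
    spreads.append(max(x) - min(x))    # gap between most and least "active" node

# On this graph the gap shrinks by a factor of 3 with every propagation step.
for k, s in enumerate(spreads, start=1):
    print(f"k={k}: spread={s:.5f}")
```

After a handful of propagation steps the node values are nearly identical, which is the homogenization that the methods above try to avoid.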

3 Problem statement

A salient challenge in crafting dynamic GNN models lies in the intricate task of amalgamating the temporal and spatial dimensions of dynamic graphs. While a naive superposition of graph neural network layers, as applied to static graphs, might yield satisfactory results, this approach fails to account for the potential variations in nodal connections across successive time steps in dynamic environments. For example, in the dynamic collaboration network shown in Fig. 1, nodes represent authors, and edges between nodes represent authors' collaborative relationships. At time step $t_{0}$, author A has a cooperative relationship with author B, and author B also has a cooperative relationship with author C. At time step $t_{1}$, authors A and C cooperate directly, bypassing author B. At time step $t_{2}$, the cooperative relationships among A, B, and C return to their state at time step $t_{0}$. The embedding at the previous time step $t_{1}$ may therefore not effectively help calculate the embedding at time step $t_{2}$. For this reason, we use an RNN to process the time series and introduce residual systems to ensure the stability of the embedding.

In addition, changes in node characteristics in dynamic networks can affect higher-order neighbors over time. Take, for instance, the traffic speed graph depicted in Fig. 2, where nodes represent traffic speed monitors and edges signify direct road connections between these monitors. At time step $t_{0}$, a severe traffic accident occurred on the road segment between monitor A and monitor B, resulting in a significant traffic jam (highlighted in red). Consequently, the average speeds recorded by monitors A and B will exhibit abnormal changes; at time step $t_{1}$, this congestion is likely to propagate along the roadway. Monitors directly connected to A and B will start detecting this traffic jam, and by the subsequent time step $t_{2}$, the disturbance will spread to adjacent monitors. As time progresses, the neighboring nodes of A and B will be impacted in varying degrees.

In response to these dynamics, our research aims to address the differential impacts exerted by each neighborhood order on node embeddings. Rather than merely stacking traditional Graph Neural Network (GNN) layers, we employ AdaBoost to adaptively integrate information across different neighborhood orders. This approach enables us to effectively combine the temporal and spatial dimensions of the dynamic graph relationships, thereby enhancing the model’s ability to interpret and predict changes in network states caused by incidents such as traffic jams. Below we will introduce the detailed design of each module of BDGNN.

4 Method

4.1 Model overall design

The overarching schematic of the model is illustrated in Fig. 3, encompassing three parts: a modified graph convolutional network without activation function, RNN-based processing for temporal data, and an analogous RNN-like architecture to handle higher-order spatial-domain information. The model accepts a feature matrix X and a sequence of normalized adjacency matrices as input. It is structured with L layers, each considered an independent learner. The l-th layer utilizes RNNs alongside residual modules to capture the temporal information inherent in the graph sequence, and a two-layer perceptron $g_{\theta}$ to extract topological details from the propagated features $\boldsymbol{\hat{A}}^{l}\boldsymbol{X}$. This produces an output $\boldsymbol{H}^{(l)}$. Following training, the nodal weights $w^{l}$ and the neural network parameters $\theta _{l}$ are propagated to the next layer's learner, which then embarks on learning from the enhanced input $\boldsymbol{\hat{A}}^{l+1}\boldsymbol{X}$. After the L-th layer learner has been trained, an AdaBoost algorithm, chosen in accordance with the specific downstream task, integrates the features from all L layers to derive the model's final output H.

4.2 Modified graph convolutional network without activation function

The proposed model employs modified Graph Convolutional Networks (GCN) to effectively process and analyze the neighborhood information inherent in graph topology. The GCN is established as both the most ubiquitously adopted and fundamental model among graph neural network architectures. It is constructed with multiple strata of graph convolution. Specifically, the l-th layer receives both the adjacency matrix A and the node embedding matrix from the previous layer, $\boldsymbol{H}^{(l)}$, as inputs. Subsequently, under the action of the weight matrix $\boldsymbol{W}^{(l)}$, it updates the representation from $\boldsymbol{H}^{(l)}$ to $\boldsymbol{H}^{(l+1)}$, which is then propagated as the output. This process is encapsulated by the following mathematical formulation:

$$ \begin{aligned} \boldsymbol{H}^{(l+1)}&=\operatorname{GCONV}\bigl( \boldsymbol{A},\boldsymbol{H}^{(l)},\boldsymbol{W}^{(l)}\bigr) \\ &:=\sigma \bigl(\boldsymbol{\hat{A}}\boldsymbol{H}^{(l)}\boldsymbol{W}^{(l)}\bigr), \end{aligned} $$

(1)

where $\boldsymbol{\hat{A}}$ represents the normalization of A, which is defined as:

$$ \begin{aligned} &\boldsymbol{\hat{A}}=\boldsymbol{\tilde{D}}^{-\frac{1}{2}}\boldsymbol{ \tilde{A}} \boldsymbol{\tilde{D}}^{-\frac{1}{2}}, \\ &\boldsymbol{\tilde{A}}=\boldsymbol{A}+\boldsymbol{I}, \\ &\boldsymbol{\tilde{D}}=\operatorname{diag}\biggl(\sum_{j} \boldsymbol{\tilde{A}}_{ij}\biggr). \end{aligned} $$

(2)

σ is the activation function (usually ReLU) of all layers except the output layer. The first layer of the model usually treats the node feature matrix X as the node embedding matrix $\boldsymbol{H}^{(0)}$.
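As a concrete illustration of Eq. (2), the following minimal sketch computes the normalized adjacency matrix for a small graph using plain Python lists. The function name `normalize_adjacency` and the 3-node path graph are assumptions made for the example; a practical implementation would typically use a sparse matrix library.

```python
import math

def normalize_adjacency(A):
    """Return A_hat = D~^{-1/2} (A + I) D~^{-1/2} for a square adjacency matrix A."""
    n = len(A)
    # A~ = A + I adds a self-loop to every node.
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    # D~ is diagonal; its entries are the row sums of A~.
    deg = [sum(row) for row in A_tilde]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]
    return [[inv_sqrt[i] * A_tilde[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]

# Path graph 0 - 1 - 2 (illustrative input).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
A_hat = normalize_adjacency(A)
for row in A_hat:
    print([round(v, 4) for v in row])
```

Because self-loops are added before normalization, every degree is at least 1 and the inverse square root is always defined.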

The modified GCN we employed incorporates two specific computational modifications inspired by AdaGCN:

1) The activation function is removed. The feature representation of each node is usually a one-dimensional sparse vector rather than an image, which intuitively requires a deep convolutional network to extract high-level representations for visual tasks, so the ideal representation of nodes does not necessarily require many nonlinear transformations. Wu et al. [23] were driven by the same reasoning when designing the Simple Graph Convolution (SGC) model, which removes the nonlinear activation function between GCN layers:

$$ \begin{aligned} \boldsymbol{H}^{(l)}=\boldsymbol{\hat{A}}^{l} \boldsymbol{X}\boldsymbol{W}^{(0)}\boldsymbol{W}^{(1)}\cdots \boldsymbol{W}^{(l-1)}= \boldsymbol{\hat{A}}^{l}\boldsymbol{X}\boldsymbol{\tilde{W}}. \end{aligned} $$

(3)

2) As in SGC, once ReLU is removed, the stacked linear transformations of graph convolution are insufficient for integrating high-order neighbor information. Therefore, we use a two-layer fully connected neural network $g_{\theta}$ to replace the linear transformation W, compensating for the lack of nonlinearity in GCN.

Based on the above two changes, the input-output relationship of the l-th layer GCN can be expressed as:

$$ \boldsymbol{H}^{(l)}_{t}=g_{\theta}\bigl(\boldsymbol{ \hat{A}}_{t}^{l}\boldsymbol{X}\bigr). $$

(4)
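Eq. (4) can be sketched in a few lines for a single time step. The code below is illustrative only: the helper names (`matmul`, `propagate`, `g_theta`), the ReLU hidden activation inside the two-layer perceptron, and all numeric values are assumptions, since the text does not specify the internals of $g_{\theta}$.

```python
def matmul(P, Q):
    """Dense matrix product of two lists-of-rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def propagate(A_hat, X, l):
    """Compute A_hat^l X by l repeated multiplications (no activation in between)."""
    out = X
    for _ in range(l):
        out = matmul(A_hat, out)
    return out

def g_theta(Z, W1, W2):
    """Two-layer perceptron; the ReLU hidden activation is an assumption."""
    hidden = [[max(0.0, v) for v in row] for row in matmul(Z, W1)]
    return matmul(hidden, W2)

# Two connected nodes: A + I is all ones and both degrees are 2,
# so the normalized adjacency has 0.5 in every entry.
A_hat = [[0.5, 0.5],
         [0.5, 0.5]]
X = [[1.0, 0.0],
     [0.0, 1.0]]           # illustrative node features
W1 = [[1.0, 0.0],
      [0.0, 1.0]]          # illustrative hidden-layer weights
W2 = [[1.0],
      [1.0]]               # illustrative output-layer weights

H = g_theta(propagate(A_hat, X, 2), W1, W2)  # l-th learner with l = 2
print(H)
```

Each learner therefore sees a different power of the normalized adjacency matrix, which is what lets the l-th learner draw on l-hop neighborhood information without stacking l nonlinear layers.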

4.3 Modified recurrent neural network with residual system

We utilize Recurrent Neural Networks (RNNs) to effectively capture and analyze the temporal information present in graph sequences. An RNN is a neural network specialized for processing sequence data. Classical RNN models include Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). LSTM can be expressed as:

$$\begin{aligned} & \boldsymbol{f}_{t}=\sigma \bigl(\boldsymbol{W}_{f}\cdot [\boldsymbol{h}_{t-1},\boldsymbol{x}_{t}]+\boldsymbol{b}_{f}\bigr), \\ & \boldsymbol{i}_{t}=\sigma \bigl(\boldsymbol{W}_{i}\cdot [\boldsymbol{h}_{t-1},\boldsymbol{x}_{t}]+\boldsymbol{b}_{i}\bigr), \\ & \widetilde{\boldsymbol{C}}_{t}=\tanh \bigl(\boldsymbol{W}_{C}\cdot [\boldsymbol{h}_{t-1},\boldsymbol{x}_{t}]+\boldsymbol{b}_{C}\bigr), \\ & \boldsymbol{C}_{t}=\boldsymbol{f}_{t}*\boldsymbol{C}_{t-1}+\boldsymbol{i}_{t}*\widetilde{\boldsymbol{C}}_{t}, \\ & \boldsymbol{o}_{t}=\sigma \bigl(\boldsymbol{W}_{o}\cdot [\boldsymbol{h}_{t-1},\boldsymbol{x}_{t}]+\boldsymbol{b}_{o}\bigr), \\ & \boldsymbol{h}_{t}=\boldsymbol{o}_{t}*\tanh (\boldsymbol{C}_{t}), \end{aligned}$$

(5)

where t denotes the discrete time step, $\boldsymbol{f}_{t}$ symbolizes the forget gate, and $\boldsymbol{i}_{t}$ denotes the input gate. The cell state is represented by $\boldsymbol{C}_{t}$, while $\boldsymbol{o}_{t}$ corresponds to the output gate, and $\boldsymbol{h}_{t}$ signifies the hidden state. The parameters of the model are W for weights and b for biases. σ denotes the sigmoid activation function, and the symbol ∗ denotes the Hadamard product, i.e. element-wise multiplication.
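A single LSTM step following Eq. (5) can be written directly from the gate definitions. The sketch below uses plain Python lists; the function and parameter names (`lstm_step`, `params`, and the dictionary keys) are hypothetical, and the all-zero weights are chosen only so the result is easy to check by hand.

```python
import math

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def tanh(v):
    return [math.tanh(x) for x in v]

def affine(W, v, b):
    """Compute W v + b for a matrix W (list of rows) and vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def hadamard(a, b):
    return [x * y for x, y in zip(a, b)]

def lstm_step(x_t, h_prev, C_prev, params):
    z = h_prev + x_t                                   # concatenation [h_{t-1}, x_t]
    f = sigmoid(affine(params["W_f"], z, params["b_f"]))     # forget gate
    i = sigmoid(affine(params["W_i"], z, params["b_i"]))     # input gate
    C_tilde = tanh(affine(params["W_C"], z, params["b_C"]))  # candidate cell state
    C = [a + b for a, b in zip(hadamard(f, C_prev), hadamard(i, C_tilde))]
    o = sigmoid(affine(params["W_o"], z, params["b_o"]))     # output gate
    h = hadamard(o, tanh(C))
    return h, C

# Hidden size 1, input size 1, all parameters zero (illustrative only):
# every gate equals sigmoid(0) = 0.5 and the candidate state is tanh(0) = 0.
zero = {"W_f": [[0.0, 0.0]], "b_f": [0.0],
        "W_i": [[0.0, 0.0]], "b_i": [0.0],
        "W_C": [[0.0, 0.0]], "b_C": [0.0],
        "W_o": [[0.0, 0.0]], "b_o": [0.0]}
h, C = lstm_step([3.0], [1.0], [2.0], zero)
print(h, C)
```

With these parameters the cell state is simply halved, which makes the role of the forget gate visible in isolation.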

GRU can be expressed as:

$$ \begin{aligned} & {\boldsymbol{z}_{t}}=\sigma \bigl({ \boldsymbol{W}_{z}}[{\boldsymbol{h}_{t-1}},{\boldsymbol{x}_{t}}]+{ \boldsymbol{b}_{z}}\bigr), \\ & {\boldsymbol{r}_{t}}=\sigma \bigl({\boldsymbol{W}_{r}}[{ \boldsymbol{h}_{t-1}},{\boldsymbol{x}_{t}}]+{ \boldsymbol{b}_{r}} \bigr), \\ & \widetilde{{\boldsymbol{h}_{t}}}=\tanh \bigl({\boldsymbol{W}_{h}}[ \boldsymbol{r}_{t}*{\boldsymbol{h}_{t-1}},{ \boldsymbol{x}_{t}}]+{ \boldsymbol{b}_{h}}\bigr), \\ & {\boldsymbol{h}_{t}}=(1-{\boldsymbol{z}_{t}})*{\boldsymbol{h}_{t-1}}+{ \boldsymbol{z}_{t}}* \widetilde{{\boldsymbol{h}_{t}}}, \end{aligned} $$

(6)

where t denotes the discrete time step, $\boldsymbol{z}_{t}$ is the update gate, $\boldsymbol{r}_{t}$ signifies the reset gate, $\boldsymbol{\widetilde{h}}_{t}$ represents the candidate hidden state, and $\boldsymbol{h}_{t}$ is the final hidden state.
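The GRU update of Eq. (6) can be sketched in the same style. As before, the names `gru_step` and `params` are hypothetical and the zero-valued parameters are picked purely so the output can be verified by hand.

```python
import math

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def affine(W, v, b):
    """Compute W v + b for a matrix W (list of rows) and vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def gru_step(x_t, h_prev, params):
    concat = h_prev + x_t                                       # [h_{t-1}, x_t]
    z = sigmoid(affine(params["W_z"], concat, params["b_z"]))   # update gate
    r = sigmoid(affine(params["W_r"], concat, params["b_r"]))   # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x_t        # [r_t * h_{t-1}, x_t]
    h_tilde = [math.tanh(v)
               for v in affine(params["W_h"], gated, params["b_h"])]
    # Convex combination of the old state and the candidate state.
    return [(1.0 - zi) * hp + zi * ht
            for zi, hp, ht in zip(z, h_prev, h_tilde)]

# Hidden size 1, input size 1, all parameters zero (illustrative only):
# z = 0.5 and the candidate state is tanh(0) = 0, so h_t = 0.5 * h_{t-1}.
zero = {"W_z": [[0.0, 0.0]], "b_z": [0.0],
        "W_r": [[0.0, 0.0]], "b_r": [0.0],
        "W_h": [[0.0, 0.0]], "b_h": [0.0]}
h = gru_step([3.0], [1.0], zero)
print(h)
```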

To prevent anomalous time steps from affecting the performance of the overall dynamic system, we add a residual system $\operatorname{Res}^{(l)}$ at each time step, implemented as the network weight from i time steps earlier, where i is usually 3. Similar to EvolveGCN, this paper also uses two versions to update the weight $\boldsymbol{W}^{(l)}_{t}$ of the two-layer fully connected neural network in the base classifier.

The first version treats $\boldsymbol{W}^{(l)}_{t}$ as the hidden state in the recurrent architecture, and the input state is the node embedding $\boldsymbol{H}^{(l)}_{t}$. This version is denoted H, and its representation can be written as:

$$ \begin{aligned} \boldsymbol{W}^{(l)}_{t}= \operatorname{RNN}\bigl(\boldsymbol{H}^{(l)}_{t}, \boldsymbol{W}^{(l)}_{t-1}\bigr)+\operatorname{Res}^{(l)}_{t}. \end{aligned} $$

(7)

The second version treats $\boldsymbol{W}^{(l)}_{t}$ as the input and output of the recurrent structure instead of the node embedding. This version is denoted O, and its representation can be written as:

$$ \begin{aligned} \boldsymbol{W}^{(l)}_{t}= \operatorname{RNN}\bigl(\boldsymbol{W}^{(l)}_{t-1}\bigr)+ \operatorname{Res}^{(l)}_{t}. \end{aligned} $$

(8)
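One possible reading of Eq. (8) is sketched below, under several assumptions that the text does not settle: the weights are flattened into a vector, $\operatorname{Res}^{(l)}_{t}$ is taken to be exactly the weight from i steps earlier, the residual is zero until i earlier weights exist, and `rnn_cell` is a hypothetical stand-in for the actual recurrent unit (here a simple scaling function rather than an LSTM or GRU).

```python
from collections import deque

def evolve_weights(W0, rnn_cell, num_steps, i=3):
    """Version-O style update: W_t = rnn_cell(W_{t-1}) + Res_t."""
    history = deque([W0], maxlen=i)   # holds at most the last i weights
    W_prev = W0
    trajectory = []
    for _ in range(num_steps):
        # Res_t is the weight from i steps earlier; zero until enough
        # history has accumulated (an assumption, not from the paper).
        if len(history) == i:
            res = history[0]
        else:
            res = [0.0] * len(W0)
        W_t = [a + b for a, b in zip(rnn_cell(W_prev), res)]
        history.append(W_t)           # oldest entry is dropped automatically
        trajectory.append(W_t)
        W_prev = W_t
    return trajectory

# Hypothetical recurrent map that halves every weight.
halve = lambda W: [0.5 * w for w in W]
out = evolve_weights([1.0], halve, num_steps=4, i=3)
print(out)
```

In this toy run the first two steps simply decay the weight, and from the third step onward the weight from three steps earlier is added back, so a single unusual step cannot dominate the later weights on its own.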

4.4 Higher-order neighborhood information processor

To seamlessly handle and integrate high-order spatial information, we leverage the AdaBoost algorithm as an adaptive approach to process the neighborhood information. AdaBoost is an important ensemble learning technique. Its main idea is to combine multiple weak learners with low prediction accuracy into a strong learner with high prediction accuracy. In comparison to other boosting methods employed in graph neural networks, AdaBoost offers distinct advantages, including adaptive feature importance, ensemble model diversity, effective error handling, scalability, and versatility. These characteristics collectively render AdaBoost a powerful and advantageous approach.

After calculating the embedding representation of each layer of GCN at the time step t, we use AdaBoost to integrate them to obtain the final embedding, and then select the corresponding algorithm according to the downstream task to calculate the final result.

We use the AdaBoost classification algorithm SAMME.R [24] if the task of the current data set is node classification or link prediction. The process is as shown in Algorithm 1.

If the task of the current data set is regression, the AdaBoost.R2 algorithm is used to aggregate the regression predictions of the learners. The process is shown in Algorithm 2.
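Algorithm 1 is not reproduced in the text here, so the following sketch shows only the aggregation step of the standard SAMME.R formulation, which may differ in detail from the paper's variant. Each learner outputs class probabilities for a sample, each probability vector is converted into a centered log-probability contribution, and the contributions are summed across learners. The function names and the probability values are illustrative assumptions.

```python
import math

def samme_r_contribution(probs, eps=1e-12):
    """Per-learner SAMME.R score: (K - 1) * (log p_k - mean_k' log p_k')."""
    K = len(probs)
    logs = [math.log(max(p, eps)) for p in probs]   # clip to avoid log(0)
    mean_log = sum(logs) / K
    return [(K - 1) * (lp - mean_log) for lp in logs]

def combine_learners(learner_probs):
    """Sum the contributions of all learners and return (label, scores)."""
    K = len(learner_probs[0])
    scores = [0.0] * K
    for probs in learner_probs:
        for k, h in enumerate(samme_r_contribution(probs)):
            scores[k] += h
    label = max(range(K), key=lambda k: scores[k])
    return label, scores

# Class probabilities from three hypothetical learners for one sample.
learner_probs = [[0.6, 0.3, 0.1],
                 [0.2, 0.5, 0.3],
                 [0.7, 0.2, 0.1]]
label, scores = combine_learners(learner_probs)
print(label, [round(s, 3) for s in scores])
```

The second learner disagrees with the other two, but the summed scores still favor class 0 because the other learners assign it high probability.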

5 Experiments

5.1 Data sets

The experimental design involves two different downstream tasks, and the dataset is further divided into two groups based on the properties of each downstream task. The first group consists of four datasets specified for the link prediction task.

Stochastic Block Model. (SBM for short) SBM is a commonly used stochastic graph model to simulate community structure and evolution. We use the data model generated in EvolveGCN.

Bitcoin OTC. (BC-OTC for short) The BC-OTC dataset is a Bitcoin user network. This dataset can be used to predict the polarity of each rating and whether a user will rate another user at the next time step.

Bitcoin Alpha. (BC-Alpha for short) The BC-Alpha is created in the same way as BC-OTC, except that the users and ratings come from a different trading platform.

UC Irvine messages. (UCI for short) The UCI dataset encapsulates an online social network comprising students from the University of California, Irvine, where the exchange of messages between users is depicted through the network’s links. Link prediction emerges as the quintessential task associated with this dataset.

The second group consists of two datasets for traffic speed regression.

SZ-taxi. This data set is taxi speed data in Shenzhen from January 1 to January 31, 2015. The data set counts the traffic speed of each road every 15 minutes.

Los-loop. This data set records the speed of 207 sensors on the Los Angeles Highway from March 1 to March 7, 2012. The dataset counts traffic speed on each road every 5 minutes.

Table 1 shows 4 dynamic graph datasets for link prediction tasks and Table 2 shows 2 dynamic graph datasets for traffic speed regression tasks.

Table 1 Experiment datasets for link prediction

Table 2 Experiment datasets for traffic speed regression

5.2 Evaluation indicators

To evaluate the prediction performance of the BDGNN model, we use several metrics. For the link prediction task, we use two experimental performance evaluation indicators:

Mean Average Precision (MAP):
$$ \begin{aligned} \mathrm{MAP}=\frac{1}{K}\sum _{i=1}^{K}{\mathrm{AP}}_{i}, \end{aligned} $$
(9)
where $\mathrm{AP}_{i}$ represents the average precision of class i, and K represents the total number of classes.
Mean Reciprocal Rank (MRR):
$$ \begin{aligned} \mathrm{MRR}=\frac{1}{N}\sum _{i=1}^{N}\frac{1}{p_{i}}, \end{aligned} $$
(10)
where N represents the total number of samples, and $p_{i}$ represents the rank of the true category of sample i in the predicted ranking.
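Both ranking metrics are short to compute. The sketch below assumes the common binary-relevance definition of average precision (precision averaged over the positions of the relevant items), which the text does not spell out; the function names and the sample inputs are illustrative.

```python
def mean_reciprocal_rank(ranks):
    """Eq. (10): ranks[i] is the 1-based rank p_i of the true item for sample i."""
    return sum(1.0 / p for p in ranks) / len(ranks)

def average_precision(relevance):
    """AP of one ranked list; relevance[k] is 1 if the item at rank k+1 is relevant."""
    hits, total = 0, 0.0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / k          # precision at this relevant position
    return total / hits if hits else 0.0

def mean_average_precision(ranked_lists):
    """Eq. (9): mean of the per-list average precision values."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)

mrr = mean_reciprocal_rank([1, 2, 4])                   # (1 + 1/2 + 1/4) / 3
map_score = mean_average_precision([[1, 0, 1, 0],       # AP = (1/1 + 2/3) / 2
                                    [0, 1]])            # AP = 1/2
print(round(mrr, 4), round(map_score, 4))
```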

For traffic prediction tasks, we use four experimental performance evaluation indicators. $Y_{t}$ denotes the real traffic information, $\hat{Y}_{t}$ denotes the prediction, $\overline{Y}$ is the mean of $Y_{t}$, and N is the number of samples; Y and $\hat{Y}$ without subscripts denote the sets of all $Y_{t}$ and $\hat{Y}_{t}$:

Root Mean Squared Error (RMSE):
$$ \begin{aligned} \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum _{t=1}^{N} (Y_{t} - \hat{Y}_{t})^{2}}. \end{aligned} $$
(11)
Mean Absolute Error (MAE):
$$ \begin{aligned} \mathrm{MAE} = \frac{1}{N} \sum _{t=1}^{N} |Y_{t} - \hat{Y}_{t}|. \end{aligned} $$
(12)
Accuracy:
$$ \begin{aligned} \mathrm{Accuracy}=1-\frac{\|Y-\hat{Y}\|_{F}}{\|Y\|_{F}}. \end{aligned} $$
(13)
Coefficient of Determination ($R^{2}$):
$$ \begin{aligned} R^{2} = 1 - \frac{\sum_{t=1}^{N} (Y_{t} - \hat{Y}_{t})^{2}}{\sum_{t=1}^{N} (Y_{t} - \overline{Y})^{2}}. \end{aligned} $$
(14)
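The four regression indicators of Eqs. (11) to (14) can be computed together. This is a minimal sketch on flat lists of values; the function name `regression_metrics` and the sample numbers are illustrative and are not results from the paper.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, Accuracy and R^2 as defined in Eqs. (11)-(14)."""
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # sum of squared residuals
    mean_y = sum(y_true) / n
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)  # total sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    # For flattened data the Frobenius norm is the Euclidean norm.
    accuracy = 1.0 - math.sqrt(ss_res) / math.sqrt(sum(y * y for y in y_true))
    r2 = 1.0 - ss_res / ss_tot
    return {"RMSE": rmse, "MAE": mae, "Accuracy": accuracy, "R2": r2}

metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print({k: round(v, 4) for k, v in metrics.items()})
```

Lower RMSE and MAE indicate better predictions, whereas Accuracy and $R^{2}$ are better when closer to 1.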

5.3 Results

5.3.1 Link prediction

This paper compares the proposed method with six models in the link prediction task: GCN, GCN-GRU, DynGEM, dyngraph2vecAERNN [25], EvolveGCN-H, and EvolveGCN-O. GCN, GCN-GRU, and the two versions of EvolveGCN all use two-layer graph neural networks, while the two versions of BDGNN use a five-layer learner stack. Table 3 and Table 4 show the comparative experimental results. On the SBM and UCI data sets, BDGNN improves on EvolveGCN and achieves better results. However, there is still an obvious gap to the unsupervised model represented by dyngraph2vecAERNN on the BC-OTC and BC-Alpha data sets, which suggests that unsupervised models that rely solely on graph structures may learn better representations on these datasets.

Table 3 Link prediction experiment MAP comparison

Table 4 Link prediction experiment MRR comparison

5.3.2 Traffic speed regression

In the traffic speed regression task, we compare BDGNN with five models: GCN, GCN-GRU, DCRNN [26], T-GCN [27], and AST-GCN [28]. Table 5 and Table 6 show the comparative experimental results. The prediction accuracy of both versions of BDGNN is higher than that of all other models on the SZ-taxi and Los-loop data sets. The results of version H are better and more stable than those of version O, which suggests that temporal evolution information is more important than topological structure information in the two traffic speed datasets.

Table 5 Comparison of experimental indicators of SZ-taxi

Table 6 Comparison of experimental indicators of Los-loop

5.3.3 Impact of BDGNN on over-smoothing

To confirm that BDGNN mitigates the over-smoothing issue, this study also evaluates the MAP results across various model depths, contrasting BDGNN-O with other GCN-derived models on both the SBM and UCI datasets. Figure 4 illustrates that, as the number of layers increases, the prediction performance of standard GCN models deteriorates markedly due to over-smoothing; the two EvolveGCN variants partly counteract this effect by integrating dynamic graph information, but they still exhibit a general declining trend. In contrast, BDGNN demonstrates a continuous enhancement in performance with additional layers. As depicted in Fig. 5, beyond the fourth layer, the MAP performance of GCN and the two versions of EvolveGCN falls sharply, and after the fifth layer their MAP scores approach zero, whereas BDGNN maintains stable performance. This stability suggests that BDGNN can absorb information from neighbors at different hops selectively, thus maintaining a balance and averting the over-smoothing characteristic of traditional GCN models to a notable extent.

6 Discussion

Compared to existing methods, the proposed BDGNN demonstrates superior performance on both the SBM dataset and the UCI dataset for link prediction tasks. This outcome suggests that incorporating residual systems effectively mitigates the impact of noise in dynamic networks, thereby enhancing prediction accuracy. On the traffic speed regression datasets, BDGNN outperforms other models across all four evaluation metrics, indicating that the integration of high-order neighbor information through AdaBoost enables regression prediction to anticipate changes in such information, resulting in more stable prediction outcomes. Additionally, in the GCN layer comparison experiment, BDGNN exhibits performance improvement rather than degradation with the addition of layers, providing evidence for its effectiveness in alleviating over-smoothing effects. However, because dynamic network data from smart manufacturing is scarce, the experiments rely on datasets from other domains, and the model's adaptability to manufacturing settings may be unsatisfactory.

7 Conclusions

This study introduces the BDGNN model, designed to overcome limitations of existing dynamic graph neural networks: their inadequate capacity to integrate information from high-order neighbors, the weak interplay between temporal and topological data, and suboptimal predictive accuracy. BDGNN uses a Recurrent Neural Network (RNN) to process the temporal dynamics of graphs, and adopts an approach akin to the AdaBoost algorithm, with an RNN-like architecture, to assimilate information from multi-order neighbors in the spatial domain. The results indicate that BDGNN handles both temporal dynamics and graph topology well, surpassing existing approaches in tasks such as link prediction and traffic speed prediction through linear regression. More accurate dynamic graph neural networks for link prediction and regression benefit intelligent manufacturing: they can help optimize logistics and production processes, improve product quality and consistency, and support decision-making and resource allocation, thereby driving the development of smart manufacturing. Future work will focus on selecting or designing GNN architectures suited to the AdaBoost strategy, with the aim of fully exploiting the topological and dynamic aspects of dynamic graphs and remedying the deficiencies observed in conventional GCN methods.

Data availability

The data and material supporting the conclusions of this article will be made available by the authors on request.

References

J. Li, H. Shomer, H. Mao, S. Zeng, Y. Ma, N. Shah, J. Tang, D. Yin, Evaluating graph neural networks for link prediction: current pitfalls and new benchmarking. Adv. Neural Inf. Proces. Syst. 36 (2024). arXiv:2306.10453
H. Wang, Z. Cui, R. Liu, L. Fang, Y. Sha, A multi-type transferable method for missing link prediction in heterogeneous social networks. IEEE Trans. Knowl. Data Eng. 35(11), 10981–10991 (2023)
Q. Tan, X. Zhang, N. Liu, D. Zha, L. Li, R. Chen, S.-H. Choi, X. Hu, Bring your own view: graph neural networks for link prediction with personalized subgraph selection, in Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining (2023), pp. 625–633
K. Xu, W. Hu, J. Leskovec, S. Jegelka, How powerful are graph neural networks? arXiv preprint (2018). arXiv:1810.00826
Y. Wu, Y. Fu, J. Xu, H. Yin, Q. Zhou, D. Liu, Heterogeneous question answering community detection based on graph neural network. Inf. Sci. 621, 652–671 (2023)
D. He, Y. Song, D. Jin, Z. Feng, B. Zhang, Z. Yu, W. Zhang, Community-centric graph convolutional network for unsupervised community detection, in Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence (2021), pp. 3515–3521
A. Cini, I. Marisca, F.M. Bianchi, C. Alippi, Scalable spatiotemporal graph neural networks, in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37 (2023), pp. 7218–7226
S. Dai, J. Wang, C. Huang, Y. Yu, J. Dong, Dynamic multi-view graph neural networks for citywide traffic inference. ACM Trans. Knowl. Discov. Data 17(4), 1–22 (2023)
J. Friedman, T. Hastie, R. Tibshirani, Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). Ann. Stat. 28(2), 337–407 (2000)
Q. Li, Z. Han, X.-M. Wu, Deeper insights into graph convolutional networks for semi-supervised learning, in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018)
Y. Seo, M. Defferrard, P. Vandergheynst, X. Bresson, Structured sequence modeling with graph convolutional recurrent networks, in Neural Information Processing: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13–16, 2018 Proceedings, Part I, vol. 25 (Springer, Berlin, 2018), pp. 362–373
S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
P. Goyal, N. Kamra, X. He, Y. Liu, Dyngem: deep embedding method for dynamic graphs. arXiv preprint (2018). arXiv:1805.11273
A. Pareja, G. Domeniconi, J. Chen, T. Ma, T. Suzumura, H. Kanezashi, T. Kaler, T. Schardl, C. Leiserson, Evolvegcn: evolving graph convolutional networks for dynamic graphs, in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34 (2020), pp. 5363–5370
K. Takeuchi, Y. Kawahara, T. Iwata, Structurally regularized non-negative tensor factorization for spatio-temporal pattern discoveries, in Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2017, Skopje, Macedonia, September 18–22, 2017, Proceedings, Part I, vol. 10 (Springer, Berlin, 2017), pp. 582–598
M. Shi, Y. Huang, X. Zhu, Y. Tang, Y. Zhuang, J. Liu, Gaen: graph attention evolving networks, in Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI) (2021)
P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph attention networks. arXiv preprint (2017). arXiv:1710.10903
K. Cho, B. Van Merriënboer, D. Bahdanau, Y. Bengio, On the properties of neural machine translation: encoder-decoder approaches. arXiv preprint (2014). arXiv:1409.1259
J. You, T. Du, J. Leskovec, Roland: graph learning framework for dynamic graphs, in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2022), pp. 2358–2366
K. Xu, C. Li, Y. Tian, T. Sonobe, K.-I. Kawarabayashi, S. Jegelka, Representation learning on graphs with jumping knowledge networks, in International Conference on Machine Learning (PMLR, 2018), pp. 5453–5462
Y. Rong, W. Huang, T. Xu, J. Huang, Dropedge: towards deep graph convolutional networks on node classification. arXiv preprint (2019). arXiv:1907.10903
K. Sun, Z. Zhu, Z. Lin, Adagcn: adaboosting graph convolutional networks into deep models, in International Conference on Learning Representations (2021). arXiv:1908.05081v3
F. Wu, A. Souza, T. Zhang, C. Fifty, T. Yu, K. Weinberger, Simplifying graph convolutional networks, in International Conference on Machine Learning (PMLR, 2019), pp. 6861–6871
T. Hastie, S. Rosset, J. Zhu, H. Zou, Multi-class adaboost. Stat. Interface 2(3), 349–360 (2009)
P. Goyal, S.R. Chhetri, A. Canedo, dyngraph2vec: capturing network dynamics using dynamic graph representation learning. Knowl.-Based Syst. 187, 104816 (2020)
Y. Li, R. Yu, C. Shahabi, Y. Liu, Diffusion convolutional recurrent neural network: data-driven traffic forecasting. arXiv preprint (2017). arXiv:1707.01926
L. Zhao, Y. Song, C. Zhang, Y. Liu, P. Wang, T. Lin, M. Deng, H. Li, T-GCN: a temporal graph convolutional network for traffic prediction. IEEE Trans. Intell. Transp. Syst. 21(9), 3848–3858 (2019)
J. Zhu, Q. Wang, C. Tao, H. Deng, L. Zhao, H. Li, AST-GCN: attribute-augmented spatiotemporal graph convolutional network for traffic forecasting. IEEE Access 9, 35973–35983 (2021)

Acknowledgements

I extend my heartfelt thanks to Professor Lin for her invaluable guidance and insights throughout the writing of this thesis. Her expertise has been instrumental in shaping this work.

Funding

This research was funded by Guangdong Provincial Natural Science Foundation of China under Grant No. 2021A1515011243, Guangdong Provincial Science and Technology Plan Project under Grant No. 2019B010139001 and Guangzhou Science and Technology Plan Project under Grant No. 201902020016.

Author information

Authors and Affiliations

School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, China
Zi-chao Chen & Sui Lin

Authors

Zi-chao Chen
Sui Lin

Contributions

Conceptualization, ZC; methodology, ZC; validation, ZC; writing—original draft preparation, ZC; writing—review and editing, SL; supervision, SL; project administration, SL; funding acquisition, SL. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Sui Lin.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors have read and agreed to the published version of the manuscript.

Competing interests

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Chen, Zc., Lin, S. A binary-domain recurrent-like architecture-based dynamic graph neural network. Auton. Intell. Syst. 4, 11 (2024). https://doi.org/10.1007/s43684-024-00067-9

Received: 17 April 2024
Revised: 09 May 2024
Accepted: 13 May 2024
Published: 25 June 2024
DOI: https://doi.org/10.1007/s43684-024-00067-9
