1 Introduction

With the increasing popularity of online social media, many interactions between people are generated and recorded on the web [1]. Most interactions are positive relationships, such as liking, trusting, and supporting [2]. At the same time, networks also contain negative relationships, such as disliking, distrusting, and opposing [3]. In a real network, people may like each other to different degrees. In Fig. 1, person A likes person B very much, while person B likes person A less. These different levels of semantics (such as trust and dislike) are primarily asymmetric. For example, on two Bitcoin trading platforms (Bitcoin-Alpha and Bitcoin-OTC), users can rate each other on a scale from −10 to 10, with higher scores indicating more trust and lower scores indicating less. Such ratings can be captured by directed weighted signed networks (WSNs). The level of trust and distrust varies among people, and modeling these interactions can help people to better explore the patterns in network data [4, 5] and enhance network mining tasks [6, 7], such as link prediction [8], node ranking [9], and community detection [10].

Fig. 1 Weighted signed networks

Graph embedding, or representation learning, aims to learn a low-dimensional representation of the nodes of a graph [7, 11, 12]. Such a representation can then be applied to downstream mining tasks on graphs [13]. Graph neural networks (GNNs) have achieved good results in many machine learning tasks [14], such as semi-supervised node classification [15] and link prediction [16]. Many recent studies [17] have focused on using GNNs to learn node embeddings by collecting information from each node's neighbors. These GNN-based network embedding methods have revolutionized the field of network embedding and achieved state-of-the-art performance [7, 18]. Combining sociological theory [18,19,20,21,22,23] with GNN-based node embedding for signed networks is one of the main directions of current research. Derr et al. [20] use balance theory to aggregate and propagate the information of multilayer neighbor nodes in a signed GCN model. Li et al. [21] introduce an attention mechanism into this approach to learn different weights between nodes. Huang et al. [22] divided the network structure into different motifs and used GAT to learn node neighbor information under each motif. Huang et al. [23] later extended the sampling method of GraphSAGE [24] to signed networks. All these studies greatly promoted the progress of signed network embedding. However, the above methods do not consider edge weights. Edge weight is an essential feature for node embedding, and it helps us better capture the impact of node neighbors on the target node [25]. Considering link weights can effectively improve link prediction accuracy with good stability, and combining node features with the corresponding edge weight values has low computational complexity.

Based on the above analysis, we propose WSNN, a novel network embedding framework that learns weighted signed network embeddings via GNNs. Compared with traditional GNNs, we redesign the aggregators and loss functions by considering relevant social theories. WSNN injects edge weight information when aggregating the information of adjacent nodes, and it aggregates information from adjacent nodes with different edge weights by stacking multiple neural network layers. Therefore, the node embeddings learned by WSNN can capture both local structural information and global information in the signed network. To train our model, we reconstruct three essential parts of the weighted signed directed network: sign, direction, and triangle.

The significant contributions of this paper are as follows:

  • We introduce a new layer-by-layer weight signed directed relation GNN model. It aggregates and propagates information between nodes under different signed directed relation definitions.

  • After reviewing the two fundamental social psychology theories of signed network analysis, we extend these two theories to weighted signed networks and make an empirical analysis. Guided by two theories, our loss functions consist of reconstructing signs, directions, and triangles.

  • We perform link sign prediction experiments on real signed social network datasets to demonstrate the validity of our proposed model.

The rest of this paper is organized as follows. In Sect. 2, related works are given. Section 3 introduces sociological theory and extension. Section 4 presents the WSNN framework. The experimental studies are shown in Sect. 5. Finally, the conclusions are given in Sect. 6.

2 Related Work

Our problem of signed network embedding connects to a large body of work on signed network representation learning and recent advancements in applying GNNs.

2.1 Signed Network Embedding

At present, research [26] on signed networks can be divided into similarity-based methods and social-theory-based methods. SNE [27] is a random walk method that adopts the log-bilinear model, uses the representations of all nodes on a given path, and combines two signed vectors to capture the positive or negative relationship of each edge. SiNE [28] introduced structural balance theory based on reference 8, modeled it with a neural network, sampled triples with positive and negative relationships, and ensured that the distance between node pairs with a positive relationship was much smaller than that between node pairs with a negative relationship. Huang et al. [22] introduce the attention mechanism into signed networks for the first time. According to status theory, the neighbors of a node are divided by structure into 38 motifs, so that different attention weights are allocated between node pairs and neighbors are aggregated with different weights. However, the SiGAT [22] model is ineffective in dealing with sparse networks and high-dimensional network structures. According to the principle of structural balance, Derr et al. [20] proposed a flexible notion of node neighborhood to aggregate the high-order neighbors of nodes over balanced and unbalanced structures. However, the model does not consider the direction of edges or the interaction between nodes. Therefore, Li et al. [21] introduce the attention mechanism into this method, allocate different attention coefficients between node pairs, and use them to aggregate node neighbor information. SDGNN [23] defines four signed directed relations and proposes a layer-by-layer signed relation GNN to aggregate and propagate the information of nodes in signed networks.

2.2 Graph Neural Networks

Many recent studies have focused on using GNNs to learn node embeddings by collecting information from each node's neighbors. Graph neural networks learn node embeddings by combining the topological structure and the node attribute information of the graph [29]. This kind of learning is transductive: it cannot be directly generalized to nodes that did not appear during training. GraphSAGE [24] showed that node embeddings can be computed by gathering the neighbor information of nodes through a learned aggregation function, and that the trained aggregation function can embed unseen nodes. GCN learns the vector representation of nodes by aggregating neighbor information through multiple convolution layers, so that the feature vector of a target node incorporates information from across the graph. The graph attention network GAT [30] is independent of the graph structure and aggregates neighbor information by learning different attention coefficients between nodes. These methods have been extended to signed networks to learn node embeddings, for example in SGCN [20], SNEA [21], SiGAT [22], and SDGNN [23], which has substantially advanced the field of signed network embedding.

3 Preliminaries

In this section, we extend two sociological theories to the weight signed network. Several notations are listed in Table 1.

Table 1 Notations and explanations

3.1 Theoretical Knowledge

A weighted signed network (WSN) [25] is a directed, weighted graph G = (V, E, W), where V is a set of users, \(E \subseteq V \times V\) is a set of edges, and W assigns a weight to each edge. W(u, v) represents the degree to which user u likes or dislikes user v.

Structural Balance Theory Balance theory [31] classifies cycles in a signed network as balanced or unbalanced. It implies that cycles with an even number of negative edges are more plausible and should be more prevalent in real networks. For simplicity, we illustrate balanced structures with triangles [32]. More specifically, balanced triangles with three positive edges in Fig. 2a capture the notion that “the friend of my friend is my friend,” while those with two negative edges in Fig. 2b imply that “the enemy of my enemy is my friend.”
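The balance criterion for triangles can be illustrated with a minimal Python sketch. The function name `is_balanced` and the encoding of signs as +1 and −1 are our own assumptions for illustration; a cycle is treated as balanced when it contains an even number of negative edges.

```python
def is_balanced(signs):
    """Return True if a cycle with the given edge signs (+1 or -1) is balanced.

    A cycle is balanced when it has an even number of negative edges.
    """
    negatives = sum(1 for s in signs if s < 0)
    return negatives % 2 == 0


# "The friend of my friend is my friend": three positive edges.
print(is_balanced([+1, +1, +1]))  # True
# "The enemy of my enemy is my friend": two negative edges, one positive.
print(is_balanced([-1, -1, +1]))  # True
# A single negative edge gives an unbalanced triangle.
print(is_balanced([+1, +1, -1]))  # False
```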

Fig. 2 Two sociological theories in signed networks

Status Theory Status theory [33] defines an organizing principle for signed links in signed directed networks. In status theory, a positive link from u to v implies that v has a higher status from the perspective of u (Fig. 2c). In contrast, a negative link indicates that v is regarded as having a lower status (Fig. 2d). In signed networks, these relative levels of status can be propagated and aggregated throughout the network.

Comparison of Balance Theory and Status Theory Both theories have been successfully applied to signed networks. Structural balance theory is used to aggregate information about the neighbors of nodes, to correctly distinguish between the friends and enemies of a node, and to deal with negative links in signed networks. Status theory is used to identify the neighbors of nodes with different structural types, and it is less widely used than structural balance theory. After the network is modeled with structural balance and status theory, GNNs are used to obtain vector representations of nodes by aggregating neighbor information, which achieves the best results in signed network embedding.

3.2 Sociological Theory Expansion

Definition 1

(Measuring node status) Most previous work on signed networks did not consider edge weights, and there was no good way to estimate the status of nodes. Edge weights can be used as a measure of node status [25]. The prediction made by this measure is the difference between the status of vertex u and vertex v, defined as \(\sigma (u) - \sigma (v)\). The status of vertex k is determined as \(d(k) = \sigma (|W_{{{\text{in}}}}^{ + } (k)| - |W_{{{\text{in}}}}^{ - } (k)| + |W_{{{\text{out}}}}^{ + } (k)| - |W_{{{\text{out}}}}^{ - } (k)|)\). The status increases when positive incoming edges are received and negative outgoing edges are generated to other vertices. In contrast, the status decreases when negative incoming edges are received and positive outgoing edges are generated. The status difference measures how much "higher" the status of u is than the status of v. We extend the measure trivially to include weights instead of only signs.
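As a rough illustration of Definition 1, the sketch below computes d(k) from a list of weighted directed edges. Several points are assumptions rather than statements from the text: each |W| term is read as the sum of absolute weights in that group, σ is taken to be the sigmoid function, and the names `node_status` and `sigmoid` are ours. The four terms are combined exactly as printed in the expression for d(k); the accompanying prose describes outgoing edges with the opposite sign, and swapping `out_pos` and `out_neg` in the last line gives that variant.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def node_status(edges, k):
    """Status d(k) of node k from (source, target, weight) edges.

    Assumption: each |W| term is the sum of absolute weights in that group.
    """
    in_pos = sum(w for u, v, w in edges if v == k and w > 0)
    in_neg = sum(-w for u, v, w in edges if v == k and w < 0)
    out_pos = sum(w for u, v, w in edges if u == k and w > 0)
    out_neg = sum(-w for u, v, w in edges if u == k and w < 0)
    # Combination of the four terms as printed in the definition of d(k).
    return sigmoid(in_pos - in_neg + out_pos - out_neg)


# Toy network with hypothetical ratings.
edges = [("a", "b", 3), ("c", "b", 2), ("b", "c", -1), ("a", "c", -4)]
q_ab = node_status(edges, "a") - node_status(edges, "b")  # status difference
```

Here `q_ab` plays the role of the status difference between two endpoints of an edge.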

Definition 2

(Energy in triangles) In a signed network without weights, status theory can be used to estimate the balance of a triangle. We fuse the weight values on the edges to extend the balance of the triangle to the weighted signed network. In Fig. 3, we define the edge from node i to node j as the non-transmitting edge, and the edges from node i to node k and from node k to node j as the transmitting edges. Negative edges can be treated as inverse positive edges. If the sum of the weights of the transmitting edges is greater than or equal to the weight of the non-transmitting edge, then the triangle weight is conserved; otherwise, it is not. We will use these two extensions of the theory in the loss function section.
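The conservation condition in Definition 2 can be written as a one-line check. This is a minimal sketch that assumes negative edges have already been converted to inverse positive edges, so all three weights are non-negative; the function name `triangle_is_conserved` is hypothetical.

```python
def triangle_is_conserved(w_ij, w_ik, w_kj):
    """Check the conservation condition for the triangle (i, j, k).

    w_ij is the weight of the non-transmitting edge i -> j; w_ik and w_kj
    are the weights of the transmitting edges i -> k and k -> j.
    """
    return w_ik + w_kj >= w_ij


print(triangle_is_conserved(5, 3, 4))  # True: 3 + 4 >= 5
print(triangle_is_conserved(9, 3, 4))  # False: 3 + 4 < 9
```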

Fig. 3 Conservation of triangular energy

4 The Proposed Framework

In this section, we propose a novel network embedding framework, WSNN, for weighted signed networks. The model is divided into three parts: an embedding layer, a weighted graph aggregator, and a downstream task layer. The embedding layer stores the initial vector representation of nodes and the weight matrix between nodes. The weighted graph aggregator learns the structural information of the graph and propagates the updated node features. The downstream task layer performs link prediction to test the embedding quality of the model (Fig. 4).

Fig. 4 The weighted GNN algorithm of WSNN. Embedding layer: initializes the node vectors and produces the weighted adjacency matrix; weight aggregator layer: propagates and aggregates information about neighbors; downstream task layer: checks the quality of the node representation vectors

4.1 Embedding Layer

This layer initializes the node vectors and generates the weighted adjacency matrix. By default, the node vector dimension is set to d = 32. The adjacency matrix represents the weight information between nodes and their neighbors. In real social networks, the influence of users is not equal. If the link weight from user A to user B is greater than that from user A to user C, B can be considered more important to A than C. To record the specific value of this importance, we introduce an adjacency matrix that stores the weight values among users. In a signed network, the weight from user A to B is not the same as that from user B to A; the two carry different meanings and should be treated differently.

4.2 Weight Aggregator Layer

Aggregating Process We use the graph neural network framework GraphSAGE [24] to learn graph structure information inductively. It samples a batch of 500 nodes and their neighbors and aggregates the neighbors to generate a representation of each node. GraphSAGE [24] implements mini-batch training by storing each node's neighbors and computing each node's embedding separately. However, this method gives equal weight to all neighbors of each order. We therefore propose a new weighted graph aggregator layer to replace GraphSAGE's aggregator. Our weighted graph aggregator layer can be summarized as follows:

$$h_{i}^{k} = \sigma \left( {W^{k - 1} {\text{AGG}}\left( {h_{i}^{k - 1} ,h_{N(i)}^{k - 1} } \right) + b^{k - 1} } \right)$$
(1)

where \(h_{i}^{k}\) and \(h_{i}^{k - 1}\) represent the embeddings of node i in layers k and k − 1 (the default node vector in the experiment is set to 32 dimensions), \(h_{N(i)}^{k - 1}\) denotes the embeddings of the neighbors of node i in layer k − 1, and \(W^{k - 1}\) and \(b^{k - 1}\) are the weight matrix and bias vector, respectively. The aggregation function AGG combines these embeddings using the normalized weighted adjacency matrix.

Multiple Aggregator Work In a signed network, the direction, the sign, and the weight of an edge constitute different relations and semantic information, forming various types of neighbors. Different kinds of neighbors should be treated differently. According to the status and balance theories, we divide links into four types by direction and sign and use four aggregators to aggregate the corresponding neighbor information. Each aggregator combines the information of the node itself (through a self-loop) with that of its neighbors. The vectors learned by the aggregators are concatenated, and an MLP (multilayer perceptron) maps the result to the node representation.

$${\text{Z}}_{i}^{k + 1} = MLP(CONCAT(X_{{r_{1} }}^{l} (u),X_{{r_{2} }}^{l} (u),X_{{r_{i} }}^{l} (u)))$$
(2)

where \({\text{Z}}_{i}^{k + 1}\) represents the embedding of node i in layer k + 1 and \(X_{r}^{l}\) denotes the output of the single aggregator for relation type r.
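To make the grouping step concrete, the following sketch splits the neighbors of a node into four groups by edge direction and sign. The text does not list the four relation types explicitly, so the labels `pos_out`, `pos_in`, `neg_out`, and `neg_in` and the function name `split_neighbors` are assumptions for illustration only. Each group would then be passed to its own aggregator before concatenation.

```python
def split_neighbors(edges, node):
    """Group the neighbors of `node` by edge direction and sign.

    `edges` is a list of (source, target, weight) triples with nonzero
    weights. The four relation names are illustrative labels.
    """
    groups = {"pos_out": [], "pos_in": [], "neg_out": [], "neg_in": []}
    for u, v, w in edges:
        sign = "pos" if w > 0 else "neg"
        if u == node:
            groups[f"{sign}_out"].append((v, w))
        if v == node:
            groups[f"{sign}_in"].append((u, w))
    return groups


edges = [("a", "b", 3), ("c", "a", -2), ("a", "c", -4), ("b", "a", 1)]
groups = split_neighbors(edges, "a")
```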

Single Aggregator Work For a single aggregator, the scale of the edge weights can vary from node to node. Thus, we normalize the weighted adjacency matrix A as follows:

$$a_{ij} = d_{i} \times \frac{{|a_{ij} |}}{{\sum\nolimits_{j \in N(i)} {|a_{ij} |} }}$$
(3)

where \(a_{ij}\), an element of A, represents the weight of the edge from node j to node i, and \(d_{i}\) denotes the in-degree of node i.
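The normalization in Eq. (3) can be sketched in plain Python as follows. The function name `normalize_adjacency` is hypothetical, and two details are assumptions: the in-degree of node i is counted as the number of nonzero entries in row i, and rows without incoming edges are left as zeros.

```python
def normalize_adjacency(A):
    """Row-normalize a weighted adjacency matrix following Eq. (3).

    A[i][j] is the weight of the edge from node j to node i (0 if absent).
    Each entry becomes d_i * |a_ij| / sum_j |a_ij|, where d_i is the
    in-degree of node i (number of nonzero entries in row i).
    """
    normalized = []
    for row in A:
        total = sum(abs(a) for a in row)
        degree = sum(1 for a in row if a != 0)
        if total == 0:
            # Node without incoming edges: keep an all-zero row.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([degree * abs(a) / total for a in row])
    return normalized


A = [
    [0, 2, -6],
    [4, 0, 0],
    [0, 0, 0],
]
A_norm = normalize_adjacency(A)
```

With this scaling, the entries of each non-empty row sum to the in-degree of the node, so a node with uniform incoming weights recovers the unweighted case.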

Because of batch effects and missing values, we add learnable parameters to each edge as a confidence matrix while leveraging the context of the one-hop neighborhood of nodes in the weighted graph. For node j, we use a learnable shared parameter \(B_{j}\) as the confidence value of the edges interacting with node j. Another learnable parameter, \(\alpha\), serves as the confidence value of each node's self-loop edge. Its value is shared between types:

$$h_{i}^{k} = \sigma \left( {W^{k - 1} \frac{{\alpha h_{i}^{k - 1} + \sum\nolimits_{j \in N(i)} {B_{j} a_{ij} h_{j}^{k - 1} } }}{1 + |N(i)|} + b^{k - 1} } \right)$$
(4)
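A single aggregation step in the spirit of Eq. (4) might look like the sketch below. It is not the authors' implementation. It reads the self-loop coefficient as α and the neighbor messages as the embeddings of the neighbors j, takes σ to be the sigmoid function, and uses the hypothetical name `aggregate_layer`; in a real model W, b, B, and α would be learned.

```python
import math


def aggregate_layer(H, A_norm, B, alpha, W, b):
    """One weighted aggregation step in the spirit of Eq. (4).

    H: node embeddings (n x d_in); A_norm[i][j]: normalized weight of the
    edge j -> i; B: per-node confidence values; alpha: self-loop confidence;
    W: weight matrix (d_out x d_in); b: bias vector (d_out).
    """
    n, d_in = len(H), len(H[0])
    out = []
    for i in range(n):
        neighbors = [j for j in range(n) if A_norm[i][j] != 0]
        # Self-loop term plus confidence-weighted neighbor messages.
        mixed = [alpha * H[i][c] for c in range(d_in)]
        for j in neighbors:
            for c in range(d_in):
                mixed[c] += B[j] * A_norm[i][j] * H[j][c]
        mixed = [m / (1 + len(neighbors)) for m in mixed]
        # Linear map followed by a sigmoid (an assumed choice for sigma).
        row = []
        for r in range(len(W)):
            pre = sum(W[r][c] * mixed[c] for c in range(d_in)) + b[r]
            row.append(1.0 / (1.0 + math.exp(-pre)))
        out.append(row)
    return out


H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A_norm = [[0.0, 0.5, 1.5], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
H_next = aggregate_layer(H, A_norm, B=[1.0, 1.0, 1.0], alpha=1.0,
                         W=[[1.0, 0.0], [0.0, 1.0]], b=[0.0, 0.0])
```

Stacking this step k times lets a node receive information from neighbors up to k hops away.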

4.3 Downstream Task

New vector representations of nodes are obtained by training the WSNN model, and the quality of these vector representations is tested by link prediction. To obtain more faithful vector representations, we design a loss function, repeatedly update the model through backpropagation, and finally obtain the vector representation of each target node.

Loss Function For training our WSNN model, we design an objective function to learn the parameters of WSNN. We design three loss functions to reconstruct three critical features of weight signed networks: sign, direction, and triangle.

For sign, we use the following cross-entropy loss function to model the sign between two nodes:

$$\begin{aligned} & s_{u,v} = \sigma (z_{u}^{T} z_{v} ) \\ & \xi_{{{\text{sign}}}} (u,v) = - y_{uv} \log (s_{u,v} ) - (1 - y_{uv} )\log (1 - s_{u,v} ) \\ & \xi_{{{\text{sign}}}} = \sum\limits_{{e_{u,v} \in \varepsilon }} {\xi_{{{\text{sign}}}} (u,v)} \\ \end{aligned}$$
(5)

where σ is the sigmoid function, \(y_{uv}\) is the sign ground truth, and \(\varepsilon\) is the edge list with signs.
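The per-edge sign loss in Eq. (5) can be sketched directly. The function name `sign_loss` is ours, and the encoding of the ground truth as 1 for a positive edge and 0 for a negative edge is an assumption.

```python
import math


def sign_loss(z_u, z_v, y_uv):
    """Cross-entropy sign loss for one edge, as in Eq. (5).

    y_uv is 1 for a positive edge and 0 for a negative edge (assumed encoding).
    """
    score = sum(a * b for a, b in zip(z_u, z_v))
    s_uv = 1.0 / (1.0 + math.exp(-score))
    return -y_uv * math.log(s_uv) - (1 - y_uv) * math.log(1 - s_uv)


z_u, z_v = [1.0, 2.0], [0.5, 0.5]
loss_if_positive = sign_loss(z_u, z_v, 1)
loss_if_negative = sign_loss(z_u, z_v, 0)
```

Because the inner product of the two embeddings is positive in this example, the loss is smaller when the edge is labeled positive than when it is labeled negative.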

For direction, as discussed before, we evaluate the real status of nodes in the network by edge weights. We denote the status ranking score of node i by \(s(z_{i})\) and use the following square loss function to measure the difference between the predicted status relationship value \(s(z_{u} ) - s(z_{v} )\) of the edge \(e_{uv}\) and the ground truth value \(q_{uv}\):

$$\begin{aligned} & \xi_{{{\text{direct}}}} (u \to v) = (q_{uv} - (s(z_{u} ) - s(z_{v} )))^{2} \\ & s(z) = {\text{sigmoid}}(W \cdot z + b) \\ & q_{uv} = d(u) - d(v) \\ & d(x) = \sigma (|W_{{{\text{in}}}}^{ + } (x)| - |W_{{{\text{in}}}}^{ - } (x)| + |W_{{{\text{out}}}}^{ + } (x)| - |W_{{{\text{out}}}}^{ - } (x)|) \\ \end{aligned}$$
(6)

where \(s(z)\) is a score function that maps an embedding z to a score, and W is a learnable parameter vector. We denote the true status scores of nodes u and v by \(d(u)\) and \(d(v)\), which are calculated from the values of the edge weights.
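A minimal sketch of the direction loss in Eq. (6) is given below. The names `status_score` and `direction_loss` are hypothetical, and the parameter vector `w` and bias `b` are fixed by hand here, whereas in training they would be learned. The target `q_uv` is assumed to be precomputed from the edge weights.

```python
import math


def status_score(z, w, b):
    """Score function s(z) = sigmoid(w . z + b)."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))


def direction_loss(z_u, z_v, q_uv, w, b):
    """Squared difference between the target q_uv and s(z_u) - s(z_v)."""
    predicted = status_score(z_u, w, b) - status_score(z_v, w, b)
    return (q_uv - predicted) ** 2


w, b = [0.1, 0.2], 0.0
# Identical embeddings give a predicted difference of zero,
# so the loss reduces to the squared target.
same = direction_loss([1.0, 2.0], [1.0, 2.0], 0.3, w, b)
```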

Based on energy in triangles, we calculate the loss of the triangle. The edge on the node pair (i, j) is the non-transmitting edge. The edges on the node pairs (i, k) and (k, j) are the transmitting edges. The following formula calculates the loss value:

$$\xi_{{{\text{triangle}}}} = p_{ij} - (p_{ik} + p_{kj} )$$
(7)

where \(p_{ij}\) is the weight value of the edge of the node pair (i, j). The weight on the edge of the node pair (i, j) is obtained by taking the inner product of the vector of node i and the vector of node j.
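Eq. (7) can be evaluated for a single triangle as in the sketch below, which takes the predicted edge weights to be inner products of the node embeddings. The helper names `dot` and `triangle_loss` are ours. The value is computed exactly as the formula is written, so it is negative when the predicted triangle is conserved; whether the value is clamped at zero in practice is not stated in the text.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def triangle_loss(z_i, z_j, z_k):
    """Eq. (7): p_ij - (p_ik + p_kj), with p given by embedding inner products."""
    return dot(z_i, z_j) - (dot(z_i, z_k) + dot(z_k, z_j))


# p_ij = 2, p_ik = 1, p_kj = 3, so the value is 2 - (1 + 3) = -2.
value = triangle_loss([1.0, 0.0], [2.0, 1.0], [1.0, 1.0])
```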

Based on the sign, direction, and triangle loss function, the overall objective function is written as:

$$\xi_{{{\text{loss}}}} = \xi_{{{\text{sign}}}} + \lambda_{1} \xi_{{{\text{direct}}}} + \lambda_{2} \xi_{{{\text{triangle}}}}$$
(8)

where \(\lambda_{1}\) and \(\lambda_{2}\) are the weights of the different loss functions. Equation 8 shows that our loss functions are designed to reconstruct the various properties of weighted signed networks.

To train our WSNN model on such large networks, we use mini-batch stochastic gradient descent to update the parameters of WSNN. Batch computation requires recombining the neighborhoods of the nodes in each batch. In this paper, the batch size is 500. The training procedure is summarized in Algorithm 1.

Algorithm 1 Training procedure of WSNN
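The mini-batch iteration used during training can be sketched as follows. This only illustrates how nodes might be shuffled and grouped into batches of 500; the function name `iterate_batches` and the fixed seed are assumptions, and the gradient update itself is omitted.

```python
import random


def iterate_batches(nodes, batch_size=500, seed=0):
    """Yield shuffled mini-batches of nodes; the last batch may be smaller."""
    order = list(nodes)
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


# 1200 hypothetical node ids give batches of 500, 500, and 200 nodes.
batches = list(iterate_batches(range(1200)))
```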

5 Link Sign Prediction Experiments

In this section, we conduct link sign prediction to check whether our model improves the performance of signed network embeddings. Link sign prediction is the task of predicting the unobserved signs of existing edges in the test dataset given the training dataset (Derr et al. [20]; Li et al. [21]; Huang et al. [22]). We follow their experimental settings and compare our method against some state-of-the-art embedding methods.

The vector representation of nodes is obtained by training WSNN. Eighty percent of the edges were randomly selected as the training set, and these edges were used to generate the vector representation of nodes through the WSNN model. The remaining 20% of the edges were used as the test set, and the node vector representations learned from the training set were input into a binary logistic regression model as node features for the experiment. The prediction performance on the edge signs in the test set was then measured.
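The split described above can be sketched as follows. The text does not say how the two node vectors of an edge are combined into classifier features, so concatenation is an assumption here, as are the function names `split_edges` and `edge_features` and the toy embeddings. The logistic regression step itself is left out.

```python
import random


def split_edges(edges, train_ratio=0.8, seed=0):
    """Randomly split signed edges into training and test sets."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def edge_features(edge_list, embeddings):
    """Build classifier inputs and binary sign labels for each edge.

    Assumption: an edge (u, v) is represented by concatenating z_u and z_v.
    """
    X = [embeddings[u] + embeddings[v] for u, v, w in edge_list]
    y = [1 if w > 0 else 0 for u, v, w in edge_list]
    return X, y


# Toy data: 10 signed edges and hypothetical 2-dimensional embeddings.
edges = [(i, i + 1, 1 if i % 3 else -1) for i in range(10)]
embeddings = {i: [float(i), 1.0] for i in range(11)}
train_edges, test_edges = split_edges(edges)
X_test, y_test = edge_features(test_edges, embeddings)
```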

5.1 Datasets

Bitcoin-OTC is a who-trusts-whom network of people who trade using Bitcoin on the Bitcoin-OTC platform. Since Bitcoin users are anonymous, there is a need to maintain a record of users' reputations to prevent transactions with fraudulent and risky users. Members of Bitcoin-OTC rate others on a scale of −10 (total distrust) to +10 (complete trust). This is the first explicit weighted signed directed network available for research (Table 2).

Table 2 Statistics of datasets

5.2 Baselines

To validate the effectiveness of WSNN, we compare it with several baselines, including unsigned network embedding methods, signed embedding methods, and signed graph neural networks.

  • DeepWalk [34] uses a random walk algorithm to extract a sequence of vertices from the graph and uses natural language processing tools to represent each vertex as a vector of dimension d.

  • SiNE [28] proposed a multilayer neural network to learn the embeddings by optimizing an objective function satisfying structural balance theory. SiNE only concentrated on the immediate neighborhoods rather than the global balance structure.

  • SGCN [20] was a GCN specialized for signed network analysis. Balance theory was leveraged to aggregate and propagate the information of signed networks across signed GCN layers.

  • SNEA [21] was built upon graph attention networks to capture balance theory. In particular, it leveraged masked self-attention layers to aggregate the rich information from neighboring nodes.

  • SiGAT [22] incorporated balance theory and status theory to model signed directed networks based on motifs. Specifically, they defined 38 motifs, including directed edges, signed edges, and triangles. The graph convolution layer of SiGAT consists of 38 GAT aggregators, each corresponding to a neighborhood under a motif definition.

  • SDGNN [23] aggregates messages from different signed directed relation definitions. It can apply multilayers to capture high-order structure information.

5.3 Comparison of Baseline Model Experiments

In this section, accuracy, macro-F1, F1, and AUC are used as evaluation metrics to verify the results of link sign prediction. The higher the metric value, the more accurate the prediction.
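For reference, the four metrics can be computed from true signs and predicted scores as in the sketch below. It assumes that labels are encoded as 1 (positive) and 0 (negative), that F1 refers to the positive class, and that a threshold of 0.5 turns scores into predictions; the function names are ours. The AUC computation requires at least one edge of each sign.

```python
def f1_for_class(y_true, y_pred, cls):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def evaluate(y_true, scores, threshold=0.5):
    y_pred = [1 if s >= threshold else 0 for s in scores]
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    f1 = f1_for_class(y_true, y_pred, 1)
    macro_f1 = (f1 + f1_for_class(y_true, y_pred, 0)) / 2
    # AUC: probability that a random positive edge outscores a random negative one.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return {"accuracy": accuracy, "f1": f1, "macro_f1": macro_f1, "auc": auc}


metrics = evaluate([1, 1, 1, 0, 0], [0.9, 0.8, 0.4, 0.6, 0.1])
```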

  a. Unsigned network embedding methods (DeepWalk) achieve good results on the metrics even though they use only positive links. This indicates that structural information matters.

  b. SiNE is a deep learning framework that utilizes extended structural balance theory. Compared with DeepWalk, its metrics are significantly improved, which demonstrates the feasibility of modeling a signed network with balance theory.

  c. SNEA uses the attention mechanism to learn attention coefficients between nodes, which makes up for a deficiency of SGCN. Its experimental results are better than those of SGCN, indicating that the attention mechanism can effectively improve the quality of node embeddings by allocating different weights between node pairs.

  d. Compared with SiGAT, the SDGNN model can use multiple layers to capture high-order structural information. Experiments show that two convolution layers achieve the best effect. Both SDGNN and SNEA can process higher-order structural information. SDGNN uses the GraphSAGE idea to aggregate higher-order structural information, while SNEA aggregates higher-order neighbor information through structural balance theory. The experimental results show that SDGNN is more effective than SNEA at aggregating higher-order neighbor information.

  e. The WSNN model deals with weighted signed networks. The experimental results show that its metrics are better than those of the above methods, indicating that adding edge weights can improve the quality of signed network embedding (Table 3).

Table 3 The results of link sign prediction on two datasets

5.4 Parameter Analysis and Ablation Study

In this section, we investigate the effects of hyperparameters and perform some ablation studies to analyze our model architecture design. As in the previous section, we chose Bitcoin-Alpha as our dataset and selected 80% training edges and 20% test edges.

Parameter Analysis In this subsection, we analyze the following hyperparameters: the number of epochs, the embedding dimension d, \(\lambda_{1}\), and \(\lambda_{2}\).

When analyzing a particular hyperparameter, the other parameters are set to their default values. Figure 5 shows the link sign prediction performance of the WSNN model under different parameters. Figure 5a and b show that as the number of training epochs increases, the loss value gradually decreases, while the AUC value increases, converges progressively, and finally becomes stable. Figure 5c shows the influence of the node vector dimension d on the experimental results. The effect is optimal when the vector dimension is about 32. As the dimension increases further, performance decreases somewhat, which may be caused by over-fitting.

Fig. 5 Parameter analysis for sign prediction

As for \(\lambda_{1}\) and \(\lambda_{2}\), defined in the loss function above, we give them different values to see the impact on link prediction. In the \(\lambda_{1}\) analysis, other values are default parameters and \(\lambda_{2} = 4\); in the \(\lambda_{2}\) analysis, other values are default parameters and \(\lambda_{1} = 8\). Fig. 6 shows that the model performs better when \(\lambda_{1}\) and \(\lambda_{2}\) are not zero.

Fig. 6 Various values of \(\lambda_{1}\) and \(\lambda_{2}\)

Ablation Study In this subsection, we do some ablation studies to discuss the design of aggregators and loss functions. We also experimented on the Bitcoin-Alpha dataset, setting it up as above.

Our improved aggregator can theoretically aggregate information from all nodes in the graph. With increasing network depth, the model's prediction performance might be expected to improve, while the computational cost also increases. Fig. 7 shows, however, that link prediction performance is best when the number of layers is 2.

Fig. 7 The effect of stacking multiple layers

Table 4 shows that reconstructing signs with the sign loss alone performs poorly. When direction and triangle are also considered, the results improve. This demonstrates that our training objectives should consider both directions and triangles. Signs, directions, and triangles are vital features of signed directed networks.

Table 4 Ablation study on loss functions

6 Conclusion

In this paper, we study weighted signed directed network representation learning. We first analyze two fundamental social theories in signed directed networks (i.e., balance theory and status theory) and then extend them to obtain new interpretations that apply to weighted signed directed networks. Under the guidance of sociological theory, we propose WSNN to encode signed networks as network embeddings. WSNN uses the weighted adjacency matrix to learn the embedded representation of nodes, and it can use multiple layers to capture higher-order structural information. To train WSNN, we introduce a combined loss function consisting of a sign loss, a direction loss, and a triangle loss. We conducted experiments on real signed networks and showed that our proposed WSNN performed better than other state-of-the-art baselines.

In future work, we will extend this method to heterogeneous networks to incorporate more complex semantic information.