1 Introduction

The rapid development of the Internet has greatly facilitated people's lives, but it has also led to a significant increase in the volume of data available, resulting in an issue known as information overload. With the plethora of products and services available, individuals often struggle to choose items that best meet their needs. Recommendation systems play a crucial role in addressing this problem, primarily by inferring user preferences based on their historical interaction records and providing personalized recommendations, thereby offering more intelligent services. Collaborative filtering (CF) is a classical method that effectively generates recommendations from implicit feedback. In recent years, powerful graph neural networks (GNNs) have been introduced into recommendation algorithms. GNNs model interaction data as a graph, typically a user–item interaction graph, and leverage graph neural networks to learn node representations for recommendation, achieving promising results. However, GNN-based recommendation models still face certain limitations: interaction data, whether user–item or item–item, is often sparse or noisy, making it challenging for nodes to learn reliable representations, which in turn impacts recommendation performance. Existing GNN-based recommendation models rely on explicit user–item interactions for learning node representations and do not fully leverage the potential relationships between nodes, such as user–user or item–item similarities.

Recent research has highlighted the significant advantages of Contrastive Learning (CL) [1,2,3] in the field of representation learning. Owing to its ability to extract comprehensive features from a vast pool of unlabeled data and to regularize representations in a self-supervised manner, CL has made substantial strides in various research domains [4,5,6,7]. Because CL does not require labeled data, it has emerged as a viable solution to the data sparsity problem in recommendation systems. A typical approach, outlined in reference [6], applies CL to recommendation as follows. First, data augmentation, for instance random edge/node dropout at a certain ratio, is applied to the user–item bipartite graph. The objective is then to maximize the similarity between representation vectors of the same node in different views while minimizing the similarity between representation vectors of different nodes. In this approach, the CL task serves as an auxiliary task and is jointly optimized with the supervised main recommendation task.

Subsequently, researchers have conducted studies that revealed that the presence or absence of graph augmentation has a relatively minor impact on recommendation performance [8]. Instead, what significantly influences recommendation performance is the optimization of the contrastive loss, such as InfoNCE loss [9], irrespective of whether graph augmentation is utilized. Optimizing the contrastive loss InfoNCE can result in more uniform user/item representations, implicitly addressing popularity bias [10]. In light of these findings, we propose a graphless CL approach. From a technical perspective, it discards graph augmentation based on dropout and instead introduces random uniform noise directly into the original representations to achieve data augmentation. Applying different random noises generates distinctions between contrastive views. Compared to graph augmentation, adding noise can lead to more uniformly learned representation distributions and improved efficiency. Furthermore, as demonstrated in references [11,12,13], leveraging latent relationships between nodes (e.g., user–user or item–item similarity) can enrich graph information and enable the construction of more meaningful contrastive learning tasks for recommendation.

Hence, we propose a graph neural network-based recommendation model that integrates contrastive learning (CL). The model employs the advanced LightGCN as its graph neural network component. The contrastive learning part focuses on two aspects: a straightforward and efficient data augmentation contrastive strategy and a prototype-based contrastive learning strategy built on semantic neighbors. Specifically, for data augmentation we adopt a graphless CL method. For semantic neighbors, we apply prototype-based contrastive learning to capture the relevance between nodes (users or items) and their prototypes. Intuitively, prototypes can be regarded as the centroids of clusters of semantically similar neighbors in the embedding space. Since prototypes are latent, we further propose to infer them with the expectation–maximization (EM) algorithm. By incorporating these additional relationships, our model achieves significant improvements over existing GNN-based implicit-feedback recommendation methods and outperforms existing contrastive learning approaches in our experiments.

2 Related Work

2.1 Recommendation System

Collaborative filtering is the fundamental recommendation method: it filters a vast amount of information by leveraging the collective feedback, ratings, and opinions of users, and selects the information most likely to interest a target user based on this collaborative input from the user community. Let \(\mathbf{U}\) and \(\mathbf{I}\) be the sets of users and items, respectively, and let \(O^{+} = \{ y_{ui} \mid u \in \mathbf{U}, i \in \mathbf{I} \}\) be the set of observed interactions, where \(y_{ui}\) indicates that user \(u\) has interacted with item \(i\). Most existing GNN-based recommendation models organize the user–item interactions into a bipartite graph \(G = (V, E)\), where the node set \(V = \mathbf{U} \cup \mathbf{I}\) contains all users and items and the edge set \(E = O^{+}\) represents the observed interactions.

Generally speaking, the core of GNN-based collaborative filtering is to apply a neighborhood aggregation scheme on the graph \(G\), updating each node's representation by aggregating the representations of its neighbors. This can be divided into the two stages below:

$$ z_{u}^{(l)} = f_{\text{combine}}\left( z_{u}^{(l-1)},\, f_{\text{aggregate}}\left( \{ z_{i}^{(l-1)} \mid i \in N_{u} \} \right) \right), $$
(1)
$$ z_{u} = f_{\text{readout}}\left( \left\{ z_{u}^{(l)} \mid l = 0, \ldots, L \right\} \right), $$
(2)

where \(z_{u}^{(l)}\) is the representation of user \(u\) at the \(l\)-th layer. Each layer first aggregates the neighbor representations from the previous layer and then combines them with the node's own representation. \(N_{u}\) is the neighborhood set of user \(u\) in the interaction graph \(G\). Many designs of \(f_{\text{aggregate}}(\cdot)\) and \(f_{\text{combine}}(\cdot)\) have been proposed [14,15,16,17]. After \(L\) layers of iterative propagation, the representations of all layers are obtained, and a readout function generates the final representation used for prediction. Common designs of \(f_{\text{readout}}(\cdot)\) include using the last layer directly, concatenating the layer representations, or taking their weighted sum.
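
As an illustration of this two-stage scheme, the following is a minimal PyTorch-style sketch of a single propagation layer (Eq. (1)) and a readout step (Eq. (2)). The mean aggregator, the averaging combine function, and the `neighbors` mapping are illustrative assumptions rather than the design of any particular model.

```python
import torch

def propagate(z_prev, neighbors):
    """One generic propagation layer (Eq. (1)): aggregate neighbor embeddings,
    then combine them with the node's own embedding. `z_prev` is an (N, d)
    tensor of layer-(l-1) embeddings; `neighbors` maps each node id to a
    non-empty list of neighbor ids (the sets N_u)."""
    z_next = torch.empty_like(z_prev)
    for u, nbrs in neighbors.items():
        agg = z_prev[nbrs].mean(dim=0)        # f_aggregate: mean pooling (one possible choice)
        z_next[u] = 0.5 * (z_prev[u] + agg)   # f_combine: average with the node's own embedding
    return z_next

def readout(layer_embs):
    """f_readout (Eq. (2)): here a plain mean over the L+1 layer representations."""
    return torch.stack(layer_embs, dim=0).mean(dim=0)
```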

The LightGCN utilized in this paper simplifies the message propagation process by forgoing the use of non-linear activation functions and feature transformations. This design choice makes the model more efficient and streamlined.

2.2 Contrastive Learning

The idea of self-supervised learning is to set up an auxiliary task that extracts additional supervisory signal from the input data itself, thereby exploiting the unlabeled data space. Compared with supervised learning, self-supervised learning makes use of unlabeled data by modifying the input to mine negative samples that are difficult to distinguish from positive samples, yielding significant improvements on downstream tasks. Contrastive learning is one form of self-supervised learning.

In the fields of computer vision (CV) and natural language processing (NLP), contrastive learning combined with the self-supervised learning framework has recently achieved remarkable results. Prior work proposed SimCLR, a simple framework for contrastive learning of visual representations that studies random image augmentations. Contrastive learning trains the encoder by maximizing the agreement between two augmented views of the same instance. The core idea is that the distance between a sample and a similar positive sample should be far smaller than the distance between that sample and a dissimilar negative sample.

Recent research shows that self-supervised learning can effectively improve the performance of recommendation models, in particular their recommendation quality on long-tail items. Reference [6] introduces self-supervised learning into the recommendation model and proposes SGL, a new recommendation method that supplements the supervised recommendation task with self-supervised learning on the user–item graph.

We employ two contrastive learning strategies: a straightforward and efficient data augmentation contrastive strategy and a prototype-based contrastive learning strategy based on semantic neighbors. Compared to traditional contrastive learning methods, these strategies are more concise and exhibit more pronounced improvements in performance. Moreover, they consider the latent semantic relationships within the structure, enhancing the model's effectiveness from various perspectives.

3 Method

The following three parts introduce the proposed graph neural network recommendation model based on contrastive learning. The overall framework of the model is shown in Fig. 1. First, the graph collaborative filtering component, i.e., the first part of the model, is introduced in Sect. 3.1. Using the user–item interaction graph, the advanced graph neural network recommendation method LightGCN is adopted to obtain the final representations of user \(u\) and item \(i\); the likelihood that a user adopts an item is then predicted via the inner product, and the Bayesian personalized ranking (BPR) loss is computed. After that, a concise and efficient data augmentation contrastive method and a prototype contrastive strategy based on semantic neighbors are introduced in Sects. 3.2 and 3.3, respectively, and the neighbor relationships are integrated into contrastive learning to obtain the contrastive loss (InfoNCE). Finally, a multi-task learning strategy is adopted in Sect. 3.4 to properly coordinate collaborative filtering and contrastive learning.

Fig. 1
figure 1

Architecture of GNNCL

3.1 Graph Collaborative Filtering

GNN-based recommendation models construct a bipartite graph of the interactions between users and items, and then perform neighborhood aggregation and propagation on the bipartite graph \(G\) to generate the final user and item representations used for prediction. Considering its effectiveness and simplicity, this paper uses LightGCN [18] to model the observed interactions between users and items; it abandons non-linear activation functions and feature transformations in message propagation. The embedded representations of a user and an item at the \((l+1)\)-th layer are given as follows:

$$ z_{u}^{(l+1)} = \sum_{i \in N_{u}} \frac{1}{\sqrt{|N_{u}|\,|N_{i}|}}\, z_{i}^{(l)}, $$
(3)
$$ z_{i}^{(l+1)} = \sum_{u \in N_{i}} \frac{1}{\sqrt{|N_{i}|\,|N_{u}|}}\, z_{u}^{(l)}, $$
(4)

where \(z_{u}^{(l)}\) and \(z_{i}^{(l)}\) are the embedded representations of user \(u\) and item \(i\) obtained after the \(l\)-th propagation layer, \(N_{u}\) is the set of items that user \(u\) has interacted with, and \(N_{i}\) is the set of users who have interacted with item \(i\). \(\frac{1}{\sqrt{|N_{u}|\,|N_{i}|}}\) is the symmetric normalization term and follows the standard GCN design.

After L layers’ propagation, the weighted sum function is used as the readout function to combine the representations of all layers, and the final representation is obtained as follows:

$$ z_{u} = \frac{1}{L+1}\sum_{l=0}^{L} z_{u}^{(l)}, \qquad z_{i} = \frac{1}{L+1}\sum_{l=0}^{L} z_{i}^{(l)}, $$
(5)

where \(z_{u}\) and \(z_{i}\) are the final representations of the user and the item. The inner product is then taken to predict the likelihood of user \(u\) interacting with item \(i\):

$$ \hat{y}_{u,i} = z_{u}^{\mathrm{T}} z_{i} . $$
(6)
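
For concreteness, the sketch below implements Eqs. (3)–(6) in matrix form with PyTorch. It assumes a precomputed sparse adjacency matrix over all user and item nodes whose entries are the normalization terms \(1/\sqrt{|N_{u}|\,|N_{i}|}\), which is equivalent to the per-node sums above; it is a simplified illustration rather than the exact implementation used in the experiments.

```python
import torch

def lightgcn_forward(norm_adj, emb0, num_layers):
    """LightGCN propagation (Eqs. (3)-(4)) followed by the mean readout (Eq. (5)).
    `norm_adj`: sparse (|U|+|I|) x (|U|+|I|) symmetrically normalized adjacency;
    `emb0`: the 0-th layer embeddings of all users and items stacked row-wise."""
    embs = [emb0]
    for _ in range(num_layers):
        # purely linear propagation: no activation, no feature transformation
        embs.append(torch.sparse.mm(norm_adj, embs[-1]))
    return torch.stack(embs, dim=0).mean(dim=0)  # z = (1/(L+1)) * sum_l z^(l)

def predict(z_user, z_item):
    """Eq. (6): inner product between the final user and item representations."""
    return (z_user * z_item).sum(dim=-1)
```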

To directly derive information from the interactions, we employ the Bayesian personalized ranking (BPR) loss [19], a commonly used ranking objective in recommendation. The formula is as follows:

$$ \mathcal{L}_{BPR} = \sum_{(u,i,j) \in O} -\log \sigma\left( \hat{y}_{u,i} - \hat{y}_{u,j} \right), $$
(7)

where \(\sigma(\cdot)\) is the sigmoid function and \(O = \{ (u,i,j) \mid (u,i) \in O^{+}, (u,j) \in O^{-} \}\) is the set of training triples; \(j\) denotes an item that user \(u\) has not interacted with, i.e., an unobserved item.
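
A minimal PyTorch sketch of the BPR loss in Eq. (7), assuming that a negative item \(j\) has already been sampled for each observed pair \((u, i)\):

```python
import torch.nn.functional as F

def bpr_loss(z_u, z_pos, z_neg):
    """Eq. (7): sum of -log sigmoid(y_ui - y_uj) over training triples (u, i, j),
    where i is an observed item of user u and j a sampled unobserved item."""
    y_ui = (z_u * z_pos).sum(dim=-1)   # predicted score for the positive item
    y_uj = (z_u * z_neg).sum(dim=-1)   # predicted score for the negative item
    return -F.logsigmoid(y_ui - y_uj).sum()
```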

3.2 Data Augmentation

The user–item bipartite graph is constructed based on observed user–item interactions and contains collaborative filtering signals. Specifically, the first-hop neighbors (nodes connected by a path of length 1) directly describe the user and item nodes and can be considered as pre-existing characteristics of users (or items). The second-hop adjacent nodes of users (or items) exhibit similar user behavior (or the audience for items). Furthermore, high-order paths from users to items reflect the user's potential interest in the item. Exploring inherent patterns in the graph structure aids in node representation learning. Therefore, in reference [6], three operations were designed on the graph structure: node dropout, edge dropout, and random walk, to create different node views, serving the purpose of graph augmentation.

However, studies have shown that in CL, even highly sparse graph augmentation (e.g., an edge dropout rate of 0.9) can still yield the expected performance improvements. This phenomenon is counterintuitive, because such a high dropout rate results in a significant loss of the original information and a highly skewed graph structure. Experiments in reference [8] found that the uniformity of the node representation distribution is the key to the performance improvement, as a more uniform representation distribution can retain intrinsic node features and enhance generalization.

We propose a graphless CL approach that abandons graph augmentation based on dropout. Instead, it introduces random uniform noise directly into the original representations to achieve representation-level data augmentation. Applying different random noises generates distinctions between contrastive views. Compared to graph augmentation, adding noise directly regularizes the embedding space into a more uniform distribution, making it easy to implement and more efficient.

Formally, given a node \(i\) and its representation \(\mathbf{e}_{i}\) in the \(d\)-dimensional embedding space, we can achieve the following representation-level augmentation:

$$ \mathbf{e}'_{i} = \mathbf{e}_{i} + \Delta'_{i}, \qquad \mathbf{e}''_{i} = \mathbf{e}_{i} + \Delta''_{i}, $$
(8)

where the added noise vectors \(\Delta'_{i}\) and \(\Delta''_{i}\) are subject to \(\|\Delta\|_{2} = \varepsilon\) and \(\Delta = \bar{\Delta} \odot \mathrm{sign}(\mathbf{e}_{i})\), with \(\bar{\Delta} \in \mathbb{R}^{d}\) drawn from \(U(0,1)\). The first constraint, \(\|\Delta\|_{2} = \varepsilon\), controls the magnitude of the noise and is equivalent to restricting \(\Delta\) to a hypersphere of radius \(\varepsilon\). The second constraint, \(\Delta = \bar{\Delta} \odot \mathrm{sign}(\mathbf{e}_{i})\), requires the noise vectors \(\Delta'_{i}\) and \(\Delta''_{i}\) to lie in the same quadrant as the original representation \(\mathbf{e}_{i}\), avoiding the excessive bias that unconstrained noise would introduce.
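
One way to realize both constraints in PyTorch is to draw uniform noise, align its signs with \(\mathbf{e}_{i}\), and rescale it onto the \(\varepsilon\)-radius hypersphere; the sketch below is an illustrative assumption about these sampling details rather than the exact implementation.

```python
import torch
import torch.nn.functional as F

def add_uniform_noise(e, eps=0.1):
    """Eq. (8): perturb each representation with a noise vector of L2 norm eps
    that lies in the same quadrant as e (i.e., shares its signs)."""
    noise = torch.rand_like(e) * torch.sign(e)     # uniform noise aligned with e's signs
    noise = F.normalize(noise, p=2, dim=-1) * eps  # project onto the eps-radius hypersphere
    return e + noise

# two independent draws yield the two contrastive views e' and e''
# e1, e2 = add_uniform_noise(e), add_uniform_noise(e)
```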

By adding noise, one can think of it as rotating the original representation vectors in space by small angles. When the rotation angles are small, most of the original information is preserved while introducing differences. This rotation can then be used for contrastive learning. The noise vectors are illustrated in Fig. 2.

Fig. 2
figure 2

Data augmentation based on random noise

Specifically, after adding noise to the representations, the two representations of the same node are regarded as a positive pair (i.e., \(\{ (\mathbf{e}'_{i}, \mathbf{e}''_{i}) \mid i \in V \}\)), and the representations of different nodes are regarded as negative pairs (i.e., \(\{ (\mathbf{e}'_{i}, \mathbf{e}''_{j}) \mid i, j \in V,\, i \ne j \}\)). The contrastive loss InfoNCE is used to maximize the similarity between the different representation vectors of the same node and to minimize the similarity between representation vectors of different nodes. The formula is as below:

$$ \mathcal{L}_{S} = \sum_{i \in V} -\log \frac{\exp\left( s(\mathbf{e}'_{i}, \mathbf{e}''_{i}) / \tau \right)}{\sum_{j \in V} \exp\left( s(\mathbf{e}'_{i}, \mathbf{e}''_{j}) / \tau \right)}, $$
(9)

In the formula, \(s(\cdot)\) is the cosine similarity function used to measure the similarity between two vectors, and \(\tau\) is the temperature hyperparameter. The node set \(V\) is split into user nodes and item nodes, and their contrastive losses are computed and summed.
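
In a mini-batch implementation, Eq. (9) reduces to a cross-entropy over cosine-similarity logits, where each node's two noised views form the positive pair and the other nodes in the batch act as negatives. The following PyTorch sketch is illustrative and assumes the two view matrices are row-aligned by node:

```python
import torch
import torch.nn.functional as F

def info_nce(view1, view2, tau=0.2):
    """Eq. (9): InfoNCE over cosine similarities with temperature tau."""
    v1 = F.normalize(view1, dim=-1)
    v2 = F.normalize(view2, dim=-1)
    logits = v1 @ v2.t() / tau                            # pairwise cosine similarities / tau
    labels = torch.arange(v1.size(0), device=v1.device)   # diagonal entries are the positives
    return F.cross_entropy(logits, labels, reduction='sum')
```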

3.3 Semantic Neighborhood

Semantic neighbors refer to two nodes in the graph that cannot be reached by a direct path but share similar features (for item nodes) or preferences (for user nodes). Inspired by previous work [20], we can identify semantic neighbors by learning latent prototypes for each user and item. Building upon this idea, we further propose prototype-based contrastive learning to explore potential semantic neighbors and incorporate them into the contrastive learning framework to better capture the semantic characteristics of users and items in collaborative filtering.

Specifically, similar users and items often reside in adjacent embedding spaces, and prototypes represent the centers of clusters that correspond to groups of semantic neighbors. Therefore, we apply clustering algorithms to the embeddings of users and items to obtain prototypes for users or items. Since this process cannot be directly optimized end-to-end, we employ the EM algorithm for prototype contrastive learning. In formal terms, the objective of the GNN model is to maximize the following log-likelihood function:

$$ \sum_{i \in V} \log p\left( \mathbf{e}_{i} \mid \Theta, R \right) = \sum_{i \in V} \log \sum_{\mathbf{c}_{i} \in C} p\left( \mathbf{e}_{i}, \mathbf{c}_{i} \mid \Theta, R \right), $$
(10)

where \(\Theta\) is the set of model parameters, \(R\) is the interaction matrix, and \(\mathbf{c}_{i}\) is the latent prototype of node \(i\). Following this approach, we can define the optimization objectives for users and items separately. The proposed prototype contrastive learning objective, based on InfoNCE, then minimizes the following function:

$$ \mathcal{L}_{P} = \sum_{i \in V} -\log \frac{\exp\left( \mathbf{e}_{i} \cdot \mathbf{c}_{i} / \tau \right)}{\sum_{\mathbf{c}_{j} \in C} \exp\left( \mathbf{e}_{i} \cdot \mathbf{c}_{j} / \tau \right)}, $$
(11)

where \(\mathbf{c}_{i}\) is the prototype of node \(i\), obtained by clustering all node embeddings with the K-means algorithm, yielding \(K\) clusters over all users (and likewise over all items). The node set \(V\) is again split into user nodes and item nodes, and their contrastive losses are computed and summed.
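
As an illustrative sketch of one EM iteration, the code below uses scikit-learn's K-means for the E-step (estimating prototypes as cluster centroids) and Eq. (11) for the M-step; the choice of clustering backend and the exact update schedule are assumptions made for illustration only.

```python
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def prototype_loss(emb, k=1000, tau=0.2):
    """E-step: cluster the current embeddings into k prototypes.
    M-step (Eq. (11)): pull each node towards its own prototype and push it
    away from the other k-1 prototypes via an InfoNCE-style loss."""
    kmeans = KMeans(n_clusters=k, n_init=10).fit(emb.detach().cpu().numpy())
    protos = torch.tensor(kmeans.cluster_centers_, dtype=emb.dtype, device=emb.device)
    assign = torch.tensor(kmeans.labels_, dtype=torch.long, device=emb.device)
    logits = emb @ protos.t() / tau   # dot-product similarity to every prototype
    return F.cross_entropy(logits, assign, reduction='sum')
```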

3.4 Multi-task Training

We leverage a multi-task training strategy to jointly optimize the classic recommendation task and the two contrastive learning objectives:

$$ \mathcal{L} = \mathcal{L}_{BPR} + \lambda_{1} \mathcal{L}_{S} + \lambda_{2} \mathcal{L}_{P} + \lambda_{3} \left\| \Theta \right\|_{2}, $$
(12)

where \(\lambda_{1}\) and \(\lambda_{2}\) are hyperparameters controlling the weights of the two contrastive objectives, \(\lambda_{3}\) controls the strength of the regularization term, and \(\Theta\) is the set of GNN model parameters.
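
A minimal sketch of assembling the joint objective in Eq. (12); using the squared L2 norm for the regularizer is a common practical choice assumed here for illustration.

```python
def total_loss(l_bpr, l_s, l_p, params, lambda1, lambda2, lambda3):
    """Eq. (12): BPR loss + weighted contrastive losses + L2 regularization.
    `params` is an iterable of model parameter tensors (Theta)."""
    reg = sum((p ** 2).sum() for p in params)  # squared L2 norm as a practical stand-in for ||Theta||_2
    return l_bpr + lambda1 * l_s + lambda2 * l_p + lambda3 * reg
```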

4 Experiments

4.1 Datasets

Using three publicly available datasets, namely Yelp2018, Amazon-Book, and Alibaba-iFashion, we filter out users and items with fewer than 15 interactions. In the experiments, the division of the training set, validation set, and test set is done randomly in a ratio of 8:1:1. The statistical information for the three datasets is summarized in Table 1.

Table 1 Statistics of the datasets

4.2 Evaluation Criteria

This paper focuses on the Top-K recommendation scenario and employs the two most commonly used evaluation metrics in recommendation algorithms to measure the algorithm's performance, namely Recall and Normalized Discounted Cumulative Gain (NDCG). The value of K is set to 10, and a full-ranking strategy is used to rank all candidate items that the user has not interacted with.
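
For reference, the following is a small NumPy sketch of the per-user Recall@K and NDCG@K computation under the full-ranking protocol; it only illustrates the standard metric definitions and is not the evaluation code actually used.

```python
import numpy as np

def recall_ndcg_at_k(ranked_items, ground_truth, k=10):
    """`ranked_items`: full ranking over candidate items for one user;
    `ground_truth`: set of that user's held-out test items."""
    top_k = ranked_items[:k]
    hits = np.array([1.0 if item in ground_truth else 0.0 for item in top_k])
    recall = hits.sum() / max(len(ground_truth), 1)
    dcg = (hits / np.log2(np.arange(2, len(hits) + 2))).sum()
    ideal_hits = min(len(ground_truth), k)
    idcg = (1.0 / np.log2(np.arange(2, ideal_hits + 2))).sum()
    ndcg = dcg / idcg if idcg > 0 else 0.0
    return recall, ndcg
```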

4.3 Experimental Parameter Settings

This experiment is implemented on top of the popular open-source recommendation framework RecBole [21]. All model parameters are initialized with the Xavier method [22], the optimizer is Adam, the learning rate is 0.001, the batch size is 2048, the embedding size is 64, \(\lambda_{1}\) and \(\lambda_{2}\) are chosen from the range [1e−10, 1e−6], \(\varepsilon\) is 0.1, and, following [6], \(\tau\) is set empirically to 0.2.

4.4 Compared Models

We compare the proposed method with the following four methods.

  • BPRMF [19]: This method uses a matrix factorization (MF) framework and optimizes the Bayesian personalized ranking (BPR) loss to learn the latent representations of users and items, ultimately for making recommendations.

  • NGCF [23]: A graph-based collaborative filtering method implemented with a standard graph convolutional network (GCN). It utilizes the user–item bipartite graph and encodes second-order interaction features into the messages during message passing.

  • LightGCN [18]: The recommendation method based on a simplified and enhanced graph convolutional network (GCN) includes only the most crucial components of GCN. It forgoes the use of non-linear activation functions and feature transformations during message propagation. In the end, it computes weighted sums of the representations of users and items at each layer and uses these for the final predictions.

  • SGL [6]: The method introduces self-supervised learning to enhance recommendations. After applying graph augmentation using edge dropout, it employs contrastive learning and jointly optimizes it with the supervised learning objective, such as the Bayesian personalized ranking (BPR) loss. This combination enhances the recommendation process.

4.5 Contrastive Experiment

Table 2 presents a performance comparison between the proposed method and four other methods on three datasets. From Table 2, we can observe the following: Compared to traditional methods like BPRMF, which are based on matrix factorization, the methods based on GNNs perform better in integrating higher-order information from the bipartite graph into representations. Among the two GNN-based collaborative filtering models, LightGCN outperforms NGCF on most datasets, indicating the effectiveness of simplifying the traditional graph convolutional network architecture. SGL, which incorporates a contrastive learning model, consistently outperforms other models that do not include contrastive learning on all three datasets. This demonstrates the effectiveness of contrastive learning in improving recommendation performance. However, SGL compares the representations of the original graph with those obtained after graph augmentation. The representations from this graph augmentation are not uniform, and research indicates that the main reason for performance improvement is not the graph augmentation. Additionally, SGL overlooks other potential relationships in the recommendation system, such as user similarity and item similarity.

Table 2 Performance comparison of different recommendation models

Finally, it can be seen that the recommendation algorithms proposed in this paper outperform other algorithms, indicating that integrating contrastive learning and exploring semantic neighbors is helpful in improving recommendation performance in GNN-based recommendations.

It is worth noting that the improvement on the third dataset is less pronounced. The reason may be that the density of Alibaba-iFashion is lower than that of the other two datasets; the dataset densities are shown in Table 1.

4.6 Ablation Experiment

The method proposed in our paper leverages contrastive learning in two aspects. To validate the effectiveness of these two contrastive learning components, we conducted ablation experiments to assess their respective contributions by comparing them with LightGCN. The results are shown in Fig. 3, where “Variant 1” and “Variant 2” represent the models obtained by removing either data augmentation or semantic neighbors, respectively.

Fig. 3
figure 3

Ablation experimental results

It can be observed that removing either relationship leads to a performance drop, and both variants outperform the baseline LightGCN. This suggests that explicitly modeling both of these relationships is advantageous for improving the performance of graph collaborative filtering. Furthermore, these two relationships complement each other and enhance performance in different aspects.

4.7 Impact of Data Sparsity

To validate that the proposed GNNCL method can mitigate the sparsity of interaction data, we conducted experiments in this section to evaluate its performance on data with varying levels of sparsity. Specifically, we divided all users into four groups based on the number of interactions, keeping the total number of interactions the same in each group. G1 represents the group with the lowest average number of interactions. We then compared the performance (Recall@10) of GNNCL and the more advanced LightGCN on these four groups of data.

From Fig. 4, we can observe that the performance of the GNNCL method consistently outperforms LightGCN. Additionally, as the average number of interactions decreases, the performance gain from this method also increases. This implies that GNNCL can provide high-quality recommendations even for sparse interaction data, thanks to the proposed contrastive learning techniques.

Fig. 4
figure 4

Experimental results of the influence of data sparsity

4.8 Effect of Prototype k Value

To study the effect of prototype contrastive learning, we varied the value of k from 0 to 2500. The results are shown in Fig. 5, where the x-axis represents the value of k, and the y-axis represents Recall@10. The GNNCL method with different values of k consistently outperforms the baseline LightGCN, with the best performance achieved when k is around 1000. This indicates that a large number of prototypes can better mitigate the noise introduced by data augmentation. Performance significantly deteriorates when we set k to 0, highlighting the usefulness of semantic neighbors in improving recommendation performance.

Fig. 5
figure 5

Experimental results of influence of prototype k value

5 Conclusion

This paper presents a graph neural network recommendation model that integrates contrastive learning. It achieves contrastive learning by adding random uniform noise to the representations of user and item nodes and exploring semantic neighbors. It is characterized by efficiency and simplicity, and it addresses the issue of data sparsity, leading to more reliable node representations and significantly improved recommendation performance. Experiments on three public datasets validate the effectiveness of the proposed method. In the future, this framework can be extended to other recommendation tasks such as sequence recommendation and bundle recommendation.