1 Introduction

In online group buying, buyers gather like-minded people to purchase the same products, leveraging the collective bargaining power of a large number of people to obtain group discounts [1, 2]. In recent years, benefiting from advanced electronic payment technology and convenient express delivery, group buying has prospered on online shopping services such as TaoBao, Groupon, and PDD.

Fig. 1 A toy example of co-purchase: (a) a simple co-purchase pattern; (b) similarity of co-purchasers in one co-purchase on TaoBao. The distance between nodes is proportional to their shortest path length, and sim denotes the cosine similarity of their shopping histories

In real applications, a co-purchase usually includes the following steps. First, merchants promise to offer products or services at a discount on the condition that a certain number of customers make the purchase; as shown in Fig. 1a, a transaction involves multiple buyers, which is the essential feature of co-purchase. Then, an initiator manually invites friends, followers, and like-minded people to participate in the purchase. Finally, co-purchasers accept the invitation and benefit from a lower price that is unavailable to an individual buyer [3].

Finding the right co-purchasers is a key step in the co-purchase process. How should appropriate co-purchasers be chosen? There are two approaches: manual invitation by the initiator and automatic recommendation by a recommendation system. Although the former is the classic solution commonly used in industry [3], it has many flaws, such as inefficiency, insufficient co-purchaser demand, and a limited number of participants. The latter is more promising, as the superiority of recommendation systems has been proven in many other areas [4,5,6,7].

We therefore consider using a recommendation system to recommend co-purchasers. As shown in Table 1, co-purchaser recommendation differs substantially from traditional recommendation tasks, and existing recommendation methods are not suitable for co-purchase scenarios.

Recommending items for a user is a hot topic, and previous researchers have made a lot of great contributions [8,9,10]. However, group buying is a distinctive online shopping pattern, where choosing a product is just the beginning of group buying. We need to pay more attention to who is the appropriate co-purchaser for this deal.

The purpose of group recommendation is to recommend acceptable items to a group [11, 12], as in gym music recommendation [13] and teaching material recommendation [11]. These tasks all presuppose an existing stable group, such as all members of a club or all students in a class. In group buying, however, as mentioned above, the group is gradually gathered by the initiator rather than existing before the recommendation.

Collaborator recommendation is closer to the real co-purchase situation than group recommendation. A co-purchase transaction may be regarded as a collaboration between the initiator and the co-purchaser, so we can address the co-purchaser problem by building upon the positive experiences of previous collaborator recommendation tasks. Much literature has been published [14,15,16,17] on collaborator recommendation systems and their real-world applications, such as co-author recommendation in academic social networks [14, 16], developer recommendation in open-source communities [15, 18], and co-star recommendation in the film industry [19].

Table 1 Difference between co-purchaser recommendation and other recommendation

In the above collaborator recommendation tasks, finding robustly similar users for the target user is a core task. For example, in an academic network, people tend to collaborate repeatedly with fellow researchers who work on close research topics [14, 22, 23]. It is also a classic idea in many recommendation algorithms, such as the typical user-based collaborative filtering approach, which identifies the target user's interests and preferences by aggregating the most similar users [8]. However, co-purchaser recommendation is a special scenario in which not all co-purchasers are highly similar to the initiator. As shown in Fig. 1b, a large number of weakly similar users also participate in the co-purchase transaction, but they usually go unnoticed by existing recommendation methods.

Co-purchaser recommendation is a challenging task, since identifying potential co-purchasers among the weakly similar users is not easy. To tackle the problem, we propose two embedding strategies that capture weakly similar co-purchasers from different perspectives. In the first one, we propose a multi-layered learning architecture with PathSim [22] diffusion, namely PathSim diffused structural deep network embedding (PDSDNE), which connects weakly similar users by PathSim and directly imposes a penalty on the mapping error of the weakly similar users. It is a straightforward strategy that benefits the weakly similar users, but it inevitably damages the original network structure. In the second one, we devise a co-occurrence model based on truncated walking paths, namely co-purchasers to vectors (cop2vec). More specifically, cop2vec smoothly increases the co-occurrence probability of the weakly similar co-purchasers through a truncated biased walk and thus learns a more reasonable representation for co-purchasers. In this way, not only the co-purchasers who are highly similar to the initiator but also the potential co-purchasers who are weakly similar to the initiator are placed close to the initiator.

This paper builds on the approach of [21] and adds a more refined recommendation process. Compared with [21], the extensions can be summarized in two parts.

First, we compare four classic recommendation tasks: item recommendation, group recommendation, collaborator recommendation, and co-purchaser recommendation. The comparison shows that co-purchaser recommendation is a challenging new recommendation task. Second, we improve the existing recommendation process. Specifically, the co-purchaser recommendation model runs multiple rounds of recommendation and adjusts the current recommendation list based on the feedback from the recommendations of previous stages.

Generally, the contributions of our paper are summarized as follows:

  • To the best of our knowledge, this is the first work that shows how to recommend co-purchasers in group buying. This is an important subject because co-purchaser recommendation has been proven to be more effective than manual invitation.

  • We propose PDSDNE and cop2vec, two efficient co-purchaser recommendation methods that effectively perceive weakly similar co-purchasers. In addition, we improve the original co-purchaser recommendation strategy, which makes the recommendation model more flexible.

  • Through extensive experiments, we demonstrate the efficacy and scalability of the presented methods in the co-purchaser recommendation task.

The rest of the paper is organized as follows. Section 2 presents related work. Section 3 introduces the embedding methods (PDSDNE and cop2vec) and details how they capture weakly similar users for co-purchaser recommendation. Section 4 describes the experimental setup and presents qualitative and quantitative results. Section 5 concludes the paper and discusses future work.

2 Related Work

2.1 Similarity Search

Similarity search is a basic operation in collaborative filtering, and it can be used directly as a simple strategy to find a collaborator [22]. When the input is a rating matrix, common similarity measures include cosine similarity and Pearson's correlation coefficient [24]. In network analysis tasks [25], a large number of similarity search methods with different definitions of similarity have been proposed, such as common neighbors, the Jaccard index, and the Adamic–Adar index. In addition to these local measures, Sun et al. [22] proposed a path-based similarity measure suited to peer objects.

2.2 Network Embedding

Low-dimensional representation learning of recommendation objects is a classic approach in recommendation systems [16, 26, 27]. For example, one of the most efficient and most widely used recommendation methods is matrix factorization, in which users and items are represented in a low-dimensional latent factor space [26]. Network embedding aims at learning low-dimensional vectors for the vertices of a network [27,28,29], such that the proximities in the original network are preserved in the low-dimensional space [30].

Recent progress in neural embedding methods for linguistic tasks has dramatically advanced state-of-the-art natural language processing (NLP) capabilities. These methods attempt to map words and phrases to a low-dimensional vector space that captures semantic relations between words [31]. Specifically, skip-gram with negative sampling (SGNS), also known as word2vec, set new records in various NLP tasks. Inspired by it, DeepWalk [27] was proposed as a method for learning the latent representations of the nodes of a social network. The method transplants the word-context concept from documents into networks and combines the truncated random walk with the skip-gram model to achieve this. We can use the model to learn low-dimensional, distributed embeddings of nodes, as it preserves each node's structural context (its local neighborhood) in the original network [26, 32]. On this basis, WALKLETS [33] and node2vec [29] further extend DeepWalk by using high-order proximities and biased walks. LINE [28] is an embedding approach for large-scale networks. By design, LINE learns two representations separately, one preserving first-order proximity and the other preserving second-order proximity. Wang et al. later extended the method using a deep autoencoder [34].

3 Co-purchaser Recommendation

3.1 Formalizations

In this section, we first introduce the concept of interaction networks and then give a formal definition of the co-purchaser recommendation problem.

3.1.1 Interaction Networks

An interaction network is defined as a graph \(G = (V,E)\), where V and E represent the node set and the edge set. For example, one can represent the interaction network in Fig. 1a with buyers and products as nodes, wherein edges indicate the interactions, such as the purchase (buyer to product) and the trust (buyer to buyer). In order to ensure data consistency, the edges are unweighted.

Fig. 2 Co-purchaser recommendation

3.1.2 Co-purchaser Recommendation

As shown in Fig. 2, co-purchaser recommendation aims at recommending, to the initiator, people who may participate in the co-purchase when he selects a product and initiates a purchase order. Network embedding learns a low-dimensional vector for each node: a dense vector \(R \in \mathbb {R}^{K}\) represents a node in the network, where K is far less than |V|. In group buying, \(R=\{U,P\}\), where U is the vector set of all buyer nodes and P is the vector set of all possible product nodes. On this basis, we define the co-purchaser recommendation goal as follows: given a buyer i and a product j, we assign a co-purchase score S to each buyer c, which can be written as:

$$\begin{aligned} S(c,(i,j))=U_{c}\cdot {U_{i}}^{T}+U_{c}\cdot {P_{j}}^{T} \end{aligned}$$
(1)
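The scoring rule in Eq. 1 can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the helper names (`co_purchase_score`, `rank_candidates`) and the toy two-dimensional vectors are assumptions introduced here, and real embeddings would come from PDSDNE or cop2vec.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def co_purchase_score(u_c: Vector, u_i: Vector, p_j: Vector) -> float:
    """Eq. 1: S(c, (i, j)) = U_c . U_i^T + U_c . P_j^T."""
    return dot(u_c, u_i) + dot(u_c, p_j)


def rank_candidates(buyers: Dict[str, Vector], initiator: str,
                    product_vec: Vector, top_n: int) -> List[str]:
    """Rank every buyer except the initiator by the Eq. 1 score (hypothetical helper)."""
    u_i = buyers[initiator]
    candidates = [c for c in buyers if c != initiator]
    candidates.sort(key=lambda c: co_purchase_score(buyers[c], u_i, product_vec),
                    reverse=True)
    return candidates[:top_n]


# Toy embeddings (made up for illustration only).
U = {"i": [1.0, 0.0], "a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.5, 0.5]}
P_j = [1.0, 0.0]
top_two = rank_candidates(U, "i", P_j, top_n=2)
```

With these toy vectors, buyer `a` scores 1.8, buyer `c` scores 1.0, and buyer `b` scores 0, so `a` and `c` are returned.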

3.2 Multi-layered Learning Architecture with PathSim

In this section, we first define the notation of PathSim diffusion. Then we introduce the multi-layered learning model of PDSDNE. At last we present some discussions and analysis on the model.

Fig. 3 The PathSim diffused network (the initiator links to the weakly similar users by PathSim)

Given a network \(G=(V,E)\), we can obtain its adjacency matrix \(S\in \mathbb {R}^{|V|\times |V|}\). We have \(s_{ij}=1\) if there exists a link between i and j, and \(s_{ij}=0\) otherwise. The i-th row is denoted \(s_{i}=\left\{ s_{ij} \right\} _{j=1}^{n}\), where \(n=|V|\). In reality, the observed links only account for a small portion. There exist many co-purchasers who have some connectivity with the initiator but no direct links, especially weakly similar co-purchasers. We define the PathSim diffused matrix \(P\in \mathbb {R}^{|V|\times |V|}\) by extending the PathSim measure proposed in [22] as follows:

$$\begin{aligned} p_{ij}= \left\{ \begin{array}{ll} s_{ij}& \text {if }s_{ij}\ne 0\\ \frac{2\times path(i,j)}{path(i,i)+path(j,j)}& \text {if the shortest length between }i \text { and } j<R \\ 0& \text {otherwise} \end{array}\right. \end{aligned}$$
(2)

where path(i, j) is the number of paths between i and j, path(i, i) is the number of paths between i and i, and path(j, j) is the number of paths between j and j. Note that all paths counted have the same length, namely the shortest path length between i and j, and R is the range of the PathSim diffusion. In theory, the score \(p_{ij}\) measures the connectivity between vertexes, normalized by the visibility of the vertexes. As shown in Fig. 3, there is 1 path between \(u_{2}\) and \(u_{4}\), 1 path between \(u_{4}\) and \(u_{4}\), and 3 paths between \(u_{2}\) and \(u_{2}\), so the score is \(2\times 1/(3+1)=0.5\).
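A small sketch of Eq. 2 on a toy graph may help make the counting concrete. It rests on one interpretation that is an assumption here: "number of paths" is taken to mean the number of walks whose length equals the shortest path length between i and j. The graph below is invented so that it reproduces the counts quoted for \(u_{2}\) and \(u_{4}\) (3, 1, and 1); it is not a transcription of Fig. 3, and the function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional

Graph = Dict[str, List[str]]  # undirected graph as adjacency lists


def shortest_length(graph: Graph, src: str, dst: str) -> Optional[int]:
    """Breadth-first search; returns None when dst is unreachable."""
    seen = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return seen[node]
        for nb in graph[node]:
            if nb not in seen:
                seen[nb] = seen[node] + 1
                queue.append(nb)
    return None


def count_walks(graph: Graph, src: str, dst: str, length: int) -> int:
    """Number of walks of exactly `length` steps from src to dst."""
    counts = {src: 1}
    for _ in range(length):
        nxt: Dict[str, int] = {}
        for node, c in counts.items():
            for nb in graph[node]:
                nxt[nb] = nxt.get(nb, 0) + c
        counts = nxt
    return counts.get(dst, 0)


def pathsim_diffused(graph: Graph, i: str, j: str, R: int) -> float:
    """Entry p_ij of Eq. 2 under the walk-counting assumption stated above."""
    if j in graph[i]:
        return 1.0                      # observed link: p_ij = s_ij
    d = shortest_length(graph, i, j)
    if d is None or d >= R:
        return 0.0                      # outside the diffusion range
    denom = count_walks(graph, i, i, d) + count_walks(graph, j, j, d)
    return 2.0 * count_walks(graph, i, j, d) / denom if denom else 0.0


# Toy graph: u2 bought three items, u4 bought one of them.
toy = {
    "u2": ["i1", "i2", "i3"],
    "u4": ["i3"],
    "i1": ["u2"],
    "i2": ["u2"],
    "i3": ["u2", "u4"],
}
score = pathsim_diffused(toy, "u2", "u4", R=3)
```

Here the shortest length between `u2` and `u4` is 2, so the score is 2 × 1 / (3 + 1) = 0.5; shrinking the range to R = 2 excludes the pair and yields 0.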

Intuitively, if two vertexes share many common neighbors, they tend to be similar. As shown in Fig. 3, \(u_{1}\) has the same shopping history as \(u_{2}\), so they are similar and may purchase together. To model the neighbor structure, also known as the second-order proximity, autoencoders have emerged as one of the commonly used building blocks [34, 35]. An autoencoder consists of two parts, the encoder and the decoder. The encoder consists of multiple nonlinear functions \(f(\cdot )=f_{\theta _{k}}(\cdot \cdot \cdot f_{\theta _{1}}(\cdot ))\) that map the input data to the representation space. The decoder also consists of multiple functions \(g(\cdot )=g_{\hat{\theta } _{1}}(\cdot \cdot \cdot g_{\hat{\theta } _{k}}(\cdot ))\) that map the representations in the representation space to the reconstruction space. Let us assume that \(f_{\theta _{1}}(x)=\sigma (W_{1}x+b_{1})\) and \(g_{\hat{\theta }_{1}}(x)=\sigma (\hat{W}_{1}x+\hat{b}_{1})\), where \(\sigma\) is the activation function, \(\theta =(W,b)\) are the parameters of the encoder, and \(\hat{\theta }=(\hat{W},\hat{b})\) are the parameters of the decoder. The goal of the autoencoder is to minimize the following reconstruction loss function:

$$\begin{aligned} L_\mathrm{{n}}=\sum _{i}\left\| s_{i}-g(f(s_{i})) \right\| _{2}^{2} \end{aligned}$$
(3)

Naturally, network embedding must also preserve the link structure: the stronger the link between two vertexes, the more similar their embedding vectors should be. Many classical recommendation algorithms share this objective. For example, in matrix factorization techniques, the higher a user's rating of an item, the closer their latent vectors. In addition, by adding the PathSim score as a penalty weight, the weakly similar co-purchasers are pulled close to the initiator in the embedding space. The loss function for this goal is defined as follows:

$$\begin{aligned} L_\mathrm{{l}}=\sum _{i,j}p_{i,j}\left\| f(s_{i})-f(s_{j}) \right\| _{2}^{2} \end{aligned}$$
(4)

To preserve both the neighborhood structure and the link structure, we jointly minimize the objective function obtained by combining Eqs. 3 and 4:

$$\begin{aligned} L=L_\mathrm{{l}}+\alpha L_\mathrm{{n}} \end{aligned}$$
(5)
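To make the three losses concrete, the following sketch evaluates Eqs. 3, 4, and 5 for fixed encoder and decoder functions. It is only an evaluation of the objective under stated assumptions: the encoder `f` (keep the first two coordinates) and decoder `g` (pad with a zero) are stand-ins chosen for easy hand-checking, not the sigmoid layers described above, and no training step is shown.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[Vector]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reconstruction_loss(S: Matrix, f: Callable[[Vector], Vector],
                        g: Callable[[Vector], Vector]) -> float:
    """Eq. 3: L_n = sum_i || s_i - g(f(s_i)) ||^2."""
    return sum(sq_dist(s, g(f(s))) for s in S)


def link_loss(S: Matrix, P: Matrix, f: Callable[[Vector], Vector]) -> float:
    """Eq. 4: L_l = sum_{i,j} p_ij || f(s_i) - f(s_j) ||^2."""
    Y = [f(s) for s in S]
    n = len(S)
    return sum(P[i][j] * sq_dist(Y[i], Y[j]) for i in range(n) for j in range(n))


def joint_loss(S: Matrix, P: Matrix, f: Callable[[Vector], Vector],
               g: Callable[[Vector], Vector], alpha: float) -> float:
    """Eq. 5: L = L_l + alpha * L_n."""
    return link_loss(S, P, f) + alpha * reconstruction_loss(S, f, g)


# Toy path graph 0 - 1 - 2; P adds a diffused weight 0.5 between nodes 0 and 2.
S = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
P = [[0.0, 1.0, 0.5], [1.0, 0.0, 1.0], [0.5, 1.0, 0.0]]
f = lambda s: s[:2]          # stand-in encoder (assumption)
g = lambda y: y + [0.0]      # stand-in decoder (assumption)
L = joint_loss(S, P, f, g, alpha=0.5)
```

For this toy input, the reconstruction loss is 1, the link loss is 8, and the joint loss with α = 0.5 is 8.5. Nodes 0 and 2 happen to share the same encoding, so their diffused weight of 0.5 adds no penalty.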

Following previous work [34], we use stochastic gradient descent (SGD) to optimize the model. The key step is to calculate the partial derivatives with respect to the parameters \(\left\{ \theta ,\hat{\theta } \right\}\). Ultimately, the embedding vectors are computed by the encoder. However, while the PathSim diffusion benefits weakly similar co-purchasers, it can also damage the original network structure and negatively affect the general reconstruction of the network. We therefore want a smoother way to perceive weakly similar co-purchasers that minimizes the impact on the basic network features.

3.3 Co-occurrence Model Based on Truncated Walk

Fig. 4 Overview of co-occurrence model

To keep the paper self-contained, we briefly review the key idea of the co-occurrence model. The co-occurrence model was first used for linguistic tasks, where it attempts to map words to a low-dimensional vector space that captures semantic relations between words. Specifically, the SGNS model aims to maximize the co-occurrence probability among the words that appear within a window. Inspired by it, DeepWalk [27] was proposed as a method for learning the latent representations of the nodes of a social network. The method samples a set of paths from the input graph using the truncated random walk. Each path sampled from the graph corresponds to a sentence from the corpus, where a node corresponds to a word. Given a path consisting of nodes \(w_{1},\ldots ,w_{K}\), the objective of the co-occurrence model is to maximize the following term:

$$\begin{aligned} \frac{1}{K}\sum _{i=1}^{K}\sum _{-c<j<c} \log P(w_{i+j}|w_{i}) \end{aligned}$$
(6)

where c is the context window size. Applying negative sampling [27], P is defined as:

$$\begin{aligned} P(w_{i+j}|w_{i})=\sigma (\mathbf {u_{i}^{T}u_{j}})+\sum _{t\in NS}\sigma (-\mathbf {u_{i}^{T}u_{t}}) \end{aligned}$$
(7)

where \(\sigma (x)=1/(1+ \exp (-x))\), and NS is the negative samples for \(w_{i}\).
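The double sum in Eq. 6 ranges over (center, context) pairs inside a window. The sketch below enumerates those pairs for one sampled path. Two details are assumptions made here rather than statements from the text: the offset j = 0 is skipped (a node is not paired with itself), and offsets that fall outside the path are ignored. The function name is hypothetical.

```python
from typing import List, Tuple


def context_pairs(path: List[str], c: int) -> List[Tuple[str, str]]:
    """All (w_i, w_{i+j}) pairs with -c < j < c, j != 0, inside the path."""
    pairs = []
    for i, center in enumerate(path):
        for j in range(-c + 1, c):
            if j == 0 or not 0 <= i + j < len(path):
                continue
            pairs.append((center, path[i + j]))
    return pairs


# A made-up walk through buyers (u*) and items (i*).
walk = ["u2", "i2", "i3", "u3"]
narrow = context_pairs(walk, c=2)   # only adjacent nodes co-occur
wide = context_pairs(walk, c=4)     # u2 and u3 now fall into one window
```

With c = 2 only adjacent nodes form pairs, so `u2` and `u3` never co-occur; with c = 4 they do. This is the effect the walk-based methods rely on: nodes that share walks and windows often are pushed toward similar embeddings.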

By applying the co-occurrence model in Eq. 6, nodes that frequently co-occur in a path share similar neighborhoods and obtain similar embeddings [27, 29]. (In this section, the definition of neighborhoods differs slightly from that in PDSDNE: it usually refers to the window in paths, not just the one-hop neighbors in networks.) For example (see Fig. 4a), \(u_{1}\) may co-occur with \(u_{2}\) most frequently, so we will naturally recommend \(u_{1}\) as a co-purchaser to \(u_{2}\). However, it is still challenging to recommend weakly similar co-purchasers like \(u_{3}\). Unfortunately, a large number of co-purchasers lie in the long tail of the similarity distribution, and they are difficult for existing recommendation approaches to perceive.

To address the weakly similar co-purchaser problem, we propose a novel neighborhood sampling strategy that benefits the weakly similar co-purchasers: it smoothly increases their co-occurrence probability through a truncated biased walk.

3.3.1 General Neighborhoods Sampling Strategy

Network embedding methods based on the SGNS architecture reconstruct network features by learning the notion of neighborhoods. We first briefly introduce the general neighborhood sampling strategy, the truncated random walk. Formally, a random walk begins at the source node s and produces a node sequence of fixed length le. Let \(n_{i}\) denote the ith node in the sequence, starting with \(n_{0}=s\). The node \(n_{i}\) is generated by the following distribution:

$$\begin{aligned} P(n_{i}=v|n_{i-1}=u,i<le)= \left\{ \begin{array}{ll} \frac{\pi _{uv}}{\sum _{x\in \Gamma (u)}\pi _{ux}}& \text {if}\quad v\in \Gamma (u)\\ 0& \text {otherwise} \end{array}\right. \end{aligned}$$
(8)

where \(\Gamma (u)\) is the set of one-hop neighbors of node u, and \(\pi _{uv}\) is the unnormalized transition probability between nodes u and v (e.g., the edge weight \(w_{uv}\)).
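Eq. 8 amounts to repeatedly sampling a neighbor in proportion to \(\pi _{uv}\). A minimal sketch follows, assuming an adjacency-list graph and an optional weight dictionary keyed by edge; both data layouts and the function name are illustrative choices, and missing weights default to 1 to match unweighted edges.

```python
import random
from typing import Dict, List, Tuple


def truncated_random_walk(graph: Dict[str, List[str]],
                          weights: Dict[Tuple[str, str], float],
                          source: str, length: int,
                          rng: random.Random) -> List[str]:
    """Sample a node sequence of at most `length` nodes starting at `source`.

    The next node v is drawn from the neighbors of the current node u with
    probability pi_uv / sum_x pi_ux (Eq. 8); pi_uv defaults to 1.0.
    """
    walk = [source]
    while len(walk) < length:
        u = walk[-1]
        neighbors = graph.get(u, [])
        if not neighbors:          # dead end: stop early
            break
        pis = [weights.get((u, v), 1.0) for v in neighbors]
        walk.append(rng.choices(neighbors, weights=pis, k=1)[0])
    return walk


# Toy buyer-item graph (made up for illustration).
g = {"u1": ["i1"], "i1": ["u1", "u2"], "u2": ["i1"]}
walk = truncated_random_walk(g, {}, "u1", length=5, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the sampled walks reproducible across runs.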

However, this simple strategy does not allow us to account for the network structure or to guide the search procedure toward different types of network neighborhoods. Additionally, farther nodes are difficult to capture and may not even be reached within the finite number of truncated walks. As shown in Fig. 1a, consider a truncated random walk that has arrived at the purchaser node \(u_{2}\). From there the walk has multiple paths to reach another purchaser node \(u_{1}\); that is, \(u_{2}\) frequently co-occurs with \(u_{1}\) in the node sequences generated by walks, and the SGNS model maps two nodes that frequently co-occur to two close feature vectors. In contrast, there are few opportunities to travel from \(u_{2}\) to \(u_{3}\); that is, \(u_{2}\) rarely co-occurs with \(u_{3}\), and the SGNS model maps two nodes that rarely co-occur to two unrelated feature vectors.

3.3.2 Biased Neighborhoods Sampling Strategy

Prior studies have established the equivalence between word context and node neighborhood and transplanted the SGNS model to network embedding. A general corpus can only represent common word features; likewise, the truncated random walk can only preserve basic and general network features. We want to obtain more information that benefits weakly similar users. For example, a student may face the following scenarios on a group buying platform: he may co-purchase with his classmates, which is very intuitive because they are robustly similar; he may also co-purchase with buyers of a safety helmet because they have a consistent need for helmets; he might even co-purchase with buyers of a rucksack, even though their shopping behaviors are otherwise inconsistent. In reality, the last scenario is very common: there is no aligned preference between co-purchasers, just an intersection under a large category (e.g., outdoor activities).

Building on the above observations, we design a flexible neighborhood sampling strategy that allows us to perceive the weakly related nodes effectively and sensitively. We achieve this by developing a flexible biased walk procedure that can explore farther neighborhoods with co-purchase tendencies. For example, consider a biased walk that has just traversed edge (t, u) and now resides at node u. The walk needs to decide on the next step, so it evaluates the transition probabilities \(\pi _{ux}\) on edges (u, x) leading from u. We set the unnormalized transition probability to \(\pi _{ux}=\alpha _{pkl}(t,u,x)\cdot w_{ux}\), where

$$\begin{aligned} \alpha _{pkl}(t,u,x)= \left\{ \begin{array}{ll} p & \text {if}\quad t=x\\ k\cdot \text {sim}(t,x) & \text {if}\quad t\in I \quad \text {and}\quad x\in I \\ \frac{lw}{1+|w_{t}-w_{x}|}&\text {if}\quad x\in \Gamma (t)\\ 1 & \text {otherwise} \end{array}\right. \end{aligned}$$
(9)

In the equation, p, k, and l are preset bias parameters that control the tendency of truncated walks. \(w_{t}\) is the purchase edge associated with t. U and I are the user set and the item set. \(\text {sim}(t,x)\) denotes the approximate index between item nodes t and x. We simply set the approximate index to \(\text {sim}(t,x)=|\Gamma (t)\cap \Gamma (x)|/|\Gamma (t)\cup \Gamma (x)|\), although a more accurate approximate index could be calculated using side information attached to products.
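The approximate index is the Jaccard index over the buyer sets of two items, and it feeds the second case of Eq. 9. The sketch below covers only the first two cases of Eq. 9 (the return bias p and the item-similarity bias k); the l-case is deliberately left out because the purchase-edge terms \(w_{t}\) and \(w_{x}\) are not specified concretely enough here, so every other transition falls back to 1. The function names and toy data are assumptions for illustration.

```python
from typing import Dict, Set


def jaccard(neighbors_t: Set[str], neighbors_x: Set[str]) -> float:
    """sim(t, x) = |N(t) & N(x)| / |N(t) | N(x)|, or 0 when both sets are empty."""
    union = neighbors_t | neighbors_x
    return len(neighbors_t & neighbors_x) / len(union) if union else 0.0


def bias_factor(t: str, x: str, graph: Dict[str, Set[str]],
                items: Set[str], p: float, k: float) -> float:
    """Partial Eq. 9: only the p-case and the k-case; other cases return 1.0."""
    if t == x:
        return p                                   # stepping straight back to t
    if t in items and x in items:
        return k * jaccard(graph[t], graph[x])     # follow chains of related items
    return 1.0


# Toy data: a bicycle (i2) and a safety helmet (i3) share two of three buyers.
items = {"i2", "i3"}
graph = {"i2": {"u1", "u2"}, "i3": {"u1", "u2", "u3"}}
alpha_item = bias_factor("i2", "i3", graph, items, p=0.5, k=3.0)
alpha_back = bias_factor("i2", "i2", graph, items, p=0.5, k=3.0)
```

In this toy case the Jaccard index is 2/3, so with k = 3 the transition toward the helmet is weighted by about 2, while stepping back to the bicycle is weighted by p = 0.5.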

Fig. 5 Co-occurrence counts of co-purchasers in one co-purchase on TaoBao. Node markers distinguish the initiator and co-purchasers with co-occurrence \(>100\), \(>50\), \(>10\), and \(\le 10\)

Parameter p controls the likelihood of immediately revisiting a node in the walk. Setting it to a value greater than 1 leads the walk to explore nodes that have already been visited, which keeps the walk "local", close to the starting node [29]. Setting it to a low value ensures that the walk spreads out at a faster rate and avoids "bigram" redundancy in node sequences.

Parameter k is the key to ensuring that the initiator node is able to perceive weakly similar co-purchasers. When k is set to a high value, the walking strategy encourages the walk to diffuse along the chains (the red line in Fig. 4a) that are composed of related goods. These chains act as backbones in the network; by following them, the paths generated by walks remain meaningful even when the walk moves far away. That is, the farther co-purchasers attached to a chain are more likely to co-occur with the initiator.

Going back to Fig. 4a, buyer \(u_{2}\) bought a product \(i_{2}\) online. Consider a random walk that has just traversed edge \((i_{2},u_{2})\) and now resides at node \(u_{2}\). There are several alternative nodes \((i_{3}, u_{1}, i_{2}, i_{1})\) for the next step. At this point, we observe that \(i_{3}\) (safety helmet) has a high similarity with \(i_{2}\) (bicycle) because the buyers of the two commodities almost overlap. The similarity between the two items is amplified by the bias parameter k and then propagated to the bias factor, so the transition probability is adjusted to a larger value. That is to say, the walk has a high probability of choosing \(i_{3}\) as the next step, and the walking path acts like a backbone of the interaction network. Finally, the purchasers of \(i_{3}\), such as \(u_{3}\), appear in the walking path and co-occur with \(u_{1}\) and \(u_{2}\). The SGNS model captures this co-occurrence and maps it to the embedding space.

Parameter l allows us to adjust the stay rate of the walk. If two buyers have a consistent preference for one item, or two items receive a consistent rating from one buyer, the item or buyer has a higher stay value. The higher the value of the parameter, the larger the influence of the stay rate, and vice versa.

By adjusting the bias parameters, the biased walking strategy can flexibly explore the neighborhoods of nodes in interaction networks. In particular, the parameters allow our walk procedure to generate more meaningful co-occurrence paths for co-purchaser recommendation. A toy example is shown in Fig. 5, where the weakly similar co-purchasers (cyan nodes in Fig. 5a) obtain higher co-occurrence counts. As discussed for Eq. 6, the weakly similar co-purchasers gain better embedding vectors because they have higher relevance to the initiator. In addition, the biased walk is a smooth strategy and does not damage the original network information. That is, both the original network structure and the adaptability to weakly similar co-purchasers are taken into account.

3.4 Phased Co-purchaser Recommendation

Fig. 6 Differences between the two strategies

In online group buying, the initiator launches a co-purchase order and invites other buyers (from the co-purchaser recommendation list). For some time after that, such as 24 h [3], these buyers can choose to accept or reject the invitation of the initiator.

A co-purchase may last for some time, but previous studies recommended co-purchasers only once, when the initiator launches the co-purchase order [21]; the overall recommendation results are therefore rigid and inflexible. Moreover, the buyers in the recommendation list may fail to satisfy the co-purchase requirements. To make the recommendations flexible over the whole transaction, we design a phased co-purchaser recommendation strategy. It can be applied to all co-purchaser recommendation methods and improves the overall result. Specifically, as shown in Fig. 6, the co-purchaser recommendation list is continually updated by the recommendation method over a period of time, from the initiator's submission of the order to the final confirmation of the co-purchase status.

$$\begin{aligned} f(G)=\frac{1}{|G|}\sum _{i=1}^{|G|} \mathbf {u_{i}} \end{aligned}$$
(10)

According to the phased recommendation strategy, the initiator first invites other buyers to participate in a co-purchase transaction. After some buyers accept the invitation, they form a temporary group G with the initiator. The recommendation model uses the average function in Eq. 10 to aggregate the vector representations of the group members and then adjusts the recommendation list for the temporary group and the selected products. As a result, the recommendation results differ across co-purchase states. With the phased co-purchaser recommendation strategy, the recommendation model is more flexible and the recommendation performance is better.
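One round of the phased strategy can be sketched as follows. The group vector is the mean in Eq. 10; how it enters the ranking is an assumption made here, namely that f(G) simply replaces the initiator vector \(U_{i}\) in Eq. 1. The function names and toy vectors are illustrative only.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def group_vector(members: List[Vector]) -> List[float]:
    """Eq. 10: element-wise mean of the member vectors."""
    n = len(members)
    return [sum(col) / n for col in zip(*members)]


def phased_recommend(buyers: Dict[str, Vector], product_vec: Vector,
                     group: List[str], top_n: int) -> List[str]:
    """Re-rank buyers outside the temporary group.

    Assumption: the score is Eq. 1 with U_i replaced by the group mean f(G).
    """
    g = group_vector([buyers[m] for m in group])
    candidates = [b for b in buyers if b not in group]
    candidates.sort(key=lambda b: dot(buyers[b], g) + dot(buyers[b], product_vec),
                    reverse=True)
    return candidates[:top_n]


# Toy embeddings (made up for illustration).
buyers = {"init": [1.0, 0.0], "a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.2, 0.9]}
product = [0.5, 0.5]
stage1 = phased_recommend(buyers, product, ["init"], top_n=2)
stage2 = phased_recommend(buyers, product, ["init", "b"], top_n=2)
```

In the first stage only the initiator is in the group and `a` ranks first. Once `b` joins, the group mean shifts toward `b`, and `c` overtakes `a`, which illustrates how the list changes with the co-purchase state.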

4 Experiments

In this section, we conduct various experiments to demonstrate the effectiveness of our proposed methods. First, we describe three real-world datasets on online shopping and visualize the embedding of a small number of purchasers. Secondly, we evaluate the methods by the top-k purchaser recommendation task. Finally, we report the co-purchaser detection experimental results on multiple online shopping datasets and present the influence of biased parameters.

Table 2 Statistics of datasets

4.1 Datasets

We design experiments on three widely adopted online shopping datasets: Epinions, Amazon Electronics, and TaoBao IJCAI16. Note that Amazon and TaoBao are processed into 5-core subsets, in which all users and items have at least five records. Additionally, to enhance the diversity of the truncated walk, we add a trust edge between two buyers when they have multiple co-purchases. Table 2 shows some statistics of the datasets.

4.2 Visualization of the Embeddings

In this part, we visualize the embeddings of buyers learned by PDSDNE and cop2vec and compare them with two classic embedding methods, SVD and DeepWalk (DW). The results are shown in Fig. 7, where the buyers of the same item are highlighted with the same color. Although PDSDNE is effective for the 2D embedding, its embedding is sparsely scattered. Network embedding methods based on the truncated walk have a natural advantage in dealing with this problem: DeepWalk maps purchasers of the same item more closely together. On that basis, cop2vec further pulls together the nodes that would otherwise be mapped to remote locations because of their weak similarity.

Fig. 7 Visualization of purchaser

Fig. 8 The performance evaluation of Top-k purchaser recommendation

4.3 Top-k Purchaser Recommendation

Although purchaser recommendation is not common on many e-commerce platforms, it is a critical part of group buying because we need to decide to whom products should be recommended. To build the test set, we randomly select 20% of the items from the TaoBao dataset and remove 50% of their purchasers. After model training, we choose the Top-k closest purchasers for each item in the embedding space, which are considered the buyers most likely to purchase the item. To evaluate the effectiveness of the recommendation comprehensively, we employ two state-of-the-art embedding models, LINE [28] and DeepWalk [27], as baselines and also compare the two proposed methods with each other.

For a fair comparison, we use a 128-dimensional vector to represent a node in all methods. In LINE, as suggested in [28], the representation is the direct concatenation of the first-order (dimension 64) and second-order (dimension 64) representations. In addition, we use the same parameters for the truncated walk in all walk-based methods: the number of walks per node is 50, the walk length is 30, the context window is 8, and the number of negative samples is 5. In PDSDNE, the two-layer encoder has layer sizes 1000 and 128, and the decoder mirrors this structure. In cop2vec, the bias parameters are tuned to be optimal.

As shown in Fig. 8, the walk-based network embedding methods (DeepWalk and cop2vec) outperform the proximity-based method LINE. When k is 50, cop2vec performs best: its precision is 6% higher than that of PDSDNE and 23% higher than that of LINE, the latter partly because the first-order and second-order representations are not well coordinated in LINE. In terms of recall, cop2vec is significantly higher than the compared methods, 41% higher than DeepWalk and 67% higher than LINE. This shows that the biased walk strategy can effectively perceive the purchasers who are weakly associated with items.
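For reference, precision and recall at k can be computed as in the sketch below. It uses the common convention of dividing the hit count by k for precision and by the number of held-out purchasers for recall; whether the reported numbers follow exactly this convention is an assumption, and the function name is hypothetical.

```python
from typing import List, Set, Tuple


def precision_recall_at_k(ranked: List[str], relevant: Set[str],
                          k: int) -> Tuple[float, float]:
    """Precision@k = hits / k and Recall@k = hits / |relevant|."""
    hits = sum(1 for buyer in ranked[:k] if buyer in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy ranking for one item; `held_out` plays the role of the removed purchasers.
ranked = ["a", "b", "c", "d"]
held_out = {"a", "c", "e"}
prec, rec = precision_recall_at_k(ranked, held_out, k=2)
```

Here one of the top two buyers is a held-out purchaser, giving a precision of 1/2 and a recall of 1/3.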

4.4 Co-purchaser Detection

In this section, we evaluate our proposed method on the co-purchaser recommendation task. Given a purchase initiator and his order for a certain item, we want to select the possible co-purchaser candidates. Note that current group buying platforms encourage buyers to sign in using social accounts, that is, we can give priority to recommending co-purchaser from a group of social accounts, rather than recommending co-purchaser from the whole buyers.

We choose 20% of the items from each dataset and remove n of their purchasers to form a true buyer set, where n is 50% of an item's total number of buyers. Additionally, we add \((n-1)/2\) users who have not purchased the item as a false buyer set for each selected item. To form a positive sample, we select two buyers from the true buyer set, one as the initiator and one as the co-purchaser, which yields \(n(n-1)/2\) positive samples. To form a negative sample, we select one buyer from the true buyer set as the initiator and one buyer from the false buyer set as the co-purchaser, which yields \(n(n-1)/2\) negative samples. We use the area under the curve (AUC) score to evaluate the co-purchase intentions of positive and negative samples, where the co-purchase intention is represented by the Hadamard product of the embedding vectors.
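The sample construction and the AUC evaluation can be sketched as follows. Two points are assumptions rather than details given in the text: the Hadamard product is reduced to a scalar score by summing its entries (which equals the dot product), and the AUC is computed with the pairwise-comparison (Mann-Whitney) formulation. All function names are hypothetical.

```python
from itertools import combinations, product
import numpy as np

def build_pairs(true_buyers, false_buyers):
    """Positive pairs: two true buyers. Negative pairs: one true, one false buyer."""
    positives = list(combinations(true_buyers, 2))      # n(n-1)/2 pairs
    negatives = list(product(true_buyers, false_buyers))  # n * (n-1)/2 pairs
    return positives, negatives

def pair_score(emb, u, v):
    # Hadamard (element-wise) product of the two embeddings.
    # Assumption: the product is summed to obtain one scalar per pair.
    return float(np.sum(emb[u] * emb[v]))

def auc(pos_scores, neg_scores):
    """Probability that a positive pair outscores a negative pair (ties count 0.5)."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return float(wins / (pos.size * neg.size))

# Toy check of the sample counts for n = 5 true buyers and (n-1)/2 = 2 false buyers.
true_buyers = ["a", "b", "c", "d", "e"]
false_buyers = ["x", "y"]
positives, negatives = build_pairs(true_buyers, false_buyers)
```

With n = 5, both lists contain 10 pairs, matching the \(n(n-1)/2\) count stated above.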

We conduct experiments on three datasets of different scales and compare our methods with two traditional methods, singular value decomposition (SVD) and common neighbors (CN), as well as classic network embedding methods. The results on the three datasets are summarized in Table 3. We observe that cop2vec is consistently better than all the comparison methods.

Co-purchaser recommendation performs best on the Epinions dataset, which we attribute to the large number of real trust edges in that dataset. The AUC score of PDSDNE is 9% higher than that of SDNE, 6% higher than that of LINE, and 2% lower than that of cop2vec. On the Amazon dataset, the walk-based network embedding methods score significantly higher than the other types of methods, and even the worst-performing of them, DeepWalk, is 20% better than the proximity-based embedding method LINE. The gain of cop2vec is most significant on the TaoBao dataset, where its AUC score is 7% higher than that of PDSDNE, 11% higher than that of DeepWalk, and 19% higher than that of common neighbors.

Table 3 Area under curve (AUC) scores for co-purchaser prediction

Fig. 9 Parameter sensitivity

4.5 Gain of N-Phased Recommendation

In this section, we demonstrate the gain obtained by applying the phased recommendation strategy to a variety of co-purchaser recommendation algorithms. Given a co-purchase transaction, we divide the whole transaction into N stages by time. In the phased recommendation task, we want to select possible co-purchaser candidates in each stage. To achieve this goal, the recommendation model needs to adjust the current recommendation list based on feedback on the recommendation results of the previous stage. To simulate this behavior, we add the true positives of the previous stage to the current temporary group and then generate the recommendation list of the current stage for the temporary group and the selected products.

We use the Epinions dataset for this experiment because each product in it has more buyers, so that even after the division there are still enough buyers in each stage to verify the recommendation performance. We choose 20% of the items from the dataset as the test set and remove all of their purchasers except the initiator. For each stage of each test item, we select the users who participated in the transaction in the next stage as a true buyer set, and the size of the true buyer set in each stage is t. Additionally, we add t users who have not purchased the item as a false buyer set for each stage. For the validity of the experiment, we guarantee that t is greater than 1 and less than n, where n is the number of buyers removed from the test item, and that the true buyer sets of all stages together contain n buyers. At each stage, the recommendation model first aggregates the vector representations of the group members. It then pairs the group with a buyer from the true buyer set to generate k positive samples, and pairs the group with a buyer from the false buyer set to generate k negative samples. In each recommendation stage, we use the AUC score to evaluate the co-purchase intentions of positive and negative samples.
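The phased evaluation loop can be sketched as below. Several choices are assumptions made only for illustration: group members are aggregated by mean pooling, a candidate is scored by the dot product with the group vector, the recommendation list of a stage is the top-t candidates, and the true positives in that list join the group for the next stage. The function names `auc` and `simulate_phases` are hypothetical.

```python
import numpy as np

def auc(pos_scores, neg_scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return float(wins / (pos.size * neg.size))

def simulate_phases(emb, initiator, stages):
    """stages: list of (true_buyers, false_buyers), one tuple per stage.

    Returns the per-stage AUC scores and the final temporary group.
    """
    group = [initiator]
    stage_aucs = []
    for true_buyers, false_buyers in stages:
        # Assumption: the group is represented by the mean of its members.
        g = np.mean([emb[m] for m in group], axis=0)
        candidates = list(true_buyers) + list(false_buyers)
        scores = {u: float(np.dot(g, emb[u])) for u in candidates}
        stage_aucs.append(auc([scores[u] for u in true_buyers],
                              [scores[u] for u in false_buyers]))
        # Feedback step: true positives among the top-t list join the group.
        top_t = sorted(scores, key=scores.get, reverse=True)[:len(true_buyers)]
        true_set = set(true_buyers)
        group.extend(u for u in top_t if u in true_set)
    return stage_aucs, group

# Toy 2-dimensional embeddings (illustrative values only).
emb = {
    "a": np.array([1.0, 0.0]),
    "b": np.array([1.0, 0.2]), "c": np.array([0.9, 0.1]),
    "x": np.array([-1.0, 0.0]), "y": np.array([0.0, -1.0]),
    "d": np.array([0.5, 0.5]), "e": np.array([1.0, 1.0]),
    "z": np.array([-0.5, 0.0]), "w": np.array([0.0, -1.0]),
}
stages = [(["b", "c"], ["x", "y"]), (["d", "e"], ["z", "w"])]
stage_aucs, final_group = simulate_phases(emb, "a", stages)
```

Because the group vector is recomputed at every stage, buyers who joined earlier influence which candidates are ranked highest later, which is the feedback effect described above.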

Table 4 Area under curve (AUC) scores for phased co-purchaser prediction

The gains of using the phased co-purchaser recommendation strategy are summarized in Table 4, and we observe that all co-purchaser recommendation methods benefit from it. With the basic recommendation strategy, that is, without partition, the recommendation performance is still relatively high because we have changed the test cases: they now include only n co-purchase samples between the initiator and the other co-purchasers, rather than \(n(n-1)/2\) co-purchase samples between all pairs of co-purchasers. The former is obviously easier to determine. We can see that the gains of DeepWalk and cop2vec are higher. In addition, cop2vec consistently outperforms all the comparison methods, which shows that cop2vec is also effective in phased co-purchaser recommendation.

4.6 Parameter Sensitivity

We investigate parameter sensitivity in this section. Specifically, we evaluate how different choices of the bias parameters affect the results of co-purchaser recommendation. We report the AUC score on Epinions in Fig. 9. The performance rises as the parameter p increases; as shown in [29], a high p ensures that the walk does not go too far from the start node. We also observe that performance tends to saturate once the bias parameter k reaches around 8. Setting k too large generates paths that favor weakly associated co-purchasers, but its impact on the original network gradually appears. Interestingly, we obtain good performance while keeping the parameter l small. This experiment suggests that we do not need to pay much attention to the “closed-loop” structure in the co-purchaser recommendation task.

5 Conclusions

As an emerging form of online shopping, group buying has been restricted by the co-purchaser recommendation problem. Neither handcrafted invitation nor classic collaborative filtering is suited to solving the problem. In this paper, we present network embedding-based methods to address the co-purchaser recommendation challenge. To cope with the problem that traditional algorithms are insensitive to weakly similar nodes, we propose two novel co-purchaser recommendation methods, namely PDSDNE and cop2vec. The latter in particular effectively perceives weakly similar nodes while maintaining the original network information. Experiments on real-world datasets verify the effectiveness of our proposed approaches. Considering that a co-purchase may last a long time, we also improve the existing recommendation strategy, which makes the recommendation model more flexible. For future work, incorporating side information such as stores, product categories, and buyer attributes would yield a heterogeneous network with more diverse biased walks, which may further improve co-purchaser recommendation performance.