1 Introduction

Owing to the rapid rise in its availability and pervasive use over the past decades, social media now affects a large proportion of the global population. In the last few years, COVID-19 and Monkeypox, both declared Public Health Emergencies of International Concern, led to some form of lockdown in almost all countries of the world, and millions of new users are reported to have come online during this period. The internet, as a tool for communicating with friends and obtaining information, is now used by more than half of the world’s population. As a result, social media has become a valuable marketing resource for generating brand awareness and connecting with prospective consumers and/or supporters; it changes the traditional business model and requires marketers to develop effective marketing strategies.

Because social media advertising can be run at a smaller scale and with lower production costs, more and more companies are running social media advertisements, and major platforms such as Facebook, YouTube, WhatsApp and Instagram have a considerable influence on customers. Unlike mass media, social media platforms let users build their own social networks and promptly share information with their friends or followers. Companies can therefore trigger the rapid spread of product information and quickly reach a wide audience in a social media marketing campaign by selecting a few customers to try the product first. This inexpensive approach is clearly advantageous for marketing a company’s products and services, and the key question is who should be chosen as the source of the promotional information. To answer it, Kempe et al. (2003) proposed the influence maximization (IM) problem, which aims to find the top-k influential nodes so as to influence as many nodes as possible. The classical IM problem and its numerous variants, including the Profit Maximization problem and the Competitive Influence Maximization problem, have been extensively studied.

Most prior works assume that, even if two or more entities (information about different promoted products or services) spread in the same network, a user of the social network, abstracted as a node, can choose only one entity. For example, Microsoft’s Surface and Apple’s iPad are comparable products to some extent. Under the assumption in Ansari et al. (2019) and Shi et al. (2021), if the initial consumer picks the iPad, his or her friends cannot receive information about the Surface, let alone buy one. In reality, a social network user can receive marketing information about an entity and about its competitors. When a product and its competitor are launched in the same market, some consumers may buy both and share their reviews of the two products with their friends. Although such consumers may have little direct effect on a product’s market share, the information keeps spreading, and the market share may then be changed by the decisions of their friends, who can choose between the two products according to their preferences. If most of their friends prefer the Surface to the iPad, sales of the Surface can be larger in this case. The widely used assumption is therefore unrealistic. Of course, because the iPad and the Surface compete, the probability that a potential purchaser is persuaded to buy a Surface depends on whether he or she already owns an iPad.

However, other assumptions made in previous studies are also idealized. For example, some studies assume that the opponent’s seeds are known. In practice, it is difficult to know one’s opponents, whose strategies are usually kept confidential. In addition, once some influential users of the social network are found, opponents may compete for them and eventually win the bidding, which is common in real competitive social advertising. Therefore, merely finding high-quality, influential seeds may not be a good marketing strategy. A few studies take a further step from this perspective and combine seed selection with allocation, allocating seeds so that the requirements of the clients (marketers) are satisfied or a Nash equilibrium is reached. However, they ignore the effect of influential nodes’ preferences on their final choice.

In this paper, we propose the competitive independent cascade (CIC) model and formulate two novel problems: the competition-based diversified-profit maximization (CDM) problem and the adaptive competition-based diversified-profit maximization (ACDM) problem. Both study the profit related to the adoption of a particular entity and integrate seed selection with online allocation in competitive social advertising. It should be emphasized that both problems allocate seeds to competing clients so as to maximize the sum of the seeds’ appraisals of their own choice and of the other seeds that pick the same client. This approach highlights the effect of potential consumers’ appraisal of products and is close to the actual situation. The main difference between the two problems is that the CDM problem takes a one-shot policy, whereas the ACDM problem uses the kR (k nodes per Round) strategy. The former selects all the nodes at the beginning, while the latter takes a more robust approach. The kR strategy consists of the following steps: pick k nodes as seeds and allocate them to the competing clients; observe the change in the states of all nodes, and pick the next k nodes based on these observations; repeat until the given budget is exhausted. Since the one-shot policy is a special case of the kR strategy, an algorithm designed for the ACDM problem is also suitable for the CDM problem. Therefore, we mainly focus on designing an algorithm for the ACDM problem. We divide the ACDM problem into two subproblems, seed selection and seed allocation, and design the AS algorithm and the OA algorithm for them respectively. Based on the AS and OA algorithms, a three-phase algorithm, named Adaptive Selection and Online Allocation (ASOA), is proposed to select seeds and allocate them under the kR policy. The final allocation is returned as a feasible solution.

The rest of this paper is organized as follows. Related works are reviewed in Sect. 2. The CIC model is introduced in Sect. 3, where the CDM and ACDM problems are also formulated. In Sect. 4, we present the framework of the ASOA algorithm designed for the ACDM problem. The experiments are presented in Sect. 5, and we conclude in Sect. 6.

2 Related works

Owing to its potential applications in different domains, influence propagation and maximization in online social networks has received significant attention in recent years from many different perspectives, e.g., Kempe et al. (2003), He and Kempe (2014), Chen et al. (2016), Shang et al. (2017), Song et al. (2019), Yuan and Tang (2017), Huang et al. (2020) and Tang and Yuan (2020). The classical IM problem is formulated as a discrete optimization problem and is proved to be NP-hard under two well-studied probabilistic diffusion models, the independent cascade (IC) model and the linear threshold (LT) model. The former captures the independent behavior of the nodes, while the latter captures their collective behavior. The triggering (TR) model introduced in Kempe et al. (2003) generalizes the IC and LT models. These three models are time-unaware models, in which the diffusion terminates only when no more nodes can be activated. Besides them, there are several time-aware diffusion models, in which diffusion happens in discrete steps or dissemination is continuous in time. It should be underlined that all of these models assume that activated nodes cannot be deactivated in later steps. Although it is proved in Chen et al. (2010) that computing the influence of a seed set is \(\#\)P-hard under the IC model, the optimal solution can be approximated if the influence function is monotone and submodular. A greedy algorithm with a \(1-1/e\) approximation guarantee has been proposed for the classical IM problem, whose objective function is non-negative, monotone and submodular; its high time complexity has led to numerous subsequent studies that improve the greedy algorithm or design heuristic algorithms. Many effective algorithms along these lines are proposed in Leskovec et al. (2007), Chen et al. (2009, 2010), Goyal et al. (2011), Yu et al. (2013), Tang et al. (2015) and Wang et al. (2017), among others.

Building on the classical IM problem, the competitive influence maximization (CIM) problem is abstracted from real-world social advertising, where many marketers compete with each other by launching comparable products in the same market. This extension of the IM problem allows information about at least two competitive products to spread simultaneously in the same social network. Previous studies mainly handle this problem from two perspectives: seed selection and budget allocation. The former consists of two cases. One supposes that seed selection happens in turn and solves the problem from the viewpoint of the last player to commit to a seed set, thereby taking advantage of knowing the opponents’ seed sets. Bharathi et al. (2007), Carnes et al. (2007), Li et al. (2017), Bozorgi et al. (2017) and Yan et al. (2019) study the problem under this assumption. The other case, studied in Li et al. (2015), Lin et al. (2015) and Ali et al. (2018), considers that seed selection happens simultaneously, and the main objective is to propose a framework for selecting the best method from a set of available seed selection algorithms. Studies such as Masucci and Silva (2014, 2017) and Varma et al. (2018, 2019) address a budget allocation problem in which companies compete through the amount of budget they allocate to each node in the network and each node chooses the highest bidder. A two-phase scenario integrating seed selection and budget allocation is proposed in Ansari et al. (2019); it pays more attention to convincing influential nodes to become seeds, and competition does not play an important role in its first phase.

The profit maximization (PM) problem, another extension of the IM problem, is studied by Tang et al. (2017, 2018), Du et al. (2020, 2021) and Shi et al. (2021). Unlike the IM problem, the PM problem is based on the fact that social network providers keep the structure of their networks as a business secret. It is therefore difficult for marketers to calculate the precise number of potential consumers who receive the promotional information and decide to purchase the promoted product. However, abundant information, including the follower–followee relationship, is available to the host of the social network, so the host can provide viral marketing services and charge commission. Thus, the spread of a product’s information determines the host’s benefit. At the same time, social network providers have to pay for propagating influence or for persuading influential nodes to become seeds. The PM problem is therefore posed from the perspective of social network hosts who want to gain as much profit as possible. In real marketing activities, advertisers or marketers usually delegate the operation of viral marketing campaigns to the social network provider, so the provider can be hired simultaneously by many competing companies. The objective function of the PM problem can be submodular or nonsubmodular. The objective function studied in Du et al. (2020, 2021) and Tang et al. (2017) generally loses submodularity. In contrast, the profit metric studied in Tang et al. (2018) is submodular because only the cost of the selected nodes is taken into consideration. Combining the PM and CIM problems, Shi et al. (2021) concentrate on maximizing the provider’s profit in the presence of competing companies in social advertising.

3 Problem formulation

3.1 The CIC model

The Com-IC model proposed in Lu et al. (2015) captures the relationship spectrum from complementary to competitive. In this paper, we focus on the competitive relationship between only two entities and call the resulting model the competitive independent cascade (CIC) model.

There are two entities \({\mathcal {A}}\) and \({\mathcal {B}}\) that want to spread promotional information in a social network abstracted as a directed graph \(G=(V,E)\). Each node \(v\in V\) represents a user of the social network and \(|V|=n\). The edge from u to a neighbor v, with \(u,v\in V\), is denoted by \((u,v)\in E\), and for each edge \((u,v)\), \(p_{u,v}\) is the probability with which information is successfully spread from u to v. During information dissemination in G, each node takes one of \(\{\text {idle, accepted, rejected}\}\) as its state for \({\mathcal {A}}\) and likewise for \({\mathcal {B}}\), so each node is in a joint state. After node v receives the information, whether its state changes depends on its current state and on a parameter set \(q(v) =\{q_{{\mathcal {A}}|\emptyset }(v), q_{{\mathcal {A}}|{\mathcal {B}}}(v), q_{{\mathcal {B}}|\emptyset }(v), q_{{\mathcal {B}}|{\mathcal {A}}}(v)\}\), where each parameter is the probability with which the corresponding state transition happens, i.e.,

  • \(({\mathcal {A}}\text {-idle}, {\mathcal {B}}\text {-accepted})\xrightarrow [q_{{\mathcal {A}}|{\mathcal {B}}}(v)]{\text { informed of } {\mathcal {A}}} ({\mathcal {A}}\text {-accepted}, {\mathcal {B}}\text {-accepted})\);

  • \(({\mathcal {A}}\text {-idle}, {\mathcal {B}}\text {-rejected/idle})\xrightarrow [q_{{\mathcal {A}}|\emptyset }(v)]{\text { informed of } {\mathcal {A}}} ({\mathcal {A}}\text {-accepted}, {\mathcal {B}}\text {-rejected/idle})\);

  • \(({\mathcal {A}}\text {-accepted}, {\mathcal {B}}\text {-idle})\xrightarrow [q_{{\mathcal {B}}|{\mathcal {A}}}(v)]{\text { informed of } {\mathcal {B}}} ({\mathcal {A}}\text {-accepted}, {\mathcal {B}}\text {-accepted})\);

  • \(({\mathcal {A}}\text {-rejected/idle}, {\mathcal {B}}\text {-idle})\xrightarrow [q_{{\mathcal {B}}|\emptyset }(v)]{\text { informed of } {\mathcal {B}}} ({\mathcal {A}}\text {-rejected/idle}, {\mathcal {B}}\text {-accepted})\).

We only explain the first one in detail. The current state \(({\mathcal {A}}\text {-idle}, {\mathcal {B}}\text {-accepted})\) of node v means that v has accepted \({\mathcal {B}}\) and hasn’t received the information about \({\mathcal {A}}\). Once it is informed of \({\mathcal {A}}\), its state will change to \(({\mathcal {A}}\text {-accepted}, {\mathcal {B}}\text {-accepted})\) with probability \(q_{{\mathcal {A}}|{\mathcal {B}}}(v)\).

To reflect the competitive relationships between entities \({\mathcal {A}}\) and \({\mathcal {B}}\), we assume that \(0\le q_{{\mathcal {A}}|{\mathcal {B}}}(v)< q_{{\mathcal {A}}|\emptyset }(v)\le 1\) and \(0\le q_{{\mathcal {B}}|{\mathcal {A}}}(v)< q_{{\mathcal {B}}|\emptyset }(v)\le 1\) for each \(v\in V\).

Now, we consider the CIC model as the information diffusion model. Let \( S_{{\mathcal {A}}}, S_{{\mathcal {B}}} \subset V\) be two seed sets. At time \(t=0\), each \( v\in S_{{\mathcal {A}}}\) accepts \({\mathcal {A}}\) and each \(u \in S_{{\mathcal {B}}}\) accepts \({\mathcal {B}}\). All other nodes initially stay in the joint state (\({\mathcal {A}}\)-idle, \({\mathcal {B}}\)-idle). At each time step \(t\ge 1\), for a node u that became \({\mathcal {A}}\)-accepted at time \(t-1\) and each of its neighbors v, information about \({\mathcal {A}}\) has exactly one chance to spread from u to v, succeeding with probability \(p_{u,v}\); information about \({\mathcal {B}}\) spreads in the same way, and \(p_{u,v}\) is the same for both \({\mathcal {A}}\) and \({\mathcal {B}}\). Then, according to v’s current state and q(v), its state may change. A special case deserves emphasis: node v is in the joint state (\({\mathcal {A}}\)-idle, \({\mathcal {B}}\)-idle) and is informed of both \({\mathcal {A}}\) and \({\mathcal {B}}\) by its neighbors at the same step. In this case, a tie-breaking rule decides which state node v moves to. It generally consists of two cases: either \({\mathcal {A}}\) is superior to \({\mathcal {B}}\), that is, nodes prefer \({\mathcal {A}}\) to \({\mathcal {B}}\) and always adopt \({\mathcal {A}}\) when a choice between \({\mathcal {A}}\) and \({\mathcal {B}}\) must be made; or \({\mathcal {A}}\) is inferior to \({\mathcal {B}}\). The process stops when no more nodes can be activated. When the diffusion terminates, each node’s adoption is fixed and the profit generated by the adopters of \({\mathcal {A}}\) can be calculated.
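The diffusion process described above can be illustrated with a minimal Python sketch of a single realization. Several details are assumptions made only for illustration and are not fixed by the text: the function name `simulate_cic`, the dictionary-based inputs, the rule that an informed node that does not accept an entity becomes rejected for it (as in Com-IC), and the reading of the tie-breaking rule as a processing order (the superior entity is resolved first in each step). Edges are sampled once in advance, in the spirit of a live-edge graph.

```python
import random


def simulate_cic(edges, q, seeds_a, seeds_b, a_first=True, rng=None):
    """Simulate one CIC realization (illustrative sketch).

    edges: {(u, v): p_uv}
    q:     {v: (q_A_empty, q_A_given_B, q_B_empty, q_B_given_A)}, one entry per node
    Returns {node: (state_for_A, state_for_B)}.
    """
    rng = rng or random.Random()
    # Sample the live edges once; an edge is kept with probability p_uv.
    out = {}
    for (u, v), p in edges.items():
        if rng.random() < p:
            out.setdefault(u, []).append(v)

    state = {v: {"A": "idle", "B": "idle"} for v in q}
    for v in seeds_a:
        state[v]["A"] = "accepted"
    for v in seeds_b:
        state[v]["B"] = "accepted"

    frontier = {"A": set(seeds_a), "B": set(seeds_b)}
    # Assumed tie-breaking: the superior entity is processed first.
    order = ("A", "B") if a_first else ("B", "A")
    while frontier["A"] or frontier["B"]:
        # Nodes informed of each entity in this step via live edges.
        informed = {
            x: {v for u in frontier[x] for v in out.get(u, [])} for x in order
        }
        new_frontier = {"A": set(), "B": set()}
        for x in order:
            other = "B" if x == "A" else "A"
            for v in sorted(informed[x]):
                if state[v][x] != "idle":
                    continue  # the state for this entity is already decided
                if x == "A":
                    q_empty, q_given = q[v][0], q[v][1]
                else:
                    q_empty, q_given = q[v][2], q[v][3]
                prob = q_given if state[v][other] == "accepted" else q_empty
                if rng.random() < prob:
                    state[v][x] = "accepted"
                    new_frontier[x].add(v)
                else:
                    state[v][x] = "rejected"  # assumption borrowed from Com-IC
        frontier = new_frontier
    return {v: (s["A"], s["B"]) for v, s in state.items()}
```

For instance, with two seeds pointing at the same node and \(q_{{\mathcal {B}}|{\mathcal {A}}}(v)=0\), the node ends up adopting only the entity that is processed first, which mirrors the superior/inferior distinction.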

3.2 Allocation and social welfare

In this part, we first introduce the definition of an allocation and its social welfare under the no-rejection condition. Given a set of candidates \(S_{\text {cand}}\) and the agents of entities \({\mathcal {A}}\) and \({\mathcal {B}}\), the no-rejection condition requires each candidate to choose an agent, and the chosen agent cannot reject the candidate. Therefore, all the candidates are allocated by the end of the allocation process, and an allocation A of \(S_{\text {cand}}\) records each candidate’s choice. In other words, an allocation A is a non-overlapping partition of the nodes in \(S_{\text {cand}}\). In this paper, two agents are considered and allocation A can be expressed as \(A=\{S_{{\mathcal {A}}},S_{{\mathcal {B}}}\}\) satisfying \(S_{\text {cand}}=S_{{\mathcal {A}}}\cup S_{{\mathcal {B}}}\) and \(S_{{\mathcal {A}}}\cap S_{{\mathcal {B}}}=\emptyset \), where \(S_{{\mathcal {A}}}\) and \(S_{{\mathcal {B}}}\) are the sets of nodes allocated to the agents of entities \({\mathcal {A}}\) and \({\mathcal {B}}\) respectively. Let \(H^{|V|\times |V|}\) be a happiness matrix where \(h_{u,v}\in H\) denotes the degree of candidate u’s happiness when u finds that v chooses the same agent. For each candidate u, \(r_{u,{\mathcal {A}}}\) denotes the appraisal of candidate u on \({\mathcal {A}}\) and \(r_{u,{\mathcal {B}}}\) denotes the appraisal of u on \({\mathcal {B}}\). Now, we recall two definitions from Huzhang et al. (2017) as follows.

Definition 1

Given two agents \({\mathcal {A}}\) and \({\mathcal {B}}\) and four candidates \(u,v\in S_{\mathcal {A}}\) and \(x,y\in S_{\mathcal {B}}\), where \( S_{\mathcal {A}}\) and \( S_{\mathcal {B}}\) consist of all the nodes allocated to \({\mathcal {A}}\) and \({\mathcal {B}}\) respectively,

  1. the social welfare of allocation A is defined as

    $$\begin{aligned} {\textit{SW}}(A)=h_{u,v}+h_{v,u}+r_{u,{\mathcal {A}}}+r_{v,{\mathcal {A}}} +h_{x,y}+h_{y,x}+r_{x,{\mathcal {B}}}+r_{y,{\mathcal {B}}}. \end{aligned}$$

  2. the utility of candidate u is defined as \(U_{u}=\sum _{v\in S_{{\mathcal {A}}}}h_{u,v}+r_{u,{\mathcal {A}}}\).

That is, a candidate’s utility is the sum of its happiness toward all the other candidates with the same agent and its appraisal of that agent. Weak agent stability is defined on this basis.
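As a small illustration of Definition 1, the following Python sketch computes the utility of a candidate and the social welfare of an allocation. It assumes, beyond what the definition states for four candidates, that the social welfare of a general allocation is the sum of all candidates’ utilities, and that a candidate’s happiness toward itself is not counted; the function names and the nested-dictionary layout of `h` and `r` are hypothetical.

```python
def utility(u, group, agent, h, r):
    """Utility of candidate u placed with `agent` together with `group`.

    h[u][v]: happiness of u when v chooses the same agent.
    r[u][agent]: appraisal of u on the agent.
    """
    return sum(h[u][v] for v in group if v != u) + r[u][agent]


def social_welfare(allocation, h, r):
    """Sum of utilities over all candidates (assumed general form of SW).

    allocation: {"A": set_of_candidates, "B": set_of_candidates}
    """
    return sum(
        utility(u, group, agent, h, r)
        for agent, group in allocation.items()
        for u in group
    )
```

With two candidates per agent, this sum reduces term by term to the expression for \({\textit{SW}}(A)\) given in Definition 1.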

Definition 2

(Definition 3 in Huzhang et al. 2017) An allocation is weakly agent stable if, for any two candidates u, v that choose agent \({\mathcal {A}}\) and any two candidates x, y that choose agent \({\mathcal {B}}\), switching their choices cannot increase the utilities of all four candidates.
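One possible reading of Definition 2 can be checked by brute force, as in the sketch below. The interpretation is an assumption: u and v are taken to be distinct, as are x and y; "switching" means that the pair from \({\mathcal {A}}\) and the pair from \({\mathcal {B}}\) exchange agents; and the allocation is unstable only if all four utilities strictly increase. The function names are hypothetical.

```python
from itertools import combinations


def utility(u, group, agent, h, r):
    """Happiness toward the other members of the group plus appraisal of the agent."""
    return sum(h[u][v] for v in group if v != u) + r[u][agent]


def is_weakly_agent_stable(s_a, s_b, h, r):
    """Return False if some pair swap strictly improves all four candidates."""
    s_a, s_b = set(s_a), set(s_b)
    for u, v in combinations(sorted(s_a), 2):
        for x, y in combinations(sorted(s_b), 2):
            # Groups after the two pairs exchange agents.
            new_a = (s_a - {u, v}) | {x, y}
            new_b = (s_b - {x, y}) | {u, v}
            before = [
                utility(u, s_a, "A", h, r), utility(v, s_a, "A", h, r),
                utility(x, s_b, "B", h, r), utility(y, s_b, "B", h, r),
            ]
            after = [
                utility(u, new_b, "B", h, r), utility(v, new_b, "B", h, r),
                utility(x, new_a, "A", h, r), utility(y, new_a, "A", h, r),
            ]
            if all(a > b for a, b in zip(after, before)):
                return False
    return True
```

The check enumerates all pairs on both sides, so it is only practical for small candidate sets; it is meant to make the condition concrete rather than to be efficient.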

We assume that the candidates arrive one by one in uniformly random order. Under this assumption, when candidate i arrives, its happiness values \(h_{i,j}\) toward all candidates j, as well as its appraisals of \({\mathcal {A}}\) and \({\mathcal {B}}\), are revealed, and it must be assigned immediately.

3.3 CDM problem

The CIC model is equivalent to a live edge graph process. For each edge \((u,v)\in E\), generate a number uniformly at random from [0, 1] in advance. Retain the edge if the number is not larger than \(p_{u,v}\), and remove it otherwise. Based on the live edge graph process, given an allocation \(A=\{S_{{\mathcal {A}}},S_{{\mathcal {B}}}\}\), the profit generated by \({\mathcal {A}}\)-adopters can be written as an expectation. Profit can be expressed as the difference between benefit and cost. Here, let \(\phi _{{\mathcal {A}}}(v)\in [0,1]\) represent the modified profit with respect to entity \({\mathcal {A}}\) generated by node v when it adopts \({\mathcal {A}}\). We make the following assumptions.

  1. A node v that does not accept \({\mathcal {A}}\) cannot generate profit with respect to \({\mathcal {A}}\), regardless of its state for \({\mathcal {B}}\).

  2. A node v that accepts both \({\mathcal {A}}\) and \({\mathcal {B}}\) can spread the information about both \({\mathcal {A}}\) and \({\mathcal {B}}\) to its neighbors. However, v does not generate profit with respect to \({\mathcal {A}}\) and is therefore ignored when calculating the total profit generated by \({\mathcal {A}}\)-adopters.

Now, integrating seed selection and allocation, we define the diversified-profit function \({\textit{DP}}(A)\) as the weighted sum of the social welfare \({\textit{SW}}(A)\) and the profit \(\varPhi _{{\mathcal {A}}}(S_{{\mathcal {A}}}|A,S_{{\mathcal {B}}}|A)\) generated by \({\mathcal {A}}\)-adopters. Hence,

$$\begin{aligned} {\textit{DP}}(A)=\varPhi _{{\mathcal {A}}}(S_{{\mathcal {A}}}|A,S_{{\mathcal {B}}}|A)+\lambda {\textit{SW}}(A) \end{aligned}$$
(1)

where \(\varPhi _{{\mathcal {A}}}(S_{{\mathcal {A}}}|A,S_{{\mathcal {B}}}|A)=\sum _{v\in I_{g,A}}\phi _{{\mathcal {A}}}(v)\), \(\lambda \) is a weight parameter representing the importance of social welfare, and \(I_{g,A}\) is the set of nodes that receive the information spread from \(S_{{\mathcal {A}}}\) and become (\({\mathcal {A}}\)-accepted, \({\mathcal {B}}\)-idle/rejected) under the CIC model in a subgraph g with allocation \(A=\{S_{{\mathcal {A}}},S_{{\mathcal {B}}}\}\). We can now define the CDM problem.

Definition 3

(CDM problem) Given a directed graph \(G=(V,E)\), a weighted adjacency matrix P, a happiness matrix H, an appraisal matrix R, a budget K, two competitive entities \({\mathcal {A}}\) and \({\mathcal {B}}\), and a parameter set q(v) and profit \(\phi _{{\mathcal {A}}}(v)\) for each node, the Competition-based Diversified-profit Maximization problem aims to find a weakly agent stable allocation \(A^{*}=\{S_{{\mathcal {A}}},S_{{\mathcal {B}}}\}\) containing at most K seeds such that the expected value of the diversified-profit function \({\textit{DP}}(A^{*})\) is maximized under the CIC model, i.e.

$$\begin{aligned} A^{*}\in \arg \max _{A}{\mathbb {E}}[{\textit{DP}}(A)] \end{aligned}$$
(2)
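To make the objective in Eq. (1) concrete, the sketch below evaluates the diversified profit for one diffusion outcome and averages it over sampled outcomes to approximate the expectation in Eq. (2). The representation of a final outcome as `{node: (state_for_A, state_for_B)}`, the function names, and the choice to count every node in the state (\({\mathcal {A}}\)-accepted, \({\mathcal {B}}\)-idle/rejected), seeds included, are assumptions made for illustration; the sampler `sample_final_state` is a hypothetical callable that returns one outcome per call.

```python
def diversified_profit(final_state, phi_a, sw_value, lam):
    """DP for one realization: profit of A-only adopters plus lam * SW."""
    profit = sum(
        phi_a[v]
        for v, (state_a, state_b) in final_state.items()
        # Nodes that also accepted B generate no profit for A.
        if state_a == "accepted" and state_b != "accepted"
    )
    return profit + lam * sw_value


def estimate_expected_dp(sample_final_state, phi_a, sw_value, lam, runs):
    """Monte Carlo average of DP over `runs` sampled realizations."""
    total = sum(
        diversified_profit(sample_final_state(), phi_a, sw_value, lam)
        for _ in range(runs)
    )
    return total / runs
```

Since the social welfare depends only on the allocation, it is passed in as a precomputed value and only the profit term varies across realizations.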

We propose the following theorem.

Theorem 1

The CDM problem is NP-hard, and computing the exact value of \({\textit{DP}}(A)\) for a fixed allocation \(A=\{S_{{\mathcal {A}}},S_{{\mathcal {B}}}\}\) is \(\#\)P-hard. Moreover, \({\textit{DP}}(A)\) is nonsubmodular.

Proof

We construct a special CDM problem as follows:

  1. Let every element in H be 1.

  2. Let \(r_{v,{\mathcal {A}}}=1\) and \(r_{v,{\mathcal {B}}}=0\) for each node \(v\in V\).

  3. Let \(\phi _{{\mathcal {A}}}(v)=1\) for each node \(v\in V\).

The special CDM problem consists of two subproblems: one is to find K seeds that maximize \(\varPhi _{{\mathcal {A}}}(S_{{\mathcal {A}}})\); the other is to allocate all the selected nodes so as to obtain a weakly agent stable allocation. Owing to the first two settings, \(S_{{\mathcal {B}}}=\emptyset \), and we regard the first subproblem as the main task. This subproblem subsumes the classical Influence Maximization problem when \(q_{{\mathcal {A}}|\emptyset }(v)=q_{{\mathcal {B}}|\emptyset }(v)=1\) and \(q_{{\mathcal {A}}|{\mathcal {B}}}(v)=q_{{\mathcal {B}}|{\mathcal {A}}}(v)=0\) for each node \(v\in V\). Based on the analysis of the classical IM problem in Kempe et al. (2003), we conclude that the CDM problem is NP-hard.

By a similar argument, based on the result in Chen et al. (2010), it is \(\#\)P-hard to compute the exact value of \(\varPhi _{{\mathcal {A}}}(S_{{\mathcal {A}}}|A,S_{{\mathcal {B}}}|A)\) for any given \(S_{{\mathcal {A}}}\) and \(S_{{\mathcal {B}}}\). So computing the exact value of \({\textit{DP}}(A)\) for a fixed allocation \(A=\{S_{{\mathcal {A}}},S_{{\mathcal {B}}}\}\) is \(\#\)P-hard.

As for the submodularity analysis, we focus on the property of the profit function. Based on the submodularity analysis for the competitive cases of the Com-IC model in Lu et al. (2015), we conclude that even if \(S_{{\mathcal {B}}}\) is fixed, \(\varPhi _{{\mathcal {A}}}(S_{{\mathcal {A}}}|A,S_{{\mathcal {B}}}|A)\) is not submodular with respect to \(S_{{\mathcal {A}}}\) in general. Since the nonsubmodularity of \({\textit{DP}}(A)\) results from the nonsubmodularity of \(\varPhi _{{\mathcal {A}}}(S_{{\mathcal {A}}}|A,S_{{\mathcal {B}}}|A)\), the proof is complete. \(\square \)
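For reference, the standard notion of submodularity invoked in this argument can be stated as follows; the symbols f, S, T and x are generic and are not part of the notation used elsewhere in this paper. A set function \(f:2^{V}\rightarrow {\mathbb {R}}\) is submodular if, for all \(S\subseteq T\subseteq V\) and every \(x\in V{\setminus } T\),

$$\begin{aligned} f(S\cup \{x\})-f(S)\ge f(T\cup \{x\})-f(T). \end{aligned}$$

Saying that \(\varPhi _{{\mathcal {A}}}\) is not submodular with respect to \(S_{{\mathcal {A}}}\) in general means that this diminishing-returns inequality fails for some instance and some choice of S, T and x.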

3.4 ACDM problem

The CDM problem adopts the one-shot policy, in which the budget is exhausted by selecting and activating all seed nodes at the beginning, and nothing can be done in the subsequent process to improve the final result. If feedback on seed selection is available, selecting seed nodes with an adaptive policy is a more flexible and effective strategy. Inspired by this idea, we propose the ACDM problem, which selects at most k seed nodes in each round and allocates them to the clients of the competing entities \({\mathcal {A}}\) and \({\mathcal {B}}\) such that the value of the diversified-profit function \({\textit{DP}}(A)\) over T rounds is maximized. At the beginning of each round, the choice of seeds for the current round depends on the propagation and allocation results observed in the previous round.

The formal definition follows the framework and terminology introduced in Sun et al. (2018). We denote by \(S_{t}\) the seed set chosen in round t and by \(A_{t}\) an allocation of \(S_{t}\). Let \(S_{{\mathcal {A}},t}\) consist of all the nodes that belong to \(S_{t}\) and are allocated to \({\mathcal {A}}\) by allocation \(A_{t}\). Similarly, \(S_{{\mathcal {B}},t}\) is the set of nodes that are selected as \({\mathcal {B}}\)-seeds. We call \((S_{{\mathcal {A}},t},S_{{\mathcal {B}},t},t)\) an item and use \({\mathcal {E}}\) to denote the set of all possible items. For each item, after the propagation, the nodes and edges that participated in the propagation are observed as the feedback. Each edge \((u,v)\in E\) is deleted with probability \(1-p_{u,v}\) or kept with probability \(p_{u,v}\), and each node \(v\in V\) can be in any of the states \(\{\text {idle, accepted, rejected}\}\) with respect to each entity, so there are nine kinds of joint state. Among them, three kinds, i.e. \(({\mathcal {A}}\text {-idle},{\mathcal {B}}\text {-idle})\), \(({\mathcal {A}}\text {-idle},{\mathcal {B}}\text {-accepted})\) and \(({\mathcal {A}}\text {-idle},{\mathcal {B}}\text {-rejected})\), are more important, because nodes in these three joint states can be selected as seed nodes in the next round. For nodes in the joint states (\({\mathcal {A}}\)-accepted, \({\mathcal {B}}\)-idle), (\({\mathcal {A}}\)-accepted, \({\mathcal {B}}\)-accepted), (\({\mathcal {A}}\)-accepted, \({\mathcal {B}}\)-rejected), (\({\mathcal {A}}\)-rejected, \({\mathcal {B}}\)-idle), (\({\mathcal {A}}\)-rejected, \({\mathcal {B}}\)-accepted), and (\({\mathcal {A}}\)-rejected, \({\mathcal {B}}\)-rejected), the state with respect to \({\mathcal {A}}\) is already determined. Formally, the feedback is regarded as a state, which is equivalent to a subgraph of G in which every node can be reached from \(S_{t}=S_{{\mathcal {A}},t}\cup S_{{\mathcal {B}},t}\) and stays in a known state based on previous observations. Let \(\psi :{\mathcal {E}}\rightarrow {\mathcal {O}}\) be a function mapping every possible item \((S_{{\mathcal {A}},t},S_{{\mathcal {B}},t},t)\) to a state, where \({\mathcal {O}}\) is the set of all possible states. We call \(\psi \) a realization and \(\psi (S_{{\mathcal {A}},t},S_{{\mathcal {B}},t},t)\) the state of \((S_{{\mathcal {A}},t},S_{{\mathcal {B}},t},t)\) under realization \(\psi \). In each round t of the ACDM problem, we pick item \((S_{{\mathcal {A}},t},S_{{\mathcal {B}},t},t)\) and observe its state \(\psi (S_{{\mathcal {A}},t},S_{{\mathcal {B}},t},t)\). Then we pick \((S_{{\mathcal {A}},t+1},S_{{\mathcal {B}},t+1},t+1)\) based on \(\psi (S_{{\mathcal {A}},t},S_{{\mathcal {B}},t},t)\) and observe its state \(\psi (S_{{\mathcal {A}},t+1},S_{{\mathcal {B}},t+1},t+1)\).

We denote by \(\pi \) a policy, i.e., our adaptive strategy for picking items based on realization \(\psi \). A policy \(\pi \) is a function from realization \(\psi \) to allocation A: it selects nodes as seeds based on previous observations and obtains a weakly agent stable allocation of them. Therefore, unlike previous works, our strategy has to complete two tasks: selecting nodes that maximize the profit generated by \({\mathcal {A}}\)-adopters, and finding a weakly agent stable allocation that allocates the seed nodes to \({\mathcal {A}}\) and \({\mathcal {B}}\).

Relying on the above definitions, we redefine the diversified-profit function as follows.

$$\begin{aligned} {\textit{DP}}(A_{\psi }) =\varPhi _{{\mathcal {A}}}(S_{{\mathcal {A}}}|A_{\psi },S_{{\mathcal {B}}}|A_{\psi })+\lambda {\textit{SW}}(A_\psi )=\sum _{v\in I_{g,A_\psi }}\phi _{{\mathcal {A}}}(v)+\lambda {\textit{SW}}(A_\psi ) \end{aligned}$$
(3)

where \(I_{g,A_{\psi }}\) is the set of nodes that receive the information spread from \(S_{{\mathcal {A}}}\) and become (\({\mathcal {A}}\)-accepted, \({\mathcal {B}}\)-idle/rejected) under the CIC model with allocation \(A=\{S_{{\mathcal {A}}},S_{{\mathcal {B}}}\}\) under realization \(\psi \). Denote by \(A^{\pi }_{\psi }\) an allocation of the seeds selected by policy \(\pi \) under realization \(\psi \). With this notation, the expected diversified profit of a policy \(\pi \) is defined as \({\mathbb {E}}[{\textit{DP}}(A^{\pi }_{\psi })]\), where the expectation is taken with respect to \(p(\psi )\), a known probability distribution over realizations.

Definition 4

(ACDM problem) Given a directed graph \(G=(V,E)\), a weighted adjacency matrix P, a happiness matrix H, an appraisal matrix R, a budget K, the number of rounds T, two competitive entities \({\mathcal {A}}\) and \({\mathcal {B}}\), and a parameter set q(v) and profit \(\phi _{{\mathcal {A}}}(v)\) for each node, the Adaptive Competition-based Diversified-profit Maximization problem aims to find an allocation A with policy \(\pi ^{*}\) such that the expected value of the diversified-profit function is maximized under the CIC model, i.e.,

$$\begin{aligned} A^{\pi ^*}\in \arg \max {\mathbb {E}}[{\textit{DP}}(A^{\pi }_{\psi })] \end{aligned}$$
(4)

Unlike the CDM problem, the ACDM problem is posed under the kR (k nodes per Round) setting introduced in Shi et al. (2019). The budget K is divided into T equal-sized parts and k nodes are processed in each round \(t\in [T]\). The requirement on the allocation is the same as that for the CDM problem. We then propose the following theorem.

Theorem 2

The ACDM problem is NP-hard. In addition, the diversified-profit function \({\textit{DP}}(A^{\pi }_{\psi })\) is nonsubmodular, and it is \(\#\)P-hard to compute even if the allocation A with policy \(\pi ^{*}\) is given.

Proof

By the definitions of the CDM and ACDM problems, the ACDM problem subsumes the CDM problem when \(T=1\) and \(k=K\). Therefore, this theorem follows from Theorem 1. \(\square \)

4 The algorithm

Given that the ACDM problem subsumes the CDM problem, we concentrate on designing an algorithm that finds a feasible solution for the ACDM problem. By definition, the diversified profit is the sum of the profit generated by adopters and the social welfare of an allocation of the seed nodes, so the ACDM problem can be divided into two subproblems, seed selection and seed allocation. Since the results of seed selection and allocation influence each other, we propose the Adaptive Selection and Online Allocation (ASOA) algorithm. It combines two algorithms, one for each subproblem, and returns a feasible solution for the ACDM problem. It works in three steps:

  1. Step 1

    Select a candidate node set S consisting of k nodes such that the profit generated by \({\mathcal {A}}\)-adopters, given the current seed sets \(S_{{\mathcal {A}}}\) and \(S_{\mathcal {B}}\), is maximized.

  2. Step 2

    Obtain a weakly agent stable allocation of all the nodes in S and update the seed sets of both \({\mathcal {A}}\) and \({\mathcal {B}}\).

  3. Step 3

    Repeat Steps 1 and 2 until T rounds have been completed.

figure a

We now introduce the two algorithms for seed selection and seed allocation. In each round, inspired by Narayanam and Narahari (2011), we first apply the AS algorithm, detailed in Algorithm 1, to find a candidate node set that maximizes the profit generated by adopters of entity \({\mathcal {A}}\). Unlike the greedy algorithm proposed in Kempe et al. (2003), it models the nodes of the social network as players in a coalitional game and captures the information diffusion process as the process of coalition formation in that game. The algorithm is based on the concept of the Shapley value. As shown in Narayanam and Narahari (2011), the Shapley value \(\varPhi _{i}(N,v)\) of a player i is given by \(\varPhi _{i}(N,v)=\sum _{C\subseteq N{\setminus }\{i\}}{\vert C\vert !\left( n-\vert C\vert -1\right) !\over n!}\left\{ v\left( C\cup \{i\}\right) -v(C)\right\} \), where \(n=\vert N\vert \), and it is hard to compute exactly because the sum ranges over exponentially many coalitions. Hence, a sampling-based approximation that runs in polynomial time is used. Given the two seed sets \(S_{{\mathcal {A}}}\) and \(S_{{\mathcal {B}}}\), the AS algorithm computes a ranking of the nodes by Shapley value and picks the top-k nodes as candidates to be allocated in the next step.
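To illustrate the sampling idea, the following minimal sketch estimates Shapley values by averaging marginal contributions over random orderings of the players and then ranks the players. It is not Algorithm 1 itself: the names `sampled_shapley`, `top_k`, `coverage` and `OUT_NEIGHBOURS` are hypothetical, and the toy coverage function merely stands in for the profit of \({\mathcal {A}}\)-adopters under the CIC model, which would have to be estimated by simulation.

```python
import random
from typing import Callable, Dict, FrozenSet, Hashable, List, Sequence

Player = Hashable


def sampled_shapley(
    players: Sequence[Player],
    value: Callable[[FrozenSet[Player]], float],
    num_permutations: int = 200,
    seed: int = 0,
) -> Dict[Player, float]:
    """Estimate Shapley values by averaging each player's marginal
    contribution over uniformly random orderings of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition: set = set()
        previous = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            current = value(frozenset(coalition))
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    return {p: total / num_permutations for p, total in totals.items()}


def top_k(scores: Dict[Player, float], k: int) -> List[Player]:
    """Return the k players with the largest estimated Shapley value."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy game (an assumption for illustration only): the value of a
# coalition is the number of nodes it covers, i.e. its members plus their
# out-neighbours in a four-node graph.
OUT_NEIGHBOURS = {0: {1, 2}, 1: {2}, 2: set(), 3: {0}}


def coverage(coalition: FrozenSet[int]) -> float:
    covered = set(coalition)
    for node in coalition:
        covered |= OUT_NEIGHBOURS[node]
    return float(len(covered))


scores = sampled_shapley(list(OUT_NEIGHBOURS), coverage, num_permutations=2000)
candidates = top_k(scores, k=2)
```

Each sampled ordering costs n evaluations of the value function, so the estimate needs a number of evaluations that is linear in n times the number of sampled orderings, instead of a sum over all \(2^{n-1}\) coalitions. Within every ordering the marginal contributions telescope, so the estimates always sum to \(v(N)-v(\emptyset )\).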

figure b

Subsequently, Algorithm 2 is used to allocate all the nodes obtained by the AS algorithm. It is inspired by Algorithm 2 in Huzhang et al. (2017), and its central idea is online no-rejection bipartite matching. Initially, \(\varGamma \) is a set of two vertices representing entities \({\mathcal {A}}\) and \({\mathcal {B}}\). The nodes in S returned by Algorithm 1 arrive one by one, in the order of their indices in S. We take the first two nodes and allocate them in turn to unmatched vertices in \(\varGamma \). Each node is then combined with its choice into a unit, and these units become the updated vertices of \(\varGamma \). After that, the next two nodes, with adjusted appraisals, are matched to the updated vertices, and the process continues until all the nodes in S have been allocated.

figure c

In a nutshell, the ASOA algorithm, shown as Algorithm 3, combines the AS algorithm and the OA algorithm. To address the CDM problem, set \(T=1\) and \(k=K\); otherwise, set \(k=K/T\) and repeat the following process T times. First, based on the partial realization, which represents previous observations, it finds a seed set S with \(|S|=k\) that maximizes the profit \(\phi _{{\mathcal {A}}}(S)\). Then it allocates all the nodes in S to obtain a weakly agent stable allocation whose social welfare is maximized. Finally, based on the outcome of the seed allocation, it updates \(S_{{\mathcal {A}}}\), \(S_{{\mathcal {B}}}\) and the realization.
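The control flow of this round-based procedure can be sketched as follows. This is only a skeleton under stated assumptions, not Algorithm 3: `select`, `allocate` and `observe` are hypothetical callables standing in for the AS algorithm, the OA algorithm and the observation of the diffusion outcome, and the toy versions below exist only to make the sketch runnable.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple

Node = Hashable
Realization = Dict[Node, str]  # hypothetical encoding of the observed node states


def asoa(
    K: int,
    T: int,
    select: Callable[[int, Set[Node], Set[Node], Realization], List[Node]],
    allocate: Callable[[List[Node], Set[Node], Set[Node]], Tuple[List[Node], List[Node]]],
    observe: Callable[[Set[Node], Set[Node], Realization], Realization],
) -> Tuple[Set[Node], Set[Node]]:
    """Run T rounds of 'select k candidates, allocate them, observe feedback'."""
    if T <= 0 or K % T != 0:
        raise ValueError("the budget K must split into T equal parts")
    k = K // T  # k = K / T seeds per round; T = 1 gives the one-shot (CDM) case
    seeds_a: Set[Node] = set()
    seeds_b: Set[Node] = set()
    realization: Realization = {}
    for _ in range(T):
        candidates = select(k, seeds_a, seeds_b, realization)    # seed selection
        to_a, to_b = allocate(candidates, seeds_a, seeds_b)      # seed allocation
        seeds_a |= set(to_a)
        seeds_b |= set(to_b)
        realization = observe(seeds_a, seeds_b, realization)     # update realization
    return seeds_a, seeds_b


# Toy stand-ins (assumptions for illustration only).
def toy_select(k, seeds_a, seeds_b, realization):
    used = seeds_a | seeds_b
    return [v for v in range(100) if v not in used][:k]


def toy_allocate(candidates, seeds_a, seeds_b):
    return candidates[0::2], candidates[1::2]  # alternate between the two entities


def toy_observe(seeds_a, seeds_b, realization):
    updated = dict(realization)
    updated.update({v: "A" for v in seeds_a})
    updated.update({v: "B" for v in seeds_b})
    return updated


adaptive = asoa(10, 5, toy_select, toy_allocate, toy_observe)
one_shot = asoa(10, 1, toy_select, toy_allocate, toy_observe)
```

Passing the selection and allocation routines as arguments keeps the round structure separate from the two subproblems, so the same loop covers the adaptive case and, with \(T=1\), the non-adaptive one.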

5 Experiments

5.1 Experimental settings

We conduct a series of experiments on three real social networks. They are open datasets, denoted as PHH, MI and Email, and we introduce them in turn.

  • PHH (Petster-Hamster-Household dataset in Kunegis 2013): This dataset contains 921 nodes and 4032 edges, where a node represents an individual and an edge represents a friendship.

  • MI (Moreno-Innovation dataset in Kunegis 2013): This directed network captures innovation spread among physicians and consists of 246 nodes and 1098 edges. A node represents a physician and an edge between two physicians shows their friendship or cooperation.

  • Email (email-Eu-core dataset in SNAP website): The network is generated using email data representing communication between members from a large European research institution. 25,571 emails between 1005 members are recorded.
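As a quick sanity check on the sizes listed above, the following sketch computes the number of edges per node for each network. Taking the average degree to be \(|E|/|V|\) and treating each recorded email as one edge are assumptions made here for illustration. Under that reading PHH gives roughly 4.378, in line with the average degree of 4.3778 quoted for PHH in Sect. 5.2.

```python
from typing import Dict, Tuple

# (number of nodes, number of edges) as listed in the dataset descriptions.
# Counting every recorded email as one edge is an assumption.
DATASETS: Dict[str, Tuple[int, int]] = {
    "PHH": (921, 4032),
    "MI": (246, 1098),
    "Email": (1005, 25571),
}


def average_degree(num_nodes: int, num_edges: int) -> float:
    """Edges per node, i.e. |E| / |V| (assumed convention)."""
    return num_edges / num_nodes


avg = {name: average_degree(n, m) for name, (n, m) in DATASETS.items()}
for name, value in avg.items():
    print(f"{name}: {value:.4f}")
```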

In addition, we use the following parameter settings:

  • The propagation probability for each node in the CIC model is randomly generated from [0, 1], and the value of each element of Q is obtained in the same way.

  • Set the profit \(\phi _{{\mathcal {A}}}\) of 90% of the nodes to 1 and the profit of the remaining nodes to 0.5.

  • The happiness matrix H and appraisal matrix R are randomly generated, with every element confined to [0, 1].

  • Set T to be 5.

  • The number of seeds selected in each round varies with the seed size K, which is chosen from \(\{10,20,30,40,50\}\).
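The settings above can be reproduced along the following lines. This is a minimal sketch under several assumptions that the list does not fix: the function name `make_settings` is hypothetical, q is taken to be one scalar per node, H is taken to be an n-by-n matrix, R an n-by-2 matrix (one column per entity), and the 90% of nodes with profit 1 are picked uniformly at random.

```python
import random
from typing import Dict, List, Sequence

T = 5                             # number of rounds
SEED_SIZES = (10, 20, 30, 40, 50)  # candidate values of the budget K


def make_settings(nodes: Sequence[int], seed: int = 0) -> Dict[str, object]:
    """Draw one random parameter configuration for a given node set."""
    rng = random.Random(seed)
    n = len(nodes)
    # Propagation probability and q drawn uniformly from [0, 1] for each node.
    prob = {v: rng.uniform(0.0, 1.0) for v in nodes}
    q = {v: rng.uniform(0.0, 1.0) for v in nodes}  # assumed: scalar per node
    # Profit 1 for 90% of the nodes, 0.5 for the rest (random choice assumed).
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    n_high = round(0.9 * n)
    profit = {v: (1.0 if i < n_high else 0.5) for i, v in enumerate(shuffled)}
    # Happiness and appraisal matrices with entries in [0, 1] (shapes assumed).
    H: List[List[float]] = [[rng.uniform(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    R: List[List[float]] = [[rng.uniform(0.0, 1.0) for _ in range(2)] for _ in range(n)]
    return {"prob": prob, "q": q, "profit": profit, "H": H, "R": R}


def seeds_per_round(K: int, rounds: int = T) -> int:
    """k = K / T seeds are selected in each round."""
    return K // rounds


settings = make_settings(list(range(20)))
per_round = {K: seeds_per_round(K) for K in SEED_SIZES}
```

With \(T=5\), the five budgets correspond to 2, 4, 6, 8 and 10 seeds per round.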

As this is the first work on the ACDM problem, no other applicable algorithm is available for comparison. We therefore use the ASOA algorithm with a one-shot policy, which addresses the CDM problem, as the baseline and denote it as non-adaptive. It picks K seeds and allocates them in a single round, so only the first and second steps of the ASOA algorithm are carried out.

5.2 Experimental results

First, we study the influence of the parameter \(\lambda \). This parameter weights the importance of the allocation’s social welfare, reflecting the influence of the nodes’ preferences. We set \(\lambda =0.25,0.5,0.75\) and compare the resulting values of \({\textit{DP}}\). As shown in Fig. 1, the value of \({\textit{DP}}\) increases with \(\lambda \). This is expected from the definition of the \({\textit{DP}}\) function, in which \(\lambda \) weights the social welfare term.

Fig. 1
figure 1

The value of \({\textit{DP}}\) with varying value of parameter \(\lambda \)

Fig. 2
figure 2

The relationship between \({\textit{DP}}\) and seed set size on MI

Fig. 3
figure 3

The relationship between \({\textit{DP}}\) and seed set size on Email

Fig. 4
figure 4

The relationship between \({\textit{DP}}\) and seed set size on PHH

Then, we compare the results obtained by the two algorithms on the three datasets; the results are shown in Figs. 2, 3 and 4, respectively. According to Figs. 2 and 3, as the number of selected seeds increases, the ASOA algorithm, denoted as adaptive, consistently outperforms the baseline. A likely reason is that the non-adaptive algorithm ignores the stochasticity of the diffusion, whereas the ASOA algorithm selects seeds based on the current state of each node in the social network.

Turning to Fig. 4, we observe that on the PHH dataset the value of \({\textit{DP}}\) obtained by the adaptive algorithm is smaller than that obtained by the non-adaptive algorithm when \(K=40\) and \(\lambda =0.5, 0.75\). This is because the difference in profit between the two algorithms is smaller than the difference in social welfare. The ASOA algorithm first picks the nodes that maximize the profit and then allocates them. Since the value of the social welfare function depends on the selected seeds, this order may influence the final result, especially when the average degree of the nodes is small (such as 4.3778 in PHH). The setting of \(\phi _{{\mathcal {A}}}\), H and R in our experiments may be another reason: when a node v with small profit and large social welfare is selected, the increase in social welfare may exceed the increase in profit. However, such results do not undermine the general effectiveness of the ASOA algorithm. In summary, the ASOA algorithm performs well on both the smaller and the larger networks tested (246 to 1005 nodes), which suggests that it scales across these sizes.

6 Conclusion

In this paper, we propose two novel problems: the competition-based diversified-profit maximization (CDM) problem and the adaptive competition-based diversified-profit maximization (ACDM) problem. Unlike prior work, these two problems take the seeds’ preferences into consideration and use social welfare to reflect them. Their goal is to select seeds and allocate them to competing companies such that the sum of the profit generated by the adopters (consumers) of a specific entity after information dissemination and the social welfare of the seed allocation is maximized. They thus integrate seed selection and seed allocation for two competitive entities, which makes them more realistic and more challenging.

To address these two problems, we design the Adaptive Selection and Online Allocation (ASOA) algorithm, which combines the AS algorithm for seed selection and the OA algorithm for seed allocation. Benefiting from the concept of the Shapley value and from a method for online bipartite matching, the ASOA algorithm can obtain a better solution. We conduct experiments on real-world networks to evaluate its effectiveness. To the best of our knowledge, this is the first paper to integrate seed selection and allocation in the adaptive competitive profit maximization problem.