On minimizing budget and time in influence propagation over social networks

Goyal, Amit; Bonchi, Francesco; Lakshmanan, Laks V. S.; Venkatasubramanian, Suresh

doi:10.1007/s13278-012-0062-z

Original Article
Published: 21 March 2012

Volume 3, pages 179–192, (2013)
Social Network Analysis and Mining

Amit Goyal¹,
Francesco Bonchi²,
Laks V. S. Lakshmanan¹ &
…
Suresh Venkatasubramanian³

1346 Accesses
92 Citations

Abstract

In recent years, study of influence propagation in social networks has gained tremendous attention. In this context, we can identify three orthogonal dimensions—the number of seed nodes activated at the beginning (known as budget), the expected number of activated nodes at the end of the propagation (known as expected spread or coverage), and the time taken for the propagation. We can constrain one or two of these and try to optimize the third. In their seminal paper, Kempe et al. constrained the budget, left time unconstrained, and maximized the coverage: this problem is known as Influence Maximization (or MAXINF for short). In this paper, we study alternative optimization problems which are naturally motivated by resource and time constraints on viral marketing campaigns. In the first problem, termed minimum target set selection (or MINTSS for short), a coverage threshold η is given and the task is to find the minimum size seed set such that by activating it, at least η nodes are eventually activated in the expected sense. This naturally captures the problem of deploying a viral campaign on a budget. In the second problem, termed MINTIME, the goal is to minimize the time in which a predefined coverage is achieved. More precisely, in MINTIME, a coverage threshold η and a budget threshold k are given, and the task is to find a seed set of size at most k such that by activating it, at least η nodes are activated in the expected sense, in the minimum possible time. This problem addresses the issue of timing when deploying viral campaigns. Both these problems are NP-hard, which motivates our interest in their approximation. For MINTSS, we develop a simple greedy algorithm and show that it provides a bicriteria approximation. We also establish a generic hardness result suggesting that improving this bicriteria approximation is likely to be hard. For MINTIME, we show that even bicriteria and tricriteria approximations are hard under several conditions. 
We show, however, that if we allow the budget for number of seeds k to be boosted by a logarithmic factor and allow the coverage to fall short, then the problem can be solved exactly in PTIME, i.e., we can achieve the required coverage within the time achieved by the optimal solution to MINTIME with budget k and coverage threshold η. Finally, we establish the value of the approximation algorithms, by conducting an experimental evaluation, comparing their quality against that achieved by various heuristics.
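
The greedy approach to MINTSS mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `greedy_mintss`, the shortfall parameter `epsilon`, and the toy `reach` sets are assumptions, and a deterministic coverage count stands in for the expected spread, which in practice would have to be estimated.

```python
def greedy_mintss(candidates, spread, eta, epsilon):
    """Add seeds greedily until the coverage reaches eta - epsilon.

    `spread` is assumed to be a monotone submodular set function; here it is
    a hypothetical stand-in for the expected spread of a seed set.
    """
    seeds = []
    current = spread(seeds)
    while current < eta - epsilon:
        best, best_value = None, current
        for v in candidates:
            if v in seeds:
                continue
            value = spread(seeds + [v])
            if value > best_value:  # strict improvement, first maximizer wins ties
                best, best_value = v, value
        if best is None:
            break  # no candidate improves coverage, so the threshold is unreachable
        seeds.append(best)
        current = best_value
    return seeds, current


# Toy instance (assumed for illustration): each node deterministically reaches a fixed set.
reach = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {6}, "d": {1, 6}}


def toy_spread(seeds):
    covered = set()
    for s in seeds:
        covered |= reach[s]
    return len(covered)


relaxed_seeds, relaxed_cov = greedy_mintss(list(reach), toy_spread, eta=6, epsilon=1)
exact_seeds, exact_cov = greedy_mintss(list(reach), toy_spread, eta=6, epsilon=0)
```

Relaxing the coverage target by `epsilon` lets the loop stop earlier, which is the trade-off a bicriteria guarantee formalizes: a slightly smaller coverage in exchange for a bounded blow-up in seed set size.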

Notes

1. We use the terms coverage and expected spread interchangeably throughout the article.
2. A variant of the linear threshold model, where a deterministic threshold \(\theta_u\) is chosen for each node, has also been studied (Chen 2008; Ben-Zwi et al. 2009). Coverage under this variant is not submodular.
3. If \(\epsilon = 1\), \(\mathcal{A}\) outputs an empty collection.
4. Here, \(\text{OPT}_\mathcal{I}\) and \(\text{OPT}_\mathcal{J}\) represent the sizes of the optimal solutions for instances \(\mathcal{I}\) and \(\mathcal{J}\), respectively.
5. http://www.arXiv.org
6. http://www.meme.yahoo.com/
7. Instead of 1, we could be left with a constant number of elements. Asymptotically, it does not make a difference.

References

Agarwal N, Liu H, Tang L, Yu P (2011) Modeling blogger influence in a community. Social Netw Anal Min 1–24. doi:10.1007/s13278-011-0039-3
Bakshy E, Hofman JM, Mason WA, Watts DJ (2011) Everyone’s an influencer: quantifying influence on twitter. In: Proceedings of the fourth ACM international conference on Web search and data mining, ACM, WSDM ’11, pp 65–74
Bar-Ilan J, Kortsarz G, Peleg D (2001) Generalized submodular cover problems and applications. Theor Comput Sci 250(1–2):179–200
Ben-Zwi O, Hermelin D, Lokshtanov D, Newman I (2009) An exact almost optimal algorithm for target set selection in social networks. In: EC ’09: Proceedings of the tenth ACM conference on electronic commerce, ACM, New York, NY, USA, pp 355–362
Bhagat S, Goyal A, Lakshmanan LVS (2012) Maximizing product adoption in social networks. In: Web search and data mining, WSDM
Bross J, Richly K, Kohnen M, Meinel C (2011) Identifying the top-dogs of the blogosphere. Social Netw Anal Min 1–15. doi:10.1007/s13278-011-0027-7
Cha M, Trez JP, Haddadi H (2011) The spread of media content through blogs. Social Netw Anal Min 1–16. doi:10.1007/s13278-011-0040-x
Chen N (2008) On the approximability of influence in social networks. In: SODA ’08: Proceedings of the nineteenth annual ACM–SIAM symposium on discrete algorithms, pp 1029–1037
Chen W, Wang Y, Yang S (2009) Efficient influence maximization in social networks. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’09)
Chen W, Wang C, Wang Y (2010a) Scalable influence maximization for prevalent viral marketing in large-scale social networks. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’10)
Chen W, Yuan Y, Zhang L (2010b) Scalable influence maximization in social networks under the linear threshold model. In: Proceedings of the 10th IEEE international conference on data mining (ICDM’2010)
Domingos P, Richardson M (2001) Mining the network value of customers. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, USA, KDD ’01, pp 57–66
Feige U (1998) A threshold of ln n for approximating set cover. J ACM 45(4):634–652
Fujito T (1999) On approximation of the submodular set cover problem. Oper Res Lett 25(4):169–174
Fujito T (2000) Approximation algorithms for submodular set cover with applications. IEICE Trans Inf Syst 83
Goyal A, Bonchi F, Lakshmanan LVS (2008) Discovering leaders from community actions. In: Proceeding of the 17th ACM conference on information and knowledge management, ACM, New York, NY, USA, CIKM ’08, pp 499–508
Goyal A, Bonchi F, Lakshmanan LVS (2010) Learning influence probabilities in social networks. In: Proceedings of the third ACM international conference on web search and data mining, ACM, New York, NY, USA, WSDM ’10, pp 241–250
Goyal A, Bonchi F, Lakshmanan LVS (2011) A data-based approach to social influence maximization. PVLDB 5(1)
Kempe D, Kleinberg JM, Tardos É (2003) Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining (KDD’03)
Kempe D, Kleinberg J, Tardos É (2005) Influential nodes in a diffusion model for social networks. In: ICALP, Springer, Berlin, pp 1127–1138
Khuller S, Moss A, Naor JS (1999) The budgeted maximum coverage problem. Inf Process Lett 70(1):39–45
Kimura M, Saito K (2006) Tractable models for information diffusion in social networks. In: Proceedings of PKDD 2006, Lecture notes in computer science, vol 4213
Leskovec J, Krause A, Guestrin C, Faloutsos C, VanBriesen J, Glance NS (2007) Cost-effective outbreak detection in networks. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’07)
Li Gørtz I, Wirth A (2006) Asymmetry in k-center variants. Theor Comput Sci 361(2):188–199
Nemhauser GL, Wolsey LA, Fisher ML (1978) An analysis of approximations for maximizing submodular set functions-I. Math Program 14(1):265–294
Panigrahy R, Vishwanathan S (1998) An O(log^* n) approximation algorithm for the asymmetric p-center problem. J Algorithms 27(2):259–268
Richardson M, Domingos P (2002) Mining knowledge-sharing sites for viral marketing. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, USA, KDD ’02, pp 61–70
Slavík P (1997) Improved performance of the greedy algorithm for partial cover. Inf Process Lett 64(5):251–254
Sviridenko M (2004) A note on maximizing a submodular set function subject to a knapsack constraint. Oper Res Lett 32(1):41–43
Weng J, Lim EP, Jiang J, He Q (2010) Twitterrank: finding topic-sensitive influential twitterers. In: Proceedings of the third ACM international conference on web search and data mining, ACM, New York, NY, USA, WSDM ’10, pp 261–270
Wolsey LA (1982) An analysis of the greedy algorithm for the submodular set covering problem. Combinatorica 2(4):385–393

Author information

Authors and Affiliations

University of British Columbia, Vancouver, BC, Canada
Amit Goyal & Laks V. S. Lakshmanan
Yahoo! Research, Barcelona, Spain
Francesco Bonchi
University of Utah, Salt Lake City, UT, USA
Suresh Venkatasubramanian

Corresponding author

Correspondence to Amit Goyal.

Appendix

1.1 A Proof of Lemma 2

Suppose there exists an algorithm \(\mathcal{A}\) that selects \(\beta k\) sets which together cover \(\gamma \eta\) elements. Apply \(\mathcal{A}\) to an arbitrary instance \(\langle \mathcal{U}, \mathcal{S}, \eta \rangle\) of PSC. The output is a collection of sets \(\mathcal{C}_1\) such that \(|\mathcal{C}_1| \le \beta k\) and \(\left| \bigcup_{S \in \mathcal{C}_1} S \right| \ge \gamma \eta\). Next, discard the sets that have been selected and the elements they cover, and apply the algorithm \(\mathcal{A}\) again on the remaining universe. Repeat this process until 1 or fewer elements are left uncovered (Footnote 7).

Let \(\eta_i\) denote the number of elements uncovered after iteration \(i\), with \(\eta_0 = \eta\). In iteration \(i\), the algorithm picks \(\beta k\) sets and covers at least \(\gamma \eta_{i-1}\) elements. Hence, \(\eta_i \le \eta_{i-1} \cdot (1 - \gamma)\). Expanding, \(\eta_i \le \eta \cdot (1 - \gamma)^i\). Suppose after \(l\) iterations, \(\eta_l = 1\). The total number of sets picked is \(l \beta k\). Setting \(\eta \cdot (1 - \gamma)^l = 1\) implies \(l = \frac{\ln \eta}{\ln \frac{1}{1-\gamma}}\).

We now prove the first claim. Let \(\gamma > 1 - 1/e^{\beta}\); then \(\ln \left( \frac{1}{1-\gamma} \right) > \beta\). This yields a PTIME algorithm for PSC which outputs a solution of size \(l \beta k = \beta k \cdot \ln \eta / \ln \frac{1}{1-\gamma} \le c \cdot k \ln \eta\) for some \(c < 1\). This is a \(c \cdot \ln \eta\)-approximation for PSC with \(c < 1\), which is not possible unless \({\rm NP} \subseteq \text{DTIME}(n^{O(\log \log n)})\) (Feige 1998).

To prove the second claim, assume \(\beta \le (1 - \delta) \ln \left( \frac{1}{1 -\gamma} \right). \) This gives a PTIME algorithm for PSC which outputs a solution of size \(l \beta k = \beta k \cdot \ln \eta / \ln \frac{1}{1-\gamma} \le (1 - \delta) k \cdot \ln \eta\) which is not possible unless \({\rm NP} \subseteq \text{DTIME}(n^{O(\log \log n)}). \) \(\quad\square\)
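
The arithmetic behind both claims can be checked numerically. The sketch below is only an illustration of the formula \(l \beta k = \beta k \cdot \ln \eta / \ln \frac{1}{1-\gamma}\); the function name `psc_solution_size` and the sample values of \(k\), \(\eta\), \(\beta\) and \(\gamma\) are assumptions chosen for the example.

```python
import math


def psc_solution_size(k, eta, beta, gamma):
    """Number of sets used when the (beta, gamma) routine is repeated
    until a single element is left uncovered: l * beta * k."""
    rounds = math.log(eta) / math.log(1.0 / (1.0 - gamma))
    return rounds * beta * k


# Illustrative values (assumed): gamma lies strictly above 1 - 1/e^beta.
k, eta, beta = 5, 10_000, 2.0
gamma = 1 - math.exp(-beta) + 0.05

# Ratio of the solution size to k * ln(eta); this is the constant c in the proof.
ratio = psc_solution_size(k, eta, beta, gamma) / (k * math.log(eta))

# At the boundary gamma = 1 - 1/e^beta the ratio is 1 (up to rounding).
boundary = psc_solution_size(k, eta, beta, 1 - math.exp(-beta)) / (k * math.log(eta))
```

The ratio depends only on \(\beta\) and \(\gamma\), since it equals \(\beta / \ln \frac{1}{1-\gamma}\); it drops below 1 exactly when \(\gamma\) exceeds \(1 - 1/e^{\beta}\).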

1.2 B Example illustrating performance of Wolsey’s solution

Wolsey (1982) studied the RSSC problem and showed, among other things, that the greedy algorithm provides a solution that is within a factor of \(1 + \ln\left(\eta/(\eta - f(S_{t-1}))\right)\) of the optimal solution. Unfortunately, this does not yield an approximation algorithm with any guaranteed bounds. The following example shows that the greedy solution with threshold \(\eta\) can be arbitrarily worse than the optimum.

Example

(Illustrated also in Fig. 4.) Consider a ground set \(\mathcal{X} = \{w_1, w_2, v_1, v_2, \ldots, v_l\}\) with elements having unit costs. Figure 4 geometrically depicts the definition of a function \(f: 2^{\mathcal{X}} \rightarrow \mathbb{R}\), where for any set \(S \subset \mathcal{X}\), \(f(S)\) is defined to be the area (shown shaded) covered by the elements of \(S\). Specifically, \(f(w_1) = f(w_2) = 1 - 1/2^{l+1}\) and \(f(v_i) = 1/2^{i-1}\), \(1 \le i \le l\). Notice that \(f(\{v_1, \ldots, v_l\}) = \sum_{i=1}^{l} 1/2^{i-1} = 2 - 1/2^{l-1} < 2 - 1/2^{l} = f(\{w_1, w_2\})\). The greedy algorithm will first pick \(v_1\). Suppose it picks \(S = \{v_1, \ldots, v_i\}\) in \(i\) rounds. Then \(f(S \cup \{v_{i+1}\}) - f(S) = 1/2^{i} > 1 - 1/2^{l+1} - 1 + 1/2^{i} = 1 - 1/2^{l+1} - \frac{1}{2}\left(2 - 1/2^{i-1}\right) = f(S \cup \{w_1\}) - f(S)\). Thus, greedy will never pick \(w_1\) or \(w_2\) before it picks \(v_1, \ldots, v_l\). Suppose \(\eta = 2 - 1/2^{l}\). Clearly, the greedy solution is \(\mathcal{X}\) whereas the optimal solution is \(\{w_1, w_2\}\). Here \(l\) can be arbitrarily large.
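
The example can be reproduced with a small weighted-coverage model. This is a sketch under an assumed encoding of the figure, which is not reproduced here: each \(v_i\) is split into two atoms of area \(1/2^{i}\), \(w_1\) and \(w_2\) each cover one atom of every \(v_i\) plus a private atom of area \(1/2^{l+1}\). The names `build_instance`, `area` and `greedy_cover` are hypothetical; the encoding is chosen only so that the values of \(f\) stated above hold.

```python
from fractions import Fraction


def build_instance(l):
    """Map each element to the atoms (with areas) it covers, for parameter l."""
    cover = {"w1": {}, "w2": {}}
    for i in range(1, l + 1):
        half = Fraction(1, 2 ** i)  # each half of v_i has area 1/2^i
        cover[f"v{i}"] = {(i, 1): half, (i, 2): half}
        cover["w1"][(i, 1)] = half
        cover["w2"][(i, 2)] = half
    extra = Fraction(1, 2 ** (l + 1))  # private area of each w_j
    cover["w1"][("extra", 1)] = extra
    cover["w2"][("extra", 2)] = extra
    return cover


def area(cover, chosen):
    """f(S): total area of the atoms covered by the chosen elements."""
    atoms = {}
    for name in chosen:
        atoms.update(cover[name])
    return sum(atoms.values(), Fraction(0))


def greedy_cover(cover, eta):
    """Repeatedly add the element with the largest marginal gain until f(S) >= eta."""
    chosen = []
    while area(cover, chosen) < eta:
        remaining = [x for x in cover if x not in chosen]
        best = max(remaining, key=lambda x: area(cover, chosen + [x]))
        chosen.append(best)
    return chosen


l = 10
cover = build_instance(l)
eta = 2 - Fraction(1, 2 ** l)
picks = greedy_cover(cover, eta)
```

With exact rational arithmetic the run picks \(v_1, \ldots, v_l\) in order and only then \(w_1\) and \(w_2\), so the greedy solution has \(l + 2\) elements while \(\{w_1, w_2\}\) alone already reaches \(\eta\).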

About this article

Cite this article

Goyal, A., Bonchi, F., Lakshmanan, L.V.S. et al. On minimizing budget and time in influence propagation over social networks. Soc. Netw. Anal. Min. 3, 179–192 (2013). https://doi.org/10.1007/s13278-012-0062-z

Received: 08 November 2011
Revised: 21 February 2012
Accepted: 28 February 2012
Published: 21 March 2012
Issue Date: June 2013
DOI: https://doi.org/10.1007/s13278-012-0062-z
