Data Mining and Knowledge Discovery, Volume 25, Issue 3, pp 545–576

Scalable influence maximization for independent cascade model in large-scale social networks

  • Chi Wang
  • Wei Chen
  • Yajun Wang


Influence maximization, defined by Kempe et al. (SIGKDD 2003), is the problem of finding a small set of seed nodes in a social network that maximizes the spread of influence under certain influence cascade models. The scalability of influence maximization is a key factor for enabling prevalent viral marketing in large-scale online social networks. Prior solutions, such as the greedy algorithm of Kempe et al. (SIGKDD 2003) and its improvements, are slow and not scalable, while other heuristic algorithms do not provide consistently good performance on influence spread. In this article, we design a new heuristic algorithm that scales easily to millions of nodes and edges in our experiments. Our algorithm has a simple tunable parameter that lets users control the balance between running time and influence spread. Our results from extensive simulations on several real-world and synthetic networks demonstrate that our algorithm is currently the best scalable solution to the influence maximization problem: (a) our algorithm scales beyond million-sized graphs, where the greedy algorithm becomes infeasible, and (b) in all size ranges, our algorithm performs consistently well in influence spread; it is always among the best algorithms, and in most cases it significantly outperforms all other scalable heuristics, with increases in influence spread of as much as 100–260%.
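To make the problem statement concrete, the following is a minimal Python sketch of the independent cascade model and of a Monte Carlo greedy baseline in the spirit of the greedy algorithm attributed above to Kempe et al. It is not the heuristic proposed in this article, which the abstract does not describe. The function names (`simulate_ic`, `estimate_spread`, `greedy_seeds`), the adjacency-list graph format, and the number of simulation runs are all assumptions made for illustration.

```python
import random


def simulate_ic(graph, seeds, rng):
    """Run one independent cascade and return the set of activated nodes.

    `graph` maps a node to a list of (neighbor, probability) pairs
    (a hypothetical format chosen for this sketch).
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # A newly activated node u gets one chance to activate v.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(graph, seeds, runs=200, seed=0):
    """Estimate expected influence spread by averaging over `runs` cascades."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs))
    return total / runs


def greedy_seeds(graph, k, runs=200):
    """Pick k seeds, each time adding the node with the best estimated spread."""
    nodes = set(graph)
    for neighbors in graph.values():
        nodes.update(v for v, _ in neighbors)
    chosen = []
    for _ in range(k):
        # Candidates are sorted so that ties are broken deterministically.
        candidates = sorted(nodes - set(chosen))
        best = max(candidates,
                   key=lambda v: estimate_spread(graph, chosen + [v], runs))
        chosen.append(best)
    return chosen


# Toy graph with deterministic edges (probability 1.0), for illustration only.
toy = {"a": [("b", 1.0), ("c", 1.0)], "b": [("d", 1.0)], "e": [("f", 1.0)]}
print(greedy_seeds(toy, 2))
```

Each greedy step in this sketch evaluates every remaining candidate with many simulated cascades, so the cost grows with the number of nodes, the number of runs, and the seed set size. This is one way to see why the abstract calls simulation-based greedy approaches slow on large graphs.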


Influence maximization · Social networks · Independent cascade model · Viral marketing




  1. Aiello W, Chung FRK, Lu L (2000) A random graph model for massive graphs. In: STOC ’00
  2. Bakshy E, Karrer B, Adamic LA (2009) Social influence and the diffusion of user-created content. In: EC ’09: Proc. 10th ACM Conf. Electronic Commerce
  3. Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw 30(1-7): 107–117
  4. Cha M, Mislove A, Gummadi KP (2009) A measurement-driven analysis of information propagation in the Flickr social network. In: WWW ’09
  5. Chen W, Wang Y, Yang S (2009) Efficient influence maximization in social networks. In: KDD ’09
  6. Chen W, Wang C, Wang Y (2010a) Scalable influence maximization for prevalent viral marketing in large-scale social networks. In: KDD ’10
  7. Chen W, Yuan Y, Zhang L (2010b) Scalable influence maximization in social networks under the linear threshold model. In: ICDM ’10
  8. Chen W, Collins A, Cummings R, Ke T, Liu Z, Rincon D, Sun X, Wang Y, Wei W, Yuan Y (2011) Influence maximization in social networks when negative opinions may emerge and propagate. In: SDM ’11
  9. Cui P, Wang F, Liu S, Ou M, Yang S, Sun L (2011) Who should share what?: item-level social influence prediction for users and posts ranking. In: SIGIR ’11
  10. Domingos P, Richardson M (2001) Mining the network value of customers. In: KDD ’01
  11. Feige U (1998) A threshold of ln n for approximating set cover. J ACM 45(4): 634–652
  12. Freeman L (1979) Centrality in social networks: conceptual clarification. Soc Netw 1: 215–239
  13. Goyal A, Bonchi F, Lakshmanan LV (2010) Learning influence probabilities in social networks. In: WSDM ’10
  14. Gruhl D, Guha RV, Liben-Nowell D, Tomkins A (2004) Information diffusion through blogspace. In: WWW ’04
  15. Kempe D, Kleinberg JM, Tardos É (2003) Maximizing the spread of influence through a social network. In: KDD ’03
  16. Kimura M, Saito K (2006) Tractable models for information diffusion in social networks. In: PKDD ’06
  17. Leskovec J, Krause A, Guestrin C, Faloutsos C, VanBriesen J, Glance NS (2007) Cost-effective outbreak detection in networks. In: KDD ’07
  18. Misner IR (1999) The world’s best known marketing secret: Building your business with word-of-mouth marketing, 2nd edn. Bard Press, Austin
  19. Nail J (2004) The consumer advertising backlash. Forrester Research and Intelliseek Market Research Report
  20. Nemhauser G, Wolsey L, Fisher M (1978) An analysis of the approximations for maximizing submodular set functions. Math Program 14: 265–294
  21. Richardson M, Domingos P (2002) Mining knowledge-sharing sites for viral marketing. In: KDD ’02
  22. Rodriguez MG, Leskovec J, Krause A (2010) Inferring networks of diffusion and influence. In: KDD ’10
  23. Streeter M, Golovin D (2007) An online algorithm for maximizing submodular functions. Technical Report CMU-CS-07-171, Carnegie Mellon University, Pittsburgh
  24. Tang J, Sun J, Wang C, Yang Z (2009) Social influence analysis in large-scale networks. In: KDD ’09
  25. Valiant LG (1979) The complexity of enumeration and reliability problems. SIAM J Comput 8(3): 410–421
  26. Vazirani VV (2004) Approximation algorithms. Springer, Berlin

Copyright information

© The Author(s) 2012

Authors and Affiliations

  1. University of Illinois at Urbana-Champaign, Urbana, USA
  2. Microsoft Research Asia, Beijing, China
