
Seeking powerful information initial spreaders in online social networks: a dense group perspective

Wireless Networks


The rapid growth of online social networks (OSNs) has facilitated information spreading and changed the economics of mobile networks. It is important to understand how to spread information as widely as possible. In this paper, we aim to seek powerful initial information spreaders in an efficient manner. We use mean-field theory to characterize the process of information spreading based on the Susceptible-Infected (SI) model and validate that the prevalence of information depends on the network density. Inspired by this result, we seek the initial spreaders from closely integrated groups of nodes, i.e., dense groups (DGs). In OSNs, DGs are dispersed across the network, so our approach can be carried out in a distributed way by seeking the spreaders in each DG. First, we design a DG Generating Algorithm to detect DGs, where nodes within a DG have more internal connections than external ones. Second, based on the detected DGs, we design a criterion to seek powerful initial spreaders from each DG. We conduct experiments as well as statistical analysis on real OSNs. The results show that our approach achieves satisfactory performance and computational efficiency.
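The dependence of spreading on local density can be illustrated with a small simulation. The following is a minimal sketch, not the paper's model: it assumes a discrete-time SI process in which every infected node independently infects each susceptible neighbor with probability `alpha` per time slot. The function name `simulate_si`, the toy graph, and the parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_si(adjacency, seeds, alpha, steps, rng=None):
    """Discrete-time SI spreading; returns the infected count after each step."""
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    infected = set(seeds)
    history = [len(infected)]
    for _ in range(steps):
        newly_infected = set()
        for node in infected:
            for neighbor in adjacency[node]:
                # Assumption: each infected neighbor transmits independently
                # with probability alpha in every time slot.
                if neighbor not in infected and rng.random() < alpha:
                    newly_infected.add(neighbor)
        # Nodes never recover in the SI model, so the infected set only grows.
        infected |= newly_infected
        history.append(len(infected))
    return history


# Toy undirected graph: a 4-node clique (dense) attached to a 3-node path (sparse).
adjacency = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5], 5: [4, 6], 6: [5],
}
from_clique = simulate_si(adjacency, seeds=[0], alpha=1.0, steps=3)
from_tail = simulate_si(adjacency, seeds=[6], alpha=1.0, steps=3)
```

In this toy setting with `alpha = 1.0`, the seed inside the clique reaches four nodes after one step, while the seed at the end of the path reaches two. This is consistent with the intuition that spreaders placed in dense regions inform more nodes early on.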



  1. Links in the DG are not drawn since the connections are complex.

  2. In reality, some users in Sina Weibo, e.g., public accounts, have many followers but few followees. Such unidirectional connections therefore cannot represent bidirectional information spreading.

  3. If a directed graph is considered, then given a link from \(v_i\) to \(v_j\) (implying that information can be transmitted from \(v_i\) to \(v_j\)), \(v_i\) is a neighbor of \(v_j\), whereas \(v_j\) is not a neighbor of \(v_i\).

  4. If many nodes are selected as spreaders in each DG (for example, five spreaders per DG), then in our data set, as shown in Sect. 7, we would have to select more than 1000 spreaders. Even if overlapping DGs are taken into account, there would still be hundreds of spreaders. Selecting so many spreaders is unnecessary and too costly in terms of informing the initial spreaders.

  5. However, we think that an incentive mechanism is unnecessary for some other types of information, including public service announcements (e.g., the announcement of the operation date of a city’s new metro line), knowledge (e.g., how to stay safe when a typhoon comes), and even a piece of interesting news, a meaningful video, or a joke. These types of information may arouse a user’s interest in sharing them with friends.

  6. The exact expression of \(C(n_s)\) varies across application scenarios, so we do not provide an explicit expression here. Nevertheless, we intend to provide a strategy for making tradeoff decisions.

  7. We use a BFS approach to crawl the network from an arbitrary source node. We consider ordinary users in Shanghai, leave out celebrities, and keep bidirectional links.

  8. The unidirectional links are deleted.
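The preprocessing mentioned in the notes above (keeping bidirectional links and deleting unidirectional ones) can be sketched as follows. This is an illustrative sketch only; the function name `keep_bidirectional` and the sample follow relations are hypothetical, and the authors' actual crawling and filtering code is not shown in the source.

```python
def keep_bidirectional(directed_edges):
    """Return undirected edges (u, v) with u < v for which both u->v and v->u exist."""
    edge_set = set(directed_edges)
    mutual = set()
    for u, v in edge_set:
        # Keep a link only if the reverse direction is also present;
        # self-loops are ignored.
        if u != v and (v, u) in edge_set:
            mutual.add((min(u, v), max(u, v)))
    return sorted(mutual)


# Hypothetical follow relations: "c" follows "a", but "a" does not follow back.
follows = [("a", "b"), ("b", "a"), ("c", "a"), ("b", "c"), ("c", "b")]
undirected = keep_bidirectional(follows)
```

Here the one-way link from `"c"` to `"a"` is dropped, while the mutual links between `"a"` and `"b"` and between `"b"` and `"c"` are kept as undirected edges.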



Author information


Corresponding author

Correspondence to Xinbing Wang.

Appendix: Notations


G :

the graph

\({\mathcal {V}}\), \({\mathcal {E}}\) :

the set of nodes and links

\(v_i\) :

a node belonging to \({\mathcal {V}}\)

\({\mathbf {A}}\) :

the adjacency matrix of graph G

\(A_{i,j}\) :

an element of \({\mathbf {A}}\)

\(\alpha\) :

the probability that a node is infected by one of its neighbors

\(P(v_i,t)\) :

the probability that \(v_i\) is infected by its infected neighbors at time t


\(I(t)\) :

the number of infected nodes in G at time t


the density of infected nodes in G at time t

N :

the total number of nodes in G

\(n_I (t)\) :

the average number of infected neighbors of an uninfected node

k :

the degree of a node

\(k_i\) :

the degree of node \(v_i\)

\(\left\langle k \right\rangle\) :

the average degree

\(\tau\) :

the time constant for information spreading in graph G

\(I_k(t)\) :

the number of infected nodes with degree k at time t

\(i_k(t)\) :

the density of infected nodes with degree k at time t

\(N_k\) :

the number of nodes with degree k

\(P_k\) :

the probability that a node’s degree equals k

\(i_k^*(t)\) :

the density of infected neighbors of the k-degree node at time t

\(i^*(t)\) :

the density of infected neighbors of the node at time t

\({\mathcal {D}}\) :

a set of nodes or a dense group

\({\varPhi }({\mathcal {D}})\) :

a metric for \({\mathcal {D}}\), the density of \({\mathcal {D}}\)

\(\left| {\mathcal {D}}_l \right|\) :

the number of links in \({\mathcal {D}}\)

\(\left| {\mathcal {D}} \right|\) :

the number of nodes in \({\mathcal {D}}\)

\({\mathcal {D}}_{ca}\), \({\mathcal {D}}^i\) :

a candidate set of nodes or a candidate cluster

\(\tau ({\mathcal {D}})\) :

the threshold for determining \({\mathcal {D}}_{ca}\)

\({\mathcal {D}}^{max}\) :

the cluster with the largest density

\(N_E\) :

the number of edges of an n-node complete graph

\(d({\mathcal {D}})\) :

the difference between \(N_E\) and \(\left| {\mathcal {D}} \right|\)

\(D({\mathcal {D}})\) :

the upper bound of the number of absent links for a cluster

\(\pi\) :

the probability that node \(v_i\) is infected by its neighbors in a time slot


neighbors of node \(v_i\)

\(p_{j,t-1}\) :

the probability that \(v_j\) has been infected before time slot t

\({\mathbf {I}}\) :

the identity matrix

\({\mathbf {P}}_t\) :

a vector that contains the infection probability of each node at time t

\(I_t\) :

the number of infected nodes at time t; the same as \(I(t)\)

\({\mathbf {e}}\) :

the identity vector

\(l_i\) :

the number of physical outward links of the node

\({\mathbf {L}}\) :

a square matrix \(diag(l_1,l_2,\ldots ,l_n)\)

\(L_t\) :

the number of DA links at time t

\(n_s\) :

the number of initial information spreaders to be sought

\(S_k\) :

the sought node

\({\mathcal {N}}_S\) :

the set of sought nodes

\({\mathcal {D}}_{ol}\) :

the set of nodes in the overlapping area

\({\mathcal {N}}^{ol}\) :

the set of sought nodes from overlapping DGs

\({\mathcal {D}}_{in}\) :

an integrated group composed of multiple clusters


the benefit or the number of DA links at time t

\(B_0 (t)\) :

the optimal benefit

\({\mathcal {N}}_1^*\) :

the optimal set of spreaders from \({\mathcal {D}}_1\) using the approach in Section 6.2

\(B_{ol}(t)\) :

the benefit when using the approach in Section 6.3 to seek spreaders in overlapping DGs

\(C(n_s)\) :

the cost of informing the initial spreaders

\(\beta\) :

a parameter that scales the two terms in Eq. (27) to the same unit


the sought node by GeneralGreedy


the sought node from DG

R :

simulation round

M :

the number of G’s links

\({\mathcal {N}}\) :

neighbors that do not belong to the DGs

\({\mathcal {S}}_f\) :

a single node’s message reception frequency

\({\mathcal {A}}_f\) :

a collection’s average message reception frequency
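To make the dense-group notation above concrete, the following sketch computes a density for a candidate node set and checks whether its nodes have more internal than external connections. It rests on two assumptions that the source does not confirm: that the density \({\varPhi }({\mathcal {D}})\) is the number of links inside the set divided by the number of edges of a complete graph on the same nodes (\(N_E = n(n-1)/2\)), and that the internal-versus-external condition is applied to each node separately. The function names and the toy graph are hypothetical.

```python
def density(adjacency, group):
    """Assumed density: links inside the group divided by n * (n - 1) / 2."""
    group = set(group)
    n = len(group)
    if n < 2:
        return 0.0
    # Every internal undirected link is counted from both endpoints, hence // 2.
    internal_links = sum(1 for u in group for v in adjacency[u] if v in group) // 2
    complete_links = n * (n - 1) // 2  # edges of an n-node complete graph
    return internal_links / complete_links


def more_internal_than_external(adjacency, group):
    """Assumed per-node reading: each node has more neighbors inside than outside."""
    group = set(group)
    for u in group:
        inside = sum(1 for v in adjacency[u] if v in group)
        outside = len(adjacency[u]) - inside
        if inside <= outside:
            return False
    return True


# Toy undirected graph: a 4-node clique attached to a 3-node path.
adjacency = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5], 5: [4, 6], 6: [5],
}
clique_density = density(adjacency, {0, 1, 2, 3})
clique_is_dense = more_internal_than_external(adjacency, {0, 1, 2, 3})
path_density = density(adjacency, {3, 4, 5})
path_is_dense = more_internal_than_external(adjacency, {3, 4, 5})
```

Under these assumptions the clique has density 1 and passes the check, whereas the set {3, 4, 5} has density 2/3 and fails it because node 3 has more neighbors outside the set than inside.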


Cite this article

Ma, S., Chen, G., Fu, L. et al. Seeking powerful information initial spreaders in online social networks: a dense group perspective. Wireless Netw 24, 2973–2991 (2018).
