Abstract
The rapid growth of online social networks (OSNs) has ultimately facilitated information spreading and changed the economics of mobile networks. It is important to understand how to spread information as widely as possible. In this paper, we seek powerful initial information spreaders in an efficient manner. We use mean-field theory to characterize the process of information spreading based on the Susceptible-Infected (SI) model and validate that the prevalence of information depends on the network density. Inspired by this result, we seek the initial spreaders from closely integrated groups of nodes, i.e., dense groups (DGs). In OSNs, DGs are dispersed over the network, so our approach can be carried out in a distributed way by seeking the spreaders in each DG. We first design a DG Generating Algorithm to detect DGs, in which nodes have more internal connections than external ones. Second, based on the detected DGs, we design a criterion to seek powerful initial spreaders from each DG. We conduct experiments as well as statistical analysis on real OSNs. The results show that our approach provides satisfactory performance as well as computational efficiency.
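The dependence of information prevalence on network density can be illustrated with a minimal sketch of discrete-time SI spreading. This is not the paper's algorithm: the function names (`si_spread`, `ring`, `complete`), the synchronous update rule, and the two toy graphs are assumptions chosen only for illustration.

```python
import random


def si_spread(adj, seeds, alpha, steps, rng=None):
    """Discrete-time SI spreading; returns the infected count after each step.

    adj:   dict mapping each node to the set of its neighbors (undirected).
    alpha: probability that one infected neighbor infects a node in one step.
    """
    rng = rng or random.Random(0)
    infected = set(seeds)
    history = [len(infected)]
    for _ in range(steps):
        newly = set()
        for v in adj:
            if v in infected:
                continue
            m = sum(1 for u in adj[v] if u in infected)
            # Each infected neighbor succeeds independently with probability alpha.
            if m and rng.random() < 1.0 - (1.0 - alpha) ** m:
                newly.add(v)
        infected |= newly  # SI: infected nodes never recover
        history.append(len(infected))
    return history


def ring(n):
    """Sparse toy graph (assumption): every node has exactly two neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def complete(n):
    """Dense toy graph (assumption): every node is linked to every other node."""
    return {i: set(range(n)) - {i} for i in range(n)}
```

With `alpha = 1.0` and a single seed, the sparse 10-node ring reaches only 7 nodes after three steps, whereas the dense 10-node complete graph reaches all 10 nodes after one step.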
Notes
Links in the DG are not drawn since the connections are complex.
In reality, some Sina Weibo users, such as public accounts, have many followers but few followees. Such unidirectional connections therefore cannot represent bidirectional information spreading.
If a directed graph is considered, given a link from \(v_i\) to \(v_j\) (implying that information can be transmitted from \(v_i\) to \(v_j\)), \(v_i\) is a neighbor of \(v_j\), whereas \(v_j\) is not a neighbor of \(v_i\).
If many nodes were selected as spreaders in each DG, for example five per DG, then in our data set (see Sect. 7) we would have to select more than 1000 spreaders; even when overlapping DGs are taken into account, there would still be hundreds. Selecting so many spreaders is unnecessary, and the cost of informing them would be hard to bear.
However, we think that for some other types of information the incentive mechanism is unnecessary. These include public service announcements (e.g., the operation date of a city’s new metro line), knowledge (e.g., how to keep oneself safe when a typhoon comes), and even a piece of interesting news, a meaningful video, or a joke. Such information may arouse a user’s interest in sharing it with friends.
The exact expression of \(C(n_s)\) varies across application scenarios, so we do not provide an explicit expression here. Nevertheless, we intend to provide a strategy for making the tradeoff.
We crawl the network with a breadth-first search (BFS) from an arbitrary source node. We consider ordinary users in Shanghai, leave out celebrities, and keep bidirectional links.
The unidirectional links are deleted.
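The crawling and link-filtering steps described in the notes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' crawler: the names `crawl_bidirectional` and `in_neighbors`, the `out_links` dictionary, and the `excluded` set (standing in for accounts such as celebrities) are all hypothetical, and no real Sina Weibo API call is used.

```python
from collections import deque


def crawl_bidirectional(out_links, source, excluded=frozenset()):
    """BFS from `source` over directed links; return (visited nodes, reciprocal edges).

    out_links: dict mapping a node to the set of nodes it links to.
    excluded:  hypothetical set of accounts to skip (e.g., celebrities).
    """
    visited = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in out_links.get(u, ()):
            if v not in visited and v not in excluded:
                visited.add(v)
                queue.append(v)

    # Keep a link only if it exists in both directions; unidirectional links are dropped.
    edges = set()
    for u in visited:
        for v in out_links.get(u, ()):
            if u != v and v in visited and u in out_links.get(v, ()):
                edges.add(frozenset((u, v)))
    return visited, edges


def in_neighbors(out_links, v):
    """Directed-graph convention from the note above: u is a neighbor of v if u -> v exists."""
    return {u for u, targets in out_links.items() if v in targets}
```

The sketch follows every outgoing link during the crawl and filters afterwards; crawling along reciprocal links only would be an equally plausible reading of the notes.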
Appendix: Notations
G: the graph
\({\mathcal {V}}\), \({\mathcal {E}}\): the sets of nodes and links
\(v_i\): a node belonging to \({\mathcal {V}}\)
\({\mathbf {A}}\): the adjacency matrix of graph G
\(A_{i,j}\): an element of \({\mathbf {A}}\)
\(\alpha\): the probability that a node is infected by one of its neighbors
\(P(v_i,t)\): the probability that \(v_i\) is infected by its infected neighbors at time t
I(t): the number of infected nodes in G at time t
i(t): the density of infected nodes in G at time t
N: the total number of nodes in G
\(n_I (t)\): the average number of infected neighbors of an uninfected node
k: the degree of a node
\(k_i\): the degree of node \(v_i\)
\(\left\langle k \right\rangle\): the average degree
\(\tau\): the time constant for information spreading in graph G
\(I_k(t)\): the number of infected nodes with degree k at time t
\(i_k(t)\): the density of infected nodes with degree k at time t
\(N_k\): the number of nodes with degree k
\(P_k\): the probability that a node’s degree equals k
\(i_k^*(t)\): the density of infected neighbors of a degree-k node at time t
\(i^*(t)\): the density of infected neighbors of a node at time t
\({\mathcal {D}}\): a set of nodes or a dense group
\({\varPhi }({\mathcal {D}})\): a metric for \({\mathcal {D}}\), the density of \({\mathcal {D}}\)
\(\left| {\mathcal {D}}_l \right|\): the number of links in \({\mathcal {D}}\)
\(\left| {\mathcal {D}} \right|\): the number of nodes in \({\mathcal {D}}\)
\({\mathcal {D}}_{ca}\), \({\mathcal {D}}^i\): a candidate set of nodes or a candidate cluster
\(\tau ({\mathcal {D}})\): the threshold for determining \({\mathcal {D}}_{ca}\)
\({\mathcal {D}}^{max}\): the cluster that has the largest density
\(N_E\): the number of edges of an n-node complete graph
\(d({\mathcal {D}})\): the difference between \(N_E\) and \(\left| {\mathcal {D}} \right|\)
\(D({\mathcal {D}})\): the upper bound on the number of absent links for a cluster
\(\pi\): the probability that node \(v_i\) is infected by its neighbors in a time slot
N(i): the neighbors of node \(v_i\)
\(p_{j,t-1}\): the probability that \(v_j\) has been infected before time slot t
\({\mathbf {I}}\): the identity matrix
\({\mathbf {P}}_t\): a vector that contains the infection probability of each node at time t
\(I_t\): the number of infected nodes at time t; same as I(t)
\({\mathbf {e}}\): the identity vector
\(l_i\): the number of physical outward links of the node
\({\mathbf {L}}\): a square matrix \(diag(l_1,l_2,\ldots ,l_n)\)
\(L_t\): the number of DA links at time t
\(n_s\): the number of initial information spreaders to be sought
\(S_k\): the sought node
\({\mathcal {N}}_S\): the set of sought nodes
\({\mathcal {D}}_{ol}\): the set of nodes in the overlapping area
\({\mathcal {N}}^{ol}\): the set of sought nodes from overlapping DGs
\({\mathcal {D}}_{in}\): an integrated whole composed of multiple clusters
B(t): the benefit, or the number of DA links, at time t
\(B_0 (t)\): the optimal benefit
\({\mathcal {N}}_1^*\): the optimal set of spreaders from \({\mathcal {D}}_1\) using the approach in Sect. 6.2
\(B_{ol}(t)\): the benefit when using the approach in Sect. 6.3 to seek spreaders in overlapping DGs
\(C(n_s)\): the cost of informing the initial spreaders
\(\beta\): a parameter that adjusts the two terms in Eq. (27) to the same unit
GOS: the node sought by GeneralGreedy
LOS: the node sought from a DG
R: the simulation round
M: the number of links in G
\({\mathcal {N}}\): neighbors that do not belong to the DGs
\({\mathcal {S}}_f\): a single node’s message reception frequency
\({\mathcal {A}}_f\): a collection’s average message reception frequency
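As a worked illustration of how \(\alpha\), \(\left\langle k \right\rangle\), i(t) and \(\tau\) fit together, the textbook homogeneous mean-field form of the SI model is given below. It assumes that every node has a degree close to \(\left\langle k \right\rangle\) and introduces \(i_0 = i(0)\) as the initial density of infected nodes, a symbol not in the list above. The paper's own equations are not reproduced in this excerpt, so the expressions used there may differ.

\[
\frac{d\,i(t)}{dt} = \alpha \left\langle k \right\rangle i(t)\,\bigl[1 - i(t)\bigr], \qquad
i(t) = \frac{i_0\, e^{t/\tau}}{1 - i_0 + i_0\, e^{t/\tau}}, \qquad
\tau = \frac{1}{\alpha \left\langle k \right\rangle}.
\]

Under this form, a larger average degree gives a smaller time constant \(\tau\), so information spreads faster in denser networks.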
Cite this article
Ma, S., Chen, G., Fu, L. et al. Seeking powerful information initial spreaders in online social networks: a dense group perspective. Wireless Netw 24, 2973–2991 (2018). https://doi.org/10.1007/s11276-017-1478-1
DOI: https://doi.org/10.1007/s11276-017-1478-1