A Note on Modeling Retweet Cascades on Twitter
Information cascades on social networks, such as retweet cascades on Twitter, have often been viewed as an epidemiological process, with the associated notion of virality capturing popular cascades that spread across the network. The notion of structural virality (or average path length) has been posited as a measure of global spread.
In this paper, we argue that this simple epidemiological view, though analytically compelling, is not the entire story. We first show empirically that the classical SIR diffusion process on the Twitter graph, even with the best possible distribution of the infectiousness parameter, cannot explain the nature of observed retweet cascades on Twitter. More specifically, rather than spreading further from the source as the SIR model would predict, many cascades that have several retweets from direct followers die out quickly beyond that.
We show that our empirical observations can be reconciled if we take the interests of users and tweets into account. In particular, we consider a model where users have multi-dimensional interests and connect to other users based on similarity in interests. Tweets are correspondingly labeled with interests, and propagate only in the subgraph of interested users via the SIR process. In this model, interests can be either narrow or broad: the narrowest interest corresponds to a star graph on the interested users, with the source of the tweet as the root, and the broadest interest spans the whole graph. We show that if tweets are generated using such a mix of interests, coupled with a varying infectiousness parameter, then we can qualitatively explain our observation that cascades die out much more quickly than is predicted by the SIR model. The same model also explains how cascades can have large size but low “structural virality” or average path length.
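To make the mechanism concrete, the following is a minimal sketch, not the authors' implementation. It assumes a discrete one-shot SIR process (each newly infected user gets a single chance to infect each susceptible follower with probability `p`, then recovers), and it measures structural virality as the average distance between all pairs of nodes in the cascade tree. The function names `sir_cascade` and `structural_virality`, the `allowed` parameter, and the toy graph are all hypothetical and chosen for illustration only.

```python
import random
from collections import deque


def sir_cascade(followers, source, p, allowed=None, rng=None):
    """Run one discrete SIR cascade and return the cascade tree as {node: parent}.

    followers[u] lists the users who see u's tweets. If `allowed` is given,
    the tweet can only reach users in that set (the interested subgraph).
    Assumption: one infection attempt per edge, then the sender recovers.
    """
    rng = rng or random.Random()
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in followers.get(u, ()):
            if v in parent:
                continue  # already infected or recovered
            if allowed is not None and v not in allowed:
                continue  # user is not interested in this tweet
            if rng.random() < p:
                parent[v] = u
                queue.append(v)
    return parent


def structural_virality(parent):
    """Average shortest-path distance over all ordered pairs of tree nodes."""
    n = len(parent)
    if n < 2:
        return 0.0
    # Build the undirected cascade tree.
    adj = {v: [] for v in parent}
    for v, u in parent.items():
        if u is not None:
            adj[v].append(u)
            adj[u].append(v)
    total = 0
    for s in adj:  # BFS from every node; fine for small illustrative trees
        dist = {s: 0}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        total += sum(dist.values())
    return total / (n * (n - 1))


# Toy graph (hypothetical): user 0 has four direct followers, and a chain
# 1 -> 5 -> 6 -> 7 extends further into the network.
followers = {0: [1, 2, 3, 4], 1: [5], 5: [6], 6: [7]}
narrow_interest = {0, 1, 2, 3, 4}  # star on the source and its followers

narrow = sir_cascade(followers, 0, p=1.0, allowed=narrow_interest)
broad = sir_cascade(followers, 0, p=1.0)  # broadest interest: whole graph
```

With `p=1.0` the narrow-interest cascade is exactly the star on the source and its four followers, while the broad-interest cascade also travels down the chain, so `structural_virality(narrow)` is smaller than `structural_virality(broad)`. Lower values of `p` and a mix of interest sets would be needed to explore the qualitative behavior described above.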
Keywords: Epidemic Model, Average Path Length, Broad Interest, Structural Virality, Epidemic Threshold
We are grateful to the anonymous reviewers for their very helpful feedback. Goel and Zhang are supported by the DARPA GRAPHS program via grant FA9550-12-1-0411. Munagala is supported in part by NSF grants CCF-1348696, CCF-1408784, and IIS-1447554, and by grant W911NF-14-1-0366 from the Army Research Office (ARO).