Large-scale agent-based simulations of online social networks

Autonomous Agents and Multi-Agent Systems

Abstract

As part of the DARPA SocialSim challenge, we address the problem of predicting behavioral phenomena including information spread involving hundreds of thousands of users across three major linked social networks: Twitter, Reddit and GitHub. Our approach develops a framework for data-driven agent simulation that begins with a discrete-event simulation of the environment populated with generic, flexible agents, then optimizes the decision model of the agents by combining a number of machine learning classification problems. The ML problems predict when an agent will take a certain action in its world and are designed to combine aspects of the agents, gathered from historical data, with dynamic aspects of the environment including the resources, such as tweets, that agents interact with at a given point in time. In this way, each of the agents makes individualized decisions based on their environment, neighbors and history during the simulation, although global simulation data is used to learn accurate generalizations. This approach showed the best performance of all participants in the DARPA challenge across a broad range of metrics. We describe the performance of models both with and without machine learning on measures of cross-platform information spread defined both at the level of the whole population and at the community level. The best performing model overall combines learned agent behaviors with explicit modeling of bursts in global activity. Because of the general nature of our approach, it is applicable to a range of prediction problems that require modeling individualized, situational agent behavior from trace data that combines many agents.
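The architecture outlined in the abstract, a discrete-event simulation whose agents consult a pluggable decision model, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Agent`, `toy_decision_model`, and `simulate`, the feature keys `activity_rate` and `recent_events`, and the exponential wake-up schedule are all assumptions made for illustration, and the hand-written decision function merely stands in for the learned classifiers the abstract describes.

```python
import heapq
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical decision model: maps (agent features, environment features)
# to the probability that the agent acts now. A trained classifier could be
# plugged in here instead of the hand-written stand-in below.
DecisionModel = Callable[[Dict[str, float], Dict[str, float]], float]


@dataclass
class Agent:
    agent_id: int
    features: Dict[str, float]  # static attributes, e.g. derived from history (assumed)
    decide: DecisionModel
    actions_taken: int = 0


def toy_decision_model(agent_features, env_features):
    """Stand-in for a learned model: base rate plus a boost when the world is busy."""
    base = agent_features["activity_rate"]
    boost = 0.5 if env_features["recent_events"] > 3 else 0.0
    return min(1.0, base + boost)


def simulate(agents: List[Agent], horizon: float, seed: int = 0) -> List[Tuple[float, int]]:
    """Run a discrete-event loop and return a time-ordered log of (time, agent_id)."""
    rng = random.Random(seed)
    by_id = {a.agent_id: a for a in agents}
    # Event queue ordered by wake-up time; each entry is (time, agent_id).
    queue = [(rng.random(), a.agent_id) for a in agents]
    heapq.heapify(queue)
    log: List[Tuple[float, int]] = []

    while queue:
        time, agent_id = heapq.heappop(queue)
        if time > horizon:
            break  # every remaining event is later than the horizon
        agent = by_id[agent_id]
        # Dynamic environment feature: number of actions in the last time unit.
        env = {"recent_events": sum(1 for t, _ in log if time - t <= 1.0)}
        if rng.random() < agent.decide(agent.features, env):
            log.append((time, agent_id))
            agent.actions_taken += 1
        # Schedule the agent's next wake-up after an exponential delay (assumed).
        heapq.heappush(queue, (time + rng.expovariate(1.0), agent_id))
    return log
```

In this sketch each agent's decision depends both on its own features and on the current state of the shared environment, which mirrors the individualized, situational decisions described above; replacing `toy_decision_model` with a classifier trained on trace data would not require changing the event loop.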

Notes

  1. Tweet object: https://developer.twitter.com/en/docs/twitter-api/v1/data-dictionary/object-model/tweet.

  2. Reddit API: https://www.reddit.com/dev/api/.

  3. GitHub commit object: https://docs.github.com/en/rest/reference/commits.

  4. GitHub user object: https://docs.github.com/en/rest/reference/users.

  5. GitHub repository object: https://docs.github.com/en/rest/reference/repos.

Acknowledgements

The authors are grateful to the Defense Advanced Research Projects Agency (DARPA), contract W911NF-17-C-0094, for their support.

Author information

Corresponding author

Correspondence to Alexey Tregubov.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Murić, G., Tregubov, A., Blythe, J. et al. Large-scale agent-based simulations of online social networks. Auton Agent Multi-Agent Syst 36, 38 (2022). https://doi.org/10.1007/s10458-022-09565-7

  • DOI: https://doi.org/10.1007/s10458-022-09565-7
