
MEDICI: A Simple to Use Synthetic Social Network Data Generator

  • Conference paper
Modeling Decisions for Artificial Intelligence (MDAI 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12898)

Abstract

This work is motivated by the need, in both research and applied fields, for synthetic social network data, a need that arises from (i) the difficulty of obtaining real data and (ii) the privacy issues attached to real data. The first issue to address is obtaining a graph with a social-network-type structure and labeling it with communities. The main focus, however, is the generation of realistic data and its assignment to, and propagation within, the graph. The main aim is to implement an easy-to-use, standalone end-user application that addresses these issues. Graph generation and community labeling use the R-MAT and Louvain algorithms respectively, each with some modifications. Data generation is handled by a Java-based system that applies an original seed assignment algorithm, followed by a second algorithm for weighted and probabilistic propagation of data to neighbors and other nodes. The results show that a close fit can be achieved between the initial user specification and the generated data, and that the algorithms have potential for scale-up. The system is publicly available as a Java project on GitHub.
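
As a rough illustration of the graph generation step, the following is a minimal Java sketch of plain R-MAT edge generation. It is not the MEDICI implementation and does not include the modifications mentioned in the abstract. The class name `RmatSketch`, the method `generate`, and the quadrant probabilities in `main` are all hypothetical choices made for this example. Each edge is placed by recursively choosing one of the four quadrants of the adjacency matrix with probabilities a, b, c and d = 1 - a - b - c, once per bit of the node id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of plain R-MAT; not taken from the MEDICI code base.
final class RmatSketch {

    /**
     * Generates edgeCount directed edges {source, target} over 2^scale node ids.
     * a, b and c are the probabilities of the top-left, top-right and
     * bottom-left quadrants; the bottom-right quadrant gets 1 - a - b - c.
     * Duplicate edges and self-loops are kept in this sketch.
     */
    static List<int[]> generate(int scale, int edgeCount,
                                double a, double b, double c, long seed) {
        if (scale < 1 || scale > 30) {
            throw new IllegalArgumentException("scale must be in 1..30");
        }
        if (a < 0 || b < 0 || c < 0 || a + b + c > 1.0) {
            throw new IllegalArgumentException("a, b, c must be >= 0 and sum to at most 1");
        }
        Random rng = new Random(seed);
        List<int[]> edges = new ArrayList<>(edgeCount);
        for (int e = 0; e < edgeCount; e++) {
            int src = 0;
            int dst = 0;
            // One quadrant choice per level fixes one bit of each endpoint id.
            for (int level = 0; level < scale; level++) {
                double r = rng.nextDouble();
                src <<= 1;
                dst <<= 1;
                if (r < a) {
                    // top-left quadrant: both bits stay 0
                } else if (r < a + b) {
                    dst |= 1;            // top-right quadrant
                } else if (r < a + b + c) {
                    src |= 1;            // bottom-left quadrant
                } else {
                    src |= 1;            // bottom-right quadrant
                    dst |= 1;
                }
            }
            edges.add(new int[] {src, dst});
        }
        return edges;
    }

    public static void main(String[] args) {
        // Illustrative parameters only: 2^10 nodes, 5000 edges, skewed quadrants.
        List<int[]> edges = generate(10, 5000, 0.45, 0.15, 0.15, 42L);
        System.out.println("Generated " + edges.size() + " edges");
    }
}
```

A skewed choice of quadrant probabilities (a larger than d, with b and c smaller) tends to concentrate edges on a few node ids, which is how R-MAT produces the heavy-tailed degree distributions typical of social networks. A community detection step such as Louvain would then be run on the resulting edge list.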



Author information

Corresponding author

Correspondence to David F. Nettleton.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Nettleton, D.F., Nettleton, S., Canal i Farriol, M. (2021). MEDICI: A Simple to Use Synthetic Social Network Data Generator. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2021. Lecture Notes in Computer Science (LNAI), vol 12898. Springer, Cham. https://doi.org/10.1007/978-3-030-85529-1_22

  • DOI: https://doi.org/10.1007/978-3-030-85529-1_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85528-4

  • Online ISBN: 978-3-030-85529-1

  • eBook Packages: Computer Science, Computer Science (R0)
