Abstract
This work is motivated by the need, in both research and applied fields, for synthetic social network data, a need that arises because (i) real data is difficult to obtain and (ii) real data raises privacy issues. The first issues to address are obtaining a graph with a social-network-type structure and labeling it with communities. The main focus is then the generation of realistic data, its assignment to the graph, and its propagation within the graph. The main aim is to implement an easy-to-use, standalone end-user application that addresses these issues. The methods used are the R-MAT and Louvain algorithms, with some modifications, for graph generation and community labeling respectively, together with a Java-based system for data generation that uses an original seed assignment algorithm followed by a second algorithm for weighted and probabilistic propagation of data to neighbors and other nodes. The results show that a close fit can be achieved between the initial user specification and the generated data, and that the algorithms have potential for scaling up. The system is publicly available as a Java project on GitHub.
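To make the graph generation step more concrete, the following is a minimal sketch of the standard R-MAT recursion in Java. It is not the modified variant used in the paper, whose details are not given in the abstract. The class name `RmatSketch`, the method `generate`, and the quadrant probabilities used in `main` are assumptions chosen for illustration only. Each edge is placed by repeatedly choosing one of the four quadrants of the adjacency matrix with probabilities a, b, c and d = 1 - a - b - c until a single cell is reached.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class RmatSketch {

    /**
     * Generates directed edges over 2^scale nodes using the standard R-MAT recursion.
     * a, b and c are the probabilities of the top-left, top-right and bottom-left
     * quadrants of the adjacency matrix; the bottom-right quadrant gets 1 - a - b - c.
     * Duplicate edges and self-loops are NOT filtered in this sketch.
     */
    public static List<int[]> generate(int scale, int edgeCount,
                                       double a, double b, double c, long seed) {
        if (scale < 1 || scale > 30) {
            throw new IllegalArgumentException("scale must be in 1..30");
        }
        if (a < 0 || b < 0 || c < 0 || a + b + c > 1.0) {
            throw new IllegalArgumentException("a, b, c must be non-negative and sum to at most 1");
        }
        Random rng = new Random(seed);
        List<int[]> edges = new ArrayList<>(edgeCount);
        for (int e = 0; e < edgeCount; e++) {
            int row = 0;
            int col = 0;
            // Each level halves the matrix in both dimensions, fixing one bit of row and col.
            for (int level = 0; level < scale; level++) {
                double r = rng.nextDouble();
                row <<= 1;
                col <<= 1;
                if (r < a) {
                    // top-left quadrant: both bits stay 0
                } else if (r < a + b) {
                    col |= 1;          // top-right quadrant
                } else if (r < a + b + c) {
                    row |= 1;          // bottom-left quadrant
                } else {
                    row |= 1;          // bottom-right quadrant
                    col |= 1;
                }
            }
            edges.add(new int[] {row, col});
        }
        return edges;
    }

    public static void main(String[] args) {
        // Illustrative parameters only: 16 nodes, 10 edges, skewed quadrant probabilities.
        List<int[]> edges = generate(4, 10, 0.57, 0.19, 0.19, 42L);
        for (int[] edge : edges) {
            System.out.println(edge[0] + " -> " + edge[1]);
        }
    }
}
```

A skewed choice of a, b, c and d concentrates edges on a small set of nodes, which is what gives R-MAT graphs their heavy-tailed degree distribution. A community detection step such as Louvain would then be run on the resulting edge list.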
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Nettleton, D.F., Nettleton, S., Canal i Farriol, M. (2021). MEDICI: A Simple to Use Synthetic Social Network Data Generator. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2021. Lecture Notes in Computer Science, vol 12898. Springer, Cham. https://doi.org/10.1007/978-3-030-85529-1_22
DOI: https://doi.org/10.1007/978-3-030-85529-1_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85528-4
Online ISBN: 978-3-030-85529-1
eBook Packages: Computer Science (R0)