Abstract
Online social networks have become everyday tools for communicating and sharing information and news. One of the most popular is Twitter, where users connect to each other via directed follower relationships. Twitter follower graphs have been studied and characterized by various topological features. Collecting Twitter data, especially crawling the followers of users, is tedious and time-consuming, and the data must be handled carefully because it contains sensitive personal user information. We therefore aim at the fast generation of directed social network graphs with reciprocal edges and high clustering. Our proposed method is based on a previously developed model but relies on fewer hyperparameters and has a significantly lower runtime. Results show that our method not only replicates the crawled directed Twitter graphs well with respect to several topological features and the behavior of an epidemic spreading process, but is also highly scalable, which allows the fast creation of larger graphs that exhibit properties similar to those of real-world networks.
Data availability
The generated and analyzed network graphs in this study are available in the GitHub repository https://github.com/Buters147/Social_Network_Graph_Generator.
Notes
For the code see https://github.com/Buters147/Social_Network_Graph_Generator.
Connecting new first-degree neighbors not only with reciprocal edges but also with directed edges leads to increased values of the clustering coefficient (CC), exceeding 0.6, which is unrealistic for social network graphs.
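To make the two quantities in this note concrete, the following minimal sketch measures edge reciprocity and a clustering coefficient on a toy directed edge list. It assumes that CC refers to the average local clustering coefficient computed with edge directions ignored; the text here does not state which definition is used, so this may differ from the one in the paper. The function names and the toy graph are hypothetical and are not taken from the linked repository.

```python
from itertools import combinations


def reciprocity(edges):
    """Fraction of directed edges (u, v) whose reverse (v, u) is also present."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


def average_clustering(edges):
    """Average local clustering coefficient, ignoring edge direction.

    Assumption: this undirected definition stands in for the CC in the text.
    """
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # skip self-loops; simple graphs have none
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    if not neighbors:
        return 0.0
    total = 0.0
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        # Count pairs of neighbors that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        total += 2 * links / (k * (k - 1))
    return total / len(neighbors)


# Toy graph: a fully reciprocal triangle a, b, c plus a one-way follower d -> a.
toy_edges = [
    ("a", "b"), ("b", "a"),
    ("b", "c"), ("c", "b"),
    ("a", "c"), ("c", "a"),
    ("d", "a"),
]
r = reciprocity(toy_edges)
cc = average_clustering(toy_edges)
print(f"reciprocity = {r:.3f}, CC = {cc:.3f}")
```

In this toy graph six of the seven edges are reciprocated, and the single closed triangle dominates the clustering value, which illustrates how adding edges among a node's neighbors raises CC quickly.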
Acknowledgements
This publication is part of the project “HPC and Big Data Technologies for Global Systems” (HiDALGO), which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 824115. The Know-Center is funded within the Austrian COMET Program—Competence Centers for Excellent Technologies—under the auspices of the Austrian Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology, the Austrian Federal Ministry for Digital and Economic Affairs and by the State of Styria. COMET is managed by the Austrian Research Promotion Agency FFG. The author thanks Bernhard C. Geiger (Know-Center GmbH) for his valuable feedback.
Ethics declarations
Conflict of interest
The author declares that there is no competing interest.
About this article
Cite this article
Schweimer, C. Fast generation of simple directed social network graphs with reciprocal edges and high clustering. Soc. Netw. Anal. Min. 12, 127 (2022). https://doi.org/10.1007/s13278-022-00963-z
DOI: https://doi.org/10.1007/s13278-022-00963-z