NDlib: a python library to model and analyze diffusion processes over complex networks

  • Applications
International Journal of Data Science and Analytics

Abstract

Nowadays the analysis of dynamics of and on networks represents a hot topic in the social network analysis playground. To support students, teachers, developers and researchers, in this work we introduce a novel framework, namely NDlib, an environment designed to describe diffusion simulations. NDlib is designed to be a multi-level ecosystem that can be fruitfully used by different user segments. For this reason, upon NDlib, we designed a simulation server that allows remote execution of experiments as well as an online visualization tool that abstracts its programmatic interface and makes available the simulation platform to non-technicians.

Notes

  1. NetworkX: https://goo.gl/PHXdnL.

  2. NDlib GitHub: https://goo.gl/zC7p7b.

  3. NDlib pypi: https://goo.gl/gc96xW.

  4. NDlib docs: https://goo.gl/VLWtrn.

  5. SoBigData: http://www.sobigdata.eu.

  6. Bokeh: https://goo.gl/VtjuKS.

  7. Matplotlib: https://goo.gl/EY96HV.

  8. DyNetX GitHub: https://goo.gl/UL3JVd.

  9. DyNetX pypi: https://goo.gl/8HbS8d.

  10. DyNetX docs: https://goo.gl/ngKHpY.

  11. Extending NDlib: https://goo.gl/ycoUQ8.

  12. NDlib-REST: https://goo.gl/c6yHcY.

  13. Docker: https://goo.gl/BbWz7X.

  14. Gunicorn: https://goo.gl/TFjebL.

  15. Flask: https://goo.gl/qQhcBn.

  16. Available, currently as a separate branch, at: https://goo.gl/tYi48o.

  17. Bootstrap: https://goo.gl/8xhPdm.

  18. D3.js: https://goo.gl/mPrn3i.

  19. NVD3: https://goo.gl/YW87eA.

  20. Epigrass: https://goo.gl/yjbLRk.

  21. GEMF-sim: https://goo.gl/kcGs6R.

  22. Nepidemix: https://goo.gl/M8rEGM.

  23. EoN: https://goo.gl/cuArFP.

  24. Epydemic: https://goo.gl/PrPHh4.

  25. ComplexNetworkSim: https://goo.gl/nczJTH.

  26. Nxsim: https://goo.gl/U2rDvv.

  27. EpiModel: https://goo.gl/g9RRCM.

  28. EpiModel-viz: https://goo.gl/13Z5mj.

  29. RECON: https://goo.gl/eYMDqh.

  30. Sisspread: https://goo.gl/LSWsUh.

  31. GLEaMviz: https://goo.gl/kTftxZ.

  32. NetLogo: https://goo.gl/82zAoc.

  33. System Sciences: https://goo.gl/wH5kJn.

  34. FRED: https://goo.gl/JwUx7k.

  35. FluTE: https://goo.gl/VEBshU.

  36. Malaria: https://goo.gl/yyStf9.

  37. EpiFire: https://goo.gl/W5QLb1.

  38. Measles: https://goo.gl/R1aM1N.

  39. SNAP library: https://goo.gl/ZYrnH9.

  40. Boost: https://goo.gl/9xKAAR.

  41. graph-tool: https://goo.gl/uUW5kq.

References

  1. Ahrenberg, L., Kok, S., Vasarhelyi, K., Rutherford, A.: Nepidemix (2016)

  2. Van den Broeck, W., Gioannini, C., Gonçalves, B., Quaggiotto, M., Colizza, V., Vespignani, A.: The gleamviz computational tool, a publicly available software to explore realistic epidemic spreading scenarios at the global scale. BMC Infect Dis 11(1), 37 (2011)

  3. Burt, R.S.: Social contagion and innovation: cohesion versus structural equivalence. Am. J. Sociol. 92, 1287 (1987)

  4. Casteigts, A., Flocchini, P., Quattrociocchi, W., Santoro, N.: Time-varying graphs and dynamic networks. Int. J. Parallel Emerg. Distrib. Syst. 27(5), 387–408 (2012)

  5. Castellano, C., Munoz, M.A., Pastor-Satorras, R.: The non-linear q-voter model. Phys. Rev. E 80, 041129 (2009)

  6. Chao, D.L., Halloran, M.E., Obenchain, V.J., Longini Jr., I.M.: Flute, a publicly available stochastic influenza epidemic simulation model. PLoS Comput. Biol. 6(1), e1000656 (2010)

  7. Clifford, P., Sudbury, A.: A model for spatial conflict. Biometrika 60(3), 581–588 (1973). https://doi.org/10.1093/biomet/60.3.581

  8. Coelho, F.C., Cruz, O.G., Codeço, C.T.: Epigrass: a tool to study disease spread in complex networks. Sour. Code Biol. Med. 3(1), 3 (2008)

  9. Deffuant, G., Neau, D., Amblard, F., Weisbuch, G.: Mixing beliefs among interacting agents. Adv. Complex Syst. 3(4), 87–98 (2000)

  10. Friedman, R., Friedman, M.: The Tyranny of the Status Quo. Harcourt Brace Company, Orlando (1984)

  11. Galam, S.: Minority opinion spreading in random geometry. Eur. Phys. J. B 25(4), 403–406 (2002)

  12. Granovetter, M.: Threshold models of collective behavior. Am. J. Sociol. 83(6), 1420–1443 (1978)

  13. Grefenstette, J.J., Brown, S.T., Rosenfeld, R., DePasse, J., Stone, N.T., Cooley, P.C., Wheaton, W.D., Fyshe, A., Galloway, D.D., Sriram, A., et al.: Fred (a framework for reconstructing epidemic dynamics): an open-source software system for modeling infectious diseases and control strategies using census-based populations. BMC Public Health 13(1), 940 (2013)

  14. Havlin, S.: Phone infections. Science 324, 1023 (2009)

  15. Holley, R., Liggett, T.: Ergodic theorems for weakly interacting infinite systems and the voter model. Ann. Probab. 3(4), 643–663 (1975)

  16. Holme, P., Saramäki, J.: Temporal networks. Phys. Rep. 519(3), 97–125 (2012)

  17. Jenness, S., Goodreau, S.M., Morris, M.: Epimodel: Mathematical modeling of infectious disease. r package version 1.3.0. (2017). http://www.epimodel.org

  18. Kempe, D., Kleinberg, J., Tardos, E.: Maximizing the spread of influence through a social network. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’03, pp. 137–146 (2003)

  19. Kermack, W.O., McKendrick, A.: A contribution to the mathematical theory of epidemics. Proceed. R. Soc. Lond. Ser. A Contain. Papers Math. Phys. Character 115(772), 700–721 (1927)

  20. Kiss, I.Z., Miller, J.C., Simon, P.: Mathematics of Epidemics on Networks: From Exact to Approximate Models. Springer (Forthcoming)

  21. Kovanen, L., Karsai, M., Kaski, K., Kertész, J., Saramäki, J.: Temporal motifs in time-dependent networks. J. Statist. Mech. Theory Exp. 2011(11), p11005 (2011)

  22. Krapivsky, P.L., Redner, S., Ben-Naim, E.: A Kinetic View of Statistical Physics. Cambridge University Press, Cambridge (2010)

  23. Lee, S., Rocha, L.E., Liljeros, F., Holme, P.: Exploiting temporal network structures of human interaction to effectively immunize populations. PLoS ONE 7(5), e36439 (2012)

  24. Leskovec, J., Sosič, R.: Snap: a general-purpose network analysis and graph-mining library. ACM Trans. Intell. Syst. Technol. (TIST) 8(1), 1 (2016)

  25. Milli, L., Rossetti, G., Pedreschi, D., Giannotti, F.: Information diffusion in complex networks: The active/passive conundrum. In: Complex Networks (2017)

  26. Milli, L., Rossetti, G., Pedreschi, D., Giannotti, F.: Diffusive phenomena in dynamic networks: a data-driven study. In: 9th Conference on Complex Networks, CompleNet (2018)

  27. Newton, C.M.: Graphics: from alpha to omega in data analysis. In: Wang, P.C. (ed.) Graphical Representation of Multivariate Data, pp. 59–92. Academic Press (1978). https://doi.org/10.1016/B978-0-12-734750-9.50008-3

  28. Pennacchioli, D., Rossetti, G., Pappalardo, L., Pedreschi, D., Giannotti, F., Coscia, M.: The three dimensions of social prominence. In: International Conference on Social Informatics, pp. 319–332. Springer (2013)

  29. Rossetti, G.: Rdyn: graph benchmark handling community dynamics. J. Complex Netw. 5, 893 (2017)

  30. Rossetti, G., Cazabet, R.: Community discovery in dynamic networks: a survey. arXiv preprint arXiv:1707.03186 (2017)

  31. Rossetti, G., Guidotti, R., Miliou, I., Pedreschi, D., Giannotti, F.: A supervised approach for intra-/inter-community interaction prediction in dynamic social networks. Soc. Netw. Anal. Min. 6(1), 86 (2016)

  32. Rossetti, G., Pappalardo, L., Pedreschi, D., Giannotti, F.: Tiles: an online algorithm for community discovery in dynamic social networks. Mach. Learn. pp. 1–29 (2016)

  33. Ruan, Z., Iñiguez, G., Karsai, M., Kertész, J.: Kinetics of social contagion. Phys. Rev. Lett 115, 218702 (2015). https://doi.org/10.1103/PhysRevLett.115.218702

  34. Sahneh, F.D., Vajdi, A., Shakeri, H., Fan, F., Scoglio, C.: Gemfsim: a stochastic simulator for the generalized epidemic modeling framework. J. Comput. Sci. 22, 36–44 (2017)

  35. Sîrbu, A., Loreto, V., Servedio, V.D., Tria, F.: Opinion dynamics with disagreement and modulated information. J. Stat. Phys. 151, 1–20 (2013)

  36. Sîrbu, A., Loreto, V., Servedio, V.D., Tria, F.: Opinion dynamics: models, extensions and external effects. In: Participatory Sensing, Opinions and Collective Awareness, pp. 363–401. Springer International Publishing (2017)

  37. Sznajd-Weron, K., Sznajd, J.: Opinion evolution in closed community. Int. J. Mod. Phys. C 11, 1157–1165 (2001)

  38. Szor, P.: Fighting Computer Virus Attacks. USENIX, Berkeley (2004)

  39. Tabourier, L., Libert, A.S., Lambiotte, R.: Predicting links in ego-networks using temporal information. EPJ Data Sci. 5(1), 1 (2016)

  40. Viard, T., Latapy, M., Magnien, C.: Computing maximal cliques in link streams. Theor. Comput. Sci. 609, 245–252 (2016)

  41. Vilone, D., Giardini, F., Paolucci, M., Conte, R.: Reducing individuals’ risk sensitiveness can promote positive and non-alarmist views about catastrophic events in an agent-based simulation. arXiv preprint arXiv:1609.04566 (2016)

  42. Wang, P., González, M.C., Menezes, R., Barabási, A.L.: Understanding the spread of malicious mobile-phone programs and their damage potential. Int. J. Inf. Secur. 12, 383 (2013)

  43. Watts, D.J.: A simple model of global cascades on random networks. Proc. Natl. Acad. Sci. 99(9), 5766–5771 (2002)

  44. Wilensky, U.: Netlogo (1999)

  45. Word, D.P., Abbott, G.H., Cummings, D., Laird, C.D.: Estimating seasonal drivers in childhood infectious diseases with continuous time and discrete-time models. In: American Control Conference (ACC), 2010, pp. 5137–5142. IEEE (2010)

Acknowledgements

This work is funded by the European Community’s H2020 Program under the funding scheme “FETPROACT-1-2014: Global Systems Science (GSS),” grant agreement #641191 CIMPLEX “Bringing CItizens, Models and Data together in Participatory, Interactive SociaL EXploratories” (CIMPLEX: https://www.cimplex-project.eu). This work is supported by the European Community’s H2020 Program under the scheme “INFRAIA-1-2014-2015: Research Infrastructures,” grant agreement #654024 “SoBigData: Social Mining & Big Data Ecosystem” (SoBigData: http://www.sobigdata.eu).

Author information

Corresponding author

Correspondence to Giulio Rossetti.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

This paper is an extended version of the DSAA’2017 Application Track paper titled “NDlib: Studying Network Diffusion Dynamics”.

Appendices

Appendix: diffusion methods implemented in NDlib

NDlib exposes several network diffusion models, covering both epidemic approaches and opinion dynamics. In particular, the current release of the library (v3.0.0) implements the following algorithms:

Static epidemic models

SI this model was introduced in 1927 by Kermack and McKendrick [19]. In the SI model, during the course of an epidemic, a node is allowed to change its status only from Susceptible (S) to Infected (I). SI assumes that if, during a generic iteration, a susceptible node comes into contact with an infected one, it becomes infected with probability \(\beta \); once a node becomes infected, it stays infected (the only transition is \(S\rightarrow I\)).

SIR this model was also introduced in 1927 by Kermack and McKendrick [19]. In the SIR model, during the course of an epidemic, a node is allowed to change its status from Susceptible (S) to Infected (I), then to Removed (R). SIR assumes that if, during a generic iteration, a susceptible node comes into contact with an infected one, it becomes infected with probability \(\beta \); it can then switch to removed with probability \(\gamma \) (the only transitions allowed are \(S\rightarrow I\rightarrow R\)).

SIS like SIR, the SIS model is a variation of the SI model introduced in [19]. The model assumes that if, during a generic iteration, a susceptible node comes into contact with an infected one, it becomes infected with probability \(\beta \); it can then switch back to susceptible with probability \(\lambda \) (the only transitions allowed are \(S\rightarrow I\rightarrow S\)).
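The SI, SIR and SIS rules differ only in what happens to an infected node, which a short sketch can make concrete. The code below is plain Python using only the standard library; it is not NDlib's own implementation. The function name `epidemic_step`, the adjacency-dict graph representation, and the reading that a susceptible node undergoes a single infection trial per iteration when it has at least one infected neighbor are all assumptions made for illustration.

```python
import random


def epidemic_step(adj, status, beta, recover_prob=0.0, recover_to="R", rng=random):
    """One synchronous iteration of an SI-family model (illustrative sketch).

    adj          -- dict mapping each node to a list of its neighbors
    status       -- dict mapping each node to "S", "I" or "R"
    beta         -- infection probability for an exposed susceptible node
    recover_prob -- gamma (SIR) or lambda (SIS); 0 reproduces plain SI
    recover_to   -- "R" for SIR, "S" for SIS
    """
    new_status = dict(status)  # decisions are based on the previous iteration only
    for node, state in status.items():
        if state == "S":
            has_infected_neighbor = any(status[n] == "I" for n in adj[node])
            if has_infected_neighbor and rng.random() < beta:
                new_status[node] = "I"
        elif state == "I":
            if rng.random() < recover_prob:
                new_status[node] = recover_to
    return new_status
```

With `recover_prob=0` the function behaves as SI; with `recover_to="R"` it behaves as SIR, with `recover_prob` playing the role of \(\gamma \); with `recover_to="S"` it behaves as SIS, with `recover_prob` playing the role of \(\lambda \).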

SEIS like the previous models, SEIS is a variation of the SI model. For many infections, there is a significant incubation period during which the individual has been infected but is not yet infectious. During this period, the individual is in the exposed (E) status. SEIS assumes that if, during a generic iteration, a susceptible node comes into contact with an infected one, it switches to exposed with probability \(\beta \); it then becomes infected with probability \(\epsilon \) and can later switch back to susceptible with probability \(\lambda \) (the only transitions allowed are \(S\rightarrow E \rightarrow I\rightarrow S\)).

SEIR like the SEIS model, SEIR takes the incubation period into account through the exposed (E) status. SEIR assumes that if, during a generic iteration, a susceptible node comes into contact with an infected one, it switches to exposed with probability \(\beta \); it then becomes infected with probability \(\epsilon \) and can later switch to removed with probability \(\gamma \) (the only transitions allowed are \(S\rightarrow E \rightarrow I\rightarrow R\)).

SWIR this model has four states: Susceptible (S), Infected (I), Recovered (R) and Weakened (W). Besides the usual transition \(S\rightarrow I\rightarrow R\), the transition \(S\rightarrow W \rightarrow I\rightarrow R\) is also allowed. At time step n, a node in state I is selected and the states of all its neighbors are checked one by one. If the state of a neighbor is S, it changes either (i) to I with probability \(\kappa \) or (ii) to W with probability \(\mu \). If the state of a neighbor is W, it changes to I with probability \(\nu \). The process is repeated for all nodes in state I, after which the state of all these nodes becomes R.

Threshold this model was introduced in 1978 by Granovetter [12]. In the Threshold model, a node has two distinct and mutually exclusive behavioral alternatives, e.g., it can adopt or not adopt a given behavior, participate or not participate in a riot. Each node's individual decision depends on the percentage of its neighbors that have made the same choice, thus imposing a threshold. The model works as follows: each node starts with its own threshold \(\tau \) and status (infected or susceptible). During iteration t, every node is observed: if and only if the percentage of its neighbors that were infected at time \(t-1\) is greater than its threshold, it becomes infected as well.
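The update rule can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the library's code: the name `threshold_step`, the adjacency-dict input and the per-node `thresholds` dictionary are hypothetical, and nodes without neighbors are simply skipped.

```python
def threshold_step(adj, infected, thresholds):
    """One iteration of the Threshold model (illustrative sketch).

    adj        -- dict mapping each node to a list of its neighbors
    infected   -- set of nodes infected at time t-1
    thresholds -- dict mapping each node to its threshold tau in [0, 1]
    Returns the set of nodes infected at time t.
    """
    newly_infected = set()
    for node, neighbors in adj.items():
        if node in infected or not neighbors:
            continue
        # Fraction of neighbors that were already infected at time t-1.
        fraction = sum(1 for n in neighbors if n in infected) / len(neighbors)
        if fraction > thresholds[node]:
            newly_infected.add(node)
    return infected | newly_infected
```

On a four-node path with a single infected endpoint and \(\tau = 0.4\) for every node, the infection advances by one node per iteration, because each frontier node sees exactly half of its neighbors infected.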

Kertesz Threshold this model was introduced in 2015 by Ruan et al. [33] and it is an extension of the Watts threshold model [43]. The authors extend the classical model by introducing a density r of blocked nodes (nodes that are immune to social influence) and a probability of spontaneous adoption p to capture external influence. Thus, the model distinguishes three kinds of nodes: Blocked (B), Susceptible (S) and Adopting (A). A node can adopt either under its neighbors' influence or due to endogenous effects.

Independent Cascades this model was introduced by Kempe et al. in 2003 [18]. The Independent Cascades model starts with an initial set of active nodes \(A_0\): the diffusive process unfolds in discrete steps according to the following randomized rule:

  • When node v becomes active in step t, it is given a single chance to activate each currently inactive neighbor w; it succeeds with a probability \(p_{v,w}\).

  • If w has multiple newly activated neighbors, their attempts are sequenced in an arbitrary order.

  • If v succeeds, then w will become active in step \(t+1\); but whether or not v succeeds, it cannot make any further attempts to activate w in subsequent rounds.

The process runs until no more activations are possible.
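The randomized rule above can be sketched as a frontier-based loop. The code is a minimal standard-library illustration, not NDlib's implementation; the function name `independent_cascade`, the adjacency-dict input and the `prob(v, w)` callable standing in for \(p_{v,w}\) are assumptions.

```python
import random


def independent_cascade(adj, seeds, prob, rng=random):
    """Run the Independent Cascades process until no activation is possible.

    adj   -- dict mapping each node to a list of its (out-)neighbors
    seeds -- initial active set A_0
    prob  -- callable prob(v, w) returning the activation probability p_{v,w}
    Returns a list whose t-th element is the set of nodes activated at step t.
    """
    active = set(seeds)
    frontier = list(seeds)          # nodes that became active in the last step
    steps = [set(seeds)]
    while frontier:
        next_frontier = []
        for v in frontier:          # each v appears in a frontier exactly once,
            for w in adj[v]:        # so it gets a single chance per neighbor
                if w not in active and rng.random() < prob(v, w):
                    active.add(w)
                    next_frontier.append(w)   # w starts acting at step t+1
        frontier = next_frontier
        if frontier:
            steps.append(set(frontier))
    return steps
```

Because a node enters the frontier only in the step in which it is activated, it never retries an edge in later rounds, which is the property the third rule requires.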

Node Profile this model is a variation of the Threshold one, introduced in [25]. It assumes that the diffusion process is only apparent; each node decides whether to adopt a given behavior, once aware of its existence, only on the basis of its own interests. In this scenario, peer pressure is completely ruled out from the overall model: it is not important how many of its neighbors have adopted a specific behavior; if the node does not like it, it will not change its interests. Each node has its own profile describing how likely it is to accept a behavior similar to the one that is currently spreading. The diffusion process starts from a set of nodes that have already adopted a given behavior H: for each of the susceptible nodes in the neighborhood of a node u that has already adopted H, an unbalanced coin is flipped, the unbalance given by the personal profile of the susceptible node; if a positive result is obtained, the susceptible node will adopt the behavior.

Node Profile-Threshold this model, also an extension of the Threshold one [25], assumes the existence of node profiles that act as preferential schemas for individual tastes, but relaxes the constraints imposed by the Profile model by letting nodes be influenced via peer pressure mechanisms. The peer pressure is modeled with a threshold. The diffusion process starts from a set of nodes that have already adopted a given behavior H: for each of the susceptible nodes an unbalanced coin is flipped if the percentage of its neighbors that are already infected exceeds its threshold. As in the Profile model, the coin unbalance is given by the personal profile of the susceptible node; if a positive result is obtained, the susceptible node will adopt the behavior.
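The two profile-based models differ only in the condition that triggers the coin flip, so a single sketch can cover both. The code below is a standard-library illustration under assumptions: the name `profile_step`, the dictionaries `profile` and `threshold`, and the choice of one coin flip per susceptible node per iteration are hypothetical and are not taken from the library.

```python
import random


def profile_step(adj, adopted, profile, threshold=None, rng=random):
    """One iteration of the Profile / Profile-Threshold models (sketch).

    adj       -- dict mapping each node to a list of its neighbors
    adopted   -- set of nodes that already adopted behavior H
    profile   -- dict mapping each node to its adoption probability
    threshold -- None for the Profile model; otherwise a dict mapping each
                 node to the neighbor fraction that must be exceeded
    Returns the set of adopters after the iteration.
    """
    newly_adopted = set()
    for node, neighbors in adj.items():
        if node in adopted or not neighbors:
            continue
        n_adopters = sum(1 for n in neighbors if n in adopted)
        if threshold is None:
            exposed = n_adopters > 0      # Profile: knowing H exists is enough
        else:
            exposed = n_adopters / len(neighbors) > threshold[node]
        # Unbalanced coin: the bias is the node's own profile.
        if exposed and rng.random() < profile[node]:
            newly_adopted.add(node)
    return adopted | newly_adopted
```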

Dynamic epidemic models

NDlib (starting from v3.0.0) implements dynamic network versions of classic compartmental models (SI, SIS, SIR) [26], leveraging DyNetX graph structures. Such models, briefly described here, are defined for snapshot-based as well as temporal networks.

DynSI this model adapts the classical formulation of the SI model (where the transition is \(S\rightarrow I\)) to the snapshot-based topology evolution where the network structure is updated during each iteration. The model applied at day \(t_i\) will then use as starting infected set the result of the iteration performed on the interaction graph of the previous day, and as social structure the current one. Such a choice implies that not only the interactions of consecutive snapshots could vary but that the node sets can also differ.

DynSIS like the previous dynamic model, DynSIS adapts the classical formulation of the SIS model (where the transitions are \(S\rightarrow I\rightarrow S\)) to the snapshot-based topology evolution where the network structure is updated during each iteration. The DynSIS implementation assumes that the process occurs on a directed/undirected dynamic network.

DynSIR like the previous models, DynSIR adapts the classical formulation of the SIR model (where the transitions are \(S\rightarrow I\rightarrow R\)) to the snapshot-based topology evolution where the network structure is updated during each iteration. The DynSIR implementation assumes that the process occurs on a directed/undirected dynamic network.
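The snapshot-based scheme can be illustrated for the simplest case, DynSI. The sketch below does not use DyNetX: each snapshot is modeled as a plain adjacency dict, and the name `dyn_si` and the one-trial-per-iteration infection rule are assumptions made for illustration. It shows how the infected set is carried from one snapshot to the next while the topology, and even the node set, changes.

```python
import random


def dyn_si(snapshots, infected, beta, rng=random):
    """Run SI over an ordered sequence of network snapshots (sketch).

    snapshots -- list of adjacency dicts, one per time step; node sets may differ
    infected  -- initially infected nodes
    beta      -- infection probability for an exposed susceptible node
    Returns the infected set after each snapshot.
    """
    infected = set(infected)
    history = []
    for adj in snapshots:                      # current social structure
        newly_infected = set()
        for node, neighbors in adj.items():
            exposed = any(n in infected for n in neighbors)
            if node not in infected and exposed and rng.random() < beta:
                newly_infected.add(node)
        infected |= newly_infected             # carried over to the next snapshot
        history.append(set(infected))
    return history
```

Replaying the same snapshots in a different order generally gives a different outcome, since a contact can only transmit the infection if one of its endpoints is already infected when the contact occurs.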

Opinion dynamic models

Voter this model is one of the simplest models of opinion dynamics, originally introduced to analyze competition of species [7] and soon after applied to model elections [15]. The model assumes the opinion of an individual to be a discrete variable \(\pm \,1\). The state of the population varies based on a very simple update rule: at each iteration, a random individual is selected, who then copies the opinion of one random neighbor. Starting from any initial configuration, on a complete network, the entire population converges to a consensus on one of the two options [22]. The probability that consensus is reached on opinion \(+1\) is equal to the initial fraction of individuals holding that opinion.
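The update rule is short enough to sketch directly. The code below is a standard-library illustration, not the library's implementation; the names `voter_step` and `run_voter` and the adjacency-dict input are assumptions.

```python
import random


def voter_step(adj, opinion, rng=random):
    """Pick a random node and let it copy a random neighbor's opinion (in place).

    adj     -- dict mapping each node to a list of its neighbors
    opinion -- dict mapping each node to +1 or -1
    Returns the node that was selected.
    """
    node = rng.choice(list(adj))
    neighbors = list(adj[node])
    if neighbors:
        opinion[node] = opinion[rng.choice(neighbors)]
    return node


def run_voter(adj, opinion, max_steps, rng=random):
    """Iterate until consensus or max_steps; return the number of steps used."""
    for step in range(max_steps):
        if len(set(opinion.values())) == 1:
            return step
        voter_step(adj, opinion, rng)
    return max_steps
```

Consensus is an absorbing state in this sketch: once every node holds the same opinion, copying a neighbor can no longer change anything.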

Sznajd this model [37] is a variant of the spin model employing the theory of social impact, which takes into account the fact that a group of individuals with the same opinion can influence their neighbors more than a single individual can. In the original model, the social network is a 2-dimensional lattice; however, we also implemented the variant on arbitrary complex networks. Each agent has an opinion \(\sigma _i=\pm 1\); at each time step, a pair of neighboring agents is selected and, if their opinions coincide, all their neighbors take that opinion. The model has been shown to converge to one of the two agreeing stationary states, depending on the initial density of up-spins (transition at 50% density).
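The pair rule on an arbitrary network can be sketched as follows. This is a standard-library illustration under assumptions (the names `sznajd_update` and `sznajd_step` and the adjacency-dict input are hypothetical); the pair update is separated from the random pair selection so that the rule itself can be checked deterministically.

```python
import random


def sznajd_update(i, j, adj, opinion):
    """Apply the Sznajd rule to the neighboring pair (i, j), in place.

    If i and j agree, every neighbor of either agent takes their opinion.
    Returns True when the pair agreed and the rule was applied.
    """
    if opinion[i] != opinion[j]:
        return False
    for n in set(adj[i]) | set(adj[j]):
        opinion[n] = opinion[i]
    return True


def sznajd_step(adj, opinion, rng=random):
    """Select a random pair of neighboring agents and apply the rule."""
    i = rng.choice(list(adj))
    if not adj[i]:
        return False
    j = rng.choice(list(adj[i]))
    return sznajd_update(i, j, adj, opinion)
```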

Q-Voter this model was introduced as a generalization of discrete opinion dynamic models [5]. Here, N individuals hold an opinion \(\pm \,1\). At each time step, a set of q neighbors is chosen and, if they agree, they influence one neighbor chosen at random, i.e., this agent copies the opinion of the group. If the group does not agree, the agent flips its opinion with probability \(\epsilon \). It is clear that the voter and Sznajd models are special cases of this more recent model (\(q=1, \epsilon =0\) and \(q=2, \epsilon =0\), respectively). Analytic results for \(q\le 3\) validate the numerical results obtained for the special case models, with transitions from an ordered phase (small \(\epsilon \)) to a disordered one (large \(\epsilon \)). For \(q>3\), a new type of transition between the two phases appears, which consists of passing through an intermediate regime where the final state depends on the initial condition. We implemented in NDlib the model with \(\epsilon =0\).
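One possible reading of this rule, in which the influenced agent is fixed first and the group of q agents is drawn from its neighbors without repetition, can be sketched as below. This is an assumption-laden illustration in plain Python, not the library's code: the name `q_voter_update`, the explicit `node` argument and the choice to skip agents with fewer than q neighbors are all hypothetical.

```python
import random


def q_voter_update(node, adj, opinion, q, epsilon=0.0, rng=random):
    """Apply one q-voter update to `node`, in place (illustrative sketch).

    A group of q distinct neighbors of `node` is drawn. If the group is
    unanimous, `node` copies its opinion; otherwise `node` flips its own
    opinion with probability epsilon. Returns True if the opinion changed.
    """
    neighbors = list(adj[node])
    if len(neighbors) < q:
        return False                      # not enough neighbors to form a group
    group = rng.sample(neighbors, q)
    group_opinions = {opinion[n] for n in group}
    if len(group_opinions) == 1:
        new_opinion = group_opinions.pop()
    elif rng.random() < epsilon:
        new_opinion = -opinion[node]
    else:
        return False
    changed = new_opinion != opinion[node]
    opinion[node] = new_opinion
    return changed
```

With `q=1` and `epsilon=0` the update reduces to the voter rule of copying one random neighbor.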

Majority Rule this model is a different discrete model of opinion dynamics, proposed to describe public debates [11]. Agents take discrete opinions \(\pm \,1\), just like the voter model. All agents can interact with all other agents (also in our implementation), so the social network is always a complete graph. At each time step, a group of r agents is selected randomly and they all take the majority opinion within the group. The group size can be fixed or taken at each time step from a specific distribution. If r is odd, then the majority opinion is always defined; however, if r is even, there could be tied situations. To select a prevailing opinion, in this case, a bias in favor of one opinion (\(+1\)) is introduced. This idea is inspired by the concept of social inertia [10].
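The group update, including the tie-breaking bias for even r, can be sketched as follows. The code is a standard-library illustration under assumptions (the name `majority_rule_step` and the `tie_break` parameter are hypothetical); since the interaction graph is complete, only the opinion dictionary is needed.

```python
import random


def majority_rule_step(opinion, r, rng=random, tie_break=1):
    """Select r agents at random and give them all the group's majority opinion.

    opinion   -- dict mapping each agent to +1 or -1 (modified in place)
    r         -- group size, at most the number of agents
    tie_break -- opinion imposed when the group is evenly split (bias toward +1)
    Returns the list of selected agents.
    """
    group = rng.sample(sorted(opinion), r)
    total = sum(opinion[a] for a in group)   # sign of the sum gives the majority
    if total == 0:
        winner = tie_break
    else:
        winner = 1 if total > 0 else -1
    for agent in group:
        opinion[agent] = winner
    return group
```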

Cognitive Opinion Dynamics this model was introduced by Vilone et al. [41] and models the state of individuals taking into account several cognitively grounded variables. The aim is to simulate a response to risk in catastrophic events in the presence of external (institutional) information. The individual opinion is modeled as a continuous variable \(O_i \in [0,1]\), representing the degree of perception of the risk (how probable it is that the catastrophic event will actually happen). This opinion evolves through interactions with neighbors and external information, based on four internal variables for each individual i: risk sensitivity (\(R_i \in \{-1,0,1\}\)), tendency to inform others (\(\beta _i \in [0,1]\)), trust in institutions (\(T_i \in [0,1]\)) and trust in peers (\(\Pi _i = 1-T_i\)). These values are generated when the population is initialized and stay fixed during the simulation. In our implementation, we allow some control over the distribution of these parameters. The update rules define how the \(O_i\) values change in time (see the original paper [41] for details). The model was shown to reproduce various real situations well; in particular, risk sensitivity turns out to be more important than trust in institutional information when it comes to evaluating risky situations.

Cite this article

Rossetti, G., Milli, L., Rinzivillo, S. et al. NDlib: a python library to model and analyze diffusion processes over complex networks. Int J Data Sci Anal 5, 61–79 (2018). https://doi.org/10.1007/s41060-017-0086-6
