Python Packages for Networks
Synonyms
Glossary
 CSV

Comma-separated values is a plain text format for storing tabular data.
 JSON

JavaScript Object Notation is a data-interchange plain text format.
 Large network

A network with several thousands or millions of nodes.
 Network analysis

A study of networks as representations of relations between discrete objects.
Definition
Python is a high-level, general-purpose programming language that is easy to understand and learn. Several representations of networks in Python have been proposed, and on their basis several Python libraries were developed to support the programming of network analysis tasks.
Introduction
Python is an open-source, interpreted, interactive, object-oriented programming language (Python Software Foundation, https://www.python.org/). Its name comes from the BBC TV show Monty Python’s Flying Circus. It runs on all main platforms: Windows, Mac, Linux/Unix, and Android. It is a successor of the programming language ABC, which was based on ideas of structured programming (Dahl et al. 1972). Guido van Rossum started developing Python at CWI in December 1989. It reached version 1.0 in January 1994. In 2001, the Python Software Foundation (PSF) was formed. Guido remains Python’s principal author, a Benevolent Dictator for Life. Python 2.0 was released on October 16, 2000, and Python 3.0 on December 3, 2008.
Python programs are interpreted, which slows down their execution compared with compiled programs. In most cases, the critical tasks are programmed in compiled languages and made available as packages (libraries). Besides the standard packages, Python has a broad collection of packages for different tasks; see PyPI, the Python Package Index (https://pypi.python.org/pypi), which contained 113,701 packages in July 2017. Python’s emphasis is on fast program development and readability. It is also easy to learn: it is a kind of modern Basic. Besides R, Python is the main programming language used in data analysis.
Support for graphs in Python is an old problem; see the page “A Python Graph API?” (https://wiki.python.org/moin/PythonGraphApi). It starts with the question: what is a graph? There are two approaches to this question, particular and general. The fathers of graph theory chose the particular approach: Berge (1958) considered relational (directed) graphs and Harary (1969) simple undirected graphs without loops. A general approach can be found in Zykov (1969). In this entry, we use the general approach: a graph may contain both edges and arcs, a pair of nodes can be linked by multiple links, and loops are allowed.
In most applications of graphs, we have to consider additional information about nodes or links – we are essentially dealing with networks.
Networks

A network \( \mathcal{N}=(\mathcal{V},\mathcal{L},\wp,\mathcal{W}) \) consists of:

A graph \( \mathcal{G}=(\mathcal{V},\mathcal{L}) \), where \( \mathcal{V} \) is the set of nodes, \( \mathcal{A} \) is the set of arcs, \( \mathcal{E} \) is the set of edges, and \( \mathcal{L}=\mathcal{E}\cup \mathcal{A} \) is the set of links; \( n=|\mathcal{V}| \) and \( m=|\mathcal{L}| \)

\( \wp \), the set of node value functions or properties: \( p:\mathcal{V}\to A \)

\( \mathcal{W} \), the set of link value functions or weights: \( w:\mathcal{L}\to B \)
Types of Networks
In a two-mode network \( \mathcal{N}=((\mathcal{V}_1,\mathcal{V}_2),\mathcal{L},\wp,\mathcal{W}) \), the set of nodes is split into two subsets. Each link has one end-node in each subset.
In a multirelational network \( \mathcal{N}=(\mathcal{V},(\mathcal{L}_i, i\in I),\wp,\mathcal{W}) \), the set of links is partitioned into several mutually disjoint subsets, called relations. Such networks are often obtained from text decomposed into simple sentences of the form (Subject Verb Object). Subjects and Objects are represented by nodes, and Verbs determine relations.
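The construction from (Subject Verb Object) sentences can be sketched in a few lines of plain Python. This is only an illustration: the triples are invented, and the function name build_multirelational and the choice to store each relation as a list of arcs are assumptions, not part of any library discussed in this entry.

```python
from collections import defaultdict

# Hypothetical (Subject, Verb, Object) triples, invented for illustration.
triples = [
    ("Ann", "knows", "Bob"),
    ("Bob", "knows", "Cid"),
    ("Ann", "manages", "Cid"),
    ("Cid", "advises", "Ann"),
]


def build_multirelational(triples):
    """Return (nodes, relations); relations maps each verb to its list of arcs."""
    nodes = set()
    relations = defaultdict(list)
    for subject, verb, obj in triples:
        nodes.update((subject, obj))            # Subjects and Objects become nodes
        relations[verb].append((subject, obj))  # the Verb selects the relation
    return nodes, dict(relations)


nodes, relations = build_multirelational(triples)
print(sorted(nodes))       # ['Ann', 'Bob', 'Cid']
print(relations["knows"])  # [('Ann', 'Bob'), ('Bob', 'Cid')]
```

Each arc is stored under exactly one verb, so the relations are mutually disjoint sets of links even when two relations connect the same pair of nodes.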
In a temporal network \( \mathcal{N}=(\mathcal{V},\mathcal{L},\mathcal{T},\wp,\mathcal{W}) \), the time \( \mathcal{T} \) is added. Each node and each link is assigned its activity set, the set of time points at which it is present. Properties and weights can also change through time; they are then described by temporal quantities (Batagelj and Praprotnik 2016).
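One possible way to hold activity sets in Python is a list of time intervals per node and per link. The sketch below assumes half-open intervals [start, end) over integer time points; the data and the names is_active and time_slice are invented for the illustration and are not taken from a particular library.

```python
def is_active(activity, t):
    """True if time point t lies in one of the half-open intervals [start, end)."""
    return any(start <= t < end for start, end in activity)


# Hypothetical activity sets of nodes and of links (arcs given as node pairs).
node_activity = {"a": [(1, 10)], "b": [(1, 5), (7, 10)], "c": [(3, 8)]}
link_activity = {
    ("a", "b"): [(2, 4), (7, 9)],
    ("b", "c"): [(3, 5)],
    ("a", "c"): [(6, 8)],
}


def time_slice(node_activity, link_activity, t):
    """Return the nodes and links that are present at time point t."""
    nodes = {v for v, activity in node_activity.items() if is_active(activity, t)}
    links = {
        (u, v)
        for (u, v), activity in link_activity.items()
        # a link can be present only when both of its end-nodes are present
        if is_active(activity, t) and u in nodes and v in nodes
    }
    return nodes, links


print(time_slice(node_activity, link_activity, 3))
print(time_slice(node_activity, link_activity, 6))
```

A temporal quantity could be stored in the same style by adding a value to each interval, for example (start, end, value).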
A collection of networks consists of some (one-mode and two-mode) networks with common subsets of nodes.
Types of networks can be combined, for example, into a temporal two-mode multirelational network.
Description of Networks
How do we describe a network \( \mathcal{N} \)? In principle the answer is simple: we list its components \( \mathcal{V} \), \( \mathcal{L} \), \( \wp \), and \( \mathcal{W} \). The simplest way is to provide \( (\mathcal{V},\wp) \) and \( (\mathcal{L},\mathcal{W}) \) in the form of two tables. Both tables are often maintained in Excel and can be exported as text in CSV (comma-separated values) format.
As an example, let us describe a part of a bibliographic network determined by the following works: Generalized blockmodeling (Doreian et al. 2005), Clustering with relational constraint (Ferligoj and Batagelj 1982), Partitioning signed social networks (Doreian and Mrvar 2009), and The Strength of Weak Ties (Granovetter 1973).
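As a rough sketch of such a two-table description, the works above and their authors (as given in the reference list) could be written out with Python's csv module. The column names, the node ids, and the authored_by relation label are assumptions made for this illustration.

```python
import csv
import io

# Node table (V, P): works and authors; authors have no year, so that cell stays empty.
NODE_FIELDS = ["id", "label", "mode", "year"]
node_rows = [
    ("w1", "Generalized blockmodeling", "work", 2005),
    ("w2", "Clustering with relational constraint", "work", 1982),
    ("w3", "Partitioning signed social networks", "work", 2009),
    ("w4", "The strength of weak ties", "work", 1973),
    ("a1", "Doreian", "author", ""),
    ("a2", "Batagelj", "author", ""),
    ("a3", "Ferligoj", "author", ""),
    ("a4", "Mrvar", "author", ""),
    ("a5", "Granovetter", "author", ""),
]

# Link table (L, W): one row per authorship link.
LINK_FIELDS = ["source", "target", "relation"]
link_rows = [
    ("w1", "a1", "authored_by"), ("w1", "a2", "authored_by"), ("w1", "a3", "authored_by"),
    ("w2", "a3", "authored_by"), ("w2", "a2", "authored_by"),
    ("w3", "a1", "authored_by"), ("w3", "a4", "authored_by"),
    ("w4", "a5", "authored_by"),
]


def to_csv(fields, rows):
    """Serialize a header and its rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text into a list of dictionaries keyed by the header."""
    return list(csv.DictReader(io.StringIO(text)))


nodes_csv = to_csv(NODE_FIELDS, node_rows)
links_csv = to_csv(LINK_FIELDS, link_rows)
nodes = from_csv(nodes_csv)
links = from_csv(links_csv)
print(nodes_csv.splitlines()[1])  # w1,Generalized blockmodeling,work,2005
print(len(nodes), len(links))     # 9 8
```

The empty year cells of the author rows show how cells that do not apply to every node accumulate when different kinds of nodes share one table.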
In large networks, to avoid empty cells in the tables, we split the network into several subnetworks; that is, we make a collection of networks.
Factorization and Description of Large Networks
To save space and improve computing efficiency, we often replace the values of categorical variables with integers. In R this encoding is called factorization. We enumerate all possible values of a given categorical variable (the coding table) and then replace each value by its index in the coding table. This approach is used in most programs dealing with large networks. Unfortunately the coding table is often a kind of metadata.
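The encoding step can be sketched in plain Python as follows. The function name factorize, its start parameter, and the sample values are assumptions made for the illustration.

```python
def factorize(values, start=0):
    """Replace categorical values by integer codes.

    Returns (codes, coding), where coding maps each distinct value to its index
    and indices are assigned in order of first appearance, counting from start.
    """
    coding = {}
    codes = []
    for value in values:
        if value not in coding:
            coding[value] = len(coding) + start
        codes.append(coding[value])
    return codes, coding


countries = ["SI", "US", "SI", "DE", "US", "SI"]  # hypothetical node property
codes, coding = factorize(countries, start=1)
print(codes)   # [1, 2, 1, 3, 2, 1]
print(coding)  # {'SI': 1, 'US': 2, 'DE': 3}

# The coding table is needed to recover the original values.
labels = {index: value for value, index in coding.items()}
decoded = [labels[code] for code in codes]
```

Without the coding table the integer codes cannot be decoded, so it has to be kept together with the encoded data.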
netJSON
JSON (JavaScript Object Notation) is becoming very popular for describing and exchanging structured objects among programs and web applications. It is human-readable and preserves the structure of complex data objects. In all main programming languages, efficient libraries exist for processing JSON files.
In general, ids in a netJSON description can be any immutable data objects, provided that each node/link gets a different id.
In the description in Fig. 6, we could omit the ids. In such a case, they are determined implicitly by the position of the item in its list. Since the value of info.org is 1, the counting starts with 1.
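The implicit numbering can be illustrated with Python's json module. Only info.org is taken from the text above; the remaining field names (nodes, links, lab, type, n1, n2, id) and the helper with_implicit_ids are assumptions chosen for this sketch and may differ from the actual netJSON specification.

```python
import json

# A toy description without explicit ids (field names are illustrative assumptions).
text = """
{
  "info": {"org": 1, "title": "toy example"},
  "nodes": [{"lab": "Ann"}, {"lab": "Bob"}, {"lab": "Cid"}],
  "links": [{"type": "arc", "n1": 1, "n2": 2},
            {"type": "edge", "n1": 2, "n2": 3}]
}
"""


def with_implicit_ids(net):
    """Give every node and link without an id its list position, counted from info.org."""
    origin = net["info"]["org"]
    for key in ("nodes", "links"):
        for position, item in enumerate(net[key], start=origin):
            item.setdefault("id", position)
    return net


net = with_implicit_ids(json.loads(text))
print([node["id"] for node in net["nodes"]])  # [1, 2, 3]
print([link["id"] for link in net["links"]])  # [1, 2]
```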
JSON allows field values to be structured objects. Therefore we can, for example, use temporal quantities as property values or weights, or specify a function that transforms a property value or a weight.
A general netJSON format that will support a description of collections of (linked) networks is still under development.
Representations of Networks in Python
A representation of a network in Python depends on network size, variability (static/dynamic), and intended operations on it. Selecting the right representation can improve the efficiency of network processing. Libraries usually select a representation that leads to efficient algorithms for most tasks.
A “classic” graph implementation as a “dictionary of list” was proposed by Guido van Rossum (1998). For example, a graph with \( \mathcal{V}=\{A,B,C,D,E,F\} \) and \( \mathcal{A}=\{(A,B),(A,C),(B,C),(B,D),(C,D),(D,C),(E,F),(F,C)\} \) is represented as a dictionary that maps each node to the list of its out-neighbors.
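Written out from the arc set above, a minimal sketch of this representation looks as follows. The path search is a simple depth-first variant added for illustration and is not claimed to reproduce the code of the cited essay.

```python
# Each node is mapped to the list of nodes reachable by one arc.
graph = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": ["C"],
    "E": ["F"],
    "F": ["C"],
}


def find_path(graph, start, end, path=()):
    """Return some directed path from start to end as a list of nodes, or None."""
    path = path + (start,)
    if start == end:
        return list(path)
    for node in graph.get(start, []):
        if node not in path:  # do not revisit nodes already on the current path
            found = find_path(graph, node, end, path)
            if found is not None:
                return found
    return None


print(find_path(graph, "A", "D"))  # ['A', 'B', 'C', 'D']
print(find_path(graph, "A", "E"))  # None
```

Listing the out-neighbors of a node is immediate in this form, whereas testing whether a particular arc exists requires scanning a list.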
Even better is a “dictionary of dictionary” representation, which is the basis of PADS, a collection of Python algorithms and data structures implemented by David Eppstein of the University of California, Irvine (http://www.ics.uci.edu/~eppstein/PADS/). Another interesting approach to network representation is proposed in the library graphABC (http://www.linux.it/~della/GraphABC/). See also the book by Hetland (2010).
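A small sketch may show why the nested form is convenient: the inner dictionary gives fast arc lookup and a natural place for a weight. The weights below are invented, and the helper transpose is an illustrative assumption, not a function of PADS or graphABC.

```python
# Outer keys are nodes; inner keys are out-neighbors; inner values are arc weights.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {"C": 1},
    "E": {"F": 2},
    "F": {"C": 3},
}

print("C" in graph["A"])  # True: membership test on a dictionary
print(graph["B"]["D"])    # 5: weight of the arc (B, D)


def transpose(graph):
    """Return the graph with every arc reversed, keeping the weights."""
    reverse = {node: {} for node in graph}
    for u, neighbors in graph.items():
        for v, weight in neighbors.items():
            reverse.setdefault(v, {})[u] = weight
    return reverse


print(transpose(graph)["C"])  # in-neighbors of C with their weights
```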
To implement prototype network analysis algorithms and programs in Python, we developed in 2009 the Python library Nets (https://github.com/bavla/Nets), which supports basic operations with networks and is based on an elaborated “dictionary of dictionary” representation. In Nets each node and each link has its id. If the user does not specify a link id, Nets determines it.

A network in Nets is described by three dictionaries:

_info: keys are general properties of a network. System properties are network, title, simple, directed, multirel, mode, temporal, meta, nNodes, nArcs, nEdges, time, etc. User properties such as nWeak, planar, etc. can also be included.

_nodes: keys are ids of nodes. A value is a list of four dictionaries: edgeStar, inArcStar, outArcStar, and nodeProperties. Each star is again a dictionary that has ids of neighboring nodes for keys and lists of link ids for values.

_links: keys are ids of links. A value is a list [nodeId1, nodeId2, directed, relId, linkProperties], where linkProperties is a dictionary of weights.
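To make the layout concrete, the following toy class mimics the three attributes described above. It is a sketch only: the class name TinyNet and its methods are hypothetical and do not reproduce the API of the Nets library, and the automatic link numbering is an assumed scheme.

```python
class TinyNet:
    """Toy container with _info, _nodes, and _links (not the Nets API)."""

    def __init__(self, **info):
        self._info = dict(info)
        self._nodes = {}  # node id -> [edgeStar, inArcStar, outArcStar, nodeProperties]
        self._links = {}  # link id -> [nodeId1, nodeId2, directed, relId, linkProperties]

    def add_node(self, node_id, **properties):
        self._nodes.setdefault(node_id, [{}, {}, {}, {}])[3].update(properties)

    def add_link(self, u, v, directed=True, rel_id=None, link_id=None, **weights):
        for node_id in (u, v):
            self.add_node(node_id)
        if link_id is None:  # assumed scheme: the first free positive integer
            link_id = len(self._links) + 1
            while link_id in self._links:
                link_id += 1
        self._links[link_id] = [u, v, directed, rel_id, dict(weights)]
        if directed:
            self._nodes[u][2].setdefault(v, []).append(link_id)  # outArcStar of u
            self._nodes[v][1].setdefault(u, []).append(link_id)  # inArcStar of v
        else:
            self._nodes[u][0].setdefault(v, []).append(link_id)  # edgeStar of u
            if u != v:  # an undirected loop is stored only once
                self._nodes[v][0].setdefault(u, []).append(link_id)
        return link_id

    def out_neighbors(self, u):
        """Nodes reachable from u in one step, by an arc or by an edge."""
        return set(self._nodes[u][2]) | set(self._nodes[u][0])


net = TinyNet(title="toy", directed=True)
net.add_link("A", "B", w=2)
net.add_link("A", "B", w=5)                  # a parallel arc gets its own id
net.add_link("B", "C", directed=False, w=1)  # an edge
print(net._nodes["A"][2])  # {'B': [1, 2]}
print(net._links[3])       # ['B', 'C', False, None, {'w': 1}]
```

Because each star stores lists of link ids rather than single values, parallel links between the same pair of nodes are kept apart.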
Python Packages

Python packages for network analysis fall into two groups:

Pure Python packages: easy to install, platform independent, less efficient; in reasonable time they can deal with networks with up to some millions of nodes

Compiled libraries with a Python interface: sometimes difficult to install, very efficient, able to deal with very large networks
Support for network analysis is also available in SageMath (http://www.sagemath.org/; Joyner et al. 2013), a Python-based, free, open-source mathematics software system licensed under the GPL.
NetworkX
NetworkX (https://networkx.github.io/) is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. It is based on the “dictionary of dictionary” data structure and is written in pure Python. It was created by Aric Hagberg, Dan Schult, and Pieter Swart in 2002 and 2003 and released in April 2005. Its background is described in Hagberg et al. (2008).
NetworkX is the most popular Python package for network analysis. It contains many network analysis algorithms and is very well documented. Some books (Caldarelli and Chessa 2016; Al-Taie and Kadry 2017; Fouss et al. 2016; Tsvetovat and Kouznetsov 2011) are based on it. It is the de facto standard for the analysis in Python of small to medium-size networks (up to some millions of nodes).
NetworkX is a cross-platform (Linux/Unix, Mac OS X, Windows) package and runs with Python 2.7/3.4 or later.
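As a brief illustration, the example graph from the section on representations can be built and queried with a few long-standing NetworkX calls. The sketch assumes that the networkx package is installed; the label attribute is invented for the example.

```python
import networkx as nx

# Directed graph with the arcs of the earlier example.
G = nx.DiGraph()
G.add_edges_from([
    ("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"),
    ("C", "D"), ("D", "C"), ("E", "F"), ("F", "C"),
])
G.add_node("A", label="start")  # node properties are passed as keyword attributes

print(G.number_of_nodes(), G.number_of_edges())  # 6 8
print(dict(G.out_degree())["B"])                 # 2
print(nx.has_path(G, "A", "E"))                  # False

# A shortest directed path from A to D has two arcs; which one is returned may vary.
path = nx.shortest_path(G, "A", "D")
print(path)
```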
DeepGraph
The package DeepGraph was developed by Dominik Traxl and made public in 2016 (https://github.com/deepgraph/deepgraph/). He describes it as follows: DeepGraph is a scalable, general-purpose data analysis package. It implements a network representation based on Pandas DataFrames: the nodes and edges are each represented by a DataFrame. It provides methods to construct, partition, and plot graphs, to interface with popular network packages, and more. Since it provides interfacing methods to NetworkX, scipy sparse matrices, and graph-tool, a user can easily exploit the different advantages of these packages. Its theoretical background was published in Traxl et al. (2016).
DeepGraph is a cross-platform (Linux/Unix, Mac OS X, Windows) package and runs with Python 2.7/3.4 or later.
Zen
Zen (https://github.com/networkdynamics/zenlib, http://zen.networkdynamics.org/), developed in 2012 by Derek Ruths, is a library that provides a high-speed, easy-to-use API for loading, analyzing, visualizing, and manipulating networks in Python. By using a hybrid of Python and Cython code, it combines the speed and low memory overhead of C with the ease of use of Python. According to its authors, many operations in Zen are over 100 times faster than the identical operation in NetworkX.
Zen requires Python 2.7 or later (but not Python 3). In principle it should be cross-platform, but it is not easy to install.
igraph
igraph (http://igraph.org) is a network analysis library written in C and designed for extremely large networks. It is a collection of network analysis tools with the emphasis on efficiency, portability, and ease of use. It can be programmed in R, Python, and C/C++. The igraph/R package is very popular. igraph was developed by Gábor Csárdi and Tamás Nepusz (2006) and first released in 2006.
igraph/Python is a cross-platform (Linux/Unix, Mac OS X, Windows) package and runs with Python 2.7/3.4 or later. Installing igraph/Python is relatively simple (Gohlke 2011). Basic documentation is provided.
graph-tool
graph-tool (https://graph-tool.skewed.de/) is an efficient Python module for manipulation and statistical analysis of networks. The core data structures and algorithms are implemented in C++, based heavily on the Boost Graph Library (BGL, http://www.boost.org/libs/graph/doc/index.html), and can be used to work with very large networks. Many algorithms are implemented in parallel using OpenMP, which provides excellent performance on multicore architectures. It was developed by Tiago de Paula Peixoto and released in 2006.
The graph-tool package runs on Linux/Unix and Mac OS X with Python 2.7/3.4 or later. It is quite complicated to install.
NetworKit
NetworKit (https://networkit.iti.kit.edu/) is an open-source toolkit for large-scale (up to billions of links) network analysis. It is a Python package, with performance-critical algorithms implemented in C++/OpenMP. NetworKit is maintained by the Research Group Parallel Computing of the Institute of Theoretical Informatics at Karlsruhe Institute of Technology (KIT). It started as a collection of community detection algorithms developed in C++ by Henning Meyerhenke and Christian L. Staudt. It was first released in March 2013. Its background is described in Staudt et al. (2016).
NetworKit is comparable to packages such as NetworkX, albeit with a focus on parallelism and scalability. It is a hybrid combining the performance of kernels written in C++ with a convenient Python frontend.
NetworKit is a cross-platform (Linux/Unix, Mac OS X, Windows) package and runs with Python 3.3 or later. Some installation problems have been reported. Basic documentation is provided.
Snap.py
The Stanford Network Analysis Platform (SNAP) (Leskovec and Sosič 2016) is a general-purpose, high-performance system for analysis and manipulation of large networks. It is written in C++ and optimized for maximum performance and compact graph representation. It easily scales to massive networks with hundreds of millions of nodes and billions of edges. SNAP was originally developed by Jure Leskovec in the course of his PhD studies. The first release was made available in November 2009. Snap.py (https://snap.stanford.edu/snappy/) is a Python interface for SNAP. It provides the performance benefits of SNAP, combined with the flexibility of Python.
Installation packages for Mac OS X, Linux (CentOS), and 64-bit Windows are available. Snap.py requires Python 2.7 or later (but not Python 3). Basic documentation is provided.
Tulip Python
Tulip was originally developed in 2001 by David Auber at LaBRI, University of Bordeaux. On its website (http://tulip.labri.fr/Documentation/current/tulip-python/html/), we find the following description: Tulip is an information visualization framework written in C++ dedicated to the analysis and visualization of graphs. Tulip Python is a set of modules that exposes to Python almost all the content of the Tulip C++ API. The main features provided by the bindings are creation and manipulation of graphs, storage of data on graph elements (float, integer, Boolean, color, size, coordinate, list, etc.), application of algorithms of different types on graphs (layout, metric, clustering, etc.), and the ability to write Tulip plugins in pure Python. The bindings can be used inside the Tulip software GUI in order to run scripts on the current visualized graph. Starting from Tulip 3.6, the bindings can also be used outside Tulip through the classical Python interpreter.
For details about Tulip, see the essay Tulip 5.
Tulip Python is a cross-platform (Linux/Unix, Mac OS X, Windows) package and runs with Python 3.3 or later. Basic documentation is provided.
Cross-References
Notes
Acknowledgments
The work was partially supported by the Slovenian Research Agency (ARRS) projects J7-8279 and J1-6720 and grant P1-0294.
References
 Al-Taie MZ, Kadry S (2017) Python for graph and network analysis. Springer, Cham
 Batagelj V, Praprotnik S (2016) An algebraic approach to temporal network analysis based on temporal quantities. Soc Netw Anal Min 6(1):1–22
 Berge C (1958) Théorie des graphes et ses applications. Dunod, Paris. The theory of graphs. Courier Co., 1962
 Caldarelli G, Chessa A (2016) Data science and complex networks: real cases studies with Python. Oxford University Press, Oxford
 Csárdi G, Nepusz T (2006) The igraph software package for complex network research. InterJournal Complex Systems, 1695
 Dahl OJ, Dijkstra EW, Hoare CAR (1972) Structured programming. Academic, London
 de Nooy W, Mrvar A, Batagelj V (2012) Exploratory network analysis using Pajek. Cambridge University Press, Cambridge
 Doreian P, Mrvar A (2009) Partitioning signed social networks. Soc Networks 31(1):1–11
 Doreian P, Batagelj V, Ferligoj A (2005) Generalized blockmodeling. Cambridge University Press, New York
 Ferligoj A, Batagelj V (1982) Clustering with relational constraint. Psychometrika 47(4):413–426
 Fouss F, Saerens M, Shimbo M (2016) Algorithms and models for network data and link analysis. Cambridge University Press, Cambridge, UK
 Gohlke C (2011) Unofficial windows binaries for Python extension packages. http://www.lfd.uci.edu/~gohlke/pythonlibs/
 Granovetter M (1973) The strength of weak ties. Am J Sociol 78(6):1360–1380
 Hagberg A, Schult D, Swart P (2008) Exploring network structure, dynamics, and function using NetworkX. In: Varoquaux G, Vaught T, Millman J (eds) Proceedings of the 7th Python in science conference (SciPy 2008), pp 11–15
 Harary F (1969) Graph theory. Addison-Wesley, Reading
 Hetland ML (2010) Python algorithms: mastering basic algorithms in the Python language. Apress, New York
 Joyner D, Nguyen MV, Phillips D (2013) Algorithmic graph theory and Sage. https://code.google.com/archive/p/graphbook/
 Leskovec J, Sosič R (2016) SNAP: a general purpose network analysis and graph mining library. ACM Trans Intell Syst Technol 8(1):1. https://dl.acm.org/citation.cfm?id=2898361
 Staudt C, Sazonovs A, Meyerhenke H (2016) NetworKit: a tool suite for large-scale complex network analysis. Netw Sci 4(4):508–530
 Traxl D, Boers N, Kurths J (2016) Deep graphs – a general framework to represent and analyze heterogeneous complex systems across scales. Chaos 26(6):065303
 Tsvetovat M, Kouznetsov A (2011) Social network analysis for startups: finding connections on the social web. O’Reilly, Sebastopol
 van Rossum G (1998) Python patterns – implementing graphs. https://www.python.org/doc/essays/graphs/
 Zykov AA (1969) Teorija konechnyh grafov I. Nauka, Novosibirsk