Exponential random graph models for the Japanese bipartite network of banks and firms

  • Abhijit Chakraborty
  • Hazem Krichene
  • Hiroyasu Inoue
  • Yoshi Fujiwara
Research Article


We use exponential random graph models to understand the network structure and its generative process for the Japanese bipartite network of banks and firms. The Bernoulli model, one of the simplest and best-known exponential random graph models, shows that the links in the bank–firm network are not independent of each other. Another popular exponential random graph model, the two-star model, indicates that the bank–firm network is in a state where the macroscopic variables of the system can show large fluctuations. The presence of such large fluctuations reflects the fragile nature of the bank–firm network.


Exponential random graph; Bipartite network; Bernoulli model; Two-star model


Models of networks are useful in studying their structural properties as well as their dynamical behaviours. The approaches to constructing models of networks can be classified into two broad categories by analogy with the theories of gases in statistical physics [1]. The two approaches are known as the kinetic theory approach and the ensemble approach. In the kinetic theory approach, one considers possible mechanisms that replicate some structural properties of the real-world network. For example, the well-known Barabási-Albert model [2] uses a preferential attachment mechanism to construct a growing network with a fat-tailed degree distribution. These models are easy to understand and give a qualitative understanding of the network, but are limited in making quantitatively accurate predictions. Thus, these models do not provide an overall understanding of the network; rather, they mimic only a few of its features.

The other class of models, the ensemble models, are based on rigorous probabilistic arguments with a solid statistical foundation, useful for accurate predictions and quantitative study of the network. These models are based on the concept of statistical ensemble implying a large collection of all possible realizations of the network at particular values of the macroscopic observables. A particular graph in the ensemble of networks appears with a probability \({P(G)} \propto {\exp [H(G)]}\), where H(G) is known as the network Hamiltonian. As the probability is an exponential function of the network Hamiltonian, these models are popularly known as “exponential random graph (ERG) models”.

The ERG model was first introduced by Holland and Leinhardt [3], based on the framework laid by Besag [4]. Since the introduction of the ERG models, a variety of network Hamiltonians have been studied, including the random network model [1], the reciprocity model of directed networks [1], the two-star model [5, 6], and the Strauss model of networks with clustering [7, 8]. Far more complex Hamiltonians that include endogenous as well as exogenous observables of the network have also been studied in the social network literature [9, 10, 11]. Moreover, there are many tools, such as the ERGM [12] and SIENA [13] packages, for fitting ERG models to social data. The problem with complex non-linear Hamiltonians is that they cannot be solved exactly; only linear Hamiltonian models can be solved exactly in the large system size limit. A non-linear Hamiltonian can be solved approximately, either using mean field theory or perturbation theory, or by numerical simulation.

The ERG model has been studied extensively for monopartite networks, but only a few studies consider bipartite networks [14]. In this paper, our focus is on the Japanese bipartite network of banks and firms. We model the bipartite network using exponential random graph theory. We study the well-known Bernoulli model and the two-star model to gain a deeper understanding of the structure of the Japanese bipartite network of banks and firms.


We use the Nikkei data set for the banks–firms lending–borrowing links in Japan. Lending data are available only for the listed firms and are restricted in our work to the long-term loans during 2005. Each node in this bipartite network (firms and banks) has its financial statements. However, only listed banks have available financial statements. Therefore, we consider the unweighted and undirected simple bipartite network for the long-term lending–borrowing links between listed firms and listed banks during 2005. The network is formed by \(M = 127\) banks, \(N = 2198\) firms and \(L = 11,842\) unweighted long-term links.


Exponential random graph model

The ERG model is a tie-based statistical model for understanding how network topology emerges by estimating how ties are patterned (see [9]). Let \(X = [x_{ij}]\) be the adjacency matrix of an unweighted bipartite network. The ERG model can be viewed as a regression of X on a set of endogenous statistics \(z_a\) and exogenous statistics \(z_e\). The \(z_a\) count network configurations, for example, the number of edges or the number of stars. The \(z_e\) are counts that involve node attributes, for example, in the case of the bank–firm network, the number of links weighted by the profit of the firm or the bank. The canonical form of the ERG model is given by the following:
$${\text{Pr}}_{\Theta } (X = x) = \frac{1}{{\kappa (\Theta )}}\exp \left( {\sum\limits_{a} {\theta _{a} {\,\cdot\,} z_{a} (x)} + \sum\limits_{e} {\theta _{e} {\,\cdot\,} z_{e} (x)} } \right).$$
\(\kappa\) is a normalizing constant that ensures a proper probability distribution. It is obtained by summing over all possible network realizations, as follows:
$$\kappa (\Theta ) \equiv \sum\limits_{{y \in X}} {\exp } \left( {\sum\limits_{a} {\theta _{a} \, \cdot \,z_{a} (y) + \sum\limits_{e} {\theta _{e} \, \cdot \,z_{e} (y)} } } \right).$$

Markov chain Monte Carlo (MCMC) sampling algorithm

Let \(x_\mathrm{{obs}}\) be the observed graph. We would like to solve the moment equation \({\mathrm {E}}_{\theta }(z(X)) - z(x_{\mathrm{{obs}}}) = 0\), where X represents the networks sampled with MCMC.

MCMC sampling is used to estimate network statistics \({\mathrm {E}}_{\theta }(z(X))\). The most commonly used MCMC sampler is the Metropolis–Hastings algorithm, which was introduced in [15].

The MCMC sampler consists of randomly selecting one dyad, either a null dyad (\(x_{ij} = 0\)) or a nonnull dyad (\(x_{ij} = 1\)). Then, with the Hastings probability \({\mathrm {P}}(x \rightarrow x')\) (see footnote 1), the state of the dyad is changed (a link is added for a null dyad or deleted for a nonnull dyad). The Hastings probability is given by the following:
$$\begin{aligned} {\mathrm {P}}(x \rightarrow x') = {\mathrm {min}}\left\{ 1, \frac{\mathrm {Pr'}_{ \Theta }(X = x')}{{\mathrm {Pr'}}_{ \Theta }(X = x)}\right\} . \end{aligned}$$
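The toggle move above can be sketched in a few lines of Python. This is a minimal illustration rather than the sampler used for the results in this paper: it assumes a symmetric proposal (one uniformly chosen dyad), so the Hastings ratio reduces to \(\exp (\theta \cdot \Delta z)\), and it uses the link and two-star counts as example statistics. The function names, the toy size and the parameter values are hypothetical.

```python
import math
import random

def stats(A):
    """Return (links, bank two-stars, firm two-stars) for an N x M 0/1 matrix A.

    Rows index firms and columns index banks (an assumed convention).
    """
    firm_deg = [sum(row) for row in A]
    bank_deg = [sum(col) for col in zip(*A)]
    z2sb = sum(k * (k - 1) // 2 for k in bank_deg)
    z2sf = sum(k * (k - 1) // 2 for k in firm_deg)
    return sum(firm_deg), z2sb, z2sf

def mh_toggle(A, firm_deg, bank_deg, theta, rng):
    """One Metropolis-Hastings step: propose toggling one uniformly chosen dyad."""
    i, j = rng.randrange(len(A)), rng.randrange(len(A[0]))
    sign = -1 if A[i][j] else 1
    # Degrees of the two endpoints, not counting the dyad (i, j) itself.
    kb = bank_deg[j] - A[i][j]
    kf = firm_deg[i] - A[i][j]
    delta = (sign, sign * kb, sign * kf)      # change in (L, Z_2SB, Z_2SF)
    log_ratio = sum(t * d for t, d in zip(theta, delta))
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        A[i][j] ^= 1
        firm_deg[i] += sign
        bank_deg[j] += sign
        return delta
    return (0, 0, 0)

# Toy run: 8 firms, 5 banks, hypothetical parameter values.
rng = random.Random(0)
N, M = 8, 5
A = [[0] * M for _ in range(N)]
firm_deg, bank_deg = [0] * N, [0] * M
theta = (-1.0, 0.05, 0.05)
tracked = [0, 0, 0]
for _ in range(5000):
    d = mh_toggle(A, firm_deg, bank_deg, theta, rng)
    tracked = [t + x for t, x in zip(tracked, d)]
```

Only the change statistics of the toggled dyad are needed at each step, so the full statistics never have to be recomputed; `tracked` accumulates them incrementally and can be compared with `stats(A)` as a consistency check.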

Stochastic approximation: the Robbins–Monro algorithm

Snijders proposed a stochastic approximation based on the Robbins–Monro algorithm to obtain the maximum likelihood estimate (MLE) for the ERG model [16]. According to [9], this approach is robust and does not require any particular starting point. The stochastic approximation algorithm consists of three phases, described below.

Initialization phase

With the initial parameter \(\tilde{\theta }\), this phase determines the scaling matrix \(D_0\). Let \(z_{\tilde{\theta }}(x_1), z_{\tilde{\theta }}(x_2),...,z_{\tilde{\theta }}(x_{M_i})\) be the statistics of the networks \(x_1, x_2,...,x_{M_i}\) generated with the MCMC sampler at \(\tilde{\theta }\). Let \({\mathrm {E}}_{\tilde{\theta }}\) be the mean vector of these network statistics, and let D be their covariance matrix. The scaling matrix is defined as \(D_0 = {\mathrm {diag}}(D)\), and \(\theta\) is initialized for the second phase as follows: \(\theta _0 = \tilde{\theta } - a {\,\cdot\,} D_0^{-1} {\,\cdot\,} ({\mathrm {E}}_{\tilde{\theta }} - z(x_{\mathrm{{obs}}}))\). Here a is the gain factor, which controls the size of the updating steps (\(a = 0.1\) at initialization).

Optimization phase

This phase solves the moment equation \({\mathrm {E}}_{\theta }(z(X)) - z(x_\mathrm{{obs}}) = 0\) with a scheme based on Newton–Raphson minimization. The parameter \(\theta\) is updated over several sub-phases, and the gain factor \(a_r\) is reduced after each sub-phase r.

Each sub-phase r contains m simulation steps. At simulation step \(m+1\), a network is sampled based on the MCMC sampler with \(\theta _m\). The update process is defined as follows:
$$\begin{aligned} \theta _{m+1} = \theta _m - a_r {\,\cdot\,} D_0^{-1} {\,\cdot\,} (z(x_{MCMC}) - z(x_\mathrm{{obs}})). \end{aligned}$$
At the end of sub-phase r, the gain factor is updated, \(a_{r+1} = a_r/2\). This optimization procedure is iterated until convergence occurs.

Convergence phase

We want to check whether the returned value \(\hat{\theta }\) from the optimization phase is close to the true MLE. Therefore, \(M_c\) networks are sampled based on the MCMC sampler with a value of \(\hat{\theta }\). The convergence condition is reached when

$$\begin{aligned} -0.1 \le \frac{{\mathrm {E}}_{\hat{\theta }} - z_\mathrm{{obs}}}{\mathrm{{SD}}_{\hat{\theta }}} \le 0.1 , \end{aligned}$$
where \(\mathrm{{SD}}_{\hat{\theta }}\) is the standard deviation of the statistics for the sampled networks.
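The three phases can be illustrated with a scalar toy problem. The following Python sketch is written under loudly stated assumptions: it uses only the link count as statistic (so \(D_0\) is a single variance), a hypothetical graph with 24 dyads and 6 observed links, and arbitrary choices for the number of toggles, draws and sub-phases. It is not the estimation code behind the results reported here.

```python
import math
import random
import statistics

def sample_links(A, count, theta, rng, toggles=60):
    """Advance the chain by `toggles` Metropolis steps at parameter theta.

    A is a flat 0/1 list of dyads; the updated link count is returned.
    """
    for _ in range(toggles):
        d = rng.randrange(len(A))
        delta = -1 if A[d] else 1
        if theta * delta >= 0 or rng.random() < math.exp(theta * delta):
            A[d] ^= 1
            count += delta
    return count

def robbins_monro(n_dyads, z_obs, theta_tilde=0.0, seed=1):
    rng = random.Random(seed)
    A, count = [0] * n_dyads, 0

    # Initialization phase: mean and variance of the statistic at theta_tilde.
    draws = []
    for _ in range(200):
        count = sample_links(A, count, theta_tilde, rng)
        draws.append(count)
    d0 = statistics.pvariance(draws)
    a = 0.1
    theta = theta_tilde - a * (statistics.fmean(draws) - z_obs) / d0

    # Optimization phase: sub-phases of updates, halving the gain after each.
    for _ in range(4):
        for _ in range(300):
            count = sample_links(A, count, theta, rng)
            theta -= a * (count - z_obs) / d0
        a /= 2

    # Convergence phase: t-ratio of the simulated statistic at the estimate.
    draws = []
    for _ in range(500):
        count = sample_links(A, count, theta, rng)
        draws.append(count)
    t_ratio = (statistics.fmean(draws) - z_obs) / statistics.pstdev(draws)
    return theta, t_ratio

theta_hat, t_ratio = robbins_monro(n_dyads=24, z_obs=6)
```

For this toy statistic the exact MLE is \(\ln [6/(24-6)]\), so `theta_hat` can be compared against it directly; `t_ratio` is the quantity that the convergence condition bounds by 0.1 in absolute value.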


Bernoulli model of a bipartite network

In the early 1950s, Solomonoff and Rapoport introduced the first well-known network model, the random graph or Bernoulli model [17], which was later famously studied by Erdős and Rényi [18]. This is the simplest network model, and its analytic solution for the monopartite network using the exponential random graph technique is shown in [1]. Here we extend the study to a bipartite network. In the Bernoulli model of a bipartite network, links are formed independently of each other and the expected number of links \(\langle E \rangle\) is the only known observable. The Hamiltonian for this model can be written as \(H(G)=\theta E(G)\), where \(\theta\) is the parameter associated with the number of links; it can be thought of as an inverse temperature, by analogy with equilibrium statistical mechanics. Using the above expression of the network Hamiltonian, the probability that the graph \(\mathcal G\) is in state G can be written as
$$\begin{aligned} P( \mathcal{G} = G) = \frac{e^{\theta E(G)}}{Z} \end{aligned}$$
where the normalization constant \(Z = \sum _\mathcal{G} e^{\theta E(G)}\) is known as the partition function.
Fig. 1

The variation of connectance p of the Bernoulli model is plotted as a function of \(\theta\) for the Japanese bipartite network of banks and firms. The solid line represents the exact solution and red circles are the Monte Carlo simulation results. The filled red circle indicates the simulation result for the observed snapshot of the Japanese bipartite network of banks and firms

A bipartite network consisting of two distinct node sets \(\mathcal {N, M}\) can be represented by a rectangular adjacency matrix with the elements \(A_{ij} = 1 \quad \{1 \le i \le N; 1 \le j \le M\}\) if and only if the ith node of one node set is connected to the jth node of the other set and \(A_{ij} = 0\) otherwise. The total number of links of the bipartite network is \(E(G) = \sum _{i=1}^{N} \sum _{j=1}^{M} A_{ij}\).

Now, we can calculate the partition function as follows:
$$\begin{aligned} Z = \sum \limits _{\{A_{ij}\}} e^{\theta \sum \limits _{i=1}^{N} \sum \limits _{j=1}^{M} A_{ij}} = \prod \limits _{i=1}^{N} \prod \limits _{j=1}^{M} \sum \limits _{A_{ij}=0}^{1} e^{\theta A_{ij}} = \prod \limits _{i=1}^{N} \prod \limits _{j=1}^{M} (1+e^\theta ) = (1+e^\theta )^{NM} \end{aligned}$$
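The factorization of the partition function can be checked by brute force on a very small graph. The short Python sketch below enumerates all \(2^{NM}\) bipartite graphs for hypothetical toy values of N, M and \(\theta\) and compares the sum with the closed form \((1+e^\theta )^{NM}\); the function name is illustrative only.

```python
import itertools
import math

def partition_brute_force(n, m, theta):
    """Sum exp(theta * E(G)) over all 2**(n*m) bipartite graphs."""
    total = 0.0
    for dyads in itertools.product((0, 1), repeat=n * m):
        total += math.exp(theta * sum(dyads))
    return total

n, m, theta = 3, 2, -0.7          # toy values, small enough to enumerate
z_enumerated = partition_brute_force(n, m, theta)
z_closed_form = (1.0 + math.exp(theta)) ** (n * m)
```

The enumeration grows as \(2^{NM}\), so it is only feasible for a handful of dyads; it serves purely as a sanity check of the closed form.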
From the partition function, we can calculate all the network observables:

The free energy of the network \(F=\ln {Z} = NM \ln (1 + e^\theta )\)

The expected number of edges
$$\begin{aligned} \langle E \rangle = \frac{\partial F}{\partial \theta } = NM\frac{e^{\theta }}{(1+e^{\theta })} \end{aligned}$$
This gives
$$\begin{aligned} \theta =\ln \left[ \frac{\langle E \rangle }{(NM - \langle E \rangle )}\right] \end{aligned}$$
Fig. 2

The degree distributions P(k) are plotted against degree k for (a) banks and (b) firms. Empirical, simulated and analytic results are indicated with different legends

Figure 1 shows the connectance \(p=\langle E \rangle /MN\) as a function of \(\theta\) for the model. The results indicate an excellent match between the analytic solution and the simulation results. Simulation results are obtained using the Markov chain Monte Carlo method as explained in Sect. 3. The data points are averaged over 1000 independent runs. The maximum standard deviation in the data points is found to be \(\sigma =0.001\). For the Japanese bipartite network of banks and firms, the analytic result gives \(\theta = -3.1167\) and our simulation estimates \(\theta = -3.1166 \pm 0.0015\); the negative value reflects the sparse nature of the network (\(p < 1/2\)).
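The analytic value of \(\theta\) follows directly from the network size reported in the data description. The snippet below is a minimal check in Python using \(M = 127\), \(N = 2198\) and \(L = 11,842\); the variable names are illustrative. Since the connectance is well below 1/2, the logarithm is negative.

```python
import math

M_BANKS, N_FIRMS, LINKS = 127, 2198, 11842

dyads = N_FIRMS * M_BANKS                      # number of possible links, NM
p = LINKS / dyads                              # connectance
theta = math.log(LINKS / (dyads - LINKS))      # theta = ln[<E> / (NM - <E>)]

# Round trip: plugging theta back into <E> = NM e^theta / (1 + e^theta).
expected_links = dyads * math.exp(theta) / (1.0 + math.exp(theta))
print(f"p = {p:.4f}, theta = {theta:.4f}")
```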

The degree distribution P(k), i.e., the fraction of nodes with degree k, has a binomial form in this model. For the bank–firm network, the degree distribution of the banks can be written as \(P_\mathrm{{bank}}(k) = \left( {\begin{array}{c}N\\ k\end{array}}\right) p^k(1-p)^{(N-k)}\) and for firms \(P_\mathrm{{firm}}(k) = \left( {\begin{array}{c}M\\ k\end{array}}\right) p^k(1-p)^{(M-k)}\). As can be seen from Fig. 2, the degree distribution of the model does not fit the empirical distribution, which has a much broader shape for both banks and firms. We conclude that the Bernoulli model is a poor model for the Japanese bipartite network of banks and firms.
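The binomial predictions can be evaluated numerically as follows. This Python sketch assumes the network size given in the data description and evaluates the probability mass function in log space, because the binomial coefficient for \(N = 2198\) overflows double precision when computed directly; the helper name `binomial_pmf` is hypothetical.

```python
import math

def binomial_pmf(k, n, p):
    """Binomial probability, computed via log-gamma to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

N_FIRMS, M_BANKS, LINKS = 2198, 127, 11842
p = LINKS / (N_FIRMS * M_BANKS)

# A bank can lend to up to N firms; a firm can borrow from up to M banks.
bank_pmf = [binomial_pmf(k, N_FIRMS, p) for k in range(N_FIRMS + 1)]
firm_pmf = [binomial_pmf(k, M_BANKS, p) for k in range(M_BANKS + 1)]

bank_mean = sum(k * q for k, q in enumerate(bank_pmf))   # equals N p
firm_mean = sum(k * q for k, q in enumerate(firm_pmf))   # equals M p
bank_sd = math.sqrt(N_FIRMS * p * (1.0 - p))
```

The model mean degrees equal the empirical ones by construction (\(L/M\) for banks and \(L/N\) for firms), so any mismatch with the data shows up in the width and shape of the distributions rather than in their means.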

Two-star model of a bipartite network

The two-star model is an ERG model in which the expected values of the total number of links and the total number of two-stars (i.e., paths of length 2) are fixed. The Hamiltonian for the model of a bipartite network can be written as
$$\begin{aligned} H (x)= \theta _L Z_L(x) + \theta _{2SB} Z_{2SB}(x) + \theta _{2SF} Z_{2SF}(x) \end{aligned}$$
where the total number of links
$$\begin{aligned} Z_L(x) = \sum _{i=1}^{N} \sum _{j=1}^{M} A_{ij} \end{aligned}$$
the total number of bank two-stars
$$\begin{aligned} Z_{2SB} = \frac{1}{2}\sum _{i=1}^{M} \sum _{j,k=1}^{N} (1-\delta _{jk})A_{ij}A_{ik} = \frac{1}{2}\sum _{i=1}^{M} \sum _{j,k=1}^{N} A_{ij}A_{ik}-\frac{1}{2}\sum _{i=1}^{M} \sum _{j=1}^{N} A_{ij} \end{aligned}$$
and the total number of firm two-stars
$$\begin{aligned} Z_{2SF} = \frac{1}{2}\sum _{i=1}^{N} \sum _{j,k=1}^{M} (1-\delta _{jk})A_{ij}A_{ik} = \frac{1}{2}\sum _{i=1}^{N} \sum _{j,k=1}^{M} A_{ij}A_{ik}-\frac{1}{2}\sum _{i=1}^{N} \sum _{j=1}^{M} A_{ij} \end{aligned}$$
The \(\theta\)'s are the parameters associated with the network observables.
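The two expressions for the two-star counts can be verified on a small random matrix. The Python sketch below assumes that rows index firms and columns index banks (a convention chosen here for illustration) and compares the double sum with the \((1-\delta _{jk})\) factor against the equivalent count from degrees, where a node of degree d is the centre of \(d(d-1)/2\) two-stars. The function names and the toy size are hypothetical.

```python
import random

def bank_two_stars_double_sum(A):
    """Z_2SB from the sum over ordered pairs of distinct firms j != k."""
    n, m = len(A), len(A[0])
    total = 0
    for b in range(m):                  # bank at the centre of the two-star
        for j in range(n):
            for k in range(n):
                if j != k:              # the (1 - delta_jk) factor
                    total += A[j][b] * A[k][b]
    return total // 2                   # each unordered pair was counted twice

def bank_two_stars_from_degrees(A):
    """Same count via bank degrees: d(d-1)/2 two-stars per bank of degree d."""
    return sum(d * (d - 1) // 2 for d in (sum(col) for col in zip(*A)))

rng = random.Random(7)
A = [[rng.randint(0, 1) for _ in range(4)] for _ in range(9)]  # 9 firms, 4 banks
A_T = [list(col) for col in zip(*A)]    # transposing swaps the roles of banks and firms

z_2sb = bank_two_stars_from_degrees(A)
z_2sf = bank_two_stars_from_degrees(A_T)
```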
The Hamiltonian H can be written in terms of the adjacency matrix as follows:
$$\begin{aligned} H = \frac{1}{2}\sum _{i=1}^{N} \sum _{j=1}^{M} A_{ij} \left(2\theta _L-\theta _{2SB}- \theta _{2SF}+ \theta _{2SB} \sum \limits _{k=1}^N A_{ik} + \theta _{2SF}\sum \limits _{k=1}^M A_{ik}\right) \end{aligned}$$
Using the mean field technique of statistical physics, we replace each \(A_{ik}\) inside the brackets by its average, the connection probability \(p = \langle A_{ik} \rangle\) between any two nodes, thereby ignoring local fluctuations:
$$\begin{aligned} H&= \frac{1}{2}\sum _{i=1}^{N} \sum _{j=1}^{M} A_{ij}\left( 2\theta _L-\theta _{2SB}- \theta _{2SF} + \theta _{2SB} \sum \limits _{k=1}^N \langle A_{ik} \rangle +\theta _{2SF}\sum \limits _{k=1}^M \langle A_{ik} \rangle \right) \\&= \frac{1}{2}\sum _{i=1}^{N} \sum _{j=1}^{M} A_{ij}(2\theta _L-\theta _{2SB}- \theta _{2SF}+ \theta _{2SB} N p +\theta _{2SF} M p) \\&= \Theta \sum _{i=1}^{N} \sum _{j=1}^{M} A_{ij} \end{aligned}$$
where we define \(\Theta = \frac{1}{2} (2\theta _L-\theta _{2SB}- \theta _{2SF}+ \theta _{2SB} N p +\theta _{2SF} M p)\)

As the Hamiltonian becomes linear in \(A_{ij}\), we can easily calculate the partition function \(\kappa = [1+\exp ( \Theta )]^{NM}\)

From the partition function, we can calculate other network observables:

Free energy \(F=\ln (\kappa )=NM \ln [1+\exp ( \Theta )]\)

Total expected number of links \(\langle Z_L \rangle = \frac{\partial F}{\partial \theta _L} = NM \frac{\exp ( \Theta )}{1+\exp ( \Theta )}\)

This gives
$$\begin{aligned} p&= \frac{\langle Z_L \rangle }{NM}\\&=\frac{\exp ( \Theta )}{1+\exp ( \Theta )}\\&= \frac{1}{2}[1+\tanh ( \Theta /2)]\\&=\frac{1}{2}[1+\tanh \{0.25(2\theta _L-\theta _{2SB}- \theta _{2SF}+ \theta _{2SB} N p +\theta _{2SF} M p)\}] \end{aligned}$$
For convenience, let us define \(B = 0.25(2\theta _L-\theta _{2SB}- \theta _{2SF})\) and \(2J = 0.25(\theta _{2SB} N +\theta _{2SF} M )\).
This gives
$$\begin{aligned} p=\frac{1}{2}[1+\tanh (B+2Jp)] \end{aligned}$$
The solution of this transcendental equation is well-known [1]. It has only one solution if \(J\le 1\), but if \(J > 1\), it may have either one solution or three solutions (where the outer two are stable solutions). It can be shown [19] that the three solutions appear when \(B_+(J)< B < B_-(J)\), where
$$\begin{aligned} B_{+}(J) = \frac{1}{2}\log \left[ \frac{\sqrt{J}+\sqrt{J-1}}{\sqrt{J}-\sqrt{J-1}}\right] -\frac{\sqrt{J}}{\sqrt{J}-\sqrt{J-1}} \end{aligned}$$
$$\begin{aligned} B_{-}(J) = \frac{1}{2}\log \left[ \frac{\sqrt{J}-\sqrt{J-1}}{\sqrt{J}+\sqrt{J-1}}\right] -\frac{\sqrt{J}}{\sqrt{J}+\sqrt{J-1}} \end{aligned}$$
Fig. 3

The phase diagram for the two-star model. The red circle indicates the position for the Japanese bipartite network of banks and firms for the year 2005

Table 1

Estimated values of the coupling parameters of the two-star model for the Japanese bipartite network for the year 2005


| Parameter | Estimated value | Standard deviation |
|---|---|---|
| \(\theta _L\) |  | \(4.040 \times 10^{-3}\) |
| \(\theta _{2SF}\) | \(6.307 \times 10^{-2}\) | \(3.445\times 10^{-4}\) |
| \(\theta _{2SB}\) | \(3.334 \times 10^{-3}\) | \(1.328 \times 10^{-6}\) |

We show the phase diagram of the model in the B–J plane in Fig. 3. It has three distinct regions: a high-density phase, a low-density phase and a coexistence phase. Our estimates of the parameters are given in Table 1. The corresponding values are \(B = -2.004\) and \(J=1.917\), which satisfy \(B_+(J)< B < B_-(J)\). As can be seen from Fig. 3, at these parameter values the system has two coexisting phases and can therefore show large fluctuations. We conclude that the Japanese bipartite network of banks and firms is close to the transition point, which indicates a fragile nature of the system.
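The location of the network in the phase diagram can be checked numerically from the expressions for \(B_{\pm }(J)\). The Python sketch below computes J from the two-star estimates in Table 1, evaluates the coexistence window, and counts the solutions of \(p=\frac{1}{2}[1+\tanh (B+2Jp)]\). B is taken as negative here, since with these formulas the window \(B_+(J)< B < B_-(J)\) lies entirely at negative B; the function names and the grid resolution are illustrative choices.

```python
import math

def b_plus(j):
    s, s1 = math.sqrt(j), math.sqrt(j - 1.0)
    return 0.5 * math.log((s + s1) / (s - s1)) - s / (s - s1)

def b_minus(j):
    s, s1 = math.sqrt(j), math.sqrt(j - 1.0)
    return 0.5 * math.log((s - s1) / (s + s1)) - s / (s + s1)

def fixed_points(b, j, grid=20000):
    """Roots of p = (1 + tanh(b + 2 j p)) / 2 on [0, 1].

    Roots are bracketed by sign changes on a grid and refined by bisection.
    """
    f = lambda p: 0.5 * (1.0 + math.tanh(b + 2.0 * j * p)) - p
    roots = []
    for i in range(grid):
        lo, hi = i / grid, (i + 1) / grid
        if f(lo) * f(hi) < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

N_FIRMS, M_BANKS = 2198, 127
THETA_2SB, THETA_2SF = 3.334e-3, 6.307e-2           # Table 1
J = 0.125 * (THETA_2SB * N_FIRMS + THETA_2SF * M_BANKS)
B = -2.004                                          # negative, see lead-in
window = (b_plus(J), b_minus(J))
roots = fixed_points(B, J)
print(f"J = {J:.3f}, window = ({window[0]:.3f}, {window[1]:.3f}), roots = {len(roots)}")
```

Three roots correspond to the coexistence region: the outer two are the stable low-density and high-density solutions, and the middle one is unstable.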
Fig. 4

The hysteresis plot for the two-star model of the bank–firm network. The black curve indicates the variation of connectance p when \(\theta _L\) increases from low to high and red curve indicates when \(\theta _L\) decreases from high to low. The values of \(\theta _{2SF}\) and \(\theta _{2SB}\) are kept constant as in Table 1. The error bars indicate standard deviation in p

This model exhibits hysteresis behaviour, as shown in Fig. 4. The finite area within the loop is a signature of a discontinuous (i.e., first-order) transition. A first-order transition is dangerous for an economic network: it indicates that the network can collapse suddenly under a slight change in the parameter values.
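A hysteresis loop of this kind already appears at the mean-field level. The Python sketch below is only an illustration under stated assumptions, not the Monte Carlo simulation behind Fig. 4: it iterates the self-consistency map \(p \leftarrow \frac{1}{2}[1+\tanh (B+2Jp)]\) while sweeping B (which is linear in \(\theta _L\)) upwards and then downwards at fixed \(J = 1.917\), always starting from the previous state. The sweep range, step and iteration count are arbitrary choices.

```python
import math

def relax(p, b, j, sweeps=500):
    """Iterate p <- (1 + tanh(b + 2 j p)) / 2 starting from the previous state p."""
    for _ in range(sweeps):
        p = 0.5 * (1.0 + math.tanh(b + 2.0 * j * p))
    return p

J = 1.917
b_grid = [-3.0 + 0.05 * i for i in range(41)]    # B from -3 to -1

p, up = 0.0, []
for b in b_grid:                  # increasing B, starting from the empty network
    p = relax(p, b, J)
    up.append(p)

down = []
for b in reversed(b_grid):        # decreasing B, starting from the dense state
    p = relax(p, b, J)
    down.append(p)
down.reverse()                    # align with b_grid

# Values of B at which the two sweeps sit on different branches.
loop = [b for b, lo, hi in zip(b_grid, up, down) if hi - lo > 0.5]
```

Inside the loop the upward sweep stays on the low-density branch and the downward sweep on the high-density branch, so the state reached depends on the history of the parameter and not only on its current value.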
Fig. 5

The degree distributions P(k) are plotted against degree k for (a) banks and (b) firms. Empirical and simulated results are indicated with different legends

Figure 5 shows the degree distribution for the two-star model. The distribution has a bi-modal nature [19], although the second peak near \(k=N\) for the degree distribution of banks is very small. This model also cannot replicate the empirical nature of the degree distribution. In the future, we will consider more complex Hamiltonians that include endogenous as well as exogenous parameters to describe the system in a much better way.


We have studied the Japanese bipartite network of banks and firms using the Bernoulli model and the two-star model. The Bernoulli model assumes that links are formed between banks and firms independently. However, this model does not fit the empirical network structure well, which indicates that a relationship is present between the network structure and some hidden variables. As a first approximation, we consider the two-star model, which assumes that adjacent links play a role in link formation. This model indicates that the Japanese bipartite network of banks and firms has a fragile nature, although it cannot fully capture the empirical network structure.

In the future, we would like to consider more complex Hamiltonians as well as the temporal evolution of the system in the phase space. We believe such complex Hamiltonians will be useful to understand the network structure in detail.


  1. x and \(x'\) are network states at simulation steps t and t+1, respectively.


  1. Park, J., & Newman, M. E. (2004). Statistical mechanics of networks. Physical Review E, 70, 066117.
  2. Barabási, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509–512.
  3. Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs. Journal of the American Statistical Association, 76, 33–50.
  4. Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society. Series B (Methodological), 36(2), 192–236.
  5. Park, J., & Newman, M. E. (2004). Solution of the two-star model of a network. Physical Review E, 70, 066146.
  6. Annibale, A., & Courtney, O. T. (2015). The two-star model: exact solution in the sparse regime and condensation transition. Journal of Physics A: Mathematical and Theoretical, 48, 365001.
  7. Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.
  8. Park, J., & Newman, M. (2005). Solution for the properties of a clustered network. Physical Review E, 72, 026136.
  9. Lusher, D., Koskinen, J., & Robins, G. (2013). Exponential random graph models for social networks: Theory, methods, and applications. Cambridge: Cambridge University Press.
  10. Wong, L. H. H., Gygax, A. F., & Wang, P. (2015). Board interlocking network and the design of executive compensation packages. Social Networks, 41, 85–100.
  11. Simpson, S. L., Hayasaka, S., & Laurienti, P. J. (2011). Exponential random graph modeling for complex brain networks. PloS One, 6, e20039.
  12. Hunter, D. R., Handcock, M. S., Butts, C. T., Goodreau, S. M., & Morris, M. (2008). ergm: A package to fit, simulate and diagnose exponential-family models for networks. Journal of Statistical Software, 24(3), nihpa54860.
  13. Ripley, R. M., Snijders, T. A., & Preciado, P. (2011). Manual for SIENA version 4.0. Oxford: University of Oxford.
  14. Wang, P., Pattison, P., & Robins, G. (2013). Exponential random graph model specifications for bipartite networks: a dependence hierarchy. Social Networks, 35, 211–222.
  15. Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1), 97–109.
  16. Snijders, T. A. (2002). Markov chain Monte Carlo estimation of exponential random graph models. Journal of Social Structure, 3, 1–40.
  17. Solomonoff, R., & Rapoport, A. (1951). Connectivity of random nets. The Bulletin of Mathematical Biophysics, 13, 107–117.
  18. Erdős, P., & Rényi, A. (1960). On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences, 5, 17–60.
  19. Coolen, A. C., Annibale, A., & Roberts, E. (2017). Generating random networks and graphs. Oxford: Oxford University Press.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Graduate School of Simulation Studies, The University of Hyogo, Kobe, Japan