Introduction

Since the global financial crisis of 2008–2009, a large body of research on financial systemic risk has accumulated. One of the most distinctive features of this newly arising field is its interdisciplinary nature, with researchers having backgrounds in economics and finance, statistical physics, ecology, engineering, applied mathematics, and other fields [1,2,3]. The study of financial systemic risk has attracted such a diversity of disciplines because financial markets are complex systems, and the study of complex systems has traditionally been an interdisciplinary endeavor.

In the financial market there is a wide variety of participants, such as commercial banks, insurance companies, hedge funds, individual investors, and central banks. These participants interact with each other by buying and selling financial assets, creating complex webs of financial liabilities, cross-asset holdings, and correlations in asset returns. Financial systemic risk is, loosely speaking, the risk associated with a breakdown of the financial system. Looking at systemic risk from the point of view of complex systems means thinking of it as emerging from the interactions between the different players that operate in the financial market. Moreover, individual participants react to the aggregate dynamics of the market that they collectively create, so that to understand systemic risk, the feedback loop between individual and collective dynamics has to be accounted for. A key to understanding systemic risk is thus to uncover the mechanisms that lie behind this micro–macro feedback. Because many interactions that take place in financial markets can be represented as a network of financial linkages between institutions, a significant fraction of research on systemic risk has been devoted to the study of financial networks [4, 5], which is the focus of this short review.

In this article, we provide a review of recent studies on financial systemic risk. We focus on network models of financial markets that have been developed outside traditional economics, although there are also many studies of systemic risk in the economics literature whose approaches are typically based on game theory, finance, and macroeconomic modeling. An advantage of modeling the financial system as a complex network is that we can directly analyze the complex feedback between micro- and macroscopic phenomena without oversimplifying the structure of financial linkages. It is the structure of networks that plays an essential role in turning micro-level events into collective phenomena. Moreover, empirical financial networks rarely take a stylized form such as an Erdős–Rényi random graph or a star graph; rather, they exhibit more complex structures, such as multiplex, bipartite, core–periphery, and time-varying networks, depending on the property of the financial linkage concerned. Such complex yet realistic network structures cannot be treated in traditional economic models.

In section “Clearing algorithms”, we first explain basic clearing algorithms that are needed to compute the allocation of debtor’s assets among creditors. In normal times, a bilateral credit contract is settled when a debtor repays in full the amount of borrowed funds. However, if a debtor fails, then it is no longer straightforward to know how to consistently allocate the debtor’s remaining assets across its creditors. It may seem reasonable to allocate the remaining assets proportionally to the amount of funds lent, but the losses that the creditors incur may also cause their defaults. If such contagious defaults happen, the remaining assets of the failed creditors should also be allocated proportionally to their creditors, which might in turn lead to a collapse of the creditors’ creditors, and so on. Therefore, in the presence of contagious default cascades, calculating the final allocation of funds is essentially equivalent to computing a fixed point of an iterative map. We provide a brief introduction to a widely known clearing algorithm, the Eisenberg–Noe algorithm [6].

The allocation of funds achieved by the Eisenberg–Noe clearing algorithm is economically sensible in the sense that the amounts of funds paid back to the creditors are endogenously determined in proportion to the amounts of funds lent. In the real world, however, such an ideal allocation may not be feasible, especially in financial turmoil, because evaluating the assets of failed banks and negotiating among creditors takes a long time. Therefore, many studies consider a fixed exogenous recovery rate (i.e., creditors receive a fixed percentage of their payment), which is often set to zero to analyze worst-case scenarios. An advantage of imposing the zero-recovery assumption is that it allows us to analyze financial contagion using models of social contagion developed in network science. Pioneers of this approach are Gai and Kapadia [7], who exploited the social cascade model of Watts [8]. We review some of the models of this type of default cascade in section “Cascades of bank defaults due to bilateral interbank exposures”.

Propagation of distress between borrowers and creditors can also occur before the borrower’s default, because of credit quality deterioration. In section “Distress propagation due to credit quality deterioration”, we review a model, the so-called DebtRank [9], which has been introduced to account for this situation, together with some of its extensions.

Stress does not only propagate between borrowers and lenders. In fact, losses can also propagate between investors having common assets. For example, the devaluation of an asset that is commonly held by many banks would simultaneously undermine the balance sheets of the banks holding the asset. If the loss of a bank is so severe that the bank is unable to meet the minimum requirement for its capital ratio, then the bank will have to sell some of its assets. This liquidation has a negative impact on the prices of the assets that are being sold, which causes losses to other banks holding these assets. In this way, a cascade of defaults could be triggered by the initial decline in the price of an asset, and fueled by the presence of overlapping portfolios among banks. Modeling the hidden interbank linkages formed through cross asset holdings is then important to understand systemic risk. Recent studies in this domain are introduced in section “Overlapping portfolios and price mediated contagion”.

In section “Empirical structure of interbank networks”, we summarize recent work on the structure of empirical interbank networks and its dynamics. Whether the initial default of a bank can trigger a large-scale default contagion depends largely on the way financial institutions are connected to each other. We introduce studies that examined the empirical structure of networks formed by interbank bilateral trades and explain the well-studied core–periphery structure and its possible limitations. The necessity to analyze daily network dynamics is also discussed. Section “Discussion” concludes.

Clearing algorithms

In the context of financial networks and systemic risk, clearing plays a central role, as the settlements of different transactions are entangled in a network of mutual commitments. Hence, clearing constitutes the structural basis leading to observable liquidity problems, failed payments, losses, and insolvencies. In full generality, financial transactions can be over-the-counter (OTC) contracts, where each pair of counterparties has to settle its own contract, or can be collected and managed by central clearing counterparties (CCPs)—which absorb all or some of the financial risks—or be mediated by a market via order-book dynamics.

Here, we introduce a well-known clearing model, the one proposed by Eisenberg and Noe [6], which led to a series of works on the problem of valuating systemic risk in financial networks. In a schematic representation, a debt contract between two counterparties is represented by an amount \(L_{AB}\) to be paid at time T by financial institution A to financial institution B. Though this contract is simple in nature, it is easy to realize the complexity that may arise by noting that, in the case of conflicting financial obligations, institution A might be unable, or even unwilling, to repay B in full at the prescribed time T. It is, therefore, necessary to consider the seniority of the debt, i.e., the priority of its repayment with respect to other obligations. Besides such deterministic quantities, the counterparty B is interested in knowing ex-ante—before the maturity of the contract—the probability of default (PD) of A, i.e., the probability of A not honoring the contract, and the loss given default (LGD), that is, the part of the credit that B will not be able to recover given the default event.

In this section, we focus on a deterministic clearing procedure [6]; in particular, we will investigate the specific case of an arbitrary number of financial institutions entangled in a mono-layer network of simple bilateral debt contracts. All institutions in the model must clear their payments at the time of maturity, which is the same for all bilateral contracts. Eisenberg and Noe provide a solid set of assumptions under which to study the basic properties of the solution of this kind of clearing procedure. This methodology allows one to compute cascades of defaults, reallocations of funds, and systemic effects when dealing with networks of contracts. It is important to stress that these models constitute only one mathematical aspect of the complex and multidisciplinary problem of ensuring financial stability, which involves political, legal, economic, financial, and infrastructural aspects. Opacity and information asymmetry in markets cause a general discrepancy between the credit risk estimations made by different financial institutions and the real risks. It is arguable, given the complexity of the system, that even a fully-fledged structural model would not be able to correctly compute credit risks. As a consequence, opacity and financial complexity lead to a systematic inefficiency in risk taking, both in the construction of portfolios and in the issuing of credit [10].

The Eisenberg–Noe model

Following closely the work by Eisenberg and Noe [6], let us consider an economy composed of N financial institutions, hereafter called banks for the sake of simplicity. Each bank has nominal liabilities to other banks that need to be settled at the same time. Such a structure of liabilities can be represented with an \(N\times N\) matrix of non-negative real numbers L, where each entry \(L_{ij}\) stands for the nominal liability of node i to node j. Nominal liabilities are all non-negative because a debt contract \(L_{ij}\) between bank i and bank j with a negative value would constitute an effective credit for bank i, and therefore a debt for bank j, and would simply appear as a positive value of the entry \(L_{ji}\). A further realistic assumption is the absence of nominal claims of a bank against itself, which results in null elements on the diagonal of the liabilities matrix.

Finally, each bank has a non-negative operating cash flow, \(e_i\), representing the net cash received by the bank from outside the financial system under consideration. External liabilities can be introduced either by setting a negative value to cash flows or by adding an extra bank to the liabilities matrix. Such a fictitious bank has zero cash flow, i.e., \(e_0 = 0\), and is supposed to receive an amount \(L_{i0}\) from each financial institution i. Nevertheless, this choice is not entirely equivalent to having a negative \(e_i\), as the seniority—the priority that a contract takes—of the liabilities in the matrix L, including the ones towards node 0, is lower than the one that would result from simply subtracting \(L_{i0}\) from the non-negative cash flow \(e_i\). In the following, we will take the latter choice and consider the possibility of an extra bank.

In this simplified picture, a financial system \(\mathcal {F}\) is a pair of a non-negative liabilities matrix and an operating cash flow vector, \(\mathcal {F}=(L,\,e)\). This framework excludes many realistic characteristics of financial systems, such as multiple contracts between a given pair of banks, multiple levels of seniority, contracts involving more than two banks, or different times to maturity. Further, it does not model in detail the stochastic features of cash flows, nor their correlation structure or the existence of common asset holdings among different institutions.

Despite its specificity, this framework successfully deals with the problem of identifying a clearing vector of payments, i.e., a vector that associates to each bank the total amount that it is able to repay to its creditors given the financial system, \((L,\,e)\). In doing so, it introduces some crucial quantities and features of systemic events in financial systems, such as the dynamics of the contagion process and the conditions for the existence of a unique solution.

Let us now look in more detail at the clearing procedure as defined in [6]. It is useful to introduce a few auxiliary variables, namely the total nominal obligations \(\bar{p}_i\) defined as:

$$\begin{aligned} \bar{p}_i = \sum _{j=0}^NL_{ij}, \end{aligned}$$
(1)

and the relative liabilities matrix, quantifying the fraction of bank i's total liabilities that bank j is entitled to receive in the case of full repayment,

$$\begin{aligned} \Pi _{ij}= {\left\{ \begin{array}{ll} \frac{L_{ij}}{\bar{p}_i},&{}\quad \text {if}\;\bar{p}_i >0\\ 0&{}\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$
(2)

Finally, we define the payment vector p, the vector of unknown quantities that we want to identify, that is, the amount that each bank is actually able to repay. By definition, all the elements of the payment vector are less than or equal to the elements of the obligation vector \(\bar{p}\) and greater than or equal to zero; in formulas, \(p\le \bar{p}, \; p\ge 0\). The clearing procedure may yield multiple solutions for p, but, by making some intuitive and natural financial assumptions, it can be shown that the solution is unique.

The financial requirements for the Eisenberg–Noe clearing procedure are the following: (1) all the elements of the payment vector are less than or equal to the available cash flow of the bank (Limited Liabilities); (2) banks repay as much as they can, i.e., they are not allowed to keep cash in their balance as long as they have not fully repaid all their liabilities—this can also be expressed by saying that equity has a lower seniority than interbank liabilities (Absolute Priority); (3) the individual payment of a given liability, i.e., the effective value repaid to a bank, has to be proportional to the fraction of the total obligation that the liability represents, as given by the relative liabilities matrix \(\Pi _{ij}\) (Proportionality). More explicitly, for each institution, the ratio between the liability repaid to a given counterparty and the total amount repaid to all counterparties has to equal the ratio between the nominal liability and the total amount of liabilities that the institution has.

From these simple assumptions, we can compute the net position of a bank assuming a given payment vector p. In fact, the total assets of bank i will amount to \(e_i +\sum _j \Pi _{ji}p_j\), whilst having a total obligation in the interbank market \(\bar{p}_i\). Given assumption (2), as long as the obligation is less than the total amount of assets then the bank will repay in full, i.e., \(p_i=\bar{p}_i\), and will remain with a net position \(e_i +\sum _j \Pi _{ji}p_j - \bar{p}_i\). Otherwise, when \(e_i +\sum _j \Pi _{ji}p_j < \bar{p}_i\), bank i will have to use all its assets to repay its counterparties, i.e., \(p_i=e_i +\sum _j \Pi _{ji}p_j\), leaving the bank with a net position equal to zero and a total amount of unrepaid debt equal to \(\bar{p}_i - e_i -\sum _j \Pi _{ji}p_j\). Hence, if we require all banks to simultaneously satisfy the same relations, we derive the following system of equations:

$$\begin{aligned} p_i = \min \left\{ e_i+\sum _{j}\Pi _{ji}p_j,\,\bar{p}_i\right\} \quad \forall \, i=1,\ldots , N, \end{aligned}$$
(3)

which defines a fixed-point problem for the payment vector, whose components also have to satisfy the conditions \(\bar{p}_i\ge p_i \ge 0\) for each bank i.
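
As an illustration, the clearing vector can be computed numerically by iterating the map on the right-hand side of Eq. (3), starting from the full obligations \(\bar{p}\): the iterates decrease monotonically towards the solution. The following is a minimal sketch of this approach (the toy liabilities matrix and cash flows are purely illustrative); the two-step algorithm originally proposed by Eisenberg and Noe is described in the next subsection.

```python
import numpy as np

def clearing_vector(L, e, tol=1e-12, max_iter=10_000):
    """Iterate the map of Eq. (3), p <- min(e + Pi^T p, p_bar), starting from
    the full obligations p_bar.  L[i, j] is the nominal liability of bank i
    to bank j, and e[i] is the operating cash flow of bank i."""
    p_bar = L.sum(axis=1)                                            # Eq. (1)
    Pi = np.divide(L, p_bar[:, None],
                   out=np.zeros_like(L), where=p_bar[:, None] > 0)   # Eq. (2)
    p = p_bar.copy()
    for _ in range(max_iter):
        p_new = np.minimum(e + Pi.T @ p, p_bar)                      # Eq. (3)
        if np.max(np.abs(p_new - p)) < tol:
            break
        p = p_new
    return p_new

# toy example: bank 0 owes 10 to bank 1, and bank 1 owes 10 to bank 2
L = np.array([[0., 10., 0.],
              [0., 0., 10.],
              [0., 0., 0.]])
e = np.array([4., 1., 0.])          # bank 0 cannot repay in full
print(clearing_vector(L, e))        # expected clearing vector: [4., 5., 0.]
```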

Convergence to the payment vector

The system (3), yielding the solutions to the clearing procedure, is a set of non-linear equations in which the dependence on each component of the payment vector is piece-wise linear, monotone, bounded, and continuous. All these properties allow one to demonstrate the existence and uniqueness of the solution by means of the general Knaster–Tarski theorem [11].

To identify the solution, Eisenberg and Noe propose a simple iterative algorithm, where each iteration is made of two steps: first, an update that identifies the set of defaulted banks

$$\begin{aligned} D(\mathbf {p}) = \{i\in \{1,\ldots N\}\, | p_i < \bar{p}_i\}. \end{aligned}$$
(4)

Second, the payment vector is updated by looking for the fixed-point of the following map:

$$\begin{aligned} \mathbf {p}' = \Lambda (\mathbf {p})\left( \Pi ^{\top }\left( \Lambda (\mathbf {p})\mathbf {p}' + (1-\Lambda (\mathbf {p}))\bar{\mathbf {p}}\right) + \mathbf {e}\right) + (1-\Lambda (\mathbf {p}))\bar{\mathbf {p}}, \end{aligned}$$
(5)

where \(\mathbf {p}\) is the current estimate of the payment vector and \(\Lambda (\mathbf {p})\) is a diagonal matrix, such that \(\Lambda (\mathbf {p})_{ii}\) is one if \(i\in D(\mathbf {p})\), and zero otherwise. Equation (5) admits a unique fixed-point if the financial network is regular; the regularity condition is defined in full detail in [6], but for the sake of brevity we just note that it is easily satisfied when each bank in the system has a strictly positive equity.

The fixed-point \(\mathbf {p^*}\) of the vector equation (5) is then used to update the set of defaulted banks D accordingly. The two steps are repeated until convergence to a set of defaulted banks D and a payment vector p satisfying (3); under the regularity condition, this payment vector is the unique solution to (3).

Related literature and generalizations

The seminal paper by Eisenberg and Noe (EN, hereafter) ignited a stream of research, and many generalizations have been proposed since its publication. Here, we discuss only the ones that introduce the most important effects and outline their consequences for the properties of the systemic losses.

Rogers and Veraart [12] introduce default costs in the system. In fact, when insolvency occurs it is unrealistic to assume—as EN do in their clearing model—the absence of additional costs: an insolvent financial institution may (1) need to rapidly liquidate valuable external assets at a price lower than their present market valuation, e.g., due to the price impact of a fire-sale, and (2) need to withdraw its interbank assets, conceding its counterparties a discount for repayment before maturity. With this motivation, the EN system is modified by introducing two factors \(\alpha \) and \(\beta \)—both between zero and one—that discount, respectively, the external and the interbank assets of an insolvent institution. The resulting system of equations reads:

$$\begin{aligned} p_i= {\left\{ \begin{array}{ll} \bar{p}_i,&{}\quad \text {if}\;\bar{p}_i < e_i + \sum _{j}\Pi _{ji}p_j,\\ \alpha e_i + \beta \sum _{j}\Pi _{ji}p_j,&{}\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$
(6)

It is easy to recognize that when \(\alpha =\beta =1\) the system of equations (6) is equivalent to (3). Otherwise, these discount factors increase losses, so that the cost of a bailout can exceed the initial losses of the financial system. In fact, while the EN procedure simply redistributes losses across the financial system, as discussed in [13], the Rogers–Veraart clearing procedure recognizes the existence of extra costs, which are exactly the ones that financial regulatory institutions want to minimize. The motivation is that, while external losses coming from economic shocks are driven by scarcely controllable and complex economic dynamics, the possible endogenous loss amplification due to the interconnectedness of the financial system could be avoided by regulators through specific policies.
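
As an illustration, the clearing sketch shown earlier can be adapted to Eq. (6) by letting insolvent banks pay only the discounted value of their assets; the values of \(\alpha \) and \(\beta \) below are arbitrary. Note that this is a naive monotone iteration of the map, not the algorithm used by Rogers and Veraart, who solve a linear system for the defaulting set at each step.

```python
import numpy as np

def rogers_veraart_clearing(L, e, alpha=0.7, beta=0.7, tol=1e-12, max_iter=10_000):
    """Naive iteration of Eq. (6): solvent banks pay in full, insolvent banks
    pay alpha*e_i + beta*(incoming interbank payments).  With alpha = beta = 1
    this reduces to the Eisenberg-Noe map of Eq. (3)."""
    p_bar = L.sum(axis=1)
    Pi = np.divide(L, p_bar[:, None],
                   out=np.zeros_like(L), where=p_bar[:, None] > 0)
    p = p_bar.copy()
    for _ in range(max_iter):
        inflow = Pi.T @ p                        # interbank payments received
        solvent = e + inflow >= p_bar
        p_new = np.where(solvent, p_bar, alpha * e + beta * inflow)
        if np.max(np.abs(p_new - p)) < tol:
            break
        p = p_new
    return p_new
```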

Both the EN and the Rogers–Veraart models are completely deterministic. They constitute mechanisms, respectively, for the redistribution and for the amplification of losses in a financial system. Nevertheless, such mechanisms are triggered only by an actual insolvency, i.e., liabilities have to exceed assets for a bank to propagate its losses: they are default contagion mechanisms. This is a limitation of the clearing framework that could be overcome in two ways: first, directly account for a propagation that activates when a counterparty is in financial distress, as in DebtRank [9], and thus define a distress contagion mechanism; second, account—before maturity—for the uncertainty in the value that the external assets will take at maturity, i.e., introduce a probability distribution over external assets. In the latter case, the propagation remains based on a default contagion mechanism.

Nevertheless, the distribution over the external assets will include scenarios that, on average, cause expected losses that would have been absent had the external assets been taken at their present value before maturity. This is the approach taken by Elsinger et al. [14]. A recent effort was made to consider the two kinds of mechanisms in a unified framework of network valuation [15], accounting for uncertainty and default costs in a compact form.

Cascades of bank defaults due to bilateral interbank exposures

The Gai–Kapadia model

Since the work of Gai and Kapadia [7], many researchers have developed network models of default cascades in financial networks, especially interbank networks in which banks lend to and borrow from each other. The basic structure of the Gai–Kapadia cascade model is heavily based on the Watts model of global cascades [8] on networks formed by social interactions between humans. In the Watts model, the mechanism by which a node affects its neighbors is quite simple: a node gets “activated” (or “infected”) if and only if at least a certain fraction \(R\in [0,1]\) of its neighbors are activated. The Watts model of cascades is, therefore, categorized as a linear threshold model, or just a threshold model, which belongs to the class of complex contagion.Footnote 1 The main implication of the Watts model is that, on randomly connected networks, even a vanishingly small fraction of initially active nodes may lead a finite fraction of an infinitely large network to become active, as long as the network is not too sparse or too dense. This critical phenomenon is called a global cascade, and the analytic condition under which a global cascade may occur, called the cascade condition, can be computed by exploiting a mean-field approximation.

In this section, we first explain the basic properties of the threshold model developed by Watts [8]. Since the threshold of activation for humans can be reinterpreted as the threshold of default for banks, understanding the threshold model used in the social network literature is important for understanding many existing models of default contagion in interbank networks. Here, we explain two different approaches to calculating the size of cascades, namely the tree-based approximation [16] and the generating function approach [7, 8]. While the Watts model assumes that edges are undirected, extending its framework to directed networks is straightforward [7].

Tree-based approximation

Here, we briefly explain a tree-like approximation method for solving the model of social contagion [16]. The basic idea of a tree-like approximation is to calculate the average final fraction \(\rho \) of activated nodes by assuming that the network is locally tree-like. The probability of a randomly chosen node being active, or the average size of global cascades, is calculated by the following equation:

$$\begin{aligned} \rho = \rho _0 + (1-\rho _0)\sum _{k=1}^{\infty }p_k \sum _{m=0}^k \begin{pmatrix} k \\ m \end{pmatrix} q^m (1-q)^{k-m}F\left( \frac{m}{k} \right) , \end{aligned}$$
(7)

where q is the probability that a randomly chosen neighbor is active, \(\rho _0\) is the probability that a node is initially active (i.e., a seed node), \(p_k\) is the degree distribution, and k and m stand for the degree and the number of active neighbors, respectively. The neighbors’ activation probabilities, which are all set equal to the mean value q, are regarded as independent, since the assumption of a locally tree-like structure rules out local cycles of influence. In the simple Watts model, the response function \(F(\cdot )\) takes 1 if \(m/k \ge R\), and 0 otherwise.

Fig. 1 Threshold cascade model with a Poissonian degree distribution. The line denotes the value of \(\rho \) calculated from Eq. (7), and the circles represent the simulated average cascade size, averaged over 1000 runs with \(N=10^{5}\) and \(R=0.18\). z is the mean degree, and the seed fraction is \(\rho _{0}=10^{-4}\)

The probability q is given as

$$\begin{aligned} q = \rho _0 + (1-\rho _0)\sum _{k=1}^{\infty }\frac{k}{z}p_k \sum _{m=0}^{k-1} \begin{pmatrix} k-1 \\ m \end{pmatrix} q^m (1-q)^{k-1-m}F\left( \frac{m}{k} \right) , \end{aligned}$$
(8)

where z denotes the mean degree, which is the connectivity parameter of the network. Note that one should use the excess degree distribution \(kp_{k}/z\), as opposed to the degree distribution \(p_{k}\), to compute the average fraction of active neighbors of a neighbor. This can be understood as a situation in which a “child node” is influenced by its “parent nodes” and then the child node affects the “grandchildren”, and so forth. The solution for q is then obtained as a fixed point of the recursion equation (8), which in turn gives the solution for the mean cascade size \(\rho \) from Eq. (7).Footnote 2 Despite its simplicity, the tree-based method can predict the final size of a global cascade very accurately (Fig. 1).
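
As a numerical illustration, the fixed point of Eq. (8) and the cascade size of Eq. (7) can be computed as follows for a Poisson degree distribution and the Watts response function; the truncation of the degree sum and the number of iterations are arbitrary choices.

```python
import numpy as np
from math import comb
from scipy.stats import poisson

def mean_cascade_size(z, R=0.18, rho0=1e-4, k_max=60, n_iter=500):
    """Solve the recursion (8) for q by fixed-point iteration, then evaluate
    Eq. (7).  Poisson degree distribution with mean z; F(m/k) = 1 if m/k >= R."""
    pk = {k: poisson.pmf(k, z) for k in range(1, k_max + 1)}
    F = lambda m, k: 1.0 if m / k >= R else 0.0

    q = rho0
    for _ in range(n_iter):                      # iterate Eq. (8)
        s = sum((k / z) * pk[k]
                * sum(comb(k - 1, m) * q**m * (1 - q)**(k - 1 - m) * F(m, k)
                      for m in range(k))
                for k in pk)
        q = rho0 + (1 - rho0) * s

    rho = sum(pk[k]
              * sum(comb(k, m) * q**m * (1 - q)**(k - m) * F(m, k)
                    for m in range(k + 1))
              for k in pk)
    return rho0 + (1 - rho0) * rho               # Eq. (7)

print(mean_cascade_size(z=3.0))   # close to 1 for parameters inside the cascade region
```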

The parameter space within which a global cascade may occur is called the cascade region, and the analytical condition for the parameters to be satisfied in the cascade region is called the cascade condition. To see the derivation of the cascade condition, let us define S(q) as

$$\begin{aligned} S(q)\equiv \sum _{k=1}^{\infty }\frac{k}{z}p_k \sum _{m=0}^{k-1} \begin{pmatrix} k-1 \\ m \end{pmatrix} q^m (1-q)^{k-1-m}F\left( \frac{m}{k} \right) . \end{aligned}$$
(9)

Gleeson and Cahalane [16] argue that if the derivative of the RHS of Eq. (8) (i.e., \(\rho _{0}+(1-\rho _0)S(q)\)) near \(q=0\) takes a value larger than one, then a vanishingly small initial seed \(\rho _0\) results in a large value of \(\rho \). The first-order cascade condition is, therefore, given by

$$\begin{aligned} (1-\rho _0)\sum _{k=1}^{\infty }\frac{k(k-1)}{z}p_k \left[ F\left( \frac{1}{k}\right) -F(0)\right] > 1. \end{aligned}$$
(10)

In fact, a comparison with simulation results reveals that this is not a very accurate condition in the parameter space (R, z). Therefore, they also propose the second-order cascade condition given by

$$\begin{aligned} (C_1-1)^2 - 4C_{0}C_{2} + 2\rho _{0}(C_{1}-C_{1}^{2}-2C_{2}+4C_{0}C_{2}) < 0, \end{aligned}$$
(11)

where it is assumed that \(\rho _0^2\approx 0\), and \(C_\ell \) is defined such that \(S(q) = \sum _{\ell =0}^\infty C_{\ell }q^\ell \) and

$$\begin{aligned} C_\ell \equiv \sum _{k=\ell +1}^{\infty }\sum _{n=0}^{\ell }\begin{pmatrix} k-1 \\ \ell \end{pmatrix}\begin{pmatrix} \ell \\ n \end{pmatrix} (-1)^{\ell -n}\frac{k}{z}p_{k}F\left( \frac{n}{k}\right) . \end{aligned}$$
(12)

Condition (11) states that the second-order approximation of Eq. (8), \(q = \rho _0 +(1-\rho _0)(C_0 +C_1q+C_2q^2)\), has no solution near \(q=0\), because the existence of a positive root near \(q=0\) implies that a global cascade is impossible.Footnote 3 Gleeson and Cahalane [16] showed that the second-order cascade condition (11) matches well the cascade region obtained from numerical simulations.
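
As an illustration, the coefficients \(C_\ell \) of Eq. (12) and the condition (11) can be evaluated numerically; the sketch below assumes a Poisson degree distribution and the Watts response function, with an arbitrary truncation of the degree sum.

```python
import numpy as np
from math import comb
from scipy.stats import poisson

def C_ell(ell, z, R, k_max=80):
    """Coefficient C_ell of the expansion of S(q), Eq. (12), for a Poisson
    degree distribution with mean z and F(x) = 1 if x >= R else 0."""
    F = lambda x: 1.0 if x >= R else 0.0
    total = 0.0
    for k in range(ell + 1, k_max + 1):
        inner = sum(comb(ell, n) * (-1) ** (ell - n) * F(n / k)
                    for n in range(ell + 1))
        total += comb(k - 1, ell) * (k / z) * poisson.pmf(k, z) * inner
    return total

def second_order_condition(z, R=0.18, rho0=1e-4):
    """Second-order cascade condition, Eq. (11)."""
    C0, C1, C2 = (C_ell(ell, z, R) for ell in range(3))
    return (C1 - 1) ** 2 - 4 * C0 * C2 \
        + 2 * rho0 * (C1 - C1**2 - 2 * C2 + 4 * C0 * C2) < 0

# z values (in steps of 0.5) satisfying Eq. (11); for R = 0.18 this should
# roughly reproduce the cascade window visible in Fig. 1
print([z for z in np.arange(0.5, 10.0, 0.5) if second_order_condition(z)])
```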

Generating function approach

Now, we explain the generating function approach to calculate the expected cascade size. In doing so, we compute the probability that a randomly chosen node is vulnerable; we say a node is vulnerable if its degree, k, satisfies \(R\le 1/k\). That is, if a node is vulnerable, then the node will get activated if at least one neighbor is active. Let \(\mu _k = P[R\le 1/k]\) denote the probability of a node having k edges being vulnerable. Since the probability that a randomly chosen node has degree k is \(p_k\), the generating function of vulnerable node degree is given as

$$\begin{aligned} G_0(x) = \sum _{k=0}^{\infty }\mu _{k}p_{k}x^k. \end{aligned}$$
(13)

The generating function \(G_0(x)\) contains the information on all of the moments of the degree distribution restricted to vulnerable nodes.Footnote 4 The generating function for the excess degree distribution of the vulnerable nodes reads

$$\begin{aligned} G_1(x)&= \frac{\sum _{k=1}^{\infty }k\mu _{k}p_{k}x^{k-1}}{\sum _{k=1}kp_k} = \frac{G_0^{\prime }(x)}{z}, \end{aligned}$$
(14)

where the prime denotes the derivative with respect to x. Note that \(G_1(1)\) is equal to the probability that a randomly chosen neighbor is vulnerable.

We now introduce the generating function for the vulnerable cluster size:

$$\begin{aligned} H_0(x)&= \sum _{n=0}^{\infty }\theta _nx^{n}, \end{aligned}$$
(15)
$$\begin{aligned} H_1(x)&= \sum _{n=0}^{\infty }\tilde{\theta }_nx^{n}, \end{aligned}$$
(16)

where \(\theta _n\) denotes the probability that a randomly chosen node belongs to a vulnerable cluster of size n, and \(\tilde{\theta }_n\) is the corresponding probability for a neighbor of a randomly chosen node. The generating function for the probability that a randomly selected neighbor belongs to a vulnerable cluster should satisfy the following self-consistency equation [8, 20]:

$$\begin{aligned} H_1(x)= 1 - G_1(1) + xG_1(H_1(x)). \end{aligned}$$
(17)

The first term represents the probability that a neighbor is not vulnerable, and the second term corresponds to the size distribution of vulnerable clusters to which a chosen neighbor belongs. Note that if a node belongs to a vulnerable cluster of size n, then its neighbors must also belong to a vulnerable cluster of size n. Therefore, the generating function for the probability that a randomly chosen neighbor belongs to a vulnerable cluster of size n depends on the second- and higher-order neighbor’s generating function, resulting in a self-consistent determination (17) [8, 20]. Here, the assumption of a locally tree-like structure is needed to obtain the generating function, in which case different neighbors belong to (locally) independent subsets of a vulnerable cluster. Once \(H_1(x)\) is obtained, \(H_0(x)\) is computed as

$$\begin{aligned} H_0(x) = 1-G_0(1) + xG_0(H_1(x)). \end{aligned}$$
(18)

We note that, roughly speaking, the procedure for calculating \(H_1\) (\(H_0\)) corresponds to the derivation of the fixed point q (\(\rho \)) in the recursion equation (8) (Eq. (7)) in Gleeson and Cahalane’s [16] tree-based method.Footnote 5

The average vulnerable cluster size is given by

$$\begin{aligned} \langle n\rangle&= H_0^{\prime }(1) \nonumber \\&= G_0(1) + \frac{(G_{0}^{\prime }(1))^2}{z - G_0^{\prime \prime }(1)}. \end{aligned}$$
(19)

It follows that the cascade condition is expressed as

$$\begin{aligned} z < G_0^{\prime \prime }(1) = \sum _{k=1}k(k-1)\mu _{k}p_{k}. \end{aligned}$$
(20)

Note that since \(\mu _k = F(1/k)\) for \(k>0\), Eq. (20) is equivalent to the previous cascade condition derived from Gleeson and Cahalane’s method (Eq. (10)), as long as the threshold value is strictly positive (i.e., \(R>0\) and \(F(0)=0\)) and the seed fraction is sufficiently small (i.e., \(\rho _0\rightarrow 0\)).
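
The generating-function condition (20) is also straightforward to evaluate numerically; a minimal sketch for a Poisson degree distribution (the truncation is arbitrary):

```python
import numpy as np
from scipy.stats import poisson

def vulnerable_cascade_condition(z, R=0.18, k_max=200):
    """Cascade condition (20): z < G_0''(1) = sum_k k(k-1) mu_k p_k,
    with mu_k = 1 if 1/k >= R (a node with degree k is vulnerable), 0 otherwise."""
    ks = np.arange(1, k_max + 1)
    pk = poisson.pmf(ks, z)
    mu = (1.0 / ks >= R).astype(float)
    return z < np.sum(ks * (ks - 1) * mu * pk)

print([z for z in np.arange(0.5, 10.0, 0.5) if vulnerable_cascade_condition(z)])
```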

Financial contagion

In models of financial contagion, nodes and directed edges represent banks and lending–borrowing relationships, respectively, and an activation of a node is interpreted as a bank default. In fact, the Gai–Kapadia model is isomorphic to the Watts model, the difference being that the former treats a directed random graph while the latter focuses on an undirected random graph. To see this, let us consider a stylized balance sheet of a bank (Fig. 2). Suppose that each bank may have two types of assets: interbank assets, \(A^\mathrm{IB}\), and external assets, \(A^\mathrm{E}\) (such as stocks, bonds, etc). On the liability side, there can be interbank liabilities, \(L^\mathrm{IB}\), and deposits from customers, D. Then, the solvency condition for bank i is given by

$$\begin{aligned} A_{i}^\mathrm{IB} + A_{i}^\mathrm{E} - L_{i}^\mathrm{IB} - D_{i} > 0, \end{aligned}$$
(21)

which is equivalent to saying that the net worth (or the capital) of a bank should be positive.

Fig. 2 Stylized balance sheet of a bank

Now, consider the situation in which the amounts of loans extended from a bank to other banks are evenly distributed, so that each interbank exposure is simply written as \(A_{i}^\mathrm{IB}/|\mathcal {N}_{i}|\), where \(\mathcal {N}_{i}\) denotes the set of borrowers to which bank i lends. We assume that the ratio of total interbank assets, \(A_{i}^\mathrm{IB}\), to net worth, \(K_{i}\), is common across banks and is given by \(A_{i}^\mathrm{IB}/K_{i} = 1/\overline{R}\) \(\forall i\) for \(\overline{R}>0\). Under these assumptions, the default condition for bank i leads to

$$\begin{aligned} \phi _{i}&> \frac{K_{i}}{A_{i}^\mathrm{IB}} = \overline{R}, \end{aligned}$$
(22)

where \(\phi _{i}\) is the fraction of bank i’s counterparties that have defaulted. The loss given default is assumed to be 100% for simplicity; the lender would lose the full amount of funds lent to a defaulted bank. Eq. (22) states that bank i will default if the actual fraction of defaulted counterparties exceeds a constant threshold level \(\overline{R}\), which is essentially the same mechanism as that of social contagion in the Watts model (i.e., just replacing R with \(\overline{R}\)). Note that if bank i holds a sufficient amount of capital buffer such that \(\overline{R}>1\), then there is no chance for bank i to default by contagion, simply because the capital buffer can fully absorb the maximum possible losses. Since the edges exposed to the risk of bank default are outgoing edges (i.e., lending to other banks), the mechanism of social contagion applies here in a straightforward manner as long as there are no bidirectional edges.Footnote 6 The trick here is that the volumes of interbank exposures, or edge weights, are evenly distributed among the borrowers to which a bank lends. This greatly simplifies the analysis because otherwise we would have to replace \(\phi _{i}\) in the default condition (22) with the total fraction of losses that bank i incurs. The more general case of heterogeneous edge weights will be discussed in the following section.
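
As an illustration, the default cascade implied by condition (22) can be simulated directly on a directed random graph; the network generator (networkx) and all parameter values below are illustrative choices, not those used in [7].

```python
import numpy as np
import networkx as nx

def gai_kapadia_cascade(n=1000, z=4.0, R_bar=0.25, seed=0):
    """Simulate the cascade implied by condition (22): a bank defaults when
    the fraction of its borrowers (out-neighbors) that have defaulted exceeds
    R_bar.  Loans are evenly distributed and the loss given default is 100%."""
    rng = np.random.default_rng(seed)
    g = nx.gnp_random_graph(n, z / n, seed=seed, directed=True)  # edge i -> j: i lends to j
    defaulted = {int(rng.integers(n))}                            # one initial failure
    changed = True
    while changed:
        changed = False
        for i in g.nodes:
            if i in defaulted:
                continue
            borrowers = list(g.successors(i))
            if borrowers:
                phi = sum(j in defaulted for j in borrowers) / len(borrowers)
                if phi > R_bar:                                   # condition (22)
                    defaulted.add(i)
                    changed = True
    return len(defaulted) / n                                     # final default fraction

print(gai_kapadia_cascade())
```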

Given this isomorphism, it is straightforward to analyze cascades of bank defaults using the methods developed by Gleeson and Cahalane [16] and Watts [8]. Due to its simplicity and analytical tractability, over the past years the Gai–Kapadia model has frequently been used as a baseline framework for more sophisticated models of financial contagion. We review some of these extensions in the next section.

Extensions of the threshold cascade model

The Gai–Kapadia model spurred a flurry of research on financial contagion due to interbank exposures. Before reviewing these works, let us summarize some of the important assumptions made in the simplest version of the Gai–Kapadia model: (1) loans are evenly distributed, (2) interbank lending forms an Erdős–Rényi random graph, and (3) the risk of external assets is not considered. Recently, various models have been proposed to make the cascade model more realistic by relaxing these assumptions. In considering possible extensions of threshold financial cascades, we can take advantage of the isomorphism between the Watts and Gai–Kapadia models. That is, any model proposed as an extension of the Watts model can be applied to the model of financial contagion as well.

Heterogeneous edge weights

When heterogeneity of loan weights is introduced in the Gai–Kapadia model, the default condition is no longer captured by Eq. (22). The condition (22) is valid only if the amounts of interbank loans that a bank has lent to other banks are identical. Otherwise, the default condition cannot be expressed just by the fraction of defaulted borrowers, but rather by the ratio of losses to total interbank assets:

$$\begin{aligned} \frac{\sum _{j\in \mathcal {N}_i^\mathrm{def}}A_{ij}^\mathrm{IB}}{A_i^\mathrm{IB}}&> \overline{R}, \end{aligned}$$
(23)

where \(A_{ij}^\mathrm{IB}\) denotes the amount of funds lent from bank i to bank j (i.e., \(A_i^\mathrm{IB} = \sum _{j}A_{ij}^\mathrm{IB}\)), and \(\mathcal {N}_i^\mathrm{def}\) is the set of bank i’s borrowers that have defaulted. Note that assuming identical edge weights, \(A_{ij}^\mathrm{IB} = A_i^\mathrm{IB}/|\mathcal {N}_{i}|\), recovers condition (22) since the LHS would reduce to \(|\mathcal {N}_{i}^\mathrm{def}|/|\mathcal {N}_{i}| \equiv \phi _i\).

Under the generic default condition (23), the standard mean-field approximation is no longer appropriate, since different borrowers have different weights, meaning that the number of defaulted banks itself is not informative. One obvious way to analyze such a more general environment is to rely on numerical simulations. Hurd and Gleeson [24], Hurd [25], and Unicomb et al. [26], however, proposed alternative approximation methods to compute the solution of the cascade dynamics. Hurd and Gleeson [24, 25] consider a situation in which edge weights w are random variables, whose CDFs \(G_{kk^{\prime }}(w)\) depend on the degrees k and \(k^{\prime }\) of the nodes on the two sides of an edge. They show that, by imposing additional assumptions, one can obtain analytical solutions for the mean cascade size. Unicomb et al. [26] extend Gleeson’s [27, 28] approximate master equation, showing that an increased weight heterogeneity reduces the size of cascades.
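
In a numerical simulation, moving from condition (22) to condition (23) only requires replacing the count of defaulted neighbors with the exposure-weighted loss; a minimal sketch (the exposure matrix is a hypothetical input):

```python
import numpy as np

def next_defaults(A_ib, defaulted, R_bar):
    """One round of contagion under the generic default condition (23).
    A_ib[i, j] is the exposure of lender i to borrower j, and `defaulted`
    is a boolean vector describing the current set of defaulted banks."""
    total = A_ib.sum(axis=1)                               # A_i^IB
    losses = A_ib[:, defaulted].sum(axis=1)                # exposures to defaulted borrowers
    ratio = np.divide(losses, total, out=np.zeros_like(total), where=total > 0)
    return defaulted | (ratio > R_bar)                     # condition (23)
```

Iterating this map until no new defaults appear yields the final set of contagious defaults.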

Non-Erdős–Rényi networks

The Erdős–Rényi random graph [29] is probably the most widely used network structure in analytical models of global cascades, yet no empirical financial network exhibits the Erdős–Rényi structure. For example, it has been shown that the degree distribution of interbank networks follows fat-tailed distributions such as a power law or a log-normal [30,31,32]. Moreover, although most analytical methods assume a locally tree-like structure, empirical networks have local clusters, and edges exhibit a degree–degree correlation called assortativity [19].

It is natural to think that the tree-based method could be extended to configuration-model graphs with an arbitrary degree distribution \(p_{k}\), as long as the network has a locally tree-like structure. A possible problem is that the tree-like assumption might no longer hold once a more realistic network structure is considered. Even in the configuration model, for example, the clustering coefficient can be large when a fat-tailed degree distribution is assumed [19].Footnote 7 Fortunately, recent studies have developed several ways in which the presence of local cycles need not affect the accuracy of the analytical solutions. Melnik et al. [33] provide some conditions under which the tree-like approximation works “unreasonably” well even in networks with high levels of clustering. Radicchi and Castellano [34] develop an alternative technique based on a message-passing algorithm, which gives an accurate approximation in networks with local clusters. Ikeda et al. [35] also show that the presence of local clusters enhances the chance of global cascades.

Another possible departure from the Erdős–Rényi graph is that there are negative degree–degree correlations (or disassortativity) in real-world financial networks [36, 37]. That is, banks with high degrees are likely to trade with low-degree banks. Dodds and Payne [38], Payne et al. [23, 39] and Hurd et al. [40] study the effect of (dis)assortativity on the level of systemic risk, allowing for an arbitrary degree distribution. They show that assortativity of financial linkages strongly affects the expected cascade size.

We can extend the tree-based method even to a multiplex structure of interbank networks. A multiplex network is a networked system consisting of multiple layers, on each of which nodes are connected by edges.Footnote 8 Interbank networks may exhibit a multiplex structure when banks trade different types of assets. For example, if there is a seniority structure in interbank assets (i.e., differences in the risk of loans), a “monoplex” model would no longer suffice. Brummitt and Kobayashi [42] generalized the Gai–Kapadia model in a way that allows for different seniority levels, where assets with different risks are traded in different layers. They consider a general case in which there are M seniority levels, showing that the cascade condition is generally given by the trace of the Jacobian of the M recursion equations. Of course, the variety of interbank assets is not limited to seniority levels. There can be other “layers” in which banks trade long- and short-term assets, foreign exchange exposures, derivatives, etc. [43, 44]. The multiplex structure of financial networks, however, is still a relatively immature research area, in the sense that analytical models of financial contagion are still scarce.

Risk of external assets

The only contagion channel considered in the Gai–Kapadia model is a cascade of repayment failures in the interbank market. In reality, however, this channel is only one of the sources of systemic risk. The risk of devaluation of external assets is one of the major concerns not only for the portfolio management of individual banks, but also for the systemic risk of the entire financial system. In recent years, much work has been done on the contagion channel through overlapping portfolios, in which a fall in the price of an asset simultaneously affects many banks holding the same (or a correlated) asset [45,46,47,48]. Such a simultaneous shock to multiple banks has the potential to accelerate the traditional contagion process through interbank exposures. We will explain these studies in detail in sections “Distress propagation due to credit quality deterioration” and “Overlapping portfolios and price mediated contagion”.

In the Gai–Kapadia model, the risk of external assets plays a role only in initiating a contagion. Suppose that the returns of external assets held by different banks are independent, and that the price of an asset held by bank i falls. If the devaluation of the asset is so large that the default condition (22) is satisfied (due to a reduction in \(K_{i}\)), then it may cause bank i’s creditors to default, initiating a contagion process. However, the role of external assets in reality is not that simple, because actual external assets are correlated, and the volatility of assets makes the health of balance sheets differ from bank to bank. To take into account these more realistic situations, many studies conduct simulations to understand the impact that a correlation in external assets has on systemic risk [49]. Kobayashi [50] provides a simple way to generalize the response function F to include the possibility that the values of external assets follow a probability distribution, which in fact corresponds to the Watts model in which the threshold for contagion is a random variable [16].

Distress propagation due to credit quality deterioration

The importance of counterparty default contagion for practical purposes has been challenged both theoretically and empirically. From the theoretical point of view, Glasserman et al. [51] proved, for instance, that, within the Eisenberg–Noe framework, the contribution of contagion to the default probability of a bank is always small, and Battiston et al. [52] showed that this is the case because of a “conservation of losses” that is implicitly embedded in the Eisenberg–Noe algorithm, which prevents it from amplifying exogenous shocks. From the empirical point of view, contagion analyses of real interbank systems have shown that domino effects triggered by the failure of a small number of banks are unlikely to occur in practice [53]. On the other hand, it was shown that networks of interbank exposures can significantly amplify distress propagation in the presence of other contagion channels, such as fire sales and overlapping portfolios [48].

Beyond its interaction with other contagion mechanisms, another reason why networks of interbank exposures can be important is the following: models of contagion due to counterparty default risk assume that losses propagate from borrowers to lenders only after the default of a borrower. However, in practice, losses could occur even in absence of default, because of credit quality deterioration [54]. Consider the situation in which bank i is exposed to bank j, which suffers a large loss. After the loss, the probability that j defaults has increased, and therefore, the expected cash flow associated with the exposures between i and j is reduced. If interbank assets were to be marked to market, this would mean that the value of the interbank asset of i that is associated with its exposure to j is reduced. The idea of accounting for the propagation of distress before defaults led to the introduction of DebtRank [9].

DebtRank

Let us consider a system of N banks, and let us denote by \(W_{ij}\) the interbank exposure of bank i towards bank j, by \(A_i^\mathrm{ext}\) the external (non-interbank) assets of bank i, and by \(L_i\) its total liabilities. DebtRank is a discrete-time map that describes the evolution of the equity of all banks after a shock hits the system. In the DebtRank dynamic, banks can be in two states: active or inactive. An active bank is a bank that will pass distress to its creditors if subject to a loss, and a bank becomes inactive after it has passed distress to its creditors once. This does not necessarily mean that the bank has defaulted, nor that the bank cannot suffer additional losses, but simply that further losses will not be transmitted to its creditors. If we denote by \(h_i(t) = \frac{E_i(0)-E_i(t)}{E_i(0)}\) the relative loss of equity of bank i at time t, and by \(\mathcal {A}(t)\) the set of active banks at time t, the DebtRank dynamic reads

$$\begin{aligned} h_i(t+1)&= \min \left\{ 1,\, h_i(t) + \sum _{j\in \mathcal {A}(t)} \frac{W_{ij}}{E_i(0)}\, h_j(t) \right\} \end{aligned}$$
(24)
$$\begin{aligned} \mathcal {A}(t+1)&= \left\{ i~|~h_i(t)>0,~h_i(t-1)=0\right\} \end{aligned}$$
(25)

The meaning of the above dynamic is the following: the loss experienced by bank i between time 0 and time \(t+1\) is its loss up to time t plus the new losses that are transmitted by its active counterparties. The contribution to the loss of i due to counterparty j is proportional to the level of distress of j (the factor \(h_j(t)\)) and to the exposure of i towards j relative to its equity (the factor \(W_{ij}/E_i(0)\)). The matrix with elements \(W_{ij}/E_i(0)\) has been named the matrix of interbank leverage [55], because its elements represent the percentage loss of equity of i that corresponds to a \(1\%\) devaluation of its exposure to j.
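
A minimal sketch of the dynamics (24)–(25) is given below. The exposures, equities, and initial shock are made-up numbers; note also that time-indexing conventions for the active set differ slightly across formulations, and the sketch simply lets a bank transmit its distress once, in the step after it first becomes distressed.

```python
import numpy as np

def debtrank_losses(W, E0, h1, max_steps=100):
    """Iterate Eqs. (24)-(25).  W[i, j] is the exposure of bank i to bank j,
    E0 the vector of initial equities, h1 the initial relative equity losses."""
    h_prev = np.zeros_like(h1)                   # h(t-1)
    h = h1.astype(float).copy()                  # h(t)
    for _ in range(max_steps):
        active = (h > 0) & (h_prev == 0)         # banks that transmit at this step
        if not active.any():
            break
        transmitted = (W[:, active] / E0[:, None]) @ h[active]
        h_prev, h = h, np.minimum(1.0, h + transmitted)   # Eq. (24)
    return h                                     # final relative equity losses

# illustrative 3-bank example: bank 0 loses 50% of its equity exogenously
W = np.array([[0., 5., 0.],
              [2., 0., 3.],
              [1., 0., 0.]])
E0 = np.array([10., 10., 10.])
print(debtrank_losses(W, E0, h1=np.array([0.5, 0., 0.])))
```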

In the original DebtRank paper, Battiston et al. [9] present a study of US commercial banks, and they show that their algorithm can effectively be used to rank banks in terms of their systemic importance. By showing that relatively small banks can be among the most systemically important, and because of the analogy between DebtRank and centrality measures in networks, they introduced into the debate on systemic risk the idea that some banks might be “too central to fail”.

Extensions

According to the above formulation, because nodes become inactive after they propagate distress once, losses can flow through a cycle in the network only once. To account for further rounds of propagation, Bardoscia et al. [56] derived, from the iteration of the balance sheet identity, the following modified dynamic

$$\begin{aligned} h_i(t+1)&= \min \left\{ 1,\, h_i(1) + \sum _{j=1}^N \frac{W_{ij}}{E_i(0)}\, h_j(t) \right\} , \end{aligned}$$
(26)

where \(h_i(1)\) is the initial exogenous shock that affects bank i, and it is assumed that \(h_i(0)=0\) for all \(i = 1,\ldots N\). This formulation makes it easier to understand the stability of a system with respect to a small perturbation. In particular, if the largest eigenvalue of the matrix of interbank leverages is larger than one, shocks will be amplified by the network and lead to the default of some banks in the system.
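
This stability criterion is easy to check in practice: it suffices to compute the spectral radius of the interbank leverage matrix \(W_{ij}/E_i(0)\). A minimal sketch with made-up numbers:

```python
import numpy as np

W = np.array([[0., 5., 2.],      # illustrative interbank exposures
              [4., 0., 3.],
              [1., 6., 0.]])
E0 = np.array([10., 8., 12.])    # illustrative initial equities

leverage = W / E0[:, None]                          # interbank leverage matrix
spectral_radius = np.max(np.abs(np.linalg.eigvals(leverage)))
print(spectral_radius)           # a value above 1 signals amplification of small shocks
```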

The underlying assumption of DebtRank is that losses propagate from borrowers to lenders linearly: an \(x\%\) devaluation of the equity of the borrower leads to an \(x\%\) devaluation of the interbank asset of the lender. This assumption can be relaxed by considering dynamics of the form

$$\begin{aligned} h_i(t+1) = \mathrm{min} \left\{ 1, h_i(1) + \sum _{j=1}^N \frac{W_{ij}}{E_i(0)} f\left( h_j(t) \right) \right\} , \end{aligned}$$
(27)

with f(x) a function that maps the interval [0, 1] into the positive real semiaxis. For instance, Bardoscia et al. [57] considered the following function

$$\begin{aligned} f(x) = x {e}^{\alpha (x-1)}, \end{aligned}$$
(28)

where \(\alpha \ge 0\). This function represents a one-parameter family of distress propagation rules that interpolates between the threshold model used in [7], which is recovered in the limit \(\alpha \rightarrow \infty \), and the linear rule of DebtRank, which corresponds to \(\alpha =0\). By performing contagion analyses on a system of European banks, they explored the dependence of the model on the parameter \(\alpha \), showing the existence of different regimes as far as the amplification of distress is concerned.

Bardoscia et al. [58] also considered the case of a non-linear propagation of distress, taking f(x) to be increasing and convex. They analyzed the largest eigenvalue of the interbank leverage matrix, and they showed the existence of trajectories in the space of networks that can turn a system from stable to unstable through processes that are normally believed to increase the stability of financial markets, namely market integration and diversification. To show this, they considered the hypothetical case in which the network of interbank contracts is a directed acyclic graph, which is stable. They then considered a situation in which links are randomly added between banks in the system, but in such a way that, every time a link is added, the total amount of lending of each bank is preserved. This implies that banks are on average increasing their diversification. They showed that, through this process of increasing diversification, it is possible for an initially stable network to become unstable. They argue that the instability of a network under this type of dynamics is due to the emergence of peculiar cyclical structures in the network of interbank exposures.

DebtRank is also at the basis of the stress testing framework proposed by Battiston et al. [55], which is based on the following steps: (1) application of an exogenous shock to the system and estimation of direct losses; (2) propagation of distress through the DebtRank dynamic, Eq. (24), and estimation of second-round losses; (3) further (third-round) losses caused by banks liquidating a common asset to target their initial leverage. Through the application of this stress testing framework, Battiston et al. [55] find that the second-round effects (due to DebtRank) and the third-round effects (due to leverage targeting) dominate the first-round losses (direct losses due to the exogenous shock). This finding has potential implications for regulators, as it implies that stress tests that do not account for network effects can significantly underestimate systemic risk.

In relation to policy making, an interesting line of work has been presented by Thurner and Poledna [59] and Poledna and Thurner [60], who, using DebtRank as a tool to measure systemic risk, showed how taxation policies that take into account the impact of interbank contracts on systemic risk can effectively promote systemic stability without reducing the volume of interbank lending.

Overlapping portfolios and price mediated contagion

Section “Cascades of bank defaults due to bilateral interbank exposures” discussed contagion due to counterparty default risk. Here we discuss a different contagion mechanism, which is associated with the fact that stress propagates between investors that hold common assets. The idea of this contagion mechanism is the following: consider the simple situation in which two banks i and j invest in a common asset x. Suppose now that bank i is under stress, and that to reduce its risk exposure it has to liquidate part of its position on asset x. Because of market impact, the tendency of prices to react to trading activity, the liquidation procedure causes a devaluation of the asset, whose price will drop. When assets are marked to market, this devaluation causes a loss to bank j. We then see that, even in the absence of direct contracts between i and j (such as those associated with interbank loans considered previously), stress can propagate from i to j through the intermediation of the price of the common asset x. Similarly to the case of counterparty default risk, the question to be asked is, therefore, “how does the pattern of overlapping portfolios between banks, which can be modeled as a bipartite network, affect systemic risk?” A pictorial representation of a simple network of overlapping portfolios is shown in Fig. 3.

Fig. 3 Pictorial representation of a network of overlapping portfolios. A bank is connected to the assets in its balance sheet. Stress can propagate between banks with common assets. For instance, if bank \(\mathtt {1}\) is under distress and liquidates its portfolio, asset \({\mathsf A}\) would be devalued. This would cause a loss to bank \(\mathtt {2}\), which might in turn need to liquidate its investment portfolio. This liquidation would cause asset \({\mathsf A}\) to be further devalued and asset \({\mathsf B}\) to be devalued as well, thus causing losses to banks \(\mathtt {3}\), \(\mathtt {4}\) and \(\mathtt {5}\) and the consequent devaluation of asset \({\mathsf C}\)

The effect of losses due to common asset holdings and fire sales was first studied by Cifuentes et al. [61] in the context of the Eisenberg–Noe model. In their paper, Cifuentes et al. [61] consider a system of banks that interact through a network of interbank lending relationships and that all invest in one common external asset. Banks are subject to a capital constraint, and thus need to liquidate part of their investment in the common asset if they face a loss. The authors present numerical simulations on a system of 10 banks to show the response of the system to the initial default of a bank, and they study the effect of changing the average connectivity of the interbank network. They find that there is a non-monotonic relationship between the number of connections in the network and the number of observed defaults. A similar setting is also explored by Gai and Kapadia [7], where the interbank lending network is, however, modeled as a directed Erdős–Rényi network and counterparty default contagion is modeled through the threshold dynamics discussed in section “Cascades of bank defaults due to bilateral interbank exposures”, as well as by Nier et al. [62]. May and Arinaminpathy [63] build on the model of Nier et al. [62], of which they present a mean-field solution, by considering that banks interact through different asset classes and by accounting for contagion between those asset classes.

The papers mentioned above considered the effect of fire sales of one or a few asset classes, but their focus was the study of the stability of a system as a function of the properties of the network of interbank loans. More recently, the focus shifted towards the study of the network of overlapping portfolios itself and of how its shape affects systemic stability. The network of overlapping portfolios is usually modeled as a bipartite network, where two types of nodes exist (banks and assets) and a link can only connect a bank to an asset, meaning that the bank is investing in that asset. If we consider a system of N banks and M assets, we can describe the structure of the system in terms of the matrix Q, where the element \(Q_{ia}\) is the number of shares of asset a held by bank i, and we also denote by \(p_a\) the price of asset a. Apart from the network, two main ingredients are needed to define a model: the first is the response of a bank to its losses, and the second is the response of an asset to its liquidation. If we consider a dynamic that occurs over discrete time steps \(t=1,2,\ldots \), the response of bank i can be defined in terms of the map

$$\begin{aligned} Q_{ia}(t) = f_{ia} \left[ Q_{ia}(t-1),A_i(t-1),E_i(t-1)\right] , \end{aligned}$$
(29)

where we have denoted by \(A_i(t)\) the value of the assets of bank i at time t and by \(E_i(t)\) its equity, while the response of asset a can be expressed in terms of the map

$$\begin{aligned} p_a(t) = g_a\left[ \{Q_{ia}(t)\}\right] , \end{aligned}$$
(30)

where we denote by \(\{Q_{ia}(t)\}\) the set \(\{Q_{1a}(t),Q_{2a}(t),\ldots ,Q_{Na}(t)\}\).
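To make this structure concrete, the following minimal Python sketch alternates a bank-response map of the form (29) with a price-response map of the form (30) until the holdings matrix stops changing. The function names, the convergence criterion, and the equity bookkeeping (equity reduced one-for-one by mark-to-market losses) are our own illustrative assumptions rather than part of any specific model.

```python
import numpy as np

def run_dynamics(Q0, p0, E0, bank_response, asset_response, max_steps=100, tol=1e-10):
    """Iterate generic maps of the form (29)-(30) until the holdings matrix converges.

    Q0 : (N, M) array of initial shares Q_ia(0); p0 : (M,) initial prices; E0 : (N,) initial equities.
    bank_response(Q, A, E) -> new holdings (eq. 29); asset_response(Q0, Q, p0) -> new prices (eq. 30).
    """
    Q, p = Q0.copy(), p0.copy()
    A0 = Q0 @ p0                            # initial mark-to-market asset values
    for _ in range(max_steps):
        A = Q @ p                           # current asset values A_i(t-1)
        E = E0 - (A0 - A)                   # equity after losses (illustrative bookkeeping)
        Q_new = bank_response(Q, A, E)      # eq. (29)
        p = asset_response(Q0, Q_new, p0)   # eq. (30)
        if np.max(np.abs(Q_new - Q)) < tol: # fixed point reached
            Q = Q_new
            break
        Q = Q_new
    return Q, p, E
```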

In the literature on network models of overlapping portfolios, two choices are common for the response of banks: either banks are passive until they default, at which point they liquidate their entire portfolio, or they target a certain level of leverage, defined as the ratio between the mark-to-market value of their assets and their equity. In the following, we briefly discuss some of these models. The models we discuss are very similar in their ingredients, but they differ in the focus of their analysis.

Threshold dynamics

Huang et al. [46] consider the situation in which a bank is passive until it defaults, at which point it liquidates its entire portfolio, so that

$$\begin{aligned} f_{ia}\left[ Q_{ia}(t-1),A_i(t-1),E_i(t-1)\right] = {\left\{ \begin{array}{ll} Q_{ia}(0),&{}\quad \mathrm{if} \quad E_i(t-1)\ge 0\\ 0,&{}\quad \mathrm{if} \quad E_i(t-1)< 0 \end{array}\right. }. \end{aligned}$$
(31)

In Huang et al. [46] asset prices are assumed to respond to liquidation as

$$\begin{aligned} g_a\left[ \{Q_{ia}(t)\}\right] = p_a(0)\left( 1-\alpha \frac{\sum _i\left[ Q_{ia}(0)-Q_{ia}(t)\right] }{\sum _i Q_{ia}(0)}\right) , \end{aligned}$$
(32)

where \(\alpha \ge 0\) is a parameter related to the market impact associated with asset a. The above expression means that the value of the asset at time t depends linearly on the fraction of its shares (relative to the total number of shares held in the system) that has been liquidated up to that time.

Huang et al. [46] perform an empirical analysis of US commercial banks in 2007. They consider data for 7846 commercial banks and 13 asset classes, and they perform stress tests by reducing the value of one asset class from \(p_a(0)\) to \((1-\xi ) p_a(0)\), with \(0\le \xi \le 1\). They then compute the number of banks that survive the cascading process triggered by the initial shock. They find that abrupt transitions occur in the number of surviving banks as a function of the parameters \(\alpha \) and \(\xi \), and that the devaluation of commercial real estate loans was responsible for the failure of commercial banks during the subprime crisis. Quite interestingly, Huang et al. [46] also perform an empirical validation of their model by comparing the banks that their model predicts should fail with those banks that actually failed between 2008 and 2011. Their analysis of false positive and true positive rates shows that the model has predictive power.
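This stress-testing procedure can be sketched in a few lines of Python. The snippet below is a toy reimplementation under our own simplifying assumptions: a random holdings matrix instead of the actual balance-sheet data, equity fixed at 5% of initial asset value, and a particular convention for combining the exogenous shock with the impact rule (32). It shocks one asset class by a factor \(\xi\), iterates the threshold response (31) together with the linear impact (32), and counts the surviving banks.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 200, 13                       # numbers of banks and asset classes (illustrative)
Q0 = rng.random((N, M))              # hypothetical holdings Q_ia(0)
p0 = np.ones(M)                      # initial prices p_a(0)
E0 = 0.05 * Q0 @ p0                  # equity = 5% of initial asset value (assumption)
alpha, xi, shocked = 1.0, 0.3, 0     # impact parameter, shock size, shocked asset class

p_ref = p0.copy()
p_ref[shocked] = (1 - xi) * p0[shocked]   # exogenous devaluation of one asset class
Q, p = Q0.copy(), p_ref.copy()
alive = np.ones(N, dtype=bool)

for _ in range(N):                   # at most N rounds of contagion
    E = E0 - (Q0 @ p0 - Q @ p)       # equity after mark-to-market losses
    newly_failed = alive & (E < 0)
    if not newly_failed.any():
        break
    alive &= ~newly_failed
    Q[newly_failed, :] = 0.0         # eq. (31): failed banks liquidate everything
    liquidated = (Q0 - Q).sum(axis=0) / Q0.sum(axis=0)
    p = p_ref * (1 - alpha * liquidated)   # eq. (32), applied to the shocked prices

print(f"surviving banks: {alive.sum()} / {N}")
```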

A similar model of overlapping portfolios is that of Caccioli et al. [47], who also consider a bipartite network of banks and assets and the map (31) for the update of banks' positions in the assets. The rule for the devaluation of assets is, however, linear in the log returns,Footnote 9 as in [7, 61]. This can be written as

$$\begin{aligned} g_a\left[ \{Q_{ia}(t)\}\right] = p_a(0)\,{e}^{-\alpha \frac{\sum _i\left[ Q_{ia}(0)-Q_{ia}(t)\right] }{\sum _i Q_{ia}(0)}} . \end{aligned}$$
(33)

Caccioli et al. [47] study the stability of the system in the limit where the number of banks and assets is large. In particular, they identify the conditions under which a small initial perturbation, such as the initial bankruptcy of a bank or the devaluation of an asset, can lead to a global cascade of bankruptcies. They show that the model can be described in terms of a branching process. In particular, using the approximation that the network is a tree, they define a transfer matrix \(\Pi \), whose element \(\Pi _{ij}\) represents the probability that the failure of bank j alone triggers the failure of bank i:

$$\begin{aligned} \Pi _{ij} = \mathrm{prob}\left[ \sum _{a=1}^M Q_{ia} p_a(0)\left( 1 - {e}^{-\alpha Q_{ja}/\sum _k Q_{ka}} \right) > E_i \right] . \end{aligned}$$
(34)

The stability of the system as a function of the model parameters can then be assessed by studying the largest eigenvalue of \(\Pi \). In the paper, Caccioli et al. [47] provide results for networks in the bipartite Erdős–Rényi ensemble and, similarly to the case of counterparty default risk [7], they find a non-monotonic relation between average diversification and the probability of observing a global cascade. They also show the existence of a critical value of leverage below which, independently of network connectivity, the system is always stable with respect to the initial shock. This is shown in Fig. 4, which presents an illustration of the unstable region as a function of the average diversification of banks (i.e., the average degree of banks in the network of overlapping portfolios) and the leverage of banks, defined as the market value of a bank's investment portfolio divided by its equity. The figure refers to the same setting considered in [47], with a bipartite Erdős–Rényi network, under the assumption that all banks have the same leverage. The effect of heterogeneous degree distributions on this model is studied in Banwo et al. [64] by means of numerical simulations.

Fig. 4

Region of instability for the cascade model of overlapping portfolios for a bipartite Erdős–Rényi network. Within the red region the system displays global cascades
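The stability criterion behind Fig. 4 can be checked numerically along the following lines. In this sketch we draw a single bipartite Erdős–Rényi realization (one share per bank–asset link), use the common-leverage assumption to fix equities, evaluate the loss inside equation (34) deterministically, and test whether the largest eigenvalue of the resulting 0/1 matrix exceeds one; all parameter values are illustrative rather than those of [47].

```python
import numpy as np

rng = np.random.default_rng(1)

N, M = 100, 100                     # numbers of banks and assets (illustrative)
mu = 4                              # average diversification (assets per bank)
alpha, lam = 1.0, 20.0              # market impact parameter and common leverage

Q = (rng.random((N, M)) < mu / M).astype(float)   # bipartite ER holdings, one share per link
p0 = np.ones(M)
E = (Q @ p0) / lam                  # equity fixed by the common leverage lambda = A / E

shares = Q.sum(axis=0)
shares[shares == 0] = 1.0           # avoid division by zero for unheld assets
# price drop of asset a if bank j alone liquidates: 1 - exp(-alpha * Q_ja / sum_k Q_ka)
devaluation = 1.0 - np.exp(-alpha * Q / shares)
# loss[i, j] = sum_a Q_ia * p_a(0) * devaluation[j, a], the expression inside eq. (34)
loss = Q @ (p0[None, :] * devaluation).T

Pi = (loss > E[:, None]).astype(float)   # single-realization (indicator) version of Pi_ij
np.fill_diagonal(Pi, 0.0)

lambda_max = np.abs(np.linalg.eigvals(Pi)).max()
print("largest eigenvalue:", lambda_max, "-> unstable" if lambda_max > 1 else "-> stable")
```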

Leverage targeting

Caccioli et al. [47] also consider the effect of relaxing the assumption that banks are passive investors, and look at what happens if banks target their initial leverage over the dynamics. If a bank suffers a loss, its leverage goes up, so a rebalancing is needed to reduce risk. They consider the situation in which banks react to losses at time t by liquidating an amount

$$\begin{aligned} \Delta A_i(t) = \gamma A_i(t)\left( 1-\frac{\lambda _iE_i(t)}{A_i(t)}\right) \end{aligned}$$
(35)

of their investment. In the above formula, \(\lambda _i\) is the target leverage of bank i, while the parameter \(\gamma \in [0,1]\) determines how quickly the bank tries to reach its target. \(\Delta A_i(t)\) is the total value of the assets liquidated by the bank at time t. To know how many shares of a given asset a are sold, one divides by the number \(k_i\) of different assets in the bank's portfolio (it is assumed here that banks liquidate an equal value of each asset they hold) and by the current price of asset a. This leads to the following response function

$$\begin{aligned} f_{ia}\left[ Q_{ia}(t-1),A_i(t-1),E_i(t-1)\right] = {\left\{ \begin{array}{ll} Q_{ia}(t-1)-\gamma \frac{A_i(t-1)}{k_i p_a(t-1)}\left( 1-\frac{\lambda _i E_i(t-1)}{A_i(t-1)}\right) ,&{}\quad \mathrm{if} \quad E_i(t-1)\ge 0\\ 0,&{}\quad \mathrm{if} \quad E_i(t-1)< 0 \end{array}\right. }. \end{aligned}$$
(36)

They find that the attempt of banks to reduce their individual risk through preemptive liquidation ends up significantly widening the region of parameter space where global cascades can be observed.
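A single round of this leverage-targeting dynamics can be sketched as follows. The equity bookkeeping, the use of the exponential impact rule (33) for prices, and the synchronous update are our own illustrative choices; the response itself follows equations (35)–(36).

```python
import numpy as np

def leverage_targeting_step(Q, Q0, p, p0, E0, k, lam, gamma, alpha):
    """One round of leverage-targeting deleveraging, eqs. (35)-(36), with the
    exponential price impact of eq. (33). Conventions are illustrative.

    Q, Q0 : current and initial holdings (N, M); p, p0 : current and initial prices (M,)
    E0    : initial equities (N,); k : number of assets held by each bank (N,)
    lam   : target leverage; gamma : adjustment speed; alpha : impact parameter
    """
    A = Q @ p                                    # mark-to-market asset values A_i(t-1)
    E = E0 - (Q0 @ p0 - A)                       # equity after cumulative losses (assumption)
    failed = E < 0
    # eq. (35): value to liquidate, only when leverage exceeds the target
    sell_value = gamma * np.maximum(A - lam * E, 0.0)
    # eq. (36): spread the liquidated value evenly over the k_i assets held
    k_safe = np.maximum(k, 1)                    # guard against banks holding no assets
    dQ = np.where(Q > 0, (sell_value / k_safe)[:, None] / p[None, :], 0.0)
    Q_new = np.clip(Q - dQ, 0.0, None)
    Q_new[failed, :] = 0.0                       # defaulted banks liquidate everything
    # eq. (33): exponential (log-linear) market impact
    total0 = np.where(Q0.sum(axis=0) > 0, Q0.sum(axis=0), 1.0)
    liquidated = (Q0 - Q_new).sum(axis=0) / total0
    p_new = p0 * np.exp(-alpha * liquidated)
    return Q_new, p_new

# usage sketch: k = (Q0 > 0).sum(axis=1); iterate until the holdings stop changing
```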

A stress test framework based on shock propagation due to leverage targeting is the one proposed by Greenwood et al. [65]. In this case, banks estimate their losses at time t and then reduce the size of their investment to restore their initial leverage at time \(t+1\), which corresponds to the map given in equation (36) with \(\gamma =1\). Deleveraging causes prices to be devalued, and consequently mark-to-market losses for banks, which then need to deleverage further, and so on. The market impact function considered by [65] is a linear function like the one in equation (32). They try to disentangle the effect of different factors on the losses produced by fire sale contagion, showing that the contribution of a bank to aggregate deleveraging is higher the more connected, the bigger, the more leveraged, and the more exposed to the initial shock the bank is. They also introduce the concept of indirect vulnerability of a bank with respect to a given asset, measured by the loss of the bank's equity due to the deleveraging associated with the asset. This has to be contrasted with the notion of direct vulnerability, which is the direct exposure of the bank to the asset. They test the model on the 90 largest banks in the European Union for the period 2009–2011. Through a regression analysis on those banks that are publicly traded, they find that direct and indirect vulnerabilities have the same explanatory power on banks' returns.

The model of Greenwood et al. [65] is at the basis of the framework developed by Duarte and Eisenbach [66] to measure the aggregate vulnerability and systemic importance of banks. Duarte and Eisenbach [66] also show that their measure of aggregate vulnerability due to spillover effects grows well before the crisis of 2008, and they are able to disentangle the contributions to aggregate vulnerability of the increase in leverage, system size, and concentration of investments in illiquid assets.

Cont and Schaanning [67] consider a dynamic that is in between that of a passive investor and that of a leverage-targeting one. In their model, they account for the fact that there is usually a buffer between the leverage of a bank and the maximum leverage allowed by regulation, so that banks are not forced to liquidate their positions because of a relatively small loss. They then treat the bank as a passive investor until a loss makes the bank breach its leverage constraint. When this happens, the bank deleverages to reach its target, which is, however, below the maximum allowed by the regulatory regime (so that a small buffer is restored). They also distinguish between marketable securities, which can be liquidated and are subject to market impact, and illiquid assets, which are not marketable and therefore cannot be liquidated. Deleveraging only involves marketable securities.

Cont and Schaanning [67] consider data collected by the European Banking Authority on 51 European banks, and they compare the outcome of stress tests performed with their model with that of pure leverage targeting. For price changes, they consider both a linear and a square-root market impact. They find significant differences between the losses estimated with the two models. Quite interestingly, they also introduce a matrix of overlaps between the portfolios of different banks, where each asset is weighted by its liquidity, and they show that, although many financial institutions have zero overlap between their portfolios, they are all connected by second-order overlaps. This means that stress tests that do not account for second-round losses can significantly underestimate systemic risk.

Although there may be randomness in the shocks that hit the system, or in the construction of the network of overlapping portfolios, the dynamics described above are all deterministic. A stochastic dynamic is considered, for instance, in Corsi et al. [68]. They consider a system of N banks investing in m randomly chosen assets out of a universe of M possible assets, where m is computed as the optimal level of diversification for banks that maximize their profit subject to a VaR constraint, which is equivalent to leverage targeting [69]. The value of the assets is then the sum of a linear market impact term that depends on banks' trading and a stochastic component, which is in turn the sum of a common factor and an idiosyncratic component. As in the other models we discuss here, the relative composition of the portfolio does not change over time, but at each time step banks change the volume of their investment in the portfolio to maintain their target leverage. Corsi et al. [68] show that, upon increasing diversification, the system goes from a stable regime, where the time series of asset returns are stationary, to an unstable regime, where they are characterized by bubbles and bursts.

Empirical structure of interbank networks

There are many works that aim to characterize the structure of real-world financial networks. Measuring and analyzing the structure of financial networks has a twofold objective. On the one hand, knowledge of the structure of financial networks gives insights into how local risks would spread over the entire network through financial linkages. This class of studies aims to measure the systemic risk of particular network topologies that could emerge and remain unchanged for a certain period of time (i.e., static structure). On the other hand, since the topology changes over time, knowledge about the dynamical transition patterns could allow us to predict how systemic risk will evolve over time. In this section, we provide a brief review of these two lines of research, which study the static and dynamic structures of interbank networks, respectively.

Static structure

Interbank networks in different countries

Over the past decade, the topology of interbank networks has been examined in many countries. These studies include Boss et al. [30] for Austria, Upper and Worms [70] for Germany, Degryse et al. [71] for Belgium, van Lelyveld and Liedorp [72] for the Netherlands, Iori et al. [31] and Bargigli et al. [43] for Italy, Wells [73] and Langfield et al. [74] for the UK, Furfine [75] for the US, Cont et al. [32] for Brazil, Martínez-Jaramillo et al. [76] for Mexico, and Imakubo and Soejima [77] for Japan.

While bilateral transaction data are available in some countries,Footnote 10 in many countries aggregate balance-sheet data (e.g., the total amount of loans) are the only source of information on bilateral trades. In such cases, one needs to estimate the interbank network structure with a suitable estimation method. A widely used one is the maximum entropy (ME) method. The ME method estimates the network structure by maximizing the entropy of interbank linkages, which implies that the total interbank lending is distributed across all possible borrowers as evenly as possible. A disadvantage of the ME method is that the estimated network is likely to be much denser than the actual one. Mistrulli [78] argues that the ME method may over- or underestimate the risk of default contagion. To overcome this problem, more sophisticated methods have recently been proposed [79]. Anand et al. [80] compare the accuracy of several existing estimation methods by applying them to various empirical networks.
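In its simplest form, the ME estimate spreads each bank's total lending over all possible borrowers in proportion to their total borrowing; the usual complication is that a bank cannot lend to itself, so the zero-diagonal constraint is imposed and the marginals are then restored by iterative proportional fitting (a RAS-type correction). The sketch below illustrates this generic idea; it is not the exact procedure of any particular paper.

```python
import numpy as np

def max_entropy_estimate(assets, liabilities, n_iter=1000, tol=1e-12):
    """Estimate a bilateral exposure matrix from marginals (generic ME/IPF sketch).

    assets[i]      : total interbank lending of bank i   (target row sums)
    liabilities[j] : total interbank borrowing of bank j (target column sums)
    """
    a = np.asarray(assets, dtype=float)
    l = np.asarray(liabilities, dtype=float)
    X = np.outer(a, l) / l.sum()          # unconstrained ME solution: X_ij proportional to a_i * l_j
    np.fill_diagonal(X, 0.0)              # no self-lending
    for _ in range(n_iter):               # RAS / iterative proportional fitting
        X *= (a / np.maximum(X.sum(axis=1), 1e-300))[:, None]   # restore row sums
        X *= (l / np.maximum(X.sum(axis=0), 1e-300))[None, :]   # restore column sums
        if np.abs(X.sum(axis=1) - a).max() < tol:
            break
    return X

# toy example: three banks whose total interbank lending and borrowing are known
print(max_entropy_estimate([100.0, 50.0, 30.0], [80.0, 60.0, 40.0]))
```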

A more direct way to extract information on bilateral transactions is to use interbank payment data. Since payment flows contain information on interbank settlements and transfers, one can infer bilateral interbank loans from them. This approach is taken by, among others, Furfine [75], Demiralp et al. [81], and Imakubo and Soejima [77].

Core–periphery structure

It has been argued that the structure of interbank networks at a given point in time is best described as a core–periphery structure [82]. A core–periphery structure is formed by two groups: a core and a periphery. Core and peripheral nodes are distinguished as follows: the core forms a subgraph of the entire network in which nodes are densely connected to each other, while peripheral nodes are connected to core nodes but not to other peripheral nodes. This is expressed by a block adjacency matrix as

$$\begin{aligned} A = \left[ \begin{array}{ll} \mathbf {CC} \;&{} \mathbf {CP} \\ \mathbf {PC}&{} \mathbf {PP}\end{array}\right] \approx \left[ \begin{array}{ll} \mathbf {1} \;&{} \mathbf {CP} \\ \mathbf {PC}&{} \mathbf {0}\end{array}\right] , \end{aligned}$$
(37)

where \(\mathbf {CC}\) denotes the submatrix representing the connectivity among core nodes, and \(\mathbf {PC}\) represents the connectivity between peripheral and core nodes. \(\mathbf {PC}\) is the transpose of \(\mathbf {CP}\) since we consider undirected graphs. In general, detecting a core–periphery structure in a directed graph is a challenging problem, and most existing methods for core–periphery detection are developed for undirected graphs. In the pure core–periphery structure, we should have \(\mathbf {CC}=\mathbf {1}\) (i.e., the core is a complete graph) and \(\mathbf {PP}=\mathbf {0}\) (i.e., there is no link between peripheral nodes). Classifying all nodes into core and periphery is a nontrivial task, and a popular way to do this is to choose the core nodes so as to minimize the difference between the empirical adjacency matrix and the ideal core–periphery block matrix (37) [82,83,84].
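The logic of this fitting procedure can be conveyed by a very simple greedy search that flips node labels whenever doing so reduces the number of mismatches with the ideal pattern (37). The sketch below is only illustrative: published algorithms use more refined objective functions and optimization heuristics.

```python
import numpy as np

def cp_mismatch(A, core):
    """Mismatches between a 0/1 symmetric adjacency matrix A (zero diagonal) and the
    ideal pattern (37): core-core entries should be 1, periphery-periphery entries 0."""
    cc = core[:, None] & core[None, :]
    pp = ~core[:, None] & ~core[None, :]
    missing_core_links = ((A == 0) & cc).sum() - core.sum()   # exclude the diagonal
    forbidden_periphery_links = ((A == 1) & pp).sum()
    return missing_core_links + forbidden_periphery_links

def greedy_core_periphery(A, n_sweeps=20, seed=0):
    """Greedy label-flipping search for a core/periphery partition (illustrative)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    core = rng.random(n) < 0.5                  # random initial assignment
    for _ in range(n_sweeps):
        improved = False
        for i in rng.permutation(n):
            old = cp_mismatch(A, core)
            core[i] = ~core[i]                  # tentatively flip node i
            if cp_mismatch(A, core) < old:
                improved = True                 # keep the flip
            else:
                core[i] = ~core[i]              # revert
        if not improved:
            break
    return core                                 # boolean array: True = core node
```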

Empirical networks: core–periphery vs. bipartite structure

We summarize the works estimating the core–periphery structure of interbank networks in Table 1. These empirical studies differ in terms of their data and time scales. For example, Fricke and Lux [84] and Barucca and Lillo [85, 86] used data on interbank transactions, but the former studied quarterly aggregate networks while the latter analyzed daily networks. Imakubo and Soejima [77] studied transaction data extracted from interbank payment data, and the other studies are based on regulatory data reported by financial institutions to the financial authorities.

While many empirical works on core–periphery structure are essentially based on the standard detection method described above, a more flexible approach to detecting block structure, the stochastic block model (SBM) [87], has also been used. The SBM is a probabilistic model of random graphs with a flexible block structure: nodes are assigned to different blocks, and each pair of nodes is linked with a probability that depends on the nodes' blocks. This model can generate arbitrary block structures, such as core–periphery, modular, and bipartite structures. In fact, Barucca and Lillo [85, 86] employ this approach and find that the two-block structure that best represents the e-MID overnight money market is bipartite (i.e., borrowers and lenders) at the daily resolution.
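To see how the SBM nests both structures, the sketch below samples two-block networks from a 2×2 matrix of block-to-block link probabilities: a core–periphery pattern corresponds to a dense core block and an empty periphery block, while a bipartite (lender–borrower) pattern corresponds to dense between-block and sparse within-block connections. The probability values are arbitrary illustrations, not estimates from any data set.

```python
import numpy as np

def sample_two_block_sbm(n1, n2, P, seed=0):
    """Sample an undirected two-block stochastic block model.
    P is the 2x2 matrix of link probabilities between blocks."""
    rng = np.random.default_rng(seed)
    labels = np.array([0] * n1 + [1] * n2)
    prob = P[labels[:, None], labels[None, :]]        # link probability for each pair
    n = n1 + n2
    upper = np.triu(rng.random((n, n)) < prob, k=1)   # sample each pair once
    return (upper | upper.T).astype(int), labels

# core-periphery: dense core, moderate core-periphery, no periphery-periphery links
A_cp, _ = sample_two_block_sbm(10, 40, np.array([[0.9, 0.3], [0.3, 0.0]]))
# bipartite (e.g., lenders vs. borrowers): links essentially only between the two blocks
A_bip, _ = sample_two_block_sbm(25, 25, np.array([[0.02, 0.4], [0.4, 0.02]]))
```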

Even in the case of aggregate networks, the plausibility of the stylized core–periphery structure as a characteristic of interbank networks is debatable. As is evident from Eq. (37), previous works implicitly assumed that an empirical network consists of a single core block and a single peripheral block. This means that even if there were no standard core–periphery structure in the empirical network, the estimation method would still classify each node as either a core node or a peripheral node. In fact, recent studies argue that the seemingly core–periphery structure might simply come from a heterogeneous degree distribution, or might consist of two cores [88,89,90]. As an example, a visualization of Italian interbank networks aggregated over 10 business days is presented in Fig. 5. There appear to be densely connected core nodes at the center of the network, as well as (seemingly) peripheral nodes that are not linked to each other yet are connected to the core. However, in 2007, Italian banks and foreign banks seem to form two core-like groups, which would make it difficult to extract a stylized pure core–periphery structure. It might be more reasonable to infer that there are two core–periphery structures in the network [89].

Fig. 5

Visualization of Italian interbank networks, e-MID. The networks are aggregated over 10 business days. Red and black circles denote Italian and foreign banks, respectively. Visualization is done with python-igraph using the Kamada–Kawai algorithm [91]

Table 1 Empirical studies on the core–periphery structure in interbank networks

Interbank network dynamics at the daily scale

Why daily scale?

Although the majority of empirical works are based on static, aggregate networks, the granularity of the data can be crucial for recognizing the heterogeneous behavior of financial institutions and for capturing the functioning of a market at its inherent time scales. In interbank markets, most bilateral transactions are overnight, meaning that the relationship between two banks as lender and borrower lasts for at most one day, depending on the time at which the loan contract is made. In the Italian interbank market, for example, more than 86% of transactions were overnight loans in the period between 2000 and 2015 [21].

If the issue of interest is to understand the interconnectedness of financial risks, then an aggregate network of overnight lending relationships contains irrelevant information, because edges formed on different days do not in fact exist at the same time. Many researchers have studied aggregate networks not necessarily because they convey information about the interconnected risk structure, but rather because they reveal meaningful information about the structure of long-term relationships among banks. Another possible reason why aggregate networks attract attention is that daily networks can be much sparser and noisier than aggregate networks, and they appear to change their structure from day to day in a purely random manner [93, 94].

Daily network dynamics

Barucca and Lillo [85, 86] show that at the daily resolution, the structure of interbank networks in the e-MID market is not characterized by a core–periphery structure, but rather by a bipartite structure or more general community structures. Lowering the time resolution (e.g., to weeks or months) tends to increase the likelihood that a core–periphery structure is detected, yet a non-negligible fraction of such networks are still best characterized by a bipartite structure. Kobayashi and Takaguchi [21] reinforce this result by showing that the bipartivity of daily interbank networks has been increasing over the past decade.Footnote 11

Fig. 6

Scaling laws in the Italian overnight interbank networks. a Superlinear relationship between the numbers of nodes (N) and edges (M) between September 4, 2000 and December 31, 2015 (3922 business days). Each dot corresponds to a day. b Complementary cumulative distribution function (CCDF) of the transaction duration (in terms of the number of business days) of bank pairs in 2010–2015

If we regard interbank markets as dynamic systems in which the structure of bilateral exposures changes every day, an interesting question to ask is whether the daily dynamics are just random or whether there are robust, time-invariant properties. Kobayashi and Takaguchi [21] find that the daily market activity represented by the combination (N, M), where N and M are the numbers of active banks and edges, respectively, is strictly governed by a superlinear scaling relationship \(M\propto N^{1.5}\) (equivalently, the average degree grows as \(\sqrt{N}\)), independently of the structure and size of the daily networks (Fig. 6a). They also find several daily dynamical patterns in the e-MID market, such as a power-law distribution of transaction durations (Fig. 6b) and a tent-shaped distribution of weight growth. Interestingly, these properties are ubiquitous in social networks formed via human interactions, such as phone calls and face-to-face interactions [96,97,98].
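Scaling exponents of this kind are commonly estimated with an ordinary least-squares fit in log–log space. The following snippet uses synthetic (N, M) pairs as a stand-in for the daily e-MID observations, purely to illustrate the calculation.

```python
import numpy as np

rng = np.random.default_rng(2)

# synthetic daily observations: M grows as N^1.5 up to multiplicative noise
N_daily = rng.integers(30, 150, size=1000)
M_daily = np.rint(N_daily ** 1.5 * np.exp(rng.normal(0.0, 0.1, size=1000))).astype(int)

# OLS fit of log M against log N gives the scaling exponent
exponent, intercept = np.polyfit(np.log(N_daily), np.log(M_daily), 1)
print(f"estimated scaling exponent: {exponent:.2f}")   # close to 1.5 by construction
```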

Discussion

In this article, we reviewed recent works that study financial systemic risk using network approaches. While we tried to cover as many research topics as possible, our coverage is obviously not exhaustive. In particular, two important research areas that have not been discussed in the current article are the prediction and the control of systemic risk.

As in other fields of science, there are generally three steps for the study of systemic risk to mature as a scientific research field. The first step is to understand and model the mechanisms behind the real-world phenomena. The second is to forecast the future state of the system. The third and last step is to control the system to avoid the occurrence of undesired phenomena. Most of the studies discussed in the current article are still at the first stage. Researchers from various fields have been working together only since the late 2000s to develop models, such as the models of interbank default cascades through bilateral exposures and overlapping portfolios, that can be used to describe the real-world phenomena analytically.

In recent years, however, a growing number of researchers have been tackling the controllability of systemic risk by simulating possible policy tools that could be adopted by financial regulators. We still need further studies at every step to deepen our understanding of the complexity of financial networks and to reduce systemic risk. We hope that this review article will encourage researchers from various areas of science to join this challenging research field.