1 Introduction

The Smoluchowski equation was developed to describe the distribution of clusters containing specific numbers of particles formed through mass-conserving coagulation processes [1]. It has been used widely in many fields, with applications to colloids, aerosols, viruses, fish schools, and asteroids [2,3,4,5,6,7,8,9,10,11,12,13,14]. The approach was generalized by Miura et al. in [15] to analyze the formation of networks, with application to the Japanese firm network. Their model (the MTT model) encompasses three processes, node creation, annihilation, and coagulation, together with the preferential attachment rule [16, 17]: a new link is more likely to be attached to a node having a larger number of links, so that the probability that a new node connects to node i depends on the connectivity \(k_{i}^\lambda \) of that node. The third process is intended to represent mergers and acquisitions amongst firms. The behavior differs from the traditional Smoluchowski method since not only particles (i.e. nodes) but also links must be taken into account. Coagulation is studied from the viewpoint of the number of links (i.e. degree) connected to the nodes. The stationary state of the model exhibits a power law degree distribution with exponents that depend on the model parameters. This allows the model to replicate observations from the Japanese inter-firm transaction data, for which the distribution of the number of transaction partners follows a power law with a cumulative exponent \(1.3\pm 0.1\) (for the probability density, the exponent is \(2.3\pm 0.1\)) [18] (see Appendix A). The previous study was restricted to replicating the distribution of the network by choosing parameter values for both preferential attachment exponents (i.e. for a newcomer in the creation process and for an acquirer in the coagulation process) as well as for the occurrence probabilities of node creation, annihilation, and coagulation. 
Here, we broaden the scope to focus on the dynamics of the network formation and the phase transitions emerging from such dynamics.

In this study, we generalize the MTT model, discover two new types of long-time dynamics, and determine the locus of control parameters separating the different types of asymptotic dynamics. Our generalized MTT model includes a more detailed treatment of the merger process, in which we distinguish between a firm (i.e. node) acquiring another firm, that is, initiating a merger, and a firm being the target of the acquisition, i.e. being selected for merger. The generalized model has been validated by comparing empirical analysis results with simulation results (see [19] for details of the validation). The first process happens with a probability proportional to \(k_a^\alpha \), where \(k_a\) is the degree of the acquiring node. The acquiring node selects the target for acquisition with a probability proportional to \(k_t^\beta \), where \(k_t\) is the degree of the target. This represents a generalization of the original MTT model. We study this generalized process numerically and analytically using the Smoluchowski equation representing this process. To do so, we include a nonlinear coagulation effect by introducing a collision kernel proportional to \(k_a^\alpha k_t^\beta \) [19]. We plot a phase diagram with two axes: the vertical axis is the preferential attachment exponent for newcomers (\(\lambda \)), and the horizontal axis is the sum of the coagulation kernel’s exponents (\(\alpha +\beta \)). By observing the temporal behavior of the system at each point in this phase diagram through intensive numerical simulations, we quantify the dynamical properties by the maximum value of the difference between the sizes of the largest and the second largest nodes, denoted as Diff. According to this value, we divide the phase diagram into Type I and Type II regions, with marginal critical cases between them. Type I is the case where the value of Diff is small, as shown in Fig. 1b, and the size distribution is steady, as shown in Fig. 1a. In contrast, Type II is the case where Diff is very large, with the intermittent appearance of a giant degree node showing sawtooth oscillations, and the corresponding size distribution is not steady, as typically shown in Fig. 1e and f. The marginal cases, typically shown in Fig. 1c and d, have a value of Diff around \(10^3\). The regions of Type I and II are separated by the marginal critical cases. We apply the model parameter values obtained from the Japanese firm network data analysis [19] and find that the firm network has moved during the last few decades towards the Type II region, the region where a giant degree node will be formed.

2 Model Analysis

Time evolution of network growth is modeled by the following three processes which are chosen randomly at each time step.

Newcomer A new node with n links is added to the system. Each new link is connected to an already existing node chosen randomly according to the preferential attachment rule [16, 17] with the exponent \(\lambda \): an existing node of degree k receives a link with a probability proportional to \(k^\lambda \). That is, the probability \(\varPi \) that a new node will be connected to node i depends on the connectivity \(k_{i}^\lambda \) of that node, so that \(\varPi = k_{i}^\lambda / \sum _{j} k_{j}^\lambda = k_{i}^\lambda / ( N \langle k^{\lambda } \rangle ) \), where N is the total number of nodes, which fluctuates over time.

Annihilation A node, chosen randomly with uniform probability, is removed along with all links connected to this node.

Coagulation A pair of nodes, an acquiring node a and a target node t, is chosen randomly with a probability that depends on their link numbers and is proportional to the coagulation kernel, \( K(k_{a},k_{t}) \propto k_{a}^{\alpha } k_{t}^{\beta } \). Node t is absorbed by node a. In this process the \(k_t\) links emerging from the target node are transferred to the acquiring node. The merger (coagulation) process conserves the sum of the link numbers except for overlapping links, that is, multiple links between the same pair of nodes are not allowed within the simulation.
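The three processes above can be sketched in a few lines of Python. This is a minimal illustration and not the simulation code of [15, 19]: the function names (`pick`, `newcomer`, `annihilate`, `coagulate`, `step`) and the adjacency-set data structure are assumptions made for the example. The offset of one added to each weight is also an assumption here; it mirrors the offset used in the master equation of Sect. 3 so that isolated nodes can still be selected. The default of 4 links per newcomer follows the \(\delta _{k,4}\) term of that equation.

```python
import random

def pick(adj, exponent, rng, exclude=None):
    """Choose a node with probability proportional to degree**exponent + 1.
    The +1 offset (an assumption of this sketch) keeps isolated nodes selectable."""
    nodes = [v for v in adj if v != exclude]
    weights = [len(adj[v]) ** exponent + 1.0 for v in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]

def newcomer(adj, new_id, n_links, lam, rng):
    """Add a node and attach up to n_links links by preferential attachment."""
    adj[new_id] = set()
    if len(adj) == 1:          # nothing to attach to yet
        return
    for _ in range(n_links):
        target = pick(adj, lam, rng, exclude=new_id)
        adj[new_id].add(target)    # sets silently drop duplicate links
        adj[target].add(new_id)

def annihilate(adj, rng):
    """Remove a uniformly chosen node together with all of its links."""
    victim = rng.choice(list(adj))
    for nb in adj.pop(victim):
        adj[nb].discard(victim)

def coagulate(adj, alpha, beta, rng):
    """Merge target t into acquirer a; overlapping links collapse into one."""
    a = pick(adj, alpha, rng)
    t = pick(adj, beta, rng, exclude=a)
    for nb in adj.pop(t):
        adj[nb].discard(t)
        if nb != a:                # the former a-t link must not become a self-loop
            adj[nb].add(a)
            adj[a].add(nb)

def step(adj, next_id, rates, exps, rng, n_links=4):
    """Apply one randomly chosen process; return the next unused node id."""
    r_n, r_a, r_c = rates
    lam, alpha, beta = exps
    u = rng.random() * (r_n + r_a + r_c)
    if u < r_n:
        newcomer(adj, next_id, n_links, lam, rng)
        return next_id + 1
    if u < r_n + r_a:
        if adj:
            annihilate(adj, rng)
    elif len(adj) >= 2:
        coagulate(adj, alpha, beta, rng)
    return next_id
```

Because links are stored as sets, a newcomer that draws the same target twice ends up with fewer than `n_links` links, which is one possible reading of the rule that multiple links between the same nodes are not allowed.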

Fig. 1

a, c, e Cumulative distribution of link number k on logarithmic scales, for the parameter sets \((\lambda ,~\alpha ,~\beta )\) = (1.0,0.8,0.6) in a, (1.0,1.1,0.7) in c and (1.0,1.4,0.8) in e. The rates for each of the processes are \(r_{n} : r_{a} : r_{c}\) = 0.5 : 0.4 : 0.1. The black squares, pink circles, and light-green triangles indicate the distribution found after 150,000 steps, 250,000 steps, and 350,000 steps, respectively, in a and c, and after 210,000 steps, 240,000 steps, and 280,000 steps, respectively, in e. The dashed lines indicate the slope \( -1.3=-\gamma +1\). The distributions shown by the coloured symbols in a, c, and e correspond to the times marked by the vertical lines of the same colors in b, d, and f. b, d, f Simulated time dependence of the link number k corresponding to the distributions in a, c and e, respectively. The lines show the 1st (red), 2nd (green), 10th (blue), 100th (light blue) and 1,000th (gray) largest and the median (yellow) link number k

First, we study the model using Monte Carlo simulations (see Appendix B for the method). The initial state consists of 10,000 nodes with zero links. We then develop the network by applying the three processes, newcomer, annihilation and coagulation (merger and acquisition), with the rates \(r_n\), \(r_a\) and \(r_c\). Based on a comparison with the Japanese firm network in 2014 [19, 20], the rates are chosen as \(r_n : r_a : r_c\) = 0.5 : 0.4 : 0.1. The simulations are done as described in previous studies [15, 19]. We are particularly interested in the evolution of the degree distribution and the appearance of a gap between the link numbers of the high degree nodes. Figure 1 shows the salient results of the simulations for three different choices of the exponents, \((\lambda ,\alpha ,\beta )\) = (1.0,  0.8,  0.6), (1.0,  1.1,  0.7), and (1.0,  1.4,  0.8). In panels a and c the black squares, pink circles, and light-green triangles indicate the distribution found after \(1.5\times 10^5\), \(2.5\times 10^5\), and \(3.5\times 10^5\) time steps, respectively; in panel e they indicate the distribution after \(2.1\times 10^5\), \(2.4\times 10^5\), and \(2.8\times 10^5\) time steps, as stated in the caption of Fig. 1. Panels b, d and f show the corresponding time dependence (on a semi-log scale) of the number of links k for the nodes with the largest degree (red), second largest degree (green), 10th largest degree (blue), 100th largest degree (light blue) and 1,000th largest degree (gray), and for the median (yellow). We notice that as \(\alpha \) and \(\beta \) are increased, the temporal fluctuation becomes larger.
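The two kinds of quantities plotted in Fig. 1 can be extracted from a list of node degrees as sketched below. The helper names `ccdf` and `rank_snapshot` are hypothetical and introduced only for illustration; the paper does not specify how these quantities were computed.

```python
import statistics

def ccdf(degrees):
    """Cumulative distribution: (k, fraction of nodes with degree >= k)
    for each distinct degree k, in ascending order of k."""
    n = len(degrees)
    ordered = sorted(degrees)
    points = []
    for i, k in enumerate(ordered):
        if i == 0 or k != ordered[i - 1]:
            points.append((k, (n - i) / n))
    return points

def rank_snapshot(degrees, ranks=(1, 2, 10, 100, 1000)):
    """Degrees of the r-th largest nodes plus the median, for one time step.
    Ranks larger than the current number of nodes are skipped."""
    ordered = sorted(degrees, reverse=True)
    snap = {r: ordered[r - 1] for r in ranks if r <= len(ordered)}
    snap["median"] = statistics.median(ordered)
    return snap

# Toy usage with four nodes.
print(ccdf([1, 1, 2, 5]))                           # [(1, 1.0), (2, 0.5), (5, 0.25)]
print(rank_snapshot([3, 9, 1, 7], ranks=(1, 2, 10)))  # {1: 9, 2: 7, 'median': 5.0}
```

Nodes of degree zero appear in the output of `ccdf` but cannot be drawn on logarithmic axes, so a plot in the style of Fig. 1 would have to drop the k = 0 point.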

For the smaller values of \(\alpha \) and \(\beta \), the nodes of highest degree rank have link numbers of the same order of magnitude, as shown in panels a and b. We call this Type I behavior. A change occurs as \(\alpha \) and \(\beta \) exceed certain threshold values; this is shown in panels e and f. The time dependent degree distribution develops a form that we approximate below by a power law for the probability density, \(c_{k}(\tau ) \propto k^{-\gamma }\), with an approximate exponent \( \gamma =2.3\pm 0.1 \), where \(c_k(\tau )\) is the probability density of the number of nodes with k links. Concurrently, a significant gap appears intermittently between the number of links of the node with the highest degree and the link numbers of the other nodes of lower rank, see panel f in Fig. 1. We call this Type II behavior. A noticeable decrease occurs in the link numbers of the nodes ranked below the largest, while the largest rapidly increases its link number. In the case of Type II, the gap between the 1st and 2nd grows quickly, then stays at a certain level for a while, and the 1st disappears suddenly by annihilation. This categorization into Type I and II is found numerically for any initial condition after a brief transient state. This behavior will be discussed below in relation to Eq. (2).

3 Smoluchowski Equation

As the dynamics switches from Type I to Type II, the intermittent appearance of a node with giant degree after the appearance of a power-law degree distribution becomes evident. This is similar to the formation of the gel in the sol–gel phase transition, an aggregation phenomenon that has a long history in statistical physics. In aggregation systems with continuous injection and removal, a steady state obeying a robust power law distribution can be observed near the gelation point [4, 5, 8]. To better understand the nature of the change between the two behaviors, we therefore establish a master equation for the time evolution of \(c_{k}(\tau )\) in the spirit of Smoluchowski’s equation used to describe sol–gel phase transitions. The following distinct processes and their probabilities describe the dynamics:

Increase of the number of nodes with k links

  1. (i)

    \(\frac{(k-1)^{\lambda }+1}{\langle k^{\lambda } \rangle +1}c_{k-1}(\tau )\); for connecting one new link to a node with \(k-1\) links so that the resulting node holds k links.

  2. (ii)

    \( \delta _{k,4} \); for directly creating a node of 4 links (\(\delta \) denotes the Kronecker delta). This constant is derived from the ratio between the total number of transaction partners (c. 4.1 million) and the total number of firms (c. 1.1 million) observed in the real data of the Japanese firm network in 2014 [19].

  3. (iii)

    \((k+1)c_{k+1}(\tau )\); for annihilating a node which is connected to a node of \(k+1\) links, so that the latter is left with k links.

  4. (iv)

    \(\sum _{k'=0}^{k}\frac{k'^{\alpha }+1}{\langle k^{\alpha } \rangle +1 }c_{k'}(\tau ) \frac{(k-k')^{\beta }+1}{\langle k^{\beta }\rangle +1} c_{k-k'}(\tau )\); for merging a node of \(k'\) links as an acquirer and that of \(k-k'\) links as a target.

Decrease of the number of nodes with k links

  1. (v)

    \(\frac{k^{\lambda }+1}{ \langle k^{\lambda } \rangle +1 }c_{k}(\tau )\); for creating a node which connects to a node of k links.

  2. (vi)

    \(k c_{k}(\tau )\); for annihilating a node which is connected to a node of k links.

  3. (vii)

    \(c_{k}(\tau )\); for directly annihilating a node of k links.

  4. (viii)

    \(\sum _{k'=0}^{\infty } \frac{k'^{\alpha }+1}{\langle k^{\alpha } \rangle +1}c_{k'}(\tau ) \frac{k^{\beta }+1}{\langle k^{\beta } \rangle +1}c_{k}(\tau )\); for merging a node of \(k'\) links as an acquirer and that of k links as a target.

  5. (ix)

    \(\sum _{k'=0}^{\infty } \frac{k^{\alpha }+1}{\langle k^{\alpha }\rangle +1}c_{k}(\tau ) \frac{k'^{\beta }+1}{\langle k^{\beta } \rangle +1}c_{k'}(\tau )\); for merging between a node of k links as an acquirer and that of \(k'\) links as a target.

To allow isolated nodes (nodes of degree zero) to connect to newcomers and to participate in merger events, we have added one to the degree of the involved nodes in the numerator and denominator of the kernels describing these processes. Following the previous studies [15, 21], the resulting master equation for the time evolution of \(c_k\) is given by

$$\begin{aligned}&(\langle N \rangle +r_{n}-r_{a}-r_{c})c_{k}(\tau +1) - \langle N \rangle c_{k}(\tau )\nonumber \\&=r_{n} \Biggl [ \frac{ (k-1)^{\lambda }+1 }{ \langle k^{\lambda } \rangle +1 }c_{k-1}(\tau ) - \frac{ k^{\lambda }+1 }{ \langle k^{\lambda } \rangle +1 } c_{k}(\tau ) + \delta _{k,4} \Biggr ] \nonumber \\&\quad + \, r_{a} \Biggl [ (k+1)c_{k+1}(\tau ) - kc_{k}(\tau ) - c_{k}(\tau ) \Biggr ] \nonumber \\&\quad + \, r_{c} \Biggl [ \sum _{k'=0}^{k} \frac{k'^{\alpha }+1}{ \langle k^{\alpha } \rangle +1 } c_{k'}(\tau ) \frac{ (k-k')^{\beta }+1 }{ \langle k^{\beta } \rangle +1 } c_{k-k'}(\tau ) \nonumber \\&\quad - \sum _{k'=0}^{\infty } \frac{k'^{\alpha }+1}{ \langle k^{\alpha } \rangle +1 } c_{k'}(\tau ) \frac{k^{\beta }+1}{ \langle k^{\beta } \rangle +1 } c_{k}(\tau ) - \sum _{k'=0}^{\infty } \frac{ k^{\alpha }+1 }{ \langle k^{\alpha } \rangle +1 } c_{k}(\tau ) \frac{k'^{\beta }+1}{ \langle k^{\beta } \rangle +1 } c_{k'}(\tau ) \Biggr ]. \end{aligned}$$
(1)

Note that \(\langle N \rangle \) represents the mean (i.e. ensemble average) of the total number of nodes, \(\langle N \rangle = \sum _{s=1}^{L} N_{s}(\tau ) / L\), where \(N_s(\tau )\) is the total number of nodes of the s-th sample at time \(\tau \) and L is the total number of samples, so that \(\langle N \rangle c_{k}(\tau )\) is the mean number of nodes with degree k at time \(\tau \). Moreover, the mean net number of nodes added to (or, if negative, deleted from) the network per time step is \(r_{n} - r_{a} - r_{c}\). Therefore, the mean number of nodes with degree k after one step is \(\langle N \rangle c_{k}(\tau +1)+(r_{n} - r_{a} - r_{c})c_{k}(\tau +1)\), where \(c_{k}(\tau +1)\) is the updated value of \(c_{k}(\tau )\).
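As an illustration of how Eq. (1) can be iterated numerically, the sketch below performs one update \(c_k(\tau ) \rightarrow c_k(\tau +1)\) on a distribution truncated at a maximum degree. The function name `master_step`, the truncation (terms involving degrees beyond the last array index are dropped), and the requirement that the input is normalized to one are assumptions of this sketch, not part of the original analysis. Exponents are assumed non-negative.

```python
def master_step(c, n_mean, rates, exps, k_new=4):
    """One step of Eq. (1). c[k] is the fraction of nodes with degree k
    (k = 0 .. len(c)-1, assumed to sum to 1); n_mean plays the role of <N>."""
    r_n, r_a, r_c = rates
    lam, alpha, beta = exps
    size = len(c)

    def weighted(p):
        # (k^p + 1) / (<k^p> + 1) * c_k, with <k^p> = sum_k k^p c_k
        mean = sum((k ** p) * ck for k, ck in enumerate(c))
        return [((k ** p) + 1.0) / (mean + 1.0) * ck for k, ck in enumerate(c)]

    f_lam, f_a, f_b = weighted(lam), weighted(alpha), weighted(beta)
    sum_a, sum_b = sum(f_a), sum(f_b)

    updated = []
    for k in range(size):
        # newcomer bracket: terms (i), (v) and (ii)
        gain = f_lam[k - 1] if k >= 1 else 0.0
        new = gain - f_lam[k] + (1.0 if k == k_new else 0.0)
        # annihilation bracket: terms (iii), (vi) and (vii)
        upper = (k + 1) * c[k + 1] if k + 1 < size else 0.0
        ann = upper - k * c[k] - c[k]
        # coagulation bracket: terms (iv), (viii) and (ix)
        merged = sum(f_a[j] * f_b[k - j] for j in range(k + 1))
        coag = merged - sum_a * f_b[k] - f_a[k] * sum_b
        rhs = r_n * new + r_a * ann + r_c * coag
        updated.append((n_mean * c[k] + rhs) / (n_mean + r_n - r_a - r_c))
    return updated
```

When the support of `c` lies well below the truncation degree, the three brackets sum over k to 1, -1 and -1 respectively, so the update preserves the normalization of `c`; this gives a quick consistency check on any implementation of Eq. (1).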

We now describe the analytic criteria we use to determine the boundary between Type I and Type II dynamics. We follow previous studies [22,23,24] of the Smoluchowski coagulation equation in view of the sol–gel phase transition. The formation of the gel can be identified from the time dependence of the moments restricted by a given upper-bound link number Q, i.e. \(M_{1,Q}(\tau ) \equiv \sum _{k=1}^{Q} kc_{k}\). The gelation transition, i.e. the formation of a giant node, is determined by the behavior of the derivative \(\varLambda (Q) \equiv d{M}_{1,Q}/d\tau \). The gel phase appears when \(\varLambda (Q_1)>0\) while \(\varLambda (Q_2)<0\), where \(Q_1\) and \(Q_2\) denote the link numbers of the largest node and the second largest node, respectively [23] (see Appendix C for the strong finite-time divergence of the size of the node of maximum degree \(Q_{1}\), which is caused by aggregation between large nodes). For the theoretical evaluation of Eq. (1), we focus on the boundary between Type I and Type II dynamics and assume that the degree distribution follows a power law \( c_{k}(\tau ) = H(\tau )k^{-\gamma } \), based on findings from previous studies [3,4,5, 8, 22,23,24]. Furthermore, we assume stationarity of the averaged number of nodes by choosing \(r_n=1/2\) and \(r_a=1/2-r_c\). We then obtain the following estimate (see Appendix D for the detailed derivation).

$$\begin{aligned} \varLambda (Q)\approx & {} 1 - \frac{Q^{\lambda -\gamma +1}}{\langle k^{\lambda } \rangle +1} H(\tau ) + (1/2-r_{c}) \biggl [ H(\tau ) \bigl [ Q^{2-\gamma } + (1-\gamma )Q^{1-\gamma } \bigr ] - 2\langle k \rangle \biggr ] \nonumber \\&- 2r_{c} [H(\tau )]^{2} Q^{3+\alpha +\beta -2\gamma } \bar{B}. \end{aligned}$$
(2)

Here \(\bar{B} = \bigl [ B(\bar{\alpha }+1,\bar{\beta }+1) / \bar{\beta } \bigr ] + \bigl [ B(\bar{\beta }+1,\bar{\alpha }+1) / \bar{\alpha } \bigr ]\), where B denotes the beta function, \(\bar{\alpha } = \alpha -\gamma +2\) and \(\bar{\beta } = \beta -\gamma +2\). We identify the boundary of the transition between Type I and Type II as the locus of the exponents \(\lambda \) and \(\alpha +\beta \) for which the time derivative given in Eq. (2) can become negative in the limit of large Q. We note that in Eq. (2) \(\bar{B}\) is a constant and the normalization factor \(H(\tau )\) is essentially independent of the number of nodes for \(\gamma >2\). The sign of \(\varLambda (Q)\) for large Q is therefore determined by the exponents of Q in the terms on the right hand side of Eq. (2). We fix \(\gamma \) at \(2.3\pm 0.1\), as suggested by the simulations and by the value estimated from real data for the Japanese firm network [18]. Eq. (2) then yields the following two conditions for \(\varLambda (Q)\).

  1. (A)

    \(r_c=0\) \(\rightarrow \) \( \lambda -\gamma +1> 0 \rightarrow \lambda > 1.3 \pm 0.1 \),

  2. (B)

    \(1/2>r_c>0\) \(\rightarrow \) \((\lambda -\gamma +1 > 0)\) and \((3+\alpha +\beta -2\gamma> 0) \rightarrow (\lambda > 1.3 \pm 0.1)\) and \((\alpha +\beta > 1.6 \pm 0.2)\).
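The numbers in conditions (A) and (B) follow directly from the powers of Q in Eq. (2): \(\lambda -\gamma +1>0\) gives \(\lambda > \gamma -1\), and \(3+\alpha +\beta -2\gamma >0\) gives \(\alpha +\beta > 2\gamma -3\), where an uncertainty of 0.1 in \(\gamma \) doubles to 0.2 in the second threshold. The short sketch below reproduces this arithmetic and evaluates \(\bar{B}\); the function names are hypothetical, and the simple linear propagation of the uncertainty is an assumption of the example.

```python
import math

def q_exponents(lam, alpha, beta, gamma=2.3):
    """Powers of Q in the newcomer and coagulation terms of Eq. (2)."""
    return lam - gamma + 1.0, 3.0 + alpha + beta - 2.0 * gamma

def thresholds(gamma=2.3, dgamma=0.1):
    """Critical values (with linearly propagated uncertainty) at which
    the two exponents of Q change sign."""
    return (gamma - 1.0, dgamma), (2.0 * gamma - 3.0, 2.0 * dgamma)

def beta_fn(x, y):
    """Euler beta function B(x, y) for x, y > 0."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def b_bar(alpha, beta, gamma=2.3):
    """The constant B-bar of Eq. (2); requires nonzero alpha-bar and beta-bar
    greater than -1 so that the gamma functions are defined."""
    a_bar = alpha - gamma + 2.0
    b_bar_exp = beta - gamma + 2.0
    return (beta_fn(a_bar + 1.0, b_bar_exp + 1.0) / b_bar_exp
            + beta_fn(b_bar_exp + 1.0, a_bar + 1.0) / a_bar)

if __name__ == "__main__":
    print(thresholds())
    for params in [(1.0, 0.8, 0.6), (1.0, 1.1, 0.7), (1.0, 1.4, 0.8)]:
        print(params, q_exponents(*params), b_bar(params[1], params[2]))
```

For the three parameter sets used in Fig. 1 and \(\gamma = 2.3\), the coagulation exponent \(3+\alpha +\beta -2\gamma \) is approximately -0.2, 0.2 and 0.6, while the newcomer exponent \(\lambda -\gamma +1\) is approximately -0.3 in all three cases.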

We stress that this is the prediction for when a node of giant degree is able to form, and we take it as an estimate of the boundary between the two types of dynamics observed in the simulations. We check the validity of the analytic estimates (A) and (B) by means of phase diagrams as follows. To detect the onset of Type II dynamics, corresponding to the intermittent formation of a giant node, we measure the maximum difference between the first and second highest numbers of links, \(Q_1\) and \(Q_2\), during 500,000 Monte Carlo steps of the processes newcomer, annihilation and coagulation. We consider the following three cases. (a) \(r_{n}=0.5, r_{a}=0.49, r_{c}=0.01\), as a representative case in which the effect of coagulation is very weak and condition (A) should hold. (b) \(r_{n}=0.5, r_{a}=0.01, r_{c}=0.49\), as a representative case in which the effect of coagulation is very strong and condition (B) should hold. (c) \(r_{n}=0.5, r_{a}=0.4, r_{c}=0.1\), as a representative case that corresponds to the observed rates for Japanese business firms in the real world [19]. Each panel of Fig. 2 shows the phase diagram of the corresponding case as a colored heatmap of the difference \(Q_{1}-Q_{2}\), with the theoretical boundary given by the dashed lines.
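The measured quantity, the maximum of \(Q_1-Q_2\) over a run, can be computed from a sequence of degree snapshots as sketched below. The helper names `top_two` and `max_gap` are hypothetical, and how often snapshots are recorded during the 500,000 steps is left open here.

```python
def top_two(degrees):
    """Largest and second largest degree in one snapshot (0 if missing)."""
    first = second = 0
    for k in degrees:
        if k > first:
            first, second = k, first
        elif k > second:
            second = k
    return first, second

def max_gap(snapshots):
    """Maximum of Q1 - Q2 over all recorded snapshots of node degrees."""
    return max((q1 - q2 for q1, q2 in map(top_two, snapshots)), default=0)

# Toy usage: the gaps of the three snapshots are 1, 8 and 0.
print(max_gap([[3, 1, 2], [10, 2, 2], [5, 5]]))  # 8
```

A single pass per snapshot avoids sorting the full degree list, which matters when the gap is tracked at every step of a long run.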

Fig. 2

Maximum difference between the first and second highest number of links. The dashed lines show the estimate of the boundary for the onset of Type II dynamics given by conditions (A) and (B). The rates \(r_{n} : r_{a} : r_{c}\) are a 0.5 : 0.49 : 0.01, with the dashed line referring to (A), b 0.5 : 0.01 : 0.49, with the dashed lines referring to (B), and c 0.5 : 0.4 : 0.1, with the dashed lines referring to (B) and the black circles showing the parameters \( (\lambda ,~\alpha ,~\beta ) = (1.0,~0.8,~0.6) \), (1.0,  1.1,  0.7) , and (1.0,  1.4,  0.8) . The first two sets of values correspond to the Japanese firm network data for 1994 and 2014, and the last is a simple linear extrapolation to 2034 assuming steady growth of the exponents

We notice that when the rate for merger, or coagulation, is very low (Fig. 2, panel a), the phase boundary obtained from the analytic analysis of the time dependence of the moments coincides with the observed change in \(Q_1-Q_2\) in the simulations. For larger values of the rate \(r_c\) for merger events, the analytic prediction deviates quantitatively from the simulated boundary between the two types of dynamics. This is natural, since the Smoluchowski equation is a mean field approach and allowing mergers significantly increases the stochastic fluctuations in the dynamics.

4 Discussion

We have analytically and numerically studied the threshold condition for the formation of a giant component, or gelation, in a network subject to the three processes of creation, annihilation and merger.

Analysis of the Japanese firm network shows that the preferential attachment exponent \(\lambda \) remains constant at about 1, in contrast to the exponents \(\alpha \) and \(\beta \) describing the merger, which show a significant change with time. Specifically, analysis of the 1994 and 2014 data leads to the estimates \( (\lambda ,~\alpha ,~\beta ) = (1.0,~0.8,~0.6) \) and (1.0,  1.1,  0.7) , respectively [19]. This analysis also shows an acceleration in the number of firms with large numbers of links over the last twenty years. This corresponds to the growth of the exponents \(\alpha ,~\beta \) as a result of substantial merger and acquisition activities. Therefore, our theoretical analysis indicates that the dynamics of the inter-firm network in 1994 and 2014 is likely to be located near and at the threshold for the onset of giant node formation, respectively. Extrapolation of this trend, see Fig. 2, suggests that the ecosystem of Japanese firms is moving into the Type II region and that intermittent formation and disappearance of giant components is likely to upset the economy with increasing intensity in the near future. Such behavior may be consequential for the sustainability [25] and health of the Japanese economy. The results of our work, therefore, provide quantitatively measurable indicators that can aid assessment of the evolution and the stability of an economic ecosystem.

Conceptually, the appearance of giant degree nodes in evolving network systems may look similar to the study in [26], which showed analytically that giant nodes appear through the effect of stronger preferential attachment of newly added nodes in the limit of infinite time. By contrast, we add two more effects, coagulation of nodes and spontaneous annihilation, and find that for some parameter regions the effect of coagulation can be so strong that the link number of the largest degree node tends to diverge in a finite time. This corresponds to the physical phenomenon of gelation, which never occurs through preferential attachment alone.

Additionally, the sawtooth oscillation in Fig. 1f may appear similar to the studies by Ball et al. [27] and Matveev et al. [28], where steady oscillations are detected in aggregation systems with removal of clusters or fragmentation of clusters into monomers. In particular, Matveev et al. [28] obtained an analytical solution for the steady state cluster size distribution, with numerical confirmation of the relaxation of the size distribution to this steady-state form. However, both of these studies aim to evaluate the steady oscillatory behavior of the cluster size distribution, with a focus on the kinetics of irreversible aggregation and clustering phenomena in the Smoluchowski coagulation equation. In contrast, our study focuses on the sol–gel phase transition and on the impact (separately and interactively) of each of the fundamental processes of newcomers, annihilations, and coagulations, since these are the cornerstones of understanding business transaction networks and systems. As a result, our findings are unique and cannot be derived from these earlier studies. In addition, there is an important difference in our work: the oscillation period after gelation depends on the random event of the gel-node’s removal, so that the dynamics are nontrivial.