Economies with heterogeneous interacting learning agents

  • Simone Landini
  • Mauro Gallegati
  • Joseph E. Stiglitz
Regular Article


Economic agents differ from physical atoms because of their learning capability and memory, which lead to strategic behaviour. Economic agents learn how to interact and behave by modifying their behaviour when the economic environment changes. We show that business fluctuations are endogenously generated by the interaction of learning agents via the phenomenon of regenerative-coordination, i.e. agents choose a learning strategy which leads to a pair of output and price which feed back on learning, possibly modifying it. Mathematically, learning is modelled as a chemical reaction among different species of elements, while inferential analysis develops a combinatorial master equation, a technique which offers an alternative approach to modelling heterogeneous interacting learning agents.


Heterogeneous interacting ABM Learning Master equations 

JEL Classification

C5 C6 D83 E1 E3 

1 Introduction

The asymmetric information revolution challenged the economic profession to rebuild its macro analytical tools upon sound micro-foundations (Stiglitz 1973, 1975, 1976). Statistical Physics offers tools for analyzing systems with heterogeneous interacting agents through the master equation (ME) approach (Weidlich and Braun 1992; Foley 1994; Aoki 1996, 2002; Aoki and Yoshikawa 2006). Social Sciences have a different status, because they analyze “social atoms” (Buchanan 2007), i.e. agents who act strategically in an intelligent way. Statistical Physics tools such as the ME are certainly suitable but, to take care of the learning capabilities of social atoms, the Combinatorial (Chemical) ME (CME) is better suited, as we propose in this paper.

Our model is populated by many heterogeneous interacting agents: their behaviour generates aggregate (emergent) phenomena, from which they learn and to which they adapt. This produces two consequences: (1) because heterogeneity and interaction produce strong non-linearities, aggregation cannot be solved using the Representative Agent framework; (2) the individuals may learn to achieve a state of statistical equilibrium, according to which the market is balanced but the agents can be in disequilibrium.

Individuals follow simple rules, interact and learn. We model the reactions of other agents to an individual’s choice of actions. In the words of Kirman (2012): “we can let the agent learn about the rules that he uses and we can find out if our simple creatures can learn to be the sophisticated optimisers of economic theory.”

To take the first steps in this direction, we use a simplified version of the Greenwald and Stiglitz (1993) model with learning agents. Learning capabilities are represented by a set of rules that model learning behaviour concerning the output strategy, given the actual net worth of a firm and the market-price level. Alongside it, we introduce a method to obtain analytic solutions to models with heterogeneous interacting and learning agents by means of a CME (Nicolis and Prigogine 1977; Gardiner and Chaturvedi 1977; Gardiner 1985) for the distribution of firms on the behavioural state space (the learning rules: see Sect. 3).

Allowing for learning might lead to a phenomenon we call re-configurative learning: within a certain state of the market a rule (say \(j\)) becomes dominant (i.e. most of the agents adopt it) when a critical mass of the agents uses it because they find it the most profitable one. But that rule is the most profitable according to the previous market conditions: when most of the agents move toward the winning rule they produce, e.g., more, lowering the aggregate price. At the new price, rule \(j\) may no longer be “optimal” and agents start adopting a new rule, say \(i\). If this becomes dominant in its turn, aggregate output and price will be affected, causing another phase transition.1

All in all, we might say that the short term success of a strategy leads to its medium term failure because of the phase transitions produced by agents’ behaviour.

In this model, the population of firms is financially heterogeneous. This can be modelled by a ME, which describes the dynamics of the probability distribution of the population over states of financial soundness. The result is an analytic model for a dynamic estimator, together with its volatility, of the expected concentration of the heterogeneous class of firms in the system. This is found as the general solution of an ordinary differential equation (the macroscopic equation) for the expected value of the state distribution, which depends on the transition rates, involved in the ME to model (mean-field) interaction, and on the initial condition.2

Firms are characterized by a second kind of heterogeneity due to the learning rule. This can be modelled by means of a Combinatorial ME (CME). According to a metric which compares the profits of two learning rules at a time, a flow of firms from one rule to another is obtained. The solution the CME provides is a tool for distributing a volume of firms over a set of rules. The model provides a differential equation for the probability distribution of agents over a set of behavioural states, each characterised by a given learning rule, and some characteristic levels for the observables involved in the model. The ME provides an analytic model to describe the dynamic behaviour of a complex system whose constituents behave non-linearly: it is an inferential methodology which allows finding the estimator of the expected value of any transferable quantity in the system.

Even if one knew the equations of motion of all the observables characterising every single agent in the system, one would not be able to obtain an analytic solution if some non-linearity came into play and if the equations were coupled. The ME approach ends up with a small system of differential equations to model the drifting dynamics and the fluctuations spreading around it.

The paper is organised as follows. Section 2 describes the economic model. Section 3 develops the learning mechanism, the rules agents behave with and how they choose among them through time (the analytics of profit curves and their maximisation are developed in “Appendix A” and “Appendix B”, respectively). Section 4 deals with the CME to model learning at mean-field level in order to make inference from the aggregate simulation data and macro-dynamics. Section 5 comments on the ABM simulation results and the macro-dynamics inferred by using the CME set-up. Section 6 concludes.

2 The model

Our closed economy without Government is populated by \(I\) heterogeneous firms producing the same perishable good, \(Q\), using labour, \(l\), as the only input (provided by the \(I\) households at the given wage level \(w\)), according to a financially constrained production function (\(A\) is the net worth of the firm: see Delli Gatti et al. 2010), and by one bank which supplies the credit the firms demand, at the constant interest rate \(r\), and pays no interest on deposits.

At time \(t\) the firm \(i\) output is
$$\begin{aligned} Q(i,t)=\alpha (i,t)A(i,t)^{\beta }, \end{aligned}$$
where \(\beta \in (0,1)\) and \(\alpha \) is the “financial” parameter each firm determines and continuously updates through the learning mechanism of Sect. 3.
The demand for labour is,
$$\begin{aligned} N(i,t)=(\gamma Q(i,t))^{\delta }=\chi \alpha (i,t)^{\delta }A(i,t)^{\phi }, \end{aligned}$$
since \(Q(i,t)={N(i,t)^{1/\delta }}/\gamma \), where \(\chi =\gamma ^{\delta }, \phi =\beta \delta , \gamma \in (0,1)\) and \(\delta >0\).
Balance between aggregate output \(Q(t)=\sum {Q(i,t)} \) and the stock-flow consistent (Godley and Lavoie 2007) aggregate demand, \(wN(t)\), yields the market price, \(P\), at \(t+1\),
$$\begin{aligned} P(t+1)={wN(t)}/{Q(t)}. \end{aligned}$$
Because of demand informational imperfections, firm \(i\) is assumed to face an individual price, \(p\), obtained by applying a multiplicative idiosyncratic shock \(u(i,t+1)\) to the market price \(P(t+1)\),
$$\begin{aligned} p(i,t+1)=u(i,t+1)P(t+1), \end{aligned}$$
where \(u\) is uniformly distributed between \(0\) and \(2\), so that the expected price is \(E[p(i,t+1)]=P(t+1)\).
The individual wage bill and credit demand are, with \(\theta =w\chi \),
$$\begin{aligned} W(i,t)&= w N(i,t)=\theta \alpha (i,t)^{\delta }A(i,t)^{\phi } \end{aligned}$$
$$\begin{aligned} L(i,t)&= W(i,t)-A(i,t). \end{aligned}$$
The firm is self-financed (SF, or hedge, in Minsky’s jargon) if it can pay the wage bill fully with its own financial resources, \(A(i,t)\ge W(i,t)\); otherwise the firm is not self-financed (NSF, or speculative, in Minsky’s jargon), \(A(i,t)<W(i,t)\).
Firm’s profit is,
$$\begin{aligned} \Pi (i,t+1)=p(i,t+1)Q(i,t)-W(i,t)-rL(i,t), \end{aligned}$$
while its equity updates according to,
$$\begin{aligned} A(i,t+1)=A(i,t)+\Pi (i,t+1). \end{aligned}$$
Moreover, we assume that:
  • the firm goes bankrupt when \(A(i,t)\le 0\) or when \(Q(i,t)=0\);

  • the number of firms is constant, so that there is a one-to-one entry-exit mechanism, and the new entry \(a\) is randomly assigned within the range 0–20.
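The timing of the model can be summarised in a short simulation sketch of one period. It is only an illustration under stated assumptions: the parameter values are hypothetical, interest is charged only on positive credit demand (our reading of the bank paying no interest on deposits), the quantity drawn in the range 0–20 for a new entrant is taken to be its net worth, and only the condition \(A\le 0\) is used as the bankruptcy test.

```python
import random

# Illustrative parameter values (assumptions, not taken from the paper).
BETA, GAMMA, DELTA = 0.5, 0.5, 1.2
WAGE, RATE = 1.0, 0.05


def one_period(alpha, equity, rng):
    """Advance all firms by one period; returns (market price, new equities)."""
    # Output Q = alpha * A**beta and labour demand N = (gamma * Q)**delta.
    output = [a * A ** BETA for a, A in zip(alpha, equity)]
    labour = [(GAMMA * q) ** DELTA for q in output]
    # Market price: aggregate demand w*N over aggregate output Q.
    price = WAGE * sum(labour) / sum(output)
    new_equity = []
    for q, n, A in zip(output, labour, equity):
        p_i = rng.uniform(0.0, 2.0) * price      # idiosyncratic price shock
        wage_bill = WAGE * n                     # W = w * N
        credit = wage_bill - A                   # L = W - A
        # Assumption: interest is paid only when credit demand is positive.
        profit = p_i * q - wage_bill - RATE * max(credit, 0.0)
        A_next = A + profit                      # equity update
        if A_next <= 0.0:
            # Bankruptcy: one-to-one replacement (assumed: entrant net worth).
            A_next = rng.uniform(0.0, 20.0)
        new_equity.append(A_next)
    return price, new_equity
```

Calling `one_period` repeatedly, with `alpha` updated by the learning rules of Sect. 3, would give a stripped-down version of the data generating process.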

3 Learning

This is a model in which agents are supposed to learn. In particular, they set the value of \(\alpha \), the “financial” parameter of Eq. (1).3 We allow them to choose freely among a given set of 7 rules (of course the list is far from exhaustive and we set aside the phenomenon of learning to learn, but the 4 branches are exhaustive: Sargent 1993; Kirman 2011), whose reinforcement, or not, is given by own return on investment (profits) or by imitation, and to move from one rule to another without costs. Once a certain rule is adopted, i.e. the financial parameter is set, a firm produces and brings its output to the market, where aggregate demand meets aggregate supply, determining its price. Once the idiosyncratic shock is taken into account, the rate of change of profit is evaluated, corroborating or not previous decisions. If firms are satisfied with the pace of profits, they keep the rule, otherwise they shift to a new one.4 If there is more than one rule with the same expected equity, the firm chooses the simplest one.

We classify the 7 rules, \(\alpha [1]\)–\(\alpha [7]\), into 4 different branches:
  • Non-interactive without learning
    • Firms set a value which never updates even though the system changes: \(\alpha [1]\);

    • Firms set \(\alpha \) as a random variable: \(\alpha [2]\); it equals the previous period’s own \(\alpha \) plus a \(\pm 30\,\, \% \) change.

  • Non-interactive with learning
    • Firms set \(\alpha \) following a profit maximising rule: \(\alpha [3]\) (see “Appendix B”);

    • \(\alpha [4]\) average of the firm’s historical values \(\alpha (i,t-s)\) in the last \(\tau \) periods with positive profit.

  • Learning with global interaction
    • \(\alpha [5]\) average value \(\sum _i {{\alpha (i,t-1)}/I} \) over all the firms in the previous period; eventually firms copy it.

  • Learning with local interaction
    • The firm randomly chooses \(M\) firms from its neighbourhood, i.e. among those in the same condition (NSF or SF), collecting information about past period values \(\{\alpha (i_m ,t-1)\}\) and sets \(\alpha [6]\) to the average;

    • The firm looks at its own subgroup (NSF or SF) and uses the ratio of profit to equity to measure other firms’ performance. The firm calculates its own \(\alpha [7]\) as the average of the best performers’ parameter values.
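Some of the rules listed above can be sketched as small functions. This is a minimal illustration, not the authors' implementation: the data layout (dictionaries with `alpha`, `profit` and `equity` keys), the fallback value used when no past period had positive profit, and the number `top_n` of best performers are all hypothetical choices, since the text does not fix them.

```python
import random
from statistics import mean


def alpha_history(history, tau, fallback):
    """alpha[4]: mean of own alphas over the last tau periods with positive profit.

    history is a list of (alpha, profit) pairs, oldest first; fallback is
    returned when none of the last tau periods had positive profit (assumption).
    """
    good = [a for a, profit in history[-tau:] if profit > 0]
    return mean(good) if good else fallback


def alpha_global(prev_alphas):
    """alpha[5]: mean alpha over all firms in the previous period."""
    return mean(prev_alphas)


def alpha_local(peers, M, rng):
    """alpha[6]: mean alpha of M firms sampled from the same group (SF or NSF)."""
    sample = rng.sample(peers, min(M, len(peers)))
    return mean([f["alpha"] for f in sample])


def alpha_best(peers, top_n):
    """alpha[7]: mean alpha of the top_n peers ranked by profit over equity."""
    ranked = sorted(peers, key=lambda f: f["profit"] / f["equity"], reverse=True)
    return mean([f["alpha"] for f in ranked[:top_n]])
```

Here `peers` is meant to contain only firms in the same financial condition as the firm applying the rule, so the split into SF and NSF groups is left to the caller.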

Through learning, individual behaviour induces a mutation of the system when the interactions lead to some critical point. For instance, when a considerable group (a critical mass) of firms concentrates on a behavioural strategy, the system undergoes a phase transition. The market price is the driving force (pilot-quantity, in Physics) of the aggregate dynamics, because it embeds behavioural heterogeneity, \(\{\alpha (i,t)\}\), and endowments heterogeneity, \(\{A(i,t)\}\), of all agents.

This allows us to gain a deeper understanding of the notion of re-configuration. If a mass of firms is adopting a given behaviour (rule), what is a successful strategy becomes the winning strategy once a critical mass of firms adopts it. The convergence to the new output level changes the price, i.e. it destroys the environment which made the winning rule worth adopting, and a new one may enter the drama: in a way, the success of a rule destroys the success itself and a new “equilibrium” is ready to enter, i.e. it leads to a different configuration.5 Those first experimenting with a different strategy that improves production efficiency might achieve better performance, inducing other firms to do the same over time. Therefore, the system itself changes its “learning-induced” configuration, destroying the regularity it created in order to assume a new configuration.

4 The CME and the mean-field learning

Different species of agents live in the system, characterised by heterogeneity in endowments, w.r.t. the state of financial soundness \(\varsigma \in \Sigma =\left\{ {\varsigma _k :k\le S} \right\} \), and in behavioural strategies, \(\lambda \in \Lambda =\{\lambda _h :h\le K\}\). The product space \(\Xi =\Lambda \times \Sigma =\{\xi _{j}=\lambda _{h}\wedge \varsigma _{k}\}\) qualifies the species: \(x_i (\lambda _h \wedge \varsigma _k ;t)\equiv \xi _j \) means the \(i\)-th agent is a firm of the \(j\)-th species, being in the \(k\)-th state of financial soundness while adopting the \(h\)-th rule in scheduling output. The occupation number \(I_j (t)\) evaluates the concentration of the \(j\)-th species, i.e. how many firms belong to the \(j=j(h,k)\)-th state on \(\Xi \). Since the total number of firms is assumed to be constant, the vector \(\mathbf{I}(t)=\left( {I_1 (t),\ldots ,I_J (t)} \right) \), with \(J=KS\), gives the configuration of the system such that the total number of agents is conserved through time. The following sections develop a mean-field approach in a combinatorial (chemical) interaction framework to take care of the learning mechanism and to infer a model for the dynamics of species in the system over \(\Xi \).

By following an analogy with chemical reactions, a simplified description of combinatorial (chemical) interactions is introduced to model learning at aggregate level.

At a mean-field (i.e. aggregate) level, reactant \(L_h\) represents the species of those firms scheduling output according to the \(h\)-th rule while being in a given state of financial soundness. A ‘simple’ interaction between two species \(L_h\) and \(L_k\) is a reaction channel: \(L_h +L_k \equiv \rho _k (L_h )\) where \(L_h\) is here called the ‘effective reactant’ and \(L_k\) is the ‘virtual’ one.

At the level of social atoms the learning mechanism is a procedure of \(K\) steps, each of which tests a single output scheduling strategy \(\lambda _k\) along a small ‘test-period’ \([t+(k-1)dt,t+kdt)\): the sequence of all such periods is called the ‘learning period’ \([t,t+\Delta )\). The firm starts at \(t\) being \(i\in L_p \), i.e. scheduling output according to rule \(\lambda _p \), and at \(t+\Delta \) it ends up choosing to be \(i\in L_q \): if \(L_q =L_p \) then the firm has learnt that for the moment, among all the rules, it is better to maintain the rule it was behaving with; if \(L_q \ne L_p \) the firm has learnt it is better to change accordingly. The firm makes this decision after the learning mechanism has been completed, passing through \(K\) learning steps along the ‘learning period’: in this paper it coincides with a single time step in the ABM–DGP of Sect. 2.6

To describe the mechanism, assume we are at the \(k\)-th step and that a firm is temporarily found to be \(i\in L_h \). Now it interacts with the species of those behaving with \(\lambda _k\) before choosing to maintain its own rule for the next step or to switch it so that \(i\in L_k \). The same applies to mean-field reactants: interaction is between two species; this is here developed according to a chemical reaction representation to describe the learning mechanism as a ‘complex reaction’, that is an ordered sequence (or chain) of simple interactions between two species at a time with different concentrations.

The learning mechanism of Sect. 3 develops along the learning-period, partitioned into \(K\) sub-intervals \([t+(k-1)dt,t+kdt)\) of length \(dt\), each called the test-period for rule \(\lambda _k \): each effective reactant tests all the \(K\) rules, one after the other. Along the \(k\)-th test-period \(L_h\) (effective) interacts with \(L_k\) (virtual), and the event returns an outcome called the ‘product’: it might be \(L_h\) or \(L_k\) and, whichever it is, it becomes the effective reactant for the next step along \([t+kdt,t+(k+1)dt)\). Hence, along the interaction chain the effective reactant may change while the virtual reactant must change: this is because the firm may or may not temporarily switch its rule while testing all the behavioural possibilities before making the final decision at the end of the learning period. According to the chemical reaction formalism, \(L_h +L_k\) is therefore a simple interaction or a ‘learning-channel’, \(\rho _k (L_h )\), characterized by the virtual species to interact with at the \(k\)-th step. Therefore, the index \(k\) points to the output scheduling strategy \(\lambda _k \) defining \(L_k\) and it is also called the ‘degree of advancement’ in the interaction chain \(L_p +\{L_k \}\).7 The outcome of the learning mechanism is \(L_p +\{L_k \}\rightarrow L_p +L_q \): in case of maintenance it reads as \(L_p +L_q \), in case of switching it reads as \(2L_q \). In any case, before the end of the learning-period neither the observer nor the social atom knows what the outcome will be: while learning, the social atoms live in a sort of superposition of states but, at the end, one and only one outcome will be realised. So, the best one can do is to develop a probabilistic model to estimate the probabilities of the outcomes: \(w_\varsigma (p,q\ne p;t+\Delta )\) for the final decision to switch the strategy from \(\lambda _{p}\) to a different \(\lambda _q \) and \(w_\varsigma (p,q=p;t+\Delta )\) for maintenance of the previous strategy.
These probabilities need to take into account all the learning steps taken before making the decision, hence their specification depends on a probabilistic model for simple interactions, the simple fragments of the chain.
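The chain of simple interactions described above can be illustrated by sampling one learning path. This is a sketch under assumptions: rules are indexed from 0 to \(K-1\), the switching probabilities are passed in as a hypothetical matrix `r` (with `r[h][k]` the probability that the effective \(L_h\) switches to the virtual \(L_k\) at step \(k\)), and an effective reactant meeting its own species is assumed to stay where it is.

```python
import random


def sample_learning_path(p, r, rng):
    """Follow one learning chain that starts from rule p.

    Returns the list of effective reactants, from the initial one to the
    final decision, with one entry appended per test-period.
    """
    effective = p
    path = [p]
    for k in range(len(r)):              # the virtual reactant changes at every step
        if k != effective and rng.random() < r[effective][k]:
            effective = k                # temporary switch: L_h + L_k -> 2 L_k
        path.append(effective)           # otherwise maintenance: L_h + L_k
    return path
```

The last element of the returned path is the final decision \(L_q\); repeating the draw many times would give a Monte Carlo estimate of the probabilities of switching and maintenance.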

Since each simple interaction \(L_h +L_k \) in the chain has only two outcomes, a temporary switch or maintenance, it is described as a Bernoulli event
$$\begin{aligned} L_h +L_k \rightarrow \left\{ {\begin{array}{l} 2L_k : \Pr \{\rho _k (L_h )=L_k \}=r_{hk|k} \\ L_h +L_k : \Pr \{\rho _k (L_h )=L_h \}=r_{hh|k} \\ \end{array}} \right. \end{aligned}$$
where \(r_{hk|k}\) is the temporary switching probability in the interaction of \(L_h \) with \(L_k\) and \(r_{hh|k} =1-r_{hk|k} \) is the maintenance probability; a model to estimate these probabilities is developed in “Appendix C”. Note that \(r_{hk|k} \) concerns only the interaction of \(L_h\) with \(L_k \): when the virtual reactant changes into \(L_m \), \(r_{hm|m} \) applies instead. The probabilities in (9) concern only one interaction but, as noted, the learning mechanism is made of \(K=7\) interactions spanning the steps of the learning-period. Therefore, two vectors of probabilities can be estimated starting with a generic \(L_p :\, (r_{p1|1} ,r_{p2|2} ,r_{p3|3} ,r_{p4|4} ,r_{p5|5} ,r_{p6|6} ,r_{p7|7} )\) for switching and \((r_{pp|1} ,r_{pp|2} ,r_{pp|3} ,r_{pp|4} ,r_{pp|5} ,r_{pp|6} ,r_{pp|7} )\) for maintenance. Since \(p\le K\), two matrices are obtained, as described in “Appendix C”: \(\mathbf{W}_\varsigma ^s (t)=\{r_{hk|k} (t+kdt):h,k\le K\}\) for switching and \(\mathbf{W}_\varsigma ^m (t)=\{r_{hh|k} (t+kdt):h,k\le K\}\) for maintenance probabilities. Moreover, these probabilities are dynamic because the probabilities in (9) depend on profits as the pilot-quantity, obtained from the ABM–DGP or from an analytic model, if one is available. That is: at each step in the learning mechanism, the probability \(r_{hk|k}\) for an effective reactant \(L_h \) to switch \(\lambda _h\) into \(\lambda _k \), so becoming \(L_k \), increases the more the profit realised by the virtual reactant \(L_k\), \(\Pi (\lambda _k |\varsigma ; t+(k-1)dt)\), exceeds the profit the effective reactant realised before, \(\Pi (\lambda _h |\varsigma ; t+(k-1)dt)\).
The switching and maintenance probabilities in the \(k\)-th interaction depend on the profit differential \(\Pi (\lambda _h |\varsigma ; t+(k-1)dt)-\Pi (\lambda _k |\varsigma ; t+(k-1)dt)\): if it is positive, the maintenance probability is greater than the switching probability; if it is negative, the opposite happens; if it is zero, this gives an indifference probability, \(r_{hh|k} =r_{hk|k} =1/2\). These probabilities are used to estimate \(w_\varsigma (p,q\ne p;t+\Delta )\) and \(w_\varsigma (p,q=p;t+\Delta )\) in a matrix \({\mathbf{W}_{\varsigma }}(t+\Delta )\), specific for within-transitions among behavioural rules given a state of financial fragility, see “Appendix C”.
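The model for these probabilities is given in “Appendix C” and is not reproduced here. Purely as an illustration, the sketch below assumes a logistic form, which has the two properties stated in the text: the switching probability rises with the profit of the virtual reactant relative to that of the effective one, and it equals \(1/2\) at indifference. The function names and the `sensitivity` parameter are hypothetical.

```python
import math


def switch_probability(profit_h, profit_k, sensitivity=1.0):
    """Probability that effective L_h switches to virtual L_k (assumed logistic)."""
    gap = sensitivity * (profit_k - profit_h)
    gap = max(-700.0, min(700.0, gap))   # keep math.exp within floating-point range
    return 1.0 / (1.0 + math.exp(-gap))


def maintain_probability(profit_h, profit_k, sensitivity=1.0):
    """Complementary probability of keeping rule h in the same interaction."""
    return 1.0 - switch_probability(profit_h, profit_k, sensitivity)
```

Any other increasing function of the profit gap that passes through \(1/2\) at zero would serve the same illustrative purpose.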

As an example, assume \(L_p +\{L_k \}=L_p +L_7 \) is realised according to the path of \(K=7\) steps shown in Fig. 1: (a) \(L_p +L_1 =2L_1 \), (b) \(L_1 +L_2=L_1 +L_2 \), (c) \(L_1 +L_3 =2L_3 \), (d) \(L_3 +L_4 =L_3 +L_4 \), (e) \(L_3 +L_5 =2L_5 \), (f) \(L_5 +L_6 =L_5 +L_6 \) and (g) \(L_5 +L_7 =2L_7 \). The final outcome is reached through a sample path made of four temporary switching events (\(a,c,e,g\)) and three maintenance events (\(b,d,f\)). One might think that its probability is \(r_{p1|1} r_{11|2} r_{13|3} r_{33|4} r_{35|5} r_{55|6} r_{57|7} \), but this is not correct: indeed, this is just one of the many possible paths connecting \(L_p \) to \(L_7 \). Therefore, since nobody knows which learning path a social atom is following,8 to find \(w_\varsigma (p,7;t+\Delta )\) one should consider all the feasible paths the learning process might follow to become \(L_7\) from being \(L_p \): the probability is therefore given by (50) of “Appendix C”.

Consider another example: the aim is to estimate the probability \(w_\varsigma (p,2;t+\Delta )\) to become \(L_2\) at \(t+\Delta \) being \(L_p\) at \(t\); whatever \(L_p\) is, all the possible paths leading to \(L_2\) are considered.

Following the paths \((L_p +L_1 =2L_1 )-(L_1 +L_2 =L_1 +L_2 )-(L_1 +L_k =\cdots )\) and \((L_p +L_1 =L_p +L_1 )-(L_p +L_2 =L_p +L_2 )-(L_p +L_k =\cdots )\) the product \(L_2\) cannot be reached: in both paths, when \(L_2\) is met it is rejected and the chain proceeds further, hence these paths do not contribute to the estimation of \(w_\varsigma (p,2;t+\Delta )\). On the other hand, \((L_p +L_1 =2L_1 )-(L_1 +L_2 =2L_2 )-(L_2 +L_k =\cdots )\) and \((L_p +L_1 =L_p +L_1 )-(L_p +L_2 =2L_2 )-(L_2 +L_k =\cdots )\) are paths in which \(L_2\) is adopted when it is met and must then be maintained. Accordingly, the probability is \(w(p,2)=H_{p,2} \left[ {r_{pp|1} r_{p2|2} +r_{p1|1} r_{12|2} } \right] \prod _{k\ge 3} {r_{22|k} } \) as shown in (45) of “Appendix C”, where the constraint excludes those paths not contributing to the estimation.9
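Summing over all feasible paths need not be done by hand: the probabilities of the final decision can be obtained by propagating the distribution of the effective reactant through the \(K\) steps. The sketch below is an illustration under assumptions: rules are indexed from 0, `r[h][k]` is a hypothetical switching matrix, an effective reactant meeting its own species stays put, and the constraint factor \(H_{p,2}\) of “Appendix C” is not modelled (it is treated as 1).

```python
def final_decision_probabilities(p, r):
    """Distribution of the final rule after the whole learning chain, starting from p."""
    K = len(r)
    dist = [0.0] * K
    dist[p] = 1.0
    for k in range(K):                   # step k: the virtual reactant is L_k
        new = [0.0] * K
        for h, mass in enumerate(dist):
            if mass == 0.0:
                continue
            if h == k:
                new[h] += mass           # meeting one's own species: no change
            else:
                new[k] += mass * r[h][k]           # temporary switch to L_k
                new[h] += mass * (1.0 - r[h][k])   # maintenance of L_h
        dist = new
    return dist
```

With the paper's rule 2 stored at index 1 and a starting rule different from rules 1 and 2, the entry at index 1 reproduces the bracketed sum over the two contributing paths times the product of the later maintenance probabilities.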

A configuration the agents realise over the space \(\Xi =\Lambda \times \Sigma \) is a vector \(\mathbf{I}(t)=\mathbf{y}\) whose components \(I(\lambda _h ,\varsigma _k ,t)=y_j \) count how many firms are scheduling output according to \(\lambda _h \) while being in the state \(\varsigma _k \): in the present model the total number of firms is conserved, \(\mathbf{I}(t)\cdot \mathbf{1}_J =I\, \forall t\). By conditioning on the states of financial fragility there can be found \(S\) sub-systems whose configurations \(\mathbf{I}_\varsigma (t)=\left\{ {I(\lambda ,t|\varsigma ):\lambda \in \Lambda } \right\} \) on \(\Lambda \) can assume microstates like \(\mathbf{n}=(n_1 ,\ldots ,n_k ,\ldots ,n_K )\) so that the following constraint holds
$$\begin{aligned} \sum _{k\le K} {I(\lambda _k ,t|\varsigma )=I(\varsigma ,t)\,\quad \forall \varsigma \in \Sigma \Rightarrow \sum _{\varsigma \in \Sigma } {I(\varsigma ,t)=I\,\quad \forall t} } \end{aligned}$$
\(\mathbf{I}_\varsigma (t)=\mathbf{n}\) states that the configuration of the \(\varsigma \)-system on \(\Lambda \) has been realised in a given microstate; each component \(I(\lambda _k ,t|\varsigma )=n_k \) in \(\mathbf{I}_\varsigma (t)=\mathbf{n}\) is therefore relative to a specific state rule in \(\Lambda \) and it counts how many firms in the state of financial soundness \(\varsigma \in \Sigma \) are scheduling their output according to \(\lambda _k \in \Lambda \) at time \(t\): it is the realised concentration of the \(k\)-th species among the total volume \(I(\varsigma ,t)\) of firms in the same financial soundness state; this concentration changes through time both due to the change in \(I(\varsigma ,t)\), strictly tied to the economic environment (driven by the market price pilot-quantity), and due to the learning activities agents perform by interactions (driven by the profitability pilot-quantity).
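The bookkeeping behind configurations and conditional microstates can be made concrete with a small counting sketch. The data layout (a list of dictionaries with `rule` and `state` keys) is a hypothetical choice made only for illustration.

```python
from collections import Counter


def configuration(firms, rules, states):
    """Occupation numbers I(rule, state) for every cell of the joint state space."""
    counts = Counter((f["rule"], f["state"]) for f in firms)
    return {(h, s): counts.get((h, s), 0) for h in rules for s in states}


def conditional_configuration(config, state, rules):
    """Microstate n = (n_1, ..., n_K) of the sub-system in a given financial state."""
    return [config[(h, state)] for h in rules]
```

Summing a conditional microstate gives the volume of firms in that financial state, and summing these volumes over all states returns the conserved total number of firms.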

Mean-field interactions have been described in Sect. 4 to represent the learning mechanism in mean-field terms. Interactions10 are now developed in terms of what is known in the literature as Combinatorial Kinetics.11 Learning essentially implies moving on \(\Lambda \) so that concentrations change, therefore transformation probabilities are transition probabilities over \(\Lambda \) due to interactions between reactants.

To develop the general model it is worth beginning with an elementary case. Consider the simple fragment \(L_h +L_k \) of the interaction chain: this is a bi-molecular interaction where \(L_h \) is the effective and \(L_k \) is the virtual interacting species. This interaction involves \(n_h^k =n_h +n_k \) agents; therefore, at the end, the values of \(n_h \) and \(n_k \) may be different due to switching events, hence the configuration may change, but their sum will remain the same: this is the so-called ‘stoichiometric constraint’. The ‘stenographic equation’ for this interaction is
$$\begin{aligned} s_h^k L_h \mathop \leftrightarrow \limits _{H_k^- }^{H_k^+ } r_h^k L_k \end{aligned}$$
where \(s_h^k \) is the concentration of the effective reactant \(L_h \) activated in the interaction and \(r_h^k \) concerns \(L_k \); moreover, \(H_k^\pm \) are the so-called ‘rate constants’: they are indexed by \(k\) because the interaction is labelled by the virtual reactant \(L_k\), “+” means a direct interaction “\(\rightarrow \)” and “–” means the inverse “\(\leftarrow \)”. In this expression the direct interaction “\(\rightarrow \)” is for “switching” events, and the inverse “\(\leftarrow \)” is for “maintenance” events.
The master equation associated with (11) considers both inflows and outflows of agents during the direct and the inverse interactions, hence it is represented as the sum of two net in-out flows.
$$\begin{aligned} \frac{\partial P(n_h ,n_k ,t)}{\partial t}&= H_k^+ \left[ {\frac{(n_h +s_h^k )!}{n_h !}P(n_h +s_h^k ,n_k -r_h^k ,t)-\frac{n_h !}{(n_h -s_h^k )!}P(n_h ,n_k ,t)} \right] \nonumber \\&\quad +\,H_k^- \left[ {\frac{(n_k +r_h^k )!}{n_k !}P(n_h -s_h^k ,n_k +r_h^k ,t)-\frac{n_k !}{(n_k -r_h^k )!}P(n_h ,n_k ,t)} \right] \end{aligned}$$
In general, there are \(C\) interaction channels through which \(K\) species can interact with one another. A reaction channel involving all the species is therefore represented with the following multivariate stenographic equation
$$\begin{aligned} \sum _{k\le K} {s_k^c L_k \mathop {\mathop \leftrightarrow \limits _{H_c^- } }\limits ^{H_c^+ } \sum _{k\le K} {r_k^c L_k } }\, \quad \hbox {or}\,\quad \mathbf{s}^{c}\cdot \mathbf{L}\leftrightarrow \mathbf{r}^{c}\cdot \mathbf{L} \qquad \forall c\le C \end{aligned}$$
where all the species are involved at the same time and \(\mathbf{s}^{c}=(s_1^c ,\ldots ,s_k^c ,\ldots ,s_K^c ), \mathbf{r}^{c}=(r_1^c ,\ldots ,r_k^c ,\ldots ,r_K^c )\) and \(\mathbf{L}=(L_1 ,\ldots ,L_k ,\ldots ,L_K )\) being \(s_k^c \) and \(r_k^c \) portions of the \(L_k \)-concentration, i.e. portions of the \(n_k \) agents activated as reactants and products in the \(c\)-th interaction.
Define the microstate-change vector associated to a specific interaction step
$$\begin{aligned} \mathbf{u}^{c}=\mathbf{r}^{c}-\mathbf{s}^{c}\Rightarrow \left\{ {\begin{array}{l} \mathbf{n}\rightarrow \mathbf{n}+\mathbf{u}^{c}\,\, \textit{forward} \\ \mathbf{n}\rightarrow \mathbf{n}-\mathbf{u}^{c}\,\, \textit{backward} \\ \end{array}} \right. \end{aligned}$$
The forward/backward flows obey the following interaction-specific combinatorial transition rates
$$\begin{aligned} T_c^+ (\mathbf{n};t)=H_c^+ \prod _{k\le K} {\frac{n_k !}{(n_k -s_k^c )!}}, \quad T_c^- (\mathbf{n};t)=H_c^- \prod _{k\le K} {\frac{n_k !}{(n_k -r_k^c )!}} \end{aligned}$$
because transformations or behavioural state transitions are proportional to the number of ways interactions can realise.12 It is now worth considering that the economic model described so far, as well as the learning mechanism involved, both concern a set of \(C=K\) interaction channels like those in (9). Accordingly, taking care of all the interaction channels, the compact form combinatorial master equation is
$$\begin{aligned}&\frac{\partial P(\mathbf{n},t)}{\partial t}\nonumber \\&\quad =\sum _{h\le K} {\sum _{\theta =0,1} {(2\theta \!-\!1)} \left[ {T_h^- (\mathbf{n}+\theta \mathbf{u}^{h};t)P(\mathbf{n}+\theta \mathbf{u}^{h},t)+T_h^+ (\mathbf{n}-\theta \mathbf{u}^{h};t)P(\mathbf{n}-\theta \mathbf{u}^{h},t)} \right] }\nonumber \\ \end{aligned}$$
where transition rates obey (15) according to the following spin-like quantity
$$\begin{aligned} \theta =\left\{ {\begin{array}{l} 0: \hbox {outflow} \\ 1: \hbox {inflow} \\ \end{array}} \right. \end{aligned}$$
In case of inflows (\(\theta =1\)) the square bracket terms in (16) give the “backward and forward” advancements respectively, in the case of outflows (\(\theta =0\)) the same terms read as “forward and backward” advancements: that is, the net flows in the direct and inverse reactions. Note that the combinatorial equation (16) is consistent with the specific nature of the multi-species interactions like (13) where direct and inverse interactions are involved together with their in-out flows.
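The combinatorial transition rates (15) and the right-hand side of (16) can be sketched for a system small enough to enumerate. This is an illustration under assumptions: the probability \(P(\mathbf{n},t)\) is stored in a hypothetical dictionary keyed by configurations, each channel is a tuple of rate constants and stoichiometric vectors, and the rate constants used in any example are arbitrary.

```python
from math import perm


def rate(H, n, stoich):
    """H * prod_k n_k! / (n_k - c_k)!; zero when a configuration is not feasible."""
    out = float(H)
    for n_k, c_k in zip(n, stoich):
        if n_k < 0:
            return 0.0
        out *= perm(n_k, c_k)            # perm(n, c) = n!/(n-c)!, and 0 if c > n
    return out


def master_rhs(P, n, channels):
    """Time derivative of P(n, t); channels is a list of (H_plus, H_minus, s, r)."""
    total = 0.0
    for H_plus, H_minus, s, r in channels:
        u = tuple(rk - sk for rk, sk in zip(r, s))        # microstate-change vector
        n_plus = tuple(nk + uk for nk, uk in zip(n, u))
        n_minus = tuple(nk - uk for nk, uk in zip(n, u))
        # Inflows (theta = 1): backward from n + u and forward from n - u.
        inflow = (rate(H_minus, n_plus, r) * P.get(n_plus, 0.0)
                  + rate(H_plus, n_minus, s) * P.get(n_minus, 0.0))
        # Outflows (theta = 0): backward and forward jumps leaving n.
        outflow = (rate(H_minus, n, r) + rate(H_plus, n, s)) * P.get(n, 0.0)
        total += inflow - outflow
    return total
```

Because every outflow from one configuration is an inflow to another, the derivatives summed over all configurations vanish, which gives a simple consistency check on the implementation.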
The stationary solution to (16) follows from \(\partial _t P(\mathbf{n},t)=0\) and balancing backward/forward inflows with forward/backward outflows in each interaction channel
$$\begin{aligned} T_h^\mp (\mathbf{n}\pm \mathbf{u}^{h};t)P_e (\mathbf{n}\pm \mathbf{u}^{h};t)=T_h^\pm (\mathbf{n};t)P_e (\mathbf{n};t) \quad \forall h\le K \end{aligned}$$
hence it is a detailed balance condition13 and, as known, it can be used to evaluate the probability for a given configuration realisation. That is, if \(\mathbf{n}^{h}=\mathbf{n}^{h-1}+\upsilon _h \mathbf{u}^{h}\) then
$$\begin{aligned} \mathbf{n}_{t+\Delta } =\mathbf{n}_t +\sum _{h\le K} {\upsilon _h \mathbf{u}^{h}} : \upsilon _h \in \mathrm{Z} \end{aligned}$$
where \(\mathbf{I}_\varsigma (\tau )=\mathbf{n}_\tau \) and \(\mathrm{Z}\) is a set of relative integers bounded for the “degree of advancement” \(\upsilon _h \) to fulfil (13)14.
According to (14) and (17), transition rates specify as follows
$$\begin{aligned}&T_h^+ (\mathbf{n}-\theta \mathbf{u}^{h};t)=H_h^+ \prod _{k\le K} {\frac{(n_k -\theta u_k^h )!}{(n_k -\theta u_k^h -s_k^h )!}}\nonumber \\&T_h^- (\mathbf{n}+\theta \mathbf{u}^{h};t)=H_h^- \prod _{k\le K} {\frac{(n_k +\theta u_k^h )!}{(n_k +\theta u_k^h -r_k^h )!}} \end{aligned}$$
If \(\mathbf{s}^{h-1}\ne \mathbf{s}^{h}\) and \(\mathbf{r}^{h-1}\ne \mathbf{r}^{h}\), which is reasonable, and \(\mathbf{u}^{h-1}=\mathbf{u}^{h}=\mathbf{u}\), then, according to Gardiner (1985), from (18) to (21) detailed balance gives a unique stationary solution of (16) if
$$\begin{aligned} \prod _{h\le K} {\frac{T_h^\pm (\mathbf{n};t)}{T_h^\mp (\mathbf{n}+ \mathbf{u}^{h};t)}} =\prod _{h\le K} {\frac{T_h^\pm (\mathbf{n};t)}{T_h^\mp (\mathbf{n}+ \mathbf{u}^{h};t)}} \end{aligned}$$
for any advancement path that might be taken in (19).
Therefore, by setting \(\partial _t P(\mathbf{n},t)=0\) and applying (20) in (18), the stationary solution is found to be multivariate Poisson
$$\begin{aligned} P_e (\mathbf{n})=\prod _{h\le K} {e^{-m_h }\frac{m_h ^{n_h }}{n_h !} : m_h =\left\langle {n_h } \right\rangle } \end{aligned}$$
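A minimal numerical sketch of (22) follows. It also checks the detailed balance relation (18) on a deliberately simple, assumed example: a single exchange channel with unit stoichiometry, \(\mathbf{u}=(-1,+1)\), rates \(H^+ n_1 \) and \(H^- n_2 \), and means chosen so that \(H^+ m_1 =H^- m_2 \). These choices are illustrative and are not taken from the paper.

```python
from math import exp, factorial


def multivariate_poisson_pmf(n, m):
    """Product of independent Poisson probabilities, as in (22)."""
    p = 1.0
    for n_h, m_h in zip(n, m):
        p *= exp(-m_h) * m_h ** n_h / factorial(n_h)
    return p


# Assumed exchange channel: one unit moves from species 1 to species 2.
H_plus, H_minus = 2.0, 1.0
m = [1.0, 2.0]           # chosen so that H_plus * m[0] == H_minus * m[1]
n = [3, 1]
n_shifted = [2, 2]       # n + u with u = (-1, +1)

# Detailed balance (18), upper signs: T^-(n+u) P(n+u) = T^+(n) P(n)
lhs = H_minus * n_shifted[1] * multivariate_poisson_pmf(n_shifted, m)
rhs = H_plus * n[0] * multivariate_poisson_pmf(n, m)
```

Under these assumptions both sides equal \(2e^{-3}\), so the Poisson form balances the flows across the channel.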
The general solution of (16) can be found using the Poisson representation developed by Gardiner and Chaturvedi (1977) and Gardiner (1985): following statistical-mechanic reasoning on the canonical ensemble, the authors show that the technique expands \(P(\mathbf{n},t)\) in Poisson distributions along each coordinate and a Fokker–Planck equation can be obtained to approximate the combinatorial master equation. In the present framework, the expected value \(m_h =\left\langle {n_h } \right\rangle \) plays the same role the macroscopic equation plays in the system size expansion due to van Kampen (2007) and developed by Aoki (1996) as the macroeconomic equation: the next section provides a model for this expectation.
The transition matrix \(\mathbf{W}_\varsigma (t)\) contains the dynamic local transition probabilities from \(\lambda _p \) to \(\lambda _q \) on \(\Lambda \), given \(\varsigma \in \Sigma \), at time \(t\). Assuming, plausibly, that the Markov property and time homogeneity hold, let \(\tau _k \) be the time during which a firm keeps the same rule, that is, the holding time in the species \(L_k \). It is then known that:
$$\begin{aligned} \Pr \left\{ { \tau _k <\Delta } \right\} =1-e^{-z_\varsigma (k,t)\Delta }=z_\varsigma (k,t)\Delta +o(\Delta ) \end{aligned}$$
where \(z_\varsigma (k,t)=\sum _{h\ne k} {w_\varsigma (k,h,t)} \), the rates \(w_\varsigma (p,q,t)\) being known as described in “Appendix C”. Accordingly,
$$\begin{aligned} \left\{ {\begin{array}{l} \Pr \left\{ { L_q ,t+\Delta |L_p ,t } \right\} =w_\varsigma (p,q,t)\Delta +o(\Delta ) \\ \Pr \left\{ { L_p ,t+\Delta |L_p ,t } \right\} =1-z_\varsigma (p,t)\Delta +o(\Delta ) \\ \end{array}} \right. \end{aligned}$$
are the transition and permanence probabilities along \([t,t+\Delta )\) as \(\Delta \rightarrow 0\). Let now \(h_\varsigma (p,q,t)=\Pr \left\{ { L_q ,t|L_p ,0 } \right\} \) be the probability of finding at \(t\) an \(L_q \) firm which was an \(L_p \) firm at time zero. The problem is to specify a stochastic dynamic model for the probabilities \(h_\varsigma (p,q,t)\)
$$\begin{aligned} \frac{dh_\varsigma (p,q,t)}{dt}=\mathop {\lim }\limits _{\Delta \rightarrow 0} \frac{h_\varsigma (p,q,t+\Delta )-h_\varsigma (p,q,t)}{\Delta } \end{aligned}$$
By using the definition of \(h_\varsigma (p,q,t)\) together with (24), since \(h_\varsigma (p,q,t+\Delta )=\sum _k {h_\varsigma (p,k,t)} \Pr \left\{ { L_q ,t+\Delta |L_k ,t } \right\} \), a forward Kolmogorov equation is found, see Feller (1966),
$$\begin{aligned} \frac{dh_\varsigma (p,q,t)}{dt}=-z_\varsigma (q,t)h_\varsigma (p,q,t)+\sum _{k\ne q} {w_\varsigma (k,q,t)} h_\varsigma (p,k,t) \end{aligned}$$
which also reads as a master equation, see Aoki (1996).
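The small-\(\Delta \) relations (23) and (24) can be checked numerically with a short sketch. The rate matrix `w` below is a hypothetical three-rule example (entry `w[p][q]` is the rate from \(L_p \) to \(L_q \)), and the exit rate is taken as the sum of the off-diagonal rates of a row, in line with (27).

```python
from math import exp


def exit_rate(w, p):
    """Total rate of leaving species p: sum of w[p][q] over q != p."""
    return sum(w[p][q] for q in range(len(w)) if q != p)


def holding_time_cdf(z, delta):
    """Pr{tau < delta} for an exponential holding time with rate z, as in (23)."""
    return 1.0 - exp(-z * delta)


def one_step_row(w, p, delta):
    """First-order transition and permanence probabilities from p, as in (24)."""
    row = [w[p][q] * delta for q in range(len(w))]
    row[p] = 1.0 - exit_rate(w, p) * delta  # permanence probability
    return row


# Hypothetical local transition rates among three rules.
w = [[0.0, 0.2, 0.1],
     [0.3, 0.0, 0.4],
     [0.5, 0.1, 0.0]]
delta = 0.01
z0 = exit_rate(w, 0)
row0 = one_step_row(w, 0, delta)
# Difference between the exact CDF and its linearisation: the o(delta) term.
gap = abs(holding_time_cdf(z0, delta) - z0 * delta)
```

The first-order probabilities in `row0` sum to one, and `gap` is of order \(\Delta ^2\), as the \(o(\Delta )\) term requires.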
By knowing local transition rates \(w_\varsigma (p,q,t)\) and \(z_\varsigma (k,t)\) the following generator is defined
$$\begin{aligned} g_\varsigma (p,q,t)=\left\{ {\begin{array}{l} -z_\varsigma (p,t)=-\sum _{q\ne p} {w_\varsigma (p,q,t)} :p=q \\ w_\varsigma (p,q,t):p\ne q \\ \end{array}} \right. \end{aligned}$$
in the matrix \(\mathbf{G}_\varsigma (t)\). Therefore, the master equation for the transition probability matrix is
$$\begin{aligned} {\dot{\mathbf{H}}}_\varsigma (t)=\mathbf{H}_\varsigma (t)\mathbf{G}_\varsigma (t) : \mathbf{H}_\varsigma (0)=\mathbf{H}_\varsigma ^0 \end{aligned}$$
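A minimal sketch of how the generator in (27) can be assembled from local transition rates is given below. The rate values are hypothetical, and the function name `generator` is an illustrative choice.

```python
def generator(w):
    """Build G from rates w: off-diagonal g(p,q) = w(p,q), diagonal g(p,p) = -z(p)."""
    K = len(w)
    G = [[0.0] * K for _ in range(K)]
    for p in range(K):
        for q in range(K):
            if p != q:
                G[p][q] = w[p][q]
        # The diagonal entry is minus the total exit rate from p.
        G[p][p] = -sum(G[p][q] for q in range(K) if q != p)
    return G


# Hypothetical local transition rates among three rules.
w = [[0.0, 0.2, 0.1],
     [0.3, 0.0, 0.4],
     [0.5, 0.1, 0.0]]
G = generator(w)
row_sums = [sum(row) for row in G]
```

Each row of `G` sums to zero, which is what keeps the rows of \(\mathbf{H}_\varsigma (t)\) summing to one as (28) is integrated.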
“Ergodic” estimates (footnote 15) of the transition matrices \({\hat{\mathbf{W}}}_\varsigma \) and \({\hat{\mathbf{G}}}_\varsigma \) are obtained as time averages and used to set up a dynamic model; hence (28) gives
$$\begin{aligned} {\dot{\mathbf{H}}}_\varsigma (t)=\mathbf{H}_\varsigma (t){\hat{\mathbf{G}}}_\varsigma :\mathbf{H}_\varsigma (0)=\mathbf{H}_\varsigma ^{0} \Rightarrow {\hat{\mathbf{H}}}_\varsigma (t)=\mathbf{H}_\varsigma ^{0} \cdot \exp (t\cdot {\hat{\mathbf{G}}}_\varsigma ) \end{aligned}$$
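The closed form in (29) can be sketched in pure Python. The matrix exponential below uses scaling and squaring with a truncated Taylor series, which is one standard numerical choice and not necessarily the one used by the authors; the two-state generator, the identity initial condition and all function names are assumptions made for the example.

```python
from math import exp


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def identity(K):
    return [[1.0 if i == j else 0.0 for j in range(K)] for i in range(K)]


def expm(A, squarings=10, terms=20):
    """exp(A) by scaling and squaring with a truncated Taylor series."""
    scale = 2.0 ** squarings
    S = [[a / scale for a in row] for row in A]
    result = identity(len(A))
    term = identity(len(A))
    for j in range(1, terms + 1):
        term = [[x / j for x in row] for row in matmul(term, S)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def transition_matrix(H0, G_hat, t):
    """H(t) = H0 * exp(t * G_hat), as in (29)."""
    tG = [[t * g for g in row] for row in G_hat]
    return matmul(H0, expm(tG))


# Hypothetical two-state generator with a known closed-form solution.
a, b, t = 0.3, 0.1, 5.0
G_hat = [[-a, a], [b, -b]]
H_t = transition_matrix(identity(2), G_hat, t)
closed_form_00 = b / (a + b) + a / (a + b) * exp(-(a + b) * t)
```

For a two-state chain the entry \(h(1,1,t)\) has the closed form used above, which gives a simple check on the numerical exponential.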
As the system is assumed to be asymptotically large enough, the following convergence can be assumed
$$\begin{aligned} p(\lambda _h |\varsigma ,t)=\frac{n_h }{\hat{{I}}(\varsigma ,t)}\mathop {\longrightarrow }\limits ^{I\rightarrow \infty }\mathop {\lim }\limits _{\Delta \rightarrow 0} \Pr \left\{ {I(\lambda _h ,t+\Delta |\varsigma )} \right\} \quad \forall h\le K \end{aligned}$$
where \(\hat{{I}}(\varsigma ,t)\) is the ABM–DGP aggregate for the total number of firms in the financial fragility state \(\varsigma \in \Sigma \) and \(n_h \) is the realisation along the \(h\)-th coordinate in (19) for the configuration \(\mathbf{I}_\varsigma (t)=\left\{ {I(\lambda ,t|\varsigma ):\lambda \in \Lambda } \right\} \). Let \({\hat{\mathbf{I}}}_\varsigma (0)=\mathbf{n}_0 \) be the initial vector of species concentrations drawn from the ABM–DGP simulation. From (30), the initial probability vector is \(\mathbf{p}_\varsigma ^0 =(n_{1,0} /\sum _k {n_{k,0}} ,\ldots , n_{K,0} /\sum _k {n_{k,0} } )=\mathbf{n}_0 /\hat{{I}}(\varsigma ,0)\), where \(\hat{{I}}(\varsigma ,0)\) is the total number of firms in \(\varsigma \in \Sigma \) at \(t=0\). Therefore, the state probability follows from (29): \({\hat{\mathbf{p}}}_\varsigma (t)={\hat{\mathbf{H}}}_\varsigma (t)\cdot \mathbf{p}_\varsigma ^0 \). Finally, the time-indexed estimate of (22) is
$$\begin{aligned} {\hat{\mathbf{m}}}_\varsigma (t)={\hat{\mathbf{p}}}_\varsigma (t)\cdot \hat{{I}}(\varsigma ,t)={\hat{\mathbf{H}}}_\varsigma (t)\cdot \mathbf{p}_\varsigma ^0 \cdot \hat{{I}}(\varsigma ,t)=\mathbf{H}_\varsigma ^0 \cdot \exp (t\cdot {\hat{\mathbf{G}}}_\varsigma )\cdot \mathbf{p}_\varsigma ^0 \cdot \hat{{I}}(\varsigma ,t) \end{aligned}$$
which nests the ABM–DGP aggregate outcome into the CME stationary solution (22).
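The estimator (31) can be sketched as follows, under two assumptions that are not stated in the paper: the transition matrix at time \(t\) is supplied as a row-stochastic matrix, and the initial shares are treated as a row vector multiplying it from the left. The numerical values and function names are hypothetical.

```python
def initial_shares(n0):
    """p^0 = n_0 / sum(n_0): initial shares of firms across rules."""
    total = sum(n0)
    if total <= 0:
        raise ValueError("the initial configuration must contain firms")
    return [n / total for n in n0]


def propagate(p0, H_t):
    """State probabilities at t, reading the product as the row vector p0 times H(t)."""
    K = len(p0)
    return [sum(p0[k] * H_t[k][q] for k in range(K)) for q in range(K)]


def expected_concentrations(n0, H_t, I_t):
    """m(t) = p(t) * I(t): expected number of firms on each rule, as in (31)."""
    return [p * I_t for p in propagate(initial_shares(n0), H_t)]


# Hypothetical inputs: three rules, 100 firms initially, 120 firms at time t.
n0 = [30, 50, 20]
H_t = [[0.8, 0.1, 0.1],
       [0.2, 0.7, 0.1],
       [0.1, 0.3, 0.6]]
m_t = expected_concentrations(n0, H_t, I_t=120)
```

By construction the expected concentrations add up to the total number of firms at \(t\), which is how the ABM–DGP aggregate enters the estimate.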

5 Dynamics and re-generative learning using ACE and CME

Once heterogeneous and interacting agents are introduced, two different approaches have been developed in the ABM literature. One, called ACE (Agent Computational Economics), is based on computer simulation (see Tesfatsion and Judd 2006; Delli Gatti et al. 2010); the other, ASHIA (Analytic Systems with Heterogeneous Interacting Agents), derives from statistical physics and analyzes economic agents as interacting intelligent atoms (recently Alfarano et al. 2005; Di Guilmi et al. 2011). A new strand goes further by introducing the mechanism of learning: this paper is a first step in that direction.

In the following, we describe the dynamics of the AB system, the proximate cause-effect relations and the links between fluctuations and learning; moreover, using the analytical tools of Sect. 4, we show that it is possible to dispense with millions of equations thanks to the meso-foundation provided by the CME.

Figure 3 shows aggregate time series from the ABM–DGP with \(N=1{,}000\) firms over \(T=1{,}000\) periods, estimated with 50 Montecarlo runs and the following parameter values: \(A(i,0)\mathop {\longrightarrow }\limits ^{iid}U(0,20), \beta =0.5, \gamma =0.8, \delta =1.4, w = 0.8\) and \(r=5\,\% \).

On average, the economy is populated with almost equal shares of NSF and SF firms, with small volatility through time. This stability is the signature of the statistical equilibrium (Foley 1994). Even though their share is equivalent to that of SF firms, NSF firms hold about 30 % of total equity but realise about 65 % of total output and more than 60 % of total profit (footnote 16). We believe there are at least three causes for this:
  • NSF firms perform well in learning because they have to be quite aggressive in production in their search for high-performing strategies;

  • there exist implicit bankruptcy costs;

  • the empirical evidence shows that growth rates are Laplace distributed because firms of different sizes behave differently: smaller firms’ growth (or decline) is fat-tail distributed.

Since increasing output is the only way to increase profits, and making profits is the only way to increase equity and so improve the state of financial soundness, it turns out that NSF firms are much more active than SF ones because they fear going bankrupt.
Define now a regime as a sub-period characterized by a given dominant configuration and a phase as a sub-period along which a quantity \(Z\) is found in a subset of states with a specific qualitative meaning (e.g. expansion or recession) (footnote 17). Accordingly, the system can face both regime and phase transitions as emergent phenomena. Moreover, since individual behaviour is complex (involving heterogeneity, interaction and learning), and owing to the external field effects (market price and profitability), phases that are synchronized with some regimes along certain sub-periods can be found in different combinations along other sub-periods (Fig. 2).
Fig. 2

Montecarlo simulation (50 runs of the ABM–DGP): HP-filtered aggregate quantities
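Since the aggregate series are HP-filtered, a minimal pure-Python sketch of the Hodrick–Prescott trend may help fix ideas. The smoothing value `lam=1600.0` is the conventional choice for quarterly data and is an assumption here (the paper does not report the value used), as are the function name and the dense linear solve, which is adequate only for short series.

```python
def hp_filter(y, lam=1600.0):
    """Trend minimising sum (y - tau)^2 + lam * sum (second differences of tau)^2."""
    n = len(y)
    if n < 3:
        return [float(v) for v in y]
    # Build A = I + lam * D'D, with D the (n-2) x n second-difference operator.
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1.0
    d = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for a in range(3):
            for b in range(3):
                A[r + a][r + b] += lam * d[a] * d[b]
    # Solve A tau = y by Gaussian elimination with partial pivoting.
    M = [row[:] + [float(v)] for row, v in zip(A, y)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[piv] = M[piv], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            if f != 0.0:
                for j in range(c, n + 1):
                    M[i][j] -= f * M[c][j]
    tau = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * tau[j] for j in range(i + 1, n))
        tau[i] = s / M[i][i]
    return tau


# A linear series has zero second differences, so its trend is the series itself.
y = [2.0 * i + 1.0 for i in range(20)]
trend = hp_filter(y)
```

The cyclical component is then the difference between the series and its trend.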

For instance, Fig. 3 shows that two expansion phases (when output is increasing beyond the upper confidence band) can be found to be synchronized with different dominant configurations \(C\) and \(C^{\prime }\) along different sub-periods \(\tau \) and \(\tau ^{\prime }\). The reason for this is the dynamic change of the system structure at the individual level: a configuration that is dominant in expansion can be found to be dominant in recession too, because the economic conditions are different.
Fig. 3

NSF and SF diffusion-dominance (red-stairs) and output-phases (blue-line). Horizontal lines: the time average and confidence bands about the mean (\(\pm \)SD)
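The band-based reading of phases can be sketched as a simple classifier. It assumes that a phase is identified by comparing the level of a series with its time average plus or minus a multiple of its standard deviation; the use of the population standard deviation, the label "normal" for values inside the bands and the function name are illustrative assumptions.

```python
from statistics import mean, pstdev


def classify_phases(series, width=1.0):
    """Label each observation by its position relative to mean +/- width * SD."""
    mu = mean(series)
    sd = pstdev(series)  # population SD; the sample SD would be another choice
    upper, lower = mu + width * sd, mu - width * sd
    phases = []
    for x in series:
        if x > upper:
            phases.append("expansion")
        elif x < lower:
            phases.append("recession")
        else:
            phases.append("normal")
    return phases


# Hypothetical output series with one peak and one trough.
output = [1, 1, 1, 1, 5, 1, 1, 1, 1, -3]
phases = classify_phases(output)
```

Setting `width=0.5` would give narrower bands, so more observations would be classified as expansion or recession.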

Aggregate states of the system matter, as do the effects of the environment, but learning agents change their behaviour through time. Therefore the same subset of conditions assumes different relevance: if a subset of firms found it convenient to behave in a certain way when they were NSF, the same behaviour is not expected to be equally convenient when they are SF.

According to Fig. 3, SF firms change frequently but choose between only two configurations (3764215, 7364215), which are almost the same configuration except for a switch in the first two rules, while NSF firms choose among seven configurations (2573416, 2573461, 257416, 2574361, 5273416, 5274316, 5274361): hence, we may say that the NSF sub-system is more active.
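A configuration string of this kind can be built by ordering rule codes from the most to the least diffused. The sketch below assumes that ties are broken by the lower rule code, which the paper does not specify; the firm counts are hypothetical and were chosen only so that the result reproduces one of the SF configurations quoted above.

```python
def dominant_configuration(counts):
    """Order rule codes from the most to the least diffused and join them.

    counts -- dict mapping a rule code to the number of firms using that rule
    """
    ordered = sorted(counts, key=lambda rule: (-counts[rule], rule))
    return "".join(str(rule) for rule in ordered)


# Hypothetical numbers of firms on each of the seven rules.
counts = {1: 20, 2: 60, 3: 200, 4: 90, 5: 10, 6: 120, 7: 150}
config = dominant_configuration(counts)

# Swapping the diffusion of the two leading rules switches the first two codes.
swapped = dict(counts)
swapped[3], swapped[7] = counts[7], counts[3]
config_swapped = dominant_configuration(swapped)
```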

Let’s also note that there is a discrepancy between current production and the aggregate profit: there is no guarantee that the “optimal” behavior of the individual agent leads to the welfare of society.

Summarizing the results:
  • the state of financial soundness (weak heterogeneity) matters, and the difference in firms’ behavioral attitudes (strong heterogeneity), conditioned on the state of financial soundness, is due to different outcomes of the learning activity.

  • the expected NSF scheduling parameter is three times the SF one: this allows us to conclude that NSF firms are more “aggressive” than SF ones, because the vital impulse of NSF firms pushes them mainly to recover from their financial fragility by seeking profit-improving behavioral rules, while SF firms are more prudent.

  • Figure 3 shows that NSF profit and output are about 65 % of the totals while NSF and SF concentrations are balanced (48 vs. 52 %); therefore, the social welfare of the system is sustained by the more active and lively firms. This finding can be read in a different way. The sounder the financial health, the lower the incentive to change: if SF firms were the majority in the system, this rigidity would leave the system more exposed to adverse phase transitions, owing to low resilience, a capability which pertains to NSF firms since they are more prone to change.

In the following we analyze the computational results of the model by using the CME approach introduced in Sect. 4. In particular we treat the aggregate share of NSF firms and aggregate profits in different species to make inference according to the model in (31), the estimator for the expected values of concentrations. Figure 4 compares results from (31) to the estimated share of firms in each state and total output: the plots show different attitudes, summarised in Table 1.
Fig. 4

Top panels: expected concentrations on the behavioural rules space given the financial soundness. Middle panels: shares of NSF and SF firms from the ABM–DGP and Hodrick–Prescott aggregation for expected values of (30). Bottom panels: Hodrick–Prescott time series of shares of aggregate output in the NSF and SF states of financial soundness; the horizontal solid line gives the time average of output while dot-dashed lines are confidence bands about the average, \(\pm 0.5\hbox {std}(Q_\varsigma (t):t\le T)\). Vertical lines identify phases

Table 1

Time averages expected concentrations in each behavioural state given the state of financial

\(\Xi =\Sigma \times \Lambda \)    \(\lambda _1 \)    \(\lambda _2 \)    \(\lambda _3 \)    \(\lambda _4 \)    \(\lambda _5 \)    \(\lambda _6 \)    \(\lambda _7 \)

From Table 1, the NSF dominant rules are: \(\lambda _5 \succ \lambda _4 \succ \lambda _3 \succ \lambda _7 \succ \lambda _2 \succ \lambda _1 \succ \lambda _6 \). As regards SF firms: \(\lambda _4 \succ \lambda _3 \succ \lambda _5 \succ \lambda _7 \succ \lambda _1 \succ \lambda _2 \succ \lambda _6 \).

According to our results, NSF firms are prone to interaction, although in a weak sense, because they behave so as to improve their financial soundness and avoid the risk of bankruptcy; SF firms are more individualistic and cautious, looking at their own past or maximising profit in order to maintain their status rather than to increase their wealth (which grows through the St. Matthew effect on profits, being subject to a multiplicative shock).

The left and right top panels of Fig. 4 show that in the beginning NSF firms are more concentrated on rule \(\lambda _5 \) but, as time goes by, they become diffused over rules \(\lambda _3 , \lambda _4 \) and \(\lambda _5 \); on the contrary, SF firms begin largely spread over rules \(\lambda _3 , \lambda _4 , \lambda _5 \) and \(\lambda _7 \) while, at the end, they concentrate mostly on \(\lambda _4 \). The NSF propensity to diversify, against the SF propensity to concentrate, might be interpreted as the need for NSF firms to try more and more behavioural attitudes to improve their state of financial fragility, while SF firms seem to have reached a satisfactory configuration. Even though SF firms are the majority (on average about 52 %), the share of SF output is lower than the NSF one (65 %). This shows that NSF firms are more active than SF ones because their aim is to improve their financial soundness as fast as possible: the improvement follows from increasing equity, which is possible only with an increase in profits, but profits increase only if output increases. Therefore, NSF firms do their best to increase output in order to become SF in the short run; nevertheless, they do so as reasonably as possible, as shown by the more diversified portfolio of behaviours they adopt. On the other hand, SF firms have been found to be more cautious in preserving their status: they have less interest in increasing output to become richer, preferring to remain self-financing with marginal increments in profits.

Still considering Fig. 4, when NSF output is fairly above the upper confidence band then the density of NSF has a peak and an increase in the concentration of NSF firms on the dominant rules is observed. When output is within or below the bands there are periods with smaller peaks or with peaks toward the minimum, respectively, corresponding to periods in which the distribution of firms spreads more uniformly.

When the NSF density increases, firms concentrate on those rules they find more profitable for improving their financial soundness, and this determines increments in output. These periods are fairly short and are essentially dominated by trend inversions in the market price dynamics, which drive the share of NSF firms with a high correlation.

Since the market price is determined by the total demand of the previous period (3), the learning activity on the output scheduling parameter affects both total demand and output. Since labour demand is a function of the scheduled output, an increase in total demand is itself a consequence of an increase in total output: in the end, the price increases when the learning activity pushes firms to increase their output. There is therefore a cyclical effect of the learning activity on output and demand, of output on price and of price on the share of NSF firms.

These cyclical effects determine phase transitions of the system from increasing to decreasing periods of NSF firms’ concentration on the behavioural rules space. Phases in which the profitability of some dominant rule polarises the volume of firms then materialise, inducing a gradual increase in output up to a level that requires an increase in labour (total) demand smaller than the increase in output. When this happens it determines a downturn in the price and, as a consequence, in the density of NSF firms and in their concentration on dominant rules. Accordingly there is a transition to a period along which firms are less concentrated on dominant rules and spread more uniformly over the behavioural states space. In the case of SF firms there is essentially a similar but opposite mechanism: a higher concentration on the dominant rules corresponds to positive peaks of SF density (i.e. negative peaks of NSF density) but, differently from the NSF case, this is associated with downturns in output corresponding to periods of increasing price.

This representation confirms that the state of financial soundness makes a big difference in firms’ behaviour, as has been found for their behavioural preferences.

6 Conclusive remarks

In a socioeconomic complex system, understood as an ensemble of feedbacks between individual behaviours and emerging regularities, no precise prediction can be made, only inference, in the light of what has been called regenerative learning. Both Montecarlo simulations of an ABM and ME techniques confirm that regimes of dominant behavioural configurations grow into regularities that destroy themselves and reconfigure the system into newer ones, inducing system phase transitions. In such a framework, the unpredictability of system dynamics has been found to be due to the learning capability of economic agents interacting with one another and with their environment, which is driven by force-fields (i.e. market price and profits in our model).

Heterogeneity and interaction have been found to be entangled characteristics of individuals, which cannot be handled for macroscopic inference within the representative agent framework: new tools are needed to manage the aggregation problem.

Learning capability emphasizes the coexistence of multiple equilibria for a system whose equilibrium is not a point in the space, where opposite forces balance, but a probability distribution over a space of behaviours and characteristics.

Allowing agents to learn enhances the ontological perspective in complex systems theory by qualifying an agent as an intelligent one. The paper shows that the learning agent is not an isolated homo oeconomicus, since she cares about the others and about the environment she belongs to. Intelligent agents learn, and by learning they modify the system. All in all, she is different from her natural counterpart, the atom, since she behaves the way she wants and not in the only way she must. This is not an irrelevant detail: it requires the social scientist not to borrow analytic tools from the hard sciences as they are, but to adapt techniques suitably to social phenomena or to find newer and sounder ones (footnote 18).

In this respect, the present paper aims to promote, stimulate and, maybe, move forward in the research stream opened by Masanao Aoki about thirty years ago in socioeconomic complex systems analysis.


  1.

    Note that business fluctuations are due to the idiosyncratic price shocks and the endogenous self organization of the market.

  2.

    For the sake of simplicity this modeling has not been considered here: the reader is referred to Delli Gatti et al. (2012).

  3.

    \(\alpha \) can be considered as a “financial” parameter since it represents the leverage, i.e. the ratio between external and internal financial needs.

  4.

    Firms may change their behavior because the price changes, i.e. they take into account the Lucas critique.

  5.

    As will be shown, a configuration is a string of rules’ codes ordering the production strategies according to their degree of diffusion, from the most diffused to the least diffused rule.

  6.

    In the ABM–DGP \(\Delta =1\), that means that two adjacent dates are separated by a time-span of length \(\Delta \). Since in the present paper time has no specific relevant meaning, \(\Delta =1\) is only the simulation reference unit of time, it might be a quarter of a year or a year or whatever.

  7.

    By analogy with chemical reactions it is a “progress variable”, or “degree of advancement”, as described in de Groot and Mazur (1984) p. 199, see also van Kampen (2007) p. 168. Therefore, the complexity degree of rule \(\lambda _k \in \Lambda \) is the index \(k\le K=|\Lambda |\) labelling the \(k\)-th rule the social atom is testing while learning.

  8.
    Having set up an ABM, one can certainly take note of each single step each single agent takes while learning. But this would be very time and memory consuming, even for systems that are not especially large, like the one considered here with 1,000 firms and 1,000 periods. Moreover, in the end, it would be useless because what is needed is an inferential approach, like that of Statistical Physics: keeping track of all the positions on the learning space would be like integrating the differential equations of motion for the particles of a complex system, which is an almost impossible task.
    Fig. 1

    The learning mechanism as a sequence of interactions along the learning-period made of \(K\) test-sub-periods. A predetermined sample path is highlighted from being \(L_p\) to become \(L_7\). The whole graph represents the graphs one would obtain by starting with any rule \(\lambda _p \in \Lambda \), that is from the initial state \(L_p \), passing through all the \(K\) channels \(\rho _k \) before ending to a final state \(L_q \)

  9.

    The formulae presented in “Appendix C” have been obtained analytically by means of a suitable algebraic method: its development is far beyond the aim of the present paper; notes are available from the authors.

  10.

    The reader might refer to Gardiner (1985) chapter 7, from Sects. 5 to 7, for a rigorous development of the following exposition which aims to resemble the main features of the Poisson representation technique for the many variable birth-death systems in terms of combinatorial kinetics. See also Gardiner and Chaturvedi (1977) for an early exposition of the technique.

  11.

    This term is due to Gardiner and Chaturvedi (1977) to extend the field of Chemical Kinetics. The reader interested in Chemical Physics and Physical Chemistry, upon which the following development is based, is also referred to McQuarrie (1967) and Gillespie (2007), and references cited therein. To appreciate the probabilistic and combinatorial nature of these disciplines, and for extensions of the tools to other fields of application, an important reference is Nicolis and Prigogine (1977).

  12.

    According to Gardiner (1985) combinatorial transition rates are usually not explicitly time dependent. In the present paper time dependence is maintained to account for the fact that the configuration of the system \(\mathbf{I}_\varsigma (t)=\mathbf{n}\) changes due to learning, but time is considered as a sequential parameter.

  13.

    Note that \(P_e ({\bullet } ;t):\chi \rightarrow [0,1]\) is the stationary solution where time is an indexing parameter as in transition rates \(T_k^\pm ({\bullet } ;t)\). The interest on time indexing is essentially motivated by the fact that the present modelling is grounded on an ABM–DGP where time is an iteration counter.

  14.

    See van Kampen (2007), page 168, for a geometric interpretation on this issue.

  15.

    The ergodic property is here conceived very loosely: basically, estimates of \(\mathbf{W}_\varsigma (t)\) have been found stable through time such that their series can be likely substituted with the time average; it has also been found the standard deviation is very small.

  16.

    For a given subsystem of SF or NSF firms, a dominant configuration is a combination of behavioural rules that, at \(t\), concentrates fractions of a given quantity, say \(Z\), from the highest to the lowest share. As regards the number of firms, \(Z=I\), the diffusion-dominance of a certain rules’ configuration allocates the highest shares of firms into behavioural states \(\lambda \in \Lambda \). If \(Z=A,Q,W,\Pi \), the effects-dominance of a rules’ configuration identifies what should have been chosen to get the collectively optimal configuration with regard to a given quantity.

  17.

    Regimes concern dominance while phases concern the state levels of aggregate quantities.

  18.

    In order to fully appreciate the consequences of introducing learning in a complex system, let us concentrate on the effect of a policy, say an easing of monetary policy, i.e. a reduction of the rate of interest. The share of SF firms will increase: resilience will be strengthened but the pace of growth could be modest; those effects themselves will depend on the S,s of the system; agents will change their behavior, according to the prescription of the Lucas critique.

  19.

    As regarding the profit maximizing rule \(3\), it can also be seen as function of the control (output scheduling) parameter \(\alpha \).

  20.

    Profit curves and their maximisation w.r.t. equity (state variable) previously developed is different from profit maximisation w.r.t. the scheduling parameter (control parameter): the former concerns the overall economic interpretation, the latter concerns the specific profit maximisation rule, which aims to set an optimal value for the control parameter.

  21.

    This means that equity is conditioning profit through the scheduling parameter, that is \(\Pi (\alpha |A)\).

  22.

    For ease of exposition, time and the financial fragility state are suppressed; therefore, from here on, all the quantities must be considered as time dependent in every state of financial fragility.



The authors thank an anonymous referee for his remarks; Patrick Xihao Li, Corrado di Guilmi and participants to the EEA conference, NY May 2013, PRIN Bologna, June 2013, for suggestions; the support of the Institute for New Economic Thinking Grant INO1200022, and the EFP7, MATHEMACS and NESS, is gratefully acknowledged.


  1. Alfarano S, Lux T, Wagner F (2005) Estimation of agent-based models: the case of an asymmetric herding model. Comput Econ 26(1):19–49
  2. Aoki M (1996) New approaches to macroeconomic modelling. Cambridge University Press, Cambridge
  3. Aoki M (2002) Modelling aggregate behaviour and fluctuations in economics. Cambridge University Press, Cambridge
  4. Aoki M, Yoshikawa H (2006) Reconstructing macroeconomics. Cambridge University Press, Cambridge
  5. Buchanan M (2007) The social atom: why the rich get richer, cheaters get caught, and your neighbour usually looks like you. Bloomsbury, London
  6. Delli Gatti D, Gallegati M, Greenwald B, Russo A, Stiglitz JE (2010) The financial accelerator in an evolving credit network. J Econ Dyn Control 34:1627–1650
  7. Delli Gatti D, Di Guilmi C, Gallegati M, Landini S (2012) Reconstructing aggregate dynamics in heterogeneous agents models. A Markovian Approach, Revue de l’OFCE 124(5):117–146
  8. Delli Gatti D, Fagiolo G, Richiardi M, Russo A, Gallegati M (2014) Agent based models. A Premier (forthcoming)
  9. de Groot SR, Mazur P (1984) Non-equilibrium thermodynamics. Dover Publication, New York
  10. Di Guilmi C, Gallegati M, Landini S, Stiglitz JE (2011) Towards an analytic solution for agent based models: an application to a credit network economy. In: Aoki M, Binmore K, Deakin S, Gintis H (eds) Complexity and institutions: markets, norms and corporations. Palgrave Macmillan, London, IEA conference, vol. N.150-II
  11. Feller W (1966) An introduction to probability theory and its applications. Wiley, New Jersey
  12. Foley DK (1994) A statistical equilibrium theory of markets. J Econ Theory 62:321–345
  13. Gardiner CW (1985) Handbook of stochastic methods. Springer, Berlin
  14. Gardiner CW, Chaturvedi S (1977) The Poisson representation I. A new technique for chemical master equations. J Stat Phys 17(6):429–468
  15. Gillespie DT (2007) Stochastic simulation of chemical kinetics. Annu Rev Phys Chem 58:35–55
  16. Greenwald B, Stiglitz JE (1993) Financial markets imperfections and business cycles. Q J Econ 108(1):77–114
  17. Godley W, Lavoie M (2007) Monetary economics. An integrated approach to credit, money, income, production and wealth. Palgrave MacMillan, Basingstoke
  18. Kirman A (2011) Learning in agent based models. East Econ J 37(1):20–27
  19. Kirman A (2012) Can artificial economies help us understand real economies. Revue de l’OFCE, Debates and Policies 124
  20. McQuarrie DA (1967) Stochastic approach to chemical kinetics. J Appl Probab 4:413–478
  21. Nicolis G, Prigogine I (1977) Self-organization in nonequilibrium systems: from dissipative structures to order through fluctuations. Wiley, New Jersey
  22. Sargent T (1993) Bounded rationality in macroeconomics. Clarendon Press, Oxford
  23. Stiglitz JE (1973) Taxation, corporate financial policy and the cost of capital. J Public Econ 2(1):1–34
  24. Stiglitz JE (1975) The theory of screening, education and the distribution of income. Am Econ Rev 65(3):283–300
  25. Stiglitz JE (1976) The efficiency wage hypothesis, surplus labour and the distribution of income in L.D.C.s. Oxford Econ Papers 28(2):185–207
  26. Tesfatsion L, Judd KL (2006) Agent-based computational economics. In: Handbook of computational economics. Handbooks in economics series, vol 2. North Holland, Amsterdam
  27. van Kampen NG (2007) Stochastic processes in physics and chemistry. North-Holland, Amsterdam
  28. Weidlich W, Braun M (1992) The master equation approach to nonlinear economics. J Evol Econ 2(3):233–265

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Simone Landini (1)
  • Mauro Gallegati (2)
  • Joseph E. Stiglitz (3)
  1. I.R.E.S. Piemonte, Turin, Italy
  2. DiSES, Università Politecnica delle Marche, Ancona, Italy
  3. Columbia Business School, Columbia University, New York, USA
