Agent-Based Modeling of a Non-tâtonnement Process for the Scarf Economy: The Role of Learning

Computational Economics

Abstract

In this paper, we propose a meta-learning model to hierarchically integrate individual learning and social learning schemes. This meta-learning model is incorporated into an agent-based model to show that Herbert Scarf’s famous counterexample on Walrasian stability can become stable in some cases under a non-tâtonnement process when both learning schemes are involved, a result previously obtained by Herbert Gintis. However, we find that the stability of the competitive equilibrium depends on how individuals learn—whether they are innovators (individual learners) or imitators (social learners), and their switching frequency (mobility) between the two. We show that this endogenous behavior, apart from the initial population of innovators, is mainly determined by the agents’ intensity of choice. This study grounds the Walrasian competitive equilibrium based on the view of a balanced resource allocation between exploitation and exploration. This balance, achieved through a meta-learning model, is shown to be underpinned by a behavioral/psychological characteristic.

Notes

  1. Velupillai (2015) succinctly summarizes the different facets of the problem, viz. the ‘existence of a solution, a method of finding it (if ‘proved’ to exist), the ‘reality’ of the method considered as a dynamic process and its stability’ (Ibid, p. 1556).

  2. This is expressed with admirable clarity in Clower (Clower 1975, p. 13).

  3. Several important contributions made in the 1970s and 1980s also took up the disequilibrium and out-of-equilibrium dynamics to establish a link between Keynesian and general equilibrium models. On this see, Benassy (1982) and Malinvaud (1977).

  4. At a systemic level, it has been shown that the computational complexity of a decentralized system of interacting agents is lower than that of the centralized system that is based on a Walrasian auctioneer (Axtell 2005).

  5. In order to examine the role played by learning, we choose a version that is closer to Scarf’s original version and augment it with learning. It is therefore different from Gintis (2007), which has production.

  6. As Gintis (2013, p. 119) notes:

    ...This is the fact that in a decentralized market economy out of equilibrium, there is no price vector for the economy at all. The assumption that there is a system of prices that are common knowledge to all participants (we may call these public prices) is reasonable in equilibrium, because all agents can, at least in principle, observe the same prices. However, out of equilibrium there is no single set of prices determined by market exchange. Rather, every agent has a subjective prior concerning prices, based on personal experience, that he uses to make and carry out trading plans.

  7. More specifically, t refers to the whole market day t, i.e., the interval \([t,t+1)\).

  8. For a more extended list, see Chen et al. (2016).

  9. It is known that these two approaches can generally lead to different results (Grimm and Railsback 2005). We, however, will leave this issue to a separate study.

  10. We certainly can consider more generalized propensity updating dynamics with three parameters as proposed by Camerer and Ho (1999), but that can complicate our analysis at this initial stage. Hence, we plan to start with this ‘minimal’ model.

  11. Following Arthur (1993), a normalization scheme is also applied to the propensities \(q_{i,k}(t+1)\) as follows:

    $$\begin{aligned} q_{i,k}(t+1) \leftarrow \frac{q_{i,k}(t+1)}{q_{i,a_{il}}(t+1)+ q_{i,a_{sl}}(t+1)}. \end{aligned} \tag{18}$$

  12. Results do not vary qualitatively for perturbations of these parameters.

  13. We have examined the system by simulating the same treatments for much longer periods and we find that the results are robust to this extended horizon.

  14. Numéraire-normalized processes and their convergence properties have been studied widely in the tâtonnement literature. We do not analyze the non-normalized case in this paper. See Kitti (2010), for example, on non-normalized iterative processes and the associated convergence conditions.

  15. This property can also be found in Table 2, the results of Simulation Series 3, where we can see that for each type of agent the price expectations of own consumption goods are biased upward, whereas the price expectations of own production goods are biased downward.

  16. The behavior of the price of good 2 is qualitatively similar.

  17. For \(\lambda = 0\), past performance should not influence the current choice and therefore \(Prob_{i;k}^{t + 1} = 1/2\). Thus, we would expect the market fractions to be 50–50, in contrast to what is observed. However, the past performance does exert an indirect influence through the reference point mechanism. Note that the agents consider switching only when the utility falls below their current reference point. Since innovators have relatively high pay-offs for \(\lambda =0\) (and other lower values), the reference point mechanism introduces a bias in favor of the innovators, which explains the deviations we observe in Fig. 8.

  18. At this point, \(mks_{a_{il}}(0)\) is evenly distributed over two different learning schemes (Table 1).

  19. So far, we have not seen many empirical studies, either simulation-based or experimental, that directly examine the payoff distribution among different heuristics, schemes or strategies in the context of adaptive belief systems or heuristic switching models. In this regard, the only study close to ours is Bossan et al. (2015), but their adjustment is made at the mesoscopic level (a kind of replicator dynamics), and not at a microscopic level.

  20. We are not able to show here the \(\lambda \) which can serve as the equalizer. However, since the payoff reversal happens when \(\lambda \) increases from 3 to 4, we suspect that there may exist some \(\lambda \)s (\(\lambda \in (3,5)\)) which may remove the gap.

  21. In the psychology literature, the power law of practice indicates that subjects’ early learning experiences have a dominating effect in their limiting behavior; it is normally characterized by initially steep but then flatter learning curves. In the machine learning literature, it is also known as premature convergence, and is a familiar result corresponding to the path-dependence property of learning dynamics. In our case, when agents’ memory never decays (\(\phi =0\)) and \(\lambda \) is large, say, \(\lambda =10\), the path dependency effect can become extreme.

  22. As we show in “Appendix C”, the inferior performance is mainly attributable to innovators rather than imitators.

  23. The correlation is based on the pool of 550 pairs of the MAPE (of good 1) and the average number of switches (over the last 500 periods). There are 550 pairs because we have 50 repetitions for 11 \(\lambda \)s.

  24. See Chen and Venkatachalam (2017) for limits to information aggregation and price discovery in a related context.

  25. https://www.openabm.org/model/4897/.
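
The normalization in Note 11 and the \(\lambda = 0\) case in Note 17 can be illustrated with a minimal Python sketch. The normalization follows Eq. (18) directly. The choice rule, however, is an assumption: the notes do not state its functional form, so a standard logit (intensity-of-choice) rule is used here purely for illustration, and the function names and sample values are hypothetical.

```python
import math


def normalize_propensities(q_il, q_sl):
    """Eq. (18): divide each propensity by the sum over the two schemes."""
    total = q_il + q_sl
    if total == 0:
        raise ValueError("propensities must not sum to zero")
    return q_il / total, q_sl / total


def choice_probabilities(q_il, q_sl, lam):
    """ASSUMED logit rule (not taken from the notes): P(k) ~ exp(lam * q_k)."""
    # Subtract the larger exponent before exponentiating for numerical stability.
    shift = max(lam * q_il, lam * q_sl)
    w_il = math.exp(lam * q_il - shift)
    w_sl = math.exp(lam * q_sl - shift)
    return w_il / (w_il + w_sl), w_sl / (w_il + w_sl)


# Hypothetical raw propensities for individual (il) and social (sl) learning.
q_il, q_sl = normalize_propensities(3.0, 1.0)

# With lam = 0 past performance is ignored, so both schemes are equally likely.
p_zero = choice_probabilities(q_il, q_sl, lam=0.0)

# With a large lam the scheme with the higher propensity is chosen almost surely.
p_high = choice_probabilities(q_il, q_sl, lam=10.0)
```

Under this assumed rule, `p_zero` is an even split regardless of the propensities, which is the benchmark Note 17 compares against; the deviation reported there comes from the reference-point mechanism, which this sketch deliberately leaves out.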

References

  • Albin, P., & Foley, D. (1992). Decentralized, dispersed exchange without an auctioneer: A simulation study. Journal of Economic Behavior and Organization, 18(1), 27–51.

  • Alós-Ferrer, C., & Schlag, K. H. (2009). Imitation and learning. In P. Anand, P. Pattanaik, & C. Puppe (Eds.), The handbook of rational and social choice. New York: Oxford University Press.

  • Anderson, C., Plott, C., Shimomura, K., & Granat, S. (2004). Global instability in experimental general equilibrium: The Scarf example. Journal of Economic Theory, 115(2), 209–249.

  • Anufriev, M., & Hommes, C. (2012). Evolution of market heuristics. Knowledge Engineering Review, 27(2), 255–271.

  • Apesteguia, J., Huck, S., & Oechssler, J. (2007). Imitation—Theory and experimental evidence. Journal of Economic Theory, 136(1), 217–235.

  • Arrow, K. (1974). General economic equilibrium: Purpose, analytic techniques, collective choice. American Economic Review, 64(3), 253–272.

  • Arthur, B. (1993). On designing economic agents that behave like human agents. Journal of Evolutionary Economics, 3(1), 1–22.

  • Axelrod, R. (1997). Advancing the art of simulation in the social sciences. In R. Conte, R. Hegselmann, & P. Terna (Eds.), Simulating social phenomena (pp. 21–40). Berlin: Springer.

  • Axtell, R. (2005). The complexity of exchange. The Economic Journal, 115, F193–F210.

  • Benassy, J. P. (1982). The economics of market disequilibrium. Cambridge: Academic Press.

  • Bossan, B., Jann, O., & Hammerstein, P. (2015). The evolution of social learning and its economic consequences. Journal of Economic Behavior and Organization, 112, 266–288.

  • Brenner, T. (1998). Can evolutionary algorithms describe learning processes? Journal of Evolutionary Economics, 8(3), 271–283.

  • Brock, W., & Hommes, C. (1997). A rational route to randomness. Econometrica, 65(5), 1059–1095.

  • Brock, W., & Hommes, C. (1998). Heterogeneous beliefs and routes to chaos in a simple asset pricing model. Journal of Economic Dynamics and Control, 22(8–9), 1235–1274.

  • Camerer, C., & Ho, T.-H. (1999). Experience-weighted attraction learning in normal form games. Econometrica, 67(4), 827–874.

  • Chen, S.-H., Chang, C.-L., & Du, Y.-R. (2012). Agent-based economic models and econometrics. Knowledge Engineering Review, 27(2), 187–219.

  • Chen, S.-H., Kao, Y.-H., & Ragupathy, V. (2016). Computational behavioral economics. In R. Frantz, S.-H. Chen, K. Dopfer, F. Heukelom, & S. Mousavi (Eds.), Routledge handbook of behavioral economics (pp. 297–319). London: Routledge.

  • Chen, S.-H., & Venkatachalam, R. (2017). Information aggregation and computational intelligence. Evolutionary and Institutional Economics Review, 14(1), 231–252.

  • Clower, R. (1975). Reflections on the Keynesian perplex. Journal of Economics, 35(1), 1–24.

  • Ellison, G., & Fudenberg, D. (1993). Rules of thumb for social learning. Journal of Political Economy, 101(4), 612–643.

  • Erev, I., & Rapoport, A. (1998). Coordination, “magic,” and reinforcement learning in a market entry game. Games and Economic Behavior, 23, 146–175.

  • Erev, I., & Roth, A. (1998). Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. American Economic Review, 88(4), 848–881.

  • Fisher, F. M. (1983). Disequilibrium foundation of equilibrium economics. Cambridge: Cambridge University Press.

  • Gintis, H. (2006). The emergence of a price system from decentralized bilateral exchange. Contributions in Theoretical Economics, 6(1), 1–15.

  • Gintis, H. (2007). The dynamics of general equilibrium. Economic Journal, 117(523), 1280–1309.

  • Gintis, H. (2013). Hayek’s contribution to a reconstruction of economic theory. In R. Frantz & R. Leeson (Eds.), Hayek and behavioral economics, chapter 5 (pp. 111–126). New York: Palgrave Macmillan.

  • Grimm, V., & Railsback, S. (2005). Individual-based modeling and ecology. New York: Princeton University Press.

  • Hahn, F., & Negishi, T. (1962). A theorem on non-tâtonnement stability. Econometrica, 30(3), 463–469.

  • Hayek, F. A. (1945). The use of knowledge in society. American Economic Review, 35(4), 519–530.

  • Hommes, C., & Zeppini, P. (2014). Innovate or imitate? Behavioural technological change. Journal of Economic Dynamics and Control, 48, 308–324.

  • Hommes, C. (2006). Heterogeneous agent models in economics and finance. In L. Tesfatsion & K. L. Judd (Eds.), Handbook of computational economics (Vol. 2, pp. 1109–1186). Amsterdam: Elsevier.

  • Hommes, C. (2011). The heterogeneous expectations hypothesis: Some evidence from the Lab. Journal of Economic Dynamics and Control, 35(1), 1–24.

  • Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–291.

  • Kitti, M. (2010). Convergence of iterative tâtonnement without price normalization. Journal of Economic Dynamics and Control, 34(6), 1077–1091.

  • Koza, J. (1992). Genetic programming: On the programming of computers by means of natural selection. Cambridge: MIT Press.

  • Malinvaud, E. (1977). The theory of unemployment reconsidered. London: Blackwell.

  • Mandel, A. (2012). Agent-based dynamics and the general equilibrium model. Complexity Economics, 1(1), 105–121.

  • Mandel, A., Landini, S., Gallegati, M., & Gintis, H. (2015). Price dynamics, financial fragility and aggregate volatility. Journal of Economic Dynamics and Control, 51, 257–277.

  • Rendell, L., Boyd, R., Cownden, D., Enquist, M., Eriksson, K., Feldman, M. W., et al. (2010). Why copy others? Insights from the social learning strategies tournament. Science, 328(5975), 208–213.

  • Roth, A., & Erev, I. (1995). Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term. Games and Economic Behavior, 8, 164–212.

  • Samuelson, L. (1998). Evolutionary games and equilibrium selection. Cambridge: MIT Press.

  • Scarf, H. (1960). Some examples of global instability of the competitive economy. International Economic Review, 1(3), 157–172.

  • Sutton, R., & Barto, A. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.

  • Tesfatsion, L. (2006). Agent-based computational economics: A constructive approach to economic theory. In L. Tesfatsion & K. Judd (Eds.), Handbook of computational economics: Agent-based computational economics (Vol. 2, pp. 831–880). Amsterdam: North Holland.

  • Tversky, A., & Kahneman, D. (1991). Loss aversion in riskless choice: A reference-dependent model. Quarterly Journal of Economics, 106, 1039–1061.

  • Uzawa, H. (1960). Walras’ tâtonnement in the theory of exchange. The Review of Economic Studies, 27(3), 182–194.

  • Velupillai, K. (2015). Iteration, tâtonnement, computation and economic dynamics. Cambridge Journal of Economics, 39(6), 1551–1567.

  • Vriend, N. (2000). An illustration of the essential difference between individual and social learning, and its consequences for computational analyses. Journal of Economic Dynamics and Control, 24(1), 1–19.

  • Wiering, M., & van Otterlo, M. (2012). Reinforcement learning: State of the art. Heidelberg: Springer.

Acknowledgements

We thank the two anonymous referees for their helpful and constructive comments, which have helped us greatly in improving the quality and clarity of the paper. The first and the last author are grateful for the research support in the form of the Ministry of Science and Technology (MOST) grants, MOST 103-2410-H-004-009-MY3 and MOST 104-2811-H-004-003, respectively. We thank Wolfgang Magerl for the able research assistantship in the execution of this project.

Author information

Corresponding author

Correspondence to Shu-Heng Chen.

Appendices

Implementation Details

The details regarding the implementation of the simulations are provided in this appendix. All simulations and analyses were performed using NetLogo 5.2.0 and Matlab R2015a. To comply with the current community norm, we have made the computer program available at OpenABM (see Note 25). The NetLogo interface of our simulations is shown in Fig. 13. We divide the figure into two blocks. The first block (the leftmost block) is a list of control bars with which users supply the values of the control parameters for running the simulation. The parameters are detailed in Table 1, including N, M, S, T, \(\varphi \), \(\theta _{1}\), \(\theta _{2}\), K, \(\lambda \), \(\textit{POP}_{RE}\) (defined in Sect. 5.3), and \(\textit{POP}_{a_{il}}(0)\). In addition to these variables, the other control bars are on-off switches that select the model to run: individual learning (only), social learning (only), an exogenously given market fraction, or the meta-learning model. For the exogenously given market fraction, \(\textit{POP}_{a_{il}}(0)\) must also be supplied.

To the right of the control panel are the real-time displays of the economy in operation. The six subfigures shown in the upper right block report the price expectations held for a trading day (Eq. 3) and the excess demand and supply settled at the end of a trading day (Eq. 10). The results are plotted as time series. The leftmost three of these subfigures show the mean of price expectations (by good and by type), and the middle three show the mean of excess demand and supply (by good and by type).

The top leftmost three subfigures give a summary of the market: prices, quantities, and market fraction (population of innovators). The first gives the time series of the mean actual trading prices of goods 1 and 2, denoted by M1_t and M2_t, in contrast to the corresponding expectations averaged over all agents, M1 and M2 (good 3 serves as the numéraire). The middle one gives the time series of aggregate demand, summed over all agents’ planned demand (Eq. 4). The third gives the time series of the fraction of agents who adopt the individual learning scheme.

Fig. 13 NetLogo interface for the simulation of the agent-based Scarf economy

Immediately below the above nine subfigures is the snapshot distribution (dispersion) of price expectations, displayed by goods. The histogram of good 3 is trivial because it serves as the numéraire. The last two subfigures at the bottom provide the information on the relative price of each pair of goods. On the left is the time series of the mean relative price of each pair of goods and on the right is the time series of the respective standard deviation (price dispersion).

Endogenously Determined Market Fractions

This appendix provides the table describing the simulation results concerning endogenously generated market fractions starting from different initial conditions.

Table 3 Endogenously determined market fractions

Payoff Inequality and Two Types of Errors

In Sect. 5.8, we have seen that large populations of immobile agents associated with large \(\lambda \)s cause the market mechanism to malfunction due to the possible presence of both ‘type-I’ and ‘type-II’ errors. In this section, we take a further look at the relative importance of these two types of errors at the individual level by examining the payoffs to the two types of agents who may contribute to them. In Fig. 14, we present results parallel to those in Fig. 11, except that here we restrict our attention to those innovators and imitators who are immobile. Since there is only a negligible number of immobile agents when \(\lambda < 5\) (Fig. 12, the left panel), we report the payoffs of these two groups only for \(\lambda \ge 5\) in Fig. 14.

Fig. 14 Payoffs to immobile agents: innovators or imitators. The figure is based on the data retrieved from Simulation 3. Each of the boxes shows the time series of the mean payoffs to immobile innovators and immobile imitators. The immobile agent is defined by the given threshold \(f_{min}\). Each point at time t is calculated as follows. We first compute the mean of each run: \(\bar{U}_{il}^{f}(t)=(\sum _{i \in A^{f}_{il}}U_{i}(t))/\# A^{f}_{il}\), and \(\bar{U}_{sl}^{f}(t)=(\sum _{i \in A^{f}_{sl}}U_{i}(t))/ \# A^{f}_{sl}\), where \(A^{f}_{il}\) (\(A^{f}_{sl}\)) is the set of immobile innovators (imitators). We then take the mean of these means over 50 runs. The blue line indicates the payoffs to immobile innovators, and the red line indicates the payoffs to immobile imitators. The six boxes, from left to right and from the upper to the lower row, correspond to the cases \(\lambda =5, 6,\ldots ,10\). Notice that the first subplot (\(\lambda =5\)) does not have immobile imitators, and hence the red line is not available. As expected, when the sample size (the population size of the immobile agents) is rather small, the results are more variable, for example, when \(\lambda =6,7\). (Colour figure online)
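
The two-step averaging described in the caption of Fig. 14 (a within-run mean over the immobile agents of one type, then a mean of those means across runs) can be sketched as follows. This is only an illustrative sketch: the data layout, the function names and the sample values are hypothetical, membership in \(A^{f}_{il}\) or \(A^{f}_{sl}\) is taken as given rather than derived from \(f_{min}\), and skipping runs in which the set is empty is an assumption that the text does not confirm.

```python
from statistics import mean


def run_mean_payoff(utilities, members):
    """Within-run mean of U_i(t) over the agents in `members`.

    `utilities` maps agent id -> U_i(t); `members` is the set of immobile
    agents of one type. Returns None when the set is empty.
    """
    if not members:
        return None
    return mean(utilities[i] for i in members)


def mean_of_run_means(runs):
    """Average the within-run means across runs.

    ASSUMPTION: runs with no immobile agents of the given type are skipped.
    Returns None if no run has any such agent (the line would be missing).
    """
    values = [run_mean_payoff(u, members) for u, members in runs]
    values = [v for v in values if v is not None]
    return mean(values) if values else None


# Hypothetical data for one period t: three runs, each given as
# (payoffs of all agents, ids of the immobile innovators in that run).
runs_innovators = [
    ({0: 2.0, 1: 4.0, 2: 9.0}, {0, 1}),  # within-run mean 3.0
    ({0: 1.0, 1: 5.0}, {1}),             # within-run mean 5.0
    ({0: 3.0}, set()),                   # no immobile innovators, skipped
]

point = mean_of_run_means(runs_innovators)
```

Repeating the calculation for every period t, and separately for innovators and imitators, would give the two time series in each box; a result of `None` corresponds to a missing line such as the one noted for \(\lambda =5\).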

As in Fig. 11, Fig. 14 shows that, even for the immobile agents, the imitators’ performance is superior to that of the innovators. In fact, these two figures together show that the payoffs to innovators drop substantially with an increase in immobile agents, whereas the payoffs to imitators are not affected substantially by the prevalence of immobile agents. Therefore, despite our taxonomy of two types of errors, what contributes most to the ‘malfunction’ of the market mechanism is the ‘type-I’ error. This result demonstrates that the market mechanism is a joint function of exploration and exploitation. They help each other, but when one function is not working, exploitation alone can do more harm than exploration alone.

The above finding resonates well with what we observed in Sect. 5.2, in which one economy has only exploitation (Sect. 5.2.1) and one economy has only exploration (Sect. 5.2.2). Exploitation alone probably does more harm than exploration alone because it results in a lower spread of information and hence prevents markets from pooling information effectively. Nevertheless, we have also seen that the performance of innovators (exploitation) generally improves when learning from others is possible, i.e., when they are mobile agents. Indeed, when the market is filled with mobile agents (the case with low \(\lambda \)s), innovators on average perform better than imitators, as shown in Fig. 11 (the sub-figures with small \(\lambda \)s).

The above results also shed light on our earlier result that the presence of general-equilibrium agents does not automatically ensure that all agents will adopt equilibrium prices in our meta-learning model (Sect. 5.3). Why do these ‘superior’ price expectations fail to spread across the whole economy? The reason is the existence of immobile agents, who not only shut themselves off from adopting the ‘superior’ belief, but may also generate a lot of ‘noise’ that prevents others from copying it.

Cite this article

Chen, SH., Chie, BT., Kao, YF. et al. Agent-Based Modeling of a Non-tâtonnement Process for the Scarf Economy: The Role of Learning. Comput Econ 54, 305–341 (2019). https://doi.org/10.1007/s10614-017-9721-5
