1 Introduction

Dynamic models in physics, engineering, natural sciences and economics often share the feature that memory of some past states (or of a continuous portion of past states) determines their future time evolution (see, e.g., [15, 26, 27]). Including past history in the time evolution adds nontrivial complexity, which creates a trade-off between the advantage of more realistic models and the drawback that such models are more difficult to study. In the social sciences, the inclusion of memory in models of human decisions may be regarded as a way to represent learning processes or discounted averaging methods (see, e.g., [3, 16, 17, 21]).

In order to study the effects of increasing memory in systems driven by repeated decisions of economic agents, we consider an evolutionary dynamic model in discrete time recently proposed in [10] and modify it by introducing a recursive method that adds memory (or discounted past averaging) to the decision process. The model describes an economic system composed of a population of agents facing a binary choice between two different behavioral strategies; the payoff obtained by each agent as a consequence of the chosen option is affected by the number of agents currently making the same choice and is expressed in the form of an evolutionary game based on replicator dynamics in discrete time. This mechanism describes how agents change their choices over time according to the currently observed payoff differences.

However, decisions in real systems are based not only on currently observed payoff differences but also on information about past performance, which can influence agents’ decisions. In other words, a realistic fitness measure driving evolutionary pressure, i.e., determining which behavioral strategies prevail in the long run, should also take into account the payoffs accumulated over recent history. Knowledge of past performance is thus used as the information set for choosing future strategies, and consequently shapes the long-run evolution of the system. Of course, this kind of evolutionary pressure requires more information and greater computational ability. However, this does not necessarily mean a higher degree of “rationality”. Indeed, rationality means agents’ ability to exploit all available information to correctly forecast future states of the system, i.e., to anticipate future outcomes. If agents cannot form such accurate forecasts of future profits, they may adopt current profits as an estimate of future ones, that is, they use naive expectations. This is the assumption of the evolutionary model proposed in [10]. However, in many economic systems, agents employ some average of past observations in order to assess future states.

The benefit (and rationality) of using such information to make better choices for the future is not obvious. On the one hand, this attitude could be regarded as an effort to use the information at agents’ disposal in an improved (or at least computationally more refined) way, which would indicate greater rationality. On the other hand, it could be seen as an even lower degree of rationality, a sort of anchoring to the past, or a prejudice that prevents agents from updating the available information, i.e., from setting the past aside and looking at the present in order to estimate future states.

Papers concerning binary decisions with externalities have a long tradition in the social and economic literature, see, e.g., [6, 18, 29, 30]. Recently, many authors, such as [22, 24, 25], introduced evolutionary adaptive processes to mimic how agents switch from one choice to another according to the observed (or expected) payoff differences. In particular, in this paper we consider a discrete-time evolutionary model based on replicator dynamics in the form proposed by [8], see also [19] and [31], and we contribute to the intriguing topic of the role played by memory, i.e., the role of time lags in the decision process when agents’ decisions are based not only on the current payoffs observed but on past payoff differences as well. The effect of memory on the dynamics of the system turns out not to be univocal, as several conflicting conclusions can be found in the literature. (A comparison of the titles of references [9, 11] is quite emblematic.) In the context of binary games, the problem of memory has been analyzed in [17] and [7], whereas more references exist in oligopoly modelling, see, e.g., [12, 13, 14, 27]; see also [2] for a complete reference on dynamic oligopoly games.

In this paper, we generalize the evolutionary model proposed in [10] by including a weighted sum of all previously observed payoffs, with exponentially (or geometrically) distributed weights that discount past outcomes, see also [4] for a related contribution. This generalization does not modify the equilibrium points of the original model, and we prove that it does not affect their local stability properties either. However, numerical simulations show that some global dynamic properties are influenced by the presence of memory, in particular when oscillatory dynamics (periodic or chaotic) occur. Periodic patterns can become chaotic as memory increases, and vice versa. Moreover, sufficiently high levels of memory give rise to stabilization of the system at the Nash equilibrium production.

The role of memory in these models is still an open question, and our results do not give an unambiguous answer either. The most useful lesson concerns the fact that a local analysis is not enough to fully understand the possible dynamic scenarios of systems with memory. Even though the recursive method we have proposed to represent memory, following, e.g., [17], leads to a tractable low-dimensional discrete dynamical system, our study shows once more that, for nonlinear systems, both local and global analyses are necessary for a better understanding of the system. Indeed, our numerical simulations provide further insight into nonlinear phenomena and the related effects of memory.

The paper is structured as follows. In Sect. 2 we provide a general framework for formalizing evolutionary oligopoly games with memory, together with a local stability analysis at symmetric equilibria in production for generic behavioral rules. Section 3 presents an example of such a model with two specific behavioral rules proposed in the literature, namely gradient dynamics and local monopolistic approximation. Here, the local stability results of Sect. 2 are specialized to this particular specification. In addition, Sect. 3 presents some possible dynamic scenarios to underline the role of memory in the system. Section 4 offers some concluding remarks.

2 Evolutionary oligopolies with memory

Let us assume that N oligopolists operate in a market by selling homogeneous goods. The economic structure of the game is not known by the players. In particular, as often occurs in practice, the oligopolists do not know the functional form of the inverse demand function, which specifies the selling price at time t, p(t), as a function of total production Q(t) delivered to the market at time t, i.e., \(p(t)= p(Q(t))\). At any discrete time t (decision-driven time) each oligopolist can set her production plan according to one behavioral rule, which prescribes the quantity to produce as a function of the production of other players according to a distribution of behavioral rules expected to prevail at time \(t+1\), when production will be sold in the market. Let us consider the simplest nontrivial case, given by two different behavioral rules that can be conceived by the players, and let us denote by \(x_{i}(t+1)\), \(i=1,2\), the production plan of a firm for market delivery at time \(t+1\) prescribed by the behavioral rule i according to the information available at time t. Moreover, let \(r_{i}(t)\in \left[ 0,1\right] \) be the share of firms employing behavioral rule i at time t. In our simple setting, with just two behavioral strategies, the relation \(r_{1} (t)+r_{2}(t)=1\) holds for each time t; hence, we can rename the shares as \(r_{1}(t)=r(t)\) and \(r_{2}(t)=1-r(t)\). A common assumption in these kinds of deterministic discrete-time evolutionary models is that through a behavioral rule one can determine inductively the production plan for time \(t+1\) given the state of the system at time t (productions and shares of firms adopting each behavioral rule), that is

$$\begin{aligned} x_{i}(t+1)=H_{i}\left( x_{i}(t),x_{j}(t),r(t)\right) ,\quad i=1,2;i\ne j \end{aligned}$$
(1)

Each behavioral rule is associated with an information cost, which reflects how expensive the rule is to use in terms of information and computing effort. In the following, we denote by \(K_{i}\ge 0\) the information cost associated with the behavioral rule \(H_{i}\), which we consider constant over time.

From the side of production costs, we assume that all firms employ the same production technology and bear the same production cost C(x), whose functional form is correctly assessed by the players. Consequently, the behavioral rule i entails an actual profit at time period t given by

$$\begin{aligned} \pi _{i}\left( t\right) =\left[ p\left( Q(t)\right) -C\left( x_{i} \left( t\right) \right) \right] x_{i}\left( t\right) -K_{i} \end{aligned}$$
(2)

from which an expected profit for the next time period can be estimated as

$$\begin{aligned}&\pi _{i}^{e}\left( t+1\right) \nonumber \\&\quad =\left[ p\left( Q_{i}^{e}(t+1)\right) -C\left( x_{i}\left( t+1\right) \right) \right] x_{i}\left( t+1\right) -K_{i}\nonumber \\&\quad =\left[ p\left( Q_{i}^{e}(t+1)\right) -C\left( H_{i} \left( x_{i}(t),x_{j}(t),r_{i}(t)\right) \right) \right] \nonumber \\&\quad \qquad \times \,H_{i}\left( x_{i}(t),x_{j}(t),r_{i}(t)\right) -K_{i} \end{aligned}$$
(3)

In (3), the total expected quantity for time \(t+1\) by agents employing behavioral rule i can be given by:

$$\begin{aligned} Q_{i}^{e}(t+1)=N\left[ r_{i}(t)x_{i}(t+1)+r_{j}(t)x_{j}^{e}(t+1)\right] ,\quad \end{aligned}$$
(4)

where \(x_{j}^{e}(t+1)\) is the quantity expected to be produced by firms adopting the behavioral rule j and \(x_{i}(t+1)\) is given by (1), \(i,j=1,2\), \(i\ne j\). Another possible choice for the total quantity expected in the market at time \(t+1\) is given by so-called naive expectations, which in this context take the following form:

$$\begin{aligned} Q^{e}(t+1)=Q(t)=N\left[ r_{i}(t)x_{i}(t)+r_{j}(t)x_{j}(t)\right] \quad \end{aligned}$$
(5)

Notice that, under naive expectations, the total expected quantity is the same for all agents, independently of the behavioral rule employed, so that in (5) we can omit the subscript i in \(Q^{e}(t+1)\).

The quantity \(\pi _{i}^{e}\left( t+1\right) \) is assumed to provide an estimate of the possible fitness associated with the use of behavioral rule i; similarly, the difference \(\pi _{i}^{e}\left( t+1\right) -\pi _{j} ^{e}\left( t+1\right) \), \(i\ne j\) yields a measure of the comparative advantage of employing behavioral rule i over behavioral rule j. However, firms using a specific behavioral rule may be interested not only in how well behavioral rule i is currently performing relative to behavioral rule j but also in how well behavioral rule i has performed historically relative to the other rule. In other words, the fitness associated with a behavioral rule may be measured according to the accumulated profit instead of just current profit. Following this argument, one can assume that in the evolutionary model the fitness measure \(U_{i}(t)\) of behavioral rule i at each time step also involves a portion of the profit accumulated in the past, that is

$$\begin{aligned} U_{i}(t)=\left( 1-\omega \right) \pi _{i}(t)+\omega U_{i}(t-1) \end{aligned}$$
(6)

where \(\omega \in \left[ 0,1\right] \) is a memory parameter that defines a convex combination of the current profit and the previously accumulated fitness. This specification is denoted as “normalized memory” in [21]. By iterating the recursive formula (6) backward, it is easy to obtain the expression of the accumulated profit

$$\begin{aligned} U_{i}(t)=\left( 1-\omega \right) \sum _{k=0}^{t-1}\omega ^{k}\pi _{i} (t-k)+\omega ^{t}U_{i}(0), \qquad i=1,2 \end{aligned}$$
(7)

as a measure of fitness expressed in the form of a discounted weighted sum with exponentially fading weights. Again, the parameter \(\omega \in \left[ 0,1\right] \) gives a measure of the memory: \(U_{i}(t)=\pi _{i}(t)\) for \(\omega =0\), whereas the weights on past payoffs become increasingly uniform as \(\omega \) approaches 1. In the limiting case \(\omega =1\), (6) reduces to \(U_{i}(t)=U_{i}(t-1)\), so that the fitness stays at its initial value \(U_{i}(0)\).
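The equivalence between the recursion (6) and the discounted sum (7) can be checked numerically. The following is a minimal sketch, not part of the original model code: the function names, the sample profit history and the initial fitness value are hypothetical and serve only as an illustration.

```python
def fitness_recursive(profits, omega, u0=0.0):
    """Iterate U(t) = (1 - omega) * pi(t) + omega * U(t-1), as in (6).

    profits[k] is read as pi(k+1), the profit observed at time t = k+1;
    u0 is an assumed initial fitness U(0).
    """
    u = u0
    for pi_t in profits:
        u = (1.0 - omega) * pi_t + omega * u
    return u


def fitness_closed_form(profits, omega, u0=0.0):
    """Evaluate the discounted sum (7) directly."""
    t = len(profits)
    # profits[t - 1 - k] corresponds to pi(t - k) for k = 0, ..., t - 1
    weighted = sum(omega ** k * profits[t - 1 - k] for k in range(t))
    return (1.0 - omega) * weighted + omega ** t * u0


# Made-up profit history, used only for illustration
sample_profits = [1.0, -0.5, 2.0, 0.3]
u_rec = fitness_recursive(sample_profits, 0.5, u0=0.7)
u_sum = fitness_closed_form(sample_profits, 0.5, u0=0.7)
```

Both functions should agree up to rounding error for any \(\omega \in \left[ 0,1\right] \); with \(\omega =0\) they return the most recent profit.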

At this point, it is useful to recall what information agents need in order to make decisions on future production. To calculate \(Q^{e}(t+1)\), the expected total supply on the market, the information set must include the quantities played and the distribution of the strategies at time t, that is \(x_{i}(t)\) and \(r_{i}(t)\), \(i=1,2\). Furthermore, the use of memory implies that agents also remember the profits obtained through the available behavioral rules from the beginning of the game up to time t, that is \(\pi _{i}(t-k)\), \(i=1,2\) and \(k=0,\ldots ,t-1\).

Now consider \(U_{i}(t)\), the fitness of employing behavioral rule i, and assume that firms can assess this measure of fitness and switch to the more profitable behavioral rule from period to period. This dynamic process changes the distribution of behavioral rules over time. One common way to model this endogenous process of choice is the exponential replicator model, see [8, 19, 20], which takes the form

$$\begin{aligned} r\left( t+1\right)&=\frac{r\left( t\right) e^{\beta U_{1}\left( t\right) }}{r\left( t\right) e^{\beta U_{1}\left( t\right) } +\left[ 1-r\left( t\right) \right] e^{\beta U_{2}\left( t\right) }}\nonumber \\&=\frac{r\left( t\right) }{r\left( t\right) +\left[ 1-r\left( t\right) \right] e^{\beta \left[ U_{2}\left( t\right) -U_{1}\left( t\right) \right] }} \end{aligned}$$
(8)

where \(r\left( t\right) =r_{1}(t)\) is the time-t fraction of firms employing behavioral rule 1. In (8), the parameter \(\beta \ge 0\) is referred to as the intensity of choice and measures how sensitive the players are in selecting fitness-increasing behavioral rules. The minimum value \(\beta =0\) corresponds to the case with fixed fractions, since \(r\left( t+1\right) =r\left( t\right) =r\). In the opposite extreme case \(\left( \beta =\infty \right) \), all firms immediately switch to the behavioral rule showing better performance, i.e., \(r\left( t+1\right) \rightarrow 1\) if \(U_{1}\left( t\right) >U_{2}\left( t\right) \) and \(r\left( t+1\right) \rightarrow 0\) if \(U_{1}\left( t\right) <U_{2}\left( t\right) \).

The exponential replicator model (8) has several useful properties. In fact, the (strictly monotone) transformation \(U_{i}\left( t\right) \rightarrow e^{U_{i}\left( t\right) }\) guarantees that the fractions obtained through the dynamics in (8) are always contained in the interval \(\left[ 0,1\right] \) even with negative fitness \(U_{i}\left( t\right) <0\).
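As an illustration of these properties, the update (8) can be sketched in a few lines. The function name and the argument values below are hypothetical; the sketch uses the second form of (8), which needs only the fitness difference, and does not guard against the overflow of `math.exp` that occurs for very large \(\beta \left[ U_{2}-U_{1}\right] \).

```python
import math


def replicator_share(r, u1, u2, beta):
    """One step of the exponential replicator (8).

    r      : current share of firms using rule 1, in [0, 1]
    u1, u2 : fitness of rules 1 and 2 (any real numbers, possibly negative)
    beta   : intensity of choice, beta >= 0
    """
    # Second form of (8): numerator and denominator divided by exp(beta * u1)
    denominator = r + (1.0 - r) * math.exp(beta * (u2 - u1))
    return r / denominator


# Illustrative call with negative fitness values (made-up numbers)
next_share = replicator_share(0.3, -2.0, -1.0, beta=1.5)
```

The returned share stays in \(\left[ 0,1\right] \) for any real fitness values, the boundary values \(r=0\) and \(r=1\) are mapped to themselves, and \(\beta =0\) leaves the share unchanged.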

If the recursive scheme (6) is plugged into the evolutionary model (8) and the auxiliary dynamic variable \(m(t)=U_{1}\left( t\right) -U_{2}\left( t\right) \) is introduced, together with the dynamics of productions specified by the behavioral rules (1), then the evolution of these variables is governed by the following four-dimensional map T in the phase space \(\left( x_{1},x_{2},r,m\right) \in A\subseteq {\mathbb {R}}_{+}^{2}\times \left[ 0,1\right] \times {\mathbb {R}}\), where \({\mathbb {R}}_{+}=\left[ 0,+\infty \right) \):

$$\begin{aligned} T:\left\{ \begin{array}{l} x_{1}\left( t+1\right) =H_{1}\left( x_{1}(t),x_{2}(t),r(t)\right) \\ x_{2}\left( t+1\right) =H_{2}\left( x_{1}(t),x_{2}(t),r(t)\right) \\ r\left( t+1\right) =R(r(t),m(t))=\frac{r\left( t\right) }{r\left( t\right) +\left( 1-r\left( t\right) \right) e^{-\beta m(t)}}\\ m(t+1)=M(x_{1}(t+1),x_{2}(t+1),r(t),m(t))\\ \quad =\left( 1-\omega \right) \left( \pi _{1}\left( t+1\right) -\pi _{2}\left( t+1\right) \right) +\omega m(t) \end{array}\right. \end{aligned}$$
(9)

Here, we assume that the set of strategies available to the oligopolists is a non-empty compact and convex set of \({\mathbb {R}}^{2}\) and each firm’s profit is concave in its own strategy; then by the results in [28] a Nash equilibrium exists. Following [10], we consider behavioral rules that are stationary at any symmetric Nash equilibrium of the underlying game. Stationary behavioral rules prescribe playing the Nash equilibrium quantity whenever the quantities played are at the Nash equilibrium, regardless of the distribution of behavioral rules, that is

$$\begin{aligned} x^{*}=H_{i}\left( x^{*},x^{*},r\right) ;\quad i=1,2 \quad \forall r\in \left[ 0,1\right] . \end{aligned}$$

From (9) we get that an equilibrium distribution of behavioral rules is compatible only with the following three cases: (1) \(r=0\); (2) \(r=1\); (3) \(m=0\). It is important to stress that \(r=0\) and \(r=1\) are invariant sets for the dynamics, on which only one pure strategy is employed (strategy 2 or strategy 1, respectively). On these sets all the firms follow the same behavioral rule, and the sets may include Nash equilibria which are also dynamic equilibria of the model without memory, characterized by \(x_{1}^{*}=x_{2}^{*}=x^{*}\) and \(r=0\) or \(r=1\). Thus, since \(\pi _{1}^{*}=\left[ p\left( Nx^{*}\right) -C\left( x^{*}\right) \right] x^{*}-K_{1}\) and \(\pi _{2}^{*}=\left[ p\left( Nx^{*}\right) -C\left( x^{*}\right) \right] x^{*}-K_{2}\), from the fourth equation in (9) we have that at equilibrium the following condition holds: \(m^{*}=\pi _{1}^{*}-\pi _{2}^{*}=-K_{1}+K_{2}\). Such equilibria of map (9) are given by

$$\begin{aligned} E_{0}=\left( x^{*},x^{*},0,K_{2}-K_{1}\right) \end{aligned}$$

and

$$\begin{aligned} E_{1}=\left( x^{*},x^{*},1,K_{2}-K_{1}\right) \end{aligned}$$

Moreover, interior equilibria can exist with both behavioral strategies adopted by given fractions of the population of firms, given by

$$\begin{aligned} E_{r}=\left( x^{*},x^{*},r^{*},0\right) . \end{aligned}$$

Now we turn to the Jacobian matrix of map (9) to characterize the local stability of equilibria. Notice that the fourth equation in (9) is the only one that depends on all state variables:

$$\begin{aligned}&m(t+1)=M(H_{1}\left( x_{1}(t),x_{2}(t),r(t)\right) ,\\&H_{2}\left( x_{1}(t),x_{2}(t),r(t)\right) ,m(t)) \end{aligned}$$

The Jacobian matrix J associated with (9) at an equilibrium E always has entries \(J_{31}(E)=J_{32}(E)=J_{14}(E)=J_{24}(E) =0\).

From the analysis of the model without memory (see [10]), if quantities are at a Nash equilibrium \(\left( x^{*},x^{*}\right) \) and \(r=0\), we also have \(J_{21} (E_{0})=J_{13}(E_{0})=J_{23}(E_{0})=0\).

Moreover, \(J_{34}(E_{0})=\frac{\partial R}{\partial m(t)} =\frac{e^{-\beta m}r(1-r)\beta }{\left( e^{-\beta m}(1-r)+r\right) ^{2}}=0\) (since \(r=0\); the same holds for \(r=1\)) and \(J_{43}(E_{0})=\frac{\partial M}{\partial r(t)} =\frac{\partial M}{\partial x_{1}(t+1)}\frac{\partial x_{1}(t+1)}{\partial r(t)} +\frac{\partial M}{\partial x_{2}(t+1)}\frac{\partial x_{2}(t+1)}{\partial r(t)}=0\) because \(\frac{\partial x_{1}(t+1)}{\partial r(t)}=\frac{\partial x_{2}(t+1)}{\partial r(t)}=0\).

Summing up, at equilibrium \(E_{0}\) the Jacobian assumes the following structure:

$$\begin{aligned} J(E_{0})=\left( \begin{array}{cccc} J_{11} &{} J_{12} &{} 0 &{} 0\\ 0 &{} J_{22} &{} 0 &{} 0\\ 0 &{} 0 &{} J_{33} &{} 0\\ J_{41} &{} J_{42} &{} 0 &{} \omega \end{array}\right) \end{aligned}$$

from which we obtain that the eigenvalues are the entries on the main diagonal, since an eigenvalue \(\mu \) solves the following characteristic equation

$$\begin{aligned} \mathrm{det}(J(E_{0})-\mu I)&=\mathrm{det}\left( \begin{array}{cccc} J_{11}-\mu &{} J_{12} &{} 0 &{} 0\\ 0 &{} J_{22}-\mu &{} 0 &{} 0\\ 0 &{} 0 &{} J_{33}-\mu &{} 0\\ J_{41} &{} J_{42} &{} 0 &{} \omega -\mu \end{array}\right) \\&=\left( J_{11}-\mu \right) \left( J_{22}-\mu \right) \left( J_{33}-\mu \right) \left( \omega -\mu \right) \\&=0 \end{aligned}$$

Moreover, the same holds at \(E_{1}\) where \(r=1\), as in that case \(J_{12}(E_{1})=0\) and \(J_{21}(E_{1})\ne 0\), so that the Jacobian is lower triangular. Thus, at \(E_{0}\) and \(E_{1}\), the eigenvalues are the same as in the three-dimensional model where no memory is present, with the additional eigenvalue \(\omega \), which is always in the range [0, 1].

As stressed in [5], from an economic point of view this fact has an obvious interpretation: in a deterministic evolutionary setting, missing behaviors/strategies cannot appear as they cannot be imitated by anyone. However, when only one behavioral strategy is present, the exogenous introduction of a mutation in agents’ behavior may either spread over the population or be reabsorbed. This phenomenon can be confirmed through the study of transverse stability of the attractors on the invariant manifolds \(r=0\) and \(r=1\).

In general, an attractor on one of these invariant sets of phase space may be transversely stable, so that it attracts trajectories starting outside the restriction, i.e., from \(r\left( 0\right) \in \left( 0,1\right) \); otherwise, if the attractor of the restriction on the invariant set is transversely unstable, then any mutation, even quite small, will spread inside the phase space.

Finally, at equilibrium \(E_{r}\) the Jacobian matrix assumes a particular structure, from which we obtain that eigenvalues solve the following characteristic equation

$$\begin{aligned} \mathrm{det}(J(E_{r})-\mu I)&=\mathrm{det} \begin{pmatrix} J_{11}^{\prime }-\mu &{} J_{12}^{\prime } &{} 0 &{} 0\\ J_{21}^{\prime } &{} J_{22}^{\prime }-\mu &{} 0 &{} 0\\ 0 &{} 0 &{} J_{33}^{\prime }-\mu &{} J_{34}^{\prime }\\ J_{41}^{\prime } &{} J_{42}^{\prime } &{} 0 &{} \omega -\mu \end{pmatrix}\\&=[(J_{11}^{\prime }-\mu )(J_{22}^{\prime }-\mu )-J_{21}^{\prime } J_{12}^{\prime }]\\&\qquad (J_{33}^{\prime }-\mu )(\omega -\mu )=0. \end{aligned}$$

3 A specific example

We now extend the model proposed in [10], where an evolutionary oligopoly game is studied, by adding memory to behavioral decisions, according to (6). Goods are homogeneous and are sold in a market characterized by isoelastic (unitary) inverse demand: the selling price at time t is then \(p\left( t\right) =\frac{1}{Q(t)}\). The specific example studied in [10] involves two different behavioral rules for selecting productions, namely the Gradient rule (G-rule), which determines a time-t production denoted by x(t), and the Local Monopolistic Approximation rule (LMA-rule), which leads to a time-t production denoted by y(t); see [2] for details in the non-evolutionary setting. In [10], the dynamic selection of those rules based on the exponential replicator is studied. Here, we extend that framework by adding a memory term, as described in the previous section. As a result, we obtain the following four-dimensional map \(T:A\longrightarrow A\)

$$\begin{aligned} T:\left\{ \begin{array}{l} x\left( t+1\right) =\max \left\{ 0,x\left( t\right) +\lambda x \left( t\right) \left( \frac{Q_{-1}\left( t\right) }{(x\left( t\right) +Q_{-1}\left( t\right) )^{2}}-c\right) \right\} \\ y\left( t+1\right) =\max \left\{ 0,(1-\alpha )y\left( t\right) +\frac{\alpha }{2}\left[ y\left( t\right) +\frac{N}{N-1}Q_{-1} \left( t\right) \left( 1-c\frac{N}{N-1}Q_{-1} \left( t\right) \right) \right] \right\} \\ r\left( t+1\right) =\frac{r\left( t\right) }{r\left( t\right) +\left( 1-r\left( t\right) \right) e^{-\beta m(t)}}\\ m(t+1)=\left( 1-\omega \right) \left( \pi _{G}\left( t+1\right) -\pi _{\mathrm{LMA}}\left( t+1\right) \right) +\omega m(t) \end{array}\right. \end{aligned}$$
(10)

where \(A={\mathbb {R}}_{+}^{2}\times \left[ 0,1\right] \times \left( -\infty ,+\infty \right) \). The quantity \(Q_{-1}\left( t\right) \) appearing in (10) is obtained through naive expectations (5) and is given by

$$\begin{aligned} Q_{-1}\left( t\right) =(N-1)\left[ r\left( t\right) x\left( t\right) +(1-r(t))y(t)\right] \end{aligned}$$

Here, \(Q_{-1}\left( t\right) \) represents the total expected quantity produced by all competitors of a generic firm at time t (aggregate production by \(N-1\) players).

The first equation in (10) is the behavioral rule of gradient learning: a generic firm employing this rule increases/decreases its production by factor \(\lambda >0\) (speed of adjustment of the G-rule) if it perceives a profit increment by that decision, see [2] for details. Similarly, the second equation in (10) represents the LMA-rule: a generic firm employing this rule selects its quantity with speed of adjustment \(0<\alpha \le 1\) by solving a profit maximization problem with a linear approximation of the demand function and ignoring the effects of the competitors’ outputs on own profits, see again [2] for details. The third equation in (10) models the share r(t) of firms employing the gradient rule (with the \(1-r(t)\) firms selecting LMA-rule at time t) according to exponential replicator dynamics. The fourth equation in (10) introduces memory in the evolutionary selection of behavioral rules, according to (6) and \(m(t)=U_{G}-U_{\mathrm{LMA}}\). Finally, in (10) \(\pi _{i}\left( t\right) \), \(i\in \left\{ G,\mathrm{LMA}\right\} \), are, respectively:

$$\begin{aligned} \pi _{G}\left( t\right)&=p\left( t\right) x\left( t\right) -\left( cx\left( t\right) +K_{G}\right) \nonumber \\&=\left( \frac{N-1}{NQ_{-1}\left( t\right) }-c\right) x\left( t\right) -K_{G}\nonumber \\ \pi _{\mathrm{LMA}}\left( t\right)&= p\left( t\right) y\left( t\right) -\left( cy\left( t\right) +K_{L}\right) \nonumber \\&=\left( \frac{N-1}{NQ_{-1}\left( t\right) }-c\right) y\left( t\right) -K_{L} \end{aligned}$$
(11)
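A single iteration of map (10), with the profits (11), can be sketched as follows. This is only an illustrative reading of the equations, not the code used for the simulations in this paper: the function name and argument order are hypothetical, the profits at time \(t+1\) are evaluated with \(Q_{-1}(t+1)\) computed from the updated state (an assumption about the timing), and no guard is included for the degenerate case of zero total output.

```python
import math


def step(x, y, r, m, N, c, lam, alpha, beta, omega, K_G, K_L):
    """One iteration of map (10); returns (x', y', r', m')."""
    # Aggregate output of the N - 1 competitors under naive expectations
    q_rest = (N - 1) * (r * x + (1.0 - r) * y)

    # First equation of (10): gradient rule
    x_new = max(0.0, x + lam * x * (q_rest / (x + q_rest) ** 2 - c))

    # Second equation of (10): LMA rule
    q_hat = N / (N - 1) * q_rest
    y_new = max(0.0, (1.0 - alpha) * y
                + 0.5 * alpha * (y + q_hat * (1.0 - c * q_hat)))

    # Third equation of (10): exponential replicator for the share of G-firms
    r_new = r / (r + (1.0 - r) * math.exp(-beta * m))

    # Profits (11) at time t + 1 (assumed to use the updated state)
    q_rest_new = (N - 1) * (r_new * x_new + (1.0 - r_new) * y_new)
    margin = (N - 1) / (N * q_rest_new) - c
    pi_G = margin * x_new - K_G
    pi_LMA = margin * y_new - K_L

    # Fourth equation of (10): memory of the fitness difference
    m_new = (1.0 - omega) * (pi_G - pi_LMA) + omega * m
    return x_new, y_new, r_new, m_new


# Illustrative parameter values; the information costs are made up
N, c = 15, 0.1
x_star = (N - 1) / (c * N ** 2)
state = step(x_star, x_star, 0.4, 0.0, N, c, 1.0, 0.2, 1.0, 0.5, 0.02, 0.02)
```

Starting from symmetric productions equal to \(\frac{N-1}{cN^{2}}\) with equal information costs and \(m=0\), the sketch returns the same state up to rounding error, so this point is a fixed point of the map as implemented here.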

Here, we study the model to analyze the role of memory, measured by the parameter \(\omega \in \left[ 0,1\right] \). The model with no memory considered in [10], corresponding to (10) with \(\omega =0\), exhibits periodic and chaotic behaviors both in the time patterns of outputs \(\left( x(t),y(t)\right) \) and in the evolution of the behaviors’ share inside the agents’ population. We now examine the effects of the presence of memory, \(\omega \in \left( 0, 1\right] \), starting from its influence on the stability of the Nash equilibrium and then on other kinds of attracting sets.

Equilibria with symmetric productions by the two rules are of the form \((x^{*},x^{*},r^{*},m^{*})\). These equilibria are solutions of the following algebraic system:

$$\begin{aligned} {\left\{ \begin{array}{ll} x^{*}\left[ \frac{(N-1)x^{*}}{(Nx^{*})^{2}}-c\right] =0\\ x^{*}=Nx^{*}(1-cNx^{*})~\text {or}~\alpha =0\\ r^{*}\left( \frac{1}{r^{*}+(1-r^{*})e^{-\beta m^{*}}}-1\right) =0\\ \pi _{G}^{*}-\pi _{L}^{*}=m^{*} \end{array}\right. } \end{aligned}$$

so we obtain

$$\begin{aligned} {\left\{ \begin{array}{ll} x^{*}=y^{*}=\frac{N-1}{cN^{2}}\\ r^{*}=0,~r^{*}=1\ \text { or }\forall r^{*}~\text {if}~m^{*}=0 \quad (\text {case }\pi _{G}=\pi _{L})\\ m^{*}=K_{L}-K_{G}\text { or}~\forall m^{*}~\text {if}~\omega =1 \end{array}\right. } \end{aligned}$$

where \(x^{*}=y^{*}=\frac{N-1}{cN^{2}}\) is the Nash equilibrium quantity, as both the G-rule and the LMA-rule admit the Nash equilibrium as a fixed point of the dynamics. Summing up, for any parameter configuration, we obtain the following equilibria, all characterized by Nash quantity play:

$$\begin{aligned} E_{0}=\left( \frac{N-1}{cN^{2}},\frac{N-1}{cN^{2}},0,K_{L}-K_{G}\right) \end{aligned}$$
(12)

with no firm choosing the gradient rule (\(r^{*}=0\)),

$$\begin{aligned} E_{1}=\left( \frac{N-1}{cN^{2}},\frac{N-1}{cN^{2}},1,K_{L}-K_{G}\right) \end{aligned}$$
(13)

with no firm choosing the LMA-rule (\(r^{*}=1\)); when \(K_{L}=K_{G}\) (equal information costs) a continuum of equilibria \(E_{r}\) exists with \(r\in \left[ 0,1\right] \), given by

$$\begin{aligned} E_{r}=\left( \frac{N-1}{cN^{2}},\frac{N-1}{cN^{2}},r,0\right) \end{aligned}$$
(14)

with coexistence of both behavioral rules. We next provide a local stability analysis for these fixed points.

3.1 Local stability analysis

Let \(\omega <1\), so that in equilibrium it is \(m^{*}=K_{L}-K_{G}\). At the equilibrium \(E_{0}=(\frac{N-1}{cN^{2}},\frac{N-1}{cN^{2}},0, K_{L}-K_{G})\) the Jacobian matrix assumes the form

$$\begin{aligned} J(E_{0})=\begin{pmatrix} 1-2\frac{c\lambda }{N} &{} \frac{c\lambda }{N}(2-N) &{} 0 &{} 0\\ 0 &{} 1-\frac{\alpha }{2}(N-1) &{} 0 &{} 0\\ 0 &{} 0 &{} e^{\beta (K_{L}-K_{G})} &{} 0\\ \gamma _{0} &{} \eta _{0} &{} 0 &{} \omega \end{pmatrix} \end{aligned}$$
(15)

where \(\gamma _{0}=\frac{\partial m^{\prime }}{\partial x}|_{r=0}\) and \(\eta _{0}=\frac{\partial m^{\prime }}{\partial y}|_{r=0}\), with \(m^{\prime }\) denoting \(m(t+1)\). From (15), as shown before, the eigenvalues are the entries on the main diagonal:

$$\begin{aligned} \mu _{01}&=1-2\frac{c\lambda }{N},\mu _{02}=1-\frac{\alpha }{2}(N-1),\nonumber \\ \mu _{03}&=e^{\beta (K_{L}-K_{G})},\mu _{04}=\omega \end{aligned}$$
(16)

Thus, \(E_{0}\) is stable if the following conditions hold:

  1. \(\lambda <\frac{N}{c}\);
  2. \(\forall \alpha \) if \(2\le N\le 5\); otherwise \(\alpha <\frac{4}{N-1}\);
  3. \(K_{L}<K_{G}\).

Clearly, quantity dynamics can be destabilized only through a flip bifurcation for sufficiently high speeds of adjustment of the G- or LMA-rules. Condition \(K_{L}<K_{G}\) has an immediate economic meaning: for selecting Nash behavior firms always choose the rule with lower information cost, which is the LMA-rule in this case, whereas it will be the G-rule in the case of equilibrium \(E_{1}\), as shown below.
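For reference, the thresholds in conditions 1 to 3 can be recovered by requiring each eigenvalue in (16) to lie strictly inside the unit circle. The following routine derivation is a sketch that uses only \(c>0\), \(\lambda >0\), \(\alpha >0\), \(\beta >0\) and \(\omega <1\):

$$\begin{aligned} \left| \mu _{01}\right|<1&\iff -1<1-2\frac{c\lambda }{N}<1\iff 0<\lambda<\frac{N}{c},\\ \left| \mu _{02}\right|<1&\iff -1<1-\frac{\alpha }{2}(N-1)<1\iff 0<\alpha<\frac{4}{N-1},\\ \mu _{03}<1&\iff \beta \left( K_{L}-K_{G}\right)<0\iff K_{L}<K_{G}. \end{aligned}$$

In the first two lines, the eigenvalue crosses \(-1\) when the bound is violated, which is the flip bifurcation mentioned above; the third eigenvalue is always positive, so only the value \(+1\) can be crossed.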

Now we turn to equilibrium \(E_{1}=(\frac{N-1}{cN^{2}}, \frac{N-1}{cN^{2}}, 1,K_{L}-K_{G})\) where all players play strategy G. The Jacobian matrix is

$$\begin{aligned} J(E_{1})=\begin{pmatrix} 1-c\lambda &{} 0 &{} 0 &{} 0\\ \frac{\alpha }{2}(2-N) &{} 1-\frac{\alpha }{2} &{} 0 &{} 0\\ 0 &{} 0 &{} e^{-\beta (K_{L}-K_{G})} &{} 0\\ \gamma _{1} &{} \eta _{1} &{} 0 &{} \omega \end{pmatrix} \end{aligned}$$
(17)

where \(\gamma _{1}=\frac{\partial m^{^{\prime }}}{\partial x}|_{r=1}\) and \(\eta _{1}=\frac{\partial m^{^{\prime }}}{\partial y}|_{r=1}\). The eigenvalues of (17) are then

$$\begin{aligned} \mu _{11}&=1-c\lambda ,\mu _{12}=1-\frac{\alpha }{2},\nonumber \\ \mu _{13}&=e^{-\beta (K_{L}-K_{G})},\mu _{14}=\omega . \end{aligned}$$
(18)

So, \(E_{1}\) is stable if the following conditions hold:

  1. \(\lambda <\frac{2}{c}\);
  2. \(K_{L}>K_{G}\).

Hence, as anticipated, in this case the G-rule is chosen by all the firms due to its lower information cost.

Let us now consider \(E_{r}=(\frac{N-1}{cN^{2}}, \frac{N-1}{cN^{2}},r,0)\) with \(r\in (0,1)\), i.e., both behavioral strategies are present. In this case, the Jacobian matrix is

$$\begin{aligned} J(E_{r})=\begin{pmatrix} 1+\frac{c\lambda }{N}\left( r\left( 2-N\right) -2\right) &{} \frac{c\lambda }{N}\left( 1-r\right) \left( 2-N\right) &{} 0 &{} 0\\ \frac{\alpha }{2}r(2-N) &{} 1+\frac{\alpha }{2}\left[ (1-r)(2-N)-1\right] &{} 0 &{} 0\\ 0 &{} 0 &{} 1 &{} \beta r(1-r)\\ \gamma _{r} &{} \eta _{r} &{} 0 &{} \omega \end{pmatrix} \end{aligned}$$

So, in any case, we have an eigenvalue equal to 1 and another one is equal to \(\omega \in \left[ 0,1\right] \). The other two eigenvalues are those associated with the quantity dynamics, i.e., to the submatrix

$$\begin{aligned} Z=\begin{pmatrix} 1+\frac{c\lambda }{N}\left( r\left( 2-N\right) -2\right) &{} \frac{c\lambda }{N}\left( 1-r\right) \left( 2-N\right) \\ \frac{\alpha }{2}r(2-N) &{} 1+\frac{\alpha }{2}\left[ (1-r)(2-N)-1\right] \end{pmatrix} \end{aligned}$$

for which the usual conditions for local asymptotic stability can be applied (see [10]).
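The conditions are not spelled out here. A common choice for a \(2\times 2\) matrix, assumed in the sketch below, is the set of Jury (Schur-Cohn) inequalities on the trace and determinant: \(1-\mathrm{tr}Z+\det Z>0\), \(1+\mathrm{tr}Z+\det Z>0\) and \(1-\det Z>0\). The function names and parameter values are hypothetical.

```python
def z_entries(r, N, c, lam, alpha):
    """Entries (z11, z12, z21, z22) of the submatrix Z at a given share r."""
    z11 = 1.0 + c * lam / N * (r * (2 - N) - 2)
    z12 = c * lam / N * (1.0 - r) * (2 - N)
    z21 = 0.5 * alpha * r * (2 - N)
    z22 = 1.0 + 0.5 * alpha * ((1.0 - r) * (2 - N) - 1)
    return z11, z12, z21, z22


def z_is_stable(r, N, c, lam, alpha):
    """True if both eigenvalues of Z lie strictly inside the unit circle.

    Uses the Jury conditions for a 2x2 matrix (an assumed reading of
    'the usual conditions' in the text).
    """
    z11, z12, z21, z22 = z_entries(r, N, c, lam, alpha)
    trace = z11 + z22
    det = z11 * z22 - z12 * z21
    return (1.0 - trace + det > 0.0
            and 1.0 + trace + det > 0.0
            and 1.0 - det > 0.0)


# Illustrative check with made-up parameter values
stable_mixed = z_is_stable(0.5, N=15, c=0.1, lam=1.0, alpha=0.2)
```

Setting \(r=1\) or \(r=0\) in `z_entries` reproduces the upper-left blocks of (17) and (15), respectively, which gives a quick consistency check on the entries.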

When only one eigenvalue is equal to 1 (with the others being less than 1 in modulus), we can study the system restricted to the center manifold, that is, the one associated with the eigenvector of eigenvalue 1. In our case, this eigenvector is simply \(v_{1}=(0,0,r,0)\), and, therefore, the system is reduced to \(r(t+1)=r(t)\), so that equilibrium \(E_{r}\) is stable (though not asymptotically, since it belongs to a continuum of equilibria).

What is interesting for our study is that memory plays no role in the local stability properties of the equilibrium points. However, as we shall argue in the next section, local stability analysis is not enough: the global dynamic properties of the attractors are influenced by the presence of memory, and both their quantitative and qualitative structure depend on the memory parameter \(\omega \), although not in an elementary way, in the sense that the role of memory is not easy to anticipate or interpret.

3.2 Numerical simulations

In this section, we present several numerical examples to show the possible dynamic scenarios arising in the evolutionary oligopoly with memory. The numerical examples described below are representative of the possible cases and share the same Nash equilibrium in quantities, which depends only on the number of firms and the marginal cost of production. Thus, the various examples are obtained by changing the speeds of adjustment of the two behavioral rules, their information costs or the amount of memory in the system.

Let us start by considering a case in which equilibrium \(E_{1}\) in (13) for the map (10) is locally asymptotically stable. Take, for instance, the following parameter constellation:

$$\begin{aligned}&N=15;c=0.1;\lambda =1;\alpha =0.2;\beta =1;K_{G}<K_{L};\\&\omega \in \left[ 0,1\right] \end{aligned}$$

According to the previous stability analysis, since \(K_{G}<K_{L}\) and all eigenvalues are less than 1 in modulus, the cheapest behavioral rule (G-rule) will eventually be employed by all firms. Moreover, at \(E_{1}\) firms will eventually learn to produce the Nash equilibrium quantity, which, under this setting, is given by \(x^{*}=y^{*}=\frac{N-1}{cN^{2}}=0.6\overline{2}\).
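As a quick check of these numbers under the stated parameter values, the Nash quantity and the first two eigenvalues in (18) work out as follows:

$$\begin{aligned} x^{*}=y^{*}&=\frac{N-1}{cN^{2}}=\frac{14}{0.1\cdot 225}=\frac{14}{22.5}=0.6\overline{2},\\ \mu _{11}&=1-c\lambda =1-0.1=0.9,\qquad \mu _{12}=1-\frac{\alpha }{2}=1-0.1=0.9, \end{aligned}$$

while \(\mu _{13}=e^{-\beta (K_{L}-K_{G})}<1\) because \(K_{L}>K_{G}\), and \(\mu _{14}=\omega \), which is strictly inside the unit circle only for \(\omega <1\).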

Now let us explore the effect of increasing the speed of adjustment of the LMA-rule to, say, \(\alpha =0.4\). Clearly, as long as \(K_{G}<K_{L}\) we will still observe convergence to equilibrium \(E_{1}\) (the stability of \(E_{1}\) is independent of \(\alpha \)). However, if the increase in the speed of adjustment \(\alpha \) is combined with an increase in the cost of information for the G-rule up to the point at which \(K_{G}=K_{L}\), then interesting dynamic phenomena are observed. This is a consequence of the fact that for \(K_{G}>K_{L}\), equilibrium \(E_{0}\), in which all firms use the LMA-rule, is unstable, due to the high speed of adjustment \(\alpha \) of the LMA-rule, and equilibrium \(E_{1}\) is unstable as well, due to the high information cost of the G-rule. This situation is depicted in the bifurcation diagram of Fig. 1, where \(N=15\); \(c=0.1\); \(\lambda =1;\alpha =0.4;\beta =1\), \(\omega =0\) (no memory) and the bifurcation parameter is \(K=K_{L}-K_{G}\in \left( -0.1,0.01\right) \).

Fig. 1

Bifurcation diagram for \(K=K_{L}-K_{G}\in \left( -0.1,0.01\right) \), \(N=15;c=0.1;\lambda =1;\alpha =0.4;\beta =1\) and \(\omega =0\): (a) quantities x by a representative G-player (red points) and quantities y by a representative LMA-player (black points); (b) share of agents r employing the G-rule; (c) fitness m of the G-rule with respect to that of the LMA-rule

As the cost difference K increases, chaos disappears and gives way to periodic time patterns; as K is increased further, a symmetric Nash equilibrium becomes the unique attractor. At the same time, the fraction r(t) of G-firms changes from a negligible share, when \(K=0\) and the system is chaotic, to an increasing share when the dynamics are periodic, and it becomes dominant when \(K>0\), i.e., \(K_{G}<K_{L}\), where the Nash equilibrium is the unique attractor.

Now we explore the effects of increasing memory in the selection process of behavioral rules. First of all, we observe that the presence of memory for \(K<0\) does not eliminate the fluctuations in the production of G-firms (red points) and LMA-firms (black points). In general, we observe larger oscillations in production for LMA-firms; moreover, these oscillations remain in a neighborhood of the Nash equilibrium quantity \(x^{*}=y^{*}\). Production by G-firms is characterized by oscillations of smaller amplitude than those of LMA-firms, but these oscillations are always well above both the Nash equilibrium production and the quantities produced by the firms that use the LMA-rule. This effect persists even when a large amount of memory is added to the system; see Figs. 1a, 2a and 3a, where the values of the memory parameter are, respectively, \(\omega =0\), \(\omega =0.95\) and \(\omega =0.99\). Another clear effect of more memory in the system is the dampening of fitness oscillations, as shown by comparing Figs. 1c, 2c and 3c.
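The dampening of fitness oscillations can be illustrated with a stylized recursion. The sketch below assumes that the fitness is updated as a recursive discounted average, \(m(t+1)=\omega m(t)+(1-\omega )d(t)\), where \(d(t)\) is the current payoff difference. This normalization is an assumption made here for illustration and may differ from the exact form used in map (10), and the alternating input signal is hypothetical rather than generated by the model.

```python
def smoothed_fitness(payoff_diffs, omega, m0=0.0):
    """Assumed recursion m(t+1) = omega * m(t) + (1 - omega) * d(t)."""
    m = m0
    path = []
    for d in payoff_diffs:
        m = omega * m + (1.0 - omega) * d
        path.append(m)
    return path

def late_amplitude(path, tail=100):
    """Peak-to-peak range of the last `tail` values, after transients have decayed."""
    window = path[-tail:]
    return max(window) - min(window)

# Hypothetical input: a payoff difference that alternates between +1 and -1.
signal = [1.0 if t % 2 == 0 else -1.0 for t in range(2000)]

amplitudes = {
    omega: late_amplitude(smoothed_fitness(signal, omega))
    for omega in (0.0, 0.95, 0.99)
}
print(amplitudes)
```

For this period-2 input the asymptotic peak-to-peak amplitude of m is \(2(1-\omega )/(1+\omega )\), so it shrinks steadily as \(\omega \) approaches 1. In the full model the payoff differences themselves depend on the state, so this sketch only isolates the averaging effect.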

As remarked before, from an analytical point of view the presence of memory has no effect on the stability of equilibria, since the fourth eigenvalue of the Jacobian is precisely \(\omega \). However, the presence of memory changes the structure of the attractor and in some cases makes it more complex (compare, for example, the stable 2-cycle in Figs. 1 and 2 with the chaotic attractor of Fig. 3 for \(K=-0.01\)). A very particular dynamic effect can be seen in Fig. 3, where the presence of memory causes a sudden transition to chaos in a range of the cost difference K where, in the model with a lower amount of memory, the periodic patterns smoothly evolve into convergence to the stable Nash equilibrium.

Fig. 2

As Fig. 1 with \(\omega =0.95\)

Fig. 3

As Fig. 1 with \(\omega =0.99\)

Fig. 4

Bifurcation diagram for \(\omega \in \left( 0.998,1\right) \), \(N=15;c=0.1;\lambda =1;\alpha =0.3;\beta =1\); \(K=K_{L}-K_{G}=-0.1\): (a) quantities x by a representative G-player (red points) and quantities y by a representative LMA-player (black points); (b) share of agents r employing the G-rule; (c) fitness m of the G-rule with respect to that of the LMA-rule

Fig. 5

As Fig. 4 with \(\alpha =0.4\)

Fig. 6

Bifurcation diagram for \(K=K_{L}-K_{G}\in \left( -0.075,0.01\right) \), \(N=15;c=0.1;\lambda =1.5;\alpha =0.5;\beta =1\) showing the quantities x by a representative G-player (red points) and quantities y by a representative LMA-player (black points) with (a) \(\omega =0\) and (b) \(\omega =0.9\)

Let us now explore in more detail the effect of a change in the amount of memory in the system. To this end, consider the previous parameter setting with \(K=K_{L}-K_{G}=-0.1\) and \(\alpha =0.3\), with the memory parameter \(\omega \) taken as the bifurcation parameter.

Figure 4 depicts the quantities played by G-firms and LMA-firms after the flip bifurcation at \(\mu _{02}=-1\) as \(\omega \) is increased (for graphical purposes we represent only cases with a sufficiently high level of memory \(\omega \)). G-firms always choose a more aggressive behavior, selecting production quantities along a periodic cycle that always lies above the cycle of production by LMA-firms, that is, \(x\left( t\right) >y\left( t\right) \). Interestingly, the quantities of LMA-firms, although cyclical, always remain in a given neighborhood of the Nash equilibrium quantity. The behavior of G-firms lowers overall selling prices and thus overall profits; however, G-firms get higher profits from selling the goods, as it always holds that \(\left( p\left( t\right) -c\right) x\left( t\right) >\left( p\left( t\right) -c\right) y\left( t\right) \). The higher profits of G-firms are counterbalanced by the higher information costs of the G-rule. As a result, the share of G-firms converges to zero for a sufficiently low level of memory \(\omega \). Thus, almost all firms tend to use, asymptotically, the cheaper LMA-rule, under which the dynamics fail to converge since \(\mu _{02}=-1.1\). Finally, when \(\omega \) is slightly less than one, the dynamics converge to the Nash equilibrium quantity. The stabilization of the equilibrium is obtained for a sufficiently high level of memory \(\omega \) in the system and involves, in this example, more firms using the gradient rule.
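The value \(\mu _{02}=-1.1\) quoted above can be recovered, as a rough consistency check, by setting \(r=0\) in the submatrix Z. This assumes that \(\mu _{02}\) denotes the eigenvalue of the LMA quantity dynamics at \(E_{0}\), whose Jacobian is not reproduced in this section. For \(r=0\) the matrix Z is upper triangular, so its eigenvalues are its diagonal entries, and the second one gives

$$\begin{aligned} \mu _{02}&=1+\frac{\alpha }{2}\left[ (2-N)-1\right] =1-\frac{\alpha (N-1)}{2},\\ \mu _{02}\big |_{N=15,\,\alpha =0.3}&=1-0.15\cdot 14=-1.1,\qquad \mu _{02}=-1\iff \alpha =\frac{4}{N-1}=\frac{2}{7}\approx 0.286. \end{aligned}$$

Under the same assumption, \(\alpha =0.4\) gives \(\mu _{02}=-1.8\), so the LMA-rule alone is also unstable in the setting of Fig. 5.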

A similar scenario occurs if the speed of adjustment \(\alpha \) is further increased; see, for instance, Fig. 5, where all parameters are the same as in Fig. 4 but the speed of adjustment of the LMA-rule has been increased to \(\alpha =0.4\). We again observe the stabilizing effect of memory, as Nash equilibrium play is resumed for a sufficiently high level of memory \(\omega \) in the system.

Another interesting dynamic property of the model is related to the nonnegativity constraints on production in the two behavioral rules, implemented through the max operator in the first two equations of (10). Here, particular phenomena are related to border collision bifurcations, which occur when one behavioral rule (the LMA one, which prescribes lower production) hits the zero boundary, so that transitions between periodic cycles and chaotic attractors can be detected without the classical cascades of flip bifurcations. For instance, Fig. 6 exhibits the production patterns of the two behavioral rules without memory (Fig. 6a, with \(\omega =0\)) and with memory (Fig. 6b, with \(\omega =0.9\)), with all other parameters set to \(c=0.1;\lambda =1.5\); \(N=15;\beta =1;\alpha =0.5\) and the bifurcation parameter \(K\in \left( -0.075,0.01\right) \). Here, it is interesting to observe the existence of small windows with chaotic dynamics in Fig. 6b, which are detected only when enough memory is present.

As a final comment, we can say that the role of memory is not univocal: high values of the memory parameter \(\omega \) lead to stability, but intermediate values of \(\omega \) may introduce complex dynamic patterns such as chaotic attractors instead of periodic ones.

4 Conclusions

We have proposed a modification of the evolutionary model studied in [10] by introducing a memory term that allows us to consider a fitness measure based on accumulated profits instead of current profits only. The introduction of similar forms of memory has already been considered by several authors in economic modeling as a more realistic assumption, see, e.g., [23]. As shown in [1], an increasing memory, i.e., a larger weight given to past realizations, may have a stabilizing effect. This idea is partially confirmed in the model studied in this paper, in the sense that, starting from a situation of chaotic dynamics without memory, we can retrieve stability for sufficiently high values of the memory parameter, i.e., by considering as fitness measure a uniform average of the profits gained in the past. However, for intermediate values of the memory parameter, i.e., when discounted averages of past profits with exponentially fading weights are considered, dynamic scenarios can be obtained which are even more complex than those observed without memory.

So, even though we have analytically proved that the memory parameter plays no role in the local stability properties of the Nash equilibrium, the numerical explorations of the global dynamic properties show that the presence of memory in the fitness measure can have important consequences for the global time patterns observed and, in general, for the long-run dynamics. This clearly shows the importance of a global analysis of nonlinear dynamical systems, which can often be performed only through heuristic methods obtained by a combination of analytical, geometrical and numerical tools. In fact, a study limited to the analysis of local stability and bifurcations, being based on the linear approximation of the model around the equilibrium points, may sometimes be quite incomplete and even misleading.