Dynamic Stackelberg duopoly with sticky prices and a myopic follower

In this paper, we study a model of a market with asymmetric information and sticky prices: the dynamic Stackelberg model with a myopic follower and an infinite time horizon of Fujiwara ("Economics Bulletin" 12(12), 1–9 (2006)). We perform a comprehensive analysis of the equilibria instead of concentrating on the steady state only. We study the equilibria for both the open loop and feedback information structures, which turn out to coincide, and we compare the results with those for Cournot-Nash equilibria.

However, all of the models mentioned above focus on the market structure of Cournot oligopoly, in which firms produce a homogeneous product and have entirely symmetric information. The analogous model with information asymmetry, in which one of the firms (the leader) takes into account how the other firm (the follower) reacts to its strategy, is called the Stackelberg model.
In static games, both the idea and the solution of the Stackelberg problem are relatively simple, and the informational advantage can be easily interpreted as a generalized first mover advantage: the sequence of moves with the leader as the first mover, or a binding declaration of the leader about his choice of strategy before the choice of the follower (in which case the actual sequence of moves does not matter). Moreover, as a sequential optimization, the problem has a solution under only upper semicontinuity and compactness assumptions, unlike the Nash equilibrium problem, which also requires the existence of a fixed point. In differential games and, more generally, dynamic games, this generalized first mover advantage may either be required at each stage of the game or concern the declaration of the leader's strategy for the entire game before the first move and, depending on the information structure of the game, additional assumptions may be required. The situation is simple for the open loop information structure, when the strategies of the players are functions of time only, because then the standard definition applies. Conversely, a problem appears for the feedback information structure, with strategies (called feedback or Markov perfect) dependent on the current state. In the latter case, there are two extensions of the Stackelberg equilibrium. One of those concepts corresponds to the leader being the first mover at each stage. The other one, called the global Stackelberg equilibrium, describes the situation in which the leader declares his feedback strategy before the game and the follower best responds to it. This approach is either equivalent to using threat strategies in order to enforce the global maximum of the leader's payoff, or it requires imposing additional assumptions on the leader's strategy.
The reason is that calculating the best response of the follower to every possible leader's strategy and then optimizing the leader's payoff with the resulting follower's best response is ill-posed, since the best response to discontinuous strategies ceases to exist even in nice problems. Therefore, some a priori constraints on the class of the leader's strategies are imposed: it is usually assumed that the leader's strategy is linear. Nevertheless, it is enough to consider a class of functions defined by several real parameters. The resulting problem of the leader, however, is no longer a standard optimal control problem,

but it becomes a usual finite-dimensional optimization, and the resulting strategy may be suboptimal if the leader's optimal control after the declaration does not belong to the assumed class of functions, so the solution may not be time-consistent. For deeper insight, see e.g. Başar and Olsder (1998) or Haurie et al. (2012) for the general theory, and Martín-Herrán and Rubio (2021) for rare cases of coincidence of those two classes of Stackelberg equilibria with state-dependent information structure.
Similarly, the open loop Stackelberg equilibria, which are simpler to derive, are usually not subgame-perfect, and it often turns out that the leader has an incentive to change the declared strategy after the follower chooses his best response to it.
Generally, solving the feedback Stackelberg problem is analytically very complicated, and restricting attention to time-consistent, subgame-perfect solutions makes it substantially more complicated, especially if the strategy sets are constrained, which even in linear quadratic problems with linear constraints leads to only piecewise linear solutions. It can be expected that the best response of the follower to a leader's strategy that is piecewise linear with k pieces may be piecewise linear with more than k pieces. This makes the leader's optimization problem piecewise-linear-quadratic with more than k pieces.
The class of linear quadratic problems with linear constraints has been extensively studied for resource extraction problems for common or interrelated renewable resources sold at a common market, known also as productive asset oligopolies. Inherent constraints in linear quadratic problems, like nonnegativity of the state variable and control or the constraint by admissibility of the resource, may lead to numerous problems for Nash equilibria. Examples of such problems are as follows: the value function is piecewise quadratic with infinitely many pieces for some parameters (Singh and Wiszniewska-Matyszkiel 2018); the problem is intractable in the standard way, the solution is not even piecewise linear, and all the symmetric Nash equilibria are discontinuous (Singh and Wiszniewska-Matyszkiel 2019). Some difficulties may appear even in optimal control problems of this kind, where the solution is piecewise linear with infinitely many pieces and the standard undetermined coefficient method returns a control far from the unique optimum. Nevertheless, such complications do not always have to happen in this kind of problems: there is a sequence of works with piecewise linear dynamics in which this problem does not appear at a Nash equilibrium: Benchekroun (2008), Benchekroun et al. (2020), Vardar and Zaccour (2020), or at a Stackelberg equilibrium: Colombo and Labrecciosa (2019).
It is worth emphasizing that, as has been proven in Wiszniewska-Matyszkiel et al. (2015), in the Cournot oligopoly with sticky prices, the strategies of the players in a feedback equilibrium are only piecewise linear with two pieces, and the same applies to best responses to linear feedback strategies. So, as can be expected, the typical way of defining the global feedback Stackelberg equilibrium, in which calculating the best response of the follower is restricted to linear strategies of the leader only and the leader's equilibrium strategy is indeed linear, cannot lead to a time-consistent global feedback Stackelberg equilibrium. Moreover, assuming a two-piece piecewise linear strategy of the leader results in only piecewise linear dynamics in the follower's problem. Thus, a three-piece piecewise linear best response can be expected, and it cannot be a priori excluded that the best response of the leader has more than two pieces. So, the global Stackelberg problem becomes extremely complicated.
Therefore, various simplifications of the dynamic Stackelberg equilibrium are considered. One of them is a model in which the leader's informational advantage is increased by the fact that the less informed follower is also myopic.
The first work with an attempt to capture the sticky price dynamics in a market with asymmetric information of this type is Fujiwara (2006), proposing a Stackelberg duopoly model with a myopic follower who expects immediate price adjustment. To the best of our knowledge, the subject has not been continued in the published literature. In Fujiwara (2006), the calculations are restricted only to finding the steady state of the open loop equilibrium (i.e. the information structure in which the strategies are functions of time only, not price) and the results are not fully proven. So, a natural step is to complete that analysis.
In this paper, we perform a complete analysis of both open loop and feedback form of the leader's strategies in the model proposed by Fujiwara and we obtain interesting phenomena.
We compare our results with those for analogous market with Cournot duopoly structure derived in Wiszniewska-Matyszkiel et al. (2015).

Formulation of the model
We consider a differential game with 2 players, producers of the same good. Products of both producers are perceived by consumers as identical. Each of the firms has the same quadratic cost function C(q_i) = cq_i + q_i²/2, where c is some positive constant and q_i ≥ 0 denotes the production of the i-th player.
The market is described by the inverse demand function p̂(q₁, q₂) = A − q₁ − q₂ (1). However, the price does not adjust immediately; its behaviour is defined by the differential equation ṗ(t) = s(p̂(q₁(t), q₂(t)) − p(t)) (2), where s > 0 measures the speed of adjustment and A is some positive constant substantially greater than c, which can be interpreted as the market capacity.

So, it is natural to consider the resulting problem as a differential game with players maximizing Π_i(q₁, q₂) = ∫₀^∞ e^{−rt} (p(t)q_i(t) − C(q_i(t))) dt, where q_i(t) is the decision at time t. The above problem has been extensively studied in the literature (see the introduction). In this paper, we want to consider a serious asymmetry between the players. Firstly, only the leader (player 1) is far-sighted: he knows the dynamics of price and his aim is to maximize Π₁ (3), where r > 0 is a discount factor, while p is defined by (2).
The follower (player 2) is assumed to be myopic, and at each time instant he behaves like in the static Stackelberg duopoly. Hence, given a decision of the leader q₁(t), he chooses q₂(t) maximizing his instantaneous profit p̂(q₁(t), q₂(t))q₂(t) − C(q₂(t)), as in Fujiwara (2006).
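The follower's behaviour can be illustrated numerically. The sketch below assumes the standard primitives of this strand of the literature (inverse demand A − q₁ − q₂, quadratic cost C(q) = cq + q²/2, sticky price dynamics ṗ = s(p̂ − p)); these functional forms are reconstructions consistent with the coefficients appearing later in the text, not formulas quoted verbatim from the paper:

```python
# Sketch of the myopic follower's behaviour under the assumed primitives
# (inverse demand A - q1 - q2, cost C(q) = c*q + q**2/2): reconstruction.

def follower_best_response(q1, A=100.0, c=10.0):
    """Myopic follower: maximize (A - q1 - q2)*q2 - c*q2 - q2**2/2 over q2 >= 0.
    First-order condition: A - q1 - 2*q2 - c - q2 = 0  =>  q2 = (A - c - q1)/3."""
    return max(0.0, (A - c - q1) / 3.0)

def simulate_price(q1_path, p0, A=100.0, c=10.0, s=1.0, dt=1e-3):
    """Euler integration of the sticky price dynamics p' = s*(A - q1 - q2 - p),
    with the follower always best responding to the leader's current output."""
    p = p0
    for q1 in q1_path:
        q2 = follower_best_response(q1, A, c)
        p += dt * s * (A - q1 - q2 - p)
    return p
```

With the leader idle, the follower supplies (A − c)/3 and the price converges to (2A + c)/3, i.e. 70 for A = 100, c = 10.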
There may be several reasons for the myopia of the less sophisticated player. The two most obvious ones are related to the stronger position of the leader. The first one is when the leader is an established firm at the market and there are unrelated follower entrants at separate time instants, each of the entrants existing for one time instant. The same applies if there is only one follower firm, but it is not sure whether it is going to exist in the future. This encompasses, among many other cases, the asymmetry between a fashion firm and a counterfeiter or, in a slightly different approach, a company with fishing rights and a poacher. The other obvious explanation assumes that the leader is the one who dictates prices and the follower just does not know the pricing rules of the leader, so there is partly a problem with distorted information as in Wiszniewska-Matyszkiel (2016) and Wiszniewska-Matyszkiel (2017).
We return to those interpretations in Sect. 5 after stating the results. There is also one more explanation, already examined in the literature: being myopic may be a behavioural choice as in e.g. Benchekroun et al. (2009), which has already been studied in papers on sticky prices, e.g. Liu et al. (2016) and Liu et al. (2017).
We end the formulation of the problem by recalling that the leader knows the way the follower behaves.
We would like to mention that although we write q 1 (t) and q 2 (t) while defining

The behaviour of the follower, the implications for the leader and the static model
Let us consider a time instant t. If we solve the optimization problem of the follower given the decision of the leader q₁(t), we get the best response of the follower q₂^{BR}(q₁(t)) = (A − c − q₁(t))/3 (5) whenever it is positive, which, as we shall see, holds for all reasonable levels of the leader's production. This best response is known to the leader and, therefore, taken as an input into his optimization problem. So, the optimization problem of the leader reduces to the maximization of Π₁ given by (3) with p defined by the price dynamics (2) after substituting the follower's best response (5). We also need the static Stackelberg model with immediate adjustment of prices for comparison with the results of our dynamic game. In the static Stackelberg model, the leader also maximizes Π₁ defined analogously to Eq. (4), and the only difference is in information: the leader knows that the strategy of the follower is the best response to q₁, given by (5). So, the leader's optimization problem is to maximize the resulting reduced payoff. For comparison, we also recall the results for the static Cournot-Nash equilibrium.
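Since the closed-form static benchmarks are dropped from this extract, the following sketch derives them under the assumed primitives (inverse demand A − q₁ − q₂, cost C(q) = cq + q²/2); the values 2(A − c)/7, 5(A − c)/21 and (A − c)/4 are reconstructions, cross-checked numerically, not quotations from the paper:

```python
# Static benchmarks under the assumed primitives (reconstruction):
# inverse demand A - q1 - q2, cost C(q) = c*q + q**2/2.

def static_benchmarks(A=100.0, c=10.0):
    # Stackelberg leader: max over q1 of
    #   ((2*A + c - 2*q1)/3)*q1 - c*q1 - q1**2/2,
    # where (2*A + c - 2*q1)/3 is the price after the follower best responds.
    # FOC: 2*(A - c)/3 - (7/3)*q1 = 0  =>  q1 = 2*(A - c)/7.
    q1_S = 2.0 * (A - c) / 7.0
    q2_S = (A - c - q1_S) / 3.0          # = 5*(A - c)/21
    p_S = A - q1_S - q2_S                # = (10*A + 11*c)/21
    # Symmetric Cournot-Nash: FOC A - 2*q - q - c - q = 0 => q = (A - c)/4.
    q_CN = (A - c) / 4.0
    p_CN = A - 2.0 * q_CN                # = (A + c)/2
    return q1_S, q2_S, p_S, q_CN, p_CN

def leader_profit(q1, A=100.0, c=10.0):
    """Leader's static profit with the follower best responding."""
    q2 = max(0.0, (A - c - q1) / 3.0)
    return (A - q1 - q2) * q1 - c * q1 - q1**2 / 2.0
```

A brute-force grid search over `leader_profit` confirms that q₁ = 2(A − c)/7 is indeed the static Stackelberg leader's optimum.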

The myopic-follower Stackelberg equilibrium for open loop strategies of the leader
We start the analysis with the open loop strategies of the leader, i.e., the strategies of the leader being measurable functions q₁ : ℝ₊ → ℝ₊, directly dependent on time, without any dependence on price. The set of such strategies is denoted by ℚ_OL. In the case when a discontinuity appears, the price adjustment Eq. (7) is required to hold almost everywhere. The reaction of the follower is given by Eq. (5). We apply the necessary conditions given by Theorem 11.
Lemma 1 For the current value Hamiltonian, the following properties hold. By the terminal condition (12), the costate trajectory is calculated backwards, as usual in reasoning based on the Pontryagin maximum principle. In the sequel, in Theorem 4, we transform the conditions (12) and (13) into an initial condition, which is unique given p₀, analogously to the technique of proof used in Wiszniewska-Matyszkiel et al. (2015).
Proof The assumptions of Theorem 11 are fulfilled (see Appendix A.3). Applying the relations of Theorem 11 yields formulae (10) and (11). By the terminal condition given in Theorem 11, I(t) = ∫_t^∞ e^{−rw}e^{−sw}q₁*(w)dw converges absolutely and λ fulfils λ(t) = e^{(r+s)t}I(t). As proven in Appendix A.3, the set of control values that can appear in the optimal control is bounded, so I(t) ≤ const·e^{−(r+s)t} and λ(t)e^{−rt} = e^{st}I(t) ≤ const·e^{−rt}. Thus, λ(t)e^{−rt} → 0 as t → ∞, and λ is nonnegative.
Suppose that λ(t̄) = 0 for some t̄ > 0. Since the integral of a nonnegative function can be zero only if the function is 0 almost everywhere, without loss of generality, the optimal control fulfils q₁(w) = 0 for all w ≥ t̄.
First, we check the case when p(w) > c for some w ≥ t̄. Then, by continuity of trajectories, there exist ε, δ > 0 such that increasing q₁ to ε on some small time interval of length δ increases the leader's payoff. This leads to a contradiction with the optimality of the leader's strategy. Next, we assume that p(w) > c does not hold for any w ≥ t̄. So, p(w) ≤ c for all w ≥ t̄.
We recall that the optimal control fulfils q₁(w) = 0 for all w ≥ t̄ and note that, by the fact that A > c, Eq. (7) with q₁(w) = 0 for w ≥ t̄ implies ṗ(w) > 0 and the unique steady state of Eq. (7) for such q₁ is greater than c. So, the price corresponding to such a q₁ grows to this steady state and the trajectory exceeds c at some finite time, which leads to a contradiction.
To maximize the current value Hamiltonian with respect to q₁, we calculate its zero derivative point and we obtain q₁(t) = p(t) − c − (2/3)sλ(t). Taking into account the nonnegativity constraint, this implies q₁(t) = max{0, p(t) − c − (2/3)sλ(t)}; the two cases define the regions (16) and (17) of the (p, λ) space. Applying Lemma 1 to our problem and substituting the follower's best response (5) yields the following.
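The Hamiltonian itself was dropped from this extract; under the assumed primitives (quadratic cost cq₁ + q₁²/2, sticky price dynamics, and the follower's best response (A − c − q₁)/3 substituted into them), it can be reconstructed as follows. This is a sketch consistent with the zero-derivative point above, not a verbatim quotation:

```latex
% Reconstructed current value Hamiltonian of the leader (assumed primitives):
H(p, q_1, \lambda) = (p - c)\, q_1 - \tfrac{1}{2} q_1^2
  + \lambda\, s \left( \frac{2A + c - 2 q_1}{3} - p \right)
% Maximization over q_1 \ge 0:
\frac{\partial H}{\partial q_1} = p - c - q_1 - \tfrac{2}{3}\, s \lambda = 0
  \;\Longrightarrow\;
  q_1(t) = \max\Bigl\{ 0,\; p(t) - c - \tfrac{2}{3}\, s \lambda(t) \Bigr\}
% Costate equation:
\dot{\lambda}(t) = r \lambda(t) - \frac{\partial H}{\partial p}
  = (r + s)\, \lambda(t) - q_1(t)
```

Note that the costate equation is consistent with the behaviour λ(t) ~ c₂e^{(r+s)t} reported below for the region where q₁ = 0.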

Therefore, the optimality of the leader's strategy implies that the state and the costate variables must fulfil the following system of ODEs: ṗ(t) = s((2A + c − 2q₁(t))/3 − p(t)), λ̇(t) = (r + s)λ(t) − q₁(t), with q₁(t) = max{0, p(t) − c − (2/3)sλ(t)}.
Again, λ has the terminal condition given by Eq. (12), while p has the initial condition given as in Eq. (7). So, we have a backward-forward ODE with a mixed terminal-initial condition, which we are going to transform to a joint initial condition. As the first step to do this, we formulate the following theorem (Fig. 1).
Theorem 2 The point (λ*, p*_OL), lying in the second of the two regions defined in (16) and (17), is the unique steady state of (20), and the corresponding steady state production of the leader is q*_1,OL given by (22). Proof First, we analyse the phase portrait, presented in Fig. 1, to determine the existence of solutions. We can see that for each variable the null-clines are straight lines. As the p-null-cline has a slope smaller than that of the line dividing the (p, λ) space into the two regions, there exists exactly one solution in the positive quadrant and it corresponds to positive leader's production. It is easy to calculate (21) and then (22) by substituting into (15).
In the first region, where the leader's production is zero, the solution of (20) has an explicit form. There, to the right of the stable manifold, as t → ∞, λ(t) asymptotically behaves as c₂e^{(r+s)t} and p(t) as c₁e^{−st}. Therefore, lim_{t→∞} e^{−rt}λ(t) ≠ 0.
We can see from the phase diagram that each solution with the initial condition to the right of the stable manifold eventually enters the first region, so the above reasoning applies also to other trajectories to the right of the stable manifold.
We can also see that for every trajectory with the initial condition to the left of the stable manifold, λ(t) ≤ 0 from some time instant on.
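Since the closed forms (20)–(22) are dropped from this extract, the sketch below reconstructs the steady state of the state-costate system under the assumed primitives and verifies that it zeroes both ODEs; the formula q₁* = 2(A − c)(r + s)/(5r + 7s) is a reconstruction, consistent with the limits reported later in the text (it tends to the static Stackelberg output 2(A − c)/7 as s → ∞):

```python
# Steady state of the reconstructed state-costate system (assumed primitives):
#   p'   = s*((2*A + c - 2*q1)/3 - p),
#   lam' = (r + s)*lam - q1,
#   q1   = max(0, p - c - (2/3)*s*lam).

def open_loop_steady_state(A=100.0, c=10.0, r=0.05, s=1.0):
    q1 = 2.0 * (A - c) * (r + s) / (5.0 * r + 7.0 * s)   # leader's output
    lam = q1 / (r + s)                                    # costate
    p = (2.0 * A + c) / 3.0 - 2.0 * q1 / 3.0              # price
    return p, lam, q1

def residuals(p, lam, q1, A=100.0, c=10.0, r=0.05, s=1.0):
    """Right-hand sides of the ODEs and the FOC gap; all vanish at the steady state."""
    q1_foc = max(0.0, p - c - (2.0 / 3.0) * s * lam)
    p_dot = s * ((2.0 * A + c - 2.0 * q1_foc) / 3.0 - p)
    lam_dot = (r + s) * lam - q1_foc
    return q1_foc - q1, p_dot, lam_dot
```

The steady state lies in the interior region (positive production and positive costate), in line with the theorem above.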

The equilibrium production is given by the formulas in which the constant c₂ is given by (26), p and λ are given by (23) and (24), and λ*, p*_OL and q*_1,OL by (21) and (22).

The equilibrium price level is given by
The steady state (p*_OL, q*_1,OL) is stable with respect to changes of p₀.
Proof We use Theorem 2 and Lemma 3 and solve the set of equations (20) along the stable manifold of the steady state Γ. The initial value λ₀ corresponding to p₀ is uniquely defined by the terminal condition and the condition λ(t) > 0, and it is such that (λ₀, p₀) belongs to the stable manifold of the steady state. ◻ We would like to emphasize that, although the steady state (λ*, p*_OL) is a saddle point, p*_OL and q*_1,OL are stable with respect to changes of the initial condition p₀. This holds because the terminal condition for λ at infinity, together with the positivity condition, implies a unique initial condition λ₀ corresponding to p₀ such that the trajectory is in the stable manifold of the steady state. Besides, the costate variable λ is only an auxiliary variable that has to exist, and it should not be treated in the same way as the actual state variable p.

Feedback strategies
The next problem we want to solve is the optimization of the leader for the feedback information structure, i.e. the problem in which the set of controls of the leader is the set of functions q₁ : ℝ₊ → ℝ₊, with price as the argument, such that Eq. (7) with q₁(t) replaced by q₁(p(t)) has a unique absolutely continuous solution. In the feedback approach, the problem is solved more generally, for arbitrary values of the initial condition. We recall that for the Cournot duopoly, considering the feedback information structure leads to results which are not equivalent to the results of considering the open loop information structure, and even the steady states are not equivalent (see Cellini and Lambertini 2004; Fershtman and Kamien 1987; Wiszniewska-Matyszkiel et al. 2015).
To calculate the myopic-follower Stackelberg equilibrium assuming the feedback form of the leader's strategies, we use the standard sufficient condition based on the Bellman or Hamilton-Jacobi-Bellman (HJB) equation (see e.g. Dockner 2000; Fleming and Soner 2006; Zabczyk 2009), which returns the auxiliary value function, i.e., a function W : ℝ₊ → ℝ such that for every p, W(p) is the optimal payoff of the leader if the initial price is p.
In our case, the sufficient condition for a continuously differentiable function W to be the value function is the Bellman equation (31) for every price p ∈ ℝ₊, with the terminal condition lim sup_{t→∞} e^{−rt}W(p(t)) = 0 for every admissible price trajectory p.
If W is the value function, then every q_{1,F} that, for every price p ∈ ℝ₊, maximizes the rhs of the Bellman equation is an optimal control.
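The Bellman equation itself was dropped from this extract; under the assumed primitives (quadratic cost, sticky price dynamics with the myopic follower's best response substituted), it can be reconstructed as the following sketch, which is consistent with the first-order conditions used in the proofs below but is not a verbatim quotation:

```latex
% Reconstructed Bellman (HJB) equation of the leader (assumed primitives):
r\, W(p) = \max_{q_1 \ge 0} \Bigl[ (p - c)\, q_1 - \tfrac{1}{2} q_1^2
  + W'(p)\, s \Bigl( \frac{2A + c - 2 q_1}{3} - p \Bigr) \Bigr]
% Interior maximizer (the candidate feedback strategy):
q_{1,F}(p) = \max\Bigl\{ 0,\; p - c - \tfrac{2}{3}\, s\, W'(p) \Bigr\}
```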

Theorem 5 The value function of this optimization problem is given by (38)
and it is nonnegative, increasing, continuous and continuously differentiable.
The feedback optimal solution q_{1,F} is given by (35) and is strictly increasing whenever positive. The corresponding price trajectory is given by (36) and (37). The price p̄ is the price at which the leader starts production, while t̄ is the time instant at which the optimal price trajectory originating from p₀ attains the level p̄.
Proof The methodology of finding the value function is analogous to Wiszniewska-Matyszkiel et al. (2015). First, we solve the analogous dynamic optimization problem without the constraint on q₁. We assume that W(p) = (α/2)p² + βp + γ for some α, β, γ. We substitute the leader's production maximizing the right hand side of (31), and this leads to solving a quadratic equation. Simple calculations yield (34) and another solution differing by a plus sign before the square root in α. We choose the one with the minus sign, because the other solution results in q₁ decreasing in price and negative above p̄, so we cannot treat it as a good candidate for the optimal strategy. Next, we return to the initial problem. Up to this moment, we have not considered the constraint q₁ ≥ 0. We calculate the price p̄ at which q₁ given by (39) is equal to zero.
We replace q₁(p) by 0 for p ≤ p̄ and we proceed to prove that the modified strategy is optimal.
The resulting trajectory of price for p₀ < p̄, as long as p stays in this set, is as follows.
Let t̄ be the time needed for the price described by Eq. (42) to reach the level p̄. Such t̄ exists because, for p < p̄, the price is increasing, since ṗ(t) > 0, and an immediate transformation of (42) proves that it is as in (37).
If the modified q₁ given by (35) is the optimal control, then the value function is as in (38). In order to check sufficiency, besides checking the Bellman equation and the terminal condition, we have to prove that W, defined by W(p) = W₊(p) for p ≥ p̄ and W₋(p) otherwise, is continuously differentiable. Continuity of W is straightforward as W₊(p̄) = W₋(p̄). We focus on proving the continuity of the derivative, which is not obvious at p̄.
Since the Bellman equation is fulfilled at p̄ and W is continuous, the one-sided derivatives of W coincide at p̄. To prove monotonicity and nonnegativity, we first prove that (W₊)′ is positive for p ≥ p̄. Since α > 0, (W₊)′ is strictly increasing, so it is enough to check it at p̄. W₋ is positive since it is a discounted value of W₊(p̄), which is positive, since it is equal to its derivative at this point multiplied by the positive constant (s/r)·(2A + c − 3p̄)/3. Analogously, (W₋)′(p̄) is positive, since it is equal to W₋(p̄) multiplied by the positive constant (r/s)·3/(2A + c − 3p̄). Consider an arbitrary control q₁ and the corresponding trajectory p. Nonnegativity of W implies that lim sup_{t→∞} W(p(t))e^{−rt} ≥ 0. Noting that if p(t) > A, then p′(t) < 0 until p(t) ≤ A implies that lim sup_{t→∞} W(p(t))e^{−rt} ≤ 0, which completes the proof that W is the value function.
Since q_{1,F} is defined as the zero-derivative point of the maximized function on the rhs of the Bellman equation whenever this point is nonnegative, and zero otherwise, and the maximized function is strictly concave in q₁, it is the optimal control.
Next, we calculate the trajectory corresponding to the optimal control, determined by the price equation whenever p(t) ≥ p̄. So, if p₀ > p̄, then the whole trajectory is given by this formula, while if we consider a trajectory with p₀ < p̄ which reaches p̄ at time t̄, then it applies for t ≥ t̄.

◻

Open loop and feedback myopic-follower equilibria, limitations of the model and self-verification of the follower's false beliefs in the game
After solving both problems, we present the solutions graphically. In Fig. 2, we present productions of both firms, compared to their static Stackelberg equilibrium strategies and the static Cournot-Nash equilibrium production level, while in Fig. 3, the equilibrium price compared to static Stackelberg and Cournot-Nash price.
As we can see, the open loop and feedback solutions coincide. This is not only a property of a specific set of data, but a general principle. It is different from the situation observed for the analogous dynamic Cournot-Nash equilibrium, in which feedback strategies along the corresponding trajectory are larger than the open loop strategies, with the opposite inequality for prices, as has been proven in Wiszniewska-Matyszkiel et al. (2015). This coincidence in our model is a result of the myopia of the follower. We can formally write the following theorem.

Theorem 6 The open loop and feedback myopic-follower Stackelberg equilibrium trajectories coincide, the open loop optimal strategy of the leader coincides with his optimal feedback strategy along this trajectory, and the same applies to the follower's best response.
Proof By substitution of the constants from Eqs. (34) and (37) into the trajectory from Eq. (36) and meticulous simplification, and then by substitution of those constants and the trajectory into q_{1,F} and simplification.
The last inequality in (46) simplifies to A > c; the first one has already been proven in Theorem 2. ◻ As we can see from Fig. 2, for small prices, until t̄, the leader does not produce, waiting for the price to grow. On the same time interval, the follower has maximal production. Afterwards, the leader's production continuously increases, while the follower's production decreases. They intersect and converge to their steady states, with the steady state of the leader above his static Stackelberg equilibrium level, and the steady state of the follower below his static Stackelberg equilibrium level. When prices are considered, the steady state of the dynamic equilibrium price is below the static Stackelberg equilibrium price, which can also be confirmed by analytic calculations.
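The coincidence stated in Theorem 6 can be checked numerically. The sketch below reconstructs the quadratic ansatz W(p) = αp²/2 + βp + γ by coefficient matching in the HJB equation (choosing the root with the minus sign before the square root, as in the proof of Theorem 5) and verifies that the feedback steady state production equals the open loop one. All closed forms here are reconstructions under the assumed primitives (quadratic cost, sticky prices, myopic follower), not formulas quoted from the paper:

```python
import math

# Quadratic value function ansatz W(p) = a*p**2/2 + b*p + g for the leader's
# reconstructed HJB equation (assumed primitives):
#   r*W(p) = max_{q1>=0} [ (p-c)*q1 - q1**2/2 + W'(p)*s*((2*A+c-2*q1)/3 - p) ].

def feedback_solution(A=100.0, c=10.0, r=0.05, s=1.0):
    # Matching the p**2 coefficients gives
    #   (4/9)*s**2*a**2 - (10*s/3 + r)*a + 1 = 0;
    # the minus-sign root keeps the feedback strategy increasing in p.
    B = 10.0 * s / 3.0 + r
    a = (B - math.sqrt(B * B - 16.0 * s * s / 9.0)) / (8.0 * s * s / 9.0)
    k = 1.0 - 2.0 * s * a / 3.0                       # slope of q1 in p
    b = (s * a * (2.0 * A + c) / 3.0 - k * c) / (r + s + 2.0 * s * k / 3.0)
    d = c + 2.0 * s * b / 3.0                         # intercept: q1 = k*p - d
    g = (d * d / 2.0 + s * b * (2.0 * A + c) / 3.0) / r
    q1 = lambda p: max(0.0, k * p - d)                # feedback strategy
    W = lambda p: a * p * p / 2.0 + b * p + g
    Wp = lambda p: a * p + b
    return q1, W, Wp, k, d

def hjb_residual(p, A=100.0, c=10.0, r=0.05, s=1.0):
    """HJB residual at price p; zero wherever the ansatz is exact."""
    q1, W, Wp, _, _ = feedback_solution(A, c, r, s)
    q = q1(p)
    return r * W(p) - ((p - c) * q - q * q / 2.0
                       + Wp(p) * s * ((2.0 * A + c - 2.0 * q) / 3.0 - p))
```

Because the leader's problem with a myopic follower is a single-agent optimal control problem, the Pontryagin (open loop) and HJB (feedback) solutions produce the same trajectory; numerically, the feedback steady state output agrees with 2(A − c)(r + s)/(5r + 7s).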

Another interesting thing that can be seen from Fig. 2 is that the model does not behave as one would expect for p close to and below c (to make it visible, we started from p₀ close to the minimal marginal cost c). While the leader does not produce, the follower, observing the leader's production, has a large constant level of production. This leads us to the concept of self-verification. The fact that, instead of the total payoff in the dynamic game, the follower maximizes only the expected current payoff may be caused by two different reasons.
1. The less realistic explanation is that the leader has already been at the market, while there are multiple unrelated follower firms (entrants) who do not know the leader's pricing strategy, i.e., sticky prices with speed of price adjustment s. Each of those follower firms exists for only one time instant, at most one at each time instant. After obtaining a profit lower than expected, each follower firm resigns. And the profit is lower than expected because, if the leader offers a lower price, then the entrant has to decrease his price, too.
2. The follower is not conscious that his current choice influences the future price. In this case, we have a game with distorted information, in which players have some beliefs on how their current decision influences future aggregates and values of the state variable, not necessarily consistent with reality. Depending on whether the beliefs are deterministic (realizations regarded as possible versus those impossible) or stochastic (a probability distribution on future realizations), there are two corresponding concepts of belief distorted Nash equilibrium, introduced in Wiszniewska-Matyszkiel (2016) and Wiszniewska-Matyszkiel (2017), respectively. A part of those concepts is self-verification of the equilibrium profiles, which, briefly speaking, means that the beliefs influence the behaviour of the players in such a way that the beliefs cannot be falsified by subsequent play. Moreover, the correct current value of the state variable and the opponent's behaviour are a part of both equilibrium concepts.
For steady-state initial prices, the follower's belief of no influence on future prices is self-verifying, and the current leader's behaviour and price are guessed correctly. For lower prices, the beliefs do not have this property, which is especially visible for initial prices close to c. This suggests that the analysis of a dynamic optimization model with sticky prices cannot be restricted to the steady state only, and that further studies are required to derive a model that behaves as expected also for small values of the initial price. Moreover, we would like to emphasize a limitation of sticky price models that is usually not emphasized and therefore not perceived, since the focus is usually on their nice mathematical behaviour. The economic justification for introducing the first sticky price models was the fact that prices below the static equilibrium level are often observed at real world markets, and this is obtained by a gradual increase of prices. Such a situation in a model can happen only if the initial market price does not exceed the steady state price (which in our case is slightly below the static Stackelberg equilibrium price).
This assumption is not needed in any of the mathematical results and their proofs, which hold for arbitrary positive initial price.
Nevertheless, in economics, the reverse situation is unrealistic. As we can see from the dynamics of price for the equilibrium strategies in various sticky price models, above the steady state price there is a permanent excess supply.
The sticky prices approach is related to the behaviour of a producer facing excess demand but constrained by e.g. menu costs. In reality, permanent excess supply and the resulting need to dispose of the excess amount of product would cause a qualitative change in the behaviour of the producers, e.g. an immediate reduction of the price in order to sell the excess product.
So, if the initial price is above the steady state, which can happen if e.g. an entrant suddenly appears at a previously monopolistic market, an immediate reduction of the price by the ex-monopolist can be expected. So, after the reduction, there will be no excess supply, and the new initial price for the sticky price dynamics will not exceed the steady state price.

Dependence on the speed of price adjustment
An interesting question is how the equilibria depend on the speed of price adjustment s. In Fig. 4, we compare production levels for two different values of s. We can see that increasing s results in the leader switching on production earlier and in faster growth of production at the beginning, but later convergence to a lower steady state. The opposite relations apply to the production of the follower. An analogous comparison of the price for various s in Fig. 5 reveals that this anomaly of production trajectories is not strong enough to cause anomalies in prices: the price at each time instant is a strictly increasing function of s.

The asymptotic values of the equilibria
In many previous works, e.g. Fershtman and Kamien (1987), Cellini and Lambertini (2004) and Wiszniewska-Matyszkiel et al. (2015), it has been proven that the feedback Cournot-Nash equilibrium does not converge to the static Cournot-Nash equilibrium when s → ∞ , which corresponds to immediate price adjustment. So, an interesting question is what happens in our model as s tends to its limits, especially when s → ∞.
First, we recall the form of the steady states for the feedback equilibrium.
Next, we compute their limits as s → 0 and, finally, their limits as s → ∞. For s < ∞, p*_OL = p*_F < p^SB, q*_1,OL = q*_1,F > q^SB_1 and q*_2,OL = q*_2,F < q^SB_2. As we can see, all the values converge to their static Stackelberg analogues as the speed of adjustment tends to infinity, i.e. to immediate adjustment.
In Fig. 6, we present the steady state of productions of both firms, while in Fig. 7, the steady state of price. As we can see, the steady state production of the leader is decreasing in s and it converges to the static Stackelberg leader production from above, with the opposite inequalities for the follower, and the steady state price is increasing in s and it converges to the static Stackelberg price from below.
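The monotonicity and the limits described above can be sketched numerically. The closed forms below are reconstructions under the assumed primitives (they reproduce the static Stackelberg limits 2(A − c)/7, 5(A − c)/21 and (10A + 11c)/21 as s → ∞), not formulas quoted from the paper:

```python
# Steady states as functions of the adjustment speed s (reconstructed closed
# forms, consistent with the limits and monotonicity reported in the text).

def steady_state(s, A=100.0, c=10.0, r=0.05):
    q1 = 2.0 * (A - c) * (r + s) / (5.0 * r + 7.0 * s)   # leader's output
    q2 = (A - c - q1) / 3.0                              # follower's response
    p = A - q1 - q2                                      # steady state price
    return q1, q2, p

# As s -> infinity (immediate adjustment), the static Stackelberg values
# q1 -> 2*(A - c)/7, q2 -> 5*(A - c)/21, p -> (10*A + 11*c)/21 are recovered.
```

The leader's steady state output is strictly decreasing in s, while the follower's output and the price are strictly increasing, matching the figures described above.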

Comparison to the Cournot model
Last but not least, we want to compare our results with the results for the Cournot oligopoly case. The complete results for the Cournot model with sticky prices have been derived in Wiszniewska-Matyszkiel et al. (2015). We do not cite the exact values of the constants; we only present the comparison graphically in Figs. 8 and 9, with a zoomed view of an initial time interval for better readability.
As we can see, at the myopic-follower Stackelberg equilibrium, the leader starts production later than the Cournot competitors in the feedback case and slightly before the Cournot competitors in the open loop case. Afterwards, his production first grows more slowly than in both Cournot cases, then faster and, after intersecting the open loop equilibrium strategy twice and the feedback equilibrium strategy once, it converges to a larger steady state. The myopic-follower Stackelberg price first grows more slowly, but afterwards it intersects the feedback Cournot price trajectory and converges to a steady state between the steady states of the feedback and open loop Cournot equilibrium prices.

Conclusions
In this paper, we have extensively studied the model of a dynamic Stackelberg type duopoly at a market with price stickiness in which the follower is myopic, first proposed and partially studied by Fujiwara (2006), called the myopic-follower Stackelberg model. In this model, we have obtained convergence to a stable steady state with the price and the follower's production below, and the leader's production above, their static Stackelberg levels. However, an interesting result can be observed for low initial prices, when the leader's production is below the myopic follower's production: if the initial price is low enough, the leader initially waits in order to increase it, while the follower produces maximally. This waiting period is longer than for the feedback Cournot equilibrium. Interesting anomalies can be observed as the speed of adjustment changes, but the limits as it tends to infinity are equal to the static Stackelberg counterparts. Besides, unlike in the Cournot model, the open loop and feedback solutions coincide. The results of this paper concerning the behaviour of the follower for small prices show that the analysis of a dynamic game model with sticky prices cannot be restricted to the steady state only, and they suggest that further studies are required to derive a model that behaves as expected also for small values of the initial price.
Therefore, an analogous analysis with a different model of the follower's behaviour, observing the price rather than the leader's behaviour, is an obvious future continuation of this paper. In such a case, we can introduce more myopic followers, being price takers, which results in "a cartel and a fringe" models (see e.g. Groot et al. 2003 or Benchekroun and Withagen 2012 for applications of such differential game models).

Appendix A: Open loop - existence of an optimal solution and appropriate necessary conditions for the infinite horizon optimal control problem
In this section, we formulate necessary conditions, analogous to the core relations of the Pontryagin maximum principle for the finite time horizon, for the case of the infinite time horizon.
We consider an optimal control problem with the state space being an open convex set X ⊆ ℝ^n , the set of control parameters U ⊆ ℝ^m and the open loop information structure, so, consequently, the set of open loop control functions U OL = {u ∶ ℝ + → U measurable} . As the objective we consider maximisation of

J_{0,x_0}(u) = ∫_0^∞ g(t, x(t), u(t)) e^{−rt} dt,

where the trajectory x is the trajectory corresponding to u and it is defined by

ẋ(t) = f(t, x(t), u(t)), x(0) = x_0, (48)

the discount rate is r > 0 , and the integration denotes integration with respect to the Lebesgue measure.
Obviously, the state space is assumed to be an invariant set of Eq. (48) for every control function u.
We assume a priori that the functions g and f are such that the objective function is finite for every u ∈ U OL and the corresponding trajectory x.
An absolutely continuous function x, defined on ℝ + with values in the state space, which is a solution to the system (48) with u ∈ U OL , is called the (admissible) trajectory corresponding to u.
We denote this dynamic optimization problem by (P). In all further results we assume that both the state space and the set of control parameters are nonempty, the set of control parameters is compact, and the functions f, with values in ℝ^n , and g, with values in ℝ , defined on the product of ℝ + , the state space and the set of control parameters, are measurable.
Any pair (u, x) , where u is a control and x is an admissible trajectory corresponding to it, is called an admissible solution.
A pair (u * , x * ) is called an optimal solution of the problem (P) if it is an admissible solution, and the value of J 0,x 0 (u * ) is maximal, that is J 0,x 0 (u) ≤ J 0,x 0 (u * ) for every admissible solution (u, x).
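For intuition about problem (P), the objective J_{0,x_0}(u) can be approximated by truncating the horizon at a large T and Euler-stepping the state equation; discounting makes the neglected tail small. The sketch below uses a toy problem with a known closed-form value (dynamics ẋ = −x, payoff g = x², r = 1, so J = x_0²/3), not the duopoly model:

```python
import math

def discounted_objective(u, x0, f, g, r, T=30.0, dt=1e-3):
    """Approximate J_{0,x0}(u) = integral of g(t, x(t), u(t)) e^{-rt}
    over [0, infinity) by forward Euler on [0, T]; for T large
    relative to 1/r the truncated tail is negligible."""
    x, t, J = x0, 0.0, 0.0
    for _ in range(int(T / dt)):
        J += g(t, x, u(t)) * math.exp(-r * t) * dt
        x += f(t, x, u(t)) * dt
        t += dt
    return J

# Toy problem (not the duopoly): x' = -x with u identically 0,
# running payoff g = x^2, discount r = 1.  Then x(t) = x0 e^{-t}
# and J = x0^2 * integral of e^{-3t} = x0^2 / 3.
J = discounted_objective(u=lambda t: 0.0, x0=1.0,
                         f=lambda t, x, u: -x,
                         g=lambda t, x, u: x * x, r=1.0)
assert abs(J - 1.0 / 3.0) < 1e-2
```

The same truncation idea underlies numerical experiments with infinite horizon problems: the a priori finiteness assumption on J guarantees that such approximations converge as T grows.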

A.1 Aseev and Veliov extension of the Pontryagin maximum principle
Here we cite the Pontryagin maximum principle for the problem (P), which is an infinite horizon, non-autonomous, discounted dynamic optimization problem. As has been mentioned before (see Section 3), the maximum principle, especially the terminal condition lim t→∞ λ(t) e −rt = 0 , need not hold in such a problem.
Results that are applicable in this paper have been proved by Aseev and Veliov (2012). First, we formulate three suitable assumptions. Consider the dynamic optimization problem (P) and let (u * , x * ) be an optimal solution to it.
(A1) For almost all t ≥ 0 and every (x, u) in the product of the state space and the set of control parameters, the partial derivatives f x (t, x, u) and g x (t, x, u) exist. The functions f and g and their partial derivatives with respect to x are Lebesgue-Borel measurable in (t, u) for every x , continuous in x for almost every t ≥ 0 and every fixed u, and uniformly bounded as functions of t over every bounded set of (x, u).

(A2) There exist a continuous function on [0, ∞) with values in [0, ∞) and a locally integrable function such that the corresponding growth bound holds for almost all t ≥ 0.

(A3) There exist a number δ > 0 and a nonnegative integrable function on [0, ∞) such that for every initial state y with ‖y − x 0 ‖ < δ , Eq. (48) with u = u * and the initial condition replaced by x(0) = y , has a solution on [0, ∞) , denoted by x y , and this solution fulfils the corresponding bound for a.e. t.

The formulation of the necessary conditions uses the current value Hamiltonian H(t, x, u, λ) = g(t, x, u) + λᵀf(t, x, u) and an adjoint variable λ.

Definition 8
For an admissible solution (u * , x * ) , an absolutely continuous function λ ∶ ℝ + → ℝ n is called an adjoint (or costate) variable corresponding to (x * , u * ) , if it is a solution to the following system

λ̇(t) = rλ(t) − H x (t, x * (t), u * (t), λ(t)).

Definition 9
We say that an admissible pair (x * , u * ) , together with an adjoint variable λ * corresponding to (x * , u * ) , satisfies the core relations of the normal-form Pontryagin maximum principle for the problem (P), if the following maximum condition holds for a.e. t on [0, ∞) :

H(t, x * (t), u * (t), λ * (t)) = max_u H(t, x * (t), u, λ * (t)),

where the maximum is taken over the set of control parameters.
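The core relations can be verified by hand on a scalar linear-quadratic toy problem (illustrative, not the duopoly): for max ∫_0^∞ e^{−rt}(−x² − u²) dt subject to ẋ = u, the current-value Hamiltonian is H = −x² − u² + λu, the optimal feedback is u = −ax with a the positive root of a² + ra − 1 = 0 (obtained from the HJB equation with value function V(x) = −ax²), and the costate is λ = V′(x) = −2ax. The sketch checks the maximum condition and the adjoint equation λ̇ = rλ − H_x at an arbitrary state:

```python
import math

r = 0.5
# Feedback gain: positive root of a^2 + r*a - 1 = 0.
a = (-r + math.sqrt(r * r + 4.0)) / 2.0

def H(x, u, lam):
    """Current-value Hamiltonian H = g + lam * f for the toy
    problem g = -x^2 - u^2, f = u."""
    return -x * x - u * u + lam * u

x = 1.7                      # arbitrary point on the trajectory
u_star = -a * x              # candidate optimal control
lam = -2.0 * a * x           # candidate costate lam = V'(x)

# Maximum condition: the unconstrained maximizer of -u^2 + lam*u
# is u = lam / 2, which must coincide with u*.
assert abs(u_star - lam / 2.0) < 1e-12

# Adjoint equation: lam' = r*lam - H_x = r*lam + 2x must equal
# the derivative of lam along the trajectory, -2a*x' = 2*a^2*x.
lhs = r * lam + 2.0 * x
rhs = 2.0 * a * a * x
assert abs(lhs - rhs) < 1e-12
```

Both identities reduce to the defining quadratic a² + ra − 1 = 0, which is why they hold at every state x, not only at the chosen point.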

A.2 Existence of the optimal solution
We use the existence theorem of Balder (1983, Theorem 3.6), which we cite in a simplified form, previously used in Wiszniewska-Matyszkiel et al. (2015).

A.3 Checking the assumptions of Theorem 10
First, we are going to restrict the state and control sets in a way that does not change the optimal control for realistic initial conditions. If the initial price p 0 is greater than (2A + c)/3 , then every admissible trajectory is contained in (−∞, p 0 ] . Note that whenever the initial price p 0 < A , which we assume in our paper, we can restrict the set of state variables (prices) to (−∞, A] . Next, suppose that at some time t the price is below c . Denote by t̄ the time instant when the price reaches c and consider the time interval [t, t̄] on which the price does not exceed c . Consequently, the leader's instantaneous payoff is nonpositive. Thus, his optimal strategy is q 1 = 0 a.e. on [t, t̄] . It follows that ṗ(t) > 0 on the considered time interval, and so values below c cannot be reached if the initial price is at least c . This implies that if the initial price is in [c, A] , then the whole optimal trajectory of the price remains in [c, A] . So, adding the constraint p ∈ [c, A] on possible prices does not change the optimal control if the initial price is in this interval. Next, let us note that the set of control variables can also be constrained. Considering the leader's instantaneous payoff Π 1 = (p − c)q 1 − (1/2)q 1 ² , with p ∈ [c, A] , we can see that if the leader's production exceeds some sufficiently large q max , then his current payoff becomes negative. Therefore, the optimal control of our problem is equal to the optimal control for the problem with an additional constraint q 1 ∈ [0, q max ].
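The invariance of [c, A] can be illustrated numerically under additional assumptions of ours: we let the leader stay idle (q_1 = 0, the case analysed above for prices near c) and give the follower the price-taking rule q_2 = max(p − c, 0); the parameter values are illustrative. Under these rules, a trajectory started anywhere in [c, A] never leaves the interval and settles at (A + c)/2:

```python
A, c, s = 10.0, 2.0, 1.0     # illustrative constants
dt, steps = 1e-3, 20000

def simulate(p0):
    """Euler integration of p' = s*(A - q1 - q2 - p) with the
    leader idle (q1 = 0) and an assumed price-taking myopic
    follower q2 = max(p - c, 0)."""
    p, path = p0, []
    for _ in range(steps):
        q1 = 0.0
        q2 = max(p - c, 0.0)
        p += s * (A - q1 - q2 - p) * dt
        path.append(p)
    return path

for p0 in (c, (A + c) / 2.0, A):
    path = simulate(p0)
    # The price never leaves [c, A] ...
    assert all(c <= p <= A for p in path)
    # ... and converges to the steady state (A + c)/2 of these rules.
    assert abs(path[-1] - (A + c) / 2.0) < 1e-3
```

For p ≥ c the dynamics reduce to ṗ = s(A + c − 2p), a contraction towards (A + c)/2 ∈ [c, A], which is what the simulation confirms.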
(A1) The functions f , g and their partial derivatives f p , g p are Lebesgue-Borel measurable in (t, p, q 1 ) for every p , continuous in p and uniformly bounded as functions of t , since they are independent of t.