On the dynamic optimality of yardstick regulation

This paper proposes a generalization of Shleifer’s (RAND J Econ 16:319–327, 1985) model of yardstick competition to a dynamic framework. In a differential game setting, we show that the yardstick mechanism effectively replicates the first-best solution if players adopt open-loop behaviour rules and are symmetric at the initial time; in the absence of initial symmetry, the social efficiency is reached only in the asymptotic steady state. On the contrary, if players adopt Markovian behaviour rules, then the yardstick pricing rule cannot achieve the first-best solution along the equilibrium path of any Markov Perfect Nash Equilibrium.


Introduction
Yardstick regulation is a price regulation scheme, initially presented in a theoretical model by Shleifer (1985), according to which the regulated price set for a given firm, which is a monopolist in a market niche, is obtained from the cost structure of similar firms operating in different niches. This mechanism is implemented, in some countries, to regulate prices in healthcare, transport, energy and, more generally, public utilities and services (see, e.g., Schmalensee & Willig, 1989, Part 5; Armstrong & Porter, 2007), with two main aims: to avoid market failures due to the asymmetric information between regulated firms and the regulator, and to provide incentives for less efficient firms to improve their efficiency.
Although the yardstick regulation mechanism was proposed precisely to lead firms to improve their efficiency over time, to the best of our knowledge no simple dynamic version of the model is available in the literature. By 'dynamic version' we mean a model in which a pricing rule is specified for every instant of the period under consideration, so that it is possible to evaluate the dynamic path of the relevant variables: on the one hand, the cost-reducing investments and the corresponding cost levels of regulated firms and, on the other hand, the regulated prices and the demand levels. We aim to fill this gap by presenting a differential game model in which yardstick competition can be introduced. We study the features of such a regulatory mechanism and, in particular, assess whether the repetition of the yardstick pricing rule derived by Shleifer (1985) leads to the first-best outcome also in a dynamic game framework.
Our paper contributes to the somewhat limited literature concerning dynamic aspects of yardstick regulation. Specifically, a generalization of Shleifer's static model is proposed by Faure-Grimaud and Reiche (2006), who explicitly deal with dynamic yardstick mechanisms, focusing on mechanism design in the presence of private information, and explore the limitations of yardstick mechanisms that can justify the use of long-term contracts. The importance of considering truly dynamic models to assess the properties of yardstick competition is also supported by Meya (2015), who observes that real-life applications of yardstick regulation often resort to historical cost data and argues that a firm under yardstick regulation can affect the price it will be allowed to charge in the future if it can influence the current behaviour of competitors. Unlike in these contributions, in our model firms undertake cost-reducing investments, as in Shleifer's seminal paper and other related (static) models. Moreover, rather than considering a finitely (Faure-Grimaud & Reiche, 2006) or infinitely (Meya, 2015) repeated stage-game, the analysis is conducted in a continuous time setting.
On the technical side, our work is thus closer to previous contributions dealing with dynamic regulation in a differential game framework. For instance, in the field of environmental economics, Benchekroun and Van Long (1998) and Menezes and Pereira (2017) consider efficiency-inducing tax or subsidy to correct the production choices of polluting oligopolists; in the field of resource economics, Akao (2008) and Bisceglia (2020) deal with the optimal taxation of common-pool resources. More closely related to our work, Bisceglia et al. (2020) characterise the optimal dynamic volume-based price regulation in the oligopoly model under firms' quality competition proposed by Brekke et al. (2012).
As in these papers, in our model the type of interaction among firms depends on the assumptions concerning the information set available to firms at any point in time. As usual in differential games (see, e.g., Dockner et al., 2000), we distinguish the case in which firms set their plans at the beginning of the game and then stick to them forever (open-loop solution) from the case in which the state of the world is observable at any instant of time, and the firms find the optimal rule connecting the choice variable to the state variables (closed-loop solution). In particular, as far as the latter concept is concerned, we consider Markovian state-feedback behaviour rules, in which the players' choice variables are linked to the current value of the state variables only, and these rules are stable over time.
The consideration of a model in continuous time also requires a specification concerning how we set up yardstick regulation: in Shleifer's (1985) static framework, the regulator announces (and commits to) a pricing rule; then firms make their (one-shot) decision, and the regulator's rule is implemented. In our dynamic setting, the rule has to be interpreted as announced at the outset of the game (i.e., at time 0); then firms make their decisions over time, and the rule is implemented instant by instant.
Following the literature (see, e.g., Karp & Livernois, 1992), we argue that the regulator cannot commit to time-dependent pricing rules and we show that, even in the absence of information asymmetries between the firms and the regulator, the social first-best cannot be implemented by considering the dynamic price regulation problem of each firm separately, i.e., by linking the regulated price of each firm to its own cost only. Starting from this impossibility result, we then consider pricing rules involving yardstick competition among the regulated firms.
Our model shows that the yardstick regulation rule, as designed in the static framework by Shleifer, can replicate the social optimum in the dynamic framework only if firms adopt open-loop behaviour rules. Even then, if regulated firms are asymmetric as far as their initial costs are concerned, the replication of the first-best is limited to the asymptotic steady state. By contrast, if firms are symmetric, the replication of the social first-best also occurs along the whole equilibrium path leading to the steady state. This is because, when players are constrained to stick to a strategy chosen at the beginning of the game, there is no genuinely dynamic interaction, so that the results shown in a static setting carry over to the dynamic game.
By contrast, yardstick regulation fails to replicate the first-best outcome if firms adopt Markovian state-feedback closed-loop behaviour rules. This is due to the fact that, in the absence of regulation, firms are able to react instant by instant to the dynamic evolution of the state variables (namely, the cost levels of each firm) by controlling two choice variables (namely, prices and cost-reducing investments), whereas the regulator controls only one policy instrument (i.e., the regulated price). Nevertheless, in the game with state-feedback information structure, regulated prices inducing yardstick competition are likely to alleviate the static and dynamic inefficiencies that would arise in the absence of regulation, i.e., if each monopolistic firm could freely set its prices.
The remainder of the paper is as follows. Section 2 introduces the basics of the model. Sections 3 and 4 provide the social first-best solution and the optimal solution for a profit-oriented monopolist, respectively. Section 5 deals with preliminary and general aspects concerning price regulation in a dynamic environment. Section 6 studies the features of the yardstick regulation mechanism. Conclusions are in Sect. 7.

The model set-up
As in Shleifer (1985), we consider N identical firms. Each of them serves a market niche, selling one product at a price $p$. The niche demand, denoted by $q(p)$, must satisfy some standard assumptions: (i) there is a finite choke price $\bar p > 0$ such that $q(p) > 0$ for all $p \in [0, \bar p)$ and $q(p) = 0$ for $p \ge \bar p$; and (ii) $q(p)$ is strictly decreasing and twice continuously differentiable over $(0, \bar p)$.
As for the technology, we consider constant marginal costs, denoted by $c(\ell) > 0$, whose amount depends on a variable $\ell \ge 0$ representing the firm's efficiency level. Setting, without loss of generality, fixed costs equal to zero, the marginal cost coincides with the average, or unit, cost. We assume that the unit cost function $c(\cdot) \ge 0$ is twice continuously differentiable, with $c'(\cdot) < 0$ and $c''(\cdot) > 0$ for all $\ell \ge 0$, $\lim_{\ell \to \infty} c(\ell) = \underline{c} > 0$ and $\lim_{\ell \to \infty} c'(\ell) = 0$: a more efficient firm faces lower costs, and the marginal effect of the efficiency level on costs is decreasing and vanishing in the limit $\ell \to \infty$. Moreover, for simplicity we posit $c(0) < \bar p$, so that even the least efficient firm can sell a positive quantity.
At every instant of time, firms can make an investment $I$ aimed at increasing technological efficiency, that is, at reducing the unit cost. An investment $I \ge 0$ entails a cost $\Gamma(I) \ge 0$, which is assumed to be twice continuously differentiable, with $\Gamma'(\cdot) > 0$ and $\Gamma''(\cdot) > 0$ for all $I$.
The instantaneous profit function for each firm is thus
$$\pi = [p - c(\ell)]\, q(p) - \Gamma(I).$$
The efficiency level moves over time according to the following rule, which is clearly inspired by the simple and widely used rule concerning the evolution of the capital stock under investments (see, e.g., Dockner et al., 2000):
$$\dot\ell(t) = \theta I(t) - \delta \ell(t), \tag{1}$$
where $\delta \in (0, 1)$ is a depreciation rate and $\theta > 0$ measures the investments' effectiveness, with an initial value $\ell(0) = \ell_0 > 0$. Following Shleifer (1985), a transfer $T$, covered by a lump-sum tax, may be designed to cover firms' possible losses. However, since $T$ has only redistributive effects, it is immaterial to the first-best allocation. We consider a model with an infinite time horizon and assume that all firms discount future profits at a constant instantaneous rate $r > 0$. For simplicity and in line with a large body of literature, we also consider the same discount rate for the regulator.
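To get a feel for the law of motion of the efficiency level, the following minimal sketch integrates it for a constant investment level. It assumes the linear accumulation rule $\dot\ell = \theta I - \delta\ell$ (the form suggested by the capital-stock analogy and by the steady-state relation $I^* = \delta\ell^*/\theta$ used later); the function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def efficiency_path(ell0, invest, theta, delta, t):
    """Closed-form solution of d(ell)/dt = theta*invest - delta*ell
    for a constant investment level `invest` (illustrative assumption)."""
    target = theta * invest / delta  # long-run efficiency level for this investment
    return target + (ell0 - target) * math.exp(-delta * t)


def euler_path(ell0, invest, theta, delta, t_end, steps):
    """Forward-Euler integration of the same law of motion, as a cross-check."""
    dt = t_end / steps
    ell = ell0
    for _ in range(steps):
        ell += dt * (theta * invest - delta * ell)
    return ell


# Hypothetical parameter values.
ELL0, INVEST, THETA, DELTA = 1.0, 0.5, 2.0, 0.1
long_run_level = THETA * INVEST / DELTA
level_at_10 = efficiency_path(ELL0, INVEST, THETA, DELTA, 10.0)
level_at_10_euler = euler_path(ELL0, INVEST, THETA, DELTA, 10.0, 10_000)
```

With constant investment, efficiency converges monotonically to $\theta I/\delta$: a higher depreciation rate lowers the long-run level, a higher effectiveness raises it.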
Finally, together with the requirements imposed on the cost and the demand functions, throughout the paper, in order to have concave maximization problems, we impose the following Assumption.

Assumption (A)
This assumption is satisfied by several commonly used demand and cost functions, at least under some restrictions, as shown in the following example.
Example 1 Consider the linear demand $q(p) = a - bp$, with $a, b > 0$, and the exponential cost function $c(\ell) = z e^{-k\ell} + \underline{c}$, with $z, k > 0$. Then, it is easy to see that Assumption (A) can be written as $a - bp > b z e^{-k\ell}$. As the left-hand side is decreasing in $p$, and the largest $p$ that a firm can optimally set is the monopoly price $p^m = \frac{a + b\,c(\ell)}{2b}$, it follows that Assumption (A) is always satisfied for $b < \frac{a}{3z + \underline{c}}$. This example suggests that Assumption (A) simply amounts to imposing an upper bound on the absolute value of the price-derivative of demand.
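The bound in Example 1 can be checked numerically. The sketch below assumes the linear demand and exponential cost of the example and reads the condition as $a - bp > b z e^{-k\ell}$ evaluated at the monopoly price; this is the form any restriction of the type $q(p)\,c''(\ell) + q'(p)\,[c'(\ell)]^2 > 0$ would take under these functional forms, which is an inference from the example rather than a quotation of Assumption (A). Parameter values and names are hypothetical.

```python
import math


def assumption_a_margin(a, b, z, k, c_floor, ell):
    """Slack in the inequality a - b*p > b*z*exp(-k*ell), evaluated at the
    monopoly price p_m = (a + b*c(ell)) / (2b). Positive slack = condition holds."""
    cost = z * math.exp(-k * ell) + c_floor      # c(ell) = z*exp(-k*ell) + c_floor
    p_monopoly = (a + b * cost) / (2.0 * b)
    return (a - b * p_monopoly) - b * z * math.exp(-k * ell)


# Hypothetical parameter values.
A, Z, K, C_FLOOR = 10.0, 1.0, 0.5, 1.0
b_bound = A / (3.0 * Z + C_FLOOR)                # threshold stated in Example 1

b_inside = 0.9 * b_bound                         # satisfies b < a / (3z + c_floor)
b_outside = 1.1 * b_bound                        # violates it

margins_inside = [assumption_a_margin(A, b_inside, Z, K, C_FLOOR, 0.1 * i)
                  for i in range(101)]
margin_outside_at_zero = assumption_a_margin(A, b_outside, Z, K, C_FLOOR, 0.0)
```

The slack equals $(a - b\underline{c} - 3bze^{-k\ell})/2$, so the binding case is the least efficient firm, $\ell = 0$, which is where the threshold $a/(3z + \underline{c})$ comes from.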
In the following two sections, we consider two benchmark cases. We first examine the social planner's problem, i.e., we find the optimal price and investments set by a benevolent planner who maximises social welfare. Secondly, we consider the dynamic optimization problem faced by an unregulated profit-oriented firm.

The first-best solution
As in Shleifer's (1985) seminal (static) paper, we define the social welfare function for each market niche, at any instant t, as the sum of the consumer surplus and the firm's profit, i.e.,
$$W = \int_{p}^{\bar p} q(s)\,ds + [p - c(\ell)]\,q(p) - \Gamma(I).$$
Notice that we do not attach any opportunity cost to public expenditure if lump-sum transfers are needed for firms to break even. Extending the problem to a dynamic framework, the benevolent planner faces the following optimal control problem
$$\begin{cases} \max_{p(\cdot),\, I(\cdot)} \displaystyle\int_0^\infty e^{-rt}\, W\, dt \\ \text{s.t. } \dot\ell = \theta I - \delta\ell, \quad \ell(0) = \ell_0, \end{cases}$$
whose solution is shown in the following Proposition.
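Proposition 1 In the first-best solution, for every $t \in [0, \infty)$:
$$p(t) = c(\ell(t)),$$
and the time paths of the investments and the efficiency level solve the ODE system
$$\begin{cases} \dot I = \dfrac{(r+\delta)\,\Gamma'(I) + \theta\, c'(\ell)\, q(c(\ell))}{\Gamma''(I)} \\[2ex] \dot\ell = \theta I - \delta\ell. \end{cases} \tag{2}$$
The efficiency level in steady state $\ell^*$ is the unique positive solution of
$$(r+\delta)\,\Gamma'\!\left(\frac{\delta\ell}{\theta}\right) + \theta\, c'(\ell)\, q(c(\ell)) = 0, \tag{3}$$
and it constitutes a saddle point.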
Clearly, the benevolent planner sets the social first-best (i.e., the perfect competition) prices over time. The corresponding demand levels affect the optimal amount of cost-reducing investments, which in turn affects future costs, hence prices, and so on. Specifically, by drawing the nullclines in the state-control plane, it can be easily checked that the transition to the steady state along the stable manifold is as follows: if $\ell_0 < \ell^*$, then the social planner sets a high initial investment level $I(0) > I^* = \frac{\delta}{\theta}\ell^*$; over time, socially optimal investment levels decrease; nevertheless, the efficiency level increases up to its steady state value, hence prices decrease over time. Clearly, the opposite holds if $\ell_0 > \ell^*$. This is not surprising, and the economic intuition behind the convergence process is linked to the decreasing marginal returns of investments.
Finally, it is easy to see that the steady state value $\ell^*$ is a decreasing function of the depreciation rate $\delta$, whereas it is an increasing function of the investments' effectiveness (captured by the parameter $\theta$). These outcomes are also far from surprising, and they are in line with our assumptions, which define a well-behaved technology.
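The conditions in Proposition 1 can be traced back to a standard current-value Hamiltonian argument. The sketch below is not taken from the paper's proof: it assumes the state equation $\dot\ell = \theta I - \delta\ell$, interior solutions, and uses $\ell$ for the efficiency level and $\lambda$ for the costate as notational assumptions.

```latex
% Planner's current-value Hamiltonian (\lambda is the costate of \ell)
\begin{align*}
H &= \int_{p}^{\bar p} q(s)\,ds + [p - c(\ell)]\,q(p) - \Gamma(I)
     + \lambda\,(\theta I - \delta\ell), \\
\frac{\partial H}{\partial p} &= -q(p) + q(p) + [p - c(\ell)]\,q'(p) = 0
     \;\Longrightarrow\; p = c(\ell), \\
\frac{\partial H}{\partial I} &= -\Gamma'(I) + \theta\lambda = 0
     \;\Longrightarrow\; \lambda = \Gamma'(I)/\theta, \\
\dot\lambda &= r\lambda - \frac{\partial H}{\partial \ell}
     = (r+\delta)\,\lambda + c'(\ell)\,q(p).
\end{align*}
% Differentiating \Gamma'(I) = \theta\lambda with respect to time and substituting:
\[
\Gamma''(I)\,\dot I = \theta\dot\lambda
  = (r+\delta)\,\Gamma'(I) + \theta\,c'(\ell)\,q(c(\ell)).
\]
```

Setting $\dot I = 0$ and $\dot\ell = 0$ (so that $I = \delta\ell/\theta$) then gives a single equation in $\ell$ for the steady state. A higher $\delta$ raises the marginal cost term $(r+\delta)\Gamma'(\delta\ell/\theta)$ at any given $\ell$, while a higher $\theta$ lowers it and raises the marginal saving $-\theta c'(\ell) q(c(\ell))$, which is the mechanism behind the comparative statics mentioned above.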

The unregulated firm
Each unregulated firm aims to maximise its discounted profit flow by choosing prices and cost-reducing investments over time, subject to the dynamics of the efficiency level given by the ODE (1). The solution of this optimal control problem is shown in the following Proposition.

Proposition 2
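The unregulated firm, for every $t \in [0, \infty)$, chooses $p$ such that
$$q(p) + [p - c(\ell)]\,q'(p) = 0, \tag{4}$$
and the time paths of the investments and the efficiency level solve the ODE system
$$\begin{cases} \dot I = \dfrac{(r+\delta)\,\Gamma'(I) + \theta\, c'(\ell)\, q(p)}{\Gamma''(I)} \\[2ex] \dot\ell = \theta I - \delta\ell. \end{cases} \tag{5}$$
Let $p = h(\ell)$ be the implicit function defined by the firm's monopoly pricing rule (4). Then, the efficiency level in steady state is the unique positive root $\bar\ell$ of the equation
$$(r+\delta)\,\Gamma'\!\left(\frac{\delta\ell}{\theta}\right) + \theta\, c'(\ell)\, q(h(\ell)) = 0, \tag{6}$$
and it constitutes a saddle point. Moreover, $\bar\ell < \ell^*$.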
Notice that the difference between the investments' time paths chosen by an unregulated firm and by the benevolent planner is entirely driven by the different pricing rules implemented, i.e., the monopoly price vs the perfect competition one. Consequently, the presence of market power results in higher steady state costs compared to the social optimum. Thus, in steady state, the social planner sets the perfect competition price, which is given by the cost $c(\ell^*)$, whereas the unregulated firm adds the monopolist mark-up to its cost $c(\bar\ell) > c(\ell^*)$. The intuition is rather simple: other things being equal, the individual optimum, as in a static framework, is characterised by a lower quantity and a higher price compared to the social optimum; hence, from the perspective of the dynamic problem, the incentive to invest over time in reducing the unit cost is lower. Put another way, in our model, besides the static deadweight loss, a monopoly also entails dynamic inefficiencies, due to the weaker incentives to reduce the unit production cost: the well-known replacement effect (Arrow, 1962).
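The ranking of the two steady states can be illustrated numerically. The sketch below is only an illustration under assumed functional forms: linear demand and exponential cost as in Example 1, a quadratic investment cost $\Gamma(I) = \gamma I^2/2$ (not specified in the text), and steady-state conditions of the form "discounted marginal investment cost equals marginal cost saving times quantity sold". All names and parameter values are hypothetical.

```python
import math


def bisect_root(func, lo, hi, tol=1e-12, max_iter=200):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def steady_state(a, b, z, k, c_floor, gamma, theta, delta, r, monopoly=False):
    """Steady-state efficiency level under the assumed functional forms.

    First best: price equals unit cost, so quantity is a - b*c(ell).
    Monopoly:   with linear demand the quantity sold is half of that.
    """
    def gap(ell):
        cost = z * math.exp(-k * ell) + c_floor
        quantity = a - b * cost
        if monopoly:
            quantity *= 0.5
        # -theta * c'(ell) * quantity: marginal benefit of efficiency
        marginal_saving = theta * k * z * math.exp(-k * ell) * quantity
        # (r + delta) * Gamma'(delta*ell/theta) with Gamma'(I) = gamma * I
        marginal_cost = (r + delta) * gamma * delta * ell / theta
        return marginal_saving - marginal_cost

    return bisect_root(gap, 0.0, 1.0e3)


# Hypothetical parameter values, chosen so that b < a / (3z + c_floor).
BASE = dict(a=10.0, b=1.0, z=1.0, k=0.5, c_floor=1.0,
            gamma=1.0, theta=1.0, delta=0.1, r=0.05)

ell_first_best = steady_state(**BASE)
ell_monopoly = steady_state(**BASE, monopoly=True)
ell_high_depreciation = steady_state(**{**BASE, "delta": 0.2})
ell_high_effectiveness = steady_state(**{**BASE, "theta": 2.0})
```

Under these assumptions the monopoly sells half the first-best quantity at any efficiency level, so each unit of cost reduction is worth less to it and its steady-state efficiency is lower, in line with the replacement-effect intuition.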
Thus, market regulation is needed to correct these (static and dynamic) inefficiencies. In the remainder of the paper, we deal with this optimal dynamic price regulation problem.

Price regulation: preliminaries
To begin with, a few preliminary remarks about the dynamic price regulation problem under consideration are in order. First, if the regulator (the benevolent social planner) knows a firm's technology, in principle, she could easily solve her problem by setting at any time t a regulated price given by the first-best cost at time t, so as to lead the firm to replicate the social first-best investment choices. However, in our dynamic setting, even under complete information, the above-mentioned solution cannot be implemented if the regulator has limited commitment power, i.e., cannot commit at the outset of the game to the entire time path of prices $\{p(t)\}_{t \in [0,\infty)}$. Indeed, for several reasons, the assumption of full commitment is hardly in touch with reality. It is much more realistic to assume that the regulator can only commit to implementing a Markovian pricing rule, which specifies the regulated price depending on the current value of the state variables.
This restriction can be justified on two grounds. On the one hand, the yardstick pricing rule, as implemented in the real world, can be seen as a Markov pricing rule (see, e.g., Meya, 2015). On the other hand, regulation in the form of Markov policies has been widely employed in the differential game literature (see, e.g., Benchekroun & Van Long, 1998), mainly for two reasons. First, this choice is made for the sake of tractability, since it allows us to restrict attention to Markovian (state-feedback) Nash equilibria, which are much easier to characterise and interpret than non-Markovian equilibria. Second, non-Markovian policies, which (explicitly) depend on time, are often vulnerable to strategic manipulation by the regulated firm (see, e.g., Karp & Livernois, 1992).
Thus, in the remainder of the analysis, as the solution concept to the outlined dynamic price regulation problem, we consider first-best Markov pricing rules, defined as follows.
Definition 1 A first-best Markov pricing rule is a function $\rho(\ell): \mathbb{R}_+ \to \mathbb{R}_+$ such that: (i) the investments' time path of the regulated firm solves system (2) and leads to the steady state defined by (3), and (ii) the price is set at the perfect competition level for every $t \in [0, \infty)$.
To solve our dynamic price regulation problem, we must first consider a regulated firm's behaviour in response to a generic Markov pricing rule announced by the regulator and then look for a Markov pricing rule such that the firm's behaviour yields the first-best solution (i.e., the two conditions stated in Definition 1 are fulfilled).
To begin with, notice that, for any given Markov pricing rule $\rho(\ell)$ announced by the public authority, a regulated firm's profit, at any time t, is
$$[\rho(\ell) - c(\ell)]\,q(\rho(\ell)) - \Gamma(I).$$
Hence, the firm chooses the investments' time path to maximise its discounted profit flow, subject to the dynamic constraint (1).
Notice that a pricing rule $\rho(\ell)$ does not implement yardstick competition among the regulated firms, since it links the regulated price for a firm to the cost of that firm only: in other words, for the time being, we are assuming that the regulator aims to solve the price regulation problem of each firm separately. It is worth noting that a regulated price $\rho(\ell) = c(\ell)$ (i.e., a cost reimbursement mechanism) would result in nil investments over time, since the regulated firm could not increase its revenues by investing in cost-reducing activities, while it would bear the costs of such investments.
By applying the Pontryagin maximum principle, it is easy to find that the investments' and efficiency levels over time of a regulated firm solve the following ODE:
$$\dot I = \frac{(r+\delta)\,\Gamma'(I) + \theta\left\{[c'(\ell) - \rho'(\ell)]\,q(\rho(\ell)) - [\rho(\ell) - c(\ell)]\,q'(\rho(\ell))\,\rho'(\ell)\right\}}{\Gamma''(I)}, \tag{7}$$
coupled with the state Eq. (1). Next, assume that there exists a Markov pricing rule $\rho(\ell)$ such that, for some instant $t \ge 0$, up to time t, the prices and the investments of the regulated firm coincide with the corresponding first-best levels. Then, the investment level at time $t + dt$ will coincide with the first-best one if and only if the right-hand side of Eq. (7) coincides with the right-hand side of the corresponding ODE in system (2). Since, by assumption, $\rho(\ell(t)) = c(\ell(t))$, we have
$$\rho'(\ell(t)) = 0. \tag{8}$$
In turn, the price level at time $t + dt$ will coincide with the first-best one if and only if, at $\ell = \ell(t)$, the time derivatives of the price and cost levels coincide with each other, i.e.,
$$\rho'(\ell(t))\,\dot\ell(t) = c'(\ell(t))\,\dot\ell(t),$$
which, off steady state (i.e., when $\dot\ell(t) \neq 0$), implies
$$\rho'(\ell(t)) = c'(\ell(t)). \tag{9}$$
Since $c'(\cdot) < 0$, conditions (8) and (9) are incompatible, and we can conclude that there does not exist a price function which simultaneously satisfies both requirements (i) and (ii) of Definition 1.

Lemma 1
If the regulator considers the price regulation problem of each firm separately, then there does not exist a first-best Markov pricing rule.
Intuitively, a reason why this impossibility result emerges rests on the fact that the regulator has one instrument (i.e., the regulated price), while private firms set a larger number of control variables, namely prices and cost-reducing investments.
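The maximum-principle step behind the regulated firm's behaviour, and the cost-reimbursement case mentioned earlier, can be sketched as follows. This is a hedged reconstruction, not the paper's own derivation: it assumes the state equation $\dot\ell = \theta I - \delta\ell$, a differentiable rule $\rho(\ell)$, and uses $\ell$ and the costate $\lambda$ as notational assumptions.

```latex
% Regulated firm facing a Markov rule \rho(\ell); \lambda is the current-value costate.
\begin{align*}
H &= [\rho(\ell) - c(\ell)]\,q(\rho(\ell)) - \Gamma(I)
     + \lambda\,(\theta I - \delta\ell), \\
\Gamma'(I) &= \theta\lambda \qquad \text{(interior first-order condition)}, \\
\dot\lambda &= (r+\delta)\,\lambda
   - \bigl\{[\rho'(\ell) - c'(\ell)]\,q(\rho(\ell))
   + [\rho(\ell) - c(\ell)]\,q'(\rho(\ell))\,\rho'(\ell)\bigr\}.
\end{align*}
% Cost reimbursement, \rho(\ell) = c(\ell) for every \ell: the term in braces
% vanishes identically and the profit flow reduces to
\[
[\rho(\ell) - c(\ell)]\,q(\rho(\ell)) - \Gamma(I) = -\Gamma(I),
\]
% which does not depend on \ell and, since \Gamma' > 0, is maximised at the
% corner I(t) = 0 for all t.
```

The same costate equation shows why one instrument is not enough: evaluated where $\rho = c$, matching the planner's investment dynamics needs the brace to equal $-c'(\ell)\,q(c(\ell))$, that is $\rho' = 0$, while keeping the price equal to cost as $\ell$ moves needs $\rho' = c' < 0$.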
As a consequence of this impossibility result, we wonder whether a dynamic price regulation based on yardstick pricing rules (which will be defined formally in the following section) may be effective in driving firms to replicate the social first-best allocation. Thus, unlike in the static model (Shleifer, 1985), in which the need for yardstick competition mechanism rests on the fact that the regulator cannot observe the firms' technology, here, in a dynamic setting, there can be scope for introducing yardstick competition among the regulated firms even though there is no asymmetry of information between the regulator and the firms, provided that the former cannot commit to time-dependent pricing rules. This point can provide an additional rationale for the use of yardstick regulation mechanisms, beyond asymmetric information (or even in the case with symmetric information): a regulation mechanism linking the price for a firm to a state variable (moving over time) pertaining to the same firm, at least in the form of a Markov rule, is never able to replicate the first-best outcome.

Yardstick pricing rules
In this section, we investigate whether the possibility of introducing yardstick competition among the regulated firms can lead the regulator to overcome the impossibility result shown in Lemma 1, i.e., to achieve the first-best solution in all regulated markets.
Specifically, as in Shleifer (1985), for simplicity, we consider the same demand function $q(\cdot)$ for each regulated firm $i = 1, \dots, N$. Furthermore, as far as technology is concerned, we consider the same cost functions $c(\cdot)$ and $\Gamma(\cdot)$ for each firm, even though we may allow for different initial efficiency levels $\ell_i(0) = \ell_{i0} > 0$. Generalizing Shleifer (1985), a pricing rule entailing yardstick competition among the regulated firms is defined as follows.
Clearly, yardstick pricing rules are, by definition, Markov pricing rules. The regulator aims to design a first-best pricing rule, which is now defined as a yardstick pricing rule, such that conditions (i) and (ii) stated in Definition 1 hold true, along the equilibrium path of the game (more below), for every regulated firm.
As in the previous section, the first step of the analysis consists in determining the regulated firms' strategies in response to any given yardstick pricing rule ρ, announced at the outset of the game and implemented at any instant of time. Each firm $i = 1, \dots, N$ chooses its investment path to maximise its discounted profit flow at the regulated price, subject to the dynamic constraint (1) for its own efficiency level. Thus, when the regulated price is specified as a yardstick pricing rule, the optimization problems faced by the firms are no longer independent of each other. Put another way, such a pricing scheme introduces dynamic strategic interactions among the regulated firms, so that the analysis must be conducted in a differential game framework. It is well known (see, e.g., Başar & Olsder, 1999) that, in this class of games, different kinds of strategies, hence equilibrium concepts, are defined, depending on the information structure of the game, i.e., roughly speaking, the information on which each player can base his action at each time instant. Accordingly, in what follows, we deal with the yardstick regulation problem by distinguishing between two different equilibrium concepts of the game, namely the open-loop Nash equilibrium and the Markovian state-feedback closed-loop Nash equilibrium.

Open loop Nash equilibrium
In this section, we focus on the Open Loop Nash Equilibrium (OLNE). For any given yardstick pricing rule announced by the regulator, the OLNE of the differential game played by the regulated firms is characterised in the following Lemma.

Lemma 2 Let $\rho(\boldsymbol{\ell})$, with $\boldsymbol{\ell} = (\ell_1, \dots, \ell_N)$, be the pricing rule announced by the public authority. Then, the OLNE $(I_1(t), \dots, I_N(t))$ solves the ODE system (10) in the state-control variables. The feedback representation of the investment dynamics under the OLNE is given by an N-tuple $(\psi_1(\boldsymbol{\ell}), \dots, \psi_N(\boldsymbol{\ell}))$ which solves the PDE system (11).
Proof See "Appendix C".
We are now ready to investigate whether the yardstick pricing rule provided by Shleifer (1985) leads to the first-best solution if adopted in the considered dynamic regulation problem.
Recall that Shleifer's pricing rule is
$$\rho_i(\boldsymbol{\ell}) = \frac{1}{N-1}\sum_{j \neq i} c(\ell_j). \tag{12}$$
The regulated price for each firm i is thus the average unit cost of the other regulated firms. From (10), it follows that, when Shleifer's rule is employed, the OLNE strategy of firm i solves:
$$\dot I_i = \frac{(r+\delta)\,\Gamma'(I_i) + \theta\, c'(\ell_i)\, q\!\left(\frac{1}{N-1}\sum_{j \neq i} c(\ell_j)\right)}{\Gamma''(I_i)}.$$
Since the considered pricing rule is symmetric across the firms, the symmetrical structure of the game implies that, when the initial cost is the same for all firms, the equilibrium investments' time paths, and hence the cost levels over time, are the same for every regulated firm under consideration, i.e., along the equilibrium path of the game,
$$\ell_1(t) = \dots = \ell_N(t) \quad \text{for all } t \ge 0, \tag{13}$$
from which it can be easily verified that condition (i) of Definition 1 is fulfilled for each firm $i = 1, \dots, N$. Moreover, from (13) it also follows that the price levels coincide with the first-best ones over time, i.e., condition (ii) of Definition 1 also holds true, for every regulated firm. By contrast, when the initial costs are different across firms, clearly Eq. (13) does not hold. However, denoting by $\tilde\ell_i$ the efficiency level in steady state for firm $i = 1, \dots, N$, we have
$$(r+\delta)\,\Gamma'\!\left(\frac{\delta\tilde\ell_i}{\theta}\right) + \theta\, c'(\tilde\ell_i)\, q\!\left(\frac{1}{N-1}\sum_{j \neq i} c(\tilde\ell_j)\right) = 0.$$
Since the equations of this system for $i = 1, \dots, N$ are symmetric, the system admits the symmetric solution $\tilde\ell_1 = \dots = \tilde\ell_N = \tilde\ell$, and it is easy to see that $\tilde\ell = \ell^*$. We can thus state what follows.

Proposition 3 Let $\ell_{i0} = \ell_0 > 0$ for all $i = 1, \dots, N$. Then, along the equilibrium path of the OLNE, the yardstick pricing rule (12) achieves the first-best solution. If, instead, the initial efficiency levels are different, then the considered pricing rule leads to the first-best outcome only in steady state.
With symmetric firms, Shleifer's pricing rule constitutes an optimal pricing rule in our dynamic setting, provided that the regulated firms adopt open-loop strategies. In other words, the result obtained by Shleifer (1985), according to which yardstick regulation can replicate the social first-best, also applies (at least as far as the asymptotic steady state is concerned) to a dynamic context in which firms commit to a plan, set at the beginning of the game, and strategic interaction does not take place at every instant of time. In the next section we show that this outcome does not extend to a dynamic framework in which strategic interaction leads firms to set their choices instant by instant, following the evolution of the state variables.
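Why the open-loop case reproduces the planner's dynamics can be seen from a short maximum-principle sketch. It is offered as an illustration under stated assumptions rather than as the paper's proof: the state equation is taken to be $\dot\ell_i = \theta I_i - \delta\ell_i$, firm i's regulated price is the average unit cost of the other firms, and the symbols $\ell_i$, $\lambda_i$ and $\rho_i(t)$ are notational assumptions.

```latex
% Open-loop information: firm i treats \ell_j(t), j \neq i, as given functions of time,
% so its regulated price is an exogenous time path from its own point of view.
\begin{align*}
\rho_i(t) &= \frac{1}{N-1}\sum_{j \neq i} c(\ell_j(t)), \\
H_i &= [\rho_i(t) - c(\ell_i)]\,q(\rho_i(t)) - \Gamma(I_i)
       + \lambda_i\,(\theta I_i - \delta\ell_i), \\
\Gamma'(I_i) &= \theta\lambda_i, \qquad
\dot\lambda_i = (r+\delta)\,\lambda_i + c'(\ell_i)\,q(\rho_i(t)), \\
\Gamma''(I_i)\,\dot I_i &= (r+\delta)\,\Gamma'(I_i) + \theta\,c'(\ell_i)\,q(\rho_i(t)).
\end{align*}
% Symmetric initial conditions: \ell_i(t) = \ell(t) for all i, hence \rho_i(t) = c(\ell(t)) and
\[
\Gamma''(I)\,\dot I = (r+\delta)\,\Gamma'(I) + \theta\,c'(\ell)\,q(c(\ell)),
\]
% which has the same form as the condition obtained from the planner's problem.
```

The key point is that the price does not respond to the firm's own state in its costate equation, so the firm internalises the full cost saving $-c'(\ell_i)\,q(\cdot)$ on the quantity it sells; with asymmetric initial conditions the price differs from the firm's own cost along the transition, and the match with the planner's path is lost until the steady state.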

State-feedback Nash equilibria
We now consider Markov perfect (state-feedback) Nash Equilibria (MPNE), in which the strategy of each player depends only on the current value of the state variables: thus, such an equilibrium concept is appropriate if players at each instant observe and take into account the current value of the states, which summarises the whole past history of the game. The strategies under consideration are labelled as state-feedback by Başar et al. (2018). They are a particular case of closed-loop rules (in which the control variables depend on the state variables). State-feedback rules are said to be Markovian (and the corresponding equilibrium concept is called Markov perfect) if the functional form of the rules linking control(s) to state(s) remains stable over time.
For any such given yardstick pricing rule announced by the regulator, we can state what follows.

Lemma 3 Let $\rho(\ell_i, \ell_j)$ be the non-discriminatory pricing rule announced by the public authority. Then, in any symmetric MPNE, the equilibrium strategy solves the PDE (14).
Proof See "Appendix D".
Next, by considering the yardstick pricing rule $\rho(\ell_i, \ell_j) = c(\ell_j)$ (i.e., Shleifer's rule (12), when N = 2), the PDE (14) becomes Eq. (15). By contrast, from (11) we get that the feedback representation of the OLNE strategy, when the same pricing rule is adopted, which we denote by $\psi_i(\ell_i, \ell_j)$, solves the PDE (16). Notice that Eq. (15) is identical to Eq. (16) up to the third summand on the right-hand side, which is different from zero also when $\ell_i = \ell_j$ (footnote 8). Therefore, no solution of the PDE (15) coincides with the solution of Eq. (16), even when restricted to $\ell_i = \ell_j$. Since, as we have seen, in the case with the same initial efficiency levels for both firms, the OLNE yields the first-best equilibrium trajectories when Shleifer's pricing rule is adopted, we can conclude that condition (i) of Definition 1 is not fulfilled in any MPNE. We can thus state the following Proposition.

Proposition 4 Shleifer's pricing rule cannot achieve the first-best solution along the equilibrium path of any MPNE.
It is worth exploring in more detail how Shleifer's pricing rule shapes the behaviour of firms under the state-feedback information structure. To this end, notice that if we solve for the MPNE by using the dynamic programming approach, we would have: where $V_i$ is player i's (unknown) value function. Therefore, The second-order condition of the firm's maximization problem requires the value function to be concave, entailing

Footnote 8: Since, clearly, given the strategic interdependences between the players, the function is not identically zero (more below).
Thus, we can conclude Moreover, under Shleifer's pricing rule, it should be the case that The motivation (and the economic intuition) for this claim is as follows: for every regulated firm, the more efficient its rivals are, the lower the regulated price and the larger the quantity sold; consequently, the value of investments aimed at achieving a cost decrease also has to be more significant, i.e., ∂ Thus, by regulating prices according to a yardstick competition mechanism, the regulator can lead a provider to invest more in response to an increase in its rival's efficiency. Put another way, such a pricing scheme entails intertemporal strategic complementarities (Jun & Vives, 2004), which may at least alleviate the under-investment problem arising in the unregulated monopoly case (footnote 10). Intuitively, even over-investment could be expected under the closed-loop solution, as compared to the open-loop one, because at any instant of time, under the closed-loop solution, firms are led to react to the lower production cost of rivals by further increasing their investment effort, given that the marginal benefit from cost reduction is more significant the larger the volume of production (footnote 11). Such an instantaneous reaction is absent under the open-loop solution, where firms commit themselves to the investment plan designed at the beginning of the interaction.
In any case, the true strategic interaction among the firms over time, entailed by the state-feedback information structure, compared to the game with the open-loop information structure, leads the yardstick pricing rule to be ineffective in replicating the first-best outcome in a dynamic environment. Intuitively, this is due to the fact that only one policy instrument is controlled by the regulator (the regulated price based on the opponent(s)' cost), while the choice variables under the control of the regulated firms, and the state variables affecting them, are in a larger number.
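A dynamic-programming sketch for two firms helps to see where the feedback interaction enters. It is a hedged illustration, not the paper's derivation: it assumes Shleifer's rule with N = 2 (firm i is paid the rival's unit cost), the state equations $\dot\ell_i = \theta I_i - \delta\ell_i$, interior solutions and a twice differentiable value function; $\phi_i$ and $\phi_j$ denote the firms' feedback strategies, a notational assumption.

```latex
% Hamilton-Jacobi-Bellman equation of firm i, given the rival's feedback rule \phi_j
\begin{align*}
r V_i(\ell_i,\ell_j) = \max_{I_i \ge 0}\Bigl\{ & [c(\ell_j) - c(\ell_i)]\,q(c(\ell_j)) - \Gamma(I_i)
   + \frac{\partial V_i}{\partial \ell_i}\,(\theta I_i - \delta\ell_i) \\
 & + \frac{\partial V_i}{\partial \ell_j}\,\bigl(\theta\,\phi_j(\ell_j,\ell_i) - \delta\ell_j\bigr)\Bigr\}.
\end{align*}
% Interior first-order condition, and its derivative with respect to the rival's state:
\begin{align*}
\Gamma'(\phi_i) &= \theta\,\frac{\partial V_i}{\partial \ell_i},
&
\Gamma''(\phi_i)\,\frac{\partial \phi_i}{\partial \ell_j}
   &= \theta\,\frac{\partial^2 V_i}{\partial \ell_i\,\partial \ell_j}.
\end{align*}
```

Since $\Gamma'' > 0$, a firm's investment rises with its rival's efficiency exactly when the cross-partial of its value function is positive, which is the complementarity discussed above. Differentiating the maximised right-hand side with respect to $\ell_i$ also brings in the term $\theta\,(\partial V_i/\partial \ell_j)(\partial \phi_j/\partial \ell_i)$, which has no counterpart when the rival's path is taken as a given function of time.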
To conclude, the extent to which the individually optimal investment path of firms under closed-loop behaviour rules differs from the social first-best path depends on the strength of the dynamic complementarities between the investments of different firms (the optimal investment of a firm increases if the rival's investment increases). This is unrelated to the other reasons, analysed by available contributions, that could prevent yardstick regulation from being effective, such as collusive behaviour (Tangeras, 2002; Potters et al., 2004) or biased incentives (Dalen, 1998; Sobel, 1999).

Footnote 10: Notice that this reasoning applies not only to Shleifer's rule but to any other pricing rule such that the regulated price for a firm is a decreasing function of the cost of its competitor. We expect similar mechanisms to be at play also in the more general case with N firms.

Footnote 11: Indeed, from > 0, implying that the right-hand side of Eq. (15) is higher than the right-hand side of Eq. (16). Since the left-hand side of Eq. (15) represents the time derivative of the investment along the equilibrium path, we have just argued that its value in any MPNE is higher compared to the OLNE, the latter coinciding with the first-best solution. Nevertheless, this does not suffice to prove that the yardstick pricing rule leads providers to invest more, as compared to the first-best solution, since we cannot determine the values of investments at the initial time in the two equilibria.

Concluding remarks
This article shows that the simple yardstick pricing rule suggested by Shleifer (1985) in his seminal contribution can lead to the first-best solution in a dynamic framework only if regulated firms adopt open-loop behaviour rules and are symmetric at the initial instant of time. In the presence of asymmetry among firms, social efficiency can be reached by firms following open-loop rules only in the asymptotic steady state. We have also shown that the simple yardstick competition mechanism cannot achieve the socially efficient outcome if regulated firms adopt state-feedback behaviour rules. This is because firms react, instant by instant, to the investment choices of rivals, and they can set a larger number of control variables, whereas yardstick regulation uses only one policy instrument.
Admittedly, the scope of the present paper is narrow, and the model overlooks several features of the real world, including product innovation. For instance, in our model, monopoly unambiguously entails dynamic inefficiency, beyond the usual static deadweight loss, and there is no room for dynamically efficient phenomena à la Schumpeter (Schumpeter, 1934). Though very simple, our model shows that the properties of price regulation in a dynamic framework crucially depend on the behaviour rules followed by regulated subjects, that is, on the information set they use when making their choices.
Even though yardstick regulation retains the properties of the static context if firms commit to open-loop behaviour, the results of the present investigation considerably weaken the scope of yardstick regulation mechanisms in a dynamic context, at least as a means of obtaining the first-best solution when firms adopt Markovian strategies. Examining the properties of alternative mechanisms of price regulation, such as price-cap regulation or rate-of-return regulation (see, e.g., Biglaiser & Riordan, 2000), in a genuinely dynamic context with firms' interactions is on our research agenda.
where μ is the (current-value) co-state variable.
The first-order conditions (henceforth, FOCs) with respect to p and I give, respectively, The adjoint equation is $\dot{\mu}$ From the FOCs, by means of easy calculations, we get the first equation of system (2), which an optimal control must satisfy, together with the transversality condition $\lim_{t\to\infty} e^{-rt}\mu = 0$. The above conditions are sufficient for optimality if the Hamiltonian is jointly concave with respect to the control and the state variables. To this end, a sufficient condition is that its Hessian matrix is negative definite. The Hessian matrix is given by: This matrix is negative definite if all the coefficients of its characteristic polynomial are positive. This amounts to imposing: The first inequality of system (18) is satisfied for which is satisfied under Assumption (A). Moreover, from (19) it follows that the third inequality of system (18) implies the second one, and that it holds if and only if Assumption (A) holds.
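The characteristic-polynomial criterion used above can be illustrated numerically. The following is a minimal sketch, assuming a symmetric 3x3 matrix (one row each for the two controls and the state); the function names and the numerical matrices are hypothetical and chosen only for illustration, they are not the Hessian of the model. For a symmetric matrix H, the monic polynomial det(λI − H) = λ³ + a₂λ² + a₁λ + a₀ has coefficients a₂ = −tr(H), a₁ = sum of the principal 2x2 minors, a₀ = −det(H), and H is negative definite exactly when all three are strictly positive.

```python
def char_poly_coeffs_3x3(h):
    """Return (a2, a1, a0) of det(lambda*I - H) = lambda^3 + a2*lambda^2 + a1*lambda + a0."""
    (a, b, c), (d, e, f), (g, k, m) = h
    trace = a + e + m
    # Sum of the three principal 2x2 minors.
    minors = (a * e - b * d) + (a * m - c * g) + (e * m - f * k)
    # Determinant by cofactor expansion along the first row.
    det = a * (e * m - f * k) - b * (d * m - f * g) + c * (d * k - e * g)
    return (-trace, minors, -det)


def is_negative_definite_3x3(h):
    """For a symmetric H: negative definite iff every coefficient is strictly positive."""
    return all(coef > 0 for coef in char_poly_coeffs_3x3(h))


# Hypothetical symmetric matrices, for illustration only (not the model's Hessian).
H_CONCAVE = [[-2.0, 0.5, 0.0], [0.5, -1.0, 0.3], [0.0, 0.3, -1.5]]
H_INDEFINITE = [[-2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, -1.5]]

if __name__ == "__main__":
    print(is_negative_definite_3x3(H_CONCAVE))
    print(is_negative_definite_3x3(H_INDEFINITE))
```

The three positivity requirements returned by `char_poly_coeffs_3x3` play the role of the three inequalities of system (18); the criterion relies on symmetry, which guarantees real eigenvalues.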
We now turn to the steady state analysis. Given the smoothness properties of the considered functions, a sufficient condition for Eq. (3) to admit a unique (positive) solution is that: Conditions (i) and (ii) are trivially satisfied since q(c(0)) > 0 and c (·) < 0 for all ≥ 0, and lim →∞ q( p)| p=c( ) c ( ) = q(c) · 0, respectively. Lastly, we compute which turns out to be positive under Assumption (A), thus establishing condition (iii).
Finally, we turn to the stability analysis of the equilibrium. The Jacobian matrix of system (2) is given by: Since tr(J) = δ > 0 and, under Assumption (A), det(J) < 0, it follows that the equilibrium is a saddle point.
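The trace-determinant reasoning behind the saddle-point conclusion can be sketched as follows. This is an illustrative sketch only: the function names and the numerical Jacobian are hypothetical and are not taken from the model. For a planar system, a negative determinant means the two eigenvalues are real and of opposite sign, which is the saddle-point configuration whatever the sign of the trace.

```python
import math


def classify_equilibrium(jacobian):
    """Classify the equilibrium of a planar system from its 2x2 Jacobian."""
    (a, b), (c, d) = jacobian
    trace = a + d
    det = a * d - b * c
    if det < 0:
        return "saddle"      # real eigenvalues of opposite sign
    if det == 0:
        return "degenerate"  # at least one zero eigenvalue
    if trace < 0:
        return "stable"      # both eigenvalues have negative real part
    if trace > 0:
        return "unstable"    # both eigenvalues have positive real part
    return "centre"          # purely imaginary eigenvalues of the linearisation


def saddle_eigenvalues(jacobian):
    """Return the (negative, positive) eigenvalue pair when det(J) < 0."""
    (a, b), (c, d) = jacobian
    trace = a + d
    det = a * d - b * c
    if det >= 0:
        raise ValueError("the Jacobian does not describe a saddle point")
    root = math.sqrt(trace * trace - 4.0 * det)
    return ((trace - root) / 2.0, (trace + root) / 2.0)


# Hypothetical Jacobian with positive trace and negative determinant.
J_EXAMPLE = [[0.05, 1.0], [0.2, 0.0]]

if __name__ == "__main__":
    print(classify_equilibrium(J_EXAMPLE))
    print(saddle_eigenvalues(J_EXAMPLE))
```

In the example the trace is positive, as in the proof, yet the classification is driven entirely by the sign of the determinant.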

B Proof of Proposition 2
The (current-value) Hamiltonian function for the optimal control problem of the unregulated firm is where μ is the (current-value) co-state variable. The FOCs with respect to price and investment yield, respectively, The adjoint equation is $\dot{\mu}$ By combining the last two equations, we get the first equation of system (5), which an optimal control must satisfy, together with the transversality condition $\lim_{t\to\infty} e^{-rt}\mu(t) = 0$. The sufficient conditions for an optimal control are satisfied if the Hamiltonian is jointly concave with respect to the control and the state variables. To this end, a sufficient condition is that its Hessian matrix is negative definite. The Hessian matrix is given by: which is identical to the Hessian matrix (17), with the exception of the (1, 1)-element. Thus, it is straightforward to see that, under Assumption (A), the considered matrix is negative definite.

As for the steady state analysis, given the smoothness properties of the considered functions, a sufficient condition for Eq. (6) to admit a unique (positive) solution is that: Proceeding as in the proof of Proposition 1, it is easy to see that conditions (i) and (ii) are satisfied. As for condition (iii), we have By substituting the expression for h ( ) obtained from the implicit function theorem, i.e., it is clear that condition (iii) is satisfied under Assumption (A).
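The implicit-function step invoked for condition (iii) follows a standard pattern, sketched here in generic notation. The names F and x are placeholders introduced only for this illustration and are not the paper's symbols: F(p, x) = 0 stands for the condition that defines the price implicitly, and x for the state variable. Differentiating the identity F(h(x), x) = 0 with respect to x gives the derivative of the implicit function, provided the partial derivative with respect to p does not vanish.

```latex
F\bigl(h(x), x\bigr) = 0
\quad\Longrightarrow\quad
F_p\, h'(x) + F_x = 0
\quad\Longrightarrow\quad
h'(x) = -\frac{F_x\bigl(h(x), x\bigr)}{F_p\bigl(h(x), x\bigr)},
\qquad F_p \neq 0 .
```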
As for the stability of the equilibrium point, the Jacobian matrix of system (5), taking into account that p is fixed according to Eq. (4), which defines the implicit function p = h( ), is given by:

C Proof of Lemma 2
To find an OLNE of the differential game, we apply the Pontryagin maximum principle. The (current-value) Hamiltonian function for player i is where $\mu_i = (\mu_{i,1}, \dots, \mu_{i,N})$ is the vector of co-state variables for player i. The FOC with respect to $I_i$ yields

D Proof of Lemma 3
Consider firm i's problem, given that the other players j = 1, . . . , N, j ≠ i, adopt a Markovian state-feedback strategy $I_j = \phi_j(\cdot)$. Let $\phi_{-i}$ denote the vector whose components are the equilibrium strategies of firm i's rivals, i.e., $\phi_{-i} = (\phi_1(\cdot), \dots, \phi_{i-1}(\cdot), \phi_{i+1}(\cdot), \dots, \phi_N(\cdot))$. Then, the (current-value) Hamiltonian function for player i is is the vector of co-state variables of player i associated with the state variables of the other players. The FOC with respect to $\phi_i$ is again given by (20), but now the adjoint equations are which must be satisfied together with the transversality conditions lim t→∞ e −rt μ i,i i = 0, i = 1, . . . , N. To look for an MPNE, we should first compute the constant (i.e., time-invariant) solution of the ODE system in the unknown functions $\mu_{i,j}$, then substitute it into the ODE for $\mu_{i,i}$. Then, consider the closed-loop strategy for player i, $\phi_i(\cdot)$. From the FOC (20) we obtain $\dot{\mu}$ By equating the right-hand sides of the two ODEs in $\dot{\mu}_{i,i}$ (i.e., (22) and the first ODE of system (21)), we can find the PDE that must be satisfied by firm i's MPNE strategy. Thus, to find an MPNE for a given pricing rule $\rho(\cdot)$, we must carry out these calculations for each player.
To obtain analytical expressions, we consider the simplest case with N = 2 firms. In this case, for player i, system (21) becomes: The constant solution for $\mu_{i,j}$ is given by: