1 Introduction

Yardstick regulation is a price regulation scheme, first presented in a theoretical model by Shleifer (1985), under which the regulated price set for a given firm, which is a monopolist in a market niche, is derived from the cost structure of similar firms operating in different niches. This mechanism is implemented, in some countries, to regulate prices in healthcare, transport, energy and, more generally, public utilities and services (see, e.g., Schmalensee & Willig, 1989, Part 5; Armstrong & Porter, 2007), with two main aims: to avoid market failures due to asymmetric information between regulated firms and the regulator, and to give less efficient firms an incentive to improve their efficiency.

Although yardstick regulation was proposed precisely to lead firms to improve their efficiency over time, to the best of our knowledge no simple dynamic version of the model is available in the literature. By ‘dynamic version’ we mean a model in which a pricing rule is specified for every instant of the period under consideration, so that one can evaluate the time paths of the relevant variables: on the one hand, the cost-reducing investments and the corresponding cost levels of the regulated firms and, on the other hand, the regulated prices and the demand levels. We aim to fill this gap by presenting a differential game model in which yardstick competition can be introduced. We study the features of this regulatory mechanism and, in particular, assess whether the repeated application of the yardstick pricing rule derived by Shleifer (1985) leads to the first-best outcome in a dynamic game framework as well.

Our paper contributes to the somewhat limited literature on the dynamic aspects of yardstick regulation. Specifically, a generalization of Shleifer’s static model is proposed by Faure-Grimaud and Reiche (2006), who explicitly deal with dynamic yardstick mechanisms, focusing on mechanism design in the presence of private information, and explore the limitations of yardstick mechanisms that can justify the use of long-term contracts. The importance of considering truly dynamic models to assess the properties of yardstick competition is also supported by Meya (2015), who observes that real-life applications of yardstick regulation often resort to historical cost data and argues that a firm under yardstick regulation can affect the price it will be allowed to charge in the future if it can influence the current behaviour of competitors. Unlike in these contributions, in our model firms undertake cost-reducing investments, as in Shleifer’s seminal paper and other related (static) models. Moreover, rather than considering a finitely (Faure-Grimaud & Reiche, 2006) or infinitely (Meya, 2015) repeated stage-game, we conduct the analysis in a continuous-time setting.

On the technical side, our work is thus closer to previous contributions dealing with dynamic regulation in a differential game framework. For instance, in the field of environmental economics, Benchekroun and Van Long (1998) and Menezes and Pereira (2017) consider efficiency-inducing taxes or subsidies to correct the production choices of polluting oligopolists; in the field of resource economics, Akao (2008) and Bisceglia (2020) deal with the optimal taxation of common-pool resources. More closely related to our work, Bisceglia et al. (2020) characterise the optimal dynamic volume-based price regulation in the oligopoly model with quality competition among firms proposed by Brekke et al. (2012).

As in these papers, in our model the type of interaction among firms depends on the assumptions concerning the information set available to firms at any point in time. As usual in differential games (see, e.g., Dockner et al., 2000), we distinguish the case in which firms set their plans at the beginning of the game and then stick to them forever (open-loop solution) from the case in which the state of the world is observable at every instant, and the firms find the optimal rule connecting the choice variable to the state variables (closed-loop solution). In particular, as far as the latter concept is concerned, we consider Markovian state-feedback behaviour rules, in which the players’ choice variables are linked to the current value of the state variables only, and these rules are stable over time.

The consideration of a model in continuous time also requires a specification concerning how we set up yardstick regulation: in Shleifer’s (1985) static framework, the regulator announces (and commits to) a pricing rule; then firms make their (one-shot) decision, and the regulator’s rule is implemented. In our dynamic setting, the rule has to be interpreted as announced at the outset of the game (i.e., at time 0); then firms make their decisions over time, and the rule is implemented instant by instant.

Following the literature (see, e.g., Karp & Livernois, 1992), we argue that the regulator cannot commit to time dependent pricing rules and we show that, even in the absence of information asymmetries between the firms and the regulator, the social first-best cannot be implemented by considering the dynamic price regulation problem of each firm separately—i.e., by linking the regulated price of each firm to its own cost only. Starting from this impossibility result, we then consider pricing rules involving yardstick competition among the regulated firms.

Our model shows that the yardstick regulation rule, as designed in the static framework by Shleifer, can replicate the social optimum in the dynamic framework only if firms adopt open-loop behaviour rules. Even then, if regulated firms are asymmetric in their initial costs, the first-best is replicated only in the asymptotic steady state. If, instead, firms are symmetric, the social first-best is also replicated along the whole equilibrium path leading to the steady state. This is because, when players are constrained to stick to a strategy chosen at the beginning of the game, there is no genuinely dynamic interaction, so that the results shown in a static setting carry over to the dynamic game.

By contrast, the yardstick regulation fails to replicate the first-best outcome if firms adopt Markovian state-feedback closed-loop behaviour rules. This is due to the fact that, in the absence of regulation, firms are able to react instant by instant to the dynamic evolution of state variables (namely, the cost levels of each firm) controlling two choice variables (namely, prices and cost reducing investments), whereas the regulator can control only one policy instrument (i.e., the regulated price). Nevertheless, in the game with state-feedback information structure, regulated prices inducing yardstick competition are likely to alleviate the static and dynamic inefficiencies that would arise in the absence of regulation—i.e., if each monopolistic firm could freely set its prices.

The remainder of the paper is organised as follows. Section 2 introduces the basics of the model. Sections 3 and 4 provide the social first-best solution and the optimal solution for a profit-oriented monopolist, respectively. Section 5 deals with preliminary and general aspects concerning price regulation in a dynamic environment. Section 6 studies the features of the yardstick regulation mechanism. Section 7 concludes.

2 The model set-up

As in Shleifer (1985), we consider N identical firms. Each of them serves a market niche by selling one product at a price p. The niche demand, denoted by q(p), must satisfy some standard assumptions: (i) there is a finite choke price \({\overline{p}}>0\) such that \(q(p)>0\) for all \(p\in [0,{\overline{p}})\) and \(q(p)=0\) for \(p\ge {\overline{p}}\); and (ii) q(p) is strictly decreasing and twice continuously differentiable over \((0,{\overline{p}})\).

As for the technology, we consider constant marginal costs, denoted by \( c(\epsilon )>0\), whose level depends on a variable \(\epsilon \ge 0\), representing the firm’s efficiency level. With fixed costs set, without loss of generality, equal to zero, the marginal cost coincides with the average, or unit, cost. We assume that the unit cost function \(c(\cdot )\ge 0\) is twice continuously differentiable, with \(c^{\prime }(\cdot )<0\) and \( c^{\prime \prime }(\cdot )>0\) for all \(\epsilon \ge 0\), \(\displaystyle \lim _{\epsilon \rightarrow \infty }c(\epsilon )={\underline{c}}>0\) and \(\displaystyle \lim _{\epsilon \rightarrow \infty }c'(\epsilon )=0\): a more efficient firm faces lower costs, and the marginal effect of the efficiency level on costs is decreasing and vanishing in the limit \(\epsilon \rightarrow \infty \). Moreover, for simplicity we posit \(c(0)<{\overline{p}}\), so that even the least efficient firm can sell a positive quantity.

At every instant, firms can make an investment I aimed at increasing technological efficiency, that is, at reducing the unit cost. Investment \(I\ge 0\) entails a cost \(\varGamma (I)\ge 0\) that is assumed to be twice continuously differentiable, with \(\varGamma ^{\prime }(\cdot )>0\) and \(\varGamma ^{\prime \prime }(\cdot )>0\) for all I.

The instantaneous profit function for each firm is thus

$$\begin{aligned} \pi (t) \triangleq [p(t)-c(\epsilon (t))]q(p(t))-\varGamma (I(t)) . \end{aligned}$$

The efficiency level moves over time according to the following rule, which is clearly inspired by the simple and widely used rule concerning the evolution of the capital stock under investments (see, e.g., Dockner et al., 2000):

$$\begin{aligned} {\dot{\epsilon }}(t)=-\delta \epsilon (t)+\theta I(t), \end{aligned}$$
(1)

where \(\delta \in (0,1)\) is a depreciation rate and \(\theta >0\) measures the investments’ effectiveness, with an initial value \(\epsilon (0)\triangleq \epsilon _{0}>0\).
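As a rough numerical illustration of the state Eq. (1), the sketch below assumes a constant investment level, in which case the efficiency level converges monotonically to \(\theta I/\delta \). The parameter values and function names are hypothetical and chosen only for illustration; the forward-Euler loop is included merely as a cross-check of the closed form.

```python
import math

def efficiency_path(eps0, invest, delta, theta, t):
    """Closed-form solution of Eq. (1) when investment is held constant."""
    rest_point = theta * invest / delta  # level at which depreciation offsets investment
    return rest_point + (eps0 - rest_point) * math.exp(-delta * t)

def euler_path(eps0, invest, delta, theta, t_end, steps=100_000):
    """Forward-Euler integration of Eq. (1), used only as a cross-check."""
    dt = t_end / steps
    eps = eps0
    for _ in range(steps):
        eps += dt * (-delta * eps + theta * invest)
    return eps

# Hypothetical parameter values (not taken from the paper).
delta, theta = 0.1, 1.0
eps0, invest = 1.0, 0.5

exact = efficiency_path(eps0, invest, delta, theta, 20.0)
approx = euler_path(eps0, invest, delta, theta, 20.0)
print(f"closed form: {exact:.4f}, Euler: {approx:.4f}")
```

With zero investment the same formula reduces to pure exponential decay of the efficiency level at rate \(\delta \).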

Following Shleifer (1985), a transfer T, financed by a lump-sum tax, may be designed to cover firms’ possible losses. However, since T has only redistributive effects, it is immaterial to the first-best allocation.

We consider a model with an infinite time horizon and assume that all firms discount future profits at a constant instantaneous rate \(r>0\). For simplicity and in line with a large body of literature, we also consider the same discount rate for the regulator.

Finally, together with the requirements imposed on the cost and the demand functions, throughout the paper, in order to have concave maximization problems, we impose the following Assumption.

Assumption (A) \([c^{\prime }(\epsilon )q^{\prime }(p)]^2+[q^{\prime }(p)+(p-c(\epsilon ))q^{\prime \prime }(p)]c^{\prime \prime }(\epsilon )q(p)<0.\)

This assumption is satisfied by several commonly used demand and cost functions, at least under some restrictions, as shown in the following example.

Example 1

Consider the linear demand \(q(p)\triangleq a-bp\), with \(a,b>0\), and the exponential cost function \(c(\epsilon )\triangleq z e^{-k\epsilon }+{\underline{c}}\), with \(z,k>0\). Then, it is easy to see that Assumption (A) can be written as: \(a - b p> b z e^{-k\epsilon }\). As the left-hand side is decreasing in p, and the largest p that a firm can optimally set is the monopoly price \(p^m\triangleq \frac{a+b c(\epsilon )}{2b}\), it then follows that Assumption (A) is always satisfied for \(b<\frac{a}{3z+{\underline{c}}}\). This example suggests that Assumption (A) simply amounts to imposing an upper bound on the absolute value of the price-derivative of demand.
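The bound in Example 1 can be checked numerically. The sketch below evaluates the left-hand side of Assumption (A) at the monopoly price, which is the most demanding case for the linear-exponential specification, over a grid of efficiency levels. All parameter values and function names are hypothetical and serve only as an illustration.

```python
import math

def assumption_a_lhs(p, eps, a, b, z, k, c_low):
    """Left-hand side of Assumption (A) for q(p) = a - b*p and c(eps) = z*exp(-k*eps) + c_low."""
    q, dq, d2q = a - b * p, -b, 0.0
    decay = math.exp(-k * eps)
    c, dc, d2c = z * decay + c_low, -k * z * decay, k * k * z * decay
    return (dc * dq) ** 2 + (dq + (p - c) * d2q) * d2c * q

def holds_up_to_monopoly_price(a, b, z, k, c_low, eps_grid):
    """True if Assumption (A) holds at the monopoly price for every eps on the grid."""
    for eps in eps_grid:
        c = z * math.exp(-k * eps) + c_low
        p_m = (a + b * c) / (2 * b)  # monopoly price under linear demand
        if assumption_a_lhs(p_m, eps, a, b, z, k, c_low) >= 0:
            return False
    return True

# Hypothetical parameter values (not taken from the paper).
a, z, k, c_low = 10.0, 2.0, 0.5, 1.0
bound = a / (3 * z + c_low)              # upper bound on b from Example 1
grid = [0.1 * i for i in range(201)]     # eps from 0 to 20

ok_below = holds_up_to_monopoly_price(a, 1.0, z, k, c_low, grid)  # b below the bound
ok_above = holds_up_to_monopoly_price(a, 1.6, z, k, c_low, grid)  # b above the bound
print(f"bound on b: {bound:.4f}, holds for b=1.0: {ok_below}, holds for b=1.6: {ok_above}")
```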

In the following two sections, we consider two benchmark cases. We first examine the social planner problem—i.e., we find the optimal price and investments set by a benevolent planner who maximises social welfare. Secondly, we consider the dynamic optimization problem faced by an unregulated profit-oriented firm.

3 The first-best solution

As in Shleifer’s (1985) seminal (static) paper, we define the social welfare function for each market niche, at any instant t, as the sum of the consumer surplus and the firm’s profit—i.e.,

$$\begin{aligned} w(t)\triangleq \int _{p(t)}^{{\overline{p}}}q(x)dx+\pi (t). \end{aligned}$$

Notice that we do not attach any opportunity cost to public expenditure if lump-sum transfers are needed for firms to break even. Extending the problem to a dynamic framework, the benevolent planner faces the following optimal control problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \max _{p(\cdot )\ge 0,I(\cdot )\ge 0}\int _{0}^{\infty }e^{-rt}w(t)\,dt \\ {\dot{\epsilon }}(t)=-\delta \epsilon (t)+\theta I(t),\quad \epsilon (0)\triangleq \epsilon _{0}>0 \end{array}\right. } \end{aligned}$$

whose solution is shown in the following Proposition.

Proposition 1

In the first-best solution, for every \(t\in [0,\infty )\):

$$\begin{aligned} p=c(\epsilon ), \end{aligned}$$

and the time paths of the investments and the efficiency level solve the ODE system

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{I}=(r+\delta )I+\frac{\theta }{\varGamma ^{\prime \prime }(I)}q(p)|_{p=c(\epsilon )}c^{\prime }(\epsilon ) \\ {{\dot{\epsilon }}}=-\delta \epsilon +\theta I \end{array}\right. } \end{aligned}$$
(2)

The efficiency level in steady state \(\epsilon ^*\) is the unique positive solution of

$$\begin{aligned} \frac{\varGamma ^{\prime \prime }(I)\delta (r+\delta )}{\theta ^2}\epsilon ^*=-c^{\prime }(\epsilon ^*)q(p)|_{p=c(\epsilon ^*)}, \end{aligned}$$
(3)

and it constitutes a saddle point.

Proof

See “Appendix A”. \(\square \)

Clearly, the benevolent planner sets the social first-best (i.e., the perfect competition) prices over time. The corresponding demand levels affect the optimal amount of cost-reducing investments, which in turn affects future costs, hence prices, and so on. Specifically, by drawing the nullclines in the state-control plane, it can be easily checked that the transition to the steady state along the stable manifold is as follows: if \(\epsilon _{0}<\epsilon ^{*}\), then the social planner sets a high initial investment level \(I(0)>I^{*}\triangleq \frac{\delta }{\theta } \epsilon ^{*}\); over time, the socially optimal investment levels decrease, yet the efficiency level increases up to its steady state value, and hence prices decrease over time. The opposite holds if \(\epsilon _{0}>\epsilon ^{*}\). This is not surprising, and the economic intuition behind the convergence process is linked to the decreasing marginal returns of investments.
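The saddle-point property and the direction of the transition can be illustrated by linearising system (2) around the steady state. The sketch below is only an illustration under stated assumptions: it uses the linear demand and exponential cost of Example 1, a quadratic investment cost \(\varGamma (I)=\gamma I^{2}/2\) (so that \(\varGamma ^{\prime \prime }=\gamma \) is constant), and hypothetical parameter values. A negative slope of the stable eigenvector means that \(\epsilon _{0}<\epsilon ^{*}\) goes together with \(I(0)>I^{*}\), at least locally.

```python
import math

# Hypothetical parameter values (not taken from the paper).
a, b = 10.0, 1.0                  # linear demand q(p) = a - b*p
z, k, c_low = 2.0, 0.5, 1.0       # cost c(eps) = z*exp(-k*eps) + c_low
r, delta, theta, gamma = 0.05, 0.1, 1.0, 1.0  # gamma: assumed quadratic cost Gamma(I) = gamma*I^2/2

def cost(eps):
    return z * math.exp(-k * eps) + c_low

def dcost(eps):
    return -k * z * math.exp(-k * eps)

def d2cost(eps):
    return k * k * z * math.exp(-k * eps)

def demand(p):
    return a - b * p

def steady_state_gap(eps):
    """Left-hand side minus right-hand side of Eq. (3) with Gamma'' = gamma."""
    return gamma * delta * (r + delta) / theta ** 2 * eps + dcost(eps) * demand(cost(eps))

def bisect(f, lo, hi, iterations=100):
    """Plain bisection, assuming f(lo) < 0 < f(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eps_star = bisect(steady_state_gap, 0.0, 100.0)

# Jacobian of system (2) in (I, eps) at the steady state: [[r + delta, A], [theta, -delta]].
A = (theta / gamma) * (-b * dcost(eps_star) ** 2 + demand(cost(eps_star)) * d2cost(eps_star))
trace = r
det = -delta * (r + delta) - theta * A
disc = math.sqrt(trace ** 2 - 4 * det)
mu_stable, mu_unstable = 0.5 * (trace - disc), 0.5 * (trace + disc)
slope = (delta + mu_stable) / theta  # dI/d(eps) along the stable eigenvector

print(f"eps* = {eps_star:.4f}, eigenvalues = ({mu_stable:.4f}, {mu_unstable:.4f}), slope = {slope:.4f}")
```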

Finally, it is easy to see that the steady state value \(\epsilon ^{*}\) is a decreasing function of the depreciation rate \(\delta \), whereas it is an increasing function of the investments’ effectiveness (captured by the parameter \(\theta \)). These outcomes, too, are unsurprising and in line with our assumptions, which define a well-behaved technology.
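These comparative statics can be reproduced numerically by solving Eq. (3) for different values of \(\delta \) and \(\theta \). The sketch below again assumes the specification of Example 1 together with a quadratic investment cost \(\varGamma (I)=\gamma I^{2}/2\); all default values and the function name are hypothetical.

```python
import math

def steady_state_efficiency(delta, theta, r=0.05, gamma=1.0,
                            a=10.0, b=1.0, z=2.0, k=0.5, c_low=1.0):
    """Solve Eq. (3) by bisection for linear demand, exponential cost and Gamma'' = gamma."""
    kappa = gamma * delta * (r + delta) / theta ** 2

    def gap(eps):
        decay = math.exp(-k * eps)
        marginal_saving = k * z * decay              # equals -c'(eps)
        demand_at_cost = a - b * (z * decay + c_low) # q(p) evaluated at p = c(eps)
        return kappa * eps - marginal_saving * demand_at_cost

    lo, hi = 0.0, 1.0
    while gap(hi) < 0:       # expand the bracket until the sign changes
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

base = steady_state_efficiency(delta=0.1, theta=1.0)
faster_depreciation = steady_state_efficiency(delta=0.2, theta=1.0)
more_effective = steady_state_efficiency(delta=0.1, theta=1.5)
print(f"base: {base:.3f}, higher delta: {faster_depreciation:.3f}, higher theta: {more_effective:.3f}")
```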

4 The unregulated firm

Each unregulated firm aims to maximise the discounted profit flow by choosing prices and cost-reducing investments over time, subject to the dynamics of the efficiency level given by the ODE (1). The solution of this optimal control problem is shown in the following Proposition.

Proposition 2

The unregulated firm, for every \(t\in [0,\infty )\), chooses p such that

$$\begin{aligned} q(p)+[p-c(\epsilon )]q^{\prime }(p)=0, \end{aligned}$$
(4)

and the time paths of the investments and the efficiency level solve the ODE system

$$\begin{aligned} {\left\{ \begin{array}{ll} {\dot{I}}=(r+\delta )I+\frac{\theta }{\varGamma ^{\prime \prime }(I) }q(p)c^{\prime }(\epsilon ) \\ {\dot{\epsilon }}=-\delta \epsilon +\theta I \end{array}\right. } \end{aligned}$$
(5)

Let \(p=h(\epsilon )\) be the implicit function defined by the firm’s monopoly pricing rule (4). Then, the efficiency level in steady state is the unique positive root \({\bar{\epsilon }}\) of the equation

$$\begin{aligned} \frac{\varGamma ^{\prime \prime }(I) \delta (r+\delta )}{\theta ^{2}}{\bar{\epsilon }}=-q(p)|_{p=h(\bar{ \epsilon })}c^{\prime }({\bar{\epsilon }}), \end{aligned}$$
(6)

and it constitutes a saddle point. Moreover, \({\bar{\epsilon }}<\epsilon ^{*}\).

Proof

See “Appendix B”. \(\square \)

Notice that the difference between the investment time paths chosen by an unregulated firm and by the benevolent planner is entirely driven by the different pricing rules implemented, i.e., the monopoly price versus the perfect competition one. Consequently, the presence of market power results in higher steady state costs compared to the social optimum. Thus, in steady state, the social planner sets the perfect competition price, which is given by the cost \( c(\epsilon ^{*})\), whereas the unregulated firm adds the monopolist mark-up to its cost \(c({\bar{\epsilon }})>c(\epsilon ^{*})\). The intuition is rather simple: other things being equal, the individual optimum, as in a static framework, is characterised by a lower quantity and a higher price than the social optimum; hence, from the perspective of the dynamic problem, the incentive to invest over time in reducing the unit cost is lower. Put another way, in our model, besides the static deadweight loss, a monopoly also entails dynamic inefficiencies, due to the weaker incentives to reduce the unit production cost: the well-known replacement effect (Arrow, 1962).
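The ranking \({\bar{\epsilon }}<\epsilon ^{*}\) can be illustrated by solving Eqs. (3) and (6) side by side. The sketch below assumes the linear demand and exponential cost of Example 1, for which the monopoly pricing rule (4) gives \(h(\epsilon )=\frac{a+bc(\epsilon )}{2b}\), and a quadratic investment cost \(\varGamma (I)=\gamma I^{2}/2\). Parameter values and function names are hypothetical.

```python
import math

# Hypothetical parameter values (not taken from the paper).
a, b = 10.0, 1.0
z, k, c_low = 2.0, 0.5, 1.0
r, delta, theta, gamma = 0.05, 0.1, 1.0, 1.0

def cost(eps):
    return z * math.exp(-k * eps) + c_low

def marginal_cost_saving(eps):
    return k * z * math.exp(-k * eps)  # equals -c'(eps)

def planner_price(eps):
    return cost(eps)                   # first-best price p = c(eps)

def monopoly_price(eps):
    return (a + b * cost(eps)) / (2 * b)  # solves Eq. (4) under linear demand

def steady_state(price_rule):
    """Bisection on Eq. (3) or (6), depending on the pricing rule passed in."""
    kappa = gamma * delta * (r + delta) / theta ** 2

    def gap(eps):
        return kappa * eps - marginal_cost_saving(eps) * (a - b * price_rule(eps))

    lo, hi = 0.0, 100.0  # gap(lo) < 0 < gap(hi) for the values above
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eps_star = steady_state(planner_price)
eps_bar = steady_state(monopoly_price)
print(f"planner: eps* = {eps_star:.3f}, price = {planner_price(eps_star):.3f}")
print(f"monopoly: eps_bar = {eps_bar:.3f}, price = {monopoly_price(eps_bar):.3f}")
```

Under this specification the monopolist serves half of the quantity demanded at marginal cost pricing, which halves the return to cost reduction at every efficiency level and therefore lowers the steady state efficiency.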

Thus, market regulation is needed to correct these (static and dynamic) inefficiencies. In the remainder of the paper, we deal with this optimal dynamic price regulation problem.

5 Price regulation: preliminaries

To begin with, a few preliminary remarks about the dynamic price regulation problem under consideration are in order. First, if the regulator (the benevolent social planner) knows a firm’s technology, in principle she could easily solve her problem by setting at any time t a regulated price equal to the first-best cost at time t, so as to lead the firm to replicate the social first-best investment choices. However, in our dynamic setting, even under complete information, this solution cannot be implemented if the regulator has limited commitment power, i.e., if she cannot commit at the outset of the game to the entire time path of prices \(\{p(t)\}_{t\in [0,\infty )}\). Indeed, for several reasons, the assumption of full commitment is hardly realistic. It is much more plausible to assume that the regulator can only commit to implementing a Markovian pricing rule, which specifies the regulated price as a function of the current value of the state variables.

This restriction can be justified on two grounds. On the one hand, the yardstick pricing rule, as implemented in the real world, can be seen as a Markov pricing rule (see, e.g., Meya, 2015). On the other hand, regulation in the form of Markov policies has been widely employed in the differential game literature (see, e.g., Benchekroun & Van Long, 1998), mainly for two reasons. First, this choice is made for the sake of tractability, since it allows us to restrict attention to Markovian (state-feedback) Nash equilibria, which are much easier to characterise and interpret than non-Markovian equilibria. Second, non-Markovian policies, which (explicitly) depend on time, are often vulnerable to strategic manipulation by the regulated firm (see, e.g., Karp & Livernois, 1992).

Thus, in the remainder of the analysis, as the solution concept to the outlined dynamic price regulation problem, we consider first-best Markov pricing rules, defined as follows.

Definition 1

A first-best Markov pricing rule is a function \(\rho (\epsilon ): {\mathbb {R}}^+\rightarrow {\mathbb {R}}^+\) such that: (i) the investments’ time path of the regulated firm solves system (2) and leads to steady state defined by (3), and (ii) the price is set at the perfect competition level for every \(t\in [0,\infty )\).

To solve our dynamic price regulation problem, we must first consider a regulated firm’s behaviour in response to a generic Markov pricing rule announced by the regulator and then look for a Markov pricing rule such that the firm’s behaviour yields the first-best solution (i.e., the two conditions stated in Definition 1 are fulfilled).

To begin with, notice that, for any given Markov pricing rule \(\rho (\epsilon )\) announced by the public authority, a regulated firm’s profit, at any time t, is

$$\begin{aligned} \pi (t)\triangleq [\rho (\epsilon (t))-c(\epsilon (t))]q(p)|_{p=\rho (\epsilon (t))}-\varGamma (I(t)). \end{aligned}$$

Hence, the firm chooses the investments’ time path to maximise its discounted profit flow, subject to the dynamic constraint (1).

Notice that a pricing rule \(\rho (\epsilon )\) does not implement yardstick competition among the regulated firms, since it links the regulated price for a firm to the cost of that firm only: in other words, for the time being, we are assuming that the regulator aims to solve the price regulation problem of each firm separately. It is worth noting that a regulated price \(\rho (\epsilon )\triangleq c(\epsilon )\) (i.e., a cost reimbursement mechanism) would result in nil investments over time, since the regulated firm could not increase its revenues by investing in cost-reducing activities, while it would bear the costs of such investments.
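The claim about cost reimbursement can be verified in one step. Substituting \(\rho (\epsilon )=c(\epsilon )\) into the regulated firm's instantaneous profit gives

$$\begin{aligned} \pi (t)=[c(\epsilon (t))-c(\epsilon (t))]\,q(p)|_{p=c(\epsilon (t))}-\varGamma (I(t))=-\varGamma (I(t)), \end{aligned}$$

which, since \(\varGamma ^{\prime }(\cdot )>0\), is maximised at every instant by \(I(t)=0\). The state Eq. (1) then reduces to \({\dot{\epsilon }}(t)=-\delta \epsilon (t)\), so that \(\epsilon (t)=\epsilon _{0}e^{-\delta t}\) and the unit cost drifts up towards \(c(0)\).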

By applying the Pontryagin maximum principle, it is easy to find that the investments’ and efficiency levels over time of a regulated firm solve the following ODE:

$$\begin{aligned} {\dot{I}}=(r+\delta )I+\frac{\theta }{\varGamma ''(I) }\left[ q(p)|_{p=\rho (\epsilon )}c^{\prime }(\epsilon )-\frac{\partial \rho }{\partial \epsilon } [q(p)|_{p=\rho (\epsilon )}+q^{\prime }(p)|_{p=\rho (\epsilon )}(\rho (\epsilon )-c(\epsilon ))]\right] , \end{aligned}$$
(7)

coupled with the state Eq. (1). Next, assume that there exists a Markov pricing rule \(\rho (\epsilon )\) such that, for some instant \(t\ge 0\), up to time t, the prices and the investments of the regulated firm coincide with the corresponding first-best levels. Then, the investment level at time \(t+dt\) will coincide with the first-best one if and only if the right-hand side of Eq. (7) coincides with the right-hand side of the corresponding ODE in system (2). Since, by assumption, \( \rho (\epsilon (t))=c(\epsilon (t))\), we have

$$\begin{aligned} q(p)|_{p=c(\epsilon (t))}c^{\prime }(\epsilon (t))= q(p)|_{p=c(\epsilon (t))}c^{\prime }(\epsilon (t))-\rho ^{\prime }(\epsilon (t))q(p)|_{p=c(\epsilon (t))}\iff \rho ^{\prime }(\epsilon (t))=0. \end{aligned}$$
(8)

In turn, the price level at time \(t+dt\) will coincide with the first-best one if and only if, at \(\epsilon =\epsilon (t)\), the time derivatives of the price and cost levels coincide, i.e.,

$$\begin{aligned} \frac{d}{dt}\rho (\epsilon (t))=\frac{d}{dt}c(\epsilon (t))\iff \rho ^{\prime }(\epsilon (t)){{\dot{\epsilon }}}(t)=c^{\prime }(\epsilon (t)){{\dot{\epsilon }}}(t), \end{aligned}$$

which, off steady state (i.e., when \({{\dot{\epsilon }}}(t)\ne 0\)), implies

$$\begin{aligned} \rho ^{\prime }(\epsilon (t))=c^{\prime }(\epsilon (t))<0. \end{aligned}$$
(9)

From (8) and (9), we can conclude that there does not exist a price function which simultaneously satisfies both requirements (i) and (ii) of Definition 1.

Lemma 1

If the regulator considers the price regulation problem of each firm separately, then there does not exist a first-best Markov pricing rule.

Intuitively, this impossibility result arises because the regulator has only one instrument (i.e., the regulated price), while private firms set a larger number of control variables, namely prices and cost-reducing investments.
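For readers who wish to trace the steps behind Eq. (7) and conditions (8) and (9), a sketch of the derivation follows. The notation H for the current-value Hamiltonian and \(\lambda \) for the costate is introduced here for illustration only, and the sketch assumes a quadratic investment cost \(\varGamma (I)=\gamma I^{2}/2\) with \(\gamma >0\), for which \(\varGamma ^{\prime }(I)/\varGamma ^{\prime \prime }(I)=I\).

$$\begin{aligned} H&=[\rho (\epsilon )-c(\epsilon )]\,q(p)|_{p=\rho (\epsilon )}-\varGamma (I)+\lambda (-\delta \epsilon +\theta I), \\ \frac{\partial H}{\partial I}&=0\iff \varGamma ^{\prime }(I)=\theta \lambda , \\ {\dot{\lambda }}&=r\lambda -\frac{\partial H}{\partial \epsilon }=(r+\delta )\lambda +q(p)|_{p=\rho (\epsilon )}c^{\prime }(\epsilon )-\frac{\partial \rho }{\partial \epsilon }[q(p)|_{p=\rho (\epsilon )}+q^{\prime }(p)|_{p=\rho (\epsilon )}(\rho (\epsilon )-c(\epsilon ))]. \end{aligned}$$

Differentiating the first-order condition with respect to time gives \(\varGamma ^{\prime \prime }(I){\dot{I}}=\theta {\dot{\lambda }}\); substituting the costate equation and using \(\varGamma ^{\prime }(I)/\varGamma ^{\prime \prime }(I)=I\) yields Eq. (7). Evaluated at \(\rho (\epsilon )=c(\epsilon )\), the square bracket in (7) collapses to \(q(p)|_{p=c(\epsilon )}[c^{\prime }(\epsilon )-\rho ^{\prime }(\epsilon )]\), which matches system (2) only if \(\rho ^{\prime }(\epsilon )=0\), whereas tracking the first-best price off steady state requires \(\rho ^{\prime }(\epsilon )=c^{\prime }(\epsilon )<0\). The two requirements cannot hold together.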

As a consequence of this impossibility result, we wonder whether a dynamic price regulation based on yardstick pricing rules (which will be defined formally in the following section) may be effective in driving firms to replicate the social first-best allocation. Thus, unlike in the static model (Shleifer, 1985), in which the need for yardstick competition mechanism rests on the fact that the regulator cannot observe the firms’ technology, here, in a dynamic setting, there can be scope for introducing yardstick competition among the regulated firms even though there is no asymmetry of information between the regulator and the firms, provided that the former cannot commit to time-dependent pricing rules. This point can provide an additional rationale for the use of yardstick regulation mechanisms, beyond asymmetric information (or even in the case with symmetric information): a regulation mechanism linking the price for a firm to a state variable (moving over time) pertaining to the same firm, at least in the form of a Markov rule, is never able to replicate the first-best outcome.

6 Yardstick pricing rules

In this section, we investigate whether the possibility of introducing yardstick competition among the regulated firms can lead the regulator to overcome the impossibility result shown in Lemma 1—i.e., to achieve the first-best solution in all regulated markets.

Specifically, as in Shleifer (1985), for simplicity, we consider the same demand function \(q(\cdot )\) for each regulated firm \(i=1,\ldots ,N\). Furthermore, as far as technology is concerned, we consider the same cost functions \(c(\cdot )\) and \(\varGamma (\cdot )\) for each firm, even though we may allow for different initial efficiency levels \(\epsilon _{i}(0)\triangleq \epsilon _{i0}>0\).

Generalizing Shleifer (1985), a pricing rule entailing yardstick competition among the regulated firms is defined as follows.

Definition 2

A yardstick pricing rule is a vector function \(\varvec{\rho }(\varvec{\epsilon }):{\mathbb {R}} ^{N}\rightarrow {\mathbb {R}}^{N}\), whose \(i\)-th component \(\rho _{i}\) gives the regulated price for firm i at any time \(t\in [0,\infty )\) as a time-invariant function of the efficiency levels of all regulated firms at t, i.e., of the vector \(\varvec{\epsilon }(t)\triangleq (\epsilon _{i}(t),\varvec{\epsilon _{-i}}(t))\), where \(\varvec{\epsilon _{-i}}(t) \triangleq (\epsilon _{1}(t),\ldots ,\epsilon _{i-1}(t),\epsilon _{i+1}(t),\ldots ,\epsilon _{N}(t))\).

Clearly, yardstick pricing rules are, by definition, Markov pricing rules. The regulator aims to design a first-best pricing rule, which is now defined as a yardstick pricing rule, such that conditions (i) and (ii) stated in Definition 1 hold true, along the equilibrium path of the game (more below), for every regulated firm.

As in the previous section, the first step of the analysis consists in determining the regulated firms’ strategies in response to any given yardstick pricing rule \(\varvec{\rho }\), announced at the outset of the game and implemented at any instant of time. Each firm \(i=1,\ldots ,N\) faces the following optimal control problem:

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \max _{I_{i}(\cdot )\ge 0}\int _{0}^{\infty }e^{-rt}\left[ [\rho _{i}(\varvec{\epsilon })-c(\epsilon _{i})]q(p)|_{p=\rho _{i}(\varvec{ \epsilon })}-\varGamma (I_i)\right] \,dt \\ {\dot{\epsilon }}_{j}(t)=-\delta \epsilon _{j}(t)+\theta I_{j}(t),\, \epsilon _{j}(0)\triangleq \epsilon _{j0}>0,\,j=1,\ldots ,N \end{array}\right. } \end{aligned}$$

Thus, when the regulated price is specified as a yardstick pricing rule, the optimization problems faced by the firms are no longer independent of each other. Put another way, such a pricing scheme introduces dynamic strategic interactions among the regulated firms, so that the analysis must be conducted in a differential game framework. It is well known (see, e.g., Başar & Olsder, 1999) that, in this class of games, different kinds of strategies, and hence equilibrium concepts, are defined depending on the information structure of the game, i.e., roughly speaking, the information on which each player can base his action at each instant. Accordingly, in what follows, we deal with the yardstick regulation problem by distinguishing between two different equilibrium concepts of the game, namely the open-loop Nash equilibrium and the Markovian state-feedback closed-loop Nash equilibrium.

6.1 Open loop Nash equilibrium

In this section, we focus on the Open Loop Nash Equilibrium (OLNE) of the game, which is the equilibrium emerging in the case in which each player adopts an open-loop behaviour rule—i.e., decides at the beginning of the game (in \(t=0\)) the course of his own actions, in order to maximise his individual objective function, and then sticks to it. Hence, the optimal value of the control variable of each player over time depends on the initial conditions and on time t, but it does not depend on the state variables. The open-loop behaviour rule is particularly appropriate and realistic when players cannot observe the evolution of state variables over time, or when they have to set their plan at the beginning of the time period under consideration and then they have to commit to it, or even when it is difficult to adjust the control variable instant-by-instant to the current value of the state variables. It is not very easy to provide concrete examples. However, one could argue that open-loop behaviour rules are appropriate when long-term, non-flexible, courses of actions are required, such as investment plans (see Cellini et al., 2018, and Bisceglia et al., 2019, for real-world examples in the healthcare sector).

For any given yardstick pricing rule announced by the regulator, the OLNE of the differential game played by the regulated firms is shown in the following Lemma.

Lemma 2

Let \(\varvec{\rho }(\varvec{\epsilon })\) be the pricing rule announced by the public authority. Then, the ODE system in the state-control variables that the OLNE \((I_{1}(t),\ldots ,I_{N}(t))\) solves is

$$\begin{aligned} {\left\{ \begin{array}{ll} {\dot{I}}_{i}=(r+\delta )I_{i}+\frac{\theta }{\varGamma ^{\prime \prime }(I_i)}\left[ q(p)|_{p=\rho _{i}(\varvec{\epsilon })}c^{\prime }(\epsilon _{i})-\frac{\partial \rho _{i}}{\partial \epsilon _{i}}[q(p)|_{p=\rho _{i}(\varvec{\epsilon } )}\right. \\ \left. \qquad +q^{\prime }(p)|_{p=\rho _{i}(\varvec{\epsilon })}(\rho _{i}( \varvec{\epsilon })-c(\epsilon _{i}))]\right] \\ {\dot{\epsilon }}_{i}=-\delta \epsilon _{i}+\theta I_{i} \end{array}\right. } \end{aligned}$$
(10)

The feedback representation of the investment dynamics under the OLNE is given by an N-tuple \((\psi _{1}(\varvec{\epsilon }),\ldots ,\psi _{N}( \varvec{\epsilon }))\) which solves the following PDE system:

$$\begin{aligned} \bigg \langle \frac{\partial \psi _{i}(\cdot )}{\partial \varvec{\epsilon }},-\delta \varvec{\epsilon }+\theta \psi _{i}(\varvec{\epsilon })\bigg \rangle&=(r+\delta )\psi _{i}(\varvec{\epsilon })\nonumber \\&\quad +\frac{\theta }{\varGamma ^{\prime \prime }(\psi _i(\varvec{\epsilon })) }\left[ -\frac{\partial \rho _{i}}{\partial \epsilon _{i}}[q(p)|_{p=\rho _{i}(\varvec{\epsilon })}+q^{\prime }(p)|_{p=\rho _{i}(\varvec{\epsilon })}(\rho _{i}(\varvec{\epsilon })\right. \nonumber \\&\quad \left. -c(\epsilon _{i}))]+c^{\prime }(\epsilon _{i})q(p)|_{p=\rho _{i}(\varvec{\epsilon })} \right] . \end{aligned}$$
(11)

Proof

See “Appendix C”. \(\square \)

We are now ready to investigate whether the yardstick pricing rule provided by Shleifer (1985) leads to the first-best solution if adopted in the considered dynamic regulation problem.

Recall that Shleifer’s pricing rule is

$$\begin{aligned} \rho _{i}(\varvec{\epsilon })\triangleq \frac{1}{N-1}\sum _{j\ne i}c(\epsilon _{j}). \end{aligned}$$
(12)

The regulated price for each firm i is thus the average unit cost of the other regulated firms.
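Rule (12) is straightforward to compute. The sketch below returns the yardstick price of each firm as the average unit cost of the other firms; the exponential cost function, its parameter values and the function names are hypothetical and only serve as an example.

```python
import math

def unit_cost(eps, z=2.0, k=0.5, c_low=1.0):
    """Illustrative unit cost c(eps) = z*exp(-k*eps) + c_low (hypothetical parameters)."""
    return z * math.exp(-k * eps) + c_low

def yardstick_prices(efficiencies, cost=unit_cost):
    """Rule (12): firm i's price is the average unit cost of all firms other than i."""
    n = len(efficiencies)
    if n < 2:
        raise ValueError("yardstick pricing needs at least two firms")
    costs = [cost(eps) for eps in efficiencies]
    return [sum(costs[:i] + costs[i + 1:]) / (n - 1) for i in range(n)]

symmetric = yardstick_prices([3.0, 3.0, 3.0])
asymmetric = yardstick_prices([1.0, 3.0, 5.0])
print("symmetric firms:", [round(p, 4) for p in symmetric])
print("asymmetric firms:", [round(p, 4) for p in asymmetric])
```

With identical efficiency levels each firm's price equals its own unit cost, while with different levels the least efficient firm receives a price below its own cost.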

From (10), it follows that, when Shleifer’s rule is employed, the OLNE strategy of firm i solves:

$$\begin{aligned} {\dot{I}}_{i}=(r+\delta )I_{i}+\frac{\theta }{\varGamma ^{\prime \prime }(I_i) }c^{\prime }(\epsilon _{i})q(p)|_{p=\frac{1}{N-1}\sum _{j\ne i}c(\epsilon _{j}) }. \end{aligned}$$
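The step from system (10) to the equation above rests on the fact that, under rule (12), the regulated price of firm i does not depend on its own efficiency level. As a short check:

$$\begin{aligned} \frac{\partial \rho _{i}}{\partial \epsilon _{i}}=\frac{\partial }{\partial \epsilon _{i}}\left[ \frac{1}{N-1}\sum _{j\ne i}c(\epsilon _{j})\right] =0,\qquad \frac{\partial \rho _{i}}{\partial \epsilon _{j}}=\frac{c^{\prime }(\epsilon _{j})}{N-1}<0\quad \text {for } j\ne i. \end{aligned}$$

Hence the second term in the square bracket of (10) vanishes and only \(c^{\prime }(\epsilon _{i})q(p)|_{p=\rho _{i}(\varvec{\epsilon })}\) remains. The cross effects \(\partial \rho _{i}/\partial \epsilon _{j}\) do not appear in (10) because the investment \(I_{i}\) moves \(\epsilon _{i}\) only.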

Since the pricing rule is symmetric across firms and the game itself is symmetric, when all firms start from the same initial cost the equilibrium investment paths, and hence the cost levels over time, are the same for every regulated firm. That is, along the equilibrium path of the game,

$$\begin{aligned} \frac{1}{N-1}\sum _{j\ne i}c(\epsilon _{j})=c(\epsilon _{i}), \end{aligned}$$
(13)

from which it can easily be verified that condition (i) of Definition 1 is fulfilled for each firm \(i=1,\ldots ,N\). Moreover, from (13) it also follows that the price levels coincide with the first-best ones over time; that is, condition (ii) of Definition 1 also holds for every regulated firm.

By contrast, when the initial costs differ across firms, Eq. (13) clearly does not hold. However, denoting by \(\tilde{ \epsilon }_{i}\) the steady-state efficiency level of firm \(i=1,\ldots ,N\), we have \(I_{i}({\tilde{\epsilon }}_{i})=\frac{\delta }{\theta }\tilde{\epsilon }_{i}\) and

$$\begin{aligned} \frac{\varGamma ^{\prime \prime }(I_i) \delta (r+\delta )}{\theta ^{2}}{\tilde{\epsilon }}_{i}=-c^{\prime }({\tilde{\epsilon }}_{i})q(p)|_{p= \frac{1}{N-1}\sum _{j\ne i}c(\epsilon _{j})}. \end{aligned}$$

Since the equations of this system, for \(i=1,\ldots ,N\), are symmetric, the system admits the symmetric solution \({\tilde{\epsilon }}_{1}=\ldots =\tilde{ \epsilon }_{N}\triangleq {\tilde{\epsilon }}\), and it is easy to see that \({\tilde{\epsilon }}=\epsilon ^{*}\). We can thus state what follows.

Proposition 3

Let \(\epsilon _{i0}\triangleq \epsilon _0>0\) for all \(i=1,\ldots ,N\). Then, along the equilibrium path of the OLNE, the yardstick pricing rule (12) achieves the first-best solution. If, instead, the initial efficiency levels are different, then the considered pricing rule leads to the first-best outcome only in steady state.
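The symmetric steady state can be illustrated numerically under functional forms that are assumptions of this sketch rather than of the model: linear demand \(q(p)=a-p\), linear unit cost \(c(\epsilon )=c_0-\epsilon \), and quadratic investment cost \(\varGamma (I)=\gamma I^{2}/2\), so that \(\varGamma ^{\prime \prime }=\gamma \). All function and parameter names below are hypothetical.

```python
def olne_rhs(I, eps, eps_others, *, a, c0, gamma, theta, delta, r):
    """Right-hand side of firm i's OLNE system under rule (12).

    Assumed forms (illustrative only): q(p) = a - p, c(eps) = c0 - eps,
    Gamma(I) = gamma * I**2 / 2, hence c'(eps) = -1 and Gamma'' = gamma.
    """
    price = sum(c0 - e for e in eps_others) / len(eps_others)
    demand = a - price
    c_prime = -1.0
    I_dot = (r + delta) * I + (theta / gamma) * c_prime * demand
    eps_dot = -delta * eps + theta * I
    return I_dot, eps_dot


def symmetric_steady_state(*, a, c0, gamma, theta, delta, r):
    """Solve gamma*delta*(r+delta)/theta**2 * eps = q(c(eps)) in closed form."""
    k = gamma * delta * (r + delta) / theta**2
    if k <= 1.0:
        raise ValueError("no positive steady state under these assumed forms")
    eps = (a - c0) / (k - 1.0)
    I = delta * eps / theta
    return I, eps


# Hypothetical parameter values, chosen only so that the arithmetic is simple.
params = dict(a=10.0, c0=4.0, gamma=8.0, theta=1.0, delta=0.5, r=0.5)
I_ss, eps_ss = symmetric_steady_state(**params)

# With a rival at the same efficiency level, both time derivatives vanish.
I_dot, eps_dot = olne_rhs(I_ss, eps_ss, [eps_ss], **params)
```

With these values the steady-state price equals the common unit cost \(c({\tilde{\epsilon }})\), in line with the symmetric solution discussed above.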

With symmetric firms, Shleifer’s pricing rule constitutes an optimal pricing rule in our dynamic setting, provided that the regulated firms adopt open-loop strategies. In other words, the result obtained by Shleifer (1985), according to which yardstick regulation can replicate the social first-best, also applies (at least as far as the asymptotic steady state is concerned) to a dynamic context in which firms commit to a plan set at the beginning of the game, and strategic interaction does not take place at every instant of time. In the next section we will show that this outcome does not extend to a dynamic framework in which strategic interaction leads firms to set their choices instant by instant, following the evolution of the state variables.

6.2 State-feedback Nash equilibria

We now consider Markov perfect (state-feedback) Nash equilibria (MPNE), in which the strategy of each player depends only on the current value of the state variables: thus, such an equilibrium concept is appropriate if players in each instant observe and take into account the current value of the states, which summarises the whole past history of the game. The strategies under consideration are labelled as state-feedback by Başar et al. (2018). They are a particular case of closed-loop rules (in which the control variables depend on the state variables). State-feedback rules are said to be Markovian (and the corresponding equilibrium concept is called Markov perfect) if the functional form of the rules linking control(s) to state(s) remains stable over time.Footnote 7

For the sake of tractability, and without loss of insight, we consider the simplest case with \(N=2\) firms. We restrict our attention to symmetric MPNE, that is, pairs of Markovian strategies \((\phi _{1}(\epsilon _{1},\epsilon _{2}),\phi _{2}(\epsilon _{1},\epsilon _{2}))\) such that \(\forall \epsilon _{1},\epsilon _{2}:\phi _{1}(\epsilon _{1},\epsilon _{2})=\phi _{2}(\epsilon _{2},\epsilon _{1})\). Accordingly, we consider a symmetric pricing rule such that \(\forall \epsilon _{1},\epsilon _{2}:\rho _{1}(\epsilon _{1},\epsilon _{2})=\rho _{2}(\epsilon _{2},\epsilon _{1})\triangleq \rho (\epsilon _{i},\epsilon _{j})\). This assumption appears appropriate, as it does not entail any bias or discrimination among firms (see also, e.g., Benchekroun & Van Long, 1998); in addition, the yardstick pricing rule (12) satisfies this property.

For any such given yardstick pricing rule announced by the regulator, we can state what follows.

Lemma 3

Let \(\rho (\epsilon _i,\epsilon _j)\) be the non-discriminatory pricing rule announced by the public authority. Then, in any symmetric MPNE \( (\phi _i, \phi _j)\), with \(\phi _{i}(\epsilon _{i},\epsilon _{j})=\phi _{j}(\epsilon _{j},\epsilon _{i})\), the equilibrium strategy solves the following PDE:

$$\begin{aligned}&\frac{\partial \phi _i}{\partial \epsilon _i}(-\delta \epsilon _i+\theta \phi _i)+ \frac{\partial \phi _i}{\partial \epsilon _j}(-\delta \epsilon _j+\theta \phi _j)=(r+\delta )\phi _i+\frac{\theta }{\varGamma ^{\prime \prime }(\phi _i)}c^{\prime }(\epsilon _i)q(p)|_{p=\rho (\epsilon _i,\epsilon _j)}+\nonumber \\&\quad -\frac{\theta }{\varGamma ^{\prime \prime }(\phi _i)}[q(p)|_{p=\rho (\epsilon _i,\epsilon _j)}\nonumber \\&\quad +q^{\prime }(p)|_{p=\rho (\epsilon _i,\epsilon _j)}(\rho (\epsilon _i,\epsilon _j)-c( \epsilon _i))]\left( \frac{\partial \rho (\epsilon _i,\epsilon _j)}{\partial \epsilon _i}+\frac{\theta \frac{\partial \rho (\epsilon _i,\epsilon _j)}{\partial \epsilon _j}\frac{\partial \phi _i}{\partial \epsilon _j}}{r+\delta -\theta \frac{ \partial \phi _i}{\partial \epsilon _i}}\right) . \end{aligned}$$
(14)

Proof

See “Appendix D”. \(\square \)

Next, by considering the yardstick pricing rule \(\rho (\epsilon _{i},\epsilon _{j})\triangleq c(\epsilon _{j})\) (i.e., Shleifer’s rule (12) when \(N=2\)), the PDE (14) becomes:

$$\begin{aligned}&\frac{\partial \phi _i}{\partial \epsilon _{i}}(-\delta \epsilon _{i}+\theta \phi _i)+\frac{\partial \phi _i}{\partial \epsilon _{j}}(-\delta \epsilon _{j}+\theta \phi _j)\nonumber \\&\quad =(r+\delta )\phi _i+\frac{\theta }{\varGamma ^{\prime \prime }(\phi _i) }c^{\prime }(\epsilon _{i})q(p)|_{p=c(\epsilon _{j})}\nonumber \\&\qquad -\frac{\theta ^{2}[q(p)|_{p=c(\epsilon _{j})}+q^{\prime }(p)|_{p=c(\epsilon _{j})}(c(\epsilon _{j})-c(\epsilon _{i}))]c^{\prime }(\epsilon _{j})\frac{\partial \phi _j}{\partial \epsilon _{i}}}{\varGamma ^{\prime \prime }(\phi _i) [r+\delta -\theta \frac{\partial \phi _j}{\partial \epsilon _{j}}]}. \end{aligned}$$
(15)

By contrast, from (11) we get that the feedback representation of the OLNE strategy under the same pricing rule, which we denote by \(\psi _i(\epsilon _i,\epsilon _j)\), solves the following PDE:

$$\begin{aligned} \frac{\partial \psi _i}{\partial \epsilon _{i}}(-\delta \epsilon _{i}+\theta \psi _i)+\frac{\partial \psi _i}{\partial \epsilon _{j}}(-\delta \epsilon _{j}+\theta \psi _j)=(r+\delta )\psi _i+\frac{\theta }{\varGamma ^{\prime \prime }(\psi _i)}c^{\prime }(\epsilon _i)q(p)|_{p=c(\epsilon _j)}. \end{aligned}$$
(16)

Notice that Eq. (15) differs from Eq. (16) only by the third summand on the right-hand side, which is different from zero even when \(\epsilon _{i}=\epsilon _{j}\).Footnote 8 Therefore, no solution of the PDE (15) coincides with the solution of Eq. (16), even when restricted to \(\epsilon _{i}=\epsilon _{j}\). Since, as we have seen, when both firms have the same initial efficiency level the OLNE yields the first-best equilibrium trajectories under Shleifer’s pricing rule, we can conclude that condition (i) of Definition 1 is not fulfilled in any MPNE. We can thus state the following proposition.

Proposition 4

Shleifer’s pricing rule cannot achieve the first-best solution along the equilibrium path of any MPNE.
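The wedge between Eqs. (15) and (16) on the diagonal can be made explicit with a short worked step that uses only Eq. (15). Setting \(\epsilon _{i}=\epsilon _{j}=\epsilon \) gives \(c(\epsilon _{j})-c(\epsilon _{i})=0\), so the term in \(q^{\prime }\) drops out and the third summand, denoted here by \(T(\epsilon )\) (a label introduced only for this illustration), reduces to:

```latex
$$\begin{aligned}
T(\epsilon ) = -\frac{\theta ^{2}\, q(p)|_{p=c(\epsilon )}\, c^{\prime }(\epsilon )\,
\frac{\partial \phi _j}{\partial \epsilon _{i}}}
{\varGamma ^{\prime \prime }(\phi _i)\left[ r+\delta -\theta \frac{\partial \phi _j}{\partial \epsilon _{j}}\right] },
\qquad \text{evaluated at } \epsilon _{i}=\epsilon _{j}=\epsilon .
\end{aligned}$$
```

Provided that demand is positive at \(p=c(\epsilon )\) and \(c^{\prime }(\epsilon )\ne 0\), this expression vanishes only if \(\frac{\partial \phi _j}{\partial \epsilon _{i}}=0\), that is, only if each firm’s strategy ignores the rival’s state.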

It is worth exploring in more detail how Shleifer’s pricing rule shapes the behaviour of firms under the state-feedback information structure. To this end, notice that if we solved for the MPNE by using the dynamic programming approach, we would have:

$$\begin{aligned} \phi _{i}(\epsilon _{i},\epsilon _{j})=\frac{\theta }{\varGamma ^{\prime \prime }(\phi _i(\epsilon _{i},\epsilon _{j}))}\frac{\partial V_{i}(\epsilon _{i},\epsilon _{j})}{\partial \epsilon _{i}}, \end{aligned}$$

where \(V_{i}\) is player i’s (unknown) value function.Footnote 9 Therefore,

$$\begin{aligned} \frac{\partial \phi _{i}}{\partial \epsilon _{j}}=\frac{\theta }{\varGamma ^{\prime \prime }(\phi _i)} \frac{\partial ^{2}V_{i}}{\partial \epsilon _{j}\partial \epsilon _{i}}, \qquad \frac{\partial \phi _{i}}{\partial \epsilon _{i}}=\frac{\theta }{ \varGamma ^{\prime \prime }(\phi _i)}\frac{\partial ^{2}V_{i}}{\partial \epsilon _{i}^{2}}. \end{aligned}$$

The second-order condition of the firm’s maximization problem requires the value function to be concave, entailing

$$\begin{aligned} \frac{\partial ^{2}V_{i}}{\partial \epsilon _{i}^{2}}<0,\qquad \frac{ \partial ^{2}V_{i}}{\partial \epsilon _{j}^{2}}<0. \end{aligned}$$

Thus, we can conclude

$$\begin{aligned} \frac{\partial \phi _{i}}{\partial \epsilon _{i}}=\frac{\partial \phi _{j}}{ \partial \epsilon _{j}}<0. \end{aligned}$$

Moreover, under Shleifer’s pricing rule, it should be the case that

$$\begin{aligned} \frac{\partial \phi _{j}}{\partial \epsilon _{i}}=\frac{\partial \phi _{i}}{ \partial \epsilon _{j}}=\frac{\theta }{\varGamma ^{\prime \prime }(\phi _i)} \frac{\partial ^{2}V_{i}}{ \partial \epsilon _{j}\partial \epsilon _{i}}>0. \end{aligned}$$

The motivation (and the economic intuition) for this claim is as follows: for every regulated firm, the more efficient its rivals are, the lower the regulated price and the larger the quantity sold; consequently, the value of investments aimed at reducing costs is also higher, that is, \(\frac{\partial }{\partial \epsilon _{j}}\left( \frac{\partial V_{i}}{\partial \epsilon _{i}}\right) >0\) and hence \(\frac{\partial \phi _{i}}{\partial \epsilon _{j}}>0\).
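A static proxy for this intuition can be sketched as follows. Under \(\rho =c(\epsilon _{j})\), firm i’s flow profit is \((c(\epsilon _{j})-c(\epsilon _{i}))\,q(c(\epsilon _{j}))\), so its derivative with respect to \(\epsilon _{i}\) is \(-c^{\prime }(\epsilon _{i})\,q(c(\epsilon _{j}))\). The linear demand and cost functions and the parameter values below are hypothetical, and the flow derivative is only a proxy for the cross-derivative of the (unknown) value function.

```python
def marginal_benefit_of_efficiency(eps_i, eps_j, *, a=10.0, c0=4.0):
    """Flow gain to firm i from a marginal increase in its own efficiency.

    Assumed forms (illustrative only): q(p) = a - p and c(eps) = c0 - eps,
    so c'(eps_i) = -1 for every eps_i and the price is rho = c(eps_j).
    """
    price = c0 - eps_j        # yardstick price set by the rival's unit cost
    demand = a - price        # quantity sold at the regulated price
    c_prime = -1.0            # derivative of the assumed linear unit cost
    return -c_prime * demand  # d/d(eps_i) of (price - c(eps_i)) * demand


# A more efficient rival lowers the price, raises demand, and thereby raises
# the return to firm i's own cost reduction.
benefit_low_rival = marginal_benefit_of_efficiency(1.0, 1.0)
benefit_high_rival = marginal_benefit_of_efficiency(1.0, 2.0)
```

Under these assumptions the marginal benefit rises with the rival’s efficiency, which is the complementarity described above.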

Thus, by regulating prices according to a yardstick competition mechanism, the regulator can lead a provider to invest more in response to an increase in its rival’s efficiency. Put another way, such a pricing scheme entails intertemporal strategic complementarities (Jun & Vives, 2004), which may at least alleviate the under-investment problem arising in the unregulated monopoly case.Footnote 10 Intuitively, even over-investment could be expected under the closed-loop solution, as compared to the open-loop one: at any instant of time, under the closed-loop solution, firms are led to react to the lower production cost of rivals by further increasing their investment effort, given that the marginal benefit from cost reduction is larger, the larger the volume of production.Footnote 11 Such an instantaneous reaction is absent under the open-loop solution, where firms commit themselves to the investment plan designed at the beginning of the interaction.

In any case, the genuine strategic interaction among firms over time entailed by the state-feedback information structure, as opposed to the open-loop information structure, makes the yardstick pricing rule ineffective in replicating the first-best outcome in a dynamic environment. Intuitively, this is because the regulator controls only one policy instrument (the regulated price based on the opponent(s)’ cost), while the choice variables under the control of the regulated firms, and the state variables affecting them, are more numerous.

To conclude, the extent to which the individually optimal investment path of firms under closed-loop behaviour rules differs from the social first-best path depends on the strength of the dynamic complementarities between the investments of different firms (the optimal investment of a firm increases if the rival’s investment increases). This mechanism is distinct from the other reasons, analysed in existing contributions, that could prevent yardstick regulation from being effective, such as collusive behaviour (Tangeras, 2002; Potters et al., 2004) or biased incentives (Dalen, 1998; Sobel, 1999).

7 Concluding remarks

This article shows that the simple yardstick pricing rule proposed by Shleifer (1985) can lead to the first-best solution in a dynamic framework only if regulated firms adopt open-loop behaviour rules and are symmetric at the initial instant of time. In the presence of asymmetry among firms, social efficiency can be reached by firms following open-loop rules only in the asymptotic steady state. We have also shown that the simple yardstick competition mechanism is not able to achieve the socially efficient outcome if regulated firms adopt state-feedback behaviour rules. This is because firms are led to react, instant by instant, to the investment choices of rivals, and they can set a larger number of control variables, whereas only one policy instrument is used in yardstick regulation.

Admittedly, the aim of the present paper is narrowly focused, and the model overlooks several features of the real world, including product innovation. For instance, in our model, monopoly unambiguously entails dynamic inefficiency, beyond the usual static deadweight loss, and there is no room for dynamically efficient phenomena à la Schumpeter (Schumpeter, 1934). Though very simple, our model has shown that the properties of price regulation in a dynamic framework crucially depend on the behaviour rules followed by regulated subjects, that is, on the information set they use when making their choices.

Even though yardstick regulation retains the properties of the static context when firms commit to open-loop behaviour, the results of the present investigation considerably narrow the scope of yardstick regulation mechanisms in a dynamic context, at least as far as attaining the first-best solution is concerned and when firms adopt Markovian strategies. Examining the properties of alternative mechanisms of price regulation, such as price-cap regulation or rate-of-return regulation (see, e.g., Biglaiser & Riordan, 2000), in a genuinely dynamic context with interacting firms is on our research agenda.