Preprints of the Max Planck Institute for Research on Collective Goods Bonn 2015 / 3 The Expected Externality Mechanism in a Level-k Environment

Mechanism design theory strongly relies on the concept of Nash equilibrium. However, studies of experimental games show that Nash equilibria are rarely played and that subjects may reason through only a finite number of iterations. We study one of the most influential benchmarks of mechanism design theory, the expected externality mechanism (d'Aspremont and Gerard-Varet, 1979), in a finite-depth environment described by the Lk model. While efficient implementation fails under certain conditions, our results provide a vindication of the mechanism in the convex quasi-linear environment with finitely-rational agents.


Introduction
Mechanism design theory studies institutions with privately informed agents. Using the tools of game theory, it proposes rules of interaction such that the participants' strategic behavior complies with the designer's objective. In a leading example, the designer's purpose is to implement the socially efficient outcome, that is, to find the allocation that maximizes total welfare.

The level-k (Lk) model is constructed by the following iterative process. A player of level k = 1 ("L1 player") believes that his opponents ("L0") behave non-strategically. In incomplete information games, such as the expected externality (AGV) mechanism, L0s can be modeled in two distinct ways: either they truthfully reveal their type ("truthful L0") or they draw their actions (type reports) from a random distribution ("random L0"). An L2 player best replies to the profile of L1 strategies, L3 best replies to L2, and so on. In general, an Lk strategy is a best reply to the profile of Lk-1 strategies, suggesting the interpretation that players try to "outguess" their opponents.

As an illustration, think of a game where the players pick a number between 0 and 100 and the one whose number is closest to some fraction, say one half, of the average wins the game. In this guessing game, if L0s randomize uniformly between 0 and 100, L1s will choose 50/2 = 25, L2s will choose 25/2 = 12.5, etc. As the level k increases, the best response approaches 0, the only Nash equilibrium of the game.
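The arithmetic of the guessing game can be traced in a few lines. The sketch below is only an illustration: the function name `level_k_guesses` and its parameters are our own, L0 is summarized by its mean guess, and each player is assumed to ignore the effect of his own guess on the average.

```python
def level_k_guesses(max_level, fraction=0.5, low=0.0, high=100.0):
    """Return the guesses of L1, ..., L(max_level) in the guessing game.

    Assumption: L0 randomizes uniformly on [low, high], so its mean guess
    is the midpoint. Each higher level targets `fraction` times the guess
    it attributes to the level directly below.
    """
    guesses = []
    previous = (low + high) / 2.0  # mean guess of a uniform L0
    for _ in range(max_level):
        previous = fraction * previous  # best reply to the level below
        guesses.append(previous)
    return guesses


guesses = level_k_guesses(5)
print(guesses)  # [25.0, 12.5, 6.25, 3.125, 1.5625]
```

Each additional level halves the guess, so the sequence shrinks geometrically toward 0, the Nash equilibrium mentioned above.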
We apply the Lk model to the AGV mechanism in a setting with independent private valuations and utilities that are strictly concave with respect to the allocation. First, we observe that in the truthful-L0 specification of the Lk model the mechanism never produces a loss in efficiency. In that specification, the L1 best reply is given by the equilibrium condition of AGV, which implies truth-telling. By induction, this result extends to any higher level k; therefore the mechanism chooses the efficient allocation irrespective of the levels prevailing in the population.
Further, in the random-L0 specification of Lk, we show that if the distribution of random moves (L0) coincides with the distribution of payoff types, then the participants at any level larger than zero report truthfully to the mechanism. Next, we analyze the more challenging setup where the type distribution used by the planner to assign transfers differs from L1s' perception of the opponents' moves. In this case, the externality payment generally fails to align the agent's incentives with total expected welfare maximization. As a result, the AGV mechanism does not induce truth-telling and produces a suboptimal allocation. Denoting the distribution of random L0 strategies by Φ and the distribution of types by F, we study how the stochastic properties of Φ and F affect the Lk strategies in the mechanism.
We start with a simple environment where utilities are quadratic. In this setting, a difference in the mean values of Φ and F creates distortions at level 1. For instance, if the mean type is greater than the mean L0 report, then all types of an L1 player will over-report their types to the mechanism. Misreporting carries over to higher levels, but the expected absolute value of the distortion of type decreases exponentially as level k goes up. Moreover, the direction of bias (i.e., whether the agents over-report or under-report their types) alternates at each iteration from k to k+1. This result has two interesting implications for the outcome of the mechanism. First, if the pool of agents is a mixture of two subsequent levels (e.g., L2 and L3), the distortion of efficiency is lower than in a group where only one of these levels is present. Second, as Lk goes up, the outcome approaches efficiency.
Similar results are obtained in a more general setting, where the efficient rule is essentially linear in types. In this neutral environment types are neither substitutes nor complements with respect to the optimal allocation. A simple example of a neutral environment is the one where the optimal allocation is a linear combination, for instance, the average, of types. In this environment, whenever one of the distributions Φ and F dominates the other in the sense of first-order stochastic dominance, types are systematically misreported. We find that if the distribution of types F dominates the distribution of random moves Φ, then L1s always over-report their types. Thus they compensate for the downward bias of L0s by distorting their own reports in the opposite direction. This is due to the incentive scheme induced by the mechanism: it punishes for the expected, as opposed to the realized, negative externality.
As an extension, we study the case where type reports are complements or substitutes with respect to the optimal allocation. The direction of bias in reports can be predicted, similarly to the neutral case, but only for a subset of types. For instance, one of the results states that if the distribution of types F dominates the distribution of random moves Φ and type reports are complements with respect to the social choice function, then low-type L1s over-report their types. The reason that high-type L1s will not necessarily do so is that over-reporting leads, in expectation, to an excessively high allocation due to the complementarity in agents' reports.
In the neutral case, we also obtain the following convergence result: as level k increases, the players' strategies in the AGV mechanism tend to truth-telling. Since in most experiments the estimated values of k rarely exceed 3 (Crawford, Costa-Gomes, and Iriberri, 2013; Camerer, Ho, 2015), the convergence result bears little importance for one-shot mechanisms. However, with the interpretation of the Lk model as a learning algorithm, this result has an important implication for mechanisms that are played repeatedly. We describe a learning algorithm in the game of incomplete information with a large number of players that is equivalent to the Lk model. If learning follows that algorithm, then our convergence result for Lk implies that the players will gradually learn to report types truthfully. We can interpret the results as a vindication of the AGV mechanism in a convex quasi-linear environment with independent private values. The analysis shows that even if agents are finitely-rational, their behavior in the mechanism is centered around truth-telling.

The rest of this paper is organized as follows. Section 2 presents the key assumptions, the Lk model in incomplete information games and in the AGV mechanism in particular. Section 3 describes the properties of Lk strategies in the AGV mechanism: equivalence of Lk and equilibrium models in the AGV mechanism, the biases due to first-order stochastic dominance and convergence in the neutral environment. Section 4 partially extends the results to the case when types are substitutes or complements with respect to the efficient allocation. Section 5 explains how the results can be understood in the context of a learning model, and finally, Section 6 discusses the implications for the practical implementation of the AGV mechanism.

The Model
Preferences. The preference environment is characterized by the following assumptions:
A1. Utilities are linear in money.
A3. Values are independent and identically distributed.
Assumptions A1 and A2 imply that the utility function of a given agent $i \in I$ can be represented as:

$u_i(x, m_i, \theta_i) = v_i(x, \theta_i) + m_i$, (1)

where $v_i(x, \theta_i)$ is the utility derived from allocation $x \in X$, $\theta_i \in \Theta \subseteq \mathbb{R}$ is the privately known preference parameter that we refer to as the player's type, and $m_i$ is the monetary transfer to player $i$. We assume that $v_i(x, \theta_i)$ is strictly concave in $x$ and continuously differentiable with respect to both arguments. A3 implies that the values $\theta_i$ are drawn independently across $i \in I$. We denote the respective cumulative distribution function by $F$ and assume that $F$ is common knowledge.

We require that the preferences satisfy a single crossing (Spence-Mirrlees) condition. The condition postulates that the function $v_i(x, \theta_i)$ has a cross-derivative with respect to allocation $x$ and type $\theta_i$ with a sign that is constant over the function's domain:
A4. $v_i(x, \theta_i)$ satisfies the Spence-Mirrlees condition, i.e., either A4.1 or A4.2 holds:
A4.1. $\frac{\partial^2 v_i(x, \theta_i)}{\partial x \, \partial \theta_i} > 0$ for all $x$ and $\theta_i$;
A4.2. $\frac{\partial^2 v_i(x, \theta_i)}{\partial x \, \partial \theta_i} < 0$ for all $x$ and $\theta_i$.

A1-A4 are the basic assumptions of mechanism design. A further standard assumption is the common knowledge of rationality: the knowledge that the opponent is rational, the knowledge that the opponent knows that his opponent is rational, and so on ad infinitum. In this paper, we consider the case with a finite number of rationality iterations. This frame of reasoning is described by the following model (Nagel, 1995; Crawford and Iriberri, 2007).
Level-k. Consider a game of incomplete information where the payoffs are given by $u_i(s; \theta_i)$, for each player $i \in I$ of type $\theta_i$ and strategy profile $s = (s_1, s_2, \dots, s_{|I|})$, where $s_i \in S$. (We use $s_i$ and $s_i(\theta_i)$ interchangeably.) We look at players who engage in iterations of best reply, following Nagel (1995). The Lk strategy $s_i^{(k)}(\theta_i)$ is recursively defined as the function of the player's type $\theta_i$ that maximizes his expected payoff against the level-$(k-1)$ profile $s_{-i}^{(k-1)}(\theta_{-i})$.¹⁰ As the starting point of the recursion, the model features non-strategic L0 players, who can be modeled in two alternative ways (see Crawford, Costa-Gomes, and Iriberri, 2013). In one specification, the L0s always reveal their type truthfully; in the other, the L0s' actions are drawn from a random distribution. In the version with random L0s, we denote the associated cumulative distribution function by $\Phi$ and assume, as is standard in the Lk model, that $\Phi$ is known.¹¹ The formal definition is then the following.
Definition. For $k \geq 1$, the optimal strategy $s_i^{(k)}$ maximizes the expected payoff of player $i$ against $s_{-i}^{(k-1)}$:

$s_i^{(k)}(\theta_i) = \arg\max_{s_i \in S} \mathbb{E}_{\theta_{-i}} \left[ u_i\left(s_i, s_{-i}^{(k-1)}(\theta_{-i}); \theta_i\right) \right]$, (2)

where $\theta_{-i}$ is the residual profile of types. For $k = 0$, action $s_i^{(0)}$ is either the true type $\theta_i$ (truthful L0) or a draw from the distribution $\Phi$ (random L0).

The following simple lemma establishes the connection between the Lk and equilibrium strategy profiles.

Lemma 1. If $s^{(k)}(\theta) = s^{(k-1)}(\theta)$ for all $\theta$, then the strategy profile $s^{(k)}$ is a Bayes-Nash equilibrium.¹²

The lemma follows immediately from Equation (2) and the Bayes-Nash equilibrium conditions: the strategy profile $s^{(k)}(\theta)$ that satisfies the condition of Lemma 1 is a fixed point of the best-reply correspondence (2).

Choice Rules and Mechanisms
For a quasilinear utility representation (1), we define a choice rule $x^*(\theta)$ as efficient if it maximizes the total welfare for every profile of agents' types $\theta = (\theta_1, \theta_2, \dots, \theta_{|I|})$:¹³

$x^*(\theta) = \arg\max_{x \in X} \sum_{i \in I} v_i(x, \theta_i)$. (3)

¹⁰ In other words, the player believes with certainty that the opponents make exactly $k - 1$ iterations of best reply. In contrast, the cognitive hierarchy model assumes that an Lk player attributes strictly positive probabilities to all the levels of rationality lower than $k$.
¹¹ Otherwise the optimal Lk strategies are not well-defined.
¹² Assuming strict concavity of the payoff functions.
¹³ We restrict attention to strictly convex problems, such that for all $\theta \in \Theta^{|I|}$ the solution $x^*(\theta)$ to the welfare maximization problem is unique.

A (direct) mechanism is a system of communication and decision-making, where the privately informed agents report their payoff types and the central authority assigns the allocation and transfers based on the submitted reports. Formally, it is a tuple $(x(s), T_1(s), T_2(s), \dots, T_{|I|}(s))$ of allocation and transfers, such that the payoffs in the mechanism are given by:

$u_i = v_i(x(s), \theta_i) + T_i(s)$. (4)

A mechanism implements choice rule $x(s)$ if the profile of truth-telling strategies, $s_i = \theta_i$ for all $i$, is an equilibrium.

Expected Externality Mechanism
The expected externality mechanism introduced in d'Aspremont and Gerard-Varet (AGV, 1979) implements the efficient allocation in a Bayes-Nash equilibrium. In this mechanism, the center chooses the allocation $x^*(s)$ defined in (3) and assigns the following monetary transfers to the participants:

$T_i(s) = t_i(s_i) - \frac{1}{|I|-1} \sum_{l \neq i} t_l(s_l)$, (5)

where

$t_i(s_i) = \mathbb{E}_{\theta_{-i}} \left[ \sum_{j \neq i} v_j\left(x^*(s_i, \theta_{-i}), \theta_j\right) \right]$. (6)

The transfer $T_i$ is constructed such that agent $i$ internalizes the expected effect of his report on others. The incentive part $t_i$ represents the monetary value of the externality imposed by the agent's report $s_i$ on others' welfare; the externality is evaluated under the assumption that the other agents report their types truthfully. Therefore, if the agent also expects others to report truthfully (the equilibrium assumption), then his incentives are aligned with total welfare maximization, and there is no benefit in misrepresenting his own true preferences. Thus, in the Bayesian setting, the transfer induces truth-telling in equilibrium (d'Aspremont and Gerard-Varet, 1979). Their result immediately implies that in the truthful-L0 specification of the Lk model efficient implementation obtains for any k.
The second part of the transfer, $\frac{1}{|I|-1} \sum_{l \neq i} t_l(s_l)$, guarantees that the mechanism satisfies ex post budget balance, i.e., its transfers sum up to zero for any profile of reports $s$ (and, in particular, in the level-k model). Observe that the budget-balancing term $\frac{1}{|I|-1} \sum_{l \neq i} t_l(s_l)$ does not depend on agent $i$'s own report $s_i$. Therefore this term does not affect best replies and can be omitted in the Lk analysis.
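The budget-balance property can be checked numerically. The sketch below takes the incentive parts $t_i$ as given numbers (the values are arbitrary and purely illustrative) and assumes that the budget-balancing term enters each agent's transfer with a negative sign; the function name `agv_transfers` is our own.

```python
def agv_transfers(incentive_terms):
    """Total transfers T_i = t_i - (1 / (n - 1)) * sum of t_l over l != i."""
    n = len(incentive_terms)
    total = sum(incentive_terms)
    return [t_i - (total - t_i) / (n - 1) for t_i in incentive_terms]


incentive_terms = [3.0, -1.5, 0.25, 7.0]  # arbitrary illustrative values of t_i
transfers = agv_transfers(incentive_terms)
print(abs(sum(transfers)) < 1e-12)  # True: transfers sum to zero
```

Because each $t_l$ is subtracted from the other $n - 1$ agents in equal shares of $1/(n-1)$, every incentive payment is financed by the remaining agents and the sum of transfers is zero whatever the reports are.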

Level-k in the Mechanism
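In the expected externality mechanism, an Lk player, $k \geq 1$, maximizes the expected gain in the mechanism:

$\mathbb{E}_{\theta_{-i}} \left[ v_i\left(x^*(s_i, s_{-i}^{(k-1)}(\theta_{-i})), \theta_i\right) \right] + t_i(s_i)$. (7)

Given the incentive transfer (6), the optimal Lk strategy in the mechanism is defined by the following:

$s_i^{(k)}(\theta_i) = \arg\max_{s_i} \left\{ \mathbb{E}_{\theta_{-i}} \left[ v_i\left(x^*(s_i, s_{-i}^{(k-1)}(\theta_{-i})), \theta_i\right) \right] + \mathbb{E}_{\theta_{-i}} \left[ \sum_{j \neq i} v_j\left(x^*(s_i, \theta_{-i}), \theta_j\right) \right] \right\}$. (8)

By Lemma 1, a strategy profile that satisfies $s^{(k)}(\theta) = s^{(k-1)}(\theta)$ for all $k$ and $\theta$ is a Bayes-Nash equilibrium. In particular, if truth-telling obtains at all levels up to $k - 1$, for all $i$ and $\theta_i$, then we can substitute $s_{-i}^{(k-1)}(\theta_{-i}) = \theta_{-i}$ into (8) and obtain the equilibrium condition:

$s_i^{(k)}(\theta_i) = \arg\max_{s_i} \mathbb{E}_{\theta_{-i}} \left[ v_i\left(x^*(s_i, \theta_{-i}), \theta_i\right) + \sum_{j \neq i} v_j\left(x^*(s_i, \theta_{-i}), \theta_j\right) \right]$. (9)

Since $x^*$ maximizes the sum of utilities, it must be the case that $s_i^{(k)}(\theta_i) = \theta_i$. However, if an agent does not expect his opponents to report their types truthfully, he will not reveal his true type either. The following section starts with an example of this.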

Results
Example. Consider a setting with $n$ players and a quadratic utility representation: $v_i(x, \theta_i) = -(x - \theta_i)^2$. In this setup, agent $i$ has a bliss point at $\theta_i$ and incurs quadratic loss as the allocation departs from it. It is easy to verify that the socially efficient allocation (that maximizes the sum of utilities) is the average of individual bliss points: $x^*(\theta) = \frac{1}{n} \sum_i \theta_i$. In the appendix, we prove the following simple lemma:
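To see how truth-telling can fail in this example, the following sketch finds an L1 best reply by brute force for n = 2. It is an illustration under assumptions that are not in the text: types and L0 reports take finitely many equally likely values, the budget-balancing term of the transfer is dropped, and the names `l1_best_reply`, `type_values` and `l0_reports` are hypothetical.

```python
def l1_best_reply(theta_i, type_values, l0_reports, grid):
    """Grid-search the L1 report of a player with bliss point theta_i (n = 2)."""

    def expected_gain(report):
        # Own utility: the opponent's report is believed to be a random L0 move,
        # and the allocation is the average of the two reports.
        own = sum(-((report + s_j) / 2 - theta_i) ** 2 for s_j in l0_reports)
        own /= len(l0_reports)
        # Incentive transfer: the opponent's utility, evaluated as if he
        # reported his true type, averaged over the type distribution.
        externality = sum(-((report + t_j) / 2 - t_j) ** 2 for t_j in type_values)
        externality /= len(type_values)
        return own + externality

    return max(grid, key=expected_gain)


grid = [i / 1000 for i in range(1001)]  # candidate reports in [0, 1]
type_values = [0.4, 0.6, 0.8]           # mean type 0.6
l0_reports = [0.2, 0.4, 0.6]            # mean L0 report 0.4
report = l1_best_reply(0.5, type_values, l0_reports, grid)
print(report)  # about 0.6, above the true type of 0.5
```

With the mean type above the mean L0 report, the L1 player over-reports; feeding the same values into both arguments (so that the two distributions coincide) returns the true type.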

Lemma 2
In the quadratic case, the optimal Lk strategy, $k \geq 1$, for player $i$ is given by (10), where $\Delta = \int \theta \, dF(\theta) - \int s \, d\Phi(s)$ denotes the difference between the average type and the average random move of an L0 player.
The Lk strategy (10) has several interesting properties. First, the size of the distortion diminishes as the level of rationality k increases. As k goes to infinity, the optimal strategies converge to truth-telling. This holds for any distributions F and Φ with finite moments. Second, if the distributions have equal means, $\int \theta \, dF(\theta) = \int s \, d\Phi(s)$, then truth-telling obtains at every level of rationality, starting from k = 1. Third, the absolute size of the discrepancy $\Delta \times \left(\frac{n-1}{n}\right)^{k+1}$ between the true type $\theta_i$ and the Lk report $s_i^{(k)}(\theta_i)$ increases in the number of players n.

Next we study these properties in a more general setup. We maintain, however, that the efficient rule is linear in (a function of) the reported types. Formally, we make the following assumption:

A5. $\frac{\partial^2 x^*(s)}{\partial s_i \, \partial s_j} = 0$ for all $i \neq j$.

Henceforth, we refer to A5 as the 'neutrality condition'. It implies that the marginal effect of an agent's report on the efficient allocation is not influenced by the report of another agent. The condition is satisfied whenever the efficient allocation $x^*$ is a linear combination of types, in particular if it is the average of types. For instance, an environment with $v_i(x, \theta_i) = -(x - \theta_i)^{2p}$, for some $p \in \mathbb{N}$, and $n = 2$ satisfies A5. Neutrality is a necessary condition for the proof of our main result, Proposition 2. In Section 4, we discuss the case when neutrality is violated.
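The level-by-level pattern in the quadratic example can be traced with a short recursion. The sketch rests on our own reading of the first-order condition in that example, not on the lemma's closed form: a player facing opponents whose mean report falls short of the mean type by some gap shifts his report up by (n - 1)/n times that gap. It therefore pins down only how the bias changes from one level to the next, not the indexing of expression (10). The name `lk_bias_path` and the numbers used are illustrative.

```python
def lk_bias_path(delta, n, max_level):
    """Bias (report minus true type) at levels 1, ..., max_level.

    delta: mean type minus mean random L0 report.
    n: number of players in the mechanism.
    """
    ratio = (n - 1) / n
    gap = delta  # mean type minus mean report of the level below
    biases = []
    for _ in range(max_level):
        bias = ratio * gap  # compensate the opponents' expected shortfall
        biases.append(bias)
        gap = -bias         # the next level faces opponents biased by `bias`
    return biases


biases = lk_bias_path(delta=0.2, n=4, max_level=6)
for level, bias in enumerate(biases, start=1):
    print(f"L{level}: bias {bias:+.4f}")
```

In this sketch the bias flips sign at every level and shrinks by the factor (n - 1)/n, so a group that mixes two consecutive levels has a smaller average bias than either level alone.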
Observe that the first level, L1, is central to the analysis. As we will see next, if no distortion of truth-telling appears at L1, then no distortion will be observed at any subsequent level. Therefore we focus on the behavior of L1's in the mechanism. Once we identify the departures from truth-telling at the first level, we study whether it dissipates at higher levels and what the implications for the AGV mechanism are.
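Recall that an L1 maximizes his expected payoff under the belief that his opponent makes a random report. The L1 optimal strategy (best reply) in the mechanism is given by:

$s_i^{(1)}(\theta_i) = \arg\max_{s_i} \left\{ \mathbb{E}_{s_{-i} \sim \Phi} \left[ v_i\left(x^*(s_i, s_{-i}), \theta_i\right) \right] + \mathbb{E}_{\theta_{-i} \sim F} \left[ \sum_{j \neq i} v_j\left(x^*(s_i, \theta_{-i}), \theta_j\right) \right] \right\}$, (11)

where $x^*(s_i, s_{-i})$ is the efficient social choice rule defined in Equation (3). The analysis of the optimal strategy yields the following simple result.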
Proposition 1 Under assumptions A1-A3, truth-telling is optimal at all levels of rationality if the distribution of random strategies Φ and the distribution of types F coincide.
Proposition 1 establishes the equivalence between the equilibrium and Lk predictions of the mechanism's outcome. It implies that whether the agents stop at a finite level of reasoning or engage in equilibrium thinking is irrelevant as long as the perceived distribution of random strategy coincides with the distribution of type. Proposition 1 trivially extends to the cognitive hierarchy (CH) model, because both Lk and CH models define L1 equivalently. Overall, the AGV mechanism achieves efficient implementation in four models of reasoning: Lk and CH with truth-telling L0s; Lk and CH with random L0s and F = Φ.
If distributions F and Φ do not coincide, Lk agents do not report truthfully. Next we show that systematic biases in reports (under- or over-reporting for all realizations of type) occur if F and Φ are ordered in the sense of first-order stochastic dominance. That F dominates Φ means that the probability that a type exceeds a given threshold is always at least as high as the probability that a random move exceeds the same threshold. For instance, if Φ represents a distribution of values obtained from a prior survey, and F represents the true distribution, then a dominance relation between the distributions may arise if the survey sample is biased.
Denote the first-order stochastic dominance relation by $\succ_{FOSD}$. The following proposition states in which direction a level-1 player's report is going to be distorted.
Proposition 2 Under assumptions A1-A5, the L1s distort their type reports downwards, if $\Phi \succ_{FOSD} F$, and upwards, if $F \succ_{FOSD} \Phi$.

The proof of the proposition is given in the Appendix. We start with the observation that any n-player problem can be reduced to a problem with 2 players, because the stochastic dominance relation is preserved under monotone transformations and summation of random variables. This is the content of Lemma A in the Appendix. Then, in the framework with 2 players, we analyze the first-order condition that corresponds to the payoff-maximization problem (11) to obtain the result.
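The dominance relation can be checked directly on samples. The sketch below compares two empirical distributions; the helper names `empirical_cdf` and `dominates` and the sample values are our own, and the check uses the weak form of the definition (the dominating distribution's CDF lies at or below the other at every threshold).

```python
from bisect import bisect_right


def empirical_cdf(samples):
    """Return t -> share of samples that are <= t."""
    ordered = sorted(samples)
    count = len(ordered)
    return lambda t: bisect_right(ordered, t) / count


def dominates(a_samples, b_samples):
    """True if the empirical distribution of a dominates that of b (FOSD)."""
    cdf_a, cdf_b = empirical_cdf(a_samples), empirical_cdf(b_samples)
    # Both CDFs are step functions, so comparing them at every jump point
    # of either one is enough.
    thresholds = sorted(set(a_samples) | set(b_samples))
    return all(cdf_a(t) <= cdf_b(t) for t in thresholds)


true_types = [0.4, 0.6, 0.8]     # stands in for draws from F
survey_values = [0.2, 0.4, 0.6]  # stands in for a downward-biased survey, Phi
print(dominates(true_types, survey_values))  # True
print(dominates(survey_values, true_types))  # False
```

Two distributions need not be ordered at all: for the samples [0, 1] and [0.5, 0.5] the function returns False in both directions.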
The proposition states that level-1 players systematically (that is, for every realization of type) misreport their types, if the distributions of types and of random strategies are ordered in the sense of first-order stochastic dominance. In particular, if player i expects player j to report a higher type than j has on average, then i will report a lower type than he actually has (and vice versa), even if this induces a less preferred allocation. What is the intuition behind that? In the AGV mechanism, agent i gets utility from the social choice based on his and j's reported preferences, plus the expected payoff of agent j had he told the truth to the principal. Suppose first that a high type values the size of the alternative more than a low type ('positive SMC', as in A4.1). If agent i knows that agent j tends to over-report his preferred allocation, then -since i benefits from satisfying j's true preferences in expectation -he would adjust the social choice downward by under-reporting himself. If higher types prefer lower alternatives, then j's over-reporting makes the chosen alternative lower and i over-reports to shift it back up. In either case, the level-1 player compensates the counterpart's random behavior by misreporting their types in the opposite direction.
Recall from the example of the previous section that the distortion of reports by level-1 players feeds into the optimal strategies of level-2 players, level-3 and so on, whereas the size of distortion decreases and the limiting optimal strategy is truth-telling. The following proposition states a similar result for a more general setting of an arbitrary social choice rule that satisfies neutrality.
The expected absolute deviation of reported from true types decreases with the level of rationality. The sign of the expected deviation alternates as the level of rationality increases by one. Thus the optimal level-k strategies follow a similar pattern as the example of Section 2. If level-2 players overstate their type in the game, then level-3 players will understate them. Note that this is good news for the AGV mechanism: if the group of agents is a mix of, say, level-2 and level-3 players, then the expected chosen alternative will be closer to the one maximizing the true welfare.

Extension
The assumption of neutrality implies that the marginal effect of an agent's report on the allocation choice is unaffected by another agent's report. However, in certain preference environments, this assumption may be violated. For instance, if agent i of an extreme type knows that his biased report affects the mechanism's reaction to j's report in such a way that the total distortion becomes even stronger, he may prefer not to misreport in the direction that Proposition 2 suggests.
This can be demonstrated by the following example. Suppose that the agents' preferences are given by v i = θ i x, where the allocation x takes values 0 or 1 (whether or not to build an airport), and types range between -10 and 10. This implies that, when there are two agents in the mechanism, the optimal decision is to undertake the project, x * = 1, if θ 1 + θ 2 > 0 and decline, x * = 0, otherwise. Suppose that F dominates Φ, and it holds for both distributions that the mass on the negative side of the support (−10, 0) is very small, and the mass on the positive side (0, 10) is very large (a small minority suffers from having the airport around, while a large majority benefits).
Proposition 2 says that, due to the dominance relation between F and Φ, L1 players will tend to over-report their types. Consider, however, an agent of type −10 who overstates his type and reports −9. This raises the probability of project implementation from 0 to 1/10 − 2ε ≡ π. The expected externality equals 9.5π while i's expected utility is −10π, such that his total payoff in the mechanism is negative.¹⁹ Thus, contrary to what Proposition 2 suggests, the agent is strictly better off by reporting his type truthfully (in which case he gets the zero payoff). By over-reporting his type he will increase the probability that the project is undertaken.
The result of Proposition 2 does not apply in this example since the agents' reports are perfect substitutes when one agent's type is the negative of the other: $\theta_1 = -\theta_2$. A similar problem arises when types are complements. The formal definitions are as follows.
Agents' types are complements with respect to the efficient allocation if:

$\frac{\partial^2 x^*(s)}{\partial s_i \, \partial s_j} > 0$.

Agents' types are substitutes with respect to the efficient allocation if:

$\frac{\partial^2 x^*(s)}{\partial s_i \, \partial s_j} < 0$.

If types are substitutes, a higher report by agent i lowers the marginal effect of the opponent's report. If types are complements, the interaction is the opposite: the marginal effect of j's report increases with the report of agent i.

¹⁹ Recall that we omit the budget-balancing term of the AGV transfer, since it does not affect strategy choice. The agent's payoff in the mechanism is then given by $u_i = v_i(x^*(s), \theta_i) + t_i(s_i)$.
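The payoff comparison in the airport example is simple arithmetic. The sketch below plugs in a value for ε, which the text leaves unspecified, so the number 0.01 is purely hypothetical; the function and parameter names are also our own, and the figure 9.5 is the one used in the text for the expected externality per unit of probability.

```python
def overreport_payoff(epsilon, own_type=-10.0, externality_per_unit=9.5):
    """Expected payoff of the type -10 agent who reports -9 instead."""
    pi = 1 / 10 - 2 * epsilon  # probability that the project is built
    expected_externality = externality_per_unit * pi
    expected_utility = own_type * pi
    return pi, expected_utility + expected_externality


pi, payoff = overreport_payoff(epsilon=0.01)  # epsilon value is hypothetical
print(payoff < 0)  # True: over-reporting is worse than the zero payoff from truth-telling
```

Whatever small ε is chosen, the payoff equals −0.5π, which is negative as long as π is positive.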
The following propositions state results that parallel Proposition 2 in the mechanism with two players. In this part of the analysis, we distinguish between positive (A4.1) and negative (A4.2) single crossing. Recall that, in the positive case, higher types receive higher marginal utility from allocation. In the negative case, the marginal utility diminishes with type. We separate the environments into four groups according to two criteria: first, whether the single-crossing holds as positive or as negative, and, second, whether the chosen alternative's increment due to an increase in one agent's report increases or decreases with the other agent's report (types are complements or substitutes). In these propositions, we additionally assume the monotone likelihood ratio property (MLRP).

Proposition 4
Under assumptions A1-A4.1 and MLRP, in a complements (substitutes) environment, the agents with sufficiently low (high) types distort their reports downwards, if $\Phi \succ_{FOSD} F$, and upwards, if $F \succ_{FOSD} \Phi$.

Proposition 5
Under assumptions A1-A4.2 and MLRP, in a complements (substitutes) environment, the agents with sufficiently high (low) types distort their reports downwards, if $\Phi \succ_{FOSD} F$, and upwards, if $F \succ_{FOSD} \Phi$.
Propositions 4 and 5 make four distinct claims. Let us consider, for example, the first claim: if high types tend to have high valuations (A4.1: positive single-crossing) and the efficient social choice rule is more sensitive to i's type if j's type is high (i.e., types are complements), then low-valuation players will tend to misreport their type so as to compensate for the bias in the other player's report. This claim is the same as Proposition 2, except that high-valuation agents are excluded. If there is first-order stochastic dominance in distributions, in the neutral case, agent i displays compensating behavior: i systematically under- or over-reports, regardless of whether his true type is high or low. However, in a non-neutral case this is different. Continuing with the first claim for illustration, we observe that under its conditions the mechanism becomes more sensitive to j's misreporting in the range where i's type is high. Therefore i's compensating reporting strategy has an additional indirect effect on the allocation choice (see proof in the Appendix). For this reason, both Propositions 4 and 5 include only the type ranges that correspond to sufficiently weak sensitivity of the social choice rule to the other agent's report. Types in the weak-sensitivity regions display compensating behavior.

Lk as a Learning Algorithm
Observe that the Lk model can be thought of as a model of learning. Suppose that a symmetric incomplete information game is played repeatedly by a large number of players. There is a common prior over types and types are independent. At the end of each repetition, the players observe each other's actions and types. At date 0, each player chooses a random action. At date 1, each player best-replies to the profile of actions played at date 0. At every subsequent date k, each player best-replies to his opponents' strategies at k-1.²² Note that this implies that all the players play the same strategy as a function of type (not the same action).
Observe that the strategy played at a given date k (by all players) corresponds to an Lk strategy.²³ Therefore, Proposition 3 implies that such a learning procedure converges to truth-telling in the AGV mechanism.
Does truth-telling convergence obtain with other learning procedures? For complete information games, Monderer and Shapley (1996) described a learning algorithm that is similar to the above interpretation of the Lk model. In their learning algorithm, called the improvement path, one player improves his payoff at a given date k, while the rest play as in k-1. In the appendix, we extend the improvement path algorithm to games of incomplete information and show that the game induced by the AGV mechanism with quadratic utility is a potential game for any type profile θ. Applying the result of Monderer and Shapley, we conclude that in the quadratic case the improvement path leads to truth-telling in the AGV mechanism. This, however, is not true in general: even in neutral concave environments that are not quadratic, the AGV mechanism does not induce a potential game for all type profiles θ. This latter finding relates to Sandholm (2005), who designs an indirect mechanism with the potential game property and shows that convergence to efficiency obtains in a very large class of learning dynamics. Lifting the concavity assumption made in this paper, Mathevet (2010) designs supermodular mechanisms with good learning properties. When a game is a potential game or a supermodular game, a large class of learning algorithms converges to equilibrium.

²² If the number of players is large and the game is symmetric, strategies can be inferred from the observed actions and types.
²³ Note that the learning interpretation of the cognitive hierarchy model is closer to fictitious play, since it takes into account the weighted average of the whole past, and not just the last period.
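The dated best-reply procedure can be simulated for the quadratic case. Everything specific in the sketch below is an assumption made for illustration: the mechanism is taken to be run among groups of `group_size` players drawn from a large population, types sit on an even grid whose mean equals the prior mean of 0.5, date-0 actions are uniform on [0, 0.6], and the best reply uses our own reading of the quadratic first-order condition (shift the report by (n - 1)/n times the gap between the prior mean type and the mean action observed at the previous date).

```python
import random


def simulate_learning(num_players=300, group_size=3, dates=25, seed=0):
    """Return the absolute deviation of reports from types at dates 1..dates."""
    rng = random.Random(seed)
    # Evenly spaced types in (0, 1); their mean equals the prior mean 0.5.
    types = [(i + 0.5) / num_players for i in range(num_players)]
    prior_mean = 0.5
    ratio = (group_size - 1) / group_size
    # Date 0: random actions, here with a downward bias relative to types.
    actions = [rng.uniform(0.0, 0.6) for _ in types]
    deviations = []
    for _ in range(dates):
        observed_mean = sum(actions) / len(actions)  # last date's behavior
        shift = ratio * (prior_mean - observed_mean)
        actions = [theta + shift for theta in types]  # same strategy for all
        deviations.append(abs(shift))
    return deviations


path = simulate_learning()
print(path[0] > path[-1])  # True: reports move toward truth-telling
```

At every date all players use the same strategy as a function of type, and in this sketch the common deviation from truth-telling shrinks from date to date.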

Conclusion
The idea of relaxing the pervasive common knowledge assumption, often referred to as the Wilson doctrine, has motivated recent research in mechanism design. Significant progress was made in studying implementation in frameworks approaching the universal type space, where higher-order beliefs are virtually unrestricted. Kets (2012) extends the notion of type space further to allow finite depths of reasoning, as in the level-k model. The next natural step for mechanism design is to accommodate the extended notion and search for mechanisms that are robust with respect to changes not only in the structure of beliefs, but also in the depth of reasoning (as mentioned in the discussion, learning to play the mechanism is a related issue). As a first step, this paper studies one of the most influential existing mechanisms, that of d'Aspremont and Gerard-Varet (1979), in the Lk environment.
The AGV mechanism implements the efficient choice rule under the common prior and common knowledge assumptions. It is conceptually similar to the Vickrey-Clarke-Groves (VCG) mechanism, which taxes the agents with the amount of negative externality that their preference report exerts on the welfare of other agents. The VCG mechanism implements the efficient social choice rule in dominant strategies, and hence is independent of the beliefs. On the downside, the VCG mechanism fails to satisfy the overall budget constraint. The expected externality mechanism has the advantage of being exactly budget balanced, but it comes at the cost of achieving Bayesian, as opposed to dominant-strategy, implementation. In the light of the Lk model, this is not entirely innocuous.
We show that if there is a systematic difference between the perceptions of random-L0 moves and the true types, the agents will distort their reports at the first level and, by extension, also at the higher levels of rationality. We observe compensating behavior of finite-level players in an AGV mechanism, that is, distorting one's report in the opposite direction to the anticipated bias of the opponents. This is due to the fact that the AGV mechanism rewards for the expected externality, where the expectation is measured with respect to the true types. A simple implication of this result is that the AGV mechanism could use the distribution of random moves, as opposed to types, to achieve truth-telling among Lk agents.
Nevertheless, our results, put together, vindicate the AGV mechanism in convex environments. First, in the truthful-L0 specification there is no distortion of truth-telling and efficiency. Second, if there is a distortion of truth-telling, its sign alternates and its absolute value decreases with k. Therefore, in mixed groups of agents with various levels k the biases cancel out and the mechanism's outcome is close to efficiency. This also implies that starting from L2 in the cognitive hierarchy model best replies are located within a smaller neighborhood of truth-telling. Third, our convergence result suggests that, in repeated interactions where the agents can observe others' strategies, equilibrium becomes an increasingly better approximation and the expected externality mechanism achieves efficiency.

A Appendix
Proof We proceed by induction. Suppose that for $k - 1$ it holds that:
Level-$k$ optimal strategy is best reply to the profile of strategies $s^{(k-1)}(\theta_j)$, where the expectation is taken with respect to the opponents' types $\theta_{-i}$.
If (12) holds on level $k - 1$, it also holds on level $k$. Level-1 strategy is best reply to the profile of random moves:
Thus for L1 the induction formula (12) applies.
(Lemma 2)
Proof The first-order condition (henceforth f.o.c.) for the maximization problem (11) is the following:
Given that $x^*(s_i, s_{-i})$ is the efficient choice rule, it must hold that
Then the second term of (13) can be rewritten, such that the f.o.c. becomes: 27
−i and $\theta_{-i}$ is the same random variable), then $s_i = \theta_i$ satisfies the first-order condition (14) and thus s
Consider an L1 problem $P_n$ with $n$ players and $F \prec_{FOSD} \Phi$ ($\Phi \prec_{FOSD} F$). There exists an L1 problem $P_2$ with 2 players and a pair of distribution functions such that the solution to $P_2$ is also a solution to $P_n$.
Proof First, we observe that $\frac{\partial^2 x^*}{\partial s_i \partial s_j} = 0$ (A5) implies that $x^*(s_1, \dots, s_n) = \sum_i \lambda_i s_i$ for some scalars $\lambda_i > 0$. 28 Condition (14) can be rewritten as follows:
$s_i$ that satisfies this condition is a solution to $P_n$. From Theorem 1.A.3 in Shaked and Shanthikumar (2007)
$s_\Sigma$ and $\theta_\Sigma$ correspond to the random action and type of a fictitious second player in $P_2$. In this problem $P_2$ the first-order condition reads as follows:
It is then clear that the solutions to problems $P_n$ and $P_2$ coincide.
27 The second order condition (s.o.c.) .
28 This assumes that types are relabeled: if $v_i(x, \theta_i) \equiv \tilde{v}_i(x, h(\theta_i))$, then we consider $\tilde{v}_i(x, \tilde{\theta}_i)$ with type $\tilde{\theta}_i = h(\theta_i)$. If A4.2 holds ("negative" SMC), let $\tilde{\theta}_i = -h(\theta_i)$.
Statement. The L1 strategy in the AGV mechanism is given by ($n = 2$):
Proof. Rewrite (14) as follows:
Integrate the second term of Equation (16) by parts:

Modify the first term of Equation (16) by taking a Taylor expansion under the integral:
where $\tilde{\theta}_i$ is between $s_i$ and $\theta_i$, and integrate by parts:
Observe that due to the equal support of the two distribution functions $F$ and $\Phi$:
Thus, the f.o.c. becomes:
We can rewrite the solution as follows:
If $F(t) - \Phi(t) \equiv 0$, then s −i :
Proof The efficiency of the social choice rule $x^*$ implies that for all $t_i$, $t_{-i}$:
Differentiate with respect to $\theta_i$:
From the s.o.c. of the same problem, and obtain: −i )) = sgn( i (θ i ))).
Given A4 (i.e., the sign of $\frac{\partial^2 v_i}{\partial x \partial \theta_i}(x, \theta_i)$ is the same for all $(x, \theta_i)$), the result is proven.
Proof From Lemma B, the first-order condition for the L1 maximization problem when $n = 2$ is given by Equation (17). Lemma C (p. 23) shows that the denominator of the expression is positive. Let us transform the numerator as follows:
The signs marked above are determined by the following. (1) ; $s_i$) < 0 by the concavity of preferences; (2) By Lemma C (p. 23), $\frac{\partial^2 x^*}{\partial s_i \partial s_{-i}}(s_i, t) = 0$ by neutrality.

Proposition 3.
Statement Suppose that A1-A5 hold, and $F \prec_{FOSD} \Phi$ or $\Phi \prec_{FOSD} F$. Then for all $i$, $\lim E_{\theta_i} s_i(\theta_i) = \arg\max$
Remark Recall from Proposition 2 that either $s_i^{(1)}(\theta_i) \geq \theta_i$ $\forall \theta_i$, or $s_i^{(1)}(\theta_i) \leq \theta_i$ $\forall \theta_i$. By induction, the equation above implies that the same is true for all levels $k$: either s
Moreover, from the proof of Lemma C we know that
For $\tilde{\theta}_i$ we have, by continuity, as well. Take the expectation of both sides: as types are independent and the distributions of types coincide,
Consider the sequence E θ i s

This concludes the proof of Proposition 3.

(P 3)
Proposition 4. Let us separate the statements of Proposition 4 and refer to them as Proposition 4a and Proposition 4b, respectively. Bold face is used to emphasize the differences between the two statements:
Proposition 4a: Under A1-A4.1, MLRP and complements environment, $\exists t^*$ such that for all $\theta_i < t^*$ if Φ F then $s_i^{(1)}(\theta_i) < \theta_i$, and if F Φ then s
Proof Given the non-neutrality, $\frac{\partial^2 x^*}{\partial s_i \partial s_{-i}}(s_i, t)$, we need to decompose the denominator of Equation 17. Start with the case of Proposition 4a:
Clearly, if $\frac{\partial^2 v}{\partial x^2}(x^*(s); \theta_i)$ varies with $\theta_i$ then so does $\frac{\partial^2 u_i}{\partial s_i \partial s_j}(s, \theta_i)$; therefore, the conditions of Theorem 4.5 are not satisfied.