The expected externality mechanism in a level-k environment

Mechanism design theory strongly relies on the concept of Nash equilibrium. However, studies of experimental games show that Nash equilibria are rarely played and that subjects may be thinking only a finite number of iterations. We study one of the most influential benchmarks of mechanism design theory, the expected externality mechanism (D’Aspremont and Gerard-Varet, J Public Econ 11:25–45, 1979) in a finite-depth environment described by the Level-k model. While the original mechanism may fail to implement the efficient rule in this environment, it can be adjusted to restore efficiency.

the allocation that maximizes total welfare. The major challenge to efficient implementation is that information about individual preferences is private. In a setting with quasi-linear utilities, D'Aspremont and Gérard-Varet (1979) construct an ingenious mechanism that aligns the agents' individual incentives with total welfare maximization. In a Bayes-Nash equilibrium, the agents reveal their types to the principal, and thus efficiency can be achieved. The AGV mechanism has become an essential building block of mechanism design theory (Athey and Segal 2013).
Since the AGV mechanism is tailored to the concept of Bayes-Nash equilibrium, its success in inducing truth-telling and, therefore, efficiency in practice depends on (1) whether the participants' behavioral response to the mechanism coincides with the Bayes-Nash prediction and, if it does not, (2) whether efficiency still obtains under the possible deviations. While the first question has not been addressed directly in the literature, the experimental results in (simpler) complete information games suggest that the answer may be negative. As to the second question, little is known about the loss of efficiency if the participants do not play equilibrium. This paper tries to fill this gap by studying how the mechanism performs in a behavioral framework where, contrary to the requirement of Bayes-Nash equilibrium, the agents conduct only a limited number of iterations of reasoning. The choice of the behavioral setting follows a large body of evidence from experimental games. Recent surveys by Crawford et al. (2009) and Camerer and Ho (2015) show that non-equilibrium models with finite depth of reasoning, such as the Level-k model (Lk; Nagel 1995; Stahl and Wilson 1994; Costa-Gomes et al. 2001; Costa-Gomes and Crawford 2006) and the cognitive hierarchy model (CH; Camerer et al. 2004), systematically outperform equilibrium in predicting human behavior. Along with closely fitting the lab data, these models are able to predict some frequently observed field phenomena such as the winner's curse in common-value auctions (see Crawford and Iriberri 2007). We choose the Lk model due to its tractability, but most of our results also hold in the CH model.
Lk is a model of reasoning prior to a game, where the agent maximizes his payoff against a non-equilibrium belief about other agents' strategies. The belief is constructed in the following iterative process. An agent of level k = 1 ("L1 agent") believes that his opponents ("L0") behave non-strategically.
In incomplete information games, such as the AGV mechanism, L0s can be modeled in two distinct ways: either they truthfully reveal their type ("truthful L0") or they draw their actions (type reports) from a random distribution ("random L0"). An L2 agent best replies to the profile of L1 strategies, L3 best replies to L2, and so on. In general, an Lk strategy is a best reply to the profile of L(k − 1) strategies, suggesting the interpretation that agents try to "outguess" their opponents. To illustrate, consider a seminal game in this literature, where players pick a number between 0 and 100 and the one whose number is closest to some fraction, say one half, of the average wins the game. In this guessing game, if L0s randomize uniformly between 0 and 100, L1s will choose 50/2 = 25, L2s will choose 25/2 = 12.5, etc. As k increases, the best response of Lk approaches 0, the only Nash equilibrium of the game.
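The iteration in the guessing game can be traced with a few lines of code. The following is a minimal sketch, assuming (as in the text) that L0s randomize uniformly on [0, 100], so that their average guess is 50, and that each higher level simply best replies to the guess of the level below; the function name `level_k_guesses` is illustrative only.

```python
def level_k_guesses(max_level, fraction=0.5, l0_mean=50.0):
    """Guesses of levels 0..max_level in the guessing game.

    Entry 0 is the average guess of a uniformly randomizing L0;
    each higher level picks `fraction` times the guess one level below.
    """
    guesses = [l0_mean]
    for _ in range(max_level):
        guesses.append(fraction * guesses[-1])
    return guesses


guesses = level_k_guesses(6)
for k, guess in enumerate(guesses):
    print(f"L{k}: {guess}")
```

The sequence halves at every level, which is the sense in which the Lk best response approaches the Nash equilibrium guess of 0 as k grows.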
This paper applies the Lk model to the AGV mechanism with one-dimensional types. We look at the case where the principal knows the type distribution and expects equilibrium behavior on the part of the agents. Such a principal is ignorant of the fact that he operates in an Lk environment. In this setting we conduct a positive exercise and find conditions under which the mechanism remains robust to Lk. Throughout the paper we assume independent private valuations and utilities that are strictly concave with respect to the allocation. 5 First, we observe that in the truthful-L0 specification of the Lk model the mechanism never produces a loss in efficiency. In that specification, the L1 best reply is given by the equilibrium condition of AGV, which implies truth-telling. By induction, this result extends to any higher level k; therefore, the mechanism chooses the efficient allocation irrespective of the levels prevailing in the population.
Further, in the random-L0 specification of Lk, we show that if the distribution of random actions (L0) coincides with the distribution of payoff types, then the participants at any level larger than zero report truthfully to the mechanism. Next, we analyze the more interesting case where the type distribution used by the planner to assign transfers differs from L1s' expectation of the opponents' actions. In this case, the externality payment generally fails to align the agent's incentives with total expected welfare maximization. As a result, the AGV mechanism does not induce truth-telling and produces a sub-optimal allocation. Denoting the distribution of random L0 strategies by Φ and the distribution of types by F, we study how the relation between Φ and F affects the Lk strategies in the mechanism.
We focus on the case where Φ dominates F (in the sense of first-order stochastic dominance) or vice versa. This corresponds to scenarios where players believe a salient strategy is to systematically under- or over-report one's type. The main result characterizes the deviations from equilibrium behavior for the case that the efficient choice rule is linear in agents' types (the environment we call neutral). If L0 agents are expected to under-report their types, then all types of an L1 agent will over-report their types to the mechanism, and vice versa. Therefore L1 agents display compensatory bias in reports. The distortion carries over to higher levels, but the expected absolute value of the distortion of type decreases as level k goes up; in the case of quadratic utilities, the rate of decrease is exponential. Interestingly, the direction of the bias (i.e., whether the agents over-report or under-report their types) alternates at each iteration from k to k + 1. This result has two interesting implications for the outcome of the mechanism. First, if the pool of agents is a mixture of two subsequent levels (e.g., L2 and L3), the distortion of efficiency is lower than in a group where only one of these levels is present. Second, as k goes up, the outcome approaches efficiency.
The results extend partially to the non-neutral case where types are complements or substitutes with respect to the efficient choice of allocation. Non-neutrality means that the marginal effect of one agent's type on the efficient allocation is not invariant in the other agent's type. In particular, when the other agent's type is high, the marginal effect is stronger in case of complements and weaker in case of substitutes. In either of these environments reports have two countervailing effects on the choice of allocation. The first, direct effect of compensating bias pushes the allocation in the direction of marginal payoff increase. The second, indirect effect changes the choice rule's sensitivity to the opponent's report. Therefore, compensating bias remains a best reply in type ranges where the direct effect dominates. We demonstrate by means of example that the dominance of the indirect effect changes the prediction.
5 We use the assumption of strict concavity to assure that the equilibrium of the AGV mechanism is unique. For an account of the problem of non-uniqueness, see Mathevet (2010).
While the main interest of this paper is positive, we conduct a separate normative analysis of the AGV mechanism. This part is concerned with a principal who is aware of the Lk environment and seeks the appropriate AGV-type mechanism for efficient implementation. In particular, we change the transfer rule to reflect the actual expected externality (under the level-k strategy profile) and thus to elicit the information correctly. The Lk environment is characterized by three components: the type distribution F, the random actions distribution Φ, and the agents' levels k. When all three components are known, the efficient Lk mechanism differs from the original AGV in its transfer to L1 agents only. By correcting the incentives at level 1 the principal restores truth-telling at all levels and achieves efficiency. When the information on F, Φ, or k is missing, the principal can expand the mechanism to elicit the agents' knowledge. One way to do this is to add a betting round where the agents guess each others' reports. Ex post, the principal rewards correct guesses. Betting is a powerful tool for the elicitation of correlated information and turns out to be instrumental in the Lk environment. We show how betting can be used to elicit levels k and other information necessary to construct the efficient mechanism.
This paper is among the first studies of mechanisms in an Lk environment. Crawford (2015) looks at the double auction mechanism and revisits Myerson and Satterthwaite's (1983) impossibility result in the Lk framework. He finds, in particular, that the revelation principle does not hold in this framework since the choice of mechanism influences the correctness of Lk beliefs. Similar to his paper, the normative part of our analysis exploits the predictably incorrect beliefs of Lk agents. De Clippel et al. (2014) provide a characterization of implementable choice functions in a general setup with finite depth of reasoning.
They consider the expected externality mechanism as an example and show that it achieves efficient implementation under the assumption that L0 report truthfully. In contrast, the present paper allows for L0 to be random and arbitrarily far from truthtelling.
The rest of this paper is organized as follows. Section 2 presents the key assumptions, the Lk model in incomplete information games and in the AGV mechanism in particular. Section 3 describes the properties of Lk strategies in the AGV mechanism: equivalence of Lk and equilibrium models in the AGV mechanism, the biases due to first order stochastic dominance and convergence in the neutral environment. Section 4 shows how the AGV mechanism can be adjusted to the Lk environment, and Sect. 5 concludes.

The model
Preferences The preference environment is characterized by the following assumptions:
A1 Utilities are linear in money.
A2 Values are private.
A3 Values are independent draws from a commonly known distribution F with density f.
Assumptions A1 and A2 imply that the utility function of a given agent $i \in I = \{1, 2, \ldots, n\}$, $n \ge 2$, can be represented as
$$u_i = v_i(x, \theta_i) + T_i, \qquad (1)$$
where $v_i(x, \theta_i)$ is the utility derived from allocation $x \in X \subseteq \mathbb{R}$, $\theta_i$ is the privately known preference parameter that we refer to as the agent's type, and $T_i$ is the monetary transfer to agent $i$. Agent types $\theta_i$ are drawn independently from $\Theta$, a compact subset of $\mathbb{R}$, according to a distribution $F$. We assume that $v_i(x, \theta_i)$ is strictly concave in $x$ and continuously differentiable with respect to both arguments on the entire domain. Some of our results require that the preferences satisfy a single crossing (Spence-Mirrlees) condition. The condition postulates that the cross-derivative of $v_i(x, \theta_i)$ with respect to allocation $x$ and type $\theta_i$ has constant sign over the function's domain:
A4 $\partial^2 v_i(x, \theta_i) / \partial x\, \partial \theta_i > 0$ everywhere (A4.1, positive single crossing), or $\partial^2 v_i(x, \theta_i) / \partial x\, \partial \theta_i < 0$ everywhere (A4.2, negative single crossing).
A1-A4 are the basic assumptions of mechanism design. A further standard assumption is that agents play Bayes-Nash equilibrium: the profile of strategies is a fixed point of a best reply correspondence. In this paper, we consider a framework with a finite number of best-reply iterations that do not generally start at equilibrium. This framework is described by the following model (Nagel 1995; Crawford and Iriberri 2007).
Level-k Consider a game of incomplete information where the payoffs are given by $u_i(s; \theta_i)$, for each agent $i \in I$ of type $\theta_i$ and strategy profile $s = (s_1, s_2, \ldots, s_n)$, where $s_i(\theta_i)$, or simply $s_i$, maps agent $i$'s type into an action. We look at agents who engage in iterations of best reply. The Lk strategy $s^{(k)}_i(\theta_i)$ is recursively defined as a function of the agent's type $\theta_i$ that maximizes his expected payoff against the level-$(k-1)$ strategies of his opponents. The agent believes with certainty that his opponents make exactly $k - 1$ iterations of best reply. As the starting point of the recursion, the model features nonstrategic L0 agents whose actions $s^{(0)}_i$ are drawn from a given distribution $\Phi$. By analogy, we say that $s^{(0)}_i$ is an unobserved random mapping such that the induced cumulative distribution of actions is $\Phi$ and the density is $\phi$.
Definition For $k \ge 1$ the optimal strategy $s^{(k)}_i$ maximizes the expected payoff of agent $i$ against $s^{(k-1)}_{-i}$:
$$s^{(k)}_i(\theta_i) \in \arg\max_{s_i} \; \mathbb{E}\, u_i\big(s_i, s^{(k-1)}_{-i}(\theta_{-i}); \theta_i\big),$$
where $\theta_{-i}$ is the residual profile of types. The expectation is taken over the residual types and the mappings $s^{(0)}$.
The following simple observation establishes the relation between the Lk and equilibrium strategy profiles.
Observation: If $s^{(k)}(\theta) = s^{(k+1)}(\theta)$ for some $k \ge 1$ and all $\theta \in \Theta$, then $s^{(k)}$ constitutes a Bayes-Nash equilibrium.
Choice rules and mechanisms For a quasi-linear utility representation (1), we define a choice rule $x^*(\theta)$ as efficient if it maximizes the total welfare for every profile of agents' types $\theta = (\theta_1, \theta_2, \ldots, \theta_n)$:
$$x^*(\theta) \in \arg\max_{x \in X} \sum_{i \in I} v_i(x, \theta_i). \qquad (3)$$
We look at a direct mechanism, where the agents report their types to the principal: $i$'s report $s_i$ is a member of $\Theta$. A mechanism implements the choice rule $x^*(\cdot)$ if the profile of truth-telling reports is an equilibrium. The expected externality mechanism introduced in d'Aspremont and Gérard-Varet (AGV, 1979) is an example of such a mechanism. AGV chooses the efficient allocation $x^*(\cdot)$ and assigns the following monetary transfers to the participants:
$$T_i(s) = t_i(s_i) - \frac{1}{n-1} \sum_{l \ne i} t_l(s_l),$$
where
$$t_i(s_i) = \mathbb{E}_{\theta_{-i}} \Big[ \sum_{j \ne i} v_j\big(x^*(s_i, \theta_{-i}), \theta_j\big) \Big]. \qquad (5)$$
The transfer $t_i(s_i)$ is constructed such that agent $i$ internalizes the expected effect of his report on the others' welfare, assuming they tell the truth. This guarantees that agent $i$'s incentives are aligned with total welfare maximization; therefore, truth-telling is a Bayes-Nash equilibrium. Note that this implies immediately that in the truthful-L0 specification of the Lk model efficient implementation obtains for any $k$.
The second part of the transfer, $\frac{1}{n-1} \sum_{l \ne i} t_l(s_l)$, guarantees that the mechanism satisfies ex post budget balance. In particular, in the level-k model the transfers add up to zero after any profile of reports $s$. Note that this part of the transfer does not depend on $i$'s own report $s_i$; therefore, it can be omitted from the analysis of incentives.
Level-k in the mechanism In the expected externality mechanism, an Lk agent, $k \ge 1$, maximizes the expected gain in the mechanism:
$$\mathbb{E}\, v_i\big(x^*(s_i, s^{(k-1)}_{-i}(\theta_{-i})), \theta_i\big) + t_i(s_i).$$
Given the incentive transfer (5), the optimal Lk strategy in the mechanism is defined by the following:
$$s^{(k)}_i(\theta_i) \in \arg\max_{s_i} \; \mathbb{E}\, v_i\big(x^*(s_i, s^{(k-1)}_{-i}(\theta_{-i})), \theta_i\big) + \mathbb{E}_{\theta_{-i}} \Big[ \sum_{j \ne i} v_j\big(x^*(s_i, \theta_{-i}), \theta_j\big) \Big].$$
Recall that a strategy profile that satisfies $s^{(k)}(\theta) = s^{(k-1)}(\theta)$ for all $k$ and $\theta$ is a Bayes-Nash equilibrium. The following section demonstrates an example where this is not the case and studies the differences between Lk and equilibrium behavior in the AGV mechanism.
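The budget-balance property of the transfer rule can be checked numerically. The sketch below assumes the form T_i = t_i − (1/(n−1)) Σ_{l≠i} t_l described above and uses arbitrary illustrative numbers for the incentive parts t_i; the helper name `total_transfers` is hypothetical.

```python
def total_transfers(incentive):
    """Return T_i = t_i - (1/(n-1)) * sum_{l != i} t_l for every agent i."""
    n = len(incentive)
    total = sum(incentive)
    return [t - (total - t) / (n - 1) for t in incentive]


# Arbitrary illustrative values of the incentive parts t_i(s_i).
incentive = [3.0, -1.5, 0.25, 7.0]
transfers = total_transfers(incentive)
print(transfers)
print(sum(transfers))  # zero up to floating-point rounding
```

Each t_i is paid once to agent i and collected in equal shares of 1/(n−1) from the other n−1 agents, so the sum vanishes for any report profile, whatever beliefs generated the reports.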

Unadjusted mechanism
This section takes the AGV mechanism as given and studies its outcomes in the Level-k environment. We establish the conditions under which the mechanism still yields efficient outcomes, and look at the misreporting of preferences that may arise in certain stochastic environments. We start with a simple example to illustrate some of our main findings.
Example Consider a setting with $n$ agents and a quadratic utility representation
$$v_i(x, \theta_i) = -(x - \theta_i)^2.$$
In this setup, agent $i$ has a bliss point at $\theta_i$ and incurs quadratic loss if the allocation departs from it. It is easy to verify that the socially efficient allocation is the average of individual bliss points: $x^*(\theta) = \frac{1}{n} \sum_i \theta_i$. We prove the following simple lemma (see "Appendix").

Lemma 1
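In the quadratic case, the optimal Lk strategy, $k \ge 1$, for agent $i$ is given by the following:
$$s^{(k)}_i(\theta_i) = \theta_i + (-1)^{k+1} \Big( \frac{n-1}{n} \Big)^{k} \Delta, \qquad (8)$$
where $\Delta = \int \theta \, dF(\theta) - \int s \, d\Phi(s)$ denotes the difference between the average type and the average random move of an L0 agent.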
The Lk strategy (8) has several interesting properties. First, the size of the distortion diminishes as the level of rationality $k$ increases. As $k$ goes to infinity, the optimal strategies converge to truth-telling. This holds for any pair of distributions $F$ and $\Phi$. Second, if the distributions have equal means, $\int \theta \, dF(\theta) = \int s \, d\Phi(s)$, then truth-telling obtains at every level of rationality, starting from $k = 1$. Third, the absolute size of the discrepancy $|\Delta| \times \big( \frac{n-1}{n} \big)^{k}$ between the true type $\theta_i$ and the Lk report $s^{(k)}_i(\theta_i)$ increases in the number of agents.
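These properties can be reproduced by iterating best replies directly. The following is a minimal sketch under stated assumptions: utilities are v_i(x, θ_i) = −(x − θ_i)², the allocation is the mean of reports, and the best reply s = θ + ((n−1)/n)(E[θ] − m) to opponents whose average report is m is the one obtained from the first-order condition of the agent's problem under the original AGV transfer. The function names and the numerical values are illustrative only.

```python
def best_reply_shift(n, mean_types, mean_opp_reports):
    """Shift s - theta of the best reply in the quadratic example.

    Assumed to follow from the first-order condition with
    v_i(x, theta_i) = -(x - theta_i)**2 and the original AGV transfer.
    """
    return (n - 1) / n * (mean_types - mean_opp_reports)


def level_k_shifts(n, mean_types, mean_l0, max_level):
    """Report shifts of levels 1..max_level, obtained by iterating best replies."""
    shifts = []
    mean_opp_reports = mean_l0  # L1 best replies to random L0 reports
    for _ in range(max_level):
        shift = best_reply_shift(n, mean_types, mean_opp_reports)
        shifts.append(shift)
        # The next level faces opponents who report type + shift.
        mean_opp_reports = mean_types + shift
    return shifts


n, mean_types, mean_l0 = 4, 0.5, 0.3
delta = mean_types - mean_l0
shifts = level_k_shifts(n, mean_types, mean_l0, max_level=6)
closed_form = [(-1) ** (k + 1) * ((n - 1) / n) ** k * delta for k in range(1, 7)]
for k, (iterated, formula) in enumerate(zip(shifts, closed_form), start=1):
    print(f"L{k}: iterated {iterated:+.5f}, closed form {formula:+.5f}")
```

The iterated shifts coincide with the closed form Δ(−1)^(k+1)((n−1)/n)^k, alternate in sign, and shrink geometrically; setting `mean_l0` equal to `mean_types` makes every shift zero.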
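Next we study these properties in a more general setup. We maintain, however, that the efficient rule is linear in (a function of) types. Formally, we make the following assumption of neutrality:
A5 The efficient rule $x^*$ is linear in (a function of) agents' types, so that the marginal effect of one agent's type on the efficient allocation does not depend on the other agents' types.
Level 1 is central to the entire analysis, since any distortion of truth-telling that emerges at L1 propagates to higher levels. The analysis of the L1 optimal strategy,
$$s^{(1)}_i(\theta_i) \in \arg\max_{s_i} \; \mathbb{E}_{s^{(0)}_{-i}} v_i\big(x^*(s_i, s^{(0)}_{-i}), \theta_i\big) + \mathbb{E}_{\theta_{-i}} \Big[ \sum_{j \ne i} v_j\big(x^*(s_i, \theta_{-i}), \theta_j\big) \Big], \qquad (9)$$
yields the following proposition.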

Proposition 1 Under assumptions A1-A3, truth-telling is optimal at all levels of rationality if the distribution of random actions Φ and the distribution of types F coincide.
Proposition 1 establishes the equivalence between equilibrium and Lk predictions of the AGV mechanism's outcome. It shows that as long as the subjective distribution of random actions coincides with the (objective) distribution of types, it is irrelevant whether the agents stop at a finite level of reasoning or engage in equilibrium thinking. Proposition 1 trivially extends to the cognitive hierarchy (CH) model, since both Lk and CH models define level-1 equivalently. Overall, the AGV mechanism achieves efficient implementation in four models of reasoning: Lk and CH with truth-telling L0s; Lk and CH with random L0s and F ≡ Φ. Observe that the equivalence result relies on neither the linearity of the social choice rule nor the Spence-Mirrlees condition.
If distributions F and Φ do not coincide, Lk agents do not report truthfully in general. To study the report biases, we concentrate on the case where F and Φ can be ordered with respect to the first-order stochastic dominance relation, denoted $\succ_{FOSD}$. This corresponds to scenarios where players believe a salient strategy is to systematically under- or over-report one's type. We have the following result.

Proposition 2 Under assumptions A1-A5, L1 agents distort their type reports upwards if $F \succ_{FOSD} \Phi$, and downwards if $\Phi \succ_{FOSD} F$.
The proof of the proposition is given in the "Appendix". We start with the observation that any n-agent problem can be reduced to a problem with two agents, because stochastic dominance is preserved under monotone transformations and summation of random variables. Then, in the framework with two agents, we analyze the first-order condition that corresponds to the payoff-maximization problem (9) to obtain the result. The first part of Proposition 2 states that L1 agents systematically (that is, for every realization of type) misreport their types if one distribution dominates the other in the sense of first-order stochastic dominance. For example, if an L1 agent expects L0 agents' reports to dominate the type distribution, then L1 will report a lower type than he actually has (and vice versa), even if this induces a less preferred allocation. The reason is that in the AGV mechanism, an agent's report affects both (1) the expected externality, which is calculated based on the true distribution F, and (2) the agent's own expected value from the allocation, which depends on his own belief about other agents' reports. If an agent believes the others over-report (Φ dominates), he concludes that the allocation is on average higher than it would be under truthful reports by the others. Given that the utility function is strictly concave, this reduces his perceived marginal value of the allocation; therefore, he under-reports. If higher types prefer lower alternatives ('negative cross-derivative', as in A4.2), then L0s' over-reporting makes the chosen alternative lower, and L1 again under-reports, which raises the alternative, to compensate. In either case, an L1 agent compensates the opponents' random behavior by misreporting his type in the opposite direction.
The second part of the proposition states that the expected deviation of reported from true types decreases in absolute value as the level of rationality increases. The sign of the expected deviation alternates at every transition from k to k + 1. Thus the optimal level-k strategies follow a pattern similar to the example of Sect. 2. If level-2 agents overstate their type in the game, then level-3 agents will understate them. Note that this is good news for the AGV mechanism: if the group of agents is a mix of, say, level-2 and level-3 agents, then the expected chosen alternative is closer to efficiency.
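The benefit of mixing adjacent levels can be illustrated in the quadratic example. The sketch below assumes the closed-form shift Δ(−1)^(k+1)((n−1)/n)^k for an Lk report in that example and an allocation equal to the mean of reports, so that the expected bias of the allocation equals the population-weighted average shift; the population shares and the value of Δ are illustrative.

```python
def lk_shift(k, n, delta):
    """Shift of an Lk report relative to the true type (quadratic example, assumed form)."""
    return (-1) ** (k + 1) * ((n - 1) / n) ** k * delta


def expected_allocation_bias(level_shares, n, delta):
    """Expected bias of the mean-of-reports allocation.

    level_shares maps a level k to its share in the population.
    """
    return sum(share * lk_shift(k, n, delta) for k, share in level_shares.items())


n, delta = 3, 0.3
only_l2 = expected_allocation_bias({2: 1.0}, n, delta)
only_l3 = expected_allocation_bias({3: 1.0}, n, delta)
mixed = expected_allocation_bias({2: 0.5, 3: 0.5}, n, delta)
print(f"only L2: {only_l2:+.4f}, only L3: {only_l3:+.4f}, 50/50 mix: {mixed:+.4f}")
```

Because the L2 and L3 shifts have opposite signs, they partly cancel in the mixed population, and the expected allocation lies closer to the efficient one than in either homogeneous group.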

Non-neutrality
The assumption of neutrality implies that the marginal effect of an agent's type on the efficient allocation is invariant in other agents' types. However, there are examples of preferences where this assumption is violated. Consider the case with two agents whose preferences are given by $v_1 = \theta_1 x$ for Agent 1 and $v_2 = -\frac{x^2}{2\theta_2}$ ($\theta_2 > 0$) for Agent 2. The optimal allocation is $x^* = \theta_1 \theta_2$. Agent 1's expected utility in the mechanism (excluding the budget balancing part) from reporting $\hat\theta_1$ is
$$v_1 + t_1 = \theta_1 \hat\theta_1 \, \mathbb{E} s^{(0)}_2 - \tfrac{1}{2} \hat\theta_1^2 \, \mathbb{E} \theta_2.$$
Suppose $\Phi$ dominates $F$ such that $\mathbb{E} s^{(0)}_2 = 1$ and $\mathbb{E} \theta_2 = 0$; then $v_1 + t_1 = \theta_1 \hat\theta_1$. Thus Agent 1 will over-report if $\theta_1 > 0$ and under-report if $\theta_1 < 0$, which is not the prediction of Proposition 2. Contrary to the neutral environment, where $\Phi \succ_{FOSD} F$ would imply under-reporting by all types of an L1 agent (Proposition 2), this example features types that are complements with respect to the optimal allocation: $\partial^2 x^* / \partial \theta_1 \partial \theta_2 = 1 > 0$. In such environments, the result of Proposition 2 holds only for a subset of types, as we demonstrate below.
Agents' types are complements with respect to the efficient rule if $\frac{\partial^2 x^*}{\partial \theta_i \partial \theta_j} > 0$ for all $i \ne j$. Agents' types are substitutes with respect to the efficient rule if $\frac{\partial^2 x^*}{\partial \theta_i \partial \theta_j} < 0$ for all $i \ne j$. When types are substitutes, a higher type of agent $i$ lowers the marginal effect of the opponent's type. If types are complements, the interaction is the opposite: the marginal effect of $j$'s type increases with the type of agent $i$.
In this part of the analysis, we distinguish between positive (A4.1) and negative (A4.2) single crossing. Recall that, in the positive case, higher types receive higher marginal utility from allocation. In the negative case, the marginal utility diminishes with type. We separate the environments into four groups according to two criteria: first, whether the single-crossing holds as positive or as negative, and, second, whether the chosen alternative's increment due to an increase in one agent's report increases or decreases with the other agent's report (types are complements or substitutes). In these propositions, we additionally assume the monotone likelihood ratio property (MLRP). It says that the ratio of the probability density functions $\phi$ and $f$ is monotone. Propositions 3 and 4 make four distinct claims. Consider the first claim, for example: If high types tend to have high valuations (A4.1, positive single-crossing) and the efficient social choice rule is more sensitive to i's type if j's type is high (i.e., types are complements), then low-valuation agents will tend to misreport their type so as to compensate the bias in the other agent's report. This claim is the same as Proposition 2, except that it does not include a range of valuations above a threshold.
If there is first-order stochastic dominance in distributions, then in the neutral case an L1 agent displays compensating behavior: he systematically under- or over-reports, regardless of whether his true type is high or low. In a non-neutral case, however, this is different.
Observe that when types are complements or substitutes the mechanism may become more sensitive to L0's misreporting in the extreme ranges of L1's type when L1 misreports. Therefore L1's strategy of compensating report bias has a further indirect effect on the allocation choice. For this reason, both Propositions 3 and 4 include only the type ranges that correspond to low enough sensitivity of the social choice rule to the other agent's report. Types in the low-sensitivity regions display the compensating behavior, similar to our benchmark result in Proposition 2. Intuitively, the exclusion of some types in Propositions 3 and 4 can be understood as follows. Consider the more intuitive case of positive single crossing (A4.1). Suppose L1 agent's type is high, so he prefers a high level of public good, and complements environment. Then compensatory under-reporting makes the choice rule less responsive to the opponent's over-reporting and thus may lead to the allocation being too low for his preferences. On the other hand compensatory over-reporting makes the choice rule more responsive to the opponent's under-reporting and thus, again, may lead to the choice of allocation that is too low. Suppose now that the agent's type is low, so he prefers a low level of public good, and substitutes environment, as in the example given at the beginning of this section. In the example the choice rule does not respond to the opponent's under-reporting and thus, if the agent over-reports his type, he increases the probability that the project is undertaken, and that is against his private interest. Therefore, the reaction of the choice rule to the opponent's report determines whether the compensating bias is a profitable strategy.

Adjusting the mechanism 17
Our analysis so far assumed that the principal is unaware of the Lk environment. In other words, the principal implements the allocation and transfers as if the agents were infinitely rational. But what if the principal knows that the agents conduct only a finite number of best-reply iterations? How can he adjust the mechanism and achieve efficiency in this case? This section discusses this question. The answer depends critically on the principal's information about the setting. If the characteristics of the stochastic setting (the type distribution F, the distribution of random actions Φ, and the Lk identity of every agent) are known, then the principal can achieve efficiency by adjusting the incentive transfer. However, if some of that information is missing, the principal should expand the mechanism.

Known environment (F, Φ, k)
When $F$, $\Phi$, and $k_i$ for all $i \in I$ are known, the principal's response to the Lk environment is to adjust the incentive transfers accordingly. Knowing that L1 agents expect their opponents to behave non-strategically according to the distribution $\Phi$, the principal assigns the following transfer to any L1 agent:
$$t_i(s_i) = \mathbb{E}_{s^{(0)}_{-i}} \Big[ \sum_{j \ne i} v_j\big(x^*(s_i, s^{(0)}_{-i}), s^{(0)}_j\big) \Big]. \qquad (10)$$
17 I am grateful to the anonymous referee who suggested writing this section and offered some important insights into adjusting the AGV mechanism.
The expectation in (10) is taken over the L0 strategies $s^{(0)}_{-i}$, as opposed to type distributions as in the original AGV mechanism.
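The incentive transfer to all higher-level agents Lk, $k \ge 2$, remains unchanged relative to the original AGV mechanism:
$$t_i(s_i) = \mathbb{E}_{\theta_{-i}} \Big[ \sum_{j \ne i} v_j\big(x^*(s_i, \theta_{-i}), \theta_j\big) \Big]. \qquad (11)$$
Let AGVk(F, Φ) refer to the AGV mechanism with transfers (10) and (11).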

Lemma 2 Any Lk player (k ≥ 1) is truthful in AGVk(F, Φ).
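Proof Facing transfer (10), any L1 agent reports his type truthfully, since $s_i = \theta_i$ solves the utility maximization problem
$$\max_{s_i} \; \mathbb{E}_{s^{(0)}_{-i}} \Big[ v_i\big(x^*(s_i, s^{(0)}_{-i}), \theta_i\big) + \sum_{j \ne i} v_j\big(x^*(s_i, s^{(0)}_{-i}), s^{(0)}_j\big) \Big].$$
The term in brackets is the total welfare at the profile $(\theta_i, s^{(0)}_{-i})$, which the efficient rule maximizes for every realization of $s^{(0)}_{-i}$ when $s_i = \theta_i$. Provided that L1s receive transfers that make them reveal their types, L2s hold a belief over the reports that coincides with $F$, the distribution of types. As in the Bayes-Nash equilibrium of the standard AGV mechanism, L2 best replies to the incentives by reporting his type truthfully. By induction, truthfulness extends to all subsequent levels that face the standard AGV transfer (11). The induction relies on the fact that L(k + 1) believe that Lk best reply to L(k − 1), believe that L(k − 1) best reply to L(k − 2), and so on down to L1.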
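The argument of the proof can be illustrated in the quadratic example by comparing the L1 best report under the original and the adjusted transfer. This is a sketch under stated assumptions: utilities are v_i(x, θ_i) = −(x − θ_i)², the allocation is the mean of reports, and the adjusted transfer is the expected externality computed under the distribution of L0 actions instead of the type distribution. Only the terms of the objective that depend on the report are kept (variance terms are constant in the report), and the grid search and all numbers are illustrative.

```python
def objective(s, theta, n, belief_mean, transfer_mean):
    """Report-dependent part of the expected gain in the quadratic example.

    belief_mean:   mean of opponents' reports as believed by the agent.
    transfer_mean: mean of the distribution the principal uses in the transfer.
    """
    own_value = -((s + (n - 1) * belief_mean) / n - theta) ** 2
    externality = -(n - 1) * ((s - transfer_mean) / n) ** 2
    return own_value + externality


def best_report(theta, n, belief_mean, transfer_mean):
    """Grid search for the report that maximizes the objective."""
    grid = [i / 1000 for i in range(-2000, 3001)]
    return max(grid, key=lambda s: objective(s, theta, n, belief_mean, transfer_mean))


theta, n = 0.4, 3
mean_types, mean_l0 = 0.5, 0.2  # means of F and of the L0 action distribution

# L1 under the original AGV transfer (expectation over types).
l1_original = best_report(theta, n, belief_mean=mean_l0, transfer_mean=mean_types)
# L1 under the adjusted transfer (expectation over L0 actions).
l1_adjusted = best_report(theta, n, belief_mean=mean_l0, transfer_mean=mean_l0)
# L2 believes L1s are truthful and faces the standard transfer.
l2_adjusted = best_report(theta, n, belief_mean=mean_types, transfer_mean=mean_types)
print(l1_original, l1_adjusted, l2_adjusted)
```

Under the original transfer the L1 report is shifted away from the true type, while under the adjusted transfer both the L1 and the L2 reports coincide with it, in line with the induction in the proof.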
Therefore, in the case where the stochastic Lk environment is known, the principal can implement the efficient allocation by changing the transfer to L1 agents only. As before, ex post budget balance is achieved through an additional term that is independent of agent $i$'s own report $s_i$: $T_i(s) = t_i(s_i) - \frac{1}{n-1} \sum_{j \ne i} t_j(s_j)$.

Unknown environment
The construction of transfers (10) and (11) relies on the principal's knowledge of distributions Φ and F, respectively. The assignment of transfers to agents relies on the knowledge of levels $k_i$ for $i \in I$. If any part of this information is not available to the principal, he has to elicit it from the agents. Unfortunately, there is little hope of getting the information "for free". Suppose that the principal knew he was facing an L1 agent $i$ and asked him to report Φ. The agent would benefit from misrepresenting Φ, as it determines his incentive transfer (10). For example, in the quadratic utility case (Sect. 3) the agent gains in Φ-expected externality if Φ is such that the other agents' preferences are very similar to his own preference report $\hat\theta_i$. In the extreme case, the agent reports a degenerate distribution with a mass point at $\hat\theta_i$. Asking an L2 agent to report Φ would not result in truthful elicitation either. Contrary to L1, misreporting Φ does not affect L2's incentive transfer, but it does affect his expectation of the resulting allocation choice. Since an L2 believes that others are L1, he also believes that their type reports can be manipulated by falsely reporting Φ. Furthermore, since L2 believes that he pays a fraction $\frac{1}{n-1}$ of L1s' total incentive transfers as part of the budget balance program, his report of Φ also affects his monetary gain in the mechanism. These considerations illustrate the need for a proper elicitation mechanism.
Let $P_i$ denote agent $i$'s true belief about $(i+1)$'s moves 18 and $\hat P_i$ denote the reported belief. We assume that beliefs are differentiable for simplicity. Observe that $P_i = \Phi$ if $k_i = 1$. However, if $k_i \ge 2$ then $P_i = F$ under the assumption of truth-telling Lk. Neither $F$, $\Phi$, nor the levels $k$ are known to the principal.
Consider the following two-stage AGVk (TS-AGVk) mechanism:
Stage 1 Agent $i$ reports $\hat P_i$. 19
Stage 2 First-stage reports pin down the transfer schedule and $i$ reports type $\hat\theta_i$.
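The principal implements the efficient allocation (3) and pays the transfer: 20
$$T_i = t_i + b_i - \frac{1}{n-1} \sum_{l \ne i} t_l - b_{i+1}, \qquad (13)$$
where $b_i$ is the betting reward of agent $i$, scaled by $\lambda > 0$ and computed from the reported density $\hat p_i(s_{i+1}) = \frac{\partial}{\partial s_{i+1}} \hat P_i(s_{i+1})$ evaluated at the report of agent $i+1$. 21 Note that compared to the standard AGV mechanism, the budget balancing part in TS-AGVk includes an extra term $-b_{i+1}$ to balance the betting rewards.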
Lemma 3 For any $\varepsilon > 0$ there exists $\lambda > 0$ in the TS-AGVk mechanism with $n > 2$, such that truth-telling is $\varepsilon$-optimal for an Lk-agent, given that $I \setminus \{i\}$ tell the truth. 22
Under the assumption that all agents tell the truth, the lemma states that no Lk-agent can deviate and gain more than $\varepsilon$ by lying to the principal if the betting transfer is appropriately scaled. The proof is given in the "Appendix". The proof relies on the observation that the expected betting transfer is maximized when the reported belief coincides with the true belief (Good 1952). However, since the report $\hat p_i$ also affects $i$'s incentive transfer $t_i(\cdot)$, the loss in betting reward has to be sufficiently large to nullify any gain from changing the allocation and $t_i(\cdot)$ that $i$ may achieve by misreporting $p_i$ and $\theta_i$.
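The role of the betting reward can be illustrated with a logarithmic scoring rule. The sketch below is an assumption-laden illustration, not the transfer of the mechanism itself: it takes the reward to be λ times the logarithm of the reported probability of the realized outcome, in the spirit of Good (1952), and uses a discrete belief over three outcomes in place of a density. The candidate reports are hypothetical.

```python
import math


def expected_log_score(true_probs, reported_probs, scale=1.0):
    """Expected reward scale * log(reported probability), taken under the true belief."""
    return scale * sum(
        p * math.log(q) for p, q in zip(true_probs, reported_probs) if p > 0
    )


true_belief = [0.2, 0.5, 0.3]
candidates = {
    "truthful": [0.2, 0.5, 0.3],
    "flattened": [1 / 3, 1 / 3, 1 / 3],
    "overconfident": [0.05, 0.9, 0.05],
}
scores = {name: expected_log_score(true_belief, q) for name, q in candidates.items()}
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.4f}")
print("best report:", best)
```

Under this rule the truthful report yields the highest expected reward, and a larger `scale` widens the loss from any misreport, which is the lever the lemma uses to outweigh gains from manipulating the incentive transfer.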
Remark TS-AGVk does not rely on the knowledge that the underlying model is Lk. Specifically, the transfers are constructed to induce truth-telling as the best response of an agent with arbitrary beliefs, not necessarily an Lk agent. In contrast, the mechanisms introduced below are tailored to the particular setting of Lk and are therefore less robust to the change of environment.
If $F$ and $\Phi$ are known but levels $k$ are unknown, then the first stage of the mechanism above can be simplified. Here, we use the fact that in the Lk model, agent $i$'s level $k_i$ can be inferred from his belief about another agent's level $k_j$, $j \ne i$. At the first stage of TS-AGVk(F, Φ) the principal asks each agent to guess the level of another participant. To fix ideas, let agent 1 report on $k_2$, agent 2 report on $k_3$, and so on until agent $n$, who reports on $k_1$. In the Lk model, agent $i$'s report $\hat k^i_{i+1}$ about agent $(i+1)$'s level is truthful if it is just below the agent's own level: $\hat k^i_{i+1} = k_i - 1$. The true belief may not be correct (i.e., $\hat k^i_{i+1}$ may or may not equal $k^i_{i+1}$); moreover, at least one agent's belief must be incorrect.
18 If $i = n$, consider his beliefs about agent 1.
19 Since communicating the entire distribution function may not seem tractable, assume that the distributions belong to a known parametric class. In that case, the agents have to communicate only a finite number of parameters. See, e.g., Brooks (2013) and Azar et al. (2012).
20 If $i = n$ read "$t_n + b_n - \frac{1}{n-1} \sum_{l \ne n} t_l - b_1$".
21 If $i = n$ read "$\hat p_n(s_1) = \frac{\partial}{\partial s_1} \hat P_n(s_1)$".
22 In the standard setting, this corresponds to an ε-equilibrium.
The structure of transfers in TS-AGVk(F, ) is given by (13), where the incentive part $t_i$ is given by (10) if $\hat{k}^i_{i+1} = 0$ and by (11) if $\hat{k}^i_{i+1} \geq 1$; the betting transfer

Lemma 4 There exists λ > 0 in TS-AGVk(F, ) with n > 2 such that truth-telling is Lk-optimal for agent i ∈ I, given that all agents in $I \setminus \{i\}$ tell the truth.
Unlike the TS-AGVk mechanism, TS-AGVk(F, ) with an appropriately chosen "punishment level" λ induces exact truth-telling. This is achieved because the reported levels k take on only discrete values (0, 1, 2, . . .).

If F and k are known but the distribution of L0 actions is unknown, then we can exploit the fact that this distribution is common knowledge among the agents. The principal can use a shoot-the-liar protocol, asking the agents to report it and punishing them if the reports are not unanimous. In this mechanism, reporting truthfully is a best reply to the residual profile of truthful reports. However, truth-telling is not the unique solution. Establishing uniqueness could involve using "nuisance" strategies, as in Maskin (1985), or additional stages, as in Moore and Repullo (1988).
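A minimal sketch of a shoot-the-liar rule, assuming that reports are comparable labels (for instance, parameters of a known parametric class) and that every agent is fined when the reports disagree. The function name, the labels, and the penalty size are hypothetical.

```python
def shoot_the_liar(reports, penalty=1.0):
    """Per-agent transfers: zero if all reports agree, -penalty for every
    agent otherwise. Reports are compared by equality."""
    unanimous = all(r == reports[0] for r in reports)
    return [0.0 if unanimous else -penalty for _ in reports]

truthful = shoot_the_liar(["true", "true", "true"])      # no one is punished
deviation = shoot_the_liar(["true", "other", "true"])    # a lone deviator triggers the fine
collusive = shoot_the_liar(["other", "other", "other"])  # unanimous misreport goes unpunished
```

The third call illustrates the non-uniqueness noted above: any unanimous report, truthful or not, avoids the punishment.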

Conclusion
The idea of relaxing the pervasive common knowledge assumption, often referred to as the Wilson doctrine, has motivated recent research in mechanism design. Significant progress was made in studying implementation in frameworks approaching the universal type space, where higher-order beliefs are virtually unrestricted. 24 Kets (2012) extends the notion of type space further to allow finite depths of reasoning, as in the level-k model. The next natural step for mechanism design is to accommodate the extended notion of type space and search for mechanisms that are robust with respect to changes not only in the structure of beliefs, but also in the depth of reasoning (as mentioned in the discussion, learning to play the mechanism is a related issue). This paper first studies one of the most influential existing mechanisms, the expected externality mechanism of D'Aspremont and Gérard-Varet (1979), in the Lk environment.
The AGV mechanism implements the efficient choice rule in Bayes-Nash equilibrium. It is conceptually similar to the Vickrey-Clarke-Groves (VCG) mechanism that taxes the agents with the amount of negative externality their preference report exerts on the welfare of other agents. The VCG mechanism implements the efficient social choice rule in dominant strategies, and hence is independent of the beliefs. 25 On the downside, the VCG mechanism fails to satisfy the overall budget constraint. The expected externality mechanism has the advantage of being exactly budget balanced, but it comes at the cost of achieving Bayesian, as opposed to dominant-strategy implementation. In the light of the Lk model, this is not entirely innocuous.
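The exact budget balance of the expected externality mechanism can be verified on a small discrete example. The sketch below assumes a binary public decision with valuations $v_i(x, \theta_i) = \theta_i x$, an independent common prior over three type values, and the textbook form of the AGV transfer (own expected externality minus the average of the others' expected externalities). All numbers and function names are hypothetical.

```python
from itertools import product

TYPES = [-1.0, 0.5, 2.0]   # hypothetical type values (net benefit of x = 1)
PROBS = [0.3, 0.4, 0.3]    # hypothetical independent prior over TYPES
N = 3                      # number of agents

def efficient_choice(profile):
    """Efficient rule: implement x = 1 iff the reported net benefits sum to >= 0."""
    return 1 if sum(profile) >= 0 else 0

def expected_externality(i, report):
    """Expected total valuation of the other agents, given agent i's report,
    with the expectation taken over the others' types under the prior."""
    total = 0.0
    for idx in product(range(len(TYPES)), repeat=N - 1):
        others = [TYPES[a] for a in idx]
        prob = 1.0
        for a in idx:
            prob *= PROBS[a]
        profile = others[:i] + [report] + others[i:]  # insert i's report at position i
        total += prob * efficient_choice(profile) * sum(others)
    return total

def agv_transfers(reports):
    """Transfer received by each agent: own expected externality minus the
    average of the other agents' expected externalities."""
    xi = [expected_externality(i, r) for i, r in enumerate(reports)]
    return [xi[i] - sum(xi[j] for j in range(N) if j != i) / (N - 1)
            for i in range(N)]

transfers = agv_transfers([2.0, -1.0, 0.5])
budget = sum(transfers)    # zero up to floating-point error
```

Each expected-externality term is paid out once and charged back in equal shares to the other N − 1 agents, so the transfers sum to zero for every report profile.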
Using the setup of the Lk model, we start by conducting a positive analysis of the mechanism in the behavioral environment. We show that if there is a systematic difference in the perceptions of random-L0 actions and true types, then the agents distort their types at the first level and, by extension, also at the higher levels of rationality. We thereby observe compensating behavior of finite-level agents in an AGV mechanism, that is, distorting one's report in the direction opposite to the opponents' anticipated bias. This is due to the fact that the AGV mechanism rewards the expected externality, where the expectation is measured with respect to the true types. A simple implication of this result is that the AGV mechanism could use the distribution of random actions, as opposed to types, to achieve truth-telling among Lk agents. Consequently, we adjust the AGV mechanism by changing the transfer for L1 agents in the case where the principal has sufficient information. Otherwise, we introduce a betting scheme to elicit the agents' knowledge of the environment, which the principal uses at a subsequent stage to induce truth-telling.
Altogether, our results suggest that the AGV mechanism is fairly robust to the iterative thinking environment. First, in the truthful-L0 specification there is no distortion of truth-telling and efficiency. Second, if there is distortion of truth-telling, its sign alternates and its absolute value decreases with k. Therefore, in mixed groups of agents with various levels k the biases cancel out and the mechanism's outcome is close to efficiency. This also implies that starting from L2 in the cognitive hierarchy model best replies are located within a smaller neighborhood of truth-telling. Third, the mechanism can be adjusted to the Lk framework in a way that maintains its key properties.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix
Proof We proceed by induction. Suppose that for k − 1 it holds that: The level-k optimal strategy is a best reply to the profile of strategies $s^{(k-1)}_j(\theta_j)$, where the expectation is taken with respect to the opponents' types $\theta_{-i}$.
Thus, if (14) holds on level k − 1, it also holds on level k. The level-1 strategy is a best reply to the profile of random actions: Thus, for L1 the induction formula (14) applies.

Proposition 1
Statement Under assumptions A1-A3, if F coincides with the distribution of L0 actions, then $s^{(1)}_i(\theta_i) = \theta_i$.

Proof The first-order condition (henceforth f.o.c.) for the maximization problem (9) is the following: Given that $x^*(s_i, s_{-i})$ is the efficient choice rule, it must hold that Then the second term of (18) can be rewritten, such that the f.o.c. becomes: 26 If the two distributions coincide (i.e., $s^{(0)}_{-i}$ and $\theta_{-i}$ are the same random variable), then $s_i = \theta_i$ satisfies the first-order condition (20) and thus $s^{(1)}_i(\theta_i) = \theta_i$.
Lemma A Let us denote the following L1 maximization problem with n agents by P n : Statement Suppose that A1-A5 hold. Consider an L1 problem P n with n agents and There exists an L1 problem P 2 with 2 agents and a pair of distribution functions F , satisfying F ≺ F O S D ( ≺ F ) such that the solution to P 2 is also a solution to P n .
Proof First, we observe that $\frac{\partial^2 x^*}{\partial s_i \partial s_j} \equiv 0$ (A5) implies that $x^*(s_1, \ldots, s_n) = \sum_i \lambda_i h_i(s_i)$ for some scalars $\lambda_i > 0$ and monotone functions $h_i$. Without loss of generality, Condition (20) can be rewritten as follows: $s_i$ that satisfies this condition is a solution to $P_n$. From Theorem 1.A.3 in Shaked and Shanthikumar (2007) and vice versa. $s^{(0)}$ and $\theta$ correspond to the random action and type of a fictitious second agent in $P_2$. In this problem $P_2$ the first-order condition writes as follows: It is then clear that the solutions to problems $P_n$ and $P_2$ coincide.

26 The second order condition (s.o.c.) E

Lemma B Statement
The L1 strategy in the AGV mechanism is given by (n = 2): Proof Rewrite (20) as follows: Integrate the second term of Equation (26) by parts: Modify the first term of Equation (26) by taking Taylor expansion under the integral: where θ i is between s i and θ i , and integrate by parts: Observe that due to the equal support of the two distribution functions F and : Thus, the f.o.c. becomes: We can rewrite the solution as follows:

Lemma C Statement
The Spence-Mirrlees condition (A4) implies the following, for all

Proof The efficiency of the social choice rule $x^*$ implies that for all $t_i, t_{-i}$: Differentiate with respect to $\theta_i$: From the s.o.c. of the same problem, Given A4 (i.e., the sign of $\frac{\partial^2 v_i}{\partial x \partial \theta_i}(x, \theta_i)$ is the same for all $(x, \theta_i)$), the result is proven.
Proposition 2

Proof From Lemma B, the first-order condition for the L1 maximization problem when n = 2 is given by Eq. (33). Lemma C (p. 25) shows that the denominator of the expression is positive. Let us transform the numerator as follows: The signs marked above are determined by the following. t); s i ) < 0 by the concavity of preferences; 2. By Lemma C (p. 25), = 0 by neutrality. Therefore, the term is negative. Given that F implies F(t) − (t) > 0 for all t and ≺ F implies F(t) − (t) < 0, Proposition 2 follows immediately.

Statement 2.2 Suppose that A1-A5 hold, and F F O S D or
Proof Recall that by definition: The first-order condition for level-k strategy s ∂s i (s i , θ −i ) = 0 since by neutrality assumption ∂ 2 x * ∂s i ∂s −i (s i , t) = 0 and x * (·, ·) is continuously differentiable.
Apply the Taylor expansion to the first term: To prove Proposition 3b ($\frac{\partial^2 x^*}{\partial s_i \partial s_{-i}}(s_i, t) \leq 0$), we change the decomposition of the numerator as follows: Given that $\frac{\partial v_i}{\partial x}(x^*(s_i, t); s_i)$ decreases in t, we have that for $t \leq s_i$, $\frac{\partial v_i}{\partial x}(x^*(s_i, t); s_i) \geq 0$ and thus the term in brackets is negative. Integrating the second term by parts, we obtain: Similarly to the argument in 3a, we identify the condition under which both parts of the numerator have the same sign. Given the decomposition (57), we can see that for this to hold $s_i$ has to be sufficiently high (or $\theta_i$ such that $s^{(1)}_i(\theta_i) \geq t^*_i$). Proposition 3b is proven.

27 To perform transition (*) we add and subtract
Proof of Proposition 4 The statement and proof are symmetric to Proposition 3.

Proof of Lemma 3
Fix an arbitrary ε > 0. The subjective expected gain from deviating from the truthful report $(P_i, \theta_i)$ to $(\hat{P}_i, \hat{\theta}_i)$ amounts to: where The classic result of Good (1952) implies that Therefore, $D(\hat{P}_i, \hat{\theta}_i; P_i, \theta_i) \geq \varepsilon$ only if $W(\hat{P}_i, \hat{\theta}_i; P_i, \theta_i) \geq \varepsilon$. Consider the set containing all pairs $(\hat{P}_i, P_i)$ such that $W(\hat{P}_i, \hat{\theta}_i; P_i, \theta_i) \geq \varepsilon$ for at least some pair of types $(\hat{\theta}_i, \theta_i)$, and assume that this set is non-empty. Then we can define C and c, where C is the greatest reward for misreporting within this set and c > 0 is the lowest punishment (before scaling) for misreporting within this set. The total gain from deviation (59) is capped by C − λc; hence one can always find λ > 0 (any λ > C/c will do) such that C − λc < 0. Thus $D(\hat{P}_i, \hat{\theta}_i; P_i, \theta_i) < 0$ and the premise that the set is non-empty is false for the given λ. We have shown that for all ε > 0 there exists λ > 0 such that there exists no $(\hat{P}_i, \hat{\theta}_i; P_i, \theta_i)$ with $D(\hat{P}_i, \hat{\theta}_i; P_i, \theta_i) \geq \varepsilon$.