On the optimality of monetary trading

People have long been trading in a “monetary” way. This persistence of monetary trading suggests that it might be an efficient trading mechanism. We formalize this intuition in a random-matching, absence-of-double-coincidence-of-wants environment. The record-keeping technology is operated by an information-processing center which summarizes and updates individuals’ past trading behavior in a binary variable. A trading mechanism consists of an updating rule and an individual trading behavior rule. To capture the difficulties in collecting and processing information about others’ past behavior, we assume that the center faces costs, both fixed and variable, to operate. We show that (1) any non-autarkic equilibrium trading mechanism is, in terms of aggregate variables (such as consumption and frequency of trade), observationally equivalent to a monetary trading mechanism and (2) any non-autarkic optimal equilibrium trading mechanism is a monetary trading mechanism.


Introduction
Throughout time the means of exchange have changed considerably. Many commodities have been used as media of exchange, including chocolate, shells, butter, salt, and, of course, paper. Recent technological developments have led to a substantial increase in cash substitutes like electronic payments and credit cards in most developed countries (e.g. Humphrey et al. 1996). Despite all the technological developments in payment systems, the nature of exchange in developed economies is strikingly similar to what it has been before: each person has a balance, which rises when he gives up goods and falls when he acquires goods. The trading mechanism is, in an informational sense, still a monetary one: All that has changed is that the balance, which was once physical, has become virtual.
This persistence of monetary exchange is puzzling. Given the advances in record-keeping and communication technology, it now seems feasible to design a trading mechanism which conditions trading behavior on information that monetary exchange ignores. By taking that information into account, such a trading mechanism has the potential to be more efficient than a monetary one. In fact, Kandori (1992) and Kocherlakota and Wallace (1998) explicitly design a non-monetary trading mechanism which is more efficient than a monetary trading mechanism whenever the record-keeping technology is sufficiently developed. 1 Why do real-world trading systems not exploit these efficiency gains? Why do we observe monetary trading and not some other trading arrangement?
In this paper, we provide an answer to these questions in a random-matching, absence-of-double-coincidence-of-wants setting analogous to that of Kiyotaki and Wright (1989) and Kocherlakota and Wallace (1998). Our view is that monetary trading would not have persisted if there were a more efficient trading mechanism; the contribution of this paper is to identify, in the setting on which most recent monetary models are built, an economically meaningful constraint on trading mechanisms that renders monetary trading the most efficient trading mechanism among those that satisfy it.
In short: Individuals in single-coincidence matches choose whether or not to trade depending on the history of past behavior. Each individual's history of past behavior is summarized in a binary state variable (i.e. there is memory) which is updated by an agent, the center. 2 This is a costly activity; hence, the center needs to be incentivized, and this leads to the constraint on trading mechanisms that will imply our optimality result for monetary trading. Thus, it is no wonder that we observe monetary trading: It is the best that can realistically (that is, with an incentive-feasible memory) be achieved.
Our optimality result significantly differs from that of Kocherlakota and Wallace (1998). In the latter, monetary trading is optimal only when there is no other record-keeping device other than fiat money (i.e. a portable object). In contrast, in our optimality result, incentive-feasible memory is available and, as in Kocherlakota (1998), monetary trading is just one particular way of using memory. Despite the wealth of alternatives, monetary trading turns out to be almost the only way of using it (in the sense that any non-autarkic equilibrium is, on the aggregate, observationally equivalent to monetary trading) and, more importantly, the best way of using it.
We now elaborate on the above short but vague answer to our questions. Each individual's allocation is a function of his and other individuals' past trading behavior, and each individual has access to information on other individuals' past trading behavior. However, this information is not perfect. As a metaphor for the difficulties individuals often experience in processing and coordinating on information about others' past behavior (e.g. Camera et al. 2013), there is a center (e.g. an agent not involved in trading) who summarizes individuals' past trading behavior in a binary variable and updates this variable. 3 We then impose that the center faces costs to update this variable.
Costs to update individuals' past trading information can arise for several reasons. 4 In general, updating takes time and effort; it may require setting up a communication device to collect information and software to process it. In the latter case, time and effort may be required to write the computer code and for the employees of the center to learn it. For these reasons, we assume that the center faces costs, both variable (e.g. dependent on the frequency with which changes need to be made) and fixed (e.g. setup costs), and that trading arrangements with more changes cost more. 5 A trading arrangement consists of an updating rule together with a form of individual trading behavior (a behavior rule). Monetary trading is a particular trading arrangement. But is it the best trading arrangement? Here, best is meant in the sense of maximizing, within a relatively broad class of trading arrangements, 6 a long-run (i.e. steady state) social welfare criterion in a decentralized way, namely, such that both the center and the individuals are happy to follow its rules.
Incentives for the center are provided via payments from individuals that are conditional on whether or not individuals' past trading information is updated as specified by a given trading arrangement. The presence of costs that differ across trading arrangements means that not all trading arrangements are necessarily incentive-feasible for the center because the center may be able to obtain the same payment while saving on costs. In fact, we show that any non-autarkic equilibrium is, in terms of aggregate quantities, observationally equivalent to monetary trading. 7 A sharper result is obtained when we focus on optimal equilibrium. We specify the optimum problem in such a way that the updating costs are, together with the average well-being of the individual traders, considered in the social welfare ranking. 8 Similarly to what has been described above, the presence of costs that differ across trading arrangements means that not all equilibrium trading arrangements are necessarily optimal because it may be possible to obtain the same average well-being of the individual traders while saving on costs. Indeed, we show that, provided that there exists a non-autarkic equilibrium ranked above autarky, monetary trading is the unique optimum equilibrium. 9 In conclusion, monetary trading is, in the sense of the above result, the optimal trading arrangement. To the extent that the setting we consider is useful as a foundation of monetary economics, this optimality of monetary trading helps to rationalize the monetary nature of real-world trade.
We emphasize that what is crucial in our definition of monetary trading is a specific form of updating individuals' information and a specific form of individual trading behavior. In particular, the way this information is recorded (e.g. whether it uses a specific object or whether it is done electronically) is immaterial: According to our result, monetary trading is the optimal equilibrium independently of how individuals' past behavior is recorded. Thus, monetary trading persisted while the means of exchange changed. Moreover, monetary trading will continue to persist as long as costs to updating individuals' past trading information remain in the form we have specified.
An important implication of adding the center to the model for our optimality result is that, in any equilibrium, there are no changes to the record of each individual's information off-the-equilibrium path. The addition of the center to the model can, therefore, be seen as a way of justifying this property: Changes to individuals' information off-the-equilibrium path are never observed; hence, the center can discard them and save on fixed costs while obtaining the same payment. 10 As Rubinstein (1986) puts it, "social institutions, various types of organizations, and human abilities degenerate or are readily discarded if they are not used regularly".
One implication of the property that, in any equilibrium, there are no changes to the record of each individual's information off-the-equilibrium path is that the non-monetary trading arrangement of Kandori (1992) and Kocherlakota and Wallace (1998) is not an equilibrium. Indeed, these are gift-giving, grim-trigger trading arrangements whose efficient outcome relies on the threat of autarky for anyone who fails to produce, i.e. on a transition off-the-equilibrium path. 11 Grim-trigger trading arrangements, and strategies more generally, are very prominent and make sense (i.e. are an equilibrium) in several settings, namely when individuals do not face difficulties in updating other individuals' past trading information, e.g. among married couples or people in village economies. 12 They do not in the present setting in light of the above arguments. 13
9 In particular, this result does not require "high" discount factors. This feature and the fact that monetary trading is an (exact) optimum distinguish our optimality result from the asymptotic (on the discount factor and number of status levels) and approximate optimality results for monetary trading of, for example, Berentsen (2002), Green and Zhou (2005), van der Schaar et al. (2013) and Olszewski and Safronov (2018). See Sect. 5 for more on this point.
10 These changes would still be discarded if they occur infrequently, e.g. if individuals tremble and choose non-equilibrium actions. See Sect. 2.5.3 for a more detailed discussion.
11 The same argument applies to a modification of the grim-trigger trading arrangement where no production is part of the equilibrium path. See Footnote 17 for details.
12 See Kocherlakota (1996) regarding the latter example.
Our results are consistent with some experimental evidence. Bigoni et al. (2014), Camera and Casari (2014) and Duffy and Puzzello (2014) observed that the most efficient outcome in their experiments occurred when subjects used monetary trading. In addition, in the benchmark treatment in Bigoni et al. (2014) and Camera and Casari (2014), there is evidence (see the discussion following Result 5 in the former paper and Result 4 in the latter) that subjects in some treatments tried to use a gift-giving, grim-trigger trading arrangement but were reluctant to update or, at least, act on information on other individuals' past behavior when off-the-equilibrium path. While there is evidence that some producers chose not to produce after observing no production by some other subject, this did not occur frequently enough; in fact, those subjects who were frequent defectors (i.e. those who produced in less than 20% of the matches where they were the producer) were the ones who received the highest payoff (see Figure 4 in Bigoni et al. 2014; Figure 3 in Camera and Casari 2014). Thus, as in our result, difficulties in updating other individuals' past trading information (leading, specifically, to "little" updating of information off-the-equilibrium path) seemed to justify the observed optimality of monetary trading in those experiments.
There may be other (possibly milder or more appealing) conditions leading to an optimality result for monetary trading and also more complicated settings where monetary trading will not be optimal under reasonable conditions. Determining the exact scope of the optimality of monetary trading is beyond the goal of this paper; instead, the goal of this paper is simply to provide an economically meaningful optimality result for monetary trading in the classic framework of Kiyotaki and Wright (1989) that goes significantly beyond that of Kocherlakota and Wallace (1998). Kocherlakota and Wallace's (1998) optimality result for monetary trading assumes that there is no information on other individuals' past trading behavior (i.e. no memory) other than what is transmitted by a portable object (fiat money). In contrast, we have an imperfect but still sufficiently rich memory that allows us, in particular, to dispense with the presence of a portable object. There is no special role given to fiat money so that, in our monetary trading arrangement, an individual's money balance is simply a label or a status level. As a result, Kocherlakota and Wallace's (1998) no-commitment condition, requiring no change to an individual's balance when he chooses not to produce, and which is key to their optimality result, is not imposed. Instead, our optimality result relies on the properties we impose on updating costs, namely, that fewer changes to individuals' past trading information imply lower costs and that lower costs imply higher social welfare.
To see our main result in perspective, note first that Kiyotaki and Wright (1989) specified a setting where monetary trading is an equilibrium. In their setting, monetary trading uses commodity money or fiat money, the latter then implying that individuals are willing to accept an intrinsically worthless object in exchange for valuable goods. Kocherlakota (1998) then showed that the important feature of money in monetary trading is to serve as a record-keeping device, i.e. memory. In this line of thought, Kocherlakota and Wallace (1998) then showed that monetary trading is optimal when individuals have no access to records of past behavior other than what is provided by fiat money. Our main result now shows that for monetary trading to be optimal, memory does not need to be nonexistent: If memory is incentive-feasible and designed optimally to reduce costs, then monetary trading is the optimal trading arrangement. Kocherlakota's (1998) money-is-memory result led researchers to look for essentiality results for monetary trading, i.e. for settings where some (optimal) allocations can be supported with money but not with memory (see Huggett and Krasa 1996; Araujo et al. 2012 for a recent result). This paper offers a different response to Kocherlakota's (1998) result by presenting a setting where the incentive-feasible form of memory we consider is, at least for optimal equilibrium allocations, equivalent to money. In this sense, incentive-feasible memory is money and, in such a setting, there is no reason why people should not use money.
There are some qualifications for our loosely stated results, the details of which are presented in what follows. Section 2 presents the model, including the notion of monetary trading and the specific details regarding the center. Our main results appear in Sect. 3. Section 4 discusses some of our assumptions, and Sect. 5 presents additional concluding remarks and a discussion of more recent developments in the literature. The proofs of our results are in the Appendix.

The environment
The environment is analogous to that of Kocherlakota and Wallace (1998) and consists of a random-matching, absence-of-double-coincidence-of-wants setting along the lines of Kiyotaki and Wright (1989). Time is discrete and the horizon is infinite. There are N ≥ 3 distinct perishable goods at each date and there is a [0, 1] continuum of each of N types of people. Each type is specialized in consumption and production: A type n person consumes good n and produces good n + 1 (modulo N ). Each person maximizes expected discounted utility with discount factor β ∈ (0, 1).
In each period, each person can produce a quantity in a set Y ⊆ R_+. We assume that 0 ∈ Y and that Y has at least two elements. Two particular examples (which we do not impose) are the indivisible goods case, where Y = {0, 1}, and the divisible goods case, where Y = [0, ȳ] for some ȳ > 0. In each period, the utility of producing y ∈ Y equals −y and the utility for a type n person of consuming y units of good n equals u(y), where u : R_+ → R_+ is strictly increasing with u(0) = 0.
The type of each individual is perfectly observable by all individuals. In each period, there is also public information on the actions taken by each individual prior to the current period. Following Okuno-Fujiwara and Postlewaite (1995), the information on each individual's history of past actions is summarized by a finite-valued state variable called status. We do not assume that the status level is tangible; an individual's status level is simply an abstract record of his past history. However, in some specific trading mechanisms, the status level can have a concrete interpretation and, in the particular case of the monetary trading mechanism defined in Sect. 2.3, the status of an individual can be interpreted as his (electronic or paper) money holdings; we emphasize that, despite this interpretation, we do not assume the existence of a portable object. In this paper, for tractability, we focus on binary status levels.
In each period, people are randomly matched in pairs. This matching is such that, for each person, the distribution of partners' type and status from which an individual's meeting is drawn matches the demographic distribution of types and status in the entire population of the economy. Each person in a meeting knows his trading partner's type and status.
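The meeting structure implied by the environment can be checked with a short sketch. It assumes that a partner's type is uniform over the N types (which follows from each type having unit mass); the function name `meeting_probabilities` is hypothetical and used only for illustration.

```python
from fractions import Fraction

def meeting_probabilities(N):
    """Return (P[producer], P[consumer], P[no coincidence]) for a type-0 person.

    Assumption: the partner's type m is uniform on {0, ..., N-1}.
    A type-n person consumes good n and produces good n + 1 (modulo N).
    """
    n = 0
    producer = consumer = none = 0
    for m in range(N):
        partner_wants_my_good = (n + 1) % N == m   # partner consumes good m
        i_want_partner_good = (m + 1) % N == n     # partner produces good m + 1
        if partner_wants_my_good and i_want_partner_good:
            raise ValueError("double coincidence of wants: requires N >= 3 to rule out")
        if partner_wants_my_good:
            producer += 1
        elif i_want_partner_good:
            consumer += 1
        else:
            none += 1
    return (Fraction(producer, N), Fraction(consumer, N), Fraction(none, N))
```

Under these assumptions each person is the producer in a single-coincidence meeting with probability 1/N, the consumer with probability 1/N, and in a no-coincidence meeting otherwise; with N = 2 the sketch raises an error because every cross-type meeting would be a double coincidence.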

Trading mechanisms
Trade between individuals in the economy is described by a trading mechanism. We focus on trading mechanisms that are symmetric across individuals (i.e. that treat equally all individuals of the same type), pure (i.e. do not involve randomization) and binary (i.e. that have at most two status levels). 14 Such mechanisms are described as follows. In each period, each individual has a status level of 0 or 1. In no-coincidence meetings, the mechanism recommends no production (and, therefore, no consumption) and unchanged status. In single-coincidence meetings, as a function of each individual's status, the producer decides how much to produce and the consumer decides whether or not to accept the quantity produced by the producer (a strictly positive production implicitly means that the producer accepts to give the quantity he has produced to the consumer). 15 These actions, together with both individuals' status levels, determine each individual's next period status level.
Let S = {0, 1} be the set of status levels and let X = S × S denote the set of status profiles. Let A = Y × {0, 1} denote the set of action profiles, where (y, 1) (resp. (y, 0)) means that the producer offers to produce y and the consumer accepts (resp. does not accept) the quantity proposed by the producer. Formally, a trading mechanism π is defined by a decision function B : X → A and a transition function T : X × A → X. Throughout the paper, for each status profile and each action profile, the first coordinate refers to the producer, while the second refers to the consumer, i.e. x = (x_p, x_c) and a = (a_p, a_c). Thus, the first coordinate of the decision and the transition functions refers to the producer in the match, while the second refers to the consumer, and we often write B = (B_p, B_c) and T = (T_p, T_c). The interpretation of the function B is that B_i(x) describes i's choice, where i = p, c, in a single-coincidence meeting when the producer and the consumer have status profile x ∈ X.
For each x ∈ X, the quantity produced and consumed (henceforth, traded) equals
$$ y(x) = B_p(x)\, B_c(x). \tag{1} $$
As in Kiyotaki and Wright (1989), there is trade if and only if both parties agree. For each (x, a) ∈ X × A and i = p, c, T_i(x, a) is i's next period status level when the current status profile is x and the current action profile is a. 16 We note that there are no feasibility constraints on B and T (later on, B and T will be constrained by equilibrium conditions). This is because status levels are abstract records of the past. In contrast, if each status level were the quantity of a portable object, then some feasibility constraints would be natural. For instance, it would be natural to impose that T_p(x, a) + T_c(x, a) = x_p + x_c for each x ∈ X and a ∈ A to express the requirement that individuals leave a single-coincidence match with the same total quantity of the portable object with which they entered. We do not impose any such constraint (see Footnote 32 for more on this).
14 See Sect. 4 for a discussion of these assumptions.
Actual trade in the economy depends on the trading mechanism π being used and on the distribution of status levels in the economy. We focus on status distributions that are symmetric across types; thus, a status distribution is described by an element of Δ = {(q_0, q_1) ∈ R^2_+ : q_0 + q_1 = 1}, where q_s is the fraction of people of each type having status s, for each s ∈ S = {0, 1}.
We focus on stationary distributions of the Markov chain on S that π (together with the specification of the economy) induces when individuals follow it. Such stationary distributions are defined as follows. For convenience, in the case where individuals follow π, we simplify the notation by defining, for each x ∈ X,
$$ T(x) = T(x, B(x)). $$
Then q is a stationary distribution of the Markov chain on S induced by π if q_1 = 1 − q_0 and
$$ q_0 = \left(1 - \frac{2}{N}\right) q_0 + \frac{1}{N} \sum_{x \in X} q_{x_p} q_{x_c} \left( \mathbf{1}\{T_p(x) = 0\} + \mathbf{1}\{T_c(x) = 0\} \right), $$
where the indicator equals one when its condition holds and zero otherwise; each person is the producer in a single-coincidence meeting with probability 1/N, the consumer with probability 1/N, and in a no-coincidence meeting (where status is unchanged) otherwise. Following Okuno-Fujiwara and Postlewaite (1995), we refer to a pair μ = (π, q) as a trading norm when π is a trading mechanism and q is a status distribution. We say that a trading norm μ = (π, q) is stationary if q is a stationary distribution of the Markov chain on S induced by π.
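The stationarity requirement can be illustrated numerically. The following is a minimal sketch, assuming that each person is a producer (resp. consumer) in a single-coincidence meeting with probability 1/N and that status is unchanged in no-coincidence meetings; the names `next_q0`, `is_stationary`, `T_keep` and `T_reset` are hypothetical and not part of the paper.

```python
from fractions import Fraction
from itertools import product

S = (0, 1)
X = list(product(S, S))  # status profiles x = (x_p, x_c)

def next_q0(B, T, q0, N):
    """Share of status-0 people next period when everyone follows (B, T)."""
    q = {0: q0, 1: 1 - q0}
    total = (1 - Fraction(2, N)) * q0           # no-coincidence meetings keep status
    for x in X:
        xp, xc = x
        tp, tc = T(x, B(x))                     # on-path transition T(x, B(x))
        weight = q[xp] * q[xc] / N              # assumed probability 1/N per role
        total += weight * ((tp == 0) + (tc == 0))
    return total

def is_stationary(B, T, q0, N):
    return next_q0(B, T, q0, N) == q0

# Two illustrative (hypothetical) mechanisms with no production at all.
B_never = lambda x: (0, 0)
T_keep = lambda x, a: x          # status never changes
T_reset = lambda x, a: (1, 1)    # everyone in a single-coincidence meeting gets status 1
```

With `T_keep` every distribution is stationary, whereas with `T_reset` only q_0 = 0 is, since status-0 people are steadily moved to status 1.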

Monetary trading norms and other examples
In a monetary trading mechanism, when possible, the producer produces and receives money, while the consumer gives money and receives the consumption good. Formally, we say that π is a monetary trading mechanism if, for some y ∈ Y with 0 < y < u(y),
$$ B_p(x) = \begin{cases} y & \text{if } x = (0, 1), \\ 0 & \text{otherwise,} \end{cases} \qquad B_c(0, 1) = 1, $$
and
$$ T(x, a) = \begin{cases} (1, 0) & \text{if } x = (0, 1) \text{ and } a = (y, 1), \\ x & \text{otherwise,} \end{cases} $$
for all x ∈ X and a ∈ A. We then say that μ = (π, q) is a monetary trading norm if 0 < q_0 < 1 and π is a monetary trading mechanism. The interpretation is clear: In a monetary trading norm, the status of an individual is interpreted as his money holdings. Moreover, a strictly positive quantity is traded if and only if the producer has zero units of money and the consumer has one unit of money. In this case, the producer receives one unit of money from the consumer. In the remaining cases, there is no trade and no transfer of money. These cases can be interpreted as follows: When the money holdings are x = (0, 0) or x = (1, 0), the consumer has no money to pay for the good; when x = (1, 1), the producer cannot receive more money because there is effectively a unit upper bound on money holdings. Furthermore, 0 < q_0 < 1 means that trade actually takes place with a strictly positive probability (equal to q_0(1 − q_0)/N).
In the above definition, the actual monetary trading norm depends on q 0 , y, B c (0, 0), B c (1, 0) and B c (1, 1). The set of all monetary trading norms is denoted by M.
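As an illustration, a monetary trading mechanism can be written down explicitly and the two properties just mentioned verified: any q_0 is stationary, and trade occurs with probability q_0(1 − q_0)/N. The sketch assumes meeting probabilities of 1/N per role and treats the consumer's acceptance decisions at (0, 0), (1, 0) and (1, 1) as free parameters; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

X = list(product((0, 1), repeat=2))  # status profiles x = (x_p, x_c)

def monetary_mechanism(y, bc00=0, bc10=0, bc11=0):
    """Decision and transition functions of a monetary trading mechanism."""
    accept = {(0, 1): 1, (0, 0): bc00, (1, 0): bc10, (1, 1): bc11}
    def B(x):
        return (y if x == (0, 1) else 0, accept[x])
    def T(x, a):
        # The unit of money changes hands only when y is produced and accepted.
        return (1, 0) if x == (0, 1) and a == (y, 1) else x
    return B, T

def trade_probability(B, q0, N):
    """Probability that a given person is a producer in a meeting with trade."""
    q = {0: q0, 1: 1 - q0}
    mass = sum((q[x[0]] * q[x[1]] for x in X if B(x)[0] * B(x)[1] > 0), Fraction(0))
    return mass / N

def next_q0(B, T, q0, N):
    """Share of status-0 people next period (assumed 1/N probability per role)."""
    q = {0: q0, 1: 1 - q0}
    total = (1 - Fraction(2, N)) * q0
    for x in X:
        tp, tc = T(x, B(x))
        total += q[x[0]] * q[x[1]] / N * ((tp == 0) + (tc == 0))
    return total
```

Stationarity for every q_0 reflects the fact that money only changes hands: each transfer moves one person from status 0 to 1 and another from 1 to 0.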
A prominent trading norm, used in Kandori (1992) and Kocherlakota and Wallace (1998) to obtain a more efficient trading norm than monetary trading, is the grim-trigger norm. Formally, we define the grim-trigger trading norm μ^G = (π^G, q^G) by setting q^G_1 = 1 and by letting π^G = (B^G, T^G) prescribe that y is produced and accepted when the status profile is (1, 1), that a producer who fails to produce in such a meeting moves to status 0, and that the status profile remains x otherwise, for all x ∈ X and a ∈ A, where y ∈ Y is such that u(y) > y. Here, a status of 1 indicates that the individual has produced in all the past single-coincidence meetings in which he was the producer. Production occurs if and only if the two individuals in a single-coincidence meeting have a status of 1; thus, someone who fails to produce in some of the past single-coincidence meetings in which he was the producer will be in autarky from that time onwards. To see that this norm is more efficient than a monetary trading norm with the same level of production, simply note that y is produced and consumed in every single-coincidence match under grim-trigger; in contrast, under monetary trading, there is zero production and consumption in all single-coincidence matches where the status profile is (0, 0), (1, 0) or (1, 1). 17
Despite the prominence of the grim-trigger trading norm, our goal is to compare monetary trading norms with all possible trading norms and not just with the grim-trigger norm. As an example of a non-monetary trading norm which is also different from the grim-trigger norm, consider the following monetary trading norm with charity μ^C = (π^C, q^C), where π^C = (B^C, T^C) is such that a producer with status 0 produces y for a consumer with status 1 and y′ for a consumer with status 0, the consumer accepts in both cases, there is no trade otherwise, and
$$ T^C(x, a) = \begin{cases} (1, 0) & \text{if } x = (0, 1) \text{ and } a = (y, 1), \\ (1, 0) & \text{if } x = (0, 0) \text{ and } a = (y', 1), \\ x & \text{otherwise,} \end{cases} $$
for each x ∈ X and a ∈ A, where q^C is a status distribution and y, y′ ∈ Y are such that 0 < y′ < y. This trading mechanism differs from monetary trading in that producers produce for a consumer with zero status (although a quantity lower than the one they produce for a consumer with a status of one) and are rewarded by attaining a status of one. Thus, an individual's status reflects both past accumulation of money and past good behavior.
For the above monetary trading norm with charity to be stationary, q^C_1 = 1 is required. This, together with y^C(1, 1) = B^C_p(1, 1) B^C_c(1, 1) = 0, means that trade occurs with probability zero. To give an example of a non-monetary and non-grim-trigger trading mechanism in which trade occurs with strictly positive probability, we modify the monetary trading norm with charity as follows. Define the monetary trading norm with charity and a vow of poverty μ^P = (π^P, q^P) by setting q^P_0 = 1/2, B^P = B^C and
$$ T^P(x, a) = \begin{cases} (1, 0) & \text{if } x = (1, 1), \\ T^C(x, a) & \text{otherwise,} \end{cases} $$
for each x ∈ X and a ∈ A. Thus, besides the difference between q^C and q^P, the difference between this trading norm and the monetary trading norm with charity is that, in single-coincidence matches with status profile x = (1, 1), the consumer becomes poorer, in the sense of transiting from the high-valued status of 1 to the low-valued status of 0 without receiving any material compensation for that. A final example is obtained by combining the decision function of monetary trading mechanisms with a variation of the transition function of the grim-trigger.
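The two stationarity claims for the charity examples can be checked with a short sketch. It rests on assumptions that should be read as such: meeting probabilities of 1/N per role, a decision function in which a status-0 producer produces y (resp. a smaller quantity) for a status-1 (resp. status-0) consumer who accepts, and a vow-of-poverty transition that sends the consumer at (1, 1) to status 0 regardless of the action profile. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

X = list(product((0, 1), repeat=2))  # status profiles x = (x_p, x_c)

def charity(y, y_small):
    """Monetary trading with charity (decision function is an assumed reading)."""
    def B(x):
        if x == (0, 1):
            return (y, 1)
        if x == (0, 0):
            return (y_small, 1)
        return (0, 0)
    def T(x, a):
        if x == (0, 1) and a == (y, 1):
            return (1, 0)
        if x == (0, 0) and a == (y_small, 1):
            return (1, 0)
        return x
    return B, T

def vow_of_poverty(y, y_small):
    """Same decisions as charity; the consumer at (1, 1) drops to status 0."""
    B, T_C = charity(y, y_small)
    def T(x, a):
        return (1, 0) if x == (1, 1) else T_C(x, a)
    return B, T

def next_q0(B, T, q0, N):
    """Share of status-0 people next period (assumed 1/N probability per role)."""
    q = {0: q0, 1: 1 - q0}
    total = (1 - Fraction(2, N)) * q0
    for x in X:
        tp, tc = T(x, B(x))
        total += q[x[0]] * q[x[1]] / N * ((tp == 0) + (tc == 0))
    return total
```

Under these assumptions the charity norm maps q_0 to q_0 − q_0²/N, so only q_0 = 0 (that is, q_1 = 1) is stationary, while the vow-of-poverty norm maps q_0 to (1 − 2/N) q_0 + 1/N, whose unique fixed point is q_0 = 1/2.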

Formally, we define the grim-monetary trading norm μ^GM = (π^GM, q^GM), with π^GM = (B^GM, T^GM), by letting B^GM be the decision function of a monetary trading mechanism with quantity y and by letting T^GM change the producer's status as described below and leave the status profile at x otherwise, for all x ∈ X and a ∈ A, where 0 < y < u(y). Thus, as in a monetary trading norm, q^GM_1 ∈ (0, 1) and trade takes place if and only if the producer in a single-coincidence meeting has status 0 and the consumer has status 1. As in the grim-trigger trading norm, a status of 1 (resp. 0) can be interpreted as indicating good (resp. bad) past behavior. Furthermore, a producer loses his good status when he fails to produce for a consumer with good status (despite following the recommendation of the decision function B^GM). However, unlike the grim-trigger trading norm, there is in T^GM an element of forgiveness: A producer with bad status who produces for a consumer with good status regains his good status.

Equilibrium and the optimum problem
In this section, we preview our notions of an equilibrium and of an optimal equilibrium and apply them to the setting of Kocherlakota and Wallace (1998).
Given a stationary trading norm μ = (π, q), the utility each individual receives is described by a function V(μ) : S → R, which we often write simply as V when it is clear what the trading norm is. Specifically, for each s ∈ S = {0, 1}, V(μ)(s) (which we write as V_s(μ) or, simply, as V_s) gives the expected discounted utility of an individual having status s. Writing, for each x ∈ X, y(x) = B_p(x) B_c(x) for the quantity traded and T(x) = T(x, B(x)) for the transition when individuals follow π, the function V satisfies
$$ V_0 = \frac{1}{N} \sum_{s \in S} q_s \left[ -y(0, s) + \beta V_{T_p(0, s)} \right] + \frac{1}{N} \sum_{s \in S} q_s \left[ u(y(s, 0)) + \beta V_{T_c(s, 0)} \right] + \left(1 - \frac{2}{N}\right) \beta V_0 \tag{5} $$
and
$$ V_1 = \frac{1}{N} \sum_{s \in S} q_s \left[ -y(1, s) + \beta V_{T_p(1, s)} \right] + \frac{1}{N} \sum_{s \in S} q_s \left[ u(y(s, 1)) + \beta V_{T_c(s, 1)} \right] + \left(1 - \frac{2}{N}\right) \beta V_1. \tag{6} $$
The function V above allows us to verify whether or not individuals have an incentive to follow the actions prescribed by π. We require that each individual in a single-coincidence meeting cannot increase his utility through a one-shot deviation. Specifically, we require that the producer does not gain by choosing a quantity different from B_p(x) and that the consumer does not gain by changing his acceptance decision: For each x ∈ X, y ∈ Y and α ∈ {0, 1},
$$ -y(x) + \beta V_{T_p(x)} \geq -y\, B_c(x) + \beta V_{T_p(x, (y, B_c(x)))} \tag{7} $$
and
$$ u(y(x)) + \beta V_{T_c(x)} \geq u(\alpha B_p(x)) + \beta V_{T_c(x, (B_p(x), \alpha))}. \tag{8} $$
The information on each individual's trading history, i.e. each individual's status, is updated by a center. Formally, the center is an agent different from the individuals engaged in trading and who is himself not involved in trading. As we describe explicitly in Sect. 2.5, the center faces costs to operate, and hence, it may or may not have an incentive to update individuals' status levels according to a given transition function T. Informally, we say that a stationary norm μ = (B, T, q) is an equilibrium if the center has an incentive to follow T and the individuals have an incentive to follow B, the latter being characterized by (7) and (8). We use E to denote the set of equilibria.
An important particular case is obtained when the center faces no costs to operate. In this case, the center is a priori indifferent between any two transition functions as all of them cost the same to operate (i.e. zero); hence, it is easy to incentivize the center to follow any given transition function. In this case, an equilibrium is simply a stationary norm μ such that (7) and (8) hold. 18
We consider an optimum problem which is analogous to the one in Kocherlakota and Wallace (1998) and with a similar interpretation. Its goal is to choose an equilibrium that is a best element according to a social ranking of stationary trading norms. Let ≻ denote the (strict) social ranking of stationary trading norms. An optimal equilibrium is then μ* ∈ E such that there is no μ ∈ E such that μ ≻ μ*.
In our setting, the social ranking of stationary trading norms treats individuals in a symmetric way and, moreover, depends both on the costs of operating the center and on the average expected discounted utility. The latter is the weighted average of the expected discounted utility of those individuals with status level equal to one and zero, with weights given by the proportions of each of these two groups in the population. Thus, the average expected discounted utility of a stationary trading norm μ = (π, q) is W(μ) = q_0 V_0(μ) + q_1 V_1(μ). Using (1), (5) and (6), it is easy to see that
$$ W(\mu) = \frac{1}{N(1 - \beta)} \sum_{x \in X} q_{x_p} q_{x_c} \left[ u(y(x)) - y(x) \right]. $$
In the particular case where the center faces no costs to operate, it is natural to consider the social ranking of stationary trading norms given by the average expected discounted utility, i.e. μ ≻ μ′ if and only if W(μ) > W(μ′). This case corresponds to the "memory" case in Kocherlakota and Wallace (1998) (i.e. to the case where, in their setting, the record of individuals' past actions is updated in every period with probability one) and it follows by their Proposition 3 that monetary trading is not an optimal equilibrium. We shall see below how the presence of costs to operate the center can make monetary trading an optimal equilibrium.
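The value functions and the one-shot deviation requirements can be made concrete for a monetary trading norm. The sketch below is an illustration under explicit assumptions, not the paper's own formulation: each role in a single-coincidence meeting has probability 1/N, production costs are incurred only when the consumer accepts, and the parameter values (Y = {0, 1}, u(y) = 3y, N = 3, q_0 = 1/2) are hypothetical.

```python
from fractions import Fraction
from itertools import product

S = (0, 1)
X = list(product(S, S))  # status profiles x = (x_p, x_c)

def values(B, T, q0, N, beta, u):
    """Solve the two linear value equations for (V_0, V_1) when all follow (B, T)."""
    q = (q0, 1 - q0)
    r = [Fraction(0), Fraction(0)]                   # expected flow utility by status
    P = [[Fraction(0)] * 2 for _ in S]               # status transition probabilities
    for s in S:
        P[s][s] += 1 - Fraction(2, N)                # no-coincidence meeting
        for o in S:                                  # partner's status
            w = q[o] / N
            x = (s, o); a = B(x)                     # s is the producer
            r[s] -= w * a[0] * a[1]
            P[s][T(x, a)[0]] += w
            x = (o, s); a = B(x)                     # s is the consumer
            r[s] += w * u(a[0] * a[1])
            P[s][T(x, a)[1]] += w
    # Cramer's rule for (I - beta * P) V = r
    a11, a12 = 1 - beta * P[0][0], -beta * P[0][1]
    a21, a22 = -beta * P[1][0], 1 - beta * P[1][1]
    det = a11 * a22 - a12 * a21
    return ((r[0] * a22 - a12 * r[1]) / det, (a11 * r[1] - a21 * r[0]) / det)

def incentive_compatible(B, T, q0, N, beta, u, Y):
    """Check that no producer or consumer gains from a one-shot deviation."""
    V = values(B, T, q0, N, beta, u)
    for x in X:
        bp, bc = B(x)
        tp, tc = T(x, (bp, bc))
        on_path_p = -bp * bc + beta * V[tp]
        on_path_c = u(bp * bc) + beta * V[tc]
        for y in Y:                                  # producer deviates to y
            if -y * bc + beta * V[T(x, (y, bc))[0]] > on_path_p:
                return False
        for alpha in (0, 1):                         # consumer deviates to alpha
            if u(bp * alpha) + beta * V[T(x, (bp, alpha))[1]] > on_path_c:
                return False
    return True

# Monetary mechanism with y = 1 (acceptance set to 0 outside x = (0, 1)).
B = lambda x: (1, 1) if x == (0, 1) else (0, 0)
T = lambda x, a: (1, 0) if x == (0, 1) and a == (1, 1) else x
u = lambda y: 3 * y                                  # hypothetical utility function
```

With these hypothetical parameters, the monetary norm satisfies the deviation checks for β = 9/10 but not for β = 1/10: the binding requirement is y ≤ β(V_1 − V_0) ≤ u(y), i.e. acquiring money must be worth the production cost and spending it must be worth the consumption.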
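The welfare comparison between monetary and grim-trigger trading in the no-cost case can be illustrated with a closed form for W. The expression used below, W = Σ_x q_{x_p} q_{x_c} [u(y(x)) − y(x)] / (N(1 − β)), is derived here from stationarity under the assumption that each role has probability 1/N; the function name and parameter values are hypothetical.

```python
from fractions import Fraction

def welfare(y_of_x, q0, N, beta, u):
    """Average expected discounted utility of a stationary norm.

    y_of_x maps each status profile (x_p, x_c) with trade to the quantity traded;
    profiles without trade can be omitted because u(0) - 0 = 0.
    """
    q = (q0, 1 - q0)
    surplus = sum((q[xp] * q[xc] * (u(y) - y) for (xp, xc), y in y_of_x.items()),
                  Fraction(0))
    return surplus / (N * (1 - beta))

u = lambda y: 3 * y                      # hypothetical utility with u(1) > 1
N, beta = 3, Fraction(9, 10)

# Monetary trading: trade only at x = (0, 1); q_0 = 1/2 maximizes q_0 (1 - q_0).
W_money = welfare({(0, 1): 1}, Fraction(1, 2), N, beta, u)
# Grim-trigger: everyone has status 1 and trade occurs at x = (1, 1).
W_grim = welfare({(1, 1): 1}, Fraction(0), N, beta, u)
```

For the same quantity y, grim-trigger yields at least four times the welfare of monetary trading in this sketch, since trade occurs in every single-coincidence meeting rather than in a fraction q_0(1 − q_0) ≤ 1/4 of them.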

The center
The social planner chooses a stationary norm μ = (π, q) to maximize the social ranking of stationary norms (described later) in a decentralized way. The latter means that each individual chooses his actions to maximize his own expected discounted utility; in particular, μ must be such that each individual is happy to follow the decision function B.
The decentralization of μ also requires someone to perform the task of updating status levels; while the social planner could, in principle, perform it, this task is delegated for standard reasons (e.g. lack of time and comparative advantage). The person to whom this task is delegated is called the center. To separate the roles of different people, the center is an agent different from the individuals engaged in trading and who is himself not involved in trading. His role is to update individuals' status levels and he will choose to do so to maximize his payoff; thus, μ must also be such that the center is happy to follow the transition function T.
The center faces costs and receives payments from the individuals. In what follows, we first describe these costs and payments. We then describe the interaction between the center and the individuals, and the role that the social planner plays in that interaction. Finally, we specify the notion of equilibrium that we use and the social ranking of stationary norms that defines the optimum problem that we consider.

Costs
Costs to operate the center may arise for several reasons. For instance, some communication device must be used for the center to know what has happened in every single-coincidence meeting. Such a communication device must be installed, and some time must be devoted to using it. Furthermore, costs may arise not just from the time involved in processing changes to individuals' status levels but also from the complexity of the procedure used to process them, e.g. how difficult it is to write the computer program used for this task, how much computational power is needed, or how much time the employees of the center take to learn its rules.
It is then reasonable to assume that there are costs to set up and operate the center. Setup costs make it conceivable that these costs are partly independent of how frequently status levels are updated. This partial independence of costs from the frequency with which status levels change can also arise when there are fewer potential changes, if this makes it faster to process the outcome of each match and, thus, reduces the time needed for the center to update status levels. In particular, it is reasonable that the least costly way of running the center is when no change to individuals' status levels is ever required, i.e. when the transition $T_{nc}$ defined by $T_{nc}(x, a) = x$ for each (x, a) ∈ X × A is used, as this would require minimal setup costs and no updating costs. Costs can be very small; what is important is that different transition functions, and different trading norms more generally, may have different costs.
To formulate the above, we now assume that it is costly for the center to implement any given stationary trading norm μ = (π, q). Since what is important is to compare the costs associated with different stationary trading norms, these costs will be represented by a real number. Due to the presence of setup costs, these costs may differ across time, and thus, we let them be time-dependent; for each t = 0, 1, 2, . . ., F t (μ) denotes the total cost incurred by the center to implement μ in period t.
A concrete example to illustrate the above is as follows. It uses the following notation: For each trading norm μ = (π, q), let $C(\mu) = \{(x, a) \in X \times A : T(x, a) \neq x\}$ be the set of status-action profiles at which the status profile changes.
Example Suppose that, in addition to the N perishable goods, there is another perishable good, good 0. Good 0 cannot be produced: The center has an endowment W > 0 of good 0 in period 0 and zero in subsequent periods; individuals have an endowment w > 0 of good 0 in each period. Good 0 is the only good that the center values: The center's preferences are described by $\sum_{t=0}^{\infty} \beta^t c_t$, where $\{c_t\}_{t=0}^{\infty}$ is his sequence of good 0 consumption. The costs of updating individuals' status levels are in terms of good 0 and are given, for each stationary norm μ and for some $\phi_v, \phi_f > 0$, by $F_0(\mu) = \phi_v \sum_{x : T(x) \neq x} q_x + \phi_f |C(\mu)|$ and $F_t(\mu) = \phi_v \sum_{x : T(x) \neq x} q_x$ for each t ≥ 1. 19 In this example, costs depend on the frequency with which status levels are changed in every period; furthermore, in period 0, costs also depend on the number of status-action profiles at which the status profile changes. This captures how complex the procedure used to process the changes to individuals' status levels is, e.g. how difficult it is to write the computer program used for this task, how much computational power is needed, or how much time the employees of the center take to learn its rules. These setup costs can, however, be insignificant compared to variable costs: This case is obtained by setting, for instance, $\phi_v = 1$ and then setting $\phi_f$ as close to zero as desired.
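To make the cost structure in this example concrete, the following Python sketch computes the change set C(μ), the per-period frequency of status changes, and period-t costs for toy norms. It assumes that period-0 costs add a setup term φ_f |C(μ)| to a variable term φ_v times the update frequency, which is one specification consistent with the payment c(μ) stated in the continuation of the example. The two-status toy norms, the binary action encoding, the uniform distribution q and the parameter values are all hypothetical choices made for illustration.

```python
from itertools import product

# Status profiles x = (producer status, consumer status); toy binary actions
# a = (produce?, accept?). Both encodings are illustrative assumptions.
X = list(product((0, 1), repeat=2))
A = list(product((0, 1), repeat=2))

def change_set(T):
    """C(mu): status-action profiles at which the status profile changes."""
    return {(x, a) for x in X for a in A if T(x, a) != x}

def update_frequency(T, B, q):
    """Mass of status profiles whose on-path transition changes the profile."""
    return sum(q[x] for x in X if T(x, B(x)) != x)

def cost(t, T, B, q, phi_v, phi_f):
    """Assumed cost: variable part every period, setup part only in period 0."""
    setup = phi_f * len(change_set(T)) if t == 0 else 0.0
    return phi_v * update_frequency(T, B, q) + setup

# Hypothetical norms sharing one behavior rule: trade only at profile (0, 1).
B = lambda x: (1, 1) if x == (0, 1) else (0, 0)
T_nc = lambda x, a: x                                   # never change statuses
T_m = lambda x, a: (1, 0) if (x, a) == ((0, 1), (1, 1)) else x
# Same on-path transitions as T_m plus one change that is never reached on path.
T_extra = lambda x, a: (0, 1) if (x, a) == ((1, 1), (1, 1)) else T_m(x, a)

q = {x: 0.25 for x in X}        # hypothetical stationary distribution
phi_v, phi_f = 1.0, 0.5         # hypothetical cost parameters

for name, T in [("T_nc", T_nc), ("T_m", T_m), ("T_extra", T_extra)]:
    print(name, [cost(t, T, B, q, phi_v, phi_f) for t in range(3)])
```

Under these assumed values, T_extra has the same cost as T_m in every period t ≥ 1 but a higher period-0 cost, because its change set strictly contains that of T_m while the on-path update frequency is unchanged.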
We impose three conditions on the sequence of cost functions. (These conditions are satisfied in the above example.) First, the costs of any stationary norm with $T = T_{nc}$ are normalized to zero: For each stationary trading norm μ = (B, T, q) such that $T = T_{nc}$, $F_t(\mu) = 0$ for each t = 0, 1, 2, . . . (condition (10)). Second, costs are nonnegative and non-increasing with time: For each stationary trading norm μ, $F_t(\mu) \geq F_{t+1}(\mu) \geq 0$ for each t = 0, 1, 2, . . . (condition (11)). This condition captures, in particular, the case where setup costs decrease to zero and variable costs are time-independent. Our third condition, (12), compares the costs of two stationary trading norms μ = (π, q) and μ′ = (π′, q′). In simple terms, this condition says that norms with fewer changes to individuals' status levels cost less. Looking at it in more detail, the condition C(μ) ⊂ C(μ′) means (i) that every status-action profile (x, a) at which there is a change in the status profile under μ is such that a change in the status profile also occurs under μ′ and (ii) that there is a status-action profile (x̂, â) at which there is a change in the status profile under μ′ but not under μ. If â = B′(x̂) = B(x̂) and $q_{\hat{x}} = q'_{\hat{x}} > 0$, then this status-action profile arises in every period both under μ and under μ′, and thus, the cost of the associated change can be thought of as a variable cost; in this case, μ would have lower variable costs than μ′, and one could reasonably impose a stronger cost ranking than the one we require. But none of this is required; in particular, â may differ from B′(x̂), in which case the cost associated with it is either a setup cost or is, in some other way, unrelated to the frequency with which status profiles change when a trading norm is followed. As noted before, this can occur when fewer potential changes make it faster to process the outcome of each match and, thus, reduce the time needed for the center to update status levels. Thus, the frequency with which status levels change is not the only factor determining the center's costs in (12).
However, these alternative factors matter to conclude that μ is less costly than μ′ only if both norms have the same stationary distribution (i.e. q = q′) and every status profile at which there is a transition on the equilibrium path of μ is such that a transition also occurs on the equilibrium path of μ′; thus, in particular, setup and other types of costs matter to conclude that μ is less costly than μ′ only if the frequency with which status levels are changed in every period is no larger under μ than under μ′. This is a way of expressing, in general, that setup and other types of costs can be insignificant relative to variable costs in comparing two stationary norms in terms of their costs.
We consider the examples of Sect. 2.3 to illustrate condition (12). We say that μ is less costly than μ′ if $F_0(\mu) < F_0(\mu')$ and $F_t(\mu) \leq F_t(\mu')$ for each t > 0; this defines a binary relation on stationary norms which is not complete, i.e. some pairs of stationary norms cannot be compared. Condition (12) allows us to compare some stationary norms. For example, (i) μ = (π, q) with $q_1 = 1$ and $T = T_{nc}$ is less costly than $\mu_G$; (ii) μ = (π, q) with $q_0 \in (0, 1)$ and $T = T_{nc}$ is less costly than $\mu_M = (\pi_M, q)$; (iii) $\mu_M = (\pi_M, q)$ with $q_0 = 1/2$ is less costly than $\mu_{GM}$; and (iv) $\mu_{GM}$ is less costly than $\mu_P$. 20 Intuitively, in (i)-(iv), there are fewer contingencies in which there is a change in status profile in the least costly norm; also, in these examples, the least costly norm is easier to write down as it requires fewer conditions in the definition of the transition function.
Condition (12), however, does not allow us to compare all stationary norms. This holds, for example, in the case where μ = μ G and μ is some

The social planner
The social planner has two roles: First, to choose a stationary norm μ = (π, q) (to maximize the social ranking of stationary norms described in the next section) and, second, to monitor and incentivize the center to follow it. As we shall discuss in the remark at the end of this section, the social planner can be thought of as a group of individuals. We start, however, by describing it as a god-like entity to avoid discussing incentive problems it might face; in other words, the social planner will not be a player in the game played by the center and the individuals that we shall now describe.
The interaction between the center and the individuals takes place as follows. Given the stationary norm μ chosen by the social planner, the initial, period 0, status level of each individual is determined according to the distribution q. (In particular, q is the initial distribution of status levels.) In each period t ≥ 0 and after the individuals have been matched in pairs, in each single-coincidence match, the producer and consumer make their choices (of how much to produce and of whether or not to accept the trade, respectively) and, then, the center chooses the next-period status level of each of these two individuals. In no-coincidence meetings, both individuals are forced to choose a zero production level and the center is forced to keep the status levels of the two individuals in that meeting unchanged. These actions are observed by all, i.e. by each individual, the center and the social planner.
In the attempt to incentivize the center to follow μ, the social planner has to monitor the outcome path. Specifically, for each period t ≥ 0 and history $h^t$, the social planner needs to determine whether or not μ has been followed in the past. A period-t history $h^t$ consists of individuals' states in periods 0, . . . , t, matches in periods 0, . . . , t and individuals' actions in periods 0, . . . , t − 1. A stationary norm μ has been followed in the past at history $h^t$ if, for each x ∈ X with $q_x > 0$, the fraction of individuals in single-coincidence matches who move from state profile x to T(x, B(x)) in periods 0 ≤ t′ ≤ t − 1 equals 1.
The monitoring task of the social planner is simple. Assume that the center chooses, at every history $h^t$ with t ≥ 0, to update status levels in a symmetric way (e.g. because it is prohibitively expensive to do otherwise), i.e. by choosing a transition function $T_{h^t} : X \times A \to X$, which is not necessarily equal to the transition function T which is part of μ. 21 Then, the social planner needs to sample, for each x ∈ X with $q_x > 0$, only one match with state profile x. Indeed, with such a sample and with probability 1, the social planner correctly finds out whether or not the trading norm has been followed. In particular, monitoring the center is considerably simpler than updating individuals' status levels, and this provides a justification for the delegation of the latter task from the social planner to the center.
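The sampling argument can be sketched in Python as follows. The sketch assumes, as in the text, that the center applies a single transition function to all matches in a period and that individuals choose according to B, so that one sampled match per status profile with q_x > 0 reveals whether the prescribed transition was used. The function and variable names, and the two-status toy norm, are hypothetical.

```python
def norm_followed(T_used, T, B, q):
    """Check one sampled match per on-path status profile.

    T_used is the transition function the center actually applied; because it
    is applied symmetrically, the outcome of a single match with profile x
    equals T_used(x, B(x)) for every match with that profile.
    """
    return all(T_used(x, B(x)) == T(x, B(x)) for x in q if q[x] > 0)

# Hypothetical two-status illustration.
q = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
B = lambda x: (1, 1) if x == (0, 1) else (0, 0)
T = lambda x, a: (1, 0) if (x, a) == ((0, 1), (1, 1)) else x
T_lazy = lambda x, a: x                       # center never updates
# Differs from T only after an action that B never prescribes.
T_offpath = lambda x, a: (0, 0) if (x, a) == ((1, 1), (1, 1)) else T(x, a)

print(norm_followed(T, T, B, q))          # prescribed transitions were used
print(norm_followed(T_lazy, T, B, q))     # missing update at (0, 1) is detected
print(norm_followed(T_offpath, T, B, q))  # off-path differences go unobserved
```

The last case reflects that the definition of "followed in the past" only involves transitions from x to T(x, B(x)), i.e. transitions that occur when individuals choose according to B.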
The incentives for the center to follow μ are then provided as follows: If μ has been followed in the past, then the center receives in period t an amount $c(\mu) \in \mathbb{R}_+$; otherwise, the center receives nothing. The interpretation is that c(μ) is a stationary measure of the costs associated with μ, which are ultimately paid by the individuals. For concreteness, we specify that this amount is collected from the individuals by the social planner and then given to the center; this is illustrated in the following example, which also illustrates a more decentralized way of paying c(μ) to the center.

Example (continued)
In the context of the example of Sect. 2.5.1, and given a stationary norm μ, the payment the center receives when he follows T is $c(\mu) = \phi_v \sum_{x : T(x) \neq x} q_x + (1 - \beta)\phi_f |C(\mu)|$ in each period and in units of good 0; c(μ) is paid to the center by the social planner after collecting c(μ)/N units of good 0 from each individual (recall that the measure of the total population is N). Making c(μ) time-independent implies that the individuals still face a stationary environment even in the presence of time-dependent costs. In this example, the center has to pay the initial setup cost $\phi_f |C(\mu)|$, thus obtaining a sequence of consumption of good 0 with $c_0 = W - \beta \phi_f |C(\mu)|$ and $c_t = (1 - \beta)\phi_f |C(\mu)|$ for each t ≥ 1, which gives him the same utility as his endowment $\{\bar{c}_t\}_{t=0}^{\infty}$, where $\bar{c}_0 = W$ and $\bar{c}_t = 0$ for each t ≥ 1. 22
One way to decentralize this payment is to consider a more detailed interaction between the center and the individuals, whereby, at the beginning of each period, the social planner announces whether or not μ has been followed in the past and then individuals choose whether or not to pay c(μ)/N units of good 0 to the center (the social planner's announcement now concerns only the single-coincidence matches in which both individuals have paid). 23 Consider strategies such that: (i) the center does not change the status profile in any single-coincidence match where at least one of the two individuals has failed to pay, (ii) each individual pays c(μ)/N to the center if and only if the social planner announces that μ has been followed in the past and (iii) in single-coincidence matches where some individual did not pay, the producer chooses not to produce. Assume that the extra utility of consuming z units of good 0 in a period is ξz independently of the level of consumption and production of the other N goods, where ξ > 0. 24 Then, the choices described in (ii) and (iii) are optimal provided that $\min_{s \in \{0,1\}} V_s(\mu) \geq \xi c(\mu)/N$. 25,26 The choice described in (i) is also optimal as making no changes to status levels is the least costly way of updating them. 27
23 Recall the timing: First, matches are realized, then individuals make their choices, and then the center chooses how to update states; now, before matches have been realized, the social planner makes the announcement and each individual chooses to pay or not.
24 More explicitly, the overall period utility of an individual of type n who consumes z units of good 0 and y′ ∈ Y units of good n, and produces y ∈ Y units of good n + 1 (modulo N), is u(y′) − y + ξz.
25 To see this, suppose that an individual chooses not to pay and that this is a one-shot, unilateral deviation. Since the center will then choose not to update status levels in the match of the deviating individual, it follows that no production will take place in such a match. Hence, the deviating individual saves the payment of c(μ)/N units of good 0 but forgoes trade in his current match. Comparing the resulting continuation payoff with the one he obtains by paying, where s ∈ {0, 1} is his status level at the time of the deviation, shows that the deviation is not profitable if $\min_{s \in \{0,1\}} V_s(\mu) \geq \xi c(\mu)/N$.
26 If this formalization replaces the one where the social planner collects payment from each individual, then such a voluntary participation constraint needs to be added as an equilibrium condition. Our main results, however, continue to hold as currently stated.
27 The center is still assumed to update status levels in a symmetric way, now meaning a transition function $T_{h^t} : X \times A \times \{\text{pay}, \text{not pay}\}^2 \to X$.
In general, we assume that the center's preferences are represented by $\sum_{t=0}^{\infty} \delta^t v(\gamma_t)$ for each (bounded, real-valued) sequence $\{\gamma_t\}_{t=0}^{\infty}$ of net payments, where δ ∈ (0, 1) and $v : \mathbb{R} \to \mathbb{R}$ is continuous and strictly increasing, with v(0) = 0. The amount c(μ) is then defined to be the smallest c ∈ ℝ such that, for each t = 0, 1, 2, . . ., the center weakly prefers the sequence of net payments $\{c - F_k(\mu)\}_{k=t}^{\infty}$ to receiving zero net payments forever, i.e. $\sum_{k=t}^{\infty} \delta^{k-t} v(c - F_k(\mu)) \geq 0$.
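As a numerical check of the payment in this example, the Python sketch below verifies that, with linear utility and discount factor β, the stated c(μ) makes the discounted sum of net payments from period 0 equal to zero, keeps it nonnegative from a later period, and that a slightly smaller payment would leave the center worse off than never updating. The period-t cost specification (a setup cost φ_f |C(μ)| in period 0 only, plus a variable cost φ_v times the update frequency in every period) and all parameter values are assumptions made for illustration, and the infinite sum is truncated at a long finite horizon.

```python
def payment(freq, n_changes, phi_v, phi_f, beta):
    """c(mu) as stated in the example."""
    return phi_v * freq + (1 - beta) * phi_f * n_changes

def discounted_net_payments(c, start, freq, n_changes, phi_v, phi_f, beta,
                            horizon=2000):
    """Truncated sum over k >= start of beta**(k - start) * (c - F_k)."""
    total = 0.0
    for k in range(start, start + horizon):
        # Assumed costs: variable part each period, setup part in period 0.
        F_k = phi_v * freq + (phi_f * n_changes if k == 0 else 0.0)
        total += beta ** (k - start) * (c - F_k)
    return total

# Hypothetical values: update frequency 0.25 and |C(mu)| = 1.
freq, n_changes, phi_v, phi_f, beta = 0.25, 1, 1.0, 0.5, 0.9
args = (freq, n_changes, phi_v, phi_f, beta)
c = payment(*args)

pv_from_0 = discounted_net_payments(c, 0, *args)
pv_from_1 = discounted_net_payments(c, 1, *args)
pv_underpaid = discounted_net_payments(c - 0.01, 0, *args)
print(c, pv_from_0, pv_from_1, pv_underpaid)
```

In this linear case the period-0 constraint is the binding one, which is why the setup cost enters c(μ) as the annuity (1 − β)φ_f |C(μ)|.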
Thus, c(μ) is the smallest stationary per period payment to the center that incentivizes him to follow his part of the trading norm μ. 28 The following lemma states two implications of the above assumptions.

c(μ) exists and belongs to
Remark The interaction between the individuals and the center can be fully decentralized, and thus, the social planner can be completely eliminated. In the above example, we described how the payment from the individuals to the center could be decentralized. Here, we focus on the monitoring of the center and assume, for simplicity, that the payment to the center is obtained via lump-sum taxes on the individuals. To decentralize the monitoring of the center, two individuals are selected (at period 0 or in each period, randomly or deterministically) and each performs the statistical test (i.e. samples, for each x ∈ X with $q_x > 0$, one match with state profile x) independently of the other. If both observe T(x, B(x)) for each x ∈ X, then it is concluded that μ has been followed in the past and the payment c(μ) is made to the center; otherwise, the taxes stop being collected and paid to the center. Each of the two individuals has an incentive to report truthfully if the other is doing so. 29

Equilibrium
Summarizing the above, we have defined a repeated game with complete information with the following elements. The players are the center and the individuals. In each period, the actions available are: Consumers in single-coincidence matches choose 0 (accept) or 1 (reject) and producers choose y ∈ Y; individuals in no-coincidence matches have only one action available: 0 ∈ Y; the center chooses status levels for those individuals in single-coincidence matches according to a function T : X × A → X; in no-coincidence matches, the center has only one choice, namely $T_{nc}$. Finally, payoffs are as follows. Each individual's payoff is the expected discounted utility received in his sequence of matches. We assume that the payment of c(μ) affects periodwise utility in a separable way as in the above example, so that the incentive conditions for each individual in a single-coincidence match remain (7) and (8) at histories where the strategy profile recommends that μ is followed in all subsequent periods. 30 As for the center: If, at a given history $h^t$, t ≥ 0, μ has been followed in the past, then following μ from period t onwards means that the center receives $\sum_{k=t}^{\infty} \delta^{k-t} v(c(\mu) - F_k(\mu))$. In addition, we specify that if a deviation at $h^t$ implies that, in the subsequent periods, μ has no longer been followed in the past, then the center's payoff is at most 0. This means that any choice of the center leads to a sequence $\{\gamma_k\}_{k=t}^{\infty}$ of nonnegative costs, and thus, the center's payoff is $\sum_{k=t}^{\infty} \delta^{k-t} v(-\gamma_k) \leq 0$ since the payment c(μ) is no longer received. Moreover, we specify that if such a deviation by the center consists in $T_{nc}$ being chosen at $h^t$ and onwards, then the center's payoff is 0 in line with (10), i.e. choosing $T_{nc}$ leads to minimal costs which have been normalized to zero, so that $\gamma_k = 0$ for each k ≥ t.
28 It may be that the center is, ex ante, no different from the individuals engaged in trading. In this case, we would require that the ex ante expected utility of the center be equal to that of an individual engaged in trade. This would change the value of c(μ) but not our results.
29 The details, including the importance of having a group of two individuals, are as follows. Consider a non-autarkic stationary norm. If, at some history, the payment to the center stops, then the center stops updating status levels and, consequently, producers stop producing, i.e. the outcome will be autarkic. If there is only one individual in the group, then he would have an incentive to deviate at a history where the center has failed to update status levels according to T since, by continuing to pay the center instead of following the strategy by stopping that payment, he obtains a higher payoff from the non-autarkic norm than from the autarkic outcome. In contrast, in a group of two, such a deviation is no longer profitable for each individual in the group because the other will stop the payment regardless of his choice.
A strategy for each player is a mapping from histories into actions. Thus, for each history and realization of future matches, a strategy determines a sequence of actions for the center and the individuals which, in turn, determines their payoffs.
We then say that a stationary trading norm μ is an equilibrium if there is a subgame perfect equilibrium of this repeated game such that, for each t = 0, 1, 2, . . ., for each history in which μ has been followed in the past, and each single-coincidence match in period t, the center uses T to update individuals' status levels and individuals choose according to B. In other words, μ is followed on the equilibrium path of some subgame perfect equilibrium of this repeated game. Recall that the set of equilibria is denoted by E. Equilibria in this model have a simple characterization.

Lemma 2 A stationary trading norm μ = (π, q) is an equilibrium only if (7) and (8) hold and, for each (x, a) ∈ X × A, condition (13) holds. Conversely, a stationary trading norm μ = (π, q) is an equilibrium if (7), (8) and (13) hold and, for each x ∈ X, condition (14) holds.
In the particular case where $q_0 \in (0, 1)$, which holds in any monetary trading norm and, as we will show, in any equilibrium trading norm other than autarky, (14) is trivially satisfied. Thus, in this case, a stationary trading norm is an equilibrium if and only if the individuals' incentive constraints (7) and (8) hold and the center's incentive condition (13) holds. We interpret (13) as stating that memory is incentive-feasible. It arises because the presence of costs that are independent of the frequency with which status profiles change implies that changes to individuals' status levels that are never used (namely, those off the equilibrium path) should be discarded, thus reducing the costs that the center faces.
We note that the necessity of (13) for a stationary norm to be an equilibrium allows for the possibility that individuals may slightly tremble when choosing their actions. For instance, if the probability that each individual trembles (i.e. chooses an action different from B(x) at status profile x) is sufficiently small, independent across individuals and such that the exact law of large numbers holds (see, e.g. Sun 2006; Podczeck 2010), then it still pays for the center to discard the changes described in (13): While these changes are sometimes used, they happen very infrequently. Intuitively, (13) formalizes the view expressed in Rubinstein (1986) and cited in the introduction that "social institutions, various types of organizations, and human abilities degenerate or are readily discarded if they are not used regularly." As also noted in the introduction, evidence for (13) can be found in Bigoni et al. (2014) and Camera and Casari (2014): In the benchmark treatment in both papers, there is evidence that subjects were reluctant to update (or, at least, act upon) information on other individuals' past behavior when off the equilibrium path. This is also evident in Camera and Casari (2018), where subjects could, by incurring a small cost, make the action of the player with whom they were matched publicly known; despite this option, subjects did not always report deviations and, in fact, the option did not help in supporting an efficient outcome.

Optimum problem
Recall that the optimum problem is described by a social ranking ≻ of stationary trading norms and that an optimal equilibrium is μ* ∈ E such that there is no μ ∈ E with μ ≻ μ*. In this section, we describe the conditions we impose on the social ranking of stationary trading norms.
When the cost of operating the center is paid by the individuals, as described in the examples of Sects. 2.5.1 and 2.5.2, these costs should be part of the social ranking of stationary trading norms. In fact, in these examples, each individual pays c(μ)/N units of good 0 in every period, and thus, the average expected discounted utility of a stationary norm decreases with its cost. 31 In light of this, we assume that, for each pair of stationary norms μ = (π, q) and μ′ = (π′, q′), if q = q′, W(μ) = W(μ′) and μ is less costly than μ′, then μ ≻ μ′. This condition says that if two stationary norms have the same stationary distribution and yield the same average expected discounted utility (when the cost c(μ) is not included) but one is less costly than the other, then the former is strictly preferred to the latter.
The second and last condition we impose on ≻ is a symmetry requirement. Note that the status level is an abstraction and, therefore, status levels can be interchanged without changing the outcome of a trading norm. Interchanging status levels means that x ∈ X is mapped into $(1, 1) - x = (1 - x_p, 1 - x_c)$; for convenience, let g : X → X be defined by g(x) = (1, 1) − x for each x ∈ X. We then say that two stationary norms μ and μ̃ are symmetric, denoted by μ S μ̃, if either μ = μ̃ or $\tilde{q}_x = q_{g(x)}$ and $\tilde{B}(x) = B(g(x))$ for each x ∈ X, and $\tilde{T}(x, a) = g(T(g(x), a))$ for each (x, a) ∈ X × A.
We now require that the social ranking of symmetric stationary norms be unchanged: For all stationary norms μ, μ′, μ̃ and μ̃′, if μ S μ̃ and μ′ S μ̃′, then μ ≻ μ′ if and only if μ̃ ≻ μ̃′.
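The relabeling behind the symmetry requirement can be illustrated with a short Python sketch. It implements g(x) = (1, 1) − x and the relabeled transition g(T(g(x), a)) given above, and checks that relabeling twice returns the original transition. The toy transition and the binary action encoding are hypothetical.

```python
from itertools import product

X = list(product((0, 1), repeat=2))   # status profiles (producer, consumer)
A = list(product((0, 1), repeat=2))   # toy actions, an illustrative assumption

def g(x):
    """Interchange status levels: (x_p, x_c) -> (1 - x_p, 1 - x_c)."""
    return (1 - x[0], 1 - x[1])

def relabel(T):
    """Transition of the symmetric norm: (x, a) -> g(T(g(x), a))."""
    return lambda x, a: g(T(g(x), a))

# Hypothetical transition: statuses swap only at profile (0, 1) after (1, 1).
T = lambda x, a: (1, 0) if (x, a) == ((0, 1), (1, 1)) else x
T_sym = relabel(T)

# The only change now happens at the relabeled profile g((0, 1)) = (1, 0).
print(T_sym((1, 0), (1, 1)))
# Relabeling twice recovers the original transition at every (x, a).
print(all(relabel(T_sym)(x, a) == T(x, a) for x in X for a in A))
```

Since g is its own inverse, the relation defined through g is symmetric in the two norms, which is what allows the ranking to be required to be invariant to the relabeling.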
We remark that these two properties are satisfied if (a) is represented by

Optimality of monetary trading
In this section, we consider non-autarkic trading norms and provide a characterization of non-autarkic equilibria and of non-autarkic optimal equilibria.
A non-autarkic trading norm is one where a strictly positive quantity is traded with strictly positive probability. Formally, a stationary norm μ is non-autarkic if there exists x ∈ X such that $q_x > 0$ and y(x) > 0. Any monetary trading norm is non-autarkic and so are all the norms in Sect. 2.3 except for $\mu_C$, i.e. $\mu_G$, $\mu_P$ and $\mu_{GM}$ are all non-autarkic.
The introduction of (13) as an equilibrium condition does not preclude the use of intertemporal incentives to support myopically non-optimal actions. It does, however, change the nature of these intertemporal incentives: Instead of punishing the deviation from a prescribed myopically non-optimal action (such as in the grim-trigger norm μ G ), equilibrium norms have now to reward the choice of such an action. This element is present in any monetary trading norm and also in μ P and μ G M . But is each of them an (optimal) equilibrium?
Before stating our main results, we note that V 1 (μ) ≥ V 0 (μ) holds in all our examples. This amounts just to a normalization and can always be assumed. The following lemma makes this formal by stating that any equilibrium trading norm is symmetric to an equilibrium trading norm with the same average expected discounted utility in which status 1 has the highest value.
Our first main result in this section shows that any non-autarkic equilibrium is, in terms of aggregate quantities traded, observationally equivalent to an equilibrium monetary trading norm yielding the same average expected discounted utility. It states that any non-autarkic equilibrium trading norm with $V_1(\mu) \geq V_0(\mu)$ has the same average expected discounted utility as an equilibrium monetary trading norm with the same stationary distribution and the same traded quantities at each status profile. This means that the aggregate quantities of any non-autarkic equilibrium μ, including the fraction of single-coincidence matches in which there is trade, the quantity traded in such a case and, hence, total output and consumption, coincide with those of an equilibrium monetary trading norm.
It is easy to see why certain non-autarkic trading norms are not an equilibrium. For instance, the grim-trigger norm μ G does not satisfy (13) when the status profile is (1, 1) and the producer chooses not to produce. Thus, it fails to be an equilibrium as the center has no incentive to use its transition function to update individuals' status levels. The monetary trading norm with charity and a vow of poverty μ P satisfies (13) but, nevertheless, fails to be an equilibrium. Specifically, (8) fails as the consumer in a single-coincidence meeting with status profile (1, 1) has an incentive to deviate and, thus, keep, by virtue of (13), a status of 1. Indeed, if he does not deviate, then his next period status level is 0 and his current consumption is also zero.
In general, it is easy to see that any non-autarkic equilibrium μ satisfies y(1, 0) = y(1, 1) = 0 because, as pointed out before, the choice of a strictly positive production level requires the producer to be rewarded with a transition to a more valuable status level. It is also easy to see that $q_0 > 0$ and that y(0, 0) > 0 or y(0, 1) > 0 since μ is non-autarkic. More surprising is that it cannot be that y(0, 0) > 0 and y(0, 1) > 0; in fact, it turns out that y(0, 0) = 0 and y(0, 1) > 0. 32 In summary, by using (7), (8) and (13), we show that any non-autarkic equilibrium has to satisfy y(0, 0) = y(1, 0) = y(1, 1) = 0, 0 < y(0, 1) < u(y(0, 1)), $T_p(0, 1) = 1$, $T_c(1, 1) = 1$ and $q_0 \in (0, 1)$, as in any monetary trading norm, and also $\beta \frac{V_1 - V_0}{1 - \beta} \geq y(0, 1)$ (which is the incentive condition for producers to produce y(0, 1) at single-coincidence meetings with a status profile (0, 1)). In general, the remaining values of the transition function cannot be pinned down. However, and somewhat surprisingly, it turns out that the monetary trading norm with the same stationary distribution as the given equilibrium and with the same production level at single-coincidence meetings with a status profile (0, 1) is an equilibrium and yields the same average expected discounted utility as the given equilibrium. In this sense, monetary trading norms obtain a given average expected discounted utility in a minimally demanding way in terms of individuals' incentives to follow their rules.
Proposition 1, however, imposes no conditions on the transition function of non-autarkic equilibrium trading norms other than $T_p(0, 1) = 1$ and $T_c(1, 1) = 1$. For this reason, there is no guarantee that any non-autarkic equilibrium is observationally equivalent to an equilibrium monetary trading norm in terms of individual quantities consumed and produced. An example where this property fails is provided by the grim-monetary trading norm $\mu_{GM}$ in the case of binary production (i.e. Y = {0, y} where 0 < y < u(y)) and β ∈ (0, 1) such that $\beta \frac{V_1 - V_0}{1 - \beta} = y$. 33 As we next show, the observational equivalence to an equilibrium monetary trading norm in terms of individual quantities consumed and produced holds in any non-autarkic optimal equilibrium. Specifically, using Proposition 1, we now obtain that, whenever there is a non-autarkic optimal equilibrium, optimal equilibria are symmetric to a monetary trading norm. Thus, effectively, there is a unique optimal equilibrium, namely the best equilibrium monetary trading norm.

Corollary 1
If μ* is non-autarkic and an optimal equilibrium, then there is μ̃ ∈ M such that μ* S μ̃.

Some intuition for Corollary 1 is as follows. From Proposition 1, we have that any non-autarkic equilibrium μ satisfies 0 < y(0, 1) < u(y(0, 1)) and $B_c(0, 1) = 1$ as in any monetary trading norm. Moreover, we have that $T_p(0, 1) = T_c(1, 1) = 1$, also as in any monetary trading norm. Corollary 1 strengthens these conclusions when μ is optimal. First, monetary trading norms economize on updating costs as status profiles only change at x = (0, 1) when B(0, 1) is chosen, a change that occurs in every non-autarkic equilibrium. Second, if it were the case that a producer offers to produce at a status profile x ≠ (0, 1) where no trade occurs (which, thus, implies that the consumer is refusing the offer), the consumer would have an incentive to deviate from the prescribed behavior in μ by accepting the producer's offer. These two properties imply that $B_p(x) = 0$ for each x ≠ (0, 1) and T(x, a) = x for each (x, a) ≠ ((0, 1), B(0, 1)), and this shows that any non-autarkic optimal equilibrium must be monetary.
We conclude this section by commenting on the assumption of Corollary 1, which requires the existence of a non-autarkic optimal equilibrium. Intuitively, this property should hold provided that the costs c(μ) of non-autarkic stationary norms μ are not "too big" or not "too important." This is illustrated in the context of the examples in Sects. 2.5.1 and 2.5.2 .

Discussion
Our optimality result for monetary trading requires some constraints on the class of possible trading norms, which we now discuss. The requirement that trading mechanisms be symmetric across individuals seems innocuous, but requiring them to be pure, stationary and binary does not. We shall argue below that the binary requirement is the really restrictive one.
We shall also conjecture that our results extend to the case where individuals' types are not observable.

Stochastic mechanisms
As Berentsen et al. (2002) have shown, monetary trading norms can be made more efficient by allowing for random transitions. However, in our setting where transitions are chosen by a center that incurs costs to update individuals' status levels, it is conceivable that random transitions are more costly than pure ones and that, therefore, the center optimally chooses pure transitions. We present such an extension in what follows.
A trading mechanism is now π = (B, ρ) where ρ : X × A → [0, 1]^2 is a random transition function: For each i ∈ {p, c}, x ∈ X and a ∈ A, ρ i (x, a) is the probability that T i (x, a) = 0. For convenience, let ρ i (x) = ρ i (x, B(x)) for each i ∈ {p, c} and x ∈ X.
In the description of the interaction between the center and the individuals, the condition determining whether or not the center receives the payment c(μ) is now as follows: The stationary norm μ has been followed in the past at history h t if, for each x ∈ X with q x > 0, a fraction ρ i (x) of individuals whose role is i ∈ {p, c} in single-coincidence matches move from state profile x to 0 in periods 0 ≤ t′ ≤ t − 1.
We now specify costs in such a way that no changes to status levels with certainty are preferred to random change. Intuitively, the latter requires the center to perform the randomization and to make changes to a fraction of individuals. Following a stationary trading norm μ = (B, ρ, q) means that the center uses ρ to determine next period's status level of each individual. This yields a sequence of costs {F 0 (μ), F 1 (μ), . . .} analogously to what is described in Sect. 2.5.1.
Suppose instead that the center proceeds as follows: First, it chooses fractions (α p (x), α c (x)) of producers and consumers in single-coincidence meetings with status profile x; second, it changes the status levels of these individuals according to (ρ p (x), ρ c (x)); and third, the status levels of the remaining fraction of individuals are not changed, i.e. the next period's status level of a fraction 1 − α p (x) of producers (resp. 1 − α c (x) of consumers) in single-coincidence meetings with status profile x is x p (resp. x c ). Let F t (μ, α) be the costs incurred by the center in period t by using such a strategy. Note that in the context of this section, the center following μ means that it uses ρ to determine next period's status level of each individual, i.e. α i (x) = 1 for each i ∈ {p, c} and x ∈ X. Moreover, if, for each i ∈ {p, c}, x ∈ X and a ∈ A, ρ i (x, a) ∈ {0, 1} holds in addition to α i (x) = 1, then all transitions are pure and F t (μ, α) is the same as F t (μ) in Sect. 2.5.1.
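The bookkeeping behind this alternative strategy can be illustrated numerically. The sketch below is not taken from the paper: the function names are hypothetical, and it treats the fractions α i (x) and the probabilities ρ i (x) as exact population shares (a large-population assumption). For individuals in a given role whose current status is x i, it computes the share whose next status is 0 and the share whose status actually changes.

```python
def share_at_zero_next(x_i: int, alpha: float, rho: float) -> float:
    """Share of individuals with current status x_i whose next status is 0.

    alpha: fraction selected by the center for a random transition.
    rho:   probability that a selected individual's next status is 0.
    Individuals who are not selected keep their current status x_i.
    """
    if x_i not in (0, 1):
        raise ValueError("status must be 0 or 1")
    if not (0.0 <= alpha <= 1.0 and 0.0 <= rho <= 1.0):
        raise ValueError("alpha and rho must lie in [0, 1]")
    stays_at_zero = 1.0 if x_i == 0 else 0.0
    return alpha * rho + (1.0 - alpha) * stays_at_zero


def share_changed(x_i: int, alpha: float, rho: float) -> float:
    """Share of individuals with current status x_i whose status changes."""
    if x_i not in (0, 1):
        raise ValueError("status must be 0 or 1")
    # A status-1 individual changes when moved to 0 (probability rho);
    # a status-0 individual changes when moved to 1 (probability 1 - rho).
    prob_move = rho if x_i == 1 else 1.0 - rho
    return alpha * prob_move
```

With alpha = 1 the first function returns rho, which corresponds to the center following μ; with alpha = 0 no status level changes.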
In addition to appropriately modified versions of conditions (10)-(12), we now assume that, for each stationary trading norm μ, [...] (17)
Under (17), we then have that any equilibrium μ will be pure: ρ i (x, a) ∈ {0, 1} for each i ∈ {p, c}, x ∈ X and a ∈ A. Indeed, by an appropriate version of (12), status levels do not change off the equilibrium path; thus, to guarantee that μ has been followed in the past [...]. 34 But (17) then implies that this is a profitable deviation and that μ is not an equilibrium.

Non-stationary mechanisms
Focusing on norms that are stationary is potentially restrictive. However, we argue that the real restriction arises from a binary status space.
To see the above, suppose that we consider a set of status levels of the form N × S, where the first component represents time. For example, B(t, x) would then be the action profile to be played in period t in a single-coincidence meeting where the second component of the status profile equals x ∈ X = S^2. Focusing on binary mechanisms means that either (i) S has two elements and both B and T are independent of t or (ii) S has one element, B depends only on whether t is even or odd and T is such that an odd period follows an even one and vice versa, independently of the action played. 35 Proposition 1 characterizes non-autarkic equilibria in case (i). Equilibria in case (ii) are easily characterized: When X has only one element, there cannot be any reward for a producer producing today, and thus, there is a unique equilibrium consisting of autarky. Thus, it is better to have a non-autarkic stationary norm (which will be monetary) than to have a non-stationary one (which will be autarkic).
35 This is so that t is interpreted as time. In general, without this restriction, N would be replaced with {0, 1} to meet the requirement of a binary mechanism and we would be back to case (i).

Non-binary mechanisms
The main question we leave open is whether or not our optimality result extends to the case of a more general set of status levels. In the concluding remarks (Sect. 5), we loosely conjecture that it does extend. Here, instead, we outline some of the difficulties of extending our optimality result beyond the case of binary status levels.
General sets of status levels open up interesting questions. For instance, when the set of status levels is finite and sufficiently large, e.g. S = {0, . . . , m} for some m ∈ N sufficiently large, we can ask whether or not it is optimal to have a "law of one price." Specifically, and assuming that y(x) = y > 0 whenever x ∈ X is such that x p < m and x c > 0, analogously to the definition of monetary trading norms, is it optimal to have x c − T c (x, B(x)) be equal to T p (x, B(x)) − x p (in which case such a common value can be interpreted as the price of y(x) = y units) and be independent of x? It is likely that additional conditions are needed to obtain these conclusions because otherwise it seems better to have such a common value be bigger when x p = 0 or x c = m than otherwise, to obtain a stationary distribution that puts less weight on status levels that imply no trade.
One possibility to overcome this issue is to assume that status levels are private information. We have considered this in a previous paper, Carmona (2016), but allowing only for portable-object mechanisms in which the portable object can only take two values. Another possibility is to assume that the transition matrix of the Markov chain governing the evolution of the distribution of status levels is symmetric. In Carmona (2002a, Chapter 4), this condition, together with conditions analogous to (7), (8) and (13), has been shown to yield, in a model with two individuals but otherwise analogous to the present one, the optimality of monetary trading for general finite sets of status levels.
The above symmetry condition has, however, not been given a meaningful economic interpretation. In this light, the appeal of binary status levels is that they allow us to obtain economically meaningful and intuitively undemanding conditions on the center's costs and on the social ranking of stationary norms that render monetary trading as the optimal trading norm.

Non-observable types
It seems likely that our results extend to the case where types are not observable by adjusting the trading protocol in the following way. In any match, the two individuals start by simultaneously reporting their types. Then, the trading mechanism (i.e. B and T ) applies with respect to the reported types. Each individual has an incentive to report truthfully if all the others do so: The probability of being a producer (which any individual would like to minimize) is always 1/N independently of the report, but an individual who misreports his type has a zero probability of consuming the good that he likes as opposed to 1/N when he reports truthfully. But given truthful reporting, we are effectively back to the case of observable types.

Concluding remarks
In this paper, we provided a setting where all optimal equilibria are indistinguishable from monetary trading. Hence, in this setting, it is no surprise that monetary trading should be observed. Monetary trading is described by particular rules regarding individual trading behavior and updating of individuals' trading history; in particular, it is independent of whether monetary trading is implemented using a specific object or whether it uses electronic money. This might explain the persistence of monetary trading throughout time.
The optimality of monetary trading clearly holds in settings where monetary trading (broadly defined as a form of behavior where each person has a balance, which rises when he gives up goods, and falls when he acquires goods) is efficient, as in the case of centralized Arrow-Debreu markets. This is why we have focused on a setting where trade is difficult and, thus, where the optimality of monetary trading is, in principle, harder to establish. For these reasons, one expects that our optimality result continues to hold in intermediate settings, such as that of Lagos and Wright (2005) and Aliprantis et al. (2007), where trade is still difficult but not as much as under fully decentralized trading and random-matching.
Despite the unresolved issues discussed in Sect. 4, we know that monetary trading (with a rich set of possible status levels/money holdings and a sufficiently high discount factor) is nearly efficient in several settings (e.g. Berentsen 2002; Green and Zhou 2005; van der Schaar et al. 2013; Olszewski and Safronov 2018). It is also the case that several authors have been able to find trading mechanisms that dominate monetary trading. These include Kandori (1992) and Kocherlakota and Wallace (1998) as discussed in the introduction, Kocherlakota (2002) (using a mechanism with two distinct portable objects which are disposable and can be concealed), Ellison (1994), Carmona (2002b), Araujo (2004) and Aliprantis et al. (2007) (all of whom use contagious strategies) and Zhu and Maenner (2012) (which considers an information-updating center as we do, but unlike in our setting, the center does not observe the action taken in any given match; rather, this is communicated by the individuals). The point of this paper is that some intuitive limitations on information updating are enough to make monetary trading an optimal equilibrium. By combining this and possibly additional such limitations with the near-efficiency of monetary trading, it is then possible that this conclusion might hold quite generally.
Despite the extreme nature of some of our assumptions, it is remarkable that economically meaningful conditions imply that monetary trading norms are selected as effectively the unique optimal equilibria. Under these conditions, it is then no surprise that monetary trading persists.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

A.2 Proof of Lemma 1
Fix a stationary norm μ and let [...]. The continuity of v implies that C is closed. If c < 0 and t ∈ N 0 , the monotonicity of v implies that [...]. Since c → c is continuous and C ∩ [0, F 0 (μ)] is compact and non-empty, the first part of the lemma follows. We [...] and, hence, c ∈ C. But this is a contradiction since c(μ) = min C. This contradiction establishes the claim.
Let μ and μ′ be stationary norms satisfying condition (12). We then have that [...], a contradiction to the claim in the previous paragraph. This contradiction then implies that c(μ) < c(μ′).

A.3 Proof of Lemma 2
Suppose that μ = (B, T , q) ∈ E and that (13) does not hold. Define T′ by setting T′(x, B(x)) = T (x, B(x)) and T′(x, a) = x for each x ∈ X and a ≠ B(x), and let μ′ = (B, T′, q). Despite the center playing T′ in every period, it is still the case that, at every history h on the equilibrium path, μ has been followed in the past at h; thus, the payment received by the center is c(μ) in every period. Moreover, (12) implies that F 0 (μ′) < F 0 (μ) and F t (μ′) ≤ F t (μ) for each t > 0. Thus, the center has an incentive to deviate at t = 0, a contradiction to μ ∈ E. This contradiction shows that (13) must hold.
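The modified transition function used in this step keeps every transition that follows the prescribed action and leaves the status profile unchanged after any other action. The sketch below illustrates that construction on made-up data. The helper names, the restriction to binary production with y = 1 and the particular B and T are assumptions for illustration only, not objects from the paper.

```python
from itertools import product

STATUS = (0, 1)
PROFILES = tuple(product(STATUS, STATUS))  # status profiles x = (x_p, x_c)


def freeze_off_path(B, T, actions):
    """Keep T after the prescribed action B(x); leave x unchanged otherwise."""
    return {
        (x, a): (T[(x, a)] if a == B[x] else x)
        for x in PROFILES
        for a in actions
    }


def count_status_changes(T):
    """Number of (profile, action) pairs after which some status changes."""
    return sum(1 for (x, _a), nxt in T.items() if nxt != x)


# Illustrative data (assumed): an action profile is the pair
# (producer's offer, consumer's acceptance), each in {0, 1}.
ACTIONS = tuple(product((0, 1), (0, 1)))
B = {x: ((1, 1) if x == (0, 1) else (0, 0)) for x in PROFILES}
T = {(x, a): x for x in PROFILES for a in ACTIONS}
T[((0, 1), (1, 1))] = (1, 0)  # on path: producer moves to 1, consumer to 0
T[((0, 0), (1, 1))] = (1, 0)  # off path, since B(0, 0) = (0, 0)

T_prime = freeze_off_path(B, T, ACTIONS)
```

In this example the original T changes statuses after two (profile, action) pairs and the modified one after a single pair, which is the sense in which the modified function requires less updating.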
Conversely, define the following strategy. At histories where μ has been followed in the past, the center uses T and the individuals use B; otherwise, the center uses T nc and the individuals use B nt defined by setting B nt (x) = (0, 0) for each x ∈ X . (In words, the center does not change status profiles and the individuals do not trade.) Let μ = (B nt , T nc , q) (note that q is a stationary distribution ofμ).
There are no profitable deviations in period t ≥ 0 at histories in which μ has been followed in the past. For the individuals, this follows by (7) and (8). Regarding the center, no deviation yields him a continuation payoff of $\sum_{k=t}^{\infty} \delta^{k-t} v(c(\mu) - F_k(\mu)) \geq 0$, whereas a deviation yields him a continuation payoff of at most 0.
There are no profitable deviations in period t ≥ 0 at histories in which μ has not been followed in the past. This is clear for individuals. As for the center, no deviation yields him a continuation payoff of 0, whereas a deviation yields him a continuation payoff of at most 0.
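The center's comparison on the equilibrium path can be checked numerically for a given cost sequence. The sketch below makes several assumptions that are not in the paper: the function name is hypothetical, the infinite sum is truncated at the length of the supplied cost list, and v defaults to the identity (so that v(0) = 0). It evaluates the discounted sum of v(c − F k ) from period t onward, which is the quantity compared with the deviation payoff of at most 0.

```python
def continuation_payoff(c, costs, delta, t=0, v=lambda z: z):
    """Truncated sum over k >= t of delta**(k - t) * v(c - costs[k]).

    c:     per-period payment received by the center.
    costs: finite list with costs[k] standing in for F_k (truncation).
    delta: the center's discount factor, assumed to lie in (0, 1).
    v:     the center's payoff function; identity by default (assumption).
    """
    if not 0 <= t < len(costs):
        raise ValueError("t must index an element of costs")
    return sum(delta ** (k - t) * v(c - costs[k]) for k in range(t, len(costs)))


# Example with a higher cost in period 0 and lower costs afterwards.
payoff_from_0 = continuation_payoff(1.0, [1.0, 0.5, 0.5], 0.5)
payoff_from_1 = continuation_payoff(1.0, [1.0, 0.5, 0.5], 0.5, t=1)
```

In this example both values are non-negative, so following the norm is weakly better than any deviation whose payoff is at most 0.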

A.4 Proof of Lemma 3
A standard application of Blackwell's sufficient conditions for a contraction and the contraction mapping theorem establishes the uniqueness of the expected discounted utility function. This is stated formally in the following lemma.

Lemma 5
For each norm μ, the function V : S → R is unique.
Lemma 3 is a particular case of the following result. (This more general result will be used in the proof of Corollary 1.) The first property holds since This shows thatμ is stationary if μ is stationary. For ), establishing the second property.
The third property holds since Turning to the fourth property, let (5) and (6) whenμ is the norm being used and (V 0 , V 1 ) solves (5) and (6) when μ is the norm being used.

A.5 Proof of Proposition 1
A road map of this proof is as follows: Recall that, for each x ∈ X , y(x) = B p (x)B c (x). Claim 1 shows that any state profile x at which y(x) > 0 is such that T p (x) = 1, T p (x, (0, 1)) = 0 and −(1 − β)y(x) + βV 1 ≥ βV 0 . In words, the producer needs to be rewarded with a transition to the high status level in order to produce. Claim 2 then shows that y(1, 1) = y(1, 0) = 0 because the producer already starts with the high status and there are no transitions off the equilibrium path. Claim 3 states that y(0, 0) > 0 or y(0, 1) > 0 since μ is non-autarkic. Claim 4 then shows that T c (1, 1) = 1; indeed, if not, then the consumer could deviate and, because there are no transitions off the equilibrium path, keep the high status. More surprising and less intuitive is Claim 5, which shows that y(0, 0) = 0 or y(0, 1) = 0; its proof shows that the only way to have both y(0, 0) > 0 and y(0, 1) > 0 and satisfy the equilibrium conditions is to have q 1 = 1, but this would mean that the norm is autarkic as y(1, 1) = 0. Claim 6 then shows that q 0 > 0 and that q s > 0 and u(y(0, s)) > y(0, s) where s ∈ {0, 1} is such that y(0, s) > 0; this is so because μ is non-autarkic and, at this stage, we know that either y(0, 0) > 0 or y(0, 1) > 0 but not both. Claim 7 then shows that it must, in fact, be the case that y(0, 0) = 0 and y(0, 1) > 0 since, otherwise, status 0 would be the high status instead of status 1. At this stage, we have y(0, 1) > 0, y(0, 0) = y(1, 0) = y(1, 1) = 0, T p (0, 1) = 1, T c (1, 1) = 1 and q 0 ∈ (0, 1). To complete the argument, we show that if μ is a non-autarkic equilibrium with V 1 ≥ V 0 , then the monetary trading norm with the same stationary distribution as the given equilibrium and the same production at single-coincidence meetings with a status profile (0, 1) is an equilibrium and yields the same average expected discounted utility as the given equilibrium.
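The last step keeps the stationary distribution of the given equilibrium. One way to see why this can work is to check the status flows under a monetary trading norm, where statuses change only at single-coincidence meetings with profile (0, 1). The sketch below rests on assumptions that are not spelled out in this section: p is a hypothetical probability of being the producer in a single-coincidence meeting, the partner's status is drawn independently from (q 0 , q 1 ), and the function names are illustrative.

```python
def status_flows(q1: float, p: float):
    """Per-period mass moving from status 0 to 1 and from status 1 to 0.

    q1: share of individuals with status 1 (so q0 = 1 - q1).
    p:  assumed probability of being the producer in a single-coincidence
        meeting.
    """
    q0 = 1.0 - q1
    # Mass of meetings between a status-0 producer and a status-1 consumer.
    trade_meetings = p * q0 * q1
    inflow_to_1 = trade_meetings     # each such producer moves from 0 to 1
    outflow_from_1 = trade_meetings  # each such consumer moves from 1 to 0
    return inflow_to_1, outflow_from_1


def next_share_high(q1: float, p: float) -> float:
    """Share of individuals with status 1 in the next period."""
    inflow, outflow = status_flows(q1, p)
    return q1 + (inflow - outflow)
```

Because every trade moves one individual up and one down, the two flows coincide and the share with status 1 is unchanged for any q 1 under these assumptions.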
Since μ is non-autarkic, Claim 2 implies that [...]
For convenience, let V * = V (μ M ). To complete the proof of the proposition, we will show, by considering several cases regarding the remaining values of T , that [...]. This inequality, together with Lemma 4, then establishes that (π M , q) ∈ E and completes the proof of the proposition. Suppose that T p (1, 0) = 0 or T p (0, 0) = 1 or T c (0, 1) = 0 or T c (0, 0) = 1. We then have by (20) that [...]. Thus, using (18), [...]. This inequality, together with (22), implies that (23) holds. It follows by the above that we may assume that [...]. Furthermore, by (20), [...] y(0, 1)) + q 1 y(0, 1)]. We consider the following four cases.