Before we look at the theoretical predictions about specific comparisons, we present an overview of the choice behavior over all treatments in Fig. 1.Footnote 9 We then present our results on each of the predictions. We compare the efficiency of AGV and SM when we test Prediction 3.
In the summary overview of all binary choices in Fig. 1, we see some indications of the expected effects. In Fig. 1a, efficiency seems to matter in the ex ante mechanism choices. Subjects are close to indifferent between NSQ and RAND in the Symmetric and Robustness treatments, where the ex ante expected value of implementation is (close to) zero, and more subjects favor NSQ (RAND) in the Left-skewed (Right-skewed) treatment, which has a negative (positive) expected value. If we order the treatments by the relative efficiency of NSQ and RAND, we find the same order as on the lower axis of Fig. 1a. In line with Prediction 1.1, subjects overwhelmingly choose the more efficient, active mechanisms (SM and AGV) over the two passive ones. The choices between SM and AGV are close to a 50/50 split: SM is somewhat preferred in two treatments, whereas AGV is somewhat preferred in the Left-skewed treatment. In this last treatment, sending a message of − 7 acts as a veto, which could provide a clear limit to the risk this mechanism poses to subjects.
The ad interim choices in Fig. 1b also appear largely in line with expectations. Prediction 1.2 states that subjects with a negative valuation prefer NSQ (the Myerson–Satterthwaite impossibility). This is clearly visible in the agglomeration of markings in the south-west corner of the negative-valuation panel of Fig. 1b. This choice pattern is completely absent for subjects with a positive valuation, nor can it be found in the ex ante choices of the same subjects. It is also striking that this clustering on the inefficient mechanism only happens if the mechanism is safe (NSQ). In line with Prediction 1.3, subjects are much more likely to choose SM or AGV over RAND, regardless of whether they have a positive or negative valuation.
Figure 1a shows that subjects prefer the AGV over SM most in the Left-skewed treatment, whereas in the binary choice between NSQ and RAND no difference between treatments is found. This indicates that Prediction 2 is unlikely to be supported by our data. The figure does not show much about the other predictions.
Theoretical predictions, Prediction 1
In Table 3, we use logistic regressions to test Predictions 1.1–1.4. Throughout the paper, we cluster standard errors at the matching-group level (the largest group in the experiment that subjects could be matched with and thus could share some common history with) or at the treatment level, and we use a sandwich estimator for the variance–covariance matrix based on Cameron et al. (2012).
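To make the clustering concrete, the sandwich estimator for a logit can be sketched in a few lines. The following is a minimal, numpy-only illustration (function names and simulated data are ours, not the paper's; a real analysis would use the small-sample corrections of Cameron et al. (2012) as implemented in standard packages):

```python
import numpy as np

def fit_logit(X, y, iters=50):
    """Fit a logistic regression by Newton-Raphson; returns coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                         # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def cluster_robust_cov(X, y, beta, clusters):
    """Sandwich variance estimator with scores summed within clusters."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    bread = np.linalg.inv(X.T @ (X * (p * (1.0 - p))[:, None]))
    scores = X * (y - p)[:, None]                    # per-observation scores
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        s_g = scores[clusters == g].sum(axis=0)      # cluster score sum
        meat += np.outer(s_g, s_g)
    return bread @ meat @ bread
```

The key step is that scores are summed within each matching group before forming the "meat" of the sandwich, which allows arbitrary correlation of errors inside a cluster.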
To test Prediction 1.1, we first define a dummy variable that is set to 1 if the subject selected the mechanism that is theoretically most efficient (we code the comparison RAND-NSQ as missing in the Symmetric treatment because these mechanisms have the same expected efficiency). Column (1) shows that subjects indeed do not respond to their valuation before they know it. More importantly, the coefficients on the full set of treatment dummies are positive and highly significant, indicating that subjects in every treatment prefer the efficient mechanism in the ex ante stage, as stated in Prediction 1.1.
Column (2) relates to Prediction 1.2. It shows ad interim subject-periods with a choice between NSQ and some other mechanism. As predicted by the Myerson–Satterthwaite impossibility, subjects with a negative valuation are much more likely to choose NSQ. The hypothesis that the sum of the treatment-specific constant and the coefficient on Negative Value is equal to zero is rejected in all treatments (\(\chi ^2\)-tests, \(p<0.001\) in all cases). Note the stark contrast with column (1), where both the treatment dummies and the Negative Value dummy have significantly smaller coefficients. The Myerson–Satterthwaite theorem does not state that types with a negative value prefer NSQ on average; it states that all types with a negative valuation prefer NSQ. Therefore, column (3) repeats the regression with a full set of valuation dummies (we drop the Symmetric treatment dummy for identification). The dummies for types \(-\,7\) and \(-\,2\) and the corresponding observations are dropped because those types are perfectly predicted to select NSQ (see Fig. 1 and Online Appendix B.1.2). The coefficients on all negative valuations are positive and significant, whereas the coefficients on all positive valuations are negative and significant. Value \(-\,1\) is the marginal type in the type space and has the smallest positive coefficient. In \(\chi ^2\)-tests against the restriction that the treatment dummies and the coefficient on Value \(-\,1\) add to zero, the null is rejected in all treatments (Right-skewed \(p<0.001\), Left-skewed \(p=0.005\), and Robustness \(p=0.039\)). The pattern of Prediction 1.2 is clearly visible in the choices made by our subjects for all treatments and types.
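The restriction tests above (that a treatment dummy and a value dummy sum to zero) are standard Wald \(\chi^2\)-tests on a linear combination of coefficients. A minimal sketch, with hypothetical inputs, could look like this:

```python
import math
import numpy as np

def wald_chi2(beta, vcov, R):
    """Wald statistic for the single linear restriction R @ beta = 0."""
    R = np.asarray(R, dtype=float)
    return float(R @ beta) ** 2 / float(R @ vcov @ R)

def chi2_1df_pvalue(stat):
    """p-value of a 1-d.f. chi-squared statistic via the normal tail."""
    return math.erfc(math.sqrt(stat / 2.0))
```

Here `beta` and `vcov` stand in for the estimated coefficients and their cluster-robust covariance matrix, and `R` picks out, e.g., the treatment dummy and Value −1 with weights (1, 1).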
In column (4), we look at the choice between flipping a coin and voting. Prediction 1.3 says that in these choices, all types should prefer AGV or SM to RAND. Indeed, we find that the coefficients on all types are negative and highly significant, indicating that RAND is not preferred. We repeat the \(\chi ^2\)-tests on the sum of the treatment dummy and the smallest coefficient for a type present in that treatment. The Symmetric treatment is the baseline, so the coefficients measure the marginal effects, and they are all significant and negative as predicted. In the Right-skewed treatment, \(Value\; -\,3 + Right\text{-}skewed = 0\) yields \(\chi ^2=11.24\), \(p<0.001\). In the Left-skewed treatment, \(Value\; 3 + Left\text{-}skewed = 0\) yields \(\chi ^2=2.22\), \(p=0.1364\). In the Robustness treatment, \(Value\; -\,3 + Robustness = 0\) yields \(\chi ^2=0.27\), \(p=0.61\). Over all treatments, the pattern of Prediction 1.3 appears visible. However, if we check at the individual-type level on which the prediction is made, we find null results in two treatments.
In column (5), we examine which types prefer AGV to SM. To have enough statistical power, we create a single dummy that identifies the types that Grüner and Koriyama (2012) predict prefer the AGV over SM in the ad interim stage. The coefficient on the dummy AGV-pref (GK) is positive, as stated by Prediction 1.4, and highly significant. Testing the restriction that the treatment dummies plus the coefficient on AGV-pref (GK) equal zero yields: \(\chi ^2=52.78\), \(p<0.001\) in the Symmetric treatment; \(\chi ^2=31.45\), \(p<0.001\) in the Right-skewed treatment; \(\chi ^2=3.49\), \(p=0.06\) in the Left-skewed treatment; and \(\chi ^2=13.04\), \(p<0.001\) in the Robustness treatment. The pattern suggested by Prediction 1.4 is clearly identified over the treatments, but is only marginally significant in the Left-skewed treatment.
The statistical noise we expect from our data asymmetrically affects theoretical predictions that indicate implementation problems and those that indicate potential solutions, for two reasons. The first is directly observable in our data. In our statistical tests, the problematic pattern of Prediction 1.2 emerges very strongly. In fact, we have to drop some observations in column (3) because the estimation cannot handle perfectly predicted outcomes. We do not see similarly strong patterns in the potential solutions to the impossibility in columns (4) and (5). In part, the fact that the latter predictions are not as clear-cut as the Myerson–Satterthwaite impossibility result is due to statistical power: we go from 1710 observations in 15 clusters in column (1) to only 150 observations in column (5). However, the theoretical predictions are made with certainty for all types, a degree of uniformity that is unlikely to be found in any real-world setting or in the lab. Secondly, in many situations, we need all individual players or a qualified majority to accept a change in the rules. In a consensus or veto situation, a single opposing vote suffices to prevent the implementation of efficient mechanisms. If we find even a weak pattern in line with Prediction 1.2, this could be enough to prevent efficient mechanisms from being adopted. The opposite holds for Predictions 1.3 and 1.4: if we want the efficient mechanism to be voluntarily adopted, these predictions have to hold perfectly for all types. Any statistical noise around the prediction makes implementing efficient mechanisms more difficult. Since the results of these more qualified predictions are not as clear-cut, empirical tests are needed to make sure such predictions are borne out in real life.
Social concerns, Prediction 2
In the Right-skewed and Robustness (Left-skewed) treatments, the AGV transfers are paid by subjects with a valuation of 7 (− 7) for the public project to subjects with a negative (positive) valuation for the public project. These payments reduce (increase) ex post inequality after implementation of the project. Based on the results in Engelmann and Grüner (2017), Bierbrauer et al. (2017), and Bol et al. (2020) that subjects value fairness, we could expect subjects to prefer the AGV more in the Right-skewed and Robustness treatments than in the Left-skewed treatment. In Table 4, we test this prediction by looking at the choices for AGV in a logistic regression with the dummy Tax the winner set to 1 for the Right-skewed and Robustness treatments, and to 0 for the Left-skewed treatment. We interact Tax the winner with a dummy indicating the ad interim rounds to see whether the social concerns matter more ad interim or ex ante. In column (1), we look at all decisions of all types and see that, ad interim, the AGV is chosen less often than ex ante. The main effect and the interaction effect of the Tax the winner dummy are insignificant (and of opposing sign). Since the strength of preferences depends directly on the valuation for the public project, the types with extreme preferences could drive the null result. In column (2), we therefore repeat the analysis using only types with a valuation of − 1 or + 1. This does not change the sign or significance of the coefficients on Tax the winner. Contrary to Prediction 2, there does not appear to be a prosocial tendency in the mechanism choices in our data.
The difference between our findings and those of experiments that do find social concerns can be explained by a number of factors. For instance, subjects might not perceive enough difference in fairness between the mechanisms, since they all have similar procedural fairness. Alternatively, the one-third probability that the mechanism choice has direct effects on the experiment, and thus on monetary payoffs, might overwhelm social concerns. The random, anonymous rematching used in this experiment restricts personal relations, dynamic strategies, and direct reciprocity, further reducing the potential for social concerns. Random rematching and random dictator choices clearly reduce the scope of the social concerns, but they are common in similar experiments that do find social concerns in group decision settings.Footnote 10 Our results thus put some bounds on the strength of these social concerns and/or on the characteristics of the mechanisms that matter for the expression of social concerns by our subjects, but they do not exclude social concerns per se.
Realized surplus, Prediction 3
Predictions 1.1–1.4 assume that all subjects play Bayes–Nash strategies when determining the relative payoffs of the mechanisms. However, as we know from other experiments and experience, individuals seldom perfectly adhere to Bayes–Nash strategies, and the realized payoffs of the mechanisms are an empirical matter. With different payoffs, we should expect different rankings of the mechanisms for an income-maximizing subject. Prediction 3 focuses on these differences between theoretical and empirical payoffs and on how they affect mechanism choice.
We compare the realized payoff of each mechanism based on the behavioral strategies and the objective probability distribution over the vector of types, rather than the average surplus in the lab. The surplus obtained in the lab is strongly influenced by the realizations of the private valuations and by the mechanism choices of the random dictator, which can distort the comparison. Therefore, we use the observed distribution of reports/votes made by subjects to determine behavioral strategies for each treatment-type. Using these behavioral strategies, we calculate the payoffs and surplus (in €) that would have been realized in the limit where all combinations of private valuations occur with their expected probabilities. Equivalently, the realized surplus can be interpreted as the expected value of the next, unobserved round given these behavioral strategies.Footnote 11
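For the simple majority vote, this limit calculation can be illustrated as follows. The sketch below is our own illustration (the paper's actual calculation also covers AGV reports and transfers): it enumerates all type combinations, weighting each by its probability under the type distribution and by the empirical probability that a majority of the group votes in favor.

```python
import itertools
import numpy as np

def realized_surplus_sm(type_probs, vote_yes_prob, group_size=3):
    """
    Expected group surplus of simple majority voting given behavioral strategies.
    type_probs:    {valuation: probability} over the type space.
    vote_yes_prob: {valuation: empirical probability this type votes yes}.
    """
    types = list(type_probs)
    surplus = 0.0
    for combo in itertools.product(types, repeat=group_size):
        p_combo = np.prod([type_probs[v] for v in combo])
        group_value = sum(combo)
        p_yes = [vote_yes_prob[v] for v in combo]
        # probability that a strict majority of the group votes yes
        p_implement = 0.0
        for votes in itertools.product([0, 1], repeat=group_size):
            if sum(votes) * 2 > group_size:
                p_implement += np.prod([p if b else 1 - p
                                        for p, b in zip(p_yes, votes)])
        surplus += p_combo * p_implement * group_value
    return surplus
```

With truthful voting on a symmetric two-point type space, e.g. `realized_surplus_sm({-1: 0.5, 1: 0.5}, {-1: 0.0, 1: 1.0})`, the calculation yields 0.75 per group: the project is implemented whenever at least two of three members value it positively.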
Table 5 shows the Bayes–Nash surplus and the realized group surplus for the AGV and SM mechanisms in the ex ante rounds in all treatments.Footnote 12 Neither mechanism reaches its full theoretical efficiency level. Still, SM is almost as efficient in the lab as predicted by theoretical calculations with rational, self-interested agents. The AGV is perfectly efficient in theory but loses much of its efficiency in practice. It is still the most efficient mechanism ex ante in the two Skewed treatments and the Robustness treatment. In the Symmetric treatment, SM is theoretically very close to optimal, which reduces the advantage of the AGV in theoretical calculations. At the same time, the realized efficiency of the AGV is quite low in this treatment, causing the realized efficiency ranking to reverse. This reversal of the efficiency ordering of AGV and SM makes it very difficult to predict the preferences over mechanisms in the lab of subjects who are sensitive to realized payoffs.
In every round, each group faces an efficient project (group surplus \(>0\)) or an inefficient project with the same probability. In the ex ante rounds, the efficiency of the project cannot affect the mechanism choices. Therefore, we compare the number of efficient and inefficient provision decisions (one decision per three matched subjects per period) in the ex ante rounds in Table 6. We choose this comparison over a comparison of the surplus in Table 5 for two reasons. The surplus values in Table 5 are determined at the level of treatments, so we only have one observation per treatment. Furthermore, the size and variance of the surplus vary over treatments because of the changes in the type space and in behavior, so comparisons of the average surplus are not directly informative. Over the four treatments combined, implementation is marginally more efficient in the AGV. However, if we look at the results per treatment, the only significant difference is found in the Robustness treatment, whereas SM is non-significantly more efficient in the Symmetric treatment. The Robustness treatment is the least symmetric treatment, so exactly the situation where the theoretically expected difference between AGV and SM is largest. In Online Appendix B.2.4, we show that similar results are obtained through logistic regressions with clustered standard errors. The same appendix shows that these null results are not purely due to a lack of statistical power, as we can clearly show that the AGV has more efficient implementation than the extremely noisy RAND mechanism.
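The formal comparisons are the clustered logistic regressions in Online Appendix B.2.4; as a back-of-the-envelope alternative, a comparison of two implementation-efficiency proportions could be sketched as follows (all names and numbers are hypothetical, not the paper's):

```python
import math

def two_proportion_ztest(k1, n1, k2, n2):
    """Two-sided z-test for equality of two proportions (pooled variance)."""
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)              # pooled proportion under H0
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    pval = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, pval
```

Here `k1`/`n1` would be the count and total of efficient decisions under one mechanism and `k2`/`n2` under the other; note this ignores the within-matching-group correlation that the paper's clustered regressions account for.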
Prediction 3 states that subjects tend to select mechanisms with a higher expected payoff. After the ex ante rounds, subjects have experienced the mechanisms and thus have a feeling for the payoff they can obtain from them in the lab. We create two proxies for these benefits by calculating the expected payoffs of each mechanism based on Bayes–Nash strategies and based on observed behavioral strategies for each treatment-type. We use the differences in expected utility between the two available mechanisms to explain the ad interim mechanism choices of each type in each treatment. This allows us to directly compare predictions based on lab and on Bayes–Nash payoffs for the second part of Prediction 3. Since strategies are determined at the treatment-type level, we aggregate our data to this level and determine the proportion of subjects of a given treatment-type that support mechanism A over mechanism B in a given choice.Footnote 13 This yields one observation per treatment-type and correlated errors within treatments, since strategies are interdependent. We estimate a quasi-binomial model on the fraction of subjects that prefer the mechanism, using both the theoretical and the lab payoff differences as explanatory variables, and we cluster standard errors at the treatment level. Since we want to examine the difference between lab-based and theory-based predictions, we do not use the comparison between RAND and NSQ, where the lab and theoretical payoffs are the same by construction. The results are shown in Table 7.
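A quasi-binomial fit on aggregated proportions can be sketched with iteratively reweighted least squares, where the binomial covariance is scaled by a dispersion estimate (Pearson \(\chi^2\) over the residual degrees of freedom). This is a simplified illustration with our own names, without the treatment-level clustering used in the paper:

```python
import numpy as np

def quasibinomial_fit(X, prop, m, iters=100):
    """IRLS fit of a binomial GLM on proportions with trial counts m,
    plus a quasi-binomial dispersion estimate (Pearson chi2 / d.f.)."""
    n, k = X.shape
    beta = np.zeros(k)
    for _ in range(iters):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        W = m * mu * (1 - mu)                           # IRLS working weights
        z = X @ beta + (prop - mu) / (mu * (1 - mu))    # working response
        beta = np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (W * z))
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    pearson = np.sum(m * (prop - mu) ** 2 / (mu * (1 - mu)))
    phi = pearson / (n - k)                             # dispersion estimate
    cov = phi * np.linalg.inv(X.T @ (X * (m * mu * (1 - mu))[:, None]))
    return beta, cov, phi
```

Each row would be one treatment-type, `prop` the fraction supporting mechanism A, `m` the number of subjects behind that fraction, and the columns of `X` the lab and theory payoff differences; the dispersion `phi` inflates the covariance when the proportions are more variable than a binomial would imply.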
In columns (1) and (2), we estimate the model using the lab and theory measures of incentives, respectively. In both columns, types that gain more from a mechanism are more likely to select it ad interim. If we look at the overall model fit, we see a slightly better fit (lower residual deviance) for model (1), which uses the lab predictions. In column (3), we pit the two predictors against each other directly. The positive correlation between the two independent variables reduces the sizes of the coefficients, but the lab-based measure is the only significant predictor. The model with both variables has a marginally better fit than either other model, but the difference is not statistically significant (Rao score test, theory only \(p=0.849\), lab only \(p=0.937\)). We looked at the ad interim rounds because there we can observe the choices made by each type individually. We see something similar in the choice between AGV and SM in the ex ante rounds. In the Symmetric treatment, the realized surplus of the AGV is lower than that of SM in the lab. If we then look at Fig. 1a, we indeed see that support for the AGV is particularly low in that treatment. In fact, if we take the realized difference between AGV and SM in Table 5 and order the treatments accordingly, we find the exact same order as on the top axis of Fig. 1a. The expectations of Prediction 3 are clearly borne out in our data: the expected benefits of the mechanisms drive choices, and subjects respond most clearly to the expected benefits they experience in actual play.
The pattern of efficiency differences is interesting in its own right. Consistent with theory, the voting mechanism underperforms relative to the AGV particularly in situations with skewed distributions, where the inability to express the intensity of preferences is especially costly. However, the realized differences are very small and therefore difficult to notice in real life, which can create difficulties for the implementation of this more efficient mechanism.
Deviations from the theoretical efficiency predictions stem from subjects' second-stage reporting (AGV) and voting (SM) strategies. Online Appendix B.2.1 shows that subjects who misreport the sign of their valuation in the AGV cause the largest loss in surplus. We show that the empirical best response of each type contains the truthful report, and for most treatment-types it is unique. Reports with an incorrect sign could be caused by subjects who mistake \(-3\) for 3 or vice versa. We removed this possibility in the Robustness treatment, but we still find a significant number of misreported signs. Furthermore, we find a pattern where subjects with a positive valuation almost never misreport the sign of their valuation, whereas subjects with a negative valuation do so more often. As such, there appears to be a bias among some of our subjects in favor of implementing the public project in the lab. Interestingly, this asymmetric pattern is present in all treatments and across a number of individuals. In Online Appendix B.3, we look at how individual differences in reporting and voting strategies relate to personal characteristics. The most consistent effect we identify is that the individuals who are most likely to follow the Nash strategies of truthful revelation (sincere voting) are also most likely to select the AGV (SM) when given the option. This seems to imply that beliefs about the mechanisms influenced both the selection of and the play within the mechanisms. We find little evidence that the deviations from Nash equilibrium, either in the AGV or in SM, are driven by a lack of understanding of the experiment.