
Meaningful learning in weighted voting games: an experiment

Abstract

By employing binary committee choice problems, this paper investigates how varying or eliminating feedback about payoffs affects: (1) subjects’ learning about the underlying relationship between their nominal voting weights and their expected payoffs in weighted voting games; (2) the transfer of acquired learning from one committee choice problem to a similar but different problem. In the experiment, subjects choose to join one of two committees (weighted voting games) and obtain a payoff stochastically determined by a voting theory. We found that: (i) subjects learned to choose the committee that generates a higher expected payoff even without feedback about the payoffs they received; (ii) there was statistically significant evidence of “meaningful learning” (transfer of learning) only for the treatment with no payoff-related feedback. This finding calls for re-thinking existing models of learning to incorporate some type of introspection.

Notes

  1. The objectives of this body of work differ from the work done on the theory of learning in games. See, e.g., Hart (2005) and the references therein for the theoretical literature that mainly investigates the convergence properties of learning models to various equilibria.

  2. There is a related literature on “behavioral spillover” (Cason et al. 2012; Bednar et al. 2012) or “learning spillover” (Grimm and Mengel 2012; Mengel and Sciubba 2014) across multiple games. Cason et al. (2012, p. 234) define a “behavioral spillover as having been occurred whenever observed behavior differs when a game is played together with other games, compared to the same game played in isolation.” The possible reasons underlying these spillovers vary, and “meaningful learning” is one of them. In this paper, we discuss “meaningful learning” because we were particularly inspired by Rick and Weber (2010), as can be seen below.

  3. Rick and Weber (2010) also find that asking subjects to explain the reason for their behavior promotes meaningful learning in the presence of feedback information.

  4. Other studies mentioned above do not consider the effect of payoff-related feedback on meaningful learning. Cooper and Kagel (2003, 2008) deal with signaling games and find that letting subjects play in a team promotes “meaningful learning”, or what they call “transfer of learning”. Haruvy and Stahl (2009, 2012) combine rule-based learning (Stahl 1996) and EWA learning models. In laboratory experiments with a sequence of dominance-solvable \(4\times 4\) symmetric normal-form games, they show that subjects’ observed behavior is better captured by a learning model that re-labels the actions based on the steps of eliminating dominated strategies (which assumes that agents understand the basic property of the game, namely dominance solvability). Dufwenberg et al. (2010) investigate meaningful learning in two Race to X games (which they call “games of X”). In a Race to X game, two players alternately put between 1 and M coins into an initially empty hat. The game ends when there are X coins in the hat, and the player who puts the X-th coin into the hat is the winner. Both M and X are common knowledge, and the number of coins in the hat is observable at any time. This game has a dominant strategy and can be solved by backward induction. Dufwenberg et al. (2010) show that when M is 2, subjects who had experience playing Race to 6 before they played Race to 21 are able to play the latter game perfectly, using the dominant strategy, more often than those who had experience playing Race to 21 before Race to 6. One can infer that it is easier for subjects to adopt the dominant strategy for this class of games when X is smaller and that, once they have adopted the dominant strategy, they will exploit it every time they face the same class of games.

  5. Fréchette et al. (2005a, b, c), Drouvelis et al. (2010), and Kagel et al. (2010) conduct experiments in non-cooperative game environments that are variants of a legislative bargaining model.

  6. The standard Two-Armed Bandit problem does not provide subjects with any contextual information related to their payoffs. See Meyer and Shi (1995), Banks et al. (1997), and Hu et al. (2013).

  7. In Esposito et al. (2012), subjects were asked to divide the fixed amount of total points in a subsequent negotiation stage, but the authors were unable to identify clear determinants of how their subjects chose committees. Cason and Friedman (1997) point to a similar complication in their market experiments and conduct experimental sessions in which subjects interact with “robots” that follow the equilibrium strategy.

  8. It would be an interesting future study to consider subjects’ learning about the relationship between their nominal voting weights and their expected payoff when the relationship is governed by the idea behind either BzI or SSI.

  9. Packel and Deegan (1980) provided a set of axioms to generalize the DPI for any occurrence probability distributions specified over MWCs and for any payoff distributions among members of each MWC. Unfortunately, however, there is no criterion to specify those occurrence probability and payoff distributions.

  10. See Banzhaf (1965) and Shapley and Shubik (1966) for more details, respectively.

  11. These are written as {1,2,3}, {1,2,4}, and {3,4} in the formal notation of coalitions with references to the specific players.

  12. See the Appendix for an English translation of the instructions.

  13. The sessions at Tsukuba were added as a robustness check on the results obtained from the data gathered in Osaka. We checked, for all treatments, whether the data from the two locations differ significantly in the key statistics considered below. Since they do not, we pool the data from the two locations in the analyses presented below.

  14. The data are available from the authors upon request.

  15. In our analyses, we use the 5% significance level for rejecting the null hypothesis unless otherwise stated.

  16. In the pooled data, the p values are as follows: \(p<0.001\) for Problem A, \(p=0.002\) for Problem B, \(p=0.001\) for Problem C, and \(p<0.001\) for Problem D.

  17. In the questionnaire, we did not directly ask our subjects which of the problems they faced made it easier to determine the better option; instead, we asked them to note the reason(s) for their choices in free format. Some subjects, both in Osaka and in Tsukuba, described reasoning that clearly supported Observation 1.

  18. Below we drop the superscript i from \(\Delta \mathrm{FR}^i_{l,m}\) and simply refer to them as \(\Delta \mathrm{FR}_{l,m}\).

  19. We conjecture that this negative learning between Problem B and Problem A arises as follows: subjects who have learned to choose [6; 1, 1, 4, 4] over [6; 1, 2, 3, 4] in Problem B may have employed a wrong heuristic when they faced Problem A, namely, that the committee in which the two members with the most votes have the same number of votes is better, and have therefore chosen [14; 5, 3, 7, 7] over [14; 5, 4, 6, 7]. This interpretation also explains why there is no such negative learning between Problem D and Problem C, despite the similarity of this sequence of problems to that of Problems B and A. Notice that Problem C involves the choice between [14; 3, 5, 6, 8] and [14; 3, 6, 6, 7], so that the same heuristic cannot be applied in Problem C.
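
The committees in the notes above are written as [quota; votes of each member]. As a rough illustration of how minimal winning coalitions (MWCs) and the Deegan-Packel index (DPI) can be computed for such a committee, the sketch below assumes the standard Deegan and Packel (1978) definition: every MWC is equally likely to form, and the members of the MWC that forms share the payoff equally. The function names are hypothetical and are not taken from the paper; players are numbered 1 to 4 in the order in which their votes are listed.

```python
from fractions import Fraction
from itertools import combinations


def minimal_winning_coalitions(quota, weights):
    """Return all minimal winning coalitions as tuples of 1-based player numbers."""
    players = range(len(weights))
    mwcs = []
    for size in range(1, len(weights) + 1):
        for coalition in combinations(players, size):
            total = sum(weights[i] for i in coalition)
            if total < quota:
                continue  # losing coalition
            # Minimal: dropping any single member turns the coalition into a losing one
            # (sufficient because all voting weights are positive).
            if all(total - weights[i] < quota for i in coalition):
                mwcs.append(tuple(i + 1 for i in coalition))
    return mwcs


def deegan_packel_index(quota, weights):
    """DPI under the assumed definition: uniform draw over MWCs, equal split inside."""
    mwcs = minimal_winning_coalitions(quota, weights)
    index = [Fraction(0)] * len(weights)
    for coalition in mwcs:
        for player in coalition:
            index[player - 1] += Fraction(1, len(coalition) * len(mwcs))
    return index


# Committee [6; 1, 1, 4, 4] from note 19.
mwcs_b = minimal_winning_coalitions(6, [1, 1, 4, 4])
dpi_b = deegan_packel_index(6, [1, 1, 4, 4])
# Committee [6; 1, 2, 3, 4] from note 19.
dpi_b_alt = deegan_packel_index(6, [1, 2, 3, 4])

print(mwcs_b)
print(dpi_b)
print(dpi_b_alt)
```

For [6; 1, 1, 4, 4] this enumeration yields the coalitions {3,4}, {1,2,3}, and {1,2,4}, which are the same three sets listed in note 11. Exact fractions are used so that the index values can be checked to sum to one without rounding error.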

References

  • Aleskerov, F., Belianin, A., & Pogorelskiy, K. (2009). Power and preferences: An experimental approach. SSRN. http://ssrn.com/abstract=1574777.

  • Arifovic, J., & Ledyard, J. (2012). Individual evolutionary learning, other-regarding preferences, and the voluntary contributions mechanism. Journal of Public Economics, 96, 808–823.

  • Arifovic, J., McKelvey, R. D., & Pevnitskaya, S. (2006). An initial implementation of the Turing tournament to learning in two person games. Games and Economic Behavior, 57, 93–122.

  • Banks, J., Olson, M., & Porter, D. (1997). An experimental analysis of the bandit problem. Economic Theory, 10, 55–77.

  • Banzhaf, J. F. (1965). Weighted voting doesn’t work: A mathematical analysis. Rutgers Law Review, 19, 317–343.

  • Bednar, J., Chen, Y., Liu, T. X., & Page, S. (2012). Behavioral spillovers and cognitive load in multiple games: An experimental study. Games and Economic Behavior, 74, 12–31.

  • Camerer, C., & Ho, T.-H. (1999). Experience-weighted attraction learning in normal form games. Econometrica, 67, 827–874.

  • Cason, T. N., & Friedman, D. (1997). Price formation in single call markets. Econometrica, 65, 311–345.

  • Cason, T. N., Savikhin, A. C., & Sheremeta, R. M. (2012). Behavioral spillovers in coordination games. European Economic Review, 56, 233–245.

  • Cheung, Y.-W., & Friedman, D. (1997). Individual learning in normal form games: Some laboratory results. Games and Economic Behavior, 19, 46–76.

  • Cooper, D. J., & Kagel, J. H. (2003). Lessons learned: Generalized learning across games. American Economic Review, Papers and Proceedings, 93, 202–207.

  • Cooper, D. J., & Kagel, J. H. (2008). Learning and transfer in signaling games. Economic Theory, 34, 415–439.

  • Deegan, J., & Packel, E. (1978). A new index of power for simple n-person games. International Journal of Game Theory, 7, 113–123.

  • Drouvelis, M., Montero, M., & Sefton, M. (2010). The paradox of new members: Strategic foundations and experimental evidence. Games and Economic Behavior, 69, 274–292.

  • Dufwenberg, M., Sundaram, R., & Butler, D. J. (2010). Epiphany in the game of 21. Journal of Economic Behavior and Organization, 75, 132–143.

  • Erev, I., Ert, E., & Roth, A. E. (2010). A choice prediction competition for market entry games: An introduction. Games, 2, 117–136.

  • Erev, I., & Roth, A. E. (1998). Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. American Economic Review, 88, 848–881.

  • Esposito, G., Guerci, E., Lu, X., Hanaki, N., & Watanabe, N. (2012). An experimental study on “meaningful learning” in weighted voting games. Mimeo: Aix-Marseille University.

  • Felsenthal, D. S., & Machover, M. (1998). The measurement of voting power: theory and practice. Problems and paradoxes. London: Edward Elgar.

  • Fischbacher, U. (2007). z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics, 10, 171–178.

  • Fréchette, G., Kagel, J. H., & Morelli, M. (2005a). Behavioral identification in coalitional bargaining: An experimental analysis of demand bargaining and alternating offers. Econometrica, 73, 1893–1937.

  • Fréchette, G. R., Kagel, J. H., & Morelli, M. (2005b). Gamson’s law versus non-cooperative bargaining theory. Games and Economic Behavior, 51, 365–390.

  • Fréchette, G. R., Kagel, J. H., & Morelli, M. (2005c). Nominal bargaining power, selection protocol, and discounting in legislative bargaining. Journal of Public Economics, 89, 1497–1517.

  • Grimm, V., & Mengel, F. (2012). An experiment on learning in a multiple games environment. Journal of Economic Theory, 147, 2220–2259.

  • Guerci, E., Hanaki, N., Watanabe, N., Lu, X., & Esposito, G. (2014). A methodological note on a weighted voting experiment. Social Choice and Welfare, 43, 827–850.

  • Hanaki, N., Sethi, R., Erev, I., & Peterhansl, A. (2005). Learning strategy. Journal of Economic Behavior and Organization, 56, 523–542.

  • Hart, S. (2005). Adaptive heuristics. Econometrica, 73, 1401–1430.

  • Haruvy, E., & Stahl, D. O. (2009). Learning transference between dissimilar symmetric normal-form games. Mimeo: University of Texas at Dallas.

  • Haruvy, E., & Stahl, D. O. (2012). Between-game rule learning in dissimilar symmetric normal-form games. Games and Economic Behavior, 74, 208–221.

  • Ho, T.-H., Camerer, C., & Weigelt, K. (1998). Iterated dominance and iterated best response in experimental “p-beauty contests”. American Economic Review, 88, 947–969.

  • Hu, Y., Kayaba, Y., & Shum, M. (2013). Nonparametric learning rules from bandit experiments: The eyes have it!. Games and Economic Behavior, 81, 215–231.

  • Ioannou, C. A., & Romero, J. (2014). A generalized approach to belief learning in repeated games. Games and Economic Behavior, 87, 178–203.

  • Kagel, J. H., Sung, H., & Winter, E. (2010). Veto power in committees: An experimental study. Experimental Economics, 13, 167–188.

  • Marchiori, D., & Warglien, M. (2008). Predicting human interactive learning by regret-driven neural networks. Science, 319, 1111–1113.

  • Mengel, F., & Sciubba, E. (2014). Extrapolation and structural similarity in games. Economics Letters, 125, 381–385.

  • Meyer, R. J., & Shi, Y. (1995). Sequential choice under ambiguity: Intuitive solutions to the armed-bandit problem. Management Science, 41, 817–834.

  • Montero, M., Sefton, M., & Zhang, P. (2008). Enlargement and the balance of power: An experimental study. Social Choice and Welfare, 30, 69–87.

  • Neugebauer, T., Perote, J., Schmidt, U., & Loos, M. (2009). Selfish-biased conditional cooperation: On the decline of contributions in repeated public goods experiments. Journal of Economic Psychology, 30, 52–60.

  • Nowak, M. A., & Sigmund, K. (1993). A strategy of win stay, lose shift that outperforms tit-for-tat in the prisoner’s-dilemma game. Nature, 364, 56–58.

  • Packel, E. W., & Deegan, J. (1980). An axiomated family of power indices for simple n-Person games. Public Choice, 35, 229–239.

  • Rick, S., & Weber, R. A. (2010). Meaningful learning and transfer of learning in games played repeatedly without feedback. Games and Economic Behavior, 68, 716–730.

  • Shapley, L. S., & Shubik, M. (1954). A method for evaluating the distribution of power in a committee system. American Political Science Review, 48, 787–792.

  • Shapley, L. S., & Shubik, M. (1966). Quasi-cores in a monetary economy with non-convex preferences. Econometrica, 34, 805–827.

  • Spiliopoulos, L. (2012). Pattern recognition and subjective belief learning in a repeated constant-sum game. Games and Economic Behavior, 75, 921–935.

  • Spiliopoulos, L. (2013). Beyond fictitious play beliefs: Incorporating pattern recognition and similarity matching. Games and Economic Behavior, 81, 69–85.

  • Stahl, D. O. (1996). Boundedly rational rule learning in a guessing game. Games and Economic Behavior, 16, 303–330.

  • Watanabe, N. (2014). Coalition formation in a weighted voting experiment. Japanese Journal of Electoral Studies, 30, 56–67.

  • Weber, R. A. (2003). Learning with no feedback in a competitive guessing game. Games and Economic Behavior, 44, 134–144.

Author information

Corresponding author

Correspondence to Nobuyuki Hanaki.

Additional information

The authors thank Gabriele Esposito and Xiaoyan Lu for their collaboration in the early stage of this project, Yoichi Izunaga for his excellent research assistance, Yan Chen, John Duffy, Benjamin Hermalin, Midori Hirokawa, Yoichi Hizen, Tatsuya Kameda, Maria Montero, Tatsuyoshi Saijo, Toshio Yamagishi, and Roberto Weber for their valuable comments, and Jeremy Mercer for proofreading the manuscript. The authors acknowledge the contributions of the experimental economics laboratory at Osaka University, in particular Keigo Inukai, Emi Kurimune, and Shigehiro Serizawa, for help in conducting the experiment. Part of this work was carried out while Hanaki was affiliated with Aix-Marseille University (Aix-Marseille School of Economics). Hanaki thanks the Aix-Marseille School of Economics for the support it provided. Financial support from the Foundation for the Fusion of Science and Technology (FOST), MEXT Grants-in-Aid 24330078 and 25380222 (Watanabe), the JSPS-ANR bilateral research grant “BECOA” (ANR-11-FRJA-0002), and the Joint Usage/Research Center of ISER at Osaka University is gratefully acknowledged.

Appendix A

Instructions

You will be asked to repeatedly make a simple choice between two options.

Imagine that you need to represent your interests within a voting committee. This committee decides how to divide 120 points among its members. The committee has three other members, and each member has a predetermined number of votes, which may differ from one member to another. The committee will make a decision only when a proposal receives the predetermined required number of votes. You will be told the required number of votes. If more than one proposal is put before the committee, the members cannot vote for multiple proposals by dividing their allocated number of votes. A member can vote for only one proposal, and all of his/her votes must be cast for that proposal.

You are asked to choose which of the two possible committees you prefer to join. You will be informed of the number of votes allocated to each of the four members of the committee (including you), and the number of votes required for a proposal to be approved. The number of votes you have will always be indicated with the label YOU.

Full-feedback treatment

There is a total of 60 periods. In each period, you have 30 s to make your choice between the two committees. If you do not make a choice within the 30 s in one period, you will receive zero points for that period. When a choice is made, the chosen committee will automatically allocate 120 points among the four members. The outcomes may vary from one period to another, but are based on a theory of decision making in committees. Once the allocation is made, you will immediately be shown the resulting allocation. At the end of the experiment, you will be paid according to your total earnings during the 60 periods, at an exchange rate of 1 point = JPY1.

If you have any questions, please raise your hand.

No-feedback treatment

There is a total of 60 periods. In each period, you have 30 s to make your choice between the two committees. If you do not make any choice within the 30 s in one period, you will receive zero points for the period. When a choice is made, the chosen committee will automatically allocate 120 points among the four members. The outcomes may vary from one period to another, but they are based on a theory of decision making in committees. You will not see the resulting allocation after each period. However, at the end of the experiment, you will be told the total points you have obtained during the 60 periods, and you will be paid according to the points earned over the 60 periods at an exchange rate of 1 point = JPY1.

If you have any questions, please raise your hand.

Partial-feedback treatment

There is a total of 60 periods. In each period, you have 30 s to make your choice between the two committees. If you do not make any choice within the 30 s in one period, you will receive zero points for the period. When a choice is made, the chosen committee will automatically allocate 120 points among the four members. The outcomes may vary from one period to another, but they are based on a theory of decision making in committees. Once the allocation is made, you will be shown the number of points allocated to you. You will not see the allocations to the other members of the committee. At the end of the experiment, you will be paid according to your total points score at an exchange rate of 1 point = JPY1.

If you have any questions, please raise your hand.
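
The instructions state only that the 120 points are allocated according to “a theory of decision making in committees” and that outcomes vary across periods. Purely as an illustration, the sketch below assumes a Deegan-Packel-style rule: in each period one minimal winning coalition is drawn with equal probability and its members split the 120 points equally. This rule, the function names, the seed, and the position of YOU in the committee are all assumptions made for the example; they are not taken from the instructions.

```python
import random
from itertools import combinations

TOTAL_POINTS = 120  # points divided by the committee in each period
PERIODS = 60        # number of periods in each treatment


def minimal_winning_coalitions(quota, weights):
    """Return minimal winning coalitions as tuples of 0-based member positions."""
    mwcs = []
    for size in range(1, len(weights) + 1):
        for coalition in combinations(range(len(weights)), size):
            total = sum(weights[i] for i in coalition)
            # Winning, and losing as soon as any single member leaves.
            if total >= quota and all(total - weights[i] < quota for i in coalition):
                mwcs.append(coalition)
    return mwcs


def draw_allocation(quota, weights, rng):
    """Draw one period's allocation under the ASSUMED rule (uniform MWC, equal split)."""
    coalition = rng.choice(minimal_winning_coalitions(quota, weights))
    share = TOTAL_POINTS / len(coalition)
    return [share if i in coalition else 0.0 for i in range(len(weights))]


def simulate_session(quota, weights, you, periods=PERIODS, seed=0):
    """Total points earned by the member at position `you` over all periods."""
    rng = random.Random(seed)  # fixed seed only to make the example reproducible
    return sum(draw_allocation(quota, weights, rng)[you] for _ in range(periods))


# Hypothetical example: YOU hold the first listed vote in the committee [6; 1, 1, 4, 4].
one_period = draw_allocation(6, [1, 1, 4, 4], random.Random(1))
session_total = simulate_session(6, [1, 1, 4, 4], you=0)
print(one_period, session_total)
```

Under the three feedback treatments, the same draws would be shown in full, shown only for the chooser's own share, or not shown at all; the sketch covers only the payoff draw, not the display.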

About this article

Cite this article

Guerci, E., Hanaki, N. & Watanabe, N. Meaningful learning in weighted voting games: an experiment. Theory Decis 83, 131–153 (2017). https://doi.org/10.1007/s11238-017-9588-x

  • DOI: https://doi.org/10.1007/s11238-017-9588-x

Keywords

  • Learning
  • Voting game
  • Experiment
  • Two-armed bandit problem

JEL Classification

  • C79
  • C92
  • D72
  • D83