Network formation in repeated interactions: experimental evidence on dynamic behaviour


We present experiments on non-cooperative games of network formation based on Bala and Goyal (Econometrica 68:1181–1229, 2000). We examine the one-way and the two-way flow models, each with high and low link costs. The models feature multiple equilibria and coordination problems. We conducted the experiments under various conditions that allowed for repeated interactions between subjects. We found that coordination on non-empty Strict Nash equilibria was not easy to achieve, even in the one-way model, where the Strict Nash equilibria are wheels. We found some evidence of convergence to equilibrium networks through learning dynamics, but no evidence that salient labels helped coordination. The evidence on learning behaviour suggests that subjects chose strategies in line with various learning rules, principally Reinforcement and Fictitious Play.



  1.

    See for instance Rauch and Hamilton (2001), Demange and Wooders (2005), for reviews.

  2.

    Notice that this is different from the notion of dynamic learning in games introduced below, which represents one focus of this paper.

  3.

    See Bala and Goyal (2000), Sect. 5, for the analysis in the presence of decay.

  4.

    This set in particular includes the 4 equivalent networks belonging to the periphery-sponsored star architecture, and 24 possible variants of a restricted pipeline network structure where any two different players both access the same third individual, while the remaining fourth is connecting to one of the former two subjects.

  5.

    In their paper Bala and Goyal (2000) note: “While these findings—those on Strict Nash network—restrict the set of networks sharply, the coordination problem faced by individuals in the network game is not entirely resolved…. This leads us to study the process by which individuals learn about the network and revise their decisions on link formation, over time” (p. 1184).

  6.

    Falk and Kosfeld (2003) consider a second argument, referred to as ‘strategic asymmetry’, to explain the different evidence between the one-way and two-way model. With it, they refer to the fact that while the wheel in the one-way model is a symmetric equilibrium, where every subject chooses the same action, the centre-sponsored star is an asymmetric equilibrium, where one subject maintains all links and all other subjects pay for no links. This, according to the authors, may create more strategic uncertainty in determining who should be the central agent. We do not, however, consider this argument fully convincing. In fact, even in a wheel every player has to decide which other agent to make a link with. As remarked in the previous footnote, this implies a very high chance of mis-coordination.

  7.

    See Bacharach (1993), Sugden (1995), Janssen (2001), and the collection in Colman (2006) for theoretical studies on salience; see Mehta et al. (1994), Bacharach and Bernasconi (1997), Van Huyck et al. (1997), Bardsley et al. (2009) and the literature in Camerer (2003), for experimental studies.

  8.

    Other recent experiments on network formation include Goeree et al. (2009) and Berninghaus et al. (2006), among others. These experiments study modified theoretical versions of the model of Bala and Goyal (2000), and are therefore less relevant for the present paper.

  9.

    We also conducted a third wave of experiments in which participants could interact for a fixed number of 5 periods and in which we followed an experimental procedure more similar to the one used by Falk and Kosfeld (2003). An overall discussion of all three waves of experiments is available in a working paper (Bernasconi and Galizzi 2005). We have decided to focus here only on the first two waves, which are those more relevant for the analysis of dynamic behaviour.

  10.

    We did not oblige subjects to draw the networks and did not make payments conditional on correct drawings, in order to avoid unintended effects (for example, subjects playing very simple networks, such as wheels, which they were more confident they could draw). We nevertheless checked the subjects’ drawing sheets after the experiment and verified that the great majority of subjects had indeed drawn the networks and that the drawings were correct.

  11.

    In particular, in ONE05 the wheels were of the following type: all 8 wheels of Group 1 were ADBC, all 5 wheels of Group 4 were ADCB; in ONE15 all 9 wheels of Group 1 were ACBD; the single wheel of Group 3 was ADCB; one wheel of Group 6 was ADBC, the other was ACBD.

  12.

    Results on the specific Nash equilibria that groups were getting closer to across treatments are not reported for brevity. They are available from the authors on request.

  13.

    To obtain the predictions for Reinforcement, we adopted the most standard approach (Erev and Roth 1998), in which propensities to play the various strategies are adapted linearly by adding the latest payoffs to the previous period’s propensities. With regard to Fictitious Play, we considered both a model of residual opponents, in which beliefs are formed over the likelihoods of past networks, and a model of individual opponents, where beliefs are formed over the past play of each of the other players. Having found, however, that the two variants of Fictitious Play do not produce significant differences in the findings, we show only results for the former specification (see "Appendix 2" for the complete specification of the various learning models).

  14.

    In particular, recall that the strategy of no link is a dominated strategy in the ONE05 model; while any strategy of more than one link is dominated in both the ONE15 and TWO15 games (by either the strategy of one link or of no link). Obviously, moreover, in our experimental set-up of 16 strategies, all choices with a self-directed link are dominated.

  15.

    The difference-of-proportion tests are derived for the null that participants were picking at random among non-dominated strategies. The tests are based on the statistic \(d=\frac{h_{1}-h_{2}}{\sqrt{\frac{h_{1}(1-h_{1})} {n_{1}-1}+\frac{h_{2}(1-h_{2})}{n_{2}-1}}}\), where \(h_{1}\) is the proportion of observed choices consistent with the various learning models (calculated with respect to the overall number of choices \(n_{1}\) of each treatment) and \(h_{2}\) is the proportion of choices predicted by the models, computed with respect to the total number \(n_{2}\) of non-dominated strategies which subjects could play under each treatment. Under the null, \(d\) is distributed as a standard normal. The tests in Table 6 are based on within-individual clustering to account for individual dependency over periods.

  16.

    We further considered an explicit mixed model developed by Camerer and Ho (1999) and known as Experience-Weighted Attraction (EWA), which combines Reinforcement and belief learning models according to specific rules. We did not find EWA outperforming the class of RMF.
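As a rough illustration of the statistic in footnote 15, the following Python sketch computes \(d\) and a two-sided p-value under the standard normal null. The function names and the numbers used in the usage note are hypothetical, and the sketch does not implement the within-individual clustering mentioned for Table 6.

```python
import math


def diff_of_proportions(h1: float, n1: int, h2: float, n2: int) -> float:
    """Statistic d from footnote 15 (no clustering adjustment)."""
    variance = h1 * (1.0 - h1) / (n1 - 1) + h2 * (1.0 - h2) / (n2 - 1)
    return (h1 - h2) / math.sqrt(variance)


def two_sided_p_value(d: float) -> float:
    """Two-sided p-value when d is standard normal under the null."""
    # For Z ~ N(0, 1), P(|Z| > |d|) = erfc(|d| / sqrt(2)).
    return math.erfc(abs(d) / math.sqrt(2.0))
```

For instance, with the made-up inputs `diff_of_proportions(0.6, 101, 0.4, 101)` the statistic is about 2.89, which `two_sided_p_value` maps to a p-value below 0.01.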


  1. Aumann R, Myerson R (1988) Endogenous formation of links between players and coalitions: an application of the Shapley value. In: Roth AE (ed) The Shapley value. Cambridge University Press, New York, pp 175–191

  2. Bacharach M (1993) Variable universe games. In: Binmore K, Kirman A, Tani P (eds) Frontiers of game theory. MIT Press, Cambridge, MA, pp 255–275

  3. Bacharach M, Bernasconi M (1997) The variable frame theory of focal points: an experimental study. Games Econ Behav 19:1–45

  4. Bala V, Goyal S (2000) A non-cooperative model of network formation. Econometrica 68:1181–1229

  5. Ballester C, Calvó-Armengol A, Zenou Y (2006) Who’s who in networks. Wanted: the key player. Econometrica 74:1403–1417

  6. Barabasi A, Albert R (1999) Emergence of scaling in random networks. Science 286:509–512

  7. Bardsley N, Mehta J, Starmer C, Sugden R (2009) Explaining focal points: cognitive hierarchy theory versus team reasoning. Econ J 120:40–79

  8. Bernasconi M, Galizzi MM (2005) Coordination in networks formation: experimental evidence on learning and salience. FEEM working paper no. 107

  9. Berninghaus SK, Ehrhart K-M, Ott M (2006) A network experiment in continuous time: the influence of link costs. Exp Econ 9:237–251

  10. Bolton P, Dewatripont M (1994) The firm as a communication network. Q J Econ 109:809–839

  11. Bramoullé Y, Kranton R (2007) Public goods in networks. J Econ Theory 135:478–494

  12. Breiger RL, Boorman SA, Arabie P (1975) An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. J Math Psychol 12:328–383

  13. Callander S, Plott CR (2005) Principles of network development and evolution: an experimental study. J Public Econ 89:1469–1495

  14. Calvó-Armengol A (2001) Bargaining power in communication networks. Math Soc Sci 41:69–87

  15. Calvó-Armengol A (2003) A decentralized market with trading links. Math Soc Sci 45:83–103

  16. Calvó-Armengol A (2004) Job contact networks. J Econ Theory 115:191–206

  17. Calvó-Armengol A, Jackson MO (2007) Networks in labor markets: wage and employment dynamics and inequality. J Econ Theory 132:27–46

  18. Camerer C (2003) Behavioural game theory: experiments in strategic interaction. Russell Sage Foundation, Princeton University Press, Princeton

  19. Camerer C, Ho TH (1999) Experience-weighted attraction learning in normal form games. Econometrica 67:827–874

  20. Cheung YW, Friedman D (1997) Individual learning in normal form games. Games Econ Behav 19:46–76

  21. Colman AM, Thomas C (2006) Schelling’s psychological decision theory: introduction to a special issue. J Econ Psychol 27:603–608

  22. Currarini S, Jackson MO, Pin P (2009) An economic model of friendship: homophily, minorities and segregation. Econometrica 77:1003–1045

  23. Demange G, Wooders M (2005) Group formation in economics: networks, clubs, and coalitions. Cambridge University Press, Cambridge

  24. Dutta B, Mutuswami S (1997) Stable networks. J Econ Theory 76:322–344

  25. Erdős P, Rényi A (1959) On random graphs. Publ Math 6:290–297

  26. Erdős P, Rényi A (1961) On the strength of connectedness of a random graph. Acta Math Acad Sci Hung 12:261–267

  27. Erev I, Roth AE (1998) Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibria. Am Econ Rev 88:848–881

  28. Falk A, Kosfeld M (2003) It’s all about connections: evidence on network formation. Institute for empirical research in economics. Zurich IEER working paper no. 152

  29. Fischbacher U (2007) z-Tree: Zurich toolbox for ready-made economic experiments. Exp Econ 10:171–178

  30. Frank O, Strauss D (1986) Markov graphs. J Am Stat Assoc 81:832–842

  31. Fudenberg D, Levine D (1998) The theory of learning in games. MIT Press, Cambridge

  32. Goeree JK, Riedl A, Ule A (2009) In search of stars: network formation among heterogeneous agents. Games Econ Behav 67:445–466

  33. Goodreau SM, Kitts JA, Morris M (2009) Birds of a feather, or friend of a friend? Using exponential random graph models to investigate adolescent social networks. Demography 46:103–125

  34. Goyal S, Joshi S (2003) Networks of collaboration in oligopoly. Games Econ Behav 43:57–85

  35. Goyal S, Moraga Gonzalez JL (2001) R&D networks. Rand J Econ 32:686–707

  36. Goyal S (2005) Learning in networks: a survey. In: Demange G, Wooders M (eds) Group formation in economics: networks, clubs, and coalitions. Cambridge University Press, Cambridge

  37. Granovetter M (1973) The strength of weak ties. Am J Sociol 78:1360–1380

  38. Granovetter M (1985) Economic action and social structure: the problem of embeddedness. Am J Sociol 91:481–510

  39. Granovetter M (1995) Getting a job: a study of contacts and careers (2nd edn). University of Chicago Press, Chicago

  40. Janssen MCW (2001) Rationalizing focal points. Theor Decis 50:119–148

  41. Jackson MO, Wolinsky A (1996) A strategic model of economic and social networks. J Econ Theory 71:44–74

  42. Jackson MO (2004) A survey of models of network formation: stability and efficiency. In: Demange G, Wooders M (eds) Group formation in economics: networks, clubs, and coalitions. Cambridge University Press, Cambridge

  43. Jackson MO (2010) An overview of social networks and economic applications. In: Benhabib J, Bisin A, Jackson MO (eds) The handbook of social economics. Elsevier Press, Amsterdam (Forthcoming)

  44. Kosfeld M (2004) Economic networks in the laboratory: a survey. Rev Netw Econ 3:20–41

  45. Lazarsfeld P, Merton RK (1954) Friendship as a social process: a substantive and methodological analysis. In: Berger M, Abel T, Page CH (eds) Freedom and control in modern society. Van Nostrand, New York

  46. Mehta J, Starmer C, Sugden R (1994) The nature of salience: an experimental investigation of pure coordination games. Am Econ Rev 84:658–673

  47. Mookherjee D, Sopher B (1994) Learning behaviour in an experimental matching pennies game. Games Econ Behav 7:62–91

  48. Mookherjee D, Sopher B (1997) Learning and decision costs in experimental constant sum games. Games Econ Behav 19:97–132

  49. Morris S (2000) Contagion. Rev Econ Stud 67:57–78

  50. Radner R (1993) The organization of decentralized information processing. Econometrica 61:1110–1147

  51. Rauch JE, Hamilton GG (2001) Networks and markets: concepts for bridging disciplines. In: Rauch JE, Casella A (eds) Networks and markets. Russell Sage Foundation, Princeton University Press, Princeton

  52. Roth AE, Erev I (1995) Learning in extensive-form games: experimental data and simple dynamic models in the intermediate term. Games Econ Behav 8:164–212

  53. Salmon TC (2001) An evaluation of econometric models of adaptive learning. Econometrica 69:1597–1628

  54. Schelling T (1960) The strategy of conflict. Harvard University Press, Boston

  55. Schweitzer F, Fagiolo G, Sornette D, Vega-Redondo F, Vespignani A, White DR (2009a) Economic networks: the new challenges. Science 325:422–425

  56. Schweitzer F, Fagiolo G, Sornette D, Vega-Redondo F, White DR (2009b) Economic networks: what do we know and what do we need to know? Adv Complex Syst 12:407–422

  57. Sugden R (1995) A theory of focal points. Econ J 105:533–550

  58. Van Huyck JB, Battalio RC, Rankin FW (1997) On the origin of convention: evidence from coordination games. Econ J 107:576–596

  59. Vega Redondo F (2003) Economics and the theory of games. Cambridge University Press, Cambridge

  60. Vega Redondo F (2007) Complex social networks. Econometric society monograph. Cambridge University Press, Cambridge

  61. Watts DJ, Strogatz S (1998) Collective dynamics of small-world networks. Nature 393:440–442

  62. Wellman B, Berkowitz SD (1988) Social structures: a network approach. Cambridge University Press, Cambridge



Financial support from the Italian MIUR and from Fondazione Cariplo is gratefully acknowledged.

Author information



Corresponding author

Correspondence to Michele Bernasconi.


Appendix 1: Example of instructions for the experiment

The experiments were conducted in Italian. Here is a translation of the instructions for the experiment on the mono-directional flow model with low cost (m0.5) conducted in Wave 1. The instructions for the other treatments were changed accordingly.

Welcome to an experiment in economic decision-making

This experiment is devoted to the study of network formation processes in which valuable information is transmitted.

The experiment consists of a series of periods in which you should make decisions.

If you follow the instructions carefully and make good decisions, you can earn a considerable amount of money, which will be paid in cash at the Bank… of the Università dell’Insubria….

There are instructors in the room who can clarify any doubts. If you have any questions, raise your hand and wait for an instructor to come to you.

An experiment on information transmission

In this experiment you will always interact with three other participants. During the whole experiment these participants will remain the same. During the experiment you are asked not to speak in any way with the other participants.

Each participant is represented by one of the following symbols: @, #, *, %. You will only be informed about your symbol at the beginning of the experiment. Your symbol will only be known to yourself. Do not communicate your identity to anyone else.

In the experiment, each participant has some information that only he or she is aware of. The exact nature of the information is irrelevant to your earnings in the experiment. What is important is that the information owned by each participant is worth 1 point. This value is the same for each participant.

You have immediate access to your information, without having to take any action.

However, to access the information owned by other participants, you have to communicate with them.

You can only access the information held by another participant if there exists a connection that allows the information transmission between you and him.

Be aware that you can access the information held by another participant either through a direct connection (for instance, you are @ and # is directly connected with you) or through a chain of connections (for instance you, @, are connected with * while * is connected with #).

It is important to remember that the information is transmitted in just one direction. If you are, directly or indirectly, connected with #, the information held by # will arrive to you but not the other way round. In fact, if # wants to observe your information, he or she has to be connected with you, either directly or indirectly.

Remember that the value of the information you access does not depend on the number of connections that allow you to observe it.

Connection cost

To open a connection is costly.

If you decide to establish a direct connection with another participant you must spend an amount equivalent to 0.5 points.

Your total cost amounts to 0.5 points times the number of direct connections you establish.

If you decide not to open a connection with anyone you do not have to pay anything. Remember that you observe your own information automatically without the need of any connection.

An example

You can think of the connections between you and the others as arrows from them to you. The arrow indicates that the information of the others is flowing in your direction. The arrows form a network which shows the information flows between the players.

The arrows of the network also show which player has created a connection. Indeed, for each arrow, the player toward whom the arrow points is the one who has created that connection, bearing the cost.

Try to observe the information transmission and the connection costs of the following network:


First of all observe the number of connections opened by each player.

You can see the number of direct connections established by a player simply by counting the number of arrows pointing in his or her direction. Hence, you can see that

  • % has not established any connections,

  • neither has # established any connections,

  • @ has established just one connection (with *),

  • * has established two connections (one with # and another with @).

You can now calculate the total cost of the connections made by each player, multiplying by 0.5 points the number of connections he has established:

  • % does not spend anything,

  • # does not spend anything,

  • @ spends 0.5 points,

  • * spends 1 point.

Now think about how the information is transmitted in this network. Remember that the information circulates in the same direction as the arrows.

This means that the information of # flows in a direct way to *, but not vice versa.

Moreover, since there is an arrow from @ to *, * also directly observes the information of @.

Note that in this case @ is also able to observe the information of *, since he or she has decided to establish a connection with *.

You also have to consider how the information is transmitted through indirect connections. As a matter of fact, through *, @ can also have access indirectly to the information of #. However you can see that the opposite is not true.

Player % is isolated, as he or she has not established any connection. Nevertheless, remember that each player always observes his or her own information.

Thus, to summarize the number of pieces of information observed by each player through the network, we can say that

  • # only observes his or her own information,

  • * and @ each observe 3 pieces of information (their own and those of the other two players) through direct or indirect connections,

  • % only observes his or her own information.


The experiment of network formation will be repeated several times.

What you will earn from participating in the experiment depends on the type of network formed in each period.

In particular, the profit of each participant in each period will be given by the value of all information observed by him or her in that period through direct and indirect connections, minus the total cost of the direct connections established by him or her.

The profit of each player in each period will then be calculated by counting the pieces of information observed and attributing 1 point to each. From this amount 0.5 points will be subtracted for each direct connection established by him or her.

In the above example it is easy to calculate the points obtained by each participant:

  • % earns 1 point: observes only one piece of information, his or her own, and does not bear any cost.

  • # also earns 1 point: observes only his or her own information and does not spend anything.

  • * earns 2 points: he or she observes 3 pieces of information and spends 1 point for the two connections.

  • @ earns 2.5 points: he or she observes 3 pieces of information and spends 0.5 points in one connection.

The total amount for participating in the experiment will then be given by the sum of all points obtained in each period, converted into euro.

In particular, in each period the points earned will be converted into euro through the following rule:

$$ \text{Euro} = \text{Points} \times 0.5 $$
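As an illustrative aside (not part of the original instructions), the payoff rule and the conversion into euro can be checked with a short Python sketch. It assumes the one-way flow rule described above: a player observes his or her own information plus that of every player reachable by following connections, and pays 0.5 points per direct connection. The names `observed`, `payoff` and `example` are hypothetical.

```python
def observed(player, links):
    """Players whose information `player` observes (own information included)."""
    seen = {player}
    stack = [player]
    while stack:
        current = stack.pop()
        # Information flows to `current` from every player he or she connects to.
        for target in links.get(current, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def payoff(player, links, link_cost=0.5, info_value=1.0):
    """Points earned: value of observed information minus cost of own connections."""
    return info_value * len(observed(player, links)) - link_cost * len(links.get(player, ()))


# The example network: @ connects to *, and * connects to # and @.
example = {"%": set(), "#": set(), "@": {"*"}, "*": {"#", "@"}}
points = {p: payoff(p, example) for p in example}
euros = {p: pts * 0.5 for p, pts in points.items()}
```

Under these assumptions `points` reproduces the figures listed in the instructions (1 for % and #, 2 for *, 2.5 for @), and `euros` halves them.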

Payment for participation in the experiment will be made after the experiment ends.

Computer support for the experiment

Hence, the experiment consists of deciding on the connections to be established with the other participants in a sequence of periods. To assist you with your decisions, we have prepared some computer support.

At the beginning of the experiment, an initial screen will tell you whether you are @, #, * or %. This identity will remain the same during the whole experiment. The experiment proper will then start with the sequence of periods.

In each period, you will be shown two screens in succession: in the first you make your choice; in the second you are informed of the network structure and the points earned in that period.

The screen for your choice in the experiment

In each period of the experiment you will be asked to decide whether to establish direct connections and, if so, with which of the other participants. To make your choices you will have up to 2 minutes in each period.

You can make your choice using the computer screen in front of you. Figure 1 represents a typical screen for making your choice.


The screen reminds you who you are (@ or # or * or %), it is numbered according to the period you are in and it indicates the remaining time to make your choice. For example, the figure refers to a hypothetical player #, in period 1, who still has 29 s to make his or her own choice.

On the top of the screen you will find the most important information to have in mind when you make your choice, i.e. that each connection costs 0.5 points and that you observe your own information automatically without needing any connection.

The screen reminds you that it is not advisable to activate a connection with yourself.

On the bottom of the screen, there are four cells with a similar label: Your connections to *, Your connections to %, Your connections to @, Your connections to #. Underneath each of these cells there is an empty space to introduce your choice.

In particular,

  • If you intend to establish a connection with a specific player you should insert “1” in the empty space under the cell that corresponds to his symbol.

  • If instead you intend not to create a connection with a specific player you should insert “0” in the empty space under the cell that corresponds to his symbol.

  • 0 and 1 are the only accepted characters. If you insert any other character an error message will show up.

  • You can always modify your choice until time expires. When you have decided definitely on all connections, you have to confirm your choice by pressing the button Confirm.

The results screen, with the network structure and the profits

After having made your decision, you will receive a waiting message. When all participants have taken their decisions on the direct connections, the network will be formed. The computer will then show a screen with the network formed and the points earned by each player. This screen will look like the one in Fig. 2.


The screen shows a table. Each row of this table corresponds to one of the four players: *, %, @, #.

All rows have cells.

If inside a cell there is 1, it means that the player of that row has decided to establish a connection with the player represented in the column.

If inside a cell there is 0, it means that the player of that row has decided not to establish a connection with the player represented in the column.

The connections made by you and by the other players of the group determine the structure of the network and the payoff points earned by each player. These are shown in the last column on the right of the connections table.

Figure 2 refers, for example, to a period in which a network was formed with the following characteristics:

  • Player * has established a connection with # and one with @. His profit is 2 points.

  • Player % hasn’t established any connection with any of the other players. His profit is 1 point.

  • Player @ has established one connection with *. His profit is 2.5 points.

  • Player # hasn’t established any connection with any of the other players. His profit is 1 point.

Please note that these are the same characteristics of the network represented with the graph of the previous example. In fact, the network is the same.

The screen does not show the network graph. You will find next to your computer sheets of paper to draw the graph of the network (see Figure). You can also copy the direct links formed by you and the other players in the empty table, with the points earned by each in the period.


Among other things, this will be useful for checking your total profit over all periods of the experiment.

How the experiment continues

After you have seen the structure of the network and the earned points for a sufficient amount of time, the experiment will move on to the next period. Again all participants should make decisions, and a network will be formed that gives profits, which will be communicated by the computer through a new results screen.

End of the experiment

The experiment will go on for a number of periods, until a different screen appears. On this screen you will be asked to fill in some information useful for your payment.

The computer will then calculate the amount you have earned for participating in the experiment, converting the total points scored into euro through the formula previously indicated.

You can withdraw your payment for the participation in the experiment in the office of Bank… of the Università dell’Insubria, address…

Appendix 2: Models of learning

Reinforcement (R)

In the Reinforcement model, in the first period each player \(i = 1,\ldots, I\) has an initial propensity to play any strategy \(n\) of her \(N_i\) strategies. Such a propensity is represented by \(q_{in}(t)\) for any period of time \(t\). Strategies with higher propensity are played with higher probability. The probability of player \(i\) choosing strategy \(n\) at time \(t\) can be found using \(p_{in}(t) =\frac{q_{in}( t) }{\sum_{m}q_{im}( t) }\). It is usually assumed that all initial propensities are strictly positive, so that at all times there is a positive probability of a strategy being picked.

In all our experimental games, \(I = 4\), and the set of strategies is the same for all agents, \(N_i = N\) for any \(i\), with \(\left\vert N\right\vert=16\). Moreover, in order to guarantee that the propensities always stayed strictly positive even in the (unlikely) case of repeated plays of a dominated strategy with a negative payoff, we assumed that in any game \(\forall i,\ q_{in}(1) = 22 \times 2.5 = 55\). Any other choice would only re-scale the quantitative findings with no substantial effects.

Any learning model also needs an updating rule. In this paper, we only focused on the standard basic reinforcement model, where the propensities are updated by adding the payoff x received in period t by playing strategy n to the previous propensity. Formally, the updating rule is

$$\begin{cases} q_{in}(t+1) = q_{in}(t) + x & \text{if } n \text{ is played at } t \\ q_{im}(t+1) = q_{im}(t) & \forall m \ne n \end{cases}, $$

that is, only the \(n\)th propensity is changed. The reason for this is that, since actions other than \(n\) were not chosen, the payoff they would have received could not be observed. Also note that the parameterization of the initial propensity \(q_{in}(1) = 55\) takes care of the existence of negative payoffs in the experimental games and rules out the technical problem of possibly negative propensities as well as of undefined probabilities by introducing a difference between reinforcements and payoffs in the spirit of Erev and Roth (1998).
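The basic reinforcement rule can be sketched in a few lines of Python. This is only an illustration under the stated parameterization (16 strategies, initial propensity 55, additive updating of the played strategy); the class and method names are hypothetical and the sketch is not the estimation code used for the results.

```python
import random


class ReinforcementLearner:
    """Basic reinforcement: propensities q_in(t) with additive payoff updating."""

    def __init__(self, n_strategies=16, initial_propensity=55.0, rng=None):
        self.q = [initial_propensity] * n_strategies
        self.rng = rng if rng is not None else random.Random()

    def probabilities(self):
        """p_in(t) = q_in(t) / sum_m q_im(t)."""
        total = sum(self.q)
        return [q / total for q in self.q]

    def choose(self):
        """Draw a strategy index with probability proportional to its propensity."""
        return self.rng.choices(range(len(self.q)), weights=self.q, k=1)[0]

    def update(self, played, payoff):
        """Only the propensity of the strategy actually played is changed."""
        self.q[played] += payoff
```

A typical period would call `choose()`, observe the realized payoff `x`, and then call `update(played, x)`; all strategies start with probability 1/16.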

Belief learning

A standard formalization of belief learning is commonly used for the case of two-player games, \(I = 2\). Each subject \(i\)’s beliefs about his/her opponent’s actions can be represented by a vector \(v_i\) containing a number of elements equal to the rank of the particular payoff matrix used in the game. Each element represents the weight player \(i\) places on the opponent choosing each pure strategy. Thus \(v_{in}(t)\) represents the weight that player \(i\) gives to his/her opponent playing pure strategy \(n\) in period \(t\). The probability with which player \(i\) believes his/her opponent will play strategy \(n\) is then \(\pi _{in}(t)=\frac{v_{in}(t) }{\sum_{m}v_{im}(t) }\). The player then chooses the pure strategy that is a best response to the probability distribution. In case of a tie, the player is assumed to choose randomly between all the possible best response strategies.
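A minimal Python sketch of this two-player scheme follows. It assumes that "best response to the probability distribution" means maximizing expected payoff, and that weights are updated by adding 1 to the strategy the opponent was observed to play; the latter is the usual fictitious-play convention rather than something stated above. All names are hypothetical.

```python
import random


def beliefs(weights):
    """pi_in(t) = v_in(t) / sum_m v_im(t)."""
    total = sum(weights)
    return [w / total for w in weights]


def best_responses(payoff_matrix, weights, tol=1e-12):
    """Own strategies maximizing expected payoff against the current beliefs.

    payoff_matrix[a][n] is the player's payoff from own strategy a
    when the opponent plays strategy n.
    """
    probs = beliefs(weights)
    expected = [sum(p * row[n] for n, p in enumerate(probs)) for row in payoff_matrix]
    best = max(expected)
    return [a for a, value in enumerate(expected) if best - value <= tol]


def choose(payoff_matrix, weights, rng=random):
    """Pick one best response, breaking ties at random."""
    return rng.choice(best_responses(payoff_matrix, weights))


def update(weights, observed_strategy):
    """Assumed updating rule: add 1 to the weight of the observed strategy."""
    new_weights = list(weights)
    new_weights[observed_strategy] += 1
    return new_weights
```

In a 2 × 2 pure coordination game with equal initial weights, both strategies are best responses; after the opponent is seen playing the second strategy once, only the second strategy remains a best response.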

Two extensions to the more general case of I > 2 players are possible. The first is to calculate an n × (I − 1) matrix \(V_i(t)\) for each player i, containing the weight i places on each of his/her I − 1 opponents playing each pure strategy. In such an individual opponent belief learning, player i then chooses the strategy that is a best response to the combination of the most probable pure strategies of each of his/her opponents. In our network formation games this formally implies, first, identifying the highest element \(\overline{v_{inj}}(t) \) for each column j of the matrix, and then selecting i’s best response to the network formed by the other 3 opponents each playing their most probable strategies \(\overline{v_{inj}}(t) \).

Note, however, that this generalization of belief learning to our four-player games implies that subjects would not only face a relatively time-consuming computational effort, but would also engage in a rather sophisticated kind of learning. Indeed, given that the network to which a subject best responds is formed exclusively by the opponents’ most probable strategies, it may well be that this particular network has never been seen in the past. In other words, individual opponent belief learning is mainly an abstract procedure (based on joint probability distributions), which in theory responds purely to virtual networks. This makes it not particularly appealing for understanding real subjects’ behaviour.
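The individual opponent procedure can be sketched as follows. The belief matrix is stored here as one list of weights per opponent (one column each), and `payoff_fn(a, profile)` is a hypothetical stand-in for the payoff player i earns from own strategy a against a given opponent profile; neither name comes from the paper. Ties for the most probable strategy are resolved by taking the first maximum, which is an assumption of this sketch.

```python
def most_probable_profile(V):
    """V[j][n] is the weight on opponent j playing strategy n.

    Returns, for each opponent, the index of the highest weight
    (first index in case of a tie, by assumption).
    """
    return tuple(max(range(len(col)), key=col.__getitem__) for col in V)

def best_responses_to_profile(payoff_fn, profile, num_strategies):
    """All own strategies maximizing payoff against a fixed opponent profile."""
    values = [payoff_fn(a, profile) for a in range(num_strategies)]
    best = max(values)
    return [a for a, v in enumerate(values) if v == best]
```

Note that the profile returned by `most_probable_profile` combines each opponent's modal strategy separately, so it can describe a residual network that was never actually observed, which is the point made in the paragraph above.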

In contrast, an alternative generalization of belief learning with I players is based on the idea that subjects can see the structures formed in the past and can easily see how often a particular residual network has emerged. That is, residual opponent belief learning assumes that, in four-player games for instance, each subject only keeps track of the observed combinations of the pure strategies played by all three of his/her opponents and reacts exclusively to residual networks. Formally, in such a case an \(n^{I-1} \times 1\) vector \(v_{i(-i)}(t)\) needs to be compiled by each player: any element \(v_{i(n,m,\ldots,l)(-i)}(t)\) represents the weight that player i gives to the possible residual network formed when his/her opponents are playing pure strategies n, m, …, l, respectively, in period t.

This generalization would also be rather demanding in terms of calculation time, as it would require each subject to fill all the \(16^3 = 4{,}096\) elements of the vector. It should be underlined, however, that the calculation only involves strategy combinations corresponding to residual networks seen in the past. Thus, while all unobserved residual networks simply get zero weight, players are only supposed to keep track of one structure per period, which in standard experiments seems to be a reasonable requirement.
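The remark that only observed residual networks carry weight suggests a sparse representation. The sketch below, with illustrative strategy indices and hypothetical names, stores weights in a `Counter` keyed by the opponents' strategy triple, so that the 4,096 possible entries never need to be filled explicitly.

```python
from collections import Counter

NUM_STRATEGIES = 16
NUM_OPPONENTS = 3

# Size of the full belief vector: one entry per possible residual network.
dense_size = NUM_STRATEGIES ** NUM_OPPONENTS

# Sparse version: unobserved residual networks implicitly have weight 0.
beliefs = Counter()
beliefs[(2, 7, 11)] += 1   # illustrative observations, not experimental data
beliefs[(2, 7, 11)] += 1
beliefs[(0, 0, 5)] += 1

def residual_probabilities(beliefs):
    """Normalize the observed weights into a probability for each residual network."""
    total = sum(beliefs.values())
    return {profile: w / total for profile, w in beliefs.items()}
```

After three periods the sparse vector holds only two entries, while looking up an unobserved triple such as `beliefs[(1, 1, 1)]` returns zero without storing anything.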

In the data analysis of our four-person experimental games, we have calculated both generalizations of belief learning for each subject. However, having found that, with extremely few exceptions, they yield the same probability distributions, we only refer to the vector formulation of the residual opponent model.

The belief learning variants typically differ only in the way they model how the belief vector \(v_{i(-i)}(t)\) is updated.

Fictitious play ( F )

The pure deterministic Fictitious Play learning model that we adopted begins by setting zero weights on any combination of strategies and residual networks, \(v_{i(-i)}(0) = [0]\). Therefore, subjects choose randomly in the first period. For all subsequent periods, let \(y^{\ast }=\left[ n^{\ast },m^{\ast },\ldots,l^{\ast}\right] \) be the choices of all player i’s opponents in period t − 1. The Fictitious Play learning model updates the belief vector by setting

$$ \left\{ {\begin{array}{*{20}c} {v_{{iy^{ * } \left( { - i} \right)}} (t) = v_{{iy^{ * } \left( { - i} \right)}} \left( {t - 1} \right) + 1} \hfill & {{\text{with}}\,y^{ * } = \left[ {n^{ * } ,m^{ * } , \ldots ,l^{ * } } \right]\,{\text{chosen}}\,{\text{at}}\,t - 1} \hfill \\ {v_{{ix\left( { - i} \right)}} (t) = v_{{ix\left( { - i} \right)}} \left( {t - 1} \right)} \hfill & {\forall x \ne y^{ * } } \hfill \\ \end{array} } \right.. $$

Thus, a player who learns according to the Fictitious Play model uses the entire history of opponents’ past strategies to form her beliefs. Subjects’ beliefs are simply the observed frequency with which all opponents have simultaneously used each combination of individual strategies.
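A minimal sketch of this update, under the residual opponent formulation, might look as follows. The names are hypothetical, and `payoff_fn(a, profile)` again stands in for the payoff of own strategy a against an observed opponent profile; it is not taken from the paper.

```python
from collections import Counter

def fictitious_play_update(beliefs, observed):
    """Add 1 to the weight of the opponent profile observed at t - 1."""
    new_beliefs = Counter(beliefs)
    new_beliefs[tuple(observed)] += 1
    return new_beliefs

def fp_best_responses(beliefs, payoff_fn, num_strategies, tol=1e-12):
    """Best responses to the empirical frequencies of residual networks.

    With no observations (first period) every strategy is returned,
    so a uniform draw from the result gives a random first choice.
    """
    if not beliefs:
        return list(range(num_strategies))
    total = sum(beliefs.values())
    expected = [
        sum(w / total * payoff_fn(a, prof) for prof, w in beliefs.items())
        for a in range(num_strategies)
    ]
    best = max(expected)
    return [a for a, e in enumerate(expected) if abs(e - best) <= tol]
```

Because the weights are simple counts, a residual network seen twice in three periods receives belief 2/3, in line with the observed-frequency interpretation above.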

Myopic best response ( M )

Alternatively, the Myopic Best Response model assumes that players update their beliefs by setting

$$ \left\{ \begin{array}{ll} v_{iy^{\ast }\left( -i\right) }(t) =1 & \hbox {with }y^{\ast }= \left[ n^{\ast },m^{\ast },\ldots,l^{\ast }\right] \hbox { chosen at }t-1 \cr v_{ix\left( -i\right) }(t) =0 & \forall x\neq y^{\ast } \end{array}\right. . $$

In other words, a player learning according to the Myopic Best Response uses only the most recent period observation to form beliefs. We can immediately see that this type of learning, by treating each subject as assuming opponents will play the same combination of strategies as they did in the previous period, corresponds to what Bala and Goyal call ‘naive best response dynamics’.
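For comparison with Fictitious Play, the myopic rule can be sketched as a belief vector that is overwritten each period rather than accumulated. As before, the function names and `payoff_fn(a, profile)` are hypothetical placeholders, not part of the original analysis.

```python
def myopic_update(observed):
    """Weight 1 on the profile observed at t - 1, weight 0 on everything else."""
    return {tuple(observed): 1}

def myopic_best_responses(observed, payoff_fn, num_strategies):
    """Best responses to the opponents repeating their last-period profile."""
    profile = tuple(observed)
    values = [payoff_fn(a, profile) for a in range(num_strategies)]
    best = max(values)
    return [a for a, v in enumerate(values) if v == best]
```

Since all weight sits on a single residual network, the expected-payoff calculation reduces to evaluating each own strategy against last period's opponent profile.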

Bernasconi, M., Galizzi, M. Network formation in repeated interactions: experimental evidence on dynamic behaviour. Mind Soc 9, 193–228 (2010).

  • Experiments
  • Networks
  • Behavioral game theory
  • Learning dynamics