1 Introduction

In recent years, experimental economics has seen a rise in the collection and analysis of data on choice-processes as diverse as team chat, brain activity, eye movements, response times and many others (see Cooper et al., 2019, introducing a special issue on choice-process data in experimental economics). Depending on the intrusiveness of the method of data collection, these methods are at risk of changing subjects’ behavior by means of changing the underlying reasoning which is meant to be uncovered.Footnote 1

In this study, we want to understand whether the collection of communication data via team chats influences individuals’ reasoning and behavior as they enter the team deliberation, even before hearing others’ arguments. Possible channels include an altered sense of responsibility due to the presence of team partners (aspect 1), anxiety or encouragement due to exposing one’s own suggestions and arguments to anonymous strangers (aspects 2–3), or increased reflection on one’s own arguments due to their verbalization (aspect 3); the three aspects are defined below. Pioneered by Cooper and Kagel (2003, 2005), team communication is used increasingly as a means to understand both team and individual choice-processes (Kocher & Sutter, 2005; Goeree & Yariv, 2011; Burchardi & Penczynski, 2014; Kagel & McGee, 2016; Penczynski, 2016a; Sitzia & Zheng, 2019; Cooper & Kagel, 2022). It is therefore important to know whether this method captures the underlying individual reasoning in an unbiased fashion.

Burchardi and Penczynski (2014) introduce a communication protocol for teams of 2, in which partners simultaneously send a suggested decision to each other, accompanied by a supporting written message. After viewing their partner’s suggestion and message, both make an individual final decision and one of the decisions is randomly implemented as the team’s action. Here, in four between-subject treatments involving variations of this intra-team communication protocol, we explore three aspects of team communication protocols separately: (1) belonging to a team, (2) actively suggesting an action to the team partner, and (3) justifying the suggestion in a written text to the team partner. This separation serves two purposes. First, it allows us to identify which element of team communication causes changes in suggestions, if any. Second, other communication protocols or choice-process methods which only feature a subset of the three aspects will be informed by our results. For example, the verbal ‘thinking aloud’ protocol investigated by Capra (2019) only features the third aspect of verbalization of reasoning and neither the team nor the suggestion aspects. Some protocols involve action suggestions but not free-form text communication (He & Villeval, 2017; Ertac & Gurdal, 2019).

For our purposes, we chose three tasks from different choice environments in which decisions informatively reflect underlying reasoning. Task 1 is an individual choice problem of guessing five cards from a deck of 100 colored cards (Rubinstein, 2002). Task 2 is the Colonel Blotto game, a resource allocation game with a large strategy space (Arad & Rubinstein, 2012). Task 3 is a two-player common-value first-price auction as adapted from Kagel and Levin (1986) by Koch and Penczynski (2018).

Across these three different tasks, we find no systematic evidence of altered individual suggestions due to aspects (1)–(3) of team communication. This result implies that inferring the nature of individual reasoning from this type of team communication is valid and justifies the use of a team setup to explore both individuals’ and teams’ reasoning.

Our results on aspect 3 are in line with Capra (2019), who finds no systematic evidence of any influence from verbal ‘thinking aloud’ reports. While our paper focuses on whether collecting communication data influences individual reasoning, it also relates to the experimental literature on team decision making, which attempts to reveal and understand the differences between individual behavior and team behavior in various contexts (see Kugler et al. (2012) and Charness and Sutter (2012) for reviews on teams’ strategic behavior). Most of the studies on teams compare individuals’ and teams’ behavior given a particular form of communication, usually free-form chat communication (e.g., Cooper & Kagel, 2005; Luhan et al., 2009),Footnote 2 but also face-to-face communication (e.g., Bornstein & Yaniv, 1998; Kocher & Sutter, 2005),Footnote 3 and limited communication, where only a strategy proposal is sent (He & Villeval, 2017).

Several studies have attempted to understand the influence of different forms of communication within teams: full communication compared to no communication (e.g., Sutter & Strassmair, 2009; Cason et al., 2012),Footnote 4 face-to-face communication compared to chat (Meub & Proeger, 2017; Christens et al., 2019), structured compared to unstructured discussion (Park & DeShon, 2018), and one-way messages, sent by some members, compared to no communication (Cooper & Kagel, 2016; Ertac & Gurdal, 2019).

We do observe systematic and significant increases in sophistication when comparing suggestions before the communication to decisions after the communication. Specifically, we show that when one member is more sophisticated than the other, the former is more likely to persuade the latter, which explains teams’ higher sophistication and replicates results from Penczynski (2016a). Furthermore, this effect is more pronounced when communication includes a written message, rather than just a suggested decision. That is, team communication that consists of free-form messages is more effective than limited communication, where only a decision is proposed.Footnote 5

2 Experimental design

The experiment consists of 4 treatments, in each of which 3 tasks are presented sequentially: task 1 is an individual decision and tasks 2 and 3 are strategic games. The outcomes of the tasks are revealed only at the end of the experiment.

2.1 Treatments

Treatment B: baseline. Subjects take a decision as individuals.

Treatment BT: teams without communication. Subjects are matched in teams of two, and each partner’s decision has a 50% chance of being selected as the team’s decision. There is no communication between team partners. A subject’s payoff may therefore be affected by the team partner’s choice and vice versa.

Treatment BTS: teams with communication of the suggestion. Team partners can communicate by simultaneously sending a suggested decision to each other, without a supporting written message. After viewing their team partner’s suggestion, they each make an individual final decision, which is implemented as the team’s action with 50% chance.

Treatment BTSM: teams with communication of the suggestion and message. Team partners can communicate by simultaneously sending a suggested decision to each other, accompanied by a supporting written message. After viewing their team partner’s suggestion and message, they individually take a final decision, which is implemented as the team’s action with 50% chance.

In our baseline treatment B, subjects decide as individuals and play against subjects in their own treatment in the game tasks 2 and 3. In treatments BT, BTS and BTSM, subjects are matched to play as a team, and in the game tasks 2 and 3 they play against strategies of individuals from treatment B. This is chosen to have subjects in all treatments hold similar beliefs regarding their competitors’ behavior.

Individual and revisited decisions.

In what follows we will often refer to subjects’ individual and revisited decisions. The first category includes the decisions observed in treatments B and BT and the suggested decisions in treatments BTS and BTSM; these decisions have in common that they were made by individuals before any communication was received. The second category features the final decisions made in the latter two treatments. Alluding to the fact that here, subjects revisited and potentially revised their individual decisions in light of the communication they received, we label these treatment conditions BTS-R and BTSM-R. This results in a total of 6 conditions to be compared in the experiment: B, BT, BTS, BTSM, BTS-R and BTSM-R.

A comparison of the decisions in treatment B to the decisions in treatment BT will reveal whether merely belonging to a team affects an individual’s initial reasoning. Note that in treatment BT individuals have a smaller chance of affecting their own payoff relative to treatment B, but on the other hand, they have partial responsibility for their team partner’s payoff. This is an essential feature of belonging to a team. The difference between the individual decisions in treatment BT and BTS will reflect the influence of the effort to persuade or to impress the team partner and possibly the preliminary nature of a suggestion.Footnote 6 Comparing individual decisions in treatment BTS and BTSM will test whether verbalization of one’s reasoning improves individual sophistication. In Sect. 4, comparing the revisited decisions (BTS-R and BTSM-R) with the individual decisions (BTS and BTSM) will reveal how the exchange of suggestions alone or with messages affects the sophistication of teams’ actions.

2.2 The tasks

Since our experiment was designed to be carried out online and not take too long to complete, we chose to focus on only three tasks. In an attempt to make our results general, we looked for tasks that (1) are not trivial, (2) come from three different environments, (3) vary in the level or the source of complexity, (4) yield decisions that informatively reflect underlying reasoning, and (5) allow us to identify sophistication. In what follows, we describe the three tasks.

2.2.1 Task 1

Task 1 is an individual choice problem. Subjects are told that five virtual cards were drawn randomly from a deck of 100 cards and placed into five separate boxes marked A, B, C, D and E. They are moreover informed that the deck was initially composed of colored cards according to the following breakdown: 36 of them green, 25 blue, 22 yellow and 17 brown. The aim is to guess the color of the card in each box. Every correct guess yields an individual reward of 1 pound (or an evenly split team reward of 2 pounds in the team settings).

The problem has a straightforward solution. Since green is the most frequent card in the deck at every draw, it is optimal to assign all five cards the color green. Our interest in this task arises from the finding that subjects often do not maximize their expected payoff in this task. Instead, research has shown that individuals tend to engage in probability matching, i.e. they often match their decision frequencies in repeated decisions to the probability of events occurring (Rubinstein, 2002). We measure sophistication in this task by the number of green card guesses made by the individual.
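The gap between the optimal guess and probability matching can be made concrete with a short calculation. The following is a minimal sketch rather than part of the experimental software: the function name `expected_correct` and the particular matching profile are illustrative assumptions, and the only inputs taken from the task description are the deck shares. It relies on the fact that the marginal probability of a given box holding a given color equals that color's share of the deck.

```python
from fractions import Fraction

# Deck composition as stated in the task description.
DECK = {"green": 36, "blue": 25, "yellow": 22, "brown": 17}
DECK_SIZE = sum(DECK.values())  # 100 cards


def expected_correct(guesses):
    """Expected number of correct guesses for a list of color guesses.

    The marginal probability that any single box holds color c is
    DECK[c] / DECK_SIZE, so by linearity of expectation the expected
    number of correct guesses is the sum of these probabilities.
    """
    return sum(Fraction(DECK[color], DECK_SIZE) for color in guesses)


# Optimal profile: guess green for every box.
all_green = ["green"] * 5
# Hypothetical probability-matching profile (illustrative assumption).
matching = ["green", "green", "blue", "yellow", "brown"]

print(float(expected_correct(all_green)))  # 1.8
print(float(expected_correct(matching)))   # 1.36
```

With a reward of 1 pound per correct guess, the all-green profile yields an expected 1.80 pounds, compared with 1.36 pounds for this particular matching profile.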

2.2.2 Task 2

Task 2 is the so-called Colonel Blotto game, first proposed by Borel (1921). This game can be described as a competitive resource allocation game with a large and complex strategy space. Subjects in the experiment are asked to assign 120 ‘troops’ among six separate ‘battlefields’ knowing that their deployment of troops would face those of other participants in the experiment. In any encounter with an opponent, subjects win a battlefield if they assign more troops to the particular battlefield and their score in the encounter is the number of battlefields won. Subjects’ deployment strategies enter a round–robin tournament in which they are automatically played against the deployment strategies of participants in our baseline treatment B. This is our way of ensuring that subjects hold similar expectations regarding their opponents’ behavior across treatments. Subjects are told that if they are among the top 3 scorers in the tournament, they receive an individual reward of 5 pounds (or an evenly split team reward of 10 pounds in the team settings).
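The scoring rule described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the code used in the experiment: the function names and the two example allocations are hypothetical, and ties on a battlefield are assumed to award the battlefield to neither side, since the text only says a battlefield is won by assigning more troops.

```python
TROOPS, FIELDS = 120, 6


def is_valid(allocation):
    """Check that an allocation uses exactly 120 troops on six battlefields."""
    return (len(allocation) == FIELDS
            and all(isinstance(t, int) and t >= 0 for t in allocation)
            and sum(allocation) == TROOPS)


def encounter_score(own, other):
    """Number of battlefields on which `own` assigns strictly more troops."""
    return sum(1 for a, b in zip(own, other) if a > b)


def tournament_scores(candidates, baseline):
    """Total score of each candidate against every baseline strategy.

    Mirrors the design in which strategies are played against the
    strategies of participants in the baseline treatment B.
    """
    return [sum(encounter_score(c, b) for b in baseline) for c in candidates]


# Hypothetical example allocations (not taken from the data).
uniform = [20, 20, 20, 20, 20, 20]
focused = [30, 30, 30, 30, 0, 0]

print(encounter_score(focused, uniform))  # 4
print(encounter_score(uniform, focused))  # 2
```

In this example the focused allocation wins four battlefields against the uniform one, which hints at why concentrating troops can outperform an even split.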

The Blotto game serves as a platform to study relatively complicated strategic situations in which it is hard to identify simple decision rules due to the large size of the strategy space. Using the Blotto game as a workhorse, Arad and Rubinstein (2012) and Arad and Penczynski (2021) have shown that subjects in this game deal with complexity by thinking in terms of different features or dimensions of strategies rather than by thinking about strategies per se. In addition to subjects’ scores in the tournament, we use this previous knowledge on dimensional reasoning to identify sophisticated strategies in our data.

2.2.3 Task 3

Task 3 is a simplified version of a common value auction game (CVA). In the standard CVA setting by Kagel and Levin (1986), the common value of an auctioned item \(W^*\in [\underline{W}, \overline{W}]\) is randomly determined, with all values equally likely. Every bidder receives an independently drawn private signal \(x_i\in [W^*-\delta , W^*+\delta ]\), with \(\delta >0\). Bids \(a_i\) are submitted in a first-price sealed bid auction in which the highest bidder wins the auction and pays his or her bid. The payoff of the highest-bidding player is \(u_i=W^*-a_i\). The payoff of all other players is \(u_i=0\).

Our simplified game is taken from Koch and Penczynski (2018), who allow for only two signals and two players. The common value \(W^*\) is uniformly distributed in the interval [25, 225]. One bidder receives a private binary signal \(x_i\in \{W^*-3,W^*+3\}\) while the other bidder privately receives the other of the two signals. To ensure an equilibrium in pure strategies, we only allow bids \(a_i\in [x_i-8, x_i+8]\). As a tie-breaker in case of identical bids, the lower-signal player wins the auction. As in Kagel and Levin (1986), bids are submitted in a first-price sealed bid auction. To ensure that subjects hold similar expectations regarding their opponent’s behavior across treatments, we again let teams face strategies of an individual from treatment B with the opposing signal. Any profit or loss resulting from the team’s final bid was fully added to or subtracted from the money accumulated in the experiment up to that point.

The equilibrium of the auction game is for both players to bid their private signal minus 8, which implies that each player wins the auction with 50% chance and that, from the perspective of the winner, the lower of the two possible item values is realized. This is the game’s analogue of the common ‘bid shading’ strategy in first-price auctions, which results when bidders take into account that winning the auction carries negative information about the others’ bids, their signals and hence the item’s value. In an attempt to detect any meaningful difference between treatments, we use the categorization of strategies by sophistication suggested in Koch and Penczynski (2018) (in addition to comparing the raw bid distributions between treatments).
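The payoff rule of the simplified auction can be illustrated numerically. The sketch below is an assumption-laden illustration, not the experimental implementation: the function `auction_payoffs` and its parameter names are hypothetical, bids are expressed relative to the own signal, and the common value of 100 is an arbitrary example.

```python
def auction_payoffs(value, rel_bid_low, rel_bid_high, spread=3, max_shift=8):
    """Payoffs (low-signal bidder, high-signal bidder) in one auction.

    `rel_bid_low` and `rel_bid_high` are bids relative to the own signal,
    i.e. bid minus signal, and must lie in [-max_shift, max_shift].
    """
    for b in (rel_bid_low, rel_bid_high):
        if not -max_shift <= b <= max_shift:
            raise ValueError(f"relative bids must lie in [-{max_shift}, {max_shift}]")
    bid_low = (value - spread) + rel_bid_low    # signal is value - 3
    bid_high = (value + spread) + rel_bid_high  # signal is value + 3
    # First-price rule; ties go to the lower-signal player.
    if bid_low >= bid_high:
        return value - bid_low, 0
    return 0, value - bid_high


# Both bid signal minus 8: the high-signal bidder wins and earns 5.
print(auction_payoffs(100, -8, -8))  # (0, 5)
# Both bid their signal: the winner overpays by 3.
print(auction_payoffs(100, 0, 0))    # (0, -3)
# Identical absolute bids of 95: the tie-breaker favors the low signal.
print(auction_payoffs(100, -2, -8))  # (5, 0)
```

The second call shows the logic behind bid shading: a bidder who wins by bidding the own signal must hold the higher signal and therefore pays more than the item is worth.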

2.3 Experimental procedures

The experiment was carried out online; instructions and screenshots are provided in Appendix C. Subjects were students from various fields of study, recruited from the University of East Anglia and the University of Nottingham. Upon login, we registered a total of 750 subjects who were randomly assigned, within each of 10 sessions, to one of the four treatments. This resulted in 196 subjects assigned to treatment B, 190 to treatment BT, 184 to treatment BTS and 180 to treatment BTSM. The median time it took to complete the experiment was 12.87 minutes. Anonymity within and between teams/individuals was maintained both during and following the experiment. Subjects were on average 22.7 years old, and 61.6% of them reported being female.Footnote 7 Final earnings in the experiment ranged between £0 and £19, with an average of £6.30. The outcome and the winning amounts were e-mailed to subjects at the end of each session. Amazon vouchers for the winning amounts were sent to subjects 2 days later.

3 Results

Recall that the 6 conditions of our experiment differentiate between subjects’ individual decisions and revisited decisions. In this section, we focus on differences in subjects’ individual decisions, using the revisited decision data as useful benchmarks against which to compare our results. In Sect. 4, we turn our attention to the effects of exchanged communication and the processes leading to changed sophistication.

3.1 Task 1 - Results

Fig. 1 Green card guesses

Recall that subjects are asked to guess the color of the card in each of the five boxes, where all guesses other than green are dominated by the guess of a green card. To obtain a simple indicator of individual sophistication, we construct a variable which counts the number of times a subject guessed the color green. Figure 1 summarizes our data by showing histograms and associated summary statistics of the number of green card guesses in the individual and revisited decisions for each treatment. From the modes of these distributions it is evident that for most of our conditions, assigning 2 cards the color green is the most frequent decision. This is compatible with the idea that less sophisticated subjects would try to mimic the deck’s distribution of colors in their guesses. Another spike is observed at the optimal point of assigning all five cards the color green.

Table 1 Statistical tests for task 1

To uncover any differences across our treatment conditions, we statistically compared all six against one another. The resulting p-values of our comparisons are reported in Table 1. Of main interest to us are comparisons of values in bold as these reveal any differences in subjects’ individual reasoning (before communication); comparisons to values in italics instead refer to subjects’ revisited decisions and therefore relate to the question of how persuasive the exchanged communication was, which we investigate in Sect. 4. Compared to our baseline B, we can see that neither belonging to a team (BT), nor the additional opportunity of communicating a suggestion (BTS), nor verbalising one’s reasoning in a text message (BTSM) has any significant impact on the sophistication of the individual decisions according to Wilcoxon rank-sum tests (p = 0.741, p = 0.826, and p = 0.816, respectively). The only significant differences arise from a comparison of individual and revisited decisions. While receiving a bare suggestion is not powerful enough to increase sophistication (BTS vs. BTS-R: p = 0.183), a pronounced effect is observed when suggestions are supported by a text message (BTSM vs. BTSM-R: p < 0.01).Footnote 8 While the revisited decision data help to show that our analysis is sufficiently powered, we performed a supplementary simulation exercise in Appendix A.2 for all tasks which further narrows down the scale of detectable effects in our data.
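For readers unfamiliar with the test, a two-sample Wilcoxon rank-sum comparison of green-guess counts might be sketched as follows. This is a simplified illustration and not the procedure used to produce Table 1: it uses midranks for ties and a tie-corrected normal approximation without continuity correction, which is an assumption about the variant, and the two samples are made-up numbers that do not come from the experiment.

```python
import math
from collections import Counter


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via a normal approximation.

    Returns (z, p). Ties receive midranks and the variance is
    tie-corrected; no continuity correction is applied.
    """
    pooled = sorted(list(x) + list(y))
    n1, n2 = len(x), len(y)
    n = n1 + n2

    # Assign each distinct value the average of the ranks it occupies.
    midrank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        midrank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1, ..., j
        i = j

    w = sum(midrank[v] for v in x)           # rank sum of the first sample
    mean = n1 * (n + 1) / 2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:                             # all observations identical
        return 0.0, 1.0
    z = (w - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided p-value
    return z, p


# Hypothetical counts of green guesses (0 to 5), for illustration only.
sample_a = [0, 0, 1, 1]
sample_b = [4, 5, 5, 5]
z, p = rank_sum_test(sample_a, sample_b)
print(round(z, 2), p < 0.05)
```

Because the outcome takes only six distinct values, ties are frequent, which is why the sketch applies the tie correction to the variance.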

3.2 Task 2 - Results

In our version of the Colonel Blotto game, a strategy is an assignment of 120 ‘troops’ among six separate ‘battlefields’. To begin with, we calculated the expected scores of subjects’ strategies in the tournament as a general measure of their success.Footnote 9 Figure 2 plots the results for each treatment condition, supplemented by tests of differences in Table 2.

Fig. 2 Expected scores

Table 2 Statistical tests of expected score differences

Comparing our four treatments, we find no significant differences in expected scores stemming from subjects’ individual decisions. Revisited decision scores, however, are marginally higher, which is consistent with our previous finding of increased sophistication after the communication. The expected scores are a useful first measure to explore whether our treatment conditions affected behavior, but one may consider them too crude to satisfactorily detect changes in sophistication.

In previous research, sophistication in the Blotto game was empirically characterized by Arad and Rubinstein (2012) and Arad and Penczynski (2021), who identified three distinct patterns that feature prominently in subjects’ winning strategies. The best performing strategies in the Blotto game usually (i) reinforce between 3 and 5 battlefields, where reinforcement means assigning more than 20 troops to a battlefield, (ii) make frequent use of the unit digit assignments 1, 2 and 3 to marginally trump a competitor’s assignments of troops, and (iii) assign relatively fewer troops to battlefields located on the edges, i.e. to the first or last battlefield, as opposed to the center.

We treat these patterns as our benchmark for sophisticated play, but focus the following analysis on the first of these three dimensions, i.e. the number of reinforced battlefields, as this dimension turns out to be the only one producing significant differences between some of the experimental conditions. A detailed analysis of the remaining two dimensions is provided in Appendix A.1.Footnote 10

Dimension 1: number of reinforced battlefields

To obtain a simple indicator of sophistication along dimension 1, we divided subjects into two groups depending on whether they reinforced 0–2 battlefields (‘intuitive allocation’) or 3–5 battlefields (‘strategic allocation’). Figure 3 presents the results in all treatment conditions. The observed pattern of differences resembles that observed in task 1. As far as subjects’ individual decisions are concerned, we cannot reject the hypothesis of no difference in the distributions of our outcome variable across the four comparison groups (\(\chi ^2\) test, p = 0.391).
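The dimension 1 indicator can be expressed compactly. The following is a minimal sketch with hypothetical function names and example allocations. It applies the definition given earlier (a battlefield is reinforced if it receives more than 20 troops) and the split into 0–2 versus 3–5 reinforced battlefields. With 120 troops, reinforcing all six battlefields is infeasible, since that would require at least 126 troops.

```python
def reinforced_fields(allocation, threshold=20):
    """Count battlefields that receive more than `threshold` troops."""
    return sum(1 for troops in allocation if troops > threshold)


def classify_allocation(allocation):
    """Label an allocation 'intuitive' (0-2 reinforced) or 'strategic' (3-5)."""
    return "strategic" if reinforced_fields(allocation) >= 3 else "intuitive"


# Hypothetical allocations of 120 troops, for illustration only.
print(classify_allocation([20, 20, 20, 20, 20, 20]))  # intuitive
print(classify_allocation([60, 60, 0, 0, 0, 0]))      # intuitive
print(classify_allocation([31, 31, 31, 27, 0, 0]))    # strategic
```

Note that the even split of 20 troops per battlefield reinforces no battlefield under the strict inequality and is therefore classified as intuitive.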

Fig. 3 Sophistication of reinforcements

This is supported by pairwise tests, which are summarized in Table 3 and show no significant differences in any of the comparisons. What we do find, again, are significant differences when comparing subjects’ suggested decisions with their revisited decisions. A pronounced increase in the proportion of sophisticated decisions is induced both by a bare suggestion (BTS vs. BTS-R: p = 0.015) and by a suggestion which is supported by a written explanation (BTSM vs. BTSM-R: p = 0.017).Footnote 11

Table 3 Statistical tests for task 2–dimension 1

3.3 Task 3 - Results

In our common value auction, one bidder receives a private binary signal \(x_i\in \{W^*-3,W^*+3\}\) (where \(W^*\) is the item’s value) while the other bidder privately receives the other of the two signals. We only allow bids \(a_i\in [x_i-8, x_i+8]\). For ease of exposition, we express subjects’ strategies as their bids relative to their signal, i.e. \(b_i=a_i-x_i\). The distributions of relative bids in subjects’ individual and revisited decisions given in Fig. 4 are categorized by sophistication as in Koch and Penczynski (2018).Footnote 12 Table 4 summarizes our test statistics. Regarding subjects’ individual decisions, we find no significant differences when comparing our baseline treatment B to BT or to BTSM, which embeds our full communication protocol (p = 0.188 and p = 0.732, respectively). We do, however, find significantly lower bids in treatment BTS when compared to B (p = 0.046). In fact, there is even some marginal evidence of lower bids in BTS than in BTSM (p = 0.117), which is surprising as BTSM embeds richer communication. Turning to subjects’ revisited decisions, while the high sophistication in BTS did not improve further significantly (BTS vs. BTS-R, p = 0.199), receiving a suggestion together with a written explanation improved the sophistication of bids significantly (BTSM vs. BTSM-R, p = 0.045).

Fig. 4 Categorized bids

Table 4 Statistical tests for task 3

While the significant result of BTS versus B suggests an effect of the team setting with suggestions, it is a result that stands alone both within task 3 and among the other tasks. Within task 3, we deem monotonic effects as more likely than non-monotonic ones because, for example, the communicated suggested decision added in BTS is still part of BTSM, which in turn does not produce significantly different sophistication from B. Therefore, we view the BTS result more as an outlier than as evidence for an effect due to the communicated suggested decision.Footnote 13

4 Communication and revisited decisions

The revisited decisions allow us to better understand the team decision making process (see Penczynski, 2016a) and, specifically in that context, the influence of a richer communication protocol when moving from BTS to BTSM.

A first set of insights can be gained when classifying both team members’ suggested and revisited decisions into the sophistication categories introduced for each task.Footnote 14 On the basis of both team members’ suggested decisions, we define the divergence in sophistication before communication. If the sophistication is not the same, we can identify the more sophisticated team member to be the partner or the player. On the basis of each player’s two decisions (suggested and revisited), we can understand whether the player corrected the sophistication upwards, downwards or not at all.
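The two classifications described above, divergence before communication and correction after it, can be sketched as simple sign comparisons. This is an illustration under explicit assumptions: sophistication is taken to be coded as an ordinal integer per decision, divergence is signed so that a positive value means the partner is the more sophisticated member, and the function names are hypothetical rather than taken from the analysis code.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def divergence(own_suggested, partner_suggested):
    """Divergence in sophistication before communication.

    Assumed convention: +1 if the partner's suggestion is more
    sophisticated, -1 if the player's own suggestion is, 0 if equal.
    """
    return sign(partner_suggested - own_suggested)


def correction(own_suggested, own_revisited):
    """Direction of a player's correction: +1 upwards, -1 downwards, 0 none."""
    return sign(own_revisited - own_suggested)


# Hypothetical team: the player suggests level 1, the partner level 3,
# and the player's revisited decision is at level 3.
print(divergence(1, 3), correction(1, 3))  # 1 1
```

Under this coding, persuasion by the more sophisticated member shows up as a correction with the same sign as the divergence.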

Fig. 5 Correction in sophistication by divergence in sophistication

Figure 5 shows the correction by task, treatment and divergence. The middle of both panels shows that a similar sophistication between players predominantly leads to no correction. In BTS, having a more sophisticated partner often leads to upward correction and being more sophisticated sometimes leads to downward correction. Moving from BTS to BTSM increases the richness of the communication. Interestingly, this induces the two effects to change in an asymmetric fashion. The correction towards higher sophistication increases and the correction towards lower sophistication decreases. This demonstrates the usefulness of rich communication in teams.

Table 5 shows ordered logit and OLS results of the correction in sophistication as a function of a treatment dummy BTSM, the divergence in sophistication as well as an interaction of the two. The table shows clearly and robustly that the divergence is predictive of the direction of the correction. Likewise, the richer communication in BTSM significantly strengthens the effect of positive divergence, possibly thanks to the verbal demonstration of the superior suggested strategy. However, this richness influences correction neither in negative-divergence nor in no-divergence cases.

Table 5 Ologit and OLS regression results

Fig. 6 Correlation between words’ predictiveness of sophistication and correction

The messages from BTSM allow us to explore the mechanism behind richer communication and to relate words both to a decision’s sophistication and to the correction they cause in the team partner’s decision. According to the text analysis, sent words that predict the sophistication of the suggested decision correlate with the received words that predict a correction in the direction of the partner. Figure 6 thus illustrates how the rich communication helps to make a team’s decision more sophisticated. For example, in the card task 1, the word “green” is crucial for the sophisticated strategy and, reassuringly, is highly predictive of both sophistication and correction (Fig. 6a). In Blotto task 2, the correlation is less pronounced, but the word “just”, for example, is highly predictive of sophistication and among the most predictive words of correction (Fig. 6b).Footnote 15 In the auction task 3, a number of very reasonable words are among the 10 most predictive of both sophistication and correction, for example, “lowest”, “profit”, “bid”, “lower” (Fig. 6c).

5 Discussion and conclusion

We presented a rigorous analysis of the effects of different team communication aspects on the sophistication in individual and strategic decision making. Rather than looking at one specific task in isolation, our strategy was to uncover team communication effects in a more systematic way, namely by identifying shared patterns that would come to light across a variety of different tasks.

Across all 3 tasks, we observed that the revisited decisions were on average statistically more sophisticated than the individually suggested ones. In line with Penczynski (2016a), this shows that the communication is effective in improving the team’s sophistication because more sophisticated decisions are identified as superior and are thus more persuasive. This effect is increasing in the richness of the communication. These results also show that our analysis is sufficiently powered to detect systematic effects between our experimental conditions.

Note that by fixing beliefs about opponents’ behavior to the individual choices from treatment B, we muted the potential influence that beliefs of playing against teams rather than individuals might have on reasoning and decisions. Studies using the intra-team communication design have not suggested that the individual reasoning against teams is different from the benchmark individual-against-individual results in the respective literatures (Burchardi & Penczynski, 2014; Penczynski, 2016a, b, 2017; van Elten & Penczynski, 2020). Similar to our results, these studies have instead systematically shown that the decisions after the communication are more sophisticated, suggesting that the impact of team play does not derive from holding particular beliefs about the opposing teams, but rather from the pooling of arguments by the team members.

For each of the games and dimensions we considered, we ran a whole battery of tests by comparing individual decisions in each of the four treatments against one another in an attempt to uncover significant patterns in our data. As far as subjects’ individual decisions are concerned, we found very little to no evidence of effects due to any of the treatment manipulations that we applied in any of the tasks. These findings were moreover substantiated by additional robustness checks which we referred to in the footnotes of each respective section of the analysis.

Having shown that none of the aspects of our team communication protocol (belonging to a team, suggesting an action, reflecting verbally on one’s reasoning) seems to have affected behavior, we believe that such team communication protocols are capable of generating choice-process information in a way that does not distort the choice processes that are meant to be uncovered.