Nonparametric analysis
The baseline (B) and complexity (C) treatments each had 82 subjects, and the inattention (I) treatment had 81 subjects. In this sub-section, we motivate our model with nonparametric analysis.
First, we consider the number of times our subjects executed a no-change, meaning a guessed probability equal to that of the previous period. This is interesting because, given the nature of the information and the comparatively small number of draws, incidences of no-change are predicted by neither Bayesian nor Quasi-Bayesian updating. If such observations are widespread in the data, this is a first piece of nonparametric evidence that these standard models are incomplete.
The maximum possible number of no-changes for each subject is 49: seven opportunities for no change out of eight draws, times the seven stages. The distributions over subjects, separately by treatment, are shown in Fig. 1. The baseline distribution shows a concentration at low values; for both Complexity and Inattention, there appears to be a shift in the distribution towards higher values, as one might expect. The means for each treatment are represented by the vertical lines on the right-hand side. The vertical lines on the left-hand side show the mean number of no-changes that would result from subjects rounding the Bayesian probabilities to two decimal places (or, equivalently, rounding percentages). Clearly, rounding cannot account for the prevalence of no-changes found in the empirical distribution.
The mean is higher under C (22.79) than under B (18.74) (a Mann–Whitney test gives \(p=0.007\)), and higher under I (26.97) than under B (\(p<0.001\)).Footnote 15 This is as anticipated: complexity and inattention should both increase the tendency to leave guesses unchanged. When C and I are compared, the p value is 0.06, indicating mild evidence of a difference between the two treatments.
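For concreteness, this comparison can be carried out with standard tools. The following is a minimal sketch in Python, using simulated stand-ins for the per-subject no-change counts (only the sample sizes and treatment means are taken from the text):

```python
# Mann-Whitney comparison of per-subject no-change counts across treatments.
# The arrays are hypothetical stand-ins for the experimental data.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
baseline = rng.poisson(18.74, size=82)     # 82 subjects in Baseline (B)
complexity = rng.poisson(22.79, size=82)   # 82 subjects in Complexity (C)
inattention = rng.poisson(26.97, size=81)  # 81 subjects in Inattention (I)

for label, treatment in [("C vs B", complexity), ("I vs B", inattention)]:
    u, p = mannwhitneyu(treatment, baseline, alternative="two-sided")
    print(f"{label}: U = {u:.0f}, p = {p:.4f}")
```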
The widespread incidence of no-changes evident in Fig. 1 makes clear that any successful model of our data will have to deal with the phenomenon of whether to adjust, before considering how much to adjust.
Second, when agents do change, it is of interest why they do. In Fig. 2 we plot the binary indicator for updating against the strength of evidence against the maintained beliefs, \(\left| z_{it}\right|\) (the absolute change in the Bayesian posterior since the last time the subject updated; see the top row of Table 2). A Lowess smoother is superimposed, which can be interpreted as the predicted probability of an update for a given value of \(\left| z_{it}\right|\). This provides good nonparametric evidence that higher values of \(\left| z_{it}\right|\) make change more likely. However, across subjects, agents who are more reluctant to change will exhibit both a relatively low probability of update and a higher value of \(\left| z_{it}\right|\). Hence there is an econometric concern that the relationship may be affected by endogeneity bias.Footnote 16
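The Lowess fit in Fig. 2 can be reproduced along the following lines. This is a sketch only: the update indicator and the \(\left| z_{it}\right|\) values below are simulated placeholders for the experimental data.

```python
# Lowess smooth of the update indicator on |z_it|, interpretable as the
# predicted probability of an update at each value of |z_it|.
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(1)
abs_z = np.abs(rng.normal(size=500))                   # placeholder |z_it|
prob = 0.4 + 0.2 * np.clip(abs_z, None, 2.0)           # rising update propensity
update = (rng.uniform(size=500) < prob).astype(float)  # placeholder 0/1 indicator

smoothed = lowess(update, abs_z, frac=0.5)  # returns (|z|, fitted P(update)) pairs
print(smoothed[:5])
```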
Third, Fig. 3 shows the extent of any updating on receipt of a white ball and an orange ball, both as a raw change and as a proportion of the absolute change dictated by Bayes rule. The upper panels indicate that updates are often in step sizes of 0.1 or 0.05 and the 0.6 prior for \(P_t\) was not so asymmetric as to generate artefacts.
In the lower panels, the updates relative to the Bayesian benchmark cluster between zero and one (for a white ball) and minus one and zero (for an orange ball).Footnote 17 This shows that when agents adjust, they tend to do so in a reasonable direction for risk averse agents, raising their probability for Urn 1 when a white ball is drawn and lowering it when an orange ball is drawn.Footnote 18 Furthermore, the clustering indicates that any Quasi-Bayesian representation of their adjustment will require a \(\beta\) parameter less than unity, reflecting insufficient belief adjustment.
These nonparametric statistics are consistent with more than one theoretical approach, but it is not clear that just one approach will explain all the features of the data. With that in mind, we now turn to a model which allows the different approaches to co-exist.
A double hurdle model of belief adjustment
In this section, we develop a parametric double hurdle model which simultaneously considers the decision to update beliefs and the extent to which beliefs are changed when updates occur. The purpose of the model is to act as a testing tool for state-dependent belief adjustment, namely Bayesian belief adjustment and Quasi-Bayesian belief adjustment in the simple version previously defined, as well as (stochastic) time-dependent belief adjustment.
Our econometric task is to model the transformed implied belief \(r_{t}^{*}=\Phi ^{-1}\left( g_{t}^{*}\left( \theta _{i}\right) \right)\), which in turn requires an estimate of risk aversion. We estimate this at the individual level using the technique of Offerman et al. (2009). “Appendix 3: Method for estimating CRRA risk parameter” contains the subject-level details surrounding the estimation of \(\theta _{i}\). On average, agents are risk averse, with a mean \(\theta\) of 0.2.Footnote 19
We will refer to \(r_{it}^{*}\), subject i’s belief in period t, as shorthand for ‘transformed implied belief’. We treat \(r_{it}^{*}\) as the focus of the analysis because it has the same support as \(z_{it}\), the test statistic defined in (4): both live on \((-\infty ,\infty )\). Sometimes \(r_{it}^{*}\) changes between \(t-1\) and t; other times, it remains the same. Let \(\Delta r_{it}^{*}\) be the change in belief of subject i between \(t-1\) and t. That is, \(\Delta r_{it}^{*}=r_{it}^{*}-r_{it-1}^{*}\).
In the following estimation we exploit the near equivalence between (4) and the scaled difference since the last update, \(2(\Phi ^{-1}\left( P_{t}\right) -\Phi ^{-1}\left( P_{m}\right) )/\sqrt{t}\) (from Table 2). In round 1, \(P_{m}\) equals the prior 0.6 and the movement of the guess for a given subject is \(\Delta r_{i1}^{*}=r_{i1}^{*}-\Phi ^{-1}\left( 0.6\right)\). That is, both the objective measure of the information change and the subjective guess of the agent are assumed to anchor onto the prior probability that Urn 1 is chosen, 0.6, in the first period.
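For clarity, the two quantities just described can be computed directly, as in the sketch below (assuming the risk-corrected guess \(g^{*}\), the current Bayesian posterior \(P_{t}\), and the posterior at the last update \(P_{m}\) are already in hand):

```python
# Transformed implied belief r* = Phi^{-1}(g*) and the scaled-difference
# proxy for z_it used in the estimation (a sketch with assumed inputs).
from scipy.stats import norm

def transformed_belief(g_star):
    """Map a probability in (0, 1) onto the real line via the Normal quantile."""
    return norm.ppf(g_star)

def z_proxy(p_t, p_m, t):
    """Scaled difference since last update: 2(Phi^{-1}(P_t) - Phi^{-1}(P_m))/sqrt(t)."""
    return 2.0 * (norm.ppf(p_t) - norm.ppf(p_m)) / t**0.5

# Round 1 anchors both measures on the prior P(Urn 1) = 0.6.
print(transformed_belief(0.7) - transformed_belief(0.6))  # Delta r* for a 0.7 guess
print(z_proxy(0.75, 0.6, t=1))
```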
First hurdle
The probability that a belief is updated (in either direction) in period t is given by:
$$\begin{aligned} P\left( \Delta r_{it}^{*}\ne 0\right) =\Phi \left[ \delta _{i} +x_{i}^{\prime }\Psi _{1}+\gamma \left| z_{it}\right| \right] , \end{aligned}$$
(5)
where \(\Phi \left[ \cdot \right]\) is the standard Normal cdf and \(\delta _{i}\) represents subject i’s idiosyncratic propensity to update beliefs, and therefore models random probabilistic belief adjustment (time-dependent belief adjustment). The probability of an update is assumed to depend (positively) on the absolute value of \(z_{it}\), the test statistic. The vector \(x_{i}\) contains treatment and gender dummy variables together with an age variable and a score on two questions from the comprehension questionnaire administered after the main part of the experiment was explained,Footnote 20 all of which are time invariant and can be expected to affect the propensity to update.
One econometric issue flagged in the last sub-section is the endogeneity of the variable \(\left| z_{it}\right|\): subjects who are averse to updating tend to generate large values of \(\left| z_{it}\right|\) while subjects who update regularly do not allow it to grow beyond small values. This could create a downward bias in the estimate of the parameter \(\gamma\) in the first hurdle. To deal with this concern we use an instrumental variables (IV) estimator which uses the variable \(\widehat{\left| z_{it}\right| }\) in place of \(\left| z_{it}\right|\), where \(\widehat{\left| z_{it}\right| }\) comprises the fitted values from a regression of \(\left| z_{it}\right|\) on a set of suitable instruments.Footnote 21
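The first stage of this IV procedure is simply an ordinary least-squares regression of \(\left| z_{it}\right|\) on the instruments, whose fitted values replace the endogenous regressor. A minimal sketch follows; the instrument matrix here is a hypothetical placeholder (the actual instruments are those described in Footnote 21):

```python
# First-stage regression producing the fitted values that replace |z_it|
# in the first hurdle (a sketch with a placeholder instrument set).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 1000
Z = sm.add_constant(rng.normal(size=(n, 3)))  # constant + 3 placeholder instruments
abs_z = np.abs(Z @ np.array([0.1, 0.5, 0.3, 0.2]) + rng.normal(size=n))

first_stage = sm.OLS(abs_z, Z).fit()
abs_z_hat = first_stage.fittedvalues          # used in place of |z_it| in (5)
print(first_stage.rsquared)
```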
Second hurdle
Conditional on subject i choosing to update beliefs in draw t, the next question relates to how much they do so. This is given by:
$$\begin{aligned} \Delta r_{it}^{*}=\left( \beta _{i}+x_{i}^{\prime }\Psi _{2}\right) \frac{\sqrt{t}z_{it}}{2}+\varepsilon _{it},\qquad \varepsilon _{it}\sim N\left( 0,\sigma ^{2}\right) . \end{aligned}$$
(6)
As a reminder, the Quasi-Bayesian belief adjustment parameter \(\beta _{i}\) represents subject i’s idiosyncratic responsiveness to the accumulation of new information: if \(\beta _{i}=1\), subject i responds fully; if \(\beta _{i}=0\), subject i does not respond at all. Remember that \(\beta _{i}\) is not constrained to [0, 1]. In particular, a value of \(\beta _{i}\) greater than one would indicate the plausible phenomenon of overreaction. Again, treatment variables are included: the elements of the vector \(\Psi _{2}\) tell us how responsiveness differs by treatment.
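To make the mechanics of the two hurdles concrete, the sketch below simulates one period of the process in (5) and (6) for a single subject, omitting the covariate terms \(x_{i}^{\prime }\Psi\) for brevity. The parameter values are purely illustrative.

```python
# One period of the double hurdle data-generating process (a sketch).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

def simulate_update(delta_i, beta_i, gamma, sigma, z_it, t):
    # Hurdle 1: an update occurs with probability Phi(delta_i + gamma * |z_it|).
    if rng.uniform() >= norm.cdf(delta_i + gamma * abs(z_it)):
        return 0.0  # belief left unchanged
    # Hurdle 2: size of the update, a proportion beta_i of the Bayesian movement.
    return beta_i * np.sqrt(t) * z_it / 2.0 + rng.normal(0.0, sigma)

print(simulate_update(delta_i=0.06, beta_i=0.82, gamma=0.6,
                      sigma=0.3, z_it=1.5, t=4))
```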
Considering the complete model, there are two idiosyncratic parameters, \(\delta _{i}\) and \(\beta _{i}\). These are assumed to be distributed over the population of subjects as follows:
$$\begin{aligned} \left( \begin{array}{c} \delta _{i}\\ \beta _{i} \end{array} \right) \sim N\left[ \left( \begin{array}{c} \mu _{1}\\ \mu _{2} \end{array} \right) ,\left( \begin{array}{cc} \eta _{1}^{2} & \rho \eta _{1}\eta _{2}\\ \rho \eta _{1}\eta _{2} & \eta _{2}^{2} \end{array} \right) \right] . \end{aligned}$$
(7)
In total, there are seventeen parameters to estimate: \(\mu _{1}\), \(\eta _{1}\), \(\mu _{2}\), \(\eta _{2}\), \(\rho\), \(\gamma\), \(\sigma\); four treatment effects (two in each hurdle); two gender effects (one in each hurdle); two comprehension-score effects (one in each hurdle); and two age effects (one in each hurdle). Estimation is performed using the method of maximum simulated likelihood (MSL), with a set of Halton draws representing each of the two idiosyncratic parameters appearing in (7). Following estimation of the model, Bayes rule is used to obtain posterior estimates (denoted \(\hat{\delta }_i\) and \(\hat{\beta }_i\)) of the idiosyncratic parameters for each subject.Footnote 22
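The simulation step inside MSL can be sketched as follows: Halton draws are transformed into correlated Normal draws for \((\delta _{i},\beta _{i})\) according to (7), and the simulated likelihood then averages each subject’s likelihood contribution over these draws. The parameter values below are illustrative only.

```python
# Correlated Halton draws for (delta_i, beta_i), as used inside MSL (a sketch).
import numpy as np
from scipy.stats import norm, qmc

R = 200                                   # simulation draws per subject
u = qmc.Halton(d=2, scramble=True, seed=4).random(R)  # uniforms on (0,1)^2
std_normal = norm.ppf(u)                  # independent standard Normals

mu = np.array([0.06, 0.82])               # illustrative (mu_1, mu_2)
eta1, eta2, rho = 0.9, 0.5, -0.3          # illustrative scales and correlation
cov = np.array([[eta1**2, rho * eta1 * eta2],
                [rho * eta1 * eta2, eta2**2]])
draws = mu + std_normal @ np.linalg.cholesky(cov).T  # R draws of (delta_i, beta_i)
```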
The results are presented in Table 3 for four different models. The last column shows the preferred model. Model 1 estimates the QB benchmark, in which it is assumed that the first hurdle is crossed for every observation—that is, updates always occur. Zero updates are treated as zero realizations of the update variable in the second hurdle, and their likelihood contribution is a density instead of a probability. Because of this difference in the way the likelihood function is computed, the log-likelihoods and AICs cannot be used to compare the performance of QB to that of the other models.
Table 3 Results of hurdle model with risk adjustment

Model 2 estimates the IE benchmark, in which the update parameter (\(\beta _{i}\)) is fixed at 1 for all subjects. Consequently, the extra residual variation in updates is reflected in the higher estimate of \(\sigma\). The parameters in the first hurdle are free.
Model 3 combines IE and QB, but constrains the correlation (\(\rho\)) between \(\delta\) and \(\beta\) to be zero. Model 4 is the same model with \(\rho\) unconstrained.
The overall performance of a model is judged initially using the AIC; the preferred model is the one with the lowest AIC. On this criterion the best model is the most general, model 4 (model 1 not being subject to the AIC comparison): IE-QB with \(\rho\) unrestricted, whose results are presented in the final column of Table 3.
To confirm the superiority of the general model over the restricted models, we conduct Wald tests of the restrictions implied by the three less general models. We see that, in all three cases, the implied restrictions are rejected, implying that the general model is superior. Note in particular that this establishes the superiority of the general model 4 (IE-QB with \(\rho\) unrestricted) over the QB model 1 (a comparison that was not possible on the basis of AIC).
Further confirmation is furnished by measuring predictive accuracy on a held-out subsample of the data (the ‘cross-validation’ approach). ROC (Receiver Operating Characteristic) measures the model’s out-of-sample predictive accuracy for hurdle 1 (the frequency of updates), and out-of-sample \(R^{2}\) measures its predictive accuracy for hurdle 2 (the extent of updates).Footnote 23 Model 4 is no worse than model 3 on the ROC criterion, but predicts the extent of adjustment better. Model 1 is best of all at predicting the extent of adjustment, but it fails, by construction, to predict ‘no-change behavior’. Thus, on both in-sample and out-of-sample criteria, model 4 is best overall.Footnote 24
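Both measures are standard and straightforward to compute on a holdout sample, as in the sketch below (the predictions and outcomes here are simulated placeholders):

```python
# Out-of-sample fit for the two hurdles: ROC area for the update indicator,
# R^2 for the size of updates among updaters (a sketch with placeholder data).
import numpy as np
from sklearn.metrics import roc_auc_score, r2_score

rng = np.random.default_rng(5)
p_hat = rng.uniform(size=300)                           # hurdle-1 predicted P(update)
y_update = (rng.uniform(size=300) < p_hat).astype(int)  # observed update indicator
dr_hat = rng.normal(size=200)                           # hurdle-2 predicted Delta r*
dr_obs = dr_hat + rng.normal(0.0, 0.5, size=200)        # observed Delta r*

print("hurdle 1 ROC area:", roc_auc_score(y_update, p_hat))
print("hurdle 2 R^2:", r2_score(dr_obs, dr_hat))
```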
We interpret the results from model 4 as follows. Consider the first hurdle (propensity to update). The intercept parameter (\(\mu _{1}\)) tells us that a typical subject has a predicted probability of \(\Phi \left( 0.061\right) =0.524\) of updating in any task, in the absence of any evidence (i.e. when \(\left| z_{it}\right| =0\)). This estimate is not significantly different from zero, which would imply a 50% probability of updating. The Inattention treatment effect is significant and negative, suggesting that the probability of update is lower when subjects are not paying attention. The Complexity treatment effect is also negative and significant, though smaller in magnitude. The effect of the questionnaire score is negative and significant, though not large. The negative coefficient is consistent with Cohen et al.’s (2019) model if subjects have cognitive costs. This gives us our first result:
Result 1
There is evidence of time-dependent (random) belief adjustment. Subjects update their beliefs idiosyncratically around half the time.
The large estimate of \(\eta _{1}\) tells us that there is considerable heterogeneity in the propensity to update (see Fig. 4), something we will explore further in Sect. 4.3. The parameter \(\gamma\) is estimated to be significantly positive, and this tells us, as expected, that the more cumulative evidence there is, in either direction, the greater the probability of an update:
Result 2
There is evidence of state-dependent belief adjustment. Subjects are more likely to adjust if there is more evidence to suggest that an update is appropriate (thus making it costlier not to update).
In the second hurdle, the intercept (\(\mu _{2}\)) is estimated to be 0.819 in our preferred model 4: when a typical (baseline) subject does update, she updates by a proportion 0.819 of the difference from the Bayes probability. The large estimate of \(\eta _{2}\) tells us that there is considerable heterogeneity in this proportion also (see Fig. 4). Interestingly, on the basis of the posterior estimates from model 4, only 13 out of 245 subjects appear to have \(\beta <0\), indicating noise or confusion: these subjects adjusted in the wrong direction. Moreover, 87 out of 245 subjects (around one third) display overreaction to the evidence. We summarize this in the following result:
Result 3
There is evidence of Quasi-Bayesian partial belief adjustment. On average, subjects who adjust do so by around 80%. There is evidence of prior information under-weighting: around one third of the subjects overreact to evidence once they decide to adjust.
The estimate of \(\rho\) is negative, indicating that subjects who have a higher propensity to update tend to update by a lower proportion of the difference from the Bayes probability. Inattention is important for both hurdles:
Result 4
Inattention lowers the probability of updating from 50 to 40% and lowers the extent of update from 80 to 56% of the amount prescribed by Bayes rule. Complexity also lowers the probability of update in the first hurdle.
The empirical distribution of \(\alpha\) and \(\beta\)
To get a better sense of the population heterogeneity in belief adjustment, this subsection maps out the empirical distributions of the IE parameter \(\alpha _{i}\) and the QB parameter \(\beta _{i}\) across subjects, plotted against each other. The estimated distribution \(f\left( \beta \right)\) can be seen from the distribution of the posterior estimates \(\hat{\beta }_i\) from Model 4; this is the marginal distribution of the extent of update on the vertical axis of the bottom-right panel of Fig. 4. We next use the first-hurdle information to generate \(f_{i}\left( \alpha \right)\), subject i’s distribution of \(\alpha\).
As we flagged earlier, each agent has a full distribution of \(\alpha\), so we need a representative \(\alpha _{i}\) to summarize the extent of sticky belief adjustment for agent i, which can then be related to their \(\beta _{i}\). As will be clear below, the choice that permits analytic solutions is the median \(\alpha _{i}\) from \(f_{i}\left( \alpha \right)\).
The econometric equation for the first hurdle is equivalent to the probability of rejecting the null under IE. We omit the dummy variables and begin by re-writing the first hurdle, namely (5):
$$\begin{aligned} \Pr \left( {\text {reject }}H_{0}\right) _{it}=\Phi \left( \delta _{i} +\gamma \left| z_{it}\right| \right) , \end{aligned}$$
(8)
where \(\delta _{i}\) and \(\gamma\) are estimated parameters and \(\left| z_{it}\right|\) is the test statistic based on the proportion of white balls:
$$\begin{aligned} z_{it}=\frac{P_{it}^{w}-P_{im}^{w}}{\sqrt{0.5^{2}/t}}. \end{aligned}$$
(9)
For any \(\left| z_{it}\right|\) it is possible to work out an implied p value, which we do by assuming that (9) is approximately distributed N(0, 1). This in turn allows us to work out \(f_{i}\left( \alpha \right)\) from the econometric equation for the first hurdle. When \(\left| z_{it}\right| =0\), the p value of the hypothesis test is unity, so Eq. (8) says that a positive fraction of agents reject \(H_{0}\) even when the p value is unity. Since the criterion for rejecting \(H_{0}\) in a hypothesis test is always \(\alpha \ge p\)-value, rejecting \(H_0\) when \(\left| z_{it}\right| =0\) implies a non-zero probability mass in \(f_{i}\left( \alpha \right)\) at \(\alpha =1\). The pdf of \(\alpha _{i}\) thus has a discrete ‘spike’ at unity and is continuous elsewhere. The size of that spike follows from Eq. (8) with \(\left| z_{it}\right| =0\) substituted in, namely \(\Phi \left( \delta _{i}\right)\).
The probability of rejecting \(H_{0}\) equals the probability that the test size exceeds the p value, which in turn is equal to the econometric equation for the first hurdle:
$$\begin{aligned} \Pr \left( {\text {reject }}H_{0}\right) _{it}=\int _{p{\text {-value}}_{it}}^{1}f_{i}\left( \alpha \right) d\alpha =1-F_{i}\left( p{\text {-value}}_{it}\right) =\Phi \left( \delta _{i}+\gamma \left| z_{it}\right| \right) \end{aligned}$$
(10)
Upper-case F in the last equality is the anti-derivative of the density, that is, the cdf. We define \(F_{i}\left( 1\right)\) to be unity since 1 is the upper end of the support of \(\alpha\), but we also note that there is a discontinuity such that F jumps from \(1-\Phi \left( \delta _{i}\right)\) to 1 at \(\alpha =1\), as a consequence of the non-zero probability mass in \(f_{i}\left( \alpha \right)\) at unity. To solve the equation we use the expression for the p value of \(\left| z_{it}\right|\) in a two-sided Normal test:
$$\begin{aligned} p{\text {-value}}_{it}=2\left( 1-\Phi \left( \left| z_{it}\right| \right) \right) . \end{aligned}$$
(11)
We use a ‘single parameter’ approximation to the cumulative Normal (see Bowling et al. 2009). For our purposes \(\sqrt{3}\) is sufficient for the single parameter:
$$\begin{aligned} \Phi \left( \left| z_{it}\right| \right) \approx \frac{1}{1+\exp \left( -\sqrt{3}\left| z_{it}\right| \right) }. \end{aligned}$$
(12)
We can now write down \(\left| z_{it}\right|\) as a function of the p value using (11) and (12):
$$\begin{aligned} \left| z_{it}\right| \approx \frac{1}{\sqrt{3}}\ln \left( \frac{2-p{\text {-value}}_{it}}{p{\text {-value}}_{it}}\right) . \end{aligned}$$
(13)
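Both the accuracy of approximation (12) and the exactness of the round trip through (11) and (13) are easy to verify numerically:

```python
# Numerical check of the logistic approximation (12) and the inversion (13).
import numpy as np
from scipy.stats import norm

z = np.linspace(0.0, 3.0, 7)
phi_approx = 1.0 / (1.0 + np.exp(-np.sqrt(3) * z))  # equation (12)
print(np.max(np.abs(phi_approx - norm.cdf(z))))     # error of roughly 0.01

p = 2.0 * (1.0 - phi_approx)                        # p value, equation (11)
z_back = np.log((2.0 - p) / p) / np.sqrt(3)         # inversion, equation (13)
print(np.max(np.abs(z_back - z)))                   # ~0: the round trip is exact
```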
Intuitively, a p value of zero implies an infinite \(\left| z_{it}\right|\) and a p value of unity implies \(|z_{it}|\) is zero, and (13) confirms this. We can now use the relationship between \(F_{i}\left( p{\text {-value}}_{it}\right)\) and our estimated first hurdle to generate \(F_{i}\left( \alpha \right)\).
$$\begin{aligned} 1-F_{i}\left( p{\text {-value}}_{it}\right)&= \Phi \left( \delta _{i} +\gamma \left| z_{it}\right| \right) \nonumber \\ \therefore \, F_{i}\left( p{\text {-value}}_{it}\right)&= 1-\Phi \left( \delta _{i}+\gamma \left| z_{it}\right| \right) \nonumber \\&\approx 1-\Phi \left( \delta _{i}+\gamma \left\{ \frac{1}{\sqrt{3}}\ln \left( \frac{2-p{\text {-value}}_{it}}{p{\text {-value}}_{it}}\right) \right\} \right) . \end{aligned}$$
(14)
In the above expression the variable ‘\(p{\text {-value}}_{it}\)’ is just a placeholder and can be replaced by anything with the same support, leaving the meaning of (14) unchanged. Thus, it can be replaced by \(\alpha\), giving the cumulative distribution function of \(\alpha\).
$$\begin{aligned} F_{i}\left( \alpha \right)&\approx 1-\Phi \left( \delta _{i}+\gamma \left\{ \frac{1}{\sqrt{3}}\ln \left( \frac{2-\alpha }{\alpha }\right) \right\} \right) \nonumber \\&\approx 1-\frac{1}{1+\left[ \frac{\alpha }{2-\alpha }\right] ^{\gamma }\exp \left( -\sqrt{3}\delta _{i}\right) }. \end{aligned}$$
(15)
Substitution of \(\alpha =1\) does not give unity, which is what we earlier assumed for the value of \(F_{i}\left( 1\right)\). However, it does give \(1-\Phi \left( \delta _{i}\right)\), which of course concurs with the econometric equation for the first hurdle when \(\left| z_{it}\right| =0\). This discontinuity in \(F_{i}\) is consistent with a discrete probability mass in \(f_{i}\left( \alpha \right)\) at unity, as we noted earlier. It now just remains to differentiate \(F_{i}\) to obtain the continuous density \(f_{i}\left( \alpha \right)\) for \(\alpha\) strictly less than unity. The description of the function at the upper end of the support (unity) is completed with a discrete mass at unity of \(\Phi \left( \delta _{i}\right)\).
$$\begin{aligned} \left. \begin{array}{ll} f_{i}\left( \alpha \right) \approx \dfrac{2\gamma \alpha ^{\gamma -1}\exp \left( -\sqrt{3}\delta _{i}\right) }{\left( 2-\alpha \right) ^{1+\gamma }\left\{ 1+\left[ \frac{\alpha }{2-\alpha }\right] ^{\gamma }\exp \left( -\sqrt{3}\delta _{i}\right) \right\} ^{2}}, & \quad \alpha <1\\ \Pr {}_{i}\left( \alpha =1\right) \approx \dfrac{1}{1+\exp \left( -\sqrt{3}\delta _{i}\right) }, & \quad \alpha =1 \end{array} \right\} \end{aligned}$$
(16)
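As a check on (16), the continuous part of the density plus the point mass at unity should integrate to one, which the following sketch confirms for the values \(\delta _{i}=0.1\) and \(\gamma =0.6\) used in Fig. 5:

```python
# Verify that the density in (16) integrates (with the spike at 1) to unity.
import numpy as np
from scipy.integrate import quad

delta_i, gamma = 0.1, 0.6

def f_alpha(a):
    g = (a / (2.0 - a))**gamma * np.exp(-np.sqrt(3) * delta_i)
    return (2.0 * gamma * a**(gamma - 1.0) * np.exp(-np.sqrt(3) * delta_i)
            / ((2.0 - a)**(1.0 + gamma) * (1.0 + g)**2))

continuous_mass, _ = quad(f_alpha, 0.0, 1.0)
spike = 1.0 / (1.0 + np.exp(-np.sqrt(3) * delta_i))  # Pr(alpha = 1)
print(continuous_mass + spike)                       # approximately 1
```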
Figure 5 illustrates the distribution \(f_{i}\left( \alpha \right)\) for \(\delta _{i}=0.1\) and \(\gamma =0.6\), together with the distributions one standard deviation either side of \(\delta _{i}\). The central value is the mean of \(\delta\) across subjects from our estimation (from the last column of Table 3, rounded). At the right-hand edge of the chart is the probability mass at \(\alpha =1\). As discussed earlier, this corresponds to the proportion of agents who update on vanishingly small evidence (\(\left| z_{it}\right| =0\)). There is clearly a great deal of interesting heterogeneity. One distribution has a near-zero probability of a random update (10%), and when such agents do use information they are very conservative, with \(\alpha\) close to zero. We might call them ‘classical statisticians’, given the large probability mass around 1%, 5% and 10%. Another distribution has a virtually certain probability of a random update (90%); we might call these agents ‘fully attentive’. The central estimate of \(\delta\) describes an agent who updates roughly half the time, and otherwise has a more or less uniform distribution over \(\alpha\).
Since there are idiosyncratic values of \(\delta _{i}\), there is a separate distribution for every subject, varying with \(\delta _{i}\). We must therefore use a summary statistic for \(f_{i}\left( \alpha \right)\), and the natural choice is the median \(\alpha\), obtained by solving \(F_{i}\left( \alpha \right) =0.5\) in Eq. (15). In Fig. 6, we plot the collection of subject i’s (median \(\alpha\), \(\beta\)) pairs for model 4, our preferred specification. Table 4 lists the percentage of subjects in each (median \(\alpha _{i}\), \(\beta _{i}\)) 0.2 bracket.Footnote 25
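Setting \(F_{i}\left( \alpha \right) =0.5\) in (15) gives the median in closed form: \(\alpha _{med}=2k/\left( 1+k\right)\) with \(k=\exp \left( \sqrt{3}\delta _{i}/\gamma \right)\), capped at unity because of the point mass there (when \(\delta _{i}>0\), more than half the mass sits at \(\alpha =1\)). A sketch:

```python
# Median alpha per subject, in closed form from F_i(alpha) = 0.5 in (15).
import numpy as np

def median_alpha(delta_i, gamma=0.6):
    k = np.exp(np.sqrt(3) * delta_i / gamma)
    return np.minimum(2.0 * k / (1.0 + k), 1.0)  # capped: spike at alpha = 1

print(median_alpha(np.array([-0.8, -0.1, 0.1, 0.9])))  # illustrative delta values
```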
Table 4 Percentage of subjects in each (\(\alpha _i, \beta _i\)) bracket in model 4

Roughly half the subjects update regardless of evidence, so the median \(\alpha\)’s cluster at unity along the bottom axis, with half of them (49%) in the range at or above 0.8. Just under one quarter (22%) of agents could be described as classical statisticians, with median \(\alpha\)’s around the 1–10% level, and a similar figure (28%) have ‘conservative belief adjustment’, with \(\alpha\)-values no more than 0.20.
Regarding the size of updating, we already know from Result 3 that it is less than complete on average. In Table 4, 22% of subjects update by no more than 40% of what they should.
Result 5
Estimated test sizes spread over the whole support [0, 1] but are clustered at zero and unity. The extent-of-update distribution has a large probability mass around 50% but an even larger mass for values over unity.
In supplementary analysis (see online appendix 5), we find that infrequent updaters (\(\alpha _{med}\): 0–0.2) have larger mean square deviations (MSD) from the Bayesian guesses than other subjects. Frequent updaters (\(\alpha _{med}\): 0.8–1) tend to have larger MSD as each stage progresses, which can be explained by comparative under- or overadjustment of beliefs in the second hurdle of our model (see Table 4).Footnote 26