Comparing Probabilistic Accounts of Probability Judgments

  • Original Paper
  • Computational Brain & Behavior

Abstract

Bayesian theories of cognitive science hold that cognition is fundamentally probabilistic, but people’s explicit probability judgments often violate the laws of probability. Two recent proposals, the “Probability Theory plus Noise” (PT+N; Costello and Watts, Psychological Review, 121, 463–480, 2014) and “Bayesian Sampler” (Zhu et al., Psychological Review, 127, 719–748, 2020) theories of probability judgments, both seek to account for these biases while maintaining that mental credences are fundamentally probabilistic. These models differ in their averaged predictions about people’s conditional probability judgments and in their distributional predictions about people’s overall patterns of judgments. In particular, the Bayesian Sampler’s Bayesian adjustment process predicts a truncated range of responses as well as a correlation between the average degree of bias and trial-to-trial variability. However, exploring these distributional predictions with participants’ raw responses requires a careful treatment of rounding errors and exogenous response processes. Here, I cast these theories into a Bayesian data analysis framework that supports the treatment of these issues along with principled model comparison using information criteria. Comparing the fits of both models on data collected by Zhu et al. (Psychological Review, 127(5), 719–748, 2020), I find these data are best explained by an account of biases based on “noise” in the sample-reading process but in which conditional probability judgments are produced by a process of conditioning in the mental model of the events, rather than by a two-stage mental sampling process as proposed by the PT+N model.


Availability of Data and Materials

This paper presents secondary analyses of data. The datasets generated and/or analysed during the current study are available at https://osf.io/mgcxj/files/.

Code Availability

All analysis code is available at https://github.com/derekpowell/bayesian-sampler and at https://osf.io/bpkjf/.

Notes

  1. It is worth noting that other non-sampling-based approaches have been proposed to account for distortions in people’s use of explicit probabilities in decision-making (e.g., Zhang & Maloney, 2012; Zhang et al., 2020). Further theorizing might extend these accounts to also describe the generation of probability estimates, so that a probabilistic account of beliefs might not rest entirely on the assumption of sampling from mental models.

  2. Rather than estimating model fit and then penalizing for model complexity, PSIS-LOO estimates out-of-sample predictive performance directly by estimating the expected log predictive density \(\widehat{\text{elpd}}\) of the model, or the expected probability of new, unseen data (Gelman et al., 2014; Vehtari et al., 2017). From these calculations, an estimate of model complexity \(\hat{p}_{\text{LOO}}\) can also be derived; an illustrative sketch of this computation appears after these notes. However, it is worth recognizing that formal measures of model complexity will not always track notions of simplicity or elegance in scientific explanation (for some related discussions, see Kuhn, 1977; Piantadosi, 2018; Sober, 2002).

  3. An uninformative prior was sought in order to reduce bias in the posterior parameter estimates. It should be acknowledged that a uniform prior does not exactly correspond to what the authors of the PT+N theory would predict, as they have frequently assumed d to be a fairly small value (e.g., Costello and Watts, 2017).

  4. Strictly speaking, under the original form of the Bayesian Sampler model, N and \(N^{\prime }\) are discrete parameters representing the number of distinct independent samples drawn. Given a particular implied d, this could create constraints on the possible values of \(d^{\prime }\), assuming β is held constant. However, Zhu et al. (2020) also consider the possibility that people draw non-independent mental samples, in which case N and \(N^{\prime }\) would represent the effective number of samples, accounting for their autocorrelation. In this case, we could treat this effective number of samples as a continuous quantity, and therefore imagine there are no clear constraints on d and \(d^{\prime }\) except the stipulation that \(d \leq d^{\prime }\). These ideas will be developed further in the trial-level analyses.
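
As a brief illustration of the PSIS-LOO model comparison referenced in Note 2, the sketch below uses the ArviZ library, with two of its bundled example models standing in for the PT+N and Bayesian Sampler fits. This is not the paper’s analysis code (which is available at the repositories listed above), only a minimal sketch of how \(\widehat{\text{elpd}}\), \(\hat{p}_{\text{LOO}}\), and model rankings can be obtained.

```python
import arviz as az

# Two ArviZ example models with pointwise log-likelihoods, standing in for
# the PT+N and Bayesian Sampler fits (which are not reproduced here).
idata_a = az.load_arviz_data("centered_eight")
idata_b = az.load_arviz_data("non_centered_eight")

# PSIS-LOO estimates elpd_loo directly; p_loo and Pareto-k diagnostics come with it.
print(az.loo(idata_a, pointwise=True))

# Rank the models by estimated out-of-sample predictive accuracy.
print(az.compare({"model_a": idata_a, "model_b": idata_b}, ic="loo"))
```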


Author information


Contributions

Derek Powell is the sole author of this manuscript.

Corresponding author

Correspondence to Derek Powell.

Ethics declarations

Ethics Approval

Not applicable.

Consent to Participate

Not applicable.

Consent for Publication

Not applicable.

Competing Interests

The author declares no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

For the trial-level response models, participants’ rounded responses are modeled as discrete responses with a categorical (multinomial) distribution. For \(i \in \{0, 1, \ldots, m\}\) with \(m = 20\), giving 21 possible responses, define a set of cut points \(a_{i,5} = \frac{i}{m}-\frac{1}{2m}\) and \(b_{i,5} = \frac{i}{m}+\frac{1}{2m}\). Using \(x|_{[0,1]}\) to denote that x is restricted to the domain [0, 1], the probability of each response given μ and N is:

$$ \begin{aligned} p_{i,5} &= P([a_{i,5}, b_{i,5})) \\ &= B(b_{i,5}|_{[0,1]},\ \mu N,\ (1-\mu)N) - B(a_{i,5}|_{[0,1]},\ \mu N,\ (1-\mu)N) \end{aligned} $$

where B is the appropriate cumulative distribution function. To capture rounding to the nearest 10, we define \(a_{i,10} = \frac{2i}{m}-\frac{1}{m}\) and \(b_{i,10} = \frac{2i}{m}+\frac{1}{m}\), so that the probability of each response is:

$$ p_{i,10} = \left\{\begin{array}{ll} P([a_{i,10},b_{i,10})) & \text{i is even} \\ 0 & \text{i is odd} \end{array}\right. $$
(14)

Next, define a vector of mixture probabilities \(\overrightarrow {\phi }\), with the zeroth index indicating a “contaminant” process. Combining these response processes, we can define the marginal probability of each response as:

$$ p_{i} = \frac{1}{21} \phi_{0} + p_{i,5} \phi_{1} + p_{i,10} \phi_{2} $$

The responses themselves are then distributed according to a categorical distribution:

$$ y_{i} \sim \text{Categorical}(\overrightarrow{p}) $$
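
To make the discretized, rounded response model concrete, here is a minimal sketch in Python (NumPy/SciPy). It is not the released analysis code, and the helper names `rounded_beta_probs` and `response_probs` are hypothetical; the rounding-to-10 process is implemented as width-0.10 windows centered on the even grid values, which is one reading of the even/odd case in Eq. 14.

```python
import numpy as np
from scipy.stats import beta

M = 20             # response grid: i/M for i = 0, ..., M (i.e., 0%, 5%, ..., 100%)
N_RESP = M + 1     # 21 possible responses


def rounded_beta_probs(mu, n, width):
    """P(a Beta(mu*n, (1-mu)*n) draw rounds to each grid value).

    `width` is the rounding window: 0.05 for rounding to the nearest 5,
    0.10 for rounding to the nearest 10. Cut points are clipped to [0, 1],
    mirroring the x|_[0,1] restriction in the text.
    """
    grid = np.arange(N_RESP) / M
    lo = np.clip(grid - width / 2, 0.0, 1.0)
    hi = np.clip(grid + width / 2, 0.0, 1.0)
    return beta.cdf(hi, mu * n, (1 - mu) * n) - beta.cdf(lo, mu * n, (1 - mu) * n)


def response_probs(mu, n, phi):
    """Marginal probability of each of the 21 responses.

    phi = (phi_0, phi_1, phi_2): contaminant, round-to-5, and round-to-10 weights.
    """
    p5 = rounded_beta_probs(mu, n, 0.05)            # every grid value is possible
    p10 = rounded_beta_probs(mu, n, 0.10)
    p10[np.arange(N_RESP) % 2 == 1] = 0.0           # odd i (5%, 15%, ...) impossible
    contaminant = np.full(N_RESP, 1.0 / N_RESP)     # uniform over all 21 responses
    return phi[0] * contaminant + phi[1] * p5 + phi[2] * p10


# Example: expected judgment 0.3, N = 10 samples, 10% contaminant responses
p = response_probs(mu=0.3, n=10.0, phi=(0.1, 0.6, 0.3))
print(np.round(p, 3), p.sum())   # probabilities over 0%, 5%, ..., 100%; sums to 1
```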

For the noise-based model, B is the regularized incomplete Beta function, the CDF of the Beta distribution. The computations for the Bayesian Sampler model response probabilities are identical, save that instead of \(B(x, \alpha, \beta)\) we have \(B(f_{BS}^{-1}(x), \alpha , \beta )\) when computing the probability of each response \(p_i\), and we use N and \(N^{\prime }\) where appropriate. To see this, let X be the Beta-distributed success proportion arising from the mental sampling operations over ρ(A), and let Y be the distribution of resulting probability judgments from the Bayesian Sampler model. Then Y = g(X), where g is the function defined in Eq. 13 of the main text, reproduced here:

$$ \hat{P}_{BS}(A) = \frac{\rho(A)N}{N+2\beta} + \frac{\beta}{N+2\beta} $$
(15)

Letting \(F_X\) and \(F_Y\) be the CDFs of X and Y, respectively, we have:

$$ F_{Y}(y) = P(Y\leq y) = P(g(X)\leq y) = P(X\leq g^{-1}(y)) = F_{X}(g^{-1}(y)) $$

Putting this all together, define \(Z_{\text{NB}}\) as the function which calculates the probability of each categorical response under the noise-based model given the inputs \(\mu _{ijk}\), \(d_{j}\), \(d^{\prime }_{j}\), and ϕ. Here, \(\mu _{ijk} = f_{\text {NB}}(\overrightarrow {\theta _{jk}}, d_{j}, d^{\prime }_{j}, x_{ijk})\) computes the expected probability according to the PT+N theory, except that it treats conditional probability judgments like simple probability judgments.

Finally, define \(Z_{\text{BS}}\) as the function which calculates the probability of each categorical response under the Bayesian Sampler model. Note that, here, \(\mu _{ijk} = f_{0}(\overrightarrow {\theta _{jk}}, x_{ijk})\), where the value of μ depends only on the underlying probabilities and the query asked on a specific trial.
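
For the Bayesian Sampler variant, the only change to the discretization above is that the Beta CDF is evaluated at \(g^{-1}(y)\). Below is a minimal sketch of that transformed CDF, again in Python/SciPy and again not the released code; the helper name `bs_cdf` is hypothetical, and the sampled proportion is assumed (as in the noise-based model above) to be approximated as Beta(μN, (1−μ)N).

```python
import numpy as np
from scipy.stats import beta


def bs_cdf(y, mu, n, prior_beta):
    """F_Y(y) = F_X(g^{-1}(y)) for the Bayesian Sampler judgment distribution.

    X ~ Beta(mu*n, (1-mu)*n) is the sampled success proportion, and
    g(x) = x*n/(n + 2*prior_beta) + prior_beta/(n + 2*prior_beta)
    is the linear adjustment in Eq. 15; g^{-1} is its inverse.
    """
    y = np.asarray(y, dtype=float)
    x = (y * (n + 2 * prior_beta) - prior_beta) / n    # g^{-1}(y)
    x = np.clip(x, 0.0, 1.0)    # judgments outside g([0, 1]) get CDF 0 or 1
    return beta.cdf(x, mu * n, (1 - mu) * n)


# Example: underlying probability 0.3, N = 10 samples, prior parameter beta = 1
print(bs_cdf([0.1, 0.5, 0.9], mu=0.3, n=10.0, prior_beta=1.0))
```

The rounded response probabilities \(p_{i,5}\) and \(p_{i,10}\) for the Bayesian Sampler are then obtained exactly as above, with this transformed CDF substituted for the plain Beta CDF and with N or \(N^{\prime }\) used as appropriate for the query type.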

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Powell, D. Comparing Probabilistic Accounts of Probability Judgments. Comput Brain Behav 6, 228–245 (2023). https://doi.org/10.1007/s42113-022-00164-z

