Abstract
Decision-making in dynamic environments typically requires adaptive evidence accumulation that weights new evidence more heavily than old observations. Recent experimental studies of dynamic decision tasks require subjects to make decisions for which the correct choice switches stochastically throughout a single trial. In such cases, an ideal observer’s belief is described by an evolution equation that is doubly stochastic, reflecting stochasticity in both the observations and the environmental changes. In these contexts, we show that the probability density of the belief can be represented using differential Chapman-Kolmogorov equations, allowing efficient computation of ensemble statistics. This allows us to reliably compare normative models to near-normative approximations, using decision response accuracy and the Kullback-Leibler divergence of the belief distributions as model performance metrics. Such belief distributions could be obtained empirically from subjects by asking them to report their decision confidence. We also study how response accuracy is affected by additional internal noise, showing that optimality requires longer integration timescales as more noise is added. Lastly, we demonstrate that our method can be applied to tasks in which evidence arrives in a discrete, pulsatile fashion, rather than continuously.
Notes
The notation o(Δt) means all other terms are of smaller order than Δt. More precisely, o(Δt) represents a function f(Δt) with the property \(\lim \limits _{\Delta t\downarrow 0}\frac {f({\Delta } t)}{\Delta t}=0\).
References
Bankó, É.M., Gál, V., Körtvélyes, J., Kovács, G., Vidnyánszky, Z. (2011). Dissociating the effect of noise on sensory processing and overall decision difficulty. Journal of Neuroscience, 31(7), 2663–2674.
Behrens, T.E., Woolrich, M.W., Walton, M.E., Rushworth, M.F. (2007). Learning the value of information in an uncertain world. Nature Neuroscience, 10(9), 1214.
Billingsley, P. (2008). Probability and measure. Wiley.
Bogacz, R., Brown, E., Moehlis, J., Holmes, P., Cohen, J.D. (2006). The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. Psychological Review, 113 (4), 700.
Brea, J., Urbanczik, R., Senn, W. (2014). A normative theory of forgetting: lessons from the fruit fly. PLoS Computational Biology, 10(6), e1003640.
Brody, C.D., Romo, R., Kepecs, A. (2003). Basic mechanisms for graded persistent activity: discrete attractors, continuous attractors, and dynamic representations. Current Opinion in Neurobiology, 13(2), 204–211.
Brunton, B.W., Botvinick, M.M., Brody, C.D. (2013). Rats and humans can optimally accumulate evidence for decision-making. Science, 340(6128), 95–98.
Busemeyer, J.R., & Townsend, J.T. (1992). Fundamental derivations from decision field theory. Mathematical Social Sciences, 23(3), 255–282.
Droste, F., & Lindner, B. (2014). Integrate-and-fire neurons driven by asymmetric dichotomous noise. Biological Cybernetics, 108(6), 825–843.
Droste, F., & Lindner, B. (2017). Exact results for power spectrum and susceptibility of a leaky integrate-and-fire neuron with two-state noise. Physical Review E, 95(1), 012411.
Drugowitsch, J. (2016). Fast and accurate monte carlo sampling of first-passage times from wiener diffusion models. Scientific Reports, 6, 20490.
Drugowitsch, J., Moreno-Bote, R., Churchland, A.K., Shadlen, M.N., Pouget, A. (2012). The cost of accumulating evidence in perceptual decision making. Journal of Neuroscience, 32(11), 3612–3628.
Eckhoff, P., Holmes, P., Law, C., Connolly, P., Gold, J. (2008). On diffusion processes with variable drift rates as models for decision making during learning. New Journal of Physics, 10(1), 015006.
Eissa, T.L., Barendregt, N.W., Gold, J.I., Josić, K, Kilpatrick, Z.P. (2019). Hierarchical inference interactions in dynamic environments. In Computational and Systems Neuroscience. Lisbon.
Erban, R., & Chapman, S.J. (2007). Reactive boundary conditions for stochastic simulations of reaction–diffusion processes. Physical Biology, 4(1), 16.
Faisal, A.A., Selen, L.P., Wolpert, D.M. (2008). Noise in the nervous system. Nature Reviews Neuroscience, 9(4), 292.
Friedman, J., Hastie, T., Tibshirani, R. (2001). The elements of statistical learning (Chap. 7: Model assessment and selection). New York: Springer Series in Statistics.
Gardiner, C. (2004). Handbook of stochastic methods: for physics, chemistry and the natural sciences (Springer Series in Synergetics, Vol. 13). Berlin: Springer.
Geisler, W.S. (2003). Ideal observer analysis. The Visual Neurosciences, 10(7), 12–12.
Glaze, C.M., Kable, J.W., Gold, J.I. (2015). Normative evidence accumulation in unpredictable environments. Elife, 4, e08825.
Glaze, C.M., Filipowicz, A.L., Kable, J.W., Balasubramanian, V., Gold, J.I. (2018). A bias–variance trade-off governs individual differences in on-line learning in an unpredictable environment. Nature Human Behaviour, 2(3), 213.
Gold, J.I., & Shadlen, M.N. (2007). The neural basis of decision making. Annual Review of Neuroscience, 30.
Hanson, F.B. (2007). Applied stochastic processes and control for Jump-diffusions: modeling, analysis, and computation, vol 13. SIAM.
Heath, R.A. (1992). A general nonstationary diffusion model for two-choice decision-making. Mathematical Social Sciences, 23(3), 283–309.
Horsthemke, W., & Lefever, R. (2006). Noise-induced transitions: theory and applications in physics, chemistry and biology. Springer Series in Synergetics. Berlin: Springer.
Kiani, R., & Shadlen, M.N. (2009). Representation of confidence associated with a decision by neurons in the parietal cortex. Science, 324(5928), 759–764.
Moehlis, J., Brown, E., Bogacz, R., Holmes, P., Cohen, J.D. (2004). Optimizing reward rate in two alternative choice tasks: mathematical formalism. Technical Report 04-01, Center for the Study of Brain, Mind and Behavior, Princeton University.
Ossmy, O., Moran, R., Pfeffer, T., Tsetsos, K., Usher, M., Donner, T.H. (2013). The timescale of perceptual evidence integration can be adapted to the environment. Current Biology, 23(11), 981–986.
Piet, A.T., El Hady, A., Brody, C.D. (2018). Rats adopt the optimal timescale for evidence integration in a dynamic environment. Nature Communications, 9(1), 4265.
Piet, A., El Hady, A., Boyd-Meredith, T., Brody, C. (2019). Neural dynamics during changes of mind. In Computational and Systems Neuroscience. Lisbon.
Radillo, A.E., Veliz-Cuba, A., Josić, K, Kilpatrick, Z.P. (2017). Evidence accumulation and change rate inference in dynamic environments. Neural Computation, 29(6), 1561–1610.
Radillo, A.E., Veliz-Cuba, A., Josić, K. (2019). Performance of normative and approximate evidence accumulation on the dynamic clicks task. Neurons, Behavior, Data analysis, and Theory. submitted.
Rahnev, D., & Denison, R.N. (2018). Suboptimality in perceptual decision making. Behavioral and Brain Sciences, 41, e223. https://doi.org/10.1017/S0140525X18000936.
Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85(2), 59.
Ratcliff, R., & McKoon, G. (2008). The diffusion decision model: theory and data for two-choice decision tasks. Neural Computation, 20(4), 873–922.
Salinas, E., & Sejnowski, T.J. (2002). Integrate-and-fire neurons driven by correlated stochastic input. Neural Computation, 14(9), 2111–2155.
Skellam, J.G. (1946). The frequency distribution of the difference between two poisson variates belonging to different populations. Journal of the Royal Statistical Society Series A (General), 109(Pt 3), 296–296.
Smith, P.L. (2010). From poisson shot noise to the integrated ornstein–uhlenbeck process: neurally principled models of information accumulation in decision-making and response time. Journal of Mathematical Psychology, 54 (2), 266–283.
Smith, P.L., & Ratcliff, R. (2004). Psychology and neurobiology of simple decisions. Trends in Neurosciences, 27(3), 161–168.
Urai, A.E., Braun, A., Donner, T.H. (2017). Pupil-linked arousal is driven by decision uncertainty and alters serial choice bias. Nature Communications, 8, 14637.
Van Den Berg, R., Anandalingam, K., Zylberberg, A., Kiani, R., Shadlen, M.N., Wolpert, D.M. (2016). A common mechanism underlies changes of mind about decisions and confidence. Elife, 5, e12192.
Veliz-Cuba, A., Kilpatrick, Z.P., Josic, K. (2016). Stochastic models of evidence accumulation in changing environments. SIAM Review, 58(2), 264–289.
Wilson, R.C., Nassar, M.R., Gold, J.I. (2010). Bayesian online learning of the hazard rate in change-point problems. Neural Computation, 22(9), 2452–2476.
Yu, A.J., & Cohen, J.D. (2008). Sequential effects: superstition or rational behavior? Advances in Neural Information Processing Systems, 21, 1873–1880.
Zhang, S., Lee, M.D., Vandekerckhove, J., Maris, G., Wagenmakers, E.J. (2014). Time-varying boundaries for diffusion models of decision making and response time. Frontiers in Psychology, 5, 1364.
Acknowledgements
This work was supported by an NSF/NIH CRCNS grant (R01MH115557) and NSF (DMS-1517629). ZPK was also supported by NSF (DMS-1615737 & DMS-1853630). KJ was also supported by NSF (DBI-1707400). We thank Sam Isaacson and Jay Newby for feedback on the boundary value problem for the bounded accumulator model. We are also grateful to Adrian Radillo and Tahra Eissa for comments on a draft version of the manuscript.
Ethics declarations
Conflict of interests
The authors declare that they have no conflict of interest.
Additional information
Action Editor: P. Dayan
Code availability
See https://github.com/nwbarendregt/DynamicDecisionCKEquations for the MATLAB finite difference code used to perform the analysis and generate figures.
K. Josić and Z.P. Kilpatrick share equal authorship.
Appendices
Appendix A: Normative evidence-accumulation in dynamic environments
Here we derive the continuum limit of the Bayesian update equation for continuous evidence accumulation in a changing environment. Starting with the discrete time model, we define Ln,± = P(s(tn) = s±|ξ1:n) as the probability of being in state s± at time tn given a sequence of observations ξ1:n. The state s(t) can switch between consecutive, evenly spaced time points t1:n (with Δt := tn − tn− 1) at hazard rate h, so that the switching probability per timestep is hΔt := h ⋅Δt := P(s(tn) = s∓|s(tn− 1) = s±). The likelihood function fΔt,±(ξ) = P(ξ|s = s±; Δt) is the conditional probability of observing sample ξ given state s±, parameterized by Δt.
We begin by assuming an ideal observer who knows the environmental hazard rate h. Using Bayes’ rule and the law of total probability, we can relate Ln,± to the probability at the previous time step according to the weighted sum (Veliz-Cuba et al. 2016) \(L_{n,\pm} \propto f_{{\Delta} t,\pm}(\xi_{n})\left[(1-h_{{\Delta} t})L_{n-1,\pm} + h_{{\Delta} t} L_{n-1,\mp}\right],\)
where L0,± = P(s(t0) = s±). Defining \(y_{n} = \log \frac {L_{n, +}}{L_{n,-}}\), we can compute \(y_{n} = \log \frac{f_{{\Delta} t,+}(\xi_{n})}{f_{{\Delta} t,-}(\xi_{n})} + \log \frac{(1-h_{{\Delta} t})\mathrm{e}^{y_{n-1}} + h_{{\Delta} t}}{(1-h_{{\Delta} t}) + h_{{\Delta} t}\mathrm{e}^{y_{n-1}}}.\)
In search of the continuum limit of this equation, we assume 0 < Δt ≪ 1, 0 < |Δyn|≪ 1, and use the approximation \(\log (1 + z) \approx z\) to obtain \(y_{n} \approx y_{n-1} + \log \frac{f_{{\Delta} t,+}(\xi_{n})}{f_{{\Delta} t,-}(\xi_{n})} - 2h{\Delta} t \sinh(y_{n-1}).\)
Replacing the index n with the time t and applying the functional central limit theorem as in Billingsley (2008) and Bogacz et al. (2006), we can write Eq. (27) as
where η is a random variable with a standard normal distribution and
The drift gΔt and variance \(\rho _{\Delta t}^{2}\) diverge unless fΔt,±(ξ) are scaled appropriately in the Δt → 0 limit. A reasonable assumption that allows gΔt and \(\rho _{\Delta t}^{2}\) to be computed explicitly is that the observations ξ follow normal distributions with mean and variance scaled by Δt (Bogacz et al. 2006; Veliz-Cuba et al. 2016), \(f_{{\Delta} t,\pm}(\xi) = \frac{1}{\sqrt{2\pi \sigma^{2}{\Delta} t}}\exp\left[-\frac{(\xi \mp \mu{\Delta} t)^{2}}{2\sigma^{2}{\Delta} t}\right],\)
so we can compute the limits of Eq. (29) as
where g(t) ∈{+g,−g} is a telegraph process with probability masses P(±g, t) evolving as \(\dot {\mathrm {P}}(\pm g,t) = h \left [ \mathrm {P} (\mp g, t) - \mathrm {P} (\pm g,t) \right ] \) and ρ2(t) = ρ2 remains constant. Therefore, the continuum limit (Δt → 0) of Eq. (28) is \(\mathrm{d} y = g(t)\,\mathrm{d} t - 2h\sinh(y)\,\mathrm{d} t + \rho\,\mathrm{d} W,\)
where dW is a standard Wiener process. Equation (31) provides the normative model of evidence accumulation for an observer who knows the hazard rate h and wishes to infer the sign of g(t) at time t with maximal accuracy (Glaze et al. 2015; Veliz-Cuba et al. 2016).
However, we are also interested in near-normative models in which the observer assumes an incorrect hazard rate \(\tilde {h} \neq h\). In this case the analysis proceeds as before, with the probabilistic inference now involving \(\tilde {h}\) rather than h, and the result is \(\mathrm{d} y = g(t)\,\mathrm{d} t - 2\tilde{h}\sinh(y)\,\mathrm{d} t + \rho\,\mathrm{d} W,\)
Lastly, note that if the original observations ξ are indeed drawn from normal distributions, Eqs. (30a) and (30b) state that g(t) ∈{+g,−g} with g = 2μ2/σ2 and ρ2 = 2g. Rescaling time ht↦t, we can then express Eq. (32) in terms of the following rescaled equation
where m = 2μ2/(hσ2) and x(t) ∈{−1,+1} is a telegraph process with hazard rate 1, as shown in Eq. (2) of the main text.
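For intuition, the rescaled model can be simulated directly with an Euler–Maruyama scheme. The MATLAB sketch below assumes the well-tuned form dy = [m x(t) − 2 sinh(y)] dt + √(2m) dW suggested by the derivation above (cf. Veliz-Cuba et al. 2016); the parameter values and variable names are purely illustrative, and the released code linked under Code availability implements the full analysis.

% Euler-Maruyama simulation of the rescaled normative model (sketch).
% Assumed form: dy = (m*x - 2*sinh(y))*dt + sqrt(2*m)*dW, with x(t) a
% telegraph process switching between +1 and -1 at unit hazard rate.
m  = 5;                      % evidence strength
T  = 1;                      % interrogation time
dt = 1e-3;                   % time step
nTrials = 1e4;               % number of simulated trials
correct = false(nTrials, 1);
for j = 1:nTrials
    x = 1 - 2*(rand < 0.5);  % initial state of the telegraph process
    y = 0;                   % initial belief (log-likelihood ratio)
    for k = 1:round(T/dt)
        if rand < dt         % switch with probability ~ dt (unit hazard rate)
            x = -x;
        end
        y = y + (m*x - 2*sinh(y))*dt + sqrt(2*m*dt)*randn;
    end
    correct(j) = sign(y) == x;   % correct if the belief sign matches x(T)
end
accMC = mean(correct);       % Monte Carlo estimate of Acc(T)

Estimates obtained this way converge slowly with the number of trials, which motivates the Chapman-Kolmogorov approach compared against Monte Carlo sampling in Appendix C.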
Appendix B: Derivation of the Chapman-Kolmogorov equations
Here we outline the derivation of the CK equations given by Eqs. (4a), (4b) and (6) in the main text. Our goal is to write a PDE that describes the evolution of an ensemble of belief trajectories described by the SDE in Eq. (2) across all realizations of both sources of stochasticity – observation noise and state switching. We cannot simply write a Fokker-Planck equation to describe the desired probability distribution, as the process is doubly stochastic; the diffusion is described by a scaled Wiener process and the sign of the drift is controlled by a two-state Markov process. Therefore, following Droste and Lindner (2017), we condition our process on the state of x(t), and seek to obtain evolution equations for the conditional distributions p±(y, t) := p(y, t|x(t) = ±).
We start by conditioning on x(t) = + 1 to find an equation for p+(y, t). In the absence of state transitions in the background Markov process, the evolution equation for p+(y, t) is now given by the Fokker-Planck equation
On the other hand, in the absence of drift and diffusion, the evolution of p+(y, t) is governed exclusively by the two-state Markov process; in this case, we can write the evolution equation in the form of a discrete master equation:
Note here that because of the rescaling of time in Eq. (2), the transition rates of the Markov process equal one. Given these two components, we can follow Gardiner (2004) to write the CK equation that describes the evolution of p+(y, t) when drift, diffusion, and switching are all present as
which is Eq. (4a); Eq. (4b) is obtained similarly by conditioning on x(t) = − 1.
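Schematically, the combined equation therefore consists of the drift-diffusion terms plus the unit-rate switching terms. Writing the drift conditioned on x(t) = + 1 as A+(y) and the diffusion constant as B (the notation used for the boundary conditions in Appendix C), the structure is \(\partial_{t} p_{+}(y,t) = -\partial_{y}\left[A_{+}(y)\, p_{+}(y,t)\right] + B\, \partial_{y}^{2} p_{+}(y,t) + p_{-}(y,t) - p_{+}(y,t).\) This displays only the structure of the equation; the particular drift A+(y) and diffusion constant B are read off from Eq. (2).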
To determine response accuracy, we are interested in the probability that the observer reports the correct state of the Markov process. At an interrogation time t = T with x(T) = + 1, the observer is correct if y > 0, which occurs with probability \({\int \limits }_{0}^{\infty } p_{+}(y,T)\, \mathrm{d} y\). Similarly, if x(T) = − 1, the observer is correct if y < 0, which occurs with probability \({\int \limits }_{-\infty }^{0} p_{-}(y,T)\, \mathrm{d} y\). Therefore, the total response accuracy at time T is given by summing these two integrals, \(\text{Acc}(T) = {\int \limits }_{0}^{\infty } p_{+}(y,T)\, \mathrm{d} y + {\int \limits }_{-\infty }^{0} p_{-}(y,T)\, \mathrm{d} y,\)
which is Eq. (5).
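Numerically, Eq. (5) reduces to two quadratures over the belief mesh. The MATLAB sketch below uses Gaussian placeholders for the CK-equation output at t = T, scaled so that the two densities together integrate to one as assumed in Eq. (5); in practice these vectors come from the finite difference solver described in Appendix C.

% Numerical evaluation of the accuracy integral, Eq. (5) (sketch).
% p_plus and p_minus are placeholders for the CK-equation densities at t = T,
% scaled so that their integrals sum to one.
yg      = linspace(-8, 8, 1601);
p_plus  = 0.5*exp(-(yg - 2).^2/2)/sqrt(2*pi);    % placeholder for p_+(y,T)
p_minus = 0.5*exp(-(yg + 2).^2/2)/sqrt(2*pi);    % placeholder for p_-(y,T)
acc = trapz(yg(yg >= 0), p_plus(yg >= 0)) + ...
      trapz(yg(yg <= 0), p_minus(yg <= 0));      % Acc(T) from Eq. (5)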
Finally, in order to more efficiently simulate the evolution of p+(y, t) and p−(y, t), we notice that because the nonlinear leak is an odd function, a change of variables − y↦y in Eq. (4b) allows us to combine Eqs. (4a) and (4b) into a single PDE. Defining ps(y, t) := p+(y, t) + p−(−y, t), we obtain the evolution equation
which is Eq. (6).
Appendix C: Finite difference methods for Chapman-Kolmogorov equations
Fig. 9 Comparison of CK equations with Monte Carlo sampling. a Calculation of accuracy of the mistuned nonlinear leak model for m = 5. Monte Carlo simulations run with varying numbers of samples superimposed (legend). b Runtime (red) and L2 error (blue) of Monte Carlo simulations as a function of sample size. Runtime of the CK equations (black dashed) superimposed for comparison. L2 error of Monte Carlo simulations calculated against results from the CK equations. c Runtime (red) and \(L^{\infty }\) error (blue) of finite difference simulations of the nonlinear model Eq. (6) under refinement of the belief discretization, compared with the Δy = 10− 4 case. Because our method is first-order accurate in time and second-order accurate in belief, it is first-order accurate when Δy is decreased and Δt is held constant
We used a finite difference method to simulate the differential CK equations. The method is exemplified here for the normative CK equation from Eq. (6), but a similar approach was used for the linear, cubic, and pulsatile equations. For stability purposes, our method uses centered differences in y and backward-Euler in t. This gives the following finite difference approximations of the functions and their derivatives in Eq. (6):
where Δt and Δy are the timestep and spacestep of the simulation, respectively. Substituting these approximations into Eq. (6) and collecting the values of ps at each point of a mesh in y gives the system of equations:
where A is a tridiagonal matrix, with nonzero entries on the main diagonal and the first off-diagonals. This system can be inverted at each timestep to calculate the update ps(y, t + Δt).
For the boundary conditions, we imposed no-flux conditions at the mesh boundaries ± b. For a standard drift-diffusion equation with drift A(y) and diffusion constant B(y), this condition takes the form
Using the finite difference approximations
we can substitute y = ±b into the appropriate approximation and use Eq. (34) to find the boundary terms for the system in Eq. (33).
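The MATLAB sketch below assembles one backward-Euler step of this type for a generic drift-diffusion equation with drift A(y), constant diffusion coefficient B, and no-flux boundaries at y = ±b. The drift shown is the nonlinear-leak form assumed in Appendix A, and the nonlocal switching term of Eq. (6) is omitted for brevity; the released code under Code availability implements the full scheme.

% One backward-Euler / centered-difference step for a drift-diffusion
% equation dp/dt = -d/dy[A(y) p] + B d^2p/dy^2 with no-flux boundaries
% at y = -b and y = +b (sketch; the switching term of Eq. (6) is omitted).
b  = 5;  Ny = 501;
yg = linspace(-b, b, Ny)';  dy = yg(2) - yg(1);
dt = 1e-3;  m = 5;
Av = m - 2*sinh(yg);                         % assumed drift A(y) (see Appendix A)
B  = m;                                      % diffusion coefficient
p  = exp(-yg.^2);  p = p/trapz(yg, p);       % example initial density

% Interior rows of the backward-Euler matrix M = I - dt*L, where
% L is the centered-difference discretization of -d/dy[A .] + B d^2/dy^2.
cL = B/dy^2 + Av(1:Ny-2)/(2*dy);             % coefficient of p_{i-1}
cC = -2*B/dy^2 * ones(Ny-2, 1);              % coefficient of p_i
cU = B/dy^2 - Av(3:Ny)/(2*dy);               % coefficient of p_{i+1}
M = speye(Ny);
for i = 2:Ny-1
    M(i, i-1) = -dt*cL(i-1);
    M(i, i)   = 1 - dt*cC(i-1);
    M(i, i+1) = -dt*cU(i-1);
end
% Replace first and last rows with the no-flux condition A*p - B*dp/dy = 0,
% discretized with one-sided differences (cf. Eq. (34)).
M(1,:)  = 0;  M(1,1)   = Av(1)  + B/dy;  M(1,2)     = -B/dy;
M(Ny,:) = 0;  M(Ny,Ny) = Av(Ny) - B/dy;  M(Ny,Ny-1) =  B/dy;
rhs = p;  rhs(1) = 0;  rhs(Ny) = 0;
p_new = M \ rhs;                             % density at t + dt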
Figure 9 shows the results of Monte Carlo simulations compared against those from the CK equations. The Monte Carlo estimates are less smooth (Fig. 9a), making optimality calculations less accurate, and they require far more computation time to approach the CK results (Fig. 9b), whereas the CK equations achieve good accuracy (error ≈ 10− 2) in less than a second (Fig. 9c).
Appendix D: Distinguishing linear discounting models using confidence
Fig. 10 Distinguishing the linear discounting parameter λ using confidence reports. a Average number of trials needed to determine whether a subject uses the accuracy-maximizing leak parameter, λAcc, or the KL-divergence-minimizing leak parameter, λKL, as a function of evidence strength m. Averages were computed over a thousand simulations for interrogation time T = 1, with each simulation run until confidence in the subject’s model reached 90%
Can we use the distributions obtained from the CK equations to distinguish which model a subject uses? As an example, we considered distinguishing the two linear discounting models given by Eq. (7) and asked how many trials an experimentalist would need to run to tell whether a subject was using λAcc or λKL. We sampled from the distributions obtained from Eq. (8) and performed a likelihood ratio test to find the number of samples needed for the experimentalist to have 90% confidence in the model being used. Assuming a symmetric prior P(λAcc) = P(λKL) = 0.5, and defining \(\mathcal { Y}_{j} = y_{j}(T)\) as the belief report at the end (t = T) of trial j, the likelihood ratio is computed using Bayes’ rule and independence after N trials as \(\frac{P(\lambda_{\text{Acc}}|\mathcal{Y}_{1:N})}{P(\lambda_{\text{KL}}|\mathcal{Y}_{1:N})} = \prod\limits_{j=1}^{N}\frac{p(\mathcal{Y}_{j}|\lambda_{\text{Acc}})}{p(\mathcal{Y}_{j}|\lambda_{\text{KL}})},\)
and we counted the number of trials required to obtain ≥ 90% confidence in either model (so \(P(\lambda |\mathcal {Y}_{1:N}) \geq 0.9\) for either λ ∈{λAcc, λKL}). The results of this simulation are given in Fig. 10, showing that as the strength m of evidence increases, the mean number of trials 〈N(m)〉 required to distinguish models decreases precipitously. This is to be expected based on the fact that the belief distributions become more separated as m increases (Fig. 3e).
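A minimal MATLAB sketch of one such simulated experiment is shown below. The two belief densities stand in for \(p(\mathcal{Y}|\lambda_{\text{Acc}})\) and \(p(\mathcal{Y}|\lambda_{\text{KL}})\) obtained from Eq. (8) and are replaced here by Gaussian placeholders, with reports drawn from the λAcc model.

% Count the trials needed to reach 90% posterior confidence in one of the
% two linear discounting models (sketch). The densities below are Gaussian
% placeholders for the CK-equation output p(Y | lambda).
yg    = linspace(-10, 10, 2001);
p_acc = exp(-(yg - 1.5).^2/2)/sqrt(2*pi);    % placeholder p(Y | lambda_Acc)
p_kl  = exp(-(yg - 1.0).^2/4)/sqrt(4*pi);    % placeholder p(Y | lambda_KL)
cw    = cumsum(p_acc)/sum(p_acc);            % sampling CDF for the "true" model

logOdds = 0;  N = 0;                         % log posterior odds (symmetric prior)
while abs(logOdds) < log(0.9/0.1)            % stop once either posterior >= 0.9
    N   = N + 1;
    idx = find(rand <= cw, 1, 'first');      % draw a belief report Y_j ~ p_acc
    logOdds = logOdds + log(p_acc(idx)) - log(p_kl(idx));
end
% N is one sample of the trial count; averaging over many runs gives <N(m)>.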
Appendix E: Deriving differential CK equation with internal noise
Here we provide intuition for the form of the diffusion coefficient in Eq. (13) for the belief distribution of a normative observer strategy with additional internal noise of strength D. Starting with the SDE in Eq. (12), because \(\sqrt {2m}\mathrm {d} W_{t}\) and \(\sqrt {2D}\mathrm {d} X_{t}\) are increments of independent Wiener processes, we can define a new Wiener process \(A\mathrm {d} Z_{t}=\sqrt {2m}\mathrm {d} W_{t}+\sqrt {2D}\mathrm {d} X_{t}\) that has the same statistics as the original summed Wiener processes (Gardiner 2004). To determine the appropriate effective diffusion constant A, we note that \(\langle A\mathrm{d} Z_{t}\rangle = \langle \sqrt{2m}\mathrm{d} W_{t}\rangle + \langle \sqrt{2D}\mathrm{d} X_{t}\rangle = 0\) and, by independence of the two increments, \(\langle (A\mathrm{d} Z_{t})^{2}\rangle = 2m\langle (\mathrm{d} W_{t})^{2}\rangle + 2D\langle (\mathrm{d} X_{t})^{2}\rangle = 2(m+D)\mathrm{d} t.\) This requires \(A=\sqrt {2(m+D)}\), and means Eq. (12) can be rewritten with the two noise sources combined into the single increment \(\sqrt{2(m+D)}\,\mathrm{d} Z_{t}\),
which following Gardiner (2004), has the differential CK equation given by Eq. (13).
Appendix F: Steady state solution of the bounded accumulator model
Steady state solutions of Eq. (16) are derived first by noting that ∂tp± = 0 implies
with boundary conditions \(\bar {p}_{\pm }(\pm \beta ) = \bar {p}_{\pm }^{\prime }(\pm \beta )\). Equation (35) has solutions \(\left (\begin {array}{c} \bar {p}_{+}(y)\\ \bar {p}_{-}(y) \end {array} \right ) = \left (\begin {array}{c} A \\ B \end {array} \right ) \mathrm {e}^{\alpha y}\), with characteristic equation \(m^{2} \alpha ^{4}-\left (m^{2}+2m\right ) \alpha ^{2}=0\). The characteristic roots are α = 0,±q, where we define \(q = \sqrt {1+\frac {2}{m}}\). For α = 0, we have A = B, whereas for α = ±q, the symmetry \(\bar {p}_{+}(y)=\bar {p}_{-}(-y)\) implies B = (mq − (m + 1))A for α = +q and A = (mq − (m + 1))B for α = −q. Lastly, defining the sum \(\bar {p}_{s}(y)=\bar {p}_{+}(y)+\bar {p}_{-}(-y)\), we obtain
The no flux boundary conditions \(\bar {p}_{s}(\pm \beta )-\frac {\partial \bar {p}_{s}(\pm \beta )}{\partial y}=0\) along with the normalization requirement \({\int \limits }_{- \beta }^{\beta } \bar {p}_{s}(y) \mathrm {d} y = 1\) give explicit expressions for the constants
and
Appendix G: Steady state solution of the clicks-task bounded accumulator model
Considering Eq. (23), we look for stationary solutions of the form \(\left (\begin {array}{c} \bar {p}_{+}(n)\\ \bar {p}_{-}(n) \end {array}\right )=\left (\begin {array}{c} C_{1}\\ C_{2} \end {array}\right ) \alpha ^{n}\), yielding the characteristic equation
where rs = r+ + r− + 1. Solving Eq. (36) gives α = 1 with eigenfunction C1 = C2 and two roots α = q± of the quadratic \(\alpha ^{2}- \alpha (r_{+}+r_{+}^{2}+r_{-}+r_{-}^{2})/(r_{+} r_{-}) +1=0\). Superimposing the eigenfunctions, redefining constants, and defining \(\bar {p}_{s}(n)=\bar {p}_{+}(n)+\bar {p}_{-}(-n)\) gives the general solution
The constants C1, C2, and C3 can be determined by normalization \({\sum }_{n=-\beta }^{\beta }\bar {p}_{s}(n)=1\) and the stationary boundary conditions
Long term accuracy of the bounded accumulator is then determined by the weighted sum \(\text {Acc}_{\infty }(\beta ) = \frac {1}{2} \bar {p}_{s}(0) + {\sum }_{n=1}^{\beta } \bar {p}_{s}(n)\).
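As a sketch, the MATLAB lines below compute the roots q± from the quadratic above and evaluate Acc∞(β) for a stationary distribution of the form \(\bar{p}_{s}(n) = C_{1} + C_{2} q_{+}^{n} + C_{3} q_{-}^{n}\); this form and the constants shown are assumptions for illustration only, since in practice C1, C2, and C3 follow from the normalization and boundary conditions stated above.

% Long-run accuracy of the clicks-task bounded accumulator (sketch).
% r_p and r_m are the click rates; C1-C3 are placeholder constants standing
% in for the values fixed by normalization and the boundary conditions.
r_p = 2;  r_m = 1;  beta = 10;
q  = roots([1, -(r_p + r_p^2 + r_m + r_m^2)/(r_p*r_m), 1]);  % roots q_+ and q_-
n  = (-beta:beta)';
C1 = 1;  C2 = 1e-3;  C3 = 1e-3;                              % placeholder constants
p_s = C1 + C2*q(1).^n + C3*q(2).^n;                          % assumed solution form
p_s = p_s/sum(p_s);                                          % impose normalization
acc_inf = 0.5*p_s(n == 0) + sum(p_s(n >= 1));                % Acc_inf(beta)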
Cite this article
Barendregt, N.W., Josić, K. & Kilpatrick, Z.P. Analyzing dynamic decision-making models using Chapman-Kolmogorov equations. J Comput Neurosci 47, 205–222 (2019). https://doi.org/10.1007/s10827-019-00733-5