## Abstract

Decision-making in dynamic environments typically requires adaptive evidence accumulation that weights new evidence more heavily than old observations. Recent experimental studies employ dynamic decision tasks in which the correct choice switches stochastically within a single trial. In such cases, an ideal observer's belief is described by an evolution equation that is doubly stochastic, reflecting stochasticity in both the observations and the environmental changes. In these contexts, we show that the probability density of the belief can be represented using differential Chapman-Kolmogorov equations, allowing efficient computation of ensemble statistics. This allows us to reliably compare normative models to near-normative approximations using, as model performance metrics, decision response accuracy and the Kullback-Leibler divergence of the belief distributions. Such belief distributions could be obtained empirically by asking subjects to report their decision confidence. We also study how response accuracy is affected by additional internal noise, showing that optimality requires longer integration timescales as more noise is added. Lastly, we demonstrate that our method can be applied to tasks in which evidence arrives in discrete pulses rather than continuously.


## Notes

The notation *o*(Δ*t*) means that all other terms are of smaller order than Δ*t*. More precisely, *o*(Δ*t*) represents a function *f*(Δ*t*) with the property \(\lim \limits _{\Delta t\downarrow 0}\frac {f({\Delta } t)}{\Delta t}=0\).

## References

Bankó, É.M., Gál, V., Körtvélyes, J., Kovács, G., Vidnyánszky, Z. (2011). Dissociating the effect of noise on sensory processing and overall decision difficulty. *Journal of Neuroscience*, *31*(7), 2663–2674.

Behrens, T.E., Woolrich, M.W., Walton, M.E., Rushworth, M.F. (2007). Learning the value of information in an uncertain world. *Nature Neuroscience*, *10*(9), 1214.

Billingsley, P. (2008). *Probability and measure*. Wiley.

Bogacz, R., Brown, E., Moehlis, J., Holmes, P., Cohen, J.D. (2006). The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. *Psychological Review*, *113*(4), 700.

Brea, J., Urbanczik, R., Senn, W. (2014). A normative theory of forgetting: lessons from the fruit fly. *PLoS Computational Biology*, *10*(6), e1003640.

Brody, C.D., Romo, R., Kepecs, A. (2003). Basic mechanisms for graded persistent activity: discrete attractors, continuous attractors, and dynamic representations. *Current Opinion in Neurobiology*, *13*(2), 204–211.

Brunton, B.W., Botvinick, M.M., Brody, C.D. (2013). Rats and humans can optimally accumulate evidence for decision-making. *Science*, *340*(6128), 95–98.

Busemeyer, J.R., & Townsend, J.T. (1992). Fundamental derivations from decision field theory. *Mathematical Social Sciences*, *23*(3), 255–282.

Droste, F., & Lindner, B. (2014). Integrate-and-fire neurons driven by asymmetric dichotomous noise. *Biological Cybernetics*, *108*(6), 825–843.

Droste, F., & Lindner, B. (2017). Exact results for power spectrum and susceptibility of a leaky integrate-and-fire neuron with two-state noise. *Physical Review E*, *95*(1), 012411.

Drugowitsch, J. (2016). Fast and accurate Monte Carlo sampling of first-passage times from Wiener diffusion models. *Scientific Reports*, *6*, 20490.

Drugowitsch, J., Moreno-Bote, R., Churchland, A.K., Shadlen, M.N., Pouget, A. (2012). The cost of accumulating evidence in perceptual decision making. *Journal of Neuroscience*, *32*(11), 3612–3628.

Eckhoff, P., Holmes, P., Law, C., Connolly, P., Gold, J. (2008). On diffusion processes with variable drift rates as models for decision making during learning. *New Journal of Physics*, *10*(1), 015006.

Eissa, T.L., Barendregt, N.W., Gold, J.I., Josić, K., Kilpatrick, Z.P. (2019). Hierarchical inference interactions in dynamic environments. In *Computational and Systems Neuroscience*. Lisbon.

Erban, R., & Chapman, S.J. (2007). Reactive boundary conditions for stochastic simulations of reaction–diffusion processes. *Physical Biology*, *4*(1), 16.

Faisal, A.A., Selen, L.P., Wolpert, D.M. (2008). Noise in the nervous system. *Nature Reviews Neuroscience*, *9*(4), 292.

Friedman, J., Hastie, T., Tibshirani, R. (2001). *The elements of statistical learning*, Chap. 7: Model assessment and selection, Vol. 1. New York: Springer Series in Statistics.

Gardiner, C. (2004). *Handbook of stochastic methods: for physics, chemistry and the natural sciences* (Springer Series in Synergetics, Vol. 13).

Geisler, W.S. (2003). Ideal observer analysis. *The Visual Neurosciences*, *10*(7), 12–12.

Glaze, C.M., Kable, J.W., Gold, J.I. (2015). Normative evidence accumulation in unpredictable environments. *eLife*, *4*, e08825.

Glaze, C.M., Filipowicz, A.L., Kable, J.W., Balasubramanian, V., Gold, J.I. (2018). A bias–variance trade-off governs individual differences in on-line learning in an unpredictable environment. *Nature Human Behaviour*, *2*(3), 213.

Gold, J.I., & Shadlen, M.N. (2007). The neural basis of decision making. *Annual Review of Neuroscience*, *30*.

Hanson, F.B. (2007). *Applied stochastic processes and control for jump-diffusions: modeling, analysis, and computation*, Vol. 13. SIAM.

Heath, R.A. (1992). A general nonstationary diffusion model for two-choice decision-making. *Mathematical Social Sciences*, *23*(3), 283–309.

Horsthemke, W., & Lefever, R. (2006). *Noise-induced transitions: theory and applications in physics, chemistry and biology* (Springer Series in Synergetics). Berlin: Springer.

Kiani, R., & Shadlen, M.N. (2009). Representation of confidence associated with a decision by neurons in the parietal cortex. *Science*, *324*(5928), 759–764.

Moehlis, J., Brown, E., Bogacz, R., Holmes, P., Cohen, J.D. (2004). *Optimizing reward rate in two alternative choice tasks: mathematical formalism* (Report 04-01). Center for the Study of Brain, Mind and Behavior, Princeton University.

Ossmy, O., Moran, R., Pfeffer, T., Tsetsos, K., Usher, M., Donner, T.H. (2013). The timescale of perceptual evidence integration can be adapted to the environment. *Current Biology*, *23*(11), 981–986.

Piet, A.T., El Hady, A., Brody, C.D. (2018). Rats adopt the optimal timescale for evidence integration in a dynamic environment. *Nature Communications*, *9*(1), 4265.

Piet, A., El Hady, A., Boyd-Meredith, T., Brody, C. (2019). Neural dynamics during changes of mind. In *Computational and Systems Neuroscience*. Lisbon.

Radillo, A.E., Veliz-Cuba, A., Josić, K., Kilpatrick, Z.P. (2017). Evidence accumulation and change rate inference in dynamic environments. *Neural Computation*, *29*(6), 1561–1610.

Radillo, A.E., Veliz-Cuba, A., Josić, K. (2019). Performance of normative and approximate evidence accumulation on the dynamic clicks task. *Neurons, Behavior, Data Analysis, and Theory*. Submitted.

Rahnev, D., & Denison, R.N. (2018). Suboptimality in perceptual decision making. *Behavioral and Brain Sciences*, *41*, e223. https://doi.org/10.1017/S0140525X18000936

Ratcliff, R. (1978). A theory of memory retrieval. *Psychological Review*, *85*(2), 59.

Ratcliff, R., & McKoon, G. (2008). The diffusion decision model: theory and data for two-choice decision tasks. *Neural Computation*, *20*(4), 873–922.

Salinas, E., & Sejnowski, T.J. (2002). Integrate-and-fire neurons driven by correlated stochastic input. *Neural Computation*, *14*(9), 2111–2155.

Skellam, J.G. (1946). The frequency distribution of the difference between two Poisson variates belonging to different populations. *Journal of the Royal Statistical Society Series A (General)*, *109*(3), 296.

Smith, P.L. (2010). From Poisson shot noise to the integrated Ornstein–Uhlenbeck process: neurally principled models of information accumulation in decision-making and response time. *Journal of Mathematical Psychology*, *54*(2), 266–283.

Smith, P.L., & Ratcliff, R. (2004). Psychology and neurobiology of simple decisions. *Trends in Neurosciences*, *27*(3), 161–168.

Urai, A.E., Braun, A., Donner, T.H. (2017). Pupil-linked arousal is driven by decision uncertainty and alters serial choice bias. *Nature Communications*, *8*, 14637.

Van Den Berg, R., Anandalingam, K., Zylberberg, A., Kiani, R., Shadlen, M.N., Wolpert, D.M. (2016). A common mechanism underlies changes of mind about decisions and confidence. *eLife*, *5*, e12192.

Veliz-Cuba, A., Kilpatrick, Z.P., Josić, K. (2016). Stochastic models of evidence accumulation in changing environments. *SIAM Review*, *58*(2), 264–289.

Wilson, R.C., Nassar, M.R., Gold, J.I. (2010). Bayesian online learning of the hazard rate in change-point problems. *Neural Computation*, *22*(9), 2452–2476.

Yu, A.J., & Cohen, J.D. (2008). Sequential effects: superstition or rational behavior? *Advances in Neural Information Processing Systems*, *21*, 1873–1880.

Zhang, S., Lee, M.D., Vandekerckhove, J., Maris, G., Wagenmakers, E.J. (2014). Time-varying boundaries for diffusion models of decision making and response time. *Frontiers in Psychology*, *5*, 1364.

## Acknowledgements

This work was supported by an NSF/NIH CRCNS grant (R01MH115557) and NSF (DMS-1517629). ZPK was also supported by NSF (DMS-1615737 & DMS-1853630). KJ was also supported by NSF (DBI-1707400). We thank Sam Isaacson and Jay Newby for feedback on the boundary value problem for the bounded accumulator model. We are also grateful to Adrian Radillo and Tahra Eissa for comments on a draft version of the manuscript.

## Author information

### Authors and Affiliations

### Corresponding author

## Ethics declarations

### Conflict of interest

The authors declare that they have no conflict of interest.

## Additional information

Action Editor: P. Dayan

### Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

### Code availability

See https://github.com/nwbarendregt/DynamicDecisionCKEquations for the MATLAB finite difference code used to perform the analysis and generate the figures.

K. Josić and Z.P. Kilpatrick share equal authorship.

## Appendices

### Appendix A: Normative evidence-accumulation in dynamic environments

Here we derive the continuum limit of the Bayesian update equation for continuous evidence accumulation in a changing environment. Starting with the discrete time model, we define *L*_{n,±} = P(*s*(*t*_{n}) = *s*_{±}|*ξ*_{1:n}) as the probability of being in state *s*_{±} at time *t*_{n} assuming a sequence of observations *ξ*_{1:n}. The state *s*(*t*) changes between evenly spaced time points *t*_{1:n} (with Δ*t* := *t*_{n} − *t*_{n− 1}) at a hazard rate *h*_{Δt} := *h* ⋅Δ*t* := P(*s*(*t*_{n}) = *s*_{∓}|*s*(*t*_{n− 1}) = *s*_{±}). The likelihood function *f*_{Δt,±}(*ξ*) = P(*ξ*|*s* = *s*_{±}; Δ*t*) is the conditional probability of observing sample *ξ* given state *s*_{±}, parameterized by Δ*t*.

We begin by assuming an ideal observer who knows the environmental hazard rate *h*. Using Bayes’ rule and the law of total probability, we can relate *L*_{n,±} to the probability at the previous time step according to the weighted sum (Veliz-Cuba et al. 2016)

where *L*_{0,±} = P(*s*(*t*_{0}) = *s*_{±}). Defining \(y_{n} = \log \frac {L_{n, +}}{L_{n,-}}\), we can compute
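The resulting discrete-time update of *y*_{n} can be sketched numerically. Below is a minimal Python illustration (the paper's published code is MATLAB) of a single update, assuming Gaussian likelihoods with means ±*μ*Δ*t* and variance *σ*^{2}Δ*t*, as in the continuum-limit scaling below; the hazard-discounting term takes the form derived in Glaze et al. (2015) and Veliz-Cuba et al. (2016), and all function names are ours.

```python
import numpy as np

def norm_pdf(x, mean, sd):
    return np.exp(-(x - mean) ** 2 / (2.0 * sd**2)) / (sd * np.sqrt(2.0 * np.pi))

def discrete_belief_update(y_prev, xi, h, dt, mu, sigma):
    """One discrete-time update of y_n = log(L_{n,+} / L_{n,-}).

    Assumes Gaussian likelihoods with means ±mu*dt and variance sigma^2*dt;
    the hazard-discounting term follows Glaze et al. (2015) and
    Veliz-Cuba et al. (2016).
    """
    s = sigma * np.sqrt(dt)
    # log-likelihood ratio of the new observation under states s_+ and s_-
    llr = np.log(norm_pdf(xi, mu * dt, s) / norm_pdf(xi, -mu * dt, s))
    # discounting of the previous belief by the hazard rate h
    eps = h * dt
    discount = np.log(((1.0 - eps) * np.exp(y_prev) + eps)
                      / ((1.0 - eps) + eps * np.exp(y_prev)))
    return llr + discount
```

With *h* = 0 the discount term vanishes and the update reduces to perfect integration of log-likelihood ratios.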

In search of the continuum limit of this equation, we assume 0 < Δ*t* ≪ 1, 0 < |Δ*y*_{n}|≪ 1, and use the approximation \(\log (1 + z) \approx z\) to obtain

Replacing the index *n* with the time *t* and applying the functional central limit theorem as in Billingsley (2008) and Bogacz et al. (2006), we can write Eq. (27) as

where *η* is a random variable with a standard normal distribution and

The drift *g*_{Δt} and variance \(\rho _{\Delta t}^{2}\) diverge unless *f*_{Δt,±}(*ξ*) are scaled appropriately in the Δ*t* → 0 limit. To compute *g*_{Δt} and \(\rho _{\Delta t}^{2}\) explicitly, it is reasonable to assume that observations *ξ* follow normal distributions with mean and variance scaled by Δ*t* (Bogacz et al. 2006; Veliz-Cuba et al. 2016)

so we can compute the limits of Eq. (29) as

where *g*(*t*) ∈{+*g*,−*g*} is a telegraph process with probability masses P(±*g*, *t*) evolving as \(\dot {\mathrm {P}}(\pm g,t) = h \left [ \mathrm {P} (\mp g, t) - \mathrm {P} (\pm g,t) \right ] \) and *ρ*^{2}(*t*) = *ρ*^{2} remains constant. Therefore, the continuum limit (Δ*t* → 0) of Eq. (28) is

where d*W* is a standard Wiener process. Equation (31) provides the normative model of evidence accumulation for an observer who knows the hazard rate *h* and wishes to infer the sign of *g*(*t*) at time *t* with maximal accuracy (Glaze et al. 2015; Veliz-Cuba et al. 2016).

However, we are also interested in near-normative models in which the observer assumes an incorrect hazard rate \(\tilde {h} \neq h\). In such a case, the analysis proceeds as before, with the probabilistic inference process simply involving \(\tilde {h}\) now rather than *h*, and the result is

Lastly, note that if the original observations *ξ* are indeed drawn from normal distributions, Eqs. (30a) and (30b) state that *g*(*t*) ∈{+*g*,−*g*} where *g* = 2*μ*^{2}/*σ*^{2} and *ρ*^{2} = 2*g*. Rescaling time *h**t*↦*t*, we can then express Eq. (32) in terms of the following rescaled equation

where *m* = 2*μ*^{2}/(*h**σ*^{2}) and *x*(*t*) ∈± 1 is a telegraph process with hazard rate 1, as shown in Eq. (2) of the main text.
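The rescaled model can be simulated directly with an Euler-Maruyama scheme. The Python sketch below (the paper's code is MATLAB) assumes drift *m*·*x*(*t*), the nonlinear leak 2 sinh(*y*) of Veliz-Cuba et al. (2016), and noise amplitude \(\sqrt{2m}\), with *x*(*t*) a unit-hazard telegraph process; parameter values are illustrative.

```python
import numpy as np

def simulate_belief(m, T=10.0, dt=1e-3, seed=0):
    """Euler-Maruyama simulation of the rescaled SDE
    dy = [m*x(t) - 2*sinh(y)] dt + sqrt(2*m) dW,
    with x(t) a telegraph process of unit hazard rate."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    y = np.zeros(n + 1)
    x = 1  # environmental state x(t), switching between +1 and -1
    for i in range(n):
        if rng.random() < dt:  # switch with probability ~ hazard * dt
            x = -x
        y[i + 1] = (y[i] + (m * x - 2.0 * np.sinh(y[i])) * dt
                    + np.sqrt(2.0 * m * dt) * rng.standard_normal())
    return y

# between switches the deterministic part equilibrates near ±asinh(m/2)
y = simulate_belief(m=2.0)
```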

### Appendix B: Derivation of the Chapman-Kolmogorov equations

Here we outline the derivation of the CK equations given by Eqs. (4a), (4b) and (6) in the main text. Our goal is to write a PDE that describes the evolution of an ensemble of belief trajectories described by the SDE in Eq. (2) across all realizations of both sources of stochasticity: observation noise and state switching. We cannot simply write a Fokker-Planck equation for the desired probability distribution, as the process is doubly stochastic; the diffusion is described by a scaled Wiener process, and the sign of the drift is controlled by a two-state Markov process. Therefore, following Droste and Lindner (2017), we condition our process on the state of *x*(*t*), and seek evolution equations for the conditional distributions *p*_{±}(*y*, *t*) := *p*(*y*, *t*|*x*(*t*) = ±).

We start by conditioning on *x*(*t*) = + 1 to find an equation for *p*_{+}(*y*, *t*). In the absence of state transitions in the background Markov process, the evolution equation for *p*_{+}(*y*, *t*) is now given by the Fokker-Planck equation

On the other hand, in the absence of drift and diffusion, the evolution of *p*_{+}(*y*, *t*) is governed exclusively by the two-state Markov process; in this case, we can write the evolution equation in the form of a discrete master equation:

Note here that because of the rescaling of time in Eq. (2), the transition rates of the Markov process equal one. Given these two components, we can follow Gardiner (2004) to write the CK equation that describes the evolution of *p*_{+}(*y*, *t*) when drift, diffusion, and switching are all present as

which is Eq. (4a); Eq. (4b) is obtained similarly by conditioning on *x*(*t*) = − 1.

To determine response accuracy, we are interested in the probability that the observer reports the correct state of the Markov process. At an interrogation time *t* = *T* with *x*(*T*) = + 1, the observer is correct if *y* > 0, which occurs with probability \({\int \limits }_{0}^{\infty } p_{+}(y,T) dy\). Similarly, if *x*(*T*) = − 1, the observer is correct if *y* < 0, which occurs with probability \({\int \limits }_{-\infty }^{0} p_{-}(y,T) dy\). The total response accuracy at time *T* is therefore the sum of these two integrals:

which is Eq. (5).
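Given the conditional densities on a numerical grid, Eq. (5) amounts to two one-sided integrals. A short Python sketch using trapezoidal quadrature; the grid and the Gaussian stand-in densities are illustrative, not the paper's CK solutions.

```python
import numpy as np

def trapezoid(f, x):
    # explicit trapezoidal rule, kept simple for illustration
    return float(np.sum((f[1:] + f[:-1]) * np.diff(x)) / 2.0)

def response_accuracy(y, p_plus, p_minus):
    """Accuracy at interrogation time T (Eq. (5)): integrate p_plus over
    y > 0 and p_minus over y < 0, given densities sampled on the grid y."""
    pos, neg = y > 0, y < 0
    return trapezoid(p_plus[pos], y[pos]) + trapezoid(p_minus[neg], y[neg])

# Illustrative check: joint densities 0.5*N(±1, 1) put most of their mass
# on the "correct" side, so accuracy should be about 0.84.
y = np.linspace(-6.0, 6.0, 2001)
g = lambda mu: np.exp(-(y - mu) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
acc = response_accuracy(y, 0.5 * g(1.0), 0.5 * g(-1.0))
```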

Finally, in order to more efficiently simulate the evolution of *p*_{+}(*y*, *t*) and *p*_{−}(*y*, *t*), we notice that because the nonlinear leak is an odd function, a change of variables − *y*↦*y* in Eq. (4b) allows us to combine Eqs. (4a) and (4b) into a single PDE. Defining *p*_{s}(*y*, *t*) := *p*_{+}(*y*, *t*) + *p*_{−}(−*y*, *t*), we obtain the evolution equation

which is Eq. (6).

### Appendix C: Finite difference methods for Chapman-Kolmogorov equations

We used a finite difference method to simulate the differential CK equations. The method is exemplified here for the normative CK equation from Eq. (6), but a similar approach was used for the linear, cubic, and pulsatile equations. For stability purposes, our method uses centered differences in *y* and backward-Euler in *t*. This gives the following finite difference approximations of the functions and their derivatives in Eq. (6):

where Δ*t* and Δ*y* are the timestep and spatial step of the simulation, respectively. Substituting into Eq. (6) and solving for *p*_{s}(*y*, *t*) at each point on a mesh **y** for *y* gives the system of equations:

where **A** is tridiagonal, with nonzero elements on the main diagonal and first off-diagonals. This system can be inverted at each timestep to calculate the updates *p*_{s}(**y**, *t* + Δ*t*).
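A minimal Python version of this scheme is sketched below (the published code is MATLAB). The drift *m* − 2 sinh(*y*) and diffusion coefficient *m* come from the rescaled model; for brevity the sketch uses a dense solve rather than a tridiagonal one and simply pins the density to zero at the mesh edges, which is adequate when the domain is wide enough that little mass reaches the boundary. The no-flux conditions of the next paragraph would replace those boundary rows.

```python
import numpy as np

def trapezoid(f, x):
    return float(np.sum((f[1:] + f[:-1]) * np.diff(x)) / 2.0)

def evolve_density(p0, y, m, dt, nsteps):
    """Advance the belief density with backward-Euler in t and centered
    differences in y, solving (I - dt*L) p_new = p_old at each step."""
    dy = y[1] - y[0]
    N = len(y)
    drift = m - 2.0 * np.sinh(y)
    L = np.zeros((N, N))
    for i in range(1, N - 1):
        # discretize -d/dy[drift * p] + m * d^2 p/dy^2 with centered stencils
        L[i, i - 1] = drift[i - 1] / (2.0 * dy) + m / dy**2
        L[i, i] = -2.0 * m / dy**2
        L[i, i + 1] = -drift[i + 1] / (2.0 * dy) + m / dy**2
    M = np.eye(N) - dt * L  # backward-Euler system matrix
    p = p0.copy()
    for _ in range(nsteps):
        # dense solve for clarity; M is tridiagonal, so a banded
        # solver would be more efficient
        p = np.linalg.solve(M, p)
    return p

y = np.linspace(-5.0, 5.0, 201)
p0 = np.exp(-y**2 / 0.5)
p0 /= trapezoid(p0, y)
p = evolve_density(p0, y, m=1.0, dt=1e-3, nsteps=100)
```

Because the interior stencil is conservative and the confining leak keeps mass away from the edges, total probability is preserved to good accuracy over the integration.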

For the boundary conditions, we imposed no-flux conditions at the mesh boundaries ± *b*. For a standard drift-diffusion equation with drift *A*(*y*) and diffusion constant *B*(*y*), this condition takes the form

Using the finite difference approximations

we can substitute *y* = ±*b* into the appropriate expressions and use Eq. (34) to find the boundary terms for the system in Eq. (33).

Figure 9 shows the results of Monte Carlo simulations compared against those from the CK equations; the Monte Carlo results are less smooth (Fig. 9a), making optimality calculations less accurate. Furthermore, Monte Carlo simulation takes far longer to approach the CK-equation results (Fig. 9b), whereas the CK equations yield good accuracy (error ≈ 10^{− 2}) in under a second (Fig. 9c).

### Appendix D: Distinguishing linear discounting models using confidence

Can we use the distributions obtained from the CK equations to distinguish which model a subject uses? As an illustration, we considered distinguishing the two linear discounting models given by Eq. (7) and asked how many trials an experimentalist would need to determine whether a subject was using *λ*^{Acc} or *λ*^{KL}. We sampled from the distributions obtained from Eq. (8) and performed a likelihood ratio test to find the number of samples needed for the experimentalist to have 90% confidence in the model being used. Assuming a symmetric prior *P*(*λ*^{Acc}) = *P*(*λ*^{KL}) = 0.5, and defining \(\mathcal { Y}_{j} = y_{j}(T)\) as the belief report at the end (*t* = *T*) of trial *j*, the likelihood ratio is computed using Bayes’ rule and independence after *N* trials as

and we counted the number of trials required to obtain ≥ 90% confidence in either model (so \(P(\lambda |\mathcal {Y}_{1:N}) \geq 0.9\) for either *λ* ∈{*λ*^{Acc}, *λ*^{KL}}). The results of this simulation are given in Fig. 10, showing that as the evidence strength *m* increases, the mean number of trials 〈*N*(*m*)〉 required to distinguish the models decreases precipitously. This is expected, since the belief distributions become more separated as *m* increases (Fig. 3e).
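The trial-counting procedure can be sketched as a sequential Bayesian update of the posterior over the two candidate models. In the Python sketch below, Gaussian log-densities stand in for the CK-equation belief distributions, and all names are ours.

```python
import numpy as np

def trials_to_confidence(samples, logpdf_a, logpdf_b, threshold=0.9):
    """Sequentially update the posterior over two candidate models from
    end-of-trial belief reports, starting from a symmetric prior; return
    the number of trials until either posterior reaches `threshold`."""
    log_post = np.log([0.5, 0.5])
    for n, y_end in enumerate(samples, start=1):
        log_post = log_post + np.array([logpdf_a(y_end), logpdf_b(y_end)])
        # normalize in log space to get the posterior probabilities
        post = np.exp(log_post - np.logaddexp(log_post[0], log_post[1]))
        if post.max() >= threshold:
            return n
    return len(samples)

# Toy check with Gaussian stand-ins for the two belief distributions:
# data generated from the first model should favor it within a few trials.
rng = np.random.default_rng(1)
logpdf_1 = lambda y: -0.5 * (y - 1.0) ** 2  # log-density up to a constant
logpdf_2 = lambda y: -0.5 * (y + 1.0) ** 2
n_trials = trials_to_confidence(rng.normal(1.0, 1.0, size=500),
                                logpdf_1, logpdf_2)
```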

### Appendix E: Deriving differential CK equation with internal noise

Here we provide intuition for the form of the diffusion coefficient in Eq. (13) for the belief distribution of a normative observer strategy with additional internal noise of strength *D*. Starting with the SDE in Eq. (12), because \(\sqrt {2m}\mathrm {d} W_{t}\) and \(\sqrt {2D}\mathrm {d} X_{t}\) are increments of independent Wiener processes, we can define a new Wiener process \(A\mathrm {d} Z_{t}=\sqrt {2m}\mathrm {d} W_{t}+\sqrt {2D}\mathrm {d} X_{t}\) that has the same statistics as the original summed Wiener processes (Gardiner 2004). To determine the appropriate effective diffusion constant *A*, we note that

and

This requires \(A=\sqrt {2(m+D)}\), and means Eq. (12) can be rewritten as

which following Gardiner (2004), has the differential CK equation given by Eq. (13).
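This variance-matching argument is easy to verify numerically: the sum of the two independent scaled Wiener increments should have the same variance, 2(*m* + *D*)Δ*t*, as a single increment *A*d*Z*_{t} with \(A=\sqrt{2(m+D)}\). Parameter values below are illustrative.

```python
import numpy as np

# Numerical check: sqrt(2m)*dW + sqrt(2D)*dX, with W and X independent
# Wiener processes, matches a single increment A*dZ with A = sqrt(2*(m+D)).
rng = np.random.default_rng(0)
m, D, dt, n = 1.5, 0.5, 1e-2, 200_000
dW = np.sqrt(dt) * rng.standard_normal(n)
dX = np.sqrt(dt) * rng.standard_normal(n)
combined = np.sqrt(2.0 * m) * dW + np.sqrt(2.0 * D) * dX
var_emp = combined.var()
var_theory = 2.0 * (m + D) * dt  # = A**2 * dt
```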

### Appendix F: Steady state solution of the bounded accumulator model

Steady state solutions of Eq. (16) are derived first by noting that *∂*_{t}*p*^{±} = 0 implies

with boundary conditions \(\bar {p}_{\pm }(\pm \beta ) = \bar {p}_{\pm }^{\prime }(\pm \beta )\). Equation (35) has solutions \(\left (\begin {array}{c} \bar {p}_{+}(y)\\ \bar {p}_{-}(y) \end {array} \right ) = \left (\begin {array}{c} A \\ B \end {array} \right ) \mathrm {e}^{\alpha y}\), with characteristic equation \(m^{2} \alpha ^{4}-\left (m^{2}+2m\right ) \alpha ^{2}=0\). The characteristic roots are *α* = 0,±*q*, where we define \(q = \sqrt {1+\frac {2}{m}}\). For *α* = 0, we have *A* = *B*, whereas for *α* = ±*q*, the symmetry \(\bar {p}_{+}(y)=\bar {p}_{-}(-y)\) implies *B* = (*m**q* − (*m* + 1))*A* for *α* = +*q* and *A* = (*m**q* − (*m* + 1))*B* for *α* = −*q*. Lastly, defining the sum \(\bar {p}_{s}(y)=\bar {p}_{+}(y)+\bar {p}_{-}(-y)\), we obtain

The no flux boundary conditions \(\bar {p}_{s}(\pm \beta )-\frac {\partial \bar {p}_{s}(\pm \beta )}{\partial y}=0\) along with the normalization requirement \({\int \limits }_{- \beta }^{\beta } \bar {p}_{s}(y) \mathrm {d} y = 1\) give explicit expressions for the constants

and
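The characteristic roots quoted above can be checked directly: *α* = 0, ±*q* with \(q = \sqrt{1+2/m}\) must annihilate the quartic \(m^{2}\alpha^{4}-(m^{2}+2m)\alpha^{2}\). A quick numerical verification for an arbitrary *m* > 0:

```python
import numpy as np

# Check that alpha = 0, ±q with q = sqrt(1 + 2/m) solve the
# characteristic equation m^2*alpha^4 - (m^2 + 2m)*alpha^2 = 0.
m = 1.7
q = np.sqrt(1.0 + 2.0 / m)
residuals = [m**2 * a**4 - (m**2 + 2.0 * m) * a**2 for a in (0.0, q, -q)]
```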

### Appendix G: Steady state solution of the clicks-task bounded accumulator model

Considering Eq. (23), we look for stationary solutions of the form \(\left (\begin {array}{c} \bar {p}_{+}(n)\\ \bar {p}_{-}(n) \end {array}\right )=\left (\begin {array}{c} C_{1}\\ C_{2} \end {array}\right ) \alpha ^{n}\), yielding the characteristic equation

where *r*_{s} = *r*_{+} + *r*_{−} + 1. Solving Eq. (36) gives *α* = 1 with eigenfunction *C*_{1} = *C*_{2} and two roots *α* = *q*_{±} of the quadratic \(\alpha ^{2}- \alpha (r_{+}+r_{+}^{2}+r_{-}+r_{-}^{2})/(r_{+} r_{-}) +1=0\). Superimposing the eigenfunctions, redefining constants, and defining \(\bar {p}_{s}(n)=\bar {p}_{+}(n)+\bar {p}_{-}(-n)\) gives the general solution

The constants *C*_{1}, *C*_{2}, and *C*_{3} can be determined by normalization \({\sum }_{n=-\beta }^{\beta }\bar {p}_{s}(n)=1\) and the stationary boundary conditions

Long term accuracy of the bounded accumulator is then determined by the weighted sum \(\text {Acc}_{\infty }(\beta ) = \frac {1}{2} \bar {p}_{s}(0) + {\sum }_{n=1}^{\beta } \bar {p}_{s}(n)\).
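A useful sanity check on the quadratic for the nontrivial roots: its constant term is 1, so *q*_{+} and *q*_{−} are reciprocals of one another, meaning the stationary density decays at matching rates away from the two boundaries. The rates *r*_{±} below are illustrative.

```python
import numpy as np

# The quadratic alpha^2 - alpha*(r_+ + r_+^2 + r_- + r_-^2)/(r_+ r_-) + 1 = 0
# has constant term 1, so its two roots q_± are reciprocal.
r_plus, r_minus = 2.0, 1.0
b = (r_plus + r_plus**2 + r_minus + r_minus**2) / (r_plus * r_minus)
q_roots = np.roots([1.0, -b, 1.0])
root_product = q_roots[0] * q_roots[1]
```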


## About this article

### Cite this article

Barendregt, N.W., Josić, K. & Kilpatrick, Z.P. Analyzing dynamic decision-making models using Chapman-Kolmogorov equations.
*J Comput Neurosci* **47**, 205–222 (2019). https://doi.org/10.1007/s10827-019-00733-5
