Efficient computation of the maximum a posteriori path and parameter estimation in integrate-and-fire and more general state-space models

Journal of Computational Neuroscience

Abstract

A number of important data analysis problems in neuroscience can be solved using state-space models. In this article, we describe fast methods for computing the exact maximum a posteriori (MAP) path of the hidden state variable in these models, given spike train observations. If the state transition density is log-concave and the observation model satisfies certain standard assumptions, then the optimization problem is strictly concave and can be solved rapidly with Newton–Raphson methods, because the Hessian of the loglikelihood is block tridiagonal. We can further exploit this block-tridiagonal structure to develop efficient parameter estimation methods for these models. We describe applications of this approach to neural decoding problems, with a focus on the classic integrate-and-fire model as a key example.
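
To make the computational claim above concrete, here is a minimal sketch (not the paper's code) of the Newton–Raphson MAP path computation for a scalar state-space model, assuming an AR(1) Gaussian prior and Poisson spike-count observations; the model, parameter values, and function names are illustrative choices for this sketch.

```python
# Sketch of the O(N) Newton-Raphson MAP path computation.
# Assumptions (ours, for illustration): scalar AR(1) prior
# x_i = a*x_{i-1} + N(0, q) with x_0 = 0, and Poisson spike counts
# y_i ~ Poisson(exp(x_i) * dt).
import numpy as np
from scipy.linalg import solveh_banded

def map_path(y, a=0.95, q=0.01, dt=0.001, n_iter=20):
    N = len(y)
    x = np.zeros(N)
    for _ in range(n_iter):
        lam = np.exp(x) * dt                      # conditional intensity
        x_lag = np.concatenate(([0.0], x[:-1]))   # x_{i-1}, with x_0 = 0
        # Gradient of the log posterior.
        grad = y - lam - (x - a * x_lag) / q
        grad[:-1] += a * (x[1:] - a * x[:-1]) / q
        # Negative Hessian is tridiagonal; store it in scipy's upper
        # banded form (row 0 = superdiagonal, row 1 = main diagonal).
        H = np.zeros((2, N))
        H[1] = lam + 1.0 / q
        H[1, :-1] += a**2 / q
        H[0, 1:] = -a / q
        x = x + solveh_banded(H, grad)            # O(N) Newton step
    return x
```

A full implementation would add a backtracking line search and a convergence test; the point of the sketch is that each Newton iteration costs O(N), because the (block-)tridiagonal Hessian admits a banded solve rather than a dense one.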

References

  • Abarbanel, H., Creveling, D., & Jeanne, J. (2008). Estimation of parameters in nonlinear systems using balanced synchronization. Physical Review E, 77, 016208.

  • Ahmadian, Y., Pillow, J., & Paninski, L. (2009). Efficient Markov chain Monte Carlo methods for decoding population spike trains. Neural Computation (in press).

  • Ahmed, N. U. (1998). Linear and nonlinear filtering for scientists and engineers. Singapore: World Scientific.

  • Asif, A., & Moura, J. (2005). Block matrices with l-block banded inverse: Inversion algorithms. IEEE Transactions on Signal Processing, 53, 630–642.

  • Badel, L., Richardson, M., & Gerstner, W. (2005). Dependence of the spike-triggered average voltage on membrane response properties. Neurocomputing, 69, 1062–1065.

  • Bell, B. M. (1994). The iterated Kalman smoother as Gauss-Newton method. SIAM Journal on Optimization, 4, 626–636.

  • Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.

  • Brockwell, A. E., Rojas, A. L., & Kass, R. E. (2004). Recursive Bayesian decoding of motor cortical signals by particle filtering. Journal of Neurophysiology, 91, 1899–1907.

  • Brown, E. N., Frank, L. M., Tang, D., Quirk, M. C., & Wilson, M. A. (1998). A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells. Journal of Neuroscience, 18, 7411–7425.

  • Davis, R. A., & Rodriguez-Yam, G. (2005). Estimation for state-space models based on a likelihood approximation. Statistica Sinica, 15, 381–406.

  • Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1–38.

  • Eden, U. T., Frank, L. M., Barbieri, R., Solo, V., & Brown, E. N. (2004). Dynamic analyses of neural encoding by point process adaptive filtering. Neural Computation, 16, 971–998.

  • Fahrmeir, L., & Kaufmann, H. (1991). On Kalman filtering, posterior mode estimation and fisher scoring in dynamic exponential family regression. Metrika, 38, 37–60.

  • Fahrmeir, L., & Tutz, G. (1994). Multivariate statistical modelling based on generalized linear models. New York: Springer.

  • Heskes, T., & Zoeter, O. (2002). Expectation propagation for approximate inference in dynamic Bayesian networks. In A. Darwiche & N. Friedman (Eds.), Uncertainty in artificial intelligence: Proceedings of the eighteenth conference (UAI-2002) (pp. 216–233). San Francisco: Morgan Kaufmann.

  • Huys, Q., Ahrens, M., & Paninski, L. (2006). Efficient estimation of detailed single-neuron models. Journal of Neurophysiology, 96, 872–890.

  • Izhikevich, E. M. (2007). Dynamical systems in neuroscience: The geometry of excitability and bursting. Cambridge: MIT Press.

  • Jungbacker, B., & Koopman, S. J. (2007). Monte Carlo estimation for nonlinear non-Gaussian state-space models. Biometrika, 94, 827–839.

  • Koyama, S., Shimokawa, T., & Shinomoto, S. (2007). Phase transitions in the estimation of event rate: A path integral analysis. Journal of Physics A: Mathematical and General, 40, F383–F390.

  • Koyama, S., & Shinomoto, S. (2005). Empirical Bayes interpretations for random point events. Journal of Physics A: Mathematical and General, 38, L531–L537.

  • Minka, T. (2001). Expectation propagation for approximate Bayesian inference. Uncertainty in Artificial Intelligence, 17.

  • Moehlis, J., Shea-Brown, E., & Rabitz, H. (2006). Optimal inputs for phase models of spiking neurons. ASME Journal of Computational and Nonlinear Dynamics, 1, 358–367.

  • Olsson, R. K., Petersen, K. B., & Lehn-Schioler, T. (2007). State-space models: From the EM algorithm to a gradient approach. Neural Computation, 19, 1097–1111.

  • Paninski, L. (2004). Maximum likelihood estimation of cascade point-process neural encoding models. Network: Computation in Neural Systems, 15, 243–262.

  • Paninski, L. (2005). Log-concavity results on Gaussian process methods for supervised and unsupervised learning. Advances in Neural Information Processing Systems, 17, 1025–1032.

  • Paninski, L. (2006a). The most likely voltage path and large deviations approximations for integrate-and-fire neurons. Journal of Computational Neuroscience, 21, 71–87.

  • Paninski, L. (2006b). The spike-triggered average of the integrate-and-fire cell driven by Gaussian white noise. Neural Computation, 18, 2592–2616.

  • Paninski, L., Pillow, J., & Simoncelli, E. (2004). Maximum likelihood estimation of a stochastic integrate-and-fire neural model. Neural Computation, 16, 2533–2561.

  • Paninski, L., Brown, E. N., Iyengar, S., & Kass, R. E. (2008). Statistical analysis of neuronal data via integrate-and-fire models. In Stochastic methods in neuroscience. Oxford: Oxford University Press.

  • Pillow, J., Ahmadian, Y., & Paninski, L. (2009). Model-based decoding, information estimation, and change-point detection in multi-neuron spike trains. Neural Computation (in press).

  • Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (1992). Numerical recipes in C. Cambridge: Cambridge University Press.

  • Rabiner, L. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77, 257–286.

  • Rasmussen, C. E., & Williams, C. K. I. (2006). Gaussian processes for machine learning. Cambridge: MIT Press.

  • Roweis, S., & Ghahramani, Z. (1999). A unifying review of linear Gaussian models. Neural Computation, 11, 305–345.

  • Rybicki, G., & Hummer, D. (1991). An accelerated lambda iteration method for multilevel radiative transfer, Appendix B: Fast solution for the diagonal elements of the inverse of a tridiagonal matrix. Astronomy and Astrophysics, 245, 171.

  • Rybicki, G. B., & Press, W. H. (1995). Class of fast methods for processing irregularly sampled or otherwise inhomogeneous one-dimensional data. Physical Review Letters, 74(7), 1060–1063. doi:10.1103/PhysRevLett.74.1060.

  • Salakhutdinov, R., Roweis, S. T., & Ghahramani, Z. (2003). Optimization with EM and expectation-conjugate-gradient. International Conference on Machine Learning, 20, 672–679.

  • Smith, A. C., & Brown, E. N. (2003). Estimating a state-space model from point process observations. Neural Computation, 15, 965–991.

  • Snyder, D. L. (1975). Random point processes. New York: Wiley.

  • Tierney, L., Kass, R. E., & Kadane, J. B. (1989). Fully exponential Laplace approximation to posterior expectations and variances. Journal of the American Statistical Association, 84, 710–716.

  • West, M., Harrison, J. P., & Migon, H. S. (1985). Dynamic generalized linear models and Bayesian forecasting. Journal of the American Statistical Association, 80, 73–83.

  • Ypma, A., & Heskes, T. (2005). Novel approximations for inference in nonlinear dynamical systems using expectation propagation. Neurocomputing, 69, 85–99. doi:10.1016/j.neucom.2005.02.020.

  • Yu, B. M., Shenoy, K. V., & Sahani, M. (2006). Expectation propagation for inference in non-linear dynamical models with Poisson observations. In Proceedings of the nonlinear statistical signal processing workshop. Piscataway: IEEE.

Acknowledgements

We thank Y. Ahmadian, R. Kass, M. Nikitchenko, K. Rahnama Rad, M. Vidne and J. Vogelstein for helpful conversations and comments. SK is supported by NIH grants R01 MH064537, R01 EB005847 and R01 NS050256. LP is supported by NIH grant R01 EY018003, an NSF CAREER award, and a McKnight Scholar award.

Author information

Corresponding author

Correspondence to Shinsuke Koyama.

Additional information

Action Editor: Nicolas Brunel

Appendices

Appendix A: Point process filter and smoother

A simple version of the point process filter approximates the filtered distribution Eq. (26) by a Gaussian centered at its mode (Brown et al. 1998). Let \(x_{i|i}\) and \(V_{i|i}\) be the (approximate) mode and covariance matrix of the filtered distribution Eq. (26), and \(x_{i|i-1}\) and \(V_{i|i-1}\) be the mode and covariance matrix of the predictive distribution Eq. (27) at time i. Let \(l(x_i) = \log\{ p(y_i|x_i)\, p(x_i|y_{1:i-1}) \}\). The filtered distribution is then approximated by a Gaussian whose mean and covariance are \(x_{i|i} = \arg\max_{x_i}l(x_i)\) and \(V_{i|i} = -[\nabla\nabla_{x_i} l(x_{i|i})]^{-1}\), respectively. When the state-transition density is linear and Gaussian, \(p(x_i|x_{i-1}) = \mathcal{N}(F_i x_{i-1}, Q_i)\), the predictive distribution Eq. (27) is also Gaussian, with mean and covariance given by

$$ x_{i|i-1} = F_i x_{i-1|i-1}, $$
(67)
$$ V_{i|i-1} = F_i V_{i-1|i-1} F_i^T + Q_i. $$
(68)
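
As a concrete illustration, here is one predict/update cycle of this filter in the scalar case; the Poisson observation model with intensity \(\exp(x_i)\,dt\), and all names and default values, are our assumptions for the sketch, not the paper's prescription.

```python
# One predict/update step of the point-process filter: prediction by
# Eqs. (67)-(68), update by Newton mode-finding and the curvature
# formula V_{i|i} = -[l''(x_{i|i})]^{-1} (scalar case).
import numpy as np

def filter_step(x_prev, V_prev, y, F=0.95, Q=0.01, dt=0.001, n_newton=10):
    x_pred = F * x_prev                  # Eq. (67)
    V_pred = F * V_prev * F + Q          # Eq. (68)
    x = x_pred
    for _ in range(n_newton):
        lam = np.exp(x) * dt             # assumed Poisson intensity
        grad = y - lam - (x - x_pred) / V_pred   # l'(x)
        curv = -lam - 1.0 / V_pred               # l''(x) < 0
        x -= grad / curv                 # Newton step toward the mode
    V = 1.0 / (np.exp(x) * dt + 1.0 / V_pred)    # V_{i|i} = -1/l''(x_{i|i})
    return x, V, x_pred, V_pred
```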

Since the filtered and predictive distributions are Gaussian, the smoothing distribution Eq. (28) is also Gaussian, and can be computed by the standard Kalman smoother (Smith and Brown 2003). Let \(x_{i|N}\) and \(V_{i|N}\) be the mean and covariance of the smoothing distribution at time i. The recursive smoothing equations corresponding to Eq. (28) are given by

$$ x_{i|N} = x_{i|i} + V_{i|i} F_{i+1}^T V_{i+1|i}^{-1}\left(x_{i+1|N}-x_{i+1|i}\right), $$
(69)
$$ V_{i|N} = V_{i|i} + V_{i|i} F_{i+1}^T V_{i+1|i}^{-1}\left(V_{i+1|N}-V_{i+1|i}\right) V_{i+1|i}^{-1} F_{i+1} V_{i|i}. $$
(70)
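
A matching sketch of the backward pass, Eqs. (69)–(70), in the scalar case (where the transposes drop out); it consumes the filtered and one-step-predicted moments returned by a forward pass such as the filter_step sketch above.

```python
import numpy as np

def smooth(x_filt, V_filt, x_pred, V_pred, F=0.95):
    """Backward Kalman recursion, Eqs. (69)-(70); x_pred[i], V_pred[i]
    hold the moments of p(x_{i+1} | y_{1:i}) from the forward pass."""
    x_s, V_s = np.copy(x_filt), np.copy(V_filt)
    for i in range(len(x_filt) - 2, -1, -1):
        J = V_filt[i] * F / V_pred[i]    # smoother gain
        x_s[i] = x_filt[i] + J * (x_s[i + 1] - x_pred[i])
        V_s[i] = V_filt[i] + J * (V_s[i + 1] - V_pred[i]) * J
    return x_s, V_s
```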

There are now several versions of the point process filter, depending on how the mean and variance of the approximate filtered distribution are chosen. In Eden et al. (2004), the filtered distribution at each time step i is approximated by a Gaussian obtained by expanding its logarithm in a Taylor series about \(x_{i|i-1}\) up to the second-order term, which results in a simpler algorithm. Koyama et al. (unpublished manuscript) proposed a more sophisticated method based on the fully exponential Laplace approximation (Tierney et al. 1989), which achieves second-order accuracy in approximating the posterior expectation.

For the leaky IF model with hard threshold, the standard Taylor-series-based recursions (Brown et al. 1998) do not apply (due to the discontinuity of \(\log p(y_i|x_i)\)), and therefore we have not included comparisons to the point-process smoother in Figs. 4–6. However, it is worth noting that in this case the filtered distribution Eq. (26) can be approximated recursively by a truncated Gaussian defined on \((-\infty, x_{th}]\), so the approximate mean and variance can be obtained analytically; we found that this moment-matching method behaves similarly to the EP method, which is unsurprising, since EP is also based on a moment-matching procedure (data not shown).
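
For reference, the analytic moments of that truncated Gaussian follow from the standard truncated-normal formulas; a minimal sketch (the threshold name x_th and the function name are ours):

```python
import numpy as np
from scipy.stats import norm

def truncated_moments(m, v, x_th):
    """Mean and variance of N(m, v) restricted to (-inf, x_th]."""
    s = np.sqrt(v)
    beta = (x_th - m) / s
    h = norm.pdf(beta) / norm.cdf(beta)  # assumes cdf not vanishingly small
    mean = m - s * h
    var = v * (1.0 - beta * h - h * h)
    return mean, var
```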

Appendix B: Gaussian quadrature in EP algorithm

The expectation of a function \(f(x_i)\) of \(x_i\), with respect to \(p(x_i|y_{1:N})\) in Eq. (38), can be expressed as

$$ \begin{array}{rcl} E_i[f(x_i)] &=& \int\int f(x_i) p(x_{i-1},x_i|y_{1:N}) dx_{i-1}dx_i \\ &\propto& \int f(x_i)p(y_i|x_i)\beta_i(x_i) \\&& \times \bigg[ \int \alpha_{i-1}(x_{i-1})p(x_i|x_{i-1})dx_{i-1} \bigg]dx_i \\ &\equiv& \int f(x_i) p(y_i|x_i)\beta_i(x_i) g(x_i)dx_i, \end{array} $$
(71)

where

$$ g(x_i) = \int \alpha_{i-1}(x_{i-1})p(x_i|x_{i-1})dx_{i-1} $$
(72)

is Gaussian, since \(\alpha_{i-1}(x_{i-1})\) and \(p(x_i|x_{i-1})\) are both Gaussian. Introducing the Laplace approximation, \(p_L(x_i)\equiv \mathcal{N}(m, v) \approx p(y_i|x_i)\beta_i(x_i) g(x_i)\), as a proposal distribution, the expectation can be written as

$$ \begin{array}{rcl} E_i[f(x_i)] &\propto& \int p_L(x_i) \bigg[\frac{f(x_i) p(y_i|x_i)\beta_i(x_i) g(x_i)}{p_L(x_i)}\bigg]dx_i \\ &\equiv& \int p_L(x_i)F(x_i)dx_i. \end{array} $$
(73)

After a linear change of variable, \(x_i=\sqrt{v}u+m\), we have the standard form of the Gauss-Hermite quadrature,

$$ \begin{array}{rcl} E_i[f(x_i)] &\propto& \int e^{-\frac{u^2}{2}} F(\sqrt{v}u+m) du \\ &\approx& \sum\limits_{l=1}^nw_l F(\sqrt{v}u_l+m), \end{array} $$
(74)

where the weights \(w_l\) and evaluation points \(u_l\) are chosen according to the quadrature rule. The advantage of this method is that, once the weights and evaluation points have been calculated (which needs to be done only once), each expectation requires only an inner product. The expectation of \(f(x_{i-1})\) with respect to \(p(x_{i-1}|y_{1:N})\) in Eq. (39) can be computed in the same way.
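
A sketch of this quadrature step, using numpy's probabilists' Hermite rule, whose weight function \(e^{-u^2/2}\) matches Eq. (74); the integrand F and the Laplace moments m, v are as defined above, and F is assumed vectorized.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gh_expectation(F, m, v, n=20):
    """Approximate the integral of exp(-u^2/2) F(sqrt(v) u + m) du,
    up to the constant absorbed by the proportionality in Eq. (74)."""
    u, w = hermegauss(n)   # nodes and weights; compute once, then reuse
    return np.dot(w, F(np.sqrt(v) * u + m))  # divide by w.sum() (= sqrt(2*pi)) to normalize
```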

For the leaky IF model with hard threshold, the observation model is a step function, and thus the integral in Eq. (73) becomes

$$ \int_{-\infty}^{\infty}p(y_i=0|x_i)\ldots dx_i = \int_{-\infty}^{x_{th}}\ldots dx_i. $$
(75)

As a result, the expectation in Eq. (73) reduces to an integral over a truncated Gaussian, which can be computed analytically.

About this article

Cite this article

Koyama, S., Paninski, L. Efficient computation of the maximum a posteriori path and parameter estimation in integrate-and-fire and more general state-space models. J Comput Neurosci 29, 89–105 (2010). https://doi.org/10.1007/s10827-009-0150-x
