Efficient computation of the maximum a posteriori path and parameter estimation in integrate-and-fire and more general state-space models

Journal of Computational Neuroscience

Abstract

A number of important data analysis problems in neuroscience can be solved using state-space models. In this article, we describe fast methods for computing the exact maximum a posteriori (MAP) path of the hidden state variable in these models, given spike train observations. If the state transition density is log-concave and the observation model satisfies certain standard assumptions, then the optimization problem is strictly concave and can be solved rapidly with Newton–Raphson methods, because the Hessian of the loglikelihood is block tridiagonal. We can further exploit this block-tridiagonal structure to develop efficient parameter estimation methods for these models. We describe applications of this approach to neural decoding problems, with a focus on the classic integrate-and-fire model as a key example.
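
To make the computational claim above concrete, here is a minimal sketch (not the paper's code) of the Newton–Raphson MAP path computation for a scalar state-space model, assuming an AR(1) Gaussian prior and Poisson spike-count observations; the model, parameter values, and function names are illustrative choices for this sketch.

```python
# Sketch of the O(N) Newton-Raphson MAP path computation.
# Assumptions (ours, for illustration): scalar AR(1) prior
# x_i = a*x_{i-1} + N(0, q) with x_0 = 0, and Poisson spike counts
# y_i ~ Poisson(exp(x_i) * dt).
import numpy as np
from scipy.linalg import solveh_banded

def map_path(y, a=0.95, q=0.01, dt=0.001, n_iter=20):
    N = len(y)
    x = np.zeros(N)
    for _ in range(n_iter):
        lam = np.exp(x) * dt                      # conditional intensity
        x_lag = np.concatenate(([0.0], x[:-1]))   # x_{i-1}, with x_0 = 0
        # Gradient of the log posterior.
        grad = y - lam - (x - a * x_lag) / q
        grad[:-1] += a * (x[1:] - a * x[:-1]) / q
        # Negative Hessian is tridiagonal; store it in scipy's upper
        # banded form (row 0 = superdiagonal, row 1 = main diagonal).
        H = np.zeros((2, N))
        H[1] = lam + 1.0 / q
        H[1, :-1] += a**2 / q
        H[0, 1:] = -a / q
        x = x + solveh_banded(H, grad)            # O(N) Newton step
    return x
```

A full implementation would add a backtracking line search and a convergence test; the point of the sketch is that each Newton iteration costs O(N), because the (block-)tridiagonal Hessian admits a banded solve rather than a dense one.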

References

  • Abarbanel, H., Creveling, D., & Jeanne, J. (2008). Estimation of parameters in nonlinear systems using balanced synchronization. Physical Review E, 77, 016208.

  • Ahmadian, Y., Pillow, J., & Paninski, L. (2009). Efficient Markov chain Monte Carlo methods for decoding population spike trains. Neural Computation (in press).

  • Ahmed, N. U. (1998). Linear and nonlinear filtering for scientists and engineers. Singapore: World Scientific.

  • Asif, A., & Moura, J. (2005). Block matrices with l-block banded inverse: Inversion algorithms. IEEE Transactions on Signal Processing, 53, 630–642.

  • Badel, L., Richardson, M., & Gerstner, W. (2005). Dependence of the spike-triggered average voltage on membrane response properties. Neurocomputing, 69, 1062–1065.

  • Bell, B. M. (1994). The iterated Kalman smoother as Gauss-Newton method. SIAM Journal on Optimization, 4, 626–636.

  • Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.

  • Brockwell, A. E., Rojas, A. L., & Kass, R. E. (2004). Recursive Bayesian decoding of motor cortical signals by particle filtering. Journal of Neurophysiology, 91, 1899–1907.

  • Brown, E. N., Frank, L. M., Tang, D., Quirk, M. C., & Wilson, M. A. (1998). A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells. Journal of Neuroscience, 18, 7411–7425.

  • Davis, R. A., & Rodriguez-Yam, G. (2005). Estimation for state-space models based on a likelihood approximation. Statistica Sinica, 15, 381–406.

  • Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1–38.

  • Eden, U. T., Frank, L. M., Barbieri, R., Solo, V., & Brown, E. N. (2004). Dynamic analyses of neural encoding by point process adaptive filtering. Neural Computation, 16, 971–998.

  • Fahrmeir, L., & Kaufmann, H. (1991). On Kalman filtering, posterior mode estimation and fisher scoring in dynamic exponential family regression. Metrika, 38, 37–60.

  • Fahrmeir, L., & Tutz, G. (1994). Multivariate statistical modelling based on generalized linear models. New York: Springer.

  • Heskes, T., & Zoeter, O. (2002). Expectation propagation for approximate inference in dynamic Bayesian networks. In A. Darwiche & N. Friedman (Eds.), Uncertainty in artificial intelligence: Proceedings of the eighteenth conference (UAI-2002) (pp. 216–233). San Francisco: Morgan Kaufmann.

  • Huys, Q., Ahrens, M., & Paninski, L. (2006). Efficient estimation of detailed single-neuron models. Journal of Neurophysiology, 96, 872–890.

  • Izhikevich, E. M. (2007). Dynamical systems in neuroscience: The geometry of excitability and bursting. Cambridge: MIT Press.

  • Jungbacker, B., & Koopman, S. J. (2007). Monte Carlo estimation for nonlinear non-Gaussian state-space models. Biometrika, 94, 827–839.

  • Koyama, S., Shimokawa, T., & Shinomoto, S. (2007). Phase transitions in the estimation of event rate: A path integral analysis. Journal of Physics A: Mathematical and General, 40, F383–F390.

  • Koyama, S., & Shinomoto, S. (2005). Empirical Bayes interpretations for random point events. Journal of Physics A: Mathematical and General, 38, L531–L537.

  • Minka, T. (2001). Expectation propagation for approximate Bayesian inference. Uncertainty in Artificial Intelligence, 17.

  • Moehlis, J., Shea-Brown, E., & Rabitz, H. (2006). Optimal inputs for phase models of spiking neurons. ASME Journal of Computational and Nonlinear Dynamics, 1, 358–367.

  • Olsson, R. K., Petersen, K. B., & Lehn-Schioler, T. (2007). State-space models: From the EM algorithm to a gradient approach. Neural Computation, 19, 1097–1111.

  • Paninski, L. (2004). Maximum likelihood estimation of cascade point-process neural encoding models. Network: Computation in Neural Systems, 15, 243–262.

  • Paninski, L. (2005). Log-concavity results on Gaussian process methods for supervised and unsupervised learning. Advances in Neural Information Processing Systems, 17, 1025–1032.

  • Paninski, L. (2006a). The most likely voltage path and large deviations approximations for integrate-and-fire neurons. Journal of Computational Neuroscience, 21, 71–87.

  • Paninski, L. (2006b). The spike-triggered average of the integrate-and-fire cell driven by Gaussian white noise. Neural Computation, 18, 2592–2616.

  • Paninski, L., Pillow, J., & Simoncelli, E. (2004). Maximum likelihood estimation of a stochastic integrate-and-fire neural model. Neural Computation, 16, 2533–2561.

  • Paninski, L., Brown, E. N., Iyengar, S., & Kass, R. E. (2008). Statistical analysis of neuronal data via integrate-and-fire models. In Stochastic methods in neuroscience. Oxford: Oxford University Press.

  • Pillow, J., Ahmadian, Y., & Paninski, L. (2009). Model-based decoding, information estimation, and change-point detection in multi-neuron spike trains. Neural Computation (in press).

  • Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (1992). Numerical recipes in C. Cambridge: Cambridge University Press.

  • Rabiner, L. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77, 257–286.

  • Rasmussen, C. E., & Williams, C. K. I. (2006). Gaussian processes for machine learning. Cambridge: MIT Press.

  • Roweis, S., & Ghahramani, Z. (1999). A unifying review of linear Gaussian models. Neural Computation, 11, 305–345.

  • Rybicki, G., & Hummer, D. (1991). An accelerated lambda iteration method for multilevel radiative transfer, Appendix B: Fast solution for the diagonal elements of the inverse of a tridiagonal matrix. Astronomy and Astrophysics, 245, 171.

  • Rybicki, G. B., & Press, W. H. (1995). Class of fast methods for processing irregularly sampled or otherwise inhomogeneous one-dimensional data. Physical Review Letters, 74(7), 1060–1063. doi:10.1103/PhysRevLett.74.1060.

  • Salakhutdinov, R., Roweis, S. T., & Ghahramani, Z. (2003). Optimization with EM and expectation-conjugate-gradient. International Conference on Machine Learning, 20, 672–679.

  • Smith, A. C., & Brown, E. N. (2003). Estimating a state-space model from point process observations. Neural Computation, 15, 965–991.

  • Snyder, D. L. (1975). Random point processes. New York: Wiley.

  • Tierney, L., Kass, R. E., & Kadane, J. B. (1989). Fully exponential Laplace approximation to posterior expectations and variances. Journal of the American Statistical Association, 84, 710–716.

  • West, M., Harrison, J. P., & Migon, H. S. (1985). Dynamic generalized linear models and Bayesian forecasting. Journal of the American Statistical Association, 80, 73–83.

  • Ypma, A., & Heskes, T. (2005). Novel approximations for inference in nonlinear dynamical systems using expectation propagation. Neurocomputing, 69, 85–99. doi:10.1016/j.neucom.2005.02.020.

  • Yu, B. M., Shenoy, K. V., & Sahani, M. (2006). Expectation propagation for inference in non-linear dynamical models with Poisson observations. In Proceedings of the nonlinear statistical signal processing workshop. Piscataway: IEEE.

Acknowledgements

We thank Y. Ahmadian, R. Kass, M. Nikitchenko, K. Rahnama Rad, M. Vidne and J. Vogelstein for helpful conversations and comments. SK is supported by NIH grants R01 MH064537, R01 EB005847 and R01 NS050256. LP is supported by NIH grant R01 EY018003, an NSF CAREER award, and a McKnight Scholar award.

Author information

Corresponding author

Correspondence to Shinsuke Koyama.

Additional information

Action Editor: Nicolas Brunel

Appendices

Appendix A: Point process filter and smoother

A simple version of the point process filter approximates the filtered distribution Eq. (26) by a Gaussian centered at its mode (Brown et al. 1998). Let \(x_{i|i}\) and \(V_{i|i}\) be the (approximate) mode and covariance matrix of the filtered distribution Eq. (26), and \(x_{i|i-1}\) and \(V_{i|i-1}\) be the mode and covariance matrix of the predictive distribution Eq. (27) at time i. Let \(l(x_i) = \log\{ p(y_i|x_i)\, p(x_i|y_{1:i-1}) \}\). The filtered distribution is then approximated by a Gaussian whose mean and covariance are \(x_{i|i} = \arg\max_{x_i}l(x_i)\) and \(V_{i|i} = -[\nabla\nabla_{x_i} l(x_{i|i})]^{-1}\), respectively. When the state-transition density is linear and Gaussian, \(p(x_i|x_{i-1}) = \mathcal{N}(F_i x_{i-1}, Q_i)\), the predictive distribution Eq. (27) is also Gaussian, with mean and covariance given by

$$ x_{i|i-1} = F_i x_{i-1|i-1}, $$
(67)
$$ V_{i|i-1} = F_i V_{i-1|i-1} F_i^T + Q_i. $$
(68)
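
As a concrete illustration, here is one predict/update cycle of this filter in the scalar case; the Poisson observation model with intensity \(\exp(x_i)\,dt\), and all names and default values, are our assumptions for the sketch, not the paper's prescription.

```python
# One predict/update step of the point-process filter: prediction by
# Eqs. (67)-(68), update by Newton mode-finding and the curvature
# formula V_{i|i} = -[l''(x_{i|i})]^{-1} (scalar case).
import numpy as np

def filter_step(x_prev, V_prev, y, F=0.95, Q=0.01, dt=0.001, n_newton=10):
    x_pred = F * x_prev                  # Eq. (67)
    V_pred = F * V_prev * F + Q          # Eq. (68)
    x = x_pred
    for _ in range(n_newton):
        lam = np.exp(x) * dt             # assumed Poisson intensity
        grad = y - lam - (x - x_pred) / V_pred   # l'(x)
        curv = -lam - 1.0 / V_pred               # l''(x) < 0
        x -= grad / curv                 # Newton step toward the mode
    V = 1.0 / (np.exp(x) * dt + 1.0 / V_pred)    # V_{i|i} = -1/l''(x_{i|i})
    return x, V, x_pred, V_pred
```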

Since the filtered and predictive distributions are Gaussian, the smoothing distribution Eq. (28) is also Gaussian, and can be computed by the standard Kalman smoother (Smith and Brown 2003). Let \(x_{i|N}\) and \(V_{i|N}\) be the mean and covariance of the smoothing distribution at time i. The recursive smoothing equations corresponding to Eq. (28) are given by

$$ x_{i|N} = x_{i|i} + V_{i|i} F_{i+1}^T V_{i+1|i}^{-1}\left(x_{i+1|N}-x_{i+1|i}\right), $$
(69)
$$ V_{i|N} = V_{i|i} + V_{i|i} F_{i+1}^T V_{i+1|i}^{-1}\left(V_{i+1|N}-V_{i+1|i}\right) V_{i+1|i}^{-1} F_{i+1} V_{i|i}. $$
(70)
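
A matching sketch of the backward pass, Eqs. (69)–(70), in the scalar case (where the transposes drop out); it consumes the filtered and one-step-predicted moments returned by a forward pass such as the filter_step sketch above.

```python
import numpy as np

def smooth(x_filt, V_filt, x_pred, V_pred, F=0.95):
    """Backward Kalman recursion, Eqs. (69)-(70); x_pred[i], V_pred[i]
    hold the moments of p(x_{i+1} | y_{1:i}) from the forward pass."""
    x_s, V_s = np.copy(x_filt), np.copy(V_filt)
    for i in range(len(x_filt) - 2, -1, -1):
        J = V_filt[i] * F / V_pred[i]    # smoother gain
        x_s[i] = x_filt[i] + J * (x_s[i + 1] - x_pred[i])
        V_s[i] = V_filt[i] + J * (V_s[i + 1] - V_pred[i]) * J
    return x_s, V_s
```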

There are now several versions of the point process filter, depending on how the mean and variance of the approximate filtered distribution are chosen. In Eden et al. (2004), the filtered distribution at each time step i is approximated by a Gaussian obtained by expanding its logarithm in a Taylor series about \(x_{i|i-1}\) up to the second-order term, which results in a simpler algorithm. Koyama et al. (unpublished manuscript) proposed a more sophisticated method based on the fully exponential Laplace approximation (Tierney et al. 1989), which achieves second-order accuracy in approximating the posterior expectation.

For the leaky IF model with hard threshold, the standard Taylor-series-based recursions (Brown et al. 1998) do not apply (due to the discontinuity of \(\log p(y_i|x_i)\)), and therefore we have not included comparisons to the point-process smoother in Figs. 4–6. However, it is worth noting that in this case the filtered distribution Eq. (26) can be approximated recursively by a truncated Gaussian defined on \((-\infty, x_{th}]\), so the approximate mean and variance can be obtained analytically; we found that this moment-matching method behaves similarly to the EP method, which is unsurprising, since EP is also based on a moment-matching procedure (data not shown).
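
For reference, the analytic moments of that truncated Gaussian follow from the standard truncated-normal formulas; a minimal sketch (the threshold name x_th and the function name are ours):

```python
import numpy as np
from scipy.stats import norm

def truncated_moments(m, v, x_th):
    """Mean and variance of N(m, v) restricted to (-inf, x_th]."""
    s = np.sqrt(v)
    beta = (x_th - m) / s
    h = norm.pdf(beta) / norm.cdf(beta)  # assumes cdf not vanishingly small
    mean = m - s * h
    var = v * (1.0 - beta * h - h * h)
    return mean, var
```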

Appendix B: Gaussian quadrature in EP algorithm

The expectation of a function \(f(x_i)\) of \(x_i\), with respect to \(p(x_i|y_{1:N})\) in Eq. (38), can be expressed as

$$ \begin{array}{rcl} E_i[f(x_i)] &=& \int\int f(x_i) p(x_{i-1},x_i|y_{1:N}) dx_{i-1}dx_i \\ &\propto& \int f(x_i)p(y_i|x_i)\beta_i(x_i) \\&& \times \bigg[ \int \alpha_{i-1}(x_{i-1})p(x_i|x_{i-1})dx_{i-1} \bigg]dx_i \\ &\equiv& \int f(x_i) p(y_i|x_i)\beta_i(x_i) g(x_i)dx_i, \end{array} $$
(71)

where

$$ g(x_i) = \int \alpha_{i-1}(x_{i-1})p(x_i|x_{i-1})dx_{i-1} $$
(72)

is Gaussian, since \(\alpha_{i-1}(x_{i-1})\) and \(p(x_i|x_{i-1})\) are both Gaussian. Introducing the Laplace approximation, \(p_L(x_i)\equiv \mathcal{N}(m, v) \approx p(y_i|x_i)\beta_i(x_i) g(x_i)\), as a proposal distribution, the expectation can be written as

$$ \begin{array}{rcl} E_i[f(x_i)] &\propto& \int p_L(x_i) \bigg[\frac{f(x_i) p(y_i|x_i)\beta_i(x_i) g(x_i)}{p_L(x_i)}\bigg]dx_i \\ &\equiv& \int p_L(x_i)F(x_i)dx_i. \end{array} $$
(73)

After a linear change of variable, \(x_i=\sqrt{v}u+m\), we have the standard form of the Gauss-Hermite quadrature,

$$ \begin{array}{rcl} E_i[f(x_i)] &\propto& \int e^{-\frac{u^2}{2}} F(\sqrt{v}u+m) du \\ &\approx& \sum\limits_{l=1}^nw_l F(\sqrt{v}u_l+m), \end{array} $$
(74)

where the weights \(w_l\) and evaluation points \(u_l\) are chosen according to the quadrature rule. The advantage of this method is that, once the weights and evaluation points have been calculated (which needs to be done only once), each expectation requires only an inner product. The expectation of \(f(x_{i-1})\) with respect to \(p(x_{i-1}|y_{1:N})\) in Eq. (39) can be computed in the same way.
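
A sketch of this quadrature step, using numpy's probabilists' Hermite rule, whose weight function \(e^{-u^2/2}\) matches Eq. (74); the integrand F and the Laplace moments m, v are as defined above, and F is assumed vectorized.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gh_expectation(F, m, v, n=20):
    """Approximate the integral of exp(-u^2/2) F(sqrt(v) u + m) du,
    up to the constant absorbed by the proportionality in Eq. (74)."""
    u, w = hermegauss(n)   # nodes and weights; compute once, then reuse
    return np.dot(w, F(np.sqrt(v) * u + m))  # divide by w.sum() (= sqrt(2*pi)) to normalize
```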

For the leaky IF model with hard threshold, the observation model is a step function, and thus the integral in Eq. (73) becomes

$$ \int_{-\infty}^{\infty}p(y_i=0|x_i)\ldots dx_i = \int_{-\infty}^{x_{th}}\ldots dx_i. $$
(75)

As a result, the expectation in Eq. (73) reduces to an integral over a truncated Gaussian, which can be computed analytically.

About this article

Cite this article

Koyama, S., Paninski, L. Efficient computation of the maximum a posteriori path and parameter estimation in integrate-and-fire and more general state-space models. J Comput Neurosci 29, 89–105 (2010). https://doi.org/10.1007/s10827-009-0150-x
