Abstract
We introduce an implicit method for state and parameter estimation and apply it to a stochastic ecological model. The method uses an ensemble of particles to approximate the distribution of model solutions and parameters conditioned on noisy observations of the state. For each particle, it first determines likely values based on the observations, then samples around those values. This approach has a strong theoretical foundation, applies to nonlinear models and non-Gaussian distributions, and can estimate any number of model parameters, initial conditions, and model error covariances. The method is called implicit because it updates the particles without forming a predictive distribution of forward model integrations. As a point of comparison for different assimilation techniques, we consider examples in which one or more bifurcations separate the true parameter from its initial approximation. The implicit estimator is asymptotically unbiased, has a root-mean-squared error comparable to or less than the other methods, and is accurate even with small ensemble sizes.
References
Andrieu, C., Doucet, A., & Holenstein, R. (2010). Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B Stat. Methodol., 72, 269–342.
de Angelis, D. L. (1975). Estimates of predator-prey limit cycles. Bull. Math. Biol., 37, 291–299.
Arnold, L. (1998). Random dynamical systems. Berlin: Springer.
Arulampalam, M. S., Maskell, S., Gordon, N., & Clapp, T. (2002). A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans. Signal Process., 50, 174–188.
Brauer, F., & Castillo-Chavez, C. (2012). Mathematical models in population biology and epidemiology (2nd ed.). New York: Springer.
Chorin, A. J., & Hald, O. H. (2009). Stochastic tools in mathematics and science (2nd ed.). Dordrecht: Springer.
Chorin, A. J., & Tu, X. (2009). Implicit sampling for particle filters. Proc. Natl. Acad. Sci. USA, 106, 17,249–17,254.
Chorin, A. J., Morzfeld, M., & Tu, X. (2010). Implicit particle filters for data assimilation. Commun. Appl. Math. Comput. Sci., 5, 221–240.
Conn, A. R., Gould, N. I. M., & Toint, P. L. (2000). Trust-region methods. Philadelphia: SIAM.
Cossarini, G., Lermusiaux, P. F. J., & Solidoro, C. (2009). Lagoon of Venice ecosystem: seasonal dynamics and environmental guidance with uncertainty analyses and error subspace data assimilation. J. Geophys. Res., 114. doi:10.1029/2008JC005080.
Courtier, P., & Talagrand, O. (1987). Variational assimilation of meteorological observations with the adjoint vorticity equation. II: Numerical results. Q. J. R. Meteorol. Soc., 113, 1329–1347.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B Stat. Methodol., 39, 1–38.
Doron, M., Brasseur, P., & Brankart, J.-M. (2011). Stochastic estimation of biogeochemical parameters of a 3D ocean coupled physical-biogeochemical model: twin experiments. J. Mar. Syst., 87, 194–207.
Doucet, A., Godsill, S., & Andrieu, C. (2000). On sequential Monte Carlo sampling methods for Bayesian filtering. Statist. Comput., 10, 197–208.
Doucet, A., de Freitas, N., & Gordon, N. (2001). Sequential Monte Carlo methods in practice. New York: Springer.
Dowd, M. (2006). A sequential Monte Carlo approach for marine ecological prediction. Environmetrics, 17, 435–455.
Dowd, M. (2007). Bayesian statistical data assimilation for ecosystem models using Markov chain Monte Carlo. J. Mar. Syst., 68, 439–456.
Dowd, M. (2011). Estimating parameters for a stochastic dynamic marine ecological system. Environmetrics, 22, 501–515.
Efron, B. (1979). Bootstrap methods: another look at the jackknife. Ann. Statist., 7, 1–26.
Efron, B. (2003). Second thoughts on the bootstrap. Statist. Sci., 18, 135–140.
Evensen, G. (2009). Data assimilation: the ensemble Kalman filter (2nd ed.). Dordrecht: Springer.
Fasham, M. J. R., Ducklow, H. W., & McKelvie, S. M. (1990). A nitrogen-based model of plankton dynamics in the oceanic mixed layer. J. Mar. Res., 48, 591–639.
Fletcher, R. (1987). Practical methods of optimization (2nd ed.). Chichester: Wiley.
Friedrichs, M. A. M. (2002). Assimilation of JGOFS EqPac and SeaWIFS data into a marine ecosystem model of the central equatorial Pacific Ocean. Deep-Sea Res., Part 2, 49, 289–319.
Friedrichs, M. A. M., Dusenberry, J. A., Anderson, L. A., Armstrong, R. A., Chai, F., Christian, J. R., Doney, S. C., Dunne, J., Fujii, M., Hood, R., McGillicuddy, D. J. Jr., Moore, J. K., Schartau, M., Spitz, Y. H., & Wiggert, J. D. (2007). Assessment of skill and portability in regional marine biogeochemical models: role of multiple planktonic groups. J. Geophys. Res., 112. doi:10.1029/2006JC003852.
Geweke, J. (1989). Bayesian inference in econometric models using Monte Carlo integration. Econometrica, 57, 1317–1339.
Gilks, W. R., & Berzuini, C. (2001). Following a moving target—Monte Carlo inference for dynamic Bayesian models. J. R. Stat. Soc. Ser. B Stat. Methodol., 63, 127–146.
Golightly, A., & Wilkinson, D. J. (2011). Bayesian parameter inference for stochastic biochemical network models using Markov chain Monte Carlo. J. R. Soc. Interface, 1, 807–820.
Gordon, N. J., Salmond, D. J., & Smith, A. F. M. (1993). Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc. F, 140, 107–113.
Gregg, W. W. (2008). Assimilation of SeaWIFS ocean chlorophyll data into a three-dimensional global ocean model. J. Mar. Syst., 69, 205–225.
Holling, C. S. (1959). Some characteristics of simple types of predation and parasitism. Can. Entomol., 91, 385–398.
Hurtt, G. C., & Armstrong, R. A. (1996). A pelagic ecosystem model calibrated with BATS data. Deep-Sea Res., Part 2, 43, 653–683.
Hurtt, G. C., & Armstrong, R. A. (1999). A pelagic ecosystem model calibrated with BATS and OWSI data. Deep-Sea Res., Part 1, 46, 27–61.
Ionides, E. L., Bretó, C., & King, A. A. (2006). Inference for nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA, 103, 18,438–18,443.
Johnson, K. A., & Goody, R. S. (2011). The original Michaelis constant: translation of the 1913 Michaelis-Menten paper. Biochemistry, 50, 8264–8269.
Kitagawa, G. (1996). Monte Carlo filter and smoother for non-Gaussian nonlinear state space models. J. Comput. Graph. Statist., 5, 1–25.
Kivman, G. A. (2003). Sequential parameter estimation for stochastic systems. Nonlinear Process. Geophys., 10, 253–259.
Kloeden, P. E., & Platen, E. (1999). Numerical solution of stochastic differential equations. Berlin: Springer.
Lawson, L. M., Spitz, Y. H., Hofmann, E. E., & Long, R. B. (1995). A data assimilation technique applied to a predator-prey model. Bull. Math. Biol., 57, 593–617.
Lawson, L. M., Hofmann, E. E., & Spitz, Y. H. (1996). Time series sampling and data assimilation in a simple marine ecosystem model. Deep-Sea Res., Part 2, 43, 625–651.
van Leeuwen, P. J. (2009). Particle filtering in geophysical systems. Mon. Weather Rev., 137, 4089–4114.
Lehmann, E. L., & Casella, G. (1998). Theory of point estimation (2nd ed.). New York: Springer.
Liu, J., & West, M. (2001). Combined parameter and state estimation in simulation based filtering. In A. Doucet, N. de Freitas, & N. Gordon (Eds.), Sequential Monte Carlo methods in practice (pp. 197–217). New York: Springer.
Longhurst, A. (1995). Seasonal cycles of pelagic production and consumption. Prog. Oceanogr., 36, 77–167.
Lorenz, E. N. (1963). Deterministic nonperiodic flow. J. Atmospheric Sci., 20, 130–141.
Losa, S. N., Kivman, G. A., Schroter, J., & Wenzel, M. (2003). Sequential weak constraint parameter estimation in an ecosystem model. J. Mar. Syst., 43, 31–49.
Marjoram, P., Molitor, J., Plagnol, V., & Tavaré, S. (2003). Markov chain Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. USA, 100, 15,324–15,328.
Matear, R. J. (1995). Parameter optimization and analysis of ecosystem models using simulated annealing: a case study at Station P. J. Mar. Res., 53, 571–607.
May, R. M. (1972). Limit cycles in predator-prey communities. Science, 177, 900–902.
McLachlan, G. J., & Krishnan, T. (2008). The EM algorithm and extensions (2nd ed.). Hoboken: Wiley.
Metropolis, N., Rosenbluth, A. W., Teller, A. H., & Teller, E. (1953). Equations of state calculations by fast computing machines. J. Chem. Phys., 21, 1087–1092.
Michaelis, L., & Menten, M. L. (1913). Die Kinetik der Invertinwirkung. Biochem. Z., 49, 333–369.
Miller, R. N., Carter, E. F. Jr., & Blue, S. T. (1999). Data assimilation into nonlinear stochastic models. Tellus A, 51, 167–194.
Morzfeld, M., & Chorin, A. J. (2012). Implicit particle filtering for models with partial noise, and an application to geomagnetic data assimilation. Nonlinear Process. Geophys., 19, 365–382.
Morzfeld, M., Tu, X., Atkins, E., & Chorin, A. J. (2012). A random map implementation of implicit filters. J. Comput. Phys., 231, 2049–2066.
Newberger, P. A., Allen, J. S., & Spitz, Y. H. (2003). Analysis and comparison of three ecosystem models. J. Geophys. Res., 108. doi:10.1029/2001JC001182.
Øksendal, B. K. (2003). Stochastic differential equations: an introduction with applications (6th ed.). Berlin: Springer.
Raftery, A. E., & Bao, L. (2010). Estimating and projecting trends in HIV/AIDS generalized epidemics using incremental mixture importance sampling. Biometrics, 66, 1162–1173.
Ripley, B. (1987). Stochastic simulation. New York: Wiley.
Robinson, A. R., & Lermusiaux, P. F. J. (2002). Data assimilation for modeling and predicting coupled physical-biological interactions in the sea. In A. R. Robinson, J. J. McCarthy, & B. J. Rothschild (Eds.), Biological-physical interactions in the sea. The sea (Vol. 12, pp. 475–536). New York: Wiley.
Sapsis, T. P., & Lermusiaux, P. F. J. (2009). Dynamically orthogonal field equations for continuous stochastic dynamical systems. Physica D, 238, 2347–2360.
Sapsis, T. P., & Lermusiaux, P. F. J. (2012). Dynamical criteria for the evolution of the stochastic dimensionality in flows with uncertainty. Physica D, 241, 60–76.
Sheather, S. J., & Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. J. R. Stat. Soc. Ser. B, 53, 683–690.
Sheskin, D. J. (2011). Handbook of parametric and nonparametric statistical procedures (5th ed.). Boca Raton: Chapman & Hall/CRC.
Silverman, B. W. (1986). Density estimation for statistics and data analysis. London: Chapman & Hall/CRC.
Simon, E., & Bertino, L. (2012). Gaussian anamorphosis extension of the DEnKF for combined state and parameter estimation: application to a 1D ocean ecosystem model. J. Mar. Syst., 89, 1–18.
Snyder, C., Bengtsson, T., Bickel, P., & Anderson, J. (2008). Obstacles to high-dimensional particle filtering. Mon. Weather Rev., 136, 4629–4640.
Spitz, Y. H., Moisan, J. R., & Abbott, M. R. (2001). Configuring an ecosystem model using data from the Bermuda Atlantic Time Series (BATS). Deep-Sea Res., Part 2, 48, 1733–1768.
Sugie, J., Kohno, R., & Miyazaki, R. (1997). On a predator-prey system of Holling type. Proc. Amer. Math. Soc., 125, 2041–2050.
Talagrand, O., & Courtier, P. (1987). Variational assimilation of meteorological observations with the adjoint vorticity equation. I: Theory. Q. J. R. Meteorol. Soc., 113, 1311–1328.
Vallino, J. J. (2000). Improving marine ecosystem models: use of data assimilation and mesocosm experiments. J. Mar. Res., 58, 117–164.
Wilkinson, D. J. (2010). Parameter inference for stochastic kinetic models of bacterial gene regulation: a Bayesian approach to systems biology. In J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. P. Smith, & M. West (Eds.), Bayesian statistics: Vol. 9. Proceedings of the ninth Valencia international meeting (pp. 679–706). Oxford: Oxford University Press.
Zakai, M. (1969). On the optimal filtering of diffusion processes. Probab. Theory Related Fields, 11, 230–243.
Zhao, L., Wei, H., Xu, Y., & Feng, S. (2005). An adjoint data assimilation approach for estimating parameters in a three-dimensional ecosystem model. Ecol. Model., 186, 234–249.
Acknowledgements
The authors thank Dr. Ethan Atkins, Professor Alexandre Chorin, and Dr. Matthias Morzfeld for their invaluable contribution to the research and presentation of the results in this paper. We thank Linda Lamb as well for her detailed proofreading. This work was supported by the National Science Foundation, Division of Ocean Sciences, Collaboration in Mathematical Geosciences award #0934956.
Appendix: The Optimization Problem
Essential to the applicability of our implicit sampling methods is the optimization step of Algorithm 1. We use the Levenberg–Marquardt method to search for the minimizer \(\zeta^{*}\). This method is part of a wider class of trust-region (or restricted-step) methods: iterative approaches that restrict the next step to a region centered at the current point (see Fig. 14 for a schematic). At each iteration, the method either expands or contracts the region depending on the ratio of the actual reduction in the cost function to the reduction predicted by a quadratic model. A notable feature of this method is that it safely handles indefinite Hessians. We use Algorithm 7.3.4 from Conn et al. (2000), which is sketched below. The numerical constants that appear in the method are arbitrary, and the stationary point and rate of convergence of the method are theoretically insensitive to their values. In practice, the constants affect the number of iterations until the algorithm terminates.
Algorithm 3 (Levenberg–Marquardt)
Let the subscript (k) denote the iteration number, and suppose we have some initial guess \(\zeta_{(0)}\) and radius \(\epsilon_{(0)}\).
1. At the current point \(\zeta_{(k)}\), compute the cost function, its gradient, and its Hessian:
$$J_{(k)} = J( \zeta_{(k)} ), \qquad g_{(k)} = \nabla J( \zeta_{(k)} ),\qquad H_{(k)} = H( \zeta_{(k)} ). $$
2. Find the proposed increment \(\Delta\zeta\) that minimizes the quadratic approximation at the current point, defined such that
$$K(\zeta_{(k)} + \Delta\zeta) = J_{(k)} + g_{(k)}^T \Delta\zeta + \frac{1}{2} (\Delta\zeta)^T H_{(k)} \Delta\zeta, $$
constrained to the region where
$$(\Delta\zeta)^T \Delta\zeta\leq\epsilon_{(k)}^2. $$
For more details on the solution of quadratic programming problems, see the monograph of Conn et al. (2000).
3. Compute the ratio of the actual reduction to the predicted reduction of the function at the proposed next step,
$$r = \frac{ J(\zeta_{(k)}) - J(\zeta_{(k)} + \Delta\zeta)}{ K(\zeta_{(k)}) - K(\zeta_{(k)} + \Delta\zeta)}. $$
Update the trust region size based on r,
$$\epsilon_{(k+1)} = \left\{ \begin{array}{l@{\quad}l} \epsilon_{(k)}/2 & \text{if $r < 1/4$}, \\ \min(2 \epsilon_{(k)}, \epsilon_M) & \text{if $r > 3/4$}, \\ \epsilon_{(k)} & \text{otherwise}, \end{array} \right. $$
where \(\epsilon_M\) is a user-specified maximum radius. Move to the next point only if it decreases the cost function, i.e.,
$$\zeta_{(k+1)} = \left\{ \begin{array}{l@{\quad}l} \zeta_{(k)} + \Delta\zeta & \text{if $0 < r$}, \\ \zeta_{(k)} & \text{otherwise}. \end{array} \right. $$
Since the proposed increment \(\Delta\zeta\) minimizes K within the trust region, the denominator of the ratio r is always positive. Hence, the iteration remains at the current point if and only if the proposed step does not decrease the cost function.
4. Replace k with k+1, and return to Step 1 until the norm of the gradient \(g_{(k)}\) is less than a given value or we reach an upper bound on the number of iterations.
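The steps of Algorithm 3 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used for the results in this paper: the function names (`solve_subproblem`, `trust_region_minimize`), the default radii and tolerance, and the Rosenbrock test function are all hypothetical choices. The constrained quadratic problem of Step 2 is solved here by a dense eigendecomposition with bisection on a diagonal shift, which is practical only for small problems and ignores the degenerate "hard case" treated in Conn et al. (2000).

```python
import numpy as np

def solve_subproblem(g, H, radius):
    """Minimize g.d + 0.5 d.H.d subject to ||d|| <= radius (small dense case)."""
    lam, Q = np.linalg.eigh(H)               # H = Q diag(lam) Q^T, lam ascending
    gq = Q.T @ g

    def step(shift):
        # Solution of (H + shift * I) d = -g in the eigenbasis of H.
        return -Q @ (gq / (lam + shift))

    if lam[0] > 0:
        d = step(0.0)
        if np.linalg.norm(d) <= radius:
            return d                          # unconstrained Newton step is interior
    # Otherwise shift H until the step lies on the boundary. Any shift above
    # max(0, -lam[0]) makes H + shift * I positive definite, so an indefinite
    # Hessian is handled safely.
    lo = max(0.0, -lam[0])
    hi = lo + np.linalg.norm(g) / radius + 1e-12   # guarantees ||step(hi)|| <= radius
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break                             # interval exhausted in floating point
        if np.linalg.norm(step(mid)) > radius:
            lo = mid
        else:
            hi = mid
    return step(hi)

def trust_region_minimize(J, grad, hess, zeta0, radius0=1.0, radius_max=10.0,
                          gtol=1e-8, max_iter=200):
    zeta = np.asarray(zeta0, dtype=float)
    radius = radius0
    for _ in range(max_iter):
        Jk, g, H = J(zeta), grad(zeta), hess(zeta)          # Step 1
        if np.linalg.norm(g) < gtol:
            break
        d = solve_subproblem(g, H, radius)                   # Step 2
        predicted = -(g @ d + 0.5 * d @ H @ d)               # K(zeta) - K(zeta + d)
        if predicted <= 0.0:
            break                                            # numerical safeguard
        r = (Jk - J(zeta + d)) / predicted                   # Step 3
        if r < 0.25:
            radius = radius / 2.0
        elif r > 0.75:
            radius = min(2.0 * radius, radius_max)
        if r > 0.0:
            zeta = zeta + d                                  # accept only on decrease
    return zeta

# Hypothetical test problem: the Rosenbrock function, whose Hessian is
# indefinite in part of the plane and whose minimizer is (1, 1).
def rosen(z):
    return (1.0 - z[0]) ** 2 + 100.0 * (z[1] - z[0] ** 2) ** 2

def rosen_grad(z):
    return np.array([-2.0 * (1.0 - z[0]) - 400.0 * z[0] * (z[1] - z[0] ** 2),
                     200.0 * (z[1] - z[0] ** 2)])

def rosen_hess(z):
    return np.array([[2.0 - 400.0 * z[1] + 1200.0 * z[0] ** 2, -400.0 * z[0]],
                     [-400.0 * z[0], 200.0]])

zeta_star = trust_region_minimize(rosen, rosen_grad, rosen_hess, [-1.2, 1.0])
```

The shift applied in `solve_subproblem` plays the role of the Levenberg–Marquardt damping parameter: a larger shift gives a shorter step that points more nearly along the negative gradient.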
The Derivatives of the Cost Function
The structure of the Hessian of J makes it possible to do these computations efficiently even when the dimension of ζ is large. Since the residuals only depend on the previous and current model step, the Hessian has a single band running down its diagonal, corresponding to the state derivatives, full columns at its far-right side, and full rows at its bottom, both corresponding to the parameter derivatives. We need only store the diagonal, subdiagonals, and bottom rows because the Hessian is symmetric. This representation grows linearly in the number of variables, as opposed to quadratically for the full representation. If the model equations are a discretization of a partial differential equation, we lose this special structure, but numerous libraries exist for optimization when the Hessian is sparse. In summary, the Hessian has the block form
$$H = \begin{pmatrix} H_{\mathbf{x}\mathbf{x}} & H_{\mathbf{x}\theta} \\ H_{\theta\mathbf{x}} & H_{\theta\theta} \end{pmatrix}, $$
where \(H_{\mathbf{x}\mathbf{x}}\) is a band matrix, and \(H_{\theta\mathbf{x}} = H_{\mathbf{x}\theta}^{T}\) since we assume J is smooth.
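One way to exploit this bordered band structure when solving a linear system with the Hessian is block elimination through the Schur complement of the state block. The sketch below is illustrative only and is not taken from the paper: the function name `solve_bordered` and the small test matrices are hypothetical, the state block is assumed to be positive definite, and dense solves stand in for the banded Cholesky solve (for example `scipy.linalg.solveh_banded`) that would be used to obtain linear cost in the number of states.

```python
import numpy as np

def solve_bordered(H_xx, H_xt, H_tt, b_x, b_t):
    """Solve [[H_xx, H_xt], [H_xt^T, H_tt]] [d_x; d_t] = [b_x; b_t].

    H_xx is the (banded) state block, H_xt the full border columns for the
    parameters, and H_tt the small parameter block.
    """
    # In a production code these two solves would use a banded factorization.
    W = np.linalg.solve(H_xx, H_xt)          # H_xx^{-1} H_xt, one column per parameter
    u = np.linalg.solve(H_xx, b_x)           # H_xx^{-1} b_x
    S = H_tt - H_xt.T @ W                    # Schur complement, size p x p
    d_t = np.linalg.solve(S, b_t - H_xt.T @ u)
    d_x = u - W @ d_t
    return d_x, d_t

# Hypothetical example: 6 state variables with a tridiagonal block, 2 parameters.
rng = np.random.default_rng(0)
n, p = 6, 2
H_xx = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
H_xt = 0.1 * rng.random((n, p))
H_tt = 10.0 * np.eye(p)
b_x, b_t = rng.random(n), rng.random(p)

d_x, d_t = solve_bordered(H_xx, H_xt, H_tt, b_x, b_t)

# Reference solution from the assembled dense matrix.
H_full = np.block([[H_xx, H_xt], [H_xt.T, H_tt]])
d_ref = np.linalg.solve(H_full, np.concatenate([b_x, b_t]))
```

Only the band of the state block, the border columns, and the small parameter block are needed, which matches the linear storage growth described above.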
To simplify notation, define the (column) vector of residuals
The cost function, its gradient, and its Hessian are thus
where \(H_{\rho_{i}}\) is the Hessian of the ith element of ρ. We use the Gauss–Newton approximation of the Hessian,
Since the goal of the optimization is to make the norm of the residuals small, we expect the neglected terms to be small as well (Fletcher 1987). The derivatives of the cost function in the unconstrained variables follow from an application of the chain rule. An added benefit of the Gauss–Newton approximation is that it is always positive semi-definite. Hence, we can stop the optimization at any iteration, and it is still possible to sample a Gaussian whose covariance matrix is the Moore–Penrose pseudoinverse of the Hessian.
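As an illustration of the last two points, the sketch below forms the Gauss–Newton quantities and draws Gaussian samples whose covariance is the pseudoinverse of the approximate Hessian. It assumes the standard nonlinear least-squares scaling \(J = \frac{1}{2}\rho^{T}\rho\), so that the gradient is \(A^{T}\rho\) and the Gauss–Newton Hessian is \(A^{T}A\), where \(A\) denotes the Jacobian of the residuals; this scaling, the function names, and the small test arrays are assumptions made for the example rather than details taken from the paper.

```python
import numpy as np

def gauss_newton_terms(rho, A):
    """Gradient and Gauss-Newton Hessian of J = 0.5 * rho^T rho (assumed scaling).

    rho : residual vector, shape (m,)
    A   : Jacobian of the residuals, shape (m, d)
    """
    return A.T @ rho, A.T @ A

def sample_gaussian_pinv(mean, H, n_samples, rng, rtol=1e-10):
    """Draw samples from N(mean, H^+), with H^+ the Moore-Penrose pseudoinverse.

    H must be symmetric positive semi-definite. Directions in the null space
    of H receive zero variance.
    """
    lam, Q = np.linalg.eigh(H)
    keep = lam > rtol * lam.max()            # numerically nonzero eigenvalues
    scale = np.zeros_like(lam)
    scale[keep] = 1.0 / np.sqrt(lam[keep])
    z = rng.standard_normal((n_samples, mean.size))
    return mean + (z * scale) @ Q.T          # covariance Q diag(scale^2) Q^T = H^+

# Hypothetical example: 2 residuals, 3 unknowns, so the Hessian is rank deficient.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
rho = np.array([0.5, -0.2])
g, H = gauss_newton_terms(rho, A)

rng = np.random.default_rng(1)
mean = np.zeros(3)
samples = sample_gaussian_pinv(mean, H, 1000, rng)
```

Because \(A^{T}A\) is positive semi-definite for any \(A\), the eigenvalues used in `sample_gaussian_pinv` are never meaningfully negative, which is what allows sampling at any iterate and not only at the converged minimizer.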
The nonzero derivatives of the model noise are
and the derivative of the observation noise is
The state derivatives of the Lotka–Volterra model equations (14) are
and the derivatives for each parameter are
and
Since the observation functions considered are linear, their Jacobians are constant and equal to the matrices that define the functions.
Cite this article
Weir, B., Miller, R.N. & Spitz, Y.H. Implicit Estimation of Ecological Model Parameters. Bull Math Biol 75, 223–257 (2013). https://doi.org/10.1007/s11538-012-9801-6
DOI: https://doi.org/10.1007/s11538-012-9801-6