A primer for data assimilation with ecological models using Markov Chain Monte Carlo (MCMC)

  • Concepts, Reviews and Syntheses
Oecologia

Abstract

Data assimilation, the fusion of a mathematical model with ecological data, is rapidly expanding knowledge of ecological systems across multiple spatial and temporal scales. As more ecological data become available to a broader audience, quantitative proficiency with data assimilation tools and techniques will be an essential skill for ecological analysis in this data-rich era. We provide a data assimilation primer for the novice user by (1) reviewing data assimilation terminology and methodology, (2) showcasing a variety of data assimilation studies across the ecological, environmental, and atmospheric sciences to illustrate potential applications of data assimilation, and (3) applying data assimilation in two specific ecological examples: determining the components of net ecosystem carbon uptake in a forest, and determining the population dynamics of the mayfly (Hexagenia limbata, Serville). The review and examples are then used to provide guiding principles for newly proficient data assimilation practitioners.
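
To make the idea of fusing a model with data through MCMC concrete, the following is a minimal sketch of a random-walk Metropolis sampler. It is not the authors' code (that is provided in the electronic supplementary material); the toy exponential-growth model, the synthetic observations, the uniform prior bounds, and all function and variable names are assumptions chosen purely for illustration.

```python
import math
import random


def model(t, r, n0=10.0):
    """Hypothetical toy model: exponential growth from n0 at rate r."""
    return n0 * math.exp(r * t)


def log_likelihood(r, times, obs, sigma):
    """Gaussian log-likelihood of the observations, up to an additive constant."""
    return -0.5 * sum(((y - model(t, r)) / sigma) ** 2 for t, y in zip(times, obs))


def metropolis(times, obs, sigma, r0, step, n_iter, rng, bounds=(0.0, 1.0)):
    """Random-walk Metropolis sampler for r with a uniform prior on `bounds`."""
    chain = [r0]
    current_ll = log_likelihood(r0, times, obs, sigma)
    n_accepted = 0
    for _ in range(n_iter):
        proposal = chain[-1] + rng.gauss(0.0, step)
        accept = False
        # Proposals outside the prior bounds have zero prior density: reject.
        if bounds[0] <= proposal <= bounds[1]:
            proposal_ll = log_likelihood(proposal, times, obs, sigma)
            # Accept with probability min(1, likelihood ratio).
            accept = rng.random() < math.exp(min(0.0, proposal_ll - current_ll))
        if accept:
            chain.append(proposal)
            current_ll = proposal_ll
            n_accepted += 1
        else:
            chain.append(chain[-1])
    return chain, n_accepted / n_iter


# Synthetic "observations" generated from an assumed true rate plus noise.
rng = random.Random(42)
true_r, sigma = 0.3, 2.0
times = list(range(10))
obs = [model(t, true_r) + rng.gauss(0.0, sigma) for t in times]

chain, acceptance_rate = metropolis(
    times, obs, sigma, r0=0.5, step=0.005, n_iter=5000, rng=rng
)

burn_in = 1000  # discard early samples taken before the chain settles
samples = chain[burn_in:]
posterior_mean = sum(samples) / len(samples)
print(f"posterior mean of r: {posterior_mean:.4f}")
print(f"acceptance rate: {acceptance_rate:.2f}")
```

In this sketch the retained samples approximate the posterior distribution of the single parameter, so their spread gives an uncertainty estimate alongside the point estimate. The proposal step size and burn-in length here are arbitrary choices and would need tuning for any real model.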

Acknowledgments

The authors would like to thank T. Quaife for helpful discussions, students of the Annual Summer Course in Flux Measurements and Modeling (NSF 0634649), and anonymous reviewers. ARD acknowledges funding support from the Department of Energy (DOE) Office of Biological and Environmental Research (BER) National Institute for Climatic Change Research (NICCR) Midwestern Region Subagreement 050516Z19.

Author information

Corresponding author

Correspondence to J. M. Zobitz.

Additional information

Communicated by Russell Monson.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 287 kb)

The ESM for this paper includes helpful “Rules of Thumb” for data assimilation, a glossary of data assimilation terminology, a table outlining different data assimilation studies, results additional to those described above, and all data and code used to generate the results. These data and programs are also available at: http://www.augsburg.edu/home/math/faculty/zobitz/dataAssimilation.html.

About this article

Cite this article

Zobitz, J.M., Desai, A.R., Moore, D.J.P. et al. A primer for data assimilation with ecological models using Markov Chain Monte Carlo (MCMC). Oecologia 167, 599–611 (2011). https://doi.org/10.1007/s00442-011-2107-9

  • DOI: https://doi.org/10.1007/s00442-011-2107-9
