Abstract
Data assimilation, the fusion of a mathematical model with ecological data, is rapidly expanding knowledge of ecological systems across multiple spatial and temporal scales. As more ecological data become available to a broader audience, quantitative proficiency with data assimilation tools and techniques will be an essential skill for ecological analysis in this data-rich era. We provide a data assimilation primer for the novice user by (1) reviewing data assimilation terminology and methodology, (2) showcasing a variety of data assimilation studies across the ecological, environmental, and atmospheric sciences to illustrate potential applications of data assimilation, and (3) applying data assimilation in two ecological examples: determining the components of net ecosystem carbon uptake in a forest, and the population dynamics of the mayfly (Hexagenia limbata, Serville). The review and examples are then used to provide guiding principles to newly proficient data assimilation practitioners.
Acknowledgments
The authors would like to thank T. Quaife for helpful discussions, students of the Annual Summer Course in Flux Measurements and Modeling (NSF 0634649), and anonymous reviewers. ARD acknowledges funding support from the Department of Energy (DOE) Office of Biological and Environmental Research (BER) National Institute for Climatic Change Research (NICCR) Midwestern Region Subagreement 050516Z19.
Additional information
Communicated by Russell Monson.
Electronic supplementary material
The ESM for this paper includes helpful “Rules of Thumb” for data assimilation, a glossary of data assimilation terminology, a table outlining different data assimilation studies, results additional to those described above, and all data and code used to generate the results. These data and programs are also available at: http://www.augsburg.edu/home/math/faculty/zobitz/dataAssimilation.html.
Cite this article
Zobitz, J.M., Desai, A.R., Moore, D.J.P. et al. A primer for data assimilation with ecological models using Markov Chain Monte Carlo (MCMC). Oecologia 167, 599–611 (2011). https://doi.org/10.1007/s00442-011-2107-9
DOI: https://doi.org/10.1007/s00442-011-2107-9