Dimensionality reduction for efficient Bayesian estimation of groundwater flow in strongly heterogeneous aquifers

  • Thierry A. Mara
  • Noura Fajraoui
  • Alberto Guadagnini
  • Anis Younes
Original Paper


We focus on the Bayesian estimation of strongly heterogeneous transmissivity fields conditional on data sampled at a set of locations in an aquifer. Log-transmissivity, Y, is modeled as a stochastic Gaussian process, parameterized through a truncated Karhunen–Loève (KL) expansion. We consider Y fields characterized by a short correlation scale relative to the size of the observed domain. For such systems, even a truncated KL expansion requires a large number of parameters, which hampers the efficiency of the Bayesian estimation of the underlying stochastic field. The aim of this work is to present an efficient approach for the stochastic inverse modeling of fully saturated groundwater flow in these strongly heterogeneous domains. The methodology is grounded on the construction of an optimal sparse KL expansion, achieved by retaining only a limited set of modes. Mode selection is driven by model selection criteria and is conditional on available data of hydraulic heads and (optionally) Y. Bayesian inversion of the optimal sparse KL expansion is then performed using Markov Chain Monte Carlo (MCMC) samplers. As a test bed, we illustrate our approach through a suite of computational examples in which noisy head and Y values are sampled from a given randomly generated system. Our findings suggest that the proposed methodology yields a globally satisfactory inversion of the stochastic head and Y fields. Comparison of reference values against the corresponding MCMC predictive distributions suggests that observed values are well reproduced in a probabilistic sense. In a few cases, reference values at some unsampled locations (typically far from measurements) are not captured by the posterior probability distributions. In these cases, the quality of the estimation could be improved, e.g., by increasing the number of measurements and/or the threshold for the selection of KL modes.
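To illustrate why a short correlation scale inflates the number of KL parameters, the following is a minimal sketch of a discrete truncated KL expansion. It is not the authors' sparse mode selection: the 1D grid, the exponential covariance model, the variance-fraction truncation rule and the names `truncated_kl`, `sample_field` and `energy` are all assumptions made for illustration only.

```python
import numpy as np


def truncated_kl(x, variance, corr_length, energy=0.95):
    """Leading discrete KL modes of an (assumed) exponential covariance on grid x."""
    # Covariance matrix C_ij = variance * exp(-|x_i - x_j| / corr_length)
    dist = np.abs(x[:, None] - x[None, :])
    cov = variance * np.exp(-dist / corr_length)
    # Symmetric eigendecomposition; eigh returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against round-off negatives
    eigvecs = eigvecs[:, order]
    # Keep the smallest number of modes carrying the requested variance fraction
    cum = np.cumsum(eigvals) / np.sum(eigvals)
    n_modes = min(int(np.searchsorted(cum, energy)) + 1, eigvals.size)
    return eigvals[:n_modes], eigvecs[:, :n_modes]


def sample_field(eigvals, eigvecs, mean, rng):
    """Draw one realization Y = mean + sum_k sqrt(lambda_k) * xi_k * phi_k."""
    xi = rng.standard_normal(eigvals.size)  # independent standard normal coefficients
    return mean + eigvecs @ (np.sqrt(eigvals) * xi)


# Hypothetical setup: unit-length domain, unit variance, two correlation scales
x = np.linspace(0.0, 1.0, 200)
lam_long, phi_long = truncated_kl(x, variance=1.0, corr_length=0.5)
lam_short, phi_short = truncated_kl(x, variance=1.0, corr_length=0.02)

rng = np.random.default_rng(0)
y = sample_field(lam_short, phi_short, mean=0.0, rng=rng)

print("modes retained (long correlation scale): ", lam_long.size)
print("modes retained (short correlation scale):", lam_short.size)
```

In this sketch the coefficients `xi` play the role of the parameters that a Bayesian inversion would have to estimate, so the count of retained modes is the dimension of the inverse problem. A simple variance-based truncation like this one keeps many more modes when the correlation scale is short, which is the difficulty that a data-driven selection of modes is meant to address.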


Keywords: Heterogeneous porous media; Stochastic inverse modeling; Karhunen–Loève expansion; Markov Chain Monte Carlo



The authors are grateful to the French National Research Agency, which funded this work through the program AAP Blanc-SIMI 6, project RESAIN (no. ANR-12-BS06-0010-02). AG acknowledges funding from the European Union’s Horizon 2020 Research and Innovation programme in the context of the Water JPI (WATERWORKS2014 ERA-NET cofunded program; Project “WatEr NEEDs, availability, quality and sustainability” WE-NEED).



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Thierry A. Mara (1)
  • Noura Fajraoui (2, 3)
  • Alberto Guadagnini (4, 5)
  • Anis Younes (2, 6, 7)
  1. PIMENT, EA 4518, Université de La Réunion, Saint-Denis, France
  2. LHyGeS, UMR-CNRS 7517, Université de Strasbourg/EOST, Strasbourg, France
  3. Chair of Risk, Safety and Uncertainty Quantification, Department of Civil Engineering, ETH Zurich, Zurich, Switzerland
  4. Dipartimento di Ingegneria Civile e Ambientale, Politecnico di Milano, Milan, Italy
  5. Department of Hydrology and Atmospheric Sciences, University of Arizona, Tucson, USA
  6. IRD UMR LISAH, Montpellier, France
  7. LMHE, Ecole Nationale d’Ingénieurs de Tunis, Tunis, Tunisia
