Abstract
Data-driven machine learning surrogates are used in place of complex numerical groundwater simulation models within optimization algorithms to reduce the computational burden of large-scale aquifer management. Traditional surrogate-assisted simulation–optimization modeling, however, is limited by the uncertainty that persists in surrogate model predictions, and more advanced methods are needed to reduce the impact of this uncertainty on solution optimality. We therefore propose an ensemble surrogate-based simulation–optimization methodology for optimal saltwater intrusion (SWI) control that accounts for the uncertainty induced by surrogate models. The optimization model has two conflicting objectives: minimizing the total groundwater pumping and injection rate of an extraction-injection horizontal well system, and reducing the chloride concentration at monitoring locations below a specified level as far as possible. Three types of machine learning surrogates (artificial neural network, random forest, and support vector machine) were built to replace a high-fidelity, physically based saltwater intrusion model. An optimal Latin hypercube design, combined with parallel computing on a high performance computing (HPC) platform, was used to generate input–output data consisting of pumping and injection schedules and the resulting salinity levels. An innovative Bayesian set pair analysis approach is presented to derive posterior model weights from both training and testing data. The individual and ensemble machine learning surrogates were then coupled with a bi-objective optimization model to obtain Pareto-optimal extraction-injection strategies for a deep “2000-foot” sand in the Baton Rouge area, Louisiana, where the optimization was solved with the multi-objective genetic algorithm NSGA-II. Results showed that the individual and ensemble surrogate models were accurate enough for salinity prediction. A comparison of the Pareto-optimal solutions confirmed that ensemble surrogate-based modeling provides more reliable and conservative strategies for alleviating the saltwater intrusion threat while considerably reducing computational cost. The improved Bayesian set pair analysis approach proved robust for integrating multiple models by quantifying model uncertainty.
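The two ideas at the core of the abstract, a weighted ensemble of surrogates and a Pareto front over two minimized objectives, can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the three toy linear functions only stand in for trained ANN, random forest, and SVM surrogates, the weights are placeholders rather than posterior weights from Bayesian set pair analysis, and a brute-force enumeration over a small grid replaces NSGA-II.

```python
from typing import Callable, List, Sequence, Tuple

Surrogate = Callable[[Sequence[float]], float]


def ensemble_predict(models: Sequence[Surrogate],
                     weights: Sequence[float],
                     x: Sequence[float]) -> float:
    """Weighted average of surrogate predictions (weights are normalized)."""
    total = sum(weights)
    return sum(w * m(x) for m, w in zip(models, weights)) / total


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in one.

    Both objectives are assumed to be minimized.
    """
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical stand-ins for trained surrogates: each maps
# (extraction rate, injection rate) to a chloride concentration.
toy_models: List[Surrogate] = [
    lambda x: 300.0 - 2.0 * x[0] - 1.0 * x[1],   # stand-in for the ANN
    lambda x: 310.0 - 2.2 * x[0] - 0.8 * x[1],   # stand-in for the random forest
    lambda x: 290.0 - 1.8 * x[0] - 1.2 * x[1],   # stand-in for the SVM
]
toy_weights = [0.5, 0.3, 0.2]  # placeholder weights, not derived from data

if __name__ == "__main__":
    # Enumerate a small grid of candidate schedules instead of running NSGA-II.
    candidates = [(e, i) for e in (0.0, 10.0, 20.0) for i in (0.0, 10.0, 20.0)]
    objectives = [
        (sum(x), ensemble_predict(toy_models, toy_weights, x))
        for x in candidates
    ]
    for total_rate, chloride in pareto_front(objectives):
        print(f"total rate {total_rate:5.1f} -> predicted chloride {chloride:6.1f}")
```

In this sketch the second objective is simply the ensemble-predicted concentration; a formulation that penalizes only the exceedance above a threshold would change that one expression and leave the dominance check unchanged.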
Data Availability
The historical pumping data were provided by the Capital Area Ground Water Conservation Commission (CAGWCC) of Louisiana. The water quality data were provided by the U.S. Geological Survey (USGS). Other data, models, and codes that support the findings of this study are available from the corresponding author upon request.
Acknowledgements
This research was supported by the National Key Research and Development Program (No. 2021YFC3200500), the National Natural Science Foundation of China (No. 52109080), and the Fundamental Research Funds for the Central Universities (B220201013). The authors acknowledge the Capital Area Ground Water Conservation Commission (CAGWCC) for providing water pumping data and the U.S. Geological Survey (USGS) for providing water quality data. The High Performance Computing (HPC) Platform at Hohai University is acknowledged for providing technical support.
Funding
This research was supported by the National Key Research and Development Program (No. 2021YFC3200500), the National Natural Science Foundation of China (No. 52109080), and the Fundamental Research Funds for the Central Universities (B220201013). The High Performance Computing (HPC) Platform at Hohai University is acknowledged for providing technical support.
Author information
Contributions
Jina Yin: Conceptualization, Data curation, Methodology, Funding acquisition, Validation, Formal analysis, Writing - original draft. Frank T.-C. Tsai: Conceptualization, Methodology, Supervision, Resources, Validation, Writing - review & editing. Chunhui Lu: Funding acquisition, Data curation, Methodology, Resources, Supervision, Validation, Writing - review & editing.
Ethics declarations
Ethics Approval
The research in this study was conducted in accordance with the ethical standards of the institutional and national research committees.
Consent to Participate
All authors consent to participate.
Consent to Publish
All authors consent to publish.
Conflicts of Interest
The authors declare that they have no known competing interests.
About this article
Cite this article
Yin, J., Tsai, F.TC. & Lu, C. Bi-objective Extraction-injection Optimization Modeling for Saltwater Intrusion Control Considering Surrogate Model Uncertainty. Water Resour Manage 36, 6017–6042 (2022). https://doi.org/10.1007/s11269-022-03340-9
DOI: https://doi.org/10.1007/s11269-022-03340-9