
A novel neural network training framework with data assimilation

Published in The Journal of Supercomputing.

Abstract

In recent years, the prosperity of deep learning has revolutionized artificial neural networks (ANNs). However, the dependence on gradients and the offline training mechanism of existing learning algorithms prevent ANNs from further improvement. In this study, a gradient-free training framework based on data assimilation is proposed to avoid the calculation of gradients. In data assimilation algorithms, the error covariance between the forecasts and the observations is used to optimize the states. Feedforward neural networks are trained with gradient descent and with two data assimilation algorithms, the Ensemble Kalman Filter (EnKF) and the Ensemble Smoother with Multiple Data Assimilation (ES-MDA), respectively. ES-MDA trains a feedforward neural network over a pre-defined number of iterations by updating the parameters (i.e. the states) using all available observations, which can be regarded as offline learning. The EnKF optimizes a feedforward neural network by updating the parameters whenever a new observation becomes available, which can be regarded as real-time learning. Two synthetic cases, the regression of a sine function and of a Mexican hat function, are conducted to validate the effectiveness of the proposed framework. Quantitative comparisons using the root mean square error and the coefficient of determination show that the proposed framework achieves better performance than the gradient descent method. Furthermore, the uncertainty of the parameters is quantified, showing a reduction in uncertainty over the ES-MDA iterations. The proposed framework thus offers an alternative for real-time/offline training of existing artificial neural networks (e.g. convolutional neural networks, recurrent neural networks) without depending on gradients, while conducting uncertainty analysis at the same time.
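As a rough illustration of the ES-MDA side of such a framework, the sketch below trains a tiny feedforward network on noisy sine observations by repeatedly updating an ensemble of flattened parameter vectors with a sample-covariance, Kalman-like gain. The network size, ensemble size, noise level, and inflation schedule here are illustrative assumptions for a minimal sketch, not the authors' exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny feedforward net 1 -> H -> 1; parameters live in one flat vector
# so the whole network can be treated as a "state" to be assimilated.
H = 10
n_par = 3 * H + 1

def forward(theta, x):
    """Evaluate the network for one flattened parameter vector."""
    w1 = theta[:H].reshape(1, H)
    b1 = theta[H:2 * H]
    w2 = theta[2 * H:3 * H].reshape(H, 1)
    b2 = theta[3 * H]
    h = np.tanh(x @ w1 + b1)
    return (h @ w2 + b2).ravel()

# Synthetic training data: a sine function observed on a grid.
x = np.linspace(-np.pi, np.pi, 40).reshape(-1, 1)
d = np.sin(x).ravel()

n_ens = 200        # ensemble size (illustrative)
n_mda = 8          # number of MDA iterations (illustrative)
alpha = n_mda      # constant inflation so that sum(1/alpha) = 1
obs_std = 0.05
Cd = obs_std ** 2 * np.eye(len(d))

# Initial parameter ensemble drawn from a broad prior.
theta = rng.normal(0.0, 0.5, size=(n_ens, n_par))
rmse0 = np.sqrt(np.mean((forward(theta.mean(0), x) - d) ** 2))

for _ in range(n_mda):
    # Forecast step: run every ensemble member through the network.
    Y = np.stack([forward(t, x) for t in theta])          # (n_ens, n_obs)
    # Perturb observations with inflated noise (the ES-MDA trick).
    D = d + rng.normal(0.0, np.sqrt(alpha) * obs_std, size=Y.shape)
    # Sample cross- and auto-covariances from ensemble anomalies.
    A = theta - theta.mean(0)
    B = Y - Y.mean(0)
    C_td = A.T @ B / (n_ens - 1)                          # (n_par, n_obs)
    C_dd = B.T @ B / (n_ens - 1)                          # (n_obs, n_obs)
    K = C_td @ np.linalg.inv(C_dd + alpha * Cd)           # Kalman-like gain
    # Analysis step: gradient-free parameter update.
    theta = theta + (D - Y) @ K.T

rmse = np.sqrt(np.mean((forward(theta.mean(0), x) - d) ** 2))
print(f"RMSE before/after assimilation: {rmse0:.3f} / {rmse:.3f}")
```

Note that no gradient of the loss with respect to the weights appears anywhere: the ensemble covariances play the role the backpropagated gradient plays in gradient descent, and the spread of `theta` across members provides a direct (if approximate) uncertainty estimate for the parameters.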


Data availability

All data generated or analysed during this study are included in this published article.


Acknowledgements

The authors would like to thank the readers for their time and effort and for providing constructive comments to improve the paper. This work was supported by the National Natural Science Foundation of China [grant No. 62006247]; the PetroChina Innovation Foundation [2020D-5007-0301]; and the National Key R&D Program of China [grant No. 2019YFC1510501].

Author information

Correspondence to Chong Chen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Chen, C., Dou, Y., Chen, J. et al. A novel neural network training framework with data assimilation. J Supercomput 78, 19020–19045 (2022). https://doi.org/10.1007/s11227-022-04629-7
