
A novel neural network training framework with data assimilation

Published in The Journal of Supercomputing.

Abstract

In recent years, the prosperity of deep learning has revolutionized artificial neural networks (ANNs). However, the dependence on gradients and the offline training mechanism of existing learning algorithms prevent ANNs from further improvement. In this study, a gradient-free training framework based on data assimilation is proposed to avoid the calculation of gradients. In data assimilation algorithms, the error covariance between the forecasts and the observations is used to optimize the states. Feedforward neural networks are trained with gradient descent and with two data assimilation algorithms, the Ensemble Kalman Filter (EnKF) and the Ensemble Smoother with Multiple Data Assimilation (ES-MDA), respectively. ES-MDA trains a feedforward neural network over a pre-defined number of iterations by updating the parameters (i.e. the states) using all available observations, which can be regarded as offline learning. The EnKF optimizes a feedforward neural network by updating the parameters whenever a new observation becomes available, which can be regarded as real-time learning. Two synthetic cases, the regression of a sine function and of a Mexican hat function, are conducted to validate the effectiveness of the proposed framework. Quantitative comparisons using the root mean square error and the coefficient of determination show that the proposed framework achieves better performance than the gradient descent method. Furthermore, the uncertainty of the parameters is quantified, showing a reduction in uncertainty over the ES-MDA iterations. The proposed framework thus offers an alternative for real-time/offline training of existing artificial neural networks (e.g. convolutional neural networks, recurrent neural networks) without depending on gradients, while conducting uncertainty analysis at the same time.
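As a rough illustration of the ES-MDA side of such a framework, the sketch below trains a tiny feedforward network on noisy sine observations by repeatedly updating an ensemble of flattened parameter vectors with a sample-covariance, Kalman-like gain. The network size, ensemble size, noise level, and inflation schedule here are illustrative assumptions for a minimal sketch, not the authors' exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny feedforward net 1 -> H -> 1; parameters live in one flat vector
# so the whole network can be treated as a "state" to be assimilated.
H = 10
n_par = 3 * H + 1

def forward(theta, x):
    """Evaluate the network for one flattened parameter vector."""
    w1 = theta[:H].reshape(1, H)
    b1 = theta[H:2 * H]
    w2 = theta[2 * H:3 * H].reshape(H, 1)
    b2 = theta[3 * H]
    h = np.tanh(x @ w1 + b1)
    return (h @ w2 + b2).ravel()

# Synthetic training data: a sine function observed on a grid.
x = np.linspace(-np.pi, np.pi, 40).reshape(-1, 1)
d = np.sin(x).ravel()

n_ens = 200        # ensemble size (illustrative)
n_mda = 8          # number of MDA iterations (illustrative)
alpha = n_mda      # constant inflation so that sum(1/alpha) = 1
obs_std = 0.05
Cd = obs_std ** 2 * np.eye(len(d))

# Initial parameter ensemble drawn from a broad prior.
theta = rng.normal(0.0, 0.5, size=(n_ens, n_par))
rmse0 = np.sqrt(np.mean((forward(theta.mean(0), x) - d) ** 2))

for _ in range(n_mda):
    # Forecast step: run every ensemble member through the network.
    Y = np.stack([forward(t, x) for t in theta])          # (n_ens, n_obs)
    # Perturb observations with inflated noise (the ES-MDA trick).
    D = d + rng.normal(0.0, np.sqrt(alpha) * obs_std, size=Y.shape)
    # Sample cross- and auto-covariances from ensemble anomalies.
    A = theta - theta.mean(0)
    B = Y - Y.mean(0)
    C_td = A.T @ B / (n_ens - 1)                          # (n_par, n_obs)
    C_dd = B.T @ B / (n_ens - 1)                          # (n_obs, n_obs)
    K = C_td @ np.linalg.inv(C_dd + alpha * Cd)           # Kalman-like gain
    # Analysis step: gradient-free parameter update.
    theta = theta + (D - Y) @ K.T

rmse = np.sqrt(np.mean((forward(theta.mean(0), x) - d) ** 2))
print(f"RMSE before/after assimilation: {rmse0:.3f} / {rmse:.3f}")
```

Note that no gradient of the loss with respect to the weights appears anywhere: the ensemble covariances play the role the backpropagated gradient plays in gradient descent, and the spread of `theta` across members provides a direct (if approximate) uncertainty estimate for the parameters.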


Data availability

All data generated or analysed during this study are included in this published article.


Acknowledgements

The authors would like to thank the readers for their time and effort and for providing constructive comments to improve the paper. This work was supported by the National Natural Science Foundation of China [grant No. 62006247]; the PetroChina Innovation Foundation [2020D-5007-0301]; and the National Key R&D Program of China [grant No. 2019YFC1510501].

Author information

Correspondence to Chong Chen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Chen, C., Dou, Y., Chen, J. et al. A novel neural network training framework with data assimilation. J Supercomput 78, 19020–19045 (2022). https://doi.org/10.1007/s11227-022-04629-7
