Neural Network-Based Limiter with Transfer Learning

Original Paper
Communications on Applied Mathematics and Computation

Abstract

Recent works have shown that neural networks are promising parameter-free limiters for a variety of numerical schemes (Morgan et al. in A machine learning approach for detecting shocks with high-order hydrodynamic methods. https://doi.org/10.2514/6.2020-2024; Ray et al. in J Comput Phys 367: 166–191. https://doi.org/10.1016/j.jcp.2018.04.029, 2018; Veiga et al. in European Conference on Computational Mechanics and VII European Conference on Computational Fluid Dynamics, vol. 1, pp. 2525–2550. ECCM. https://doi.org/10.5167/uzh-168538, 2018). Following this trend, we train a neural network to serve as a shock-indicator function using simulation data from a Runge-Kutta discontinuous Galerkin (RKDG) method and a modal high-order limiter (Krivodonova in J Comput Phys 226: 879–896. https://doi.org/10.1016/j.jcp.2007.05.011, 2007). With this methodology, we obtain one- and two-dimensional black-box shock-indicators which are then coupled to a standard limiter. Furthermore, we describe a strategy to transfer the shock-indicator to a residual distribution (RD) scheme without the need for a full training cycle and large dataset, by finding a mapping between the solution feature spaces from an RD scheme to an RKDG scheme, both in one- and two-dimensional problems, and on Cartesian and unstructured meshes. We report on the quality of the numerical solutions when using the neural network shock-indicator coupled to a limiter, comparing its performance to traditional limiters, for both RKDG and RD schemes.
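As a rough illustration of the approach described in the abstract, the following is a minimal sketch of how such a neural shock-indicator could be set up, assuming PyTorch: a small MLP maps per-cell solution features to a troubled-cell probability, trained against labels produced by a classical limiter acting as the teacher. The feature count, layer widths, and random placeholder data below are hypothetical, not the architecture or dataset of the paper.

```python
# Minimal sketch (assumptions, not the authors' code): an MLP that
# classifies a cell as "troubled" from local solution features.
import torch
import torch.nn as nn

class ShockIndicator(nn.Module):
    def __init__(self, n_features: int, width: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, 1),  # logit: troubled vs. smooth
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Train against labels from a classical limiter (e.g. the modal
# limiter of Krivodonova [22]) used as the "teacher".
model = ShockIndicator(n_features=5)          # 5 is a placeholder feature count
opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam optimizer [21]
loss_fn = nn.BCEWithLogitsLoss()

features = torch.randn(1024, 5)                      # placeholder cell features
labels = torch.randint(0, 2, (1024, 1)).float()      # placeholder limiter labels

for _ in range(100):
    opt.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    opt.step()
```

At solve time, cells whose predicted probability exceeds a threshold would be flagged and passed to the standard limiter, making the indicator a black-box drop-in replacement for a hand-tuned detector.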


Notes

  1. Some of the solvers used are still under development and not publicly available.

References

  1. Abgrall, R.: Residual distribution schemes: current status and future trends. Comput. Fluids 35(7), 641–669 (2006). https://doi.org/10.1016/j.compfluid.2005.01.007


  2. Abgrall, R., Bacigaluppi, P., Tokareva, S.: High-order residual distribution scheme for the time-dependent Euler equations of fluid dynamics. Comput. Math. Appl. 78(2), 274–297 (2019). https://doi.org/10.1016/j.camwa.2018.05.009


  3. Bacigaluppi, P., Abgrall, R., Tokareva, S.: “A posteriori” limited high order and robust residual distribution schemes for transient simulations of fluid flows in gas dynamics. arXiv:1902.07773 (2019)

  4. Beck, A., Zeifang, J., Schwarz, A., Flad, D.: A neural network based shock detection and localization approach for discontinuous Galerkin methods (2020). https://doi.org/10.13140/RG.2.2.20237.90085

  5. Bergstra, J., Yamins, D., Cox, D.: Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In: Dasgupta, S., McAllester, D. (eds.) Proceedings of the 30th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 28, pp. 115–123. PMLR, Atlanta (2013). http://proceedings.mlr.press/v28/bergstra13.html

  6. Biswas, R., Devine, K.D., Flaherty, J.E.: Parallel, adaptive finite element methods for conservation laws. Appl. Numer. Math. 14(1), 255–283 (1994). https://doi.org/10.1016/0168-9274(94)90029-9


  7. Bottou, L., Curtis, F., Nocedal, J.: Optimization methods for large-scale machine learning. SIAM Rev. 60(2), 223–311 (2018). https://doi.org/10.1137/16M1080173


  8. Burman, E., Fernández, M.A.: Continuous interior penalty finite element method for the time-dependent Navier-Stokes equations: space discretization and convergence. Numer. Math. 107(1), 39–77 (2007). https://doi.org/10.1007/s00211-007-0070-5


  9. Clain, S., Diot, S., Loubère, R.: Multi-dimensional optimal order detection (MOOD)—a very high-order finite volume scheme for conservation laws on unstructured meshes. In: Fořt, J., Fürst, J., Halama, J., Herbin, R., Hubert, F. (eds.) Finite Volumes for Complex Applications VI Problems & Perspectives, pp. 263–271. Springer, Berlin (2011)


  10. Cockburn, B., Shu, C.W.: TVB Runge-Kutta local projection discontinuous Galerkin finite element method for conservation laws II: general framework. Math. Comput. 52(186), 411–435 (1989). http://www.jstor.org/stable/2008474

  11. Cockburn, B., Shu, C.W.: The Runge-Kutta discontinuous Galerkin method for conservation laws V. J. Comput. Phys. 141(2), 199–224 (1998). https://doi.org/10.1006/jcph.1998.5892


  12. Cohen, T.S., Geiger, M., Weiler, M.: A general theory of equivariant CNNs on homogeneous spaces. In: Wallach, H., Larochelle, H., Beygelzimer, A., Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32, pp. 9145–9156. Curran Associates, Inc. (2019). http://papers.nips.cc/paper/9114-a-general-theory-of-equivariant-cnns-on-homogeneous-spaces.pdf

  13. Cohen, T.S., Weiler, M., Kicanaoglu, B., Welling, M.: Gauge equivariant convolutional networks and the icosahedral CNN. arXiv:1902.04615 (2019)

  14. Dafermos, C.: Hyperbolic Conservation Laws in Continuum Physics. Grundlehren der mathematischen Wissenschaften. Springer, Berlin (2009). https://books.google.com/books?id=49bXK26O_b4C

  15. GitHub repository: https://github.com/hanveiga/1d-dg-nn

  16. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press, Massachusetts (2016). http://www.deeplearningbook.org

  17. Gottlieb, S.: On high order strong stability preserving Runge-Kutta and multi step time discretizations. J. Sci. Comput. 25(1), 105–128 (2005). https://doi.org/10.1007/BF02728985


  18. Harten, A., Lax, P.D.: On a class of high resolution total-variation-stable finite-difference schemes. SIAM J. Numer. Anal. 21(1), 1–23 (1984). http://www.jstor.org/stable/2157043

  19. He, Y. et al.: Streaming end-to-end speech recognition for mobile devices. In: 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6381–6385 (2019). https://doi.org/10.1109/ICASSP.2019.8682336

  20. Hoens, T.R., Chawla, N.V.: Imbalanced Datasets: From Sampling to Classifiers. Wiley, New York (2013). https://doi.org/10.1002/9781118646106.ch3


  21. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. ICLR (2015). arXiv:1412.6980

  22. Krivodonova, L.: Limiters for high-order discontinuous Galerkin methods. J. Comput. Phys. 226, 879–896 (2007). https://doi.org/10.1016/j.jcp.2007.05.011


  23. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems—Volume 1, pp. 1097–1105. Curran Associates Inc., USA (2012). http://dl.acm.org/citation.cfm?id=2999134.2999257

  24. Kurganov, A., Tadmor, E.: Solution of two-dimensional Riemann problems for gas dynamics without Riemann problem solvers. Numer. Methods Partial Differ. Equ. 18(5), 584–608 (2002). https://doi.org/10.1002/num.10025

  25. Lax, P.D.: Weak solutions of nonlinear hyperbolic equations and their numerical computation. Commun. Pure Appl. Math. 7(1), 159–193 (1954). https://doi.org/10.1002/cpa.3160070112


  26. Mhaskar, H., Liao, Q., Poggio, T.A.: When and why are deep networks better than shallow ones? In: AAAI, pp. 2343–2349 (2017)

  27. Mikolajczyk, A., Grochowski, M.: Data augmentation for improving deep learning in image classification problem. In: 2018 International Interdisciplinary PhD Workshop (IIPhDW), pp. 117–122 (2018)

  28. Morgan, N.R., Tokareva, S., Liu, X., Morgan, A.: A machine learning approach for detecting shocks with high-order hydrodynamic methods (2020). https://doi.org/10.2514/6.2020-2024

  29. Petersen, P., Voigtlaender, F.: Optimal approximation of piecewise smooth functions using deep ReLU neural networks. arXiv:1709.05289 (2017)

  30. Prechelt, L.: Early Stopping—But When? In: Montavon G., Orr G.B., Müller K.R. (eds.) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol. 7700, pp. 53–67, Springer, Berlin, Heidelberg (2012)

  31. Ray, D., Hesthaven, J.S.: An artificial neural network as a troubled-cell indicator. J. Comput. Phys. 367, 166–191 (2018). https://doi.org/10.1016/j.jcp.2018.04.029


  32. Ricchiuto, M., Abgrall, R.: Explicit Runge-Kutta residual distribution schemes for time dependent problems: second order case. J. Comput. Phys. 229(16), 5653–5691 (2010). https://doi.org/10.1016/j.jcp.2010.04.002


  33. Ricchiuto, M., Abgrall, R., Deconinck, H.: Application of conservative residual distribution schemes to the solution of the shallow water equations on unstructured meshes. J. Comput. Phys. 222(1), 287–331 (2007). https://doi.org/10.1016/j.jcp.2006.06.024


  34. Rojas, R.: Networks of width one are universal classifiers. In: Proceedings of the International Joint Conference on Neural Networks, vol. 4, pp. 3124–3127 (2003). https://doi.org/10.1109/IJCNN.2003.1224071

  35. Rojas, R.: Deepest neural networks. arXiv:1707.02617 (2017)

  36. Schaal, K. et al.: Astrophysical hydrodynamics with a high-order discontinuous Galerkin scheme and adaptive mesh refinement. Mon. Not. R. Astron. Soc. 453(4), 4278–4300 (2015). https://doi.org/10.1093/mnras/stv1859


  37. Snyman, J.: Practical Mathematical Optimization: an Introduction to Basic Optimization Theory and Classical and New Gradient-Based Algorithms. Applied Optimization. Springer, New York (2005). https://books.google.ch/books?id=0tFmf_UKl7oC

  38. Sutskever, I., Hinton, G.E.: Deep, narrow sigmoid belief networks are universal approximators. Neural Comput. 20(11), 2629–2636 (2008). https://doi.org/10.1162/neco.2008.12-07-661


  39. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Proceedings of the 27th International Conference on Neural Information Processing Systems—Volume 2, pp. 3104–3112. MIT Press, Cambridge (2014). http://dl.acm.org/citation.cfm?id=2969033.2969173

  40. Veiga, M.H., Abgrall, R.: Towards a general stabilisation method for conservation laws using a multilayer perceptron neural network: 1d scalar and system of equations. In: European Conference on Computational Mechanics and VII European Conference on Computational Fluid Dynamics, vol. 1, pp. 2525–2550 (2018). https://doi.org/10.5167/uzh-168538

  41. Vilar, F.: A posteriori correction of high-order discontinuous Galerkin scheme through subcell finite volume formulation and flux reconstruction. J. Comput. Phys. 387, 245–279 (2019). https://doi.org/10.1016/j.jcp.2018.10.050


  42. Weiss, K.R., Khoshgoftaar, T.M., Wang, D.: A survey of transfer learning. J. Big Data 3, 1–40 (2016)


  43. Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network. arXiv:1505.00853 (2015)


Acknowledgements

We want to thank the referees for the insightful discussion, feedback, and comments that hopefully led to an improved manuscript. The majority of this work was done at the University of Zurich, where MHV was funded by the UZH Candoc Forschungskredit grant.

Author information


Corresponding author

Correspondence to Maria Han Veiga.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Appendix A MLP Architectures

We detail the architectures tested in this work in Table A1. For this work, we fixed the activation functions to be ReLU and started from the lower bound (\(2^{28}\)) on the number of non-zero weights (as detailed in Sect. 3.1.1). We observed that this quantity did not produce very good models. Thus, we used \(2^{32}\) non-zero weights, which presupposes a constant C of order 10. It remains to specify how the weights are distributed across the different layers: in a fully connected network, consecutive layers of widths \(n_i\) and \(n_{i+1}\) contribute \(n_i n_{i+1}\) weights, so the total number of weights follows from the depth d and the number of neurons per layer.
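To make the weight counting concrete, here is a minimal sketch (ours, not the paper's code) that computes the total number of weights of a fully connected network from its layer widths; the example widths are hypothetical.

```python
# Minimal sketch: total weight count of a fully connected MLP from its
# layer widths. Each pair of consecutive layers of widths n_i and
# n_{i+1} contributes n_i * n_{i+1} weights (biases excluded).

def total_weights(widths):
    return sum(a * b for a, b in zip(widths, widths[1:]))

# Example (hypothetical): 5 inputs, three hidden layers of width 256, 1 output.
print(total_weights([5, 256, 256, 256, 1]))  # 1280 + 65536 + 65536 + 256 = 132608
```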

Table A1 Architectures


About this article


Cite this article

Abgrall, R., Han Veiga, M. Neural Network-Based Limiter with Transfer Learning. Commun. Appl. Math. Comput. 5, 532–572 (2023). https://doi.org/10.1007/s42967-020-00087-1

