Abstract
Recent works have shown that neural networks are promising parameter-free limiters for a variety of numerical schemes (Morgan et al. in A machine learning approach for detecting shocks with high-order hydrodynamic methods. https://doi.org/10.2514/6.2020-2024; Ray et al. in J Comput Phys 367: 166–191. https://doi.org/10.1016/j.jcp.2018.04.029, 2018; Veiga et al. in European Conference on Computational Mechanics and VII European Conference on Computational Fluid Dynamics, vol. 1, pp. 2525–2550. ECCM. https://doi.org/10.5167/uzh-168538, 2018). Following this trend, we train a neural network to serve as a shock-indicator function using simulation data from a Runge-Kutta discontinuous Galerkin (RKDG) method and a modal high-order limiter (Krivodonova in J Comput Phys 226: 879–896. https://doi.org/10.1016/j.jcp.2007.05.011, 2007). With this methodology, we obtain one- and two-dimensional black-box shock-indicators which are then coupled to a standard limiter. Furthermore, we describe a strategy to transfer the shock-indicator to a residual distribution (RD) scheme without the need for a full training cycle and large dataset, by finding a mapping between the solution feature spaces from an RD scheme to an RKDG scheme, both in one- and two-dimensional problems, and on Cartesian and unstructured meshes. We report on the quality of the numerical solutions when using the neural network shock-indicator coupled to a limiter, comparing its performance to traditional limiters, for both RKDG and RD schemes.
Notes
Some of the solvers used are still under development and not publicly available.
References
Abgrall, R.: Residual distribution schemes: current status and future trends. Comput. Fluids 35(7), 641–669 (2006). https://doi.org/10.1016/j.compfluid.2005.01.007
Abgrall, R., Bacigaluppi, P., Tokareva, S.: High-order residual distribution scheme for the time-dependent Euler equations of fluid dynamics. Comput. Math. Appl. 78(2), 274–297 (2019). https://doi.org/10.1016/j.camwa.2018.05.009
Bacigaluppi, P., Abgrall, R., Tokareva, S.: “A posteriori” limited high order and robust residual distribution schemes for transient simulations of fluid flows in gas dynamics. arXiv:1902.07773 (2019)
Beck, A., Zeifang, J., Schwarz, A., Flad, D.: A neural network based shock detection and localization approach for discontinuous Galerkin methods (2020). https://doi.org/10.13140/RG.2.2.20237.90085
Bergstra, J., Yamins, D., Cox, D.: Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In: Dasgupta, S., McAllester, D. (eds.) Proceedings of the 30th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 28, pp. 115–123. PMLR, Atlanta (2013). http://proceedings.mlr.press/v28/bergstra13.html
Biswas, R., Devine, K.D., Flaherty, J.E.: Parallel, adaptive finite element methods for conservation laws. Appl. Numer. Math. 14(1), 255–283 (1994). https://doi.org/10.1016/0168-9274(94)90029-9
Bottou, L., Curtis, F., Nocedal, J.: Optimization methods for large-scale machine learning. SIAM Rev. 60(2), 223–311 (2018). https://doi.org/10.1137/16M1080173
Burman, E., Fernández, M.A.: Continuous interior penalty finite element method for the time-dependent Navier-Stokes equations: space discretization and convergence. Numer. Math. 107(1), 39–77 (2007). https://doi.org/10.1007/s00211-007-0070-5
Clain, S., Diot, S., Loubère, R.: Multi-dimensional optimal order detection (MOOD)—a very high-order finite volume scheme for conservation laws on unstructured meshes. In: Fořt, J., Fürst, J., Halama, J., Herbin, R., Hubert, F. (eds.) Finite Volumes for Complex Applications VI Problems & Perspectives, pp. 263–271. Springer, Berlin (2011)
Cockburn, B., Shu, C.W.: TVB Runge-Kutta local projection discontinuous Galerkin finite element method for conservation laws II: general framework. Math. Comput. 52(186), 411–435 (1989). http://www.jstor.org/stable/2008474
Cockburn, B., Shu, C.W.: The Runge-Kutta discontinuous Galerkin method for conservation laws V. J. Comput. Phys. 141(2), 199–224 (1998). https://doi.org/10.1006/jcph.1998.5892
Cohen, T.S., Geiger, M., Weiler, M.: A general theory of equivariant CNNs on homogeneous spaces. In: Wallach, H., Larochelle, H., Beygelzimer, A., Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32, pp. 9145–9156. Curran Associates, Inc. (2019). http://papers.nips.cc/paper/9114-a-general-theory-of-equivariant-cnns-on-homogeneous-spaces.pdf
Cohen, T.S., Weiler, M., Kicanaoglu, B., Welling, M.: Gauge equivariant convolutional networks and the icosahedral CNN. arXiv:1902.04615 (2019)
Dafermos, C.: Hyperbolic Conservation Laws in Continuum Physics. Grundlehren der mathematischen Wissenschaften. Springer, Berlin (2009). https://books.google.com/books?id=49bXK26O_b4C
GitHub repository: https://github.com/hanveiga/1d-dg-nn
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press, Massachusetts (2016). http://www.deeplearningbook.org
Gottlieb, S.: On high order strong stability preserving Runge-Kutta and multi step time discretizations. J. Sci. Comput. 25(1), 105–128 (2005). https://doi.org/10.1007/BF02728985
Harten, A., Lax, P.D.: On a class of high resolution total-variation-stable finite-difference schemes. SIAM J. Numer. Anal. 21(1), 1–23 (1984). http://www.jstor.org/stable/2157043
He, Y. et al.: Streaming end-to-end speech recognition for mobile devices. In: 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6381–6385 (2019). https://doi.org/10.1109/ICASSP.2019.8682336
Hoens, T.R., Chawla, N.V.: Imbalanced Datasets: From Sampling to Classifiers. Wiley, New York (2013). https://doi.org/10.1002/9781118646106.ch3
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. ICLR (2015). arXiv:1412.6980
Krivodonova, L.: Limiters for high-order discontinuous Galerkin methods. J. Comput. Phys. 226, 879–896 (2007). https://doi.org/10.1016/j.jcp.2007.05.011
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems—Volume 1, pp. 1097–1105. Curran Associates Inc., USA (2012). http://dl.acm.org/citation.cfm?id=2999134.2999257
Kurganov, A., Tadmor, E.: Solution of two-dimensional Riemann problems for gas dynamics without Riemann problem solvers. Numer. Methods Partial Differ. Equ. 18(5), 584–608 (2002). https://doi.org/10.1002/num.10025
Lax, P.D.: Weak solutions of nonlinear hyperbolic equations and their numerical computation. Commun. Pure Appl. Math. 7(1), 159–193 (1954). https://doi.org/10.1002/cpa.3160070112
Mhaskar, H., Liao, Q., Poggio, T.A.: When and why are deep networks better than shallow ones? In: AAAI, pp. 2343–2349 (2017)
Mikolajczyk, A., Grochowski, M.: Data augmentation for improving deep learning in image classification problem. In: 2018 International Interdisciplinary PhD Workshop (IIPhDW), pp. 117–122 (2018)
Morgan, N.R., Tokareva, S., Liu, X., Morgan, A.: A machine learning approach for detecting shocks with high-order hydrodynamic methods (2020). https://doi.org/10.2514/6.2020-2024
Petersen, P., Voigtlaender, F.: Optimal approximation of piecewise smooth functions using deep ReLU neural networks. arXiv:1709.05289 (2017)
Prechelt, L.: Early Stopping—But When? In: Montavon G., Orr G.B., Müller K.R. (eds.) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol. 7700, pp. 53–67, Springer, Berlin, Heidelberg (2012)
Ray, D., Hesthaven, J.S.: An artificial neural network as a troubled-cell indicator. J. Comput. Phys. 367, 166–191 (2018). https://doi.org/10.1016/j.jcp.2018.04.029
Ricchiuto, M., Abgrall, R.: Explicit Runge-Kutta residual distribution schemes for time dependent problems: second order case. J. Comput. Phys. 229(16), 5653–5691 (2010). https://doi.org/10.1016/j.jcp.2010.04.002
Ricchiuto, M., Abgrall, R., Deconinck, H.: Application of conservative residual distribution schemes to the solution of the shallow water equations on unstructured meshes. J. Comput. Phys. 222(1), 287–331 (2007). https://doi.org/10.1016/j.jcp.2006.06.024
Rojas, R.: Networks of width one are universal classifiers. In: Proceedings of the International Joint Conference on Neural Networks, vol. 4, pp. 3124–3127 (2003). https://doi.org/10.1109/IJCNN.2003.1224071
Rojas, R.: Deepest Neural Networks. arXiv:1707.02617 (2017)
Schaal, K., et al.: Astrophysical hydrodynamics with a high-order discontinuous Galerkin scheme and adaptive mesh refinement. Mon. Not. R. Astron. Soc. 453(4), 4278–4300 (2015). https://doi.org/10.1093/mnras/stv1859
Snyman, J.: Practical Mathematical Optimization: an Introduction to Basic Optimization Theory and Classical and New Gradient-Based Algorithms. Applied Optimization. Springer, New York (2005). https://books.google.ch/books?id=0tFmf_UKl7oC
Sutskever, I., Hinton, G.E.: Deep, narrow sigmoid belief networks are universal approximators. Neural Comput. 20(11), 2629–2636 (2008). https://doi.org/10.1162/neco.2008.12-07-661
Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Proceedings of the 27th International Conference on Neural Information Processing Systems—Volume 2, pp. 3104–3112. MIT Press, Cambridge (2014). http://dl.acm.org/citation.cfm?id=2969033.2969173
Veiga, M.H., Abgrall, R.: Towards a general stabilisation method for conservation laws using a multilayer perceptron neural network: 1d scalar and system of equations. In: European Conference on Computational Mechanics and VII European Conference on Computational Fluid Dynamics, vol. 1, pp. 2525–2550 (2018). https://doi.org/10.5167/uzh-168538
Vilar, F.: A posteriori correction of high-order discontinuous Galerkin scheme through subcell finite volume formulation and flux reconstruction. J. Comput. Phys. 387, 245–279 (2019). https://doi.org/10.1016/j.jcp.2018.10.050
Weiss, K.R., Khoshgoftaar, T.M., Wang, D.: A survey of transfer learning. J. Big Data 3, 1–40 (2016)
Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network. CoRR (2015). arXiv:1505.00853
Acknowledgements
We thank the referees for their insightful discussion, feedback, and comments, which we hope have led to an improved manuscript. The majority of this work was done at the University of Zurich, where MHV was funded by a UZH Candoc Forschungskredit grant.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Appendix A MLP Architectures
We detail the architectures tested in this work in Table A1. For this work, we computed the lower bound (\(2^{28}\)) on the number of weights and fixed the activation functions to be ReLU (as detailed in Sect. 3.1.1). We observed that this quantity did not produce very good models; thus, we used (\(2^{32}\)) non-zero weights, which presupposes a constant C of order 10. It remains to specify the distribution of the weights across the different layers. The number of weights and the number of neurons are related: multiplying the depth d through the number of neurons per layer gives the total number of weights.
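As a concrete illustration of the bookkeeping between layer widths and total weight count discussed above, the following helper (our own sketch, not code from this work) counts the parameters of a fully connected MLP given its layer widths:

```python
def mlp_weight_count(layer_widths):
    """Total parameter count of a fully connected MLP.

    layer_widths: [input_dim, hidden_1, ..., hidden_d, output_dim]
    Each layer contributes (fan_in * fan_out) weights plus fan_out biases.
    """
    weights = sum(a * b for a, b in zip(layer_widths[:-1], layer_widths[1:]))
    biases = sum(layer_widths[1:])
    return weights + biases

# Example: a network with 6 inputs, three hidden layers of width 8, 1 output.
# Weights: 6*8 + 8*8 + 8*8 + 8*1 = 184; biases: 8 + 8 + 8 + 1 = 25.
n = mlp_weight_count([6, 8, 8, 8, 1])  # -> 209
```

For a fixed weight budget, such a helper makes it easy to enumerate the depth/width combinations that stay within the budget when distributing weights across layers.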
Cite this article
Abgrall, R., Han Veiga, M. Neural Network-Based Limiter with Transfer Learning. Commun. Appl. Math. Comput. 5, 532–572 (2023). https://doi.org/10.1007/s42967-020-00087-1