Flow, Turbulence and Combustion, Volume 99, Issue 1, pp 25–46

A Priori Assessment of Prediction Confidence for Data-Driven Turbulence Modeling

  • Jin-Long Wu
  • Jian-Xun Wang
  • Heng Xiao
  • Julia Ling


Although Reynolds-Averaged Navier–Stokes (RANS) equations are still the dominant tool for engineering design and analysis applications involving turbulent flows, standard RANS models are known to be unreliable in many flows of engineering relevance, including flows with separation, strong pressure gradients or mean flow curvature. With increasing amounts of three-dimensional experimental data and high-fidelity simulation data from Large Eddy Simulation (LES) and Direct Numerical Simulation (DNS), data-driven turbulence modeling has become a promising approach to increase the predictive capability of RANS simulations. However, the prediction performance of data-driven models inevitably depends on the choice of training flows. This work aims to identify a quantitative measure for a priori estimation of prediction confidence in data-driven turbulence modeling. This measure represents the distance in feature space between the training flows and the flow to be predicted. Specifically, the Mahalanobis distance and the kernel density estimation (KDE) technique are used as metrics to quantify the distance between flow data sets in feature space. To examine the relationship between these two extrapolation metrics and the machine learning model prediction performance, the flow over periodic hills at Re = 10595 is used as the test set and seven flows with different configurations are individually used as training sets. The results show that the prediction error of the Reynolds stress anisotropy is positively correlated with both the Mahalanobis distance and the KDE distance, demonstrating that both extrapolation metrics can be used to estimate the prediction confidence a priori. A quantitative comparison using correlation coefficients shows that the Mahalanobis distance is less accurate in estimating the prediction confidence than the KDE distance. The extrapolation metrics introduced in this work and the corresponding analysis provide an approach to aid in the choice of data source and to assess the prediction performance for data-driven turbulence modeling.
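The two extrapolation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic Gaussian "training flow", the feature standardization, and the Scott-type bandwidth are all assumptions made for illustration, and the normalization that the paper applies to turn these quantities into a bounded distance is not reproduced here.

```python
import numpy as np


def mahalanobis_distance(train, query):
    """Mahalanobis distance from each query point to the training-set mean.

    train: (n, d) array of training features; query: (m, d) array.
    """
    mean = train.mean(axis=0)
    cov = np.atleast_2d(np.cov(train, rowvar=False))
    # Pseudo-inverse guards against a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    diff = query - mean
    sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))


def kde_density(train, query, bandwidth=None):
    """Gaussian kernel density estimate of the training data at query points.

    Features are standardized first, so the density is expressed in
    standardized coordinates (an assumption of this sketch). Every feature
    is assumed to have nonzero variance.
    """
    n, d = train.shape
    mean = train.mean(axis=0)
    std = train.std(axis=0, ddof=1)
    z_train = (train - mean) / std
    z_query = (query - mean) / std
    if bandwidth is None:
        # Scott-type rule of thumb for standardized data (assumed choice).
        bandwidth = n ** (-1.0 / (d + 4))
    sq = ((z_query[:, None, :] - z_train[None, :, :]) ** 2).sum(axis=2)
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return np.exp(-0.5 * sq / bandwidth ** 2).sum(axis=1) / (n * norm)


# Hypothetical data: 500 training points in a 3-dimensional feature space.
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 3))
near = np.zeros((1, 3))      # query point inside the training cloud
far = np.full((1, 3), 5.0)   # query point far outside the training cloud

print("Mahalanobis:", mahalanobis_distance(train, near), mahalanobis_distance(train, far))
print("KDE density:", kde_density(train, near), kde_density(train, far))
```

A query point far from the training data yields a large Mahalanobis distance and a low estimated density, both of which signal extrapolation. The Mahalanobis distance summarizes the training set by its mean and covariance only, whereas the KDE follows the actual shape of the training distribution, at a higher computational cost.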


Turbulence modeling, Mahalanobis distance, Kernel density estimation, Random forest regression, Extrapolation, Machine learning



HX would like to thank Dr. Eric G. Paterson for numerous helpful discussions during this work.

Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. SAND2016-6700J.

Compliance with Ethical Standards

Conflict of interests

The authors declare that they have no conflict of interest.


Copyright information

© Springer Science+Business Media Dordrecht 2017

Authors and Affiliations

  • Jin-Long Wu (1)
  • Jian-Xun Wang (1)
  • Heng Xiao (1, corresponding author)
  • Julia Ling (2)
  1. Department of Aerospace and Ocean Engineering, Virginia Tech, Blacksburg, USA
  2. Thermal/Fluid Science and Engineering, Sandia National Laboratories, Livermore, USA
