1 Introduction

Combustion within energy conversion and propulsion devices such as internal combustion engines, gas turbines, and rocket engines usually occurs under turbulent conditions. The turbulence-chemistry interaction in such devices is characterized by highly nonlinear, unsteady, multi-scale, and multi-physics processes, which makes its investigation a challenging task. Although advancements in experimental diagnostics and computational tools have enabled some detailed studies, challenges remain. For example, experiments under extreme operating conditions are often limited to measurements of only a few quantities, while computational studies using high-fidelity approaches such as direct numerical simulation (DNS) and large-eddy simulation (LES) tend to be computationally expensive and limited to a few simpler problems. Specifically, DNS, where all relevant spatial and temporal scales are resolved, is used to carry out fundamental studies, but it requires simplifications in the geometry, flow conditions, or chemistry to keep the computational cost manageable. On the other hand, LES, where only the large scales are captured and the effects of the small scales are parameterized using subgrid-scale (SGS) closure models, is considered a promising strategy (Fureby and Möller 1995; Gonzalez-Juez et al. 2017; Pitsch 2006), but the computational cost of obtaining statistically converged results is also not trivial. While SGS closure for reacting LES remains an ongoing research effort for many approaches, the computational cost is a key challenge when employing the finite-rate chemistry (FRC) approach with detailed chemical mechanisms. Here, we discuss past strategies to develop machine learning (ML) tools for LES of reacting flows, with a particular focus on finite-rate kinetics.

In recent years, rapid advancements in computing resources and data storage capabilities have led to increased usage of supervised deep learning (DL) using artificial neural network (ANN) (Goodfellow et al. 2016; LeCun et al. 2015) to tackle challenging problems from several fields such as computer vision (Krizhevsky et al. 2012), speech, image and text recognition (Bishop 2006), natural language processing (Collobert and Weston 2008), health-care (Leung et al. 2014), genetic sequencing (Libbrecht and Noble 2015), materials discovery (Pilania et al. 2013), complex game playing (Silver et al. 2017), high-energy physics (Baldi et al. 2014), etc. This is primarily due to the ability of the DL to effectively deal with high-dimensional data and the modeling of complex and nonlinear relationships. DL techniques are essentially representational learning methods that employ multiple levels of representation. These techniques transform the representation at one level starting with the raw input to an abstract representation at a higher level, which allows learning complex nonlinear relationships. The layers of features are learned from huge datasets using general-purpose learning procedures. Such a representational learning approach enables the discovery of intricate structures in high-dimensional data and is therefore amenable to different domains of science and engineering. Furthermore, the recent advancements in the back-propagation algorithm, mini-batch stochastic gradient, novel architectures such as convolutional neural network (CNN), and recurrent neural network (RNN) have also accelerated a wider adoption of DL techniques in different domains of science and engineering (LeCun et al. 2015).

To apply this approach to LES of reacting flows, the data-driven modeling through DL must focus on performance improvements by generalizing a model that captures all variations within the data. A conventional deep neural network (DNN) for modeling of the reaction-rate term is shown in Fig. 1, which is a multilayer fully connected feed-forward network where the information flows in a forward direction from input to output. Here, the input comprises the species mass fractions (\(Y_i\) with \(i = 1, 2, \ldots , k\)) and the temperature (T), and the output comprises the corresponding reaction-rate terms (\(\dot{\omega }_i\)), where k denotes the total number of chemical species. Mathematically, a DNN defines the mapping \(\mathcal {A}: \boldsymbol{x} \rightarrow \boldsymbol{y}\), where \(\boldsymbol{x}\) and \(\boldsymbol{y}\) denote input and output variables, respectively, and \(\mathcal {A}\) represents a composition of many different functions, which can be represented through a network structure. A typical DNN comprises an input layer, an output layer, and more than one hidden layer. Each layer consists of several nodes, which are connected to all the nodes in the previous and the following layers. The complexity of a DNN increases with an increase in the number of hidden layers and the number of nodes per hidden layer. Such a basic network is also referred to as a multilayer perceptron (MLP). It has been shown that MLPs are universal function approximators (Hornik et al. 1989). Therefore, with enough layers and nodes, MLPs can be used to model arbitrarily complex and highly nonlinear functional forms, such as those needed for closure of the SGS terms while performing LES.

Fig. 1 Schematic of a multi-layer perceptron (MLP) for modeling of the reaction-rate term with two hidden layers having the vector \(\boldsymbol{x} = (Y_1,~Y_2,~\ldots ,Y_k,~T)\) as an input and the vector \(\boldsymbol{y} = (\dot{\omega }_1,~\dot{\omega }_2,~\ldots ~,\dot{\omega }_k)\) as an output
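The forward pass of the MLP sketched in Fig. 1 can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions: the layer sizes, the random initialization, the choice of tanh hidden layers with a linear output layer, and the helper names `init_layer`, `dense`, and `mlp_forward` are all hypothetical, and the network is untrained, so the returned values carry no physical meaning.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, weights, biases, activation=None):
    """Apply one layer: activation(W x + b). No activation means a linear layer."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in z] if activation else z

def mlp_forward(x, layers):
    """Propagate x through tanh hidden layers and a linear output layer."""
    for weights, biases in layers[:-1]:
        x = dense(x, weights, biases, math.tanh)
    weights, biases = layers[-1]
    return dense(x, weights, biases)

# Hypothetical sizes: k = 3 species, two hidden layers of 8 nodes each.
k, hidden = 3, 8
rng = random.Random(0)
layers = [init_layer(k + 1, hidden, rng),
          init_layer(hidden, hidden, rng),
          init_layer(hidden, k, rng)]

x = [0.2, 0.7, 0.1, 0.5]        # (Y_1, Y_2, Y_3, T), with T assumed already scaled to O(1)
omega = mlp_forward(x, layers)  # untrained network: placeholder values only
print(len(omega))
```

The input vector has k + 1 entries and the output has k entries, matching the mapping from \(\boldsymbol{x}\) to \(\boldsymbol{y}\) in the caption of Fig. 1.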

ANN algorithms have been used for SGS closure models in the context of Reynolds-averaged Navier-Stokes (RANS) and LES in past studies of both non-reacting (Beck et al. 2019; Duraisamy et al. 2015, 2019; Ling et al. 2016; Maulik and San 2017; Vollant et al. 2017) and reacting (Christo et al. 1995, 1996; Lapeyre et al. 2019; Seltz et al. 2019; Sen et al. 2010; Yellapantula et al. 2020) flows. In the context of LES of turbulent combustion, there are two key areas of relevance: (a) the need to use detailed chemical kinetics for an accurate representation of the thermochemical state space, and (b) the modeling of the filtered reaction-rate term to account for the SGS turbulence-chemistry interaction. Past studies have focused on tackling both of these challenges, and further research is still underway.

To address the challenge related to thermochemical representation, detailed chemical kinetics can be used for accurate predictions over a wide range of operating conditions. In contrast, although simplified chemical mechanisms are computationally expedient, they are known to affect the quality of predictions (Bilger et al. 2005). For several reacting flow conditions, flamelet (Peters 2000; Pitsch 2006) and other low-dimensional manifold-based approaches (Maas and Pope 1992; Bradley et al. 1988; Van Oijen and De Goey 2000) have been popular for their computational tractability. ANN has also been used to store flamelet libraries to reduce the computational storage requirements (Kempf et al. 2005; Ihme et al. 2009; Zhang et al. 2020). Additionally, it has been used to model SGS source and transport terms (Seltz et al. 2019). Although a low-dimensional manifold formulation can be used for some problems, a detailed finite-rate chemical mechanism is often needed to accurately capture the flame dynamics and other features such as extinction, re-ignition, lean blowout, and pollutant emissions. However, FRC-based LES, referred to hereafter as FRC-LES, becomes computationally intractable for simulation of practical applications when using a detailed chemical mechanism. The higher computational cost of FRC-LES is associated with the need to solve a highly stiff ODE system resulting from a wide range of time scales associated with different chemical species in a complex chemical mechanism, and the need to transport a large number of chemical species. In addition to approaches for computational cost reduction such as hybrid transported-tabulated chemistry (HTTC) (Ribert et al. 2014) and dynamic adaptive chemistry (DAC) (Yang et al. 2017), to name a few, ANN algorithms have also been used to address the computational cost concerns of FRC-LES (Christo et al. 1995, 1996; Sen et al. 2010; Sen and Menon 2010; Zhou et al. 2013; Franke et al. 2017; Sinaei and Tabejamaat 2017; Ranade et al. 2021).

A major challenge for LES of turbulent combustion is the need for accurate modeling of the filtered reaction-rate term. It has led to numerous physics-based SGS closure models for both low-dimensional manifold and FRC-based approaches. The reader is referred to the review articles (Pitsch 2006; Fureby 2009; Gonzalez-Juez et al. 2017), where challenges of different modeling paradigms and strengths and limitations of several modeling approaches are discussed. The modeling of the SGS turbulence-chemistry interaction is key for the accurate prediction of the flame dynamics. ANN-based strategies have been employed for computational cost reduction of filtered reaction-rate term modeling within both low-dimensional manifold (Nikolaou et al. 2019; Lapeyre et al. 2019; Seltz et al. 2019; Yellapantula et al. 2020) and FRC (Sen and Menon 2010; Zhou et al. 2013; Franke et al. 2017; Sen et al. 2010) formulations.

Although ANN algorithms have shown some success in LES of turbulent combustion, further studies are needed to examine the predictive capabilities and robustness of such algorithms. The focus of this chapter is to discuss the application of ANN while employing one specific subgrid model, the linear eddy mixing (LEM) model, in LES (referred to as LEMLES) (Menon et al. 1993; Menon and Kerstein 2011). LEMLES is a two-scale strategy, where the species transport equations are solved using a two-step procedure. In the first step, the species transport equations and FRC mechanism are solved at the subgrid level using the 1D LEM model (Kerstein 1989), where the LEM model acts as an embedded SGS model for the species equation as viewed on the LES space- and time-scales. The second step simulates the evolution of the computed subgrid scalar fields at the resolved LES level. LEMLES has been extensively used in past studies for investigation of a wide range of applications, such as gas turbine combustors (Kim et al. 1999), rocket combustors (Srinivasan et al. 2015), spray combustion (Sankaran and Menon 2002), and scramjets (Menon and Jou 1991). Although LEMLES allows for the handling of arbitrarily complex chemical mechanisms, its use has so far been limited to moderately complex chemical mechanisms due to the cost associated with the computation of stiff kinetics. Using an ANN within the framework of LEMLES addresses this issue (Sen et al. 2010; Sen and Menon 2010), which is the main focus of this chapter.

The chapter is organized as follows. An overview of ML strategies for modeling turbulent combustion reported in the literature is presented in Sect. 2. The formulation and application of ANN within LEMLES are discussed in Sects. 3 and 4. Section 5 discusses the limitations of the past studies that employed ANN within LEMLES. Section 6 concludes with a discussion of the future of ML for subgrid modeling of turbulent combustion using LEM and their implications.

2 ML for Modeling of Turbulent Combustion

As stated in Sect. 1, ML algorithms have been used to reduce the computational cost of finite rate chemistry while using different chemistry modeling paradigms (low-dimensional manifold or FRC). So, first, a brief overview of ANN-based modeling strategy for chemistry and the constituents of ANN models are discussed. Afterward, a summary of studies focused on the use of ANN in LES of turbulent combustion is presented.

2.1 ANN Model for Chemistry

While using the FRC approach, the reaction rate terms are obtained by solving a system of first-order ordinary differential equations (ODEs) expressed as:

$$\begin{aligned} \frac{d Y_k}{d t} = \mathcal {F}_k(Y_k, T, P) = \dot{\omega }_k, \qquad k=1,2,\ldots N_\textrm{s}, \end{aligned}$$
(1)

where \(Y_{k}\) and \(\dot{\omega }_k\) denote the mass fraction and the reaction rate for the kth species. Here, \( \dot{\omega }_k\) can be obtained for a prescribed chemical mechanism and associated kinetic parameters, along with temperature T and pressure P. The system of ODEs given by Eq. 1 is in general stiff, particularly for detailed chemical mechanisms, due to a wide range of timescales associated with different chemical species. Therefore, to solve Eq. 1, stiff ODE solvers such as the fully implicit double-precision variable-coefficient ODE solver (DVODE) (Brown et al. 1989) are needed, which tend to be expensive. ANN can be used to approximate the ODEs with nonlinear regression, thus addressing the issue of computational cost.
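The role of stiffness can be illustrated on a toy problem. The sketch below is not a chemical mechanism and does not use DVODE; it assumes a single linear equation \(dy/dt = -\lambda y\) with a hypothetical fast rate constant, and compares a plain explicit (forward Euler) step with an implicit (backward Euler) step at a time step much larger than \(1/\lambda\).

```python
def explicit_euler(y0, lam, dt, n_steps):
    """Forward Euler for dy/dt = -lam * y."""
    y = y0
    for _ in range(n_steps):
        y = y + dt * (-lam * y)
    return y

def implicit_euler(y0, lam, dt, n_steps):
    """Backward Euler for dy/dt = -lam * y; the implicit update has a closed form here."""
    y = y0
    for _ in range(n_steps):
        y = y / (1.0 + lam * dt)
    return y

lam = 1000.0   # hypothetical fast rate constant (1/s)
dt = 0.01      # time step suited to slow dynamics, far larger than 1/lam
y_explicit = explicit_euler(1.0, lam, dt, 10)
y_implicit = implicit_euler(1.0, lam, dt, 10)
print(y_explicit, y_implicit)
```

With these values the explicit update multiplies the solution by \(1 - \lambda \Delta t = -9\) at every step and diverges, whereas the implicit update decays toward zero as the exact solution does. For a nonlinear chemical system the implicit step requires the solution of a coupled nonlinear system at each step, which is one source of the cost mentioned above.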

ANN regression can be obtained through an MLP (Bishop 1995; Haykin and Network 2004), which involves a sum of nonlinear basis functions, also referred to as activation functions, and coefficients, which include biases and weights. A typical MLP with inputs (\(Y_{k}\), T) and outputs (\(\dot{\omega }_k\)) is shown in Fig. 1. The ANN extracts the complex relations embedded within a given input/output training dataset through a learning procedure, and the extracted relations can later be used to predict states on which the training was not performed. The learning process essentially adjusts the biases and weights for each layer of the MLP to obtain a minimal error at the output layer by using a back-propagation algorithm. These optimal weights and biases, along with the specific MLP configuration, form the ANN model. The resulting ANN model can then be used for an efficient representation of the complex dynamics of chemistry described by Eq. 1.
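The learning procedure described above can be sketched for a network small enough to differentiate by hand. The example below is an illustration only, under several assumptions: a single hidden tanh layer, full-batch gradient descent on a mean-squared error, and a hypothetical Arrhenius-like target \(y = \exp(-2/x)\) with x standing in for a scaled temperature. The function name `train_mlp` and all numerical settings are chosen for illustration and are not taken from the studies cited in this chapter.

```python
import math
import random

def train_mlp(samples, n_hidden=8, lr=0.05, epochs=500, seed=0):
    """Fit y ~ sum_j w2[j] * tanh(w1[j] * x + b1[j]) + b2 by full-batch gradient descent."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    n = len(samples)
    history = []
    for _ in range(epochs):
        g_w1 = [0.0] * n_hidden
        g_b1 = [0.0] * n_hidden
        g_w2 = [0.0] * n_hidden
        g_b2 = 0.0
        loss = 0.0
        for x, y in samples:
            # Forward pass through the hidden layer and the linear output.
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(n_hidden)]
            pred = sum(w2[j] * h[j] for j in range(n_hidden)) + b2
            err = pred - y
            loss += err * err / n
            # Backward pass: chain rule from the output error to each coefficient.
            d_pred = 2.0 * err / n
            g_b2 += d_pred
            for j in range(n_hidden):
                g_w2[j] += d_pred * h[j]
                d_z = d_pred * w2[j] * (1.0 - h[j] * h[j])  # d tanh(z)/dz = 1 - tanh(z)^2
                g_w1[j] += d_z * x
                g_b1[j] += d_z
        history.append(loss)
        # Gradient-descent update of the weights and biases.
        for j in range(n_hidden):
            w1[j] -= lr * g_w1[j]
            b1[j] -= lr * g_b1[j]
            w2[j] -= lr * g_w2[j]
        b2 -= lr * g_b2
    return (w1, b1, w2, b2), history

# Hypothetical training set: x in [0.5, 1.5], Arrhenius-like target exp(-2/x).
samples = [(0.5 + 0.05 * i, math.exp(-2.0 / (0.5 + 0.05 * i))) for i in range(21)]
params, history = train_mlp(samples)
print(history[0], history[-1])
```

The recorded loss history gives a simple check that the weights and biases are moving toward a lower output error; practical applications would rely on an established DL library rather than hand-written gradients.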

A typical ANN model includes parameters, hyperparameters, and training strategies. The parameters, such as the model coefficients, are updated by the ANN model during the learning process, and they only require initialization. The hyperparameters, such as the components of the network architecture, are specified for a particular problem and vary from one problem to the next. These include the number of hidden layers and neurons, learning rate, momentum during the back-propagation algorithm, activation function, epochs, mini-batch size, and dropout. A brief overview of the hyperparameters and training strategies is given next.

The two key hyperparameters are the number of hidden layers and the number of neurons per layer, which are needed for an accurate representation of complex nonlinear input/output relationships. Although increasing them generally improves the accuracy, it also makes the network more expensive to train and evaluate, and eventually the accuracy tends to stagnate. The activation function is the nonlinear function through which the weighted sum of a neuron's inputs is passed to obtain a non-linear output. The choice of activation function affects the efficiency and accuracy of the ANN model. Commonly used activation functions include the hyperbolic tangent (\(\tanh \)), the rectified linear unit (ReLU), and the sigmoid.
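For concreteness, the three activation functions named above, together with the output of a single neuron, can be written as follows. This is a minimal sketch; the helper name `neuron` and the sample weights are hypothetical.

```python
import math

def tanh(z):
    """Hyperbolic tangent, output in (-1, 1)."""
    return math.tanh(z)

def relu(z):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)

def sigmoid(z):
    """Logistic sigmoid, output in (0, 1); written to avoid overflow for large |z|."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def neuron(inputs, weights, bias, activation):
    """Pass the weighted sum of the inputs plus a bias through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

out = neuron([1.0, 2.0], [0.5, -0.25], 0.1, relu)  # weighted sum is 0.1
print(out, sigmoid(0.0), tanh(0.0))
```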

When dealing with large datasets, it is inefficient to use the entire dataset for each training update. Therefore, small batches of data are typically used for efficient training, although care needs to be taken to avoid overfitting, in which case the model would have difficulty fitting any new data. The number of epochs denotes the number of times the algorithm trains on the entire dataset, and its value is also closely associated with the accuracy of the model.
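The relation between mini-batches and epochs can be made explicit with a short sketch. The generator name `minibatches` and the sizes are hypothetical, and the parameter update itself is left as a comment.

```python
import random

def minibatches(data, batch_size, rng):
    """Yield shuffled mini-batches that together cover the dataset exactly once."""
    order = list(range(len(data)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [data[i] for i in order[start:start + batch_size]]

data = list(range(10))   # stand-in for 10 training samples
rng = random.Random(0)
n_epochs, batch_size = 3, 4

updates = 0
for epoch in range(n_epochs):
    seen = []
    for batch in minibatches(data, batch_size, rng):
        # One update of the weights and biases would be performed here.
        updates += 1
        seen.extend(batch)
    # Every sample is visited exactly once per epoch.
    assert sorted(seen) == data
print(updates)
```

With 10 samples and a batch size of 4, each epoch consists of three updates (batches of 4, 4, and 2 samples), so three epochs give nine updates in total.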

The strategies that are commonly specified while obtaining the ANN model include initialization of the parameters, data normalization, optimization algorithm, and regularization. The initialization of the parameters can be performed based on the chosen activation functions, and it affects the efficiency of the ANN model. In several applications, the input data have different scales, which can affect the rate of convergence during the training of the ANN model. In combustion, for example, the inputs comprise temperature and species mass fractions, which differ by several orders of magnitude; normalization therefore becomes imperative for improved performance. The optimizers are algorithms used during the training to reduce the loss function, which in turn is used to update the weights. The choice of optimizer can directly affect the convergence of the model during the training stage. Some commonly used optimizers include Adam, gradient descent, and stochastic gradient descent. The loss function needs to be defined during the training to compute the model error. The regularization strategy is useful to avoid overfitting of the ANN model.
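One common form of normalization is to standardize each input to zero mean and unit variance over the training set. The sketch below assumes this particular choice (other scalings, such as min-max, are equally possible), and the sample values and the helper names `fit_scaler` and `transform` are hypothetical.

```python
import math

def fit_scaler(rows):
    """Compute the per-feature mean and standard deviation of the training inputs."""
    n = len(rows)
    n_feat = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_feat)]
    stds = []
    for j in range(n_feat):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        stds.append(math.sqrt(var) or 1.0)  # guard against a constant feature
    return means, stds

def transform(rows, means, stds):
    """Scale each feature to zero mean and unit variance."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

# Hypothetical samples: (temperature in K, mass fraction of a minor species).
rows = [[300.0, 1e-6], [1200.0, 5e-4], [2100.0, 2e-2]]
means, stds = fit_scaler(rows)
scaled = transform(rows, means, stds)
print(scaled)
```

After scaling, both inputs are of order one even though the raw temperature and mass fraction differ by many orders of magnitude. The same means and standard deviations must be reused when the trained model is evaluated on new data.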

It is apparent that a robust ANN model requires a careful selection of parameters, hyperparameters, and training strategies. This becomes even more challenging for turbulent combustion, which is marked by multi-scale and highly nonlinear processes with multiple regimes and modes of combustion where complex relationships between variables representing the thermochemical space exist. Therefore, usually, a significant amount of tuning is needed to realize a robust ANN model for a particular turbulent combustion application.

2.2 LES of Turbulent Combustion Using ANN

An overview of past studies focused on the use of ANN while performing LES of turbulent combustion is summarized in Table 1. The table includes some well-established turbulent combustion models that are used with either a low-dimensional manifold or a finite-rate representation for chemistry. The FRC models include LEMLES and transported probability density function (TPDF) approaches and the low-dimensional manifold approaches include flamelet and flame surface density (FSD) approaches. It can be observed that the ANN-based strategy has been used to study canonical as well as realistic flow configurations. In addition, both premixed and non-premixed modes of combustion have been examined. This illustrates a wide range of applicability of the use of ANN for LES of turbulent combustion.

Table 1 Summary of contributions to application of ML in modeling of turbulent combustion. The ANN model components are labeled as, T: Training data, O: Optimization algorithm, f: Activation function, L: Loss function

Some key details of the ANN models employed by the past studies are also summarized in Table 1 to identify if there are any commonly used constituents of the ANN model. These constituents are labeled as ‘T’, ‘O’, ‘f’, and ‘L’ corresponding to the type of training datasets, the optimization algorithm, the activation function, and the loss function, respectively. As discussed in Sect. 2.1, these are some of the key parameters describing the ANN model.

In general, the training of the ANN model has been performed using different types of datasets such as one-dimensional (1D) laminar flamelet, 1D LEM, and DNS datasets. Each type of dataset has advantages and limitations. For example, training based solely on a 1D laminar flamelet cannot account for the effects of turbulence-chemistry interactions. While this is partly addressed in training based on the 1D LEM dataset, some key features of turbulent combustion such as large-scale curvature effects are not accounted for. The DNS datasets account for all possible states for a particular test configuration and appear to be better compared to the other two approaches. However, a model trained on DNS data has limited predictive capability for conditions that were not present in the DNS dataset, and generating such datasets is computationally prohibitive.

The activation function for a neuron in the ANN model defines the output of that neuron for a given input set. Similar to other fields where ANN has been used, all three widely popular activation functions, namely, \(\tanh \), ReLU, and sigmoid functions (Karlik and Olgac 2011; Nwankpa et al. 2018) have been used while performing LES of turbulent combustion. For the optimizer, the stochastic gradient descent (SGD) algorithm has been typically used. However, some studies have also used Widrow-Hoff (WH) and Levenberg–Marquardt (LM) algorithms. Finally, mean-squared error (MSE) has been used commonly for the loss function in these studies.

Most of the studies summarized in Table 1 demonstrate an improved performance in terms of speedup of chemistry computation as compared to a conventional direct integration (DI) approach for handling stiff kinetics (other studies may exist and hence, this list is not considered comprehensive). In addition, these studies have also demonstrated the benefits of the use of ANN in terms of reduced computational storage requirements. Some recent studies relying on the use of CNN (Lapeyre et al. 2019; Ren et al. 2021) have shown the robustness of the approach for accurately simulating realistic flow configurations where the performance of the CNN based subgrid model was shown to be better compared to reference algebraic closures. Overall, the past and recent studies clearly demonstrate the potential of ANN-based modeling of turbulent combustion. However, further studies are also needed to identify the best practices in specifying the hyperparameters and the strategies for attaining a successful and accurate ANN model.

3 Mathematical Formulation with ANN

In this section, the mathematical formulation of LEMLES with the use of ANN for the modeling of chemistry is discussed. First, the governing equations for FRC-LES and the subgrid modeling of the scalar fields using LEM are described. Afterward, two approaches using ANN, either to model the resolved reaction rates at the subgrid level or to directly model the filtered reaction rates including the subgrid effects are discussed.

3.1 Governing Equations and Subgrid Models

3.1.1 Large-Eddy Simulation

The LES equations are obtained through Favre filtering of the compressible multi-species Navier-Stokes equations, which leads to the following conservation equations for mass, momentum, energy, and species mass:

$$\begin{aligned} \frac{\partial {\overline{\rho }}}{\partial {t}} + \frac{\partial {\overline{\rho }\widetilde{u_{i}}}}{\partial {x_{i}}} = 0, \end{aligned}$$
(2)
$$\begin{aligned} \frac{\partial {\overline{\rho }\widetilde{u_{i}}}}{\partial {t}} + \frac{\partial }{\partial {x_{j}}} \left[ \overline{\rho }\widetilde{u_{i}}\widetilde{u_{j}} + \overline{P}\delta _{ij} - \overline{\tau }_{ij} + \tau _{ij}^\textrm{sgs} \right] = 0, \end{aligned}$$
(3)
$$\begin{aligned} \frac{\partial {\overline{\rho }\widetilde{E}}}{\partial {t}} + \frac{\partial }{\partial {x_{i}}} \left[ \left( \overline{\rho }\widetilde{E}+\overline{P}\right) \widetilde{u_{i}} + \overline{q}_{i} - \widetilde{u_{j}}\overline{\tau }_{ij} + H_{i}^\textrm{sgs} + \sigma _{i}^\textrm{sgs} \right] = 0, \end{aligned}$$
(4)
$$\begin{aligned} \frac{\partial \overline{\rho } \widetilde{Y}_k}{\partial t} + \frac{\partial }{\partial x_i} \left[ \overline{\rho } \left( \widetilde{Y}_k \widetilde{u}_i + \widetilde{Y}_k \widetilde{V}_{i,k} \right) + \mathcal {Y}_{i,k}^\textrm{sgs} + \theta _{i,k}^\textrm{sgs} \right] = \overline{\dot{\omega }}_{k} ~~~k=1,...,N_s. \end{aligned}$$
(5)

Here, \(\overline{f}\) denotes a spatially filtered quantity corresponding to the variable f, and \(\widetilde{f}\) is a Favre-filtered quantity, which is defined as: \(\widetilde{f}=\overline{\rho f}/\overline{\rho }\). In the above equations, \(\rho \) is the density, \(u_i\) is the velocity vector, P represents the pressure, E is the total energy per unit mass, \(Y_k\) is the mass fraction of the kth species, and \(N_s\) is the total number of species. In addition, \(\tau _{ij}\) is the viscous stress tensor, \(q_{i}\) is the heat flux vector, and \(V_{i,k}\), and \(\dot{\omega }_{k}\) are species diffusion velocity vector and the reaction-rate for the kth species, respectively. The terms with superscript ‘sgs’ are unclosed terms resulting from the filtering operation, which require additional closure models.
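The difference between the spatially filtered and the Favre-filtered quantity can be illustrated on a discrete 1D profile. The sketch below assumes a simple box (top-hat) filter that is truncated at the ends of the domain; the filter type, the sample values, and the function names `box_filter` and `favre_filter` are assumptions made for illustration.

```python
def box_filter(values, half_width=1):
    """Top-hat filter: average over a window of up to 2*half_width + 1 cells."""
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = values[lo:hi]
        out.append(sum(window) / len(window))
    return out

def favre_filter(rho, f, half_width=1):
    """Density-weighted filter: filtered(rho * f) divided by filtered(rho)."""
    rho_bar = box_filter(rho, half_width)
    rho_f_bar = box_filter([r * v for r, v in zip(rho, f)], half_width)
    return [a / b for a, b in zip(rho_f_bar, rho_bar)]

# Hypothetical profile across a flame: dense cold reactants, light hot products.
rho = [1.0, 1.0, 0.2, 0.2]
f = [0.0, 0.0, 1.0, 1.0]   # e.g. a progress variable
f_bar = box_filter(f)
f_tilde = favre_filter(rho, f)
print(f_bar[1], f_tilde[1])
```

In the second cell the plain filter gives 1/3, whereas the Favre filter gives 0.2/2.2 (about 0.09), because the light products contribute less to the density-weighted average. For uniform density the two filters coincide.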

The total energy per unit mass in Eq. 4, E, is defined as the sum of the internal energy per unit mass (e) and the kinetic energy per unit mass. The corresponding Favre-filtered total energy per unit mass, i.e., \(\widetilde{E}\), is given as the sum of \(\widetilde{e}\), the resolved kinetic energy per unit mass \(\left( \widetilde{u_i}\widetilde{u_i}\right) /2\), and the SGS kinetic energy per unit mass \(k^\textrm{sgs}=\left( \widetilde{u_iu_i}-\widetilde{u_i}\widetilde{u_i}\right) /2\).

The above system of conservation equations is closed by using an equation of state through: \(\overline{P}=\overline{\rho }\left( \widetilde{R}\widetilde{T}+T^\textrm{sgs}\right) \), and the filtered enthalpy per unit mass, which is defined as: \(\widetilde{h}=\left( \Sigma _{k=1}^{N_S}\widetilde{Y}_k \widetilde{h}_{k} + E_k^\textrm{sgs}\right) + T^\textrm{sgs}\). Here, \(\widetilde{h}_{k}\) is the specific enthalpy of the kth species, \(\widetilde{R}\) is the mixture gas constant and \(T^\textrm{sgs}\) is an unclosed term resulting from the filtering of the equation of state.

The filtered viscous stress tensor, \(\overline{\tau }_{ij}\), and the filtered heat-flux vector, \(\overline{q}_i\), are given by

$$\begin{aligned} \overline{\tau }_{ij} = 2\overline{\mu {S}_{ij}}-\frac{2}{3}\overline{ \mu S_{kk}}\delta _{ij} \approx 2\overline{\mu }\left( \widetilde{S}_{ij}-\frac{1}{3}\widetilde{S}_{kk}\delta _{ij}\right) , \end{aligned}$$
(6)
$$\begin{aligned} \overline{q}_i = - \overline{\kappa \frac{\partial T}{\partial x_i}} + \overline{\rho }\sum _{k=1}^{N_S}\widetilde{h}_k\widetilde{Y}_k\widetilde{V}_{i,k} + \sum _{k=1}^{N_S}q_{i,k}^\textrm{sgs} \approx - \overline{\kappa }\frac{\partial \widetilde{T}}{\partial x_i} + \overline{\rho }\sum _{k=1}^{N_S}\widetilde{h}_k\widetilde{Y}_k\widetilde{V}_{i,k} + \sum _{k=1}^{N_S}q_{i,k}^\textrm{sgs}, \end{aligned}$$
(7)

where \(\widetilde{S}_{ij}\) is the resolved strain-rate tensor, and \(\overline{\mu }\) and \(\overline{\kappa }\) are the filtered viscosity and thermal conductivity, respectively, which are approximated using the resolved quantities.

The SGS terms appearing in the above equations require further modeling. These terms are given as: \(\tau _{ij}^\textrm{sgs}=\overline{\rho }\left( \widetilde{u_iu_j}-\widetilde{u_i}\widetilde{u_j}\right) \), \(H_{i}^\textrm{sgs}=\overline{\rho }\left( \widetilde{E u_i}-\widetilde{E}\widetilde{u}_i\right) + \left( \overline{u_i P}-\widetilde{u}_i\overline{P}\right) \), \(\sigma _i^\textrm{sgs}=\left( \overline{u_j\tau _{ij}}-\widetilde{u}_j\overline{\tau }_{ij}\right) \), \(\mathcal {Y}_{i,k}^\textrm{sgs}=\overline{\rho }\left( \widetilde{u_i Y_k}-\widetilde{u}_i \widetilde{Y}_k\right) \), \(\theta _{i,k}^\textrm{sgs}=\overline{\rho }\left( \widetilde{V_{i,k}Y_k}-\widetilde{V}_{i,k}\widetilde{Y}_k\right) \), \(q_{i,k}^\textrm{sgs}=\overline{\rho }\left( \widetilde{h_k Y_k V_{i,k}}-\widetilde{h}_k\widetilde{Y}_k\widetilde{V}_{i,k}\right) \), \(T^\textrm{sgs}=\widetilde{RT}-\widetilde{R}\widetilde{T}\), and \(E_k^\textrm{sgs} = \widetilde{Y_k e_k(T)} - \widetilde{Y}_k e_k(\widetilde{T})\), which result from the application of filtering operation to the non-linear terms. In the expressions for \(\theta _{i,k}^\textrm{sgs}\), \(q_{i,k}^\textrm{sgs}\) and \(E_k^\textrm{sgs}\) here, the repeated index k does not imply summation. Further details about these terms, their physical relevance and terms that are typically neglected in LES studies are discussed elsewhere (Fureby and Möller 1995; Ranjan et al. 2016).

In the context of reacting flows, \(\mathcal {Y}_{i,k}^\textrm{sgs}\), \(\theta _{i,k}^\textrm{sgs}\), \(q_{i,k}^\textrm{sgs}\), \(T^\textrm{sgs}\), \(E_k^\textrm{sgs}\) and \(\overline{\dot{\omega }}_{k}\) require closure models. Typically, \(q_{i,k}^\textrm{sgs}\), \(T^\textrm{sgs}\), \(\theta _{i,k}^\textrm{sgs}\), and \(E_k^\textrm{sgs}\) are neglected in LES (Fureby and Möller 1995), and therefore, these terms are neglected here as well. The modeling of the SGS scalar flux \(\mathcal {Y}_{i,k}^\textrm{sgs}\) and the filtered reaction rate \(\overline{\dot{\omega }}_{k}\) is the key focus here, and they are discussed further in the following sections.

3.1.2 Subgrid Modeling Using LEM

The linear eddy mixing (LEM) model (Kerstein 1989) is a stochastic approach to model the effects of 3D turbulent mixing in a 1D domain. It was originally a stand-alone model to account for the interactions between turbulence, molecular diffusion, and reaction kinetics. In LES, the unsteady species and temperature evolution equations are solved on a 1D subdomain embedded inside each of the LES cells, where the reaction and the diffusion processes are locally resolved, but the effects of 3D (assumed isotropic) turbulence are included via randomized stirring events. The governing equations for 1D LEM are given by

$$\begin{aligned} \rho \frac{\partial Y_{k}}{\partial t}&= F_{k, \mathrm{stir}} - \frac{\partial }{\partial s} \left( \rho Y_{k} V_{s,k} \right) + \dot{\omega }_{k},\end{aligned}$$
(8)
$$\begin{aligned} \rho C_{p,\mathrm{mix}}\frac{\partial T}{\partial t}&= F_{T, \mathrm{stir}} + \frac{\partial }{\partial s}\left( \kappa \frac{\partial T}{\partial s} \right) - \frac{\partial }{\partial s} \left( \sum _{k=1}^{N_S} h_{k} \rho Y_{k} V_{s,k} \right) - \sum _{k=1}^{N_S} h_{k}\dot{\omega }_{k}, \end{aligned}$$
(9)

where ‘s’ represents the co-ordinate along the 1D LEM domain. The terms \(F_{k, \mathrm{stir}}\) and \(F_{T, \mathrm{stir}}\) represent stirring events in the above equations. The turbulent stirring is implemented as stochastic events (based on the so-called triplet maps (Kerstein 1989)) that attempt to mimic the effect of vortices on the scalar field. Successive folding and compressive motions are modeled during these events, with their time and length scales governed by the nature of the turbulence. This also allows for capturing a thickened reaction zone at high turbulence intensity, as the stirring time-scales get smaller and small-sized eddies can disturb the reactive/diffusive flame structure.
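A single stirring event can be sketched as a discrete triplet map acting on a segment of the 1D scalar field. In the commonly used discrete form assumed here, the segment is compressed by a factor of three, three copies are placed side by side, and the middle copy is reversed, which amounts to a permutation of the cells. The stochastic selection of the eddy location, size, and event frequency is not shown, and the function name `triplet_map` is hypothetical.

```python
def triplet_map(field, start, length):
    """Apply a discrete triplet map to field[start:start + length].

    The segment length must be a positive multiple of 3. The map only permutes
    cells, so the scalar content of the segment is conserved.
    """
    if length <= 0 or length % 3 != 0:
        raise ValueError("segment length must be a positive multiple of 3")
    if start < 0 or start + length > len(field):
        raise ValueError("segment must lie inside the field")
    seg = field[start:start + length]
    first = seg[0::3]            # compressed copy, original orientation
    middle = seg[1::3][::-1]     # compressed copy, reversed (the fold)
    last = seg[2::3]             # compressed copy, original orientation
    return field[:start] + first + middle + last + field[start + length:]

ramp = [float(i) for i in range(9)]   # monotone scalar profile
stirred = triplet_map(ramp, 0, 9)
print(stirred)   # [0.0, 3.0, 6.0, 7.0, 4.0, 1.0, 2.0, 5.0, 8.0]
```

The monotone ramp is turned into three steeper segments, which increases the local scalar gradients in the way that compressive strain by an eddy would, while the values themselves are only rearranged.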

Fig. 2 Sketch of the 1D LEM embedded along the flame for its standalone application (a), or within the LES cells for LEMLES/RRLES (b)

The 1D LEM domain is notionally aligned in the flame-normal direction as shown in Fig. 2a. The LEM has also been coupled with LES for subgrid closure of the terms discussed in the previous section, wherein, the 1D LEM domain is embedded within each LES cell, as shown in Fig. 2b. Two approaches, linear eddy mixing model with large eddy simulation (LEMLES) (Menon and Kerstein 2011), and reaction-rate closure for large eddy simulation (RRLES) (Ranjan et al. 2016; Panchal et al. 2019) have been used in the past, and they are briefly summarized below.

The LEMLES approach models the species evolution equation, i.e., Eq. 5 with unclosed terms \(\mathcal {Y}_{i,k}^\textrm{sgs}\) and \(\overline{\dot{\omega }}_k\) altogether. The species mass fractions are not evolved on the LES grid, but rather only on the 1D LEM domains embedded within the 3D LES computational cells. Since the flame is resolved on the 1D domain, the grid resolution can be chosen to be fine enough to resolve the reaction and the diffusive terms, thus eliminating the need for any further closures. However, closures are needed for the subgrid turbulent mixing and the large-scale convection. While the subgrid mixing is modeled through the turbulent stirring events described above, the large-scale transport is modeled in a Lagrangian manner through the splicing algorithm (Menon and Kerstein 2011). With this approach, chunks of the 1D LEM domain (carrying Y and T) are transported across the LES cells along the direction of convection.

LEMLES has been successfully used in the past for a wide variety of problems, including premixed (Sankaran and Menon 2005), non-premixed (Sen et al. 2010; Srinivasan et al. 2015) and spray (Sankaran and Menon 2002; Patel and Menon 2008) flames over a range of conditions. However, there are certain disadvantages of the LEMLES approach. A key limitation is that the reduction to a 1D notional dimension limits its ability in cases where the flame has to propagate in 3D as opposed to fluctuating around a statistically mean direction. At high Re, the turbulent diffusion usually dominates the molecular diffusion, which is captured by the 1D LEM model, but errors are incurred at low Re, or towards the DNS limit, where molecular diffusion, which is neglected on the large scale, dominates.

Considering these drawbacks, the RRLES approach (Ranjan et al. 2016; Panchal et al. 2019) is a recent modification of the LEMLES approach, where only the filtered reaction-rate terms \(\overline{\dot{\omega }}_{k}\) are modeled using a multi-scale LEM framework. Here, the filtered species equations (Eq. 5) are still solved on a 3D grid, where a conventional gradient-diffusion closure is used for \(\mathcal {Y}_{i,k}^\textrm{sgs}\), whereas the filtered reaction rate term \(\overline{\dot{\omega }}_k\) is modeled using LEM. At every time step of the evolution of the LES equations in 3D, the filtered species mass fractions (\(\widetilde{Y_k}\)) and the filtered temperature (\(\widetilde{T}\)) evolving at the resolved level are used to reconstruct the SGS variation on the 1D notional LEM domain inside each LES cell. After solving the subgrid reaction-diffusion equation and including the effect of turbulent mixing on the LEM domain, the filtered reaction rates are computed and projected back to the 3D LES grid. The RRLES approach has an advantage over the original LEMLES approach, particularly in a well-resolved or a locally laminar condition, where it can asymptote to the DNS limit. However, this approach cannot account for counter-gradient transport of scalars, and the sensitivity of the results to the reconstruction procedure is another source of uncertainty (Ranjan et al. 2016).

3.2 ANN Based Modeling

As discussed in Sects. 1 and 2, ANNs can be considered as highly non-linear regression models, and they are used here to model the reaction rate terms \(\dot{\omega }_k\) and \(\overline{\dot{\omega }}_{k}\) described in the previous section.

3.2.1 Problem Definition: Resolved Reaction Rates

The conventional FRC allows for the inclusion of arbitrarily complex chemical kinetic mechanisms, which can range from \(\mathcal {O}(10)\) to \(\mathcal {O}(100)\) species and reactions. The individual reaction rates are computed using Arrhenius rate expressions, and these computations become expensive with an increasing number of species and reactions. Even with reduced chemical kinetics, a stiff direct integration (DI) solver such as DVODE may have to be used, which can account for 60–90% of the total computational cost of a simulation (Sen et al. 2010). As discussed in Sects. 1 and 2, one solution is to tabulate these source terms over a range of conditions, instead of performing DI at each simulation step. However, this table would become very large and highly multi-dimensional, as it would have \(N_{s}+1\) input (\(Y_k\), T) and \(N_s\) output (\(\dot{\omega }_k\)) variables. Therefore, instead of tabulation, the ANN model denoted by \(\mathcal {A}_k\) for the kth species is employed for estimating the reaction rates as:

$$\begin{aligned} \dot{\omega }_{k} = \mathcal {A}_{k}(Y_{1}, Y_{2}, \dots , Y_{N_s}, T), \quad \text{ for } \quad k=1, 2, \ldots , N_{s}. \end{aligned}$$
(10)
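To put the tabulation argument in numbers, the short sketch below estimates the size of a dense lookup table with \(N_{s}+1\) inputs and \(N_s\) outputs. The number of grid points per input axis (`points_per_axis`) and the 8 bytes of storage per entry are illustrative assumptions and are not taken from the cited studies.

```python
def dense_table_entries(n_species: int, points_per_axis: int) -> int:
    """Entries in a dense table with N_s + 1 inputs (Y_k, T) and N_s outputs."""
    n_inputs = n_species + 1
    # One output value per species at every point of the (N_s + 1)-dimensional grid.
    return n_species * points_per_axis ** n_inputs


# 14 species as in mechanism A; 10 points per axis is a hypothetical, coarse resolution.
entries = dense_table_entries(n_species=14, points_per_axis=10)
bytes_needed = 8 * entries  # assumes double-precision storage
print(f"{entries:.1e} entries, roughly {bytes_needed / 1e15:.0f} PB")
```

Even at this coarse resolution the table is far beyond practical memory limits, which is the motivation for replacing it with a compact regression model.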

Considering the range of time scales associated with different chemical species, a separate multi-input, single-output MLP is used for each species. Each neuron in the ANN model \(\mathcal {A}_{k}\) contains weights and biases, and their training is discussed in the next sections. The capabilities of the ANN model have been assessed using three chemical mechanisms in past studies (Sen et al. 2010; Sen and Menon 2010; Sen and Menon 2009). These include (A) an 11-steps-14-species syngas/air skeletal mechanism (Sen et al. 2010) for premixed flames, (B) a 12-steps-16-species methane/air skeletal mechanism (Sung et al. 1998) for premixed flames, and (C) a 21-steps-11-species syngas mechanism (Hawkes et al. 2007) for non-premixed flames. Note that an independent ANN model and training dataset are required for each chemical kinetic mechanism.

3.2.2 Training Algorithm

The training of the ANN model comprises two stages: a forward propagation of the input, and a backward propagation of the error. The output of a single neuron i at iteration number k is calculated as

$$\begin{aligned} y_i[k] = f\left( \sum _{m=0}^{M} W_{im}[k] y_m[k] - b_i [k]\right) . \end{aligned}$$
(11)

Here, \(W_{im}[k]\) is the weight coefficient between neurons i and m, \(y_m[k]\) is the output of the neuron m, \(b_i[k]\) is the bias of the neuron i, and M is the number of neurons feeding into the neuron i. As described in Sect. 2.1, there are several options for specifying the activation function \(f(\cdot )\). All the results presented in this chapter use the hyperbolic-tangent (\(\tanh \)) as the activation function.
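The forward propagation of Eq. 11 can be sketched in a few lines of plain Python. The network size and the numerical weights below are hypothetical placeholders, and applying \(\tanh \) on the output layer as well is an assumption of this sketch.

```python
import math


def neuron_output(weights, inputs, bias):
    """Eq. 11: y_i = tanh(sum_m W_im * y_m - b_i)."""
    return math.tanh(sum(w * y for w, y in zip(weights, inputs)) - bias)


def forward(layers, inputs):
    """Propagate inputs through a list of (weight_rows, biases) layers."""
    y = list(inputs)
    for weight_rows, biases in layers:
        # Each row of weights feeds one neuron of the current layer.
        y = [neuron_output(row, y, b) for row, b in zip(weight_rows, biases)]
    return y


# Hypothetical 3-input, 2-hidden-neuron, 1-output network (values are placeholders).
layers = [
    ([[0.2, -0.1, 0.4], [0.3, 0.5, -0.2]], [0.1, -0.1]),  # hidden layer
    ([[0.7, -0.6]], [0.05]),                               # output layer
]
print(forward(layers, [0.1, -0.2, 0.5]))
```

In the setting of Eq. 10, the inputs would be the normalized \(Y_k\) and T, and the single output would be the normalized reaction rate of one species.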

To tune the model weights and biases during the training of the ANN model, the mean squared error (E) of the network is typically minimized using a gradient descent rule (GDR), i.e.,

$$\begin{aligned} W_{im}[k+1] = W_{im}[k] - \eta \frac{\partial E[k]}{\partial W_{im}[k]}, \end{aligned}$$
(12)

where k is the GDR iteration step. Standard GDR may not be able to deal with error surfaces that have local minima where it could get trapped, and therefore, a momentum modification is used as

$$W_{im}[k+1] = W_{im}[k] - \eta \frac{\partial E[k]}{\partial W_{im}[k]} - \alpha \frac{\partial E[k]}{\partial W_{im}[k-1]}.$$

Here, \(\eta \) and \(\alpha \) are the model hyperparameters, namely the global learning rate and the momentum coefficient, respectively. These hyperparameters need to be calibrated for each new case for optimum convergence; otherwise, a modification similar to the extended delta-bar-delta (EDBD) learning model (Minai and Williams 1990) has to be used. In the current approach, each connection has its own model parameters (\(\eta _{im}\), \(\alpha _{im}\)), and they are updated at every ANN iteration based on the history of the global error as:

$$\begin{aligned} \eta _{im}[k+1]&= \eta _{im}[k] + \Delta \eta _{im}[k], \end{aligned}$$
(13)
$$\begin{aligned} \Delta \eta _{im}[k]&= {\left\{ \begin{array}{ll} \kappa _1 \lambda \eta _{im}, &{} \text {if } \phi _{im}[k] \overline{\phi }_{im}[k-1] > 0 \\ -\kappa _1 \lambda \eta _{im}, &{} \text {if } \phi _{im}[k] \overline{\phi }_{im}[k-1] < 0 \\ 0, &{} \text {if } \phi _{im}[k] \overline{\phi }_{im}[k-1] = 0. \\ \end{array}\right. } \end{aligned}$$
(14)

Here, \(\lambda = (1-\text {exp}(-\kappa _2 \phi _{im}[k]))\), \(\phi _{im}[k] = \partial E[k] / \partial W_{im} [k]\), and \(\overline{\phi }_{im}[k] = (1-\theta ) \phi _{im}[k-1] + \theta \phi _{im}[k]\). Furthermore, \(\kappa _1\) and \(\kappa _2\) are second-order model-coefficients, which are specified to be 0.1 and 0.01, respectively, based on numerical experiments.
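The per-connection learning-rate adaptation of Eqs. 13 and 14 can be sketched as follows. The expressions are transcribed literally from the text, including \(\lambda \) evaluated with the signed gradient. The momentum term is omitted, the value of \(\theta \) is not given here and is left as an argument, and taking the weight step of Eq. 12 before updating \(\eta _{im}\) is an assumption of this sketch.

```python
import math

KAPPA_1 = 0.1   # value quoted in the text
KAPPA_2 = 0.01  # value quoted in the text


def smoothed_gradient(phi_prev, phi_curr, theta):
    """phi_bar[k] = (1 - theta) * phi[k-1] + theta * phi[k]."""
    return (1.0 - theta) * phi_prev + theta * phi_curr


def learning_rate_increment(eta, phi_curr, phi_bar_prev):
    """Eq. 14: the branch is selected by the sign of phi[k] * phi_bar[k-1]."""
    lam = 1.0 - math.exp(-KAPPA_2 * phi_curr)
    product = phi_curr * phi_bar_prev
    if product > 0.0:
        return KAPPA_1 * lam * eta
    if product < 0.0:
        return -KAPPA_1 * lam * eta
    return 0.0


def update_connection(weight, eta, phi_curr, phi_bar_prev):
    """Gradient step of Eq. 12 (no momentum), then Eq. 13 for the learning rate."""
    weight_new = weight - eta * phi_curr
    eta_new = eta + learning_rate_increment(eta, phi_curr, phi_bar_prev)
    return weight_new, eta_new
```

A training loop would call `update_connection` once per connection after each pass over the whole training set, carrying `smoothed_gradient` forward as the gradient history.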

Some salient features of this training approach are as follows:

  • Each connection has its own learning coefficients.

  • Changes to the model coefficients are based on the values of the local error gradients (\(\phi _{im}[k]\) and \(\phi _{im}[k-1]\)), where the updates are enhanced in regions of large error gradient and reduced near a minimum.

  • Instead of training using the mini-batch approach, the updates are done after introducing the whole training set to establish a correlation between \(\phi _{im}[k]\) and \(\phi _{im}[k-1]\).

  • In case the weights start to increase without bounds, the coefficients are reverted to a previously saved state.

Further details about this approach can be found elsewhere (Sen and Menon 2009; Sen and Menon 2010). However, the application of more advanced approaches developed in the ML community, e.g., the Adam optimizer (Kingma and Ba 2014), to the problems considered here needs to be evaluated in the future.

3.2.3 Training Dataset

For the ANN model to be able to accurately model the reaction rates \(\dot{\omega }_k\), the training set has to cover a range of conditions, i.e., \(Y_k\) and T that would be encountered during the 3D simulations. Since the training set has to be generated using DI, the cost of its generation is another concern. For example, even though a DNS of the 3D application problem can generate all the states accessed during the simulation, it is not computationally feasible to do so for training, thus requiring alternate approaches. The results presented here consider the following three methods for obtaining the training dataset:

  • FANN: The training set is generated using the tables extracted from a 2D flame-vortex interaction (FVI) simulation (Poinsot et al. 1991; Sen and Menon 2009). A premixed flame is initialized corresponding to the inflow equivalence ratio and temperature, and the coherent vortex diameter (\(D_C\)) is chosen to be the same as the integral length scale \(L_F\) of the 3D application. Since turbulence is a superposition of multiple vortices, the maximum velocity induced by the vortex \(U_{C, max}\) is varied in the range \(10< U_{C, max}/S_L < 400\), where \(S_L\) is the laminar premixed flame speed. Six cases have been considered within this range, and training samples are obtained from multiple snapshots.

  • PANN: The training set for PANN is generated using tables obtained from 1D laminar premixed flame simulations (Sen et al. 2010). The inflow equivalence ratio and temperature are specified based on premixed flame operating conditions. A limitation of this approach is that no information about the turbulence is embedded within the training dataset.

  • LANN: Here, the training set is generated using standalone 1D LEM simulations (Sen et al. 2010). As LEM is designed to emulate the effects of turbulence on a flame, the resulting training dataset accounts for some effects of turbulence. This approach can be used for either premixed or non-premixed flames. The laminar flame is initialized on the 1D LEM domain, and the reaction, diffusion, and stirring equations are solved as described earlier. For premixed cases, the initial profile is a function of the equivalence ratio (ER) and inflow temperature, and for the non-premixed cases, it is also a function of the strain rate. In this approach, the turbulent Reynolds number \(Re_t\) can be varied; for the cases considered here, it has been varied from 10 to 180 (with 20 values in between) for LEM, and the integral length scale L corresponds to the specific 3D application.

The above strategies are computationally cheaper than dataset generation using 3D simulations. The three approaches have different levels of fidelity in terms of embedding the effects of subgrid turbulence-chemistry interactions in the training datasets. For example, while PANN completely ignores the subgrid turbulence-chemistry interactions, LANN accounts for them, albeit in the form of stochastic stirring events. Alternate strategies need to be examined further to increase the fidelity of the training dataset while keeping its generation efficient. These strategies will also need to incorporate the effects of other input variables such as pressure (and possibly heat loss) to enable applications to practical configurations.

3.2.4 Structure of ANN

Given the training dataset and the algorithm, the next step is to choose the ANN structure, e.g., the number of neurons and hidden layers, and the normalization of the inputs and outputs. A typical training dataset considered here contains approximately 5 million states. The database is first divided into nine equidistant temperature bins, and at least 100,000 data points are added to each bin to achieve proper sensitivity to temperature in the reaction rate calculations. A typical flame solution would have a large number of points in the reactants and the products but not so many within the flame region, and this binning ensures that the ANN is not biased towards the former. The inputs and the outputs of the ANN are then normalized to the ranges \(\pm 1\) and \(\pm 0.8\), respectively, to increase the sensitivity to each parameter and remove any bias towards species with higher mass fractions. An 85/15 training/testing split has been used to realize the ANN model. To avoid overfitting, the training is stopped if there is no improvement in consecutive iterations.
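The binning and normalization steps described above can be illustrated with a minimal sketch. The temperature bounds of 300 K and 2100 K are hypothetical, and a plain linear (min-max) mapping is assumed for the normalization, since the exact mapping is not specified here.

```python
def linear_map(value, lo, hi, new_lo, new_hi):
    """Map value linearly from [lo, hi] onto [new_lo, new_hi]."""
    return new_lo + (value - lo) * (new_hi - new_lo) / (hi - lo)


def temperature_bin(temperature, t_min, t_max, n_bins=9):
    """Index of the equidistant temperature bin that contains `temperature`."""
    width = (t_max - t_min) / n_bins
    index = int((temperature - t_min) / width)
    return min(max(index, 0), n_bins - 1)  # clamp so that t_max lands in the last bin


T_MIN, T_MAX = 300.0, 2100.0  # hypothetical bounds of the training database
print(temperature_bin(1250.0, T_MIN, T_MAX))        # bin index for a flame-zone state
print(linear_map(1250.0, T_MIN, T_MAX, -1.0, 1.0))  # input normalized to [-1, 1]
```

Outputs would be mapped with the same helper onto \([-0.8, 0.8]\), which keeps the targets away from the saturated ends of the \(\tanh \) activation.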

Table 2 Number of connections and testing errors corresponding to different ANN architectures. The table is reproduced using the data from Sen and Menon (2010)
Fig. 3 Speedup of ANN against DI for various numbers of connections. The figure is reproduced using the digitized data from Sen and Menon (2010)

The ANN can have multiple hidden layers. However, a smaller network would struggle with predicting complex reaction rate manifolds, whereas a larger network would result in a larger number of connections and a higher computational cost. To understand this trade-off, multiple ANN structures have been considered, and a few representative networks for the chemical mechanism C are summarized in Table 2. The corresponding computational speedups, with respect to DI, are plotted in Fig. 3. A significant slowdown occurs beyond 500 connections, and the ANN is even slower than DI beyond 20,000 connections. Considering this, and the testing errors in Table 2, 5/3/2 is selected as the optimal network for this particular kinetics, and it results in a 5 times speedup with testing errors below \(10^{-4}\). The optimal networks for mechanisms A and B are 10/5 and 10/8/4, respectively, and they result in 11 and 35 times speedup as compared to the corresponding DI. The larger speedup in mechanism B results from its stiffness. The number of training samples was always specified to be more than 10 times the number of neurons to avoid overfitting.
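As a rough guide to how the number of connections grows with the architecture, the sketch below counts the weights of a fully connected network. It assumes \(N_s+1\) inputs, a single output, and that biases are not counted; this counting convention is an assumption and may differ from the one used for Table 2.

```python
def count_connections(n_inputs, hidden_layers, n_outputs=1):
    """Number of weights between consecutive fully connected layers."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    return sum(a * b for a, b in zip(sizes, sizes[1:]))


# Mechanism C: 11 species plus temperature gives 12 inputs; 5/3/2 hidden layers.
print(count_connections(12, [5, 3, 2]))
# Mechanism B: 16 species plus temperature gives 17 inputs; 10/8/4 hidden layers.
print(count_connections(17, [10, 8, 4]))
```

Under these assumptions, all three selected networks stay below the 500-connection threshold beyond which the slowdown is reported.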

Note that the errors discussed in this section are testing errors based on the dataset that was selected for training, and not the actual errors that would result in a 3D application. The latter can occur when thermochemical states that are accessed by the ANN model were not available in the training dataset. Further details about these errors are discussed later.

3.2.5 Modeling Filtered Reaction Rates

The prediction of \(\dot{\omega }_k\) using ANN was discussed in the previous section. These predictions can be used instead of DI, either for a direct numerical simulation (DNS) or with the LEMLES/RRLES approach within the LEM domain, where a turbulence closure is not required for the reaction rates. The solution of LEM within each LES cell could still be costly for problems of practical interest, and therefore, a modified LES approach, referred to as TANN, where the filtered reaction rates \(\overline{\dot{\omega }}_k\) are directly computed using ANN, was developed (Sen 2009). This approach has similarities with the RRLES approach; for instance, the subgrid species diffusion \(\mathcal {Y}_{i,k}^\textrm{sgs}\) is computed using a gradient-diffusion approach. However, instead of using the LEM solver online within each cell as the simulation progresses, the ANN for the filtered reaction rates is trained beforehand. The filtered reaction rates for the \(k^\textrm{th}\) species are modeled using the ANN model \(\mathcal {B}_k\) through

$$\begin{aligned} \overline{\dot{\omega }}_k= \mathcal {B}_{k} \left( \widetilde{Y}_1, \widetilde{Y}_2, \dots , \widetilde{Y}_{N_s}, \widetilde{T}, Re_{\Delta }, \frac{\partial \widetilde{Y}_1}{\partial x}, \frac{\partial \widetilde{Y}_2}{\partial x}, \dots , \frac{\partial \widetilde{Y}_{N_s}}{\partial x} \right) . \end{aligned}$$
(15)

Here, \(Re_{\Delta }\) corresponds to the subgrid Reynolds number \(u' \Delta / \nu \), where \(\Delta \) is the LES filter size, and \(u' = \sqrt{2 k^\textrm{sgs}/3}\). Previously described methods for ANN training and selection of optimal architecture have also been used with this approach. The ANN training database for TANN is constructed using standalone LEM solutions. Initializing with species and temperature profiles corresponding to laminar flames, a range of \(Re_t\) and L are explored corresponding to the conditions for the 3D application. The obtained 1D LEM solutions at multiple time instances are then filtered with size \(\Delta \) and they are then used for ANN training.
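The two ingredients of the TANN input described above, the subgrid Reynolds number and the filtering of a 1D LEM solution, can be sketched as follows. A plain top-hat average over non-overlapping groups of LEM cells is assumed for the filter (no density weighting), and all function names are hypothetical.

```python
import math


def subgrid_reynolds(k_sgs, delta, nu):
    """Re_Delta = u' * Delta / nu with u' = sqrt(2 * k_sgs / 3)."""
    u_prime = math.sqrt(2.0 * k_sgs / 3.0)
    return u_prime * delta / nu


def box_filter(values, cells_per_filter):
    """Average consecutive, non-overlapping groups of LEM cells of width Delta."""
    n = cells_per_filter
    return [sum(values[i:i + n]) / n for i in range(0, len(values) - n + 1, n)]


# Hypothetical 1D profile on 8 LEM cells, filtered over 4 cells per LES filter width.
profile = [0.0, 0.1, 0.3, 0.6, 0.8, 0.9, 1.0, 1.0]
print(box_filter(profile, 4))
print(subgrid_reynolds(k_sgs=1.5, delta=1e-3, nu=1e-5))
```

Each filtered sample, together with its gradient and \(Re_{\Delta }\), would form one input row of the training set for \(\mathcal {B}_k\) in Eq. 15.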

Since the velocity field is not available from standalone LEM, \(Re_{\Delta }\) cannot be computed directly from \(u'\) or \(k^\textrm{sgs}\). Instead, an additional equation for the kinetic energy k(s) is solved on the LEM domain as

$$\frac{\partial {k}}{\partial {t}} = P_{k} - \epsilon , $$

where \(P_k\) and \(\epsilon \) are turbulence production and dissipation rates, respectively. A local velocity disturbance field \(u^\textrm{LEM} = \nu Re_{t}/L\) is computed on the segment where stirring is applied, and this is used as \(P_{k} = 3/2 (u^\textrm{LEM})^{2} / \Delta t\) and \(\epsilon = (u^\textrm{LEM})^{3}/\Delta s\) to compute the production and the dissipation terms, respectively. Here, \(\Delta t\) and \(\Delta s\) are the time and space discretizations for the LEM domain, and this follows the assumption that the turbulence that is modeled by LEM is homogeneous. The evolved k over the entire domain is then filtered to compute \(k^\textrm{sgs}\) and \(Re_{\Delta }\).
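A minimal sketch of advancing the kinetic energy on a stirred LEM segment is given below, using the expressions for \(u^\textrm{LEM}\), \(P_k\), and \(\epsilon \) stated above. The explicit Euler time step and the numerical values in the example are assumptions of this sketch; the time integration scheme is not specified here.

```python
def lem_velocity(nu, re_t, length):
    """u_LEM = nu * Re_t / L."""
    return nu * re_t / length


def advance_k(k, nu, re_t, length, dt, ds):
    """One explicit Euler step of dk/dt = P_k - epsilon on a stirred segment."""
    u = lem_velocity(nu, re_t, length)
    production = 1.5 * u ** 2 / dt   # P_k = 3/2 * u_LEM^2 / dt
    dissipation = u ** 3 / ds        # epsilon = u_LEM^3 / ds
    return k + dt * (production - dissipation)


# Hypothetical values: nu = 1e-5 m^2/s, Re_t = 100, L = 1e-2 m, dt = 1e-6 s, ds = 1e-4 m.
k_new = advance_k(0.0, nu=1e-5, re_t=100.0, length=1e-2, dt=1e-6, ds=1e-4)
print(k_new)
```

The resulting k field would then be filtered over \(\Delta \) to obtain \(k^\textrm{sgs}\) and \(Re_{\Delta }\) for each training sample.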

4 Example Applications

In this section, results from the application of the different ANN-based modeling strategies discussed in Sect. 3 are described for four test configurations. These cases correspond to different modes (premixed and non-premixed) of combustion and demonstrate the application to configurations with an increasing degree of geometric complexity. The first test case is a canonical premixed flame-turbulence-vortex interaction configuration, where the LEMLES results are compared between DI, LANN, PANN, and FANN. The second test case corresponds to a non-premixed temporally evolving jet flame that exhibits extinction and re-ignition dynamics in the presence of turbulence, and the results using LANN-LEMLES and TANN-LES are compared against available DNS data. The third test considers a stagnation point reversed flow (SPRF) premixed combustor with LANN-LEMLES and TANN-LES, and finally, the results from a cavity strut supersonic combustor obtained using TANN-LES are discussed. The third and the fourth tests illustrate the application to practical configurations, for which the results are compared against the available experimental data.

4.1 Premixed Flame Turbulence

The test configuration follows a previous work (Sen et al. 2010) for premixed flame-turbulence-vortex interaction for syngas/air flame. The reacting flow field is initialized using a 1D laminar steady solution for premixed flame, and a counter-rotating vortex pair is superimposed on the isotropic turbulence to induce small- and large-scale wrinkling. The chemical mechanism A is used for this test configuration and four different test conditions are considered, which include two equivalence ratios, and two values of \(u'/S_{L}\). Here, \(u'\) and \(S_L\) denote turbulence intensity and laminar flame speed, respectively. The ratio of integral length scale to the laminar flame thickness \(L/L_F = 5\) is selected so that the flame remains in the thin reaction zone regime. The maximum induced velocity by the vortex is chosen as \(U_{C, max}/S_L=50\). A \(64^{3}\) uniform grid is used with \(\Delta /\eta =4\), where \(\eta \) is the Kolmogorov length scale. The subgrid 1D LEM domain is spatially discretized using 24 cells. A 10/5 ANN model is used for this case. The use of ANN for chemistry modeling while performing LEMLES resulted in approximately 11\(\times \) speedup as compared to DI of the chemical kinetics.

Fig. 4 Comparison of LES results for premixed flame-turbulence-vortex interaction for syngas/air at an instant for ER \(=\) 0.6 and \(u'/S_L=5\). The figures are reproduced using the digitized data from Sen and Menon (2010)

The results for the case with ER \(= \) 0.6 and \(u'/S_L=5\) are shown in Fig. 4 at a non-dimensional time \(t^*=5\), with time normalized by \(L/U_{C,max}\). For the sake of brevity, only spatially averaged profiles of a major species, \(\textrm{H}_2\), and two intermediate species, namely H and O, are shown here, but the other species also showed a qualitatively similar trend. The PANN model shows the highest error with respect to DI even for the major species \(\textrm{H}_2\), where it shows an early consumption of the fuel, which can be associated with a faster consumption speed. The errors for PANN are even higher for the minor species.

The results with the other two models, namely FANN and LANN, are comparable to DI for this particular test case, suggesting that both the flame-vortex interaction and the standalone LEM are capable of covering the range of thermochemical states that are encountered during the 3D flame-turbulence interactions. The same conclusions are obtained for the other values of ER and \(u'/S_L\) as well. These results demonstrate both the accuracy and the efficiency of the ANN-based modeling approach for chemistry. Furthermore, the results also highlight the importance of the employed training datasets for attaining accurate results.

4.2 Non-premixed Temporally Evolving Jet Flame

This computational setup follows a DNS study of turbulent non-premixed syngas/air combustion in a temporally evolving jet (Hawkes et al. 2007; Sen et al. 2010). An inner fuel jet and an outer oxidizer jet flow in opposite directions, with a jet Reynolds number of \(Re_{jet}=4478\) and a Damköhler number of \(Da=0.011\). While the DNS was performed using 350 million grid points, 5.5 million cells (\(\Delta /\eta = 8.3\)) are used for the LES. The 1D LEM domain is discretized using 12 cells. For this test case, the chemical mechanism C has been considered. Here, the results from LANN-LEMLES and TANN-LES are discussed. In terms of the computational cost, LANN-LEMLES provided a 5.5 times speedup compared to DI-LEMLES, whereas TANN-LES provided an 18.3 times speedup, showing a significant computational gain.

Fig. 5 Evolution of mean temperature at stoichiometric mixture fraction for non-premixed extinction re-ignition test using DNS, LANN-LEMLES and TANN-LES. The figure is reproduced using the digitized data from Sen et al. (2010) and Sen (2009)

The time variation of mean temperature at stoichiometric mixture fraction is shown in Fig. 5. The temperature is expected to be the maximum on the stoichiometric surface for a non-premixed flame. The initially stable non-premixed flame approaches extinction as a result of the shear-generated background turbulence, and the temperature decreases from an initial 1450 K to 1100 K at a non-dimensional time \(t_j=20\) in DNS. After this time instant, the temperature starts increasing again as a result of the re-ignition process, and finally reaches up to 1300 K at \(t_j=40\), close to its initial value. These global features are captured by both LANN-LEMLES and TANN-LES, with 5-10% error near extinction.

Fig. 6 Contours of OH mass fraction in the central plane at \(t_j = 20\) and \(t_j = 40\) obtained from DNS (a, c) and LANN-LEMLES (b, d) cases for the temporally evolving non-premixed jet configuration. The figures are borrowed from Sen et al. (2010)

The contours of the OH mass fraction in the central \(x-y\) plane are shown in Fig. 6 at time instances \(t_j=20\) and \(t_j=40\) obtained from the DNS and LANN-LEMLES cases. The OH mass fraction from DNS peaks along the shear layers, showing a broken structure due to local extinctions at \(t_j=20\), but this is followed by re-ignition at \(t_j=40\) within these pockets. Qualitatively, the features observed in the DNS case are also captured in the LANN-LEMLES case.

Fig. 7 Conditional average of \(\widetilde{Y}_\mathrm{{OH}}\) at \(t_j=20\) and \(t_j=40\) for non-premixed extinction re-ignition test. The symbols have the same meaning as in Fig. 5. The figures are reproduced using the digitized data from Sen et al. (2010) and Sen (2009)

Mass fraction and temperature statistics in the compositional space were also analyzed for a quantitative comparison of the flame structure predicted by the different models. The variation of the OH mass fraction is shown in Fig. 7 at \(t_j=20\) and \(t_j=40\). The results of all three approaches, DNS, LANN-LEMLES and TANN-LES, drop below the laminar flamelet value at extinction at \(t_j=20\), and rise back above it at \(t_j=40\), confirming re-ignition. Both LANN-LEMLES and TANN-LES are able to predict this behavior and match the DNS data with reasonable accuracy, with TANN-LES providing a slightly better match, particularly during the extinction phase.

Overall, the results presented here demonstrate the robustness of the ANN-based modeling of chemistry. This test case is particularly challenging because of the presence of the unsteady dynamics of turbulence-chemistry interaction, which is marked by the presence of extinction and reignition events.

Fig. 8 Schematic of the stagnation point reversed flow combustor. This figure is borrowed from Undapalli et al. (2009)

4.3 SPRF Combustor

The stagnation point reversed flow (SPRF) combustor (see Fig. 8) was designed to reduce emissions (Gopalakrishnan et al. 2007; Undapalli et al. 2009). It was simulated in a premixed mode configuration for evaluating the capabilities of the LANN-LEMLES and the TANN-LES approaches (Sen 2009). Methane/air mixture is injected into the combustor at an equivalence ratio of 0.58. The flow enters and leaves the combustion chamber in the same plane, providing extensive preheating and allowing the flame to stabilize at very lean conditions. The combustion chamber marked as region (5) has a wall (6) at the end. Surface (2) is closed and (3) injects the premixed mixture, with (4) as the outflow. The annular jet bulk flow velocity is 122 m/s, and it is preheated to 500 K, with \(Re=12900\). The computational domain is spatially discretized using approximately 1.2 million cells. The methane/air mechanism B is used for this test configuration. For the ANN model, \(Re_{t}\) varying from 10 to 400, and the integral length scale L as the radius of the whole injector assembly (\(L=8.25\, mm\)) are considered. In terms of computational cost, LANN-LEMLES and TANN-LES showed 49.2 and 134.9 times speedup, respectively, as compared to DI-LEMLES for this test configuration.

Fig. 9 Axial variations of time-averaged temperature and axial velocity for the SPRF combustor. These figures are reproduced using the digitized data from Sen (2009)

The simulation results using DI-LEMLES, LANN-LEMLES and TANN-LES were time-averaged over two flow-through times and compared against experimental data along the centerline as shown in Figs. 9 and 10. Both LANN-LEMLES and DI-LEMLES are able to capture the far-field axial velocity variation accurately. The differences near the injector could be due to differences in the boundary conditions, as discussed elsewhere (Sen 2009). The same holds true for the temperature, \(\mathrm {CH_4}\), and \(\mathrm {CO_2}\) centerline variations: the results show approximately 10% errors with respect to the experiments, but both LANN-LEMLES and DI-LEMLES show similar results.

Fig. 10 Axial variations of time-averaged mass fraction of CH\(_4\) and CO\(_2\) for the SPRF combustor. These figures are reproduced using the digitized data from Sen (2009)

The centerline time-averaged variations of axial velocity are worse for TANN-LES as compared to LANN-LEMLES, whereas they are better for temperature, \(\mathrm {CH_4}\), and \(\mathrm {CO_2}\) with respect to the experiments. It was hypothesized that this could be due to differences in the use of LEM between LEMLES and TANN-LES: the eddy sizes are restricted between \(\eta \) and \(\Delta \) in the former, whereas they are between \(\eta \) and L in the latter, which could result in a higher wrinkling of the flame front and increased turbulence within the combustor.

The training of the ANN model using the 1D LEM dataset and the subsequent use of the model while performing LES of a practical configuration again demonstrate the efficiency, robustness, and generality of the approach. The observed differences from the reference results, particularly with TANN-LES, need further study so that the accuracy of the approach can be enhanced. Some of these studies are currently underway.

Fig. 11 Schematic of a supersonic cavity-strut flame-holder. The figure is borrowed from Ghodke et al. (2011)

4.4 Cavity Strut Flame-Holder for Supersonic Combustion

Now, the results from LES of a cavity-based flame-holder are discussed (Ghodke et al. 2011). Two configurations, as shown in Fig. 11, were considered: a baseline cavity with 11 injectors on the aft ramp (no strut), and a strut positioned upstream of the cavity with 6 fuel injectors (with strut). The cavity extends 153 mm in the spanwise direction, with a 90\(^{\circ }\) leading edge and a 22.5\(^{\circ }\) ramp at the trailing edge. The cavity is 16.5 mm deep with \(L/D = 2.79\), and the length of the cavity floor is 46 mm. The injected fuel mixture contains 70% methane and 30% hydrogen, whereas the mainstream contains air and water vapor at a Mach number of 2.

The computational grids for both configurations contained approximately 10 million cells, with clustering in the near-wall regions, shear layers, and near the fuel injectors. A reduced four-step methane-hydrogen kinetics was used (Peters and Kee 1987) for the simulations. The ANN model for TANN-LES was trained using the previously described method, and a 10/8/4 hidden layer structure was found to be optimal. Simulations were performed for a duration of 6 flow-through times, and the results are compared between experiments, DI-LEMLES and TANN-LES. Compared to DI-LEMLES, TANN-LES was around 50 times faster for both the no-strut and strut configurations.

Fig. 12 Temperature contours on a center-slice at an instant for the supersonic cavity strut flame-holder. The figures are borrowed from Ghodke et al. (2011)

Figure 12 shows instantaneous temperature contours on a plane that is normal to the spanwise direction for both configurations. Most of the cavity region is filled with hot products, which causes the shear layer to lift and entrain oxidizer into the cavity. The reaction zone is even larger for the configuration with the strut due to an increased mass and heat transfer between the cavity and the main stream, as a result of the low-pressure region behind the strut. Vortical structures behind the strut are responsible for better mixing of the fuel and for maintaining hot regions inside the combustor through mass transfer, which helps flame-holding.

Fig. 13 Bottom wall pressure comparison against experimental data (Grady et al. 2010) for the supersonic cavity strut flame-holder. The strut extends from \(x=-36\) mm to \(x=25\) mm, and the cavity extends from \(x=0\) mm to \(x=86\) mm. The figures are reproduced using the digitized data from Ghodke et al. (2011)

Figure 13 shows the wall pressure comparison for the reacting cases with available experimental data (Grady et al. 2010). For both cases, the locations of the leading-edge shock (\(x \sim \) −30 mm and \(x \sim \) 0 mm for configurations with and without strut, respectively) and the ramp expansion (\(x \sim \) 85 mm) are captured well, along with multiple reflections off the wall. The pressure inside the cavity is almost constant, and hence, this could be considered as a constant pressure combustion process. The peak pressures along the wall as predicted by both DI-LEMLES and TANN-LES are also in good agreement with the reference experimental data, thus illustrating that the heat release effects are accurately captured.

The use of an ANN-based strategy for modeling subgrid turbulence-chemistry interactions in this test configuration demonstrates the robustness of such an approach. This could be attributed to the ability of the ANN to accurately represent multi-dimensional data in the form of a nonlinear regression, which, in turn, can account for the complex input/output relations prevalent in this particular test case, where turbulence-chemistry interactions occur under supersonic flow conditions in a complex geometrical configuration. Although the approach employed here is able to capture the trends both qualitatively and quantitatively, some discrepancies with the experimental data can also be seen, which need further investigation.

5 Limitations of Past Studies

The results discussed here used ANNs to directly represent the chemical kinetics at the subgrid level. Even though the results demonstrated various aspects of the ANN-based modeling approach for efficient computations of chemically reacting flows while utilizing FRC, there are certain challenges that need further studies. Some of the key features of ANN-based modeling that were demonstrated include a significant decrease in the computational cost and memory requirements, robustness in application to different modes and regimes of combustion, predictive ability in terms of decoupling the training dataset from the actual application, etc. Some limitations and concerns of the current work are highlighted next in order to stimulate future research:

  • Stiff kinetics with complex mechanisms: The majority of configurations in the current work explored mechanisms comprising 11–16 species with varying levels of stiffness. The results discussed provided consistently acceptable predictions for all cases. However, scenarios relevant to practical combustion applications involve detailed kinetics with on the order of 50–100 species and 1000 reactions. Hence, the scalability of the current approach to such scenarios needs to be investigated further, with specific attention to the prediction of minor species and stiff radicals and to their interactions with turbulence.

  • ANN architecture optimization: Even though a significant portion of the current work is focused on devising an optimal neural network for a general case, the sensitivity of the ANN to hyperparameters and training choices such as the number of layers, the size of the training dataset, the optimization approach, the activation functions, and the error estimation technique needs to be examined further, especially with the help of well-established open-source tools such as TensorFlow (Martín et al. 2015) or PyTorch (Paszke et al. 2019).

  • Training data generation: For TANN, the filtered training data is generated using a filter width based on the actual LES grid. For canonical problems, this information is readily available because the computational grid is almost uniform throughout the domain. In complex configurations, however, the grid varies significantly due to clustering in specific areas of interest, so the filter width definition needs to be revised. Moreover, an ANN model trained on a table generated using standalone LEM computations inherits the assumptions made in those computations. Therefore, the assumptions regarding turbulence homogeneity and isotropy, the initialization of the LEM solution from a 1D laminar flame solution, the stirring operations, etc. need to be revisited for further improvements.

  • Offline training: The ANN training approach adopted here follows an offline training philosophy. The training datasets generated using 1D LEM, flame-vortex interaction, or 1D laminar flames are used to construct the thermochemical database, and once the ANN model is trained on this database, it is used as-is in LES without any further learning. It can therefore be expected to incur large errors in regions of the state space that are far from the training data. An alternative is to combine offline and online training, where the ANN is retrained on such states. A similar strategy has been employed in a recent DNS study (Chi et al. 2021).

  • Subgrid modeling of scalar fluxes: In the TANN approach, a gradient diffusion-based eddy diffusivity model is used for the closure of the SGS scalar flux. However, in chemically reacting flows both gradient and counter-gradient subgrid turbulent transport of the scalar fields are observed (Ranjan et al. 2016). Therefore, the predictive capabilities of the TANN-LES strategy can be further improved by using ANN-based modeling of the SGS scalar flux.
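The offline-training limitation listed above implies that an LES would need some way to detect states that fall outside the training data before any retraining can be triggered. The following is a minimal sketch of one crude detector, a padded per-feature bounding box, written with NumPy. It is not the procedure of Chi et al. (2021); the function names, the 5% margin, and the two-column state (temperature and fuel mass fraction) are assumptions chosen purely for illustration.

```python
import numpy as np

def fit_bounds(train_states, margin=0.05):
    """Per-feature min/max of the training data, widened by a relative margin."""
    lo = train_states.min(axis=0)
    hi = train_states.max(axis=0)
    pad = margin * (hi - lo)
    return lo - pad, hi + pad

def outside_training_range(states, bounds):
    """Boolean mask: True where any feature lies outside the padded bounds."""
    lo, hi = bounds
    return np.any((states < lo) | (states > hi), axis=1)

# Hypothetical training table: columns are (temperature [K], fuel mass fraction).
rng = np.random.default_rng(0)
train = np.column_stack([rng.uniform(300.0, 2200.0, 1000),
                         rng.uniform(0.0, 0.06, 1000)])
bounds = fit_bounds(train)

# Hypothetical states queried during an LES time step.
queries = np.array([[1500.0, 0.03],   # inside the sampled range
                    [2600.0, 0.03],   # temperature beyond the training data
                    [1500.0, 0.09]])  # fuel mass fraction beyond the training data
flags = outside_training_range(queries, bounds)
print(flags)  # flagged states would be candidates for direct integration or retraining
```

A bounding box ignores correlations between state variables, so it can miss states that lie inside the box but far from any training sample; a distance-based or density-based test would be a natural refinement.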

6 Summary and Outlook

Rapid advancements in computational resources have led to an increased usage of ML tools, particularly supervised DL, to solve challenging problems in science and engineering. DL techniques relying on ANNs are representational learning methods that transform the representation at one level, starting with the raw input, into a more abstract representation at a higher level. This allows complex nonlinear relationships to be learned and enables the discovery of intricate structures in high-dimensional datasets. In this chapter, different approaches relying on ANN algorithms for efficient modeling of the chemistry within the FRC framework have been discussed for LES of turbulent combustion.

The two major challenges associated with FRC-LES include a robust SGS closure for turbulence-chemistry interaction and efficient handling of stiffness associated with the use of detailed chemical kinetics. In the LEMLES approach, a two-scale strategy is used; LEM is used for the subgrid modeling of the reaction, diffusion, and turbulent mixing, and large-scale transport is handled in a Lagrangian manner. The approach has been demonstrated in the past for simulations of a wide variety of canonical and practical configurations. As it allows for the inclusion of arbitrarily complex chemical kinetics and resolves the flame in the 1D LEM domain, ANN-based models have been examined in terms of their ability to efficiently model the reaction-rate terms. Apart from LEMLES, a conventional LES approach has also been discussed where instead of modeling the reaction-rate term at the subgrid level as in LEMLES, a model for the filtered-reaction-rate term is devised based on ANN.

A key step in ANN-based modeling is the generation of the training database, for which three approaches were used, namely, laminar flame solutions, flame-vortex interactions (FVI), and flame-turbulence interactions (FTI) using standalone 1D LEM computations. In all three approaches, the thermochemical state space is predicted using canonical configurations and with knowledge of only the large-scale parameters of the actual geometry of interest. The ANN models trained using these three approaches showed that the FTI (LANN) and FVI (FANN) approaches are more effective than laminar flame solutions (PANN) for generating training data and for predicting the behavior of canonical as well as complex reacting flow configurations in both premixed and non-premixed modes. The TANN approach utilizes a tabulation model for the filtered reaction rates, which does not employ any explicit assumption regarding the interaction of turbulence with the laminar flame front, but resolves that interaction directly on its own time and length scales using standalone LEM computations. The ANN models considered in the example applications were based on a back-propagation algorithm with the adaptive gradient descent rule (AGDR) and the \(\tanh \) activation function, with a simple architecture comprising one input layer, a maximum of 3 hidden layers, and one output layer. Furthermore, during the learning stage, the training was stopped when the training error was observed to saturate, in order to preserve the generality of the ANN and avoid memorization of the training data.
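The kind of network described above can be illustrated with a short NumPy sketch: a multi-input, single-output perceptron with three \(\tanh \) hidden layers, trained by back-propagation and stopped once the training error saturates. Several details are assumptions made for illustration only: plain full-batch gradient descent stands in for AGDR (whose update rule is not reproduced here), the layer widths, learning rate, tolerance, and patience are arbitrary, and the target function is a synthetic stand-in for a normalized reaction rate.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights and zero biases for consecutive layer sizes."""
    return [(rng.normal(0.0, np.sqrt(1.0 / n_in), (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return activations of every layer; hidden layers use tanh, output is linear."""
    acts = [x]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        acts.append(z if i == len(params) - 1 else np.tanh(z))
    return acts

def train(params, x, y, lr=0.05, max_epochs=5000, tol=1e-6, patience=50):
    """Full-batch gradient descent; stops when the error stops improving by tol."""
    history, stall = [], 0
    for epoch in range(max_epochs):
        acts = forward(params, x)
        err = acts[-1] - y
        history.append(float(np.mean(err ** 2)))
        # Saturation check: count consecutive epochs with negligible improvement.
        if epoch > 0 and history[-2] - history[-1] < tol:
            stall += 1
            if stall >= patience:
                break
        else:
            stall = 0
        # Back-propagation of the mean-squared error (single output).
        delta = 2.0 * err / len(x)
        for i in reversed(range(len(params))):
            W, b = params[i]
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:  # propagate through the tanh of the previous layer
                delta = (delta @ W.T) * (1.0 - acts[i] ** 2)
            params[i] = (W - lr * grad_W, b - lr * grad_b)
    return params, history

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, (256, 2))                              # two normalized inputs
y = (x[:, 0] * (1.0 - x[:, 0]) * np.exp(2.0 * (x[:, 1] - 1.0)))[:, None]  # synthetic target
params, history = train(init_mlp([2, 8, 8, 8, 1], rng), x, y)
print(len(history), history[0], history[-1])
```

In practice the stopping decision is usually made on a held-out validation set rather than on the training error alone, since a saturated training error does not by itself rule out overfitting.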

The performance of ANN-based modeling strategies was examined in terms of their accuracy, robustness, and efficiency using four test cases with an increasing degree of complexity. These cases included canonical turbulent premixed and non-premixed flames where reference DNS results were used to assess the capabilities of different modeling approaches. The robustness of the use of the ANN model for FRC was demonstrated through two practical configurations corresponding to a premixed combustor and a supersonic cavity flame holder. These cases were simulated using three different chemical mechanisms. Overall, ANN-based modeling of chemistry with the LEMLES and TANN-LES frameworks was able to capture the qualitative features of flame-turbulence interactions, and the quantitative statistics were in good agreement with direct integration approaches for chemistry. However, some discrepancies were also noted in the results, which need further investigation for potential improvement of the employed modeling strategies.

A major challenge in modeling chemistry using ANNs is the accurate representation of detailed chemical mechanisms over a wide range of operating conditions, since such mechanisms usually have a higher level of stiffness due to the wide separation of timescales associated with different chemical species. So far, only moderately complex chemical mechanisms have been considered when using ANN with LEM, and the approach needs to be extended to detailed chemical mechanisms. When modeling FRC using ANN, a multi-input, single-output ANN model is needed for each chemical species, which makes it challenging for the training process to attain an optimal architecture. To obtain an optimal ANN model, parameters, hyperparameters, and training strategies need to be specified. While some of the hyperparameters have demonstrated their applicability to different types of problems, further usage and assessment of ANN algorithms for turbulent combustion modeling can potentially lead to some common parameters that may work for a wide range of applications.

Another key challenge for ANN-based predictive modeling is the efficient generation of reliable training data. The data generation procedure should be general enough to be used with different types of geometrical configurations and with different modes and regimes of combustion. Furthermore, the procedure should be efficient, so that training data can be generated quickly for a range of input conditions covering a large thermochemical state space. To this end, 1D LEM-based training seems to be a good strategy; however, further improvements are needed. Some improvements that should be considered are: accounting for the effects of pressure, the use of different types of energy spectra in the LEM equations, considering a range of LES filter sizes, etc. In addition, an adaptive training approach (Chi et al. 2021) can also be considered by employing a cost function associated with the accuracy and efficiency of the ANN model.

The ANN model for the reaction rates discussed in this chapter relied on a different network for each species. However, the reaction rates of the species are related to each other through the constraint of conservation of mass. This aspect is not addressed in the formulation considered here and can therefore be considered in future studies by following the approach used by physics-informed neural networks (Raissi et al. 2019). Although turbulent combustion modeling in the context of LES has mainly focused on robust and accurate modeling of the filtered reaction-rate term, ML tools can also be used for modeling the other unclosed terms such as SGS scalar flux, temperature, equation of state, etc. Such constraints and improvements through the use of ML tools can yield improved predictions, particularly under extreme conditions when large variations in thermochemical state space can occur, and should therefore be considered in future studies.
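The mass-conservation constraint mentioned above can be stated concretely: the mass-based net production rates of all species must sum to zero at every point. The sketch below shows, under stated assumptions, how the violation could be measured for rates predicted by independent per-species networks, and how a simple post hoc correction could remove it by redistributing the residual in proportion to the magnitude of each rate. This is not the physics-informed training of Raissi et al. (2019), which would instead build the constraint into the loss function; the function names and the sample values are hypothetical.

```python
import numpy as np

def mass_residual(omega):
    """Sum of mass-based species production rates; zero if mass is conserved."""
    return omega.sum(axis=-1)

def enforce_mass_conservation(omega, eps=1e-30):
    """Redistribute the residual in proportion to |omega_k| so the rates sum to zero.

    Species with zero predicted rate (e.g. inert species) are left unchanged,
    and no rate changes sign because |residual| <= sum of |omega_k|.
    """
    weights = np.abs(omega)
    total = weights.sum(axis=-1, keepdims=True)
    resid = omega.sum(axis=-1, keepdims=True)
    return omega - resid * weights / np.maximum(total, eps)

# Hypothetical per-species ANN outputs for two cells (rows) and four species
# (columns), in consistent mass units; the last species is inert.
omega_ann = np.array([[-1.0, -3.9, 4.4, 0.0],
                      [-0.2, -0.7, 1.0, 0.0]])
omega_fixed = enforce_mass_conservation(omega_ann)

print(mass_residual(omega_ann))    # nonzero where the networks violate the constraint
print(mass_residual(omega_fixed))  # zero to round-off after the correction
```

Such a correction only enforces the total-mass constraint; it does not guarantee conservation of individual elements, which would require additional linear constraints built from the species compositions.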