1 Introduction

In many areas of experimental research, particularly in fluid dynamics and combustion, extensive postprocessing is required to distill targeted information from raw imaging data. Segmentation and edge or object detection techniques are crucial for deriving important physical flow quantities from data obtained through imaging methods. Examples of features that may be sought include the coordinates of a dye plume (Yadav 2018), oil or smoke patterns (Arivoli 2023), the position of a shock wave (Kovács et al. 2023), vortical structures (Lindner et al. 2020) or particle clusters (Metzger et al. 2022), species concentrations and iso-surfaces (Zheng et al. 2022), and interfaces and zones (Reuther and Kähler 2018). The methodologies applied in this context span a range of experimental techniques (Tropea et al. 2007), including chemiluminescence (CL) (Guethe et al. 2012), laser induced fluorescence (LIF) (Eghtesad et al. 2024) and phosphorescence (LIP) (Charogiannis 2013), filtered Rayleigh scattering (FRS) (Doll et al. 2023), thermography (Astarita et al. 2006), schlieren imaging and shadowgraphy (Settles and Hargather 2017), high-speed photography (Versluis 2013), dye injection (Di et al. 2022), smoke visualization (Willmott et al. 1997), oil film measurements (Cai et al. 2022), as well as experiments involving shear- or temperature/pressure-sensitive liquid crystals (SLC, TLC) (Ireland and Jones 2000) and paints (TSP, PSP) (Gregory et al. 2008).

A good postprocessing algorithm for image segmentation must be accurate and computationally cheap, and should ideally require no user-based parameter tuning, so that a large set of images can be evaluated with the same settings and without human interference. As applications become more complex, conventional approaches based on brightness gradients in images often force these aspects to be traded off against one another.

Recently, Machine Learning (ML) has been increasingly integrated into problem-solving strategies in fluid dynamics (Brunton et al. 2020) and combustion (Zhou et al. 2022). Example applications relevant to combustion include data-driven physical modeling and estimation of properties (Eckart et al. 2022; González et al. 2020; Joo et al. 2015), reconstruction of scalar quantities and fields from measurement data (Barwey et al. 2022; Teutsch et al. 2023; Clark Di Leoni et al. 2023), object detection and classification (Roncancio et al. 2022; Ryu and Kwak 2021; Pulido et al. 2021), image segmentation (Vennemann and Rösgen 2020; Kuzu et al. 2022; Kashir et al. 2021), and real-time prediction of events that can be further used as inputs for control systems (Cellier et al. 2021; Li et al. 2022; Aliramezani et al. 2022). However, ML-based methods have not yet found their way into flow diagnostics as much as they have in other disciplines that deal with image processing techniques, such as robotic perception or biomedical imaging, although ML might offer significant benefits and could provide a valuable alternative to existing conventional techniques.

Fig. 1 Comparison of OH-PLIF image data quality under three different operating conditions. The images are false colored

Flame and flame front detection tasks have already been carried out with ML-based models for high-speed camera imaging. In the context of industrial burners and furnaces, Landgraf et al. (2023) and Groß et al. (2021) investigated segmentation of the whole flame brush for monitoring purposes. Sun et al. (2022) performed flame edge detection for premixed, diffusion and energetic material flames at atmospheric pressure in thermography, single-channel and RGB images with fairly good contrast to the background. In the context of combustion diagnostics research and for more demanding image conditions, there exist some applications of ML-based flame front detection and segmentation in optical SI-engines by Petrucci et al. (2022a, b) and Rufino et al. (2023) for pressures up to 0.38 MPa. However, to the knowledge of the authors, no ML-based application considering planar laser induced fluorescence images of the OH radical (OH-PLIF) and data recorded at high-pressure conditions exists in the literature. This is particularly relevant because the OH molecules are subject to collisional quenching at elevated pressures, leading to a loss of LIF signal; this significantly lowers the signal-to-noise ratio (SNR) of the image data, cf. Fig. 1, and thereby impedes the extraction of the flame fronts. Especially under low SNR conditions, conventional methods based on gradients of the OH signal exhibit deficiencies. Obtaining and evaluating accurate instantaneous flame fronts is of central importance for precisely deriving local quantities, such as flame curvature, to study flame dynamics and instability mechanisms. Hence, there is a need for more robust and reliable flame front extraction methods.

The aim of this paper is to demonstrate the possibilities of simple ML-based methods and to establish them as an alternative to advanced, highly tailored conventional flame front detection algorithms. For this purpose, the present work delves into the methodology for instantaneous flame front extraction from OH-PLIF images of turbulent premixed lean hydrogen-methane (H2–CH4) flames at elevated pressures. The experimental setup and data described in Faldella et al. (2023) were used. Further details on the employed setup and experimental conditions are summarized in the supplementary material for the interested reader. Firstly, established conventional extraction techniques and their limitations are discussed in Sect. 2, followed by an overview of ML-based approaches in Sect. 3. For the ML-based approaches, we motivate the powerful idea of convolutional filters in the context of neural networks and supervised learning. We then highlight different model categories and possible models that can be used for the task of extracting flame fronts. In Sect. 4, important implementation details are highlighted. The performance of a selection of ML-based flame front detection models and their behavior at low SNR are evaluated and put into perspective with conventional benchmark models in Sect. 5. Finally, in Sect. 6, the main findings and conclusions are summarized.

2 Conventional flame front detection

In principle, two pixel-based techniques can be distinguished for the extraction of structures in images: segmentation and edge detection. Segmentation is the process by which all pixels in the image are assigned to different classes and labeled accordingly, whereas edge detection refers to techniques that make use of gradients in images. For the extraction of flame fronts in OH-PLIF images, two methods have been established; they are referred to as the conventional methods in this manuscript. (1) The first approach (segmentation technique) aims to separate the zones of burnt from unburnt gases based on the intensity values of the OH signal in individual pixels. This is usually done by thresholding, with preceding preprocessing steps such as contrast enhancement and low-pass filtering to ensure the subsequent steps are not too sensitive toward noise. From there, the flame front can be extracted as the boundary between the two zones. The choice of the threshold has an essential impact. Often, thresholds are set manually and empirical values are used as starting points (Griebel et al. 2005). This may lead to uncertainties and cannot be executed unsupervised for significantly differing image regimes. The threshold can, however, also be selected statistically in a completely unsupervised manner using Otsu segmentation (Otsu 1979). This method is based on the histogram of all pixel intensity values in one image and maximizes the inter-class variance assuming a bimodal distribution. Adaptive or sliding-window thresholding (Bradley and Roth 2007) can be beneficial if strong spatial variations in illumination or intensity changes are present in the image. (2) The second approach (edge detection technique) makes use of a possible definition of the flame front which correlates the location of peak heat release with the location of peak OH-PLIF intensity gradients in a flamelet. The edge can be obtained from first-derivative-based filters which approximate the gradient, e.g. the Sobel, Prewitt, or Roberts operator, and the location of their peak. If second-derivative-based filters are applied, such as the LoG operator, the location of the zero crossing defines the edge (Yousaf et al. 2018). Canny edge detection (Canny 1986), for example, is a multi-step algorithm which makes use of derivative kernels to locate edges. The second approach (edge detection) is favored because, compared to the first (Otsu segmentation), it considers local information. Apart from the actual quantity of interest, the OH number density, the PLIF signal has a multi-dimensional dependency on pulse energy, upstream absorption, and thermodynamic conditions (Lacassagne et al. 2023). This is why global thresholding of intensities makes comparability rather challenging.
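To make the segmentation approach concrete, a minimal sketch using scikit-image is given below; the function name, the blurring parameter, and the use of find_boundaries are illustrative choices, not the exact implementation used in this work.

```python
import numpy as np
from skimage import filters, segmentation

def otsu_flame_front(image: np.ndarray, sigma_blur: float = 2.0) -> np.ndarray:
    """Binary mask of the boundary between burnt and unburnt zones."""
    smoothed = filters.gaussian(image, sigma=sigma_blur)  # low-pass filtering
    tau_otsu = filters.threshold_otsu(smoothed)           # unsupervised threshold
    burnt = smoothed > tau_otsu                           # burnt gas: high OH signal
    return segmentation.find_boundaries(burnt, mode='inner')
```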

Fig. 2 Flame front detection routine using the gradient method

A simple routine for the second approach (edge detection), referred to as the gradient method in this manuscript, might look as follows (depicted in Fig. 2a–c and e). In a first step, the retrieved OH-PLIF image is preprocessed by applying a low-pass filter, e.g. Gaussian blurring. Then the gradient is established with the \(3\times 3\) Sobel filter. Finally, to minimize erroneous predictions, a region of interest (ROI) is selected, corresponding to the averaged flame brush, in which the maximum of the gradient is extracted using the Canny method (Canny 1986), resulting in a binary mask which ought to correspond to the ground truth. The disadvantage of the gradient method is that two hysteresis threshold parameters, \(\tau _{\text {High}}\) and \(\tau _{\text {Low}}\), have to be chosen for the Canny edge detection, and that usually some sort of blurring, with parameter \(\sigma _{\text {Blur}}\) in the case of Gaussian blurring, has to be applied to cope with noise. The loss of information due to blurring mainly affects the small scales of the flame front, which are crucial for the rather challenging example of highly wrinkled turbulent premixed flames. Additionally, the operating conditions of the combustor have a strong influence on the image quality. In particular, as the pressure increases, the fluorescence yield decreases approximately in inverse proportion due to collisional quenching; thus, the signal-to-noise ratio (SNR) decreases significantly (Tu et al. 2020). This makes increased blurring a necessity and affects the quality and uncertainty of the extracted flame fronts derived from the raw data. The loss of information due to blurring should therefore be kept minimal. Furthermore, thermo-diffusive effects in the case of H2-blends lead to lower OH concentrations in concavely curved regions of the flame front (Bell et al. 2007). The OH gradients in these regions can possess magnitudes similar to those of non-flame-front structures, which makes the choice of the hysteresis thresholds extremely difficult. For these reasons, the processing has to be carried out in an image- or regime-specific manner under supervision, especially in the case of preferential- or thermo-diffusively unstable mixtures.

In an attempt to make the choice of these hysteresis thresholds more autonomous, they are sometimes set to \(\tau _{\text {High}}=\tau _{\text {Otsu}}\) and \(\tau _{\text {Low}}=\tau _{\text {Otsu}}/2\) (Setiawan et al. 2017). However, this does not always yield satisfactory results. A recently published paper reported the use of the Filtered Canny algorithm: Chaib et al. (2023) used the rough flame front contour obtained by an initial Otsu segmentation as ROI for Canny edge detection. Advanced preprocessing schemes such as nonlinear edge-preserving filters and contrast enhancement techniques were applied in the aforementioned work, and user-based parameter tuning was eliminated. Nevertheless, even without sophisticated preprocessing, the idea of a preliminary Otsu segmentation to obtain a streamlined and much smaller ROI for Canny edge detection, cf. Fig. 2d, enables reduced blurring and considerably simplifies the parameter selection for Canny edge detection. A minimal sketch of this idea is given below.
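The following sketch, again using scikit-image, outlines the Otsu-aided variant under the heuristics stated above (\(\tau _{\text {High}}=\tau _{\text {Otsu}}\), \(\tau _{\text {Low}}=\tau _{\text {Otsu}}/2\), thickened contour as ROI); parameter values and helper choices are illustrative.

```python
import numpy as np
from skimage import feature, filters, morphology, segmentation

def otsu_aided_gradient(image, sigma_blur=1.0, roi_halfwidth=8):
    smoothed = filters.gaussian(image, sigma=sigma_blur)   # low-pass filter
    tau_otsu = filters.threshold_otsu(smoothed)
    # Preliminary Otsu segmentation -> rough flame front contour as ROI
    contour = segmentation.find_boundaries(smoothed > tau_otsu)
    roi = morphology.binary_dilation(contour, morphology.disk(roi_halfwidth))
    # Canny edge detection with the Otsu-derived hysteresis thresholds
    edges = feature.canny(image, sigma=sigma_blur,
                          low_threshold=tau_otsu / 2, high_threshold=tau_otsu)
    return edges & roi    # keep only edges inside the streamlined ROI
```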

3 Deep learning-based flame front detection

3.1 Motivation for deep convolutional neural networks

Classical pixel-based image postprocessing most commonly involves some kind of filtering, whether for noise removal or for other specific tasks, e.g., the computation of a gradient. The convolution process is illustrated in Fig. 3 for the case of a \(3\times 3\) Sobel filter. Filters for different applications differ in their predefined weights but are applied in the same way. The limitations of conventional flame front detection methods due to filtering, mentioned in Sect. 2, raise the question of an alternative approach: does an optimal filter for a particular task exist, and how can it be found? Instead of manually designing and testing filters with predefined weights, the optimal weights of a convolutional filter can also be learned in a Neural Network (NN), i.e. in a Convolutional Neural Network (CNN), in a supervised way. A brief overview of the ML nomenclature used in the remainder of this manuscript can be found in Table 1. A simple exemplary architecture of a deep CNN, the VGG16 (Simonyan and Zisserman 2015), is shown in Fig. 4. Each block consists of several convolutional layers, in which a number of convolutional filters (cf. Fig. 3) are applied and followed by an activation function, as well as a pooling layer (cf. dimensions in Fig. 4 for an exemplary image of size \(256\times 512\) pixels). The resulting excellent ability to perceive underlying features has made CNNs, especially deep CNNs, an indispensable tool in modern image processing.

Fig. 3 Application of the horizontal \(3\times 3\) Sobel kernel (green) on an array (blue). The resulting value is stored in the position of the array shaded in orange. A \(3\times 3\) convolutional filter with undefined learnable weights is depicted in gray

Table 1 Machine Learning nomenclature in the context of flame front detection
Fig. 4 Schematic of the VGG16 backbone architecture (without fully connected layers). Layer dimensions are indicated for an exemplary image of size \(256\times 512\) pixels

The learning process in a NN always occurs in two steps: (1) inference or forward pass, and (2) backpropagation or backward pass. (1) During inference, a batch of samples from the dataset is passed through the network. All the weights according to the current training state are applied to obtain an output prediction, and the loss is computed as preparation for the backpropagation step. The choice of the loss function is highly problem specific and has a significant impact on the outcome. (2) Since this is a non-convex optimization problem, the loss is minimized iteratively with a gradient descent algorithm. Hence, in the second step, the gradient of the loss with respect to each weight is computed over all its connections from the output back to the respective layer, i.e. propagated back. These steps are repeated until a stopping criterion is met.
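A minimal sketch of these two steps in TensorFlow (the framework used later in Sect. 4), assuming a `model` and a `loss_fn` already exist:

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)

@tf.function
def train_step(model, loss_fn, images, labels):
    # (1) Inference / forward pass: apply current weights, compute the loss
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_fn(labels, predictions)
    # (2) Backpropagation / backward pass: gradient of the loss w.r.t. every
    # trainable weight, followed by one gradient-descent (Adam) update
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss
```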

It is important to note that, due to the non-convexity of the problem, only local minima are found, determined by the learning rate and the starting point of gradient descent. Moreover, the solution is non-unique and NNs are not inherently invariant to transformations of their input. To address this, transfer learning and data augmentation are generally used, especially when dealing with sparse training data and an increased risk of overfitting. Adopting a pretrained network leads to a starting point in the loss landscape expected to be closer to the targeted minima, and therefore to a higher probability of obtaining a good solution and reduced computational effort for training. Data augmentation promotes network invariance, mitigates overspecialization on certain features, and artificially enlarges the dataset to a certain extent. Further concepts and the underlying mathematics can be found in, e.g., Murphy (2012).

3.2 Models for flame front detection

A thorough model overview in the general context of modern object segmentation, not limited to the semantic image segmentation applicable in our case, is given in Wang et al. (2022). As a counterpart, a solid review of edge detection technology is provided by Shou-Ming et al. (2021). This work focuses on CNN-based Deep Learning (DL) models in order to leverage their excellent abilities for image processing and the idea of learning optimal filters, as motivated in the previous subsection. Hence, traditional ML and spectral clustering methods were not considered in this work. In the following, the most important DL models relevant to flame front detection are presented and further options are illustrated. More details about the architectures can be found in the original sources provided. A distinction can be made between two categories: (1) the encoder-decoder type CNN, and (2) the multiscale fusion type CNN.

(1) In order to compute a loss, the output dimensions of the CNN need to match the label, which has the same spatial resolution as the input. Encoder-decoder type CNNs achieve this by attaching a decoder, usually mirroring the encoder backbone, to the encoder, connected via the latent space. A prime example of this architecture is the U-Net (Ronneberger et al. 2015), originally developed for semantic segmentation applications in biomedical imaging. This model additionally has skip connections (cf. gray arrows in Fig. 5), concatenating arrays from convolutional blocks of similar sizes between the encoder and decoder, in order to combine information from the down- and up-sizing branches of the model. Over time, improvements were made and more sophisticated models were developed, for example the Attention U-Net (Oktay et al. 2018), the Residual U-Net (Zhang et al. 2018), and the U-Net++ (Zhou et al. 2018), to name a few. More detailed illustrations and explanations are summarized in Siddique et al. (2021). LinkNet (Chaurasia and Culurciello 2017), on the other hand, was developed as an alternative with fewer model parameters in order to be computationally more efficient for applications in embedded systems. In contrast to the concatenation connections of the basic U-Net, LinkNet bypasses the input of each encoder layer directly and adds it to the output of the corresponding decoder layer. A toy example of the encoder-decoder idea is sketched below.
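The following toy model reduces the encoder-decoder idea to a single resolution level with one skip connection (cf. the gray arrows in Fig. 5); layer widths and the sigmoid output are illustrative.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, n_filters):
    x = layers.Conv2D(n_filters, 3, padding='same', activation='relu')(x)
    return layers.Conv2D(n_filters, 3, padding='same', activation='relu')(x)

inputs = tf.keras.Input(shape=(256, 512, 1))
enc = conv_block(inputs, 64)                            # encoder block
latent = conv_block(layers.MaxPooling2D(2)(enc), 128)   # latent space
up = layers.Conv2DTranspose(64, 2, strides=2, padding='same')(latent)
up = layers.Concatenate()([up, enc])                    # skip connection
dec = conv_block(up, 64)                                # decoder block
outputs = layers.Conv2D(1, 1, activation='sigmoid')(dec)  # soft prediction
unet = tf.keras.Model(inputs, outputs)
```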

(2) A different approach is chosen for the multiscale fusion type CNN. Here, a prime example is the Holistically-Nested Edge Detector (HED) (Xie and Tu 2015), schematically depicted in Fig. 6. The idea is to branch off arrays from each convolutional block and to enlarge them back to the original image size (side outputs). A loss is computed for each individual side output, leading to an approximation of the label in the corresponding layer of the HED. The output of the model (fuse-channel), for which a loss is also computed, consists of the fusion of the side outputs. The side outputs after the lower-resolution blocks only consist of very coarse features. When these coarse features are merged with the very detailed side outputs from the upper levels, a smearing effect acts on the model output (fuse-channel). Similarly to the U-Net, there are also further improved and related models of the multiscale fusion type, such as N\(^4\)-fields (Ganin and Lempitsky 2014), DeepContour (Shen et al. 2015), Richer Convolutional Features (RCF) (Liu et al. 2019), or DeepEdge (Bertasius et al. 2015). The side-output mechanism is sketched below.
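The side-output mechanism can be sketched in Keras as follows; the tapped layer names follow the Keras VGG16 application naming, while the \(1\times 1\) convolutions and bilinear upsampling are illustrative simplifications of the original HED.

```python
import tensorflow as tf
from tensorflow.keras import layers

backbone = tf.keras.applications.VGG16(include_top=False,
                                       input_shape=(256, 512, 3))
tap_layers = ['block1_conv2', 'block2_conv2', 'block3_conv3',
              'block4_conv3', 'block5_conv3']

side_outputs = []
for i, name in enumerate(tap_layers):
    # Branch off each block, reduce to one channel, enlarge to input size
    side = layers.Conv2D(1, 1)(backbone.get_layer(name).output)
    side = layers.UpSampling2D(2 ** i, interpolation='bilinear')(side)
    side_outputs.append(layers.Activation('sigmoid')(side))

# Fuse-channel: learned combination of all side outputs; during training a
# loss is computed for the fusion and for each side output individually
fuse = layers.Conv2D(1, 1, activation='sigmoid')(
    layers.Concatenate()(side_outputs))
hed = tf.keras.Model(backbone.input, side_outputs + [fuse])
```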

Furthermore, there are hybrid models that combine both ideas, that of an encoder-decoder and that of a fusion of outputs from multiple layers. These models usually run several streams or pyramids in parallel; examples are the HED-UNet (Heidler et al. 2022) and the Pyramid Scene Parsing Network (PSPNet) (Zhao et al. 2017). Also worth mentioning is a family of models for instance segmentation, which refers to the simultaneous detection of objects and their segmentation on the respective patch in the image. An example is the family of the Region-Based CNN (R-CNN) with its successors, such as the Mask R-CNN (He et al. 2017). Some of these models, however, e.g. the Feature Pyramid Network (FPN) (Lin et al. 2017), can also readily be used for semantic segmentation.

Fig. 5 Schematic of the basic U-Net using a VGG16 backbone

Fig. 6 Schematic of the basic HED using a VGG16 backbone

4 Implementations

4.1 Dataset, preprocessing and metrics

Labels are needed in order to train the DL models and to evaluate all models, including the conventional methods. In the ideal case, the labels correspond to the ground truth. For the task of flame front detection in OH-PLIF images, labels are not readily available, so they have to be created first. Thus, the images were annotated with the labeling tool Label Studio. The accuracy of the hand-labeling step depends on the performance of the human annotator; hand labeling is nevertheless generally accepted and common practice in ML, especially when done by multiple domain experts (Karimi et al. 2020). We assume that the created labels are a reasonable approximation of the ground truth when they fit the raw image and lie within the lines of the steepest gradients, using the convenient definition of the flame front which correlates the location of peak heat release with the location of peak OH-PLIF intensity gradients.

Table 2 Utilized dataset. Additional information is provided in the supplementary material
Table 3 Utilized evaluation metrics
Table 4 Parameter settings leading to the best possible IoU-scores for the implemented conventional methods serving as benchmarks

Producing a large number of good-quality labels is not very feasible, as it is a time-consuming process. For this work, the dataset consisted of a total of 60 annotated images; 10 images each were gathered at 6 different operating conditions. Further specifics can be found in Table 2. The following preprocessing steps were applied to the entire dataset of single-channel images (OH intensity signal): the background was subtracted, the images were corrected for the laser sheet inhomogeneity w.r.t. the vertical coordinate, and a crop was made to discard the irrelevant section of the images. The cropped data was resized to \(512\times 256\) pixels using bicubic interpolation in order to map all images to the same size. Then, min-max scaling was applied such that all 8-bit images possess integer values \(\in \{0, \ldots , 255\}\). The SNR was calculated according to Sweeney and Hochgreb (2009), using the formulation

$$\begin{aligned} \hbox {SNR} = \dfrac{\mu _{P}-\mu _{R}}{\sigma _{P}} , \end{aligned}$$
(1)

where \(\mu\) is the mean and \(\sigma\) the standard deviation of all pixel intensities measured on either the product side (subscript \(_P\)) or the reactant side (subscript \(_R\)). Since clearly separated unburnt and burnt zones are needed for this purpose, the images were binarized after edge detection with the Otsu-Aided Gradient Method described in Sect. 4.2. Mean SNR and standard deviations over the image set in a regime (\(\mu _{\text {Set}}\), \(\sigma _{\text {Set}}\)) are indicated in Table 2. A minimal sketch of this evaluation is given below.
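The sketch assumes a binary mask separating the product side (True) from the reactant side (False); names are illustrative.

```python
import numpy as np

def snr(image: np.ndarray, burnt_mask: np.ndarray) -> float:
    """SNR according to Eq. (1); `burnt_mask` is True on the product side."""
    mu_p = image[burnt_mask].mean()       # mean intensity, product side
    mu_r = image[~burnt_mask].mean()      # mean intensity, reactant side
    sigma_p = image[burnt_mask].std()     # intensity fluctuations, product side
    return (mu_p - mu_r) / sigma_p
```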

All metrics employed in this work are listed in Table 3. They are computed from the confusion matrix of hypothesis testing, where all predicted pixels are compared to the label and thus assigned as either False Positive (FP), True Positive (TP), False Negative (FN), or True Negative (TN). In the context of flame front detection, Positives (P) mark pixels predicted as flame front and Negatives (N) mark pixels predicted as background. True (T) and False (F) indicate a correct or incorrect identification of the pixels w.r.t. the ground truth, respectively. The metrics in Table 3 were chosen because most of them are independent of the majority class of TN in our highly unbalanced task, in which flame fronts occur much less frequently than the background class; the flame front class accounted for 2.122% of all pixels in the entire dataset. Note that the resulting metrics for the same flame can change depending on the cropped image region and the resolution.
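For reference, the following sketch computes the metrics of Table 3 from the pixel-wise confusion matrix; zero-division guards are omitted for brevity.

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, label: np.ndarray) -> dict:
    pred, label = pred.astype(bool), label.astype(bool)
    tp = np.sum(pred & label)       # flame front pixels correctly found
    fp = np.sum(pred & ~label)      # background predicted as flame front
    fn = np.sum(~pred & label)      # missed flame front pixels
    tn = np.sum(~pred & ~label)     # background correctly identified
    return {
        'IoU':       tp / (tp + fp + fn),
        'F1':        2 * tp / (2 * tp + fp + fn),
        'Precision': tp / (tp + fp),
        'Recall':    tp / (tp + fn),   # = TPR
        'FPR':       fp / (fp + tn),   # the only one involving the TN majority
    }
```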

4.2 Benchmark models

Conventional benchmark methods were implemented and evaluated to put the results of the DL models into perspective. The following three conventional methods, already roughly introduced in Sect. 2, were used: (1) the gradient method with manual selection of the hysteresis thresholds \(\tau _{\text {High}}\) and \(\tau _{\text {Low}}\) for Canny edge detection, referred to as the "Gradient Method (\(\tau\) Supervised)" in this manuscript. (2) As an alternative to the manual selection of these hysteresis thresholds, a gradient method where \(\tau _{\text {High}}\) and \(\tau _{\text {Low}}\) were set to \(\tau _{\text {Otsu}}\) and \(\tau _{\text {Otsu}}/2\), respectively, herein referred to as the "Gradient Method (\(\tau\) Unsupervised)". (3) As a third conventional method, the idea of a preliminary Otsu segmentation to obtain a streamlined ROI for Canny edge detection, herein referred to as the "Otsu-Aided Gradient Method", was implemented as a simplification of the Filtered Canny algorithm proposed by Chaib et al. (2023), excluding the applied preprocessing schemes. The flame front contour of the preliminary Otsu segmentation was thickened by 8 pixels. In all three benchmark methods, only blurring was used as preprocessing, as all other advanced preprocessing schemes can also be used for DL methods and are therefore not relevant to this comparison. The blurring parameter \(\sigma _{\text {Blur}}\) must be selected by the user in a supervised manner for all three conventional benchmark methods. These parameters (\(\tau _{\text {High}}\), \(\tau _{\text {Low}}\) and \(\sigma _{\text {Blur}}\)) only provide satisfactory results if they are determined specifically for each image and combustion regime. To highlight the best possible results, a grid search over all parameters was performed to find the optimal parameters in terms of IoU-score; a sketch is given below. This score is the main metric in the present work, as it was used in the loss function of the DL models, which were subsequently optimized for it. The grid search started with \(\sigma _{\text {Blur}}\) of the Gradient Method (\(\tau\) Unsupervised). The same blurring parameter was used for the Gradient Method (\(\tau\) Supervised), for which both hysteresis thresholds were optimized thereafter. For the Otsu-Aided Gradient Method, solely \(\sigma _{\text {Blur}}\) was optimized; the thresholds \(\tau _{\text {High}}\) and \(\tau _{\text {Low}}\) were set to \(\tau _{\text {Otsu}}\) and \(\tau _{\text {Otsu}}/2\). The optimized regime-specific parameters of all conventional methods are summarized in Table 4.
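A sketch of such a grid search is given below; the parameter ranges, and the `detect` and `iou` callables, are illustrative stand-ins for the actual implementations.

```python
import itertools
import numpy as np

def grid_search(images, labels, detect, iou,
                sigmas=np.linspace(0.5, 4.0, 8),
                taus_high=np.linspace(0.05, 0.4, 8)):
    """Return the (sigma_blur, tau_high) pair maximizing the mean IoU-score."""
    best_score, best_params = -1.0, None
    for sigma, tau_high in itertools.product(sigmas, taus_high):
        preds = [detect(im, sigma, tau_high, tau_high / 2) for im in images]
        score = np.mean([iou(p, l) for p, l in zip(preds, labels)])
        if score > best_score:
            best_score, best_params = score, (sigma, tau_high)
    return best_params, best_score
```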

As a further benchmark, and to assess the quality of annotation and segmentation by human eye and hand, the same set of images (Table 2) was labeled by a different person and subsequently evaluated. This result is presented in Sect. 5 as a point of reference together with the results of the other conventional methods.

Table 5 Model size and training parameters for the implemented DL models until convergence

4.3 Deep learning models

The simplest forms of the encoder-decoder and the multiscale fusion type were implemented, namely a U-Net and a HED, as well as the LinkNet and FPN as alternatives. Furthermore, the influence of several backbone architectures and of attention connections was tested on the U-Net. For a fair comparison independent of the platform’s performance, all implementations were made in Python (3.10.12) on Google Colab, with TensorFlow (2.15.0) and the Keras (2.15.0) application programming interface (API). The HED architecture was written from scratch, while the architectures of the U-Net with different backbones, the LinkNet and the FPN were built using the Keras library segmentation-models (1.0.1), and the Attention U-Net was built using keras-unet-collection (0.1.13). In the following, some important concepts are highlighted and the implications with respect to the implementations are discussed.

As usual in ML, the total dataset was divided into a training, validation and test set in order to monitor overfitting during training (training and validation set) and to perform unbiased evaluations with the final model (test set). These subsets were randomly drawn with a split ratio of 2/3, 1/6, and 1/6, respectively. To ensure adequate significance of the evaluations, care was taken that the different regimes and SNR conditions were reasonably represented in each subset.

The annotated dataset is not particularly large for reasons of feasibility. This is a significant disadvantage for the use of DNNs, which are very prone to overfitting. However, there are two very effective measures to counteract this data sparsity: (1) transfer learning and (2) data augmentation.

(1) In this work, pretrained weights were adopted from ImageNet (Deng et al. 2009). These weights were deployed on the VGG16 (Simonyan and Zisserman 2015), ResNet50 (He et al. 2016) and EfficientNet-B5 (Tan and Le 2019) backbones, respectively. Those versions of ResNet and EfficientNet were chosen for their similar numbers of trainable parameters, in order to investigate the effect of backbone complexity independently of an increasing number of parameters. A trainable convolution layer was added to the input of the networks in order to map the single-channel OH-PLIF image to the network's predefined 3-channel input. All activation functions were retained, except for a softmax that was added to the final layer. The weights of the encoder of all U-Nets and the LinkNet, as well as the first pyramid of the FPN, were frozen in order to save computational effort during training and to maximize the impact of the pretrained latent space on the segmentation output.
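A sketch of this setup using the segmentation-models API is given below; the sigmoid output is a simplification of the softmax used in this work, and the \(1\times 1\) mapping convolution is one possible realization of the added input layer.

```python
import os
os.environ['SM_FRAMEWORK'] = 'tf.keras'   # use the tf.keras backend
import segmentation_models as sm
import tensorflow as tf

# Pretrained U-Net with frozen VGG16 encoder (ImageNet weights)
base = sm.Unet('vgg16', input_shape=(256, 512, 3), classes=1,
               activation='sigmoid', encoder_weights='imagenet',
               encoder_freeze=True)

# Trainable convolution mapping the single-channel OH-PLIF image to the
# pretrained 3-channel input (illustrative realization)
single_channel = tf.keras.Input(shape=(256, 512, 1))
mapped = tf.keras.layers.Conv2D(3, kernel_size=1)(single_channel)
model = tf.keras.Model(single_channel, base(mapped))
```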

(2) In this work, random combinations of translations in integer pixel steps, mirroring around both main axes, and rotations in 90-degree steps were applied. Other transformations such as scaling or shearing were omitted, as they are not pixel-matching, in order to avoid errors resulting from the interpolation and subsequent rounding to \(\in \{0,1\}\) needed to obtain the augmented label. It was found that such transformations have a widening or fragmenting effect on the flame front label, which significantly reduced the model performance. In addition to the artificial increase in dataset size, data augmentation also helped ensure that the network did not specialize on background features, such as the inlet of the combustor and the anchor points of the flame, but judged purely based on visual information.
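A sketch of such a pixel-matching augmentation is given below; for the non-square \(256\times 512\) frames, rotations are restricted to 180° here to preserve the array shape, and the cyclic shift is a simplified stand-in for translation with padding.

```python
import numpy as np

def augment(image: np.ndarray, label: np.ndarray, rng: np.random.Generator):
    # Rotation: 0 or 180 degrees so non-square frames keep their shape
    # (odd multiples of 90 degrees would swap the axes)
    if rng.random() < 0.5:
        image, label = np.rot90(image, 2), np.rot90(label, 2)
    # Mirroring around both main axes
    if rng.random() < 0.5:
        image, label = np.flipud(image), np.flipud(label)
    if rng.random() < 0.5:
        image, label = np.fliplr(image), np.fliplr(label)
    # Translation in integer pixel steps; no interpolation touches the label
    dy, dx = rng.integers(-20, 21, size=2)
    image = np.roll(image, (dy, dx), axis=(0, 1))
    label = np.roll(label, (dy, dx), axis=(0, 1))
    return image, label
```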

For the loss function, Binary Cross-Entropy (BCE), Weighted Cross-Entropy, Focal loss, Dice loss (based on the F1-score), Jaccard loss (based on the IoU-score), Tversky loss, and all combinations thereof were tested. The mathematical formulations of these loss functions can be found, e.g., in Xu et al. (2023). The BCE-Jaccard loss, cf. Eq. 2, was found to perform best and was thus subsequently used to train all models described in Table 5. It can be written as

$$\begin{aligned} {\mathscr {L}_{\text {BCE-Jaccard}} = (1-\alpha ) \mathscr {L}_{\text {BCE}} + \alpha \mathscr {L}_{\text {Jaccard}},} \end{aligned}$$
(2)

where

$$\begin{aligned}\mathscr {L}_{\text {BCE}}&= - \frac{1}{N} \sum _{i=1}^{N} \left( y_i \log ({\hat{p}}_i) + (1 - y_i) \log (1 - {\hat{p}}_i)\right) , \\ \mathscr {L}_{\text {Jaccard}}&= 1 - \dfrac{\sum _{i=1}^{N} y_i {\hat{p}}_i +1}{\sum _{i=1}^{N} y_i+\sum _{i=1}^{N} {\hat{p}}_i-\sum _{i=1}^{N} y_i {\hat{p}}_i +1} , \end{aligned}$$

the binary label for the i-th pixel is \(y_i \in \{0,1\}\), the predicted probability of the i-th pixel corresponding to the positive class is \({\hat{p}}_i\in [0,1]\), and \(\alpha =\) 0.5 was used.
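A direct implementation of Eq. (2) in TensorFlow might look as follows; the smoothing constant of 1 follows the formulation above.

```python
import tensorflow as tf

def bce_jaccard_loss(y_true, y_pred, alpha=0.5, smooth=1.0):
    """BCE-Jaccard loss of Eq. (2) for binary flame front segmentation."""
    y_true = tf.cast(y_true, y_pred.dtype)
    bce = tf.reduce_mean(
        tf.keras.losses.binary_crossentropy(y_true, y_pred))
    intersection = tf.reduce_sum(y_true * y_pred)
    union = tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) - intersection
    jaccard = 1.0 - (intersection + smooth) / (union + smooth)
    return (1.0 - alpha) * bce + alpha * jaccard
```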

Google Colab’s free graphics processing units (NVIDIA Tesla T4 GPU) were used for accelerated training. During the training process, the CNNs were fed with the training and validation data in batch sizes of 10 images, applying data augmentation as part of the pipeline. The Adam optimizer, a stochastic gradient descent method with adaptive estimation of moments, was applied. The models were trained in two steps with decreasing learning rates until convergence, while the most important metrics were monitored for the training and validation sets separately with livelossplot (0.5.5). The convergence criterion relied on the divergence of the monitored learning curves of the training and validation set, i.e. when the loss of the validation set reached a plateau while the loss of the training set still decreased. Various model states surrounding the corresponding epoch were subsequently assessed both quantitatively and visually based on the validation set. A second round of training commenced from the best-scoring model state not showing signs of overfitting, this time with a lower learning rate, to achieve a slightly improved position within the local minimum of the loss landscape. The same criteria were applied to select the final model state. Furthermore, this procedure was repeated for each model with two additional train/val/test splits to check that the final model state was representative. The number of trainable parameters, the learning rates with the corresponding numbers of training epochs, and the time requirement until convergence are shown in Table 5.

The outputs of CNNs are soft predictions, i.e. probabilities. Hence, a threshold has to be set during inference to obtain a binary mask of the predicted flame front. This threshold \(\tau _{\text {Hard}}\) can be selected (1) based on the maximum likelihood, which in the binary case corresponds to \(\tau _{\text {Hard}}\) = 0.5, (2) based on maximizing a single metric, e.g. the IoU-score, or (3) based on the relative performance of multiple metrics, e.g. with the Receiver Operating Characteristic (ROC) or Precision-Recall Curve (PRC). The thresholds of options (2) and (3) are usually determined on the validation set and shipped with the final model, in order to avoid turning \(\tau _{\text {Hard}}\) into a hyperparameter that has to be selected during inference. A sketch of option (2) is given below.
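The sketch sweeps candidate thresholds over validation predictions and keeps the one maximizing the mean IoU-score; the threshold grid is illustrative.

```python
import numpy as np

def best_iou_threshold(soft_preds, labels, taus=np.linspace(0.05, 0.95, 19)):
    """Return tau_Hard maximizing the mean IoU on the validation set."""
    def iou(pred, label):
        return np.sum(pred & label) / (np.sum(pred | label) + 1e-9)
    scores = [np.mean([iou(sp > tau, lb.astype(bool))
                       for sp, lb in zip(soft_preds, labels)])
              for tau in taus]
    return float(taus[int(np.argmax(scores))])
```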

5 Results and discussion

5.1 Model evaluation

In this section, all methods and models are evaluated on the test set, which was unseen so far. The ROC and PRC curves (curves for variable thresholds \(\tau _{\text {Hard}}\)) are presented in Fig. 7. Note that the conventional methods do not provide soft predictions and therefore only yield one point in this figure. While one of the above options (1, 2 or 3) could have been chosen to determine \(\tau _{\text {Hard}}\) for inference, we now want to highlight the best possible IoU-score for a fair comparison of peak performance between all methods and models on the test set. Similar to the parameter optimization for the conventional methods, the threshold that produces the best possible IoU-score, i.e. \(\tau _{\text {Best IoU}}\), is marked in Fig. 7, and all corresponding metrics are listed in Table 6. These states of the DL models significantly outperformed the conventional methods, among which the Otsu-Aided Gradient Method performed best: the IoU-score and Recall (=TPR) were factors of \(\approx\) 1.2 and \(\approx\) 2.5 higher, respectively, at similar FPR and \(\approx\) 1.15 times better Precision. The DL models produced results similar to the segmentation by human eye and hand, where clearly the labeling by hand is the limiting factor, though they still remained far from a perfect classifier. In Fig. 7, it can be seen that the curve of the Attention U-Net VGG16 lies above the curves of all other models in both the ROC and PRC, independent of the threshold. A comparison of the individual metrics among the DL models in Table 6 shows that almost all high scores were achieved by the Attention U-Net VGG16, with the exception of Recall, where the U-Net EfficientNet-B5 achieved the highest value.

Fig. 7 Evaluation of the model performances on the whole test set based on a variable threshold by characteristic curves. Note that the conventional methods only yield one point since no soft predictions are obtained. The locations corresponding to the best possible IoU-score are indicated with a marker. All scores corresponding to this state as well as the areas under the characteristic curves (AUROC and AUPRC) are displayed in Table 6. Results of a perfect and a random (no skill) classifier as well as of human eye and hand are indicated for reference

Table 6 Evaluation of the model performances on the test set based on a fixed threshold

Typical predictions of different models are depicted in Figs. 8 and 9 for a moderate and a lower SNR, respectively. It can be seen that the Otsu-Aided Gradient Method had some erroneous predictions in Fig. 8 and many in Fig. 9. In both figures, these cannot be avoided unless more blurring is applied and details are lost, which in turn would also affect the scores. Further, the HED produced flame front predictions of uneven thickness, stemming from its main idea: the fusion of several differently coarse and smeared layers of the CNN. Some DL models possessed better capabilities than others in properly detecting burnt and unburnt gas pockets, while some models featured more characteristic high-frequency, not necessarily connected flame front predictions. For that matter, three different U-Net variations are depicted for comparison in the lower column of both Figs. 8 and 9. The Attention U-Net, for example, showed high-frequency contours close to but not within the ground truth. This behavior is probably due to the attention gates adapting this way for noisy images. To avoid such non-physical, high-frequency erroneous predictions in general, a formulation penalizing them, or for example one containing connectivity and curvature conditions, could be incorporated into the loss function in Eq. (2). Moreover, models with more complex backbones seemed to visually retrieve a slightly higher fraction of relevant instances, specifically concerning pockets. This can also be concluded from the EfficientNet's highest Recall (=TPR) of all basic U-Net models, and the monotonic increase in Recall over the VGG16, ResNet50, and EfficientNet-B5 backbones.

Fig. 8 Characteristic predictions of different models overlaid in white on a raw image of regime #2. The SNR is 2.189

Fig. 9 Characteristic predictions of different models overlaid in white on a raw image of regime #6. The SNR is 1.833

However, too much confidence should not be placed in observations from individual visual samples of the test set, nor in comparisons between tiny differences in the scores of individual DL models. This is, firstly, due to the non-convex nature of training NNs and obtaining the optimal model state and, secondly, due to our sparse dataset, considering the statistical significance of the few samples in training/validation as well as testing. The second influence can only be eliminated in the limit of an infinite dataset. Rather, this comparison aims to point out the differences between the two classes of models, DL and conventional. All in all, two trends can be noticed. (1) The DL models performed much better in capturing small-scale structures (mainly evident from higher Recall scores) while at the same time having fewer erroneous positive predictions (supported by higher Precision scores) than the conventional methods, for which mitigating erroneous predictions is only possible with increased blurring and an associated loss of small-scale accuracy. (2) The DL models produced much more satisfying results than the conventional methods for image conditions with lower SNR. This can also be observed when the IoU-scores of, e.g., the U-Net VGG16 and the Otsu-Aided Gradient Method are evaluated on each regime separately, as shown in Table 7. All DL models performed better on the most challenging regime than the best conventional model did on the easiest regime.

Table 7 Performance of the U-Net VGG16 and the Otsu-Aided Gradient Method on each regime of the test set

5.2 Performance limits for low signal-to-noise ratios

Real combustion systems operate at much higher pressures than the conditions of the dataset utilized in this study. In addition, hand labeling might become impossible with further deteriorating image quality. Therefore, the applicability limits of DL models outside the image regimes included in the training data are extremely important. To investigate this behavior, the test set of the labeled dataset was distorted with artificial noise, and the IoU-scores were evaluated for predictions of all U-Net variations and the Otsu-Aided Gradient Method. The noise was modeled as a combination of Gaussian noise with a standard deviation in the range of \(\sigma\) = 0–23.5% of the maximum pixel value and a pixel dropout rate (pepper noise) in the range of 0–15%, which yielded SNR between 0.463 and 2.572. As before, the best possible IoU-score per data point was found by optimizing the threshold (\(\tau _{\text {Hard}}\)) or blurring (\(\sigma _{\text {Blur}}\)) parameters for the DL and conventional models, respectively, using a grid search. A sketch of the applied distortion is given below.
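The sketch combines additive Gaussian noise with a pixel dropout rate on 8-bit images scaled to [0, 255]; the exact noise pipeline used in this work may differ in detail.

```python
import numpy as np

def distort(image: np.ndarray, gauss_std: float, pepper_rate: float,
            rng: np.random.Generator) -> np.ndarray:
    """gauss_std as a fraction of the maximum pixel value (0 to 0.235),
    pepper_rate as the fraction of dropped pixels (0 to 0.15)."""
    noisy = image + rng.normal(0.0, gauss_std * 255.0, image.shape)
    drop = rng.random(image.shape) < pepper_rate   # dropped pixels set to zero
    noisy[drop] = 0.0
    return np.clip(noisy, 0, 255).astype(np.uint8)
```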

The behavior of the Otsu-Aided Gradient Method, U-Net VGG16, U-Net EfficientNet-B5 and Attention U-Net VGG16 is depicted in Fig. 10. The Attention U-Net VGG16 was the most resilient toward artificial noise distortions in terms of IoU-score. From the intersection of the two smoothing splines, it can be concluded that the Attention U-Net VGG16 performed on average better than the Otsu-Aided Gradient Method for SNR \(\gtrsim\) 0.85, with the performance difference starting to rise significantly from SNR \(\gtrsim\) 1.25. Similarly, all other DL models started to significantly outperform the Otsu-Aided Gradient Method from SNR \(\gtrsim\) 1.4 at the latest. Moreover, the Attention U-Net VGG16 could maintain better Recall values for decreasing SNR than the other DL models, but in return seemed to trade Precision, which remained very similar to that of the other DL models. Finally and most importantly, the bottom plot in Fig. 10 reveals that the U-Net VGG16 was the only model not showing an increasing trend in FPR for decreasing SNR. It appears that this model predicted the flame front class very conservatively below SNR \(\lesssim\) 1.5.

Fig. 10 Behavior of some DL models and the best performing conventional method under artificially distorted noise conditions. Smoothing splines were fitted. The regime-specific evaluations of the undistorted data points are depicted with a slightly bigger and non-transparent marker. These IoU-scores for the Otsu-Aided Gradient Method and the U-Net VGG16 correspond to the data in Table 7

Fig. 11 Some examples of artificially applied noise to an image of regime #4. The ground truth and predictions of the U-Net VGG16, U-Net EfficientNet-B5 and Attention U-Net VGG16 are depicted in the columns from left to right. The row-wise conditions are as follows: (1) SNR = 1.854, undistorted; (2) SNR = 1.673, Gaussian noise \(\sigma\) = 3.72%, pepper noise = 2.37%; (3) SNR = 1.357, Gaussian noise \(\sigma\) = 8.67%, pepper noise = 5.53%; (4) SNR = 1.204, Gaussian noise \(\sigma\) = 11.2%, pepper noise = 7.11%; (5) SNR = 0.969, Gaussian noise \(\sigma\) = 16.1%, pepper noise = 10.3%

The above discussion is confirmed by the visual examples depicted in Fig. 11. It is apparent that the U-Net EfficientNet-B5 started producing many erroneous, seemingly random predictions far away from the ground truth much earlier than the other models. Hence, it can be concluded that a model with such a complex backbone might work very well in a high to moderate SNR scenario; however, it is too overconfident for very low SNR conditions \(\lesssim\) 1.35. The Attention U-Net VGG16 continued to show the behavior of high-frequency erroneous predictions close to but not within the ground truth for moderate to low SNR, as discussed in the context of Fig. 9. This type of prediction seems to be the outcome of trading Precision for better Recall, as quantitatively expected from the discussion of Fig. 10. Compared to the U-Net EfficientNet-B5, however, it did not show signs of overconfidence in the form of seemingly random predictions. Therefore, a relation between the attention gates’ mechanisms to cope with noise and the characteristic high-frequency predictions can be concluded. All in all, the U-Net VGG16 produced the most convincing results under artificial noise distortions. Keeping the FPR low and predicting conservatively pays off: although the IoU-scores were not as high as for the Attention U-Net VGG16 and fewer total flame front predictions were made, these fewer predictions are considerably more reasonable from a human perspective, and such a model therefore provides the most added value for use under low SNR conditions. Reasonable visual performance limits were encountered around SNR \(\approx\) 1.1 (cf. last two rows of Fig. 11).

5.3 Computational effort for inference

For model inference, a prediction with the same settings and without the need for parameter tuning can be achieved by applying, e.g., the maximum likelihood as threshold \(\tau _{\text {Hard}}\) to the output of the CNN, as mentioned in Sect. 4.3. The computational effort during inference was similar for the U-Net VGG16, with 0.685 s/image, and the Otsu-Aided Gradient Method, with 0.615 s/image (regime-specific parameter tuning excluded), on a T4 GPU. Here, the DL model has the advantage that images can be fed through in different batch sizes; peak per-image times for the Attention U-Net VGG16 were achieved with a batch size of 58 images, ensuring optimal load despite the huge number of parameters of the CNN. Backbone complexity increases inference time, which should be taken into account depending on the usage goal. A minimal inference sketch is given below.
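Such parameter-free inference reduces to a single call, assuming the `model` and preprocessed `images` from the sketches in Sect. 4.

```python
# Parameter-free inference: soft predictions in batches, then binarization
# with the maximum-likelihood threshold tau_Hard = 0.5
soft_predictions = model.predict(images, batch_size=58)  # probabilities
flame_front_masks = soft_predictions > 0.5               # binary masks
```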

5.4 Some considerations concerning the training set

Note that the applicability range of a CNN is determined by the image samples it was trained on. This has some implications. An extension of the applicability limits of the DL models w.r.t. noise, as discussed in Sect. 5.2, can be achieved, e.g., by adding random artificial noise distortions during data augmentation in the training process. This significantly alters the statistical distribution of the data input given to the model. Hence, all values mentioned in Sect. 5.2 are expected to indicate upper bounds in case this tactic is applied. Moreover, the limits can even be extended to regimes where it is practically impossible to produce labels that still legitimately approximate the ground truth (due to too challenging image conditions), by augmenting training samples from regimes for which annotating data is feasible.

This principle can in general be applied for better generalization of the model. For example, to enhance the transferability of the performance to other test rigs and flame configurations, different flame types, such as swirl flames, can be included in training and validation. Additionally, the datasets can be extended with synthetic data by means of generative models, such as diffusion models or Generative Adversarial Networks (GAN), which aim to learn and reproduce the probability distribution of a given dataset, or by domain randomization, as applied, e.g., in Jose and Hampp (2024). The transferability of the evaluated models to other test rigs and flame configurations was not assessed as part of this work. Rather, we want to highlight the high degree of accuracy that can be achieved with surprisingly little data, which allows for transferability, not only to other flame types, but also to other segmentation tasks in experimental fluid dynamics and combustion, by fine-tuning these models (based on the supplementary material provided) with new domain-specific annotated data.

Another consideration is that conventional gradient-based techniques typically require the application of an edge-preserving filtering scheme to work properly, whereas DL models could additionally eliminate the need for operations such as background subtraction and laser sheet inhomogeneity correction if trained on raw images. Since CNNs are theoretically able to learn these operations, a reduction of the overall postprocessing effort could be another potential benefit.

Finally, another advantage of the DL approach compared to traditional flame front detection is its extendability to instance segmentation and multi-class pixel classification, offering, for example, the possibility to separately detect and distinguish between the main flame front, unburnt pockets, and burnt pockets.

6 Summary and conclusions

In this work, the potential of ML-based methods for the segmentation of turbulent premixed flame fronts in OH-PLIF images at elevated pressure conditions was explored. Simple DL models that can be used for this task, as well as the most important underlying principles, were introduced. Basic implementations of the highlighted supervised CNN architectures were trained on a dataset of 60 annotated images stemming from 6 different combustor operating conditions. Transfer learning and data augmentation were highlighted as efficient mitigation strategies against the disadvantages associated with sparse training data. The DL models were evaluated with benchmark segmentation metrics (IoU-score, F1-score, Precision, Recall, FPR), the Receiver Operating Characteristic and the Precision-Recall Curve. Three conventional methods based on the concept of the steepest gradient of the recorded OH-LIF intensity served as baseline implementations. Further, the influences of backbone complexity and attention gates were tested, and the performance limits for very low SNR were investigated by distorting images with artificial noise.

All supervised DL models significantly outperformed the conventional implementations. Comparing model states with the best possible IoU-scores, the IoU-scores and Recall (=TPR) values were found to be up to factors of \(\approx\) 1.2 and \(\approx\) 2.5 higher, respectively, at similar FPR and \(\approx\) 1.15 times better Precision. The results were on a performance level similar to segmentation by human eye and hand for high to moderate SNR. The DL models captured small-scale details much better with fewer erroneous positive predictions, whereas an increase in Precision for the conventional methods is associated with more low-pass filtering (blurring) during preprocessing and a loss of information in small-scale structures. This effect was more pronounced for noisy images. The DL models extended the range of image conditions w.r.t. SNR that can be reliably processed with reasonable detection performance, compared to previously existing conventional methods.

Concerning the comparison between the evaluated DL models, it was shown that multiscale fusion models such as the Holistically-Nested Edge Detector yielded flame fronts of uneven thickness and are therefore not suitable for this task. For high to moderate SNR, a more complex backbone such as an EfficientNet, as well as features like attention gates, proved to be powerful in the U-Net. For moderate to very low SNR, a simple U-Net, combined with artificial noise distortions during data augmentation in training, is the better choice, since it provided more conservative predictions by keeping the FPR low.

The presented DL-based flame front detection approach requires computational effort for inference comparable to conventional methods while eliminating the need for user-based parameter tuning. It provides significantly increased performance for single-shot flame front detection in large image datasets where individual image processing is infeasible. The results demonstrate that the use of ML can bring great benefits to the postprocessing of experimentally obtained data, not only limited to the presented case. Future work will focus on transferability, the implementation of customized loss functions, the extension to multi-class classification, as well as the application to extracting local physical quantities from single-shot images over a wide range of pressures for combustion research.