The Influence of Flow on Cycle-to-Cycle Variations in a Spark-Ignition Engine: A Parametric Investigation of Increasing Exhaust Gas Recirculation Levels

Cyclic variability is investigated in an optically accessible single-cylinder spark-ignition research engine by introducing artificial exhaust gas in controlled amounts to the homogenous air–fuel mixture before ignition. A skip-fire scheme ensures the absence of internal exhaust gas recirculation (EGR) and allows the engine to be fired continuously for acquisition of large statistics. Four operating conditions ranging from a stable 0% EGR case up to a highly unstable extreme EGR case are analyzed to examine the increasing effects of homogeneous EGR on the cycle performance. To that end, high-speed measurements of the velocity field via particle image velocimetry and flame imaging in the tumble plane allow the determination of phenomena leading to various flame positions and sizes as well as faster and slower combustion cycles. Through extensive conditional statistical and multivariate correlation techniques, flames are found to be heavily influenced by large-scale velocity motion, especially with the presence of greater EGR which leads to lower flame speeds. The greater sensitivity of slower flames to variations in the velocity field manifests itself in an exponential increase in cyclic variability of the maximum in-cylinder pressure and causes misfire cycles where the flame is blown off or quenched at the cylinder roof. In the most extreme cycles at the highest EGR level, the state of the large-scale velocity structures at the time of ignition determines whether the flame propagates towards the center of the cylinder (and is blown off or quenched) or if the flame sustains growth by propagating within the lingering tumble vortex.


Introduction
The need for a transition to cleaner energy systems has never been clearer as average global temperatures have steadily increased over the last 4 decades leading to dangerous changes in the earth's climate (NOAA 2021). In 2020, the transport sector alone accounted for 20% of greenhouse gas emissions as CO 2 (Crippa et al. 2021), with nearly half of that attributed to light duty vehicles (IEA: Tracking Transport 2021). Although the market for electric vehicles is growing rapidly, the industry and global infrastructure to support the industry is still in its infancy. Therefore, highly efficient internal combustion (IC) engines cannot be completely ruled out for passenger vehicles in the near future.
Researchers focusing on cleaner technologies, such as the retrofitting of current engines with viable e-fuels or the development of engines using hydrogen fuel, are faced with significant challenges related to reliability, safety, availability of fuels, and more importantly for the environment, toxic gas emissions. Exhaust gas recirculation (EGR) in spark-ignition (SI) IC engines offers the advantage of reducing fuel consumption via the reduction of throttling loss at part-load conditions as well as reduced flame temperatures leading to a reduction of thermal NO x (Fontana and Galloni 2010). However, the benefits of decreasing fuel consumption and NO x emissions are counterbalanced by an increase in cycle-to-cycle variations (CCV) eventually leading to an exponential growth in cyclic instability.
Understanding the causes of CCV in SI engines has long been the focus of researchers in the pursuit of further optimizing modern engines (Ozdor et al. 1994). With the dawn of significant improvements in advanced imaging, laser, and computing technologies over the last decade, engine CCV has been studied with detailed spatial and temporal scales. Recent experimental studies have related the in-cylinder velocity to fired engine CCV (Buschbeck et al. 2012;Zeng et al. 2015Zeng et al. , 2019Bode et al. 2017Bode et al. , 2019Schiffmann et al. 2017;Fach et al. 2022;Dreher et al. 2021;Hanuschkin et al. 2020) using high-speed particle image velocimetry (PIV) combined with other techniques such as in-cylinder pressure measurements as well as flame and spark imaging. In particular, Buschbeck et al. and Zeng et al. studied homogenous mixtures with varying equivalence ratios to examine the influence of the flame speed on cyclic performance; it was shown that large-scale flow structures can have a significant effect on the flame development and subsequent combustion speed, a finding which is amplified when the flame speed is slower (Buschbeck et al. 2012;Zeng et al. 2019). The availability of larger sets of highly resolved engine data has also allowed for the use of advanced statistical techniques to study CCV. The investigation of conditional statistics Stiehl et al. 2016), multivariate statistics (Schiffmann et al. 2017;Truffin et al. 2015), flow-feature correlations (Bode et al. , 2019Fach et al. 2022), and machine learning techniques (Dreher et al. 2021;Hanuschkin et al. 2020;Kodavasal et al. 2018;Mauro et al. 2019) have shed new light on the causal chain of engine CCV.
This paper aims to combine the strategies of conditional statistics with multivariate analysis using a large data set obtained from parametric high-speed PIV and flame imaging experiments with increasing levels of homogeneously-mixed external EGR to further delineate the causal chain for fired CCV in SI engines. First, the experimental setup and boundary conditions are described. Next, the effects of EGR on the engine performance, average flow field, and early flame kernel development are examined to establish a characterization of the parametric operating points. Then, several flame features are defined and evaluated using a multivariate correlation approach and a relative feature importance analysis. With the highest correlated flame feature identified, the investigation is narrowed down to a correlation field and conditional statistical analysis to identify the cause for such correlations between the flame and large-scale flows. Finally, a similar correlation field and conditional statistical analysis technique is used to examine the causal chain for misfires and cycles of very fast combustion in the highest EGR case.

Engine Test Bench and Boundary Conditions
The single-cylinder optically accessible research engine test stand at the Technical University of Darmstadt (Technische Universität Darmstadt) was employed in this investigation. The engine's optical access is granted through a quartz glass flat piston (with a Bowditch piston extension) and quartz glass cylinder liner with a pent roof. In this work, a well-documented spray-guided cylinder head with a compression ratio of 8.7:1 was used (Freudenhammer et al. 2015;Geschwindner et al. 2020;Welch et al. 2020). The engine test stand has been designed to consistently operate with precisely controlled conditions to investigate phenomena in SI IC engines (Baum et al. 2014), namely in this study, the effect of external homogeneous EGR on combustion and CCV. To that end, a gas mixture system (GMS) was used to mix dry air (1.8% relative humidity) with the desired amount of pure nitrogen and carbon-dioxide gas upstream of a storage plenum before the gas mixture entered a noise reduction plenum and finally the intake manifold. Figure 1a shows a simplified schematic of the GMS, the noise reduction plenum, the intake manifold, and the engine cylinder. Temperature and pressure probes are located throughout the test stand and are displayed in relevant locations in Fig. 1a. Air seeded with silicone oil droplets (see Sect. 2.2Optical Diagnostics) was introduced just after the noise reduction plenum to be used for the PIV and flame imaging. In addition, isooctane fuel was injected far upstream of the cylinder to allow sufficient evaporation and homogeneous mixing with the gas mixture.
Four operating conditions (OCs) of increasing EGR were employed to compare the effects of EGR in a completely homogeneous environment during engine operation. The four OCs are based on the standard D-fired OC of the Darmstadt engine , which has a part-load intake pressure of 0.4 bar and a speed of 1500 rpm. The conditions in this work consist of a highly stable OC of 0% EGR, a stable OC of 10.5% EGR, an unstable OC of 12.9% EGR, and a highly unstable OC of 14.3% EGR. Here, stability is defined by the coefficient of variance (COV) of the maximum in-cylinder pressure P cyl,max . The latter 2 OCs have higher COVs which also includes the misfire cycles, here defined as cycles whose gross indicated mean effective pressure (IMEP g ) were negative. Table 1 summarizes the differences in the boundary conditions for each OC. A skip-fire scheme was used in which for every combustion cycle the 6 following cycles were not ignited to allow residual exhaust gases to fully leave the system before the next ignition. Since this skip-firing scheme eliminated internal EGR due to valve overlap and other possible causes, the EGR OCs' boundary conditions significantly changed from the standard D-fired OC. To achieve stable combustion with increasing EGR levels, less fuel had to be injected to achieve an equivalence ratio of 1 and the ignition timing had to be moved earlier, whereby the ignition timing t spark shown in Table 1 was optimized such that the crank angle at 50% mass fraction burned was 8°CA (degrees crank angle relative to compression TDC), further reducing influencing factors for CCV.

Optical Diagnostics
Both optical techniques in this study involve the planar illumination of seeded oil droplets. Particle image velocimetry in the symmetry (tumble) plane was used to measure the in-cylinder velocity from − 180°CA to − 4°CA (minus signifies before compression TDC) in increments of 2°CA. Figure 1b shows a schematic of the optical setup combined with the cross-sectional view of the engine in the symmetry plane as well as a top view of the imaging system. To acquire high-speed PIV in the engine, silicone oil droplets (≈ 0.5 µm, DOWSIL 510, Dow Corning) were nebulized in air by a Palas AGF 10.1 seeder and introduced into the intake pipe immediately after the noise reduction plenum (see Fig. 1). The oil particles were then illuminated by two Nd:YVO 4 laser sheets (FWHM ≈ 800 µm, 200 µm at − 90°CA and − 30°CA, respectively) with optimized time separation dt between laser pulses (dt ranged from 18 to 32 µs). The planar Mie scattering of droplets was captured using a Phantom v2640 high-speed CMOS camera equipped with a 180 mm Sigma 1 3 lens (f/5.6), a 532 nm bandpass filter, and a correction lens (f = + 2000, to counteract astigmatism due to the curvature of the cylinder glass), as displayed in Fig. 1b. Finally, for each measurement day, a series of target images was captured, whereby a 058-5 dual-plane calibration target (LaVision) was placed inside the engine and imaged to provide scaling and dewarping to the raw images in processing.
As previously mentioned, image pairs were captured from − 180°CA to − 4°CA at every second CAD, resulting in a measurement speed of 4.5 kHz. For the flame imaging, the second frame of each image pair was used to extract the burnt gas areas where the oil droplets were evaporated by the flame and the Mie scattering signal is absent. Although this is a measurement of the burnt gas region and not a direct measurement of the flame, it serves as an indicator of the flame location. Therefore, extracted burnt gas regions will be referred to as the flame for the remainder of this work.

Data Processing
Particle image velocimetry vectors were calculated using the commercial software DaVis 10.1.2 (LaVison). First, images were masked to remove reflective objects, such as the spark plug and cylinder roof. The image pairs were run through a multi-pass cross-correlation vector calculation program consisting of perspective correction (using a 3 rd order polynomial based on calibration target images), 2 initial passes of interrogation window size 64 × 64 with 50% overlap, then 2 final passes of size 32 × 32 with 75% overlap. After each pass, vector post-processing, namely a peak ratio criterion of 1.3 was used to remove vectors whose first and second correlation peaks were below 1.3, a universal outlier detection consisting of a sliding median filter of size 7 vector × 7 vector was used, and finally a removal criterion of size 5 was used, meaning if less than 5 vectors are available out of a window overlap group, the entire group is removed.
Flame images were binarized using an in-house code developed in MATLAB (version R2021b). To achieve flame binarization from PIV images, the second image of each pair beginning from the start of ignition until − 4°CA was selected. Then a mask was used to remove reflective surfaces as with PIV processing. The mask for the flame imaging was more conservative than with PIV because near the cylinder walls reflections decrease the signal-to-noise ratio, making it more difficult to discriminate between flame and particle signals. Next, flame images were separated by cycle and the binarization algorithm was used on each cycle. The binarization algorithm first normalizes all images in a cycle by dividing each pixel's intensity by the maximum cyclic intensity. Then a sliding entropy filter of size 5 px × 5 px is applied, which outputs the entropy value of the surrounding pixel's neighborhood (The MathWorks, Inc, 2021a). The next step divides the filtered images of a cycle by the first filtered image without a flame. Since flame images are divided by images without flames, pixels below a threshold criterion of < 1 signify burnt gas or a flame. Then remaining pixels are set to a value of 1, before undergoing an erosion, which removes pixels along the edges of remaining structures using a disk shape with a radius of 4 pixels.
Then an area criterion of 200 pixels is applied, which removes flames smaller than 200 pixels in size. Finally, dilation of the same size as the erosion is applied to restore the flames to their original sizes and all holes within a flame are filled. Although the flame front and detailed wrinkling cannot be accurately captured with such an evaporation method (also due to the erosion and dilation processing steps), the bulk of the burnt gas region and flame curvature is well-captured and will be assumed to represent the cross-sectional flame for this work. Further processing and generation of plots and figures was conducted using MATLAB (version R2021b) and some post-processing, for example, combining plots, was done using Inkscape (version 1.1.2).

In-cylinder Pressure and Temperature
A characterization of the four OCs with varying levels of external EGR is required before further analysis of the flame and flow can be done. Figure 2a displays the mean and standard deviation (std) of the measured in-cylinder pressure P cyl vs. crank-angle-degree (CAD). The mean pressure is represented by the solid lines and the std by the shaded areas. The pressure traces reveal that the maximum pressure P cyl,max for each condition is relatively well phase-locked. This is due to the spark timing optimization, which was conducted to ensure that 50% mass fraction burned occurred at 8°CA for each OC as the combustion slows down significantly with the introduction of EGR. While this leads to varying thermodynamic and flow conditions near the spark plug at the time of ignition for each OC, a basis for comparison must be established when defining OCs. One effect of employing different spark timing can be observed by comparing the exhaust temperature T ex,2 (Table 1), which shows that earlier spark timing and increasing EGR leads to lower exhaust temperatures; this is a decrease of 6% from no EGR to 10.5% EGR. If the spark timing were left constant, then the higher EGR cases would not properly burn and it would be impossible to analyze the relationships between the early flame and the flow field. Finally, it should be noted that for the 14.3% EGR OC, it was impossible to fully optimize the spark timing because the number of misfires increased exponentially when the spark timing was moved any earlier. A further observation that can be made from Fig. 2a is a trend of decreasing maximum pressure P cyl,max and an increase in variation as EGR increases. This is in line with the decreasing T ex,2 with increasing EGR.
To further compare the OCs to one another, especially in relation to their variability, a so-called violin plot of the P cyl,max is displayed in Fig. 2b and was rendered using code from Bechtold (2016). The violin plots shown combine conventional box plots (white circle: median, black rectangle: inter-quartile range, and black whiskers: maximum and minimum values), with the kernel density estimation (KDE) of the distribution of the P cyl,max , represented by the shape and outline of the colored violins (scaled to the same maximum width and mirrored along the center line, respectively), and the scattered data (colorshaded circles), which represent the corresponding P cyl,max values where they occur along the y-axis, randomly distributed along the x-axis within the bounds of the violins. In addition, the mean P cyl,max is represented by the white lines, and coincides very well with the median. Such a plot shows a more complete representation of the in-cylinder pressure variation than the pressure trace because it provides context for the high stds shown in Fig. 2a. All OCs exhibit mono-modal, almost normally distributed KDEs. The inter-quartile range and spread of the KDE increases steadily from the no-EGR to the 14.3% OC, while the median and mean decreases. The distributions of maximum in-cylinder pressures are negatively skewed, which is evident in the longer tail of low pressures, even though misfire cycles are excluded. Notably, the upper bound of pressures is higher for the 12.9% and 14.3% than for the 10.5% OC. This is likely an effect of the earlier ignition timing, which allows cycles with optimal conditions for flame growth, such as ideal flow structures and early flame positions, to reach higher cylinder pressures as the volume decreases during late compression. This scenario occurred less frequently in the cases of low CCV because the optimized ignition timing (using the mean of hundreds of cycles) was more representative of all the cycles.

Mean Velocity
The prominence of extreme cycles for the higher EGR OCs and the large quantity of cycles acquired allows for conditional statistical analysis to be conducted to aid in the Fig. 2 a Mean and standard deviation (shaded area) of P cyl for each EGR rate (misfires incluced). b Violin plot of P cyl,max for each EGR rate understanding of cyclic variations within each OC. However, first the velocity field before ignition should be characterized to discern the effects of EGR on the flow field. Since solely the intake gas composition is changed between different OCs and the engine is operated in a skip-fire mode with 6 motored cycles prior to each analyzed fired cycle, no significant effect of the gas composition on the average flow field is expected. Figure 3 shows the mean velocity field for each OC during the compression stroke at − 90°CA and at their respective ignition timing. The multi-colored colormap is chosen to help convey the lack of contrast between velocity magnitudes of each case at − 90°CA. In addition, the streamlines represent the flow direction and show that the velocities remain the same. These flow fields show that on average, the amount of EGR and the resulting slightly different engine temperatures do not have a significant effect on the in-cylinder velocity. Although the flow field during compression is unaffected by the gas composition, the mean velocity at the time of ignition does vary as the piston position and speed is different at the distinct ignition timings of each OC (lower panel of Fig. 3).

Effect of EGR on the Flame
Although the different EGR levels have no significant effect on the mean in-cylinder velocity, the combustion is greatly affected by the in-cylinder conditions at ignition, as evidenced by the decreasing flame speed, in-cylinder pressure, and temperature and by the increasing variance of the pressure with increasing EGR levels. To examine the differences in combustion, the first approach is to investigate the early flame at each EGR level. Figure 4 shows the spatial probability, or probability density function (PDF), of flames at early combustion for each OC. The PDF is calculated using the entire sample of each case; therefore, a value of 1 means all cycles contain a flame at that pixel position. Although the different ignition times cause the flames to be in different stages of combustion at the same CAD for different OCs, towards the end of compression the kernel sizes become more similar. With the increase in CCV and misfire at higher EGR rates comes a decrease in the flame PDF. In addition, the presence of EGR increases the likelihood of a flame to propagate to the right of and below the spark plug; whereas for the 0% EGR case, most of the flames propagate towards the center of the cylinder until very late in the compression when the flames begin to grow around the spark plug. This exemplifies two phenomena: first, the direction of the flame propagation coincides with the mean flow direction at the time of ignition shown in the lower panels of Fig. 3. The preferential, often unidirectional propagation away from the spark plug may also be augmented by heat losses at the spark plug. Second, the faster flame speed of the no EGR case leads to a smaller influence of flow variations near the spark plug at the time of ignition and early flame growth, resulting in fewer variations of the flame position. On the contrary, with increasing EGR and decreasing flame speed, the flame's position is more heavily affected by convection as the flame is influenced by the clockwise tumble motion displayed in Fig. 3. Furthermore, since there are more flames located below the spark plug with increasing EGR, there is likely a link to the extreme cycles associated with the more CCV-driven higher EGR cases which have some cycles with much faster combustion, as indicated by the violin plot in Fig. 2b. Finally, the 14.3% EGR case at − 10°CA highlights this case's instability since the flames are relatively evenly spread out, but not above 50% probability.
To distinguish the flames between fast and slow combustion cycles, Fig. 5 displays the PDF of low-speed cycles (LC), high-speed cycles (HC), and all cycles at − 18°CA. In this case, LC and HC are defined as the 10% of cycles with the lowest and highest P cyl,max , respectively. − 18°CA was selected to allow comparison of the early flames of each OC as they appear to have reached relatively similar development at this time. It is unsurprising that for 0% EGR both LC and HC have similar flame PDFs at − 18°CA since this is a very stable OC. This remains the case for 10.5% EGR with some HC cycles exhibiting flames surrounding the spark plug, a configuration which allows flame growth in all directions. For the 12.9% case, there are 10 cycles with misfire out of the sample of 1288 cycles, so the low flame density in the LC sample reveals very slow combustion for the remaining 119 cycles and a tendency for the flame to propagate towards the center of the cylinder. On the HC side, 30% of the flames nearly cover the entire field of view (FOV) shown at − 18°CA and over 50% appear to the immediate left and right of the spark plug, showing the stark contrast between LC and HC flames. Furthermore, more flames appear to the right of the spark plug than to the left, which is linked to the greater influence of the flow on the flame propagation and a higher tendency of the flow near the spark plug to move Finally, for the 14.3% EGR case, there are 90 misfire cycles which explains the absence of flames on the LC side, while for HC, there is an even distribution of flames all around the spark plug, with a few more appearing to the right. Figure 6 displays the mean flame centroid position for HC and LC cycles over all crank angles (indicated by the color bars) from ignition until − 4°CA. The left column represents LC and HC selected using the median P cyl,max as the criterion, meaning half of the cycles are LC and the other half are HC, while the right column uses the 10% lowest and highest P cyl,max cycles as its criterion for separation. The LC centroid trajectories are represented by the cool colormap, while the HC are represented by the warm colormap. In addition, blue (LC) and red (HC) vectors stemming from the first centroid at ignition represent the mean velocity in the black squares shown in Fig. 3 at the time of ignition. The arrow directions indicate the actual mean velocity directions in the square region and the arrow lengths represent the magnitude of the velocity (also displayed in the top left corner of each panel). When comparing trajectories of increasing EGR (with the exception of the 14.3% case), the difference between LC and HC increases, and this is also amplified when examining the extreme LC and HC trajectories on the right. This is unsurprising since the CCV increases greatly with more EGR. In addition, HC trajectories tend to be located further from the cylinder roof than for LC because they represent more even propagation throughout the FOV. Another important phenomenon to note is that with increasing EGR the centroids shift from beginning from the left of the spark plug to the right. While it was possible to observe a trend of some flames beginning to grow to the right of the spark plug in Figs. 4 and 5, it seemed that this higher population of flames to the right of the spark plug was simply due to more even and faster propagation of the flame. However, Fig. 6 shows that this trend is skewed towards the right of the spark plug, meaning many flames either begin to the right, or first appear to the right of the spark plug as they sporadically come in and out of the symmetry plane, then they propagate to the left of and surrounding the spark plug within this plane. This trend in the trajectory beginning correlates with a shift in the average velocity around the spark plug, as the arrows rotate clockwise with increasing EGR and as a consequence thereof, with earlier ignition timing. In addition, the LC centroid trajectories support the PDFs in Fig. 5 in that slower combustion cycle flames tend to be convected to the left and upward. The parametric increase in EGR in this study shows how convection begins to play a bigger role in the flame's development since other complex phenomena such as inhomogeneities in the mixture and gas composition are removed due to the experimental boundary conditions. Although the vectors in Fig. 6 do not appear to differ too greatly from LC to HC cycles, likely due to the large sampling region dulling local flow phenomena, some differences emerge: while a strong mean flow towards the left of the spark plug seems to aid combustion in the 0% OC, the change in general flow direction for the higher EGR OCs alternates this relation and a weaker flow, which is less likely to convect the flame towards quenching-relevant geometry, like the spark plug stem and cylinder head, and more likely to convect the flame with the tumble, is beneficial for faster combustion. In the highest EGR case, which has the greatest amount of instability, the velocity magnitude near the spark plug for LC is 37% (65% for the extreme cycles on the right) greater than for HC and is directed more toward the left. Interestingly, the general tumble trajectory does not change much for the median LC and HC cases despite these Fig. 6 Calculated mean centroid of the flame over time for each EGR rate using the median (left) and 10% extreme cycles (right) as the HC/LC criterion. LC and HC cycles are represented by the blue-pink (cool) and black-red-white (warm) color bars, respectively velocity differences, while the extreme case on the right leads to misfire in all LC cycles. Since there appears to be a correlation between the flame, the flow, and the in-cylinder pressure, these aspects are further explored in the next section to elucidate the mechanisms leading to IC engine CCV.

Definition of Flame Features
Rather than employing machine learning (ML) methods requiring hundreds (Hanuschkin et al. 2020) or several thousand (Dreher et al. 2021) cycles to predict CCV, this work proposes the use of the correlation between several flame features and the flow field to select specific OCs and features to closer analyze the relationships between the flame, flow, and CCV. Building from the feature importance analysis allowed by the decision tree-based ML models used in the same engine for flames of a full-load OC (Hanuschkin et al. 2020), this work uses the 10 features with the highest calculated importance to examine their relevance in the homogenous EGR OCs. Table 2 lists the flame features used in this work with the previously used names given by Hanuschkin et al. (2020) and new descriptive names. As in the previous work using only flame images for analysis (Hanuschkin et al. 2020), each feature represents one calculated quantity for each flame image, for example, centroidX is the x-coordinate of the centroid for the flame of one cycle at one CAD and can be correlated to other quantities of the same cycle such as P cyl,max . Position-related features are denoted with the gray shaded background and size-related features have a white background.

Correlation Between Flame and Combustion Speed
The correlations between the flame features and P cyl,max over CAD (for each flame image) is the first set of quantities to examine in the delineation of the causal chain of the CCV caused by EGR. The Pearson correlation coefficient R is calculated using the The gray shaded section denotes position-related features and the white zone denotes size-related features linear regression of the scattered data at each CAD, that is, one flame feature value and one P cyl,max for every cycle at each CAD. The correlation of all 10 features to P cyl,max is displayed in Fig. 7. The position-related (gray zone) and size-related (white zone) features each have self-similar trends to one another. The x-position-related features show increasing correlation of up to 0.6 shortly after ignition as the flame first appears in the FOV and begins to grow near the spark plug. This is due to the either left or right early flame development, which turns out to be a decisive factor in higher EGR cases. Then correlation values decrease, as the initial flame kernel expands. In the case of leftmostX the correlation switches sign and increases again. Towards TDC, cycles that exhibit a delayed flame development into the left half of the cylinder result in low P cyl,max . On the other hand, the y-position-related features have a relatively steady or an increase in correlation to P cyl,max . For the higher EGR-cases these features tend to correlate with the cross-sectional area and are simultaneously bound by the upward moving piston. As EGR levels are increased, correlations generally increase which indicates a link between the pressure COV and the flame development. Initially, x-position features have positive correlation which means that a flame whose center position is further to the right is likely to have faster combustion. Maximum correlation values of up to 0.6 are observed already at − 30°CA, roughly 40-50 CADs before maximum cylinder pressures are reached, which emphasizes the importance of early combustion on the subsequent flame growth. Conversely, y-position features generally have negative correlations to the pressure, indicating an inverse relationship, that is, a flame whose y-position is located closer to the piston (more negative y-position) is likely to exhibit faster combustion. The size-related features, especially the cross-sectional flame area, exhibit constant increasing correlations, which shows that the cross-sectional flame area represents the flame volume well.

Flame-Flow Correlations
To correlate flame features with the flow, each PIV vector at a certain position and CAD can be correlated with a flame feature over all cycles. The resulting correlations can then be displayed in the position of the corresponding vector on a 2D plot with a colormap representing the R values to allow a spatial analysis of the feature and flow field's correlation. The method of spatial correlations with flow fields has already been utilized in optical research engines to examine the relationship between cyclic features such as the IMEP , the spark position (Bode et al. 2019), and the flame position (Fach et al. 2022) with the velocity.
In this work, the spatial flame-flow correlations are first compared with one another and with the features from the different EGR levels on one coherent chart before specific OCs and features are selected for closer examination. For the flame features, one CAD, namely − 30°CA was selected based on the high correlations between the x-position features and the maximum in-cylinder pressure displayed in Fig. 7. At − 30°CA the 0% EGR OC is not included since the ignition occurs later; however, the correlations of the no EGR OC at later CADs are lower than those with higher EGR since the flow has less of an effect on the flame development in this case, and are thus less relevant. A reduction of complexity must first be done because the resulting number of correlation fields is 2670 (3 OCs × 10 features × 89 PIV CADs). Figure 8 displays the comparison of spatio-temporal average correlations between the flame feature and the x-and y-component of velocity by taking the mean absolute value of R over all flow fields and CADs. To ensure a high probability that the correlation R is significant (and a low probability of observing the null hypothesis), each correlation undergoes a significance filter in which the p-values must be less than 0.05, where low p-values mean a more significant correlation in R (The MathWorks, Inc, 2021b). The resulting quantity is the correlation coefficient normalized to all possible vector positions over each CAD in time. As the quantities are normalized, they offer a direct comparison of the correlation between the velocity and flame features at each EGR level.  Fig. 8 presents a non-ML-based feature importance in terms of their relationship to the flow. The correlations agree generally well with Fig. 7 in that increasing EGR and CCV leads to increasing correlation. However, at the highest EGR level, the x-position features of the flame correlate less with the flow than at 12.9% EGR which is probably due to the significant amount of out of FOV and plane motion of the flames present in the 14.3% EGR OC. While the correlations in Fig. 7 relate the flame to the in-cylinder pressure, and thereby CCV, Fig. 8 relates the flame features at − 30°CA to the flow, at which point the flame may be underdeveloped (due to the many partial misfires), out of the plane (due to the greater influence of convection on the flame), or highly developed and out of the FOV, washing out the correlations. The correlations also agree well with Fig. 7 in that features related to the position of the flame have higher correlations than those related to the size, which is because most of the flames at − 30°CA are very small in the FOV. Therefore, the direction of the initial flame propagation plays a large role in the determination of the cyclic performance. In addition to higher EGR having higher correlations, V y component correlations are also generally greater than V x correlations, indicating that upward motion induced by the compression has a more significant role in the placement and shape of the flame or it is more descriptive of the final tumble position. It is especially interesting to consider the flame feature with the greatest correlations: centroidX. The y-component of the velocity correlates higher than the x-component even for the x-coordinate of the flame's centroid, meaning that the centroid's horizontal position is more sensitive to variations in the upward flow.
Since the highest correlations between the flame (at − 30°CA) and the flow occur for the 12.9% EGR OC, further analysis of this EGR condition can be done to better understand the link between the velocity field and the flame development. Figure 9 displays the average spatial correlation between flow fields from the late intake phase until late compression and flame features at − 30°CA. It is not only remarkable that significant correlations exist between the flame and the velocity fields as early as the intake phase, but also that peaks in average correlation correspond to physical events such as the closing of the intake valves, the appearance of the tumble center within the given FOV, and the piston deceleration. Furthermore, such a visualization provides context for quantities in Fig. 8 that are averaged over all CADs. Each flame feature has a phase-locked correlation to the physical events Fig. 9 Sum of correlation over the field at each CAD normalized to the total sum at all available vector positions. Flame features are calculated at − 30°CA dictating the in-cylinder velocity. However, while Figs. 8 and 9 provide a broad generalized overview of the relationships between flame features and velocity fields, they cannot provide spatial information about these relationships; yet they offer a hint to which specific correlation fields may be of note.
Building from the analysis of Figs. 8 and 9, the flame feature with the greatest correlations to the flow fields, centroidX, can be further examined at the CADs with physical events highlighted by the vertical dot-dash lines in Fig. 9. Figure 10 shows the correlation fields of the horizontal and vertical velocity components with centroidX at the highlighted relevant CADs. The blue-white-red colormap represents the quantity of the correlations with dark blue and red corresponding to a high negative and positive correlation, respectively. The vector field displayed represents the mean velocity magnitude and direction over all cycles at each interrogation window (showing every 6 th vector). As the Fig. 10 Correlation field for V x (left) and V y (right) and centroidX (calculated at − 30°CA) at selected CADs correlations represent linear regression relationships between scattered data, a schematic of arrow directions is displayed to help explain the physical meaning of a positive correlation in terms of the velocity component and the horizontal location of the centroid. Moreover, two point locations labeled Point 1 and 2 are indicated by a blue and red pixel and large arrows to describe the point locations for the scatter plots shown in Fig. 11. In addition to the aforementioned significance filter which removes locations with a low probability of having a significant correlation, the symmetric colormap also removes additional locations of very low correlation by the white color.
At − 132°CA, when the intake valves close, the V x side of Fig. 10 shows widespread significant negative correlations averaging R ≈ − 0.30, especially in the region to the left and below the spark plug. Since all of the vectors in this negative R region point towards the right, it indicates a tendency for positive x-direction velocities this early in the engine cycle to correlate with a more leftward flame centroid. The negative correlations for the V y field are on average greater (R ≈ − 0.35) with a larger presence on the intake side between y = − 20 and − 40 mm. Since most of these negative correlations occur where the mean velocity is pointed in the positive y-direction, this indicates that greater upward (or less downward) velocities in these regions at the time of the intake valve closing correlate to more leftward flames at − 30°CA. As the tumble center begins to come into view at − 114°CA, the V x correlations seem to not change much apart from the decrease in the area of negative R values and the emergence of very low positively correlated locations below the tumble center. On the V y side a similar trend to − 132°CA is observed, but the region of negative correlation has grown and velocities in the region tend to be more vertically oriented.
In late compression at − 60°CA, the correlations near the cylinder roof and especially near the spark plug increase and become strongly positive for the V x side. This change of signs is confirmed by negative correlations between V x at − 132°CA and − 60°CA close to the cylinder head. An analysis of conditioned flow fields (not shown) reveals that cycles featuring an initially stronger upwards directed flow of the tumble lead to lower velocities near the cylinder head and a slightly shifted tumble center during late compression. At this stage in the cycle, the flow's tumble center is still visible in the FOV as the piston compresses it towards the exhaust valves. To the left of the spark plug, positive V x -cen-troidX correlations averaging R ≈ 0.50 and a positive V x indicate that higher velocities Fig. 11 Scatter plot of selected points of the correlation field at − 46°CA in Fig. 10 into the spark plug correlate to the flame growing to the right at − 30°CA. On the V y side, the greatest correlations (R ≈ − 0.55) occur near the tumble center to the right and below the spark plug. At the locations of peak negative correlations, the velocity remains in the positive y-direction indicating that greater velocities there lead to the flame growing to the left at − 30°CA. The trends continue for both velocity components at the time of ignition (-46°CA), but the peak correlations reach up to an average of positive and negative 0.65 near the spark plug. To aid in the interpretation, Fig. 11 displays scatter plots of the V x and V y data with the centroidX at the two points indicated in Fig. 10. While the average vector arrows in Fig. 10 point in one direction, the scatter plots indicate that there is a wide range of positive and negative velocities for both components which follow a linear trend (indicated by the solid red regression line). The region around centroidX = 5 mm shows fewer data due to the spark plug geometry at this position and since the flames at − 30°CA are typically either to the left or right of the spark plug.
Despite the arrow schematic in Fig. 10 and the scatter plots of data near the spark plug in Fig. 11, it can still be quite difficult to interpret the meaning of the correlation fields at the time of ignition. Therefore, it is helpful to investigate conditional statistics using the cycles of extreme centroidX as the criterion for separation to directly examine the role of convection on the flame propagation. Figure 12a shows the mean velocity fields of the 50 lowest (left with blue box) and highest (right with red box) centroidX cycles. There is a substantial difference in average flow direction and magnitude for the two samples. In the high centroidX cases, the tumble center is shifted into the FOV next to the spark plug and a region of strong horizontal flow from the intake to the exhaust side is visible, which directly affects the spark plug gap region. Subsequently, the flames with highest centroidX are convected to the right side, in stark contrast to the lower centroidX cycles, which appear to the left. The location of the tumble center in the case of the low centroidX sample is nearly out of the FOV and there is a large vertical flow structure that reaches from the piston surface to the spark plug. This high vertical velocity region below and to the right Fig. 12 a Conditional statistics of the velocity components for 12.9% EGR at − 46°CA, b flame propagation PDF with flow model, and box plots of P cyl,max for the 50 lowest and highest centroidX cycles of the spark plug is similar in location to the high negative V y -centroidX correlation region in Fig. 10. This confirms that very high upward-directed velocities to the right of the spark plug inhibit the flame's growth in that region, leading to a preferred development and convection towards the center of the cylinder. The comparison between the velocity magnitude fields shown on the bottom of Fig. 12a further shows the importance of higher magnitude velocities reaching from the piston surface to the right of the spark plug in inhibiting the flame growth there.
In Fig. 12b, the conditional PDFs of the flame at − 30°CA and − 18°CA are shown to summarize a flame-flow model taken from the mean velocities where vectors are available at the corresponding CADs (not shown). Beginning with the lowest centroidX cycles (blue box), the large upward flow directed from the piston towards the bottom right tip and to the right of the spark plug forces the early flame to begin its propagation away from the flow structure. At − 18°CA, the flame within the symmetry plane is partially attached to the spark plug and convected towards the intake valves (much of it out of the FOV) by the remaining upward-and leftward-directed flow from the piston motion. In the case of the highest centroidX cycles, the combination of the presence of the large tumble vortex center to the right of the spark plug (with a strong leftward flow below the tumble) and the rightward-directed velocity starting at the left side of the spark plug cause the flame to propagate to the right (also out of the plane and FOV). At − 30°CA, the tumble center remains in the FOV to the right of the spark plug and the flames appear to follow the clockwise convection, moving to the left below the tumble center. At − 18°CA, the flame occupies nearly the entire region to the right of the spark plug as the leftward flow stemming from the compression of the tumble continues to convect it towards the center. While it is clear that some of the flame is cut off by the given FOV, the cross-sectional flame area from the high centroidX sample is significantly larger, which helps explain why the P cyl,max box plots on the right side of Fig. 12b are different, especially in terms of mean (oval) and median (red line) maximum in-cylinder pressure. Since centroidX was calculated at − 30°CA, a time when it correlates relatively high with P cyl,max (compare Fig. 7), the conditioned cycles of extreme left and right centroidX are not only linked to the stark differences in the flow field, but also to slower and faster combustion, as evidenced by a mean and median P cyl,max difference of over 1 bar between samples. Therefore, Fig. 12 exhibits a tendency for the early flame's direction of propagation to predict whether the flame will grow quickly or slowly and this is directly correlated with the state of the velocity at the time of ignition. If the flow field is characterized by a strong upward flow onto the right side of the spark plug, the flame is likely to first propagate to the left, where it is convected further upward towards the intake valves, causing it to lose heat at the walls. The consequence is a slower growth of the flame as it must develop against a strong flow that directs its propagation along the walls, leading to lower P cyl,max . However, when the clockwise tumble center lingers close to the spark plug, a rightward-directed flow influences the flame to propagate first to the right, where it can grow in a region of lower velocity, then it is propelled by the vortex and then free to quickly grow with a strong leftward velocity towards the rest of the unburnt cylinder gas, leading to higher P cyl,max .

Correlation of Large-scale Flow Structures to CCV
Earlier research has used planar PIV and flame imaging to show that large-scale flow structures play an important role in determining cyclic combustion speed (Buschbeck et al. 2012;Zeng et al. 2019). In the present work, a link between flame features and the combustion speed 1 3 (compare Fig. 7) and a link between the flame features and flow field has been shown. Figure 12 shows a correlation between the flow field, the flame's x-position, and the in-cylinder pressure. However, to further strengthen the contention of the present work that there is a significant correlation between the flame position, size, and large-scale velocity structures to the combustion speed, conditional analysis of the most unstable case (14.3% EGR) with 90 misfire cycles is selected due to the amplification of the effects causing CCV. Although the same trends for the correlations between flow and P cyl,max occur in the 12.9% EGR case, albeit with slightly lower correlations, the 14.3% case offers the unique opportunity to examine the differences between a large number of misfire cycles and fast combustion cycles. For the most acute comparison, conditional statistics between the 90 misfire cycles and the 90 greatest P cyl,max cycles are compared in Fig. 13. In Fig. 13a the conditional mean velocity field for the misfire (left, blue box) and the fast combustion (right, red box) cycles is shown for the intake, compression, and the time of ignition. During intake at − 180°CA and compression at − 90°CA, the differences between the velocities are subtle, with the main characteristic difference being greater velocity magnitudes and a more compact tumble center in the case of the misfire set. However, at the time of ignition, greater disparity is apparent. As was observed with cycles with low centroidX (compare Fig. 12), misfire cycles have a greater bulk velocity structure extending from the piston surface to the spark plug than their counterpart's flow field. Furthermore, the fast combustion flow fields also have a stronger horizontal flow across the spark plug from the intake to the exhaust side. However, it is interesting to note that both sets exhibit a mean tumble vortex center of similar size and position, despite the aforementioned contrasts. This indicates that the presence of the tumble center to the right of the spark plug is far from the lone influencing attribute causing faster flame growth. Rather, the flame's growth is determined by the direction and magnitude of the flow surrounding the spark plug, as discussed Fig. 13 a Conditional average velocity fields for 14.3% EGR with the 90 misfire cycles (left) and the 90 cycles with the highest P cyl,max (right). b The correlation between the horizontal (left) and vertical (right) velocity components with P cyl,max . c Propagation of the conditional flame PDF and flow model with Fig. 12. In this case, the flow around the tumble center varies strongly between the two samples leading to distinct velocity structures around the spark plug at ignition.
Since there are distinct characteristics that emerge in the contrasting conditional flow fields near the spark plug, they also appear in the correlation fields (over all cycles, not conditionally separated) displayed in Fig. 13b. Although correlations are generally lower than those appearing for centroidX at ignition (for the flame at − 30°CA for 12.9% EGR), they have the same structures. The correlation fields displayed lead to the conclusion that cycles of faster combustion correlate with stronger rightward velocities and weaker upward velocities near the spark plug.
To summarize a conceptual model for the extreme cycles, Fig. 13c displays the PDFs of the early flame as well as a flow diagram which takes into account the flow at − 50°CA and the corresponding velocity fields at the times of the flame PDFs (not shown). For the misfire cycles (left, blue box), it may be surprising to observe the presence of early flame kernels despite the negative IMEP g criterion. Nevertheless, the significant bulk flow stemming from the piston's approach to TDC forces the flame to develop towards the left, before blowing off or quenching at the cylinder roof out of the measurement plane. Conversely, the fast combustion cycles sustain flame growth towards the right side due to the horizontal flow across the spark gap and the lack of high velocities forcing the flame against the roof due to the location of the tumble center. In contrast to the flames conditioned to high centroidX shown in Fig. 12b, the fast combustion flames for 14.3% EGR exhibit less clockwise convection along the tumble vortex center because the velocities surrounding the vortex center are more moderate. However, similar to the case of centroidX for 12.9% EGR, after an initial rightward propagation, the flame grows quickly in the low velocity vortex center before being convected by the leftward flow above the piston towards the rest of the unburnt gas in the cylinder.
The analysis accompanied with Figs. 12 and 13 shows that the velocity in the vicinity of the spark plug at the time of ignition is critical in determining the cyclic combustion speed. However, when comparing the differences in velocity of HC and LC shown in Fig. 6, analyzing the velocity at the spark plug alone is not enough to predict fast and slow cycles since depending on the OC, higher velocity directed towards the intake valves leads to either slow or fast combustion. Because the flame development, and consequently the combustion speed, correlates with the velocity significantly as early as the intake phase, the entire flow history can provide context for an accurate prediction of the cyclic performance. In a previous study with the same engine under full-load conditions, ML techniques used the flow history from intake until ignition and flame features to predict cycles with high and low P cyl,max (Dreher et al. 2021). Although the engine was operated under different conditions such as the intake pressure, ignition timing, and internal EGR levels, the horizontal velocity near the spark plug was found to be important in the prediction of CCV similar to the present study. In addition, the conditional velocity fields displayed that a stronger horizontal flow across the spark plug yields higher in-cylinder pressure. The agreement of results with entirely different OCs is a promising finding that aspects of the analysis of this work can translate to different engine configurations.

Conclusions
Various levels of homogeneously mixed exhaust gas recirculation were introduced in a port fuel injection SI engine experiment operating in a skip-fire scheme to ensure proper reactant homogeneity at ignition. The incremental increase in externally-controlled 1 3 EGR levels led to an exponential increase in cyclic variance. By employing simultaneous high-speed PIV and planar Mie scattering techniques for thousands of cycles, the causes of CCV were investigated by examining the conditional statistics and multivariate correlations of the flame development and the velocity fields in the symmetry plane of the engine.
Although the mean flow field in compression was unaffected by the varying EGR levels, combustion was slowed down with increasing EGR, allowing for an increase in the flame's sensitivity to variations in the in-cylinder velocity. Since there were no differences in gas homogeneity nor the thermodynamic conditions at and after ignition for a given OC, the measured CCV was truly due to observable variations in the bulk velocity field. In the stable case of 0% EGR, flames were always directed towards the center of the cylinder due to the bulk motion induced by the piston at the time of the ignition, causing a tendency for the flame kernel to grow evenly from the center to the cylinder walls. However, to achieve combustion with increasing EGR levels, the spark timing had to be moved earlier to counteract the slowed flame speeds. Consequentially, in higher EGR cases, the flames could be convected by the presence of a lingering tumble vortex center near the spark plug, encouraging flame propagation from right to left following the clockwise vortex. Conversely, when the tumble center shifts further right or there is a stronger upward velocity from the piston surface onto the spark plug tip, the flame is directed towards the center of the cylinder from right to left. In the most sensitive condition with 14.3% EGR, significant correlations between the maximum in-cylinder pressure and the velocity components show that large-scale velocity structures play a crucial role in determining combustion speed. Similar to the conditions affecting the direction of the flame propagation in the 12.9% case, the presence of a higher magnitude upward velocity directed towards the bottom of the spark plug can cause the flame to not only propagate to the left, but also towards the intake valves where they either blow off, or quickly quench at the wall, leading to misfire. Moreover, the lack of this bulk flow directly below the spark plug and an increased velocity from the left to right across the spark plug gap encourages faster combustion.
In future work, analysis of the spark through high-speed imaging of the plasma and spark energy measurements in combination with the flame and flow analysis techniques of this work can help complete the knowledge of the causal chain of cyclic variations with increasing EGR. Furthermore, the extension of the measurement and analysis techniques employed in this work to internally-induced EGR with species concentration measurements will be invaluable in the design of near-future highly optimized IC engines.
Author Contributions All authors contributed to the study conception and design. AD and BB acquired funding. Experimental preparation and data collection were performed by CW, LI, and MS. Data analysis, generation of figures, and the preparation of the first draft of the manuscript were performed by CW. All authors commented on previous versions of the manuscript. The final discussion and analysis of results were performed by CW and MS. All authors read and approved the final manuscript.
Funding Open Access funding was enabled and organized by Projekt DEAL. Support by Deutsche Forschungsgemeinschaft through FOR 2687 "Cyclic variations in highly optimized spark-ignition engines: experiment and simulation of a multi-scale causal chain"-project number 423224402-is kindly acknowledged.

Conflict of interest
The authors have no competing interests nor conflicts of interest to declare.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.