1 Introduction

In an effort to reduce pollutant emissions and improve efficiency, current trends in internal combustion engines (ICE) include strategies such as downsizing, turbocharging and hybridization (Heywood 2019). These technologies offer advantages in terms of energy efficiency, particularly in reducing \(\hbox {CO}_2\) emissions (Szybist et al. 2021), and are well suited for use as an auxiliary or secondary powertrain in electric vehicles. However, an important consequence of downsizing is the increase in the surface-to-volume ratio of the combustion chamber, which increases the significance of near-wall processes. The heat loss to the walls has a direct impact on efficiency, mixing, pollutant formation (Borman and Nishiwaki 1987; Alkidas 1999; Dec and Hwang 2009) and thermal stresses of the components (Shalev et al. 1983; Cioată et al. 2017). The near-wall heat transfer is in turn determined by the highly turbulent bulk flow and its interaction with the mass and energy transfer processes in the boundary layers (BL) at the combustion chamber walls. Understanding the mechanisms of wall heat transfer and the evolution of thermal and velocity BL is therefore crucial for improving efficiency, reducing pollutants and unburned fuel emissions and selecting suitable materials.

The experimental investigation of the near-wall region in ICE is challenging due to the limited optical accessibility. Early studies employed Laser Doppler Velocimetry (LDV) to study the region near the cylinder wall or for a specially-shaped cylinder head (Hall and Bracco 1986; Foster and Witze 1987; Pierce et al. 1992). LDV provides valuable insights but can only measure velocities at discrete points. Particle Image Velocimetry (PIV) and Particle Tracking Velocimetry (PTV) have gradually replaced LDV, enabling the measurement of instantaneous two- or three-dimensional (2D or 3D) velocity fields close to the walls with better accuracy (Kähler et al. 2012). Alharbi and Sick (2010) reported resolutions of up to 45 μm using PTV during the compression and expansion strokes of a motored engine. They also observed the presence of sub-millimeter vortices within the BL at the center of the pent-roof shaped cylinder head. Jainski et al. (2012) studied the BL at the same location using a sum-of-correlation method with a resolution of 50 μm. They compared the viscous and logarithmic layers with the ones predicted by the widely used law of the wall. The velocities were either under- or overpredicted depending on the crank angle. At higher engine speeds, the viscous layer was too thin to be accurately characterized by the measurements. Renaud et al. (2018) further increased the wall-normal resolution up to 20 μm using a PIV+PTV algorithm and highlighted fundamental differences between the BL that occur in ICE and those that develop on a flat plate or mildly curved surfaces, such as the lack of adherence to the logarithmic law of the wall. More recently, Schmidt et al. (2023) investigated the velocity BL above the piston by means of high-speed PTV with resolutions of up to 5 μm perpendicular to the wall, allowing the viscous sublayer to be resolved for engine speeds up to 2500 rpm.

The thermal BL in ICE has been studied even less than the velocity BL due to the lack of diagnostic methods capable of accurately determining the gas temperatures near the wall (Ojo et al. 2021). Laser-induced fluorescence (LIF) has been used extensively to measure gas temperatures in ICE (Dronniou and Dec 2012; Peterson et al. 2013, 2014; Kaiser et al. 2013; Alzuabi et al. 2021). However, the coarse spatial distribution and the fact that it provides relative rather than absolute temperature distributions (Kaiser et al. 2013; Peterson et al. 2014) have limited the potential of LIF for the study of thermal BL. Schlieren photography has also been used to determine the thermal BL thickness in engines (Lyford-Pike and Heywood 1984). In contrast, Coherent Anti-Stokes Raman Scattering (CARS) and hybrid fs/ps rotational CARS (HRCARS) have emerged as the most widely used techniques to measure gas temperatures and species concentrations within the BL with high accuracy and precision (Lucht et al. 1991; Bohlin et al. 2015; Kosaka et al. 2018; Retter et al. 2018). Escofet-Martin et al. (2020) and Ojo et al. (2021) employed HRCARS to investigate the evolution of thermal BL under engine-relevant conditions. Thermocouples are commonly used to determine the instantaneous wall heat flux in ICE. The influence of measurement location, engine speed, throttle setting and ignition timing on the wall heat flux was reported by Nijeweme et al. (2001). Gingrich et al. (2014) compared the instantaneous heat transfer at the piston crown under different combustion conditions. In their study, Ma et al. (2017) carried out measurements of the instantaneous heat flux at the cylinder head. They found a peak heat flux of about 60 \(\textrm{kW m}^{-2}\) when the engine was running at a speed of 500 rpm under motored operation.

While these techniques provide valuable data, they are limited to 2D slices or small volumes. Scale-resolving simulations for the investigation of ICE flows enabled fundamental research of the in-cylinder physical processes (Haworth 1999). Baumann et al. (2013) conducted a study in which predictions from 50 large eddy simulation (LES) cycles were compared with PIV data of the TUDa engine (Baum et al. 2014) for two different grid resolutions. di Mare et al. (2014) used a similar configuration to evaluate the appropriateness of different LES quality criteria, while Nguyen et al. (2014) employed two numerical codes to investigate the effects of lattice structure and numerical schemes, obtaining similar results for both methods. Ameen et al. (2018) used a parallel LES approach to study cyclic variability in a spark ignition engine. However, despite recent advances in the development of scale-resolving methods to study engine flows, LES rely on modeling assumptions for near-wall phenomena due to resolution requirements that necessitate approximations in capturing the complex interactions at the wall boundary.

To overcome the limitations of experimental techniques and the need for wall modeling for LES, direct numerical simulations (DNS) have emerged as powerful tools for investigating the BL and wall heat fluxes. By directly solving the Navier-Stokes equations without any modeling assumptions, DNS provide complete information on the evolution of velocity, temperature and composition at the required spatial and temporal resolutions for investigating BL and have been used extensively to study canonical flow configurations such as channel, pipe, stagnation or impinging jet flows (Kline and Robinson 1990; Wallace 2012; Duponcheel and Bartosiewicz 2021). Kim et al. (1987) performed a thorough analysis of BL in an incompressible and isothermal channel flow and found good agreement with the law of the wall. The temperature and velocity BL in impinging jet configurations were investigated by Hattori and Nagano (2004), who showed that the relationship between the Nusselt number and the distance from the impact center indicates that the wall-normal velocity component has a significant influence on the local heat transfer at the wall. In addition, significant deviations from the law of the wall were found in the thermal and velocity BL, with the most pronounced differences occurring in the immediate vicinity of the impact center. In the study by Bradshaw and Huang (1995) it was shown that the law of the wall is not sufficient to accurately characterize complicated three-dimensional BL, such as those found in ICE. Time-varying temperature differences between the engine walls and the bulk gas have a strong influence on the BL in ICE, which leads to the formation of thermal BL. The pressure gradient effect on the BL structure has also been studied extensively for canonical configurations (e.g., Harun et al. 2013; Kitsios et al. 2017; Bobke et al. 2017; Rkein and Laval 2023).

Despite the significant advances in DNS for canonical flow configurations, DNS of the entire ICE cylinder remain scarce due to the high computational costs and the need for advanced tools that combine accurate and efficient spatio-temporal discretization techniques with the ability to capture the geometric complexity of ICE. DNS studies of simplified engine-like geometries under engine-relevant conditions have been performed by Schmitt et al. (Schmitt et al. 2015; Schmitt and Boulouchos 2016; Schmitt et al. 2016a, b) to investigate the structure of the near-wall velocity and temperature BL profiles as well as the influence of wall heat transfer on the evolution of temperature stratification in the bulk. Advances in scalable computational methods and tools as well as the availability of high-performance computing enabled the first DNS of a single compression-expansion cycle of the optical engine at TU Darmstadt (TUDa) at 800 rpm under partial load (0.45 bar intake pressure) (Giannakopoulos et al. 2022) using the CPU spectral element solver Nek5000 (XXX).

The present work addresses the knowledge gap regarding the evolution of velocity and thermal BL and wall heat flux in ICE at practically relevant operating conditions by utilizing recent algorithmic and software developments on state-of-the-art GPU-based supercomputers. Building on the findings of previous studies by Schmitt et al. (2015) and Giannakopoulos et al. (2022), it focuses on a comprehensive investigation of a modified geometry of the pent-roof ICE studied experimentally at TUDa (Baum et al. 2014). The technically relevant full-load (0.95 bar intake pressure) operating points (OP) at 1500 and 2500 rpm are considered, capturing the complex interactions between the in-cylinder flow and the engine walls. A total of 12 compression-expansion cycles were simulated for each engine speed. The multi-cycle realization enabled the acquisition of much-needed phase-averaged statistics, enhancing the dataset beyond what was achievable in the single-cycle DNS by Giannakopoulos et al. (2022).

The remainder of the paper is organized as follows. The case description and numerical approach are presented in Sects. 2 and 3, respectively. Then, based on the DNS results, the evolution of the in-cylinder flows, BL and wall heat flux is analyzed in Sect. 4. Finally, the major conclusions are drawn in the last section.

2 Computational Setup

The direct injection spark ignition engine at TUDa is an optically-accessible single cylinder featuring a pent roof, four-valve head and an inlet duct designed to promote the formation of tumble flow. The setup was designed to provide well-defined boundary conditions and reproducible operation. The cylinder of the square engine has a bore of \(B={86\, \text{mm}}\), which is typical for a passenger car engine. Detailed information regarding the engine and the accompanying test facility can be found in Baum et al. (2014), while the engine OP considered here are listed in Table 1. In order to simplify grid generation and reduce the computational cost, the long crevice volume of the experimental engine and the small pin of the center electrode of the spark plug were removed from the DNS domain. The bulk Reynolds number is defined as \(Re=B\bar{V_p}/\nu\), where \(\bar{V_p}\) is the maximum piston speed and \(\nu\) is the kinematic viscosity of the gas (nitrogen) at the wall temperature and intake pressure. It is only indicative of the large-scale flow at the beginning of the compression stroke and varies with time. Its values are 18,368 and 30,615 at 1500 and 2500 rpm, respectively, i.e. a 4.5- and 7.5-fold increase compared to the OP B simulated by Giannakopoulos et al. (2022).

Table 1 Engine operating conditions

3 Numerical Methodology

The multi-cycle DNS was conducted using the recently developed spectral element solver NekRS (Fischer et al. 2022). In addition to supporting CPU platforms, NekRS targets GPU-accelerated platforms using the portable Open Concurrent Computing Abstraction (OCCA) library (Medina et al. 2014), which allows for runtime code generation for different threading programming paradigms such as CUDA, HIP, and OpenCL. This hybrid MPI+X parallelism approach enables seamless support for multiple hardware architectures. NekRS uses high-order spectral elements in which the solution, the data and the test functions are represented as locally structured Nth-order tensor-product polynomials on a set of E globally unstructured conforming hexahedral spectral elements. This approach has two main advantages. First, for smooth functions, such as solutions to the incompressible/low-Mach Navier–Stokes equations, high-order polynomial expansions yield exponential convergence with N, implying a significant reduction in the number of grid points (\(n\approx EN^3\)) required for a given accuracy in comparison to low-order methods. Second, the locally structured form allows the use of tensor product sum factorization to achieve low \({\mathcal {O}}(n)\) memory cost and \({\mathcal {O}}(nN)\) computational complexity. The time integration in NekRS is based on a semi-implicit splitting scheme using kth-order backward differences (BDFk) to approximate the time derivative (k up to three), coupled with an implicit treatment of the viscous and pressure terms and a kth-order extrapolation (EXTk) for the remaining advection and forcing terms. This approach leads to independent elliptic subproblems consisting of a Poisson equation for the pressure, a coupled system of Helmholtz equations for the three velocity components and an additional Helmholtz equation for the temperature.
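The BDFk/EXTk splitting described above can be illustrated for k = 2 on a scalar model problem. The following Python sketch is not taken from NekRS: the model equation \(u' = -\lambda u + f(t)\), the function names and the manufactured solution \(u=\sin t\) are assumptions made purely for illustration. The stiff linear term plays the role of the implicitly treated viscous term, while the forcing is extrapolated from the two previous time levels.

```python
import math


def bdf2_ext2(lam, forcing, u0, u1, t_end, n_steps):
    """Integrate u' = -lam*u + forcing(t) with a BDF2/EXT2 splitting.

    The linear term -lam*u is treated implicitly; the forcing is
    extrapolated to the new time level from the two previous levels.
    u0 and u1 are the start-up values at t = 0 and t = dt.
    """
    dt = t_end / n_steps
    u_prev, u_curr = u0, u1
    f_prev, f_curr = forcing(0.0), forcing(dt)
    for n in range(1, n_steps):
        # EXT2: second-order extrapolation of the explicit term to t^{n+1}
        f_ext = 2.0 * f_curr - f_prev
        # BDF2: (3 u^{n+1} - 4 u^n + u^{n-1}) / (2 dt) = -lam u^{n+1} + f_ext
        rhs = (4.0 * u_curr - u_prev) / (2.0 * dt) + f_ext
        u_next = rhs / (3.0 / (2.0 * dt) + lam)  # scalar "Helmholtz" solve
        u_prev, u_curr = u_curr, u_next
        f_prev, f_curr = f_curr, forcing((n + 1) * dt)
    return u_curr


LAM = 10.0  # hypothetical stiffness parameter


def manufactured_forcing(t):
    # Chosen so that u(t) = sin(t) solves u' = -LAM*u + f(t)
    return math.cos(t) + LAM * math.sin(t)


def error_at_unit_time(n_steps):
    dt = 1.0 / n_steps
    u = bdf2_ext2(LAM, manufactured_forcing, 0.0, math.sin(dt), 1.0, n_steps)
    return abs(u - math.sin(1.0))
```

Halving the time step in `error_at_unit_time` should reduce the error by roughly a factor of four, which is the expected behaviour of a second-order scheme.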

The construction of computational grids that accurately represent complex ICE geometries poses a significant challenge for NekRS due to its requirement for conformal hexahedral meshes. The mesh generation employed Coreform Cubit (2022) (Version 2022.11) and involved initially filling the cylinder head volume with tetrahedral elements (TET), each of which was then split into four hexahedra (HEX). The mesh for the engine head obtained after the TET-to-HEX conversion is shown in Fig. 1. The mesh of the lower horizontal plane of the cylinder head was subsequently extruded to the piston to create tensor product element layers capable of accommodating the vertical mesh deformation caused by piston motion while minimizing distortion. To account for the mesh movement, NekRS was extended at ETH Zurich by implementing the Arbitrary Lagrangian/Eulerian (ALE) formulation of Ho (1989). The mesh velocity scales linearly between the instantaneous piston velocity on the piston and zero at a distance of 1.6 mm below the head rim. To compensate for the distortion of the spectral elements during compression, four grids were constructed with different numbers of elements, ranging from \(E=4.4\)M to 9.3M, aiming to maintain mesh quality throughout the cycle by removing layers as needed. Specifically, the grids were used at different stages: (i) \(E=9.3\)M spectral elements from \(-120\) to \(-60\) CAD, (ii) \(E=6.7\)M from \(-60\) to \(-30\) CAD, (iii) \(E=4.9\)M from \(-30\) to \(-10\) CAD, and (iv) \(E=4.4\)M from \(-10\) CAD to 30 CAD after TDC. A scalable high-order spectral interpolation was employed to transition the solution from one grid to the next without compromising the accuracy of the high-order method. Polynomial orders of \(N=7\) and \(N=9\) were chosen for the 1500 and 2500 rpm cases, respectively, yielding meshes with 1.5 to 6.8 billion unique grid points. The mesh achieved an average resolution of 30 \(\upmu\)m and 23 \(\upmu\)m in the bulk, with the first grid point located 3.75 \(\upmu\)m and 3 \(\upmu\)m away from the walls for 1500 and 2500 rpm, respectively.
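The quoted range of unique grid points follows from the estimate \(n\approx EN^3\) applied to the element counts of stages (i) to (iv) and the two polynomial orders. The short Python sketch below reproduces this arithmetic; the function and dictionary names are illustrative, and the estimate ignores boundary corrections.

```python
def grid_points(num_elements, poly_order):
    """Approximate number of unique grid points, n ~ E * N**3."""
    return num_elements * poly_order ** 3


# Element counts of the four grids, stages (i) to (iv)
ELEMENT_COUNTS = {"i": 9_300_000, "ii": 6_700_000,
                  "iii": 4_900_000, "iv": 4_400_000}
# Polynomial order per engine speed in rpm
POLY_ORDER = {1500: 7, 2500: 9}

if __name__ == "__main__":
    for rpm, order in POLY_ORDER.items():
        for stage, elements in ELEMENT_COUNTS.items():
            billions = grid_points(elements, order) / 1e9
            print(f"{rpm} rpm, grid ({stage}): {billions:.2f} billion points")
```

The smallest value is obtained for grid (iv) with \(N=7\) and the largest for grid (i) with \(N=9\).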

Fig. 1

Representation of the engine head mesh after TET-to-HEX conversion, illustrating the spectral element structure of the fully conformal hexahedral mesh

The initial conditions for the DNS were obtained from precursor Large Eddy Simulations (LES) validated statistically against available experimental data following the workflow described in Giannakopoulos et al. (2022). Subsequent cycles were selected from the LES results, and the data were interpolated onto the DNS grid at Intake Valve Closure (IVC) at \(-120\) CAD. The no-slip engine walls and the piston were considered isothermal at the conditioning temperature \(T_w={333.15\,\textrm{K}}\). Starting from initial conditions generated by means of LES, the low-Mach number form of the governing equations was integrated in time with NekRS using a second-order semi-implicit scheme. The time step was dynamically adjusted during the simulations to maintain a fixed maximum Courant–Friedrichs–Lewy (CFL) number of 2. Numerical stability at this CFL number was ensured by a high-order characteristics-based scheme (Patel et al. 2018), which overcomes the CFL constraints imposed by standard schemes. With this approach, variable time step sizes ranging from 14 to 36 \(\upmu\)s were achieved, which corresponds to 650 to 250 time steps/CAD for the engine speeds considered in this study. A total of 12 cycles were computed for each engine speed on the JUWELS Booster GPU nodes at the Jülich Supercomputing Centre, each equipped with 2 AMD EPYC Rome CPUs with 48 cores and 4 NVIDIA A100 GPUs with 40 GB memory. The simulations were performed on 70 and 130 GPU nodes for OP C and OP E, respectively, requiring 2700 and 5200 node-hours per cycle at 1500 and 2500 rpm.
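The resource figures quoted above can be combined into campaign totals with simple arithmetic. The Python sketch below is only a bookkeeping aid: the function name is hypothetical, and the wall-clock estimate assumes that all nodes are busy for the entire run, ignoring queueing, I/O and restarts.

```python
def campaign_cost(node_hours_per_cycle, nodes, cycles=12, gpus_per_node=4):
    """Derive aggregate cost figures from the per-cycle node-hours."""
    return {
        # Total allocation consumed by all simulated cycles
        "total_node_hours": node_hours_per_cycle * cycles,
        # Idealized wall-clock time of one cycle on the given partition
        "wall_hours_per_cycle": node_hours_per_cycle / nodes,
        # Number of GPUs used concurrently
        "gpus": nodes * gpus_per_node,
    }


if __name__ == "__main__":
    print("OP C:", campaign_cost(2700, 70))
    print("OP E:", campaign_cost(5200, 130))
```

Under these assumptions, one cycle takes on the order of 40 hours of wall-clock time at either engine speed.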

4 Results and Discussion

4.1 Evolution of the In-Cylinder Flow and Temperature Fields

In order to interpret the BL profiles during the compression stroke, it is important to first describe the overall behavior of the in-cylinder flows. For the statistical study, a Favre phase-averaging technique was utilized for the 12 computed cycles to account for the density variations caused by the temperature disparity between the cold walls and the increasing bulk temperature due to compression.

The evolution of the Favre phase-averaged velocity magnitude at different times during the compression stroke is illustrated in Figs. 2 (OP C) and 3 (OP E); in both cases, a similar qualitative behaviour was observed for the two engine speeds. The global flow in the engine is driven by the piston motion, the momentum of the intake flow and the pent-roof cylinder head geometry, creating a large clockwise tumble vortex. The tumble vortex can be discerned in Fig. 2 both from the velocity vectors superimposed on the contour plots (top row) and by pressure isosurfaces (bottom row). The pressure isosurfaces become progressively more wrinkled as compression progresses due to the increasing Reynolds number in the bulk; this effect is enhanced as the vortex develops instabilities before breaking down. The tumble structure persists throughout most of the compression stroke until shortly before TDC. This large scale structure conserves the kinetic energy of the intake jets, and is constantly being compressed by the upward moving piston. Close to TDC, the tumble vortex breaks down, resulting in a transfer of kinetic energy from the large-scale tumbling motion to small-scale turbulence (Borée and Miles 2014).

The evolution of the tumble flow in the experimental studies is typically limited to 2D slices on the cross tumble plane (Jainski et al. 2012; Renaud et al. 2018; Schmidt et al. 2023). However, in order to gain a better understanding, the complete 3D motion and the effect of the engine geometry must be taken into account. As also observed by Voisine et al. (2010), the flow induced by the 3D tumble motion rolls up along the cylinder wall and generates two secondary horizontal vortex structures. In contrast to the findings of Voisine et al. (2010), in the TUDa engine, these vortices are not only located in the centre of the upper part of the cylinder, but rather exhibit a positional variance along the x-axis that depends on the height within the cylinder. At large distances above the piston, the centers of the vortical structures, indicated by the red ellipses, are predominantly oriented towards the exhaust side (Fig. 3b). Conversely, at lower heights closer to the piston, the centers of the vortices tend to be situated more towards the intake side (Fig. 3c). They persist throughout the compression stroke, and as the piston approaches TDC, they begin to form to within 1 mm of the piston surface, as can be discerned from the vectors in Fig. 3f, making their influence on the piston BL increasingly evident. These structures pertain primarily to the 3D advancement of the tumbling flow, rather than being indicative of its early breakdown, as suggested by Lumley (2001). Their effect on BL evolution is discussed in more detail in the following sections.

Fig. 2

Favre phase-averaged velocity magnitude with superimposed vectors (top) and pressure (\(p_1 = 10\) Pa) isosurfaces (bottom) on the tumble plane (\(y=0\) mm) at \(-105\), \(-75\) and \(-15\) CAD for OP C

Fig. 3

Favre phase-averaged piston-parallel velocity magnitude with superimposed vectors for OP E at different heights above the piston: a 37 mm, b 25 mm, c 13 mm, d 1 mm at \(-75\) CAD, and e 4 mm, f 1 mm at \(-15\) CAD. The red ellipses mark the vortical structures

The evolution of the Favre phase-averaged temperature fields is depicted in Fig. 4. The temperature fields are largely homogeneously distributed, even during the earlier phases of compression, with the exception of the areas close to the walls (Fig. 4 right). In contrast to the findings of Schmitt et al. (2015), thermal stratification does not increase significantly during the compression stroke. A possible explanation could be the higher engine speed, 2500 rpm for the plots in Fig. 4, compared to 560 rpm in Schmitt et al. (2015) and the vortical structures in the bulk flow. The increased engine speed leads to more turbulence and promotes mixing, leading to a more uniform temperature distribution within the cylinder. However, it should be noted that this temperature homogeneity is largely a characteristic of the motored operating condition considered here, where exhaust gas recirculation effects are absent.

In the following, the impact of the flow and temperature fields on the transient BL profiles is analyzed in detail.

Fig. 4

Favre phase-averaged temperature on the tumble plane (\(y={0\,\textrm{mm}}\)) at \(-105\), \(-75\) and \(-15\) CAD for OP E

4.2 Boundary Layer Averaging

In contrast to the experiments where hundreds of cycles can be readily measured for statistical analysis (Jainski et al. 2012; Renaud et al. 2018; Schmidt et al. 2023), 12 compression-expansion cycles were simulated for each OP. In order to obtain better converged statistics, Favre phase-averaging was combined with spatial averaging. In most studies where BL in canonical configurations are investigated, the use of spatial averaging techniques in combination with time- and/or phase-averaging is a common practice, as it allows for a reduction of the total computational cost to obtain converged statistics. However, for complex flow configurations, such as those found in an ICE, determining the appropriate direction for spatial averaging is not readily apparent. Similarly to Giannakopoulos et al. (2022), the cross-tumble or spanwise direction (i.e., the direction parallel to the y-axis) was chosen, since it exhibits sufficient homogeneity, as can be seen in the Favre phase-averaged piston-parallel velocity magnitude distribution near the piston shown in Fig. 5.

Fig. 5

Favre phase-averaged piston-parallel velocity magnitude 1 mm away from the piston surface at \(-75\) CAD for OP E. The rectangles defined by the dashed black line mark the S1, S2 and S3 sampling regions discussed in the text

A domain with a height of 2 mm covering a region of \(4.5\times 4\) \(\hbox {cm}^2\) around the piston center was selected for spatial averaging; it is marked by the black dashed lines in Fig. 5. This domain was further divided into three regions of equal area labeled as S1, S2 and S3, in order to investigate the effects of local flow structures on the BL. The results from each region are presented in the following sections. Unless otherwise noted, the full domain (covering \(4.5\times 4\) \(\hbox {cm}^2\) of the piston surface area) was used to collect near-wall statistics on the piston surface.

In the following, the Favre phase-averaged velocity \(\tilde{u}\) and temperature \(\tilde{T}\) defined as

$$\begin{aligned} \tilde{u} = \frac{\overline{\rho u}}{\overline{\rho }}, \quad \quad \tilde{T} = \frac{\overline{\rho T}}{\overline{\rho }}, \end{aligned}$$

where \(\rho\) is the gas density and the overline (\(\overline{\cdot }\)) denotes phase-averaging, will be simply referred to as u and T.
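The density weighting in the definition above can be made concrete with a few lines of Python. This is a minimal sketch, assuming that the samples of one quantity at a fixed crank angle and wall distance (collected over cycles and, where applicable, over the spanwise direction) are available as plain lists; the function name is illustrative.

```python
def favre_average(rho_samples, q_samples):
    """Density-weighted (Favre) average: mean(rho * q) / mean(rho)."""
    if not rho_samples or len(rho_samples) != len(q_samples):
        raise ValueError("need two non-empty sample lists of equal length")
    n = len(rho_samples)
    mean_rho = sum(rho_samples) / n
    mean_rho_q = sum(r * q for r, q in zip(rho_samples, q_samples)) / n
    return mean_rho_q / mean_rho


if __name__ == "__main__":
    rho = [1.0, 3.0]  # synthetic densities
    u = [2.0, 4.0]    # synthetic velocities
    print(favre_average(rho, u))  # weighted toward the denser sample
```

For uniform density the Favre average reduces to the ordinary arithmetic mean, so the two averages differ only where density fluctuations correlate with the averaged quantity, as in the cold near-wall gas.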

4.3 Velocity Boundary Layer

4.3.1 Dimensional Profiles

Fig. 6

Normalized piston-parallel velocity profiles as a function of the distance from the piston surface at different CAD during compression for OP C (left) and OP E (right). The black diamond represents the velocity BL thickness

The evolution of the BL velocity profiles at the piston for the two OP is shown in Fig. 6. The Favre phase-averaged piston-parallel velocity magnitude \(|V_{xy}|=\sqrt{u_x^2 + u_y^2}\) is normalized by the corresponding maximum piston speed \(\bar{V_p}\). The velocity profiles show a region close to the piston characterized by steep velocity gradients, indicating the presence of a pronounced viscous sublayer and resulting in the almost linear increase close to the piston (Fig. 6). The velocity increases up to a maximum value before decreasing in the bulk due to the tumble flow outside the BL, which causes a drop in velocity away from the piston surface, in contrast to the monotonic increase typical of canonical flow configurations. This tumble-induced deceleration is characteristic of engine BL in which the mean in-cylinder flow in the core region of the engine is controlled by large vortical motions that counteract the outward growth of the BL velocity profile (Ma et al. 2016). In addition, the velocity decreases as the piston approaches TDC as a consequence of the dissipation of kinetic energy in the flow and the decreasing piston speed. The BL thickness, defined by the location where the velocity reaches 99% of its peak value (Renaud et al. 2018), scales inversely with the engine speed and the crank angle. The inverse scaling with the crank angle can be attributed to the decreasing turbulent length scales and increasing Reynolds numbers due to the decreasing kinematic viscosity. The kinematic viscosity decreases during the compression stroke because of the globally increasing temperature and pressure. In the vicinity of the cold isothermal walls, this decrease in kinematic viscosity is even more pronounced (Schmitt et al. 2016b). As a result, the BL thickness at 2500 rpm reaches values as low as 0.41 mm.

Table 2 summarizes several parameters defined in Renaud et al. (2018) and Schmidt et al. (2023) to characterize the BL computed in the sampling domain shown in Fig. 5: the thickness of the viscous sublayer \(\delta _{\nu }\), the BL thickness \(\delta\), the displacement thickness \({\delta }^*\), the momentum thickness \({\theta }\), the shape factor H, and the Reynolds number based on the momentum thickness \(\hbox {Re}_{\theta }\),

$$\begin{aligned} \delta _{\nu }&= z |_{z^+_{xy}=5}, \end{aligned}$$
(1)
$$\begin{aligned} \delta&= z |_{V_{xy} = 0.99 V_{xy,\text {max}}}, \end{aligned}$$
(2)
$$\begin{aligned} \delta ^*&= \int _0^\infty \left( 1 - \frac{\rho (z) V_{xy}(z)}{\rho _\infty u_\infty } \right) \, dz, \end{aligned}$$
(3)
$$\begin{aligned} \theta&= \int _0^\infty \frac{\rho (z) V_{xy}(z)}{\rho _\infty u_\infty } \left( 1-\frac{V_{xy}(z)}{u_\infty } \right) \, dz, \end{aligned}$$
(4)
$$\begin{aligned} H&= \frac{\delta ^*}{\theta }, \end{aligned}$$
(5)
$$\begin{aligned} Re_{\theta }&= \frac{u_{\infty }\theta }{\nu }. \end{aligned}$$
(6)

Here, \(\rho\) and \(\nu\) stand for the local gas density and kinematic viscosity, respectively and \(z^+_{xy}\) is the inner-scaled wall distance (Giannakopoulos et al. 2022)

$$\begin{aligned} z^+_{xy}&= z \frac{u_{\tau xy}}{\nu _w}, \end{aligned}$$
(7)

where \(u_{\tau xy} = \sqrt{\tau _{wxy}/\rho _w}\) is the friction velocity based on the wall shear stress \(\tau _{wxy} = \mu _w \sqrt{\left( du_x/dz \right) ^2 + \left( du_y/dz\right) ^2}\), with \(\mu _w\) and \(\rho _w\) the dynamic viscosity and density of the gas at the wall and the velocity gradients evaluated at the wall. The upper limit of the integrals (Eqs. 3 and 4) is set at the height above the piston where \(u_\infty =V_{xy, \text {max}}\) (Renaud et al. 2018).
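As an illustration of how Eqs. 2 to 6 can be evaluated on a discrete profile, the following Python sketch uses the trapezoidal rule and linear interpolation. It is not the post-processing used in this study: the function names are hypothetical, the profile is assumed to start at the wall with zero velocity, and the kinematic viscosity entering \(\hbox {Re}_{\theta }\) is passed in by the caller.

```python
def trapezoid(values, z):
    """Composite trapezoidal rule on a possibly non-uniform grid."""
    return sum(0.5 * (values[i] + values[i - 1]) * (z[i] - z[i - 1])
               for i in range(1, len(z)))


def boundary_layer_parameters(z, v, rho, nu):
    """Evaluate Eqs. 2-6 for wall distances z, velocities v, densities rho."""
    # Edge state taken at the velocity maximum (upper integration limit)
    i_max = max(range(len(v)), key=lambda i: v[i])
    u_inf, rho_inf = v[i_max], rho[i_max]

    # Eq. 2: first crossing of 99% of the peak velocity
    target = 0.99 * u_inf
    delta = z[i_max]
    for i in range(1, i_max + 1):
        if v[i] >= target:
            frac = (target - v[i - 1]) / (v[i] - v[i - 1])
            delta = z[i - 1] + frac * (z[i] - z[i - 1])
            break

    zs = z[:i_max + 1]
    flux = [rho[i] * v[i] / (rho_inf * u_inf) for i in range(i_max + 1)]
    # Eq. 3: displacement thickness
    delta_star = trapezoid([1.0 - f for f in flux], zs)
    # Eq. 4: momentum thickness
    theta = trapezoid([flux[i] * (1.0 - v[i] / u_inf)
                       for i in range(i_max + 1)], zs)
    return {
        "delta": delta,
        "delta_star": delta_star,
        "theta": theta,
        "H": delta_star / theta,           # Eq. 5
        "Re_theta": u_inf * theta / nu,    # Eq. 6
    }
```

For a linear, constant-density test profile the sketch returns a shape factor close to 3, the analytical value for that profile, which provides a quick sanity check of the integration.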

The viscous sublayer is found to be as thin as 13 \(\upmu\)m at the highest rpm, and scales inversely with the engine speed and the crank angle similar to the BL thickness. The values for \(\hbox {Re}_{\theta }\) and the displacement thickness \({\delta }^*\) are low, although the BL exhibit turbulent behavior. The turbulent character of the BL is supported by the observed values for the shape factor H. For laminar BL, the classical Blasius profile has a shape factor of 2.59 (Schlichting and Gersten 2017). Conversely, turbulent BL at high Reynolds numbers typically have shape factors between 1.3 and 1.7 (Schlichting and Gersten 2017). The values observed in this study are consistently below 1.5, indicating that the configuration is compatible with a turbulent scenario despite the relatively low \(\hbox {Re}_{\theta }\). These observations indicate that the presence of a turbulent bulk flow leads to an accelerated onset of turbulence in the BL, consequently deviating from the expected characteristics of the BL as predicted by established theories.

Table 2 Velocity BL parameters computed for OP C and E

4.3.2 Scaled Profiles

Figure 7 shows the velocity profiles during compression for the two OP using transformations from physical to wall units as defined by Huang et al. (1995), where inner scaling is denoted by \(+\) and semi-local scaling by \(*\). The variables of interest, namely the scaled wall distance \(z^*_{xy}\) and the scaled piston-parallel velocities \(u^+_{xy}\) and \(u^*_{xy}\) are defined as:

$$\begin{aligned} z^*_{xy}&= \frac{z}{\nu _w} \sqrt{\frac{\tau _{wxy}}{\rho }}, \end{aligned}$$
(8)
$$\begin{aligned} u^+_{xy}&= \frac{\sqrt{u^2_x+u^2_y}}{u_{\tau xy}}, \end{aligned}$$
(9)
$$\begin{aligned} u^*_{xy}&= \sqrt{u^2_x+u^2_y} \sqrt{\frac{\rho }{\tau _{wxy}}}, \end{aligned}$$
(10)

with \(\nu _{w}\) being the kinematic viscosity at the wall and \(u_{\tau xy}\) and \(\tau _{wxy}\) the friction velocity and the wall shear stress on the piston, respectively. The dashed black lines in Figs. 7 and 8 represent the law of the wall (Bredberg 2000)

$$\begin{aligned} u^+_{xy}&= {\left\{ \begin{array}{ll} z^+_{xy}, &{} \text {if } z^+_{xy} < 10\\ \frac{1}{k} \ln (z^+_{xy}) + B, &{} \text {if } z^+_{xy} \ge 10 \end{array}\right. } \end{aligned}$$
(11)
$$\begin{aligned} u^*_{xy}&= {\left\{ \begin{array}{ll} z^*_{xy}, &{} \text {if } z^*_{xy} < 10\\ \frac{1}{k} \ln (z^*_{xy}) + B, &{} \text {if } z^*_{xy} \ge 10 \end{array}\right. } \end{aligned}$$
(12)

where the constants \(k=0.41\) and \(B=5.2\) take standard values (Pope 2000).
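The inner scaling of Eqs. 7 and 9 and the reference profile of Eq. 11 can be evaluated as in the following Python sketch. The function names are illustrative and the wall quantities are assumed to be known. The switch between the linear and the logarithmic branch is placed at a scaled wall distance of 10, exactly as written in Eqs. 11 and 12, so the reference curve is not continuous at that point.

```python
import math

KAPPA = 0.41   # constant k in Eqs. 11 and 12
B_CONST = 5.2  # additive constant B in Eqs. 11 and 12


def friction_velocity(tau_w, rho_w):
    """u_tau = sqrt(tau_w / rho_w)."""
    return math.sqrt(tau_w / rho_w)


def inner_scaled(z, u_mag, tau_w, rho_w, nu_w):
    """Return (z+, u+) according to Eqs. 7 and 9."""
    u_tau = friction_velocity(tau_w, rho_w)
    return z * u_tau / nu_w, u_mag / u_tau


def viscous_sublayer_thickness(tau_w, rho_w, nu_w):
    """Eq. 1: wall distance at which z+ = 5."""
    return 5.0 * nu_w / friction_velocity(tau_w, rho_w)


def law_of_the_wall(z_plus, switch=10.0):
    """Reference profile of Eq. 11 (same form as Eq. 12)."""
    if z_plus < switch:
        return z_plus
    return math.log(z_plus) / KAPPA + B_CONST


if __name__ == "__main__":
    for zp in (1.0, 5.0, 30.0, 100.0):
        print(zp, round(law_of_the_wall(zp), 3))
```

The same `law_of_the_wall` function applies to the semi-local variables of Eq. 12 when it is called with \(z^*_{xy}\) instead of \(z^+_{xy}\).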

Fig. 7

Piston-parallel velocity magnitude profiles on the piston at different CAD during compression for OP C (left) and OP E (right). Top row: inner scaling; bottom row: semi-local scaling; black dashed lines indicate the profiles predicted by the law of the wall

Fig. 8

Piston-parallel velocity magnitude profiles on the piston at \(-45\), \(-30\), and \(-15\) CAD for OP C (blue) and E (red). Top row: inner scaling; bottom row: semi-local scaling; black dashed lines indicate the profiles predicted by the law of the wall

The scaled velocity profiles in Fig. 7 show comparable qualitative characteristics for the two engine speeds. In particular, the observed values for both profiles are lower than the predictions derived from the law of the wall, with the semi-local scaled profiles (Fig. 7c, d) showing a larger deviation that is most pronounced at TDC. In contrast to the results of Schmitt et al. (2015) obtained in a pancake engine-like geometry at 560 rpm, the semi-local scaled profiles in the buffer and outer layer do not collapse into a single curve. According to Huang et al. (1995) and Nicoud (1999), the collapse would indicate that the differences in the inner scaled profiles (Fig. 7a, b) are mainly due to density variations within the BL, which does not seem to be the case for the complex engine flows considered here. While most of the profiles conform to the linear velocity law for \(z^+_{xy} \lesssim 5\), they do not clearly indicate the presence of a logarithmic layer region. This observation suggests a significant degree of overlap between the viscosity-driven inner layer and the outer layer, as reported by Spalart (1988). These findings are in agreement with previous experimental (Renaud et al. 2018; Schmidt et al. 2023) and numerical (Giannakopoulos et al. 2022) studies on the engine BL structure.

Fig. 9

Scaled velocity BL on the piston at \(-45\), \(-30\), and \(-15\) CAD: magnitude-based processed piston-parallel velocity magnitude profiles for sampling domains S1 (top) and S2 (middle) and S3 (bottom); black dashed lines: law of the wall profiles

Fig. 10

Favre phase-averaged velocity magnitude on the piston reference plane with velocity vectors superimposed on the tumble plane slice (\(y=0\) mm) (top) and at 1 mm above the piston (bottom) at \(-45\) CAD (top) and \(-15\) CAD (bottom) for OP C (left) and OP E (right). The red ellipses indicate the center of the vortices

Comparing the scaled profiles for the two engine speeds (Fig. 8), it can be seen that the values for OP E (red curves) are consistently slightly lower than those for OP C (blue curves), although the Reynolds number is higher and the BL is expected to be more developed. In order to better understand this behavior, the sampling domain shown in Fig. 5 is divided into three equal regions. The direction of the tumble-induced flow above the piston surface is from right to left, i.e., from region S3 to S1. The corresponding inner scaled velocity profiles are shown in Fig. 9 for the same time instants; the pressure gradient values are also provided. Figure 10 depicts the Favre phase-averaged xz-velocity magnitude in the piston reference frame (i.e. piston speed is subtracted from the z-component of velocity) on the center tumble plane (\(y={0\,\text{mm}}\)) and the xy-velocity component at 1 mm above the piston at different time instants during compression. The flow in the y-direction (Fig. 10c, d) is split at the center line (\(y={0\,\text{mm}}\)) and runs in opposite directions, which can be seen more clearly in the S1 region (Fig. 10d). Similarly, the pressure gradient values are also split above and below the center line. In order to obtain meaningful values that do not cancel each other out when averaging across the illustrated regions, the averaged dP/dy values for \(y<{0\,\text{mm}}\) have been inverted and added to the values for \(y>{0\,\text{mm}}\). The resulting dP/dy values are shown in Fig. 9 together with dP/dx values which were computed regularly without any inversion.
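The sign inversion applied to the dP/dy values can be illustrated with a small example. This is a sketch of one possible reading of the procedure described above (flip the sign of the samples with \(y<0\), then average over all samples); the function name `folded_mean_dpdy` and the sample values are hypothetical, and samples lying exactly on the center line are assumed to keep their sign.

```python
def folded_mean_dpdy(y, dpdy):
    """Mean of dP/dy after flipping the sign of all samples with y < 0.

    y    : sample positions relative to the center line (y = 0)
    dpdy : pressure gradient in the y-direction at those positions
    """
    if len(y) != len(dpdy) or not y:
        raise ValueError("y and dpdy must be non-empty and of equal length")
    # Assumption: samples exactly at y = 0 are left unchanged.
    folded = [-g if yi < 0.0 else g for yi, g in zip(y, dpdy)]
    return sum(folded) / len(folded)


# Illustrative antisymmetric field: the flow is split at the center line,
# so the gradient has opposite signs on the two sides.
y_samples = [-2.0, -1.0, 1.0, 2.0]
dpdy_samples = [2.0, 2.0, -2.0, -2.0]

plain_mean = sum(dpdy_samples) / len(dpdy_samples)          # contributions cancel
folded_mean = folded_mean_dpdy(y_samples, dpdy_samples)     # sign is preserved
```

For this antisymmetric example the plain average vanishes, while the folded average retains the magnitude and sign seen by the flow on the \(y>0\) side, which is why a negative folded dP/dy is interpreted as favorable.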

In region S3 (Fig. 9g, h, i), the corresponding BL are rather developed as they partially coincide with the log law. The flow structures shown in Fig. 10a, b resemble canonical impinging wall jets. The area in which the flow impinges on the piston appears to be further to the right compared to S3. The flow is therefore deflected by the piston and accelerated in this region. This is confirmed by the positive dP/dx values, which indicate that the flow experiences a favorable pressure gradient in the x-direction, as the direction of the flow is from right to left. The pressure gradients in the x-direction have similar values for both OP and are significantly higher than in the y-direction. Due to the aforementioned inversion of the dP/dy values for \(y<{0\,\text{mm}}\) a negative pressure gradient in the y-direction is favorable in this case, as the flow above the center line runs in the positive y-direction. Therefore, the flow is also accelerated in the y-direction. The profiles for OP E are slightly higher than OP C in S3, a characteristic of a more developed BL, which is consistent with previous results (Giannakopoulos et al. 2022; Schmidt et al. 2023). In S2 (Fig. 9d, e, f), the pressure gradients in the x-direction are also favorable (positive) and higher than in S3 during late compression at \(-15\) CAD. In addition, the pressure gradients in the y-direction also show higher negative values. Higher positive values in the x-direction and higher negative values in the y-direction mean that the flow is further accelerated and more similar to a canonical flow, such as flow past a flat plate, while the BL profiles appear to be shifted closer to the logarithmic region.

In S1 (Fig. 9a, b, c), the velocity profiles for OP E show significantly lower values, which continue to decrease as compression progresses. The pressure gradients in the x-direction for OP E are still favorable (positive) and the pressure gradients in the y-direction are also favorable (negative) in this region. However, when looking at Fig. 10d, the flow in S1 for OP E appears to be deflected by the vortical structures discussed in Sect. 4.1 and no longer runs from right to left, but is split at the center line (\(y={0\,\text{mm}}\)) and moves in the positive and negative y-direction. The flow in this region is similar to that of two opposing wall jets (Johansson and Andersson 2005; Liang et al. 2023), as the flow generated by the vortical structures counteracts the directional flow from the tumble. A stagnation zone appears to form near the center line, as can be seen from the velocity values in Fig. 10d, with possibly high turbulence levels (Johansson and Andersson 2005) and underdeveloped BL, which could explain why the BL profiles do not match those predicted by the law of the wall during the later stages of compression. The BL profiles are expected to be more developed as the flow is accelerated again away from the collision zone. For OP C, all pressure gradients dP/dx are adverse (negative), indicating that the flow is decelerated, which can also be confirmed by the velocity values in Fig. 10c. For canonical flow configurations, an adverse pressure gradient causes the BL profiles to be shifted below the logarithmic region (Monty et al. 2011). However, the profiles seem to adhere quite well to the logarithmic region for OP C in S1. This can be partly explained by the favorable (negative) pressure gradients in the y-direction, which accelerate the flow along the y-component. The largest difference between the profiles for the two OP can be seen in region S1 at \(-15\) CAD (Fig. 9c). The profile for OP E no longer conforms with the law of the wall within the viscous sublayer. This large deviation can be partly explained by the presence of the vortical structures. At the highest engine speed investigated, the influence of these vortex structures extends beyond the center of the piston and significantly alters the BL profiles during the later stages of compression, as shown in Fig. 10d. For OP C, the vortical structures close to the piston do not extend as far and their effect on the BL profile in the S1 region is therefore not as strong (Fig. 10c).

In summary, the velocity BL inside the engine vary considerably in space and time and depend on the instantaneous flow structures. They deviate from canonical flow configurations due to local pressure gradients and large-scale vortical structures in the bulk flow. Numerical wall models must account for the unsteady nature of the non-equilibrium flow with a non-zero pressure gradient if near-wall processes are to be accurately predicted in engineering ICE simulations.

4.4 Thermal Boundary Layer

The DNS enables the analysis of the thermal BL without invoking the Reynolds analogy, which assumes that the turbulent Prandtl number is unity, so that the velocity profile can be used to obtain the temperature profile and the local density and viscosity.

4.4.1 Dimensional Profiles

The evolution of the normalized temperature \(T/T_w\) profiles during compression is plotted in Fig. 11 as a function of the distance from the wall. Steep temperature gradients can be observed near the piston, which flatten out with increasing distance from the wall, and at approximately 1 mm the profiles become almost horizontal. In contrast to the velocity profiles, the temperature profiles are not affected by the tumble vortex in the engine core and increase monotonically until the bulk gas temperature is reached. The thermal BL thickness, defined as the distance between the piston and the point at which the gas temperature reaches 99% of the bulk gas temperature, is marked by black diamonds. Similar to the velocity BL, the thermal BL thickness scales inversely with engine speed but increases with the in-cylinder temperature. At the highest engine speed, the thermal BL thickness reaches values as low as 0.65 mm.

Fig. 11

Normalized temperature as a function of the distance from the piston surface at different CAD during compression for OP C (left) and E (right). The black diamond represents the thermal BL thickness

Table 3 summarizes several parameters that characterize the thermal BL: thermal BL thickness \(\delta _T\), thermal displacement thickness \(\delta ^*_T\) and the Prandtl number Pr defined as (Ojo et al. 2021; Fan et al. 2018),

$$\begin{aligned} \delta _{T}&= z |_{T=0.99T_{\text {max}}} \end{aligned}$$
(13)
$$\begin{aligned} \delta ^*_T&= \int _0^\infty \left( \frac{T_{\text {bulk}} - T(z)}{T_{\text {bulk}}-T_w}\right) \, dz \end{aligned}$$
(14)
$$\begin{aligned} Pr&= \frac{\mu c_p}{\lambda } \end{aligned}$$
(15)

where \(T_{\text {bulk}}\) denotes the gas temperature in the bulk, \(c_p\) the specific heat at constant pressure and \(\lambda\) the thermal conductivity; the upper limit of the integral is set at the height above the piston where \(T_{\text {max}}=T_{\text {bulk}}\). The fluid properties are evaluated at the bulk temperature and pressure.
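Equations 13 to 15 can be evaluated on a discretely sampled temperature profile as sketched below. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, \(T_{\text {max}}\) is taken as the maximum of the sampled profile, the 99% crossing is located by linear interpolation, and the integral in Eq. 14 is approximated with the trapezoidal rule over the sampled range.

```python
def thermal_bl_thickness(z, T, frac=0.99):
    """Wall distance at which T first reaches frac * max(T) (cf. Eq. 13)."""
    target = frac * max(T)
    if T[0] >= target:
        return z[0]
    for i in range(1, len(z)):
        if T[i] >= target:
            # Linear interpolation between the two bracketing samples.
            w = (target - T[i - 1]) / (T[i] - T[i - 1])
            return z[i - 1] + w * (z[i] - z[i - 1])
    raise ValueError("profile never reaches the target temperature")


def thermal_displacement_thickness(z, T, T_w, T_bulk):
    """Trapezoidal approximation of Eq. 14 over the sampled wall distances."""
    f = [(T_bulk - Ti) / (T_bulk - T_w) for Ti in T]
    return sum(0.5 * (f[i] + f[i - 1]) * (z[i] - z[i - 1])
               for i in range(1, len(z)))


def prandtl_number(mu, c_p, lam):
    """Pr = mu * c_p / lambda (Eq. 15)."""
    return mu * c_p / lam


# Illustrative profile (made-up values): linear rise from the wall
# temperature to the bulk temperature, then constant.
z_demo = [0.0, 0.5, 1.0, 2.0]          # wall distance
T_demo = [400.0, 600.0, 800.0, 800.0]  # temperature

delta_T = thermal_bl_thickness(z_demo, T_demo)
delta_T_star = thermal_displacement_thickness(z_demo, T_demo, 400.0, 800.0)
```

For a profile that rises linearly to the bulk value, the displacement thickness is half of the distance over which the rise takes place, which gives a quick sanity check for the integration.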

Table 3 Thermal BL parameters computed for OP C and E

The thickness of the dimensional thermal BL \(\delta _T\) is found to increase with increasing pressure and temperature, which is due to the increased \(T_{\text {bulk}}\) when the gas is compressed. In contrast, the thermal displacement thickness \(\delta ^*_T\) decreases as compression progresses. This can be attributed to several factors, including changes in the thermal properties of the fluid due to increasing temperature and increased thermal mixing due to enhanced turbulence. Higher temperature leads to higher gas thermal conductivity, which may reduce the displacement effect due to a more uniform temperature distribution near the wall. Similarly, the increased turbulence during compression homogenizes the temperature profile near the wall and can lead to a reduction in thermal displacement thickness as the temperature difference driving the displacement becomes smaller. The Prandtl number remains practically constant during mid and late compression (\(-60\) to 0 CAD) with a value of 0.712, but varies between 0.704 and 0.712 throughout the compression stroke (\(-120\) to 0 CAD). Fan et al. (2018) reported a Pr that varies between 0.705 and 0.727 during the compression stroke of the motored operation of a spark-ignition direct-injection engine. This small difference could be attributed to the lower average temperature in the bulk at the beginning of compression and possibly a different working fluid.

4.4.2 Scaled Profiles

Inner and semi-local scaling is applied to all variables reported in Fig. 12, where the scaled temperatures \(T^+\) and \(T^*\) are defined as (Giannakopoulos et al. 2022):

$$\begin{aligned} T^+&= \rho _w c_{p,w} \frac{T_w - T}{q_w} u_{\tau xy}, \end{aligned}$$
(16)
$$\begin{aligned} T^*&= \rho _w c_{p,w} \frac{T_w - T}{q_w} \sqrt{\frac{\tau _{wxy}}{\rho }}, \end{aligned}$$
(17)

where \(c_{p,w}\) is the heat capacity at the wall, and \(q_w\), the heat flux at the wall, is given by

$$\begin{aligned} q_w = \lambda _w \frac{\partial T}{\partial n}, \end{aligned}$$
(18)

with \(\lambda _w\) being the gas thermal conductivity at the fixed wall temperature \(T_w\).

The black dashed line in Fig. 12 represents the thermal law of the wall given as (Bradshaw and Huang 1995)

$$\begin{aligned} T^+&= {\left\{ \begin{array}{ll} z^+_{xy}Pr, &{} \text {if } z^+_{xy} < 10\\ 2.075 \ln (z^+_{xy}) + 3.9, &{} \text {if } z^+_{xy} \ge 10 \end{array}\right. } \end{aligned}$$
(19)
$$\begin{aligned} T^*&= {\left\{ \begin{array}{ll} z^*_{xy}Pr, &{} \text {if } z^*_{xy} < 10\\ 2.075 \ln (z^*_{xy}) + 3.9, &{} \text {if } z^*_{xy} \ge 10 \end{array}\right. } \end{aligned}$$
(20)
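The scaled temperatures of Eqs. 16 and 17 and the reference curves of Eqs. 19 and 20 can be sketched as follows. The function names are hypothetical, and the consistency check at the end assumes the standard definition of the friction velocity, \(u_{\tau xy}=\sqrt{\tau _{wxy}/\rho _w}\), which is not restated in this section; the input values are made up for illustration.

```python
import math


def t_plus(T, T_w, q_w, rho_w, cp_w, u_tau):
    """Inner-scaled temperature, following Eq. 16."""
    return rho_w * cp_w * (T_w - T) / q_w * u_tau


def t_star(T, T_w, q_w, rho_w, cp_w, tau_w, rho):
    """Semi-local scaled temperature, following Eq. 17 (local density rho)."""
    return rho_w * cp_w * (T_w - T) / q_w * math.sqrt(tau_w / rho)


def thermal_law_of_the_wall(z_scaled, Pr):
    """Reference curve of Eqs. 19 and 20 for a scaled wall distance."""
    if z_scaled < 10.0:
        return Pr * z_scaled
    return 2.075 * math.log(z_scaled) + 3.9


# Consistency check with made-up values: if the local density equals the
# wall density, the semi-local and inner scalings coincide.
rho_w, tau_w = 10.0, 2.5
u_tau = math.sqrt(tau_w / rho_w)  # assumed definition of the friction velocity
tp = t_plus(500.0, 400.0, -1000.0, rho_w, 1000.0, u_tau)
ts = t_star(500.0, 400.0, -1000.0, rho_w, 1000.0, tau_w, rho_w)
```

The two scalings differ only through the density used in the velocity scale, so any spread between the \(T^+\) and \(T^*\) curves reflects the density variation across the BL.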

In the viscous sublayer (\(z^+_{xy} \lesssim 5\)) the profiles agree well with the thermal law of the wall, especially the semi-local scaled ones, while deviations in the buffer and outer layer regions tend to increase towards TDC, in agreement with Schmitt et al. (2015) and Giannakopoulos et al. (2022). As observed by Giannakopoulos et al. (2022), the curves with the semi-local formulation do not collapse and span a wider range of values than the inner scaled ones, confirming that even at high engine speeds the thermal profiles differ from those of fully-developed canonical flows (Huang et al. 1995; Trettel and Larsson 2016; Griffin et al. 2023).

Fig. 12

Thermal BL profiles on the piston at different CAD during compression for OP C (left) and E (right). Top row: inner-scaled profiles; bottom row: semi-local scaled profiles; black dashed lines represent the profiles predicted by the thermal law of the wall

4.5 Wall Heat Flux

The distribution of the average heat flux over different regions of the engine is shown for the two engine speeds in Fig. 13 at \(-15\) CAD and Fig. 14 at \(-60\) CAD. On the left side of the cylinder head (Fig. 13, top row), where the flow impinges/stagnates due to the tumbling motion, the average heat flux values are higher than on the right side, in agreement with previous studies (Schmitt et al. 2015; Giannakopoulos et al. 2022). However, the distribution of heat flux on the piston differs between the two OP and from the results in Giannakopoulos et al. (2022). The highest values for OP C are observed towards the right side and in the center of the piston, while for OP E the highest values are mainly in the middle left part. This is mostly an effect of the tumble core shifting to the left at the higher engine speed. The vortical structures that characterize the 3D motion of the tumble rolling on the cylinder wall generate a directional flow at the cylinder liner, resulting in higher wall heat flux values, as shown in Fig. 14. However, their effect on the wall heat flux is not as pronounced as that of the tumble impinging/stagnating on the piston and the cylinder head.

Fig. 13

Distribution of the average heat flux at the cylinder walls for OP C (left) and OP E (right) at \(-15\) CAD. Cylinder head view at the top and piston view at the bottom

Fig. 14

Distribution of the average heat flux at the cylinder liner and contour plot of the velocity magnitude superimposed with vector lines for OP C (left) and OP E (right) at \(-60\) CAD

In terms of global metrics, the crank-angle resolved surface-averaged heat flux depicted in Fig. 15 (top) peaks around TDC, as expected since this is the moment of peak cylinder pressure/temperature. More importantly, significant cycle-to-cycle variability of up to 8% is observed for both OP, even though no combustion takes place. To better understand this cyclic variability, the evolution of the tumble flow and its effect on the wall heat flux is examined in more detail. The tumble ratio \(TR_y\), a global measure that characterizes the development of the large-scale motion in the combustion chamber, is used to quantitatively describe the flow dynamics in the cylinder (Giannakopoulos et al. 2019),

$$\begin{aligned} TR_y = \frac{\omega _y}{\omega _{cs}}, \end{aligned}$$
(21)

where \(\omega _y\) represents the angular velocity of the solid-body flow rotating around the center of mass in the y-direction, and \(\omega _{cs}\) denotes the angular velocity of the engine crankshaft. The angular velocity \(\omega _y\) is determined by

$$\begin{aligned} \omega _y = \frac{L_y}{I_y}, \end{aligned}$$
(22)

where \(L_y\) and \(I_y\) represent the angular momentum and the moment of inertia with respect to the center of mass in the y-direction, respectively.
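Equations 21 and 22 can be evaluated on a set of discrete fluid cells as sketched below. This is a minimal illustration under stated assumptions rather than the procedure used in the study: the function name `tumble_ratio_y` and its arguments are hypothetical, the velocities are used as given (no subtraction of a center-of-mass velocity), and the sign convention is chosen so that a solid-body rotation with the velocity field \((\Omega z, 0, -\Omega x)\) yields \(\omega _y = \Omega\).

```python
import math


def tumble_ratio_y(mass, x, z, u, w, rpm):
    """Tumble ratio about the y-axis through the center of mass.

    mass : cell masses
    x, z : cell coordinates in the tumble plane
    u, w : velocity components in the x- and z-direction
    rpm  : engine speed in revolutions per minute
    """
    m_tot = sum(mass)
    x_c = sum(m * xi for m, xi in zip(mass, x)) / m_tot
    z_c = sum(m * zi for m, zi in zip(mass, z)) / m_tot

    L_y = 0.0  # angular momentum about the y-axis
    I_y = 0.0  # moment of inertia about the y-axis
    for m, xi, zi, ui, wi in zip(mass, x, z, u, w):
        dx, dz = xi - x_c, zi - z_c
        L_y += m * (dz * ui - dx * wi)      # y-component of r x (m v)
        I_y += m * (dx * dx + dz * dz)

    omega_y = L_y / I_y                      # Eq. 22
    omega_cs = 2.0 * math.pi * rpm / 60.0    # crankshaft angular velocity
    return omega_y / omega_cs                # Eq. 21


# Check with a solid-body rotation at 1.2 times the crankshaft speed.
rpm_demo = 1500.0
omega = 1.2 * 2.0 * math.pi * rpm_demo / 60.0
xs = [1.0, -1.0, 0.0, 0.0]
zs = [0.0, 0.0, 1.0, -1.0]
us = [omega * zi for zi in zs]
ws = [-omega * xi for xi in xs]
tr_demo = tumble_ratio_y([1.0] * 4, xs, zs, us, ws, rpm_demo)
```

For the solid-body test field the function recovers a tumble ratio of 1.2, independent of the cell masses and of the radius at which the cells are placed.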

The tumble ratio with respect to the y-axis \(TR_y\) is depicted in Fig. 15 (bottom row) for the 12 cycles at both OP. Tumble generation reaches its peak at maximum piston speed, around \(-75\) CAD, and subsequently decreases almost linearly. When looking at the individual cycles, it can be seen that the cycles which reach the lowest TR values around TDC are also the cycles with the highest peak values of the wall heat flux (cycles 2, 4 and 11 for OP C and cycles 7, 8 and 9 for OP E in Fig. 15). The low TR values could be indicative of an earlier onset of the tumble breakdown, possibly enhancing the transfer of kinetic energy from the large tumble structure to small-scale turbulence. The increased turbulent kinetic energy could lead to an increased convective heat transfer from the hot bulk flow to the cooled engine walls, potentially increasing the wall heat flux values.

Fig. 15

Surface-averaged heat flux along the cylinder walls (top) and tumble ratios (bottom) for the individual and average cycles for OP C (left) and OP E (right) during compression and early expansion (up to 30 CAD after TDC)

5 Conclusions

Using the workflow established in Giannakopoulos et al. (2022) and well-characterized initial conditions from precursor multi-cycle LES of the motored operation of the optical ICE at TUDa, multi-cycle DNS of the laboratory-scale engine were performed at the practically relevant engine speeds of 1500 and 2500 rpm under full-load motored operation. NekRS, a spectral element CFD solver developed to harness the computational power of GPU-accelerated high performance computing systems (Fischer et al. 2022), was extended by implementing an Arbitrary-Lagrangian–Eulerian approach for moving mesh geometries and employed in the simulations. The parallel efficiency of NekRS enabled the extension of the single-cycle DNS of the TUDa engine at 800 rpm and partial load of Giannakopoulos et al. (2022). A total of 12 cycles were computed at each engine speed, generating rich data to investigate the evolution of the flow and temperature fields. A Favre phase-averaging approach was used to account for the density variations near the wall and to extract representative statistics to study the near-wall phenomena.

The near-wall analysis confirms previous experimental and numerical findings that the boundary layers (BL) in ICE differ from canonical-flow steady turbulent BL, conditions that are commonly assumed in deriving scaling laws for wall model closures. The large-scale tumble flow generated by the intake process leads to a dynamically changing behavior of the BL, both temporally and spatially. Above the piston, the flow undergoes a deceleration-stagnation process, where the tumble vortex impinges on the piston and then accelerates as the flow diverges from the area of impact.

The analysis of the 3D motion of the tumble vortex revealed that the flow rolls off the cylinder wall, resulting in vortical structures that are positioned closer to the intake valves at lower heights above the piston and progressively shift towards the exhaust valves with increasing distance from the piston. These vortical structures appear at distances as low as 1 mm from the piston surface during later stages of compression and affect the BL, especially at higher engine speeds. As a result, these fluctuations lead to alternating-sign pressure gradients, ultimately invalidating the flow equilibrium assumption commonly used in wall modeling approaches. Furthermore, both the BL thickness and the viscous sublayer thickness were found to scale inversely with the engine speed and the crank angle, reaching values as low as 0.41 mm and 13 μm, respectively, at the highest engine speed investigated, posing a significant challenge for both numerical and experimental studies in terms of resolution requirements to properly resolve such BL.

The thermal BL were also found to deviate considerably from ideal scaling laws, even at high engine speeds. Similar to the velocity BL, it was found that the thickness of the thermal BL scales inversely with the engine speed but increases with increasing bulk temperature in the cylinder. In contrast, the thermal displacement thickness, which is sometimes used as an approximation of the thermal BL thickness, was found to decrease with increasing temperature in the bulk. Examination of the heat flux distribution confirmed the similarity between the flow and heat flux patterns and revealed regions of increased heat flux, particularly at locations characterized by a strong flow directed towards the wall caused by the tumble vortex impinging on the piston and cylinder head, or the horizontal vortical structures impinging on the cylinder liner. In addition, significant cyclic variations in the surface-averaged wall heat flux were observed for both OP. An analysis of the cyclic tumble ratio revealed that the cycles in which the tumble ratio reaches lower values near TDC, indicative of an earlier tumble breakdown, also exhibit higher surface-averaged wall heat fluxes.

In this DNS study, 12 cycles were simulated for each OP, a relatively small number compared to experimental work (Jainski et al. 2012; Renaud et al. 2018; Schmidt et al. 2023), where hundreds of cycles can be measured for statistical analysis. In addition, the engine crevice was removed from the geometry. To address these drawbacks, phase averaging was combined with spatial averaging to obtain better converged statistics and the sampling domain at the piston level was selected away from the liner where the removal of the crevice would have an effect. Despite these limitations, these first-of-their-kind simulations, leading to one of the largest databases for ICE flows, represent an important step towards the next generation of ICE simulations using GPU-accelerated HPC platforms. The continuous development of the spectral element solver capabilities, as well as an extension for low Mach number chemically reactive flows, will enable DNS of more challenging conditions, including fired engine operation using detailed chemistry. This advancement is crucial for enhancing our understanding and optimization of engine performance under various operational conditions with climate-friendly fuels and ensuring the practical applicability of the developed technologies in real-world engine designs. The approach promises not only to advance the field of ICE research but also to contribute significantly to the evolution of efficient powertrains using more sustainable fuels.