1 Introduction

The multi-relaxation-time Lattice Boltzmann method (MRTLBM) was developed by d’Humières and et al. [1, 2]. Based on statistical mechanics, the MRTLBM solves the Boltzmann equation for a fluid particle probability distribution function (PDF). The fluid particle collision process is performed in a moment space and the PDF moments are relaxed in different rates to achieve better accuracy and numerical stability [1,2,3]. All the macroscopic variables such as density, velocity, and energy are retrieved using the statistical moments of the particle PDF. This method has been demonstrated to be superior in simulation of turbulent flows over single relaxation LBM in which collisions are represented with a single relaxation rate [2, 3]. Since the Reynolds number in atmospheric turbulent flows is usually large, it is critically important to use the MRTLBM. The ABLE-LBM (Atmospheric Boundary Layer Environment-Lattice Boltzmann Model) was developed specifically for the atmospheric turbulent flow modeling and the MRTLBM was chosen for its suitability for turbulent flow [20]. Other studies [3, 4] also indicated MRTLBM has relatively lower numerical dissipation and dispersion compared to a traditional Navier Stokes (NS) solver. It is import to point out that there are other new types of LBM being developed for turbulent flow problems, such as the cascaded digital LBM [35], cumulant LBM [36], and regularized Bhatnagar– Gross–Krook LBM [37,38,39]. The LBM model has potential to achieve the goal of real-time simulations on a typical desktop/laptop with a modern GPU for a computational domain with several million nodes. There are other advantages to use the LBM, including simplicities in code development, the implementation of complex boundary conditions. Details on the history, fundamental theory, and methods of LBM are referred to review articles and books [5,6,7,8].

Traditionally, atmospheric boundary layer flows around buildings have been simulated using finite volume or finite element solvers of discretized NS equations [9,10,11,12,13,14,15,16,17]. These simulations can be categorized into RANS (Reynolds averaged Navier–Stokes) and LES (large-eddy simulation) approaches. RANS solves the Reynolds averaged NS equations for mean wind and associated turbulent kinetic energy (TKE) equation or combination of TKE and dissipation equations. In the LES approach, turbulent eddies greater than the filter length scale are resolved and only subgrid scale turbulence is modeled. The RANS requires significantly less computational resources than that of the LES. In many applications, however, the turbulence is a required simulation product so that the LES is necessary even though it requires more computing resources. The intrinsic parallel paradigm of the LBM has offered a very powerful way to speed up LES computations. Several studies have been carried out for turbulent flow modeling using the MRTLBM [18,19,20] and the results generally compare well with experiments. There are other urban simulation studies using the different types of LBMs [39,40,41, 45]. The results show that those LBMs are capable to simulate the urban turbulent flows and computation speed increased greatly using a GPU computer.

The laboratory turbulent flow data sets offer an excellent first step for validation of numerical models, as the experiments include detailed inlet boundary conditions and were carefully controlled and repeatable. Many laboratory experimental studies have been carried out for flows around multiple buildings [21,22,23,24]. There are several data sets of laboratory turbulent flows over a variety of building arrays [25,26,27,28], in which different techniques such as hot wire, particle imaging velocimetry, and laser Doppler anemometry were used for velocity measurements. We focus mainly on the Shim et al. [29] data set which used the magnetic resonance imaging technique known as MRV (Magnetic Resonance Velocimetry), for the model validation study. The MRI technique is a non-invasive and non-optical technique [29,30,31]. It provides highly detailed three-dimensional time averaged measurements of the three-component velocity fields throughout the building array domains. Although the working fluid in MRI is incompressible water, it is valid for simulating the atmospheric flow in which the Mach number is very low and the Reynolds number is the same for fully turbulent flow.

The objective of this study is to evaluate the ABLE-LBM’s capabilities for simulation of the building arrays. Specifically, we focus on neutral turbulent flows past an urban domain with some regular building arrangements. The organization of this work is as follows: the second section briefly describes the ABLE-LBM, its boundary conditions, and the turbulent parameterization used in this study. The third section presents results of the ABLE-LBM simulations of turbulent flows around the buildings and compares the model results with laboratory results of Shim et al. [29]. The fourth and final section gives a summary of the work and draws conclusions.

2 Description of the model

2.1 The multi-relaxation-time Lattice Boltzmann method

The ABLE-LBM is a MRTLBM type model with 3D lattice nodes in 19 lattice directions (D3Q19). Discretizing the Boltzmann equation both in time and space [1,2,3, 20], the MRTLBM equation can be written in vector form as follows:

$$\varvec{f}\left(\overrightarrow{x}+{\varvec{c}}_{\alpha }dt,t+dt\right)-\varvec{f}\left(\overrightarrow{x},t\right)={-M}^{-1}S\left[\varvec{m}\left(\overrightarrow{x},t\right)-{\varvec{m}}^{eq}\left(\overrightarrow{x},t\right)\right],$$
(1)

where the symbols in bold-face are column vectors with components in lattice direction α. \(\varvec{f}\left(\overrightarrow{x},t\right)\) is the fluid particle velocity PDF vector at a spatial location \(\overrightarrow{x }\)at a time t; and m and \({\varvec{m}}^{eq}\) are the velocity moment vector and its equilibrium function vector, respectively. M is the transformation matrix between f in velocity space to moment space m. S is the diagonal relaxation rate matrix for collisions in moment space, which are optimized in the MRTLBM for numerical stability [2, 3]. \({\varvec{c}}_{\alpha }\) is the discrete velocity for each D3Q19 node:

$${\varvec{c}}_{\alpha }=\left\{\begin{array}{ccc}(\text{0,0},0)& & \\ (\pm \text{1,0},0)& (0,\pm \text{1,0})& (\text{0,0},\pm 1)\\ (\pm 1,\pm 1,,0)& (\pm \text{1,0},\pm 1)& (0,\pm 1,\pm 1)\end{array}\right.\begin{array}{c}\alpha =0\\ \alpha =1,\dots ,6\\ \alpha =7,\dots ,18\end{array}$$
(2)

.

Equation (1) is solved for the particle PDF collision and advection. The macroscopic turbulent flow variables - density fluctuation and momentum fluxes are retrieved from integrating the PDF in every lattice direction:

$${\rho }^{\prime}=\sum _{\alpha =0}^{18}{f}_{a},\quad { \rho }^{\prime}\overrightarrow{u}=\sum _{\alpha =0}^{18}{c}_{\alpha }{f}_{a}.$$
(3)

The main distinction of MRTLBM is that it allows kinetic modes to relax differently so it is much more numerically stable [3]. The accuracy of the MRTLBM is second order with respect time and space [52, 32]. For a detailed description of the MRTLBM variables in Eq. (1), their derivations and transformations, readers are referred to other works [1,2,3] and our previous publication [20].

2.2 Boundary conditions

Boundary conditions (BC) are very important components for the ABLE-LBM model. The specific simulation cases are highly dependent on the BC. The inflow BC determines the larger scale weather forcing on the simulation domain, and the surface BC provides the small scale force mechanism that terrain and buildings exert on the turbulent atmospheric flows. The BCs of the ABLE-LBM are also prescribed in terms of the fluid particle PDF, i.e. the macroscopic variables such as density and velocity at the boundaries are transformed to their corresponding PDF and applied to the model. Four types BCs were used in this study: (1) the non-slip flow BCs for ground and building surfaces, north and south sides; (2) the prescribed inflow according to the experimentally measured flow profiles; (3) the precursor inflow boundary (see Sect. 3) and open BC for outflow boundary; and (4) free-slip BC at the top of model domain. Figure 1 depicts these four types of boundary conditions.

Fig. 1
figure 1

A Schematic illustration of the computational domain and boundary conditions for the simulations. The inflow boundary conditions are prescribed according to observed flow profiles, and outflow boundary conditions for wind are open, and the surface boundary conditions are no-slip for velocity. The top boundary conditions are free-slip for wind

The complex surface boundaries at the interface between the surface and buildings require special attention in this study. The Guo et al. [32] curved boundary condition was adopted and extended to ABLE-LBM [20] to treat the building surfaces. For the terrain, the ground surface grids are first divided into triangles, and each triangle is used to approximate the local ground surface. The building surfaces are treated as vertical planes and special attention is paid to the corner and edges of the buildings. The geometric parameters of the building and terrain are first computed and stored for the boundary condition of each time step. Further details of this implementation are available in the published literature [32, 20].

2.3 Subgrid turbulence model

In the LES simulations of ABLE-LBM, the subgrid turbulence is modeled using the Lilly-Smagorinsky [33, 34] turbulence model. The turbulence eddies greater than the grid scale are resolved directly from the model, while the subgrid momentum diffusion of the turbulent eddies smaller than the resolvable scales are modeled using a total effective viscosity (\({\nu }_{total}\)) that includes molecular viscosity (\(\nu\)) and turbulent viscosity (\({\nu }_{t}\)):

$${\nu }_{total}=\nu +{\nu }_{t}={\left(C\Delta \right)}^{2}\overline{S},$$
(4)

where C is the Smagorinsky constant, which is taken as 0.2, Δ the filter size that is taken as grid size, and\(\overline{S}\) is the magnitude of the filtered strain rate tensor

$$\overline{S}=\sqrt{2{S}_{ij}{S}_{ij}},$$
(5)

where \({S}_{ij}=\frac{1}{2}(\frac{\partial {u}_{i}}{\partial {u}_{j}}+\frac{\partial {u}_{j}}{\partial {u}_{i})})\). The strain rate tensor components can be computed directly from the non-equilibrium moments [19, 20]

3 Simulation cases and model set-up

The ABLE-LBM model set up is depicted in Fig. 2. The entire model domain in three dimensions (NX, NY, NZ) = (201, 201, 76). The computational domains are larger than those in the experiments to minimize possible lateral boundary effects. Except the domain size, all other details match the laboratory experiment [29]. The buildings are a uniformly spaced orthogonal array of cubic buildings with height and spacing of H with a central tall building of height 3H. The computational grid element is a cubic mesh (Δx = Δy = Δz = 1) and the building height H was set to be 8 grid lengths. Two sets of ABLE-LBM simulations were carried out for boundary layer flows in the model building array for the different wind directions. In the first experiment, the inflow wind direction was perpendicular (0°) to the building surfaces, the second experiment had a 45° wind direction with respect to the building surfaces. Using the bulk streamwise speed (Ub = 0.31 m/s), the Reynolds numbers (Re) were based on computational domain height and tall building height 34000 and 11000, respectively. In the laboratory, the working fluid was an incompressible water flow which is compatible for the LBM. Since the atmospheric boundary layer is well under a low Mach number of 0.1, compressibility effects are expected to be negligible. A detailed description of the experiment and MRI measurements are referred to the work of Shim et. al. [29]. The time step size is 0.012 seconds and the simulations were carried out for 4200 seconds.

Fig. 2
figure 2

The model set-up, including building array, wind directions, and location of the building arrays are shown in this diagram. The left panel shows building array set-up and wind direction is perpendicular (0° ) to the building frontal surface; The right panel shows the building array set-up and wind direction is oblique (45°) to the building surfaces. The cubic building height and the street width are in H, and tall building height is 3H. The domain height is 9.5H

In the large eddy simulation of the turbulent flow, one important issue is to allow the realistic inflow turbulence into the domain. It is still an active research area and there are many publications on specifying the turbulent inflow boundary [42,43,44,45,46]. We choose a simple precursor method for this simulation study because the buildings are on a flat surface. Before the simulation runs for the turbulent flow in the building arrays, an auxiliary simulation with domain of 200 × 100 × 76 in streamwise transverse, and vertical respectively was run to generate a fully turbulent flow on a flat surface and a cross-section of samples of turbulent flow velocities were stored for the inflow boundary condition. In this auxiliary run, a streamwise wind power profile with exponent n = 1/7 which was the inflow profile that the laboratory intended to produce, was used as the initial wind field and the inflow and outflow boundaries were set as cyclic boundary conditions. Roughness array elements with side lengths (2, 2, 0.1) and 2 grid separations in the y direction were set at x = 5 to generate the turbulence. This approximately mimics the experimental set-up in the flow developing section of the water channel. After 5000 time steps, a fully developed stationary turbulent flow on a flat surface was generated. The flow fields at the following 3.1 × 105 steps (1.17 h) were sampled at the transverse cross-section x = 80 in this auxiliary run and used for the inflow boundary condition. The averaged mean streamwise wind profile and the profiles of turbulence standard deviation are shown in Fig. 3. The mean streamwise flow profile agreed with the real mean laboratory inflow profile and exponential inflow profile reasonably well with maximum differences less than 5%. We are working on an algorithm to generate the inflow boundary for the situation of complex terrain using the upwind spectral property based on the observational data.

Fig. 3
figure 3

Left panel are the mean streamwise wind profiles from auxiliary simulation (black), from laboratory test (green), and 1/7 power wind profile (red). Right panel are the standard deviations representing the turbulence in the u′,v′,w′ wind components. These profiles are the averages at the sample plane at x = 80 in the auxiliary simulation for the inflow boundary conditions. Ub is the bulk velocity for the laboratory tests

4 Results and discussion

To compare with the laboratory measurement results, the following comparisons are presented in a non-dimensional form. The simulated velocity field is scaled with the same bulk velocity (Ub= 0.31 m/s) derived from the mean flow rate as the published results from [29]. We focus on the Re in laboratory tests for this validation study, although our model can achieve much higher Re (107) that dynamically is similar to a highly turbulent atmospheric boundary layer. To compare the published results of the MRI observation, we focused on regions similar to those reported in [29], with the color schemes of contour plots as similar as possible to [29].

The average flow field in the model simulation was derived from the time average of each for 1 hour. The inflow wind boundary profile was taken from stored data from auxiliary run to replicate the inflow conditions of the laboratory set-up. These profiles are including the mean and turbulences with mean wind and turbulence turbulent intensity shown in Fig. 3. The flow field reached a quasi-stationary turbulence state in building array after 10 minutes of simulation time. In the following comparison of the model to laboratory results, the averaging started after 10 minutes of simulation for 1 hour. Another reason to apply the inflow turbulence was to construct turbulent eddies for the inflow with large-scale turbulence forcing in the atmospheric boundary layer.

4.1 Comparison of horizontal wind fields with laboratory data

Figure 4 shows the time-averaged mean streamwise wind (U) fields at a vertical distance of ½ H above the surface that are comparable to the MRV measurements [29], where U was normalized by the bulk streamwise velocity (Ub) in both cases. There were reverse flows on the lee side of the buildings in both inflow wind directions. Significant differences exist between the two wind conditions in terms of the U velocity distributions for the two inflow directions. The 0° wind case showed stronger U velocity at the first two columns of the building array with speeding up at building surfaces aligned with wind direction, while the 45° wind showed that strong U wind propagated to third column of the building array with speeding up at the building corners. These patterns are in agreement with the laboratory experiment results [29]. There were significant effects of the tall buildings on the U velocity component. The tall building in both wind direction cases caused wider area of speed-up at the sides of the tall buildings. There were much stronger reversal flows at the lee sides of the tall buildings at ½H for both wind direction cases compared with lee sides of all other short buildings. The magnitude of the streamwise wind velocity also compared reasonably well with the MRI measurements. These kinds of flow patterns also reported in other laboratory results for building array studies [23,24,25,26,27,28].

Fig. 4
figure 4

Average streamwise wind velocity fields at a horizontal plane at ½H. The streamwise wind speed is normalized by the bulk wind speed. (a) MRI result for 0° wind case; (b) ABLE-LBM result for 0° wind case; (c) MRI result for 45° wind case; (d) ABLE-LBM result for 45° wind case

Fig. 5
figure 5

Average vertical wind velocity fields at a horizontal plane at ½H. The vertical wind speed is normalized by the bulk wind speed. (a) MRI result for 0° wind case; (b) ABLE-LBM result for 0° wind case; (c) MRI result for 45° wind case; (d) ABLE-LBM result for 45° wind case

Figure 5 shows the simulation results of vertical wind velocity at ½ H corresponding to the two inflow wind directions. The general patterns of the vertical velocity field compared well with the laboratory experimental results [29]. The windward sides of the buildings generally had downdraft flows due to the building blocking effect, the lee sides of the building generally had updraft flows due to lower pressures caused by recirculation patterns. The tall buildings again had much stronger updraft and downdraft motions with downdraft at the windward sides and updraft at the lee sides of the tall buildings. The strong vertical ventilation effect of tall building updraft and downdraft are also reported in other studies [23,24,25,26,27,28]. Similar to the experimental results of [29], the 45° inflow wind case showed a larger area of the vertical wind at the lee side of the tall building. Immediately at the lee side of the tall building in the 45° wind direction case, the short buildings had stronger updraft vertical wind than that of the 0° inflow wind case at ½ H. This pattern of vertical wind was also displayed in the MRV results [29].

Fig. 6
figure 6

Average vertical cross-sections of wind field (vectors and vertical wind contour). (a) MRI result for 0° wind case; (b) ABLE-LBM result for 0° wind case; (c) MRI result for 45° wind case; (d) ABLE-LBM result for 45° wind case

4.2 Comparison of vertical cross sections

The vertical cross-sections cut through the symmetry planes were compared between ABLE-LBM simulations and the MRI measurement results (Fig. 6). The vertical cross-section was cut through the streamwise flow symmetric plane as presented in Shim et. al. [29] for the purpose of direct comparisons. The averaged wind vector field at this plane compared well with the MRI measurements. For 0° wind direction, the windward side of the building tops had accelerated upward flows and followed the building top vortices. The first canyon also had a distinct vortex located inside the canyon and there was a strong downdraft flow at the windward side of tall building. At the lee side of the tall building there was a large updraft area extending to the fourth column of building array. These type of flow structures were also reported other laboratory studies [23, 25, 27, 28]. The tall building plays an important role for the strong updraft and downdraft flows. Other wind tunnel studies [21, 22] showed that updrafts were only extended to slightly higher altitudes of building roof top when they tested several uniform short buildings. In the 45° wind case, there is a similar kind of flow structure, but the updraft and downdraft winds at the tall building windward sides are not as strong as in the 0° wind case. It is likely due to the tall building in the 45° case having a reduced blocking effect with a more streamlined body shape relative the horizontal flows as show in Fig. 4. The building top vortices were slightly shorter in the 45° wind case than that in 0° wind case. These kinds of the flow structures were also confirmed in the MRI laboratory study [29].

4.3 Karman vortex street generated by the tall buildings

The presence of a Karman vortex street on the lee side of tall buildings has been reported in many references [47,48,49,50,51]. In this MRV test, the center building was tall enough and the Reynolds number based on the bulk velocity and the building width is about 3700, well in the range of the necessary condition for a turbulent Karman vortex street at the lee side. The vortices were shed at the sides of tall building due to the reversal flows generated from the flow separation at the side walls. The simulation data show evidence of a Karman vortex street at the lee side of the tall building. The vortex street is mainly located in the 3H > Z > 2H area, while the lower portion Z < 2H did not show evidence of a vortex street probably due to the large disturbances of rough short buildings at the lee side. Figure 7 displays an instantaneous horizontal velocity streamline for the plane at Z = 2.75H height, where the streamlines display a distinct vortex street pattern for both 0 degree and 45 degree wind cases. In order to characterize this flow pattern, spectral analyses were carried out for the turbulent velocity components (\({u}_{i}^{\prime}),\)in which they are defined as:

$${u}_{i}={\overline{u}_{i}}+{u}_{i}^{\prime},\quad {\overline{u}_{i}}=\int {u}_{i}\left(t\right)dt,i = 1,2,3.$$
(6)
Fig. 7
figure 7

Instantanous samples of horizontal flow pattern at Z = 2.75H height at lee side of the tall buildings for 0° degree wind case (left panel) and 45° wind case (right panel). The vortex street pattern were at both lee side of tall buildings

Fig. 8
figure 8

Spectral analysis results of time series at a lee side point at 2.5 H away (grid coordinate: 82, 100, 22) from the lee edge of the tall buildings for 0° degree wind case (left panel) and 45° wind case (right panel)

Figure 8 shows spectral plots for a point (coordinates: x = 82, y = 100, z = 22) located at the lee side of the tall buildings. The power spectra showed a peak at normalized frequency 0.2–0.3, indicating the vortices are oscillated approximately at this frequency. This frequency is similar to the result reported by Han et al. [49] in a LES simulation of urban flows with tall buildings. The coherent vortex motion contributes to the TKE and momentum flux transport of at the lee side of the tall building [49], which is also evident in our flowing TKE and turbulent momentum flux analysis. The capability to simulate these coherent vortices in the flow is important in building structure design [50] and in transport and dispersion modeling applications.

4.4 Statistical comparison of average wind in laboratory and model results

The previous figures give a visual evaluation and comparison between the laboratory and model results in several sections. In order to compare the entire results in the laboratory domain, the error statistics were computed using the standard method. The relative root-mean-square-error for horizontal (rmseh) and vertical (rmsev) winds can be defined as following equations, where the subscript m represents the model result and subscript o represent the laboratory observation, N is the total number of grid points

$$rmseh={\left[\frac{\sum _{n=1}^{N}{\left({u}_{m}-{u}_{o}\right)}_{n}^{2}+\sum _{n=1}^{N}{\left({v}_{m}-{v}_{o}\right)}_{n}^{2}}{\sum _{n=1}^{N}{\left({u}_{o}\right)}_{n}^{2}+\sum _{n=1}^{N}{\left({v}_{o}\right)}_{n}^{2}}\right]}^{1/2}$$
(7)

,

$$rmsev={\left[\frac{\sum _{n=1}^{N}{\left({w}_{m}-{w}_{o}\right)}_{n}^{2}}{\sum _{n=1}^{N}{\left({w}_{o}\right)}_{n}^{2}}\right]}^{1/2}$$
(8)

.

Since the laboratory MRI scanning grids were not the same as the numerical model grids, the laboratory data were linearly interpolated into the modeling grid for computing error statistics. The computational results are listed in Table 1. In order to evaluate the model performance in disturbed building array and atmosphere above, we separated the domain in two parts accordingly using the Z = 4H as dividing plane. Overall, the flows above the building arrays had much better comparison between the model and the laboratory tests, reflecting that the obstructed flow in the building arrays are much more difficult to simulate than the atmosphere above the urban with less affect by the buildings. Generally, the two model simulation results have similar accuracy compared with the laboratory data. The model results captured about 70% − 80% mean flow information in the building array region compared with the laboratory data (Table 1). The horizontal winds compare more closely than the vertical winds. The root-mean square errors for vertical winds are larger than those of the horizontal winds. This is probably due to the smaller average vertical winds but more frequent upward and downward fluctuations that are harder for the model to capture. Overall, there are substantial differences between laboratory tests and the model simulations, which may be caused by the error of inexact boundary conditions, spatial alignment and experimental measurement uncertainties that are challenging to prescribe.

Table 1 Error statistics of the model simulations

4.5 General horizontal flow patterns of the winds in and around arrays

The flow patterns in and around the building arrays are significantly influenced by the building array arrangement [25,26,27,28]. Since the ABLE-LBM simulations were carried out in a much larger area than the laboratory building arrays, which were conducted in a water channel and included side walls, the pattern of wind circulation around the perimeter of the simulation building arrays can be analyzed without the same lateral boundary effects.

Figure 9 shows a side by side comparison of horizontal flow fields at ½ H for the runs with different wind directions of 0° and 45°. The horizontal velocity is computed as \({U}_{h}={({u}^{2}+{v}^{2})}^{1/2}.\)First, the flow patterns for both cases showed similar behavior within array streets as reported in the laboratory study of [29]. Secondly, the flow at the outside vicinity of building arrays are different for the two wind direction cases. Both 0° and 45° wind cases showed larger wind acceleration zones at the north and south sides of building array. The 0° wind case also showed a wider streamline deflection when flows first encountered the building arrays from the west side, whereas the 45° wind case had no such widened streamline shape. The lee side of the building array has a longer wake area of reversal vortices in the 45° case. This probably is due to the collected effect building array drag forces that are different in the two wind direction cases, along with the larger separation region in the wake of the tall building in the 45° case. Third, as detailed in the experimental study [29], the horizontal flow inside the two arrays also had different flow patterns. The west to east channeling flow is relatively strong into the first two columns of the building array for the 0° case. After the first two columns, the flow shows significant north-south channeling on both sides of the building array. This kind of horizontal wind channeling was also reported in the wind tunnel study by Princevac et al. [25]. On the other hand, the 45° case depicted a zig-zag shape of the channeling in the first three columns of the buildings as the geometry forces the flow to continually move laterally around the buildings at this height. The following columns showed converging channeling flows from the array sides all the way to the array lee wake area. The tall building in both cases showed stronger blocking effects than the short buildings, which makes the channeling of horizontal winds at the sides of the tall building much stronger than those of short buildings. Right outside the building array at the north side and south side, the flows were significantly different. The local flow speeds in the 45° case were faster than that in the 0° case, and there were more significant outflows of air from the building array area in the 45° case.

Fig. 9
figure 9

Average total wind speed (Uh) and horizontal wind streamline at ½ H for entire simulation domains for 0° (left panel) and 45° (right panel) wind cases. The figure shows the different channeling patterns of horizontal winds for the two cases

4.6 The turbulence fluxes and turbulent kinetic energy

Average turbulent kinetic energy (TKE) were computed from the ABLE-LBM simulation results for analysis of the turbulence field in both wind direction cases. In the TKE computation, the instantaneous velocity components were decomposed into two parts by the Reynolds average (Eq. 6). The average TKE was computed using the following equation, where N is the total time steps in the simulation.

$$\text{T}\text{K}\text{E} =\frac{1}{N}\sum _{j}^{N}\left[{{u}_{j}^{\prime}}^{2}+{{v}_{j}^{\prime}}^{2}+{{w}_{j}^{\prime}}^{2}\right]/2$$
(9)

.

Figure 10 shows that the vertical profiles of TKE distributions at the flow symmetry planes for both wind direction cases. The TKE distributions displayed significant heterogeneity for both wind direction cases. The background TKE is around 0.01 for the flows at Re number simulated in this study. The TKE at the street cross-sections were generally greater in the upper parts of the street canyons as compared to the lower part of the street canyon. The street canyons at the frontal two columns of building array had slightly greater TKE than those columns further into the field. This kind of distribution of the turbulence is also reported in [15, 25]. At the upper levels, the effect of the tall building was much more significant than that at lower levels (Fig. 10). The tops of the lee side had greater TKE values due to the strong shear and vortex shedding from the tall building in this region of the flow. The 45° wind case had a slight longer increased TKE footprint than in the 0° wind direction case. The TKE at the building tops were also greater than other locations because the roof top vortices enhanced the turbulence. The windward and lee sides of the tall buildings also have greater TKE, indicating the downdraft and updraft circulations at these locations were comparatively more turbulent than other locations

Fig. 10
figure 10

Normalized average TKE for the two simulation cases at vertical planes cut through the flow symmetry planes. Left panel is the 0° wind direction case. Right panel is the 45° wind direction case

Fig. 11
figure 11

Normalized average vertical momentum flus \(\overline{{{u}^\prime} {{w}^\prime}}\) for the two simulation cases at vertical planes cut through the flow symmetry planes. Left panel is the 0° wind direction case. Right panel is the 45° wind direction case

Figure 11 shows the turbulent vertical momentum fluxes at the flow symmetry planes for both simulation cases. In the background flow less disturbed by the building array, there is a small downward turbulent fluxes. The \(\overline{{{u}^\prime} {{w}^\prime}}\) fluxes are generally related with the TKE values (Fig. 10), the large TKE area has stronger downward momentum fluxes. However, there were sporadic upward momentum transport at the building edges and right below the building edge at the lee side, reflecting the motion with upward vertical w’ and positive u’ component. The upward turbulent momentum flus in 0° wind case at the lee side of tall building was much stronger than that of the 45° wind case.

4.7 GPU implementation of the ABLE-LBM

As discussed in the introduction, one of the advantages of the LBM type model is the intrinsic parallelism in the modern GPU platform because the data locality is satisfied and time-stepping is explicit in the LBM algorithm. In the GPU implementation, each computational grid cell in the domain is assigned to a GPU thread. Modern GPU has several thousands of compute cores and are thus able to run thousands of threads simultaneously. The information in the grid is read, processed, and put back in the same memory by this thread. The code uses the Nvidia CUDA® (Common Unified Architecture). The 3D computation grids are mapped to 1D memory and the data access uses the array structure for memory coalescing. There are many thread blocks in the code and each thread block does the same computation. Each thread block uses local fast shared memory, avoiding unnecessary local to global memory movement. The current version GPU implementation is in a single GPU processor. We are still in the process of implementing the ABLE-LBM in a multi-GPU computation platform. Because the GPU implementation itself is a complex topic, the details on the implementation will be reported in a coming paper. Here we only report its efficiency for two cases in this paper, as indicated in Table 2. The Tesla P100 GPU (3584 cores, 16 GB memory) has more than 380x speed-up compared to a CPU (Intel Core i7 3.1 GHz). It also indicates that real-time computations are achieved for these two cases, both with 3,070,476 of nodes.

Table 2 Comparison of computational efficiency (ratios of real-time/compute-time) for the two cases using the GPUs and CPU

5 Summary and conclusions

The evaluation of ABLE-LBM using laboratory MRI measurement data has indicated that ABLE-LBM performed reasonably well in complex building array situation in the neutral flow condition. The mean streamwise velocity and vertical velocity from simulations produced the observed flow features of vortices at the front and lee of the tall buildings. As observed from the MRI data, significant differences in mean wind channeling exists in the street for the 0° and 45° wind direction cases. The ABLE-LBM captured these differences and depict in the 0° case the straight channeling at the entrance side of the array and then cross-channeling from both sides of the building array near the lee side. The 45° wind case shows more lateral circulations as also indicated from laboratory data. The ABLE-LBM produced reasonable simulation results regarding the influence of the tall building (3H height) compared with the laboratory tests. The tall building produced a distinct strong updraft on the lee side and downdraft at the windward side of the building. The extended analysis indicates that there is a Karman vortex street on the lee side of the tall building. The TKE were also computed from the model simulation results. The TKE distributions and downward momentum turbulent fluxes are consistent with shear production of turbulence from mean flow features of vortices and wave activity. The computational speed of ABLE-LBM using the GPU has shown that real-time LES simulation is realizable for a computational domain with several millions grid points. This evaluation indicated that ABLE-LBM is a reasonable and effective method for turbulent flow simulations over a complex urban building array in a neutral condition. Further evaluation of this model in other wind conditions, such as convective and stable stratification in urban, are needed in future studies.