1 Introduction

An increasing trend in the number and population of urban areas is predicted in the forthcoming decades (UN 2014). The intense urbanization leads to significant land use and land cover (LULC) change, which has remarkable influence on local climate (Solecki and Oliveri 2004; Seto and Shepherd 2009). In urban areas, natural surfaces are replaced by impervious built-up structures (e.g. road, pavement, building). The elevated fraction of artificial materials modifies the thermodynamic properties (e.g. emissivity, albedo, heat capacity) and energy balance of surface, decreases the evaporation produced by vegetation and soil, and changes the wind flow at the urban canopy level through the increased surface friction (Oke 1987). The joint effects of LULC change generate the well-documented and widely investigated urban heat island (UHI) phenomenon (Landsberg 1981; Oke 1987; Arnfield 2003). UHI is a temperature surplus in the downtown related to urban and rural areas with less artificial surface coverage and characterized by a temperature difference, based on the observations at an urban-rural station pair. The highest temperature differences (i.e. UHI intensities (UHII)) typically occur under calm synoptic conditions (e.g. anticyclonic pattern), especially during the nocturnal hours (Oke 1995). The anthropogenic heat production of human metabolism, transportation, and cooling and heating in buildings can further intensify the existing UHI (Sailor and Lu 2004). The combined effects of UHI and climate change have direct and indirect impact on energy consumption (Santamouris 2007; Kolokotroni et al. 2015), air quality (Rosenfeld et al. 1995; Akbari et al. 2001), extreme weather events (e.g. heatwaves; Ward et al. 2016), and human comfort (Watkins et al. 2007; Steeneveld et al. 2011) in urban areas. For these reasons, cities seem to particularly vulnerable to environmental challenges (Lankao and Qin 2011).

The spatiotemporal behaviour of UHI can be monitored using in situ measurements, remote sensing products, and numerical models. The advantage of numerical models is that they are able to reproduce a wide range of meteorological fields and not restricted to space or time (Hidalgo et al. 2008). Due to the broadening of computing technology, urban environments are taken into account by even more complicated canopy layer models, starting from the “bulk” approach (e.g. Liu et al. 2006) and ending with multilayer models (e.g. Martilli et al. 2002; Salamanca and Martilli 2010; Krayenhoff et al. 2015).

Canopy layer models have been coupled to numerical models (e.g. single-layer scheme of Kusaka et al. (2001) and Kusaka and Kimura (2004) and multilayer scheme of Martilli et al. (2002) and Salamanca and Martilli (2010) in Weather Research and Forecast; Town Energy Balance (Masson 2000) in MesoNH, and Met Office Reading Urban Surface Exchange Scheme (Porson et al. 2009) in Met Office Unified Model) in order to provide a sufficient tool for urban climate modelling purposes. With appropriate configurations, the coupled canopy models have the ability to reproduce most of the processes in the urban boundary layer from building to regional scale (Baik et al. 2009; Toparlar et al. 2015).

The Weather Research and Forecasting (WRF) non-hydrostatic mesoscale model (Skamarock et al. 2008) has been designed to such applications as weather prediction, regional climate, and air quality modelling. In WRF, the sophisticated urban environment is taken into account through implemented canopy models with different degrees of complexity. The first attempt to represent urban processes was a “bulk” parameterization (Liu et al. 2006), which uses predetermined values for roughness length, surface albedo, volumetric heat capacity, soil thermal conductivity, and vegetation fraction. A single-layer urban canopy model (SLUCM) has been developed by Kusaka et al. (2001) and Kusaka and Kimura (2004). This scheme treats the urban geometry as arrays of infinitely long, three-dimensional street canyons. The reflection and trapping of short- and long-wave radiation and the shadowing effect of walls are also allowed in SLUCM. An exponential wind profile is assumed in the canopy layer. Prognostic variables include the skin temperature of road, wall, and roof and emitted fluxes from horizontal and vertical surfaces. In WRF, the different SLUCM parameters are assigned to each urban land use category and listed in look-up tables. The multilayer canopy schemes of BEP (Building Effect Parameterization; Martilli et al. 2002) and BEM (Building Energy Model; Salamanca and Martilli 2010) have been linked to and incorporated into WRF since 2010. BEP helps to understand the exchange of momentum, moisture, and heat (released at different heights of buildings) within the urban boundary layer. BEM has been developed to estimate the interactions between the building interior and the surrounding atmosphere. It calculates the heat generation of occupants, and cooling and heating equipment, even on the different floors of buildings.

Of the land surface parameterizations in WRF, Noah LSM (Chen and Dudhia 2001; Tewari et al. 2004) is a frequent option to give lower boundary condition. For each grid cell, Noah computes sensible and latent heat flux, outgoing long-wave radiation, skin temperature, emissivity, and albedo as follows:

$$ X={F}_{\mathrm{veg}}\cdotp {X}_{\mathrm{veg}}+{F}_{\mathrm{urb}}\cdotp {X}_{\mathrm{urb}} $$

where X is the variable being calculated from the surface to the lowest model level, Fveg is the fractional coverage of natural surfaces, Xveg is the value of a variable in Noah for natural surfaces, Furb is the fractional coverage of impervious surfaces, and Xurb is the value of a variable in the given UCM (of the three options) for artificial surfaces. The equation suggests that SLUCM and Noah LSM are linked to WRF through the urban fraction (Furb) parameter.

A vast number of parameters are required to represent the complex urban geometry. For example, SLUCM with its medium complexity employs more than 20 urban canopy parameters (UCPs) for each land use class. In order to update the list of UCPs or generate LULC data, numerous information can be applied: satellite products (e.g. images of Landsat-8 bands with a resolution of 30 m), high-resolution LULC datasets (e.g. U.S. Geological Survey (USGS); Homer et al. 2004; Coordination of Information on the Environment (CORINE): Bossard et al. 2000), building databases from local authorities. It is noteworthy that the amount and accessibility of different datasets have been extended recently, but the quantity and quality of UCP data show huge spatial discrepancies (Chen et al. 2011).

The National Urban Database and Access Portal Tool (NUDAPT) project of Ching et al. (2009) provides WRF-compatible gridded UCP dataset for investigations in large US cities, with a horizontal resolution below 1 km. To broaden NUDAPT to global scale, the World Urban Database and Access Portal Tool (WUDAPT; Mills et al. 2015) framework has been produced. The method uses freely available data and software and is organized into three levels, depending on the richness of the information on urban geometry. The first level of hierarchy requires a LULC classification, based on the local climate zone (LCZ) scheme of Stewart and Oke (2012). In the next two levels, the WUDAPT method consists of detailed UCPs (e.g. building, impervious and vegetative cover, building height, aspect ratio, sky view factor, anthropogenic heat, surface albedo, and roughness length) for each LCZ. Currently, the classification stage (at least for the lowest process level) has been finished for over 20 cities.

The LCZ classification involves ten urban and seven non-urban LULC categories (Stewart and Oke 2012). The construction of the categories relied on the geometric (e.g. sky view factor, impervious and pervious surface fraction, height of roughness elements), radiative (e.g. surface albedo), thermal (e.g. surface admittance), and metabolic (e.g. anthropogenic heat output) properties of the surface. It is assumed that each zone (with typical diameters from 100 m to few km) has its own and individual local climate. This scheme allows a better microclimatic description of the stations on which the UHII parameter was determined (Bechtel et al. 2015). Therefore, the literary comparison of the UHI-related studies in different urban areas also gets easier.

Urban climate studies with different methodologies have been carried out to analyze the spatiotemporal variation of UHI (Cheval and Dumitrescu 2017; Ellis et al. 2017; Przybylak et al. 2017; Dienst et al. 2018; Varentsov et al. 2018). Many of the investigations have used the WRF model to simulate the alteration of meteorological variables in specific cities (Giannaros et al. 2013; Doan et al. 2016; Morris et al. 2017; Ooi et al. 2017), during heatwave and heavy rainfall events (Li and Bou-Zeid 2013; Chen et al. 2014; Zhong and Yang 2015), by using climate scenarios and projections (Argüeso et al. 2014; Fallmann et al. 2017), and associated with LULC processes (Bhati and Mohan 2015; Kaplan et al. 2017); nevertheless, quite a few have been published in terms of LCZs.

Numerical simulations were performed by Morris et al. (2016) to analyze the UHI effect in Putrajaya (Malaysia). It was highlighted that urban greenery had a great impact on thermal conditions at night. The nocturnal UHII in urbanized and highly vegetated LCZs varied from 1.9 to 3.1 °C. Brousse et al. (2016) investigated the daily temperature range (DTR) parameter with BEP-BEM schemes in different LCZs over Madrid (Spain) for a summer and a winter period. It was concluded that WRF with LCZ-based land use classification was capable to estimate the inter-LCZ temperature variability. They also found that for those grids that were surrounded by grids with same LULC classes, LCZ 2 (LCZ 8) had the highest, while LCZ 6 (LCZ 2) had the lowest DTR in summer (winter).

In Szeged, urban climate assessments have mainly been focused on in situ (Unger et al. 2015) and mobile measurements (Unger et al. 2001; Unger et al. 2010), and modelling of the future climate change (Skarbit and Gál 2016). In the study of Gál et al. (2016), observed near-surface air temperature was evaluated for the period between June 2014 and May 2015. It was concluded that UHI, on average, formed around sunset and lasted about 9-h long, while the strongest UHI occurred 3 h after sunset. In summer, the highest temperature differences between a rural station and the given LCZ were observed in compact (midrise and low-rise) classes, with an order of 3.5 °C. The mean intra-LCZ intensities in other LCZs were also measured to be around 3 °C. Contrarily, the mean UHII remained under 2 °C in winter months, with much smaller spatial variability in each class.

In this study, we aimed (i) to incorporate a Szeged-specified LCZ land use (Lelovics et al. 2014) and urban canopy parameter database into the model to give a better representation of artificial surface coverage; (ii) to predict the spatiotemporal variability of near-surface air temperature and energy budget components under a 6-day heatwave period between July 18 and 24, 2017, characterized by low synoptic wind and daily maximum temperatures around 35 °C; (iii) and to evaluate our WRF-SLUCM-LCZ scheme against the measurements of the local urban climate monitoring system.

2 Study area

Szeged (46.26° N; 20.15° E) is situated in the south-eastern part of Hungary, at the riverside of Tisza, with a population of 162,000 (Fig. 1). Several small lakes are located to the west, which may impact the microclimate of their immediate surroundings due to elevated potential evaporation in the summer period. The climate of Szeged is Dfb due to Köppen-Geiger’s classification (Peel et al. 2007), with an annual mean temperature of 10–12 °C and an annual total precipitation of 500–600 mm. Because of the relatively high frequency of summertime high-pressure synoptic pattern, the amount of annual sunshine duration is around 2000 h.

Fig. 1
figure 1

Study area with the corresponding LCZ classes and location of urban climate monitoring stations

The urban area is surrounded by a flat terrain (the mean elevation is around 80 m a.s.l.) with croplands, pastures, and deciduous forests. The total area of the city is around 281 km2, even if the rigorously taken urbanized area is much smaller (~ 50 km2). The inner city, which consists of offices, administrative and educational buildings, and apartment houses, is covered by compact LCZs (i.e. LCZ 2 and LCZ 3) (Fig. 1). LCZ 5, LCZ 6, and LCZ 9 with family houses, blocks of flats (mostly in the north), and shopping centres are found in the outer areas. The north-western parts with logistics and industrials were classified as LCZ 8.

3 Preparation of land use data and urban canopy parameters

Accurate representation of a complex urban surface is essential to estimate the thermal conditions in a densely built region (Chen et al. 2011). Most of the databases in association with LULCs and UCPs have been designed to global scale and have not the ability to describe the thermal, radiative, and geometric properties of surface in a specific urban area. Therefore, it is required to seek for alternative sources of such data. WRF includes the USGS 24-category classification (based on AVHRR satellite data) with 1-km resolution and the MODIS 20-category classification with 1-km resolution (Homer et al. 2004) as default LULC option. Additionally, the CORINE database (Bossard et al. 2000) derived from IRS P6 LIS III and RapidEye remote-sensed images, with 44 classes and a resolution of 100 m, is also a reasonable choice for many European cities. The assignment of the urban categories in CORINE was based on the spectral feature of surface not on the influence of urban geometry on microclimate. Contrarily, the LCZ concept has been developed to classify an urban area according to the thermal reaction of urban canopy on distinct built-up regimes. For this reason, the LCZ framework has widely been used for observation-based comparison of UHI (Alexander and Mills 2014; Leconte et al. 2015) and can be interpreted to state-of-art modelling studies.

For deriving the LCZ map of Szeged, three-dimensional building database, road database, aerial photographs, RapidEye image with a resolution of 5.16 m, and CORINE LULC data were employed (Fig. 2). First, we have calculated the following parameters: sky view factor (Gál et al. 2009), building height (Gál and Unger 2009), surface roughness, surface albedo, building surface fraction, impervious surface fraction, and pervious surface fraction using GIS technique. Then, the so-called lot area polygons were merged to zones with a diameter from metres to several kilometres. Finally, these zones were aggregated to unified classes representing the given LCZ (for further details, see Unger et al. 2014).

Fig. 2
figure 2

Major steps on the preparation of LULC data and UCPs in our WRF-SLUCM-LCZ modelling scheme.

It is noteworthy that the previous process was only performed for the urbanized area of Szeged. For the rest of the study area (including other cities), the CORINE LULC database was taken into account. After all, we have found six LCZs from the possible ten urban zones. LCZ 6 (open low-rise) had the largest extension, while LCZ 2 (compact midrise) and LCZ 3 (compact low-rise) spanned the lowest number of grids (Table 1.).

Table 1 Absolute and relative values of urban and non-urban land use categories in the finest (d03 with 1.5-km grid spacing) model domain. The last six classes indicate the urban LCZs

In the following step, the existed (default) UCPs have been specified for the study area, and so the new UCPs (Table 2) were ready to be updated to the urban parameter table of WRF.

Table 2 Specified values of those urban geometry-related canopy parameters that have been coupled to the LCZ-based land use classification database and integrated into the WRF-SLUCM modelling system

By far, there was a little information about the thermodynamic-related UCPs, although they play important role in energy partitioning. To address this issue, we intended to adjust some of the most relevant parameters listed in Table 3. Such small areas (around 200 m × 200 m) were assigned on Google Earth images that represent a given LCZ and contain roads and buildings. It was assumed that the dominant building materials in this region are asphalt and concrete for road, concrete and brick for wall, and tile and concrete for roof. Knowing the characteristic thermodynamic quantities (e.g. thermal conductivity, heat capacity) of asphalt, brick, concrete, and tile (or ceramic) from engineering reference tables (Wang and Kuo 2001) and the relative occurrence of a given material in each LCZs, the following simple assumption (for example, for heat capacity) was employed to the estimate the new variables:

Table 3 Specified values of those thermodynamic-related parameters of building and road facets that have been coupled to the LCZ-based land use classification and integrated into the WRF-SLUCM modelling system
$$ {C}_{{\mathrm{LCZ}}_{\mathrm{x}}}={C}_{\mathrm{asphalt}}\cdotp {M}_{\mathrm{asphalt}}+{C}_{\mathrm{brick}}\cdotp {M}_{\mathrm{brick}}+{C}_{\mathrm{concrete}}\cdotp {M}_{\mathrm{concrete}}+{C}_{\mathrm{tile}}\cdotp {M}_{\mathrm{tile}} $$

where \( {C}_{{\mathrm{LCX}}_{\mathrm{x}}} \) is the heat capacity of wall, road, or roof in a given LCZ in Joules per cubic metre per Kelvin; Casphalt, Cbrick, Cconcrete, and Ctile are the heat capacity of asphalt, brick, concrete, and tile, respectively, in Joules per cubic metre per Kelvin; and Masphalt, Mbrick, Mconcrete, and Mtile are the relative fraction of asphalt, brick, concrete, and tile coverage on wall, road, and roof surfaces, respectively. The upgraded UCPs from Tables 2 and 3 were applied to all simulations.

4 Model configuration and observational data

The simulations were performed using the WRF model version 3.8.1. coupled to the SLUCM of Kusaka et al. (2001) and Kusaka and Kimura (2004). Three one-way nested domains with a grid spacing (and grid points) of 13.5 km (80 × 75), 4.5 km (121 × 94), and 1.5 km (105 × 78) were applied (Table 4) to the integrations. The innermost domain (d03) covered a wider neighbourhood of Szeged, with a centre at 46.13° N and 20.16° E (Fig. 3). Sigma vertical coordinates were prescribed from the surface to 20 hPa. Twenty of 44 vertical levels can be found under 1.5 km in order to give a sufficient representation for the urban boundary layer. The simulation period started on 17 July 2017 (initialized at 12:00 UTC (14:00 LT)) and lasted until 24 July 2017 00:00 UTC (02:00 LT). The first day (24 h) of the output was considered as spin-up, the rest 132 h was taken into account during the analysis. Three-hour NCEP GFS dataset with a horizontal resolution of 0.25° granted the initial and boundary conditions for the simulations. We have switched the following physical parameterizations on the following: Noah land surface model (Tewari et al. 2004), Rapid Radiative Transfer Model (RRTMG) scheme for long- and short-wave radiation (Iacono et al. 2008), modified MM5 surface layer scheme (Jimenez et al. 2012), BouLac boundary layer scheme (Bougeault and Lacarrere 1989), WSM 5-class scheme (Hong et al. 2004) for microphysics, and Kain-Fritsch cumulus parameterization (Kain 2004). The cumulus convection scheme was not considered for the finest domain (d03), because the model can solve the convection explicitly at such a fine resolution (1.5 km).

Table 4 Basic information about the model setup of the simulations
Fig. 3
figure 3

Location of domains with the corresponding land use categories of finest (d03) model domain. The black dot and square in the upper and bottom plot mark the location of Szeged

The analysis period was selected according to the Weather Factor (WF) of Oke (1998). WF (ϕw) determines the potential of UHI formation under a given synoptic pattern and can be expressed as:

$$ {\varPhi}_{\mathrm{w}}=\left(1-k{m}^2\right){u}^{\raisebox{1ex}{$-1$}\!\left/ \!\raisebox{-1ex}{$2$}\right.} $$

where k is the correction factor for cloud height (Bolz 1949), m is the cloud amount in tenths, and u is the wind speed in metre per second. WF ranges from 0 (poor cooling conditions: overcast, clouds with low base, high wind speed) to 1 (excellent cooling conditions: clear sky, wind speed is under 1 m s−1). Here, we have applied the threshold of 0.8 for WF during the nocturnal hours of 2017. A long contiguous period was found on the days between 18 and 24 July. At that time, a strong high-pressure system formed over Central Europe. The maxima of near-surface air temperature raised over 31 °C on each day. The heatwave was peaked on 21 July, when the air temperature was around 35 °C. This period was ended by a cold front passed on 24 July.

In order to evaluate the performance of WRF in simulating near-surface air temperature, the observations of the local urban climate monitoring system (UCMS) were employed. For each day, the hourly means of the observed near-surface air temperature (Ta-OBS) have been calculated and compared with the modelled values (Ta-WRF). During the analysis, Ta-WRF was considered in the nearest grid to the location of the corresponding measurement site (with a given Ta-OBS) of UCMS. To carry out a more straightforward interpretation of the results, the computed metrics (e.g. statistics) have been averaged and ordered by the LCZs. Figure 4 highlights the names and locations of those sites that have been designated according to the LCZ scheme and used for the evaluation. Because of maintenance works at station 8-1, 8-2, and D-2 in 2017, LCZ 8 has not been taken into consideration in the analysis. Additionally, those model grids that were not in the same LCZ category as the given station (e.g. grids near station 3-1) were also omitted from the study. As a consequence of former studies (e.g. Unger et al. 2015; Gál et al. 2016), it was assumed that each station is representative for the corresponding LCZ. Hereinafter, we refer to the observed and modelled temperature at station D-1 (dryland, cropland, and pasture in the original (default) classification, low plants (LCZ D) in the LCZ scheme) as the rural reference value, since the microclimate of this site is not affected by urban influences (Gál et al. 2016).

Fig. 4
figure 4

Microclimatic environment of the urban (and one rural) climate monitoring sites being applied to the evaluation process. In the designation of each station, the first number refers to the LCZ in which the station is located; the second number is the ordinal number of the station

We employed different statistical measures (e.g. mean bias (MB), mean absolute error (MAE) root mean squared error (RMSE) index of agreement (IOA), hit rate (HR), and correlation coefficient (CC0) (Table 5) to assess the performance of WRF in simulating Ta with the updated LULC and UCPs. In Table 5, \( {T}_{{\mathrm{WRF}}_i} \) and \( {T}_{{\mathrm{OBS}}_i} \) are the modelled and observed near-surface temperature in the ith hour of the simulation period, \( \overline{T_{\mathrm{WRF}}} \) and \( \overline{T_{\mathrm{OBS}}} \) are the daily mean of the modelled and observed near-surface air temperature, n and m are the number of observations and model outputs (must be equal to 6 in our case), and k is a factor for the desired model accuracy (A). Two degrees Celsius was considered for A parameter (Cox et al. 1998).

Table 5 Equations of basic statistics being applied to the validation. Detailed explanation of the variables is found in the text

5 Results and discussion

5.1 Evaluation of the spatiotemporal variability of near-surface air temperature

In order to evaluate the ability of our WRF-SLUCM-LCZ system over the study area under an ideal (in terms of UHI development) synoptic condition, the hourly means of modelled near-surface air temperature were compared with the observations of the local UCMS. Figure 5 illustrates the time series of the observed and simulated Ta in different LCZs during the 6-day analysis period. The maxima of Ta-OBS ranged from 30 to 35 °C, showing a warming tendency. The differences in the minima of Ta-OBS between LCZs generated by the diversity of urban landscape (and physical settings) disappeared, approaching to the end of the period. For example, while the difference in the minimum of Ta-WRF (Ta-OBS) between LCZ 2 and LCZ 5 was around 5.3 °C (3.8 °C) on 19 July, it decreased to 1.2 °C (0.1 °C) by 23 July. The thermal contrast within the city almost completely blurred during the night of the last day.

Fig. 5
figure 5

Temporal variability of observed and modelled near-surface air temperature in different LCZs during the 6-day simulation period. This comparison covers the time period between 18 July (12 UTC; 14 LT) and 24 July (00 UTC; 02 LT).

Overall, it can be concluded that WRF-SLUCM with the modified LCZ land use classification was able to simulate the spatiotemporal variation of Ta reasonably well. Ta-OBS was consistently overestimated in the daytime. The overestimation persisted in LCZ 6, LCZ 9, and LCZ D in the night-time, while slight nocturnal underestimation occurred in LCZ 2 and LCZ 5. The highest difference between Ta-OBS and Ta-WRF manifested in the afternoon of 22 July, when a more intense low-level moisture advection and local-scale cumulus convection were produced by the initialization phase, although it was not confirmed by the observations. On the other days, the model performed satisfactorily in each LCZ.

Figure 6 shows the diurnal variations of observed and predicted Ta in different LCZs. General overestimation of Ta-OBS was identified during the daytime, with the highest magnitude in early afternoon around 12 UTC (14 LT). In LCZ 5, the overestimation shifted to an underestimation after 12 UTC. The overestimations in the daytime were particularly high in LCZ 2 and LCZ D. To understand this, it is required to consider the microclimatic environment of the stations (Fig. 4). Station 2-1 stands in the middle of a square with a higher amount of vegetation related to that was prescribed for LCZ 2. Additionally, station D-1 is close to the urbanized area (see Fig. 1); thus, Ta-WRF might be affected by the advection of warmer urban air.

Fig. 6
figure 6

Mean diurnal variations of observed and modelled near-surface air temperature in different LCZs

The daytime Ta-WRF behaved as a bimodal distribution. At first, Ta-WRF culminated around 11–12 UTC. Subsequently, the first growth phase seemed to be reduced. It was likely attributed to the appearance of convective clouds, which attenuated the incoming shortwave radiation. Around 14 UTC, Ta-WRF started to increase again, creating a second but a lower peak as earlier. The dissipation of cumulus clouds allowed Ta to grow up to 16 UTC.

In the night-time, Ta-OBS was simulated well in LCZ 2 and LCZ 5. On the other hand, LCZ 6, LCZ D, and particularly LCZ 9 were characterized by overestimations of about 2–4 °C. The overestimation appeared to be the most pronounced after sunset (around 18:30 UTC). The warm bias decreased to 0.5–1 °C by the end of the night. The nocturnal overestimation of urban canopy temperature in LCZ 6, LCZ 9, and LCZ D might be stemmed from a more significant heat release from artificial materials caused by the over-prediction of heat capacities. It must be emphasized that the determination of thermodynamic-related UCPs was based on a simple approximation utilizing the frequency of dominant urban fabrics in different LCZs, so this assumption has limitations, even though it provides a more realistic description of the effect of surface on energy exchange than the default UCPs. In addition to that LCZ 9 (sparsely built) is a transition zone between the urban and rural landscapes, thus, the consideration of the microclimate of the corresponding stations is essential. For instance, stations 9-2 and 9-4 are located at the edge of LCZ 9. Indeed, the daytime Ta-OBS in LCZ 9 can be characterized by “urban features” like Ta-OBS in other urban LCZs, while the nocturnal Ta-OBS was rather similar to that in LCZ D. Ta-WRF, however, has not varied significantly from the “urban pattern” in both the night-time and daytime, causing the inconsistencies in question.

The relatively high night-time biases in LCZ 9 were also reported in the study of Richard et al. (2018), who modelled the spatiotemporal distribution of Ta over Dijon (France) for July of 2015. They found a mean cold bias of 3–4 °C, with the larger uncertainties in the nocturnal hours. It was also similar that at 20 UTC, when the UHI effect started to enhance, LCZ 2 (LCZ 9 and LCZ D) showed the highest (lowest) Ta-WRF. On the other hand, Ta was simulated to be 1–1.5 °C higher in LCZ 9 than in LCZ 6, which was not the case for Szeged. It might be the consequence of the difference in the UCP datasets applied to the numerical integrations.

For assessing the daily fluctuation of Ta, the daily temperature range (DTR) parameter was calculated. DTR, the difference of maximum and minimum temperature, characterizes the ability of surface to cool down and heat up the overlying canopy layer and usually larger over natural surfaces as over densely built landscapes. The observed DTRs (DTROBS) ranged from 12.8 to 17.4 °C (Fig. 7). The highest DTROBS occurred in the rural LCZ 9 because of a large cooling potential of the near-surface air layers at night. For similar reasons, the second largest DTROBS (15.4 °C) was in LCZ D. Among open categories (LCZ 5 and LCZ 6), only a difference of 0.6 °C in DTR was observed. As a result of a relatively high nocturnal Ta-OBS, the lowest DTROBS evolved in LCZ 2.

Fig. 7
figure 7

Mean observed and modelled daily temperature range of near-surface air temperature in different LCZs

In LCZ 5 and LCZ 6, the simulated DTR (DTRWRF) was in line with DTROBS. The daytime overestimation of Ta-OBS compensated the night-time overestimations; therefore, DTRWRF has not changed substantially related to DTROBS. The overestimations of DTROBS with around 1–2 °C in LCZ 2 and LCZ D was the consequence of the warm bias of Ta-OBS in the daytime. As we discussed earlier, Ta-OBS around the stations in LCZ 9 was basically influenced by rural effects, particularly in the night-time. In turn, Ta-WRF resembled rather the “urban pattern” (e.g. in LCZ 6) than the “rural (nocturnal) pattern” (e.g. in LCZ D). It must be underlined again that LCZ 9 is a transition between LCZ 6 and LCZ D. The microclimatic investigation of a particular site can be decisive in determining whether urban or rural effects dominate the thermal conditions nearby.

In the study of Brousse et al. (2016), DTR was calculated for all urban climate zones (i.e. LCZ 1––LCZ 10) of Madrid by using the simulations of WRF with multilayer canopy schemes (i.e. BEP and BEM). Based on the mean of a 4-day analysis period, DTR was ranged between 6.5 and 13.5 °C. If we only consider those LCZs that can also be found in our analysis, the lowest DTR occurred in LCZ 2 (7 °C), while the largest ones were in LCZ 6 and LCZ 9 (around 10.5 °C). This tendency was captured by our simulations as well. An additional common feature was that only a small deviation of DTR prevailed between LCZ 6 and LCZ 9. On the other hand, the significant contrast (in terms of urban fraction and other physical properties) between LCZ 2 and LCZ 9 was not simulated perfectly with our model configurations. It can be due to the difference in the complexity of the selected urban schemes and canopy parameters. The multilayer BEP-BEM not only takes the external properties of a building (i.e. the thermodynamic variables of wall and roof) into account but also considers the internal factors (e.g. heat generation by A/C systems and indoor electronic equipment). With the road width of 12.7 m (10 m) and building width of 17.5 m (10 m) in LCZ 2 (LCZ 9), Madrid had higher and wider urban canyons in contrast with Szeged (Table 2). (The urban fractions were nearly the same.) It means that the extension of the active surface that absorbs, reflexes, and emits energy during the entire day was larger in Madrid. Consequently, for example, the emission of stored heat could be more pronounced at night, resulting in higher nocturnal temperatures and lower DTR in LCZ 2. Additionally, the topographic effect played pivotal role in Madrid, while it could entirely be ignored in Szeged.

In Fig. 8, the accuracy of Ta-WRF and Ta-OBS was compared through the coefficient of determination (r2). In principle, LCZs with higher urban fractions (e.g. LCZ 2 and LCZ 5) were typified by higher r2 than those that are located near to the outskirts of the city (e.g. LCZ 9 and LCZ D). r2 were similarly around 0.87 in most of the LCZs, which denotes robust connection between the observed and modelled Ta. In LCZ 9, r2 decreased to 0.84 because of the remarkable night-time overestimation of Ta-OBS that has already been discussed at Figs. 6 and 7. Some outliers can be detected near to the point of (Ta-OBS, Ta-WRF) = (32 °C, 25 °C). It was due to an excessive and erroneous cumulus formation on 22 July, which was being generated wrongly during the model initialization.

Fig. 8
figure 8

Linear regression analysis of observed and modelled near-surface air temperature in different LCZs. r2 is the coefficient of determination between the variables

Table 6 reveals further statistics (listed in Table 5) of Ta-OBS and Ta-WRF averaged over the 6-day integration period. The 6-day means of Ta-OBS showed high correlations with the built-up fraction and compactness (LCZ 5 < LCZ 2 < LCZ 6 < LCZ 9 < LCZ D). Albeit with lower differences (0.7 to 2.3 °C), very similar sequence can be set up for Ta-WRF (LCZ 5< LCZ 2 < LCZ 9 < LCZ 6 < LCZ D). The absolute errors (AEs) of Ta-OBS and Ta-WRF stayed below 1.5 °C in most of the urban zones (i.e. from LCZs 2 to 6). The mean AEs and biases of around 2 °C, however, underpinned the large uncertainties in LCZ 9 and LCZ D. The best overall performances evolved in LCZ 5 and LCZ 2, with biases of 0.22 °C and 0.98 °C, respectively. IOAs and SCCs of 0.9 for Ta-WRF are close to those statistical metrics that were reported by former studies. For example, Giannaros et al. (2013) simulated UHI over Athens and found mean AEs around 0.93 during a 2-day summertime period. Regarding the calculated SCCs, WRF with the SLUCM canopy scheme performed most accurately in LCZ 3 (0.95). There were slightly worse SCCs in LCZ 5 (0.88) and LCZ 9 (0.91). Similar consequences were drawn in the study of Hammerberg et al. (2018) in which the spatiotemporal distribution of Ta in Vienna was investigated. Ta-WRF with a WUDAPT-based LULC database was characterized by RMSEs around 2.45 °C. Overall, there were larger simulation errors in the night-time than in the daytime. This temporal dichotomy was also a feature in our study.

Table 6 Statistics of Ta in different LCZs. MeanOBS, 6-day mean of Ta-OBS; MeanWRF, 6-day mean of Ta-WRF; Bias, 6-day mean of the difference of Ta-OBS and Ta-WRF; AE, 6-day mean of absolute error; RMSE, 6-day mean of root mean squared error; IOA, 6-day mean of index of agreement; SCC, 6-day mean of spatial correlation coefficient

5.2 Evaluation of the spatiotemporal variation of urban heat island

We illustrated the spatial distribution of the modelled urban heat island intensity (ΔTa-WRF) over Szeged and its surrounding for three distinct times (06, 14, and 20 UTC) on each day (Fig. 9). ΔTa at 06 UTC (20 UTC) was selected to describe UHI after the sunrise (the sunset). ΔTa at 06 UTC represents UHI in the early afternoon. ΔTa-WRF was computed by subtracting the hourly mean of Ta-WRF at the station D-1 monitoring site from all other Ta-WRF grids.

Fig. 9
figure 9

Spatial distribution of modelled urban heat island intensity at 06, 14, and 20 UTC on each day of the simulation period

It is clearly seen that UHI was particularly strong after the sunset. On the other hand, UHI was simulated to be negative at either 06 UTC or 20 UTC. After the sunrise, the urban area was 1–3 °C colder than the rural surroundings. Heading to the end of the heatwave period, this urban cool island effect was even more pronounced, with the largest negative ΔTa-WRF in the downtown and northern parts of Szeged. The presence of the urban cool island can be attributed to the shadowing effect of the buildings. Then, a higher amount of solar energy converts to heat energy over the rural parts; hence, these areas can warm faster than the built-up ones. As the solar elevation increases, the attenuation of the building becomes less significant. It was verified by the simulations: ΔTa-WRF limited to − 1 °C at 14 UTC and the spatial variability decreased further over the city. After the sunset, ΔTa-WRF turned to 3–4 °C, suggesting higher Ta-WRF in the urban grids. The largest positive occurred in the centre of the city, where LCZ 2 and LCZ 5 LULC categories are found. In the outer parts of Szeged (classified as LCZ 6 and LCZ 9), ΔTa-WRF reduced by 1–2 °C, although not as much as expected. This possible positive bias could stem from the over-prediction of Ta-WRF in LCZ 9. At the last night of the simulation period, intense wind has started to flow from the northwest, causing the dislocation of heat island to the southeast and the corresponding positive biases. The model simulated a significant hot (cold) spot about 20 km northwest of the city centre, which was attributed to the Lake Fehér (with a total area of 14 km2). The lake and its environment seemed to be colder in the daytime and warmer in the night-time, due to the higher heat capacity of water.

Figure 10 a represents the mean diurnal variation of ΔTa-OBS and ΔTa-WRF in different LCZs. Here, ΔTa-OBSTa-WRF was calculated after the same procedure as previously and by considering only those stations that are illustrated in Fig. 4. It is confirmed that WRF underestimated ΔTa-OBS in the vast majority of the day. Another important feature is that the inter-LCZ variability in the simulations was less pronounced than in the observations: for example, a difference of 1 °C in ΔTa was predicted between LCZ 2 and LCZ 9 around 20 UTC, though this value was observed to be 4 °C by the UCMS (naturally, ΔTa was higher in LCZ 2 at that time). Overall, the model showed weaker negative heat (cool) island during the entire period. The largest mean overestimations of 1–1.5 °C evolved around 05–06 UTC.

Fig. 10
figure 10

Mean diurnal variation (a) and mean, maxima, and minima (b) of observed and simulated urban heat island intensity in different LCZs

To understand the background of the consistent daytime underestimations and nocturnal overestimations of ΔTa-OBS, it is worth to overview the extrema of ΔTa-OBS and ΔTa-WRF. The maxima of ΔTa-OBS were underestimated in most LCZs, in particular in LCZ 5 (Fig. 10b). The underestimations were the highest in LCZ 2 and LCZ 5, with about 2 °C. The minima of ΔTa-WRF in LCZ 2 and LCZ 5 were simulated with values around − 0.5 °C, which is 1–1.5 °C higher than the observed ones. In LCZ 6, the biases decreased to 0.5–1 °C in both the daytime and night-time. In LCZ 9, the maxima of ΔTa-OBS were simulated properly, though the minima were predicted with a bias of 1 °C. WRF was able to capture that the maxima of ΔTa-OBS in LCZ 5 were higher than those in LCZ 2, in spite of the higher urban surface fraction of LCZ 2. The lower minima of ΔTa-OBS in LCZ 2 related to LCZ 5, however, were not simulated accurately.

As we have seen in Fig. 10a, the maxima of both ΔTa-OBS were created at night (around 20 UTC); thus, the exact prediction of ΔTa in that time of the day is demanded. It was concluded earlier that the night-time Ta-OBS was overestimated in LCZ 6, LCZ 9, and LCZ D, but it was simulated with low biases in LCZ 2 and LCZ 5. Consequently, the modelled lower cooling potentials near to the rural reference station could have resulted in the strong cold biases of ΔTa in LCZ 2 and LCZ 5 and with less magnitude in LCZ 6. Since the overestimation of the night-time Ta-OBS was higher in LCZ 9 than in LCZ D, the maxima of ΔTa-WRF were in agreement with those of ΔTa-OBS.

The corresponding overestimations of the negative UHI, when the near-surface air layers over the urban areas warming slower than over the rural ones, can be determined by comparing the observed and modelled temperature gradients (i.e. cooling/warming rate in °C h−1) of the different LCZs. Before the observed peak time of the cool island (before 05 UTC), the model indicated low negative biases of Ta-WRF in LCZ 2 and LCZ 5 and higher positive biases in LCZ 6, LCZ 9, and LCZ D. As a consequence of the observations, the cooling of Ta-OBS shifted to a warming around 03:30 UTC. At 04 and 05 UTC, the gradient was the highest (lowest) in LCZ D (LCZ 2) with the values of 1.1 °C h−1 (0 °C h−1) and 4.4 °C h−1 (1.4 °C h−1), respectively. This contrast remained until 06:30 UTC. In the simulations, the timing of the change from cooling to warming phase and the finishing of negative ΔTa were captured perfectly. On the other hand, the magnitude of the warming in WRF was not consistent with the observations. The warming rate in LCZ D (1.6 °C h−1) was only about 1 °C h−1 higher at 04 UTC than that in other LCZs. One hour later, it increased to 2.6 °C h−1, but the mean difference decreased to 0.3 °C h−1. Therefore, the underestimation of the warming potential around the rural grid point was a remarkable impediment in the accurate representation of modelled urban-rural contrast.

5.3 Analysis of the mean daily variation of surface energy budget components

The appropriate representation of the surface energy budget (SEB) is a key factor in predicting the thermal conditions in the urban canopy. The formula of SEB can be given as:

$$ {Q}^{\ast }+{Q}_{\mathrm{F}}={Q}_{\mathrm{H}}+{Q}_{\mathrm{E}}+{Q}_{\mathrm{G}}\left(\mathrm{W}\ {\mathrm{m}}^{-2}\right) $$

where Q* is the net all-wave radiation, QF is the anthropogenic heat flux, QH is the sensible heat flux, QE is the latent heat flux, and QG is the ground heat flux. Bowen ratio (β) can be expressed as:

$$ \beta =\frac{Q_{\mathrm{H}}}{Q_{\mathrm{E}}} $$

The different components of SEB have been calculated and averaged over the whole simulation period (Fig. 11). These modelled components cannot directly be validated due to the lack of observations, although the evaluation of Ta-WRF can provide indirect information about the accuracy of the components. During the analysis, the anthropogenic heat term was neglected.

Fig. 11
figure 11

Mean diurnal variation of modelled energy budget components and Bowen ratio in different LCZs

Q* varied from − 70 to 650 W m−2. Only a small distinction of around 10–15 W m−2 was identified within the urban zones in both the daytime and night-time. Q* in LCZ D was 30–40 W m−2 lower in the daytime and 20–30 W m−2 higher in the nocturnal hours than that in other LCZs. Such differences in Q* between urban and rural territories stemmed from the outgoing long-wave part of Q*, which is proportional to the surface (skin) temperature (Ts). ΔTs between areas with more and less built-up (e.g. surface urban heat island) were particularly greater in the daytime, governing Q* in the above manner. In the shortwave part of Q*, only a minor contrast existed among impervious and vegetated surfaces.

The daily course of QG showed a significant spatiotemporal variation, particularly in the daytime. QG covered a spectrum between − 450 and 50 W m−2 and correlated firmly with the impervious (urban) fraction of surface. Besides, higher heat capacity and lower thermal conductivity also enhanced the contribution of |QG| in SEB. QG has become negative in the daytime between 04:00 UTC and 17:30 UTC. The negative sign refers to the storage of incoming radiation in the surface materials. Due to the much lower heat storage in LCZ D than in other LCZs, the maximum of |QG| (− 50 W m−2) was about nine times lower in the rural zone than, for example, in the urban LCZ 2 (− 450 W m−2). In other words, the surface in LCZ 2 retained a greater amount of the incoming solar energy around 11:00 UTC, while this effect was less important in LCZ D. QG in LCZ 5 and LCZ 6 possessed similar characteristics during the whole day, with the lowest values of − 300 W m−2. In LCZ 9, (with an urban fraction of 0.2), QG decreased further to – 150 W m−2. During the night-time, nearly the same amount of stored heat released from the surface to the overlying atmospheric layers. In LCZ D, however, the nocturnal QG was 5–10 W m−2 lower than in the other LCZs. All in all, the differences in the urban (and vegetation) fraction between urban LCZs had less influence on QG in the night-time than in the daytime.

Bowen ratio (β) describes some characteristics of the turbulent heat transfer in the boundary layer. In urban areas, where evaporation is low, the heat is mostly transferred as sensible heat; therefore, β > 1. On the other hand, the higher amount of vegetation evaporates more water in rural areas, increasing the latent heat over the sensible heat (β < 1). This tendency is clearly seen in Fig. 11. The greatest β occurred in LCZ 2, with a maximum of 8. In the rest of the LCZs, β stayed below 3, with the highest values in LCZ 5 and LCZ 6. In LULC categories with significant vegetation coverage (LCZ 9 and LCZ D), β decreased below 1 in some stages of the day. Surprisingly, the mean β in LCZ 9 was slightly lower than that in LCZ D. It is quite likely that β was generally overestimated in the night-time (due to the underestimation of the latent heat flux), especially in LCZ 2. One of the potential explanations can be the exclusion of anthropogenic-related water treatment processes, for example, the irrigation of urban parks and gardens and the evaporation of engineered pavements. The determination of anthropogenic heating requires detailed gridded data on human metabolism, transportation, and other energy consumption activities. For this reason, the anthropogenic heat term of SEB is always difficult to estimate and often being neglected from the urban climate studies.

6 Summary and conclusions

A local climate zone (LCZ)–based land use/land cover classification with the corresponding urban canopy parameters of the single-layer urban canopy model (SLUCM) has been integrated into the Weather Research and Forecasting (WRF) mesoscale numerical model in order to simulate the spatiotemporal behaviour of near-surface air temperature (Ta) and urban heat island intensity (UHII) in Szeged, Hungary, under a 6-day heatwave period from July 2017. We have validated the above meteorological parameters against the observations of the local urban climate monitoring system. In this study, our ambition was to test the WRF’s ability in estimating Ta with an adjusted LCZ-based surface database. For this purpose, 17 urban geometry- and thermodynamic-related parameters of SLUCM were determined for each LCZ. Then, these parameters have been incorporated into the physical modules of WRF, being able to start the numerical integrations with a more appropriate static data.

Overall, WRF with the adjusted land use database and the static surface information had the ability to estimate Ta and UHII with reasonable accuracy both in space and time. According to the calculated statistics of Ta (e.g. absolute error (AE), spatial correlation coefficient (SCC)), the model performed satisfactorily in each LCZ, particularly in LCZ 5 and LCZ 6, with mean AEs of 1.3 °C and 1.8 °C and mean SCCs of 0.95 and 0.93, respectively. Larger biases of Ta occurred in LCZ 9, with AE of 2.7 °C and SCC of 0.91. General overestimations in observed daytime Ta were noticed in all LCZs. At nights, there were no clear tendencies: in LCZ 2 and LCZ 5, Ta was slightly underestimated, while in LCZ 6, LCZ 9, and LCZ D, overestimations were emerged.

With the daily temperature range (DTR) of Ta, the warming and cooling potentials of surface over different urban landscapes (i.e. over different LCZs) were intended to be expressed. The highest DTR (16.3 °C) was predicted in LCZ D, while the lowest DTRs occurred in LCZ 2 and LCZ 5, with values around 14–15 °C. Only a small difference in DTR (0.6 °C) was found among the “open” (e.g. LCZ 5 and LCZ 6) land use classes. The observed DTR was underestimated in LCZ 9 and overestimated in LCZ 2 and LCZ D. In LCZ 5 and LCZ 6, the simulated DTR showed good agreement with the observations.

We have calculated the urban-rural contrast of Ta to analyze how the UHII parameter sensitive to the LCZs with different physical settings is. Both the magnitude and shape of UHI were simulated reasonably well. The largest night-time UHII was predicted over the centre of the city (in LCZ 2 and LCZ 3), with the values of around 2–3 °C. Heading to the outer areas of Szeged, the overestimation of UHII increased gradually. There was not any considerable difference in predicted and observed UHII between the urban and rural areas in the daytime. This was not case for the maxima of UHII that were primarily underestimated in LCZ 2 and LCZ 5, with about 2 °C. WRF was not able to catch the urban cool island accurately: the minima of UHII were overestimated in each LCZ. It was due to the underestimation of the warming rate of observed Ta around the sunrise, when the simulated Ta should have increased faster in the rural areas than in the urban territories.

Some components of the surface energy budget (SEB) were analyzed to shed light on the differences in Ta between urban and rural zone(s). The net all-wave radiation showed a little spatial fluctuation. The small distinctions of 20–30 W m−2 evolved due to the spatial inhomogeneity of skin temperature that governs the outgoing long-wave rotation segment. WRF simulated a relatively low daytime heat storage in LCZ D (with a maximum of around − 40 W m−2) and notably higher values in the urban zones (with peaks ranged from − 150 to − 400 W m−2). In the urban LCZs, the sensible heat flux was about two to three times larger than the latent heat flux. The Bowen ratio increased further in LCZ 2 to around 6–8 (here, the vegetation fraction was 10%), which implies particularly low evapotranspiration over this area. During some stages of the night in LCZ 9 and LCZ D, the latent heat flux exceeded the sensible heat flux.

It can be concluded that WRF with the above alterations (i.e. the introduction of a LCZ-based static dataset) is a valuable tool to predict the spatiotemporal variation of the thermal field in the urban canopy, although there may be certain shortcomings. One of the major factors in which this modelling framework should develop is the quality of the input meteorological data. Consequently, as a next step, we intend to assimilate the observations of our local monitoring system into the model, instead of using the NCEP GFS data, wherewith a more proper representation of the initial meteorological field may be achieved. On the other hand, the collection of UCPs must be broadened to apply other WRF canopy schemes (i.e. BEP or BEM) with larger complexity. The future investigations should also contain such non-ideal synoptic situations (e.g. extreme precipitation event) when the physical configurations can further be tested. Additionally, we would like to improve the thermodynamic representation of LCZ, particularly in LCZ 9, by using similar methodology but different data sources (for example, orthophotos with very fine spatial resolution).