1 Introduction

In spite of the ongoing climate change and increasing temperatures, cold extremes have had far-reaching impacts on human society. These include effects on public health (Kyselý et al. 2009; Hajat and Gasparrini 2016), economic losses in transportation (Vajda et al. 2014), agriculture and forestry (frost impacts early or late in growing seasons, Snyder and de Melo-Abreu 2005; Ma et al. 2019), and energy and infrastructure (Añel et al. 2017). The changes of climate observed over the past several decades have markedly impacted both natural and human systems around the world (IPCC 2022). The observed temperature increase is the most evident manifestation of current climate change. Krauskopf and Huth (2020) concluded that Europe is warming by 0.21 °C per decade, with the most pronounced warming in the winter season (0.35 °C per decade). The warming contributed to a change in the number of cold days and cold spells in central Europe (Tomczyk et al. 2019a). Continued emissions of greenhouse gases are expected to lead not only to a further increase in global temperature but also to changes in all components of the climate system (IPCC 2022). Therefore, the severity of extreme cold events may not decrease despite the growing global temperature (Kodra et al. 2011; Cohen et al. 2020), as future changes in atmospheric circulation may considerably modify regional climates (Belleflamme et al. 2015; Horton et al. 2015; Ozturk et al. 2022). To understand these changes, climate models have been developed.

The reliability of models’ simulations has been, however, considerably limited by the longstanding issue that they often fail to accurately reproduce various aspects of the historical climate (Vautard et al. 2021; Latonin et al. 2021). These limitations are further exacerbated when shifting focus from the global scale and long-term averages. Regional climate models (RCMs) have become a valuable tool in analysing regional-scale climate by bridging some of the limitations of global climate models (GCMs) related to their low spatial resolution. RCMs, however, tend to inherit the inadequacies of their driving GCMs (Plavcová and Kyselý 2012; Ibebuchi 2023). Analysing the extent to which circulation biases are generated by RCMs and to which they propagate from the driving data is an important step toward understanding and dealing with the primary sources of model errors.

Inadequacies in simulations of historical climate (and thus also uncertainty of model projections) come to light more prominently when looking at more specific climate features, such as the probability and magnitude of extremes (Christensen et al. 2008; Maraun 2012; Vautard et al. 2021). Because the study of extreme weather and climate events with high impact on ecosystems and human society is one of the key tasks of climatology, validation of climate models is a vital step that precedes the assessment of future changes of extreme events. Effects and driving mechanisms of high-impact summer weather (e.g., tropical cyclones, intense thunderstorms, heat waves, and droughts) have been widely studied in the context of rising global temperature (e.g., Russo et al. 2015; Orth et al. 2016; Rousi et al. 2022). Despite their marked effects on various sectors of societal and natural domains, winter extremes (e.g., cold spells, heavy snowfalls, severe windstorms, late frosts) have so far received comparatively less attention (IPCC 2014, 2022). We focus on central Europe since it is a region of economic and agricultural importance, as well as a region of comparatively large uncertainties regarding climate change projections (Evin et al. 2021), since it lies between two hotspots of climate change in the form of the Arctic and the Mediterranean regions (Tuel and Eltahir 2020; Ozturk et al. 2022). We aim to address the existing knowledge gap by focusing our research on the behaviour of cold days (CDs) over central Europe.

Because CDs are relatively tightly linked to the synoptic-scale atmospheric circulation over Europe (e.g., Lhotka and Kyselý 2018; Tomczyk et al. 2019b; Plavcová and Kyselý 2019), we evaluate these links in order to study if their inadequate reproduction can explain biases in simulations of CDs in current RCMs. We then analyse whether and how the links between CDs and atmospheric circulation change in projections of future climate. We use three RCMs from the EURO-CORDEX project (Jacob et al. 2014) driven by three GCMs, which makes it possible to evaluate the extent to which the model biases are generated by the RCMs and to which they propagate from the driving data. The ability of models to realistically reproduce the links within the climate system is an important step forward in further improving the models and enhancing the credibility of climate change scenarios based on climate model simulations.

The paper is structured as follows: Data and methods are introduced in Sect. 2. In the Results section, we first analyse the ability of RCMs to reproduce CD statistics (Sect. 3.1). We then introduce circulation types identified over the central-European (CE) domain and their main characteristics (Sect. 3.2), as well as links between atmospheric circulation and CDs (Sect. 3.3) in the historical period. In Sect. 3.4, we attribute model biases to the reproduction of atmospheric circulation and to simulated links between circulation and CDs in the historical climate. Changes of CD characteristics and their attribution to differences in projections for the end of the twenty-first century are analysed in Sects. 3.5 and 3.6, respectively, for two different emission scenarios. Discussion and Conclusions follow in Sects. 4 and 5.

2 Data and methods

2.1 Data

The RCM data analysed in this study were obtained from the EURO-CORDEX project database (Jacob et al. 2014; Giorgi and Gutowski 2015). We investigated an ensemble of 3 RCMs driven by 3 GCMs resulting in a total of 9 simulations with a spatial resolution of 0.11° (see Table 1). The Ensemble mean (ENS) was calculated as the average of these 9 simulations. Daily data on minimum temperature (TMIN), mean cloud cover, and mean sea level pressure (SLP) were used for analysis.

Table 1 EURO-CORDEX regional climate model simulations and their acronyms used in the study

Observed TMIN values primarily come from the high-resolution European gridded dataset (E-OBS version 23.1e; Cornes et al. 2018), and for comparison purposes, data from the ERA5 reanalysis (Hersbach et al. 2018, 2020) were also included. Observed SLP and cloud cover data were obtained from ERA5. Gridded observed datasets were used to validate the model outputs since they allow for a more accurate comparison of areal-averaged values than isolated meteorological station measurements.

The study focuses on the extended winter season from November to March, since low temperature extremes rarely occur outside this period. Additionally, the links between atmospheric circulation and temperature (or other meteorological variables) differ between cold and warm parts of the year in central Europe (Plavcová and Kyselý 2011).

We analysed model simulations for a historical period of 42 years (1979−2020) and projected simulations for the end of the twenty-first century (2070−2099). The starting year of 1979 was determined based on the availability of ERA5 data at the beginning of our project. The historical runs of the RCMs were used for the period 1979−2005, while projections using the RCP8.5 emission scenario (van Vuuren et al. 2011) were used for the period 2006−2020. According to IPCC (2022), differences between various RCP scenarios are minimal until 2020, and the RCP8.5 scenario corresponds relatively well to observed emission trends. This combination of control and projected simulations enables a more robust validation of the models. For the future period of 2070–2099, two emission scenarios were employed: the moderate RCP4.5 and the least optimistic RCP8.5.

2.2 Area under study

CDs were evaluated in the CE region, which spans from 48° to 52° N and 10° to 19° E (Fig. 1, solid-lined box). This region, covering approximately 450 × 650 km, was specifically chosen due to its relatively homogenous climate resulting from the absence of high-altitude mountain ranges. Within this region, there are 1903 grid points in the rotated grid of RCM outputs of 0.11° (~ 12.5 km) resolution. The regular 0.1° grid of the E-OBS dataset consists of 3600 (90 × 40) grid points, while the ERA5 dataset at the resolution of 0.25° contains 629 (37 × 17) grid points within the study area. Regardless of these differences, the datasets differ only negligibly (by less than 3 m) in their spatially averaged elevation.

Fig. 1

Area under study for evaluating cold days (solid-lined box) and the central European domain used to characterize atmospheric circulation (broken-line box)

2.3 Definition of cold days

CDs are defined using a modified extremity index proposed by Lhotka and Kyselý (2015a), which takes into account the spatial extent and severity of extreme cold events. The definition of CDs involves two steps. First, for each individual grid point, we calculate the 5th percentile of seasonal TMIN distribution (TMINp5). Second, a day is identified as a CD if the spatial average of its gridded TMIN deviations from TMINp5 is negative. This means that a CD occurs only when substantial parts of the CE region experience TMIN below their respective TMINp5, or when there are sufficiently large negative deviations in a smaller area.
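The two steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `detect_cold_days`, the `(n_days, n_points)` array layout, and the unweighted spatial mean (no area weighting of grid cells) are assumptions made here.

```python
import numpy as np

def detect_cold_days(tmin):
    """Flag cold days (CDs) in a (n_days, n_points) array of winter TMIN in degC.

    Assumption: every grid point gets equal weight in the spatial average.
    """
    tmin = np.asarray(tmin, dtype=float)
    # Step 1: 5th percentile of the seasonal TMIN distribution at each grid point
    tmin_p5 = np.percentile(tmin, 5, axis=0)
    # Step 2: spatial average of the deviations from the local threshold, per day
    mean_deviation = (tmin - tmin_p5).mean(axis=1)
    # A day is a CD if the spatially averaged deviation is negative
    return mean_deviation < 0, tmin_p5

# Toy input: 100 days, 3 identical grid points with TMIN = 0, 1, ..., 99 degC
toy = np.tile(np.arange(100.0)[:, None], (1, 3))
is_cd, p5 = detect_cold_days(toy)
```

Because the criterion is a spatial mean of deviations rather than a count of grid points, a few strongly negative deviations can outweigh many slightly positive ones, which is the behaviour described in the last sentence of the definition.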

The selection of the TMINp5 threshold was made to capture only extreme cold weather events, but still in a sufficient frequency for proper statistical analysis. To account for the biases present in the RCMs (Fig. 2), we calculated the TMINp5 threshold independently for each dataset. As a result, the CD climatology and its links to circulation are not affected by RCM biases, and the datasets can be compared accordingly. Considering that all models project a strong warming trend in winter TMIN over the CE region (around 3°C for mean TMIN, and even more for TMINp5; Fig. 2), the TMINp5 was also calculated separately for each period. This ensures that (1) cold extremes are properly accounted for in each period and its specific climatic regime, and (2) a sufficient number of CDs is obtained for the warmer future to make possible a robust statistical analysis.

Fig. 2

Mean values of the 5th percentile of the daily minimum temperature distribution (TMINp5) over November-March periods, averaged over all grid points within the area under study. Filled points stand for the historical period (1979–2020), crosses for the RCP4.5 scenario, and empty points for the RCP8.5 scenario (2070–2099). ENS stands for the Ensemble mean

Various characteristics of CDs were calculated for all datasets. In addition to the average number of CDs per year, we assessed the CD mean intensity, CD spatial extent, and mean length of cold snaps (CD sequences). The mean intensity represents the average negative anomaly of TMIN from TMINp5 in °C and indicates the severity of cold conditions. The mean length of cold snaps corresponds to the average number of consecutive CDs, providing insights into their persistence.
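A minimal sketch of how these characteristics could be computed from a daily CD flag is given below. The function names are hypothetical, and two details are assumptions rather than statements from the text: the mean intensity is taken over grid points that lie below TMINp5 on CDs, and the flag series is assumed to cover a single continuous season so that runs do not join across the March to November gap.

```python
import numpy as np

def cold_snap_lengths(is_cd):
    """Lengths of runs of consecutive cold days in a 1-D boolean series."""
    flags = np.asarray(is_cd, dtype=int)
    padded = np.concatenate(([0], flags, [0]))   # guarantee runs start and end
    changes = np.diff(padded)
    starts = np.flatnonzero(changes == 1)
    ends = np.flatnonzero(changes == -1)
    return ends - starts

def cd_statistics(tmin, tmin_p5, is_cd):
    """Summary statistics of CDs for a (n_days, n_points) TMIN array."""
    is_cd = np.asarray(is_cd, dtype=bool)
    dev = np.asarray(tmin, dtype=float)[is_cd] - tmin_p5  # deviations on CDs only
    below = dev < 0                                        # points below threshold
    return {
        "n_cd": int(is_cd.sum()),
        # Assumption: intensity averaged over sub-threshold grid points only
        "mean_intensity": float(dev[below].mean()),
        # Mean fraction of grid points below TMINp5 on a CD
        "mean_extent": float(below.mean(axis=1).mean()),
        "mean_snap_length": float(cold_snap_lengths(is_cd).mean()),
    }
```

For a multi-year record, `cold_snap_lengths` would be applied to each winter separately and the resulting lengths pooled before averaging.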

2.4 Classification of atmospheric circulation patterns

To characterize the synoptic-scale atmospheric circulation over the study area, we employed a classification of circulation types derived from the Sammon mapping approach. Please refer to Stryhal and Plavcová (2023) for a comprehensive description of the methodology with figures; a concise overview is as follows:

Pre-processing of data

  • For the domain shown in Fig. 1 by the broken-line box, which is centred on the smaller CE region under study, we interpolated the referenced datasets onto a common 1° × 1° grid and calculated daily mean SLP.

  • The chosen domain corresponds to the “D07” region suggested by Philipp et al. (2016) in order to improve on comparability of studies on central-European climate.

Sammon projection

  • We applied Sammon mapping (Sammon 1969) to the ERA5 daily mean SLP in order to transform the high-dimensional data into a two-dimensional Sammon space, in which each daily field is represented by only two indices (x, y coordinates). The mapping aims to minimize the disproportion of Euclidean distances between each pair of fields in the original high-dimensional physical space and between their coordinates in the projection space. As a result, fields with similar spatial patterns appear close to each other in the Sammon map, or, in other words, the Sammon map approximately preserves the topology of the data.

  • All RCMs were projected onto the ERA5 Sammon map. For further details on the projection procedure, please refer to Stryhal and Plavcová (2023) and their Fig. 2.
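The "disproportion of distances" minimised by Sammon mapping is commonly quantified by Sammon's stress, the sum of squared differences between original and projected pairwise distances, each divided by the original distance and normalised by the sum of original distances. The sketch below only evaluates this stress for a given projection; it is not the optimisation itself, the function names are hypothetical, and it assumes that no two fields are identical (non-zero original distances). The brute-force distance matrix is suitable for small samples only.

```python
import numpy as np

def pairwise_distances(x):
    """Euclidean distance matrix for the rows of a 2-D array."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def sammon_stress(fields, coords):
    """Sammon's stress between fields (n, n_grid) and their 2-D coords (n, 2)."""
    fields = np.asarray(fields, dtype=float)
    coords = np.asarray(coords, dtype=float)
    upper = np.triu_indices(len(fields), k=1)      # each pair counted once
    d_orig = pairwise_distances(fields)[upper]
    d_proj = pairwise_distances(coords)[upper]
    # Assumption: d_orig > 0 for all pairs (no duplicate fields)
    return float(((d_orig - d_proj) ** 2 / d_orig).sum() / d_orig.sum())
```

A stress of zero means all pairwise distances are preserved exactly; dividing by the original distance gives small distances more weight, which is why neighbouring points in the map tend to be genuinely similar fields.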

Classification

  • Note that Sammon mapping does not provide classification; its function is to transform the data. However, this transformation enables us to precisely delineate circulation types (CTs) in a way that optimally captures their connection to CDs.

  • The classification itself is obtained by discretizing the Sammon map into a regular 7 × 14 array of bins, each covering the same area in the Sammon space and collectively encompassing the entire Sammon space of the combined ERA5-RCM dataset. We opted for 98 bins to achieve a balance between output complexity and manageability.

  • For each bin, we obtain its mean pattern by averaging all daily fields classified with it. The patterns are shown in Fig. 3; the frequencies of the bins in ERA5 and individual RCMs, as well as the differences between observed and modelled frequencies for the historical period, are shown in Fig. S1 of the Supplementary Materials. The empty bins in the corners have no associated fields and can be considered artefacts of the methodology, which projects a rectangular array of bins onto an elliptical data space.

  • Owing to the preserved data topology, neighbouring Sammon map bins have very similar patterns and links to CDs and can, therefore, be merged together to form a few robust CTs. We merged the 98 bins (i.e., defined boundaries between CTs) into 8 CTs based on an expert evaluation of (1) the similarity of their circulation patterns and (2) the similarity of their links to CDs. Figure 4a illustrates the division of the bins into CTs, along with the frequency of individual bins in ERA5. CTs are described in detail in Sect. 3.2.
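The discretisation and averaging steps can be illustrated as follows. This is a sketch under assumptions: the function names are hypothetical, the 14 columns are assumed to lie along the x axis of the Sammon space and the 7 rows along y, and the bin edges are assumed to span the minimum and maximum coordinates of the combined dataset. The subsequent merging of bins into 8 CTs relies on expert evaluation and would have to be supplied as a lookup table.

```python
import numpy as np

def assign_bins(xy, n_rows=7, n_cols=14):
    """Map 2-D Sammon coordinates (n_days, 2) to bin indices 0..n_rows*n_cols-1."""
    xy = np.asarray(xy, dtype=float)
    x_edges = np.linspace(xy[:, 0].min(), xy[:, 0].max(), n_cols + 1)
    y_edges = np.linspace(xy[:, 1].min(), xy[:, 1].max(), n_rows + 1)
    # digitize returns 1-based interval indices; clip puts the maximum in the last bin
    col = np.clip(np.digitize(xy[:, 0], x_edges) - 1, 0, n_cols - 1)
    row = np.clip(np.digitize(xy[:, 1], y_edges) - 1, 0, n_rows - 1)
    return row * n_cols + col

def bin_mean_patterns(fields, bin_id, n_bins=98):
    """Mean SLP pattern per bin; bins without any day stay NaN (empty corners)."""
    fields = np.asarray(fields, dtype=float)
    patterns = np.full((n_bins, fields.shape[1]), np.nan)
    for b in np.unique(bin_id):
        patterns[b] = fields[bin_id == b].mean(axis=0)
    return patterns
```

Bin frequencies for any dataset then follow from `np.bincount(bin_id, minlength=98)` divided by the number of days.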

Fig. 3

Sammon map bins and their mean SLP patterns [in hPa]. The patterns are calculated over all days classified with a given bin (all datasets combined together). The area from 0° to 30° E and from 40° to 60° N is plotted in each bin, and the red box indicates the central European region for which the cold days are evaluated. Blue lines indicate division into circulation types (their numbers are given in Fig. 4). The empty bins stand for extreme circulation which has never occurred in any dataset

Fig. 4

a Frequency [in days] of circulation patterns in ERA5 over 1979–2020 and the division of bins into circulation types (marked with numbers 1–8). b Mean minimum temperature (TMIN) anomaly and c cloud cover of days within individual circulation patterns in ERA5 over 1979–2020. Circle (cross) indicates a value significantly lower (higher) than the climatological mean over all winter days (estimated by bootstrapping; R = 10,000)

To analyse the mean weather characteristics associated with individual circulation patterns, we calculated mean daily TMIN and cloud cover for each bin, and their anomalies. While cloud cover anomalies were calculated relative to the November-to-March average, TMIN anomalies were expressed relative to the 42-year averages independently calculated for each day of the year, in order to account for the strong seasonal cycle in TMIN.
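The calendar-day anomalies can be sketched as below. The function name and the integer `day_index` (position of each day within the November to March season, starting at 0) are assumptions of this sketch; how 29 February and non-standard model calendars are handled is not specified in the text and is left to the caller.

```python
import numpy as np

def calendar_day_anomalies(values, day_index):
    """Anomalies relative to the multi-year mean of each calendar day.

    values    : 1-D array of daily values (e.g. area-mean TMIN), all years pooled
    day_index : 1-D non-negative integer array, calendar position of each day
    """
    values = np.asarray(values, dtype=float)
    day_index = np.asarray(day_index)
    n_days = int(day_index.max()) + 1
    sums = np.bincount(day_index, weights=values, minlength=n_days)
    counts = np.bincount(day_index, minlength=n_days)
    climatology = sums / np.maximum(counts, 1)   # avoid division by zero
    return values - climatology[day_index]
```

Averaging these anomalies over the days assigned to a bin gives the bin-mean TMIN anomaly, without the seasonal cycle leaking into bins that happen to occur mostly in mid-winter.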

2.5 Attribution of model biases and projected changes

To attribute the biases of CD frequencies in model simulations, we calculate the biases due to differences in simulated circulation (DC) and due to differences in simulated links between circulation and the probability of a CD (DL). The total number of observed CDs can be expressed as

$${CD}^{obs}=\sum_{CT=1}^{8}{F}_{CT}^{obs}\cdot {P}_{CT}^{obs}\cdot {N}^{obs}$$

where F represents the frequency of each CT, P represents the probability of a CD in a given CT, and N represents the total number of winter days in the observational dataset obs.

Then, the DC component represents the difference in CD occurrence if the CT-conditional probabilities of CDs were identical in an RCM m and observations, while the frequencies of CTs differed, i.e.

$${D}_{C}^{m}=\sum_{CT=1}^{8}{F}_{CT}^{m}\cdot {P}_{CT}^{obs}\cdot {N}^{m}-{CD}^{obs}$$

The DL component represents the difference in CD occurrence if the frequencies of CTs were identical in an RCM m and observations, while the CT-conditional probabilities of CDs differed, i.e.

$${D}_{L}^{m}=\sum_{CT=1}^{8}{F}_{CT}^{obs}\cdot {P}_{CT}^{m}\cdot {N}^{m}-{CD}^{obs}$$

We assess future changes in CD occurrences in the models’ projections analogously, that is, by substituting in the above equations the reference observational dataset with model historical runs and model simulations with model projections. Note that $N^{obs}$ differs from $N^{m}$ in some cases, because not all models use the standard calendar. Analogously, $N^{m}$ differs between the historical and future periods.
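The three equations translate directly into array operations. The sketch below uses hypothetical function names and takes F as relative CT frequencies and P as CT-conditional CD probabilities, as defined above. It also reports the total bias and the remainder not captured by the two components, since the two terms need not add up exactly to the total.

```python
import numpy as np

def cd_total(freq, prob, n_days):
    """Number of CDs implied by CT frequencies, CT-conditional CD probabilities and N."""
    return float(np.sum(np.asarray(freq, dtype=float) * np.asarray(prob, dtype=float)) * n_days)

def attribute_bias(f_obs, p_obs, n_obs, f_m, p_m, n_m):
    """Split the CD bias of model m into circulation (D_C) and link (D_L) parts."""
    cd_obs = cd_total(f_obs, p_obs, n_obs)
    d_c = cd_total(f_m, p_obs, n_m) - cd_obs    # model frequencies, observed links
    d_l = cd_total(f_obs, p_m, n_m) - cd_obs    # observed frequencies, model links
    total = cd_total(f_m, p_m, n_m) - cd_obs    # full model minus observations
    return {"D_C": d_c, "D_L": d_l, "total": total, "residual": total - d_c - d_l}
```

For projected changes, the same function would be called with the historical run in place of the observations and the projection in place of the model, each with its own number of winter days.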

3 Results

3.1 Cold day statistics in historical period

In the 42-year historical period, a total of 253 CDs are detected in the observed data (E-OBS), averaging 6.0 CDs per season. The ENS closely reproduces this frequency with 243 CDs, averaging 5.8 CDs per season. However, there are notable differences among individual RCM simulations (4.8–6.7 CDs per season; Table 2). These differences are primarily due to the RCMs, not the driving data: all CCLM4-8-17 runs overestimate the observed frequency by 7−11%, while all RACMO22E runs underestimate it by 16–21%. The RCA4 simulations perform best, with differences ranging from −3 to +6%. It is important to note that our definition of CDs based on relative thresholds circumvents any cold or warm biases of TMIN in individual simulations, which are quite substantial (up to several degrees Celsius; see Fig. 2).

Table 2 Cold day statistics over historical (1979–2020) and future (RCP4.5 and RCP8.5, 2070–2099) periods

On average, the spatial extent of CDs covers ~ 4/5 of the region in both observations and models (78% of grid points in E-OBS versus 79% in ENS). However, although the grid size is identical in all RCMs, the CCLM4-8-17 runs have a slightly higher average spatial extent of extreme cold events (80−82%), while RCA4 driven by EC-EARTH exhibits the smallest average extent (75%).

The differences between observations and models are more noticeable in other characteristics. The mean intensity of CDs in models is generally lower than that in the observed data, except for the RCA4 simulations (Table 2). Conversely, models tend to have higher persistence of cold events. This feature is apparent from several parameters and in most simulations: The mean length of cold snaps is over 10% higher in the ENS compared to E-OBS (3.29 vs. 2.91 days, Table 2). Isolated CDs (one-day cold snaps) are more prevalent in observations (13.0% in E-OBS) compared to any model (7.9% in ENS). The maximum length of a cold snap in E-OBS is 14 days, while it reaches 17 days in the ENS and up to 28 days in simulations.

Interestingly, no clear relationship among the frequency, intensity and spatial extent of CDs is apparent across the models. Surprisingly, the longest mean durations of cold snaps are not produced by the model with the largest overestimation of CD occurrence (CCLM4-8-17), but by the simulations that reproduce the observed frequency of CDs relatively well (RCA4 driven by CNRM-CM5 and HadGEM2-ES). These two simulations also have the highest overestimation of mean CD intensity, suggesting the influence of progressive cooling during prolonged extreme cold snaps.

There is a good correspondence in the TMINp5 and CDs characteristics between E-OBS and ERA5 datasets (Fig. 2 and Table 2), and therefore our conclusions on model validation would not change if the ERA5 temperature data were used.

3.2 Atmospheric circulation

Table 3 summarises mean characteristics of the 8 CTs obtained by merging the Sammon map bins (Figs. 3 and 4a).

  • Anticyclonic patterns with an easterly airflow component located in the upper right corner of the Sammon space are merged into CT 1 and CT 2. The anticyclone is in these CTs located north or northeast of the CE region. These two CTs are associated with the lowest TMIN over the CE region (Fig. 4b). The cold temperatures in CT 2 are primarily linked to strong easterly advection of cold air masses, while radiation cooling during clear-sky conditions (Fig. 4c) plays the decisive role in CT 1.

  • CT 3 includes patterns in top centre of the Sammon space (Fig. 3) which are characterised by cold easterly or south-easterly flow around a Mediterranean trough and by high cloud cover (Fig. 4c).

  • CT 4 and CT 5 represent clear-sky anticyclonic patterns with the anticyclone’s centre located over or south of the CE region, with only weak flow over the region. CT 5 features a somewhat weaker anticyclone compared to CT 4.

  • CT 6 exhibits atmospheric circulation with south-easterly flow over the CE region, which is similar to CT 3. However, CT 6 has considerably weaker flow and smaller cloud cover (Fig. 4c).

  • The very common circulation patterns similar to the climatological seasonal mean are found in the central part of the Sammon space and are merged together into CT 7. In this CT, TMIN anomalies and cloud cover are also close to the seasonal average (Fig. 4b, c).

  • CT 8 merges all the remaining patterns, which are predominantly characterised by cyclonic and/or westerly flow typically bringing relatively warm and humid air masses into central Europe that are not conducive to TMIN extremes.

Table 3 Summary of main characteristics of circulation types

The majority of November-March days exhibit circulation patterns that are located close to the centre of the Sammon space, while the frequency decreases towards the edges of the space (Fig. 4a). As a result, CT 7 has an observed frequency of almost 60%, while CT 2, which represents the other extreme, accounts for less than 0.5% of days (Table 3). Although CT 8 covers more than half of the circulation bins in the Sammon space in Fig. 3, showing a relatively high variability of patterns classified with it, it represents less than 13% of all winter days.

3.3 Circulation types conducive to cold days in historical period

Inversely to TMIN (Fig. 4b), the probability of CDs increases across the Sammon space approximately from its bottom left to its upper right (Fig. 3), and the trend is consistent between observations (Fig. 5a) and ENS (Fig. 5b). Since the probability was utilized to delimit and sort the CTs, it decreases from CT 1 to CT 8 (Fig. 6c and Table 3). The approach also allows us to define two groups of CTs: a group of “non-conducive” CTs with CD probability below the climatological average (CTs 7–8) and a group of “conducive” CTs (CTs 1–6) with above-average CD probabilities.

Fig. 5

Circulation patterns linked to cold days in historical period (1979–2020). Blue colour depicts the Sammon map bins in which cold days never occur; red scale indicates percentages of cold days within all days of given bin, while grey colour marks bins with occurrences of cold days lower than the mean climatological occurrence (M). The layout of circulation pattern bins corresponds to that in Fig. 3. Circle indicates the statistical significance of the cold day occurrence (p = 0.05, estimated by bootstrapping R = 10,000)

Fig. 6

Circulation type frequency and probability of cold days within individual circulation type for historical period (left) and their change in future period (RCP8.5; right). Red points – observed data, boxplots – RCM simulations, horizontal lines in plot (c) indicate the cold-day probability over all winter days in the observed data (red) and in individual simulations (grey). Zoomed in version of panel (a) (with y-axis of 0–20%) and panel (c) (with y-axis of 0–5%) is given in Fig. S2. Results for RCP4.5 are shown in Fig. S4

CT 8 encompasses a large group of circulation patterns in which CDs occur extremely rarely in observations (only one day was recognized as a CD according to our criteria out of 798 days classified with this type). In the most frequent CT 7, 84 CDs occur in observations. However, the CD probability of this CT is only 2%, which is considerably below the seasonal mean (Fig. 6c). Among the conducive CTs, the observed probability of CDs is the highest in CT 1 (27%) and CT 2 (22%). Therefore, although these two rarely occurring CTs (Fig. 6a) represent only 2.3% of all winter days, they account for 15% of all CDs in the observed data. These two CTs also exhibit the strongest mean TMIN anomaly (– 6°C; Fig. 7a). The other conducive CTs (CTs 3−6) have lower probabilities of CDs than CT 1 and CT 2, ranging from 6 to 13% (Fig. 6c), and they are also relatively warmer (Fig. 7a).

Fig. 7

Same as Fig. 6, but for daily minimum temperature (TMIN) and cloud cover anomalies. Anomalies are calculated separately for historical (left) and future (right) periods. Red points—observed data, boxplots—RCM simulations. Results for RCP4.5 are shown in Fig. S5

The RCMs generally reproduce the observed links between the probability of CDs and atmospheric circulation patterns relatively well. This is shown for ENS in Fig. 5b and for individual simulations in Fig. 5c−k. Although some differences exist among the individual model runs, ENS as well as individual simulations capture the above-average probability of CDs in CTs 1−6 and the below-average probability in CTs 7−8 (the only exception being CT 4 in runs driven by EC-EARTH; Fig. 6c). CTs 1−6 account for 26% of all winter days in the ENS, while including almost 70% of all simulated CDs—a proportion similar to observations.

3.4 Attribution of model biases

Biases in the number of CDs in the historical period range from – 53 days to + 27 days, as shown by the grey bars in Fig. 8a. These biases are caused by the models’ inability to realistically reproduce observed atmospheric circulation, observed links between circulation and CDs, and their combinations. Model biases attributed to errors in CT frequencies (DC, see Sect. 2.5) are in Fig. 8a indicated by red bars, while biases attributed to the links (DL) are indicated by blue bars. The relative contributions of DL and DC to the overall bias vary among the individual simulations. In some cases, they have a compensating effect on the total bias, resulting in seemingly perfect simulations (e.g., RCA4 driven by HadGEM2-ES). However, in other cases, the biases act in conjunction, leading to marked departures from observations (e.g., RACMO22E driven by CNRM-CM5).

Fig. 8

Attribution of differences in cold days: a historical simulations relative to observations, b RCP4.5 relative to historical simulations, and c RCP8.5 relative to historical simulations. The overall differences are shown by grey bars; differences due to circulation (DC) and links between circulation and cold day probability (DL) are highlighted in blue and red stripes, respectively

Owing to the way the attribution is calculated (see equations in Sect. 2.5), even small biases in the frequencies of CTs with strong links to cold extremes play a more important role in CD biases than substantial over- or underestimation of the non-conducive CTs. Figure 6a indicates that all RCMs replicate most of the relatively rare CTs 1−5 well. However, almost all models overestimate the frequency of CT 4 and underestimate the frequency of CT 6. It is evident from Fig. 8a that 4 simulations (RACMO22E driven by HadGEM2-ES and all runs of RCA4) considerably underestimate the frequency of conducive CTs, which leads to a circulation-induced too-low occurrence of CDs. On the other hand, the observed CT frequencies are reproduced best by CCLM4-8-17 driven by CNRM-CM5. This indicates that the overestimation of CDs in this run is primarily caused by incorrect links between circulation and cold extremes.

The inaccurate replication of the links between CTs and CDs plays a more important role in the overall biases for the majority of simulations. One common drawback shared by all simulations is the higher than observed probability of CDs in CT 6 (Fig. 6c), resulting in positive DL biases. Other errors add to or compensate for the error in CT 6, but they are RCM-specific (see Fig. 8a): CCLM4-8-17 tends to simulate stronger links between conducive CTs and CDs, which are not compensated by an opposite link in the common CT 7. Therefore, the overestimated frequency of CDs by this model is primarily due to exaggerated TMIN extremes in conducive CTs. On the contrary, RACMO22E underestimates the number of CDs in conducive CTs 1–5, as well as in CT 7. Therefore, the underrepresented frequency of CDs by this model is primarily due to links between TMIN extremes and circulation patterns being too weak. This can be at least partly explained by the lowest persistence of conducive atmospheric circulation in this RCM indicated in Table 2 (RACMO22E also produces the shortest mean lengths of sequences of days within the conducive circulation bins). Last, RCA4 is the only RCM that accurately reproduces the number of CDs in CT 7. However, the overall DL is exaggerated in all three RCA4 simulations due to errors in other CTs (especially CT 6), which approximately offsets the aforementioned errors caused by CT frequency. Therefore, the relatively accurate representation of CD frequencies in RCA4 runs is merely an artefact of two errors of opposite sign that cancel each other out.

A more comprehensive understanding of DC and DL requires further investigation. One potentially crucial factor behind the RCMs’ errors in synoptic-scale CT frequencies is the simulation of large-scale circulation by their driving GCMs. This factor appears to be even more significant during winter when the large-scale circulation over Europe intensifies. Some circulation biases appear to propagate from the driving models, which is apparent when comparing columns of Fig. S1c–k. For instance, simulations driven by HadGEM2-ES (central column) tend to simulate more days with extreme circulation patterns at the expense of average fields near the centre of the Sammon space. Specifically, all of these runs underestimate the combined frequency of CT 6 and CT 7 (by 8−10 percentage points in comparison to ERA5), while overestimating the frequency of westerly and zonal flows of the non-conducive CT 8 (by 6 percentage points) and anticyclonic flow of CT 4 (by 2 percentage points). The opposite tendency is present in simulations driven by CNRM-CM5 (left column), which produce higher frequencies of CT 6 and CT 7 and lower frequencies of the strong westerly flow of CT 8 and the weak anticyclonic flow of CT 5 compared to simulations driven by other GCMs (Fig. 6a). However, individual RCMs have the ability to considerably modify their boundary circulation. This is clearly demonstrated by RCA4, which produces the lowest combined frequency of conducive CTs regardless of the driving GCM (Figs. 6a and S2).

The DL can be influenced by various factors, but the impact of cloud cover stands out due to its influence on radiation losses, which are a significant contributor to extreme TMIN anomalies. The mean observed winter cloud cover is 74%. An analysis of RCMs shows that all simulations underestimate the mean winter cloud cover, with the exception of CCLM4-8-17 driven by CNRM-CM5 and HadGEM2-ES. The underestimation primarily originates within the RCMs, with cloud cover ranging from 73 to 79% in CCLM4-8-17, 68 to 71% in RACMO22E, and only 59 to 61% in RCA4 runs. We show that both the cloud cover (Fig. 4c) and its biases (Fig. 7c) exhibit a rather strong link to CTs. The ENS and most individual simulations underestimate cloud cover in all conducive CTs, except for the anticyclonic CT 4. It might be expected that the lower cloud cover would result in a positive DL across most simulations, especially when it is underestimated in most conducive CTs. However, the anomalies depicted in Fig. 7c do not appear to be a reliable predictor of either CD probability (Fig. 6c) or DL (Fig. 8a). Note the similar DL in CCLM4-8-17 and RCA4 runs, despite the opposite cloud-cover biases in the two RCMs. This is also supported by statistical analysis, since no significant correlation between cloud cover and conduciveness to CDs has been found for any conducive CT.

3.5 Projected changes of cold days

The ENS indicates a slight increase (by 6%, + 0.3 CDs per year) in the frequency of CDs for the end of the twenty-first century under the RCP8.5 scenario. However, individual models exhibit a wide range of trends, with changes ranging from – 5 to + 23% (– 0.2 to + 1.3 CDs per year, Table 2). For the RCP4.5 scenario, the increase in CD frequency is slightly smaller: only about 2% in ENS, while – 7% to + 18% in individual models. These results highlight the uncertainty in the projections. We reiterate that the threshold of CDs was calculated independently for each period, which removed the background warming. None of the simulations which overestimate the occurrence of CDs in the historical period projects an increase in CD frequencies in the future for RCP8.5 (compare grey bars in Fig. 8ac). Instead, they exhibit a small decrease to 6.0−6.5 CDs per year (Table 2). A negligible decrease can also be seen in RACMO22E driven by CNRM-CM5, which combined with its strong negative bias in the historical period yields a projection with fewest CDs (decrease to 4.8 CDs per year for both scenarios; Table 2). On the other hand, four (five) simulations project a notable increase in CD probability by 13–23% (2–18%) for RCP8.5 (RCP4.5).

The results also vary considerably for other CD properties. Under RCP8.5, the ENS projects CDs that are less intense and have a somewhat smaller spatial extent compared to the historical period (Table 2). However, the mean length of CD sequences increases in the ENS, while it decreases in 4 out of 9 simulations. This decrease is a common pattern of all simulations driven by CNRM-CM5 (by a similar magnitude of ~ 10%). No other dependencies of projected changes on either driving data or RCMs are obvious (Table 2). The projected changes for RCP4.5 are even less clear.

3.6 Attribution of projected changes

The projected changes in CD occurrences are influenced by changes in atmospheric circulation and/or by changes in synoptic links. The projected changes in atmospheric circulation between future and historical periods are illustrated for RCP8.5 in Fig. 6b and in more detail in Fig. S3. Overall, all models project an increase in anticyclonic CTs (1, 4 and 5), while the easterly and south-easterly flow (CTs 3 and 6) is projected to decrease (Figs. 3 and 6b). Like the biases, the projected changes are strongly influenced by the driving data: notably, runs driven by CNRM-CM5 exhibit a consistently smaller increase in CT 5, EC-EARTH-driven runs project a marked decrease in the most frequent CT 7, and runs driven by HadGEM2-ES show a strong increase in CT 7 combined with a sharp decline in the zonal/cyclonic CT 8 (Fig. 6b). These main characteristics of projected circulation changes are also evident for RCP4.5, except that the mean changes in CT frequencies are generally smaller (compare Fig. 6b with S4b).

Other characteristics of CTs are also projected to change. Figure 7b demonstrates that the projected warming in TMIN is conditional on CTs: The conducive CTs 1−6 consistently exhibit more warming compared to CTs 7 and 8. This implies that the temperature difference between cold and warm circulation patterns is expected to decrease. Notably, the difference in TMIN between the warm CT 8 and cold CT 2 decreases in the ENS from 7.9 to 6.2 °C under RCP8.5 (cf. Fig. 7a, b). The decrease is well visible also for RCP4.5 (Fig. S5ab). On the contrary, the ENS does not suggest any notable changes in CD probabilities conditional on individual CTs (Fig. 6d).

The proportion of changes in CD frequency that can be attributed to circulation changes and changes in CD-circulation links varies among the models, as shown in Fig. 8bc. The projected increase in the frequency of conducive circulation within CTs 1, 4 and 5 in runs driven by EC-EARTH under RCP8.5 leads to their positive DC (Fig. 8c). Since no obvious changes in frequencies of conducive CTs are detected in these models under RCP4.5, DC is negligible for this scenario (Fig. 8b). The negative DC in CCLM4-8–17 (and for RCP4.5 also RCA4) driven by CNRM-CM5 (Fig. 8bc) seems to be caused by a shift in CT frequency from CT 6 (weak south-easterly flow) to the average CT 7 (Figs. 6b and S4b). In simulations driven by HadGEM2-ES, DC reflects two compensating changes: an increase in the frequency of CT 5 and a decrease in the frequency of CT 3 (Fig. 6b). All in all, projected changes in CD frequencies, as well as relative contributions of DC and DL, are considerably more chaotic compared to CD biases in the 1979–2020 period, which exhibited a clear link to RCMs.
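The split of a change in CD frequency into a circulation part (DC) and a link part (DL) can be illustrated with a minimal sketch. The decomposition below is a generic one, assumed here for illustration only; it is not necessarily the exact formulation used in this study. The function name `decompose_cd_change` and all numbers are hypothetical.

```python
import numpy as np

def decompose_cd_change(f_hist, p_hist, f_fut, p_fut):
    """Split the change in overall CD probability into three terms.

    f_*: relative frequencies of circulation types (each set sums to 1)
    p_*: probability of a cold day conditional on each circulation type
    Assumed generic decomposition, not the paper's confirmed method.
    """
    f_hist = np.asarray(f_hist, dtype=float)
    p_hist = np.asarray(p_hist, dtype=float)
    d_f = np.asarray(f_fut, dtype=float) - f_hist
    d_p = np.asarray(p_fut, dtype=float) - p_hist

    dc = np.sum(d_f * p_hist)     # change due to CT frequencies alone
    dl = np.sum(f_hist * d_p)     # change due to CD-circulation links alone
    mixed = np.sum(d_f * d_p)     # interaction of both changes
    total = np.sum((f_hist + d_f) * (p_hist + d_p)) - np.sum(f_hist * p_hist)
    return dc, dl, mixed, total

# Hypothetical three-type example (values are made up).
dc, dl, mixed, total = decompose_cd_change(
    f_hist=[0.2, 0.3, 0.5], p_hist=[0.20, 0.05, 0.01],
    f_fut=[0.25, 0.30, 0.45], p_fut=[0.18, 0.05, 0.01],
)
print(f"DC={dc:+.4f}  DL={dl:+.4f}  mixed={mixed:+.4f}  total={total:+.4f}")
```

In this form the three terms sum exactly to the total change, so opposing DC and DL contributions that cancel each other become visible.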

4 Discussion

Projections of future weather and climate extremes are vital foundations of climate change adaptation policies. The observed changes in the mean climate state and the warming trend have affected the CE region in various ways (IPCC 2022). Both human society and the natural environment must adapt to shifts in the mean state of temperature distribution as well as extreme temperature events. Our results show that over the CE region the rate of these shifts considerably differs. While the RCM ensemble analysed in our study projects a warming of winter daily mean TMIN of approximately 4 °C (2 °C) by the end of the century under RCP8.5 (RCP4.5), the corresponding increase in TMINp5 is 7.3 °C (3.7 °C), as illustrated in Fig. 2. This dissimilarity was detected also in observations (Gross et al. 2018) and model projections (Wang et al. 2017). Although this suggests a comparatively lower severity of cold events on average, extreme low-temperature events can still occur. Furthermore, their impacts may be even more severe due to their relative rarity and the decreased resilience of future societies and ecosystems to cold extremes in a considerably warmer environment.

However, in an analysis of future temperature trends, the background warming and its sensitivity to the choice of model, model ensemble, and emission scenario cannot be completely neglected. Coppola et al. (2021) demonstrate that a CMIP6 ensemble generally shows a stronger warming compared to CMIP5 and CMIP5-based EURO-CORDEX ensembles (the change in central European mean winter TMIN being roughly 1.5 × larger in CMIP6). Even larger is the role of the emission scenario, as documented by Kotlarski et al. (2023) and Georgoulias et al. (2022), and also corroborated by our own results for the moderate RCP4.5 and the pessimistic RCP8.5 (Fig. 2). Our findings reveal that while generally of identical direction, the trends in temperature, changes in atmospheric circulation, modifications in circulation-temperature links, and CD characteristics are considerably amplified in the pessimistic scenario. Although the natural variability of climate plays (and will play) an important role in assessing extreme events, the results underline the role of the socio-economic development of human society on future extremes.

We selected CDs as a representative high-impact winter extreme event to study the manifestation of climate change. If the same TMINp5 threshold for a CD from the historical period were used for the future scenarios, there would be a very steep decrease in CD occurrences. As in Cardoso et al. (2019), the number of CDs almost disappears for RCP8.5. The decrease in our region would be to 0.1–0.4 CDs per year in most simulations. Only the simulation with the smallest warming of TMINp5 (CCLM4-8–17 driven by EC-EARTH) projects more CDs than this range (0.8 per year). On the other hand, the simulation with the most pronounced warming in TMINp5 (RCA4 driven by CNRM-CM5) projects winters without any CDs. However, we consider such an analysis a misconception. Future environments will have to adapt to a new climate, for which current—and even less historical—thresholds will bear no meaning. Calculating independent thresholds for each dataset and period has another benefit. Since we focus on evaluating the links between CDs and atmospheric circulation, it allows us to remove the substantial temperature biases in RCMs (Lhotka and Kyselý 2018; Vautard et al. 2021; Quesada et al. 2023), as well as the magnitude of the projected warming (Cardoso et al. 2019).
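The effect of computing the threshold separately for each period can be illustrated with a minimal sketch on synthetic data. This is a simplified single-series example assumed for illustration; the synthetic series, the uniform 4 °C shift, and the variable names are hypothetical and do not reproduce the data or the spatial CD definition of this study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "historical" winter TMIN anomalies (°C) and a uniformly
# warmer synthetic "future" series; both are made up for illustration.
hist = rng.normal(0.0, 5.0, size=10_000)
fut = hist + 4.0

thr_hist = np.percentile(hist, 5)   # 5th percentile of the historical period
thr_fut = np.percentile(fut, 5)     # 5th percentile of the future period

# Fixed historical threshold: cold events become much rarer in the future.
frac_fixed = np.mean(fut < thr_hist)
# Period-specific threshold: the background warming is removed.
frac_own = np.mean(fut < thr_fut)

print(f"fixed threshold:  {frac_fixed:.3%} of future days")
print(f"own threshold:    {frac_own:.3%} of future days")
```

With a period-specific threshold, any remaining change in event frequency cannot come from the shift in the mean alone, which is why such thresholds suit an analysis focused on circulation links.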

Another methodological choice pertains to the definition of CDs as spatial rather than local events. Consequently, CDs occur only when low temperatures affect a large portion of the study area simultaneously. Alternatively, CDs may also occur when relatively smaller areas exhibit extremely low temperatures. This spatial approach allows for varying frequencies of CDs across different datasets, as it is not fixed to a specific percentage like 5% (equivalent to approximately 7.5 days per year) and facilitates the comparison of CDs between models. The spatial extent of a CD tends to be substantial: for more than one-third of CDs, the TMIN anomaly is below TMINp5 in at least 90% of CE grid points. Only 1−4% of CDs cover less than half of the region. Our methodology is able to distinguish models according to their tendencies to simulate spatially correlated low-temperature extremes. This characteristic influences the intensity and frequency of CDs. Models with a more uniform temperature distribution across the region tend to have a higher probability of a large percentage of grid points with TMIN anomaly below TMINp5, resulting in a higher occurrence of CDs and their higher mean intensity. This is shown by the CCLM4-8–17 RCM, which produces the spatially most extensive CDs with their highest mean frequency and lowest mean intensity (Table 2).
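The link between spatial coherence and the share of grid points below the threshold can be sketched as follows. This is a minimal illustration under stated assumptions: TMINp5 is taken here as the 5th percentile at each grid point, the data are synthetic, and the function name `cold_fraction` and the 50% areal cut-off are hypothetical and not taken from this study.

```python
import numpy as np

def cold_fraction(tmin_anom, q=5.0):
    """Daily fraction of grid points below their own q-th percentile.

    tmin_anom: array of shape (n_days, n_points).
    Per-grid-point thresholds are an assumption of this sketch.
    """
    tmin_anom = np.asarray(tmin_anom, dtype=float)
    thresholds = np.percentile(tmin_anom, q, axis=0)   # one value per point
    return (tmin_anom < thresholds).mean(axis=1)

rng = np.random.default_rng(1)
n_days, n_points = 2000, 50

# Spatially coherent field: one regional signal plus small local noise.
regional = rng.normal(size=(n_days, 1))
coherent = regional + 0.1 * rng.normal(size=(n_days, n_points))
# Spatially independent field: unrelated noise at every grid point.
independent = rng.normal(size=(n_days, n_points))

# Days on which at least half of the points are below their threshold
# (0.5 is a hypothetical areal cut-off).
n_coherent = int(np.sum(cold_fraction(coherent) >= 0.5))
n_independent = int(np.sum(cold_fraction(independent) >= 0.5))
print(n_coherent, n_independent)
```

Although each grid point falls below its threshold equally often in both fields, only the coherent field produces days on which a large part of the region is cold at once.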

All nine simulations overestimate the tendency to cluster CDs into sequences, that is, overestimate the persistence of cold snaps. This model drawback has also been found for summer hot days (Plavcová and Kyselý 2019) and persists from the previous generation of RCMs within the ENSEMBLES project (van der Linden and Mitchell 2009), as shown in Lhotka and Kyselý (2015b). This clustering tendency may also be related to overly persistent atmospheric circulation (Plavcová and Kyselý 2016). Our results indicate that overestimated CD frequency is not linked to longer cold snaps, while longer cold snaps are linked to stronger intensity of CDs. An influence of progressive cooling during prolonged extreme cold periods is supported by Fig. S6b, which illustrates that the longer the CD sequence, the larger the negative TMIN anomaly. The ENS reproduces the observed pattern well. The small/large TMIN anomalies in consecutive CDs in CCLM4-8–17/RCA4 (Fig. S6b) reflect the small/large mean CD intensity in these RCMs compared to observations (Table 2).
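The clustering of CDs into sequences can be quantified by the lengths of runs of consecutive CDs. The following minimal sketch assumes a single continuous daily series; the function name `sequence_lengths` and the example series are hypothetical, and a real analysis would additionally need to prevent runs from being joined across the gap between two winters.

```python
import numpy as np

def sequence_lengths(is_cd):
    """Lengths of runs of consecutive cold days in a daily boolean series."""
    flags = np.asarray(is_cd, dtype=bool).astype(int)
    # Pad with zeros so that runs touching either end are closed properly.
    padded = np.concatenate(([0], flags, [0]))
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)    # first day of each run
    ends = np.flatnonzero(edges == -1)     # first day after each run
    return ends - starts

# Hypothetical daily series: 1 marks a cold day.
series = [0, 1, 1, 0, 1, 0, 1, 1, 1]
lengths = sequence_lengths(series)
print(lengths, lengths.mean())
```

Comparing the mean of these lengths between a simulation and observations gives one simple measure of whether the persistence of cold snaps is overestimated.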

The classification of atmospheric circulation that we employ has recently been developed and optimized specifically for studying extreme circulation conducive to meteorological extremes. Stryhal and Plavcová (2023) have shown that this particular approach based on Sammon mapping is a viable alternative to classifications by self-organizing maps (Kohonen 2001), which even after several optimization steps provide only limited insights into outlying circulation patterns strongly linked to cold extremes over the CE region. Our approach effectively separates rare CTs with extreme patterns from the densely populated centre of the Sammon space, which contains patterns close to the seasonal average (the 8 most frequent bins contain around 45% of all winter days; Fig. S1). However, it should be noted that a statistical evaluation of a classification with frequencies spanning several orders of magnitude poses certain challenges. To address this issue, we meta-classified the classification output into only a few CTs, utilizing the links between circulation and CDs as a guideline, rather than relying on black-box-like processes of typical synoptic classifications. Another major advantage of this classification is that we can classify circulation patterns which do not occur in the data for which the Sammon mapping was calculated (ERA5 in our case). Our results show its effectiveness in assessing the links between CD probability and extreme circulation that occur only in models (compare Fig. 5b−k vs. Fig. 5a).

We have found relatively strong links between CDs and synoptic-scale atmospheric circulation. We observe that in some circulation patterns CDs never occur, while in others the probability of CDs is very high, reaching even 100% for certain patterns and simulations (Fig. 5). The main findings are in agreement with previous studies on cold extremes in the CE region (Plavcová and Kyselý 2012, 2019; Lhotka and Kyselý 2018). However, by applying the novel methodology to a full 3 × 3 RCM × GCM matrix, our study provides a considerably more detailed analysis and attribution of cold extremes in the region. The three driving GCMs have independent atmospheric models; this matters because Brands (2022) has shown that GCMs with common atmospheric components also have common regional atmospheric circulation error patterns. Despite the relatively small sample size, this selection strategy allows for a relatively broad representation of historical and future climates, while avoiding the drawbacks of unbalanced ensembles that result from large scarce data matrices (Evin et al. 2021). Furthermore, it is noteworthy that the circulation patterns conducive to CDs typically involve the presence of an anticyclone localized over central, northern, or north-eastern Europe. Importantly, all three GCMs simulate the synoptic-scale circulation over central Europe well (Stryhal and Huth 2019), even though GCMs, in general, have had a tendency to underestimate atmospheric blocking and exaggerate zonal flow in the region (Davini and D’Andrea 2016; Stryhal and Huth 2019). This indicates that the GCMs employed in this study capture the occurrence of the relevant circulation patterns reasonably accurately. This may also be the reason why CD biases in RCM simulations exhibited only a weak link to driving data, even though it was evident that GCM circulation biases notably propagate to RCM circulation, in accordance with previous studies (e.g., Røste and Landgren 2022).
It is also worth noting that we included historical runs by RCA4 and CCLM4-8–17 driven by CNRM-CM5. These two simulations use an outdated CNRM-CM5 output that had inconsistent boundary forcing, an error that should, nevertheless, have only negligible impacts on climatological assessments (EURO-CORDEX errata table, footnote 1; Ozturk et al. 2022). Their inclusion, however, prevents some kinds of analyses, such as on day-to-day correspondence of RCM and GCM fields (Røste and Landgren 2022).

Performance of the RCMs, on the other hand, alters the climate and the climate change signal of the underlying GCMs. Historical temperatures and warming in TMINp5 differ strongly among RCMs, on the order of several degrees Celsius (Fig. 2). Biases in reproduction of cloud cover also strongly depend on RCMs. Eight out of nine simulations agree on underestimation of mean wintertime cloud cover, which is in accordance with Bartók et al. (2017). Notably, the negative bias in cloud cover is conditioned by the CTs and occurs typically under the CTs conducive to CDs. Although underestimated cloud cover is associated with night-time cooling and amplified TMIN extremes (e.g. Trigo et al. 2002), we did not find a link between cloud cover and CD biases. This suggests errors in the simulation of radiation processes within the RCMs and corroborates previous studies of the subject (Lhotka and Kyselý 2018; Tucker et al. 2022). Tucker et al. (2022) suggest that other errors in clouds’ representation, such as their thickness, offset the bias due to the cover, and that erroneous simulations of clear-sky conditions drive errors in radiative forcing of TMIN rather than cloud cover.

A general increase in the frequencies of CTs conducive to CDs is projected for 2070–2099. This contradicts previous studies that suggested an increase in zonal flow in future European climates (Cattiaux et al. 2013; Plavcová and Kyselý 2013), and it may be related to a gradual improvement of climate models in simulating blocking conditions (Davini and D’Andrea 2020). Although the projected increase in meridional and easterly flow suggests an increase in temperature extremes, the RCM simulations used in our study indicate a reduction in the temperature difference between CTs conducive to CDs and those that are not. The more pronounced increase in TMIN associated with easterly CTs may be related to the Arctic Amplification, i.e. the more rapid warming in the Arctic compared to lower latitudes (Previdi et al. 2021). Additionally, the projected decrease in the intensity of CDs may be associated with a reduced snow cover, inducing lower night-time radiative cooling. Winter et al. (2017) showed that RCMs exaggerate snow cover-albedo feedback, which also explains the often too-low TMIN extremes in historical simulations. The projected decrease in snow cover over Europe (e.g., Christensen et al. 2022; Kotlarski et al. 2023) corroborates the hypothesis. However, a comprehensive evaluation of the surface energy budget is beyond the scope of our study.

5 Conclusions

We used an ensemble of CORDEX RCMs to investigate the links between cold days (CDs) and atmospheric circulation, which is the primary driver of these events over central Europe. We employed a novel classification of atmospheric circulation based on Sammon mapping to identify and analyse circulation types (CTs) linked to CDs. The main findings of our study are summarised as follows:

  • RCMs generally capture the characteristics of CDs, but they tend to overestimate their clustering tendency (grouping of CDs into consecutive sequences).

  • Anticyclonic conditions and/or easterly advection were found to be the dominant circulation patterns conducive to CDs. Although the CTs identified as conducive to CDs represent only 28% of all winter days, they account for two thirds of CDs observed in 1979–2020. The two CTs with the strongest links to CDs represent only 2% of winter days, while they account for more than 15% of CDs.

  • The ensemble mean reproduces the observed links between CDs and atmospheric circulation well. However, in individual models, marked biases in CD simulations arise from inadequate reproduction of CT frequencies and/or conduciveness of specific CTs to CDs. The biases in reproduction of atmospheric circulation propagate from driving GCMs into RCMs, while the biases in the links between CDs and circulation depend mainly on individual RCMs. We did not find any statistical link between CD biases and cloud cover, which is considerably underestimated in CTs conducive to CDs.

  • Projections under the warmer climate indicate an increase in the frequency of CD-conducive CTs, suggesting more winter extremes in the future.

  • The main patterns of CD probabilities within each CT are projected to be preserved in the future. However, the links between CD characteristics and atmospheric circulation are projected to weaken, as RCMs project faster (slower) warming for circulation patterns conducive (non-conducive) to CDs. This implies the possibility that future cold extreme events may become more frequent but less intense and less dependent on atmospheric circulation.

  • All projected changes are generally more pronounced for the worst-case climate change scenario (RCP8.5) than for the intermediate one (RCP4.5).

The results suggest uncertainty in the future of cold extremes in the CE climate. Our findings emphasize the importance of investigating the links between various meteorological characteristics when examining extreme weather events, particularly links to large-scale atmospheric circulation. A reliable model should accurately simulate both atmospheric circulation and its connections to extremes. We found that an RCM with good performance in terms of CD occurrence may still exhibit biases due to poor simulation of atmospheric circulation and its links to temperature or other meteorological variables (e.g., cloud cover).

Our study also highlights the ongoing importance of high-impact winter weather research, despite the overall warming trend. Cold extreme events have significant socio-economic implications, and understanding their dynamics and future changes is crucial for developing effective adaptation and risk management strategies. The assessment of RCMs’ abilities to reproduce links in the climate system is crucial for interpreting climate change scenarios based on RCM simulations.