1 Introduction

Knowing—predicting or projecting—the future climate with all its fine-scale details and associated uncertainties, can be considered as the Holy Grail in climate change adaptation (Giorgi 2019; Hewitt et al. 2021). This quest underlies large international efforts to produce high resolution climate information such as CORDEX (Giorgi et al. 2009; Jacob et al. 2020). With the application of high-resolution regional climate models, the question concerning the added value (AV) in comparison with lower resolution global climate models is often posed, in particular considering the high computational resources needed to perform such regional downscaling experiments.

The topic of AV of high resolution regional modelling systems is complex and strongly debated in the literature (Feser et al. 2011; Di Luca et al. 2015; Giorgi 2019; Lloyd et al. 2021). AV covers potential gains on larger scales resulting from better resolved physical processes as well as the value of the added fine-scale information. Pragmatic considerations, such as the availability of more output in regional modelling systems, are important as well in a user context.

Here, we only consider the aspect of the added fine-scale spatial patterns, in our case mostly connected to the orography, land–sea contrasts and high resolution land-use maps. We note that fine-scale information can also be related to small-scale atmospheric phenomena, such as convective showers, which are not necessarily geographically specific. We focussed here on changes in seasonal means, whereas AV may also be expected for extremes (Ciarlo et al. 2021).

From a physical point of view, regional climate models clearly resolve many regional processes better, such as local interactions with orography (Torma et al. 2015; Giorgi et al. 2016) or small-scale circulations such as sea breezes, frontal systems and mesoscale convective systems. This generally leads to a better reproduction of present-day observed climate statistics than in coarse resolution global climate models (Feser et al. 2011; Rummukainen 2016; Gutowski et al. 2020; Prein et al. 2021).

Yet, AV in the sense of producing more reliable climate change information and/or better projections of the future climate is not so obvious. Here, we use reliable in the sense of trustworthy, but later in the paper we use reliable in a measurable statistical sense within a perfect model approach. One could argue that the fact that local processes are better resolved in high resolution models likely leads to more reliable projections of future changes. Without disputing this view, one may also question whether the improved representation of physical processes (and the resulting better present-day climatology) is a sufficient justification to spend the limited computational resources on downscaling experiments instead of on producing larger ensembles of global climate model simulations or on improving those models (Nishant and Sherwood 2021). We mention two primary reasons for this point of view. First, a regional model inherits most of the large-scale atmospheric flow features from the global driving model (Ulden et al. 2007) and the reliability of a high-resolution projection is therefore limited by the reliability of the driving boundary conditions from the global model. Second, at higher resolution the smaller scales become more active owing to explicitly resolved instabilities at the smaller scales, leading to large small-scale variability (Fatichi et al. 2016; Aalbers et al. 2018). By using a perfect model approach where we know the future truth, we here aim to learn about AV in a climate change context, in particular focussing on the change signal and variability at fine scales.

Natural, internal climate variations due to chaotic dynamics (Selten and Branstator 2004) are inherently unpredictable on decadal and longer timescales and are a source of irreducible uncertainty (Deser et al. 2010, 2020; Fischer et al. 2014; Fatichi et al. 2016; DelSole and Tippett 2018; Lehner et al. 2020). In global climate modelling a considerable number of so-called single model initial-condition large ensembles (SMILEs) exist, i.e. projections with the same model that only differ by a random perturbation in the initial condition (Deser et al. 2020; Maher et al. 2021). SMILEs can be used to study the internal variations of the climate system in relation to the systematic greenhouse gas (and aerosol) induced forced climate change signal (Thompson et al. 2015; Lehner et al. 2020; Maher et al. 2021), but also changes in rare extremes or variability (Wood et al. 2021; van der Wiel et al. 2021). However, such SMILEs are still quite rare in regional climate modelling, mostly due to the high computational expense of regional models and the unavailability of GCM boundary conditions (Leduc et al. 2019; von Trentini et al. 2019, 2020; Maher et al. 2021).

We used a regional SMILE with the regional climate model RACMO2 embedded in the global Earth-System-Model EC-Earth2.3 (Lenderink et al. 2014; Aalbers et al. 2018). Using this SMILE, and employing a perfect model approach, we aim to answer the following questions:

  • concerning signal-to-noise: Are there systematic fine-scale change patterns, and how do these compare to the internal variability at large and fine scales? Do single ensemble members provide useful information on the systematic fine-scale changes?

  • concerning predictability: Do predictions of a future 30-year climate, called a climate state, improve (within the perfect model approach) by using fine-scale change information in comparison to using only the large-scale information?

As part of coping with the signal-to-noise problem, we also present results using a pseudo global warming (PGW) approach (Schär et al. 1996; Brogli et al. 2019a, b). In this approach, the weather in the control period is repeated under warmer/moister conditions, using perturbations in temperature/humidity and large-scale flow derived from EC-Earth. This approach filters out a considerable part of the climate variations related to variability of the large-scale atmospheric flow, and also other sources of internal variability like sea surface temperatures and sea-ice cover (de Vries et al. 2022). By filtering out these variations we expect to be able to detect the small-scale change features related to topography and land–sea contrast with less computational effort, for instance by running only one ensemble member instead of having to produce a large ensemble.

In the first part of the paper, we take a signal-to-noise standpoint, and investigate the internal variability and forced signal. A spatial filtering technique (Feser 2006) is used to separate out large-scale and fine-scale features. In the second part of the paper, we approach the problem in a prediction framework setting. A perfect model approach is used to evaluate how well a 30-year future climate state can be predicted, taking one ensemble member as the truth, and trying to predict the future state using information from the other members. Various options to produce high-resolution future states are explored to investigate the potential of adding value from high-resolution modelling systems.

2 Models and methods

2.1 Simulations

We used a 16-member initial-condition ensemble with the regional model RACMO2, embedded in the global climate model (GCM) EC-Earth2.3 forced by RCP8.5 emissions (see Lenderink et al. 2014; Aalbers et al. 2018). Only the EC-Earth2.3 simulation is perturbed in 1850, providing a 16-member ensemble up to 2100, and these members are used to force RACMO2 at its lateral boundaries for the period 1950 to 2100. For the analysis, we considered two periods: the control present-day climate period 1991–2020 and the future period 2071–2100. The grid spacing is 12 km, and the model domain covers the central western part of Europe.

For the future period we also analysed an ensemble of 16 RACMO2 simulations produced by a pseudo-global warming (PGW) approach. The ensemble mean changes between the future and present-day period in temperature, relative humidity, winds and pressure are first computed from EC-Earth2.3. Then for each member of the ensemble, these changes are added to the RACMO2 lateral boundary conditions from the control period, thus producing a set of 16 RACMO2 boundary conditions for the future period. Also, sea surface temperatures are adjusted, as well as the greenhouse gas and aerosol forcing. Soil moisture adjustments are only imposed on the initial state of the PGW-integrations and are derived from the standard ensemble.
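The construction of the PGW boundary conditions can be illustrated with a minimal sketch. It is not the authors' implementation: the array names, the toy sizes and the use of a single time-mean delta per boundary point are assumptions made here for illustration, and the real procedure also adjusts sea surface temperatures, forcing and initial soil moisture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_members, n_times, n_points = 16, 8, 5  # toy sizes, not the real setup

# Hypothetical GCM boundary fields (e.g. temperature) per member and time step.
gcm_control = 280.0 + rng.normal(0.0, 2.0, (n_members, n_times, n_points))
gcm_future = 283.0 + rng.normal(0.0, 2.0, (n_members, n_times, n_points))

# Ensemble-mean change between the two periods, one value per boundary point
# (assumption: a time-mean delta; a seasonally varying delta is equally possible).
delta = gcm_future.mean(axis=(0, 1)) - gcm_control.mean(axis=(0, 1))

# PGW boundaries: every member keeps its own control-period weather
# and receives the same ensemble-mean change on top of it.
pgw_future = gcm_control + delta
```

In this sketch the member-to-member differences of the PGW boundaries are identical to those of the control period, which is the sense in which the control weather is repeated under changed mean conditions.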

The original GCM driven ensemble is denoted as the standard (STD) ensemble, while the PGW driven runs are denoted as the PGW ensemble. Note that for the control period, the PGW and standard ensemble are the same. We only consider here the winter period.

2.2 Analysis of forced response and natural variability

We start by introducing some terminology used in this paper. A realization of a 30-year climate—a 30-year time period from a member of the ensemble—is called a climate state. A climate state deviates from the climatology of the model because of random internal variability. The latter climatology is defined here as the mean over the ensemble members for the period considered. We note, however, that with 16 members there still is an internal variability component left in this mean, for instance due to long-term variability such as in the ocean. We refer to the forced response as the average change between the future and present period over the 16 ensemble members—forced response to underline that this change is predominantly due to the external forcing (greenhouse gas and aerosol concentration). For the difference between a control and future climate state, we use the term change; it consists of the forced response modified by internal variability. As a measure of internal variability we use the standard deviation of the change across the ensemble members (range between plus and minus standard deviation around the ensemble mean, containing ~ 10 of the 16 ensemble members).
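The terminology above maps onto a few array operations. The following is a sketch on synthetic data, assuming arrays of shape (member, y, x); the variable names and the use of the sample standard deviation (ddof=1) are choices made here, not details given in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n_members, ny, nx = 16, 20, 30

# Synthetic 30-year climate states per member (toy values).
control = 2.0 + rng.normal(0.0, 0.2, (n_members, ny, nx))
future = 2.4 + rng.normal(0.0, 0.2, (n_members, ny, nx))

change = future - control                  # change of each member
forced_response = change.mean(axis=0)      # ensemble-mean change
internal_std = change.std(axis=0, ddof=1)  # inter-member spread of the change

# Number of members within plus/minus one standard deviation of the ensemble mean,
# averaged over all grid points.
inside = np.abs(change - forced_response) <= internal_std
members_inside = inside.sum(axis=0).mean()
```

The value of `members_inside` can be compared with the "~ 10 of the 16 ensemble members" quoted above for the plus/minus one standard deviation range.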

For precipitation, changes are fractional changes with respect to the reference period. Relative changes avoid unphysical negative precipitation amounts when applied in the prediction framework in the next section. Also, for winter precipitation as studied here the distribution of relative changes has fewer outlier points (a more even distribution) than that of absolute changes (with a small number of grid points with high change values). However, we acknowledge that relative changes could become large for areas with low precipitation amounts, as occurs, for example, in southern Europe in summer. For temperature, absolute differences between the control and future period are used.

We used a Gaussian spatial filter to separate large-scale and fine-scale patterns in the response. The filter is given by:

$$G\left(x,y\right)= \frac{1}{2\pi {\sigma }^{2}}{e}^{\frac{-{x}^{2}-{y}^{2}}{2{\sigma }^{2}}}$$

with σ the standard deviation, which is taken as 10 grid points. Applying this filter retains only the large scales, typically beyond 200 km (2 times the filter width; see e.g. Figs. 1, 2, 3, 4), which is representative of a present-day coarse resolution GCM. The results are not very sensitive to the choice of filter length. Filtering is done using the R routine “kernsm” from the package “aws”.
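For readers who want to experiment with the scale separation, the Gaussian filter can be sketched in a few lines of NumPy. This is not the “kernsm” routine used in the paper: the truncation of the kernel at three standard deviations and the edge padding at the domain boundaries are assumptions made here, and the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel_1d(sigma, truncate=3.0):
    """Normalized 1-D Gaussian weights, truncated at `truncate` * sigma."""
    radius = int(truncate * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def smooth(field, sigma=10.0):
    """Separable 2-D Gaussian smoothing of a (y, x) field, sigma in grid points."""
    k = gaussian_kernel_1d(sigma)
    pad = len(k) // 2
    # Repeat the edge values so the kernel never sees zeros outside the domain.
    padded = np.pad(field, pad, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="same")
    both = np.apply_along_axis(np.convolve, 0, rows, k, mode="same")
    return both[pad:-pad, pad:-pad]

# Toy field: a smooth gradient plus grid-scale noise.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:40, 0:50]
field = 0.05 * xx + rng.normal(0.0, 1.0, (40, 50))
large_scale = smooth(field, sigma=10.0)  # filtered ("filt") field
fine_scale = field - large_scale         # residual fine-scale field
```

The 2-D Gaussian is separable, so smoothing along rows and then along columns is equivalent to applying the 2-D kernel given above, apart from the boundary treatment.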

Fig. 1 Change in mean winter precipitation for the STD ensemble. Upper panels: forced response in precipitation, from left to right a full field, b filtered field and c fine-scale field. Lower panels: absolute value of the fine-scale forced response (d), in comparison with internal variability in the full, filtered and fine-scale fields (e, f, g)

Fig. 2 Same as Fig. 1, but now for mean winter temperature change from the STD ensemble. We note that the light grey area in the upper left and upper middle panels denotes a temperature rise approximately equal to the global temperature rise of 3.1 °C, and that the color bar for the anomalies in the upper right panel is shifted by 3.1 °C

Fig. 3 As Fig. 1, but now for the PGW ensemble

Fig. 4 As Fig. 2, but now for the PGW ensemble

The unfiltered fields for the control and future periods are denoted by a subscript “full”, while the smoothed, filtered fields are marked with “filt”. The filtered fields are derived by first filtering the control C as well as the future F period to mimic the coarse resolution information of a GCM, and then determining the fractional change. The fractional change R of member j is then given by:

$$\begin{aligned} R_{\mathrm{full},j} & = \left[ {\frac{{F_{{\text{full}},j} - C_{{\text{full}},j} }}{{C_{{\text{full}},j} }}} \right] \\ R_{\mathrm{filt},{{j}}} & = \left[ {\frac{{F_{\text{filt},j} - C_{\text{filt},j} }}{{C_{\text{filt},j} }}} \right] \\ \end{aligned}$$

and the residual fine-scale change

$${R}_{\mathrm{fine},j}= {R}_{\mathrm{full},j}- {R}_{\mathrm{filt},j}.$$

We note that with this definition of the fine-scale change pattern we assess added value, since it compares the relative changes from the full resolution with those derived from only coarse resolution. Alternatively, we could have computed the filtered response \({R}_{\mathrm{filt},j}\) by filtering the response pattern directly instead of the control and future period separately, in which case the residual can be interpreted as the fine-scale response contained in the full resolution response. In practice, however, these definitions are very similar, and for the absolute changes used for temperature they are even identical.
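The decomposition into full, filtered and fine-scale fractional changes can be sketched as follows. To keep the example short, a running mean stands in for the Gaussian filter; this substitution, the function names and the toy fields are assumptions made here. Any linear smoother gives the same structure.

```python
import numpy as np

def smooth(field, width=5):
    """Stand-in linear low-pass filter: running mean along both axes."""
    k = np.ones(width) / width
    pad = width // 2
    p = np.pad(field, pad, mode="edge")
    p = np.apply_along_axis(np.convolve, 1, p, k, mode="same")
    p = np.apply_along_axis(np.convolve, 0, p, k, mode="same")
    return p[pad:-pad, pad:-pad]

def fractional_changes(control, future):
    """Return (R_full, R_filt, R_fine) for one member."""
    r_full = (future - control) / control
    # Filter control and future separately, then form the fractional change.
    c_filt, f_filt = smooth(control), smooth(future)
    r_filt = (f_filt - c_filt) / c_filt
    return r_full, r_filt, r_full - r_filt

# Toy precipitation-like fields (strictly positive).
rng = np.random.default_rng(2)
control = 2.0 + rng.random((20, 25))
future = control * (1.1 + 0.1 * rng.random((20, 25)))
r_full, r_filt, r_fine = fractional_changes(control, future)

# For absolute changes, filtering before or after differencing is the same
# operation because the filter is linear.
abs_filt_first = smooth(future) - smooth(control)
abs_filt_after = smooth(future - control)
```

The last two arrays agree to rounding error, which illustrates why the two definitions coincide for the absolute changes used for temperature, whereas for fractional changes the division by the control field makes them differ slightly.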

2.3 Analysis of conditional predictability of a climate state

Predictions are studied within a perfect model approach. As such they are by construction conditional on the model, which is assumed to be perfect, and the chosen emission scenario. Usually, in the context of long-term climate change the term projections is used for model simulations, highlighting that these simulations are in fact highly conditional and should not be interpreted as a prediction. But here we used the word prediction to emphasize that within our perfect model approach we can compare the predictions with an actual truth.

In our approach one member i is taken as the truth, assuming we know its present-day climate state and aiming to predict its future climate state. We use data of the other members j (with j ≠ i, a non-truth member) to predict the future state of i. Cycling through i, this gives a matrix of predictions; for each i (with i = 1…16) 15 predictions based on the remaining members j are produced, in total 240 predictions.
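The bookkeeping of the prediction matrix is simple to reproduce; the snippet below only enumerates the (truth, predictor) pairs and uses variable names chosen here.

```python
from itertools import permutations

N_MEMBERS = 16

# All ordered pairs (i, j) with i the truth member and j != i the predictor.
pairs = list(permutations(range(1, N_MEMBERS + 1), 2))

# Predictor members available for each truth member.
predictors = {i: [j for (t, j) in pairs if t == i] for i in range(1, N_MEMBERS + 1)}

print(len(pairs))  # 16 * 15 = 240
```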

We note that for the sake of simplicity in the following equations, “i” in the left-hand side refers to the member to be predicted (and not a dependency), whereas “i” and “j” on the right-hand side denote the members that are used to predict (the dependencies). Further we note that three predictions of the future state do not depend on “i” (Eqs. 3, 4 and 7) and one not on “j” (Eq. 6). A more rigorous mathematical notation can be found in the Supplement.

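The first two prediction methods (hereafter also a prediction pair) are simple delta change techniques (Lenderink et al. 2007):

$$\begin{aligned} F_{\mathrm{pred}\,\mathrm{Full},i} & = C_{\mathrm{full},i}\left[1+R_{\mathrm{full},j}\right] \qquad (1) \\ F_{\mathrm{pred}\,\mathrm{Filt},i} & = C_{\mathrm{full},i}\left[1+R_{\mathrm{filt},j}\right]. \qquad (2) \\ \end{aligned}$$
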
The first prediction uses the change information derived from member j, and “adds” this change to the control climate state of i (which is assumed to be known) to predict its future state. The second prediction applies the same method, but only uses the large-scale pattern of change. Thus, by comparing the predictions based on Eq. 1 with those of Eq. 2, the added value of fine-scale information can be assessed. For completeness, we note that these equations are for relative changes; for absolute changes they read more simply as:

$$\begin{aligned} F_{{\text{pred}}\,{\text{Full}},i} & = C_{{\text{full}},i} + R_{\mathrm{full},j} ;\quad R_{\mathrm{full},j} = F_{{\text{full}},j} - C_{{\text{full}},j} \\ F_{{\text{pred}}\,{\text{Filt}},i} & = C_{{\text{full}},i} + R_{\mathrm{filt},j} ;\quad R_{\mathrm{filt},j} = F_{{\text{filt}},j} - C_{{\text{filt}},j} . \\ \end{aligned}$$

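The predictions above are based on the estimate of the control climate from a single realization only, which may deviate from the model climatology due to internal variability. In order to assess the potential gain in prediction skill by better knowledge of the present-day climatology, we evaluate the following prediction pair:

$$\begin{aligned} F_{\mathrm{pred}\,\mathrm{Full}\,\mathrm{Cmean},i} & = C_{\mathrm{full},m}\left[1+R_{\mathrm{full},j}\right] \qquad (3) \\ F_{\mathrm{pred}\,\mathrm{Filt}\,\mathrm{Cmean},i} & = C_{\mathrm{full},m}\left[1+R_{\mathrm{filt},j}\right], \qquad (4) \\ \end{aligned}$$
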
where the “m” denotes the 16-member ensemble control mean (“Cmean” on the left). We note that in practice we do not know how the 30-year control climate state deviates from its real climatology, but by comparing this approach with the previous prediction set (Eqs. 1, 2) we can estimate how uncertainty in the present-day observed climate state affects future predictions.

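In the next approach we assume Perfect knowledge of the Large-Scale change (PLS). In that case the two predictions are:

$$\begin{aligned} F_{\mathrm{pred}\,\mathrm{Full}\,\mathrm{PLS},i} & = C_{\mathrm{full},i}\left[1+R_{\mathrm{filt},i}+R_{\mathrm{fine},j}\right] \qquad (5) \\ F_{\mathrm{pred}\,\mathrm{Filt}\,\mathrm{PLS},i} & = C_{\mathrm{full},i}\left[1+R_{\mathrm{filt},i}\right]. \qquad (6) \\ \end{aligned}$$
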
This pair assesses the hypothetical case in which the GCM is perfectly able to predict large-scale changes, and raises the question whether improvement in prediction skill can be expected by adding imperfect knowledge from member j on the fine-scale change. We note that this set, and in particular Eq. 6, is also relevant in a matrix filling context (Christensen and Kjellström 2021; see Sect. 4). Also note that Eq. 5 is similar to Eq. 1, but now with the filtered response field from member j, replaced by the true filtered response field from member i.

Finally, we note that the first prediction in Eq. 1 can be written differently:

$${F}_{\mathrm{pred}\,\mathrm{Full},i}={C}_{\mathrm{full},i}\left[1+{R}_{\mathrm{full},j}\right] = {F}_{\mathrm{full},j} \frac{{C}_{\mathrm{full},i}}{{C}_{\mathrm{full},j}}.$$

So, this prediction can be written as a direct model prediction of the future state \({F}_{\mathrm{full},j}\), modified by a correction term given by the fractional difference between the control periods of the two members i and j. This correction term is essentially measuring the internal climate noise component and how much members i and j deviate by chance. This term has the same form as a simple “bias” correction, assuming that the bias from the control period carries over to the future period.
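The intermediate step behind this rewriting is the substitution of the definition of the fractional change of member j, which we spell out here for completeness:

$$C_{\mathrm{full},i}\left[1+R_{\mathrm{full},j}\right] = C_{\mathrm{full},i}\left[1+\frac{F_{\mathrm{full},j}-C_{\mathrm{full},j}}{C_{\mathrm{full},j}}\right] = C_{\mathrm{full},i}\,\frac{F_{\mathrm{full},j}}{C_{\mathrm{full},j}}.$$

The factor \(C_{\mathrm{full},i}/C_{\mathrm{full},j}\) equals one wherever the two control states coincide, in which case the delta change prediction reduces to the future state of member j.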

Given that in our approach, the model is assumed to be perfect and therefore \({F}_{\mathrm{full},j}\) can be considered as an unbiased estimate of \({F}_{\mathrm{full},i}\), we also considered the following direct prediction

$${F}_{\mathrm{pred}\,\mathrm{Direct},i} = {F}_{\mathrm{full},j}. \qquad (7)$$
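The seven predictions can be collected in one small function. This is a sketch for relative changes, assuming the forms implied by the descriptions in this section (control state times one plus a fractional change, with the mean control state or the true large-scale change of member i substituted where stated); the function name, argument names and dictionary keys are hypothetical.

```python
import numpy as np

def predictions(c_i, c_j, f_j, c_mean, r_filt_i, r_filt_j):
    """Seven candidate predictions of the future state of truth member i.

    c_i, c_j : control states of members i and j (2-D fields)
    f_j      : future state of member j
    c_mean   : ensemble-mean control state
    r_filt_* : filtered fractional changes of members i and j
    """
    r_full_j = (f_j - c_j) / c_j
    r_fine_j = r_full_j - r_filt_j
    return {
        "full": c_i * (1 + r_full_j),                 # Eq. 1
        "filt": c_i * (1 + r_filt_j),                 # Eq. 2
        "full_cmean": c_mean * (1 + r_full_j),        # Eq. 3
        "filt_cmean": c_mean * (1 + r_filt_j),        # Eq. 4
        "full_pls": c_i * (1 + r_filt_i + r_fine_j),  # Eq. 5
        "filt_pls": c_i * (1 + r_filt_i),             # Eq. 6
        "direct": f_j,                                # Eq. 7
    }

# Toy example with random positive fields.
rng = np.random.default_rng(3)
shape = (6, 7)
c_i, c_j, f_j, c_mean = (1.0 + rng.random(shape) for _ in range(4))
r_filt_i = 0.1 * rng.random(shape)
r_filt_j = 0.1 * rng.random(shape)
p = predictions(c_i, c_j, f_j, c_mean, r_filt_i, r_filt_j)
```

Two properties of this sketch follow directly from the definitions: `p["full"]` equals `f_j * c_i / c_j`, and the PLS prediction with fine-scale information reduces to the reference prediction when the large-scale changes of members i and j coincide.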

A simple analysis of the error characteristics of different predictions is given in the “Appendix”, and discussed below where appropriate. The “Appendix” also shows these predictions in a table for quick reference in conjunction with the figures.

The skill of a prediction is measured by two different indices, the spatial correlation between predicted \({F}_{\mathrm{pred},i}\) and actual future climate state \({F}_{i}\), and a mean absolute error which is defined by:

$$\begin{aligned} E_i & = \left\langle \left| \frac{F_{\mathrm{pred},i}}{F_{i}} - 1 \right| \right\rangle \quad \text{for precipitation (as a fraction), and} \\ E_i & = \left\langle \left| F_{\mathrm{pred},i} - F_{i} \right| \right\rangle \quad \text{for temperature (in } ^{\circ}\text{C)}, \\ \end{aligned}$$

where |..| is the absolute value and < .. > the spatial mean over the analysis domain. We note that since the predictions depend on j, these errors also depend on j, thus providing a matrix i, j of errors. The exception is \({F}_{\mathrm{pred}\,\mathrm{Filt}\,\mathrm{PLS},i}\) (Eq. 6), which does not depend on j, and for which we used the same error for all j.
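The two skill measures translate directly into code. The sketch below assumes an unweighted mean over all grid points of the analysis domain (no area weighting), which the text does not specify; the function names are chosen here.

```python
import numpy as np

def mean_relative_error(pred, truth):
    """Spatial mean of |pred / truth - 1| (precipitation, as a fraction)."""
    return np.mean(np.abs(pred / truth - 1.0))

def mean_absolute_error(pred, truth):
    """Spatial mean of |pred - truth| (temperature, in degrees C)."""
    return np.mean(np.abs(pred - truth))

def spatial_correlation(pred, truth):
    """Pearson correlation between the two fields over all grid points."""
    return np.corrcoef(pred.ravel(), truth.ravel())[0, 1]

truth = np.array([[1.0, 2.0], [4.0, 5.0]])
print(mean_relative_error(1.1 * truth, truth))  # approximately 0.1
print(mean_absolute_error(truth + 0.5, truth))  # 0.5
```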

3 Results

3.1 Forced response and natural variability

The forced response in winter mean precipitation change shows a general pattern of more precipitation across the European continent (Fig. 1a). Most of the response is large scale (Fig. 1b). Yet, a substantial fine-scale response pattern is present, primarily related to the orography: in western Scandinavia with a weaker increase near the coast and a stronger increase behind the mountain range, and in the Alpine region with a weaker response over the higher orography and a stronger response to the north and south. Also, the lower mountain ranges can be recognized as weak fine-scale patterns in the forced precipitation response (Fig. 1c).

The winter precipitation climate is highly variable and characterized by large internal variability unrelated to climate change (see also Supplement showing the anomalies in response in the first 4 ensemble members). To quantify the internal variability, we use two times the ensemble standard deviation (~ 68% range) of the climate change signal from the 16 members. This is done for the full field, the filtered field, as well as the residual fine-scale field (Fig. 1e, f, g). Most of the natural variability is clearly in the large-scale pattern, yet in the Alpine region large-scale and fine-scale contributions are of the same order. Generally, the fine-scale forced signal is substantially smaller in amplitude than the internal variability component, signifying low signal-to-noise values of fine-scale response patterns.

For temperature, the situation is more favourable in a signal-to-noise sense (Fig. 2). The fine-scale forced patterns are still smaller than the overall internal variability. However, the fine-scale forced signal is at least larger than the fine-scale internal variability in topographic areas, like the Alps.

Spatial patterns of forced response and internal variability produced by the PGW approach are shown in Figs. 3 and 4. For precipitation, this ensemble produces slightly higher mean changes (Fig. 3a) and a much better signal-to-noise ratio (Fig. 5) than the STD ensemble, but overall the change patterns are quite similar (pattern correlation of 0.94 between full forced response in PGW and STD ensemble, and 0.83 for the fine-scale pattern only; see also Fig. 6a, b). As expected, internal variability is much smaller in the PGW ensemble, at both large and fine scales (Fig. 3e, f, g). Apparently, constraining the large-scale circulation in the PGW approach is sufficient to also substantially reduce the internal variability at fine scales. This holds to a lesser extent for temperature, where the PGW approach still contains some internal variability, though reduced substantially compared to the standard ensemble (Fig. 4). We also note that for temperature the PGW ensemble underestimates the overall warming at larger scales, yet gives reliable estimates of the fine-scale component (pattern correlation of fine-scale pattern of 0.99; see also Fig. 7b).

Fig. 5 Signal-to-noise (S2N, mean forced change pattern divided by 2 times inter-member standard deviation) for precipitation change in the STD ensemble (upper) and PGW ensemble (lower). From left to right, S2N in fine-scale pattern, S2N in fine-scale pattern with respect to standard deviation in full response, and S2N in the coarse-scale pattern

Fig. 6 Taylor diagrams showing correlation and standard error of mean winter precipitation change between the forced response (reference) and different projections (e.g. individual members of the ensemble). The reference is given on top; e.g. in the left top panel each member of the STD (orange) and PGW (green) ensemble is compared to the full forced response of the STD ensemble (note the comparison to the PGW forced response in the right-hand plots). Upper plots compare individual members of the PGW (green) and STD (orange) members to the forced response showing AV of the PGW approach; lower plots compare filtered (blue) and full resolution, unfiltered (red) results showing AV of fine-scale information

Fig. 7 Same as Fig. 6 but now for mean winter temperature change

To proceed, we discuss signal-to-noise (defined as the ratio between the forced change signal and two times the inter-member standard deviation) for the fine-scale change pattern and the smoothed change pattern. For precipitation, signal-to-noise ratios are clearly much higher in the PGW ensemble than in the STD ensemble, both for the large and fine scale changes (Fig. 5). The high signal-to-noise for the fine-scale pattern in the PGW experiment even holds when comparing to the standard deviation from the full response pattern (middle panels). The STD ensemble shows (very) low signal-to-noise ratios, in particular for the fine-scale pattern. For temperature (see Supplement) differences are not as pronounced, but again the PGW experiment is characterized by better signal-to-noise ratios.
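The signal-to-noise definition can be written as a short function. The optional second argument, which lets the noise be taken from a different field than the signal (as in the comparison of the fine-scale signal with the standard deviation of the full response), and the use of the sample standard deviation are choices made in this sketch.

```python
import numpy as np

def signal_to_noise(signal_changes, noise_changes=None):
    """Forced change divided by two times the inter-member standard deviation.

    signal_changes : array (member, y, x) of changes supplying the forced signal
    noise_changes  : optional array (member, y, x) supplying the noise;
                     defaults to signal_changes
    """
    if noise_changes is None:
        noise_changes = signal_changes
    forced = signal_changes.mean(axis=0)
    noise = 2.0 * noise_changes.std(axis=0, ddof=1)
    return forced / noise

# Two members with changes 0 and 2 at a single grid point:
# the forced change is 1 and the sample standard deviation is sqrt(2).
toy = np.array([0.0, 2.0]).reshape(2, 1, 1)
s2n = signal_to_noise(toy)
```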

One may ask how well the systematic response patterns can be approximated by a single member of the ensemble. We use a Taylor diagram (Taylor 2001) to show correlation, root mean square difference, and the standard deviation of the forced response pattern and the change in single members. Typically, for mean winter precipitation individual STD ensemble members have a pattern correlation of 0.7–0.85, and errors of 0.04–0.1 (root mean square difference of the fractional precipitation change; Fig. 6a). Focusing on the fine-scale pattern, spatial correlation further deteriorates, in four members even below 0.5 (Fig. 6b). The PGW ensemble clearly shows much more consistent results, with a spatial correlation of more than 0.9 for the full response field and 0.75–0.8 for the fine-scale response and smaller errors (Fig. 6a, b). All individual PGW members are (much) closer to the full as well as the fine-scale forced change patterns than any of the STD members. For temperature, differences between the PGW approach and the STD ensemble are less clear (Fig. 7). Spatial correlations are comparable or slightly higher in the PGW ensemble and the errors are generally smaller in the PGW ensemble, with no outlier simulations like in the STD ensemble. As expected, in this case the PGW ensemble is (much) less affected by internal variability.
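The quantities displayed in a Taylor diagram can be computed as sketched below. We assume here the centred form of the root mean square difference, which is the form that satisfies the geometric relation of the diagram; whether the error values quoted in the text are centred is not stated. The function name and toy fields are hypothetical.

```python
import numpy as np

def taylor_statistics(member, reference):
    """Return (correlation, centred RMS difference, std of member, std of reference)."""
    m = member.ravel() - member.mean()
    r = reference.ravel() - reference.mean()
    corr = (m @ r) / np.sqrt((m @ m) * (r @ r))
    crmsd = np.sqrt(np.mean((m - r) ** 2))
    return corr, crmsd, m.std(), r.std()

# Toy example: a reference pattern and one noisy member.
rng = np.random.default_rng(4)
reference = rng.normal(size=(10, 12))
member = reference + 0.5 * rng.normal(size=(10, 12))
corr, crmsd, s_m, s_r = taylor_statistics(member, reference)

# Law-of-cosines relation underlying the Taylor diagram.
rhs = s_m**2 + s_r**2 - 2.0 * s_m * s_r * corr
```

The identity `crmsd**2 == rhs` (to rounding error) is what allows correlation, standard deviation and centred difference to be shown in a single two-dimensional plot.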

By comparing the full and the filtered precipitation change patterns projected by the individual members to the full forced response, a measure of added value of the high resolution can be determined (Fig. 6d). In the case of the STD ensemble, neither the correlation nor the standard error improves going from the filtered to the full change field. Although there is a systematic fine-scale forced response pattern, it is clear that in the full change patterns of the individual members the forced response pattern does not emerge from the noise. The situation is better for the PGW runs, which show small improvements (mostly in correlation) when adding the fine-scale patterns (Fig. 6e), and much better still when comparing the PGW members to the PGW forced signal (Fig. 6f).

For mean winter temperature change, added value of fine-scale information is present (Fig. 7d, e). In general, changes including the fine-scale patterns derived from single members provide a better predictor of the systematic response pattern than the large-scale changes only. Also, most individual members already perform better than the ensemble mean large-scale change (Fig. 7d, red dots compared to light blue dot), showing that even with uncertain large-scale changes, in this case downscaling already adds value to the simulations.

Summarizing, both for mean winter temperature and precipitation, fine-scale patterns in the forced signal exist, signifying added value in the forced signal resulting from high resolution modelling. Yet, a robust estimate of this fine-scale forced signal is not always obtained from single model simulations, in particular for the more variable precipitation changes.

3.2 Predictions of the future climate state

We now take a complementary view, and ask how well the future climate state can be predicted. As explained in the methods we take one member i to be the truth, and try to predict its future state using information from the remaining members j, resulting in a matrix of 240 predictions (16 times 15 predictions). We use two measures of the quality of these “predictions”, the spatial correlation between the real and predicted climate state, and the spatial mean of the absolute (temperature) or relative (precipitation) error (see methods). Here, we focus on the error, and results for correlation are shown in the Supplement.

The temperature and precipitation error of the seven predictions (Sect. 2.3) derived from the STD ensemble are shown in Figs. 8 and 9, respectively. In these figures, we plot the reference prediction method, \({F}_{\mathrm{pred}\,\mathrm{Full},i}\) (the delta change method based on the full change derived from member j, Eq. 1) on the top left position (a). Shown is the percentage of predictions that improve (negative, when most predictions get worse) compared to the reference prediction method, as well as the mean error averaged over all 240 predictions. As summarized in Table 1 (“Appendix”) the prediction methods using high resolution change information are in the top row of each figure (from left to right, Eqs. 1, 3, 5, 7) and those using only low resolution are in the bottom row (from left to right, Eqs. 2, 4, 6). By comparing the three prediction pairs, for instance Eq. 1 with Eq. 2, we measure added value (see methods).
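The summary numbers reported for each panel can be derived from the error matrices as sketched below. The sketch assumes 16 × 16 matrices indexed by (i, j) whose diagonal is unused, and it reports the plain percentage of improved predictions; the sign convention for cases where most predictions get worse is not reproduced. The function name and toy matrices are hypothetical.

```python
import numpy as np

def summarize(errors, reference):
    """Summary statistics over the off-diagonal (i != j) predictions.

    errors, reference : (n, n) error matrices of a method and of the reference method
    """
    off = ~np.eye(errors.shape[0], dtype=bool)
    e, ref = errors[off], reference[off]
    return {
        "n": int(e.size),
        "mean": float(e.mean()),
        "p05": float(np.percentile(e, 5)),
        "p95": float(np.percentile(e, 95)),
        "pct_improved": float(100.0 * np.mean(e < ref)),
    }

# Toy matrices: the method halves the reference error everywhere.
reference = np.ones((16, 16))
stats = summarize(0.5 * reference, reference)
```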

Fig. 8 Mean absolute error (in degrees) of the predictions of the future state for mean winter temperature using the GCM driven ensemble. The ith member predicted is along the x-axis, whereas the jth member used is on the y-axis. The top row shows predictions (left to right) from Eqs. 1, 3, 5 and 7, whereas the bottom row shows predictions from Eqs. 2, 4 and 6 (see also Table 1 in the “Appendix”). Numbers on top of each panel give, at the left, the mean error over all predictions (including the 5th/95th percentiles of the 240 predictions) and, at the right, the percentage of predictions improving compared to the reference prediction from the STD ensemble (shown in the top left panel)

Fig. 9 Same as Fig. 8, but now for mean winter precipitation and for the mean relative error (fraction)

Fig. 10 Same as Fig. 9, but now taking the information for the “j” member from the PGW runs (taking the change information from the PGW runs). Note that the percentages of improved predictions shown at the top right are relative to the reference prediction from the STD ensemble (Fig. 9a)

For temperature adding the high resolution information improves the quality of the prediction (both in terms of mean error across all predictions as well as the number of predictions improving). From left to right, starting from the mean control climate state improves the prediction (Fig. 8b), but a much bigger improvement can be obtained by using the perfect large-scale change (Fig. 8c).

Interestingly, the direct prediction (Fig. 8d) is the best prediction, apart from the prediction with perfect large-scale changes. This may look surprising, but can actually be easily understood from a simple analysis of the errors, giving a ~ 40% improvement from the direct prediction with respect to the reference prediction \({F}_{\mathrm{pred}\,\mathrm{Full},i}\) (see “Appendix”). In words, in a perfect model approach and without further knowledge on e.g. the large-scale change, the direct prediction (top right) is optimal. The reference (or full) prediction can be written as the direct prediction multiplied by a correction term based on the difference between control state of the ith and the jth member which introduces two additional error terms due to internal variability. The prediction starting from the mean climate state has only one additional error term due to internal variability and behaves, in this respect, in the middle between the direct prediction and the full prediction.
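The counting of error terms can be made explicit with a short sketch for absolute changes. It rests on simplifying assumptions made here, which are not necessarily those of the “Appendix”: each climate state is the model climatology plus an independent noise term with the same variance \(\sigma^2\), written \(C_i = \overline{C} + c_i\) and \(F_i = \overline{F} + f_i\), and the noise remaining in the ensemble-mean control state is neglected. With \(\varepsilon\) denoting the prediction error (a shorthand introduced here) of the direct prediction, the prediction starting from the mean control state, and the reference prediction, respectively:

$$\begin{aligned} \varepsilon_{\mathrm{direct}} & = F_j - F_i = f_j - f_i, & \mathrm{Var} & = 2\sigma^2, \\ \varepsilon_{\mathrm{Cmean}} & = \overline{C} + F_j - C_j - F_i = f_j - f_i - c_j, & \mathrm{Var} & = 3\sigma^2, \\ \varepsilon_{\mathrm{full}} & = C_i + F_j - C_j - F_i = f_j - f_i + c_i - c_j, & \mathrm{Var} & = 4\sigma^2. \\ \end{aligned}$$

Under these assumptions the typical error of the reference prediction is a factor \(\sqrt{2}\approx 1.4\) larger than that of the direct prediction, which is of the same order as the ~ 40% difference quoted above, and the prediction starting from the mean control state falls in between.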

For winter precipitation in Fig. 9, the best predictions use the perfect large-scale change (Fig. 9c, g), and the direct prediction is again the best remaining prediction (Fig. 9d). So, the ordering from left to right is basically as expected from the error analysis. However, in this case, adding fine-scale changes does not “add value” to the predictions, but on average makes the predictions slightly worse. Hence, with the large internal variability in the fine-scale change patterns, it is not guaranteed that adding those leads to better information in a prediction sense (see also summary statistics in Figs. 11, 12).

Fig. 11 Mean of the absolute value of temperature bias in winter across the full prediction matrix. Left two panels, mean over the domain, STD (a) and PGW (b) based predictions; right two panels, mean over the Alpine region (c, d). In red are predictions based on full changes, in blue predictions based on only large-scale changes.

Fig. 12 As Fig. 11, but now for the absolute spatial mean of the relative error in winter mean precipitation across the full prediction matrix.

Since the PGW members are (much) better at projecting the systematic forced responses at large as well as small scales, using them may give better predictions of the future state as well. In this case, the change from member “j” in the prediction equations (Eqs. 17) is taken from the PGW runs. Except for the direct prediction (top right), all predictions are indeed better when using the PGW changes instead of the changes derived from the STD ensemble. But even in the PGW approach, adding the fine-scale changes does not consistently improve the predictions, and only marginal differences are found.

For temperature the skill of the PGW approach is comparable to the standard approach. Although internal variability is smaller in the PGW ensemble, giving the potential for more reliable predictions, this advantage is offset by a slightly biased response in the PGW approach as compared to the standard ensemble; the systematic temperature response of the PGW approach in Fig. 4 is lower than in the standard ensemble in Fig. 2.

So far we have analysed the behaviour over the full domain, which may hide beneficial effects of the fine-scale information in certain areas. Therefore, we also computed statistics averaged over the Alpine region (between 44 and 48°N, and 4 and 20°E). For temperature (Fig. 11), general improvements are found by adding the high resolution change information (compare the red and blue prediction pairs), and the improvements become more pronounced for the Alpine region. In contrast to the full domain, the PGW based predictions are now better than the STD based predictions for the Alpine region (Fig. 11c, d). For precipitation (Fig. 12), results for the full domain and the Alpine region are almost the same. For the STD ensemble, predictions deteriorate when adding fine-scale changes, whereas with the PGW approach there is no substantial difference (with slight improvements for the Alpine region).
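The regional statistic can be sketched as follows, as a minimal illustration rather than the procedure actually used. It assumes a regular latitude-longitude grid and cosine-latitude area weights; the model's native grid and the averaging applied in the study may differ. The function name and the `ALPINE_BOX` constant are hypothetical.

```python
import math

# Box quoted in the text: 44-48 degrees N, 4-20 degrees E.
ALPINE_BOX = {"lat_min": 44.0, "lat_max": 48.0,
              "lon_min": 4.0, "lon_max": 20.0}


def regional_mean_abs_error(lats, lons, error, box=ALPINE_BOX):
    """Area-weighted mean of |error| over grid points inside `box`.

    `error[i][j]` is the prediction error at latitude `lats[i]` and
    longitude `lons[j]` (degrees); weights are cos(latitude), an assumption
    that is only appropriate for a regular latitude-longitude grid.
    """
    total = 0.0
    weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not box["lat_min"] <= lat <= box["lat_max"]:
            continue
        weight = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            if box["lon_min"] <= lon <= box["lon_max"]:
                total += weight * abs(error[i][j])
                weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no grid points inside the box")
    return total / weight_sum
```

Applying the same function with a box covering the whole grid gives the full-domain counterpart, so the two panels of each figure pair differ only in the mask.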

4 Discussion

We found fine-scale patterns in the systematic forced response, confirming results in other studies (Torma et al. 2015; Giorgi et al. 2016). However, these systematic fine-scale patterns can be rather small compared to internal variability, at large as well as at fine scales. Whether the fine-scale response patterns add value for users depends on their perspective. If that perspective is primarily based on change information due to global warming (for instance because they have an application where the change in risk is key), they may be mostly interested in the forced response. Yet other user applications may be more sensitive to natural climate variability. Here, we also note that internal variability depends on the time window used (commonly 30 years), which may not correspond to the time period relevant for the user.

Our results show that, within the perfect model framework, a simple delta change approach using a large-scale change that is assumed to be known generally produces very good results in a prediction sense. Since the large-scale change is often dominated by internal variability, which is essentially unpredictable, this is of limited practical relevance to users. However, the information can be used for benchmarking statistical emulators of high resolution climate information, such as those used to fill in missing RCM simulations in a GCM-RCM matrix (Christensen and Kjellström 2021; Doury et al. 2022). At a minimum, a statistical emulator should perform better than the present-day climatology plus the coarse resolution change derived from the low resolution model.

We have used a perfect model approach, assuming that there are no model errors and that the spatially smoothed signal perfectly represents the coarse scale information. It is, however, important to make two remarks in this respect.

First, the reason that the direct model prediction of the future climate state (Eq. 7) is better than the delta change approach, which superimposes changes on the present-day climate state, is related to the perfect model assumption. The delta change approach involves a correction term (as compared to the direct approach) based on the difference between two realizations of the present-day climate state, which introduces two additional error terms related to internal variability. Yet these additional terms are equivalent in form to a simple bias correction. Thus, once one recognizes that models are usually biased and corrects this bias based on the control period results, the direct approach essentially becomes equivalent to the full delta change approach, again with two additional error terms due to natural variability.
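This equivalence can be made explicit in a short sketch, assuming the additive form of the delta change approach and using shorthand introduced here for illustration: \(C_j\) and \(F_j\) for the control and future state of the model member, and \(C_i\) for the reference control state (the truth member in the perfect model setting, or observations in practice). Estimating the bias from the control period as \(C_j - C_i\) and subtracting it from the direct prediction \(F_j\) gives

\[
F_j - (C_j - C_i) = C_i + (F_j - C_j),
\]

which is the full delta change prediction. The term \(C_j - C_i\) contains the internal variability of both control realizations, which is where the two additional error terms come from.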

Second, with our filtering approach we neglect that added value from the high resolution model runs can also occur at larger scales. The interaction of the large-scale flow with, for instance, the orography can improve precipitation amounts also when aggregated to the larger scale. We cannot estimate this within our modelling context. In principle, one could look at the difference between the driving GCM and the filtered RCM information. However, since the models have different physical parameterizations, it is difficult, if not impossible, to disentangle which part of the difference is due to the higher resolution and which part is due to the different model physics.

The PGW ensemble generally gives better signal-to-noise ratios; that is, smaller internal variability compared to the forced response signal. In particular, fine-scale change patterns appear rather robustly simulated in single model members. However, this improvement in signal-to-noise should be weighed against biases in the change introduced by the PGW approach. For the winter season, it appears that PGW is able to capture the forced response rather well, in particular at the seasonal mean time scale. Also, fine-scale patterns in winter temperature are well simulated; besides the mean winter temperature, we also tested cold days (1st percentile) and warm days (99th percentile) and found (very) high correlations in the fine-scale pattern in the PGW ensemble compared to the standard ensemble.
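As a minimal sketch of what such a signal-to-noise ratio can look like at a single grid point, the snippet below divides the absolute ensemble-mean change by the standard deviation across members. This definition is an assumption made for illustration and may differ from the measure used in the analysis; the function name and the numbers are hypothetical.

```python
import statistics


def signal_to_noise(member_changes):
    """Ratio of |ensemble-mean change| to the inter-member standard deviation.

    Assumed definition for illustration; `member_changes` holds the change
    (future minus control) of each ensemble member at one grid point.
    """
    if len(member_changes) < 2:
        raise ValueError("need at least two members to estimate the spread")
    forced = statistics.mean(member_changes)   # estimate of the forced response
    noise = statistics.stdev(member_changes)   # sample std (n - 1 denominator)
    if noise == 0.0:
        return float("inf")
    return abs(forced) / noise


# Made-up numbers: identical mean change, different spread across members.
wide_spread = [0.5, 2.0, 3.5]
narrow_spread = [1.8, 2.0, 2.2]
snr_wide = signal_to_noise(wide_spread)
snr_narrow = signal_to_noise(narrow_spread)
```

With the same mean change, the ensemble with the smaller spread has the larger ratio, which is the sense in which a reduction of internal variability improves signal-to-noise. A bias in the mean change itself is not visible in this ratio and has to be assessed separately.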

In this paper, we studied the winter season, motivated by the existence of pronounced small-scale forced change features as well as the considerable internal variability. Our results do not necessarily carry over to other seasons. We noticed, for instance, that for the summer season the response of rainfall is highly dependent on atmospheric lapse rate changes as provided through the boundaries, creating large biases in the response of our PGW experiment. However, other PGW experiments performed on a larger domain gave (much) better results for summer, more consistent with other published results (Brogli et al. 2019a, b). An in-depth analysis of the degree to which the PGW approach is able to reproduce the forced response from the standard ensemble is outside the scope of this paper. However, considering the improvement in signal-to-noise shown here, this is definitely worth exploring further, in particular considering the high computational demands of regional model simulations.

5 Conclusion

We investigated the added value of fine-scale information in a perfect model framework, in relation to uncertainty due to internal variability. We used a 16-member single-model initial-condition regional climate model ensemble by RACMO2 embedded into EC-Earth2.3, and defined fine- and large-scale information by a spatial filter. In addition to the standard ensemble, we also produced a pseudo global warming (PGW) ensemble for the future period. We studied the change in mean winter precipitation and temperature between the control (1991–2020) and future (2071–2100) periods.

The ensemble mean response (the forced response) shows systematic changes at fine scales, predominantly related to the orography, confirming results documented in the literature. Comparing the forced response with the variability within the ensemble members (the internal or natural variability), the fine-scale response pattern is rather small for precipitation, which shows strong internal variability at large and fine scales. This implies that single members provide only limited information on the forced response, and the added value of fine-scale information is hard to prove. This can potentially be improved substantially by using the PGW approach. For winter precipitation, a single PGW member contains much more reliable information on the forced response, including its fine-scale component, than a single ‘normal’ ensemble member. This advantage of the PGW approach, however, should be weighed against biases in the response introduced by the approach; for example, the slightly higher forced precipitation response and lower temperature response (Figs. 1, 2, 3, 4). For temperature, signal-to-noise is much better, and single model simulations do provide information on fine-scale systematic changes. In this case, the improvements by the PGW approach are less pronounced.

We further studied how well a future climatic state can be predicted, assuming one ensemble member to be the truth and using change information from the other members. In this purely predictive setting, adding fine-scale information usually improves the predictions for temperature. However, this is not the case for precipitation: adding fine-scale change information from the normal (STD) ensemble generally degrades the predictions, whereas the predictions using the PGW ensemble neither improve nor degrade when the fine scales are added (with some indication of improvements for the Alpine region).

In this paper we took two rather extreme points of view: a change perspective and a purely predictive perspective. In practice, however, user perspectives will probably lie between these two. To some extent, users will weigh information from systematic changes against uncertainty due to random, unpredictable noise. To some extent, users are also likely to need absolute values (the climatic state), as decisions are often based on absolute criteria. This also depends on the user's time horizon. Short-term decisions are usually taken in a more predictive setting (such as decadal forecasts), whereas longer-term decisions may depend more strongly on, for instance, the change in risks. The added value of fine scales in this study is most apparent in the (systematic) change perspective, whereas in a purely predictive perspective the user may often be best served with only the large-scale changes, in particular when the unpredictable climate noise component is large. However, we also acknowledge that this definitely needs further investigation.