# Synchrony between reanalysis-driven RCM simulations and observations: variation with time scale


## Abstract

Unlike coupled global climate models (CGCMs), which run in stand-alone mode, nested regional climate models (RCMs) are driven by either a CGCM or a reanalysis dataset. This feature makes high correlations between the RCM simulation and its driver possible. When the driving dataset is a reanalysis, time correlations between RCM output and observations are also common and to be expected. In certain situations the time correlation between driver and driven RCM is of particular interest, and techniques have been developed to increase it (e.g. large-scale spectral nudging). For such cases, a question that remains open is whether aggregating in time increases the correlation between RCM output and observations. That is, even if the RCM is unable to reproduce a given daily event, can it still satisfactorily simulate an anomaly on a monthly or annual basis? That it can is a preconception that the authors of this work and others in the community have held, perhaps as a natural extension of the properties of upscaling or aggregating other statistics such as the mean squared error. Here we explore analytically four particular cases that help us partially answer this question. In addition, we use observational datasets and RCM-simulated data to illustrate our findings. Results indicate that time upscaling does not necessarily increase time correlations, and that those interested in achieving high monthly or annual time correlations between RCM output and observations may have to do so by increasing correlation as much as possible at the shortest time scale. This may indicate that, even when one is concerned only with time correlations at large temporal scales, large-scale spectral nudging acting at the time-step level may have to be used.

## Keywords

Regional climate model · Spectral nudging · RCM-observations synchronicity

## 1 Introduction

General statistical properties of nested regional climate model (RCM)-generated simulations may seem at first sight similar to those produced by coupled global climate models (CGCMs). Similarities in fact exist when RCM-simulated fields are studied without consideration of the chronology of events, such as climatological means or frequency distributions. But due to the control exerted by the lateral boundary conditions upon nested RCM simulations, the RCM-simulated fields do exhibit some synchronicity with their driving data, whether GCM-simulated fields or reanalyses. When driven by reanalyses, RCM-simulated fields are also correlated with observations to some extent.

For several applications such as projected future climate changes, synchronicity is not an issue as only the overall trend matters. On the other hand, synchronicity is paramount in applications such as dynamical downscaling of reanalyses aiming at obtaining high-resolution pseudo observations. This last application was first suggested by Anthes (1983) and has become an active area of research over the last few years (e.g. Li et al. 2012; Stefanova et al. 2011; Weisse et al. 2009; Kanamaru and Kanamitsu 2007). A simple proof-of-concept exercise of any of these applications is to verify whether and under which conditions RCMs’ anomalies correlate well with those of the driving data. von Storch et al. (2000) have shown that the application of large-scale spectral nudging greatly improves the time correlation between a reanalysis-driven simulation and observations, although the match is not perfect (e.g. Alexandru et al. 2009).

In applications of RCM downscaling for which synchronicity matters, an important issue relates to the way chronology-related statistical properties vary with time scale. For example, how does the time correlation between the RCM-simulated fields and the corresponding driving fields or observations change when the time series are considered on a daily, weekly, monthly, seasonal or annual basis?

A naïve view might lead one to think that time upscaling (aggregation) should necessarily improve time correlation and apparent synchronicity, by analogy with temporal or spatial upscaling, which generally improves scores such as the root-mean-square error or bias by reducing unpredictable noise. A typical example is “hedging” in weather forecasting, an operation that uses low-pass filters or horizontal diffusion to remove poorly predicted finer scales (e.g. Jakimow et al. 1992) in order to avoid the “double penalty” problem typical of high-resolution simulations (e.g. Mass et al. 2002; Bougeault 2002). The objective of this work is to investigate, in a simple framework, whether this conjecture also holds for the time correlation of upscaled RCM-generated data.

This question can be generalized in the following way: How are weather anomalies in the driving data or observations reproduced by the RCM as a function of time scale? If the RCM and driving data are found to be somewhat asynchronous at some short time scale, can they be more synchronous on longer time scales? As simple as it may sound, the problem just stated is not trivial, and as far as the authors are aware, there exists no general solution without some strong assumptions. In what follows we explore analytically four particular cases that help us partially answer this question. In addition, we use observation datasets and RCM-produced data to illustrate our findings.

## 2 Analytical approaches

### 2.1 A very simple case

Here we discuss the simplest possible case, in which both the RCM-simulated and the observed time series have zero time autocorrelation and identical time variance. Despite its simplicity, this assumption can be somewhat realistic in some cases. Here \(m_{t}\) will refer to a reanalysis-driven RCM-simulation seasonal anomaly time series and \(o_{t}\) to the corresponding observed anomaly time series, with variances \(\text{var} \left( {m_{t} } \right) = \text{var} \left( {o_{t} } \right) = \sigma^{2}\). In the following we will refer to ‘observations’ only, but all developments apply equally to driving data.
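As a sketch of the result discussed next, consider averaging pairs of consecutive values, \(A_{o} = \left( {o_{t + 1} + o_{t} } \right)/2\) and \(A_{m} = \left( {m_{t + 1} + m_{t} } \right)/2\). This two-point form of the averaging operator is assumed here for illustration. Writing \(r = corr(o_{t} ,m_{t} )\) and assuming in addition that lagged cross-covariances such as \(\text{cov}(o_{t} ,m_{t + 1} )\) vanish, one obtains

\[
\text{cov}(A_{o} ,A_{m} ) = \tfrac{1}{4}\left[ \text{cov}(o_{t + 1} ,m_{t + 1} ) + \text{cov}(o_{t} ,m_{t} ) \right] = \frac{r\sigma^{2}}{2}, \qquad \text{var}(A_{o} ) = \text{var}(A_{m} ) = \frac{\sigma^{2}}{2},
\]

\[
corr(A_{o} ,A_{m} ) = \frac{r\sigma^{2} /2}{\sigma^{2} /2} = r .
\]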

That averaging leaves the correlation unchanged may seem counterintuitive at first, as one might have expected time averaging to filter out noise and so increase the correlation (Sect. 2.4 will further illuminate this result). The covariance is indeed decreased, but the variances of the two variables are decreased in the same proportion, keeping the correlation unchanged. This is in fact a well-known result of multivariate statistics (see for example Johnson and Wichern 2007). Although theoretically the correlation is the same for the average, it should be noted that when the correlation is estimated from data, the estimation error will increase as the sample size decreases with averaging (see Appendix 5). We will explore this in more detail later.

One may wonder if this rule could be generalized to all scales, that is, whether the following rule is valid: lack of correlation at short time scales implies lack of correlation at longer scales. But of course this generalization would violate one of the initial assumptions since daily values are strongly correlated in time. We will consider in Sect. 2.3 a case when autocorrelation is taken into account.

### 2.2 A case with different variances and correlations

In this example we discuss a somewhat more general case than the previous one. For simplicity, let us consider a year composed of two seasons, summer (\(S\)) and winter (\(W\)), each with different variances for the simulated (\(m\)) and observed (\(o\)) variables, \(\sigma_{{S_{m} }}^{2} ,\sigma_{{S_{o} }}^{2} ,\sigma_{{W_{m} }}^{2} ,\sigma_{{W_{o} }}^{2}\), respectively, and with different correlations between simulated and observed anomalies, \(r_{S}\) and \(r_{W}\), for summer and winter, respectively.

We are interested in the correlation of the annual anomaly, \(r_{A}\), where \(A = (S + W)/2\). Assuming that successive seasons are uncorrelated, Appendix 2 shows that \(r_{A}\) can be written as a function of the seasonal variances and correlations. When the simulated and observed variances agree in each season, \(r_{A}\) is then a weighted average of the seasonal anomaly correlations. When the ratio of summer to winter variance differs between simulation and observations, \(r_{A}\) is in fact *smaller* than the seasonal weighted average. Hence, upscaling the time series not only fails to increase the correlation but may in fact reduce it; this would occur, for instance, when the model fails to simulate the seasonal variances well.
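As a hedged illustration of the relation described above, the following sketch evaluates \(r_{A}\) under the stated assumption that successive seasons are uncorrelated. The closed form is derived directly from \(A = (S + W)/2\) rather than copied from Appendix 2, and the function names and numerical values are illustrative assumptions only.

```python
import math


def annual_correlation(r_s, r_w, sd_s_m, sd_s_o, sd_w_m, sd_w_o):
    """Correlation of annual anomalies A = (S + W) / 2.

    Assumption (as in the text): successive seasons are uncorrelated,
    so only same-season covariances contribute.
    """
    # cov(A_m, A_o) and both variances carry a factor 1/4 that cancels out.
    cov = r_s * sd_s_m * sd_s_o + r_w * sd_w_m * sd_w_o
    norm = math.sqrt((sd_s_m**2 + sd_w_m**2) * (sd_s_o**2 + sd_w_o**2))
    return cov / norm


def weight_sum(sd_s_m, sd_s_o, sd_w_m, sd_w_o):
    """Sum of the weights multiplying r_S and r_W; it never exceeds 1."""
    norm = math.sqrt((sd_s_m**2 + sd_w_m**2) * (sd_s_o**2 + sd_w_o**2))
    return (sd_s_m * sd_s_o + sd_w_m * sd_w_o) / norm


# Made-up seasonal correlations: r_S = 0.4, r_W = 0.8.
# Case 1: the model reproduces the observed seasonal standard deviations.
matched = annual_correlation(0.4, 0.8, 1.0, 1.0, 2.0, 2.0)
# Case 2: the model swaps the summer and winter standard deviations.
mismatched = annual_correlation(0.4, 0.8, 2.0, 1.0, 1.0, 2.0)
print(f"{matched:.2f} {mismatched:.2f}")  # prints "0.72 0.48"
```

In the matched case the weights sum to one and the result is a true weighted average of 0.4 and 0.8. In the mismatched case the weights sum to 0.8, so the annual correlation falls below the corresponding weighted average.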

### 2.3 Autoregressive case

The cases treated thus far have assumed no autocorrelation within each time series, making them unrealistic in many instances. To allow for autocorrelation, we now represent the simulated and observed time series by first-order autoregressive processes. Assuming that the simulated time series \(m\) and that of the observations \(o\) follow first-order autoregressive processes, and knowing \(corr(o_{t} ,m_{t} )\), we want to know the upscaled value \(corr(A_{o} ,A_{m} )\), where \(A_{o} = \left( {o_{t + 1} + o_{t} } \right)/2\) and \(A_{m} = \left( {m_{t + 1} + m_{t} } \right)/2\).
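One hypothetical way to make this setting concrete is to assume \(o_{t} = \phi_{o} o_{t - 1} + e_{t}\) and \(m_{t} = \phi_{m} m_{t - 1} + f_{t}\), where the innovations \(e_{t}\) and \(f_{t}\) are white in time and correlated with each other only at the same time step. The coefficients \(\phi_{o}\), \(\phi_{m}\) and this innovation structure are assumptions of the sketch, not necessarily the formulation used by the authors. Under them the lagged cross-covariances are \(\phi_{o}\) and \(\phi_{m}\) times the simultaneous one, and the ratio of upscaled to original correlation has a closed form:

```python
import math


def ar1_upscaling_factor(phi_o, phi_m):
    """Ratio corr(A_o, A_m) / corr(o_t, m_t) for two-point averages.

    Hypothetical model: o_t = phi_o * o_{t-1} + e_t and
    m_t = phi_m * m_{t-1} + f_t, with (e_t, f_t) white in time and
    correlated only at the same time step; |phi| < 1 for stationarity.
    """
    if not (abs(phi_o) < 1 and abs(phi_m) < 1):
        raise ValueError("AR(1) coefficients must satisfy |phi| < 1")
    # Arithmetic mean of (1 + phi_o) and (1 + phi_m) over their geometric mean.
    return (2 + phi_o + phi_m) / (2 * math.sqrt((1 + phi_o) * (1 + phi_m)))


# Illustrative coefficients only.
print(f"{ar1_upscaling_factor(0.7, 0.7):.3f}")
print(f"{ar1_upscaling_factor(0.9, 0.5):.3f}")
```

Because an arithmetic mean is never smaller than the corresponding geometric mean, the factor is at least one, and it stays close to one unless the two autoregressive coefficients differ greatly. Under these assumptions, then, averaging neither degrades the correlation nor improves it substantially.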

### 2.4 Noise plus sinusoidal signal

For the next case we assume that both time series, that of the observations and that of the simulated data, consist of random noise superimposed upon a slowly evolving sinusoidal signal. To simplify, we postulate that the parameters of both time series are equal, differing only in the instantaneous values of the random noise, which are uncorrelated here. This corresponds to the case in which the model is able to reproduce the long time-scale signal in both amplitude and phase, but unable to reproduce the phase of the random noise features.

The sinusoidal signal has angular frequency \(\omega = 2\pi /T\), where \(T\) is the period of the signal.

The first column indicates the ratio of variances between noise *R* and sinusoidal signal *S*.

\(\sigma_{R}^{2} /\sigma_{S}^{2}\) | \(corr(o,m)\) | \(corr(A_{o} ,A_{m} )\) | Factor |
---|---|---|---|
1 | 0.5 | 0.66 | 1.33 |
¼ | 0.8 | 0.89 | 1.11 |
4 | 0.2 | 0.33 | 1.66 |

When the noise term is small compared with the sinusoidal signal amplitude, \(\sigma_{R}^{2} < \sigma_{S}^{2}\), the correlation of the original time series is high and averaging increases it further. When noise is dominant, \(\sigma_{R}^{2} > \sigma_{S}^{2}\), the original correlation is low and averaging increases this correlation substantially, but correlation remains modest.
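The tabulated values can be reproduced, to within rounding, by a short sketch. It assumes that the signal period is much longer than the averaging window, so that averaging \(n\) consecutive values leaves the signal variance essentially unchanged while reducing the noise variance to \(\sigma_{R}^{2} /n\), and that the table corresponds to \(n = 2\). The function names are illustrative.

```python
def corr_original(q):
    """corr(o, m) for a noise-to-signal variance ratio q = var(R) / var(S).

    Both series share the signal and carry mutually uncorrelated noise, so
    cov(o, m) = var(S) and var(o) = var(m) = var(S) + var(R).
    """
    return 1.0 / (1.0 + q)


def corr_averaged(q, n=2):
    """corr(A_o, A_m) after averaging n consecutive values.

    Assumes the signal period is much longer than the averaging window, so
    the signal variance is unchanged while the noise variance drops to var(R)/n.
    """
    return 1.0 / (1.0 + q / n)


# Same variance ratios as in the table: 1, 1/4 and 4.
for q in (1.0, 0.25, 4.0):
    r, r_avg = corr_original(q), corr_averaged(q)
    print(f"{q:5.2f}  {r:.2f}  {r_avg:.2f}  {r_avg / r:.2f}")
```

In this approximation the gain factor \((1 + q)/(1 + q/n)\) grows with the noise-to-signal ratio \(q\), but the averaged correlation itself stays low when noise dominates, as described above.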

This result sheds some light upon what our intuition may suggest: when there is a distinct signal, reducing noise by averaging does indeed increase the temporal correlation. The conflicting results obtained thus far indicate that analytical developments cannot get to the bottom of all our questions. One may wonder whether real cases have properties that more closely resemble one case or another.

## 3 Some results from RCM simulations and observations

### 3.1 Time upscaling from daily to monthly values

In this section we analyse the upscaling of the time correlation between RCM-downscaled reanalysis and the corresponding observed values. Downscaled daily data are taken from two CRCM5 simulations differing only in initial conditions (known internally at Ouranos as bba and bbb, hereafter referred to as “twin” simulations), integrated for 32 years from 1980 to 2011. The gridpoints used are those closest to the selected weather stations. Daily values are divided into two time series, one for January and the other for July, in order to remove seasonality that may artificially increase the correlation. January and July are considered to have 32 days to facilitate computations. Daily values (a total of 1024 for each series) are upscaled using expression (1) to a 2-day average time series (a total of 512 for each series), to a 4-day average (a total of 256 for each series), to an 8-day average (a total of 128 for each series), to a 16-day average (a total of 64 for each series), and to a 32-day average (a total of 32 for each series).
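The successive halving of the series length can be sketched as follows, assuming that the upscaling amounts to a mean over consecutive, non-overlapping blocks. The helper names and the placeholder data are illustrative and not part of the original analysis.

```python
def block_mean(series, width):
    """Average consecutive, non-overlapping blocks of `width` values."""
    if len(series) % width != 0:
        raise ValueError("series length must be a multiple of width")
    return [sum(series[i:i + width]) / width
            for i in range(0, len(series), width)]


def dyadic_cascade(daily, levels=5):
    """Return the daily series followed by its 2-, 4-, ..., 2**levels-day means."""
    cascade = [list(daily)]
    for _ in range(levels):
        # Averaging pairs of equal-width block means gives the mean of the
        # doubled block, so each level can be built from the previous one.
        cascade.append(block_mean(cascade[-1], 2))
    return cascade


# Placeholder data: 32 years x 32 "days" = 1024 values for one month.
daily = [float(i % 7) for i in range(32 * 32)]
print([len(s) for s in dyadic_cascade(daily)])  # [1024, 512, 256, 128, 64, 32]
```

With 32 values per year, no averaging block straddles two different years, which is presumably why the month length was set to a power of two.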

It is important to recall that the correlation estimates presented at longer time scales are associated with larger random errors, as the sample size is reduced by upscaling. Appendix 5 discusses and illustrates the magnitudes of such errors. Single-day time series contain nearly 1000 values, and hence their error bars can be found toward the right of the diagram in Fig. 8. Series of 32-day averages contain around 30 values and can be found near the left side of the same diagram. As the color scale shows, high correlation values are associated with small but asymmetric intervals. For winter temperatures, correlation values between 0.8 and 0.9 are common, which gives confidence intervals (total length) of around 0.03 for 1-day series and near 0.3 for 32-day series. This calls into question the statistical robustness of the apparent increase of correlation by time upscaling during winter noted in Figs. 2 and 3. For summer temperature, and for precipitation in both seasons, the situation is even less clear, since lower overall correlation values are associated with larger confidence intervals.
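The order of magnitude of these intervals can be checked with a Fisher z-transform, a standard approximation assumed here for illustration. The method actually used in Appendix 5 is not reproduced in this section, and the function name and the 95% level are choices of the sketch.

```python
import math


def fisher_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation r estimated from n pairs.

    Uses the Fisher z-transform, which assumes independent, bivariate-normal
    samples; z_crit = 1.96 corresponds to a two-sided 95% level.
    """
    if n <= 3:
        raise ValueError("need more than 3 samples")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    # Transforming back with tanh makes the interval asymmetric around r.
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Sample sizes of the 1-day and 32-day series, with an illustrative r = 0.85.
for n in (1024, 32):
    lo, hi = fisher_interval(0.85, n)
    print(n, round(lo, 3), round(hi, 3), round(hi - lo, 3))
```

For a correlation of 0.85 this gives a total interval length of roughly 0.03 with 1024 samples and roughly 0.2 with 32 samples, with the lower bound farther from the estimate than the upper one. Because daily values are autocorrelated, the effective sample size is smaller than the nominal one, so these lengths should be read as lower bounds.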

This suggests that even if time upscaling might appear to produce an increase in correlation, establishing its statistical robustness may need more data. But naturally this prompts the following question: Is a barely detectable increase in correlation by upscaling worth the statistical battle?

### 3.2 Time upscaling from seasonal to yearly values

The previous section omitted the impact of seasons when studying correlation at different time scales. In this section, as in Sect. 2.2, we concentrate on upscaling from seasonal to yearly sampling.

The graphs should be interpreted as follows: when points lie above the diagonal, upscaling from seasonal to annual does in fact improve over simple combinations of seasonal correlations. As discussed in Appendix 5, errors in the estimation of correlation with a 30-year-long time series are quite large; hence values in the scattergram have a considerable spread.

Figure 4a shows that for temperature most of the data cloud and its mean values lie above the diagonal, suggesting that upscaling does improve correlation. This is particularly clear for simulations bao and ban, performed without large-scale nudging. For ERA-Int and the spectrally nudged simulation bar2, the improvement in correlation is more modest, but it should be kept in mind that very high correlations are difficult to improve, since correlation is a bounded quantity.

The case for precipitation is somewhat different (Fig. 4b; note the change of range in both axes). Correlations between observed and simulated precipitation are overall considerably lower than those for temperature, and upscaling does not seem to have a distinct positive effect. The highest correlations are slightly degraded by upscaling and the lower ones are slightly improved.

## 4 Conclusions

The aim of this study was to analyse the effect of time upscaling (aggregating) on the correlation between a time series produced by a reanalysis-driven RCM and observations. Lacking a general solution, we have approached the issue from four simple analytical perspectives: (1) two correlated stochastic time series, without autocorrelation and with identical variances; (2) as in (1), but allowing different variances and correlations between series, as is typical of seasonal statistics; (3) time series with autoregressive behaviour, typical of series sampled at short time intervals; and (4) stochastic time series modulated by a sinusoidal signal, such as interannual oscillations.

The four approaches delivered different results. The first two suggested that, in the absence of autocorrelation, upscaling does not improve correlation and may in fact deteriorate it if the model fails to reproduce the appropriate seasonal interannual variances. Substantial autocorrelation is, however, a typical property of daily time series (for fields such as surface temperature, for example), and this case was discussed in the third approach. Results indicated that no degradation occurs, but that no substantial improvement is to be expected from upscaling either. The last case represented a stochastic time series with no autocorrelation and no cross-correlation, with a superimposed sinusoidal signal (which in fact introduces autocorrelation and cross-correlation in the time series). This case can be thought of, for example, as yearly values modulated by interannual variability with a single dominant mode. In this case, upscaling produces a substantial increase in correlation for long time-scale oscillations.

The answer to the question asked in the introduction is hence not straightforward from a theoretical perspective. Results for real time series will depend on which property dominates and which assumptions are more realistic for the case chosen. Examples shown here with RCM-simulated data and observations corroborate in part the more cautious analytical results, producing some hope of correlation increase with scale for winter but almost none for summer. More work with real data is needed to provide more robust answers for a variety of cases, although the hope of substantial (and practical) gains by upscaling seems to be limited.

Before more is known about the topic, the safest approach may be to assume that lack of correlation at short time scales will express itself as lack of correlation at longer scales. If one is interested in achieving high correlation with observations at time scales such as monthly, seasonal or annual, one may be constrained to use some form of large-scale nudging within the RCM in order to strengthen the control exerted by the driving data on short time scales. It is still to be seen how intense this nudging should be to attain satisfactory results.

1. Two nested simulations driven by the same driving data, either using the same RCM (such as twin simulations as in Alexandru et al. 2009), variants of a given RCM (such as spectrally nudged or not), or different RCMs.
2. An RCM simulation and its driving dataset, whether a reanalysis or some CGCM-simulated data. Note that this applies even if the variable to correlate is not used to drive the RCM, such as precipitation. Studies of correlation between reanalysis-driven RCMs and reanalyses are common in the production of high-resolution pseudo-observations, or poor man's high-resolution analyses, by dynamical downscaling (e.g. Kanamaru and Kanamitsu 2007).
3. CGCM simulations nudged towards reanalysis and observations (e.g. Eden et al. 2012).
4. Two observational datasets. Studies on temporal correlation have been carried out, for example, by Brands et al. (2012), who computed time correlations on a daily time scale between two different reanalyses.

Finally, here we have concentrated on time upscaling, but this discussion could be extended to spatial upscaling too. For example, one may ask whether the time correlation of daily precipitation between RCM and observations improves when a regional scale is considered instead of a single station. This is also a question that deserves attention.

## Notes

### Acknowledgments

This project has been carried out as part of the activities supported by the Canadian Network for Regional Climate and Weather Processes (CNRCWP) funded by the Climate Change and Atmospheric Research (CCAR) fund of the Natural Sciences and Engineering Research Council of Canada (NSERC). The first and third authors thank Ouranos for its support. The CRCM5 has been developed at the Centre ESCER (at Université du Québec à Montréal) and the data used in the examples has been generated by the Climate Simulation and Analysis Group at Ouranos. We also thank Environment Canada for providing daily data from several weather stations.

## References

- Alexandru A, de Elia R, Laprise R, Separovic L, Biner S (2009) Sensitivity study of regional climate model simulations to large-scale nudging parameters. Mon Weather Rev 137:1666–1686. doi: 10.1175/2008MWR2620.1
- Anthes R (1983) Regional models of the atmosphere in middle latitudes. Mon Weather Rev 111:1306–1335
- Bougeault P (2002) WGNE survey of verification methods for numerical prediction of weather elements and severe weather events. CAS/JSC WGNE Report No. 18, Appendix C. WMO/TD.No.1173, Toulouse, France
- Brands S, Gutiérrez J, Herrera S (2012) On the use of reanalysis data for downscaling. J Clim 25:2517–2526
- Dee DP, Uppala SM, Simmons AJ et al (2011) The ERA interim reanalysis: configuration and performance of the data assimilation system. Q J R Meteorol Soc 656:553–597
- Eden JM, Widmann M, Grawe D, Rast S (2012) Skill, correction and downscaling of GCM-simulated precipitation. J Clim 25:3970–3984
- Fisher RA (1915) Frequency distribution of the values of the correlation coefficient in samples of an indefinitely large population. Biometrika 10:507–521
- Harris I, Jones PD, Osborn TJ, Lister DH (2013) Updated high-resolution grids of monthly climatic observations–the CRU TS3.10 dataset. Int J Climatol. doi: 10.1002/joc.3711
- Jakimow G, Yakimiw E, Robert A (1992) An implicit formulation for horizontal diffusion in gridpoint models. Mon Weather Rev 120:124–130
- Johnson RA, Wichern DW (2007) Applied multivariate statistical analysis, 6th edn. Prentice Hall, New York
- Kanamaru H, Kanamitsu M (2007) Fifty-Seven-year California reanalysis downscaling at 10 km (CaRD10). Part II: comparison with North American regional reanalysis. J Clim 20:5572–5592. doi: 10.1175/2007JCLI1522.1
- Li H, Kanamitsu M, Hong S-Y (2012) California reanalysis downscaling at 10 km using an ocean-atmosphere coupled regional model system. J Geophys Res 117:1–16. doi: 10.1029/2011JD017372
- Martynov A, Laprise R, Sushama L, Winger K, Separovic L, Dugas B (2013) Reanalysis-driven climate simulation over CORDEX North America domain using the Canadian Regional Climate Model, version 5: model performance evaluation. Clim Dyn 41:2973–3005. doi: 10.1007/s00382-013-1778-9
- Mass CF, Ovens D, Westrick K, Colle BA (2002) Does increasing horizontal resolution produce more skillful forecasts? Bull Am Meteorol Soc 83(3):407–430
- Mekis E, Vincent LA (2011) An overview of the second generation adjusted daily precipitation dataset for trend analysis in Canada. Atmos Ocean 49:163–177
- Separovic L, Elía R, Laprise R (2011) Impact of spectral nudging and domain size in studies of RCM response to parameter modification. Clim Dyn 38:1325–1343. doi: 10.1007/s00382-011-1072-7
- Stefanova L, Misra V, Chan S, Griffin M, O’Brien JO, Smith TJ (2011) A proxy for high-resolution regional reanalysis for the Southeast United States: assessment of precipitation variability in dynamically downscaled reanalyses. Clim Dyn 38:2449–2466. doi: 10.1007/s00382-011-1230-y
- Vincent LA, Wang XL, Milewska EJ, Wan H, Yang F, Swail V (2012) A second generation of homogenized Canadian monthly surface air temperature for climate trend analysis. J Geophys Res 117:D18110. doi: 10.1029/2012JD017859
- von Storch H, Zwiers FW (1999) Statistical analysis in climate research. Cambridge University Press, Cambridge, UK/New York
- von Storch H, Langenberg H, Feser F (2000) A spectral nudging technique for dynamical downscaling purposes. Mon Weather Rev 128:3664–3673
- Weisse R, von Storch H, Callies U, Chrastansky A, Feser F, Grabemann I, Gunther H, Pluess A, Stoye T, Tellkamp J, Winterfeldt J, Woth K (2009) Regional meteorological–marine reanalyses and climate change projections. Bull Am Meteorol Soc 90:849–860. doi: 10.1175/2008BAMS2713.1

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.