Bias adjustment and ensemble recalibration methods for seasonal forecasting: a comprehensive intercomparison using the C3S dataset


This work presents a comprehensive intercomparison of different alternatives for the calibration of seasonal forecasts, ranging from simple bias adjustment (BA)—e.g. quantile mapping—to more sophisticated ensemble recalibration (RC) methods—e.g. non-homogeneous Gaussian regression, which build on the temporal correspondence between the climate model and the corresponding observations to generate reliable predictions. To be as critical as possible, we validate the raw model and the calibrated forecasts in terms of a number of metrics which take into account different aspects of forecast quality (association, accuracy, discrimination and reliability). We focus on one-month lead forecasts of precipitation and temperature from four state-of-the-art seasonal forecasting systems, three of them included in the Copernicus Climate Change Service dataset (ECMWF-SEAS5, UK Met Office-GloSea5 and Météo France-System5) for boreal winter and summer over two illustrative regions with different skill characteristics (Europe and Southeast Asia). Our results indicate that both BA and RC methods effectively correct the large raw model biases, which is of paramount importance for users, particularly when directly using the climate model outputs to run impact models, or when computing climate indices depending on absolute values/thresholds. However, except for particular regions and/or seasons (typically with high skill), there is only marginal added value—with respect to the raw model outputs—beyond this bias removal. For those cases, RC methods can outperform BA ones, mostly due to an improvement in reliability. Finally, we also show that whereas an increase in the number of members only modestly affects the results obtained from calibration, longer hindcast periods lead to improved forecast quality, particularly for RC methods.




  1. Barnston AG (1994) Linear statistical short-term climate predictive skill in the northern hemisphere. J Clim 7(10):1513–1564.

  2. Bedia J, Golding N, Casanueva A, Iturbide M, Buontempo C, Gutiérrez JM (2018) Seasonal predictions of Fire Weather Index: paving the way for their operational applicability in Mediterranean Europe. Clim Serv 9:101–110.


  3. Dee DP, Uppala SM, Simmons AJ, Berrisford P, Poli P, Kobayashi S, Andrae U, Balmaseda MA, Balsamo G, Bauer P, Bechtold P, Beljaars ACM, van de Berg L, Bidlot J, Bormann N, Delsol C, Dragani R, Fuentes M, Geer AJ, Haimberger L, Healy SB, Hersbach H, Holm EV, Isaksen L, Kallberg P, Koehler M, Matricardi M, McNally AP, Monge-Sanz BM, Morcrette JJ, Park BK, Peubey C, de Rosnay P, Tavolato C, Thepaut JN, Vitart F (2011) The ERA-Interim reanalysis: configuration and performance of the data assimilation system. Q J R Meteorol Soc 137(656):553–597.


  4. Déqué M (2007) Frequency of precipitation and temperature extremes over France in an anthropogenic scenario: Model results and statistical correction according to observed values. Glob Planet Change 57(1–2):16–26.


  5. Doblas-Reyes FJ, Hagedorn R, Palmer TN (2005) The rationale behind the success of multi-model ensembles in seasonal forecasting II. Calibration and combination. Tellus A 57(3):234–252.


  6. Doblas-Reyes FJ, García-Serrano J, Lienert F, Biescas AP, Rodrigues LRL (2013) Seasonal climate predictability and forecasting: status and prospects. Wiley Interdiscip Rev Clim Change 4(4):245–268.


  7. Eade R, Smith D, Scaife A, Wallace E, Dunstone N, Hermanson L, Robinson N (2014) Do seasonal-to-decadal climate predictions underestimate the predictability of the real world? Geophys Res Lett 41(15):5620–5628.


  8. Epstein ES (1969) A scoring system for probability forecasts of ranked categories. J Appl Meteorol 8(6):985–987.

  9. Feldmann K, Scheuerer M, Thorarinsdottir TL (2015) Spatial postprocessing of ensemble forecasts for temperature using non-homogeneous Gaussian regression. Mon Weather Rev 143(3):955–971.


  10. Gneiting T, Raftery AE, Westveld AH, Goldman T (2005) Calibrated probabilistic forecasting using Ensemble Model Output Statistics and minimum CRPS estimation. Mon Weather Rev 133(5):1098–1118.


  11. Gutiérrez JM, Cano R, Cofiño AS, Sordo C (2005) Analysis and downscaling multi-model seasonal forecasts in Peru using self-organizing maps. Tellus A 57(3):435–447.


  12. Gutiérrez JM, Maraun D, Widmann M, Huth R, Hertig E, Benestad R, Roessler O, Wibig J, Wilcke R, Kotlarski S, San Martín D, Herrera S, Bedia J, Casanueva A, Manzanas R, Iturbide M, Vrac M, Dubrovsky M, Ribalaygua J, Pórtoles J, Räty O, Räisänen J, Hingray B, Raynaud D, Casado MJ, Ramos P, Zerenner T, Turco M, Bosshard T, S̆tĕpánek P, Bartholy J, Pongracz R, Keller DE, Fischer AM, Cardoso RM, Soares PMM, Czernecki B, Pagé C (2018) An intercomparison of a large ensemble of statistical downscaling methods over Europe: results from the VALUE perfect predictor crossvalidation experiment. Int J Climatol.

  13. Herrera S, Kotlarski S, Soares PMM, Cardoso RM, Jaczewski A, Gutiérrez JM, Maraun D (2018) Uncertainty in gridded precipitation products: influence of station density, interpolation method and grid resolution. Int J Climatol.

  14. Hersbach H (2000) Decomposition of the continuous ranked probability score for ensemble prediction systems. Weather Forecast 15(5):559–570.

  15. Iturbide M, Bedia J, Herrera S, Baño J, Fernández J, Frías MD, Manzanas R, San-Martín D, Cimadevilla E, Cofiño AS, Gutiérrez JM (2019) The R-based climate4R open framework for reproducible climate data access and post-processing. Environ Model Softw 111:42–54.


  16. Kharin VV, Zwiers FW (2003) On the ROC score of probability forecasts. J Clim 16(24):4145–4150.

  17. Kotlarski S, Szabó P, Herrera S, Räty O, Keuler K, Soares PMM, Cardoso RM, Bosshard T, Pagé C, Boberg F, Gutiérrez JM, Isotta FA, Jaczewski A, Kreienkamp F, Liniger MA, Lussana C, Pianko-Kluczyńska K (2017) Observational uncertainty and regional climate model evaluation: a pan-European perspective. Int J Climatol.

  18. Lachenbruch PA, Mickey MR (1968) Estimation of error rates in discriminant analysis. Technometrics 10(1):1–11.


  19. Leung LR, Hamlet AF, Lettenmaier DP, Kumar A (1999) Simulations of the ENSO hydroclimate signals in the Pacific Northwest Columbia river basin. Bull Am Meteorol Soc 80(11):2313–2330.

  20. Manzanas R, Gutiérrez JM (2018) Process-conditioned bias correction for seasonal forecasting: a case-study with ENSO in Peru. Clim Dyn.

  21. Manzanas R, Frías MD, Cofiño AS, Gutiérrez JM (2014) Validation of 40 year multimodel seasonal precipitation forecasts: The role of ENSO on the global skill. J Geophys Res Atmos 119(4):1708–1719.


  22. Manzanas R, Gutiérrez JM, Fernández J, van Meijgaard E, Calmanti S, Magariño ME, Cofiño AS, Herrera S (2017) Dynamical and statistical downscaling of seasonal temperature forecasts in Europe: added value for user applications. Clim Serv.

  23. Manzanas R, Lucero A, Weisheimer A, Gutiérrez JM (2018) Can bias correction and statistical downscaling methods improve the skill of seasonal precipitation forecasts? Clim Dyn 50(3):1161–1176.


  24. Maraun D, Wetterhall F, Ireson AM, Chandler RE, Kendon EJ, Widmann M, Brienen S, Rust HW, Sauter T, Themessl M, Venema VKC, Chun KP, Goodess CM, Jones RG, Onof C, Vrac M, Thiele-Eich I (2010) Precipitation downscaling under climate change: recent developments to bridge the gap between dynamical models and the end user. Rev Geophys 48:3.


  25. Maraun D, Shepherd TG, Widmann M, Zappa G, Walton D, Mayr GJ, Hagemann S, Richter I, Soares PMM, Hall A, Mearns LO (2017) Towards process-informed bias correction of climate change simulations. Nat Clim Change 7:764–773.

  26. Marcos R, Llasat MC, Quintana-Seguí P, Turco M (2018) Use of bias correction techniques to improve seasonal forecasts for reservoirs: a case-study in northwestern Mediterranean. Sci Total Environ 610–611:64–74.

  27. Markus D, Mayr GJ, Achim Z (2017) Spatial ensemble postprocessing with standardized anomalies. Q J R Meteorol Soc 143(703):909–916.


  28. Molteni F, Stockdale T, Balmaseda M, Balsamo G, Buizza R, Ferranti L, Magnusson L, Mogensen K, Palmer T, Vitart F (2011) The new ECMWF seasonal forecast system (System 4). European Centre for Medium-Range Weather Forecasts.

  29. Murphy AH (1973) A new vector partition of the probability score. J Appl Meteorol 12(4):595–600.

  30. Nikulin G, Asharaf S, Magariño ME, Calmanti S, Cardoso RM, Bhend J, Fernández J, Frías MD, Fröhlich K, Früh B, Herrera S, Manzanas R, Gutiérrez JM, Hansson U, Kolax M, Liniger M, Soares PMM, Spirig C, Tome R, Wyser K (2018) Dynamical and statistical downscaling of a global seasonal hindcast in eastern Africa. Clim Serv 9:72–85.

  31. Pavan V, Marchesi S, Morgillo A, Cacciamani C, Doblas-Reyes FJ (2005) Downscaling of DEMETER winter seasonal hindcasts over Northern Italy. Tellus A 57(3):424–434.


  32. Piani C, Haerter JO, Coppola E (2010) Statistical bias correction for daily precipitation in regional climate models over Europe. Theor Appl Climatol 99(1–2):187–192.


  33. Sansom PG, Ferro CAT, Stephenson DB, Goddard L, Mason SJ (2016) Best practices for postprocessing ensemble climate forecasts. Part I: Selecting appropriate recalibration methods. J Clim 29(20):7247–7264.


  34. Scheuerer M, Möller D (2015) Probabilistic wind speed forecasting on a grid based on ensemble model output statistics. Ann Appl Stat 9(3):1328–1349.


  35. Sheau TN, Tangang F, Juneng L (2017) Bias correction of global and regional simulated daily precipitation and surface mean temperature over Southeast Asia using quantile mapping method. Glob Planet Change 149:79–90.


  36. Smith DM, Eade R, Pohlmann H (2013) A comparison of full-field and anomaly initialization for seasonal to decadal climate prediction. Clim Dyn 41(11–12):3325–3338.


  37. Thorarinsdottir TL, Johnson MS (2012) Probabilistic wind gust forecasting using non-homogeneous Gaussian regression. Mon Weather Rev 140(3):889–897.


  38. Tippett MK, Barnston AG (2008) Skill of multimodel ENSO probability forecasts. Mon Weather Rev 136(10):3933–3946.


  39. Torralba V, Doblas-Reyes FJ, MacLeod D, Christel I, Davis M (2017) Seasonal climate prediction: a new source of information for the management of wind energy resources. J Appl Meteorol Climatol 56(5):1231–1247.


  40. Weigel AP, Liniger MA, Appenzeller C (2009) Seasonal ensemble forecasts: are recalibrated single models better than multimodels? Mon Weather Rev 137(4):1460–1479.


  41. Weisheimer A, Palmer TN (2014) On the reliability of seasonal climate forecasts. J R Soc Interface 11:96.


  42. Wilks DS, Hamill TM (2007) Comparison of ensemble-MOS methods using GFS reforecasts. Mon Weather Rev 135(6):2379–2390.


  43. Zhao T, Bennett JC, Wang QJ, Schepen A, Wood AW, Robertson DE, Ramos MH (2017) How suitable is quantile mapping for postprocessing GCM precipitation forecasts? J Clim 30(9):3185–3196.




This work has been funded by the C3S activity on Evaluation and Quality Control for seasonal forecasts. JMG was partially supported by the project MULTI-SDM (CGL2015-66583-R, MINECO/FEDER). FJDR was partially funded by the H2020 EUCP project (GA 776613).

Author information



Corresponding author

Correspondence to R. Manzanas.


Appendix: Description of BA and RC methods


All the methods described in this section have been applied gridbox by gridbox, considering seasonal interannual series. We use the following notation: \(y_{m,t}\) and \(y'_{m,t}\) denote the original and calibrated values for ensemble member m at time (season/year) t, \({\hat{y}}\) is the average of the ensemble mean (\({\bar{y}}_t\)) over all times t, \({\hat{o}}\) is the average of the observations over all times t, \(\sigma _{f}\) is the standard deviation of the complete ensemble (pooling all member interannual time-series) and \(\sigma _o\) is the standard deviation of the observed interannual time-series. Finally, \(\rho\) is the interannual correlation between the ensemble mean and the observational reference.

Mean (and variance) adjustment (MVA)

This is the simplest adjustment method, with a long tradition in the context of seasonal forecasting (see, e.g., Leung et al. 1999). The ensemble mean and variance are adjusted towards the corresponding observational ones in the following form:

$$\begin{aligned} y'_{m,t} = (y_{m,t}-{\hat{y}})\frac{\sigma _{o}}{\sigma _{f}} + {\hat{o}} \end{aligned}$$

A simpler version consists of correcting just the mean (MA) and has the same formulation, but excluding the term \(\sigma _{o}/\sigma _{f}\).
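For illustration, MVA and MA can be sketched in a few lines of NumPy. This is our own minimal transcription of the two formulas, not the study's implementation; the function names and the (members × times) array layout are choices of this sketch:

```python
import numpy as np

def mva(fc, obs):
    """Mean and variance adjustment for one gridbox.

    fc  : ndarray of shape (members, times) -- raw hindcast
    obs : ndarray of shape (times,)         -- observational reference
    """
    y_hat = fc.mean()          # average of the ensemble mean over all times
    o_hat = obs.mean()         # observed climatology
    sigma_f = fc.std()         # std of the pooled ensemble
    sigma_o = obs.std()        # std of the observed series
    return (fc - y_hat) * sigma_o / sigma_f + o_hat

def ma(fc, obs):
    """Mean adjustment only: same formula without the variance scaling."""
    return fc - fc.mean() + obs.mean()
```

By construction, the pooled mean and standard deviation of the MVA-calibrated ensemble match the observed ones.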

Empirical quantile mapping (EQM)

We have considered an empirical quantile mapping (EQM) method participating in the VALUE downscaling intercomparison initiative (Gutiérrez et al. 2018), which has recently been applied to correct seasonal precipitation forecasts (Manzanas et al. 2018; Manzanas and Gutiérrez 2018). This method calibrates the predicted empirical probability density function (PDF) by adjusting a number of quantiles based on the empirical observed PDF (Déqué 2007). In particular, here we adjust percentiles 1–99 and linearly interpolate between every two consecutive percentiles inside this range. Outside this range, a constant extrapolation (using the correction obtained for the 1st or 99th percentile) is applied. The method was applied at an ensemble-wise level; that is, the mapping was trained on all contributing members pooled together (all members are assumed to be statistically indistinguishable), and the resulting single correction was then applied to each individual member. Note that ensemble- and member-wise approaches have recently been reported to provide very similar results (Manzanas et al. 2018).
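The ensemble-wise procedure can be sketched as follows. This is a simplified, additive illustration of quantile mapping under our own conventions (not the VALUE implementation): `np.interp` provides both the linear interpolation between consecutive percentiles and the constant extrapolation outside the 1–99 range.

```python
import numpy as np

def eqm(fc, obs, percentiles=np.arange(1, 100)):
    """Ensemble-wise empirical quantile mapping (additive sketch).

    The mapping is trained on all members pooled together and the same
    correction is then applied to each individual member.
    """
    pooled = fc.ravel()
    q_fc = np.percentile(pooled, percentiles)    # forecast percentiles 1..99
    q_obs = np.percentile(obs, percentiles)      # observed percentiles 1..99
    corr = q_obs - q_fc                          # correction at each percentile
    # np.interp interpolates linearly between consecutive percentiles and
    # clamps to the 1st/99th correction outside the range (constant extrapolation)
    delta = np.interp(fc.ravel(), q_fc, corr).reshape(fc.shape)
    return fc + delta
```

After the correction, the pooled forecast distribution approximates the observed one within the adjusted percentile range.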

Climate conserving recalibration (CCR)

Also known as variance inflation, this method was first introduced in Doblas-Reyes et al. (2005). It modifies the predictions to have the same interannual variance as the observational reference, while preserving their interannual correlation, and can be expressed as:

$$\begin{aligned} y'_{m,t} = \rho \frac{\sigma _{o}}{std({\bar{y}}_{t})} {\bar{y}}_{t} + \sqrt{1-\rho ^{2}} \frac{\sigma _{o}}{\sigma _{f}} (y_{m,t}-{\bar{y}}_{t}) + {\hat{o}} \end{aligned}$$

After Weigel et al. (2009), this method has been commonly referred to as climate conserving recalibration.
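A direct transcription of the CCR formula into NumPy might look as follows. This is illustrative only: we transcribe the appendix equation as written, with the pooled-ensemble standard deviation as \(\sigma_f\), and the function name is ours.

```python
import numpy as np

def ccr(fc, obs):
    """Climate conserving recalibration (variance inflation) sketch."""
    ens_mean = fc.mean(axis=0)                     # \bar{y}_t
    rho = np.corrcoef(ens_mean, obs)[0, 1]         # interannual correlation
    sigma_o = obs.std()
    sigma_f = fc.std()                             # pooled ensemble std
    signal = rho * (sigma_o / ens_mean.std()) * ens_mean
    noise = np.sqrt(1 - rho**2) * (sigma_o / sigma_f) * (fc - ens_mean)
    return signal + noise + obs.mean()
```

Since the signal term is an affine rescaling of the ensemble mean, the interannual correlation between the calibrated ensemble mean and the observations is preserved.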

Ratio of predictable components (RPC)

We have also considered the method introduced by Eade et al. (2014), which uses the ensemble to reduce noise and adjusts the forecast variance so that the ratio of predictable components is the same in the model and in the observations (see the paper for details). In particular, Eade et al. (2014) applied the following correction to adjust seasonal forecasts of the North Atlantic Oscillation (NAO), temperature and pressure in the North Atlantic region:

$$\begin{aligned} y'_{m,t} = \rho \frac{\sigma _{o}}{std({\bar{y}}_{t})} ({\bar{y}}_{t}-{\hat{y}}) + \sqrt{1-\rho ^{2}} \frac{\sigma _{o}}{\sqrt{var(y_{m,t}-{\bar{y}}_{t})}} (y_{m,t}-{\bar{y}}_{t}) + {\hat{o}} \end{aligned}$$
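Under the same conventions as the previous sketches, this correction can be written as below. Pooling the per-member deviations to estimate \(var(y_{m,t}-{\bar{y}}_{t})\) is our reading of the formula, and the function name is ours.

```python
import numpy as np

def rpc_adjust(fc, obs):
    """Variance adjustment following Eade et al. (2014), illustrative sketch."""
    ens_mean = fc.mean(axis=0)                     # \bar{y}_t
    rho = np.corrcoef(ens_mean, obs)[0, 1]
    sigma_o = obs.std()
    anom = fc - ens_mean                           # member deviations from \bar{y}_t
    signal = rho * (sigma_o / ens_mean.std()) * (ens_mean - ens_mean.mean())
    noise = np.sqrt(1 - rho**2) * (sigma_o / anom.std()) * anom
    return signal + noise + obs.mean()
```

Unlike the CCR transcription above, the ensemble-mean term is centered here, so the pooled mean of the calibrated ensemble equals the observed climatology exactly.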

Linear regression recalibration (LR)

This method performs a linear regression between the ensemble mean (i.e. the time-series of \({\bar{y}}_{t}\)) and the corresponding observations:

$$\begin{aligned} o_{t} = \alpha + \beta {\bar{y}}_{t} + \epsilon \end{aligned}$$

To correct the forecast variance, the member anomalies are rescaled by the standard deviation of the predictive distribution from the linear fit, so \(y'_{m,t} = \alpha + \beta {\bar{y}}_{t} + \gamma _{t}(y_{m,t}-{\bar{y}}_{t})\), where

$$\begin{aligned} \gamma _{t} = std(\epsilon _{fit}) \sqrt{1+\frac{1}{n}+\frac{({\bar{y}}_{t}-{\hat{y}})^{2}}{(n-1)\,var({\bar{y}}_{t})}}, \end{aligned}$$

\(\epsilon _{fit}\) are the residuals from the regression and n is the number of samples used.
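A sketch of this recalibration, assuming the standard ordinary-least-squares prediction-interval form of \(\gamma_t\) (with the ensemble-mean anomaly in the numerator, an assumption of this illustration):

```python
import numpy as np

def lr_recal(fc, obs):
    """Linear regression recalibration of an ensemble hindcast (sketch)."""
    ens_mean = fc.mean(axis=0)
    n = obs.size
    beta, alpha = np.polyfit(ens_mean, obs, 1)     # obs = alpha + beta*ybar + eps
    resid = obs - (alpha + beta * ens_mean)        # regression residuals
    # per-time scaling: std of the predictive distribution of the linear fit,
    # assuming the standard OLS prediction-interval form
    gamma = resid.std(ddof=1) * np.sqrt(
        1 + 1/n + (ens_mean - ens_mean.mean())**2
        / ((n - 1) * ens_mean.var(ddof=1)))
    return alpha + beta * ens_mean + gamma * (fc - ens_mean)
```

The calibrated ensemble mean is simply the fitted regression line, so its climatology matches the observed one by the OLS normal equations.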

Non-homogeneous Gaussian Regression (NGR)

This method (Gneiting et al. 2005) uses a constant term and the ensemble-mean signal as predictors for the calibrated forecast mean, and a constant term plus the ensemble spread to inflate (or shrink) the calibrated ensemble spread. The correction has the following form:

$$\begin{aligned} y'_{m,t} = \alpha + \beta ({\bar{y}}_{t}-{\hat{y}}) + \sqrt{\gamma ^{2}+\delta ^{2}var(y_{t})} (y_{m,t}-{\bar{y}}_{t}) \end{aligned}$$

The parameters \(\alpha\), \(\beta\), \(\gamma\) and \(\delta\) are optimized by minimizing the ensemble CRPS. NGR approaches have been applied in many previous works, but mostly in the context of short-term forecasts (see, e.g., Wilks and Hamill 2007; Thorarinsdottir and Johnson 2012; Feldmann et al. 2015; Scheuerer and Möller 2015; Markus et al. 2017). To our knowledge, only Tippett and Barnston (2008) have used it in the context of seasonal forecasting.
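A minimal sketch of NGR fitting by CRPS minimization, using the closed-form CRPS of a Gaussian forecast (Gneiting et al. 2005) and a generic SciPy optimizer. Reading \(var(y_t)\) as the per-time ensemble variance, and standardizing the member anomalies before rescaling, are assumptions of this illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def crps_gaussian(mu, sigma, obs):
    """Closed-form CRPS of a N(mu, sigma^2) forecast (Gneiting et al. 2005)."""
    z = (obs - mu) / sigma
    return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z)
                    - 1 / np.sqrt(np.pi))

def ngr(fc, obs):
    """Non-homogeneous Gaussian regression fitted by minimum CRPS (sketch)."""
    ens_mean = fc.mean(axis=0)
    ens_var = fc.var(axis=0)                 # per-time ensemble variance
    anom = ens_mean - ens_mean.mean()        # ensemble-mean signal

    def loss(p):
        a, b, g, d = p
        mu = a + b * anom
        sigma = np.sqrt(g**2 + d**2 * ens_var)
        return crps_gaussian(mu, sigma, obs).mean()

    a, b, g, d = minimize(loss, x0=[obs.mean(), 1.0, obs.std(), 1.0],
                          method="Nelder-Mead").x
    sigma = np.sqrt(g**2 + d**2 * ens_var)
    # map each member onto the calibrated distribution via its standardized anomaly
    return a + b * anom + sigma * (fc - ens_mean) / np.sqrt(ens_var)
```

Nelder-Mead is used here for simplicity; any derivative-free or gradient-based optimizer over the four parameters would serve.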


Cite this article

Manzanas, R., Gutiérrez, J.M., Bhend, J. et al. Bias adjustment and ensemble recalibration methods for seasonal forecasting: a comprehensive intercomparison using the C3S dataset. Clim Dyn 53, 1287–1305 (2019).



  • Seasonal forecasting
  • C3S
  • Bias adjustment
  • Ensemble recalibration
  • Forecast quality
  • Reliability
  • Ensemble size
  • Hindcast length