1 Introduction

Uncertainty exists in weather predictions generated using numerical models. Some uncertainty is due to the sensitivity and dependence of the data on the initial conditions. Uncertainty also arises because of the approximated mathematical methods used to solve the equations. Although many methods have been applied to improve the production of initial conditions, such as the use of real status updates throughout observation data quality control, an expansion of observation networks, and improvements of physical processes and dynamics in numerical models, uncertainties in weather predictions have not been adequately resolved yet. To diminish errors of deterministic forecasts, the European Center for Medium-Range Weather Forecasts (ECMWF) and the National Centers for Environmental Prediction (NCEP) created the global ensemble prediction system (EPS) for medium-range weather forecasts in 1992. The ECMWF has used singular vectors (Palmer et al. 1993; Buizza and Palmer 1995; Molteni et al. 1996; Buizza et al. 2007) to simulate the initial probability density, while the NCEP has used a technique known as breeding vector (Toth and Kalnay 1993, 1997). After running a medium-range weather prediction using EPSs in the ECMWF and the NCEP, other operational forecast centers constructed EPSs using their own numerical models and different data assimilation methods; therefore, medium-range ensemble predictions are often implemented autonomously.

Applications of different physical parameterization methods have been used to account for uncertainties of single numerical models, such as those produced by uncertainties in the observations and model (e.g., due to a lack of resolution, simplified parameterization of physical processes, and effects of unsolved processes), imperfect boundary conditions, and data assimilation assumptions for probability forecasts using probability density functions (Park et al. 2008). Although the ability of ensemble predictions can be improved as members of ensembles are increased (Buizza and Palmer 1998), combining different ensembles or the generation of a multi-model is often a more effective method because of economical costs. The predictability and consistency of multi-model ensemble forecasting have been shown to be superior to a single-model-based approach (Buizza 2008; Zhou and Du 2010); thus, studies aimed at improving the accuracy of high-impact weather forecasts using various methods of ensemble generation represent a very active area of research. Potentially valuable areas of research include studies on super-ensembles (i.e., models within a multi-model ensemble are adjusted for their various biases) and hyper-ensembles (i.e., models of different physical processes are combined, such as atmospheric, ocean, and wave models).

Recently, the World Meteorological Organization’s (WMO) World Weather Research Program (WWRP) has been carrying out “THe Observing system Research and Predictability EXperiment” (THORPEX) to accelerate improvements in the accuracy of 1-day to 2-week high-impact weather forecasts. The THORPEX Interactive Grand Global Ensemble (TIGGE) is a key component of THORPEX. Among its objectives, TIGGE seeks to (1) develop a deeper understanding of the contribution of observations, initial uncertainties, and modeling uncertainties to forecasting error and (2) investigate new methods for combining ensembles from different sources and for correcting systematic errors (Park et al. 2008). The following organizations participate in the TIGGE project: Australian Bureau of Meteorology (BoM), China Meteorological Administration (CMA), Meteorological Service of Canada (CMC), Centro de Previsao Tempo e Estudos Climaticos (CPTEC), European Center for Medium-Range Weather Forecasts (ECMWF), Japan Meteorological Agency (JMA), Korea Meteorological Administration (KMA), Météo-France, Met Office of the United Kingdom (UKMO), the National Center for Atmospheric Research (NCAR), and National Centers for Environmental Prediction (NCEP).

Because the configurations (e.g., initial condition field and method of initial perturbation) of the EPSs used by the various Numerical Weather Prediction (NWP) centers differ from each other, many studies have focused on comparing and verifying data from the various EPSs, both single- and multi-model, using deterministic and probabilistic verification indices (Matsueda and Tanaka 2008; Park et al. 2008; Johnson and Swinbank 2009). There are also a number of studies on the predictability of EPSs in high-impact weather forecasts including tropical cyclones, heavy rainfall, and blocking events (Thielen et al. 2008; Bao et al. 2009; He et al. 2009; Froude 2010; Huang et al. 2010; Belanger et al. 2012; Yamaguchi et al. 2012; Tsai and Elsberry 2013). Pappenberger et al. (2008) investigated flood forecasts of a multi-model ensemble for Romania in October 2007 using the LISFLOOD model of the European Flood Forecasting System (EFAS) as the hydrological component and found that this technique would have led to a correct flood warning about 8 days in advance of the severe weather. Yamaguchi and Majumdar (2010) investigated the dynamic mechanism of perturbation growth in a tropical cyclone (Typhoon Sinlaku in 2008) using ECMWF, NCEP, and JMA ensembles and found that the vertical and horizontal distributions of the initial perturbations, as well as the amplitude, were quite different among the three NWP centers before, during, and after the recurvature of Typhoon Sinlaku. Hwang et al. (2012) compared the performances of six ensemble models and a grand ensemble (GE) of the three best ensemble models (ECMWF, UKMO, and CMA) for inconsistencies, jumpiness, and root-mean-square difference (RMSD) for 500 hPa geopotential height, 850 hPa temperature, and mean sea-level pressure and verified that the GE was more consistent than each of the single ensemble models. Additionally, in a case study of a heavy rainfall event using the GE, it was found that the GE was more skillful than the single ensemble model, which could lead to early warnings for heavy rainfall in the medium range.

Besides heavy rainfall, drought is expected to adversely affect an even larger number of people during the twenty-first century (EM-DAT 2007). Because meteorological drought is never the result of a single cause (it is caused by many factors that often act synergistically in nature) (NDMC 2013), predictions of drought are not considered to be weather “predictions” but are instead thought of as “outlooks” in many operational forecast centers. Nevertheless, we use the term “predictions” in this paper when referring to drought forecasts. The prediction of a drought depends on the ability of the model to forecast two fundamental meteorological surface parameters, namely, precipitation and temperature. From the historical record, we know that anomalies of precipitation and temperature may last from several months to several decades. Drought is difficult to predict because the duration of these anomalies depends on air-sea interactions, soil moisture, land surface processes, topography, internal dynamics, and the accumulated influence of dynamically unstable synoptic weather systems at the global scale (NDMC 2013). However, predicting the withdrawal of a drought should be relatively less difficult because a long-lasting extreme drought can be ended by only a day’s worth of heavy rainfall. Unfortunately, few studies to date have dealt with the predictability of drought withdrawal.

This study compared the ability of EPSs from six operational forecasting centers (CMA, CMC, ECMWF, NCEP, KMA, and UKMO) to predict the onset of the summer rainy season and the withdrawal of spring drought over South Korea, both of which were observed on 29 June 2012, using ensemble-mean precipitation and precipitation probability forecasts. In particular, the RMSD between the ensemble-mean forecast and the control run for 500 hPa geopotential height (Z500), 850 hPa temperature (T850), mean sea-level pressure (MSLP), and 850 hPa specific humidity (Q850), and the Brier scores between the precipitation probability and observations, were used to investigate the forecast skill probabilistically. Moreover, the ensemble mean and spread of the Effective Drought Index at all centers were compared.

2 Verification data and methodology

2.1 TIGGE, precipitation, drought, and water resource indices

We used TIGGE data from six operational forecast centers. The data were obtained from the ECMWF archive center. The analysis domain contained 222 grids over 30–45° N and 120–135° E, and each grid was equivalent to ∼1°. The member numbers, resolutions, initial perturbation fields, and data assimilation methods of all EPSs are shown in Table 1. To compare the ensemble-mean forecasts of each EPS, a total of five variables, including Z500, T850, MSLP, Q850, and 24-h accumulated precipitation, were used.

Table 1 Configurations of TIGGE models at CMA, CMC, ECMWF, NCEP, KMA, and UKMO

Precipitation reanalysis data for the onset of the summer rainy season over South Korea were composed of precipitation data and reflectivity data. These data were produced by combining precipitation data from 657 AWS stations and reflectivity data from 11 radars over South Korea; the data had a resolution of 5 km. Moreover, 100 precipitation reanalysis data samples were used to verify the probability forecasts for 24-h accumulated precipitation at each grid. To calculate the drought index and water resources indices, daily precipitation data at Seoul (KMA station number: 47108) from 1951 to 2012 were used. Hourly precipitation data from 12 stations nearest to each grid were used to identify the predictability of the ensemble-mean precipitation at the onset of the summer rainy season (Table 2). Progress of the surface pressure system with the Changma front at the onset of the summer rainy season in middle regions of the peninsula was investigated by using surface weather charts of the KMA from 23 to 30 June 2012.

Table 2 Lists of the nearest 12 stations in grid of precipitation data by TIGGE model over South Korea

The Effective Drought Index (EDI, Byun and Wilhite 1999) was used to quantify dryness. The EDI is a daily index that represents a continuous period and daily strength changes as objective values; it is an intensive measure that considers water accumulation with the weighting function of time passage. The merits of the EDI have been verified by many studies (Morid et al. 2006; Pandey et al. 2008; Lee and Byun 2009; Kim et al. 2011; Lee et al. 2012). The EDI is calculated using three equations:

$$ EP={\displaystyle {\sum}_{n=1}^i\left(\left({\displaystyle {\sum}_{m=1}^n{P}_m}\right)/n\right)} $$
(1)
$$ DEP=EP-MEP $$
(2)
$$ EDI=DEP/ST(DEP) $$
(3)

In Eq. (1), \( {P}_m \) is the precipitation on the m-th day before the time of analysis and \( i \) is the duration of the summation in days. Here, the value of \( i \) was 365. EP is the effective precipitation, and MEP is the climatological mean of EP. DEP is the deviation of EP from MEP. ST(DEP) denotes the standard deviation of each day’s DEP. Further details about these equations can be found in Byun and Wilhite (1999).
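
Equations (1) to (3) can be illustrated with a minimal Python sketch. It assumes that the daily precipitation is supplied as an array ordered from the most recent day backward, and that the climatological mean (MEP) and the standard deviation of DEP have already been computed from the long-term record; the function and argument names are hypothetical and are not taken from Byun and Wilhite (1999).

```python
import numpy as np

def effective_precipitation(precip_before, duration=365):
    """Eq. (1): EP = sum over n = 1..i of (mean of the n most recent days).

    precip_before[0] is P_1 (one day before the analysis time),
    precip_before[1] is P_2, and so on (assumed ordering).
    """
    p = np.asarray(precip_before, dtype=float)[:duration]
    if p.size < duration:
        raise ValueError("need at least `duration` days of precipitation")
    n = np.arange(1, duration + 1)
    running_mean = np.cumsum(p) / n  # (P_1 + ... + P_n) / n for each n
    return float(running_mean.sum())

def effective_drought_index(ep, mep, st_dep):
    """Eqs. (2) and (3): DEP = EP - MEP, EDI = DEP / ST(DEP)."""
    dep = ep - mep
    return dep / st_dep

# Toy example with a 3-day summation: 3 mm fell one day ago, none before.
ep = effective_precipitation([3.0, 0.0, 0.0], duration=3)  # 3/1 + 3/2 + 3/3
edi = effective_drought_index(ep, mep=7.5, st_dep=2.0)     # illustrative MEP, ST(DEP)
```

Because the most recent day enters every running mean, recent rainfall carries the largest weight in EP, which is the time-passage weighting described above.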

The Available Water Resource Index (AWRI, Byun and Lee 2002) was also used to quantify the deficit of water. The AWRI was calculated as

$$ W=EP/\left({\displaystyle {\sum}_{n=1}^i\left(1/n\right)}\right) $$
(4)
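
As a hedged illustration of Eq. (4), the sketch below divides EP by the sum of the weights 1/n. The input ordering (most recent day first) and the function name are assumptions made for this example only.

```python
import numpy as np

def available_water_resource_index(precip_before, duration=365):
    """Eq. (4): W = EP / sum_{n=1..i} (1/n).

    precip_before[0] is the precipitation one day before the analysis
    time (assumed ordering); `duration` is i in Eqs. (1) and (4).
    """
    p = np.asarray(precip_before, dtype=float)[:duration]
    if p.size < duration:
        raise ValueError("need at least `duration` days of precipitation")
    n = np.arange(1, duration + 1)
    ep = np.sum(np.cumsum(p) / n)        # effective precipitation, Eq. (1)
    return float(ep / np.sum(1.0 / n))   # normalize by the harmonic sum

# Toy example: 3 mm one day ago and nothing earlier, with i = 3.
w = available_water_resource_index([3.0, 0.0, 0.0], duration=3)
```

In this toy case the result equals the most recent day's rainfall (3 mm), because rainfall on the most recent day receives a normalized weight of 1 and older rainfall receives progressively smaller weights.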

2.2 Methods of analysis

2.2.1 Root-mean square difference

The RMSD, used here as a measure of forecast quality, is the root-mean-square difference between the ensemble-mean forecast and the control run over the analysis domain. The RMSD is defined as

$$ \mathrm{RMSD}=\sqrt{\frac{{\displaystyle {\sum}_{i=1}^n{\left({x}_{\mathrm{f}}-{x}_{\mathrm{v}}\right)}_i^2} \cos {\varphi}_i}{{\displaystyle {\sum}_{i=1}^n \cos {\varphi}_i}}} $$
(5)

In Eq. (5), \( {x}_{\mathrm{f}} \) and \( {x}_{\mathrm{v}} \) represent the values of the forecast and control run, respectively, and \( {\varphi}_i \) is the latitude of analysis grid point \( i \).
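
A minimal sketch of the latitude-weighted RMSD in Eq. (5) is given below. It assumes that the forecast, the control run, and the latitudes are provided as flattened arrays with one value per grid point; the function name is hypothetical.

```python
import numpy as np

def weighted_rmsd(forecast, control, lat_deg):
    """Eq. (5): cos(latitude)-weighted root-mean-square difference.

    forecast, control : values of the ensemble-mean forecast and the
                        control run at each grid point
    lat_deg           : latitude of each grid point in degrees
    """
    f = np.asarray(forecast, dtype=float)
    c = np.asarray(control, dtype=float)
    w = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    if not (f.shape == c.shape == w.shape):
        raise ValueError("inputs must have the same shape")
    return float(np.sqrt(np.sum(w * (f - c) ** 2) / np.sum(w)))

# Toy example: differences of 1 and 3 units at latitudes 0 and 60 degrees.
rmsd = weighted_rmsd([1.0, 3.0], [0.0, 0.0], [0.0, 60.0])
```

The cosine weighting reduces the influence of high-latitude grid points, which represent smaller areas on a regular latitude-longitude grid.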

2.2.2 Brier score

To verify the accuracy of precipitation probability forecasts, the Brier score (BS) was calculated:

$$ BS=\frac{1}{n}{\displaystyle {\sum}_{k=1}^n{\left({y}_k-{o}_k\right)}^2} $$
(6)

where \( n \) is the total number of forecasts and \( {y}_k \) is the forecasted precipitation probability (Wilks 2011). If precipitation was observed, \( {o}_k \) was defined as 1; otherwise, \( {o}_k \) was 0. The forecasted precipitation probability is defined here as the proportion of ensemble members that predicted an event to occur at a particular grid point. A BS of 1 indicates the worst possible forecast, and a BS of 0 indicates a perfect forecast. Low values of the BS mean that the probability forecasts performed well in regard to reliability and predictive ability.
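
The Brier score can be sketched in a few lines of Python. The example assumes that the forecast probabilities and the binary observations are already paired arrays; the function name is hypothetical.

```python
import numpy as np

def brier_score(forecast_prob, observed):
    """Mean squared difference between forecast probabilities (0 to 1)
    and binary observations (1 if the event occurred, 0 otherwise)."""
    y = np.asarray(forecast_prob, dtype=float)
    o = np.asarray(observed, dtype=float)
    if y.shape != o.shape:
        raise ValueError("forecast_prob and observed must have the same shape")
    return float(np.mean((y - o) ** 2))

perfect = brier_score([1.0, 0.0], [1, 0])    # every forecast certain and correct
worst = brier_score([0.0, 1.0], [1, 0])      # every forecast certain and wrong
hedged = brier_score([0.5, 0.5], [1, 0])     # 50 % forecast everywhere
```

The three toy cases span the range of the score: a certain and correct forecast gives 0, a certain and wrong forecast gives 1, and an uninformative 50 % forecast gives 0.25.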

3 Comparison of ensemble-mean forecast accuracy during April to July

3.1 Root-mean square differences of ensemble means

To compare the prediction accuracy of the six operational forecast centers according to lead time, the RMSDs between the ensemble means and control forecasts of Z500, T850, MSLP, and Q850 were calculated (Figs. 1, 2, 3, and 4).

Fig. 1 Average RMSD of the ensemble mean forecasts of geopotential height at 500 hPa during April to July 2012

Fig. 2 Same as Fig. 1 but for temperature at 850 hPa

Fig. 3 Same as Fig. 1 but for mean sea level pressure

Fig. 4 Same as Fig. 1 but for specific humidity

Figure 1 shows the RMSD distribution of Z500 at 12-h intervals from April to July 2012. Because ensemble spread increases as forecast length increases (Bowler 2006), the centers shared two characteristics: the RMSD increased with lead time, and it decreased as the season progressed from spring to summer. For Z500, the RMSD of the CMA increased abruptly from a 120-h lead time in every month of the analysis period except May. The CMC had the largest RMSD values during June and July; thus, its performance for Z500 was poor. The other ensembles showed similar distributions. For T850 (Fig. 2), the RMSDs of the CMA in April and of the CMC in June and July were large; hence, the accuracy of these predictions was worse than that of the other centers. Additionally, the RMSD increased abruptly from a lead time of about 168 h for the NCEP in June and for the KMA in July. The RMSD of MSLP showed patterns similar to those of Z500 and T850 (Fig. 3), and the CMC again performed poorly. For specific humidity (Fig. 4), for which KMA data were not available, a semidiurnal variation was observed in the RMSD distribution, and the NCEP showed a large improvement in forecast accuracy as the lead time was reduced. However, the CMA, ECMWF, and UKMO did not show the common increase of the RMSD with lead time; their RMSD values were similar to each other across the 1- to 10-day lead times. As for the other variables, the prediction accuracy of the CMC was the worst.

Therefore, the results suggest that the ECMWF, KMA, and UKMO performed the best with respect to the RMSD for average Z500, T850, MSLP, and Q850.

3.2 Brier scores of ensemble-mean 24-h accumulated precipitation

The predictability of the precipitation probability forecast was investigated using the BS for 24-h accumulated precipitation over South Korea at 1- to 10-day lead times. The CMC, which had the worst prediction accuracy in the RMSD analysis, was excluded. Four thresholds were used: 24-h precipitation in excess of 1, 5, 10, and 20 mm (Fig. 5).

Fig. 5 Brier scores of precipitation probability prediction about four thresholds (1.0, 5.0, 10.0, and 20.0 mm per 24 h) at five operational forecast centers during April to July 2012

Generally, the predictability of the precipitation probability forecast was high when the lead time was short. Interestingly, however, the BSs at 4- and 5-day lead times were lower than those at a 1-day lead time for all thresholds during April. The BS distributions from May to July followed the general pattern of higher accuracy at shorter lead times. In a comparison of the monthly BS distributions, the values in April and May were lower than those in June and July; this reflects the lower accuracy of precipitation probability forecasts in summer, when atmospheric instability strengthens and precipitation occurs more frequently. Additionally, the BS in May was very low because there were almost no precipitation days owing to continuous arid weather and the occurrence of a moderate drought. The NCEP performed worse than the other EPSs for the threshold of 1 mm per 24 h in June and for all thresholds in July. The low BSs of the ECMWF and KMA at a 1-day lead time indicate their high performance.

4 Evaluation of precipitation predictability at the onset of the summer rainy season

4.1 Weather charts and precipitation at the onset of the summer rainy season

Owing to the East Asian summer monsoon, Changma (the summer rainy season) generally starts in late June and ends in late July over South Korea (Qian and Lee 2000). In 2012, the summer rainy season started on 29 June. Figure 6 shows the daily surface weather charts of the KMA at 0000 Coordinated Universal Time (UTC) from 23 to 30 June 2012. The Changma front was positioned over the East China Sea on 23 June, and it then fluctuated northward and southward. As a low-pressure system passed over South Korea on 29 June, Changma effectively began in the middle region of South Korea on that day. As shown in Fig. 7a, the daily mean precipitation was less than 1 mm from 23 to 28 June, but more than 20 mm fell on average on both 29 and 30 June 2012. Notably, precipitation occurred on a nationwide scale on 29 June; Kyonggido and the middle Yellow Sea received more than 100 mm, the middle region received more than 50 mm, and Kyongsangdo and several regions of the eastern coastal sea received less than 10 mm (Fig. 7b).

Fig. 6 Weather chart of KMA at 0000 UTC from 23 to 30 June 2012

Fig. 7 a Time series of daily precipitation averaged over the 12 stations in Table 2 from 20 to 30 June 2012. b 24-h accumulated composite precipitation using 657 AWSs and 11 radars with 5-km resolution over South Korea on 29 June 2012

4.2 Distribution of ensemble-mean precipitation predictions from 1- to 10-day lead times

The 24-h accumulated precipitation on 29 June from 1- to 10-day lead times was investigated to compare the predictability of ensemble-mean precipitation at the onset of the summer rainy season in 2012 (Fig. 8). The spatial mean precipitation distributions of five EPSs and multi-model ensemble (MM) averaging 135 members were analyzed. The results of the analysis showed that the forecasted quantitative precipitation and its spatial distribution coincided better with the observed precipitation as lead time was reduced. In particular, the accuracy of predictions with CMA, KMA, and UKMO appeared to show better performance, as the spatial-temporal distributions of the forecasted and observed precipitation were similar. The CMA was able to forecast heavy rainfall over South Korea with a 4-day lead time, but it over-forecasted precipitation of more than 10 mm at all nationwide regions at a 1-day lead time and it could not forecast precipitation with more than 100 mm over Kyonggido and the western coastal regions. The ECMWF and NCEP were able to forecast precipitation of more than 30 mm with 3-day lead time, but they could not forecast precipitation with more than 50 mm over South Korea at a 1-day lead time. Hence, the area of forecasted maximum rainfall appeared in more northern areas than the area for the actual precipitation and the accuracy of the prediction showed poor performance. In comparison, the ensembles from the KMA and UKMO centers could forecast precipitation of more than 30 mm with 4- and 5-day lead times, respectively; they were also able to forecast precipitation less than 10 mm over the eastern coastal region and parts of Kyongsangdo (i.e., the distribution of forecasted precipitation was similar to observations). 
Additionally, the area of maximum rainfall in the western coastal region and the precipitation of less than 10 mm over the South Sea predicted by the KMA and UKMO were more accurate than those from the other centers’ ensembles. Importantly, none of the operational forecast centers could forecast the maximum precipitation of more than 100 mm. The MM, which combines the five ensemble systems, showed a higher accuracy than the ECMWF and NCEP, but a lower accuracy than the CMA, KMA, and UKMO, in forecasting the quantitative amounts and spatial distribution of precipitation.

Fig. 8 Spatial distribution of 24 h accumulated ensemble mean precipitations at five operational forecast centers and multi-model ensembles from 10- to 1-day lead time on 29 June 2012. Predicted precipitation with more than 10 mm per 24 h was shaded (figures of 10- to 6-day lead time were not shown)

4.3 Assessment of precipitation probability forecasts from 1- to 4-day lead times

The prediction ability of the ensemble precipitation probability of all EPSs was evaluated by calculating precipitation probabilities from 1- to 4-day lead times (Fig. 9; figures for PP10 and PP25 are not shown). The precipitation probability is the simple proportion of ensemble members that exceed a threshold. To compare the prediction ability of the EPSs, probabilities were computed for precipitation of more than 10, 25, and 50 mm per 24 h (termed PP10, PP25, and PP50, respectively), and the assessment focused on the area of maximum rainfall and on precipitation amounts of less than 10 mm over Kyongsangdo, the eastern coastal region, and the South Sea.
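
The precipitation probability described above, and one simple way of forming a multi-model ensemble, can be sketched as follows. The array layout (members along the first axis, grid points along the second) and the function names are assumptions for illustration; pooling members with equal weight is one possible construction of a multi-model ensemble and gives more influence to centers with more members.

```python
import numpy as np

def exceedance_probability(members, threshold_mm):
    """Fraction of ensemble members forecasting more than `threshold_mm`.

    members : array of shape (n_members, n_points) holding 24-h
              accumulated precipitation in mm (assumed layout)
    """
    m = np.asarray(members, dtype=float)
    return (m > threshold_mm).mean(axis=0)

def pool_ensembles(ensembles):
    """Concatenate the members of several EPSs into one pooled ensemble."""
    return np.concatenate([np.asarray(e, dtype=float) for e in ensembles], axis=0)

# Toy data: two hypothetical centers, two grid points.
center_a = [[12.0, 60.0], [8.0, 55.0], [30.0, 5.0], [11.0, 70.0]]
center_b = [[0.0, 0.0], [100.0, 100.0]]

pp10_a = exceedance_probability(center_a, 10.0)   # PP10 for center A
pp50_a = exceedance_probability(center_a, 50.0)   # PP50 for center A
pp50_mm = exceedance_probability(pool_ensembles([center_a, center_b]), 50.0)
```

With these toy values, three of the four members of center A exceed 10 mm at both grid points, so PP10 is 75 % at each point.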

Fig. 9 Spatial distribution of precipitation probability forecasts in excess of 10.0, 25.0, and 50.0 mm per 24 h at five operational forecast centers and multi-model ensembles from 1- to 4-day lead time on 29 June 2012 (figures of PP10 and PP25 were not shown)

4.3.1 Distribution of precipitation probability forecast at a 1-day lead time

The result of the analysis for ensemble probability precipitation at 1-day lead time showed that approximately 70 % of the PP50 at CMA appeared in the northern region; thus, the area of maximum rainfall showed a significantly northward shifted distribution as compared to observational data. In addition, the PP50 over the middle regions was higher than for the other EPSs so that the CMA appeared to have a better performance for the threshold of heavy rainfall. However, CMA overestimated precipitation over Kyongsangdo and the eastern coastal region for the threshold of PP10; there was a probability of more than 70 % in all inland areas. The ECMWF had a very poor performance for the threshold of PP50. It had a value of less than 50 % in all areas, and its maximum appeared in the northern region of South Korea. Although the forecasted distribution of precipitation probability at the NCEP was similar to the one of the ECMWF, its probability was higher than the one of the ECMWF and NCEP showed low predictability for the precipitation probability forecast by having lower values in the thresholds of PP50 and PP25 in comparison to observed precipitation, quantitatively. Moreover, the PP10 over the South Sea showed low values with the ECMWF and NCEP. The spatial distribution and values of precipitation probability at KMA and UKMO showed nearly similar patterns. Particularly, it was noted that their prediction ability for precipitation probability was superior to the other EPSs; specifically, the distributions for the PP10 with less than 50 % in the eastern coastal region and the one with more than 90 % over the South Sea were similar to the observed precipitation distribution. Between these, the high performance of UKMO can be explained by the threshold of PP50 because it was shown that the probability of UKMO was larger than the one of KMA over the Yellow Sea. 
As with the ensemble-mean precipitation forecast, the precipitation probability forecast of the MM performed better than those of the ECMWF and NCEP but worse than those of the KMA and UKMO.

To verify the forecasted precipitation probability quantitatively, the reliability between the averaged precipitation probability and the observed frequency was calculated for the 12 grids that cover South Korea in the analysis domain (Fig. 10). The reliability was calculated for each of the five operational forecast centers at seven thresholds (1, 5, 10, 20, 30, 40, and 50 mm per 24 h) using 100 composite precipitation data per grid. Reliability is indicated by the proximity of the plotted curve to the diagonal, and the deviation from the diagonal gives the conditional bias. Points below the diagonal indicate over-forecasting (probabilities are too high), whereas points above it indicate under-forecasting (probabilities are too low). The results showed that, at a 1-day lead time, the CMA over-forecasted at most thresholds, the ECMWF, NCEP, and KMA under-forecasted at most thresholds, and the UKMO over-forecasted at thresholds below PP20 and under-forecasted at thresholds above it. The NCEP, whose PP30 was nearly zero, performed significantly worse than the other centers. It seems appropriate, therefore, to conclude that the precipitation probability forecasts of the KMA and UKMO are the best at a short-range lead time, as their curves were closer to the diagonal than those of the other EPSs.
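
One point of such a reliability diagram can be sketched as below, under the assumption that each point pairs the forecast probability averaged over the grids with the fraction of observation samples that exceeded the same threshold. The function names and the array layout are hypothetical, and the sketch is not necessarily the exact procedure used to produce Fig. 10.

```python
import numpy as np

def reliability_point(forecast_prob, observed_exceed):
    """Return (mean forecast probability, observed frequency) for one threshold.

    forecast_prob   : forecast probability at each grid, values in [0, 1]
    observed_exceed : array of shape (n_grids, n_obs) with 1 where an
                      observed sample exceeded the threshold, else 0
    """
    mean_prob = float(np.mean(np.asarray(forecast_prob, dtype=float)))
    obs_freq = float(np.mean(np.asarray(observed_exceed, dtype=float)))
    return mean_prob, obs_freq

def conditional_bias(mean_prob, obs_freq):
    """Classify a point relative to the diagonal of the reliability diagram."""
    if mean_prob > obs_freq:
        return "over-forecasting"    # point lies below the diagonal
    if mean_prob < obs_freq:
        return "under-forecasting"   # point lies above the diagonal
    return "reliable"

# Toy example: two grids with two observation samples each.
point = reliability_point([0.8, 0.6], [[1, 0], [0, 0]])
label = conditional_bias(*point)
```

In the toy example the mean forecast probability (70 %) exceeds the observed frequency (25 %), so the point falls below the diagonal and is classed as over-forecasting.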

Fig. 10 Reliability diagram of precipitation probability (in excess of 1.0, 5.0, 10.0, 20.0, 30.0, 40.0, and 50.0 mm per 24 h in sequence from the top to the bottom circles) over South Korea at five operational forecast centers from 1- to 4-day lead times on 29 June 2012. Diagonal line indicates perfect reliability

4.3.2 Distribution of precipitation probability forecasts from 2- to 4- day lead times

The CMA appeared to have a wider area of precipitation probability and larger values in most of the thresholds than the results from the other centers at 2- to 4-day lead times (Fig. 9b–d). The ECMWF had a probability of less than 10 % in the threshold of PP50 at 2- to 4-day lead times. Accordingly, its performance in precipitation probability forecasts was poor and the results were significantly different from the distribution of observed precipitation in excess of 100 mm. Similar to the ECMWF, the NCEP showed that PP50 values were less than 10 % at a 4-day lead time and 30 % at 2- and 3-day lead times, which suggests that it performed poorly in assessing the likelihood of heavy precipitation over South Korea. Moreover, although the PP50 with a 2-day lead time had values of more than 60 %, its area of maximum rainfall appeared in more northern areas than the observed precipitation distribution. As was already noted for the precipitation probability at a 1-day lead time, the spatial patterns of the KMA and UKMO were significantly similar to each other at 2- and 3-day lead times, and the area of maximum rainfall was located over the middle of the Yellow Sea and Kyonggido with values of more than 50 % probability; therefore, the results verified that the ensemble probability forecast of these two centers performed better than the forecasts from the other centers.

In the reliability diagram from the 1- to 4-day lead times (Fig. 10), only CMA and UKMO appeared to have over-forecasted the patterns at both 1- and 2-day lead times, whereas the others under-forecasted patterns from 2- to 4-day lead times. The PP20 of NCEP at a 2-day lead time showed nearly zero values; hence, its reliability was low. Consequently, the precipitation probability forecasts at CMA, KMA, and UKMO were found to have a better reliability for use in South Korea than the ones of ECMWF and NCEP.

4.4 Ten-day lag ensemble forecasts

Figure 11 displays the 10-day lag ensemble-mean precipitation and probability forecasts of the five operational forecast centers and the MM for the onset of the summer rainy season on 29 June 2012. The values were averaged over the 12 grids contained within South Korea. The predictability of the onset of the summer rainy season was investigated quantitatively by comparing the values predicted by the five operational forecast centers and the MM over the preceding 10 days of forecasts, using 24-h accumulated precipitation and the PP01, PP05, and PP20 categories. Figure 11 shows the successively predicted 1-day precipitation amounts and their probabilities over the 10 days from 20 to 29 June for the CMA, ECMWF, NCEP, KMA, UKMO, and MM, from top to bottom.

Fig. 11 Ten-day lag ensemble a mean 24 h accumulated precipitation and probability in excess of b 1.0 mm, c 5.0 mm, and d 20.0 mm per 24 h over South Korea at five operational forecast centers and multi-model ensemble from 20 to 29 June 2012

Figure 11a highlights the 10-day lag ensemble-mean precipitation forecast. As mentioned above, there were low amounts of precipitation with less than 5 mm before the onset of the summer rainy season. If there were any detected amounts of predicted precipitation at the centers before 29 June, the data would have shown poor performance for the precipitation forecast. When considering the ensemble spread generally, all EPSs predict the amount of precipitation with less than 5 mm at a 4-day lead time; therefore, it can be concluded that the onset of the summer rainy season can be predicted at a 4-day lead time (before 96 h) over South Korea using TIGGE data. Figure 11b–d illustrate the 10-day lag ensemble forecasts of PP01, PP05, and PP20 over South Korea. Because the threshold of PP01 can be regarded as criteria for precipitation occurrence, the PP01 forecast at a 6-day lead time (24 June) can obviously predict the onset of the summer rainy season with a probability of more than 90 %; probabilities of less than 50 % were observed before that day for all EPSs. After the onset of the summer rainy season, the PP01 results appeared at values of more than 50 % continuously; thus, these data can be used to predict continuous precipitation occurrences for several days.

Because the observed precipitation was less than 5 mm before 29 June, PP05 was used as the threshold to evaluate the accuracy of the precipitation probability forecasts. Similar to the ensemble-mean precipitation forecasts in Fig. 11a, the lag forecasts of PP05 from a 4-day lead time successfully predicted the onset of the summer rainy season on 29 June. PP20 allows a quantitative comparison with the observed mean precipitation because the daily mean precipitation was more than 20 mm on 29 June. However, in this study, PP20 was usually smaller than 50 % from 4- to 10-day lead times, and the probabilities at 1- to 3-day lead times were nearly equal and less than 50 %. Therefore, the lag ensemble of PP20 showed under-forecasting patterns at all EPSs.

Nevertheless, it seems appropriate to conclude that the onset of the summer rainy season over South Korea can be predicted at a 4-day lead time by all EPSs using lag ensemble forecasts of mean precipitation and its probability.

4.5 Brier score at the onset of the summer rainy season

Brier scores from 1- to 10-day lead times were calculated for five operational forecast centers during the onset of the summer rainy season on 29 June 2012 (Fig. 12). Accuracy of the ensemble precipitation probability was evaluated for six thresholds (1, 5, 10, 20, 30, and 50 mm per 24 h) using the precipitation probability over 12 grids in South Korea and composite precipitations (100 observed precipitation data points per 1 grid).

Fig. 12 Brier scores of precipitation probability in excess of 1.0, 5.0, 10.0, 20.0, 30.0, and 50.0 mm per 24 h over South Korea at five operational forecast centers from 1- to 10-day lead time on 29 June 2012

The results showed a general pattern whereby the BS approached zero as the lead times were reduced, which indicates that the predictability of the precipitation probability was improved. When considering the results according to thresholds, the CMA and NCEP performed worse than the other EPSs for thresholds of 1 and 5 mm per 24 h (i.e., small amounts of precipitation), but all EPSs showed high predictability during short lead times, and the BSs were nearly equal to zero for a 3-day lead time. For the threshold of 10 mm per 24 h, the BSs were large for the CMA, ECMWF, and NCEP, which is indicative of poor performance; however, the predictability of all EPSs improved from a 3-day lead time. The BS of the NCEP was the largest (indicating poor performance) among all EPSs for thresholds of 20 and 30 mm per 24 h at a 1-day lead time.

For large amounts of precipitation (threshold of 50 mm per 24 h), which occur infrequently, no improvement in predictability was observed as the lead time changed.

5 Predictability assessment for withdrawal of spring drought

5.1 Precipitation distribution during April to July

Figure 13 illustrates daily precipitation and the distributions of daily EDI, AWRI, and their negative anomalies (shaded areas) at the Seoul station from April to July 2012. The daily precipitation distribution shows little precipitation from late April to late June. Monthly precipitation in May was extremely deficient, at 10 % of normal, and the daily precipitation frequency in June was the fifth lowest observed during the most recent 30 years. Consequently, there were moderate water resource deficit states (AWRI less than 150 mm) in early May, and severe water resource deficit states (AWRI less than 100 mm) and moderate drought states (EDI less than −0.7) during June. However, the dry conditions recovered to normal because of a heavy precipitation event on 29 June 2012. From that day, the summer rainy season started in the central region of South Korea.

Fig. 13 Time series of daily precipitation, effective drought index (EDI), and available water resources index (AWRI) in Seoul from April to July 2012. The dark gray area of the EDI indicates the period of moderate drought (EDI below −0.7), and the gray (dark gray) area of the AWRI indicates the period of moderate (severe) water deficit, with AWRI below 150 (100) mm
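The index thresholds quoted in this subsection can be written as a small classification routine. The Python sketch below is only illustrative: the function names and category labels are hypothetical, and treating a value exactly equal to a threshold as the less severe state is an assumption, since the text specifies only "less than" and "more than".

```python
def awri_state(awri_mm):
    """Classify the water resource state from the AWRI (mm) using the thresholds in the text."""
    if awri_mm < 100.0:
        return "severe deficit"
    if awri_mm < 150.0:
        return "moderate deficit"
    return "normal"  # assumption: exactly 150 mm is counted as normal


def is_moderate_drought(edi):
    """Return True when the EDI is below -0.7, the moderate drought threshold in the text."""
    return edi < -0.7


# Hypothetical values, for illustration only.
states = [awri_state(v) for v in (90.0, 120.0, 160.0)]
```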

5.2 Observed and predicted EDI distribution using ensemble-mean precipitation

The EDI was used to quantitatively assess the predictability of the withdrawal of the spring drought from forecasted ensemble-mean precipitation data. The following method was applied to align in time the locally calculated EDI values and the globally forecasted ensemble precipitation values. Table 3 displays the time table and time numbers of the 240-h forecast with 6-h intervals issued at 0000 UTC (0900 local standard time, LST) on 20 June 2012. To calculate forecasted daily accumulated precipitation at 0000 LST for South Korea, this study averaged the forecasted precipitation at 2100 LST (time number = 3) and 0300 LST (time number = 4), as shown in Eqs. (7), (8), and (9):

$$ {TP}_{21\mathrm{day}}=\left[\left({TP}_{8t}-{TP}_{4t}\right)+\left({TP}_{7t}-{TP}_{3t}\right)\right]/2 $$
(7)
$$ {TP}_{22\mathrm{day}}=\left[\left({TP}_{12t}-{TP}_{8t}\right)+\left({TP}_{11t}-{TP}_{7t}\right)\right]/2 $$
(8)
$$ {TP}_{29\mathrm{day}}=\left[\left({TP}_{40t}-{TP}_{36t}\right)+\left({TP}_{39t}-{TP}_{35t}\right)\right]/2 $$
(9)
Table 3 Time table of UTC and LST at South Korea in ensemble data on June 20, 2012

where TP with a day subscript represents the total precipitation during the 24 h of that day (21, 22, and 29 June) and TP with a subscript nt is the forecasted precipitation accumulated up to time number n. Using the same method, this study calculated forecasted daily accumulated precipitation at five operational forecast centers for 9 days (from 21 to 29 June) from the 20 June 2012 ensemble data. The observed and predicted EDIs were compared for the various centers using forecasted data from 20 to 29 June 2012.
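Equations (7) to (9) follow one pattern: for each successive day, all four time numbers advance by 4 (that is, by 24 h). The Python sketch below generalizes that pattern under stated assumptions: `tp[n]` is taken to hold the precipitation accumulated from the start of the forecast up to time number n, and the function name and `day_offset` argument (0 for 21 June, 8 for 29 June) are hypothetical.

```python
def daily_precipitation(tp, day_offset):
    """Daily precipitation following the pattern of Eqs. (7)-(9).

    tp: sequence indexed by time number; tp[n] is assumed to be the
        precipitation accumulated up to time number n.
    day_offset: 0 for 21 June, 1 for 22 June, ..., 8 for 29 June.
    """
    if not 0 <= day_offset <= 8:
        raise ValueError("day_offset must be between 0 and 8")
    base = 4 * day_offset  # four 6-h steps per day
    window_0300 = tp[base + 8] - tp[base + 4]  # 24 h ending at time number base + 8
    window_2100 = tp[base + 7] - tp[base + 3]  # 24 h ending at time number base + 7
    # Averaging the two windows approximates the 24 h bounded by 0000 LST.
    return (window_0300 + window_2100) / 2


# Hypothetical accumulated series, for illustration only: tp[n] = n squared.
tp = [n * n for n in range(41)]
june_21 = daily_precipitation(tp, 0)
june_29 = daily_precipitation(tp, 8)
```

With `day_offset = 0` the function uses time numbers 8, 4, 7, and 3, as in Eq. (7); with `day_offset = 8` it uses 40, 36, 39, and 35, as in Eq. (9).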

Figure 14 shows the time series of the observed and predicted daily EDIs for the Seoul station that were calculated using 1- to 10-day lead times at 0000 LST for 30 June. The continuing moderate drought recovered to normal conditions because of a 69.5-mm daily precipitation event on 30 June 2012. On that date, the EDI was 0.04, which is above normal. Additionally, the AWRI was 159.1 mm, and the severe water resource deficit state changed to normal conditions (i.e., AWRI more than 150 mm).

Fig. 14 Time series of 9-day predicted EDI (open circles) from 1- to 10-day lead time on 30 June and observed EDI (closed circles) from 21 June to 8 July in Seoul at five operational forecast centers in 2012

The predictability assessment for the withdrawal of the spring drought on 30 June 2012 showed that the predicted EDI of the CMA from a 3-day lead time was the most similar to the observed values among the EPSs. The CMA also predicted the withdrawal of the spring drought well, with forecast EDI values above zero (wet conditions) on 30 June. The predicted EDI of the ECMWF was able to forecast relief of the spring drought (−0.7 < EDI < 0.7), but the values did not show wet conditions. The predicted EDI of the NCEP at 4- and 5-day lead times forecasted the relief of the spring drought on 30 June, but the 3-day lead time differed considerably from the observed values, indicating low consistency of prediction. The predicted EDI of the KMA from 1- to 5-day lead times forecasted relief of the spring drought on 30 June, but there was no period of wet conditions, and large differences existed between the observed and predicted EDIs at 4- and 5-day lead times. The distribution of the EDI at UKMO was similar to that at KMA: the predicted EDI at a 4-day lead time forecasted relief of the spring drought on 30 June, but, as with KMA, the values did not change to wet conditions.

Therefore, it can be concluded that the CMA performed best in predicting the withdrawal of the spring drought, and the results demonstrate that the withdrawal of the spring drought can be forecasted about 3 days (72 h) in advance using predicted EDI values.

5.3 Distribution of ensemble spread in the EDI

Spreads of the predicted EDIs of all EPS members were calculated at all centers from 20 to 29 June (Fig. 15). Their patterns were nearly the same as those of the precipitation spreads; that is, the spread of the EDI depended on the precipitation spread among the ensemble members. Thus, the spreads of ECMWF and NCEP were relatively small, and those of CMA, KMA, and UKMO were relatively large. The KMA had the largest EDI spreads. The distribution of the spreads is displayed using interquartile ranges (IQRs). The upper and lower ends of the box are drawn at the quartiles, and the bar through the box is drawn at the median. The whiskers extend from the quartiles to the maximum and minimum spread values. Circles indicate outliers (upper and lower 10 % of all spreads). On 30 June, the day of the withdrawal of the spring drought, the median spreads of EDI were above zero, similar to the observed EDI, only for CMA at 2- and 3-day lead times. Although the median spread at KMA and UKMO was not similar to the observed EDI on 30 June, its pattern for 3- and 4-day lead times resembled that of the observed EDI from the following day, whereas those of ECMWF and NCEP showed large differences.
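The box-plot elements described above (quartiles, median, and whiskers at the extreme values) can be obtained with the Python standard library. This is a minimal sketch: the function name and sample values are hypothetical, the quartile interpolation method is an assumption because the study does not state one, and the 10 % outlier rule is not implemented.

```python
from statistics import quantiles


def box_summary(values):
    """Return the box-plot statistics for a list of ensemble-member EDI values."""
    # method="inclusive" treats the data as the full population (an assumption).
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),   # lower whisker end
        "q1": q1,             # lower end of the box
        "median": median,     # bar through the box
        "q3": q3,             # upper end of the box
        "max": max(values),   # upper whisker end
        "iqr": q3 - q1,       # interquartile range
    }


# Hypothetical member EDI values, for illustration only.
summary = box_summary([-1.0, -0.5, 0.0, 0.5, 1.0])
```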

Fig. 15 Same as Fig. 14, but for EDI spreads

Accurate prediction of precipitation is essential to forecast the withdrawal of drought using an EPS. Because the EDI is calculated from precipitation with a time-dependent reduction function, and because the predicted precipitation at a 1-day lead time differed somewhat from the observed values, it was difficult to forecast the day of withdrawal of the spring drought in this case. However, it was worthwhile to diagnose the prediction probability of drought withdrawal using ensemble precipitation data and drought indices; in the future, it would be valuable to analyze methods for simultaneously improving the predictability of such high-impact weather events.

6 Summary and conclusions

This study compared prediction accuracy for Z500, T850, MSLP, and Q850 from April to July using the RMSD and BS and assessed the predictability of the onset of the summer rainy season using the ensemble mean, precipitation probability, and 10-day lag ensemble forecasts. Additionally, this study diagnosed the prediction probability of the withdrawal of the spring drought through predicted EDI values that were calculated from ensemble-mean precipitation over South Korea with EPSs at five operational forecast centers in 2012.

First, the analysis of the RMSD between predictions and control runs of the EPSs showed that the CMC had the largest RMSD values, which indicates that the CMC performed worse than the other ensemble systems for Z500, T850, MSLP, and Q850. The CMA displayed a sudden increase of RMSD after a 120-h forecast time; thus, its performance was not much better. The RMSD decreased from April to July for all variables. The assessment using the BS for ensemble probability forecasts of 24-h accumulated precipitation showed that the ECMWF and KMA had the best performance (i.e., nearly zero BS at a 1-day lead time).

Second, the assessment of the predictability of the onset of the summer rainy season using the ensemble mean and precipitation probability verified that it was possible to forecast this phenomenon at a 4-day lead time on average. Although the CMA showed an over-forecasting pattern, it could forecast heavy precipitation at a 4-day lead time. The ECMWF and NCEP showed low prediction ability because their maximum precipitation areas lay north of the observed area and they forecasted smaller amounts of precipitation. The performance of KMA was similar to that of UKMO, and their predicted maximum precipitation area was nearly the same as the observed one. In addition, they could forecast the heavy precipitation on 29 June from 4 days beforehand. The MM performed better than ECMWF and NCEP but worse than CMA, KMA, and UKMO. Precipitation in excess of 100 mm was not forecasted by any EPS. In the analysis of ensemble probability forecasts, the CMA showed an over-forecasting pattern, with large values over a wide region, whereas the ECMWF and NCEP showed under-forecasting patterns, with small values and a maximum precipitation area that lay to the north of the observed area. The KMA and UKMO had the best performance in terms of prediction ability for the spatial distribution of precipitation. The onset of the summer rainy season was forecasted at a 4-day lead time using the 10-day lag ensemble mean and the average probability of precipitation.

Third, this study investigated the predictability of the withdrawal of the spring drought on 30 June using predicted daily EDI values computed from ensemble-mean precipitation data. The results showed that the EDI at CMA from a 3-day lead time performed best, in that the predicted values were the most similar to the observed EDI values. Moreover, the results from the KMA and UKMO for 3- and 4-day lead times showed patterns similar to the observed EDI values, but they could not forecast a return to wet conditions on 30 June. The ECMWF and NCEP could not forecast relief of the spring drought consistently. The distribution of the EDI spreads showed that the ECMWF and NCEP data had relatively smaller ranges than the data from CMA, KMA, and UKMO, and the range increased when precipitation was forecasted. Overall, however, it was difficult to predict the withdrawal of the spring drought, and additional studies need to be conducted.