1 Introduction

Comprehensive and reliable meteorological datasets are essential for driving land surface hydrological models (Fang et al. 2015; Guo et al. 2018; Raimonet et al. 2017). In addition, the availability of large-scale, long-term datasets of land surface water and energy states is critical for understanding the global environmental system and water cycle variables (Chun-Lin and Xin 2011; Moradkhani 2008). The Global Land Data Assimilation System and the Famine Early Warning Systems Network Land Data Assimilation System provide a suite of multi-model and multi-forcing estimates of hydro-climate conditions for data-sparse, food-insecure regions (McNally et al. 2019, 2017; Rodell et al. 2004). As a continuing effort, the North American Land Data Assimilation System has been implemented to produce substantial estimates of land surface hydrology (Mitchell et al. 2004). Specifically, high-quality forcing datasets can be used with a land surface model (LSM) to identify the onset and demise of droughts and floods (Dong et al. 2011). Such datasets can also be used to evaluate historical extreme events and to help prevent record-setting disasters in advance by combining effective water resource management measures with model simulation (Lee et al. 2017a). However, consistent observations of land surface water and energy budgets are frequently unavailable over large scales (Sheffield et al. 2006). Thus, the significance of accurate forcing for large-scale land surface modeling efforts has been well established (Berg et al. 2003; Fekete et al. 2004; Nijssen and Lettenmaier 2004; Sheffield et al. 2004).

Globally, many studies have created several decadal time series of forcing data, including the variables of precipitation, temperature, humidity, and radiation at daily to sub-daily timescales. Most of these products have been based on reanalysis data such as those provided by the National Centers for Environmental Prediction–National Center for Atmospheric Research (Kalnay et al. 1996; Kistler et al. 2001), the European Center for Medium-Range Weather Forecasts ERA-40 and ERA-15 (Gibson et al. 1997; Uppala et al. 2005), the National Centers for Environmental Prediction–Department of Energy (Kanamitsu and Saha 1996), and the National Aeronautics and Space Administration Global Modeling and Assimilation Office (Schubert et al. 1993) reanalysis. In addition to these global datasets, a few datasets have been developed to improve the accuracy of meteorological data on a regional scale, such as the Canadian Land Data Assimilation System (Carrera et al. 2015), the European Land Data Assimilation System (Jacobs et al. 2005, 2008), and the China Meteorological Forcing Dataset (He et al. 2020).

In South Korea, few studies have attempted to develop reliable meteorological forcings, although high-resolution precipitation and temperature data are necessary to identify detailed spatial drought and flood conditions at a regional scale. The need to develop forcing datasets and hydrologic models has been highlighted so that regional land surface modeling can reach its full potential as a flood and drought monitoring system that provides continuous high-resolution water and energy balance variables in South Korea (Jung et al. 2019). The Korea Land Data Assimilation System (KLDAS) has been established for agricultural drought (i.e., soil moisture deficit) monitoring in South Korea (Jung 2021). Soil moisture and evapotranspiration were calculated throughout South Korea using the KLDAS of the Korea-Land Surface Information System to monitor drought (Park et al. 2021). Among the available global products, the Modern-Era Retrospective Analysis for Research and Application version 2 (MERRA-2) can represent most inland areas in South Korea at an intermediate spatial resolution of 0.5°. Lim et al. (2012) developed meteorological forcing data to drive the LSM at each grid point using analysis-based and observation-based data. The forcing data consisted of temperature, relative humidity, wind speed, surface pressure, and longwave radiation. They were obtained from the 0.5625° Global Data Assimilation and Prediction System (GDAPS) analysis fields of the Korea Meteorological Administration at 6-h intervals. All variables were interpolated from the GDAPS grid to 10 km for South Korea. In addition, precipitation was generated using 6-h accumulated precipitation reports from the World Meteorological Organization Global Telecommunication System and hourly precipitation from the Automatic Weather System over the Korean Peninsula. Precipitation data were then spatially interpolated using Shepard’s distance-weighting method (Shepard 1968; Willmott et al. 1985) for South Korea. However, this forcing dataset is only available for the 5-year period from 2004 to 2008.

Berg et al. (2003), Ngo-Duc et al. (2005), and Sheffield et al. (2004) showed that hybrid datasets, which combine reanalysis with observation-based datasets to remove biases, offer great potential and clear advantages. This hybrid approach preserves the consistency and continuity of the reanalysis. Many studies have developed large-scale, long-term datasets using a similar hybrid method (Nijssen et al. 2001; Maurer et al. 2002; Nijssen and Lettenmaier 2004; Livneh et al. 2013). To overcome the current restriction of forcing data availability for South Korea, this study adopted a hybrid dataset of observations and reanalysis data.

This paper aimed to create a 41-year (1980‒2020) near-surface meteorological dataset (hereafter called ASOS-MERRA2) with hourly time scales at 0.125° spatial resolution. The ASOS-MERRA2 represents an improvement of the reanalysis dataset (i.e., MERRA-2) in South Korea. An expected advantage of these newly created forcing data is that they should contain temporal and spatial characteristics similar to ground-truth meteorological observations. This study describes the process of creating ASOS-MERRA2 and its validation, including bias and correlation improvements from the existing MERRA-2. In addition, comparison statistics are presented, such as annual, monthly, and daily statistics of precipitation, temperature, pressure, specific humidity, and wind speed from hourly data. The evaluation was carried out in four major river basins in South Korea to explore basin-wide characteristics. The key features of the ASOS-MERRA2 are its long period, reliability, high resolution, and continuity of weather data development.

The remainder of this paper is organized as follows. Section 2 describes the study area and the observation and reanalysis data, and Section 3 explains the method used to merge MERRA-2 and ASOS. Sections 4 and 5 demonstrate the improvement and comparison statistics of the ASOS-MERRA2 in spatial and temporal scales using the independent stations approach (He et al. 2020). Finally, Section 6 concludes with the newly developed, locally available forcing dataset and its potential use for hydrologic modeling.

2 Data

This section describes the study area, the local weather observations, and the base meteorological forcing dataset. The method used to merge the observational and reanalysis datasets into a new forcing dataset for South Korea is described in Section 3.

2.1 Study area

The study area covers the southern part of the Korean Peninsula between 34–38.5°N and 126–130°E. Over 60% of the domain consists of mixed forests, deciduous broadleaf forests, and woody savannas, whereas 30% is cropland, mainly in the western and southwestern regions. Urbanized areas accounted for 5.2% of the domain (Jung et al. 2020; Sur et al. 2015). This study focused on four major river basins in South Korea. The national water management framework has been mainly oriented to the Han River basin (26,000 km²) in the north, Geum River basin (10,000 km²) in the west, Nakdong River basin (24,000 km²) in the southeast, and Seomjin-Yeongsan River basin (8,000 km²) in the southwest of South Korea (Lee and Kim 2007). The geographical regions of the four major river basins are shown in Fig. 1. In Section 5, each forcing was analyzed in the different river basins. Figure 1 shows the spatial distribution of the Automated Synoptic Observing System (ASOS) stations and the availability of data at each station in South Korea.

Fig. 1

Four major river basins and 94 Automated Surface Observing System (ASOS) stations (yellow dots) with ASOS availability (blue dots) in South Korea

2.1.1 ASOS

Automated Surface Observing System units (Gang et al. 2016; In et al. 2014; Kim et al. 2020, 2018; Lee et al. 2017b) are automated sensors designed to serve meteorological and aviation observational needs (www.weather.gov/media/asos/). Ninety-four ASOS stations are operated over the four major river basins by the Korea Meteorological Administration in South Korea. Precipitation, temperature, pressure, relative humidity, and wind speed observations were collected to generate a forcing dataset, as described in Section 3. The advantage of the automation of the surface observations is the reduction or elimination of direct human involvement in collecting, processing, and disseminating (transmitting, displaying, and broadcasting) the surface observations. Even though the ASOS is a highly automated and reliable product, quality control has been implemented to ensure the automated product is of a high standard. There are three levels of quality control for ASOS at different temporal and spatial scales. Level 1 is performed on site in real time before an observation is transmitted. Level 2 is completed for a designated area, usually within 2 h of the scheduled observation transmission time. Level 3 is performed centrally on all aviation routine weather reports nationwide in South Korea.

2.2 MERRA-2

The MERRA-2 (Gelaro et al. 2017) datasets were used to provide the underlying dataset for ASOS-MERRA2. MERRA-2 is available globally at an hourly temporal resolution and a horizontal resolution of 2/3° longitude by 1/2° latitude. It was intended as an intermediate reanalysis dataset that draws on recent developments in modeling and data assimilation at the National Aeronautics and Space Administration Global Modeling and Assimilation Office (GMAO) and serves as a milestone toward GMAO’s longer-term goal of developing an Integrated Earth System Analysis. Specifically, the MERRA-2 data product was used to fill the periods for which ASOS station data were not measured. The detailed process of combining the data of ASOS and MERRA-2 is described in Section 3.

3 Method of generating ASOS-MERRA2

Even though most ASOS stations have continuous and automated observations, some data were missing due to operational issues such as initiation, partial malfunction, and potential maintenance. About 28% (11.5 out of 41 years on average) of the data were either unmeasured or unavailable from ASOS stations, and the corresponding MERRA-2 product was used to fill these gaps. Each 0.125° ASOS-MERRA2 grid cell contained within a 0.5° MERRA-2 cell was assigned the value of that MERRA-2 cell. For example, if the MERRA-2 cell at 38.5°N and 129.5°E has a precipitation value of 1.58 m, the corresponding 16 ASOS-MERRA2 cells (all combinations of 38.3125°N, 38.4375°N, 38.5625°N, 38.6875°N and 129.3125°E, 129.4375°E, 129.5625°E, 129.6875°E) also have a value of 1.58 m.
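The nesting of 0.125° cells inside a 0.5° MERRA-2 cell can be illustrated with a minimal Python sketch. The function names are hypothetical and are not taken from the ASOS-MERRA2 processing code; the sketch only reproduces the example above, in which all 16 fine cells inherit the value of the coarse cell that contains them.

```python
def fine_cell_centers(coarse_center, coarse_size=0.5, fine_size=0.125):
    """Return the centers (1-D) of the fine cells nested in one coarse cell."""
    n = round(coarse_size / fine_size)            # 4 fine cells per coarse side
    lower_edge = coarse_center - coarse_size / 2  # edge of the coarse cell
    return [lower_edge + (i + 0.5) * fine_size for i in range(n)]


def replicate_coarse_value(lat_center, lon_center, value):
    """Assign the coarse-cell value to every nested fine cell (no smoothing)."""
    return {
        (lat, lon): value
        for lat in fine_cell_centers(lat_center)
        for lon in fine_cell_centers(lon_center)
    }


# Example from the text: MERRA-2 cell at 38.5N, 129.5E with a value of 1.58
cells = replicate_coarse_value(38.5, 129.5, 1.58)
```

In this sketch, `cells` holds 16 entries keyed by the (latitude, longitude) centers listed in the example, each carrying the same value.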

The algorithm for creating the new forcing dataset is shown in Fig. 2. As an example, consider an ASOS station that has a 3-h time series of precipitation from January 1985 to December 2020 (the available period) and no data between January 1980 and December 1984 (the missing period). The first step was to divide the 3-h data into an hourly time series. The next step was to find the corresponding MERRA-2 grid precipitation data for the missing period. We then combined the MERRA-2 data for the missing period and the ASOS data for the available period into a continuous time series from January 1980 to December 2020. The output time series from this combination is referred to as ASOS-MERRA2. Once we had a continuous 41-year-long time series at each station, these point data were spatially interpolated using the inverse distance weighting method (Shepard 1968) at a grid size of 0.125°. The same procedure was applied to the temperature, pressure, and wind speed data to generate ASOS-MERRA2. As an exception, the specific humidity (SH) was calculated from the relative humidity (RH) of both ASOS and MERRA-2 and from the newly developed ASOS-MERRA2 temperature (T) and pressure (PS). Once we had continuous time series of RH, T, and PS, SH was calculated by Bolton’s method (Bolton 1980) using the following equations:

$${vp}_{sat}=6.112\times {e}^{\frac{\left(17.67\times T\right)}{\left(T+243.5\right)}}$$
(1)
$$vp={vp}_{sat}\times \frac{RH}{100}$$
(2)
$${T}_{d}=\mathrm{ln}\left(\frac{vp}{6.112}\right)\times \left[\frac{243.5}{\left(17.67-\mathrm{ln}\left(\frac{vp}{6.112}\right)\right)}\right]$$
(3)
$$vp=6.112\times {e}^{\frac{\left(17.67\times {T}_{d}\right)}{\left({T}_{d}+243.5\right)}}$$
(4)
$$SH=\frac{0.622\times vp}{\left(PS-0.378\times vp\right)}\times 1000$$
(5)

where

$${vp}_{sat}=\mathrm{saturation\;vapor\;pressure }(\mathrm{hPa})$$
$$vp=\mathrm{vapor\;pressure }(\mathrm{hPa})$$
$$T=\mathrm{air\;temperature }(^\circ \mathrm{C})$$
$${T}_{d}=\mathrm{dew\;point\;temperature }(^\circ \mathrm{C})$$
$$RH=\mathrm{relative\;humidity }\left(\mathrm{\%}\right)$$
$$SH=\mathrm{specific\;humidity }(\mathrm{g}/\mathrm{kg})$$
$$PS=\mathrm{surface\;pressure }(\mathrm{hPa})$$

Fig. 2

Algorithm for creating the Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2)
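Equations (1) to (5) can be sketched in Python as follows. This is a minimal illustration rather than the processing code used for ASOS-MERRA2: the function name is hypothetical, the logarithm in Eq. (3) is assumed to be the natural logarithm (which is what makes Eq. (3) the inverse of Eq. (1)), and the guard against non-positive RH is an added assumption.

```python
import math


def specific_humidity(t_c, rh_pct, ps_hpa):
    """Return (SH in g/kg, dew point in deg C) from T (deg C), RH (%), PS (hPa)."""
    if rh_pct <= 0:
        # Assumption: Eq. (3) is undefined for RH <= 0, so reject such input
        raise ValueError("relative humidity must be positive")
    vp_sat = 6.112 * math.exp(17.67 * t_c / (t_c + 243.5))   # Eq. (1)
    vp = vp_sat * rh_pct / 100.0                             # Eq. (2)
    log_ratio = math.log(vp / 6.112)                         # natural logarithm
    t_d = log_ratio * 243.5 / (17.67 - log_ratio)            # Eq. (3)
    vp = 6.112 * math.exp(17.67 * t_d / (t_d + 243.5))       # Eq. (4)
    sh = 0.622 * vp / (ps_hpa - 0.378 * vp) * 1000.0         # Eq. (5)
    return sh, t_d


sh, t_d = specific_humidity(t_c=20.0, rh_pct=50.0, ps_hpa=1000.0)
```

At saturation (RH = 100%), the dew point returned by this sketch equals the air temperature, which is a quick consistency check on Eqs. (1) and (3).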

Again, the SH point data were spatially interpolated using the inverse distance weighting method at a grid size of 0.125°. The final product of ASOS-MERRA2 was a gridded dataset from January 1980 to December 2020 with hourly temporal resolution. There were 721 ASOS-MERRA2 grids (0.125°) that covered the southern part of the Korean Peninsula, while MERRA-2 only had 65 grids (0.5°) across the same area. The physical definitions of the variables used are given in Table 1.
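The merging step that yields a continuous station series can be sketched as follows. The source states only that the 3-h data were divided into an hourly series; the even split of each 3-h precipitation total over the three hours ending at its timestamp is an assumption made here for illustration, and both function names are hypothetical.

```python
from datetime import datetime, timedelta


def split_3h_to_hourly(series_3h):
    """Spread each 3-h total evenly over the 3 hours ending at its timestamp.

    Assumption: the timestamp marks the end of the accumulation period.
    """
    hourly = {}
    for stamp, total in series_3h.items():
        for back in range(3):
            hourly[stamp - timedelta(hours=back)] = total / 3.0
    return hourly


def merge_station(asos_hourly, merra2_hourly):
    """Use the ASOS value where one exists; otherwise fall back to MERRA-2."""
    merged = {}
    for stamp in sorted(set(asos_hourly) | set(merra2_hourly)):
        obs = asos_hourly.get(stamp)
        merged[stamp] = obs if obs is not None else merra2_hourly.get(stamp)
    return merged


# Tiny example: one 3-h ASOS report and four hourly MERRA-2 values
asos = split_3h_to_hourly({datetime(1985, 1, 1, 3): 3.0})
merra2 = {datetime(1985, 1, 1, h): 0.5 for h in range(4)}
series = merge_station(asos, merra2)
```

Here the hour with no ASOS value (00:00) takes the MERRA-2 value, while the remaining hours keep the ASOS-derived values.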

Table 1 Definitions of the variables used in the Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2) forcing data set
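The inverse distance weighting used in Section 3 to grid the station series can be illustrated with a minimal sketch. The power of 2, the use of all stations for every grid point, and the distance measured in degrees are assumptions for illustration; the source does not state these settings, and the function name is hypothetical.

```python
import math


def idw(stations, target, power=2.0):
    """Inverse distance weighted estimate at target = (lat, lon).

    stations: list of (lat, lon, value) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, value in stations:
        # Assumption: plain distance in degrees, adequate for a small domain
        distance = math.hypot(lat - target[0], lon - target[1])
        if distance == 0.0:
            return value          # target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


stations = [(36.0, 127.0, 10.0), (36.0, 128.0, 20.0)]
midpoint_value = idw(stations, (36.0, 127.5))
```

A grid point midway between two stations receives the average of their values, and a grid point located at a station reproduces that station's value.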

4 Validation of ASOS-MERRA2

The most important question is whether the newly generated ASOS-MERRA2 is better than the existing MERRA-2 datasets, which are widely used globally. To test the robustness of the ASOS-MERRA2, the observational data of nine of the 94 stations (about 9.6% of all stations) in South Korea originally used to create ASOS-MERRA2 were intentionally removed, and then the ASOS-MERRA2 forcing was regenerated (Fig. 3). These nine stations were selected based on two criteria: (1) at least two stations were selected within each basin, and (2) the selected stations were at least 80 km apart to avoid overlapping with other validation-purpose stations. This validation method was adopted from the China Meteorological Forcing Dataset (He et al. 2020) and gave us another version of ASOS-MERRA2. Hereafter, the altered version is referred to as ASOS-MERRA2*, where * stands for the distinction from ASOS-MERRA2. It should also be noted that ASOS-MERRA2* is not the released version of ASOS-MERRA2 and is for validation purposes only. Because the observational data from the nine removed stations were not involved in creating the gridded ASOS-MERRA2*, they could be used as independent observations to evaluate the result of the algorithm of ASOS-MERRA2*. The statistical error metrics of MERRA-2 and ASOS-MERRA2* were calculated on a daily scale against local observations from the nine stations, and the results are summarized in Fig. 4. The statistical error metrics of the ASOS-MERRA2* based on the evaluation at the nine excluded stations were compared with the gridded background (MERRA-2) dataset for 1980–2020. To test whether the two mean values (MERRA-2 and ASOS-MERRA2 for each variable: precipitation, air temperature, surface pressure, specific humidity, and wind speed) were statistically different, a two-sample t-test (Snedecor and Cochran 1989) was applied with the null hypothesis that the population means of the two samples were equal. The null hypothesis was not rejected, and we concluded that the two population means were not significantly different at the 0.05 significance level.
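The two-sample t-test described above can be sketched with the Python standard library. The pooled-variance form and the large-sample normal approximation of the critical value are assumptions made for this illustration (the approximation is reasonable for 41 years of daily values but not for small samples); the source does not specify these details, and the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))


def means_differ(a, b, alpha=0.05):
    """Two-sided test; True means the null hypothesis of equal means is rejected."""
    # Assumption: normal approximation of the critical value (about 1.96)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(pooled_t_statistic(a, b)) > critical
```

For example, `means_differ(merra2_daily, asos_merra2_daily)` would return `False` when the null hypothesis is not rejected at the 0.05 level, where both argument names are placeholders for daily series of one variable.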

Fig. 3

Validation of the Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2) using hypothetically unknown stations. Green stars: nine ASOS stations excluded from ASOS-MERRA2; red dots: 85 ASOS stations used in ASOS-MERRA2*. There are three values from each green star: [1] MERRA-2 grid value, [2] ASOS-MERRA2* grid value, and [3] ASOS station value

Fig. 4

Statistical error metrics for the validation of the Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2). R², coefficient of determination; MAE, mean absolute error; RMSE, root mean square error; MERRA-2, difference or relationship between [1] MERRA-2 and [3] ASOS observations; ASOS-MERRA2*, difference or relationship between the [2] ASOS-MERRA2* and [3] ASOS observations

For precipitation and temperature, the coefficient of determination (R²) of ASOS-MERRA2* was only slightly higher than that of MERRA-2 for 1980 to 2020, whereas the mean absolute error (MAE) and root mean square error (RMSE) of ASOS-MERRA2* precipitation and temperature were lower than those of MERRA-2. Although the R² of pressure improved only somewhat from MERRA-2 to ASOS-MERRA2*, the MAE and RMSE improved substantially, with errors closer to zero. For specific humidity and wind speed, all statistical error metrics (R², MAE, and RMSE) were improved for ASOS-MERRA2*. Among the five forcings, the R² of wind speed was the lowest for both MERRA-2 and ASOS-MERRA2*. Because the wind speed in this comparison was derived by combining the u- and v-direction components, the lower daily R² may be due to this compound and sensitive estimation. In conclusion, the algorithm for creating ASOS-MERRA2 can reduce biases and improve correlations with observations relative to the gridded MERRA-2 input. Because one validation set may not be enough to establish that these results are valid, we tested two additional validation sets, for a total of three different sample datasets evaluated with the same validation method. The two additional validation sets showed nearly the same results in comparison against ASOS ground observations. Since the results did not change, the additional validation comparisons are not included in this section.
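The three error metrics discussed above can be computed as in the following minimal sketch. R² is taken here as the squared Pearson correlation between the gridded estimate and the station observation, which is an assumption because the source does not give its exact definition; the function name is hypothetical.

```python
import math


def error_metrics(estimates, observations):
    """Return R2 (squared Pearson correlation), MAE, and RMSE."""
    n = len(observations)
    pairs = list(zip(estimates, observations))
    mae = sum(abs(e - o) for e, o in pairs) / n
    rmse = math.sqrt(sum((e - o) ** 2 for e, o in pairs) / n)
    mean_e = sum(estimates) / n
    mean_o = sum(observations) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in pairs)
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_o = sum((o - mean_o) ** 2 for o in observations)
    r2 = cov ** 2 / (var_e * var_o)
    return {"R2": r2, "MAE": mae, "RMSE": rmse}


# An estimate that is perfectly correlated with the observation but biased
metrics = error_metrics([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

The toy input has an R² of 1 and a nonzero MAE and RMSE, which shows why a small change in R² can coincide with a large reduction in MAE and RMSE, as reported for pressure.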

5 MERRA-2 and ASOS-MERRA2 comparison

5.1 Precipitation

The spatial distribution of precipitation from MERRA-2, ASOS-MERRA2, and their differences are compared in Fig. 5. Each grid cell in the top panel shows the mean annual total precipitation over 41 years (1980–2020). The overall amount of ASOS-MERRA2 precipitation was greater than that of MERRA-2. The annual total precipitation (ATP) ranged from 993 to 1631 mm in ASOS-MERRA2 and from 881 to 1583 mm in MERRA-2. Wetter conditions were found in the Seomjin-Yeongsan River basin, while the Nakdong River Basin had relatively dry conditions for both forcings, consistent with the severe droughts reported for this basin (Ryu et al. 2002; Yang and Kim 2013). Most regions had more precipitation in the ASOS-MERRA2, except the northern and eastern parts of the Han River Basin, which had less precipitation. The bottom panel of Fig. 5 shows the number of rain days (more than 2 mm of precipitation per day) per year for each grid cell. The ASOS-MERRA2 forcing had more rain days in the Geum and Seomjin-Yeongsan River basins, while MERRA-2 had more rain days in the Nakdong River Basin and the northern part of the Han River Basin. The ATP and rain days showed a similar spatial pattern in MERRA-2 and ASOS-MERRA2, yet ASOS-MERRA2 provided more spatially distinctive patterns owing to the observation-driven data.
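The two precipitation statistics used in this section can be derived from an hourly series as in the following minimal sketch. It assumes the hourly series starts at the beginning of a day and contains no gaps; the function names are hypothetical, and the 2 mm threshold follows the rain-day definition given above.

```python
def daily_totals(hourly_mm):
    """Sum consecutive blocks of 24 hourly values into daily totals (mm)."""
    return [sum(hourly_mm[i:i + 24]) for i in range(0, len(hourly_mm), 24)]


def annual_precip_stats(daily_mm, threshold_mm=2.0):
    """Return (total precipitation, number of rain days) for one year of data."""
    total = sum(daily_mm)
    rain_days = sum(1 for p in daily_mm if p > threshold_mm)
    return total, rain_days


# Three synthetic days: light rain, dry, heavy rain
hourly = [0.1] * 24 + [0.0] * 24 + [0.5] * 24
daily = daily_totals(hourly)
total, rain_days = annual_precip_stats(daily)
```

Applied to one grid cell and one year of hourly ASOS-MERRA2 or MERRA-2 precipitation, the two returned values correspond to the ATP and the rain-day count mapped in Fig. 5.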

Fig. 5

Spatial analysis of precipitation using Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2) and MERRA-2, and the difference between these. To calculate the difference, MERRA-2 grid (0.5°) was evenly downscaled to match ASOS-MERRA2 grid resolution (0.125°)

The ATP, monthly mean precipitation, and number of rain days were calculated for South Korea and the four basins as a time series (Fig. 6). For the ATP, ASOS-MERRA2 generally had more precipitation than MERRA-2, and the difference between the two forcings was evident in the Geum River Basin. In addition to the difference in the ATP in the Geum River Basin, this basin also did not show good agreement of the rain days between the two forcings. Recent reports have addressed the climatic complexity of the Geum River Basin (Ahn et al. 2018, 2016; Lee et al. 2018), and it appears that ASOS-MERRA2 successfully captures the dynamics of precipitation in this basin. South Korea had high precipitation in summer, which peaked in July. The four basins had nearly identical monthly cycles of precipitation. The number of rain days also showed similar fluctuations and ranges (80–140 days) between MERRA-2 and ASOS-MERRA2.

Fig. 6

Temporal analysis of precipitation using Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2) and MERRA-2 for all South Korea and four river basins. The blue line indicates ASOS-MERRA2 and the red line indicates MERRA-2

5.2 Temperature

The spatial distribution of the daily mean temperature from MERRA-2, ASOS-MERRA2, and their differences are compared in Fig. 7. In the top panel, the December–January–February (DJF) daily mean temperature (ASOS-MERRA2) was relatively high along the coastline. The inland and northern parts of the Han River Basin had cooler conditions than other areas. This spatial pattern was similar to the MERRA-2 results. The overall DJF temperature from MERRA-2 was slightly higher than that of ASOS-MERRA2. The difference in daily temperature between the two forcings was exceptionally large in the Han River Basin. The bottom panel shows the June–July–August (JJA) daily mean temperature. The spatial patterns of the MERRA-2 results for DJF and JJA were similar, while the ASOS-MERRA2 results for winter and summer were clearly different. The higher spatial resolution of the ASOS-MERRA2 product provided a more detailed spatial pattern of temperature. Unlike the winter temperature, the overall temperature in summer was higher in the ASOS-MERRA2 results.

Fig. 7

Spatial analysis of temperature using Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2) and MERRA-2, and the difference between these. DJF, December–January–February; JJA, June–July–August

The annual mean, monthly mean temperature, and diurnal temperature range (DTR) were calculated for South Korea and the four basins (Fig. 8). The DTR was calculated from the difference between the daily maximum and minimum temperatures. This difference was averaged for each month. The ASOS-MERRA2 annual mean temperature was generally higher than that of MERRA-2, and the difference was primarily observed in the Han River Basin. However, the difference was only up to 1.8 K in this basin, and the temperature fluctuations from both forcings were analogous to each other. The monthly mean temperatures were nearly identical for ASOS-MERRA2 and MERRA-2. There were temperature peaks in July and August in all four basins. The DTR of ASOS-MERRA2 was mostly larger than that of MERRA-2 except in summer (JJA). The ASOS-MERRA2 hourly temperature was a more precise record, while the DTR of MERRA-2 did not demonstrate the actual range of hourly temperatures. The overall temperature results were comparable for both forcings within a feasible range.
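The diurnal range calculation described above can be sketched as follows. The sketch groups hourly values by calendar day, takes the daily maximum minus the daily minimum, and averages the result by calendar month across all years in the input; the function name and the input layout (a list of timestamp and value pairs) are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def monthly_mean_diurnal_range(hourly):
    """hourly: iterable of (datetime, value). Returns {month: mean daily range}."""
    by_day = defaultdict(list)
    for stamp, value in hourly:
        by_day[stamp.date()].append(value)
    by_month = defaultdict(list)
    for day, values in by_day.items():
        by_month[day.month].append(max(values) - min(values))
    return {month: sum(r) / len(r) for month, r in by_month.items()}


# Two synthetic January days: one with a 23-degree range, one constant
start = datetime(2020, 1, 1)
day1 = [(start + timedelta(hours=h), float(h)) for h in range(24)]
day2 = [(start + timedelta(hours=24 + h), 5.0) for h in range(24)]
dtr = monthly_mean_diurnal_range(day1 + day2)
```

The same function applies unchanged to hourly pressure, specific humidity, or wind speed, since the other diurnal ranges in this section are defined in the same way.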

Fig. 8

Temporal analysis of temperature using Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2) and MERRA-2 for all South Korea and four river basins. The blue line indicates ASOS-MERRA2 and the red line indicates MERRA-2

5.3 Pressure

The pressure results were compared at monthly and daily time steps (Fig. 9). The spatially averaged time series of the mean pressure (41 years \(\times\) 12 months = 492) from MERRA-2 and ASOS-MERRA2 were compared in a scatter plot. The least squares line indicated that both forcings were systematically well matched, but the ASOS-MERRA2 results were 22.3 hPa (2.24%) higher than MERRA-2 in South Korea. The Han River Basin had the most considerable pressure difference (34.81 hPa; 3.55%) in the monthly mean pressure and its annual cycle. The winter season had a slightly higher pressure than the summer months. However, the range of the monthly mean pressure throughout the year was up to 15 hPa in most basins.

Fig. 9

Temporal analysis of pressure using Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2) and MERRA-2 for all South Korea and four river basins. The blue line indicates ASOS-MERRA2 and the red line indicates MERRA-2

The diurnal pressure range (DPR) was calculated from the difference between the daily maximum and minimum pressures from the hourly pressure data. This difference was averaged for each month. Both forcings had nearly identical DPRs, and the summer season had a low DPR in all basins. The pressures from MERRA-2 and ASOS-MERRA2 were systematically biased against each other, but the overall diurnal characteristics from the hourly results were similar. The spatial analysis of pressure between ASOS-MERRA2 and MERRA-2 is described in the Appendix Fig. 12. The eastern part of South Korea, which covers the Han and Nakdong River basins, is primarily mountainous. The surface pressure in these two basins is relatively low (981.39 to 995.61 hPa), whereas that in the other two basins (Geum and Seomjin-Yeongsan) is 1001.64 to 1003.15 hPa.

5.4 Specific humidity

The specific humidity results were compared for monthly and daily time steps (Fig. 10). The scatter plot shows the spatially averaged time series of the mean humidity from MERRA-2 and ASOS-MERRA2. Compared with the pressure results, the MERRA-2 and ASOS-MERRA2 values were close to, but slightly scattered around, the least squares line. This is because the specific humidity was calculated from multiple sources (relative humidity, temperature, and pressure), and Bolton’s equation is not a linear function of these inputs. The ASOS-MERRA2 results were 0.99 g/kg (13.45%) higher than those of MERRA-2 in South Korea. However, unlike the pressure results, both forcings of the specific humidity were within the same range of 0 to 20 g/kg.

Fig. 10

Temporal analysis of specific humidity using Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2) and MERRA-2 for all South Korea and four river basins. The blue line indicates ASOS-MERRA2 and the red line indicates MERRA-2

The monthly mean humidity from ASOS-MERRA2 was higher than that from MERRA-2 from January to June in most basins. However, both forcings had similar annual cycle patterns, with humid summers and relatively dry winters. For example, the mean humidity in August (15.0 g/kg) was six times that in January (2.5 g/kg). The diurnal humidity range (DHR) was calculated from the difference between the daily maximum and minimum hourly humidity. In all basins, the DHR from ASOS-MERRA2 was larger than that from MERRA-2 in every month. The difference in DHR appeared mainly from April to October, when the humidity increased, whereas the DHR of MERRA-2 remained nearly flat throughout the year. The ASOS-MERRA2 specific humidity, as a function of the ASOS-MERRA2 temperature, was affected by the larger diurnal range of the ASOS-MERRA2 temperature. As a result, the specific humidity from MERRA-2 and ASOS-MERRA2 was more biased than the pressure, and the overall diurnal features differed between the two forcings. The ASOS observations are site-specific, whereas MERRA-2 was downscaled from spatially wide regions. Therefore, ASOS-MERRA2 (containing ASOS results) could provide more sensitive and detailed DHR results than MERRA-2. The existing bias between MERRA-2 and ASOS-MERRA2 demonstrates the importance of site-specific forcing development. The spatial analysis of humidity between ASOS-MERRA2 and MERRA-2 is described in the Appendix Fig. 12.

5.5 Wind speed

The wind speed was calculated from the u- and v-direction components, and the resulting wind speed magnitude was compared at monthly and daily time steps. The wind speeds from MERRA-2 and ASOS-MERRA2 were compared in terms of the monthly mean using a scatter plot (Fig. 11). The overall wind speed from MERRA-2 (5.67 m/s) was faster than that from ASOS-MERRA2 (3.21 m/s). The two forcings did not agree well, and the discrepancy was larger than for the other forcings (pressure and specific humidity). The annual cycle of the monthly mean wind speed did not vary noticeably throughout the year. Winter (December and January) wind speeds were slightly faster than in other months. The diurnal wind speed range (DWR) was calculated from the difference between the daily maximum and minimum wind speeds from the hourly data. Unlike the diurnal range of temperature and humidity, the DWR of MERRA-2 was larger than that of ASOS-MERRA2. The DWR was relatively low during the warmer months (from June to August). The spatial analysis of wind speed between ASOS-MERRA2 and MERRA-2 is described in the Appendix Fig. 12.
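The conversion from wind components to wind speed can be sketched as follows. The sketch also contrasts two ways of averaging, the mean of hourly speeds and the speed of the mean wind vector, because they can differ markedly; the source does not state which order of operations was used, so both are shown as assumptions, and the function names are hypothetical.

```python
import math


def wind_speed(u, v):
    """Wind speed magnitude (m/s) from the u and v components (m/s)."""
    return math.hypot(u, v)


def mean_of_speeds(uv_pairs):
    """Average of the individual speed magnitudes."""
    speeds = [wind_speed(u, v) for u, v in uv_pairs]
    return sum(speeds) / len(speeds)


def speed_of_mean_vector(uv_pairs):
    """Magnitude of the averaged wind vector (opposing winds cancel out)."""
    n = len(uv_pairs)
    mean_u = sum(u for u, _ in uv_pairs) / n
    mean_v = sum(v for _, v in uv_pairs) / n
    return wind_speed(mean_u, mean_v)


# Two hours with equal speed but opposite direction
pairs = [(3.0, 4.0), (-3.0, -4.0)]
scalar_mean = mean_of_speeds(pairs)
vector_mean = speed_of_mean_vector(pairs)
```

For the two opposing winds in this example, the mean of speeds stays at the hourly speed while the speed of the mean vector drops to zero, which illustrates how sensitive an aggregated wind speed can be to the way the components are combined.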

Fig. 11

Temporal analysis of wind speed using Automated Synoptic Observing System-Modern-Era Retrospective Analysis for Research and Application version 2 (ASOS-MERRA2) and MERRA-2 for all South Korea and four river basins. The blue line indicates ASOS-MERRA2 and the red line indicates MERRA-2

6 Conclusion

This study describes long-term (41 years) and high-resolution (hourly and 0.125°) meteorological datasets that can be used to drive hydrologic models of land surface water and energy budgets for South Korea. As the necessity for accurate estimates of the spatial and temporal variation in surface hydrology and energy balance is evident, this regional scale dataset was developed for South Korea, including, for the first time, recent observations with quantitative validation. This study is intended to provide a benchmark weather forcing dataset that combines state-of-the-art reanalysis products (MERRA-2) with the most recent meteorological observation-based datasets (ASOS). This hybrid method has potential applications in record-lacking regions, such as South East Asia and Africa, where weather observations are available but for insufficient periods. Below are point-by-point findings from our analysis:

  1. Validations against measured observations show that the region-based and high spatial resolution data of ASOS-MERRA2 are superior (i.e., improved errors and correlations for all five variables) to the existing MERRA-2 data.

  2. ASOS-MERRA2 precipitation successfully captures the recent drought in the Geum River Basin (Ahn et al. 2016; Jung et al. 2020; Ryu et al. 2004) and the precipitation effect due to annual typhoons in the headwaters of the Han River and Seomjin-Yeongsan River basins (Bae et al. 2008; Kim et al. 2015).

  3. The diurnal temperature range in ASOS-MERRA2 was more sensitive than that of MERRA-2. This is because ASOS-MERRA2 avoids the statistical smoothing of MERRA-2 by using locally available meteorological observations.

  4. The pressure from ASOS-MERRA2 was slightly higher (2.24%) than that from MERRA-2, whereas the specific humidity showed good agreement between the two forcing datasets within the same range of results.

  5. Among the five forcing components, wind speed had the most significant bias, possibly because of the complexity and sensitivity of data collection from the ASOS network.

Another advantage of ASOS-MERRA2 is that it can be extended into the future from its current period. The final product of the ASOS-MERRA2 forcings was a ground observation-based gridded dataset that can provide more detailed spatial patterns and diurnal ranges of results. This new dataset significantly improves the original reanalysis product (MERRA-2) and can be used for applications and analytical studies in the hydrological and climatological sciences in South Korea.

The hydrologic model outputs (i.e., soil moisture and evapotranspiration) driven by the improved meteorological forcing of ASOS-MERRA2 can be used as agricultural drought monitoring tools. The standardized soil moisture index and standardized precipitation evapotranspiration index approaches can be applied using the updated meteorological inputs and hydrologic variables from the LSM simulated by ASOS-MERRA2. In addition to enhanced agricultural drought monitoring, the hydrologic model simulation is expected to provide valuable information about heat waves (i.e., temperature and humidity), which directly affect mortality (Kim et al. 2014; Lee et al. 2016; Yeh et al. 2018). Thus, the LSM simulated by the newly developed ASOS-MERRA2 is expected to produce more reliable long-term and high spatial–temporal resolution land surface hydrological outputs in South Korea.