1 Introduction

Continental river discharge is an essential component of the global freshwater budget (e.g., Trenberth et al. 2007). A compiled global river discharge dataset is therefore needed to support the development of ocean circulation models. Dai et al. (2009) constructed a dataset of historical river discharge based on observations at the farthest downstream stations of large continental rivers. Gaps in the data were filled using land surface models forced with observed atmospheric data. This river discharge dataset was included in the boundary condition data (Large and Yeager 2009) used to drive ocean models under the protocols of the CLIVAR Coordinated Ocean sea-ice Reference Experiments (COREs; Griffies et al. 2012) and the CMIP6 Ocean Model Intercomparison Project (OMIP; Griffies et al. 2016). However, updates to this dataset stopped in 2007. An updated or new historical dataset is therefore needed to support simulations of recent climate events. Furthermore, eddying and coastal ocean modeling requires higher spatial and temporal resolution than the current dataset offers.

The JRA-55-based surface dataset is suitable for driving ocean–sea ice models (Tsujino et al. 2017). It is based on a recent long-term reanalysis that uses a high-resolution (~ 55 km) atmospheric model (Kobayashi et al. 2015), covers the period from 1958 onward, and is updated in near real time. The spatial resolution and the time intervals are sufficient for eddying and coastal ocean modeling, but the dataset by itself does not provide river discharge because it does not include a river routing model. The aim of this study was to construct a dataset of historical river discharge using a global river routing model driven by the input runoff from the land surface model of JRA-55. This method is potentially useful for estimating river discharge. In a previous study (Dai and Trenberth 2002), similar estimates were made using precipitation minus evaporation based on atmospheric reanalysis data, and they agreed with the observed river discharge on annual and monthly time scales. The input runoff used in this study also reflects soil moisture and snow accumulation.

The original reanalysis output of JRA-55, including the input runoff, contains some climatological and time-dependent biases, such as those caused by updates of the assimilation techniques and the satellite observation systems (Kobayashi et al. 2015). Therefore, adjustments to remove these biases were required before running the river routing model to construct the new river discharge dataset. These adjustments followed two principles. First, the annual mean climatology of the new dataset is kept consistent with that of the previous widely used dataset (Dai et al. 2009) so that no climatological ocean features are lost. Second, variations on interannual time scales, such as ENSO, and on shorter time scales are left unmodified so that they remain coherent with the atmospheric events in JRA-55, such as the passage of low-pressure systems. These river discharge data are designed for use with the atmospheric data of the modified JRA-55 by Tsujino et al. (2017), so that a user can apply the dataset as a boundary condition for simulations of recent climate events in eddying regimes.

2 Model and methods

The global river routing model CaMa-Flood (Yamazaki et al. 2011) was used to calculate river discharge. The spatial and temporal resolutions were chosen to be sufficient for eddy-permitting global ocean models: the spatial resolution of the river routing model is 0.25° in longitude and latitude, and the temporal resolution of the output is 1 day. The flow paths of the major continental rivers are well resolved in CaMa-Flood (Fig. 1) and the drainage divides are realistically represented. The input runoff from the JRA-55 land surface model is routed to the oceans along the river network map, which is prescribed to fit the land surface grids of JRA-55. The subgrid-scale river parameters were calculated on the basis of a 1-km resolution digital elevation model. These parameters were calibrated to represent the seasonal cycle of the reference data for the major continental rivers.

Fig. 1 Drainage basins of major continental rivers resolved by CaMa-Flood (upper), basin groups (middle), and river width (lower)

Some biases were found in the input runoff of the original JRA-55 data. For example, the sum of the input runoff over the drainage basin of the Amazon River exhibits large step-like variations (Fig. 2, upper panel). Under the assumption that the input runoff from the land runs instantaneously into the oceans, the variation in the total input runoff for each drainage basin is consistent with the river discharge at each river mouth. This assumption is plausible on time scales of a few years. However, these step-like variations are visible neither in the observation-based river discharge at the river mouth (Fig. 2, upper panel) nor in the precipitation from the satellite-based Global Precipitation Climatology Project (GPCP; Adler et al. 2003; Fig. 2, lower panel). The variations are likely caused by biases in the precipitation of JRA-55 (Fig. 2, lower panel), as suggested by Harada et al. (2016). These time-dependent step-like biases are due to the updates of the assimilation techniques mentioned above (Kobayashi et al. 2015).
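The instantaneous-routing assumption can be illustrated with a minimal sketch: the basin-total input runoff is the area-weighted sum of the gridded runoff over the cells of a drainage basin, expressed in Sv (1 Sv = 10^6 m^3 s^-1). The function name, the flat-list grid layout, and the sample values below are hypothetical and are not taken from JRA-55 or CaMa-Flood.

```python
SECONDS_PER_DAY = 86400.0
M3_PER_S_PER_SV = 1.0e6  # 1 Sv = 10^6 m^3 s^-1


def basin_total_runoff_sv(runoff_mm_per_day, cell_area_m2, in_basin):
    """Area-weighted sum of gridded runoff over one drainage basin, in Sv.

    runoff_mm_per_day : runoff depth in each grid cell (mm/day)
    cell_area_m2      : area of each grid cell (m^2)
    in_basin          : True for cells that drain to this river mouth
    """
    total_m3_per_s = 0.0
    for runoff, area, inside in zip(runoff_mm_per_day, cell_area_m2, in_basin):
        if inside:
            # mm/day -> m/s, then multiply by the cell area to get m^3/s
            total_m3_per_s += runoff * 1.0e-3 / SECONDS_PER_DAY * area
    return total_m3_per_s / M3_PER_S_PER_SV


# Hypothetical three-cell grid: only the first two cells belong to the basin.
example = basin_total_runoff_sv(
    runoff_mm_per_day=[8.64, 8.64, 100.0],
    cell_area_m2=[1.0e9, 1.0e9, 1.0e9],
    in_basin=[True, True, False],
)
```

In this synthetic case each basin cell contributes about 100 m^3 s^-1, so the basin total is about 2 × 10^-4 Sv. Under the assumption stated in the text, a time series of this quantity can be compared directly with the discharge at the river mouth on time scales of a few years.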

Fig. 2 Adjustments for the Amazon River. Upper panel: input runoff to the river from the land surface component of JRA-55 (black); low-pass filtered by 5-year Lanczos window (orange); river runoff to the ocean by Dai et al. (2009) (red); regressed river runoff to the ocean based on comparison between Dai et al. (2009) and GPCP (green); low-pass filtered by 5-year Lanczos window (blue). Middle panel: river runoff to the ocean calculated by CaMa-Flood with the adjusted input (blue); river runoff to the ocean by Dai et al. (2009) (red); correction (multiplicative) factor (0.2 < f < 5.0) applied to the JRA-55 river runoff from land used as an input (orange) (right ordinate); recent observed river runoff to the ocean by Dai (2016) (dotted red line). Lower panel: precipitation for basin group (black, JRA-55; green, GPCP) (color figure online)

To facilitate future updates to the dataset, a simple adjustment procedure is required. Therefore, multiplicative factors were introduced to remove the biases. The factors were estimated so that the input runoff fits the reference data for river discharge into the oceans by Dai et al. (2009) on time scales longer than 5 years, under the aforementioned assumption that the input runoff is consistent with the river discharge at each river mouth. However, the period covered by the reference dataset was not sufficient to remove the step-like biases, which are also observed around 2008. Therefore, the reference data were extended from 2007 to 2015 by linear regression (over the common data period 1979–2007) on the annual total precipitation of GPCP in the drainage basins, which is well correlated with the river discharge into the oceans for major rivers (Fig. 2, upper panel). The multiplicative factor was estimated as the ratio of the river discharge by Dai et al. (2009) to the total runoff of JRA-55 in each drainage basin, after applying a low-pass filter (5-year Lanczos window) to both. The factors were extrapolated backward from 1963 to 1958. After 2010 they were held fixed, because no distinctive time-dependent biases are seen in JRA-55 after that year and the assimilation method of JRA-55 has remained unchanged; the climatological bias should therefore be constant after 2010. Furthermore, the multiplicative factor was limited to the range 0.2 to 5 to avoid an excessive correction (Fig. 2, middle panel).
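The factor estimation described above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: the Lanczos half-width (5 years), the cutoff frequency (1/5 cycles per year), the renormalization of the weights near the ends of the record, and all function names and sample values are assumptions. The regression-based extension of the reference data is not shown.

```python
import math


def lanczos_lowpass_weights(half_width, cutoff):
    """Normalized Lanczos low-pass weights for k = -half_width..half_width.

    cutoff is in cycles per time step (1/5 for a 5-year cutoff on yearly data).
    """
    weights = []
    for k in range(-half_width, half_width + 1):
        if k == 0:
            w = 2.0 * cutoff
        else:
            x = math.pi * k / half_width
            sigma = math.sin(x) / x  # Lanczos sigma factor
            w = math.sin(2.0 * math.pi * cutoff * k) / (math.pi * k) * sigma
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def lowpass(series, weights):
    """Apply the filter; near the ends, renormalize over available samples."""
    half = len(weights) // 2
    out = []
    for i in range(len(series)):
        acc = 0.0
        wsum = 0.0
        for j, w in enumerate(weights):
            idx = i + j - half
            if 0 <= idx < len(series):
                acc += w * series[idx]
                wsum += w
        out.append(acc / wsum)
    return out


def multiplicative_factors(reference, runoff, weights, lower=0.2, upper=5.0):
    """Ratio of low-passed reference discharge to low-passed input runoff,
    limited to the range [lower, upper]."""
    ref_lp = lowpass(reference, weights)
    run_lp = lowpass(runoff, weights)
    return [min(upper, max(lower, r / q)) for r, q in zip(ref_lp, run_lp)]


# Synthetic yearly series (Sv): constant reference, step-like bias in runoff.
weights = lanczos_lowpass_weights(half_width=5, cutoff=1.0 / 5.0)
reference = [0.2] * 20
runoff = [0.1] * 10 + [0.4] * 10
factors = multiplicative_factors(reference, runoff, weights)
```

In this synthetic example the factor is close to 2 before the step and close to 0.5 after it, so multiplying the input runoff by the factor removes the step while leaving year-to-year variations in the runoff untouched.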

These procedures were applied to 38 major continental rivers with a large river discharge and seven rivers with a large drainage basin area (Table 1) because these rivers are well resolved in the model. However, some pairs were treated as one river, e.g., the Brahmaputra and Ganges, Parana and Uruguay, and Sacramento and San Joaquin rivers, because the river paths or mouths cross (Table 1). The 45 major continental rivers contribute 57% of the total river discharge into the oceans, excluding Antarctica. They account for 74% of the river discharge into the Atlantic Ocean, 36% into the Pacific Ocean, 46% into the Indian Ocean, and 58% into the Arctic Ocean (Table 2). The total drainage basin area of the 45 rivers is 47% of the total area draining into the oceans. This suggests that the combined contribution of the smaller rivers is not negligible when examining the global freshwater budget. However, the river discharge due to the remaining, smaller rivers is not well resolved in the model. Therefore, small rivers were grouped into 12 aggregated basins: the western and eastern boundaries of the Atlantic, Pacific, and Indian oceans, the Arctic Ocean, Hudson Bay, and the Mediterranean, Black, Baltic, and Red seas. The multiplicative factors were estimated from the sum of the river runoff for each basin group, excluding the major continental rivers, and were applied to the small rivers using the same methodology as for the major continental rivers.

Table 1 Comparison of major continental rivers analyzed in this study from CORE (Dai et al. 2009) and JRA55-do (CaMa-Flood) for the period 1963–2007. Correlation coefficients are estimated over 1963–2007
Table 2 Comparison of river discharge for CORE (Dai et al. 2009) and JRA55-do (CaMa-Flood) from 1963 to 2007 for major ocean basins

No other time-dependent adjustments were applied directly to the input runoff, in order to preserve the variations corresponding to the daily to interannual events represented in JRA-55. Instead, an ad hoc calibration of the bathymetric parameters of CaMa-Flood was introduced to reproduce the climatological seasonal cycle of Dai et al. (2009) for the major continental rivers. To calibrate the parameters, sensitivity tests were performed by perturbing the river depth and width simultaneously from − 60% to + 100%. For the Amazon River, the width and depth were changed by 20%, which gave the highest correlation with the reference data. If no significant correlation was obtained, the parameters giving the lower root mean square deviation were chosen.
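The selection rule used in the sensitivity tests can be sketched as follows. This is an illustration under stated assumptions: the threshold `min_corr` is a hypothetical stand-in for the significance test, the candidate seasonal cycles are synthetic, and the perturbation labels do not correspond to any river in Table 1.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmsd(x, y):
    """Root mean square deviation between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def select_perturbation(candidates, reference, min_corr):
    """Pick the perturbation with the highest correlation with the reference;
    if that correlation is below min_corr, fall back to the lowest RMSD.

    candidates : dict mapping a perturbation (fraction applied to river width
                 and depth) to the simulated monthly climatology
    """
    scores = {p: (pearson(sim, reference), rmsd(sim, reference))
              for p, sim in candidates.items()}
    best_by_corr = max(scores, key=lambda p: scores[p][0])
    if scores[best_by_corr][0] >= min_corr:
        return best_by_corr
    return min(scores, key=lambda p: scores[p][1])


# Synthetic monthly climatologies (arbitrary units).
reference = [2.0 + math.sin(2.0 * math.pi * m / 12.0) for m in range(12)]
candidates = {
    -0.5: [v + 10.0 for v in reference],   # correct phase, large offset
    0.0: reference[-1:] + reference[:-1],  # one-month delay
    0.5: reference[-3:] + reference[:-3],  # three-month delay
}
chosen = select_perturbation(candidates, reference, min_corr=0.7)
fallback = select_perturbation(candidates, reference, min_corr=1.5)
```

Correlation is insensitive to a constant offset, so the first rule favors the candidate with the correct phase; the RMSD fallback favors the candidate that is closest in magnitude.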

In addition, the liquid freshwater flux from Greenland was adjusted using the climatology of Bamber et al. (2012), because the value in the reference data of Dai et al. (2009) (0.002 Sv) is an underestimate compared with the latest reported observations (0.028 Sv).

To date, CaMa-Flood has been integrated from 1958 to 2016 with the adjusted input runoff of JRA-55.

3 Yearly river discharge

Over the period 1963–2007, the total river discharge into the ocean calculated by CaMa-Flood is approximately 1.14 Sv (Table 2). The climatological mean river discharges of the major continental rivers agree with the reference data by Dai et al. (2009), as shown in Table 1. The middle panel of Fig. 2 and the supplemental figures show the time series of yearly continental river discharge by Dai et al. (2009; red line) and the results of the CaMa-Flood simulation in this study (blue line). The orange line indicates the multiplicative factors applied to the input river runoff of JRA-55. The step-like biases seen in the original input runoff of JRA-55 are removed from the simulated river discharge into the oceans. Furthermore, the variation in the simulated annual mean river discharge corresponds to the reference data on time scales longer than 5 years, because the input runoff is adjusted to the reference data by the multiplicative factor. However, the adjustment did not work as intended before 1963 because the factors there rely on backward extrapolation. Therefore, a seasonal climatology was used for the period 1958–1962.

Although the adjustment is only effective on time scales longer than 5 years, the variation in annual mean river discharge on shorter time scales simulated by CaMa-Flood also agrees with Dai et al. (2009). For example, interannual variations such as the ENSO cycle are well represented. The river discharge of the Amazon River decreases following El Niño years (1982–1983, 1992–1993, 1997–1998, and 2009–2010), as suggested by previous studies (e.g., Dai et al. 1997; Amarasekera et al. 1997).

For most of the major continental rivers, the correlation coefficients between the yearly river discharge simulated by CaMa-Flood and the observation-based data by Dai et al. (2009) are above 0.4, which exceeds the 99% significance level based on a two-tailed t test for the 45-year analysis period (1963–2007). However, some of the West African rivers (Niger, Ogooué, and Sanaga) show relatively low correlations, below 0.4. This could be due to large biases in the JRA-55 precipitation over the drainage basins (lower panels of supplemental figures). Low correlation coefficients are also seen for rivers with a small drainage basin area, such as the Sepik River in New Guinea, and rivers with a small discharge, such as the Murray and Rio Grande rivers. These rivers are significantly affected by the model resolution and input runoff biases, so their yearly fluctuations follow the variations of the JRA-55 precipitation in the drainage basins.
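The significance threshold quoted above can be checked with a short calculation. The sketch below assumes that the 45 yearly values are independent (no correction for autocorrelation) and uses a tabulated two-tailed 1% critical t value for 43 degrees of freedom, rounded to 2.695; both are assumptions of this illustration rather than details given in the text.

```python
import math


def t_statistic(r, n):
    """t statistic for testing a Pearson correlation r from n samples."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def critical_r(t_crit, n):
    """Smallest |r| that is significant for a given critical t value."""
    return t_crit / math.sqrt(t_crit ** 2 + (n - 2))


N_YEARS = 45       # 1963-2007
T_CRIT_99 = 2.695  # assumed tabulated value: two-tailed 1%, 43 degrees of freedom

t_at_04 = t_statistic(0.4, N_YEARS)
r_crit = critical_r(T_CRIT_99, N_YEARS)
```

With these assumptions the critical correlation is about 0.38, so a coefficient of 0.4 passes the test, consistent with the statement in the text.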

Figure 3 shows the yearly basin-integrated river discharge into the Atlantic, Pacific, Indian, and Arctic oceans, the Mediterranean and Black seas, and the global ocean. The correlation coefficients for the individual ocean basins are above 0.7 (Table 2) for 1963–2007. The variation of the river discharge is well explained by the total precipitation for the basin group (Fig. 4). The recent precipitation from JRA-55 is consistent with GPCP after the end of the Dai et al. (2009) record, suggesting that the recent yearly river discharge is realistically represented in this study. In addition, for some of the major continental rivers, the observed river discharge was updated to 2015 by Dai (2016). These recent data are not included in the reference data used to adjust the input runoff. Nevertheless, the yearly variations of the river discharge after 2007 agree well with these observations (Fig. 2, middle panel, blue line versus dotted red line), suggesting that the adjustment procedure is valid for future updates.

Fig. 3 River discharge into Atlantic, Pacific, Indian, and Arctic Oceans, Mediterranean and Black seas, and global oceans. Blue lines indicate river runoff calculated by CaMa-Flood with an adjusted input. Red lines indicate river runoff from Dai et al. (2009). Dotted blue lines are calculated from climatology data for 1958–1962 (color figure online)

Fig. 4 Total precipitation for divided basins for Atlantic, Pacific, Indian, and Arctic Oceans, Mediterranean and Black seas, and global oceans. Black lines are calculated from the original JRA-55. Red lines are estimated from Dai et al. (2009). Green lines are estimated from GPCP (color figure online)

4 Seasonal river discharge

The seasonal cycle of river discharge has a large amplitude for many rivers. Figure 5 shows the climatological seasonal cycle of river discharge by CaMa-Flood over the period 1963–2007. There is a time lag between the river discharge by CaMa-Flood and the input runoff because freshwater takes time to travel downstream to the ocean. The lag is partly controlled by the bathymetric conditions of the rivers, which can serve as a tuning parameter in the model. For the Amazon River, the highest discharge is seen from May to June, while the peak input runoff is in early spring. For the Brahmaputra/Ganges, Mississippi, Mekong, and Tocantins rivers, the peak of the simulated river discharge is delayed in comparison with the reference. These lags could be caused by the limitations of the parameter tuning in CaMa-Flood or by deficiencies in the time series of the input runoff. For the Congo and Parana/Uruguay rivers, the amplitude of the simulated seasonal river discharge is larger than that of the reference data. The difference may be caused by the larger seasonal cycle of the input runoff or by the lack of dam operations in the model. Obstacles such as dams and lakes help smooth out and regulate river flow, as suggested by Dai and Trenberth (2002). For the large Arctic rivers (Yenisey, Lena, and Ob), the rapid increase in river discharge in June, which results from the rapid increase of the input runoff due to snow melting in spring, is well simulated. However, an additional peak is seen from fall to early winter in our dataset. These peaks are related to the large input runoff from July to August, which is consistent with the peaks of precipitation (Fig. 5, black lines). The large input runoff in this period could be induced by an underestimation of the evaporation in the land component of JRA-55. A similar relationship between the river discharge and the input runoff is also seen for the entire Arctic region (Fig. 6).
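The time lag between the input runoff and the routed discharge can be quantified from the monthly climatologies, for example by finding the circular shift of the runoff that maximizes its correlation with the discharge. The sketch below is one possible diagnostic, not a procedure described in this study; the function names and the synthetic climatologies are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_lag_months(runoff_clim, discharge_clim):
    """Lag (in months, 0..11) by which the discharge follows the runoff,
    chosen to maximize the correlation of the circularly shifted runoff."""
    n = len(runoff_clim)
    best_lag, best_r = 0, -2.0
    for lag in range(n):
        # Delay the runoff climatology by `lag` months (circular shift).
        shifted = [runoff_clim[(m - lag) % n] for m in range(n)]
        r = pearson(shifted, discharge_clim)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag


# Synthetic example: runoff peaks in March (index 2); the discharge is the
# same cycle delayed by two months, so it peaks in May (index 4).
runoff_clim = [1.0 + math.cos(2.0 * math.pi * (m - 2) / 12.0) for m in range(12)]
discharge_clim = [runoff_clim[(m - 2) % 12] for m in range(12)]
lag = best_lag_months(runoff_clim, discharge_clim)
```

Applied to the simulated and reference discharge instead of the runoff, the same diagnostic would express the delayed peaks noted above as a number of months.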

Fig. 5 Mean annual cycle of river discharge by CaMa-Flood (blue lines) and adjusted input runoff estimated from JRA-55 (blue dotted lines) for the 12 largest major rivers over the period 1963–2007 (left ordinate). Red lines indicate river discharge from Dai et al. (2009). Black lines indicate total precipitation of JRA-55 for drainage basins and green lines indicate GPCP (right ordinate). Note that two annual cycles are shown in each panel (color figure online)

Fig. 6 Mean annual cycle of river discharge into Atlantic, Pacific, Indian, and Arctic Oceans, Mediterranean and Black Seas, and global oceans over the period 1963–2007 (left ordinate). Blue lines indicate river runoff calculated by CaMa-Flood with adjusted input runoff, which is indicated by blue dotted lines. Red lines indicate river discharge from Dai et al. (2009). Black lines indicate total precipitation of JRA-55 for drainage basins and green lines indicate GPCP (right ordinate). Note that two annual cycles are shown in each panel (color figure online)

Some small rivers also show a low correlation coefficient (Table 2). However, the phase of the seasonal cycle of the basin-integrated river discharge (Atlantic, Pacific, Indian, and Arctic oceans, Mediterranean and Black seas, and global oceans) is consistent with the reference data (Fig. 6). For the Mediterranean and Black seas, although the basin-integrated precipitation of JRA-55 is consistent with that of GPCP (Fig. 6b), the amplitude of the seasonal river discharge is larger than that of the reference data. This may be due to a relatively large amplitude of the input runoff caused by an underestimation of the evaporation. Evaporation is positively correlated with precipitation and reduces the input runoff in the drainage basin, so underestimating it would leave the seasonal cycle of the input runoff too large.

5 Summary

In this study, we used the global river routing scheme CaMa-Flood to represent the yearly and seasonal continental river discharge and compared the results with a reference dataset. The seasonal variations of the river discharge into the Arctic Ocean may differ significantly from those of Dai et al. (2009). However, these changes in the river input dataset do not have a critical impact on the seasonal variations of sea ice obtained from Arctic ocean–sea ice model simulations using the different river datasets (Eiji Watanabe, personal communication, 2017).

The most important advantages of this dataset are that it can be updated in near real time and that it has high temporal and horizontal resolutions. The latter is particularly relevant for short rivers, where the time lag between the input runoff and the river discharge into the ocean is negligible. For example, the response of river discharge to the passage of an atmospheric low is resolved and is concurrent with the associated precipitation. This suggests that the methodology is suitable for simulations using high-resolution models. The dataset is released as a subset of JRA55-do (Tsujino et al. 2017) and will be updated in near real time.