1 Introduction

In October 2019, Typhoon Hagibis hit eastern Japan, including the metropolitan area around Tokyo, bringing extremely heavy precipitation that led to large-scale flooding, affecting more than 390,000 people and causing around 100 deaths (EM-DAT: The International Disaster Database, https://www.emdat.be/, accessed on Nov. 6, 2020). Typhoon Hagibis was listed as one of the top 10 international weather events of 2019 (LeComte 2020) and is estimated to have been the second costliest western Pacific typhoon on record, with $15 billion in economic damage.

Weather station data (Automated Meteorological Data Acquisition System, AMeDAS; see the description in Section 2) show that precipitation on October 12, 2019, reached over 240 mm in the region around Tokyo (Fig. 1), the highest on record (1976–2020), statistically corresponding to roughly a 1-in-286-year event. The rainfall associated with Typhoon Hagibis was reported to be the highest daily amount in the 60-year period for which reliable data exist (Prakoso 2019; Chie et al. 2019).

Fig. 1 Rainfall (mm/day) observed at the AMeDAS weather stations (244 stations used) on 12 October 2019

Previous studies (Takemi and Unuma 2020) have assessed the immediate meteorological factors causing the heavy precipitation during the passage of Typhoon Hagibis and found that a nearly moist-adiabatic lapse rate, moist absolute instability, abundant moisture content, and high relative humidity throughout the troposphere jointly contributed to generating the heavy precipitation. Kawase et al. (2021) took a step further and conducted an anatomy of the Typhoon (Hoerling et al. 2013) and found that the historical warming intensified the strength of Typhoon Hagibis as well as the extreme heavy precipitation associated with the passing of the typhoon and that the topography of the area also contributed to the enhancement in total precipitation.

However, this still leaves open the question of to what extent human-induced climate change has altered the likelihood of an extreme heavy precipitation event like the one associated with Hagibis from an impact perspective. In other words, can we quantify the role of anthropogenic climate change in a devastating rainfall event like the one associated with the passing of Typhoon Hagibis?

Research on hurricanes in the North Atlantic, using the methodology of probabilistic event attribution (Philip et al. 2020), found that the rainfall associated with tropical storm Imelda was made about twice as likely by climate change, whereas the likelihood of the rainfall associated with hurricane Harvey tripled (Van Oldenborgh et al. 2017). Harvey hit Houston, Texas, in 2017 and led to approximately $90 billion in damages from the rainfall alone, $67 billion of which were attributed to climate change by Frame et al. (2020). While other observed changes in tropical cyclone characteristics, such as frequency or intensity, are still difficult to attribute to human-induced climate change, the increase in associated rainfall is consistent across regions and methods and agrees with our physical understanding (Knutson et al. 2019, 2020). However, studies outside the North Atlantic are rare, even though the water masses accompanying tropical cyclones are responsible for a large proportion of their damages, and understanding how they change in a warmer world is thus crucial to building resilience (Bistricky et al. 2019).

This study takes an impact-oriented perspective, i.e., it focuses on the level of heavy precipitation such as that associated with Typhoon Hagibis, irrespective of the drivers of the heavy precipitation (whether a passing typhoon or other factors): for the same level of heavy precipitation, the impacts felt in the region would be just as disastrous regardless of what drove it.

To understand the role of anthropogenic climate change in the aspects of this extreme heavy precipitation event responsible for the damages, we first need to identify the best definition of the event, i.e., the one corresponding most closely to the impacts. There is always a trade-off between what happens on the ground and what can be simulated with confidence by state-of-the-art climate models (Leach et al. 2020).

Following Kawase et al. (2021), we focus on precipitation, total column precipitable water (pwat) as a measure of moisture content throughout the atmospheric column, and sea level pressure (SLP), with minimum SLP used to track the central position of the typhoon. We use pwat and SLP to describe the environmental conditions during the passing of Typhoon Hagibis (Fig. 2). The spatial extent of the event was identified as the region (34.5N-38N, 138E-141.5E) highlighted in Fig. 1 (panel a; the same region as used by Kawase et al. 2021). We use this spatial definition to identify the best temporal definition as well as the variable on which to focus our assessment. In the Supplementary Information (SI, Fig. S2) we show the sensitivity of the event definition to the choice of spatial coverage, finding that our results are not sensitive to the precise spatial extent analyzed; hence, for ease of comparison, we employ the same spatial definition as Kawase et al. (2021).

Fig. 2 The passage of Typhoon Hagibis depicted using three different metrics from JRA-55: precipitation (top row), precipitable water (middle row), and sea level pressure (bottom row); note that the colorbar for SLP saturates at 1100 and all values above are masked out. The region of interest for heavy precipitation impacts is shown as the red box in the top row

The passing of Typhoon Hagibis caused flash flooding mainly due to the single day of extreme heavy precipitation on Oct 12 (Fig. 1), although precipitation on Oct 11 was also very high (the second highest October value over the region of interest). As for precipitation, the 1-day maximum pwat in Oct 2019 ranked highest in the historical record, and both the 1-day maximum precipitation and the 1-day maximum pwat over the studied region occurred on the 12th. Because pwat behaves very similarly to precipitation itself (Fig. 2), and because precipitation is available from the station data record and commonly available from climate model simulations, we focus the analysis on precipitation. As is common practice in extreme event attribution, the event is selected based on the severity of its impacts (Philip et al. 2020 and references therein), i.e., from an impact perspective, but restricted to the time window in which the event occurred, in this case October, so that extreme heavy precipitation brought by other systems, e.g., the Baiu front during the early summer season (Ohba et al. 2015), is not picked up. Hence, the October maximum 1-day precipitation over the studied region (34.5N-38N, 138E-141.5E) is used as the event definition.

In the remainder of this study, we use weather station observations and a range of reanalysis products (Section 3), as well as a range of climate models (Section 5), to assess whether and to what extent anthropogenic climate change altered the likelihood of occurrence of the extreme precipitation event as defined above. The data and methods are described in Section 2, while Section 4 is dedicated to evaluating the models. We synthesize the results from the various data products in Section 6 and discuss in Section 7 how these can be translated into an attribution of the economic losses.

2 Data and methods

2.1 Observational data

2.1.1 AMeDAS station data

October precipitation data were extracted from the Automated Meteorological Data Acquisition System (AMeDAS), a collection of automatic weather stations run by the Japan Meteorological Agency (JMA) that observe precipitation (as well as wind direction and speed, temperature, and sunshine duration) and provide real-time monitoring of weather conditions with high temporal and spatial resolution, making it the best observational dataset for assessing long-term local rainfall in Japan. Daily precipitation in October over 1976–2020 is used in this analysis (1974 and 1975 were discarded due to the scarcity of station records). For the event definition, we use the October maximum of the 1-day precipitation averaged over all stations that fall within the studied region.
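As an illustration of this event definition, the sketch below computes the October maximum of the station-average daily precipitation from a hypothetical table of daily AMeDAS records; the file name and column layout are assumptions for illustration only and do not reflect the actual AMeDAS distribution format.

```python
import pandas as pd

# Hypothetical daily AMeDAS extract with columns: date, station_id, lat, lon, precip_mm.
# The file name and layout are assumptions for illustration only.
df = pd.read_csv("amedas_daily_precip.csv", parse_dates=["date"])

# Keep the stations inside the study region (34.5-38N, 138-141.5E).
region = df[(df.lat >= 34.5) & (df.lat <= 38.0) &
            (df.lon >= 138.0) & (df.lon <= 141.5)]

# Average all stations for each day, then take the October maximum per year.
daily_mean = region.groupby("date")["precip_mm"].mean()
october = daily_mean[daily_mean.index.month == 10]
oct_max_1day = october.groupby(october.index.year).max()  # event series, mm/day
```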

To account for observational uncertainty, we also use the following three gridded datasets for comparison.

2.1.2 JRA-55

The first gridded dataset is the Japanese 55-year Reanalysis (JRA-55; Japanese 55-year Reanalysis, Daily 3-Hourly and 6-Hourly Data) daily precipitation over 1958–present (Japan Meteorological Agency 2013, updated monthly; accessed through the Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory, https://doi.org/10.5065/D6HH6H41, last accessed on Nov 6, 2020). In this dataset, the maximum regional mean 1-day precipitation in Oct 2019, averaged over the studied region (34.5N-38N, 138E-141.5E), is 112.71 mm/day.

2.1.3 CPC

We also use the CPC Unified Precipitation Analysis (1979–present), a gridded dataset based on rain gauges. In this dataset, the maximum regional mean 1-day precipitation in Oct 2019, averaged over the studied region (34.5N-38N, 138E-141.5E), is 96.48 mm/day.

2.1.4 ERA5

Daily precipitation from the fifth generation of European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric reanalyses of the global climate, ERA5 (1979–present, from https://cds.climate.copernicus.eu/, Copernicus Climate Change Service), is also used for comparison. In ERA5, the maximum regional mean 1-day precipitation in Oct 2019, averaged over the studied region (34.5N-38N, 138E-141.5E), is 128.95 mm/day.

It is worth highlighting that the gridded datasets underestimate the rainfall compared with the station records. We nevertheless use these gridded products for comparison, in order to account for observational uncertainty, keeping the underestimation in mind.
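For the gridded products, the same event metric can be obtained by area selection rather than station averaging. The sketch below is a minimal illustration using xarray; the file name, the variable name "precip", and the coordinate names and orientation are assumptions and differ between JRA-55, CPC, and ERA5.

```python
import xarray as xr

# Hypothetical daily-precipitation file from one of the gridded products;
# variable and coordinate names are assumptions and differ between products.
ds = xr.open_dataset("gridded_daily_precip.nc")

# Select the study box (34.5-38N, 138-141.5E). Note: some products store
# latitude in descending order, in which case the slice must be reversed.
box = ds["precip"].sel(lat=slice(34.5, 38.0), lon=slice(138.0, 141.5))

# Regional mean, then the October maximum 1-day value for each year (mm/day).
regional_mean = box.mean(dim=("lat", "lon"))
october = regional_mean.where(regional_mean["time"].dt.month == 10, drop=True)
oct_max_1day = october.groupby("time.year").max()
```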

2.1.5 Global mean surface temperature (GMST)

To calculate observational trends, a standard approach is to fit the data to an extreme value distribution that varies with a covariate describing global warming, such as the global mean surface temperature (interested readers are referred to Philip et al. 2020 for further details). Following common practice, we use the (low-pass filtered) global mean surface temperature (GMST) taken from the National Aeronautics and Space Administration (NASA) Goddard Institute for Space Studies (GISS) surface temperature analysis (GISTEMP, Hansen et al. 2010).
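As a minimal sketch of this covariate preparation (the file name, column layout, and the 4-year window are illustrative assumptions; the study only states that a low-pass filtered GMST is used):

```python
import pandas as pd

# Hypothetical local copy of the GISTEMP annual global-mean anomaly series;
# the file name and column names are assumptions, not the GISTEMP format.
gistemp = pd.read_csv("gistemp_annual.csv", index_col="year")["anomaly"]

# Low-pass filter the GMST series; a running mean of a few years is one
# common choice in this kind of analysis (the exact filter may differ).
gmst_smoothed = gistemp.rolling(window=4, center=True, min_periods=1).mean()
```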

2.2 Model and experiment descriptions

We use ten different sets of climate model simulations with various model resolutions (high, medium, and low) to estimate changes in extreme heavy precipitation such as that associated with Typhoon Hagibis. We present the attribution results for each resolution category separately and combined, to investigate the effect of model resolution on the attribution results.

2.2.1 CMIP6 HighResMIP

Eight sets of coupled model simulations (1950–2050) from the High Resolution Model Intercomparison Project (HighResMIP) of the Coupled Model Intercomparison Project Phase 6 (CMIP6, Eyring et al. 2016) with daily precipitation output are used (those available from the Centre for Environmental Data Analysis archive at the time of the analysis). These simulations are restricted to 1950–2050 because of the computational resources required at these model resolutions (Haarsma et al. 2020). For the same reason, instead of a long spin-up to equilibrium, an alternative spin-up approach is taken in which a 50-year spin-up is run with constant 1950s forcing to reduce the large initial drift (Haarsma et al. 2020). The initial state for the historical simulation (1950–2014) is obtained from this spin-up simulation, and the historically evolving forcing is then imposed. For the future period (2015–2050), the forcing fields follow the CMIP6 Shared Socio-economic Pathway (SSP) 5-8.5 scenario, a forcing scenario as close as possible to the CMIP5 representative concentration pathway 8.5 (RCP8.5).

The eight HighResMIP models are (1) CMCC-CM2-VHR4 (1 simulation), run by the Fondazione Centro Euro-Mediterraneo sui Cambiamenti Climatici (CMCC); (2) CMCC-CM2-HR4 (1 simulation), also run by CMCC; (3) CNRM-CM6-1-HR (1 simulation), run by CNRM (Centre National de Recherches Meteorologiques) and CERFACS (Centre Europeen de Recherche et de Formation Avancee en Calcul Scientifique) (CNRM-CERFACS); (4) CNRM-CM6-1, run by CNRM-CERFACS; (5) EC-Earth3P-HR (3 simulations), run by the EC-Earth-Consortium; (6) EC-Earth3P (EC-Earth 3.2, 3 simulations), run by the EC-Earth-Consortium; (7) HadGEM3-GC31-HM (HadGEM3-GC3.1-N512ORCA025, 1 simulation), run by the Met Office Hadley Centre; and (8) HadGEM3-GC31-MM (HadGEM3-GC3.1-N216ORCA025, 1 simulation), also run by the Met Office Hadley Centre. Further details of each model's setup are given in Table S1.

2.2.2 d4PDF

Time-slice ensemble simulations from the Database for Policy Decision-Making for Future Climate Change (d4PDF; Mizuta et al. 2017), which uses global atmospheric models with a 60-km horizontal mesh, are used in our analysis; specifically, the historical climate simulations and the non-warming climate simulations over 1951–2010. We assess the likelihood of occurrence of heavy precipitation on a par with that brought by Typhoon Hagibis under the historical (transient forcing) versus non-warming (fixed forcing, as described in Mizuta et al. 2017) scenarios. One hundred ensemble members from each scenario are used in the analysis. These atmospheric models have been shown to satisfactorily simulate the natural variations of the historical climate and extreme events such as heavy precipitation and tropical cyclones (Ishii and Mori 2020).

2.2.3 CMIP5 EC-Earth2.3

Daily precipitation from a 16-member ensemble of EC-Earth2.3 (EC-Earth, Hazeleger et al. 2010) coupled GCM simulations (1860–2100, 150-km horizontal resolution) was also used. These simulations use historical greenhouse gas and aerosol forcings up to 2005 and RCP8.5 forcings from 2005 onwards.

These ten sets of model simulations are separated into high-, medium-, and low-resolution groups based on the horizontal spatial grids of the model outputs, as shown in Table 1.

Table 1 Horizontal spatial grids of the climate model outputs in high-, medium-, and low-resolution groups used in this study

2.3 Statistical methods

In this analysis, we analyze the time series of Oct maximum 1-day precipitation from the region identified above, in the observation-based datasets as well as the model simulations. The methods used for observational and model analysis and for model validation and synthesis follow the best-practice approach as detailed in Philip et al. (2020), which is briefly summarized in the following steps: (i) trend calculation from observations; (ii) model validation; (iii) multi-method multi-model attribution; and (iv) synthesis of the attribution statement.

We apply this method to determine how the likelihood of occurrence of the event as defined above has changed due to anthropogenic climate change. For the event under study, we use a generalized extreme value (GEV) distribution that scales with GMST (i.e., using GMST as covariate; Philip et al. 2020 and Van Oldenborgh et al. 2021). The resulting probability ratio (PR) is calculated by comparing the climate of 2019 with that of an earlier time with much less anthropogenic warming: PR is the probability of occurrence under the current climate (P1) divided by the probability of occurrence under the climate of the earlier time (P0). In the GEV analysis, confidence intervals are estimated with a non-parametric bootstrap, i.e., the GEV fit is repeated many times on samples of covariate-data pairs drawn from the original series with replacement. Besides the PR, we also report the fraction of attributable risk (FAR), i.e., the fraction of the current risk attributable to past greenhouse gas emissions (Hegerl et al. 2007), calculated as 1 − (P0/P1). In this study, 1951 is used to represent the earlier time with much less anthropogenic warming. It is used instead of 1900, the year commonly used to represent pre-industrial climate in event attribution studies (Philip et al. 2020), because the HighResMIP simulations only start in 1950 (with a GMST covariate starting from 1951), and to avoid excessive extrapolation, since the station data records used in this study only go back to 1974. In a final step, we synthesize the results from observations and from the models that pass the validation tests into a single attribution statement (Section 5 below).
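The sketch below illustrates one way to implement such a covariate-dependent GEV fit and the resulting PR and FAR, assuming (as is common for precipitation extremes, e.g., Philip et al. 2020) that the location and scale parameters scale exponentially with smoothed GMST; the parameterization, optimizer settings, and the omitted bootstrap are simplifications, not the study's operational code.

```python
import numpy as np
from scipy.stats import genextreme
from scipy.optimize import minimize

def neg_log_likelihood(params, x, gmst):
    """Negative log-likelihood of a GEV whose location and scale parameters
    scale exponentially with the smoothed-GMST covariate (assumed form)."""
    mu0, sigma0, xi, alpha = params
    if sigma0 <= 0:
        return np.inf
    scaling = np.exp(alpha * gmst / mu0)
    # scipy's genextreme shape c corresponds to -xi in the climate convention
    return -np.sum(genextreme.logpdf(x, c=-xi, loc=mu0 * scaling,
                                     scale=sigma0 * scaling))

def pr_and_far(x, gmst, gmst_2019, gmst_1951, event_value):
    """Fit the covariate-dependent GEV, then return (P1, P0, PR, FAR)."""
    first_guess = [np.mean(x), np.std(x), 0.1, 0.0]
    fit = minimize(neg_log_likelihood, first_guess, args=(x, gmst),
                   method="Nelder-Mead")
    mu0, sigma0, xi, alpha = fit.x

    def exceedance_prob(T):
        s = np.exp(alpha * T / mu0)
        return genextreme.sf(event_value, c=-xi, loc=mu0 * s, scale=sigma0 * s)

    p1, p0 = exceedance_prob(gmst_2019), exceedance_prob(gmst_1951)
    return p1, p0, p1 / p0, 1.0 - p0 / p1
```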

Given that the d4PDF non-warming simulations are time slices with fixed forcings rather than transient simulations, and that they have a large ensemble size (100 members per scenario), we instead calculate the probability ratio non-parametrically, by separately counting the frequency of exceedance of the event threshold (corresponding to the event return time in the d4PDF modelled world) in the historical-forcing and non-warming experiments.
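A minimal sketch of this non-parametric counting for the two 100-member d4PDF ensembles follows; which ensemble defines the threshold for the stated return period is an assumption here, not spelled out in the text.

```python
import numpy as np

def empirical_probability_ratio(hist, natural, return_period):
    """Count exceedances of the event threshold in the historical and
    non-warming ensembles; hist and natural hold one Oct maximum 1-day
    precipitation value per member-year."""
    # Threshold with the chosen return period in the historical ensemble
    # (assumed convention for "the d4PDF modelled world").
    threshold = np.quantile(hist, 1.0 - 1.0 / return_period)
    p1 = np.mean(hist >= threshold)     # historical (transient forcing)
    p0 = np.mean(natural >= threshold)  # non-warming (fixed forcing)
    return p1 / p0 if p0 > 0 else np.inf
```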

3 Observational analysis: return time and trend

A GEV fit that scales with smoothed GMST gives a significant trend in the Oct maximum 1-day precipitation of the AMeDAS station average over the studied region (Fig. 3). The fit gives a return period for the Oct 2019 heavy precipitation associated with the passing of Typhoon Hagibis of ~286 years (61–12,761 years) in the current 2019 climate, and a return period of ~2313 years (92–infinite) when extrapolated to the 1951 climate, resulting in a PR of 7.97, with a lower bound of 0.23 and an infinite upper bound. For the climate model analysis that follows, we use the return period of 286 years to define the event thresholds in each individual model.

Fig. 3 Fit of the Oct maximum 1-day precipitation averaged over the AMeDAS station data in the studied region. A GEV distribution that scales with the smoothed global mean surface temperature (GMST) is used, evaluated for the climates of 1951 and 2019. (a) Oct maximum 1-day precipitation against the change in GMST. Observations are shown as blue symbols and the magenta square shows the 2019 value, which was not included in the fit. The thick line denotes the time-varying location parameter and the thin lines are 1σ and 2σ above it; the two vertical red lines show the 95% confidence intervals for the climates of 1951 (extrapolated) and 2019. (b) Return time plots with fits for the 2019 climate (black lines) and the 1951 climate (blue lines, with 95% CI); the magenta line denotes the 2019 event. Observations are shown twice, once shifted to the climate of 1951 (blue symbols) and once shifted to the climate of 2019 (black symbols), using the fitted dependence on smoothed GMST

The same analysis is repeated with the gridded datasets, which give a PR of 1.83 (95% confidence interval, hereafter CI, of 0.17–unbounded) for JRA-55, 35.67 (CI, 0.07–unbounded) for ERA5, and 5.82 (CI, 0.20–21,306) for CPC. As mentioned in the previous section, the gridded datasets underestimate the rainfall compared with the station records, but we analyze them for comparison. The results for all observation-based datasets are consistent in indicating an increase in the likelihood of the event to occur (best estimate of PR greater than 1), with wide, overlapping uncertainty bounds. When all observation-based datasets are considered, the synthesized observational result shows an increase in the likelihood of the event to occur, with a best estimate PR of 7.42 (CI, 0.128 to 7.37 × 10^4), shown as the bright blue bar in Fig. 4. The best estimates and the bounds of the individual observational datasets are directly averaged to obtain the best estimate and extent of the bright blue bar.

Fig. 4 Synthesis of the probability ratios (PR) for Oct maximum 1-day precipitation (mm/day). Blue bars show the confidence intervals of the PR results from observations, red bars show the PR results from models, and the weighted average is shown in magenta. For models with more than one ensemble member entering the calculation of the best estimate and confidence interval, the number of members used is given after the model name; all other models use a single ensemble member

4 Model evaluation

Following the best-practice approach set out in Philip et al. (2020), we assess whether each model can be used with confidence in our attribution study. This requires that the GEV fit parameters of the model (trend, scale, and shape) agree with those of the observations (AMeDAS station data), which ensures that the model is reliable in its statistical description of the extremes; the natural variability itself is completely uncorrelated between the observations and the models. The results of this check (Table S2) show that most models pass, whereas CMCC-CM2-VHR4 and EC-Earth3P-HR fail the test and are therefore excluded from the rest of the attribution analysis.

5 Multi-method multi-model attribution

Using the models that passed the evaluation, we calculate PRs with the same approach as for the observations, but using each model's own smoothed GMST as the covariate instead.

We synthesize the results from observations and climate models that have passed the evaluation (Table S2) in order to assess the overall magnitude of change in likelihood of the event occurring and whether the observed change is attributable to anthropogenic climate change. The detailed steps to perform the synthesis from observations and climate models have been described by Philip et al. (2020) and again in Ciavarella et al. (2021). Here we summarize these steps to provide some clarity (interested readers are referred to the aforementioned studies for further details). The synthesis is presented in the form of a horizontal bar plot (as shown in Figure 4), in which the blue bars show the confidence intervals of the PR results from observations and the red bars show PR results from models.

First, for each individual observation (pale blue bars) and model (pale red bars), the best-fit values (i.e., best estimates) are shown as vertical black lines within the bars, and the confidence intervals, arising from the natural variability of each dataset, are calculated from the GEV fitting (as described in Section 2.3).

The observational results are synthesized first. For the observations, the total uncertainty for each dataset (\({\sigma }_{total-i}\)) comes from two sources: natural variability \({\sigma }_{i}\) (the extent of each pale blue bar in Fig. 4) and "representation uncertainty" \({\tau }_{obs}\) (i.e., how well the observational dataset represents the extreme under study). We assume that the uncertainty due to natural variability is correlated between the multiple observational estimates (Ciavarella et al. 2021), as the different observational datasets are different records of the same reality. The mean best estimate of all observational datasets (\({\mu }_{obs}\)) is the direct average (Equation 1) of the best estimates of the individual datasets (\({\mu }_{i}\)); similarly, the mean natural variability \({\sigma }_{obs}\) is the direct average (Equation 2) of the natural variabilities of the individual datasets (\({\sigma }_{i}\)). As mentioned previously, P0 and P1 each have confidence intervals from the GEV analysis, estimated using a non-parametric bootstrap (i.e., the GEV fit is repeated many times); the bootstrap is also used to calculate the uncertainty estimate (\({\sigma }_{i}\)) of the PR, using samples of P0 and P1 pairs drawn from the GEV analysis. We further assume that the representation uncertainty is the same for all observational datasets and equal to the spread of the observational best estimates, i.e., differences between the observational best estimates are attributed to representation uncertainty. The representation uncertainty (Equation 3) is therefore calculated from the deviations of the best estimate of each observational dataset (\({\mu }_{i}\)) from the mean best estimate of all observational datasets (\({\mu }_{obs}\)). As shown in Equation 4, the representation uncertainty is then added in quadrature to the natural variability of each observational dataset individually; the resulting increase in uncertainty is shown in the figure as a white extension to each pale blue bar, whose full extent represents the total uncertainty \({\sigma }_{total-i}\) of that dataset. The bright blue bar represents the combined observational result: a direct average of the best estimates and of the uncertainty bounds of all the observational datasets, respectively.

$${\mu }_{obs}= \sum\nolimits_{i}{\mu }_{i}/{N}_{obs}$$
(1)
$${\sigma }_{obs} = \sum\nolimits_{i}{\sigma }_{i}/{N}_{obs}$$
(2)
$${\tau }_{obs}^{2}= \sum\nolimits_{i}{\left({\mu }_{i}-{\mu }_{obs}\right)}^{2}/({N}_{obs}-1)$$
(3)
$${\sigma }_{total-i}^{2}= {\sigma }_{i}^{2}+ {\tau }_{obs}^{2}$$
(4)

in which \(i\) represents each observation and \({N}_{obs}\) the number of observations.
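A direct transcription of Equations 1–4 into code is given below; in practice the PR estimates may be combined on a transformed (e.g., logarithmic) scale, a detail not spelled out here, so this sketch only reproduces the arithmetic of the equations.

```python
import numpy as np

def synthesize_observations(mu_i, sigma_i):
    """Combine per-dataset best estimates (mu_i) and natural-variability
    uncertainties (sigma_i) following Eqs. 1-4."""
    mu_i, sigma_i = np.asarray(mu_i, float), np.asarray(sigma_i, float)
    n_obs = len(mu_i)
    mu_obs = mu_i.mean()                                           # Eq. 1
    sigma_obs = sigma_i.mean()                                     # Eq. 2
    tau_obs = np.sqrt(np.sum((mu_i - mu_obs) ** 2) / (n_obs - 1))  # Eq. 3
    sigma_total_i = np.sqrt(sigma_i ** 2 + tau_obs ** 2)           # Eq. 4
    return mu_obs, sigma_obs, tau_obs, sigma_total_i
```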

As mentioned previously, \({\sigma }_{i}\) is an estimate of the uncertainty in the PR obtained from the bootstrap confidence interval and therefore depends on the length of the data sample. It is thus not natural variability in the traditional sense but rather variability due to sampling uncertainty. In the observations, this variability is treated as arising from the natural variability of the climate system recorded in the data and is referred to as such; this is not strictly the case for the models, especially those with multiple ensemble members, but to keep the terminology consistent between observations and models, we also refer to this sigma as natural variability in the model section.

Second, the model results are synthesized. For the models, the total uncertainty for each model (\({\sigma }_{total-j}\)) comes from natural variability (\({\sigma }_{j}\)) and a common intermodel spread term (\({\tau }_{mod}\)). The intermodel spread is included in the weighting to account for the systematic uncertainty of using models, each with its own imperfect representation of the climate system. For the models, in contrast to the observations, the natural variability is assumed to be uncorrelated, as each model simulation represents an independent realization of the climate system. The combined model best estimate (\({\mu }_{mod}\), Equation 5) is the weighted mean of the individual model best estimates (\({\mu }_{j}\)), with weights equal to the inverse of the sum of the squared natural variability and the squared intermodel spread, to account for systematic model errors (Ciavarella et al. 2021). The common model spread term \({\tau }_{mod}\) is estimated numerically (Equation 6). Note that models with more ensemble members receive a larger weight because they provide more data for the GEV fit and therefore have a smaller natural variability uncertainty (as explained above). The same weighting is applied to combine the uncertainty due to natural variability (Equation 7), giving the combined model natural variability (\({\sigma }_{mod}\)). The intermodel spread term is then added in quadrature to the combined natural variability (Equation 8), forming the total uncertainty of the combined model result (\({\sigma }_{mod-total}\)), shown as the extent of the bright red bar in Fig. 4. We acknowledge that the full model uncertainty can be larger than the model spread, owing to model structural biases, missing physical processes, and processes insufficiently represented through parameterizations. As shown in Equation 9, the intermodel spread term is also added in quadrature to the natural variability of each model individually; the resulting increase in uncertainty is shown in the figure as a white extension to each pale red bar, whose full extent represents the total uncertainty \({\sigma }_{total-j}\) of that model.

$${\mu }_{mod}= \sum\nolimits_{j}\frac{{\mu }_{j}}{{\sigma }_{j}^{2}+ {\tau }_{mod}^{2}} / \sum_{j}\frac{1}{{\sigma }_{j}^{2}+{\tau }_{mod}^{2}}$$
(5)
$${\chi }^{2}\left({\mu }_{j}, {\sigma }_{j};{\tau }_{mod}\right)= \sum\nolimits_{j}\frac{{({\mu }_{j}-{\mu }_{mod})}^{2}}{{\sigma }_{j}^{2}+ {\tau }_{mod}^{2}}$$
(6)
$${\sigma }_{mod}= \sum\nolimits_{j}\frac{{\sigma }_{j}}{{\sigma }_{j}^{2}+ {\tau }_{mod}^{2}} / \sum\nolimits_{j}\frac{1}{{\sigma }_{j}^{2}+{\tau }_{mod}^{2}}$$
(7)
$${\sigma }_{mod-total}^{2}= {\sigma }_{mod}^{2}+ {\tau }_{mod}^{2}$$
(8)
$${\sigma }_{total-j}^{2}= {\sigma }_{j}^{2}+ {\tau }_{mod}^{2}$$
(9)

in which \(j\) represents each model, \({N}_{mod}\) the number of models, \({\tau }_{mod}\) the intermodel spread term, and \(dof\) the degrees of freedom, i.e., \({N}_{mod}-1\). \({\chi }^{2}/dof\) compares the model spread with the natural variability: if it exceeds 1, the spread of the models is greater than expected from natural variability alone, and an intermodel spread term is added in quadrature to the natural variability such that \({\chi }^{2}/dof=1\). The value of \({\tau }_{mod}\) is determined numerically. It is initialized at zero and only needs to be determined if \({\chi }^{2}/dof>1\) for \({\tau }_{mod}=0\), i.e., if \(\frac{{\chi }^{2}\left({\mu }_{j}, {\sigma }_{j};0\right)}{dof}>1\). Two trial values of \({\tau }_{mod}\) are first found, one giving \({\chi }^{2}/dof>1\) and the other \({\chi }^{2}/dof<1\), so that the corresponding \({\chi }^{2}/dof\) values straddle unity. Brent's method (Brent 1973) is then used to find the root of \({\chi }^{2}/dof-1\) in \({\tau }_{mod}\) between these two values, with \({\mu }_{mod}\), \({\sigma }_{mod}\), and \({\chi }^{2}\) recalculated for each trial value of \({\tau }_{mod}\) during the iteration. If \(\frac{{\chi }^{2}\left({\mu }_{j}, {\sigma }_{j};0\right)}{dof}<1\), the spread in \({\mu }_{j}\) can be explained by natural variability alone and the model spread term is neglected, i.e., \({\tau }_{mod}=0\).
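As a numerical sketch of Equations 5–9 and the root finding described above (again operating directly on the best estimates and uncertainties; any transformation of the PR before combination is left aside as an assumption):

```python
import numpy as np
from scipy.optimize import brentq

def synthesize_models(mu, sigma):
    """Combine per-model best estimates mu_j and uncertainties sigma_j
    (Eqs. 5-8): find the common spread term tau such that the reduced
    chi-square equals one, then form the weighted mean and spread.
    Assumes at least two models."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    dof = len(mu) - 1

    def weighted_stats(tau):
        w = 1.0 / (sigma**2 + tau**2)
        mu_mod = np.sum(w * mu) / np.sum(w)         # Eq. 5
        sigma_mod = np.sum(w * sigma) / np.sum(w)   # Eq. 7
        return mu_mod, sigma_mod, w

    def reduced_chi2(tau):
        mu_mod, _, w = weighted_stats(tau)
        return np.sum(w * (mu - mu_mod)**2) / dof   # Eq. 6, divided by dof

    tau = 0.0
    if reduced_chi2(0.0) > 1.0:
        # bracket the root: chi2/dof decreases as tau grows
        upper = np.std(mu)
        while reduced_chi2(upper) > 1.0:
            upper *= 2.0
        tau = brentq(lambda t: reduced_chi2(t) - 1.0, 0.0, upper)

    mu_mod, sigma_mod, _ = weighted_stats(tau)
    sigma_mod_total = np.sqrt(sigma_mod**2 + tau**2)  # Eq. 8
    sigma_total_j = np.sqrt(sigma**2 + tau**2)        # Eq. 9, per model
    return mu_mod, sigma_mod, tau, sigma_mod_total, sigma_total_j
```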

Finally, the observations and models are combined into a single result. If the observational and model results are clearly incompatible (as determined by Equation 10), we conclude that the model biases are too large for them to be combined. Otherwise they are combined in two ways (in this study they are compatible). Following the approach of previous studies (Philip et al. 2020; Ciavarella et al. 2021), we first neglect model uncertainty beyond the common model spread (i.e., uncertainties due to model structural biases, missing physical processes, and processes insufficiently represented through parameterizations). In this case, the synthesis of observations and models is the weighted average of the combined observational result (bright blue bar) and the combined model result (bright red bar), with weights equal to the inverse square of the total uncertainty ranges (Equations 11 and 12 for observations and models, respectively), for both the best estimate (\({\mu }_{synthesis-weighted}\), Equation 13) and the uncertainty bounds (Equations 14 and 15 for the lower and upper bound, respectively). In essence, the model and observational results are combined based on the squared deviations from their respective means; these squares are proportional to a measure of variance, since the deviations are proportional to a measure of standard error (the same holds for Equations 17 and 18). Second, because the full model uncertainty can be larger than the common model spread (as explained above), we also show the more conservative estimate given by an unweighted average of the combined observational result and the combined model result, for both the mean (Equation 16) and the uncertainty bounds (Equations 17 and 18 for the lower and upper bound, respectively), indicated by the white box around the magenta bar in Fig. 4; this results in a larger confidence interval than the weighted result.

$${({\mu }_{mod}-{\mu }_{obs})}^{2}\gg {\sigma }_{mod}^{2}+ {\sigma }_{obs}^{2}$$
(10)

In the following equations, \({W}_{obs}\) is the weight applied to the observations (bright blue bar), and \({OBS}_{lower}\) and \({OBS}_{upper}\) are the lower and upper ends of the observational uncertainty bound (the left and right ends of the bright blue bar); \({W}_{mod}\), \({MOD}_{lower}\), and \({MOD}_{upper}\) are the corresponding terms for the model results (bright red bar). \({\mu }_{synthesis-weighted}\) is the weighted mean, and \({SYN}_{weighted-lower}\) and \({SYN}_{weighted-upper}\) are the lower and upper uncertainty bounds of the weighted result (i.e., the extent of the magenta bar). Finally, \({\mu }_{synthesis-unweighted}\) is the unweighted mean, and \({SYN}_{unweighted-lower}\) and \({SYN}_{unweighted-upper}\) are the lower and upper uncertainty bounds of the unweighted result (i.e., the extent of the white box around the magenta bar in Fig. 4).

$${W}_{obs}=\frac{1}{{\left({OBS}_{upper}-{OBS}_{lower}\right)}^{2}}$$
(11)
$${W}_{mod}=\frac{1}{{\left({MOD}_{upper}-{MOD}_{lower}\right)}^{2}}$$
(12)
$${\mu }_{synthesis-weighted}=\frac{{W}_{obs}\times {\mu }_{obs}+{W}_{mod}\times {\mu }_{mod}}{{W}_{obs}+{W}_{mod}}$$
(13)
$${SYN}_{weighted-lower}={\mu }_{synthesis-weighted} - \frac{\sqrt{{W}_{obs}\times {({\mu }_{obs}-{OBS}_{lower})}^{2}+{W}_{mod}\times {({\mu }_{mod}-{MOD}_{lower})}^{2}}}{{W}_{obs}+{W}_{mod}}$$
(14)
$${SYN}_{weighted-upper}={\mu }_{synthesis-weighted} + \frac{\sqrt{{W}_{obs}\times {({{OBS}_{upper}-\mu }_{obs})}^{2}+{W}_{mod}\times {({{MOD}_{upper}-\mu }_{mod})}^{2}}}{{W}_{obs}+{W}_{mod}}$$
(15)
$${\mu }_{synthesis-unweighted}=\frac{{\mu }_{obs}+{\mu }_{mod}}{2}$$
(16)
$${SYN}_{unweighted-lower}={\mu }_{synthesis-unweighted} -\frac{\sqrt{{({\mu }_{obs}-{OBS}_{lower})}^{2}+{({\mu }_{mod}-{MOD}_{lower})}^{2}}}{2}$$
(17)
$${SYN}_{unweighted-upper}={\mu }_{synthesis-unweighted} +\frac{\sqrt{{({{OBS}_{upper}-\mu }_{obs})}^{2}+{({{MOD}_{upper}-\mu }_{mod})}^{2}}}{2}$$
(18)
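A direct transcription of Equations 11–18 is sketched below; obs and mod are the (best estimate, lower bound, upper bound) triplets of the bright blue and bright red bars, and, as before, any transformation of the PR before combination is left aside as an assumption.

```python
import numpy as np

def combine_obs_models(obs, mod):
    """Weighted and unweighted combination of the observational and model
    syntheses (Eqs. 11-18); each argument is a (best, lower, upper) tuple."""
    (mu_o, lo_o, up_o), (mu_m, lo_m, up_m) = obs, mod
    w_o = 1.0 / (up_o - lo_o)**2                       # Eq. 11
    w_m = 1.0 / (up_m - lo_m)**2                       # Eq. 12

    mu_w = (w_o * mu_o + w_m * mu_m) / (w_o + w_m)     # Eq. 13
    lo_w = mu_w - np.sqrt(w_o*(mu_o - lo_o)**2
                          + w_m*(mu_m - lo_m)**2) / (w_o + w_m)  # Eq. 14
    up_w = mu_w + np.sqrt(w_o*(up_o - mu_o)**2
                          + w_m*(up_m - mu_m)**2) / (w_o + w_m)  # Eq. 15

    mu_u = 0.5 * (mu_o + mu_m)                         # Eq. 16
    lo_u = mu_u - 0.5 * np.sqrt((mu_o - lo_o)**2 + (mu_m - lo_m)**2)  # Eq. 17
    up_u = mu_u + 0.5 * np.sqrt((up_o - mu_o)**2 + (up_m - mu_m)**2)  # Eq. 18
    return (mu_w, lo_w, up_w), (mu_u, lo_u, up_u)
```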

The observational and model attribution results are summarized in Table 2. The observational results consistently indicate an increase in the likelihood of the event to occur, with a best estimate PR of 7.42. The model results are less consistent. The high-resolution model HadGEM3-GC31-HM shows a slightly increased likelihood of occurrence, with a PR of 1.09 (CI, 0.06–18.01). The medium-resolution group shows a small increase in likelihood, with a best estimate PR of 1.05 (CI, 0.653–1.78); within this group, three models (CNRM-CM6-1-HR, EC-Earth3P, and HadGEM3-GC31-MM) show a PR below one, whereas d4PDF shows a PR above one. The low-resolution group shows an increased likelihood of occurrence, with a best estimate PR of 2.13 (CI, 1.56–3.08); within this group, two models (CNRM-CM6-1 and EC-Earth) show a PR above one, whereas CMCC-CM2-HR4 shows a PR below one. Taking all models into consideration, irrespective of resolution, gives a best estimate PR of 1.67 (CI, 1.15–2.49). The ranges of the individual models are not compatible with each other (Table 2 and Fig. 4), suggesting that model uncertainty plays a role beyond natural variability. Nevertheless, each resolution group taken as a whole indicates an increase in the likelihood of the event to occur, with a best estimate PR > 1. It is important to point out that, because of the way the weighting is applied (models with more ensemble members receive a larger weight owing to their smaller natural variability uncertainty), the confidence intervals of the synthesized model results are effectively set by d4PDF and EC-Earth, which have many more simulations and consequently much smaller confidence intervals. Hence the multi-model synthesized results are, to a first approximation, a combination of EC-Earth and d4PDF, with the other models shown in Fig. 4 contributing little.

Table 2 Probability ratios (PR) for Oct maximum 1-day precipitation (mm/day) in each dataset, as well as synthesized results of observations and models respectively, and the overall PR and fraction of attributable risk (FAR) synthesis results combining observations and models. The best estimate as well as 95% confidence interval (CI) are shown

6 Hazard synthesis

As mentioned in the previous section, to reach an overarching attribution statement we first synthesize the observations and the models separately, following the methodology described in Philip et al. (2020) and Ciavarella et al. (2021); in a second step, we combine the model results with the observations to give the overarching attribution statement.

The results for the observations are all similar and clearly indicate an increase in the likelihood of the event to occur, with a best estimate PR of 7.42. However, owing to the relatively short records and high natural variability, the lower bounds of all products do not exclude a decrease in likelihood. Similarly, the upper bound is large, indicating that, from the observations alone, an increase in the occurrence probability of several orders of magnitude (statistically unbounded, see Table 2) cannot be excluded either. Four of the medium- and low-resolution models (CNRM-CM6-1-HR, HadGEM3-GC31-MM, CMCC-CM2-HR4, and CNRM-CM6-1) also show large uncertainty ranges, and for three of them the best estimate of the probability ratio is smaller than one (indicating a decrease in likelihood). The high-resolution HadGEM3-GC31-HM also has large uncertainties, due to its comparatively small ensemble size, but indicates an increase in likelihood, as do the two large ensembles (EC-Earth and d4PDF). Synthesizing these model results leads to a probability ratio of 1.67 (CI, 1.15–2.49), which is almost identical to the weighted synthesized result of models and observations (the magenta bar in Fig. 4).

7 Impacts and losses

While this study focuses on the rainfall hazard, the primary impact was flooding. We chose an event definition that is closely linked to the impacts, but heavy rainfall is not equivalent to losses from flooding. The drivers that exacerbate or reduce impacts in storms of this scale are a complex combination of an extreme natural hazard, long-term planning decisions, and short-term disaster preparedness and response decisions. Despite the very well prepared Japanese population, the exceptionally high rainfall amounts over a very short time led to rapid rises in river levels and eventually to many floods (Susilo et al. 2020). According to the Japanese Government (https://reliefweb.int/report/japan/japan-typhoon-hagibis-information-bulletin), 66 embankments on 47 major rivers collapsed and 203 rivers overflowed during the passage of Typhoon Hagibis. Many communities were reportedly isolated by the floods and/or landslides. Owing to this isolation, 49 people were reported dead and another 15 missing only in the days after the event; ultimately, 99 people died in the disaster, the number recorded in the disaster database.

Recent flood events resulting from storms such as Typhoon Jebi in 2018 had led to very high economic losses, and shortly after Hagibis the reinsurer RenaissanceRe predicted insured losses of around $15 billion (https://www.artemis.bm/news/faxai-loss-to-near-10bn-hagibis-15bn-climate-change-a-factor-renre-ceo/). Considerably later, however, Munich Re estimated the insured losses to be much lower, around $10 billion, with $17 billion in overall losses (https://www.artemis.bm/news/munich-re-estimates-typhoon-hagibis-loss-at-10bn-faxai-at-7bn/). They also noted that, in contrast to, e.g., Typhoon Faxai, which hit Japan further south in September 2019 and primarily caused wind damage, the damages from Hagibis were largely due to the enormous amounts of rainfall. The damages estimated by Munich Re are the same ones that entered the disaster database EM-DAT.

The $10 billion in insured losses is therefore a conservative estimate of the losses caused by the rainfall associated with Typhoon Hagibis, given that this number only includes insured losses, with no consideration of the costs of life years lost or of other, more indirect economic impacts (Frame et al. 2020). In this study we only account for the immediate damages, and include neither the short-term indirect losses arising from the consequences of the immediate damages (http://documents1.worldbank.org/curated/en/186631467998501319/pdf/WPS7357.pdf) nor the long-term losses through effects on well-being (Noy and duPont 2016). These insured costs are thus indeed a conservative estimate, even though they include some costs due to wind rather than flood damage: as has been noted (e.g., by reinsurer Munich Re), Hagibis "showed much lower intensity wind gusts than Jebi, produced an extraordinary amount of rainfall with some areas accumulating more than 1m of precipitation in 24h" (http://thoughtleadership.aon.com/Documents/20200122-if-natcat2020.pdf). Using this conservative estimate means we are effectively working with the lower bound of the economic costs, which is what can be assessed well from the insured costs, whereas the upper bound is so uncertain that a numerical estimate is not possible (Frame et al. 2020). We therefore use this estimate and follow the methodology introduced by Frame et al. (2020), combining the change in likelihood of the damage-inducing rainfall due to climate change, as assessed in the preceding sections, with the economic losses. Using the best estimate of the fraction of attributable risk (FAR) of 0.40 derived above, together with the conservative loss assessment of $10 billion, we estimate an attributable economic loss of $4 billion.
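For transparency, the arithmetic behind this figure, restated from the numbers above (taking the overall synthesized best-estimate PR of about 1.67), is:

$$\mathrm{FAR}=1-\frac{P_{0}}{P_{1}}=1-\frac{1}{\mathrm{PR}}\approx 1-\frac{1}{1.67}\approx 0.40,\qquad \mathrm{attributable\; loss}\approx \mathrm{FAR}\times \$10\,\mathrm{bn}\approx \$4\,\mathrm{bn}$$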

We acknowledge that the economic losses from large-scale flooding in Japan are influenced by many factors; however, the heavy precipitation associated with the passage of the typhoon was the ultimate trigger, without which the large flooding and the ensuing economic losses would not have occurred. The question we address in this study is: everything else being equal, what is the role of anthropogenic climate change in a devastating rainfall event like the one associated with the passage of Typhoon Hagibis? By everything else being equal, we mean that in a counterfactual world without anthropogenic climate change, all other factors (apart from the difference in precipitation), including the trajectory of the typhoon, the spatial distribution of the precipitation, the river flows, the land surface conditions, and the local preparedness and responses, are held exactly as they occurred in reality; the question is then what part of the economic loss is due to the impact of anthropogenic climate change on the heavy precipitation.

8 Conclusion

The impacts of anthropogenic climate change on tropical cyclones are complex, and we are far from having a complete picture, not least because data availability and model fitness for purpose remain a challenge and the full picture will take time to emerge. However, one impact of climate change on tropical cyclones that models consistently simulate and project is that, when they do occur, they bring more rain than they would in a world without human-induced climate change (e.g., Harvey and Irma; Liu et al. 2018), thus increasing the destruction the heavy precipitation causes. Estimating the economic damages of extreme weather events is even more difficult, and the uncertainties are much larger than those for the hazards and the changes in the hazards. By taking only the insured losses and the lower estimates of the hazard attribution, it is however possible to assess a lower bound on the economic losses due to the extreme heavy precipitation (associated with the passing of a tropical cyclone in this case), as was first done for Hurricane Harvey.

Applying the same method to Hagibis leads to an estimated attributable cost of approximately $4 billion. These numbers are much lower than for Harvey, primarily because the damages themselves are much lower. In the case of Harvey, the assessed change in hazard was also more consistent between observation-based and model-based attribution, whereas for Hagibis the models show much smaller changes, indicating that the results shown here are likely an underestimate.

Nevertheless, this study highlights that climate change has clearly led to increased precipitation during heavy rainfall events in the Tokyo area. Coupled with sea level rise, climate change thus poses new challenges for flood adaptation, mitigation, and disaster management in this region. This needs to be seen in the context of the large urban population in the coastal area, underlining that climate change has increased the number of people and the value of property at risk of being flooded.