Attribution of the heavy rainfall events leading to severe flooding in Western Europe during July 2021

In July 2021 extreme rainfall across Western Europe caused severe flooding and substantial impacts, including over 200 fatalities and extensive infrastructure damage within Germany and the Benelux countries. After the event, a hydrological assessment and a probabilistic event attribution analysis of rainfall data were initiated and complemented by discussing the vulnerability and exposure context. The global mean surface temperature (GMST) served as a covariate in a generalised extreme value distribution fitted to observational and model data, exploiting the dependence on GMST to estimate how anthropogenic climate change affects the likelihood and severity of extreme events. Rainfall accumulations in Ahr/Erft and the Belgian Meuse catchment vastly exceeded previous observed records. In regions of that limited size the robust estimation of return values and the detection and attribution of rainfall trends are challenging. However, for the larger Western European region it was found that, under current climate conditions, on average one rainfall event of this magnitude can be expected every 400 years at any given location. Consequently, within the entire region, events of similar magnitude are expected to occur more frequently than once in 400 years. Anthropogenic climate change has already increased the intensity of the maximum 1-day rainfall event in the summer season by 3–19 %. The likelihood of such an event to occur today compared to a 1.2 ∘\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{\circ }$$\end{document}C cooler climate has increased by a factor of 1.2–9. Models indicate that intensity and frequency of such events will further increase with future global warming. While attribution of small-scale events remains challenging, this study shows that there is a robust increase in the likelihood and severity of rainfall events such as the ones causing extreme impacts in July 2021 when considering a larger region.

Europe. However, rainfall attribution remains challenging for areas at the size of the affected river catchments. While acknowledging such challenges, the severe impacts of the July 2021 rainfall event provide an impetus to investigate the role of anthropogenic climate change in the event.
Shortly after the event, a World Weather Attribution (WWA) study was conducted based on the WWA event attribution methodology Philip et al. 2020) and a report was made available (Kreienkamp et al. 2021), which is consolidated in this paper. Using observational datasets, this study analyses two severely impacted regions. As the robust estimation of return values and the detection and attribution of trends is, however, challenging at the scale of the most severely impacted catchments, the study also uses observational data and model simulations to address a more general question of how much climate change has altered the probability and intensity of similar events occurring in an area of similar size within Western/Central Europe, a region characterised by precipitation climatologies similar to those of the affected areas. Based on the specific characteristics of the event, Regional Climate Model (RCM) simulations which are available at a resolution equal to or higher than 12.5 km and Convection-Permitting Regional Climate Model (CPRCM) simulations below 5 km resolution, are included in the model analysis. Thus, the study also implicitly evaluates whether inclusion of CPRCM data adds value to attribution studies of heavy rainfall events and thus provides a new perspective within WWA studies.
First, the meteorological and hydrological characteristics of the event are summarised in Section 2. While the flooding is most relevant to the impacts, an attribution of the flood has not been possible due to the limited availability of hydrological observations and simulations. However, a description of the hydrological characteristics of the affected regions, the event characteristics and the estimated peak flows provides valuable background information to understand the severe impacts. The hydrological and meteorological analysis of the event closes with a discussion on the suitability of hydrological and meteorological datasets for this attribution study (Sections 2.5 and 2.6). Then, the event definition (Section 3) and the statistical method (Section 4) are presented. The data are presented in Section 5, followed by the results in Section 6. The paper is finished by discussing vulnerability and exposure with respect to the event, including the early warning systems applied in Germany (Section 7) followed by a summary and outlook (Section 8).

Fig. 1
Meteorological situation during the event. Contour lines show the 500 hPa geopotential averaged over the 12-15 July 2021 period (in 10 2 m 2 s −2 ). Colours show the vertical integral of the convergence of moisture flux, averaged over the 12-15 July 2021 period (in kg m −2 day −1 ). Data source: ERA5 reanalysis Hersbach et al. (2018a, b), Copernicus Climate Change Service, see also also Hersbach et al. (2020) most affected rivers included the Ahr, Erft, Kyll and Prüm, while the worst floods in Belgium occurred in the Ardennes along the Vesdre and Ourthe rivers and then downstream along the Meuse.

Hydrological characteristics of the affected areas
The affected catchments (see Fig. 3) are located in the Ardennes and the Eifel mountains in the north-western part of the central European low mountain range. This region is geologically characterised by paleozoic siliciclastic sedimentary rocks, regionally also limestones, locally interbedded with paleogene and neogene volcanic rocks that led to the development of shallow soils with a low water capacity due to their high skeletal content. Topographically, it is an elevated plain (mostly 200 to 500 m above sea level) surmounted by individual mountain ranges (up to 700 m), into which the river network has cut deeply during an ongoing phase of tectonic uplift, often in the form of V-shaped notch valleys. The valleys are locally very narrow with steep slopes leading to funnel-like effects in the event of floods (Seel 1983). These hydrogeological, pedological, and topographical characteristics pre-condition the area to be prone to high and rapid hydrological responsiveness. In combination with high soil saturation levels (see, e.g., EUMETSAT 2021), low saturated hydraulic conductivity (i.e. a low infiltration capacity of the soil) and sustained heavy precipitation over a widespread area  (Junghänel et al. 2021), the hydrological event of July 2021 was very destructive. In addition to many common features of the affected catchments, there are also differentiating characteristics that have a modifying effect on the flood such as regional hydrological, topographical, and pedological details, different land cover (arable and grassland use, degree of forestation, viticulture) and water management measures (dams, flood retention). At the level of individual rivers, the respective valley cross-sections, river construction, building development near the river and effects during the event itself (backwater, surges) cause differences in the local flooding.

Hydrological event characterisation
The usual flood season in this region is winter, when relatively high precipitation and low evapotranspiration rates, possibly in combination with snowmelt, cause the highest annual discharges. Nilson and Iber (2020) showed that the annual highest river flows (MHQ) have increased in the Eifel region by up to +8 to +10 % based on observations since the 1950s, mostly from winter floods. Future projections suggest further increases of +10 % by the end of the 21st century assuming the extreme RCP8.5 climate change scenario (ensemble median, ensemble range +5 % to +20 %, 2071-2100 relative to 1971-2000Nilson and Iber 2020). Summer events are relatively rare in the pluvial flow regimes of central Europe; however, many large and particularly destructive floods caused by heavy rainfall have occurred in past summers, e.g. 1804, 1888, 1910, 2016 in the Ahr region (Roggenkamp and Herget 2014). For the Meuse river, the flood in July 2021 was the highest summer flood event on record (Task Force Fact-finding hoogwater 2021), exceeding volumes of previous events by 40 %. Figure 4 shows hydrographs of selected gauging stations, whose locations are shown in Fig. 3, during the event period. The catchments of the Vesdre, Ahr, and Kyll responded particularly rapidly during the afternoon of July 14. At all gauging stations pre-existing water level maxima or estimates of extreme scenarios (red lines in Fig. 4)   Belgian Meuse catchment, more than half exceeded the 25-year return period threshold for discharge and the threshold corresponding to the 100-year return period was exceeded at 29 locations.

Estimation of peak flows
A precise reconstruction and classification of the hydrological and hydrodynamic processes of 14 and 15 July 2021 is not possible. Additional surveys are necessary and ongoing, because in many places the extreme floods not only far exceeded previous events but led to failure or destruction of gauging stations. Therefore, only sparse water-level measurements are available for some catchments (e.g. Ahr, Vesdre), and robust statistics are hampered by large uncertainties in reconstructed data. Thus, also the estimation of peak flows, which are  Table 1 derived from known relationships between discharge and river height, is challenging for this event. The water levels of this event cannot directly be linked to a discharge rate because the relationship is derived for pre-event river cross-sections. During the flooding, large erosion and mass movement resulted in changes of these historic relationships. Based on fragmented data and model simulations for the gauging station Altenahr (Ahr river, see Landesamt für Umwelt Rheinland-Pfalz 2022) and flood marks in the Ahr valley (Roggenkamp and Herget 2022) the peak flow could have been as high as 1000 m 3 s −1 (± 200 m 3 s −1 ) at 03:00 on 15 July 2021. The estimated peak flow is a factor of 4.2 (3.4 to 5.0) higher than the previous highest peak in the instrumental record since 1947 (236 m 3 s −1 , June 2, 2016). This is the same order of magnitude as the highest peak flow quantified to date for summer 1804 (about 1200 m 3 s −1 in Walporzheim, 4 km upstream of Altenahr Herget 2014, 2015)). Assuming a peak flow value of 1000 m 3 s −1 for the 2021 event and including reconstructed peak flow values of historic events (1804,1888,1910,1918,1920), Roggenkamp and Herget (2022) assign a return period of 850 years to the 2021 flood. An event that would statistically occur once in 100 years (HQ100) would hence yield 434 m 3 s −1 , a substantially higher HQ100 than the previous value set to 241 m 3 s −1 (see Table 1) that was based on the instrumental record only. Estimated return periods and return values always depend on the length of the underlying time series and are sensitive to an extension of the time series with new observed extrema or historical data.
At the gauging station of Eijsden (Meuse River) a peak flow of 3179 m 3 s −1 was observed on 15 July 2021 (20:00). This value equals the design level (return period 250 years) of the major dike systems. Water management interventions, like intentional flooding of floodplains and wetlands, prevented downstream floods and kept the Meuse within its banks. In some tributaries of the Meuse (Vesdre, Roer), the event however exceeded design levels. Here, return periods were preliminarily estimated to be as high as 1000 years, based on simulated data (Vesdre, 520 m 3 s −1 ) and manually measured data (Roer, 270 m 3 s −1 ), see Task Force Fact-finding hoogwater (2021).

Suitability of hydrological data for an attribution analysis
The lower Ahr and Middle Meuse rivers showed significantly higher water levels than any flood in the instrumental record. For the Ahr, extreme discharge values of HQ100 and higher, determined by extreme value statistics from the data records, were clearly exceeded. For the Ahr River the flooding is comparable only to the highest historically documented and quantified, but not measured, events. Although some estimates of hydrological event characteristics can be made, the data coverage is insufficient for a probabilistic attribution based on hydrological data, which would require analysis of long flood timeseries. Thus, the attribution study focuses on heavy precipitation as the first order flood-generating meteorological driver rather than on hydrological phenomena.

Suitability of meteorological data for an attribution analysis
As described above, the flood-inducing rainfall event was characterised by processes at different spatio-temporal scales, ranging from the scale of pressure systems to localised convective showers. These processes at different scales interact ) and given the high atmospheric humidity at the time of the event (dew point: 18 • C, integrated water vapour content: 40-45 mm, estimated from ERA5; data not shown), this interaction was likely particularly strong (Lenderink et al. 2017). Convection can thereby alter largescale conditions and promote further moisture convergence and instability (Lenderink et al. 2017).
From a flood perspective, the importance of diversity in scales is also apparent. High discharge in the Meuse was caused by 2 days of widespread rain, whereas the Ahr floods were governed by processes at sub-daily to daily time scales (Mohr et al. 2022). The link between extreme precipitation and extreme discharge is rather complex, and extreme precipitation is only one factor that may contribute to an extreme flood (Sharma et al. 2018;Merz et al. 2020). This is particularly relevant as observational and modelling evidence points at altered spatial structures of convective rain storms in a warming climate (Wasko and Sharma 2015;Lochbihler et al. 2019Lochbihler et al. , 2021. From a climate modelling perspective, there is no optimal strategy for simulating rainfall events for attribution studies due to the involvement of processes at these different spatiotemporal scales. Global Climate Models (GCMs) only capture the large-scale dynamics of such an event, while the convective dynamics are strongly simplified through parameterisations, and rainfall intensity is averaged over unrealistically large areas based on model resolution. Non-convection-permitting Regional Climate Models (RCMs) resolve the rainfall characteristics more realistically, while being computationally efficient enough to produce long simulations required for statistically robust results (Rummukainen 2016). However, they still use convection parameterisations, often shared with and designed for GCMs (Kirshbaum 2020), leading to potentially large biases in precipitation statistics such as intensity, spatial distribution, and timing ). Finally, Convection-Permitting Regional Climate Models (CPRCMs) resolve the largest convective dynamics, and have much better statistics of convective rain Kendon et al. 2017;Lucas-Picher et al. 2021;Schär et al. 2020;Termonia et al. 2018;Van de Vyver et al. 2021), but are computationally demanding and, therefore, only produce simulations of a length that is insufficient to establish robust statistics of climate change.

Event definition
While the extreme precipitation in Germany and Belgium was embedded into the same synoptic system, the meteorological and hydrological characteristics that led to the impacts differed. Therefore, depending on the region, different respective impact-relevant meteorological indices were analysed. Flooding in the Ahr/Erft catchments was caused by intensive precipitation on the sub-daily to daily timescale (see Fig. 2). Mohr et al. (2022) suggest an accumulation time of approximately 15 h, however, as the data available for attribution studies have a daily resolution, the yearly-maximum of daily precipitation accumulations (RX1day) is analysed for the combined Ahr/Erft catchment (black polygons in Fig. 5). For the Belgian Meuse catchment (red polygon in Fig. 5) the precipitation leading to impacts accumulated over 2 days, motivating the analysis of yearly-maximum two-day precipitation accumulations (RX2day) for this region. As attribution studies are challenging for regions at the size of selected catchments, the analysis was widened to include rainfall events occurring within 14 non-overlapping rectangular tiles with a dimension of 1.25 • latitude × 2 • longitude covering and surrounding the affected region (black squares/tiles in Fig. 5). The tiles have a size of 139 km in North-South direction by, depending on the latitude, between 136 km and 151 km in East-West direction. The RX1day time series from all tiles are combined/pooled to increase the amount of data for the observational analysis and for the model-based attribution. This means for example that, instead of having 100 years of observations available to estimate return periods, 1400 years of observations are available as each of the tiles provides 100 years of data. To select regions that have sufficiently similar statistics to be combined in such an approach, their mean and extreme precipitation statistics (parameters of the GEV distribution) were checked to be sufficiently similar. Tile 4 (most south-east, not shown) has been removed as the mean of yearly block maxima and its standard deviation were distinctively higher than for the other tiles. The tile north of tile 4 (tile 8, not shown) has been removed due to a dispersion parameter that was distinct from those of the other tiles. Of the 14 remaining tiles, the mean of the yearly block maxima are between 19 and 29 mm, whereby tile 10 has a mean of 24 mm, with standard deviations between 6 and 8 mm. The dispersion parameter is for all tiles in the range of 0.17 to 0.33 while the shape parameter is in the range of −0.25 to 0.4 (plots not shown here). Despite this rather large variability, Box-Whisker plots for all remaining tiles show that the tail of the resulting overall distribution is not just dominated by a few specific tiles (see Fig. 9). Further analysis and the development of a standardised software to select regions which can be combined in such an approach is underway (Zanger et al. 2022) and supports the selection made here.
Using the overall Pooling Region, the analysis investigates whether the likelihood and severity of extreme rainfall events occurring anywhere in the 14 sub-regions are impacted by anthropogenic climate change. The pooling approach was also used for extreme precipitation in the Cévennes mountains (Vautard et al. 2015;Luu et al. 2022) and Central Europe (Zeder and Fischer 2020), albeit using a pooling of rain gauge stations. For the Pooling Region, RX1day is presented here; however, Table 4 provides the RX2day results from observations and models which are also discussed in Kreienkamp et al. (2021). For all regions, only those rainfall extremes occurring during the boreal summer half-year (April to September) are Note that a simplified polygon shape, which describes the combined Ahr/Erft region approximately, was used for the analysis of observational/model data. The outline of the area plotted in Fig. 3 is shown as a grey box and country borders are shown as grey polygons analysed, as there are seasonal differences in precipitation characteristics, preconditioning, and the water holding capacities of soils.

Statistical methods
To evaluate how climate change affected the severity and frequency of the July 2021 event, extreme value statistics are applied as explained in detail in Philip et al. (2020) and van Oldenborgh et al. (2021). Extreme rainfall accumulations from observations and climate simulations are described using a Generalised Extreme Value (GEV) distribution (see, e.g., Coles 2001) as: where x is the evaluated variable -here RX1day for the Ahr/Erft and Pooling Region or RX2day for the Meuse -μ is the location parameter and σ the scale parameter. The shape parameter ξ is assumed constant over time (see, e.g., Philip et al. 2020) and, to avoid unphysically large values, a penalty term keeps |ξ | <= 0.4 (Martins and Stedinger 2000). The effects of global warming are included by requiring that the location parameter and scale parameter depend on a covariate for which the Global Mean Surface Temperature (GMST) anomaly (T ) is used here (Eqs. 2 and 3). Philip et al. (2020) pointed out that results are similar when using atmospheric C O 2 concentration or radiative forcing estimates instead. The dependence on T is assumed to be exponential as inspired by the CC relationship. Furthermore, the ratio σ/μ is assumed to remain constant.
Here μ 0 and σ 0 are the location and scale parameters for covariate T = 0, and α is the trend parameter (calculated with respect to T as explained in Appendix A; unit of α: (mm day −1 ) K −1 ). If no trend is detected, α = 0 and therefore μ = μ 0 , which means that the GEV fits become independent of T . To estimate the impact of anthropogenic climate change on the distribution, the equations are evaluated at different GMSTs, i.e. at the current climate GMST, at the pre-industrial climate (1850-1900 average, based on the Global Warming Index https://www.globalwarmingindex.org) with a GMST 1.2°C cooler than nowadays, and for a future climate of +2°C with respect to pre-industrial (or +0.8°C with respect to nowadays). It is noted that the assumptions of exponential scaling and those made for the parameters described above might be less robust here than in analyses of large-scale precipitation events; however, they have been used for events that involve convection in other areas of the world (e.g. van Oldenborgh et al. 2017). Two indicators of change in extreme events are analysed, i.e. the probability ratio (PR) which describes the change in the probability of occurrence for an event with at least the magnitude of the analysed event between the pre-industrial (P 0 ) and current (P 1 ) climate, and the change in intensity ( I ) of the event: R 1 the rainfall intensity under current climatic conditions while R 0 is the intensity of rainfall under pre-industrial climate conditions when assuming an event of the same exceedance probability. The impact of climate change on such events is evaluated for a "past-to-present" difference (−1.2°C temperature change), described in the equations above, and a "present-tofuture" (+0.8°C temperature change) difference, noting that the latter is only possible based on climate models while the former is equally analysed for observations. For the presentto-future analysis, P 1 and R 1 describe the future probability and intensity, while P 0 and R 0 describe the values of 2021. To provide overarching attribution statements, the PR and I from each model are first averaged to provide a combined model-based estimate and this average is then combined with the observations-based estimate as explained in Section 6.3.
In the validation and trend analysis of model simulations, all model data up to 2030 are used to study past-to-present changes to increase the data availability. This decision on the years included in the analyses is a compromise between the standard WWA procedure, which uses data up to the present GMST for the past-to-present analysis, and having sufficient data for a good fit. Confidence intervals (CI) for the individual model or observation datasets express the statistical-model uncertainty, obtained using a non-parametric bootstrap procedure. As required due to the spatial pooling, the bootstrap takes spatial correlation into account by drawing a spatial moving block including all tiles with a correlation > 1/e. The underlying code uses a simple way to handle correlation across series in the bootstrap, i.e. when a random value is drawn corresponding to any of the tiles, the simultaneous value in all tiles which have a serial correlation > 1/e are automatically drawn in the same bootstrap sample (the code is available within subroutine sample_bootstrap at https://gitlab.com/KNMI-OSS/climexp/ climexp_numerical/-/blob/master/src/attribute_dist.f90. One complication for using CPRMS data is their comparably short simulation timeframe. Several CPRCMs have only about 10 years of data available for the historical climate period (here defined as pre-2030), which is insufficient for a reliable GEV fit. For these models, sufficient data are only available by including the model data of future years, i.e. the entire simulation timeframe of the given CPRCM, including the future period, is used to evaluate the seasonal cycle and to fit a GEV distribution that is used to specify the event's magnitude from the return period. As a result, these models can only produce present-to-future results, but no past-to-present comparison. Models for which this applies are marked with a star (*) in the evaluation (Table 2) and in the attribution result, see Tables 3, 4 and Fig. 8. While the DWD-CCLM5-MIROC5 had more than 10 simulation years available to evaluate the model quality based on data pre-2030, attribution results are presented only for the present-to-future period based on expert judgement that results for the past-to-present period were likely based on a too short time series to achieve reliable results.

Data
Observational data are introduced here and an overview of the analysed model data is provided. The observational analysis uses the German gridded REGNIE dataset for the Ahr/Erft region, a gridded Belgian dataset for the Meuse catchment, and E-OBS data for the Pooling Region. Four-year low-pass filtered GMST anomalies (based on NASA's GISTEMP) are used as a measure of anthropogenic climate change within the observational analysis. The anomalies are expressed relative to 1951-1980.

Observational data
E-OBS E-OBS is a 0.25 • × 0.25 • gridded temperature and precipitation dataset covering Europe, based on interpolation of precipitation observations from rain gauges (Cornes et al. 2018). The daily values refer to the 24-h period over which the measurements are made (as it is also the case for the other gridded datasets). E-OBS v23.1e, extended in time to cover the event of July 2021 using near real-time synoptic messages as described in Cornes et al. (2018), was used within the analysis of the Pooling Region to produce the climatology, to analyse trends, to estimate return periods, and to validate models. An enhanced, non-public E-OBS dataset that includes more observational data in the years since 2004 and additional observations of the July 2021 event from regional and federal authorities, was produced specifically for Fig. 2; however, as it is less homogeneous in time than E-OBS v23.1e, this enhanced dataset has not been used for trend analyses and estimation of return periods.

REGNIE
The Regionalised Precipitation Heights (REGNIE) dataset of DWD is used for the observational analysis of the Ahr/Erft region. REGNIE provides daily gridded precipitation data with a resolution of 1 km 2 for Germany derived from station data. The methodology is described in Rauthe et al. (2013).
Belgian gridded dataset For the Belgian Meuse catchment, the observations are interpolated on a grid of 5 km × 5 km. The data used are collected from different observation networks, i.e. RMI's climatological rain gauge network, the network of Service Publique de Wallonie -Mobilité et Infrastructures (SPW-MI) in Wallonia, and the network of Hydrologisch Informatie Centrum (HIC) of Waterbouwkundig Laboratorium in Flanders. The data were quality controlled by RMI prior to the interpolation. The interpolation has been done using as weight the inverse of the distance to the stations. Time series of 24 stations over Belgium have also been incorporated into the enhanced E-OBS dataset.

Model simulations
Within the model analysis RCMs from EURO-CORDEX and RACMO, as well as CPRCM simulations from the UK Met Office (UKMO), HCLIM38, WRF, ALARO-0 and COSMO-CLM are used to investigate how climate change influenced the event's frequency and severity. While CPRCMs show added value in many aspects, e.g., in a more realistic reproduction of precipitation characteristics and better resolved dynamical processes, they typically only provide one ensemble member and limited simulation years, often in time-slice experiments (Lucas-Picher et al. 2021;Termonia et al. 2018;Prein et al. 2015). The analysed RCMs provide longer simulations with several ensemble members, but lack the fine-scale detail and improved physics that CPRCMs achieve by limiting or removing convection parameterisation (Jacob et al. 2020). As measure of anthropogenic climate change in models, the 4-year low-pass filtered GMST is used when only one realisation of the GMST from the driving GCM is readily available within the KNMI Climate Explorer (https://climexp.knmi.nl/) and an ensemble-mean GMST of the driving model is used if several ensemble members are available. Further details and references for these simulations can be found in Appendix C.

Observational analysis of the combined Ahr/Erft catchments and Belgian part of Meuse catchment
The analysis of gridded REGNIE rainfall data, spatially averaged over the Ahr/Erft catchments indicates a RX1day precipitation value of 93 mm day −1 for the 2021 event. The yearly block maxima for RX1day in the Ahr/Erft catchment (see Fig. 6(a)) are used to fit GEV parameters to estimate the return period of the event, see Fig. 6(c). The estimated return period (vertical position where the GEV fit would intersect the horizontal magenta line giving the event's intensity in Fig. 6(c), not visible due to the scale) is substantially larger than the length of the time series, i.e. in the order of 15,000 years. Because the estimated return period is many times longer than the time series used to estimate it, a robust estimation with a sufficiently narrow uncertainty interval cannot be given. In Kreienkamp et al. (2021) a conservative estimate of 700 years was selected to analyse model simulations; however, it was found that the selected region is too small for robust detection and attribution. The analysis of the two-day precipitation, spatially averaged over the Belgian part of the Meuse catchment upstream of Eijsden using the Belgian gridded dataset, indicates a RX2day of 53 mm day −1 for 2021 (see Fig. 6(b)), with a GEV-estimated return period of about 1000 years (95 % CI 200 to 120,000 years, see Fig. 6(d), where the GEV fit intersects the horizontal magenta line giving the observed magnitude of the event).
Note that the GEV fit to these observational data does not include the values of the analysed events, as the areas were specifically chosen because they exhibited the extremes. From the observational analysis of RX1day (Ahr/Erft) and RX2day (Meuse) there is no statistically significant trend (the interested reader can find trend parameters in Appendix A). This leads to GEV fits which are hardly distinguishable between pre-industrial and current climate conditions (compare the red and blue GEV fits in Fig. 6(c-d)). The absence of a significant trend at the scale of these catchments is consistent with theoretical expectations (Li et al. 2020;Aalbers et al. 2018;Fischer and Knutti 2014) given the large internal variability and the small spatial scale of events, which may mask or amplify local trends and prevent robust conclusions about the existence of local trends.
The robust estimation of return values and the detection and attribution of trends are challenging for regions of such size, and thus push the limits of what current methods of probabilistic extreme event attribution are designed for. However, at the global scale for most larger subcontinental regions with sufficient observations, an observed increase in heavy precipitation is well established in the literature, including medium confidence in an increase in a larger region including West-Central Europe (Seneviratne et al. 2021). The study therefore moves on to evaluate how the probability and magnitude of similar events within the larger Pooling Region have been affected by anthropogenic climate change.

Observational analysis Pooling Region
To increase the statistical robustness and broaden the applicability of the results, the observational data for 14 sub-regions (tiles) sized 1.25 • latitude × 2 • longitude are analysed (see event definition in Section 3). For each of these 14 tiles the observational time series is used to calculate block maxima and afterwards these block maxima are combined to estimate return periods using a GEV fit as described in Section 4. The observational analysis indicates a RX1day of 64 mm day −1 in Fig. 7(a) for tile 10 which corresponds to most severely affected region (see Fig. 5). It can further be seen in Fig. 7(a) that events of similar size have occurred within different tiles of the Pooling Region over the last 100 years.
Based on the increased sample size that was achieved through pooling, 5-year block maxima, i.e. including only the highest value from a 5-year block per tile in the Pooling Region, are used to calculate the GEV fit. This extension of the block size is supported by, e.g., Ben Alaya et al. (2020) who found that, for extreme precipitation, and given a sufficient sample size, increasing the block size generally leads to a better fit. Data from the event are included in the fit, as we selected a wider region rather than a small area that has been most affected by the event. Figure 7 (b) shows the GEV fit to the 5-year block maxima. The  1930 1940 1950 1960 1970 1980 1990 2000  to 2500 years, determined as vertical position of the intercept between the "GEV scale fit 2021" and the line that provides the "observed 2021" magnitude). A return period of 400 years means that one such rainfall event is expected in every tile per 400 years, i.e. more than one such event per 400 years is expected within the larger European region. A statistically significant trend has emerged from the noise for RX1day in the Pooling Region (trend parameters in Appendix A), scaling the GEV fit using the GMST as a covariate as described in Section 4. This can be seen in Fig. 7 (b), which shows a clear dependence on the GMST, i.e. the GEV fit scaled to past climate conditions (blue line) can be distinguished from the GEV fit scaled to climate conditions of 2021 (red line). In summary, the GEV fit to the observations indicates that there is a pronounced change in the RX1day index between past and present climate conditions, with a PR of 7.7 (95 % CI from 2.0 to 52) and a change in intensity of 22 % (95 % CI from 6.7 to 34 %).

Multi-model attribution
This section provides attribution results, i.e. changes in probability ratio (PR) and intensity by comparing the timing of peaks as well as checking that the GEV parameters are within the 95 % CI of the parameters estimated based on the observations. A check evaluating whether the model spatial pattern of heavy rainfall (i.e. the climatology of April-September RX1day from 1920 to 2021, calculated at each grid point individually) resembles the observed one has been done for RCMs (RACMO and EURO-CORDEX). Beyond the quantitative evaluation performed on CPRCMs, their suitability was supported by (i) respective experts who were either co-authors or have been involved in the process and (ii) scientific publications evaluating the model simulations, where available. Table 2 summarises the evaluation results for the models that are included in the study. In general, simulations based on the GCMs HADGEM2 and CanESM, as well as the RCM REMO were not included as further discussed in Appendix C. Figure 8 shows the attribution results for the PRs (left) and for the changes of intensity ( I , right) for the pooled RX1day data. The past-to-present comparison is shown at the top and intensity changes (right) for RX1day data of the Pooling Region. Past-to-present means comparing the 2021 event in the present-day climate with a pre-industrial climate (1.2°C cooler than today) and present-tofuture means comparing a climate with 2°C of global warming (above pre-industrial) with present-day values. The synthesis bar shows in magenta the weighted average and additionally the black bar gives the unweighted average (see Section 4 for a detailed explanation of these bars). The numbers provided in brackets after the model names provide the number of model simulations included in the dataset. Models marked with a star (*) only provide estimates of the attributable signal for the future period as described in Section 4 and the present-to-future at the bottom. The PR and I calculated based on the observational dataset is shown in blue, while the results from individual models are presented in pale red and in dark red a combined estimate based on the models is given. Each bar shows the 95 % CI. The black vertical line within each bar provides the best estimate based on the GEV fit to the original time series. The best estimates of the PRs and intensity changes for past-topresent indicate that the event has become more likely and more intense based on all models except the MPI-ESM-KIT. Despite the opposite best estimate signal of the MPI-ESM-KIT model, its large CIs overlap with those of most other models. The applied attribution method is designed to deal with such varying signals from different models and takes into account the confidence in individual estimates. The PR and I from each model are averaged to provide a combined model-based estimate which is shown as a dark red bar in the panels of Fig. 8 (see Li and Otto 2022, for a detailed description). Each model is weighted by the inverse of its uncertainty (95 % CI) and, as larger ensembles tend to have smaller uncertainty ranges, the ensembles of RCM simulations receive a higher weighting than the short CPRCM simulations. Then, the modelbased estimate and the observations-based estimate are combined to give an overarching attribution statement (synthesis). The synthesis is calculated using two approaches described below, with further detail and equations available in Li and Otto (2022). In this particular study, the final uncertainty reported in the overarching attribution statement combines these two approaches. First, ignoring model uncertainties beyond the model spread (i.e. ignoring model biases that are shared between the models used and, therefore, cannot be detected by differences between them, see also Li and Otto 2022), the weighted average of the combined model-based estimate (dark red bar) and the observations-based estimate (blue bar) is calculated and shown as magenta bars in Fig. 8. Given that the weights are inversely related to the uncertainty range, which for model ensembles with large amounts of data is much smaller than for observations, the model-based estimate dominates the result and leads to a reasonably small uncertainty range. Second, acknowledging that the uncertainty range of the short simulations (here of CPRCMs) and the observations-based estimate have limited influence in this result, we also consider the uncertainty bounds based on an unweighted average between the combined model-based estimate (dark red bar) and the observations-based estimate (blue bar), shown as a white extension of the magenta synthesis bar in Fig. 8(a-b). The extended uncertainty bar is calculated based on Equations 17 and 18 of Li and Otto (2022) using the log() of input values for PRs (transforming the results back using ex p() of the result) and using log(1 + data/100) for the intensity change (results obtained by using 100 * (−1 + ex p(data)) of the resulting value). We note that the larger uncertainty range based on the unweighted mean was originally designed to give more weight to the observations in cases where the model and observations disagree, possibly due to a model bias shared by the used models. However, while research on how to best combine results from different classes of models with observations is ongoing, the results in Section 6 combine the weighted (magenta bar) and the unweighted (white bar) synthesis approach by taking the largest range, which is similar to the range of results that are obtained from combining the past-to-present CPRCM results (omitting the other RCMs) with observations. As a result, the combination of both uncertainty bars gives an indication of confidence that better reflects the judgement of involved experts.
The combined PR (synthesis) for RX1day extremes, occurring in any of the tiles of the Pooling Region ( Fig. 8(a)) ranges from 1.2 to 9 for the past-to-present comparison. For the future compared to present (Fig. 8(c)), the models indicate a PR of 1.2 to 1.4. The combined I for past-to-present ranges from 3 to 19 % ( Fig. 8(b)), whereas for the present-to-future the models indicate an additional intensity change of 0.8 % to 6 % ( Fig. 8(d)) consistent with the smaller warming signal in present-to-future than in past-to-present. The observational analysis indicates a significant increase, larger than the median estimate from models, but overlapping with the CIs of individual models. Such behaviour of a stronger attributable signal in observations than in simulations could, e.g., be caused by a low estimate of the climate change signal in simulations as indicated in Fischer and Knutti (2016) (their Fig. 2) for precipitation. Another explanation would be the internal variability which has a large impact on the uncertainty of return level estimates for heavy rainfall (Poschlod and Ludwig 2021). In general, the attribution method used here is designed to combine the knowledge gained from observations and model simulations. Similar behaviour has also been found in attribution studies of recent heat waves (Philip et al. 2022;Zachariah et al. 2022) It should be noted that uncertainties for the present-to-future period are not directly comparable with those calculated for the past-to-present because, aside from including additional models, different timeframes, and a smaller change in GMST (−1.2 • C for the past-topresent and +0.8 • C for the present-to-future), the present-to-future comparison does not contain observational data and, therefore, cannot include uncertainty relating to the difference between model and observational results. The uncertainty estimates for the present-to-future comparison consequently may be underestimated.
The RX2day results (see Table 4 and Kreienkamp et al. 2021), are generally very similar to the RX1day results for the Pooling Region and as for RX1day results, the observations tend to show more significant positive changes than the models and act to extend the upper range of the synthesis.

Discussion of the results
This study investigated how climate change has influenced the likelihood and severity of the heavy precipitation event that led to severe flooding in Western Europe in July 2021. Due to data availability an attribution based on hydrological data was not possible and, therefore, the attribution analysis focused on rainfall as a precursor to the flooding. At the scale of the most severely affected catchments, a robust estimation of return values and the detection and attribution of trends is challenging. Based on large climate model ensembles it has been demonstrated that variability dominates over trends in heavy precipitation at the scale of grid points and small catchments, for which the signal can differ significantly between several realisations of a model (Li et al. 2020;Aalbers et al. 2018;Fischer and Knutti 2014). Thus, even 100-year long trends at the scale of stations or small catchments such as the Ahr/Erft may be strongly affected by the large internal climate variability and deviate from the forced response. Therefore, spatial pooling -i.e. the analysis of 14 regions within Western Europe -is applied here to reduce the role of internal variability and thereby increase the signal-tonoise ratio. Over a larger spatial domain the sample size of rare events is higher and a climate signal can be more robustly detected even in comparatively short observational records or model simulations. By using a spatial pooling approach, a robust signal towards increased intensity and severity of similar heavy precipitation events was detected within this Western European region. Nevertheless, a substantial range of possible values for, e.g., the PR exists due to i) rather low signal-to-noise ratios in observed trends given ii) the rareness of the event and the combination of large-scale and small-scale features, and iii) limited availability of models that represent convective dynamics while also covering long time spans.
Choosing climate model simulations for this analysis was challenging and led to a combination of RCM and CPRCM simulations being included, based on their different characteristics described in Section 2.6. While each model type has benefits and limitations, the change in absolute humidity of the atmosphere is robustly represented in all modelling streams. Although actual increases in humidity do not always follow the local temperature changes using the CC relationship of 6-7 % per degree, observations show (Byrne and O'Gorman 2018) and models project robust increases in humidity of the atmosphere. This provides a firm basis for observed and projected increases in precipitation extremes (Fischer and Knutti 2016). Typically expected changes in near surface humidity are 4-6 % per degree global warming (O'Gorman and Muller 2010; Lenderink and Attema 2015;Lenderink et al. 2021), depending on precipitation-soil moisture drying feedback in summer over the continent. This provides a physically based baseline estimate of changes to precipitation extremes and it should be noted that most of the trends presented here are consistent with this estimate.

Vulnerability and exposure
The July 2021 floods resulted in extreme impacts as presented in Section 1. Given the long return period of the analysed event, such an event is likely to cause negative impacts. As events like this are expected to occur ever more frequently, examining the vulnerability and exposure context is critical to reducing future impacts. Risk, as defined by the IPCC, is the interplay between a hazard, exposure and vulnerability to an extreme event (IPCC 2021b). The United Nations define disaster risk as the "potential loss of life, injury, or destroyed or damaged assets which could occur to a system, society or a community in a specific period of time, determined probabilistically as a function of hazard, exposure, vulnerability and capacity" (UNDDR 2022). Recent research suggests expanding risk analyses to account for cascading and compounding impacts on socioeconomic outcomes (Raymond et al. 2020;Balch et al. 2020). After discussing causes for vulnerability, this section summarises some of the flood protection measures and early warning systems that have been operational in the affected regions.
Differences in vulnerability and exposure can arise from non-climate factors such as multi-dimensional inequalities that result in differential impacts for vulnerable individuals and households (IPCC 2021b). Drivers of social vulnerability to floods include demographics, socioeconomic characteristics, health, risk perception, and coping capacity (e.g. Rufat et al. 2015). The most vulnerable demographics to flood events are generally thought to be children, older people, and persons with disabilities, as these groups may be less aware of risks or warnings, have more difficulty self-evacuating, require further assistance by emergency response teams, and may be more difficult to reach (Lowe et al. 2013;Rufat et al. 2015). The analysed flood event highlights that even highly developed countries like Germany are vulnerable to the impacts of such extreme events.

Flood protection
Past experience with floods has shaped protection measures in the region. In Germany, for example, the Federal Water Act (German: Wasserhaushaltsgesetz -WHG) which implements the European Union Floods Directive of 2007 (Directive 2007/60/EC) handles the flood protection in its Chapter 6. Paragraph §74 of the WHG outlines three risk levels for which hazard and risk maps have to be developed: floods with high probability, floods with medium probability, defined as having an expected return interval of at least 100 years, and low probability floods (expected return interval at least 200 years). The European Union Floods Directive stipulates that flood risk maps had to be developed by 2013 and revised thereafter every 6 years, also taking into account climate change risks. The Federal Water Act also addresses certain actions that can increase vulnerability and exposure to flood events by, e.g., providing guidelines on hazardous materials, preventing individual actions that could impede water flow, mandating environmental conservation considerations and ensuring that community efforts do not adversely affect upstream or downstream communities. In Rhineland-Palatinate and North Rhine-Westphalia, flood risk maps have been developed for most of the rivers including the Ahr and the Erft (Landesamt für Umwelt Rheinland-Pfalz 2021; Landesamt für Natur, Umwelt und Verbraucherschutz, Nordrhein-Westfalen 2021). However, even the extreme scenario with low probability was overtopped by the current event in some regions. Roggenkamp and Herget (2022) estimate the peak flow return time of the Ahr flood event at once in 850 years. Given that design levels for infrastructure typically are oriented at events with a shorter return period, such an event is expected to incur major impacts.
In Germany, many flood protection measures (e.g. dike relocations, polders) are coordinated in the "national flood protection programme" (Bund/Länder-Arbeitsgemeinschaft Wasser 2014). In the Netherlands, municipalities created flood buffer zones, lowered floodplains, engaged in channel dredging and widening, and enacted specific zoning laws (Rijkswaterstaat, Ministerie van Infrastructuur en Waterstaat 2021). In Belgium, the strategies of flood prevention, mitigation and defence -collectively, Water Systems Arrangement -are the purview of the regional governments, while the flood preparation and flood management functions, relating to emergency planning, crisis management, and insurance, are governed at the federal level (Mees et al. 2016). Implementation of such protective measures reduces vulnerability and exposure.

Early warning
Forecasts and early warning systems are key factors in disaster flood preparedness and response, and their presence and effectiveness can be key to reduce impacts of such hazards. The region affected by this event is located in highly developed countries and is well served by weather forecasts and risk communication systems. Structures vary widely between European countries -Germany, Belgium, and the Netherlands have different early warning systems which communicate flood warnings to local authorities, relevant agencies, and the public. For example, in Germany, flood forecasts are issued by the 16 federal states, based on weather forecasts of DWD. For the catchments affected by the flood 2021, forecasts are issued by the federal states of Rhineland Palatinate (e.g. Ahr) and North Rhine-Westphalia (e.g. Erft). Early warnings are transmitted in several ways to authorities and the population, including through warning apps such as "NINA" (https://www.bbk.bund.de/DE/ Warnung-Vorsorge/Warn-App-NINA/warn-app-nina_node.html), "Meine Pegel" (https:// www.hochwasserzentralen.info/meinepegel/) or "KatWarn" (https://www.katwarn.de/). The European Floods Awareness System (EFAS; https://www.efas.eu), part of the EU's Copernicus Emergency Management Service, issues forecast bulletins to relevant authorities across Europe, which, however, is not addressing the public.
The rainfall of July 2021 was well captured by weather forecasts as early as July 9, although still with large uncertainties regarding spatial and temporal distribution and the expected precipitation amounts. For instance, forecasts analysed by EFAS found high probabilities of flooding for the Rhine and Meuse catchments, and they issued a first informal flood notification to EFAS partners in these regions on July 10 (Emergency Management Service 2021). EFAS partners are national, regional or local authorities with obligations to provide flood forecasting services, see https://www.efas.eu/en/partners-list-0 for a list of institutions. Independently, the DWD (for weather forecasts) and agencies issued their own warnings starting around July 11 (Landesamt für Umwelt Rheinland-Pfalz 2022) and issued updates and warnings as the forecast uncertainty decreased and river monitoring started to show signs of flood onset. For example, the forecasting system of the State Office for the Environment, Rhineland-Palatinate in Germany issued a total of ten forecasts for the gauge Altenahr/Ahr on July 14. The forecasted maximum water levels were between 2.25 m in the morning, and 7.07 m in the late evening. Despite the existence of accurate forecasts and early warning systems, not all exposed authorities and people received the warnings and when they did, the warnings were not always understood and acted upon. This is consistent with well documented barriers and enablers to the effectiveness of early warning systems -including accessibility and reach of the information, the lead time and content of the messages (e.g. impact-based, actionable), the ability to act on the information, risk perception and tolerance (Twigg 2003;Kelman and Glantz 2014;Perera et al. 2020;Kreibich et al. 2021;Netzel et al. 2021).
Due to the severe impact of the analysed flooding, the process of analysing different aspects of disaster preparedness and response, as well as drawing conclusions and implementing improvements is still ongoing. Findings of a German governmental interim report point to a need for realigning responsibilities within the Federal Office for Civil Protection and Disaster Assistance to improve disaster response and crisis management (Bundesministerium des Innern, für Bau und Heimat; Bundesministerium der Finanzen 2021). The related final report (Bundesministerium des Innern und für Heimat; Bundesministerium der Finanzen 2022) evaluates what improvements could be implemented within Germany to better protect against future events. Among other things, the Joint Competence Centre Civil Protection (German: Gemeinsames Kompetenzzentrum Bevölkerungsschutz; GeKoB) has been established in 2022, which will support relevant authorities in future crisis situations with, e.g., comprehensive situation assessments and technical expertise. Furthermore, cell-broadcast is in the process of being implemented and the siren infrastructure will be expanded.

Summary and outlook
This paper provides a probabilistic attribution study of the flood-inducing heavy rainfall of July 2021 in Western Europe as well as a discussion of vulnerability and exposure including early warning systems. While river discharge and water levels are the physical components most directly linked to the event's impacts, the attribution study focuses on the main meteorological driver, i.e. the heavy rainfall, as the available data and methods do not permit the attribution of hydrological parameters. Nevertheless, a section of this paper is devoted to an initial hydrological analysis. In general, statistical weather generators and simulations with improved hydrological models capable of reproducing extreme events, could be used to enhance the ability to analyse hydrological extremes in addition to their meteorological precursors (see, e.g., Khazaei et al. 2012). Due to the interest in the flooding of 2021 in Western Europe, different approaches are currently being tested, with, e.g., Ludwig et al. (2022) presenting results based on a hydrological model and pseudo-global-warming experiments with the Weather Research and Forecasting (WRF) model. Studies can also include knowledge about historic flood events which predate the observational record, but for which flood marks and paintings exist, as these events provide valuable knowledge about the frequency at which flood events are to be expected and about the areas that will be flooded at a given water level. Such an approach has been tested in Vorogushyn (2022).
Due to the scale of the analysed event, and its characteristics that involve large-scale features as well as convective effects, an attribution of this rainfall event was challenging. The study presents a methodology in which (i) a pooling approach is employed to increase the availability of data as well as the robustness of results and (ii) a combination of RCM and CPRCM simulations is used to account for convective characteristics while maintaining sufficient simulations to conduct an attribution. Using spatial pooling enables the probabilistic attribution of this event on a larger Western European scale using the attribution approach from World Weather Attribution (Philip et al. 2020;van Oldenborgh et al. 2021). Thereby, this study outlines a pathway for the probabilistic attribution of rainfall events which exhibit similar complications due to the scale of involved processes. Further research into the statistical foundation of the pooling approach is currently underway (Zanger et al. 2022) including the development of standardised tools to select regions in which data can be pooled for attribution analysis.
The analysis shows that, at a given location within this region, one such event can be expected every 400 years under the current climate conditions. As a result, such events are expected to occur more frequently than once in 400 years within the entire Pooling Region. Within the Western European Pooling Region a robust signal towards an increased frequency and intensity of one-day rainfall events with a similar return period was found. Climate change made such a rainfall event 1.2 to 9 times more likely. The magnitude of the rainfall increased by about 3-19 % compared to a global climate 1.2 • C cooler than today, based on the analysis of rainfall events in the boreal summer half-year. A similar increase in the rainfall intensity and frequency is found when defining the event based on the rainfall accumulated over 2 days. Further increases in frequency and intensity of such events are expected with additional climate change.
The findings are not just relevant for expanding the evidence from attribution science and thereby motivating mitigation efforts, but also for the design of policy interventions. The final report on the flood disaster of the German Federal Ministry for the Interior and Community (Bundesministerium des Innern und für Heimat; Bundesministerium der Finanzen 2022) acknowledges that the significant need for action in climate change adaptation is becoming increasingly clear, not least as a result of the heavy rainfall and flooding events of July 2021. This shows that the results of attribution studies, such as the rapid study upon which this paper builds, are motivating action on adaptation to the risks of climate change and, thereby, can help to decrease the vulnerability and exposure to such events.

C.2 EURO-CORDEX
The EURO-CORDEX Group defined a framework for the preparation of regional climate projections (Jacob et al. 2014(Jacob et al. , 2020Giorgi et al. 2009 (2020) and Zier et al. (2021). Some of the model data sets show clear problems and were excluded as a result. For example simulations based on the GCMs HadGEM2 and CanESM2 exhibit annual cycles of precipitation that deviates significantly from observation (see Fig. S1 in the Supplementary Information). Furthermore, the RCM REMO shows structurally offset precipitation, where windward-leeward effects on mountains appear too far windward of the mountains see Fig. S2 in the Supplementary Information). Following the conclusions of this evaluation, we removed all simulations downscaling the CanESM and the HadGEM2-ES GCMs. Simulations using the REMO RCM model were also removed. Each remaining simulation was bias corrected using the Cumulative Distribution Function transform method as described in Bartók et al. (2019).

C.4 HCLIM38
The HCLIM38 simulations were carried out by the nonhydrostatic model HCLIM38-AROME (Belušić et al. 2020) at 2.5 km horizontal resolution on a domain ranging from the UK to Poland and Mid-Italy to Southern Scandinavia. The HCLIM38 simulations consist of three GCM-forced runs (historic 1996-2005, RCP 8.5 2041-2050and 2090-2099. HCLIM38 is run in a double-nested way, using the RACMO RCM inbetween the host model and the CPRCM.

C.6 ALARO-0
The ALARO-0 climate projections cover the continuous periods 1950-2100 at 4 km resolution for a wide region centred over Belgium (Termonia et al. 2018) and has undergone thorough validation ) and investigation (Helsen et al. 2020) with respect to extreme rainfall. The runs were performed in a double one-way nesting setup where the Belgian domain was nested in an ALARO-0 run over the EURO-CORDEX domain, which was driven by CNRM-CM5 (historical, RCP 2.6, RCP 4.5, RCP 8.5). In this attribution study we use RCP 4.5 and RCP 8.5 simulations. The model domain covers 7 of the 14 tiles (i.e. tiles 4,5,7,8,10,11,12 shown in Fig. 5).

C.7 COSMO-CLM
Several time-slice experiments of the COSMO-CLM (https://www.clm-community.eu; Rockel et al. 2008) at convection permitting scales have been analysed within this study, all using the RCP8.5 emission scenario for the future. At the Karlsruhe Institute of Technology (KIT) three simulations at 2.8 km horizontal resolution have been conducted, driven by the GCMs EC-EARTH, HadGEM2, and MPI-ESM-LR respectively (EC-EARTH-KIT, HadGEM2-KIT, MPI-ESM-KIT); the model domain covered 7 of the 14 tiles (i.e. tiles 2, 3, 6, 7, 10, 11, 12) of the regions shown in Fig. 5. The HadGEM2-based simulation by KIT failed the model validation due to an unreasonable seasonal cycle (see also Supplementary information) and a high shape parameter and was excluded from further analysis. At DWD a simulation at 3 km horizontal resolution driven by MIROC-MIROC5 was conducted (DWD-CCLM5-MIROC5) within the first phase of the project "Expertennetzwerk (Network of Experts)". All 14 tiles were within the model domain of these simulations.

Appendix D: Model evaluation and attribution results
The results of the model evaluation are given in Table 2. The attribution results which are shown on Fig. 8 are given in Table 3. Additionally, the attribution results for RX2day in the Pooling Region, which are not in detail discussed in this paper, are given in Table 4. The threshold provided in Tables 3 and 4 provide the rainfall accumulation reached during the analysed event within the observations-based E-OBS dataset and, for the models, the threshold presents the rainfall accumulation which has an equivalent return period to the observed extreme event. By selecting an extreme event with the same return period (and not the same magnitude) the climate model simulations are corrected for offsets with respect to the observations. For a more detailed explanation of this basic bias correction, see Philip et al. 2020).