Large-scale stochastic flood hazard analysis applied to the Po River

Reliable hazard analysis is crucial in the flood risk management of river basins. For the floodplains of large, developed rivers, flood hazard analysis often needs to account for the complex hydrology of multiple tributaries and the potential failure of dikes. Estimating this hazard using deterministic methods ignores two major aspects of large-scale risk analysis: the spatial–temporal variability of extreme events caused by tributaries, and the uncertainty of dike breach development. Innovative stochastic methods are here developed to account for these uncertainties and are applied to the Po River in Italy. The effects of using these stochastic methods are compared against deterministic equivalents, and the methods are combined to demonstrate applications for an overall stochastic hazard analysis. The results show these uncertainties can impact extreme event water levels by more than 2 m at certain channel locations, and also affect inundation and breaching patterns. The combined hazard analysis allows for probability distributions of flood hazard and dike failure to be developed, which can be used to assess future flood risk management measures.


Flood risk analysis and research objective
Each year flooding causes the most damage of any natural disaster (Jongman et al. 2012) and, as such, many Flood Risk Management (FRM) strategies employ a risk-based approach (Voortman et al. 2009), where risk is the combination of exposure and hazard (with its associated probability). In analysing the hazard of large river systems, potential interactions between sub-catchments, dikes and other connected components become more relevant (Vorogushyn et al. 2017), and often require system wide models simulated using stochastic approaches.
Hydraulic models used to analyse flood hazard and other hydraulic aspects of protected river systems have been developed for the Elbe (Merz et al. 2016), the Rhine (Hegnauer et al. 2014), the Mississippi (Remo et al. 2012), and the Po River in Italy (Castellarin et al. 2011), amongst others. Such models are often used to calculate the location-specific hydraulic load associated with a given probability or return period (Vogel and Castellarin, 2017). Hydrological boundary conditions provide the characteristics of the extreme event being modelled (e.g. a 1/200 year event), while the schematisation of the model dictates how that event is routed in the area of interest.
However, deterministic estimates of these two components are unlikely to provide sufficient information for a reliable hazard analysis to be performed. Extreme events may originate from multiple tributary catchments, and can cause defence failures at various locations. Therefore, the present study proposes methods to include these sources of uncertainty (catchment hydrology and dike breaching) in large-scale flood hazard (and thus risk) analyses. The effect of including each of these components is also analysed individually, by comparing it to its deterministic equivalent. The concepts behind these methods are given below, along with a general framework for their application and analysis. In the following sections they are applied to the Po River case study, and the results discussed.

Hydrological extremes
The concept of a return period discharge at a given location is much used in flood risk analysis (FRA). Extreme value theory can generate these discharge values, provided gauged locations with sufficiently long timeseries are available (Towler et al. 2010). To see the effect of an extreme discharge on a complete river system, a 'design hydrograph' (Mediero et al. 2010) can be applied as a boundary condition to a model. This hydrograph will generally have an associated probability represented either by the hydrograph itself or the resulting hydrographs at specific locations. The use of one or more design hydrographs to represent extreme events was formerly standard practice on the Dutch Rhine and Meuse systems (Van Der Most 2015) and is currently applied to the Po River system in Italy (Castellarin et al. 2011).
Applying a single design hydrograph to a river channel will not represent the variability in the shape of the floodwave. Recent studies have addressed this by developing multi-variate distributions from which hydrographs can be sampled. As well as the peak discharge, these distributions can include other hydrological characteristic variables such as 30-day volume (Domeneghetti et al. 2013), 3-day volume (Liu et al. 2009), or the duration above a baseflow (Gräler et al. 2013).
The tool most commonly used to account for the dependence between multiple hydraulic variables, such as those listed above, is copulas (Favre et al. 2004;Salvadori and De Michele 2010, amongst others). A more detailed explanations of copulas are given by Renard and Lang (2007), but its principal function is to model the dependence between variables while maintaining their marginal distributions. As well as the dependence between characteristics of a single hydrograph, copulas can also be used to model streamflow characteristics from multiple sites (Hao and Singh 2013), which may 1 3 be correlated due to rainfall patterns. Weather generators can be applied to produce hydrographs that are spatially and temporally coherent (Merz et al. 2016;Dunn et al. 2016), but are not always readily available for many regions. In such situations, copulas can be used to model the dependence between multiple tributaries, for example in the Brisbane River catchment (Charalambous et al. 2013).
In the present study, a copula is developed to model both the dependencies of floodwave characteristics to each other and to characteristics at other tributary boundaries. The results from the use of the copula are compared against a model using deterministic hydrological boundary conditions.

Dike breaching
As well as the catchment hydrology, river routing also introduces uncertainties when generating the hydraulic loads associated with return periods. Parameters such as roughness are part of this uncertainty (Pappenberger et al. 2005); however when routing extreme events in a protected river system, the breaching of defences may cause the greatest change in expected loads. Possible breaching mechanisms include overtopping, piping and macrostability (Steenbergen et al. 2004), and the conditions required to trigger breaching are often highly complex and subject to uncertainty.
The interdependency of breaches on downstream flows is often called 'system behaviour' and has been shown to be highly influential in determining local and system-wide hazards. System behaviour analyses that account for the uncertainty related to breaching include those by Apel et al. (2009), De Bruijn et al. (2016, Assteerawatt et al. (2016), Gouldby et al. (2012) Vorogushyn et al. (2010), and (Curran et al. 2019). As downstream loads are dependent on upstream breaches, the above methods all utilise large models that try to represent the entire system. These models are run in Monte-Carlo simulations, in which both the hydraulic loads and dike strengths are varied per simulation.
In system behaviour analyses, the strength of the dike in resisting different breaching mechanisms is often approximated using fragility curves (Wojciechowska et al. 2015), which define the failure probability as a function of the hydraulic load for a discretised section of dike. Breaching uncertainty for that section can then be quantified by sampling the strengths defined in these fragility curves as a stochastic threshold for breaching in Monte Carlo simulations (De Bruijn et al. 2016b). Generally the fragility curves define failure probability as a function of water level (Bachmann et al. 2013;Mazzoleni et al. 2014b); however advanced methods to include both water level and duration of exceedance for that water level have been applied by Vorogushyn et al. (2009) andCurran et al. (2018).
Once a breach is triggered in a hydraulic simulation, the breach growth is determined by several variables which are also subject to uncertainty. Empirical models have been developed to estimate this growth (Verheij and Van der Knaap 2002), but most studies use a distribution based on historical data, (Mazzoleni et al. 2014a;Vorogushyn et al. 2010;Domeneghetti et al. 2013).
In the present study, both the dike breach triggering and growth functions are developed and applied to multiple potential breach locations in a system behaviour analysis. This stochastic analysis is compared against the use of deterministic breach fragility estimates.

3 2 General approach
The general approach proposed to account for the effects of the sources of uncertainty described above is given in the yellow-shaded section of Fig. 1. For each tributary in the system of interest, hydrological timeseries data are analysed, and distributions of relevant variables are fit and used as inputs to a copula. Similarly, dike data are collected and analysed to (1) discretise and schematise the protection system into a hydraulic model, and (2) generate distributions of dike strength for each discretised section. Both the hydrological event copula and the sets of dike strength distributions are sampled as inputs for simulations of a hydraulic model in a stochastic analysis. This allows for a complete stochastic flood hazard analysis to be generated.
As well as this overall analysis, it is also interesting to see the individual effects on the system when including these uncertainties. These analyses are shown in the unshaded sections of Fig. 1. For the hydrology, a single extreme hydrological event is compared against the events aggregated from sampling the hydrological copula. In these comparisons dike breaching is not included. For the dike breaching uncertainty analysis, a single extreme hydrological event is repeatedly simulated in the system with variable dike strength thresholds sampled from the developed distributions, and the aggregation of these simulations is compared against a simulation using deterministic thresholds.
The application of this approach to the Po River is described in the following section, with more specific detail about the models and methods used (e.g. the data used, the relevant hydrological variables, etc.). The results of each of the analyses proposed are then given and discussed, and concluding remarks are given in the final section.

Study area and available data
The case study is the Po River basin in Italy, which has a catchment area of ~ 75,000 km 2 and is fed by tributaries from the Alps in the North and West, and the Apennines in the South, as shown in Fig. 2. The region is considered highly vulnerable to flooding, both economically and with respect to loss-of-life (Domeneghetti et al. 2015), and thus the floodplain is compartmentalised using a dike protection system. 'C-Buffer' compartments are protected from the river by embankments with a safety standard of 1/200 years and may be situated directly beside the main channel of the river or to smaller 'B-Buffer' compartments that lie inside the main embankments (see inset of Fig. 2). The C-Buffer boundaries relate to the maximum inundation extents of catastrophic floods (Adb-Po 2016), and at certain downstream locations extend beyond the catchment boundaries, as shown in Fig. 2 (see Castellarin et al. 2011, for more detail). The present study utilises four main data sources for the case study; dike profile data, a 30 year daily discharge dataset of the tributaries, a hydraulic model of the main channel, and boundary conditions for that hydraulic model that represent a 1/500 year flood event. These data sources are discussed below and in the following sections.
The 1D HEC-Ras hydraulic model, representing a 370 km stretch of the main channel and selected tributaries, was schematised by Castellarin et al. (2011), and includes a quasi-2D schematisation of the multiple protected compartments in the floodplain. The model was calibrated on water level stages from the October 2000 flood event (Castellarin et al. 2011), which is estimated to have a 1 in 50 year return period. The deterministic boundary conditions used in this study are based on an empirical method to estimate synthetic design hydrographs (SDHs) for ungauged locations on the Po River by Maione et al. (2003). The method was used to develop a set of 1/500 year SDHs for the major tributaries in a study by Consorzio Italcopo (2002) and was adapted for use in the HEC-Ras model. This set of SDHs is hereafter called Tr500. The SDHs are generated ensuring peak discharges and volumes at multiple locations along the main channel conform to the return period loads estimated by flood frequency analysis of historical data. SDHs have also been developed relating to a 1/200 year event (Tr200- Colucci et al. 2003), but these were calibrated using an earlier version of the hydraulic model, and thus ignored in the present study.

Hydrology uncertainty estimation
Despite Tr200 and Tr500 having been used in many studies (Castellarin et al. 2011), they contravene the spatial and temporal variability of actual events present in the region. This can be seen in the study of Maione et al. (2003), where historical events are compared against the return period Flow-Duration Frequency (FDF) curves that were used to develop Tr500. The Maione study demonstrates that historical floodwave events are not characterised by the same return period at all locations. For example, the October 2000 event appears to be a ~ 1/35 year event at Pontelagoscuro and a ~ 1/60 year event at Borgoforte (see locations in Fig. 2). Therefore, the assumption of homogeneous return period loads all along the main channel is unrealistic. This is also the conclusion of Bianchi (2018), who highlights the role the various tributaries play in the inhomogeneity of return periods of hydraulic loads along the main river. The spatial and temporal hydrological uncertainty in load estimation caused by the tributaries to the Po River is addressed in the present study by simulating multiple events sampled from a copula.
The proposed hydrological method generates a long timeseries of annual maximum (AMAX) floodwaves in the main channel of the Po by routing multiple 'yearly' events, each of which is represented by a set of SDHs. Each SDH within an event is applied as a boundary condition for a tributary and has certain characteristics which come from sampling a Gaussian copula. This copula is chosen due to its simplicity, as more complex versions might not be suitable for the short (30 year) timeseries available. A resulting limitation is that the copula cannot include more complicated dependence structures than correlation coefficients (Zhang and Singh 2019), such as tail dependency. The Gaussian copula takes as input an n*n correlation matrix, where n is the number of correlated variables, i.e. the total number of characteristics for all SDHs. The sample taken for each event is then an n*1 vector of correlated standard uniformly distributed random variables. These represent probabilities of non-exceedance for each of the characteristics and can be easily transformed into 'real-world' values using probability distributions. Three floodwave characteristics were used to build SDHs on 16 tributaries, and therefore 48 probability distributions and a 48*48 correlation matrix are required.
As the goal of the sampling technique is to produce AMAX floodwaves in the main channel of the Po, using AMAX distributions for each tributary would likely over-estimate these events. Instead, timeseries of SDH characteristics are derived based on the tributary floodwaves that 'contributed' to events on the main channel, i.e. that occurred within a certain time window of the AMAX floodwave on the main channel. Therefore, timeseries of concurrent discharges for the main channel and all tributaries are required. Historical timeseries of discharge data are available for many stations along the main channel and tributaries; however some locations are missing and some data are incomplete. Instead, a 30 year daily discharge dataset from the SMHI (Swedish Meteorological and Hydrological Institute) is used, based on historical weather data and the HYPE hydrological model (Lindström et al. 2010). As this is hindcast data, it includes recent major events in the regions such as in October 2000.
Discharge data are obtained from the SMHI dataset for the outlets of the 16 tributaries whose sub-catchments shown in Fig. 2, as well as the discharge on the main channel at Pontelagoscuro. It is possible that using this single reference location would cause extreme events at that location alone but, in reality, the variability introduced by the copula prevents this. Five other tributaries are considered too small to develop SDHs and are therefore modelled as steady-state contributions. The annual maximum (AMAX) floodwaves at this reference are identified, and the peak discharge on each of the tributaries within a period from 20 days before to 10 days after this downstream peak is noted. These heuristic bounds simplify the hydrological and hydraulic processes involved, and may need to be addressed in future studies.
As well as peak discharge, two more floodwave characteristics are obtained from each the contributing floodwaves; duration above baseflow, and time lag with respect to the AMAX floodwave (Fig. 3). The latter is simply the time between main channel AMAX peak and contributing peak, which can also be 'negative', i.e. the tributary peak occurs after the AMAX downstream peak. For the 'duration above baseflow' characteristic, the baseflow for each tributary is estimated using a filtering algorithm (Arnold and Allen 1999), so that periods of direct runoff for each peak can be calculated. For a single AMAX event, the red and blue lines in Fig. 3 show how the three characteristics are obtained for an example tributary floodwave. For the main channel floodwave (blue), the contributing tributary (red) has a peak flow of (5500 m 3 /s) which occurs two days previously, and is Fig. 3 Example of identifying floodwave characteristics (peak, duration and lag) of a tributary (red) contributing to downstream flow (blue) for a single event of the 30 year timeseries. An SDH generated from these characteristics is shown in black above its baseflow for 10 days. The 30 year timeseries' of these three floodwave characteristics for all 16 tributaries provide the information used in the Gaussian copula (both the marginal distributions and correlation matrix).
Once the time-series of characteristics are obtained, the 48*48 correlation matrix can be generated. This matrix represents correlations between the three floodwave characteristics of each tributary. Distributions are fit to each timeseries; however due to their short (30 year) time span, the selection of a best distribution fit to the data is in many cases trivial. For simplicity, the characteristics of each tributary use the same distributions, but different parameters. The distributions used are; log-normal (peak discharge), normal (time lag) and log-normal (duration above baseflow), and the parameters for each are given in Appendix A.
With the correlation matrix and distributions defined, events sampled from the copula be used to produce the set of SDH boundary conditions for the hydraulic model. An example of an SDH generated from the three characteristics of the tributary in Fig. 3 is shown by the black triangular floodwave. When generating the SDHs, the rising and recessing limbs above the baseflow were considered to be 1/3 and 2/3 of the sampled duration time, respectively.
With sufficient samples, the water levels associated with specific return periods can then be estimated at any location. However, assumptions such as the contributing time period, the rising and recessing limbs ratios, and the reference main channel location simplify hydrology of the region. One example effect is that floodwaves from tributaries that do not 'contribute' to an AMAX event at the reference location on the main channel are missed. As a validation for the method, the deterministic Tr500 boundary condition is used to directly compare water levels along the main channel. Despite a positive validation (see Sect. 4.1.1), these simplifications should be addressed in future studies.

Dike breaching uncertainty estimation
Research addressing the large-scale impact of uncertainty in dike breaching for the region (like the hydrological uncertainty) is limited. Although studies on stochastic breaching for subsections of the Po River have been performed by Mazzoleni et al. (2014a) and Domeneghetti et al. (2013), only the study of Castellarin et al. (2011) considers the entire river stretch linked to the C-Buffer, and thus the effect of breaches on the entire system. However, that study used deterministic breaching thresholds for overtopping based on dike heights. Mazzoleni et al. (2014a) include piping threshold uncertainty by developing fragility curves, but these were only built for a 100 km stretch of the Po. Breach growth data are available from Turitto et al. (2010), who fit a lognormal distribution to 200 years of historical breach data in the region. The formation times of these breaches were considered deterministic at 0.1 h by Mazzoleni et al. (2014b), but sampled from a normal distribution with a mean of 2 h by Domeneghetti et al. (2013). In the present study, the uncertainties in breach triggering (due to overtopping and piping) and growth are estimated by sampling probability distributions developed for discretised sections of the entire 370 km main channel.
As a first step in this process, dike sections are discretised to ensure each dike section is connected to at most one C-Buffer compartment on the floodplain side, and either the river or a B-Buffer compartment on the river side (see inset at bottom of Fig. 2). For each section, the starting point of a potential breach (red stars in Fig. 2) is identified at the location considered to be most vulnerable to overtopping. This is selected by comparing the model dike heights with the water levels observed in a steady-state flow. The 1 3 93 schematised breaches have a pre-defined bottom elevation equal to the connected compartment, while the width is a function of time (see Table 1).
The next step is to develop dike strength distributions to apply to these sections and the breach starting points. This allows a set of sampled values from these dike strength distributions to be used as failure thresholds in each simulation for the potential breach locations. For overtopping, Castellarin et al. (2011) assumed a deterministic condition that if the water level exceeded the dike height at a breach location for 3 h, failure would occur. However, in most cases these dike heights are based on interpolations between cross sections, which are themselves based on surveyed or LIDAR data. While the underlying LIDAR data are reasonably accurate, the linear interpolation between sections introduces error. In the present study, the linearly interpolated values are compared with ~ 6 km of dike for which the actual heights are known, and a normal distribution of the error was found to have a standard deviation of ~ 0.5 m. This was therefore applied as an uncertainty distribution to all locations.
While these distributions may over-estimate the uncertainty in some places, it is maintained at all sections to account for other uncertainties such as errors in the water levels calculated by the model, and errors in the LIDAR data. It should be noted that this method does not take account of the length effect in each section (i.e. the longer a section is, the higher the probability of overtopping at some location); however this limitation is mitigated by the selection of vulnerable breach locations within each dike section. The duration of water level exceedance required for overtopping is also made variable in the simulations, using a uniform distribution between 2 and 4 h, based on previous research (Domeneghetti et al. 2015).
For piping, fragility curves have been developed for each breach location using cross section data from the Po River basin authority and the limit state formulation of Mazzoleni et al. (2014a), in which failure is determined by: where n is the porosity, ΔH is the water level and L is the critical length through which the pipe must form. The curves are developed by varying the n and L values according to a triangular and uniform distribution, respectively. As with overtopping, the method does not take account of the length effect in each section. Hence, in the present study the curves are generated using the available cross section data immediately upstream and downstream of the potential breach location. The curves also do not include the duration required to induce piping failure. Given these limitations, the analysis of the effect of piping is kept separate in the present study, see Table 2.
In relation to breach growth, the final breach width and the time required to reach it are here considered fully dependent. Therefore, in each Monte Carlo simulation, a single sampled exceedance probability is applied to distributions for both these variables. The final breach width is sampled from the truncated lognormal distribution developed by Govi and Turitto (2000) while the distribution of breach development time is taken from the truncated normal distribution suggested by Domeneghetti et al. (2013). The growth is assumed to be linear in time towards this final width.
As a comparison for the breach triggering and growth uncertainties, deterministic events are also simulated. These simulations use the values of maximum likelihood from each distribution, and in the case of overtopping duration, the midpoint of the uniform distribution used, as shown in Table 1.

Analysis structure
Three analyses are conducted for this case study which reflect the general analysis structure given in Fig. 1. The effects of the hydrological and breaching uncertainty are quantified separately in the 'Hyd' and 'Dike' analyses described in Sects. 3.2 and 3.3, and the 'All' analysis evaluates hazard distributions generated by including the uncertainties of both components. The simulations compared within each analysis are also given a descriptive name (Table 2). These simulations use either the breachable (BREACHBL) or unbreachable (NOBREACH) implementations of the HEC-Ras model, and in both cases flow over the dike (here called 'overflow' to distinguish it from the breaching mechanism 'overtopping') is implemented using the weir equation.
The 'Hyd' analysis evaluates the effect of the hydrological uncertainty by comparing deterministic and variable boundary conditions. Twenty thousand simulations are performed for Var_Hyd, which is a long enough timeseries to estimate extreme events. As discussed, in the 'Dike' analysis the influence of the simplified piping failure estimates is kept separate to the overtopping method. In both cases, 6000 simulations are performed for sampling the individual distributions in the 'Var_Dike' analyses. The 'All' analysis also separates the piping failure estimates, simulating 30,000 events for each. This is enough to get convergence of dike sections failure probabilities, accounting for the multiple sources of uncertainty. For future studies, efficiency in the number of simulations could be improved using methods such as importance sampling (Diermanse et al. 2014).

Hydrological analysis ('Hyd')
The purpose of this analysis is to compare hydraulic loads under deterministic hydrological boundary conditions (Tr500) to those observed from the stochastic boundary conditions developed using the Gaussian copula and the SMHI data.

Return period water levels
For each breach location along the main channel, the maximum water levels from the 20,000 Var_Hyd simulations at that location are ranked, allowing the expected levels for various return periods to be estimated. Combining these return period levels for each location along the main channel makes a 'profile' of return period water levels. It should be noted that this profile does not represent any single simulation from Var_Hyd. As the SDHs in the Tr500 boundary condition are developed in such a way to reproduce the 500 year return period event at every point along the main channel, the water levels from the Det_Hyd500 simulation can be directly compared to the 500 year return period water profiles from Var_Hyd. Figure 4 shows the locations and elevations of the return period profiles and the river chainage of the main tributary confluences. To see the results more clearly, the levels of these return period profiles relative to the elevations in the Det_Hyd500 simulation are also plotted.
The 500 year profile from Var_Hyd slightly under-estimates the levels given by Det_ Hyd500 by a maximum of 30 cm at a chainage of 340 km upstream. Further downstream, the levels match much more closely; however this is in large part due to overflow that often occurs during extreme events (see Fig. 5).

Return period inundation
Although breaches do not occur in the Var_Hyd analysis, the compartments experience flooding inundation due to overflow. The total maximum flood volume for each compartment from the 20,000 Var_Hyd simulations are ranked so that the expected volumes for various return periods can be estimated. Again, it should be noted that the Var_Hyd results do not represent any single simulation, but the combined results from each storage area. Figure 5 shows how these volumes compare to those observed in the deterministic simulation.
For extreme events, many B-Buffer compartments are completely filled, and therefore show little change in volume between the simulations. However, a number of C-Buffer compartments are seen to experience a change in volume of over 0.1 Mm3, just from 1 3 overflow. The comparison highlights the error in assuming the Det_Hyd500 boundary conditions will cause 1/500 year inundation levels due to overflow. These results are further discussed below.

Event uncertainty
While the aggregation of the Var_Hyd simulations can show the expected return period values at various locations, the individual simulations represent realistic events and do not necessarily conform to a single return period load at every location. Figure 6 compares 46 individual simulations from Var_Hyd with Det_Hyd500. The 46 Var_Hyd simulations are selected due to having an observed maximum water level at Pontelagoscuro that is within 10 cm of the level simulated in Det_Hyd500 at the same location. For this reason, the Var_Hyd simulations appear to converge at this downstream point, while the upstream levels in these simulations can differ largely to the deterministic simulation. For clarity, in the bottom part of the figure the Var_Hyd profiles are shown relative to DetHyd_500 profile.
Most of the deviation of these events occurs upstream, prior to the confluence of the Adda tributary, after which the contributions from tributaries are relatively small. Another reason the events are more similar downstream is that a large degree of overflow occurs in the region of the Oglio tributary (near Borgoforte), as can be seen in Fig. 5.

Dike Breaching analysis-'Dike'
The purpose of this analysis is to compare the effects of deterministic and stochastic breaching simulations on the Po. As shown in Table 2, separate 'overtopping' and 'overtopping and piping' analyses have been performed due to the simplified methods used for piping. The boundary conditions used are in all cases the Tr500 event. In the stochastic simulations, the thresholds are sampled from the uncertainty estimates described in Sect. 3.2. In the deterministic simulations the values of maximum likelihood are used as the thresholds.

Water profile uncertainty
The water level profiles of the deterministic and variable simulations in which breaching occurs due to overtopping are shown in Fig. 7. The locations of breaches that occurred in Det_DikeO500 are highlighted in the top plot. For clarity, in the bottom plot variable profiles are shown relative to the deterministic one, and as 'denser' in locations where the 6000 profiles converge. As the boundary conditions are all the same, no deviation is seen before the first potential breach location at around 360 km upstream. In contrast to the hydrological analysis, the water level uncertainty due to dike breaching is more pronounced downstream.
The same comparison can be made when the breaching occurs due to both overtopping and piping. The difference in the range of uncertainty surrounding the water level profile when piping is included is negligible, and is therefore not included.

Breach and inundation uncertainty
The inundation volumes and breach locations resulting from Det_DikeO500 are shown at the top of Fig. 8. As a comparison, the averaged compartment inundation volumes and the percentage of breaches that occurred over the 6000 simulations of Var_DikeO500 are shown at the bottom of the figure.
As can be expected, breached sections of Det_DikeO500 have high corresponding breaching percentages in Var_DikeO500, and the compartments connected to those sections are also subject to a large degree of inundation. However, a number of other dike sections also appear to be vulnerable in the stochastic analysis, causing other compartments to Fig. 4 Return period water profiles from Var_Hyd and corresponding mapped locations. Top graph: absolute elevations above sea level. Bottom graph: elevations relative to Det_Hyd500 (i.e. RP elevation profile from Var_Hyd -elevation profile from Det_Hyd500) Fig. 5 Comparison of maximum inundation volumes (due to overflow) for DetHyd500 and the 500 year expected volumes for Var_Hyd. Areas with volumes < 0.1 Mm 3 are shown in white. A 'significant' change in volume is considered to 0.1 Mm 3 experience significant inundation. Again, the results are similar when the thresholds for the piping mechanism are included, are therefore presented in Appendix D.

Overall Analysis-'All'
This section shows two of the potential applications of using both sources of uncertainty in flood hazard analyses; dike failure probabilities and expected hazard distributions. From Figs. 5 and 8 it can be seen that the floodplain compartments upstream of Borgoforte are the most vulnerable to flooding and breach failures, and thus only this region is considered in this analysis (Fig. 9). As the Var_AllO and Var_AllOP simulations include both hydrological and dike breaching uncertainty, failure probabilities can be calculated from these scenarios. However, the differences between the scenarios are minor, and therefore only Var_AllO is plotted in Fig. 9. It can be seen that the calculated failure probabilities for most sections conform to or surpass the protection standard for the region (1/200 year).
For the three stochastic simulations that sample annual hydrological boundary condition from a copula (Var_Hyd, Var_AllO, and Var_AllOP), the expected maximum inundation volumes for various return periods can be calculated for each compartment. The calculated values for two of these C-Buffer compartments (A and B) of these are shown in Fig. 10.
In most cases, the return period inundation volumes expected in the compartments are lower for Var_Hyd as overflow is included, but dike failures are not. For the two compartments shown, dike breaching causes inundation to be seen in events as small as 1/50 years. Some upstream compartments (such as A) have a smaller maximum capacity when a breach occurs (as flow can return to the river through the breach). This means that, for extreme events, these compartments can show higher inundation levels if breaching is not included. Fig. 6 Individual simulation profiles from Var_Hyd that 'match' (± 10 cm) the elevation of DetHyd500 at Pontelagoscuro. The bottom graph shows the relative profiles (simulation profile from Var_Hyd -elevation profile from Det_Hyd500)

Discussion
As a validation for the method of uncertainty estimation used in the Hyd analysis, the return period water levels calculated from Var_Hyd are compared against those from Det_ Hyd500 (Fig. 4). It should be noted that neither method provide the 'true' water levels but do allow for a benchmark of the variable analysis. The 500 year levels from the variable method agree closely with those from the deterministic boundary conditions, despite no historical water levels being utilised in their calculation. The levels are under-estimated upstream by about 20-30 cm, which is considered acceptable given the simplified methods through which the SDHs are created.
The Tr500 boundary conditions elicit the 500 year hydraulic load at every location along the main channel, based on flood-frequency analysis of historic data. However, they cannot be used to estimate 1/500 year inundation levels in the compartments, as seen in Fig. 5. The reason for this is the variability in water level profiles (± 2 m at their most extreme) observed in Fig. 6, which suggests that it is unlikely that an event matches the 1/500 year water levels at every location along the main channel, especially upstream. This is in agreement with the results of Maione et al. (2003), who demonstrated the same result for the October 2000 event. Both the deterministic and stochastic methods depend on the principle of hydrological stationarity, which may not be applicable in the catchment (Milly et al. 2005).
The methods through which dike breaching is analysed are simplified, and validation of the fragility of the current dike system based on real-world data is not feasible. However, the fact that most calculated failure probabilities conform to the standard of protection in the region (1/200 year) helps to validate the method. The vulnerable areas also conform to those predicted by Mazzoleni et al. (2014a, b). The results of the 'Dike' analysis indicate that downstream water levels may vary by as much a 1 m for an extreme event due to the possibility of breaches (Fig. 7). Furthermore, many sections of dike and connected storage areas are shown to be vulnerable to breaching when stochastic simulations are used instead of deterministic ones (Figs. 8,9). Finally, including both sets of uncertainty in the 'All' analysis demonstrates some of the applications of stochastic analyses in the region, as well as insights that can be gained. Failure probabilities of most sections are estimated to conform to the standard of protection in the region (1/200 year), but the analysis highlights many sections that may need further investigation (Fig. 9). Although the differences between the two breaching scenarios ('Overtopping' and 'Overtopping and Piping') appear minimal, the separation was necessary due to the simplified approach used to estimate piping.
Including both sets of uncertainty also allows expected hydraulic loads in the storage areas to be calculated for different return periods. Ideally, damage models could be applied directly to water depths to estimate values such as Expected Annual Damage (EAD). However, the quasi-2D structure used in the model means the depths calculated are not always representative of actual flood events. Instead the maximum inundation volume experienced in each compartment is here shown (Fig. 10), which could also potentially be used in damage estimates (Prettenthaler et al. 2010).

Conclusions
Flood risk analysis in large protected river systems is highly dependent on uncertainties in the tributary hydrology and dike fragility. If the distributions and dependencies of these variables can be estimated, Monte Carlo (MC) simulations of events sampled from the distributions can be used to estimate hazard and thus risk. In comparison with deterministic events, the aggregated MC results may provide valuable information in terms of uncertainty and probability. The presented research investigates the effect of hydrological and dike fragility uncertainty on flood hazard in the Po River region of Italy, compared against existing deterministic hazard analysis methods. Applications of combining these stochastic analyses for complete hazard analyses are also demonstrated.
Hydrological uncertainty is estimated using multiple Synthetic Discharge Hydrographs (SDHs) for each tributary, which are built by sampling floodwave characteristics from a copula dependence model. Dike breaching uncertainty is based on distributions representing dike fragility and breach growth rate. The validation of the hydrological results show close agreement with those from a prior flood frequency analysis (< 30 cm). Validation of the uncertainty in dike breaching is more difficult but, for most dike sections, failure probabilities calculated from the method achieve or surpass the 1/200 year standard of protection in the region. Comparing deterministic methods against these uncertainty estimations demonstrates the inhomogeneity of return period flows along the river, due to tributary inflows (± 2 m upstream) and breach outflows (± 0.5 m downstream).
Combining these uncertainty estimation methods in a complete hazard analysis allows probability distributions of hydraulic loads in the compartments and river to be computed, as well as dike failure probabilities. From these results, the potential dynamics of the river-dike-floodplain system during extreme events can be observed, such as the possibility of reduced storage capacities upstream when breaches allow inundated compartments to flow back into the river. However, the simplifications and assumptions used in generating these results should be taken into account in future studies. For the hydrology, these include the probability distributions, the copula and the short (30 year) timeseries used to generate them both, while for the dike strengths they include the piping formulation and growth functions. Finally, the quasi-2D simulation method also limits the inundation hazard information that can be obtained from the analyses. Nevertheless, the proposed approach provides useful hazard information in the system, which (in combination with a damage model) could be used in the assessments of expected annual damage or proposed measures. The availability of such information, together with the uncertainty estimations, is certain to be of benefit to flood management decision makers. The overall approach is likely to be applicable to various large-scale protected river systems worldwide, including the Elbe and Rhine (Germany), Mississippi (USA) and the Yellow River (China).