1 Introduction

Adapting to the consequences of climate change is without doubt one of the central challenges faced by humanity in the next decades. One of these consequences is the occurrence of intense heavy rain events, which scientists agree will increase in both intensity and frequency within the next years [1,2,3].

The flash flood from 14 to 15 July 2021 in Germany, Luxembourg, and Belgium has, at the latest, sparked particular public interest in adapting to such events. This event claimed more than 180 lives [4] and caused tremendous damage, which has been estimated at a total of 32 billion euros [5].

Typically, flood mitigation concepts are created based on simulations rather than using optimization methods [6]. Prioritization of actions is then often done by a simple point scheme [7]. This method, like any other purely simulation-based method, lacks the consideration of site-specific interplay of actions, which motivates using optimization methods in the context of flood mitigation.

Although there is clear evidence of the efficacy of precautionary measures [8], the literature on the use of optimization techniques in order to design flood mitigation concepts is surprisingly limited, which is also pointed out in [9, 10]. The closest related study to this work is [9], where the “Optimal Flood Mitigation Problem,” which aims to optimize the positioning of a single type of precautionary measure (embankments) to protect critical assets in the case of a flood scenario, is introduced. Furthermore, two time-indexed mixed-integer programming formulations that over- and underestimate the water flows during a flooding scenario are presented. Here, the over- and underestimation is caused by the linearization of nonlinear constraints. Due to the time-indexed formulation, however, the MIPs do not tractably scale to realistic scenarios (as noted in the abstract of Tasseff [9]).

The problem of designing mitigation concepts for coastal floods in the Netherlands is an impressive example of the potential of using optimization techniques in flood mitigation. A mixed-integer programming formulation for a cost-efficient design of dike heights is presented in [11, 12]. Furthermore, a greedy search algorithm to compute a combination of reinforcement measures for dike segments, which is 42% cheaper than the combination obtained from the common approach, is implemented in [13].

Moreover, a genetic algorithm is used to compute efficient mitigation concepts for fluvial (river-caused) flash floods on the Thames Estuary (London, England) in a multi-objective setting [10]. Apart from the measures themselves, they also compute a threshold value for the timing to make an intervention given the uncertainty of the development of climate change and its impact on fluvial flash floods.

Another approach to the design of flood mitigation concepts can be found in [14], where a simulated annealing algorithm is used to determine an allocation of low-impact actions such as porous pavements and green roofs to districts in a megacity. Moreover, a particle swarm optimization algorithm is used in [15] to determine an optimal pumping schedule and optimal weir crest heights for detention reservoirs to minimize downstream flood damage.

An often neglected but crucial factor in creating successful flood mitigation concepts is taking the cooperation of residents into account since, in many cases, the most effective actions are located on private properties. Indeed, the potential of incentives in flood prevention has already been established as promising [16,17,18]. In practice, however, plans are often made before involving critical private actors. A holistic review on using market-based instruments for flood risk management is provided in [19].

Besides these approaches for flood mitigation, a wide variety of optimization techniques are used in post-disaster flood management. The design of evacuation plans including shelter location planning and helicopter assignment in a multi-objective robust setting is investigated in [20], and real-time operation procedures that specify reservoir releases during a flood are examined in [21, 22]. For a more extensive review of optimization and machine learning approaches in post-disaster flood management, we refer to [23].

Other applications of optimization techniques in water management involve the design of sewage water systems [24], a real-time release schedule for reservoirs during a flood [22], and the geometrical design of retention basins [25].

1.1 The Project AKUT

The work presented in this paper has been performed within the project AKUT (an acronym for the German translation of “Incentive Systems for Municipal Flood Prevention”), which has been funded by the German Federal Ministry for the Environment, Nature Conservation, and Nuclear Safety from January 2019 to March 2021. Within the project, a mixed-integer programming approach has been developed to find an optimal combination of actions to be taken such that the resulting damage on buildings is minimized while respecting a given budget and constraints on the cooperation of the residents. The resulting MIP has been implemented in a web application (also referred to as AKUT) using the Flask framework and Python 3.8. The application is available for municipalities free of charge (so far only in German).

The project team included a municipality providing us with real-world data and the engineering office “igr GmbH” validating our results by comparing them to results of state-of-the-art simulations. Furthermore, the Professorship of Water Resource Management and Sanitary Environmental Engineering at Mainz University of Applied Sciences formulated and developed the engineering methodology for the model while the authors of this paper formulated the mathematical model and implemented the web application.

1.2 Our Contribution

In this paper, we present a novel mixed-integer programming approach for computing optimized flood mitigation concepts that minimize the damage to buildings due to flash floods caused by heavy rain events. To the best of our knowledge, this approach marks the first usage of optimization techniques in the context of planning precautionary measures for flood mitigation that scales well enough to be applied to real-world instances. Our model allows for different types of precautionary measures (basins, ditches, and embankments) that lead to elevations or depressions of the terrain surface. Moreover, the model takes constraints on the cooperation of the residents into account. One of the central challenges to make this approach work for realistic scenarios is modeling the surface terrain efficiently while still maintaining a realistic representation. We tackle this challenge via an efficient graph-based approach together with suitable preprocessing methods. Moreover, we present a combinatorial algorithm that is able to quickly compute an initial feasible solution of the presented MIP.

Our approach has been implemented as an innovative decision support tool in the form of a web application, which has already been used in practice by more than 30 engineering offices, municipalities, universities, and other institutions from all over Germany. We compare the results obtained from our model on real-world instances from different municipalities to results obtained from established simulation software, and investigate the main drivers for the running time and the quality of the obtained solutions. The novelty of our approach in comparison to a selection of the previously presented existing literature is summarized in Table 1.

Table 1 Comparison of our work to existing literature. A tick in the column “optimization” indicates that optimization algorithms are used and a tick in the column “pluvial” represents that the paper considers a pluvial flood scenario (as opposed to a fluvial or coastal flood scenario). A tick in the column “scalable” indicates that the developed method scales well enough to be applied to realistic scenarios. Finally, a tick in the column “incentivation” means that incentives or cooperation of residents is considered

2 Problem Description and Input Data

In this section, we define the underlying problem and describe the input data on which our approach is based.

In short, given a set of possible locations for retention basins (simply called basins in the following), ditches, and embankments, the goal in the problem is to determine a subset of these actions to take such that the resulting damage to buildings is minimized while respecting a given budget and constraints on the cooperation of residents.

The terrain surface is given as a digital terrain model (DTM), which is an established standard in engineering [6]. A DTM contains 2D coordinates in UTM format on a given grid together with their corresponding geodesic height, which is similar to the elevation above sea level. In our case, we use a grid size of 1 m.

Each of the data points in the DTM then determines the geodesic height of the one by one meter square centered at the 2D coordinate. This square is called the shape of a coordinate, and two 2D coordinates are called adjacent if their distance is 1 m, i.e., if one coordinate is 1 m to the north, south, west, or east of the other. These data are available for all German municipalities and, hence, suitable for applying our model in practice.

To estimate the damage that occurs due to flooding in the case of a rain event, information about the buildings’ locations is required. To this end, the shape of a building is defined as the polygon derived from its outline. The outlines of the buildings are obtained from ALKIS, which is a digital land information system. Just like the data for the DTM, these data are available to all German municipalities.

To link the positions of the buildings to the DTM coordinates, we say that a building is on a coordinate, if the shape of the building intersects with the shape of the coordinate. Conversely, we say that a coordinate intersects with a building in this case.

The definition of damage caused to buildings is based on the advisory leaflet DWA-M 119 [6] published by the German Association for Water, Wastewater and Waste (DWA) in 2016. The DWA is a politically and economically independent organization that supports safe and sustainable water management and prepares the DWA set of rules, which includes a large number of standards and advisory leaflets. Within the advisory leaflet DWA-M 119, they identify two main factors for the damage caused to a building, the first of which is the maximum water level at the building.

Here, the maximum water level at a building is the maximum over all water levels at the coordinates intersecting with the building. The hazard class, which represents the maximum water level at a building, is a categorical measure attaining the values zero to four. It is derived from the maximum water level at a building by the following rules:

  • Zero: The maximum water level at a building is 0 cm, i.e., none of the coordinates intersecting with the building has a strictly positive water level.

  • One: The maximum water level at the building is strictly larger than 0 cm and less than or equal to 10 cm.

  • Two: The maximum water level at the building is strictly larger than 10 cm and less than or equal to 30 cm.

  • Three: The maximum water level at the building is strictly larger than 30 cm and less than or equal to 50 cm.

  • Four: The maximum water level at the building is strictly larger than 50 cm.
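The rules above translate directly into a small lookup. The following Python sketch is only an illustration; the function names `hazard_class` and `building_hazard_class` and the choice of centimeters as the input unit are assumptions made here, not part of the AKUT implementation.

```python
def hazard_class(max_water_level_cm: float) -> int:
    """Map the maximum water level at a building (in cm) to a hazard class 0-4."""
    if max_water_level_cm <= 0:
        return 0  # no strictly positive water level at the building
    if max_water_level_cm <= 10:
        return 1  # (0 cm, 10 cm]
    if max_water_level_cm <= 30:
        return 2  # (10 cm, 30 cm]
    if max_water_level_cm <= 50:
        return 3  # (30 cm, 50 cm]
    return 4      # strictly larger than 50 cm


def building_hazard_class(water_levels_cm) -> int:
    """Hazard class of a building, given the water levels (in cm) at all
    coordinates intersecting with it (hypothetical helper)."""
    return hazard_class(max(water_levels_cm, default=0.0))
```

For instance, a building whose intersecting coordinates have water levels of 0 cm, 12 cm, and 3 cm falls into hazard class two, since its maximum water level is 12 cm.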

The second main factor describes the (quite intuitive) fact that not every building suffers an equal amount of damage at a given water level. As an example, it is far less severe if a garage is affected by the rain event than if a hospital is affected. To take this into account, the damage at a building does not only depend on the water level at the building (represented by its hazard class), but also on its damage class. The damage class is a categorical measure of the damage occurring at a building if water accumulates at it. It can attain the values one to four, where one corresponds to the lowest damage class (the garage in our example), i.e., the least amount of damage, and four corresponds to the highest damage class (the hospital in our example). The data from ALKIS, aside from just the shape of the building, provide additional information about the building like its usage, which allows the damage class to be preset automatically for some of the buildings. For the remaining buildings, the damage class has to be specified manually.

The combination of the hazard class and the damage class yields the need for protection of a building, which is rated using a point system with a scale from zero to seven. For buildings with hazard class zero, i.e., none of the coordinates intersecting with the building have a strictly positive water level, the need for protection is also zero. Every other building has a strictly positive need for protection, which is increasing in both the building’s damage class and its hazard class. The objective of the problem is to minimize the sum of all buildings’ needs for protection.

In order to protect the buildings, a set of potential basins, ditches, and embankments is given, which together make up the possible actions. Each possible action is given by the polygon of its location, its construction costs, and its depth (in the case of a basin or ditch) or height (in the case of an embankment). The polygon of an action’s location is called its shape.

Similar to the buildings, we say that an action is on a coordinate if its shape intersects with the shape of the coordinate. In this case, we also say that the coordinate intersects with the action. If an action is taken, i.e., a basin, ditch, or embankment is built, the geodesic height of all coordinates intersecting with the action is decreased by the action’s depth or increased by its height. The change in geodesic height affects the flow of the water on the terrain surface and, hence, can protect buildings. The overall cost for taking actions is bounded from above by a given budget.

Taking an action requires the consent of the owners of the properties on which the action is located. The owners of the properties are called actors in the following. The outlines of the properties are also obtained from ALKIS. As before, the shape of a property is defined as the polygon derived from its outline. An action is on a property if their shapes intersect. Convincing actors to cooperate might be more or less hard. To guarantee that the recommended combination of actions can realistically be implemented, the number of hard-to-convince actors on whose properties actions are to be taken is bounded from above. To this end, an extended traffic light rating system with the following characterizations is used:

  • Green: The actor is willing to cooperate.

  • Yellow: The actor needs minor incentives to cooperate.

  • Red: The actor needs major incentives to cooperate.

  • Black: The actor does not cooperate at all.

For simplicity, we also refer to green, yellow, red, and black properties in the following. The willingness to cooperate has to be assigned manually by the user for each property.

3 Mathematical Modeling

We now present a graph-based model for the problem described in Section 2 as well as an approach for reducing the size of the underlying graph (Section 3.1). Afterwards, in Section 3.2, we derive our mixed-integer programming formulation that is used for solving the graph-based model, and we describe valid inequalities and presolve techniques that are used to improve performance.

3.1 Graph-Based Model

3.1.1 Construction of the Graph

In this section, we construct the directed graph \(G_{\text {or}} = (V_{\text {or}}, R_{\text {or}})\), which we call the original graph, from the DTM.

Recall that, for a directed graph \(G= (V, R)\), a node \(v \in V\) is called a child of \(u \in V\) if there is an arc from u to v. Conversely, u is then called a parent of v. Moreover, a node \(v \in V\) is called a successor of \(u \in V\) if there exists a (directed) path from u to v. Conversely, u is then called a predecessor of v. The set of outgoing arcs of \(v \in V\) is denoted by \(\delta ^+(v)\) and the set of incoming arcs by \(\delta ^-(v)\). A node with no outgoing arcs is called a leaf, and a node without incoming arcs is called a root. For an arc \(r \in R\), we denote its start node by \(\alpha (r)\) and its end node by \(\omega (r)\).

For the construction of \(G_{\text {or}} = (V_{\text {or}}, R_{\text {or}})\), recall that the DTM contains data about the geodesic height for coordinates on a 1-m grid. The set \(V_{\text {or}}\) of nodes is constructed by associating one node with each of these coordinates, i.e., there is a one to one correspondence between the coordinates in the DTM and the nodes in the graph. To keep track of this correspondence, each node gets the coordinate as an additional attribute.

The geodesic height of a node is defined as the geodesic height of its corresponding coordinate. We store the geodesic height as an attribute for each node in the graph. We then index the nodes in \(V_{\text {or}}= \{v_1, \dots , v_n\}\) in non-decreasing order of geodesic height, where ties are broken arbitrarily. Furthermore, we define the shape of a node \(v \in V_{\text {or}}\) as the shape of its corresponding coordinate, i.e., in this case, the one by one meter square with its center at the corresponding coordinate. The definitions of whether a building, action, or property intersects with (the shape of) a node are analogous to the ones for coordinates provided in the previous section. Finally, each node \(v \in V_{\text {or}}\) is assigned an area, which we denote by \(\text {area}_v\). In the case of the original graph, the area is 1 m² for each node. This changes though for the graphs we construct in Section 3.1.3.

For any two nodes whose corresponding coordinates are adjacent on the grid, there is an arc in \(R_{\text {or}}\) between the nodes, which is oriented from the node with the higher index to the node with the lower index. This means that arcs are directed from the node with larger geodesic height (the higher node) to the node with lower geodesic height (the lower node) whenever the two nodes do not have the same geodesic height. Note that, if all nodes in \(V_{\text {or}}\) have pairwise distinct geodesic heights, this makes the original graph \(G_{\text {or}} = (V_{\text {or}}, R_{\text {or}})\) acyclic since the geodesic heights induce a topological sorting (both in the mathematical and literal sense) in this case. An example of the original graph is provided in Fig. 1.
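The construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the DTM is assumed to be given as a dictionary from integer grid coordinates to geodesic heights, the name `build_original_graph` is hypothetical, and ties in height are broken by input order (one admissible instance of "arbitrarily").

```python
def build_original_graph(heights):
    """heights: dict mapping (x, y) coordinates on a 1 m grid to geodesic height.

    Returns (index, arcs), where index maps each coordinate to its node index
    (1-based, non-decreasing in geodesic height) and arcs is a list of
    (start, end) coordinate pairs directed from higher to lower index.
    """
    # sorted() is stable, so nodes of equal height keep their input order
    ordered = sorted(heights, key=lambda c: heights[c])
    index = {coord: i + 1 for i, coord in enumerate(ordered)}

    arcs = []
    for (x, y) in heights:
        # looking only east and north visits every adjacent pair exactly once
        for neighbor in ((x + 1, y), (x, y + 1)):
            if neighbor in heights:
                u, v = (x, y), neighbor
                arcs.append((u, v) if index[u] > index[v] else (v, u))
    return index, arcs
```

Since every arc points from a higher to a lower index, the resulting arc list is acyclic by construction, regardless of how ties are broken.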

To model the runoff behavior of the precipitation water, we compute flows in the graph, which are determined by the nodes’ geodesic heights. To this end, for an arc \(r \in R_{\text {or}}\), we define its slope as the absolute difference of its incident nodes’ geodesic heights and denote it by \(\text {slope}_r\). When distributing the outflow of a node \(v \in V_{\text {or}}\) among its downhill arcs, we want to ensure that a higher slope causes more water flow on an arc. This is modeled by the ratios of the arcs, which are introduced next and are based on the concept of processing networks [26, 27], in which flow is distributed among the outgoing arcs of a node according to fixed ratios.

To compute the ratio of an arc \(r \in R_{\text {or}}\), which we denote by \(\text {ratio}_r\), we have to distinguish two cases. If the sum of the slopes of all outgoing arcs of the node \(\alpha (r)\) is nonzero, we define the ratio of r by \(\text {ratio}_r:=\nicefrac {\text {slope}_r}{\sum _{\hat{r} \in \delta ^+(\alpha (r))} \text {slope}_{\hat{r}}}.\) If the sum is zero, the ratio of r is defined as one divided by the number of outgoing arcs of \(\alpha (r)\), i.e., as \(\text {ratio}_r:=\nicefrac {1}{|\delta ^+(\alpha (r))|}.\)
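As a small illustration of the two cases, the following Python sketch computes the ratios of all outgoing arcs of a single node from their slopes. The function name `arc_ratios` and the list-based interface are assumptions chosen for this example.

```python
def arc_ratios(slopes):
    """slopes: non-negative slopes of the outgoing arcs of one node.

    Returns the ratios of these arcs in the same order.
    """
    if not slopes:
        return []  # a leaf has no outgoing arcs and hence no ratios
    total = sum(slopes)
    if total > 0:
        # first case: distribute proportionally to the slopes
        return [s / total for s in slopes]
    # second case: all slopes are zero, so distribute uniformly
    return [1.0 / len(slopes)] * len(slopes)
```

For a node with two outgoing arcs of slopes 1 and 3, the ratios are 0.25 and 0.75, so the steeper arc receives three quarters of the node's outflow.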

In some situations, however, actions that are taken lead to water flowing in the opposite direction of an arc \(r \in R_{\text {or}}\), which means that the original graph does not suffice for our model. A simple example for such a situation is illustrated in Fig. 2. To this end, for an arc \(r \in R_{\text {or}}\), we denote the inverse arc by \(\overleftarrow{r}\). The extended original graph \(G_{\text {or}}^{\text {ex}} = (V_{\text {or}}, R_{\text {or}}^{\text {ex}})\) is then constructed by adding the inverse arc \(\overleftarrow{r}\) for each \(r \in R_{\text {or}}\) to the original graph and setting the ratio of the arc \(\overleftarrow{r}\) to the ratio of r. Note that this does not change the node set \(V_{\text {or}}\).

Fig. 1 An example of the original graph \(G_{\text {or}} = (V_{\text {or}}, R_{\text {or}})\) on the left and an example of the extended original graph \(G_{\text {or}}^{\text {ex}} = (V_{\text {or}}, R_{\text {or}}^{\text {ex}})\) on the right. The number in each node corresponds to its geodesic height, also indicated by the node’s color. The nodes are indexed in non-decreasing order of geodesic height, where ties are broken arbitrarily as, e.g., for \(v_4\) and \(v_5\). The arcs in the original graph are directed such that they start at the node with the higher index

Fig. 2 The instance consists of two nodes u and v, where v is the higher of the two nodes. This means that water flows from v to u, which is illustrated on the left-hand side. If a basin with depth strictly larger than the absolute difference of the nodes’ geodesic heights can be built on v, the resulting geodesic height of v after building the basin is less than the geodesic height of u. Therefore, the water flows in the opposite direction after building the basin, which is illustrated on the right-hand side

3.1.2 Description of the Graph-Based Model

The goal in our problem is to provide best-possible protection for the buildings by taking a combination of actions respecting a given budget and the cooperation of the actors. In this section, we formulate the corresponding optimization problem formally by using the extended original graph introduced in the previous section.

The input of our graph-based model consists of the following:

  • The original graph \(G_{\text {or}} = (V_{\text {or}}, R_{\text {or}})\) and the extended original graph \(G_{\text {or}}^{\text {ex}} = (V_{\text {or}}, R_{\text {or}}^{\text {ex}})\)

  • The set \(B\) of buildings, each of which is given by its shape and its damage class

  • The set \(\mathcal {A}\) of possible actions, each of which is given by its shape, its construction costs, and its depth/height

  • The set \(P\) of properties, each of which is given by its shape and the willingness to cooperate of the corresponding actor

  • A budget denoted by \(\text {budget}\), which represents an upper bound on the total cost for taking actions

  • The maximum combined number of yellow and red properties denoted by \(\text {maxAllowedYellow}+ \text {maxAllowedRed}\) on which actions can be taken

  • The maximum number of red properties \(\text {maxAllowedRed}\) on which actions can be taken

  • The rain per square meter denoted by \(\text {rain}\)

A feasible solution is a set of actions whose total cost does not exceed the given budget and where neither the combined number of yellow and red properties on which actions are taken nor the number of red properties on which actions are taken exceeds the allowed maximum. The objective is to minimize the sum of all buildings’ needs for protection, which is computed from a given feasible solution as described in the following.
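The feasibility conditions can be checked with a short routine such as the following Python sketch. All names (`is_feasible`, `cost`, `properties_of`, `color`) are hypothetical, and the sketch assumes that actions on black properties have already been removed from the set of possible actions, which the definition above does not state explicitly.

```python
def is_feasible(chosen, cost, properties_of, color, budget,
                max_allowed_yellow, max_allowed_red):
    """Check budget and cooperation constraints for a set of actions.

    chosen:        iterable of action ids
    cost:          dict action id -> construction cost
    properties_of: dict action id -> set of property ids the action is on
    color:         dict property id -> 'green', 'yellow', or 'red'
    """
    chosen = list(chosen)
    if sum(cost[a] for a in chosen) > budget:
        return False

    # each property counts once, no matter how many chosen actions are on it
    touched = set()
    for a in chosen:
        touched |= set(properties_of[a])

    red = sum(1 for p in touched if color[p] == "red")
    yellow = sum(1 for p in touched if color[p] == "yellow")
    return (red <= max_allowed_red
            and yellow + red <= max_allowed_yellow + max_allowed_red)
```

Counting properties rather than actions reflects the wording above, in which the bounds refer to the number of yellow and red properties on which actions are taken.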

The decision on which actions are to be taken changes the geodesic heights of the nodes intersecting with these actions, which may in turn change the flows in the graph. The change of the geodesic height is straightforward if there is at most one action taken on a node. However, if there are several actions with different depths/heights taken on one node, like, for example, a ditch leading into a deeper basin, we need a more sophisticated rule, which is given as follows:

  • (GH1) If at least one action decreasing the geodesic height (i.e., a basin or a ditch) is built on a node \(v \in V_{\text {or}}\), then the geodesic height of v is set to the node’s original geodesic height minus the maximum depth of any of the basins or ditches built on v.

  • (GH2) If no actions decreasing the geodesic height are built on a node \(v \in V_{\text {or}}\) (i.e., neither basins nor ditches are built on v), then the geodesic height of v is set to the node’s original geodesic height plus the maximum height of any of the embankments built on v.
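Rules (GH1) and (GH2) amount to a simple case distinction per node, sketched here in Python. The function name `adjusted_height` and the representation of the built actions as two lists are assumptions for illustration; a node without any built action keeps its original geodesic height.

```python
def adjusted_height(original_height, depths, heights):
    """Geodesic height of a node after taking actions.

    depths:  depths of all basins and ditches built on the node
    heights: heights of all embankments built on the node
    """
    if depths:
        # (GH1): the deepest basin or ditch determines the new height
        return original_height - max(depths)
    if heights:
        # (GH2): only embankments are built, the highest one counts
        return original_height + max(heights)
    return original_height  # no action is built on the node
```

For a ditch of depth 0.5 m leading into a basin of depth 2 m on a node with original geodesic height 100 m, the sketch returns 98 m, even if an embankment is also built on the node.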

Once this decision has been made, the resulting geodesic heights (after taking the actions) determine the flows between the nodes, which then allows us to compute the water levels. Before we describe how the flows are computed, we describe the connection between the flows and the water levels. To this end, we define the excess of a node \(v \in V_{\text {or}}\) as the amount of water accumulating at v, i.e., as the initial water from the rainfall plus the node’s inflow minus its outflow. The water level at v is defined as the excess of v divided by the node’s area.

We next describe how the water levels are computed. An efficient implementation of a combinatorial algorithm for this computation is provided as Algorithm 6 in Appendix 2, which uses Algorithms 4 and 5 as subroutines for computing the flows in the graph and joining nodes, respectively.

We define \(G^{D} = (V^{D}, R^{D})\) for a subset \(D \subseteq \mathcal {A}\) of actions as the graph that is obtained from G when adjusting the geodesic heights as described in (GH1) and (GH2) and changing arc directions where necessary. This graph then represents the input to Algorithm 6 when computing the water levels in the scenario where exactly the actions in D are taken. Throughout the computation, we keep track of both the water levels and the excesses of all nodes in V. At the start of the algorithm, each node receives its initial water from the rainfall, which is computed by multiplying the given rain per square meter with the node’s area. This water may then flow over the node’s outgoing arcs.

The outflow of a node \(v \in V\) (i.e., the water flowing from the node to its adjacent nodes) is distributed proportionally to the ratios of the outgoing arcs if v is not a leaf. The leaves then start flooding until the water level at some leaf \(u \in V\) equals the absolute difference of the geodesic height of u and its lowest parent node \(v \in V\), in which case we say that the water level at u matches the geodesic height of v. The nodes are then joined and from there on represented by u. Joining nodes within Algorithm 6 is done via the subroutine presented in Algorithm 5. This process is then repeated until there is no node that is not a leaf in the graph and has strictly positive excess. The behavior of the flows during the algorithm is illustrated in Fig. 3.
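To make the first phase of this process concrete, the following Python sketch routes the initial rainfall once through an acyclic graph. It is a deliberate simplification and not Algorithm 6: it omits the flooding of leaves and the joining of nodes entirely, and the name `route_once` as well as the data layout are assumptions made here.

```python
def route_once(nodes, out_arcs, area, rain):
    """Distribute the initial rainfall once along the arcs of an acyclic graph.

    nodes:    node ids in topological order (every arc leads to a later node)
    out_arcs: dict node -> list of (child, ratio), ratios summing to one
    area:     dict node -> area in square meters
    rain:     rain per square meter
    Returns (excess, level) as dicts keyed by node.
    """
    # each node starts with its initial water from the rainfall
    water = {v: rain * area[v] for v in nodes}
    excess = {}
    for v in nodes:
        arcs = out_arcs.get(v, [])
        if arcs:
            # a non-leaf passes all of its water on according to the ratios
            for child, ratio in arcs:
                water[child] += ratio * water[v]
            excess[v] = 0.0
        else:
            # a leaf accumulates everything that reaches it
            excess[v] = water[v]
    # water level = excess divided by the node's area
    level = {v: excess[v] / area[v] for v in nodes}
    return excess, level
```

In this simplified setting, all water ends up at the leaves. The full procedure continues from this state by letting the leaves flood until their water levels match the geodesic heights of their lowest parents.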

Fig. 3 An illustration of the flows in Algorithm 6

After the flows are computed, the water levels at the nodes follow immediately by dividing the excess of each node by the node’s area. This allows us to determine the maximum water level at each of the buildings and, hence, its resulting hazard class, which is obtained as described in Section 2. The combination of the buildings’ hazard and damage classes yields the corresponding needs for protection, whose sum is to be minimized.

Note that implementing the described procedure for computing the water levels efficiently is important for obtaining feasible running times of our overall approach on real-world instances. In fact, this procedure is used both when reducing the graph size as described in the following subsection and for obtaining a feasible initial solution of our MIP as discussed in Section 4.2. The implementation as an efficient combinatorial algorithm presented in Appendix 2 uses an intelligent update of the flows, which decreases the running time by about 90% on average compared to computing the flows from scratch in each iteration. Moreover, using a Fibonacci heap to store leaf nodes and the corresponding times until the water level at the leaf matches the geodesic height of its lowest parent further reduces the running time by about 25% on average compared to a simple sorted-array implementation.

3.1.3 Reducing the Graph Size

The size of the original graph \(G_{\text {or}} = (V_{\text {or}}, R_{\text {or}})\) or the extended graph \(G_{\text {or}}^{\text {ex}} = (V_{\text {or}}, R_{\text {or}}^{\text {ex}})\) is the main determinant for the size of a problem instance. In this section, we describe how the graph size can be reduced while still maintaining a realistic model of the problem described in Section 2.

It is worth noting that the sizes of both the original graph and the extended graph are linear in the cardinality of the node set \(V_{\text {or}}\), which we therefore use as a natural measure of the size of these graphs. The aim of this section is to derive the reduced graph \(G_{\text {red}} = (V_{\text {red}}, R_{\text {red}})\) from the original graph. In fact, applying our MIP presented in the next section based on the original graph only works for unrealistically small instances. Hence, reducing the size of the graph is actually crucial in order to obtain a model that is applicable in practice.

As a quick outline of this section, we provide a short summary of the ideas of our graph size reduction techniques:

  1. Instead of a fixed grid size of 1 m, we use a dynamic grid size, which means that certain parts of the terrain surface are modeled using coarser 25 m or 5 m grids.

  2. We remove nodes that do not cause flow into critical locations.

  3. We contract all nodes in non-critical locations that dispense water to critical locations into one source node.

  4. We contract adjacent nodes of similar geodesic heights.

Before we can apply these ideas, we have to introduce some further definitions. To model the terrain surface using a grid size of 25 m, we construct the graph \(G_{25} = (V_{25}, R_{25})\) from those coordinates in the DTM where both UTM coordinates are integer multiples of 25 m. This works completely analogously to the construction of the original graph. The only difference is that the shape of a node in \(V_{25}\) is no longer a square with an edge length of 1 m, but now a square with an edge length of 25 m. Consequently, each node’s area in \(G_{25}\) amounts to 625 m². Also note that, for example, a building is on a node \(v \in V_{25}\) if its shape intersects with the shape of v, which in \(G_{25}\) is a square with an edge length of 25 m.

In the same fashion, we construct the graph \(G_{5} = (V_{5}, R_{5})\) from those coordinates in the DTM where both UTM coordinates are integer multiples of 5 m. It is important to note that, although the nodes in \(V_{25}\), \(V_{5}\), and \(V_{\text {or}}\) stem from the same coordinates, the sets are disjoint as the attributes of the nodes (e.g., their areas) differ.

To obtain more information about the graphs \(G_{25}\), \(G_{5}\), and \(G_{\text {or}}\), we first assess for each node whether there are buildings or possible actions on it. This means that, for each node \(v \in \hat{V}\), where \(\hat{V}\) is one of the sets \(V_{25}\), \(V_{5}\), or \(V_{\text {or}}\), we store a set of buildings on the node, which we denote by \(B_v\), and a set of possible actions, which we denote by \(\mathcal {A}_v\). An efficient algorithm for obtaining these sets is provided in Appendix 1.

1. Using a dynamic grid size: We construct a graph with a dynamic grid size, which we denote by \(G_{\text {dg}} = (V_{\text {dg}}, R_{\text {dg}})\). To this end, we first construct the node set \(V_{\text {dg}}\) and then the arc set \(R_{\text {dg}}\). To construct the node set, we initialize \(V_{\text {dg}}\) as a copy of \(V_{25}\), then resolve each node that intersects with a building or an action at a 5 m grid size, and finally resolve each node that has been resolved at a 5 m grid size and that intersects with a ditch or an embankment at a grid size of 1 m. This procedure is described in Algorithm 1. To keep track of the resolution at the single nodes, we further store the resolution \(\text {res}_v\) for each node \(v \in V_{\text {dg}}\).

Algorithm 1 Construct-nodes
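The node construction described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation behind Algorithm 1: cells are anchored at their lower-left corner, features are approximated by axis-aligned rectangles, and the names `intersects`, `split`, `construct_nodes`, `needs_5m`, and `needs_1m` are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

# A cell is (x, y, res): lower-left corner and edge length in metres.
# Anchoring cells at the lower-left corner is an assumption of this sketch.
Cell = Tuple[int, int, int]
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def intersects(cell: Cell, rect: Rect) -> bool:
    """True if the square cell and the rectangle overlap with positive area."""
    x, y, res = cell
    x_min, y_min, x_max, y_max = rect
    return x < x_max and x_min < x + res and y < y_max and y_min < y + res


def split(cell: Cell, new_res: int) -> List[Cell]:
    """Replace a cell by the sub-cells of edge length new_res that tile it."""
    x, y, res = cell
    return [(x + dx, y + dy, new_res)
            for dx in range(0, res, new_res)
            for dy in range(0, res, new_res)]


def construct_nodes(cells_25: Iterable[Cell],
                    needs_5m: Callable[[Cell], bool],
                    needs_1m: Callable[[Cell], bool]) -> List[Cell]:
    """Start from the 25 m grid and refine to 5 m and 1 m where required."""
    nodes: List[Cell] = []
    for cell in cells_25:
        if not needs_5m(cell):
            nodes.append(cell)               # no building or action: keep 25 m
            continue
        for sub in split(cell, 5):
            if needs_1m(sub):
                nodes.extend(split(sub, 1))  # ditch or embankment: 1 m
            else:
                nodes.append(sub)            # building or other action: 5 m
    return nodes


# Hypothetical example: a small ditch inside the first of two 25 m cells.
ditch: Rect = (0.2, 0.2, 0.8, 0.8)
nodes = construct_nodes([(0, 0, 25), (25, 0, 25)],
                        needs_5m=lambda c: intersects(c, ditch),
                        needs_1m=lambda c: intersects(c, ditch))
```

In this example, the first cell is replaced by 24 nodes at 5 m and 25 nodes at 1 m, while the second cell stays at 25 m, so the total area covered is unchanged.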

The set of arcs \(R_{\text {dg}}\) is then constructed by adding an arc between two nodes in \(V_{\text {dg}}\) if and only if they are adjacent on the dynamic grid (i.e., their shapes have a common edge). The arc is again directed from the node with the higher index to the node with the lower index according to the ordering of the corresponding nodes with the same coordinates in \(V_{\text {or}}\) (i.e., from the higher node to the lower node whenever the two nodes do not have the same geodesic height).

To compute the ratios, we also have to take the resolutions of the nodes into account. This stems from the fact that, with the dynamic grid size, the length of the common edge of two adjacent nodes’ shapes can be 1 m, 5 m, or 25 m. The ratio of an arc \(r \in R_{\text {dg}}\), hence, depends on the slopes of the outgoing arcs of \(\alpha (r)\) and on the proportion of the boundary of the shape of \(\alpha (r)\) that the shapes of \(\alpha (r)\) and \(\omega (r)\) have in common. The ratio of r is computed as

$$\begin{aligned} \text {ratio}_r := \left( \nicefrac {\text {slope}_r}{\sum _{\hat{r} \in \delta ^+(\alpha (r))} \text {slope}_{\hat{r}}}\right) \cdot \text {correction}_r, \end{aligned}$$

where

$$\begin{aligned} \text {correction}_r := {\left\{ \begin{array}{ll} 1 &{} \,\text {if } \text {res}_{\alpha (r)} \le \text {res}_{\omega (r)} \\ \nicefrac {\text {res}_{\omega (r)}}{\text {res}_{\alpha (r)}}&{} \, \text {else} \end{array}\right. }. \end{aligned}$$
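As a small worked example of the two formulas above, the following Python sketch computes the ratios of all arcs leaving one node. The function names and the tuple layout `(slope, res_alpha, res_omega)` are illustrative assumptions.

```python
from typing import List, Tuple


def correction(res_alpha: float, res_omega: float) -> float:
    """Correction factor: 1 if the start node is at most as coarse as the end node."""
    return 1.0 if res_alpha <= res_omega else res_omega / res_alpha


def arc_ratios(out_arcs: List[Tuple[float, float, float]]) -> List[float]:
    """Ratios for all arcs leaving one node.

    Each arc is given as (slope, res_alpha, res_omega), with resolutions in metres.
    """
    total_slope = sum(slope for slope, _, _ in out_arcs)
    return [slope / total_slope * correction(res_a, res_o)
            for slope, res_a, res_o in out_arcs]


# A 25 m node with two equally steep outgoing arcs:
# one to a 25 m neighbour and one to a 5 m neighbour.
ratios = arc_ratios([(0.02, 25, 25), (0.02, 25, 5)])
```

Here the arc to the 25 m neighbour keeps its slope share of 0.5, whereas the arc to the 5 m neighbour is scaled by 5/25, since only a fifth of the common boundary edge is shared with that neighbour.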

An example of shapes of nodes and the corresponding graph \(G_{\text {dg}} = (V_{\text {dg}}, R_{\text {dg}})\) is provided in Fig. 4.

Modeling all buildings at a grid size of 5 m is still overly exact. Buildings at which no (or only negligible) water levels are to be expected can still be modeled at a grid size of 25 m. To assess a good grid size, we compute the water levels on the graphs \(G_{\text {dg}} = (V_{\text {dg}}, R_{\text {dg}})\) and \(G_{25} = (V_{25}, R_{25})\) using Algorithm 6, which is presented in Appendix 2.

We call a node \(v \in V_{25}\) threatened if it has a strictly positive water level in the computation on \(G_{25} = (V_{25}, R_{25})\) or if any node in \(V_{\text {dg}}\) whose shape intersects with the shape of v has a water level greater than or equal to 1 cm in the computation on \(G_{\text {dg}} = (V_{\text {dg}}, R_{\text {dg}})\).

For each non-threatened node \(v \in V_{25}\) that only intersects with buildings and not with actions, we rescale its resolution in \(G_{\text {dg}} = (V_{\text {dg}}, R_{\text {dg}})\) back to 25 m, i.e., we contract all nodes in \(V_{\text {dg}}\) whose shapes intersect with the shape of v into v. Afterwards, we recompute the arc set \(R_{\text {dg}}\) and the ratios with the updated node set \(V_{\text {dg}}\) as we have done before, which yields the final version of \(G_{\text {dg}} = (V_{\text {dg}}, R_{\text {dg}})\).

Fig. 4 A screenshot from our web application on the left-hand side, where the dynamic grid size is visualized and the nodes intersecting with a building are colored yellow. The corresponding part of the graph \(G_{\text {dg}} = (V_{\text {dg}}, R_{\text {dg}})\) is visualized on the right-hand side

The reduction in the overall number of nodes achieved by this step highly depends on the number of nodes in \(V_{\text {or}}\) that do not intersect with any buildings or actions, as the number of these nodes is reduced by the highest factor of 625. In the instances presented in Section 4, the overall number of nodes is usually reduced by a factor of about 500.

2. Removing nodes not causing flow into critical locations: Our next goal is to remove nodes from the graph that do not cause any flow into critical locations. To this end, we define four new properties for nodes. A node \(v \in V_{\text {dg}}\) is called ...

  • critical if its shape intersects with a building or a potential action.

  • relevant if it is critical, its resolution is not 25 m, or it is a successor of a critical node in \(G_{\text {dg}} = (V_{\text {dg}}, R_{\text {dg}})\). Apart from critical nodes, relevant nodes are either nodes where water may accumulate and then cause critical nodes to be flooded due to back pressure, or nodes that are needed to complete the grid without gaps.

  • water-dispensing if it is not relevant, but it is a predecessor of a relevant node in \(G_{\text {dg}} = (V_{\text {dg}}, R_{\text {dg}})\). Water accumulating on such nodes does not cause flooding of relevant nodes due to back pressure. These nodes are, however, still interesting as they dispense water to relevant nodes.

  • irrelevant if it is neither of the above. Irrelevant nodes do not contribute in any way to the flooding of relevant nodes.

As an example, think of a village at the foot of a mountain. Here, the nodes at coordinates within the village are the relevant nodes, the nodes at coordinates on the side of the mountain facing the village are the water-dispensing nodes, and the nodes at coordinates on sides of the mountain not facing the village are the irrelevant nodes.
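The four node properties can be computed by two graph searches. The following Python sketch illustrates this under the assumption that "successor" and "predecessor" are meant transitively (i.e., as reachability along arcs), which matches the mountain example but is not stated explicitly. The function names and labels are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Node = Hashable


def reachable(starts: Iterable[Node], neighbours: Dict[Node, List[Node]]) -> Set[Node]:
    """All nodes reachable from the start nodes (start nodes included)."""
    seen = set(starts)
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for w in neighbours.get(u, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def classify(nodes: Iterable[Node], arcs: Iterable[Tuple[Node, Node]],
             critical: Set[Node], res: Dict[Node, int]) -> Dict[Node, str]:
    """Label each node as relevant, water-dispensing, or irrelevant."""
    nodes = list(nodes)
    succ: Dict[Node, List[Node]] = {}
    pred: Dict[Node, List[Node]] = {}
    for u, w in arcs:
        succ.setdefault(u, []).append(w)
        pred.setdefault(w, []).append(u)
    # Relevant: critical, downstream of a critical node, or not at 25 m resolution.
    relevant = reachable(critical, succ) | {v for v in nodes if res[v] != 25}
    # Water-dispensing: upstream of a relevant node, but not relevant itself.
    dispensing = reachable(relevant, pred) - relevant
    labels = {}
    for v in nodes:
        if v in relevant:
            labels[v] = "relevant"
        elif v in dispensing:
            labels[v] = "water-dispensing"
        else:
            labels[v] = "irrelevant"
    return labels


# Hypothetical instance mirroring the village example.
names = ["slope_facing", "village", "downstream", "slope_away", "valley_away"]
labels = classify(
    names,
    arcs=[("slope_facing", "village"), ("village", "downstream"),
          ("slope_away", "valley_away")],
    critical={"village"},
    res={v: 25 for v in names},
)
```

Critical nodes are a subset of the relevant nodes, so the sketch does not report them separately.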

The first step of the node removal consists of removing all irrelevant nodes from \(V_{\text {dg}}\). It is worth noting that this may cause the graph to be no longer weakly connected. In practice, though, this only happens if buildings are spread widely apart from each other, which is seldom the case. Moreover, our model still works if the graph is not weakly connected. We denote the graph obtained by this method by \(G_{\text {ri}} = (V_{\text {ri}}, R_{\text {ri}})\) (see Footnote 3).

The reduction in the overall number of nodes achieved by this step highly depends on the number of irrelevant nodes, which in turn depends on the choice of the input DTM. Barely any nodes are irrelevant in cases where the region covered by the DTM has been chosen relatively tightly around the built-up region to be protected, whereas many nodes are irrelevant if the region covered by the DTM has been chosen relatively large. However, since the region covered by the DTM is composed of 1 by 1 km rectangles and must always be chosen large enough so that no potentially relevant or water-dispensing nodes are omitted, a certain number of irrelevant nodes is usually unavoidable, so the removal of irrelevant nodes represents an important first step in reducing the overall number of nodes.

3. Contracting nodes in non-critical locations: In the next step, we deal with the water-dispensing nodes. By construction, flow through these nodes is not affected by the decision on which actions are taken. We therefore contract all water-dispensing nodes into a single node s, which we call the source node. Note that this contraction also changes the arc set. The arcs that are incident to s arise from arcs in \(G_{\text {ri}}\) that are directed from a water-dispensing node to a relevant node. In particular, this means that the in-degree of s is zero.

The area of the source node is set to the sum of the areas of all water-dispensing nodes. To compute the ratios of the arcs that are incident to s, we first compute the flows in the graph \(G_{\text {ri}}\) using Algorithm 4 and denote the resulting flow on \(r \in R_{\text {ri}}\) by \(f_r\). The ratio of an arc \(r \in R_{\text {wd}}\) starting in s is then set to the sum of the inflow into \(\omega (r)\) from water-dispensing nodes divided by the total inflow from water-dispensing nodes into relevant nodes in \(G_{\text {ri}}\):

$$\text{ratio}_r := \sum_{\substack{\hat{r} \in R_{\text{wd}}: \\ \alpha(\hat{r}) \text{ is water-dispensing}\\ \text{and } \omega(\hat{r}) = \omega(r)}} f_{\hat{r}} \bigg/ \sum_{\substack{\tilde{r} \in R_{\text{wd}}: \\ \alpha(\tilde{r}) \text{ is water-dispensing}\\ \text{and } \omega(\tilde{r}) \text{ is relevant}}} f_{\tilde{r}}$$
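The following Python sketch illustrates this ratio computation for the arcs leaving the source node s. It assumes that the flows on the arcs before contraction are already available as a dictionary keyed by (start node, end node); the function name and data layout are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple

Node = Hashable


def source_arc_ratios(flows: Dict[Tuple[Node, Node], float],
                      water_dispensing: Set[Node],
                      relevant: Set[Node]) -> Dict[Node, float]:
    """Ratio of the arc from the source node s to each relevant end node.

    flows maps an arc (start, end) of the graph before contraction to its flow.
    """
    inflow: Dict[Node, float] = {}
    for (start, end), flow in flows.items():
        # Only arcs from a water-dispensing node into a relevant node count.
        if start in water_dispensing and end in relevant:
            inflow[end] = inflow.get(end, 0.0) + flow
    total = sum(inflow.values())
    if total == 0:
        return {}
    return {end: value / total for end, value in inflow.items()}


# Hypothetical flows in m^3: a and b are water-dispensing, x and y are relevant.
ratios_s = source_arc_ratios(
    {("a", "x"): 3.0, ("b", "x"): 3.0, ("b", "y"): 2.0, ("x", "y"): 5.0},
    water_dispensing={"a", "b"},
    relevant={"x", "y"},
)
```

The arc between the two relevant nodes x and y is ignored, and the ratios of the arcs leaving s sum to one by construction.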

For completeness, we set the geodesic height of s to the largest geodesic height in the graph before contraction plus 1 m. This ensures that the source node is never flooded unless an unrealistically large amount of rain per m² is used. We denote the obtained graph by \(G_{\text {wd}} = (V_{\text {wd}}, R_{\text {wd}})\) (see Footnote 4).

4. Contracting adjacent nodes of similar geodesic heights: As a last step, we contract adjacent nodes into a new node if they have the same geodesic height up to a given threshold and the same combination of actions and buildings on them, which yields the desired reduced graph \(G_{\text {red}} = (V_{\text {red}}, R_{\text {red}})\). The exact procedure for computing \(G_{\text {red}}\) is presented in Algorithm 2, which will be explained in the following paragraphs. The corresponding reduction step has two benefits. First, it further reduces the number of nodes. Second, and far more importantly, it greatly improves the numerical stability of the MIP. Indeed, numerical issues caused the MIP to be infeasible before we introduced this procedure. The improved numerical stability stems from the fact that, after the procedure, all nodes in the resulting reduced graph \(G_{\text {red}}\) have pairwise distinct geodesic heights, and there are only very few adjacent nodes that have similar geodesic heights.

Algorithm 2 is divided into four parts. In the first part, we contract nodes that intersect with the same sets of buildings and actions and have a similar geodesic height into a new node representing the contracted nodes. The shape of such a new node is defined as the union of the shapes of the contracted nodes, and the boundary of such a node is the boundary of its shape. The geodesic height of the new node is then set to the area-weighted average over the geodesic heights of the nodes that have been contracted into the new node \(v \in V_{\text {red}}\):

$$\text{gh}_v := \sum_{\substack{v' \in V_\text{wd}: \\ v' \text{ is contracted into }v}} \text{gh}_{v'} \cdot \text{area}_{v'} \bigg/ \sum_{\substack{v' \in V_\text{wd}: \\ v' \text{ is contracted into }v}} \text{area}_{v'}$$

In practice, this procedure usually leads to all nodes in \(V_{\text {red}}\) having pairwise distinct geodesic heights. If this is not the case, however, we add slight noise to the geodesic heights of each pair of nodes that have the same geodesic height. This is important in order to guarantee that the MIP produces a feasible solution of the problem.
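The area-weighted height and the tie-breaking step can be illustrated with the following Python sketch. The perturbation size `eps` and the deterministic way in which ties are resolved are assumptions of this sketch, since the text only states that slight noise is added.

```python
from typing import Dict, Hashable, List, Tuple


def contracted_height(parts: List[Tuple[float, float]]) -> float:
    """Area-weighted average height of contracted nodes, given as (gh, area) pairs."""
    total_area = sum(area for _, area in parts)
    return sum(gh * area for gh, area in parts) / total_area


def make_distinct(heights: Dict[Hashable, float], eps: float = 1e-6) -> Dict[Hashable, float]:
    """Perturb equal heights by multiples of eps until all heights differ."""
    seen = set()
    result = {}
    for node in sorted(heights, key=lambda n: (heights[n], str(n))):
        h = heights[node]
        while h in seen:      # another node already has exactly this height
            h += eps
        seen.add(h)
        result[node] = h
    return result


# One 25 m node (625 m^2) at 100.00 m contracted with one 5 m node (25 m^2) at 100.10 m.
gh_new = contracted_height([(100.00, 625.0), (100.10, 25.0)])

# Two nodes with identical heights receive slightly different values afterwards.
distinct = make_distinct({"a": 1.0, "b": 1.0, "c": 2.0})
```

Because of the area weighting, the contracted height stays close to the height of the large node (about 100.004 m in this example).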

In the second part, we remove uphill arcs \(r \in R_{\text {red}}\) that might arise during this procedure.

In the third part, we recompute the ratios of the newly obtained arcs. This time, for a node \(v \in V_{\text {red}}\), the ratio of an arc \(r \in \delta ^+_{G_{\text {red}}}(v)\) is set proportionally to the slopes of the arcs leaving v and to the length of the intersection of the boundaries of v and \(\omega (r)\).

In the final part, we remove the node s and instead increase the area of nodes that are adjacent to s. This only decreases the size of the graph by a single node, but greatly improves the numerical stability of the MIP.

Finding a good value for the threshold is critical here. An overly high value leads to unrealistic results, whereas an overly low value decreases the performance gain obtained from the contraction. We found that, depending on the terrain surface, a value between 5 and 15 cm works best. On hilly surfaces, the value can preferably be set a bit higher, whereas on smooth surfaces, it is better to stick to low values.

The reduction in the overall number of nodes achieved in this last step mainly depends on the threshold parameter and the hilliness of the modeled region. The higher the threshold parameter and the flatter the region, the greater the reduction in the number of nodes.

For three representative regions, which are revisited later in Section 4.2, an overview of the reduction in the overall number of nodes from \(G_{\text {or}}\) to \(G_{\text {red}}\) is provided in Table 2.

Algorithm 2 Contract-components

Table 2 Reduction in the total number of nodes achieved for three representative regions, where the factor provided in the third column is obtained as \(\nicefrac {|V_{\text {or}}|}{|V_{\text {red}}|}\)  

The extended reduced graph \(G_{\text {red}}^{\text {ex}} = (V_{\text {red}}^{\text {ex}}, R_{\text {red}}^{\text {ex}})\) is constructed from the reduced graph \(G_{\text {red}} = (V_{\text {red}}, R_{\text {red}})\) returned by Algorithm 2 in the same manner as we constructed it for the original graph, i.e., for each arc \(r \in R_{\text {red}}\), we add a copy of r in reverse direction.

3.2 Mixed-Integer Programming Formulation and Presolve Techniques

In Section 3.2.1, we present our mixed-integer programming formulation of the problem defined in Section 3.1 as well as several intuitive valid inequalities that improve solution times. The constraints are formulated verbally, while the mathematical formulation can be found in Appendix 3. We then describe methods to preset some of the variables in Section 3.2.2.

3.2.1 Mixed-Integer Programming Formulation

Before stating the mixed-integer programming formulation, we provide complete lists of the sets, parameters, and variables for better readability. The MIP takes, among other things, a graph and its extended graph as an input. Any of the graphs we constructed before could be used, but, as already mentioned, we highly recommend using the reduced graph (and the corresponding extended reduced graph) here as all other graphs make the model too large or numerically unstable. The graph used in the MIP is denoted by \(G = (V, R)\) and the corresponding extended graph by \(G^{\text {ex}}= (V, R^{\text {ex}})\). Throughout this section, we assume that the nodes in V have pairwise distinct geodesic heights, which is the case if \(G = G_{\text {red}}\).

Sets

V:

node set of the graph

\(R\):

arc set of the graph

\(R^{\text {ex}}\):

arc set of the extended graph

\(B\):

set of buildings

\(\mathcal {B}\):

set of possible retention basins

\(\mathcal {D}\):

set of possible ditches

\(\mathcal {E}\):

set of possible embankments

\(\mathcal {A}\):

set of all possible actions, where \(\mathcal {A}= \mathcal {B}\cup \mathcal {D}\cup \mathcal {E}\)

\(P\):

set of properties

\(P_{\text {yellow}}\subseteq P\):

set of properties where the corresponding actor needs minor incentives to cooperate

\(P_{\text {red}}\subseteq P\):

set of properties where the corresponding actor needs major incentives to cooperate

\(P_{\text {black}}\subseteq P\):

set of properties where the corresponding actor does not cooperate at all

The sets corresponding to possible actions are denoted by calligraphic letters. We further introduce the set \(\mathcal {B}_v \subseteq \mathcal {B}\) for each \(v \in V\) as the set of basins on v. The sets \(\mathcal {D}_v\) and \(\mathcal {E}_v\) are defined analogously, and we let \(V_\beta\) denote the set of all nodes intersecting with building \(\beta \in B\).

Parameters

\(\text {rain}\):

total rain per m² in m

\(\text {budget}\):

budget for the total cost of taken actions

\(\text {GH}_v\):

original geodesic height of node \(v \in V\)

\(\text {area}_v\):

area of node \(v \in V\) in m²

\(\text {ratio}_r\):

ratio of outflow of node \(\alpha (r)\) allocated to arc \(r \in R^{\text {ex}}\)

\(\text {depth}_a\):

depth of basin or ditch \(a \in \mathcal {B}\cup \mathcal {D}\) in m

\(\text {height}_e\):

height of embankment \(e \in \mathcal {E}\) in m

\(\text {cost}_a\):

cost of action \(a \in \mathcal {A}\)

\(\text {thresholdWL}_k\):

threshold water level in m for hazard class \(k \in \{0,1, 2, 3\}\)

\(\text {damage}_{k,\beta }\):

damage in the objective function if building \(\beta \in B\) belongs to hazard class \(k \in \{1, 2, 3, 4\}\)

\(\text {maxAllowedYellow}\):

maximum number of properties needing minor incentives to cooperate that actions can be built on

\(\text {maxAllowedRed}\):

maximum number of properties needing major incentives to cooperate that actions can be built on

Variables

\(f_r\):

total flow on arc \(r \in R^{\text {ex}}\) in \(\mathrm m^3\)  

\(\text {excess}_v\):

excess of node \(v \in V\) in \(\mathrm m^3\)  

\(\text {wl}_v\):

water level at node \(v \in V\) in m

\(\text {flooded}_v\):

1 if \(\text {wl}_v > 0\), 0 otherwise

\(\text {active}_r\):

1 if there is flow along arc \(r \in R^{\text {ex}}\), 0 otherwise

\(\text {full}_r\):

1 if \(\text {wl}_{\alpha (r)} > 0\) for \(r \in R\), 0 otherwise

\(\text {decBasin}_b\):

1 if basin \(b \in \mathcal {B}\) is built, 0 otherwise

\(\text {decDitch}_d\):

1 if ditch \(d \in \mathcal {D}\) is built, 0 otherwise

\(\text {decEmb}_e\):

1 if embankment \(e \in \mathcal {E}\) is built, 0 otherwise

\(\text {gh}_v\):

geodesic height of node \(v \in V\) after actions have been built in m

\(\text {down}_v\):

1 if a ditch or basin is built on \(v \in V\), 0 otherwise

\(\text{max}\_\text{inc}_v\):

maximum increase of height through building embankments on \(v \in V\) in m

\(\text{max}\_\text{dec}_v\):

maximum decrease of height through building ditches or basins on \(v \in V\) in m

\(\text{aux}\_\text{fd}_r\):

binary auxiliary variable for the flow distribution over arc \(r \in R^{\text {ex}}\): 1 if arc is active and not full, 0 otherwise

\(\text {od}_r\):

1 if node \(\alpha (r)\) is higher than node \(\omega (r)\) after building the actions for \(r \in R\), 0 otherwise

\(\text {auxO1F1}_r\):

binary auxiliary variable for \(r \in R\): 1 if \(\text {od}_r = 1\) and \(\text {full}_r = 1\), 0 otherwise

\(\text {auxO1F0}_r\):

binary auxiliary variable for \(r \in R\): 1 if \(\text {od}_r = 1\) and \(\text {full}_r = 0\), 0 otherwise

\(\text {auxO0F1}_r\):

binary auxiliary variable for \(r \in R\): 1 if \(\text {od}_r = 0\) and \(\text {full}_r = 1\), 0 otherwise

\(\text {auxO0F0}_r\):

binary auxiliary variable for \(r \in R\): 1 if \(\text {od}_r = 0\) and \(\text {full}_r = 0\), 0 otherwise

\(\text {max}\_\text{wl}_{\beta }\):

maximum water level at any node intersecting with building \(\beta \in B\) in m

\(\text {hc}_{k, \beta }\):

1 if building \(\beta \in B\) belongs to hazard class \(k \in \{0, \dots ,4\}\), 0 otherwise

\(\text {action}_p\):

1 if an action is taken on property \(p \in P\), 0 otherwise

\(\text {hdb}_b\):

absolute value of the height difference in m that is caused by building basin \(b \in \mathcal {B}\) if basin b is built, 0 otherwise

\(\text {hdd}_d\):

absolute value of the height difference in m that is caused by building ditch \(d \in \mathcal {D}\) if ditch d is built, 0 otherwise

\(\text {hde}_e\):

absolute value of the height difference in m that is caused by building embankment \(e \in \mathcal {E}\) if it is built, 0 otherwise

Objective Function

The only term in the objective function is the damage caused to the buildings, which depends on their hazard class and their damage class. Thus, the objective function to be minimized is given as

$$\begin{aligned} \sum _{\beta \in B} \sum _{k = 1}^{4} \text {damage}_{k,\beta } \cdot \text {hc}_{k, \beta }. \end{aligned}$$

Constraints

To enhance readability, we use the \(\max\) operator within our formulation. This operator takes a set of variables and/or parameters as an argument and returns the maximum among their values. Note that the operator can alternatively be implemented using big M constraints. This, however, may lead to numerical instability if finding a suitable value M is difficult. We therefore use the \(\max\) operator, which is pre-implemented in most modern MIP solvers.

Furthermore, we make use of indicator constraints. An indicator constraint is of the form

$$\begin{aligned} bin = val \quad \Longrightarrow \quad a^Tx \le b \end{aligned}$$

and states that the constraint \(a^Tx \le b\) must be satisfied if the binary variable bin has value \(val \in \{0,1\}\). An indicator constraint can also be implemented using a big M constraint. It is, however, well known that indicator constraints have many advantages compared to big M formulations [28]. Indicator constraints are, like the \(\max\) operator, pre-implemented in many modern MIP solvers.

The formulation of some constraints requires strict inequalities, which are theoretically not possible in a MIP. In practice, however, values are encoded as floats with a bounded number of decimal places. Therefore, a strict inequality \(x < y\) can be formulated as \(x \le y - \varepsilon\) for some small \(\varepsilon > 0\).
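To make the comparison with big M formulations concrete, the following is the standard reformulation of an indicator constraint for the case \(val = 1\). It assumes that a constant M with \(a^Tx - b \le M\) for all feasible x is known; such an M is an assumption of this sketch and is not specified in the text:

$$\begin{aligned} a^Tx \le b + M \cdot (1 - bin). \end{aligned}$$

For \(val = 0\), the factor \((1 - bin)\) is replaced by bin. If M is chosen too small, feasible solutions are cut off, and if it is chosen too large, the linear relaxation becomes weak and the model becomes numerically fragile.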

Water Levels at Nodes

To determine the water levels, we first compute the excess of each node \(v\in V\):

  1. The excess of node \(v \in V\) is the inflow minus the outflow plus the rain volume on the node.

The excess of a node \(v\in V\) immediately yields the water level at the node:

  2. The water level at node \(v \in V\) is the excess of node v divided by its area.
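As a small numerical illustration of Constraints (1) and (2), the following Python sketch computes the excess and the water level of a single node. The function name and the example values are hypothetical.

```python
from typing import Tuple


def excess_and_water_level(inflow: float, outflow: float,
                           rain: float, area: float) -> Tuple[float, float]:
    """Return (excess in m^3, water level in m) of one node.

    inflow, outflow: total flow into and out of the node in m^3
    rain: total rain per m^2 in m
    area: area of the node in m^2
    """
    excess = inflow - outflow + rain * area  # rain volume on the node is rain * area
    return excess, excess / area


# A 25 m node (625 m^2) with 50 mm of rain, 10 m^3 inflow, and 20 m^3 outflow.
excess, wl = excess_and_water_level(inflow=10.0, outflow=20.0, rain=0.05, area=625.0)
```

In this example, the rain contributes 31.25 m³, so the excess is 21.25 m³ and the water level is 3.4 cm.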

Geodesic Heights of Nodes

In contrast to most traditional flow problems, we do not aim to optimize the flow in the graph, but the terrain surface determining the flows. The following constraints therefore set the geodesic height variable \(\text {gh}_v\) for each node \(v \in V\). First, to distinguish the two cases (GH1) and (GH2) from Section 3.1.2, the variable \(\text {down}_v\) is set to one in case (GH1), and to zero otherwise:

  3. If a basin \(b \in \mathcal {B}\) is built on node \(v \in V\), the variable \(\text {down}_v\) is set to one.

  4. If a ditch \(d \in \mathcal {D}\) is built on node \(v \in V\), the variable \(\text {down}_v\) is set to one.

  5. If neither ditches nor basins are built on node \(v \in V\), the variable \(\text {down}_v\) is set to zero.

Next, the variables \(\text {hdb}_b\), \(\text {hdd}_d\), and \(\text {hde}_e\) for \(b \in \mathcal {B}\), \(d \in \mathcal {D}\), and \(e \in \mathcal {E}\) that determine the height differences that result from taking actions are set:

  6. The variable \(\text {hdb}_b\) is set to \(\text {depth}_b\) if basin \(b \in \mathcal {B}\) is built (i.e., if \(\text {decBasin}_b = 1\)), and to zero otherwise.

  7. The variable \(\text {hdd}_d\) is set to \(\text {depth}_d\) if ditch \(d \in \mathcal {D}\) is built (i.e., if \(\text {decDitch}_d = 1\)), and to zero otherwise.

  8. The variable \(\text {hde}_e\) is set to \(\text {height}_e\) if embankment \(e \in \mathcal {E}\) is built (i.e., if \(\text {decEmb}_e = 1\)), and to zero otherwise.

To enable setting the geodesic height variables as described in the case distinction, the maximum depth of any of the basins or ditches built on v in case (GH1) and the maximum height of any of the embankments built on v in case (GH2) are now computed:

  9. The maximum decrease \(\text {max}\_\text{dec}_v\) of the geodesic height at node \(v \in V\) is set to the maximum of the height differences that result from building basins or ditches on v and 0.

  10. The maximum increase \(\text {max}\_\text{inc}_v\) of the geodesic height at node \(v \in V\) is set to the maximum of the height differences that result from building embankments on v and 0.

Finally, the geodesic height variable \(\text {gh}_v\) is set for each node \(v \in V\):

  11. The geodesic height \(\text {gh}_v\) of node \(v \in V\) is greater than or equal to the original geodesic height of v minus the maximum decrease caused by basins and ditches.

  12. The geodesic height \(\text {gh}_v\) of node \(v \in V\) is less than or equal to the original geodesic height of v plus the maximum increase caused by embankments.

  13. If a basin or ditch is built on node \(v \in V\) (i.e., \(\text {down}_v = 1\)), the geodesic height \(\text {gh}_v\) of v is less than or equal to the original geodesic height of v minus the maximum decrease caused by basins and ditches and, hence, in combination with Constraint (11), equal to the original geodesic height of v minus the maximum decrease caused by basins and ditches. This is modeled using a big M constraint where \(M_v:= \max (\{\text {depth}_b | b \in \mathcal {B}_v\} \cup \{\text {depth}_d | d \in \mathcal {D}_v\} \cup \{0\}) + \max (\{\text {height}_e | e \in \mathcal {E}_v\} \cup \{0\})\).

  14. If no basin or ditch is built on node \(v \in V\) (i.e., \(\text {down}_v = 0\)), the geodesic height \(\text {gh}_v\) of v is greater than or equal to the original geodesic height of v plus the maximum increase caused by embankments and, hence, in combination with Constraint (12), equal to the original geodesic height of v plus the maximum increase caused by embankments. This is again modeled using a big M constraint with the same \(M_v\) as in the previous constraint.
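The net effect of Constraints (3) to (14) on a single node can be summarized by the following Python sketch, which evaluates the resulting geodesic height directly instead of modeling it with big M constraints. The function name and argument layout are hypothetical.

```python
from typing import Sequence


def geodesic_height(original_gh: float,
                    built_depths: Sequence[float],
                    built_heights: Sequence[float]) -> float:
    """Geodesic height of a node after the chosen actions have been built.

    built_depths: depths (m) of the basins and ditches built on the node
    built_heights: heights (m) of the embankments built on the node
    """
    max_dec = max(built_depths, default=0.0)   # Constraint (9)
    max_inc = max(built_heights, default=0.0)  # Constraint (10)
    down = len(built_depths) > 0               # Constraints (3) to (5)
    if down:
        return original_gh - max_dec           # Constraints (11) and (13)
    return original_gh + max_inc               # Constraints (12) and (14)


lowered = geodesic_height(100.0, built_depths=[1.5, 0.5], built_heights=[1.0])
raised = geodesic_height(100.0, built_depths=[], built_heights=[0.5, 1.0])
unchanged = geodesic_height(100.0, built_depths=[], built_heights=[])
```

Note that, as soon as a basin or ditch is built on the node, embankments on the same node no longer affect its height, which is why the first example yields 98.5 m.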

Arc Directions

There might be arcs in the input graph where, after taking actions and thereby changing the geodesic heights of nodes, the start node has a lower geodesic height than the end node, so the direction of the arc has to be reversed. If this is not the case for an arc \(r \in R\), the arc is said to have original direction and the variable \(\text {od}_r\) is set to one by using indicator constraints:

  15. If arc \(r \in R\) has original direction, the variable \(\text {od}_r\) is set to one.

  16. Otherwise, the variable \(\text {od}_r\) is set to zero.

Full Arcs

The following constraints deal with the behavior of the flows on the arcs in the extended graph \(G^{\text {ex}}= (V, R^{\text {ex}})\). To this end, we introduce the following terminology: An arc \(r \in R\) is called full if the water level at the lower of the two nodes \(\alpha (r)\) and \(\omega (r)\) is greater than or equal to the absolute difference of their geodesic heights. For its inverse arc \(\overset{\leftarrow }{r}\in R^{\text {ex}}\setminus R\), we say that this arc is full if and only if r is full (see Footnote 5). Note that this definition refers to the geodesic heights after taking actions, where it is possible that \(\alpha (r)\) has a smaller geodesic height than \(\omega (r)\). To connect the variables \(\text {full}_r\) to the water levels, some binary auxiliary variables incorporating the original direction variables are first introduced:

  17. The variable \(\text {auxO1F1}_r\) for arc \(r \in R\) is set to one if and only if \(\text {od}_r = 1\) and \(\text {full}_r = 1\).

  18. The variable \(\text {auxO1F0}_r\) for arc \(r \in R\) is set to one if and only if \(\text {od}_r = 1\) and \(\text {full}_r = 0\).

  19. The variable \(\text {auxO0F1}_r\) for arc \(r \in R\) is set to one if and only if \(\text {od}_r = 0\) and \(\text {full}_r = 1\).

  20. The variable \(\text {auxO0F0}_r\) for arc \(r \in R\) is set to one if and only if \(\text {od}_r = 0\) and \(\text {full}_r = 0\).

The following constraints connect the variables \(\text {full}_r\) to the water levels using the auxiliary variables:

  21. If arc \(r \in R\) has original direction and is full, the water level at \(\omega (r)\) must be greater than or equal to the absolute difference of the geodesic heights of \(\alpha (r)\) and \(\omega (r)\).

  22. If arc \(r \in R\) has original direction and is not full, the water level at \(\omega (r)\) must be less than the absolute difference of the geodesic heights of \(\alpha (r)\) and \(\omega (r)\).

  23. If arc \(r \in R\) does not have original direction and is full, the water level at \(\alpha (r)\) must be greater than or equal to the absolute difference of the geodesic heights of \(\alpha (r)\) and \(\omega (r)\).

  24. If arc \(r \in R\) does not have original direction and is not full, the water level at \(\alpha (r)\) must be less than the absolute difference of the geodesic heights of \(\alpha (r)\) and \(\omega (r)\).
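The definition of a full arc that Constraints (21) to (24) encode can be checked for given heights and water levels with the following Python sketch. It assumes pairwise distinct geodesic heights, as in the rest of this section; the function name is hypothetical.

```python
def is_full(gh_alpha: float, gh_omega: float,
            wl_alpha: float, wl_omega: float) -> bool:
    """True if the water level at the lower end node reaches the higher end node.

    gh_alpha, gh_omega: geodesic heights (m) of the start and end node after actions
    wl_alpha, wl_omega: water levels (m) at the start and end node
    """
    height_difference = abs(gh_alpha - gh_omega)
    original_direction = gh_alpha > gh_omega       # corresponds to od_r = 1
    wl_lower = wl_omega if original_direction else wl_alpha
    return wl_lower >= height_difference


full_downhill = is_full(10.0, 9.5, wl_alpha=0.1, wl_omega=0.6)     # Constraint (21)
not_full_downhill = is_full(10.0, 9.5, wl_alpha=0.0, wl_omega=0.4)  # Constraint (22)
full_reversed = is_full(9.0, 10.0, wl_alpha=1.0, wl_omega=0.0)      # Constraint (23)
```

In the first example, the water surface lies at 10.1 m at both end nodes, which is also what Constraint (36) requires for full arcs.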

Flooded Nodes

A node \(v \in V\) is called flooded if its water level \(\text {wl}_v\) is strictly positive, and non-flooded otherwise. The following indicator constraints set the variables \(\text {flooded}_v\) for \(v\in V\) that indicate flooded nodes:

  25. If the water level \(\text {wl}_v\) at node \(v\in V\) is strictly positive, the variable \(\text {flooded}_v\) is set to one.

  26. If the water level \(\text {wl}_v\) at node v is zero, the variable \(\text {flooded}_v\) is set to zero.

Active Arcs

The net flow between two adjacent nodes in the extended graph can be in either one or the other direction. An arc \(r\in R^{\text {ex}}\) is called active if the flow on r is strictly positive. The following constraints set the variables \(\text {active}_r\) for \(r\in R^{\text {ex}}\) that indicate active arcs:

  27. For arc \(r \in R\) and its inverse arc \(\overset{\leftarrow }{r}\in R^{\text {ex}}\), at most one of the variables \(\text {active}_r\) and \(\text {active}_{\overset{\leftarrow }{r}}\) can be equal to one.

  28. If an arc \(r \in R^{\text {ex}}\) is not active, the flow on the arc must be zero.

Flow on Arcs that Are Not Full

The outflow of a node \(v \in V\) is to be distributed according to the ratios of its outgoing arcs in the extended graph \(G^{\text {ex}} = (V, R^{\text {ex}})\) that are active and not full. The following constraints set the auxiliary variables \(\text {aux}\_\text{fd}_r\) and \(\text {aux}\_\text{fd}_{\overset{\leftarrow }{r}}\) for \(r\in R\) that indicate arcs that are active and not full:

  29. For arc \(r \in R\), the auxiliary variable \(\text {aux}\_\text{fd}_r\) is set to one if and only if the arc is active and not full.

  30. For arc \(r \in R\), the auxiliary variable \(\text {aux}\_\text{fd}_{\overset{\leftarrow }{r}}\) for the inverse arc is set to one if and only if \(\overleftarrow{r}\) is active and not full (see Footnote 6).

The outflow of each node \(v \in V\) is now distributed among its outgoing arcs in the extended graph that are active and not full:

  31. For node \(v \in V\) and each pair of arcs \(r_1, r_2 \in \delta ^+_{G^{\text {ex}}}(v)\), if both arcs are active and not full, the flow is distributed proportionally to the ratios \(\text {ratio}_{r_1}\) and \(\text {ratio}_{r_2}\).

For each arc \(r \in R\) that is not full, the water level at the higher of the two nodes \(\alpha (r)\) and \(\omega (r)\) must be zero:

  32. For each arc \(r \in R\) that is not full and has original direction, the water level at \(\alpha (r)\) is set to zero.

  33. For each arc \(r \in R\) that is not full and does not have original direction, the water level at \(\omega (r)\) is set to zero.

Water can only flow on downhill arcs \(r \in R^{\text {ex}}\) that are not full:

  34. For each arc \(r \in R\) that is not full and has original direction, the arc \(\overleftarrow{r}\) is not active.

  35. For each arc \(r \in R\) that is not full and does not have original direction, the arc r is not active.
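The pairwise proportionality required by Constraint (31) can be illustrated with the following Python sketch, which splits a given volume among the outgoing arcs of one node that are active and not full. The function name, the data layout, and the interpretation of `volume` as the part of the outflow routed over these arcs are assumptions of the sketch.

```python
from typing import Dict, Hashable, Tuple

Arc = Hashable


def distribute(volume: float,
               arcs: Dict[Arc, Tuple[float, bool, bool]]) -> Dict[Arc, float]:
    """Split volume (m^3) among arcs that are active and not full.

    arcs maps each outgoing arc to (ratio, active, full).
    """
    eligible = {arc: ratio for arc, (ratio, active, full) in arcs.items()
                if active and not full}
    total_ratio = sum(eligible.values())
    if total_ratio == 0:
        return {}
    # Renormalizing by the sum of eligible ratios keeps every pair of flows
    # proportional to the corresponding pair of ratios.
    return {arc: volume * ratio / total_ratio for arc, ratio in eligible.items()}


flows_out = distribute(9.0, {
    "r1": (0.50, True, False),
    "r2": (0.25, True, False),
    "r3": (0.25, True, True),   # full arc: excluded from the proportional split
})
```

For the two eligible arcs, the resulting flows satisfy \(f_{r_1} \cdot \text {ratio}_{r_2} = f_{r_2} \cdot \text {ratio}_{r_1}\), which is the pairwise form of the proportionality condition.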

Flow on Full Arcs

As the flow is immediately connected to the water levels by Constraints (1) and (2), the flow on each full arc \(r\in R\) can be set indirectly by connecting the water levels at its start node and its end node:

  36. For each full arc \(r \in R\), the sum of the geodesic height and the water level must be equal in \(\alpha (r)\) and \(\omega (r)\).

Maximum Water Levels at Buildings

  37. For each building \(\beta \in B\), the maximum water level variable \(\text{max}\_\text{wl}_\beta\) is set to the maximum of the water levels at nodes intersecting with the building.

Note that, strictly speaking, the maximum is not taken here, but the maximum water level at the building is only bounded from below by each water level at an intersecting node. The objective function then aims to minimize the maximum water levels at the buildings to achieve equality.

Hazard Classes of Buildings

  38. Each building \(\beta \in B\) belongs to exactly one hazard class.

  39. If building \(\beta \in B\) belongs to hazard class \(k \in \{ 0, \dots , 4\}\), its maximum water level must be less than or equal to the upper threshold of this hazard class.

Again, the maximum water levels are only bounded from above as a higher hazard class leads to a higher penalty in the objective function.
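The interplay between the hazard class constraints and the objective function can be illustrated with the following Python sketch, which assigns each building the lowest hazard class whose upper threshold is not exceeded and sums up the resulting damage. All threshold and damage values are hypothetical, as is the assumption that the thresholds increase with k.

```python
from typing import Dict, Hashable, Sequence, Tuple

Building = Hashable


def hazard_class(max_wl: float, thresholds: Sequence[float]) -> int:
    """Lowest class k whose upper threshold (m) is not exceeded, else the top class."""
    for k, threshold in enumerate(thresholds):
        if max_wl <= threshold:
            return k
    return len(thresholds)


def total_damage(max_wl: Dict[Building, float],
                 thresholds: Sequence[float],
                 damage: Dict[Tuple[int, Building], float]) -> float:
    """Objective value: sum of damage[k, building] over buildings in classes k >= 1."""
    total = 0.0
    for building, level in max_wl.items():
        k = hazard_class(level, thresholds)
        if k >= 1:
            total += damage[(k, building)]
    return total


# Hypothetical upper thresholds for classes 0 to 3 and hypothetical damage values.
thresholds = [0.0, 0.1, 0.5, 1.0]
damage = {(k, b): 10.0 * k for k in range(1, 5) for b in ("house", "school")}
objective = total_damage({"house": 0.3, "school": 0.0}, thresholds, damage)
```

In the MIP, choosing the lowest admissible class is not enforced by a constraint but follows from the minimization, which is mimicked here by returning the first class whose threshold is satisfied.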

Budget Constraint

  40. The total cost for building basins, ditches, and embankments must not exceed the given budget.

Incentives for Actors

The following constraints enforce the given upper bounds on the incentives required for cooperation of actors and ensure that no actions are taken on properties of actors that do not cooperate at all. This is done by means of the variables \(\text {action}_p\) for \(p\in P\) that indicate properties on which at least one action is taken:

  1. 41.

    Actions are taken on at most \(\text {maxAllowedYellow}+\text {maxAllowedRed}\) yellow and red properties in total.

  2. 42.

    Actions are taken on at most \(\text {maxAllowedRed}\) red properties.

  3. 43.

    No actions are taken on black properties.

  4. 44.

    The variable \(\text {action}_p\) for property \(p\in P\) is set to one if at least one action is taken on property p.

    It is worth noting that it is not trivial to see that the MIP is indeed a correct formulation of the problem defined in Section 3.1. However, we show this in Appendix 2 by proving that any feasible solution of the MIP taking exactly the actions in \(D \subseteq \mathcal {A}\) leads to the same water levels at the nodes as the result of Algorithm 6 applied on \(G^{D}\), which is the graph that results from taking the actions in D and adjusting the geodesic heights and arc directions accordingly as described in Section 2.
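Constraints (40) to (44) can also be read as a plain admissibility test on a chosen set of actions. The sketch below is such a test under simplifying assumptions: the `Action` record, the colour label `"green"` for fully cooperative properties, and all names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Iterable


@dataclass(frozen=True)
class Action:
    name: str
    cost: float
    properties: frozenset  # ids of the properties the action is located on


def is_admissible(actions: Iterable[Action],
                  colour: Dict[Hashable, str],
                  budget: float,
                  max_allowed_yellow: int,
                  max_allowed_red: int) -> bool:
    """Check the budget and incentive constraints for a set of actions."""
    actions = list(actions)
    # Budget constraint (40): total cost must not exceed the budget.
    if sum(a.cost for a in actions) > budget:
        return False
    # Properties with at least one action, i.e. action_p = 1 (44).
    used = set().union(*(a.properties for a in actions))
    # No actions on black properties (43).
    if any(colour[p] == "black" for p in used):
        return False
    red = sum(1 for p in used if colour[p] == "red")
    yellow = sum(1 for p in used if colour[p] == "yellow")
    # Limits on red properties (42) and on yellow and red in total (41).
    return (red <= max_allowed_red
            and red + yellow <= max_allowed_yellow + max_allowed_red)
```

Note that a property is counted once even if several selected actions are located on it, which is the role of the variables \(\text {action}_p\) in the MIP.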

Valid Inequalities

We finish the description of the MIP by presenting three intuitive sets of valid inequalities that improve the solution times of the model:

  1. 45.

    For each pair of consecutive original-direction (i.e., downhill) arcs \(r_1,r_2\in R\) with \(\omega (r_1)=\alpha (r_2)\), the first arc \(r_1\) can only be full if the second arc \(r_2\) is full as well.

  2. 46.

    If node \(v\in V\) is flooded, then each arc \(r \in \delta ^+_{G^{\text {ex}}}(v)\) with \(\text {gh}_v > \text {gh}_{\omega (r)}\) must be full (otherwise, water could still flow in downhill direction from v).

  3. 47.

    If node \(v\in V\) is not flooded, then no arc \(r \in \delta ^-_{G^{\text {ex}}}(v)\) with \(\text {gh}_v < \text {gh}_{\alpha (r)}\) can be full.
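To make the logic of the three inequalities concrete, the following sketch checks them on a given 0/1 assignment. It is a simplification under stated assumptions: the extended graph \(G^{\text {ex}}\) is collapsed to a single list of arcs `(tail, head)`, only arcs that run downhill with respect to the heights `gh` are considered, and the dictionaries `full` and `flooded` are hypothetical stand-ins for the corresponding MIP variables.

```python
def violated_valid_inequalities(arcs, gh, full, flooded):
    """Return a list of (inequality, witness...) tuples for violations."""
    downhill = [(a, b) for (a, b) in arcs if gh[a] > gh[b]]
    violations = []
    # (45): for consecutive downhill arcs r1, r2, full[r1] implies full[r2].
    for r1 in downhill:
        for r2 in downhill:
            if r1[1] == r2[0] and full[r1] and not full[r2]:
                violations.append(("45", r1, r2))
    for (a, b) in downhill:
        # (46): a flooded node forces its outgoing downhill arcs to be full.
        if flooded[a] and not full[(a, b)]:
            violations.append(("46", a, (a, b)))
        # (47): a non-flooded node has no full arc entering from above.
        if not flooded[b] and full[(a, b)]:
            violations.append(("47", b, (a, b)))
    return violations
```

An empty result means that the assignment is consistent with all three inequalities on the simplified graph.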

3.2.2 Presolve Techniques

We close this section by presenting two methods for presetting some of the variables.

Through our analysis, we found that the variables \(\text {flooded}_v\) for \(v\in V\) are the major bottleneck of the MIP. It is therefore natural to investigate which nodes must always be flooded and which nodes can never be flooded in a feasible solution in order to preset some of these variables to one or zero, respectively.

We start by presetting variables for nodes that must always be flooded. To this end, we consider the leaves of the graph \(G = (V, R)\). If there is no possible embankment on a leaf \(l \in V\) and no possible ditches or basins on any of the nodes in \(\delta ^-(l)\), the leaf will also be a leaf after taking actions—independent of which actions are selected. This means that l is flooded in any feasible solution since at least the initial water from the rain event will build up a water level strictly larger than zero at l. For all such leaves, we can therefore preset the variable \(\text {flooded}_l\) to one.
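The leaf rule above can be sketched in a few lines. The sketch assumes that a leaf is a node without outgoing arcs and that the possible actions are given as two node sets; the function and parameter names are hypothetical.

```python
def leaves_preset_flooded(nodes, arcs, embankment_nodes, ditch_or_basin_nodes):
    """Return the leaves l for which flooded_l can be preset to one.

    nodes: iterable of node ids
    arcs: iterable of (tail, head) pairs of the graph G = (V, R)
    embankment_nodes: nodes on which an embankment is possible
    ditch_or_basin_nodes: nodes on which a ditch or basin is possible
    """
    successors = {v: set() for v in nodes}
    predecessors = {v: set() for v in nodes}
    for tail, head in arcs:
        successors[tail].add(head)
        predecessors[head].add(tail)

    preset = set()
    for v in successors:
        is_leaf = not successors[v]
        # The leaf stays a leaf only if no action can change its own height
        # or the height of any of its predecessors.
        if (is_leaf
                and v not in embankment_nodes
                and not (predecessors[v] & set(ditch_or_basin_nodes))):
            preset.add(v)
    return preset
```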

Identifying nodes \(v\in V\) for which the variable \(\text {flooded}_v\) can be preset to zero (i.e., nodes that can never be flooded in any feasible solution) is more involved. The idea here is that, if no possible action is located on v, the water levels at all successors of v must match the geodesic height of v in order for v to be flooded. Thus, if the total amount of rain on the whole area does not suffice for raising the water level at each successor to the absolute difference of the geodesic height of the successor and the geodesic height of v, then v can never be flooded in any feasible solution.

In order to find such non-flooded nodes, we start by computing the maximum possible geodesic height of each node that can be obtained after taking actions,Footnote 7 and then construct a new graph \(G_{\text {nf}} = (V_{\text {nf}}, R_{\text {nf}})\) where each node is assigned its corresponding maximum possible geodesic height and arcs are directed in downhill direction with respect to these geodesic heights.Footnote 8 For each node on which no actions are located, we compute its successors in \(G_{\text {nf}}\).Footnote 9 If the amount of rain that is needed to raise the water level at each of these successors to the absolute difference of the geodesic height of the successor and the geodesic height of v exceeds the total rain volume on the whole area, node v can never be flooded in any feasible solution. If this is not the case, we can apply the same idea using a larger set of nodes instead of the successors of v. To this end, we consider the undirected version of \(G_{\text {nf}}\) and remove all nodes that have strictly larger geodesic height than v. We then compute all nodes different from v that are in the same connected component as v in the remaining undirected graph. It is clear that the set of these nodes is a superset of the set of successors of v in \(G_{\text {nf}}\), so we can apply the reasoning as before to this larger set of nodes.

The pseudocode of the corresponding algorithm is presented as Algorithm 7 in Appendix 4. Note that one could of course use the larger set of nodes right away, but this would cause a non-negligible overhead in computation.
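The two-stage test can be sketched as follows. This is not Algorithm 7 itself but a minimal illustration under assumptions: "successors" is read as all nodes reachable along strictly downhill steps, the water volume needed at a node is taken as its area times the required water level, and `height` holds the maximum possible geodesic heights. The caller is expected to apply the test only to nodes without possible actions. All names are hypothetical.

```python
from collections import deque


def downhill_successors(v, height, neighbours):
    """Nodes reachable from v along strictly downhill steps (cheap set)."""
    seen, queue = set(), deque([v])
    while queue:
        u = queue.popleft()
        for w in neighbours[u]:
            if w not in seen and height[w] < height[u]:
                seen.add(w)
                queue.append(w)
    return seen


def low_component(v, height, neighbours):
    """Nodes other than v connected to v once all higher nodes are removed."""
    seen, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        for w in neighbours[u]:
            if w not in seen and height[w] <= height[v]:
                seen.add(w)
                queue.append(w)
    return seen - {v}


def required_volume(v, node_set, height, area):
    """Volume needed to lift the water surface at each node to height[v]."""
    return sum(area[u] * (height[v] - height[u]) for u in node_set)


def can_never_be_flooded(v, height, area, neighbours, total_rain_volume):
    """Try the cheap successor set first, then the larger component."""
    cheap = downhill_successors(v, height, neighbours)
    if required_volume(v, cheap, height, area) > total_rain_volume:
        return True
    larger = low_component(v, height, neighbours)
    return required_volume(v, larger, height, area) > total_rain_volume
```

Computing the larger set only when the cheap test fails reflects the trade-off mentioned above: the component-based test is stronger, but it is evaluated lazily to limit the overhead.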

4 Computational Results

In this section, we present a comparison of the results obtained from our MIP to results obtained from established simulation software (Section 4.1). Afterwards, we use real-world instances from different municipalities to identify and analyze the main drivers for the running time of our method and the quality of the obtained solutions (Section 4.2).

4.1 Comparison with Established Simulation Software

To validate our approach, we compared the results obtained on real-world instances to results obtained on these instances from the well-established simulation software “HYSTEM-EXTRAN” [29]. HYSTEM-EXTRAN is the German industry standard for hydro-dynamic simulations in urban water management and is used by most engineering offices and municipalities when evaluating precautionary measures for flash floods caused by heavy rain events. It does, however, not support any kind of optimization, but can only be used to simulate the water levels resulting from a given (usually manually chosen) combination of actions for a given amount of rain. Thus, we compared the water levels resulting from our approach for the status quo of each instance (which contains only the already implemented actions, if any) without allowing any additional actions to the water levels obtained from HYSTEM-EXTRAN’s simulation for the same situation. The results obtained from HYSTEM-EXTRAN have been provided and validated by the engineering office igr AG, which was one of our partners in the project AKUT.

All in all, we found that the results predominantly coincide, with only slight differences that usually occur at the periphery of flooded areas. An illustrative example, in which a 30-year rain event (i.e., the heaviest rain to be expected in the chosen area over a time span of 30 years) has been simulated in a hilly region, is provided in Fig. 5. We further found that AKUT slightly underestimates the damage to buildings in hilly regions whereas it slightly overestimates the damage to buildings in flat regions. This is due to the fact that HYSTEM-EXTRAN also takes damage caused by high current velocity into account, which is neglected in AKUT.

Fig. 5

Extract from a comparison of water levels obtained from HYSTEM-EXTRAN and AKUT for a 30-year rain event in a hilly region. The darker the blue color, the higher the water level, where the highest obtained levels are illustrated in purple in the case of HYSTEM-EXTRAN

4.2 Running Time and Performance

We now investigate the running times of our MIP and the quality of the obtained solutions. For both of them, we present the most important drivers that have been identified by applying our approach to a wide range of different real-world problem instances. As an illustration, results for nine representative instances obtained from three different regions (two municipalities and a part of a city) that are considered in three relevant scenarios are presented. The three scenarios are a 30-year rain event with a budget that allows four actions to be taken, a 50-year rain event with a budget that allows four actions to be taken, and a 50-year rain event with a larger budget that allows six actions to be taken.Footnote 10 The regions are a municipality on a hilly terrain, called “Hilly Region” (HR) in the following, a municipality on a flat terrain, called “Flat Region 1” (FR1) in the following, and a part of a city on a flat terrain, called “Flat Region 2” (FR2) in the following. For each region, the set \(\mathcal {A}\) of possible actions is the same for all three scenarios and consists of about 20 actions that have been selected according to the local circumstances such that each of them could be implemented in reality.

For each instance, an initial solution taking no actions, which is computed using Algorithm 6, is given to the MIP. As termination criterion, a 3% MIP gap is used for the hilly region, and a 5% MIP gap for the flat regions. Furthermore, a time limit of 24 h is set. All computations in this section were executed using Gurobi 9.5.0 on a server with 32 AMD EPYC 7542 processors (2.9GHz). The most important characteristics of the instances together with the results and the running times are provided in Table 3.
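For readers less familiar with the termination criterion, the sketch below spells out the relative MIP gap as it is commonly reported by MIP solvers, together with the region-specific limits stated above. The function names and the dictionary layout are assumptions made for illustration; in practice, the solver evaluates this criterion internally.

```python
# Gap limits and time limit as stated in the text.
GAP_LIMIT = {"HR": 0.03, "FR1": 0.05, "FR2": 0.05}
TIME_LIMIT_SECONDS = 24 * 3600


def relative_mip_gap(best_bound: float, incumbent: float) -> float:
    """Relative gap |bound - incumbent| / |incumbent|."""
    if incumbent == 0:
        return 0.0 if best_bound == 0 else float("inf")
    return abs(best_bound - incumbent) / abs(incumbent)


def should_stop(region: str, best_bound: float, incumbent: float,
                elapsed_seconds: float) -> bool:
    """Stop once the gap limit of the region or the time limit is reached."""
    gap_reached = relative_mip_gap(best_bound, incumbent) <= GAP_LIMIT[region] + 1e-12
    return gap_reached or elapsed_seconds >= TIME_LIMIT_SECONDS
```

For a minimization problem, `best_bound` is the lower bound and `incumbent` is the objective value of the best solution found so far.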

Table 3 Computational results for nine representative instances. The column “HM” (Hilliness Measure) contains the median of the values obtained by dividing the slope of each arc by the Euclidean distance of the centers of its incident nodes, which is a measure of how hilly the terrain is. The column “FSF” contains the time until the final solution is found. The column “IOV” contains the objective value of the initial solution of the MIP provided by applying Algorithm 6. The column “BOV” contains the objective value of the best solution returned by the MIP. Note that the different values of \(|V_{\text {red}}|\) among instances with the same region result from different merging of nodes during preprocessing due to different rain events

In general, it is found that the maximum possible number of actions is taken in each of the nine instances. It is worth noting that there are instances, which are not presented here, where this is not the case. Possible reasons for not taking an action although it would be possible are (1) an overabundance of possible actions and a large budget such that the action that is not taken does not contribute to the quality of the solution anymore and (2) a poor-quality location of the action such that the action does not protect any buildings.

Among the 42 actions that are selected by the MIP in the presented instances, 40 are basins, while only two are ditches or embankments. This confirms observations made on numerous real-world instances indicating that retention basins, if they can be built, are usually the most efficient actions. However, building retention basins requires free space, which is not always available, especially in densely populated urban areas. Embankments and ditches are usually built together with a retention basin such that the actions overlap geographically. An intuitive explanation for this behavior of the MIP is that retention basins act as a storage for the water, while ditches or embankments connect the inflow from a larger area to the basins.

Another interesting finding is that, in hilly regions, the most efficient retention basins, i.e., the ones that are typically selected by the MIP, are often low-lying and located centrally. As an illustration, in the three scenarios considered for HR, 10 of the 14 selected basins are low-lying and located centrally.

Although the rain volume is significantly higher in the instances modeling the 50-year rain events, it is found that neither the budget nor the rain volume dramatically change the set of taken actions. Among the 12 actions taken in the three considered instances with a 50-year rain event and four possible actions per instance, eight are also taken in the corresponding instances with a 30-year rain event. Furthermore, among the 12 actions taken in the three instances with a 50-year rain event and six possible actions per instance, nine are also taken in the corresponding instances with four possible actions.

In general, the running times show high fluctuations as can be seen in FR1, where the instance with the 30-year rain event takes significantly less time to solve than the instances with the 50-year rain event. Still, several factors influencing the running time and the quality of the obtained solutions can be identified.

Concerning the running time, the most important factor is the number of nodes in the graph (i.e., \(|V_{\text {red}}|\)). The instances of FR1 with a 50-year rain event (FR1 being the region with the largest number of nodes) are the only instances in our experiments on which the MIP gap could not be closed before reaching the time limit, whereas all other instances were solved in less than 7 h.

The second most important factor is the hilliness of the terrain surface. Comparing HR and FR2, the instances for HR are solved significantly faster than the instances for FR2 despite the graph for FR2 being slightly smaller than the graph for HR. Further, in hillier regions, the MIP tends to be numerically more stable.

Although the parameter “MIPFocus” is set to 2 and the parameter “Heuristics” is set to 0.01, which both enforce a stronger attention on improving the lower bound, the final solution is usually found relatively quickly, and the solver spends a significant part of the overall running time on improving the lower bound afterwards, as can be seen when comparing the values in the columns “Running Time” and “FSF” (final solution found) in Table 3. Without tuning the parameters accordingly, the running time increases drastically since the solver struggles to close the MIP gap.

Concerning the quality of the solutions, we observe that hillier regions usually have more damage potential overall. To illustrate this, we compare HR to FR1, which have almost the same number of buildings. However, the objective values of both the initial solution and the solution returned by the MIP are more than twice as large in HR as in FR1. This is due to two reasons. Firstly, hilly regions have heavier rainfalls than flat regions due to orographic precipitation [30]. In our example, a 30-year rain event in HR has a precipitation level of 44.9 mm whereas a 30-year rain event in FR1 has a precipitation level of only 35.9 mm. Secondly, hilly regions tend to have a larger drainage area, which can also be seen when comparing the total areas of HR and FR1. Indeed, the difference in the total areas results almost entirely from a higher number of water-dispensing nodes in instances of HR.

Lastly, the density of the buildings (i.e., the number of buildings per area) affects the potential of how much better the solution returned by the MIP can be compared to the initial solution (i.e., the solution where no actions are taken). In our case, there is a significantly higher density of buildings in FR2 than there is in HR and FR1. We see that the difference between the objective values of the initial solution and the obtained solution from the MIP is considerably higher in FR2 than it is in HR and FR1. This stems from the fact that, if a high water level at a critical location is prevented by an action, the action protects more buildings in FR2 than it does in the other two regions. It is worth noting, however, that there are other instances with a high density of buildings where planning impactful actions becomes hard due to lack of space. In such cases, a high density of buildings can decrease the potential for damage reduction by taking actions significantly.

5 Conclusion

To the best of our knowledge, the web application AKUT is the first software that uses optimization techniques to support decision-making in planning precautionary measures for flash floods caused by heavy rain events and scales well enough to be applied to realistic scenarios. It is currently used by more than 25 organizations from all over Germany. The usage of optimization techniques in this context has evidently provided valuable support in handling a challenging and highly topical task for various organizations like municipalities, engineering offices, and research institutes.

A mixed-integer program is used to minimize the damage in the case of a heavy rain event by taking best-possible actions subject to a limited budget and constraints on the cooperation of residents. To model the terrain surface, a grid graph obtained from a digital terrain model is transformed by several preprocessing methods such that the cardinality of its node set becomes small enough to apply the previously mentioned mixed-integer program while still maintaining a realistic representation of the terrain surface. Comparisons with results from established software provide strong evidence that solutions obtained from our approach yield realistic results.

When applying the software to large cities, these must currently be subdivided into several parts for performance reasons. Hence, an interesting question would be how the performance of our approach could further be improved such that it can handle larger instances. As the problem decomposes into several smaller subproblems, using decomposition methods could be a promising approach. Additionally, even though an efficiently implemented combinatorial algorithm (Algorithm 6) is already used to generate an initial feasible solution of our MIP, another possible approach for improving the running times of the model could be to implement callbacks that use this algorithm to compute new solutions at later stages of the branch-and-bound process.