1 Introduction

Around half of the world’s population lives in urban areas; identifying flooding risk hotspots is therefore arguably a fundamental preliminary step in urban planning and risk management. The flooding risk hotspots are defined as zones that are relatively likely to be exposed to flooding; in other words, they can be seen as areas in which a high probability of occurrence of flooding (high hazard) coincides with high exposure (Jalayer et al. 2014). It is important to underline that the delineation of hotspots cannot replace a comprehensive evaluation of flooding risk based on accurate hazard and vulnerability assessment. Rather, it is an effective screening tool for identifying the zones for which a detailed and comprehensive risk assessment is required [as proposed for example in De Risi et al. (2013)]. The procedure applied in this paper delineates the flooding risk hotspots by overlaying zones of high flooding susceptibility (as a proxy for zones with high flooding hazard) and zones of high exposure to flooding risk. The flooding susceptibility has been assessed through the Topographic Wetness Index (TWI) (Kirkby 1975; Qin et al. 2011). The TWI has a purely topographic interpretation (it measures the capability of the land surface to accumulate water, based on the local slope and the upslope contributing area, both derived from elevation data) and is quite straightforward to calculate in a GIS environment for very large areal extents. This index allows for the delineation of the portion of a hydrographic basin potentially exposed to flood inundation, by identifying all the areas characterized by a TWI that exceeds a given threshold; this threshold is strictly dependent on the geomorphology of the hydrographic basin. The threshold value can be calibrated based on the results of a detailed delineation of the inundation profile for selected zones (Motevalli and Vafakhah 2016; Jalayer et al. 2014; Degiorgis et al. 2012; Manfreda et al. 2008, 2014, 2015).
High exposure areas are identified by using a map of urban morphology types (UMTs), from which residential UMTs are extracted, and a geo-spatial census dataset for demographic information (e.g. population density).

A UMT map is a special form of land-use map which combines aspects of urban form (e.g. the general structure and composition of the built environment, green areas, etc.) with aspects of urban function (e.g. residential/commercial uses or community services for the built environment, agriculture or recreation for the green areas, etc.) (Cavan et al. 2012; Gill et al. 2008; Pauleit and Duhme 2000). UMTs have characteristic physical features (like land cover) and are distinguished based on the human activities they accommodate (i.e. land uses). Since mapped UMTs take account of both physical properties and human activities as the key factors for characterizing urban areas (e.g. residential areas, transportation areas, bare ground, etc.), they can be helpful for representing environmental phenomena and related exposures. For example, they have already been used to estimate land surface temperature patterns in African urban areas (Cavan et al. 2014). The UMTs and associated mapped units provide a meso-scale perspective (i.e. between the city level and that of individual land parcels) making a suitable basis for the spatial analysis of cities.

The overlay of land cover and flood-prone area maps has been done in various research efforts [see for example Meyer and Messner (2005) or Gwilliam et al. (2006)]. In a traditional approach towards delineating the urban flooding hotspots, zones of high exposure to flooding are overlaid with areas that are known to be susceptible to flooding or flood-prone areas. These areas are usually identified based on available historical flooding data or by defining a buffer zone around the rivers [see Gall et al. (2007), Apel et al. (2009), Manfreda et al. (2014) or De Risi et al. (2015) for a comprehensive discussion on identification of flooding risk hotspots]. The novelty in the approach proposed herein lies in the use of Bayesian parameter estimation for the characterization of uncertainties in delineating the potentially flood-prone areas. More specifically, in two previous works (De Risi et al. 2014; Jalayer et al. 2014) the authors demonstrate how the potentially flood-prone areas can be delineated through a Maximum Likelihood Estimation (MLE) procedure applied to a spatial window at the micro-scale based on historical and calculated flooding extents, respectively. In this work, these two approaches are compared and integrated in a Bayesian framework. Both the calculated inundation map for a specific return period and the historical flooding extent have been used to calibrate the TWI threshold. Bayesian parameter updating is employed in order to estimate the TWI threshold, considering the historical extent as the prior and the information provided by the inundation map as the likelihood. In other words, information from the hydraulic calculations is integrated with the results obtained based on only the historical extent. For different statistics (i.e., maximum likelihood estimate, the lower and upper bounds for the maximum likelihood estimate, different percentiles) of the TWI threshold, the map of potentially flood-prone areas (i.e. areas highly susceptible to flooding) is overlaid with the map of residential urban morphology units (i.e. areas of high exposure) to delineate the residential flooding risk hotspots. By overlaying the resulting hotspots with a geo-spatial census dataset, it is possible to estimate the number of people potentially affected by flooding. This type of screening approach may be particularly valuable in data-poor locations (Samela et al. 2015, 2017) or places where conducting detailed flood risk assessments might be difficult for other reasons, such as rapid urban growth.

This research effort is conducted within the European FP7 project entitled Climate Change and Urban Vulnerability in Africa (CLUVA). The application presented in the paper reports the identification of urban flooding risk hotspots in the city of Ouagadougou (Burkina Faso), a city for which the observed spatial extent of a major flood event in 2009 (De Risi et al. 2014) and a calculated inundation map for a return period of 300 years are both available. The flooding hotspots are identified by overlaying the map of potentially flood-prone areas (TWI based) with the UMT map developed in the context of the CLUVA project for Ouagadougou (Lindley et al. 2015).

2 Procedure

The outlined procedure is composed of three parts: (a) delineation of flood-prone areas using the topographic wetness index (TWI), based on both inundation maps calculated for a specific return period and also information about the extent of historical flooding in the area of interest to calibrate the TWI threshold; (b) delineation of geographical functional units using the urban morphology types (UMT); (c) identification of urban hotspots by overlaying the TWI, UMT and census population density dataset. Below a detailed description of the procedure employed to define the flood-prone areas is presented.
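As a rough illustration of step (c), ahead of the detailed treatment of step (a) in the following subsections, the overlay can be sketched on a raster grid. This is only a minimal sketch under stated assumptions: all three layers are taken to be rasterised onto the same grid of equal-area cells, and the function name `hotspot_population`, the cell size and the toy values are hypothetical and not taken from the study.

```python
import numpy as np

def hotspot_population(twi, tau, residential, pop_density, cell_area_km2):
    """Overlay flood-prone cells (TWI > tau) with residential cells and count people.

    Assumes all inputs share one grid of equal-area cells (hypothetical setup).
    """
    flood_prone = np.asarray(twi, dtype=float) > tau
    hotspots = flood_prone & np.asarray(residential, dtype=bool)
    # People in hotspots = sum of (density x cell area) over hotspot cells.
    density = np.asarray(pop_density, dtype=float)
    people = float(density[hotspots].sum() * cell_area_km2)
    return hotspots, people

# Toy 2 x 2 grid with made-up values.
twi = np.array([[12.0, 21.0], [22.0, 19.0]])
residential = np.array([[True, True], [False, True]])
density = np.array([[500.0, 2000.0], [0.0, 1000.0]])  # people per km2, made up
hotspots, people = hotspot_population(twi, 20.0, residential, density, cell_area_km2=0.01)
```

Only the cell that is both residential and above the threshold is flagged, so the population estimate depends directly on the chosen value of τ.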

2.1 Delineation of flood-prone areas using the TWI

The topographic wetness index, initially introduced by Kirkby (1975), has been shown to be strongly correlated to the area exposed to flood inundation (Manfreda et al. 2007, 2008, 2011). The TWI for a given point O within the hydrographic basin is calculated as follows:

$$TWI = \log \left( {\frac{{A_{S} }}{\tan \beta }} \right)$$
(1)

where \(A_{S}\) is the specific catchment area expressed in meters and calculated as the local up-slope area draining through point O per unit contour length (A/L); β is the local slope at the point in question expressed in degrees.
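A minimal sketch of Eq. (1) on gridded inputs is given below. It assumes that the logarithm is the natural logarithm (the usual convention for the TWI, not stated explicitly here) and that flat cells are handled by clipping the slope from below; the function name and the `min_slope_deg` parameter are hypothetical. Note that the slope, given in degrees, must be converted to radians before taking the tangent.

```python
import numpy as np

def topographic_wetness_index(spec_catchment_area, slope_deg, min_slope_deg=0.1):
    """TWI = ln(A_S / tan(beta)), evaluated cell by cell (Eq. 1).

    spec_catchment_area: specific catchment area A_S in meters.
    slope_deg: local slope beta in degrees.
    min_slope_deg: hypothetical lower bound that avoids division by zero on flat cells.
    """
    a_s = np.asarray(spec_catchment_area, dtype=float)
    slope = np.maximum(np.asarray(slope_deg, dtype=float), min_slope_deg)
    beta = np.radians(slope)  # np.tan expects radians
    return np.log(a_s / np.tan(beta))

# Two cells with the same slope but different contributing areas.
twi = topographic_wetness_index([100.0, 10000.0], [1.0, 1.0])
```

For a fixed slope, a larger specific catchment area gives a larger TWI, which is why the highest values tend to cluster along drainage lines.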

The TWI allows for the delineation of a portion of a hydrographic basin potentially exposed to flood inundation (referred to herein as flood-prone or more briefly as FP) by identifying all the areas characterized by a topographic index that exceeds a given threshold. Therefore, the delineation of flood-prone areas is strictly dependent on the TWI threshold, if the urban territory can be characterized by a single threshold value. The following sections describe how the likelihood function for the TWI threshold is calculated based on the historical spatial extent for a selected zone of interest within the basin.

2.2 Maximum likelihood estimation of TWI threshold

The maximum likelihood procedure employed for calibrating the TWI threshold is described in detail in Jalayer et al. (2014) and De Risi et al. (2014).

Let W represent the spatial window of a zone of interest (within the basin) containing historical flooding extent information. Moreover, let FP represent the flood-prone areas identified as TWI > τ (where τ is the TWI threshold) and IN represent the inundated areas based on calculated inundation profile or historical information. Figure 1a illustrates in a schematic manner W and the portions identified as FP and IN.

Fig. 1 A schematic representation of the spatial window of reference (W): (a) flood-prone (FP) and inundated (IN) areas; (b) FP and IN (yellow), and not FP and not IN (green)

The probability of the correct delineation of flood-prone areas, or the likelihood function for the TWI threshold τ, denoted as L(τ|W), can be calculated for various values of τ as follows:

$$\begin{aligned} L\left( {\tau |W} \right) = & P(FP,IN|\tau ,W) + P\left( {\overline{IN} ,\overline{FP} |\tau ,W} \right) \\ = & P\left( {FP|\tau ,W} \right) \cdot P\left( {IN|FP,\tau ,W} \right) + P\left( {\overline{FP} |\tau ,W} \right) \cdot P\left( {\overline{IN} |\overline{FP} ,\tau ,W} \right) \\ \end{aligned}$$
(2)

where P(FP, IN|τ,W) denotes the probability that a given point within zone W is identified both as flood-prone FP (using the TWI method) and inundated IN (based on additional information such as the historical flooding spatial extent or calculated inundation maps), and conditioned on (the | sign) a given value of τ of the TWI threshold. The area FP and IN is indicated by yellow color in Fig. 1b. Similarly, P(\(\overline{FP} ,\overline{IN} |\tau ,{\text{W}}\)) denotes the probability that a given point within the zone of interest is neither identified as FP nor as IN conditioned on a given value of τ of the TWI threshold. The area not FP and not IN is indicated by green color in Fig. 1b. In Eq. (2), the terms P(\(FP,IN|\tau ,W\)) and P(\(\overline{FP} ,\overline{IN} |\tau ,{\text{W}}\)) are further expanded using the probability theory product rule (Jaynes 2003).
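The evaluation of Eq. (2) over a set of candidate thresholds can be sketched as follows. This is a simplified illustration, assuming a raster of equal-area cells within W so that probabilities reduce to cell fractions, and assuming that both terms are estimated over the same window; the function name and the toy data are hypothetical.

```python
import numpy as np

def twi_threshold_likelihood(twi, inundated, thresholds):
    """L(tau|W) = P(FP, IN) + P(not FP, not IN) for each candidate tau (Eq. 2)."""
    twi = np.asarray(twi, dtype=float).ravel()
    inundated = np.asarray(inundated, dtype=bool).ravel()
    likelihood = []
    for tau in thresholds:
        flood_prone = twi > tau                          # FP: cells with TWI above tau
        p_fp_in = np.mean(flood_prone & inundated)       # P(FP, IN | tau, W)
        p_neither = np.mean(~flood_prone & ~inundated)   # P(not FP, not IN | tau, W)
        likelihood.append(p_fp_in + p_neither)
    return np.array(likelihood)

# Toy window of six cells; the two wettest cells are the inundated ones.
twi_cells = np.array([8.0, 12.0, 15.0, 19.0, 21.0, 24.0])
in_cells = np.array([False, False, False, False, True, True])
taus = np.array([10.0, 15.0, 20.0, 25.0])
lik = twi_threshold_likelihood(twi_cells, in_cells, taus)
tau_ml = taus[np.argmax(lik)]
```

Under these assumptions the likelihood is simply the fraction of cells on which the FP and IN labels agree, so it peaks at the threshold that best separates inundated from non-inundated cells.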

2.3 Using Bayesian parameter estimation to calibrate τ based on more than one source of information

In this section, the maximum likelihood calibration of the TWI threshold is extended into a more general framework. Suppose that some background information \(I_{1}\) (e.g., historical flooding extent, inundation map for another spatial window, etc.) is available on the value of the TWI threshold τ. In such cases, the TWI threshold can be calibrated by employing a Bayesian parameter estimation procedure, where the available background information \(I_{1}\) is represented by a prior probability distribution. If no specific background information is available (beyond that provided by the specific data), a “non-informative” prior can be adopted (e.g., the uniform distribution). Although a sensitivity analysis is quite useful for determining the role of the prior, the choice of prior herein is made based on available background information. Moreover, the use of non-parametric distributions precluded the use of sensitivity analysis for optimal choice of the prior among several feasible options (i.e., probability models).

Now suppose that the information (spatial extents from past flooding events, inundation maps, etc.) coming from the spatial window under consideration is represented by \(I_{2}\). Both \(I_{1}\) and \(I_{2}\) are statements that describe the available information, which is a simple and effective way of representing hybrid information sources. The posterior probability distribution for τ given the information provided by \(I_{1}\) and \(I_{2}\) can be estimated as:

$$p(\tau |I_{1} ,I_{2} ) = \frac{{L(\tau |I_{2} )p(\tau |I_{1} )}}{{\sum\limits_{\forall \tau } {L(\tau |I_{2} )p(\tau |I_{1} )} }}$$
(3)

where \(p(\tau |I_{1} ,I_{2})\) denotes the posterior probability distribution for τ; \(L(\tau |I_{2})\) is the likelihood function for τ (calculated as per Eq. 2; strictly speaking, this term is also conditioned on \(I_{1}\), but this dependence is not considered herein); and \(p(\tau |I_{1})\) is the prior probability distribution for τ. Note that \(p(\tau |I_{1})\) is the posterior distribution resulting from the previous application of the procedure; that is, before having the additional information \(I_{2}\). It is important to underline that the approach presented in this work is based on only one parameter (i.e. τ), so the posterior distribution for the TWI threshold (Eq. 3) can be obtained in a reasonable computational time. Therefore, the application of advanced simulation schemes such as Markov Chain Monte Carlo (MCMC) is not necessary herein. Moreover, to keep the procedure tractable and easy to implement, exclusively non-parametric or “empirical” distributions are employed herein (except for the non-informative uniform prior distribution adopted when no specific background information was available).

Note that Eq. (3) is particularly useful for calculating the threshold τ when historical flooding spatial extents are available for more than one spatial window within the basin (these could also correspond to different flooding events). The formulation presented in Eq. (3) can also be extended to cases where information from more than two spatial windows is available. Another specific case is when both the calculated inundation maps and historical flood extents are available. In this case, the posterior probability calculated based on, for example, the historical extent (\(I_{1}\)) can be used as the prior probability distribution, and the calculated inundation map (\(I_{2}\)) can be used to compute the new likelihood function and hence the updated posterior distribution. The advantage of using the Bayesian framework herein lies in its versatility in integrating different types of information available for different spatial windows within the zone of interest.
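Because τ is the only parameter, Eq. (3) can be evaluated directly on a discrete grid of candidate thresholds. The sketch below illustrates the two-stage use described above; the function name `bayesian_update` and all likelihood values are made up for illustration and are not results from the case study.

```python
import numpy as np

def bayesian_update(prior, likelihood):
    """Discrete posterior: likelihood times prior, normalised over the tau grid (Eq. 3)."""
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    unnormalised = likelihood * prior
    return unnormalised / unnormalised.sum()

taus = np.array([18.0, 19.0, 20.0, 21.0])
lik_historical = np.array([0.70, 0.80, 0.90, 0.85])  # made-up likelihood from I1
lik_hydraulic = np.array([0.60, 0.85, 0.95, 0.80])   # made-up likelihood from I2

uniform = np.full(taus.size, 1.0 / taus.size)        # non-informative prior
prior = bayesian_update(uniform, lik_historical)     # posterior after I1 alone
posterior = bayesian_update(prior, lik_hydraulic)    # posterior after I1 and I2
```

With a uniform prior, the first update simply rescales the likelihood, and when the dependence between the two sources is neglected the final posterior does not depend on the order in which they are introduced.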

2.4 Quantitative measure of the error in TWI method

To quantitatively evaluate the accuracy of the TWI method (with respect to more precise methods or data) in identifying the potentially flood-prone areas, two alternative indicators are defined: the false negative indicator Fneg, and the false positive indicator Fpos. These two indicators essentially quantify the potential of the procedure for issuing a “false alarm”. That is, they provide quantified estimates of the probability that the procedure erroneously indicates a given areal window as flood-prone or as not flood-prone; these are referred to as the false positive and false negative alarms, respectively.

The index measuring the false negative error in the identification of flood-prone areas Fneg represents the probability that a given zone is identified as inundated or IN according to the hydraulic model/historical inundation extent given that it is identified as not flood-prone (\(\overline{FP}\)) according to the approximate TWI method:

$$\,F_{neg} = P(IN|\overline{FP} ) = \frac{{A_{W} (IN,\overline{FP} )}}{{A_{W} (\overline{FP} )}}$$
(4)

Note that this term measures the probability that the TWI method “misses” an inundated area and is particularly important in the current procedure. The closer the Fneg ratio is to zero, the smaller is the error due to false negative identification of the flood-prone contour.

The second index Fpos represents the probability that a given zone is identified as not inundated according to the hydraulic model/historical flooding extent (\(\overline{IN}\)) given that it is identified as flood-prone according to the approximate TWI method (FP).

$$F_{pos} = P(\overline{IN} |\text{FP}) = \frac{{A_{W} (\overline{IN} \text{,FP})}}{{A_{W} (\text{FP})}}$$
(5)

This term quantifies the degree of conservatism in the TWI method, measured by the areal extent of the zones that are not inundated but are indicated as flood-prone by the TWI method. The closer the Fpos ratio is to zero, the smaller is the error due to false positive identification of the flood-prone areas.
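Equations (4) and (5) can be illustrated with boolean masks, again assuming equal-area cells so that area ratios become cell-count ratios. The function name and the toy masks are hypothetical.

```python
import numpy as np

def false_alarm_indices(flood_prone, inundated):
    """Return (F_neg, F_pos) as defined in Eqs. (4) and (5)."""
    fp = np.asarray(flood_prone, dtype=bool)
    inn = np.asarray(inundated, dtype=bool)
    n_not_fp = np.count_nonzero(~fp)
    n_fp = np.count_nonzero(fp)
    # F_neg = A(IN, not FP) / A(not FP); undefined if every cell is flood-prone.
    f_neg = np.count_nonzero(inn & ~fp) / n_not_fp if n_not_fp else float("nan")
    # F_pos = A(not IN, FP) / A(FP); undefined if no cell is flood-prone.
    f_pos = np.count_nonzero(~inn & fp) / n_fp if n_fp else float("nan")
    return f_neg, f_pos

# Toy window of eight cells.
fp_mask = np.array([True, True, True, False, False, False, False, False])
in_mask = np.array([True, True, False, True, False, False, False, False])
f_neg, f_pos = false_alarm_indices(fp_mask, in_mask)
```

The two indices have different denominators, so a small flood-prone area can produce a large Fpos even when only a few cells are misclassified.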

3 Case study: Ouagadougou

Ouagadougou (12°21′N 1°32′W) is the capital city of Burkina Faso, located within the watershed of the Massili river (a tributary of the Nakambé river) in the West African Sahel region (John et al. 2012). The hydrological characteristics of the city include four backwater channels feeding four major reservoirs which provide rainwater drainage, storage and water supply functions. On September 1, 2009, an extreme rainfall event hit the city and resulted in widespread damage (destruction of buildings and infrastructure) and displacement of population (Reliefweb 2009). More than 25 cm of rain fell in 12 h, turning the streets of Ouagadougou into fast-flowing rivers. The city’s infrastructure was severely affected as the floods cut off electricity, fresh water and fuel supplies. The city is used to heavy seasonal rainfall (700 mm of rain between May and September) but this was the worst flooding in 50 years. An estimated 109,000 people were affected (Giugni et al. 2012). In the Bayesian framework described in this paper, the areal extent of this flood event is used to delineate the flooding risk hotspots for Ouagadougou.

3.1 Delineation of flood-prone areas for Ouagadougou using the topographic wetness index (TWI)

The TWI map is constructed in a GIS environment based on the DEM of the city (Fig. 2a, resolution: 3 m). Figure 2b illustrates the resulting TWI map for Ouagadougou. It can be observed that the TWI values vary between about 8 and 25; largest TWI values can be found around the natural water channels.

Fig. 2 (a) The DEM of Ouagadougou; (b) the TWI map and the stream links

3.2 The inundation spatial extents

The TWI threshold for Ouagadougou has been calibrated based on two different datasets. The first dataset consists of the inundated area of the 2009 flooding event (Fig. 3a). This spatial extent has been obtained from the information available online (http://www.mapaction.org/map-catalogue/mapdetail/1719.html). As depicted in Fig. 3a, two inundated areas have been identified.

Fig. 3 (a) Inundated areas after the 2009 events, A1 (the larger area used in this study) and A2 (the smaller area); (b) the inundation map in terms of hmax in meters for a return period of 300 years (the catchment area is also shown in the figure)

In this paper, the areal extent of A1 has been employed. The second dataset consists of the calculated inundation map for a return period of 300 years (Fig. 3b). The inundation profile has been calculated by two-dimensional simulation of flood volume propagation using the software FLO-2D (FLO-2D 2004; O’Brien et al. 1993) based on historical rainfall records and by employing the Curve Number Method (SCS 1972) to calculate the hydrograph. A detailed step-by-step description of the hydrologic-hydraulic routine carried out can be found in De Risi (2013) and De Risi et al. (2013).

The DEM has been used to identify the catchment area (drainage area 453.4 km², length of main channel 8.9 km, average slope 3.4%, average elevation 313.5 m). The catchment area is characterized by a curve number equal to 75.5 based on the SCS (1972) classification. The analyses use a simulation time of 60 h. The calculations are performed for a return period of 300 years. It seems that the spatial extent of inundated areas is not significantly affected by the return period and depends instead on the geomorphology of the area (Jalayer et al. 2014); therefore, the maximum return period that is used in the practical application is adopted herein. On the other hand, the flood depth values are quite sensitive to the return period. The inundation map is reported in terms of the maximum flood depth (hmax), as is shown in Fig. 3b.
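For orientation, the rainfall-runoff relation underlying the Curve Number Method can be sketched as follows. This is the standard SCS relation for direct runoff depth, not the full hydrograph routine of De Risi et al. (2013). The function name, the conventional initial-abstraction ratio of 0.2 and the 250 mm rainfall depth (borrowed from the description of the 2009 event purely for illustration) are assumptions rather than values reported for the simulation.

```python
def scs_runoff_depth_mm(rainfall_mm, curve_number, ia_ratio=0.2):
    """Direct runoff depth (mm) from the SCS Curve Number relation."""
    s = 25400.0 / curve_number - 254.0   # potential maximum retention S, in mm
    ia = ia_ratio * s                    # initial abstraction, conventionally 0.2 S
    if rainfall_mm <= ia:
        return 0.0                       # all rainfall is abstracted, no runoff
    return (rainfall_mm - ia) ** 2 / (rainfall_mm - ia + s)

# Curve number from the text; the rainfall depth is an illustrative assumption.
runoff = scs_runoff_depth_mm(250.0, 75.5)
```

With a curve number of 75.5, the retention parameter S is roughly 82 mm, so for a storm of this size most of the rainfall is converted into direct runoff.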

3.3 Maximum likelihood estimation of the TWI threshold

This section demonstrates how the procedure described above can be applied to calculate the likelihood of a correct identification of the flood-prone areas as a function of the TWI threshold.

A maximum likelihood parameter estimation is carried out based on two different data sources (using non-informative uniform prior distributions in both cases): (1) historical flooding spatial extent and (2) calculated inundation profile. These are subsequently referred to as Case 1 and Case 2, respectively. Both Case 1 and Case 2 are calculated for the same spatial window. In the next step, a Bayesian parameter estimation is adopted to integrate the information about historical flooding spatial extent (as a non-parametric prior information) with the calculated inundation maps (Case 3). Note that in the special case treated herein both sources of information correspond to the same spatial window. Nevertheless, the adopted Bayesian procedure is applicable also to the more general case where the alternative sources of information belong to various spatial windows. The spatial window W that contains the inundated area has an extent of about 156 km². The total area of the Ouagadougou administrative area is equal to around \(A_{\text{urban}}\) = 3046 km², which is taken as the spatial extent of the city for the purposes of this analysis.

3.4 First stage: maximum likelihood estimation for two data sets separately

The results for Case 1 and for Case 2 are reported in Figs. 4 and 5, respectively. In these cases and in the absence of prior information, the TWI is calibrated based on the maximum likelihood method described earlier. In other words, the Bayesian procedure introduced in Eq. (3) for a given spatial window W with a uniform prior distribution for τ, reduces to a maximum likelihood estimation procedure (Eq. 2). For all the possible values of τ, the probability that a given zone is flood-prone, denoted by P(FP|W,τ), is estimated as the ratio of the flood-prone areal extent within the city delineated by the threshold τ to the total area of the city administrative boundary and is plotted as red lines in Figs. 4a and 5a. Moreover, the probability P(IN|W,FP,τ) that a given point is IN given that it is already indicated as FP for a given value of τ, is estimated in window W as the ratio of areal extent marked as both IN and FP and the areal extent marked as FP. This quantity is plotted as grey stars in Fig. 4a for Case 1, and in Fig. 5a for Case 2. The probability P(FP,IN|W,τ) that a given point is indicated both as flood-prone FP (by the TWI method) and inundated IN (based on the historical extent in Case 1 and inundation map in Case 2) is calculated as the product of P(FP|W,τ) and P(IN|W,FP,τ) and is plotted as blue circles in Figs. 4a and 5a for Cases 1 and 2, respectively. In a similar manner, the probability \(P(\overline{FP} |W,\tau )\) that a given point is not indicated as flood-prone (based on the TWI method) is calculated as the complementary probability of being flood-prone P(FP|W,τ) and is plotted as red lines in Figs. 4b and 5b. The probability that a given zone is not indicated as inundated given that it is not flood-prone, for a given value of τ, is plotted as the grey stars in Figs. 4b and 5b. Finally, the probability \(P(\overline{IN} ,\overline{FP} |W,\tau )\) that a given point is not inundated and not flood-prone, for a given value of τ, is calculated as the product of \(P(\overline{FP} |W,\tau )\) and \(P(\overline{IN} |\overline{FP} ,W,\tau )\) and is plotted as the blue circles in Figs. 4b and 5b.

Fig. 4 (a) Probability of being FP and IN given τ; (b) probability of being \(\overline{FP}\) and \(\overline{IN}\) given τ (for the historical flood extent, Case 1)

Fig. 5 (a) Probability of being FP and IN given τ; (b) probability of being \(\overline{FP}\) and \(\overline{IN}\) given τ (for the calculated inundation profile, Case 2)

The likelihood function for threshold τ is then calculated from Eq. (2) by summing up the probability of being flood-prone and inundated and the probability of not being flood-prone and not being inundated, for all possible τ values (i.e., summing up the curves illustrated by blue circles in Fig. 4a, b for the historical extent and Fig. 5a, b for the calculated inundation map).

Figure 6a, b illustrate the likelihood function L(τ|W) for τ based on historical flooding extent information (Case 1) and hydraulic calculated profile (Case 2), respectively.

Fig. 6 The likelihood function obtained based on information coming from (a) the historical extent (Case 1) and (b) the calculated profile (Case 2)

As mentioned before, the likelihood function L(τ|W) can also be interpreted as the posterior probability distribution function for τ, denoted p(τ|W), for the spatial window W based on a uniform prior distribution. Consequently, the maximum likelihood estimate for τ (i.e. the value that corresponds to the maximum likelihood) can be identified for both cases. Furthermore, by identifying the τ values corresponding to more than 99% of the maximum likelihood value, it is possible to define a maximum likelihood interval that varies between \(\tau_{\text{ML}}^{ - }\) and \(\tau_{\text{ML}}^{ + }\). That is, from a practical point of view, the information used for calibrating the TWI threshold leads to identifying a maximum likelihood interval for τ. Moreover, since the probability distribution is available, the 16th and 50th percentiles are also calculated and marked on the plots for Cases 1 and 2.
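The statistics described above can be extracted from a discretised likelihood curve as sketched below. The function name, the handling of ties and the toy curve are assumptions for illustration; in particular, percentiles are read from the cumulative sum of the normalised curve, and the 99% interval is taken as the range of τ values whose likelihood is at least 0.99 times the maximum.

```python
import numpy as np

def threshold_statistics(taus, likelihood, level=0.99, percentiles=(16, 50)):
    """Return tau_ML, the maximum likelihood interval and selected percentiles of tau."""
    taus = np.asarray(taus, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    tau_ml = taus[np.argmax(likelihood)]
    # Interval of tau values whose likelihood reaches `level` times the maximum.
    near_max = taus[likelihood >= level * likelihood.max()]
    interval = (near_max.min(), near_max.max())
    # Normalise to a discrete distribution (uniform prior) and accumulate.
    cdf = np.cumsum(likelihood / likelihood.sum())
    pct = {}
    for p in percentiles:
        idx = min(int(np.searchsorted(cdf, p / 100.0)), taus.size - 1)
        pct[p] = taus[idx]  # first tau at which the CDF reaches p percent
    return tau_ml, interval, pct

# Made-up likelihood curve on a coarse tau grid.
taus = np.array([17.0, 18.0, 19.0, 20.0, 21.0, 22.0])
lik = np.array([0.2, 0.5, 0.9, 1.0, 0.995, 0.4])
tau_ml, interval, pct = threshold_statistics(taus, lik)
```

On this toy curve the maximum likelihood estimate coincides with the 50th percentile, while the 16th percentile falls at a lower threshold and therefore delineates a larger flood-prone area.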

3.5 Second stage: Bayesian updating

In the next step, the posterior distribution for τ based on information coming from the historical flood extent (Fig. 6a) is used as the prior distribution \(p(\tau |I_{1})\) in Eq. (3) for calibrating the TWI threshold based on the calculated inundation map \(I_{2}\) for the same spatial window. Figure 7 illustrates the posterior probability distribution \(p(\tau |I_{2} ,I_{1})\) for τ (the black squares) overlaid on the prior probability distribution \(p(\tau |I_{1})\) (the grey squares). Also in this case, the maximum likelihood estimate, the 99% maximum likelihood interval and the 16th and 50th percentiles are calculated and reported on the graph.

Fig. 7 Estimation of the posterior probability distribution for τ (Case 3). The black squares are the posterior distribution \(p(\tau |I_{2} ,I_{1})\) (based on both the historical flooding extent and the calculated inundation map) and the grey squares are the prior distribution \(p(\tau |I_{1})\) (based on only the historical flooding extent)

3.6 Synthesis of the numerical and graphical results

Table 1 reports the results for the different cases studied. Recall that Case 1 represents the TWI threshold calibration based on the historical flooding extent; Case 2 refers to the results obtained based on the calculated inundation maps; and Case 3 refers to the results obtained by employing Bayesian inference and based on both the historical spatial extent and the calculated inundation maps.

Table 1 The statistics for the TWI threshold distribution for the different cases

It can be seen from Table 1 that the maximum likelihood estimate is very similar for the three cases analyzed. Nevertheless, as demonstrated in Table 2, the dispersion (βτ, calculated as the difference between the logarithms of the 50th and 16th percentiles) and the width of the 99% maximum likelihood interval (∆τML) both decrease passing from Case 1 to Case 3. It is to be expected that the dispersion in the estimate for the TWI threshold reduces in this way as more information sources are used in the calibration process in Case 3. This is also a sign of the consistency and agreement between the information provided by the historical flooding extent and that provided by the hydraulic propagation (inundation maps) with regard to the maximum likelihood estimate of the TWI threshold.
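Written out explicitly, and assuming that the logarithm is the natural logarithm and that the interval width is the difference between the upper and lower bounds of the maximum likelihood interval, the two quantities reported in Table 2 read:

$$\beta_{\tau } = \ln \tau_{50} - \ln \tau_{16} = \ln \left( {\frac{{\tau_{50} }}{{\tau_{16} }}} \right),\qquad \Delta \tau_{\text{ML}} = \tau_{\text{ML}}^{ + } - \tau_{\text{ML}}^{ - }$$

Here \(\tau_{16}\) and \(\tau_{50}\) denote the 16th and 50th percentiles of the TWI threshold distribution.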

Table 2 The dispersion and maximum likelihood interval width

Table 3 reports the indices Fneg and Fpos that measure the false-negative and false-positive errors of the TWI method in the identification of potentially flood-prone areas, respectively. These indices are reported for the three cases and for all the alternative estimates of the TWI threshold; namely, τML, τ16, τ50, \(\tau_{\text{ML}}^{ - }\) and \(\tau_{\text{ML}}^{ + }\). In general, the indices related to the false-negative detection error are smaller than those related to the false-positive detection error. This is partly due to the fact that the flood-prone areal extent (the denominator of the Fpos term) is systematically (at least in this case study) smaller than the non-flood-prone areal extent (the denominator of the Fneg term). These two terms, apart from offering a quantitative measure of the TWI method’s error in the delineation of the flood-prone areas, also provide an indication of the most suitable estimate for the TWI threshold. It can be observed from the table that τML and \(\tau_{\text{ML}}^{ + }\) are associated with the smallest errors. Moreover, it can be observed that the TWI method is more precise when calibrated based on the (calculated) inundation maps than when calibrated based on the historically flooded areal extent. This seems reasonable as the extent of the historically inundated area might have been obtained with some reasonable degree of conservatism. This also explains why Cases 2 (calibrating based on the inundation map) and 3 (calibrating based on both the inundation map and the historical extent of the inundated areas) are associated with the smallest errors. For example, \(\tau_{\text{ML}}^{ + }\) is associated with around 5% false negative error and around 9.5% false positive error in Case 3. Herein, it has been chosen to adopt τML as the optimal TWI threshold estimate. It is associated with a false-negative error of around 3% and a false-positive error of around 19% in Case 3. It is worth mentioning that for the purpose of delineating the flooding risk hotspots, some conservatism in delineating the flood-prone areas was desirable (i.e., the rather large false-positive error term).

Table 3 The errors in FP delineation using TWI for the different cases

Finally, if one decides to use the 16th percentile estimate of the TWI threshold, the false negative error term reported in Table 3 is almost negligible, whereas the false positive error term is very large. This observation is consistent with the fact that, by using a 16th percentile estimate for the TWI threshold, one is making a very conservative estimate of the extent of the flood-prone areas. A visual check of the accuracy of the results can be performed by overlaying the inundated zones (delineated based on the past flooding event and the inundation map) and the TWI map thresholded at the maximum likelihood estimate τML, equal to 20 (i.e. TWI > τML), for both Cases 1 and 2.

Figure 8 illustrates the results of overlaying the 2009 flooding extent, the inundation map corresponding to TR = 300 years and the TWI map with a threshold equal to 20. It can be observed that the obtained FP map matches the IN map obtained by employing the hydraulic routine quite well and has a reasonable agreement with the 2009 flooding extent. This agrees with the calculated values for Fneg and Fpos (listed in Table 3) which are quantitative measures of the error in the TWI method.

Fig. 8 Overlay of FP and IN areas for the case study spatial window

3.7 UMT units for Ouagadougou city

The urban morphology types (UMT) for Ouagadougou are classified and delineated based on aerial photos acquired between 2009 and 2010. In Fig. 9, the high level UMT map for Ouagadougou, constructed within the CLUVA project (Lindley and Gill 2013), is shown.

Fig. 9 High level UMT map for Ouagadougou covering 3051 km² (2009–2010)

Around 65% of the city is covered by areas classified as agriculture or vegetation, and residential areas make up a further 23%. Focusing on the latter UMT category (Fig. 10a), five categories of dwellings can be distinguished, listed here with the percentage of Ouagadougou that they represent: (a) high class (1.7%), (b) medium class (3.8%), (c) low class (3.8%), (d) very low class (1.3%) and (e) scattered settlements (12.1%). As would be expected, the agricultural areas are associated with many scattered settlements. In fact, scattered settlements make up the single largest residential land category in the Ouagadougou area, around 52% of the total residential area. Rural settlements are generally associated with traditional adobe dwellings, although some settlements outside of the main urban core can belong to a mix of dwelling types (John et al. 2012).

Fig. 10 a Residential UMT; b–d urban residential flooding risk hotspots delineated for different TWI thresholds: b 16th percentile, c maximum likelihood, d 50th percentile

It is particularly noticeable that the periphery of the main urban core is surrounded by zones generally associated with what have been termed very low class dwellings. These areas, representing about 6% of the total residential area, consist mainly of adobe construction with no formal infrastructure. Formally planned areas of high class dwellings tend to be located towards the centre of the city. The largest zone of this dwelling type is situated between the airport and a large recreational area in the south of the city centre. The large area of housing construction in the south of the recreational area is associated with high class housing and administrative/commercial functions. The high class dwellings correspond to about 8% of the residential area. Medium and low class dwellings each make up around 17% of the total residential area. It is important to highlight that taking the entire Ouagadougou administrative area of 12 districts results in a rather low overall population density (around 62.2 people/km² in 2006). Population densities are much higher in the main urban core, i.e. in the 5 districts of the central Commune of Ouagadougou. The boundaries of these districts are subject to change, but the districts are reported to occupy around 520 km² (John et al. 2012). Boundary changes are affected by very high rates of urban and population growth fueled by strong rural–urban migration. This growth is expected to result in a population of 4.7 million by 2025, compared to only 110,000 in 1974 (John et al. 2012). Indeed, the bare ground UMT shown in Fig. 9 (0.8% of the Ouagadougou area) is specifically related to dwellings under construction in 2009/2010. The amount of housing construction is, however, expected to be larger due to informal construction activities, which tend to operate in a more piecemeal fashion.

3.8 Identification of flooding urban hotspots by overlaying the TWI, UMT and the population density

Three datasets, namely the TWI map of flood-prone areas, the urban morphology units classified as residential and the 2006 geospatial census dataset for Ouagadougou, have been overlaid to delineate the residential flooding risk hotspots and to estimate the exposure to flooding, expressed as the estimated number of affected people. Figure 10b–d illustrates the delineated urban hotspots, obtained by overlaying the residential UMT (Fig. 10a), the TWI maps corresponding to various threshold statistics (obtained for Case 3 in Table 1) and the geospatial dataset on population density obtained from the 2006 census. Table 4 reports the percentage of residential areas potentially affected by flooding (the areal extent of the hotspots represented in Fig. 10) with respect to the entire city area.
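The overlay of the three layers can be sketched on co-registered raster grids as shown below. This is a simplified illustration rather than the GIS workflow actually used: the function name `residential_hotspots`, the synthetic arrays and the cell size are hypothetical, and only the constant density of 62.2 people/km² is taken from the text.

```python
import numpy as np

def residential_hotspots(twi, residential, density, threshold, cell_area_km2):
    """Overlay the three layers; return (hotspot mask, area in km2, people affected)."""
    hotspot = (twi > threshold) & residential  # flood-prone AND residential cells
    area_km2 = float(hotspot.sum() * cell_area_km2)
    people = float((density * hotspot).sum() * cell_area_km2)
    return hotspot, area_km2, people

# Synthetic 2x3 rasters on a common grid (illustrative values only).
twi = np.array([[21.0, 19.0, 22.0],
                [18.0, 20.5, 23.0]])
residential = np.array([[True, True, False],
                        [False, True, True]])
density = np.full(twi.shape, 62.2)  # people per km2, constant as in the text
cell_area_km2 = 0.25                # hypothetical cell size

hotspot, area_km2, people = residential_hotspots(
    twi, residential, density, threshold=20.0, cell_area_km2=cell_area_km2
)

# Normalise by the population of the whole residential area.
residential_people = float((density * residential).sum() * cell_area_km2)
share_affected = people / residential_people
print(f"hotspot area = {area_km2:.2f} km2, people affected = {people:.1f}")
print(f"share of residential population affected = {share_affected:.0%}")
```

Repeating the call for each threshold statistic would give one hotspot map and one exposure estimate per threshold.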

Table 4 Exposure to flooding risk in terms of the estimated percentage of the residential area potentially affected by flooding with respect to the entire city

Table 5 reports the results in terms of the percentage of people potentially affected by flooding who live in the residential area (the estimated number of people affected by flooding normalized by the total population in the residential area), for different estimates of the TWI threshold.

Table 5 Exposure to flooding risk in terms of the estimated number of inhabitants potentially affected by flooding with respect to the number of inhabitants in the residential area

It can be observed from Tables 4 and 5 that the number of people potentially affected decreases monotonically as the TWI threshold increases.

Therefore, based on the invariance property of percentiles (Jaynes 2003) when the mapping function is strictly monotonic, the 16th percentile of the TWI threshold translates into the 84th percentile of the number of affected people and vice versa (under the strong assumption that the uncertainty in the TWI threshold is the only source of uncertainty in estimating the exposure to flooding). Finally, the pie chart shown in Fig. 11 illustrates the percentage breakdown of the areal extent of the residential flood hotspots (corresponding to τML equal to 20) in terms of the different residential sub-classes. The same pie chart also represents the percentage breakdown of the people affected by flooding, since a constant population density of 62.2 people/km² is adopted in the calculation. It can be observed that 51% of the population in the flood-prone residential areas lives in scattered settlements. This UMT sub-type is composed mainly of adobe structures that are particularly vulnerable to flooding.
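The percentile swap under a strictly decreasing mapping can be checked numerically. In the sketch below, both the sampled threshold distribution and the function `affected_people` are hypothetical stand-ins chosen only to be strictly decreasing; they are not the posterior distribution or the exposure curve of the case study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical samples of the uncertain TWI threshold (not the paper's posterior).
tau = rng.normal(loc=20.0, scale=1.5, size=10001)

def affected_people(threshold):
    """Hypothetical, strictly decreasing map from threshold to people affected."""
    return 1.0e5 * np.exp(-0.3 * (threshold - 15.0))

people = affected_people(tau)

# A low percentile of the threshold maps onto a high percentile of the exposure.
tau_16 = np.percentile(tau, 16)
tau_84 = np.percentile(tau, 84)
people_16 = np.percentile(people, 16)
people_84 = np.percentile(people, 84)

print(affected_people(tau_16), people_84)  # these two values should agree
print(affected_people(tau_84), people_16)  # and so should these
```

The agreement holds only because the mapping is strictly monotonic and the threshold is treated as the sole uncertain quantity, as stated above.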

Fig. 11 Breakdown of the residential flooding risk hotspots in terms of area and people affected by flooding

4 Summary and conclusion

Residential flooding risk hotspots in the city of Ouagadougou are delineated by overlaying three GIS-based datasets, namely the topographic wetness index (TWI), the urban morphology types (UMT) and a population density dataset. The flood-prone areas are identified with the TWI method as the areas whose TWI exceeds a certain threshold. This threshold can be calibrated based on available information, such as the historical flooding extent and/or inundation maps calculated for a certain area within the basin. A probabilistic GIS-based method, relying on Bayesian parameter estimation, is used to calculate the maximum likelihood estimate and the 16th and 50th percentiles of the TWI threshold based on both the calculated inundation maps and the areal extent of a previous flooding event for the same area within the basin.

The flood-prone areas delineated for various TWI threshold statistics (e.g. τML, τML+) are then overlaid with the UMT units identified as residential to identify the hotspots and the areal extent of the UMT units potentially affected by flooding. Integrating the population density as a geospatial dataset leads to an estimate of the number of people affected by flooding.

The Bayesian updating that incorporates both the inundation map and the historical flooding extent does not seem to change the maximum likelihood estimate significantly with respect to using only one of these sources, although it leads to smaller 99% maximum likelihood intervals and a smaller dispersion. In other words, the information provided by the inundation map and the areal extent of the 2009 flooding event seems to be consistent, at least in terms of the delineation of the inundated areas. However, this is to be expected, as the information used for calibrating the TWI threshold comes from the same spatial window within the basin.

The two measures of the false-negative and false-positive error in the identification of potentially flood-prone areas by the TWI method provide quantitative evidence of the (acceptable) level of approximation in the TWI method (at least in the calibration spatial window). It can be observed that the false-negative error term associated with the maximum likelihood estimate of the TWI threshold is generally very low in the spatial window used for calibration (between 3 and 9%). The false positive error term is larger than the false negative term (especially when the historical flooding extent is used as data) and testifies to the conservatism of the procedure in delineating the flood-prone areas. However, the false positive error term is also reduced (to less than 20%) when the TWI threshold is calibrated based on both the historical flooding extent and the (calculated) inundation maps. These error terms identify the maximum likelihood estimates τML and τML+ of the TWI threshold as the most suitable estimates to use, since both error terms associated with them are acceptably low.

Differences in exposure characteristics have been assessed for a range of different residential types. It can be observed that the scattered settlements constitute around 50% of the residential buildings that are located in the flood-prone areas in Ouagadougou.

It is estimated that between around 4% (50th percentile) and 18% (16th percentile) of the total area of Ouagadougou may be affected by flooding. Between 10% (50th percentile) and 40% (16th percentile) of the people who live in the residential areas may be affected by flooding. Hence, considering the uncertainty in the TWI threshold leads to considerable differences in the estimated exposure to flooding.

The application of the TWI method (with a fixed threshold) implicitly assumes that the entire area of interest can be characterized by the same TWI threshold. This hypothesis has been verified in a previous work (De Risi et al. 2014) for the specific case study, but in general it represents a limitation that warrants further investigation.

It is important to note that the TWI is calculated based only on topographic/contributing area information extracted from a digital elevation model, which is a limitation of the proposed procedure. In a natural basin, the topography of the area plays a very important role in flood routing (Pregnolato et al. 2016). In fact, the TWI as a proxy for delineating potentially flood-prone areas works best in a natural basin. Clearly, in an urbanized area, the buildings, the drainage system and the road infrastructure will also contribute to flood water routing and will help determine the actual flood extents observed. In such cases, the TWI will be a less precise means of delineating flood-prone areas, and additional variables should be considered (Tehrany et al. 2015; Jafarzadegan and Merwade 2017). This might entail fitting multivariate probabilistic models and using simulation-based approaches, such as Markov chain Monte Carlo simulation or machine learning procedures, to carry out the Bayesian inference. Nevertheless, considering that Ouagadougou does not have a very dense urban core and that the city has a large, relatively under-developed peri-urban area, using the TWI as a proxy for delineation of potentially flood-prone areas might still be reasonable. In general, the procedure might not be very accurate for small return periods, for which the effect of the urban sewer system is more significant. In any case, the TWI is considered herein as a proxy for delineation of flood-prone areas, with the objective of arriving at an approximate but efficient means of delineating the hotspots in terms of flooding risk. Once the hotspots are established, one can use more detailed hydraulic procedures that perform flood routing considering the presence of infrastructure, for limited areas rather than the whole city.

In short, TWI responds to the need for fast procedures for flood hazard zoning in urban areas where the exposed assets are usually concentrated. It is noteworthy that the statistical backbone of the proposed procedure is still valid if it is used with a proxy that is more suitable for urban areas.

Arguably, having data corresponding to more flooding events makes the probabilistic inference more robust. In fact, adopting Bayesian inference provides the possibility of incorporating hybrid sources of information (e.g., accurate hydraulic calculations made for various parts of the zone of interest and historical flooding extents corresponding to different events and different spatial windows). Furthermore, Bayesian inference makes it possible to incorporate such information adaptively and gradually (i.e., as it becomes available). As a result, the maps of flood-prone areas and the flooding risk hotspots can also be updated as more information becomes available. As the applications presented in this work illustrate, the posterior distribution for the TWI threshold can be updated adaptively. In such cases, the posterior distribution obtained in the previous application becomes the new informative prior. Using hybrid sources of information, such as inundation maps obtained from detailed hydraulic calculations and flooding extents corresponding to different historical events, helps to avoid over-fitting to a specific historical flooding event. In other words, the power of the Bayesian approach also lies in its versatility in incorporating hybrid sources of information and in the fact that it can be used adaptively to update the probability distributions as more data become available. Although there are some limitations, the results from this study provide a promising foundation for high level screening of flood risk hotspots prior to more detailed work. The specific results of this study may assist in planning the development of the Ouagadougou administrative zone, where flood risk data are limited and where there is a rapid pace of change.
In general, the proposed procedure is suitable for obtaining a zero-level map of flood-prone areas and delineating the flood risk hotspots at the urban scale, as the first part of a multi-level risk assessment approach.
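The adaptive updating described above, in which the posterior from one data source becomes the prior for the next, can be sketched on a discrete grid of candidate thresholds. The Gaussian-shaped likelihoods, their centres and spreads, and the helper names below are hypothetical placeholders and not the likelihood function used in this work; the sketch only illustrates the mechanics of sequential updating and the extraction of summary statistics.

```python
import numpy as np

tau_grid = np.linspace(10.0, 30.0, 401)  # candidate TWI thresholds

def normalise(weights):
    """Scale non-negative weights so that they sum to one."""
    return weights / weights.sum()

def gaussian_likelihood(grid, centre, spread):
    """Hypothetical stand-in for the likelihood of one data source."""
    return np.exp(-0.5 * ((grid - centre) / spread) ** 2)

def percentile_from_grid(grid, posterior, q):
    """Read the q-th percentile off the cumulative posterior on the grid."""
    cdf = np.cumsum(posterior)
    return float(grid[np.searchsorted(cdf, q / 100.0)])

def grid_std(grid, posterior):
    """Standard deviation of a discrete distribution defined on the grid."""
    mean = np.sum(grid * posterior)
    return float(np.sqrt(np.sum(posterior * (grid - mean) ** 2)))

prior = normalise(np.ones_like(tau_grid))  # flat prior
# First update, e.g. with an inundation map (hypothetical likelihood).
post_1 = normalise(prior * gaussian_likelihood(tau_grid, 20.5, 2.0))
# Second update: post_1 acts as the informative prior, e.g. for a historical extent.
post_2 = normalise(post_1 * gaussian_likelihood(tau_grid, 19.5, 2.0))

tau_mode = float(tau_grid[np.argmax(post_2)])  # equals the ML estimate under a flat prior
tau_16 = percentile_from_grid(tau_grid, post_2, 16)
tau_50 = percentile_from_grid(tau_grid, post_2, 50)
std_1, std_2 = grid_std(tau_grid, post_1), grid_std(tau_grid, post_2)

print(f"mode = {tau_mode:.2f}, 16th = {tau_16:.2f}, 50th = {tau_50:.2f}")
print(f"dispersion after one source = {std_1:.2f}, after two = {std_2:.2f}")
```

With two mutually consistent sources, the second update narrows the distribution while leaving its mode close to the single-source estimates, which is the qualitative behaviour reported above.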