1 Introduction

Floods are likely the most recurrent and destructive disaster worldwide (Teng et al. 2017). Consequently, it is becoming more important to develop techniques that help us understand past flood hazards so that we can better adapt to potential future risks (Sarhadi et al. 2012; Smith et al. 2019; Wang et al. 2018). The recent occurrence of two extreme flood events produced by cyclones Idai and Kenneth (USAID 2019), with their resulting high human and economic losses in Southern Africa, has reemphasized the need to further investigate flood risk, especially in regions with limited coping capacity, which are mostly data-scarce. Generally, a first step in investigating past flood events is to reconstruct flow characteristics (e.g. river discharge, water depth) using hydrodynamic models. Hydrodynamic models have enabled a better evaluation of the hazard extent, magnitude and dynamics of past flood scenarios, hence providing inputs for assessing the physical vulnerability of the built environment (e.g. developing flood damage models) to support risk adaptation (Fuchs et al. 2019). In particular, flow characteristics derived from hydrodynamic models provide reliable and sizeable input data at building locations to develop robust flood damage models (Chow et al. 2019; Wagenaar et al. 2017).

The use of hydrodynamic models to reconstruct past flood events typically requires calibration data for setting boundary conditions and for adjusting the model so that simulated values match observed measurements. Commonly applied calibration data include time series of river discharge, precipitation data or remotely sensed data. However, these data are rarely available in data-scarce regions, which limits the application of hydrodynamic modelling and further contributes to higher flood risk in many parts of the world (Sanyal et al. 2013; Yan et al. 2014; Schumann et al. 2015; Komi et al. 2017). In this study, data-scarce regions refer in particular to areas with limited hydrological, meteorological or remotely sensed data to characterize rainfall or run-off in river channels or on flood plains.

Hunter et al. (2005) highlighted the potential of utilizing distributed post-event observations for calibrating hydrodynamic models. Such observations can be water depths or extents measured after a flood, for example from marks left on buildings or trees. For example, Neal et al. (2009) used distributed water and wrack marks collected immediately after the 2005 Carlisle flood (UK) to reconstruct the flood event using a hydrodynamic model. Other studies that use distributed water level observations include Bronstert et al. (2018) in Germany and Borga et al. (2019) in Italy, as well as Zischg et al. (2018c) and Bernet et al. (2019), who used geolocalized insurance claims in Switzerland. Recently, Wang et al. (2018) demonstrated an extended application of post-event data for model calibration by using documented reports and photos shared by the flooded community through social media feeds. An earlier study by Tran et al. (2009) also recommended the use of data retrieved from flooded communities, provided the respondents are either victims or eyewitnesses of the events and have a good understanding of their environment.

Recent reviews by Assumpção et al. (2018) and Sy et al. (2019) showed that studies that utilize data collected from affected communities to reconstruct past flood scenarios are increasing. Such an approach, using knowledge from affected communities, is particularly encouraged in data-scarce regions given its potential to provide low-cost and sizeable data with good spatial and temporal coverage (Sy et al. 2019, 2020; Assumpção et al. 2018). For example, several studies have reconstructed past floods using either water depths and velocity retrieved from texts, pictures and videos uploaded to social media platforms (see review by Assumpção et al. 2018) or flood duration derived from interviews (e.g. Sy et al. 2016). Data retrieved from communities may require extensive pre-processing and are susceptible to errors since they are reported by non-experts (e.g. incorrectly reported or missing georeference, time stamp and object value). Furthermore, the susceptibility of flood victims to post-traumatic stress disorder (PTSD), which depends on flood type, severity and degree of exposure (Chen and Liu 2015; Fontalba-Navas et al. 2017), may additionally influence the accuracy of data retrieved from flooded communities. As a result, some studies (e.g. McDougall and Temple-Watts 2012; McDougall 2011) have suggested field visits to validate reported observations. Focussing on field visits, several researchers have used face-to-face interviews to collect post-event observations and subsequently applied them to reconstruct past floods using a GIS approach (e.g. Poser and Dransch 2010; Singh 2014; Sy et al. 2016, 2020). A recent advancement in mapping past floods in typical data-scarce regions using interviews was carried out by Sy et al. (2020) in Yeumbeul North, Senegal. In that study, flood depths and extents were retrieved for multiple flood events (in 2005, 2009 and 2012) based on the personal recollections of two independent sets of community representatives. Sy et al. (2020) observed good agreement between flood extents derived with the community-based approach and those derived from remotely sensed data. Given the observed progress in the application of community-based data and the urgent need for understanding flood risk, especially in vulnerable regions, Assumpção et al. (2018) and Sy et al. (2020) have recommended exploring new alternatives for integrating such data into hydrodynamic models in data-scarce regions.

A recent flood event in 2017 in the Suleja and Tafa local government areas in Niger State, Nigeria, resulted in high human and economic losses (OCHA 2019). The region is situated in the central part of Nigeria, an area typically characterized by several inland rivers. No historical records of river discharge or precipitation are available for this area, and previous studies were limited to terrain analysis to delineate flood zones using either a 90 m SRTM elevation model (e.g. Mayomi et al. 2014) or 22 m resolution NigeriaSat-X imagery (e.g. Ndanusa et al. 2018). Although floods are recurrent in this region, the magnitude and intensity of the 2017 event were higher (Adeleye et al. 2019). Moreover, with a population of about 300,000 in Suleja and Tafa (NBS 2010) and the threat from climate change, it is likely that more people will be exposed to flood risk. The application of hydrodynamic modelling in such regions is important to gain a better understanding of the hazard extent, magnitude and dynamics so as to support future flood risk management decisions. In particular, knowledge about spatial locations likely to experience high hazard magnitude can be used to support physical vulnerability assessment, recommend adaptation measures or guide emergency planners.

Herein, we present a method that leverages the use of post-event observations collected through field interviews, and a hydrodynamic model to reconstruct a plausible scenario of a past flood event in a typical data-scarce location. The model output is intended to enrich data sets to develop flood damage models. For example, the method can extrapolate from a sample of buildings for which flood depths are known to the remaining neighbouring buildings affected by the flood event. In addition, model results will improve the understanding of the spatial patterns of flood hazard (e.g. maximum water depths) in the region to support flood management decisions. Herein, we explain and test the developed method in a typical data-scarce small to medium scale ungauged catchment with five river reaches.

2 Study site and data

2.1 Study site

There are five major reaches in the study area (Fig. 1b). Reaches 1, 4 and 5 are located in Suleja, while reaches 2 and 3 are located in Tafa. The river channels are semi-natural with no flood protection work and partly characterized by grassland vegetation or sediments. All river channels are mostly shallow (≤ 2 m) and have a frequently varying cross-section. The main river traversing the study area is the River Iku and has a reservoir upstream (Fig. 1a). Table 1 provides a summary of reach characteristics based on field observations and satellite imagery (ESRI et al. 2020). Qualitative characterization of the channel and floodplain (Table 1) was partly based on methods in Phillips and Tadayon (2006).

Fig. 1 WorldDEM DTM of the study region showing (a) catchments 1–5 and (b) inspected buildings, model flow input points, junctions and built-up areas, with arrows indicating river flow direction. Inset map indicates the location of the Suleja and Tafa area, Nigeria

Table 1 Reach characteristics in the study region

A 30 m resolution map of the built-up areas in the location can be found in Fig. 1b. The built-up areas, which predominantly consist of sandcrete and clay residential buildings, were compiled under the Global Human Settlement (GHS) project of the European Commission (Pesaresi et al. 2013). It includes multi-temporal built-up layers derived from Landsat image collections from Global Land Survey (GLS) between 1975 and 2000, and ad-hoc Landsat 8 collection 2013/2014 (for more information, see Florczyk et al. (2019)). A 100 m spatial resolution land cover map from 2018 (Buchhorn et al. 2020) indicates the catchment encompassing the study area is comprised of 44% agriculture, 34% forest, 10% urban, 9% shrubs and 3% other. A WorldDEM DTM (Fig. 1a) of the study site was acquired from the European Space Agency. The WorldDEM DTM is a bare earth surface without vegetation and man-made objects and has a spatial resolution of 12 m, an absolute vertical accuracy ≤ 10 m and an absolute horizontal accuracy ≤ 6 m (Archer 2018). The WorldDEM DTM data set can be acquired free of charge for scientific purpose.

2.2 Flood event and data availability

On 8 and 9 July 2017, prolonged heavy rainfall caused severe flooding in the Suleja/Tafa region. According to local reports, the floods lasted for about 12 h and led to the death of fifteen people (The Guardian 2017; OCHA 2019). Additionally, hundreds of residential houses and infrastructural facilities were damaged by the floods (Adeleye et al. 2019). Reports from locals in the community indicated that the entire flood was a one-day event (from early morning to late evening on 9 July) and that the flood approached at high velocities and resulted in multiple wave-like surges. There were also reports of multiple backwater effects at two river junctions, marked junction 1 (J1) and junction 2 (J2) in Fig. 1b. Additionally, local reports indicated that the blockage of a bridge culvert close to J1 by transported materials (tree trunks and cars) dammed the floodwater in the surrounding areas and resulted in ponding that increased water depths and durations.

For our study, we considered replicating the Suleja/Tafa flood event using hourly precipitation to drive a catchment scale hydrological model that produces hourly discharge for input into a detailed reach scale hydrodynamic model of the flooded sub-urban area. However, hourly observed precipitation before and during the flood (July 7–9, 2017) is not available because no weather station exists in the Suleja/Tafa region. Recent advances in reanalysis (Hersbach et al. 2018), which combines model data with observations, were also considered as a source for precipitation data. Specifically, we acquired precipitation from the ERA5-Land reanalysis dataset (Muñoz-Sabater et al. 2018) with a relatively fine spatial (9 km) and temporal (1 h) resolution for the Suleja/Tafa region. The ERA5-Land precipitation prior to and during (July 7–9, 2017) the flood event is low in intensity (< 12 mm d−1) and this indicates that ERA5-Land did not reproduce the meteorological conditions that produced the flood. In addition to the unavailability of local precipitation data, no discharge records exist for the rivers in the study area. As such, without hydrometeorological data, we could not develop a catchment scale hydrological model that drives a reach scale hydrodynamic model.

3 Methods

In this section, details on data collection, pre-processing and model selection and build are provided. For the model build, given that (i) reconstructing past flood scenarios depends on the optimal agreement between observed and modelled characteristics and (ii) flood damage models require reliable estimates of flood characteristics at building locations, the modelling approach developed uses several steps of simulations to minimize the error (root mean squared error (RMSE)) between modelled and observed flood depths and durations at building locations.

3.1 Data collection

A field investigation was carried out eight months after the flood event. An initial mapping of the affected areas included the analysis of reports, videos and photos from online media sources (e.g. The Guardian 2017; Vanguard 2017) and social media feeds (Facebook). A questionnaire for the interviews was developed covering flood characteristics (depth, duration and a qualitative description of velocity) and the GPS coordinates of the affected buildings. An extensive house-to-house survey was conducted from March to May 2018 within the initially mapped area. If watermarks were still visible on building walls or block fences, the water depth was measured in situ with a measuring tape. If watermarks were no longer visible, building occupants were asked to report the water depth from recollection. Observations were collected using the developed questionnaire (see supplementary material). Many house inhabitants, especially along reaches 2, 3 and 5, reported having experienced two flood waves within 24 h. The second wave is suspected to have been influenced by the reservoir upstream of reach 2. From an interview with the reservoir personnel, it was gathered that (i) the spillways were closed during the entire flood event and (ii) discharge from the reservoir, which was reported to arrive many hours after the first flood, was mainly due to the overtopping of the spillway. Since no further information was available from the reservoir personnel regarding the commencement, duration and estimated discharge of the overtopping, the contribution of the second flood wave was not considered during the modelling.

3.2 Data pre-processing

Before using the WorldDEM DTM, several pre-processing steps were carried out. To represent sections of the river with small widths, the DEM was resampled to a spatial resolution of 4 m. The flow accumulation of the DEM did not align with the actual river channel digitized from a satellite image taken by a DigitalGlobe satellite on 10 January 2017 with a resolution of 0.5 m and a horizontal positional accuracy of 10.2 m (ESRI et al. 2020). Herein, this satellite image was used at several stages. For a selected area in the study region, we show a typical deviation of the channel generated from flow accumulation from the actual channel in the satellite image (Fig. 2a). A simple procedure was implemented to reduce the channel deviation by reconstructing a new river channel. First, cross-sections with a downstream spacing of between 5 m (where channel width is frequently irregular) and 30 m (where channel width is fairly consistent) were manually drawn across the digitized channel centerline extracted from the satellite image (Fig. 2a). A total of 362 cross-sections were produced, and channel widths were assigned to each cross-section using widths derived from fieldwork photographs and the satellite image. Thereafter, we used the WorldDEM DTM to generate a contour map of the study region and manually digitized the centerline of the river with the aid of the satellite image (Fig. 2b). This centerline was preferred over the flow accumulation centerline because it better represented the actual channel location in the satellite image. At this stage, the digitized centerline was segmented using the 5–30 m spacing and each segment was assigned the width value of the nearest cross-section. Afterwards, a new channel outline was produced by buffering the centerline with a distance equal to the width value of each segment (Fig. 2c). The resulting channel outline is a compromise between the positional and dimensional representation of the river. Moreover, this method reduces the spatial deviation between the channel generated from DEM flow accumulation (Fig. 2a) and the channel from the contour map (Fig. 2c). In the next step, DEM cells that spatially coincided with the newly developed channel outline were lowered by 1 m in elevation. This produced a simple rectangular cross-sectional profile (Fig. 2d). Lastly, channel blockages existing in the DEM (e.g. bridges and culverts) were identified using the satellite image and removed. Observation points collected from the field survey were georeferenced and converted to the spatial resolution of the DEM. Thereafter, information on observed flood depth and duration was assigned to each georeferenced observation point.
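The resampling and channel-lowering steps described above can be sketched with plain array operations. This is a minimal illustration only: the nearest-neighbour refinement, the helper names `resample_nearest` and `burn_channel`, and the toy elevations and channel mask are all assumptions made for the example, since the text does not state which resampling method or software was used.

```python
import numpy as np

def resample_nearest(dem, factor):
    """Refine a grid by an integer factor using nearest-neighbour replication.

    Assumption: the study does not name its resampling method.
    """
    return np.repeat(np.repeat(dem, factor, axis=0), factor, axis=1)

def burn_channel(dem, channel_mask, drop=1.0):
    """Lower every cell flagged as channel by `drop` metres."""
    return np.where(channel_mask, dem - drop, dem)

# Toy 2 x 2 grid at 12 m resolution (invented elevations in metres).
dem_12m = np.array([[100.0, 101.0],
                    [102.0, 103.0]])

# 12 m -> 4 m means each cell becomes a 3 x 3 block of cells.
dem_4m = resample_nearest(dem_12m, 3)

# Hypothetical rasterized channel outline: one column of 4 m cells.
channel_mask = np.zeros(dem_4m.shape, dtype=bool)
channel_mask[:, 2] = True

# Rectangular channel profile: masked cells drop by 1 m, others are unchanged.
dem_burned = burn_channel(dem_4m, channel_mask, drop=1.0)
```

In practice the channel mask would come from rasterizing the buffered centerline polygon onto the 4 m grid; that GIS step is outside this sketch.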

Fig. 2 Procedure for reconstructing the river channel: (a) deviation of the channel derived from flow accumulation of the WorldDEM DTM from the channel digitized from a high-resolution satellite image, with cross-sections at 5–30 m spacing drawn for width extraction; (b) channel centerline digitized using the contour map generated from the WorldDEM DTM (1 m contour intervals used for visualization purposes); (c) generated (reconstructed) river channel, showing reduced channel deviation compared to (a); (d) 1 m drop of the generated river channel

3.3 Model selection and build

The procedure for reconstructing the flood event is based on iteratively changing the boundary conditions (i.e. inflow hydrographs) and model parameters (i.e. roughness) of a hydrodynamic model until modelled and observed flow depths are similar at inundated building locations. In addition to flow depth, our method considers temporal aspects of the flood because the river network consists of five different river reaches, and thus, the superimposition of flood waves from the upstream tributaries ultimately determines the flood in the downstream river reach.

For this study, we used CAESAR-Lisflood (CL) (Coulthard et al. 2013), which integrates the non-steady 2D hydrodynamic model LISFLOOD-FP (Bates and De Roo 2000) with the landscape evolution model CAESAR (Coulthard et al. 2002). CAESAR-Lisflood is a full 2D model that routes water and sediment over a regular grid of cells representing channel and floodplain topography. In CAESAR-Lisflood, sediment transport can optionally be turned off, as in our study, and the model then essentially operates like LISFLOOD-FP by simulating two-dimensional flow with a simplified form of the shallow water equations (for more information on the governing equations, see Coulthard et al. 2013). CAESAR-Lisflood operates with minimum parameterization and is open source (https://sourceforge.net/projects/caesar-lisflood/). In our study, we used CAESAR-Lisflood version 1.9g. Input data for CAESAR-Lisflood are a DEM, discharge and friction values for channel and floodplain locations. Discharge is added to the DEM at specified locations. The entire DEM is reclassified into 4 m grid cells to represent both river channel and flood plain.
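To illustrate how a roughness value can enter a simplified shallow water scheme, the sketch below updates the unit-width flux between two cells with a semi-implicit Manning friction term. The text defers to Coulthard et al. (2013) for the governing equations, so the specific form used here (an inertial formulation often associated with LISFLOOD-FP-type models) is an assumption, as are the function name `update_unit_flux` and all input values.

```python
G = 9.81  # gravitational acceleration (m s^-2)

def update_unit_flux(q, h_flow, surface_slope, n, dt):
    """One explicit update of the unit-width flux q (m^2 s^-1) between two cells.

    Assumed form (not confirmed by the source text):
        q_new = (q - g*h*dt*S) / (1 + g*h*dt*n^2*|q| / h^(10/3))
    where S is the water-surface gradient in the flow direction,
    h is the flow depth between the cells and n is Manning's coefficient.
    """
    if h_flow <= 0.0:
        return 0.0  # no water available to move between the cells
    numerator = q - G * h_flow * dt * surface_slope
    denominator = 1.0 + G * h_flow * dt * n ** 2 * abs(q) / h_flow ** (10.0 / 3.0)
    return numerator / denominator

# Same hydraulic state, three roughness values (illustrative inputs).
fluxes = {
    n: update_unit_flux(q=1.0, h_flow=1.0, surface_slope=-0.001, n=n, dt=1.0)
    for n in (0.01, 0.07, 0.14)
}
for n, q_new in fluxes.items():
    print(f"n = {n:.2f}: q = {q_new:.3f} m2/s")
```

Under these assumptions a larger n increases the friction term in the denominator, so less water is passed to the neighbouring cell per time step.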

The modelling was carried out in four main simulation steps (Fig. 3). Steps 1–3 determine the optimum flood discharge, duration and timing, whilst step 4 simulates the reconstructed flood event over the whole river network in the model domain. The first two steps were carried out for each reach separately; consequently, observations at the river junctions (Fig. 1b, J1 and J2) were not included. Running separate simulations for each reach also reduced computational demand, since we use a DEM with about 1.5 million grid cells.

Fig. 3 Flowchart of model development

3.3.1 Step 1: Optimum peak discharge

The Manning’s coefficient (n) accounts for the roughness of either the channel (nc) or floodplain (nfp) and partly controls how water is routed from one grid cell to another in CL. To account for a wide range of channel bed characteristics and floodplain land cover types, we selected n values of 0.01 (cemented surface), 0.07 (medium shrubs and weeds or trees) and 0.14 (dense shrubs and weeds or trees) (Chow 1959). Nine combinations (3 × 3 matrix) of the selected n values were trialled for nfp and nc. For each model run, one uniform n was assigned to the channel (nc) and another to the flood plain (nfp). Hence, for five river reaches, we have a total of 45 simulations for step 1. The simulations were performed by applying a linear hydrograph (Fig. 4a) at the indicated flow input points (Fig. 1b). A hydrograph with a duration of 24 h and a maximum discharge of 250 m3 s−1 was used to capture the discharge at which simulated depths most closely matched observed depths on each reach. Since buildings in this region are mostly over 80 m2 (determined from sampling 648 buildings in the area using Google Earth photos) and therefore cover more than one 4 m grid cell, modelled flood depths were extracted from the exact building location and from raster cells in the immediate neighbourhood of the building. From these locations, the simulated water depth with the least difference to the observed building water depth was selected. We computed a root mean square error (RMSE) between observed and modelled flood depths every 10 min of a model run. For each simulation, the input discharge was plotted against the RMSE to understand the model error response to combinations of input discharge and roughness. For each river reach, the discharge value corresponding to the minimum RMSE for a specific combination of n was selected as the optimal peak discharge.
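The depth extraction and RMSE search described in this step can be sketched as follows. The sketch assumes that the "immediate neighbourhood" is a 3 × 3 window of cells around the building cell and that one depth grid is saved per output interval; the function names and the toy grids, discharges and observations are hypothetical.

```python
import numpy as np

def best_neighbourhood_depth(depth_grid, row, col, observed):
    """Simulated depth in the 3 x 3 window around (row, col) closest to the observation.

    Assumption: a 3 x 3 window stands in for the building's immediate neighbourhood.
    """
    r0, r1 = max(row - 1, 0), min(row + 2, depth_grid.shape[0])
    c0, c1 = max(col - 1, 0), min(col + 2, depth_grid.shape[1])
    window = depth_grid[r0:r1, c0:c1].ravel()
    return window[np.argmin(np.abs(window - observed))]

def rmse(observed, modelled):
    """Root mean square error between two equally long sequences."""
    observed = np.asarray(observed, dtype=float)
    modelled = np.asarray(modelled, dtype=float)
    return float(np.sqrt(np.mean((observed - modelled) ** 2)))

def optimal_discharge(depth_grids, discharges, cells, observed):
    """Return the inflow discharge at the output step with the lowest depth RMSE."""
    errors = []
    for grid in depth_grids:
        modelled = [best_neighbourhood_depth(grid, r, c, obs)
                    for (r, c), obs in zip(cells, observed)]
        errors.append(rmse(observed, modelled))
    best = int(np.argmin(errors))
    return discharges[best], errors[best], errors

# Toy data: three output steps of a linearly rising hydrograph (invented values).
depth_grids = [np.full((5, 5), d) for d in (0.2, 0.6, 1.0)]
discharges = [50.0, 100.0, 150.0]          # inflow (m3 s-1) at each output step
cells = [(2, 2), (1, 3)]                   # grid indices of two surveyed buildings
observed = [0.5, 0.7]                      # observed flood depths (m)

q_opt, err_opt, errors = optimal_discharge(depth_grids, discharges, cells, observed)
```

Repeating this for each of the nine roughness combinations on a reach would give the discharge against RMSE curves from which the optimum is read.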

Fig. 4 Input hydrographs to determine flood (a) peak and (b) duration. Hydrograph examples are provided for reach 1

3.3.2 Step 2: Optimum duration

In this step, observed (or inferred) flood duration data are used to estimate the optimum duration of the flow on upstream reaches (1–4). Similar to a recent study by Zischg et al. (2018a), we adopt a two-parameter gamma function to develop synthetic hydrographs with durations of 6, 12, 18, 24 and 36 h. A gamma function was chosen because the catchments are (i) relatively small to medium in area and (ii) largely unforested (66%) with low amounts of rainfall interception; these characteristics are likely to produce a flashy hydrological response, which is well represented by the selected distribution (USDA 2007; Welsh et al. 2009). For each upstream reach, the synthetic hydrographs are scaled to the peak discharge value determined from simulation step 1. For example, Fig. 4b shows hydrographs with the peak discharge of reach 1 (88 m3 s−1). Model runs were carried out for each upstream reach (reaches 1–4) using all five synthetic hydrographs; hence, 20 model runs were performed in total for simulation step 2. All simulations were carried out using the nfp and nc combination determined from simulation step 1. Model depths were recorded every 10 min of each model run. To compute the simulated duration at each observation point, we sum the total time during which water depth was ≥ 0.20 m at the grid cell. The 0.20 m threshold allows for a minimum building floor elevation (typical in this region, as observed during field data collection) that water must exceed before it enters a building. An RMSE is computed to quantify the difference between the modelled and observed flood durations at the building locations. The synthetic hydrograph that provides the minimum RMSE value is selected as the optimum duration hydrograph.
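A minimal sketch of the two building blocks of this step is given below: a gamma-shaped hydrograph scaled to a given peak, and the duration above the 0.20 m threshold from depths saved every 10 min. The shape parameter, the assumption that the peak occurs at a fixed fraction of the hydrograph duration, and the function names are illustrative choices; the text does not report the gamma parameters that were used.

```python
import numpy as np

def gamma_hydrograph(peak_q, duration_h, shape=3.0, peak_fraction=0.25, dt_min=10):
    """Gamma-shaped synthetic hydrograph scaled so that its maximum equals peak_q.

    Two parameters define the curve: `shape` (k) and the scale t_peak / (k - 1).
    Both `shape` and `peak_fraction` are assumptions, not values from the study.
    """
    t = np.arange(0, duration_h * 60 + 1, dt_min) / 60.0   # time axis in hours
    t_peak = peak_fraction * duration_h                     # assumed time to peak
    x = t / t_peak
    q = peak_q * x ** (shape - 1.0) * np.exp((shape - 1.0) * (1.0 - x))
    return t, q

def flood_duration_hours(depth_series, threshold=0.20, dt_min=10):
    """Total time (h) with depth >= threshold, from depths saved every dt_min minutes."""
    depth_series = np.asarray(depth_series, dtype=float)
    return np.count_nonzero(depth_series >= threshold) * dt_min / 60.0

# Five candidate hydrographs scaled to the reach 1 peak discharge of 88 m3 s-1.
candidates = {d: gamma_hydrograph(88.0, d) for d in (6, 12, 18, 24, 36)}

# Invented depth record at one building cell, sampled every 10 min.
example_duration = flood_duration_hours([0.0, 0.1, 0.2, 0.5, 0.3, 0.19])
```

The modelled durations at all observation points of a reach would then be compared with the observed durations through an RMSE, once per candidate hydrograph.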

3.3.3 Step 3: Optimum downstream hydrograph

Reports from eyewitnesses documented during the field data collection indicated that the upstream reaches (reaches 1–4) did not flood at the same time. This suggests that the downstream hydrograph (reach 5) is not a simple summation of upstream hydrographs across time. To investigate the timing of the flood, we carried out three steps to explore all possible combinations of the upstream hydrographs (reaches 1–4, determined from simulation steps 1 and 2) to match the optimum downstream peak discharge (reach 5, determined from simulation step 1). The steps carried out are as follows:

I. We constructed all possible permutation sets between the hydrographs of reaches 1–4 using a 24 h upper bound as the maximum duration because local reports suggest it was a one-day event. Maintaining an upper bound of 24 h was achieved by setting up a permutation algorithm such that temporal shifts for any individual reach hydrograph were possible between 1 and 6 h.

II. We established a condition to only select permuted sets where at least one hydrograph is not shifted in time. This condition was to constrain the temporal shifts to a specific event start time.

III. Thereafter, hydrographs from the selected permutation sets were summed across time. The permutation set whose summed hydrograph most closely matched the optimal peak discharge for reach 5 was selected as the optimal downstream hydrograph.

We ran two simulations; one using the selected optimal hydrograph that matches the peak discharge for reach 5, and for comparison, another assuming all upstream reaches flowed downstream at the same time (referred to as no shift in time).
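The search over temporal shifts in steps I–III can be sketched as an exhaustive enumeration. The sketch assumes hourly hydrograph ordinates and six admissible shifts per reach (taken here as 0 to 5 h, so that four reaches give 6^4 = 1296 combinations before the condition in step II is applied); the text does not state the exact shift values, and the function names and toy hydrographs are invented for illustration.

```python
from itertools import product
import numpy as np

def shifted(q, shift_steps, total_steps):
    """Place hydrograph q on a common time axis, delayed by shift_steps."""
    out = np.zeros(total_steps)
    end = min(total_steps, shift_steps + len(q))
    out[shift_steps:end] = q[:end - shift_steps]
    return out

def best_shift_set(hydrographs, target_peak, shifts=range(6), total_steps=24):
    """Find the shift combination whose summed peak is closest to target_peak."""
    best = None
    n_candidates = 0
    for combo in product(shifts, repeat=len(hydrographs)):
        if min(combo) != 0:
            continue  # step II: keep only sets with at least one unshifted reach
        n_candidates += 1
        total = sum(shifted(q, s, total_steps) for q, s in zip(hydrographs, combo))
        misfit = abs(total.max() - target_peak)
        if best is None or misfit < best[0]:
            best = (misfit, combo, total)
    return best, n_candidates

# Toy upstream hydrographs (hourly, invented peaks of 40, 30, 20 and 10 m3 s-1).
base = np.array([0.0, 0.5, 1.0, 0.5, 0.0])
hydrographs = [peak * base for peak in (40.0, 30.0, 20.0, 10.0)]

# With no shift the peaks coincide and sum to 100; search for a set peaking at 60.
(misfit, combo, downstream), n_candidates = best_shift_set(hydrographs, target_peak=60.0)
n_all = len(list(product(range(6), repeat=4)))
```

Several shift sets can reproduce the same peak; this sketch keeps the first one found, whereas a fuller treatment might also compare hydrograph shape or timing.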

3.3.4 Step 4: Simulating the flood event over the entire river network

Lastly, the simulation for the entire river network was run using the temporally shifted hydrographs that resulted in the optimum downstream hydrograph with regard to peak discharge (simulation step 3). All 300 sampled observations were used to compute a global RMSE (gRMSE) that compares observed and simulated flood depths across the entire river network. In addition, we computed a separate RMSE using only observations at J1, J2 and 4 observations located about 500 m downstream of J1 (Fig. 1b). This assesses the performance of the temporal shifting of hydrographs and the dependence of the computed gRMSE on calibration data.
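The two error measures of this step differ only in which observations enter the calculation, which a boolean mask expresses compactly. The sketch below is illustrative: the function name, the 10 min output interval carried over from the earlier steps, and all depth values are assumptions.

```python
import numpy as np

def rmse_over_time(observed, modelled_by_step, mask=None):
    """RMSE at every output step; `mask` restricts the calculation to a subset."""
    observed = np.asarray(observed, dtype=float)
    modelled_by_step = np.asarray(modelled_by_step, dtype=float)
    if mask is not None:
        observed = observed[mask]
        modelled_by_step = modelled_by_step[:, mask]
    return np.sqrt(np.mean((observed - modelled_by_step) ** 2, axis=1))

# Invented example: four observation points and three saved model steps.
observed = np.array([1.0, 0.5, 0.8, 1.2])               # observed depths (m)
modelled_by_step = np.array([[0.0, 0.0, 0.0, 0.0],      # before the flood arrives
                             [0.9, 0.6, 0.8, 1.0],
                             [0.5, 0.2, 0.4, 0.6]])
at_junction = np.array([False, False, True, True])      # hypothetical junction subset

g_rmse = rmse_over_time(observed, modelled_by_step)
best_step = int(np.argmin(g_rmse))
best_time_h = best_step * 10 / 60.0                      # assumes 10 min output interval

# RMSE of the junction subset, evaluated at the step of minimum gRMSE.
junction_rmse = rmse_over_time(observed, modelled_by_step, mask=at_junction)[best_step]
```

Observation points that the model leaves dry contribute their full observed depth to the error, which is why unflooded points raise the gRMSE.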

4 Results

Out of 300 sampled buildings, about 100 still had flood marks, so that the corresponding flood depths could be measured directly. Generally, more observations were collected on flood depth than on flood duration. This was partly because it was easier for residents to remember the maximum water depth than the times at which the flood reached the building and receded. Summary statistics of flood depth and duration data collected from the field and used for the modelling are presented in Table 2.

Table 2 Summary statistics of flood observations reporting minimum (min), interquartile range (IQR), mean and maximum (max) flood depth and duration on each reach

4.1 Optimized discharge

Results for all nine combinations of nc and nfp for all five reaches are shown in Fig. 5a–e. The results show how the RMSE between observed and modelled flood depths increases or decreases as the discharge linearly increases. Simulations with n values between 0.01 and 0.07 for floodplain and channel produced flood depths that closely match observed depths at unusually high discharge. This discharge is regarded as unusually high given the size of the catchment (see Sect. 2.2 and Fig. 1) and based on the rainfall duration, which was reported by the locals to be about 12 h. For all reaches, simulations with n combinations of nfp = 0.14 and nc = 0.01, nfp = 0.14 and nc = 0.07, and nfp = 0.14 and nc = 0.14 consistently maintained the lowest RMSE values compared to other n combinations. Further investigation of these three optimally performing n combinations showed that whilst simulations with nfp = 0.14 and nc = 0.01 resulted in fast routing of the water causing some observation points not to be flooded, simulations with nfp = 0.14 and nc = 0.14 achieved an overestimation of the flood levels at the observation points. Hence, we selected nfp = 0.14 and nc = 0.07 as optimum n combination for the study region and the combination was used for other simulation steps. Table 3 shows the values of optimal peak discharge and the minimum RMSE between simulated and observed water depths for each reach. Optimal peak discharge for reach 5 was 147 m3 s−1 and the summation of peak discharges from upstream reaches (1–4) was 340 m3 s−1: hence, the lower optimal peak discharge for reach 5 indicates that the upstream reaches did not synchronize peak discharges in time. Minimum RMSE was generally comparable across all reaches except for reach 1 with RMSE of 0.67 m (Table 3).

Fig. 5 RMSE between observed and simulated flood depths from models driven with linearly increasing discharge and Manning’s roughness (fp = floodplain, c = river channel) combinations for (a) reach 1, (b) reach 2, (c) reach 3, (d) reach 4 and (e) reach 5. Note y-axis limits (0.3, 1.75)

Table 3 Summary of results for simulated optimal peak discharge and duration with RMSEs for corresponding reaches

4.2 Optimized duration

On reaches 1, 3 and 4 (Fig. 6a, c, d), the 6, 12 and 18 h hydrographs resulted in lower RMSE (≤ 5 h), suggesting that the flood duration at the building locations was likely within that range. Simulations with longer durations (24 and 36 h) resulted in high water volumes and overestimated observed flood depths and durations. Consequently, for reaches 1, 3 and 4, the 12 h hydrograph duration was selected as optimal since it results in the minimum RMSE. Generally, RMSE for reach 2 (Fig. 6b) was comparatively higher, with both the shortest (6 h) and longest (36 h) duration hydrographs showing high errors. Results of the inundation extents showed that in reach 2, the 6 and 12 h simulations did not flood many observation points, hence resulting in a high RMSE. Conversely, long durations of 24 and 36 h resulted in comparatively high water depths at observation points and prolonged water duration. The 18 h duration hydrograph was selected for reach 2 since it had the lowest RMSE.

Fig. 6 RMSE between observed and simulated flood duration for (a) reach 1, (b) reach 2, (c) reach 3 and (d) reach 4

4.3 Combining hydrographs

Figure 7a shows the 1296 temporally shifted hydrographs together with a hydrograph assuming no temporal shift. One combination of temporal shifts matched the peak discharge (Fig. 7a, in blue) previously determined for reach 5 in isolation (147 m3 s−1). In this combination, floods from the smaller catchments were the first to flow downstream. Additionally, the RMSE between observed and modelled depths on reach 5 for the two simulations, one using the optimal temporal shift and one using no shift, is shown in Fig. 7b.

Fig. 7 (a) Iteration of hydrographs with temporal shift for reaches 1–4. (b) RMSE between observed and simulated water depths

4.4 Entire river network

Figure 8a shows the upstream input hydrographs, with optimal temporal shifts, used to simulate the entire river network. The global RMSE (gRMSE), over time, between observed and modelled flood depths for the entire river network is shown in Fig. 8b. Out of 300 observations, 12 were not flooded in the simulation, which affects the computed gRMSE. The minimum gRMSE for all observations on the entire river network is 0.61 m, and this corresponds to a simulated time of about 11 h. The RMSE for observations associated with river junctions (n = 29) was 0.67 m.

Fig. 8 (a) Optimal upstream hydrographs for simulating the entire catchment. (b) RMSE between observed and modelled depths

Figure 9 shows simulated flood depths corresponding to the model time step with the minimum gRMSE for all observations. Model errors at building locations are mapped for reaches 1, 2, 3 and 5 (Fig. 9a–d). Errors with negative values are model overestimations, and positive values are underestimations. In Fig. 9a, high errors were most likely due to the deviation (about 50 m) of the reconstructed channel highlighted earlier. The deviation was caused by a sharp bend of the river channel, which was not well represented by the 12 m resolution DTM. Due to the deviation, most of the buildings were further away from the modelled channel, and hence, model errors were predominantly underestimations of observed flood depths. A few of the observation points in Fig. 9a were not flooded, further increasing errors and the computed gRMSE. In Fig. 9b, differences between observed and modelled flood depths were most likely due to reported backwater effects and culvert blockages close to J1. Errors in Fig. 9c were also likely related to the channel deviation, resulting in both overestimation and underestimation of observed depths. Additionally, an additional tributary (Fig. 9c) probably further influenced flow characteristics at the junction through channel interactions. Since additional tributaries were not included in the modelling, their influence on flow characteristics at the junction may not be well captured. Similar interaction effects from an additional tributary may have influenced model errors on observations in Fig. 9d.

Fig. 9 Flood model for the entire catchment, with errors showing the difference between observed and simulated flood depths (at minimum RMSE) for selected areas a, b, c and d
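The sign convention used for the mapped errors (observed minus simulated depth, so negative values are overestimations) can be made explicit with a short sketch. The function names and the depth values are hypothetical and serve only to illustrate the convention.

```python
import numpy as np

def depth_errors(observed, modelled):
    """Error per building as observed minus modelled depth (m)."""
    return np.asarray(observed, dtype=float) - np.asarray(modelled, dtype=float)

def classify(errors, tol=0.0):
    """Label each error: negative = overestimation, positive = underestimation."""
    errors = np.asarray(errors, dtype=float)
    return np.where(errors < -tol, "overestimation",
                    np.where(errors > tol, "underestimation", "match"))

# Hypothetical depths in metres; the first building is dry in the model.
observed = [1.0, 0.5, 0.7]
modelled = [0.0, 0.9, 0.7]

errors = depth_errors(observed, modelled)
labels = classify(errors)
print(list(zip(errors.round(2).tolist(), labels.tolist())))
```

Under this convention a building that the model leaves dry always appears as an underestimation equal to its observed depth.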

5 Discussion

5.1 Model input and output evaluation

The range of n selected (0.01, 0.07, 0.14) provided an evaluation of CL’s sensitivity to different combinations of nc and nfp. The results demonstrate the dependence of peak discharge on n and reach characteristics. Combinations of nc and nfp with lower values (i.e. low roughness) allow water to be routed quickly towards the outlet, and hence, it takes a longer time and a higher discharge for the water to propagate laterally and flood the building locations (Fig. 5). Curves relating discharge and RMSE (Fig. 5) show similar patterns between (i) reaches 1 and 5 and (ii) reaches 2, 3 and 4. The similar patterns for reaches 1 and 5 are likely due to the characteristically U-shaped valley of both reaches (Table 1), meaning that lower values of n require more time and discharge to flood the observation points that are further away from the channel. In contrast, reaches 2, 3 and 4 have floodplains with relatively flat terrain, and hence, a small increase in discharge is more likely to expand the flooded area laterally at a faster rate. In general, these patterns suggest that floodplain shape affects time to peak, as shown in Fig. 5. The higher RMSE observed for reach 1 (simulation step 1) (Table 3) compared to other reaches is likely due to the deviation of the reconstructed channel: this deviation was highest in reach 1 (~ 50 m) at a specific location with a sharp channel bend (Fig. 1b). The deviation placed the locations of observed water depths either closer to or further away from the modelled channel, consequently resulting in a higher RMSE. For example, the RMSE on reach 1 is reduced by 0.1 m when observation points where the channel deviates are removed.
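A sensitivity test over combinations of nc and nfp can be organised as a small grid search. The sketch below assumes that every pairing of the three n values is evaluated, which the text does not state explicitly. The function `toy_rmse` is a made-up placeholder standing in for a full model run, and its minimum is placed at the values reported as optimal (channel 0.07, floodplain 0.14) purely for demonstration.

```python
from itertools import product

N_VALUES = (0.01, 0.07, 0.14)  # Manning's n values considered in the text

def toy_rmse(n_channel, n_floodplain):
    """Placeholder for one model run returning an RMSE in metres.

    NOT a hydrodynamic model: a made-up surface used only to exercise
    the search loop below.
    """
    return 0.5 + abs(n_channel - 0.07) + abs(n_floodplain - 0.14)

def best_combination(score, values=N_VALUES):
    """Score every (n_channel, n_floodplain) pair and return the best one."""
    combos = list(product(values, repeat=2))
    scored = {combo: score(*combo) for combo in combos}
    best = min(scored, key=scored.get)
    return best, scored

best, scored = best_combination(toy_rmse)
print(len(scored), best)
```

In practice `score` would wrap a model run for one reach and return the minimum RMSE against the observed depths on that reach.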

The second step of simulation approximated the flow propagation time on upstream reaches using flood durations at building locations. The largest difference between modelled and observed duration occurred on reach 2. This may be attributed to a culvert blockage under a bridge at J1 (Fig. 1b) and to backwater effects, which caused water to be retained in low terrain areas close to the outlet along reach 2. Although flow at partially and fully blocked culverts (or bridges) can be replicated in CL by increasing the cell n value at blocked locations, adding this level of complexity to the model introduces a number of unknowns that need to be calibrated, such as (i) the n value representing the blockage, which will vary over time, and (ii) the time of onset and the duration of the blockage. As such, the replication of such effects becomes difficult and could potentially produce higher RMSEs on reach 2.

The temporal shifting of hydrographs performed to characterize the downstream flow resulted in the selection of a hydrograph with a peak discharge of 147 m³ s⁻¹ and a total duration of 21 h. Although the simulations with an optimal shift and with no shift (Fig. 7a, b) showed a comparable minimum RMSE of 0.52 m, the no-shift scenario highly overestimated the peak discharge (303 m³ s⁻¹). The sinuous shape of the RMSE curve in the no-shift simulation (Fig. 7b) arose because the modelled flood depths closely matched the observed depths on both the rising and the falling limb of the hydrograph. Similar results were found by Neal et al. (2013) when peak flows were simply allowed to coincide in time (no shift), which resulted in an overestimation of observed depths. The optimal-shift hydrograph showed a steady and stable minimum RMSE, meaning that all upstream channels flowed downstream in such a way as to consistently maintain the closest match to the 55 observed water depths downstream. To further evaluate the optimally shifted hydrographs and the corresponding peak discharge of 147 m³ s⁻¹, we investigated the simulated depths at all the building locations on reach 5. Simulated water depths for the simulations with peak discharges of 147 m³ s⁻¹ and 303 m³ s⁻¹ were jointly plotted against simulation time (Fig. S1 a). In addition, we show the distribution of observed flood depths at all building locations on reach 5 (Fig. S1 b). While the flood depths for the simulation with the selected optimal peak discharge of 147 m³ s⁻¹ are mostly within the range of observed depths, the simulation with a peak discharge of 303 m³ s⁻¹ contains many instances of simulated flood depths that lie outside the range of observed depths (Fig. S1 b). This comparison provides further evidence supporting the temporal shifting of upstream hydrographs.
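Why coincident peaks inflate the downstream peak can be illustrated with a simple superposition of upstream hydrographs. The sketch below is a strong simplification and not the routing performed by the hydrodynamic model: it assumes hydrographs are translated downstream without attenuation and simply added, and the triangular shapes, peaks, durations and lags are hypothetical.

```python
import numpy as np

def triangular_hydrograph(peak, duration_h, dt_h=1.0):
    """Symmetric triangular hydrograph rising from 0 to `peak` and back to 0."""
    t = np.arange(0.0, duration_h + dt_h, dt_h)
    half = duration_h / 2.0
    return peak * (1.0 - np.abs(t - half) / half)

def shift(q, lag_steps):
    """Delay a hydrograph by `lag_steps` time steps, padding with zero flow."""
    return np.concatenate([np.zeros(lag_steps), q])

def combine(hydrographs, lags):
    """Sum shifted hydrographs (pure translation, no attenuation assumed)."""
    shifted = [shift(q, lag) for q, lag in zip(hydrographs, lags)]
    total = np.zeros(max(len(s) for s in shifted))
    for s in shifted:
        total[:len(s)] += s
    return total

# Two hypothetical upstream reaches (peaks in m3/s, duration in h).
q1 = triangular_hydrograph(peak=100.0, duration_h=10.0)
q2 = triangular_hydrograph(peak=60.0, duration_h=10.0)

no_shift = combine([q1, q2], lags=[0, 0])  # peaks coincide in time
shifted = combine([q1, q2], lags=[0, 4])   # second reach arrives 4 h later
print(round(float(no_shift.max()), 1), round(float(shifted.max()), 1))
```

With coincident peaks the combined peak equals the sum of the upstream peaks, whereas any offset in arrival time lowers it. A calibration over lags would then retain the lag combination whose simulated depths give the lowest RMSE against the downstream observations.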

The RMSE between observed and modelled flood depths at the river junctions was 0.67 m. Due to the complexities associated with such locations (channel interactions and/or culvert blockage) and the exclusion of these data from model calibration, an RMSE (0.67 m) higher than the gRMSE (0.61 m) was expected. However, the small difference in RMSE of 0.06 m indicated that the temporal shifting of hydrographs closely captured the flow processes at the river junctions. This demonstrates the applicability of the method in a complex multichannel river network. It also showed that model results were not dependent on the calibration data used, which indicates the potential transferability of the method. The gRMSE of 0.61 m is within ± 0.10 m of other studies that used similar hydrodynamic models and had comparatively higher data availability and quality (e.g. Mignot et al. 2006; Fewtrell et al. 2011; Neal et al. 2011; Yan et al. 2015; Ramirez et al. 2016; Altenau et al. 2017). Schumann et al. (2015) noted that water level accuracies generally vary from a few centimetres to 1–2 m during calibration, and our results are within this range. Hence, given the limitations due to DEM resolution, the geometric representation of the river channel, and the unavailability of data (rainfall or discharge) to set up boundary conditions, a gRMSE of 0.61 m for 300 observations is acceptable and may provide a provisional alternative for reconstructing plausible flood scenarios in data-scarce areas.

The results obtained from this study may be replicated using different input discharge values, shapes of the duration hydrographs and/or different temporal combinations of the upstream peak discharge. As such, various model setups can be equifinal (Beven and Freer 2001), but this is less important in our study because the aim was not to reconstruct the exact hydrographs. Our aim was to arrive at a plausible scenario that minimizes the RMSE on each reach and subsequently the gRMSE on the entire river network. At this minimum gRMSE, flood depths can be reliably extrapolated to other neighbouring buildings to enrich a data set for developing flood damage models. The use of CL allows the consideration of process characteristics (e.g. Manning’s coefficient) and temporal dynamics of the upstream reaches using a physically informed method, which is usually not guaranteed by geo-statistical approaches.

Our study extends current methods in the application of house-to-house interview data collected using questionnaires. In particular, it extends the application of interview data from methods using GIS to map past flood events (e.g. Poser and Dransch 2010; Singh 2014; Sy et al. 2016, 2020) to a method using a hydrodynamic model. In this way, the physical and dynamic characteristics of past flood events can also be represented. Focussing on hydrodynamic approaches, our study extends the method applied by Borga et al. (2019) and Bronstert et al. (2018), who utilized post-event data to estimate peak discharge for a single channel; here, we applied a similar method to a complex multichannel network with further consideration of temporal dynamics between upstream and downstream catchments. This study agrees with Neal et al. (2013), Pattison et al. (2014) and Zischg et al. (2018b) on the importance of considering the spatiotemporal dependence of the downstream catchment on multiple upstream channels in flood analysis.

5.2 Model limitations and uncertainties

Several studies (e.g. Saksena and Merwade 2015; Ramirez et al. 2016) have shown that model errors are directly related to the accuracy of the elevation model used, since the DEM directly affects where water is routed. Consequently, the coarser a DEM is, the less likely it is to correctly represent floodplain and channel characteristics. Saksena and Merwade (2015) suggested that errors from coarse resolution DEMs are specifically related to reach length and width, valley shape and land use. In our study, the 12 m resolution WorldDEM DTM has a finer resolution than globally available DEMs (e.g. 30 m SRTM), but we find that the WorldDEM DTM still requires pre-processing to represent river channels. Thus, the location, width and depth of the river channel, the presence of artefacts such as bridges and culverts, and the shape of the channel valley contribute to model uncertainty. To reduce these uncertainties, we reconstructed the channel location and width with the support of field photographs and satellite imagery. However, considering that the reconstructed channel deviated in multiple areas, which has likely resulted in local errors (see Sect. 4.4), it is apparent that data limitations directly affect model uncertainty.

Another source of uncertainty is the characterization of the channel using a simplified assumption of a rectangular shape and a uniform drop in channel elevation. A study by Neal et al. (2015) showed that, except for an increase in flood wave propagation that is usually more pronounced in large catchments, flood models with simple channels have comparable accuracy in water depth and extent to models using more complex channel bed representations. This suggests that our choice of a simple channel representation produces a negligible amount of error. A uniform drop of the entire channel by 1 m was used for our study. Without a channel defined in the DEM, the floods in some locations commenced nearly at the level of the floodplain, and this would underestimate optimal discharge values. In contrast, excessively dropping the channel in elevation would unrealistically confine the river and severely overestimate optimal discharge. To test the sensitivity to the selected drop in channel elevation, we ran additional simulations with different channel elevations. The channel elevation drops used in these simulations were 0.5 m, 0.75 m, 1.25 m and 1.5 m. All simulations were run using the same n values for the floodplain (0.14) and channel (0.07), which is consistent with the selected optimal n values. In total, we ran 20 simulations, corresponding to 4 simulations (0.5, 0.75, 1.25 and 1.5 m bed elevation drop) on each of reaches 1–5. The results generally show low sensitivity of the peak discharge and RMSE to changes in channel elevation (Figure S2 a–e, and Table S1).
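The uniform channel drop and the resulting simulation matrix can be sketched as follows. This is a minimal illustration assuming the DEM is held as a 2D array and the reconstructed channel as a boolean mask of the same shape. The function name, the toy DEM and the mask are hypothetical and do not reproduce the pre-processing actually applied.

```python
import numpy as np

def lower_channel(dem, channel_mask, drop_m):
    """Return a copy of `dem` with channel cells lowered uniformly by `drop_m`."""
    burned = np.array(dem, dtype=float, copy=True)
    burned[np.asarray(channel_mask, dtype=bool)] -= drop_m
    return burned

# Hypothetical 3 x 3 DEM (m) with the channel along the middle column.
dem = np.full((3, 3), 100.0)
channel_mask = np.zeros((3, 3), dtype=bool)
channel_mask[:, 1] = True

burned = lower_channel(dem, channel_mask, drop_m=1.0)  # 1 m drop as in the study

# Sensitivity matrix: 4 alternative drops on each of reaches 1-5.
reaches = [1, 2, 3, 4, 5]
drops_m = [0.5, 0.75, 1.25, 1.5]
runs = [(reach, drop) for reach in reaches for drop in drops_m]
print(burned[:, 1].tolist(), len(runs))
```

Each `(reach, drop)` pair would correspond to one model run with the Manning's n values held fixed, giving the 20 sensitivity simulations described above.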

As buildings were not extruded in the DEM, the peak discharge determined on each reach (Table 3) may be slightly overestimated. This is because such a DEM provides limited confinement and friction, meaning that a larger water volume and a higher discharge are required to closely match observed depths. Bermúdez and Zischg (2018) and Neal et al. (2009) reached similar conclusions, finding that building representation affects modelled depths. In addition, a common practice in the Suleja/Tafa region is the construction of walls around buildings, which serve as flood protection. The walls are especially predominant on reaches 1 and 2 and are close to the channels. Generally, walls alter the amount of floodwater that enters a building unless they have been damaged by the flood. Our model uses a 4 m spatial resolution DEM and cannot accurately represent fine-scale features such as walls; adding these features to the DEM at the current resolution would substantially confine water to the channel and prevent it from propagating onto the floodplain. By not representing walls, we have likely overestimated flood depths at building locations.

Furthermore, uncertainties relating to the interview data exist. The fieldwork was carried out eight months after the flood event, and the accuracy of observed flood depths and durations was partly dependent on the personal recollection of people living in the affected houses. Generally, while people’s recollection of past events becomes vague over time (Lacy and Stark 2013), floods are traumatic events and people are likely to correctly remember details over long periods (Sotgiu and Galati 2007). In our data, about 35% of observation points had flood marks, and it was possible to measure the flood depths directly. Such field visits provide an opportunity to reduce uncertainties related to flood depths inaccurately reported by community residents (McDougall and Temple-Watts 2012; McDougall 2011). During such field visits, additional data from sediment deposits, erosion lines or marks on riparian vegetation can also be explored to obtain flood depths; this has recently been demonstrated by Garrote et al. (2018) to improve flood hazard assessment. Unlike flood depths, which can be checked against flood marks, flood durations were based entirely on personal recollection, which makes the data susceptible to uncertainties. Some studies (e.g. Mbow et al. 2008; Sy et al. 2016) have also observed that local factors such as lack of means for evacuation and topographic depressions can influence flood duration data at building locations and consequently affect model performance. For example, Table 2 shows high variation in flood duration on different reaches, with the widest range at Junction 1 (minimum of 1 h and maximum of 178 h). In general, recently demonstrated approaches in citizen science projects can be used to reduce the uncertainty in interview data collection. For example, two separate individuals (within a household) can be interviewed to check the consistency of the information provided: a similar approach was demonstrated by Sy et al. (2020), who interviewed two sets of community representatives to reconstruct three past flood events in Senegal. Sy et al. (2020) reported that this approach improved the validity and reliability of their method, since the consistency of information could be checked using both sets of participants. Also, while incidences of post-traumatic stress disorder (PTSD) have been observed to affect flood victims, Chen and Liu (2015) showed that PTSD has a low prevalence rate (11.45%) at 6 or more months after a flood event. Given that our field data collection was carried out 8 months after the flood and about 35% of flood depths were measured directly in the field, our data and results are less likely to be affected by the incidence of PTSD in interviewees.

6 Conclusion

Knowledge about past flood hazards, reconstructed using hydrodynamic models, provides a key input for flood risk assessment. Scenarios of past hazards allow a better understanding of spatial and temporal patterns of flood risk within a region, which is important for future flood management decisions. In this study, we developed a systematic method to reconstruct a plausible scenario of the 2017 flood event in the Suleja and Tafa region by utilizing field interview data and hydrodynamic modelling. The application of spatially distributed post-flood data collected from eyewitnesses, in particular flood depths and durations, shows good potential for calibrating hydrodynamic models in data-scarce regions. However, this does not come without challenges, such as low DEM resolution, channel bed realignment and interview data uncertainties. These challenges and limitations in the application of hydrodynamic modelling in data-scarce regions have been discussed in this study.

This study represents the first application of a hydrodynamic modelling approach in the Suleja and Tafa area. It contributes to current knowledge on reach characteristics and the identification of potential high flood magnitude areas, which are important for future risk management. For example, the spatial distribution of flood characteristics (e.g. flood depth and duration) at building locations within and outside the surveyed extent can be utilized for developing flood damage curves to support physical vulnerability assessment of regional building types. Although the method applied in this study may be case-study specific, the individual components of the new approach can be applied in similar data-scarce locations. For example, field interview data are usually easy to acquire using questionnaires and can therefore provide a sizeable calibration data set in such regions. Furthermore, the study highlights possible opportunities and challenges in modelling floods on data-scarce reaches. Further development and improvement of methods covering systematic field data collection and its integration into hydrodynamic modelling are encouraged.

Due to data limitations, the scope of this study was limited to scenario reconstruction. However, the approach developed opens a pathway for future flood reconstruction in data-scarce areas. In particular, the techniques applied (e.g. in identifying optimal peak discharge and duration) and the discussion of uncertainties provide a good baseline for future studies to improve upon. We therefore recommend applying the method in a data-rich region to further evaluate the plausibility of the model output against observed data.