1 Introduction

In the last decade, major flood events across Canada have caused heavy damage to infrastructure and homes and have cost the lives of Canadians. In 2013, a flood in Calgary caused 6 billion dollars in damages to the city and killed five people (Bogdan et al. 2018). In Eastern Canada, a widespread flooding event in April of 2019 flooded over 15,000 residences across Ontario, Quebec, and New Brunswick; the water levels from this event broke Quebec records previously held by the 2017 flood (Lowrie 2019). To help mitigate the damage caused by floods across the nation, and to better understand flood patterns, a centralized system for flood/inundation mapping in Canada would be beneficial. These maps visually display the estimated spatial extent of flood water from a simulated flood event and are important tools for facilitating risk communication to stakeholders and motivating action (Henstra et al. 2019; Dransch et al. 2010). Further, if an accurate and computationally rapid flood model were combined with continuous hydrological/meteorological measurements and/or prediction data, it would be possible to create flood maps in near-real time that represent near-future forecasted flood conditions, or even simulate experimental flood conditions based on user inputs. These on-the-fly flood maps could help emergency personnel and decision makers stay better informed about ongoing flood events in their respective jurisdictions.

Traditional methods for flood mapping focus on complex hydrodynamic and hydraulic models to map out the inundation extent areas. These models often incorporate shallow water equations (SWEs) in various dimensions (1D–3D) and require a variety of data inputs including flow rates, temperature, bed roughness, wind conditions, and more (Jovanovic et al. 2019). The extensive data requirement of these models limits their usability within many Canadian communities that are data scarce, while their high computation times are impractical for on-the-fly flood mapping. Although many of these models, like TUFLOW, parallelize SWE calculations and are being implemented into cloud-based systems to reduce overall computation times, the high cost constraints and data requirements are still a considerable factor (Jovanovic et al. 2019; Van Der Velde and Huxley, 2020). Other research into flood mapping is focused on simplifying the models themselves (0D) and their data requirements to allow for widespread application (Rebolho et al. 2018; McGrath et al. 2018).

One such simplified flood model is the Height Above Nearest Drainage (HAND) model. The HAND model estimates flood extents by normalizing topography data: it calculates the difference between the elevation of a land grid cell and the elevation of the river grid cell it is estimated to drain into, as determined through flow simulations (Nobre et al. 2011). This model only requires topographic data of the watershed and a river network raster file (Liu et al. 2018). Once a HAND raster file is produced, creating an inundation/flood map only requires the user to run a simple raster calculation:

$${\text{D}}_{{\text{z}}} = {\text{ H }}{-}{\text{ H}}_{{{\text{HAND}}}}$$
(1)

where $D_{z}$ is the flood depth (m) at each pixel, H is the water level (m) from either a gauge measurement or a reach-average level, and $H_{\text{HAND}}$ is the value (m) of the HAND raster at each pixel. The HAND model has performed well in previous inundation and floodplain mapping studies (Garousi-Nejad et al. 2019; Diehl et al. 2021), including a flood modeling study by McGrath et al. (2018) for watersheds in Quebec and New Brunswick. Results of the study showed that the HAND model produced inundation maps with Critical Success index (CSi) values of 0.903 and 0.958. Computation with HAND was rapid: only 3.3 and 5.6 seconds of processing were needed to generate a flood extent raster for two study sites of size 17 km by 8 km (136 million pixels) and 11 km by 7 km (77 million pixels) at a 1 m resolution (McGrath et al. 2018). In comparison, Vacondio et al. (2017) required eight hours to process a flood extent raster with about 40 million pixels (2 m resolution) using a Cartesian grid-based 2D SWE model. Combining the HAND model with predictive or continuous hydrological/meteorological data, such as from the hydrometric gauge stations operated by the National Hydrological Service (NHS) or the GeoMet platform from the Meteorological Service of Canada (MSC), could assist in producing on-the-fly flood maps across Canada. The challenge is that at many sites without a hydrometric gauge station, the best source of hydrological data comes from models and platforms, like GeoMet, that only provide estimates of river discharge/flow (Q, in m³/sec). Because the HAND model requires water level (H) data, a mathematical relationship between Q and H is needed to generate a flood extent map in these ungauged locations.
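The raster calculation in Eq. 1 can be illustrated with a minimal Python sketch. The 3 × 3 HAND grid and the water level below are hypothetical values, and clamping dry cells to a depth of zero is an assumption made here for readability rather than a documented step of the HAND workflow.

```python
# Toy HAND raster (m); hypothetical values, not taken from any study site.
hand = [
    [0.0, 0.4, 1.8],
    [0.2, 1.1, 3.5],
    [0.9, 2.6, 6.0],
]


def flood_depth(hand_raster, water_level):
    """Apply Eq. 1 per pixel (depth = H - H_HAND); dry cells are set to 0 (assumption)."""
    return [[max(water_level - h, 0.0) for h in row] for row in hand_raster]


# Hypothetical water level H = 1.5 m.
depth = flood_depth(hand, water_level=1.5)

# A cell is treated as inundated when its HAND value is below the water level.
flooded = [[d > 0.0 for d in row] for row in depth]
flooded_cells = sum(cell for row in flooded for cell in row)
```

In practice the same expression would be evaluated on a full raster with a GIS raster calculator; the nested lists only stand in for that grid.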

To address this issue, we developed a custom ArcGIS tool, Canadian Estimator of Ratings Curves using HAND and Discharge (CERC-HAND-D), that can assist in the creation of synthetic rating curves (SRCs) through the method described by Zheng et al. (2018a). An SRC defines a mathematical relationship between Q and H for a given segment of river and its catchment area. For each iteration of H, a corresponding Q is calculated using the cross-sectional geometry of the inundation maps, river characteristics, and the Manning’s formula. CERC-HAND-D can produce an SRC in the form of an equation (SRC equation) and a look-up table (SRC table), depending on the need of the user.

Creating accurate SRCs is challenging due to a variety of factors. For instance, according to Godbout et al. (2019), the Manning’s formula is sensitive to the Manning’s roughness coefficient (n). An irregular surface, such as a rocky riverbed, creates a larger resistance to water flow than a smooth surface, such as asphalt. As the stage height of a flood increases and the flood waters reach the riverbank and the floodplain, they are exposed to multiple land surface types across the wetted perimeter, which makes it difficult to determine an appropriate n value. Traditionally, SRCs are created using a single fixed n value in the calculation of Q, with this fixed value representing the roughness of the main channel bed (Zheng et al. 2018a). Vatanchi & Maghrebi (2019) emphasized the bias of using a single fixed n value and found that changes in floodplain or riverbank roughness have a stronger effect on SRCs than the channel bed roughness. Another challenge with creating SRCs, especially when using the HAND model to define flood extents, is that river reaches with extreme lengths and/or slopes tend to have inflated estimates. In their study on the accuracy of SRCs in a Texas watershed, Godbout et al. (2019) found that river reaches with gradients below 0.001 m/m usually performed poorly, as did river reaches shorter than 1.2 km, and that performance started to decline for reaches longer than 5 km (Garousi-Nejad et al. 2019).

In this study, we experimented with two distributed n methods (weighted and minimum-median) for calculating SRCs, in addition to applying a single fixed value. CERC-HAND-D was modified to accommodate the two experimental methods, and the study was conducted in a variety of locations across Central and Eastern Canada to explore the applicability of CERC-HAND-D in the diverse landscapes of Canada.

2 Study sites and control data

Five control rating curve tables, derived from the primary sensor at each hydrometric gauge station, were provided by the NHS. These were the most current data available, each last updated between September 4th, 2019 and August 28th, 2020. These gauge stations (Table 1) were selected due to the availability of control rating curves and high-resolution topographic data. In addition, these gauge stations were selected to provide a variety of river slopes and reach lengths to test the accuracy of the SRCs when given moderate and extreme reach characteristics. A reach is defined here as the section of river within a catchment polygon (Fig. 1). Slopes greater than 0.002 m/m were considered to be high gradient, while slopes lower than 0.001 m/m were considered to be low gradient (Godbout et al. 2019). In keeping with the findings of Godbout et al. (2019), river reaches shorter than 1.2 km were considered short, while river reaches longer than 5 km were considered long.

Table 1 Hydrometric gauge station statistics; data provided by Environment and Climate Change Canada (ECCC)
Fig. 1

Locations of the case studies in Central and Eastern Canada, with the station numbers of the hydrometric stations operated by the National Hydrological Service. The catchment polygons and the river network are shown for each study area: Whitemud River in Westbourne, Manitoba; Grand River in Cambridge, Ontario; Riviere Richelieu near Saint-Jean-sur-Richelieu, Quebec; Aroostook River near Tinker, New Brunswick; North River in North River, Nova Scotia. The river network was sourced from the National Hydro Network (NHN)

The sites selected for this study are displayed in Fig. 1. The Whitemud River study site in Westbourne, Manitoba is a low slope gradient area (S = 0.00069 m/m) and is highly forest covered (40%) with a moderate reach length of 2.7 km. The Whitemud River was one of many rivers that experienced flooding during the province-wide spring-melt flood event in May of 2011, with the village of Westbourne experiencing a 1-in-100-year flood (Government of Manitoba 2013). The Grand River study site is located in the highly urbanized (78.4%) downtown area of Cambridge, Ontario, which was the location of a 1-in-500-year flood event in 1974 (Gardner 1977). The Grand River in downtown Cambridge has a low-moderate slope (S = 0.0013 m/m) and a moderate reach length (3.7 km). The Riviere Richelieu study site near Saint-Jean-sur-Richelieu, Quebec had the longest river reach (15.7 km), with a low-moderate river slope of 0.0013 m/m, and is made up mostly of cropland (63.8%). A record flood occurred on the Riviere Richelieu in early spring of 2011 that lasted for 67 days as a consequence of a mixture of spring melt and intense spring rains (Saad et al. 2015). The Aroostook River study site near Tinker, New Brunswick is moderate in both river length (1.9 km) and river slope (S = 0.0015 m/m) and is surrounded by highly rural areas (49.5% forested, 36.9% cropland). The Aroostook River is a main tributary to the Saint John River and was one of the sites of a record-breaking spring flooding event that occurred along the Saint John River and its tributaries in May of 2008 (Newton and Burrell 2016). The North River study site in North River, Nova Scotia has the highest river gradient (S = 0.0046 m/m) and is highly forested (57.1%) with a moderate river length (1.7 km). The site was the location of a severe widespread flood event in September of 2012 that was caused by intense rainfall and led to water levels of 1.5 m (CBC News 2012).
Because the control dataset from the NHS provided an inconsistent range of H values between hydrometric stations, the water level (H) values spanned a range of 0.1–2.7 m for the Riviere Richelieu study site and 0.1–2.5 m for the North River site; the other sites had an H value range of 0.1–5 m.

3 Materials and methods

The following steps were used to generate the SRC equations and tables using CERC-HAND-D: (a) accessing data from publicly available national datasets; (b) deriving the HAND rasters; (c) running the CERC-HAND-D tool to create the SRC tables using single fixed n values and distributed n methods; (d) applying nonlinear regression to derive the final SRC equations using CERC-HAND-D; (e) assessing the accuracy of the SRCs against the control rating curves from the NHS.

3.1 Raw datasets

3.1.1 Digital terrain models

Topographic data of each study area (Fig. 2a, d, g, j, m) were acquired from the HRDEM dataset, provided by Natural Resources Canada (NRCan). This dataset provides digital terrain models (DTMs) of scenes from across Canada, covering about 400,000 km² of area (Bélanger et al. 2020). These models were generated using airborne Light Detection and Ranging (LiDAR) measurements. DTMs are “bare-earth” representations of topography, in which vegetation and manmade structures, such as buildings and bridges, are not included (Natural Resources Canada 2020). For this study, DTMs of each case study were utilized as they are better suited for hydrological studies. Horizontal resolutions were 2 m for the Richelieu study area and 1 m for the remaining study areas; vertical accuracy across the study sites ranged from 0.036 to 0.226 m at a 95% confidence level in non-vegetated areas.

Fig. 2

Input data sourced or derived from national and public datasets. Elevation data from the HRDEM dataset a, d, g, j, m. Land class data from the 2015LCC b, e, h, k, n. The HAND rasters c, f, i, l, o derived from HRDEM and the NHN

3.1.2 Stream network

Stream network vector data (Fig. 1) were extracted from the NHN dataset. The data were created at a scale of 1:50,000 or smaller and provide geospatial information on inland bodies of water in Canada, including lakes, rivers, and streams. Only major rivers and tributaries in the stream network data were chosen for creating the HAND model and the SRCs to save on computation time (McGrath et al. 2018). The NHN stream network vector data were also used to create the catchment polygons (Fig. 1) by using the Watershed tool in ArcGIS Pro 2.4 with an input D8 flow raster file and a pour points vector file that was based on the intersecting mid-vertices of the NHN stream network vector dataset.

3.1.3 Land class

Land class data (Fig. 2b, e, h, k, n) were acquired from the 2015 Land Cover of Canada (2015LCC), provided by the Canada Centre for Remote Sensing (CCRS). The dataset has a resolution of 30 m and was generated using satellite observations. In Table 2, n values for each land class category were based on Table 3–1 in the HEC-RAS manual (Brunner 2016), specifically using the ‘Normal’ column. CERC-HAND-D automatically adds these values to the attribute table of the 2015LCC.

Table 2 Designation of n values for 2015LCC file, based on the normal values in HEC-RAS manual (Brunner 2016; Table 3–1)

3.2 Height above nearest drainage

A HAND raster of each study area (Fig. 2c, f, i, l, o) was derived using the Hydrology toolset available in ArcGIS Pro 2.4 (under the Spatial Analyst license) and methods described by Tarboton (1997) and Nobre et al. (2011). First, the DTM files were hydrologically conditioned using the Fill tool to remove voids in the elevation profile. Then, the void-filled DTM was passed through the Flow Direction tool (D8 algorithm) and then the Flow Accumulation tool to create a stream raster, a binary raster dataset in which a cell value of 1 is a stream cell and a cell value of 0 is a non-stream cell. The void-filled DTM was then passed through the Flow Direction tool again, this time using the D-Infinity algorithm, to create the flow direction raster. Finally, the three datasets (void-filled DTM, stream raster, and flow direction raster) were passed to the Flow Distance tool with the vertical distance type setting, which output the HAND raster. The value of each pixel in these HAND rasters represents the elevation difference between each point of interest and the end point of its drainage path (Fig. 3). The main assumption of the HAND model is that any cell with a HAND value lower than the simulated flood water level will be inundated; therefore, Eq. 1 was the raster calculation equation used to create the flood maps.

Fig. 3

Visual display of how the HAND raster is created. The HAND value is the vertical difference between the point of interest and the drainage endpoint that it is most likely to drain into (Nobre et al. 2011)

3.3 SRC tables

Three Python 3.7 scripts were written to calculate the SRCs, with the only difference being how they represented surface roughness. Each script requires the Spatial Analyst license extension for ArcGIS Pro to allow access to the various Spatial Analyst module functions used in the code (i.e., arcpy.sa). Equations 5–11 from Zheng et al. (2018a) were the basis for the various intermediate calculations that derived the Manning’s parameters (river slope, n, flood volume, and hydraulic radius). All three scripts could be loaded into the CERC-HAND-D custom tool and were written to accommodate the data from the previously mentioned public databases.

The workflow for the CERC-HAND-D tool is shown in Fig. 4, which displays how the tool incorporates the input datasets into various spatial analytics and mathematical equations from Zheng et al. (2018a) to calculate the values of the Manning’s formula parameters (Fig. 5a). These parameters are based on reach average characteristics (wetted perimeter, slope, etc.) of each inundation zone. The tool first clips all the input datasets to the extent of the catchment polygon. Then, the tool iterates through water level (H) values (m) starting at 0.1 m, increasing by increments of 0.1 m, until the maximum H is reached. In each H iteration (Hi), an inundation extent raster file is created using a raster calculation (Hi – HAND value), which is used to compute the entire flood volume. The inundation extent file is then used to create a temporary clip of the DEM, and that clipped DEM is converted to a slope raster. That slope raster is used to calculate the hydraulic radius of the entire flooded zone. The catchment average flood volume and hydraulic radius for each Hi are derived by dividing the two parameters by the stream length within the watershed. Once the stream network file is clipped by the catchment polygon, the average river slope is estimated by sampling the DTM elevations at the stream network start and end vertices (dangling points) using the Feature Vertices to Points tool with the dangle option enabled. The slope is calculated as the difference between the highest and lowest elevations divided by the length of the river reach. The CERC-HAND-D tool provides the user with the option to designate an n value for the channel bed roughness based on physical characteristics of the channel bed and n values from HEC-RAS. For example, because the river channel in the Whitemud River case study was meandering, an n value of 0.045 was chosen, while an n value of 0.035 was chosen for the rest of the study sites as these channels were straighter (Brunner 2016).
In the single fixed n value script, these n values are used to represent the surface roughness of the entire scene. Once the parameters are calculated and a Q value is calculated using the Manning’s formula, each Q is added to a resulting SRC table along with the Hi.
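The loop described above can be sketched in plain Python. This is a minimal illustration, not the CERC-HAND-D code: it assumes the reach-average cross-sectional area is the flood volume divided by the reach length and the reach-average wetted perimeter is the wetted bed area divided by the reach length, and the volumes and bed areas are hypothetical numbers standing in for the raster-derived quantities. Only the slope, reach length, and n value are borrowed from the Whitemud River description.

```python
def manning_discharge(area, hydraulic_radius, slope, n):
    """Manning's formula in SI units: Q = (1/n) * A * R^(2/3) * S^(1/2)."""
    return (1.0 / n) * area * hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5


def src_table(levels, volumes, bed_areas, reach_length, slope, n):
    """Build (H, Q) rows from reach-averaged geometry for each water level."""
    rows = []
    for h, volume, bed_area in zip(levels, volumes, bed_areas):
        area = volume / reach_length                # reach-average cross-section (m^2)
        wetted_perimeter = bed_area / reach_length  # reach-average wetted perimeter (m)
        radius = area / wetted_perimeter            # hydraulic radius (m)
        rows.append((h, manning_discharge(area, radius, slope, n)))
    return rows


# Hypothetical inputs for three water levels on a 2.7 km reach.
levels = [0.1, 0.2, 0.3]                  # H (m)
volumes = [5400.0, 12000.0, 21000.0]      # flood volume (m^3), made up
bed_areas = [60000.0, 75000.0, 95000.0]   # wetted bed area (m^2), made up

table = src_table(levels, volumes, bed_areas,
                  reach_length=2700.0, slope=0.00069, n=0.045)
```

Each row pairs a water level with its estimated discharge, which is the structure of the SRC table that the later regression step consumes.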

Fig. 4

Workflow of the CERC-HAND-D tool, specifically for the scripts involving the distributed n methods. For the fixed n version of CERC-HAND-D, the Land Class arm of the flowchart is not present

Fig. 5

The Manning’s formula and its parameters (a), including the reach average cross-sectional area, the hydraulic radius, river reach slope, and the roughness coefficient (n). The geometry and characteristics of the river channel/riverbanks and how they influence the parameters in the Manning’s formula (b)

3.4 Distributed n methods

Recent studies have suggested that a composite n value could improve the overall accuracy of an SRC (Zheng et al. 2018b; Garousi-Nejad et al. 2019). Several composite n relationships have been proposed by researchers in the past (Chen and Yen 2002). Often these methods involve partitioning the cross-sectional area of the inundation extent into subsections, such as river channel and riverbank (Fig. 5b), then providing each subsection an n value, wetted perimeter, and cross-sectional area (Vatanchi and Maghrebi 2019; Brunner 2020). It could be challenging for a user to select a proper n value for each subsection along the growing wetted perimeter or to identify where these boundaries such as the riverbanks are located. For this study, simplistic approaches were experimented with to determine the best representation of variable flow resistances in the study area. Similar to the methods of Ozdemir et al. (2013), a classified land-class raster (2015LCC) was used to capture the general distribution of surface roughness in each scene and to delineate where the river channel ends and the riverbank begins.

3.4.1 Minimum-median distributed n method

The first distributed n method involved extracting the minimum and median n values from the land classes present within the catchment area of each case study. For example, the urban land class had the smallest n value (Table 2), and in every case study, the urban land class was present (Fig. 4); thus, in every case study, a value of 0.016 was used as the minimum n value. For each Hi, the minimum and median n values were incorporated into the Manning’s formula so that each Hi translated into two flow (Q) values that capture a range of possible Q values for a given fixed H. Thus, the median and minimum SRCs could be used to estimate a best-case and a worst-case flood scenario or to create a prediction window that potentially captures the true flood extent. A maximum n value was also tested; however, this resulted in an SRC that greatly over-estimated H values for every case study.
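A minimal sketch of the minimum-median idea follows. Only the urban n value (0.016) comes from the text; the other land classes and their n values are hypothetical placeholders, the median is taken over the set of classes present (an assumption about how the median is formed), and the geometry passed to the Manning’s formula is made up.

```python
from statistics import median

# n values by land class present in a catchment. Only "urban" (0.016) is
# stated in the text; the remaining entries are hypothetical placeholders.
land_class_n = {"urban": 0.016, "cropland": 0.035, "grassland": 0.05, "forest": 0.1}


def min_median_n(n_by_class):
    """Return the minimum and median roughness over the classes present."""
    values = sorted(n_by_class.values())
    return values[0], median(values)


def manning_discharge(area, hydraulic_radius, slope, n):
    """Manning's formula in SI units: Q = (1/n) * A * R^(2/3) * S^(1/2)."""
    return (1.0 / n) * area * hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5


n_min, n_med = min_median_n(land_class_n)

# One water level Hi yields two discharges; geometry values are hypothetical.
q_for_min_n = manning_discharge(area=40.0, hydraulic_radius=0.8, slope=0.0013, n=n_min)
q_for_med_n = manning_discharge(area=40.0, hydraulic_radius=0.8, slope=0.0013, n=n_med)
```

Because Q is inversely proportional to n, the minimum n always gives the larger discharge for the same water level, which is what opens the window between the two curves.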

3.4.2 Weighted distributed n method

The second distributed method is similar to the general method to calculate a composite n value proposed by Yen (2002). The weighted method calculates a weighted average flow (Q) value for a given inundation extent based on the land classes within the maximum inundated extent and their respective n values, expressed as:

$$Q_{\text{Weighted}} = \sum_{i}^{N_{L}} Q_{i} W_{i}$$
(2)

where $Q_{\text{Weighted}}$ is the weighted average Q (m³/sec) value, $Q_{i}$ is the Q (m³/sec) value calculated using the n value of land class type i, $W_{i}$ is the weighting factor (unitless), and $N_{L}$ is the total number of land class types within the maximum inundated extent. The weighting factor is equivalent to the proportion of the maximum inundation extent that land class type i encompasses. The purpose of this is to incorporate all the n values within the watershed while still giving greater weight to the n values associated with prominent land classes.
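Equation 2 can be sketched as follows. The land classes, their n values (other than urban at 0.016), the cell counts, and the channel geometry are all hypothetical; the weights are computed as each class's share of cells in the maximum inundation extent, which is one straightforward reading of the weighting factor.

```python
def manning_discharge(area, hydraulic_radius, slope, n):
    """Manning's formula in SI units: Q = (1/n) * A * R^(2/3) * S^(1/2)."""
    return (1.0 / n) * area * hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5


def weighted_discharge(area, hydraulic_radius, slope, n_by_class, cells_by_class):
    """Eq. 2: sum over land classes of Q_i * W_i, with W_i the areal share."""
    total_cells = sum(cells_by_class.values())
    q_weighted = 0.0
    for land_class, cell_count in cells_by_class.items():
        weight = cell_count / total_cells
        q_i = manning_discharge(area, hydraulic_radius, slope, n_by_class[land_class])
        q_weighted += q_i * weight
    return q_weighted


# Hypothetical land classes inside the maximum inundated extent.
n_by_class = {"urban": 0.016, "cropland": 0.035, "forest": 0.1}
cells_by_class = {"urban": 200, "cropland": 500, "forest": 300}

q_w = weighted_discharge(40.0, 0.8, 0.0013, n_by_class, cells_by_class)
```

The result always lies between the discharges obtained with the largest and smallest n values, and it collapses to the fixed n result when only one land class is present.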

3.5 Deriving SRC equations

Once the SRC table is fully populated, each of the Python scripts used the SciPy (v.1.2.1) library to perform a nonlinear least squares regression (scipy.optimize.curve_fit) on the H values and estimated Q values from the table. Because the SRC equations need to convert Q values into H values, Q is designated as the independent variable and H as the dependent variable.
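As a dependency-free illustration of this step, the sketch below fits a power law H = a·Q^b, the form of the SRC equations reported for the Riviere Richelieu site (Eqs. 5 and 6). It deliberately swaps the nonlinear least squares call for ordinary least squares on log-transformed values, so it is a stand-in rather than the scripts' method; with noisy data the two approaches would give similar but not identical coefficients. The table values are synthetic.

```python
import math


def fit_power_law(q_values, h_values):
    """Fit H = a * Q**b by ordinary least squares on (ln Q, ln H)."""
    xs = [math.log(q) for q in q_values]
    ys = [math.log(h) for h in h_values]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log space = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log space = ln(a)
    return a, b


# Synthetic SRC table generated from a known curve (hypothetical coefficients).
true_a, true_b = 0.05, 0.6
q_table = [10.0, 50.0, 100.0, 500.0, 1000.0]       # Q (m^3/sec), independent
h_table = [true_a * q ** true_b for q in q_table]  # H (m), dependent

a, b = fit_power_law(q_table, h_table)
```

On noise-free input the fit recovers the generating coefficients, which makes it a convenient check before running the regression on a real SRC table.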

Fig. 6

Synthetic rating curves created using the three n methods (fixed, minimum-median, weighted) for each study site; all charts include the ± 15% acceptance ranges and the SRC equations

3.6 Accuracy assessment

All SRC tables, including the water gauge control rating curve tables, were added to Microsoft Excel. Because the DTMs in the HRDEM dataset do not include bathymetry data for any bodies of water, the origin point (0 m) for the control H values was set to the base river level (m) of each site (Fig. 5b). The base river levels were chosen based on the average daily levels (m) between the start and end dates of the LiDAR capture. For example, the Grand River site had its LiDAR survey done between October 20th and December 6th, 2017, and between these dates the average daily level was 0.99 m; thus, the origin point for the control H data was set to 0.99 m. To compare the estimated flow (Q) values to the control Q values, an uncertainty range of 15% around the control Q values was selected and the percentage of estimated Q values that fell within this range was calculated. Failing to capture any estimated Q values within the 15% acceptance range (AR) was considered a failure for the SRC table. Fifteen percent was selected based on research done by Mansanarez et al. (2019), in which rating curve errors stayed under 15% for medium flows in rivers with mean annual flows similar to those of the Grand River, North River, and Whitemud River sites. This AR comparison method was chosen to account for possible water gauge reading errors, potential errors in DTM creation, and uncertainty in HAND model performance.
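The acceptance range score described above can be expressed in a few lines. This is a sketch with made-up discharge values; it assumes the control and estimated Q values are paired at the same water levels and that the boundary of the ± 15% band counts as inside.

```python
def acceptance_range_score(q_control, q_estimated, tolerance=0.15):
    """Percentage of estimated Q values within +/- tolerance of the control Q."""
    hits = sum(
        1
        for qc, qe in zip(q_control, q_estimated)
        if abs(qe - qc) <= tolerance * qc
    )
    return 100.0 * hits / len(q_control)


# Hypothetical paired discharges (m^3/sec) at the same water levels.
q_control = [10.0, 20.0, 40.0, 80.0]
q_estimated = [11.0, 24.0, 35.0, 90.0]

ar_score = acceptance_range_score(q_control, q_estimated)
srcs_failed = ar_score == 0.0  # an AR score of 0% is treated as a failure
```

Here three of the four estimates fall inside the band, so the score is 75% and the table would not be counted as a failure.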

While there is no set standard method for using statistics to compare SRC equations, previous research has generally used a normalized root mean squared error (NRMSE) to perform error analysis (Godbout et al. 2019; Kavousizadeh et al. 2019; Vatanchi and Maghrebi 2019). For each study site, the NRMSE was calculated as:

$${\text{NRMSE}} = ~\frac{{\sqrt {\frac{{\sum \left( {Hc - He} \right)^{2} }}{N}} }}{{Hc_{{\max }} - Hc_{{\min }} }}$$
(3)

where Hc is the control H (m) values from the NHS hydrometric stations, while He is the estimated H (m) value from the SRC equations and N is the number of data points used to compare the two datasets. An NRMSE closer to 0 indicates that the SRC equation has a stronger agreement with the control rating curve. This NRMSE formula used in Godbout et al. (2019) was chosen because under-estimations and over-estimations do not cancel each other out and because the formula allows for comparisons between rivers of different scales. Because the NRMSE does not measure the extent of under-estimations and over-estimations, the percent bias formula from Godbout et al. (2019) was used:

$${\text{Percent}}~{\text{bias}} = \left( {\frac{{\sum \left( {He - Hc} \right)}}{{\sum Hc}}} \right)*100\left( \% \right)$$
(4)

A positive percent bias would measure the degree of over-estimation, while a negative percent bias would measure the degree of under-estimation of the final SRCs. For the purpose of clarity, the minimum n SRCs and the median n SRCs were tested separately to better examine the performance of each, though they are part of the same distributed n method.
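Both error measures can be sketched in plain Python. The water levels are hypothetical, and the percent bias is computed as estimated minus control so that a positive value corresponds to over-estimation, as interpreted above; that sign orientation is an assumption of this sketch.

```python
import math


def nrmse(h_control, h_estimated):
    """Eq. 3: RMSE of (Hc - He) normalized by the range of the control H values."""
    count = len(h_control)
    squared = sum((hc - he) ** 2 for hc, he in zip(h_control, h_estimated))
    rmse = math.sqrt(squared / count)
    return rmse / (max(h_control) - min(h_control))


def percent_bias(h_control, h_estimated):
    """Percent bias; sign assumed so that positive means over-estimation."""
    difference = sum(he - hc for hc, he in zip(h_control, h_estimated))
    return 100.0 * difference / sum(h_control)


# Hypothetical water levels (m): every estimate is 0.1 m too high.
h_control = [1.0, 2.0, 3.0, 4.0, 5.0]
h_estimated = [1.1, 2.1, 3.1, 4.1, 5.1]

error = nrmse(h_control, h_estimated)
bias = percent_bias(h_control, h_estimated)
```

A constant 0.1 m offset over a 4 m control range gives an NRMSE of about 0.025 (2.5%) and a positive bias of about 3.3%, showing how the two measures separate error magnitude from error direction.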

4 Results

Except for the Riviere Richelieu study site, every study site had at least one SRC table where the estimated data fell within the 15% acceptance range, with the minimum and weighted n SRCs always producing an AR score above 0% (Table 3). The minimum n SRCs produced an AR score range of 24.0–64.0%, while the weighted n SRCs produced an AR score range of 23.7–74.0%. While the fixed n SRCs produced an AR score range of 44.0–64.0%, they failed at the Whitemud River site in addition to the Riviere Richelieu site. The median n SRCs failed (AR = 0%) at every site except for the North River site (AR = 44%). The site with the highest AR score range was Aroostook River (AR = 64.0–74.0%), excluding the failed score (AR = 0%) from the median n SRC. In general, with the exclusion of the AR scores from the median n SRCs and the Riviere Richelieu site, the AR scores ranged from 23.7 to 74.0% with a median AR score of 54.8%. While these scores are not overwhelmingly high, they still indicate that the general trend of the SRC estimates follows that of the control rating curves. These trends are supported by Fig. 6, which plots the SRC equations against the 15% acceptance ranges. As can be seen in Fig. 6, none of the Riviere Richelieu SRCs overlapped with the acceptance range, and the median n SRCs also deviated from the acceptance range. Otherwise, there is generally at least some (> 23.7%) overlap between the SRCs and the acceptance ranges.

Table 3 Error analysis table for the SRCs

These trends are further supported by the NRMSE and percent bias error analysis on the SRC equations, both shown in Table 3. The North River, Aroostook River, and Grand River study sites had an NRMSE range of 3.7–8.8% and a percent bias range of −7.8% to 9.4% when excluding the median n SRCs, which had an NRMSE range of 7.0–45.3% and a percent bias range of 11.5–78.3%. These results indicate that the SRCs for these study sites, with the exception of the median n SRCs, are generally accurate when compared with the control rating curves, with some deviation present. The Riviere Richelieu study site had the weakest performance of all the study sites, with an NRMSE range of 37.5–76.2% and a percent bias range of 69.5–140.1%, which indicates that the SRCs at this study site excessively overestimate H values regardless of the n method used. Interestingly, while the Whitemud River study site had a low AR score range (23.7–34.2%) with two n SRCs and failed (AR = 0%) with the other n SRCs, the NRMSE and percent bias ranges were only somewhat high (14.0–30.9% and 21.0–56.5%, respectively), indicating that the SRC errors may not be significant. Except for the North River study site, every study site and every n method produced SRCs with positive percent bias, which implies that these SRCs will more often overestimate H values than underestimate them. In the context of flood prediction, the more conservative error is preferable, as it is more dangerous to under-estimate a flood than to over-estimate it (Godbout et al. 2018).

Further, among the fixed n SRCs, the minimum n SRCs, and the weighted n SRCs, no n method notably outperformed the others across the case studies. None of the methods consistently produced SRCs with the lowest NRMSE, and in each case study the SRC with the lowest NRMSE only outperformed the next SRC by 2.5% or less. However, as previously mentioned, when the minimum n SRCs are combined with the median n SRCs, the pair offers the advantage of a prediction window (Fig. 6), even though the median n SRCs were the lowest performing group of SRCs. The assumption with using this prediction window is that the true Q-H relationship will generally fall somewhere between the median n SRC and the minimum n SRC. This assumption holds quite well for the North River and Grand River sites, and it nearly holds for the Aroostook River site as well. This suggests that the minimum-median n method may be the preferred method to implement into the CERC-HAND-D tool.

The results of this study give some indication that both river gradient and reach length influenced the quality of the SRCs. Both the case study with the lowest river gradient (Whitemud River) and the case study with the largest reach length (Riviere Richelieu) performed the worst. The Riviere Richelieu site most likely failed because reach average calculations, such as slope and wetted perimeter, become less accurate as they are applied over larger areas; it is possible that segmenting the catchment polygon into smaller sections of the reach might improve outcomes. Interestingly, the North River study site had the highest river gradient (S = 0.0046 m/m), yet the CERC-HAND-D tool was able to produce generally accurate SRCs for this study area. This is despite the fact that previous research suggests that SRCs are not accurate in geographical areas with high or low river gradients. While short reach lengths (< 1.2 km) were not tested in this study, the poor results of the Riviere Richelieu study area (> 5 km) suggest that CERC-HAND-D, regardless of the n method used, may be unsuitable for long river reaches.

5 Historical flood recreation

The overall goal of creating the CERC-HAND-D tool is to support on-the-fly flood mapping utilizing the HAND model, and thus in this section, a series of flood maps that recreated the flood extents of the 2011 Richelieu floods are presented. The objective was to test the overall workflow, starting with converting Q values into H values using an SRC, then applying those H values to the HAND model to produce flood maps.

5.1 Background and testing metrics

The worst performing case study, Riviere Richelieu, was selected for this test to determine how significant the errors could be when using the CERC-HAND-D tool to support flood mapping. The minimum-median n method was chosen due to the advantage of the prediction window provided by this method. As the 2011 Richelieu flood occurred over a period of two months, it was decided to use flood extent data that were captured during the peak flood event in an attempt to recreate the spatial–temporal variability of the flood. According to the historical record of the NHS gauge station located in the Richelieu study area (#02OJ007), peak flow during the 2011 spring floods occurred on May 6th; however, there were no flood extent data available for that day. Instead, gauge data from May 5th, May 7th, May 8th, and May 12th were used, as satellite-derived flood extent polygons were available for these dates; these datasets were sourced from the Open Canada Database (https://open.canada.ca/en). The daily average discharge (m³/sec) values (Qc) and the daily average water level (m) values (Hc) from station 02OJ007 are shown in Table 4.

Table 4 Contingency table for minimum n SRC and median n SRC flood maps of the 2011 Richelieu flood

To calculate the estimated water level (m) values (He) using the input Qc values, the minimum-median n SRC equations from the Riviere Richelieu study site (Fig. 6) were applied:

$${H}_{min}=0.037\left({Q}^{0.575}\right)$$
(5)
$${H}_{med}=0.051\left({Q}^{0.575}\right)$$
(6)
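As a minimal sketch of how Eqs. 5 and 6 form a prediction window, the Python snippet below evaluates both SRCs for a single discharge and checks whether an observed water level falls between them. The coefficients and exponent come from Eqs. 5 and 6; the function names and the input of Q = 1000 m3/sec are hypothetical and are not values from Table 4.

```python
SRC_EXPONENT = 0.575   # shared exponent in Eqs. 5 and 6
COEFF_MIN_N = 0.037    # Eq. 5, minimum n SRC
COEFF_MED_N = 0.051    # Eq. 6, median n SRC


def src_water_level(q, coeff, exponent=SRC_EXPONENT):
    """Power-law SRC, H = coeff * Q**exponent, with Q in m3/sec and H in m."""
    if q < 0:
        raise ValueError("discharge must be non-negative")
    return coeff * q ** exponent


def prediction_window(q):
    """Return (H_min, H_med) for a discharge Q."""
    return src_water_level(q, COEFF_MIN_N), src_water_level(q, COEFF_MED_N)


def in_window(h_observed, q):
    """True when an observed water level lies inside the prediction window."""
    h_min, h_med = prediction_window(q)
    return h_min <= h_observed <= h_med


q_example = 1000.0  # hypothetical discharge, not a value from Table 4
h_min, h_med = prediction_window(q_example)
print(f"H_min = {h_min:.2f} m, H_med = {h_med:.2f} m")
```

Because the two curves share the same exponent, the ratio of Hmed to Hmin is constant (0.051/0.037, roughly 1.38), so the absolute width of the prediction window grows with Q.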

To compare the HAND-derived flood maps with the flood extent polygons, the techniques used by McGrath et al. (2018) and Chaudhuri et al. (2021) to perform a binary classification evaluation were followed. Pixels in the flood maps were categorized as either true positive (TP), false positive (FP), true negative (TN), or false negative (FN), and then added into a contingency table (Table 4). To evaluate the accuracy of the flood maps, the Critical Success Index (CSi), the Matthews correlation coefficient (MCC), and Percent Bias (Bias) were calculated using the following equations:

$${\text{CSi}} = \frac{{{\text{TP}}}}{{{\text{TP}} + ~{\text{FP}} + ~{\text{FN}}}}$$
(7)
$${\text{MCC}} = \frac{{{\text{TP}}~{\times}{\text{~TN}}~ - ~{\text{FP}}~{\times}~{\text{FN}}}}{{\sqrt {\left( {{\text{TP}} + {\text{FP}}} \right)\left( {{\text{TP}} + {\text{FN}}} \right)\left( {{\text{TN}} + {\text{FP}}} \right)\left( {{\text{TN}} + {\text{FN}}} \right)} }}$$
(8)
$$Bias=\frac{\rm{TP}+\rm{FP}}{\rm{TP}+\rm{FN}}$$
(9)

CSi was used as an overall score, in which TP, FP, and FN values were balanced out in their influence on the final score (McGrath et al. 2018). MCC was also used as an overall score, where high scores are the result of correctly predicting the majority of the positive (flooded) and negative (non-flooded) indicators equally (Chicco & Jurman, 2020; Chaudhuri et al. 2021). As in Sect. 4, Bias was used to determine the magnitude of over- or under-estimations for the flood maps (McGrath et al. 2018). The CSi, MCC, and Bias scores are shown in Table 4.
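The pixel classification and Eqs. 7–9 can be sketched in Python as follows. This is an illustrative example only: the function names are hypothetical, the two eight-cell maps are made-up toy data rather than results from this study, and the CSi and Bias functions assume that at least one cell is flooded in the control map.

```python
import math


def classify_cells(predicted, observed):
    """Count (TP, FP, TN, FN) from two equal-length sequences of 0/1 flood flags."""
    tp = fp = tn = fn = 0
    for p, o in zip(predicted, observed):
        if p and o:
            tp += 1      # flooded in both maps
        elif p:
            fp += 1      # flooded only in the simulated map
        elif o:
            fn += 1      # flooded only in the control map
        else:
            tn += 1      # dry in both maps
    return tp, fp, tn, fn


def csi(tp, fp, fn):
    """Critical Success Index (Eq. 7)."""
    return tp / (tp + fp + fn)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient (Eq. 8); returns 0.0 if the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def bias(tp, fp, fn):
    """Bias (Eq. 9); values above 1 indicate over-estimated flood extent."""
    return (tp + fp) / (tp + fn)


# Toy data: simulated (HAND) vs. control (satellite) flags for eight cells.
simulated = [1, 1, 1, 0, 0, 0, 0, 1]
control = [1, 1, 0, 0, 0, 0, 1, 1]

tp, fp, tn, fn = classify_cells(simulated, control)
print(tp, fp, tn, fn)                      # 3 1 3 1
print(csi(tp, fp, fn))                     # 0.6
print(mcc(tp, fp, tn, fn))                 # 0.5
print(bias(tp, fp, fn))                    # 1.0
```

In practice the two inputs would be the flattened simulated and control rasters on a common grid, with no-data cells removed beforehand.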

5.2 Comparison results

Overall, the minimum-median n method created a prediction window that functioned as intended for this test; the Hc values were consistently between the two He values (Hmin and Hmed) for each date of study (Table 4). Both the CSi and MCC scores over the four study dates were moderate to high, with a CSi range of 0.644–0.809 and an MCC range of 0.776–0.877; these scores suggest a strong agreement between the flood extent datasets and the recreations. The minimum n SRCs performed slightly better than the median n SRCs, with a median CSi score of 0.722 vs. 0.684 and a median MCC score of 0.820 vs. 0.794. All of the Bias scores were above 1, with a Bias score range of 1.048–1.555, suggesting that the simulations over-estimated the extent of the flood.

Similar trends appear in the classified flood maps, alongside the control flood maps, in Fig. 7 (May 5th, 2011) and Fig. 8 (May 7th, 2011). In each of the classified flood maps (Figs. 7b, 7c, 8b, 8c), there are FP cells that align along the sides of the river, indicating that the HAND model over-estimated the extent of the flood; it is also evident that the median n SRC maps (Figs. 7b, 8b) have more FP cells than the minimum n SRC maps (Figs. 7c, 8c), which agrees with the results of Table 4. The control flood maps (Figs. 7a, 8a) also reveal some inconsistencies in the control flood extent data. For example, in Fig. 7a, several flood cells were disconnected from the main Riviere Richelieu stream; these consistently produced FN cells in the classified flood maps. Upon inspection, these cells aligned with the Chambly Canal that runs parallel to the river, with no indication of overland flooding. This could be an artifact of the satellite capture, as it is not present on the May 8th map. In addition, a bridge (Fig. 8a) occluded flood cells (approx. 3933 pixels) in some maps and produced FP cells in the classified flood maps; the occlusion from the bridge is not present in Fig. 7a. To determine if these control data errors significantly lowered the CSi and MCC scores, the numbers of FP, TP, TN, and FN cells in the confusion matrix were adjusted. It was found that these control errors only affected the median CSi and MCC scores by about 0.200 (unitless). Because the effects were minimal, the scores were kept as they are.

Fig. 7

May 5th, 2011 flood polygon over the Riviere Richelieu study site (a). The binary classified flood map for the median n SRC (b) and the minimum n SRC (c). Note the circle highlighting the canal artifact from the control data

Fig. 8

May 7th, 2011 flood polygon over the Riviere Richelieu study site (a). The binary classified flood map for the median n SRC (b) and the minimum n SRC (c). Note the circle highlighting the bridge artifact from the control data

Interestingly, even though the minimum n SRCs produced under-estimated H values, the Bias scores were still greater than 1. This implies that the minimum n SRC flood maps had more significant over-estimations of flood extents than under-estimations. This is most likely a result of the HAND model itself, as it can result in overestimations in flatter geographical regions (Hocini et al. 2020) and the Riviere Richelieu site does have a low-moderate river gradient (0.0013 m/m) and flat terrain slopes. Further, it was noticed in the minimum n SRC flood maps (Figs. 7c and 8c) that there was a consistent group of FN cells in the southern portion of the map. Through inspection of the HAND model (Fig. 2l), it seems that this portion of the river was erroneously deemed elevated above the adjacent sections of river; this may stem from an error made during the processing of the HAND model (Sect. 3.2). Additionally, it is likely that the MCC scores were higher than the CSi scores because the CSi formula does not include TN (Eq. 7), while the MCC formula does (Eq. 8); note, in the classified maps and Table 4, the overwhelming number of TN cells (> 2,000,000 pixels) compared to the number of TP, FP, and FN cells. Because the large number of TN cells may be inflating the MCC scores, the CSi scores are probably the more representative measure of flood map accuracy.
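The dependence of the two scores on TN can be made explicit with a short limiting argument. This is a sketch that follows directly from Eqs. 7 and 8 with TP, FP, and FN held fixed; it is not part of the original analysis. As TN grows large relative to the other counts, the FP × FN term in the numerator and the FP and FN terms added to TN in the denominator become negligible, so that:

$${\text{MCC}} \to \frac{{\text{TP}}}{\sqrt{\left({\text{TP}}+{\text{FP}}\right)\left({\text{TP}}+{\text{FN}}\right)}} \quad \text{as} \quad {\text{TN}} \to \infty$$

Since $$\left({\text{TP}}+{\text{FP}}\right)\left({\text{TP}}+{\text{FN}}\right) \le \left({\text{TP}}+{\text{FP}}+{\text{FN}}\right)^{2}$$, this limit is never smaller than CSi:

$$\frac{{\text{TP}}}{\sqrt{\left({\text{TP}}+{\text{FP}}\right)\left({\text{TP}}+{\text{FN}}\right)}} \ge \frac{{\text{TP}}}{{\text{TP}}+{\text{FP}}+{\text{FN}}} = {\text{CSi}}$$

In other words, when non-flooded cells dominate the scene, MCC approaches the geometric mean of TP/(TP + FP) and TP/(TP + FN), which is consistent with the MCC scores exceeding the CSi scores for the same maps.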

6 Discussion

The CERC-HAND-D tool has shown promise as a proxy for hydrometric gauge derived rating curves, especially when using the minimum-median n method. On its own, the minimum n method performed quite well, with an NRMSE range of 5.4–40.0% and an AR score range of 0.0–64.0%; those ranges would be 5.4–19% and 24.0–64.0% if the challenging Riviere Richelieu site is excluded. Further, combining a minimum n SRC with a median n SRC produces the advantageous prediction window that provides more opportunities to catch the true Q-H relationship. As seen in Sect. 5, the minimum-median n SRC was able to capture the gauge water level (m) value (Hc) in its prediction window for each testing event. This suggests that within the prediction window of the tool, with the minimum-median n method incorporated, the user is more likely to find results that match closely with natural Q-H relationships than when using a single SRC. This is especially true when applying the tool in geographic locations that follow the set guidelines for quality SRC performance (river length < 5 km; S > 0.001 m/m). There is an inherent uncertainty in designating an n value for the Manning’s formula (Tullis 2012), and the minimum-median n method provides a prediction window that accommodates this uncertainty. The prediction window is also beneficial as it avoids the general difficulty of finding a single n value, even a composite n value, that can represent all the surface resistances present within a scene.

Short river reaches (< 1.2 km) were not tested, and while the North River study site was considered to be high gradient (S = 0.0046 m/m) for our study, Godbout et al. (2019) were able to produce acceptable results up to S = 0.1 m/m. It would be beneficial to further assess tool performance in geographical study areas with these characteristics. The land class types can affect the quality of the SRCs, specifically between urban, cropland, and forest land classes, and further testing for this issue is recommended (Godbout et al. 2019). The heavily urbanized (78.4%) Grand River study area and the heavily forested (49.5–57.1%) Aroostook and North River study sites all had lower NRMSE scores, while the Riviere Richelieu study site, with cropland taking up 63.8% of its area, performed poorly. These results suggest that croplands might have a negative influence on tool performance, although it is uncertain whether this is a result of croplands generally being flat or of how land classes interact with river gradients and river lengths to affect SRC quality. It was also assumed that proximity to bridges, dams, culverts, and other hydraulic infrastructure would influence tool performance; however, this was not explored in our study.

A future study may also want to explore an alternative source for land class data that has a higher resolution than the 2015LCC (30 m) to match the resolution of the HRDEM (1–2 m). Diehl et al. (2021) combined HAND models with a 1 m resolution land class raster to create flood probability maps that had good agreement (F-statistics = 0.61–0.82) with 1D HEC-RAS maps. Interestingly, the researchers achieved these results using a weighted n method similar to the one in this study, although Diehl et al. (2021) weighted the n values rather than the Q values. This may suggest that if a 1 m resolution land cover dataset were available and incorporated into the CERC-HAND-D workflow, the weighted n method might perform better. This would not be surprising, considering that the weighted n method was about as successful in this study (NRMSE range of 3.7–37.5%) as the minimum n method. The authors are currently experimenting with creating a land cover map of Canada with a 1 m resolution and, if successful, the raster map may be incorporated into the workflow of CERC-HAND-D.

Adjustments and additional features to the CERC-HAND-D workflow could be implemented to improve overall tool usability and performance. For instance, the HRDEM dataset does not include bathymetry, which resulted in under-estimation of flow (Q) values due to the exclusion of the in-channel geometry (Moretti and Orlandini 2018). A tested solution was to create a base river flow (QBASE) variable to offset and correct the estimated Q values. An approach similar to the Terrain Correction Technique (Choné et al. 2018) was used, where the flow depth during the LiDAR survey was used to correct flow estimations. The results of this experiment showed that QBASE only decreased NRMSE by 1.5% or less, and in the North River study site it increased NRMSE by up to 3.6%. In addition, requiring the user to determine a flow depth value made the tool less user-friendly. In future, if a more successful and automated method of calculating QBASE is found, then this may become an optional feature for CERC-HAND-D.

The HAND model itself is imperfect and is best utilized for capturing fluvial floods caused by river waters rising and spilling over their banks, while ice-jam and coastal floods cannot easily be replicated by the model (Wing et al. 2019). Flat terrains are also a challenge for the HAND model, as the model tends to over-estimate flood extents in these locations; this is a limitation shared with CERC-HAND-D, as the HAND model is required for creating SRCs. Further, the HAND model has been shown to be outperformed by hydraulic 1D and 2D SWE models, albeit with higher computation times, and the model cannot produce outputs such as velocity and shear stress, which are important for flood hazard assessments (Hocini et al. 2020; Rebolho et al. 2018). Despite this, results such as those found by McGrath et al. (2018) suggest that the HAND model can be a cost-effective, rapid, and accurate flood model that would be suitable for on-the-fly flood mapping. By keeping within the limitations of both the HAND model and the CERC-HAND-D tool, it is possible to expand the flood mapping capabilities across Canada in a variety of ways. Further, there is no centralized source of catchment polygon data available for Canada that could support the CERC-HAND-D tool. While there are watershed polygons available from NRCan, they are generally too large to fit within the 5 km river length limit set for creating SRCs. Users can follow the same steps taken in this study to create the catchment polygons (Sect. 3); however, alternative DTM files will result in inconsistent catchment polygons.

Currently, the centralization of flood maps in Canada is limited, as the availability of these datasets is often restricted to regional and local governments with no coordinated standards between government bodies on the quality and the accessibility of flood maps (Henstra et al. 2019). The HAND model, with support by CERC-HAND-D, could assist in building a national repository of flood maps in Canada, especially for regions where these maps are outdated or non-existent. An even more advantageous approach would be to create an interactive web mapping application that houses archive and current flood maps that are both HAND derived and non-HAND derived (Henstra et al. 2019). A similar web application, named Hydrogeomorphic Flood Hazard Mapping, was created by Tavares da Costa et al. (2019), wherein big open-access datasets were used to support flood hazard mapping across Europe, providing both archive flood hazard maps and experimentally derived maps in conjunction. Additionally, Chaudhuri et al. (2021) have created a prototype for a flood mapping web application (InundatEd-v1.0) in the Grand River watershed by combining HAND models with discrete global grid system (DGGS)-based architecture.

Further, a potential web application could combine active river gauge data and/or predictive meteorological data with the CERC-HAND-D tool and the HAND model to produce on-the-fly flood maps. Some sources of such continuous and predictive data across Canada include the Real-Time Hydrometric Dataset (RTHD), provided by NHS, and the GeoMet platform, provided by MSC. Liu et al. (2018) carried out a similar project in which the HAND model, river streamflow forecasts, and SRC look-up tables are applied together to produce real-time inundation extent data on a national scale; Zheng et al. (2018b) took a similar approach with their GeoFlood method. While there are some doubts about how suitable the HAND model is for inundation mapping on a national scale (Wing et al. 2019), a potential web application can be updated frequently as methods and models improve.

7 Conclusion

In this paper, we have discussed a custom ArcGIS Pro tool, called CERC-HAND-D, how the tool can produce SRCs to act as a proxy for rating curves, and the ways in which these SRCs can support on-the-fly flood mapping in Canada. CERC-HAND-D was shown to create SRCs that agreed closely with the control rating curves (NRMSE = 3.7–8.8%, Percent Bias = −7.8–9.4%) when river gradients and reach lengths are moderate. The minimum-median n method was advantageous for creating a prediction window that better captures true Q-H relationships through deriving two SRCs. Combining these SRCs with the HAND model in a workflow, two flood maps were produced that accurately captured (CSi = 0.644–0.809; MCC = 0.776–0.877) the extent of the 2011 Richelieu flood event. Further testing with CERC-HAND-D will be needed to better establish the limitations of the tool, and some adjustments to the GUI could make the tool more user-friendly. Future directions for the tool involve implementing CERC-HAND-D and the HAND model in the creation of an on-the-fly flood mapping application that would be widely applicable across the country.